std.unicode_table_generator

A tool that automatically generates source code for Unicode data structures.

If the Unicode data files are not present locally, the script automatically tries to download them from: https://www.unicode.org/Public

Make sure the current working directory is the /tools folder.

To update std.internal.unicode*.d files, run:

rdmd -m32 unicode_table_generator.d
rdmd -m64 unicode_table_generator.d --min

The -m32 run replaces the files, while the -m64 run with --min appends the 64-bit specific parts. The generator must be compiled as 32-bit because it depends on 32-bit data structures defined in std.uni. To make -m32 work on Linux, you may need to grab a 32-bit libphobos2.a from dmd2/linux/lib32 and pass it as an argument:

rdmd -m32 -Llibphobos2.a -defaultlib= unicode_table_generator.d

Pull Requests to untangle this complex bootstrap process are welcome! :)

TODO: Support emitting of Turkic casefolding mappings

Members

Functions

writeDstring
void writeDstring(File sink, T[] tab)

Write a dchar[] as a dstring literal (""d)
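
As a rough illustration of what writing a dchar[] as a dstring literal can look like, here is a minimal sketch. The helper name toDstringLiteral and the choice of \U escapes are assumptions made for this example; the generator's actual output format may differ.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : writeln;

// Hypothetical helper (not part of the generator): renders code points
// as the source text of a D dstring literal, one \U escape per code point.
string toDstringLiteral(const(dchar)[] tab)
{
    auto app = appender!string();
    app.put('"');
    foreach (dchar ch; tab)
        app.formattedWrite("\\U%08X", cast(uint) ch);
    app.put("\"d");
    return app.data;
}

void main()
{
    // 'a' is U+0061 and U+00DF is LATIN SMALL LETTER SHARP S.
    writeln(toDstringLiteral("a\u00DF"d)); // prints "\U00000061\U000000DF"d
}
```

Escaping every code point keeps the generated file pure ASCII, at the cost of a larger source file.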

writeDstringTable
void writeDstringTable(File sink, string name, dchar[] table)

Write a function that returns a dchar[] with data stored in table

writeUintArray
void writeUintArray(File sink, T[] tab)

Write a dchar[] as a hex string

writeUintTable
void writeUintTable(File sink, string name, uint[] table)

Write a function that returns a uint[] with data stored in table

Manifest constants

outputDir
enum outputDir;

Where to put generated files

unicodeBaseUrl
enum unicodeBaseUrl;

URL from which Unicode files are downloaded

unicodeDir
enum unicodeDir;

Directory into which Unicode files are downloaded

Structs

FullCaseEntry
struct FullCaseEntry

Undocumented in source.

SimpleCaseEntry
struct SimpleCaseEntry

Simple 8-byte SimpleCaseEntry; it is later compressed to SCE, which bit-packs the values into 4 bytes
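
To illustrate the idea of bit-packing an entry into 4 bytes, here is a minimal sketch. The struct name PackedEntry, its fields, and the bit layout are assumptions chosen for this example; they are not taken from SCE or std.uni.

```d
import std.stdio : writeln;

// Hypothetical 4-byte packed entry. The field names and the bit layout
// are assumptions for illustration only.
struct PackedEntry
{
    uint bits;

    this(dchar ch, ubyte n, ubyte size)
    {
        assert(ch <= 0x10FFFF && n < 16 && size < 16);
        // bits 0..20: code point, bits 21..24: n, bits 25..28: size
        bits = cast(uint) ch | (cast(uint) n << 21) | (cast(uint) size << 25);
    }

    @property dchar ch() const { return cast(dchar)(bits & 0x1F_FFFF); }
    @property ubyte n() const { return cast(ubyte)((bits >> 21) & 0xF); }
    @property ubyte size() const { return cast(ubyte)((bits >> 25) & 0xF); }
}

static assert(PackedEntry.sizeof == 4);

void main()
{
    auto e = PackedEntry('\u00DF', 1, 2);
    assert(e.ch == '\u00DF' && e.n == 1 && e.size == 2);
    writeln(PackedEntry.sizeof); // prints 4
}
```

A code point needs at most 21 bits (the maximum is U+10FFFF), which leaves 11 bits of a uint for small counters such as an index or a size.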

Variables

combiningClass
CodepointSet[256] combiningClass;

Canonical combining class
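
A table of this shape holds one set of code points per canonical combining class value (0 to 255). The sketch below fills a similar table for one small block using std.uni.combiningClass, which is a function in std.uni and distinct from the variable documented here. The name byClass and the chosen code point range are assumptions for illustration; the generator presumably builds its table from the downloaded Unicode data files instead.

```d
import std.uni : CodepointSet, combiningClass;

void main()
{
    // Hypothetical stand-in for the generator's table: one set per class.
    CodepointSet[256] byClass;

    // Combining Diacritical Marks block, U+0300 .. U+036F.
    foreach (uint cp; 0x0300 .. 0x0370)
    {
        auto ch = cast(dchar) cp;
        byClass[combiningClass(ch)] |= ch;
    }

    // U+0301 COMBINING ACUTE ACCENT has canonical combining class 230.
    assert('\u0301' in byClass[230]);
    assert('\u0301' !in byClass[0]);
}
```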

Meta

Authors

Dmitry Olshansky

License

Boost