C: the target character type: char, wchar, or dchar
useReplacementDchar: UseReplacementDchar.yes replaces invalid UTF with replacementDchar; UseReplacementDchar.no throws UTFException on invalid UTF
A bidirectional range if R is a bidirectional range and not auto-decodable, as defined by std.traits.isAutodecodableString.
A forward range if R is a forward range and not auto-decodable.
Or, if R is a range and it is auto-decodable and is(ElementEncodingType!typeof(r) == C), then the range is passed to byCodeUnit.
Otherwise, an input range of characters.
UTFException if an invalid UTF sequence is encountered and useReplacementDchar is set to UseReplacementDchar.no
GC: Does not use GC if useReplacementDchar is set to UseReplacementDchar.yes
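As a minimal sketch of what the GC note means in practice, the helper below is marked @nogc (along with @safe, pure, and nothrow) and iterates with the default replacement mode. The function name countUtf16Units is hypothetical and not part of std.utf; the sketch assumes the attributes are inferred as described for a plain const(char)[] argument.

```d
import std.utf : byUTF;

// Hypothetical helper (not part of std.utf): counts the UTF-16 code units
// needed to represent UTF-8 input. The attributes compile only if byUTF
// neither throws nor allocates, which is expected with the default
// UseReplacementDchar.yes.
size_t countUtf16Units(const(char)[] s) @safe pure nothrow @nogc
{
    size_t n = 0;
    foreach (c; s.byUTF!wchar) // default mode: invalid UTF becomes U+FFFD
        ++n;
    return n;
}

void main()
{
    assert(countUtf16Units("hell\u00F6") == 5); // ö is one UTF-16 code unit
    assert(countUtf16Units("𐐷") == 2);          // surrogate pair
}
```

Switching the call to byUTF!(wchar, UseReplacementDchar.no) would be expected to fail compilation under nothrow, since that mode can throw UTFException.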
import std.algorithm.comparison : equal;

// hellö as a range of `char`s, which are UTF-8
assert("hell\u00F6".byUTF!char().equal(['h', 'e', 'l', 'l', 0xC3, 0xB6]));

// `wchar`s are able to hold the ö in a single element (UTF-16 code unit)
assert("hell\u00F6".byUTF!wchar().equal(['h', 'e', 'l', 'l', 'ö']));

// 𐐷 is four code units in UTF-8, two in UTF-16, and one in UTF-32
assert("𐐷".byUTF!char().equal([0xF0, 0x90, 0x90, 0xB7]));
assert("𐐷".byUTF!wchar().equal([0xD801, 0xDC37]));
assert("𐐷".byUTF!dchar().equal([0x00010437]));
import std.algorithm.comparison : equal;
import std.exception : assertThrown;

assert("hello\xF0betty".byChar.byUTF!(dchar, UseReplacementDchar.yes).equal("hello\uFFFDetty"));
assertThrown!UTFException("hello\xF0betty".byChar.byUTF!(dchar, UseReplacementDchar.no).equal("hello betty"));
import std.range.primitives;

wchar[] s = ['ă', 'î'];

auto rc = s.byUTF!char;
static assert(isBidirectionalRange!(typeof(rc)));
assert(rc.back == 0xae);
rc.popBack;
assert(rc.back == 0xc3);
rc.popBack;
assert(rc.back == 0x83);
rc.popBack;
assert(rc.back == 0xc4);

auto rw = s.byUTF!wchar;
static assert(isBidirectionalRange!(typeof(rw)));
assert(rw.back == 'î');
rw.popBack;
assert(rw.back == 'ă');

auto rd = s.byUTF!dchar;
static assert(isBidirectionalRange!(typeof(rd)));
assert(rd.back == 'î');
rd.popBack;
assert(rd.back == 'ă');
Iterates an input range of characters as elements of character type C, re-encoding the elements of the range as needed.
UTF sequences that cannot be converted to the specified encoding are either replaced by U+FFFD, per "5.22 Best Practice for U+FFFD Substitution" of the Unicode Standard 6.2, or result in a thrown UTFException. Hence byUTF is not symmetric: converting to another encoding and back does not necessarily reproduce the original input. This algorithm is lazy and does not allocate memory. @nogc, pure, nothrow, and @safe are inferred from the r parameter.
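The following sketch illustrates one reading of "not symmetric": valid text survives a UTF-8 to UTF-16 to UTF-8 round trip, while input containing an invalid sequence does not, because the invalid bytes are replaced by U+FFFD on the way. The variable names are illustrative only, and the default UseReplacementDchar.yes mode is assumed throughout.

```d
import std.algorithm.comparison : equal;
import std.utf : byChar, byUTF;

void main()
{
    // Valid UTF-8 round-trips through UTF-16 unchanged.
    auto valid = "hell\u00F6";
    assert(valid.byChar.byUTF!wchar.byUTF!char.equal(valid.byChar));

    // \xF0 starts a four-byte sequence that is never completed, so it is
    // replaced by U+FFFD during decoding. Re-encoding the result as UTF-8
    // cannot reproduce the original invalid bytes.
    auto invalid = "hello\xF0betty";
    auto roundTrip = invalid.byChar.byUTF!wchar.byUTF!char;
    assert(!roundTrip.equal(invalid.byChar));
}
```

Each stage in the chain is a lazy range, so no intermediate UTF-16 buffer is built; code units are produced only as equal consumes them.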