sanitize

Sanitizes a string by replacing malformed code unit sequences with valid code unit sequences. The result is guaranteed to be valid for this encoding.

If the input string is already valid, this function returns the original string. Otherwise, it constructs a new string in which every illegal code unit sequence is replaced with the encoding's replacement character: the Unicode replacement character (U+FFFD) if the character repertoire contains it, or '?' if it does not.
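The two cases described above can be sketched as follows. This is a minimal illustration, not part of the original documentation; it assumes `isValid` from the same `std.encoding` module to check validity before and after the call, and the variable names are purely illustrative.

```d
import std.encoding : isValid, sanitize;

void main()
{
    // Already valid: the original slice is returned, so `is` (identity) holds.
    string ok = "hello world";
    assert(sanitize(ok) is ok);

    // Malformed UTF-8: a new string is built, with U+FFFD in place of the bad bytes.
    string bad = "hello \xF0\x80world";
    assert(!isValid(bad));

    string fixed = sanitize(bad);
    assert(isValid(fixed));
    assert(fixed == "hello \uFFFDworld");
}
```

Because valid input comes back unchanged, calling `sanitize` on strings that are usually well-formed should not allocate in the common case.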

immutable(E)[] sanitize(E)(immutable(E)[] s)

Parameters

s immutable(E)[]

the string to be sanitized

Examples

import std.encoding : sanitize;
assert(sanitize("hello \xF0\x80world") == "hello \xEF\xBF\xBDworld"); // malformed \xF0\x80 becomes U+FFFD (UTF-8: \xEF\xBF\xBD)
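For an encoding whose repertoire does not contain U+FFFD, the fallback to '?' might look like the sketch below. It assumes the `AsciiString` alias from `std.encoding` and builds the invalid input by casting raw bytes, which is an illustrative shortcut rather than a recommended practice.

```d
import std.encoding : AsciiString, sanitize;

void main()
{
    // 0xE9 lies outside the 7-bit ASCII range, so it is not a valid ASCII code unit.
    string raw = "caf\xE9";
    AsciiString bad = cast(AsciiString) raw;

    // ASCII cannot represent U+FFFD, so the invalid unit is replaced with '?'.
    assert(sanitize(bad) == cast(AsciiString) "caf?");
}
```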

Meta

Standards

Unicode 5.0, ASCII, ISO-8859-1, ISO-8859-2, WINDOWS-1250, WINDOWS-1251, WINDOWS-1252