STREAM::encoding
Description
Tells TMOS how to convert data stream octets to Unicode characters for
matching by the Stream Profile.
By default, TMOS assumes that data-stream octets are single-byte
characters in the ISO-8859-1 character set (of which ASCII is a
subset) and translates them directly to the first 256 Unicode
characters in numerical order (for example, 0x41 == «A» == "\u0041"
or 0xF9 == «ù» == "\u00f9"). Optionally, TMOS can treat data-stream
octets as UTF-8 and transcode them to Unicode.
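The default mapping can be illustrated outside TMOS. The following is a minimal Python sketch, not TMOS code: it assumes that Python's "latin-1" codec (which implements ISO-8859-1) is a fair stand-in for the default translation described above, and the variable names are illustrative only.

```python
# Python's "latin-1" codec is ISO-8859-1: octet 0xNN becomes code point U+00NN.
# This is only a stand-in for the default TMOS translation, not TMOS itself.
all_octets = bytes(range(256))
decoded = all_octets.decode("latin-1")

# Check that every octet maps to the Unicode character with the same value.
identity_mapping = all(ord(ch) == octet for ch, octet in zip(decoded, all_octets))

print(identity_mapping)
print(bytes([0x41]).decode("latin-1"))  # the character for octet 0x41
print(bytes([0xF9]).decode("latin-1"))  # the character for octet 0xF9
```

Because the mapping is one octet to one character, the decoded string always has the same length as the original octet sequence.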
Syntax
STREAM::encoding [ascii | utf-8]
STREAM::encoding [ascii | utf-8]
Specifies the translation of data-stream octets to Unicode characters before matching against Stream Profile match values (which are regular expressions). The default translation mode is ascii (which means ISO-8859-1, including ASCII). The only other valid option is utf-8.
In ascii mode, the octet sequence 0xC3 0xBC will be matched as the two characters \u00c3\u00bc or «Ã¼». In utf-8 mode, the identical octet sequence 0xC3 0xBC will be matched as the single character \u00fc or «ü».
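The difference between the two modes for the octet sequence 0xC3 0xBC can be reproduced with a short Python sketch. This is an illustration only: it assumes Python's "latin-1" and "utf-8" codecs approximate the ascii and utf-8 modes respectively, and the variable names are hypothetical.

```python
# The same two octets, decoded two ways.
octets = bytes([0xC3, 0xBC])

as_ascii_mode = octets.decode("latin-1")  # stand-in for ascii (ISO-8859-1) mode
as_utf8_mode = octets.decode("utf-8")     # stand-in for utf-8 mode

# Show how many characters each mode yields and their code points.
print(len(as_ascii_mode), [f"U+{ord(c):04X}" for c in as_ascii_mode])
print(len(as_utf8_mode), [f"U+{ord(c):04X}" for c in as_utf8_mode])
```

A match value written for one mode will therefore not match the same octets in the other mode, since the number of characters and their code points both differ.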
When you need to match binary (non-text) octets, choose ascii mode so that you can match each octet 0xHH using the character escape sequence \u00HH (see STREAM::expression for details).
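The idea of matching a binary octet 0xHH as the character \u00HH can be sketched in Python. This is not the TMOS regular-expression engine: it assumes only that decoding with "latin-1" mirrors ascii mode, and the payload (a made-up text prefix followed by the octets 0x89 0x50 0x4E 0x47) is a hypothetical example.

```python
import re

# Hypothetical payload: four text octets followed by four binary octets.
payload = b"hdr:" + bytes([0x89, 0x50, 0x4E, 0x47])

# Stand-in for ascii mode: each octet becomes the code point of the same value.
text = payload.decode("latin-1")

# Each octet 0xHH is written as \u00HH in the pattern.
pattern = re.compile(r"\u0089\u0050\u004e\u0047")
match = pattern.search(text)

# With a one-to-one mapping, the character offset equals the octet offset.
offset = match.start()
print(offset, hex(payload[offset]))
```

The one-to-one mapping is what makes this approach suitable for binary data: every possible octet value has exactly one corresponding character, so no octet sequence is rejected or merged during translation.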