STREAM::encoding

Description

Tells TMOS how to convert data-stream octets to Unicode characters for matching by the Stream Profile.
By default, TMOS assumes that data-stream octets are single-byte characters in the ISO-8859-1 character set (of which ASCII is a subset) and translates them directly to the first 256 Unicode characters in numerical order (for example, 0x41 == «A» == "\u0041", or 0xF9 == «ù» == "\u00f9"). Optionally, TMOS can treat data-stream octets as UTF-8 and transcode them to Unicode.

Syntax

STREAM::encoding [ascii | utf-8]

STREAM::encoding [ascii | utf-8]

  • Specifies the translation of data-stream octets to Unicode characters before matching against Stream Profile match values (which are regular expressions). The default translation mode is ascii (which means ISO-8859-1, including ASCII). The only other valid option is utf-8.

In ascii mode, the octet sequence 0xC3 0xBC will be matched as two characters, \u00c3\u00bc or «Ã¼». In utf-8 mode, the identical octet sequence 0xC3 0xBC will be matched as the single character \u00fc or «ü».
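
The difference between the two modes can be reproduced outside TMOS. The following is a minimal sketch in Python, not iRule code: it assumes that Python's standard "iso-8859-1" and "utf-8" codecs behave like the ascii and utf-8 modes described above, and the variable names are illustrative only.

```python
# Illustration only: Python codecs stand in for the two STREAM::encoding modes.
octets = bytes([0xC3, 0xBC])

# "ascii" mode (ISO-8859-1): every octet becomes exactly one character.
as_latin1 = octets.decode("iso-8859-1")

# "utf-8" mode: the two octets are read as one multi-byte sequence.
as_utf8 = octets.decode("utf-8")

# Show how many characters each interpretation yields, and their code points.
print(len(as_latin1), [f"U+{ord(c):04X}" for c in as_latin1])
print(len(as_utf8), [f"U+{ord(c):04X}" for c in as_utf8])
```

The first line printed lists two code points (U+00C3 and U+00BC) and the second lists one (U+00FC), which is why a match value written for «ü» only matches this octet pair in utf-8 mode.
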
When you attempt to match binary (non-text) octets, you should choose ascii mode, so that you can match each octet 0xHH using the character escape sequence \u00HH (see STREAM::expression for details).
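
The reason ascii mode suits binary data is that ISO-8859-1 maps every octet value to the code point with the same number. The sketch below, again in Python rather than iRule code, checks that property for all 256 octet values and builds the \u00HH escape text for a sample octet sequence. The helper octets_to_escapes and the sample bytes are hypothetical, introduced only for illustration; how the resulting text is placed inside a match value is covered by STREAM::expression, and how TMOS itself treats octets that are not valid UTF-8 is not stated here.

```python
def octets_to_escapes(data: bytes) -> str:
    """Render each octet 0xHH as the escape text \\u00HH (hypothetical helper)."""
    return "".join(f"\\u00{b:02x}" for b in data)

# Under ISO-8859-1, octet value N always decodes to code point N.
all_octets = bytes(range(256))
decoded = all_octets.decode("iso-8859-1")
one_to_one = all(ord(ch) == b for ch, b in zip(decoded, all_octets))

# Arbitrary sample octets, chosen only for illustration.
sample = bytes([0x00, 0xFF, 0xD8, 0xFF])
pattern_text = octets_to_escapes(sample)

# Python's strict UTF-8 decoder rejects this sample, because 0xFF
# never appears in well-formed UTF-8.
try:
    sample.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(one_to_one, pattern_text, valid_utf8)
```

Because the mapping is one-to-one, no octet value is lost or merged with its neighbors in ascii mode, whereas a UTF-8 interpretation can combine several octets into one character or fail on sequences that are not well-formed.
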

Examples