Fmlib_parse.Ucharacter
Parser for streams of unicode characters.
Unicode characters can be encoded in byte streams in several ways.
The following modules are available:
Make_utf8
: Parse text streams encoded in utf-8.
Make_utf16_be
: Parse text streams encoded in utf-16 big endian.
Make_utf16_le
: Parse text streams encoded in utf-16 little endian.
Make
: Parse text streams in any encoding. The encoder and decoder have to be provided as a module parameter.
All parsers in this module work like a character parser (see Character.Make) with some additional combinators to recognize unicode characters.
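To illustrate why separate modules exist for each encoding, the following sketch encodes a single code point in utf-8, utf-16 big endian and utf-16 little endian and prints the resulting bytes. It uses only the OCaml standard library (Buffer, available since OCaml 4.06) and none of the parsers from this module; the helper names encode and hex are chosen for illustration only.

```ocaml
(* Encode one code point with a given Buffer encoder and return the bytes. *)
let encode add u =
  let b = Buffer.create 4 in
  add b u;
  Buffer.contents b

(* Render a byte string as space-separated hexadecimal bytes. *)
let hex s =
  String.to_seq s
  |> Seq.map (fun c -> Printf.sprintf "%02X" (Char.code c))
  |> List.of_seq
  |> String.concat " "

(* U+20AC EURO SIGN *)
let euro = Uchar.of_int 0x20AC

let utf8    = hex (encode Buffer.add_utf_8_uchar euro)
let utf16be = hex (encode Buffer.add_utf_16be_uchar euro)
let utf16le = hex (encode Buffer.add_utf_16le_uchar euro)

let () =
  Printf.printf "utf-8:     %s\nutf-16 be: %s\nutf-16 le: %s\n"
    utf8 utf16be utf16le
```

The same character occupies three bytes in utf-8 and two bytes in either utf-16 form, with the byte order swapped between big and little endian, so a parser has to know the encoding of its input stream.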
module Make_utf8
(State : Fmlib_std.Interfaces.ANY)
(Final : Fmlib_std.Interfaces.ANY)
(Semantic : Fmlib_std.Interfaces.ANY) :
sig ... end
Parse an input stream consisting of unicode characters encoded in utf-8.
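A minimal sketch of applying this functor, assuming the fmlib_parse library is installed and that Fmlib_std.Interfaces.ANY only requires a type t. The module name Utf8_parser is hypothetical, and the choice of the standard library modules Unit, Int and String as arguments is only an example; the role comments follow the parameter names in the signature above.

```ocaml
(* Assumption: fmlib_parse is available to the build (e.g. via dune).
   Assumption: any module exposing a type [t] satisfies
   Fmlib_std.Interfaces.ANY, so stdlib modules can be passed directly. *)
module Utf8_parser =
  Fmlib_parse.Ucharacter.Make_utf8
    (Unit)    (* State parameter *)
    (Int)     (* Final parameter *)
    (String)  (* Semantic parameter *)
```

Make_utf16_be and Make_utf16_le take the same three parameters, so the same pattern applies to them.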
module Make_utf16_be
(State : Fmlib_std.Interfaces.ANY)
(Final : Fmlib_std.Interfaces.ANY)
(Semantic : Fmlib_std.Interfaces.ANY) :
sig ... end
Parse an input stream consisting of unicode characters encoded in utf-16 big endian.
module Make_utf16_le
(State : Fmlib_std.Interfaces.ANY)
(Final : Fmlib_std.Interfaces.ANY)
(Semantic : Fmlib_std.Interfaces.ANY) :
sig ... end
Parse an input stream consisting of unicode characters encoded in utf-16 little endian.
module Make
(Codec : Interfaces.CHAR_CODEC)
(State : Fmlib_std.Interfaces.ANY)
(Final : Fmlib_std.Interfaces.ANY)
(Semantic : Fmlib_std.Interfaces.ANY) :
sig ... end
Parse an input stream consisting of unicode characters. The unicode characters are encoded and decoded by using the module Codec.