This class represents a simple stateless converter to and from UCS-4, one code point at a time.

#include <boost/locale/util.hpp>

Public Member Functions
virtual	~base_converter ()

virtual int	max_len () const
	Return the maximum number of bytes that one Unicode code point can be converted to; for example, 4 for UTF-8, 2 for Shift-JIS, and 1 for ISO-8859-1.

virtual bool	is_thread_safe () const
	Returns true if calling the functions from_unicode, to_unicode, and max_len is thread safe.

virtual base_converter *	clone () const
	Create a polymorphic copy of this object; usually called only if is_thread_safe() returns false.

virtual uint32_t	to_unicode (char const *&begin, char const *end)
	Convert a single character starting at begin and ending at most at end to a Unicode code point.

virtual uint32_t	from_unicode (uint32_t u, char *begin, char const *end)
	Convert a single code point u into the target encoding and store it in the range [begin,end).

Static Public Attributes
static const uint32_t	illegal =utf::illegal
	This value should be returned when an illegal input sequence or code point is observed, for example if a UCS-4 code point is in the range reserved for UTF-16 surrogates or an invalid UTF-8 sequence is found.

static const uint32_t	incomplete =utf::incomplete
	This value is returned in the following cases: an incomplete input sequence was found, or the output buffer was too small for the complete output to be written.

Detailed Description

This class represents a simple stateless converter to and from UCS-4, one code point at a time.

This class is used to create a std::codecvt facet that converts between UTF-16/UTF-32 and the encoding supported by this converter.

Please note that this converter must be fully stateless: it must never assume that it is called in any specific order on the text. Even if the encoding itself seems to be stateless, like windows-1255 or Shift-JIS, some encoders (most notably iconv) can compose several code points into one, or decompose them when composite characters are found. So be very careful when implementing these converters for such character sets.
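As an illustration of what "fully stateless" means in practice, here is a minimal sketch of an ISO-8859-1 converter. To stay self-contained it does not derive from base_converter; the struct latin1_sketch and the constants sketch_illegal and sketch_incomplete are hypothetical stand-ins for the class interface and for illegal and incomplete, and their numeric values are assumptions. Each call depends only on its arguments, so calls can be made in any order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for base_converter::illegal / incomplete
// (the numeric values are assumptions made for this sketch).
constexpr std::uint32_t sketch_illegal = 0xFFFFFFFFu;
constexpr std::uint32_t sketch_incomplete = 0xFFFFFFFEu;

// ISO-8859-1: every byte maps directly to U+0000..U+00FF.
// No member variables, so no call can influence a later one.
struct latin1_sketch {
    int max_len() const { return 1; }
    bool is_thread_safe() const { return true; }

    std::uint32_t to_unicode(char const *&begin, char const *end) {
        assert(begin <= end);
        if (begin == end) return sketch_incomplete; // nothing to read
        return static_cast<unsigned char>(*begin++); // consume one byte
    }

    std::uint32_t from_unicode(std::uint32_t u, char *begin, char const *end) {
        assert(begin <= end);
        if (u > 0xFF) return sketch_illegal;         // not representable
        if (begin == end) return sketch_incomplete;  // no room for one byte
        *begin = static_cast<char>(u);
        return 1;                                    // bytes written
    }
};
```

A real implementation would instead derive from boost::locale::util::base_converter and override the virtual functions with the same signatures.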

Constructor & Destructor Documentation

virtual boost::locale::util::base_converter::~base_converter ( )

inline virtual

Member Function Documentation

virtual base_converter* boost::locale::util::base_converter::clone ( ) const

inline virtual

Create a polymorphic copy of this object; usually called only if is_thread_safe() returns false.

References BOOST_ASSERT.

virtual uint32_t boost::locale::util::base_converter::from_unicode	(	uint32_t	u,
		char *	begin,
		char const *	end
	)

inline virtual

Convert a single code point u into the target encoding and store it in the range [begin,end).

If u is an invalid Unicode code point, or it cannot be mapped to the represented character set, illegal should be returned.

If u can be converted to a sequence of bytes c1, ..., cN (1 <= N <= max_len()), then:

If end - begin >= N, c1, ..., cN are written starting at begin and N is returned.
If end - begin < N, incomplete is returned; it is unspecified what is stored in the bytes in the range [begin,end).
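The return-value rules above can be sketched for UTF-8 as follows. This is an illustrative free function, not the library's implementation; utf8_width, utf8_from_unicode, sketch_illegal, and sketch_incomplete are hypothetical names, and the constant values are assumptions standing in for illegal and incomplete.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for base_converter::illegal / incomplete.
constexpr std::uint32_t sketch_illegal = 0xFFFFFFFFu;
constexpr std::uint32_t sketch_incomplete = 0xFFFFFFFEu;

// Number of UTF-8 bytes needed for u, or 0 if u is a surrogate
// or lies above U+10FFFF.
inline int utf8_width(std::uint32_t u) {
    if (u > 0x10FFFF) return 0;
    if (u >= 0xD800 && u <= 0xDFFF) return 0;
    if (u < 0x80) return 1;
    if (u < 0x800) return 2;
    if (u < 0x10000) return 3;
    return 4;
}

inline std::uint32_t utf8_from_unicode(std::uint32_t u, char *begin, char const *end) {
    assert(begin <= end);
    int const n = utf8_width(u);
    if (n == 0) return sketch_illegal;             // invalid code point
    if (end - begin < n) return sketch_incomplete; // buffer too small
    switch (n) {
    case 1:
        begin[0] = static_cast<char>(u);
        break;
    case 2:
        begin[0] = static_cast<char>(0xC0 | (u >> 6));
        begin[1] = static_cast<char>(0x80 | (u & 0x3F));
        break;
    case 3:
        begin[0] = static_cast<char>(0xE0 | (u >> 12));
        begin[1] = static_cast<char>(0x80 | ((u >> 6) & 0x3F));
        begin[2] = static_cast<char>(0x80 | (u & 0x3F));
        break;
    default:
        begin[0] = static_cast<char>(0xF0 | (u >> 18));
        begin[1] = static_cast<char>(0x80 | ((u >> 12) & 0x3F));
        begin[2] = static_cast<char>(0x80 | ((u >> 6) & 0x3F));
        begin[3] = static_cast<char>(0x80 | (u & 0x3F));
        break;
    }
    return static_cast<std::uint32_t>(n);          // N bytes written
}
```

Note that the buffer-size check happens before any byte is written, so in this sketch the output range is left untouched when incomplete is returned, although the contract does not require that.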

References illegal, and incomplete.

virtual bool boost::locale::util::base_converter::is_thread_safe ( ) const

inline virtual

Returns true if calling the functions from_unicode, to_unicode, and max_len is thread safe.

Rule of thumb: if the implementation uses only simple tables that never change, or is purely algorithmic like UTF-8, so that independent to_unicode and from_unicode calls share no mutable state, you may return true. Otherwise, for example if you use an iconv_t descriptor or a UConverter as the conversion object, return false, and this object will be cloned for each use.
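The rule of thumb can be sketched as follows. This is only an illustration of how a caller might combine is_thread_safe() and clone(), not a description of the library's internals; converter_sketch, table_converter, handle_converter, and converter_for_this_thread are hypothetical names, and the stand-in interface exposes only the two members the example needs.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the relevant part of the interface.
struct converter_sketch {
    virtual ~converter_sketch() {}
    virtual bool is_thread_safe() const = 0;
    virtual converter_sketch *clone() const = 0;
};

// Table-driven converter: no mutable state, safe to share.
struct table_converter : converter_sketch {
    bool is_thread_safe() const override { return true; }
    converter_sketch *clone() const override { return new table_converter(*this); }
};

// Converter wrapping a mutable handle (think iconv_t): must not be shared.
struct handle_converter : converter_sketch {
    int scratch = 0; // placeholder for per-object mutable state
    bool is_thread_safe() const override { return false; }
    converter_sketch *clone() const override { return new handle_converter(*this); }
};

// Returns an object one user may call freely: the shared converter
// if it is thread safe, otherwise a private clone.
inline std::shared_ptr<converter_sketch>
converter_for_this_thread(std::shared_ptr<converter_sketch> const &shared) {
    assert(shared != nullptr);
    if (shared->is_thread_safe()) return shared;
    return std::shared_ptr<converter_sketch>(shared->clone());
}
```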

virtual int boost::locale::util::base_converter::max_len ( ) const

inline virtual

Return the maximum number of bytes that one Unicode code point can be converted to; for example, 4 for UTF-8, 2 for Shift-JIS, and 1 for ISO-8859-1.
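One typical use of this value is sizing an output buffer for the worst case. A minimal sketch, where worst_case_bytes is a hypothetical helper and max_len is the value returned by max_len():

```cpp
#include <cassert>
#include <cstddef>

// Upper bound on the encoded size: every code point may need max_len bytes.
inline std::size_t worst_case_bytes(std::size_t code_points, int max_len) {
    assert(max_len > 0);
    return code_points * static_cast<std::size_t>(max_len);
}
```

For 10 code points this gives 40 bytes with a UTF-8 converter (max_len() of 4) and 10 bytes with an ISO-8859-1 converter (max_len() of 1).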

virtual uint32_t boost::locale::util::base_converter::to_unicode	(	char const *&	begin,
		char const *	end
	)

inline virtual

Convert a single character starting at begin and ending at most at end to a Unicode code point.

If a valid input sequence is found in [begin,code_point_end) such that begin < code_point_end && code_point_end <= end, it is converted to its Unicode code point equivalent and begin is set to code_point_end.

If an incomplete input sequence is found in [begin,end), i.e. there may be a code_point_end such that code_point_end > end and [begin,code_point_end) would be a valid input sequence, then incomplete is returned and begin stays unchanged. For example, in UTF-8, *begin == 0xC2 with begin + 1 == end is such a situation.

If an invalid input sequence is found, i.e. there is a sequence [begin,code_point_end) with code_point_end <= end that is illegal for this encoding, illegal is returned and begin stays unchanged. For example, if *begin == 0xFF and begin < end in UTF-8, then illegal is returned.
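The three cases can be sketched for UTF-8 with a free function that has the same parameters as to_unicode. This is an illustrative decoder, not the library's implementation; utf8_to_unicode, sketch_illegal, and sketch_incomplete are hypothetical names, and the constant values are assumptions. For simplicity, overlong forms and surrogates are rejected only once all bytes of the sequence are present.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for base_converter::illegal / incomplete.
constexpr std::uint32_t sketch_illegal = 0xFFFFFFFFu;
constexpr std::uint32_t sketch_incomplete = 0xFFFFFFFEu;

inline std::uint32_t utf8_to_unicode(char const *&begin, char const *end) {
    assert(begin <= end);
    if (begin == end) return sketch_incomplete;

    unsigned char const lead = static_cast<unsigned char>(*begin);
    int trail = 0;            // continuation bytes expected after the lead
    std::uint32_t cp = 0;     // code point being assembled
    std::uint32_t min_cp = 0; // smallest value allowed for this length
    if (lead < 0x80) {
        cp = lead;
    } else if (lead < 0xC2) {
        return sketch_illegal; // stray continuation byte or overlong lead
    } else if (lead < 0xE0) {
        trail = 1; cp = lead & 0x1F; min_cp = 0x80;
    } else if (lead < 0xF0) {
        trail = 2; cp = lead & 0x0F; min_cp = 0x800;
    } else if (lead < 0xF5) {
        trail = 3; cp = lead & 0x07; min_cp = 0x10000;
    } else {
        return sketch_illegal; // 0xF5..0xFF never occur in UTF-8
    }

    char const *p = begin + 1;
    for (int i = 0; i < trail; ++i, ++p) {
        if (p == end) return sketch_incomplete; // sequence cut off by end
        unsigned char const c = static_cast<unsigned char>(*p);
        if ((c & 0xC0) != 0x80) return sketch_illegal;
        cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return sketch_illegal;

    begin = p; // advance only after a successful conversion
    return cp;
}
```

Because begin is assigned only on the success path, it stays unchanged whenever illegal or incomplete is returned, as the contract requires.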

References illegal, and incomplete.

Member Data Documentation

const uint32_t boost::locale::util::base_converter::illegal =utf::illegal

static

This value should be returned when an illegal input sequence or code point is observed, for example if a UCS-4 code point is in the range reserved for UTF-16 surrogates or an invalid UTF-8 sequence is found.

Referenced by from_unicode(), and to_unicode().

const uint32_t boost::locale::util::base_converter::incomplete =utf::incomplete

static

This value is returned in the following cases: an incomplete input sequence was found, or the output buffer was too small for the complete output to be written.

Referenced by from_unicode(), and to_unicode().

The documentation for this class was generated from the following file:

boost_1_57_0/boost/locale/util.hpp
