Led 3.0 Class Library Documentation

Module CodePage



Module Description:

This module is designed to provide mappings between wide UNICODE and various other code pages and UNICODE encodings.


Class: CodePage [public]

Description:

A code page is a Win32 (really DOS) concept that describes a particular single-byte or multibyte (narrow) character set encoding. This class uses Win32 code page numbers. A layer to map to and from Mac 'ScriptIDs', which are basically analogous, may be added someday.

Use this with CodePageConverter.


Class: CodePageConverter [public]

Description:

Helper class to wrap conversions between code pages (on Mac known as scripts) and UTF-16 (WIDE UNICODE).

Member Details

CodePageConverter::GetHandleBOM [public]

bool CodePageConverter::GetHandleBOM () const

In UNICODE, files are generally headed by a byte order mark (BOM). For wide-character encodings, this mark indicates whether the file is big-endian or little-endian. This applies to 2-byte and 4-byte UNICODE (UCS-2, UCS-4) as well as to the UTF-X encodings (such as UTF-7 and UTF-8). The mark also indicates whether or not the file is in a UTF encoding at all, since byte order does not matter in byte-oriented UTF encodings such as UTF-7 and UTF-8.

The basic rule for BOMs is that they are the character 0xfeff, as it would be encoded in the given UTF or UCS encoding.
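As a rough illustration of that rule, the following sketch encodes the single character 0xfeff in a few common encodings and returns the resulting bytes. The names 'BOMEncoding' and 'EncodeBOM' are hypothetical and exist only for this example; they are not part of Led.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical labels used only by this sketch; these are NOT Led identifiers.
enum class BOMEncoding { UTF8, UTF16LE, UTF16BE, UTF32LE, UTF32BE };

// Encode the single character 0xfeff in the requested encoding.
std::vector<std::uint8_t> EncodeBOM (BOMEncoding enc)
{
    const std::uint32_t kBOM = 0xFEFF;
    const std::uint8_t  hi   = static_cast<std::uint8_t> (kBOM >> 8);   // 0xFE
    const std::uint8_t  lo   = static_cast<std::uint8_t> (kBOM & 0xFF); // 0xFF
    switch (enc) {
        case BOMEncoding::UTF8:
            // 0xfeff lies in 0x0800..0xFFFF, so UTF-8 uses its three-byte
            // form: 1110xxxx 10xxxxxx 10xxxxxx.
            return {static_cast<std::uint8_t> (0xE0 | (kBOM >> 12)),
                    static_cast<std::uint8_t> (0x80 | ((kBOM >> 6) & 0x3F)),
                    static_cast<std::uint8_t> (0x80 | (kBOM & 0x3F))};
        case BOMEncoding::UTF16LE:
            return {lo, hi}; // least significant byte first
        case BOMEncoding::UTF16BE:
            return {hi, lo}; // most significant byte first
        case BOMEncoding::UTF32LE:
            return {lo, hi, 0x00, 0x00};
        case BOMEncoding::UTF32BE:
            return {0x00, 0x00, hi, lo};
    }
    assert (false); // unreachable for valid enumerators
    return {};
}
```

A reader can compare the first bytes of a file against these sequences to recognize the encoding, which is why the same character serves as a marker in every case.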

Because of this encoding, if a 0xfeff character appears (after decoding) at the beginning of a buffer, there is no way for this routine to know whether it was REALLY there or whether it was a byte order mark. It's also not always desirable for the routine producing these encodings to emit the byte order mark, though sometimes it's highly desirable. So this class lets you get/set a flag to indicate whether or not to process BOMs on input, and whether or not to generate them on encoded outputs.

See also CodePageConverter::SetHandleBOM, and note that there is an overloaded CTOR that lets you specify CodePageConverter::eHandleBOM as a final argument to automatically set this BOM converter flag.

CodePageConverter::MapFromUNICODE_QuickComputeOutBufSize [public]

size_t CodePageConverter::MapFromUNICODE_QuickComputeOutBufSize (const wchar_t* inChars, size_t inCharCnt) const

Call this to get a reasonable upper bound on the buffer size to pass to MapFromUNICODE calls.
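To show what a quick upper bound of this kind can look like, here is a minimal sketch for one specific target encoding, UTF-8. It is not Led's formula, which this page does not document. It assumes the input is UTF-16 held in 'char16_t' (so the sketch behaves the same where 'wchar_t' is 32 bits wide), and the names 'QuickUTF8OutBufSize' and 'ExactUTF8Size' are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Cheap bound that never inspects the text: each UTF-16 code unit needs at
// most 3 bytes in UTF-8 (a surrogate pair is 2 units and 4 bytes), plus 3
// bytes in case a BOM is generated.
std::size_t QuickUTF8OutBufSize (std::size_t inCharCnt)
{
    return inCharCnt * 3 + 3;
}

// Exact UTF-8 byte count (without BOM), shown only for comparison.
// An unpaired surrogate is counted as 3 bytes in this sketch.
std::size_t ExactUTF8Size (const char16_t* inChars, std::size_t inCharCnt)
{
    assert (inChars != nullptr || inCharCnt == 0);
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < inCharCnt; ++i) {
        const char16_t c = inChars[i];
        if (c < 0x80) {
            bytes += 1;
        }
        else if (c < 0x800) {
            bytes += 2;
        }
        else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < inCharCnt &&
                 inChars[i + 1] >= 0xDC00 && inChars[i + 1] <= 0xDFFF) {
            bytes += 4; // surrogate pair encodes one character above 0xFFFF
            ++i;        // skip the low surrogate
        }
        else {
            bytes += 3;
        }
    }
    return bytes;
}
```

The quick version trades some wasted space for constant-time evaluation, which is usually the right trade when the buffer is short-lived.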

CodePageConverter::MapToUNICODE [public]

void CodePageConverter::MapToUNICODE (const char* inMBChars, size_t inMBCharCnt, wchar_t* outChars, size_t* outCharCnt) const

Map the given multibyte chars in the fCodePage code page into wide UNICODE characters. Pass in a buffer 'outChars' large enough to accommodate those characters.

On entry, 'outCharCnt' is the size of the output buffer; on return, it contains the number of UNICODE chars copied out.
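The in/out use of 'outCharCnt' can be sketched with a stand-in function that follows the same parameter convention. This is not Led's implementation: 'MapLatin1ToWide' and 'Latin1ToWString' are hypothetical names, the stand-in handles only ISO 8859-1 input, and truncating when the buffer is too small is an assumption of this sketch, not documented Led behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in with the same parameter convention as MapToUNICODE.
// Each ISO 8859-1 byte maps to the UNICODE character with the same value.
void MapLatin1ToWide (const char* inMBChars, std::size_t inMBCharCnt,
                      wchar_t* outChars, std::size_t* outCharCnt)
{
    assert (outCharCnt != nullptr);
    assert (inMBChars != nullptr || inMBCharCnt == 0);
    assert (outChars != nullptr || *outCharCnt == 0);
    // In: *outCharCnt is the buffer capacity, so never write past it.
    const std::size_t n = std::min (inMBCharCnt, *outCharCnt);
    for (std::size_t i = 0; i < n; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        outChars[i] = static_cast<wchar_t> (static_cast<unsigned char> (inMBChars[i]));
    }
    // Out: *outCharCnt is the number of characters actually written.
    *outCharCnt = n;
}

// Typical calling pattern: size the buffer, pass its capacity in, then
// shrink to the count that comes back.
std::wstring Latin1ToWString (const std::string& s)
{
    std::wstring result (s.size (), L'\0'); // one wide char per byte suffices here
    std::size_t  count = result.size ();
    MapLatin1ToWide (s.data (), s.size (), &result[0], &count);
    result.resize (count);
    return result;
}
```

Reusing one parameter for both capacity and result keeps the signature short, but the caller must remember to reset the count before each call.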

CodePageConverter::MapToUNICODE_QuickComputeOutBufSize [public]

size_t CodePageConverter::MapToUNICODE_QuickComputeOutBufSize (const char* /*inMBChars*/, size_t inMBCharCnt) const

Call this to get a reasonable upper bound on the buffer size to pass to CodePageConverter::MapToUNICODE calls.

CodePageConverter::SetHandleBOM [public]

void CodePageConverter::SetHandleBOM (bool handleBOM)

See also CodePageConverter::GetHandleBOM.


Class: CodePagePrettyNameMapper [public]

Description:

Maps numeric code pages to symbolic names appropriate for display in a user interface.


Class: CodePagesGuesser [public]

Description:

Guess the code page of the given argument text.

Member Details

CodePagesGuesser::Guess [public]

CodePage CodePagesGuesser::Guess (const void* input, size_t nBytes, Confidence* confidence, size_t* bytesFromFrontToStrip)

Guess the code page of the given snippet of text and return that code page. Always make some guess. The quality of the guess is returned in the optional parameter 'confidence', unless it is NULL (which it is by default). The number of bytes of BOM (byte-order-mark) prefix to strip from the source is returned in 'bytesFromFrontToStrip', unless it is NULL (which it is by default).
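The shape of such a routine, including the two optional out-parameters, can be sketched with a BOM-only guesser. This is a simplified illustration under stated assumptions, not the heuristic Led uses: 'CodePageNum', 'GuessConfidence' and 'GuessFromBOM' are hypothetical names, only three BOMs are recognized, and falling back to code page 1252 is an arbitrary choice for the example. The numbers 65001, 1200 and 1201 are the Win32 code page identifiers for UTF-8, UTF-16 little-endian and UTF-16 big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using CodePageNum = unsigned int;         // hypothetical alias for this sketch
enum class GuessConfidence { Low, High }; // hypothetical two-level scale

CodePageNum GuessFromBOM (const void* input, std::size_t nBytes,
                          GuessConfidence* confidence = nullptr,
                          std::size_t* bytesFromFrontToStrip = nullptr)
{
    assert (input != nullptr || nBytes == 0);
    static const unsigned char kUTF8[]    = {0xEF, 0xBB, 0xBF};
    static const unsigned char kUTF16LE[] = {0xFF, 0xFE};
    static const unsigned char kUTF16BE[] = {0xFE, 0xFF};
    struct Entry {
        const unsigned char* bom;
        std::size_t          len;
        CodePageNum          cp;
    };
    static const Entry kTable[] = {
        {kUTF8, sizeof (kUTF8), 65001},
        {kUTF16LE, sizeof (kUTF16LE), 1200},
        {kUTF16BE, sizeof (kUTF16BE), 1201},
    };
    for (const Entry& e : kTable) {
        if (nBytes >= e.len && std::memcmp (input, e.bom, e.len) == 0) {
            if (confidence != nullptr) {
                *confidence = GuessConfidence::High;
            }
            if (bytesFromFrontToStrip != nullptr) {
                *bytesFromFrontToStrip = e.len;
            }
            return e.cp;
        }
    }
    // No BOM found: still return some guess, but flag it as weak.
    if (confidence != nullptr) {
        *confidence = GuessConfidence::Low;
    }
    if (bytesFromFrontToStrip != nullptr) {
        *bytesFromFrontToStrip = 0;
    }
    return 1252; // arbitrary fallback chosen for this sketch
}
```

Callers that only need the code page can omit both trailing arguments, while callers that intend to decode the text should skip the reported number of leading bytes first.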


Class: CodePagesInstalled [public]

Description:

Helper class to check what code pages are installed on a given machine.

Member Details

CodePagesInstalled::GetAll [public]

const vector < CodePage > & CodePagesInstalled::GetAll ()

Returns a list of all code pages installed on the system. This list is returned in sorted order.
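One practical consequence of the sorted order is that a caller can test membership with a binary search instead of a linear scan. The sketch below shows this on a plain vector. 'CodePageNum' and 'IsInSortedCodePageList' are hypothetical names for this example, and nothing here is meant to describe how Led itself checks availability.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using CodePageNum = unsigned int; // hypothetical alias for this sketch

// Membership test that relies on the list being in ascending order.
bool IsInSortedCodePageList (const std::vector<CodePageNum>& sortedCodePages, CodePageNum cp)
{
    // binary_search gives wrong answers on unsorted input, so check the precondition.
    assert (std::is_sorted (sortedCodePages.begin (), sortedCodePages.end ()));
    return std::binary_search (sortedCodePages.begin (), sortedCodePages.end (), cp);
}
```

With a list of a few hundred code pages the difference is small, but the sorted guarantee also makes the list convenient to present in a user interface without further processing.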

CodePagesInstalled::GetDefaultCodePage [public]

CodePage CodePagesInstalled::GetDefaultCodePage ()

Returns the operating system's default code page. NOTE - this is NOT the same as the default code page Led will use, although Led will occasionally use it as its default. On Windows, this is basically a call to ::GetACP ().

CodePagesInstalled::IsCodePageAvailable [public]

bool CodePagesInstalled::IsCodePageAvailable (CodePage cp)

Checks if the given code page is installed.


Class: TableDrivenCodePageConverter < CODEPAGE > [public]

Description:

Helper class - probably should not be directly used.


Last Updated 2001-10-20