UTF-8 data with WIN32 'A' functions

Michael (michka) Kaplan [MS]

2005-01-27 14:57:24 UTC

Win32 "A" functions almost always convert their input to Unicode and call
the "W" functions. They usually use CP_ACP for the conversion, and because
of this they can never use UTF-8....
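
To make the effect concrete, here is a small portable sketch (deliberately not Win32 code) of what happens when UTF-8 bytes are decoded one byte at a time, the way a single-byte ANSI code page would decode them. The helper names are made up for illustration. Latin-1 stands in for CP_ACP here, on the assumption of a Western-European system; it agrees with Windows-1252 for bytes 0xA0 through 0xFF, which is all the example relies on.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a CP_ACP conversion on a Western-European
// system: every byte becomes exactly one code point (Latin-1 mapping).
std::vector<std::uint32_t> DecodeAsSingleByte(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    for (unsigned char b : bytes) {
        out.push_back(b);
    }
    return out;
}

// Minimal UTF-8 decoder for comparison. Assumes well-formed input with
// sequences of at most 3 bytes; no validation is attempted.
std::vector<std::uint32_t> DecodeAsUtf8(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    auto at = [&bytes](std::size_t i) -> std::uint32_t {
        return static_cast<unsigned char>(bytes[i]);
    };
    for (std::size_t i = 0; i < bytes.size();) {
        std::uint32_t b = at(i);
        if (b < 0x80) {                      // 1-byte sequence (ASCII)
            out.push_back(b);
            i += 1;
        } else if ((b & 0xE0) == 0xC0) {     // 2-byte sequence
            out.push_back(((b & 0x1F) << 6) | (at(i + 1) & 0x3F));
            i += 2;
        } else {                             // treated as a 3-byte sequence
            out.push_back(((b & 0x0F) << 12) |
                          ((at(i + 1) & 0x3F) << 6) |
                          (at(i + 2) & 0x3F));
            i += 3;
        }
    }
    return out;
}
```

For the two bytes C3 A9 (UTF-8 for U+00E9, "é"), DecodeAsUtf8 yields the single code point U+00E9, while DecodeAsSingleByte yields U+00C3 followed by U+00A9, which displays as "Ã©". Pure ASCII text comes out the same either way, which is why the problem often goes unnoticed in testing.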
--
MichKa [MS]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Microsoft Windows International Division

This posting is provided "AS IS" with
no warranties, and confers no rights.

Post by Yaron Alterman
Hi,
Is it possible to feed UTF-8 data to WIN32 ANSI functions? Do those
functions parse the data and check MBCS validity, or do they just look
for the '\0' at the end, in which case UTF-8 will be transparent to
them?
I have a good explanation for why I even ask this, if you will bear with
me :)
I'm working in a team which is developing a software library with
Unicode support.
Instead of having 2 targets, which are ANSI/Unicode by pre-processing,
I would like to have only one target, with 2 sets of internal
functions: Unicode and ANSI, just like WinNT WIN32 API does. For me
it's easier since I code in C++, in which case I can overload the
methods.
In the interface exposed to the client there will be only one
declaration per method, determined at pre-processing time with the TCHAR
trick.
Internally, both implementations will be compiled, and the linker will
pull the right implementation from my binary library (static lib) into
the client's executable.
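// Inside the library: both overloads are compiled.
class A
{
public:
    void Foo(char* str);
    void Foo(wchar_t* str);
};

// In the header exposed to the client: TCHAR selects one overload.
#include <tchar.h>
class A
{
public:
    void Foo(TCHAR* str);
};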
this works quite nicely :)
I wanted the core to be Unicode (just like NT WIN32), and have my ANSI
methods convert the data to Unicode, then delegate to the Unicode
methods, thus enabling me to build my entire library with the _UNICODE
flag.
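
A minimal sketch of that delegation pattern, under some loud assumptions: the Widen helper is hypothetical, it uses the portable std::mbstowcs with the current C locale, and Foo returns a length only so that the sketch has something observable. On Windows the widening step would more likely be done with MultiByteToWideChar and CP_ACP, which is what the "A" functions themselves usually do.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <string>

// Hypothetical helper: widen a narrow string using the current C locale.
// Returns an empty string if the input is not valid in that locale.
inline std::wstring Widen(const char* str) {
    std::size_t len = std::mbstowcs(nullptr, str, 0);  // query required length
    if (len == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }
    std::wstring wide(len, L'\0');
    std::mbstowcs(&wide[0], str, len);
    return wide;
}

class A {
public:
    // Unicode core: the real work would happen here. Returning the
    // length is an illustrative placeholder, not part of the original.
    std::size_t Foo(const wchar_t* str) { return std::wcslen(str); }

    // ANSI overload: convert to Unicode, then delegate to the core.
    std::size_t Foo(const char* str) { return Foo(Widen(str).c_str()); }
};
```

With this layout, A().Foo("abc") and A().Foo(L"abc") both end up in the wchar_t overload, so only one implementation of the actual logic has to be maintained.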
This works great on NT-based Windows. I have problems on Windows98
though, since all the WIN32 APIs which are translated to 'W' functions
don't work, or work badly, or even throw exceptions. Since the
Windows98 core is ANSI, there's no good support there for Unicode (or
is there???).
In an act of despair, the last trick I was thinking of was to have the
core compiled as _MBCS, with char* types in all my core interfaces.
In the outer APIs, my ANSI methods would do their stuff as usual, and
my Unicode methods would convert the data to UTF-8 and feed it to my
char* methods, which in turn might call a WIN32 API (an 'A' function
this time). This would enable support for Windows98, but I was
wondering what the effect would be.
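
For reference, the "convert the data to UTF-8" step could look roughly like the sketch below. The function name is hypothetical, std::u16string stands in for Windows' 16-bit wchar_t strings, and the input is assumed to be well-formed UTF-16. Keep in mind that, per the reply above, an 'A' function will normally decode whatever bytes it receives using CP_ACP, so only the ASCII part of such a string would be expected to survive the trip.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode UTF-16 text as UTF-8 bytes.
// Assumes well-formed input (surrogates are correctly paired).
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        // Combine a high/low surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            std::uint32_t low = in[++i];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        }
        if (cp < 0x80) {                 // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {         // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {       // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                         // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Every non-ASCII character becomes two or more bytes with the high bit set, and none of those bytes is '\0', so code that only scans for the terminator will pass the data through untouched; code that converts the bytes through a code page will not.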
If this doesn't work I will have to either fall back to 2
pre-processed targets, or give up support for Windows98.
Just wanted your opinions on this,
Thanks,
Yaron