cldr, locales and trauma

Posted by: sn00p

cldr, locales and trauma - 05/12/2009 06:28

I wonder if anybody can suggest how to do this.

I need to extract the LC_MONETARY structures from the CLDR (Unicode.org) data. They provide POSIX locale source files, but I'm unsure exactly how to extract the data from them.

If I run localedef, I can convert the source file into binary data, so an LC_MONETARY file appears. But this is in binary format, and there appears to be no way of dumping the information directly from the file. The only way would appear to be to install the locale and then use a small C program to extract the data into my structures.
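For reference, the "install the locale, then query it" route could be sketched roughly as below. This is in Python rather than C, on the grounds that locale.localeconv() wraps the same C library localeconv() call. The helper name dump_monetary and the MONETARY_KEYS tuple are made up for illustration, and the locale passed in is assumed to be installed on the system already.

```python
import locale

# Fields of struct lconv that belong to LC_MONETARY.
MONETARY_KEYS = (
    "int_curr_symbol", "currency_symbol", "mon_decimal_point",
    "mon_thousands_sep", "mon_grouping", "positive_sign", "negative_sign",
    "int_frac_digits", "frac_digits", "p_cs_precedes", "p_sep_by_space",
    "n_cs_precedes", "n_sep_by_space", "p_sign_posn", "n_sign_posn",
)


def dump_monetary(locale_name):
    """Return the LC_MONETARY fields of an installed locale as a dict.

    Hypothetical helper: switches the process locale, reads localeconv(),
    then restores whatever locale was active before the call.
    """
    previous = locale.setlocale(locale.LC_ALL)  # query only, no change
    try:
        locale.setlocale(locale.LC_ALL, locale_name)
        conv = locale.localeconv()
    finally:
        locale.setlocale(locale.LC_ALL, previous)
    return {key: conv[key] for key in MONETARY_KEYS}


if __name__ == "__main__":
    # The "C" locale always exists, so it is a safe smoke test.
    for key, value in dump_monetary("C").items():
        print(f"{key}={value!r}")
```

Passing a locale name that is not installed raises locale.Error, which is exactly the limitation described above: this only helps once the generated locale is visible to the C library.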

Does anybody know of a nice way to do what I want? The locale utility can dump the data in the form I want directly to the console, but it only works with installed locales, not some arbitrary locale file I have just generated.

I just want a dump of this data in UTF-8, and I've just wasted a day trying to figure out how to do it.

Ideas and solutions highly appreciated!

Adrian
Posted by: mlord

Re: cldr, locales and trauma - 05/12/2009 12:04

apt-get source xxxxx, where xxxxx is a program/library that knows how to parse locales?
Posted by: mlord

Re: cldr, locales and trauma - 05/12/2009 12:16

Yeah, perhaps this..

I just pulled down the source code for uClibc (a simple/complete C library), and the included locale.c source file may have much of what you want. Look at the function update_hr_locale(), perhaps.

I hope this helps, despite my lack of knowledge about locales in general.

:)
Posted by: sn00p

Re: cldr, locales and trauma - 07/12/2009 10:51

Thanks Mark,

I solved the problem by parsing the UTF-8.cm file to extract the UTF-8 mappings. I then loaded each .UTF-8.src file in turn and extracted the LC_MONETARY information, applying the UTF-8 mappings where appropriate.
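A rough Python sketch of that two-step approach, for anyone who lands here later. It is not the actual code used. It assumes the charmap lists one "<NAME> \xHH\xHH" entry per line between CHARMAP and END CHARMAP, and that the locale source keeps one "keyword value" pair per line between LC_MONETARY and END LC_MONETARY. The function names and sample data are invented for illustration, and continuation lines and charmap ranges are not handled.

```python
import re

SYMBOL = re.compile(r"<([^>]+)>")                  # e.g. <U00A3>
HEX_BYTE = re.compile(r"[\\/]x([0-9A-Fa-f]{2})")   # \xC2 or /xC2


def parse_charmap(text):
    """Map symbolic names (without angle brackets) to decoded characters."""
    mapping = {}
    in_map = False
    for line in text.splitlines():
        line = line.strip()
        if line == "CHARMAP":
            in_map = True
            continue
        if line == "END CHARMAP":
            break
        if not in_map or not line.startswith("<"):
            continue
        parts = line.split()
        # Skip malformed entries and "<A>..<B>" ranges (not handled here).
        if len(parts) < 2 or ".." in parts[0] or not parts[0].endswith(">"):
            continue
        raw = bytes(int(h, 16) for h in HEX_BYTE.findall(parts[1]))
        if raw:
            mapping[parts[0][1:-1]] = raw.decode("utf-8", errors="replace")
    return mapping


def parse_lc_monetary(text, mapping):
    """Return LC_MONETARY keywords with <NAME> symbols replaced by characters."""
    fields = {}
    in_section = False
    for line in text.splitlines():
        line = line.strip()
        if line == "LC_MONETARY":
            in_section = True
            continue
        if line == "END LC_MONETARY":
            break
        if not in_section or not line or line.startswith(("%", "#")):
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        value = parts[1].strip()
        if value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        # Unknown symbols are left untouched rather than guessed at.
        fields[parts[0]] = SYMBOL.sub(
            lambda m: mapping.get(m.group(1), m.group(0)), value)
    return fields


# Made-up sample input in the assumed format.
SAMPLE_CHARMAP = r"""
<code_set_name> "UTF-8"
CHARMAP
<U00A3> \xC2\xA3
<U002E> \x2E
END CHARMAP
"""

SAMPLE_SRC = """
LC_MONETARY
currency_symbol "<U00A3>"
mon_decimal_point "<U002E>"
frac_digits 2
END LC_MONETARY
"""

if __name__ == "__main__":
    charmap = parse_charmap(SAMPLE_CHARMAP)
    print(parse_lc_monetary(SAMPLE_SRC, charmap))
```

All values come back as strings, so numeric keywords such as frac_digits would still need converting before they go into a struct.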

I now have all the data I need.

Adrian