JIS and other Multibyte language support.

Posted by: xanatos

JIS and other Multibyte language support. - 23/05/2003 15:45

One thing I would like to see is support for multibyte characters in ID3 tags, displayed readably by the player. I have a substantial Asian CD collection and, as is to be expected, many of the discs use multibyte characters for song titles, artists, albums, etc. It gets kind of tiring typing all of these out in standard letters. I know some of these languages can have thousands of characters, so if there were some way to read glyphs from a TTF or OTF file and display them on the screen, that would probably make this much easier to implement. I know that we'll most likely never see this, but I can wish, eh? My primary concern is Japanese and Korean; other languages could follow from there.
Posted by: tman

Re: JIS and other Multibyte language support. - 23/05/2003 16:09

I think it's been discussed before and the main problem was actually making a font that was visible. You don't get much space for each character in the small font sizes. The screen is only 128x64 and that's if you use the entire screen for one line.

- Trevor
Posted by: xanatos

Re: JIS and other Multibyte language support. - 23/05/2003 16:37

True. With the medium font you could render most of the simpler characters, and as special needs arise, a bigger font could be used for the more complex ones. It would be interesting to see what would need to be changed to accommodate this.
Posted by: tman

Re: JIS and other Multibyte language support. - 23/05/2003 16:50

You could rig something up like Emphatic that reads the data and then displays it.
Probably would be the easiest way to integrate it into the empeg player.

- Trevor
Posted by: wfaulk

Re: JIS and other Multibyte language support. - 23/05/2003 19:14

This web site has (non-free) Japanese bitmap fonts (Chinese and Korean, too) that are as small as 9x9. Go figure.
Posted by: Roger

Re: JIS and other Multibyte language support. - 24/05/2003 02:43

v3.0 will sport a database that supports UTF-8. We still need a font, and the necessary changes to emplode still need to be done.
Posted by: peter

Re: JIS and other Multibyte language support. - 24/05/2003 04:10

> True. With the medium font you could render most of the simpler characters, and as special needs arise, a bigger font could be used for the more complex ones. It would be interesting to see what would need to be changed to accommodate this.

We've already sort-of done this. The forthcoming Pearl portable supports kanji, and all the UTF-8 and font-plotting code was prototyped on a car-player. Two lines of existing car-player code needed to be changed for plotting to work. (This was with a 13-pixel-high Kanji font as "medium.bf", which again the screen layout code just dealt with.)
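
As a rough sanity check on the sizes mentioned in this thread, here is a small Python sketch (not player code) of how many fixed-size glyph cells fit on the 128x64 display. The 9x9 cell comes from the bitmap fonts mentioned earlier; treating the 13-pixel-high font as a square 13x13 cell is an assumption, and the sketch ignores inter-glyph spacing and any screen area the layout reserves for other things, so the results are upper bounds.

```python
# Display size quoted earlier in the thread.
SCREEN_W, SCREEN_H = 128, 64

def glyph_grid(cell_w, cell_h):
    """Return (characters per line, lines) for a fixed-size glyph cell."""
    return SCREEN_W // cell_w, SCREEN_H // cell_h

# 9x9: the smallest bitmap font size mentioned in the thread.
print(glyph_grid(9, 9))    # (14, 7)

# 13x13: ASSUMPTION, a square cell for the 13-pixel-high font.
print(glyph_grid(13, 13))  # (9, 4)
```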

The remaining issues are Emplode, the search windows (T9ing for kana and other nonideographic character sets is easy, but T9ing for kanji isn't -- how does that sort of thing usually work?), Infotex visuals, and, possibly worst of all, the sheer size of Unicode fonts. For the prototyping I mentioned, I borrowed Mike's Mark 2a (with 16MB); original Mark 2s like mine (12MB) and Mark 1s (8MB) will probably never fit Unicode fonts.
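
To illustrate why the kana case is the easy one, here is a minimal Python sketch of one-press-per-character matching for hiragana. It assumes the usual Japanese phone keypad layout (one key per kana row); `KEYPAD_ROWS`, `kana_to_keys` and `search` are hypothetical names for illustration and have nothing to do with the player's actual search code. Katakana and kanji are not handled.

```python
import unicodedata

# ASSUMPTION: the usual Japanese phone keypad, one key per kana row.
# Small kana share the key of their full-size counterpart.
KEYPAD_ROWS = {
    "1": "あいうえおぁぃぅぇぉ",
    "2": "かきくけこ",
    "3": "さしすせそ",
    "4": "たちつてとっ",
    "5": "なにぬねの",
    "6": "はひふへほ",
    "7": "まみむめも",
    "8": "やゆよゃゅょ",
    "9": "らりるれろ",
    "0": "わをん",
}
KANA_TO_KEY = {kana: key for key, row in KEYPAD_ROWS.items() for kana in row}

def kana_to_keys(text):
    """Map a hiragana string to the digit sequence that would select it."""
    # NFD splits voiced kana (e.g. が) into the base kana plus a combining
    # mark; the mark is not in the table, so it is simply skipped.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(KANA_TO_KEY[ch] for ch in decomposed if ch in KANA_TO_KEY)

def search(titles, digits):
    """Return the titles whose key sequence starts with the typed digits."""
    return [t for t in titles if kana_to_keys(t).startswith(digits)]

titles = ["さくら", "がっこう", "すきやき"]
print(kana_to_keys("さくら"))  # 329
print(search(titles, "32"))    # ['さくら', 'すきやき']
```

Each keypress narrows the candidates by one kana row, which is the same ambiguity trade-off as T9 on Latin letters. Kanji have no such small fixed table, which is the hard part raised above.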

Peter
Posted by: peter

Re: JIS and other Multibyte language support. - 24/05/2003 04:22

> many of these use multi-byte characters for the titles of the songs, artists, albums, etc.

Oh, while we're on the subject: if you're actually in East Asia, could you possibly go round to the houses of anyone who writes tagging software that puts JIS or other non-global encodings in ID3v2 headers but sets the header bits meaning the tags are ISO-8859-1, and hit them with big wet fish? Ta.

ID3v2 allows for UTF-16 and UTF-8 for a good reason: so that the same tag means the same thing to everyone in the world, without having to know the locale of the PC it was written on (which you can't know). It's not rocket science, CJKV tag editing software folks, and it's to your own advantage too, as it means you stand a fighting chance of seeing accented characters, Cyrillic etc. if you ever get your hands on MP3s from France, Russia, Greece, Eastern Europe or any other not-just-ASCII part of the world.
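
To make the problem concrete, here is a minimal Python sketch of decoding an ID3v2 text frame body by its leading encoding byte (0 = ISO-8859-1, 1 = UTF-16 with BOM; ID3v2.4 adds 2 = UTF-16BE and 3 = UTF-8). `decode_text_frame` and `recover_mislabelled` are hypothetical helpers for illustration, not anyone's real tagging code, and the recovery step only works if you guess the writer's locale correctly, which is exactly the thing a reader cannot know.

```python
# Encoding byte at the start of an ID3v2 text frame body.
ID3_TEXT_ENCODINGS = {
    0: "latin-1",    # ISO-8859-1
    1: "utf-16",     # UTF-16 with a byte-order mark
    2: "utf-16-be",  # UTF-16 big-endian, no BOM (ID3v2.4 only)
    3: "utf-8",      # UTF-8 (ID3v2.4 only)
}

def decode_text_frame(body: bytes) -> str:
    """Decode a text frame body (e.g. TIT2) as its encoding byte declares."""
    codec = ID3_TEXT_ENCODINGS[body[0]]
    return body[1:].decode(codec).rstrip("\x00")  # drop any terminator

def recover_mislabelled(text: str, guessed_codec: str = "shift_jis") -> str:
    """Undo a bogus ISO-8859-1 decode. The codec is a GUESS about the writer."""
    return text.encode("latin-1").decode(guessed_codec)

title = "さくら"

# Correctly labelled: means the same thing on every machine.
good = b"\x03" + title.encode("utf-8")
print(decode_text_frame(good) == title)   # True

# Mislabelled: Shift-JIS bytes flagged as ISO-8859-1 decode to gibberish.
bad = b"\x00" + title.encode("shift_jis")
garbled = decode_text_frame(bad)
print(garbled == title)                       # False
print(recover_mislabelled(garbled) == title)  # True, but only with the right guess
```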

[/rant], sorry

Peter
Posted by: xanatos

Re: JIS and other Multibyte language support. - 24/05/2003 13:23

Wow, I had no clue all of this had already been thought of, and in one way or another is already being worked on. Goes to show you that this community is alive and well. I can't wait until the Pearl hits; I'm definitely going to get one. It will also be nice to have all of my Japanese CDs ready to go on the empeg and Pearl using kana and kanji (kanji do have a way to be searched, and an "order" by which to search, although I'm not sure how that order actually works!).
Posted by: larry818

Re: JIS and other Multibyte language support. - 31/05/2003 07:35

> I think it's been discussed before and the main problem was actually making a font that was visible. You don't get much space for each character in the small font sizes. The screen is only 128x64 and that's if you use the entire screen for one line.

This may not be such a problem.

Entire words in Chinese are just one character. For example, "Valen Hsu" would be three characters (for her real Chinese name), and "Beautiful Dream Comes True" is only four characters. So the characters could be drawn bigger than normal.

Korean is phonetic, but then, their characters are as simple as ours and would be just as readable.

I don't know much about Japanese, since they have such a mix of systems.

Now of course for Thailand, the empeg would have to write backwards.