Just wanted to post a follow-up here with more information on the messed-up filename situation with regard to accented characters. I still haven't found an automated way to make corrections, but I now know how to find the problem files, even if it means searching for one problem type at a time.

My tag-to-filename conversion fixed a lot of the problems, but unfortunately it didn't take care of everything, likely because the problem characters also existed in the tags.

I did find a command-line Linux program called convmv which sounded promising, but I wasn't able to get it to do anything useful at all. As is sometimes the case, the author seems to have been so focused on his own need for the program that he didn't provide decent documentation or examples for everyone else.

Anyway, the filename problem occurs because the accented characters aren't stored as single Unicode characters. Instead, each one is made up of a plain base letter followed by one or more Unicode Combining Characters.
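As a rough illustration of the difference, here is a minimal Python sketch using the standard unicodedata module. The letter "é" is just an example chosen for this sketch, not one taken from my actual files.

```python
import unicodedata

# "é" stored as a single Unicode character (U+00E9).
precomposed = "\u00e9"

# "é" stored as a plain "e" followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "e\u0301"

# Both look the same on screen, but they are different strings.
print(precomposed == decomposed)           # False
print(len(precomposed), len(decomposed))   # 1 2

# unicodedata.combining() returns a non-zero class for combining characters.
print(unicodedata.combining("\u0301") != 0)  # True

# NFC normalization merges the pair back into the single character.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)           # True
```

Software that compares names character by character will treat the two forms as different files, which fits the behaviour described here.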

A list of Unicode Combining Characters can be found, funnily enough, on this US government page about "MARC standards". They're at the bottom of the list.

You can see that each combining character is two bytes by itself. These get added after a one- or two-byte letter to create a visual representation of the accented character.
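To check the byte counts, here is a small Python sketch. It assumes the filenames are encoded as UTF-8 and uses the acute accent (U+0301) purely as an example.

```python
# Byte counts in UTF-8 for a base letter, a combining accent, and both forms of "é".
base = "e"                  # plain ASCII letter
accent = "\u0301"           # COMBINING ACUTE ACCENT
decomposed = base + accent  # letter followed by the combining character
precomposed = "\u00e9"      # single-character "é"

base_bytes = len(base.encode("utf-8"))
accent_bytes = len(accent.encode("utf-8"))
decomposed_bytes = len(decomposed.encode("utf-8"))
precomposed_bytes = len(precomposed.encode("utf-8"))

print(base_bytes)         # 1
print(accent_bytes)       # 2
print(decomposed_bytes)   # 3
print(precomposed_bytes)  # 2
```

So the combined form is not only a different sequence of characters, it is also a different length on disk.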

These should NOT be used in filenames as far as I can tell. Files named this way won't zip, and they don't work with plenty of software, because the characters returned aren't what's expected. This is why iTunes fails to read these files.

To help me locate affected filenames, I've made a file containing all these characters (using a hex editor) and labeled them accordingly. I load this file into Notepad, copy one combining character, and paste it into a file search field. Windows will happily find the files that use that combining character. Then I make sure the tags are OK and perform either a manual or a tag-to-filename rename.
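The search could probably be scripted instead of done one character at a time. The following is only a sketch in Python, not something I have run against my own library. The function names (has_combining, find_affected, suggested_name) are made up for this example, and it only reports files, it doesn't rename anything.

```python
import os
import unicodedata


def has_combining(name: str) -> bool:
    """Return True if the name contains any Unicode combining character."""
    return any(unicodedata.combining(ch) for ch in name)


def suggested_name(name: str) -> str:
    """Return the NFC form of the name, with letter+accent pairs merged where possible."""
    return unicodedata.normalize("NFC", name)


def find_affected(root: str):
    """Yield (full path, suggested filename) for every affected file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if has_combining(name):
                yield os.path.join(dirpath, name), suggested_name(name)


# Example use (the folder path is hypothetical):
# for path, new_name in find_affected("D:/Music"):
#     print(path, "->", new_name)
```

Not every letter and accent combination has a single-character equivalent, so the suggested name can still contain combining characters and is worth checking by eye before renaming anything.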

I've attached the combining_characters.txt file to this post, mostly just in case I need it again and lose my local copy. :)

I've also now found at least one source of the problem. An iTunes script for populating the Track name from file name will create these combining characters after reading the otherwise correct filenames.


Attachments
combining_characters.txt

_________________________
Bruno
Twisted Melon : Fine Mac OS Software