For that market, anything that might possibly confuse the user is a problem - a clever system might categorise one way for one CD, and another way for the next.

I understand what you're saying, but I was saying that the very problem you're afraid of could theoretically be solved with good software.

My idea was to write a piece of software that was so good at determining the organization/categorization from the various possible ways of naming/tagging files, that the end-user would never have to choose or be confused by it. It would be completely transparent and work almost all the time. Provided that the disk you fed it had at least some semblance of organization (as opposed to just a random selection of files without tags or proper file names), I think it's theoretically possible.

Of course, this is easier said than done. But if you've got a relatively recent copy of the CDDB as one of your possible pieces of reference material, I believe it could be done.

For instance, when faced with a loose MP3 file that has no tag and is named:

"Some words - Some words - Some words.mp3"

the software has no way of determining, all by itself, which of those "words" is an artist, which is an album, and which is a track title. However, a quick scan through the CDDB (with some fuzzy matching to get around misspellings and "The" discrepancies) would, in most cases, give a high-probability match telling you which field is the artist and which is the album. Even better, if you also checked the track length of that MP3, you might be able to get an exact match and populate its track number field.
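To make that a bit more concrete, here's a rough Python sketch of the fuzzy lookup. It's only an illustration: the two short lists are hypothetical stand-ins for names pulled from a local CDDB copy, the function names are mine, and the similarity measure is Python's difflib rather than anything CDDB-specific.

```python
import difflib
import re

# Hypothetical stand-ins for artist and album names pulled from a local CDDB copy.
KNOWN_ARTISTS = ["The Beatles", "Pink Floyd"]
KNOWN_ALBUMS = ["Abbey Road", "The Wall"]

def normalize(text):
    """Lowercase, drop a leading 'The', and strip punctuation."""
    text = re.sub(r"^the\s+", "", text.lower().strip())
    return re.sub(r"[^a-z0-9 ]", "", text)

def best_score(field, candidates):
    """Highest similarity (0.0 to 1.0) between a field and any candidate."""
    target = normalize(field)
    return max(
        difflib.SequenceMatcher(None, target, normalize(c)).ratio()
        for c in candidates
    )

def guess_roles(filename):
    """Return a dict of artist/album/title, or None if there are not three fields."""
    stem = re.sub(r"\.mp3$", "", filename, flags=re.IGNORECASE)
    fields = [f.strip() for f in stem.split(" - ")]
    if len(fields) != 3:
        return None
    positions = [0, 1, 2]
    # The field that looks most like a known artist is taken as the artist.
    artist_pos = max(positions, key=lambda i: best_score(fields[i], KNOWN_ARTISTS))
    positions.remove(artist_pos)
    # Of the two fields left, the one closest to a known album is the album.
    album_pos = max(positions, key=lambda i: best_score(fields[i], KNOWN_ALBUMS))
    positions.remove(album_pos)
    return {
        "artist": fields[artist_pos],
        "album": fields[album_pos],
        "title": fields[positions[0]],
    }
```

Calling guess_roles("Abby Road - Beatles - Come Together.mp3") should sort the fields out even though the album is misspelled, the "The" is missing, and the artist isn't in the first position. A real version would search the whole database rather than two tiny lists, and would want to check that artist and album actually belong to the same disc.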

The above example could be extrapolated to directory names and poorly or partially tagged files. It's also more complicated than the examples I've cited: what if the artist name itself contains a hyphen? But I believe that some really clever parsing software, combined with database searches, could do it.
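One possible way to handle the hyphen problem, sketched in Python: try every way of grouping the hyphen-separated pieces into three fields, and keep the grouping whose artist and album look most like something in the database. Again, the lists are hypothetical stand-ins for a local CDDB copy and the function names are made up for the example.

```python
import difflib
import itertools

# Hypothetical stand-ins for names pulled from a local CDDB copy.
KNOWN_ARTISTS = ["Blink-182", "Pink Floyd"]
KNOWN_ALBUMS = ["Enema of the State", "The Wall"]

def similarity(field, candidates):
    """Highest similarity (0.0 to 1.0) between a field and any candidate."""
    return max(
        difflib.SequenceMatcher(None, field.lower(), c.lower()).ratio()
        for c in candidates
    )

def split_with_hyphens(stem):
    """Return the best (artist, album, title) grouping, or None if too few pieces."""
    tokens = [t.strip() for t in stem.split("-")]
    best, best_score = None, -1.0
    # Choose two cut points; everything between cuts is rejoined with hyphens.
    # Note that rejoining drops any spaces that surrounded the original hyphens.
    for i, j in itertools.combinations(range(1, len(tokens)), 2):
        artist = "-".join(tokens[:i])
        album = "-".join(tokens[i:j])
        title = "-".join(tokens[j:])
        score = similarity(artist, KNOWN_ARTISTS) + similarity(album, KNOWN_ALBUMS)
        if score > best_score:
            best, best_score = (artist, album, title), score
    return best
```

For a name like "Blink-182-Enema of the State-All the Small Things", the grouping that keeps "Blink-182" together scores highest, because both the artist and the album then match the database exactly. This assumes the fields appear in artist, album, title order; combining it with the role guessing described earlier is left out to keep the sketch short.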

As the CDDB copy gets older and more outdated, this would work less and less reliably, so that's a problem that would need to be dealt with. You'd need to make sure the software failed gracefully when it couldn't find a CDDB match, and just made the best educated guess it could.
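A minimal Python sketch of what that graceful failure might look like: if no field matches a known artist closely enough, fall back to assuming the common "Artist - Album - Title" order and mark the result as a guess. The 0.8 cutoff, the fallback order, and the function name are all assumptions of mine, not anything from CDDB.

```python
import difflib

# Assumed cutoff: below this similarity, a database hit is not trusted.
MATCH_THRESHOLD = 0.8

def guess_tags(stem, known_artists):
    """Tag from the database when possible, otherwise from field position."""
    fields = [f.strip() for f in stem.split(" - ")]
    if len(fields) != 3:
        # Not enough structure to work with: keep the whole name as the title.
        return {"artist": None, "album": None, "title": stem.strip(), "source": "guess"}
    lowered = [a.lower() for a in known_artists]
    for index, field in enumerate(fields):
        if difflib.get_close_matches(field.lower(), lowered, n=1, cutoff=MATCH_THRESHOLD):
            rest = fields[:index] + fields[index + 1:]
            return {"artist": field, "album": rest[0], "title": rest[1], "source": "database"}
    # No database hit: assume the conventional Artist - Album - Title order.
    return {"artist": fields[0], "album": fields[1], "title": fields[2], "source": "guess"}
```

The "source" value is there so the player could treat guessed tags differently from matched ones if it wanted to, without ever having to ask the user anything.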
_________________________
Tony Fabris