Are you sure the RID is going to come out the same for two slightly different masterings of the same song? For instance, a version of a song that got remastered for a greatest hits collection. If I understand the way the RIDs are calculated, even a slight difference in volume would cause a different RID. Or am I wrong about that?
You're correct. In fact, even a difference in encoding (an upgrade of LAME, for instance, or different LAME options, or the use of mp3gain) would cause a different RID.

Deduplication by RID is a solved problem -- we do it for the Karma. Toby and the others in this thread are talking about deduplication when the RID is different. It's effectively impossible to do automatically(*), but right now there aren't even tools to help you do it manually. It's not IMO a player-side problem, it's a PC-side problem (though it's easiest to patch up when your PC-side filesystem supports symbolic links).
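For the easy case, where the bytes really are identical, a rough PC-side sketch might look like the following. To be clear about the assumptions: this is not the actual RID calculation (it just uses an MD5 of the whole file as a stand-in), and the function names content_id, find_exact_duplicates and replace_with_symlinks are made up for illustration. It only shows the general shape of "group by ID, then patch up with symbolic links".

```python
import hashlib
import os
from collections import defaultdict


def content_id(path, chunk_size=65536):
    """Stand-in for a RID: an MD5 over the whole file.

    Assumption: the real RID is computed differently; any stable
    content hash is enough to illustrate the grouping step.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return lists of paths under root whose contents hash identically."""
    by_id = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # already patched up, don't count it again
            by_id[content_id(path)].append(path)
    return [sorted(paths) for paths in by_id.values() if len(paths) > 1]


def replace_with_symlinks(group):
    """Keep the first file in the group and symlink the rest to it.

    Only works where the filesystem supports symbolic links.
    """
    keep, *rest = group
    for path in rest:
        os.remove(path)
        os.symlink(os.path.abspath(keep), path)
```

Calling find_exact_duplicates on your music directory gives you the groups; you would want to eyeball them before passing any of them to replace_with_symlinks, since that step deletes files. Note that this does nothing at all for the remastered or re-encoded case, which is the hard part.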

(The "ve" in that comment wasn't "ve at Empeg", it was just meant to be me getting all Dr Strangelove on the ass of my own music collection.)

I've said this before, but IMO tools that let you get a firm grip on a large music collection are the next great opportunity to do much better than the current state of the art in digital music software.

Peter

(*) "Go" and "Go (Woodtick Mix)" are the same song, but "Leo Leo" and "Leo Leo (Featuring Raagman)" aren't the same song, despite being exactly the same length and indeed binary-identical as WAVs for the first and last 10s.
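As a sketch of what a manual helper could do with titles like these: strip a trailing parenthesised suffix and group whatever collides, then leave the decision to a human. The names base_title and candidate_groups are hypothetical, and the heuristic is deliberately naive. It puts both pairs above into candidate groups, which is exactly why it can only suggest, not decide.

```python
import re
from collections import defaultdict


def base_title(title):
    """Drop one trailing "(...)" or "[...]" suffix and fold case."""
    stripped = re.sub(r"\s*[\(\[][^\)\]]*[\)\]]\s*$", "", title).strip()
    # If the whole title was parenthesised, keep it rather than return "".
    return (stripped or title.strip()).casefold()


def candidate_groups(titles):
    """Group titles sharing a base title; these are candidates only."""
    groups = defaultdict(list)
    for title in titles:
        groups[base_title(title)].append(title)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    titles = [
        "Go",
        "Go (Woodtick Mix)",
        "Leo Leo",
        "Leo Leo (Featuring Raagman)",
    ]
    for group in candidate_groups(titles):
        print("Same song? ->", group)
```

A real tool would presumably show more than the title (length, album, maybe a quick listen) before asking the question.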