In reply to:

Wow .. performance goes WAY up this way... I took 1000 byte block every 50000 byte interval (roughly 100 samples on a normal MP3) and it dropped from 8 seconds to 300 ms. I'd love to know, mathematically, how this affects the odds of a collision.




Well, it's been a while since I've gone to a statistics and probability class, but here's my best guess:

1/(2^32) is the chance of a collision on a regular CRC32 check.
You would multiply this by the probability of a 50000 byte data stream that is the same in the first 1000 bytes and different in the rest.
Then each segment lowers the probability of a collision. So if you take an
x byte block every y byte interval, with z samples, then the probability of a collision is:

[ 1/(2^32) ] * [ (yz)/(y-x) ]

in other words, for your example, [(50000*100)/(49000)] * [1/(2^32)]
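If you want to play with the numbers, here is a minimal Python sketch of the sampling scheme and of the guess above. The function names (sampled_crc32, estimated_collision_odds) are made up for illustration, the formula is just the guess from this post and not a verified result, and it assumes the running checksum is built with zlib.crc32 from the standard library.

```python
import zlib


def sampled_crc32(data: bytes, block: int = 1000, interval: int = 50000) -> int:
    """CRC32 over the first `block` bytes of every `interval`-byte stretch."""
    crc = 0
    for start in range(0, len(data), interval):
        # Feed each sampled block into the running CRC.
        crc = zlib.crc32(data[start:start + block], crc)
    return crc


def estimated_collision_odds(x: int, y: int, z: int) -> float:
    """The guess from this post: [1/(2^32)] * [(y*z)/(y-x)]."""
    return (y * z) / (y - x) / 2**32


# x = 1000 byte block, y = 50000 byte interval, z = 100 samples
odds = estimated_collision_odds(1000, 50000, 100)
print(odds)
```

For the example numbers this comes out to roughly 2.4e-8. One thing the sketch makes easy to see: two files that differ only in bytes that are never sampled get the same sampled checksum, so that case probably matters more than the CRC32 odds themselves.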

Hahah. Hope that makes sense to you, cuz it surely doesn't to me =)