Yeah, I was never particularly happy with its treatment of the Sledgehammer intro.

It currently operates on a fixed proportional change per unit time. I think it's about a 3 dB increase or a 6 dB decrease per second at the moment.
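To make the "fixed proportional change per unit time" concrete, here is a minimal sketch of a rate-limited gain update. The function name, its parameters, and the reading of dB as amplitude dB (20*log10) are assumptions for illustration, not the plugin's actual code.

```python
def step_gain(current, target, block_seconds,
              up_db_per_s=3.0, down_db_per_s=6.0):
    """Move `current` gain toward `target` for one block of audio.

    Hypothetical helper: the rate limits are the rough figures quoted
    in the post, and dB is assumed to mean amplitude dB (20*log10).
    """
    if current < target:
        # Rising: multiply by at most +up_db_per_s dB per second.
        factor = 10.0 ** (up_db_per_s * block_seconds / 20.0)
        return min(current * factor, target)
    # Falling (or equal): multiply by at most -down_db_per_s dB per second.
    factor = 10.0 ** (-down_db_per_s * block_seconds / 20.0)
    return max(current * factor, target)
```

Calling this once per block with that block's duration gives the same proportional change per second regardless of block size.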

It linearly maps the current volume from the range 0 to 1 onto roughly 0.1 to 1. That means that input volumes of 0.01 and 0.001 will both end up (given enough time for the gradual increase to finish doing its stuff) very close to an output value of 0.1, so the 10:1 difference between them is lost. I have been thinking about how to do it differently.
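A quick sketch of that linear mapping shows the problem. The names here are made up for illustration, and 0.1 is only the approximate floor mentioned above.

```python
FLOOR = 0.1  # approximate minimum output level (assumed from the post)


def linear_target(volume):
    """Map an input volume in [0, 1] linearly onto [FLOOR, 1]."""
    return FLOOR + (1.0 - FLOOR) * volume


quiet = linear_target(0.01)          # about 0.109
very_quiet = linear_target(0.001)    # about 0.1009
ratio = quiet / very_quiet           # about 1.08, although the inputs differ by 10x
```

So two inputs 20 dB apart come out less than 1 dB apart once the gain has settled.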

Here's one idea:
Currently, the silence detection threshold is at whatever input value gives a desired output multiplier of 128. As it's scaling to around 0.1 for a minimum output value, that means "silence" is anything smaller than about 0.001. If we then attempt to maintain the ratios between the scaled outputs, so that each factor of 10 in the input becomes the same ratio in the output, then 1 -> 1, 0.1 -> 0.46, 0.01 -> 0.22, 0.001 -> 0.1 (approximately).
Basically, in this case, output_volume = exp(ln(input_volume) / 3), i.e. the cube root of the input volume.
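Here is a small floating-point sketch of that mapping, mainly to show where the 3 comes from: it is ln(silence) / ln(floor) for a silence threshold of 0.001 and a floor of 0.1. The function and parameter names are hypothetical, and treating non-positive input as silence is an assumption.

```python
import math


def log_target(volume, silence=0.001, floor=0.1):
    """Map `volume` so that `silence` -> `floor` and 1 -> 1,
    keeping equal input ratios as equal output ratios."""
    if volume <= 0.0:
        return 0.0  # assumption: treat non-positive input as silence
    exponent = math.log(floor) / math.log(silence)  # 1/3 for the defaults
    return math.exp(math.log(volume) * exponent)


# Each factor of 10 in the input now scales the output by about 0.464.
levels = [log_target(v) for v in (1.0, 0.1, 0.01, 0.001)]
```

With the defaults this is just the cube root; other thresholds would give a different exponent.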

The only problem with this is that it needs ln and exp. Does anyone have any (hopefully fairly fast) fixed-point routines for them?

Actually, I only call this once per 4608-byte block, so perhaps they don't need to be that fast. I might try to roll my own approximations.
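One possible shape for home-made routines, sketched in Python with plain integers: it swaps ln/exp for log2/exp2, which gives the same result because exp(ln(x) / 3) = 2^(log2(x) / 3). The Q16.16 format, the function names, and the table-driven exp2 are all assumptions for illustration, not a tuned implementation.

```python
FRAC = 16          # Q16.16: 16 fractional bits (assumed format)
ONE = 1 << FRAC

# Constants 2^(2^-k) in Q16.16 for k = 1..16. Built with floats here for
# brevity; in C this would be a precomputed table of integers.
EXP2_TABLE = [round(2.0 ** (2.0 ** -k) * ONE) for k in range(1, FRAC + 1)]


def fx_log2(x):
    """log2 of a positive Q16.16 value, returned as signed Q16.16."""
    if x <= 0:
        raise ValueError("log2 needs a positive input")
    result = 0
    # Normalise x into [1, 2), accumulating the integer part of the log.
    while x < ONE:
        x <<= 1
        result -= ONE
    while x >= 2 * ONE:
        x >>= 1
        result += ONE
    # Fractional bits: square x; reaching 2 or more means this bit is set.
    bit = ONE >> 1
    while bit:
        x = (x * x) >> FRAC   # in C this product needs a 64-bit intermediate
        if x >= 2 * ONE:
            x >>= 1
            result += bit
        bit >>= 1
    return result


def fx_exp2(y):
    """2**y for a signed Q16.16 exponent, returned as Q16.16."""
    n = y >> FRAC              # floor(y); Python's >> floors negatives
    frac = y & (ONE - 1)       # fractional part, 0 <= frac < 1
    result = ONE
    for k, factor in enumerate(EXP2_TABLE, start=1):
        if frac & (ONE >> k):  # bit with weight 2^-k is set
            result = (result * factor) >> FRAC
    return result << n if n >= 0 else result >> -n


def fx_cube_root(x):
    """x ** (1/3) for a positive Q16.16 value, via exp2(log2(x) / 3)."""
    return fx_exp2(fx_log2(x) // 3)
```

Each call costs a few dozen integer multiplies and shifts, which should be cheap at once per block. Note that an input as small as 0.001 has only a handful of significant bits in Q16.16, so more fractional bits may be worth having for very quiet input.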

Richard.