Well, last night I completed phase two of my volume adjustment experiment, and once I ironed out some annoying bugs, I'd have to declare it a pretty much unqualified success.

It works like this:

It monitors the maximum sample magnitude in blocks of x samples. It then calculates a desired scaling factor based on this maximum sample. This calculation is a simple linear interpolation controlled by a user-specified minimum desired volume (minvol), expressed as a number between 0 and 1. The limit of the scaling factor as minvol approaches 0 is 1, i.e. if you set minvol to 0, you don't get any distortion of any kind.
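The post does not spell out the exact formula, so here is a minimal Python sketch of one way the calculation could look. It assumes samples are floats in the range -1 to 1, and that the interpolation runs between a gain of 1 (minvol = 0) and the gain that would bring the block's peak to full scale (minvol = 1). The names `block_peak`, `desired_scale` and the `max_gain` cap are hypothetical, not taken from the original code.

```python
def block_peak(samples):
    """Largest absolute sample value in one block (0.0 for an empty block)."""
    return max((abs(s) for s in samples), default=0.0)


def desired_scale(peak, minvol, max_gain=16.0):
    """Interpolate between 'no change' and 'normalise to full scale'.

    peak     -- maximum sample magnitude of the block, 0..1
    minvol   -- minimum desired volume, 0..1
    max_gain -- assumed safety cap so near-silent blocks do not ask for huge gain
    """
    if peak <= 0.0:
        return 1.0  # assumption: leave silent blocks alone
    full_scale_gain = min(1.0 / peak, max_gain)
    # minvol = 0 gives exactly 1.0; minvol = 1 gives full_scale_gain.
    return 1.0 + minvol * (full_scale_gain - 1.0)
```

With a block peaking at 0.25, minvol = 0 leaves the gain at 1, minvol = 1 asks for a gain of 4, and minvol = 0.5 lands halfway at 2.5.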

If the desired scaling factor differs from the current scaling factor, then it multiplies the current scaling factor by a number close to 1, so that it moves closer to the desired scaling factor. This number (together with the number of samples we take the maximum from) is what determines the speed at which we respond to volume changes in the music. I've found that setting them so that we get a maximum volume slew rate of about 3 dB per second is a good level.
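To make the relationship between the multiplier, the block size and the slew rate concrete, here is a small Python sketch. It assumes one gain update per block and uses the usual amplitude convention that a change of d dB corresponds to a factor of 10^(d/20). The sample rate of 44100 Hz and block length of 441 samples in the usage note are illustrative values only, and the function names are hypothetical.

```python
def step_factor(db_per_second, block_len, sample_rate):
    """Per-block multiplier that yields the given slew rate in dB per second."""
    seconds_per_block = block_len / sample_rate
    return 10.0 ** (db_per_second * seconds_per_block / 20.0)


def slew_toward(current, desired, step):
    """Move the current gain one multiplicative step toward the desired gain."""
    if current < desired:
        return min(current * step, desired)  # turn up, but do not overshoot
    if current > desired:
        return max(current / step, desired)  # turn down, but do not overshoot
    return current
```

For example, `step_factor(3.0, 441, 44100)` is roughly 1.0035, and applying it once per block for 100 blocks (one second of audio) raises the gain by a factor of about 1.41, which is 3 dB.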

That's the basic algorithm! It is essentially a filter that requires no memory of previous samples, just the maximum value.

However, there is a problem: if we set minvol to 1 and our music volume is slowly increasing, we will get clipping, because we can't respond to the increasing sample sizes until AFTER we have actually played them. To reduce this effect, I added a sort of "headroom" parameter into the desired scaling factor calculation. This means that we will always have room to play samples that are slightly bigger than the ones we have examined so far. This allows the slow volume adjustment to work in both the increasing and decreasing directions.
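The post does not say how the headroom enters the calculation. One plausible reading, sketched below in Python, is to aim the block peak at slightly less than full scale. The sketch assumes samples in the range -1 to 1 and the same interpolation between a gain of 1 and the normalising gain as before; the name `desired_scale_with_headroom` and the default of 0.1 are hypothetical.

```python
def desired_scale_with_headroom(peak, minvol, headroom=0.1):
    """Desired gain that aims the block peak at (1 - headroom) of full scale.

    peak     -- maximum sample magnitude seen so far in the block, 0..1
    minvol   -- minimum desired volume, 0..1
    headroom -- assumed fraction of full scale kept free for louder samples
    """
    if peak <= 0.0:
        return 1.0  # assumption: leave silent blocks alone
    target_peak = 1.0 - headroom
    # minvol = 0 still gives exactly 1.0; minvol = 1 scales the peak to target_peak.
    return 1.0 + minvol * (target_peak / peak - 1.0)
```

With a peak of 0.45, minvol = 1 and headroom = 0.1, the gain comes out at about 2.0, so the next sample can be as large as roughly 0.5 before it clips.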

However, it can't handle the case where you have quiet music that suddenly gets loud. So, I added an emergency volume reducer. This is the bit which needs some sort of read-ahead buffer. It looks n samples into the future. If it finds any samples which would cause clipping, it changes the scaling factor (linearly at the moment; I couldn't be bothered doing it exponentially) from the current value so that, by the time the big sample is ready to be played, the scaling factor is the desired scaling factor.
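A rough Python sketch of the emergency reducer follows. It assumes samples in the range -1 to 1 and takes the "desired" scaling factor at the loud sample to be the largest gain that does not clip it (1 / |sample|); the original may well use the headroom-adjusted value instead. It only deals with the first offending sample in the window and relies on being called again as the window slides forward. The name `emergency_gains` is hypothetical.

```python
def emergency_gains(window, current_scale):
    """Per-sample gains for the read-ahead window.

    If some sample would clip at current_scale, ramp the gain linearly so it
    reaches a safe value exactly when that sample is played. Samples before it
    did not clip at current_scale, so they cannot clip at the lower ramped gain.
    """
    gains = [current_scale] * len(window)
    for i, s in enumerate(window):
        if abs(s) * current_scale > 1.0:
            safe = 1.0 / abs(s)  # assumed target: just enough to avoid clipping
            for k in range(i + 1):
                # Linear ramp that hits `safe` at index i.
                gains[k] = current_scale + (safe - current_scale) * (k + 1) / (i + 1)
            for k in range(i + 1, len(window)):
                gains[k] = safe  # hold; a later, louder sample needs another pass
            break
    return gains
```

For instance, with a window of `[0.1, 0.2, 0.3, 0.8]` and a current gain of 2.0, the gains step down evenly from about 1.81 to 1.25, so the 0.8 sample is played at full scale instead of clipping.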

I found this to work reasonably well with a read-ahead buffer of 100 samples! I haven't tried it with a smaller buffer, I guess I should.

That's all I've done so far.

In summary, it listens to the music. If the music is too soft, it gradually turns up the volume. If the music is getting too loud, it gradually turns down the volume. However, it looks into the future, and if we would get clipping soon, then it very quickly turns the volume down so that we don't.

I've currently implemented this in Perl, because it was easy to do. If anyone wants a copy of the code, I'll mail a copy to Rob and ask him to put it on the developer site. I'm not proud of the code, as I have deliberately ignored many implementation issues in order to quickly prove the concept. The code is ugly, inconsistent, inflexible and slow (it reads one sample at a time, converts them into numbers in Perl, and then works on them from there), but all of this could be easily fixed with a decent reimplementation in C. I haven't got time at the moment, though.

I've done only minimal testing so far. My test samples have been the first 30 seconds or so of Holst's Mars, and of Sledgehammer. I can never hear the quiet bits at the start of these tracks in the car. The results are pretty good. The only problem is that because of the slow volume change rate, it takes a few seconds to get to the desired volume, but I can live with that. I'd rather miss out on just the first few seconds of a quiet section than the whole thing.

All my testing at the moment has been done with the minimum desired volume set to 1. This means that it attempts to scale everything to maximum volume. In real life you would only use this value if your listening environment noise floor was essentially the same as your maximum volume. This is the most severe condition, and all the changes are less drastic with a smaller minvol. The way I envisage it, if it was ever implemented in the empeg, minvol could be set with a secondary volume control. You might leave your main volume (maximum volume) set to the same value all the time, but change minvol depending on the amount of background noise.

What do people think? I think it's a very simple algorithm, with very few resource requirements, that could make a big difference to the level of enjoyment.

Richard