Hi there.

As has previously been discussed, it's not easy for the player to do real-time volume normalisation within a track, because that would require it to read ahead (potentially a long way).

People have also pointed out that normalisation is not, in fact, enough, because a single track can have loud and soft sections. If you normalise the loud stuff, you can't hear the soft bits. Some sort of compression is what's really needed, apparently.

Add to this the fact that when you are moving fast in the car, you don't necessarily want the louds to be louder, but you definitely want the soft parts louder. Your noise floor has risen, but not necessarily your maximum listening level.

I have a proposal which (it seems to me) addresses all these problems. Bear with me, it's a bit long.

Proposal:

We have two volume controls: one to adjust the absolute maximum volume output, and the other to adjust the "minimum" volume. (A traditional volume control always has minimum == 0.) We take a chunk of samples and determine its "apparent volume" (more on this later) as a fraction of full scale, between 0 and 1. Then we scale the entire chunk so that its apparent volume lies proportionally between min and max, rather than between 0 and 1. (Normal volume controls scale between 0 and max.)

Note that your actual output samples will still be between 0 and max; it's just your "apparent volume" which is between min and max.
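To make the mapping concrete, here is a minimal sketch of the per-chunk scaling. It assumes samples are floats in -1.0..1.0 and that the apparent volume is the chunk's peak as a fraction of full scale; the names chunk_gain and scale_chunk are made up for illustration and are not anything in the player.

```python
def chunk_gain(apparent, vol_min, vol_max):
    """Gain that moves an apparent volume in 0..1 to the range vol_min..vol_max."""
    if apparent <= 0.0:
        # All-zero chunk: the gain is irrelevant, so use the plain volume setting.
        return vol_max
    target = vol_min + apparent * (vol_max - vol_min)
    return target / apparent


def scale_chunk(samples, apparent, vol_min, vol_max):
    """Apply one gain to the whole chunk, clamping output to +/- vol_max."""
    gain = chunk_gain(apparent, vol_min, vol_max)
    return [max(-vol_max, min(vol_max, s * gain)) for s in samples]


# A quiet chunk (peak 0.1) with min = 0.2 and max = 0.8 is lifted to a peak of 0.26.
quiet = scale_chunk([0.1, -0.05], 0.1, 0.2, 0.8)
```

With vol_min set to 0 the gain is always vol_max, so this reduces to a conventional volume control. The clamp only matters if the apparent volume is something other than the true peak of the chunk.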

So, in effect, you simply have a conventional volume control which is set separately for each chunk of samples. You would want a chunk to be large enough that volume adjustments are slow compared to the frequencies in the music. I would think that 10 volume adjustments per second would be OK. If it were any slower than that, you would probably want to smooth the "apparent volume" so that the scaling factor changes gradually rather than in steps.
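One possible way to do the smoothing, purely as a sketch: run the per-chunk apparent volumes through a simple one-pole (exponential) filter before computing the gain. The function name and the alpha value are assumptions for illustration; other smoothing schemes would work too.

```python
def smooth_apparent(volumes, alpha=0.3):
    """Exponentially smooth a sequence of per-chunk apparent volumes.

    alpha near 1.0 follows the input closely; alpha near 0.0 reacts slowly.
    """
    smoothed = []
    current = None
    for v in volumes:
        # The first chunk starts the filter; later chunks move part of the way.
        current = v if current is None else current + alpha * (v - current)
        smoothed.append(current)
    return smoothed


# A sudden drop from loud to quiet is spread over several chunks.
steps = smooth_apparent([1.0, 0.0, 0.0], alpha=0.5)
```

The trade-off is that a slow filter lags behind sudden loud passages, so the first chunk of a loud section gets more gain than it should. A precalculated volume file could sidestep this, since it can effectively look ahead.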

Now, how do you determine the apparent volume of a chunk of samples? You could just take the largest sample in the chunk. This should be fairly simple to implement, but it doesn't give you much flexibility. I reckon it would be a cool thing to supply a file of apparent-volume values for the track. You could precalculate the contents of this file with your favourite software. This file would be pretty small, and you could upload it along with your music. If the music is 4f0, and the tag info is 4f1, then this file could be 4f2 in /drive/fids. Then the player could find the apparent volume for each point in the tune simply by looking it up in the file. If there was no such file, it just wouldn't do any scaling. The main advantage of precalculating the apparent volume would be that you could tweak the corrections for particular songs, or sections of songs.
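As a rough sketch of both options: the first function takes the largest sample in each chunk, and the others pack those values into a small table that could be looked up by playback position. The one-byte-per-chunk layout is purely a hypothetical format for illustration, as are all the function names; nothing here describes an existing empeg file.

```python
def peak_volumes(samples, chunk_size):
    """Apparent volume per chunk: the largest absolute sample (floats in -1..1)."""
    return [
        max(abs(s) for s in samples[i:i + chunk_size])
        for i in range(0, len(samples), chunk_size)
    ]


def encode_table(volumes):
    """Hypothetical file layout: one unsigned byte (0..255) per chunk."""
    return bytes(min(255, max(0, round(v * 255))) for v in volumes)


def decode_table(data):
    """Turn the byte table back into apparent volumes in 0..1."""
    return [b / 255 for b in data]


def lookup_volume(table, sample_index, chunk_size):
    """Apparent volume at a playback position, or None if there is no table."""
    if not table:
        return None  # no file: the player would do no scaling
    return table[min(sample_index // chunk_size, len(table) - 1)]


table = peak_volumes([0.1, -0.5, 0.2, 0.3, -0.05], chunk_size=2)
packed = encode_table(table)
```

At 10 values per second and one byte each, a four-minute track would need about 2400 bytes, which fits the "pretty small" estimate. At a 44.1 kHz sample rate, 10 adjustments per second corresponds to a chunk of 4410 samples per channel.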

Regardless of how we calculate the apparent volume, I think this would be worth doing. The advantages of this:

  • it's fairly easy to do
  • setting the minimum volume to 0 means that there is no distortion at all, if the car is stopped (or the empeg isn't in the car) and you want hi-fi
  • it handles different dynamic levels in the same tune, but
  • different dynamic levels still stay different (it doesn't make the louds sound like the softs, unless you set maximum == minimum), just with smaller separation
  • it works even if you don't normalise your actual MP3 files
  • it really would add value in situations where the background noise varies (e.g. in the car)


I think this would be really cool, and useful.

Of course, there are various things you could do to improve it. For instance:


  • have a "volume correction = on/off" menu item. If this is on, then the minimum volume is always (say) 20 dB below the maximum volume, so the user only has to worry about one volume control. Personally, I'd prefer to have access to both volume controls, though.
  • (in the MKII) have the minimum volume determined by the level of background noise detected by the voice recognition microphone.
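For the single-control option, the fixed offset is simple arithmetic. This sketch assumes both volumes are linear amplitude factors, so a difference in dB converts with a factor of 20; min_from_max is a made-up name.

```python
def min_from_max(vol_max, range_db=20.0):
    """Minimum volume that sits range_db decibels below vol_max (linear amplitude)."""
    return vol_max * 10.0 ** (-range_db / 20.0)


# 20 dB below a maximum of 0.8 is one tenth of it.
vol_min = min_from_max(0.8)
```

So with the suggested 20 dB, the minimum is always one tenth of the maximum amplitude. A smaller range, such as 6 dB (roughly half the amplitude), would compress much harder.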

What do you think?

Richard