That's probably because the tuner audio needs to be sampled first (and enough of it for a decent analysis) before the visual can respond to it. So the visual can only react a little while after you've already heard the sound. I don't think there's much that can be done about that.
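
To get a rough feel for the size of that delay: the analysis can't start until a full window of samples has been collected, so the lag is at least the window length divided by the sample rate. A minimal sketch in Python, where the 44.1 kHz sample rate and the window sizes are assumed values for illustration only (I don't know what the tuner actually uses), and `analysis_delay_ms` is just a made-up helper name:

```python
def analysis_delay_ms(window_samples: int, sample_rate_hz: int) -> float:
    """Minimum time needed to collect one analysis window, in milliseconds."""
    return 1000.0 * window_samples / sample_rate_hz


# Assumed values: 44.1 kHz audio and a few typical power-of-two window sizes.
SAMPLE_RATE_HZ = 44100
for window in (512, 2048, 8192):
    delay = analysis_delay_ms(window, SAMPLE_RATE_HZ)
    print(f"{window} samples: at least {delay:.1f} ms before the visual can react")
```

A bigger window gives a better analysis but a longer lag, and the real delay will be somewhat higher still once processing and drawing time are added on top.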

/Pepijn