Quote:
for (j = 0 ; j < numFrames ; j++)
tempVal[j] += buffer[j];
Unroll the loop (some compilers may partially do this already for you), and reverse the loop index to count down to zero (saves 1-2 instructions, usually).
So, something like:
Code:
// Assumes numFrames is a multiple of 4 and j is a signed int
// j >= 0 (not j != 0) so the block at indexes 0..3 is processed too
for (j = numFrames - 4; j >= 0; j -= 4) {
    tempVal[j]   += buffer[j];
    tempVal[j+1] += buffer[j+1];
    tempVal[j+2] += buffer[j+2];
    tempVal[j+3] += buffer[j+3];
}
Switching to pointers rather than array indexes might save another couple of instructions per iteration.
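A rough sketch of what the pointer version might look like. The function name accumulate_ptr and the float element type are my assumptions (the quoted code doesn't show the type), and I've added a tail loop so numFrames no longer has to be a multiple of 4. Whether it actually beats the indexed version depends on the compiler, so measure before trusting it.

```c
#include <stddef.h>

/* Hypothetical helper: adds n elements of src into dst.
   float is an assumption; the original post does not give the type. */
static void accumulate_ptr(float *dst, const float *src, size_t n)
{
    const float *end4 = src + (n & ~(size_t)3); /* end of the part divisible by 4 */
    const float *end  = src + n;                /* end of the whole buffer */

    /* Unrolled by 4, walking both pointers forward */
    while (src != end4) {
        dst[0] += src[0];
        dst[1] += src[1];
        dst[2] += src[2];
        dst[3] += src[3];
        dst += 4;
        src += 4;
    }

    /* Leftover 0-3 elements */
    while (src != end) {
        *dst++ += *src++;
    }
}
```

You'd call it as accumulate_ptr(tempVal, buffer, numFrames).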
But for *real* speed, one could use MMX/SSE(2) instructions. Any experts here?
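Not an expert, but here is a minimal sketch of the SSE idea using compiler intrinsics rather than hand-written assembly. It assumes the samples are 32-bit floats (not confirmed by the quoted code) and the name accumulate_sse is made up. It uses unaligned loads/stores so the buffers need no special alignment, and it falls back to the plain loop when the compiler doesn't define __SSE__.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Hypothetical helper: dst[i] += src[i] for i in [0, n).
   Assumes float samples; 4 floats fit in one 128-bit SSE register. */
static void accumulate_sse(float *dst, const float *src, size_t n)
{
    size_t i = 0;

#if defined(__SSE__)
    /* Process 4 floats per iteration */
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_loadu_ps(dst + i);   /* unaligned load of 4 floats */
        __m128 s = _mm_loadu_ps(src + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(d, s));
    }
#endif

    /* Scalar tail (or the whole loop when SSE is not available) */
    for (; i < n; i++) {
        dst[i] += src[i];
    }
}
```

If both buffers are known to be 16-byte aligned, the aligned variants _mm_load_ps / _mm_store_ps could be swapped in. Also worth checking whether the compiler already auto-vectorizes the original loop at higher optimization levels before doing this by hand.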
Cheers