Quote:
But for *real* speed, one could use MMX/SSE(2) etc. instructions. Any experts here?

I'm no expert, but I do recall that modern GCC will use SSE instructions automatically in this sort of situation when optimization is turned on (e.g. -O3), especially if you've unrolled the loop in that way. It can't do that, though, unless it can prove that the two arrays don't overlap (consider "float buffer[101]; tempVal=&buffer[1]"). If at least one of buffer and tempVal is a local array variable, it can work out that they don't overlap. If this code is in a function where tempVal and buffer are both passed in as float pointers, mark them "restrict" (a C99 keyword) if you're sure they'll never overlap.
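
To make the "restrict" suggestion concrete, here is a rough sketch. I don't know exactly what your loop computes, so the function name add_arrays and the element-wise addition are just assumptions for illustration; only buffer and tempVal come from your code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the original loop: adds tempVal into buffer,
 * element by element. The "restrict" qualifiers are a promise from the
 * caller that the two arrays never overlap, so the compiler is free to
 * load and store several floats at a time. Breaking that promise
 * (e.g. passing buffer and &buffer[1]) is undefined behaviour. */
void add_arrays(float *restrict buffer, const float *restrict tempVal, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buffer[i] += tempVal[i];
}
```

Something like "gcc -std=c99 -O3 -c file.c" should be enough to try it. On reasonably recent GCC versions, adding -fopt-info-vec should make it report which loops it vectorized, which is an easy way to check whether "restrict" made a difference.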

Peter