Graphics routines don't need to be in assembly (though it's nice if they are). They just have to be written to run fast rather than to look "CS Class Pretty", or possibly to do both at once.

Are you sure your slowdown is not due to the ioctl() interface for userland screen updates? To test the theory, use the regular empeg mmap() function from empeg_display() to run your app while the player is shut down, and see if it's any faster that way (direct memory access, almost).

If it is faster, then we can add a second mmap() page for the hijack buffer.
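
For reference, a rough sketch of the kind of timing test I mean. Treat the details as assumptions rather than the real empeg code: time_mmap_blits() and FRAME_BYTES are names made up for this post, the 2048-byte frame assumes a 128x32 screen at 4 bits per pixel, and the device path you pass in (probably something like /dev/display) is a guess on my part. It maps the buffer once and times a number of full-frame copies. Any refresh call the real driver needs after each copy is left out, since that would go through ioctl() again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Assumption: 128x32 pixels at 4 bits per pixel = 2048 bytes per frame. */
#define FRAME_BYTES 2048

/*
 * Hypothetical helper: map `len` bytes of `path` (a display device, or a
 * plain file for testing), copy `frame` into the mapping `frames` times,
 * and return the elapsed time in seconds. Returns -1.0 on any error.
 */
double time_mmap_blits(const char *path, const unsigned char *frame,
                       size_t len, int frames)
{
    assert(path != NULL && frame != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1.0;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1.0;
    }

    /* volatile so the compiler cannot drop the repeated stores */
    volatile unsigned char *fb = map;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int f = 0; f < frames; f++)
        for (size_t i = 0; i < len; i++)
            fb[i] = frame[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(map, len);
    close(fd);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Run the same number of frames through your current ioctl() path and compare the two times. If the mmap() number is clearly lower, the interface is the bottleneck and not your drawing code.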

Cheers

-ml