Quote:
Couldn't they just as easily define a new architecture that's ia32 plus more registers and gotten improved performance out of that?

Hmm. I suppose that you might as well go to 64-bit anyway since you're already defining an incompatible architecture.

That's, more or less, what the x86-64 architecture (as appearing in AMD's Opteron, Intel's Xeon with EM64T, etc.) is all about. It's backward compatible with x86 in all the ways you'd expect, but they added a 64-bit mode, which gets you more and wider registers and other assorted goodies. This is almost exactly the same process that Sun and MIPS went through in moving their architectures from 32 to 64 bits. The general rule of thumb, for an architect managing a transition like this, is to make sure that old code keeps getting faster, but that new code can really rock the house when it takes advantage of the new features.

Probably the coolest feature of getting a 64-bit address space is that you can now memory-map the whole damn filesystem if you want and do everything through demand paging. That's a big win for scientific computing, databases, and even web server applications. Of course, with wider registers, you can move that much more data back and forth at a time. This can be a particularly big win for "media" applications (e.g., MPEG decoding, Photoshop filters, etc.) that use the packed SIMD instructions (e.g., SSE), which operate on several half-words in parallel.
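
To make the memory-mapping point concrete, here's a rough sketch in Python. It maps a file into the address space and searches it as if it were one big byte string, letting the OS page data in on demand. The helper names (find_in_file, demo) and the scratch file are made up for illustration; only the standard mmap module is assumed.

```python
import mmap
import os
import tempfile


def find_in_file(path, needle):
    """Return the byte offset of the first occurrence of needle, or -1."""
    if os.path.getsize(path) == 0:
        return -1  # an empty file cannot be memory-mapped
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are faulted in only when touched.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view.find(needle)


def demo():
    """Build a scratch file (hypothetical data) and search it through the mapping."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 100_000 + b"needle" + b"y" * 100_000)
        path = tmp.name
    try:
        return find_in_file(path, b"needle")
    finally:
        os.remove(path)
```

On a 32-bit machine the mapping has to fit in whatever address space is left (a few GB at best), so big files need to be mapped a window at a time. With 64-bit addresses you can generally map the whole thing and let demand paging sort it out.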

Incidentally, part of why we're seeing dual-core CPUs now is that increases in clock speed are bumping into some nasty heat issues that will, eventually, get worked out. Meanwhile, they can always stamp out two cores on the same die with any of a variety of different cache architectures (either sharing or not sharing the biggest cache). That's easy to do, from the chip designer's perspective, and it puts all the software people on notice that they need to get on the ball and write multithreaded apps when performance matters. Of course, the OS can always run multiple processes on multiple processors.
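
For what "write multithreaded apps" looks like in the simplest case, here's a minimal sketch in Python that splits a job into one chunk per core and hands the chunks to a pool of workers. The function names are hypothetical. One caveat: in standard CPython the global interpreter lock keeps pure-Python threads from running CPU-bound work in parallel, so treat this as showing the shape of the decomposition; swapping ThreadPoolExecutor for ProcessPoolExecutor gets you the multiple-processes variant.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """The per-worker job: an independent piece of the total."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(numbers, workers=None):
    """Split numbers into roughly one chunk per worker and combine the results."""
    workers = workers or os.cpu_count() or 1
    size = max(1, -(-len(numbers) // workers))  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs the chunks concurrently and yields results in order.
        return sum(pool.map(sum_of_squares, chunks))
```

The part that matters is that the chunks share no state, so there's nothing to lock. Most of the real work in threading an app is getting the problem into that shape.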