Here's some fun math.

On the D700, the base ISO is 200 and I wouldn't go past 3200. That's four stops. Another way of looking at it is that the D700 gives you 12 good bits per pixel. (The actual sensor gives you 14 bits per pixel, but the bottom two bits are crap.) The D800 gives you roughly the same, 12 good bits per pixel, but you've got 3x the pixels. If you downsample a D800 shot to the same resolution as the D700, you're averaging three pixels into one, which buys you somewhere between one and two bits of additional useful signal per output pixel.
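
Here's a back-of-envelope sketch of where that one-to-two-bit figure comes from, assuming the three pixels' noise is fully independent (real sensors only approximate this):

```python
import math

# Averaging N independent noisy pixels improves SNR by sqrt(N),
# i.e. 0.5 * log2(N) extra "clean" bits. Summing them also multiplies
# the number of representable levels by N, i.e. log2(N) extra bits of
# numeric precision. The truly useful gain lands somewhere in between.
N = 3  # D800 pixels per D700-sized output pixel

snr_gain_bits = 0.5 * math.log2(N)   # ~0.79 bits from noise averaging
precision_gain_bits = math.log2(N)   # ~1.58 bits of extra quantization

print(f"SNR gain:       {snr_gain_bits:.2f} bits")
print(f"Precision gain: {precision_gain_bits:.2f} bits")
# SNR gain:       0.79 bits
# Precision gain: 1.58 bits
```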

Now, imagine a hypothetical D700+, fabbed with current-generation technology but still 12 megapixels. The bazillion-dollar question is whether the D700+ could have more than 14 useful bits per pixel. If you could give me (dreaming now) 16 beautiful bits per pixel, that would yield HDR goodness in every shot. Or, assuming a base ISO of 100, you'd be able to shoot without any noise whatsoever at ISO 25600. (Some cameras claim to work at that speed today. They don't do it well.)
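
To make the ISO arithmetic concrete, here's the implied rule of thumb. The "one stop of clean push per good bit above an 8-bit floor" model is my assumption, chosen because it reproduces the numbers above; real sensor behavior is messier:

```python
def max_clean_iso(base_iso, good_bits, floor_bits=8):
    """Highest ISO you can push to while keeping a 'clean' image.

    Hypothetical model: each good bit above some floor (here 8 bits,
    roughly what a decent output image needs) buys one stop of ISO
    headroom. Picked because it reproduces the numbers in the post,
    not because sensors actually work this simply.
    """
    stops_of_headroom = good_bits - floor_bits
    return base_iso * 2 ** stops_of_headroom

print(max_clean_iso(base_iso=200, good_bits=12))  # D700:  3200
print(max_clean_iso(base_iso=100, good_bits=16))  # D700+: 25600
```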

Lastly, here's your tradeoff. Are you more likely to want 12 megapixels with 16 beautiful bits per pixel, or would you prefer 36 megapixels with 12 beautiful bits per pixel? The former gives you glorious HDR. The latter gives you outrageously high resolution, if your lens can resolve it, and can be downsampled to roughly 14 beautiful bits per pixel at resolutions you actually care about.
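
Putting the two options side by side with the same toy model as above (the downsampling gain uses the optimistic log₂ N figure, which is where the roughly-14-bit number comes from):

```python
import math

def downsampled_bits(native_bits, pixel_ratio):
    # Optimistic gain: log2(pixel_ratio) extra bits when averaging
    # pixel_ratio native pixels into one output pixel.
    return native_bits + math.log2(pixel_ratio)

# Option A: hypothetical 12 MP sensor with 16 good bits per pixel.
# Option B: 36 MP sensor with 12 good bits, downsampled to 12 MP.
option_a_bits = 16
option_b_bits = downsampled_bits(native_bits=12, pixel_ratio=36 / 12)

print(f"Option A at 12 MP: {option_a_bits} bits")      # 16 bits
print(f"Option B at 12 MP: {option_b_bits:.1f} bits")  # 13.6 bits, i.e. ~14
```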

My guess is that somebody inside Nikon thought long and hard about these tradeoffs, had hard numbers for both options, and decided that more pixels won.