Unofficial empeg BBS

#367978 - 03/12/2016 21:25 Any ideas on windows 7 boot problem?
pca
old hand

Registered: 20/07/1999
Posts: 1102
Loc: UK
This one is driving me insane.

My main PC consists of a Gigabyte GA-Z97X Gaming 5 motherboard, socket 1150, with a core I5-4770k processor, 32GB of Kingston HyperX DDR3 RAM, an AMD 7870 graphics card, and a Samsung 850 EVO SSD. There are various random USB things like mice, Wacom tablets, serial devices, etc, but those are probably secondary to the current issue. The machine also has an 8 port PC serial card and a PCIe USB3 card as well. All powered by a 750W PSU, a good one.

About 18 months ago, I had to replace the motherboard with an identical one due to a total failure. It had died shortly after I got the two 40 inch 4k monitors due in fact to those monitors, or rather the issue with the DP cable I mentioned here at the time. I transferred the SSD onto the new system, plugged all the cards in, and after a certain amount of faffing around, it all started working fine. So far, so good.

Six or seven weeks ago, this newer machine suddenly decided it wouldn't boot reliably. It's normally hibernated and reboots from cold only about once a month, but it started giving STOP 7B errors, saying it couldn't find the boot device.

I pointed out to it, quite reasonably, that the boot device was exactly where it had been all along and perhaps it could look more carefully. It failed to heed my advice so I had to go in and do it myself. After a while, I found it would boot about one time in six, more or less completely randomly. I did all the normal diagnostics, replacing or removing as much hardware as I could including the PSU and ram, running ram tests, drive tests, booting under linux (which works fine), the whole thing, and came to the conclusion everything seemed to work properly. I also unplugged all the usb devices, took out the cards, stripped it right back to the motherboard and the onboard graphics with a keyboard, mouse, and HDMI monitor.

Nothing really helped, it was still iffy as hell, even after uninstalling everything I'd installed in the previous month. It wouldn't even boot into safe mode. It insisted on running the system repair every couple of boot failures, which also failed without telling me much more than 'it didn't work'. Finally I backed it up, then reformatted and restored a backup from a week ago.

That did the same thing.

So did one from a month before, and six months before THAT.

All of them, restored onto a different HD, would boot perfectly happily on an older machine with a core 2 quad processor in it. But that new HD failed in the same way in this machine.

OK, I thought, the motherboard has an issue with life, let's swap everything over to the identical brand new one I bought at the same time as part of my cunning plan that's been sitting in the cupboard ever since. Which I did.

Same problem.

Over the next week I restored that sodding thing more times than I care to remember, tweaked boot records, faffed around with hand building partitions, the lot. All I managed to do was waste time and make it even less reliable, so in the end I gave up.

I wiped the lot, reinstalled windows, all the drivers, all the applications, all the data, re-entered all the security codes and keys for about sixty programs, and wasted another week making a whole new system. I'd resisted doing this because it's so time consuming and irritating. This worked fine and I thought that was the end of the problem.

Until three days ago when it did exactly the same damn thing, except this time it always gives a STOP F4 error on boot. It also reliably boots into safe mode, so yay for that.

Again, it will occasionally boot properly, and having done so, work perfectly. Playing around with it, the only things I could find that would provoke a repeatable behaviour were to either change something big in the BIOS, then reboot, at which point it would normally boot properly exactly once, or remove the AMD graphics drivers, which would make it boot reliably but at low resolution.

Aha, I thought, it must be the graphics card with some odd subtle fault. The thing had once or twice given some slight graphics corruption so it seemed plausible, perhaps it was finally feeling the effect of the bad cable. I acquired another one of the same type, a Sapphire one rather than an MSI one, which turned up today. Quick swap of cards and... Exactly the same thing.

Great.

Much testing later, and I have got no idea what the hell is going on.

So far I have:

Swapped out the motherboard with the spare one.
Taken all the ram out and replaced it with a single 4GB stick that's known good from another machine.
Removed all the cards and gone back to the built in graphics.
Unplugged all the USB devices and reverted to a PS/2 mouse and keyboard.
Swapped out the PSU (again).
Tried a different monitor.
Unplugged all the drives other than the boot one.
Disconnected any unused cables and replaced the ones in use with new ones.
Sworn a lot.
Booted in safe mode and set the thing to selective boot, then turned off EVERYTHING. It boots with less running than safe mode has, which makes it boot really fast, true, but you can't do anything with it.
Uninstalled all the drivers for all the unused devices and a lot of the used ones.

None of it has made any difference at all.

Boot logging shows nothing useful, when it goes bang it doesn't leave a boot log or a memory dump. It seems to crash immediately after loading the last system device, if I turn on OS boot information I get to the point it tells me the machine type and OS version then it falls over.

I'm stumped. I have no idea what's wrong with it, how to find out, or how to fix it. The only thing this particular iteration of the machine has in common with the one from four days ago is the processor, box, and OS. I'm 90%+ sure there's no hardware issue, either with this hardware or all the other parts lying around the room. The original graphics card MIGHT have a fault, but if so it's not the one that's doing this.

If it's a driver fault, which is plausible, how do I find it? With all the third party stuff allegedly disabled, it would presumably have to be a microsoft one, but I have no idea which.

The more than annoying final point is that it, just as last time, will every now and then randomly boot perfectly happily and then sit there running without any issues at all, looking smugly at me. I can hibernate it and resume it perfectly well, which is a sort of workaround, right up to the point I actually need to reboot it. Bearing in mind this is Windows, such a thing comes up sooner or later no matter what.

Starting yet again with a clean install is not something I'm prepared to do for the fairly obvious reason that until I know what's going on it would most likely do the same thing all over again. The machine was fine up to a month or so ago, and this version isn't really that version anyway. Switching to Linux or macOS is not an option either, I run a lot of very definitely Windows-only stuff I can't use with either.

I have to admit I'm on the verge of tossing the entire collection of junk into the rubbish and giving up on computers entirely. Possibly buying a whole new computer would fix it, possibly not, but I can't afford to do that right now regardless. I've wasted weeks on this, quite a lot of money, fallen behind in my work, and raised my blood pressure, all to no effect.

If anyone has any helpful suggestions on where to look next for a solution I'd certainly love to hear them...

pca


Edited by pca (03/12/2016 21:26)
_________________________
Experience is what you get just after it would have helped...

#367980 - 03/12/2016 22:57 Re: Any ideas on windows 7 boot problem? [Re: pca]
BartDG
carpal tunnel

Registered: 20/05/2001
Posts: 2616
Loc: Bruges, Belgium
There are only two things I can think of, and it's a combination of both. First, I've had problems with Gigabyte in the past. All sorts of strange issues, sometimes undefinable. Which is why I try to avoid this brand altogether now. It seems Gigabyte boards sometimes really don't like certain hardware. I've never been able to find out why though... I try to use Asus now when I can, and maybe my next board will be Supermicro. Stability is more important to me than raw speed. The second thing that comes to mind is that maybe Windows automatically installed some update your motherboard doesn't agree with. Which could be why everything worked OK up to a few months ago and a clean install (with all the most recent updates no doubt) didn't fix it. I'm sorry I can't be more precise, but this is what my gut feeling is telling me.
_________________________
Riocar 80gig S/N : 010101580 red
Riocar 80gig (010102106) - backup

#367981 - 04/12/2016 01:11 Re: Any ideas on windows 7 boot problem? [Re: BartDG]
pca
old hand

Registered: 20/07/1999
Posts: 1102
Loc: UK
Amusingly enough, I went TO Gigabyte from Asus for exactly the same reason :) I used to have a lot of Asus hardware, but about ten years back I went through three systems in a row that blew up with no cause I could determine, in one case literally, taking the hard drives, RAM, processor, and graphics card with it. It then ate another processor when I tried a spare in the board to see if it was the processor or the board that was at fault (before I'd found everything else that was bad, or I wouldn't have risked it!), which led to the obvious conclusion that the motherboard was faulty. So that processor went into the new board it had come from in the first place...

Two dead motherboards, two CPUs, a couple of gigs of ram when that was both a lot and expensive, three hard drives and a graphics card all went in the rubbish. Not happy.

And it wasn't a power supply fault either, the PSU was working fine when tested. It got chucked as well on the basis of being guilty by association, even so. Asus sort of admitted, without admitting it in a way that would lead to a refund, that there were some problems with that model of motherboard. Went right off them at that point :(

As far as this problem goes, I may have found the fault, or more accurately, faults. I'm almost certain the original graphics card has some mental issues of its own, but the root cause seems to be the ram disk driver I've been using for three years now not being compatible with windows 7! So your theory of a microsoft update breaking something is most likely true.

I certainly am not stupid enough to allow auto-updates in the house, but I do occasionally manually install security patches and the like. My guess is that one of these has helpfully cocked something up that the ramdisk program relies on, in a very subtle manner. It then interferes with the graphics drivers as well, possibly everything all ends up sitting in the same part of memory under some conditions when booting or something like that.

The end result is that if the ramdisk driver is installed, the machine BSODs on boot about 75% of the time, and runs flawlessly the rest of the time. Uninstall it, it boots (so far) 100% of the time like it used to. Put it back, it breaks again. And so on.

It's the most repeatable problem I've managed to find but only time will tell if it's the true cause. The problem is obviously that there's no way to know if you've actually fixed the issue, only that you've fixed it up to the point of the last boot. Who knows what the next one will do?
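
As a rough way to put a number on "only time will tell", here is a minimal sketch. It assumes each cold boot is an independent trial and that the roughly 75% failure rate above is accurate; both are assumptions, and the function names are made up for illustration. It works out how likely a run of clean boots would be if the fault were still there.

```python
def chance_of_clean_streak(p_boot_ok: float, streak: int) -> float:
    """Probability of `streak` good boots in a row if the fault is STILL present.

    p_boot_ok is the chance a single boot succeeds with the fault present,
    e.g. 0.25 for a machine that BSODs about 75% of the time.
    """
    if not 0.0 < p_boot_ok < 1.0:
        raise ValueError("p_boot_ok must be strictly between 0 and 1")
    return p_boot_ok ** streak


def boots_needed(p_boot_ok: float, max_fluke: float) -> int:
    """Smallest streak whose probability of being a fluke is <= max_fluke."""
    if not 0.0 < p_boot_ok < 1.0 or not 0.0 < max_fluke < 1.0:
        raise ValueError("both arguments must be strictly between 0 and 1")
    n = 1
    while p_boot_ok ** n > max_fluke:
        n += 1
    return n


p_ok = 0.25  # assumed: boots fine about one time in four with the fault present
print(chance_of_clean_streak(p_ok, 4))  # 0.00390625, i.e. about 0.4%
print(boots_needed(p_ok, 0.01))         # 4
```

On those assumptions, four clean boots in a row would be a fluke well under 1% of the time, so a short streak is already fairly strong evidence. It says nothing about a fault that only shows up under conditions the test boots didn't hit.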

RobS and I spent about five hours on skype discussing the problem, deciding that it had to be a kernel-mode driver, then finding a tool to list them all and uninstalling everything that we couldn't see was essential. We got eight programs in before it would boot reliably four times in a row, which was the start of the solution.
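
For anyone wanting to do something similar, one way to get that list is sketched below. This is not necessarily the tool we used. It assumes the CSV output of the built-in `driverquery /v /fo csv` command, and the column names ("Module Name", "Driver Type", "Start Mode", "Path") are from memory, so check them against your own output. The sample rows are invented.

```python
import csv
import io

# Start modes that load before or during kernel initialisation.
EARLY_START = {"boot", "system"}


def early_kernel_drivers(csv_text):
    """Return (module name, path) for kernel drivers that load at boot.

    csv_text is assumed to be the output of `driverquery /v /fo csv`.
    """
    found = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        driver_type = (row.get("Driver Type") or "").strip().lower()
        start_mode = (row.get("Start Mode") or "").strip().lower()
        if driver_type.startswith("kernel") and start_mode in EARLY_START:
            name = (row.get("Module Name") or "").strip()
            path = (row.get("Path") or "").strip()
            found.append((name, path))
    return found


# Invented sample data in the assumed column layout.
SAMPLE = r'''"Module Name","Display Name","Driver Type","Start Mode","State","Path"
"exampleram","Example RAM disk (made up)","Kernel ","Boot","Running","C:\Windows\system32\drivers\exampleram.sys"
"examplefs","Example filter (made up)","File System ","Boot","Running","C:\Windows\system32\drivers\examplefs.sys"
"exampleusb","Example USB thing (made up)","Kernel ","Manual","Stopped","C:\Windows\system32\drivers\exampleusb.sys"
'''

for name, path in early_kernel_drivers(SAMPLE):
    print(name, path)
```

On a real machine the text would come from running the command itself, for example via `subprocess.run(["driverquery", "/v", "/fo", "csv"], capture_output=True, text=True).stdout`. The result is only a shortlist of candidates to disable one at a time; it can't tell you which one is guilty.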

Fingers crossed, but it may now work. We'll see...

I certainly hope it does, though; I was going slowly nuts trying to figure it out!
_________________________
Experience is what you get just after it would have helped...

#367987 - 05/12/2016 13:57 Re: Any ideas on windows 7 boot problem? [Re: pca]
BartDG
carpal tunnel

Registered: 20/05/2001
Posts: 2616
Loc: Bruges, Belgium
To be honest, I've had problems with Asus in the past too. Fortunately not lately though. But you're right: they are not completely without fault either. That is one reason I started using Intel motherboards at one time. Those boards were very good, but afaik Intel doesn't make 'em anymore. This is also the main reason I'm now looking towards Supermicro. They are known for their stable server motherboards, and now they have created some consumer motherboards as well. I have good hopes for them. :)

Good to read you've finally been able to pinpoint the root of the problem! Glad to be of help, even if it only was pointing you in the right direction. :)
_________________________
Riocar 80gig S/N : 010101580 red
Riocar 80gig (010102106) - backup

#367990 - 05/12/2016 18:51 Re: Any ideas on windows 7 boot problem? [Re: pca]
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31563
Loc: Seattle, WA
Originally Posted By: pca
I pointed out to it, quite reasonably, that the boot device was exactly where it had been all along and perhaps it could look more carefully.


LOL :)

Addressing your original question, though... Out of all your diagnostic steps, I didn't see "swapped out for a different hard drive" or "swapped out for a different SATA cable".

If I'd gotten the "boot device" error, that would have been my first go-to.
_________________________
Tony Fabris

#367992 - 05/12/2016 19:42 Re: Any ideas on windows 7 boot problem? [Re: tfabris]
BartDG
carpal tunnel

Registered: 20/05/2001
Posts: 2616
Loc: Bruges, Belgium
It's his eighth point. :)
Quote:
Disconnected any unused cables and replaced the ones in use with new ones.
_________________________
Riocar 80gig S/N : 010101580 red
Riocar 80gig (010102106) - backup

#367995 - 06/12/2016 01:27 Re: Any ideas on windows 7 boot problem? [Re: BartDG]
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31563
Loc: Seattle, WA
OK. What about the hard drive?
_________________________
Tony Fabris

#367997 - 06/12/2016 08:15 Re: Any ideas on windows 7 boot problem? [Re: tfabris]
BartDG
carpal tunnel

Registered: 20/05/2001
Posts: 2616
Loc: Bruges, Belgium
I don't know why he didn't do that, but I wouldn't have either since it booted and ran Linux just fine.
_________________________
Riocar 80gig S/N : 010101580 red
Riocar 80gig (010102106) - backup
