I'm not getting as many good hits as I'd like from Google on this topic. I've got a few, but there's not a lot of detail in the areas I'm looking for.

We keep our unit tests and our integration tests very separate. The unit tests are truly atomic, and if someone breaks a unit test, everything stops until it's fixed. That's as it should be.

Our automated integration tests are a bit more wibbly-wobbly (as they should be, I suppose). For example, some of them depend on third-party resources that might have temporary downtime, and sometimes a test fails for no apparent reason. It's not a huge problem, but we're planning to spend some time improving the reliability of these tests. I have a list of things that I think are good practices, but they come mainly from my own head rather than from any industry-standard best-practices guide, and I have yet to find a guide in a Google search that gets into the level of detail I'm after. Has anyone seen a really good guide for this on the internet that I'm missing?

Here's an example of the level of detail I mean, which I'm not seeing in any of the best-practices lists that come up in my Google searches:
"If a test is clearly failing because a third-party resource is temporarily down, and this is obvious in the failure output, then the suggested plan of action is: the test author should add a pre-check for the resource to the test and call Assert.Inconclusive if the resource is down (and ONLY if the resource is down)."
_________________________
Tony Fabris