Friday, October 20, 2006

The reason for integration testing

I am a do-it-yourselfer when it comes to PCs. Nearly all of my PCs have been home-built. Well, not really home-built, since I don't solder chips, but home-assembled - I buy boards, drives, cases, etc., and assemble them into a system.

I do it because it's a hobby, and I enjoy doing the work. Sometimes, however, I read articles about people choosing to build their own PCs for professional work, often because some corporate finance-type person thinks he can save money over buying complete systems from Dell or HP. (The idea that you can save money by building your own systems hasn't been true for many years, but many people haven't figured this out yet.)

Similar principles apply (more frequently) to people who perform their own upgrades - adding/replacing memory, video cards or hard drives.

What these people fail to realize is that assembling a computer system for a production environment involves much more than simply slapping parts together. Even if there are no software-compatibility issues (not always the case), there are sometimes obscure problems that only become apparent after extensive integration testing. All the major PC manufacturers test their systems before offering them for sale. Very few hobbyists do much testing beyond what's necessary to get their favorite game up and running. And I doubt many corporate finance people realize the expense (in terms of time spent) needed to thoroughly test a completely custom-designed system.

(FWIW, many corporations have standardized software environments, and even this requires extensive integration testing as PC manufacturers introduce new systems. It may take a month (or more, if there are problems) to ensure that a new computer is compatible with corporate-standard software. Now imagine the time that would be required if the hardware itself also had to be thoroughly tested.)

But this is nothing new. I mention it simply because recent articles report a problem that is an almost textbook case for why integration testing is necessary.

For the last several years, Apple's laptop computers have sported a "sudden motion sensor" (SMS) that senses acceleration and vibration. The system software uses this sensor to detect sudden motion (such as when the computer is dropped) and park the hard drive heads, so the drive won't be damaged on impact.

Western Digital's new Scorpio line of hard drives also has a motion sensor, serving the same purpose as Apple's.

So what happens when you put a Scorpio drive in a MacBook? The two systems step on each other. When the computer is jostled, the drive parks itself, and the Apple SMS system sends the command to park the drive. The SMS system gets an error back from the drive (the drive probably goes off-line while it is in this auto-park state), assumes that the drive has failed, and panics the OS kernel.
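To make the interaction concrete, here is a toy model of the failure in Python. It is only a sketch of the behavior described above, not Apple's or Western Digital's actual code: the class names, the "drive goes off-line while auto-parked" behavior, and the assumption that the drive's own sensor reacts before the host's command arrives are all guesses made for illustration.

```python
class DriveError(Exception):
    """Raised when the (hypothetical) drive rejects a command."""


class Drive:
    def __init__(self, has_own_sensor):
        self.has_own_sensor = has_own_sensor
        self.offline = False

    def on_jolt(self):
        # Assumption: a drive with its own motion sensor parks itself
        # and stops accepting commands while it is in that state.
        if self.has_own_sensor:
            self.offline = True

    def park(self):
        # Host-issued park command.
        if self.offline:
            raise DriveError("drive not responding")
        return "parked"


class Host:
    def __init__(self, drive, has_sms):
        self.drive = drive
        self.has_sms = has_sms

    def on_jolt(self):
        # A host without a motion sensor does nothing at all.
        if not self.has_sms:
            return "ok"
        try:
            self.drive.park()
        except DriveError:
            # The host reads the error as a failed drive and gives up.
            return "kernel panic"
        return "ok"


def jolt(drive_has_sensor, host_has_sms):
    """Jostle one drive/host combination and report what the host does."""
    drive = Drive(drive_has_sensor)
    host = Host(drive, host_has_sms)
    drive.on_jolt()        # assumed: the drive's sensor reacts first
    return host.on_jolt()


# Try every combination of "drive has a sensor" and "host has a sensor".
for drive_has_sensor in (False, True):
    for host_has_sms in (False, True):
        result = jolt(drive_has_sensor, host_has_sms)
        print(drive_has_sensor, host_has_sms, result)
```

Each component passes when tested against a partner that lacks a sensor. Only the combination where both have one reports a panic, and that is the one case you will never see unless you test the assembled system.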

One perfectly good drive plus one perfectly good computer combine to form an unstable system. This is a problem that very few people could ever predict without actually assembling and testing a completed system. It is something that a manufacturer would (or at least should) test for as a part of deciding what brand/model drive to bundle, but is something the rest of us will not be able to figure out in advance while shopping for a new drive.
