Classic Computer Magazine Archive COMPUTE! ISSUE 129 / MAY 1991 / PAGE 60

How to choose a hard disk: choosing a hard disk can be tricky. (computer data storage)
by Mark Minasi

Buying a hard disk is confusing nowadays. How big? What brand? How about RLL? There's a lot to worry about. Here's how to kick the tires and read the EPA mileage sticker when you're shopping for a drive (or a PC with a drive). Your hard disk subsystem consists of two pieces: the hard disk itself and the hard disk controller board. The controller is usually a circuit board in your PC, although some newer computers put the controller's electronics right on the main PC circuit board, the motherboard.

A few years ago, you wouldn't worry about buying a controller; you'd just use the one that came standard with your machine (in the case of an AT-type system) or buy a disk/controller combination all at once (in the case of an XT-type system). Now that you have a variety of choices in drives and controllers, you've got to make sure that they can talk to each other.

Making these choices may seem a bit daunting, but read on; hard drives aren't tough to understand. And having a grasp of the terms found here will allow you to pick the right drive for your system. Here's a quick look at your options.

Drive options. You must choose size, seek time (which affects speed), band stepper or voice coil, and the drive's self-parking capability.

Controller options. Here you choose XT, AT, or PS/2; the interleave factor (which affects speed); whether or not to get an on-board cache; and the drive's sector translation.

Matching drive and controller. There are several items that must match on the drive and controller, including the interface (ST506, SCSI, ESDI, or IDE) and encoding scheme (MFM or RLL).

What Size is Best?

Many computers these days are advertised as coming with 40MB drives, but assess your needs carefully before jumping at such a package. The sad truth is that virtually every program you buy will demand a few megabytes of your disk's space, and you'll soon be looking for more room.

For example, the popular Micrografx Designer drawing program gobbles up five megabytes in a basic configuration (it can take much more), Windows 3.0 takes up about seven megabytes without a swap file, and even old Lotus 2.1 requires a couple of megs.

That doesn't even consider the real biggies, like OS/2 (more than 30MB when its Extended Edition is loaded). Downloadable fonts can suck up space in no time. And greater use of graphics strains the disk further. For example, a nongraphic computer screen can be stored in just 4K; a graphical screen can take up a megabyte. Your 40 megabytes of space will disappear in no time.
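As a rough check on those screen sizes, the arithmetic can be sketched in a few lines of Python. The 80-by-25 text mode with one character byte and one attribute byte per cell is the usual PC text layout; the 1024-by-768, 256-color graphics mode is an assumption picked for illustration, not a figure from this article.

```python
def text_screen_bytes(columns=80, rows=25, bytes_per_cell=2):
    """Bytes needed to store a character-mode screen (character + attribute)."""
    return columns * rows * bytes_per_cell


def graphics_screen_bytes(width, height, bits_per_pixel):
    """Bytes needed to store an uncompressed bitmapped screen."""
    return width * height * bits_per_pixel // 8


text = text_screen_bytes()
graphic = graphics_screen_bytes(1024, 768, 8)  # assumed mode, for illustration
print(f"text screen: {text} bytes; graphics screen: {graphic} bytes")
```

Under these assumptions the text screen comes to roughly 4K and the graphics screen to about three-quarters of a megabyte; higher resolutions or more colors push it past a megabyte.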

Economics seems to favor 80MB or larger drives. The typical 42MB drive (the Seagate ST251-1 is the most common) runs about $300.00 discounted, or about $7.14 per megabyte. In contrast, Maxtor's 80MB drive is now selling for as low as $410.00, or $5.13 per megabyte. Further, the Maxtor is a voice-coil drive, which is preferable to the 251-1's band-stepper design. (Fear not, explanations of voice coil and band stepper are coming up soon.)
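The price-per-megabyte comparison is easy to redo for any drive you're considering. Here is a minimal sketch using the two prices quoted above; the function name is made up for illustration. Dividing $300 by the full 42MB gives about $7.14 per megabyte, while dividing by a nominal 40MB gives $7.50.

```python
from decimal import Decimal, ROUND_HALF_UP


def cost_per_megabyte(price_dollars, capacity_mb):
    """Price divided by capacity, rounded half-up to whole cents."""
    raw = Decimal(str(price_dollars)) / Decimal(str(capacity_mb))
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Prices and capacities as quoted in the article.
drives = {
    "42MB band stepper": ("300.00", 42),
    "80MB voice coil": ("410.00", 80),
}
for name, (price, megabytes) in drives.items():
    print(f"{name}: ${cost_per_megabyte(price, megabytes)} per megabyte")
```

Decimal arithmetic is used here only so that half-cent results round the way a price list would.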

And when shopping for really big drives, watch out for an old scam: reporting "unformatted" drive capacity. Drives must give up as much as 30 percent of their capacity for system overhead. For example, a 20MB drive may actually have 26MB of capacity, but the extra 6MB is required for system overhead.

Every drive has this meaningless "unformatted" capacity that looks impressive but is of no value to the buyer. Look out for unscrupulous dealers who report the larger, useless unformatted capacity in their magazine ads. (By the way, format in this article means low-level format, not the familiar DOS format; it's something generally handled by your dealer.)

Seeking the Fastest Drive

Part of what makes a drive subsystem fast is how fast a drive can move its read/write head over the data you want, that is, how long it takes to find the data. The average time to find an area on disk is called the seek time, and it's measured in milliseconds (ms, thousandths of seconds). The lower the number, the better.

Don't buy a drive with a seek time larger than 28 ms. The best on the market are in the 10-12 ms range; you'll know from the price tag which those are.

Band Steppers and Voice Coils

A lot of what makes a drive fast or slow is whether it moves its read/write head with a band stepper or a voice coil.

Cheaper drives move the head to and fro over the disk surface with a combination of flexible metal bands and a stepper motor, hence the name band stepper. They rely on a mechanical approach to find data, an approach that isn't reliable in the long term, as the mechanical parts do not display consistent behavior over time; telling a new drive head to move 1/1000 inch may yield different actual movement than making the same request of an older drive.

The alternative is a voice coil. Named after the voice-coil circuit used in telephone electronics, this is a coil with a cylindrical rod at its middle. When the coil is energized, the rod moves in or out of the coil, depending on how much energy is used. The rod is connected to the heads, so energizing the coil moves the heads in or out. Meanwhile, as the heads are moving, they're reading address information from the drive; that way, the head knows whether it's found the desired data or not.

Which is better? The voice coil, for three reasons. First, and most important, the voice coil is a constantly self-adjusting system; the mechanical parts may change with time, but the head will always find the data. The stepper acts on the unrealistic idea that its mechanicals will never change as time goes on. Second, the voice coil parks its head automatically when the drive is shut down, thus protecting the disk. Most steppers require you to run a head-parking program of some kind. Third, voice coils are generally faster than band steppers.

You'll find that most 80MB and larger drives are voice coil, so buying large drives will pay off in reliability and speed as well as capacity.

Get in Control

If your computer already has a controller, you needn't worry about picking a new one. Or should you? Superpowerful controllers now appearing on the market can squeeze the last ounce of performance out of a drive.

First, make sure your PC can use the controller! Your controller must be made to work with your computer. Vendors sell XT-type controllers, also called 8-bit controllers, and AT-style controllers, also called 16-bit controllers. An XT controller can work in an AT system (albeit slowly), but an AT controller generally won't work in an XT system. There are some PS/2 microchannel controllers, but the market for them isn't large, as all the PS/2 microchannel computers come standard with a fairly fast controller.

Next, make sure it's a speedy controller. We've seen that an important determinant of a drive's speed is its seek time. Controllers also contribute to the speed of your disk subsystem with their interleave factor.

The seek time refers to how long it takes to find the data on disk. The interleave factor tells how quickly the disk subsystem can read the data, once it's been found. Interleave factors look like 1:6, 1:3, 1:1, and the like. A lower second number is better, so 1:1 is the best. Controllers that feature 1:1 interleave used to be very expensive ($400 or more for AT systems), but now they're about $120, only about $20 more than the more common and slower 1:2 controllers.

If you're buying an AT system (286, 386SX, or 386) today, insist on 1:1. Buyers of XT systems will find 1:3 controllers their best bargain; there aren't any 1:1 XT controllers, and the 1:2 controllers are a bit expensive. All microchannel PS/2 systems come standard with 1:1 controllers.
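To see why the second number matters, it can help to sketch how interleaving spreads consecutive sectors around a track. With a 1:N interleave, consecutively numbered sectors sit N slots apart, so reading a whole track takes about N revolutions. The following is an illustrative sketch, not any particular controller's method; the 17 sectors per track, 512-byte sectors, and 3,600 rpm are assumed figures typical of ST506-class drives rather than numbers from this article.

```python
def interleaved_layout(sectors_per_track, interleave):
    """Return the logical sector number stored in each physical slot."""
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        while layout[slot] is not None:  # slot taken: slide to the next free one
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout


def track_read_rate(sectors_per_track, interleave, rpm=3600, sector_bytes=512):
    """Approximate bytes per second when reading whole tracks in order."""
    revolutions_per_second = rpm / 60
    track_bytes = sectors_per_track * sector_bytes
    return track_bytes * revolutions_per_second / interleave


for factor in (1, 2, 3):
    print(f"1:{factor}", interleaved_layout(17, factor))
    print(f"     about {track_read_rate(17, factor):,.0f} bytes per second")
```

Under these assumptions a 1:1 controller can pull roughly three times as much data per second off a track as a 1:3 controller, provided the rest of the system can keep up.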

Maintaining Cache Flow

Most 1:1 controllers include a speed-enhancing feature called on-board cache. A cache is necessary because hard disks retrieve data thousands of times more slowly than your computer's RAM. Every time the computer needs to read the hard disk, it must twiddle its thumbs, waiting (and waiting and waiting...) for a device that seems, in terms of CPU speeds, positively geological in time scale.

It would be nice just to copy the whole hard disk to the much-faster RAM, but that's impractical. Buying even enough RAM to accommodate a 20MB disk would be prohibitively expensive.

Computer designers have noticed, however, that most of us seem to return to the same areas on the disk over and over again. Even though your hard disk is 20MB in size, you may do 90 percent of your work in just 2MB or so. That's where a cache comes in.

A cache is a TSR (memory-resident) program that sets aside some of your PC's memory as a temporary holding area. It then monitors your disk usage. Every time DOS goes to read a file, the cache transparently copies that file's contents to its holding area in memory. Then, if DOS needs to reread that file later, the cache supplies the file to DOS, fooling DOS into thinking that the information came from the disk drive.

The benefit? The file reread occurs by transferring information from memory to memory, rather than from disk to memory, yielding much faster apparent disk performance. If you have expanded or extended memory that you aren't using, putting in a cache program is an ideal way to speed up your disk subsystem.
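The idea behind a read cache can be sketched in a few lines of Python. This is a simplified illustration, not how any of the commercial cache programs actually work: it caches numbered blocks rather than whole files, it discards the least recently used block when full (an assumed policy), and the class and function names are invented for the example.

```python
from collections import OrderedDict


class ReadCache:
    """Keep recently read disk blocks in memory and count hits and misses."""

    def __init__(self, read_from_disk, capacity_blocks):
        self.read_from_disk = read_from_disk  # the slow path
        self.capacity_blocks = capacity_blocks
        self.blocks = OrderedDict()           # block number -> data
        self.hits = 0
        self.misses = 0

    def read(self, block_number):
        if block_number in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_number)  # mark as recently used
            return self.blocks[block_number]
        self.misses += 1
        data = self.read_from_disk(block_number)
        self.blocks[block_number] = data
        if len(self.blocks) > self.capacity_blocks:
            self.blocks.popitem(last=False)        # drop least recently used
        return data


def slow_disk_read(block_number):
    """Stand-in for a real disk read."""
    return f"contents of block {block_number}"


cache = ReadCache(slow_disk_read, capacity_blocks=4)
for block in [1, 2, 1, 1, 3, 2, 1]:
    cache.read(block)
print(f"{cache.hits} reads served from memory, {cache.misses} from disk")
```

Because the same few blocks are requested over and over, most of the reads in this little run never touch the disk, which is the whole point of a cache.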

PC Tools, Mace, and The Norton Utilities all include cache programs, or you may want to pick up a copy of Multisoft's PC-Kwik (call Multisoft at 503-644-5644). If you've got the memory for one, a 512K cache will speed up apparent disk speed quite a bit. Now that computer memory is so much cheaper, you may want to spend some cash on memory so you can spend that memory on cache (sorry, couldn't resist).

Thus far, I've explained caches as add-on software. But some hard disk-controller designers have gone a bit farther and actually have implemented small hardware caches right on the controller. The caches tend to be 8K-32K in size.

It sounds like a good idea, but it often isn't. The problem is that a cache that tiny doesn't do much. An 8K cache makes a disk look really fast to the kind of small speed-test programs that computer magazines run when writing reviews, but it doesn't help much for real-world applications.

Further, built-in caches can confuse many disk-tester programs like SpinRite, Disk Technician, and the like. The cache makes them think the system is a good bit faster than it actually is. The bottom line is this: If your controller has an on-board cache, fine. But make sure you can disable the caching so you can reliably run a disk-maintenance program in the future.

Sector Translation

The last thing to look out for when shopping for a controller is sector translation. When hard disks first became popular in the PC world around 1983, they used a disk-encoding method called MFM (Modified Frequency Modulation, discussed in the next section).

MFM is slowly being replaced by RLL (Run Length Limited). RLL makes it easier to build large-capacity drives, and it, too, is discussed in the next section.

In 1986, when RLL first appeared on the PC scene, some PC programs had trouble talking to RLL-type disk subsystems because they looked different from the MFM-type disk subsystems that the programs had been designed to expect.

That's not a problem with today's software, but at the time, the makers of RLL disk controllers decided to solve the problem with sector translation.

Sector translation makes a newer RLL disk subsystem look like an older MFM disk subsystem. Most translating controllers give you the option to disable translation and "come clean" about their RLL-ness.

Why disable translation? Again, because of SpinRite and the crowd. Disk-fixer and -maintenance programs are greatly hampered in what they can do for your disk if the controller is translating. Make sure you've got the option to disable translation. You'll also see translation on some of the 300MB and larger drives, as well as on many IDE drives, discussed in the next section.
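One way to picture sector translation is as a change of disk geometry: the controller accepts a cylinder, head, and sector address in the layout the software expects and converts it to the layout the drive really has. The sketch below is only an illustration of that idea. The 17 and 26 sectors per track are figures commonly associated with MFM and RLL formatting on ST506 drives, the four heads are an arbitrary assumption, and real controllers may translate quite differently.

```python
def chs_to_linear(cylinder, head, sector, heads, sectors_per_track):
    """Flatten a cylinder/head/sector address (sectors count from 1)."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def linear_to_chs(linear, heads, sectors_per_track):
    """Expand a flat sector number back into cylinder/head/sector."""
    cylinder, remainder = divmod(linear, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


def translate(cylinder, head, sector, logical, physical):
    """Map an address in the logical geometry to the physical geometry."""
    linear = chs_to_linear(cylinder, head, sector, *logical)
    return linear_to_chs(linear, *physical)


LOGICAL = (4, 17)   # (heads, sectors per track) the software sees (assumed)
PHYSICAL = (4, 26)  # (heads, sectors per track) on the real drive (assumed)

print(translate(1, 0, 1, LOGICAL, PHYSICAL))
```

The sketch also hints at why disk-maintenance programs dislike translation: the address a program asks for is not the place on the platter where the data actually lives.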

Interface Basics

Up to now, you've seen the characteristics that a drive or a controller can have; these characteristics can be mixed and matched in just about any way. But the drive and controller have to agree on how to communicate; that's determined by their interface type and encoding scheme.

How does the controller talk to the drive? Originally (before 1983), you'd buy a controller and a drive from the same company, so you wouldn't worry about the interface. Nowadays, it's likely that you'll want to buy a controller from one vendor, like Western Digital or Data Technology, and a drive from another vendor, like Seagate, Maxtor, or Mitsubishi. This implies that both the drive and controller must support some common standard interface.

Originally, the now-defunct Shugart Technologies used something it called the ST 506/412 interface, or as it's more commonly known, ST506. Most PC drives use ST506 to this day. It can support a maximum data-transfer rate of 7.5 million bits per second (Mbps). That doesn't sound slow, but it is, and that's one reason why it's slowly fading from the scene. The other reason is that it's noise prone.

Real muscle drives these days are using a replacement interface called ESDI (Enhanced Small Device Interface). ESDI, like all other interfaces after the ST506, reduces noise and boosts speed and reliability by putting part of the controller right on the drive. ESDI could theoretically support 24 Mbps. The ESDI interface has another useful feature: the drive can describe itself to the controller, which makes drive setup easier.

Another interface that high-end machines are using more and more goes by the unfortunate acronym SCSI (pronounced scuzzy and standing for Small Computer Systems Interface). IBM's recent announcement of some PS/2 models with a SCSI interface and the U.S. government's recent gigantic purchase of SCSI-equipped PCs under its Desktop III contract will boost SCSI acceptance in the PC world.

SCSI transfers data at up to 20 megabits per second. Eventually SCSI will support over 100 megabits per second, but for now it's in the ESDI range of speed. Taking things a bit farther than ESDI, SCSI actually puts the whole controller on the drive; the board in the computer really doesn't have much to do and is, strictly speaking, not a controller but a host adapter.
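Interface speeds are quoted in bits per second, while most of us think in bytes. A quick conversion, sketched here with the peak figures quoted in this article (the function name is invented for illustration), puts the three interfaces on the same footing. These are theoretical ceilings; real throughput will be lower.

```python
def megabits_to_megabytes_per_second(megabits_per_second):
    """Convert a raw interface rate from megabits to megabytes per second."""
    return megabits_per_second / 8


# Peak rates in megabits per second, as quoted in the article.
quoted_rates = {"ST506": 7.5, "ESDI": 24, "SCSI": 20}
for interface, megabits in quoted_rates.items():
    megabytes = megabits_to_megabytes_per_second(megabits)
    print(f"{interface}: {megabits} Mbps is about {megabytes:.2f}MB per second")
```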

SCSI is also neat because the interface lets you daisy-chain up to eight devices. That means theoretically you could run a couple of SCSI hard disks, a CD-ROM player (which also uses SCSI), and a scanner all off a single host adapter. While SCSI is probably a better interface in the long run, ESDI is currently better suited to the DOS environment and probably the better bet for now.

IDE (Integrated Drive Electronics) is basically a SCSI-like approach to ST506. The electronics can't handle SCSI speeds, and the interface relies on ST506 technology, but the controller is, again, located right on the drive, allowing greater transfer rates. The resulting stream of digital data is already preformatted for an IBM-type bus on a 40-pin connector, rather than using the more common two-cable approach. Compaq uses IDE extensively in its systems.

Sound good? It is, basically, with one twist: You can't maintain IDE with software. You're not supposed to low-level format it, and in fact I've seen a low-level format damage a Compaq drive. The Norton Utilities will work for some data recovery, but, again, disk-fixer programs can't help you much because IDEs tend to be sector-translating systems. Further, there's not really a standard IDE interface. In fact, one data-recovery firm reports at least 25 different kinds of IDE. There's something a bit too disposable about these drives; they're basically reliable, but you're helpless if they do develop a problem. IDE would be a very good idea if programs could reformat the drive and the IDE manufacturers would agree on a standard.

These things may be the case in a year or two. Right now, be careful.

What About Hardcards?

Several firms offer hardcards, which are controller boards with a slim drive mounted right on them. They don't take up a drive bay, but they do take up a slot. Some, in fact, are designed so badly that they take up three slots; look out for these! Hardcards are nice if you need a means to transport a lot of information, such as if you had to set up 20 identical machines in a learning lab. You'll probably want to avoid them, however, since they tend to be IDE and many generate a fair amount of heat near your other circuit boards.

The Great Encoding Debate

Part of a disk-system designer's job is figuring out how best to pack data on a drive. That's called the disk's encoding scheme, and it's always a matter of compromise: more data in an area means less reliability. Most PC drive/controller combinations prior to 1988 used modified frequency modulation (MFM).

Around 1986, a newer encoding scheme, run length limited (RLL; the idea was borrowed from mainframe drive design), started appearing on PC systems. It took any given drive and packed 50 percent more data on it: a drive that held 20MB when connected to an MFM controller could hold 30MB when paired with an RLL controller.
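The 50 percent figure falls out of simple geometry arithmetic. Here is a rough sketch, assuming 17 sectors per track under MFM and 26 under RLL (figures commonly cited for ST506 drives) and a hypothetical drive with 615 cylinders and 4 heads; none of these numbers come from this article.

```python
def formatted_capacity_bytes(cylinders, heads, sectors_per_track, sector_bytes=512):
    """Formatted capacity from drive geometry."""
    return cylinders * heads * sectors_per_track * sector_bytes


# Same assumed mechanism, two encodings.
mfm_bytes = formatted_capacity_bytes(615, 4, 17)
rll_bytes = formatted_capacity_bytes(615, 4, 26)

print(f"MFM: {mfm_bytes:,} bytes")
print(f"RLL: {rll_bytes:,} bytes")
print(f"RLL holds {rll_bytes / mfm_bytes - 1:.0%} more")
```

With these assumed figures the same mechanism goes from a little over 21 million bytes to nearly 33 million, an increase of just over half.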

Obviously, the extra 50 percent doesn't come without cost. You can't just hook up an RLL controller to a drive that's been doing MFM, reformat, and instantly get more space. The drive has to be engineered better to be able to reliably store the more compact RLL format. That's why you see drives rated as either MFM or RLL quality.

For example, the Seagate ST4096 (an 80MB MFM drive) and the ST4144R (a 120MB RLL drive) are basically the same drive; 120MB is 50 percent larger than 80MB. The 4144R is just built a bit better, and it costs a little more. The 4096 is $527 discounted; the 4144R is $589 discounted.

RLL has unfairly gotten a bad name in some circles because some computer dealers in the late 1980s matched up MFM-quality drives with RLL controllers. The result was larger-capacity, unreliable drives and a legion of headaches for PC fix-it people.

So when you're buying an RLL controller, buy an RLL-quality drive. Or you could buy a little insurance by matching up an RLL-quality drive with an MFM controller. Consider this: The ST4096 is a good drive, but why not spend about $60 more for the ST4144R and format it under MFM as 80MB? After all, $589 is still a reasonable price for an 80MB drive, and you'd have an overengineered system that's very reliable.

By the way, when people advertise MFM or RLL drives, they really mean MFM- or RLL-encoded ST506. ESDI, SCSI, and IDE all encode with RLL.

Recommendations

Growing program sizes, downloadable fonts, and graphics make drives of 80MB and larger a necessity. The Maxtor or Seagate 80MB drives are both good and widely discounted. If you buy a 40-megger now, you'll only save a little money over an 80, you'll end up buying a larger drive in a year or two, and you'll be giving up a voice coil for a band stepper.

For an XT system, buy a 1:3 controller like the Western Digital (WD) XT-GEN or the Data Technology (DT) 5150 CX; both are good, basic, inexpensive 8-bit MFM controllers that can support a wide variety of drives. For XT RLL, try the WD 1004-27X. Avoid the Seagate ST-11R XT RLL controller, as it has a peculiarity that limits data reconstruction and recovery possibilities, and, besides, it only supports Seagate drives.

For an AT system, WD offers the 1006V-MM2 MFM controller and the 1006V-SR2 RLL controller. DT's 7280 MFM controller is also quite trouble-free. All three are 1:1 controllers, and each can be had for about $120.

If you need something larger (over 120MB), you'll probably have to go ESDI. CDC Imprimus (now owned by Seagate) makes good drives, as do Maxtor and Micropolis.

When buying computers, think twice about IDE drives. Again, IDE is a good idea, and you'll save a few bucks, but it robs you of a lot of disk-maintenance options. That means you should be careful about buying hardcards.

Ensure that on-board cache and sector translation, if present, can be disabled to get the maximum benefit from disk-maintenance programs.

* Gidget, the dog on our title page, was treated fairly and humanely.

Hard-Driving Acronyms

ARLL. Advanced Run Length Limited is a data-encoding method used in IDE drives that allows storage of 50 percent more data than standard RLL and 100 percent more data than MFM.

ESDI. Enhanced Small Device Interface is an interface standard that puts some controller functions on the drive itself. ESDI allows for data transfers of 1MB-3MB per second and can be used for drives up to 1 gigabyte in size.

IDE. Integrated Drive Electronics, like SCSI, is an interface design that puts the controller on the drive itself. IDE, however, only offers ST506 performance.

MB. One megabyte is 1,000,000 bytes, or 1,000K.

MFM. Modified Frequency Modulation is a data-encoding method that has been the standard until recently. Now, RLL is more common, at least for high-capacity drives.

ms. One millisecond is 1/1000 second. Milliseconds are commonly used to measure a hard disk's seek time.

RLL. Run Length Limited, like MFM, is a data-encoding method, but RLL allows storage of 50 percent more data than MFM.

SCSI. Small Computer System Interface is an interface standard that puts most of the controller functions on the drive itself. It offers transfer speeds of 1MB-4MB per second. SCSI also allows as many as eight devices to be daisy-chained together.

ST506. Shugart Technologies' 506/412 interface is an interface that supports transfer speeds of about 500K per second and is limited to a hard disk of 127.5MB or smaller.