It's rare that a company would release internal data on drive failure rates -- even more so when that company, Backblaze, earns its living storing consumer data in the cloud. That makes the hard drive data released this week even more valuable.
Data storage service provider Backblaze yesterday revealed failure rates among more than 27,000 consumer-class hard drives it uses in its data center.
The breadth and depth of Backblaze's data has given consumers unprecedented access to specific hard drive failure rates across the three largest vendors of the technology: Seagate, Hitachi and Western Digital. It offers an unvarnished look at hard drives (models and serial numbers included), and even details which drives Backblaze will no longer use because they're so unreliable.
While users should check out the actual data for more granular information, the big picture boils down to this: Over a three-year period, 3.1% of Hitachi's drives failed; 5.2% of Western Digital's drives died; and a sizable 26.5% of Seagate's drives failed.
"Hitachi does really well," Backblaze said in its blog. "There is an initial die-off of Western Digital drives, and then they are nice and stable. The Seagate drives start strong, but die off at a consistently higher rate, with a burst of deaths near the 20-month mark."
The study includes data on 15 drive models totaling more than 12,000 drives each from Seagate and Hitachi, and almost 3,000 drives from Western Digital. There were also several dozen drives from both Toshiba and Samsung, but not enough for solid statistical results.
IT vendors often pitch studies and "user surveys" to the press. Most of the time, those studies are overtly self-serving. For example, my colleagues and I regularly get study and survey pitches from security software makers on consumer data vulnerability -- i.e. "your data is vulnerable, buy our software to protect it."
Professional journalists typically ignore these kinds of reports, unless they can be used in concert with objective data. So why make a big deal over Backblaze's data?
Gleb Budman, Backblaze's co-founder and CEO, told Computerworld today that his company lives by the ethos that, when it can, it will openly share information that helps others. And no, that doesn't include customer data.
"We use Linux, we use Tomcat, we use Apache. We use a variety of open-source software and information people publish about technology or marketing. So we like to give back when we can," he said.
Now, for a grain of salt. Obviously, on some level Backblaze compiled the drive failure-rate data to draw attention to its $5-a-month storage service. The message is simple: If hard drives fail, yours could, too. So go out and sign up for the cloud storage service.
But in this one case, the data offered by Backblaze is still compelling.
Racks of Backblaze's Pods - storage arrays filled with consumer-class hard drives
It goes without saying that the hard drive industry is an incestuous one where companies regularly acquire one another's technology. Going back to the early 2000s, Maxtor acquired Quantum's drive division; Seagate acquired Maxtor; then it purchased Samsung's and LaCie's. In 2009, Toshiba bought Fujitsu's drive business. In 2011, Western Digital purchased Hitachi's drive facilities and then sold them to Toshiba. You can try to keep up, but it's not easy.
So even with Backblaze's hard and fast data on drive failure rates, you might still be left uncertain as to which products are best.
But assuming Backblaze's failure-rate data is not skewed (and there's no reason to think it is), it is still hugely beneficial to consumers. It offers an evaluation of 15 drive models, detailing how many of each Backblaze used and which ones failed over three years in its data center. It also identifies the vendors whose products had the best overall reliability.
With that information, buyers can make a vastly more informed choice on which hard drive they'll want in a computer. Although the drives listed by Backblaze are older, Budman said his company plans to release updated failure rates on a quarterly basis.
"That will add data points in terms of drives already in this study as they will get older. We'll also be adding three petabytes of storage capacity per month to our data center, so there's new data to be collected," he said. "So as new drives come out, there will be new data released on them."
The company may also begin reporting how drives failed -- for example, whether a read/write head or an internal motor died. That data may be culled from Self-Monitoring, Analysis and Reporting Technology (SMART), a built-in drive-monitoring system most manufacturers include in their products.
One class of drive the company hopes to add once they're more affordable is helium-filled models. Helium drives will offer up to 6TB of capacity compared to today's 4TB, air-filled drives. Helium reduces friction, so manufacturers can pack more drive platters into a smaller area without overheating.
Unfortunately, because solid-state drives (SSDs) are so much more expensive than hard drives, Backblaze doesn't plan to include those in any studies any time soon -- not until SSDs achieve price parity with hard drives, Budman said.
Because it buys from retail sites, Backblaze is not beholden to drive suppliers or to any pressure they might apply to fend off bad publicity. That said, when Backblaze released its latest blog post, Seagate retweeted it. Kudos to Seagate.
Backblaze sticks its consumer drives into RAIDed storage arrays it calls "Pods." That's where it stores customer data. Because the storage servers use RAID, data is striped across multiple drives and can be rebuilt when one of them fails. In other words, data generally isn't lost when a drive fails.
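To see why a failed drive doesn't have to mean lost data, here is a minimal Python sketch of parity-based rebuilding. It assumes the simplest scheme, a single XOR parity block per stripe; the article doesn't say which RAID level Backblaze's Pods use, and the function name and sample blocks are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: one data block per drive, plus a parity block
# stored on a fourth drive.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second drive, then rebuild its block by XORing
# the surviving data blocks with the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

Because XOR undoes itself, combining the surviving blocks with the parity block leaves exactly the missing block. Single parity covers one failed drive per stripe; surviving more than one failure at a time takes additional redundancy.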
The company uses only 313 enterprise-class drives, which sit in its Dell PowerVault storage systems for corporate data. Even so, last year it published a compelling report comparing enterprise and consumer drive failure rates. It showed the annual failure rate of expensive enterprise-class drives (4.6%) was about the same as that of cheap consumer-class drives (4.2%).
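For readers wondering what an annual failure rate measures, one common way to compute it is failures divided by total drive-years of service. The sketch below assumes that definition; the article doesn't spell out Backblaze's exact formula, and the function name and fleet numbers are invented for illustration.

```python
def annual_failure_rate(failures, drive_days):
    """Return failures per drive-year of service, as a percentage."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365.0
    return 100.0 * failures / drive_years


# Hypothetical fleet: 500 drives in service for 730 days each (1,000
# drive-years in total), with 20 failures over that time.
rate = annual_failure_rate(failures=20, drive_days=500 * 730)
print(f"{rate:.1f}%")
```

Counting drive-days rather than drives keeps the comparison fair when some drives have been in service much longer than others.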
That blog post went viral, and rightfully so. That kind of information is highly useful, just as the data released this week is. It absolutely deserves your attention.
Lucas Mearian covers consumer data storage, consumerization of IT, mobile device management, renewable energy, telematics/car tech and entertainment tech for Computerworld. Follow Lucas on Twitter at @lucasmearian or subscribe to Lucas's RSS feed. His e-mail address is firstname.lastname@example.org.