Is Your SSD Reliable? Storage Study Examines Failure Rates For Thousands Of Drives
As with most storage media, the life of a solid state drive (SSD) can be really long, disappointingly short, or somewhere in between. That's not exactly helpful info. So how reliable are SSDs really? That, dear reader, is a loaded question. Fortunately, the folks at Backblaze, a cloud backup provider that employs both mechanical hard disk drive (HDD) and SSD storage in its pods, continue to release reliability reports that compare and highlight failure rates on thousands of drives.
Backblaze began adding SSDs to its arsenal in the fourth quarter of 2018. By the end of 2021, it was hosting 2,200 SSDs. That number grew to 2,558 SSDs by the end of 2022, and as of June 30, 2023, the cloud backup firm is using 3,144 SSDs in its storage servers.
"In this environment, the drives do much more than boot the storage servers. They also store log files and temporary files produced by the storage server. Each day a boot drive will read, write, and delete files depending on the activity of the storage server itself," Backblaze explains.
Drive stats and trends become more meaningful over time, as they're less skewed by outliers. This is important to consider because, taken at face value, an 829.55% annualized failure rate (AFR) for Seagate's 240GB SSDSCKKB240GZR appears rather damning. But the devil is in the details.
There were just two drives of that model at the beginning of the year, one of which failed shortly after it was installed. The other drive is still running and thus has a 0% AFR.
"Which AFR is useful? In this case neither, we just don’t have enough data to get decent results. For any given drive model, we like to see at least 100 drives and 10,000 drive days in a given quarter as a minimum before we begin to consider the calculated AFR to be 'reasonable'," Backblaze states.
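To see how a single failure can produce a number that large, here is a minimal sketch of the AFR arithmetic, assuming the commonly cited definition of failures divided by drive-years of service, times 100. The function names are hypothetical, and the 44 drive days figure is not from the report; it is simply the value that reproduces 829.55% with one failure.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of service, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


def meets_minimum_sample(drive_count: int, drive_days: int) -> bool:
    """Check the quarterly floor Backblaze quotes: 100 drives and 10,000 drive days."""
    return drive_count >= 100 and drive_days >= 10_000


# Assumed inputs for illustration only: one failure across 44 drive days
# (44 is inferred from the 829.55% figure, not taken from the report).
afr = annualized_failure_rate(failures=1, drive_days=44)
print(f"{afr:.2f}%")

# Two drives and a few dozen drive days fall far short of the quoted floor.
print(meets_minimum_sample(drive_count=2, drive_days=44))
```

With so few drive days in the denominator, one failure is scaled up as if the drives kept failing at that pace all year, which is why the sample size check matters more than the headline percentage.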
The data is still included for the sake of completeness, and if you want to dive into the numbers, by all means do; that's why these frequent audits exist. But in terms of meaningful analysis, arguably the most useful piece is a bathtub curve that Backblaze put together.
Previously, Backblaze graphed its HDD failures over time to see how they fit the classic bathtub curve used in reliability engineering. Here's how SSDs compare in the still-early going...
"While the actual curve (blue line) produced by the SSD failures over each quarter is a bit 'lumpy', the trend line (second order polynomial) does have a definite bathtub curve look to it," Backblaze says. "The trend line is about a 70% match to the data, so we can’t be too confident of the curve at this point, but for the limited amount of data we have, it is surprising to see how the occurrences of SSD failures are on a path to conform to the tried-and-true bathtub curve."
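For the curious, a second order polynomial trend line and its goodness of fit can be sketched in a few lines. This is only an illustration: Backblaze does not say how it produced its trend line, reading the "70% match" as an R² of roughly 0.70 is an assumption, and the quarterly values below are made up rather than taken from the report. The helper names `fit_quadratic` and `r_squared` are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c).

    Needs at least three distinct x values, otherwise the system is singular.
    """
    s = [sum(x ** k for x in xs) for k in range(5)]                  # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # sums of y*x^k
    # Normal equations as an augmented matrix, unknowns ordered (c, b, a).
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def r_squared(xs, ys, coeffs):
    """Share of the variance in ys explained by the fitted quadratic."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up quarterly failure percentages, NOT Backblaze data.
quarters = list(range(1, 11))
failure_pct = [2.1, 1.4, 1.6, 0.7, 0.9, 0.5, 1.0, 0.8, 1.5, 1.9]

coeffs = fit_quadratic(quarters, failure_pct)
print("a, b, c =", coeffs)
print("R^2 =", r_squared(quarters, failure_pct, coeffs))
```

A positive `a` means the fitted parabola opens upward, which is the bathtub look: higher failure rates at the start and end, with a dip in the middle.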
There's really no substitute for time, so we could see failure trends change as more SSDs come into the mix and existing ones rack up more days, weeks, months, and years of service. Whether SSDs take a prolonged dip on the bathtub curve remains to be seen.
Regardless of what you take away from the storage audit, we can't stress enough that you should frequently back up your data, especially the more important bits (precious family photos and videos, work projects, and so forth). Multiple backups are the best way to go, including at least one offsite backup, be it a portable drive stored at a relative's house (or in a safe deposit box) or in the cloud.