Since 2013, Backblaze has published statistics and insights based on the hard drives in our data center. You'll find links to those reports below. We also publish the data underlying these reports, so that anyone can reproduce them. You'll find an overview of this data and the download links further down this page.
Each day in the Backblaze data center, we take a snapshot of each operational hard drive. This snapshot includes basic drive information along with the S.M.A.R.T. statistics reported by that drive. The daily snapshot of one drive is one record, or row, of data. All of the drive snapshots for a given day are collected into a single file with one row for each active hard drive. The file is in CSV (comma-separated values) format. Each daily file is named YYYY-MM-DD.csv, for example, 2013-04-10.csv.
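As a minimal sketch of the naming convention and file layout described above, the following Python builds a daily file name from a date and parses a snapshot with the standard `csv` module. The helper names `snapshot_filename` and `read_snapshot` are hypothetical, and the sample row is made up for illustration; it is not real drive data.

```python
import csv
import io
from datetime import date


def snapshot_filename(day: date) -> str:
    """Return the daily snapshot file name, e.g. 2013-04-10.csv."""
    return f"{day.isoformat()}.csv"


def read_snapshot(lines):
    """Parse one daily snapshot: the first row supplies the column names,
    and every remaining row becomes a dict keyed by those names."""
    return list(csv.DictReader(lines))


# Made-up sample standing in for a real daily file (illustration only).
sample = io.StringIO(
    "date,serial_number,model,capacity_bytes,failure\n"
    "2013-04-10,SERIAL0001,EXAMPLE MODEL,3000592982016,0\n"
)

rows = read_snapshot(sample)
print(snapshot_filename(date(2013, 4, 10)))
print(rows[0]["serial_number"], rows[0]["failure"])
```

To read a real file, pass an open file object instead of the in-memory sample, for example `open(snapshot_filename(day), newline="")`. Note that `csv.DictReader` returns every value as a string, so numeric columns need explicit conversion.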
The first row of each file contains the column names; the remaining rows are the actual data. The columns are as follows:
The schema may change from quarter to quarter. The basic columns (date, serial_number, model, capacity_bytes, and failure) will not change. All of the changes will be in the number of SMART attributes reported for the drives in a given quarter. There will never be more than 255 pairs of SMART attributes reported. When you load the CSV files for each quarter, you will need to account for the possibility that the number of SMART attributes differs from the previous quarter.
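One way to cope with a schema that varies between quarters is to read each file's own header and build the union of all column names, leaving a value empty wherever a file lacks a column. The sketch below does this with the standard `csv` module. The function name `load_quarters` is hypothetical, and the SMART column names in the sample (`smart_1_raw`, `smart_9_raw`) and all sample values are assumptions made for illustration.

```python
import csv
import io

# The five basic columns that are stated not to change between quarters.
BASIC_COLUMNS = ["date", "serial_number", "model", "capacity_bytes", "failure"]


def load_quarters(sources):
    """Read several CSV sources whose SMART columns may differ.

    Returns (columns, rows): the union of all column names with the basic
    columns first, and rows in which a column absent from a source is None.
    """
    columns = list(BASIC_COLUMNS)
    rows = []
    for source in sources:
        reader = csv.DictReader(source)
        # Each file carries its own header, so collect any new names.
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        rows.extend(reader)
    # Normalize every row to the full column set.
    return columns, [{c: row.get(c) for c in columns} for row in rows]


# Made-up data: the second "quarter" reports one extra SMART column.
q1 = io.StringIO(
    "date,serial_number,model,capacity_bytes,failure,smart_1_raw\n"
    "2013-04-10,S1,M,100,0,5\n"
)
q2 = io.StringIO(
    "date,serial_number,model,capacity_bytes,failure,smart_1_raw,smart_9_raw\n"
    "2013-07-01,S2,M,100,0,7,1200\n"
)

columns, rows = load_quarters([q1, q2])
print(columns)
print(rows[0]["smart_9_raw"], rows[1]["smart_9_raw"])
```

Holding every row in memory is only practical for small samples. For full quarters of data, the same header-union idea can be applied while streaming rows into a database or another on-disk store.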
You can download and use this data for free for your own purposes. All we ask is three things:
We hope the overview above gives you what you need to access and use the hard drive data we have collected. Here is the data: