Comments on: Reliability Data Set for 41,000 Hard Drives Now Open-source
https://www.backblaze.com/blog/hard-drive-data-feb2015/
Cloud Storage & Cloud Backup
Fri, 13 Aug 2021 12:03:28 +0000

By: Scott https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-310241 Fri, 15 Jul 2016 23:15:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-310241
Not sure if there is a better venue to ask a question about the nature of the data, but I’m looking at the 2014 dataset and I noticed that some serial numbers show up at the end of their first day in service as having already accumulated a large number of drive hours (using the smart_9_raw variable – I’m assuming this is the number of hours the drive was operational and in service, but I’m not certain, as I see POH on the Wikipedia page and I can’t confirm that these refer to the same column). I don’t understand how this could happen if a serial number showing up for the first time in the dataset implies that it was the first day the device was put into service. A repaired device from the previous year, perhaps?

A second question – what is going on with devices that have been in service for a number of days and then suddenly disappear from the dataset without the failure indicator ever being tripped? Are some devices being removed from service before they fail? If so, is this random, or are they being removed because somebody notices they are about to fail? The latter could greatly bias any reliability analysis.

Example:
Serial number Z3015LM8 encompasses both of these situations. It shows up on Feb 19, 2014 as having a smart_9_raw value of 583 (again, I’m assuming this reflects the number of hours the drive was operational and in service), and can be tracked all the way through the last day it appears in the data on Sep 17, 2014 where the variable failure has a value of 0 (i.e. it didn’t fail). How does this device accumulate 583 hours of service at the end of the first day it’s turned on? Also, why is the device no longer in the dataset if it didn’t fail?

Thanks!
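
Both situations described above can be checked across the whole dataset with a short script. What follows is only a minimal sketch, assuming the daily files are CSVs with `date`, `serial_number`, `failure`, and `smart_9_raw` columns and that dates are in ISO `YYYY-MM-DD` form; the `date` and `serial_number` column names, the function names, and the 24-hour threshold are assumptions made for illustration. The sample rows are illustrative as well: only the serial number, the two dates, and the 583 hours come from the example above.

```python
import csv


def read_rows(paths):
    """Yield one dict per drive-day from the daily CSV files (assumed layout)."""
    for path in paths:
        with open(path, newline="") as f:
            yield from csv.DictReader(f)


def summarize_drives(rows):
    """Return {serial: first/last sighting and whether failure was ever 1}."""
    summary = {}
    for row in rows:
        serial = row["serial_number"]
        date = row["date"]  # ISO dates compare correctly as plain strings
        hours = row.get("smart_9_raw") or ""
        entry = summary.setdefault(serial, {
            "first_date": date, "first_hours": hours,
            "last_date": date, "ever_failed": False,
        })
        if date < entry["first_date"]:
            entry["first_date"], entry["first_hours"] = date, hours
        if date > entry["last_date"]:
            entry["last_date"] = date
        if row["failure"] == "1":
            entry["ever_failed"] = True
    return summary


def flag_drives(summary, dataset_end, min_first_hours=24):
    """Return (serials with many hours on first sighting,
    serials that vanish before dataset_end without a failure)."""
    pre_used = sorted(
        s for s, e in summary.items()
        if e["first_hours"] != "" and float(e["first_hours"]) > min_first_hours
    )
    vanished = sorted(
        s for s, e in summary.items()
        if not e["ever_failed"] and e["last_date"] < dataset_end
    )
    return pre_used, vanished


# Illustrative rows only. HYPOTHETICAL1 is a made-up serial number, and the
# hour counts not quoted in the comment are left blank or invented.
EXAMPLE_ROWS = [
    {"date": "2014-02-19", "serial_number": "Z3015LM8", "failure": "0", "smart_9_raw": "583"},
    {"date": "2014-09-17", "serial_number": "Z3015LM8", "failure": "0", "smart_9_raw": ""},
    {"date": "2014-01-01", "serial_number": "HYPOTHETICAL1", "failure": "0", "smart_9_raw": "10"},
    {"date": "2014-12-31", "serial_number": "HYPOTHETICAL1", "failure": "0", "smart_9_raw": ""},
]

if __name__ == "__main__":
    pre_used, vanished = flag_drives(summarize_drives(EXAMPLE_ROWS), "2014-12-31")
    print("many hours on first sighting:", pre_used)
    print("vanished without failure:", vanished)
```

For the real data, `summarize_drives(read_rows(paths))` would replace the sample rows, with `paths` listing the daily files. Counting how many serial numbers land in each list would show whether these cases are rare or common enough to bias a survival analysis.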

By: MontyW https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-304251 Tue, 15 Mar 2016 10:30:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-304251
Brian, what model of HDD would you yourself buy for back-up purposes? ($64,000 question!)

By: Riviera https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-299381 Sun, 17 Jan 2016 14:33:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-299381 In reply to amadvance.

I believe they mentioned that migrating from smaller drives to higher-capacity drives was one of the main reasons for replacing drives; drives that went over the drive usage statistic thresholds were another, and I think those were counted as ‘failed.’

I agree though, it would be nice to have the actual numbers on these swaps.

By: m4dsk https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-297841 Wed, 02 Dec 2015 10:44:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-297841
Brian, could you please tell us whether most disks with failure labeled as 1 are actual failures or proactive replacements done by the admins? Is there any way to distinguish between the two?

By: Michael T https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-296351 Mon, 02 Nov 2015 16:10:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-296351
Has anyone offered some guidelines (and/or the scripts) to collect this data ourselves?
Sure, I can write my own flavor of it, but I’d really rather make my data comparable to, and scaled the same as, the data from the Backblaze collection process. Regardless of how good or bad the method BB has been using is, I think it would be handy if we all did it the same way so the data can be consolidated. I’m spinning a lot of disks and I’d love to provide some of this data to the outside world.
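
As a starting point for comparable data, here is a minimal sketch that turns the attribute table printed by `smartctl -A` (from smartmontools) into one row with Backblaze-style column names. Several things here are assumptions rather than the documented Backblaze process: that smartctl is an acceptable source, that the published columns follow the `smart_<id>_normalized` / `smart_<id>_raw` pattern alongside `date`, `serial_number`, `model`, `capacity_bytes`, and `failure`, and the function names themselves. The sample text is made up for illustration, and the serial number, model, and capacity in the demo are placeholders.

```python
SAMPLE_SMARTCTL_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       583
"""


def parse_smartctl_attributes(text):
    """Map attribute ID -> (normalized value, raw value) from `smartctl -A` text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header and any other lines are skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            # Column 4 is VALUE (normalized); column 10 starts RAW_VALUE.
            attrs[int(fields[0])] = (int(fields[3]), fields[9])
    return attrs


def to_backblaze_style_row(date, serial, model, capacity_bytes, failure, attrs):
    """Build one drive-day row using the assumed Backblaze-style column names."""
    row = {
        "date": date,
        "serial_number": serial,
        "model": model,
        "capacity_bytes": capacity_bytes,
        "failure": failure,
    }
    for attr_id, (normalized, raw) in sorted(attrs.items()):
        row[f"smart_{attr_id}_normalized"] = normalized
        row[f"smart_{attr_id}_raw"] = raw
    return row


if __name__ == "__main__":
    attrs = parse_smartctl_attributes(SAMPLE_SMARTCTL_OUTPUT)
    # Serial, model, and capacity below are placeholders, not real drive data.
    print(to_backblaze_style_row("2015-11-02", "EXAMPLESERIAL", "EXAMPLE MODEL",
                                 4000787030016, 0, attrs))
```

Keeping the raw value as the first token of the RAW_VALUE column is a simplification; some attributes print extra annotations after the number, and how those should be recorded would need to match whatever convention the shared dataset uses.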

By: jack phelm https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-285521 Fri, 05 Jun 2015 04:44:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-285521 In reply to amadvance.

Still waiting for a comment on this!

By: jack phelm https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-285511 Fri, 05 Jun 2015 04:42:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-285511
What are attributes 15 and 255 for? I can’t find any reference to them on the web.
Any help would be appreciated.

By: hartfordfive https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-280481 Sun, 15 Mar 2015 19:13:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-280481
For anyone that might be interested, I’ve created a Go application to import the data into Elasticsearch. You can view it at https://github.com/hartfordfive/backblaze-hd-data-importer. I appreciate any positive feedback anyone can provide me with!

By: Adela https://www.backblaze.com/blog/hard-drive-data-feb2015/#comment-279381 Wed, 11 Mar 2015 09:12:00 +0000 https://www.backblaze.com/blog/?p=23131#comment-279381
Hi Brian, I was wondering if in the data, there’s a way of distinguishing between disks which have been replaced due to actual drive failure, and which have been replaced due to predicted failure? Are there many disks that are replaced due to the prediction based on the smart parameters, or do you mostly wait for the disks to completely fail? Thanks!
