I have an 8TB WD Gold HDD. Here they claim an MTBF (mean time between failures) of 2M hours. What is the statistical variance (at least an estimate) of this figure?

user865044

1 Answer

MTBF is often based on an exponential lifetime model, in which the probability that the drive survives to time T is exp(-lambda*T). The MTBF is equal to 1/lambda. The variance can be calculated: it is 1/lambda², so the standard deviation is also 1/lambda (hence equal to the MTBF itself).
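As a rough illustration of the formulas above, here is a minimal Python sketch. It assumes the pure exponential model and the 2M-hour figure from the question; the function names are made up for this example, and the simulation is only a sanity check, not vendor data.

```python
import math
import random
import statistics

def exponential_lifetime_stats(mtbf_hours):
    """Mean, variance and standard deviation of an exponential lifetime."""
    lam = 1.0 / mtbf_hours          # constant failure rate, per hour
    mean = 1.0 / lam                # equals the MTBF
    variance = 1.0 / lam ** 2       # 1/lambda^2, in hours squared
    std_dev = math.sqrt(variance)   # 1/lambda again, i.e. the MTBF
    return mean, variance, std_dev

def simulated_std_over_mean(mtbf_hours, n=100_000, seed=0):
    """Draw n exponential lifetimes and return sample std / sample mean."""
    rng = random.Random(seed)
    samples = [rng.expovariate(1.0 / mtbf_hours) for _ in range(n)]
    return statistics.pstdev(samples) / statistics.fmean(samples)

if __name__ == "__main__":
    mean, variance, std_dev = exponential_lifetime_stats(2_000_000)
    print(f"mean     = {mean:.3e} h")
    print(f"variance = {variance:.3e} h^2")
    print(f"std dev  = {std_dev:.3e} h")
    print(f"simulated std/mean = {simulated_std_over_mean(2_000_000):.3f}")
```

Under this model the ratio of standard deviation to mean is always 1, whatever MTBF is plugged in, which is why the variance carries no information beyond the MTBF itself.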

PierU
  • Now the point is, the variance brings little useful information: (1) it depends entirely on the assumed distribution and is not independent of the MTBF (the exponential model is described by a single parameter); (2) the maximum of the density is always at T=0, not anywhere near the MTBF value. Some vendors may use models other than the simple exponential one, but they are not very different from it. The exponential model assumes a constant failure rate over time, which is close to what happens in the real world. – PierU Aug 29 '22 at 14:12
  • The best for me would be if you could roughly plot the first-failure graph for the WD 8TB Gold. The "x" axis should show the hours of HDD uptime and the "y" axis the probability that it fails in a given interval on "x". So something like the density f. – user865044 Aug 29 '22 at 18:18
  • For x expressed in years and y the probability (in %), this is the curve: https://prnt.sc/xaM7OX3OmEEK . For instance, the value at 200 (approx. 0.17%) is the probability that the drive fails for the first time during year 201. On a more useful scale you can see that the probability is almost constant during the first years (approx. 0.35%): https://prnt.sc/WbTATvAyp3xo – PierU Aug 30 '22 at 11:46
  • Is there validation from actual drive failure numbers for these, or is this just the model manufacturers *assume*? Google did its large [drive failure trends](https://static.googleusercontent.com/media/research.google.com/en//archive/disk_failures.pdf) study, but that's now 15 years old. – MiG Aug 30 '22 at 11:50
  • More useful is the graph showing the probability that the drive lasts at least x years: https://prnt.sc/u6MXRgeYLrdH / https://prnt.sc/ulsjWmnNh3nC . You can see that there is a 96% chance that it lasts at least 10 years. – PierU Aug 30 '22 at 11:50
  • Remember that these estimations assume that the drive is operated in ideal conditions: temperature not exceeding 40°C, no vibrations, etc... – PierU Aug 30 '22 at 11:51
  • @MiG it's quite difficult to find confirmations from real world usage... The vendors perform accelerated aging tests and extrapolate. Backblaze is regularly publishing drive failure stats, though: https://www.backblaze.com/blog/backblaze-drive-stats-for-q2-2022/ – PierU Aug 30 '22 at 13:05
  • MTBF sounds more like a promise from the manufacturers than a genuine statistic then :) – MiG Aug 30 '22 at 13:39
  • @MiG well, it's difficult anyway to produce real statistics on products at the moment they enter the market. Anyway, what is more meaningful in practice is the annual failure rate (even though it is completely linked to the MTBF) – PierU Aug 30 '22 at 14:06
  • True. What I'm hoping for however is large companies having more recent similar studies to the google one. That would probably also provide some depth regarding the question of the topic starter :) – MiG Aug 30 '22 at 15:04
  • @PierU I've got this interesting answer: Hello Jan, Thank you for your reply. `I am sorry but variance of MTBF is not publicly available information.` If you have any further questions, please reply to this email and we will be happy to assist you further. Sincerely, Angelo P. Western Digital Customer Service and Support https://www.westerndigital.com/support – user865044 Sep 01 '22 at 08:16
  • +1 for just outright mailing them! I think the best channel for this kind of data would be large users (like google) though :) – MiG Sep 01 '22 at 08:51
  • Their answer looks like "I don't know" :) – PierU Sep 01 '22 at 09:18
  • There are common misunderstandings about the MTBF: it is obviously not an observed statistic (2.5 Mhours is about 285 years, so no real-world test can be made), and it is also obviously not the expected lifetime of the drive (no one can imagine a drive lasting 285 years). The MTBF is nothing more than a mathematical parameter that describes the reliability during the typical *operational* lifespan of the drive (a drive is rarely operated for more than 10 years; it is simply retired at some point even if it still works). That's why vendors now tend to promote the Annualized Failure Rate instead. – PierU Sep 01 '22 at 09:44
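
For anyone who wants to reproduce the numbers quoted in the comments, here is a small Python sketch. It assumes the simple exponential model, the 2.5M-hour MTBF mentioned in the last comment, and 8766 hours per year (365.25 days); the function names are hypothetical and nothing here comes from Western Digital.

```python
import math

HOURS_PER_YEAR = 8766.0  # assumption: 365.25 days * 24 h

def survival(years, mtbf_hours):
    """Probability that the drive is still working after `years` of operation."""
    return math.exp(-years * HOURS_PER_YEAR / mtbf_hours)

def first_failure_in_year(n, mtbf_hours):
    """Probability that the first failure happens during year n (1-based)."""
    return survival(n - 1, mtbf_hours) - survival(n, mtbf_hours)

def annualized_failure_rate(mtbf_hours):
    """Probability of failing within one year of continuous operation."""
    return 1.0 - survival(1, mtbf_hours)

if __name__ == "__main__":
    mtbf = 2_500_000  # hours, assumed from the comment above
    print(f"AFR                     : {annualized_failure_rate(mtbf):.2%}")
    print(f"lasts at least 10 years : {survival(10, mtbf):.1%}")
    print(f"first failure in yr 201 : {first_failure_in_year(201, mtbf):.2%}")
```

With these assumptions the results come out near 0.35%, 96.5% and 0.17%, in line with the values read off the linked plots. It also shows how directly the Annualized Failure Rate follows from the MTBF: for small rates it is roughly 8766 / MTBF.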