Reliable disk arrays without a need to replace failed disks?

Posted January 6, 2015

In the world of technology there is no such thing as perfect reliability. However, it is possible to get close to it by reaching unusually high levels of failure-free operation. Performance of this kind is required in the banking sector, the military, aeronautics and medicine, and it usually involves very complex systems with very high upkeep costs.

Picture: Shirokane1 High-Speed Large-Capacity Disk Array System. Data array systems like this one require exceptionally high levels of reliability. Could it be possible to make such or similar systems maintenance-free? Image source: Human Genome Center, University of Tokyo.

However, the balance between system complexity and maintenance costs is gradually shifting. This is happening at least in information technology, where the cost of producing a system keeps falling relative to the cost of servicing it.

This is the idea behind a new research paper published on arXiv.org this week. Its authors, a team of computer engineering researchers, propose building disk arrays for data storage that would not require any major maintenance even when individual components fail, while still achieving a reliability (the probability of not losing data) as high as 99.999 percent over four years.

Now this sounds really captivating, doesn’t it? Is it really possible to make a data storage system completely maintenance-free or, to be more exact, to make this idea financially viable? From a technical point of view, you could buy several of the most advanced disks, connect them in a suitable RAID configuration, and be done. But what about the “maintenance-free” part and the cost efficiency?

In traditional solutions, failed disks in a disk array have to be replaced and their contents regenerated. However, this approach has some disadvantages. “First, it introduces an additional delay, which will have a detrimental effect on the reliability of the storage system. Second, the cost of the service call is likely to exceed that of the equipment being replaced”, argue the authors of the study.

To avoid these negative effects, the researchers propose using self-repairing disk arrays. The disks are not literally self-repairing, the authors explain. Instead, such an array would contain enough spare disks to replace the failed ones automatically over the expected lifetime of the array.

Data disk failure rates obtained from an online storage service. Image courtesy of the researchers.

How many spare disks exactly would we need? The researchers argue that it is possible to build a disk array that achieves 99.999 percent reliability over four years with a space overhead not exceeding that of disk mirroring. To prove this concept, they devised a model that simulates the reliability of a disk array based on standard disk parameters (e.g. a five-year lifetime and a mean time to failure (MTTF) of 100 000 hours). Later, the model was complemented with real-world disk reliability data obtained from a major online backup service.
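
To get a feel for what an MTTF of 100 000 hours means in practice, here is a minimal Python sketch. It assumes a constant failure rate of 1/MTTF (an exponential lifetime model), which is a simplification chosen for this illustration and not necessarily the model used in the paper. The function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days, leap years ignored


def survival_probability(mttf_hours: float, years: float) -> float:
    """Probability that a single disk still works after `years`.

    Assumption (not from the paper): constant failure rate 1 / MTTF.
    """
    return math.exp(-years * HOURS_PER_YEAR / mttf_hours)


def annual_failure_rate(mttf_hours: float) -> float:
    """Probability that a single disk fails within one year."""
    return 1.0 - survival_probability(mttf_hours, 1.0)


def expected_replacements(active_disks: int, mttf_hours: float, years: float) -> float:
    """Expected number of failures among `active_disks` slots over `years`.

    Assumption: every failed disk is swapped for a spare immediately,
    so the number of active disks stays constant.
    """
    return active_disks * years * HOURS_PER_YEAR / mttf_hours


if __name__ == "__main__":
    mttf = 100_000  # hours, the value quoted in the article
    print(f"Annual failure rate:     {annual_failure_rate(mttf):.1%}")
    print(f"Four-year disk survival: {survival_probability(mttf, 4):.1%}")
    print(f"Expected replacements (12 active disks, 4 years): "
          f"{expected_replacements(12, mttf, 4):.2f}")
```

Under these assumptions, an MTTF of 100 000 hours corresponds to an annual failure rate of roughly 8.4 percent, so a single disk has only about a 70 percent chance of surviving four years. This is why an unattended array needs a generous pool of spares.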

Complete two-dimensional adaptive RAID arrays were selected as the base configuration for the simulation model because they have lower update costs than three-dimensional RAID arrays, the scientists note. The simulation results showed that, to achieve the desired level of reliability (99.999% over four years), the required number of spare disks and parity disks would be approximately equal to the number of data disks. However, this approach would not be economically viable in systems with a very large number of data disks (the most efficient scenario, with the smallest space overhead, contained 45 disks).

Four-year survival rates and space overheads of two-dimensional self-repairing disk arrays. Image courtesy of the researchers.

The team also demonstrated that the same approach could be extended to a more traditional RAID level 6 array; however, in this case only a system containing 10 data disks, 2 parity disks and 18 spare disks could achieve the reliability target. Otherwise, a non-standard RAID solution or a standard RAID with at least triple parity would be required to meet these objectives, the authors of the paper conclude.
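
The space overhead of the configurations mentioned above can be compared with a few lines of Python. The sketch below assumes that space overhead means the share of disks that do not hold user data (parity disks plus spares, divided by all disks). The paper may define it differently, and the function name is hypothetical.

```python
def space_overhead(data_disks: int, redundant_disks: int) -> float:
    """Fraction of all disks that do not store user data.

    Assumption (not from the paper): overhead is
    redundant / (data + redundant), where redundant disks are
    parity disks, spares or mirror copies.
    """
    total = data_disks + redundant_disks
    if data_disks < 0 or redundant_disks < 0 or total == 0:
        raise ValueError("disk counts must be non-negative and not both zero")
    return redundant_disks / total


if __name__ == "__main__":
    # Mirroring: every data disk has exactly one copy.
    print(f"Mirroring:            {space_overhead(1, 1):.1%}")
    # RAID level 6 example from the article: 10 data, 2 parity, 18 spares.
    print(f"RAID 6 with spares:   {space_overhead(10, 2 + 18):.1%}")
```

With this definition, mirroring has an overhead of 50 percent, while the RAID level 6 configuration with 18 spares comes to about 67 percent. A two-dimensional array whose spare and parity disks roughly equal its data disks stays close to the 50 percent of mirroring.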

Written by Alius Noreika
