What is RAID Data Recovery?
RAID is the abbreviation of "Redundant Array of Independent Disks". The technology was proposed in 1987 at the University of California, Berkeley. Put simply, a RAID controller (implemented in hardware or software) combines N hard disks into a single virtual large-capacity hard disk. Adopting RAID brings significant benefits to a storage system (or to a server's built-in storage), the biggest of which are a higher transfer rate and fault tolerance.
- Chinese name: Redundant Disk Array Data Recovery
- Foreign name: Redundant Array of Independent Disks
- Introduced: 1987
- Proposed by: University of California, Berkeley
RAID data recovery technology features
- RAID data recovery is one of the technical specialties of Vishay. By introducing the latest specialized RAID tools from abroad, it can now analyze and reassemble arrays from the underlying principles with a very high success rate. For difficult industry problems such as dual-cycle, RAID5 ADG, and RAID6, Vital has made major technical breakthroughs, and has successfully recovered data from RAID disk arrays under the following operating systems: Windows NT 4.0, Windows 2000, Windows 2003, various versions of Linux, and the UNIX versions of various vendors.
- Servers are often where data is stored and managed centrally. They offer advantages in storage capacity, storage security, and storage speed, and for that reason organizations often use servers to store extremely important data. Once a server's data is lost, the losses to its users are correspondingly serious.
- Beijing Zhongguancun Data Recovery Center specializes in product development and technical services in the field of data recovery.
- Beijing Zhongguancun Data Recovery Center independently developed a RAID server analysis program, RAID3000. It has also carried out considerable research on various RAID technologies, disk-opening techniques, Mac, Linux, Unix, Solaris, SCO Unix, HP Unix, SQL databases, and Oracle databases.
Basic knowledge of RAID data recovery
- Put simply, a RAID array (Redundant Array of Independent Disks) combines N hard disks into a single virtual large-capacity hard disk through a RAID controller (hardware or software). Its characteristics are that reading from N hard disks simultaneously increases speed, and that it provides fault tolerance. RAID is therefore storage intended mainly for accessing data, not a backup solution.
- In 1988, Professor D. A. Patterson of the University of California, Berkeley, and colleagues first proposed the concept of RAID in the paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)" [1]. Because large-capacity disks were relatively expensive at the time, the basic idea of RAID was to combine multiple small, relatively inexpensive disks so as to obtain the capacity, performance, and reliability of an expensive large-capacity disk at a lower cost. As disk costs and prices kept falling, RAID could be built from almost any disk, and "inexpensive" lost its meaning. The RAID Advisory Board (RAB) therefore decided to replace "inexpensive" with "independent", and RAID became the Redundant Array of Independent Disks. This was only a change of name; the substance did not change.
- The design idea of RAID was quickly accepted by the industry, and as a high-performance, highly reliable storage technology RAID has been widely used. RAID mainly uses three technologies, data striping, mirroring, and data parity, to obtain high performance, reliability, fault tolerance, and scalability. RAID is divided into levels according to which of the three technologies are used or combined, and with what strategy and architecture, to meet the needs of different data applications. The paper by D. A. Patterson et al. defined the original levels RAID1 to RAID5, and RAID0 and RAID6 were added after 1988. In recent years, storage vendors have introduced further levels such as RAID7, RAID10/01, RAID50, RAID53, and RAID100, but these have no uniform standard. The levels currently recognized by the industry are RAID0 to RAID5, and of these the levels other than RAID2 have been designated as industry standards. The RAID levels most commonly used in practice are RAID0, RAID1, RAID3, RAID5, RAID6, and RAID10.
- In terms of implementation, RAID falls into three types: soft RAID, hard RAID, and mixed hard and soft RAID. In soft RAID, all functions are performed by the operating system and the CPU; there is no dedicated RAID control/processing chip or I/O processing chip, so efficiency is the lowest. Hard RAID has a dedicated RAID control/processing chip, an I/O processing chip, and an array buffer; it does not consume CPU resources, but its cost is high. Mixed hard and soft RAID has a RAID control/processing chip but no I/O processing chip, so I/O processing is left to the CPU and a driver; its performance and cost fall between those of soft and hard RAID. Each RAID level represents a particular implementation method and technique, and the level numbers do not imply a ranking. In practice, the RAID level and the implementation method should be chosen according to the characteristics of the user's data application, weighing availability, performance, and cost.
RAID data recovery common failures
- (1) The system cannot start
- (2) RAID information is corrupted
- (3) Because a hard disk is offline, the reconstruction fails after replacement, and the system crashes
- (4) RAID information is lost
- (5) Hard disk (single or multiple) is offline
- (6) The RAID card is damaged and the system crashes after replacement
- (7) Partition information is lost
- (8) Hard disk bad sectors (physical, logical)
- (9) RAID array information has been reconfigured
- (10) Disk sequence error
- (11) The dynamic disk database is missing or damaged
- (12) A Linux or UNIX system fails to start, or a partition cannot be mounted or found
- (13) Rebuild failed midway
- (14) After successful rebuild, the partition cannot be found or the system cannot start
- (15) The red light keeps flashing, or the yellow light keeps flashing (some yellow lights flashing means reading)
- (16) MBR is damaged, DBR is damaged
- (17) Bad sectors on a single disk or on multiple disks
RAID data recovery technical specifications
Introduction to RAID data recovery specifications
- Redundant disk array technology was originally developed to replace large, expensive disks with a combination of small, inexpensive ones, in order to reduce the cost of storing large volumes of data. It was also hoped that redundant information would keep data accessible when a disk fails, so a degree of data protection was developed, along with a moderate increase in data transfer speed.
- In the past, RAID was found only in high-end servers, as a companion technology to high-end SCSI hard disks. More recently, as technology has developed and product costs have kept falling, the performance of IDE hard disks has improved greatly and RAID chips have become common, so RAID is gradually being used in personal computers.
- So why is it called a redundant disk array? "Redundant" means surplus or duplicated, and "disk array" indicates not a single disk but a group of disks. In other words, RAID uses duplicated disks to handle data, which improves the stability of the data.
How RAID data recovery works
- How does RAID achieve highly stable data storage? Let's look at how it works. RAID is divided into levels according to the implementation principle, and each level works differently. A RAID structure as a whole is an arrangement of several disks; combining disks increases efficiency and reduces errors. Don't be put off by the terminology: the principles are actually very simple. For ease of explanation, each square in a diagram of a RAID level represents a disk; vertically the squares are called a block or disk array, and horizontally they are called a stripe.
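- The stripe terminology can be illustrated with a short sketch. It assumes the simplest case, round-robin striping without parity (RAID0-style) with one block per disk per stripe; the function names `locate_block` and `layout` are hypothetical and not taken from any RAID product.

```python
from __future__ import annotations


def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, stripe index).

    Assumption: plain round-robin striping, one block per disk per stripe,
    no parity. Real controllers use configurable stripe unit sizes.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    stripe, disk = divmod(logical_block, num_disks)
    return disk, stripe


def layout(num_blocks: int, num_disks: int) -> list[list[int]]:
    """Return one row per stripe; each column position is a disk."""
    rows: list[list[int]] = []
    for block in range(num_blocks):
        disk, stripe = locate_block(block, num_disks)
        if stripe == len(rows):
            rows.append([])  # first block of a new stripe
        rows[stripe].append(block)
    return rows
```

- Reading `layout(6, 3)` row by row gives the stripes, and reading it column by column gives the blocks held by each disk.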
RAID data recovery JBOD mode
- JBOD is often called Span. It logically connects several physical disks end to end to form one large logical disk. JBOD provides no fault tolerance, and the capacity of the array equals the sum of the capacities of all the disks in the span. Strictly speaking, JBOD is not a form of RAID, but many IDE RAID control chips now offer this mode. JBOD simply stacks hard disk capacity, and the system does not write data in parallel: it writes to one hard disk first, then the second hard disk ...
- The most common levels in practical applications are RAID0, RAID1, RAID5, and RAID10. In most cases RAID5 covers the advantages of RAID2-4, so RAID2-4 have largely withdrawn from the market.
- It is now generally considered that RAID2-4 are used only in RAID development and research.
- The above describes the principles of RAID; what matters most to us is how RAID is applied. We use IDE hard disks every day, and IDE RAID cards and motherboards with integrated RAID chips are easy to buy, so IDE RAID is the form closest to us. Because it targets lower-end applications, most IDE RAID supports only RAID 0, RAID 1, RAID 0+1, and JBOD modes.
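- The JBOD (span) behaviour described in this section, where disks are joined end to end and the capacity is the sum of the members, can be sketched as an address mapping. This is a minimal illustration; the helper name `span_locate` and the use of abstract size units are assumptions made for the example.

```python
from bisect import bisect_right
from itertools import accumulate


def span_locate(offset, disk_sizes):
    """Map a logical offset on a span to (disk index, offset on that disk).

    disk_sizes lists the member disk capacities in concatenation order.
    """
    ends = list(accumulate(disk_sizes))  # cumulative end offset of each disk
    if not ends or not 0 <= offset < ends[-1]:
        raise ValueError("offset is outside the span")
    disk = bisect_right(ends, offset)        # first disk whose end is past offset
    start = ends[disk] - disk_sizes[disk]    # logical offset where that disk begins
    return disk, offset - start


def span_capacity(disk_sizes):
    """Total capacity of the span: the sum of all member capacities."""
    return sum(disk_sizes)
```

- With members of 100, 200, and 50 units, offsets 0 to 99 land on the first disk and offset 100 is the first unit of the second disk, which matches the one-disk-after-another write order; losing any member loses that part of the address space, since there is no redundancy.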
Three key concepts and technologies in RAID data recovery
- The three key technologies are mirroring, data striping, and data parity [3] [4] [5].
- Mirroring copies data to multiple disks. It improves reliability, and because data can be read concurrently from two or more copies it also improves read performance. Write performance is somewhat lower, since it takes more time to ensure the data is correctly written to every disk.
- Data striping stores fragments of the data on several different disks, and the fragments together make up one complete copy. Unlike the multiple copies of a mirror, striping is usually used for performance: its higher concurrency lets data on different disks be read and written at the same time, giving a very significant I/O performance improvement.
- Data parity (data verification) uses redundant data to detect and repair errors. The redundant data is usually computed with algorithms such as Hamming codes or XOR operations. Parity greatly improves the reliability, robustness, and fault tolerance of the disk array, but it requires reading data from multiple places and performing calculations and comparisons, which affects system performance.
- Different RAID levels use one or more of the three technologies to obtain different data reliability, availability, and I/O performance. Which RAID to design (even a new level or type) or which to adopt should be decided from a thorough understanding of the system requirements, weighing reliability, performance, and cost to reach a compromise.
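- The XOR form of data parity mentioned above can be shown in a few lines. This is a sketch of single-parity protection only (Hamming codes are not covered), and the function names `xor_blocks` and `rebuild_missing` are hypothetical. It relies on the fact that XOR-ing the parity with all surviving blocks yields the one missing block.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (the parity block)."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column, one byte position at a time
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity."""
    return xor_blocks([*surviving_blocks, parity])
```

- For example, with data blocks `b"RAID"`, `b"data"`, and `b"test"`, computing `parity = xor_blocks(...)` and then calling `rebuild_missing` with any two of the blocks plus the parity returns the third. If two blocks are lost, a single parity block is not enough to recover them.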
- The idea of RAID has been widely accepted by the industry since its introduction. The storage industry has invested a lot of time and financial resources to research and develop related products. Moreover, with the continuous development of processors, memory, computer interfaces and other technologies, RAID continues to develop and innovate, and has been widely used in the field of computer storage, gradually extending from high-end systems to ordinary low-end systems. RAID technology is so popular because it has significant features and advantages and can basically meet most data storage needs. In general, the main advantages of RAID are as follows:
- (1) Large capacity
- This is an obvious advantage of RAID: a RAID system composed of multiple disks offers far more storage space than a single disk. A single disk can now exceed 1TB, so a RAID can reach PB-level capacity and meet most storage requirements. The usable capacity of a RAID is generally less than the total capacity of its member disks, because each RAID level requires some redundancy overhead, and the size of that overhead depends on the algorithm used. If the RAID algorithm and the member capacities are known, the usable capacity can be calculated. RAID capacity utilization is generally between 50% and 90%.
- (2) High performance
- RAID's high performance comes from data striping. The I/O performance of a single disk is limited by factors such as its interface and bandwidth, so it easily becomes the bottleneck of system performance. Through data striping, RAID distributes data I/O across the member disks, achieving aggregate I/O performance several times that of a single disk.
- (3) Reliability
- Availability and reliability are another important feature of RAID. In theory, a RAID system made of multiple disks should be less reliable than a single disk, but that rests on an implicit assumption: that a single disk failure renders the whole RAID unusable. RAID breaks this assumption with data redundancy technologies such as mirroring and data parity. Mirroring is the most basic redundancy technology: it copies all the data on one set of disk drives to another set, so that a copy of the data is always available. Data parity has a much smaller overhead than the 50% of mirroring; it uses redundant check information to verify and correct the data. RAID redundancy greatly improves data availability and reliability, so that when some disks fail, within the number the level can tolerate, data is not lost and the system keeps running.
- (4) Manageability
- In fact, RAID is a virtualization technology that virtualizes multiple physical disk drives into a single large-capacity logical drive. For external host systems, RAID is a single, fast, and reliable large-capacity disk drive. In this way, users can organize and store application system data on this virtual drive. From a user application perspective, the storage system is simple to use and easy to manage. Because RAID has completed a large number of storage management tasks, the administrator only needs to manage a single virtual drive, which can save a lot of management work. RAID can dynamically increase or decrease disk drives, and can automatically perform data verification and data reconstruction, which can greatly simplify management.
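- The capacity calculation mentioned under "Large capacity" can be sketched for a few common levels. The sketch assumes equal-sized member disks, an n-way mirror for RAID1, and the usual minimum disk counts; the names `MIN_DISKS`, `usable_capacity`, and `utilization` are hypothetical.

```python
# Assumed minimum member counts for each level covered by this sketch.
MIN_DISKS = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4, "RAID10": 4}


def usable_capacity(level, num_disks, disk_size):
    """Usable capacity of an array of equal-sized disks, in disk_size units."""
    if level not in MIN_DISKS:
        raise ValueError(f"level not covered by this sketch: {level}")
    if num_disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    if level == "RAID0":
        data_disks = num_disks        # striping only, no redundancy
    elif level == "RAID1":
        data_disks = 1                # every disk holds the same copy
    elif level == "RAID5":
        data_disks = num_disks - 1    # one disk's worth of parity
    elif level == "RAID6":
        data_disks = num_disks - 2    # two disks' worth of parity
    else:  # RAID10: striped mirror pairs
        if num_disks % 2:
            raise ValueError("RAID10 needs an even number of disks")
        data_disks = num_disks // 2
    return data_disks * disk_size


def utilization(level, num_disks):
    """Fraction of the raw capacity that is usable."""
    return usable_capacity(level, num_disks, 1) / num_disks
```

- Under these assumptions a two-disk mirror gives 50% utilization and a ten-disk RAID5 gives 90%, which corresponds to the 50% to 90% range quoted above.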
RAID data recovery standard RAID levels
- SNIA, Berkeley, and other organizations have designated seven levels, RAID0, RAID1, RAID2, RAID3, RAID4, RAID5, and RAID6, as the standard RAID levels, and this is recognized by both industry and academia. The standard levels are the most basic RAID configurations, using data striping, mirroring, and data parity individually or in combination. Standard levels can be combined into RAID combination levels to meet storage requirements that demand higher performance, security, and reliability. [1]
- Soft RAID:
- Soft RAID has no dedicated control chip or I/O chip; the RAID function is implemented entirely by the operating system and the CPU. Almost all modern operating systems provide soft RAID support by adding a software layer above the disk device driver, which acts as an abstraction layer between the physical drives and the logical drive. The RAID levels most commonly supported by operating systems are RAID0, RAID1, RAID10, RAID01, and RAID5. For example, Windows Server supports RAID0, RAID1, and RAID5, and Linux supports RAID0, RAID1, RAID4, RAID5, RAID6, and others. Mac OS X Server, FreeBSD, NetBSD, OpenBSD, Solaris, and other operating systems also support corresponding RAID levels.
- Configuration management and data recovery are relatively simple with soft RAID, but all RAID processing, such as calculating check values, is done by the CPU, so execution efficiency is relatively low. Because this approach consumes a large amount of computing resources, it is difficult to apply widely. Soft RAID is implemented by the operating system, so the partition holding the system cannot be a logical member disk of the array, and soft RAID cannot protect the system disk. Some operating systems keep the RAID configuration in the system information rather than as a file on disk, so when the system crashes unexpectedly and must be reinstalled, the RAID information is lost. In addition, a disk's fault tolerance does not mean that online replacement, hot plugging, or hot swapping is fully supported; whether a failed disk can be hot swapped depends on the operating system's implementation.
- Hard RAID:
- Hard RAID has its own RAID control/processing chip and I/O processing chip, and often an array buffer as well. Its CPU usage and overall performance are the best of the three implementation types, but its implementation cost is also the highest. Hard RAID usually supports hot swapping, so failed disks can be replaced while the system is running.
- Hard RAID comes as a RAID card or as a RAID chip integrated on the motherboard; most server platforms use RAID cards. A RAID card consists of a RAID core processing chip (the CPU on the RAID card), ports, a cache, and a battery. The ports determine the disk interface types the card supports, such as IDE/ATA, SCSI, SATA, SAS, and FC.
- Mixed hard and soft RAID:
- Soft RAID has poor performance and cannot protect system partitions, so it is difficult to apply to desktop systems. Hard RAID is very expensive, and different hard RAID implementations are independent of each other and not interoperable. A combination of software and hardware is therefore used to implement RAID as a compromise between performance and cost, that is, good price-performance.
- Although this type of RAID uses a control/processing chip, the chip is usually relatively cheap and has weak processing power in order to save costs. Most RAID task processing is done by the CPU through a firmware driver.
RAID application selection
- There are three main factors in choosing a RAID level: data availability, I/O performance, and cost.
- The mainstream RAID levels in practical use today are RAID0, RAID1, RAID3, RAID5, RAID6, and RAID10. The technical comparison between them is shown in Table 1. If availability is not required, choose RAID0 for the highest performance. If availability and performance matter and cost is not a major factor, choose RAID1, depending on the number of disks. If availability, cost, and performance are equally important, choose RAID3 or RAID5, depending on the typical data transfers and the number of disks. In practice, the RAID level should be chosen according to the user's data application characteristics and specific circumstances, weighing availability, performance, and cost.
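- The rule of thumb above can be written down as a small decision function. This is only a restatement of the three cases in the text, not a complete selection method; the function name `suggest_levels` and its two boolean parameters are hypothetical simplifications, and the number of disks and transfer pattern are deliberately left out.

```python
def suggest_levels(needs_availability, cost_matters):
    """Return the candidate RAID levels named in the rule of thumb.

    needs_availability: whether data must survive a disk failure.
    cost_matters: whether cost weighs as much as availability and performance.
    """
    if not needs_availability:
        return ["RAID0"]           # performance only, no redundancy
    if not cost_matters:
        return ["RAID1"]           # mirroring: availability and performance
    return ["RAID3", "RAID5"]      # parity: balance of all three factors
```

- The final choice between the returned candidates still depends on the workload and the number of disks, as the text notes.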