Huawei storage systems became widespread relatively recently, yet the Chinese manufacturer already competes successfully with brands such as Dell/EMC, NetApp and HPE. Following the principle of “more for less money”, Huawei equips even its entry-level storage with technologies that competitors offer only in their top models.
The OceanStor series is designed to store large amounts of enterprise data, and even the entry-level model, the OceanStor 2200 V3, offers scalability up to 300 drives with a total capacity of 2.4 PB, file and block access, FC, FCoE and Ethernet interfaces with iSCSI support, 12 Gbps SAS links between the head unit and the disk shelves, a dual-controller design with no single point of failure, and block virtualization of hard drives, a technology that Huawei proudly calls RAID 2.0+.
RAID 2.0+ - block HDD virtualization
A traditional RAID array can be compared to a wall in which each brick is a hard drive or an SSD. The bricks should be of the same type and size; pull out one or two and the wall will shake, pull out three and even RAID 6 collapses, taking all the stored data with it. In a traditional RAID array it makes no sense to mix 7200 and 10,000 RPM disks, recovery after replacing a “brick” can take up to several days, and all disks inside the array are equal. If you sort your data into “cold”, “warm” and “hot”, you have to dedicate SSDs to the “hot” data and build a separate RAID array on 10/15K SAS HDDs for the “warm” data. On top of that, you still need to reserve 2 or 3 disks of different types as hot spares. These are the realities of traditional RAID, and they have to be tolerated even in very expensive storage.
It is quite another matter when you take completely different hard drives and/or SSDs and divide their space into equal blocks of, say, 128 kilobytes, and then assemble a RAID array from these blocks using any of the traditional schemes, from a simple RAID 1 mirror to RAID 60. In this case the controller works not with physical media but with the space inside them, which opens up far more possibilities. The simplest one is to create several RAID arrays of different types at once on a single pool of hard disks. That alone would be useful for carving out a small fast disk group, but the technology goes further: the Huawei OceanStor 2200 V3 controller can itself divide data into “hot” and “cold” and move the hot data within one RAID array to faster media, 15K HDDs or SSDs. In the developer’s terminology, RAID 2.0+ divides disk group space into chunks, chunk groups, and extents. These terms do not appear in the storage settings, so if you are interested in how it works, we offer you slides from the manufacturer’s presentation.
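To make the idea of chunks and chunk groups more tangible, here is a minimal Python sketch. It is our own illustration, not Huawei's algorithm: the names `Chunk`, `split_into_chunks` and `build_chunk_groups`, the toy disk capacities and the "take from the emptiest disks first" rule are all assumptions, and extents are not modelled at all. The 128 KB chunk size is simply the example figure used above.

```python
from dataclasses import dataclass

CHUNK_KB = 128  # example block size from the text, not a confirmed product value


@dataclass(frozen=True)
class Chunk:
    disk: str   # name of the physical disk the chunk lives on
    index: int  # position of the chunk on that disk


def split_into_chunks(disks_kb: dict[str, int], chunk_kb: int = CHUNK_KB) -> dict[str, list[Chunk]]:
    """Cut every disk (name -> capacity in KB) into equal-sized chunks."""
    return {
        name: [Chunk(name, i) for i in range(capacity // chunk_kb)]
        for name, capacity in disks_kb.items()
    }


def build_chunk_groups(free: dict[str, list[Chunk]], width: int) -> list[list[Chunk]]:
    """Form groups of `width` chunks, each chunk taken from a different disk.

    Hypothetical placement rule: always draw from the disks that still have
    the most free chunks, so that large and small disks drain evenly.
    """
    groups = []
    while True:
        candidates = sorted(
            (d for d in free if free[d]), key=lambda d: len(free[d]), reverse=True
        )
        if len(candidates) < width:
            return groups  # fewer than `width` disks still have free chunks
        groups.append([free[d].pop() for d in candidates[:width]])


# Toy pool: disks of different types and sizes share one chunk space.
disks_kb = {"ssd0": 512, "sas0": 1024, "sas1": 1024, "nl0": 2048}
free_chunks = split_into_chunks(disks_kb)
groups = build_chunk_groups(free_chunks, width=3)  # e.g. 2 data chunks + 1 parity chunk
for group in groups[:3]:
    print([f"{c.disk}#{c.index}" for c in group])
```

Each group behaves like a tiny RAID stripe spread over different disks, which is why disks of different sizes can live in one pool: a larger disk simply contributes more chunks.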
Huawei takes pride in the reliability of arrays built this way. Rebuilding a 12-disk array takes only 30-40 minutes, and you can do without a dedicated Hot Spare disk, because the hot spare is not a physical disk but free space in the pool. What is more, an ordinary RAID 5 will withstand the failure of two, three, four, or even nine hard disks, as long as they fail one after another and there is free space left. Each time an HDD fails, the OceanStor 2200 V3 controller automatically redistributes the data to unoccupied areas of the remaining drives, and after 30-40 minutes the array is healthy again and ready to survive the failure of the next hard drive. Let’s test this survivability!
The Huawei OceanStor 2200 V3 arrived in our lab in a configuration with twelve 2 TB NL-SAS hard drives. For testing, all disks were combined into a common pool, in which a RAID 5 array was built with one virtual Hot Spare disk. The usable capacity of the array was 20 TB, in which we created 10 LUNs with a total capacity of 9.6 TB. We then disabled the hard drives one by one, measuring the recovery time of the array after each failure.
The system sustained the failure of 6 disks in a 12-disk RAID 5. After the failure of the fifth HDD the array still returned to the online state; on the sixth disconnected drive the free space ran out and the rebuild completed only partially, leaving the array in a degraded but working condition.
What to do in this case? The answer is obvious: by deleting one of the LUNs we freed up additional space, and the rebuild resumed. At this point we decided to stop testing, because the failure of even 50% of the hard disks in a storage system practically never happens in real life. Still, the Huawei OceanStor 2200 V3 can survive such a scenario, provided the disks do not fail at the same time.
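The behaviour we observed (rebuild into free pool space, a partial rebuild once the space runs out, and a resumed rebuild after a LUN is deleted) can be reproduced with a toy model. The Python sketch below is only an illustration under our own assumptions: the `Pool` class, its method names, the 10 chunks per disk, the stripe width of 5 and the placement rule are hypothetical, and the numbers are not meant to match the lab configuration or Huawei's real algorithm.

```python
class Pool:
    """Toy chunk-virtualized pool with RAID 5-like stripes (one lost chunk is tolerable)."""

    def __init__(self, n_disks: int, chunks_per_disk: int, width: int):
        self.free = {d: chunks_per_disk for d in range(n_disks)}  # free chunks per live disk
        self.width = width   # chunks per stripe: width - 1 data + 1 parity
        self.stripes = []    # each stripe = set of disks holding one of its chunks

    def allocate(self, n_stripes: int) -> None:
        """Place new stripes on the disks that currently have the most free chunks."""
        for _ in range(n_stripes):
            disks = sorted(self.free, key=self.free.get, reverse=True)[: self.width]
            if len(disks) < self.width or self.free[disks[-1]] == 0:
                raise RuntimeError("not enough free space for a new stripe")
            for d in disks:
                self.free[d] -= 1
            self.stripes.append(set(disks))

    def rebuild(self) -> tuple[int, int]:
        """Re-create missing chunks in free space; return (degraded, lost) stripe counts."""
        degraded = lost = 0
        for stripe in self.stripes:
            missing = self.width - len(stripe)
            if missing == 0:
                continue
            if missing > 1:  # two chunks gone: single parity cannot recover this stripe
                lost += 1
                continue
            spare = [d for d in self.free if d not in stripe and self.free[d] > 0]
            if spare:
                target = max(spare, key=self.free.get)
                self.free[target] -= 1
                stripe.add(target)
            else:
                degraded += 1  # no free space: stripe keeps working without redundancy
        return degraded, lost

    def fail_disk(self, disk: int) -> tuple[int, int]:
        """Drop a disk with everything on it, then rebuild into the remaining free space."""
        del self.free[disk]
        for stripe in self.stripes:
            stripe.discard(disk)
        return self.rebuild()

    def delete_stripes(self, count: int) -> None:
        """Free the chunks of the last `count` stripes (think: deleting a LUN)."""
        for _ in range(count):
            for d in self.stripes.pop():
                self.free[d] += 1


pool = Pool(n_disks=12, chunks_per_disk=10, width=5)
pool.allocate(12)  # 60 of 120 chunks used: the pool is half full
for disk in range(12):
    degraded, lost = pool.fail_disk(disk)
    print(f"disk {disk} failed: {degraded} degraded, {lost} lost stripes")
    if degraded:
        break  # free space ran out, as on the sixth disk in our test
pool.delete_stripes(2)  # free some space, like deleting a LUN
print("after freeing space (degraded, lost):", pool.rebuild())
```

The model also shows why the failures must come one at a time: a stripe that loses a second chunk before the first one has been rebuilt is counted as lost, exactly as in an ordinary RAID 5.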