OceanStor OS Enables Fast and Stable All-flash Storage

By Wang Jiaxin from Huawei

For most people, the first word associated with an all-flash system is "fast". Storage vendors spare no effort to improve the headline metrics of their flash systems, which has led to figures such as million-level IOPS and latencies of 0.3 ms or even 0.1 ms. At face value, it would appear that SSDs alone can tremendously improve storage performance.

However, this is not always the case. Not every product delivers all the benefits it claims to provide. For example, nearly every car shows 200 km/h on its speedometer, but few can be driven comfortably at that speed. Without a solid chassis, precise steering, and good suspension, a car is at risk when it runs at high speed. All-flash storage needs the equivalent of these components, and the all-flash operating system provides them: it handles the all-flash "super sports car" (the storage system) and keeps it running steadily under extreme road conditions.

IT equipment contains a large number of integrated circuits and PCBs. Its design and production require sophisticated technologies and strict process control. Cutting-edge technology, sufficient capital, and decades of experience are all crucial to manufacturing reliable IT hardware. Storage is one of the most important parts of enterprise IT, and it relies on the integrated design of hardware and software. All hardware must work with software to reach its full capability, which further increases the difficulty of storage engineering.

Hardware is the carrier of storage, and software is its soul. Neither can survive independently. Manufacturing hardware takes great effort, and building top-quality software takes even more. Building the operating system, the core of the software, is the most demanding and time-consuming part. A typical storage operating system contains at least 17 million lines of code and requires the diligent work of 3000 first-class engineers and at least five years of development and testing before it is complete. Over 65% of all development tests target the fault-tolerance design, to ensure the high availability of the system. Only a handful of vendors in the industry have achieved such a feat, and most storage startups are not in a position to design a complete storage operating system. A storage leader’s strength therefore lies in having its own storage operating system.

With years of experience accumulated across industries, Huawei has developed its own state-of-the-art storage operating system, OceanStor OS. For over eight years, this OS has been running on more than 50,000 devices on enterprises’ live networks, ensuring the stability and reliability of their key data. Huawei has focused on SSD technology and holds a large number of SSD-level patents. OceanStor OS builds on this stable software platform and on these patents. Recently, Huawei launched a new OceanStor OS version dedicated to flash storage to fully utilize the capabilities of SSDs and flash systems.

A key aspect of Huawei’s OceanStor products is how they handle SSD garbage collection. Flash cells in an SSD can be rewritten only after being erased. Generally, SSDs write data by page (size: 16 KB) and erase data by block (size: 8 MB). Each block consists of multiple pages, which may be valid, invalid, or empty. When data at a logical address is overwritten, the new data goes to fresh pages and the pages at the original physical location become invalid. To avoid erasing valid pages, the valid pages in a block must first be migrated elsewhere. Once the block holds no valid pages, the entire block can be erased at one time. This process of migrating valid data is known as garbage collection. Garbage collection lets an SSD reuse its space, but each migration costs storage system performance. The more valid data that must be migrated, and the shorter the interval between a page being written and becoming invalid, the greater the impact on system performance.
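The cost of reclaiming a block can be put into numbers. The following is a minimal sketch in Python, assuming the page and block sizes quoted above and a simple model in which every valid page of the reclaimed block is copied once. The function name `reclaim_cost` and the write amplification formula are illustrative assumptions, not part of OceanStor OS.

```python
PAGE_SIZE_KB = 16                  # page size quoted in the text
BLOCK_SIZE_KB = 8 * 1024           # block size quoted in the text (8 MB)
PAGES_PER_BLOCK = BLOCK_SIZE_KB // PAGE_SIZE_KB   # 512 pages per block


def reclaim_cost(valid_pages):
    """Estimate the cost of garbage-collecting one block (hypothetical model).

    valid_pages: number of pages in the block that still hold live data.
    Every valid page must be copied elsewhere before the block is erased.
    """
    if not 0 <= valid_pages < PAGES_PER_BLOCK:
        raise ValueError("valid_pages must be in [0, PAGES_PER_BLOCK)")
    freed_pages = PAGES_PER_BLOCK - valid_pages
    return {
        "migrated_kb": valid_pages * PAGE_SIZE_KB,
        "freed_pages": freed_pages,
        # Physical page writes per host page write if every reclaimed
        # block looked like this one: (migrated + new) / new.
        "write_amplification": PAGES_PER_BLOCK / freed_pages,
    }


# A block that is half valid costs 4 MB of copying and doubles the writes.
half_full = reclaim_cost(256)
```

Under this model, a block with no valid pages is free to reclaim, while a block that is three-quarters valid quadruples the physical writes, which is why keeping valid data out of reclaimed blocks matters.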

Innovative disk-control collaboration lays a solid foundation for high performance

Effectively controlling garbage collection is the key to maximizing the performance of SSDs and flash storage systems. Powered by proprietary SSDs and the flash operating system, OceanStor OS, Huawei’s OceanStor Dorado adopts an innovative disk-controller collaboration technology. By optimizing internal software algorithms, Dorado enables storage controllers to detect the data layout in SSDs in real time and make adjustments accordingly. In this way, the data layouts in the storage controller and in the SSDs remain consistent with each other. This avoids data migration and garbage collection after data is written to SSDs, ensuring consistently high performance for flash storage systems.

Huawei-patented global wear/anti-wear leveling technology prolongs SSD service life

Unlike HDDs, SSDs can withstand only a limited number of write (program/erase) cycles. The service life of an SSD is roughly inversely proportional to the amount of data written to it. Therefore, an all-flash storage system must balance the write load across multiple SSDs to prevent overused disks from failing early.

OceanStor OS adopts a global wear leveling technology. Based on collaboration between the controller software and the SSDs, data is evenly distributed across multiple SSDs to share the load. In addition, OceanStor OS periodically queries each SSD controller for the disk wear degree and uses it as the basis for space allocation, thereby ensuring the reliability of the entire system.
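Wear-based space allocation can be sketched as follows. This is an illustrative Python model, not Huawei's algorithm: the `Ssd` class, the `allocate_wear_leveled` function, and the fixed `WEAR_PER_CHUNK` increment are all assumptions made for the example.

```python
from dataclasses import dataclass

WEAR_PER_CHUNK = 1.0   # hypothetical wear added by writing one chunk


@dataclass
class Ssd:
    name: str
    wear_pct: float    # wear degree reported by the drive, 0 to 100
    free_chunks: int   # allocatable space, in arbitrary chunks


def allocate_wear_leveled(ssds, chunks):
    """Place each chunk on the least-worn SSD that still has free space."""
    placement = {}
    for _ in range(chunks):
        candidates = [s for s in ssds if s.free_chunks > 0]
        if not candidates:
            raise RuntimeError("no free space left on any SSD")
        target = min(candidates, key=lambda s: s.wear_pct)
        target.free_chunks -= 1
        target.wear_pct += WEAR_PER_CHUNK
        placement[target.name] = placement.get(target.name, 0) + 1
    return placement


# Three drives that start out unevenly worn converge to the same wear.
drives = [Ssd("a", 10.0, 8), Ssd("b", 12.0, 8), Ssd("c", 11.0, 8)]
placement = allocate_wear_leveled(drives, 6)
```

In this example the least-worn drive receives the most new data, so the wear degrees of all three drives end up equal.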

Global wear leveling and global anti-wear leveling

However, when SSDs that have been worn evenly approach the end of their service life, for example when the disk wear degree reaches 80% or higher, multiple disks may fail at the same time, resulting in data loss. Huawei has developed a patented global anti-wear leveling technology to prevent SSDs from failing in batches.

OceanStor OS selects the most severely worn SSD and writes new data onto it as long as it has idle space. That SSD therefore wears out ahead of the others, and users are advised to replace it sooner, which avoids simultaneous failures and service interruptions. This technology is well suited to scenarios in which IT devices are due for replacement.
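The switch from wear leveling to anti-wear leveling can be sketched in a few lines of Python. The 80% threshold is the example figure from the text. The function name `pick_target` and the tuple layout are assumptions for illustration, and the real selection logic in OceanStor OS is not described in this article.

```python
ANTI_WEAR_THRESHOLD_PCT = 80.0   # example wear degree quoted in the text


def pick_target(drives):
    """Choose the SSD that receives the next write.

    drives: list of (name, wear_pct, free_chunks) tuples.
    Below the threshold, the least-worn drive is chosen (wear leveling).
    Once any drive reaches the threshold, the most-worn drive that still
    has free space is chosen (anti-wear leveling), so that drives reach
    end of life one at a time instead of all together.
    """
    usable = [d for d in drives if d[2] > 0]
    if not usable:
        raise RuntimeError("no free space left on any SSD")
    if max(d[1] for d in drives) >= ANTI_WEAR_THRESHOLD_PCT:
        return max(usable, key=lambda d: d[1])[0]
    return min(usable, key=lambda d: d[1])[0]


young = [("a", 30.0, 5), ("b", 20.0, 5)]
aged = [("a", 81.0, 5), ("b", 80.5, 5), ("c", 79.0, 5)]
```

With the `young` set the least-worn drive `b` is chosen, while with the `aged` set the most-worn drive `a` is chosen until it runs out of idle space.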

Purchasing a batch of new IT devices to replace an existing system is not an overnight task. It takes a long time to complete procurement approval, new device deployment, legacy service migration, and user acceptance. This is a delicate period for users’ core services and data, so the impact is severe if all SSDs break down during it. With global anti-wear leveling, worn SSDs can be replaced one at a time, prolonging the service life of legacy devices on the live network until the new devices are officially rolled out.

Global inline deduplication and compression improve efficiency and SSD service life

Based on its analysis of HDDs and SSDs, IDC predicts that the price of an SSD will drop from more than three times that of a 10K SAS disk in 2016 to twice that at the end of 2018. To accelerate the commercial use of SSDs, the industry uses inline deduplication and compression technologies to reduce the data volume before data is written to SSDs, minimizing the amount of data that actually reaches the SSDs without affecting user experience.

IDC predicts the price comparison between SSDs and HDDs

Huawei developed global inline deduplication and compression technologies for OceanStor OS. To obtain the best data reduction ratio, different types of services require different deduplication and compression granularities. A weak hash algorithm and byte-by-byte comparison are used for deduplication. Deduplication starts after data is divided into data blocks by service type. The system uses the weak hash algorithm to calculate the fingerprint of each data block and compares it with the existing fingerprints. If the fingerprint of a data block already exists in the system, the system does not write the data block but only increases the reference count of that fingerprint. If a fingerprint is unique, the system adds it to the fingerprint table and writes the data block to SSDs. Deduplication is performed in real time, not after data has been written to SSDs.

Working principle of deduplication

In addition, byte-by-byte comparison checks each data block whose fingerprint matches an existing one against the stored block, byte by byte. This prevents a fingerprint hash collision from being mistaken for a duplicate and ensures 100% data reliability.
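The combination of a weak fingerprint and byte-by-byte verification can be sketched as follows. This is a minimal Python illustration, not the OceanStor OS implementation: the article does not name the weak hash algorithm, so CRC-32 from the standard library is used as a stand-in, and the `DedupStore` class is hypothetical.

```python
import zlib


class DedupStore:
    """Toy inline deduplication store (illustrative only)."""

    def __init__(self, fingerprint=zlib.crc32):
        self.fingerprint = fingerprint   # weak hash; CRC-32 is an assumption
        self.table = {}                  # fingerprint -> list of [data, ref_count]
        self.blocks_written = 0          # blocks that actually reached "SSD"

    def write(self, block):
        """Return True if the block was written, False if deduplicated."""
        bucket = self.table.setdefault(self.fingerprint(block), [])
        for entry in bucket:
            # Matching fingerprint: confirm with a byte-by-byte comparison
            # so that a hash collision is never treated as a duplicate.
            if entry[0] == block:
                entry[1] += 1
                return False
        bucket.append([block, 1])
        self.blocks_written += 1
        return True


store = DedupStore()
first = store.write(b"A" * 8192)    # unique block, written
second = store.write(b"A" * 8192)   # duplicate, only the count increases
third = store.write(b"B" * 8192)    # different data, written
```

A weak hash is cheap to compute on the write path, and the comparison step removes the risk that comes with its higher collision rate. Passing a deliberately poor fingerprint function, such as `len`, shows that colliding blocks with different content are still stored separately.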

Compression is compute-intensive. Inline compression consumes significant CPU resources, which affects the end-to-end performance of the system. Industry peers often use open-source compression algorithms that offer high performance at a low compression ratio, such as LZ4, LZO, and Snappy. The compression in Huawei’s OceanStor OS is an optimized version of the open-source LZ4 algorithm. Compressed data is stored in units of 1 KB, which doubles the compression efficiency and saves storage space for compressed data.
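The effect of the storage unit on space savings is simple arithmetic: compressed output is rounded up to a whole number of units. The Python sketch below assumes this rounding model; the function name `stored_bytes` and the 4 KB unit used for comparison are illustrative assumptions and do not describe any particular product.

```python
def stored_bytes(compressed_bytes, unit_bytes=1024):
    """Space consumed when compressed data is rounded up to whole units."""
    if compressed_bytes < 0 or unit_bytes <= 0:
        raise ValueError("sizes must be non-negative and unit positive")
    units = -(-compressed_bytes // unit_bytes)   # ceiling division
    return units * unit_bytes


# An 8 KB block that compresses to 2.5 KB:
fine_grained = stored_bytes(2560, unit_bytes=1024)    # 1 KB units
coarse_grained = stored_bytes(2560, unit_bytes=4096)  # hypothetical 4 KB units
```

In this example the block occupies 3 KB with 1 KB units and 4 KB with 4 KB units, so a smaller unit keeps more of the savings that the compression algorithm achieved.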

Most all-flash storage vendors in the industry claim that their operating systems support inline deduplication and compression technologies. However, there are technological differences between OceanStor Dorado V3 all-flash storage and other products. In an actual project test that simulated the most common database scenarios (dual controllers, 100% random 8 KB I/O, and a 7:3 read/write mix), OceanStor Dorado with inline deduplication and compression enabled maintained a low latency of 0.5 ms together with high performance. In the same test environment, the performance of OceanStor Dorado was twice that of EMC VMAX 950F or HPE 3PAR StoreServ 20850.

Based on these efficient deduplication and compression algorithms, Huawei promises a 3:1 data reduction ratio to customers who purchase the OceanStor Dorado V3 all-flash storage series, helping users save investment and achieve a higher return on investment (ROI). What’s more, if the guaranteed ratio is not met, Huawei is liable for providing additional storage capacity or waiving the price of that capacity in future procurement. Data reduction improves storage system utilization, reduces the user’s cost per GB of effective capacity, reduces the physical space occupied, and lowers power consumption, air conditioning, and maintenance costs, helping to reduce end-to-end OPEX. Writing less data to SSDs also reduces SSD wear and prolongs the service life of the SSDs and the storage system.
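The arithmetic behind a data reduction guarantee can be sketched as follows. This Python example only illustrates the relationship between logical data, reduction ratio, and physical capacity. The function names and the way the shortfall is computed are assumptions, and the actual terms of Huawei's guarantee are not specified in this article.

```python
GUARANTEED_RATIO = 3.0   # the 3:1 ratio quoted in the text


def physical_tb_needed(logical_tb, reduction_ratio):
    """Physical capacity required to hold logical_tb of user data."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return logical_tb / reduction_ratio


def capacity_shortfall_tb(logical_tb, achieved_ratio, guaranteed_ratio=GUARANTEED_RATIO):
    """Extra physical capacity needed if the achieved ratio falls short."""
    planned = physical_tb_needed(logical_tb, guaranteed_ratio)
    actual = physical_tb_needed(logical_tb, achieved_ratio)
    return max(0.0, actual - planned)


# 300 TB of user data sized at 3:1 needs 100 TB of physical capacity.
# If only 2.5:1 is achieved, 120 TB is needed, a shortfall of 20 TB.
shortfall = capacity_shortfall_tb(300, achieved_ratio=2.5)
```

The same relationship gives the cost per effective GB: dividing the price per physical GB by the achieved reduction ratio.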

Comprehensive data protection and efficient software maximize the advantages of all-flash high performance

Drawing on years of accumulated expertise, Huawei’s new-generation OceanStor OS provides data protection features, including clone, remote replication, active-active, and 3DC, and inherits efficiency features, including thin provisioning, QoS, and heterogeneous virtualization. In addition, Huawei has fully optimized the software for SSDs and developed more competitive features, such as lossless snapshot, RAID-TP that tolerates three-disk failure, and non-disruptive data migration, staying ahead of competitors in the all-flash era.

In addition to supporting a gateway-free active-active mode and protecting critical services with a zero RPO and a close-to-zero RTO, OceanStor OS is ahead of its peers in terms of performance, reliability, and efficiency.

  • For performance, OceanStor OS builds on the high performance of all-flash storage and an optimized internal lock mechanism in the active-active software, enabling Dorado all-flash storage to reach 200,000 IOPS at 1 ms latency, the highest in the industry.
  • For reliability, OceanStor OS supports upgrading an active-active solution to a geo-redundant solution to ensure 99.9999% high availability for critical services.
  • For efficiency, OceanStor OS enables Huawei’s all-flash storage to support HyperMetro for both SAN and NAS. The integration of SAN and NAS changes the traditional active-active solution where extra gateways are added on arrays to provide active-active SAN and NAS services, decreasing the number of devices by more than two and reducing deployment complexity and costs. In addition, the advantages of SAN and NAS parallel architectures are fully utilized to improve service performance.

In addition, network integration removes the need for multiple coexisting networks, such as Fibre Channel and IP networks. Between active-active sites, either Fibre Channel or IP is used to deploy the data replication, configuration, and heartbeat networks in a unified manner, reducing deployment costs. In the traditional storage active-active solution, two arbitration mechanisms work separately, leading to inconsistent arbitration results for SAN and NAS services if a network fault occurs between sites. OceanStor OS adopts unified arbitration to ensure that in all instances SAN and NAS services are deployed at the same site and share the same number of resources.

Currently, one of the industry’s largest active-active clusters supports only eight nodes, and it cannot meet storage performance requirements in large-scale deployment scenarios. Huawei’s integrated SAN + NAS active-active solution inherits the scale-out architecture of common clusters and supports a maximum of 32 nodes in active-active mode, meeting customers’ fast-growing requirements for storage performance.

Yahoo, Japan’s largest Internet company, uses Huawei’s integrated SAN+NAS all-flash active-active solution to ensure real-time synchronization of inventory data in online stores, maintaining consistency between online and offline inventory. This solution completes fault switchovers between two sites 180 km apart within seconds, five times faster than a solution provided by NetApp.

Integrated SAN+NAS active-active solution

Building an all-flash operating system is not an easy task. The preceding description of OceanStor OS is just the tip of the iceberg. When enterprise users actually put the complete OceanStor OS to use, they can discover its unique advantages for themselves. Ultimately, Huawei is nothing without its customer base. Thankfully, after 20 years of successful deployment and intensive research, Huawei has become one of the world’s leading brands, with a customer base spanning multiple countries and industries, and a brand that you can trust.

Source: Huawei Enterprise Blog