Toshiba, a major manufacturer of flash drives and hard drives, predicts that in 2019 only 10% of corporate data will be stored on SSDs, even though storage vendors seem to have forgotten about hard drives and focused entirely on solid-state drives. Yet no matter how impressive the IOPS figures that marketing promises, and no matter what synthetic tests show, the real speed of a typical virtual machine depends on how your server is configured and how the application itself works. Most server software, from databases to front ends with a web interface, caches data in RAM and may therefore hardly depend on the disk subsystem at all, whatever the load.
The modern corporate NAS is not just a file dump but also a compute node that runs both container virtualization (Docker) and host virtualization (a hypervisor). Because the data in such a system is stored on the same host that processes it, any application running on the Synology VMM hypervisor can count on certain bonuses from RAM.
- First, the standard Linux cache keeps all of the virtual machine’s file read operations in DiskStation/RackStation RAM.
- Second, write operations are faster because there is no extra layer between the hypervisor and the operating system. For example, if you connect an ESXi 6.x host to the NAS over NFS, the VMware hypervisor forces every write operation to be synchronous, which reduces speed.
- Third, we can install a Redis caching server on the Synology, either in a virtual machine or in DSM itself via community packages or Docker. The result is a persistent RAM-caching server whose database lives on the RAID array. Isn’t that lovely?
The third point is perhaps the best place to start.
1. The most effective way is to use Redis
If you already have an infrastructure built on VMware vSphere servers and are buying a NAS only for backups or as the main storage, look at the memory: the Synology RackStation RS18017xs+ has 16 GB of RAM, expandable to 128 GB. The DSM (DiskStation Manager) operating system itself rarely needs more than 2 GB of RAM, so the unused memory can be given to Redis. Redis is a NoSQL server that keeps data in memory and periodically flushes its database to disk. On restart, Redis restores the database from disk into RAM, so even after a power loss you regain access to all the data as of the last sync. Redis can hold not only strings but also file contents. If your application constantly accesses a directory with tens of thousands of small files (in machine learning, for example), you probably know that any modern file system slows down in that case, while Redis does not.
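The idea of keeping many small files in Redis can be sketched roughly as follows. This is an illustration only, not the setup used in this article: the `file:` key prefix, the helper names, and the `DictStore` stand-in are assumptions made up for the example. The helpers need nothing more than an object with `set(key, value)` and `get(key)` methods, which a `redis.Redis` client from the redis-py package also provides.

```python
import os
import tempfile


class DictStore:
    """Hypothetical in-memory stand-in with the same set/get calls as a Redis client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def load_directory(store, root, prefix="file:"):
    """Store the contents of every file under root, keyed by its relative path."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            with open(full_path, "rb") as handle:
                store.set(prefix + rel_path, handle.read())
            count += 1
    return count


def read_file(store, rel_path, prefix="file:"):
    """Fetch one file's bytes from the store, or None if it was never loaded."""
    return store.get(prefix + rel_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Create a few small sample files to load.
        for i in range(3):
            with open(os.path.join(root, f"sample_{i}.txt"), "wb") as handle:
                handle.write(f"payload {i}".encode())
        store = DictStore()
        print(load_directory(store, root), "files loaded")
        print(read_file(store, "sample_1.txt"))
```

With redis-py installed, a client such as `redis.Redis(host="nas.example", port=6379)` (the host name is hypothetical) could take the place of `DictStore()` without changing the helpers.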
Redis can be installed as a Synology DSM package by adding the SynoCommunity repository, but the version offered there is an old one, 3.0.5-5, so it is better to use Docker or a virtual machine running on the NAS. We install the Synology Virtual Machine Manager package and deploy Redis 5 on Debian.
Let’s test the access speed with the built-in redis-benchmark tool: one and a half million transactions per second in pipeline mode and five hundred thousand with default settings. Redis itself is single-threaded, so you can clone the virtual machine to use more than one of the NAS’s processor cores.
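Why pipeline mode comes out so far ahead can be seen from a simple per-connection model: without pipelining, every command pays a full network round trip, while a pipeline of depth P pays that round trip once per P commands. The sketch below is only an illustration of that reasoning. The round-trip time and the per-command server time are assumed values, not measurements from this test.

```python
def commands_per_second(pipeline_depth, round_trip_s, server_time_s):
    """Throughput of one connection that sends pipeline_depth commands per round trip."""
    if pipeline_depth < 1:
        raise ValueError("pipeline_depth must be at least 1")
    batch_time = round_trip_s + pipeline_depth * server_time_s
    return pipeline_depth / batch_time


if __name__ == "__main__":
    # Assumed values for illustration only: 100 microseconds of network round trip
    # and 1 microsecond of server time per command.
    rtt, per_command = 100e-6, 1e-6
    for depth in (1, 16, 64):
        rate = commands_per_second(depth, rtt, per_command)
        print(f"depth {depth:>2}: about {rate:,.0f} commands/s per connection")
```

As the depth grows, the rate approaches the ceiling of one command per `server_time_s`, which is the point where a single-threaded server needs additional instances to go any faster.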
The Redis example clearly shows that today storage can be treated as just a file dump in only two cases: a home 2-bay NAS or, at the other extreme, the heavyweight infrastructure of banks or airlines. For everyone else, here is a centralized caching system: connect to it and save your hosts’ resources. You don’t even need a 10-gigabit network connection: in typical use cases, a 1-gigabit network is enough for the connected clients.
And don’t forget that Synology DSM can protect Redis data with snapshots at the Btrfs file system level. Of course, using Redis requires some reworking of the application, which is not always possible, so let’s see how Synology DSM’s built-in caching works.
2. Caching in RAM of the NAS
Even if a Windows or Linux virtual machine occupies hundreds of gigabytes on disk, it actively uses only a few gigabytes or tens of gigabytes of that space: logs, database files, frequently accessed files, in general everything that is not cached in memory by the guest operating system or the application itself. Frequently requested data blocks are kept in the Synology DSM RAM, as we have seen many times in synthetic direct file access tests. The RAM caching mechanism is easiest to observe on random read operations.
This chart shows the ideal case: access to a 16 GB test region. Almost all of it fits into the NAS RAM, which is what happens during the test. Note that the NAS takes quite a long time to warm up, about 10 minutes, and only then reaches maximum performance.
Once the cache is full, the read speed triples, yet it is still low compared with what RAM is capable of. Does it make sense to add SSDs for workloads whose small active area already fits in the storage’s memory?
3. SSD cache in Synology’s implementation
The SSD cache can operate in two modes: read-only and read/write. The first needs only one solid-state drive; the second needs at least a pair combined into a mirror. Caching SSDs can also be combined into more complex arrays, including RAID 5, as long as the write cache is fault-tolerant.
In the current version of DiskStation Manager, the contents of the SSD read cache are not preserved when the NAS is rebooted. After a reboot you therefore face a warm-up period, although DSM starts pushing data to the SSD within the first minutes after startup. The read/write cache has no such problem.
To repeat our test, we first use one SSD in read cache mode and then two SSDs in read/write cache mode, combined into a RAID 1 mirror.
We see that the SSDs are, to put it mildly, faster, and the mirror array further increases performance by reading from two drives at once. Besides being faster, the SSD cache also fills faster.
It turns out that the NAS needs no persuading to store data in the cache: the SSDs reach maximum speed in 3-4 minutes, and RAM in 10-15 minutes. In addition, the SSD cache actively evicts data and rebuilds itself between loads, although the charts do not show this. But, as the saying goes, one does not live by reads alone, and it is interesting to see how the caches behave under VDI and SQL workload patterns. We will use two test area sizes: 16 GB, comparable to the RAM, and 96 GB, six times the memory in the NAS.
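A rough way to reason about the two test area sizes: if reads are spread uniformly at random over the test area, the best possible cache hit ratio is the cache size divided by the area size, capped at 1. The sketch below assumes uniform access and assumes that about 14 GB of the 16 GB of NAS RAM is available for caching (16 GB minus the roughly 2 GB that DSM needs). Real workloads have hot spots, so treat the result as an estimate only.

```python
def max_hit_ratio(cache_gb, working_set_gb):
    """Upper bound on the hit ratio for uniformly random reads over the working set."""
    if cache_gb < 0 or working_set_gb <= 0:
        raise ValueError("cache_gb must be >= 0 and working_set_gb must be > 0")
    return min(1.0, cache_gb / working_set_gb)


if __name__ == "__main__":
    ram_cache_gb = 14  # assumption: 16 GB of NAS RAM minus about 2 GB for DSM
    for area_gb in (16, 96):
        ratio = max_hit_ratio(ram_cache_gb, area_gb)
        print(f"{area_gb} GB test area: at most {ratio:.0%} of reads served from RAM")
```

Under these assumptions the small area is served almost entirely from RAM, while for the large area most reads must come from the SSD cache or the hard drives.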
Once writes are added, you need to choose the SSD itself more carefully, given that the drives will probably be constantly full of data and their speed will fall short of the maximum. Let’s increase the test area sixfold:
By the way, Synology DSM constantly monitors the health of the SSDs and warns you when a drive should be replaced. For Seagate hard drives there is advanced diagnostics through the IronWolf Health Management system (read more in our review), but it is a rather new technology, and time will tell how useful it is. Let’s change the pattern to SQL and look at how the array behaves.
Interestingly, under the SQL load the amplitude of the performance fluctuations decreases as the cache fills. Let’s compare the averages across the different patterns.
Attentive readers will have noticed that we provide no chart for the 16 GB test area under the SQL pattern. Of course, we could shrug and say, “It’s all clear: install at least two SSDs,” but we will not rush to conclusions. Instead, we will run an OLTP test against a real application in a virtual machine.
The Sysbench OLTP test
We take a MariaDB database in a virtual machine with a modest amount of memory, say 8 GB. We create a table of 50 million records so that its size, 11.2 GB, exceeds the RAM available to the guest but is smaller than the NAS RAM (16 GB), and we force the machine to use the disk actively under a transactional load with random read requests. We run this test three times: first with the virtual machine on a host running VMware ESXi 6.7 and connected via iSCSI, then the same over NFS, and finally with the virtual machine moved into Synology Virtual Machine Manager, using the Synology Active Backup for Business package for the migration.
In this case, SSD caching gives no particular advantage, because the part of the virtual disk that holds the database file easily fits into the NAS RAM. Let’s create a situation in which the database is much larger than the free memory of the RackStation RS18017xs+. Increasing the number of rows in the test table to billions is not an option: the database itself starts to slow down, which makes the results unrepresentative. It is much easier to take the spare memory away from Synology DSM: we start a virtual machine with 12 GB of RAM in the Synology VMM hypervisor, which leaves only about 2.5 GB for the cache.
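The memory arithmetic behind this setup can be written out explicitly. The sketch assumes about 1.5 GB of DSM and hypervisor overhead, a figure inferred here from the numbers in the text (16 GB installed, 12 GB given to the virtual machine, about 2.5 GB left) rather than measured. The function names are made up for the example.

```python
def cache_left_gb(installed_gb, vm_gb, overhead_gb):
    """RAM left for the NAS file cache after VMs and system overhead are subtracted."""
    return max(0.0, installed_gb - vm_gb - overhead_gb)


def fits_in_cache(dataset_gb, cache_gb):
    """True when the whole dataset could sit in the cache at once."""
    return dataset_gb <= cache_gb


if __name__ == "__main__":
    table_gb = 11.2    # size of the 50-million-row test table
    overhead_gb = 1.5  # assumption inferred from 16 - 12 - 2.5
    for vm_gb in (0, 12):
        cache = cache_left_gb(16, vm_gb, overhead_gb)
        fits = fits_in_cache(table_gb, cache)
        print(f"{vm_gb} GB given to a VM: {cache:.1f} GB of cache, table fits: {fits}")
```

Without the extra virtual machine the test table fits into the cache; once 12 GB is taken away, it no longer does, which is the situation this part of the test is designed to create.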
Here the SSD cache softens the negative effect of the memory shortage, although read performance is still worse than in the previous test. We need to make sure that the speed is affected by the lack of spare memory rather than by the Synology VMM hypervisor, so we run the same test on the NAS itself.
After moving the database to the Synology VMM hypervisor, we had to add another 16 GB of memory to the RS18017xs+ to keep caching in NAS RAM possible. The tests show the same performance, which is not surprising, since all file operations in the storage use a shared pool. In other words, for practical database use Synology VMM is quite sufficient, which lets you reduce the number of servers in your company.
Digging deeper into application-level buffering and experimenting with the InnoDB Buffer Pool Size parameter, I noticed that at values from 1 GB to 6 GB performance does not change significantly, so it pays to give this memory to the NAS instead. This is what hosting providers do when they rent out virtual machines with little memory: the database works actively with the disk subsystem, which is backed by storage with SSDs and a large amount of memory.
It is also worth noting that not every SAN array can cache in RAM: buffering LUNs at the block level is a rare feature. Synology, however, now stores even iSCSI LUNs as files, so in addition to scheduled snapshots, DSM can easily tell what needs to be kept in memory and what does not.
Which SSD to choose?
For SSD caching, try to choose drives based on MLC or SLC rather than 3D NAND TLC. If possible, choose a business-class SSD, and in reviews pay attention to how IOPS are distributed over time, as in our review of NVMe SSDs. Keep in mind that your SSD must be on the Synology compatibility list; then you need not worry about the disk array.
What conclusions can be drawn from our testing? First of all, notice how aggressively Synology DSM writes data to the SSDs. After just a few minutes under load, the data is copied to the SSDs, accelerating both NFS connections and iSCSI LUNs. In synthetic tests the SSDs work even faster than RAM, but in practice it turns out quite differently: the more hot data your infrastructure has, the more memory you need to install in the NAS, regardless of whether it uses hard drives, SSDs, or hybrid arrays.
Finally, our Redis example shows that if you have chosen the right path and decided to install a modern smart NAS with virtualization in place of an old SAN storage system, you should use its capabilities to the fullest. There is no need to fill every drive bay with solid-state drives: you can simply add NoSQL database support to your software and get, even on the simplest model of the Synology RackStation series, a miraculous speed that no SSD will deliver for many years.