Synology UC3200 review: dual-controller iSCSI storage

Ever since Synology released its first rack-mounted NAS, the company has probably been asked at every press conference when it would release a dual-controller solution. For a long time the manufacturer explained that it already has a High Availability solution for clustering NASes, and that in the modern world of file access and local clouds, fault tolerance can and should be achieved in software. But, apparently giving in under the onslaught of questions, at the end of last year it released its dual-controller Active-Active storage, stepping into territory that is new for the company: SAN devices.

As often happens in our line of work, getting a UC3200 for testing was no easy task: demand for the test sample among integrators was too high. But thanks to the worldwide quarantine, I managed to get my hands on this storage system in all its glory.

IP SAN is the right term for iSCSI storage

The Synology UC3200 is a relatively new type of storage designed for converged networks, which are becoming increasingly common in medium-sized businesses. We have written about this many times, but let me remind you of the basic principle of network convergence: all network traffic in your company is transmitted via TCP over copper cable (twisted pair). This includes storage traffic (NFS, FC, iSCSI), application traffic, and voice traffic, while prioritization and fault tolerance are handled by regular network switches. As a result, you save on hardware by unifying the transmission medium: you use copper twisted pair everywhere, or optics at most, but it is always the same Ethernet network, which is easy to manage and maintain.

Scheme-1

And although converged networks became “converged” once they learned to encapsulate FC traffic, using FCoE gives no advantage over the protocols that are “native” to Ethernet. In practice, if you need to configure centralized storage for virtualization nodes, you will have to choose between NFS and iSCSI. Arguing about which protocol is better is pointless: it comes down to the integrator’s preferences and the requirements of the specific installation. The two types of access are roughly equal in speed, both support multipath for resilience to link failures, and both are offloaded by modern network controllers.

Synology implemented only the iSCSI protocol in the UC3200, so the device can no longer be classified as a NAS, because it provides block access to LUNs. But it should not be considered a SAN either, because there is no FC protocol and you do not need to buy separate switches for a SAN network. To learn how to “properly prepare” the UC3200, it is important to understand that this is an independent type of storage, very cheap by the standards of Synology servers, created for those who need to configure iSCSI volumes once and forget about the entire storage infrastructure for several years. The developer positions this device as storage for Microsoft Hyper-V, OpenStack Cinder, or VMware ESXi servers, which we will use in our tests.

DSM UC: brand new operating system

We are used to Synology NAS devices being cool, and therefore expensive, because one box usually gives you powerful, scalable video surveillance software, a file backup system, a backup system for servers and workstations, and built-in hardware and container virtualization. None of this is present in the DSM UC operating system, because it has no “packages” function for installing applications from the Synology repository.

Why? Because this is a highly specialized device focused on reliability and speed. Fortunately, the familiar DSM interface has been preserved, with the same Storage Manager, iSCSI Manager, and search.

Central to the web interface is the High Availability Manager, the high-availability technology that synchronizes two identical Synology controllers. HA Manager has long been available and proven on Synology NAS devices, but here it works inside a single device, synchronizing its two controllers.

For each “head,” statistics are available on CPU load, memory usage, disks, partitions, and iSCSI volumes. You can also configure “performance signals”: warnings about excessive CPU load or increased latency on the network or when accessing the disk. Notifications can be sent via e-mail, SMS, or push, which I find most convenient.

iSCSI storage organization

Three years ago, Synology bet on BTRFS, and today this file system is used by default in all the company’s devices. In the UC3200, all LUNs are stored as files on the volumes you create, and functions such as space reclamation and snapshots rely on the Copy-on-Write technology of BTRFS, so the old-fashioned EXT4 is no longer supported here. Pay attention to the LUN type when creating it: “Thick Provision” promises higher speed and is recommended for databases, while “Thin Provision” offers advanced features: defragmentation, space reclamation to free up unused areas, and up to 256 snapshots per LUN. Snapshots can not only be stored and protected from deletion, but also replicated to another Synology NAS, giving you off-site copies of your LUNs.

The UC3200 has a feature that is not typical of other Synology NASes: for LUNs, you can enable buffered access, which works for both read and write operations. This option is useful for arrays built on conventional hard drives and used for backups, video surveillance archives, video editing workstations, and sequential access in general.

In total, you can use up to 128 LUNs, up to 128 iSCSI targets, and up to 32 internal volumes. Here we should explain that in Synology terminology, a “storage pool” is a disk array that is divided into volumes, and those volumes in turn store the LUNs. You can assign a created pool to the first or the second controller, and later change the binding to balance the load between the processors. Indeed, the beauty of dual-controller storage is that you can create multiple storage pools and distribute the load between the “heads”.
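To illustrate the idea of distributing pools between the two “heads”, here is a minimal sketch of a greedy balancing heuristic. It is not part of DSM UC, where the binding is set by hand in the interface; the pool names and load figures are hypothetical and only show how one might reason about the assignment.

```python
def assign_pools(pool_loads):
    """Greedy split of storage pools between controllers A and B.

    pool_loads maps a (hypothetical) pool name to an estimated load,
    e.g. average IOPS. The heaviest pools are placed first, each on the
    controller that currently carries the smaller total.
    """
    totals = {"A": 0.0, "B": 0.0}
    assignment = {}
    ordered = sorted(pool_loads.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in ordered:
        target = min(totals, key=totals.get)  # least-loaded controller
        assignment[name] = target
        totals[target] += load
    return assignment, totals


# Hypothetical pools with made-up load estimates
pools = {"ssd_vm": 50, "hdd_backup": 30, "hdd_archive": 20}
assignment, totals = assign_pools(pools)
print(assignment)
print(totals)
```

In this example the single heavy SSD pool ends up alone on one controller, while the two lighter HDD pools share the other.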

Design

In terms of topology, the Synology UC3200 is two servers connected via a switch. The head unit has 12 bays for 3.5-inch drives with a SAS interface, which accept both 2.5-inch SSDs and 3.5-inch HDDs. At the time of writing, hard drives up to 16 TB and SSDs up to 3.84 TB were supported, and you can find the full list on this page.

Each node is a separate server built on a 4-core Xeon D-1521 processor running at 2.4-2.7 GHz. The Xeon D family is designed specifically for NAS and low-power devices. This processor has a dual-channel DDR4 ECC Registered PC2133 memory controller, and each UC3200 node has one 8 GB RAM module installed.

In our article on caching in Synology servers, we found that the NAS caches data from iSCSI LUNs in memory to speed up read operations, so in certain cases you can avoid spending disk bays on SSDs and simply add RAM to speed up the disk array. By the way, on the subject of SSD caching, I would like to note that VMware ESXi, starting with version 6.7 U2, has a very good caching mechanism that works on any storage volume.

Interestingly, the storage system can synchronize the data held in memory. It works like this: suppose you had to restart one of the controllers to install updates or because of a failure. As soon as it returns to the active state, the second controller passes it the data cached in RAM, so you do not have to spend time warming up its cache, even if this controller was a standby and had no active storage pools. Thus the storage system not only keeps working while one of the controllers is down, but also stays constantly “warmed up”. The same is true for the SSD cache, with the only difference that dual-port SAS SSDs “do not notice” the disconnection of one controller and are not exposed to the risk of data loss.

Since we are talking about scalability, the accessories available to you are memory modules, 10-Gigabit 10GBASE-T network controllers from Synology, and 10/25-Gigabit network controllers from Intel, Marvell, and Mellanox. By default, each node has two regular 1-Gigabit ports and one 10-Gigabit 10GBASE-T port, so the question of adding network connections can become particularly acute if you use optics. To expand capacity, you can connect two RX1219SAS disk shelves to the Synology UC3200; each is a JBOD for twelve 3.5-inch drives with a fault-tolerant SAS expander and power supply.

In fact, such an expansion scheme can withstand the simultaneous failure of three power supplies, two SAS expanders, and one controller. However, because the expansion shelves are daisy-chained, you cannot, for example, pull out the middle RX1219SAS shelf on a running system. Of course, the probability that you will need to do this is negligible, unless Synology someday releases an expansion shelf for 2.5-inch drives and you decide to replace the HDDs with SSDs. But let’s not invent far-fetched use cases; instead, let’s see how the storage behaves under various failures, because I consider this the most important indicator for this device.

Test bench configuration:

NAS:

  • 4 x Seagate Exos 16 TB HDD
  • RAID 10

OS:

  • VMware ESXi 6.7 U3
  • Windows Server 2016
  • Connection: iSCSI
  • LUN file system: NTFS, 4 KB
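For reference, the raw usable capacity of the test array can be estimated with simple arithmetic. The sketch below assumes plain RAID 10 mirroring (half of the raw capacity is usable) and ignores BTRFS and system overhead, so the real figure shown by the storage system will be lower; the function names are illustrative.

```python
def raid10_usable_tb(disk_count, disk_tb):
    """Usable capacity of a RAID 10 array in decimal terabytes.

    RAID 10 mirrors every disk, so half of the raw capacity is usable.
    File system and system partition overhead is not accounted for.
    """
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disk_count // 2 * disk_tb


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


usable = raid10_usable_tb(4, 16)  # the 4 x 16 TB array from the test bench
print(f"{usable} TB, or about {tb_to_tib(usable):.1f} TiB as an OS would report it")
```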

Fault tolerance testing

When configuring the storage, you need to choose which network ports will work in fault-tolerant mode, so that when one controller goes down, their IP addresses are taken over by the second controller. Simple mirroring is used here: port 1 on controller A is backed up by port 2 on controller B, and so on. Note that for fault tolerance, the paired ports must have static IP addresses and identical subnet, gateway, and even MTU settings. These are quite normal and understandable requirements, and to see how fault tolerance works in the NAS, let’s start with synthetic tests.
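The requirements for paired ports are easy to verify before you enable failover. The following is a minimal sketch of such a check; the `PortConfig` structure and the addresses are hypothetical and are not taken from DSM UC, which performs its own validation in the interface.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class PortConfig:
    """Hypothetical description of one controller port."""
    address: str   # address with prefix length, e.g. "10.0.10.11/24"
    gateway: str
    mtu: int
    static: bool   # True if the address is assigned statically


def failover_problems(a, b):
    """Return a list of reasons why ports a and b cannot be paired."""
    problems = []
    if not (a.static and b.static):
        problems.append("both ports need a static IP address")
    net_a = ipaddress.ip_interface(a.address).network
    net_b = ipaddress.ip_interface(b.address).network
    if net_a != net_b:
        problems.append(f"subnets differ: {net_a} vs {net_b}")
    if a.gateway != b.gateway:
        problems.append(f"gateways differ: {a.gateway} vs {b.gateway}")
    if a.mtu != b.mtu:
        problems.append(f"MTU differs: {a.mtu} vs {b.mtu}")
    return problems


# Example addresses are made up for illustration
port_a = PortConfig("10.0.10.11/24", "10.0.10.1", 9000, True)
port_b = PortConfig("10.0.10.12/24", "10.0.10.1", 1500, True)
for problem in failover_problems(port_a, port_b):
    print(problem)
```

Here the only mismatch is the MTU: one port uses jumbo frames and the other does not.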

To do this, we connect a regular Thin Provision LUN in Windows Server 2016 and look at the volume access latency under different conditions. The first test is five minutes of random 4K reads, during which access latency stays consistently stable over the entire interval.

When the active controller is disabled during random 4K reads, the downtime is record-breakingly short: only 13 seconds. I would not be mistaken in saying that 99% of applications will not even notice such a small delay, and it will not lead to service interruptions.

Reading through the backup controller is also no different from working with the main one, except for a slight increase in latency, which will be visible with some SSD models but, from a practical point of view, will not affect the operation of the service.

The main controller takes about 120 seconds to return, but this time disk access is interrupted for about 20 seconds, and as the graph shows, access to the array is interrupted twice.
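Interruptions like these can be extracted from a log of I/O completion times. The sketch below is one possible approach, not necessarily how the figures in this review were obtained: it assumes you have a sorted list of timestamps (in seconds) at which requests completed, and it reports every pause longer than a chosen threshold.

```python
def find_interruptions(completion_times, threshold_s=1.0):
    """Find pauses in I/O longer than threshold_s seconds.

    completion_times is a sorted list of timestamps in seconds at which
    I/O requests completed. Returns (start_of_pause, duration) tuples.
    """
    pauses = []
    for prev, cur in zip(completion_times, completion_times[1:]):
        if cur - prev > threshold_s:
            pauses.append((prev, cur - prev))
    return pauses


# Made-up sample: steady I/O, then a 13-second stall during failover
times = [0.0, 0.5, 1.0, 14.0, 14.5, 15.0]
for start, duration in find_interruptions(times):
    print(f"I/O stalled at t={start}s for {duration}s")
```

With two stalls in the log, as in the controller return test, the function simply returns two tuples.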

The results the Synology UC3200 demonstrates are a real breakthrough, if not a miracle, because such a short switching time from the main controller to the backup one is typical of much more expensive machines. We could end the review here, but first we need to make sure that in real life everything is as smooth as in synthetic tests. Let’s repeat all of the above for a 2-thread load of random 4K reads and writes in a 50/50 ratio.

We have a disk subsystem assembled on hard drives with a spindle speed of 7200 RPM, and of course the access time jumps a lot. Over time, obviously, predictive algorithms give up and the maximum delay increases.

The switching time from the active controller to the standby one increases significantly, up to 20 seconds, but still remains relatively low for a device in this price range. Returning the controller to active mode interrupts storage operation for 15 seconds, after which the overall array latency is noticeably lower.

Let’s move on to testing directly in VMware

I mentioned above that ESXi 6.7 has a very good caching system, and I am curious how it affects the switching time. First, we connect a “thick” iSCSI volume with standard parameters and format it entirely as VMFS 6, then create a virtual disk on it and attach it to the guest Windows Server 2016 for testing.

The VM itself is located on a different disk during testing, so any manipulation of the iSCSI volume does not affect its performance.

Disk subsystem downtime is slightly higher than with Windows, and caching increases it even more. I find it surprising that VMware’s iSCSI initiator performs worse than Microsoft’s, so if you need a high-speed iSCSI volume under Windows, it is better to connect the LUN from the Synology UC3200 directly to the guest system and back it up with the storage system’s own tools. This will be faster than using an ESXi virtual disk. But even in this simple case, when the virtual disk image sits on an iSCSI-connected volume, the switch to the backup controller is short enough that the guest OS does not report a disk access error.

Recommendations for ordering

The average retail price of the Synology UC3200 is $7,500 in a diskless, rack-mounted configuration with 8 GB of RAM per controller. No additional licenses are required for the device to work. Even with the stock 8 GB of memory, the system runs quickly and without lag, and because there are no additional features, you do not need to expand the memory. If your company already uses a Synology server and you want to purchase the UC3200 through a tender, specify compatibility with the Snapshot Replication package, which works only between Synology devices, as a prerequisite. This will protect you from being supplied with analogues.

In general, the Synology UC3200 is an interesting replacement for SAN devices that imposes no vendor lock-in on hard drives and solid-state drives, so you need not worry that your company will face restrictions on HDD/SSD supply in the future. This device is for those who need increased reliability with an extremely short switching time between controllers, something that not every storage manufacturer achieves even in Active-Active mode. If you still need the full set of business packages, including Virtual Machine Manager and Surveillance Station, Synology has a dual-controller model, the SA3200D, with Active-Passive mode, but as they say, that is a completely different story.