Huawei Taishan 2280 V2: ARM64 Kunpeng920-based server FAQ and review

To test an ARM-based server from Huawei, built around a Huawei processor with a Huawei chipset and even a Huawei network controller, I joined the queue as early as the end of August. Only in December, thanks to the approaching New Year holidays, did I manage to grab a two-week test slot: the Huawei Taishan test fleet is tiny, and interest in it is huge. Everyone wants to see what this new beast is that has come to compete with Intel Xeon and AMD EPYC, and all sorts of departments, agencies and committees were crowding the testing queue alongside me.

1. WHY?

Somewhere out there, in other countries, they talk about saving electricity and cutting CO2 emissions; some believe that the Internet has outlived itself in the modern cloud world; and in still other countries they look at the political situation. After all, Intel and AMD are US corporations, and the notorious “Uncle Sam” can at any moment not only activate whatever backdoors are hidden in processors and chipsets, but also simply restrict the supply of technology to any country.

This is all very serious: strange as it may sound, there is huge worldwide demand for “non-American” IT that is protected from hacking and malware. The only problem is that there are almost no offerings. And given that Huawei recently mentioned its desire to invest about $ millions in training ARM and Cloud specialists, it is already clear that all roads should be open to this company.

2. Are there many Taishans?

Yes, Huawei decided to cover the entire spectrum of the server market, except for blades and the very low-budget segment. The Taishan 2280 V2 reviewed here is a general-purpose server positioned against the HP DL380, but there is also the Taishan 5280 in a 4U case for 40 hard drives, the Taishan X6000 in a high-density case for 4 nodes, and an announced Edge server (read what Edge computing is). On top of that, the company will move its storage systems, which now use Xeon processors, to Kunpeng (read the review of Huawei OceanStor 2200 V3).

That is, Taishan is not a one-off experimental product but a ready-to-deploy ARM-based world that you can move to without looking back at the x86 architecture. Still, we will return to compatibility with Intel/AMD again and again.

3. So what kind of KunPeng is this?

What we know about ARM processors is that they are installed in smartphones, and today no one doubts their performance and energy efficiency. I will not take the liberty of explaining the pros and cons of the ARM architecture versus x86, assuming that someone has already done it before me or will do it instead of me. Much more importantly, x86 and ARM are not compatible: roughly speaking, you will not install regular Windows on Kunpeng, and your old Linux installation cannot be moved to the new server by simply copying files. For some, the lack of live VM migration from Intel to AMD is already a disaster (why only for some, read our article), and here it is a whole new world: you will have to use the ARM versions of your operating systems and applications, and recompile some of them. Hardened pros will tell you that with this kind of compatibility a server has no chance on the market, pointing to Itanium, buried in its time, and IBM Power, driven to the outskirts of the galaxy. But this case is different: the ARM community is growing. Why?

Because ARM scales easily at the hardware level. Today, Kunpeng offers up to 64 cores per processor on a 7 nm process, an 8-channel DDR4-2933 memory controller, PCI Express 4.0, an integrated 100-Gigabit network controller with RDMA support, and 64 MB of L3 cache. Tomorrow all of that can be doubled, because chip design on the ARM architecture lets you scale computing power almost without limit. I almost forgot: one server can have up to 4 Kunpeng processors. And yes, if it means anything to you, the “Chinese brothers” usually price lower than the “American partners”, so with the same number of cores, Huawei Taishan should be cheaper than its counterpart on Intel Xeon or AMD EPYC.

But hold your horses: today there are at least 7 Kunpeng modifications, and you have to hunt for information about them on the Huawei website. By the way, all the chips carry the logo of HiSilicon, a company whose own website has even less information. Even the fact that it is a Huawei subsidiary is given away only by the copyright notices at the bottom of its web pages. Obviously, this is another political affair, and the best way to avoid getting bogged down in it is to look at the hardware with your own eyes, or with ours.

4. What about the quality of the Taishan 2280 V2 hardware?

I mentioned that Huawei Taishan should be cheaper than its counterparts from HPE / Dell, but this does not mean that the manufacturer pinches pennies. If you are the kind of sysadmin who opens every new server before it goes into service, the first thing I would point you to is the attention to detail. The fans, made by Taiwan’s AVC, are:

  • a) hot-swappable,
  • b) mounted on an anti-vibration suspension, and
  • c) fitted with a shutter that prevents reverse air circulation when the fan stops.

You look for something to find fault with, and by and large there is nothing, so instead of the usual description of the insides, let’s answer the frequently asked questions.

For example, expansion cards in the risers are secured with soft rotary latches, and the risers themselves slide into the chassis along rails made of powder aluminum. Instead of risers for PCIe cards, you can order I/O modules that hold four 2.5" SSDs at the rear, if 6 PCIe slots are too many for you. And if you care about PCI Express 4.0 support, I am glad to report that all slots in the Taishan 2280 V2 are PCIe 4.0 x16 (except for the slot occupied by the RAID controller’s mezzanine).

5. Can I install my own expansion boards?

It seems to me that the server was specially created to install expansion boards in it.

By default, the server provides 5 full-height, full-length PCIe 4.0 x16 slots. As an experiment, we installed an Intel X520-DA2 network card, and it worked.
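If you want to see what link a third-party card actually negotiated in one of these slots, Linux exposes it through sysfs. The sketch below was not part of my test; it assumes the standard `current_link_speed` and `current_link_width` attributes under `/sys/bus/pci/devices`, and the helper names are my own.

```python
from pathlib import Path

# Per-lane transfer rate in GT/s mapped to the PCIe generation.
GT_PER_S_TO_GEN = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4, 32.0: 5}


def pcie_generation(speed_text):
    """Map a sysfs link-speed string such as '16.0 GT/s PCIe' to a generation."""
    try:
        rate = float(speed_text.split()[0])
    except (IndexError, ValueError):
        return None  # e.g. 'Unknown' or an empty attribute
    return GT_PER_S_TO_GEN.get(rate)


def list_links(root="/sys/bus/pci/devices"):
    """Yield (address, generation, width) for every device that reports a link."""
    base = Path(root)
    if not base.is_dir():
        return
    for dev in sorted(base.iterdir()):
        try:
            speed = (dev / "current_link_speed").read_text().strip()
            width = (dev / "current_link_width").read_text().strip()
        except OSError:
            continue  # device has no PCIe link attributes or they are unreadable
        yield dev.name, pcie_generation(speed), width
```

Looping over `list_links()` and printing each tuple gives a quick inventory; an older card will normally negotiate a lower generation than the slot itself supports.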

6. Does Taishan 2280 V2 support GPU installation?

Only boards without external power, such as the Tesla T4; for this purpose, additional power is supplied to two PCIe slots of the riser. In general, the small 2-kilowatt 80 Plus Platinum power supplies should be enough for at least 4 of the most powerful boards, but the trouble is that there is nowhere to connect additional power cables, none are included, and the documentation does not address the issue. Huawei has its own artificial intelligence accelerators, which may be the reason, but in any case this server is not claimed to be GPU-compatible.

7. What built-in network interfaces does the server have?

Huawei Taishan 2280 V2 has two mezzanine-format network card slots, and one of them can be swapped without opening the server (cold swap). In Huawei terminology, these mezzanines are called FlexIO, and today there are two types of such boards: 4-port Gigabit copper (model TM210) and 4-port 25-Gigabit optics (model TM280).

As you can see, we have a simple TM210 installed, and it is not interesting: there is no hot swap, and the chips are dull Realtek 8211s.

8. Whose chips are used for high-speed interfaces?

We received the test server with an additional network card that has four 25 Gbit/s ports. It probably goes without saying that there is no information about this controller, named BC51ETHB, so to shed some light I had to check the marking on its chip. Imagine my surprise when I once again saw a Huawei creation, the HiSilicon Hi1822 FPGA processor. This is the same chip that is used in the Huawei IN200 controller, which the company positions as an Intelligent NIC and not just a network card (read how an Intelligent NIC differs from a network card).

The controller supports RoCE, iWARP, SR-IOV and has the following hardware offloads:

  • VLAN offload
  • Stateless offload (Checksum/TSO/LRO/RSS)
  • VxLAN tunnel encapsulation/decapsulation offload
  • OVS TC Flower offload

If network traffic is heavy, such as on security gateways, this card can reduce CPU usage by 15-20% and reduce latency by 30%. The card’s power consumption is up to 15 W, and by the way, the card itself is supported by all operating systems, including VMware ESXi, Windows Server, RHEL, Ubuntu, and others.
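To check which of these offloads are actually switched on for a given port, you can ask `ethtool`. This is only a sketch of how I would do it: it assumes `ethtool` is installed, parses the plain-text output of `ethtool -k`, and the function names are mine. Which feature names a particular driver reports is not something I verified on this card.

```python
import subprocess


def parse_features(text):
    """Turn `ethtool -k` output into a {feature_name: bool} mapping."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        state = value.split()
        # Keep only 'name: on' / 'name: off [fixed]' lines; skip the header line.
        if sep and state and state[0] in ("on", "off"):
            features[name.strip()] = state[0] == "on"
    return features


def read_features(interface):
    """Run `ethtool -k <interface>` and return the parsed feature flags."""
    result = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_features(result.stdout)
```

Calling `read_features("eth0")` (the interface name is a placeholder) returns a dictionary in which you can look up entries such as `tcp-segmentation-offload` or `rx-checksumming`.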

9. Is Huawei also responsible for data security?

The holy of holies of any server, the disk subsystem, is handed over to LSI/Avago/Broadcom products: an active backplane and a SAS3508 controller in mezzanine form, supporting RAID levels up to 60, with an immortal supercapacitor for cache protection and configuration through the web interface. The controller can be switched to JBOD mode for software-defined storage, especially with the ZFS file system. The backplane board also uses an LSI processor, so, as elsewhere in the server world, there is nothing new here.

The drives are vendor-locked; the OEM base is mainly HGST for HDDs and Samsung for SSDs, but not necessarily.

10. What about memory?

Memory is interesting all round: there are 32 slots for DDR4 ECC Registered DIMMs running at 2933 MHz. You can install modules from 16 to 128 GB, reaching a crazy 4 terabytes on a single host, but keep in mind that all memory modules must be identical in capacity, frequency, and other parameters.
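The arithmetic behind those figures is easy to check. The sketch below uses the slot count and module size from this section and the 8-channel DDR4-2933 controller mentioned earlier; the 8 bytes per transfer is the standard 64-bit DDR4 channel width, and the bandwidth is a theoretical per-socket peak, not something I measured.

```python
SLOTS = 32                   # DIMM slots in the server
LARGEST_DIMM_GB = 128        # largest supported module
CHANNELS_PER_SOCKET = 8      # memory channels per Kunpeng 920
TRANSFERS_PER_SEC = 2933e6   # DDR4-2933
BYTES_PER_TRANSFER = 8       # 64-bit data bus per DDR4 channel


def max_capacity_tb(slots=SLOTS, dimm_gb=LARGEST_DIMM_GB):
    """Maximum memory per host in terabytes (1 TB = 1024 GB)."""
    return slots * dimm_gb / 1024


def peak_bandwidth_gb_s(channels=CHANNELS_PER_SOCKET,
                        transfers=TRANSFERS_PER_SEC,
                        width=BYTES_PER_TRANSFER):
    """Theoretical peak memory bandwidth per socket in GB/s."""
    return channels * transfers * width / 1e9


print(f"Max capacity: {max_capacity_tb():.0f} TB")
print(f"Peak bandwidth per socket: {peak_bandwidth_gb_s():.1f} GB/s")
```

Real-world throughput will be lower than the theoretical figure and depends on how many channels are actually populated.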

11. Can I replace the processor, or buy a 2-processor server with 1 CPU?

No, because the two Kunpeng 920 processors are soldered to the motherboard. From the Hi1620 marking you can guess that we are looking at a 48-core Kunpeng 920-4826. It has 48 physical cores, 46 MB of L3 cache, a frequency of 2.6 GHz, and a TDP of around 158 watts. In these parameters, the Chinese chip is very similar to AMD EPYC, only more so.

The processors do not support Hyper-Threading, so 96 cores means 96 threads. The CPUs are connected by two Hydra Link buses, each with a bandwidth of about 30 Gbit/s. If you worry that this is not enough when, for example, a virtual machine pinned to a core of the left processor works with memory served by the right one, you can enable the “One NUMA Per Socket” policy in the BIOS; the hypervisor will then be able to treat the memory and the cores as a single object to which the VM is assigned. By the way, it is very nice that this is done in the server settings rather than at the software level: VMware has a separate case study on how they dealt with this problem on AMD EPYC Naples processors.

12. Does the ARM server have a BIOS?

If you were worried about whether the ARM server has a BIOS, breathe a sigh of relief: yes, it does. Not AMI or Award but Huawei’s own, and quite advanced by server standards. You can configure the PCI Express slot parameters, the memory polling frequency, the RAID controller (from the same general interface), and other basic parameters.
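To see how the operating system ends up viewing the NUMA layout after you change that BIOS policy, you can read the node topology from sysfs. A minimal sketch, assuming the usual Linux layout under `/sys/devices/system/node`; the function names are mine, and I did not capture this output during the test.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def numa_layout(root="/sys/devices/system/node"):
    """Return {node_name: [cpu, ...]} for every NUMA node the kernel exposes."""
    layout = {}
    for node in sorted(Path(root).glob("node[0-9]*")):
        layout[node.name] = parse_cpulist((node / "cpulist").read_text())
    return layout
```

With one NUMA node per socket, I would expect `numa_layout()` on this machine to show two nodes with 48 CPUs each, but treat that as an expectation rather than a recorded result.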

Don't forget to change your password for remote management.

13. Does it have remote control?

Admit it, it is unusual to see a server without an ASPEED AST1500 / 2500 chip in charge of the BMC, and that is because this one uses the Hi1710 controller from Huawei itself. From the point of view of all sorts of “anti-sanctions” this is ideal: ASPEED is a Taiwanese company, while HiSilicon is Huawei itself, and entrusting server management to a third-party supplier is neither safe nor prestigious.

Besides, we get a full HTML5 interface which, to be honest, is still far behind ASPEED’s: there is no support for mobile browsers, and you cannot mount an image over CIFS/NFS. To be fair, there are some very good things too, such as simultaneous console access for several operators, automatic screenshots and video recording of errors, selectable BMC user roles, and domain integration.

In general, the design shows that Huawei has a lot of experience building x86 servers, and here the company has simply realized its potential, which can be read in every detail. Before we bring this monster to life by pressing the virtual Power button, let’s go over which country each part comes from. Given that everything is made in China, I am interested in the jurisdiction of the headquarters of each component’s developer.

| Component | Jurisdiction of the developer’s headquarters |
|---|---|
| Processor / SoC | |
| Processor development | China |
| Processor production | Taiwan |
| Disk subsystem | |
| Controllers | Singapore (Malaysia) |
| HDD | USA |
| SSD | South Korea |
| Network interface | |
| 1 Gbit/s | Taiwan |
| 25 Gbit/s | China |
| Remote management (iBMC) | China |
| Hardware | |
| Cooling system | Taiwan |
| Power supplies | China |

Of course, we’ve been writing in reviews for years that it’s time to switch to SSD, but the reason has never been political.

14. Which Linux is installed on the server?

Almost all server distributions have an ARM64 version called “aarch64”. Almost all the same packages that are available for the AMD64 version are compiled for ARM64 and installed by the same commands from the repositories. For testing, I used Debian 10, which was no different from installing on other hardware: all network interfaces are picked up without problems, so you can use the Netinstall distribution.
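One naming detail trips people up in scripts: the kernel and Python report this architecture as `aarch64`, while Debian’s package architecture is called `arm64` (just as `x86_64` becomes `amd64`). A small sketch for normalizing the two in deployment scripts; the mapping table and function name are my own.

```python
import platform

# Kernel/Python machine names mapped to Debian architecture names.
DEBIAN_NAMES = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "amd64",
    "amd64": "amd64",
}


def debian_arch(machine=None):
    """Return the Debian architecture name for a machine string.

    With no argument, the machine name of the current host is used.
    Unknown names are returned unchanged (lowercased).
    """
    machine = (machine or platform.machine()).lower()
    return DEBIAN_NAMES.get(machine, machine)


print(debian_arch())
```

The same normalization is handy when choosing download URLs or container tags, which tend to use the `arm64`/`amd64` spelling.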

15. What software is available?

I love Debian because its website lists the available packages together with the architectures they are built for, so before you buy the server you can check whether the software you need is in the repository or will have to be compiled from source. Immediately after installing the OS, I installed Webmin and Cockpit for remote monitoring, MariaDB, QEMU-KVM for virtualization with GUI management, Jupyter for running Python notebooks, and Git for accessing the world’s largest source code library.

In other words, it is important to understand that the basic components of enterprise distributions, such as databases, remote management, web server, and virtualization, are debugged and work in such a way that you will not notice any difference with x86: the same commands, the same configs.

16. Which software has problems?

Scientific software developed by the “non-community”. Take TensorFlow, for example: yes, there is no ARM64 build, but the developers’ site has instructions for compiling it from source, and it is only 3 commands and 5 minutes in the console. As soon as you start installing libraries via pip, however, problems await you: you have to fetch something from GitHub, hunt for something else in source form, and compile it if you find it. There is no ARM64 version at all of such an excellent environment manager as Anaconda, and as a result the Keras libraries, along with matplotlib, pandas and others, throw plenty of errors during installation.

Of course, a review cannot check all software and cover everything, but you should understand that for some software that is not part of the Linux distribution, it is better to add a level of abstraction and use third-party builds, if available. And here, for the first time in my acquaintance with the Taishan server, I run into a serious drawback of the whole project: Huawei has no service or mini-market like Bitnami from which you can download applications packaged in a container or virtual machine.
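The reason pip so often falls back to compiling is that prebuilt wheels carry a platform tag in their file name, and pip only uses a wheel whose tag matches the host; otherwise it tries to build from source. The sketch below shows a simplified version of that check. It looks only at the platform tag (real pip also matches the Python and ABI tags), and the file names are made-up examples.

```python
def wheel_platforms(filename):
    """Return the platform tags from a wheel file name.

    Wheel names end in '-{python tag}-{abi tag}-{platform tag}.whl';
    several platform tags may be joined with dots.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    return filename[:-len(".whl")].split("-")[-1].split(".")


def usable_on(filename, machine="aarch64"):
    """Simplified check: does any platform tag fit the given machine?"""
    return any(
        tag == "any" or tag.endswith("_" + machine)
        for tag in wheel_platforms(filename)
    )


# Hypothetical file names, for illustration only.
for name in (
    "example_pkg-1.0-cp37-cp37m-manylinux2014_aarch64.whl",
    "example_pkg-1.0-cp37-cp37m-manylinux1_x86_64.whl",
    "example_pkg-1.0-py3-none-any.whl",
):
    print(name, "->", usable_on(name))
```

Pure-Python packages (tag `any`) install everywhere; it is the packages with compiled extensions and no `aarch64` wheel that send you off to the compiler.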

17. How about virtualization?

Let’s put it this way: not very good. In October last year VMware demonstrated ESXi on the ARM architecture, but so far that is a distant prospect; today virtualization works only under Linux, and without emulation of x86/AMD64. You can virtualize, but all your VMs will have the ARM64 architecture.

Simply put, you will not install Windows through the hypervisor, and you will not get compatibility with the Intel/AMD-based stack. So either you build a closed cluster on the aarch64/ARM architecture (and we know that this is how professionals build their data centers), or you use container virtualization and run Docker pods under KVM.

18. How about containers?

Everything is fine: Docker is installed the same way on x86 and ARM64. Using Portainer, I combined a server on AMD EPYC and our test Huawei Taishan 2280 V2 in one interface to show that you can manage your containers on any platform from a common interface. True, you cannot move image files between AMD64 and ARM64, but you can package your app into a Docker image yourself and build it for any architecture.

If you Google deep enough, you can find people who manage to build (example 1, example 2) a shared cluster with x86 and ARM64 hosts, but mostly these involve Raspberry Pi boards and TV boxes. If you are not set on making architectures from different eras get along, you need not bother and can leave everything as it is.

I think it goes without saying that Nginx, Redis, MySQL/MariaDB, and even ready-made WordPress/Drupal/Joomla images are installed in one click.
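Whether a published image will run on the Taishan depends on whether its registry entry includes an `arm64` variant. Multi-architecture images are described by a manifest list, which `docker manifest inspect <image>` prints as JSON. Here is a sketch for reading the platforms out of that JSON; the function name is mine, and it assumes the common manifest-list layout with a `manifests` array of `platform` objects.

```python
import json


def platforms(manifest_list_json):
    """List the 'os/architecture' pairs found in a manifest list document."""
    doc = json.loads(manifest_list_json)
    found = set()
    for entry in doc.get("manifests", []):
        plat = entry.get("platform", {})
        if plat.get("os") and plat.get("architecture"):
            found.add(f'{plat["os"]}/{plat["architecture"]}')
    return sorted(found)


def runs_on_arm64(manifest_list_json):
    """True if the image offers a linux/arm64 variant."""
    return "linux/arm64" in platforms(manifest_list_json)
```

An image that is not published as a manifest list has no `manifests` array, so this sketch returns an empty list for it; in that case you have to check the architecture of the single image instead.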

19. How about speed?

To some extent, the performance of all modern CPUs in typical tasks is comparable, and within its class the Kunpeng 920 targets the niche of top Intel and AMD processors with 48 cores or more. We do not have a 2-processor x86 server at hand to put the two architectures face to face; there is only a 32-core AMD EPYC 7551p, which we can use simply to see whether AMD64 and ARM64 are comparable in speed.

And of course, we apologize in advance to those who were waiting for an “Intel vs Huawei” war, but it has been shown repeatedly that in the server segment the software version affects performance more than the processor architecture and model. Still, since my colleagues have absolutely no idea of the speed prospects of ARM64, our tests will show what you can expect when purchasing these servers.

Test Bench configuration:

Huawei Taishan 2280 v2

  • 2x Kunpeng 920 (48C, 2.6 GHz)
  • 512 GB of DDR4-2933 RAM
  • 2x 480 GB SSD

Competitor:

Software:

  • Debian Linux 10
  • Redis Server 5
  • MariaDB 10.3
  • Sysbench

Let’s take two databases, with 100 thousand and 10 million records, and compare how the machines behave under the same conditions, which will let you roughly extrapolate the results to your needs.


As the number of concurrent connections to the database grows, Taishan 2280 V2 starts to pull away, and I would not say that it comes down to the number of physical cores (96 for Huawei versus 32 for EPYC), because even at 16 threads the difference is almost twofold. We will not consider the single-threaded results, as they are of no value.
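If you repeat such a comparison on your own hardware, the ratio between two runs can be pulled straight from the sysbench reports. This is a sketch only: it assumes the `transactions: N (X per sec.)` line that sysbench 1.0 prints for OLTP tests, the function names are mine, and other versions may format the report differently.

```python
import re

# Matches e.g. "transactions:   6002   (599.93 per sec.)"
TPS_LINE = re.compile(r"transactions:\s+\d+\s+\(([\d.]+) per sec\.\)")


def transactions_per_sec(report):
    """Extract the transactions-per-second figure from a sysbench report."""
    match = TPS_LINE.search(report)
    if match is None:
        raise ValueError("no 'transactions:' line found in report")
    return float(match.group(1))


def speedup(report_a, report_b):
    """How many times faster run A was than run B, by transactions per second."""
    return transactions_per_sec(report_a) / transactions_per_sec(report_b)
```

Feed it the saved output of the same test at the same thread count on both machines; comparing runs with different thread counts or table sizes tells you nothing.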

20. What about money?

Literally every announcement about ARM servers slips in a paragraph about how profitable they are in terms of CAPEX and OPEX, so let’s take the retail GPL and calculate how much more profitable Huawei Taishan 2280 V2 is than its competitors on Intel and AMD. At present, Intel has only processors up to the Xeon Platinum 8180 on open sale, and to keep the comparison fair in terms of operating frequency we will use the Xeon Platinum 8180 (2.5 GHz, 28 cores), bearing in mind that in Intel’s case we get almost half as many physical cores. The situation with AMD is the same: HPE currently has no models with EPYC Rome, so we are limited to the first generation of processors, with 64 cores on board. In addition, we will leave out products from the US and take a pair of SSDs, without hard drives.

| Component | Huawei Taishan 2280 V2 | HPE ProLiant DL380 Gen10 | HPE ProLiant DL385 Gen10 |
|---|---|---|---|
| CPU | 2 x Kunpeng 920 | 2 x Intel Xeon Platinum 8180 | 2 x AMD EPYC 7601 |
| Frequency, GHz | 2.6 | 2.5 - 3.8 | 2.2 - 3.8 |
| Number of cores per server | 96 | 56 | 64 |
| Memory, GB | 16 x 32 | 16 x 32 | 32 x 16 |
| Memory type | DDR4-2933 | DDR4-2666 | DDR4-2666 |
| RAID controller | SR450C-M 2GB (Avago3508) | HPE Smart Array P408i-a SR / 2GB | HPE Smart Array P816i-a SR Gen10 / 4GB |
| Converged 10/25 Gbps network adapter | TM280 Onboard NIC, 25GE/10GE Optical Interface, Four-Port, SFP28 | 2 x HPE StoreFabric CN1300R 10/25Gb Dual Port Converged Network Adapter | 2 x HPE StoreFabric CN1300R 10/25Gb Dual Port Converged Network Adapter |
| Number of 25 Gbit/s ports | 4 | 4 | 4 |
| Power supply unit | 2 x 2000 W | 2 x 500 W | 2 x 800 W |
| SSD | 2 x ES3510S V5 SSD, 960GB SAS 12Gb/s, Read Intensive | 2 x HPE 960GB SATA 6G Read Intensive LFF (P09689-B21) | HPE 960GB SATA 6G Read Intensive SFF (2.5 in) |
| TOTAL, $ | 39,500 | 66,851 | 39,745 |

And now it is clear that the only competitor on price is a mythical server with two AMD EPYC 7542, which HPE did not yet offer at the time of writing.
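Since the core counts differ so much, it helps to normalize the totals. The short calculation below divides each configuration’s total from the table by its number of physical cores; it is plain arithmetic on the table’s figures and says nothing about per-core performance.

```python
# (total price in USD, physical cores per server), taken from the table above.
SERVERS = {
    "Huawei Taishan 2280 V2 (2 x Kunpeng 920)": (39_500, 96),
    "HPE ProLiant DL380 Gen10 (2 x Xeon Platinum 8180)": (66_851, 56),
    "HPE ProLiant DL385 Gen10 (2 x EPYC 7601)": (39_745, 64),
}


def price_per_core(total_usd, cores):
    """Configuration price divided by the number of physical cores."""
    return total_usd / cores


for name, (total, cores) in SERVERS.items():
    print(f"{name}: ${price_per_core(total, cores):,.2f} per core")
```

On these numbers the Taishan comes out at roughly $411 per core, against about $621 for the EPYC configuration and about $1,194 for the Xeon one.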

Recommendations to IT specialists when ordering

Today, demand for ARM processors in the data center segment is showing double-digit year-on-year growth, driven mainly by hyperscalers who purchase servers by the hundreds and thousands. When you see a ready-made server running on ARM64, you understand that the future has already arrived, and you can smoothly start building data center segments on an alternative architecture, using Taishan 2280 V2 as the bricks and handing the storage, SD-WAN and security gateway functions over to software.

What you need to remember before buying Taishan 2280 V2:

  • This is the same kind of high-tech hyperconverged server as any DL380 G10 or PowerEdge R840. But unlike them, it is a real New Age, something genuinely new for the first time in 30 years.
  • For now it is Linux only! There is no VMware, no FreeBSD and, holy of holies, no Windows Server, and whether they will appear is unclear. Commercial distributions get along with Kunpeng processors and provide technical support for them.
  • Everything is ready for the import substitution plan.
  • Standard simple tasks such as hosting, database management, traffic balancing, and VPN work right out of the box, being installed and configured with the same commands as for x86/AMD64.
  • The more applications you have in containers, the easier it will be to implement them.
  • Problems may arise with commercial software that does not distribute source code and with some libraries, especially scientific ones.

Given that this server has the highest degree of independence from American companies, it is an ideal solution for building a scalable infrastructure based on Kubernetes or Docker Swarm, with a performance not lower than x86/AMD64, but at the same time resistant to trade wars and conflicts.