EPYC hosting: how AMD is changing the VDS hosting market

The data center market is showing double-digit year-over-year growth, driven by the continued exponential growth of information on the Internet. Today, having your own “VDS in the cloud” is as commonplace for active people as a credit card once was. Fifteen years ago the concept of “hosting” was associated exclusively with websites; today it is inextricably linked with virtual machines, inside which anything can be run. Whatever needs to work around the clock, reliably, quickly and stably, you send to “live in the cloud”. This is the basis of IoT, the foundation for IT startups, and just a damn convenient thing.

In the retail segment, VDS is bought for various needs: for backups, for bookkeeping and inventory management, for processing data from Internet of Things sensors, for trading robots, and of course for website hosting. Therefore, a typical hosting provider working with individuals and small businesses has no general portrait of the consumer and their load around which to optimize the service or design the infrastructure. Roughly speaking, you have to build your business so as to please everyone. If you are thinking about creating your own cloud hosting to earn money as workflows move online, then we have chosen a good example for you.

There is a widespread myth that a hosting provider must have its own data center, even if it is in a basement. This long-standing stereotype has nothing to do with reality: most modern hosting providers simply rent server racks in data centers. Even the well-known Cloudflare service has neither its own data center nor its own maintenance staff, and pays only for the placement of its own EPYC-based servers in data centers around the world. If you want to stay afloat, rent everything and buy nothing, globalization dictates. And of course, the ability to rent a server rack or server space, put equipment there and start selling VDS services looks like a very viable idea for a startup with very low capital investment.

Moreover, a modern service is simply obliged to offer the client protection against DDoS attacks or even a WAF (Web Application Firewall). Here the VDS provider’s business model again relies on a third-party solution: DDoS protection by subscription, included in the price for each client of the service. By default, all virtual machines of a modern VDS service should already be protected from DDoS attacks, and WAF protection is also possible.

In other words, if you want to become a leader in VDS services and serve tens of thousands of customers at the same time, follow the example of VDS startups: these companies have no excessive costs for air conditioning, backup power for server racks, or intelligent security gateways. The only hardware that modern hosting providers purchase is ToR (top-of-rack) switches and the servers that run the entire software-defined infrastructure. Servers are purchased at wholesale prices, and keeping the range of models to a minimum reduces the cost of spare parts. And of course, the choice of server determines whether you can offer a competitive price for your services next month or will be scrambling to stop the outflow of customers.

The economics of a hosting company depend entirely on the CPU

When choosing a hosting server, the provider usually has only one question: how many virtual machines can be placed on it, because on average one CPU core should bring in 2-5 USD per month, depending on the tariff. It is more profitable for a hoster to sell multi-core virtual machines, not only with 12-16 cores but even with 32-64 vCPUs, since the price of such services includes a certain “exclusivity” and is clearly aimed at a wealthy client. The mass-market client, however, is still satisfied with 1 virtual processor. This means that processors with 12 cores or fewer are gradually moving into the category of “cloud exotics”, suited, for example, to single-threaded applications with poorly optimized code, interpreters that do not support caching, high-load VPN servers, and so on.

We have already said that in the retail segment it is impossible to predict the nature of the next client’s load: one will run a backup program once a week, and another will constantly compile their next project. Nor is it possible to allocate separate servers for clients with SAP, as some enterprise-grade hosters do. Therefore, in order not to make a mistake when choosing, it is advantageous for hosting providers to install the most powerful x86 processors, with 48 or 64 cores each.

In the economy segment, the top-end AMD EPYC 7742 lets hosting providers undercut competitors on price: the same frequency costs about 11% more on Intel processors. Here is where this figure comes from. Suppose we sell 100 virtual machines that consume 3,200 MHz each. Taking into account the cost of the required 320,000 MHz (the number of cores multiplied by the maximum frequency), we can sell them for $500 per month on the Intel platform and for $450 per month on AMD. Note that these are retail prices that take into account the Turbo Boost and Max Boost algorithms, the purchase price and warranty, as well as the cost of electricity and the secretary’s salary.

The first table covers base frequencies.

Comparative characteristics of 64-core EPYC CPUs

| CPU | AMD EPYC 7702 | AMD EPYC 7662 | AMD EPYC 7742 |
| --- | --- | --- | --- |
| Number of cores | 64 | 64 | 64 |
| Base frequency, GHz | 2.0 | 2.0 | 2.25 |
| Total base frequency (capacity), GHz | 128 | 128 | 144 |
| Configurable TDP range, W | 165-200 | 225-240 | 225-240 |
| Retail price, $ | 6450 | 6150 | 6950 |
| Price per GHz, $ | 50.39 | 48.05 | 48.26 |
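The last two rows of the table above are plain arithmetic, and a minimal sketch makes the method easy to reapply to other CPUs. The figures are copied from the table; the dictionary layout and the function names are illustrative assumptions, not something taken from a vendor tool.

```python
# Core counts, base frequencies and retail prices are copied from the table.
# The data layout and helper names are hypothetical, chosen for illustration.
CPUS = {
    "EPYC 7702": {"cores": 64, "base_ghz": 2.0, "price_usd": 6450},
    "EPYC 7662": {"cores": 64, "base_ghz": 2.0, "price_usd": 6150},
    "EPYC 7742": {"cores": 64, "base_ghz": 2.25, "price_usd": 6950},
}


def total_base_ghz(cpu: dict) -> float:
    """Total frequency capacity: number of cores times base frequency."""
    return cpu["cores"] * cpu["base_ghz"]


def price_per_ghz(cpu: dict) -> float:
    """Retail price divided by the total base frequency capacity."""
    return cpu["price_usd"] / total_base_ghz(cpu)


for name, cpu in CPUS.items():
    print(f"{name}: {total_base_ghz(cpu):.0f} GHz, ${price_per_ghz(cpu):.2f} per GHz")
```

The same two functions work for any processor once its core count, base frequency and price are known.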

The second table covers Max Boost frequencies.

Max Boost

| CPU | AMD EPYC 7702 | AMD EPYC 7662 | AMD EPYC 7742 |
| --- | --- | --- | --- |
| Maximum frequency, GHz | 3.35 | 3.3 | 3.4 |
| Expected maximum frequency, GHz* | 3.2 | 3.2 | 3.2 |
| Maximum total frequency (capacity), GHz | 214.4 | 211.2 | 217.6 |
| Price per maximum GHz, $ | 30.08 | 29.12 | 31.94 |
* In general, AMD EPYC processors adjust the frequency of each core dynamically, depending on load, temperature, and available power. Although our tests show that all cores of second-generation EPYC can run at the maximum frequency, AMD itself cautions that this is “not always” the case: if the temperature exceeds the 95°C threshold, the processor reduces the core frequency in steps of 25 MHz until the temperature drops back below 95°C. Therefore, along with the “maximum frequency”, you should take into account the “expected maximum frequency”, which is 5-7% lower than the declared maximum. Moreover, one of the three 64-core processors, the 7702 (64 cores, TDP 200 W, cTDP 165-200 W), supports a TDP limit as low as 165 W, which lets the customer fit within the power limit of the server rack, if there is one. But if there are no limits on electricity and cooling, the EPYC 7662 is the best buy in terms of price per GHz.

Second in importance is memory, and Linux memory in particular. The free KVM hypervisor, unlike VMware ESXi, cannot dynamically manage the RAM allocated to Windows virtual machines and gives them the entire specified amount. There is no such problem with Linux guests: their RAM is compressed and deduplicated at the hypervisor level, so the ideal client is a buyer of a 1-core Linux virtual machine with a minimum amount of RAM. In general, you can buy more RAM or SSD drives as your business needs them, and a server with 1-2 TB of RAM is no longer uncommon. You can scale everything except the CPU frequency.

As for disk space, customers, surprisingly, do not particularly care what kind of solid-state drive you have: NVMe, PCI Express or SATA/SAS. By and large this is understandable: the difference in speed is almost impossible to notice, and the share of NVMe hosting is growing every day. Any modern storage system scales perfectly, so you should not focus on the cost of SSDs: you can buy drives as needed, as your customer base and profit grow.

AMD EPYC 7742: 1000 VMs on one host

In our article “what factors should be used to choose a server processor”, I pointed out that the only correct parameter when choosing a CPU for the cloud segment is the total frequency capacity (the number of cores multiplied by their frequency) and the price per megahertz derived from it. Let’s make a small calculation. The latest generation of hypervisors (whether VMware ESXi, Microsoft Hyper-V or Linux KVM) allows you to run up to 1024 virtual machines on a single server (theoretically, you can hang up to 4 thousand virtual machines on KVM, but we do not consider such options). In practice, modern technologies allow you to place up to 8 virtual machines on each processor core without significant loss of performance. In idle mode, a typical Linux VM, whether Ubuntu or CentOS, consumes about 33 MHz, and Windows Server 2016 consumes 26 MHz. Even at the higher of these two figures, 1024 idle virtual machines will require a total of about 33.8 GHz, which is about 16 modern cores with a base frequency of 2.1 GHz.
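The idle-load estimate above can be checked with a few lines. This is a minimal sketch that uses only the figures quoted in the paragraph; the constant and function names are illustrative assumptions.

```python
import math

VM_COUNT = 1024        # per-server VM limit quoted in the article
IDLE_MHZ_PER_VM = 33   # idle Linux VM, the higher of the two quoted figures
CORE_BASE_GHZ = 2.1    # base frequency of a "modern core" used in the article
VMS_PER_CORE = 8       # practical density quoted in the article


def idle_demand_ghz(vm_count: int, mhz_per_vm: float) -> float:
    """Total frequency consumed by idle VMs, converted from MHz to GHz."""
    return vm_count * mhz_per_vm / 1000


demand = idle_demand_ghz(VM_COUNT, IDLE_MHZ_PER_VM)
equivalent_cores = demand / CORE_BASE_GHZ               # cores' worth of frequency
cores_by_density = math.ceil(VM_COUNT / VMS_PER_CORE)   # cores by the 8-VMs-per-core rule

print(f"Idle demand: {demand:.1f} GHz, about {equivalent_cores:.1f} cores")
print(f"Cores needed at {VMS_PER_CORE} VMs per core: {cores_by_density}")
```

The two estimates answer different questions: the first is the frequency actually burned by idle guests, the second is how many cores the density rule asks for regardless of load.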

What can one EPYC 7742 offer? Its base frequency capacity is 144 GHz (64 cores at 2.25 GHz each), which is 4 times more than you need to run 1024 idle client VMs. We do not consider simultaneous multithreading (Hyper-Threading in Intel terms), since it is usually useless in virtualization, so much so that VMware recommends disabling it altogether. Of course, during the day the load from clients’ IT/OT tasks increases, but not as much as is commonly believed, and the more clients you have, the more the peak loads are smoothed out, spreading across free resources. In our test cluster, which uses a hyperconverged architecture (NAS + gateway + mail + sites + Python Jupyter + Prometheus/Grafana in one box), the average consumption of one virtual machine between 11:00 and 15:00 is 280 MHz. At the scale of 1024 virtual machines, this load requires a total processor frequency of about 286 GHz, which is 2 times more than a single EPYC 7742 can provide. Therefore, hosters choose 2-processor configurations.
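The step from daytime load to a 2-processor configuration can be sketched as follows. The 280 MHz average, the 1024 VMs and the 64 x 2.25 GHz capacity come from the text; the function name is a hypothetical helper, and rounding up to whole CPUs is an assumption about how one would size a server.

```python
import math

PEAK_MHZ_PER_VM = 280                # average daytime consumption per VM, from the article
VM_COUNT = 1024
CPU_BASE_CAPACITY_GHZ = 64 * 2.25    # one EPYC 7742 at base frequency


def cpus_required(vm_count: int, mhz_per_vm: float, cpu_capacity_ghz: float):
    """Return (total demand in GHz, number of whole CPUs needed to cover it)."""
    demand_ghz = vm_count * mhz_per_vm / 1000
    return demand_ghz, math.ceil(demand_ghz / cpu_capacity_ghz)


demand_ghz, cpu_count = cpus_required(VM_COUNT, PEAK_MHZ_PER_VM, CPU_BASE_CAPACITY_GHZ)
print(f"Daytime demand: {demand_ghz:.1f} GHz -> {cpu_count} x EPYC 7742")
```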

Total frequency (capacity) of server CPUs, GHz

| CPU | Intel Xeon 8380HL | AMD EPYC 7742 |
| --- | --- | --- |
| Number of cores | 28 | 64 |
| Base frequency, GHz | 2.9 | 2.25 |
| Total base frequency (capacity), GHz | 81.2 | 144 |
| Total base frequency (capacity) for dual CPUs, GHz | 162.4 | 288 |

Intel has nothing comparable today: even the top-end 3rd-generation Xeon Scalable model, the 8380HL, has 28 cores at 2.9 GHz, which gives a total capacity of 81.2 GHz. That means 1024 virtual machines under working load are out of the question, whether you install one processor or two. Yes, I know that Intel has the top-end Xeon Platinum 9200 line, but it is practically nonexistent on the market.

But the AMD EPYC 7742 has another trump card: the Max Boost system, which can raise the frequency of all cores at the same time up to 3.4 GHz, as we showed in our comparison test of EPYC against Threadripper. I want to repeat once again, though, that it is not a given that in your case the processor will boost to the maximum declared frequencies. AMD guarantees that all the cores of a second-generation EPYC processor (with the exception of the even higher-frequency 7Fx2 series) can simultaneously operate at frequencies around 3.2 GHz. What is especially nice is that the server can work at this increased frequency for a long time, until the need for a boost disappears. In total, 2 processors at the maximum speed offer you 435.2 GHz of total frequency capacity, which is enough for 1024 client VMs.

Total frequency capacity of modern CPUs in Turbo Boost, GHz

| CPU | Intel Xeon 8380HL | AMD EPYC 7742 |
| --- | --- | --- |
| Number of cores | 28 | 64 |
| Maximum allowed frequency, GHz | 4.3 | 3.4 |
| Maximum possible frequency for all cores, GHz | 2.9 | 3.2 |
| Number of cores working at maximum frequency simultaneously under 100% load | 1 | 64 |
| Maximum frequency capacity of 1 CPU, GHz | 82.6 | 204.8 |
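The article does not spell out how the last row of the table is obtained. One reading that reproduces both numbers, offered here as an assumption, is that the cores able to hit the single-core peak run at that peak while all remaining cores run at the all-core frequency. For the EPYC 7742 the table appears to use the guaranteed 3.2 GHz for all 64 cores rather than the 3.4 GHz peak. The function name and parameters below are hypothetical.

```python
def boost_capacity_ghz(cores: int, all_core_ghz: float,
                       peak_ghz: float = 0.0, cores_at_peak: int = 0) -> float:
    """Frequency capacity when `cores_at_peak` cores run at `peak_ghz`
    and the remaining cores run at `all_core_ghz` (an assumed model)."""
    return cores_at_peak * peak_ghz + (cores - cores_at_peak) * all_core_ghz


# Intel Xeon 8380HL: 1 core at 4.3 GHz, the other 27 at 2.9 GHz.
intel_capacity = boost_capacity_ghz(28, 2.9, peak_ghz=4.3, cores_at_peak=1)

# AMD EPYC 7742: all 64 cores at the guaranteed 3.2 GHz.
amd_capacity = boost_capacity_ghz(64, 3.2)

print(f"Intel Xeon 8380HL: {intel_capacity:.1f} GHz")
print(f"AMD EPYC 7742: {amd_capacity:.1f} GHz")
```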

Do I need to explain that if you decide to move all your 12 thousand clients to new AMD servers, today you will need only 3 boxes 2U high, each of which holds 4 dual-processor nodes (today this format is called 2U4N = 2 Units, 4 Nodes), plus a storage cluster and switches? And this is nothing exotic: almost all major vendors have already presented their 2U4N solutions on AMD EPYC.

What used to occupy server racks from floor to ceiling can now fit under a desk. Or, to put it in dry business language: if yesterday you rented 3-4 racks 42U high, today you need one 42U rack (they simply do not rent out less), half of which you can sublease to other hosting providers for additional profit.
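The consolidation arithmetic can be sketched in a few lines. It assumes one VM per client and 1024 VMs per dual-processor node, as in the calculations earlier in the article; storage and switches are deliberately left out, and the function name is illustrative.

```python
import math

VMS_PER_NODE = 1024        # per-node VM count used throughout the article
NODES_PER_CHASSIS = 4      # 2U4N: four nodes per chassis
UNITS_PER_CHASSIS = 2      # each chassis is 2U high
RACK_UNITS = 42


def chassis_needed(clients: int) -> tuple:
    """Return (nodes, chassis, rack units), assuming one VM per client."""
    nodes = math.ceil(clients / VMS_PER_NODE)
    chassis = math.ceil(nodes / NODES_PER_CHASSIS)
    return nodes, chassis, chassis * UNITS_PER_CHASSIS


nodes, chassis, units = chassis_needed(12_000)
print(f"{nodes} nodes in {chassis} chassis, {units}U of a {RACK_UNITS}U rack")
```

Even after adding a storage cluster and switches, most of the rack stays free, which is what makes subleasing half of it realistic.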

The creators of AMD EPYC gave cloud providers the ability to install 512 physical x86 cores in a 2U-high enclosure divided into 4 servers with 2 processors each, and Supermicro brought this idea to life in its 2124BT-HNTR platform, which we are now testing.

TESTING

For the test we were allocated a VIP virtual machine with 240 virtual cores and 512 GB of RAM. The server node itself had 1 TB of memory, but as it turned out, not all modern software, Windows Server 2016 in particular, can work stably with such amounts of RAM, so the memory was artificially limited. What can I say? In desktop applications you cannot tell how powerful this server is: Facebook and YouTube still take 10 seconds to open (on a 10-Gigabit Internet channel), and a regular 200 MB Cinebench archive takes about a minute to unpack to the desktop on a PCI Express SSD. It is Windows with its antivirus and ad blockers that kills any speed, and you can only be glad that Google Chrome will not eat up all the memory, although who knows…

But as soon as you touch anything multi-threaded, even one node tears to shreds everything you knew about server speed up to this point, and there are four such nodes.

The Cinebench R20 test, beloved by many of my colleagues, shows a record in rendering, and this state of absolute victory is confirmed in each of the AIDA64 tests.

How should the results of user tests be interpreted in the hosting context? Very simply!

Look: the speed of AES encryption (used in VPNs) is higher than the combined speed of all the network interfaces that could be installed in this server. VeraCrypt is more reserved in its estimates, but even there it is clear that we will never run into a lack of processor speed. The concept of encrypting “data in use”, which complements the encryption of data in transit and data at rest, no longer looks so crazy: you can use this latest cloud trend of 2020 right now to stand out from the mass of competitors. And if your hosting client encrypts the disk of their virtual machine (and they will), it will not affect performance in any way.

What about prices

Even with the minimum tariff plans of 300, a single server with 1024 virtual machines on board will bring in 2.5K USD per month. The lower its purchase price, the faster it pays off and starts making a profit. Let’s compare three different configurations to find out which type of machine is the most profitable.

Server variants

| Model | Simple dual-CPU AMD server | 2U4N AMD server | Simple dual-CPU Intel server |
| --- | --- | --- | --- |
| Platform | Gigabyte R282-Z91 | Gigabyte H262-Z63 | Supermicro case + X11DPI-N |
| CPU | 2 x AMD EPYC 7742 | 8 x AMD EPYC 7742 | 2 x Intel Xeon Platinum 8280 |
| RAM | 32 x 32 GB DDR4 ECC Reg | 64 x 64 GB DDR4 ECC Reg 2933 MHz | 16 x 64 GB DDR4 LR ECC DIMM |
| Storage | 2 x 240 GB SSD Samsung 883 DCT | 8 x 240 GB Intel SSD | 2 x 240 GB Intel SSD S4510 |
| 25 Gbps interfaces | Mellanox ConnectX-4 Lx EN 25 Gb/s SFP28 | 4 x 2x SFP28 LAN ports, 25 Gb/s per port, Marvell FastLinQ QL4102-A2G OCP | 1 x Mellanox ConnectX-4 SFP28 |
| Total price, $ | 25,460 | 119,153 | 37,490 |
| Cost price of a single VM (1 vCPU, 1 GB RAM), $ | 24.40 | 28.77 | 36.11 |
| Payback period with 1024 VMs per node, days | 185 | 218 | 274 |
| Profit per server over 36 months, $ | 122,122 | 470,357 | 109,966 |

All prices are shown without project discounts. With wholesale purchases of servers, the payback period can be shortened even more dramatically, since for project deliveries EPYC prices can currently be 30-40% below retail.
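The payback and profit rows can be approximated with a short sketch. The server prices and the 1024 VMs per node come from the table; the revenue of 4 USD per VM per month and the 30-day month are assumptions that the article does not state, picked only because they land close to the table's figures. All names are illustrative.

```python
SERVERS = {
    # Prices from the table; "nodes" is the number of dual-CPU nodes per server.
    "Dual-CPU AMD": {"price_usd": 25_460, "nodes": 1},
    "2U4N AMD": {"price_usd": 119_153, "nodes": 4},
    "Dual-CPU Intel": {"price_usd": 37_490, "nodes": 1},
}

VMS_PER_NODE = 1024
REVENUE_PER_VM_USD = 4.0   # ASSUMPTION: monthly revenue per VM, not from the article
DAYS_PER_MONTH = 30        # ASSUMPTION: simplified month length


def monthly_revenue(nodes: int) -> float:
    """Revenue per month when every node is fully sold."""
    return nodes * VMS_PER_NODE * REVENUE_PER_VM_USD


def payback_days(price_usd: float, nodes: int) -> float:
    """Days of full-load revenue needed to cover the purchase price."""
    return price_usd / (monthly_revenue(nodes) / DAYS_PER_MONTH)


def profit(price_usd: float, nodes: int, months: int = 36) -> float:
    """Revenue over the period minus the purchase price; other costs are ignored."""
    return monthly_revenue(nodes) * months - price_usd


for name, s in SERVERS.items():
    days = payback_days(s["price_usd"], s["nodes"])
    gain = profit(s["price_usd"], s["nodes"])
    print(f"{name}: payback in about {days:.0f} days, 36-month profit ${gain:,.0f}")
```

Changing `REVENUE_PER_VM_USD` or applying a project discount to `price_usd` shows how sensitive the payback period is to both.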

Conclusions

Perhaps for the first time, we were able to calculate the benefits of using the AMD EPYC platform in hosting and explain them in simple terms. Strange as it may sound, it is the most powerful and most expensive processors that make VDS hosting available to literally everyone. Moreover, tariff plans with 16-32 cores have appeared, VDS plan configurators are coming into use, and providers can now compete not only on price but also on capabilities. Do not assume that everything in this market is already divided up and occupied: VDSina has proven that with a competent approach you can quickly become a market leader and build a huge customer base.

The experience of American hyperscalers shows that sales models such as “fee for frequency” and “fee for resources consumed”, as well as sales of SaaS services and protected virtual machines, are in demand, and our calculations show that AMD EPYC multi-core processors reduce the payback period for server investments to 6 months.