Virtualization Performance Impact – Recap

In previous articles in this series we explored the virtualization performance impact on Core2 class x86-64 CPUs, as well as the virtualization performance penalty on the more recent Nehalem and Sandy Bridge Xeon CPUs. In this article we carry out a similar assessment on the current (2020) generation of AMD Zen 2 EPYC CPUs, including the impact of mitigations for the CPU flaws discovered over the past couple of years. Previous tests showed the virtualization tax to be anywhere between 25% and 50% compared to bare metal performance, so it will be interesting to see whether things have improved.

System Setup

CPU: Zen2 EPYC 7402P, 24c/48t, 2.8 – 3.35 GHz
RAM: 256GB ECC DDR4 RDIMM @ 2933 MHz
Motherboard: Asrock EPYCD8-2T
Host OS: CentOS 8
Guest OS: CentOS 8

Test:

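# Fetch and unpack the kernel source (run from a tmpfs mount)
curl https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.4.70.tar.xz | tar -Jxf -
cd linux-5.4.70
# Build as many options as possible as modules, for a long, CPU-heavy compile
make allmodconfig
# Time the build; stdout is discarded, stderr still reaches the terminal
time (make -j96 2>&1 > /dev/null)
cd ..
rm -rf linux-5.4.70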

The guest is assigned all 24 cores / 48 threads. Everything is done on tmpfs, so there is no storage I/O factor to account for; this purely tests the performance impact on a CPU / RAM workload.
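Before comparing runs, it is worth confirming that the host or guest really is in the configuration being measured. The article does not say how mitigations were toggled or how huge pages were set up, so the following is only a minimal sketch: it assumes the standard Linux sysfs and procfs interfaces, and the function names show_mitigations and show_hugepages are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the article.

# Print one "name: status" line per CPU flaw known to the running kernel.
# $1 optionally overrides the sysfs directory (handy for testing).
show_mitigations() {
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    for f in "$dir"/*; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Print the huge page pool counters and the huge page size.
# $1 optionally overrides the meminfo path.
show_hugepages() {
    meminfo="${1:-/proc/meminfo}"
    [ -r "$meminfo" ] || return 0
    grep -E '^(HugePages_Total|HugePages_Free|Hugepagesize):' "$meminfo" || true
}

show_mitigations
show_hugepages
```

With mitigations disabled (for example by booting with mitigations=off, which is one possible way to do it), the status lines would be expected to report the affected flaws as vulnerable rather than mitigated.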

Virtualization Performance Impact: Results

Guest/Host    Memory Pages    Mitigations    Seconds    Performance %
Guest         4KB             off            474        75%
Guest         2MB             on             449        80%
Guest         2MB             off            436        83%
Bare Metal    N/A             N/A            362        100%

Analysis

CPU flaw mitigations make no measurable difference on this test on bare metal, i.e. when only the host is in use, which is why the bare metal row lists them as N/A. In the best case scenario, with mitigations disabled, the virtualization tax on the current generation of AMD CPUs amounts to 17%. Realistically, you would never run without mitigations in a production environment, but the good news is that they only add approximately 3% of additional overhead, far less than I expected. This seems to be a large advantage of the current generation of AMD CPUs, since they don’t suffer from the flaws that require the more expensive mitigations. Finally, backing the guest with 2MB huge pages provides a reasonably significant performance boost of 8% compared to allocating the guest memory from regular 4KB pages.
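The percentages quoted here follow from the measured build times by simple arithmetic. As a minimal sketch (the function names rel_perf and time_saved are made up for illustration, and reading the huge page gain as a reduction in run time is an assumption), they can be recomputed like this:

```shell
#!/bin/sh
# Relative performance of a guest run versus bare metal, as a percentage.
# $1 = bare metal seconds, $2 = guest seconds
rel_perf() {
    awk -v base="$1" -v guest="$2" 'BEGIN { printf "%.1f\n", 100 * base / guest }'
}

# Run time saved by the faster configuration, as a percentage of the slower run.
# $1 = slower run in seconds, $2 = faster run in seconds
time_saved() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", 100 * (slow - fast) / slow }'
}

rel_perf 362 436     # guest, 2MB pages, mitigations off -> 83.0
rel_perf 362 449     # guest, 2MB pages, mitigations on  -> 80.6
time_saved 474 436   # 2MB pages versus 4KB pages        -> 8.0
```

The virtualization tax is then 100 minus the relative performance, so roughly 17% in the best case. Small differences from the table's rounded percentages are to be expected.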

Conclusion

The performance penalty with the current generation of AMD CPUs is clearly lower than with previous generations of Intel CPUs. The worst case performance impact observed here (25%) is similar to the best case observed before. The best case performance impact is still 17%, and while this is better than ever before, it is still relatively significant for performance-sensitive environments.
