How to improve graphics performance in modern virtual machines

  • Virtualization introduces a slight loss of performance in CPU and RAM, but the impact is usually greater in storage and graphics, especially in remote desktops.
  • The graphics experience of VMs depends on the GPU, CPU, memory, disk I/O, network, and the remote desktop protocol used.
  • SR-IOV and GPU passthrough offer better graphics performance, but add complexity and cost; virtio-gpu, SPICE, and virgl are the practical choice in KVM/Proxmox environments.
  • Choosing between a VM and a physical server for graphics workloads requires stress testing, locating the bottlenecks, and adjusting hardware and hypervisor accordingly.

Graphics performance in virtual machines

When you start working seriously with modern virtualization, sooner or later you run into a recurring problem: virtual machines rarely offer the same graphics performance as an operating system installed directly on the hardware. While the host desktop may run smoothly even in 4K, the VM desktop may be choppy, with mouse lag, screen tearing, or videos that don't play as smoothly as they should.

This scenario repeats itself both in home setups and in enterprise platforms that use KVM, Proxmox, VMware, Hyper-V, or the public cloud. And the feeling is the same: "The host is working perfectly, but the VM is slow... what am I doing wrong? Do I need a dedicated GPU, SR-IOV, to change hypervisors, or simply more raw CPU power?"

Graphics performance in VMs: what you can really expect

The first thing is to adjust expectations: virtualizing desktops with "near-native" 3D acceleration remains a challenge, especially if you want to share a single GPU between a host and several virtual machines without resorting to very expensive or complex solutions.

Take a typical case: Debian 12 as a KVM host on a Ryzen 7 PRO laptop with a Radeon iGPU and a 4K display. The physical desktop works perfectly: moving windows is instantaneous, websites load quickly, and 4K YouTube plays without skipping. However, on Linux VMs with virtio or SPICE graphics, performance drops: heavy web pages and online videos lag more, and the desktop is not as smooth as the host's.

When testing different configurations (the virtio-gpu driver, SPICE, virgl, different remote viewers such as virt-viewer, Windows clients, etc.), the pointer and overall responsiveness improve somewhat, but tearing, dropped frames, and a distinctly less "lively" desktop remain. This leads many people to immediately consider GPU passthrough, or even switching platforms.

It is important to understand that, even on powerful infrastructure, virtualization introduces a small overhead on CPU and RAM, and a larger one on disk I/O and graphics. In traditional server workloads (web, databases, microservices) that penalty is acceptable; but when you start asking for fine graphical interactivity, low latency, and smooth video, every millisecond counts.

Virtual machines vs. physical servers: real impact on performance

Although we focus on graphics here, it's worth putting virtualization in context. Physical (bare metal) servers remain the benchmark when you're looking for raw performance and minimal latency, especially in high-performance databases, 3D rendering, AI, or real-time streaming.

Typical benchmark tests show that a well-configured VM on KVM or VMware performs very close to bare metal in CPU and RAM, with approximate losses of 5-8% in CPU and 7-13% in memory. The biggest gap is in storage: IOPS with 4K blocks can drop by 17-25%, which is critical if your workload is very disk-intensive.

This penalty also exists on the graphics side, with the nuance that the GPU is typically shared among multiple VMs, and the presentation path (SPICE, VNC, RDP, the hypervisor's own protocol, etc.) adds latency and compression. The result: the system is "not unusable," but when compared to the host, it feels less smooth.

That's why there are scenarios where it pays to stick with bare metal: large transactional databases (Oracle, SQL Server Enterprise, SAP HANA), AI/ML engines with heavy GPUs, or game/streaming servers with very strict latency requirements. In these situations, the CPU, memory, I/O, and GPU overhead of the virtualization layer becomes much more noticeable.

In contrast, web applications, microservices, development environments, and virtual office desktops (even a lightweight Ubuntu desktop) fit very well in VMs. They benefit from snapshots, high availability, and fast scaling, and the slight performance loss is perfectly acceptable.

CPU, RAM, disk, and network: what metrics to look at in a slow VM

Before blaming the GPU, you need to confirm that you are not limited by CPU, memory, disk, or network. Many "slow desktop" problems are actually saturation of another resource: a CPU waiting its turn, intensive swapping, or a disk at its limit.

In VMware vSphere, for example, each vCPU goes through four states: RUN (working), WAIT (waiting on I/O or idle), READY (queued with no physical CPU available) and COSTOP (co-stop in multi-vCPU VMs). High READY or COSTOP values are clear indicators of contention and of an over-subscribed host.

For CPUs, the key metrics are the sustained usage percentage, MHz usage per vCPU, and the Ready/COSTOP counters. If a VM is constantly at 90-100% usage, or is READY more than 10% of the time, that machine is struggling. Adding more vCPUs willy-nilly almost never helps if the host is already under load.

For memory, watch overall usage, including paging/swap, and, on platforms like Azure or Hyper-V, the paging or swap files placed on secondary disks. When those volumes show a lot of reads and writes, it's a clear sign that the VM has run out of RAM.

On disk and network, look at average read/write latency, IOPS, and network bandwidth. Sustained disk latencies above 15-20 ms, or drops in availability and timeouts in remote storage (Azure Storage, SAN, etc.), are direct enemies of perceived performance on the remote desktop.
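
The 10% READY threshold is expressed as a percentage, but vCenter performance charts report CPU ready as a summation in milliseconds per sampling interval. A minimal Python sketch of the conversion follows, assuming the 20-second interval of the real-time charts; the function names and the per-vCPU averaging are illustrative choices, not part of any VMware API.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU ready summation (ms) into an average percentage per vCPU.

    interval_s is the chart sampling interval (assumed 20 s for real-time charts).
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms / (interval_s * 1000.0) / vcpus * 100.0


def is_struggling(ready_pct: float, threshold_pct: float = 10.0) -> bool:
    """Apply the rule of thumb from the text: READY above ~10% means contention."""
    return ready_pct > threshold_pct


# Hypothetical sample: 4000 ms of ready time in one 20 s interval on a 1-vCPU VM.
sample_pct = cpu_ready_percent(4000)
print(f"ready: {sample_pct:.1f}%  struggling: {is_struggling(sample_pct)}")
```

The threshold is kept as a parameter because the acceptable value depends on the workload; interactive desktops usually tolerate less ready time than batch jobs.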
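
To tell a sustained latency problem from an isolated spike, it can help to summarize a series of samples rather than look at one value. The following Python sketch assumes you have already collected disk latency samples in milliseconds (for example from iostat or a monitoring export); the function name, the 20 ms default, and the sample data are hypothetical.

```python
from statistics import mean


def summarize_latency(samples_ms, threshold_ms=20.0):
    """Summarize latency samples and measure how sustained the slow periods are."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    over = 0          # samples above the threshold
    run = 0           # length of the current consecutive slow run
    longest_run = 0   # longest consecutive slow run seen so far
    for value in samples_ms:
        if value > threshold_ms:
            over += 1
            run += 1
            longest_run = max(longest_run, run)
        else:
            run = 0
    return {
        "mean_ms": mean(samples_ms),
        "max_ms": max(samples_ms),
        "share_over": over / len(samples_ms),
        "longest_run": longest_run,
    }


# Hypothetical samples taken once per second.
print(summarize_latency([4, 6, 25, 31, 28, 7, 5, 22]))
```

A high `share_over` combined with a long `longest_run` points to sustained saturation, while a single high `max_ms` with a short run is more likely a transient spike.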

Monitoring and diagnostic tools: from ESXTOP to Azure Monitor

Major manufacturers offer well-developed tools for dissecting a VM's performance. Some examples:

  • VMware: vCenter and ESXTOP.
  • Azure: Azure Monitor and PerfInsights.
  • Hyper-V: Performance Monitor and PowerShell.
  • KVM/Proxmox: combinations such as top, htop, iostat, virt-top and the web interface itself.

ESXTOP is a classic for real-time analysis. It allows you to view, every few seconds, metrics per vCPU such as %USED, %RUN, %SYS, %WAIT, %IDLE, %RDY, %CSTP, %MLMTD and many more. The basic rule: if %RDY or %CSTP spike, you have too many vCPUs or too many VMs for the host.

In Azure, enabling diagnostics at the VM and storage account level gives you charts of CPU, memory, disk and network, along with metrics on availability, latency, throttling, and storage timeout errors. This information helps distinguish between a platform issue and a bottleneck on your end due to excessive IOPS or throughput.

In Hyper-V, the work is divided between Hyper-V Manager, Performance Monitor, Resource Monitor, and PowerShell cmdlets. You can inspect physical vs. logical cores, NUMA, VHDX disks, virtual adapters, disk queues, and much more to pinpoint which part is falling short.

Beyond the manufacturer, many guides recommend running specific stress tests: sysbench for CPU, stress-ng and memtester for RAM, fio for disk I/O, iperf3 or netperf for network. This allows you to easily compare bare metal vs VM and see the limits of each hypervisor.

GPU virtualization: SR-IOV, passthrough, and proprietary solutions

When the bottleneck is clearly graphical (screen tearing, low frame rate, slow animations, choppy video), it's time to look at GPU virtualization. There are three main families of solutions here:

  • GPU passthrough (PCI passthrough). A full graphics card is assigned to a single VM. This offers near-native performance, but with obvious limitations: that GPU becomes unavailable to the host and other VMs, and you typically need a dedicated video output for that VM, which isn't ideal if you want everything on the same screen.
  • GPU virtualization by SR-IOV (Single Root I/O Virtualization). It allows exposing virtual functions (VFs) of the GPU to different VMs. The idea is very appealing: sharing graphics hardware with minimal overhead. Intel is promoting this approach in its Xe2 iGPUs for laptops (like Lunar Lake) and in data center GPUs (Flex), while AMD and NVIDIA primarily reserve this feature for very expensive professional cards, which often also come with licensing and subscription models that are not very friendly to home users or small businesses. SR-IOV is also not entirely transparent to VMs: it requires specific drivers plus BIOS/firmware and hypervisor support, and can bring its own compatibility issues. It's not always worth upgrading all your hardware (for example, buying an Intel Lunar Lake laptop just for that) if the rest of your workflow is going to remain limited by other factors. A careful review of your PC hardware helps to decide.
  • Proprietary GPU virtualization solutions, such as NVIDIA RTX vWS, NVIDIA VGX, or their successors. These combine specific hardware (for example, VGX K1/K2 type cards with multiple Kepler GPUs, large amounts of GDDR5 memory, and thousands of CUDA cores) with a GPU hypervisor that allows multiplexing the graphics computing capacity across dozens of virtual desktops.

Partial GPU technologies in desktop environments: virtio-gpu, virgl, and SPICE

For those using KVM, QEMU, Proxmox or similar, the usual path involves paravirtualized graphics devices such as virtio-gpu, combined with remote desktop protocols such as SPICE. On the guest side, a driver is installed that "understands" that virtual device and allows a certain level of basic 2D/3D acceleration.

VirGL is an additional layer that translates OpenGL calls from the guest to the host GPU. Thus, an application within the VM indirectly uses the real 3D acceleration. In theory, this should improve the graphics performance of the desktop and apps. In practice, the opposite sometimes occurs: if the host's iGPU is underpowered or the implementation is not polished, a significant performance drop is noticeable.

In fact, many users with AMD iGPUs (for example, Renoir) report that when they activate VirGL, the VM desktop becomes much slower and heavier, to the point of being worse than using virtio-gpu "without the GPU". This doesn't mean VirGL is useless, but it does depend heavily on the combination of hardware, drivers, and VM graphics load.

In Proxmox, the trio virtio-gpu + SPICE + virt-viewer is usually the minimum reasonable configuration for a graphical Linux desktop. It allows for a decent mouse pointer, window resizing, and better image compression than plain VNC. Still, don't expect the same experience as with the VMware ESXi remote console or VMRC, which are highly polished after years of optimization.

That's why many administrators coming from ESXi are surprised when they try Proxmox. Despite it being a very powerful hypervisor, the remote desktop feels less "snappy" unless you do a lot of fine-tuning or use a dedicated GPU.

When is GPU passthrough worth it, and when isn't it?

GPU passthrough remains the best-performing option for a specific VM. However, in everyday desktop use it has several drawbacks: the need for another monitor input, the loss of the GPU for the host, and additional complexity (IOMMU, groups, BIOS, drivers, bugs with suspend, etc.).

If your goal is for a single VM to have full 3D acceleration, the effort is usually worth it. Projects like Looking Glass allow you to "reinject" the VM image into the host desktop to avoid additional monitors. But if what you want is several office or testing VMs with good basic smoothness, dedicating a GPU to each one is not feasible.

For powerful desktop computers, you can consider a hybrid combination: a primary GPU for the host and passthrough of a second, more modest GPU to a specific VM. This way you keep a very usable host desktop and give that VM a graphical environment very close to native. A comparison of desktop and laptop hardware can offer further perspective on the alternatives.

With laptops, things get more complicated. They usually have a single iGPU (or an iGPU plus a dGPU tightly integrated with the firmware). With limited resources and no realistic possibility of installing another graphics card, passthrough is rarely worthwhile. It makes more sense to rely on paravirtualized and remote desktop options (virtio-gpu, SPICE, RDP) and to lower the graphics expectations of the VMs.

In summary, passthrough is the right tool for a few very demanding VMs. For labs with many machines or lightweight desktops, it pays more to tune the hypervisor, control CPU/RAM/I/O overhead, and choose the right remote desktop protocol.

Hypervisors, NUMA, dynamic memory, and other performance factors

Beyond the GPU, the way the hypervisor manages CPU, memory, storage and network directly influences the perceived fluidity of the VM's desktop. Hyper-V, KVM, VMware, and others have somewhat different philosophies, but they all share common concepts.

The Hyper-V architecture, for example, is based on a hypervisor that controls access to the hardware, a root partition with the management system, and child partitions for the VMs. This is supported by technologies such as virtual NUMA, dynamic memory, virtual switches, network SR-IOV, and storage optimizations such as ODX.

NUMA (Non-Uniform Memory Access) is especially critical in servers with many cores. If a large VM is poorly distributed across physical NUMA nodes, its memory latency increases and performance suffers, even if it appears to have ample resources on paper. Ideally, the VM's vNUMA topology should align with the host's pNUMA topology.

Dynamic memory (in Hyper-V; ballooning in other hypervisors) can save RAM overall, but it's not a good match for latency-sensitive workloads like databases or desktops with many open applications. In such cases, it is advisable to allocate fixed memory to avoid pauses when the hypervisor decides to reclaim RAM all at once.

Storage is by far the most common bottleneck. It is recommended to use fixed-size VHDX disks, separate system and data disks, opt for enterprise-grade SSDs or NVMe drives, and avoid RAID configurations with poor write behavior (RAID 5/6) for intensive workloads. Where available, Storage Spaces Direct or NVMe arrays help keep latencies within acceptable margins.

On the network side, it is advisable to configure external virtual switches on fast NICs (10 GbE if possible), use NIC teaming, enable SR-IOV for very heavy network loads, and tune MTU and offloads only if the entire network chain supports it. A poor network configuration can make a remote desktop, even with a good GPU, look worse than expected.
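
As a rough first check of the vNUMA/pNUMA alignment, you can estimate how many physical NUMA nodes a VM has to span given its size. The Python sketch below is a deliberate simplification: it assumes identical nodes and ignores what other VMs already occupy, and the function name and example figures are hypothetical.

```python
import math


def numa_nodes_needed(vcpus: int, mem_gib: float,
                      cores_per_node: int, mem_gib_per_node: float) -> int:
    """Minimum number of identical NUMA nodes a VM of this size must span."""
    if min(vcpus, mem_gib, cores_per_node, mem_gib_per_node) <= 0:
        raise ValueError("all sizes must be positive")
    by_cpu = math.ceil(vcpus / cores_per_node)      # nodes needed for the vCPUs
    by_mem = math.ceil(mem_gib / mem_gib_per_node)  # nodes needed for the memory
    return max(by_cpu, by_mem)


# Hypothetical host: 2 nodes with 16 cores and 128 GiB each.
for vcpus, mem in [(8, 64), (24, 64), (8, 200)]:
    nodes = numa_nodes_needed(vcpus, mem, cores_per_node=16, mem_gib_per_node=128)
    print(f"{vcpus} vCPUs / {mem} GiB -> spans at least {nodes} node(s)")
```

A VM that fits in one node can be kept local; one that needs two or more benefits from a vNUMA topology that mirrors the split, so the guest scheduler knows which memory is close to which vCPUs.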

Stress testing and use cases: when to choose VM or physical

To decide whether to migrate a graphics workload to a VM or leave it on physical hardware, it is important to test with benchmarks and stress tools that measure CPU, RAM, disk, and network usage and, when possible, GPU usage as well. Ideally, the real application running on bare metal should be compared with the same application running in a VM.

A realistic pattern might be: run sysbench or Geekbench for CPU, stress-ng or memtester for RAM, fio for 4K IOPS and disk latency, iperf3 for network bandwidth, and some basic graphics benchmark (e.g., glxgears or a browser-based WebGL test) on both the host and the VM.

If the loss in performance is within acceptable limits (for example, less than 10% on CPU/RAM and a 15-20% penalty on disk) and the remote desktop seems smooth enough for the intended use (office automation, admin, lightweight development), virtualization is a perfectly valid option.
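
To run the same tests on both the host and the VM, it helps to keep the command lines in one place. The Python sketch below only builds and prints the commands; it does not execute them. The specific parameters (thread count, sizes, durations, the `192.0.2.10` iperf3 server address, the libaio engine, which is Linux-only) are illustrative assumptions to adapt, not recommended values.

```python
import shlex
import shutil


def build_plan(threads=4, iperf_server="192.0.2.10"):
    """Return one command per resource, as argument lists."""
    return {
        "cpu": ["sysbench", "cpu", f"--threads={threads}", "run"],
        "ram": ["stress-ng", "--vm", "2", "--vm-bytes", "1G", "--timeout", "60s"],
        # fio creates its test file in the current working directory.
        "disk": ["fio", "--name=rand4k", "--rw=randread", "--bs=4k", "--size=1G",
                 "--ioengine=libaio", "--iodepth=32", "--direct=1",
                 "--runtime=60", "--time_based"],
        "network": ["iperf3", "-c", iperf_server],
    }


def render(plan):
    """Turn each argument list into a copy-pasteable shell command."""
    return {name: shlex.join(cmd) for name, cmd in plan.items()}


def missing_tools(plan):
    """List the tools from the plan that are not installed on this machine."""
    return sorted({cmd[0] for cmd in plan.values() if shutil.which(cmd[0]) is None})


plan = build_plan()
for name, line in render(plan).items():
    print(f"{name:8s} {line}")
print("missing:", missing_tools(plan))
```

Running identical command lines on bare metal and inside the VM keeps the comparison fair; the network test additionally needs an `iperf3 -s` instance listening on the other end.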
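
The acceptance rule above can be expressed as a small comparison of bare metal and VM scores. This Python sketch assumes higher-is-better scores (events per second, MB/s, IOPS) and takes 10% for CPU/RAM and the upper 20% bound for disk as limits; the names and the sample figures are hypothetical.

```python
# Maximum acceptable loss per resource, in percent (taken from the rule of thumb).
LIMITS_PCT = {"cpu": 10.0, "ram": 10.0, "disk": 20.0}


def loss_percent(bare_metal: float, vm: float) -> float:
    """Relative loss of the VM versus bare metal, for higher-is-better scores."""
    if bare_metal <= 0:
        raise ValueError("bare_metal score must be positive")
    return (bare_metal - vm) / bare_metal * 100.0


def evaluate(bare_metal: dict, vm: dict, limits: dict = LIMITS_PCT) -> dict:
    """Return {resource: (loss rounded to 0.1%, within_limit)}."""
    report = {}
    for name, limit in limits.items():
        loss = round(loss_percent(bare_metal[name], vm[name]), 1)
        report[name] = (loss, loss <= limit)
    return report


# Hypothetical scores, not measurements.
host_scores = {"cpu": 1000, "ram": 2000, "disk": 50000}
vm_scores = {"cpu": 930, "ram": 1800, "disk": 38000}
for name, (loss, ok) in evaluate(host_scores, vm_scores).items():
    print(f"{name}: {loss}% loss, {'OK' if ok else 'over limit'}")
```

With these sample numbers the disk result falls outside the limit, which is the kind of signal that suggests tuning storage before blaming the GPU or the remote desktop protocol.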

If, on the other hand, the application relies heavily on GPU, low latency and high sustained I/O throughput (rendering in Blender, heavy CAD, AI engines training large models, games, etc.), the experience is usually much better on a physical server with a dedicated GPU or on a professional-grade GPU passthrough/virtualized VM.

The key is to detect which component (CPU ready time, insufficient RAM, throttled I/O, lack of a real GPU, slow network, or a poorly optimized desktop protocol) is slowing down each VM, apply the simplest and most cost-effective solution possible for that case, and reserve the heavy investments (dedicated GPUs, SR-IOV, professional hardware) for the workloads where they really make a difference.