Comparative Analysis of Hardware Performance Metrics

This white paper presents a comparative analysis of hardware performance metrics across various systems, including Redundant Web Services (RWS) virtualized and bare-metal environments, as well as dedicated GPU instances. The data evaluated encompasses CPU, GPU, memory, storage, and machine learning benchmarks, offering insights into the strengths and weaknesses of each configuration.

Key Findings

  • CPU Performance: Single-core and multicore performance varied significantly between virtualized and bare-metal systems. The bare-metal system exhibited superior single-core performance, while the virtualized system excelled in multicore performance.
  • GPU Performance: GPU clock speeds and computational performance (GFLOPS) were considerably higher on dedicated GPU instances (Tesla K80, P100, V100) compared to the integrated graphics of the virtualized and bare-metal systems. This disparity was particularly evident in Geekbench GPU scores and ML benchmark results.
  • Memory Performance: Memory bandwidth and latency were generally consistent across systems, with the bare-metal system demonstrating a slight advantage in bandwidth.
  • Storage Performance: The virtualized system outperformed the bare-metal system in random write IOPS and read/write speed, while the bare-metal system showed better random read IOPS.
  • Machine Learning Benchmarks: Dedicated GPU instances significantly outperformed CPU-based systems in all ML benchmarks, highlighting the importance of hardware acceleration for machine learning workloads. The V100 consistently delivered the best performance, followed by the P100 and then the K80.

Implications

  • Virtualization Impact: Virtualization can affect CPU performance, particularly single-core performance. Its effect on other metrics, such as memory and storage performance, is less pronounced.
  • Cost-Performance Analysis: A cost-performance analysis would help determine the most cost-effective hardware configuration for different use cases.
  • Emerging Technologies: Investigating the performance of emerging technologies, such as FPGAs and specialized AI accelerators, would provide insights into their potential benefits for specific applications.
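One way to approach the cost-performance analysis suggested above is to divide a benchmark score by the hourly cost of the configuration. The following is a minimal sketch under stated assumptions: the `Config` class, the `score_per_dollar` and `rank_by_value` helpers, and every number in `example` are hypothetical placeholders, not measurements or prices from this paper.

```python
from dataclasses import dataclass


@dataclass
class Config:
    name: str           # hypothetical label, not a machine from this paper
    score: float        # any benchmark score where higher is better
    hourly_cost: float  # placeholder price in USD per hour


def score_per_dollar(cfg: Config) -> float:
    """Benchmark points obtained per dollar of hourly cost."""
    if cfg.hourly_cost <= 0:
        raise ValueError("hourly_cost must be positive")
    return cfg.score / cfg.hourly_cost


def rank_by_value(configs: list[Config]) -> list[tuple[str, float]]:
    """Return (name, score-per-dollar) pairs, best value first."""
    ranked = sorted(configs, key=score_per_dollar, reverse=True)
    return [(c.name, round(score_per_dollar(c), 1)) for c in ranked]


# Placeholder inputs for illustration only; these are not measured results.
example = [
    Config("config-a", score=2000.0, hourly_cost=1.00),
    Config("config-b", score=3000.0, hourly_cost=2.00),
]

if __name__ == "__main__":
    for name, value in rank_by_value(example):
        print(f"{name}: {value} points per dollar-hour")
```

With these placeholder inputs, the configuration with the lower raw score ranks first, which is the kind of trade-off such an analysis is meant to surface. A fuller analysis would likely weight several benchmarks by workload rather than rely on a single score.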

Benchmark Results

The data in these charts were collected using Geekbench 6. We showcase results from several categories: single-core performance, multi-core performance, bandwidth, latency, random read/write speeds, and sequential read/write speeds.

We benchmarked our virtualized and bare-metal machines against the machines listed below, which include several AWS instances and other machines tested in our data center:

  1. AWS g6.xlarge
  2. AWS g5.xlarge (A10G)
  3. AWS g5.4xlarge
  4. V100
  5. AWS g4dn.xlarge (Nvidia T4)
  6. Tesla K80
  7. AWS K80
  8. AWS p3.2xlarge (Nvidia V100)
  9. AWS g3s.xlarge (Nvidia M60)
  10. P100
  11. AWS g3.4xlarge

Single-Core Performance

Our bare-metal machines had the best single-core performance of all the machines tested, performing 24% better than the second-place AWS g6.xlarge.

[Chart: Single-Core Performance]
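A relative figure such as "24% better" is usually derived by dividing one score by a baseline score. The sketch below shows that arithmetic; the function name `percent_better` is an assumption for illustration, and the inputs are normalized placeholders (baseline set to 100), not the raw scores behind the chart.

```python
def percent_better(score: float, baseline: float) -> float:
    """Percentage by which `score` exceeds `baseline` (higher is better)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score / baseline - 1.0) * 100.0


if __name__ == "__main__":
    # Normalized placeholder values: runner-up = 100, leader = 124.
    print(round(percent_better(124.0, 100.0), 2))  # 24.0
    # The relation is not symmetric: the runner-up trails by less than 24%.
    print(round(percent_better(100.0, 124.0), 2))  # -19.35
```

The second call illustrates why the choice of baseline matters when comparing percentages across charts.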

Multi-Core Performance

In multi-core testing, our virtualized machines came in third place, with a performance difference of just 2.31%.

[Chart: Multi-Core Performance]

Bandwidth

[Chart: Bandwidth]

Latency

[Chart: Latency]

Random Read/Write IOPS

[Chart: Random Read/Write IOPS]

Random Read/Write Speed

[Chart: Random Read/Write Speed]

Sequential Read/Write Speed

[Chart: Sequential Read/Write Speed]

Geekbench Browser Score

Single-Core Score: 1076

Multi-Core Score: 2344

Geekbench 6.3.0 for Linux x86 (64-bit)
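The two scores above can be combined into a rough indicator of how well this result scales across threads. The sketch below is a heuristic, not an official Geekbench metric. It assumes the scores and the "4 Threads" topology reported in the system information that follows belong to the same run; the function names are hypothetical.

```python
SINGLE_CORE_SCORE = 1076  # single-core score listed above
MULTI_CORE_SCORE = 2344   # multi-core score listed above
THREADS = 4               # assumed from the topology "1 Processor, 2 Cores, 4 Threads"


def scaling_factor(multi: float, single: float) -> float:
    """How many times higher the multi-core score is than the single-core score."""
    return multi / single


def per_thread_efficiency(multi: float, single: float, threads: int) -> float:
    """Fraction of ideal linear scaling achieved across `threads` threads."""
    return multi / (single * threads)


if __name__ == "__main__":
    factor = scaling_factor(MULTI_CORE_SCORE, SINGLE_CORE_SCORE)
    efficiency = per_thread_efficiency(MULTI_CORE_SCORE, SINGLE_CORE_SCORE, THREADS)
    print(f"scaling factor: {factor:.2f}")                # scaling factor: 2.18
    print(f"per-thread efficiency: {efficiency:.2f}")     # per-thread efficiency: 0.54
```

A factor close to the physical core count rather than the thread count is common when the threads are hyperthreads sharing two cores, so the ratio should be read as a rough guide only.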

System Information

Operating System: Ubuntu 24.04.1 LTS
Model: Amazon EC2 g4dn.xlarge
Motherboard: Amazon EC2

CPU Information

Name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
Topology: 1 Processor, 2 Cores, 4 Threads
Identifier: GenuineIntel Family 6 Model 85 Stepping 7
Base Frequency: 3.10 GHz
Cluster 1: 0 Cores
L1 Instruction Cache: 32 KB x 2
L1 Data Cache: 32 KB x 2
L2 Cache: 1.00 MB x 2
L3 Cache: 35.8 MB x 1

Memory Information

Size: 15.42 GB