What Nvidia’s new MLPerf AI benchmark results really mean

Nvidia today published results against the new MLPerf artificial intelligence (AI) benchmarks for its AI-targeted processors. While the results look impressive, it’s important to note that some of the comparisons Nvidia makes to other systems aren’t really apples to apples. For example, Qualcomm’s systems run on a much smaller power footprint than the H100 and target market segments similar to those of the A100, where test comparisons are much fairer.

Nvidia tested its flagship H100 system, based on the latest Hopper architecture; the now mid-range A100 system, which is aimed at edge computing; and the smaller Jetson system, which targets smaller individual and/or edge workload types. This is the first H100 submission, and it shows up to 4.5 times higher performance than the A100. According to the chart below, Nvidia has some impressive results for its flagship H100 platform.

Image source: Nvidia.

Inference workloads for AI

Nvidia used the MLPerf Inference V2.1 benchmark to evaluate its capabilities in various AI inference workload scenarios. Inference differs from machine learning (ML) training, where models are created and systems “learn.”

Inference is used to run the learned models on a series of data points and obtain results. Based on conversations with companies and vendors, we at J. Gold Associates, LLC, estimate that the AI inference market is many times larger in volume than the ML training market, so having good inference benchmarks is critical to success.


Why Nvidia would run MLPerf

MLPerf is a series of industry standard benchmarks that has broad input from various companies and models a variety of workloads. These include natural language processing, speech recognition, image classification, medical imaging and object detection.

The benchmark is useful in that it can run on machines ranging from high-end data center and cloud systems to smaller-scale edge computing systems, and it can provide a consistent benchmark across products from various vendors, even though not all subtests in the benchmark are performed by all testers.

It can also set up scenarios for running offline, single-stream or multi-stream tests that chain a series of AI functions to simulate a real-world example of a full workflow (e.g., speech recognition, natural language processing, search and suggestions, text-to-speech, etc.).
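To make the difference between these scenarios concrete, here is a minimal Python sketch of how a latency-oriented result (single-stream style) and a throughput-oriented result (offline style) can be summarized. It does not use MLPerf's own tooling; the function names and the latency values are hypothetical and chosen only for illustration.

```python
def percentile_latency(latencies_ms, pct=90):
    """Nearest-rank percentile of per-query latencies (single-stream style)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Ceiling division in integers avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def offline_throughput(num_samples, elapsed_s):
    """Samples processed per second when all queries are available up front."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return num_samples / elapsed_s


# Hypothetical measurements, not taken from any MLPerf submission.
single_stream_ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print("90th percentile latency (ms):", percentile_latency(single_stream_ms))
print("offline throughput (samples/s):", offline_throughput(1000, 4.0))
```

The point of the sketch is that the two scenarios answer different questions: one asks how long a single query waits, the other asks how much work gets done per second when latency does not matter.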

While MLPerf is widely accepted, many players believe that running only parts of the test (ResNet is the most common) is a valid indicator of their performance, and these partial results are more widely available than full MLPerf results. Indeed, we can see from the chart that many of the comparison chips have no test results in other MLPerf components to compare with Nvidia's systems, as the vendors chose not to produce them.

Is Nvidia ahead of the market?

The real advantage Nvidia has over many of its competitors is in its platform approach.

While other players offer chips and/or systems, Nvidia has built a robust ecosystem that includes the chips, related hardware, and a full stable of software and development systems optimized for its chips and systems. For example, Nvidia has built tools like its Transformer Engine, which can select the level of floating-point precision (such as FP8, FP16, etc.) best suited to the task at hand at various points in the workflow. This can speed up calculations, sometimes by orders of magnitude. It gives Nvidia a strong position in the market, as it enables developers to focus on solutions rather than on low-level hardware and related code optimizations, which they would have to handle themselves on systems without corresponding platforms.

Competitors Intel and, to a lesser extent, Qualcomm have also emphasized the platform approach, but startups generally support only open-source options that may not match the capabilities provided by the major vendors. In addition, Nvidia has optimized frameworks for specific market segments that provide a valuable starting point from which solution providers can achieve faster time to market with reduced effort. New AI chip vendors cannot offer this level of resources.

Image source: Nvidia.

The power factor

One area that few companies test for is the amount of power required to run these AI systems. High-end systems like the H100 can require 500-600 watts of power to operate, and most large training systems use many H100 components, possibly thousands, in their complete system. As a result, the cost of running such large systems is extremely high.
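A rough back-of-the-envelope sketch in Python shows how quickly those numbers add up. The 500-600 watt range is the figure cited above; the fleet size of 1,000 accelerators and the assumption of continuous year-round operation are hypothetical, and the calculation ignores cooling, host CPUs and networking.

```python
def fleet_power_kw(num_accelerators, watts_each):
    """Total draw of the accelerators alone, in kilowatts."""
    return num_accelerators * watts_each / 1000.0


def yearly_energy_mwh(power_kw, hours=24 * 365):
    """Energy used over a period of continuous operation, in megawatt-hours."""
    return power_kw * hours / 1000.0


NUM_H100 = 1000  # hypothetical fleet size, for illustration only

low_kw = fleet_power_kw(NUM_H100, 500)   # low end of the cited range
high_kw = fleet_power_kw(NUM_H100, 600)  # high end of the cited range

print(f"Accelerator power: {low_kw:.0f}-{high_kw:.0f} kW")
print(f"Energy per year: {yearly_energy_mwh(low_kw):.0f}-"
      f"{yearly_energy_mwh(high_kw):.0f} MWh")
```

Under these assumptions, the accelerators alone draw 500-600 kilowatts, or roughly 4,380-5,256 megawatt-hours over a year of continuous use.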

The lower-end Jetson draws only about 50-60 watts, which is still too much for many edge computing applications. Indeed, the big hyperscalers (AWS, Microsoft, Google) all see power as a problem and are building their own energy-efficient AI accelerator chips. Nvidia is working on lower-power chips, particularly because Moore’s Law allows power to decrease as process nodes get smaller.

However, Nvidia needs to deliver products in the 10-watt range and below if it wants to fully compete with the newer optimized edge processors coming to market and with companies that have low-power credentials, like Qualcomm (and ARM, in general). There will be many low-power uses for AI inference in which Nvidia cannot currently compete.
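One simple way to fold power into a comparison is to divide a speedup by the corresponding increase in power draw. The Python sketch below does this; the 4.5x speedup is the "up to" H100-versus-A100 figure quoted earlier, while the power ratios are hypothetical placeholders, not measured values for any product.

```python
def relative_efficiency(speedup, power_ratio):
    """Performance per watt of system A relative to system B.

    speedup:     A's throughput divided by B's throughput
    power_ratio: A's power draw divided by B's power draw
    """
    if power_ratio <= 0:
        raise ValueError("power_ratio must be positive")
    return speedup / power_ratio


SPEEDUP = 4.5  # "up to" figure quoted in the article

# Hypothetical power ratios, for illustration only.
for power_ratio in (1.5, 4.5, 10.0):
    eff = relative_efficiency(SPEEDUP, power_ratio)
    print(f"power x{power_ratio}: performance per watt x{eff:.2f}")
```

A result above 1 means the faster system is also the more efficient one; a result below 1 means the raw performance lead comes at a disproportionate power cost, which is the situation that matters most at the edge.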

Nvidia’s key benchmark

Nvidia has shown some impressive benchmarks for its latest hardware, and the test results show that companies should take Nvidia’s AI leadership seriously. But it’s also important to note that the potential AI market is huge, and Nvidia may not be a leader in all areas, particularly in the low-power segment where companies like Qualcomm may have an advantage.

While Nvidia presents a comparison of its chips against standard Intel x86 processors, it does not offer a comparison with Intel’s new Habana Gaudi 2 chips, which are likely to show a high level of AI computing power that could match or exceed that of some Nvidia products.

Despite these caveats, Nvidia still offers the broadest product family, and its emphasis on integrated platform ecosystems puts it ahead in the AI race in a way that will be hard for competitors to match.
