In its debut on the industry's MLPerf benchmarks, NVIDIA Orin, a low-power system-on-chip based on the NVIDIA Ampere architecture, set new records in AI inference, raising the bar for per-accelerator performance at the edge.
Overall, NVIDIA and its partners continued to show the highest performance and the broadest ecosystem for running all machine learning workloads and scenarios in this fifth round of the industry benchmarks for production AI.
In edge AI, a pre-production version of NVIDIA Orin led in five of the six benchmarks. It ran up to 5x faster than our previous-generation Jetson AGX Xavier while delivering, on average, 2x the power efficiency.
NVIDIA Orin is available today in the NVIDIA Jetson AGX Orin developer kit for robotics and autonomous systems. More than 6,000 customers, including Amazon Web Services, John Deere, Komatsu, Medtronic, and Microsoft Azure, use the NVIDIA Jetson platform for AI inference or other tasks.
It is also a key component of our NVIDIA Hyperion platform for autonomous vehicles. BYD, China's largest electric vehicle maker, is the latest automaker to announce that it will use the Orin-based DRIVE Hyperion architecture for its next-generation automated electric vehicle fleets.
Orin is also a key ingredient in NVIDIA Clara Holoscan for Medical Devices, a platform that system makers and researchers use to develop next-generation AI instruments.
Small module, big stack
Servers and devices with NVIDIA GPUs, including Jetson AGX Orin, were the only top accelerators to run all six MLPerf benchmarks.
With its JetPack SDK, Orin runs the full NVIDIA AI platform, a proven software stack in the data center and cloud. And it’s backed by a million developers using the NVIDIA Jetson platform.
Footnote: MLPerf v2.0 Inference Closed; performance per accelerator derived from the best MLPerf results for the respective submissions, using the number of accelerators reported in the Data Center Offline and Server scenarios. Qualcomm AI 100: 2.0-130; Intel Xeon 8380 (from MLPerf v1.1): 1.1-023 and 1.1-024; Intel Xeon 8380H: 1.1-026; NVIDIA A30: 2.0-090; NVIDIA A100 (Arm): 2.0-077; NVIDIA A100 (x86): 2.0-094.
The MLPerf name and logo are registered trademarks. See www.mlcommons.org for more information.
NVIDIA and its partners continue to show industry-leading performance in all tests and scenarios from the latest MLPerf inference cycle.
The MLPerf benchmarks have broad support from organizations such as Amazon, Arm, Baidu, Dell Technologies, Facebook, Google, Harvard, Intel, Lenovo, Microsoft, Stanford, and the University of Toronto.
Most partners, most submissions
The NVIDIA AI platform again attracted the most MLPerf submissions from the broadest partner ecosystem.
Azure followed up its December debut on the MLPerf training tests with strong results this round on AI inference, both using NVIDIA A100 Tensor Core GPUs. Azure's ND96amsr_A100_v4 instance matched our best-performing eight-GPU submissions in nearly every inference test, demonstrating the power readily available from the public cloud.
System makers ASUS and H3C made their MLPerf debuts this round with submissions using the NVIDIA AI platform. They joined system makers Dell Technologies, Fujitsu, GIGABYTE, Inspur, Lenovo, Nettrix, and Supermicro, which submitted results on more than two dozen NVIDIA-Certified Systems.
Why MLPerf Matters
Our partners participate in MLPerf because they know it is a valuable tool for customers evaluating AI platforms and vendors.
MLPerf’s diverse tests cover today’s most popular AI workloads and scenarios. This gives users confidence that the benchmarks will reflect the performance they can expect across all of their tasks.
The software makes it shine
All the software we used for our tests is available in the MLPerf repository.
Two key components that enabled our inference results – NVIDIA TensorRT to optimize the AI models and NVIDIA Triton Inference Server to deploy them efficiently – are available for free on NGC, our catalog of GPU-optimized software.
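As an illustration (not the actual MLPerf submission code), here is a minimal sketch of the first of those two steps: building an optimized engine from an ONNX model with the TensorRT Python API. The file names and the FP16 setting are assumptions for the example, not part of our submissions.

import tensorrt as trt

# Create a builder and an explicit-batch network definition.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# Parse the ONNX model ("model.onnx" is a placeholder path).
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

# Enable FP16 as one example of a precision optimization.
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)

# Build and save the serialized engine for later deployment.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)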
Organizations around the world are adopting Triton, including cloud service providers such as Amazon and Microsoft.
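To show what deployment looks like from an application's point of view, here is a hedged sketch of a request sent to a running Triton server with its Python HTTP client. The model name ("resnet50"), tensor names ("input", "output"), and shape are placeholders that must match the deployed model's configuration.

import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server assumed to be listening on the default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a request; names, shape, and dtype must match the model's config.pbtxt.
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("input", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)

# Send the inference request and read back the output tensor.
response = client.infer(model_name="resnet50", inputs=[infer_input])
print(response.as_numpy("output").shape)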
We continuously fold all of our optimizations into containers available on NGC, so every user can put AI into production with peak performance.
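As a final illustrative sketch, here is one way to pull one of those NGC containers, using the Docker SDK for Python. The Triton image tag below is a placeholder, and depending on the image you may need to log in to nvcr.io first.

import docker

# Connect to the local Docker daemon.
client = docker.from_env()

# Pull the Triton Inference Server container from NGC.
# The tag is a placeholder; choose the release that matches your stack.
image = client.images.pull("nvcr.io/nvidia/tritonserver", tag="22.03-py3")
print("Pulled:", image.tags)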