NVIDIA’s Nemotron 3 Super takes the lead in an open source AI benchmark

NVIDIA’s Nemotron 3 Super has taken the top spot on the open-source model leaderboard for EnterpriseOps Gym, beating models such as Kimi K2.5, DeepSeek v3.2, and GPT OSS 120B. The result shows that NVIDIA is competing not only in AI hardware but also on the software and model side.

Nemotron 3 Super is a 120B-parameter model with 12B active parameters. It uses a hybrid mixture-of-experts (MoE) design, which means only a fraction of the model, about 12B of its 120B parameters, or 10%, is active for each token during inference. That improves efficiency while keeping model quality high.
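To make the "only part of the model is active" idea concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind MoE layers. It is an illustration under stated assumptions, not NVIDIA's implementation: the function names, the toy experts, the router scores, and the choice of k are all hypothetical, and Nemotron 3 Super's actual expert count and routing scheme are not described in this article.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and blend their outputs.

    Experts that are not selected are never called, which is where
    the inference savings come from.
    """
    selected = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    return output, [i for i, _ in selected]


def active_fraction(active_params, total_params):
    """Share of parameters used per token."""
    return active_params / total_params


if __name__ == "__main__":
    # Hypothetical toy experts: each one just scales its input.
    experts = [lambda x, s=s: s * x for s in range(1, 5)]
    # Hypothetical router scores for a single token.
    scores = [0.1, 2.0, -1.0, 1.5]

    out, used = moe_forward(3.0, experts, scores, k=2)
    print("experts used:", used)
    print("blended output:", out)

    # Figures from the article: 12B active out of 120B total parameters.
    print("active fraction:", active_fraction(12e9, 120e9))
```

In this toy setup only two of the four experts run for the token, so compute scales with the active parameters instead of the total. The same ratio applied to the article's figures gives roughly one tenth of the model in use per token.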

Nemotron 3 Super is built for enterprise agent workflows

EnterpriseOps Gym tests AI models across 1,150 tasks in interactive environments with 512 working tools. These tasks are meant to reflect enterprise workflows where an AI agent must use different systems, tools, and steps to complete work.

In that benchmark, Nemotron 3 Super reached an average score of 27.3 points on the open-source leaderboard. NVIDIA says the model led in TEAMS, Email, and Hybrid workflows, while staying competitive in CSM, ITSM, and Drive tasks.

| Model | Leaderboard position |
| --- | --- |
| NVIDIA Nemotron 3 Super | 1st |
| Kimi K2.5 | 2nd |
| DeepSeek v3.2 | 3rd |
| GPT OSS 120B | 5th |

The model also supports a native 1-million-token context window, which helps with long-running agent tasks. For enterprise use, that matters because agents often need to keep documents, tool outputs, emails, tickets, and instructions in context across many steps.

NVIDIA also built the model around several efficiency features. It uses latent MoE, multi-token prediction, a hybrid Mamba-Transformer backbone, and native NVFP4 pretraining optimized for Blackwell GPUs. NVIDIA says this can improve throughput and reduce memory needs compared with older setups.

The bigger message is that NVIDIA wants to be seen as a full AI stack company. Its GPUs already dominate AI training and inference, but models like Nemotron 3 Super help strengthen the software layer around that hardware.

That matters for companies choosing an AI platform. If NVIDIA can offer hardware, optimized models, training tools, inference frameworks, and enterprise agent support together, it becomes harder for rivals to compete on chips alone.

The result does not mean Nemotron 3 Super is the best model for every task. Benchmarks measure specific workloads, and real-world performance depends on deployment, data, latency, cost, and safety needs. Still, taking first place in an enterprise agent benchmark is a strong result for NVIDIA’s open model lineup.

For now, Nemotron 3 Super shows that NVIDIA’s AI strategy is moving beyond selling GPUs. The company wants its models and software tools to be part of the reason customers stay inside its ecosystem.
