Nvidia is shaking up the AI market, particularly in the open-source sector.
Typically, tech giants like Google or OpenAI keep their AI developments under wraps: no one knows exactly what the models were trained on.
Nvidia is now breaking with this tradition and, with Nemotron 3 Super, is delivering an agentic model that is not only free but also comes with 51 pages of documentation full of technical details. The model is therefore genuinely open source and not, like most models of its size, "merely" open weight.
Transparency as a Statement
In addition to the model itself, the complete datasets and weights used for training were also published.
Dr. Károly Zsolnai-Fehér, a computer graphics researcher at the University of Vienna and operator of the YouTube channel Two Minute Papers, has taken a closer look at the model.
What's Inside?
Nemotron 3 Super has 120 billion parameters, was trained on 25 trillion tokens, and, according to Zsolnai-Fehér, reaches roughly the intelligence level of the best closed models from a year and a half ago.
With a context length of up to one million tokens, Nemotron 3 Super is particularly strong in areas such as software engineering and complex logical reasoning (agentic reasoning).
It is a hybrid model that combines the Mamba architecture with classic Transformer elements (attention). This design gives it a decisive advantage.
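A rough illustration of why this hybrid matters for memory: a Transformer's key-value cache grows linearly with context length, while a Mamba-style recurrent state stays constant. All dimensions below are invented example values for the sketch, not Nemotron 3 Super's actual configuration.

```python
# Illustrative memory comparison: growing KV cache vs. constant recurrent state.
# Every size parameter here is a made-up example, not Nemotron 3 Super's config.

def kv_cache_bytes(context_len, n_layers=48, n_kv_heads=8,
                   head_dim=128, bytes_per_value=2):
    """Transformer attention: keys AND values stored for every past token."""
    return context_len * n_layers * n_kv_heads * head_dim * 2 * bytes_per_value

def mamba_state_bytes(n_layers=48, d_model=4096, state_dim=16,
                      bytes_per_value=2):
    """Mamba-style layer: one fixed-size state per layer, independent of context."""
    return n_layers * d_model * state_dim * bytes_per_value

for ctx in (8_000, 1_000_000):
    print(f"{ctx:>9} tokens: KV cache ≈ {kv_cache_bytes(ctx) / 1e9:.1f} GB, "
          f"Mamba state ≈ {mamba_state_bytes() / 1e6:.1f} MB (constant)")
```

With these toy numbers, going from 8,000 to one million tokens multiplies the KV cache by 125, while the recurrent state does not grow at all.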
What Excites Researchers and the Community: The Speed
In the so-called NVFP4 version, the model runs up to seven times faster than comparable open-source competitors. This enormous performance is no coincidence but the result of four technical "secrets" that Nvidia reveals in its research report:
- NVFP4 quantization: The model computes at extremely low numerical precision without noticeably losing accuracy.
- Multi-token prediction (MTP): While conventional AIs write word by word (token by token), Nemotron 3 Super drafts up to seven tokens simultaneously and verifies them in a single pass.
- Mamba layers: Traditional systems "re-read" the entire manual for every query. A Mamba layer works more like a student who reads the book once and keeps highly compressed notes, which saves a massive amount of memory.
- Stochastic rounding: To compensate for quantization errors, the researchers add targeted "noise" that averages out to zero. This keeps the model accurate despite the low-precision arithmetic.
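The stochastic-rounding idea can be sketched in a few lines. Instead of always snapping to the nearest representable value, round up or down at random, with probabilities chosen so the rounding error is zero in expectation. The grid spacing below is an arbitrary illustrative value, not the actual NVFP4 format.

```python
import random

def stochastic_round(x, step=0.25):
    """Round x onto a grid of spacing `step`, randomly picking the lower or
    upper neighbour so that the *expected* result equals x exactly."""
    lower = (x // step) * step
    frac = (x - lower) / step   # in [0, 1): how far x sits above `lower`
    return lower + step if random.random() < frac else lower

# Nearest-value rounding would always map 0.30 to 0.25, a persistent bias.
# Stochastic rounding picks 0.25 about 80% of the time and 0.50 about 20%,
# so the average over many trials drifts back to 0.30.
avg = sum(stochastic_round(0.30) for _ in range(200_000)) / 200_000
print(f"stochastic average over 200k trials: {avg:.3f}")
```

In training, this means that even though each individual low-precision operation is slightly off, the errors cancel rather than accumulate in one direction.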
What the Community Says
The release is already being hotly debated in the technical community on Reddit. User BitterProfessional7p, for example, praises the transparency, but there are also critical voices about the benchmark comparisons:
“The most important thing is: Nemotron 3 Super is completely open-source—weights, datasets, and recipes. Developers can easily customize it and use it on their own infrastructure for maximum privacy.”
Others are less impressed because the model doesn't come out on top in benchmarks. User jeekp is one of them:
“Early signs are rather underwhelming. In the LM arena, it lags significantly behind the lighter Qwen3.5 models.”
Only time will tell if Nemotron 3 Super can hold its own against strong competition from models like Qwen in everyday use, but the trend toward extremely fast, transparent open-source models is now firmly established.
Are open-source models exciting to you? Feel free to let us know in the comments!
