
2025

Reproducing Llama-Nemotron-Super-49B-V1.5 Evals

In this tutorial, we will reproduce the evals for the Llama-3.3-Nemotron-Super-49B-v1.5 model using NeMo-Skills. For an introduction to the NeMo-Skills framework, we recommend going over our introductory tutorial.

We assume you have /workspace defined in your cluster config and are executing all commands locally from that folder. Adjust the commands accordingly if you are running on Slurm or using different paths.
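To make the setup concrete, here is a rough sketch of how such an eval could be launched through the NeMo-Skills Python API. It is illustrative rather than the exact command used in the tutorial: the model path, `server_gpus` count, and `aime24` benchmark name are assumptions, and the real reproduction may need additional generation arguments, so check the NeMo-Skills eval documentation for the exact invocation.

```python
# Illustrative sketch only -- the paths, GPU count, and benchmark name below
# are assumptions, not the tutorial's exact settings.
from nemo_skills.pipeline.cli import eval, wrap_arguments

eval(
    # Extra ++-style overrides for the generation step go into wrap_arguments.
    ctx=wrap_arguments(""),
    cluster="local",                                   # name of your cluster config
    model="/workspace/Llama-Nemotron-Super-49B-v1.5",  # placeholder local path
    server_type="vllm",
    server_gpus=8,
    benchmarks="aime24",                               # assumed benchmark identifier
    output_dir="/workspace/nemotron-super-evals",
)
```

The same job can also be launched from the command line with the `ns eval` entrypoint using the equivalent flags.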

A Simple Pipeline to Improve Math Reasoning Accuracy

This tutorial walks you through a simplified version of the pipeline that we used to win the AIMO2 Kaggle competition. We will start with the Qwen2.5-14B-Instruct model, which scores only ~10% on the AIME24 benchmark, and improve it to ~30% through a series of NeMo-Skills jobs.
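As a reference point, the ~10% baseline could be measured with an eval job along the same lines as the sketch above. Again, this is only a sketch and not the tutorial's exact command: the `aime24:8` benchmark spec (averaging over 8 generations per problem), the GPU count, and the output path are assumptions.

```python
# Sketch of a baseline AIME24 eval for Qwen2.5-14B-Instruct (assumed settings).
from nemo_skills.pipeline.cli import eval, wrap_arguments

eval(
    ctx=wrap_arguments(""),
    cluster="local",
    model="Qwen/Qwen2.5-14B-Instruct",  # or a local copy under /workspace
    server_type="vllm",
    server_gpus=8,
    benchmarks="aime24:8",  # assumed syntax: 8 sampled generations per problem
    output_dir="/workspace/qwen14b-aime24-baseline",
)
```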

If you’re following along, you’ll need access to either an NVIDIA DGX box with eight NVIDIA A100 (or newer) GPUs or a Slurm cluster with similarly configured nodes. All of the commands combined should take only about 2 hours to run.