Llama 4 Day 0 Support on AMD Instinct GPUs

In today’s fast-evolving AI landscape, innovation is driven by powerful collaborations and cutting-edge hardware. AMD is proud to announce Day 0 support for Meta’s latest breakthrough — the Llama 4 Maverick and Scout models on our AMD Instinct™ MI300X and MI325X GPU accelerators.

A Collaborative Milestone

AMD’s deep-rooted partnerships have always been about pushing the limits of what’s possible. By working closely with Meta, vLLM, and Hugging Face, we’ve ensured that Llama 4 runs seamlessly on our GPUs from Day 0 with PyTorch and vLLM. This unified effort not only accelerates innovation but also empowers developers to harness the full potential of open-source AI with optimal performance and efficiency.

Inside Llama 4

Llama 4 represents a leap forward in AI technology. The model uses a mixture-of-experts (MoE) architecture, activating 17 billion parameters per token out of up to 400 billion total parameters. Designed for multimodal text and image processing, it supports extended context lengths, enabling applications that require deep, nuanced understanding of vast amounts of data.
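To make the active-vs-total parameter distinction concrete, here is a minimal toy sketch of MoE routing. This is illustrative only, not Llama 4's actual implementation: the expert count, top-k value, and dimensions are arbitrary assumptions chosen to show that each token touches only a small fraction of the expert weights.

```python
# Toy mixture-of-experts routing sketch (illustrative; not Llama 4's code).
import numpy as np

rng = np.random.default_rng(0)

num_experts, top_k, d = 16, 1, 8   # assumed toy values: 16 experts, route to 1
experts = [rng.standard_normal((d, d)) for _ in range(num_experts)]  # expert weights
router = rng.standard_normal((d, num_experts))                       # gating weights

def moe_forward(x):
    """Route token x to its top-k experts and gate-mix their outputs."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]   # indices of the top-k experts
    gates = np.exp(logits[chosen])
    gates /= gates.sum()                   # softmax over the chosen experts only
    return sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))

x = rng.standard_normal(d)
y = moe_forward(x)

# Only the chosen experts' weights are used for this token.
total_params = num_experts * d * d
active_params = top_k * d * d
print(active_params / total_params)  # fraction of expert parameters active per token
```

With 16 experts and top-1 routing, each token exercises 1/16 of the expert weights, which is the same principle that lets Llama 4 keep per-token compute near 17B parameters while the full model is far larger.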

The AMD Advantage

The AMD Instinct MI300X and MI325X GPU accelerators are engineered to meet the challenges of next-generation AI models. Key highlights include:

  • Both MI300X and MI325X can run the massive 400B-parameter Llama 4 Maverick in the BF16 data type on a single node, streamlining deployment and reducing infrastructure complexity.
  • With their large HBM capacities, these GPUs effortlessly support Llama 4’s extended context lengths, ensuring smooth performance even under the most demanding scenarios.
  • With optimized Triton and AITER kernels, both MI300X and MI325X can achieve best-in-class performance and TCO for Llama 4 in production deployments.
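The single-node claim can be sanity-checked with back-of-envelope arithmetic. The per-GPU HBM figures below (192 GB for MI300X, 256 GB for MI325X, 8 GPUs per node) are assumptions based on published Instinct specifications; the weight footprint follows from 2 bytes per BF16 parameter.

```python
# Back-of-envelope memory check for 400B BF16 weights on one 8-GPU node.
# Assumed specs: MI300X = 192 GB HBM3 per GPU, MI325X = 256 GB HBM3E per GPU.
params = 400e9
bytes_per_param = 2                            # BF16 stores 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9    # ~800 GB of raw weights

mi300x_node_gb = 8 * 192                       # 1536 GB per MI300X node
mi325x_node_gb = 8 * 256                       # 2048 GB per MI325X node

print(weights_gb, mi300x_node_gb, mi325x_node_gb)
```

The ~800 GB of weights fit comfortably within either node's aggregate HBM, leaving headroom for the KV cache that extended context lengths require.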

Looking Ahead

This milestone is not just a technical achievement—it’s a commitment to driving AI innovation forward. By leveraging the strengths of AMD’s GPU accelerators and the pioneering work behind Llama 4, developers and enterprises are now empowered to explore new frontiers in AI applications.

Stay tuned for more updates and in-depth technical insights on how AMD and our partners support and optimize the exciting Llama 4.

We also already have a handy getting-started guide covering everything from Docker setup to running inference with optimized Triton kernels on AMD’s high-performance GPUs, with single-node deployment, massive HBM memory, and top-tier TCO for production. Thanks to AMD’s collaboration with Meta, vLLM, and Hugging Face, developers can dive into next-gen AI innovation right now: https://rocm.blogs.amd.com/artificial-intelligence/llama4-day-0-support/README.html
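Once a vLLM server is up, it exposes an OpenAI-compatible HTTP API, so client code only needs to build a standard chat-completion payload. The sketch below shows that shape; the model ID and endpoint URL are assumptions to adjust for your own deployment (see the guide above for the exact serving commands).

```python
# Sketch of an OpenAI-compatible chat request for a vLLM server hosting Llama 4.
# The model ID and endpoint are placeholders; match them to your deployment.
import json

VLLM_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed local server

def build_chat_request(prompt,
                       model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
                       max_tokens=256):
    """Build an OpenAI-compatible chat-completion payload as a dict."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = json.dumps(build_chat_request("Summarize mixture-of-experts in one sentence."))
print(payload)  # POST this body to VLLM_ENDPOINT with Content-Type: application/json
```

Any OpenAI-compatible client library can be pointed at the same endpoint instead of hand-building the request.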
