SlideShare a Scribd company logo
Confidential do not distribute 1
Generative AI Automation
for private Enterprise LLMs
Part 1: LM-Controller
Confidential do not distribute 2
● AI Models and Applications are the new class of Kubernetes
workloads
● We start tackling this from LLMs
● Enterprise already invested in CPU-based Kubernetes clusters
Enterprise AI workloads
Confidential do not distribute 3
● AI Application Developers shouldn’t worry about the complexity of
model deployment.
● Platform Teams: LLMs become platform components
○ Security and Governance: signing and verification
○ RBAC and Tenancy
○ Standardization across organizations
○ Available for the Dev teams via self-service portals
Why Weave AI?
Confidential do not distribute 4
● Day 0 - Out-of-the-box experiences
○ weave-ai install
○ weave-ai run zephyr-7b-beta
● Day 1 - Integrate them to your DevOps / GitOps pipelines
○ weave-ai install --export
● Day 2 - Build and maintain model catalog for the Dev teams
○ flux commands
○ Fine-tuning models / RAG data pipelines
Why Weave AI?
Confidential do not distribute 5
Confidential do not distribute 6
● The first controller released as part of the Weave AI Controllers
● LM Controller is a Flux controller that helps deploy Large
Language Models on Kubernetes.
● It supports LLMs in the Flux OCI format.
● It uses Flux Source Controller as the in-cluster model cache.
What is LM Controller?
Confidential do not distribute 7
LLMs are snowflakes
Confidential do not distribute 8
Hugging Face
Compatible Models
GitHub / GitLab
CI
Your App
LLM Serving
Your Data CPU or GPU
on Cloud
or
on-Prem
fine-tuning
store
packaged
pulled
deploy
context
manage
LLM as Flux OCI
Confidential do not distribute 9
Why use LM Controller?
LLM Serving
LLMs
injects
all required information
to the deployment units
LM Controller
Confidential do not distribute 10
● A curated list of LLM catalog
○ In Flux’s OCI format
● Flux’s Source Controller as in-Cluster model Cache
○ No PVC required
● A controller that takes care of this and that LLM parameters for you
● A set of pre-built OpenAI API Compatible engines
○ No-AVX, AVX, AVX2, AVX512 and more to come
● An easy-to-use CLI
What Weave AI provides so far
Confidential do not distribute 11
It’s Demo Time

More Related Content

PDF
Supercharge Your AI Development with Local LLMs
PDF
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
PDF
GIST AI-X Computing Cluster
PDF
The Aipowered Developer Meap V01 Chapters 1 To 4 Of 8 Nathan B Crocker
PDF
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
PDF
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
PDF
Containers & AI - Beauty and the Beast!?!
PDF
Dell PowerEdge R7615 servers with Broadcom 100GbE NICs can deliver lower-late...
Supercharge Your AI Development with Local LLMs
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
GIST AI-X Computing Cluster
The Aipowered Developer Meap V01 Chapters 1 To 4 Of 8 Nathan B Crocker
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Containers & AI - Beauty and the Beast!?!
Dell PowerEdge R7615 servers with Broadcom 100GbE NICs can deliver lower-late...

Similar to Weave AI Controllers (Weave GitOps Office Hours) (20)

PDF
AIPyCraft: AI-Assisted Software Development Lifecycle for 6G Blockchain Oracl...
PDF
Coding with AI - Understanding LLMs and how to use them
PPTX
[KZ] Web Ecosystem with Multimodality of Gemini.pptx
PDF
Intro to Generative-AI(Gen AI Study Jams GDGC ZHCET)
PDF
20240411 QFM009 Machine Intelligence Reading List March 2024
PPTX
Multimodel_LLM_for_Content_Generation.pptx
PDF
20221130 - Luxembourg HUG Meetup
PPTX
[DSC Europe 24] Tomislav Tipuric - Exploring LLMs across clouds – A Year in t...
PPTX
[DSC DACH 24] AI and XR - Ivan Voras
PDF
Generative AI on Enterprise Cloud with NiFi and Milvus
PDF
KubeCon & CloudNative Con 2024 Artificial Intelligent
PDF
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
PDF
Omniverse for the Metaverse
PDF
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
PDF
From Traction to Production Maturing your LLMOps step by step
PDF
Generative AI for the rest of us
PDF
Building and deploying LLM applications with Apache Airflow
PDF
Overview of Artificial Intelligence - Technology
PDF
Implementing AI: Running AI at the Edge
 
PDF
"Scaling ML from 0 to millions of users", Julien Simon, AWS Dev Day Kyiv 2019
AIPyCraft: AI-Assisted Software Development Lifecycle for 6G Blockchain Oracl...
Coding with AI - Understanding LLMs and how to use them
[KZ] Web Ecosystem with Multimodality of Gemini.pptx
Intro to Generative-AI(Gen AI Study Jams GDGC ZHCET)
20240411 QFM009 Machine Intelligence Reading List March 2024
Multimodel_LLM_for_Content_Generation.pptx
20221130 - Luxembourg HUG Meetup
[DSC Europe 24] Tomislav Tipuric - Exploring LLMs across clouds – A Year in t...
[DSC DACH 24] AI and XR - Ivan Voras
Generative AI on Enterprise Cloud with NiFi and Milvus
KubeCon & CloudNative Con 2024 Artificial Intelligent
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Omniverse for the Metaverse
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
From Traction to Production Maturing your LLMOps step by step
Generative AI for the rest of us
Building and deploying LLM applications with Apache Airflow
Overview of Artificial Intelligence - Technology
Implementing AI: Running AI at the Edge
 
"Scaling ML from 0 to millions of users", Julien Simon, AWS Dev Day Kyiv 2019
Ad

More from Weaveworks (20)

PDF
Flamingo: Expand ArgoCD with Flux (Office Hours)
PDF
Six Signs You Need Platform Engineering
PDF
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
PDF
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
PDF
Flux Beyond Git Harnessing the Power of OCI
PDF
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
PDF
How to Avoid Kubernetes Multi-tenancy Catastrophes
PDF
Building internal developer platform with EKS and GitOps
PDF
GitOps Testing in Kubernetes with Flux and Testkube.pdf
PDF
Intro to GitOps with Weave GitOps, Flagger and Linkerd
PDF
Implementing Flux for Scale with Soft Multi-tenancy
PDF
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
PDF
The Story of Flux Reaching Graduation in the CNCF
PDF
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
PDF
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
PDF
Flux’s Security & Scalability with OCI & Helm Slides.pdf
PDF
Flux Security & Scalability using VS Code GitOps Extension
PDF
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
PDF
Robust Network Security and Observability with GitOps and Cilium
PDF
Intro to GitOps & Flux.pdf
Flamingo: Expand ArgoCD with Flux (Office Hours)
Six Signs You Need Platform Engineering
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Flux Beyond Git Harnessing the Power of OCI
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
How to Avoid Kubernetes Multi-tenancy Catastrophes
Building internal developer platform with EKS and GitOps
GitOps Testing in Kubernetes with Flux and Testkube.pdf
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Implementing Flux for Scale with Soft Multi-tenancy
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
The Story of Flux Reaching Graduation in the CNCF
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Flux Security & Scalability using VS Code GitOps Extension
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Robust Network Security and Observability with GitOps and Cilium
Intro to GitOps & Flux.pdf
Ad

Recently uploaded (20)

PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PPTX
Big Data Technologies - Introduction.pptx
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Modernizing your data center with Dell and AMD
PDF
Machine learning based COVID-19 study performance prediction
PDF
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
PPT
Teaching material agriculture food technology
PPTX
A Presentation on Artificial Intelligence
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Electronic commerce courselecture one. Pdf
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Digital-Transformation-Roadmap-for-Companies.pptx
Agricultural_Statistics_at_a_Glance_2022_0.pdf
NewMind AI Weekly Chronicles - August'25 Week I
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
Network Security Unit 5.pdf for BCA BBA.
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Big Data Technologies - Introduction.pptx
Diabetes mellitus diagnosis method based random forest with bat algorithm
Per capita expenditure prediction using model stacking based on satellite ima...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Modernizing your data center with Dell and AMD
Machine learning based COVID-19 study performance prediction
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
Teaching material agriculture food technology
A Presentation on Artificial Intelligence

Weave AI Controllers (Weave GitOps Office Hours)

  • 1. Confidential do not distribute 1 Generative AI Automation for private Enterprise LLMs Part 1: LM-Controller
  • 2. Confidential do not distribute 2 ● AI Models and Applications are the new class of Kubernetes workloads ● We start tackling this from LLMs ● Enterprise already invested in CPU-based Kubernetes clusters Enterprise AI workloads
  • 3. Confidential do not distribute 3 ● AI Application Developers shouldn’t worry about the complexity of model deployment. ● Platform Teams: LLMs become platform components ○ Security and Governance: signing and verification ○ RBAC and Tenancy ○ Standardization across organizations ○ Available for the Dev teams via self-service portals Why Weave AI?
  • 4. Confidential do not distribute 4 ● Day 0 - Out-of-the-box experiences ○ weave-ai install ○ weave-ai run zephyr-7b-beta ● Day 1 - Integrate them to your DevOps / GitOps pipelines ○ weave-ai install --export ● Day 2 - Build and maintain model catalog for the Dev teams ○ flux commands ○ Fine-tuning models / RAG data pipelines Why Weave AI?
  • 5. Confidential do not distribute 5
  • 6. Confidential do not distribute 6 ● The first controller released as part of the Weave AI Controllers ● LM Controller is a Flux controller that helps deploy Large Language Models on Kubernetes. ● It supports LLMs in the Flux OCI format. ● It uses Flux Source Controller as the in-cluster model cache. What is LM Controller?
  • 7. Confidential do not distribute 7 LLMs are snowflakes
  • 8. Confidential do not distribute 8 Hugging Face Compatible Models GitHub / GitLab CI Your App LLM Serving Your Data CPU or GPU on Cloud or on-Prem fine-tuning store packaged pulled deploy context manage LLM as Flux OCI
  • 9. Confidential do not distribute 9 Why use LM Controller? LLM Serving LLMs injects all required information to the deployment units LM Controller
  • 10. Confidential do not distribute 10 ● A curated list of LLM catalog ○ In Flux’s OCI format ● Flux’s Source Controller as in-Cluster model Cache ○ No PVC required ● A controller that takes care of this and that LLM parameters for you ● A set of pre-built OpenAI API Compatible engines ○ No-AVX, AVX, AVX2, AVX512 and more to come ● An easy-to-use CLI What Weave AI provides so far
  • 11. Confidential do not distribute 11 It’s Demo Time