Francisco Javier Arceo
Senior Principal Software Engineer, Red Hat
Kubeflow Steering Committee Member
Feast Maintainer
Feast, RAG,
and Milvus
Hello! 👋
A little about me
Led Data Science, Data Engineering, and ML Infra teams
at different companies
Somehow stumbled into maintaining Feast, the open
source feature store
Get to work on a mixture of distributed training,
pipelines, feature store, RAG, and agents!
In my ample free time I like to write code
1. I've spent 12+ years building AI/ML solutions for banks and fintechs
2. Joined Red Hat to work on Open Source AI
3. My wife, our 2 children, and I call NJ home 🤠
What is RAG?
Retrieval Augmented Generation
Published at NeurIPS 2020
Query Encoding
Retriever + Generator
Meta AI Research team
A pretrained encoder
In the seminal paper, they ran end-to-end
backpropagation/fine tuning on both the
Retriever and Generator
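For reference, the Retriever + Generator setup can be written compactly. The block below is a sketch of the RAG-Sequence formulation from the 2020 paper; the symbols (x for the input query, z for a retrieved document, y for the generated output, q and d for the query and document encoders) follow the paper's notation, not the slide. In the paper the document encoder and its index were kept fixed, so retriever-side fine tuning updated the query encoder.

```latex
% Retriever: score documents by inner product of dense encodings
p_\eta(z \mid x) \propto \exp\!\left( d(z)^\top q(x) \right)

% Generator marginalized over the top-k retrieved documents (RAG-Sequence)
p_{\text{RAG-Sequence}}(y \mid x) \approx
  \sum_{z \in \text{top-}k\left(p_\eta(\cdot \mid x)\right)}
  p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta\!\left(y_i \mid x, z, y_{1:i-1}\right)
```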
Why did RAG become so popular?
OpenAI
Published at NeurIPS 2020
ChatGPT took flight in Nov 2022
Google Trends shows the takeoff
Most RAG applications only use
inference 😅
Meta AI Research team
They suggested using RAG 🤯
Easier to do than fine tuning!
How does RAG work?
The Simplest RAG
Embed Data
Take documents/text and convert them into a numeric
(vector) representation
Insert Data into datastore
Insert all of that data (often in batch)
Embed User Query
In real-time, embed a user's query
Retrieve Documents with
Vector Similarity Search
Compute the cosine similarity between the query
and every stored vector representation, and return
the top k
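The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not how Feast or Milvus implement retrieval: the `embed` function is a toy word-count stand-in for a real embedding model, the "datastore" is just an in-memory list, and all names (`embed`, `cosine`, `top_k`, `store`) are made up for this example.

```python
import math
from collections import Counter

def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy embedding: word counts over a fixed vocabulary (stand-in for a real model)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2):
    """Brute-force search: score every stored vector and keep the k best."""
    scored = [(cosine(query_vec, vec), doc) for doc, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

docs = [
    "feast is a feature store",
    "milvus is a vector database",
    "docling parses pdf documents",
]
vocab = sorted({word for doc in docs for word in doc.split()})

# Steps 1 and 2: embed the documents and insert them into the (in-memory) store.
store = [(doc, embed(doc, vocab)) for doc in docs]

# Step 3: embed the user's query at request time.
query_vec = embed("which vector database", vocab)

# Step 4: retrieve the top k documents by cosine similarity.
results = top_k(query_vec, store, k=2)
print(results[0][1])
```

A vector database replaces the brute-force loop in `top_k` with an index, which is what makes this step practical at scale.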
How can Feast help with RAG?
Empowers MLEs to do what they do best: harness the power of their data!
Easy to ship RAG to
production!
Battle-tested support for real-time,
batch, and streaming
Built to scale for distributed
computing and ingestion
Fine-tuning as a first-class
citizen
Fully Open Source!
Feast in Production
Feast treats inference and fine-tuning as first-class citizens
Online Infrastructure: for model inference / RAG
Offline Infrastructure: for model fine-tuning
Scale: Kubernetes (Helm + Operator)
Feast 🤝Milvus 🤝Docling
Talk with your Docs!
Feast 🤝Milvus 🤝Docling
Feast Objects
Entities: These are primary keys
Data Sources: Files and Request objects (e.g., a CSV and an API call)
Feature Views: Each defines a collection of features/fields where we can easily enable vector search during retrieval
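To make the relationship between the three objects concrete, here is a toy model in plain Python. It is deliberately not the Feast API: `ToyEntity`, `ToySource`, `ToyField`, and `ToyFeatureView` are hypothetical stand-ins invented for this sketch, and the `vector_index` flag only mimics the idea of marking one field as searchable by vector similarity. See the Feast documentation for the real class names and arguments.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToyEntity:
    name: str  # the primary key column, e.g. a chunk or document id

@dataclass(frozen=True)
class ToySource:
    kind: str      # "file" (e.g. a CSV) or "request" (e.g. an API call)
    location: str  # hypothetical path or route

@dataclass(frozen=True)
class ToyField:
    name: str
    dtype: str
    vector_index: bool = False  # True marks the field as vector-searchable

@dataclass
class ToyFeatureView:
    name: str
    entities: list[ToyEntity]
    source: ToySource
    schema: list[ToyField]

    def vector_fields(self) -> list[str]:
        """Names of the fields that vector search would run against."""
        return [f.name for f in self.schema if f.vector_index]

chunk = ToyEntity(name="chunk_id")
chunk_file = ToySource(kind="file", location="chunks.csv")
docs_view = ToyFeatureView(
    name="document_chunks",
    entities=[chunk],
    source=chunk_file,
    schema=[
        ToyField(name="text", dtype="string"),
        ToyField(name="embedding", dtype="float32[]", vector_index=True),
    ],
)
print(docs_view.vector_fields())
```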
Feast 🤝Milvus 🤝Docling
Document/Data Transformation!
Feast allows for Feature
Transformation in
Decorators!
Batch Compute Engines (e.g., Spark)
Streaming Compute Engines (e.g., Spark,
Flink)
API Servers (e.g., the Feast Feature Server)
Defines entities, schemas, data sources, and
some other configurations
Lets MLEs easily take data to
production
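The decorator idea can be illustrated without any framework. The sketch below assumes a hypothetical `transformation` decorator and `TRANSFORMS` registry, both invented here; Feast's own decorators (for example `on_demand_feature_view`) take more arguments and plug into its compute engines, so treat this only as the general pattern: a function is written once, registered by name, and then invoked by whichever engine or server needs it.

```python
from typing import Callable

# Hypothetical registry that a batch job, stream job, or API server could look up.
TRANSFORMS: dict[str, Callable[[dict], dict]] = {}

def transformation(name: str):
    """Hypothetical decorator: register the function under `name` and return it unchanged."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        TRANSFORMS[name] = fn
        return fn
    return register

@transformation("chunk_text")
def chunk_text(row: dict) -> dict:
    """Split a document's text into fixed-size word chunks ready for embedding."""
    words = row["text"].split()
    size = row.get("chunk_size", 3)
    chunks = [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    return {"doc_id": row["doc_id"], "chunks": chunks}

# Any caller can now run the transformation by name.
out = TRANSFORMS["chunk_text"](
    {"doc_id": 1, "text": "talk with your docs using feast and milvus"}
)
print(out["chunks"])
```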
Feast 🤝Milvus 🤝Docling
Document/Data Ingestion
Ingestion in Feast is simple
Supports more scalable
ingestion as well
Several API endpoints available
More details in the docs
Feast Roadmap 🚀
What's on the horizon for Feast?
More NLP!
We want Feast to be the go-to framework for AI users to customize their RAG
solutions, and that means investing more in Milvus
Image Support
Images often benefit from metadata in recommender systems, and we intend to
enhance Feast in this space, in part because the benefits for RAG are very clear
Scaling Batch with Spark and Ray
We plan to continue to invest in the Spark development experience
We plan to add Ray as a new compute engine
Latency Improvements
We want to make Feast blazing fast and have made significant progress here
Thank you!
Here are some useful links:
Feast RAG Blog Post
Feast Documentation
Feast Website
GitHub Repo with Demo
GitHub Demo with Docling Demo

Smarter RAG Pipelines: Scaling Search with Milvus and Feast
