Weaviate Air #3 - Connor Shorten
🦜🔗 LangChain
Sequential Prompting
Text editing example
“Hallucination” -- The core problem of ChatGPT
Weaviate + ChatGPT →
Here is a history of Lauri Markkanen's NBA basketball
career
{copy and pasted wikipedia data}
Given this history.
{Does Lauri Markkanen play for the bulls?}
Answer: Lauri Markkanen was drafted by the Minnesota
Timberwolves with the seventh pick of the first round of
the 2017 NBA draft. His rights were traded to the Chicago
Bulls, where he signed on 5 July 2017. In his rookie season,
he set an NBA record for most 3-pointers in the first three
games of an NBA career with ten 3-pointers and was
named to the NBA All-Rookie First Team. Markkanen
played four seasons with the Bulls before being traded to
the Cleveland Cavaliers in 2021. In 2022, he was traded to
the Utah Jazz, where he currently plays.
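The prompt on this slide follows a simple template: the retrieved text first, then the question, then an "Answer:" cue. A minimal sketch of that template, assuming the context has already been fetched (for example by a Weaviate query). The function name build_grounded_prompt and the placeholder context string are hypothetical and only for illustration.

```python
def build_grounded_prompt(topic, context, question):
    """Assemble a prompt that asks the model to answer from supplied text.

    Mirrors the template on the slide; the exact wording is illustrative,
    not a prescribed format.
    """
    return (
        f"Here is a history of {topic}\n"
        f"{context.strip()}\n"
        "Given this history.\n"
        f"{question.strip()}\n"
        "Answer:"
    )


prompt = build_grounded_prompt(
    topic="Lauri Markkanen's NBA basketball career",
    context="<retrieved passage goes here>",  # placeholder, not real data
    question="Does Lauri Markkanen play for the bulls?",
)
print(prompt)
```

Keeping the retrieved passage ahead of the question lets the model read the evidence before it sees what is being asked.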
LLM Tool Use with Weaviate
“I would like to perform a Hybrid Search
through Keras Code Examples”
“I would like to perform a Vector Search
where wordCount is less than 50 through
Keras Stack Overflow Questions”
Coming soon
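The two requests above could plausibly be translated into Weaviate GraphQL queries along the following lines. This is a sketch under assumptions, not the tool-use implementation from the talk: the class names (KerasCodeExample, KerasStackOverflowQuestion), the returned field (title), and the search text are hypothetical, and the nearText operator assumes a text vectorizer module is enabled. Check the operator syntax against the Weaviate documentation for your version.

```python
import json


def hybrid_query(class_name, text, fields, limit=5):
    """Build a GraphQL Get query that uses the `hybrid` search operator."""
    return "{ Get { %s(hybrid: {query: %s}, limit: %d) { %s } } }" % (
        class_name, json.dumps(text), limit, " ".join(fields))


def filtered_vector_query(class_name, concept, max_words, fields, limit=5):
    """Build a nearText (vector) query restricted by a `where` filter."""
    # Keep only objects whose wordCount property is below max_words.
    where = '{path: ["wordCount"], operator: LessThan, valueInt: %d}' % max_words
    near = "{concepts: [%s]}" % json.dumps(concept)
    return "{ Get { %s(nearText: %s, where: %s, limit: %d) { %s } } }" % (
        class_name, near, where, limit, " ".join(fields))


# Hypothetical class names, field name, and search text.
q1 = hybrid_query("KerasCodeExample", "custom training loop", ["title"])
q2 = filtered_vector_query(
    "KerasStackOverflowQuestion", "custom training loop", 50, ["title"])
print(q1)
print(q2)
```

In a tool-use setup, the language model would be responsible for choosing the operator, the class, and the filter values from the natural-language request; the functions above only show what the resulting query text might look like.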
ChatGPT
What is RLHF and why is ChatGPT
different from GPT3?
The language modeling objective [...] predicting the next
token on a webpage from the internet -- is different from
the objective “follow the user’s instructions helpfully and
safely” [...] Thus, we say that the language modeling
objective is misaligned.
Ambiguous Dataset Details
InstructGPT Result
Galactica is trained on a large
and curated corpus of humanity’s
scientific knowledge. This
includes over 48 million papers,
textbooks and lecture notes,
millions of compounds and
proteins, scientific websites,
encyclopedias and more.
Galactica: A Large Language Model
for Science - Taylor et al. 2022
Weaviate Air #3 - New in AI segment.pdf
Galactica vs. ChatGPT
So much interesting stuff on model inference
and deployment! (All linked in description)
Efficiently Scaling Transformer Inference - Pope et al. 2022
Who Says Elephants Can’t Run: Bringing Large Scale MoE Models into
Cloud Scale Production - Kim et al. 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for
Large Language Models - Xiao et al. 2022
Deploy T5 11B for inference for less than $500 - Philipp Schmid
Our method more than doubles the
compression gains relative to
previously-proposed one-shot
quantization methods, preserving
accuracy, allowing us for the first
time to execute an 175
billion-parameter model inside a
single GPU.
GPTQ: Accurate Post-Training Quantization for
Generative Pre-Trained Transformers - Frantar
et al. 2022
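To make the single-GPU claim concrete, the sketch below works out the weight memory of a 175 billion-parameter model at different bit widths and shows the simplest form of post-training quantization, uniform round-to-nearest. Round-to-nearest is only a baseline used here for illustration; it is not the GPTQ algorithm, which chooses its roundings using approximate second-order information. The helper names are hypothetical, and the memory figure covers weights only, ignoring activations and caches.

```python
def weight_memory_gb(n_params, bits):
    """Memory for the weights alone, in decimal gigabytes."""
    return n_params * bits / 8 / 1e9


def quantize_rtn(weights, bits):
    """Uniform round-to-nearest quantization of one list of floats.

    Returns integer codes plus the (scale, offset) needed to decode them.
    """
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


for bits in (16, 4, 3):
    print(bits, "bits:", weight_memory_gb(175e9, bits), "GB")

row = [-0.9, -0.2, 0.0, 0.35, 1.1]  # made-up weights for illustration
codes, scale, lo = quantize_rtn(row, bits=4)
restored = dequantize(codes, scale, lo)
print(codes, restored)
```

At 16 bits the weights alone need 350 GB, while at 3 bits they need about 66 GB, which is why aggressive quantization changes how many accelerators a model of this size requires.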
Interaction-Centric AI
Juho Kim, Invited talk at NeurIPS 2022
Aligning Models
with User Intent
We study the problem of
retrieval with instructions,
where users of a retrieval
system explicitly describe
their intent along with
their queries, making the
system task-aware.
Task-Aware Retrieval with Instructions -
Asai et al. 2022
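A toy illustration of retrieval with instructions: the same query returns a different document depending on the stated intent. This sketch assumes the instruction is simply concatenated with the query (the [SEP] separator is an assumption) and uses a lexical overlap score as a stand-in for a learned encoder, so it shows the interface rather than the TART model itself. The documents and instructions are made up.

```python
import math
import re


def tokens(text):
    """Lowercase word set; a crude substitute for a neural encoder."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(a, b):
    """Set-based cosine similarity between two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / math.sqrt(len(ta) * len(tb))


def retrieve(instruction, query, docs):
    """Rank documents against the instruction and the query taken together."""
    task_aware_query = f"{instruction} [SEP] {query}"
    return max(docs, key=lambda name: similarity(task_aware_query, docs[name]))


docs = {
    "code": "python function that adds two numbers: def add(a, b): return a + b",
    "forum": "forum question: how do I add two numbers together?",
}
query = "add two numbers"
print(retrieve("retrieve a python function", query, docs))
print(retrieve("retrieve a forum question", query, docs))
```

The query text never changes; only the instruction does, and that is enough to flip which document ranks first.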
Intent, domain, unit
Training TART Dual
Training TART Full
Results
Instruction Ablation
Weaviate Air Flashback!
Promptagator & InPars (Bonifacio et al. 2022)
Vs.
TART (Task-aware ReTriever)
The majority of the popular IR
benchmarks today focus
exclusively on downstream task
accuracy and thus conceal the
costs incurred by systems that
trade away efficiency for quality
Moving Beyond Downstream Task
Accuracy for Information Retrieval
Benchmarking - Santhanam et al. 2022
We unify different
multi-vector retrieval models
from a token routing
viewpoint and propose
conditional token interaction
via dynamic lexical routing
CITADEL: Conditional Token Interaction via
Dynamic Lexical Routing for Efficient and
Effective Multi-Vector Retrieval - Li et al. 2022
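The idea of conditional token interaction can be sketched as follows: each token vector is assigned to a lexical key, and a query vector is only compared with document vectors that share its key. In this sketch the routes are written by hand, whereas the paper learns them, so treat it as an illustration of the scoring pattern and not as the CITADEL model. The keys, vectors, and function names are made up.

```python
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_routed_index(doc_tokens):
    """Group a document's token vectors by the key each was routed to."""
    index = defaultdict(list)
    for key, vec in doc_tokens:
        index[key].append(vec)
    return index


def routed_score(query_tokens, doc_index):
    """Sum, over query tokens, the best match among same-key document vectors."""
    score = 0.0
    for key, q_vec in query_tokens:
        candidates = doc_index.get(key)
        if candidates:  # query tokens with no matching key contribute nothing
            score += max(dot(q_vec, d_vec) for d_vec in candidates)
    return score


def all_pairs_score(query_tokens, doc_tokens):
    """Reference: every query vector is compared with every document vector."""
    return sum(max(dot(q, d) for _, d in doc_tokens) for _, q in query_tokens)


# Hand-assigned routes: (key, vector) pairs.
doc = [("search", [1.0, 0.0]), ("vector", [0.0, 1.0]), ("vector", [0.6, 0.8])]
query = [("vector", [0.0, 1.0]), ("graph", [1.0, 0.0])]
index = build_routed_index(doc)
print(routed_score(query, index))
print(all_pairs_score(query, doc))
```

On this toy input the routed scorer evaluates two dot products where the all-pairs scorer evaluates six, which is the source of the efficiency gain; the two scores differ because the unmatched "graph" token is skipped.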
Sparsely Activated Router Design
When the query data is drawn
from a different distribution -
e.g., when index represents
image embeddings and query
represents textual embeddings -
such algorithms [DiskANN,
FAISS-IVF, HNSW] lose much of
this performance advantage.
OOD-DiskANN: Efficient and Scalable Graph
ANNS for Out-of-Distribution Queries - Jaiswal
et al. 2022
Improvements
● RobustVAMANA - add a small set (1-2%) of OOD queries to graph optimization
● Query-Aware Product Quantization - Don’t completely understand this yet, something about
adding pivots that are optimized with gradient descent in addition to centroid representations
● ParallelGorder - In greedy graph traversal, the out neighbors are typically accessed via
random access leading to large cache miss rates -- graph reordering methods place
information regarding nodes likely to be referenced together on the same or adjacent
cache lines.
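The graph reordering point can be illustrated with a much simpler relabeling than the one in the paper. The sketch below renumbers nodes in breadth-first order, a plain stand-in that is not the ParallelGorder algorithm, and measures the mean difference between the ids of connected nodes. It assumes node data is stored contiguously by id, so a smaller id gap is a rough proxy for neighbors sitting closer together in memory. The toy graph is made up.

```python
from collections import deque


def bfs_order(graph):
    """Return node ids in breadth-first visiting order, covering all components."""
    seen, order = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return order


def relabel(graph, order):
    """Renumber nodes so that the i-th visited node gets id i."""
    new_id = {old: new for new, old in enumerate(order)}
    return {new_id[u]: [new_id[v] for v in nbrs] for u, nbrs in graph.items()}


def mean_edge_gap(graph):
    """Average |u - v| over all out-edges; lower means neighbors are closer."""
    gaps = [abs(u - v) for u, nbrs in graph.items() for v in nbrs]
    return sum(gaps) / len(gaps)


# A path graph whose labels are scrambled; every neighbor is also a key.
graph = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4, 5], 5: [2]}
reordered = relabel(graph, bfs_order(graph))
print(mean_edge_gap(graph), mean_edge_gap(reordered))
```

The graph structure is unchanged by the relabeling; only the storage order differs, which is why reordering can reduce cache misses without affecting search results.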
Results - +40% latency, +15% recall
New in A.I.
1 - ChatGPT and Large Language Models
2 - Interaction-Centric AI
3 - Efficient and Performant Information Retrieval
4 - New ideas in Approximate Nearest Neighbor Search!
Let us know what
you thought was
interesting!
Please check out weaviate.io
to learn more about the
Weaviate Vector Search
Engine!

