Vector search, RAG, and large language models
AI-generated image using the prompt, "Visualize multidimensional vectors"

Large language models, or LLMs, can't be relied on to recall specific data they were trained on, so the way to make them work in the enterprise, where accuracy is paramount, is to feed them the right data at prompt time. This is called grounding the prompt, and it's done using retrieval-augmented generation, or RAG for short.

  • RAG relies on both keyword search (for structured data) and vector search (for unstructured data such as documents, call transcripts, videos, spreadsheets, etc.).
  • Unstructured data is also sometimes referred to as blobs, or Binary Large Objects (data in binary form that may or may not conform to a specific file format). An estimated 80% of enterprise data is unstructured.
  • Keyword search and vector search together are referred to as hybrid search (a minimal sketch follows this list). Hybrid search makes AI systems like Einstein Copilot very powerful in their ability to understand, generate outputs, and automate actions across a wide variety of use cases, contexts, and data/content types.
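
To make hybrid search concrete, here is a minimal sketch in plain Python that blends a keyword-match score with a vector-similarity score. The documents, the toy three-dimensional "embeddings," and the 50/50 weighting are illustrative assumptions; a real system would use a proper keyword engine (such as BM25) and a real embeddings model.

```python
# Minimal hybrid-search sketch: blend keyword relevance with vector similarity.
# Everything here (corpus, toy vectors, alpha weighting) is illustrative only.

import math

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 if either vector is empty.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, doc):
    # Crude stand-in for keyword search: fraction of query terms found in the doc.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5):
    # Weighted blend of the two signals; alpha balances keyword vs. vector relevance.
    scored = []
    for text, vec in docs:
        score = alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, text))
    return sorted(scored, reverse=True)

# Toy example: 3-dimensional "embeddings" standing in for real ones.
docs = [
    ("quarterly sales call transcript", [0.9, 0.1, 0.2]),
    ("employee onboarding video",       [0.1, 0.8, 0.3]),
]
print(hybrid_search("sales transcript", [0.85, 0.15, 0.2], docs))
```

A weighted sum is one common way to fuse the two signals; reciprocal rank fusion is another popular choice.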

Why vectors? Unstructured data can't be stored in rows and columns in a relational database. It requires a different approach than SQL (or the Salesforce equivalents, SOQL and SOSL).

  • Vectors are an efficient way of representing unstructured data. This matters both for quickly indexing and performing similarity search (also known as semantic search) against prompts, and for efficiently passing large amounts of data into LLMs given their limited context windows.
  • Unstructured data requires much more storage and has traditionally been difficult and slow to analyze or search. Enter LLMs, which are very good at identifying the most important, defining attributes of a data blob; those attributes become the vector's dimensions, and all other dimensions are collapsed or ignored.
  • A smaller LLM dedicated to vectorizing unstructured data, called an embeddings model, is used to create the vectors. The embeddings model is different from the LLM that's used to generate outputs (into which the vectors are passed); see the sketch after this list.
  • Vector embeddings aren't new; Google Search has used embeddings for years. But LLMs make vector embeddings both more powerful and mission-critical for AI applications.
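
Here is a sketch of the embeddings-model role described above, assuming the open-source sentence-transformers library and its all-MiniLM-L6-v2 model (both assumptions; any embeddings model plays the same part). Note that this model only vectorizes text: it is separate from the LLM that would generate answers.

```python
# A small, dedicated embeddings model vectorizes unstructured text chunks,
# then semantic search ranks them by cosine similarity to a query.
# Assumes the sentence-transformers library; model choice is illustrative.

from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # embeddings model, not a generator

chunks = [
    "Refund requests must be filed within 30 days of purchase.",
    "Our support line is open Monday through Friday, 9am to 5pm.",
]
chunk_vecs = embedder.encode(chunks, convert_to_tensor=True)

query_vec = embedder.encode(
    "How long do customers have to ask for a refund?", convert_to_tensor=True
)

# Semantic (similarity) search: rank chunks by cosine similarity to the query.
scores = util.cos_sim(query_vec, chunk_vecs)[0]
best = scores.argmax().item()
print(chunks[best], float(scores[best]))
```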

In the coming months and years, every organization and even individuals will need vector databases in order to overcome the limitations of LLMs -- including limited context windows, knowledge cutoff dates, and hallucinations -- and effectively utilize generative AI.
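
As a closing illustration, here is a minimal sketch of the grounding step itself: retrieved chunks are packed into the prompt under a context-window budget before being sent to a generation LLM. The retrieve and llm_complete names are hypothetical stand-ins, not a specific product's API.

```python
# Minimal RAG grounding sketch: fit retrieved context into a prompt budget.
# `retrieve` and `llm_complete` below are hypothetical placeholders.

def build_grounded_prompt(question, retrieved_chunks, max_chars=4000):
    # Respect the LLM's limited context window by budgeting the context size.
    context, used = [], 0
    for chunk in retrieved_chunks:
        if used + len(chunk) > max_chars:
            break
        context.append(chunk)
        used += len(chunk)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        "context, say you don't know.\n\n"
        "Context:\n" + "\n---\n".join(context) +
        f"\n\nQuestion: {question}\nAnswer:"
    )

# Usage (with the hypothetical retrieval and generation functions):
# chunks = retrieve("How long do customers have to ask for a refund?", k=5)
# prompt = build_grounded_prompt("How long do customers have to ask for a refund?", chunks)
# answer = llm_complete(prompt)
```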


Mudit Agarwal

Head of Business Technology & Automation Engineering at BILL

1y

Clara, Nice! Thanks for sharing!


Vernon Keenan

Transforming Business with Ethical AI 🤖 WorkDifferentWithAI.com/sign-up 🌐 Sr. Industry Analyst at SalesforceDevops.net

1y

Once again, Salesforce leads the enterprise AI race by integrating RAG into its prompt architecture. RAG is all the rage. Using RAG lets you do the thing many people are demanding: "How do I use ChatGPT with documents from my company?" Both Microsoft and Amazon spent December explaining how they are integrating RAG into their enterprise cloud architectures. And the new GPTs, available from OpenAI, also accomplish roughly the same thing. That's why RAG appears to be the number one orchestration pattern in enterprise AI today.
