Vector search, RAG, and large language models
AI-generated image using the prompt, "Visualize multidimensional vectors"

Large language models, or LLMs, can't be relied on to recall specific data they were trained on, so the way to make them work in the enterprise, where accuracy is paramount, is to feed them the right data at prompt time. This is called grounding the prompt, and it's done using retrieval-augmented generation, or RAG for short.

  • RAG relies on both keyword search (for structured data) and vector search (for unstructured data such as documents, call transcripts, videos, spreadsheets, etc.).
  • Unstructured data is also sometimes referred to as blobs, or Binary Large Objects (data in binary form that may or may not conform to a specific file format). An estimated 80% of enterprise data is unstructured.
  • Keyword search and vector search together are referred to as hybrid search (a minimal sketch follows this list). Hybrid search makes AI systems like Einstein Copilot very powerful in their ability to understand, generate outputs, and automate actions across a wide variety of use cases, contexts, and data/content types.
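
To make hybrid search concrete, here is a minimal sketch in plain Python that blends a keyword-match score with a vector-similarity score. The documents, the toy three-dimensional "embeddings," and the 50/50 weighting are illustrative assumptions; a real system would use a proper keyword engine (such as BM25) and a real embeddings model.

```python
# Minimal hybrid-search sketch: blend keyword relevance with vector similarity.
# Everything here (corpus, toy vectors, alpha weighting) is illustrative only.

import math

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 if either vector is empty.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, doc):
    # Crude stand-in for keyword search: fraction of query terms found in the doc.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5):
    # Weighted blend of the two signals; alpha balances keyword vs. vector relevance.
    scored = []
    for text, vec in docs:
        score = alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, text))
    return sorted(scored, reverse=True)

# Toy example: 3-dimensional "embeddings" standing in for real ones.
docs = [
    ("quarterly sales call transcript", [0.9, 0.1, 0.2]),
    ("employee onboarding video",       [0.1, 0.8, 0.3]),
]
print(hybrid_search("sales transcript", [0.85, 0.15, 0.2], docs))
```

A weighted sum is one common way to fuse the two signals; reciprocal rank fusion is another popular choice.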

Why vectors? Unstructured data can't be stored in rows and columns in a relational database. It requires a different approach than SQL (or the Salesforce equivalents, SOQL and SOSL).

  • Vectors are an efficient way of representing unstructured data. This matters both for quickly indexing and performing similarity search (also known as semantic search) against prompts, and for efficiently passing large amounts of data into LLMs given their limited context windows.
  • Unstructured data requires much more storage and has traditionally been difficult and slow to analyze or search. Enter LLMs, which are very good at identifying the most important, defining attributes of a data blob; those attributes become the vector's dimensions, and all other dimensions are collapsed or ignored.
  • A smaller LLM dedicated to vectorizing unstructured data, called an embeddings model, is used to create the vectors. The embeddings model is different from the LLM that's used to generate outputs (into which the vectors are passed); see the sketch after this list.
  • Vector embeddings aren't new; Google Search has used embeddings for years. But LLMs make vector embeddings both more powerful and mission-critical for AI applications.
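
Here is a sketch of the embeddings-model role described above, assuming the open-source sentence-transformers library and its all-MiniLM-L6-v2 model (both assumptions; any embeddings model plays the same part). Note that this model only vectorizes text: it is separate from the LLM that would generate answers.

```python
# A small, dedicated embeddings model vectorizes unstructured text chunks,
# then semantic search ranks them by cosine similarity to a query.
# Assumes the sentence-transformers library; model choice is illustrative.

from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # embeddings model, not a generator

chunks = [
    "Refund requests must be filed within 30 days of purchase.",
    "Our support line is open Monday through Friday, 9am to 5pm.",
]
chunk_vecs = embedder.encode(chunks, convert_to_tensor=True)

query_vec = embedder.encode(
    "How long do customers have to ask for a refund?", convert_to_tensor=True
)

# Semantic (similarity) search: rank chunks by cosine similarity to the query.
scores = util.cos_sim(query_vec, chunk_vecs)[0]
best = scores.argmax().item()
print(chunks[best], float(scores[best]))
```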

In the coming months and years, every organization and even individuals will need vector databases in order to overcome the limitations of LLMs -- including limited context windows, knowledge cutoff dates, and hallucinations -- and effectively utilize generative AI.
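
As a closing illustration, here is a minimal sketch of the grounding step itself: retrieved chunks are packed into the prompt under a context-window budget before being sent to a generation LLM. The retrieve and llm_complete names are hypothetical stand-ins, not a specific product's API.

```python
# Minimal RAG grounding sketch: fit retrieved context into a prompt budget.
# `retrieve` and `llm_complete` below are hypothetical placeholders.

def build_grounded_prompt(question, retrieved_chunks, max_chars=4000):
    # Respect the LLM's limited context window by budgeting the context size.
    context, used = [], 0
    for chunk in retrieved_chunks:
        if used + len(chunk) > max_chars:
            break
        context.append(chunk)
        used += len(chunk)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        "context, say you don't know.\n\n"
        "Context:\n" + "\n---\n".join(context) +
        f"\n\nQuestion: {question}\nAnswer:"
    )

# Usage (with the hypothetical retrieval and generation functions):
# chunks = retrieve("How long do customers have to ask for a refund?", k=5)
# prompt = build_grounded_prompt("How long do customers have to ask for a refund?", chunks)
# answer = llm_complete(prompt)
```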


Mudit Agarwal

Head of Business Technology & Automation Engineering at BILL

1y

Clara, Nice! Thanks for sharing!


Vernon Keenan

Transforming Business with Ethical AI 🤖 WorkDifferentWithAI.com/sign-up 🌐 Sr. Industry Analyst at SalesforceDevops.net

1y

Once again, Salesforce leads the enterprise AI race by integrating RAG into its prompt architecture. RAG is all the rage. Using RAG lets you do the thing many people are demanding: "How do I use ChatGPT with documents from my company?" Both Microsoft and Amazon spent December explaining how they are integrating RAG into their enterprise cloud architectures. And the new GPTs, available from OpenAI, also accomplish roughly the same thing. That's why RAG appears to be the number one orchestration pattern in enterprise AI today.
