From Hype to Reality: The RAG Technique That’s Powering Next-Gen AI — Part 1: The Basics
The AI world isn’t just evolving — it’s sprinting. And right at the center of this transformation is Retrieval-Augmented Generation (RAG). What started as a mouthful of academic theory is now the engine behind smart chatbots, industry-specific copilots, and AI assistants that actually know what they’re talking about.
In this 4-part series, I’m taking you on a hands-on journey through the world of RAG. We'll start with the basics — What even is RAG? — and build all the way up to real-world, production-grade systems with optimization tricks, smart context handling, and scale-ready architecture.
Oh, and yes — there will be code. Lots of it.
Let’s kick off with Part 1: building a basic RAG pipeline with a CSV file as your knowledge base.
What Is RAG, Really?
At its core, RAG combines two steps:
1. Retrieval: fetch the documents most relevant to the user’s question from your own knowledge base.
2. Generation: pass those documents to an LLM as context so it can craft a grounded answer.
This means instead of relying solely on a model’s existing training data (global knowledge), you’re giving it fresh, trusted, domain-specific context, right when it needs it.
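To make that concrete, here’s a tiny conceptual sketch in Python. Everything in it (search_index, llm, answer_with_rag) is a hypothetical placeholder to illustrate the flow, not a real library API:
# Conceptual sketch of RAG; names here are hypothetical placeholders
def answer_with_rag(question: str) -> str:
    # Step 1: Retrieval. Look up the documents most relevant to the question.
    relevant_docs = search_index.similarity_search(question, k=3)
    # Step 2: Generation. Hand those documents to the LLM as context.
    context = "\n".join(doc.page_content for doc in relevant_docs)
    prompt = f"Use this context to answer.\nContext:\n{context}\nQuestion: {question}"
    return llm.predict(prompt)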
What We'll Build in Part 1
A minimal end-to-end RAG pipeline: a CSV of IT helpdesk articles as the knowledge base, OpenAI embeddings indexed in FAISS for retrieval, and GPT-3.5-Turbo to generate grounded answers.
Step 1: Create a Knowledge Base
Here’s a small dataset of IT helpdesk articles:
id,title,content
1,Resetting your password,To reset your password, go to the login page and click on "Forgot Password"...
2,Installing VPN,Download the VPN client from the portal and follow the installation guide for your OS...
3,Email not syncing,Check if your device is connected to the internet. Then, go to settings > mail > accounts...
4,Two-factor authentication setup,Go to your profile settings and enable 2FA. Use Google Authenticator or any TOTP app...
5,Accessing company intranet,Use the VPN and navigate to intranet.company.com. Login with your AD credentials...
Save this file as knowledge_base.csv.
Step 2: Install Required Libraries
pip install langchain openai faiss-cpu pandas
The above is a one-time install, so comment it out if you’re using a notebook or Colab to save some energy and time ;-). BTW, don’t forget to set your OPENAI_API_KEY in your environment. If you don’t know how (which I doubt), leave a comment and I’ll put up a small step-by-step guide.
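If you’re in a notebook and would rather set the key inline, here’s one quick way using only Python’s standard library (getpass keeps the key out of the saved notebook):
import os
from getpass import getpass
# Prompt for the key so it never gets stored in the notebook itself
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")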
Step 3: Load & Embed Documents
import pandas as pd
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.schema import Document
from langchain.text_splitter import CharacterTextSplitter
# Load CSV
df = pd.read_csv("knowledge_base.csv")
# Convert rows to LangChain documents
# Note: the sources chain we use later expects a "source" key in metadata,
# so we add one alongside the title
docs = [
    Document(
        page_content=row["content"],
        metadata={"title": row["title"], "source": row["title"]},
    )
    for _, row in df.iterrows()
]
# Split into chunks (optional for small docs)
splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = splitter.split_documents(docs)
# Create a vector store
embedding = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embedding)
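Before wiring up the full chain, it’s worth sanity-checking retrieval on its own. Here’s a quick probe using the vector store’s similarity_search (the query string is just an example):
# Sanity check: does the index surface the right article?
hits = vectorstore.similarity_search("How do I turn on 2FA?", k=2)
for doc in hits:
    print(doc.metadata["title"], "->", doc.page_content[:60])
# Expect "Two-factor authentication setup" at or near the top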
Step 4: Set Up the RAG Chain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQAWithSourcesChain
# LLM Setup
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
# Custom prompt template
custom_prompt = PromptTemplate(
    # The "stuff" qa-with-sources chain injects retrieved context under the
    # variable name "summaries" (not "context"), so we must match that name
    input_variables=["summaries", "question"],
    template="""
You are a helpful IT assistant.
Use the following context to answer the question.
If you don't know the answer, just say you don't know.
Context:
{summaries}
Question:
{question}
Answer:"""
)
# Setup the RAG chain with debug-friendly output
qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    chain_type="stuff",
    return_source_documents=True,
    chain_type_kwargs={"prompt": custom_prompt}
)
What’s Happening Here?
We’re creating a simple RAG pipeline using LangChain’s RetrievalQAWithSourcesChain. I won’t walk through every internal call, but in general this is what happens: the retriever pulls the most semantically similar chunks from FAISS, the chain stuffs them into the prompt’s {summaries} slot, and the LLM generates an answer grounded only in that context.
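If you’re wondering what chain_type="stuff" actually means, it roughly boils down to the snippet below. This is a simplified sketch of the idea, not LangChain’s real internals:
# The "stuff" strategy, in spirit: cram all retrieved chunks into one prompt
query = "How do I access the intranet?"
docs = vectorstore.as_retriever().get_relevant_documents(query)
summaries = "\n\n".join(d.page_content for d in docs)
prompt_text = custom_prompt.format(summaries=summaries, question=query)
answer = llm.predict(prompt_text)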
Step 5: Ask a Question and See the Magic
query = "How do I access the intranet?"
result = qa_chain(query)
# Print context used + final answer
print("=== Retrieved Context ===")
for doc in result['source_documents']:
    print(f"[{doc.metadata['title']}]: {doc.page_content}\n")
print("=== Final Answer ===")
print(result['answer'])
Sample Output:
=== Retrieved Context ===
[Accessing company intranet]: Use the VPN and navigate to intranet.company.com. Login with your AD credentials...
=== Final Answer ===
To access the intranet, use the VPN and go to intranet.company.com. Then log in using your Active Directory credentials.
How It Works: A Visual Breakdown
+------------------------------+
|       User Query Input       |
| "How do I access intranet?"  |
+--------------+---------------+
|
v
+-----------------------------+
| Retriever (FAISS) |
| Search knowledge base for |
| semantically similar |
| content |
+-----------------------------+
|
v
Retrieved Context (e.g. from knowledge_base.csv):
"Use the VPN and navigate to intranet.company.com.
Login with your AD credentials..."
|
v
+----------------------------------------------+
| Prompt sent to LLM (GPT-3.5-Turbo) |
| "Answer the question using the context below: |
| Context: [retrieved content] |
| Question: How do I access intranet?" |
+----------------------------------------------+
|
v
+----------------------------+
| Generated Answer |
| "To access the intranet, |
| use the VPN and go to |
| intranet.company.com..." |
+----------------------------+
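One practical note before wrapping up: the FAISS index above lives only in memory. LangChain’s FAISS wrapper can persist it, so you don’t pay to re-embed the CSV on every run (the folder name below is arbitrary):
# Save the index to disk once...
vectorstore.save_local("faiss_index")
# ...then reload it later with the same embedding model
vectorstore = FAISS.load_local("faiss_index", embedding)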
Coming Up in Part 2
Next, we’ll take this further by adding:
- Smarter chunking and context handling
- Retrieval and prompt optimization tricks
- A more scale-ready architecture
And by Part 4, we’ll be building a hybrid RAG agent with memory, observability, and failover logic.
#RAGpipeline #GenerativeAI #LangChain #VectorSearch #LLMops #OpenAI #AIinProduction #MachineLearning #AIApplications #PromptEngineering
Follow me to stay updated. Part 2 drops soon!
Questions or feedback? Drop a comment — let’s build smarter AI together.