From Zero To SaaS #44 RAG


"The AI world is moving so fast that, as an antidote, I have chosen to focus on fundamentals," said Rik Boere, a top-notch agent builder.

One of these fundamentals is data management; no matter how advanced AI becomes, it always stores information somewhere. 

A deep dive into databases showed me that:

  • A database is not a larger version of a Google Sheet.
  • RAG has its limitations.
  • LLMs don’t understand language.

In this week's edition, I will walk you through the fundamentals of data as preparation for diving into RAG.

The Fundamentals of Digital Information

Data itself is just a bunch of 0s and 1s, but every value is stored as a type. Analogy: types are the words on a page.

  • String: a sequence of characters ("hello!", '42', "John_Doe").
  • Float: a number with decimals (3.14, -0.001, 2.0).
  • Int: a whole number without a decimal part (-10, 0, 42).
  • Boolean: a value that is either true or false.
  • List: a collection of items ([1, 2, 3], ["a", "b", "c"]).
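To make the list above concrete, here is a minimal Python sketch of the same types. The variable names and the `type_name` helper are made up for illustration; other languages name these types slightly differently.

```python
# Each value has a type that tells the program how to interpret the underlying bits.
name = "John_Doe"      # str: a sequence of characters
text_number = "42"     # still a str, because of the quotes
pi = 3.14              # float: a number with decimals
answer = 42            # int: a whole number
is_active = True       # bool: either True or False
scores = [1, 2, 3]     # list: a collection of items


def type_name(value):
    """Return the name of a value's type, for example 'str' or 'int'."""
    return type(value).__name__


for value in (name, text_number, pi, answer, is_active, scores):
    print(repr(value), "is a", type_name(value))

# The type decides what an operation means.
print(text_number + text_number)  # strings are glued together
print(answer + answer)            # integers are added up
```

Note how '42' in quotes behaves like text, while 42 without quotes behaves like a number.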

To read this data, the code that processes it relies on formats to recognize the structure.

Just as we combine words into sentences to connect concepts, we use formats to combine values into more complex structures.

  • JSON: web APIs, configuration files, data storage.

{
 "name": "Jasper",
 "age": 30
}        

  • XML: legacy systems, SOAP APIs, document storage (e.g., Office files).

<user>
  <name>Jasper</name>
  <age>30</age>
</user>        

  • CSV: spreadsheets, tabular data, data import/export.

name,age
Jasper,30        

  • YAML: config files (e.g., Docker Compose, GitHub Actions), infrastructure-as-code.

name: Jasper
age: 30        

  • HTML: web content, templating, and rendering structured documents.

<h1> Jasper Data </h1>
<p>Name: Jasper</p>
<p>Age: 30</p>        
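The formats above all describe the same record. As a rough sketch, this is how one Python dictionary could be written out as JSON, CSV, and XML using only the standard library. The helper names `to_json`, `to_csv`, and `to_xml` are made up for this example, and YAML is left out because it needs a third-party package.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

record = {"name": "Jasper", "age": 30}


def to_json(rec):
    """Serialize the record as a JSON object."""
    return json.dumps(rec)


def to_csv(rec):
    """Serialize the record as a header line plus one data line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buffer.getvalue()


def to_xml(rec):
    """Serialize the record as a <user> element with one child per field."""
    user = ET.Element("user")
    for key, value in rec.items():
        ET.SubElement(user, key).text = str(value)
    return ET.tostring(user, encoding="unicode")


print(to_json(record))
print(to_csv(record))
print(to_xml(record))

# JSON keeps the types: reading it back returns the original record.
restored = json.loads(to_json(record))

# CSV is plain text: the age comes back as the string "30", not the number 30.
csv_row = next(csv.DictReader(io.StringIO(to_csv(record))))
```

The format you pick therefore also decides how much of the type information survives the round trip.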

So, we write words and form sentences, but to form a book, you need to add pages.

Each page has a page number, which in the world of digital information serves as the ID of the dataset.

This is the level where databases come in.

If a spreadsheet is a page, then what is the book?

Think of a database as gigantic rows stacked on top of each other, like an apartment building with 1,000 floors and 40 apartments on each floor.
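As a small illustration of looking something up by its ID, here is a sketch using SQLite, which ships with Python. The table name, the columns, and the rows other than Jasper are invented sample data, and the ID here identifies a single row, which is only one way to read the page-number analogy.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (id, name, age) VALUES (?, ?, ?)",
    [(1, "Jasper", 30), (2, "Alex", 41), (3, "Sam", 25)],  # hypothetical rows
)


def find_user(connection, user_id):
    """Return (name, age) for the given ID, or None when the ID does not exist."""
    cursor = connection.execute(
        "SELECT name, age FROM users WHERE id = ?", (user_id,)
    )
    return cursor.fetchone()


print(find_user(conn, 1))
print(find_user(conn, 99))
```

Unlike scrolling through a sheet, the database jumps straight to the row with that ID, which is what keeps lookups fast when the "building" has many floors.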



The LLM Database

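RAG stands for Retrieval-Augmented Generation, a term that describes the process of retrieving relevant pieces of your data and adding them to the prompt, so the LLM can use them when it generates an answer. To make that retrieval possible, the data is first transformed into numbers.

When you type text into ChatGPT, it is split into tokens, which are transformed into numbers and fed to the model, a neural network. The model then performs calculations and outputs numbers, which are converted back into text.

When setting up a RAG database, you use an embedding model, a type of model that transforms text into vectors, or lists of numbers. These vectors are not translated back into text; the original text is stored next to each vector. Because every embedding model produces its own kind of numbers, you have to use that exact model again to embed the question, otherwise the vectors cannot be compared.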

When you use RAG, you improve two processes:

A) You use fewer tokens, because only the most relevant pieces of text are retrieved and sent to the LLM instead of all your data.

B) You add context that helps the LLM determine which concepts should be linked to each other.
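The retrieval idea can be sketched in a few lines of Python. This is a toy under loud assumptions: `embed` simply counts words and stands in for a real embedding model, the three chunks are invented, and a real setup would store the vectors in a vector database. What it does show is the flow: embed the chunks, embed the question with the same function, pick the closest chunk, and add only that chunk to the prompt.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a vector of word counts."""
    for mark in ".,?!":
        text = text.replace(mark, " ")
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Similarity between two vectors: higher means pointing in a more similar direction."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical knowledge base, embedded once and stored next to the text.
CHUNKS = [
    "Refunds are possible within 30 days of purchase.",
    "Our office is open from Monday to Friday.",
    "Invoices are sent by email at the start of each month.",
]
INDEX = [(chunk, embed(chunk)) for chunk in CHUNKS]


def retrieve(question, top_k=1):
    """Return the top_k chunks whose vectors are closest to the question's vector."""
    query_vector = embed(question)  # same "model" as used for the chunks
    ranked = sorted(
        INDEX,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, top_k=1):
    """Put only the retrieved chunks in front of the question."""
    context = "\n".join(retrieve(question, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("When is the office open?"))
```

Word counts only match identical words. A real embedding model also places related words and phrasings close together, which is where the semantic part of the search comes from.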

In which scenarios should RAG not be used?

  • If the answer needs a different format or more accuracy than the standard model provides, use fine-tuning instead: you retrain part of the model's weights to decrease the likelihood of hallucination in particular areas, such as government administration, law, finance, and medical records.
  • As LLMs are based on text, I do not recommend moving table-formatted data, such as your CRM, ATS, lead generation, and advertising campaign data, into RAG.
  • I also find it challenging to live-sync RAG data and optimize records. RAG is therefore excellent for a knowledge base assistant and for administration; for the platforms mentioned above, I prefer to use API calls.
  • You cannot run SQL queries in the RAG database, but you can do semantic search: even with typos or loosely related words, it finds information that points in the direction of what you are looking for.
  • In some cases, you can get away with prompt caching, where part of the input is stored, so the model does not have to process all of that data from scratch each time.



My RAG protocol


1. Web scraping or finding all the relevant files

I use Firecrawl to crawl the entire website, and then Firecrawl again to get the Markdown of every page.

2. Data Cleaning 

Then I create a loop that wraps each page with context at the beginning and at the end, which enhances retrieval.

If you have multiple documents, it is best to convert the PDFs and other files into TXT files and then combine all the TXT files into one.
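A minimal sketch of such a cleaning loop is shown below. The marker text, the field names `url`, `title`, and `body`, and the two sample pages are all assumptions made for this example; the actual context you add depends on your data.

```python
def wrap_page(url, title, body):
    """Add a context line before and after the page text."""
    header = f"[START OF PAGE] Title: {title} | Source: {url}"
    footer = f"[END OF PAGE] Title: {title}"
    return f"{header}\n{body.strip()}\n{footer}"


def combine_pages(pages):
    """Wrap every page and join them into one text, separated by blank lines."""
    return "\n\n".join(
        wrap_page(page["url"], page["title"], page["body"]) for page in pages
    )


# Hypothetical scraped pages; in practice these come from the crawl or the TXT files.
pages = [
    {"url": "https://example.com/pricing", "title": "Pricing", "body": "Plans start with a free tier.\n"},
    {"url": "https://example.com/faq", "title": "FAQ", "body": "You can cancel at any time.\n"},
]

combined = combine_pages(pages)
print(combined)
```

Because the context sits at both ends of a page, a chunk cut from the start or the end of that page still carries a hint about where it came from.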

3. Vectorisation Setup

I ran a test on https://platform.vectorize.io/ to find the settings for vectorisation.



Then set up a vector database.

Pinecone is the easiest to set up, but you have less control over it.

Supabase is a bit more challenging to set up, but you have more control and can see how each item is stored, which is why I recommend it if you are setting up RAG for the first time.

4. Vectorisation Settings

I set up the n8n settings.

You always have to A) select the right embedding model, B) set the dimensions, C) set the chunk size, and D) set the overlap.

Then, I execute the workflow, and at the end, I check the DB provider to confirm that the records are there.
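To show what chunk size and overlap do, here is a simplified sketch of a splitter. It counts characters and cuts at fixed positions; this is an assumption for illustration only, since the splitters in n8n and similar tools may count tokens or prefer to cut at paragraph and sentence boundaries. The function name `chunk_text` is made up.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into pieces of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in chunk_text(sample, chunk_size=10, overlap=3):
    print(chunk)
```

With a chunk size of 10 and an overlap of 3, every chunk starts with the last 3 characters of the previous one, so a sentence that falls on a boundary is not lost between two chunks.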

5. Setting up the retrieval agent

I ensure that the retrieval agent uses the correct embedding model, meaning the same one used for vectorisation, and has sufficient memory to process the data from the Pinecone tool; I usually set it to 5.



6. Using ChatGPT to create the agent prompt

I always start with the prompt: 'You are a prompt engineering genius. You need to make a prompt for a RAG agent, which uses the X tool to receive the data.'

You help the user to do X.

Start with the mission.

Then, describe how to use the tool.

Give an example of the input and output.

7. Optimise the agent prompt

Then I ask: 'If this prompt is an 8, what do you need to make it a 10?'

For those who are not into reading, I am filming a video on how to RAG, which will be available either next week or the week after.

Happy Building!


