LAI #86: LLM Gaps, Agent Design, and Smarter Semantic Caching
Open-source eval tools, ReAct-based agents, vector databases, and why model weaknesses are often where the best tools get built.
Good morning, AI enthusiasts,
This week’s issue focuses on a recurring theme: most breakthroughs don’t happen in spite of model limitations; they happen because of them. In What’s AI, we break down key LLM weaknesses (reasoning, memory, retrieval) and explore how smart tooling, such as ReAct agents, structured search, and caching layers, turns those gaps into new capabilities.
We also highlight a fully open-source eval toolkit from the community, a tutorial on building fast local agents, a deep dive on vector DBs, and a PhD-led research survey on how real people interact with AI. If you’ve ever found yourself duct-taping fixes around your favorite model, this issue is for you.
Let’s get into it.
What’s AI Weekly
This week in What’s AI, I dive into something we have been discussing since the rise of LLMs: their limitations. This time, though, I don’t just highlight these weaknesses; I also walk you through what you can actually do about them, and how each of these gaps is an opportunity for anyone who wants to build on top of LLMs. Read the article to find out how you can make LLMs more reliable, or watch the video on YouTube.
— Louis-François Bouchard, Towards AI Co-founder & Head of Community
Learn AI Together Community Section!
Featured Community post from the Discord
Cyber.crat2711 has created an evaluation stack designed to help GenAI teams measure and optimize their LLM pipelines with minimal overhead. It ships with dozens of evaluation templates covering safety, summarization, retrieval, behavior, and structure, plus a wide spectrum of metrics across text, image, and audio modalities. Check it out on GitHub and support a fellow community member. If you have any questions or feedback, share them in the thread!
AI poll of the week!
This week, we’re doing something a little different. We’re helping out a community member who’s researching how people interact with AI. She's running a short, anonymous survey (just 6 minutes) to better understand user expectations of generative AI.
If you:
…you’re exactly who she’s looking for. Start the survey here!
Collaboration Opportunities
The Learn AI Together Discord community is full of collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too; we share cool opportunities every week!
1. Superuser666_sigil began work on an AI project called SigilDERG and has landed a Lambda AI Research Grant to take the next step: bootstrapping a training Codex starting with the Rust ecosystem. Currently, he is looking for people who can help with crate analysis / Rust OSS surfacing, metadata scraping & enrichment, building structured AI training sets (Rust-focused), pipeline logic (Python, async, llama.cpp, ONNX), and model evaluation/scoring logic (or even rule-based IRL scaffolds). If this sounds interesting, contact him in the thread!
2. Teddybrown117_45661 has built several automation projects, including a voice agent for WhatsApp, Facebook, and Instagram, a chatbot for patients, and a regular news scraper and reporter. He has more in the pipeline and is looking for collaborators. If this sounds like a relevant opportunity, reach out in the thread!
3. Llsmokell is new to AI and is looking for others who are also learning. If you are open to learning together, sharing resources, and exploring different areas of AI, feel free to connect in the thread!
Meme of the week!
Meme shared by superuser666_sigil
TAI Curated Section
Article of the week
For applications requiring efficiency and on-device deployment, Small Language Models (SLMs) present a compelling alternative to their larger counterparts. This analysis details the advantages of SLMs, such as lower operational costs and improved privacy, and covers the architectural and compression techniques used in their development, including pruning and knowledge distillation. It further argues that SLMs are well suited to agentic AI systems thanks to their economic benefits and task-specific performance, and presents a practical algorithm for migrating agentic systems from resource-intensive LLMs to more efficient, specialized SLMs.
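To make one of those compression techniques concrete, here is a minimal, self-contained PyTorch sketch of knowledge distillation. It is not the article’s code; the temperature and loss-weighting values are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Blend the soft-target KL loss (teacher) with standard cross-entropy (labels)."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # T^2 rescaling keeps gradient magnitudes comparable across temperatures
    kl = F.kl_div(soft_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1 - alpha) * ce

# Usage: compute the loss on a batch, then backprop through the student only
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)  # frozen teacher outputs in practice
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```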
Our must-read articles
This article presents a method for integrating Large Language Models (LLMs) and Graph Neural Networks (GNNs). The process starts with an LLM extracting entities and relationships from unstructured text to build a knowledge graph. This graph is then converted into a numerical, GNN-compatible format using libraries like PyTorch Geometric. A case study analyzing Wikipedia articles demonstrates how a GNN can be trained on this structure to predict links and generate embeddings. These embeddings capture complex semantic and structural patterns that are not apparent from text analysis alone, showing a way to enhance AI reasoning.
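As a rough illustration of the text-to-graph step, the sketch below turns a handful of LLM-extracted triples into a PyTorch Geometric Data object. The triples and random node features are placeholders, not the article’s dataset; in practice, the features would be text embeddings.

```python
import torch
from torch_geometric.data import Data

# Placeholder triples standing in for LLM-extracted entities and relations
triples = [("Ada Lovelace", "collaborated_with", "Charles Babbage"),
           ("Charles Babbage", "designed", "Analytical Engine")]

# Map each unique entity to an integer node id
entities = sorted({e for h, _, t in triples for e in (h, t)})
idx = {name: i for i, name in enumerate(entities)}

# Build the GNN-compatible edge list (shape: 2 x num_edges)
edge_index = torch.tensor(
    [[idx[h] for h, _, _ in triples],
     [idx[t] for _, _, t in triples]], dtype=torch.long)

# Random node features as a stand-in for real embeddings
x = torch.randn(len(entities), 16)

graph = Data(x=x, edge_index=edge_index)
print(graph)  # Data(x=[3, 16], edge_index=[2, 2])
```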
A practical guide for building ReAct agents with LangGraph demonstrates how to create AI workflows that merge reasoning with action. It explains how to structure a graph where nodes represent model calls or tool executions. The process starts with a simple model, then integrates a custom tool, and establishes a loop for the agent to decide when to act. To improve functionality, a feedback mechanism is added for more conversational responses, and memory is implemented using a checkpointer. This enables the agent to maintain context and recall information from previous interactions in a conversation.
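Here is a minimal sketch of that loop using LangGraph’s prebuilt ReAct helper, with a MemorySaver checkpointer for conversational memory; the model, tool, and thread ID below are illustrative assumptions rather than the article’s exact setup.

```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# The checkpointer lets the agent recall earlier turns in the same thread
agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"),
                           tools=[word_count],
                           checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "demo"}}
result = agent.invoke(
    {"messages": [("user", "How many words are in this sentence?")]}, config)
print(result["messages"][-1].content)
```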
3. Stop Wasting LLM Tokens: Build a Smart Semantic Cache with FAISS + HuggingFace By Sai Bhargav Rallapalli
To address latency and high costs in LLM applications, this article details the construction of a semantic cache using FAISS and HuggingFace. The technique avoids redundant API calls by storing previous query-answer pairs as vector embeddings, then uses similarity search to serve cached answers for new, semantically similar questions without hitting the LLM. The article provides a step-by-step implementation guide, including cache-expiration logic to keep data fresh, discusses suitable use cases such as chatbots and RAG systems, and weighs the benefits against potential drawbacks.
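The core cache logic fits in a few lines. In this sketch, the embedding model and the 0.9 similarity threshold are assumptions, and the article’s expiration logic is omitted for brevity.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
index = faiss.IndexFlatIP(384)  # inner product on unit vectors = cosine similarity
cached_answers: list[str] = []

def embed(text: str) -> np.ndarray:
    return np.asarray(
        embedder.encode([text], normalize_embeddings=True), dtype="float32")

def lookup(query: str, threshold: float = 0.9):
    """Return a cached answer if a semantically similar query was seen before."""
    if index.ntotal == 0:
        return None
    scores, ids = index.search(embed(query), 1)
    return cached_answers[ids[0][0]] if scores[0][0] >= threshold else None

def store(query: str, answer: str) -> None:
    index.add(embed(query))
    cached_answers.append(answer)

# Usage: check the cache first and only call the LLM on a miss
question = "What is semantic caching?"
answer = lookup(question)
if answer is None:
    answer = "..."  # placeholder for the real LLM API call
    store(question, answer)
```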
4. Building Smart Agents: LangGraph + Perplexity with Memory for Developers By Sai Bhargav Rallapalli
Leveraging Perplexity AI's Sonar models and the LangGraph library, this guide details how to build a smart agent with conversational memory. It explains the process of setting up a stateful graph using a MemorySaver to retain interaction history, managed by a unique user ID. The author presents the core logic, including a customizable system prompt to define agent behavior, and demonstrates how to expose this functionality through a FastAPI endpoint. The result is a functional, context-aware agent ready for interaction and further development, making it a practical tutorial for developers.
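A minimal sketch of that endpoint might look like the following; the model name, system prompt, and route shape are assumptions, not the author’s exact code.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from langchain_community.chat_models import ChatPerplexity
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

SYSTEM_PROMPT = "You are a concise research assistant."  # customizable behavior

# MemorySaver persists each conversation, keyed by thread_id
agent = create_react_agent(ChatPerplexity(model="sonar"), tools=[],
                           checkpointer=MemorySaver())

app = FastAPI()

class ChatRequest(BaseModel):
    user_id: str
    message: str

@app.post("/chat")
def chat(req: ChatRequest):
    # The user_id doubles as the LangGraph thread_id, so each user keeps
    # their own interaction history across requests
    config = {"configurable": {"thread_id": req.user_id}}
    result = agent.invoke(
        {"messages": [("system", SYSTEM_PROMPT), ("user", req.message)]}, config)
    return {"reply": result["messages"][-1].content}
```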
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.