LAI #89: The Rise of LLM Developers, Smarter Search Engines, and Multi-Agent Patterns
Also, community-built study assistants, multimodal retail AI, plus cost-efficient vLLM deployment.
Good morning, AI enthusiasts!
This week, we’re diving into the new role shaping AI development: the LLM Developer, who sits between software engineers and ML specialists at the point where prompts, fine-tuning, and system design come together. We also look at practical advances in multimodal retail search engines, dynamic AI for wealth planning, and real-world multi-agent architectures. On the community side, don’t miss the CLI-based study assistant, new collaboration opportunities, and our latest poll on where AI’s future impact will come from: giant models or compact, deploy-anywhere ones.
Plenty to explore, so let’s get into it!
What’s AI Weekly
This week in What’s AI, I talk about a new role that’s quietly emerging: the LLM Developer. Unlike software developers who code rules or ML engineers who train models, LLM developers live in between—working with foundation models through prompting, fine-tuning, and clever system design. It’s a mix of technical skill, business sense, and knowing how to manage an LLM like an “unreliable intern.” Read the article to see why this role is becoming so important, or watch the video on YouTube.
— Louis-François Bouchard, Towards AI Co-founder & Head of Community
Learn AI Together Community Section!
Featured Community post from the Discord
Heretolearn123_41806 built a CLI-based study assistant that makes learning a lot smoother. It can take automated notes, generate summaries from your material, and even retrieve knowledge on demand, pointing you back to the exact book, note, or timestamped video where the answer lives. Check it out on GitHub and support a fellow community member. If you have any questions or feedback, share them in the thread!
AI poll of the week!
It’s telling that 58% of people are more excited about sub-1B parameter models than about pushing LLMs into the hundreds of billions. The frontier race may still be about going bigger, but the community seems just as interested in smaller, cheaper, deploy-anywhere AI. So, do we keep chasing peak benchmark dominance with trillion-parameter behemoths? Or double down on compact models that can run on phones, browsers, and edge devices, where the real adoption battle will be won? Tell me in the thread!
Collaboration Opportunities
The Learn AI Together Discord community is full of collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too—we share cool opportunities every week!
1. Ainews_mythofsisyphus is building an AI video editor and is looking for a partner who can help with configuration, login, security, and related infrastructure. If you are based in the EU or US, connect in the thread!
2. .ghostvoices is working on an application that acts as Cluely for mobile and is looking for a team to take it beyond a demo. If you are interested, reach out in the thread!
3. Raphael_219 is working on a personal AI assistant called Ava and is looking for someone experienced with LangChain, tool return formatting, and debugging agents who can help finalize Ava’s tool/agent loop. If this sounds like something you would like, contact him in the thread!
Meme of the week!
Meme shared by bigbuxchungus
TAI Curated Section
Article of the week
This article details the construction of a multimodal AI search engine for online retail, aimed at improving product discovery. It explains the use of dense vector embeddings for semantic understanding of text and images, along with sparse vectors for precise keyword matching. It also covers metadata filtering for refining searches by specific attributes and reranking to improve result quality. A practical guide demonstrates implementation using a Shein dataset and the Qdrant vector database, including techniques for dynamically generating filters from user queries with an LLM or a fine-tuned NER model.
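To make the metadata-filtering idea concrete, here is a minimal sketch using the qdrant-client Python API. The collection name, payload fields, and embedding model are illustrative assumptions, not the article's actual Shein setup.

```python
from qdrant_client import QdrantClient, models
from sentence_transformers import SentenceTransformer

# Hypothetical collection name, payload fields, and encoder; the article's
# Shein dataset and embedding setup will differ.
client = QdrantClient(url="http://localhost:6333")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def search_products(query_text: str, category: str, max_price: float, top_k: int = 10):
    """Dense semantic search narrowed by metadata filters (category, price)."""
    query_vector = encoder.encode(query_text).tolist()
    return client.search(
        collection_name="products",
        query_vector=query_vector,
        query_filter=models.Filter(
            must=[
                models.FieldCondition(key="category", match=models.MatchValue(value=category)),
                models.FieldCondition(key="price", range=models.Range(lte=max_price)),
            ]
        ),
        limit=top_k,
    )

# e.g. search_products("red summer dress", category="dresses", max_price=40.0)
```

The article goes further by combining this with sparse vectors for keyword matching, a reranking step, and an LLM or fine-tuned NER model that generates the filter values directly from the user's query.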
Our must-read articles
An AI-driven platform was developed to provide dynamic portfolio management that adapts to individual client needs and market shifts. It utilizes a hybrid reinforcement learning framework where an "actor" model suggests rebalancing actions. A separate dual-critic layer ensures all proposals adhere to specific constraints like risk, liquidity, and taxes. Furthermore, a behavioral model predicts client compliance to ground the advice in reality. This system produces personalized, auditable recommendations that enhance goal achievement and risk control, offering a scalable and compliant solution for financial advisors.
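The control flow is easier to picture with a toy loop. The sketch below is purely illustrative Python, not the article's trained RL system: the actor tilt, critic threshold, and behavioral score are made-up stand-ins.

```python
import numpy as np

# Toy stand-ins only: the article's system uses trained RL models,
# not these hand-written rules.

def actor_propose(current_weights: np.ndarray) -> np.ndarray:
    """Actor: suggest a rebalanced allocation (here, a small random tilt)."""
    proposal = np.clip(current_weights + np.random.normal(0, 0.02, current_weights.shape), 0, None)
    return proposal / proposal.sum()

def critics_approve(weights: np.ndarray, max_single_asset: float = 0.4) -> bool:
    """Dual-critic layer: veto proposals that breach a concentration limit."""
    return bool(weights.max() <= max_single_asset)

def behavioral_score(turnover: float) -> float:
    """Behavioral model: rough proxy for how likely the client is to act on the advice."""
    return float(np.exp(-5.0 * turnover))  # larger trades are less likely to be executed

current = np.array([0.5, 0.3, 0.2])
proposal = actor_propose(current)
if critics_approve(proposal):
    turnover = float(np.abs(proposal - current).sum())
    print("recommend:", proposal.round(3), "| expected compliance:", round(behavioral_score(turnover), 2))
```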
Drawing from personal experience, the author explains how to build effective multi-agent systems using CrewAI. The piece outlines three distinct collaboration patterns (Sequential, Hierarchical, and Parallel) and identifies the manager-led Hierarchical model as the most suitable for complex projects. A practical stock analysis tool is built to demonstrate this structure, featuring a manager agent delegating tasks to specialized financial and news analysts. It also details common pitfalls, such as infinite loops and poor task delegation, providing a guide for structuring cooperative AI teams to produce coherent and useful results.
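As a rough sketch of the hierarchical pattern, the snippet below uses CrewAI's `Process.hierarchical` mode with a manager LLM delegating to two specialists. The agent roles, prompts, and the `gpt-4o` manager model are illustrative assumptions, not the author's exact crew.

```python
from crewai import Agent, Task, Crew, Process

# Roles and prompts are illustrative; the article's stock-analysis crew
# defines its own agents, tools, and tasks.
financial_analyst = Agent(
    role="Financial Analyst",
    goal="Assess the company's fundamentals and recent price action",
    backstory="An analyst focused on balance sheets and valuation.",
)
news_analyst = Agent(
    role="News Analyst",
    goal="Summarize recent news and sentiment around the company",
    backstory="A researcher who tracks headlines and market sentiment.",
)

analysis_task = Task(
    description="Produce a combined fundamental and news analysis for ticker {ticker}.",
    expected_output="A short report with a buy/hold/sell leaning and supporting evidence.",
)

crew = Crew(
    agents=[financial_analyst, news_analyst],
    tasks=[analysis_task],
    process=Process.hierarchical,  # a manager delegates work to the specialists
    manager_llm="gpt-4o",          # assumed model choice; pass your configured LLM here
)

result = crew.kickoff(inputs={"ticker": "AAPL"})
print(result)
```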
This walkthrough explains how to build and deploy a self-hosted AI microservice for extracting structured data from documents. It details serving the Nanonets-OCR-s Vision Language Model on GCP Cloud Run using vLLM, creating a cost-effective and auto-scaling solution. The system is integrated into an application through a FastAPI microservice built with Clean Architecture for modularity. The process covers instructing the model to generate structured JSON output and validating the data with Pydantic, resulting in a reliable in-house alternative to third-party APIs for document parsing.
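For a feel of the serving side, here is a minimal sketch that calls a vLLM server's OpenAI-compatible endpoint and validates the response with Pydantic. The endpoint URL, JSON schema, and prompt are assumptions for illustration; the article's FastAPI and Clean Architecture layers are not reproduced here.

```python
import json
from openai import OpenAI
from pydantic import BaseModel, ValidationError

class InvoiceData(BaseModel):
    # Hypothetical schema; the real service defines its own fields.
    vendor: str
    total: float
    currency: str

# vLLM exposes an OpenAI-compatible API, e.g. started locally with:
#   vllm serve nanonets/Nanonets-OCR-s
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def extract_invoice(image_url: str) -> InvoiceData | None:
    response = client.chat.completions.create(
        model="nanonets/Nanonets-OCR-s",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract vendor, total, and currency. Return JSON only."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    try:
        return InvoiceData.model_validate(json.loads(response.choices[0].message.content))
    except (json.JSONDecodeError, ValidationError):
        return None  # caller can retry or route the document for manual review
```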
4. Applied RAG 2.0: From Goldfish Memory to ChatGPT-Like Conversations (10x Smarter Bot) By Aakash Makwana
Addressing the limitations of basic Retrieval-Augmented Generation (RAG) applications, this article outlines practical steps to create a more capable conversational agent. The guide focuses on three key enhancements: implementing chat memory for contextual follow-up queries, utilizing a more effective text splitter for improved document analysis, and adding a keyword search function for greater user control. It provides a detailed code walkthrough for building a Streamlit application that integrates these features. The result is a more sophisticated AI assistant capable of maintaining conversation and providing more accurate, context-aware answers from provided documents.
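As a minimal sketch of the chat-memory piece, the Streamlit snippet below keeps conversation history in `st.session_state`; the retriever and answer generator are hypothetical placeholders standing in for the article's RAG pipeline.

```python
import streamlit as st

# Placeholder helpers; the article wires these to a real vector store and LLM.
def retrieve_chunks(query: str) -> list[str]:
    return ["(retrieved context would appear here)"]

def generate_answer(query: str, context: list[str], history: list[dict]) -> str:
    return f"Answer to '{query}' using {len(context)} chunks and {len(history)} past turns."

st.title("RAG chat with memory")

# Chat memory lives in session state so follow-up questions keep their context.
if "history" not in st.session_state:
    st.session_state.history = []

for turn in st.session_state.history:
    with st.chat_message(turn["role"]):
        st.write(turn["content"])

if prompt := st.chat_input("Ask about your documents"):
    st.session_state.history.append({"role": "user", "content": prompt})
    context = retrieve_chunks(prompt)
    answer = generate_answer(prompt, context, st.session_state.history)
    st.session_state.history.append({"role": "assistant", "content": answer})
    st.rerun()
```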
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.