LAI #89: The Rise of LLM Developers, Smarter Search Engines, and Multi-Agent Patterns
Also, community-built study assistants, multimodal retail AI, plus cost-efficient vLLM deployment.
Good morning, AI enthusiasts!
This week, we’re diving into the new role shaping AI development: the LLM Developer, who sits between software engineers and ML specialists at the point where prompts, fine-tuning, and system design come together. We also look at practical advances in multimodal retail search engines, dynamic AI for wealth planning, and real-world multi-agent architectures. On the community side, don’t miss the CLI-based study assistant, new collaboration opportunities, and our latest poll on where AI’s future impact will come from: giant models or compact, deploy-anywhere ones.
Plenty to explore, so let’s get into it!
What’s AI Weekly
This week in What’s AI, I talk about a new role that’s quietly emerging: the LLM Developer. Unlike software developers who code rules or ML engineers who train models, LLM developers live in between—working with foundation models through prompting, fine-tuning, and clever system design. It’s a mix of technical skill, business sense, and knowing how to manage an LLM like an “unreliable intern.” Read the article to see why this role is becoming so important, or watch the video on YouTube.
— Louis-François Bouchard, Towards AI Co-founder & Head of Community
Learn AI Together Community Section!
Featured Community post from the Discord
Heretolearn123_41806 built a CLI-based study assistant that makes learning a lot smoother. It can take automated notes, generate summaries from your material, and even retrieve knowledge on demand, pointing you back to the exact book, note, or timestamped video where the answer lives. Check it out on GitHub and support a fellow community member. If you have any questions or feedback, share them in the thread!
AI poll of the week!
It’s telling that 58% of people are more excited about sub-1B parameter models than about pushing LLMs into the hundreds of billions. The frontier race may still be about going bigger, but the community seems just as interested in smaller, cheaper, deploy-anywhere AI. So, do we keep chasing peak benchmark dominance with trillion-parameter behemoths? Or double down on compact models that can run on phones, browsers, and edge devices, where the real adoption battle will be won? Tell me in the thread!
Collaboration Opportunities
The Learn AI Together Discord community is full of collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too—we share cool opportunities every week!
1. Ainews_mythofsisyphus is building an AI video editor and is looking for a partner who can help with configuration, login, security, and related infrastructure. If you are based in the EU or US, connect in the thread!
2. .ghostvoices is working on an application that acts as Cluely for mobile and is looking for a team to take it beyond a demo. If you are interested, reach out in the thread!
3. Raphael_219 is working on a personal AI assistant called Ava and is looking for someone experienced with LangChain, tool return formatting, and debugging agents who can help finalize Ava’s tool/agent loop. If this sounds like something you would like, contact him in the thread!
Meme of the week!
Meme shared by bigbuxchungus
TAI Curated Section
Article of the week
This article details the construction of a multimodal AI search engine for online retail, aimed at improving product discovery. It explains the use of dense vector embeddings for semantic understanding of text and images, along with sparse vectors for precise keyword matching. It also covers metadata filtering for refining searches by specific attributes and reranking to improve result quality. A practical guide demonstrates implementation using a Shein dataset and the Qdrant vector database, including techniques for dynamically generating filters from user queries with an LLM or a fine-tuned NER model.
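To make the metadata-filtering idea concrete, here is a minimal sketch using the qdrant-client Python API. The collection name, payload fields, and embedding model are illustrative assumptions, not the article's actual Shein setup.

```python
from qdrant_client import QdrantClient, models
from sentence_transformers import SentenceTransformer

# Hypothetical collection name, payload fields, and encoder; the article's
# Shein dataset and embedding setup will differ.
client = QdrantClient(url="http://localhost:6333")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def search_products(query_text: str, category: str, max_price: float, top_k: int = 10):
    """Dense semantic search narrowed by metadata filters (category, price)."""
    query_vector = encoder.encode(query_text).tolist()
    return client.search(
        collection_name="products",
        query_vector=query_vector,
        query_filter=models.Filter(
            must=[
                models.FieldCondition(key="category", match=models.MatchValue(value=category)),
                models.FieldCondition(key="price", range=models.Range(lte=max_price)),
            ]
        ),
        limit=top_k,
    )

# e.g. search_products("red summer dress", category="dresses", max_price=40.0)
```

The article goes further by combining this with sparse vectors for keyword matching, a reranking step, and an LLM or fine-tuned NER model that generates the filter values directly from the user's query.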
Our must-read articles
An AI-driven platform was developed to provide dynamic portfolio management that adapts to individual client needs and market shifts. It utilizes a hybrid reinforcement learning framework where an "actor" model suggests rebalancing actions. A separate dual-critic layer ensures all proposals adhere to specific constraints like risk, liquidity, and taxes. Furthermore, a behavioral model predicts client compliance to ground the advice in reality. This system produces personalized, auditable recommendations that enhance goal achievement and risk control, offering a scalable and compliant solution for financial advisors.
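The control flow is easier to picture with a toy loop. The sketch below is purely illustrative Python, not the article's trained RL system: the actor tilt, critic threshold, and behavioral score are made-up stand-ins.

```python
import numpy as np

# Toy stand-ins only: the article's system uses trained RL models,
# not these hand-written rules.

def actor_propose(current_weights: np.ndarray) -> np.ndarray:
    """Actor: suggest a rebalanced allocation (here, a small random tilt)."""
    proposal = np.clip(current_weights + np.random.normal(0, 0.02, current_weights.shape), 0, None)
    return proposal / proposal.sum()

def critics_approve(weights: np.ndarray, max_single_asset: float = 0.4) -> bool:
    """Dual-critic layer: veto proposals that breach a concentration limit."""
    return bool(weights.max() <= max_single_asset)

def behavioral_score(turnover: float) -> float:
    """Behavioral model: rough proxy for how likely the client is to act on the advice."""
    return float(np.exp(-5.0 * turnover))  # larger trades are less likely to be executed

current = np.array([0.5, 0.3, 0.2])
proposal = actor_propose(current)
if critics_approve(proposal):
    turnover = float(np.abs(proposal - current).sum())
    print("recommend:", proposal.round(3), "| expected compliance:", round(behavioral_score(turnover), 2))
```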
Drawing from personal experience, the author explains how to build effective multi-agent systems using CrewAI. The piece outlines three distinct collaboration patterns (Sequential, Hierarchical, and Parallel) and identifies the manager-led Hierarchical model as the most suitable for complex projects. A practical stock analysis tool is built to demonstrate this structure, featuring a manager agent delegating tasks to specialized financial and news analysts. It also details common pitfalls, such as infinite loops and poor task delegation, providing a guide for structuring cooperative AI teams to produce coherent and useful results.
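As a rough sketch of the hierarchical pattern, the snippet below uses CrewAI's `Process.hierarchical` mode with a manager LLM delegating to two specialists. The agent roles, prompts, and the `gpt-4o` manager model are illustrative assumptions, not the author's exact crew.

```python
from crewai import Agent, Task, Crew, Process

# Roles and prompts are illustrative; the article's stock-analysis crew
# defines its own agents, tools, and tasks.
financial_analyst = Agent(
    role="Financial Analyst",
    goal="Assess the company's fundamentals and recent price action",
    backstory="An analyst focused on balance sheets and valuation.",
)
news_analyst = Agent(
    role="News Analyst",
    goal="Summarize recent news and sentiment around the company",
    backstory="A researcher who tracks headlines and market sentiment.",
)

analysis_task = Task(
    description="Produce a combined fundamental and news analysis for ticker {ticker}.",
    expected_output="A short report with a buy/hold/sell leaning and supporting evidence.",
)

crew = Crew(
    agents=[financial_analyst, news_analyst],
    tasks=[analysis_task],
    process=Process.hierarchical,  # a manager delegates work to the specialists
    manager_llm="gpt-4o",          # assumed model choice; pass your configured LLM here
)

result = crew.kickoff(inputs={"ticker": "AAPL"})
print(result)
```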
This walkthrough explains how to build and deploy a self-hosted AI microservice for extracting structured data from documents. It details serving the Nanonets-OCR-s Vision Language Model on GCP Cloud Run using vLLM, creating a cost-effective and auto-scaling solution. The system is integrated into an application through a FastAPI microservice built with Clean Architecture for modularity. The process covers instructing the model to generate structured JSON output and validating the data with Pydantic, resulting in a reliable in-house alternative to third-party APIs for document parsing.
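For a feel of the serving side, here is a minimal sketch that calls a vLLM server's OpenAI-compatible endpoint and validates the response with Pydantic. The endpoint URL, JSON schema, and prompt are assumptions for illustration; the article's FastAPI and Clean Architecture layers are not reproduced here.

```python
import json
from openai import OpenAI
from pydantic import BaseModel, ValidationError

class InvoiceData(BaseModel):
    # Hypothetical schema; the real service defines its own fields.
    vendor: str
    total: float
    currency: str

# vLLM exposes an OpenAI-compatible API, e.g. started locally with:
#   vllm serve nanonets/Nanonets-OCR-s
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def extract_invoice(image_url: str) -> InvoiceData | None:
    response = client.chat.completions.create(
        model="nanonets/Nanonets-OCR-s",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract vendor, total, and currency. Return JSON only."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    try:
        return InvoiceData.model_validate(json.loads(response.choices[0].message.content))
    except (json.JSONDecodeError, ValidationError):
        return None  # caller can retry or route the document for manual review
```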
4. Applied RAG 2.0: From Goldfish Memory to ChatGPT-Like Conversations (10x Smarter Bot) By Aakash Makwana
Addressing the limitations of basic Retrieval-Augmented Generation (RAG) applications, this article outlines practical steps to create a more capable conversational agent. The guide focuses on three key enhancements: implementing chat memory for contextual follow-up queries, utilizing a more effective text splitter for improved document analysis, and adding a keyword search function for greater user control. It provides a detailed code walkthrough for building a Streamlit application that integrates these features. The result is a more sophisticated AI assistant capable of maintaining conversation and providing more accurate, context-aware answers from provided documents.
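As a minimal sketch of the chat-memory piece, the Streamlit snippet below keeps conversation history in `st.session_state`; the retriever and answer generator are hypothetical placeholders standing in for the article's RAG pipeline.

```python
import streamlit as st

# Placeholder helpers; the article wires these to a real vector store and LLM.
def retrieve_chunks(query: str) -> list[str]:
    return ["(retrieved context would appear here)"]

def generate_answer(query: str, context: list[str], history: list[dict]) -> str:
    return f"Answer to '{query}' using {len(context)} chunks and {len(history)} past turns."

st.title("RAG chat with memory")

# Chat memory lives in session state so follow-up questions keep their context.
if "history" not in st.session_state:
    st.session_state.history = []

for turn in st.session_state.history:
    with st.chat_message(turn["role"]):
        st.write(turn["content"])

if prompt := st.chat_input("Ask about your documents"):
    st.session_state.history.append({"role": "user", "content": prompt})
    context = retrieve_chunks(prompt)
    answer = generate_answer(prompt, context, st.session_state.history)
    st.session_state.history.append({"role": "assistant", "content": answer})
    st.rerun()
```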
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.