Intelligence and Combinatorial Complexity
Ok, now that all the non-math-nerds are gone, we can talk.
I’ve been working for some time now with a team inside Microsoft on various ideas in the direction of more stable, longer-running agents based on LLMs - agents that use memory and other techniques to keep context and perform more complex tasks, or even just maintain relationships with people (which is pretty wild). We have some good results and some mixed ones - stability turns out to be a hard thing to achieve naively, though some really interesting new design capabilities open up when you start treating cognition and memory as explicit parts of the design.
One counterintuitive thing we have found is that you can get “more” intelligence from multiple agents working together, or from some kind of state-machine-like flow, than from a single agent - even when they are all using the same base model. Why aren’t you getting “all” of the intelligence of the base model in any single inference? How can you get more with just a different kind of inference agent - a working memory, a prefrontal-cortex analog, or some other monitor?
But, as so often when working with these systems, thinking about human patterns is at least helpful. In this case, we show the same pattern ourselves: you can write a paragraph or solve a math problem, for example, and then go back through in “proofreader” mode or “checker” mode and find errors and improve your own work. Sure, there is some kind of limit eventually (you’re not going to proofread your way into General Relativity unless you’re Einstein), but it’s not zero - we all get some benefit from going back over our work. So it shouldn’t surprise us too much that this works with something like LLM inference too.
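To make that concrete, here’s a minimal sketch of the draft-then-check loop. The `complete()` function is a stand-in for whatever chat-completion API you’re using, and the prompts are purely illustrative - this is the shape of the pattern, not our actual system.

```python
# Minimal sketch of the "proofreader mode" pattern: the same base model,
# called in different roles, often beats a single pass.

def complete(system: str, user: str) -> str:
    # Stand-in: wire this to your LLM provider of choice.
    raise NotImplementedError

def draft_then_check(task: str, max_rounds: int = 2) -> str:
    answer = complete("You are a careful problem solver.", task)
    for _ in range(max_rounds):
        critique = complete(
            "You are a strict proofreader. List concrete errors, or reply OK.",
            f"Task:\n{task}\n\nDraft answer:\n{answer}",
        )
        if critique.strip() == "OK":
            break  # the checker found nothing left to fix
        answer = complete(
            "Revise the draft to fix every issue the proofreader raised.",
            f"Task:\n{task}\n\nDraft:\n{answer}\n\nIssues:\n{critique}",
        )
    return answer
```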
A guess (and it’s just that) is that there is something like the idea of combinatorial complexity at work here. Intelligence requires some very large number “N” of distinct behaviors and ideas, and you could try to get there by enumerating and hard-coding each one individually.
It’s hard to get to a high “N” count of behaviors and ideas that way - but much easier to get to that high N count combinatorially, with a good set of composable primitives. And language is really good for the compositional part of that equation - it’s easy to combine these pieces in surprising ways, and the result is much less brittle than it might otherwise be, so the system can even “improvise” fairly well.
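The arithmetic behind that intuition is worth seeing once. These numbers are made up purely for illustration:

```python
# Illustrative only: hand-writing rules grows the behavior space linearly,
# while composing a small set of primitives grows it exponentially.

rules = 200              # behaviors you could afford to hand-code, one rule each
primitives = 50          # distinct composable skills or ideas
chain_length = 4         # primitives combined per behavior

print(rules)                       # 200 hand-coded behaviors
print(primitives ** chain_length)  # 6,250,000 distinct compositions
```

Same small vocabulary of pieces, but the reachable “N” count is orders of magnitude higher - and language is the glue that makes those compositions cheap to form.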
Of course, anything at this level of complexity probably comes with its own problems - hallucinations, cult beliefs, lies, etc. It’s possible that that just comes with the territory, that anything above a certain level of complexity will have the same problems, just like it’s hard to weed things like cancer or mutations out of complex biological systems. Maybe it’s a numbers game like entropy is - there are just higher and higher odds you’ll land on a bad square as the complexity or dimensionality goes up.
Base models will continue to get better, and it’s hard to know how much will get solved that way versus by higher-level architectures like what I described above. But that math, and our own inferred neural infrastructure, does point in the direction that there will always be value in adding code, state machines, other perspectives, etc. on top of the base models, no matter how rich they get. It’s time to think about what the fundamentals of that kind of programming and tooling need to be - how we do testing, regression, monitoring, experiments, tracing, and so on.
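As a gesture at what that kind of programming might look like, here’s a minimal sketch of an agent written as an explicit state machine. The states and handler shape are hypothetical, but making the flow explicit is exactly what makes testing, regression, and tracing tractable:

```python
# Sketch: an agent loop as an explicit state machine. Explicit transitions
# give you something to log (tracing), replay (regression), and assert on
# (testing). States and handler signature are hypothetical placeholders.

from enum import Enum, auto

class State(Enum):
    PLAN = auto()
    ACT = auto()
    CHECK = auto()
    DONE = auto()

def run_agent(task: str, handlers: dict, max_steps: int = 20) -> list:
    state, trace = State.PLAN, []
    for _ in range(max_steps):
        if state is State.DONE:
            break
        state_before = state
        state, note = handlers[state](task)  # each handler wraps an LLM call
        trace.append((state_before.name, state.name, note))  # transition log
    return trace  # the trace is what you monitor, diff, and test across runs
```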
AI Engineer | Generative AI expert | Data Scientist @ Kanayma LLC
Strategic Synthesis
The RAG series essentially provides the engineering playbook for building the type of combinatorially complex, composable intelligence systems Schillace describes theoretically. Where he identifies the principles - compositional primitives, multi-agent coordination, iterative refinement - my articles provide the implementation strategies, evaluation methods, and production practices. The series demonstrates that RAG systems are not just information retrieval tools, but foundational architectures for building the kind of adaptable, composable AI systems that can handle real-world combinatorial complexity without requiring exhaustive pre-programming of edge cases. This makes the series highly complementary to Schillace's insights, bridging the gap between theoretical understanding of AI intelligence and practical implementation of complex, production-ready systems.
The Hallucination Problem
Schillace suggests hallucinations may be inevitable in complex systems, "like entropy." The series provides practical approaches to managing this:
- Day 16 (Citation and Attribution) builds transparency to detect and mitigate hallucinations
- Day 19 (Factual Consistency) implements verification mechanisms
- Day 7 (RAG Failure Analysis) systematically approaches debugging these emergent problems

Future Architecture Implications
The Day 30 (Future of RAG) discussion of end-to-end optimization and learned sparse retrieval directly implements Schillace's vision of composable, adaptive intelligence systems. The progression from modular pipelines to end-to-end optimization mirrors his argument about moving beyond hard-coded approaches to truly combinatorial intelligence.
State Machines and Memory
Schillace mentions state-machine-like flows and working memory as key to more sophisticated agents. The series addresses this extensively:
- Day 13 (Temporal RAG) handles time-sensitive information and maintains temporal consistency - essentially implementing working memory for information systems
- Day 21 (Multilingual RAG) manages cross-language state and context
- Day 28 (Agentic RAG) explicitly implements memory and context management for autonomous information seeking

Production Complexity Challenges
Schillace notes that high complexity "probably comes with its own problems" and emphasizes the need for "testing, regression, monitoring, experiments, tracing." My Week 4 directly addresses this:
- Day 24 (RAG Evaluation Frameworks) tackles the testing challenge for complex, emergent behaviors
- Day 25 (A/B Testing) provides experimental frameworks for complex systems
- Day 27 (Observability) implements the tracing and monitoring Schillace calls for
- Day 26 (Security and Privacy) addresses the security vulnerabilities that emerge from system complexity
2dThe "Proofreader Mode" Pattern Schillace's observation about getting better results through iterative refinement directly maps to several RAG techniques: Day 19 (Factual Consistency) implements verification mechanisms that act as "checker modes" for generated content Day 25 (A/B Testing) provides systematic approaches to iteratively improve RAG systems Day 27 (Observability) enables the monitoring and debugging equivalent of "going back over our work"
Combinatorial Complexity Management
Schillace argues that high-complexity behaviors emerge from "composable primitives" rather than hard-coded edge cases. The RAG series provides concrete implementations of this principle:

Compositional Building Blocks:
- Day 8 (Hybrid Search Strategies) combines dense and sparse retrieval primitives to handle exponentially more query types than either approach alone (see the sketch after this list)
- Day 12 (Cross-Modal RAG) demonstrates how combining text, image, and code retrieval primitives enables vastly more complex information synthesis
- Day 29 (GraphRAG) shows how graph-based knowledge representation creates combinatorial reasoning capabilities through relationship traversal

Avoiding Hard-Coded Solutions:
- Day 9 (Query Expansion) uses techniques like HyDE to dynamically adapt to query variations rather than pre-programming responses
- Day 11 (Semantic Caching) creates adaptive caching that learns patterns rather than relying on static rules
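For a flavor of what combining retrieval primitives looks like, here is a minimal hybrid-search sketch using reciprocal rank fusion (RRF). The retrievers are stubs, and RRF is one common fusion choice rather than necessarily the one Day 8 uses:

```python
# Minimal hybrid-search sketch: fuse dense (vector) and sparse (keyword)
# rankings with reciprocal rank fusion. Retriever outputs below are stubs.

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked doc-id lists, rewarding docs that rank
    highly in any individual list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Two retrieval primitives partially disagree; fusion surfaces the docs
# that both consider relevant.
dense_hits = ["doc3", "doc1", "doc7"]   # from an embedding index (stub)
sparse_hits = ["doc1", "doc9", "doc3"]  # from BM25/keyword search (stub)
print(rrf_fuse([dense_hits, sparse_hits]))  # doc1 and doc3 rise to the top
```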