Building the Foundations for Real AI Models
🌀 THE RIFF | Edition #8
Yesterday I caught up with a friend and colleague to talk tech, policy, and the future of humanity, specifically how to separate AI hype from reality. We lingered on “large” versus “small” language models and realised the word small now carries baggage. Are we talking about smaller than yesterday’s small (LLMs shrunk through pruning, distillation, quantisation, or tight fine-tuning), or about small by intent: domain-specific models designed for bounded problems? And how do we get there?
Small vs Smaller: what actually changes?
The learning principles don’t change: these models form statistical associations over the knowledge they’re exposed to and use those associations to respond to new prompts. What does change is scope and footprint.
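The footprint side of that trade-off is easiest to see with quantisation. The sketch below is a toy, symmetric int8 scheme written from scratch for illustration; it is not any production library’s implementation, and real quantisation (per-channel scales, calibration, quantisation-aware training) is considerably more involved.

```python
def quantize_int8(weights):
    """Toy symmetric quantisation: map float weights onto integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]

weights = [0.82, -0.31, 0.05, -1.27, 0.64]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Footprint: 4 bytes per float32 weight vs 1 byte per int8 weight plus one shared scale.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights) + 4
```

Each reconstructed weight is within half a quantisation step of the original, while storage drops roughly fourfold; that is the whole bargain behind “smaller than yesterday’s small.”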
Let’s focus on tools for data curation.
About “memory” and agentic AI
As the need to control data synthesis grows, “memory” has re-entered the conversation via product features, but agent architectures with memory are decades old (Kalogeropoulos, Carson & Collinson, 2003). The novelty isn’t that memory exists; it’s how today’s models blend statistical association with tool use, (federated) learning loops, and persistent context. The risk is mislabelling marketing features as breakthroughs.
Foundations first: data quality and representational efficiency
If small language models (SLLMs) are to be genuinely useful (and safe), the responsibility shifts upstream:
Health commons and civic engagement
For health especially, the right “unit” of engagement is often a population health cohort—communities organised around shared problems. Think health commons: civic groups, clinicians, and researchers co-producing datasets, guardrails, and evaluation criteria. This is how we ground model knowledge in lived reality.
Regulation lens: capability, generality, and FLOPs
The EU AI Act and the GPAI Code of Practice introduce two useful levers: capability and generality. Broadly:
Today’s big LLMs clearly land in the Act’s “GPAI with systemic risk” territory. But smaller models may fall below certain thresholds while still posing meaningful bias, safety, and quality risks in deployment. That is precisely why we need standards (not just policy) that travel well across sizes: evaluation protocols, data provenance, domain-specific benchmarks, and civic sandboxes to test socio-technical fit.
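To make the compute lever concrete: the EU AI Act presumes systemic risk for a general-purpose model whose cumulative training compute exceeds 10^25 FLOPs. The sketch below applies only that single presumption; the tier labels and the function itself are my own simplification, not legal categories, and real classification also weighs capability and generality.

```python
# EU AI Act Article 51 presumption (simplified): a GPAI model trained with
# more than 1e25 cumulative FLOPs is presumed to pose systemic risk.
SYSTEMIC_RISK_FLOPS = 1e25

def gpai_tier(training_flops: float) -> str:
    """Classify a model by cumulative training compute alone (illustrative only)."""
    if training_flops >= SYSTEMIC_RISK_FLOPS:
        return "GPAI with systemic risk (presumed)"
    return "below the systemic-risk presumption"

print(gpai_tier(5e25))  # a frontier-scale training run
print(gpai_tier(3e22))  # a small, domain-specific model
```

Notice how blunt the proxy is: the small model clears the threshold by three orders of magnitude yet could still carry serious bias or safety risk in a clinical deployment, which is exactly why size-agnostic standards matter.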
Here is what all of this means for investors, entrepreneurs, and policymakers, building on MIT's "The GenAI Divide: State of AI in Business 2025" report:
What investors should keep in mind
What startups—and policy-makers—should prioritise
Where this goes next
Deployment turns models into systems—and that’s where regulation actually bites. In the next RIFF, I’ll unpack “put-to-service”: deployment patterns, evidence requirements, post-market monitoring, and how SLLMs can meet (and raise) the bar for safety, equity, and performance.
If we want real AI, we need to get the foundations right: data, standards, civic participation, and models sized to the problems that truly matter.
This isn’t just the responsibility of technologists—it’s a shared challenge for investors, startups, entrepreneurs, the third sector, and policymakers alike.
At the Global Health Digital Innovation Foundation, we’re working on several initiatives to build NextGen GenAI for health and care. If you’d like to join us in shaping solutions that make a real impact, do reach out.
Stay tuned — I’ll be unpacking this in upcoming editions.
⚡Welcome to The Riff
A sharp, human-centred take on where digital health and AI are headed next—offering signal over noise, with an eye on equity, sustainability, and real-world impact.
Each edition riffs on a theme—from drift in AI systems and digital bias in healthcare, to sandboxes, standards, and smarter models of care. It’s rooted in active work across policy, ethics, and innovation ecosystems—but always grounded in people, practice, and possibility.
Whether you’re shaping the future of health systems, building technology, or asking better questions, The Riff is your lens into what’s emerging, what’s working, and what we need to talk about next.