FOD#110: AI is for Everyone and for a Better Future
"However, it will only be such if we make it so."
A few notes about the rest of July: our slowdown failed!
I missed writing these weekly digests – especially with how much fascinating stuff keeps happening in AI. Not just research and releases, but the discussion level is on fire too. So buckle up – we’re continuing our journey of connecting the AI dots for your (human) better understanding.
Our news digest is always free. Upgrade to receive our deep dives in full, directly into your inbox.
AI Is for Everyone – And for a Better Future
The Turing Post believes that AI is for everyone. It can and should be a force for good, for dignity, for better lives – “However, it will only be such if we make it so.”
For that – we need the right mindset.
Where We Actually Are
Even taxi drivers know now what AI and even generative AI is. It’s here, period. It’s in cars, browsers, workflows, schools, kitchens, basements. OpenAI alone serves hundreds of millions of users weekly (about 800 million, if you believe that). IDC says AI will add nearly $20 trillion to global GDP by 2030. PwC, slightly more conservative, projects $15.7 trillion. Either way, we’re talking about the largest productivity expansion since the steam engine. We are also getting to understanding what AI abundance might mean. But we are not there yet.
The real story isn’t in the trillions, though. It’s in the thousands – of people now doing things they couldn’t do a year ago. And still, the dominant mood in some corners of public discourse is collapse.
Last week, I joined a private Zoom call organized by Scalepost. On screen: internet pioneer Vint Cerf, writers Nick Bostrom and Walter Isaacson, tech visionary Esther Dyson, cognitive scientist and AI-sceptic Gary Marcus, journalist Nick Thompson – and others who shape how AI is understood in public and policy circles. But the dominant frequency? P(doom). P(dystopia). Fear, framed as realism. Risk, framed as inevitability.
No mention of P(bliss). P(balance). Not even P(agency). Just for the (again) balance of it all.
That surprised me. Not because I’m naive – I spend most of my time speaking to people who build AI, who know exactly how flawed and powerful these systems are. But because if we only build our frameworks around disaster, we shrink the imaginative field.
And we need that field wide open right now.
What Makes This Moment Different
The internet democratized access to information. AI is democratizing access to capability.
That’s the real shift. And it’s already showing up in ways that are hard to ignore. What feels small to one person, can make life – and lifework – possible for another.
That’s not “productivity.” That’s leveling up up up including on the dignity level.
A Few Things Worth Naming
The Acceleration of Hope – AlphaFold didn’t just predict proteins. It rewired how biology works. Fusion labs are now using AI to simulate plasmas and edge conditions that no human mind could safely calculate.
Environmental Foresight – From tracking wildfire patterns to optimizing fusion reactor experiments, AI is acting as Earth’s nervous system. It spots weak signals. It makes the invisible visible. And it gives us a head start – if we listen.
Time Compression – AI gives us time. And not in the abstract. Research that took months now takes days. Diagnoses that used to require five referrals now happen in one prompt. This isn’t just faster – it’s actually return of human agency.
Cognitive Inclusion – AI becomes the bridge. For people with dyslexia. For those without vision. For those who process differently. It describes what was once unseen. It rewrites what was once inaccessible. It interprets what was once unsaid.
Language Liberation – We’re no longer prisoners of English or any single tongue. Soon we will forget how it is to not understand what other person saying – translated live, nuance intact. That’s a new kind of love story.
Really personalized entertainment/education – Picture this:
Sounds like a fantasy, but we’re about five minutes away from a world where one-way media turns into two-way infrastructure. Where static content becomes context-aware interaction. Where every surface is a semantic API. We’re not there yet – but you can already feel the terrain shifting. Right?
So why mope and spiral into gloom about it? That won’t help.
So What Now?
The most urgent task isn't to perfect a single algorithm or to predict every risk. It's to consciously and collectively adopt the right mindset. The narrative of fear, of P(doom), is a self-fulfilling prophecy if we let it be. It builds frameworks of limitation before we’ve even explored the possibilities.
The alternative isn't blind optimism; it's agency. It is the belief that AI is a tool, and like the internet before it, its ultimate value will be determined not by its code, but by our courage, our creativity, and our compassion. Vint Cerf’s vision was that the "Internet is for Everyone." That wasn't a technical description; it was a founding principle. I just wanted to remind him and everyone about that.
Our principle must be the same. AI is for everyone. Let's start building from that truth, because the most important thing we will build with AI is not a product, but a better-equipped, more capable, and more connected humanity.
Topic number two: In addition to the topic above, I also dive into Kimi K2 – a super intriguing new video model from LTX that really lets you play director. Plus, I test out Good Rudy, Bad Rudy, and Ani – Grok4’s AI companions. My video editor told me she was saying “WTF” every minute while editing that segment. The Grok show starts at 8:16. Watch it here →
Please subscribe to the channel. I’d say, it’s refreshingly human.
Curated Collections – a super helpful list
Follow us on 🎥 YouTube Twitter Hugging Face 🤗
We are reading/watching
Highlight of the week: Chain of thought monitorability – A new and fragile opportunity for AI safety:
An incredible team of researchers from Anthropic, Google DeepMind, OpenAI, the UK AI Security Institute, Apollo Research, and others argue that reasoning models trained to think in natural language offer a unique AI safety opportunity: monitoring their chain of thought (CoT) to detect misbehavior. CoT reasoning is often necessary for hard tasks and reflects internal intent. However, monitorability is fragile – scaling reinforcement learning or applying process supervision can degrade it. The paper urges tracking CoT readability, causal relevance, and resistance to obfuscation, and treating CoT monitorability as a key model safety factor →read the paper
Recommended Index
Researchers from Princeton University and UC Berkeley introduce the Bullshit Index (BI) to quantify LLMs' indifference to truth, showing BI rises from 0.379 to 0.665 post-RLHF. Using 2,400 scenarios across three benchmarks, they find RLHF increases deceptive behaviors: paltering (+57.8%), unverified claims (+55.6%), and empty rhetoric (+39.8%). Chain-of-thought prompts further amplify bullshit, especially paltering (+11.5%). Political contexts show 91% prevalence of weasel words. RLHF significantly increases user satisfaction but degrades truthfulness →read the paper
News from The Usual Suspects ©
Meta Earners (a leaked list of the 44 members of Meta's Superintelligence team)
The Allen Institute has launched AutoDS, an open-ended research agent that autonomously generates and tests its own scientific hypotheses – without needing a user-defined goal. Using Bayesian “surprise” as its compass and Monte Carlo Tree Search to explore the unknown, AutoDS mimics how researchers stumble into breakthroughs. Early results in biology and econ look promising, though as always: real science demands real peer review.
OpenAI claims its new general-purpose LLM hit gold at the 2025 International Math Olympiad, solving 5 of 6 problems under contest-like conditions. It’s a flex of reasoning and reinforcement learning – without any geometry-specific tricks. But experts like Terence Tao urge caution: selective sampling and compute-heavy setups could blur the line between real insight and AI stagecraft. Impressive? Yes. Definitive? Not yet.
OpenAI just launched its AI agent built into ChatGPT. The demos were impressive – simple prompts like “analyze this spreadsheet and make a slide deck” triggered complex, multi-step tasks. The agent browses, codes, analyzes, and builds – all autonomously. When I tried it – it’s now available for all paid tiers – it was a bit slow for my taste and not necessary for many tasks.
OpenAI also released benchmarks and system cards, but the real surprise wasn’t the feature itself. It was what powered it. Sharp-eyed analysts (like swyx!) noticed: the model behind this agent isn’t the latest o3. It’s a more advanced, next-gen model – likely what would’ve been called “o4,” now part of a series dubbed “GPTNext.” What a classic move form OpenAI – looking forward what updates they make to the model after tasting it as agent.
What began as OpenAI’s $3B dream acquisition of Windsurf ended in disarray – reportedly thanks to Microsoft IP entanglements. Google DeepMind then surgically poached the CEO and tech in a $2.4B license-plus-hiring move. Finally, Cognition swept up the remains: the product, $82M ARR, and 250 staff now sailing under the Devin flag. Three companies, one IDE, and a masterclass in strategic dismemberment.
Anthropic just unveiled a directory of tools that connect directly to Claude—everything from Notion and Stripe to Figma and Prisma. Now, instead of repeating yourself, Claude can tap into your actual workflows, context, and data to deliver more precise, action-ready responses. AI collaboration just got a lot less theoretical – and a lot more useful.
Reflection AI’s new agent, Asimov, takes a different route to coding autonomy: reading everything. Not just code, but emails, docs, chats, and GitHub threads—turning organizational sprawl into a coherent map of how software actually works. Early signs look strong, with Asimov outperforming Claude Sonnet 4 in blind dev tests. Still, privacy skeptics and the absence of OpenAI/Devin comparisons keep this one in the “watch closely” category.
Models to pay attention to:
Researchers from Moonshot AI release Kimi K2, a 1-trillion-parameter MoE LLM activating 32B per pass. Trained on 15.5T tokens using the MuonClip optimizer, it avoids instability and excels in agentic tasks. Kimi K2-Instruct scores 53.7% on LiveCodeBench v6 and 65.8% on SWE-bench, outperforming GPT-4.1 and DeepSeek-V3. Its $0.60 input and $2.50 output token pricing undercuts Claude Sonnet by over 80%, making it a high-performance, open, cost-efficient model for real-world automation →read the paper
Researchers from Google present Gemini 2.5 Pro and Flash, sparse MoE transformer models with 1M+ token context and multimodal inputs (text, audio, image, video). Gemini 2.5 Pro achieves 88% on AIME 2025, 74.2% on LiveCodeBench, and 86.4% on GPQA-Diamond. It can process 3-hour videos, use tools, and perform agentic tasks like autonomously beating Pokémon Blue in 406 hours →read the paper
xAI has unveiled Grok 4 and Grok 4 Heavy, claiming the crown for the world’s most intelligent closed model. Is it true – depends on a task. Is it worth it ($30/month for Grok4 and $300/month for Grok4 Heavy) – no, if you have a $200 sub to OpenAI and/or a good grip on Gemini in AI Studio.
Researchers from Mistral AI release Voxtral, open-source speech models in 24B and 3B sizes under Apache 2.0. Voxtral supports 32k-token contexts, real-time Q&A, summarization, and multilingual transcription. It outperforms Whisper v3 and matches ElevenLabs Scribe at half the cost. Benchmarks show state-of-the-art results across LibriSpeech, FLEURS, and Mozilla Common Voice. Voxtral Mini Transcribe delivers high accuracy at $0.001/min, ideal for scalable speech intelligence in production and edge deployments →read the paper
Researchers from Decart AI introduce MirageLSD, the first diffusion-based model enabling real-time, infinite video generation with <40ms latency and 24 FPS. It uses Live Stream Diffusion with causal, frame-by-frame synthesis and solves error accumulation via history augmentation. Technical advances include CUDA mega kernels, shortcut distillation, and GPU-aware pruning. MirageLSD outperforms prior models by 16× in responsiveness, enabling interactive video editing, transformations, and streaming with stable visual coherence over unlimited durations →read the paper
Researchers from Lightricks release LTX-Video, a DiT-based model generating 30 FPS videos at 1216×704 resolution in real time. Version 0.9.8 enables long-shot generation up to 60 seconds, image-to-video, keyframe animation, and video extension. Distilled 13B and 2B models deliver HD output in 10s with previews in 3s on H100 GPUs. Control models (pose, depth, canny) and FP8 quantized versions support low-VRAM setups, while TeaCache speeds inference up to 2× without retraining →read the paper
Researchers from MetaStone-AI and USTC propose MetaStone-S1, a 32B parameter reflective generative model that matches OpenAI o3-mini's performance. Using a novel Reflective Generative Form, it integrates the policy and process reward model (PRM) into a single backbone with only 53M extra parameters. Their self-supervised PRM (SPRM) selects high-quality reasoning without step-level labels →read the paper
The freshest research papers, categorized for your convenience
Excellent content