ConneX AI Weekly - From 30th June 2025
🔥 Last Week’s AI Power Plays
1. Microsoft MAI-DxO crushes diagnoses: Beats doctors 4x over in tough cases. Medical superintelligence just moved from theory to prototype.
2. Apple considers OpenAI and Anthropic for Siri: The walled garden may be cracking. Apple’s exploring outside help to finally make Siri useful.
3. Google Gemini 2.5 Pro quietly dominates: Quiet drop, loud results. Gemini levels up reasoning, coding, and context length to eye-watering scale.
4. Claude 4 shows stamina: Anthropic’s latest model can stay smart for hours, unlocking persistent, high-output workflows.
📉 Global Shocks & Power Moves (aka Last Week in Reality)
1. Trump’s “Great American Tariff Bill” explodes GOP fault lines: The bill proposes sweeping tariffs across nearly all imports, triggering a fierce clash between Trump’s populist base and the pro-trade OBBB bloc (Open Borders Big Business). With whispers of Musk eyeing a new political party, the tension isn’t just economic, it’s existential.
2. NY deepfake law passed: AI-generated content in New York now requires built-in provenance tags. The push for GenAI accountability just went official.
3. Australia-China trade detente back on the table: After years of diplomatic frost, quiet talks suggest a fresh resource export deal may be in the works. With iron ore and lithium prices swinging, any thaw could bring billions back into Aussie markets, or geopolitical strings Australia isn’t ready to pull.
Deep Dives & What’s Behind the Curtain
🧬 Microsoft MAI-DxO: Diagnosing at Superhuman Scale
The Update: Microsoft has dropped MAI-DxO, a modular AI system designed to handle the trickiest medical cases by simulating an entire virtual diagnostic team. It combines hypothesis generators, test selectors, and cost optimizers to power through rare, complex illnesses.
The Nitty Gritty: On the newly built SDBench dataset of 304 difficult cases, MAI-DxO paired with OpenAI’s o3 solved 85.5 percent of them correctly. Human physicians, with 5 to 20 years of experience, averaged just 20 percent. MAI-DxO even beat doctors on efficiency, saving over $500 per case.
The Takeaway: This isn’t just an AI that diagnoses better. It’s one that does it faster, cheaper, and more consistently. A major leap toward healthcare at scale with precision. Think fewer missed diagnoses, fewer unnecessary tests, and much better outcomes.
🍏 Apple Eyes Anthropic and OpenAI for Siri
The Update: Apple’s reportedly in talks to let Claude or ChatGPT power Siri, at least in part, marking a philosophical shift in how the company delivers intelligence to users. The internal Apple Intelligence project has been plagued with delays, and integrating external LLMs is now back on the table.
The Backstory: Siri has long been Apple’s weakest link. With Google and Microsoft sprinting ahead, Apple’s AI team is feeling the heat. Mike Rockwell is now in charge of rebooting the effort. Using third-party models would give Apple breathing room while reworking its long-term AI infrastructure.
Why You Should Care: If this goes ahead, Siri might actually become… helpful. And for Apple, which is notoriously protective of its ecosystem, this could mark a new era of openness (or at least a more pragmatic one).
🔍 Gemini 2.5 Pro: The Quiet Crusher
The Update: No press blitz. No I/O fanfare. But Gemini 2.5 Pro Experimental is outperforming rivals in maths, science, and code. Google dropped it quietly into Advanced and AI Studio, and the numbers speak for themselves.
The Metrics:
63.8 percent on SWE-Bench Verified
68.6 percent on Aider Polyglot
1 million token context window
2 million token support in development
Where This Lands: Google is pushing hard on practical reasoning. With long-context memory and top scores in structured tasks, Gemini is gunning to be the default model for developers who want output that’s both powerful and reliable.
🧠 Claude 4 (Opus and Sonnet) Brings Endurance
The Update: Anthropic’s latest release isn’t just smarter, it’s steadier. Claude 4 can run long sessions without hallucinating or breaking down. This opens the door to continuous agents and more sophisticated, uninterrupted tasks.
Inside the Engine Room: Where most LLMs flake out after a few exchanges, Claude 4 can handle hours of consistent interaction. Its reasoning remains intact even across complex chains of thought, document summarisation, or multi-step workflows.
Strategic Angle: Claude 4 is about more than benchmarks. It’s about utility over time. For teams building persistent systems (agents, customer ops, coding copilots), this level of sustained IQ is a real game-changer.
💼 AI & Cloud Jobs of the Week (June 30 Launch)
1. AI Engineer – Melbourne, VIC: Create real-world AI systems for enterprise use cases. CBD location, flexible hybrid model. Big-impact role with emerging tech.
2. Azure Cloud Engineer – Sydney, NSW: Own Azure migrations and infra design for top clients. $160K–$170K + perks, hybrid. Focus on innovation and architecture, not maintenance.
3. Generative AI Engineer – Sydney, NSW: Deploy GenAI tools, LLMs, and AI agents. Hybrid setup, $140K–$170K range. Work across prompt design, orchestration, and MLOps.
🧠 Your edge in AI and macro chaos: one crisp weekly update at a time.
Until later this week!
Alex | ConneX AI