12th September - AI News Daily - OpenAI Launches Real-Time Voice API as Mastercard Rolls Out Agentic Checkout

🌍 INAI • The Open AI Hub

The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news, free for all and updated every day.

https://github.com/inai-sandy/inAI-wiki

70+ New AI Agents & Apps added today

https://inai.short.gy/12th-sept

Podcast

https://www.buzzsprout.com/2507996/episodes/17831491

📰 AI News Daily — 12 Sept 2025

TL;DR (Top 5 Highlights)

  • OpenAI signs a $300B, 4.5GW cloud deal with Oracle to power next‑gen models—reshaping the cloud race and boosting Oracle’s AI stature.

  • NVIDIA unveils Rubin CPX GPU with 1M+ token context and new SMART infrastructure, promising major efficiency gains for enterprise AI.

  • FTC probes major chatbots over child safety and penalizes inflated AI claims, signaling tougher, evidence‑driven oversight.

  • Microsoft deepens its OpenAI partnership while building custom chips; OpenAI joins Broadcom’s program—industry hedges beyond Nvidia.

  • Mastercard launches agentic AI checkout in the U.S., pushing autonomous, secure shopping into the mainstream.

🛠️ New Tools

  • OpenAI gpt‑realtime and Realtime API: A fast, natural-sounding end‑to‑end speech model and API for voice agents. Lower latency and higher quality enable production‑ready conversational apps for brands like Zillow and T‑Mobile.

  • Google Gemini adds audio transcription and Creation Library: Transcribes and analyzes up to 10‑minute audio files and organizes outputs in one place—streamlining workflows and making Gemini more competitive for everyday productivity.

  • ChatGPT adds MCP tools; Anthropic launches MCP server registry: ChatGPT can now take actions (e.g., update Jira) via MCP, while Anthropic’s registry simplifies tool discovery—advancing secure, interoperable agent actions for teams.

  • Claude AI document editing: Edit Word, Excel, and PDF files with natural language—no app needed. A 30MB limit and planned Office 365 integration make document workflows faster and more accessible.

  • Replit autonomous coding agent: Builds, tests, and ships apps end‑to‑end with minimal guidance. It reduces busywork for developers, accelerating delivery and enabling smaller teams to ship more frequently.

  • DSPy + KùzuDB retrieval: Tool‑calling composes vector and graph retrievers for stronger context. Better retrieval quality improves agent reliability in coding, QA, and analytics tasks.

🤖 LLM Updates

  • Alibaba Qwen3‑Next‑80B‑A3B: Hybrid MoE activates ~3B of 80B parameters per token, targeting ~10x cheaper training and faster inference. Ships with vLLM integration, optimized kernels, and H100 deployments.

  • Baidu ERNIE‑4.5‑21B‑A3B‑Thinking (open‑sourced): A strong reasoning model trending on Hugging Face, broadening accessible “thinking” models for research and industry tasks.

  • mmBERT multilingual encoder: Trained on 3T tokens across 1,800+ languages, improving understanding and search for low‑resource languages and global applications.

  • OpenAI GPT‑OSS in Transformers: Official integration expands access to OpenAI‑style capabilities in the popular ecosystem—lowering friction for experimentation and production adoption.

  • Unsloth 1–3‑bit LLMs: Aggressively quantized models beat flagship closed systems on select tasks, cutting costs and enabling edge and local deployments without heavy hardware.

  • Baichuan DCPO RLHF objective: New alignment objective aims to reduce vanishing gradients and wasted rewards, promising more stable, data‑efficient post‑training.
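To put the Qwen3‑Next figures above in perspective, here is a minimal sketch of the arithmetic behind "~3B of 80B parameters per token." The function name is hypothetical, and the inputs are the rounded figures quoted in the item. The ratio describes parameter count only; it is not a measurement of compute or cost savings.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of a sparse MoE model's parameters used for each token.

    Both arguments are parameter counts in billions (illustrative helper,
    not part of any library).
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Rounded figures from the item: ~3B active out of 80B total.
fraction = active_fraction(3.0, 80.0)
print(f"Active share per token: {fraction:.2%}")  # Active share per token: 3.75%
```

In other words, under these rounded figures fewer than 1 in 25 parameters participate in any single token's forward pass.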

📑 Research & Papers

  • Mathematics Inc. autoformalization: Christian Szegedy’s team claims its Gauss agent completed the Strong Prime Number Theorem formalization project in weeks, advancing automated theorem proving and reliable math agents.

  • ByteDance AgentGym‑RL: Unified multi‑turn agent training rivaling commercial systems across 27 benchmarks—standardizing training pipelines and improving reproducibility for agent research.

  • DeepMind + Imperial (antibiotic resistance): New findings highlight how AI can map resistance pathways, informing drug discovery strategies and public health interventions.

  • AQCat25 dataset (11M+ reactions): A large reaction dataset to accelerate catalyst discovery and greener chemistry—fueling data‑driven materials and sustainability research.

  • DCQCN wins SIGCOMM 2025 Test of Time: The congestion control system underpins large‑scale training stability—recognizing core infrastructure behind today’s AI performance.

  • Survey of 3D/4D world modeling: Comprehensive review of dynamic scene understanding methods, outlining pathways to more capable embodied and spatially aware AI systems.

🏢 Industry & Policy

  • OpenAI x Oracle $300B cloud pact: A five‑year, 4.5GW capacity deal powers next‑gen models and data centers—including the Stargate initiative—as AI capex across tech giants heads toward $435B by 2029.

  • NVIDIA Rubin CPX GPU: Designed for heavy AI tasks like coding and video gen, with 1M+ context tokens and SMART infrastructure—setting a new performance bar for enterprise workloads.

  • FTC scrutiny intensifies: Probes Meta, OpenAI, and Alphabet over child safety and mental health, and sanctions exaggerated AI claims following its Workado action, pushing the industry toward verifiable, child‑safe products.

  • Microsoft’s dual track: Deepens OpenAI partnership while unveiling custom chips and its first in‑house LLM; OpenAI joins Broadcom’s custom silicon program—diversifying beyond Nvidia for cost and flexibility.

  • Publishers vs. AI platforms: OpenAI challenges Canadian jurisdiction in a copyright suit as media groups press Google and OpenAI for licensing—cases likely to set global data‑usage precedents.

  • Mastercard’s agentic payments: Autonomous checkout rolls out in the U.S. for the holidays, expanding globally. Focus on security and trust aims to normalize agentic commerce across retail.

📚 Tutorials & Guides

  • Anthropic’s agent tool optimization: Practical playbook for building reliable tools with Claude Code and feedback loops—helping teams boost agent accuracy and reduce failure modes.

  • Jurafsky & Martin (SLP3 draft): The free third edition refreshes foundational NLP knowledge—ideal for upskilling engineers entering modern LLM and speech workflows.

  • Scaling AI infra (AWS Builder Loft): Hard‑won lessons for throughput, observability, and cost control—turnkey checklists to scale without new GPUs or major code changes.

  • Context engineering essentials: Studies show longer context raises poisoning/distraction risk; high‑quality, current context and strong guides often beat raw documentation.

  • “RAG isn’t dead” experiments: Tests across 18 models show retrieval remains vital even with long context windows—pointing to hybrid strategies for robust systems.

🎬 Showcases & Demos

  • Seedream 4.0 vs. rivals: ByteDance’s model challenges Gemini 2.5 in portrait and editing, with vivid Shahnameh scene renders—community realism contests stress‑test generative fidelity.

  • New consumer creativity: Delphi AI (digital legends), Kling Avatars (expressive faces), and Veo 3 (fast vertical video) make high‑quality content creation accessible and affordable.

  • Design playgrounds: Mood Font (EmbeddingGemma 300M) suggests fonts by “vibe,” while Glif’s Chrome extension lets users right‑click to remix any web image with AI.

💡 Discussions & Ideas

  • Open vs. closed futures: Debates weigh broad empowerment against gated access, as compute‑based regulation struggles to track evolving training methods.

  • Detection and neutrality: With bots saturating the web, reliable AI‑text detection looks infeasible; Stanford HAI suggests techniques to approximate neutrality rather than enforce absolutes.

  • Many models, not one: Industry trends favor a pluralistic ecosystem and collaborative efforts—echoing MosaicML’s playbook over single‑model dominance.

  • Autonomy and simulation: Reports suggest AI task autonomy doubles every ~7 months; framing models as simulators clarifies why outputs mirror training realities.

  • Deployment economics: Local LLMs can slash heavy‑task costs; network/storage tuning alone can deliver 10x post‑training speedups without changing GPUs.

  • Agent security: Training LLMs as white‑hat hackers exposes new attack surfaces; stronger governance and oversight are needed as agent operations scale.
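The "autonomy doubles every ~7 months" claim above is easier to reason about as a compounding rate. The sketch below assumes clean exponential growth with a fixed 7‑month doubling period, which is a simplification of the reported trend; the function name and default are hypothetical.

```python
def growth_factor(months: float, doubling_period_months: float = 7.0) -> float:
    """Multiplicative growth after `months`, assuming a fixed doubling period."""
    if doubling_period_months <= 0:
        raise ValueError("doubling_period_months must be positive")
    return 2.0 ** (months / doubling_period_months)


# Two doubling periods (14 months) give a 4x increase.
print(growth_factor(14))  # 4.0

# Growth over one year under the same assumption.
print(f"{growth_factor(12):.2f}x per year")
```

If the doubling period drifts, the same function can be called with a different `doubling_period_months` to see how sensitive the yearly figure is.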

Sakshi Prakash

Fueling Startup Visionaries for 120X Growth | Linkedin Catalyst | Elevating Networks by 150X Empowering Entrepreneurs for 110X Success | Branding

2w

This roundup nails the balance between hype and substance. Every item feels relevant to where AI is actually headed.

Rabina Hembram

📈 Marketing Leader, ⭐ Social Media Marketing Expert, 🌍 Product Hunt Supporter, 🤝Helping Students Grow, 📩 Message for Collaboration

2w

DeepMind’s work on antibiotic resistance is a reminder of AI’s potential in life-saving research. This is real impact.


Claude’s seamless editing across formats is a productivity dream. No app dependency means faster workflows.

Sachin .

Brand Strategist | Market Positioning | Creative Campaigns

2w

Replit’s autonomous agent is quietly revolutionizing dev culture. End-to-end app building with minimal input is wild.

Singh Yede

Student at Jabalpur Engineering College

2w

Mastercard’s agentic checkout is a bold move. Autonomous commerce is no longer a concept—it’s deployment-ready.
