AI Report by Explainx
This AI transcribes with 94% accuracy, better than Eleven Labs

TwinMind’s EAR-3 sets new speech-to-text records, Thinking Machines makes LLM outputs reproducible, and Oboe launches adaptive AI learning for all.

Yash Thakker
September 12, 2025

This Week in AI & Tech Innovation

EAR-3 Takes the Crown – TwinMind’s new speech-to-text model sets records in accuracy, speaker labeling, and language coverage while staying the most affordable.
Reproducible LLMs – Thinking Machines Lab introduces “batch-invariant” kernels, ensuring stable and fully reproducible outputs for enterprise and research use.
Oboe Launches – A new adaptive AI learning platform offering multi-format, personalized courses to make lifelong learning engaging and accessible.

These breakthroughs highlight how AI is redefining communication, reliability, and education, pushing the boundaries of what’s possible next.

EAR-3 Takes the Crown: Speech-to-Text’s New Industry King

TwinMind's EAR-3 model, developed by a team of ex-Google X scientists, sets new industry records in speech-to-text accuracy (94.74%, equivalent to a 5.26% word error rate, or WER), speaker labeling (3.8% diarization error rate, or DER), and language coverage (140+ languages, 40+ more than competitors), at just $0.23 per hour, the most affordable among leading services. EAR-3 uses a fine-tuned blend of open-source models trained on a diverse dataset with human-annotated audio, and employs dedicated pipelines for speaker diarization, audio pre-processing, and code-switching support, making it resilient across accents and mixed scripts. The model is cloud-based for scalability and automatically falls back to the offline EAR-2 if connectivity drops. It offers continuous, battery-efficient transcription on both web and mobile, with strict privacy measures: only transcripts are stored locally, and audio is deleted instantly. API access and broad app availability (iOS, Android, Chrome) are coming soon, with Pro plans featuring a 2-million-token context window. EAR-3 is expected to have an impact in legal, medical, business, and multilingual settings globally.
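The accuracy and WER figures above are two views of the same number: accuracy is simply 1 minus WER. As a rough illustration, here is a minimal sketch of the standard word-level WER calculation (substitutions, deletions, and insertions divided by the number of reference words). The function name word_error_rate and the sample sentences are made up for this example, and TwinMind's actual evaluation setup is not described here, so treat it as the textbook definition rather than their method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (rolling one-row DP table).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not from any benchmark): one substitution
# ("sat" -> "sit") and one deletion (the second "the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.2%}, accuracy = {1 - wer:.2%}")

# The reported figures are consistent with accuracy = 1 - WER.
reported_wer = 0.0526
print(f"Reported accuracy = {(1 - reported_wer) * 100:.2f}%")
```

Real evaluations usually normalize text first (casing, punctuation, number formats), which can shift WER noticeably, so headline numbers from different vendors are only comparable when the test set and normalization match.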

Making LLM Outputs Fully Reproducible

Thinking Machines Lab argues that the primary cause of nondeterminism in large language model (LLM) inference is not, as commonly assumed, the combination of GPU concurrency and floating-point arithmetic, but load-dependent variation in batch size. Individual GPU kernels are deterministic for a given input, yet many of them are not batch-invariant: the order in which they accumulate values changes with the batch size, and because floating-point addition is not associative, an identical query can produce different outputs depending on how many other requests the server is handling at that moment. Their research introduces “batch-invariant” kernels that keep the computation order fixed regardless of batch size or load, enabling fully reproducible inference results. This improves output stability for enterprise applications, scientific reproducibility, and true on-policy reinforcement learning, where consistent model behavior between training and inference reduces noise and improves learning quality. The batch-invariant kernels impose some performance overhead, but early tests show acceptable inference speed trade-offs.
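To see why reduction order matters, here is a toy sketch in plain Python. It is not Thinking Machines' kernel code: chunked_sum, load_dependent_sum, batch_invariant_sum, and FIXED_CHUNK are hypothetical names, and the "pick a chunk size from the batch size" rule is an invented stand-in for the scheduling heuristics a real GPU kernel might use. The only thing it demonstrates is that floating-point addition is not associative, so changing how a sum is split changes the result.

```python
def chunked_sum(values, chunk_size):
    """Sum each chunk left to right, then sum the partial results left to right."""
    partials = []
    for start in range(0, len(values), chunk_size):
        acc = 0.0
        for v in values[start:start + chunk_size]:
            acc += v
        partials.append(acc)
    total = 0.0
    for p in partials:
        total += p
    return total


def load_dependent_sum(values, batch_size):
    # Hypothetical heuristic: with a small batch, split the row into more
    # chunks to keep workers busy; with a large batch, reduce it in one pass.
    chunk_size = 2 if batch_size < 4 else len(values)
    return chunked_sum(values, chunk_size)


FIXED_CHUNK = 2  # assumed constant; the point is that it never depends on load


def batch_invariant_sum(values, batch_size):
    # Same split strategy for every batch size, so the addition order is fixed.
    return chunked_sum(values, FIXED_CHUNK)


# One "row" of values whose sum is sensitive to the order of additions.
row = [1e17, 1.0, -1e17, 1.0]

for batch_size in (1, 8):
    print(
        f"batch_size={batch_size}: "
        f"load-dependent={load_dependent_sum(row, batch_size)}, "
        f"batch-invariant={batch_invariant_sum(row, batch_size)}"
    )
```

The load-dependent version returns different values for the same row as the batch size changes, while the batch-invariant version returns the same value every time. Note that neither matches the exact mathematical sum (2.0): batch invariance buys reproducibility, not extra precision.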

Oboe Launches Adaptive AI Learning Platform

Oboe is a newly launched AI-powered learning platform designed to make education personalized, flexible, and fun. Created by the co-founders of Anchor, Oboe lets users instantly create lightweight courses on any topic from a simple prompt. It offers nine learning formats, including articles with real photos, podcasts, games, and quizzes, to suit different learning preferences. Its multi-agent AI system generates, verifies, and enriches course content in parallel, aiming to deliver high-quality, tailored learning experiences within seconds. The platform promotes curiosity and self-directed learning without algorithmic ads, helping users dive deeper into topics with personalized recommendations. Oboe is available globally, and users can create up to five courses for free, with paid tiers for extended use. Oboe aims to harness AI to elevate human intelligence and make lifelong learning approachable and enjoyable for everyone.

Hand Picked Video

In this video, we'll look at Olly's auto-commenter feature.

Top AI Products from this week

Orren – Orren turns your ideas into professional LinkedIn posts instantly. No templates, no writer's block. Just human-sounding content that drives engagement. Generate posts, rephrase sections, post on LinkedIn, get fresh ideas, and manage your content calendar.
rolyai – Create your own AI roles. Chat with different roles in one chat, like in a normal group chat. You decide who should respond.
Tallyrus Document Screening in Bulk – Most people who deal with large volumes of documents (teachers grading essays, managers reviewing resumes, teams analyzing reports) struggle with the same problem.
YouStory – Turn your child's favorite toy, pet, or artwork into a storybook hero. AI crafts personalized adventures teaching valuable lessons. Professional illustrations in minutes. Stories that star their real-world friends. Bedtime reimagined. Magic guaranteed.
Meet Macro Terminal – A command-line tool that gives you direct natural-language access to your databases, CSV files, and Excel files. You can use it to explore data, write and run queries, and export CSV and Markdown files to share with others, all directly from your terminal.
Countly – Meet Countly, the ultimate first-party analytics platform that lets you own your digital analytics and customer experience.

This week in AI

Adobe AI Agents Boost CX - Adobe AI agents on Experience Platform automate and optimize customer journeys, audience targeting, experiments, site performance, and support, enabling scalable, personalized marketing.
RSL Standard Automates AI Content Licensing - RSL is a new open web standard enabling publishers to set machine-readable licensing and royalty terms for AI content use, automating fair pay via subscription, pay-per-crawl, or inference.
ElevenLabs Voice Remixing - ElevenLabs launches Voice Remixing alpha, letting users change the gender, age, accent, and style of their voices for creative storytelling and precise AI agent design.
K2 Think AI Reasoning - MBZUAI and G42 launched K2 Think, a 32B-parameter open-source AI system delivering top-tier reasoning, beating models 20x larger with innovations like chain-of-thought tuning and agentic planning.
OpenAI gpt-realtime - OpenAI's gpt-realtime is a new speech-to-speech model in the Realtime API that offers natural, expressive voice interaction with image input, SIP calling, and improved instruction following.

Paper of The Day

Auras is a co-designed inference framework that boosts embodied AI agent efficiency by disaggregating perception and generation so the two stages can run as an asynchronous parallel pipeline. Auras improves throughput by 2.54× on average while maintaining 102.7% of the original accuracy, and addresses data staleness with a shared public context buffer. It uses a hierarchical tuner to balance throughput-accuracy trade-offs and applies optimizations such as memory offloading, batched execution, and CUDA graph stream management. Tested on multiple open-source embodied AI models, Auras offers scalable high-frequency execution suited to real-time robotic and autonomous systems.

To read the whole paper 👉️ here.