AI Report by Explainx

Claude Controlling Your Chrome Browser 🕹️🌐💻

Claude now controls your browser safely, Mistral OCR 3 dominates document parsing, and Luma Ray3 brings frame-accurate Hollywood VFX to AI video.

Yash Thakker
December 21, 2025

AI just dropped three power moves upgrading browsers, documents, and video creation. From agents that click for you to OCR that beats enterprises and VFX-level video control, here’s what’s new:

🌐 Claude Takes the Wheel — Browser Control Goes Mainstream
Anthropic’s Claude for Chrome moves from pilot to all paid plans, letting AI read screens, click buttons, fill forms, manage tabs, and run workflows—locked down with serious safety guardrails.

📄 Mistral OCR 3 — Document AI with a 74% Win Rate
Mistral launches OCR 3, crushing forms, tables, handwriting, and scans into clean markdown/HTML at massive scale—built for real enterprise pipelines, not demos.

🎬 Luma Ray3 Modify — Hollywood-Grade Video, Frame by Frame
Luma unlocks start/end frame control, preserving human performances while swapping scenes, styles, or characters—bringing predictable, pro-level VFX to AI video.

AI isn’t just generating anymore; it’s clicking, reading, and directing.

Claude Takes Browser Control: From Pilot to Paid Plans with Ironclad Safety

Claude for Chrome is Anthropic's Chrome extension that lets Claude interact directly with your browser from a side panel: reading screens, clicking buttons, filling forms, and automating workflows such as managing calendars, drafting emails, handling expenses, or testing sites. Anthropic first piloted it in August 2025 with 1,000 Max plan users to test browser-based AI safety against prompt injection. In that testing, mitigations (site permissions, action confirmations, classifiers, and blocks on high-risk sites such as finance and adult content) cut the attack success rate from 23.6% to 11.2%. The extension expanded to all Max subscribers on Nov 24, 2025, and to every paid plan (Pro/Team/Enterprise) on Dec 18. Key features include Claude Code integration for terminal-to-browser debugging, multi-tab handling, scheduled tasks, smart navigation on Gmail/Slack/GitHub using the Haiku 4.5 and Sonnet 4.5 models, visual screenshot aids, and team admin controls for site allow/block lists. Users retain control through site permissions and action confirmations for sensitive tasks. Install it from the Chrome Web Store via claude.ai/chrome after reviewing the safety guide.
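
To make the permission model above more concrete, here is a minimal sketch of how an allow/block list plus confirmations for sensitive actions could be combined into one decision. It is not Anthropic's implementation: the `SitePolicy` class, the action names, and the decision labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical action names; not taken from Anthropic's documentation.
SENSITIVE_ACTIONS = {"purchase", "publish", "share_personal_data"}


@dataclass
class SitePolicy:
    """Illustrative allow/block list with confirmations for sensitive actions."""
    allowed: set = field(default_factory=set)
    blocked: set = field(default_factory=set)

    def decide(self, url: str, action: str) -> str:
        host = urlparse(url).hostname or ""
        if host in self.blocked:
            return "block"            # blocked sites are never touched
        if host not in self.allowed:
            return "ask_permission"   # unknown site: ask the user first
        if action in SENSITIVE_ACTIONS:
            return "confirm"          # allowed site, but a risky action
        return "allow"


policy = SitePolicy(allowed={"mail.google.com"}, blocked={"bank.example"})
print(policy.decide("https://mail.google.com/inbox", "click"))
print(policy.decide("https://mail.google.com/inbox", "publish"))
print(policy.decide("https://bank.example/transfer", "click"))
```

The block list is checked first, so a site that somehow appears in both lists stays blocked. That ordering is a design choice in this sketch, not a documented behavior of the extension.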

Mistral Launches OCR 3 with a 74% Win Rate

Mistral OCR 3, launched in December 2025, sets a new benchmark in document AI with a 74% win rate over Mistral OCR 2 on forms, scans, complex tables, and handwriting, outpacing enterprise and AI-native OCR solutions, including Google's. This compact model extracts text and images from PDFs and images into markdown with HTML tables (colspan/rowspan preserved), feeding RAG, agent, and search pipelines. It excels at cursive handwriting, dense forms (invoices/receipts), low-DPI scans, and multilingual docs, and processes 10K+ pages/min on one node. It is priced at $2 per 1K pages ($1 via the Batch API) and is available via API (mistral-ocr-2512) or the Document AI Playground in Mistral Studio, which makes it a fit for digitizing archives, parsing ops docs, and enterprise search. Customers praise its fidelity on invoices and reports. It is backward-compatible with OCR 2, powers Mistral Studio's drag-and-drop UI for instant text/JSON output, and beats prior generations on real business benchmarks scored by fuzzy-match accuracy.
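
As a quick back-of-the-envelope aid, the pricing quoted above ($2 per 1K pages, $1 via the Batch API) can be turned into a small estimator. This is only a sketch: the function name `ocr_cost_usd` is made up for illustration, and it assumes simple linear per-page pricing with no minimums, tiers, or rounding.

```python
def ocr_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimate OCR cost from the per-1K-page prices quoted in this post.

    Assumes linear pricing: $2 per 1,000 pages, or $1 per 1,000 via batch.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate_per_1k = 1.0 if batch else 2.0
    return pages / 1000 * rate_per_1k


archive_pages = 250_000  # hypothetical archive size
print(f"standard: ${ocr_cost_usd(archive_pages):,.2f}")
print(f"batch:    ${ocr_cost_usd(archive_pages, batch=True):,.2f}")
```

Under these assumptions, a 250,000-page archive would come to $500 at the standard rate and $250 through batch processing.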

Luma Ray3 Modify: Start/End Frames Unlock Hollywood VFX Magic

Luma AI's Ray3 Modify, released Dec 18, 2025 via Dream Machine, transforms video editing by generating footage from start/end keyframes while preserving original human performances (motion, timing, eye line, emotion) and using character reference images to swap appearances, scenes, or styles with Hollywood-grade fidelity. This hybrid AI-human workflow enables precise spatial continuity for complex shots, reliable physics (water/smoke sims), and VFX for ads, film, and games; it builds on Ray3's HDR video and Draft Mode for rapid ideation. Highlights include the first video-to-video keyframe control, character consistency across shots, and pro-level physics adherence. The free tier covers the basics, and Pro unlocks 4K/HDR exports at lumalabs.ai/dream-machine, a fit for studios ditching random text-to-video for predictable results.

Hand Picked Video

BG Blur transformed from a basic blur tool into a powerful AI video enhancer with a minimal design, smarter background blur, privacy upgrades, and live previews for creators.

Top AI Products from this week

Notify Human Agent - With Notify Human Agent, AskYura automatically alerts a human agent when a conversation needs manual attention, and includes a clear, AI-generated reason for the handover. Agents instantly understand the context and can take over smoothly, without users repeating themselves. We built this feature because AI support shouldn’t replace humans, it should know when to step aside.
Ideation Your Creative Idea - Ideation - brainstorm with AI... but keep full control over your ideas. Most AI assistants handle everything for you from start to finish. Our app doesn’t solve your problems. It nudges your thinking. Short hints and questions help you look at topics from new angles and unlock your creativity.
GoConvert - GoConvert.pro is your AI-powered digital Swiss Army knife. Convert PDF to Word, compress images, merge documents, and extract text with high-accuracy OCR, all in one place. Unlike clunky alternatives, we offer a clean, lightning-fast experience with no registration required and no hidden paywalls.
ScreenWith.me - ScreenWith.me helps companies automate early-stage interviews using AI. Candidates complete structured interview questions asynchronously, while recruiters review insights, notes, and responses in a centralized ATS dashboard. Perfect for teams hiring remotely or at scale.
Reavil - Build better products using customer feedback collected via our lightweight widget. Friction points identified, every complaint analysed, every fix prioritised. Reduce time spent on analysis and churn rate. Built for product teams to stay ahead of the game. Reavil what frustrates users, build what they love in return!

This week in AI

GPT-5.2 Codex - OpenAI's GPT-5.2 Codex upgrades coding agent with dynamic thinking (seconds to 7+ hrs), 51% accuracy on repo-scale refactors/reviews—beats GPT-5 on SWE-bench. Live for Plus/Pro users in CLI/IDE/GitHub.
Angular v21 - Angular v21 (late 2025) ships Signal Forms, zoneless change detection by default, Vitest testing, @angular/aria accessibility, MCP AI server, HttpClient auto-included. Modernizes reactivity/performance.
NVIDIA Robotics - NVIDIA's latest robotics push: Isaac GR00T N1 open foundation model for humanoid reasoning/skills, Newton physics engine in Isaac Lab, synthetic data blueprints—accelerates robot dev from sim to real-world.
T5Gemma 2 - Google's T5Gemma 2 (Dec 2025): Compact encoder-decoder models (270M/1B/4B) from Gemma 3—multimodal (text+images@896x896), 128K context, 140+ languages. Tops VQA/reasoning/coding vs Gemma 3. Open on HF.
FunctionGemma - Google's FunctionGemma (Dec 2025): Gemma 3 270M fine-tuned for on-device function calling—turns natural language into API actions/JSON. 85% accuracy on mobile tasks, runs offline on phones/Jetson. Privacy-first edge AI agents.
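
For readers new to function calling, the idea behind models like FunctionGemma can be sketched in a few lines: the model emits a structured call, and the app looks up and runs the matching function. The JSON shape (`name` / `arguments`), the `set_alarm` tool, and the `dispatch` helper below are hypothetical illustrations, not FunctionGemma's actual output format or API.

```python
import json


def set_alarm(time: str) -> str:
    """Hypothetical on-device action."""
    return f"alarm set for {time}"


# Registry of functions the app is willing to expose to the model.
TOOLS = {"set_alarm": set_alarm}


def dispatch(model_output: str) -> str:
    """Parse a JSON function call and run the matching registered tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Assumed model output for a request like "wake me at 7:30".
output = '{"name": "set_alarm", "arguments": {"time": "07:30"}}'
print(dispatch(output))
```

Keeping an explicit registry means the model can only trigger functions the app has chosen to expose, which matters for agents running offline on a user's device.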

Paper of The Day

"Discovering and Learning Probabilistic Models of Black-Box AI Capabilities" introduces PCML, an active learning algorithm that probes black-box agents via capability queries, building sound and complete pessimistic/optimistic models using MCTS planning. It converges to the true models and excels on Minigrid/Saycan/Overcooked, reducing uncertainty for safer AI alignment.

To read the whole paper 👉️ here