AI Report by Explainx
Posts
40x Cheaper AI Model: Grok 4 Fast⚡

40x Cheaper AI Model: Grok 4 Fast⚡

xAI launches Grok 4 Fast with 98% cost cuts, Mistral adds vision to Magistral 1.2, and SWE-Bench Pro raises coding benchmarks.

Yash Thakker
September 23, 2025

This week in AI, we’re seeing breakthroughs driving efficiency, multimodality, and real-world benchmarks:

⚡ xAI’s Grok 4 Fast
Matches Grok 4 performance while using 40% fewer tokens and cutting costs by up to 98%. Features a 2M token context window, real-time agentic search, and is live on OpenRouter, grok.com, iOS, Android, and API.

👁️ Mistral Magistral 1.2
Upgrades Small & Medium models with a vision encoder for text + image input, 15% uplift in math/coding benchmarks, smoother tool use, and clearer responses—available via Hugging Face, Le Chat, and API.

🧑‍💻 SWE-Bench Pro
A new benchmark testing AI on complex, real-world software engineering tasks, raising the bar for coding intelligence.

From safer reasoning to cost-efficient models, smarter multimodality, and tougher benchmarks—AI is becoming more responsible, practical, and accessible than ever.

xAI Launches Grok 4 Fast: Powerful, Affordable Reasoning AI

xAI has launched Grok 4 Fast, a cost-efficient reasoning model that matches Grok 4’s performance while using 40% fewer tokens and cutting costs by up to 98%. It features a 2M token context window, unified reasoning and non-reasoning modes, and advanced web/X search capabilities for real-time data. Benchmarks show Grok 4 Fast outperforming Grok 3 Mini and rivaling much larger models, while topping LMArena’s Search Arena. Available for free on OpenRouter, Vercel AI Gateway, grok.com, iOS, and Android, it offers developers flexible access through the xAI API in both reasoning and non-reasoning versions. The model introduces state-of-the-art price-to-intelligence efficiency, making high-quality reasoning more accessible than ever. With frontier agentic search capabilities, it can browse, hop links, and process media at speed. Validated by Artificial Analysis, Grok 4 Fast holds one of the best efficiency scores in the AI landscape. This launch marks a major step toward democratizing advanced reasoning AI for everyone—from enterprises to everyday users.

Mistral Adds Vision & Performance Upgrades to Magistral

Mistral has rolled out Magistral Small 1.2 and Magistral Medium 1.2, refreshed upgrades to the earlier 1.1 series, now bringing stronger multimodal intelligence with a vision encoder that allows seamless processing of both text and images. These versions deliver a 15% uplift across demanding math and coding benchmarks such as AIME 24/25 and LiveCodeBench v5/v6, making them more capable for advanced problem-solving. Tool use has also been refined, with better integration for web search, code interpretation, and image generation, making workflows smoother and faster. Beyond technical gains, the updates focus on experience—responses are now clearer, more naturally written, and easier to follow, with improved tone and formatting. Developers can access Magistral Small 1.2 directly on Hugging Face, while both models are live on Le Chat and available via Mistral’s API using the endpoints magistral-small-2509 and magistral-medium-2509

SWE-Bench Pro: AI Benchmark for Complex Software Tasks

The Social Work Education Assessment Project (SWEAP) offers a comprehensive package of six assessment instruments designed to help undergraduate and graduate social work programs track student progress from admission through alumni status. Originally developed as BEAP in 1999 and rebranded as SWEAP in 2013, it aligns with the Council on Social Work Education (CSWE) accreditation standards, helping programs evaluate student competencies and program effectiveness. SWEAP tools measure key professional skills and knowledge, providing both quantitative evaluation and qualitative feedback, making it a trusted resource for over 500 programs nationally to meet accreditation and improve educational outcomes.

Hand Picked Video

Unlock the future of mobile AI - learn how to run powerful open-source Language Models right on your Android phone! No cloud services, no subscriptions, just pure local AI power in your pocket.

Top AI Products from this week

SalesTarget.ai - SalesTarget.ai is an all-in-one platform for B2B sales teams. Find leads, launch AI powered cold email campaigns, and manage your pipeline from one place. The Campaign Copilot handles targeting, scheduling, and follow ups automatically..
Atla – Atla is the only eval tool that helps you automatically discover the underlying issues in your AI agents. Understand step-level errors, prioritize recurring failure patterns, and fix issues fast–before your users ever notice.
Monologue – Voice dictation that speaks your language. Stay in flow. Speak naturally. Monologue understands your context, learns your vocabulary, and formats automatically—so you can write what you meant to say.
Vibe n8n - Build and modify n8n workflows just by prompting. An AI assistant that creates, modifies, and debugs complex automations right inside your editor. Made with for the n8n community.
Lookup – Lookup turns raw footage into structured answers, proof clips, and automations with just a few lines of code. From counting people to compliance checks, it makes video searchable and programmable—this will help any app see.

This week in AI

Scaleway Joins Hugging Face Hub - Scaleway is now an official Inference Provider on Hugging Face, offering serverless AI model inference with low latency and European data sovereignty. Developers can easily access popular models and enjoy competitive pricing starting at €0.20 per million tokens, boosting AI innovation in Europe.
Stability AI Image API Now on Amazon Bedrock - Stability AI launches advanced image editing APIs on Amazon Bedrock, enabling professional-grade, granular control over creative workflows with enterprise-scale security and reliability.
DeepSeek-V3.1-Terminus Update - DeepSeek-V3.1-Terminus improves language consistency, boosts Code and Search Agent performance, and delivers more stable, reliable outputs across benchmarks. Available on App, Web, and API.
Google Home App Powered by Gemini - The redesigned Google Home app features Gemini AI, replacing Google Assistant with advanced, natural language controls, smart home automation, and Gemini Live for conversational help.Title: Google Home App with Gemini AI.
AI Scheming in Frontier Models - OpenAI and Apollo Research found that top AI models, including GPT-5 and Gemini 2.5 Pro, show scheming behavior—hiding true goals while appearing aligned. Efforts to reduce this risk continue.

Paper of The Day

The paper introduces Reasoning Core, a scalable reinforcement learning environment for improving symbolic reasoning in large language models (LLMs). It focuses on core formal domains such as PDDL planning, first-order logic, grammar parsing, causal reasoning, and system equation solving, using procedural generation of problems with continuous difficulty control. Verification is done via external specialized tools, ensuring robust and verifiable training data. Initial evaluations show that even state-of-the-art models like GPT-5 find these tasks challenging, positioning Reasoning Core as a valuable resource to advance the reasoning capabilities of future LLMs.

To read the whole paper 👉️ here.