- AI Report by Explainx
AI That’s Fast Enough for Your Pocket
Liquid AI boosts on-device speed with LFM2-8B-A1B, Kwaipilot advances coding with KAT-Dev-72B-Exp, and Microsoft’s UserLM-8b reshapes conversational AI simulation.
The AI world keeps evolving at edge speed — and this week, it’s all about efficiency, intelligence, and simulation.
⚡ Liquid AI introduces LFM2-8B-A1B, an on-device powerhouse that delivers the performance of large 3–4B dense models while activating just 1.5B parameters per token — enabling faster, low-latency AI experiences on smartphones, tablets, and laptops.
💻 Kwaipilot’s KAT-Dev-72B-Exp redefines coding intelligence with its 72-billion-parameter architecture and reinforcement learning for code correctness — achieving top results on SWE-Bench Verified and empowering developers with smarter debugging, refactoring, and reasoning tools.
🗣️ Microsoft unveils UserLM-8b, a novel model that simulates the user role in conversations to help researchers build and evaluate more natural, robust assistant AIs — advancing the realism and depth of multi-turn dialogue training.
From on-device breakthroughs to developer copilots and next-gen conversational simulation, this week’s updates show how the AI frontier continues to push boundaries in speed, intelligence, and interaction.
LFM2-8B-A1B: Powerhouse AI for On‑Device Performance

LFM2-8B-A1B from Liquid AI is an efficient on-device Mixture-of-Experts (MoE) model: it has 8.3 billion total parameters but activates only about 1.5 billion per token, letting it match the quality of larger 3–4 billion parameter dense models while running faster and with lower latency on smartphones, tablets, and laptops. It uses a hybrid architecture of 18 gated convolution blocks and 6 grouped-query attention blocks, incorporates sparse MoE feed-forward networks with 32 experts per layer (top-4 active), and supports a 32K-token context length. Trained on a diverse multilingual, code-heavy dataset of about 12 trillion tokens, it excels at instruction following, math, coding, and creative writing, and it is compatible with popular inference frameworks and optimized for edge-hardware performance and energy efficiency. Quantized variants fit comfortably on modern high-end consumer devices, making it a strong choice for private, fast, and capable AI applications at the edge.
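The "32 experts per layer, top-4 active" routing can be sketched as a toy function. This is an illustrative reconstruction of generic top-k MoE gating, not Liquid AI's actual implementation; the hidden size and gate weights below are made-up placeholders.

```python
import numpy as np

def moe_route(hidden, gate_w, top_k=4):
    """Pick the top-k experts for one token and softmax-normalize their gate scores."""
    logits = hidden @ gate_w                    # one gate score per expert
    top = np.argsort(logits)[-top_k:][::-1]     # indices of the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                    # renormalized softmax over active experts only
    return top, weights

rng = np.random.default_rng(0)
hidden = rng.standard_normal(64)                # toy hidden state
gate_w = rng.standard_normal((64, 32))          # 32 experts per layer, as in LFM2-8B-A1B
experts, weights = moe_route(hidden, gate_w)
print(len(experts), round(float(weights.sum()), 6))  # → 4 1.0
```

Only the four selected experts' feed-forward networks run for this token, which is why per-token compute tracks the ~1.5B active parameters rather than the 8.3B total.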
Code Faster, Smarter with KAT-Dev-72B-Exp

KAT-Dev-72B-Exp is a powerful, open-source 72-billion-parameter AI model from Kwaipilot designed to act as a comprehensive software-engineering assistant. It achieves top accuracy on benchmarks like SWE-Bench Verified by combining architectural innovations, such as a rewritten attention kernel for long code contexts, with novel reinforcement learning methods that encourage exploration and code correctness. The result is a model highly capable at code debugging, refactoring, and reasoning beyond simple completion, with support for multiple programming languages and access via Hugging Face or StreamLake's cloud platform.
Microsoft’s UserLM-8b for Conversational AI

Microsoft’s UserLM-8b is a distinctive large language model designed to simulate the “user” role in conversations, rather than acting as an assistant. Trained on the WildChat dataset, it predicts user turns to generate realistic multi-turn dialogues from a given task intent, which helps researchers develop and evaluate more robust assistant models. Unlike typical assistant LLMs, UserLM-8b focuses on generating user utterances: first-turn statements, follow-up responses, and appropriate conversation endings. The model is fine-tuned from Llama 3 8B and excels at simulating diverse user behavior with improved alignment and robustness compared to prior methods. Intended primarily for research use, especially assistant evaluation, it is not recommended for direct end-user applications. UserLM-8b supports applications like user modeling, judge models, and synthetic data generation, but it may hallucinate or deviate from the task intent, as is typical of generative AI. Training required significant computational resources, and the release encourages adopting generation guardrails for effective simulation. The goal is to advance conversational AI development by providing a realistic user-simulation environment in which to stress-test assistant LLMs.
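The evaluation setup described above can be sketched as a simulation loop in which a user model and an assistant model alternate turns. Everything here is a hypothetical stand-in: the `END_TOKEN` sentinel, the callable model interfaces, and the scripted toy user are assumptions for illustration, not Microsoft's actual API.

```python
END_TOKEN = "<|endconversation|>"  # assumed sentinel meaning "the user is done"

def simulate_dialogue(task_intent, user_model, assistant_model, max_turns=6):
    """Alternate user-simulator and assistant turns until the user ends the chat."""
    history = []
    for _ in range(max_turns):
        user_turn = user_model(task_intent, history)  # user model speaks first each round
        if user_turn == END_TOKEN:
            break
        history.append(("user", user_turn))
        history.append(("assistant", assistant_model(history)))
    return history

# Toy stand-ins: a scripted user and an echoing assistant.
script = iter(["Help me sort a list in Python.", "What about descending order?", END_TOKEN])
user = lambda intent, hist: next(script)
assistant = lambda hist: f"Answering: {hist[-1][1]}"

dialogue = simulate_dialogue("sorting help", user, assistant)
print(len(dialogue))  # → 4 (two user/assistant exchanges before the end token)
```

Swapping the scripted user for UserLM-8b and the echo for a candidate assistant is the core idea: the simulator drives realistic multi-turn pressure against the assistant under test.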
Hand Picked Video
OpenAI just released their most comprehensive study ever, analyzing over 1 million conversations from 700 million users worldwide. The findings reveal surprising shifts in how we're actually using AI.
Top AI Products from this week
Traycer AI - Traycer helps devs plan-first, code-faster with spec‑driven development. It breaks down high-level intents into structured plans, hands off to your favorite AI agent, then verifies changes so your large codebase stays solid.
Orchids - Build apps and websites by chatting with an AI Full Stack Engineer
SigmaMind AI - SigmaMind AI is a YC-backed conversational AI platform that lets you build, test, and deploy voice, chat, and email agents for sales and customer service. Use our no-code builder to create complex multi-prompt AI agents, or integrate directly with APIs into your existing systems.
myNeutron - Lose context between AI chats and LLMs? myNeutron gives you portable AI memory: capture from the web, Gmail, and Drive, search semantically, then drop the right context into any model in one click or through our MCP.
MCP360 - MCP360 is the App Store for AI agents—instantly equip your AI with 100+ tools via a single ready to use MCP Platform. Supercharge AI Workflows, unlock new capabilities—all with effortless setup and one unified platform.
GenZai - GenZai is the ultimate AI-powered development platform for Gen Z developers. Transform your ideas into production-ready websites with just a prompt. Experience vibe coding with instant code generation, live preview, and one-click deployment.
This week in AI
Unitree G1 Selling at Walmart - Walmart ships the Unitree G1 humanoid robot in the US for $21,600. The basic model features 1.32 m height, 23 DOF, a 2 kg load capacity, a 2-hour battery, free shipping, and batch orders of up to six units.
Qwen3-VL-30B-A3B - Advanced vision-language AI excels in text and visual comprehension, long-context video and spatial reasoning, plus robust multimodal task handling for versatile deployment.
DC Comics AI Ban - DC Comics bans AI-generated art and stories "not now, not ever," emphasizing the authenticity of human creativity, a firm stance amid backlash over AI use in comics.
Tulloch Joins Meta - Andrew Tulloch, co-founder of Thinking Machines Lab and former OpenAI and Meta engineer, left the startup to join Meta, pursuing new AI opportunities.
Intel Advances Humanoid Robots - Intel introduces Panther Lake chip and Robotics AI Suite, delivering high AI performance and energy efficiency to power advanced humanoid robots on compact hardware.
Paper of The Day
Google DeepMind’s SCoRe is a novel AI training method that enables large language models to learn from their own mistakes through a two-stage reinforcement learning process. Instead of relying on manual corrections or prompt engineering, SCoRe uses the model’s own generated data to develop self-correction strategies. The first stage ensures the model maintains its initial imperfect responses, creating a learning environment for meaningful error correction. The second stage applies multi-turn reinforcement learning with carefully designed rewards, encouraging effective fixes while avoiding trivial adjustments. This approach significantly enhances accuracy in tasks like math and coding, with SCoRe showing up to a 23% gain in mathematical reasoning benchmarks and increased code generation performance. It also reduces error rates by preventing the model from changing correct answers unnecessarily. SCoRe’s autonomous self-improvement advances AI reliability and efficiency without requiring retraining, representing a major step toward AI systems that gain experience and improve over time.
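The second-stage reward shaping described above can be sketched as a toy function: reward a revised answer for being correct, give a bonus for a genuine fix, and heavily penalize breaking an answer that was already right. The specific reward values are illustrative assumptions, not the ones used in the SCoRe paper.

```python
def self_correction_reward(first_correct, second_correct, flip_penalty=1.5):
    """Toy shaped reward in the spirit of SCoRe's multi-turn RL stage."""
    reward = 1.0 if second_correct else 0.0   # accuracy of the revised answer
    if not first_correct and second_correct:
        reward += 0.5                         # bonus for a genuine self-correction
    if first_correct and not second_correct:
        reward -= flip_penalty                # heavily penalize breaking a correct answer
    return reward

print(self_correction_reward(False, True))   # → 1.5  (fixed a mistake)
print(self_correction_reward(True, False))   # → -1.5 (broke a correct answer)
print(self_correction_reward(True, True))    # → 1.0  (kept the correct answer)
```

The asymmetry is the point: the model earns more by actually repairing errors than by leaving answers untouched, while unnecessary edits to correct answers are made strictly unprofitable.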
To read the whole paper 👉️ here.