AI Report by Explainx

Top AGI-Level Models Leaderboard

ARC-AGI sets new standards for efficient reasoning, Meta AI delivers real-time global news, and Rnj-1 brings powerful open-source coding intelligence to everyone.

Yash Thakker
December 09, 2025

AI is evolving faster than ever, moving toward systems that reason efficiently, deliver information in real time, and empower developers with powerful open-source tools. The latest breakthroughs show exactly where the next wave is headed:

🧠 ARC-AGI Leaderboard is redefining what “true intelligence” means, spotlighting models that excel not just in accuracy but in cost-efficient reasoning and adaptive problem-solving.

📰 Meta AI’s Real-Time Content Engine brings live global news, multi-source perspectives, and dynamic updates straight into Meta’s apps, making information discovery faster, richer, and more personalized.

💻 Essential AI’s Rnj-1 emerges as a top open-source 8B model for coding and STEM, with strong long-context reasoning, agentic capabilities, and performance that rivals much larger models.

AI isn’t just advancing; it’s becoming smarter, faster, and more accessible.

Measuring True Intelligence: A Deep Dive Into the ARC-AGI Leaderboard

The ARC Prize leaderboard tracks the latest advancements in artificial general intelligence (AGI), focusing on systems that demonstrate both high performance and cost efficiency in solving complex reasoning tasks. The ARC-AGI benchmark has evolved from ARC-AGI-1, which measured basic fluid intelligence, to ARC-AGI-2, which emphasizes adaptability and efficiency. The leaderboard highlights the relationship between cost-per-task and performance, showing that true intelligence involves solving problems efficiently rather than just achieving high scores. Only systems with a cost below $10,000 are included, and incomplete submissions are penalized by marking remaining tasks as incorrect. Results marked as "preview" are unofficial and may be based on partial testing. The leaderboard features various model types, including base large language models (LLMs), reasoning systems, and custom-built competition entries, allowing direct comparison of their efficiency and effectiveness. This resource is essential for researchers and developers aiming to benchmark AGI progress and understand the current state-of-the-art in AI reasoning and efficiency.
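
To make the scoring rules above concrete, here is a minimal Python sketch of how such a ranking could work. It is an illustration only, not ARC Prize's actual code: the Submission class, its field names, and the tie-breaking rule are hypothetical, and it assumes the $10,000 threshold applies to the total cost of a run and that cost-per-task is total cost divided by the full task count.

```python
from dataclasses import dataclass

# Inclusion threshold quoted in the text; assumed here to mean total run cost.
COST_CAP_USD = 10_000


@dataclass
class Submission:
    """Hypothetical record for one leaderboard entry."""
    name: str
    tasks_total: int       # size of the full evaluation set
    tasks_attempted: int   # kept for reference; not used in the score
    tasks_correct: int
    total_cost_usd: float

    @property
    def score(self) -> float:
        # Unattempted tasks count as incorrect, so divide by the full set,
        # not by the number of tasks attempted.
        return self.tasks_correct / self.tasks_total

    @property
    def cost_per_task(self) -> float:
        # Assumption: cost is averaged over the full task set.
        return self.total_cost_usd / self.tasks_total


def leaderboard(submissions):
    """Drop entries at or above the cost cap, then rank the rest.

    Ranking by score first and cheaper cost-per-task second is an
    assumption made for this sketch.
    """
    eligible = [s for s in submissions if s.total_cost_usd < COST_CAP_USD]
    return sorted(eligible, key=lambda s: (-s.score, s.cost_per_task))
```

Under these assumptions, an entry that answers 50 of 60 attempted tasks correctly on a 100-task set scores 0.5, not 0.83, and an entry costing $20,000 is left off the board no matter how accurate it is.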

Meta AI Expands Real-Time News and Content Access

Meta AI has expanded its ability to deliver real-time content, offering users global news, entertainment, lifestyle stories, and more directly within its apps and devices. By partnering with outlets like CNN, Fox News, Le Monde Group, People Inc., and USA TODAY, Meta aims to provide timely, diverse, and balanced information tailored to individual interests. These updates facilitate easier access to articles by linking out to partner websites, helping publishers reach new audiences while enriching user experience. Meta is committed to enhancing AI responsiveness and accuracy, ensuring coverage of breaking news with multiple viewpoints and content types. This allows users to spark new ideas, edit media, explore topics in-depth, and stay informed with up-to-date information. Continuously adding new content sources and features, Meta AI’s evolving system strives to make information discovery more dynamic, personalized, and timely, supporting users in navigating the fast-changing digital world more effectively. Expect ongoing innovations to improve how AI interacts with real-time news and content delivery, broadening the scope and relevance of what Meta AI can offer.

Essential AI Launches Rnj-1: A Leading Open-Source 8B LLM for Code and STEM

Essential AI has launched Rnj-1, an 8-billion-parameter open-source language model built on the Gemma 3 architecture, with global self-attention and YaRN for a 32k context length. Rnj-1 excels in code generation, agentic coding, and scientific reasoning, outperforming many larger open models on benchmarks like SWE-bench and HumanEval+. The model is robust to quantization, allowing high-speed inference with minimal loss in quality, and shows top-tier tool use and mathematical problem-solving abilities. Developed by a team including Transformer co-creator Ashish Vaswani, Rnj-1 is designed for flexibility and long-term agentic capabilities, making it a leading open-weight choice for software engineering and STEM tasks. This release strengthens the open-source ecosystem, providing a powerful tool for developers and researchers seeking advanced AI capabilities in a transparent, accessible format.

Hand Picked Video

In this video, we’ll look at how Google’s Gemini Nano Banana model edits complex images, replacing objects, improving vibes, restoring photos, and even making creative transformations.

Top AI Products from this week

GLM 4.6V - GLM-4.6V is GLM's newest open-source multimodal model with a 128k context window. It features native function calling, bridging visual perception with executable actions for complex agentic workflows like web search and coding.
Lyria Camera by Google DeepMind - Lyria Camera turns your surroundings into a live soundtrack. Using Gemini’s visual understanding and the Lyria RealTime API, it transforms what your camera sees into evolving music that matches your mood and movement, turning every moment into an immersive audiovisual experience.
thefrontkit - Design + code UI kits that feel like a real app, not a mock. Token-driven theming, accessibility out of the box, and runnable Next.js + Tailwind examples. Launching two Full kits: AI UX.
Cosmic AI Agents - Autonomous AI assistants that build features, fix bugs, and generate content for you. Give your agent instructions, and let it work in the background to build and optimize your application and content. Code agents create isolated branches and PRs.
Kerno - Never manually write or maintain backend tests again! Kerno automates integration testing end-to-end inside your IDE. It understands your codebase, generates meaningful tests, spins up test environments, and executes tests.
WarpGrep - Introducing WarpGrep, a fast context subagent that improves coding agent performance. WarpGrep speeds up coding tasks by 40% and reduces context rot by 70% on long-horizon tasks by treating context retrieval as its own RL-trained system.
EpsteinGPT - I connected the Epstein files to a deep learning AI researcher. As many of you know, the Epstein files were released a few weeks ago, with over 20,000 individual text and image documents.

This week in AI

Gemini Nano Banana 2 Flash Launching Soon - Google prepares to launch Gemini Nano Banana 2 Flash, offering Pro-level performance at lower costs for broader accessibility within Gemini.
Antigravity Rate Limits Favor Pro/Ultra - Google Antigravity now prioritizes Pro and Ultra subscribers with higher, faster-refreshing rate limits for agentic development, while free users get a larger weekly quota.
PNNL Powers U.S. Bioeconomy with AI - PNNL and the Department of Energy launch AMP2, an AI-driven platform to accelerate biofuels, chemicals, and biomaterials research using microbes.
Adversarial Poetry Universal AI Jailbreak - Poetic prompts bypass AI safety, achieving up to 90% attack success across major models, exposing systemic vulnerabilities in alignment.
NYT Sues Perplexity AI - The New York Times has sued Perplexity AI, alleging it copied and used millions of its articles without permission to train its generative AI products.

Paper of The Day

The paper "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism" reveals that poetic reformulation of harmful prompts can bypass safety mechanisms in large language models, achieving high attack success rates across providers and risk domains. Poetic prompts consistently outperform standard prose, even when automatically generated, and the vulnerability extends to multiple categories, exposing a fundamental gap in current alignment approaches and highlighting the need for robust defenses against stylistic obfuscation.

To read the whole paper 👉️ here