AI Report by Explainx
Gemini Powered Coding AI Agent👨‍💻

DeepMind boosts AlphaEvolve with Gemini 2.5, Microsoft unveils smarter AI testing with ADeLe, and Google preps a dev-focused AI agent ahead of I/O 2025.

May 15, 2025

Big moves from DeepMind this week: they’ve upgraded their AlphaEvolve system with the new Gemini 2.5 Pro model, making it smarter at optimizing code, solving complex math, and even improving Google’s data center efficiency. The results are already impressive—faster matrix operations, better GPU performance, and real-world infrastructure gains. AlphaEvolve is now also being prepped for early access, opening the door for researchers to experiment with it directly.

Over at Microsoft, a team of researchers introduced ADeLe, a fresh take on AI evaluation that goes beyond just checking accuracy. Instead, it builds detailed “ability profiles” for models like GPT-4o and LLaMA-3, predicting how they’ll perform on unfamiliar tasks—and why. It’s a smarter way to catch hidden flaws and better understand what a model can (and can’t) do before deploying it in the real world.

Meanwhile, Google is gearing up for a major reveal at its upcoming I/O conference. Among the highlights: a new AI agent built specifically for software developers, aimed at helping with everything from coding tasks to documentation. The company is also planning to showcase Gemini’s voice mode working hands-free with Android XR devices, signaling its bigger vision for ambient, wearable AI.

DeepMind’s AI Agent for Code and Math

Google DeepMind recently announced an update to its AI system AlphaEvolve, powered by the Gemini 2.5 Pro large language model. This update enhances AlphaEvolve’s capability to evolve and optimize complex algorithms across diverse domains such as mathematics, chip design, and data center management. The system now combines the creativity of Gemini models with automated evaluators in an evolutionary framework that iteratively improves algorithmic solutions by generating, verifying, and refining code. The update has already led to practical efficiency gains, including a 0.7% improvement in Google’s global data center compute resource utilization, a 23% speedup in matrix multiplication kernels critical for AI training, and a 32.5% acceleration in GPU kernel performance for Transformer models. AlphaEvolve’s enhanced models also enable it to tackle larger codebases and more complex problems than before, advancing open mathematical problems and optimizing AI infrastructure more effectively. DeepMind is preparing an early access program for researchers to interact with this updated system, signaling broader deployment plans.
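To make the generate, verify, refine loop concrete, here is a minimal toy sketch in Python. It is not AlphaEvolve's code: the `propose` function is a random perturbation standing in for the Gemini proposal step, the `evaluate` function is a made-up objective (recovering the line y = 3x + 2), and all names and parameters are assumptions chosen for illustration.

```python
import random


def evaluate(candidate):
    """Automated evaluator (toy, assumed objective): higher is better.

    Scores how closely the line a*x + b matches the target 3*x + 2.
    """
    a, b = candidate
    return -sum((a * x + b - (3 * x + 2)) ** 2 for x in range(-5, 6))


def propose(parent, rng):
    """Stand-in for the LLM proposal step: randomly perturb a parent."""
    return tuple(p + rng.gauss(0, 0.5) for p in parent)


def evolve(generations=200, population_size=8, seed=0):
    """Generate a candidate, score it, and keep only the best performers."""
    rng = random.Random(seed)
    population = [(0.0, 0.0)]  # naive starting solution
    for _ in range(generations):
        parent = rng.choice(population)      # pick something to improve
        child = propose(parent, rng)         # generate a variant
        population.append(child)
        # verify with the evaluator and keep the top scorers
        population = sorted(population, key=evaluate, reverse=True)
        population = population[:population_size]
    return population[0]


best = evolve()
print(best, evaluate(best))
```

Because the best candidate is never discarded, the score can only stay the same or improve from one generation to the next. In a real system, the evaluator would run and measure actual code rather than score two numbers.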

Smarter AI Testing with ADeLe

Researchers from Microsoft and collaborators have introduced a new evaluation method called ADeLe, which predicts and explains how AI models will perform on new and unfamiliar tasks. Unlike traditional benchmarks that mainly measure accuracy, ADeLe uses 18 different ability scales (covering cognitive skills, knowledge areas, and task-related factors) to create detailed ability profiles for each model. By matching these profiles to the demands of specific tasks, the system can forecast whether an AI will succeed or fail and explain why, achieving about 88% prediction accuracy for top models like GPT-4o and LLaMA-3.1-405B. This approach not only reveals hidden flaws in current AI testing methods but also provides a more transparent and reliable way to assess AI capabilities, paving the way for safer and more effective deployment of AI systems.
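The idea of matching an ability profile against task demands can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' predictor: the scale names, the numeric levels, and the rule "succeed only if ability meets demand on every scale" are all assumptions made for the example.

```python
def predict(ability_profile, task_demands):
    """Predict success and explain failures (assumed threshold rule).

    Both arguments map a scale name to a numeric level. A scale missing
    from the ability profile is treated as level 0.
    Returns (will_succeed, shortfalls), where shortfalls maps each
    under-powered scale to a (demand, ability) pair.
    """
    shortfalls = {
        scale: (demand, ability_profile.get(scale, 0))
        for scale, demand in task_demands.items()
        if demand > ability_profile.get(scale, 0)
    }
    return (not shortfalls), shortfalls


# Hypothetical profile and tasks, using three made-up scales.
model_profile = {"reasoning": 4, "knowledge": 3, "attention": 2}
easy_task = {"reasoning": 2, "knowledge": 3}
hard_task = {"reasoning": 5, "attention": 1}

print(predict(model_profile, easy_task))  # (True, {})
print(predict(model_profile, hard_task))  # (False, {'reasoning': (5, 4)})
```

The second result shows the "why" part: the prediction of failure comes with the specific scale where the task demands more than the model offers.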

Google to Unveil AI Software Agent for Developers

Ahead of its annual developer conference, Google I/O, scheduled for May 20, 2025, Google is developing a new AI agent designed to assist software engineers throughout the entire development process, from handling tasks to documenting code. The company has been demonstrating this tool to employees and external developers, aiming to improve productivity and streamline software creation. Additionally, Google plans to showcase the integration of its Gemini AI chatbot in voice mode with Android XR glasses and headsets, highlighting its broader AI ambitions. This move comes as Google faces increasing pressure from investors to show returns on its significant AI investments amid growing competition and regulatory challenges in its core search and advertising businesses.

Hand Picked Video

In this video, we'll look at building an autonomous AI agent using CrewAI.

Top AI Products from this week

LilysAI - Drowning in information, you're likely missing valuable insights. LilysAI instantly extracts key takeaways from YouTube, PDFs, and articles—letting you dive deeper with one-click tools like mind maps and more. Reading alone just isn't enough anymore.
Tolt - Tolt helps SaaS startups launch and grow affiliate programs. Get automated payouts, fraud protection, seamless integrations (Stripe, Paddle, Chargebee), and a branded affiliate portal. Track performance with detailed reports and drive extra revenue!
Bolto - Bolto is an all-in-one recruiting and HR platform. It uses AI to source, interview, and match software engineers to startups, then enables startups to pay and manage those engineers in the same place.
Generated Assets
Granola 2.0 - Introducing the next chapter for Granola: bringing all your team's conversations into one place, and unlocking them with AI
OpenMemory MCP - OpenMemory MCP is a private, local-first memory layer with a built-in UI, compatible with all MCP clients. It ensures all memory stays local, structured, and under your control with no cloud sync or external storage.

This week in AI

Notion AI Update - Notion launches AI Meeting Notes for instant transcription and summaries, Enterprise Search to find info across tools, Research Mode for auto-drafting docs, and Model Picker to chat with GPT-4.1 & Claude 3.7. AI is included in the Business plan.
OpenAI Chief Scientist on AI Research
Google AI on Gemini Prompting - Google AI Developers advise using step-by-step instructions, multishot examples, clear output definitions, and prompt debugging to boost Gemini’s reasoning and response accuracy. For long outputs, be specific and review reasoning carefully.
Tencent Hires WizardLM - Tencent poached Microsoft’s WizardLM team, known for AI models with a rocky past. Now under Tencent’s Hunyuan, they develop cutting-edge AI amid a $12.5B AI investment.
Harvey Expands AI Models - Harvey now integrates Anthropic and Google models alongside OpenAI, automatically routing each legal task to the best-suited model, enhancing accuracy and offering users a choice of model.