- AI Report by Explainx
50+ Customizable Claude Skills Just Dropped
Claude turns skills into reusable workflows, Waymo embeds Gemini as the robotaxi brain, and Meta’s PE-AV pushes open multimodal AI to new SOTA levels.
AI just shipped three power moves reshaping skills standardization, autonomous mobility, and multimodal intelligence. From reusable AI workflows to LLM-powered robotaxis and open multimodal encoders breaking SOTA, here’s what’s new:
🧩 Claude Skills Revolution — Standardizing AI Workflows at Scale
The ComposioHQ awesome-claude-skills repo curates 50+ portable, composable Claude Skills—turning AI into repeatable workflows across business, marketing, dev, creative, and ops, making Claude more tool-like, reliable, and production-ready.
🚕 Waymo × Gemini — The Brain Behind the Robotaxi
Leaked meta-prompts reveal Gemini powering Waymo’s in-car assistant—personalized, multimodal, safety-bounded, and human-friendly, signaling how LLMs are becoming the interface layer for autonomous mobility.
🎧 Meta PE-AV — Open Multimodal Intelligence Goes SOTA
Meta open-sources PE-AV, a unified audio-video-text encoder achieving state-of-the-art zero-shot results and powering SAM Audio, pushing multimodal perception closer to general-purpose understanding.
AI isn’t just getting smarter; it’s becoming standardized, embodied, and multimodal by default.
Claude Skills Revolution: 50+ Tools for Code, Marketing & More

The ComposioHQ/awesome-claude-skills GitHub repository curates practical Claude Skills: customizable workflows that standardize how Claude behaves (across Claude.ai, Claude Code, and the API) on tasks such as business processes, brand guidelines, and code execution. Key categories include Business & Marketing (e.g., Competitive Ads Extractor, Lead Research Assistant), Communication & Writing (Content Research Writer, Meeting Insights Analyzer), Creative & Media (Canvas Design, Slack GIF Creator, Video Downloader), Development (Artifacts Builder with React/Tailwind, Changelog Generator, D3.js Visualization, Playwright automation, MCP Builder, Webapp Testing), and Productivity (CSV Summarizer, File Organizer, Invoice Organizer). Each skill is a portable, composable folder containing a SKILL.md file (YAML frontmatter plus instructions and examples) and, optionally, scripts, templates, and resources. To get started, upload a skill in Claude.ai Settings, run Claude Code inside a skill folder, or call the API with the skills parameter. Best practices emphasize narrowly scoped tasks, testing, and error handling. Contributions come in via PRs that follow the repo's guidelines; the project is Apache 2.0 licensed (68 stars, 6 forks).
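To make the folder layout concrete, here is a minimal Python sketch that writes and reads back a SKILL.md file. It assumes the frontmatter holds flat `name` and `description` keys; the helper names (`render_skill_md`, `parse_skill_md`, `write_skill`) and the sample skill are hypothetical and do not come from the repo, and the parser handles only simple `key: value` lines rather than full YAML.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def render_skill_md(name: str, description: str, instructions: str) -> str:
    """Build SKILL.md text: a frontmatter block followed by Markdown instructions."""
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{instructions.strip()}\n"
    )


def parse_skill_md(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter dict, body).

    Assumption: frontmatter is flat 'key: value' pairs, not arbitrary YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = lines.index("---", 1)  # closing fence of the frontmatter
    except ValueError:
        raise ValueError("frontmatter fence is never closed") from None

    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def write_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    """Create <root>/<name>/SKILL.md and return its path."""
    folder = root / name
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "SKILL.md"
    path.write_text(render_skill_md(name, description, instructions), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill_path = write_skill(
            Path(tmp),
            "csv-summarizer",  # hypothetical skill name
            "Summarize a CSV file: columns, types, and basic statistics.",
            "# CSV Summarizer\n\nLoad the file, then report columns and row count.",
        )
        meta, body = parse_skill_md(skill_path.read_text(encoding="utf-8"))
        print(meta["name"], "-", meta["description"])
        print(body.splitlines()[0])
```

Optional scripts, templates, and resources would sit next to SKILL.md in the same folder; a real project would likely use a proper YAML parser instead of the line splitter above.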
Waymo's Secret Gemini AI: Your Robotaxi's New Brain Just Got Leaked!

Waymo is testing Google's Gemini AI as an in-car assistant for its robotaxis, a finding uncovered by researcher Jane Manchun Wong through app reverse-engineering. The "Waymo Ride Assistant Meta-Prompt," which runs to over 1,200 lines, positions Gemini as a friendly companion that answers rider questions, operates cabin controls (climate, lights, and music, but not volume, routes, seats, or windows), and offers reassurance in one to three simple sentences. It greets riders by name, draws on trip history, avoids jargon, and steers clear of commenting on the driving, discussing rivals such as Tesla and Cruise, or taking external actions like food orders. The assistant is not live yet, but Waymo is aiming for delightful rides; Gemini previously supported Waymo's training work through the EMMA multimodal model. Wong's 30-page report details UI prototypes (chat bubbles, voice waves), safety limits (no real-time driving input), hints of multilingual support, and integration with the Waymo One app for a seamless rider experience. This multimodal push feeds 2025's autonomy race, blending LLMs with AV technology for human-like interaction as fleets expand in San Francisco and Phoenix.
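The reported limits amount to an allowlist plus a length cap. As a rough illustration only, the Python sketch below encodes the same policy in code. The leaked material reportedly expresses these rules as prompt text, so every name here (`ALLOWED_CONTROLS`, `can_handle`, `clamp_reply`) is hypothetical and nothing below reflects Waymo's actual implementation.

```python
import re

# Controls the assistant may operate, per the reported meta-prompt.
ALLOWED_CONTROLS = frozenset({"climate", "lights", "music"})
# Controls the prompt reportedly excludes; listed for documentation only.
EXCLUDED_CONTROLS = frozenset({"volume", "routes", "seats", "windows"})
# Replies are reportedly kept to one to three simple sentences.
MAX_SENTENCES = 3


def can_handle(control: str) -> bool:
    """Allowlist check: anything not explicitly allowed is refused."""
    return control.strip().lower() in ALLOWED_CONTROLS


def clamp_reply(reply: str, max_sentences: int = MAX_SENTENCES) -> str:
    """Keep only the first few sentences of a reply.

    Sentences are split naively on '.', '!' or '?' followed by whitespace,
    which is good enough for a sketch but not for abbreviations.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", reply.strip()) if s]
    return " ".join(sentences[:max_sentences])


if __name__ == "__main__":
    for control in ("climate", "windows", "Music"):
        print(control, "->", "ok" if can_handle(control) else "refused")
    print(clamp_reply("Sure. Cooling the cabin now. Anything else? I can also dim the lights."))
```

Defaulting to refusal for anything outside the allowlist mirrors the safety-bounded framing: a control that is neither allowed nor explicitly excluded is still declined.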
The Future of Shopping? AI + Actual Humans.
AI has changed how consumers shop by speeding up research. But one thing hasn’t changed: shoppers still trust people more than AI.
Levanta’s new Affiliate 3.0 Consumer Report reveals a major shift in how shoppers blend AI tools with human influence. Consumers use AI to explore options, but when it comes time to buy, they still turn to creators, communities, and real experiences to validate their decisions.
The data shows:
Only 10% of shoppers buy through AI-recommended links
87% discover products through creators, blogs, or communities they trust
Human sources like reviews and creators rank higher in trust than AI recommendations
The most effective brands are combining AI discovery with authentic human influence to drive measurable conversions.
AI isn’t replacing affiliate marketing; it’s amplifying it.
Meta's PE-AV: Open-Source Audio-Video-Text Encoder Powers SAM Audio SOTA

Meta AI has open-sourced Perception Encoder Audiovisual (PE-AV), a unified encoder family for joint audio, video, and text understanding, trained with contrastive learning on roughly 100M audio-video pairs whose synthetic captions come from a two-stage data engine. Building on Perception Encoder (PE), it uses separate towers (PE frame/video encoders for visuals, DAC-VAE for 40 ms audio tokens) fused into a shared embedding space with text projections, enabling zero-shot retrieval and classification across 10 modality pairs without task-specific heads. PE-AV sets SOTA on AudioCaps (45.8 R@1), VGGSound (47.1% accuracy), ActivityNet (66.5 R@1), and Kinetics-400 (78.9% zero-shot), outperforming CLAP and ImageBind. A companion model, PE-A-Frame, localizes sound events frame by frame. PE-AV also powers SAM Audio's prompt-based separation (text, visual, and temporal prompts) and the Perception Models stack for multimodal tasks such as detection and reasoning. Six checkpoints are available, with code and paper on GitHub and arXiv.
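For readers unfamiliar with how a shared embedding space enables zero-shot retrieval, and what an R@1 score measures, here is a small Python sketch. It is not PE-AV code: the vectors are hand-made toy stand-ins for encoder outputs, and the function names are hypothetical. The idea is simply that a query from one modality (say, text) is matched to the nearest item from another (say, audio) by cosine similarity, and R@1 is the fraction of queries whose top match is the correct pair.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def retrieve(query, candidates):
    """Return the index of the candidate most similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


def recall_at_1(queries, candidates):
    """Fraction of queries whose top-ranked candidate is the paired one.

    Assumes query i is paired with candidate i.
    """
    hits = sum(1 for i, q in enumerate(queries) if retrieve(q, candidates) == i)
    return hits / len(queries)


if __name__ == "__main__":
    # Toy embeddings: in a real system these would come from the text
    # and audio towers after projection into the shared space.
    text_queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.0, 0.4]]
    audio_items = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
    for i, q in enumerate(text_queries):
        print(f"query {i} -> audio item {retrieve(q, audio_items)}")
    print("R@1:", recall_at_1(text_queries, audio_items))
```

Because no task-specific head is trained, the same nearest-neighbor step works for any pair of modalities that share the space; only the encoders that produce the vectors change.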
Hand Picked Video
In this video, we’ll look at the major updates to Olly, the AI social media automation tool that handles commenting, post generation, and engagement across all platforms. We’re covering the new UI overhaul, voice mode for comments, GPT-5 integration, and agency-focused features. Plus, enjoy our Christmas Sale with 30% off on any purchased plan.
Top AI Products from this week
DiffSense - DiffSense uses the native AFM 3B model on Apple Silicon to generate git commit messages for free. It runs locally with zero latency, ensuring your code stays private. Features customizable message styles and powerful alias macros.
Reddit Summarizer - Some long Reddit threads are a gold mine, but you have to dig through hundreds or thousands of comments to reach a conclusion. To simplify that, I automated it all as a Chrome extension, with customization to fit your own needs. With one click it fetches the metadata (with all comments), cleans it, sends it to your selected AI, and returns the gold you are looking for. Supported pages: threads, subreddits, and search results.
FinSight AI - FinSight is a subscription service that analyzes market information to provide insight on publicly traded companies. Based on several metrics, FinSight recommends whether the customer should buy, sell, or hold, and provides an in-depth explanation alongside the recommendation. FinSight is significantly less expensive than similar services.
Studio Zero Product Photo AI - Studio Zero is the ultimate AI tool for e-commerce owners and creators. Stop spending thousands on professional photoshoots. With Studio Zero, you can transform ordinary product photos into high-quality, studio-grade images instantly.
Keevx - Keevx is an AI marketing video platform for brands and sellers. Transform images into polished videos, convert product URLs into ready-to-publish promos, and generate multilingual digital avatars for demos and explainers — all with minimal time and resources.
Wedding Welcome Signs - 1HP is a proof-first workflow for founders. Add research notes from interviews, calls, and exploration. 1HP summarizes them into structured candidate insights with evidence. You approve what’s real, convert insights into decisions, and maintain a weekly.
This week in AI
Sigma's Eclipse Privacy-First AI Browser - Sigma Browser OÜ launched Eclipse, a privacy-focused Chromium-based browser with a built-in local LLM for offline AI chat, PDF processing, and no cloud data sharing. Unlike Chrome's Gemini or Perplexity's Comet, it keeps all queries local and gives unfiltered responses. It needs 16-32 GB of RAM and an RTX 3060 or better GPU.
MiniMax M2.1 Launch - MiniMax M2.1, released Dec 23, 2025, boosts multi-language coding & agent tasks. Tops VIBE-Web (91.5), excels in 3D apps, Android/iOS dev. Open-source on Hugging Face; API live.
NVIDIA Acquiring Groq - NVIDIA is acquiring Groq assets for $20B in cash (Dec 24, 2025), its largest deal. It gets an inference technology license and key talent, including CEO Ross. GroqCloud stays independent under a new CEO, previously the CFO. The move boosts NVIDIA's AI inference efforts.
GPT-5 Math Breakthrough - Mathematician Johannes Schmitt claims GPT-5 independently solved an open algebraic geometry problem using novel techniques, without human input. Paper labels AI/human contributions; awaits peer review.
Zhipu GLM-4.7 Launch - Zhipu AI's GLM-4.7 (355B params, 32B active MoE) excels in autonomous coding w/ "Preserved Thinking." Scores 73.8% SWE-bench, rivals GPT/Claude at 1/7th cost. 200K context. Open on HF. (Dec 23)
Paper of The Day
This study evaluates an LLM medication safety system on real NHS primary care EHRs from 2M+ patients, reviewing 277 complex cases. The gpt-oss-120b model reached 100% sensitivity for detecting issues but was fully correct on both issue identification and intervention in only 46.9% of cases. Failures were mostly contextual reasoning (86%): overconfidence, rigid protocols that ignore patient context, misunderstandings of practice, factual errors, and unsafe processes, outnumbering knowledge gaps 6:1 across models and demographics. The authors call for deeper failure analysis before deployment.
To read the whole paper 👉️ here

