Anthropic Education Report: The AI Fluency Index
Detecting and preventing distillation attacks
Anthropic accuses DeepSeek, Moonshot, and MiniMax of stealing Claude's AI data through 16 million queries
Anthropic says Chinese AI labs DeepSeek, Moonshot, and MiniMax used millions of queries to systematically extract Claude's capabilities and train their own models.
I made an interactive timeline of 171 LLMs (2017–2026)
Built a visual timeline tracking every major Large Language Model — from the original Transformer paper to GPT-5.3 Codex. 171 models, 54 organizations. Filterable by open/closed source, searchable, with milestones highlighted. Some stats from the data: - 2024–2025 was the explosion: 108 models in
Anthropic: "We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax." 🚨
Anthropic is accusing DeepSeek, Moonshot AI (Kimi) and MiniMax of setting up more than 24,000 fraudulent Claude accounts, and distilling training information from 16 million exchanges.
Source: https://www.wsj.com/tech/ai/anthropic-accuses-chinese-companies-of-siphoning-data-from-claude-63a13afc
OpenAI partners with major consulting firms to push Frontier agent platform
OpenAI is teaming up with McKinsey, BCG, Accenture, and Capgemini to roll out its new AI agent platform Frontier to enterprise customers. The article OpenAI partners with major consulting firms to push Frontier agent platform appeared first on The Decoder.
Anthropic Banned OpenClaw: The OAuth Lockdown That Fractured the Claude Developer Community
Anthropic officially banned the use of Claude subscription OAuth tokens in all third-party tools including OpenClaw. Full timeline, economics, and what it means.
Gemini 3.1 Pro Is Here: Google's Reasoning Just Jumped 2.5× in Three Months
Google ships Gemini 3.1 Pro with a verified 77.1% on ARC-AGI-2 — more than double Gemini 3 Pro. It's now the second-best reasoning model behind Deep Think, and it's available to everyone today.
Anthropic Studied Millions of AI Agent Sessions. The Biggest Finding: Humans Are the Bottleneck.
Anthropic analyzed millions of human-AI interactions and found a massive 'deployment overhang' — AI can handle 5-hour tasks, but users cap them at 42 minutes.
DeepMind's Aletheia Solved 4 Problems No Human Could — And Got 68.5% of Everything Else Wrong
Google DeepMind's Aletheia is an AI research agent that wrote a publishable paper solo and cracked unsolved math problems. But its 6.5% useful answer rate tells the real story.
OpenAI's AI Can Now Hack 72% of Smart Contract Vulnerabilities — And That's the Good News
OpenAI and Paradigm launch EVMbench, a benchmark showing AI agents can exploit most known smart contract bugs. The offense-defense gap is the real story.
Claude Sonnet 4.6: The Mid-Range Model That Keeps Embarrassing Flagships
Anthropic's new Sonnet 4.6 nearly matches Opus on coding and reasoning, crushes computer use benchmarks, and somehow beats every model at office work and finance. Full benchmark breakdown inside.
Anthropic vs The Pentagon: When Your AI Company Gets Treated Like a Foreign Adversary
The Pentagon is threatening to designate Anthropic as a 'supply chain risk' over Claude's military usage guardrails. Here's what's actually happening.
Grok 4.20: xAI's 4-Agent AI System Goes Live — Benchmarks, Architecture, and Pliny's Jailbreak
xAI launches Grok 4.20, a multi-agent system where four specialized AI agents debate in real-time. Here's how it works, where it ranks, and the system prompt Pliny the Liberator already extracted.
Sam Altman Fumes That It Takes Longer to Train a Human Than an AI, Plus They Eat All That Wasteful Food
"It also takes a lot of energy to train a human."
Fun fact: Anthropic has never open-sourced any LLMs
I’ve been working on a little side project comparing tokenizer efficiency across different companies’ models for multilingual encoding. Then I saw Anthropic’s announcement today and suddenly realized: there’s no way to analyze Claude’s tokenizer lmao! edit: Google once mentioned in a paper that Ge
Gemini 3.1 plays Pokemon without a minimap - until it went sniffing around map data
Ask HN: How do you know if AI agents will choose your tool?
Super New to Godot, used Claude Code/gpt-oss-120b locally to help me vibecode a simple platformer game about a grumpy mage who follows you around making fun of you lmao.
Yeah, I was bored so I spent the last two weeks experimenting with vibecoding with local LLMs, namely gpt-oss-120b. I started with Cline, didn't like it at all because it was overheating my GPU while giving back too little. Codex was even worse, locally, leading to weird CPU switches mid-generation