
Welcome back! Meta has built an internal leaderboard that ranks staff by how many AI tokens they burn. Anthropic's locked-down Mythos model has leaked to a Discord group within days of launch. OpenAI is pushing Codex into Slack and Salesforce with shareable team agents. And Google's new 8th-generation TPUs split training and inference into dedicated chips for the first time.

In today’s Generative AI Newsletter:

  • Token burn: Is burning AI tokens now the Silicon Valley status metric?

  • Mythos leak: How did a Discord group walk into Anthropic's most restricted model?

  • Team agents: Are OpenAI's Codex-powered Workspace Agents the enterprise unlock?

  • TPU split: Why has Google separated training and inference into different chips?

Latest Developments

Silicon Valley's new productivity metric is token burn

A new productivity trend has taken hold across Silicon Valley, and the metric is how many AI tokens you burn through. Employees who consume more tokens are considered more productive, at least on paper. The practice has a name: tokenmaxxing.

  • Jensen's line: Nvidia CEO Jensen Huang said he would be "deeply alarmed" if his engineers were not burning hundreds of thousands of dollars on tokens.

  • Meta's leaderboard: A Meta employee built an internal ranking of which staffers consume the most tokens, and the leaderboard is reportedly active inside the company.

  • Ramp numbers: Ramp Labs has tracked a 13x surge in AI token spend among its customers since January, with Uber among the companies blowing through AI budgets.

  • Pushback: Reid Hoffman has argued that tracking AI usage matters, but tracking how people use that AI matters more.

Token burn as a productivity metric is useful only until staff work out how to game it. Spend and output are related but they are not the same thing, and the companies that come out ahead will be the ones measuring what the tokens actually produced rather than running a volume leaderboard. Expect the next wave of internal dashboards to be less about token count and more about outcome attribution, because the current version is an expense report dressed up as a KPI.
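The gap between the two metrics is easy to sketch. Here is a minimal, purely illustrative Python example (all names and figures invented) contrasting a raw token-volume leaderboard with an outcome-per-token ranking; "outcomes" stands in for whatever a team counts as shipped work, such as merged PRs or closed tickets:

```python
# Invented staff data: monthly token consumption vs. shipped outcomes.
staff = [
    {"name": "A", "tokens_millions": 120, "outcomes": 8},
    {"name": "B", "tokens_millions": 35, "outcomes": 14},
    {"name": "C", "tokens_millions": 70, "outcomes": 7},
]

# Leaderboard 1: raw token burn (the "tokenmaxxing" metric).
by_volume = sorted(staff, key=lambda s: s["tokens_millions"], reverse=True)

# Leaderboard 2: outcomes per million tokens (outcome attribution).
by_yield = sorted(
    staff, key=lambda s: s["outcomes"] / s["tokens_millions"], reverse=True
)

print([s["name"] for s in by_volume])  # A tops the volume leaderboard
print([s["name"] for s in by_yield])   # B tops the yield leaderboard
```

Same three people, opposite rankings: the biggest token burner produces the least per token, which is exactly why a volume leaderboard is easy to game and easy to misread.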

Special highlight from our network

Most people try to keep up with AI by consuming more content. Outskill flips it: one weekend, 16 live hours, and you leave with workflows and automations you can actually run on Monday.

Saturday and Sunday, 10 AM to 7 PM EST. Attendees also get a prompt library, an AI monetization roadmap, and a personalized toolkit builder (bonuses valued at $5,000+).

Free for the next 48 hours.

Anthropic's Mythos leaks to a Discord group within days of launch

A Discord group has reportedly gained access to Mythos, Anthropic's cybersecurity model that the company said was too dangerous for public release. The group guessed the model's deployment URL using naming conventions exposed in the recent Mercor breach, then used a borrowed contractor login to get in. Bloomberg reports the group has been using Mythos daily since launch.

  • Timeline: Mythos was released on April 10 to select partners under 'Project Glasswing', with Anthropic citing the model's capabilities as reason to withhold it from wider release.

  • Entry method: One Discord member had vendor credentials through contract work, and the Mercor breach provided enough infrastructure detail for the group to locate the endpoint.

  • Claimed use: The group told Bloomberg they do not use Mythos for cyberattacks or malicious activity, and claimed access to other unreleased models from other labs.

  • Official line: Anthropic says it has found no evidence its own systems were compromised.

Emergency meetings were called at the White House over the prospect of this model ending up in the wrong hands, and the first reported unauthorised access turned out to be a Discord group with a borrowed login and a URL guess. Vendor access, contractor credentials and partner deployments are the weakest part of the frontier AI stack, and they will get weaker as the partner list grows. Anthropic will not be the last lab to have this conversation.

OpenAI pushes Codex into the workplace with Workspace Agents

OpenAI has launched Workspace Agents, a new Codex-powered version of custom GPTs designed for teams rather than individuals. The agents can retain memory across sessions, connect to apps like Slack and Salesforce, run on a schedule and pick up tasks when users are offline. They are available in research preview to ChatGPT Business, Enterprise, Edu and Teachers plans at no additional cost through May 6.

  • Framing: OpenAI pitched the release as an evolution of the 2023 solo-user GPTs, with a conversion tool for existing GPTs coming soon.

  • Internal proof: Inside OpenAI, sales reps use Workspace Agents for account research and follow-up drafts, while accounting uses them for journal entries and reconciliations.

  • Controls: Teams can set restrictions on data usage, approvals and permissions for each agent, with the intent of making enterprise rollouts more defensible.

  • Integration reach: Agents trigger across ChatGPT, Slack and third-party apps, moving the product closer to where most enterprise work actually happens.

The first GPT Store was a graveyard of abandoned side projects that never earned their place in anyone's day-to-day. Workspace Agents is a cleaner second attempt, with the agentic capabilities and enterprise controls that were missing the first time. What will decide whether this lands is whether teams adopt these as scheduled, persistent coworkers rather than novelty assistants, and with it whether OpenAI can hold the enterprise ground it is currently winning against Anthropic and Google.

Google splits training and inference with its 8th-generation TPUs

Google has unveiled its 8th-generation TPUs for what the company is calling the 'agentic era'. The TPU 8t is built for training and is designed to cut frontier model training time from months to weeks. The TPU 8i is built for inference and is tuned to run multiple specialised agents efficiently, including in large pods of up to 1,152 chips. This is the first time Google has separated the two workloads into distinct silicon.

  • Training side: TPU 8t focuses on raw throughput, aimed at getting Gemini and future frontier models through training faster than the previous generation allowed.

  • Inference side: TPU 8i is optimised for serving many agent instances at once, reflecting Google's bet that agentic deployment is the next major bottleneck.

  • Gemini first: Google's own Gemini models will run on the new TPUs first, with external developer access rolling in behind that.

  • Framework support: Both chips support the major open-source training and inference frameworks developers already use.

Putting training and inference on dedicated silicon reads the direction of travel clearly. Model size was the bottleneck of the last cycle. Inference economics will define the next one, as agent deployments at scale put sustained pressure on cost per token. Google is telling the market the shape of the workload has changed, and the chips need to change with it.

Tool of the Day: Ideogram Custom Models

Ideogram launched Custom Models today, letting users fine-tune image generation on 15 to 100 of their own assets for consistent on-brand outputs. The tool produces images matching a specific visual style, product set or brand system across any prompt. Useful for marketing teams running recurring asset production, creators building a signature look and product teams needing visual consistency across a catalogue.

Try this yourself:

  • Gather 15 to 100 images of your brand, product line or visual reference style.

  • Upload them to Ideogram and let the platform train a custom model on the set.

  • Generate new images in that style through standard prompts at ideogram.ai.

Light Bytes

  • Xpeng takes its flying car to market: The Chinese automaker will begin customer deliveries of its manned eVTOL aircraft later in 2026. The $300,000 package includes a six-wheeled ground vehicle that doubles as transport and charging station, with a two-seat aircraft docking in the back.

  • Sony's table tennis robot beats elite players: Sony AI's Ace system, built around an eight-jointed arm and multi-camera ball tracking, can now handle spin and difficult shots against serious opposition. It still loses to professionals but the gap is closing fast.

  • Qwen3.6-27B beats its own 397B predecessor: Alibaba's Qwen team has open-sourced a 27B model that tops its predecessor, a model more than 14 times its size, across top coding benchmarks.

  • Google says 75% of its code is now AI-generated: The company credits the shift with measurable gains in security and operations, with agentic development now embedded across internal teams.

  • AngelList opens early-stage AI to retail investors: A new AngelList fund lets retail investors back early-stage startups including OpenAI, Anthropic and xAI. The launch post has passed 1.5 million views.

  • Odyssey-2 Max tops world-model benchmarks: The new 3x-larger version from Odyssey topped physics benchmarks while running in real time. Currently in private beta.
