Welcome back! Nvidia put a frontier model on the internet for free, training data and recipes included. OpenAI rebuilt ChatGPT's memory so it updates itself while you're away, and Apple is reportedly renting Nvidia chips to run the Google model behind the Siri it unveils Monday. Netflix, meanwhile, is testing a voice interface that finds you something to watch.

In today's Generative AI Newsletter:

  • Nvidia: Why give away a frontier model when you could charge for it?

  • ChatGPT memory: What does ChatGPT do with your memories between conversations?

  • Apple: How much of the new Siri is actually Apple's?

  • Netflix: Why is Netflix teaching its app to listen?

Nvidia shipped Nemotron 3 Ultra on Thursday, the strongest model it has ever released, and gave the whole thing away. Weights, training data and the recipes it used to build it. Anyone can download it, fine-tune it and sell whatever they build on top.

It's built for the expensive part of AI agents. An agent working a long task spends most of its time, and your money, deciding what to do between tool calls. Nemotron mixes Mamba layers, which skim long history cheaply, with transformer layers for exact recall, so each step costs less to think through.

  • The pitch: 5x the throughput of comparable open models and up to 30% cheaper on long agent tasks, by Nvidia's own benchmarks.

  • The scores: 65% to 70.4% on SWE-bench Verified, a test built from real GitHub bug fixes, with the score holding across five different agent frameworks.

  • The giveaway: 212 billion fresh training tokens, 10 million fine-tuning samples and the full training recipes, under OpenMDW, the Linux Foundation's license written for open models.

  • Where to try it: Live on OpenRouter, Perplexity for Pro subscribers and build.nvidia.com, with weights on Hugging Face.

What actually changed is who gets to run a model like this. Plenty of teams can't send their data to an outside API, and the open models worth running have mostly come from Chinese labs, which is a hard sell in a bank or a defense ministry. Nemotron is now the strongest American open model by a distance, scoring 48 with benchmark firm Artificial Analysis against 39 for the next best US option, though China's Kimi K2.6 still leads at 54. If Nvidia's numbers hold up in other people's hands, the bill for running serious agents drops this week.

Everybody wants to learn AI. The real question is who actually helps you use it.

AI Academy has 30K+ students and 50+ courses, all built around one principle: you apply what you learn from day one. Instructors mentor you through real projects until you ship. Most platforms hand you a framework. Here you build something real.

Revenue at GenAI Works grew from $1M in 2024 to $2.5M in 2025. Clients include Oracle, Google, NVIDIA and IBM.

Our crowdfunding round funds more courses, more expert instructors, workshops and AI leader interviews.

Invest from just $1,000 before July 1, 2026. Earn up to 25% bonus shares.

In making an investment decision, investors must rely on their own examination of the issuer and the terms of the offering, including the merits and risks involved. Genai Works, Inc. has filed a Form C with the Securities and Exchange Commission in connection with its offering, a copy of which may be obtained here: https://bit.ly/3APlUkJ

ChatGPT's old memory worked like a notepad. It saved what you told it to save, and the notes rotted. Tell it you're prepping a client pitch for Friday and a month later it's still asking how the prep is going. On Thursday OpenAI started rolling out a rebuilt system it calls dreaming, which revises those notes in the background while you're not chatting.

  • Self-updating: Memories account for time passing, so "pitch on Friday" becomes "that pitch already happened" on its own, no correction needed.

  • No magic words: It picks up your preferences and constraints from normal conversation, without you typing "remember this."

  • The rollout: US Plus and Pro users first, with other countries and the Free and Go tiers following in the coming weeks, now that OpenAI says it has cut the compute cost of serving it by about 5x.

  • Your controls: A memory summary page shows what ChatGPT believes about you, and it takes instructions, including OpenAI's own example, "don't bring up Stan again."
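For the curious, here is a toy sketch of the "self-updating" idea from the list above: a note that stores the event's date and is phrased relative to today, so it can't go stale. This is purely illustrative and not how OpenAI has built it. The `MemoryNote` class, the `render` function and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MemoryNote:
    """A hypothetical stored memory tied to a dated event."""
    subject: str
    event_date: date


def render(note: MemoryNote, today: date) -> str:
    """Phrase the note relative to today's date instead of when it was saved."""
    if today < note.event_date:
        days = (note.event_date - today).days
        return f"{note.subject} is in {days} day(s)"
    if today == note.event_date:
        return f"{note.subject} is today"
    return f"{note.subject} already happened on {note.event_date.isoformat()}"


# Hypothetical example: a pitch saved two days ahead, then read a month later.
note = MemoryNote("client pitch", date(2026, 6, 5))
print(render(note, date(2026, 6, 3)))  # client pitch is in 2 day(s)
print(render(note, date(2026, 7, 3)))  # client pitch already happened on 2026-06-05
```

The point of the sketch is that the stored fact never changes. Only its wording does, depending on when it is read.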

Models are close enough now that memory is the lock-in. A chatbot that already knows your projects and preferences beats a smarter blank one for most daily asks, and OpenAI knows it. That's why the compute savings went straight into the free tier. Hundreds of millions of people with a year of context each don't switch chatbots.

The Siri overhaul Apple unveils with iOS 27 on Monday will reportedly do its thinking on Nvidia chips, running Google's Gemini models, on Google's cloud. That's the latest from The Information, and it leaves Apple supplying the phone and the privacy promise.

  • The plan: Heavy Siri requests go to Gemini models hosted on Google Cloud, processed on Nvidia's flagship B200 GPUs with confidential computing turned on, which encrypts your data while it's being worked on.

  • The reversal: Bloomberg's Mark Gurman, who breaks most Apple news, reported in January that Siri would run on Google's TPU chips. The plan now points at Nvidia hardware instead.

  • What Apple shows Monday: A Siri that can see your screen and carry out multi-step tasks across your apps, plus a standalone Siri app that works like ChatGPT, with chat history, file uploads and a voice toggle. The finished product reportedly ships in September.

  • The privacy question: Apple built Private Cloud Compute in 2024 so AI requests would only ever touch Apple silicon. Nobody has explained how that promise survives Nvidia chips in Google's cloud.

Apple has owned every layer of its products since the first iPhone, and it's now two years late on a Siri that works. Renting Google's model and Nvidia's chips is the fastest route to an assistant people actually use, and Apple picked speed over pride. The tell on Monday is how hard the keynote works to avoid saying Google or Nvidia out loud.

Netflix's chief product and technology officer Elizabeth Stone told the Bloomberg Tech conference on Wednesday that generative AI is moving into the core of how Netflix decides what you see. The problem she described is the one you know. Too much content, not enough clarity and a homepage guessing your mood from what you clicked last month.

The fix Netflix is testing reads more like a conversation. The company is using generative AI and natural language processing to work out what you want to watch in a given moment, and it's experimenting with a voice interface on top. The recommendations mix your viewing history with what's trending right now.

Stone framed the viewer's question as "How do I make sense of it, and what's right for me, and what's right for me in this moment?" The version she sketched has you telling Netflix you're after something funny that won't eat your evening, and getting an answer that fits your taste instead of thumbing through rows ranked by your past.

Recommendations are the part of Netflix that has kept it ahead since the DVD-mailer days, and Stone framed this as core product work rather than a side experiment. Every streamer has the same catalog problem and the same wall of posters. The first one that makes talking work will make the rest feel old fast.

Top 5 tools for people who hate managing people

Aukomo.io
The kind of tool that handles technical problems in the background, so your team never has to jump into a late-night troubleshooting call again.

Cuey
It's like having a room full of AI experts debating your question from every angle. And you never have to sit through a single meeting to get there.

Perplexity AI
You can ask one question and get a researched answer with sources in seconds, skipping the group chat and the 23 open browser tabs.

ElevenLabs
Need a voiceover? You can go from script to finished audio without hunting down a narrator, booking studio time or waiting on revision rounds.

Runway ML
You can turn an idea into a video without chasing down designers, editors or production teams to make it happen.

  • Claude is helping build the next Claude: Anthropic says its internal data shows Claude is speeding up AI development inside the company, an early step toward models that improve themselves, and it launched a research institute to study where that goes.

  • IBM bets its consultants on Gemini: Thousands of IBM consultants will now build AI agents on Google's Gemini for banks and governments. Both companies call the practice a multi-billion-dollar opportunity, and it targets the stage where most enterprise AI dies: getting agents out of pilots and into production.

  • One Microsoft director is taking notes from the Pope: Taylor Black, Microsoft's AI and venture ecosystems director, told Vatican News that Pope Leo XIV's encyclical helps his teams build better products, because "tech doesn't have anthropology."

Learn more about AI from the experts building it

📸 Follow us on Instagram for fast, visual AI updates in 30 seconds. 

🐝 Subscribe to our Atlas newsletter — trusted by 3M+ subscribers — to stay ahead of AI news across tech, education, and business. 

📺 Watch us on YouTube to hear insights directly from leading AI voices, builders, and innovators.

🐦 Follow us on X for breaking AI news and real-time industry updates.

🧠 Learn how to build your next AI application with practical resources and expert guidance.

🚀 Explore investment opportunities in the future of AI and join our community-backed growth journey.
