Welcome back! AI is no longer a single product. It is becoming a medium that people can shape, tune, and bend to their needs. Thinking Machines is gaining momentum because Tinker gives teams something rare in frontier AI: the ability to shape a powerful model around their own data and their own workflow. Instead of adapting to someone else’s system, companies can mold the system to themselves. The same theme is emerging everywhere from world models to research agents.

In today’s Generative AI Newsletter:

Thinking Machines targets a $50B valuation jump
World Labs launches Marble for true spatial generation
NotebookLM adds Deep Research for web-grounded investigation
Baidu unveils ERNIE 5.0 and new Kunlun chips
SIMA 2 pushes embodied agents deeper into 3D worlds

Latest Developments

Thinking Machines, the AI startup founded by former OpenAI CTO Mira Murati, is quietly negotiating a funding round that could price the company near $50B. Some investors say the number could stretch to $55B or even $60B, which would mark one of the sharpest valuation jumps in the current wave of frontier-AI startups. The company was valued at $12B in July after raising $2B, making this leap unusually aggressive even by 2025 standards.

What investors are paying for

  • Frontier-tier talent
    Murati leads the company with cofounder John Schulman and research head Barret Zoph, creating one of the densest clusters of ex-OpenAI leadership outside OpenAI itself.

  • Tinker as the core product
    Tinker is a workflow layer that lets teams fine-tune strong models by pushing in their own data, avoiding the cost and complexity of training from scratch (see the sketch after this list).

  • Academic traction
    Princeton, Stanford, and several research labs already use Tinker for theorem proving, chemistry, and control systems.

  • Early revenue
    Enterprises are paying for the platform, although the valuation is driven more by expectations than present cash flow.
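
Tinker’s own SDK isn’t shown in this issue, so as a rough stand-in for the workflow it packages up, here is the standard LoRA fine-tune using Hugging Face PEFT. The base model name is a placeholder; the point is that only small adapter weights train against your data.

```python
# Illustrative only: a generic LoRA fine-tuning setup with Hugging Face PEFT,
# standing in for the kind of adaptation Tinker abstracts away.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.1-8B"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Attach small trainable adapters to the attention projections;
# the base weights stay frozen.
config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of parameters
```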

Thinking Machines sits inside the same alumni surge that produced Safe Superintelligence and Periodic Labs, and the funding conversations show how aggressively capital is chasing anyone who helped shape the modern frontier stack. If this round lands near the numbers discussed, it will confirm that influence inside OpenAI has become one of the most valuable assets in the entire AI economy.

World Labs just released Marble, a multimodal world model that creates full 3D environments from text, images, video, or coarse 3D layouts. The launch marks Fei-Fei Li’s first commercial step toward spatial intelligence, the branch of AI she argues will matter far more than language-only systems. Marble arrives with a level of practical control that feels immediately usable for designers, gamers, researchers, and anyone building synthetic environments.

What Marble can actually do

  • Create worlds from almost anything
    Users can generate 3D scenes from prompts, images, videos, or rough layouts and then expand, edit, or combine them at fine detail.

  • Export to real pipelines
    Worlds can be exported as Gaussian splats, triangle meshes, or videos, making them ready for VFX, game engines, and VR work (see the loading sketch after this list).

  • Advanced editing tools
    Chisel lets users sculpt structure in 3D and then apply style through text, separating geometry from appearance for precise control.

  • A creative hub
    Marble Labs showcases case studies, tutorials, and real workflows across robotics, design, architecture, and entertainment.
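
On the export side, here is a minimal sketch of pulling a splat export into a Python pipeline. It assumes the common 3D Gaussian Splatting .ply layout; Marble’s actual export schema may differ, and the filename is a placeholder.

```python
# Read a Gaussian-splat .ply and stack point positions for downstream tools.
# Assumes the widely used 3DGS property names (x, y, z, ...).
import numpy as np
from plyfile import PlyData

splat = PlyData.read("marble_scene.ply")  # placeholder filename
verts = splat["vertex"]

# (N, 3) array of Gaussian centers, e.g. for bounding-box checks or culling.
xyz = np.stack([verts["x"], verts["y"], verts["z"]], axis=-1)
print(f"{len(xyz)} Gaussians, bounds {xyz.min(axis=0)} to {xyz.max(axis=0)}")
```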

Marble arrives at a moment when world models are becoming the next competitive frontier. Fei-Fei Li has spent years arguing that spatial intelligence is the path toward useful AI, and Marble is her first public proof of that vision. If image models reshaped digital creativity, world models may define how machines learn, reason, and simulate the spaces we live in.

Baidu used its Baidu World 2025 stage to introduce ERNIE 5.0, a 2.4-trillion-parameter model built to read, write, see, hear, and generate across every major modality. The company paired the release with two new Kunlun chips and a fresh supernode architecture designed to move far larger training loads than China’s current stack can comfortably support. The announcement landed like a statement of intent, with Baidu positioning ERNIE 5.0 as the most technically ambitious model yet from a Chinese lab.

What Baidu announced

  • ERNIE 5.0 with major upgrades in multimodal reasoning, creative generation, memory, and tool use

  • Kunlun M100 for large inference workloads arriving in early 2026

  • Kunlun M300 for multimodal training and high volume compute arriving in early 2027

  • Supernode clusters built to link many chips into unified training systems

  • Digital humans with global developer access and rapid growth in e-commerce deployments

Baidu now argues that frontier AI will depend on domestic compute, large interlinked chip clusters, and models that move freely across text and sensor-rich media. Analysts see ERNIE 5.0 as Baidu’s clearest challenge to models like GPT-5 and Claude, and a signal that China’s labs are expanding their ambitions even as export limits tighten. The coming years will show how quickly Baidu can scale the chip roadmap it has sketched today.

Google is expanding NotebookLM with a new research tool that automates the heavy lifting of online investigation. The feature is called Deep Research, and it builds full source-grounded reports by browsing the web on your behalf. It arrives alongside support for more file types, turning NotebookLM into a more complete workspace for students, analysts, and anyone working through dense material.

What’s rolling out

  • Automated research plans
    Deep Research creates a plan from your question and gathers information from websites before assembling a detailed, source-linked report (a sketch of this loop follows the list).

  • Two research modes
    Users can choose Deep Research for full briefings or Fast Research for quick passes through search results.

  • New file support
    NotebookLM now accepts Google Sheets, Drive URLs, PDFs from Drive, and Microsoft Word documents for easier summarizing and cross-referencing.

  • Seamless workflow
    Reports can be added directly to a notebook while users continue pulling in new sources from the side panel.
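
Google hasn’t published how Deep Research works internally. As a rough sketch, the plan, browse, and synthesize loop such tools follow generally looks like the stub below, with the search and report-writing steps left as placeholders to swap for real APIs.

```python
# Stubbed sketch of a "deep research" loop: plan sub-queries, gather sources,
# assemble a source-linked report. Not Google's implementation.
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    snippet: str

def make_plan(question: str) -> list[str]:
    # A real system would have an LLM decompose the question into sub-queries.
    return [f"{question} overview", f"{question} recent developments"]

def search(query: str) -> list[Source]:
    # Placeholder: a real agent would call a search or browsing API here.
    slug = query.replace(" ", "-")
    return [Source(url=f"https://example.com/{slug}",
                   snippet=f"Placeholder result for '{query}'")]

def deep_research(question: str) -> str:
    sources = [s for q in make_plan(question) for s in search(q)]
    # A real system would have an LLM write the brief, citing each source.
    cited = "\n".join(f"- {s.snippet} [{s.url}]" for s in sources)
    return f"Report on: {question}\n{cited}"

print(deep_research("world models"))
```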

NotebookLM has become one of Google’s most ambitious attempts at an AI study partner. The system can already turn long documents into audio explainers and generate visual overviews from mixed media. Deep Research edges it closer to a true digital analyst that can investigate, organize, and summarize without constant prompting. The next question is how reliably it handles nuance when it goes off to explore the open web on its own.

SIMA 2 is Google DeepMind’s new embodied AI agent built on Gemini. It operates directly inside commercial 3D games by looking at the screen and controlling a virtual keyboard and mouse. It learns skills from human demonstrations, Gemini-labeled data, and self-directed play across different virtual worlds.

Core functions:

  • Goal reasoning
    Interprets high-level goals, plans actions, and explains its intended steps.

  • Generalization
    Executes tasks in games it has never seen, including MineDojo and ASKA.

  • Multimodal understanding
    Responds to instructions given as text, images, sketches, and emojis, across multiple languages.

  • Skill transfer
    Applies learned concepts across worlds, such as using mining knowledge to perform harvesting.

  • Self-improvement
    Learns new tasks through iterative self-play using its own experience data.

Try this yourself:
Give a vision-capable AI model a screenshot of a complex game task. Ask it to state the goal, list the required steps, and describe how it would act in the scene. Compare how consistently it tracks objectives and how well it reasons about the environment.
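
One way to run that exercise, assuming an OpenAI-compatible vision endpoint; the model name and screenshot path are placeholders, so substitute whatever multimodal stack you use.

```python
# Send a game screenshot to a vision-capable model and ask for goal,
# steps, and intended actions (mirroring the exercise above).
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("game_screenshot.png", "rb") as f:  # placeholder path
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable model works here
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": (
                "You are an agent in this game scene. State the goal, "
                "list the required steps, and describe how you would act."
            )},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```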
