Welcome, AI Builders!
This one touches what you make and how you make it. Apple's new vision models run locally with near-instant results. Microsoft's VibeVoice turns a script into a multi-voice show. Tencent pushes translation records and one-photo 3D. Oxford funds AI to speed vaccine design. What changes first for you: camera, mic, workflow, or health?
In today's Generative AI Newsletter:
Apple on-device vision models
Microsoft VibeVoice 10B multi-speaker audio
Tencent wins WMT25 and builds 3D worlds
Oxford and EIT fund AI vaccine science
Special highlight from our network
Intro to Generative AI mini course: From the University of Colorado Boulder to the GenAI community, free!
To kick off the GenAI Academy, GenAI.Works is offering free access to an exclusive University of Colorado Boulder course.
AI in 5: Intro to Generative AI - mini course
Sept 8–12 | 9 AM PT
1h live session per day
Certificate + digital badge
$100, now FREE
Learn from Prof. Tom Yeh (MIT PhD, CU Boulder) and Larissa Schwartz (UX researcher, PhD student).
Explore image, video, sound, research, and vibe coding.
For anyone starting their AI journey.
Apple Drops FastVLM and MobileCLIP2 Ahead of iPhone 17 Event

Image Credit: Apple
One week before its "Awe Dropping" event on September 9, Apple released two new vision-language models, FastVLM and MobileCLIP2, built to run locally on Apple devices with near real-time output. Both are available on Hugging Face and highlight Apple's push to make powerful AI lightweight while keeping data private.
Here's what Apple introduced:
FastVLM family: A visual language model for high-resolution processing, available in 0.5B, 1.5B and 7B parameter versions. The smallest variant runs directly in the browser.
MobileCLIP2 performance: 85 times faster and 3.4 times smaller than earlier versions, tuned for Apple silicon to deliver instant captioning, object detection and scene analysis.
Local execution: Both models process on-device, reducing latency and keeping content secure without relying on cloud servers.
Everyday use cases: From video captioning to text recognition in images, the models act as modular tools for Apple's expanding AI ecosystem.
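To get a feel for why the 0.5B variant is the one that can run in a browser, here is a rough back-of-envelope sketch of weight memory for the three FastVLM sizes listed above. The precisions (fp16, int8, int4) and the helper name `weight_gb` are assumptions for illustration, not formats Apple has confirmed, and the estimate ignores activations, caches and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# The precisions below are illustrative assumptions, not Apple's
# published deployment formats.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Model sizes (in billions of parameters) as listed in the newsletter.
FASTVLM_SIZES_B = [0.5, 1.5, 7.0]


def weight_gb(params_billions: float, precision: str) -> float:
    """Approximate weight size in decimal gigabytes (weights only)."""
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for size in FASTVLM_SIZES_B:
        row = ", ".join(
            f"{p}: {weight_gb(size, p):.2f} GB" for p in BYTES_PER_PARAM
        )
        print(f"{size}B params -> {row}")
```

Under these assumptions the smallest model needs about 1 GB for weights at fp16, while the 7B model needs about 14 GB at the same precision, which is one reason smaller variants are the natural fit for phones and browsers.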
With the iPhone 17 event just days away, these releases hint at a tighter integration of Apple silicon, AI and privacy-first design. While rivals chase scale, Apple is positioning its edge: models slim enough to run on your phone, but smart enough to see and understand the world in real time.
Microsoft Releases 10B-Parameter VibeVoice for Long-Form Speech

Image Credit: Microsoft
Microsoft has introduced a 10B-parameter version of VibeVoice, its open-source text-to-speech framework, built to generate multi-speaker, long-form audio. Available under the MIT license, the model can create podcasts up to 45 minutes long in just minutes, with support for contexts as large as 32K tokens.
What makes VibeVoice different?
Scalable speech: Synthesizes up to 90 minutes of audio with as many as 4 speakers, a leap over the 1–2 speaker cap of prior systems.
Core innovation: Uses continuous acoustic and semantic tokenizers at 7.5 Hz, preserving fidelity while improving computational efficiency.
Hybrid design: Combines an LLM for dialogue flow with a diffusion head for high-fidelity acoustic detail, keeping conversations natural and consistent.
Safety measures: All generated files include an audible disclaimer and imperceptible watermark to prevent misuse and confirm provenance.
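The 7.5 Hz tokenizer rate is what makes long-form audio tractable, and a little arithmetic shows why. The sketch below counts tokenizer frames for a given duration and checks them against a context window. It assumes one frame occupies one context position and ignores the tokens used by the script itself; both are simplifications, and the function names are hypothetical rather than part of VibeVoice.

```python
# Back-of-envelope: how many 7.5 Hz tokenizer frames does long audio need?
# Simplifying assumptions (not from Microsoft): one frame = one context
# position, and the text script's own tokens are ignored.
FRAME_RATE_HZ = 7.5          # tokenizer rate quoted in the newsletter
CONTEXT_32K = 32 * 1024      # 32K context window, in tokens


def frames_for(minutes: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of tokenizer frames needed for `minutes` of audio."""
    return round(minutes * 60 * frame_rate_hz)


def fits_in_context(minutes: float, context_tokens: int,
                    frame_rate_hz: float = FRAME_RATE_HZ) -> bool:
    """True if the audio frames alone fit inside the context window."""
    return frames_for(minutes, frame_rate_hz) <= context_tokens


if __name__ == "__main__":
    for minutes in (45, 90):
        n = frames_for(minutes)
        ok = fits_in_context(minutes, CONTEXT_32K)
        print(f"{minutes} min -> {n} frames, fits in 32K: {ok}")
```

Under these assumptions, 45 minutes of audio comes to 20,250 frames and fits comfortably in a 32K window, while 90 minutes comes to 40,500 frames and would need a larger one.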
The release follows a growing push to extend AI speech beyond short clips into multi-voice, natural-sounding dialogue. By making VibeVoice open-source and research-focused, Microsoft is staking a claim in long-form generative audio while keeping commercial applications at bay, at least for now.
Tencent's New Hunyuan Models for Translation and 3D Worlds

Image Credit: Tencent
Tencent has introduced two major additions to its Hunyuan AI series, one setting new records in global translation benchmarks and the other expanding 3D spatial modeling. Together, they showcase how China's tech giant is building both linguistic and spatial intelligence into its stack.
This is what Tencent launched:
Hunyuan-MT-7B: A 7B parameter translation model that ranked first in 30 of 31 categories at WMT25, the top global machine-translation competition.
Chimera edition: A joint system that layers multiple translators into one pipeline for higher accuracy across 33 languages and 5 minority languages.
HunyuanWorld-Voyager: Generates 3D-consistent reconstructions from a single photo, exports point clouds directly and supports joystick-guided exploration of generated spaces.
Scaled efficiency: Small enough to deploy widely, from heavy servers to lighter edge devices, while keeping performance high.
Voyager currently leads Stanford's WorldScore benchmark for 3D video generation, while MT-7B has redefined what a small model can achieve in translation. Tencent is positioning Hunyuan as both a linguistic and spatial platform, open-sourced and competitive on benchmarks that matter.
Oxford Secures £118M to Merge AI and Vaccine Science

Samples from coronavirus vaccine trials are handled at an Oxford Vaccine Group laboratory © John Cairns/AP
The University of Oxford has launched a £118M programme with the Ellison Institute of Technology (EIT) to use AI and human challenge trials to fight antibiotic resistance. The initiative, called CoI-AI (Correlates of Immunity–Artificial Intelligence), will combine Oxford's vaccine expertise with EIT's advanced AI systems to accelerate vaccine design against hard-to-treat infections.
The programme focuses on:
Major threats: Streptococcus pneumoniae, Staphylococcus aureus and E. coli, which drive antibiotic resistance worldwide.
Human challenge trials: Volunteers are safely exposed to bacteria under controlled conditions, allowing immune responses to be studied in real time.
AI integration: EIT's AI models, supported by Oracle's computing infrastructure, will analyze immune data to pinpoint the responses that predict protection.
Global impact: Backed by a £118M investment, the programme aims to create faster, smarter vaccines while training future leaders in infectious disease research.
Oxford's Andrew Pollard called it a "new frontier in vaccine science," while Larry Ellison framed it as laying the groundwork for faster discovery during outbreaks. With antibiotic resistance killing over a million people annually, CoI-AI could mark a turning point in how the world prepares for future health crises.
