
Welcome back! It has been a bizarre week. Google just made high-fidelity music generation a default feature for millions by dropping Lyria 3 straight into Gemini. At the same time, that frictionless creation is breaking legacy systems: academic conferences are drowning under a tidal wave of fake research papers, each producible in under a minute. We also look at a new real-time avatar that nods while you talk and a major update to how half the internet builds websites.
In today’s Generative AI Newsletter:
Google embeds the Lyria 3 music generator directly into Gemini
Tavus launches a 40 FPS avatar engine that reads human emotions
Academia rejects 2,400 papers a month as AI slop breaks peer review
WordPress builds an AI design assistant directly into its core editor
Latest Developments
Lyria 3: Google Brings Music Generation Inside Gemini App

Google officially integrated its Lyria 3 model into Gemini this week, allowing users to spin up 30-second tracks from a simple text prompt or a photo. While niche players like Suno and Udio have dominated AI music for months, Google is the first to put high-fidelity audio generation directly inside a mainstream assistant used by millions. The update handles everything from lyrics to cover art and even allows users to upload a video to serve as the creative mood board for a custom soundtrack.
How to create music in Gemini:
Multimodal Inputs: Users can feed Gemini text, photos, or videos to generate tracks with automatic lyrics, vocals, and genre-matched cover art.
Audio Watermarking: Every track is embedded with an imperceptible SynthID watermark, and users can upload any file to check if it was generated by Google AI.
YouTube Integration: Creators are gaining access to the model through Dream Track for Shorts, making it easier to build custom, copyright-safe backing tracks.
Safety Guardrails: The model is trained to avoid mimicking specific artists, instead using them as broad style inspiration to comply with partner agreements.
Platforms like Suno and Udio have existed for a year, but embedding Lyria 3 into Gemini makes high-fidelity music generation a default feature for millions. Google is positioning this as a tool for "fun, unique expression" rather than a replacement for professional musicians, and while the quality is high enough to fool casual listeners, the current 30-second limit suggests it is built for social sharing, not studio production.
Special highlight from our network
Text and voice got us halfway there.
But real communication happens face to face. That’s where video-native AI takes the lead.
In third-party tests, Anam outperforms every other video agent by ~24% across visual quality, lip sync, and responsiveness.
And when users are given the choice?
70% choose video over voice.
Why it matters:
→ +24% higher conversion
→ +44% better retention
→ Across sales, support, onboarding, and training
You can create a custom avatar from a single image. Video is the interface now.
Your AI should look the part.
Tavus Debuts Phoenix-4 Behavioral Engine: Can AI Learn to Read the Room?

Tavus officially launched Phoenix-4 this week, a real-time human rendering model designed to move AI avatars past the stiffness of the uncanny valley. While older systems rely on simple lip-syncing over pre-recorded video loops, this engine generates every pixel of the head and shoulders from scratch in every frame. This allows the AI to participate in the "natural dance" of conversation by reacting to the user’s tone and facial cues with millisecond-level latency.
Here are the details:
Emotional Fluidity: The model manages seamless transitions between 10+ emotional states like curiosity and concern without any visual snapping.
Active Listening: Phoenix-4 generates distinct visual backchannels such as affirmative nods and brow furrows while the user is still speaking.
HD Performance: The full pipeline runs at 40 FPS in 1080p resolution to ensure micro-expressions feel smooth during live video calls.
Behavioral Integration: The engine pairs with the Raven-1 perception model to mirror the tone and facial cues of the human user.
Tavus is pitching Phoenix-4 for high-stakes industries like healthcare, therapy, and sales, where the feeling of being "heard" matters as much as receiving a correct answer. By modeling how humans actually react rather than just how they speak, Tavus is effectively trying to escape the uncanny valley of robotic indifference.
How AI Slop Is Causing A Real Crisis In Computer Science

Raphael Wimmer recently used an OpenAI tool to draft a fake experiment in under a minute to highlight a growing rot in academia. This surge of synthetic research pushed the International Conference on Machine Learning to 24,000 submissions, double the previous year's total. With researcher productivity up roughly 90 percent, the traditional peer-review system is drowning in unverified text.
Key Statistics of the Surge:
90% Productivity Jump: A study in Science confirms that LLM adoption has increased researcher output by roughly 89.3%, leading to a glut of content.
arXiv Rejections: Monthly submissions to the arXiv preprint repository have risen by 50% since late 2022, while rejections have increased fivefold to 2,400 per month.
Prism Effect: Researcher Raphael Wimmer demonstrated that OpenAI’s Prism can generate a realistic-looking (but fake) experiment paper in just 54 seconds.
While many professors blame the AI models themselves, the slop crisis is a symptom of a deeper, systemic failure in how modern science is funded and rewarded. AI has simply acted as an accelerant for an already broken "Publish or Perish" culture. As arXiv co-founder Paul Ginsparg notes, AI slop is an "existential threat" because it is becoming indistinguishable from genuine work during a quick skim.
WordPress AI Assistant: Edit, Design, and Generate Without Leaving the Editor

The WordPress AI Assistant is now built directly into the core editor, working inside the post editor and the Media Library. Instead of forcing users to switch between separate AI tools and the site builder, the assistant operates where content and layout decisions are already happening.
The assistant is available on Business and Commerce plans and integrates directly into block-based themes.
Core functions (and how to use them):
Layout adjustments: Ask the assistant to change spacing, update color palettes, adjust typography, or add new sections such as testimonials or contact blocks. Changes appear directly in the live layout.
Content rewriting and refinement: Rewrite paragraphs for tone, generate alternate headlines, translate sections, or expand thin copy without leaving the editor.
AI image generation: Create images directly inside the Media Library by describing the subject and style. Aspect ratios and visual tone can be specified.
Image editing: Modify existing images by changing style, converting to black and white, or replacing objects while keeping composition consistent.
Block notes assistance: Inside collaborative block notes, request fact checks, headline ideas, or supporting examples grounded in the page context.
Try this yourself:
Open a page in the block editor and ask the assistant to modernize the layout and rewrite a section for clarity. Then generate a matching image in the Media Library and place it directly into the page.




