🧬 OpenAI's Visual Reasoning Models Arrive
Plus: Copilot Learns to Click, Columbia Confronts AI's Limits, Claude Self-Directs

Welcome, AI Pioneers!
OpenAI's o3 and o4-mini now think with images and choose tools autonomously. Microsoft's Copilot can click and type through any app interface. Columbia experts warn that increasing intelligence doesn't equate to true understanding. And Claude's Research mode autonomously gathers and cites information from the web and your workspace.
In today's Generative AI Newsletter:
• OpenAI's new reasoning models are visual thinkers
• Microsoft Copilot can now interact with software autonomously
• Columbia explores AI's limitations and ethical considerations
• Claude upgrades to become a powerful research assistant
Special highlight from our Network
From Figma to fully live: Anima's AI Playground makes it real (no code needed)
Got a Figma prototype waiting to shine? Anima's new AI Playground might be your fastest path to a live product.
With over 1.4M installs under its belt, Anima just dropped a browser-based tool that turns static Figma designs into functional web apps: no coding, no stress. Just paste your Figma link, describe what you want, and let AI do the rest.
What's in it for you?
• Build interactive apps from Figma in minutes
• No engineering support required
• Share instantly with one click
This isn't just a launch; it's a shift. Anima is making it easier than ever for designers to own the end-to-end experience, from wireframe to working product.
Ready to try it?
Special highlight from our Network
🚨 Last chance to invest in the AI startup disrupting the $5.7T hospitality industry
Jurny's funding round is almost full, and for good reason.
Powered by its next-gen agentic AI, Nia, Jurny automates up to 90% of daily hospitality operations. Think guest messaging, reviews, task management, cleaning coordination, upselling, and more, all handled intelligently so operators can scale without growing overhead.
🚨 Operators using Jurny report:
✅ Saving 12+ hours per listing, every week
✅ Managing 100+ units with ease
✅ Boosting guest experience while cutting manual work
Already trusted by 200+ hospitality companies and backed by Mucker Capital, Okapi VC, and 2,000+ investors, Jurny isn't just another AI promise; it's delivering real results right now.
⏳ The round closes April 30.
This is your last chance to invest in the AI that's redefining hospitality.
🧬 OpenAI is Teaching AI to Think in Pictures

Image Credit: OpenAI
OpenAI has released o3 and o4-mini, two reasoning models designed to think more deeply and visually than their predecessors. They are capable of autonomous tool use, image-based reasoning, and generating novel scientific ideas. A new open-source coding agent, Codex CLI, completes the release, nudging AI one step closer to independent cognition.
What's changed:
• o3 is OpenAI's most advanced thinker to date, outperforming earlier models on complex benchmarks in math, code, science, and visual tasks. It functions as both analyst and collaborator, capable of generating and evaluating original hypotheses.
• o4-mini brings speed and efficiency to the same cognitive space. Though smaller, it surpasses prior models on many tasks, including competition-level math and data science.
• Both models can autonomously decide how and when to use tools within ChatGPT, including web browsing, Python, file analysis, and image generation, as part of their reasoning flow.
• They also think with images, not just interpreting them but integrating them into their internal problem-solving chain. This enables a new class of tasks that blend visual input with symbolic logic.
• Codex CLI, the new coding agent, connects these reasoning models directly to developers' terminals. It enables a form of real-time AI programming assistance that can reason, write, and debug in context.
The boundaries of machine intelligence are shifting. These models do not merely respond; they observe, choose, and act. With memory, multi-tool reasoning, and visual understanding, OpenAI is sculpting systems that resemble general intelligence in structure, if not yet in soul. The frontier is no longer about speed or scale, but about depth and intention.
🖱️ Copilot Can Now Click for You

Image Credit: Microsoft
Microsoft has unveiled a new 'computer use' feature in Copilot Studio that allows AI agents to interact with desktop and web applications the same way a human would: by clicking, typing, and navigating through graphical interfaces.
How it works:
• Copilot agents can now operate within GUIs by simulating user actions like selecting menus or filling in fields, even on systems without APIs.
• Real-time adaptability allows agents to recover from interface changes using built-in reasoning to avoid workflow disruptions.
• Microsoft-hosted infrastructure ensures enterprise data remains secure and is excluded from training any models.
• Target use cases include automated data entry, invoice processing, and market research, removing bottlenecks in routine, click-heavy workflows.
• No code required: users can build automation flows in natural language, monitor execution via side-by-side video, and refine steps iteratively.
Microsoft is turning Copilot into more than a chat assistant. By giving it hands-on ability to use software, the company is quietly reimagining robotic process automation (RPA) for the agentic era. It's autonomous digital labor for everyday enterprise tools. And Microsoft's tight integration with the workplace gives it a massive head start.
🧠 The Line Between Humans and Machines

Image Credit: Sam Island
At Columbia University's first AI Summit, scholars and scientists confronted the central paradox of artificial intelligence. As models grow more capable, they also reveal their limits. Machines can simulate intelligence, but can they understand? They can calculate and predict, but can they care?
The details:
• Faculty from across disciplines explored AI's expanding role in fields like medicine, energy, design, law, and the arts.
• Architecture professor David Benjamin spoke of using generative AI for sustainable structures. Economist Joseph Stiglitz questioned whether the economic incentives behind AI align with the public interest.
• Several experts emphasized the fundamental gap between machine performance and human depth. Computer scientist Lydia Chilton observed that AI excels at breadth but lacks depth. It cannot yet think, feel, or judge as humans do.
• Sociologist Gil Eyal noted a growing distrust in experts and a misplaced faith in the supposed objectivity of algorithms.
• Legal scholars warned of regulatory blind spots, especially in AI therapy and data privacy. Visual arts professor Naeem Mohaiemen predicted major disruptions to creative professions.
• The presence of embodied AI, or robots, sparked both fascination and unease. Roboticist Sami Haddadin described the challenges of physical intelligence. Tim Wu recalled Asimov's first law of robotics and warned that we may be nearing the point where machines can harm without oversight.
Artificial intelligence continues to evolve, but so does our understanding of what it cannot yet do. The summit did not celebrate progress without caution. Instead, it posed deeper questions about meaning, ethics, and control. Intelligence is not just a function. It is also a responsibility. As machines begin to mirror our minds, we must decide what they reflect and what they never should.
🧑‍💻 Claude Masters Research Tasks

Image Credit: Anthropic
Anthropic has launched a major upgrade to Claude, introducing a new Research mode that autonomously pulls information from the web and your workspace. Combined with a fresh Google Workspace integration, Claude is now positioned as a powerful AI research assistant for work and decision-making.
What's new:
• Claude's Research mode performs multi-step, autonomous web searches and internal queries, surfacing high-quality, cited answers tailored to the user's question.
• New Workspace integration enables Claude to securely access Gmail, Google Docs, and Calendar data without manual uploads or context repetition.
• Enterprise users gain access to intelligent document cataloging via retrieval-augmented generation (RAG), letting Claude navigate large internal files and repositories with precision.
• Launch availability: Research is in beta for Max, Team, and Enterprise users in the US, Japan, and Brazil. Google Workspace integration is now live for all paid users.
Anthropic is entering the AI research assistant race with its own spin: slower rollout, but high trust and thoughtful design. By pairing state-of-the-art reasoning with access to both online and personal work data, Claude is quietly transforming into a context-rich collaborator and a serious contender for enterprise knowledge work.
