Welcome, AI Insiders!
This week, OpenAI showed off its AI prowess with a gold-medal-level score on the Math Olympiad and a near-win against the world's best coder. However, its new ChatGPT Agent comically failed at the simple task of buying a lamp.
In today's Generative AI Newsletter:
OpenAI's math model aces Olympiad with 35/42 score
Only one human coder beat OpenAI's model in Tokyo
ChatGPT Agent goes live, but early tests show real flaws
Special highlight from our network
The easiest way to design an app? Just describe it.
App Alchemy is the AI-powered design tool built for non-designers, solo founders, and fast-moving teams.
Just describe your app idea in plain English, and App Alchemy instantly turns it into beautiful, editable screens.
Want to match the look of your favorite app?
Upload it as a style guide.
Need changes?
Just chat.
Update buttons, layouts, colors: no Figma skills required.
You can clone templates, collaborate with your team, and export production-ready designs anytime.
It's the simplest way to bring your app idea to life, without hiring a designer or touching a single line of code.
Try App Alchemy for free. No credit card needed.
Start building today.
OpenAI's Math Model Crushes Olympiad-Level Problems

Image source: OpenAI
OpenAI's latest model has stunned the math world by scoring 35 out of 42 points on the International Math Olympiad, a result that puts it on par with gold-medal-winning human contestants. Tested under the same conditions as real participants, the model worked through two 4.5-hour exams using only natural language reasoning, with no tools or internet access.
Here's what the model just achieved:
Solved 5 of 6 problems from the IMO, the world's toughest high-school math contest
Matched top human scores, evaluated by three former IMO medalists who reviewed the work
Used no external tools; only long-form reasoning and proof-writing in natural language
Remains unreleased for now, with OpenAI saying it will take months before public access
The score is historic, but it also signals something bigger: AI is entering a new phase of abstract, theoretical reasoning. While critics like Gary Marcus warn about transparency and cost, the result puts OpenAI at the frontier of what it means to "think" in mathematics and who, or what, gets to compete.
One Human Still Stands Between OpenAI and Coding Supremacy

Image source: Psyho (@FakePsyho on X)
OpenAI's autonomous coding model just competed at the AtCoder World Tour Finals in Tokyo and nearly won. In a historic first, the model faced off live against the world's top human coders with no human help. Only one person beat it: Przemysław Dębiak, known as Psyho, who powered through exhaustion to claim victory.
Here's what went down in Tokyo:
The contest lasted 10 hours, with challenges based on optimization and robot-guided mazes
OpenAI's model came in second, just behind Psyho, who beat it by under 10 percent
It was the first AI to compete live, solving all problems autonomously with no manual input
Psyho celebrated online, saying "Humanity has prevailed (for now!)"
OpenAI CEO Sam Altman congratulated the winner, while reaffirming the company's prediction that its models will become the world's top programmers. That moment may not be far off. This time, a human won. The next time, the leaderboard might look very different.
ChatGPT Agent Wows on Paper, Stumbles in Practice

(Image: © OpenAI)
OpenAI's new Agent tool transforms ChatGPT into a hands-on digital assistant, capable of completing complex tasks using its own virtual computer. It can shop, plan, browse, analyze, and generate docs, all from a single prompt. But early users say the tool still feels clunky, with real-world glitches slowing it down.
Here's what reviewers found:
Agent uses its own virtual computer, combining browsing, code execution, file editing, and analysis
It merges Operator and Deep Research, enabling end-to-end task execution with context preserved throughout
Early reviews called it ambitious but buggy, with The Verge comparing it to a "day-one intern"
It's live for Pro, Plus, and Team users, but access is already delayed due to overwhelming demand
The system can read emails, prep slide decks, and browse the web, but early reports show it freezing on basic tasks and missing key steps. The vision is strong, but execution still lags. The real test will be how quickly OpenAI turns this early draft into a reliable digital worker that can truly handle your day.
