OpenAI's Massive Image Generation Announcement
Also, I rendered beef tallow so you don't have to
OpenAI’s Big Announcement
Often, right after I finish writing a newsletter and hit “Send”, something momentous happens in AI news that I wish I could have included.
That was certainly the case yesterday. Minutes after I emailed you about my crazy Philz coffee prediction experiment, I saw that OpenAI was holding a livestream to announce something massive.
Turns out that they’re entirely revamping how ChatGPT generates images, and finally upgrading from the (truly terrible) DALL-E to a native, multimodal system.
Even if you don’t care about AI or tech at all, OpenAI’s product announcements are high comedy.
Misplaced plants! Awkward transitions! Pointless chairs!
It’s like a bunch of middle schoolers decided to record their science fair project presentation…except this is one of the world’s most innovative and valuable companies.
Frankly, I love the vibe of these demos. Anyone who said that Silicon Valley’s informal, somewhat bizarre startup culture died with the maturation of Google and Facebook needs to watch one of these. The Valley culture born in the 1950s, I’m pleased to report, is alive and well.
I watched the demo as it was happening and reacted to it with my own commentary, like Mystery Science Theater 3000 but with more data science acronyms.
You can watch that here:
Here are the 3 biggest takeaways:
The move to a native, multimodal model means much better handling of text (including paragraphs of accurate text in AI images, a true first).
This image—including the whiteboard text—was generated by the system. That’s pretty insane, especially when many models currently on the market struggle to generate even a few lines of accurate text.

It also means you can effectively edit images using ChatGPT without each generation standing on its own. OpenAI shows how you could edit the above image just by asking to make it into a “selfie view of the photographer, as she turns around to high five him”.


Even in transforming the image, GPT-4o keeps the same female character and the same whiteboard text.
And finally, it means much better world knowledge and prompt understanding.
Many of today’s image generators use an LLM to expand on your prompt before handing it off to the generation function.
4o takes a different approach: it is a true multimodal model, which means that the same model handles both interpreting the prompt and actually making the image.
That means the system is a lot better at using its broad knowledge of the world in order to understand what your prompt is looking for.
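If it helps to see the difference laid out, here’s a toy Python sketch of the two setups. To be clear, this is not OpenAI’s code or anyone’s real API: every name in it (ImageRequest, expand_prompt, two_stage_generate, native_generate, and the whiteboard_photo.png file name) is made up for illustration, and no image is actually generated. It only shows what each setup gets to “see”.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageRequest:
    """Hypothetical request: a prompt plus an optional earlier image from the chat."""
    prompt: str
    prior_image: Optional[str] = None


def expand_prompt(prompt: str) -> str:
    """Stage 1 of the two-step setup: a pretend LLM rewrites the prompt as text."""
    return f"A detailed, high-quality image of: {prompt}"


def two_stage_generate(request: ImageRequest) -> dict:
    """An LLM expands the prompt, then a separate image model gets only that text."""
    expanded = expand_prompt(request.prompt)
    # The hand-off is a plain string, so the earlier image never reaches the generator.
    return {"sees_text": expanded, "sees_prior_image": None}


def native_generate(request: ImageRequest) -> dict:
    """One model reads the prompt and makes the image, with the whole chat in view."""
    return {"sees_text": request.prompt, "sees_prior_image": request.prior_image}


request = ImageRequest(
    prompt="selfie view of the photographer, as she turns around to high five him",
    prior_image="whiteboard_photo.png",  # hypothetical file name
)
print(two_stage_generate(request))
print(native_generate(request))
```

In the toy two-step version, everything has to squeeze through a text hand-off, so the earlier image gets dropped along the way. In the single-model version, nothing is handed off, which is roughly why edits can build on a previous image instead of starting from scratch.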
As an example, here’s a photo shared by OpenAI of GPT-4o’s response to the prompt: “A cat looking into a puddle of water on a street, but its reflection is that of a tiger, and both reflections are realistically distorted by ripples in the water”

It gets the concept perfectly and creates something lovely.
Ideogram, in contrast, makes a nice image but clearly doesn’t understand the artistic concept the prompt is trying to create.

Mashing up GPT-4o’s world knowledge with a world-class image generator could be a killer combo.
Of course, we can’t know for sure until we actually get to try the system out. OpenAI says it should be rolling out today, so stay tuned!
In Other News
You may have read a lot about beef tallow recently. It’s all the rage, and has also become weirdly political.
In the interest of covering every conceivable bizarre and specific home topic on my YouTube channel, I rendered some beef tallow and documented the whole process.
Enjoy!