OpenAI presents GPT-4o

One of the fastest, smartest, and most multimodal AIs to date has been launched.

Good day, humans.

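In today's newsletter: OpenAI's GPT-4o, Google's Astra and Veo, Midjourney's 4.2 billion styles, Google's AI-powered IDE, Apple's Siri overhaul, Stability AI's troubles, and the ElevenLabs reader app.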

Let's dive in.

OpenAI Introduces GPT-4o

OpenAI announced GPT-4o ("o" for "omni"), one of the fastest, smartest, and most multimodal AI models to date.

GPT-4o will soon be available for free to everyone, and ChatGPT is also getting a desktop app. Yes, everyone will be able to use GPT-4o and GPTs as a result. ChatGPT Plus users will have priority access to GPT-4o, with usage limits up to 5x higher.

This new ChatGPT is not only smarter (it currently tops the LMSYS Chatbot Arena Leaderboard); it can also talk and see.

OpenAI has likened GPT-4o’s voice abilities to a real-life version of the film "Her".

Talking with the Machine

We've had computers that can talk for a while, but they've never come close to holding a genuine conversation. Voice Mode feels like chatting with a real human: it picks up on your tone, language, and expressions in real time.

Explore what it can do here:

Try not to fall in love with your chatbot.

Live 20/20 Vision

GPT-4o can interpret gestures, landscapes, photos, screenshots, and documents, and use what it sees to help you.

OpenAI shared the first image generated with GPT-4o.

Google announced Veo, its AI Video Generator

Google announces new models

Google doesn't want to be left behind and has introduced two new AI projects:

Astra

Google presented Project Astra, its vision for the future of AI assistants. It is a multimodal AI assistant that can interpret images and audio in real time, identify objects, locate lost items, and explain code.

Veo

Google also introduced Veo, a text-to-video generator that lets users create AI-generated videos. Veo has "an advanced understanding of natural language," allowing the model to understand cinematic terms like "timelapse" or "aerial shots of a landscape." It does not have a release date yet.

Sref 36 is one of Midjourney's 4.2 billion styles

Midjourney Has 4.2 Billion Styles

Midjourney has introduced a new feature that lets you replicate a visual style consistently, which makes it easier to create sets of similar images. There are over 4.2 billion styles. To use one, add the parameter --sref followed by the style number to the end of your prompt, for example --sref 36. Some of the styles are here.

Google presented its AI-powered IDE

Google Presents its AI-Powered IDE

Google announced that Project IDX, the company's next-generation, browser-based, AI-focused development environment, is now in open beta.

Apple is working on a new version of Siri.

Apple Will Overhaul Siri

Apple is overhauling Siri to catch up with chatbot competitors such as OpenAI's ChatGPT, adding generative artificial intelligence and improving its conversational abilities. (NYT)

Stability AI needs coins.

Stability AI in Trouble

Reports indicate that the startup Stability AI is facing financial difficulties and is looking for a buyer. The company is considered one of the major pillars of AI art generation.

The app can read news articles aloud to its users.

ElevenLabs Launches Screen Reader App

ElevenLabs, an AI voice dubbing startup, has launched the free ElevenLabs Reader: AI Audio app. The app can recognize and voice text from web pages, PDFs, and other documents using 11 different voices. (Bloomberg)

Thanks for reading. See you next week!

Hello 👋 I’m Erik Knobl, a Product Designer by day and an explorer of Generative AI on weekends. I share my learnings in this newsletter. Consider subscribing to stay in touch.