Creative AI

The new Gen of AI Video is here

Runway's Gen-3 Alpha is now available

Erik Knobl
July 07, 2024

Good day, humans.

In today's newsletter:

Runway's Gen-3 Alpha is now available to the public
Moshi surpasses OpenAI: A chatbot that can speak, open to the public
Figma disables its AI tool
You can now use classic voices in your narrations
Use this Midjourney prompt

Let's get to it!

Tiny Panda on a finger

Runway's Gen-3 Alpha is now available to the public

RunwayML's latest update, Gen-3 Alpha, improves hyper-realistic video generation from text. It can create highly detailed videos with complex scene changes, a wide range of cinematic choices, and detailed art direction. It marks a significant improvement in fidelity, consistency, and motion compared to the previous version.

Gold turns into a rose

Developed by a multidisciplinary team, Gen-3 Alpha initially focuses on text-to-video generation, with plans to expand to image-to-video and video-to-video functionalities in the future.

Man transforms into a wolf

Why it matters

This tool opens up enormous possibilities and could greatly reduce the cost of creating complicated special effects. Even in its current basic state, it can quickly perform tasks that used to take weeks.

The interface of Moshi.

Moshi surpasses OpenAI: A chatbot that can speak, open to the public

Kyutai, a non-profit AI lab, unveiled Moshi: a voice-enabled AI that is accessible to all. To use it, go to the website, enter your email, and you're ready to chat. These are the presented capabilities:

Moshi can chat smoothly and expressively using its voice, not just text.
It's compact enough to run locally on devices, with no internet connection needed.
The code and model weights will be freely shared (unprecedented for voice AI).
Developers can tweak it, extend it, or use it as a base for voice-enabled products.
It has impressive text-to-speech capabilities with emotion and multi-voice interactions.

Why it matters

Kyutai built this in just 6 months with 8 people. It's true that Moshi's knowledge and factual accuracy are deliberately limited for now, but the product is already available to the public. Meanwhile, OpenAI still hasn't shipped the voice mode for GPT-4o, 7 weeks after announcing it.

Figma disables its AI tool

After Figma's AI tool was found generating copies of an Apple app, Figma CEO Dylan Field announced that the tool had been temporarily disabled and blamed himself for pushing his team to launch it quickly. Worse yet, Figma has acknowledged that it does not know how the tool was trained, because the training was done by a provider.

Why it matters

The lack of transparency in AI training continues to raise intellectual property issues for all these tools. Companies must be transparent about how their models are trained.

James Dean

You can now use classic voices in your narrations

You can now listen to your favorite blogs, articles, books, or research papers narrated by legendary celebrity voices thanks to the new feature called ‘Iconic Voices’ from ElevenLabs.

The company partnered with the estates of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier so that users can have books, articles, and more read aloud in their voices.

Use this Midjourney prompt

Wild --sref 1000000001 --ar 16:9 --stylize 1000 --p

Thanks for reading.

See you next week!

Hi 👋 I'm Erik Knobl, Product Designer by day and explorer of Generative Artificial Intelligence on weekends. I share my learnings in this newsletter. Consider subscribing to stay in touch.