Creating Video from Text: Sora does it with great Natural Resemblance

Saturday February 17, 2024 8:20 PM , ummid.com News Network

[Screenshot of a Sora video generated using the prompt "Japanese woman walking down a Tokyo street filled with warm glowing neon and animated city signage"]

San Francisco: More than a year after launching its AI chatbot ChatGPT, OpenAI has now come out with Sora, another stunner.

Sora creates video from text in seconds, and does so with a remarkable natural resemblance.

“Sora is an AI model that can create realistic and imaginative scenes from text instructions”, OpenAI's introductory message says.

Sora's launch has sent shockwaves through the AI community, sparking intrigue and anticipation as the new AI tool promises to revolutionise video creation.

Unlike traditional methods, Sora can transform simple written prompts into mesmerising one-minute videos, showcasing its ability to breathe life into imaginative scenarios with remarkable realism.

“Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt”, OpenAI announced on Friday, Feb 16, 2024.

OpenAI has also showcased on the Sora website videos that were generated directly by the newly launched text-to-video model, without modification.

"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world", OpenAI said.

At the same time, OpenAI has warned that the current text-to-video model has weaknesses, including a tendency to confuse the spatial details of a prompt.

“The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.

“The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory”, OpenAI said.

Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.

In addition to being able to generate a video solely from text instructions, the model is able to take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small detail.

The model can also take an existing video and extend it or fill in missing frames.

 
