[Screenshot of a Sora video generated from a prompt describing a Japanese woman walking down a Tokyo street filled with warm glowing neon and animated city signage]
San Francisco: More than a year after launching the AI chatbot ChatGPT, OpenAI has come up with another stunner – Sora.
Sora creates video from text in seconds, with remarkably natural realism.
“Sora is an AI model that can create realistic and imaginative scenes from text instructions”, OpenAI’s introductory message says.
Sora's launch has sent shockwaves through the AI community, sparking intrigue and anticipation as the new AI tool promises to revolutionise video creation.
Unlike traditional methods, Sora can transform simple written prompts into mesmerising one-minute videos, showcasing its ability to breathe life into imaginative scenarios with remarkable realism.
“Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt”, OpenAI announced on Friday, Feb 16, 2024.
On the Sora website, OpenAI has also put on display videos generated directly by the newly launched text-to-video model, without modification.
"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world", OpenAI said.
At the same time, OpenAI has also warned about the weaknesses of the current text-to-video model.
The model may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.
The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.
Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.
In addition to being able to generate a video solely from text instructions, the model is able to take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small detail.
The model can also take an existing video and extend it or fill in missing frames. Here are some sample Sora-generated videos:
Prompt: “A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. she wears a black leather jacket, a long red dress, and black boots, and carries a black purse. she wears sunglasses and red lipstick. she walks confidently and casually.… pic.twitter.com/cjIdgYFaWq
— OpenAI (@OpenAI) February 15, 2024
Prompt: “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. the art style is 3d and realistic, with a focus on lighting and texture. the mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with… pic.twitter.com/aLMgJPI0y6
— OpenAI (@OpenAI) February 15, 2024
Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.” pic.twitter.com/0JzpwPUGPB
— OpenAI (@OpenAI) February 15, 2024