Sora is an AI model that can generate realistic and imaginative scenarios from text prompts.
Prompt: A drone camera orbits a stunning historic church perched on a rocky outcrop along the Amalfi Coast. The scene highlights intricate architectural details, tiered pathways, and patios. Below, waves crash against the rocks, framing the coastal waters and hilly landscapes of Amalfi, Italy. In the distance, figures stroll along patios, marveling at the dramatic ocean views. The warm afternoon sun casts a magical and romantic glow over the scene, beautifully captured through photography.
Prompt: In Tokyo, a fashionable woman strolls down a bustling street illuminated by vibrant neon signs and animated city lights. She exudes style in a black leather jacket paired with a flowing red dress and sleek black boots, complemented by a black purse. Sporting sunglasses and bold red lipstick, she walks with confidence and ease. The damp pavement reflects the luminous colors, enhancing the lively atmosphere filled with pedestrians.
Sora's advanced language comprehension allows it to interpret prompts accurately, crafting compelling characters imbued with vivid emotions. Moreover, Sora can generate multiple scenes seamlessly in a single video, maintaining consistency in characters and visual style throughout.
Prompt: Archeologists discover a generic plastic chair in the desert, excavating and dusting it with great care.
Prompt: A litter of golden retriever puppies playing in the snow. Their heads pop out of the snow, covered in.
Prompt: The camera directly faces colorful buildings in Burano Italy. An adorable dalmation looks through a window on a building on the ground floor. Many people are walking and cycling along the canal streets in front of the buildings.
Prompt: A cat waking up its sleeping owner demanding breakfast. The owner tries to ignore the cat, but the cat tries new tactics and finally the owner pulls out a secret stash of treats from under the pillow to hold the cat off a little longer.
Sora is a diffusion model that generates videos by starting with an initial state resembling static noise and gradually refining it through multiple steps to remove the noise.
Like GPT models, Sora uses a transformer architecture, enabling exceptional scalability.