Google’s Veo: AI-Powered Videos in Cinematic Style

In February, OpenAI unveiled Sora, a text-to-video generation model that turns textual prompts describing a scene into full-motion video up to 60 seconds long. Google has now introduced its own AI video generation model, “Veo.” It also supports text-to-video creation and can produce 1080p videos longer than 60 seconds in a range of cinematic styles, with improved natural language understanding.

Google showcased Veo’s capabilities in two tweets. One featured a rendering demonstration, with Veo-generated scenes of a city at night, cars speeding along, and urban landscapes in daylight. The other showed Veo generating video from a specific text prompt. The prompt “Many spotted jellyfish swimming underwater. Their bodies are transparent, glowing in the deep sea” produced a video of several spotted jellyfish gliding through the ocean, with natural, continuous movement, clear light and shadow, and no noticeable visual errors.

Google stated that creators can use cinematic terms such as time-lapse and aerial landscape shots in their prompts to achieve the desired visual effects, which reduces the time spent adjusting prompts. Veo also supports video extension: creators who are dissatisfied with a video’s length can either let Veo extend it automatically or add further prompts to generate a longer video.

Google has already opened a trial channel and plans to provide a test version to select users within the year. Google also said it intends to integrate some Veo features into YouTube Shorts, though how they will be implemented and how well they will work remain unclear.