Sunday, April 28, 2024

OpenAI Sora: An AI-Powered Text-to-Video Generator That Can Create One-Minute-Long Clips | Details Inside

OpenAI has introduced its first artificial intelligence (AI)-powered text-to-video generation model, Sora. The company claims it can generate videos up to 60 seconds long, which is longer than any of its competitors in the segment, including Google’s Lumiere.

Sora is currently available to red teamers (security experts who extensively test software to help companies improve it) and some content creators.

The AI firm also plans to include Coalition for Content Provenance and Authenticity (C2PA) metadata in the future, once the model is deployed in an OpenAI product.

OpenAI announced the AI video generator in a post on X (formerly known as Twitter). The videos it claims to generate are more than ten times longer than what its rivals offer.

Google’s Lumiere can generate 5-second-long videos, while Runway AI and Pika 1.0 can generate 4-second and 3-second clips, respectively.

The X accounts of OpenAI and CEO Sam Altman have also shared several videos generated by Sora, along with the prompts used to create them.

The resulting videos appear highly detailed, with seamless motion, something other video generators in the market have somewhat struggled with.

      As per OpenAI, it can generate complex scenes with multiple characters, multiple camera angles, specific types of motion, and accurate details of the subject and background.

This is possible because the text-to-video model uses not only the prompt but also an understanding of “how those things exist in the physical world.”


Sora is essentially a diffusion model that uses a transformer architecture similar to that of GPT models.

Similarly, the data it consumes and generates is represented in units called patches, which are akin to tokens in text-generating models.

Patches are collections of videos and images, bundled into small portions, as per OpenAI.

This representation of visual data enables OpenAI to train the video generation model on videos of different durations, resolutions, and aspect ratios.
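The idea of patches can be sketched roughly as follows. OpenAI has not published implementation details, so the patch dimensions and the exact patchification scheme below are illustrative assumptions, not Sora's actual method: the sketch simply shows how a video tensor of any duration and resolution can be cut into flat "spacetime patches," analogous to tokens in a language model.

```python
import numpy as np

def video_to_patches(video, pt=4, ph=16, pw=16):
    """Split a video of shape (T, H, W, C) into flattened spacetime
    patches. Patch sizes (pt, ph, pw) are illustrative assumptions."""
    T, H, W, C = video.shape
    # Trim so the video divides evenly into whole patches
    video = video[:T - T % pt, :H - H % ph, :W - W % pw]
    T, H, W, _ = video.shape
    # Carve into (pt x ph x pw) blocks, then flatten each block
    patches = (video
               .reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
               .transpose(0, 2, 4, 1, 3, 5, 6)
               .reshape(-1, pt * ph * pw * C))
    return patches

# A 16-frame, 64x64 RGB clip yields (16/4) * (64/16) * (64/16) = 64 patches
clip = np.zeros((16, 64, 64, 3), dtype=np.float32)
print(video_to_patches(clip).shape)  # (64, 3072)
```

Because the same patchification applies to any input shape, a model trained on such patches is not tied to a single duration, resolution, or aspect ratio, which matches OpenAI's stated motivation for the representation.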

      In addition to text-to-video generation, Sora can also take a still image and generate a video from it.

However, Sora is not without flaws.

OpenAI said on its website:

      “The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterwards, the cookie may not have a bite mark.”

To ensure the AI tool is not used for creating deepfakes or other harmful content, the company is building tools to help detect misleading content.

It also plans to include C2PA metadata in the generated videos, having recently adopted the practice for its DALL-E 3 model.

      It is also working with red teamers, especially domain experts in areas of misinformation, hateful content, and bias, to improve the model.

As of now, Sora is only available to the red teamers and a small number of visual artists, designers, and filmmakers to gather feedback about the product.
