Stable Video Diffusion is now available through the Stability AI API

Updated December 21, 2023

Stability AI, known for its Stable Diffusion text-to-image generator, announced that its new core image-to-video model, Stable Video Diffusion (SVD), is now available on its developer platform through an application programming interface (API), allowing third-party developers to embed it into their applications, websites, software and services.

"This new addition provides programmatic access to the most advanced video model designed for a variety of sectors ... Our goal with this release is to provide developers with an efficient way to seamlessly integrate advanced video generation into their products," the company wrote in a blog post.

While this release could help businesses looking to create videos using artificial intelligence, it could also raise some concerns, given that Stability AI is already under scrutiny for training its models on LAION-5B, an open-source AI dataset that was found to contain at least 1,008 instances of child sexual abuse material and was taken offline this week as a result.

Still, for individuals and businesses looking to build generative video into their applications, Stability's new SVD API is one of the leading options in terms of quality, offering "2 seconds of video consisting of 25 generated frames and 24 FILM interpolation frames in an average of 41 seconds," according to Stability AI's post on its LinkedIn page. That may not be enough for large video campaigns, but it certainly comes in handy for creating GIFs with specific messages, including memes.
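To make the frame counts in that quote concrete, here is a minimal Python sketch of the arithmetic. It assumes that the 24 interpolated frames sit one between each adjacent pair of the 25 generated frames, which is a natural reading of the numbers but is not something Stability's post confirms. The `interleave` helper and the frame labels are purely illustrative.

```python
def interleave(generated, interpolated):
    """Place one interpolated frame between each adjacent pair of generated frames.

    Assumption: N generated frames leave N - 1 gaps, each filled by one
    interpolated frame. This is an illustration, not Stability's pipeline.
    """
    if len(interpolated) != len(generated) - 1:
        raise ValueError("expected exactly one interpolated frame per gap")
    frames = []
    for i, frame in enumerate(generated):
        frames.append(frame)
        if i < len(interpolated):
            frames.append(interpolated[i])
    return frames


# Stand-in labels for the frames described in the quote.
generated = [f"gen{i}" for i in range(25)]
interpolated = [f"mid{i}" for i in range(24)]

frames = interleave(generated, interpolated)
clip_seconds = 2.0
total_frames = len(frames)                   # 25 + 24 frames
effective_fps = total_frames / clip_seconds  # frames per second of output
```

Under that assumption, a two-second clip holds 49 frames in total, or about 24.5 frames per second.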

The offering is up against competing video generation models from Runway and Pika Labs, the latter of which recently raised $55 million from Lightspeed Venture Partners and unveiled a new web-based video generation and editing platform.

However, neither of these competitors has made its video-generating AI models available via an API. To use them, you have to go directly to their respective websites and applications, which means that, at least for the time being, external developers cannot build applications on top of them.

Notably, Stability also plans to launch a web-based interface for its video generator, although there is no word yet on when it will be available. The company encourages users to join the waiting list to be the first to try out the interface.

First, let's understand what Stable Video Diffusion does.

Announced almost a month ago as a research preview, Stable Video Diffusion allows users to create MP4 videos from still images, including JPG and PNG files.

Judging by the samples the company has shared, the model does a fairly good job of producing the desired clips, but it is still at an early stage of development, generating only short videos of up to two seconds. This is even shorter than the four-second clips generated by research-oriented video models.

But of course, multiple video clips can be spliced together to create a larger video.

Stability, for its part, says the model can help in industries such as advertising, marketing, television, movies and gaming.

More interestingly, unlike the research-preview models released last month, the newly released model can generate video in multiple layouts and resolutions, including 1024×576, 768×768, and 576×1024. It also includes additional features such as motion strength control and seed-based control, allowing developers to choose between repeatable and random generation.
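As a rough illustration of how a developer might handle these options, the Python sketch below checks a requested size against the three resolutions listed above and fills in a seed. The function name `build_request` and the field names `motion_strength` and `seed` are hypothetical placeholders chosen for this example, not the actual parameter names of Stability's API, which should be taken from the official API reference.

```python
import random

# The three resolutions named in the article, as (width, height) pairs.
SUPPORTED_SIZES = {(1024, 576), (768, 768), (576, 1024)}


def build_request(width, height, motion_strength, seed=None):
    """Build a dictionary of generation settings (hypothetical field names).

    A fixed seed is intended to make a generation repeatable; leaving it
    as None picks a random seed, so each call differs.
    """
    if (width, height) not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported resolution: {width}x{height}")
    if seed is None:
        seed = random.randrange(2**32)  # random generation
    return {
        "width": width,
        "height": height,
        "motion_strength": motion_strength,  # hypothetical name
        "seed": seed,                        # hypothetical name
    }


# Same seed, same settings: the two requests are identical.
repeatable_a = build_request(1024, 576, motion_strength=0.5, seed=42)
repeatable_b = build_request(1024, 576, motion_strength=0.5, seed=42)

# No seed given: a fresh random seed is chosen.
randomized = build_request(576, 1024, motion_strength=0.5)
```

Passing the same seed twice yields identical settings, which is the precondition for repeatable output; omitting it lets the generation vary.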

Stability continues despite controversy

While the launch of the Stable Video Diffusion API gives enterprises an easy way to build video creation features into their products, it also underscores that Stability AI is pressing ahead in the market, even as some question the source of its training data.

Most recently, the Stanford Internet Observatory discovered that the free LAION-5B dataset, which was used to train popular text-to-image generators including Stable Diffusion 1.5 (released by Runway and maintained by Stability), contained at least 1,008 instances of child sexual abuse material. The publisher, LAION, has since taken the dataset offline.

Earlier this year, the company was also named in a class action lawsuit alleging that it paid LAION to acquire "copies of billions of copyrighted images without permission to create Stable Diffusion."

Stability's developer platform API currently provides access to all of the company's models, from the Stable Diffusion XL text-to-image generator to the new SVD model. The company also offers memberships to help customers host models locally.
