Stable Diffusion versus DALL-E 2: Which is better in 2023?
Updated July 18, 2023
Table of Contents
- How do Stable Diffusion and DALL-E 2 work?
- Stable Diffusion compared to DALL-E 2 at a glance
- Both produce great AI-generated images
- DALL-E 2 is easier to use
- Stable Diffusion is more powerful
- Stable Diffusion wins on pricing
- Commercial use is difficult for both
- DALL-E 2 vs. Stable Diffusion: Which should you use?
Stable Diffusion and DALL-E 2 are the two best AI image generation models available today, and they perform almost identically. Both models were trained on hundreds of millions of image-text pairs. That's how they understand concepts like dogs, deerstalker hats, and dark moody lighting, and it's how they can work out what a prompt like "impressionist oil painting of a Canadian riding a moose through a forest of maple trees" is actually asking for.
Stable Diffusion and DALL-E 2 are also the names of apps built on those models: you type in a text prompt, and they generate a set of corresponding images. So which of these apps should you use? Let's break it down.
How do "Stable Diffusion" and "DALL-E 2" work?
Stable Diffusion and DALL-E 2 both use a process called diffusion to generate images. The image generator starts with a field of random noise and then edits it over a series of steps according to its interpretation of the prompt. Because it starts from a different patch of random noise each time, it can produce different results from the same prompt. It's a bit like looking up at an overcast sky, finding a cloud that looks vaguely like a dog, and then being able to snap your fingers to make it look more and more like a dog.
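To make the role of that starting noise concrete, here's a minimal sketch using the open source diffusers library. This isn't how DreamStudio or the DALL-E 2 app work internally, and the model ID, prompt, and file names are just example values; it only illustrates how fixing the random seed fixes the starting noise, so the same prompt reproduces the same image while a different seed gives a different one.

```python
# A minimal sketch of seeded diffusion with the open source diffusers library.
# Illustrative only: the model ID, prompt, and file names are example values.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = ("impressionist oil painting of a Canadian riding a moose "
          "through a forest of maple trees")

# Same seed -> same starting noise -> the same image every time.
image_a = pipe(prompt, generator=torch.Generator("cuda").manual_seed(42)).images[0]

# Different seed -> different starting noise -> a different take on the same prompt.
image_b = pipe(prompt, generator=torch.Generator("cuda").manual_seed(7)).images[0]

image_a.save("moose_seed_42.png")
image_b.save("moose_seed_7.png")
```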
Although both models have a similar technical basis, there are many differences between them.
Stability AI (creators of Stable Diffusion) and OpenAI (creators of DALL-E 2) have different philosophical approaches to how such AI tools should work. They also trained on different datasets and made different design and implementation decisions. So while you can use both tools to accomplish the same task, they can produce very different results.
One more thing to keep in mind: DALL-E 2 is only available through OpenAI (or other services that use its API). Stable Diffusion, on the other hand, is actually a family of open source models. You can access them through Stability AI's DreamStudio app, but you can also download the latest version of Stable Diffusion, install it on your own computer, and even train it on your own data. (This is how many services, such as Lensa's AI avatars, work.)
I'll get into what all this means a little later, but for ease of comparison, I'll look at the models as they're available through their official web apps.
Stable Diffusion compared to DALL-E 2 at a glance
Stable Diffusion and DALL-E 2 are built on similar technology, but they differ in several important ways. Here's a quick overview; read on for the details.
|  | Stable Diffusion | DALL-E 2 |
| --- | --- | --- |
| Quality | ⭐⭐⭐⭐⭐ Exceptional AI-generated images | ⭐⭐⭐⭐⭐ Exceptional AI-generated images |
| Ease of use | ⭐⭐⭐⭐ Lots of options, but they can be overwhelming | ⭐⭐⭐⭐⭐ Type a prompt, press the button |
| Power and control | ⭐⭐⭐⭐⭐ You still have to write the prompt, but you get a lot of control over the generation process | ⭐⭐⭐⭐ Very limited options beyond the text prompt |
Both produce great AI-generated images
Let's get the main point out of the way: both Stable Diffusion and DALL-E 2 are capable of creating incredible AI-generated images. I had a lot of fun playing with both models, was blown away by how they handled some prompts, and laughed a lot at their misfires. The truth is, neither model is objectively (or even subjectively) better than the other, at least not all the time.
If I were forced to highlight where the models might differ, I would say this:
- By default, Stable Diffusion leans toward more realistic images, while DALL-E 2 can be more abstract.
- Sometimes DALL-E 2 can give better results with shorter prompts than Stable Diffusion.
Again, though, the results depend on what you ask for and how much effort you're willing to put into prompt engineering.
DALL-E 2 is easier to use
DALL-E 2 is incredibly easy to use. Type in a prompt, click Generate, and you get four results. It feels like a fun toy.
That's not to say you can't dive deeper into DALL-E 2. You can upload your own images and use them as prompts for new variations, and you can use the editor to inpaint (replace portions of an image with AI-generated elements) or outpaint (extend an image beyond its borders with AI-generated elements). It's just that most of the nuts and bolts are hidden from view.
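For the curious, inpainting is also exposed through OpenAI's image edit endpoint. Here's a hedged sketch using the 2023-era openai Python library (0.27-style calls); the file names and prompt are placeholders, and the mask is a PNG whose transparent pixels mark the area to repaint.

```python
# Sketch of DALL-E 2 inpainting via OpenAI's image edit endpoint (openai 0.27-era API).
# File names and the prompt are example values.
import openai

openai.api_key = "sk-..."  # your OpenAI API key

response = openai.Image.create_edit(
    image=open("living_room.png", "rb"),      # original square PNG
    mask=open("living_room_mask.png", "rb"),  # transparent pixels = area to repaint
    prompt="a golden retriever sleeping on the sofa",
    n=4,                                      # four variants, as in the web editor
    size="1024x1024",
)

for i, item in enumerate(response["data"]):
    print(i, item["url"])  # URLs of the generated variants
```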
Out of the box, Stable Diffusion is a little less user-friendly. You can type a prompt, press the Dream button, and do all the same inpainting and outpainting, but there are additional features here that are worth exploring.
For example, you can choose a style (Enhance, Anime, Photographic, Digital Art, Comic Book, Fantasy Art, Analog Film, or Neon Punk). There are also two prompt fields: one for regular prompts and one for negative prompts, i.e., things you don't want to see in your images. And that's before you get to the additional options that let you set the prompt strength, the number of generation steps, which model is used, and even the seed.
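These knobs map directly onto parameters of the underlying open source model. Here's a minimal, illustrative sketch using the diffusers library rather than DreamStudio itself; the model ID, prompts, and parameter values are example choices, not official defaults.

```python
# Illustrative sketch: DreamStudio-style controls expressed as diffusers parameters.
# Model ID, prompts, and parameter values are example choices, not defaults.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

result = pipe(
    prompt="neon punk portrait of an astronaut, highly detailed",
    negative_prompt="blurry, low quality, extra limbs",   # things you don't want to see
    num_inference_steps=30,                               # number of generation steps
    guidance_scale=7.5,                                   # roughly the "prompt strength" slider
    generator=torch.Generator("cuda").manual_seed(1234),  # the seed
)
result.images[0].save("neon_punk_astronaut.png")
```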
Of course, installing and training your own instance of Stable Diffusion is a different story.
Stable Diffusion is more powerful
While DALL-E 2 is easy to use, it doesn't give you many options. You can generate images from a prompt, and... that's it. If you don't like the result, you have to change the prompt and try again. Some other services that use the DALL-E 2 API, like NightCafé, offer style options and an advanced prompt editor with suggested terms, but ultimately you're still just generating results from a text prompt.
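That simplicity shows in the API as well. A minimal sketch of DALL-E 2 generation via the 2023-era openai Python library (the prompt is an example value): text goes in, image URLs come out, and there's little else to tune.

```python
# Minimal sketch of DALL-E 2 image generation (openai 0.27-era API).
# The prompt is an example value; there is little to configure beyond it.
import openai

openai.api_key = "sk-..."

response = openai.Image.create(
    prompt="watercolor painting of a lighthouse at dawn",
    n=4,                 # four variants per request, as in the web app
    size="1024x1024",    # 256x256, 512x512, or 1024x1024
)
print([item["url"] for item in response["data"]])
```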
Stable Diffusion (in all its versions) gives you more options and control. As I mentioned above, you can set the number of steps, the seed, and the prompt strength, and add a negative prompt, all within the DreamStudio web app.
Even in NightCafé, which also supports Stable Diffusion, you get more options than with DALL-E 2. In addition to choosing styles and using the advanced prompt editor, you can control things like which sampling method the algorithm uses, and more.
Finally, if you want a generative AI trained on specific data, such as your own face, logos, or anything else, you can do that with Stable Diffusion. This lets you build an image generator that consistently produces images of a particular subject or style. The details of how to do this are well beyond the scope of this comparison, but the point is that Stable Diffusion makes possible things that aren't even remotely possible with DALL-E 2.
Stable Diffusion wins on pricing
Pricing for DALL-E 2 is very simple. Each text prompt generates a set of four images and costs one credit. Credits cost $15 for 115, so that's ~$0.13 per prompt or ~$0.0325 per image. Each round of inpainting or outpainting also generates four options and costs one credit. (If you signed up for DALL-E 2 before April 6, 2023, you got a free trial with 40 free credits each month, but that option has since been discontinued.)
Pricing for Stable Diffusion is much more complicated.
Suppose you access it through DreamStudio, rather than downloading Stable Diffusion and running it on your own computer or going through another service that uses a custom-trained model. In that case, Stable Diffusion also uses a credit system, but it's not as neat as "one credit, one prompt." Because you have so many options, the price varies with the image size, the number of steps, and the number of images you want to generate. Say you want to generate four 512x512-pixel images with the latest model using 50 steps: that would cost 3.32 credits. If you only need 30 steps, it would cost just 2 credits. (You can always check the cost before clicking the Dream button.)
When you sign up for DreamStudio, you get 25 free credits, enough for ~30 images (or about seven text prompts) at the default settings. After that, 1,000 credits cost $10, which is enough for more than 1,000 images, or ~300 text prompts, at the default settings.
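To put the two pricing schemes on the same footing, here's the back-of-the-envelope arithmetic as a small sketch, using only the numbers quoted above. (DreamStudio's actual cost varies with resolution and step count; 3.32 credits per default four-image prompt is the example figure used in this article.)

```python
# Cost-per-image arithmetic using the prices quoted above. DreamStudio's real
# cost varies with resolution and step count; 3.32 credits per default
# four-image prompt is the example figure from this article.
dalle_cost_per_prompt = 15 / 115             # $15 buys 115 credits; 1 credit = 1 prompt (4 images)
dalle_cost_per_image = dalle_cost_per_prompt / 4

sd_credit_price = 10 / 1000                  # $10 buys 1,000 DreamStudio credits
sd_cost_per_prompt = 3.32 * sd_credit_price  # default prompt: four 512x512 images
sd_cost_per_image = sd_cost_per_prompt / 4

print(f"DALL-E 2:         ~${dalle_cost_per_image:.4f} per image")  # ~$0.0326
print(f"Stable Diffusion: ~${sd_cost_per_image:.4f} per image")     # ~$0.0083
```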
So if you cut through the confusion and just look at how many images you get per dollar at the default settings, Stable Diffusion comes out ahead. And it has a free trial.
Commercial use is difficult for both
If you plan to use Stable Diffusion or DALL-E 2 for commercial purposes, the situation gets a bit more complicated.
Both companies currently allow commercial use, but the legal implications aren't fully settled. In a February 2023 decision, the US Copyright Office ruled that images created by Midjourney, another generative AI, can't be copyrighted. That means anyone is free to take an image you create and use it as they see fit, although this hasn't really been tested in court yet.
Purely from a licensing standpoint, Stable Diffusion has a slight edge. Its model has fewer guardrails (and even fewer if you train it yourself), so you can create more kinds of content. DALL-E 2 prohibits a lot of content, including images of public figures.
DALL-E 2 also adds a multicolored watermark to the bottom right corner of images, although it can be removed.
DALL-E 2 vs. Stable Diffusion: Which should you use?
While DALL-E 2 is the better-known name in AI image generation, it makes sense to try Stable Diffusion first: it has a free trial, it's cheaper, it's more powerful, and it comes with broader usage rights. And if you really get into it, you can use it to build your own generative AI.
When DALL-E 2 had a generous free trial, its simplicity made it easy to recommend. If OpenAI brings that trial back, it will again make sense for anyone who just wants to see what AI image generators can do.
Either way, the decision comes down less to the quality of the results and more to the overall experience. Both apps can create amazing, funny, and even bizarre images with the right prompting. And in the end, you might find yourself using a third-party app built on one of these two models, in which case you won't even notice the difference.