Dalle-3: How to use it, use-cases and app ideas
OpenAI has recently unveiled DALL-E 3, a text-to-image model that can generate highly detailed and accurate images from text.
Unlike other text-to-image systems, DALL-E 3 does not require users to master the art of prompt engineering.
Instead, you can simply type in whatever you want, and the model will generate an image that precisely adheres to the text provided.
Outline:
How to use Dalle 3?
How to use Bing image creator?
What can you do with Dalle-3
Early Dalle-3 use-cases
What is not possible with Dalle-3
How does it compare to Midjourney/SD
What can you build?
How to use Dalle 3?
Currently the image model is available on Bing Chat, with promises that we’ll see it in ChatGPT in the coming weeks.
To use it, just choose “Creative Mode” in the chat window and start typing your prompt.
But if you only want to create images, you can use the native Bing Image Creator instead of the chat.
How to use Bing image creator?
The Bing Image Creator is powered by DALL-E 3, and you can use it for free. To generate a photo, just type your prompt and then click “Create”. Because there is a lot of demand, you only get 100 fast generations, called “Boosts”. To use them, make sure you click on the “Boosts” icon next to the prompt.
What can you do with DALL-E 3?
The possibilities are endless! You can do so many things with DALL-E 3, and very easily.
Generate realistic and weird images
Edit your images by uploading your photo and asking the chat to edit parts of it
Assistant for practical problems: You can ask Bing to help you fix your bike by providing a photo, the manual, and a list of your tools
Photos with words: You can create photos with words, which is not the case with other AI image generators like Midjourney and Stable Diffusion
Early Dalle-3 use-cases
DALL-E 3 has a wide range of use-cases. Here are some that we and other people on the internet tested out.
Understands the context of text and objects in images very well!
I provided a photo of an optical illusion, and not only did Bing understand what the photo meant, it also explained the optical illusion to me. Now, that’s powerful.
Puns
You can easily make pictures with words. My first idea was to recreate well-known puns.
Here’s the result for “raining cats and dogs”, “this is a piece of cake”, “I’m all ears” and “we’re on the same page”.
Comic Books
Probably the most interesting use-case - you can create comics with text on them.
You can use ChatGPT or Bing Chat itself to brainstorm ideas, and then feed them in as a prompt to generate the image.
Or you can give a very simple prompt saying "generate a comic about Rocket Science" and it will generate a comic just like that.
Coloring Books
Another use case is creating coloring books for your kids with fictional characters. Just look at this example:
Weird Concepts
The model can also think outside the box and create weird concepts like a "Pearl with a girl earring"
What is not possible with Dalle-3
There are some things that DALL-E 3 cannot do.
Can’t explain recent events from images
For example, if you provide a photo from a recent event and ask it to describe what happened in the photo, it might not do well.
Find out what happened with my question about Taylor Swift and Mama Kelce in the booth👇🏻
Can’t generate images with famous people
You also can't generate photos with famous people because Bing’s Privacy Blur feature will hide faces from Bing chat.
You’ll get an error every time, without an explanation.
Can’t edit faces on images
I created a really nice photo with Taylor Swift and Travis Kelce, but TayTay was looking a bit sad.
When I asked Bing Chat to make her happy, it refused and it even made a joke!
It also doesn’t work well if you try to edit any other piece of the image, because it will blur the faces in the result.
In this case I wanted to change the jersey number from 87 to 95, and although it did change the number, the faces were blurred.
Also, the new picture didn't look the same as the old one. 👇🏻
The verdict: How does it compare to MidJourney and Stable Diffusion
DALL-E 3 feels like the upgrade that we have been waiting for since the beginning of 2023.
This model can do two major things better than any other model on the market:
Words written on screens: This was very tough to accomplish, and we’re genuinely interested in how big an impact the language capabilities of GPT-4 had on this. Stable Diffusion has been announcing this since early 2023, but its models never accomplished good text on images.
Great instruction following: Using DALL-E 3 feels very intuitive. Not a lot of prompt engineering is needed, and it can follow your instructions to the smallest detail. See the photo I generated and the prompt I used for it below.
Prompt: Create an illustration with a girl with long blond hair, smiling, wearing light blue bejeweled dress, and an tall, white, American football player, short dark hair, short mustache, wearing a red shirt jersey with "87" written in the back. And a bubble next to them: "I'll be 87 you'll be 89". Sky with stars above them
We’re bejeweled! 💎
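If you write detailed prompts like this one often, it can help to assemble them from parts. Here is a minimal Python sketch of that idea. The ScenePrompt class and its field names are hypothetical, made up for illustration, and nothing here is part of DALL-E 3 or Bing. It only builds the text you would paste into the prompt box.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScenePrompt:
    """Hypothetical helper: collects scene details and renders one prompt string."""

    style: str                                          # e.g. "an illustration"
    subjects: List[str] = field(default_factory=list)   # one description per character
    speech_bubble: Optional[str] = None                 # text to show in a bubble
    background: Optional[str] = None                    # e.g. "Sky with stars above them"

    def render(self) -> str:
        # The first sentence names the style and every subject.
        parts = [f"Create {self.style} with " + " and ".join(self.subjects)]
        # Optional details become their own sentences.
        if self.speech_bubble:
            parts.append(f'A bubble next to them: "{self.speech_bubble}"')
        if self.background:
            parts.append(self.background)
        return ". ".join(parts)


if __name__ == "__main__":
    prompt = ScenePrompt(
        style="an illustration",
        subjects=[
            "a girl with long blond hair, smiling",
            "a tall American football player wearing a red jersey",
        ],
        speech_bubble="I'll be 87 you'll be 89",
        background="Sky with stars above them",
    )
    print(prompt.render())
```

Keeping each detail in its own field makes it easy to swap one thing (say, the text in the bubble) and regenerate without retyping the whole prompt.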
What can you build?
DALL-E 3 is not currently available via API, but it is worth keeping an eye on these early use-cases, and what people use it for.
Here are some niche ideas:
Comic generator
Car assistant: Think “How can I change a flat tire?”
Furniture Assembly: Think IKEA manuals and tools to assemble furniture
Parking sign explanation: We all know how confusing parking signs can be
Recipe generator: Upload a fridge photo and your food preferences, and get a recipe
All these niche ideas can be wrapped into a nice AI app that someone will pay for.
You can prepare everything in advance and connect it when the API is ready.
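One way to "prepare in advance" is to write your app against a small interface and plug in a stand-in until a real image API exists. The Python sketch below shows that pattern for the comic generator idea. Everything in it is an assumption for illustration: ImageBackend, PlaceholderBackend, and ComicGenerator are hypothetical names, and no real DALL-E 3 or Bing endpoint is called, since none is available yet.

```python
from typing import Protocol


class ImageBackend(Protocol):
    """Hypothetical interface: anything that turns a prompt into image bytes."""

    def generate(self, prompt: str) -> bytes: ...


class PlaceholderBackend:
    """Stand-in backend used until a real API is available.

    It does not create an image; it just echoes the prompt as bytes
    so the rest of the app can be built and tested.
    """

    def generate(self, prompt: str) -> bytes:
        return f"[placeholder image for: {prompt}]".encode("utf-8")


class ComicGenerator:
    """App logic that only depends on the ImageBackend interface."""

    def __init__(self, backend: ImageBackend) -> None:
        self.backend = backend

    def build_prompt(self, topic: str, panels: int = 4) -> str:
        return (
            f"Generate a {panels}-panel comic about {topic}, "
            "with readable text in speech bubbles"
        )

    def create(self, topic: str, panels: int = 4) -> bytes:
        return self.backend.generate(self.build_prompt(topic, panels))


if __name__ == "__main__":
    app = ComicGenerator(PlaceholderBackend())
    print(app.create("Rocket Science").decode("utf-8"))
```

When an API does ship, the idea is that you would write one new class with the same generate method and pass it to ComicGenerator, leaving the rest of the app untouched.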
Frequently Asked Questions
What is Bing Image Creator?
The Bing Image Creator is an AI-powered app by Microsoft that lets you create images from text, using DALL-E 3, one of the best image models (built by OpenAI).
Is Bing Image Creator free?
Yes, you can use the Bing Image Creator for free for unlimited images or, when demand is high, for up to 100 image generations by using your “Boosts”. You only need to make a Microsoft account.
Can you change the Bing Image Creator aspect ratio?
Unfortunately no, you can only create images with a 1:1 aspect ratio.
What are the Bing Image Creator boosts?
You can use up to 100 Boosts to generate your images in a couple of seconds. When demand is high and you have no Boosts left, you may wait up to 3 hours for a photo to be generated. You can redeem more Boosts with Microsoft Rewards points or wait for your Boosts to be replenished on a weekly basis.