Screenshot with image generated in DALL-E 3.
Thanks to OpenAI's partnership with Microsoft, DALL-E 3 is available to use for free in several different places.
DALL-E is the granddaddy of other AI image generators. The original DALL-E was introduced by OpenAI in January 2021, which seems like an eternity ago in light of the generative AI development since then. It was a "multimodal" variant of GPT-3, trained to reply to text with images instead of replying to text with text. The name evokes both the artist Salvador Dalí and Pixar's animated robot WALL-E.
OpenAI made an ongoing deal with stock photography company Shutterstock, allowing DALL-E versions to be trained on millions of high-resolution images and accompanying metadata. This gives it higher-quality source data than models trained on open-source photography alone or on lower-quality images scraped from the Internet.
DALL-E 2 followed in 2022, with much better image quality and prompt adherence, and gained far more users. When DALL-E 3 was released, it was integrated with ChatGPT, making it more powerful and more widely accessible.
DALL-E 3 is great at prompt adherence. When I give it a prompt describing a spider made of black yarn, with red eyes, sitting on a green web, DALL-E 3 will consistently follow all of it: making the spider out of black yarn, making the eyes red, and making the web green. This stands in sharp contrast to many other models. Working in SDXL, for example, if you mention a color in a prompt, it's common that several things will receive that color, not just the object you specified. DALL-E was so good at this kind of prompt that I had fun asking it to create more elaborate, life-sized yarn sculptures, shown in the album above.
For some far-out concepts, DALL-E 3 failed to follow the prompt when creating realistic images, but improved when I asked it for an illustration or cartoon. One example was a prompt asking for a fish going fishing.
This produced a lot of images with a human fisherman holding the rod. Some of them were nice looking images, but they didn't show a fish going fishing. Everything changed when I specified "a cartoon fish."
Even when asking for cartoon-style images, there are some things that DALL-E 3 still can't imagine. For example, no matter how I described "a horse riding a cowboy," it could only draw the cowboy on top, riding the horse.
DALL-E 3 scans prompts for potentially objectionable material. I was careful not to type any prompt that would break OpenAI's Content Policy, but I did try some old song lyrics including the word "bikini" as a prompt. DALL-E 3 responded with a "Content warning" window instead of images.
If a prompt passes the initial scan, images are also reviewed after they are generated. Sometimes, instead of getting four images back from a prompt, three or fewer are displayed. I'll never know the exact reason any image may have been blocked.
No content filtering system can be perfect, and from my experience it seems that OpenAI has tuned theirs to err on the side of being overly cautious. However, if a prompt contained nothing objectionable, sometimes just resubmitting it with a few words changed, or trying the original prompt again on another day, could turn a rejected prompt into an accepted one.
DALL-E 3 is fairly good at representing diversity. When I asked for "a nurse" in a series of images, it automatically gave me both male and female nurses and showed nurses of different races. All the nurses did look young, fit, and attractive, more like professional models for stock photos than like a random selection of real nurses, but if you wanted more realism, you could always specify in your prompt that you wanted older, heavier-set, or tired-looking nurses.
DALL-E 3 seems to achieve diversity by appending words to prompts. When I generated a series of images of children holding up signs, diversity words such as "male," "female," "African American," "Hispanic," or "ethnically ambiguous" appeared on the signs held by some of the children, describing their own appearance. (DALL-E 3 can generate text within images, but often mangles or misspells words. In keeping with that, the diversity words were often misspelled when they appeared on signs.) I decided the images of kids holding signs about their own races were too creepy to post in this review.
When I ask ChatGPT to "draw" things for me, the large language model sometimes embellishes the prompts by adding additional words of description, before launching DALL-E 3. After each set of images is generated, it also shows suggestions for changes and additions that I could make. This almost turns the process of creating an image into a conversation, where changes are requested to each version until the final prompt and images are produced.
All of the freely generated images are square, with a resolution of 1024 x 1024 pixels (more about aspect ratio choices below). When initially displayed on the screen, some of them appear to have a logo or watermark in the lower-left corner, but when I used the "Download" button, the images I downloaded were clean, with no logo.
For images I generated through Copilot, a series of buttons offering optional styles appeared under the individual images. Clicking on one such as "woodblock" or "watercolor" regenerated the image in that style. Even though something similar would have been possible by specifying the style in the initial prompt, this kind of editing is a nice convenience, and allowed me to redo just one image instead of all four.
After an image was generated, the editing functions also let me touch or click on parts of the image, and a few more editing options appeared that applied just to selected regions of the image, such as blurring or desaturating the background behind the subject I'd selected.
The previous version, DALL-E 2, offered a different set of options for editing images after they were generated. The most powerful of these were inpainting (regenerating part of the image, within a masked area that you could paint) and outpainting (extending the image and canvas in any direction). Those became some of its best features, and I hope that in the future DALL-E 3 will support full prompt-based inpainting or outpainting. (For now, the images DALL-E 3 generates can be downloaded for inpainting and outpainting in other programs, such as Adobe Photoshop. Although Adobe Firefly is a different model, it can usually do a good job of extending and matching the look of an image generated in DALL-E 3.)
DALL-E 3 also has a number of "hidden" options which are not shown in Designer or Copilot, but which are documented as being available in the API (the interface that programmers use to interact with DALL-E 3). The extra functions include choosing alternate image sizes (1792 x 1024 for wide images or 1024 x 1792 for tall ones), an "hd" quality setting for finer detail, and a "natural" style option that looks less hyper-real than the default "vivid" style.
I went looking for a place where I could access these options, and I found one. After trying the advanced DALL-E 3 functions through a service called NightCafe, and using other image generation options on their site, I have given NightCafe a separate review.
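For readers who want to try those hidden options directly, here is a minimal sketch of a DALL-E 3 request through OpenAI's official Python SDK, with the extra size, quality, and style parameters filled in. The spider prompt is just a placeholder, and the script assumes you have the `openai` package installed and an `OPENAI_API_KEY` environment variable set; if no key is found, it only prints the request it would have sent.

```python
import os

# The "hidden" DALL-E 3 parameters documented in OpenAI's Images API:
# size, quality, and style are not exposed in Designer or Copilot.
request = {
    "model": "dall-e-3",
    "prompt": "A life-sized spider sculpted from black yarn, with red eyes",
    "size": "1792x1024",   # wide; also 1024x1792 (tall) or 1024x1024 (square)
    "quality": "hd",       # finer detail than the default "standard"
    "style": "natural",    # less hyper-real than the default "vivid"
    "n": 1,                # DALL-E 3 generates one image per request
}

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the environment
    result = client.images.generate(**request)
    print(result.data[0].url)  # URL of the generated image
else:
    # No API key set: just show the request that would be sent.
    print(request)
```

Note that unlike the free web front ends, API access is billed per image, and the "hd" quality and larger sizes cost more per generation than the defaults.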
Copyright © 2024 by Jeremy Birn