Getting Started with Midjourney

For this review, I tried using Midjourney in two ways: via the Discord app, and directly on Midjourney's website. The option to use Midjourney through its new web-based interface is still in an early testing phase, and they currently have a rule that only users who have created at least 100 images are allowed to test it. This means that (for now) new users need to start with Discord.

Before you start to use Midjourney, first install Discord, the chatting app, and create an account for yourself. After you have Discord installed, go to Midjourney.com to sign up for Midjourney. Plans start at $10 per month, when billed monthly. Midjourney will give you an invitation to the Midjourney Discord server.

Using Midjourney through Discord

When you first start using Midjourney through Discord, you can join one of the "newbie" rooms on Midjourney's Discord server. These can be a chaotic place, because many users are all typing prompts at once, then scrolling down past other users' posts to find the images that the Midjourney Bot posts in reply.

You can also follow instructions on Midjourney's website to create your own Discord server, and then add the Midjourney Bot to your server, giving you a place to have one-on-one interaction with the Midjourney Bot. (Your images are still not private, however. They still appear on Midjourney's website, unless you pay a higher rate for an optional "stealth" mode.)

When posting on Discord, every prompt must begin with "/imagine." After that, you can drag images into the prompt (a very powerful feature) and/or type a written prompt describing what you want to see. If you have a negative prompt (specifying what you don't want to see), add the option "--no" to your command, and put the negative prompt afterwards. Other parameters, such as how stylized, how weird, or how chaotic, can optionally be added with a syntax such as "--stylization 1000."

The Midjourney Bot will reply to your prompt with four images. If you pick one to upscale, you'll see a larger version of the image, and more options to take it higher resolution, modify the image, inpaint (they call it Vary Region, it means regenerating one selected region of the image), or outpaint (zoom-out or extend the image.)

Using Midjourney through the new web interface

The version of the new interface that I tested for this review was labelled as an "alpha," but already worked pretty well. It had sliders and drop-downs for commonly adjusted parameters and displayed a visual history of your recent creations, making it easy to go back to earlier steps.

As you'd expect, the new interface was still a work in progress. There was no box for a negative prompt. I wasn't clear how to drag an image into the prompt for a style reference. Also, the most important parts of the interface appeared scrunched into the top inch of the page, as if they were worried they'd run out of room on the monitor.

The new interface already looks like a much more inviting environment for new users than the newbie rooms on Discord. I look forward to seeing it completed and offered to all of the users.

Midjourney Image Quality

After creating images and upscaling them, the final files downloaded from Midjourney will be .png files (for better quality than .jpg) with pixel resolutions of about 2464 x 1856 (this varies depending on the aspect ratio.)

Midjourney's useful Vary Region function (also known as inpainting) is only available at low resolutions, before you upscale. This means there are likely to be some areas where faces lack detail, or small regions still need retouching and regeneration afterwards.

How Midjourney Compares to DALL-E

I started using Midjourney (version 6) by testing it with some of the same prompts that I also used in DALL-E 3. In terms of pure prompt adherence, DALL-E 3 is ahead of Midjourney 6. For example, I used this prompt on both models:

Architectural beach house with natural material architecture, with a roof terrace to enjoy the late-afternoon sun and waterfront view. Near the beach, a bike path with bicyclists, and idyllic gardens.

DALL-E 3 gave me all of the prompt, including bicyclists riding on the bike path, a beach, and colorful flowers to make an idyllic garden. Midjourney gave me a nice beach house, but most of the images it produced had no bike path or bicyclists. Midjourney also didn't follow the cue for "late-afternoon" and gave me images from different times of day.

The comparison doesn't end with prompt adherence, though. Midjourney has a number of advantages over DALL-E 3 that helped it pull through with really strong images in the end.

The initial picture of a beach house I got from Midjourney didn't include much of the environment around it, so I clicked a down-arrow to ask Midjourney to outpaint the image, expanding the canvas downwards. It gave me a taller image with sand, water, and rocks down below the beach house.

I liked the new composition, but still wanted an idyllic garden below the beach house, so I used "Vary Region," selected the lower part of the image for replacement, and edited the prompt to stress the colorful flowers in the garden. Midjourney gave me four images to choose from, and I selected one with flowers as well as a nice sandy path down the beach.

Once the final image was resized with creative upscale, I had a beautiful high-resolution picture of a beach house, that took advantage of several advantages that Midjourney has over DALL-E 3:

More choices of aspect ratios
Inpainting (Vary Region)
Outpainting (Zoom Out)
Higher Resolution

I could add to the list that Midjourney also offers more parameters to adjust how images are made. Having chaos, weirdness, stylization, and so on can make a big difference to the image. (At the same time, I have to note that the prompt makes a bigger difference to the overall look. I've gone from Midjourney giving me output that looks like a cartoon to getting something almost photorealistic just by adding the words "Still from a movie" to the end of the prompt, for example.

Like DALL-E, Midjourney also has content filtering which will limit the creation of some kinds of images. They differ in that Midjourney's content filters are less strict, allowing some occasional nudity through for example, while DALL-E's filters are hyperactive, and sometimes go so far that they wrongly block a prompt that wouldn't have broken any rules.

Midjourney's support for reference images, which can be used as image prompts, for blending and combining images, for character appearance, or for artistic style, also gives it a huge advantage over DALL-E 3, which only bases images on text prompts.

How Does Midjourney compare to Adobe Firefly

If you already use Photoshop with Adobe Firefly, Midjourney could still be a useful extension of your toolset. Firefly can generate images from scratch, but Midjourney is better at it. If you don't mind the extra bill, it's certainly nicer to have both of them.

An ideal combination is to use Midjourney to generate images and upscale them, and then finish your project in Photoshop with Adobe Firefly. You could do both Subtle Upscale and Creative Upscale, download both, and drag them both into Photoshop. This way, you can compare which parts of the image get better with the top layer visible, then paint a layer mask to get the best of both versions. After that, you can work on selecting areas to recreate with Generative Fill, if there are any spots that you think could be enhanced or improved.

How Does Midjourney compare to Stable Diffusion?

If you already have a Stable Diffusion interface running locally on your own computer, then it becomes more difficult, but not impossible, to justify using Midjourney. An interface like Automatic1111 or ComfyUI offers more controls and options and ways to create and edit images than Midjourney does. Your home system may have required the upfront cost of buying a graphics card suitable for generative AI, but once you own it, you aren't charged for creating images and animations using open-source software.

But even in this comparison, Midjourney version 6 does seem to be a terrific model compared to any SDXL models (note that I'm writing this review before the release of Stable Diffusion 3 models.) The tools and functionality of Midjourney offer a different set of creative controls that can provide some unique output. Also, the GPUs that Midjourney is running on are likely to be much better than your GPU. You can even launch multiple tasks at the same time, making Midjourney use multiple GPUs at the same time to create high-resolution images for you.

If you're using both together, you could generate and upscale images in Midjourney, then do inpainting and possibly more upscaling in your favorite Stable Diffusion interface. You could also generate still images in Midjourney, then add motion to them with Stable Video Diffusion.

Midjourney is powerful and fun to use, without being very technically demanding of the user. If this were your main generative AI tool, it could be well worth the price. Midjourney is an advanced system that offers a lot of creative possibilities to artists and is certainly at least worth exploring, even if you only subscribe for one month just to see how it goes.