From the course: What Is Generative AI?
Text to image applications
In 2022, we have seen a rise in commercial image generation services. The technology behind these services is broadly referred to as text-to-image. You simply type words on a screen and watch the algorithms create an image based on your cue, even if your description is not very specific. There are three main text-to-image generation services: Midjourney, DALL-E, and Stable Diffusion. If we were to compare these three text-to-image tools to operating systems, Midjourney would be macOS, because it has a closed API and a very design- and art-centered approach to the image generation process. DALL-E would be Windows, but with an open API, because the model is released by a corporation and it initially had the most advanced machine learning algorithm; OpenAI values technical superiority over design and art sensibilities. And the third, Stable Diffusion, would be Linux, because it is open source and is improving each day with the contributions of the generative AI community. The quality of the images generated by text-to-image models depends both on the quality of the algorithm and on the data set used to train it. So now that we know the main services, let's look at three industrial applications. First is Cuebric, Hollywood's first generative AI tool, created by our company, Seyhan Lee, for streamlining the production of film backgrounds. A normal virtual production workflow uses three-dimensional world building, which involves a group of people building 3D worlds custom-made for that film. It's time-consuming, expensive, and requires a lot of repetitive tasks. An alternative now is to augment 2D backgrounds into 2.5D by involving generative AI in the picture creation process. The second example is Stitch Fix. When they suggest garments to help customers discover their fashion style, they use real clothes along with clothes generated with DALL-E. And finally, marketers and filmmakers use text-to-image models when ideating for a concept or a film.
And actually, they may later on continue to use it to make storyboards, and even use it in the production of the final art of their campaigns and films, just like we have seen with Cuebric. A recent example from the marketing world would be Martini, which used a Midjourney-generated image in its campaign. Others would be Heinz and Nestlé, which used DALL-E in their campaigns, and GoFundMe, which used Stable Diffusion in its artfully illustrated film. Marketers prefer using generative AI in their creative process for two reasons: first, for its time- and cost-saving efficiency, and second, for the unique look and feel that you get from text-to-image-based tools.