Car Image Generation Using GANs
In this blog post I discuss how GANs (generative adversarial networks) can be used to create brand new images based on a given training set. For example, if your training set is composed of pictures of dogs, the generator should be able to create new dog images that can pass as pictures of actual dogs. This post covers an introduction to GANs, my experiment using this kind of network to generate car images, and other potential use cases for the technology.
What Are GANs?
The term GAN stands for generative adversarial network, but what does that mean? A GAN is actually composed of two separate networks that work against each other (hence adversarial) to generate new images from random noise. The two networks are called the generator and the discriminator. The generator takes an input of random noise, or in some cases a preselected image, and turns it into a new “fake” image. The generator’s goal is to fool the discriminator into believing that the generated image came from the training set of real images. The discriminator’s job is to tell whether an image is a real image from the training set or a “fake” image created by the generator. A helpful analogy is to think of the generator as a band of criminals trying to create counterfeit money and the discriminator as the police trying to spot it. The criminals, or the generator, can use the discriminator’s response to gauge how well their money passes as real and can develop strategies based on which changes cause the discriminator to classify more generated images as real.
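To make the two competing goals concrete, here is a minimal sketch in Python of the binary cross-entropy losses commonly used to describe GAN training. The function names and the example scores are my own for illustration; this is a sketch of the general idea, not necessarily the loss used by any particular GAN model.

```python
import math

EPS = 1e-12  # small constant to avoid log(0)

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator.

    d_real: discriminator outputs (probability of "real") on training images.
    d_fake: discriminator outputs on generated images.
    The loss is low when real images score near 1 and fakes score near 0.
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Generator loss: low when the discriminator scores fakes as real."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores for a batch of generated images.
fooled = [0.9, 0.8, 0.95]   # discriminator believes the fakes are real
caught = [0.1, 0.2, 0.05]   # discriminator spots the fakes

print(generator_loss(fooled) < generator_loss(caught))  # True
```

In terms of the analogy, generator_loss is the counterfeiters’ feedback: it shrinks as more of their bills are accepted, while discriminator_loss shrinks as the police get better at telling the two piles apart.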
While GANs have proven quite effective for image generation, their structure makes them challenging to train. Care must be taken to balance the training of the generator and the discriminator relative to each other, or the generator may never produce anything more than static as its “fake” images. If the discriminator is trained incorrectly and no longer has a good idea of what real and fake images look like, then the generator has no way of knowing how well its image generation is working.
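The balancing act can be illustrated with a toy example. The following is a minimal sketch, assuming a one-dimensional “image” (a single number), a linear generator, and a logistic discriminator with hand-derived gradients. Everything in it (train_toy_gan, d_steps_per_g_step, the learning rate, the target distribution) is hypothetical and chosen for illustration; it is not how the car images in this post were trained.

```python
import math
import random

def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, d_steps_per_g_step=1, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator:     G(z) = a*z + b, with z ~ N(0, 1)
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        # Discriminator updates: push D(real) toward 1 and D(fake) toward 0.
        for _ in range(d_steps_per_g_step):
            x_real = rng.gauss(4.0, 0.5)          # assumed "real" data
            x_fake = a * rng.gauss(0.0, 1.0) + b
            d_real = sigmoid(w * x_real + c)
            d_fake = sigmoid(w * x_fake + c)
            # Gradient of -log D(x_real) - log(1 - D(x_fake)) w.r.t. w and c.
            grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
            grad_c = (d_real - 1.0) + d_fake
            w -= lr * grad_w
            c -= lr * grad_c
        # Generator update: push D(fake) toward 1.
        z = rng.gauss(0.0, 1.0)
        d_fake = sigmoid(w * (a * z + b) + c)
        # Gradient of -log D(G(z)) w.r.t. a and b; note the factor of w.
        grad_a = (d_fake - 1.0) * w * z
        grad_b = (d_fake - 1.0) * w
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b, w, c

a, b, w, c = train_toy_gan()
print(f"generator:     G(z) = {a:.2f}*z + {b:.2f}")
print(f"discriminator: D(x) = sigmoid({w:.2f}*x + {c:.2f})")
```

Notice that the generator’s gradients are multiplied by the discriminator’s weight w. If the discriminator is never trained (for example, d_steps_per_g_step=0 leaves w at zero), the generator receives no update at all, which is the toy version of the generator having no way of knowing how well it is doing.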
For my first attempt at using GANs, I decided to generate images of cars based on several datasets I found online, including one from Stanford University, along with images I scraped from search engines. I used the Progressive Growing of GANs model provided by NVIDIA for all of my attempts. My experiments were performed on a Dell C4140 server using 4 NVIDIA V100 GPUs.
Below is the image generated from my best and most recent attempt. While it probably could not fool a person, its contents can at least be identified as cars. I used 16,485 images in my training set for this attempt, and training ran for 2 days, 20 hours, and 55 minutes. To see all my attempts and for more details on the training process, visit the post I wrote on my personal website or feel free to contact me directly.
Other Potential Use Cases
While image generation is the most well-known use case for GANs, researchers have found many other uses for them. Instead of starting with random noise, one can also feed an image into the generator for different effects. Below you can see a GAN being used to change an image to give it a certain artist’s style, a technique also known as style transfer. You can read more about style transfer GANs in this article.
Other uses for GANs include increasing image resolution, creating images from text descriptions, and generating 3D objects from 2D images. For descriptions of these and many more GAN applications, check out this article from Machine Learning Mastery.