What are Generative Adversarial Networks (GANs)?
Generative Adversarial Networks (often abbreviated GANs) are a class of deep-learning models used within the field of Artificial Intelligence (AI). They are designed to produce new data that shares the characteristics of the data they were trained on. A GAN consists of two neural networks: the generator, which creates new data, and the discriminator, which judges whether a given sample is real or generated. The two networks compete against each other, and this competition pushes the generator toward increasingly realistic outputs such as images, video, or audio.
Introduced by Ian Goodfellow and his collaborators in 2014, GANs have quickly become a transformative technology in AI, widely used for generating synthetic data, improving data augmentation, and even creating content from scratch.
How do GANs work?
GANs operate through a process of adversarial training, where the generator and the discriminator act against each other:
- Generator: The generator is a neural network that produces new data samples, typically from random noise as input. Its main purpose is to create realistic samples that are similar to the actual data it was trained on.
- Discriminator: The discriminator is a neural network that determines whether the data being evaluated is real (from the training dataset) or fake (created by the generator). It assigns each sample a probability that indicates how likely the sample is to be real.
These two networks are trained together. As the discriminator becomes better at identifying fake samples, the generator has to produce samples that resemble the real data more closely in order to fool it. Over the course of training the generated samples become more realistic, and ideally the discriminator can no longer tell the generator's output apart from real samples.
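The alternating updates described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a practical GAN: the data is one-dimensional, the generator is a linear map `a*z + b`, the discriminator is a single logistic unit, and the gradients are written out by hand. All names (`train_step`, `train`, `real_mean`, the learning rate) are hypothetical choices made for this example. The generator uses the commonly used non-saturating loss, `-log D(G(z))`.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def generate(gen, z):
    # Toy generator: maps a noise value z to a fake sample a*z + b.
    a, b = gen
    return a * z + b


def discriminate(disc, x):
    # Toy discriminator: probability that x came from the real data.
    w, c = disc
    return sigmoid(w * x + c)


def train_step(gen, disc, real_batch, noise_batch, lr=0.05):
    a, b = gen
    w, c = disc
    n = len(real_batch)

    # 1) Discriminator update: minimise -[log D(x) + log(1 - D(G(z)))].
    grad_w = grad_c = 0.0
    for x_real, z in zip(real_batch, noise_batch):
        x_fake = generate((a, b), z)
        p_real = discriminate((w, c), x_real)
        p_fake = discriminate((w, c), x_fake)
        grad_w += -(1.0 - p_real) * x_real + p_fake * x_fake
        grad_c += -(1.0 - p_real) + p_fake
    w -= lr * grad_w / n
    c -= lr * grad_c / n

    # 2) Generator update: minimise -log D(G(z)), using the updated discriminator.
    grad_a = grad_b = 0.0
    for z in noise_batch:
        p_fake = discriminate((w, c), generate((a, b), z))
        d_loss_d_x = -(1.0 - p_fake) * w  # gradient of the loss w.r.t. the fake sample
        grad_a += d_loss_d_x * z
        grad_b += d_loss_d_x
    a -= lr * grad_a / n
    b -= lr * grad_b / n
    return (a, b), (w, c)


def train(steps=200, batch_size=32, real_mean=4.0, seed=0):
    # Assumed setup: real data ~ N(real_mean, 1), noise ~ N(0, 1).
    rng = random.Random(seed)
    gen, disc = (1.0, 0.0), (0.1, 0.0)
    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch_size)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        gen, disc = train_step(gen, disc, real, noise)
    return gen, disc


if __name__ == "__main__":
    (a, b), (w, c) = train()
    print("generator a, b:", a, b)
    print("discriminator w, c:", w, c)
```

In this setup the generator's offset `b` is initially pulled toward the real mean, because the discriminator learns that larger values look more real. With a discriminator this simple the two players can also oscillate rather than settle, which is a small-scale version of the training instability GANs are known for.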
What are the major types of GANs?
Each type of GAN is designed to address a particular problem. Some of the most common ones include:
- Deep Convolutional GAN (DCGAN): This variant uses convolutional neural networks (CNNs) in the generator and the discriminator to improve the quality of generated images. DCGANs are typically used for image generation.
- Conditional GAN (CGAN): In this type, both the generator and the discriminator are conditioned on additional information such as class labels. This allows for controlled data generation, in which the model generates data of a particular type, for instance, images of a specific object class.
- Wasserstein GAN (WGAN): This type addresses the training instability often encountered with traditional GANs. WGANs use a different loss function based on the Wasserstein distance, which tends to produce a more stable training process and often higher-quality results.
- CycleGAN: This type of GAN enables image-to-image translation when paired data is not available. For instance, a CycleGAN can translate an image from one domain into another (e.g., turning a photo of a horse into a photo of a zebra) without needing a dataset of matched horse and zebra image pairs.
- Progressive GAN: Designed to improve the generation of high-resolution outputs, a progressive GAN starts training at a low image resolution and progressively increases the resolution as training continues. This allows it to produce detailed, high-quality images.
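Two of the variants above differ from a basic GAN in ways that are small enough to show directly. The sketch below is a hedged illustration in plain Python, with hypothetical helper names: `conditional_input` shows one common way a CGAN generator can be conditioned (appending a one-hot class label to the noise vector), and `wasserstein_critic_loss` contrasts the WGAN critic objective with the standard cross-entropy discriminator loss. It omits the networks themselves and the constraint a WGAN critic additionally needs (for example, weight clipping).

```python
import math


def one_hot(label, num_classes):
    # Encode an integer class label as a one-hot list.
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def conditional_input(noise, label, num_classes):
    # CGAN-style conditioning (one common choice): the generator receives
    # the noise vector with the one-hot label appended to it.
    return list(noise) + one_hot(label, num_classes)


def standard_discriminator_loss(p_real, p_fake):
    # Standard GAN discriminator loss on probabilities in (0, 1):
    # -[mean(log D(x)) + mean(log(1 - D(G(z))))].
    eps = 1e-12  # guards against log(0)
    real_term = sum(math.log(p + eps) for p in p_real) / len(p_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in p_fake) / len(p_fake)
    return -(real_term + fake_term)


def wasserstein_critic_loss(score_real, score_fake):
    # WGAN critic loss on unbounded scores (no sigmoid, no log):
    # mean(critic(fake)) - mean(critic(real)), minimised by the critic.
    # Assumption: the critic is separately kept Lipschitz-constrained.
    mean_fake = sum(score_fake) / len(score_fake)
    mean_real = sum(score_real) / len(score_real)
    return mean_fake - mean_real


if __name__ == "__main__":
    # Noise of length 2, class 2 out of 4 classes.
    print(conditional_input([0.3, -1.2], 2, 4))  # [0.3, -1.2, 0.0, 0.0, 1.0, 0.0]
    print(standard_discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(wasserstein_critic_loss([2.0, 4.0], [1.0, 1.0]))  # -2.0
```

The practical difference is that the Wasserstein critic outputs a score rather than a probability, so its loss does not flatten out when the critic becomes confident, which is one reason it is associated with more stable training.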