For questions related to the task of image generation, which can be done, for example, with variational auto-encoders (VAEs) or generative adversarial networks (GANs).
Questions tagged [image-generation]
91 questions
13
votes
1 answer
What are the fundamental differences between VAE and GAN for image generation?
Starting from my own understanding, and scoped to the purpose of image generation, I'm well aware of the major architectural differences:
A GAN's generator samples from a relatively low dimensional random variable and produces an image. Then the…
Alexander Soare
10
votes
2 answers
Using AI to extend an image pattern
I have created some nice patterns using the MidJourney tool. I'd like to find a way to extend these patterns, and I was thinking about an AI tool that takes one of these patterns and extends it in all directions surrounding the original…
Nicola Lepetit
8
votes
1 answer
How can an Artificial Intelligence system be ethically trained to generate art?
A lot of popular AI image-generation systems have been released recently, with systems such as Midjourney and DALL-E catching attention for how well put together many of the automatically generated images are.
However, there has been a lot of…
Mithical
7
votes
1 answer
How much training data is required for a GAN?
I'm beginning to study and implement GANs to generate more datasets. I'll try experimenting with the state-of-the-art GAN models described here https://paperswithcode.com/sota/image-generation-on-cifar-10.
The problem is I don't have a big…
iv67
6
votes
1 answer
How does AI 'see' the images it generates, and from what perspective?
I've been using AI image generation for a while now, and I've noticed how profoundly AI doesn't seem to see the image as a whole, sometimes generating an image with parts of fingers floating near objects that are supposed to be held, VERY warped…
ben svenssohn
6
votes
2 answers
What is the exact role of model $p_\theta$ in diffusion models for the reverse process?
I'm reading this interesting blog post explaining diffusion probabilistic models and trying to understand the following.
In order to compute the reverse process, we need to consider the posterior distribution $q(\textbf{x}_{t-1} | \textbf{x}_t)$…
James Arten
5
votes
1 answer
What kind of algorithm is used by StackGAN to generate realistic images from text?
What kind of algorithm is used by StackGAN to generate realistic images from text? How does StackGAN work?
Aneesh bhat
5
votes
1 answer
Does MMD-VAE solve the problem of blurred images of vanilla VAEs?
I understand that with vanilla VAEs, there are a few reasons that explain the production of blurred images. The InfoVAE paper describes the case when the decoder is flexible enough to ignore the latent attributes and generate an averaged-out image…
Ananda
5
votes
1 answer
Context-based gap-fill face posture-mapper GAN
These images are handmade, not auto-generated like they will be in production. Apologies for inaccuracies in the graph overlay.
I am trying to build an AI like the one shown in the diagram: when given a training set of images with their…
Geza Kerecsenyi
4
votes
1 answer
Why can't AI image generators output verbatim text when prompted to do so?
I want to create a splash screen that includes the name of my project. DALL-E 2 changed some of the letters in the name, even when I tried putting the name of my project in double-quotes (").
Other prompts to create images with short verbatim text,…
Silver Sagely
4
votes
1 answer
What is the state-of-the-art algorithm for neural style transfer?
I've read the paper A Neural Algorithm of Artistic Style by Gatys et al. and I find the application of neural style transfer very fun.
I also read that Exploring the structure of a real-time, arbitrary neural artistic stylization network by Ghiasi…
DeepNet
4
votes
0 answers
How would an AI visualize a story written in natural language?
Can AI transform natural language text describing real scenarios into visual images and videos? How would an AI interpret, say, a Harry Potter story if it had to reproduce it in the form of videos? It would be useful if anyone could help me with the required…
katipra
3
votes
0 answers
How random should an untrained generative AI output really be?
I am developing a particular implementation of a VAE and, as one usually does when implementing any architecture, I passed a random input to the model to test whether everything worked (e.g. check for size mismatches, device shenanigans, etc.).…
GPU'njoyer
3
votes
1 answer
How do stable diffusion models take the data into account?
I'm interested in how text-to-image models like Midjourney and DALL-E work, where you enter a text prompt and get some images as output. I started reading some papers on it and stumbled upon "Denoising Diffusion Probabilistic Models" -…
Rohit Pandey
3
votes
3 answers
Is the output of image generation models like Midjourney and Stable Diffusion deterministic?
Assuming the user can set all parameters, including but not limited to the seed.
Is the output deterministic? As in, the same set of inputs will create the same image?
Mindwin Remember Monica