Questions tagged [image-generation]

For questions related to the task of image generation, which can be done, for example, with variational auto-encoders (VAEs) or generative adversarial networks (GANs).

91 questions
13 votes
1 answer

What are the fundamental differences between VAE and GAN for image generation?

Starting from my own understanding, and scoped to the purpose of image generation, I'm well aware of the major architectural differences: A GAN's generator samples from a relatively low dimensional random variable and produces an image. Then the…
10 votes
2 answers

Using AI to extend an image pattern

I have created some nice patterns using the MidJourney tool. I'd like to find a way to extend these patterns, and I was thinking about an AI tool that takes one of these patterns and extends it in all directions surrounding the original…
8 votes
1 answer

How can an Artificial Intelligence system be ethically trained to generate art?

There have been a lot of popular AI image-generating systems put out recently, with such systems as Midjourney and Dall-E catching attention with how well put-together many of the automatically generated images are. However, there has been a lot of…
Mithical
7 votes
1 answer

How much training data is required for a GAN?

I'm beginning to study and implement GANs to generate more datasets. I'll just try to experiment with state-of-the-art GAN models as described here https://paperswithcode.com/sota/image-generation-on-cifar-10. The problem is I don't have a big…
6 votes
1 answer

How does AI 'see' the images it generates, and from what perspective?

I've been using AI image generation for a while now, and I've noticed that AI doesn't seem to see the image as a whole, sometimes generating an image with parts of fingers floating near objects that are supposed to be held, VERY warped…
ben svenssohn
6 votes
2 answers

What is the exact role of model $p_\theta$ in diffusion models for the reverse process?

I'm reading this interesting blog post explaining diffusion probabilistic models and trying to understand the following. In order to compute the reverse process, we need to consider the posterior distribution $q(\textbf{x}_{t-1} | \textbf{x}_t)$…
James Arten
5 votes
1 answer

What kind of algorithm is used by StackGAN to generate realistic images from text?

What kind of algorithm is used by StackGAN to generate realistic images from text? How does StackGAN work?
5 votes
1 answer

Does MMD-VAE solve the problem of blurred images of vanilla VAEs?

I understand that with vanilla VAEs, there are a few reasons justifying the production of blurred out images. The InfoVAE paper describes the case when the decoder is flexible enough to ignore the latent attributes and generate an averaged out image…
5 votes
1 answer

Context-based gap-fill face posture-mapper GAN

These images are handmade, not auto-generated like they will be in production. Apologies for inaccuracies in the graph overlay. I am trying to build an AI like that displayed in the diagram: when given a training set of images with their…
4 votes
1 answer

Why can't AI image generators output verbatim text when prompted to do so?

I want to create a splash screen that includes the name of my project. DALL-E 2 changed some of the letters in the name, even when I tried putting the name of my project in double-quotes ("). Other prompts to create images with short verbatim text,…
Silver Sagely
4 votes
1 answer

What is the state-of-the-art algorithm for neural style transfer?

I've read the paper A Neural Algorithm of Artistic Style by Gatys et al. and I find the application of neural style transfer very fun. I also read that Exploring the structure of a real-time, arbitrary neural artistic stylization network by Ghiasi…
4 votes
0 answers

How would an AI visualize a story written in natural language?

Can AI transform natural language text describing real scenarios into visual images and videos? How does an AI interpret, say, a Harry Potter story if it has to reproduce it in the form of videos? It would be useful if anyone can help me with the required…
katipra
3 votes
0 answers

How random should an untrained generative AI output really be?

I am developing a particular implementation of VAE, and, as one usually does while implementing any architecture, I passed a random input to the model to test if everything worked fine (e.g. check for size mismatches, device shenanigans, etc.).…
3 votes
1 answer

How do stable diffusion models take the data into account?

I'm interested in how text-to-image models like Midjourney and Dall-E work, where you enter a text prompt and get some images as output. I started reading some papers on it and stumbled upon "Denoising Diffusion Probabilistic Models" -…
3 votes
3 answers

Is the output of image generation models like Midjourney and Stable Diffusion deterministic?

Assuming the user can set all parameters, including but not limited to the seed, is the output deterministic? That is, will the same set of inputs create the same image?