Questions tagged [image-processing]

For questions related to image processing (in the context of AI).

For more info, see e.g. https://en.wikipedia.org/wiki/Digital_image_processing.

177 questions
12
votes
2 answers

Is there any existing attempt to create a deep learning model which extracts vector paths from bitmaps?

I need an algorithm to trace simple bitmaps, which only contain paths with a given stroke width. Is there any existing attempt to create a deep learning model which extracts vector paths from bitmaps? It is obviously very easy to generate bitmaps…
8
votes
3 answers

Is it okay to use publicly available Instagram videos to train an AI?

Since I haven't found any good training data for my university project, I want to use pictures and videos from public Instagram profiles. Am I allowed to do that?
8
votes
2 answers

What are the main algorithms used in computer vision?

Nowadays, CV has really achieved great performance in many different areas. However, it is not clear what a CV algorithm is. What are some examples of CV algorithms that are commonly used nowadays and have achieved state-of-the-art performance?
7
votes
3 answers

Does each filter in each convolution layer create a new image?

Say I have a CNN with this structure: input = 1 image (say, 30x30 RGB pixels); first convolution layer = 10 5x5 convolution filters; second convolution layer = 5 3x3 convolution filters; one dense layer with 1 output. So a graph of the network will…
5
votes
1 answer

In OCR, how should I deal with the warped text on the sides of oval objects?

Consider an image that contains one can (or bottle, or any similar oval object), which has text all over it. In the image below, I have many bottles, but you can assume that each image only contains one such object. As we can see, in each can, the…
5
votes
1 answer

Autoencoder produces repeated artifacts after convergence

As an experiment, I have tried using an autoencoder to encode height data from the Alps; however, the decoded image is very pixelated after training for several hours, as shown in the image below. This repeating pattern is larger than the final kernel…
4
votes
5 answers

Can an AI-generated image (such as a picture of a human face) be detected as AI-generated?

AIs are getting better and better at creating images and art. Some of the results are almost impossible to detect with the naked eye. But what about programs and algorithms? Instead of creating an image, can anything detect that this image was…
4
votes
1 answer

What does the stride information of an image refer to here?

In convolutional neural networks, the convolution and pooling operations have a parameter known as stride, which decides how far the kernel jumps across the input image at each step. You can get more information regarding stride from the following, taken…
4
votes
1 answer

What is the state-of-the-art algorithm for neural style transfer?

I've read the paper A Neural Algorithm of Artistic Style by Gatys et al. and I find the application of neural style transfer very fun. I also read that Exploring the structure of a real-time, arbitrary neural artistic stylization network by Ghiasi…
4
votes
1 answer

How to evaluate the performance of an autoencoder trained on image data?

I am training an autoencoder on (general) image data. I use a binary cross-entropy loss function, but it is not very informative when I want to evaluate the performance of my autoencoder. An obvious performance metric would be pixel-wise MSE, but it…
4
votes
1 answer

How to calculate the size of a 3d object from an image?

I am wondering how to calculate the size of a 3d object in an image without knowing the focal length of the camera, but knowing the distance from the camera to the object.
4
votes
1 answer

Video engagement analysis with deep learning

I am trying to rank video scenes/frames based on how appealing they are for a viewer. Basically, how "interesting" or "attractive" a scene inside a video can be for a viewer. My final goal is to generate say a 10-second short summary given a video…
4
votes
1 answer

Turn photos right-side up?

I'm looking for either an existing AI app or a pre-trained NN that will tell me if a photograph is right-side up or not. I want to use this to create an application that automatically rotates photos so they are right-side-up. This doesn't seem…
4
votes
1 answer

Why do we get a three-dimensional output after a convolutional layer?

In a convolutional neural network, when we apply the convolution on a $5 \times 5$ image with $3 \times 3$ kernel, with stride $1$, we should get only one $4 \times 4$ as output. In most of the CNN tutorials, we are having $4 \times 4 \times m$ as…
4
votes
1 answer

Aesthetics analysis with deep learning

I'm trying to score video scenes in terms of aesthetics and cinematography features. Basically, how "interesting" a scene or video frame can be for a viewer. More simply, how attractive a scene is. My final goal is to tag intervals of video which can be…