Questions tagged [datasets]

For questions related to sets of data and their use in AI.

Because the term is so common, "data sets" is concatenated to "datasets" in much of the AI literature. Datasets have many uses in AI, including but not limited to the following.

  • Input to convergence systems involving general purpose artificial networks, domain specific models, or both
  • Diagnostic information to determine system health with respect to its role in normal uses
  • Audit information for the determination of performance
  • Input to learning systems that rely on productions, cognitions, fuzzy logic rules and certainty values
268 questions
34
votes
5 answers

How can I deal with images of variable dimensions when doing image segmentation?

I'm facing the problem of having images of different dimensions as inputs in a segmentation task. Note that the images do not even have the same aspect ratio. One common approach that I found in general in deep learning is to crop the images, as it…
10
votes
4 answers

How do I select the relevant features of the data?

Recently, I was working on a cost analysis of my expenditure on a particular resource. I usually make some manual decisions from the analysis and plan accordingly. I have a big data set in Excel format and with hundreds of…
10
votes
2 answers

How can I encode angle data to train neural networks?

I am training a neural network where the target data is a vector of angles in radians (between $0$ and $2\pi$). I am looking for study material on how to encode this data. Can you supply me with a book or research paper that covers this topic…
8
votes
3 answers

Is it okay to use publicly available Instagram videos to train an AI?

Since I haven't found any good training data for my university project, I want to use pictures and videos from public Instagram profiles. Am I allowed to do that?
8
votes
2 answers

What are examples of techniques to prevent bias in artificial intelligence systems?

I recently read an article about how artificial intelligence replicates human stereotypes when applied to biased datasets. What are examples of techniques to prevent bias (and stereotypes) in artificial intelligence (in particular, machine learning)…
user6698
7
votes
1 answer

For each epoch, can I use only a subset of the full training dataset to train the neural network?

If one has a dataset large enough to learn a highly complex function, say learning chess game-play, and the processing time to run mini-batch gradient descent on this entire dataset is too high, can I instead do the following? Run the algorithm on…
pranav
7
votes
2 answers

Is there an argument against using the (reviewed) predictions of a model as ground truth to further train exactly this model?

I plan to use my predictions as ground truth to continue training my model. These predictions are of course reviewed during this process. Is there an argument against that (reinforcement of slight mistakes/overfitting etc.)? Here my specific use…
7
votes
1 answer

How much training data is required for a GAN?

I'm beginning to study and implement GAN to generate more datasets. I'll just try to experiment with state-of-the-art GAN models as described here https://paperswithcode.com/sota/image-generation-on-cifar-10. The problem is I don't have a big…
6
votes
3 answers

Has anybody tried unsupervised deep learning from YouTube videos?

YouTube has a huge number of videos, many of which also contain various spoken languages. This would presumably provide something like the data that a "challenged" baby would experience - "challenged" meaning a baby without arms or legs…
6
votes
1 answer

Find anomalies from records of categorical data

I have a data-set with $m$ observations and $p$ categorical variables (nominal), each variable $X_1, X_2,\dots, X_p$ has several different possible values. Ultimately, I am looking for a way to find anomalies i.e. to identify rows for which the…
bat
6
votes
2 answers

Does the quality of training images affect the accuracy of the neural network?

I just got into AI a few months ago. I noticed that most of the images in training datasets are low quality (almost pixelated). Does the quality of training images affect the accuracy of the neural network? I tried googling, but I couldn't find…
Jerry U
6
votes
4 answers

What are some datasets to train an MLP on simple tasks?

I have implemented an MLP. Now, I want to train it to solve simple tasks. Are there any data sets to train the MLP on simple tasks, that is, tasks with a small number of inputs and outputs? I would like to train it to solve problems which are…
6
votes
0 answers

Are there any easy ways to create annotated training images for object detection?

For the purposes of object detection, are there any easy ways to create annotated training images? For example, if we have $10,000$ images and want to draw bounding boxes on 2 objects for each image, do we have to physically draw those boxes? Is…
6
votes
1 answer

How to detect LEGO bricks by using a deep learning approach?

In my thesis, I dealt with the question of how a computer can recognize LEGO bricks. For multiple object detection, I chose a deep learning approach. I also looked at an existing training set of LEGO brick images and tried to optimize it. My…
5
votes
4 answers

Traffic signs dataset

I'm looking for an annotated dataset of traffic signs. I was able to find Belgian, German, and many more traffic sign datasets. The only problem is that these datasets contain only cropped images, like this: While I need (for YOLO -- You Only Look Once…