Questions tagged [training-datasets]

For questions related to the dataset used to train machine learning models, such as neural networks. The training dataset is different from the validation and test datasets, which are used for early stopping (and/or hyperparameter optimization) and to test the final model's performance, respectively.
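As a rough illustration of the three roles described above, here is a minimal sketch of a shuffled train/validation/test split using only the Python standard library. The function name `split_dataset`, the 70/15/15 proportions, and the fixed seed are assumptions chosen for the example, not something prescribed by the tag description.

```python
import random


def split_dataset(samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle samples and return (train, validation, test) lists.

    The fractions and seed are illustrative defaults (hypothetical values).
    """
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")

    shuffled = list(samples)
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)

    n_val = round(len(shuffled) * val_fraction)
    n_test = round(len(shuffled) * test_fraction)

    validation = shuffled[:n_val]            # early stopping / hyperparameter tuning
    test = shuffled[n_val:n_val + n_test]    # final performance estimate only
    train = shuffled[n_val + n_test:]        # used to fit the model
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

The three lists are disjoint by construction, so no example used for fitting is reused for model selection or for the final evaluation.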

67 questions
10
votes
1 answer

What causes ChatGPT to generate responses that refer to itself as a bot or LM?

ChatGPT occasionally generates responses to prompts that refer to itself as a "bot" or "language model." For instance, when given a certain input (the first paragraph of this question) ChatGPT produces (in part) the output: It is not appropriate…
7
votes
1 answer

How much training data is required for a GAN?

I'm beginning to study and implement GAN to generate more datasets. I'll just try to experiment with state-of-the-art GAN models as described here https://paperswithcode.com/sota/image-generation-on-cifar-10. The problem is I don't have a big…
6
votes
1 answer

How was ChatGPT trained?

I know that large language models like GPT-3 are trained simply to continue pieces of text that have been scraped from the web. But how was ChatGPT trained, which, while also having a good understanding of language, is not directly a language model,…
6
votes
1 answer

During neural network training, can gradients leak sensitive information if the training data is (homomorphically) encrypted?

Some algorithms in the literature allow recovering the input data used to train a neural network. This is done using the gradients (updates) of weights, such as in Deep Leakage from Gradients (2019) by Ligeng Zhu et al. In case the neural network is…
5
votes
1 answer

How can I estimate how many photos I need to train ResNet-50 for image classification?

I am working on a project where I have to classify around 1000 unique objects. I'm trying to plan how much training data I will need to collect. I was planning on using ResNet-50. Is there any way I can estimate the amount of photos I should plan to…
5
votes
2 answers

Do we need automatic hyper-parameter tuning when we have a large enough dataset?

Hyperparameter tuning is the process of selecting the optimal hyperparameters for an ANN. Now, my guess is that, if we have sufficient data (say, 1.4 million for, say, 6 features), the model can be optimally trained and we don't need a…
4
votes
1 answer

What happens to the training data after your machine learning model has been trained?

What happens after you have used machine learning to train your model? What happens to the training data? Let's pretend it predicted correctly 99.99999% of the time and you were happy with it and wanted to share it with the world. If you put in 10GB…
4
votes
1 answer

What are "development test sets" used for?

This is a theoretical question. I am a newbie to artificial intelligence and machine learning, and the more I read the more I like this. So far, I have been reading about the evaluation of language models (I am focused on ASR), but I still don't get…
3
votes
4 answers

Labeling policy for airplane-detecting YOLO

I am training my YOLO to detect airplanes and drones. In some of the pictures it is impossible to detect that the object is indeed an airplane, and it even looks like a drone (pictures are taken from very far away), but I know from the context that…
3
votes
1 answer

Why are some LLMs trained on both CommonCrawl and Wikipedia/StackExchange?

Some LLMs are trained on both CommonCrawl and Wikipedia/StackExchange. Why? Does CommonCrawl already contain Wikipedia/StackExchange? E.g., from the LLaMa 1 paper: and from…
Franck Dernoncourt
3
votes
1 answer

What's the architecture and size of neural-network-based reward models as used in reinforcement learning from human feedback?

My rough understanding of RLHF as used for ChatGPT in a nutshell is this: A reward model is trained using comparisons of different responses to the same prompt. Human trainers rank these responses based on quality. The reward model is a neural…
3
votes
4 answers

How is MNIST only providing the training and the test sets? What about the validation?

I was taught that, usually, a dataset has to be divided into three parts:
Training set: for learning purposes
Validation set: for picking the model which minimizes the loss on this set
Test set: for testing the performance of the model picked…
tail
3
votes
1 answer

How do I select the (number of) negative cases, if I'm given a set of positive cases?

We were given a list of labeled data (around 100) of known positive cases, i.e. people that have a certain disease, i.e. all these people are labeled with the same class (disease). We also have a much larger amount of data that we can label as…
3
votes
0 answers

How does one continue the pre-training in BERT?

I need some help with continuing pre-training on BERT. I have a very specific vocabulary and lots of specific abbreviations at hand. I want to do an STS task. Let me specify my task: I have domain-specific sentences and want to pair them in terms of…
Adrian_G
2
votes
2 answers

What is the effect of training a neural network with randomly generated fake data that satisfies certain constraints?

I have a neural network with 2 inputs and one output, like so:

  input        | output
  a     | b    | c
  5.15  | 3.17 | 0.0607
  4.61  | 2.91 | 0.1551
  etc.

I have 75 samples and I am using 50 for training and 25 for…