For questions related to pre-trained models. A pre-trained model is a model that was trained on a large benchmark dataset to solve a problem similar to the one that we want to solve. Because of the computational cost of training such models from scratch, it is common practice to import and use models from the published literature (e.g. VGG, Inception, MobileNet).
Questions tagged [pretrained-models]
32 questions
4
votes
1 answer
What is the difference between fine-tuning and variants of few-shot learning?
I am trying to understand the concept of fine-tuning and few-shot learning.
I understand the need for fine-tuning. It is essentially tuning a pre-trained model to a specific downstream task.
However, recently I have seen a plethora of blog posts…
Exploring
- 371
- 7
- 18
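The question above asks about fine-tuning a pre-trained model for a downstream task. A minimal sketch of the usual recipe (freeze the pre-trained backbone, train a new task-specific head), with a tiny hypothetical network standing in for a real pre-trained backbone such as a ResNet or BERT encoder:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pre-trained backbone; in practice you would
# load real pre-trained weights (e.g. from torchvision or HuggingFace).
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))

# Freeze the pre-trained weights so only the new head is updated.
for param in backbone.parameters():
    param.requires_grad = False

# New task-specific head for the downstream problem (here: 3 classes).
head = nn.Linear(32, 3)
model = nn.Sequential(backbone, head)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(4, 16)  # dummy batch of 4 samples
logits = model(x)
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 2, 0]))
loss.backward()
optimizer.step()
```

In contrast, few-shot methods typically keep the model fixed and adapt through the input (e.g. in-context examples) rather than through gradient updates to the weights.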
4
votes
0 answers
How can I improve the performance of a model trained to detect vehicle poses?
I'm looking for some suggestions on how to improve our vehicle image recognition. We have an online marketplace where customers submit photos of their vehicles. The photos need to meet certain requirements before the advert can be approved.…
mechane
- 41
- 4
3
votes
0 answers
What is the difference between prompt tuning and prefix tuning?
I read that prompt tuning and prefix tuning are two effective mechanisms for leveraging frozen language models to perform downstream tasks. What is the difference between the two, and how do they actually work?
Prompt Tuning:…
Exploring
- 371
- 7
- 18
3
votes
1 answer
Using a pre-trained model to generate labels to data to then train a model on
I'm trying to set up a pipeline for my ML models to automatically re-train themselves whenever concept drift occurs to recalibrate to the new output distributions. However, I can't get ground-truth from my data without manual labeling, and I want an…
Sanger Steel
- 31
- 1
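The question above describes using a pre-trained model to generate labels for unlabeled data. A common version of this is pseudo-labeling with a confidence threshold; a minimal sketch, where the probability array is a hypothetical stand-in for the softmax output of a pre-trained classifier:

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Keep only predictions the pre-trained model is confident about.

    probs: (n_samples, n_classes) predicted class probabilities.
    Returns indices of the confident samples and their hard labels.
    """
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    return np.nonzero(keep)[0], probs[keep].argmax(axis=1)

# Dummy predictions for 4 unlabeled samples over 3 classes.
probs = np.array([
    [0.95, 0.03, 0.02],  # confident -> class 0
    [0.40, 0.35, 0.25],  # uncertain -> dropped
    [0.05, 0.05, 0.90],  # confident -> class 2
    [0.50, 0.45, 0.05],  # uncertain -> dropped
])
idx, labels = pseudo_label(probs)
# idx -> [0, 2], labels -> [0, 2]
```

The confident subset can then be used as training data for the downstream model; the threshold trades label coverage against label noise.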
3
votes
1 answer
How does a software license apply to pretrained models?
Google provides a lot of pretrained tensorflow models, but I cannot find a license.
I am interested in the tfjs-models. The code is licensed Apache-2.0, but the models are downloaded by the code, so the license of the repository probably does not…
allo
- 312
- 1
- 9
3
votes
1 answer
Are there any better visual models for transfer learning than ImageNet-trained ones?
Similar to the recent pushes in Pretrained Language Models (BERT, GPT2, XLNet) I was wondering if such a thrust exists in Computer Vision?
From my understanding, it seems the community has converged and settled for ImageNet trained classifiers as…
mshlis
- 2,399
- 9
- 23
2
votes
0 answers
Fine-tuning ResNet101 stuck at ~50% accuracy while MobileNetV2 reaches ~90% (same data, head, training setup)
I'm fine-tuning two different CNNs for an image classification task:
The first CNN uses a ResNet101 backbone, and the second uses a MobileNetV2 backbone. Both are pre-trained on ImageNet.
I use the same classification head for both models: a dense…
S.E.K.
- 41
- 1
- 5
2
votes
1 answer
Should I use a pretrained model for image classification or not?
I have thousands of images similar to this.
I can classify them using existing metadata to different folders according to gravel product type loaded on the truck.
What would be the optimal way to train a model for image classification that would be…
Vojtěch Dohnal
- 121
- 2
2
votes
1 answer
Does BERT freeze the entire model body when it does fine-tuning?
Recently, I came across the BERT model. I did some research and tried some implementations.
I wanted to tackle a NER task, so I chose the BertForSequenceClassification model provided by HuggingFace.
for epoch in range(1, args.epochs + 1):
total_loss…
Joon
- 51
- 1
- 6
2
votes
1 answer
What are some most promising ways to approximate common sense and background knowledge?
I learned from this blog post Self-Supervised Learning: The Dark Matter of Intelligence that
We believe that self-supervised learning (SSL) is one of the most promising ways to build such background knowledge and approximate a form of common sense…
Lerner Zhang
- 1,065
- 1
- 9
- 22
2
votes
1 answer
Which hyperparameters in a neural network are accessible for user adjustment?
I am new to Neural Networks and my questions are still very basic.
I know that most neural networks allow, and even require, the user to choose hyper-parameters like:
number of hidden layers
number of neurons in each layer
number of inputs and…
Igor
- 303
- 1
- 11
2
votes
1 answer
How to use pre-trained BERT to extract the vectors from sentences?
I'm trying to extract the vectors from the sentences. I spent so much time searching for pre-trained BERT models but found nothing.
Is it possible to get the vectors from the data using pre-trained BERT?
Pluviophile
- 1,293
- 7
- 20
- 40
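The question above asks about extracting sentence vectors with pre-trained BERT. One common approach is mean pooling over the token embeddings, ignoring padding. A minimal sketch of the pooling step, with dummy arrays standing in for real model output (in practice, the `last_hidden_state` and `attention_mask` from a HuggingFace `transformers` model):

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Average token vectors into one sentence vector, ignoring padding.

    hidden_states: (batch, seq_len, dim) token embeddings, e.g. the
    last_hidden_state returned by a pre-trained BERT model.
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding.
    """
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    counts = mask.sum(axis=1)
    return summed / counts

# Dummy "BERT output": batch of 2 sentences, 4 tokens, 3-dim embeddings.
hidden = np.arange(24, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 0, 0],   # sentence 1: 2 real tokens, 2 padding
                 [1, 1, 1, 1]])  # sentence 2: 4 real tokens
vectors = mean_pool(hidden, mask)
# vectors.shape -> (2, 3), one vector per sentence
```

Libraries such as sentence-transformers package exactly this pattern (encoder forward pass plus pooling) behind a single `encode` call.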
1
vote
1 answer
Multi-task objectives sometimes improve single-task performance, but is this true when fine-tuning?
It is known that multitask objectives in neural networks sometimes have the effect of improving the performance of the neural network for each of the tasks individually (versus training the same network for each task individually). To what extent is…
Alexander Soare
- 1,379
- 3
- 12
- 28
1
vote
1 answer
How to Train a Decoder for Pre-trained BERT Transformer-Encoder?
Context:
I am currently working on an encoder-decoder sequence-to-sequence model that uses a sequence of word embeddings as input and output, and then reduces the dimensionality of the word embeddings.
The word embeddings are created using…
node_env
- 11
- 2
1
vote
0 answers
How to scrape product data on supplier websites?
I'm currently trying to build a semantic scraper that can extract product information from different company websites of suppliers in the packaging industry (with as little manual customization per supplier/website as possible).
The current approach…
johannesha
- 11
- 1