I'm building a model for facial expression recognition, and I want to use transfer learning. From what I understand, there are different stages involved: the first is feature extraction and the second is fine-tuning. I want to understand more about these two stages and the difference between them. Must we use both in the same training?
3 Answers
Typically, in transfer learning, you have 2-3 stages:
Pre-training: pre-train some base model $M_\text{base}$ on some "general" dataset $A$; note that you may not necessarily need to train $M_\text{base}$ yourself, as it may already be available, e.g. on the web. During this phase, general features or representations of the data are learned, which can "bootstrap" the learning task on your specific dataset.
Training: you replace the last layers of $M_\text{base}$ (i.e. the classifier/regression part) with new layers suited to your task, and you might freeze the initial layers (e.g. the convolutional layers), which are assumed to contain general features that are also useful for your task. Let's call this model $M_\text{main}$. At this point, you train the partially frozen $M_\text{main}$ on your dataset $B$.
Fine-tuning: after training, you could unfreeze some of the frozen layers in $M_\text{main}$, especially the ones closest to your new classifier, then train again, usually with a smaller learning rate.
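To make stages 2 and 3 concrete, here is a minimal sketch, assuming a TensorFlow/Keras setup, an ImageNet-pretrained MobileNetV2 standing in for $M_\text{base}$, a 7-class facial-expression label set, and hypothetical `train_ds`/`val_ds` datasets (none of these specifics come from the question; any backbone and dataset would do):

```python
import tensorflow as tf

# Stage 1 is already done for us: MobileNetV2 with ImageNet weights plays the
# role of M_base (an assumption; any pre-trained backbone would work here).
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet", pooling="avg")

# Stage 2 (training): freeze the base and train only a new classification head
# (7 classes as a stand-in for a facial-expression label set).
base.trainable = False
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base(inputs, training=False)  # keep batch-norm statistics frozen
outputs = tf.keras.layers.Dense(7, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)   # hypothetical datasets

# Stage 3 (fine-tuning): unfreeze the layers closest to the new head and
# re-train with a much smaller learning rate.
base.trainable = True
for layer in base.layers[:-30]:   # keep the earlier, more generic layers frozen
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)
```

Recompiling with a learning rate a couple of orders of magnitude smaller is the usual precaution so that fine-tuning adjusts, rather than destroys, the pre-trained features.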
In all 3 stages, one could say that we're extracting features (because we're learning weights), but some people refer to the pre-training phase as the feature extraction phase. I've also seen people call the training stage the fine-tuning stage (a previous version of this answer did so as well). In the end, these terms can be used inconsistently, so the important thing is that you understand what's going on and keep the context in mind.
You can find more information about this topic here. Note that there may be other more sophisticated or simply different approaches to transfer learning.
The difference between the two approaches (feature extraction vs fine-tuning) is well explained here: Fine Tuning vs Joint Training vs Feature Extraction
Also, this paper evaluates the performance one can hope to achieve with two sequence models (ELMo and BERT) under each approach: To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
It has been a while since the question was asked, but I came across this article. It helped me understand the topic. From the article:
Feature-based methods involve using the intermediate representations or features from a pre-trained model as additional inputs to a task-specific model.
Fine-tuning, on the other hand, involves modifying and retraining a pre-trained model to adapt it to a specific task.
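To illustrate the feature-based route, here is a minimal sketch, assuming a TensorFlow/Keras backbone and scikit-learn; the names `X_train`, `y_train`, `X_test`, `y_test` are hypothetical arrays, not something defined in the article. The pre-trained model is used only to produce fixed embeddings, and a separate task-specific classifier is trained on top of them:

```python
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

# Frozen pre-trained backbone used purely as a feature extractor.
extractor = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet", pooling="avg")
extractor.trainable = False

def extract_features(images):
    """Map raw images of shape (N, 224, 224, 3) to fixed 1280-d embeddings."""
    images = tf.keras.applications.mobilenet_v2.preprocess_input(images)
    return extractor.predict(images, verbose=0)

# The backbone's weights are never updated; only the small downstream model learns.
# clf = LogisticRegression(max_iter=1000).fit(extract_features(X_train), y_train)
# print(clf.score(extract_features(X_test), y_test))
```

The contrast with fine-tuning is that here the pre-trained weights stay fixed for good, and all task-specific learning happens in the separate model that consumes the extracted features.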