
From what I understand, there are two stages in deep learning: the first is training and the second is inference. Training is often done on GPUs because of their massive parallelism, among other things. Inference can also be done on GPUs, but they are used less for it, because of their power consumption and because each inference request involves far less data, so the full capability of a GPU isn't needed. Instead, FPGAs and CPUs are often used for inference.

My understanding is also that a complete DL system will have both a training system and an inference system.

My question is: are both systems required in the same application?

Take an autonomous car, or any application that does visual and image recognition: will it have both a training system, so that it can be trained, and an inference system to execute? Or will it have only the inference system and communicate with a remote system that has already been trained and has built up a database?

Also, if the application has both systems, will it have enough memory to store the training data, given that it can be a small system whose memory is ultimately limited?


2 Answers


Training and inference are usually done on two separate systems. You are right that training of deep neural networks is usually done on GPUs and that inference is usually done on CPUs; beyond that, the two phases are almost always run on separate systems.

The main workflow for many data scientists today is as follows:

  1. Create a model, such as a deep neural network, and set all of its hyper-parameters.

  2. Train the model on a GPU.

  3. Save the weights found during GPU training so that the model can be deployed.

  4. Load the model, with those trained weights, into the production application.

So, as you can see from this workflow, training and inference are done in two completely separate phases.
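
As a rough illustration of that workflow, here is a minimal sketch, assuming PyTorch as the framework; the toy model, the random placeholder data, and the file name `model_weights.pt` are all made up for illustration:

```python
import torch
import torch.nn as nn

# 1. Define the model and its hyper-parameters (toy fully connected network)
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# 2. Train on a GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
for epoch in range(10):
    inputs = torch.randn(128, 64, device=device)          # placeholder batch
    targets = torch.randint(0, 10, (128,), device=device)  # placeholder labels
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

# 3. Save the trained weights so the model can be deployed
torch.save(model.state_dict(), "model_weights.pt")

# 4. In the production application: load the weights and run inference on a CPU
deployed = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
deployed.load_state_dict(torch.load("model_weights.pt", map_location="cpu"))
deployed.eval()
with torch.no_grad():
    prediction = deployed(torch.randn(1, 64)).argmax(dim=1)
```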

However, in some specific cases, training and inference are done on the same system. For example, if you are using a deep neural network to play video games, you may have the network train and infer on the same system. This can be beneficial because it allows the model to keep learning continuously as it plays.
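
For that same-system case, here is a toy sketch of what the interleaving might look like (the "environment" is just random placeholder data and the feedback rule is invented purely for illustration, again assuming PyTorch):

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(1000):
    state = torch.randn(1, 8)                # placeholder game observation

    # Inference: choose an action with the current network
    logits = policy(state)
    action = logits.argmax(dim=1)

    # Placeholder feedback: pretend action 0 is always the correct choice
    target = torch.zeros(1, dtype=torch.long)

    # Training: immediately update the same network from that feedback
    loss = nn.functional.cross_entropy(logits, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```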

To answer your question on memory: the applications where training and inference happen on the same system generally have a lot of memory available (think dual-GPU, dual-CPU workstations with 128 GB of RAM), whereas systems with a limited amount of memory, such as embedded applications, only run inference.

Aiden Grossman

Deep learning seems mostly to be a buzzword for what is essentially a neural network: you train it with a data set to recognize a pattern, then feed it new data, which the trained network classifies.

So you train a neural network on 10 different kinds of animals using thousands of pictures. Then you show the network, say, 100 new images and have it "guess" which animal each one is.

The point here is that training a neural network requires code for feedback that an application using the trained network would not need. So an application that just uses the trained network is a bit more streamlined than one that also allows additional training.
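
To make that concrete, here is a rough sketch, again assuming PyTorch, with a made-up 10-class classifier and random placeholder images: the loss function, gradients, and optimizer exist only on the training side, while the deployed side needs nothing but the forward pass.

```python
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))

# Training side: needs the feedback machinery (loss, gradients, optimizer)
optimizer = torch.optim.Adam(classifier.parameters())
loss_fn = nn.CrossEntropyLoss()
images = torch.randn(64, 3, 32, 32)          # placeholder animal photos
labels = torch.randint(0, 10, (64,))         # placeholder labels
loss = loss_fn(classifier(images), labels)
loss.backward()
optimizer.step()
torch.save(classifier.state_dict(), "animals.pt")

# Deployed side: only the forward pass, no feedback code at all
classifier.load_state_dict(torch.load("animals.pt"))
classifier.eval()
with torch.no_grad():
    guess = classifier(torch.randn(1, 3, 32, 32)).argmax(dim=1)
```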

What is missing is the ability of machine learning to form hierarchical or otherwise interrelated rules from the trained network, so that the network comes closer to being able to rationally explain why the classification works. Going back to the network trained on 10 animals: without rules being formulated during training, there is no practical way for the network to reveal why any of the animals was classified the way it was.

MaxW