
(Disclaimer: I don't know much about ML/AI, besides some basic ideas behind it all.)

It seems like ML/AI models can often be boiled down to statistics, where certain levers (weights) get fine-tuned based on the specific input of a large set of training data.

Clearly, ML/AI models are not distinguished by their training data alone, otherwise there would not be so many improvements happening in the field all the time. My question therefore is: what does distinguish different models of the same category?

If I have an AI that completes real-life pictures that have some missing parts, and an AI that completes a painting with missing parts, what key concepts separate the two?

If I have an AI detecting text in an image, and an AI detecting... trees in an image, what key concepts separate the two?

In other words, what is stopping me from "taking" an existing implementation of a certain AI category, and just feeding it my specific training set + rewards (i.e. judgement criteria for good vs bad output), in order to solve a specific task?

In yet again other words, if I wanted to use ML/AI to build a new model for a specific task, what concepts and topics would I need to pay extra attention to? (I guess you could say I'm trying to reverse engineer the learning process of the field here. I don't have the time to properly teach myself and become an "expert", but find it all very interesting and would still like to use some of the wonderful things people have done.)


1 Answer


If I understand what you mean correctly, then the answer is basically: nothing. Fundamentally, all ML algorithms are doing the same thing, which is to optimize some weights to produce a certain output (this is true even for non-parametric methods, in an implicit way, but let's not dive too deep here). The only differences are:

  1. The dataset the model is trained on.
  2. The specific dimensionality of the inputs and outputs, so that they are compatible with the input data and the output labels.
  3. The complexity that can be expressed by the model (the number of weights and their structure in layers, in the case of neural networks).
  4. Changes to the optimization process (gradient clipping, regularization, etc.).
  5. Structural changes to the architecture that embed assumptions about the specific problem setting (e.g. 2D convolutions embed assumptions about images, a softmax activation embeds the assumption of classification with probabilities, hidden states embed assumptions about memory and "forgetting", etc.); see the sketch after this list.

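To make point 5 a bit more concrete, here is a minimal sketch, assuming a PyTorch-style setup (the layer sizes and the toy input are purely illustrative, not taken from any particular model), of how structural choices encode assumptions about the problem:

    import torch
    import torch.nn as nn

    # A 2D convolution assumes the input is laid out on a 2D grid (e.g. an image):
    # the same small filter is slid over every spatial location.
    conv_block = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

    # A softmax output assumes the task is classification: it turns raw scores
    # into a probability distribution over a fixed set of classes.
    classifier_head = nn.Sequential(nn.Linear(16 * 32 * 32, 10), nn.Softmax(dim=1))

    x = torch.randn(1, 3, 32, 32)            # one fake 32x32 RGB "image"
    features = conv_block(x)                 # shape: (1, 16, 32, 32)
    probs = classifier_head(features.flatten(1))
    print(probs.shape, probs.sum())          # torch.Size([1, 10]), sums to ~1.0

Swapping either of those pieces for something else (a fully connected layer instead of the convolution, a plain linear output instead of the softmax) changes which assumptions the model makes, not the fact that it is still just optimizing weights.
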
Therefore, if you have a "similar" problem setting where you can assume that 3, 4, and 5 can stay the same without problems, then you can just make the appropriate changes to 2 and change the dataset (1) to get a model that works on something different.
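
As a rough illustration of that point, assuming PyTorch again (the make_classifier helper, the layer sizes, and the two toy tasks are hypothetical, just to show the idea), the same architecture can be reused for a different task by changing only the input/output dimensions (2) and the dataset it is fed (1):

    import torch.nn as nn

    def make_classifier(in_channels: int, num_classes: int) -> nn.Module:
        # Same architecture every time; only the dimensions are parameters.
        return nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    # "Is there text in this grayscale crop?" -> 1 input channel, 2 classes
    text_model = make_classifier(in_channels=1, num_classes=2)

    # "Is there a tree in this RGB crop?" -> 3 input channels, 2 classes
    tree_model = make_classifier(in_channels=3, num_classes=2)

    # Training then only differs in the dataset each model is fed; the loop itself
    # (nn.CrossEntropyLoss, torch.optim.Adam, a DataLoader, etc.) stays the same.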

Of course, being able to tell how "similar" a problem is to another and what possible things could be different is quite tricky and requires a lot of knowledge about the domain of Machine Learning and how every algorithm works and why.

In essence, I'm saying that in principle you could take a model and train it for a new setting with very minor and simple changes that don't require a lot of knowledge. However, without wider knowledge of the field of ML/AI, you won't be able to tell whether what you are doing is sound or how well it can be expected to work in general.