In the Deep Learning book by Goodfellow et al., section 11.4.5 (p. 438), the following claims can be found:

Currently, we cannot unambiguously recommend Bayesian hyperparameter optimization as an established tool for achieving better deep learning results or for obtaining those results with less effort. Bayesian hyperparameter optimization sometimes performs comparably to human experts, sometimes better, but fails catastrophically on other problems. It may be worth trying to see if it works on a particular problem but is not yet sufficiently mature or reliable.

Personally, I never used Bayesian hyperparameter optimization. I prefer the simplicity of grid search and random search.

As a first approximation, I'm considering easy AI tasks, such as multi-class classification problems with DNNs and CNNs.

In which cases should I consider Bayesian hyperparameter optimization? Is it worth it?

nbro

1 Answer

Efficiently integrating an HPO framework into an existing project is non-trivial. For most common datasets and tasks, established architectures and hyperparameter settings already exist, leaving only a few additional parameters to tune. In that case, whatever benefit Bayesian HPO techniques bring (assuming there is one) is outweighed by the extra development time, and this simplicity is a dominant reason users prefer grid search or random search over more sophisticated Bayesian optimization techniques.
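To make the simplicity argument concrete, here is a minimal random-search sketch in pure Python. `validation_loss` is a hypothetical stand-in for actually training a model and measuring validation loss (the synthetic surface is an assumption, minimized at lr = 1e-3, dropout = 0.5), and the sampling ranges are illustrative:

```python
import math
import random

def validation_loss(lr, dropout):
    # Hypothetical stand-in for "train the model, return validation loss";
    # this synthetic surface is minimized at lr = 1e-3, dropout = 0.5.
    return (math.log10(lr) + 3.0) ** 2 + (dropout - 0.5) ** 2

random.seed(0)
best = None
for _ in range(50):
    lr = 10 ** random.uniform(-6.0, 0.0)   # sample the learning rate log-uniformly
    dropout = random.uniform(0.0, 0.9)
    loss = validation_loss(lr, dropout)
    if best is None or loss < best[0]:
        best = (loss, lr, dropout)

print("best (loss, lr, dropout):", best)
```

The whole tuner is a dozen lines with no extra dependencies, which is exactly the development-time advantage described above.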

For a very large-scale problem, or an entirely new task (or dataset) for which the user has no intuition about "good hyperparameters", Bayesian HPO may be a better option than random or grid search. Insufficient domain knowledge means the search space can be very large, so integrating HPO into the project may cost far less total time than grid search over all possible hyperparameter combinations.
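For illustration, here is a toy sketch of what Bayesian HPO does, over a single hypothetical hyperparameter (the learning-rate exponent): a Gaussian-process surrogate with an RBF kernel models the loss surface, and a lower-confidence-bound acquisition picks the next candidate by trading off predicted loss against uncertainty. The objective, constants, and helper names are all assumptions for the sketch; in practice one would use a library such as scikit-optimize, Hyperopt, or Optuna rather than hand-rolling this:

```python
import math
import random

def objective(x):
    # Synthetic validation loss over x = log10(learning rate); minimum at x = -3.
    return (x + 3.0) ** 2

def rbf(a, b, length=1.0):
    # Squared-exponential (RBF) kernel.
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    # Solve A v = b by Gaussian elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        v[r] = (M[r][n] - sum(M[r][c] * v[c] for c in range(r + 1, n))) / M[r][r]
    return v

def mat_inv(A):
    # Invert A by solving A x = e_i for every unit vector e_i.
    n = len(A)
    cols = [solve(A, [1.0 if j == i else 0.0 for j in range(n)]) for i in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def bayes_opt(n_iter=20, noise=1e-6, seed=0):
    rng = random.Random(seed)
    grid = [-6.0 + 0.05 * i for i in range(121)]      # candidate log10(lr) values
    xs = [rng.uniform(-6.0, 0.0) for _ in range(3)]   # random initial evaluations
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        n = len(xs)
        K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        Kinv = mat_inv(K)
        alpha = [sum(Kinv[i][j] * ys[j] for j in range(n)) for i in range(n)]
        best_lcb, x_next = None, None
        for g in grid:
            ks = [rbf(g, x) for x in xs]
            mu = sum(k * a for k, a in zip(ks, alpha))   # GP posterior mean
            v = [sum(Kinv[i][j] * ks[j] for j in range(n)) for i in range(n)]
            var = max(1.0 - sum(k * vi for k, vi in zip(ks, v)), 0.0)
            lcb = mu - 2.0 * math.sqrt(var)  # lower-confidence-bound acquisition
            if best_lcb is None or lcb < best_lcb:
                best_lcb, x_next = lcb, g
        xs.append(x_next)
        ys.append(objective(x_next))         # evaluate the chosen candidate
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

x_best, y_best = bayes_opt()
print("best log10(lr):", x_best, "loss:", y_best)
```

The point of the surrogate is that each expensive evaluation is chosen deliberately, so a good region is typically found with far fewer evaluations than exhaustive grid search would need over a large search space.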

rhdxor