Simply put, question 11 in chapter 4 of Aurélien Géron's book "Hands-on Machine Learning" asks:
Suppose you want to classify pictures as outdoor/indoor and daytime/nighttime. Should you implement two logistic regression classifiers or one softmax regression classifier?
He gives the answer in the Jupyter notebook that accompanies the book (the answer quoted below is exactly as he gave it and is complete):
If you want to classify pictures as outdoor/indoor and daytime/nighttime, since these are not exclusive classes (i.e., all four combinations are possible) you should train two Logistic Regression classifiers.
And I am not sure I got it. My immediate answer was: one softmax! I was surprised that the given answer was different, and I don't know if I understand it.
What I thought:
Isn't the idea of softmax regression multiclass classification? When he says "all four combinations are possible", doesn't he mean outdoor daytime, outdoor nighttime, indoor daytime, and indoor nighttime? If those four classes are possible and I want to classify each picture as one of those four, this is the task I would expect to suit a softmax regression better than two separate logistic regressions.
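To make concrete what I mean by "one of those four", here is a minimal sketch of the two ways of labeling the same pictures. The toy labels and the names `outdoor`, `daytime`, and `CLASS_NAMES` are made up for illustration and do not come from the book.

```python
# Hypothetical toy labels for five pictures (not from the book).
outdoor = [1, 1, 0, 0, 1]  # 1 = outdoor, 0 = indoor
daytime = [1, 0, 1, 0, 1]  # 1 = daytime, 0 = nighttime

# Framing A: two binary targets, one per logistic regression classifier.
targets_two_binary = list(zip(outdoor, daytime))

# Framing B: a single four-class target for one softmax regression.
# The class index is 2 * outdoor + daytime.
CLASS_NAMES = [
    "indoor nighttime",   # 0
    "indoor daytime",     # 1
    "outdoor nighttime",  # 2
    "outdoor daytime",    # 3
]
targets_four_class = [2 * o + d for o, d in zip(outdoor, daytime)]


def decode(class_index):
    """Recover the (outdoor, daytime) pair from a four-class index."""
    return divmod(class_index, 2)


for c in targets_four_class:
    print(c, CLASS_NAMES[c], decode(c))
```

Both framings carry the same information about each picture; the difference is only whether the model sees two binary targets or one target with four mutually exclusive values.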
More than that, I think that softmax regression allows for joint modeling instead of treating each classification task independently, which is what happens with two logistic regressions. Considering all classes together and learning the relationships between the features can capture dependencies between them, right?
About the parameter estimation: isn't estimating a single set of parameters that map inputs to probabilities of each class more efficient than training two separate models?
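A rough way to check the parameter question is simply to count weights. The sketch below assumes plain linear models with one bias term each, and a softmax regression with one parameter vector per class (optionally with one class pinned as a reference, which removes the redundant vector). The function names and the choice of 100 features are hypothetical.

```python
def logistic_param_count(n_features, n_classifiers=2):
    """Weights plus one bias per binary logistic regression classifier."""
    return n_classifiers * (n_features + 1)


def softmax_param_count(n_features, n_classes=4, pin_reference_class=False):
    """Weights plus one bias per class in a softmax regression.

    With pin_reference_class=True, one class is fixed as the reference,
    leaving n_classes - 1 free parameter vectors.
    """
    free_vectors = n_classes - 1 if pin_reference_class else n_classes
    return free_vectors * (n_features + 1)


n_features = 100  # assumed number of input features, for illustration only

two_logistic = logistic_param_count(n_features)
one_softmax = softmax_param_count(n_features)
one_softmax_pinned = softmax_param_count(n_features, pin_reference_class=True)

print("two logistic regressions:", two_logistic)
print("four-class softmax:", one_softmax)
print("four-class softmax, reference class pinned:", one_softmax_pinned)
```

Under these assumptions the four-class softmax has at least as many parameters as the two logistic regressions combined, so "a single set of parameters" is not automatically a smaller set.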
Generalization: isn't softmax possibly better at generalizing to unseen combinations of outdoor/indoor and daytime/nighttime, since it learns a unified decision boundary?
Am I wrong? Are there more things to consider, or will two separate logistic regressions always be better than one softmax model?