I was working through a simple neural network example on the Fashion MNIST dataset and getting 97% accuracy when I noticed that I had accidentally used binary cross-entropy instead of categorical cross-entropy. When I switched to categorical cross-entropy, the accuracy dropped to 90%. Curious, I then tried binary cross-entropy in place of categorical cross-entropy in my other projects, and the accuracy increased in all of them.

Now, I know that binary cross-entropy can be used in a multi-class, multi-label classification problem, but why is it working better than categorical cross-entropy in a multi-class, single-label problem?
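For concreteness, the two losses being compared can be written out on a single one-hot example. This is only a minimal NumPy sketch of the textbook definitions, not any framework's implementation; the function names and the probabilities in `y_pred` are made up for illustration.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # Only the probability assigned to the true class contributes.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Every output unit is treated as its own yes/no problem,
    # and the per-unit losses are averaged.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_class = -(y_true * np.log(y_pred)
                  + (1.0 - y_true) * np.log(1.0 - y_pred))
    return np.mean(per_class, axis=-1)

# Hypothetical single-label example with 4 classes (true class is index 2).
y_true = np.array([0.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.1, 0.2, 0.6, 0.1])

cce = categorical_cross_entropy(y_true, y_pred)
bce = binary_cross_entropy(y_true, y_pred)
print("categorical:", cce, "binary:", bce)
```

On the same prediction the two numbers differ because the binary version also scores the three "not this class" outputs and averages over all four, so the loss values themselves are not directly comparable.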

nbro

1 Answer

This question is relevant: https://stats.stackexchange.com/questions/260505/machine-learning-should-i-use-a-categorical-cross-entropy-or-binary-cross-entro
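Before comparing the two models, it may be worth ruling out that the two accuracy figures measure different things. The sketch below is plain NumPy with made-up predictions and hypothetical function names; it only shows that accuracy counted per output unit on one-hot labels can be much higher than accuracy counted per sample. Whether a given framework reports the per-unit variant when the loss is binary cross-entropy is an assumption to check against its documentation and version.

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    # One decision per sample: is the arg-max class the true class?
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

def elementwise_binary_accuracy(y_true, y_pred, threshold=0.5):
    # One decision per output unit: is each unit on the right side
    # of the threshold? Most units of a one-hot label are zeros.
    return np.mean((y_pred > threshold) == (y_true > 0.5))

# Hypothetical batch: 4 samples, 4 classes, true classes 0, 1, 2, 3.
y_true = np.eye(4)[[0, 1, 2, 3]]
y_pred = np.array([
    [0.7, 0.1, 0.1, 0.1],  # correct class
    [0.6, 0.2, 0.1, 0.1],  # wrong class
    [0.1, 0.1, 0.7, 0.1],  # correct class
    [0.3, 0.3, 0.3, 0.1],  # wrong class
])

cat_acc = categorical_accuracy(y_true, y_pred)
bin_acc = elementwise_binary_accuracy(y_true, y_pred)
print("per-sample accuracy:", cat_acc)
print("per-unit accuracy:", bin_acc)
```

In this toy batch only half of the samples are classified correctly, yet most individual output units agree with the label because the zeros of a one-hot vector are easy to get right.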

Based on my reading, when you train a NN with binary cross-entropy on what you might call 'linked category data', the accuracy can tend to be better than with a categorical cross-entropy model. The binary aspect implies that the categories can undergo multiple splits before the exact category is decided. When the data can be split categorically like this, in a tree-like hierarchy, the accuracy can tend to be better.

Think of how difficult it would be to memorize the name of every piece of clothing in someone's wardrobe if each piece had its own special name, versus if the pieces had structurally relevant names. For example, category number one could be upper/lower: is it worn on your top half or your bottom half? That could be followed by inner/outer: is it an inner or an outer layer of clothing? Learning such binary name/feature categories enables a more accurate model. If the data were not related in this way, the model would most likely not be as accurate. The binary model can take advantage of learning such features, while a multi-class categorical model, I think, assumes independence, tries to learn the features of each group as well as it can, and outputs a prediction of how sure it is that the input falls into each category.
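The wardrobe analogy can be sketched as a small encoding in which each class is identified by a few yes/no attributes instead of its own label. Everything here is hypothetical (the attribute names, the four clothing items, and the `decode` helper); it illustrates the idea only and is not how a network trained on one-hot labels with binary cross-entropy actually represents classes.

```python
# Hypothetical binary attributes: (is_upper_body, is_outer_layer).
CLASS_CODES = {
    "undershirt": (1, 0),
    "coat":       (1, 1),
    "long johns": (0, 0),
    "trousers":   (0, 1),
}

def decode(attribute_probs, threshold=0.5):
    """Map per-attribute probabilities to the class with the matching code."""
    bits = tuple(int(p > threshold) for p in attribute_probs)
    for name, code in CLASS_CODES.items():
        if code == bits:
            return name
    return None  # no class has this combination of attributes

# Two binary decisions are enough to pick one of four classes.
print(decode((0.9, 0.2)))
print(decode((0.1, 0.8)))
```

With this kind of encoding, two binary outputs distinguish four classes, which is the "multiple splits" structure described above.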

Michael Hearn