I've just installed Recoll to index my text files. It works like a charm, but what surprised me is that it indexed docx files by default, while it asked me to install antiword to index doc files. I know doc and docx have different MIME types, but both can easily be opened by LibreOffice.
What I want to understand is: how come docx files were parsed out of the box, while doc files required an additional app (antiword)? Either LibreOffice is used by default for docx only (which I doubt, because when I browse my files in Nautilus both doc and docx are recognised as LibreOffice files), or Ubuntu has some other document parser that I'm not aware of.
In any case, I'm surprised to see that the more complex MS Office format is better supported than the simpler one.
UPDATE: I just checked both MIME types with xdg-mime, and both map to LibreOffice Writer. My question still stands: why weren't doc files indexed by default?
yuranos@yuranos-XPS-15-9550:~/development$ xdg-mime query default application/msword
libreoffice-writer.desktop
yuranos@yuranos-XPS-15-9550:~/development$ xdg-mime query default application/vnd.openxmlformats-officedocument.wordprocessingml.document
libreoffice-writer.desktop
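
For anyone who wants to reproduce the check, here is a minimal sketch that runs both queries in one go. It assumes xdg-mime (from xdg-utils) is installed; the function name query_defaults is just something I made up for illustration. Keep in mind that this only reports the desktop "open with" association, which is not necessarily the program an indexer calls to extract text.

```shell
#!/bin/sh
# Print the default desktop handler for each MIME type passed as an argument.
# query_defaults is a hypothetical helper name, not part of xdg-utils.
query_defaults() {
    for mime in "$@"; do
        # An empty result means no default is registered (or xdg-mime is missing).
        handler=$(xdg-mime query default "$mime" 2>/dev/null)
        printf '%s -> %s\n' "$mime" "${handler:-<none>}"
    done
}

query_defaults \
    application/msword \
    application/vnd.openxmlformats-officedocument.wordprocessingml.document
```

On my machine both lines end in libreoffice-writer.desktop, matching the output above.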