As far as I understand from a series of papers, minimizing the T-count in Clifford+T circuits is essential for fault-tolerant quantum computing:
While techniques such as magic state distillation and injection allow for fault-tolerant implementation of T gates, they typically require an order of magnitude more resources than Clifford gates.
For example, see here and here.
But do I understand correctly that minimizing the T-count (including in Clifford+T circuits) does not give a significant gain on the current IBM open quantum systems (real hardware, not simulations!), in particular because at the moment the T gate (like the T†, U1/P, and RZ gates)
can be implemented virtually in hardware via frame changes (i.e., with zero error and zero duration)?
See the documentation.