I understand, of course, that absolute model inference speed depends on hardware, but is there some common method of profiling or estimating a "relative" model inference speed — say, based on the number and type of operations — such that one could compare several models regardless of differing hardware? For example, if I develop a model on my machine and a colleague develops a model on their machine (with, say, a different CPU/GPU), is there some way for each of us to compare the models' relative speeds without having to run them side by side on the same machine?
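To make it concrete, the kind of hardware-agnostic comparison I have in mind could be as crude as counting multiply-accumulate operations (MACs) per layer. Here is a rough sketch in plain Python; the layer shapes and the two toy models are made up purely for illustration:

```python
# Rough hardware-agnostic cost estimate: count multiply-accumulates (MACs)
# per layer. All shapes below are hypothetical examples.

def dense_macs(in_features, out_features):
    # A dense layer needs in_features * out_features MACs per input
    # sample (ignoring the bias add).
    return in_features * out_features

def conv2d_macs(in_ch, out_ch, kernel, out_h, out_w):
    # Each output pixel of each output channel needs
    # in_ch * kernel * kernel MACs.
    return in_ch * out_ch * kernel * kernel * out_h * out_w

def model_macs(layers):
    # layers: list of ("dense", in, out) or
    #         ("conv", in_ch, out_ch, kernel, out_h, out_w)
    total = 0
    for kind, *args in layers:
        if kind == "dense":
            total += dense_macs(*args)
        elif kind == "conv":
            total += conv2d_macs(*args)
    return total

# Two hypothetical models to compare:
model_a = [("conv", 3, 16, 3, 32, 32), ("dense", 16 * 32 * 32, 10)]
model_b = [("dense", 3 * 32 * 32, 256), ("dense", 256, 10)]

print(model_macs(model_a))  # 606208
print(model_macs(model_b))  # 788992
```

But a raw MAC count like this obviously ignores memory traffic, parallelism, and how well each op maps onto a given accelerator, which is exactly why I'm asking whether any more principled hardware-agnostic metric exists.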

In the past, I have profiled compiled Core ML (.mlmodel) models using the built-in profiler in Xcode on a Mac, but I'm wondering whether there are less proprietary approaches. My searching hasn't turned up anything, so if there is no common method for what I'm describing, I'm curious why not: is there some reason that the relative speed of models can't be reliably estimated in a hardware-agnostic way, without directly measuring inference times on specific hardware?
