Questions tagged [flops]

5 questions
2
votes
1 answer

Does higher FLOPS mean higher throughput?

I understand that FLOPS means floating-point operations per second, and throughput is the number of inputs (for example, images) per second. If a model has higher FLOPS, it means it performs faster. However, in the article Container: Context…
BestR
1
vote
0 answers

Using a Transformer's FLOPs estimate to approximate runtime given the GPU's FLOPS

Intro: I am attempting to approximate the time it takes for a Transformer to generate tokens on a given GPU. Based on the experiments I ran, the approach below significantly underestimates the actual runtime. The model's runtime does not scale in any…
1
vote
1 answer

Given an input of shape $(3, 32, 32)$, which is convolved with a $(3 \times 3)$ kernel, how do I calculate the FLOPs?

I have an input tensor of shape $\mathbf{(3, 32, 32)}$ consisting of 3 channels, 32 rows, and 32 columns. I want to convolve the input tensor using a $\mathbf{(3 \times 3)}$ kernel/filter. How can I calculate the required FLOPs?
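For the convolution question above, one common counting convention can be sketched as follows. This is only a rough illustration, not the accepted answer: the function name `conv2d_flops` is hypothetical, and it assumes a single output channel, stride 1, no padding, no bias, and 2 FLOPs (one multiply plus one add) per multiply-accumulate. None of those details are stated in the question.

```python
def conv2d_flops(c_in, h_in, w_in, k, c_out=1, stride=1, padding=0):
    """Rough FLOP count for one 2D convolution (hypothetical helper).

    Assumptions: square k x k kernel, no bias term, and each
    multiply-accumulate counted as 2 FLOPs.
    """
    # Spatial size of the output feature map.
    h_out = (h_in + 2 * padding - k) // stride + 1
    w_out = (w_in + 2 * padding - k) // stride + 1
    # Each output element needs c_in * k * k multiply-accumulates.
    macs = c_in * k * k * h_out * w_out * c_out
    return 2 * macs


# Input (3, 32, 32), 3x3 kernel, stride 1, no padding: output is 30 x 30.
print(conv2d_flops(3, 32, 32, 3))             # 48600
# Same input with padding=1 keeps the 32 x 32 spatial size.
print(conv2d_flops(3, 32, 32, 3, padding=1))  # 55296
```

Some papers and tools report multiply-accumulates instead of FLOPs, which halves these numbers, so it is worth stating which convention is in use.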
0
votes
0 answers

Are there industry-standard or widely accepted ways of profiling DL model speed?

I understand of course that absolute model inference speed differs based on hardware, but is there some common method of profiling or estimating some "relative" model inference speed - perhaps based on number and type of operations etc - such that…
user91748
0
votes
0 answers

How can you calculate the number of FLOPs required, given model data, a prompt length, and an answer length?

I wanted to calculate the rough number of floating-point operations (or float multiplications) that are required to generate a response to a prompt. I am using Llama 3.1 70B as an example: prompt length (using words instead of tokens for…
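A back-of-the-envelope sketch of this kind of estimate, using the widely cited rule of thumb of roughly 2 FLOPs per parameter per token for a forward pass. The function name and the token counts below are hypothetical, chosen only for illustration, and the estimate ignores the attention term that grows with context length, KV-cache effects, and memory-bandwidth limits.

```python
def forward_flops(n_params, prompt_tokens, answer_tokens):
    """Rough forward-pass FLOPs to process a prompt and generate an answer.

    Assumption: about 2 FLOPs per parameter per token (one multiply and
    one add per weight). Attention cost over the context is ignored.
    """
    total_tokens = prompt_tokens + answer_tokens
    return 2 * n_params * total_tokens


# Hypothetical example: a 70-billion-parameter model,
# 100 prompt tokens and 200 generated tokens.
flops = forward_flops(70_000_000_000, prompt_tokens=100, answer_tokens=200)
print(f"{flops:.2e}")  # 4.20e+13
```

Dividing this count by a GPU's peak FLOPS gives only a lower bound on runtime, since token-by-token generation is often limited by memory bandwidth rather than compute.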