Questions tagged [flops]
5 questions
2
votes
1 answer
Does higher FLOPS mean higher throughput?
I understand that FLOPS means floating-point operations per second, and throughput is the number of inputs (for example, images) per second. If a model has higher FLOPS, it means it performs faster.
However, in the article Container: Context…
BestR
1
vote
0 answers
Using FLOPS estimate of Transformer to approximate time given GPU FLOPS per second
Intro
I am attempting to approximate the time it takes for a Transformer to generate tokens given a GPU.
Based on the experiments I ran, the approach below significantly underestimates the actual runtime. The model's runtime does not scale in any…
jr123456jr987654321
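For readers comparing a FLOP count against a GPU's rated FLOPS, here is a minimal sketch of the usual back-of-the-envelope estimate: divide the total FLOPs by the peak rate, scaled by an assumed utilization. The function name `estimate_seconds`, the `utilization` parameter, and all numbers are hypothetical and not taken from the question. The result is best read as a lower bound, since token-by-token generation is often limited by memory bandwidth rather than arithmetic, which is one plausible reason such estimates come out too low.

```python
def estimate_seconds(total_flops, peak_flops_per_s, utilization=1.0):
    """Rough runtime estimate: work divided by effective compute rate.

    utilization is the assumed fraction of peak FLOPS actually achieved
    (hypothetical parameter; real values depend on hardware and workload).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return total_flops / (peak_flops_per_s * utilization)


# Hypothetical numbers for illustration only.
workload_flops = 4.2e13   # assumed total FLOPs for the whole generation
gpu_peak_flops = 1.0e14   # assumed rated peak of the GPU, in FLOP/s

print(estimate_seconds(workload_flops, gpu_peak_flops))                   # ideal case
print(estimate_seconds(workload_flops, gpu_peak_flops, utilization=0.5))  # half of peak
```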
1
vote
1 answer
Given an input of shape $(3, 32, 32)$, which is convolved with a $(3 \times 3)$ kernel, how do I calculate the FLOPS?
I have an input tensor of shape $\mathbf{(3, 32, 32)}$ consisting of 3 channels, 32 rows, and 32 columns. I want to convolve the input tensor using a $\mathbf{(3 \times 3)}$ kernel/filter. How can I calculate the required FLOPs?
Mhasan502
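As a rough illustration of the convolution question above, the sketch below counts multiply-accumulate operations (MACs) for a 2D convolution and reports FLOPs as twice the MACs. Everything beyond the $(3, 32, 32)$ input and the $(3 \times 3)$ kernel is an assumption: one output channel, stride 1, no padding, no bias. The name `conv2d_flops` is hypothetical, and some tools report MACs rather than FLOPs, so published numbers can differ by a factor of two.

```python
def conv2d_flops(in_channels, height, width, kernel_size,
                 out_channels=1, stride=1, padding=0):
    """Return (macs, flops) for a square-kernel 2D convolution without bias.

    Each output element needs in_channels * kernel_size**2 multiply-adds;
    flops counts every multiply and every add separately (2 per MAC).
    """
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    macs_per_output = in_channels * kernel_size * kernel_size
    macs = out_h * out_w * out_channels * macs_per_output
    return macs, 2 * macs


# Input (3, 32, 32), 3x3 kernel, assumed single filter, stride 1, no padding:
# the output is 30 x 30 and each output element costs 27 multiply-adds.
macs, flops = conv2d_flops(3, 32, 32, 3)
print(macs, flops)  # 24300 48600
```

With `padding=1` the output stays at 32 x 32, and the same function gives 27648 MACs.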
0
votes
0 answers
Are there industry standard or widely accepted ways of profiling DL model speed?
I understand of course that absolute model inference speed differs based on hardware, but is there some common method of profiling or estimating some "relative" model inference speed - perhaps based on number and type of operations etc - such that…
user91748
0
votes
0 answers
How can you calculate the number of FLOPs required to process, given model data, a prompt length and an answer length?
I wanted to calculate the rough number of floating-point operations (or floating-point multiplications) required to generate a response to a prompt.
I am using Llama 3.1 70B as an example:
prompt length (using words instead of tokens for…
user2741831
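One common rule of thumb for this kind of estimate is that a forward pass of a dense Transformer costs roughly 2 FLOPs per parameter per token. The sketch below applies that approximation. It ignores the attention term that grows with context length and assumes a KV cache, so each token is processed once. The function name `generation_flops` and the token counts are hypothetical, since the question's own prompt and answer lengths are not shown in the excerpt.

```python
def generation_flops(n_params, prompt_tokens, answer_tokens):
    """Approximate forward-pass FLOPs to read a prompt and generate an answer.

    Uses the rule of thumb of ~2 FLOPs per parameter per token.
    Ignores the context-length-dependent attention cost and assumes
    a KV cache, so every token passes through the model exactly once.
    """
    total_tokens = prompt_tokens + answer_tokens
    return 2 * n_params * total_tokens


# 70 billion parameters, with hypothetical lengths of 100 prompt tokens
# and 200 answer tokens: 2 * 70e9 * 300 = 4.2e13 FLOPs.
print(generation_flops(70_000_000_000, 100, 200))  # 42000000000000
```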