Questions tagged [gpu]

Graphics processing units (GPUs) are specialized hardware for manipulating images and computing local image properties.

The mathematical bases of neural networks and of image manipulation are similar, embarrassingly parallel tasks involving matrices, which has led GPUs to become increasingly used for machine learning. As of 2016, GPUs are popular for AI work, and they continue to evolve in a direction that facilitates deep learning, both for training and for inference in devices such as self-driving cars. GPU vendors are adding connective capability for the kind of dataflow workloads AI benefits from, and as GPUs are increasingly applied to AI acceleration, manufacturers have incorporated neural-network-specific hardware: tensor cores, for example, are intended to speed up the training of neural networks. (Adapted from Wikipedia.)
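To make the matrix parallelism concrete, here is a minimal PyTorch sketch (an illustration, not part of the tag wiki): the same dense multiply that underlies a fully connected layer, dispatched to the GPU with one device change.

```python
import torch

# Same operation, two devices: a dense matrix multiply like the one
# at the heart of a fully connected layer.
a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

c_cpu = a @ b                      # runs on the CPU

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    c_gpu = a_gpu @ b_gpu          # dispatched across thousands of GPU cores
    torch.cuda.synchronize()       # CUDA kernels are asynchronous; wait for the result
```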

46 questions
9
votes
3 answers

Is a GPU always faster than a CPU for training neural networks?

Currently, I am working on a few projects that use feedforward neural networks for regression and classification of simple tabular data. I have noticed that training a neural network using TensorFlow-GPU is often slower than training the same…
GKozinski
  • 1,290
  • 11
  • 22
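One frequent explanation for this (not necessarily what the answers here say; sizes below are illustrative, not from the question): with a tiny feedforward net and tiny tabular batches, per-step host-to-device transfer and kernel-launch overhead can outweigh anything the GPU's parallelism buys.

```python
import time
import torch
import torch.nn as nn

def time_train(device, steps=200):
    # A deliberately tiny feedforward net on tiny batches:
    # the GPU has almost nothing to parallelize here.
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x, y = torch.randn(64, 10), torch.randn(64, 1)
    start = time.perf_counter()
    for _ in range(steps):
        xb, yb = x.to(device), y.to(device)   # per-step transfer overhead
        loss = ((model(xb) - yb) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    if device == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start

print("cpu :", time_train("cpu"))
if torch.cuda.is_available():
    print("cuda:", time_train("cuda"))
```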
8
votes
2 answers

Can LSTM neural networks be sped up by a GPU?

I am training LSTM neural networks with Keras on a small mobile GPU. Training is slower on the GPU than on the CPU. I found some articles that say that it is hard to train LSTMs (and, in general, RNNs) on GPUs because the training cannot be…
Dieshe
  • 289
  • 1
  • 2
  • 6
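One detail worth checking in such cases (a common cause of slow GPU LSTMs, though not necessarily the asker's): in TensorFlow 2, tf.keras.layers.LSTM only uses the fast fused cuDNN kernel when its arguments stay at the cuDNN-compatible defaults.

```python
import tensorflow as tf

# Default arguments satisfy the cuDNN requirements (tanh activation,
# sigmoid recurrent activation, recurrent_dropout=0, unroll=False),
# so this layer runs the fused GPU kernel.
fast = tf.keras.layers.LSTM(128)

# Any non-default choice below silently falls back to the generic,
# much slower implementation -- one way a GPU LSTM ends up slower
# than expected.
slow = tf.keras.layers.LSTM(128, recurrent_dropout=0.2)
```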
8
votes
2 answers

Effect of batch size and number of GPUs on model accuracy

I have a data set that was split using a fixed random seed and I am going to use 80% of the data for training and the rest for validation. Here are my GPU and batch size configurations: use a batch size of 64 with one GTX 1080 Ti; use a batch size of 128 with…
bit_scientist
  • 241
  • 2
  • 5
  • 16
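For context, a heuristic that usually comes up in answers to this kind of question (the linear scaling rule; an approximation, not a guarantee): when the effective batch size grows with the number of GPUs, scale the learning rate proportionally, or accuracy will drift between configurations.

```python
# Linear scaling rule (a common heuristic, not a guarantee):
# scale the learning rate with the effective batch size.
base_lr = 0.1           # tuned at the reference batch size
base_batch = 64         # e.g. one GTX 1080 Ti
num_gpus = 2
per_gpu_batch = 64

effective_batch = per_gpu_batch * num_gpus        # 128
lr = base_lr * effective_batch / base_batch       # 0.2
print(f"effective batch {effective_batch}, lr {lr}")
```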
5
votes
3 answers

For an LLM, how can I estimate its memory requirements based on storage usage?

It is easy to see the amount of disk space consumed by an LLM (downloaded from Hugging Face, for instance). Just go into the relevant directory and check the file sizes. How can I estimate the amount of GPU RAM required to run the model? For…
ahron
  • 265
  • 2
  • 7
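A rough rule that answers to this question tend to converge on: the checkpoint stores the weights, so VRAM must hold at least the on-disk size, plus runtime overhead for the KV cache, activations, and the CUDA context. A sketch, where the 1.2x factor is an assumption rather than a fixed constant:

```python
import os

def estimate_vram_gb(model_dir, overhead=1.2):
    """Rough lower bound: the weight files must fit in VRAM, plus
    runtime overhead (KV cache, activations, CUDA context).
    The overhead factor is an assumption; long contexts need more."""
    size = sum(
        os.path.getsize(os.path.join(root, f))
        for root, _, files in os.walk(model_dir)
        for f in files
    )
    return size * overhead / 1024**3

# e.g. a 7B model stored in fp16 is ~14 GB on disk -> ~17 GB estimate
```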
5
votes
3 answers

How does a transformer leverage the GPU to be trained faster than RNNs?

How does a transformer leverage the GPU to be trained faster than RNNs? I understand the parameter space of the transformer might be significantly larger than that of the RNN. But why can the transformer structure leverage multiple GPUs, and…
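The usual answer in miniature (untrained toy tensors; learned projections omitted): an RNN must walk the sequence step by step, while self-attention expresses the whole sequence as dense matrix products that the GPU can parallelize across all positions at once.

```python
import torch

T, d = 512, 64
x = torch.randn(T, d)

# RNN: each step depends on the previous hidden state, so the T steps
# run one after another no matter how many GPU cores are available.
W = torch.randn(d, d)
h = torch.zeros(d)
for t in range(T):
    h = torch.tanh(x[t] + h @ W)       # sequential chain of length T

# Self-attention: all positions are processed at once as dense matmuls,
# which a GPU (or several, via standard tensor/data parallelism)
# computes in parallel over the whole sequence.
q = k = v = x                          # untrained projections omitted
scores = (q @ k.T) / d ** 0.5          # (T, T), computed in parallel
out = torch.softmax(scores, dim=-1) @ v
```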
3
votes
0 answers

Good model and training algorithm to store texture data for fast GPU inference

Now, the following may sound silly, but I want to do it to better understand the performance and implementation of GPU inference for a set of deep learning problems. What I want to do is to replace a surface texture for a 3D model by a NN that…
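One plausible reading of the question (my assumption, in the spirit of coordinate networks such as NeRF): fit a small MLP that maps a (u, v) texture coordinate to an RGB value, then evaluate the net at render time instead of sampling the stored texture.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a tiny coordinate network mapping (u, v) in
# [0, 1]^2 to RGB. At render time the shader would evaluate this net
# instead of doing a texture lookup.
net = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3), nn.Sigmoid(),    # RGB in [0, 1]
)

uv = torch.rand(1024, 2)               # random texture coordinates
rgb = torch.rand(1024, 3)              # in practice: samples of the real texture
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(200):
    loss = ((net(uv) - rgb) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In practice a Fourier-feature encoding of the coordinates is usually needed before the first layer to recover high-frequency texture detail.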
3
votes
1 answer

Why is training a network with 4 GPUs not exactly 4 times faster than with one GPU?

Training a neural network with 4 GPUs using PyTorch, the performance is not even 2 times (between 1 and 2 times) that of one GPU. From nvidia-smi we see GPU usage for a few milliseconds, and for the next 5-10 seconds it looks like data is off-loaded and loaded…
Troy
  • 83
  • 4
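The pattern in the excerpt (bursts of GPU activity separated by seconds of idling) usually points at the input pipeline rather than the GPUs themselves. A common PyTorch mitigation, with illustrative values:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset; replace with the real one.
dataset = TensorDataset(torch.randn(1_000, 3, 64, 64),
                        torch.randint(0, 10, (1_000,)))

loader = DataLoader(
    dataset,
    batch_size=256,
    shuffle=True,
    num_workers=8,       # prepare batches in parallel worker processes
    pin_memory=True,     # enables faster, asynchronous host-to-GPU copies
    prefetch_factor=4,   # queue batches ahead so the GPUs never starve
)
```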
3
votes
2 answers

How can I reduce the GPU memory usage with large images?

I am trying to train a CNN-LSTM model. The size of my images is 640x640. I have a GTX 1080 Ti (11 GB). I am using Keras with the TensorFlow backend. Here is the model. img_input_1 = Input(shape=(1, n_width, n_height, n_channels)) conv_1 =…
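Beyond shrinking the batch size, one lever that often helps on memory-bound Keras models is mixed precision, which keeps float32 master weights but computes most activations in float16 (available in TensorFlow 2.4+; the savings vary by model):

```python
from tensorflow.keras import mixed_precision

# Compute most ops in float16 while keeping float32 master weights.
# On an 11 GB card this roughly halves activation memory (an estimate;
# actual savings depend on the model).
mixed_precision.set_global_policy("mixed_float16")

# Keep the final layer's output in float32 for numerical stability, e.g.
# Activation("linear", dtype="float32") after the last Dense layer.
```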
2
votes
1 answer

Complete formula to get LLM VRAM usage

I would like to find the GPU size required to run a hypothetical LLM, considering all possible factors, like: P: model parameters (total, or MoE active parameters); Q: quantization bits; C: context length cap (from what I understand, the context can…
rikyeah
  • 121
  • 2
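Pulling the listed factors together, a commonly used first-order estimate covers the weights plus the KV cache (assuming a standard decoder; grouped-query attention, quantized caches, and framework overhead all shift the numbers):

```python
def llm_vram_bytes(P, Q, C, n_layers, d_model, n_heads, n_kv_heads, kv_bits=16):
    """First-order VRAM estimate for inference.

    P: parameters that must be resident (for MoE, usually all experts)
    Q: weight quantization bits
    C: context length in tokens
    KV cache: 2 (K and V) * layers * C * head_dim * kv_heads * bytes.
    """
    weights = P * Q / 8
    head_dim = d_model // n_heads
    kv_cache = 2 * n_layers * C * head_dim * n_kv_heads * kv_bits / 8
    return weights + kv_cache

# e.g. a 7B model at 4-bit with an 8k context (Llama-2-7B-like shapes)
est = llm_vram_bytes(P=7e9, Q=4, C=8192, n_layers=32,
                     d_model=4096, n_heads=32, n_kv_heads=32)
print(f"{est / 1024**3:.1f} GiB")   # ~3.3 GiB weights + ~4 GiB KV cache
```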
2
votes
1 answer

Has anyone tried to use llama.cpp with NVLink?

Apparently it's possible to pool the memory of two 3090s using NVLink (although not with 4090s). This would make it possible to run large LLMs on consumer hardware. https://huggingface.co/transformers/v4.9.2/performance.html Although before I invest…
user2741831
  • 135
  • 6
2
votes
0 answers

How does one deal with images that are too large to fit in the GPU memory for doing ML image analysis?

How does one deal with images that are too large to fit in the GPU memory for doing ML image analysis? I am interested in detecting small structures on images which are themselves many GB in size. Beyond simple downsampling and maybe doing…
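The standard workaround, assuming the model is fully convolutional (my assumption, since the question is truncated), is tiled inference: only one overlapping crop is resident on the GPU at a time, and the per-tile predictions are stitched back together.

```python
import numpy as np

def tiled_predict(image, model, tile=1024, overlap=128):
    """Run `model` (a function crop -> per-pixel scores of the same
    spatial size) over overlapping tiles and stitch the results;
    only one tile is ever resident on the GPU."""
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.float32)
    step = tile - overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            crop = image[y:y + tile, x:x + tile]
            # Overlapping borders are simply overwritten here;
            # real pipelines blend or crop the tile margins.
            out[y:y + crop.shape[0], x:x + crop.shape[1]] = model(crop)
    return out
```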
2
votes
0 answers

How to train neural networks with multiprocessing?

I am trying to figure out how multiprocessing works in neural networks. In the example I've seen, the database is split into $x$ parts (depending on how many workers you have) and each worker is responsible for training the network using a different…
Yedidya kfir
  • 121
  • 1
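The scheme the excerpt describes is data parallelism: each worker computes gradients on its own shard, and averaging those gradients is arithmetically equivalent to one step on the combined batch. A bare-tensor sketch of the update:

```python
import torch

# Two workers compute gradients on different data shards for the same
# weights; averaging the gradients equals one step on the combined batch.
w = torch.zeros(3)
grad_worker_0 = torch.tensor([0.2, -0.4, 0.1])   # from shard 0
grad_worker_1 = torch.tensor([0.4, -0.2, 0.3])   # from shard 1

avg_grad = (grad_worker_0 + grad_worker_1) / 2
w -= 0.1 * avg_grad   # identical update applied on every worker
```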
2
votes
2 answers

How do GPUs facilitate the training of a deep learning architecture?

I would love to know in detail how exactly GPUs help, in technical terms, in training deep learning models. To my understanding, GPUs help in performing independent tasks simultaneously to improve the speed. For example, in calculation of the…
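A concrete instance of that "independent tasks" intuition: frameworks lower convolution to a single large matrix multiply via im2col, turning every output pixel into an independent row. A numpy sketch for one channel (valid padding, cross-correlation convention, as in deep learning frameworks):

```python
import numpy as np

def im2col_conv(img, kernel):
    """Lower a 2D convolution to one matrix multiply: each output
    pixel becomes an independent row, so a GPU can compute all of
    them in parallel."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    cols = np.stack([
        img[y:y + kh, x:x + kw].ravel()
        for y in range(oh) for x in range(ow)
    ])                                        # (oh*ow, kh*kw)
    return (cols @ kernel.ravel()).reshape(oh, ow)

out = im2col_conv(np.random.rand(8, 8), np.random.rand(3, 3))
```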
2
votes
0 answers

In addition to matrix algebra, can GPUs also handle the various kernel functions for neural networks?

I've read a number of articles on how GPUs can speed up matrix algebra calculations, but I'm wondering how calculations are performed when one uses various kernel functions in a neural network. If I use sigmoid functions in my neural network, does…
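For what it's worth, in the major frameworks activation functions such as the sigmoid run as elementwise kernels on the GPU, with no round-trip to the CPU. A one-look PyTorch check:

```python
import torch

if torch.cuda.is_available():
    z = torch.randn(1024, 1024, device="cuda")  # pre-activations on the GPU
    a = torch.sigmoid(z)     # elementwise CUDA kernel, one thread per element
    print(a.device)          # cuda:0 -- the data never left the GPU
```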
1
vote
0 answers

Using FLOPS estimate of Transformer to approximate time given GPU FLOPS per second

Intro: I am attempting to approximate the time it takes for a Transformer to generate tokens given a GPU. Based on experiments I ran, the approach below significantly underestimates the actual runtime. The model's runtime does not scale in any…
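For reference, the usual first-order accounting, and the standard explanation for why a pure-FLOPs model underestimates decode time: autoregressive generation is typically memory-bandwidth bound, since every weight is read once per generated token.

```python
def decode_time_s(P, n_tokens, flops_per_s, mem_bw_bytes, bytes_per_param=2):
    """Two lower bounds per generated token:
    compute: ~2 * P FLOPs per token (the standard 2*params estimate);
    memory:  every weight is read once per token during decoding.
    Decoding is usually limited by the memory bound, which is why
    FLOPs-only models underestimate runtime."""
    compute = n_tokens * 2 * P / flops_per_s
    memory = n_tokens * P * bytes_per_param / mem_bw_bytes
    return max(compute, memory)

# e.g. a 7B fp16 model on a GPU with 300 TFLOP/s and 1 TB/s bandwidth:
# the memory bound (~14 ms/token) dominates the compute bound.
print(decode_time_s(7e9, n_tokens=100, flops_per_s=3e14, mem_bw_bytes=1e12))
```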