I am looking to install llama.cpp from the official repo https://github.com/ggerganov/llama.cpp.
May someone help me please? There is no Ubuntu tutorial on YouTube and I don't want to follow ChatGPT for something so important.
Clone the llama.cpp files locally. Open a terminal in the folder where you want the app:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
Download a model and place it into the 'models' subfolder. For example: https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin
Notes: Better model = better results.
Yet, there are also restrictions. For example, the 65B model 'alpaca-lora-65B.ggml.q5_1.bin' (5-bit) takes ~49 GB of disk space and requires ~51 GB of RAM.
Hopefully in the future we'll find even better ones.
Also, different model files have different requirements depending on whether they run on the CPU only or also use a GPU (and which brand - AMD or NVIDIA).
To make the best use of your hardware, check the available models.
The 7B model 'ggml-alpaca-7b-q4.bin' works without a dedicated graphics card - it's a light one to start with.
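As a back-of-the-envelope rule (my own sketch, not from the llama.cpp docs): a quantized model takes roughly billions-of-parameters x effective-bits-per-weight / 8 gigabytes on disk, which lines up with the ~49 GB figure for the 5-bit 65B model. The effective bit counts below (~6 for q5_1 with its per-block scales, ~4.5 for q4) are assumptions; check real file sizes on the model page.

```shell
# Rough disk-size estimate for a quantized model:
# billions of params * effective bits per weight / 8 ~= gigabytes.
estimate_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.0f\n", p * b / 8 }'
}
estimate_gb 65 6    # 65B at ~6 bits/weight -> 49
estimate_gb 7 4.5   # 7B at ~4.5 bits/weight -> 4
```

RAM needs tend to be slightly above the file size, since the whole model is loaded plus working buffers.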
From the llama.cpp folder (here ~/Documents/Llama/llama.cpp), run:
make -j && ./main -m ./models/ggml-alpaca-7b-q4.bin -p "What is the best gift for my wife?" -n 512
Result:
https://www.youtube.com/watch?v=nVC9D9fRyNU (via https://discord.com/channels/1018992679893340160/1094185166060138547/threads/1094187855854719007)
P.S. The easiest local AI installation is to download the 'one-click-installers' from https://github.com/oobabooga/one-click-installers (and follow the prompt messages).
For Ubuntu / Terminal:
$ chmod +x start_linux.sh
$ ./start_linux.sh
Yet, it's not a perfect world - I had a few failed attempts along the way.
The easiest way is to run this: curl -fsSL https://www.ollama.com/install.sh | sh (note this installs Ollama, a separate project built on top of llama.cpp, rather than llama.cpp itself)
You can download the latest version from https://github.com/ggerganov/llama.cpp/releases/
At the time of writing, the latest version is b4610. I am using Ubuntu on a machine with x64 architecture:
mkdir llama.cpp
cd llama.cpp
wget https://github.com/ggerganov/llama.cpp/releases/download/b4610/llama-b4610-bin-ubuntu-x64.zip
unzip llama-b4610-bin-ubuntu-x64.zip
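If you want to script this for a different release tag, the URL follows a predictable pattern. This pattern is inferred from the b4610 example above, so verify it against the releases page before relying on it (asset names can change between releases):

```shell
# Build the download URL for a given release tag (pattern inferred
# from the b4610 asset name; verify on the releases page).
release_url() {
  printf 'https://github.com/ggerganov/llama.cpp/releases/download/%s/llama-%s-bin-ubuntu-x64.zip\n' "$1" "$1"
}
release_url b4610
```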
If you decide to build llama.cpp yourself, I recommend following their official manual at: https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md
You may need to install some packages:
sudo apt update
sudo apt install build-essential
sudo apt install cmake
Download and build llama.cpp:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
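One optional tweak (this is standard CMake, not specific to llama.cpp): pass -j to cmake --build to compile on all CPU cores, which speeds the build up considerably:

```shell
# Use all CPU cores for the build; nproc reports the core count on Linux.
jobs=$(nproc)
echo "building with ${jobs} parallel jobs"
# cmake --build build --config Release -j "${jobs}"
```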
Whichever path you followed, you will have your llama.cpp binaries in the folder llama.cpp/build/bin/.
Use HuggingFace to download models
If you are using HuggingFace, you can pass the -hf option and it will download the model you want. Models downloaded this way are stored in ~/.cache/llama.cpp:
cd llama.cpp/build/bin/
./llama-cli -hf bartowski/Mistral-Small-24B-Instruct-2501-GGUF
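To see what -hf has already downloaded, you can simply list that cache folder (a trivial sketch; the path comes from the note above):

```shell
# List models already fetched with -hf.
cache_dir="$HOME/.cache/llama.cpp"
if [ -d "$cache_dir" ]; then
  ls -lh "$cache_dir"
else
  echo "nothing cached yet at $cache_dir"
fi
```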
Load model from other location
If you have already downloaded your model somewhere else, you can always use the -m option, like this: llama-cli -m [path_to_model]. For example, I keep my models in my home folder under ~/models/, so if I want to use the Mistral-Small-24B-Instruct-2501-GGUF model my command is:
cd llama.cpp/build/bin/
./llama-cli -m ~/models/Mistral-Small-24B-Instruct-2501-GGUF
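Since llama-cli errors out if the -m path is wrong, a small wrapper that checks the file first can save some head-scratching. This is my own sketch; llama-cli and the model path are just the examples from above:

```shell
# Check that the model file exists before launching llama-cli.
run_model() {
  model="$1"; shift
  if [ ! -f "$model" ]; then
    echo "model not found: $model" >&2
    return 1
  fi
  ./llama-cli -m "$model" "$@"
}
# Example (adjust the path to your own model file):
# run_model ~/models/Mistral-Small-24B-Instruct-2501-GGUF
```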