
I am looking to install llama.cpp from the official repo https://github.com/ggerganov/llama.cpp.

Can someone help me, please? There is no Ubuntu tutorial on YouTube, and I don't want to rely on ChatGPT for something this important.

sotirov
  • 4,379
Pablo
  • 81

3 Answers

  1. Clone llama.cpp locally. Open a terminal in the folder where you want the app.

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    make
  2. Download a model and place it in the 'models' subfolder. For example: https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin

Notes: A better model generally gives better results, but there are also hardware restrictions. For example, the 65B model 'alpaca-lora-65B.ggml.q5_1.bin' (5-bit) needs about 49 GB of disk space and 51 GB of RAM.
Hopefully we'll find even better ones in the future. Also, there are different model files (and build requirements) depending on whether you use only the CPU or also a GPU (and from which brand - AMD, NVIDIA).

To make the best use of your hardware, check the available models.
The 7B model 'ggml-alpaca-7b-q4.bin' works without any extra graphics card, so it's a light one to start with.
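Before downloading a large model, you can check whether it fits your machine from the terminal. A minimal sketch (the 49 GB disk / 51 GB RAM figures above are for the 65B q5_1 file):

```shell
# Print total RAM and the free disk space in the current folder
awk '/MemTotal/ {printf "RAM: %.0f GB\n", $2/1048576}' /proc/meminfo
df -h . | awk 'NR==2 {print "Free disk:", $4}'
```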

  3. Update the path, then run (question) prompts from the terminal

/Documents/Llama/llama.cpp$

make -j && ./main -m ./models/ggml-alpaca-7b-q4.bin -p "What is the best gift for my wife?" -n 512

Result:

[Screenshot: how the terminal command and its output look]


Source: https://github.com/ggerganov/llama.cpp

It would be great to:
1. Check for a better (web) GUI on top of the terminal.
2. Add a persona, for example:

https://www.youtube.com/watch?v=nVC9D9fRyNU from https://discord.com/channels/1018992679893340160/1094185166060138547/threads/1094187855854719007
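A persona in llama.cpp is usually just a prompt file; the repo ships examples in its prompts/ folder (e.g. chat-with-bob.txt). A minimal sketch, assuming the 7B model from step 2 (the file contents here are an example):

```shell
# Hypothetical persona file; llama.cpp ships similar ones in prompts/
cat > persona.txt <<'EOF'
Transcript of a dialog where the User talks with an assistant named Bob.
Bob is helpful, concise, and friendly.

User:
EOF
# -f loads the prompt file, -i runs interactively, -r stops generation at "User:"
# ./main -m ./models/ggml-alpaca-7b-q4.bin -f persona.txt -i -r "User:"
```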



P.S. The easiest local AI installation is to download the 'one-click-installers' from https://github.com/oobabooga/one-click-installers (and follow the prompt messages).

For Ubuntu / terminal:

$ chmod +x start_linux.sh
$ ./start_linux.sh

Yet it's not a perfect world. My failed attempts included:

  • OobaBooga failed on my laptop hardware (no GPU found). Bug reported. It also looks like the model I selected cannot run without an NVIDIA graphics card.
  • Dalai failed due to folder-permission restrictions and a few version-compatibility issues, so I skipped it, even though it looked promising. https://github.com/cocktailpeanut/dalai

The easiest way is to install Ollama (which builds on llama.cpp): curl -fsSL https://www.ollama.com/install.sh | sh


Ubuntu 24.04

Option 1: Download pre-built binaries (recommended)

You can download the latest version from https://github.com/ggerganov/llama.cpp/releases/

At the time of writing, the latest release is b4610. I am using Ubuntu on a machine with x64 architecture:

mkdir llama.cpp
cd llama.cpp
wget https://github.com/ggerganov/llama.cpp/releases/download/b4610/llama-b4610-bin-ubuntu-x64.zip
unzip llama-b4610-bin-ubuntu-x64.zip
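The zip extracts into build/bin/ next to its shared libraries. You can put that folder on PATH for the current shell session (a sketch, run from the llama.cpp folder):

```shell
# Make the unzipped binaries callable without the full path
export PATH="$PWD/build/bin:$PATH"
# llama-cli --version   # verify once the binaries are unzipped
```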

Option 2: Build locally

If you decide to build llama.cpp yourself, I recommend following their official manual at: https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md

You may need to install some packages:

sudo apt update
sudo apt install build-essential
sudo apt install cmake
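You can quickly confirm the toolchain is in place before configuring (a minimal check; 'build-essential' provides gcc, g++, and make):

```shell
# Report which build tools are available on this machine
for t in gcc g++ make cmake; do
  command -v "$t" >/dev/null && echo "$t: ok" || echo "$t: missing"
done
```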

Download and build llama.cpp:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

Using llama.cpp:

Whichever path you followed, you will have your llama.cpp binaries in the folder llama.cpp/build/bin/.

  • Use HuggingFace to download models

    If you are using HuggingFace, you can use the -hf option to download the model you want. Models downloaded this way are stored in ~/.cache/llama.cpp:

    cd llama.cpp/build/bin/
    ./llama-cli -hf bartowski/Mistral-Small-24B-Instruct-2501-GGUF
    
  • Load model from other location

    If you have already downloaded your model somewhere else, you can always use the -m option, like this: llama-cli -m [path_to_model]. For example, I keep my models in my home folder under ~/models/, so if I want to use the Mistral-Small-24B-Instruct-2501-GGUF model my command is:

    cd llama.cpp/build/bin/
    ./llama-cli -m ~/models/Mistral-Small-24B-Instruct-2501-GGUF
    
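Models fetched with -hf accumulate under ~/.cache/llama.cpp; a quick way to see how much space they use (prints a note if nothing has been cached yet):

```shell
# Show the total size of the llama.cpp download cache, if any
du -sh ~/.cache/llama.cpp 2>/dev/null || echo "no cache yet"
```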