7

I am looking to install llama.cpp from the official repo https://github.com/ggerganov/llama.cpp.

Could someone help me, please? There is no Ubuntu tutorial on YouTube, and I don't want to rely on ChatGPT for something this important.

sotirov
  • 4,455
Pablo
  • 101

4 Answers

3
  1. Get the llama.cpp source locally. Open a terminal in the folder where you want the app:

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    make
  2. Download a model and place it in the 'models' subfolder (a download command is shown after the notes below). For example: https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin

Notes: a better model generally gives better results, but there are also hardware restrictions. For example, the 65B model 'alpaca-lora-65B.ggml.q5_1.bin' (5-bit) takes about 49 GB of disk space and requires about 51 GB of RAM.
Hopefully we'll find even better ones in the future. Also, there are different files (and requirements) depending on whether a model will use only the CPU or also a GPU (and from which brand: AMD or NVIDIA).

To make the best use of your hardware, check the available models.
The 7B model 'ggml-alpaca-7b-q4.bin' works without any extra graphics card, so it is a light one to start with.
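
For example, a minimal way to fetch that 7B model straight into the models subfolder from the terminal (assuming you are inside the llama.cpp folder and have wget installed; the URL is the one from step 2):

wget -P models https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin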

  3. Update the path and run (question) prompts from the terminal:

/Documents/Llama/llama.cpp$

make -j && ./main -m ./models/ggml-alpaca-7b-q4.bin -p "What is the best gift for my wife?" -n 512

Result:

(Screenshot: the terminal command and its output.)


Source: https://github.com/ggerganov/llama.cpp

It would be great to:
1. Check for a better (Web) GUI on top of the terminal.
2. Add a persona, like:

https://www.youtube.com/watch?v=nVC9D9fRyNU from https://discord.com/channels/1018992679893340160/1094185166060138547/threads/1094187855854719007



P.S. The easiest local AI installation is to download the 'one-click-installer' from https://github.com/oobabooga/one-click-installers (and follow the prompt messages).

For Ubuntu / terminal:

$ chmod +x start_linux.sh
$ ./start_linux.sh

Yet it's not a perfect world. My failed attempts included:

  • Oobabooga failed on my laptop hardware (no GPU found). The bug has been reported, and it looks like the model I selected cannot work without an NVIDIA graphics card.
  • Dalai failed due to folder permission restrictions and a few version compatibility issues, so I skipped it, even though it looked promising: https://github.com/cocktailpeanut/dalai
  • In short, results are biased by the model (for example, a 4 GB Wikipedia.zip vs. a 120 GB wiki.zip) and by the software on top of it (like llama.cpp). I'd like to have it without too many restrictions. For example, I've tested Bing, ChatGPT, LLaMA, ... and some answers are considered impolite or not legal (in that region). – Sergiusz Golec Apr 30 '23 at 14:22
  • 1
    This reads like a list of suggestions, not a clear answer. – David DE Apr 30 '23 at 14:57
2

Ubuntu 24.04

Option 1: Download pre-built binaries (recommended)

You can download the latest version from https://github.com/ggerganov/llama.cpp/releases/

At the time of writing, the latest version is b4610; I am using Ubuntu on a machine with x64 architecture:

mkdir llama.cpp
cd llama.cpp
wget https://github.com/ggerganov/llama.cpp/releases/download/b4610/llama-b4610-bin-ubuntu-x64.zip
unzip llama-b4610-bin-ubuntu-x64.zip
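
The Ubuntu zip unpacks into a build/bin/ subfolder (as noted below). As a quick sanity check you can print the version of the binaries you just downloaded; a small sketch, assuming --version is available in your build (it is a standard llama.cpp flag):

./build/bin/llama-cli --version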

Option 2: Build locally

If you decide to build llama.cpp yourself, I recommend following the official build guide at https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md

You may need to install some packages:

sudo apt update
sudo apt install build-essential
sudo apt install cmake

Download and build llama.cpp:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
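
On a multi-core machine the build step can be sped up by passing a job count. This is a standard CMake option rather than anything llama.cpp-specific:

cmake --build build --config Release -j "$(nproc)"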

Using llama.cpp:

Whichever path you followed, you will have your llama.cpp binaries in the folder llama.cpp/build/bin/.

  • Use HuggingFace to download models

    If the model is hosted on Hugging Face, you can use the -hf option and llama-cli will download the model for you. Models downloaded this way are stored in ~/.cache/llama.cpp:

    cd llama.cpp/build/bin/
    ./llama-cli -hf bartowski/Mistral-Small-24B-Instruct-2501-GGUF
    
  • Load model from other location

    If you have already downloaded your model somewhere else, you can always use the -m option, like this: llama-cli -m [path_to_model]. For example, I keep my models in my home folder under ~/models/, so if I want to use a Mistral-Small-24B-Instruct-2501 GGUF file my command is:

    cd llama.cpp/build/bin/
    ./llama-cli -m ~/models/Mistral-Small-24B-Instruct-2501.gguf
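
Once a model is loaded you can also pass a one-shot prompt directly, using the -p and -n options shown in the first answer; a sketch, reusing the placeholder model path from above:

cd llama.cpp/build/bin/
./llama-cli -m ~/models/Mistral-Small-24B-Instruct-2501.gguf -p "What is the best gift for my wife?" -n 128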
    
sotirov
  • 4,455
2

The easiest way is to install Ollama:

curl -fsSL https://www.ollama.com/install.sh | sh
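
After the install script finishes, a model can be pulled and started with a single command. The model name below is only an example from the Ollama library, not something required by this answer:

ollama run llama2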

  • While ollama is definitely the more convenient option, it is worth mentioning that llama.cpp is the backend of ollama, and as such it sometimes exposes more functionality, e.g. Vulkan backend support: https://github.com/ollama/ollama/pull/5059 – Ciro Santilli OurBigBook.com Jul 16 '25 at 07:01
0

I followed @sotirov's answer and it works, but I needed to first install curl and libcurl4-openssl-dev:

sudo apt install curl libcurl4-openssl-dev
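
The underlying reason is that, with the default CMake configuration, llama.cpp links against libcurl for its download features (such as the -hf option above). If you would rather not install libcurl, the build can be configured without it; a sketch, assuming the LLAMA_CURL CMake option:

cmake -B build -DLLAMA_CURL=OFF
cmake --build build --config Release
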
  • sotirov's answer doesn't use curl, so I don't see why installing it would be required, unless you used curl instead of wget because you prefer curl. In that case, it is just your preference, and one wouldn't need to install the packages you mention in your answer. Or maybe you used Soyalguien2324's answer? Or maybe curl is somewhere implicitly required but I don't see it? Can you please clarify the case? – BeastOfCaerbannog Aug 03 '25 at 10:14