7

I am looking to install llama.cpp from the official repo https://github.com/ggerganov/llama.cpp.

Could someone help me, please? There is no Ubuntu tutorial on YouTube, and I don't want to rely on ChatGPT for something this important.

sotirov
  • 4,455
Pablo
  • 101

4 Answers

3
  1. Get the llama.cpp source locally. Open a terminal in the folder where you want the app:

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    make
  2. Download a model and place it in the 'models' subfolder (a download command is shown after the notes below). For example: https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin

Notes: a better model generally gives better results, but there are also hardware restrictions. For example, the 65B model 'alpaca-lora-65B.ggml.q5_1.bin' (5-bit) takes about 49 GB of disk space and requires about 51 GB of RAM.
Hopefully we'll find even better ones in the future. Also, there are different files (and requirements) depending on whether a model will use only the CPU or also a GPU (and from which brand: AMD or NVIDIA).

To make the best use of your hardware, check the available models.
The 7B model 'ggml-alpaca-7b-q4.bin' works without any extra graphics card, so it is a light one to start with.
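
For example, a minimal way to fetch that 7B model straight into the models subfolder from the terminal (assuming you are inside the llama.cpp folder and have wget installed; the URL is the one from step 2):

wget -P models https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin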

  3. Update the path and run (question) prompts from the terminal:

/Documents/Llama/llama.cpp$

make -j && ./main -m ./models/ggml-alpaca-7b-q4.bin -p "What is the best gift for my wife?" -n 512

Result:

(Screenshot: the terminal command and its output.)


Source: https://github.com/ggerganov/llama.cpp

It would be great to:
1. Check for a better (Web) GUI on top of the terminal.
2. Add a persona, like:

https://www.youtube.com/watch?v=nVC9D9fRyNU from https://discord.com/channels/1018992679893340160/1094185166060138547/threads/1094187855854719007



P.S. The easiest local AI installation is to download the 'one-click-installer' from https://github.com/oobabooga/one-click-installers (and follow the prompt messages).

For Ubuntu / terminal:

$ chmod +x start_linux.sh
$ ./start_linux.sh

Yet it's not a perfect world. My failed attempts included:

  • Oobabooga failed on my laptop hardware (no GPU found). The bug has been reported, and it looks like the model I selected cannot work without an NVIDIA graphics card.
  • Dalai failed due to folder permission restrictions and a few version compatibility issues, so I skipped it, even though it looked promising: https://github.com/cocktailpeanut/dalai
  • In short, results are biased by the model (for example, a 4 GB Wikipedia.zip vs. a 120 GB wiki.zip) and by the software on top of it (like llama.cpp). I'd like to have it without too many restrictions. For example, I've tested Bing, ChatGPT, LLaMA, ... and some answers are considered impolite or not legal (in that region). – Sergiusz Golec Apr 30 '23 at 14:22
  • 1
    This reads like a list of suggestions, not a clear answer. – David DE Apr 30 '23 at 14:57
2

Ubuntu 24.04

Option 1: Download pre-built binaries (recommended)

You can download the latest version from https://github.com/ggerganov/llama.cpp/releases/

At the time of writing, the latest version is b4610; I am using Ubuntu on a machine with x64 architecture:

mkdir llama.cpp
cd llama.cpp
wget https://github.com/ggerganov/llama.cpp/releases/download/b4610/llama-b4610-bin-ubuntu-x64.zip
unzip llama-b4610-bin-ubuntu-x64.zip
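
The Ubuntu zip unpacks into a build/bin/ subfolder (as noted below). As a quick sanity check you can print the version of the binaries you just downloaded; a small sketch, assuming --version is available in your build (it is a standard llama.cpp flag):

./build/bin/llama-cli --version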

Option 2: Build locally

If you decide to build llama.cpp yourself, I recommend following the official build guide at https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md

You may need to install some packages:

sudo apt update
sudo apt install build-essential
sudo apt install cmake

Download and build llama.cpp:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
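
On a multi-core machine the build step can be sped up by passing a job count. This is a standard CMake option rather than anything llama.cpp-specific:

cmake --build build --config Release -j "$(nproc)"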

Using llama.cpp:

Whichever path you followed, you will have your llama.cpp binaries in the folder llama.cpp/build/bin/.

  • Use HuggingFace to download models

    If the model is hosted on Hugging Face, you can use the -hf option and llama-cli will download the model for you. Models downloaded this way are stored in ~/.cache/llama.cpp:

    cd llama.cpp/build/bin/
    ./llama-cli -hf bartowski/Mistral-Small-24B-Instruct-2501-GGUF
    
  • Load model from other location

    If you have already downloaded your model somewhere else, you can always use the -m option, like this: llama-cli -m [path_to_model]. For example, I keep my models in my home folder under ~/models/, so if I want to use a Mistral-Small-24B-Instruct-2501 GGUF file my command is:

    cd llama.cpp/build/bin/
    ./llama-cli -m ~/models/Mistral-Small-24B-Instruct-2501.gguf
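
Once a model is loaded you can also pass a one-shot prompt directly, using the -p and -n options shown in the first answer; a sketch, reusing the placeholder model path from above:

cd llama.cpp/build/bin/
./llama-cli -m ~/models/Mistral-Small-24B-Instruct-2501.gguf -p "What is the best gift for my wife?" -n 128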
    
sotirov
  • 4,455
2

The easiest way is to install Ollama:

curl -fsSL https://www.ollama.com/install.sh | sh
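
After the install script finishes, a model can be pulled and started with a single command. The model name below is only an example from the Ollama library, not something required by this answer:

ollama run llama2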

  • While ollama is definitely the more convenient option, it is worth mentioning that llama.cpp is the backend of ollama, and as such it sometimes exposes more functionality, e.g. Vulkan backend support: https://github.com/ollama/ollama/pull/5059 – Ciro Santilli OurBigBook.com Jul 16 '25 at 07:01
0

I followed @sotirov's answer and it works, but I needed to first install curl and libcurl4-openssl-dev:

sudo apt install curl libcurl4-openssl-dev
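
The underlying reason is that, with the default CMake configuration, llama.cpp links against libcurl for its download features (such as the -hf option above). If you would rather not install libcurl, the build can be configured without it; a sketch, assuming the LLAMA_CURL CMake option:

cmake -B build -DLLAMA_CURL=OFF
cmake --build build --config Release
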
  • sotirov's answer doesn't use curl, so I don't see why installing it would be required, unless you used curl instead of wget because you prefer curl. In that case, it is just your preference, and one wouldn't need to install the packages you mention in your answer. Or maybe you used Soyalguien2324's answer? Or maybe curl is somewhere implicitly required but I don't see it? Can you please clarify the case? – BeastOfCaerbannog Aug 03 '25 at 10:14