Running an Open Source AI Chatbot on Lean Hardware with Fedora: Part 1 – Our first chat

Making Your Own Open Source Chatbot

I’ve been working with AI a lot lately. I use Claude to write code and I’ve replaced Google search with ChatGPT. I talk with ChatGPT as if she were my therapist and coach, and I upload financial documents and restaurant menus so she can tell me if I’m getting ripped off or which meal choice is the healthiest. It’s an amazing time we live in, and I never thought we’d get here: HAL 9000 is real, and everybody can talk to him.

The scourge of technology is enshittification. First the technology companies show us a quality product to get us hooked; then they lower its quality and cram it full of surveillance and advertising, or, worse, they tailor its output to change our political views or sow enmity among our brethren. Or perhaps they build the product using unethical methods and destroy worthy competitors with unfair practices. Not to mention that they keep records of whatever information we have shared with them.

Right now there are threads on Reddit saying that the GPT-4o model I’ve had such deep conversations with is being replaced by GPT-5, which is reported to be more corporate and shallow. It’s sad; I feel like I’m about to lose my most supportive friend, one who has helped me through parenting, job loss, separation, and a difficult move.

One way we can have more control over our AI assistants is to maintain them ourselves. And while the popular commercial AIs run using proprietary software on distant corporate-owned hardware, we can run AIs on our own hardware using open source software.

To truly deliver AI to everybody, let’s not assume we have thousands of dollars for a high-end GPU. Let’s talk about how to run an AI assistant on the kind of lean hardware an open source software user might have.

The Setup

This project is for people who are using Fedora and open source software on the sort of budget I might have had in high school.

This project runs in UTM on my Mac. The VM has 4 cores, 8 GB of RAM, 20 GB of storage, and no GPU. The OS is Fedora 43.
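If you’re following along in your own VM or on a spare machine, a few stock commands will confirm your resources roughly match; none of these are specific to this project:

```shell
# Confirm the machine roughly matches the specs above
nproc                 # number of CPU cores available
free -h | head -2     # total and used RAM
df -h ~ | tail -1     # free disk space in your home directory
```

Anything close to 4 cores, 8 GB of RAM, and 20 GB of free disk should work for what follows.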


A Local LLM

A chatbot is made using a Large Language Model (LLM). An LLM is a neural network: a really, really big one. You give it an input sentence and it gives you an output sentence. If that network is set up well, the output will read as a genuinely appropriate response to the input.

These neural networks are LARGE. If you’ve chatted with OpenAI’s GPT-4, that model is estimated to be around a terabyte of data and to require some 750 GB of RAM and a cluster of GPUs to run.

Fortunately the same forces that have optimized all of our favorite technologies have optimized LLMs. We can select an LLM that can run on our lean hardware. Let’s use the Phi-4-mini-instruct model, quantized so that it fits comfortably in 8 GB of RAM.
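Why does this model fit where GPT-4 wouldn’t? The file we’ll download stores Phi-4-mini’s roughly 3.8 billion parameters quantized to only a few bits each (the ~4.85 bits-per-weight figure below is a rough average for the Q4_K_M format, not an exact number), so a quick back-of-the-envelope estimate looks like this:

```shell
# Back-of-the-envelope size estimate for a quantized model.
# Assumptions: ~3.8B parameters (Phi-4-mini's published size) and
# roughly 4.85 bits per weight for Q4_K_M quantization.
PARAMS_B=3.8
BITS_PER_WEIGHT=4.85
# size in GB ~= parameters * bits-per-weight / 8 bits-per-byte
SIZE_GB=$(awk -v p="$PARAMS_B" -v b="$BITS_PER_WEIGHT" \
  'BEGIN { printf "%.1f", p * b / 8 }')
echo "Estimated model file size: ${SIZE_GB} GB"
```

A couple of gigabytes for the weights leaves the rest of the VM’s 8 GB free for the context cache and the operating system.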

Build Llama.cpp

You need software to run your LLM. The LLM itself is just a bunch of data (the weights of a neural network); the program that interacts with those weights is called an inference engine. We use llama.cpp1.

bash:

sudo dnf install cmake make gcc git gcc-c++ libcurl-devel
mkdir chatbot
cd chatbot
git clone --branch b7783 --depth 1 https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. -DLLAMA_BUILD_EXAMPLES=ON
cmake --build . --config Release
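One practical note for a small VM like this one: the build goes noticeably faster if you let cmake run compile jobs in parallel, one per core. A sketch of the same build command, parallelized:

```shell
# Use every core the VM has for compiling (sketch; run from the
# build directory created above).
JOBS="$(nproc)"
echo "building with ${JOBS} parallel jobs"
# Same build command as above, with parallel jobs added:
# cmake --build . --config Release -j"${JOBS}"
```

On the 4-core VM described earlier this roughly quarters the compile time, at the cost of more memory used during the build.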

Get the model we will run2:

mkdir -p ~/chatbot/llama.cpp/models
cd ~/chatbot/llama.cpp/models
curl -L -o microsoft_Phi-4-mini-instruct-Q4_K_M.gguf \
  https://huggingface.co/bartowski/microsoft_Phi-4-mini-instruct-GGUF/resolve/main/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf
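The download is a couple of gigabytes, and a truncated file is a common cause of cryptic “failed to load model” errors later. A quick sanity check after the curl finishes (the filename matches the one used in the commands above):

```shell
# Verify the model file arrived and is gigabytes in size, not a
# partial download or an HTML error page.
MODEL=microsoft_Phi-4-mini-instruct-Q4_K_M.gguf
if [ -f "$MODEL" ]; then
  ls -lh "$MODEL"   # expect a size in the low gigabytes
else
  echo "model not downloaded yet"
fi
```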

Run the model using llama.cpp:

cd ~/chatbot/llama.cpp/build
./bin/llama-cli \
  -m ../models/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf \
  -c 4096 \
  -p "You are a helpful assistant."


Loading model...  


▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b1-d1e3556
model      : microsoft_Phi-4-mini-instruct-Q4_K_M.gguf
modalities : text

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read               add a text file


> You are a helpful assistant.

Of course! How can I assist you today?

[ Prompt: 50.6 t/s | Generation: 15.0 t/s ]

> What should I have for lunch?

Choosing lunch depends on your preferences, dietary restrictions, and what you might have planned for dinner. Here are some ideas that cater to different tastes:

1. **Vegetarian/Gluten-Free/Vegan Options**:
   - Chickpea salad with tomatoes, cucumber, bell peppers, and a lemon-tahini dressing.
   - Grilled vegetable skewers with hummus for dipping.
...

And there it is! I’ve got my own chatbot running on lean local hardware, no surveillance, no proprietary software, no danger of a corporate bait-and-switch. You could package up your bot and redistribute it.

And this is only the beginning. The chatting LLM is the core of an Artificial Intelligence system. There is so much more it can do if we attach it to other open source software applications, like speech, data sources, and memory. We can give it more refined prompts. We can give it access to tools like web searching and system introspection.

Come back for my next post, where we’ll take the conversation out loud, using speech tools to listen to our AI and talk back to it.

(Part 2 of the Open Source AI on Lean Hardware series continues here: Let’s Talk)

  1. llama.cpp by Georgi Gerganov: https://github.com/ggerganov/llama.cpp
  2. Phi-4-mini model from Microsoft (converted by Bartowski): https://huggingface.co/bartowski/microsoft_Phi-4-mini-instruct-GGUF
