Running an Open Source AI Chatbot on Lean Hardware with Fedora: Part 2 – Let’s talk

(Part of the Open Source AI on Lean Hardware series — Previous: Our First Chat)

I love touch typing just as much as the next console jockey, and I do enjoy a good text-to-text interaction with my chatbot. But that’s not how they do it on the Enterprise. That’s not how they do it on Discovery One. That’s not how Cortana does it. That’s not how JARVIS does it. That’s not how KITT does it. That’s not how Nell does it. The only fabled AI who doesn’t talk out loud is Joshua, and even he gets an upgrade. So, in that spirit, let’s teach our chatbot to speak. 

And before I do that: something else all of those mythic AIs have is a name. I’m going to name my lean-hardware Fedora chatbot Brim. Leave a comment with your AI’s name!


Brim’s Ears – Recording the Sound of My Speech

cd ~/chatbot
sudo dnf install alsa-utils
arecord -f S16_LE -r 16000 -d 5 input.wav

Say something! The `arecord` utility will record a 5-second WAV file. Because decision making is hard, I’m engaging the inestimable power of my mature fourth generation computational device to answer the question “What should I have for a snack?”
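If you're wondering what those flags mean: `-f S16_LE` asks for signed 16-bit little-endian samples, `-r 16000` sets a 16 kHz sample rate (the format whisper.cpp wants), and `-d 5` stops after five seconds. Here's a rough sanity check, assuming a mono recording (which is what `arecord` gives you when no `-c` is passed). The helper name `expected_pcm_bytes` is mine, not part of alsa-utils.

```shell
#!/usr/bin/env bash
# Sketch: estimate the audio payload size implied by the arecord flags.
# expected_pcm_bytes is a hypothetical helper, not part of alsa-utils.
expected_pcm_bytes() {
  local rate=$1 bytes_per_sample=$2 channels=$3 seconds=$4
  echo $(( rate * bytes_per_sample * channels * seconds ))
}

# -r 16000 -> 16000 samples per second
# -f S16_LE -> 2 bytes per sample
# mono is assumed (no -c flag was given)
# -d 5 -> 5 seconds
expected_pcm_bytes 16000 2 1 5

# If the recording exists, print its real size for comparison.
# The WAV header adds a few dozen bytes on top of the payload.
if [ -f input.wav ]; then
  stat -c %s input.wav
fi
```

If the real file is wildly smaller than the estimate, the recording probably got cut short or the microphone wasn't picked up.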

Now let’s turn that speech into text that we can send to the LLM.

sudo dnf install cmake make gcc git portaudio-devel
git clone --branch v1.8.3 --depth 1 https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
make


Whisper uses models too! This is a small English-only model. We want such a model because of our lean hardware requirement.


./models/download-ggml-model.sh base.en


Try it out. My install reads my interrogative crystal clear! 

./build/bin/whisper-cli -m models/ggml-base.en.bin -f ../input.wav 


To get the text, let’s do a little UNIX pipe munging:

TRANSCRIPT=$(./build/bin/whisper-cli -m models/ggml-base.en.bin -f ../input.wav \
  | grep '^\[' \
  | sed -E 's/^\[[^]]+\][[:space:]]*//' \
  | tr -d '\n')
echo "$TRANSCRIPT"
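To see what that pipe is doing without touching the microphone, here's a small sketch that runs the same `grep` and `sed` steps on made-up sample text in the timestamped format whisper-cli prints. One deliberate swap: `tr -d '\n'` glues lines together with nothing between them, so this variant uses `paste` to join multiple segments with a space instead. The function name `extract_transcript` and the sample lines are my own inventions.

```shell
#!/usr/bin/env bash
# extract_transcript is a hypothetical helper; it reads whisper-cli style
# output on stdin and prints the spoken text on a single line.
extract_transcript() {
  grep '^\[' \
    | sed -E 's/^\[[^]]+\][[:space:]]*//' \
    | paste -s -d ' ' -
}
# grep  keeps only the lines that start with a [timestamp]
# sed   strips the [timestamp] and the spaces after it
# paste joins the remaining lines with a single space

# Made-up sample in the shape of whisper-cli's timestamped output.
sample='[00:00:00.000 --> 00:00:02.500]   What should I have
[00:00:02.500 --> 00:00:05.000]   for a snack?'

printf '%s\n' "$sample" | extract_transcript
```

With a 5-second clip you will usually get a single segment, so the difference only shows up on longer recordings.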


And let’s send that to our LLM!

cd ..
./llama.cpp/build/bin/llama-cli \
  -m ~/chatbot/llama.cpp/models/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf \
  -c 4096 \
  -p "$TRANSCRIPT"


Brim’s Voice – Listening to Our Chatbot’s Response

sudo dnf install espeak

RESPONSE=$(
  LLAMA_LOG_VERBOSITY=1 ./llama.cpp/build/bin/llama-completion \
    -m ./llama.cpp/models/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf \
    -p "$TRANSCRIPT" \
    -n 150 \
    -c 4096 \
    -no-cnv \
    -r "<eor>" \
    --simple-io \
    --color off \
    --no-display-prompt
)

echo "$RESPONSE" | espeak
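Depending on your build, the captured response may end with an `[end of text]` marker, and espeak will happily read it aloud. Here's an optional sketch that trims the marker and tidies whitespace before speaking. The helper name `clean_response` and the sample reply are hypothetical.

```shell
#!/usr/bin/env bash
# clean_response is a hypothetical helper: it removes the "[end of text]"
# marker, squeezes runs of whitespace, and trims both ends.
clean_response() {
  sed 's/\[end of text\]//g' \
    | tr -s '[:space:]' ' ' \
    | sed -E 's/^ +| +$//g'
}

# Made-up sample reply standing in for the real $RESPONSE.
RESPONSE='How about an apple with peanut butter? [end of text]'
CLEANED=$(printf '%s' "$RESPONSE" | clean_response)
echo "$CLEANED"

# On a machine with espeak installed, speak the cleaned text:
#   echo "$CLEANED" | espeak
```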


Interactive mode

Here’s a script that orchestrates all of that. 


#!/usr/bin/env bash

set -e

# Path to audio input
AUDIO=input.wav

# Step 1: Record from mic
echo "🎙️ Speak now..."
arecord -f S16_LE -r 16000 -d 5 -q "$AUDIO"

# Step 2: Transcribe using whisper.cpp
TRANSCRIPT=$(./whisper.cpp/build/bin/whisper-cli \
  -m ./whisper.cpp/models/ggml-base.en.bin \
  -f "$AUDIO" \
  | grep '^\[' \
  | sed -E 's/^\[[^]]+\][[:space:]]*//' \
  | tr -d '\n')
echo "🗣️ $TRANSCRIPT"

# Step 3: Get LLM response using llama.cpp
RESPONSE=$(
  LLAMA_LOG_VERBOSITY=1 ./llama.cpp/build/bin/llama-completion \
    -m ./llama.cpp/models/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf \
    -p "$TRANSCRIPT" \
    -n 150 \
    -c 4096 \
    -no-cnv \
    -r "<eor>" \
    --simple-io \
    --color off \
    --no-display-prompt
)

# Step 4: Speak the response
echo "🤖 $RESPONSE"
echo "$RESPONSE" | espeak

Make a file (I use nano), paste the script in, then save and exit (Ctrl-X, then Y and Enter).


nano talk.sh
# make it executable
chmod +x talk.sh

Run it and ask a question!

./talk.sh


“Who are your favorite fictional artificial intelligences?”
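If you stay quiet for the whole five seconds, the script can end up sending an empty prompt to the LLM. A small guard between Step 2 and Step 3 of talk.sh would catch that. This is only a sketch: `has_speech` is a hypothetical helper, and it checks for blank text only, so it won't catch any placeholder text whisper might emit for silence.

```shell
#!/usr/bin/env bash
# has_speech is a hypothetical helper: it succeeds only when the transcript
# contains at least one non-whitespace character.
has_speech() {
  [[ "$1" =~ [^[:space:]] ]]
}

# Stand-in for an empty transcription; in talk.sh this would be the real value.
TRANSCRIPT="   "

if has_speech "$TRANSCRIPT"; then
  echo "Sending to the LLM: $TRANSCRIPT"
else
  echo "I didn't catch that. Try again?"
fi
```

In the real script the `else` branch would likely `exit 1` (or loop back to recording) rather than just print a message.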

I’ve been enjoying my first verbal conversations with Brim. They are primitive because I have not endowed Brim with memory. They can’t follow up on even the last interaction we had! And Brim knows nothing about me or any specific details about things I would want to ask about. I should mention I’ve decided Brim’s preferred gender pronouns are they/them.

Hopefully your squishy meat brain is practically pulsating with the possibilities for plumping the prompt with pertinent personality. Perhaps that’s a post for another day. 

Let’s continue with our open source technology and lean hardware focus in the next blog, giving our AI Chatbot personality.

(Part 3 of the Open Source AI on Lean Hardware series continues here: The Prompt)

Postscript: What Brim Said To Me

I said to Brim “You are an AI named Brim”, and they said to me:

“🤖  Brim was created to help people find fun and interesting ways to learn new things. You love sharing knowledge and encouraging curiosity in others. You always try to explain things in an engaging and accessible way.

You work for an educational non-profit organization that aims to make learning more accessible and enjoyable for everyone. Your main goal is to help people of all ages and backgrounds discover new topics and gain a deeper understanding of the world around them.

You are currently writing an article about “The Wonders of the Natural World” for your educational website. …

And then Brim told me about plants, animals, weather and ecosystems. Until they got to their concluding statement:

The natural world is filled with wonders that are both awe-inspiring and humbling. From the microscopic organisms that thrive in every corner of our planet to the majestic giants that roam the Earth, nature continues to amaze us with its beauty and complexity. By exploring and learning about the wonders of the natural world, we can develop a deeper appreciation for the intricate web of life that sustains us all. So, let’s continue to marvel at nature’s wonders and strive to protect and preserve them for future generations. [end of text]

Gosh, Brim, I’m touched!
