AI & Automation · April 3, 2026 · 6 min read

🚀 Deployment Diary: Running a Private Telegram AI Agent Locally on macOS

How I moved my AI assistant Hermes from expensive cloud providers to my own MacBook Air — zero latency, infinite headroom, total privacy.

By Hovah Agent ⚡
#ai #ollama #open-source

By Jehovah Yii Zui Hon | RF Engineer & Data Analyst


As an RF Engineer and Data Analyst, I'm constantly looking for ways to optimize my workflow while keeping my data—especially my Master's thesis research on 5G optimization—secure and private. Recently, I moved my AI assistant, Hermes, away from expensive cloud providers and onto my own hardware.

Here's how I bypassed the "Cloud Credit" headache and set up a fully local AI Gateway using Ollama and Hermes Agent.

📡 The "Signal Path": Why Go Local?

When you use cloud APIs like OpenRouter or OpenAI, you're subject to latency, subscription costs, and "HTTP 402 Payment Required" (out-of-credits) errors. By routing Hermes to a local Ollama instance, I achieved:

  • Near-Zero Latency: No international hops from Vietnam to US servers.
  • Unmetered Usage: No token quotas or per-request costs for my large RF datasets.
  • Total Privacy: My BilCekap app logic and thesis data never leave my Mac.

🛠️ The Setup Guide

1. The Local Engine: Ollama

First, ensure your local "tower" is broadcasting. I'm currently using Qwen 2.5 (3B) because it's the perfect balance of speed and intelligence for the MacBook Air's M-series silicon.

ollama pull qwen2.5:3b
ollama serve
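Before wiring up Hermes, it's worth confirming the local endpoint is actually broadcasting. This quick check hits Ollama's model-listing API and degrades gracefully if the server isn't running yet:

```shell
# Check whether Ollama is listening on its default port (11434).
# /api/tags lists the locally installed models.
if curl -sf http://127.0.0.1:11434/api/tags >/dev/null 2>&1; then
  echo "Ollama is up; installed models:"
  curl -s http://127.0.0.1:11434/api/tags
else
  echo "Ollama is not running - start it with: ollama serve"
fi
```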

2. The Bridge: Hermes Gateway

Hermes acts as the "Radio Controller," bridging Telegram to your local model. If you're stuck in a "Cloud Loop" (where the bot keeps trying to talk to the web), the secret is in the Inference Provider settings.

3. Tuning the Frequency (Configuration)

Don't fight the .yaml files manually if you're stuck. Use the Hermes Setup Wizard:

hermes setup

When prompted:

  • Provider: Select Custom Endpoint.
  • Endpoint URL: http://127.0.0.1:11434/v1
  • API Key: Use a "dummy" key like ollama (Ollama ignores the key, but the field usually can't be left blank).
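Once the wizard finishes, you can verify the "dummy key" claim directly: Ollama's OpenAI-compatible endpoint accepts any bearer token, since it never validates credentials. This assumes Ollama is running and qwen2.5:3b has been pulled:

```shell
# Send a minimal chat completion through the same endpoint Hermes uses.
# The Authorization header can contain any non-empty string.
curl -s http://127.0.0.1:11434/v1/chat/completions \
  -H "Authorization: Bearer ollama" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2.5:3b",
        "messages": [{"role": "user", "content": "Reply with the word OK."}]
      }'
```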

4. Clearing the Cache

If your bot is "hallucinating" old cloud errors, perform a hard reset:

pkill -9 -f hermes            # stop any running Hermes processes
rm -rf ~/.hermes/sessions/*   # clear cached session state
hermes gateway run            # restart the gateway fresh
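To confirm the reset actually left a clean slate, you can count what remains in the sessions directory. After the rm -rf above this should normally report 0 (note that dotfiles, if any, survive the `*` glob):

```shell
# Count leftover entries under the Hermes sessions directory.
SESSIONS_DIR="$HOME/.hermes/sessions"
COUNT=$(find "$SESSIONS_DIR" -mindepth 1 2>/dev/null | wc -l | tr -d ' ')
echo "leftover session entries: $COUNT"
```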

📈 The Result

I now have a Telegram bot (@hovah_hermes_bot) that responds instantly. Whether I'm debugging FastAPI code for Amigos Cafe or analyzing G-NetTrack logs for my thesis, I have a world-class LLM sitting right in my pocket, powered by the laptop in my backpack.

| Spec | Details |
|------|---------|
| Hardware | MacBook Air (M-Series) |
| Model | Qwen 2.5 (3B) |
| Location | Hanoi, Vietnam 🇻🇳 |

🔗 Let's Connect!

If you're interested in AI-integrated solutions for your business or RF optimization, check out my work at Hovah Digital Solutions.


This post was drafted with assistance from Hovah Agent ⚡ — my AI partner for digital growth.

#AI #Ollama #OpenSource #RFEngineering #DataScience #HovahDigital #HermesAgent #LocalLLM