Host Your Own AI Chatbot Privately: A Complete Guide
By Naomi Brockwell, Founder and Director of NBTV
AI chatbots are everywhere, but using them often means sacrificing privacy. Every time you use these tools, your data may be sent to third-party servers, where it can be stored, analyzed, and shared with countless entities. If you want to use cutting-edge technology without giving up control over your data, hosting an AI model on your own device is the most private approach. Here’s a complete guide to help you do just that.
Note: This process is more advanced than our usual guides and is recommended for intermediate users and beyond. Read on if you’re curious, but don’t feel overwhelmed if it’s a project you’re not yet ready to tackle! There are other great, beginner-friendly ways to use AI privately, which we outlined in our previous guide.
Step 1: Understanding AI Terminology
Before diving into hosting an LLM, let’s review some key terms that will make the setup easier.
LLM (Large Language Model): A type of AI trained on massive amounts of text, designed to understand and generate human language.
Parameters: Think of these as adjustable “knobs” in the model that are fine-tuned during training. Generally, the more parameters a model has, the better it can capture complex patterns in language. Large models can have billions of parameters.
Model Size: The number of parameters affects the model’s capabilities and the computing power required. Smaller models are more manageable for home setups.
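A rough rule of thumb for sizing (an approximation that varies by model and quantization, not an exact figure): the memory a model needs is roughly its parameter count multiplied by the storage size of each parameter. An 8-billion-parameter model quantized to 4 bits (half a byte) per parameter, a common default for local use, needs roughly 8 billion × 0.5 bytes ≈ 4 GB of RAM, plus some overhead. That’s why single-digit-billion models fit comfortably on most modern laptops, while 70-billion-parameter models generally don’t.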
AI Engine: The software that actually runs the model on your device. Examples include llama.cpp, which is the engine Ollama builds on under the hood.
User Interface (UI): The way you interact with the model. You can use a command-line interface (CLI), like Ollama’s, or a graphical user interface (GUI) like OpenWebUI, which we’ll cover later.
Step 2: Choosing Your Model and AI Engine
To get started, you’ll need a model to download and an engine to run it. Ollama simplifies this by allowing you to browse, download, and interact with AI models—all in one app.
Download Ollama: Visit Ollama’s website and download the app for your operating system (Windows, Mac, or Linux).
Install and Test: Once installed, open the command line (on Mac/Linux, this is Terminal; on Windows, it’s Command Prompt). Type:
ollama
This will confirm the app’s setup and display available commands. You’ll mainly use “pull” (to download models) and “run” (to start models). If you prefer a graphical interface, we’ll explain how to set up OpenWebUI later.
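For example, here are the commands you’ll use most often (aya:8b is just one model name; substitute whichever model you pick in the next step):
ollama pull aya:8b
ollama run aya:8b
ollama list
The “pull” command downloads a model without starting it, “run” downloads the model if needed and then starts a chat session, and “list” shows the models you’ve already downloaded.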
Step 3: Selecting and Running a Model
When choosing a model, consider both the model’s size (number of parameters) and its purpose. Smaller models (with single-digit billions of parameters) are ideal for laptops and most home computers.
Choose Your Model: Go to Ollama’s “Models” page to browse different models. Each model is designed for specific tasks, so pick one that fits your needs and device’s capacity. For extra guidance, platforms like Chatbot Arena provide comparisons between models to help you find one suited for your tasks.
Run the Model:
Open your terminal or command prompt.
Type the command:
ollama run [model name]
For example, if you selected the “Aya” model with 8 billion parameters, type:
ollama run aya:8b
You can copy the exact command directly from the Ollama “Models” page.
Hit Enter. This command downloads the model and starts it on your device, allowing you to interact with it directly in the terminal.
Interacting with the Model: Once the model loads, type your prompts and the AI will respond right in the terminal. Everything runs locally on your device: your prompts and the model’s replies are never sent over the internet, so your conversations remain private.
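A couple of handy built-in commands while you’re inside a chat session:
/? (shows the available in-chat commands)
/bye (exits the chat and returns you to your normal terminal prompt)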
Step 4: Setting Up OpenWebUI (Optional for Graphical Interface)
If you’re not comfortable with the command line, OpenWebUI offers a browser-based interface for managing and interacting with your models, similar to a typical app. I recommend you watch our video tutorial for this part, as it’s a more involved process.
Install Docker: OpenWebUI uses Docker to make setup easier. Docker packages OpenWebUI and everything it needs into an isolated container, so you don’t have to install its dependencies by hand. Download Docker from Docker’s official page and follow the installation steps.
Run OpenWebUI in Docker:
Go to the OpenWebUI website and copy the setup command.
Paste this command into your terminal to launch OpenWebUI within Docker (see the example command after these steps).
To access the OpenWebUI interface, open your browser and type:
http://localhost:3000
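At the time of writing, the setup command in the OpenWebUI documentation looks like the following; always copy the current version from their site, since flags and image tags can change:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
The “-p 3000:8080” part is what makes the interface reachable at localhost:3000, and the “-v” flag stores your chats and settings in a Docker volume so they survive restarts. (You can confirm Docker itself installed correctly by first running: docker --version)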
OpenWebUI will let you select and interact with models visually, making it easier to manage settings without needing to code. You can even save it as an app on your desktop for quick access.
Final Takeaways and Tips
Ollama is Beginner-Friendly: Downloading and running models with Ollama is simple and keeps everything private.
OpenWebUI Offers a User-Friendly GUI: OpenWebUI is ideal if you want to avoid the command line and prefer a visual experience. We recommend watching our video tutorial for help with installation.
Choose the Right Model Size: Smaller models (with fewer parameters) are best for consumer-grade devices like laptops.
Docker for Easy Setup: Docker simplifies the installation of OpenWebUI, helping you avoid complex setup issues.
Stay in Control of Your Privacy: Running a model locally means all data stays on your device, giving you full control over your interactions.
Running models locally is the most private way to use AI, and with this setup, your data never leaves your device. You’re now empowered to explore AI on your terms, enjoying the benefits of advanced technology without sacrificing privacy.
A huge thank you to The Hated One for collaborating with us on this guide. He has a playlist about how to use the internet anonymously, which is a great resource!
A version of this article first appeared in video form on NBTV. NBTV is a non-profit educational platform that teaches people how to reclaim control of their lives in the digital age. They give people the tools they need to take back their privacy, money, and free online expression.
Learn more at NBTV.media