
Avoiding Data Leaks with Self-Hosted AI: My Setup with Ollama and Open Web UI

Updated • 3 min read

I like to: ⚙️ tinker with different kinds of technologies, 💭 apply knowledge across different fields/disciplines, and 📈 optimize stuff (sometimes more than necessary lol)

Important Note

  • This does not have ChatGPT-like quality of responses, far from it, since it is hosted on a low-spec server ($4 a month) for cost reasons. Production LLMs would need higher specs, most importantly more VRAM.

  • For now, this is just a high-level explanation of the tools I used and what they’re for. Currently, I’m focused on building a centralized data ingestion platform, and I’ll update this post soon with a complete tutorial. The platform pulls in context for AI use from multiple sources such as Google Drive, Notion, and Jira, and lets you plug chatbots into your app that can use that context.

  • Also, open-source language models have come a long way: they are getting cheaper to host (they require fewer resources) and more effective!

Stay tuned for updates!


LLM Provider Data Privacy Concerns

Data privacy has become a real concern since tools like ChatGPT, Claude, and others became popular. For regular users, it might not seem like a big deal. But for companies, there's a risk that sensitive info such as internal documents, code, or client data could be stored or even used by the AI provider if they’re not careful.

There have already been a few incidents. In 2023, Samsung banned ChatGPT after an employee accidentally shared confidential source code. Apple, JPMorgan, and others also restricted its use. Even though providers like OpenAI say they don’t train on API data, many companies still feel uneasy. A KPMG survey found that 78% of business leaders in the U.S. listed data privacy as one of their biggest concerns when using AI.

Because of this, more businesses are starting to host their own LLMs, keeping both the model and the data within their own systems. This gives them more control, helps with compliance like GDPR or HIPAA, and lowers the risk of sensitive data being exposed to external platforms.


My Approach to Self-Hosting LLMs

So what’s the solution? I wanted to explore how these models could be hosted independently without relying on third-party services. The open-source community is amazing because for almost every paid platform or tool, there’s often a self-hostable alternative you can run on your own server or even locally.

Here’s what I deployed:

  • Open WebUI
    This is the chat interface. It can save conversations, switch between local models (powered by Ollama), enable web search, and more.

  • Ollama
    This handles the models, including downloading, deleting, and running them in the background. It also exposes an inference API, which is really useful.
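As a concrete illustration, Ollama's local inference API can be called with nothing but the Python standard library. This is a minimal sketch, not my production setup: it assumes Ollama is running on its default port (11434) and that a model such as `llama3` has already been pulled with `ollama pull llama3`.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(model: str, prompt: str) -> str:
    """Send the prompt and return the model's full response text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]

# Example usage (requires a running Ollama instance):
#   print(ask("llama3", "Explain VRAM in one sentence."))
```

With `"stream": False`, Ollama returns a single JSON object instead of a stream of chunks, which keeps the client simple. Open WebUI talks to this same API under the hood.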

Ideally, I’d love to run all of this on my own physical server. But I don’t have a home lab setup yet. Running large language models requires a good amount of VRAM, which most budget VPS plans don’t offer.
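To get a feel for why VRAM is the bottleneck, here is a rough back-of-the-envelope estimate: the weights alone need about (parameters × bits per weight ÷ 8) bytes, ignoring the KV cache, activations, and runtime overhead. The numbers are illustrative, not exact:

```python
def vram_estimate_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough VRAM needed just for the model weights, in GB.

    params_billion: model size in billions of parameters.
    bits_per_weight: precision (16 = fp16, 4 = a common quantization level).
    Ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8

# A 7B model in fp16 needs ~14 GB of VRAM; quantized to 4 bits, ~3.5 GB.
print(vram_estimate_gb(7, 16))  # 14.0
print(vram_estimate_gb(7, 4))   # 3.5
```

This is why quantized models fit on modest consumer GPUs, and why a budget VPS with no GPU at all can only run very small models, slowly, on the CPU.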


Quick Note on Cloud Hosting

The whole point of this setup is to give you more control over your data without sacrificing usability. Hosting on a cloud VPS doesn’t give full control, since your data still lives on someone else’s physical servers. Once it's there, you’re bound by the provider’s terms of service. In some cases, they could take down your server and you could lose everything if you're not prepared.

Still, this shows that it’s possible to self-host these tools even on a tight budget. If you want to use LLMs in production, you’ll need a machine with enough VRAM to get responses close to ChatGPT quality.

For this setup, I used the cheapest Hetzner VPS (CX22), which costs €3.99 per month plus €0.60 per month for an IPv4 address. It’s relatively cheap compared to DigitalOcean.


Try It Yourself

You can test it out here:

Link: openwebui.chrismargate.dev
Email: guest@chrismargate.dev
Password: rVh3qNBUpiqWW1cNEdkoe3v1iQuuTik9+zmg6q74jos=

Hope you enjoyed this short blog! I’ll be updating it soon with more details.