Incognito Cat

Host Wikipedia Offline with Kiwix

Remember the encyclopedia? Before the internet, this massive set—often inherited, perpetually out-of-date, and occasionally used as a booster seat—was the pinnacle of home scholarship. (The only more fascinating read for a child was, of course, The Guinness Book of World Records.) The sheer scale of knowledge contained in those heavy, dusty volumes was astounding. Yet, even the best sets were hilariously behind the times; our family's edition, for example, thought the Philippines was just about to gain independence in 1946!

That outdated physical mountain of knowledge now fits on a single, tiny memory card. With Kiwix, you don't just access Wikipedia; you can download and host the entire encyclopedia of the modern age on your own local server.

Information is King

The original promise of the internet was universal access to all the world's knowledge. Today, that access often comes with a price tag: relentless online tracking and constant ads. Worse, for billions of people, access is restricted entirely—whether suppressed by a government or physically curtailed by slow, nonexistent, or expensive connections.

For both privacy advocates and the digitally disconnected, there is an empowering solution: Kiwix.

Kiwix is a free and open-source software project run by a Swiss non-profit. Their core mission is simple: to bridge the digital divide by ensuring that information, especially from open-content projects like Wikipedia, remains accessible everywhere, regardless of cost, infrastructure, or censorship.

Their ingenious solution involves encapsulating vast amounts of information, like the entirety of Wikipedia, into a single, highly compressed file format called ZIM. This file can easily fit onto portable media like a memory card or USB drive. Kiwix provides various ways to read these ZIM files, including desktop and mobile clients and dedicated server software. They even have a version for the Raspberry Pi, turning the tiny device into an instant information hotspot that broadcasts knowledge over a local Wi-Fi network. You can explore their amazing collection of websites and books in ZIM format at https://library.kiwix.org/
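For the curious, a ZIM archive is easy to recognize on disk. As we understand the openZIM format specification, every file begins with a 4-byte magic number (72173914, stored little-endian). Here is a minimal sketch that checks for it, assuming that header layout; the function name looks_like_zim is our own for illustration and not part of any Kiwix tool:

```python
import struct
from pathlib import Path

# 72173914 == 0x044D495A; on disk the bytes read "ZIM\x04".
# Assumption: header layout as described in the openZIM specification.
ZIM_MAGIC = 72173914


def looks_like_zim(path):
    """Return True if the file starts with the ZIM magic number."""
    with Path(path).open("rb") as handle:
        first_bytes = handle.read(4)
    if len(first_bytes) < 4:
        return False
    # "<I" means little-endian unsigned 32-bit integer.
    (magic,) = struct.unpack("<I", first_bytes)
    return magic == ZIM_MAGIC
```

This only confirms the file type. It does not verify that a download is complete or uncorrupted.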

The Self-Hosting Imperative

We fundamentally believe in a local-first philosophy. Using self-hosted, offline options prevents the monitoring of our habits and the collection of our personal information. This applies to everything we do: from streaming music off our network-attached storage (NAS) and watching TV shows recorded on our locally hosted over-the-air (OTA) digital video recorder (DVR), to running our smart home with Home Assistant, or even powering our own AI server using Ollama. Hosting locally lets us enjoy modern technology without modern intrusions.

Since our self-hosted AI server had plenty of underutilized processing and storage capacity, adding a Kiwix server to it was an obvious choice. Now we keep an astonishing amount of knowledge instantly available at our fingertips, completely off the grid!

Dropping in Another Container

If you haven't read our post about our AI server, here's the basic setup for context:

Note on AI Model: Since that initial setup in the post, we’ve moved to the mistral-nemo:12b model, which runs entirely on the GPU. This leaves a significant amount of memory and storage for additions—perfect for Kiwix.

To get Kiwix running, we first created a local folder to hold the ZIM archives, which we downloaded with wget from https://download.kiwix.org/zim/. With our existing Docker setup ready, we then pulled the latest kiwix-serve image following the instructions at https://github.com/kiwix/kiwix-tools/pkgs/container/kiwix-serve.
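If you would rather script the download step than run wget by hand, the same idea can be sketched in Python using only the standard library. This is an illustration, not what we ran: the function name download_zim and the folder path in the usage note are hypothetical, and you need to pick a real file URL from the download site yourself.

```python
import shutil
import urllib.request
from pathlib import Path


def download_zim(url, dest_dir):
    """Stream one file from `url` into `dest_dir` and return its local path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Use the last path segment of the URL as the local file name.
    target = dest_dir / url.rstrip("/").rsplit("/", 1)[-1]
    partial = target.with_name(target.name + ".part")
    # Copy in chunks so a multi-gigabyte archive never sits in memory.
    with urllib.request.urlopen(url) as response, partial.open("wb") as out:
        shutil.copyfileobj(response, out)
    # Rename only after the copy finishes, so a half-written file is
    # never mistaken for a complete archive.
    partial.replace(target)
    return target
```

A call would look like download_zim("<URL of a .zim file>", "/srv/kiwix/zim"), where the folder is a placeholder. Unlike wget, this sketch cannot resume an interrupted transfer, which matters for the largest archives.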

We then set up a new Docker container, and Kiwix was up and running almost immediately. Because we moved Kiwix off its default port of 8080, we also had to adjust our firewall rules.
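For readers who want a starting point, here is a rough sketch of the kind of docker run command involved, assembled in Python so the pieces are easy to see. Several details are assumptions rather than a record of our exact setup: the image name ghcr.io/kiwix/kiwix-serve, the /data mount point, and the "*.zim" argument follow our reading of the kiwix-serve container instructions, while the folder path, host port 8181, and container name are placeholders.

```python
import shlex

# Assumption: image name as listed on the kiwix-tools container page.
KIWIX_IMAGE = "ghcr.io/kiwix/kiwix-serve"


def kiwix_run_command(zim_dir, host_port, name="kiwix"):
    """Build a `docker run` argument list for a kiwix-serve container."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--restart", "unless-stopped",
        "-v", f"{zim_dir}:/data",    # assumed mount point for the ZIM folder
        "-p", f"{host_port}:8080",   # host port mapped to the container's 8080
        KIWIX_IMAGE,
        "*.zim",                     # assumed: serve every ZIM file in /data
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(shlex.join(kiwix_run_command("/srv/kiwix/zim", 8181)))
```

Printing the command instead of executing it lets you check it against the current instructions before anything touches your server. Whatever host port you choose is the one to open in your firewall.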

Just like that, we had loads of new information locally, so even when Mother Nature takes out our internet, we're not completely disconnected from necessary information.

Screenshot of our Kiwix Server

Making Your Own ZIM Files

Kiwix offers a fantastic library to start with. However, if you need an additional source that isn't available, Kiwix provides a tool called Zimit. While a Docker version is available, the most accessible option is the Zimit website.

To demonstrate, we spent a couple of days working with the Docker version to capture our own website, which taught us valuable lessons.

IncognitoCat.me on Kiwix

Our first lesson was about the "Scope Exclude Regular Expression" (--scopeExcludeRx) option. Our initial capture of our tiny, text-based website took twelve hours and produced a massive 260 MB file! The culprit was the endless web of hashtag links at the bottom of every post, which the crawler kept following. Once those were excluded, the capture took minutes and produced a lean 8 MB file, a much better result. We also figured out how to adjust our website styles to hide features that wouldn't work in the static ZIM file and how to set the correct title, description, and icon.
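To show what an exclusion pattern does, here is a small Python sketch that filters a list of URLs the way a scope-exclude rule would. Treat it as an illustration only: the /tag: URL shape and the example.com addresses are hypothetical, and Zimit's crawler evaluates patterns with its own regex engine rather than Python's, although a simple pattern like this one should behave the same in both.

```python
import re

# Hypothetical pattern: assumes hashtag pages live under /tag:<name>.
SCOPE_EXCLUDE = re.compile(r"/tag:")


def in_scope(url):
    """Return True if the crawler should keep this URL."""
    return SCOPE_EXCLUDE.search(url) is None


candidate_urls = [
    "https://example.com/host-wikipedia-offline-with-kiwix",
    "https://example.com/tag:Kiwix",
    "https://example.com/tag:Privacy",
    "https://example.com/about",
]

# Only the post and the about page survive; both tag pages are dropped.
kept = [url for url in candidate_urls if in_scope(url)]
```

Testing a pattern against a handful of your own URLs like this is far quicker than finding out twelve hours into a crawl that it matched nothing.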

Post from IncognitoCat.me on Kiwix

We don't recommend the Docker version for most users. Instead, use Kiwix's handy website at https://zimit.kiwix.org/ for a much friendlier user interface. Regardless of the method, we highly recommend reading the documentation on their GitHub at https://github.com/openzim/zimit to understand the options, especially that critical "Scope Exclude Regular Expression."

Consider Adding Kiwix to Your Privacy Toolbox

We hope this post has provided you with valuable information about another excellent privacy-focused option. Having critical, up-to-date information on hand when there's no internet is a tremendous addition, whether it's on your phone, desktop, or on your home network like our setup.

We believe Kiwix is providing an invaluable service to the global community and encourage you to visit their website at https://kiwix.org/ to learn more about the project and consider making a donation to support their efforts to make information available to everyone.

Remember, we may not have anything to hide, but we have everything to protect.

#DigitalPrivacy #Kiwix #Privacy #PrivacyTool #Wikipedia