I needed a homelab to test things for my project described over in Ubuntu 22.04 LTS High Availability Router/Firewall with Two ISPs. This post explains how I designed my homelab setup. In follow-up articles, I will provide the step-by-step instructions I used for building the homelab and simulating the high-availability environment I was working on. This post will be updated with links when those articles are complete.

I have used a lot of virtualization technologies over the years, choosing whatever best suited my needs. I have used VMWare ESXi, VMWare Workstation, early versions of OpenStack on Ubuntu, Xen on CentOS, VirtualBox, and variations of Hyper-V. In recent years, my needs have been rather meager, so a quick VirtualBox or even WSL install on a Windows laptop was more than enough to get me through the little oddball projects I had. But, this more recent project required something more capable, yet still inexpensive.

My first instinct was to see how far I could push VirtualBox. The answer: Not very far. It worked, but even on a decent laptop with a moderate amount of RAM, VMs were slow and network configurations were limited. I needed some dedicated hardware that did not rely on a wireless interface or USB ethernet adapter. But, this is a homelab that I’m using for pro-bono work, so spending thousands of dollars is out of the question.

I settled on buying a cheap refurbished machine and seeing how much I could squeeze out of it. The machine came with a 4-core i7 processor, 16GB of RAM, a 500GB HDD, 4 PCIe slots (x16, x4, x1, x1), and an onboard NIC. For under $200, this wasn’t bad, especially considering the pandemic had driven hardware costs up over the past couple of years. I added a PCIe x1 NIC capable of 2.5Gbps, replaced the RAM with 32GB, and added a 1TB NVMe SSD. My plan was to slot in two 1TB NVMe drives for redundancy and performance, but the BIOS did not support booting from NVMe, nor did it allow for PCIe bus bifurcation. I did not want to consume both of my high-speed PCIe slots with NVMe drives because I might need them for other functions. This homelab is not super powerful, but it does not need to be. It runs silently under my desk and performs very well for my purposes. Better hardware will work just fine with exactly the same software setup, so future upgrades will be trivial.

Next, I needed to select the platform. VMWare’s ESXi would have been decent for the project I am working on. The people I am working with already have a recent VMWare deployment they have invested in, so they will not be dropping VMWare for at least a couple of years. But with the recent Broadcom purchase of VMWare, ESXi no longer comes in a free version, and the minimum buy-in for vSphere is impractical for a homelab. VMWare Workstation would require a GUI-based OS running in addition to the hypervisor software, which would eat up the available resources on a lightweight homelab server. So I first opted to try OpenStack, which I had not used for a couple of iterations of its development.

Installation of OpenStack following the official Canonical OpenStack Installation on Ubuntu documentation is pretty straightforward, but the install itself is slow. OpenStack is designed to run on multiple physical machines; the single-node variant builds the system out of containers instead, 23 of them if I recall correctly. My little homelab server chugged away on that for about an hour, but eventually everything came up. The web GUI worked out of the box, but it was extremely sluggish, taking several seconds to load every page. After reading some documentation and user forums, I found various performance tweaks I could try. But this system was creating a moderate CPU load out of the box, with no VMs running. This was not going to work for my lightweight homelab.

Next, I turned to Proxmox. I had used Proxmox systems set up by others in the past, but had never run my own Proxmox install. The installation was straightforward using the official Proxmox Documentation. Installation was quick, typical of a Debian-based install. Within a few minutes, I was at the web GUI login page. But, I had to log in as the actual root user. Logging into root on a web GUI was troubling to me. Debian and its derivatives have disabled root login by default for several releases, and it had been a recommended practice for many years before that. Despite that, many systems require elevated privileges during setup and then guide you through securing the system. Not Proxmox. The documentation does not even offer guidance on setting up non-root users and securing the system. Performance was better than OpenStack, but the system still showed a fair amount of load for an idle hypervisor, and a significant number of services were running that I would not need any time soon. Rather than spending a lot of time figuring out how to configure my shiny new Proxmox box, I figured maybe I should try something else.

GUIs are nice. Don’t get me wrong. But after thinking about it, I decided that a GUI was just going to consume my limited resources. Because I almost always have to get under the hood and tweak things from the CLI to carry out the projects I’m working on, I don’t really get a lot of benefit anyway. So I asked myself, “What other options are there?” OpenStack, Proxmox, and even Xen all use KVM under the hood. What is the current state of CLI tools for managing KVM directly? Pretty good, actually.

A base Ubuntu Server 22.04 LTS install actually supports raw KVM instances fairly easily. Load a couple of apt packages and you are off to the races. So I decided to install Ubuntu Server 22.04 LTS back onto my homelab server. But, the exact same installer I had used before to load this machine was now crashing repeatedly. After a little poking around, I concluded Proxmox had left something on the NVMe drive that Ubuntu did not know what to do with, so the installer crashed seconds after trying to make partition changes. A quick dd if=/dev/zero of=/dev/nvme0n1 bs=4096 count=1 wiped the partition table (plus a few blocks), and booting the installer again worked like a charm.
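For reference, this is roughly what that cleanup looked like. The wipefs line is an alternative I did not use here, but it accomplishes a similar reset; be sure the device name matches your disk, since these commands irreversibly destroy the partition table.

```bash
# Sketch of the partition-table wipe that got the Ubuntu installer working again.
# /dev/nvme0n1 is the device in my machine; verify yours with `lsblk` first.
sudo dd if=/dev/zero of=/dev/nvme0n1 bs=4096 count=1   # zero the first 4KiB (MBR + GPT header)
# Alternative (not what I ran): clear all filesystem and partition-table signatures.
# sudo wipefs -a /dev/nvme0n1
```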

On this install, knowing that KVM would not have dozens of components requiring various network addresses, I decided to be selective in my network configuration from the very start. My home network (like many) runs on the 192.168.1.0/24 network from RFC1918 private address space. My Internet Service Provider’s (ISP) Customer Premises Equipment (CPE) is configured by default to serve DHCP addresses from .64 to .253, leaving the lower 63 addresses for static assignment. I chose the CIDR boundary of 192.168.1.16/28 for my homelab so that a few addresses would be exposed to the LAN, so I gave my base install a static address of 192.168.1.16/24, a gateway of 192.168.1.254, nameservers of 1.1.1.1 and 1.0.0.1 (Cloudflare), and search domains of my personal domains. Because I am always logging in by SSH, I used Launchpad’s SSH pubkey import mechanism. (As a side note, I use Launchpad instead of GitHub for this because Launchpad supports IPv6, allowing for IPv6-only installs when desired.) After about 2 minutes, Ubuntu was installed; another 5 minutes later, the updates were done and I was ready to reboot.
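For anyone curious what that looks like once the installer has written it out, the static addressing above corresponds to a netplan configuration roughly like this sketch. The interface name and search domain are placeholders, not my actual values.

```bash
# Hypothetical netplan file matching the static addressing described above.
# enp1s0 and example.org are placeholders; substitute your NIC name and domains.
sudo tee /etc/netplan/01-homelab-static.yaml >/dev/null <<'EOF'
network:
  version: 2
  ethernets:
    enp1s0:
      addresses: [192.168.1.16/24]
      routes:
        - to: default
          via: 192.168.1.254
      nameservers:
        addresses: [1.1.1.1, 1.0.0.1]
        search: [example.org]
EOF
sudo netplan apply
```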

Even though the installer upgrades various components, after reboot I always run sudo apt update && sudo apt upgrade -y to make sure I am completely ready to go with the latest software. Sure enough, there was a new kernel, so another reboot was required. After the system was back up, I was ready to finalize my virtualization environment. A quick series of apt install commands installed the userspace tools I needed.
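The exact list depends on what you want to manage, but my installs looked roughly like the following sketch; the package selection is my reconstruction of a typical minimal KVM/libvirt toolset rather than an exact transcript.

```bash
# Hypervisor, libvirt daemon, CLI clients, and the virt-install tool.
sudo apt install -y qemu-kvm libvirt-daemon-system libvirt-clients virtinst bridge-utils

# Let the regular user manage VMs without sudo (takes effect on next login).
sudo usermod -aG libvirt "$USER"

# Sanity check: the daemon answers and the default NAT network is present.
virsh list --all
virsh net-list --all
```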

Having previously installed and run OpenStack and Proxmox, I already knew my machine had the appropriate hardware and BIOS settings to support KVM, but if needed, the kvm-ok tool can be installed and run with sudo apt install -y cpu-checker && kvm-ok to check the status of the machine's virtualization capabilities.

Out of the box, libvirt on Ubuntu comes with a default virtual network configured that uses NAT. My project involves interfacing directly with the ISP to do IPv6 Prefix Delegation (PD), so I knew I needed to bridge my devices onto the real network. I built three networks: one with bridged forwarding to my LAN, and two isolated networks. There are many options that can be specified within the network definition file (see libvirt Network XML Format). But for a very simple bridge to the real network and for isolated networks, extremely simple XML can be used. More complex configurations will probably require explicitly defining the interface that underlies the bridge.
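As a sketch of how simple that XML can be: the bridged network below assumes a host bridge named br0 already exists on the hypervisor (for example, defined in netplan on top of the physical NIC), and all of the names are placeholders of my choosing, not anything libvirt requires.

```bash
# Bridged network: guests attach to an existing host bridge (br0 is an assumption).
cat > lanbridge.xml <<'EOF'
<network>
  <name>lanbridge</name>
  <forward mode="bridge"/>
  <bridge name="br0"/>
</network>
EOF

# Isolated network: no <forward> element, so traffic stays on the virtual segment.
cat > isolated1.xml <<'EOF'
<network>
  <name>isolated1</name>
  <bridge name="virbr1"/>
</network>
EOF

# Define, start, and autostart both networks.
for net in lanbridge isolated1; do
  virsh net-define "${net}.xml"
  virsh net-start "${net}"
  virsh net-autostart "${net}"
done
```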

After the networks were up, I was able to craft some virt-install commands to fire up some instances from Ubuntu ISOs I had downloaded. Within an hour, I had a fully functional Ubuntu Desktop install and four Ubuntu Server installs running concurrently, plus the hypervisor. Even while updates were running, my typical load was under 1.5. For a total of six operating systems and one full GUI running concurrently, this was very functional on my cheap multi-core setup.
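The invocations looked something like this sketch; the name, sizing, ISO path, and network are placeholders rather than my exact values.

```bash
# Hypothetical virt-install command for one of the Ubuntu Server guests.
virt-install \
  --name server1 \
  --memory 2048 \
  --vcpus 2 \
  --disk path=/var/lib/libvirt/images/server1.qcow2,size=20,format=qcow2 \
  --cdrom /var/lib/libvirt/images/iso/ubuntu-22.04-live-server-amd64.iso \
  --os-variant ubuntu22.04 \
  --network network=lanbridge \
  --graphics vnc
```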

I have already used this homelab over the last two weeks to test several iterations of builds for the pro-bono high-availability project, and it has been a great tool now that their servers are in semi-production and cannot run testing configs during business hours or without advance notice to end-users. Snapshots from the CLI are as simple as virsh snapshot-create <instance-name> and rollbacks are as easy as virsh snapshot-revert <instance-name> --current. With six instances running idle, load average is around 0.30 on a 4-core system. Network latency across the virtual networks is sub-millisecond, and onto the real network it floats around 1 millisecond. Boot and reboot performance vastly exceeds the real environment I’m simulating, because KVM boots have zero delays for BIOS and hardware checks. For comparison, reboot to login prompt is about 20 seconds in the homelab, but 112 seconds on the production servers. If I break something on the network configs (which is very easy to do when configuring high-availability), console access is just a virsh console <instance-name> away. All things considered, I am extremely pleased with this low-budget solution, and it performs far better than I expected for the investment I was able to make.
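For completeness, here is a sketch of that day-to-day snapshot loop, with router1 standing in for whichever instance I am about to break.

```bash
# "router1" is a placeholder domain name for illustration.
virsh snapshot-create-as router1 pre-test "before trying a new HA config"  # take a named snapshot
virsh snapshot-list router1                                                # see what snapshots exist
virsh snapshot-revert router1 --current                                    # roll back to the most recent one
virsh console router1                                                      # serial console when networking breaks
```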