+++
date = "2022-11-01T00:00:00-05:00"
tags = ["systems administration", "pvc","ceph","homelab","servers","networking"]
title = "State of the Servers 2022"
description = "A complete writeup of my homeproduction system as of end-2022, its 10th anniversary"
type = "post"
weight = 1
draft = true
+++
My home lab/production datacentre is my main hobby. While I have others, over the past 10 years I've definitely spent more time on it than anything else. From humble beginnings I've built a system in my basement to rival many small-to-medium enterprises (SMEs) or ISPs, providing me with highly redundant and stable services, both for my Internet presence and for learning.
While I've written about parts of the setup in the past, I don't think I've ever done a complete and thorough writeup of every piece of the system, what went into my choices (spoiler: mostly cost) and design, and how it all fits together. This post is my attempt to rectify that.
For most of its first 8 years, the system was constantly changing, even month-to-month, as I obtained new parts, tinkered away, and just generally broke things for fun. But between COVID, working from home, money being tight, and my maturing into a senior systems architect, things really stabilized, and it has only changed in a few minor ways since 2020. I have big plans for 2023, but for right now things have been stable for long enough that I can really dig into all the parts, as well as hint at those future plans.
So if you dare, please join me on a virtual tour of my "homeproduction" system, the monster in my basement.
## Part One: A Brief History
My homelab journey started over 16 years ago while I was still in high school. At the time I was a serious computer enthusiast, and had more than enough spare parts to build a few home servers. Between then and finishing my college program (Network Engineering and Security Analyst at Mohawk College in Hamilton, Ontario) in late 2012, I went through a variety of setups, almost exclusively single servers with storage and some sort of hypervisor, used just for tinkering and media storage.
When I started my career in earnest in January 2013, I finally had the disposable income to buy my first real server: a used Dell C6100 with 4 blade nodes. This system formed the basis of my lab for the next 6 years, and is still running today in a colo providing live functions for me.
My first few iterations tended to focus on a pair of Xen servers for virtualization and a separate ZFS server for storage, while also going through various combinations of routers, including Mikrotiks, trying to find something that would solve my endless WiFi issues. At this time I was running at most a dozen or so VMs with some core functionality for Internet presence, but nothing too fancy - it was primarily a learning tool. At one point I also tried doing a dual-primary DRBD setup for VM disks, but this went about as well as you might expect (not well at all), so I went back to a ZFS array for ZVOLs. I was also using `bcfg2` for configuration management. Basically, I had fully mirrored what I used and deployed at work, built from the ground up, and it gave me some seriously in-depth knowledge of these tools that proved crucial to my later role.
![Early Homelab Rack #1](/images/state-of-the-servers-2022/early-rack1.png)
Around this time I was also finally stabilizing on a pretty consistent set of systems, and a rumored change to Google's terms for hosted domains prompted me to move one of my first major production services into my home system: email. I can safely say that, having now run email at home for 7 years, it works plenty fine if you take the proper care.
In early 2016 I discovered two critical new things for the management of my systems: Ansible and Ceph. At first, I was using Ansible mostly for ad-hoc tasks, but I quickly started putting together a set of roles to replace bcfg2 as my primary configuration management tool. While declarative configuration management is nice and all, I liked the flexibility of a more procedural, imperative system, especially when creating new roles, and it gave me a lot of power to automate complex program deployments that were impossible in bcfg2. By the end of the year I had fully moved over to Ansible for configuration management. I also started using `git` to track my configuration around this time, so this is the earliest period I still have records of, though I might wish to forget it...
![Writing good Git Commits](/images/state-of-the-servers-2022/joshfailsatgit.jpg)
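Back on the Ansible front, the workflow that won me over is hard to beat for day-to-day administration. As a minimal sketch only - the inventory, playbook, and group names here are hypothetical, not my actual repository layout - it looks something like this:

```bash
# Ad-hoc: check connectivity to every host, then push a one-off change
ansible all -i hosts.ini -m ping
ansible all -i hosts.ini -m apt -a "name=unattended-upgrades state=present" --become

# Repeatable: run the full set of roles against just the mail servers,
# with a dry run first to see what would change
ansible-playbook -i hosts.ini site.yml --limit mailservers --check --diff
ansible-playbook -i hosts.ini site.yml --limit mailservers
```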
Ceph was the real game-changer for me though. For most of the previous 2 years I had been immensely frustrated with my storage host being a single point of failure in my network: if it needed a kernel update, *everything* had to go down. I had looked into some of the more esoteric "enterprise" solutions like multipath SAS and redundant disk arrays, but cost, space, and power requirements kept me from going that route. Then a good friend introduced me to Ceph, which he had been playing with at his workplace. Suddenly I could take 3 generic servers (which he, newly married, was happy to provide due to wife-acceptance-factor reasons) and build a redundant and scalable storage cluster that could tolerate single-host failures. At the time Ceph was a lot more primitive than it is today, forcing some uncommon solutions - for instance, using ZFS as the underlying filestore to ensure corrupt data wouldn't be replicated. But this same cluster still serves me now after many years of tweaking and adjusting, having grown from just a dozen 3TB drives to over 20 8TB and 14TB drives and 168TB of raw space. Getting actual file storage on Ceph was hard back then due to the immaturity of CephFS, and the hack solution of an in-VM XFS array across a dozen 1TB stripes was fraught with issues, but it worked well enough for long enough for CephFS to mature and for me to move to it for bulk data storage.
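The day-to-day care and feeding of a cluster like this really comes down to a handful of standard Ceph commands; a quick sketch of the usual suspects (output omitted, and pool and host names will obviously differ per cluster):

```bash
# Overall cluster health, monitor quorum, and recovery status
ceph -s

# Per-OSD utilization, grouped by host - handy when mixing 8TB and 14TB drives
ceph osd df tree

# Per-pool capacity and usage
ceph df

# Watch recovery/rebalance progress live after swapping a drive
ceph -w
```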
The next major change to my hypervisor stack came in mid-2016. In addition to a job change that introduced me to a lot of new technologies, at that point I was really feeling limited by Xen's interface and its lack of any sort of "batteries included", so I looked into alternatives. I considered ProxMox, but was not at all impressed with its performance, reliability, or featureset; that opinion has not changed since. So I decided on one of my first daring plans: to switch from Xen to KVM+Libvirt, and use Corosync and Pacemaker to manage my VMs, with shared storage provided by the Ceph cluster.
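What made the KVM+Libvirt+Ceph combination so compelling was being able to move running VMs between hosts without touching their disks. As a rough sketch - the domain and host names are made up, and this isn't my literal tooling - the basic flow from the command line looks like:

```bash
# List guests on this hypervisor
virsh list --all

# Live-migrate a guest to another host; with shared Ceph RBD storage only
# the memory state has to move, so this completes in seconds
virsh migrate --live --verbose mailserver qemu+ssh://hv2.example.com/system

# Confirm it landed on the other side
virsh -c qemu+ssh://hv2.example.com/system list
```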
By the end of 2016 I had also finally solved my WiFi problem, using a nice bonus to purchase a pair of Ubiquiti UAP-LR's which were, in addition to their strong signal, capable of proper roaming, finally allowing me to actually cover my entire house with usable WiFi. And a year later I upgraded this to a pair of UAP-AC Pro's for even better speed, keeping one of the UAP-LR's as a separate home automation network. I also moved my routing from the previous Mikrotik Routerboards I was using to pfSense, and bought myself a 10-Gigabit switch to upgrade the connectivity of all of the servers, which overnight nearly doubled the performance of my storage cluster. I also purchased several more servers around this time, first to experiment with, and then to replace my now-aging C6100.
2017 was a year of home automation and routing. I purchased my first set of Belkin WeMo switches, finally set up HomeAssistant, and got to work automating many of my lights, including a [custom voice controller system](https://www.boniface.me/self-hosted-voice-control/). Early in the year I also decided to abandon my long-serving ISP-provided static IP block and move to a new setup. While I liked the services and control it gave me, being DSL over 50-year-old lines, the actual Internet performance was abysmal, and I wanted WAN redundancy. So I set up a remote dedicated server with a static IP block routed to it, then piped this back to my home using OpenVPN tunnels load-balanced over my now-redundant DSL and Cable Internet connections, providing both resiliency and a more "official" online presence. Later in the year, after discussing with a few coworkers, I invested in a proper colocation, abandoned the dedicated server, and used my now-freed and frustrating C6100 as a redundant pair of remote routers, with a pfSense pair on the home side.
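The load-balancing part of that setup is conceptually simple: one tunnel per upstream, then split traffic across them. Purely as an illustration - I'm assuming a Linux endpoint and hypothetical interface names here, not reproducing my actual configuration - an ECMP default route across two OpenVPN tunnels looks like this:

```bash
# One OpenVPN tunnel per upstream (DSL and cable), each pinned to its own
# WAN link, then a multipath default route that spreads flows across both
ip route replace default scope global \
    nexthop dev tun-dsl   weight 1 \
    nexthop dev tun-cable weight 1
```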
In early 2018, the major drawbacks of Corosync and Pacemaker were rearing their ugly heads more and more often: any attempt to restart the service would trash my VM cluster, which had grown to about 30 VMs running many more services by this point. ProxMox still sort of sucked, and OpenStack was nigh-incomprehensible to a single mere wizard like myself. What I wanted was Nutanix, but even most SMEs can't afford that. So, I started building my own, named [PVC or Parallel Virtual Cluster](https://docs.parallelvirtualcluster.org). It wasn't that ambitious at first: I just wanted a replacement for Corosync+Pacemaker which would actually preserve state properly, using Zookeeper as the state management backend. Over time I slowly added more functionality to it, and a major breakthrough came in late 2018 when I left the "new" job and returned to my "old" job, bringing this project with me, and impressing my managers with its potential to replace their aging Xen-based platform (on which I based my original homelab design, ironically enough; the student became the teacher). By early 2020 I had it deployed in production at 2 ISPs, and today have it deployed at 9, plus two in-house clusters, with several more on the way. I discuss PVC in more detail later.
In late 2018, I finally grew fed up with pfSense. The short of it is, nothing config-wise in pfSense is static: events like "the WAN 1 interface went down" would trigger PHP scripts which would regenerate and reload dozens of services, meaning that trivialities like WAN failovers would take up to 30 seconds. Frustrated, I decided to abandon pfSense entirely and replaced my routers with custom-built FreeBSD boxes in line with the remote routers at the colocation. This setup proved invaluable going forward: 1-second failure detection and seamless failover have been instrumental in keeping a 99+% uptime on my *home* system.
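I won't reproduce the full router builds here, but to give a flavour of the failover side: a shared virtual gateway address is the usual building block, and CARP is the stock FreeBSD way to get one. Shown purely as an illustration - the addresses and interface names are hypothetical, and this isn't my exact configuration - it looks roughly like this:

```bash
# Load the carp(4) module (or set carp_load="YES" in /boot/loader.conf)
kldload carp

# Primary router: claim the shared gateway address
ifconfig em0 vhid 1 advskew 0 pass s3cretpass alias 192.0.2.1/32

# Backup router: same vhid and password, higher advskew, so it only takes
# over the address when the primary stops advertising
ifconfig em0 vhid 1 advskew 100 pass s3cretpass alias 192.0.2.1/32
```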
2019 was a fairly quiet year, with some minor upgrades here and there and the occasional server replacement to help keep power usage down. And by early 2020, most of the current system had fallen into place. While the number of VMs still fluctuates month to month, the core set is about 40 that are always running, across 3 hypervisor hosts running PVC, with bulk data on the 3 Ceph nodes. 2 routers on each side provide redundant connectivity, and after an unfortunate complication with my TPIA cable provider, in 2022 I moved to a business-class Gigabit cable connection and added a Fixed-Wireless connection in addition to the existing DSL, bringing me to 3 Internet connections. There's no kill like overkill.
## Part Two: The Rack
![Rack Front](/images/state-of-the-servers-2022/rack-front.png)
![Rack Inside](/images/state-of-the-servers-2022/rack-inside.png)
The rack itself is built primarily of 2x4's and wood paneling. Originally, as seen above, I had used Lack tables, but due to the heat output I wanted to contain the hot air and try to vent it somewhere useful, or at least somewhere less obtrusive. This went through several iterations, and after scouring for an enclosed rack near me to no avail, in ~2017 I took what is now a common refrain for me and built my own.
The primary construction uses 6 ~6-foot 2x4 risers, which are connected at the top and bottom by horizontal 2x4's to form a stable frame. Heavy castors are present on the bottom below each riser to allow for (relatively) easy movement of the rack around the room as needed, for instance for maintenance or enhancements. The bottom front section also features further horizontal 2x4's to form a base for the heavy UPSes discussed in the next section.
The actual servers sit on pieces of angle iron cut to approximately 3 feet, which bridge the first and second sets of risers on each side and are secured by 2 heavy screws and washers on each riser. This provides an extremely stable support for even the heaviest servers I have, and allows for fairly easy maintenance without having to deal with traditional rails and their mounting points.
The outside of the entire rack is covered by thin veneer paneling to trap heat inside in a controlled way. On the left side, the back section forms a door which can be opened to provide access to the backs of the servers, cabling, and power connections.
I've gone through several airflow configurations to try to keep both the rack itself, and the room it's in, cooler. First, I attempted to directly exhaust the hot air out the adjoining window, but this was too prone to seasonal temperature variation to be useful. I then attempted to route the heat out the side of the rack to the front, where it could be cooled by an air conditioner, but this proved ineffective as well. Finally, I moved to a simple, physics-powered solution whereby the top 6 inches of the rack is used to direct hot air, via a set of 4 fans in 2 pairs, towards the top front of the rack and out into the room; this works very well to keep the inside temperature of the rack at a relatively reasonable 35 degrees Celsius.
## Part Three: Cooling and Power
Continuing on from the rack exhaust, cooling inside the room is provided by a standalone portable 12,000 BTU air conditioner from Toshiba.
## Part Four: The Internet Connections and Colocation
## Part Five: The Network
## Part Six: The Bulk Storage Cluster and Backups
## Part Seven: The Hypervisor Cluster
## Part Eight: The "Infrastructure" VMs
## Part Nine: The Matrix/Synapse VMs
## Part Ten: The Other VMs
## Part Eleven: Diagrams!
## Part Twelve: A Video Tour
## Part Thirteen: The Future of the System