Compare commits

...

109 Commits

Author SHA1 Message Date
f486b2d3ae Fix incorrect version 2024-12-27 21:22:28 -05:00
4190b802a5 Fix link name 2024-12-27 21:19:41 -05:00
0e48068d19 Fix links 2024-12-27 21:18:01 -05:00
98aff8fb4c Add missing step 2024-12-27 21:16:06 -05:00
f891b3c501 Fix hostnames 2024-12-27 21:13:24 -05:00
d457f74d39 Fix indenting 2024-12-27 21:12:19 -05:00
c26034381b Improve formatting 2024-12-27 21:08:17 -05:00
a782113d44 Remove header title 2024-12-27 20:46:52 -05:00
1d2b9d7a99 Add TOC for getting started guide 2024-12-27 20:45:30 -05:00
6db2201c24 Improve wording 2024-12-27 20:44:42 -05:00
1f419ddc64 Be more forceful 2024-12-27 20:37:50 -05:00
8802b28034 Fix incorrect heading 2024-12-27 20:29:46 -05:00
0abbc34ba4 Improve georedundancy documentation 2024-12-27 20:25:47 -05:00
f22fae8277 Add automirror references 2024-11-18 11:00:12 -05:00
f453bd6ac5 Add automirror and adjust snapshot age 2024-11-16 13:47:48 -05:00
ab800a666e Adjust emojis consistently and add net warning 2024-11-14 12:16:39 -05:00
8d584740ea Update Docs index 2024-10-25 23:53:18 -04:00
10ed0526d1 Update README badge order 2024-10-25 23:48:11 -04:00
03aecb1a26 Update README 2024-10-25 23:45:51 -04:00
13635eb883 Update README 2024-10-25 23:38:36 -04:00
ba46e8c3ef Update README 2024-10-25 23:29:16 -04:00
043fa0da7f Fix formatting 2024-10-25 03:03:06 -04:00
6769bdb086 Unify formatting 2024-10-25 03:01:15 -04:00
9039bc0b9d Fix indentation 2024-10-25 02:57:24 -04:00
004be3de16 Add OVA mention 2024-10-25 02:56:09 -04:00
5301e47614 Rework some wording 2024-10-25 02:54:48 -04:00
8ddd9dc965 Move the provisioner guide 2024-10-25 02:52:51 -04:00
fdabdbb52d Remove mention of obsolete worksheet 2024-10-25 02:50:47 -04:00
0e10395419 Bump the base Debian version 2024-10-25 02:50:13 -04:00
1dd8e07a55 Add notes about mirrors 2024-10-25 02:48:26 -04:00
add583472a Fix typo 2024-10-25 02:43:21 -04:00
93ef4985a5 Up to 8 spaces 2024-10-25 02:39:47 -04:00
17244a4b96 Add sidebar to API reference 2024-10-25 02:38:59 -04:00
3973d5d079 Add more indentation 2024-10-25 02:37:07 -04:00
35fe7c64b6 Add trailing spaces too 2024-10-25 02:36:39 -04:00
ecffc2412c Try more spaces 2024-10-25 02:35:07 -04:00
43bc38b998 Try more newlines 2024-10-25 02:33:50 -04:00
0aaa28d5df Try backticks 2024-10-25 02:32:22 -04:00
40647b785c Try to fix formatting of hosts example 2024-10-25 02:31:05 -04:00
968855eca8 Fix formatting 2024-10-25 02:29:31 -04:00
e81211e3c6 Fix formatting 2024-10-25 02:28:26 -04:00
c6eddb6ece Fix formatting 2024-10-25 02:25:47 -04:00
e853f972f1 Fix indents 2024-10-25 02:22:37 -04:00
9c6ed63278 Update the Getting Started documentation 2024-10-25 02:13:08 -04:00
7ee0598b08 Update swagger spec 2024-10-19 11:48:52 -04:00
f809bf8166 Update API doc for 0.9.101/102 2024-10-18 01:29:17 -04:00
1b15c92e51 Update the description of VM define endpoint 2024-10-01 13:31:07 -04:00
4dc77a66f4 Add proper response schema for 202 responses 2024-10-01 13:26:23 -04:00
f940b2ff44 Update API documentation and link 2024-09-30 20:51:12 -04:00
1c3eec48ea Update spec for upcoming release 2024-09-07 12:33:09 -04:00
e0081f73f8 Update Swagger spec for 0.9.99 2024-08-28 11:35:32 -04:00
1a615dbf50 Fix invalid ref 2024-08-19 16:56:32 -04:00
36c2237f6c Update swagger docs 2024-08-19 16:49:52 -04:00
d43ee44a0a Update Swagger doc for 0.9.94 2024-05-27 09:31:55 -04:00
54a00497db Add white transparent logo 2024-05-27 09:31:26 -04:00
b9f3bcbb00 Update Software diagram for 0.9.86+ 2024-01-11 10:37:40 -05:00
b1e06dbf54 Add migration max downtime metafield for VMs 2024-01-10 16:30:52 -05:00
1eeb3bd778 Add Zookeeper metric endpoint and update descrs 2024-01-10 16:30:32 -05:00
3b11a74597 Update Metrics endpoint details 2023-12-25 03:05:26 -05:00
a424e420b5 Remove WebUI from README 2023-12-25 02:49:53 -05:00
a07c995a5a Add VNC info to screenshots 2023-12-11 03:41:07 -05:00
33d67fe03a Remove debug output from image 2023-12-11 03:14:26 -05:00
863202293c Fix output bugs in VM information 2023-12-11 03:05:03 -05:00
6f7f0a834e Finish missing sentence 2023-12-11 02:39:51 -05:00
da61d92a67 Add Grafana dashboard screenshot 2023-12-11 00:39:41 -05:00
ac0bab2b29 Update index.md to match project README 2023-12-10 23:53:09 -05:00
d28246a15b Add API endpoints for 0.9.83 and 0.9.84 2023-12-09 23:47:42 -05:00
b9500034c7 Add new features and version link 2023-10-27 09:47:50 -04:00
e4372b354c Further point tweaks 2023-10-24 16:34:06 -04:00
105c122a6d Revamp a few more points 2023-10-24 16:32:31 -04:00
44b9278cb6 Fix some entryies 2023-10-24 16:09:38 -04:00
a6f29bd350 Spice up initial tagline 2023-10-24 16:05:57 -04:00
c3ae2ae622 Add core features to About page 2023-10-24 16:04:54 -04:00
7e72a0cd66 Add VM backup and restore API endpoints 2023-10-24 02:13:54 -04:00
b86033f2f3 Update API swagger definitions 2023-10-03 09:43:00 -04:00
eaf9b6927c Update API JSON for 0.9.78 2023-10-01 15:24:54 -04:00
5d3ec9a793 Fix bad link 2023-09-21 22:51:48 -04:00
8832a81fa7 Correct spelling errors 2023-09-21 22:50:53 -04:00
09b485988a Fix spelling errors 2023-09-21 22:50:02 -04:00
8964e0aa3c Reorganize node placement 2023-09-21 22:48:20 -04:00
0185853873 Adjust wording in final recommendations 2023-09-21 22:44:32 -04:00
70766dfef2 Avoid saying significant too much 2023-09-21 22:43:29 -04:00
9c1ed0bc57 Add line explaining the diagram 2023-09-21 22:42:04 -04:00
9168792e51 Fix image link 2023-09-21 22:39:00 -04:00
1673448228 Add Georedundancy documentation 2023-09-21 22:13:11 -04:00
3638f3ff21 Mention software req of monitoring 2023-09-21 00:31:35 -04:00
defe4719c5 Remove year ranges 2023-09-20 22:31:09 -04:00
49f391206d Add architecture notes 2023-09-20 22:30:24 -04:00
216cd4426c Fix spelling of ProLiant 2023-09-20 22:26:06 -04:00
cb408b506d Improve wording of N-1 section 2023-09-20 22:25:17 -04:00
e49091f6d4 Mention that fancing only occurs to run state nodes 2023-09-17 20:30:43 -04:00
5262cabaff Move video to top and adjust wording 2023-09-17 13:05:04 -04:00
1172745a96 Fix video embedding 2023-09-17 13:01:05 -04:00
b1e39ff4af Add video to Fencing article 2023-09-17 12:58:19 -04:00
2f998069f6 Fix remaining links 2023-09-17 00:46:31 -04:00
476eddc0f6 Try fixing nested links 2023-09-17 00:43:11 -04:00
8f490a6bfb Rehash table titles for width 2023-09-17 00:41:22 -04:00
f3c513a262 Fix about link 2023-09-17 00:40:27 -04:00
ff0b00683d Remove absolute paths from md links 2023-09-17 00:37:18 -04:00
0777823695 Fix bad links 2023-09-17 00:08:10 -04:00
a9dde4b65e Update wording 2023-09-17 00:04:14 -04:00
0372584f08 Fix bad table row 2023-09-17 00:03:22 -04:00
d8ebc7de1f Reorganize documentation 2023-09-16 23:59:55 -04:00
28950a4d90 Fix broken table and situations 2023-09-16 23:55:50 -04:00
836a61708e Try adjusting width randomly 2023-09-16 23:52:09 -04:00
d3de778ca3 Remove unneeded column 2023-09-16 23:50:23 -04:00
ac7b25dac1 Update navigation pages 2023-09-16 23:46:55 -04:00
daddf13b01 Add fencing documentation 2023-09-16 23:46:18 -04:00
bd541bbe49 Remove double word 2023-09-16 18:40:08 -04:00
39 changed files with 4150 additions and 622 deletions

View File

@ -1,10 +1,11 @@
<p align="center">
<img alt="Logo banner" src="docs/images/pvc_logo_black.png"/>
<img alt="Logo banner" src="https://docs.parallelvirtualcluster.org/en/latest/images/pvc_logo_black.png"/>
<br/><br/>
<a href="https://www.parallelvirtualcluster.org"><img alt="Website" src="https://img.shields.io/badge/visit-website-blue"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Latest Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
<a href="https://docs.parallelvirtualcluster.org/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc"><img alt="License" src="https://img.shields.io/github/license/parallelvirtualcluster/pvc"/></a>
<a href="https://github.com/psf/black"><img alt="Code style: Black" src="https://img.shields.io/badge/code%20style-black-000000.svg"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
<a href="https://docs.parallelvirtualcluster.org/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
</p>
## What is PVC?
@ -19,41 +20,13 @@ As a consequence of its features, PVC makes administrating very high-uptime VMs
PVC also features an optional, fully customizable VM provisioning framework, designed to automate and simplify VM deployments using custom provisioning profiles, scripts, and CloudInit userdata API support.
Installation of PVC is accomplished by two main components: a [Node installer ISO](https://github.com/parallelvirtualcluster/pvc-installer) which creates on-demand installer ISOs, and an [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) to configure, bootstrap, and administrate the nodes. Installation can also be fully automated with a companion [cluster bootstrapping system](https://github.com/parallelvirtualcluster/pvc-bootstrap). Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client or WebUI.
Installation of PVC is accomplished by two main components: a [Node installer ISO](https://github.com/parallelvirtualcluster/pvc-installer) which creates on-demand installer ISOs, and an [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) to configure, bootstrap, and administrate the nodes. Installation can also be fully automated with a companion [cluster bootstrapping system](https://github.com/parallelvirtualcluster/pvc-bootstrap). Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client ~~or WebUI~~ (eventually).
Just give it physical servers, and it will run your VMs without you having to think about it, all in just an hour or two of setup time.
More information about PVC, its motivations, the hardware requirements, and setting up and managing a cluster [can be found over at our docs page](https://docs.parallelvirtualcluster.org).
## What is it based on?
## Documentation
The core node and API daemons, as well as the CLI API client, are written in Python 3 and are fully Free Software (GNU GPL v3). In addition to these, PVC makes use of the following software tools to provide a holistic hyperconverged infrastructure solution:
This repository contains the MkDocs configuration for the https://docs.parallelvirtualcluster.org ReadTheDocs page.
* Debian GNU/Linux as the base OS.
* Linux KVM, QEMU, and Libvirt for VM management.
* Linux `ip`, FRRouting, NFTables, DNSMasq, and PowerDNS for network management.
* Ceph for storage management.
* Apache Zookeeper for the primary cluster state database.
* Patroni PostgreSQL manager for the secondary relation databases (DNS aggregation, Provisioner configuration).
## Getting Started
To get started with PVC, please see the [About](https://docs.parallelvirtualcluster.org/en/latest/about/) page for general information about the project, and the [Getting Started](https://docs.parallelvirtualcluster.org/en/latest/getting-started/) page for details on configuring your first cluster.
## Changelog
View the changelog in [CHANGELOG.md](CHANGELOG.md).
## Screenshots
While PVC's API and internals aren't very screenshot-worthy, here is some example output of the CLI tool.
<p><img alt="Node listing" src="docs/images/pvc-nodes.png"/><br/><i>Listing the nodes in a cluster</i></p>
<p><img alt="Network listing" src="docs/images/pvc-networks.png"/><br/><i>Listing the networks in a cluster, showing 3 bridged and 1 IPv4-only managed networks</i></p>
<p><img alt="VM listing and migration" src="docs/images/pvc-migration.png"/><br/><i>Listing a limited set of VMs and migrating one with status updates</i></p>
<p><img alt="Node logs" src="docs/images/pvc-nodelog.png"/><br/><i>Viewing the logs of a node (keepalives and VM [un]migration)</i></p>

View File

@ -2,7 +2,7 @@
title: "About PVC"
---
This document outlines the basic ideas and inspiration behind PVC as well as some frequently asked questions.
This document outlines the basic ideas and inspiration behind PVC as well as the core feature set, the underlying technology, and some frequently asked questions.
[TOC]
@ -24,6 +24,71 @@ PVC aims to bridge the gaps between these 3 categories. Like the larger FLOSS an
In short, it is a Free Software, scalable, redundant, self-healing, and self-managing private cloud solution designed with administrator simplicity in mind.
## Core Features
All features are as of the latest version: <a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
### Overall/Nodes
* Node-level redundancy & node N-1 fault tolerance
* Cluster- and node-level monitoring
* Stable base operating system (5+ year support)
* Convenient, holistic view of the cluster (resources, devices, VMs, etc.) via CLI and API
* Deployment, management, updates, and base OS upgrades via straightforward [Ansible playbooks](https://github.com/parallelvirtualcluster/pvc-ansible) and [a custom installer ISO](https://github.com/parallelvirtualcluster/pvc-installer)
* External [bootstrap system](https://github.com/parallelvirtualcluster/pvc-bootstrap) for low-touch cluster deployment
* Cluster-level backup and restore
* Node hot add/remove from service (flush/unflush/restore) for maintenance
* Automatic fencing of unresponsive node(s) and recovery of affected VMs (conditional)
* Cluster maintenance state (allows monitoring/alerting pause while performing maintenance)
* Included Munin and CheckMK monitoring plugins
### Virtual Machine Management
* Full VM lifecycle management (start/stop/restart/shutdown/disable)
* Live-migration (zero-downtime move) of VMs between nodes
* Automatic restarting of failed VMs
* (For supporting VMs) Serial console logging with interactive follow
* VNC console support with flexible listen directives
* Simple resource management (vCPU/memory) w/restart
* Hot attach/detach of virtual NICs and block devices
* Tag support for organization/classification
* VM hot/online snapshot creation (disks + configuration), with incremental image support, management (delete), and restore
* VM autobackups with self-contained backup rotation and optional automatic mounting of remote storage resources
* VM snapshot shipping to external clusters (mirroring) and mirror promotion
* VM automirrors with self-contained snapshot rotation for regular creation of mirrors
### Network Management
* Bridged (vLAN), Managed (VXLAN, virtual), and Direct (SR-IOV) VM networks
* Consistent cluster view (all nodes are provisioned with all networks) for Bridged and Managed VM networks
* DHCP, DNS, NTP, and TFTP support for Managed VM networks
* Upstream BGP for route learning for Managed VM networks
### Storage Management
* Distributed & replicated self-healing storage backend (Ceph Object Store) with high availability and node-level redundancy
* Shared storage for VM storage volumes/virtual disks (Ceph RBD)
* Integrated monitoring and alerting into PVC frontend
* Zero-cost snapshots
* Flexible pool replication configurations for large or complex clusters
* Support for arbitrary data disk sizes (with limits)
### Provisioning
* Integrated, highly flexible VM provisioning system
* Define custom Python 3 install scripts or use included examples for common OSes
* Amazon EC2-compatible CloudInit "userdata" support
* Define dynamic VM profiles from component templates (system, network, disk), scripts, and userdata
* OVA VM package support
* Virtual disk import (raw, VMDK, qcow2, and others) support
* Volume cloning support (cloning VMs)
### Other
* Free, Libre and Open Source (FLOSS) software
* Written in modern Python 3
* Well-maintained and frequently updated
## Building Blocks
PVC itself is a series of software daemons (services) written in Python 3, with the CLI interface also written in Python 3, designed to glue other FLOSS tools together in order to provide a consistent cluster operation and management experience.
@ -65,7 +130,7 @@ If all you want is a simple home server solution, or you demand scalability beyo
For a redundant cluster, yes. PVC requires a majority quorum for proper operation at various levels, and the smallest possible majority quorum is 2-of-3; thus 3 nodes is the smallest safe minimum. That said, you can run PVC on a single node for testing/lab purposes without host-level redundancy, should you wish to do so, and it might also be possible to run 2 "main" systems with a 3rd "quorum observer" hosting only the management tools but no VMs; however these options are not officially supported, as PVC is designed primarily for 3+ node operation.
For more details, see the [Cluster Architecture page](/deployment/cluster-architecture).
For more details, see the [Cluster Architecture page](architecture/cluster-architecture.md).
#### Does PVC support containers (Docker/Kubernetes/LXC/etc.)?

View File

@ -78,7 +78,7 @@ Many PVC daemons, as discussed below, leverage a majority quorum to function. A
This is an important consideration when deciding the number of coordinators to allocate: a 3-coordinator system can tolerate the loss of a single coordinator without impacting the cluster, but losing 2 would render the cluster inoperable; similarly, a 5-coordinator system can tolerate the loss of 2 coordinators, but losing 3 would render the cluster inoperable. In addition, these coordinators must be located in such a way that a majority can communicate in outage events, in order for the cluster to remain operational. This affects the network and physical design of a cluster and must be carefully considered during deployment; for instance, network switches and links, and power, should be redundant.
For more details on this, see the [Fencing & Georedundancy](/deployment/fencing-and-georedundancy) documentation. This document also covers the node fencing process, which allows automatic recovery from a node failure in certain outage events.
For more details on this, see the [Fencing](fencing.md) and [Georedundancy](georedundancy.md) documentation. The first also covers the node fencing process, which allows automatic recovery from a node failure in certain outage events.
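
To make the quorum arithmetic above concrete, here is a minimal sketch (not part of PVC itself; the function names are illustrative only) of the majority calculation:

```python
# Minimal sketch of the coordinator quorum arithmetic described above: the
# cluster remains operational only while a strict majority of coordinators
# can still communicate.

def quorum_size(coordinators: int) -> int:
    """Smallest number of coordinators that forms a majority."""
    return coordinators // 2 + 1

def tolerable_failures(coordinators: int) -> int:
    """How many coordinators can be lost while still keeping a majority."""
    return coordinators - quorum_size(coordinators)

for n in (3, 5):
    print(f"{n} coordinators: majority = {quorum_size(n)}, "
          f"tolerates the loss of {tolerable_failures(n)}")
# 3 coordinators: majority = 2, tolerates the loss of 1
# 5 coordinators: majority = 3, tolerates the loss of 2
```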
Hypervisors are not affected by the coordinator quorum: a cluster can lose any number of non-coordinator hypervisors without impacting core services, though compute resources (CPU and memory) must be available on the remaining nodes for VMs to function properly, and any OSDs on these hypervisors, if applicable, would become unavailable, potentially impacting storage availability.
@ -92,7 +92,7 @@ The Zookeeper database runs on the coordinator nodes, and requires a majority qu
### Patroni/PostgreSQL
PVC uses the Patroni PostgreSQL cluster manager to store relational data for use by the [Provisioner subsystem](/manuals/provisioner) and managed network DNS aggregation.
PVC uses the Patroni PostgreSQL cluster manager to store relational data for use by the [Provisioner subsystem](../deployment/provisioner) and managed network DNS aggregation.
The Patroni system runs on the coordinator nodes, with the primary coordinator taking on the "leader" role (read-write) and all others taking on the "follower" role (read-only). Patroni leverages Zookeeper to handle state, and is thus dependent on Zookeeper to function.
@ -108,7 +108,7 @@ The Ceph storage system features multiple layers. First, OSDs are created on ded
This default layout provides several benefits, including multi-node replication, the ability to tolerate the loss of a full node without impacting storage, and shared storage facilitating live migration, at the cost of a 3x storage penalty. Additional replication modes (for instance, more copies) are possible to provide more resiliency at the cost of a larger storage penalty.
It can be important to more advanced configurations to understand how disk writes work in this system to properly understand the implications of this replication. Please see the [Ceph Write Process](/manuals/ceph-write-process) documentation for a full explanation.
It can be important to more advanced configurations to understand how disk writes work in this system to properly understand the implications of this replication. Please see the [Ceph Write Process](../manuals/ceph-write-process.md) documentation for a full explanation.
## Cluster Networking
@ -120,7 +120,7 @@ Within each core network, each node is assigned a static IP address; DHCP is not
In addition to the main static IP of each node, there is also a "floating" IP in each network which is bound to the primary coordinator. This IP can be used as a single point of access into the cluster for the API or other services that need to see the "cluster as a whole" rather than individual nodes.
Some or all of these networks can be collapsed, though for optimal performance and security, it is recommended that, at a minimum, the "upstream" and "cluster"/"storage" networks be separated. The physical aspect is discussed further in the [Hardware Requirements](/deployment/hardware-requirements) documentation, however larger clusters should generally lean towards splitting these networks into separate physical, as well as logical, links.
Some or all of these networks can be collapsed, though for optimal performance and security, it is recommended that, at a minimum, the "upstream" and "cluster"/"storage" networks be separated. The physical aspect is discussed further in the [Hardware Requirements](hardware-requirements.md) documentation, however larger clusters should generally lean towards splitting these networks into separate physical, as well as logical, links.
#### Upstream
@ -130,7 +130,7 @@ The "upstream" network requires outbound Internet access, as it will be used to
This network, though it requires Internet access, should not be exposed directly to the Internet or to other untrusted local networks for security reasons. PVC itself makes no attempt to hinder access to nodes from within this network. At a minimum, an upstream firewall should prevent external access to this network, and only trusted hosts or on-cluster VMs should be added to it.
In addition to all other functions, server IPMI interfaces should reside either directly in this network, or in a network directly reachable from this network, to provide fencing and auto-recovery functionality. For more details, see the [Fencing & Georedundancy](/deployment/fencing-and-georedundancy) documentation.
In addition to all other functions, server IPMI interfaces should reside either directly in this network, or in a network directly reachable from this network, to provide fencing and auto-recovery functionality. For more details, see the [Fencing](fencing.md) documentation.
#### Cluster
@ -156,9 +156,9 @@ Managed client networks leverage the EBGP VXLAN subsystem to provide virtual lay
PVC can provide services to clients in this network via the DNSMasq subsystem, including IPv4 and IPv6 routing, firewalling, DHCP, DNS, and NTP. An upstream router must be configured to accept and return traffic from these network(s), either via BGP or static routing, if outside access is required.
**NOTE:** Be aware of the potential for "tromboning" when routing between managed networks. All traffic to and from a managed network will flow out the primary coordinator. Thus, if there is a large amount of inter-network traffic between two managed networks, all this traffic will traverse the primary coordinator, introducing a potential bottleneck. To avoid this, keep the amount of inter-network routing between managed networks or between managed networks and the outside world to a minimum.
📝 **NOTE** Be aware of the potential for "tromboning" when routing between managed networks. All traffic to and from a managed network will flow out the primary coordinator. Thus, if there is a large amount of inter-network traffic between two managed networks, all this traffic will traverse the primary coordinator, introducing a potential bottleneck. To avoid this, keep the amount of inter-network routing between managed networks or between managed networks and the outside world to a minimum.
One major purpose of managed networks is to provide a bootstrapping mechanism for new VMs deployed using the [PVC provisioner](/manuals/provisioner) with CloudInit metadata services (see that documentation for details). Such deployments will require at least one managed network to provide access to the CloudInit metadata system.
One major purpose of managed networks is to provide a bootstrapping mechanism for new VMs deployed using the [PVC provisioner](../deployment/provisioner) with CloudInit metadata services (see that documentation for details). Such deployments will require at least one managed network to provide access to the CloudInit metadata system.
#### Bridged
@ -174,13 +174,13 @@ SR-IOV provides two mechanisms for directly passing underlying network devices i
SR-IOV networks require static configuration of the hypervisor nodes, both to define the PFs and to define how many VFs can be created on each PF. These options are defined with the `sriov_device` and `vfcount` options in the `pvcnoded.yaml` configuration file.
**NOTE:** Changing the PF or VF configuration cannot be done dynamically, and requires a restart of the `pvcnoded` daemon.
📝 **NOTE** Changing the PF or VF configuration cannot be done dynamically, and requires a restart of the `pvcnoded` daemon.
**NOTE:** Some SR-IOV NICs, specifically Intel NICs, cannot have the `vfcount` modified during runtime after being set. The node must be rebooted for changes to be applied.
📝 **NOTE** Some SR-IOV NICs, specifically Intel NICs, cannot have the `vfcount` modified during runtime after being set. The node must be rebooted for changes to be applied.
Once one or more PFs are configured, VFs can then be created on individual nodes via the PVC API, which can then be mapped to VMs in a 1-to-1 relationship.
**NOTE:** The administrator must be careful to ensure the allocated VFs and PFs are identical between all nodes, otherwise migration of VMs between nodes can result in incorrect network assignments.
📝 **NOTE** The administrator must be careful to ensure the allocated VFs and PFs are identical between all nodes, otherwise migration of VMs between nodes can result in incorrect network assignments.
Once VFs are created, they may be attached to VMs using one of the two strategies mentioned above. Each strategy has trade-offs, so careful consideration is required:

View File

@ -0,0 +1,101 @@
---
title: Fencing
---
PVC features a fencing system to provide automatic recovery of nodes from certain failure scenarios. This document details the fencing process, limitations, and expectations.
You can also view a video demonstration of the fencing process in action here:
[![Fencing Demonstration](https://img.youtube.com/vi/ZnhJ91-5y1Q/hqdefault.jpg)](https://youtu.be/ZnhJ91-5y1Q)
[TOC]
## Overview
Fencing in PVC provides a mechanism for a cluster's nodes to determine if one of their active (`run` state) peers has stopped responding, take action to ensure the failed node is fully power-cycled, and then, if successful, automatically bring up affected VMs from the dead node onto others awaiting its return to service.
Properly configured fencing can thus help ensure the maximum uptime for VMs in the case of a faulty node.
Fencing is enabled by default for all nodes that have the `fence_intervals` configuration key set and for which the node's IPMI is reachable and usable via `ipmitool` on the peers. Nodes check their own IPMI at daemon startup to validate this and print a warning if the check fails; in addition, a regular health check monitors the IPMI interface and will degrade the node health if it is not reachable or not responding.
Fencing can be temporarily disabled by setting the cluster maintenance mode to `on` and resumed by setting it to `off`. This can be useful during maintenance events; however, the administrator should be careful to `flush` running VMs from any affected nodes first to avoid trouble.
## IPMI Configuration
For fencing to be enabled, several configurations must be correctly set.
* The node must have a proper IPMI interface, as detailed in the [Hardware Requirements](hardware-requirements.md#ipmilights-out-management) documentation.
* The IPMI interface must be either in the [cluster "upstream" network](cluster-architecture.md#upstream), or in another network reachable by it. The former is strongly recommended, because the latter is potentially susceptible to network faults in the routing between the networks which might cause fencing to fail in otherwise valid scenarios.
* The IPMI BMC must be configured with an `Administrator`-level user with IPMI-over-LAN privileges enabled.
* The IPMI interface (IP or hostname) and aforementioned user of each node must be configured in the `fencing` -> `ipmi` section of the `pvcnoded.yaml` file of that node.
PVC will automatically check the reachability of its IPMI and its functionality early during node startup. The functionality can also be tested via the `ipmitool -I lanplus` command from a node.
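
As a rough illustration of that manual test, the following sketch (not part of PVC; the host, user, and password are placeholders) wraps the same `ipmitool -I lanplus` invocation in Python:

```python
# Minimal sketch of an IPMI-over-LAN reachability test, equivalent to running
# "ipmitool -I lanplus ... chassis power status" by hand.
# The host, user, and password below are placeholders, not real values.
import subprocess

def ipmi_reachable(host: str, user: str, password: str) -> bool:
    """Return True if the BMC answers a chassis power status query."""
    try:
        result = subprocess.run(
            ["ipmitool", "-I", "lanplus", "-H", host,
             "-U", user, "-P", password, "chassis", "power", "status"],
            capture_output=True, text=True, timeout=10,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and "Chassis Power is" in result.stdout

print(ipmi_reachable("hv1-lom.example.tld", "Administrator", "changeme"))
```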
The [PVC Ansible framework](../deployment/getting-started.md) will automatically configure most aspects of this IPMI setup, though some might require manual configuration. Ensure you test before putting the cluster into production.
## Fencing Process
### Dead Node Detection
Node fencing is handled during regular node keepalive events. Keepalives occur every 5 seconds (default `keepalive_interval`), during which each node checks into the cluster by providing the current UNIX epoch timestamp in a configuration key.
At the end of each keepalive event, all nodes check their peers' timestamps and compare them against the current time. If the peers detect that a node in `run` daemon state has not checked in for 6 intervals (default `fence_intervals`), or 30 seconds by default, one node at random will begin the fencing process as the watching node. First, a timer is started for 6 more `keepalive_intervals` (hard-coded), during which a check-in from the dead node will cancel the fence (a "saving throw").
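
The timing implied by these defaults is summarized in this small sketch (values taken from the defaults named above; the constant names are illustrative, not actual configuration keys):

```python
# Fence-detection timeline using the default values described above.
KEEPALIVE_INTERVAL = 5       # seconds between keepalive check-ins
FENCE_INTERVALS = 6          # missed intervals before a node is considered dead
SAVING_THROW_INTERVALS = 6   # hard-coded extra intervals before fencing starts

detection_delay = KEEPALIVE_INTERVAL * FENCE_INTERVALS              # 30 seconds
saving_throw_window = KEEPALIVE_INTERVAL * SAVING_THROW_INTERVALS   # 30 seconds
fence_start = detection_delay + saving_throw_window                 # ~60 seconds

print(f"node considered dead after ~{detection_delay}s of silence")
print(f"fencing begins ~{fence_start}s after the last valid keepalive")
```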
### Dead Node Fencing
If all 6 saving throw intervals pass without further updates to the dead node's timestamp, actual fencing will begin; by default this will be 60-65 seconds after the last valid keepalive. The exact process is as follows, all run from the selected watching node:
1. The dead node is issued a `chassis power off` via IPMI-over-LAN to trigger an immediate power off.
1. Wait 1 second.
1. The `chassis power state` of the dead node is checked and recorded.
1. The dead node is issued a `chassis power on` via IPMI-over-LAN to trigger a power on.
1. Wait 2 seconds.
1. The `chassis power state` of the dead node is checked and recorded.
With these 6 steps and the 2 saved results of the `chassis power state`, PVC can determine with near certainty that the dead node was actually powered off, and thus that any VMs that were potentially running on it were terminated. Specifically, if the first result was `Off` and the second was any valid value, the node was definitely shut down (either on its own, or by the first `chassis power off` command). If it cannot determine this, for instance because IPMI was unreachable or neither power state result was `Off`, no action is taken.
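
The decision rule can be restated as a small sketch (the real logic lives in the node daemon; this simply encodes the rule from the paragraph above):

```python
from typing import Optional

# Encodes the rule described above: a fence is only treated as successful if
# the first recorded power state proves the node was off; if IPMI was
# unreachable or neither result was "Off", no recovery action is taken.
def fence_confirmed(first_state: Optional[str], second_state: Optional[str]) -> bool:
    if first_state is None or second_state is None:
        return False  # IPMI unreachable: node state cannot be proven
    return first_state == "Off"

print(fence_confirmed("Off", "On"))   # True  -> safe to recover VMs elsewhere
print(fence_confirmed("On", "On"))    # False -> no action taken
print(fence_confirmed(None, None))    # False -> no action taken
```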
### VM Recovery
Once a dead node has been successfully fenced and at least 1 more `keepalive_interval` has passed, the watching node will begin fencing recovery.
What action is taken during fencing recovery is dependent on the `successful_fence` configuration key, which can either be `migrate`, which will perform the below steps, or `none` which will perform no recovery action and stop here.
First, the node is put into a special `fencing-flush` domain state, to indicate that it is undergoing a forced flush after fencing. Then, for each VM which was running on the dead node:
1. The RBD locks on all VM storage volumes are cleared.
1. The VM is temporarily `migrate`d to one active peer node based on the node's configured `target_selector` (default `mem`).
1. The VM is started up.
If, at a later time, the dead node successfully recovers and resumes normal operation, it can be put back into service. This **will not** occur automatically, as the node could still be in a bad state and only barely operating; an administrator must closely inspect the node and restore it to service manually after confirming correct operation.
### Failures
If a fence fails for any reason (for instance, the IPMI of the dead node is not reachable), by default no action is taken, as this could be unsafe for the integrity of VM data. This can be overridden by adjusting the `failed_fence` configuration key in conjunction with the node suicide discussed below, however this is strongly discouraged.
### Node Suicide
As an alternative to remote fencing, nodes can be configured to kill themselves by adjusting the `suicide_intervals` configuration key to a non-zero value. If the node itself does not check in for this many intervals, it will trigger a self restart via the `reboot -f` command. However, this is not reliable, and the other nodes will have no way of accurately determining the state of the node and whether VMs are safe to migrate, so this is strongly discouraged.
## Valid Fencing Conditions
The conditions in which a node can be successfully fenced are limited, and thus, auto-recovery is limited only to those situations where a fence can succeed. In short, any situation whereby a node's OS is not responding normally, but its IPMI interface is still up and available, should succeed in a fence; in contrast, those where the IPMI interface is also unavailable will fail.
The following table covers some common scenarios, and whether fencing (and subsequent automatic recovery) can be expected to occur.
| Situation | Fence? | Notes |
| --------- | --------------------- | ----- |
| OS lockup (load, OOM, etc.) | ✅ | A key design situation for the fencing system |
| OS kernel panic | ✅ | A key design situation for the fencing system |
| Primary network failure | ✅ | Only affecting primary links, not IPMI (see below); a key design situation |
| Full network failure | ❌ | All links are down, e.g. full network failure including IPMI |
| Power loss | ❌ | Impossible to determine if this is a transient network cut or actual power loss without IPMI |
| Hardware failure (CPU, memory) | ✅ | IPMI interface should remain up in these scenarios; a key design situation |
| Hardware failure (motherboard) | ✅ | If IPMI is **online** after failure |
| Hardware failure (motherboard) | ❌ | If IPMI is **offline** after failure |
| Hardware failure (full chassis) | ❌ | If IPMI is **offline** after failure |
Care should be taken to understand these scenarios and which situations can be recovered from automatically, and which require manual human intervention to confirm the situation ("is the node actually physically off?") and manual recovery.
## Future Development
Future versions of PVC may add support for additional fencing modes, for instance the ability for a fence to trigger a remote power device (switched PDU, etc.) or to detect more esoteric situations with the node power state via IPMI, as need requires. The author however believes that the current implementation satisfies the vast majority of potential situations for which auto-recovery is beneficial and thus such work would not see much benefit, though he is open to changing his mind.

View File

@ -0,0 +1,89 @@
---
title: Georedundancy
---
Georedundancy refers to the ability of a system to run across multiple physical geographic areas, and to help tolerate the loss of one of those areas due to a catastrophic event. With respect to PVC, there are two primary types of georedundancy: single-cluster georedundancy, which covers the distribution of the nodes of a single cluster across multiple locations; and multi-cluster georedundancy, in which individual clusters are created at multiple locations and communicate at a higher level. This page outlines the implementation, important caveats, and potential solutions where possible, for both kinds of georedundancy.
[TOC]
## Single-Cluster Georedundancy
In a single-cluster georedundant design, one logical cluster can have its nodes, and specifically its coordinator nodes, placed in different physical locations. This can help ensure that the cluster remains available even if one of the physical locations becomes unavailable, but it has multiple major caveats to consider.
### Number of Locations
Since the nodes in a PVC cluster require a majority quorum to function, there must be at least 3 sites, of which any 2 must be able to communicate directly with each other should the 3rd fail. A single coordinator (for a 3 node cluster) would then be placed at each site.
2-site georedundancy is functionally worthless within a single PVC cluster: if the primary site were to go down, the secondary site would not have enough coordinator nodes to form a majority quorum, and the entire cluster would fail.
[![2 Site Caveats](images/pvc-georedundancy-2-site.png)](images/pvc-georedundancy-2-site.png)
In addition, a 3-site configuration without a full mesh or ring, i.e. one where a single site functions as an anchor between the other two, would make that anchor site a single point of failure: if it went offline, the cluster would be rendered non-functional.
[![3 Site Caveats](images/pvc-georedundancy-broken-mesh.png)](images/pvc-georedundancy-broken-mesh.png)
Thus, the smallest useful georedundant physical design is 3 sites in full mesh or ring. The loss of any one site in this scenario will still allow the remaining nodes to form quorum and function.
[![3 Site Solution](images/pvc-georedundancy-full-mesh.png)](images/pvc-georedundancy-full-mesh.png)
A larger cluster could theoretically span 3 (as 2+2+1) or more sites; however, with a maximum of 5 coordinators recommended, this many sites is likely to be overkill for the PVC solution, and multi-cluster georedundancy would be a preferable approach for such a large distribution of nodes.
Since hypervisors are not affected by nor affect the quorum, any number can be placed at any site. Only compute resources would thus be affected should that site go offline. For instance, a design with one coordinator and one hypervisor at each site would provide a full 4 nodes of compute resources even if one site is offline.
### Fencing
PVC's [fencing mechanism](fencing.md) relies entirely on network access. First, network access is required for a node to update its keepalives to the other nodes via Zookeeper. Second, IPMI out-of-band connectivity is required for the remaining nodes to fence a dead node.
Georedundancy introduces significant complications to this process. First, it makes network cuts more likely, as the cut can now occur somewhere outside of the administrator's control (e.g. on a public utility pole, or in a provider's upstream network). Second, the nature of the cut means that without backup connectivity for the IPMI functionality, any fencing attempt would fail, thus preventing automatic recovery of VMs from the cut site onto the remaining sites. Thus, in this design, several normally-possible recovery situations become impossible to recover from automatically, up to and including any recovery at all. Situations where individual VM availability is paramount are therefore not ideally served by single-cluster georedundancy.
### Orphaned Site Availability
It is also important to note that network cut scenarios in this case will result in the outage of the orphaned site, even if it is otherwise functional. As the node at the cut site can no longer communicate with the majority of the storage cluster, its VMs will become unresponsive and blocked from I/O. Thus, splitting a single cluster between sites like this will **not** help ensure that the cut site remains available; on the contrary, the cut site will effectively be sacrificed to preserve the *remainder* of the cluster. For instance, office workers in that location would still not be able to access services on the cluster, even if those services happen to be running in the same physical location.
### Network Speed
PVC clusters are quite network-intensive, as outlined in the [hardware requirements](hardware-requirements.md#networking) documentation. This can pose problems for multi-site clusters with slower interconnects. At least 10Gbps is recommended between nodes, and this includes nodes in different physical locations. In addition, the traffic here is bursty and dependent on VM workloads, both in terms of storage and VM migration. Thus, the site interconnects must account for the speed required of a PVC cluster in addition to any other traffic.
### Network Latency & Distance
The storage write performance within PVC is heavily dependent on network latency. To explain why, one must understand the process behind writes within the Ceph storage subsystem:
[![Ceph Write Process](images/pvc-ceph-write-process.png)](images/pvc-ceph-write-process.png)
As illustrated in this diagram, a write will only be accepted by the client once it has been successfully written to at least `min_copies` OSDs, as defined by the pool replication level (usually 2). Thus, the latency of network communications between any two nodes becomes a major factor in storage performance for writes, as the write cannot complete without at least 4x this latency (send, ack, receive, ack). Significant physical distances and thus latencies (more than about 3ms) begin to introduce performance degradation, and latencies above about 5-10ms can result in a significant drop in write performance.
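
As a rough illustration of that arithmetic (a sketch only, ignoring disk and processing time):

```python
# Each replicated write must traverse the network at least four times
# (send, ack, receive, ack) before it can be acknowledged to the client.
def min_write_overhead_ms(one_way_latency_ms: float) -> float:
    return 4 * one_way_latency_ms

for latency in (0.2, 3.0, 10.0):   # illustrative one-way latencies in ms
    print(f"{latency} ms one-way -> at least {min_write_overhead_ms(latency)} ms per write")
# 0.2 ms -> 0.8 ms, 3.0 ms -> 12.0 ms, 10.0 ms -> 40.0 ms
```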
To combat this, georedundant nodes should be as close as possible, ideally within 20-30km of each other at a maximum. Thus, a ring *within* a city would work well; a ring *between* cities would likely hamper performance significantly.
## Overall Conclusion: Avoid Single-Cluster Georedundancy
It is the opinion of the author that the caveats of single-cluster georedundancy outweigh the benefits in almost every case. The only situation in which it provides a notable benefit is ensuring that copies of data are stored online at multiple locations, but this can also be achieved at higher layers. Thus, we strongly recommend against this solution for most use-cases.
## Multi-Cluster Georedundancy
Starting with PVC version 0.9.104, the system supports online VM snapshot transfers between clusters. This can help enable a second georedundancy mode, leveraging a full cluster at each of two sites, between which important VMs replicate. In addition, this design can be used with higher-layer abstractions like service-level redundancy to ensure the optimal operation of services even if an entire cluster becomes unavailable. Service-level redundancy between two clusters is not addressed here.
Multi-cluster redundancy eliminates most of the caveats of single-cluster georedundancy while permitting single-instance VMs to be safely replicated for hot availability, but introduces several additional caveats regarding promotion of VMs between clusters that must be considered before and during failure events.
### No Failover Automation
Georedundancy with multiple clusters offers no automation within the PVC system for transitioning VMs, unlike the fencing and recovery process within a single cluster. If a fault occurs necessitating promotion of services to the secondary cluster, this must be completed manually by the administrator. In addition, once the primary site recovers, the clusters must be carefully re-converged (see below).
### VM Automirrors
The VM automirror subsystem must be used for proper automatic redundancy on any single-instance VMs within the cluster. A "primary" site must be selected to run the service normally, while a "secondary" site receives regular mirror snapshots to update its local copy and be ready for promotion should this become necessary. Note that controlled cutovers (e.g. for maintenance events) do not present issues aside from brief VM downtime, as a final snapshot is sent during these operations.
The automirror schedule is very important to define here. Since automirrors are point-in-time snapshots, only data at the last sent snapshot will be available on the secondary cluster. Thus, extremely frequent automirrors, on the order of hours or even minutes, are recommended. In addition note that automirrors are run on a fixed schedule for all VMs in the cluster; it is not possible to designate some VMs to run more frequently at this time.
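
As a rough way to reason about the schedule (a sketch only; actual loss depends on snapshot transfer success and timing):

```python
# Worst-case data-loss window for an unplanned promotion: everything written
# after the last fully shipped automirror snapshot is lost.
def worst_case_loss_minutes(snapshot_interval_min: float, transfer_time_min: float) -> float:
    return snapshot_interval_min + transfer_time_min

print(worst_case_loss_minutes(60, 5))   # hourly automirrors    -> up to ~65 minutes of writes
print(worst_case_loss_minutes(15, 5))   # 15-minute automirrors -> up to ~20 minutes of writes
```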
It is also recommended that the guest OSes of any VMs set for automirror support use atomic writes if possible, as online snapshots must be crash-consistent. Most modern operating and file systems are supported, but care must be taken when using e.g. in-memory caching of writes or other similar mechanisms to avoid data loss.
### Data Loss During Transitions
VM automirror snapshots are point-in-time; for a clean promotion without data loss, the `pvc vm mirror promote` command must be used. This affects both directions:
* When promoting a VM on the secondary after a catastrophic failure of the primary (i.e. one in which `pvc vm mirror promote` cannot be used), any data written to the primary side since the last snapshot will be lost. As mentioned above, this necessitates very frequent automirror snapshots to be viable, but even with frequent snapshots some amount of data loss will occur.
* Once the secondary is promoted to become the primary manually, both clusters will consider themselves primary for the VM, should the original primary cluster recover. At that time, there will be a split-brain between the two, and one side's changes must be discarded; there is no reconciliation possible on the PVC side between the two instances. Usually, recovery here will mean the removal of the original primary's copy of the VM and a re-synchronization from the former secondary (now primary) to the original primary cluster with `pvc vm mirror create`, followed by a graceful transition with `pvc vm mirror promote`. Note that the transition will also result in additional downtime for the VM.
## Overall Conclusion: Proceed with Caution
Ultimately, the potential for data loss during unplanned promotions must be carefully weighed against the benefits of manually promoting the peer cluster. For short or transient outages, promotion is highly likely to cause more data loss and impact than is acceptable, so it should only be considered in truly catastrophic situations. In such situations, the amount of acceptable data loss must inform the timing of the automirrors, and thus how frequently snapshots are taken and transmitted. When any data loss would be catastrophic, service-level redundancy is advised instead.

View File

@ -14,31 +14,31 @@ This document details the recommendations for *individual* node hardware choices
PVC is designed to operate in "N-1" mode, that is, all sizing of the cluster should take into account the loss of 1 node after pooling all the available resources.
For example, consider 3 nodes each with 16 CPU cores and 128GB of RAM. This totals 48 CPU cores and 384GB of RAM, however we should consider the N-1 number, that is 32 CPU cores and 256GB of RAM, to be the maximum usable quantity of each available across the entire cluster.
For example, consider 3 nodes each with 16 CPU cores and 128GB of RAM. This totals 48 CPU cores and 384GB of RAM; however, the N-1 number, in this case 2 nodes or 32 CPU cores and 256GB of RAM, should be considered the maximum usable quantity of each resource across the entire cluster. PVC will warn the administrator when RAM provisioning exceeds the N-1 number.
Disks are even more limited. As outlined in the [Cluster Storage section of the Cluster Architecture](/deployment/cluster-architecture/#cluster-storage) documentation, a normal pool replication level for reliable redundant operation is 3 copies with 2 minimum copies. Thus, to continue the above 3 node example, if each node features a 2TB data SSD, the total available N-1 storage is 2TB (as 3 x 2TB / 3 = 2TB).
Disks are even more limited. As outlined in the [Cluster Storage section of the Cluster Architecture](cluster-architecture.md#cluster-storage) documentation, a normal pool replication level for reliable redundant operation is 3 copies with 2 minimum copies. Thus, to continue the above 3 node example, if each node features a 2TB data SSD, the total available N-1 storage is 2TB (as 3 x 2TB / 3 = 2TB). On larger clusters, this calculation is more complex, so care should be taken to optimize the available number of disks with the pool replication size to ensure efficient utilization.
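
The sizing arithmetic from the two examples above can be summarized in a short sketch (illustrative only; it assumes a uniform cluster and the default 3-copy replication):

```python
# N-1 sizing: usable compute is the total minus one node; usable storage is
# the raw total divided by the pool replication factor (copies).
def n_minus_1_compute(nodes: int, cores: int, ram_gb: int):
    usable = nodes - 1
    return usable * cores, usable * ram_gb

def usable_storage_tb(nodes: int, data_tb_per_node: float, copies: int = 3) -> float:
    return nodes * data_tb_per_node / copies

cores, ram = n_minus_1_compute(3, 16, 128)
print(f"N-1 compute: {cores} cores, {ram} GB RAM")        # 32 cores, 256 GB RAM
print(f"Usable storage: {usable_storage_tb(3, 2)} TB")    # 2.0 TB
```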
## Hardware Vendors
PVC places no limitations of the hardware vendor for nodes; any vendor that produces a system compatible with the rest of these requirements will be suitable.
Some common recommended vendors, with whom the author has had good experiences, include Dell (PowerEdge line, various tiers and generations) and Cisco (UCS C-series line, M4 and M5 era specifically). The author does not recommend Hewlett-Packard Proliant servers due to severe limitations and issues with their storage controller cards, even though they are otherwise sufficient.
Some common recommended vendors, with whom the author has had good experiences, include Dell (PowerEdge line, various tiers and generations) and Cisco (UCS C-series line, M4 and M5 era specifically). The author does not recommend Hewlett-Packard ProLiant servers due to severe limitations and issues with their storage controller cards, even though they are otherwise sufficient.
### IPMI/Lights-out Management
All aforementioned server vendors support some form of IPMI Lights-out Management, e.g. Dell iDRAC, Cisco CIMC, HP iLO, etc. with IPMI-over-LAN functionality. Consumer and low-end Workstation hardware does not normally support IPMI Lights-out Management and is thus unsuitable for a production node.
* It is **recommended** for a redundant, production PVC node to feature IPMI Lights-out Management, on a dedicated Ethernet port, with support for IPMI-over-LAN functionality, reachable from or in the [cluster "upstream" network](/deployment/cluster-architecture/#upstream).
* It is **recommended** for a redundant, production PVC node to feature IPMI Lights-out Management, on a dedicated Ethernet port, with support for IPMI-over-LAN functionality, reachable from or in the [cluster "upstream" network](cluster-architecture.md#upstream).
This feature is not strictly required, however it is required for the [PVC fencing system](/deployment/fencing-and-georedundancy) to function properly, which is required for auto-recovery from node failures. PVC will detect the lack of a reachable IPMI interface at startup and disable fencing and auto-recovery in such a case.
This feature is not strictly required, however it is required for the [PVC fencing system](fencing.md) to function properly, which is required for auto-recovery from node failures. PVC will detect the lack of a reachable IPMI interface at startup and disable fencing and auto-recovery in such a case.
## CPU
PVC requires a relatively large amount of CPU horsepower. In addition to any CPU required by VMs, the storage subsystem can consume a large amount of CPU power, as can other daemons on the system. Recent CPU vulnerabilities and their mitigations have also severely affected performance, and thus this should be considered carefully.
### Vendor
### Vendor & Architecture
PVC will work equally well on (modern, see below) Intel- and AMD-based CPUs. Which you select depends primarily on your workload and which feature(s) complement it. The author has used both extensively.
PVC will work equally well on (modern, see below) Intel- and AMD-based CPUs using the x86_64 architecture (as implemented in Intel 64 and AMD64, respectively). Which you select depends primarily on availability, workload, and which, if any, additional CPU feature(s) complement it. The author has used both extensively with good results.
### Era/Generation
@ -46,9 +46,9 @@ Modern CPUs are a must, as generation improvements compound and can make a major
#### Intel
* The **minimum** generation/era for a functional PVC node is "Nehalem", i.e. the Xeon L/X/W-3XXX, 2009-2011.
* The **minimum** generation/era for a functional PVC node is "Nehalem", i.e. the Xeon L/X/W-3XXX, 2009.
* The **recommended** generation/era for a production PVC node is "Haswell", i.e. the Xeon E5-2XXX V3, 2013-2015. Processors older than this will be a significant bottleneck due to the slower DDR3 memory system and lower general IPC per clock, especially affecting the storage subsystem.
* The **recommended** generation/era for a production PVC node is "Haswell", i.e. the Xeon E5-2XXX V3, 2013. Processors older than this will be a significant bottleneck due to the slower DDR3 memory system and lower general IPC per clock, especially affecting the storage subsystem.
#### AMD
@ -143,7 +143,7 @@ PVC does not require a large amount of space for its system drives. The default
### Quantity/Redundancy
The PVC system disks should be deployed in mirrored mode, via an internal RAID controller or dedicated redundant device (e.g. Dell BOSS card). Note that PVC features a monitoring plugin which can alert to degraded RAID arrays of various types (MegaRAID/Dell PERC, HPSA, and Dell BOSS).
The PVC system disks should be deployed in mirrored mode, via an internal RAID controller or dedicated redundant device (e.g. Dell BOSS card). Note that PVC features a monitoring plugin which can alert to degraded RAID arrays of various types (MegaRAID/Dell PERC, HPSA, and Dell BOSS) when the appropriate software is installed.
* The **minimum** system disk quantity for a functional PVC node is 1.
@ -221,7 +221,7 @@ PVC will work equally well regardless of power type (A/C vs D/C, various voltage
### Power Supplies
Redundant power supplies will ensure that even if a power supply or power feed fails, the PVC node will continue to function. Note that PVC features a monitoring plugin which can alert to degraded degraded power redundancy from most IPMI-capable vendors.
Redundant power supplies will ensure that even if a power supply or power feed fails, the PVC node will continue to function. Note that PVC features a monitoring plugin which can alert to degraded power redundancy from most IPMI-capable vendors.
* The **minimum** number of power supplies for a functional PVC node is, of course, 1.

View File

@ -1,8 +1,6 @@
# PVC Provisioner Manual
# PVC Provisioner Guide
The PVC provisioner is a subsection of the main PVC API. It interfaces directly with the Zookeeper database using the common client functions, and with the Patroni PostgreSQL database to store details. The provisioner also interfaces directly with the Ceph storage cluster, for mapping volumes, creating filesystems, and installing guests.
Details of the Provisioner API interface can be found in [the API manual](/manuals/api).
The PVC provisioner is a subsection of the main PVC system, designed to aid administrators in quickly deploying virtual machines (mostly major Linux flavours) according to defined templates and profiles, leveraging CloudInit and customizable provisioning scripts, or by deploying OVA images.
- [PVC Provisioner Manual](#pvc-provisioner-manual)
* [Overview](#overview)
@ -19,15 +17,13 @@ Details of the Provisioner API interface can be found in [the API manual](/manua
## Overview
The purpose of the Provisioner API is to provide a convenient way for administrators to automate the creation of new virtual machines on the PVC cluster.
The purpose of the Provisioner is to provide a convenient way for administrators to automate the creation of new virtual machines on the PVC cluster.
The Provisioner allows the administrator to construct descriptions of VMs, called profiles, which include system resource specifications, network interfaces, disks, cloud-init userdata, and installation scripts. These profiles are highly modular, allowing the administrator to specify arbitrary combinations of the mentioned VM features with which to build new VMs.
The provisioner supports creating VMs based off of installation scripts, by cloning existing volumes, and by uploading OVA image templates to the cluster.
Examples in the following sections use the CLI exclusively for demonstration purposes. For details of the underlying API calls, please see the [API interface reference](/manuals/api-reference.html).
Use of the PVC Provisioner is not required. Administrators can always perform their own installation tasks, and the provisioner is not specially integrated, calling various other API commands as though they were run from the CLI or API.
Use of the PVC Provisioner is not required. Administrators can always perform their own installation tasks, and the provisioner is not specially integrated, calling various other commands as though they were run from the CLI or API.
# PVC Provisioner concepts
@ -218,87 +214,86 @@ As mentioned above, the `VMBuilderScript` instance includes several instance var
* `self.vm_data`: A full dictionary representation of the data provided by the PVC provisioner about the VM. Includes many useful details for crafting the VM configuration and setting up disks and networks. An example, in JSON format:
```
{
"ceph_monitor_list": [
"hv1.pvcstorage.tld",
"hv2.pvcstorage.tld",
"hv3.pvcstorage.tld"
],
"ceph_monitor_port": "6789",
"ceph_monitor_secret": "96721723-8650-4a72-b8f6-a93cd1a20f0c",
"mac_template": null,
"networks": [
{
"eth_bridge": "vmbr1001",
"id": 72,
"network_template": 69,
"vni": "1001"
},
{
"eth_bridge": "vmbr101",
"id": 73,
"network_template": 69,
"vni": "101"
}
],
"script": [contents of this file]
"script_arguments": {
"deb_mirror": "http://ftp.debian.org/debian",
"deb_release": "bullseye"
},
"system_architecture": "x86_64",
"system_details": {
"id": 78,
"migration_method": "live",
"name": "small",
"node_autostart": false,
"node_limit": null,
"node_selector": null,
"ova": null,
"serial": true,
"vcpu_count": 2,
"vnc": false,
"vnc_bind": null,
"vram_mb": 2048
},
"volumes": [
{
"disk_id": "sda",
"disk_size_gb": 4,
"filesystem": "ext4",
"filesystem_args": "-L=root",
"id": 9,
"mountpoint": "/",
"pool": "vms",
"source_volume": null,
"storage_template": 67
},
{
"disk_id": "sdb",
"disk_size_gb": 4,
"filesystem": "ext4",
"filesystem_args": "-L=var",
"id": 10,
"mountpoint": "/var",
"pool": "vms",
"source_volume": null,
"storage_template": 67
},
{
"disk_id": "sdc",
"disk_size_gb": 4,
"filesystem": "ext4",
"filesystem_args": "-L=log",
"id": 11,
"mountpoint": "/var/log",
"pool": "vms",
"source_volume": null,
"storage_template": 67
}
]
}
```
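For illustration, a provisioning script might iterate over a few of these fields as follows. This is a sketch only, not taken from the PVC examples; it assumes the `self.vm_data` layout shown above, and the method name is purely illustrative:

```
# Sketch: a VMBuilderScript method fragment consuming self.vm_data.
# Key names follow the example JSON above; error handling is omitted for brevity.
def install(self):
    for network in self.vm_data["networks"]:
        print(f"Attaching VNI {network['vni']} via bridge {network['eth_bridge']}")

    for volume in self.vm_data["volumes"]:
        if volume["source_volume"] is not None:
            continue  # volumes cloned from a source volume need no new filesystem
        print(
            f"Formatting {volume['disk_id']} ({volume['disk_size_gb']} GB, "
            f"pool {volume['pool']}) as {volume['filesystem']} at {volume['mountpoint']}"
        )
```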
Since the `VMBuilderScript` runs in its own context but within the PVC Provisioner/API system, it can make use of many helper libraries from the PVC system itself, including both the built-in daemon libraries (used by the API itself) and several explicit provisioning script helpers. The following imports, commonly used in the example scripts, can be leveraged:
@ -311,9 +306,9 @@ Since the `VMBuilderScript` runs within its own context but within the PVC Provi
* `daemon_lib.common`: Part of the PVC daemon libraries, provides several common functions, including, most usefully, `run_os_command` which provides a wrapped, convenient method to call arbitrary shell/OS commands while returning a POSIX returncode, stdout, and stderr (a tuple of the 3 in that order).
* `daemon_lib.ceph`: Part of the PVC daemon libraries, provides several commands for managing Ceph RBD volumes, including, but not limited to, `clone_volume`, `add_volume`, `map_volume`, and `unmap_volume`. See the `debootstrap` example for a detailed usage example.
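As a rough illustration of how these helpers might be used inside a provisioning step, consider the following sketch; the import path and return tuple follow the descriptions above, but the command string and RBD device path are illustrative assumptions and should be checked against the real example scripts:

```
# Sketch: calling an OS command via the PVC daemon library from a provisioning step.
# The mkfs command and device path below are illustrative only.
import daemon_lib.common as pvc_common

retcode, stdout, stderr = pvc_common.run_os_command(
    "mkfs.ext4 -L root /dev/rbd/vms/test-vm_sda"
)
if retcode != 0:
    # Raising here surfaces the failure so the provisioner can mark the job failed
    raise RuntimeError(f"mkfs.ext4 failed: {stderr}")
```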
For safety reasons, the script runs in a modified chroot environment on the hypervisor. It will have full access to the entire / (root partition) of the hypervisor, but read-only. In addition it has read-write access to /dev, /sys, /run, and a fresh /tmp to write to; use /tmp/target (as convention) as the destination for any mounting of volumes and installation. Thus it is not possible to do things like `apt-get install`ing additional programs within a script; any such requirements must be set up before running the script (e.g. via `pvc-ansible`).
For safety reasons, the script runs in a modified chroot environment on the hypervisor. It will have full access to the entire `/` (root partition) of the hypervisor, but read-only. In addition it has read-write access to `/dev`, `/sys`, `/run`, and a fresh `/tmp` to write to; use `/tmp/target` (as convention) as the destination for any mounting of volumes and installation. Thus it is not possible to do things like `apt-get install`ing additional programs within a script; any such requirements must be set up before running the script (e.g. via `pvc-ansible`).
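A minimal sketch of following the `/tmp/target` convention when mounting a prepared volume is shown below; the device path is again an illustrative assumption, and in a real script it would come from the volume mapping step:

```
# Sketch: mounting a formatted root volume at the conventional /tmp/target path.
import os
import daemon_lib.common as pvc_common

target = "/tmp/target"
os.makedirs(target, exist_ok=True)  # /tmp is writable inside the chroot

retcode, stdout, stderr = pvc_common.run_os_command(
    f"mount /dev/rbd/vms/test-vm_sda {target}"
)
if retcode != 0:
    raise RuntimeError(f"failed to mount root volume: {stderr}")
```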
**WARNING**: Of course, despite this "safety" mechanism, it is VERY IMPORTANT to be cognizant that this script runs AS ROOT ON THE HYPERVISOR SYSTEM with FULL ACCESS to the cluster. You should NEVER allow arbitrary, untrusted users the ability to add or modify provisioning scripts. It is trivially easy to write scripts which will do destructive things - for example writing to arbitrary /dev objects, running arbitrary root-level commands, or importing PVC library functions to delete VMs, RBD volumes, or pools. Thus, ensure you vett and understand every script on the system, audit them regularly for both intentional and accidental malicious activity, and of course (to reiterate), do not allow untrusted script creation!
⚠️ **WARNING** Of course, despite this "safety" mechanism, it is **very important** to be cognizant that this script runs **as root on the hypervisor system** with **full access to the cluster**. You should **never** allow arbitrary, untrusted users the ability to add or modify provisioning scripts. It is trivially easy to write scripts which will do destructive things - for example writing to arbitrary `/dev` objects, running arbitrary root-level commands, or importing PVC library functions to delete VMs, RBD volumes, or pools. Thus, ensure you vet and understand every script on the system, audit them regularly for both intentional and accidental malicious activity, and of course (to reiterate), do not allow untrusted script creation!
## Profiles
@ -398,11 +393,9 @@ Using cluster "local" - Host: "10.0.0.1:7370" Scheme: "http" Prefix: "/api/v1"
Task ID: 39639f8c-4866-49de-8c51-4179edec0194
```
**NOTE**: A VM that is set to do so will be defined on the cluster early in the provisioning process, before creating disks or executing the provisioning script, with the special status `provision`. Once completed, if the VM is not set to start automatically, the state will remain `provision`, with the VM not running, until its state is explicitly changed with the client (or via autostart when its node returns to `ready` state).
📝 **NOTE** A VM that is set to do so will be defined on the cluster early in the provisioning process, before creating disks or executing the provisioning script, with the special status `provision`. Once completed, if the VM is not set to start automatically, the state will remain `provision`, with the VM not running, until its state is explicitly changed with the client (or via autostart when its node returns to `ready` state).
**NOTE**: Provisioning jobs are tied to the node that spawned them. If the primary node changes, provisioning jobs will continue to run against that node until they are completed, interrupted, or fail, but the active API (now on the new primary node) will not have access to any status data from these jobs, until the primary node status is returned to the original host. The CLI will warn the administrator of this if there are active jobs while running `node primary` or `node secondary` commands.
**NOTE**: Provisioning jobs cannot be cancelled, either before they start or during execution. The administrator should always let an invalid job either complete or fail out automatically, then remove the erroneous VM with the `vm remove` command.
📝 **NOTE** Provisioning jobs cannot be cancelled, either before they start or during execution. The administrator should always let an invalid job either complete or fail out automatically, then remove the erroneous VM with the `vm remove` command.
# Deploying VMs from OVA images
@ -438,4 +431,4 @@ During import, PVC splits the OVA into its constituent parts, including any disk
Because of this, OVA profiles do not include storage templates like other PVC profiles. A storage template can still be added to such a profile, and the block devices will be added after the main block devices. However, this is generally not recommended; it is far better to modify the OVA to add additional volume(s) before uploading it instead.
**WARNING**: Never adjust the sizes of the OVA VMDK-formatted storage volumes (named `ova_<NAME>_sdX`) or remove them without removing the OVA itself in the provisioner; doing so will prevent the deployment of the OVA, specifically the conversion of the images to raw format at deploy time, and render the OVA profile useless.
⚠️ **WARNING** Never adjust the sizes of the OVA VMDK-formatted storage volumes (named `ova_<NAME>_sdX`) or remove them without removing the OVA itself in the provisioner; doing so will prevent the deployment of the OVA, specifically the conversion of the images to raw format at deploy time, and render the OVA profile useless.

@ -1,9 +1,11 @@
<p align="center">
<img alt="Logo banner" src="images/pvc_logo_black.png"/>
<img alt="Logo banner" src="https://docs.parallelvirtualcluster.org/en/latest/images/pvc_logo_black.png"/>
<br/><br/>
<a href="https://www.parallelvirtualcluster.org"><img alt="Website" src="https://img.shields.io/badge/visit-website-blue"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Latest Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
<a href="https://docs.parallelvirtualcluster.org/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc"><img alt="License" src="https://img.shields.io/github/license/parallelvirtualcluster/pvc"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
<a href="https://parallelvirtualcluster.readthedocs.io/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
<a href="https://github.com/psf/black"><img alt="Code style: Black" src="https://img.shields.io/badge/code%20style-black-000000.svg"/></a>
</p>
## What is PVC?
@ -18,42 +20,68 @@ As a consequence of its features, PVC makes administrating very high-uptime VMs
PVC also features an optional, fully customizable VM provisioning framework, designed to automate and simplify VM deployments using custom provisioning profiles, scripts, and CloudInit userdata API support.
Installation of PVC is accomplished by two main components: a [Node installer ISO](https://github.com/parallelvirtualcluster/pvc-installer) which creates on-demand installer ISOs, and an [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) to configure, bootstrap, and administrate the nodes. Installation can also be fully automated with a companion [cluster bootstrapping system](https://github.com/parallelvirtualcluster/pvc-bootstrap). Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client or WebUI.
Installation of PVC is accomplished by two main components: a [Node installer ISO](https://github.com/parallelvirtualcluster/pvc-installer) which creates on-demand installer ISOs, and an [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) to configure, bootstrap, and administrate the nodes. Installation can also be fully automated with a companion [cluster bootstrapping system](https://github.com/parallelvirtualcluster/pvc-bootstrap). Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client ~~or WebUI~~ (eventually).
Just give it physical servers, and it will run your VMs without you having to think about it, all in just an hour or two of setup time.
For more details on the project motivation, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page.
## What is it based on?
The core node and API daemons, as well as the CLI API client, are written in Python 3 and are fully Free Software (GNU GPL v3). In addition to these, PVC makes use of the following software tools to provide a holistic hyperconverged infrastructure solution:
* Debian GNU/Linux as the base OS.
* Linux KVM, QEMU, and Libvirt for VM management.
* Linux `ip`, FRRouting, NFTables, DNSMasq, and PowerDNS for network management.
* Ceph for storage management.
* Apache Zookeeper for the primary cluster state database.
* Patroni PostgreSQL manager for the secondary relation databases (DNS aggregation, Provisioner configuration).
More information about PVC, its motivations, the hardware requirements, and setting up and managing a cluster [can be found over at our docs page](https://docs.parallelvirtualcluster.org).
## Getting Started
To get started with PVC, read over the [Cluster Architecture](https://parallelvirtualcluster.readthedocs.io/en/latest/cluster-architecture/) page then see the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) guide for details on configuring your first cluster.
To get started with PVC, please see the [About](https://docs.parallelvirtualcluster.org/en/latest/about-pvc/) page for general information about the project, and the [Getting Started](https://docs.parallelvirtualcluster.org/en/latest/deployment/getting-started/) page for details on configuring your first cluster.
## Changelog
View the changelog in [CHANGELOG.md](https://github.com/parallelvirtualcluster/pvc/blob/master/CHANGELOG.md).
View the changelog in [CHANGELOG.md](https://github.com/parallelvirtualcluster/pvc/blob/master/CHANGELOG.md). **Please note that any breaking changes are announced here; ensure you read the changelog before upgrading!**
## Screenshots
While PVC's API and internals aren't very screenshot-worthy, here is some example output of the CLI tool.
These screenshots show some of the available functionality of the PVC system and CLI as of PVC v0.9.85.
<p><img alt="Node listing" src="images/pvc-nodes.png"/><br/><i>Listing the nodes in a cluster</i></p>
<p><img alt="0. Integrated help" src="images/0-integrated-help.png"/><br/>
<i>The CLI features an integrated, fully-featured help system to show details about every possible command.</i>
</p>
<p><img alt="Network listing" src="images/pvc-networks.png"/><br/><i>Listing the networks in a cluster, showing 3 bridged and 1 IPv4-only managed networks</i></p>
<p><img alt="1. Connection management" src="images/1-connection-management.png"/><br/>
<i>A single CLI instance can manage multiple clusters, including a quick detail view, and will default to a "local" connection if an "/etc/pvc/pvc.conf" file is found; sensitive API keys are hidden by default.</i>
</p>
<p><img alt="VM listing and migration" src="images/pvc-migration.png"/><br/><i>Listing a limited set of VMs and migrating one with status updates</i></p>
<p><img alt="2. Cluster details and output formats" src="images/2-cluster-details-and-output-formats.png"/><br/>
<i>PVC can show the key details of your cluster at a glance, including health, persistent fault events, and key resources; the CLI can output both in pretty human format and JSON for easier machine parsing in scripts.</i>
</p>
<p><img alt="Node logs" src="images/pvc-nodelog.png"/><br/><i>Viewing the logs of a node (keepalives and VM [un]migration)</i></p>
<p><img alt="3. Node information" src="images/3-node-information.png"/><br/>
<i>PVC can show details about the nodes in the cluster, including their live health and resource utilization.</i>
</p>
<p><img alt="4. VM information" src="images/4-vm-information.png"/><br/>
<i>PVC can show details about the VMs in the cluster, including their state, resource allocations, current hosting node, and metadata.</i>
</p>
<p><img alt="5. VM details" src="images/5-vm-details.png"/><br/>
<i>In addition to the above basic details, PVC can also show extensive information about a running VM's devices and other resource utilization.</i>
</p>
<p><img alt="6. Network information" src="images/6-network-information.png"/><br/>
<i>PVC has two major client network types, and ensures a consistent configuration of client networks across the entire cluster; managed networks can feature DHCP, DNS, firewall, and other functionality including DHCP reservations.</i>
</p>
<p><img alt="7. Storage information" src="images/7-storage-information.png"/><br/>
<i>PVC provides a convenient abstracted view of the underlying Ceph system and can manage all core aspects of it.</i>
</p>
<p><img alt="8. VM and node logs" src="images/8-vm-and-node-logs.png"/><br/>
<i>PVC can display logs from VM serial consoles (if properly configured) and nodes in-client to facilitate quick troubleshooting.</i>
</p>
<p><img alt="9. VM and worker tasks" src="images/9-vm-and-worker-tasks.png"/><br/>
<i>PVC provides full VM lifecycle management, as well as long-running worker-based commands (in this example, clearing a VM's storage locks).</i>
</p>
<p><img alt="10. Provisioner" src="images/10-provisioner.png"/><br/>
<i>PVC features an extensively customizable and configurable VM provisioner system, including EC2-compatible CloudInit support, allowing you to define flexible VM profiles and provision new VMs with a single command.</i>
</p>
<p><img alt="11. Prometheus and Grafana dashboard" src="images/11-prometheus-grafana.png"/><br/>
<i>PVC features several monitoring integration examples under "node-daemon/monitoring", including CheckMK, Munin, and, most recently, Prometheus, along with an example Grafana dashboard for cluster monitoring and alerting.</i>
</p>

@ -4,7 +4,7 @@ The PVC API is a standalone client application for PVC. It interfaces directly w
The API is built using Flask and is packaged in the Debian package `pvc-client-api`. The API depends on the common client functions of the `pvc-client-common` package as does the CLI client.
Details of the API interface can be found in [the manual](/manuals/api).
The full API endpoint and schema documentation [can be found here](/en/latest/manuals/api-reference.html).
# PVC HTTP API manual
@ -349,7 +349,3 @@ The Ceph monitor port. Should always be `6789`.
* *required*
The Libvirt storage secret UUID for the Ceph cluster.
## API Endpoint Documentation
The full API endpoint and schema documentation [can be found here](/manuals/api-reference.html).

@ -332,7 +332,7 @@ The action to take regarding VMs once a node is *successfully* fenced, i.e. the
The action to take regarding VMs once a node fencing *fails*, i.e. the IPMI command to restart the node reports a failure. Can be one of `None` (the default), to perform no action, or `migrate`, to migrate and start all failed VMs on other nodes.
**WARNING:** This functionality is potentially **dangerous** and can result in data loss or corruption in the VM disks; the post-fence migration process *explicitly clears RBD locks on the disk volumes*. It is designed only for specific and advanced use-cases, such as servers that do not reliably report IPMI responses or servers without IPMI (not recommended; see the [cluster architecture documentation](/architecture/cluster)). If this is set to `migrate`, the `suicide_intervals` **must** be set to provide at least some guarantee that the VMs on the node will actually be terminated before this condition triggers. The administrator should think very carefully about their setup and potential failure modes before enabling this option.
⚠️ **WARNING** This functionality is potentially **dangerous** and can result in data loss or corruption in the VM disks; the post-fence migration process *explicitly clears RBD locks on the disk volumes*. It is designed only for specific and advanced use-cases, such as servers that do not reliably report IPMI responses or servers without IPMI (not recommended; see the [cluster architecture documentation](/architecture/cluster)). If this is set to `migrate`, the `suicide_intervals` **must** be set to provide at least some guarantee that the VMs on the node will actually be terminated before this condition triggers. The administrator should think very carefully about their setup and potential failure modes before enabling this option.
#### `system` → `fencing` → `ipmi` → `host`

@ -4,6 +4,7 @@ theme:
name: readthedocs
titles_only: yes
logo: "images/pvc_logo_black_transparent.png"
width: "100%"
markdown_extensions:
- toc:
permalink: yes
@ -11,14 +12,18 @@ markdown_extensions:
nav:
- 'Home': 'index.md'
- 'About PVC': 'about-pvc.md'
- 'Architecture':
- 'Cluster Architecture': 'architecture/cluster-architecture.md'
- 'Hardware Requirements': 'architecture/hardware-requirements.md'
- 'Fencing': 'architecture/fencing.md'
- 'Georedundancy': 'architecture/georedundancy.md'
- 'Deployment':
- 'Cluster Architecture': 'deployment/cluster-architecture.md'
- 'Hardware Requirements': 'deployment/hardware-requirements.md'
- 'Fencing & Georedundancy': 'deployment/fencing-and-georedundancy.md'
- 'Getting Started Guide': 'deployment/getting-started.md'
- 'Provisioner Guide': 'deployment/provisioner.md'
- 'Manuals':
- 'PVC CLI': 'manuals/cli.md'
- 'PVC HTTP API': 'manuals/api.md'
- 'PVC Node Daemon': 'manuals/daemon.md'
- 'PVC Provisioner': 'manuals/provisioner.md'
- 'PVC Node Health Plugins': 'manuals/health-plugins.md'
- 'API':
- 'API Reference': 'manuals/api-reference.html'