Compare commits
83 Commits
Author | SHA1 | Date |
---|---|---|
Joshua Boniface | 9441cb3b2e | |
Joshua Boniface | b16542c8fc | |
Joshua Boniface | de0c7e37f2 | |
Joshua Boniface | ae26a071c7 | |
Joshua Boniface | 49a34acd14 | |
Joshua Boniface | 82365ea539 | |
Joshua Boniface | 86f0c5c3ae | |
Joshua Boniface | 83294298e1 | |
Joshua Boniface | 4187aacc5b | |
Joshua Boniface | 35c82b5249 | |
Joshua Boniface | e80b797e3a | |
Joshua Boniface | 7c8c71dff7 | |
Joshua Boniface | 861fef91e3 | |
Joshua Boniface | d1fcac1f0a | |
Joshua Boniface | 6ace2ebf6a | |
Joshua Boniface | 962fba7621 | |
Joshua Boniface | 49bf51da38 | |
Joshua Boniface | 1293e8ae7e | |
Joshua Boniface | ae2cf8a070 | |
Joshua Boniface | ab5bd3c57d | |
Joshua Boniface | 35153cd6b6 | |
Joshua Boniface | 7f7047dd52 | |
Joshua Boniface | 9a91767405 | |
Joshua Boniface | bcfa6851e1 | |
Joshua Boniface | 28b8b3bb44 | |
Joshua Boniface | 02425159ef | |
Joshua Boniface | a6f8500309 | |
Joshua Boniface | ebec1332e9 | |
Joshua Boniface | c08c3b2d7d | |
Joshua Boniface | 4c0d90b517 | |
Joshua Boniface | 70c588d3a8 | |
Joshua Boniface | 214e7f835a | |
Joshua Boniface | 96cebfb42a | |
Joshua Boniface | c4763ac596 | |
Joshua Boniface | ea5512e3d8 | |
Joshua Boniface | ac00f7c4c8 | |
Joshua Boniface | 6d31bf439e | |
Joshua Boniface | c714093a2e | |
Joshua Boniface | 04a09b9269 | |
Joshua Boniface | 3ede0c7d38 | |
Joshua Boniface | ab9390fdb8 | |
Joshua Boniface | 1c83584788 | |
Joshua Boniface | 7f3ab4e119 | |
Joshua Boniface | 16eb09dc22 | |
Joshua Boniface | 7ba75adef4 | |
Joshua Boniface | a691d26c30 | |
Joshua Boniface | 1d90b066bc | |
Joshua Boniface | 3ea7421f09 | |
Joshua Boniface | df4d437d31 | |
Joshua Boniface | 8295e2089d | |
Joshua Boniface | 4ccb570762 | |
Joshua Boniface | 235299942a | |
Joshua Boniface | 9aa32134a9 | |
Joshua Boniface | 75eac356d5 | |
Joshua Boniface | fb8561cc5d | |
Joshua Boniface | 5f7aa0b2d6 | |
Joshua Boniface | 7fac7a62cf | |
Joshua Boniface | b19642aa2e | |
Joshua Boniface | 974e0d6ac2 | |
Joshua Boniface | 7785166a7e | |
Joshua Boniface | 34f0a2f388 | |
Joshua Boniface | 8fa37d21c0 | |
Joshua Boniface | f462ebbc6b | |
Joshua Boniface | 0d533f3658 | |
Joshua Boniface | 792d135950 | |
Joshua Boniface | a64e0c1985 | |
Joshua Boniface | 1cbadb1172 | |
Joshua Boniface | b1c4b2e928 | |
Joshua Boniface | 7fe1262887 | |
Joshua Boniface | 0e389ba1f4 | |
Joshua Boniface | 41cd34ba4d | |
Joshua Boniface | 736762901c | |
Joshua Boniface | ecb812ccac | |
Joshua Boniface | a2e5df9f6d | |
Joshua Boniface | 73c0834f85 | |
Joshua Boniface | 2de999c700 | |
Joshua Boniface | 7543eb839d | |
Joshua Boniface | 8cb44c0c5d | |
Joshua Boniface | c55021f30c | |
Joshua Boniface | 783c9e46c2 | |
Joshua Boniface | b7f33c1fcb | |
Joshua Boniface | 0f578d7c7d | |
Joshua Boniface | f87b96887c | |
@@ -4,4 +4,4 @@ bbuilder:
   published:
     - git submodule update --init
     - /bin/bash build-stable-deb.sh
-    - sudo /usr/local/bin/deploy-package -C pvc
+    - sudo /usr/local/bin/deploy-package -C pvc -D bookworm
CHANGELOG.md (42 changes)
@@ -1,8 +1,48 @@
 ## PVC Changelog
 
+###### [v0.9.103](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.103)
+
+  * [Provisioner] Fixes a bug with the change in `storage_hosts` to FQDNs affecting the VM Builder
+  * [Monitoring] Fixes the Munin plugin to work properly with sudo
+
+###### [v0.9.102](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.102)
+
+  * [API Daemon] Ensures that received config snapshots update storage hosts in addition to secret UUIDs
+  * [CLI Client] Fixes several bugs around local connection handling and connection listings
+
+###### [v0.9.101](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.101)
+
+**New Feature**: Adds VM snapshot sending (`vm snapshot send`), VM mirroring (`vm mirror create`), and (offline) mirror promotion (`vm mirror promote`). Permits transferring VM snapshots to remote clusters, individually or repeatedly, and promoting them to active status, for disaster recovery and migration between clusters.
+
+**Breaking Change**: Migrates the API daemon into Gunicorn when in production mode. Permits more scalable and performant operation of the API. **Requires additional dependency packages on all coordinator nodes** (`gunicorn`, `python3-gunicorn`, `python3-setuptools`); upgrade via `pvc-ansible` is strongly recommended.
+
+**Enhancement**: Provides whole-cluster utilization stats in the cluster status data. Permits better observability into the overall resource utilization of the cluster.
+
+**Enhancement**: Adds a new storage benchmark format (v2) which includes additional resource utilization statistics. This allows for better evaluation of storage performance impact on the cluster as a whole. The updated format also permits arbitrary benchmark job names for easier parsing and tracking.
+
+  * [API Daemon] Allows scanning of new volumes added manually via other commands
+  * [API Daemon/CLI Client] Adds whole-cluster utilization statistics to cluster status
+  * [API Daemon] Moves production API execution into Gunicorn
+  * [API Daemon] Adds a new storage benchmark format (v2) with additional resource tracking
+  * [API Daemon] Adds support for named storage benchmark jobs
+  * [API Daemon] Fixes a bug in OSD creation which would create `split` OSDs if `--osd-count` was set to 1
+  * [API Daemon] Adds support for the `mirror` VM state used by snapshot mirrors
+  * [CLI Client] Fixes several output display bugs in various commands and in Worker task outputs
+  * [CLI Client] Improves and shrinks the status progress bar output to support longer messages
+  * [API Daemon] Adds support for sending snapshots to remote clusters
+  * [API Daemon] Adds support for updating and promoting snapshot mirrors to remote clusters
+  * [Node Daemon] Improves timeouts during primary/secondary coordinator transitions to avoid deadlocks
+  * [Node Daemon] Improves timeouts during keepalive updates to avoid deadlocks
+  * [Node Daemon] Refactors fencing thread structure to ensure a single fencing task per cluster and sequential node fences to avoid potential anomalies (e.g. fencing 2 nodes simultaneously)
+  * [Node Daemon] Fixes a bug in fencing if VM locks were already freed, leaving VMs in an invalid state
+  * [Node Daemon] Increases the wait time during system startup to ensure Zookeeper has more time to synchronize
+
+###### [v0.9.100](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.100)
+
+  * [API Daemon] Improves the handling of "detect:" disk strings on newer systems by leveraging the "nvme" command
+  * [Client CLI] Updates help text about "detect:" disk strings
+  * [Meta] Updates deprecation warnings and updates builder to only add this version for Debian 12 (Bookworm)
+
 ###### [v0.9.99](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.99)
 
-**Deprecation Warning**: `pvc vm backup` commands are now deprecated and will be removed in **0.9.100**. Use `pvc vm snapshot` commands instead.
+**Deprecation Warning**: `pvc vm backup` commands are now deprecated and will be removed in a future version. Use `pvc vm snapshot` commands instead.
 
 **Breaking Change**: The on-disk format of VM snapshot exports differs from backup exports, and the PVC autobackup system now leverages these. It is recommended to start fresh with a new tree of backups for `pvc autobackup` for maximum compatibility.
 
 **Breaking Change**: VM autobackups now run in `pvcworkerd` instead of the CLI client directly, allowing them to be triggered from any node (or externally). It is important to apply the timer unit changes from the `pvc-ansible` role after upgrading to 0.9.99 to avoid duplicate runs.
 
 **Usage Note**: VM snapshots are displayed in the `pvc vm list` and `pvc vm info` outputs, not in a unique "list" endpoint.
|
README.md (35 changes)
@@ -1,10 +1,11 @@
 <p align="center">
-<img alt="Logo banner" src="images/pvc_logo_black.png"/>
+<img alt="Logo banner" src="https://docs.parallelvirtualcluster.org/en/latest/images/pvc_logo_black.png"/>
 <br/><br/>
+<a href="https://www.parallelvirtualcluster.org"><img alt="Website" src="https://img.shields.io/badge/visit-website-blue"/></a>
+<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Latest Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
+<a href="https://docs.parallelvirtualcluster.org/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
 <a href="https://github.com/parallelvirtualcluster/pvc"><img alt="License" src="https://img.shields.io/github/license/parallelvirtualcluster/pvc"/></a>
 <a href="https://github.com/psf/black"><img alt="Code style: Black" src="https://img.shields.io/badge/code%20style-black-000000.svg"/></a>
-<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
-<a href="https://docs.parallelvirtualcluster.org/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
 </p>
 
 ## What is PVC?
@@ -23,62 +24,64 @@ Installation of PVC is accomplished by two main components: a [Node installer IS
 
 Just give it physical servers, and it will run your VMs without you having to think about it, all in just an hour or two of setup time.
 
+More information about PVC, its motivations, the hardware requirements, and setting up and managing a cluster [can be found over at our docs page](https://docs.parallelvirtualcluster.org).
+
 ## Getting Started
 
 To get started with PVC, please see the [About](https://docs.parallelvirtualcluster.org/en/latest/about-pvc/) page for general information about the project, and the [Getting Started](https://docs.parallelvirtualcluster.org/en/latest/deployment/getting-started/) page for details on configuring your first cluster.
 
 ## Changelog
 
-View the changelog in [CHANGELOG.md](CHANGELOG.md). **Please note that any breaking changes are announced here; ensure you read the changelog before upgrading!**
+View the changelog in [CHANGELOG.md](https://github.com/parallelvirtualcluster/pvc/blob/master/CHANGELOG.md). **Please note that any breaking changes are announced here; ensure you read the changelog before upgrading!**
 
 ## Screenshots
 
 These screenshots show some of the available functionality of the PVC system and CLI as of PVC v0.9.85.
 
-<p><img alt="0. Integrated help" src="images/0-integrated-help.png"/><br/>
+<p><img alt="0. Integrated help" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/0-integrated-help.png"/><br/>
 <i>The CLI features an integrated, fully-featured help system to show details about every possible command.</i>
 </p>
 
-<p><img alt="1. Connection management" src="images/1-connection-management.png"/><br/>
+<p><img alt="1. Connection management" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/1-connection-management.png"/><br/>
 <i>A single CLI instance can manage multiple clusters, including a quick detail view, and will default to a "local" connection if an "/etc/pvc/pvc.conf" file is found; sensitive API keys are hidden by default.</i>
 </p>
 
-<p><img alt="2. Cluster details and output formats" src="images/2-cluster-details-and-output-formats.png"/><br/>
+<p><img alt="2. Cluster details and output formats" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/2-cluster-details-and-output-formats.png"/><br/>
 <i>PVC can show the key details of your cluster at a glance, including health, persistent fault events, and key resources; the CLI can output both in pretty human format and JSON for easier machine parsing in scripts.</i>
 </p>
 
-<p><img alt="3. Node information" src="images/3-node-information.png"/><br/>
+<p><img alt="3. Node information" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/3-node-information.png"/><br/>
 <i>PVC can show details about the nodes in the cluster, including their live health and resource utilization.</i>
 </p>
 
-<p><img alt="4. VM information" src="images/4-vm-information.png"/><br/>
+<p><img alt="4. VM information" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/4-vm-information.png"/><br/>
 <i>PVC can show details about the VMs in the cluster, including their state, resource allocations, current hosting node, and metadata.</i>
 </p>
 
-<p><img alt="5. VM details" src="images/5-vm-details.png"/><br/>
+<p><img alt="5. VM details" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/5-vm-details.png"/><br/>
 <i>In addition to the above basic details, PVC can also show extensive information about a running VM's devices and other resource utilization.</i>
 </p>
 
-<p><img alt="6. Network information" src="images/6-network-information.png"/><br/>
+<p><img alt="6. Network information" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/6-network-information.png"/><br/>
 <i>PVC has two major client network types, and ensures a consistent configuration of client networks across the entire cluster; managed networks can feature DHCP, DNS, firewall, and other functionality including DHCP reservations.</i>
 </p>
 
-<p><img alt="7. Storage information" src="images/7-storage-information.png"/><br/>
+<p><img alt="7. Storage information" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/7-storage-information.png"/><br/>
 <i>PVC provides a convenient abstracted view of the underlying Ceph system and can manage all core aspects of it.</i>
 </p>
 
-<p><img alt="8. VM and node logs" src="images/8-vm-and-node-logs.png"/><br/>
+<p><img alt="8. VM and node logs" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/8-vm-and-node-logs.png"/><br/>
 <i>PVC can display logs from VM serial consoles (if properly configured) and nodes in-client to facilitate quick troubleshooting.</i>
 </p>
 
-<p><img alt="9. VM and worker tasks" src="images/9-vm-and-worker-tasks.png"/><br/>
+<p><img alt="9. VM and worker tasks" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/9-vm-and-worker-tasks.png"/><br/>
 <i>PVC provides full VM lifecycle management, as well as long-running worker-based commands (in this example, clearing a VM's storage locks).</i>
 </p>
 
-<p><img alt="10. Provisioner" src="images/10-provisioner.png"/><br/>
+<p><img alt="10. Provisioner" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/10-provisioner.png"/><br/>
 <i>PVC features an extensively customizable and configurable VM provisioner system, including EC2-compatible CloudInit support, allowing you to define flexible VM profiles and provision new VMs with a single command.</i>
 </p>
 
-<p><img alt="11. Prometheus and Grafana dashboard" src="images/11-prometheus-grafana.png"/><br/>
+<p><img alt="11. Prometheus and Grafana dashboard" src="https://raw.githubusercontent.com/parallelvirtualcluster/pvc/refs/heads/master/images/11-prometheus-grafana.png"/><br/>
 <i>PVC features several monitoring integration examples under "node-daemon/monitoring", including CheckMK, Munin, and, most recently, Prometheus, including an example Grafana dashboard for cluster monitoring and alerting.</i>
 </p>
@@ -21,4 +21,5 @@
 
 from daemon_lib.zkhandler import ZKSchema
 
-ZKSchema.write()
+schema = ZKSchema(root_path=".")
+schema.write()
@@ -19,6 +19,13 @@
 #
 ###############################################################################
 
-import pvcapid.Daemon  # noqa: F401
+import sys
+from os import path
+
+# Ensure current directory (/usr/share/pvc) is in the system path for Gunicorn
+current_dir = path.dirname(path.abspath(__file__))
+sys.path.append(current_dir)
+
+import pvcapid.Daemon  # noqa: F401, E402
 
 pvcapid.Daemon.entrypoint()
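The added `sys.path` lines exist because Gunicorn imports `pvcapid.Daemon` by dotted module path, and the launcher stub's own directory (`/usr/share/pvc`) is not otherwise on the import path. A self-contained sketch of the mechanism (the `pkg` package here is hypothetical, created only for the demonstration):

```python
# Hedged sketch: appending a directory to sys.path makes packages under it
# importable by dotted name, which is what lets Gunicorn resolve pvcapid.Daemon.
import os
import sys
import tempfile

workdir = tempfile.mkdtemp()
os.makedirs(os.path.join(workdir, "pkg"))
with open(os.path.join(workdir, "pkg", "__init__.py"), "w") as f:
    f.write("GREETING = 'imported via appended path'\n")

sys.path.append(workdir)  # same pattern as sys.path.append(current_dir) above
import pkg  # noqa: E402

print(pkg.GREETING)  # -> imported via appended path
```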
@@ -19,15 +19,13 @@
 #
 ###############################################################################
 
+import subprocess
 from ssl import SSLContext, TLSVersion
 
 from distutils.util import strtobool as dustrtobool
 
 import daemon_lib.config as cfg
 
 # Daemon version
-version = "0.9.99"
+version = "0.9.100~git-73c0834f"
 
 # API version
 API_VERSION = 1.0
@@ -53,7 +51,6 @@ def strtobool(stringv):
 # Configuration Parsing
 ##########################################################
 
-
 # Get our configuration
 config = cfg.get_configuration()
 config["daemon_name"] = "pvcapid"
@@ -61,22 +58,16 @@ config["daemon_version"] = version
 
 
 ##########################################################
-# Entrypoint
+# Flask App Creation for Gunicorn
 ##########################################################
 
 
-def entrypoint():
-    import pvcapid.flaskapi as pvc_api  # noqa: E402
-
-    if config["api_ssl_enabled"]:
-        context = SSLContext()
-        context.minimum_version = TLSVersion.TLSv1
-        context.get_ca_certs()
-        context.load_cert_chain(
-            config["api_ssl_cert_file"], keyfile=config["api_ssl_key_file"]
-        )
-    else:
-        context = None
+def create_app():
+    """
+    Create and return the Flask app and SSL context if necessary.
+    """
+    # Import the Flask app from pvcapid.flaskapi after adjusting the path
+    import pvcapid.flaskapi as pvc_api
 
     # Print our startup messages
     print("")
@@ -102,9 +93,69 @@ def entrypoint():
     print("")
 
     pvc_api.celery_startup()
-    pvc_api.app.run(
-        config["api_listen_address"],
-        config["api_listen_port"],
-        threaded=True,
-        ssl_context=context,
-    )
+
+    return pvc_api.app
+
+
+##########################################################
+# Entrypoint
+##########################################################
+
+
+def entrypoint():
+    if config["debug"]:
+        app = create_app()
+
+        if config["api_ssl_enabled"]:
+            ssl_context = SSLContext()
+            ssl_context.minimum_version = TLSVersion.TLSv1
+            ssl_context.get_ca_certs()
+            ssl_context.load_cert_chain(
+                config["api_ssl_cert_file"], keyfile=config["api_ssl_key_file"]
+            )
+        else:
+            ssl_context = None
+
+        app.run(
+            config["api_listen_address"],
+            config["api_listen_port"],
+            threaded=True,
+            ssl_context=ssl_context,
+        )
+    else:
+        # Build the command to run Gunicorn
+        gunicorn_cmd = [
+            "gunicorn",
+            "--workers",
+            "1",
+            "--threads",
+            "8",
+            "--timeout",
+            "86400",
+            "--bind",
+            "{}:{}".format(config["api_listen_address"], config["api_listen_port"]),
+            "pvcapid.Daemon:create_app()",
+            "--log-level",
+            "info",
+            "--access-logfile",
+            "-",
+            "--error-logfile",
+            "-",
+        ]
+
+        if config["api_ssl_enabled"]:
+            gunicorn_cmd += [
+                "--certfile",
+                config["api_ssl_cert_file"],
+                "--keyfile",
+                config["api_ssl_key_file"],
+            ]
+
+        # Run Gunicorn
+        try:
+            subprocess.run(gunicorn_cmd)
+        except KeyboardInterrupt:
+            exit(0)
+        except Exception as e:
+            print(e)
+            exit(1)
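The hunks above split app construction (`create_app()`) from process management (`entrypoint()`), so Gunicorn can be pointed at the factory while debug mode keeps Flask's built-in server. A minimal, self-contained sketch of the same factory-plus-Gunicorn pattern; the file name, route, and bind address are illustrative assumptions, not taken from this diff:

```python
# app.py -- hedged sketch of the app-factory pattern Gunicorn consumes.
from flask import Flask


def create_app():
    # Gunicorn calls this factory when started as: gunicorn "app:create_app()"
    app = Flask(__name__)

    @app.route("/")
    def index():
        return "ok"

    return app


if __name__ == "__main__":
    # Debug-mode equivalent: run the factory's app directly, as entrypoint() does
    create_app().run("127.0.0.1", 8000, threaded=True)
```

Production would then be invoked as, e.g., `gunicorn --workers 1 --threads 8 --bind 127.0.0.1:8000 "app:create_app()"`, mirroring the `gunicorn_cmd` list built in the diff.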
(File diff suppressed because it is too large)
@@ -21,7 +21,9 @@
 
 import flask
 import json
+import logging
 import lxml.etree as etree
+import sys
 
 from re import match
 from requests import get
@@ -40,6 +42,15 @@ import daemon_lib.network as pvc_network
 import daemon_lib.ceph as pvc_ceph
 
 
+logger = logging.getLogger(__name__)
+logger.setLevel(logging.INFO)
+handler = logging.StreamHandler(sys.stdout)
+handler.setLevel(logging.INFO)
+formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
+handler.setFormatter(formatter)
+logger.addHandler(handler)
+
+
 #
 # Cluster base functions
 #
@@ -1142,11 +1153,11 @@ def vm_remove(zkhandler, name):
 
 
 @ZKConnection(config)
-def vm_start(zkhandler, name):
+def vm_start(zkhandler, name, force=False):
     """
     Start a VM in the PVC cluster.
     """
-    retflag, retdata = pvc_vm.start_vm(zkhandler, name)
+    retflag, retdata = pvc_vm.start_vm(zkhandler, name, force=force)
 
     if retflag:
         retcode = 200
@@ -1190,11 +1201,11 @@ def vm_shutdown(zkhandler, name, wait):
 
 
 @ZKConnection(config)
-def vm_stop(zkhandler, name):
+def vm_stop(zkhandler, name, force=False):
     """
     Forcibly stop a VM in the PVC cluster.
     """
-    retflag, retdata = pvc_vm.stop_vm(zkhandler, name)
+    retflag, retdata = pvc_vm.stop_vm(zkhandler, name, force=force)
 
     if retflag:
         retcode = 200
@@ -1280,7 +1291,7 @@ def vm_flush_locks(zkhandler, vm):
         zkhandler, None, None, None, vm, is_fuzzy=False, negate=False
     )
 
-    if retdata[0].get("state") not in ["stop", "disable"]:
+    if retdata[0].get("state") not in ["stop", "disable", "mirror"]:
         return {"message": "VM must be stopped to flush locks"}, 400
 
     retflag, retdata = pvc_vm.flush_locks(zkhandler, vm)
@@ -1294,6 +1305,342 @@ def vm_flush_locks(zkhandler, vm):
     return output, retcode
 
 
+@ZKConnection(config)
+def vm_snapshot_receive_block_full(zkhandler, pool, volume, snapshot, size, request):
+    """
+    Receive an RBD volume from a remote system
+    """
+    import rados
+    import rbd
+
+    _, rbd_detail = pvc_ceph.get_list_volume(
+        zkhandler, pool, limit=volume, is_fuzzy=False
+    )
+    if len(rbd_detail) > 0:
+        volume_exists = True
+    else:
+        volume_exists = False
+
+    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
+    cluster.connect()
+    ioctx = cluster.open_ioctx(pool)
+
+    if not volume_exists:
+        rbd_inst = rbd.RBD()
+        rbd_inst.create(ioctx, volume, size)
+        retflag, retdata = pvc_ceph.add_volume(
+            zkhandler, pool, volume, str(size) + "B", force_flag=True, zk_only=True
+        )
+        if not retflag:
+            ioctx.close()
+            cluster.shutdown()
+
+            if retflag:
+                retcode = 200
+            else:
+                retcode = 400
+
+            output = {"message": retdata.replace('"', "'")}
+            return output, retcode
+
+    image = rbd.Image(ioctx, volume)
+
+    last_chunk = 0
+    chunk_size = 1024 * 1024 * 1024
+
+    logger.info(f"Importing full snapshot {pool}/{volume}@{snapshot}")
+    while True:
+        chunk = request.stream.read(chunk_size)
+        if not chunk:
+            break
+        image.write(chunk, last_chunk)
+        last_chunk += len(chunk)
+
+    image.close()
+    ioctx.close()
+    cluster.shutdown()
+
+    return {"message": "Successfully received RBD block device"}, 200
+
+
+@ZKConnection(config)
+def vm_snapshot_receive_block_diff(
+    zkhandler, pool, volume, snapshot, source_snapshot, request
+):
+    """
+    Receive an RBD volume from a remote system
+    """
+    import rados
+    import rbd
+
+    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
+    cluster.connect()
+    ioctx = cluster.open_ioctx(pool)
+    image = rbd.Image(ioctx, volume)
+
+    if len(request.files) > 0:
+        logger.info(f"Applying {len(request.files)} RBD diff chunks for {snapshot}")
+
+        for i in range(len(request.files)):
+            object_key = f"object_{i}"
+            if object_key in request.files:
+                object_data = request.files[object_key].read()
+                offset = int.from_bytes(object_data[:8], "big")
+                length = int.from_bytes(object_data[8:16], "big")
+                data = object_data[16 : 16 + length]
+                logger.info(f"Applying RBD diff chunk at {offset} ({length} bytes)")
+                image.write(data, offset)
+    else:
+        return {"message": "No data received"}, 400
+
+    image.close()
+    ioctx.close()
+    cluster.shutdown()
+
+    return {
+        "message": f"Successfully received {len(request.files)} RBD diff chunks"
+    }, 200
+
+
+@ZKConnection(config)
+def vm_snapshot_receive_block_createsnap(zkhandler, pool, volume, snapshot):
+    """
+    Create the snapshot of a remote volume
+    """
+    import rados
+    import rbd
+
+    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
+    cluster.connect()
+    ioctx = cluster.open_ioctx(pool)
+    image = rbd.Image(ioctx, volume)
+    image.create_snap(snapshot)
+    image.close()
+    ioctx.close()
+    cluster.shutdown()
+
+    retflag, retdata = pvc_ceph.add_snapshot(
+        zkhandler, pool, volume, snapshot, zk_only=True
+    )
+    if not retflag:
+        if retflag:
+            retcode = 200
+        else:
+            retcode = 400
+
+        output = {"message": retdata.replace('"', "'")}
+        return output, retcode
+
+    return {"message": "Successfully received RBD snapshot"}, 200
+
+
+@ZKConnection(config)
+def vm_snapshot_receive_config(zkhandler, snapshot, vm_config, source_snapshot=None):
+    """
+    Receive a VM configuration snapshot from a remote system, and modify it to work on our system
+    """
+
+    def parse_unified_diff(diff_text, original_text):
+        """
+        Take a unified diff and apply it to an original string
+        """
+        # Split the original string into lines
+        original_lines = original_text.splitlines(keepends=True)
+        patched_lines = []
+        original_idx = 0  # Track position in original lines
+
+        diff_lines = diff_text.splitlines(keepends=True)
+
+        for line in diff_lines:
+            if line.startswith("---") or line.startswith("+++"):
+                # Ignore prefix lines
+                continue
+            if line.startswith("@@"):
+                # Extract line numbers from the diff hunk header
+                hunk_header = line
+                parts = hunk_header.split(" ")
+                original_range = parts[1]
+
+                # Get the starting line number and range length for the original file
+                original_start, _ = map(int, original_range[1:].split(","))
+
+                # Adjust for zero-based indexing
+                original_start -= 1
+
+                # Add any lines between the current index and the next hunk's start
+                while original_idx < original_start:
+                    patched_lines.append(original_lines[original_idx])
+                    original_idx += 1
+
+            elif line.startswith("-"):
+                # This line should be removed from the original, skip it
+                original_idx += 1
+            elif line.startswith("+"):
+                # This line should be added to the patched version, removing the '+'
+                patched_lines.append(line[1:])
+            else:
+                # Context line (unchanged), it has no prefix, add from the original
+                patched_lines.append(original_lines[original_idx])
+                original_idx += 1
+
+        # Add any remaining lines from the original file after the last hunk
+        patched_lines.extend(original_lines[original_idx:])
+
+        return "".join(patched_lines).strip()
+
+    # Get our XML configuration for this snapshot
+    # We take the main XML configuration, then apply the diff for this particular incremental
+    current_snapshot = [s for s in vm_config["snapshots"] if s["name"] == snapshot][0]
+    vm_xml = vm_config["xml"]
+    vm_xml_diff = "\n".join(current_snapshot["xml_diff_lines"])
+    snapshot_vm_xml = parse_unified_diff(vm_xml_diff, vm_xml)
+    xml_data = etree.fromstring(snapshot_vm_xml)
+
+    # Replace the Ceph storage secret UUID with this cluster's
+    our_ceph_secret_uuid = config["ceph_secret_uuid"]
+    ceph_secrets = xml_data.xpath("//secret[@type='ceph']")
+    for ceph_secret in ceph_secrets:
+        ceph_secret.set("uuid", our_ceph_secret_uuid)
+
+    # Replace the Ceph source hosts with this cluster's
+    our_ceph_storage_hosts = config["storage_hosts"]
+    our_ceph_storage_port = str(config["ceph_monitor_port"])
+    ceph_sources = xml_data.xpath("//source[@protocol='rbd']")
+    for ceph_source in ceph_sources:
+        for host in ceph_source.xpath("host"):
+            ceph_source.remove(host)
+        for ceph_storage_host in our_ceph_storage_hosts:
+            new_host = etree.Element("host")
+            new_host.set("name", ceph_storage_host)
+            new_host.set("port", our_ceph_storage_port)
+            ceph_source.append(new_host)
+
+    # Regenerate the VM XML
+    snapshot_vm_xml = etree.tostring(xml_data, pretty_print=True).decode("utf8")
+
+    if (
+        source_snapshot is not None
+        or pvc_vm.searchClusterByUUID(zkhandler, vm_config["uuid"]) is not None
+    ):
+        logger.info(
+            f"Receiving incremental VM configuration for {vm_config['name']}@{snapshot}"
+        )
+
+        # Modify the VM based on our passed detail
+        retcode, retmsg = pvc_vm.modify_vm(
+            zkhandler,
+            vm_config["uuid"],
+            False,
+            snapshot_vm_xml,
+        )
+        if not retcode:
+            retcode = 400
+            retdata = {"message": retmsg}
+            return retdata, retcode
+
+        retcode, retmsg = pvc_vm.modify_vm_metadata(
+            zkhandler,
+            vm_config["uuid"],
+            None,  # Node limits are left unchanged
+            vm_config["node_selector"],
+            vm_config["node_autostart"],
+            vm_config["profile"],
+            vm_config["migration_method"],
+            vm_config["migration_max_downtime"],
+        )
+        if not retcode:
+            retcode = 400
+            retdata = {"message": retmsg}
+            return retdata, retcode
+
+        current_vm_tags = zkhandler.children(("domain.meta.tags", vm_config["uuid"]))
+        new_vm_tags = [t["name"] for t in vm_config["tags"]]
+        remove_tags = []
+        add_tags = []
+        for tag in vm_config["tags"]:
+            if tag["name"] not in current_vm_tags:
+                add_tags.append((tag["name"], tag["protected"]))
+        for tag in current_vm_tags:
+            if tag not in new_vm_tags:
+                remove_tags.append(tag)
+
+        for tag in add_tags:
+            name, protected = tag
+            pvc_vm.modify_vm_tag(
+                zkhandler, vm_config["uuid"], "add", name, protected=protected
+            )
+        for tag in remove_tags:
+            pvc_vm.modify_vm_tag(zkhandler, vm_config["uuid"], "remove", name)
+    else:
+        logger.info(
+            f"Receiving full VM configuration for {vm_config['name']}@{snapshot}"
+        )
+
+        # Define the VM based on our passed detail
+        retcode, retmsg = pvc_vm.define_vm(
+            zkhandler,
+            snapshot_vm_xml,
+            None,  # Target node is autoselected
+            None,  # Node limits are invalid here so ignore them
+            vm_config["node_selector"],
+            vm_config["node_autostart"],
+            vm_config["migration_method"],
+            vm_config["migration_max_downtime"],
+            vm_config["profile"],
+            vm_config["tags"],
+            "mirror",
+        )
+        if not retcode:
+            retcode = 400
+            retdata = {"message": retmsg}
+            return retdata, retcode
+
+    # Add this snapshot to the VM manually in Zookeeper
+    zkhandler.write(
+        [
+            (
+                (
+                    "domain.snapshots",
+                    vm_config["uuid"],
+                    "domain_snapshot.name",
+                    snapshot,
+                ),
+                snapshot,
+            ),
+            (
+                (
+                    "domain.snapshots",
+                    vm_config["uuid"],
+                    "domain_snapshot.timestamp",
+                    snapshot,
+                ),
+                current_snapshot["timestamp"],
+            ),
+            (
+                (
+                    "domain.snapshots",
+                    vm_config["uuid"],
+                    "domain_snapshot.xml",
+                    snapshot,
+                ),
+                snapshot_vm_xml,
+            ),
+            (
+                (
+                    "domain.snapshots",
+                    vm_config["uuid"],
+                    "domain_snapshot.rbd_snapshots",
+                    snapshot,
+                ),
+                ",".join(current_snapshot["rbd_snapshots"]),
+            ),
+        ]
+    )
+
+    return {"message": "Successfully received VM configuration snapshot"}, 200
+
+
 #
 # Network functions
 #
@@ -1996,6 +2343,22 @@ def ceph_volume_list(zkhandler, pool=None, limit=None, is_fuzzy=True):
     return retdata, retcode
 
 
+@ZKConnection(config)
+def ceph_volume_scan(zkhandler, pool, name):
+    """
+    (Re)scan a Ceph RBD volume for stats in the PVC Ceph storage cluster.
+    """
+    retflag, retdata = pvc_ceph.scan_volume(zkhandler, pool, name)
+
+    if retflag:
+        retcode = 200
+    else:
+        retcode = 400
+
+    output = {"message": retdata.replace('"', "'")}
+    return output, retcode
+
+
 @ZKConnection(config)
 def ceph_volume_add(zkhandler, pool, name, size, force_flag=False):
     """
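The diff-receive endpoint above unpacks each uploaded chunk as an 8-byte big-endian offset, an 8-byte big-endian length, and the raw data. A minimal, self-contained sketch of that framing; the sender-side `pack_chunk` helper is assumed from the unpacking logic, not taken from this diff:

```python
# Hedged sketch: pack and unpack one RBD diff chunk using the 8+8-byte big-endian
# header that vm_snapshot_receive_block_diff reads. Only unpack_chunk mirrors the
# code above; pack_chunk is a hypothetical counterpart.
def pack_chunk(offset: int, data: bytes) -> bytes:
    return offset.to_bytes(8, "big") + len(data).to_bytes(8, "big") + data


def unpack_chunk(blob: bytes) -> tuple[int, bytes]:
    offset = int.from_bytes(blob[:8], "big")
    length = int.from_bytes(blob[8:16], "big")
    return offset, blob[16 : 16 + length]


blob = pack_chunk(4096, b"\xff" * 512)
offset, data = unpack_chunk(blob)
assert offset == 4096 and data == b"\xff" * 512
```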
@ -1517,12 +1517,21 @@ def cli_vm_remove(domain):
|
||||||
@click.command(name="start", short_help="Start up a defined virtual machine.")
|
@click.command(name="start", short_help="Start up a defined virtual machine.")
|
||||||
@connection_req
|
@connection_req
|
||||||
@click.argument("domain")
|
@click.argument("domain")
|
||||||
def cli_vm_start(domain):
|
@click.option(
|
||||||
|
"--force",
|
||||||
|
"force_flag",
|
||||||
|
is_flag=True,
|
||||||
|
default=False,
|
||||||
|
help="Force a snapshot mirror state change.",
|
||||||
|
)
|
||||||
|
def cli_vm_start(domain, force_flag):
|
||||||
"""
|
"""
|
||||||
Start virtual machine DOMAIN on its configured node. DOMAIN may be a UUID or name.
|
Start virtual machine DOMAIN on its configured node. DOMAIN may be a UUID or name.
|
||||||
|
|
||||||
|
If the VM is a snapshot mirror, "--force" allows a manual state change to the mirror.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
retcode, retmsg = pvc.lib.vm.vm_state(CLI_CONFIG, domain, "start")
|
retcode, retmsg = pvc.lib.vm.vm_state(CLI_CONFIG, domain, "start", force=force_flag)
|
||||||
finish(retcode, retmsg)
|
finish(retcode, retmsg)
|
||||||
|
|
||||||
|
|
||||||
|
@ -1582,13 +1591,22 @@ def cli_vm_shutdown(domain, wait):
|
||||||
@click.command(name="stop", short_help="Forcibly halt a running virtual machine.")
|
@click.command(name="stop", short_help="Forcibly halt a running virtual machine.")
|
||||||
@connection_req
|
@connection_req
|
||||||
@click.argument("domain")
|
@click.argument("domain")
|
||||||
|
@click.option(
|
||||||
|
"--force",
|
||||||
|
"force_flag",
|
||||||
|
is_flag=True,
|
||||||
|
default=False,
|
||||||
|
help="Force a snapshot mirror state change.",
|
||||||
|
)
|
||||||
@confirm_opt("Forcibly stop virtual machine {domain}")
|
@confirm_opt("Forcibly stop virtual machine {domain}")
|
||||||
def cli_vm_stop(domain):
|
def cli_vm_stop(domain, force_flag):
|
||||||
"""
|
"""
|
||||||
Forcibly halt (destroy) running virtual machine DOMAIN. DOMAIN may be a UUID or name.
|
Forcibly halt (destroy) running virtual machine DOMAIN. DOMAIN may be a UUID or name.
|
||||||
|
|
||||||
|
If the VM is a snapshot mirror, "--force" allows a manual state change to the mirror.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
retcode, retmsg = pvc.lib.vm.vm_state(CLI_CONFIG, domain, "stop")
|
retcode, retmsg = pvc.lib.vm.vm_state(CLI_CONFIG, domain, "stop", force=force_flag)
|
||||||
finish(retcode, retmsg)
|
finish(retcode, retmsg)
|
||||||
|
|
||||||
|
|
||||||
|
@ -1603,14 +1621,14 @@ def cli_vm_stop(domain):
|
||||||
"force_flag",
|
"force_flag",
|
||||||
is_flag=True,
|
is_flag=True,
|
||||||
default=False,
|
default=False,
|
||||||
help="Forcibly stop the VM instead of waiting for shutdown.",
|
help="Forcibly stop VM without shutdown and/or force a snapshot mirror state change.",
|
||||||
)
|
)
|
||||||
@confirm_opt("Shut down and disable virtual machine {domain}")
|
@confirm_opt("Shut down and disable virtual machine {domain}")
|
||||||
def cli_vm_disable(domain, force_flag):
|
def cli_vm_disable(domain, force_flag):
|
||||||
"""
|
"""
|
||||||
Shut down virtual machine DOMAIN and mark it as disabled. DOMAIN may be a UUID or name.
|
Shut down virtual machine DOMAIN and mark it as disabled. DOMAIN may be a UUID or name.
|
||||||
|
|
||||||
Disabled VMs will not be counted towards a degraded cluster health status, unlike stopped VMs. Use this option for a VM that will remain off for an extended period.
|
If "--force" is specified, and the VM is running, it will be forcibly stopped instead of waiting for a graceful ACPI shutdown. If the VM is a snapshot mirror, "--force" allows a manual state change to the mirror.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
retcode, retmsg = pvc.lib.vm.vm_state(
|
retcode, retmsg = pvc.lib.vm.vm_state(
|
||||||
|
@ -2018,6 +2036,308 @@ def cli_vm_snapshot_import(
|
||||||
finish(retcode, retmsg)
|
finish(retcode, retmsg)
|
||||||
|
|
||||||
|
|
||||||
|
###############################################################################
|
||||||
|
# > pvc vm snapshot send
|
||||||
|
###############################################################################
|
||||||
|
@click.command(
|
||||||
|
name="send",
|
||||||
|
short_help="Send a snapshot of a virtual machine to another PVC cluster.",
|
||||||
|
)
|
||||||
|
@connection_req
|
||||||
|
@click.argument("domain")
|
||||||
|
@click.argument("snapshot_name")
|
||||||
|
@click.argument("destination")
|
||||||
|
@click.option(
|
||||||
|
"-k",
|
||||||
|
"--destination-api-key",
|
||||||
|
"destination_api_key",
|
||||||
|
default=None,
|
||||||
|
help="The API key of the destination cluster when specifying an API URI.",
|
||||||
|
)
|
||||||
|
@click.option(
|
||||||
|
"-p",
|
||||||
|
"--destination-pool",
|
||||||
|
"destination_storage_pool",
|
||||||
|
default=None,
|
||||||
|
help="The target storage pool on the destination cluster, if it differs from the source pool.",
|
||||||
|
)
|
||||||
|
@click.option(
|
||||||
|
"-i",
|
||||||
|
"--incremental",
|
||||||
|
"incremental_parent",
|
||||||
|
default=None,
|
||||||
|
help="Perform an incremental volume send from this parent snapshot.",
|
||||||
|
)
|
||||||
|
@click.option(
|
||||||
|
"--wait/--no-wait",
|
||||||
|
"wait_flag",
|
||||||
|
is_flag=True,
|
||||||
|
default=True,
|
||||||
|
show_default=True,
|
||||||
|
help="Wait or don't wait for task to complete, showing progress if waiting",
|
||||||
|
)
|
||||||
|
def cli_vm_snapshot_send(
|
||||||
|
domain,
|
||||||
|
snapshot_name,
|
||||||
|
destination,
|
||||||
|
destination_api_key,
|
||||||
|
destination_storage_pool,
|
||||||
|
incremental_parent,
|
||||||
|
wait_flag,
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Send the (existing) snapshot SNAPSHOT_NAME of virtual machine DOMAIN to the remote PVC cluster DESTINATION.
|
||||||
|
|
||||||
|
DOMAIN may be a UUID or name. DESTINATION may be either a configured PVC connection name in this CLI instance (i.e. a valid argument to "--connection"), or a full API URI, including the scheme, port and API prefix; if using the latter, an API key can be specified with the "-k"/"--destination-api-key" option.
|
||||||
|
|
||||||
|
The send will include the VM configuration, metainfo, and a point-in-time snapshot of all attached RBD volumes.
|
||||||
|
|
||||||
|
By default, the storage pool of the sending cluster will be used at the destination cluster as well. If a pool of that name does not exist, specify one with the "-p"/"--detination-pool" option.
|
||||||
|
|
||||||
|
Incremental sends are possible by specifying the "-i"/"--incremental-parent" option along with a parent snapshot name. To correctly receive, that parent snapshot must exist on DESTINATION. Subsequent sends after the first do not have to be incremental, but an incremental send is likely to perform better than a full send if the VM experiences few writes.
|
||||||
|
|
||||||
|
WARNING: Once sent, the VM will be in the state "mirror" on the destination cluster. If it is subsequently started, for instance for disaster recovery, a new snapshot must be taken on the destination cluster and sent back or data will be inconsistent between the instances. Only VMs in the "mirror" state can accept new sends.
|
||||||
|
|
||||||
|
WARNING: This functionality has no automatic backout on the remote side. While a properly configured cluster should not fail any step in the process, a situation like an intermittent network connection might cause a failure which would have to be manually corrected on that side, usually by removing the mirrored VM and retrying, or rolling back to a previous snapshot and retrying. Future versions may enhance automatic recovery, but for now this would be up to the administrator.
|
||||||
|
"""
|
||||||
|
|
||||||
|
connections_config = get_store(CLI_CONFIG["store_path"])
|
||||||
|
if destination in connections_config.keys():
|
||||||
|
destination_cluster_config = connections_config[destination]
|
||||||
|
destination_api_uri = "{}://{}:{}{}".format(
|
||||||
|
destination_cluster_config["scheme"],
|
||||||
|
destination_cluster_config["host"],
|
||||||
|
destination_cluster_config["port"],
|
||||||
|
CLI_CONFIG["api_prefix"],
|
||||||
|
)
|
||||||
|
destination_api_key = destination_cluster_config["api_key"]
|
||||||
|
else:
|
||||||
|
if "http" not in destination:
|
||||||
|
finish(
|
||||||
|
False, "ERROR: A valid destination cluster or URI must be specified!"
|
||||||
|
)
|
||||||
|
destination_api_uri = destination
|
||||||
|
destination_api_key = destination_api_key
|
||||||
|
|
||||||
|
retcode, retmsg = pvc.lib.vm.vm_send_snapshot(
|
||||||
|
CLI_CONFIG,
|
||||||
|
domain,
|
||||||
|
snapshot_name,
|
||||||
|
destination_api_uri,
|
||||||
|
destination_api_key,
|
||||||
|
destination_api_verify_ssl=CLI_CONFIG.get("verify_ssl"),
|
||||||
|
destination_storage_pool=destination_storage_pool,
|
||||||
|
incremental_parent=incremental_parent,
|
||||||
|
wait_flag=wait_flag,
|
||||||
|
)
|
||||||
|
|
||||||
|
if retcode and wait_flag:
|
||||||
|
retmsg = wait_for_celery_task(CLI_CONFIG, retmsg)
|
||||||
|
finish(retcode, retmsg)
|
||||||
|
|
||||||
|
|
||||||
|
###############################################################################
|
||||||
|
# > pvc vm mirror
|
||||||
|
###############################################################################
|
||||||
|
@click.group(
|
||||||
|
name="mirror",
|
||||||
|
short_help="Manage snapshot mirrors for PVC VMs.",
|
||||||
|
context_settings=CONTEXT_SETTINGS,
|
||||||
|
)
|
||||||
|
def cli_vm_mirror():
|
||||||
|
"""
|
||||||
|
Manage snapshot mirrors of VMs in a PVC cluster.
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
###############################################################################
|
||||||
|
# > pvc vm mirror create
|
||||||
|
###############################################################################
|
||||||
|
@click.command(
|
||||||
|
name="create",
|
||||||
|
short_help="Create a snapshot mirror of a virtual machine to another PVC cluster.",
|
||||||
|
)
|
||||||
|
@connection_req
|
||||||
|
@click.argument("domain")
|
||||||
|
@click.argument("destination")
|
||||||
|
@click.option(
|
||||||
|
"-k",
|
||||||
|
"--destination-api-key",
|
||||||
|
"destination_api_key",
|
||||||
|
default=None,
|
||||||
|
help="The API key of the destination cluster when specifying an API URI.",
|
||||||
|
)
|
||||||
|
@click.option(
|
||||||
|
"-p",
|
||||||
|
"--destination-pool",
|
||||||
|
"destination_storage_pool",
|
||||||
|
default=None,
|
||||||
|
help="The target storage pool on the destination cluster, if it differs from the source pool.",
|
||||||
|
)
|
||||||
|
@click.option(
|
||||||
|
"--wait/--no-wait",
|
||||||
|
"wait_flag",
|
||||||
|
is_flag=True,
|
||||||
|
default=True,
|
||||||
|
show_default=True,
|
||||||
|
help="Wait or don't wait for task to complete, showing progress if waiting",
|
||||||
|
)
|
||||||
|
def cli_vm_mirror_create(
|
||||||
|
domain,
|
||||||
|
destination,
|
||||||
|
destination_api_key,
|
||||||
|
destination_storage_pool,
|
||||||
|
wait_flag,
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
For the virtual machine DOMAIN: create a new snapshot (dated), and send snapshot to the remote PVC cluster DESTINATION; creates a cross-cluster snapshot mirror of the VM.
|
||||||
|
|
||||||
|
DOMAIN may be a UUID or name. DESTINATION may be either a configured PVC connection name in this CLI instance (i.e. a valid argument to "--connection"), or a full API URI, including the scheme, port and API prefix; if using the latter, an API key can be specified with the "-k"/"--destination-api-key" option.
|
||||||
|
|
||||||
|
The send will include the VM configuration, metainfo, and a point-in-time snapshot of all attached RBD volumes.
|
||||||
|
|
||||||
|
This command may be used repeatedly to send new updates for a remote VM mirror. If a valid shared snapshot is found on the destination cluster, block device transfers will be incremental based on that snapshot.
|
||||||
|
|
||||||
|
By default, the storage pool of the sending cluster will be used at the destination cluster as well. If a pool of that name does not exist, specify one with the "-p"/"--detination-pool" option.
|
||||||
|
|
||||||
|
WARNING: Once sent, the VM will be in the state "mirror" on the destination cluster. If it is subsequently started, for instance for disaster recovery, a new snapshot must be taken on the destination cluster and sent back or data will be inconsistent between the instances. Only VMs in the "mirror" state can accept new sends. Consider using "mirror promote" instead of any manual promotion attempts.
|
||||||
|
|
||||||
|
WARNING: This functionality has no automatic backout on the remote side. While a properly configured cluster should not fail any step in the process, a situation like an intermittent network connection might cause a failure which would have to be manually corrected on that side, usually by removing the mirrored VM and retrying, or rolling back to a previous snapshot and retrying. Future versions may enhance automatic recovery, but for now this would be up to the administrator.
|
||||||
|
"""
|
||||||
|
|
||||||
|
connections_config = get_store(CLI_CONFIG["store_path"])
|
||||||
|
if destination in connections_config.keys():
|
||||||
|
destination_cluster_config = connections_config[destination]
|
||||||
|
destination_api_uri = "{}://{}:{}{}".format(
|
||||||
|
destination_cluster_config["scheme"],
|
||||||
|
destination_cluster_config["host"],
|
||||||
|
destination_cluster_config["port"],
|
||||||
|
CLI_CONFIG["api_prefix"],
|
||||||
|
)
|
||||||
|
destination_api_key = destination_cluster_config["api_key"]
|
||||||
|
else:
|
||||||
|
if "http" not in destination:
|
||||||
|
finish(
|
||||||
|
False, "ERROR: A valid destination cluster or URI must be specified!"
|
||||||
|
)
|
||||||
|
destination_api_uri = destination
|
||||||
|
destination_api_key = destination_api_key
|
||||||
|
|
||||||
|
retcode, retmsg = pvc.lib.vm.vm_create_mirror(
|
||||||
|
CLI_CONFIG,
|
||||||
|
domain,
|
||||||
|
destination_api_uri,
|
||||||
|
destination_api_key,
|
||||||
|
destination_api_verify_ssl=CLI_CONFIG.get("verify_ssl"),
|
||||||
|
destination_storage_pool=destination_storage_pool,
|
||||||
|
wait_flag=wait_flag,
|
||||||
|
)
|
||||||
|
|
||||||
|
if retcode and wait_flag:
|
||||||
|
retmsg = wait_for_celery_task(CLI_CONFIG, retmsg)
|
||||||
|
finish(retcode, retmsg)
|
||||||
|
|
||||||
|
|
||||||
+###############################################################################
+# > pvc vm mirror promote
+###############################################################################
+@click.command(
+    name="promote",
+    short_help="Shut down, create a snapshot mirror, and promote a virtual machine to another PVC cluster.",
+)
+@connection_req
+@click.argument("domain")
+@click.argument("destination")
+@click.option(
+    "-k",
+    "--destination-api-key",
+    "destination_api_key",
+    default=None,
+    help="The API key of the destination cluster when specifying an API URI.",
+)
+@click.option(
+    "-p",
+    "--destination-pool",
+    "destination_storage_pool",
+    default=None,
+    help="The target storage pool on the destination cluster, if it differs from the source pool.",
+)
+@click.option(
+    "--remove/--no-remove",
+    "remove_flag",
+    is_flag=True,
+    default=False,
+    show_default=True,
+    help="Remove or don't remove the local VM after promoting (if set, performs a cross-cluster move).",
+)
+@click.option(
+    "--wait/--no-wait",
+    "wait_flag",
+    is_flag=True,
+    default=True,
+    show_default=True,
+    help="Wait or don't wait for task to complete, showing progress if waiting",
+)
+@confirm_opt("Promote VM {domain} on cluster {destination} (will shut down VM)")
+def cli_vm_mirror_promote(
+    domain,
+    destination,
+    destination_api_key,
+    destination_storage_pool,
+    remove_flag,
+    wait_flag,
+):
+    """
+    For the virtual machine DOMAIN: shut down on this cluster, create a new snapshot (dated), send the snapshot to the remote PVC cluster DESTINATION, start on DESTINATION, and optionally remove from this cluster; performs a cross-cluster move of the VM, with or without retaining the source as a snapshot mirror.
+
+    DOMAIN may be a UUID or name. DESTINATION may be either a configured PVC connection name in this CLI instance (i.e. a valid argument to "--connection"), or a full API URI, including the scheme, port and API prefix; if using the latter, an API key can be specified with the "-k"/"--destination-api-key" option.
+
+    The send will include the VM configuration, metainfo, and a point-in-time snapshot of all attached RBD volumes.
+
+    If a valid shared snapshot is found on the destination cluster, block device transfers will be incremental based on that snapshot.
+
+    By default, the storage pool of the sending cluster will be used at the destination cluster as well. If a pool of that name does not exist, specify one with the "-p"/"--destination-pool" option.
+
+    WARNING: Once promoted, if the "--remove" flag is not set, the VM will be in the state "mirror" on this cluster. This effectively flips which cluster is the "primary" for this VM, and subsequent mirror management commands must be run against the destination cluster instead of this cluster. If the "--remove" flag is set, the VM will be removed from this cluster entirely once successfully started on the destination cluster.
+
+    WARNING: This functionality has no automatic backout on the remote side. While a properly configured cluster should not fail any step in the process, a situation like an intermittent network connection might cause a failure which would have to be manually corrected on that side, usually by removing the mirrored VM and retrying, or by rolling back to a previous snapshot and retrying. Future versions may enhance automatic recovery, but for now this is up to the administrator.
+    """
+
+    connections_config = get_store(CLI_CONFIG["store_path"])
+    if destination in connections_config.keys():
+        destination_cluster_config = connections_config[destination]
+        destination_api_uri = "{}://{}:{}{}".format(
+            destination_cluster_config["scheme"],
+            destination_cluster_config["host"],
+            destination_cluster_config["port"],
+            CLI_CONFIG["api_prefix"],
+        )
+        destination_api_key = destination_cluster_config["api_key"]
+    else:
+        if "http" not in destination:
+            finish(
+                False, "ERROR: A valid destination cluster or URI must be specified!"
+            )
+        destination_api_uri = destination
+        destination_api_key = destination_api_key
+
+    retcode, retmsg = pvc.lib.vm.vm_promote_mirror(
+        CLI_CONFIG,
+        domain,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=CLI_CONFIG.get("verify_ssl"),
+        destination_storage_pool=destination_storage_pool,
+        remove_on_source=remove_flag,
+        wait_flag=wait_flag,
+    )
+
+    if retcode and wait_flag:
+        retmsg = wait_for_celery_task(CLI_CONFIG, retmsg)
+    finish(retcode, retmsg)
+
+
 ###############################################################################
 # > pvc vm backup
 ###############################################################################
@@ -2028,7 +2348,7 @@ def cli_vm_snapshot_import(
 )
 def cli_vm_backup():
     """
-    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in PVC 0.9.100.
+    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in a future version.

     Manage backups of VMs in a PVC cluster.
     """
@@ -2059,7 +2379,7 @@ def cli_vm_backup():
 )
 def cli_vm_backup_create(domain, backup_path, incremental_parent, retain_snapshot):
     """
-    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in PVC 0.9.100.
+    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in a future version.

     Create a backup of virtual machine DOMAIN to BACKUP_PATH on the cluster primary coordinator. DOMAIN may be a UUID or name.

@@ -2107,7 +2427,7 @@ def cli_vm_backup_create(domain, backup_path, incremental_parent, retain_snapshot):
 )
 def cli_vm_backup_restore(domain, backup_datestring, backup_path, retain_snapshot):
     """
-    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in PVC 0.9.100.
+    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in a future version.

     Restore the backup BACKUP_DATESTRING of virtual machine DOMAIN stored in BACKUP_PATH on the cluster primary coordinator. DOMAIN may be a UUID or name.

@@ -2147,7 +2467,7 @@ def cli_vm_backup_restore(domain, backup_datestring, backup_path, retain_snapshot):
 @click.argument("backup_path")
 def cli_vm_backup_remove(domain, backup_datestring, backup_path):
     """
-    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in PVC 0.9.100.
+    DEPRECATED: Use 'pvc vm snapshot' commands instead. 'pvc vm backup' commands will be removed in a future version.

     Remove the backup BACKUP_DATESTRING, including snapshots, of virtual machine DOMAIN stored in BACKUP_PATH on the cluster primary coordinator. DOMAIN may be a UUID or name.

@@ -3755,6 +4075,13 @@ def cli_storage_benchmark():
 @click.command(name="run", short_help="Run a storage benchmark.")
 @connection_req
 @click.argument("pool")
+@click.option(
+    "--name",
+    "name",
+    default=None,
+    show_default=False,
+    help="Use a custom name for the job",
+)
 @click.option(
     "--wait/--no-wait",
     "wait_flag",
@@ -3766,12 +4093,14 @@ def cli_storage_benchmark():
 @confirm_opt(
     "Storage benchmarks take approximately 10 minutes to run and generate significant load on the cluster; they should be run sparingly. Continue"
 )
-def cli_storage_benchmark_run(pool, wait_flag):
+def cli_storage_benchmark_run(pool, name, wait_flag):
     """
     Run a storage benchmark on POOL in the background.
     """

-    retcode, retmsg = pvc.lib.storage.ceph_benchmark_run(CLI_CONFIG, pool, wait_flag)
+    retcode, retmsg = pvc.lib.storage.ceph_benchmark_run(
+        CLI_CONFIG, pool, name, wait_flag
+    )

     if retcode and wait_flag:
         retmsg = wait_for_celery_task(CLI_CONFIG, retmsg)
@@ -3866,8 +4195,6 @@ def cli_storage_osd_create_db_vg(node, device, wait_flag):
     Only one OSD database volume group on a single physical device, named "osd-db", is supported per node, so it must be fast and large enough to act as an effective OSD database device for all OSDs on the node. Attempting to add additional database volume groups after the first will result in an error.

     WARNING: If the OSD database device fails, all OSDs on the node using it will be lost and must be recreated.
-
-    A "detect" string is a string in the form "detect:<NAME>:<HUMAN-SIZE>:<ID>". Detect strings allow for automatic determination of Linux block device paths from known basic information about disks by leveraging "lsscsi" on the target host. The "NAME" should be some descriptive identifier, for instance the manufacturer (e.g. "INTEL"), the "HUMAN-SIZE" should be the labeled human-readable size of the device (e.g. "480GB", "1.92TB"), and "ID" specifies the Nth 0-indexed device which matches the "NAME" and "HUMAN-SIZE" values (e.g. "2" would match the third device with the corresponding "NAME" and "HUMAN-SIZE"). When matching against sizes, there is +/- 3% flexibility to account for base-1000 vs. base-1024 differences and rounding errors. The "NAME" may contain whitespace but if so the entire detect string should be quoted, and is case-insensitive. More information about detect strings can be found in the manual.
     """

     retcode, retmsg = pvc.lib.storage.ceph_osd_db_vg_add(
@@ -3936,7 +4263,7 @@ def cli_storage_osd_add(

     DEVICE must be a valid block device path (e.g. '/dev/nvme0n1', '/dev/disk/by-path/...') or a "detect" string. Partitions are NOT supported. A "detect" string is a string in the form "detect:<NAME>:<HUMAN-SIZE>:<ID>". The path or detect string must be valid on the current node housing the OSD.

-    A "detect" string is a string in the form "detect:<NAME>:<HUMAN-SIZE>:<ID>". Detect strings allow for automatic determination of Linux block device paths from known basic information about disks by leveraging "lsscsi" on the target host. The "NAME" should be some descriptive identifier, for instance the manufacturer (e.g. "INTEL"), the "HUMAN-SIZE" should be the labeled human-readable size of the device (e.g. "480GB", "1.92TB"), and "ID" specifies the Nth 0-indexed device which matches the "NAME" and "HUMAN-SIZE" values (e.g. "2" would match the third device with the corresponding "NAME" and "HUMAN-SIZE"). When matching against sizes, there is +/- 3% flexibility to account for base-1000 vs. base-1024 differences and rounding errors. The "NAME" may contain whitespace but if so the entire detect string should be quoted, and is case-insensitive. More information about detect strings can be found in the pvcbootstrapd manual.
+    A "detect" string is a string in the form "detect:<NAME>:<HUMAN-SIZE>:<ID>". Detect strings allow for automatic determination of Linux block device paths from known basic information about disks by leveraging "lsscsi"/"nvme" on the target host. The "NAME" should be some descriptive identifier that would be part of the device's Model information, for instance the manufacturer (e.g. "INTEL") or a similar unique string (e.g. "BOSS" for Dell BOSS cards); the "HUMAN-SIZE" should be the labeled human-readable size of the device (e.g. "480GB", "1.92TB"); and "ID" specifies the Nth 0-indexed device which matches the "NAME" and "HUMAN-SIZE" values (e.g. "2" would match the third device with the corresponding "NAME" and "HUMAN-SIZE"). When matching against sizes, there is +/- 3% flexibility to account for base-1000 vs. base-1024 differences and rounding errors. The "NAME" may contain whitespace but if so the entire detect string should be quoted, and is case-insensitive. More information about detect strings can be found in the pvcbootstrapd manual.

     The weight of an OSD should reflect the ratio of the size of the OSD to the other OSDs in the storage cluster. For example, with a 200GB disk and a 400GB disk in each node, the 400GB disk should have twice the weight as the 200GB disk. For more information about CRUSH weights, please see the Ceph documentation.

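For clarity, a rough sketch of how a "detect:<NAME>:<HUMAN-SIZE>:<ID>" string can be interpreted, with made-up device data; the real matching happens on the node side via "lsscsi"/"nvme":

# Sketch only; not part of the diff. Device tuples are hypothetical.
def parse_detect_string(detect_string):
    _, name, human_size, idx = detect_string.split(":")
    return name.lower(), human_size, int(idx)

def size_matches(labeled_bytes, reported_bytes, tolerance=0.03):
    # +/- 3% covers base-1000 vs base-1024 labelling and rounding differences
    return abs(reported_bytes - labeled_bytes) <= labeled_bytes * tolerance

name, human_size, idx = parse_detect_string("detect:INTEL:1.92TB:2")
labeled_bytes = 1.92e12  # parsed from "1.92TB"; base-1000 for simplicity

devices = [
    ("/dev/sda", "INTEL", 1_920_383_410_176),
    ("/dev/sdb", "INTEL", 1_920_383_410_176),
    ("/dev/sdc", "INTEL", 1_920_383_410_176),
]
matches = [
    d for d in devices if d[1].lower() == name and size_matches(labeled_bytes, d[2])
]
assert matches[idx][0] == "/dev/sdc"  # ID "2" selects the third matching device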
@@ -6581,7 +6908,11 @@ cli_vm_snapshot.add_command(cli_vm_snapshot_remove)
 cli_vm_snapshot.add_command(cli_vm_snapshot_rollback)
 cli_vm_snapshot.add_command(cli_vm_snapshot_export)
 cli_vm_snapshot.add_command(cli_vm_snapshot_import)
+cli_vm_snapshot.add_command(cli_vm_snapshot_send)
 cli_vm.add_command(cli_vm_snapshot)
+cli_vm_mirror.add_command(cli_vm_mirror_create)
+cli_vm_mirror.add_command(cli_vm_mirror_promote)
+cli_vm.add_command(cli_vm_mirror)
 cli_vm_backup.add_command(cli_vm_backup_create)
 cli_vm_backup.add_command(cli_vm_backup_restore)
 cli_vm_backup.add_command(cli_vm_backup_remove)
@@ -83,6 +83,37 @@ def cli_cluster_status_format_pretty(CLI_CONFIG, data):
     total_volumes = data.get("volumes", 0)
     total_snapshots = data.get("snapshots", 0)

+    total_cpu_total = data.get("resources", {}).get("cpu", {}).get("total", 0)
+    total_cpu_load = data.get("resources", {}).get("cpu", {}).get("load", 0)
+    total_cpu_utilization = (
+        data.get("resources", {}).get("cpu", {}).get("utilization", 0)
+    )
+    total_cpu_string = (
+        f"{total_cpu_utilization:.1f}% ({total_cpu_load:.1f} / {total_cpu_total})"
+    )
+
+    total_memory_total = (
+        data.get("resources", {}).get("memory", {}).get("total", 0) / 1024
+    )
+    total_memory_used = (
+        data.get("resources", {}).get("memory", {}).get("used", 0) / 1024
+    )
+    total_memory_utilization = (
+        data.get("resources", {}).get("memory", {}).get("utilization", 0)
+    )
+    total_memory_string = f"{total_memory_utilization:.1f}% ({total_memory_used:.1f} GB / {total_memory_total:.1f} GB)"
+
+    total_disk_total = (
+        data.get("resources", {}).get("disk", {}).get("total", 0) / 1024 / 1024
+    )
+    total_disk_used = (
+        data.get("resources", {}).get("disk", {}).get("used", 0) / 1024 / 1024
+    )
+    total_disk_utilization = round(
+        data.get("resources", {}).get("disk", {}).get("utilization", 0)
+    )
+    total_disk_string = f"{total_disk_utilization:.1f}% ({total_disk_used:.1f} GB / {total_disk_total:.1f} GB)"
+
     if maintenance == "true" or health == -1:
         health_colour = ansii["blue"]
     elif health > 90:
@@ -94,9 +125,6 @@ def cli_cluster_status_format_pretty(CLI_CONFIG, data):

     output = list()

-    output.append(f"{ansii['bold']}PVC cluster status:{ansii['end']}")
-    output.append("")
-
     output.append(f"{ansii['purple']}Primary node:{ansii['end']} {primary_node}")
     output.append(f"{ansii['purple']}PVC version:{ansii['end']} {pvc_version}")
     output.append(f"{ansii['purple']}Upstream IP:{ansii['end']} {upstream_ip}")
@@ -136,7 +164,17 @@ def cli_cluster_status_format_pretty(CLI_CONFIG, data):
         )

         messages = "\n ".join(message_list)
-        output.append(f"{ansii['purple']}Active Faults:{ansii['end']} {messages}")
+    else:
+        messages = "None"
+
+    output.append(f"{ansii['purple']}Active faults:{ansii['end']} {messages}")
+
+    output.append(f"{ansii['purple']}Total CPU:{ansii['end']} {total_cpu_string}")
+
+    output.append(
+        f"{ansii['purple']}Total memory:{ansii['end']} {total_memory_string}"
+    )
+
+    output.append(f"{ansii['purple']}Total disk:{ansii['end']} {total_disk_string}")
+
     output.append("")
@@ -168,12 +206,12 @@ def cli_cluster_status_format_pretty(CLI_CONFIG, data):

     output.append(f"{ansii['purple']}Nodes:{ansii['end']} {nodes_string}")

-    vm_states = ["start", "disable"]
+    vm_states = ["start", "disable", "mirror"]
     vm_states.extend(
         [
             state
             for state in data.get("vms", {}).keys()
-            if state not in ["total", "start", "disable"]
+            if state not in ["total", "start", "disable", "mirror"]
         ]
     )
@@ -183,8 +221,10 @@ def cli_cluster_status_format_pretty(CLI_CONFIG, data):
             continue
         if state in ["start"]:
             state_colour = ansii["green"]
-        elif state in ["migrate", "disable", "provision"]:
+        elif state in ["migrate", "disable", "provision", "mirror"]:
             state_colour = ansii["blue"]
+        elif state in ["mirror"]:
+            state_colour = ansii["purple"]
         elif state in ["stop", "fail"]:
             state_colour = ansii["red"]
         else:
|
@ -258,9 +298,6 @@ def cli_cluster_status_format_short(CLI_CONFIG, data):
|
||||||
|
|
||||||
output = list()
|
output = list()
|
||||||
|
|
||||||
output.append(f"{ansii['bold']}PVC cluster status:{ansii['end']}")
|
|
||||||
output.append("")
|
|
||||||
|
|
||||||
if health != "-1":
|
if health != "-1":
|
||||||
health = f"{health}%"
|
health = f"{health}%"
|
||||||
else:
|
else:
|
||||||
|
@@ -295,7 +332,48 @@ def cli_cluster_status_format_short(CLI_CONFIG, data):
         )

         messages = "\n ".join(message_list)
-        output.append(f"{ansii['purple']}Active Faults:{ansii['end']} {messages}")
+    else:
+        messages = "None"
+
+    output.append(f"{ansii['purple']}Active faults:{ansii['end']} {messages}")
+
+    total_cpu_total = data.get("resources", {}).get("cpu", {}).get("total", 0)
+    total_cpu_load = data.get("resources", {}).get("cpu", {}).get("load", 0)
+    total_cpu_utilization = (
+        data.get("resources", {}).get("cpu", {}).get("utilization", 0)
+    )
+    total_cpu_string = (
+        f"{total_cpu_utilization:.1f}% ({total_cpu_load:.1f} / {total_cpu_total})"
+    )
+
+    total_memory_total = (
+        data.get("resources", {}).get("memory", {}).get("total", 0) / 1024
+    )
+    total_memory_used = (
+        data.get("resources", {}).get("memory", {}).get("used", 0) / 1024
+    )
+    total_memory_utilization = (
+        data.get("resources", {}).get("memory", {}).get("utilization", 0)
+    )
+    total_memory_string = f"{total_memory_utilization:.1f}% ({total_memory_used:.1f} GB / {total_memory_total:.1f} GB)"
+
+    total_disk_total = (
+        data.get("resources", {}).get("disk", {}).get("total", 0) / 1024 / 1024
+    )
+    total_disk_used = (
+        data.get("resources", {}).get("disk", {}).get("used", 0) / 1024 / 1024
+    )
+    total_disk_utilization = round(
+        data.get("resources", {}).get("disk", {}).get("utilization", 0)
+    )
+    total_disk_string = f"{total_disk_utilization:.1f}% ({total_disk_used:.1f} GB / {total_disk_total:.1f} GB)"
+
+    output.append(f"{ansii['purple']}CPU usage:{ansii['end']} {total_cpu_string}")
+
+    output.append(
+        f"{ansii['purple']}Memory usage:{ansii['end']} {total_memory_string}"
+    )
+
+    output.append(f"{ansii['purple']}Disk usage:{ansii['end']} {total_disk_string}")
+
     output.append("")
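The divisions above are unit conversions. A small standalone sketch, assuming the API reports memory figures in MB and disk figures in KB (hence /1024 and /1024/1024 to reach GB); the sample numbers are made up:

# Sketch only; not part of the diff.
resources = {
    "cpu": {"total": 64, "load": 12.4, "utilization": 19.4},
    "memory": {"total": 262144, "used": 131072, "utilization": 50.0},  # MB (assumed)
    "disk": {"total": 10737418240, "used": 2147483648, "utilization": 20},  # KB (assumed)
}
cpu = f"{resources['cpu']['utilization']:.1f}% ({resources['cpu']['load']:.1f} / {resources['cpu']['total']})"
mem_used, mem_total = resources["memory"]["used"] / 1024, resources["memory"]["total"] / 1024
disk_used, disk_total = (
    resources["disk"]["used"] / 1024 / 1024,
    resources["disk"]["total"] / 1024 / 1024,
)
print(cpu)                                           # 19.4% (12.4 / 64)
print(f"{mem_used:.1f} GB / {mem_total:.1f} GB")     # 128.0 GB / 256.0 GB
print(f"{disk_used:.1f} GB / {disk_total:.1f} GB")   # 2048.0 GB / 10240.0 GB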
@@ -827,7 +905,7 @@ def cli_connection_list_format_pretty(CLI_CONFIG, data):
     # Parse each connection and adjust field lengths
     for connection in data:
         for field, length in [(f, fields[f]["length"]) for f in fields]:
-            _length = len(str(connection[field]))
+            _length = len(str(connection[field])) + 1
             if _length > length:
                 length = len(str(connection[field])) + 1
@@ -927,7 +1005,7 @@ def cli_connection_detail_format_pretty(CLI_CONFIG, data):
     # Parse each connection and adjust field lengths
     for connection in data:
         for field, length in [(f, fields[f]["length"]) for f in fields]:
-            _length = len(str(connection[field]))
+            _length = len(str(connection[field])) + 1
             if _length > length:
                 length = len(str(connection[field])) + 1
@@ -167,9 +167,17 @@ def get_store(store_path):
     with open(store_file) as fh:
         try:
             store_data = jload(fh)
-            return store_data
         except Exception:
-            return dict()
+            store_data = dict()
+
+    if path.exists(DEFAULT_STORE_DATA["cfgfile"]):
+        if store_data.get("local", None) != DEFAULT_STORE_DATA:
+            # pop() rather than del avoids a KeyError when "local" is absent
+            store_data.pop("local", None)
+        if "local" not in store_data.keys():
+            store_data["local"] = DEFAULT_STORE_DATA
+            update_store(store_path, store_data)
+
+    return store_data


 def update_store(store_path, store_data):
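A toy illustration of the reconciliation above: a stale "local" entry is replaced with the defaults while user-defined connections survive. DEFAULT_STORE_DATA here is a stand-in for the module's real constant:

# Sketch only; not part of the diff.
DEFAULT_STORE_DATA = {"cfgfile": "/etc/pvc/pvc.conf", "host": "localhost"}

store_data = {"local": {"cfgfile": "/etc/pvc/old.conf"}, "prod": {"host": "10.0.0.1"}}

if store_data.get("local", None) != DEFAULT_STORE_DATA:
    store_data.pop("local", None)
if "local" not in store_data.keys():
    store_data["local"] = DEFAULT_STORE_DATA

assert store_data["local"] == DEFAULT_STORE_DATA  # stale entry replaced
assert "prod" in store_data                       # user connections untouched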
@@ -68,7 +68,8 @@ def cli_connection_list_parser(connections_config, show_keys_flag):
         }
     )

-    return connections_data
+    # Return, ensuring local is always first
+    return sorted(connections_data, key=lambda x: (x.get("name") != "local"))


 def cli_connection_detail_parser(connections_config):
@@ -121,4 +122,5 @@ def cli_connection_detail_parser(connections_config):
         }
     )

-    return connections_data
+    # Return, ensuring local is always first
+    return sorted(connections_data, key=lambda x: (x.get("name") != "local"))
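The sort key works because False orders before True and Python's sort is stable, so "local" moves to the front and every other connection keeps its existing order:

# Sketch only; not part of the diff.
connections_data = [{"name": "prod"}, {"name": "local"}, {"name": "dev"}]
ordered = sorted(connections_data, key=lambda x: (x.get("name") != "local"))
assert [c["name"] for c in ordered] == ["local", "prod", "dev"]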
@@ -19,6 +19,8 @@
 #
 ###############################################################################

+import sys
+
 from click import progressbar
 from time import sleep, time
@@ -105,7 +107,7 @@ def wait_for_celery_task(CLI_CONFIG, task_detail, start_late=False):

     # Start following the task state, updating progress as we go
     total_task = task_status.get("total")
-    with progressbar(length=total_task, show_eta=False) as bar:
+    with progressbar(length=total_task, width=20, show_eta=False) as bar:
         last_task = 0
         maxlen = 21
         echo(
@@ -115,30 +117,39 @@ def wait_for_celery_task(CLI_CONFIG, task_detail, start_late=False):
         )
         while True:
             sleep(0.5)
+
+            task_status = pvc.lib.common.task_status(
+                CLI_CONFIG, task_id=task_id, is_watching=True
+            )
+
             if isinstance(task_status, tuple):
                 continue
             if task_status.get("state") != "RUNNING":
                 break
-            if task_status.get("current") > last_task:
-                current_task = int(task_status.get("current"))
-                total_task = int(task_status.get("total"))
-                bar.length = total_task
-                bar.update(current_task - last_task)
-                last_task = current_task
-            # The extensive spaces at the end cause this to overwrite longer previous messages
+            if task_status.get("current") == 0:
+                continue
+
+            current_task = int(task_status.get("current"))
+            total_task = int(task_status.get("total"))
+            bar.length = total_task
+
+            if current_task > last_task:
+                bar.update(current_task - last_task)
+                last_task = current_task
+
             curlen = len(str(task_status.get("status")))
             if curlen > maxlen:
                 maxlen = curlen
             lendiff = maxlen - curlen
             overwrite_whitespace = " " * lendiff
-            echo(
-                CLI_CONFIG,
-                "  " + task_status.get("status") + overwrite_whitespace,
-                newline=False,
-            )
-            task_status = pvc.lib.common.task_status(
-                CLI_CONFIG, task_id=task_id, is_watching=True
-            )
+
+            percent_complete = (current_task / total_task) * 100
+            bar_output = f"[{bar.format_bar()}] {percent_complete:3.0f}%"
+            sys.stdout.write(
+                f"\r  {bar_output} {task_status['status']}{overwrite_whitespace}"
+            )
+            sys.stdout.flush()

         if task_status.get("state") == "SUCCESS":
             bar.update(total_task - last_task)
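The redraw technique above, in isolation: a carriage return moves the cursor back to column 0, and padding out to the longest status seen so far blots out any longer previous message. A runnable sketch (the status strings are made up):

# Sketch only; not part of the diff.
import sys
from time import sleep

maxlen = 0
for status in ["Creating snapshot", "Sending block device", "Done"]:
    maxlen = max(maxlen, len(status))
    sys.stdout.write("\r  " + status + " " * (maxlen - len(status)))
    sys.stdout.flush()
    sleep(0.5)
sys.stdout.write("\n")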
@@ -83,7 +83,7 @@ class UploadProgressBar(object):
         else:
             self.end_suffix = ""

-        self.bar = click.progressbar(length=self.length, show_eta=True)
+        self.bar = click.progressbar(length=self.length, width=20, show_eta=True)

     def update(self, monitor):
         bytes_cur = monitor.bytes_read
@@ -30,6 +30,7 @@ from requests_toolbelt.multipart.encoder import (

 import pvc.lib.ansiprint as ansiprint
 from pvc.lib.common import UploadProgressBar, call_api, get_wait_retdata
+from pvc.cli.helpers import MAX_CONTENT_WIDTH

 #
 # Supplemental functions
@@ -1724,15 +1725,17 @@ def format_list_snapshot(config, snapshot_list):
 #
 # Benchmark functions
 #
-def ceph_benchmark_run(config, pool, wait_flag):
+def ceph_benchmark_run(config, pool, name, wait_flag):
     """
     Run a storage benchmark against {pool}

     API endpoint: POST /api/v1/storage/ceph/benchmark
-    API arguments: pool={pool}
+    API arguments: pool={pool}, name={name}
     API schema: {message}
     """
     params = {"pool": pool}
+    if name:
+        params["name"] = name
     response = call_api(config, "post", "/storage/ceph/benchmark", params=params)

     return get_wait_retdata(response, wait_flag)
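Illustrative usage of the extended signature; "config" stands for a loaded CLI configuration, and the pool and job names are placeholders:

# Usage sketch only; not part of the diff.
retcode, retdata = ceph_benchmark_run(config, "vms", "nightly-baseline", True)
# With name=None the job falls back to the default (date-based) naming:
retcode, retdata = ceph_benchmark_run(config, "vms", None, True)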
@@ -1804,7 +1807,7 @@ def get_benchmark_list_results(benchmark_format, benchmark_data):
         benchmark_bandwidth, benchmark_iops = get_benchmark_list_results_legacy(
             benchmark_data
         )
-    elif benchmark_format == 1:
+    elif benchmark_format == 1 or benchmark_format == 2:
         benchmark_bandwidth, benchmark_iops = get_benchmark_list_results_json(
             benchmark_data
         )
@@ -2006,6 +2009,7 @@ def format_info_benchmark(config, benchmark_information):
     benchmark_matrix = {
         0: format_info_benchmark_legacy,
         1: format_info_benchmark_json,
+        2: format_info_benchmark_json,
     }

     benchmark_version = benchmark_information[0]["test_format"]
@@ -2340,12 +2344,15 @@ def format_info_benchmark_json(config, benchmark_information):
     if benchmark_information["benchmark_result"] == "Running":
         return "Benchmark test is still running."

+    benchmark_format = benchmark_information["test_format"]
     benchmark_details = benchmark_information["benchmark_result"]

     # Format a nice output; do this line-by-line then concat the elements at the end
     ainformation = []
     ainformation.append(
-        "{}Storage Benchmark details:{}".format(ansiprint.bold(), ansiprint.end())
+        "{}Storage Benchmark details (format {}):{}".format(
+            ansiprint.bold(), benchmark_format, ansiprint.end()
+        )
     )

     nice_test_name_map = {
@@ -2393,7 +2400,7 @@ def format_info_benchmark_json(config, benchmark_information):
             if element[1] != 0:
                 useful_latency_tree.append(element)

-    max_rows = 9
+    max_rows = 5
     if len(useful_latency_tree) > 9:
         max_rows = len(useful_latency_tree)
     elif len(useful_latency_tree) < 9:
@@ -2402,15 +2409,10 @@ def format_info_benchmark_json(config, benchmark_information):

     # Format the static data
     overall_label = [
-        "Overall BW/s:",
-        "Overall IOPS:",
-        "Total I/O:",
-        "Runtime (s):",
-        "User CPU %:",
-        "System CPU %:",
-        "Ctx Switches:",
-        "Major Faults:",
-        "Minor Faults:",
+        "BW/s:",
+        "IOPS:",
+        "I/O:",
+        "Time:",
     ]
     while len(overall_label) < max_rows:
         overall_label.append("")
@@ -2419,68 +2421,149 @@ def format_info_benchmark_json(config, benchmark_information):
         format_bytes_tohuman(int(job_details[io_class]["bw_bytes"])),
         format_ops_tohuman(int(job_details[io_class]["iops"])),
         format_bytes_tohuman(int(job_details[io_class]["io_bytes"])),
-        job_details["job_runtime"] / 1000,
-        job_details["usr_cpu"],
-        job_details["sys_cpu"],
-        job_details["ctx"],
-        job_details["majf"],
-        job_details["minf"],
+        str(job_details["job_runtime"] / 1000) + "s",
     ]
     while len(overall_data) < max_rows:
         overall_data.append("")

+    cpu_label = [
+        "Total:",
+        "User:",
+        "Sys:",
+        "OSD:",
+        "MON:",
+    ]
+    while len(cpu_label) < max_rows:
+        cpu_label.append("")
+
+    cpu_data = [
+        (
+            benchmark_details[test]["avg_cpu_util_percent"]["total"]
+            if benchmark_format > 1
+            else "N/A"
+        ),
+        round(job_details["usr_cpu"], 2),
+        round(job_details["sys_cpu"], 2),
+        (
+            benchmark_details[test]["avg_cpu_util_percent"]["ceph-osd"]
+            if benchmark_format > 1
+            else "N/A"
+        ),
+        (
+            benchmark_details[test]["avg_cpu_util_percent"]["ceph-mon"]
+            if benchmark_format > 1
+            else "N/A"
+        ),
+    ]
+    while len(cpu_data) < max_rows:
+        cpu_data.append("")
+
+    memory_label = [
+        "Total:",
+        "OSD:",
+        "MON:",
+    ]
+    while len(memory_label) < max_rows:
+        memory_label.append("")
+
+    memory_data = [
+        (
+            benchmark_details[test]["avg_memory_util_percent"]["total"]
+            if benchmark_format > 1
+            else "N/A"
+        ),
+        (
+            benchmark_details[test]["avg_memory_util_percent"]["ceph-osd"]
+            if benchmark_format > 1
+            else "N/A"
+        ),
+        (
+            benchmark_details[test]["avg_memory_util_percent"]["ceph-mon"]
+            if benchmark_format > 1
+            else "N/A"
+        ),
+    ]
+    while len(memory_data) < max_rows:
+        memory_data.append("")
+
+    network_label = [
+        "Total:",
+        "Sent:",
+        "Recv:",
+    ]
+    while len(network_label) < max_rows:
+        network_label.append("")
+
+    network_data = [
+        (
+            format_bytes_tohuman(
+                int(benchmark_details[test]["avg_network_util_bps"]["total"])
+            )
+            if benchmark_format > 1
+            else "N/A"
+        ),
+        (
+            format_bytes_tohuman(
+                int(benchmark_details[test]["avg_network_util_bps"]["sent"])
+            )
+            if benchmark_format > 1
+            else "N/A"
+        ),
+        (
+            format_bytes_tohuman(
+                int(benchmark_details[test]["avg_network_util_bps"]["recv"])
+            )
+            if benchmark_format > 1
+            else "N/A"
+        ),
+    ]
+    while len(network_data) < max_rows:
+        network_data.append("")
+
     bandwidth_label = [
         "Min:",
         "Max:",
         "Mean:",
         "StdDev:",
         "Samples:",
-        "",
-        "",
-        "",
-        "",
     ]
     while len(bandwidth_label) < max_rows:
         bandwidth_label.append("")

     bandwidth_data = [
-        format_bytes_tohuman(int(job_details[io_class]["bw_min"]) * 1024),
-        format_bytes_tohuman(int(job_details[io_class]["bw_max"]) * 1024),
-        format_bytes_tohuman(int(job_details[io_class]["bw_mean"]) * 1024),
-        format_bytes_tohuman(int(job_details[io_class]["bw_dev"]) * 1024),
-        job_details[io_class]["bw_samples"],
-        "",
-        "",
-        "",
-        "",
+        format_bytes_tohuman(int(job_details[io_class]["bw_min"]) * 1024)
+        + " / "
+        + format_ops_tohuman(int(job_details[io_class]["iops_min"])),
+        format_bytes_tohuman(int(job_details[io_class]["bw_max"]) * 1024)
+        + " / "
+        + format_ops_tohuman(int(job_details[io_class]["iops_max"])),
+        format_bytes_tohuman(int(job_details[io_class]["bw_mean"]) * 1024)
+        + " / "
+        + format_ops_tohuman(int(job_details[io_class]["iops_mean"])),
+        format_bytes_tohuman(int(job_details[io_class]["bw_dev"]) * 1024)
+        + " / "
+        + format_ops_tohuman(int(job_details[io_class]["iops_stddev"])),
+        str(job_details[io_class]["bw_samples"])
+        + " / "
+        + str(job_details[io_class]["iops_samples"]),
     ]
     while len(bandwidth_data) < max_rows:
         bandwidth_data.append("")

-    iops_data = [
-        format_ops_tohuman(int(job_details[io_class]["iops_min"])),
-        format_ops_tohuman(int(job_details[io_class]["iops_max"])),
-        format_ops_tohuman(int(job_details[io_class]["iops_mean"])),
-        format_ops_tohuman(int(job_details[io_class]["iops_stddev"])),
-        job_details[io_class]["iops_samples"],
-        "",
-        "",
-        "",
-        "",
-    ]
-    while len(iops_data) < max_rows:
-        iops_data.append("")
+    lat_label = [
+        "Min:",
+        "Max:",
+        "Mean:",
+        "StdDev:",
+    ]
+    while len(lat_label) < max_rows:
+        lat_label.append("")

     lat_data = [
         int(job_details[io_class]["lat_ns"]["min"]) / 1000,
         int(job_details[io_class]["lat_ns"]["max"]) / 1000,
         int(job_details[io_class]["lat_ns"]["mean"]) / 1000,
         int(job_details[io_class]["lat_ns"]["stddev"]) / 1000,
-        "",
-        "",
-        "",
-        "",
-        "",
     ]
     while len(lat_data) < max_rows:
         lat_data.append("")
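A sketch of how the merged "Bandwidth / IOPS" cells above are assembled, with stand-in numbers; fio-style bw_* values are assumed to be in KiB/s, which is why the real code multiplies by 1024 before humanizing:

# Sketch only; not part of the diff. to_mib_s() stands in for format_bytes_tohuman().
job = {"bw_min": 102400, "iops_min": 25600}

def to_mib_s(bw_kib):
    return f"{bw_kib / 1024:.0f}MiB/s"

cell = f"{to_mib_s(job['bw_min'])} / {job['iops_min']}"
assert cell == "100MiB/s / 25600"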
@@ -2489,98 +2572,119 @@ def format_info_benchmark_json(config, benchmark_information):
     lat_bucket_label = list()
     lat_bucket_data = list()
     for element in useful_latency_tree:
-        lat_bucket_label.append(element[0])
-        lat_bucket_data.append(element[1])
+        lat_bucket_label.append(element[0] + ":" if element[0] else "")
+        lat_bucket_data.append(round(float(element[1]), 2) if element[1] else "")
+    while len(lat_bucket_label) < max_rows:
+        lat_bucket_label.append("")
+    while len(lat_bucket_data) < max_rows:
+        lat_bucket_data.append("")

     # Column default widths
-    overall_label_length = 0
+    overall_label_length = 5
     overall_column_length = 0
-    bandwidth_label_length = 0
-    bandwidth_column_length = 11
-    iops_column_length = 4
-    latency_column_length = 12
+    cpu_label_length = 6
+    cpu_column_length = 0
+    memory_label_length = 6
+    memory_column_length = 0
+    network_label_length = 6
+    network_column_length = 6
+    bandwidth_label_length = 8
+    bandwidth_column_length = 0
+    latency_label_length = 7
+    latency_column_length = 0
     latency_bucket_label_length = 0
+    latency_bucket_column_length = 0

     # Column layout:
-    #    General      Bandwidth    IOPS       Latency    Percentiles
-    #    ---------    ----------   --------   --------   ---------------
-    #    Size         Min          Min        Min        A
-    #    BW           Max          Max        Max        B
-    #    IOPS         Mean         Mean       Mean       ...
-    #    Runtime      StdDev       StdDev     StdDev     Z
-    #    UsrCPU       Samples      Samples
-    #    SysCPU
-    #    CtxSw
-    #    MajFault
-    #    MinFault
+    #    Overall      CPU      Memory     Network     Bandwidth/IOPS    Latency    Percentiles
+    #    ---------    -----    -------    --------    --------------    --------   ---------------
+    #    BW           Total    Total      Total       Min               Min        A
+    #    IOPS         Usr      OSD        Send        Max               Max        B
+    #    Time         Sys      MON        Recv        Mean              Mean       ...
+    #    Size         OSD                             StdDev            StdDev     Z
+    #                 MON                             Samples

     # Set column widths
-    for item in overall_label:
-        _item_length = len(str(item))
-        if _item_length > overall_label_length:
-            overall_label_length = _item_length
-
     for item in overall_data:
         _item_length = len(str(item))
         if _item_length > overall_column_length:
             overall_column_length = _item_length

-    test_name_length = len(nice_test_name_map[test])
-    if test_name_length > overall_label_length + overall_column_length:
-        _diff = test_name_length - (overall_label_length + overall_column_length)
-        overall_column_length += _diff
+    for item in cpu_data:
+        _item_length = len(str(item))
+        if _item_length > cpu_column_length:
+            cpu_column_length = _item_length

-    for item in bandwidth_label:
+    for item in memory_data:
         _item_length = len(str(item))
-        if _item_length > bandwidth_label_length:
-            bandwidth_label_length = _item_length
+        if _item_length > memory_column_length:
+            memory_column_length = _item_length
+
+    for item in network_data:
+        _item_length = len(str(item))
+        if _item_length > network_column_length:
+            network_column_length = _item_length

     for item in bandwidth_data:
         _item_length = len(str(item))
         if _item_length > bandwidth_column_length:
             bandwidth_column_length = _item_length

-    for item in iops_data:
-        _item_length = len(str(item))
-        if _item_length > iops_column_length:
-            iops_column_length = _item_length
-
     for item in lat_data:
         _item_length = len(str(item))
         if _item_length > latency_column_length:
             latency_column_length = _item_length

-    for item in lat_bucket_label:
+    for item in lat_bucket_data:
         _item_length = len(str(item))
-        if _item_length > latency_bucket_label_length:
-            latency_bucket_label_length = _item_length
+        if _item_length > latency_bucket_column_length:
+            latency_bucket_column_length = _item_length

     # Top row (Headers)
     ainformation.append(
-        "{bold}\
-{overall_label: <{overall_label_length}} \
-{bandwidth_label: <{bandwidth_label_length}} \
-{bandwidth: <{bandwidth_length}} \
-{iops: <{iops_length}} \
-{latency: <{latency_length}} \
-{latency_bucket_label: <{latency_bucket_label_length}} \
-{latency_bucket} \
-{end_bold}".format(
+        "{bold}{overall_label: <{overall_label_length}} {header_fill}{end_bold}".format(
             bold=ansiprint.bold(),
             end_bold=ansiprint.end(),
             overall_label=nice_test_name_map[test],
             overall_label_length=overall_label_length,
-            bandwidth_label="",
-            bandwidth_label_length=bandwidth_label_length,
-            bandwidth="Bandwidth/s",
-            bandwidth_length=bandwidth_column_length,
-            iops="IOPS",
-            iops_length=iops_column_length,
-            latency="Latency (μs)",
-            latency_length=latency_column_length,
-            latency_bucket_label="Latency Buckets (μs/%)",
-            latency_bucket_label_length=latency_bucket_label_length,
-            latency_bucket="",
+            header_fill="-"
+            * (
+                (MAX_CONTENT_WIDTH if MAX_CONTENT_WIDTH <= 120 else 120)
+                - len(nice_test_name_map[test])
+                - 4
+            ),
         )
     )
+
+    ainformation.append(
+        "{bold}\
+{overall_label: <{overall_label_length}} \
+{cpu_label: <{cpu_label_length}} \
+{memory_label: <{memory_label_length}} \
+{network_label: <{network_label_length}} \
+{bandwidth_label: <{bandwidth_label_length}} \
+{latency_label: <{latency_label_length}} \
+{latency_bucket_label: <{latency_bucket_label_length}}\
+{end_bold}".format(
+            bold=ansiprint.bold(),
+            end_bold=ansiprint.end(),
+            overall_label="Overall",
+            overall_label_length=overall_label_length + overall_column_length + 1,
+            cpu_label="CPU (%)",
+            cpu_label_length=cpu_label_length + cpu_column_length + 1,
+            memory_label="Memory (%)",
+            memory_label_length=memory_label_length + memory_column_length + 1,
+            network_label="Network (bps)",
+            network_label_length=network_label_length + network_column_length + 1,
+            bandwidth_label="Bandwidth / IOPS",
+            bandwidth_label_length=bandwidth_label_length
+            + bandwidth_column_length
+            + 1,
+            latency_label="Latency (μs)",
+            latency_label_length=latency_label_length + latency_column_length + 1,
+            latency_bucket_label="Buckets (μs/%)",
+            latency_bucket_label_length=latency_bucket_label_length
+            + latency_bucket_column_length,
+        )
+    )
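The divider math in the first header row, as a standalone sketch with a stand-in width (the real MAX_CONTENT_WIDTH comes from pvc.cli.helpers):

# Sketch only; not part of the diff.
MAX_CONTENT_WIDTH = 100  # stand-in value
test_name = "Sequential Read"  # stand-in test name
fill = "-" * (
    (MAX_CONTENT_WIDTH if MAX_CONTENT_WIDTH <= 120 else 120) - len(test_name) - 4
)
print(f"{test_name} {fill}")  # dashes pad the rule out to at most 120 columns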
@@ -2588,13 +2692,19 @@ def format_info_benchmark_json(config, benchmark_information):
         # Top row (Headers)
         ainformation.append(
             "{bold}\
-{overall_label: >{overall_label_length}} \
+{overall_label: <{overall_label_length}} \
 {overall: <{overall_length}} \
-{bandwidth_label: >{bandwidth_label_length}} \
+{cpu_label: <{cpu_label_length}} \
+{cpu: <{cpu_length}} \
+{memory_label: <{memory_label_length}} \
+{memory: <{memory_length}} \
+{network_label: <{network_label_length}} \
+{network: <{network_length}} \
+{bandwidth_label: <{bandwidth_label_length}} \
 {bandwidth: <{bandwidth_length}} \
-{iops: <{iops_length}} \
+{latency_label: <{latency_label_length}} \
 {latency: <{latency_length}} \
-{latency_bucket_label: >{latency_bucket_label_length}} \
+{latency_bucket_label: <{latency_bucket_label_length}} \
 {latency_bucket}\
 {end_bold}".format(
                 bold="",

@@ -2603,12 +2713,24 @@ def format_info_benchmark_json(config, benchmark_information):
                 overall_label_length=overall_label_length,
                 overall=overall_data[idx],
                 overall_length=overall_column_length,
+                cpu_label=cpu_label[idx],
+                cpu_label_length=cpu_label_length,
+                cpu=cpu_data[idx],
+                cpu_length=cpu_column_length,
+                memory_label=memory_label[idx],
+                memory_label_length=memory_label_length,
+                memory=memory_data[idx],
+                memory_length=memory_column_length,
+                network_label=network_label[idx],
+                network_label_length=network_label_length,
+                network=network_data[idx],
+                network_length=network_column_length,
                 bandwidth_label=bandwidth_label[idx],
                 bandwidth_label_length=bandwidth_label_length,
                 bandwidth=bandwidth_data[idx],
                 bandwidth_length=bandwidth_column_length,
-                iops=iops_data[idx],
-                iops_length=iops_column_length,
+                latency_label=lat_label[idx],
+                latency_label_length=latency_label_length,
                 latency=lat_data[idx],
                 latency_length=latency_column_length,
                 latency_bucket_label=lat_bucket_label[idx],

@@ -2617,4 +2739,4 @@ def format_info_benchmark_json(config, benchmark_information):
             )
         )

-    return "\n".join(ainformation)
+    return "\n".join(ainformation) + "\n"
@@ -383,8 +383,8 @@ def vm_state(config, vm, target_state, force=False, wait=False):
     """
     params = {
         "state": target_state,
-        "force": str(force).lower(),
-        "wait": str(wait).lower(),
+        "force": force,
+        "wait": wait,
     }
     response = call_api(config, "post", "/vm/{vm}/state".format(vm=vm), params=params)
@@ -595,6 +595,107 @@ def vm_import_snapshot(
     return get_wait_retdata(response, wait_flag)


+def vm_send_snapshot(
+    config,
+    vm,
+    snapshot_name,
+    destination_api_uri,
+    destination_api_key,
+    destination_api_verify_ssl=True,
+    destination_storage_pool=None,
+    incremental_parent=None,
+    wait_flag=True,
+):
+    """
+    Send an (existing) snapshot of a VM's disks and configuration to a destination PVC cluster, optionally
+    incremental with incremental_parent
+
+    API endpoint: POST /vm/{vm}/snapshot/send
+    API arguments: snapshot_name=snapshot_name, destination_api_uri=destination_api_uri, destination_api_key=destination_api_key, destination_api_verify_ssl=destination_api_verify_ssl, incremental_parent=incremental_parent, destination_storage_pool=destination_storage_pool
+    API schema: {"message":"{data}"}
+    """
+    params = {
+        "snapshot_name": snapshot_name,
+        "destination_api_uri": destination_api_uri,
+        "destination_api_key": destination_api_key,
+        "destination_api_verify_ssl": destination_api_verify_ssl,
+    }
+    if destination_storage_pool is not None:
+        params["destination_storage_pool"] = destination_storage_pool
+    if incremental_parent is not None:
+        params["incremental_parent"] = incremental_parent
+
+    response = call_api(
+        config, "post", "/vm/{vm}/snapshot/send".format(vm=vm), params=params
+    )
+
+    return get_wait_retdata(response, wait_flag)
+
+
+def vm_create_mirror(
+    config,
+    vm,
+    destination_api_uri,
+    destination_api_key,
+    destination_api_verify_ssl=True,
+    destination_storage_pool=None,
+    wait_flag=True,
+):
+    """
+    Create a new snapshot and send the snapshot to a destination PVC cluster, with automatic incremental handling
+
+    API endpoint: POST /vm/{vm}/mirror/create
+    API arguments: destination_api_uri=destination_api_uri, destination_api_key=destination_api_key, destination_api_verify_ssl=destination_api_verify_ssl, destination_storage_pool=destination_storage_pool
+    API schema: {"message":"{data}"}
+    """
+    params = {
+        "destination_api_uri": destination_api_uri,
+        "destination_api_key": destination_api_key,
+        "destination_api_verify_ssl": destination_api_verify_ssl,
+    }
+    if destination_storage_pool is not None:
+        params["destination_storage_pool"] = destination_storage_pool
+
+    response = call_api(
+        config, "post", "/vm/{vm}/mirror/create".format(vm=vm), params=params
+    )
+
+    return get_wait_retdata(response, wait_flag)
+
+
+def vm_promote_mirror(
+    config,
+    vm,
+    destination_api_uri,
+    destination_api_key,
+    destination_api_verify_ssl=True,
+    destination_storage_pool=None,
+    remove_on_source=False,
+    wait_flag=True,
+):
+    """
+    Shut down a VM, create a new snapshot, send the snapshot to a destination PVC cluster, start the VM on the remote cluster, and optionally remove the local VM, with automatic incremental handling
+
+    API endpoint: POST /vm/{vm}/mirror/promote
+    API arguments: destination_api_uri=destination_api_uri, destination_api_key=destination_api_key, destination_api_verify_ssl=destination_api_verify_ssl, destination_storage_pool=destination_storage_pool, remove_on_source=remove_on_source
+    API schema: {"message":"{data}"}
+    """
+    params = {
+        "destination_api_uri": destination_api_uri,
+        "destination_api_key": destination_api_key,
+        "destination_api_verify_ssl": destination_api_verify_ssl,
+        "remove_on_source": remove_on_source,
+    }
+    if destination_storage_pool is not None:
+        params["destination_storage_pool"] = destination_storage_pool
+
+    response = call_api(
+        config, "post", "/vm/{vm}/mirror/promote".format(vm=vm), params=params
+    )
+
+    return get_wait_retdata(response, wait_flag)
+
+
 def vm_autobackup(config, email_recipients=None, force_full_flag=False, wait_flag=True):
     """
     Perform a cluster VM autobackup
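Note: together these three client helpers wrap the new snapshot-transfer endpoints. A minimal usage sketch follows; the destination URI, API key, VM, and snapshot names are hypothetical placeholders, and `config` is whatever connection dict `call_api` normally receives.

```python
# Illustrative only: send one snapshot, then maintain and promote a mirror.
retvalue, retdata = vm_send_snapshot(
    config,
    "myvm",
    "snap_20241101",
    "https://dest-cluster.example.com:7370/api/v1",
    "destination-api-key",
    incremental_parent="snap_20241001",  # transfer only the delta since this parent
)

# Refresh the remote copy periodically...
retvalue, retdata = vm_create_mirror(
    config,
    "myvm",
    "https://dest-cluster.example.com:7370/api/v1",
    "destination-api-key",
)

# ...then, in a maintenance window, flip the VM to the remote cluster.
retvalue, retdata = vm_promote_mirror(
    config,
    "myvm",
    "https://dest-cluster.example.com:7370/api/v1",
    "destination-api-key",
    remove_on_source=True,  # drop the local copy once the remote side is active
)
```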
@@ -1760,6 +1861,7 @@ def format_info(config, domain_information, long_output):
         "provision": ansiprint.blue(),
         "restore": ansiprint.blue(),
         "import": ansiprint.blue(),
+        "mirror": ansiprint.purple(),
     }
     ainformation.append(
         "{}State:{} {}{}{}".format(
@@ -2269,16 +2371,14 @@ def format_list(config, vm_list):

     # Format the string (elements)
     for domain_information in sorted(vm_list, key=lambda v: v["name"]):
-        if domain_information["state"] == "start":
+        if domain_information["state"] in ["start"]:
             vm_state_colour = ansiprint.green()
-        elif domain_information["state"] == "restart":
+        elif domain_information["state"] in ["restart", "shutdown"]:
             vm_state_colour = ansiprint.yellow()
-        elif domain_information["state"] == "shutdown":
-            vm_state_colour = ansiprint.yellow()
-        elif domain_information["state"] == "stop":
-            vm_state_colour = ansiprint.red()
-        elif domain_information["state"] == "fail":
+        elif domain_information["state"] in ["stop", "fail"]:
             vm_state_colour = ansiprint.red()
+        elif domain_information["state"] in ["mirror"]:
+            vm_state_colour = ansiprint.purple()
         else:
             vm_state_colour = ansiprint.blue()
@@ -2302,8 +2402,10 @@ def format_list(config, vm_list):
         else:
             net_invalid_list.append(False)

+        display_net_string_list = []
         net_string_list = []
         for net_idx, net_vni in enumerate(net_list):
+            display_net_string_list.append(net_vni)
             if net_invalid_list[net_idx]:
                 net_string_list.append(
                     "{}{}{}".format(
@@ -2312,9 +2414,6 @@ def format_list(config, vm_list):
                         ansiprint.end(),
                     )
                 )
-                # Fix the length due to the extra fake characters
-                vm_nets_length -= len(net_vni)
-                vm_nets_length += len(net_string_list[net_idx])
             else:
                 net_string_list.append(net_vni)
@@ -2331,7 +2430,9 @@ def format_list(config, vm_list):
             vm_state_length=vm_state_length,
             vm_tags_length=vm_tags_length,
             vm_snapshots_length=vm_snapshots_length,
-            vm_nets_length=vm_nets_length,
+            vm_nets_length=vm_nets_length
+            + len(",".join(net_string_list))
+            - len(",".join(display_net_string_list)),
             vm_ram_length=vm_ram_length,
             vm_vcpu_length=vm_vcpu_length,
             vm_node_length=vm_node_length,
@@ -2344,7 +2445,8 @@ def format_list(config, vm_list):
             vm_state=domain_information["state"],
             vm_tags=",".join(tag_list),
             vm_snapshots=len(domain_information.get("snapshots", list())),
-            vm_networks=",".join(net_string_list),
+            vm_networks=",".join(net_string_list)
+            + ("" if all(net_invalid_list) else " "),
             vm_memory=domain_information["memory"],
             vm_vcpu=domain_information["vcpu"],
             vm_node=domain_information["node"],
@@ -2,7 +2,7 @@ from setuptools import setup

 setup(
     name="pvc",
-    version="0.9.99",
+    version="0.9.103",
     packages=["pvc.cli", "pvc.lib"],
     install_requires=[
         "Click",
@@ -19,31 +19,34 @@
 #
 ###############################################################################

+import os
+import psutil
 import psycopg2
 import psycopg2.extras
+import subprocess

 from datetime import datetime
 from json import loads, dumps
+from time import sleep

 from daemon_lib.celery import start, fail, log_info, update, finish

-import daemon_lib.common as pvc_common
 import daemon_lib.ceph as pvc_ceph


 # Define the current test format
-TEST_FORMAT = 1
+TEST_FORMAT = 2


 # We run a total of 8 tests, to give a generalized idea of performance on the cluster:
-# 1. A sequential read test of 8GB with a 4M block size
-# 2. A sequential write test of 8GB with a 4M block size
-# 3. A random read test of 8GB with a 4M block size
-# 4. A random write test of 8GB with a 4M block size
-# 5. A random read test of 8GB with a 256k block size
-# 6. A random write test of 8GB with a 256k block size
-# 7. A random read test of 8GB with a 4k block size
-# 8. A random write test of 8GB with a 4k block size
+# 1. A sequential read test of 64GB with a 4M block size
+# 2. A sequential write test of 64GB with a 4M block size
+# 3. A random read test of 64GB with a 4M block size
+# 4. A random write test of 64GB with a 4M block size
+# 5. A random read test of 64GB with a 256k block size
+# 6. A random write test of 64GB with a 256k block size
+# 7. A random read test of 64GB with a 4k block size
+# 8. A random write test of 64GB with a 4k block size
 # Taken together, these 8 results should give a very good indication of the overall storage performance
 # for a variety of workloads.
 test_matrix = {
@@ -100,7 +103,7 @@ test_matrix = {

 # Specify the benchmark volume name and size
 benchmark_volume_name = "pvcbenchmark"
-benchmark_volume_size = "8G"
+benchmark_volume_size = "64G"


 #
@@ -226,7 +229,7 @@ def cleanup_benchmark_volume(


 def run_benchmark_job(
-    test, pool, job_name=None, db_conn=None, db_cur=None, zkhandler=None
+    config, test, pool, job_name=None, db_conn=None, db_cur=None, zkhandler=None
 ):
     test_spec = test_matrix[test]
     log_info(None, f"Running test '{test}'")
@@ -256,31 +259,165 @@ def run_benchmark_job(
     )

     log_info(None, "Running fio job: {}".format(" ".join(fio_cmd.split())))
-    retcode, stdout, stderr = pvc_common.run_os_command(fio_cmd)
+
+    # Run the fio command manually instead of using our run_os_command wrapper
+    # This will help us gather statistics about this node while it's running
+    process = subprocess.Popen(
+        fio_cmd.split(),
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        text=True,
+    )
+
+    # Wait 15 seconds for the test to start
+    log_info(None, "Waiting 15 seconds for test resource stabilization")
+    sleep(15)
+
+    # Set up function to get process CPU utilization by name
+    def get_cpu_utilization_by_name(process_name):
+        cpu_usage = 0
+        for proc in psutil.process_iter(["name", "cpu_percent"]):
+            if proc.info["name"] == process_name:
+                cpu_usage += proc.info["cpu_percent"]
+        return cpu_usage
+
+    # Set up function to get process memory utilization by name
+    def get_memory_utilization_by_name(process_name):
+        memory_usage = 0
+        for proc in psutil.process_iter(["name", "memory_percent"]):
+            if proc.info["name"] == process_name:
+                memory_usage += proc.info["memory_percent"]
+        return memory_usage
+
+    # Set up function to get network traffic utilization in bps
+    def get_network_traffic_bps(interface, duration=1):
+        # Get initial network counters
+        net_io_start = psutil.net_io_counters(pernic=True)
+        if interface not in net_io_start:
+            return None, None
+
+        stats_start = net_io_start[interface]
+        bytes_sent_start = stats_start.bytes_sent
+        bytes_recv_start = stats_start.bytes_recv
+
+        # Wait for the specified duration
+        sleep(duration)
+
+        # Get final network counters
+        net_io_end = psutil.net_io_counters(pernic=True)
+        stats_end = net_io_end[interface]
+        bytes_sent_end = stats_end.bytes_sent
+        bytes_recv_end = stats_end.bytes_recv
+
+        # Calculate bytes per second
+        bytes_sent_per_sec = (bytes_sent_end - bytes_sent_start) / duration
+        bytes_recv_per_sec = (bytes_recv_end - bytes_recv_start) / duration
+
+        # Convert to bits per second (bps)
+        bits_sent_per_sec = bytes_sent_per_sec * 8
+        bits_recv_per_sec = bytes_recv_per_sec * 8
+        bits_total_per_sec = bits_sent_per_sec + bits_recv_per_sec
+
+        return bits_sent_per_sec, bits_recv_per_sec, bits_total_per_sec
+
+    log_info(None, f"Starting system resource polling for test '{test}'")
+    storage_interface = config["storage_dev"]
+    total_cpus = psutil.cpu_count(logical=True)
+    ticks = 1
+    osd_cpu_utilization = 0
+    osd_memory_utilization = 0
+    mon_cpu_utilization = 0
+    mon_memory_utilization = 0
+    total_cpu_utilization = 0
+    total_memory_utilization = 0
+    storage_sent_bps = 0
+    storage_recv_bps = 0
+    storage_total_bps = 0
+
+    while process.poll() is None:
+        # Do collection of statistics like network bandwidth and cpu utilization
+        current_osd_cpu_utilization = get_cpu_utilization_by_name("ceph-osd")
+        current_osd_memory_utilization = get_memory_utilization_by_name("ceph-osd")
+        current_mon_cpu_utilization = get_cpu_utilization_by_name("ceph-mon")
+        current_mon_memory_utilization = get_memory_utilization_by_name("ceph-mon")
+        current_total_cpu_utilization = psutil.cpu_percent(interval=1)
+        current_total_memory_utilization = psutil.virtual_memory().percent
+        (
+            current_storage_sent_bps,
+            current_storage_recv_bps,
+            current_storage_total_bps,
+        ) = get_network_traffic_bps(storage_interface)
+        # Recheck if the process is done yet; if it's not, we add the values and increase the ticks
+        # This helps ensure that if the process finishes earlier than the longer polls above,
+        # this particular tick isn't counted which can skew the average
+        if process.poll() is None:
+            osd_cpu_utilization += current_osd_cpu_utilization
+            osd_memory_utilization += current_osd_memory_utilization
+            mon_cpu_utilization += current_mon_cpu_utilization
+            mon_memory_utilization += current_mon_memory_utilization
+            total_cpu_utilization += current_total_cpu_utilization
+            total_memory_utilization += current_total_memory_utilization
+            storage_sent_bps += current_storage_sent_bps
+            storage_recv_bps += current_storage_recv_bps
+            storage_total_bps += current_storage_total_bps
+            ticks += 1
+
+    # Get the 1-minute load average and CPU utilization, which covers the test duration
+    load1, _, _ = os.getloadavg()
+    load1 = round(load1, 2)
+
+    # Calculate the average CPU utilization values over the runtime
+    # Divide the OSD and MON CPU utilization by the total number of CPU cores, because
+    # the total is divided this way
+    avg_osd_cpu_utilization = round(osd_cpu_utilization / ticks / total_cpus, 2)
+    avg_osd_memory_utilization = round(osd_memory_utilization / ticks, 2)
+    avg_mon_cpu_utilization = round(mon_cpu_utilization / ticks / total_cpus, 2)
+    avg_mon_memory_utilization = round(mon_memory_utilization / ticks, 2)
+    avg_total_cpu_utilization = round(total_cpu_utilization / ticks, 2)
+    avg_total_memory_utilization = round(total_memory_utilization / ticks, 2)
+    avg_storage_sent_bps = round(storage_sent_bps / ticks, 2)
+    avg_storage_recv_bps = round(storage_recv_bps / ticks, 2)
+    avg_storage_total_bps = round(storage_total_bps / ticks, 2)
+
+    stdout, stderr = process.communicate()
+    retcode = process.returncode
+
+    resource_data = {
+        "avg_cpu_util_percent": {
+            "total": avg_total_cpu_utilization,
+            "ceph-mon": avg_mon_cpu_utilization,
+            "ceph-osd": avg_osd_cpu_utilization,
+        },
+        "avg_memory_util_percent": {
+            "total": avg_total_memory_utilization,
+            "ceph-mon": avg_mon_memory_utilization,
+            "ceph-osd": avg_osd_memory_utilization,
+        },
+        "avg_network_util_bps": {
+            "sent": avg_storage_sent_bps,
+            "recv": avg_storage_recv_bps,
+            "total": avg_storage_total_bps,
+        },
+    }
+
     try:
         jstdout = loads(stdout)
         if retcode:
             raise
     except Exception:
-        cleanup(
-            job_name,
-            db_conn=db_conn,
-            db_cur=db_cur,
-            zkhandler=zkhandler,
-        )
-        fail(
-            None,
-            f"Failed to run fio test '{test}': {stderr}",
-        )
+        return None, None

-    return jstdout
+    return resource_data, jstdout


-def worker_run_benchmark(zkhandler, celery, config, pool):
+def worker_run_benchmark(zkhandler, celery, config, pool, name):
     # Phase 0 - connect to databases
-    cur_time = datetime.now().isoformat(timespec="seconds")
-    cur_primary = zkhandler.read("base.config.primary_node")
-    job_name = f"{cur_time}_{cur_primary}"
+    if not name:
+        cur_time = datetime.now().isoformat(timespec="seconds")
+        cur_primary = zkhandler.read("base.config.primary_node")
+        job_name = f"{cur_time}_{cur_primary}"
+    else:
+        job_name = name

     current_stage = 0
     total_stages = 13
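A note on the normalization in the averages above: system-wide `psutil.cpu_percent()` is already scaled to 0-100 for the whole machine, while per-process `cpu_percent` values are relative to a single core, so their sum can approach 100 x N on an N-core host. Dividing the summed OSD/MON figures by the logical core count puts both series on the same scale. A small illustrative check (not part of the change itself):

```python
import psutil

# Per-process cpu_percent is per-core, so the sum over all processes can
# approach 100 * core_count; dividing by the logical core count makes it
# comparable to the machine-wide psutil.cpu_percent() value.
total_cpus = psutil.cpu_count(logical=True)
per_process_sum = sum(
    p.info["cpu_percent"] or 0 for p in psutil.process_iter(["cpu_percent"])
)
normalized = per_process_sum / total_cpus
print(f"{per_process_sum:.1f}% summed over {total_cpus} cores is about {normalized:.1f}% of the machine")
```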
@@ -358,7 +495,8 @@ def worker_run_benchmark(zkhandler, celery, config, pool):
             total=total_stages,
         )

-        results[test] = run_benchmark_job(
+        resource_data, fio_data = run_benchmark_job(
+            config,
             test,
             pool,
             job_name=job_name,
@@ -366,6 +504,25 @@ def worker_run_benchmark(zkhandler, celery, config, pool):
             db_cur=db_cur,
             zkhandler=zkhandler,
         )
+        if resource_data is None or fio_data is None:
+            cleanup_benchmark_volume(
+                pool,
+                job_name=job_name,
+                db_conn=db_conn,
+                db_cur=db_cur,
+                zkhandler=zkhandler,
+            )
+            cleanup(
+                job_name,
+                db_conn=db_conn,
+                db_cur=db_cur,
+                zkhandler=zkhandler,
+            )
+            fail(
+                None,
+                f"Failed to run fio test '{test}'",
+            )
+        results[test] = {**resource_data, **fio_data}

     # Phase 3 - cleanup
     current_stage += 1
@@ -560,7 +560,21 @@ def getVolumeInformation(zkhandler, pool, volume):
     return volume_information


-def add_volume(zkhandler, pool, name, size, force_flag=False):
+def scan_volume(zkhandler, pool, name):
+    retcode, stdout, stderr = common.run_os_command(
+        "rbd info --format json {}/{}".format(pool, name)
+    )
+    volstats = stdout
+
+    # 3. Add the new volume to Zookeeper
+    zkhandler.write(
+        [
+            (("volume.stats", f"{pool}/{name}"), volstats),
+        ]
+    )
+
+
+def add_volume(zkhandler, pool, name, size, force_flag=False, zk_only=False):
     # 1. Verify the size of the volume
     pool_information = getPoolInformation(zkhandler, pool)
     size_bytes = format_bytes_fromhuman(size)
|
||||||
)
|
)
|
||||||
|
|
||||||
# 2. Create the volume
|
# 2. Create the volume
|
||||||
|
# zk_only flag skips actually creating the volume - this would be done by some other mechanism
|
||||||
|
if not zk_only:
|
||||||
retcode, stdout, stderr = common.run_os_command(
|
retcode, stdout, stderr = common.run_os_command(
|
||||||
"rbd create --size {}B {}/{}".format(size_bytes, pool, name)
|
"rbd create --size {}B {}/{}".format(size_bytes, pool, name)
|
||||||
)
|
)
|
||||||
if retcode:
|
if retcode:
|
||||||
return False, 'ERROR: Failed to create RBD volume "{}": {}'.format(name, stderr)
|
return False, 'ERROR: Failed to create RBD volume "{}": {}'.format(
|
||||||
|
name, stderr
|
||||||
# 2. Get volume stats
|
|
||||||
retcode, stdout, stderr = common.run_os_command(
|
|
||||||
"rbd info --format json {}/{}".format(pool, name)
|
|
||||||
)
|
)
|
||||||
volstats = stdout
|
|
||||||
|
|
||||||
# 3. Add the new volume to Zookeeper
|
# 3. Add the new volume to Zookeeper
|
||||||
zkhandler.write(
|
zkhandler.write(
|
||||||
[
|
[
|
||||||
(("volume", f"{pool}/{name}"), ""),
|
(("volume", f"{pool}/{name}"), ""),
|
||||||
(("volume.stats", f"{pool}/{name}"), volstats),
|
(("volume.stats", f"{pool}/{name}"), ""),
|
||||||
(("snapshot", f"{pool}/{name}"), ""),
|
(("snapshot", f"{pool}/{name}"), ""),
|
||||||
]
|
]
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# 4. Scan the volume stats
|
||||||
|
scan_volume(zkhandler, pool, name)
|
||||||
|
|
||||||
return True, 'Created RBD volume "{}" of size "{}" in pool "{}".'.format(
|
return True, 'Created RBD volume "{}" of size "{}" in pool "{}".'.format(
|
||||||
name, format_bytes_tohuman(size_bytes), pool
|
name, format_bytes_tohuman(size_bytes), pool
|
||||||
)
|
)
|
||||||
|
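The `zk_only` flag and the standalone `scan_volume()` together let a volume that was created outside this code path (for example, an RBD image written directly by a snapshot receive) be registered and have its stats refreshed. A hedged sketch of both uses, with hypothetical pool and volume names:

```python
# Register an RBD image created by some other mechanism: zk_only=True skips
# "rbd create" but still writes the Zookeeper keys and scans the stats.
retflag, retmsg = add_volume(zkhandler, "vms", "myvm_disk0", "20G", zk_only=True)

# Later, refresh the cached "rbd info" stats after an out-of-band change.
scan_volume(zkhandler, "vms", "myvm_disk0")
```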
@@ -662,21 +677,18 @@ def clone_volume(zkhandler, pool, name_src, name_new, force_flag=False):
         ),
     )

-    # 3. Get volume stats
-    retcode, stdout, stderr = common.run_os_command(
-        "rbd info --format json {}/{}".format(pool, name_new)
-    )
-    volstats = stdout
-
-    # 4. Add the new volume to Zookeeper
+    # 3. Add the new volume to Zookeeper
     zkhandler.write(
         [
             (("volume", f"{pool}/{name_new}"), ""),
-            (("volume.stats", f"{pool}/{name_new}"), volstats),
+            (("volume.stats", f"{pool}/{name_new}"), ""),
             (("snapshot", f"{pool}/{name_new}"), ""),
         ]
     )

+    # 4. Scan the volume stats
+    scan_volume(zkhandler, pool, name_new)
+
     return True, 'Cloned RBD volume "{}" to "{}" in pool "{}"'.format(
         name_src, name_new, pool
     )
@@ -761,20 +773,8 @@ def resize_volume(zkhandler, pool, name, size, force_flag=False):
     except Exception:
         pass

-    # 4. Get volume stats
-    retcode, stdout, stderr = common.run_os_command(
-        "rbd info --format json {}/{}".format(pool, name)
-    )
-    volstats = stdout
-
-    # 5. Update the volume in Zookeeper
-    zkhandler.write(
-        [
-            (("volume", f"{pool}/{name}"), ""),
-            (("volume.stats", f"{pool}/{name}"), volstats),
-            (("snapshot", f"{pool}/{name}"), ""),
-        ]
-    )
+    # 4. Scan the volume stats
+    scan_volume(zkhandler, pool, name)

     return True, 'Resized RBD volume "{}" to size "{}" in pool "{}".'.format(
         name, format_bytes_tohuman(size_bytes), pool
@@ -807,18 +807,8 @@ def rename_volume(zkhandler, pool, name, new_name):
         ]
     )

-    # 3. Get volume stats
-    retcode, stdout, stderr = common.run_os_command(
-        "rbd info --format json {}/{}".format(pool, new_name)
-    )
-    volstats = stdout
-
-    # 4. Update the volume stats in Zookeeper
-    zkhandler.write(
-        [
-            (("volume.stats", f"{pool}/{new_name}"), volstats),
-        ]
-    )
+    # 3. Scan the volume stats
+    scan_volume(zkhandler, pool, new_name)

     return True, 'Renamed RBD volume "{}" to "{}" in pool "{}".'.format(
         name, new_name, pool
@@ -1178,11 +1168,14 @@ def get_list_snapshot(zkhandler, target_pool, target_volume, limit=None, is_fuzz
             continue
         if target_volume and volume_name != target_volume:
             continue
-        snapshot_stats = json.loads(
-            zkhandler.read(
-                ("snapshot.stats", f"{pool_name}/{volume_name}/{snapshot_name}")
-            )
-        )
+        try:
+            snapshot_stats = json.loads(
+                zkhandler.read(
+                    ("snapshot.stats", f"{pool_name}/{volume_name}/{snapshot_name}")
+                )
+            )
+        except Exception:
+            snapshot_stats = []
         if limit:
             try:
                 if re.fullmatch(limit, snapshot_name):
@@ -1238,16 +1231,16 @@ def osd_worker_add_osd(
     current_stage = 0
     total_stages = 5
     if split_count is None:
-        _split_count = 1
+        split_count = 1
     else:
-        _split_count = split_count
-    total_stages = total_stages + 3 * int(_split_count)
+        split_count = int(split_count)
+    total_stages = total_stages + 3 * int(split_count)
     if ext_db_ratio is not None or ext_db_size is not None:
-        total_stages = total_stages + 3 * int(_split_count) + 1
+        total_stages = total_stages + 3 * int(split_count) + 1

     start(
         celery,
-        f"Adding {_split_count} new OSD(s) on device {device} with weight {weight}",
+        f"Adding {split_count} new OSD(s) on device {device} with weight {weight}",
         current=current_stage,
         total=total_stages,
     )
@@ -1288,7 +1281,7 @@ def osd_worker_add_osd(
     else:
         ext_db_flag = False

-    if split_count is not None:
+    if split_count > 1:
         split_flag = f"--osds-per-device {split_count}"
         is_split = True
         log_info(
@@ -262,6 +262,22 @@ def getClusterInformation(zkhandler):
     # Get cluster maintenance state
     maintenance_state = zkhandler.read("base.config.maintenance")

+    # Prepare cluster total values
+    cluster_total_node_memory = 0
+    cluster_total_used_memory = 0
+    cluster_total_free_memory = 0
+    cluster_total_allocated_memory = 0
+    cluster_total_provisioned_memory = 0
+    cluster_total_average_memory_utilization = 0
+    cluster_total_cpu_cores = 0
+    cluster_total_cpu_load = 0
+    cluster_total_average_cpu_utilization = 0
+    cluster_total_allocated_cores = 0
+    cluster_total_osd_space = 0
+    cluster_total_used_space = 0
+    cluster_total_free_space = 0
+    cluster_total_average_osd_utilization = 0
+
     # Get primary node
     maintenance_state, primary_node = zkhandler.read_many(
         [
@@ -276,19 +292,36 @@ def getClusterInformation(zkhandler):
     # Get the list of Nodes
     node_list = zkhandler.children("base.node")
     node_count = len(node_list)
-    # Get the daemon and domain states of all Nodes
+    # Get the information of all Nodes
     node_state_reads = list()
+    node_memory_reads = list()
+    node_cpu_reads = list()
     for node in node_list:
         node_state_reads += [
             ("node.state.daemon", node),
             ("node.state.domain", node),
         ]
+        node_memory_reads += [
+            ("node.memory.total", node),
+            ("node.memory.used", node),
+            ("node.memory.free", node),
+            ("node.memory.allocated", node),
+            ("node.memory.provisioned", node),
+        ]
+        node_cpu_reads += [
+            ("node.data.static", node),
+            ("node.vcpu.allocated", node),
+            ("node.cpu.load", node),
+        ]
     all_node_states = zkhandler.read_many(node_state_reads)
+    all_node_memory = zkhandler.read_many(node_memory_reads)
+    all_node_cpu = zkhandler.read_many(node_cpu_reads)

     # Parse out the Node states
     node_data = list()
     formatted_node_states = {"total": node_count}
     for nidx, node in enumerate(node_list):
-        # Split the large list of return values by the IDX of this node
+        # Split the large list of return values by the IDX of this node (states)
         # Each node result is 2 fields long
         pos_start = nidx * 2
         pos_end = nidx * 2 + 2
@@ -308,6 +341,46 @@ def getClusterInformation(zkhandler):
         else:
             formatted_node_states[node_state] = 1

+        # Split the large list of return values by the IDX of this node (memory)
+        # Each node result is 5 fields long
+        pos_start = nidx * 5
+        pos_end = nidx * 5 + 5
+        (
+            node_memory_total,
+            node_memory_used,
+            node_memory_free,
+            node_memory_allocated,
+            node_memory_provisioned,
+        ) = tuple(all_node_memory[pos_start:pos_end])
+        cluster_total_node_memory += int(node_memory_total)
+        cluster_total_used_memory += int(node_memory_used)
+        cluster_total_free_memory += int(node_memory_free)
+        cluster_total_allocated_memory += int(node_memory_allocated)
+        cluster_total_provisioned_memory += int(node_memory_provisioned)
+
+        # Split the large list of return values by the IDX of this node (cpu)
+        # Each node result is 3 fields long
+        pos_start = nidx * 3
+        pos_end = nidx * 3 + 3
+        node_static_data, node_vcpu_allocated, node_cpu_load = tuple(
+            all_node_cpu[pos_start:pos_end]
+        )
+        cluster_total_cpu_cores += int(node_static_data.split()[0])
+        cluster_total_cpu_load += round(float(node_cpu_load), 2)
+        cluster_total_allocated_cores += int(node_vcpu_allocated)
+
+    cluster_total_average_memory_utilization = (
+        (round((cluster_total_used_memory / cluster_total_node_memory) * 100, 2))
+        if cluster_total_node_memory > 0
+        else 0.00
+    )
+
+    cluster_total_average_cpu_utilization = (
+        (round((cluster_total_cpu_load / cluster_total_cpu_cores) * 100, 2))
+        if cluster_total_cpu_cores > 0
+        else 0.00
+    )
+
     # Get the list of VMs
     vm_list = zkhandler.children("base.domain")
     vm_count = len(vm_list)
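The derived utilization figures are plain ratios of these sums: average CPU utilization is the total 1-minute load over the total core count, expressed as a percentage. A worked example with hypothetical numbers:

```python
# Three 16-core nodes with a combined 1-minute load of 12.0 yield 25% CPU
# utilization under the formula above.
cluster_total_cpu_load = 12.0
cluster_total_cpu_cores = 48
utilization = round((cluster_total_cpu_load / cluster_total_cpu_cores) * 100, 2)
assert utilization == 25.0
```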
@@ -380,6 +453,18 @@ def getClusterInformation(zkhandler):
         else:
             formatted_osd_states[osd_state] = 1

+        # Add the OSD utilization
+        cluster_total_osd_space += int(osd_stats["kb"])
+        cluster_total_used_space += int(osd_stats["kb_used"])
+        cluster_total_free_space += int(osd_stats["kb_avail"])
+        cluster_total_average_osd_utilization += float(osd_stats["utilization"])
+
+    cluster_total_average_osd_utilization = (
+        (round(cluster_total_average_osd_utilization / len(ceph_osd_list), 2))
+        if ceph_osd_list
+        else 0.00
+    )
+
     # Get the list of Networks
     network_list = zkhandler.children("base.network")
     network_count = len(network_list)
@@ -424,6 +509,28 @@ def getClusterInformation(zkhandler):
         "pools": ceph_pool_count,
         "volumes": ceph_volume_count,
         "snapshots": ceph_snapshot_count,
+        "resources": {
+            "memory": {
+                "total": cluster_total_node_memory,
+                "free": cluster_total_free_memory,
+                "used": cluster_total_used_memory,
+                "allocated": cluster_total_allocated_memory,
+                "provisioned": cluster_total_provisioned_memory,
+                "utilization": cluster_total_average_memory_utilization,
+            },
+            "cpu": {
+                "total": cluster_total_cpu_cores,
+                "load": cluster_total_cpu_load,
+                "allocated": cluster_total_allocated_cores,
+                "utilization": cluster_total_average_cpu_utilization,
+            },
+            "disk": {
+                "total": cluster_total_osd_space,
+                "used": cluster_total_used_space,
+                "free": cluster_total_free_space,
+                "utilization": cluster_total_average_osd_utilization,
+            },
+        },
         "detail": {
             "node": node_data,
             "vm": vm_data,
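Once populated, these figures ride along in the normal cluster status payload, so any API consumer sees them without extra calls. A hedged sketch of reading them with Python `requests` (the URI, API key, and `/api/v1/status` endpoint path are assumptions based on the usual PVC conventions):

```python
import requests

# Hypothetical consumer of the new whole-cluster utilization statistics.
resp = requests.get(
    "https://pvc.example.com:7370/api/v1/status",
    headers={"X-Api-Key": "my-api-key"},
)
resources = resp.json().get("resources", {})
print("memory util %:", resources.get("memory", {}).get("utilization"))
print("cpu util %:", resources.get("cpu", {}).get("utilization"))
print("disk util %:", resources.get("disk", {}).get("utilization"))
```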
@@ -1053,6 +1160,7 @@ def get_resource_metrics(zkhandler):
         "fail": 8,
         "import": 9,
         "restore": 10,
+        "mirror": 99,
     }
     state = vm["state"]
     output_lines.append(
@@ -26,6 +26,7 @@ import subprocess
 import signal
 from json import loads
 from re import match as re_match
+from re import search as re_search
 from re import split as re_split
 from re import sub as re_sub
 from difflib import unified_diff
@@ -84,6 +85,7 @@ vm_state_combinations = [
     "provision",
     "import",
     "restore",
+    "mirror",
 ]
 ceph_osd_state_combinations = [
     "up,in",
@@ -1073,7 +1075,7 @@ def sortInterfaceNames(interface_names):
 #
 # Parse a "detect" device into a real block device name
 #
-def get_detect_device(detect_string):
+def get_detect_device_lsscsi(detect_string):
     """
     Parses a "detect:" string into a normalized block device path using lsscsi.
@@ -1140,3 +1142,96 @@
             break

     return blockdev


+def get_detect_device_nvme(detect_string):
+    """
+    Parses a "detect:" string into a normalized block device path using nvme.
+
+    A detect string is formatted "detect:<NAME>:<SIZE>:<ID>", where
+    NAME is some unique identifier in lsscsi, SIZE is a human-readable
+    size value to within +/- 3% of the real size of the device, and
+    ID is the Nth (0-indexed) matching entry of that NAME and SIZE.
+    """
+
+    unit_map = {
+        "kB": 1000,
+        "MB": 1000 * 1000,
+        "GB": 1000 * 1000 * 1000,
+        "TB": 1000 * 1000 * 1000 * 1000,
+        "PB": 1000 * 1000 * 1000 * 1000 * 1000,
+        "EB": 1000 * 1000 * 1000 * 1000 * 1000 * 1000,
+    }
+
+    _, name, _size, idd = detect_string.split(":")
+    if _ != "detect":
+        return None
+
+    size_re = re_search(r"([\d.]+)([kKMGTP]B)", _size)
+    size_val = float(size_re.group(1))
+    size_unit = size_re.group(2)
+    size_bytes = int(size_val * unit_map[size_unit])
+
+    retcode, stdout, stderr = run_os_command("nvme list --output-format json")
+    if retcode:
+        print(f"Failed to run nvme: {stderr}")
+        return None
+
+    # Parse the output with json
+    nvme_data = loads(stdout).get("Devices", list())
+
+    # Handle size determination (+/- 3%)
+    size = None
+    nvme_sizes = set()
+    for entry in nvme_data:
+        nvme_sizes.add(entry["PhysicalSize"])
+    for l_size in nvme_sizes:
+        plusthreepct = size_bytes * 1.03
+        minusthreepct = size_bytes * 0.97
+
+        if l_size > minusthreepct and l_size < plusthreepct:
+            size = l_size
+            break
+    if size is None:
+        return None
+
+    blockdev = None
+    matches = list()
+    for entry in nvme_data:
+        # Skip if name is not contained in the line (case-insensitive)
+        if name.lower() not in entry["ModelNumber"].lower():
+            continue
+        # Skip if the size does not match
+        if size != entry["PhysicalSize"]:
+            continue
+        # Get our blockdev and append to the list
+        matches.append(entry["DevicePath"])
+
+    blockdev = None
+    # Find the blockdev at index {idd}
+    for idx, _blockdev in enumerate(matches):
+        if int(idx) == int(idd):
+            blockdev = _blockdev
+            break
+
+    return blockdev
+
+
+def get_detect_device(detect_string):
+    """
+    Parses a "detect:" string into a normalized block device path.
+
+    First tries to parse using "lsscsi" (get_detect_device_lsscsi). If this returns an invalid
+    block device name, then try to parse using "nvme" (get_detect_device_nvme). This works around
+    issues with more recent devices (e.g. the Dell R6615 series) not properly reporting block
+    device paths for NVMe devices with "lsscsi".
+    """
+
+    device = get_detect_device_lsscsi(detect_string)
+    if device is None or not re_match(r"^/dev", device):
+        device = get_detect_device_nvme(detect_string)
+
+    if device is not None and re_match(r"^/dev", device):
+        return device
+    else:
+        return None
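To make the fallback concrete: a string like `detect:INTEL:800GB:0` requests the first (0-indexed) device whose model contains "INTEL" and whose reported size is within +/- 3% of 800 GB. A hypothetical resolution on a host where `lsscsi` cannot see the device but `nvme list` can (the model, size, and device path below are invented for illustration):

```python
# get_detect_device_lsscsi() returns None (or a non-/dev value), so the
# wrapper falls through to get_detect_device_nvme(), which might match an
# entry such as {"ModelNumber": "INTEL SSDPF2KX080T1",
# "PhysicalSize": 800166076416, "DevicePath": "/dev/nvme0n1"}.
device = get_detect_device("detect:INTEL:800GB:0")
print(device)  # "/dev/nvme0n1" on a match, or None if nothing qualified
```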
@@ -375,8 +375,11 @@ def get_parsed_configuration(config_file):
     config = {**config, **config_api_ssl}

     # Use coordinators as storage hosts if not explicitly specified
+    # These are added as FQDNs in the storage domain
     if not config["storage_hosts"] or len(config["storage_hosts"]) < 1:
-        config["storage_hosts"] = config["coordinators"]
+        config["storage_hosts"] = []
+        for host in config["coordinators"]:
+            config["storage_hosts"].append(f"{host}.{config['storage_domain']}")

     # Set up our token list if specified
     if config["api_auth_source"] == "token":
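The effect is purely mechanical: bare coordinator hostnames become FQDNs in the storage domain. A worked example with hypothetical values:

```python
# Equivalent expansion to the loop above.
coordinators = ["hv1", "hv2", "hv3"]
storage_domain = "storage.example.com"
storage_hosts = [f"{host}.{storage_domain}" for host in coordinators]
assert storage_hosts == [
    "hv1.storage.example.com",
    "hv2.storage.example.com",
    "hv3.storage.example.com",
]
```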
@@ -0,0 +1 @@
+{"version": "15", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.fence_lock": "/config/fence_lock", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "logs": "/logs", "faults": "/faults", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.health": "/ceph/health", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "logs": {"node": "", "messages": "/messages"}, "faults": {"id": "", "last_time": "/last_time", "first_time": "/first_time", "ack_time": "/ack_time", "status": "/status", "delta": "/delta", "message": "/message"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "data.pvc_version": "/pvcversion", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf", "monitoring.plugins": "/monitoring_plugins", "monitoring.data": "/monitoring_data", "monitoring.health": "/monitoring_health", "network.stats": "/network_stats"}, "monitoring_plugin": {"name": "", "last_run": "/last_run", "health_delta": "/health_delta", "message": "/message", "data": "/data", "runtime": "/runtime"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.migrate_max_downtime": "/migration_max_downtime", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "meta.tags": "/tags", "migrate.sync_lock": "/migrate_sync_lock", "snapshots": "/snapshots"}, "tag": {"name": "", "type": "/type", "protected": "/protected"}, "domain_snapshot": {"name": "", "timestamp": "/timestamp", "xml": "/xml", "rbd_snapshots": "/rbdsnaplist"}, "network": {"vni": "", "type": "/nettype", "mtu": "/mtu", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, "osd": {"id": "", "node": "/node", "device": "/device", "db_device": "/db_device", "fsid": "/fsid", "ofsid": "/fsid/osd", "cfsid": "/fsid/cluster", "lvm": "/lvm", "vg": "/lvm/vg", "lv": "/lvm/lv", "is_split": "/is_split", "stats": "/stats"}, "pool": {"name": "", "pgs": "/pgs", "tier": "/tier", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}
daemon-common/vm.py: 1872 changed lines (file diff suppressed because it is too large)
@@ -336,11 +336,7 @@ def worker_create_vm(
     retcode, stdout, stderr = pvc_common.run_os_command("uname -m")
     vm_data["system_architecture"] = stdout.strip()

-    monitor_list = list()
-    monitor_names = config["storage_hosts"]
-    for monitor in monitor_names:
-        monitor_list.append("{}.{}".format(monitor, config["storage_domain"]))
-    vm_data["ceph_monitor_list"] = monitor_list
+    vm_data["ceph_monitor_list"] = config["storage_hosts"]
     vm_data["ceph_monitor_port"] = config["ceph_monitor_port"]
     vm_data["ceph_monitor_secret"] = config["ceph_secret_uuid"]
@@ -30,7 +30,8 @@ from kazoo.client import KazooClient, KazooState
 from kazoo.exceptions import NoNodeError


-SCHEMA_ROOT_PATH = "/usr/share/pvc/daemon_lib/migrations/versions"
+DEFAULT_ROOT_PATH = "/usr/share/pvc"
+SCHEMA_PATH = "daemon_lib/migrations/versions"


 #
@@ -576,7 +577,7 @@ class ZKHandler(object):
 #
 class ZKSchema(object):
     # Current version
-    _version = 14
+    _version = 15

     # Root for doing nested keys
     _schema_root = ""
|
@ -592,6 +593,7 @@ class ZKSchema(object):
|
||||||
"schema.version": f"{_schema_root}/schema/version",
|
"schema.version": f"{_schema_root}/schema/version",
|
||||||
"config": f"{_schema_root}/config",
|
"config": f"{_schema_root}/config",
|
||||||
"config.maintenance": f"{_schema_root}/config/maintenance",
|
"config.maintenance": f"{_schema_root}/config/maintenance",
|
||||||
|
"config.fence_lock": f"{_schema_root}/config/fence_lock",
|
||||||
"config.primary_node": f"{_schema_root}/config/primary_node",
|
"config.primary_node": f"{_schema_root}/config/primary_node",
|
||||||
"config.primary_node.sync_lock": f"{_schema_root}/config/primary_node/sync_lock",
|
"config.primary_node.sync_lock": f"{_schema_root}/config/primary_node/sync_lock",
|
||||||
"config.upstream_ip": f"{_schema_root}/config/upstream_ip",
|
"config.upstream_ip": f"{_schema_root}/config/upstream_ip",
|
||||||
|
@@ -831,8 +833,8 @@ class ZKSchema(object):
     def schema(self, schema):
         self._schema = schema

-    def __init__(self):
-        pass
+    def __init__(self, root_path=DEFAULT_ROOT_PATH):
+        self.schema_path = f"{root_path}/{SCHEMA_PATH}"

     def __repr__(self):
         return f"ZKSchema({self.version})"
@@ -872,7 +874,7 @@ class ZKSchema(object):
         if not quiet:
             print(f"Loading schema version {version}")

-        with open(f"{SCHEMA_ROOT_PATH}/{version}.json", "r") as sfh:
+        with open(f"{self.schema_path}/{version}.json", "r") as sfh:
             self.schema = json.load(sfh)
             self.version = self.schema.get("version")
@@ -1134,7 +1136,7 @@ class ZKSchema(object):
     # Migrate from older to newer schema
     def migrate(self, zkhandler, new_version):
         # Determine the versions in between
-        versions = ZKSchema.find_all(start=self.version, end=new_version)
+        versions = self.find_all(start=self.version, end=new_version)
         if versions is None:
             return
@@ -1150,7 +1152,7 @@ class ZKSchema(object):
     # Rollback from newer to older schema
     def rollback(self, zkhandler, old_version):
         # Determine the versions in between
-        versions = ZKSchema.find_all(start=old_version - 1, end=self.version - 1)
+        versions = self.find_all(start=old_version - 1, end=self.version - 1)
         if versions is None:
             return
@@ -1165,6 +1167,12 @@ class ZKSchema(object):
         # Apply those changes
         self.run_migrate(zkhandler, changes)

+    # Write the latest schema to a file
+    def write(self):
+        schema_file = f"{self.schema_path}/{self._version}.json"
+        with open(schema_file, "w") as sfh:
+            json.dump(self._schema, sfh)
+
     @classmethod
     def key_diff(cls, schema_a, schema_b):
         # schema_a = current
@@ -1210,26 +1218,10 @@ class ZKSchema(object):

         return {"add": diff_add, "remove": diff_remove, "rename": diff_rename}

-    # Load in the schemal of the current cluster
-    @classmethod
-    def load_current(cls, zkhandler):
-        new_instance = cls()
-        version = new_instance.get_version(zkhandler)
-        new_instance.load(version)
-        return new_instance
-
-    # Write the latest schema to a file
-    @classmethod
-    def write(cls):
-        schema_file = f"{SCHEMA_ROOT_PATH}/{cls._version}.json"
-        with open(schema_file, "w") as sfh:
-            json.dump(cls._schema, sfh)
-
     # Static methods for reading information from the files
-    @staticmethod
-    def find_all(start=0, end=None):
+    def find_all(self, start=0, end=None):
         versions = list()
-        for version in os.listdir(SCHEMA_ROOT_PATH):
+        for version in os.listdir(self.schema_path):
             sequence_id = int(version.split(".")[0])
             if end is None:
                 if sequence_id > start:
@@ -1242,11 +1234,18 @@ class ZKSchema(object):
         else:
             return None

-    @staticmethod
-    def find_latest():
+    def find_latest(self):
         latest_version = 0
-        for version in os.listdir(SCHEMA_ROOT_PATH):
+        for version in os.listdir(self.schema_path):
             sequence_id = int(version.split(".")[0])
             if sequence_id > latest_version:
                 latest_version = sequence_id
         return latest_version

+    # Load in the schema of the current cluster
+    @classmethod
+    def load_current(cls, zkhandler):
+        new_instance = cls()
+        version = new_instance.get_version(zkhandler)
+        new_instance.load(version)
+        return new_instance
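With the path injected through the constructor and `find_all`/`find_latest` made instance methods, schema files can be located under any root rather than the hard-coded install path. A minimal sketch, assuming a `daemon_lib/migrations/versions` tree exists under the given root and that `load()` accepts a `quiet` flag as the hunk above suggests:

```python
# Work against a development checkout instead of /usr/share/pvc.
schema = ZKSchema(root_path="/home/user/pvc")
latest = schema.find_latest()            # highest N among N.json files
schema.load(latest, quiet=True)          # reads {schema_path}/{latest}.json
newer = schema.find_all(start=schema.version)  # any later on-disk versions
```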
@@ -1,3 +1,51 @@
+pvc (0.9.103-0) unstable; urgency=high
+
+  * [Provisioner] Fixes a bug with the change in `storage_hosts` to FQDNs affecting the VM Builder
+  * [Monitoring] Fixes the Munin plugin to work properly with sudo
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Fri, 01 Nov 2024 17:19:44 -0400
+
+pvc (0.9.102-0) unstable; urgency=high
+
+  * [API Daemon] Ensures that received config snapshots update storage hosts in addition to secret UUIDs
+  * [CLI Client] Fixes several bugs around local connection handling and connection listings
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Thu, 17 Oct 2024 10:48:31 -0400
+
+pvc (0.9.101-0) unstable; urgency=high
+
+  **New Feature**: Adds VM snapshot sending (`vm snapshot send`), VM mirroring (`vm mirror create`), and (offline) mirror promotion (`vm mirror promote`). Permits transferring VM snapshots to remote clusters, individually or repeatedly, and promoting them to active status, for disaster recovery and migration between clusters.
+  **Breaking Change**: Migrates the API daemon into Gunicorn when in production mode. Permits more scalable and performant operation of the API. **Requires additional dependency packages on all coordinator nodes** (`gunicorn`, `python3-gunicorn`, `python3-setuptools`); upgrade via `pvc-ansible` is strongly recommended.
+  **Enhancement**: Provides whole cluster utilization stats in the cluster status data. Permits better observability into the overall resource utilization of the cluster.
+  **Enhancement**: Adds a new storage benchmark format (v2) which includes additional resource utilization statistics. This allows for better evaluation of storage performance impact on the cluster as a whole. The updated format also permits arbitrary benchmark job names for easier parsing and tracking.
+
+  * [API Daemon] Allows scanning of new volumes added manually via other commands
+  * [API Daemon/CLI Client] Adds whole cluster utilization statistics to cluster status
+  * [API Daemon] Moves production API execution into Gunicorn
+  * [API Daemon] Adds a new storage benchmark format (v2) with additional resource tracking
+  * [API Daemon] Adds support for named storage benchmark jobs
+  * [API Daemon] Fixes a bug in OSD creation which would create `split` OSDs if `--osd-count` was set to 1
+  * [API Daemon] Adds support for the `mirror` VM state used by snapshot mirrors
+  * [CLI Client] Fixes several output display bugs in various commands and in Worker task outputs
+  * [CLI Client] Improves and shrinks the status progress bar output to support longer messages
+  * [API Daemon] Adds support for sending snapshots to remote clusters
+  * [API Daemon] Adds support for updating and promoting snapshot mirrors to remote clusters
+  * [Node Daemon] Improves timeouts during primary/secondary coordinator transitions to avoid deadlocks
+  * [Node Daemon] Improves timeouts during keepalive updates to avoid deadlocks
+  * [Node Daemon] Refactors fencing thread structure to ensure a single fencing task per cluster and sequential node fences to avoid potential anomalies (e.g. fencing 2 nodes simultaneously)
+  * [Node Daemon] Fixes a bug in fencing if VM locks were already freed, leaving VMs in an invalid state
+  * [Node Daemon] Increases the wait time during system startup to ensure Zookeeper has more time to synchronize
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Tue, 15 Oct 2024 11:39:11 -0400
+
+pvc (0.9.100-0) unstable; urgency=high
+
+  * [API Daemon] Improves the handling of "detect:" disk strings on newer systems by leveraging the "nvme" command
+  * [Client CLI] Update help text about "detect:" disk strings
+  * [Meta] Updates deprecation warnings and updates builder to only add this version for Debian 12 (Bookworm)
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Fri, 30 Aug 2024 11:03:33 -0400
+
 pvc (0.9.99-0) unstable; urgency=high

   **Deprecation Warning**: `pvc vm backup` commands are now deprecated and will be removed in **0.9.100**. Use `pvc vm snapshot` commands instead.
@@ -32,7 +32,7 @@ Description: Parallel Virtual Cluster worker daemon
 
 Package: pvc-daemon-api
 Architecture: all
-Depends: systemd, pvc-daemon-common, python3-yaml, python3-flask, python3-flask-restful, python3-celery, python3-distutils, python3-redis, python3-lxml, python3-flask-migrate
+Depends: systemd, pvc-daemon-common, gunicorn, python3-gunicorn, python3-yaml, python3-flask, python3-flask-restful, python3-celery, python3-distutils, python3-redis, python3-lxml, python3-flask-migrate
 Description: Parallel Virtual Cluster API daemon
  A KVM/Zookeeper/Ceph-based VM and private cloud manager
  .
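The new `gunicorn`/`python3-gunicorn` dependencies back the 0.9.101 move of production API execution into Gunicorn. For context, a minimal sketch of the general custom-application pattern Gunicorn documents for embedding a Flask app; this is not PVC's actual entrypoint, and the app contents, bind address, and worker count are illustrative assumptions:

```python
# Minimal sketch: serving a Flask app via Gunicorn's custom-application API.
# The route, bind address, and worker count are illustrative, not PVC's.
from flask import Flask
from gunicorn.app.base import BaseApplication

app = Flask(__name__)


@app.route("/status")
def status():
    return {"health": "ok"}


class StandaloneApplication(BaseApplication):
    def __init__(self, application, options=None):
        self.application = application
        self.options = options or {}
        super().__init__()

    def load_config(self):
        # Push our option dict into Gunicorn's settings registry
        for key, value in self.options.items():
            if key in self.cfg.settings and value is not None:
                self.cfg.set(key, value)

    def load(self):
        return self.application


if __name__ == "__main__":
    StandaloneApplication(app, {"bind": "0.0.0.0:7370", "workers": 2}).run()
```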
@@ -33,7 +33,7 @@ import os
 import signal
 
 # Daemon version
-version = "0.9.99"
+version = "0.9.103"
 
 
 ##########################################################
@@ -34,7 +34,7 @@ warning=0.99
 critical=1.99
 
 export PVC_CLIENT_DIR="/run/shm/munin-pvc"
-PVC_CMD="/usr/bin/pvc --quiet --cluster local status --format json-pretty"
+PVC_CMD="/usr/bin/sudo -E /usr/bin/pvc --quiet cluster status --format json-pretty"
 JQ_CMD="/usr/bin/jq"
 
 output_usage() {
@@ -126,7 +126,7 @@ output_values() {
     is_maintenance="$( $JQ_CMD ".maintenance" <<<"${PVC_OUTPUT}" | tr -d '"' )"
 
     cluster_health="$( $JQ_CMD ".cluster_health.health" <<<"${PVC_OUTPUT}" | tr -d '"' )"
-    cluster_health_messages="$( $JQ_CMD -r ".cluster_health.messages | @csv" <<<"${PVC_OUTPUT}" | tr -d '"' | sed 's/,/, /g' )"
+    cluster_health_messages="$( $JQ_CMD -r ".cluster_health.messages | map(.text) | join(\", \")" <<<"${PVC_OUTPUT}" )"
     echo 'multigraph pvc_cluster_health'
     echo "pvc_cluster_health.value ${cluster_health}"
     echo "pvc_cluster_health.extinfo ${cluster_health_messages}"
@@ -142,7 +142,7 @@ output_values() {
     echo "pvc_cluster_alert.value ${cluster_health_alert}"
 
     node_health="$( $JQ_CMD ".node_health.${HOST}.health" <<<"${PVC_OUTPUT}" | tr -d '"' )"
-    node_health_messages="$( $JQ_CMD -r ".node_health.${HOST}.messages | @csv" <<<"${PVC_OUTPUT}" | tr -d '"' | sed 's/,/, /g' )"
+    node_health_messages="$( $JQ_CMD -r ".node_health.${HOST}.messages | join(\", \")" <<<"${PVC_OUTPUT}" )"
    echo 'multigraph pvc_node_health'
    echo "pvc_node_health.value ${node_health}"
    echo "pvc_node_health.extinfo ${node_health_messages}"
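The updated jq filters reflect the newer cluster-health message format, where each message is an object with a `text` field rather than a bare string, so the old `@csv`-plus-`sed` trick no longer applied. A Python equivalent of the new `map(.text) | join(", ")` filter, for illustration; the sample payload shape is an assumption inferred from the filters themselves:

```python
import json

# Sample status payload shaped the way the jq filter above expects;
# real PVC output carries many more fields.
payload = json.loads(
    '{"cluster_health": {"health": 100, "messages": ['
    '{"text": "OSD osd.1 is down"}, {"text": "Node hv2 is flushed"}]}}'
)

# jq: .cluster_health.messages | map(.text) | join(", ")
messages = ", ".join(m["text"] for m in payload["cluster_health"]["messages"])
print(messages)  # OSD osd.1 is down, Node hv2 is flushed
```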
(File diff suppressed because it is too large)
@@ -15,7 +15,7 @@
       "type": "grafana",
       "id": "grafana",
       "name": "Grafana",
-      "version": "10.2.2"
+      "version": "11.1.4"
     },
     {
       "type": "datasource",
@@ -112,6 +112,7 @@
         "graphMode": "area",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -119,10 +120,11 @@
           "fields": "/^pvc_cluster_id$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -144,7 +146,6 @@
         }
       ],
       "title": "Cluster",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -187,6 +188,7 @@
         "graphMode": "area",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -194,10 +196,11 @@
           "fields": "/^vm$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -219,7 +222,6 @@
         }
       ],
       "title": "VM Name",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -301,6 +303,21 @@
                 "color": "dark-red",
                 "index": 8,
                 "text": "fail"
+              },
+              "9": {
+                "color": "dark-blue",
+                "index": 9,
+                "text": "import"
+              },
+              "10": {
+                "color": "dark-blue",
+                "index": 10,
+                "text": "restore"
+              },
+              "99": {
+                "color": "dark-purple",
+                "index": 11,
+                "text": "mirror"
               }
             },
             "type": "value"
@@ -323,6 +340,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -330,10 +348,11 @@
           "fields": "/^Value$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -355,7 +374,6 @@
         }
       ],
       "title": "State",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -398,6 +416,7 @@
         "graphMode": "area",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -405,10 +424,11 @@
           "fields": "/^uuid$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -430,7 +450,6 @@
         }
       ],
       "title": "UUID",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -473,6 +492,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -480,10 +500,11 @@
           "fields": "/^node$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -505,7 +526,6 @@
         }
       ],
       "title": "Active Node",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -545,6 +565,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -552,10 +573,11 @@
           "fields": "/^last_node$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -577,7 +599,6 @@
         }
       ],
       "title": "Migrated",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -646,6 +667,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -653,10 +675,11 @@
           "fields": "/^Value$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -678,7 +701,6 @@
         }
       ],
       "title": "Autostart",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -721,6 +743,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -728,10 +751,11 @@
           "fields": "/^description$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -753,7 +777,6 @@
         }
       ],
       "title": "Description",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -796,6 +819,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -803,10 +827,11 @@
           "fields": "/^Value$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -828,7 +853,6 @@
         }
       ],
       "title": "vCPUs",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -871,6 +895,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -878,10 +903,11 @@
           "fields": "/^topology$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -903,7 +929,6 @@
         }
       ],
       "title": "vCPU Topology",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -947,6 +972,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -954,10 +980,11 @@
           "fields": "/^Value$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -979,7 +1006,6 @@
         }
       ],
       "title": "vRAM",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -1022,6 +1048,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -1029,10 +1056,11 @@
           "fields": "/^node_limit$/",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -1054,7 +1082,6 @@
         }
       ],
       "title": "Node Limits",
-      "transformations": [],
       "type": "stat"
     },
     {
@@ -1097,6 +1124,7 @@
         "graphMode": "none",
         "justifyMode": "auto",
         "orientation": "auto",
+        "percentChangeColorMode": "standard",
         "reduceOptions": {
           "calcs": [
             "lastNotNull"
@@ -1104,10 +1132,11 @@
           "fields": "failed_reason",
           "values": false
         },
+        "showPercentChange": false,
         "textMode": "auto",
         "wideLayout": true
       },
-      "pluginVersion": "10.2.2",
+      "pluginVersion": "11.1.4",
       "targets": [
         {
           "datasource": {
@@ -1129,11 +1158,10 @@
         }
       ],
       "title": "Failure Reason",
-      "transformations": [],
       "type": "stat"
     },
     {
-      "collapsed": true,
+      "collapsed": false,
       "gridPos": {
         "h": 1,
         "w": 24,
@@ -1141,7 +1169,10 @@
         "y": 10
       },
      "id": 14,
-      "panels": [
+      "panels": [],
+      "title": "CPU & Memory Stats",
+      "type": "row"
+    },
     {
       "datasource": {
         "type": "prometheus",
@@ -1664,21 +1695,20 @@
       ],
       "title": "Swap Utilization (+ in/- out)",
       "type": "timeseries"
-        }
-      ],
-      "title": "CPU & Memory Stats",
-      "type": "row"
     },
     {
-      "collapsed": true,
+      "collapsed": false,
       "gridPos": {
         "h": 1,
         "w": 24,
         "x": 0,
-        "y": 11
+        "y": 27
       },
       "id": 19,
-      "panels": [
+      "panels": [],
+      "title": "NIC Stats",
+      "type": "row"
+    },
     {
       "datasource": {
         "type": "prometheus",
@@ -1727,8 +1757,7 @@
             "mode": "absolute",
             "steps": [
               {
-                "color": "green",
-                "value": null
+                "color": "green"
               },
               {
                 "color": "red",
@@ -1757,7 +1786,7 @@
         "h": 10,
         "w": 24,
         "x": 0,
-        "y": 12
+        "y": 28
       },
       "id": 20,
       "options": {
@@ -1864,8 +1893,7 @@
             "mode": "absolute",
             "steps": [
               {
-                "color": "green",
-                "value": null
+                "color": "green"
               },
               {
                 "color": "red",
@@ -1894,7 +1922,7 @@
         "h": 10,
         "w": 24,
         "x": 0,
-        "y": 22
+        "y": 38
       },
       "id": 21,
       "options": {
@@ -2001,8 +2029,7 @@
             "mode": "absolute",
             "steps": [
               {
-                "color": "green",
-                "value": null
+                "color": "green"
               }
             ]
           },
@@ -2027,7 +2054,7 @@
         "h": 8,
         "w": 12,
         "x": 0,
-        "y": 32
+        "y": 48
       },
       "id": 22,
       "options": {
@@ -2134,8 +2161,7 @@
             "mode": "absolute",
             "steps": [
               {
-                "color": "green",
-                "value": null
+                "color": "green"
               }
             ]
           },
@@ -2160,7 +2186,7 @@
         "h": 8,
         "w": 12,
         "x": 12,
-        "y": 32
+        "y": 48
       },
       "id": 23,
       "options": {
@@ -2218,21 +2244,20 @@
       ],
       "title": "Errors (+ RX/- TX)",
       "type": "timeseries"
-        }
-      ],
-      "title": "NIC Stats",
-      "type": "row"
     },
     {
-      "collapsed": true,
+      "collapsed": false,
       "gridPos": {
         "h": 1,
         "w": 24,
         "x": 0,
-        "y": 12
+        "y": 56
       },
       "id": 24,
-      "panels": [
+      "panels": [],
+      "title": "Disk Stats",
+      "type": "row"
+    },
     {
       "datasource": {
         "type": "prometheus",
@@ -2281,8 +2306,7 @@
             "mode": "absolute",
             "steps": [
               {
-                "color": "green",
-                "value": null
+                "color": "green"
               },
               {
                 "color": "red",
@@ -2311,7 +2335,7 @@
         "h": 9,
         "w": 24,
         "x": 0,
-        "y": 13
+        "y": 57
       },
       "id": 25,
       "options": {
@@ -2368,7 +2392,6 @@
         }
       ],
       "title": "IOPS (+ Read/- Write)",
-      "transformations": [],
       "type": "timeseries"
     },
     {
@@ -2419,8 +2442,7 @@
             "mode": "absolute",
             "steps": [
               {
-                "color": "green",
-                "value": null
+                "color": "green"
              },
              {
                "color": "red",
@@ -2449,7 +2471,7 @@
         "h": 9,
         "w": 24,
         "x": 0,
-        "y": 22
+        "y": 66
       },
       "id": 26,
       "options": {
@@ -2509,12 +2531,8 @@
         "type": "timeseries"
       }
     ],
-    "title": "Disk Stats",
-    "type": "row"
-  }
-],
   "refresh": "5s",
-  "schemaVersion": 38,
+  "schemaVersion": 39,
   "tags": [
     "pvc"
   ],
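Beyond the mechanical Grafana 11 updates (plugin versions, `showPercentChange`, flattened row panels), the substantive dashboard change is the new value mappings for the snapshot-related VM states. Rendered as a small Python mapping for readability; the colors, indices, and texts come from the hunk above, and reading the keys as the exporter's numeric VM state codes is an assumption:

```python
# New VM state value mappings added to the dashboard's "State" stat panel.
new_state_mappings = {
    "9": {"color": "dark-blue", "index": 9, "text": "import"},
    "10": {"color": "dark-blue", "index": 10, "text": "restore"},
    "99": {"color": "dark-purple", "index": 11, "text": "mirror"},
}
```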
@@ -49,7 +49,7 @@ import re
 import json
 
 # Daemon version
-version = "0.9.99"
+version = "0.9.103"
 
 
 ##########################################################
@@ -438,8 +438,11 @@ class NodeInstance(object):
         # Synchronize nodes B (I am reader)
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase B", state="i")
-        lock.acquire()
-        self.logger.out("Acquired read lock for synchronization phase B", state="o")
+        try:
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
+            self.logger.out("Acquired read lock for synchronization phase G", state="o")
+        except Exception:
+            pass
         self.logger.out("Releasing read lock for synchronization phase B", state="i")
         lock.release()
         self.logger.out("Released read lock for synchronization phase B", state="o")
@@ -648,8 +651,11 @@ class NodeInstance(object):
         # Synchronize nodes A (I am reader)
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase A", state="i")
-        lock.acquire()
-        self.logger.out("Acquired read lock for synchronization phase A", state="o")
+        try:
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
+            self.logger.out("Acquired read lock for synchronization phase G", state="o")
+        except Exception:
+            pass
         self.logger.out("Releasing read lock for synchronization phase A", state="i")
         lock.release()
         self.logger.out("Released read lock for synchronization phase A", state="o")
@@ -682,8 +688,11 @@ class NodeInstance(object):
         # Synchronize nodes C (I am reader)
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase C", state="i")
-        lock.acquire()
-        self.logger.out("Acquired read lock for synchronization phase C", state="o")
+        try:
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
+            self.logger.out("Acquired read lock for synchronization phase G", state="o")
+        except Exception:
+            pass
         # 5. Remove Upstream floating IP
         self.logger.out(
             "Removing floating upstream IP {}/{} from interface {}".format(
@@ -701,8 +710,11 @@ class NodeInstance(object):
         # Synchronize nodes D (I am reader)
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase D", state="i")
-        lock.acquire()
-        self.logger.out("Acquired read lock for synchronization phase D", state="o")
+        try:
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
+            self.logger.out("Acquired read lock for synchronization phase G", state="o")
+        except Exception:
+            pass
         # 6. Remove Cluster & Storage floating IP
         self.logger.out(
             "Removing floating management IP {}/{} from interface {}".format(
@@ -729,8 +741,11 @@ class NodeInstance(object):
         # Synchronize nodes E (I am reader)
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase E", state="i")
-        lock.acquire()
-        self.logger.out("Acquired read lock for synchronization phase E", state="o")
+        try:
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
+            self.logger.out("Acquired read lock for synchronization phase G", state="o")
+        except Exception:
+            pass
         # 7. Remove Metadata link-local IP
         self.logger.out(
             "Removing Metadata link-local IP {}/{} from interface {}".format(
@@ -746,8 +761,11 @@ class NodeInstance(object):
         # Synchronize nodes F (I am reader)
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase F", state="i")
-        lock.acquire()
-        self.logger.out("Acquired read lock for synchronization phase F", state="o")
+        try:
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
+            self.logger.out("Acquired read lock for synchronization phase G", state="o")
+        except Exception:
+            pass
         # 8. Remove gateway IPs
         for network in self.d_network:
             self.d_network[network].removeGateways()
@@ -759,7 +777,7 @@ class NodeInstance(object):
         lock = self.zkhandler.readlock("base.config.primary_node.sync_lock")
         self.logger.out("Acquiring read lock for synchronization phase G", state="i")
         try:
-            lock.acquire(timeout=60)  # Don't wait forever and completely block us
+            lock.acquire(timeout=5)  # Don't wait forever and completely block us
             self.logger.out("Acquired read lock for synchronization phase G", state="o")
         except Exception:
             pass
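The repeated change above is a single pattern: every blocking `lock.acquire()` in the synchronization phases becomes a bounded acquire, so a stuck peer can no longer deadlock a primary/secondary coordinator transition. PVC's `zkhandler.readlock()` is its own wrapper; with raw Kazoo, the same guarded-acquire pattern looks roughly like this sketch (the ZK address, lock path, and 5-second timeout are illustrative assumptions):

```python
# Minimal sketch of a timeout-guarded Zookeeper read lock using Kazoo.
from kazoo.client import KazooClient
from kazoo.exceptions import LockTimeout
from kazoo.recipe.lock import ReadLock

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

lock = ReadLock(zk, "/config/primary_node/sync_lock")
try:
    # Bounded acquire: raises LockTimeout instead of blocking forever
    lock.acquire(timeout=5)
except LockTimeout:
    # Proceed anyway; the lock is only a synchronization barrier here
    pass
finally:
    lock.release()  # releasing an unacquired Kazoo lock is a safe no-op
    zk.stop()
```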
@@ -21,15 +21,72 @@
 
 import time
 
+from kazoo.exceptions import LockTimeout
 
 import daemon_lib.common as common
 
 from daemon_lib.vm import vm_worker_flush_locks
 
 
 #
-# Fence thread entry function
+# Fence monitor thread entrypoint
 #
-def fence_node(node_name, zkhandler, config, logger):
+def fence_monitor(zkhandler, config, logger):
+    # Attempt to acquire an exclusive lock on the fence_lock key
+    # If it is already held, we'll abort since another node is processing fences
+    lock = zkhandler.exclusivelock("base.config.fence_lock")
+
+    try:
+        lock.acquire(timeout=config["keepalive_interval"] - 1)
+
+        for node_name in zkhandler.children("base.node"):
+            try:
+                node_daemon_state = zkhandler.read(("node.state.daemon", node_name))
+                node_keepalive = int(zkhandler.read(("node.keepalive", node_name)))
+            except Exception:
+                node_daemon_state = "unknown"
+                node_keepalive = 0
+
+            node_deadtime = int(time.time()) - (
+                int(config["keepalive_interval"]) * int(config["fence_intervals"])
+            )
+            if node_keepalive < node_deadtime and node_daemon_state == "run":
+                logger.out(
+                    f"Node {node_name} seems dead; starting monitor for fencing",
+                    state="w",
+                )
+                zk_lock = zkhandler.writelock(("node.state.daemon", node_name))
+                with zk_lock:
+                    # Ensures that, if we lost the lock race and come out of waiting,
+                    # we won't try to trigger our own fence thread.
+                    if zkhandler.read(("node.state.daemon", node_name)) != "dead":
+                        # Write the updated data after we start the fence thread
+                        zkhandler.write([(("node.state.daemon", node_name), "dead")])
+                        # Start the fence monitoring task for this node
+                        # NOTE: This is not a subthread and is designed to block this for loop
+                        # This ensures that only one node is ever being fenced at a time
+                        fence_node(zkhandler, config, logger, node_name)
+            else:
+                logger.out(
+                    f"Node {node_name} is OK; last checkin is {node_deadtime - node_keepalive}s from threshold, node state is '{node_daemon_state}'",
+                    state="d",
+                    prefix="fence-thread",
+                )
+    except LockTimeout:
+        logger.out(
+            "Fence monitor thread failed to acquire exclusive lock; skipping", state="i"
+        )
+    except Exception as e:
+        logger.out(f"Fence monitor thread failed: {e}", state="w")
+    finally:
+        # We're finished, so release the global lock
+        lock.release()
+
+
+#
+# Fence action function
+#
+def fence_node(zkhandler, config, logger, node_name):
     # We allow exactly 6 saving throws (30 seconds) for the host to come back online or we kill it
     failcount_limit = 6
     failcount = 0
@@ -190,7 +247,7 @@ def migrateFromFencedNode(zkhandler, node_name, config, logger):
     )
     zkhandler.write(
         {
-            (("domain.state", dom_uuid), "stopped"),
+            (("domain.state", dom_uuid), "stop"),
             (("domain.meta.autostart", dom_uuid), "True"),
         }
     )
@@ -202,6 +259,9 @@ def migrateFromFencedNode(zkhandler, node_name, config, logger):
 
     # Loop through the VMs
     for dom_uuid in dead_node_running_domains:
+        if dom_uuid in ["0", 0]:
+            # Skip the invalid "0" UUID we sometimes get
+            continue
         try:
             fence_migrate_vm(dom_uuid)
         except Exception as e:
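The new `fence_monitor` design serializes fencing cluster-wide: whichever coordinator grabs the exclusive `fence_lock` first processes all dead nodes one at a time, and every other coordinator simply skips that cycle. The `zkhandler.exclusivelock` wrapper is PVC's own; the same serialization with raw Kazoo looks roughly like this sketch (the ZK address, lock path, identifier, and timing are illustrative assumptions):

```python
# Sketch: one-worker-at-a-time task execution across nodes via a ZK lock.
import time

from kazoo.client import KazooClient
from kazoo.exceptions import LockTimeout

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

lock = zk.Lock("/config/fence_lock", identifier="node1")
try:
    # Give up just before the next keepalive cycle would start
    lock.acquire(timeout=4)
    # ... scan nodes and fence them sequentially here ...
    time.sleep(1)  # stand-in for the actual fencing work
except LockTimeout:
    print("Another node holds the fence lock; skipping this cycle")
finally:
    lock.release()
    zk.stop()
```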
@@ -756,29 +756,21 @@ def node_keepalive(logger, config, zkhandler, this_node, netstats):
 
     # Join against running threads
     if config["enable_hypervisor"]:
-        vm_stats_thread.join(timeout=config["keepalive_interval"])
+        vm_stats_thread.join(timeout=config["keepalive_interval"] - 1)
         if vm_stats_thread.is_alive():
             logger.out("VM stats gathering exceeded timeout, continuing", state="w")
     if config["enable_storage"]:
-        ceph_stats_thread.join(timeout=config["keepalive_interval"])
+        ceph_stats_thread.join(timeout=config["keepalive_interval"] - 1)
         if ceph_stats_thread.is_alive():
             logger.out("Ceph stats gathering exceeded timeout, continuing", state="w")
 
     # Get information from thread queues
     if config["enable_hypervisor"]:
         try:
-            this_node.domains_count = vm_thread_queue.get(
-                timeout=config["keepalive_interval"]
-            )
-            this_node.memalloc = vm_thread_queue.get(
-                timeout=config["keepalive_interval"]
-            )
-            this_node.memprov = vm_thread_queue.get(
-                timeout=config["keepalive_interval"]
-            )
-            this_node.vcpualloc = vm_thread_queue.get(
-                timeout=config["keepalive_interval"]
-            )
+            this_node.domains_count = vm_thread_queue.get(timeout=0.1)
+            this_node.memalloc = vm_thread_queue.get(timeout=0.1)
+            this_node.memprov = vm_thread_queue.get(timeout=0.1)
+            this_node.vcpualloc = vm_thread_queue.get(timeout=0.1)
         except Exception:
             logger.out("VM stats queue get exceeded timeout, continuing", state="w")
         else:
@@ -789,9 +781,7 @@ def node_keepalive(logger, config, zkhandler, this_node, netstats):
 
     if config["enable_storage"]:
         try:
-            osds_this_node = ceph_thread_queue.get(
-                timeout=(config["keepalive_interval"] - 1)
-            )
+            osds_this_node = ceph_thread_queue.get(timeout=0.1)
         except Exception:
             logger.out("Ceph stats queue get exceeded timeout, continuing", state="w")
             osds_this_node = "?"
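These two hunks are the keepalive deadlock fix from the changelog: the collection threads are joined with a bound just under the keepalive interval, and the subsequent queue reads use a near-zero timeout, since a thread that finished has already published its results and a thread that did not will never do so in time. A self-contained sketch of the same pattern, with illustrative values standing in for PVC's config and stats:

```python
# Sketch: bounded join plus short queue timeout so a slow producer
# can never stall the consumer past its deadline.
from queue import Empty, Queue
from threading import Thread
from time import sleep

interval = 5  # stand-in for config["keepalive_interval"]
q = Queue()


def gather_stats(out_queue):
    sleep(1)  # simulated collection work
    out_queue.put({"domains_count": 4})


t = Thread(target=gather_stats, args=(q,))
t.start()

t.join(timeout=interval - 1)
if t.is_alive():
    print("stats gathering exceeded timeout, continuing")

try:
    stats = q.get(timeout=0.1)  # data is either ready now or not coming
except Empty:
    stats = None
print(stats)
```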
@@ -887,44 +877,12 @@ def node_keepalive(logger, config, zkhandler, this_node, netstats):
     )
 
     # Look for dead nodes and fence them
-    if not this_node.maintenance:
+    if not this_node.maintenance and config["daemon_mode"] == "coordinator":
         logger.out(
             "Look for dead nodes and fence them", state="d", prefix="main-thread"
         )
-        if config["daemon_mode"] == "coordinator":
-            for node_name in zkhandler.children("base.node"):
-                try:
-                    node_daemon_state = zkhandler.read(("node.state.daemon", node_name))
-                    node_keepalive = int(zkhandler.read(("node.keepalive", node_name)))
-                except Exception:
-                    node_daemon_state = "unknown"
-                    node_keepalive = 0
-
-                # Handle deadtime and fencng if needed
-                # (A node is considered dead when its keepalive timer is >6*keepalive_interval seconds
-                # out-of-date while in 'start' state)
-                node_deadtime = int(time.time()) - (
-                    int(config["keepalive_interval"]) * int(config["fence_intervals"])
-                )
-                if node_keepalive < node_deadtime and node_daemon_state == "run":
-                    logger.out(
-                        "Node {} seems dead - starting monitor for fencing".format(
-                            node_name
-                        ),
-                        state="w",
-                    )
-                    zk_lock = zkhandler.writelock(("node.state.daemon", node_name))
-                    with zk_lock:
-                        # Ensures that, if we lost the lock race and come out of waiting,
-                        # we won't try to trigger our own fence thread.
-                        if zkhandler.read(("node.state.daemon", node_name)) != "dead":
-                            fence_thread = Thread(
-                                target=pvcnoded.util.fencing.fence_node,
-                                args=(node_name, zkhandler, config, logger),
-                                kwargs={},
-                            )
-                            fence_thread.start()
-                            # Write the updated data after we start the fence thread
-                            zkhandler.write(
-                                [(("node.state.daemon", node_name), "dead")]
-                            )
+        fence_monitor_thread = Thread(
+            target=pvcnoded.util.fencing.fence_monitor,
+            args=(zkhandler, config, logger),
+        )
+        fence_monitor_thread.start()
@@ -102,5 +102,5 @@ def start_system_services(logger, config):
     start_workerd(logger, config)
     start_healthd(logger, config)
 
-    logger.out("Waiting 5 seconds for daemons to start", state="s")
-    sleep(5)
+    logger.out("Waiting 10 seconds for daemons to start", state="s")
+    sleep(10)
@@ -188,3 +188,6 @@ def setup_node(logger, config, zkhandler):
             (("node.count.networks", config["node_hostname"]), "0"),
         ]
     )
+
+    logger.out("Waiting 5 seconds for Zookeeper to synchronize", state="s")
+    time.sleep(5)
@@ -33,6 +33,9 @@ from daemon_lib.vm import (
     vm_worker_rollback_snapshot,
     vm_worker_export_snapshot,
     vm_worker_import_snapshot,
+    vm_worker_send_snapshot,
+    vm_worker_create_mirror,
+    vm_worker_promote_mirror,
 )
 from daemon_lib.ceph import (
     osd_worker_add_osd,
@@ -52,7 +55,7 @@ from daemon_lib.autobackup import (
 )
 
 # Daemon version
-version = "0.9.99"
+version = "0.9.103"
 
 
 config = cfg.get_configuration()
@@ -96,12 +99,12 @@ def create_vm(
 
 
 @celery.task(name="storage.benchmark", bind=True, routing_key="run_on")
-def storage_benchmark(self, pool=None, run_on="primary"):
+def storage_benchmark(self, pool=None, name=None, run_on="primary"):
     @ZKConnection(config)
-    def run_storage_benchmark(zkhandler, self, pool):
-        return worker_run_benchmark(zkhandler, self, config, pool)
+    def run_storage_benchmark(zkhandler, self, pool, name):
+        return worker_run_benchmark(zkhandler, self, config, pool, name)
 
-    return run_storage_benchmark(self, pool)
+    return run_storage_benchmark(self, pool, name)
 
 
 @celery.task(name="cluster.autobackup", bind=True, routing_key="run_on")
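With the `name` parameter threaded through, benchmark jobs can be labeled at submission time, matching the changelog's "named storage benchmark jobs" entry. A hedged example of dispatching the task through Celery's generic `send_task` API; the broker URL, pool name, and job name are assumptions for illustration, and real PVC submissions go through the API daemon rather than a bare client:

```python
# Sketch: submitting a named storage benchmark via Celery's send_task API.
from celery import Celery

celery_app = Celery("pvcworkerd", broker="redis://127.0.0.1:6379/0")

result = celery_app.send_task(
    "storage.benchmark",
    kwargs={"pool": "vms", "name": "nightly-baseline", "run_on": "primary"},
)
print(result.id)  # task UUID to poll for status
```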
@@ -227,6 +230,138 @@ def vm_import_snapshot(
     )
 
 
+@celery.task(name="vm.send_snapshot", bind=True, routing_key="run_on")
+def vm_send_snapshot(
+    self,
+    domain=None,
+    snapshot_name=None,
+    destination_api_uri="",
+    destination_api_key="",
+    destination_api_verify_ssl=True,
+    incremental_parent=None,
+    destination_storage_pool=None,
+    run_on="primary",
+):
+    @ZKConnection(config)
+    def run_vm_send_snapshot(
+        zkhandler,
+        self,
+        domain,
+        snapshot_name,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=True,
+        incremental_parent=None,
+        destination_storage_pool=None,
+    ):
+        return vm_worker_send_snapshot(
+            zkhandler,
+            self,
+            domain,
+            snapshot_name,
+            destination_api_uri,
+            destination_api_key,
+            destination_api_verify_ssl=destination_api_verify_ssl,
+            incremental_parent=incremental_parent,
+            destination_storage_pool=destination_storage_pool,
+        )
+
+    return run_vm_send_snapshot(
+        self,
+        domain,
+        snapshot_name,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=destination_api_verify_ssl,
+        incremental_parent=incremental_parent,
+        destination_storage_pool=destination_storage_pool,
+    )
+
+
+@celery.task(name="vm.create_mirror", bind=True, routing_key="run_on")
+def vm_create_mirror(
+    self,
+    domain=None,
+    destination_api_uri="",
+    destination_api_key="",
+    destination_api_verify_ssl=True,
+    destination_storage_pool=None,
+    run_on="primary",
+):
+    @ZKConnection(config)
+    def run_vm_create_mirror(
+        zkhandler,
+        self,
+        domain,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=True,
+        destination_storage_pool=None,
+    ):
+        return vm_worker_create_mirror(
+            zkhandler,
+            self,
+            domain,
+            destination_api_uri,
+            destination_api_key,
+            destination_api_verify_ssl=destination_api_verify_ssl,
+            destination_storage_pool=destination_storage_pool,
+        )
+
+    return run_vm_create_mirror(
+        self,
+        domain,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=destination_api_verify_ssl,
+        destination_storage_pool=destination_storage_pool,
+    )
+
+
+@celery.task(name="vm.promote_mirror", bind=True, routing_key="run_on")
+def vm_promote_mirror(
+    self,
+    domain=None,
+    destination_api_uri="",
+    destination_api_key="",
+    destination_api_verify_ssl=True,
+    destination_storage_pool=None,
+    remove_on_source=False,
+    run_on="primary",
+):
+    @ZKConnection(config)
+    def run_vm_promote_mirror(
+        zkhandler,
+        self,
+        domain,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=True,
+        destination_storage_pool=None,
+        remove_on_source=False,
+    ):
+        return vm_worker_promote_mirror(
+            zkhandler,
+            self,
+            domain,
+            destination_api_uri,
+            destination_api_key,
+            destination_api_verify_ssl=destination_api_verify_ssl,
+            destination_storage_pool=destination_storage_pool,
+            remove_on_source=remove_on_source,
+        )
+
+    return run_vm_promote_mirror(
+        self,
+        domain,
+        destination_api_uri,
+        destination_api_key,
+        destination_api_verify_ssl=destination_api_verify_ssl,
+        destination_storage_pool=destination_storage_pool,
+        remove_on_source=remove_on_source,
+    )
+
+
 @celery.task(name="osd.add", bind=True, routing_key="run_on")
 def osd_add(
     self,
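All three new tasks follow the same shape as the existing worker tasks: the Celery task body is a nested closure decorated with `@ZKConnection(config)`, so each invocation gets a managed Zookeeper connection injected as its first argument. A stripped-down sketch of that decorator pattern, with wholly illustrative names rather than PVC's actual implementation:

```python
# Sketch: a decorator that opens a connection, injects it as the first
# argument, and guarantees cleanup, mirroring the @ZKConnection pattern.
import functools


class FakeZKHandler:
    def connect(self):
        print("connected")

    def disconnect(self):
        print("disconnected")


def zk_connection(config):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            zkhandler = FakeZKHandler()
            zkhandler.connect()
            try:
                return func(zkhandler, *args, **kwargs)
            finally:
                zkhandler.disconnect()

        return wrapper

    return decorator


@zk_connection(config={})
def do_work(zkhandler, domain):
    return f"worked on {domain}"


print(do_work("vm1"))  # prints connected / disconnected around the call
```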