Compare commits

...

27 Commits

SHA1 Message Date
afdf254297 Bump version to 0.9.32 2021-08-19 12:37:58 -04:00
42e776fac1 Properly handle exceptions getting VM stats 2021-08-19 12:36:31 -04:00
dae67a1b7b Fix image dimensions and size 2021-08-18 19:51:55 -04:00
b86f8c1e09 Add screenshots to docs 2021-08-18 19:49:53 -04:00
13e309b450 Fix colours of network status elements 2021-08-18 19:41:53 -04:00
7ecc6a2635 Bump version to 0.9.31 2021-07-30 12:08:12 -04:00
73e8149cb0 Remove explicit image-features from rbd cmd
This should be managed in ceph.conf with the `rbd default
features` configuration option instead, and thus can be tailored to the
underlying OS version.
2021-07-30 11:33:59 -04:00
4a7246b8c0 Ensure RBD resize has bytes appended
If this isn't, the resize will be interpreted as a MB value and result
in an absurdly big volume instead. This is the same consistency
validation that occurs on add.
2021-07-30 11:25:13 -04:00
c49351469b Revert "Ensure consistent sizing of volumes"
This reverts commit dc03e95bbf.
2021-07-29 15:30:00 -04:00
dc03e95bbf Ensure consistent sizing of volumes
Convert from human to bytes, then to megabytes and always pass this to
the RBD command. This ensures consistency regardless of what is actually
passed by the user.
2021-07-29 15:14:25 -04:00
c460aa051a Add missing floppy RASD type for compat 2021-07-27 16:32:32 -04:00
3ab6365a53 Adjust receive output to show proper source 2021-07-22 15:43:08 -04:00
32613ff119 Remove obsolete Suggests lines from control 2021-07-20 00:35:21 -04:00
2a99a27feb Bump version to 0.9.30 2021-07-20 00:01:45 -04:00
45f23c12ea Remove logs from schema validation
These are managed entirely by the logging subsystem not by the schema
handler due to catch-22's.
2021-07-20 00:00:37 -04:00
fa1d93e933 Bump version to 0.9.29 2021-07-19 16:55:41 -04:00
b14bc7e3a3 Add retry to log writes 2021-07-19 13:11:28 -04:00
4d6842f942 Don't bail out if write fails, keep retrying 2021-07-19 13:09:36 -04:00
6ead21a308 Handle cleanup from a failure properly 2021-07-19 12:39:13 -04:00
b7c8c2ee3d Fix handling of this_node and d_domain in cleanup 2021-07-19 12:36:35 -04:00
d48f58930b Use harder exits and add cleanup termination 2021-07-19 12:27:16 -04:00
7c36388c8f Add post-networking delay and adjust daemon delay 2021-07-19 12:23:45 -04:00
e9df043c0a Ensure ZK logging does not block startup 2021-07-19 12:19:59 -04:00
71e4d0b32a Bump version to 0.9.28 2021-07-19 09:29:34 -04:00
f16bad4691 Revamp confirmation options for vm modify
Before, "-y"/"--yes" only confirmed the reboot portion. Instead, modify
this to confirm both the diff portion and the restart portion, and add
separate flags to bypass one or the other independently, ensuring the
administrator has lots of flexibility. UNSAFE mode implies "-y" so both
would be auto-confirmed if that option is set.
2021-07-19 00:25:43 -04:00
15d92c483f Bump version to 0.9.27 2021-07-19 00:03:40 -04:00
7dd17e71e7 Fix bug with VM editing with file
Current config is needed for the diff but it was in a conditional.
2021-07-19 00:02:19 -04:00
19 changed files with 300 additions and 113 deletions

View File

@@ -1 +1 @@
-0.9.26
+0.9.32

View File

@@ -40,8 +40,51 @@ The core node and API daemons, as well as the CLI API client, are written in Pyt
 To get started with PVC, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page for general information about the project, and the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) page for details on configuring your first cluster.
 
+## Screenshots
+
+While PVC's API and internals aren't very screenshot-worthy, here is some example output of the CLI tool.
+
+<p><img alt="Node listing" src="docs/images/pvc-nodes.png"/><br/><i>Listing the nodes in a cluster</i></p>
+
+<p><img alt="Network listing" src="docs/images/pvc-networks.png"/><br/><i>Listing the networks in a cluster, showing 3 bridged and 1 IPv4-only managed networks</i></p>
+
+<p><img alt="VM listing and migration" src="docs/images/pvc-migration.png"/><br/><i>Listing a limited set of VMs and migrating one with status updates</i></p>
+
+<p><img alt="Node logs" src="docs/images/pvc-nodelog.png"/><br/><i>Viewing the logs of a node (keepalives and VM [un]migration)</i></p>
+
 ## Changelog
 
+#### v0.9.32
+
+* [CLI Client] Fixes some incorrect colours in network lists
+* [Documentation] Adds documentation screenshots of CLI client
+* [Node Daemon] Fixes a bug if VM stats gathering fails
+
+#### v0.9.31
+
+* [Packages] Cleans up obsolete Suggests lines
+* [Node Daemon] Adjusts log text of VM migrations to show the correct source node
+* [API Daemon] Adjusts the OVA importer to support floppy RASD types for compatability
+* [API Daemon] Ensures that volume resize commands without a suffix get B appended
+* [API Daemon] Removes the explicit setting of image-features in PVC; defaulting to the limited set has been moved to the ceph.conf configuration on nodes via PVC Ansible
+
+#### v0.9.30
+
+* [Node Daemon] Fixes bug with schema validation
+
+#### v0.9.29
+
+* [Node Daemon] Corrects numerous bugs with node logging framework
+
+#### v0.9.28
+
+* [CLI Client] Revamp confirmation options for "vm modify" command
+
+#### v0.9.27
+
+* [CLI Client] Fixes a bug with vm modify command when passed a file
+
 #### v0.9.26
 
 * [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures

View File

@@ -25,7 +25,7 @@ import yaml
 from distutils.util import strtobool as dustrtobool
 
 # Daemon version
-version = '0.9.26'
+version = '0.9.32'
 
 # API version
 API_VERSION = 1.0

View File

@@ -414,6 +414,7 @@ class OVFParser(object):
         "5": "ide-controller",
         "6": "scsi-controller",
         "10": "ethernet-adapter",
+        "14": "floppy",
         "15": "cdrom",
         "17": "disk",
         "20": "other-storage-device",

View File

@@ -491,14 +491,10 @@ def net_sriov_vf_info(config, node, vf):
 # Output display functions
 #
 def getColour(value):
-    if value in ['True', "start"]:
-        return ansiprint.green()
-    elif value in ["restart", "shutdown"]:
-        return ansiprint.yellow()
-    elif value in ["stop", "fail"]:
-        return ansiprint.red()
-    else:
+    if value in ["False", "None"]:
         return ansiprint.blue()
+    else:
+        return ansiprint.green()
 
 
 def getOutputColours(network_information):
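
Per the diff, network list values are now coloured on a simple rule: only the literal strings "False" and "None" render blue, and everything else renders green, instead of falling through the VM-state colour map. A self-contained sketch of the new logic (colour names stand in for the ansiprint escape-code helpers used in the real client, and the sample values are hypothetical):

```python
# Reimplementation of the new getColour() logic for illustration only
def get_colour_name(value):
    if value in ["False", "None"]:
        return "blue"
    else:
        return "green"


assert get_colour_name("False") == "blue"
assert get_colour_name("True") == "green"
assert get_colour_name("10.0.0.0/24") == "green"  # previously fell through to blue
```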

View File

@@ -764,9 +764,19 @@ def vm_meta(domain, node_limit, node_selector, node_autostart, migration_method,
     help='Immediately restart VM to apply new config.'
 )
 @click.option(
-    '-y', '--yes', 'confirm_flag',
+    '-d', '--confirm-diff', 'confirm_diff_flag',
     is_flag=True, default=False,
-    help='Confirm the restart'
+    help='Confirm the diff.'
+)
+@click.option(
+    '-c', '--confirm-restart', 'confirm_restart_flag',
+    is_flag=True, default=False,
+    help='Confirm the restart.'
+)
+@click.option(
+    '-y', '--yes', 'confirm_all_flag',
+    is_flag=True, default=False,
+    help='Confirm the diff and the restart.'
 )
 @click.argument(
     'domain'
@@ -774,7 +784,7 @@ def vm_meta(domain, node_limit, node_selector, node_autostart, migration_method,
 @click.argument(
     'cfgfile', type=click.File(), default=None, required=False
 )
-def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
+def vm_modify(domain, cfgfile, editor, restart, confirm_diff_flag, confirm_restart_flag, confirm_all_flag):
     """
     Modify existing virtual machine DOMAIN, either in-editor or with replacement CONFIG. DOMAIN may be a UUID or name.
     """
@@ -788,12 +798,12 @@ def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
     dom_name = vm_information.get('name')
 
+    # Grab the current config
+    current_vm_cfg_raw = vm_information.get('xml')
+    xml_data = etree.fromstring(current_vm_cfg_raw)
+    current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8').strip()
+
     if editor is True:
-        # Grab the current config
-        current_vm_cfg_raw = vm_information.get('xml')
-        xml_data = etree.fromstring(current_vm_cfg_raw)
-        current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8').strip()
-
         new_vm_cfgfile = click.edit(text=current_vm_cfgfile, require_save=True, extension='.xml')
         if new_vm_cfgfile is None:
             click.echo('Aborting with no modifications.')
@@ -831,9 +841,10 @@ def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
     except Exception as e:
         cleanup(False, 'Error: XML is malformed or invalid: {}'.format(e))
 
-    click.confirm('Write modifications to cluster?', abort=True)
+    if not confirm_diff_flag and not confirm_all_flag and not config['unsafe']:
+        click.confirm('Write modifications to cluster?', abort=True)
 
-    if restart and not confirm_flag and not config['unsafe']:
+    if restart and not confirm_restart_flag and not confirm_all_flag and not config['unsafe']:
         try:
             click.confirm('Restart VM {}'.format(domain), prompt_suffix='? ', abort=True)
         except Exception:
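
As a usage sketch (the VM name and config file are hypothetical; the flag names are those defined in the diff above, and UNSAFE mode implies "--yes"), the two prompts can now be bypassed independently or together:

```
# Auto-confirm writing the modified config, but still prompt before a restart
pvc vm modify --confirm-diff testvm new-config.xml

# Auto-confirm a requested restart, but still prompt before writing the diff
pvc vm modify --confirm-restart testvm new-config.xml

# Auto-confirm both the diff and the restart
pvc vm modify --yes testvm new-config.xml
```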

View File

@@ -2,7 +2,7 @@ from setuptools import setup
 
 setup(
     name='pvc',
-    version='0.9.26',
+    version='0.9.32',
     packages=['pvc', 'pvc.cli_lib'],
     install_requires=[
         'Click',

View File

@@ -491,7 +491,7 @@ def add_volume(zkhandler, pool, name, size):
         size = '{}B'.format(size)
 
     # 2. Create the volume
-    retcode, stdout, stderr = common.run_os_command('rbd create --size {} --image-feature layering,exclusive-lock {}/{}'.format(size, pool, name))
+    retcode, stdout, stderr = common.run_os_command('rbd create --size {} {}/{}'.format(size, pool, name))
     if retcode:
         return False, 'ERROR: Failed to create RBD volume "{}": {}'.format(name, stderr)
@@ -536,6 +536,10 @@ def resize_volume(zkhandler, pool, name, size):
     if not verifyVolume(zkhandler, pool, name):
         return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
 
+    # Add 'B' if the volume is in bytes
+    if re.match(r'^[0-9]+$', size):
+        size = '{}B'.format(size)
+
     # 1. Resize the volume
     retcode, stdout, stderr = common.run_os_command('rbd resize --size {} {}/{}'.format(size, pool, name))
     if retcode:
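
With the explicit `--image-feature` flag removed from `rbd create`, the feature set applied to new volumes is expected to come from the cluster's own configuration, as the 0.9.31 commit message describes (the `rbd default features` option in ceph.conf). A minimal sketch of what that might look like; the section and the feature list shown here are assumptions, and on real deployments this file is managed by PVC Ansible:

```ini
[global]
# Features applied to newly created RBD images when no --image-feature is given;
# tailor this to what the node OS and krbd version support.
rbd default features = layering, exclusive-lock
```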

View File

@@ -23,6 +23,7 @@ from collections import deque
 from threading import Thread
 from queue import Queue
 from datetime import datetime
+from time import sleep
 
 from daemon_lib.zkhandler import ZKHandler
@@ -83,7 +84,8 @@ class Logger(object):
         self.last_prompt = ''
 
         if self.config['zookeeper_logging']:
-            self.zookeeper_logger = ZookeeperLogger(config)
+            self.zookeeper_queue = Queue()
+            self.zookeeper_logger = ZookeeperLogger(self.config, self.zookeeper_queue)
             self.zookeeper_logger.start()
 
     # Provide a hup function to close and reopen the writer
@@ -96,9 +98,15 @@ class Logger(object):
         if self.config['file_logging']:
             self.writer.close()
         if self.config['zookeeper_logging']:
-            self.out("Waiting for Zookeeper message queue to drain", state='s')
-            while not self.zookeeper_logger.queue.empty():
-                pass
+            self.out("Waiting 15s for Zookeeper message queue to drain", state='s')
+
+            tick_count = 0
+            while not self.zookeeper_queue.empty():
+                sleep(0.5)
+                tick_count += 1
+                if tick_count > 30:
+                    break
+
             self.zookeeper_logger.stop()
             self.zookeeper_logger.join()
@@ -145,7 +153,7 @@ class Logger(object):
 
         # Log to Zookeeper
         if self.config['zookeeper_logging']:
-            self.zookeeper_logger.queue.put(message)
+            self.zookeeper_queue.put(message)
 
         # Set last message variables
         self.last_colour = colour
@@ -157,19 +165,14 @@ class ZookeeperLogger(Thread):
     Defines a threaded writer for Zookeeper locks. Threading prevents the blocking of other
     daemon events while the records are written. They will be eventually-consistent
     """
-    def __init__(self, config):
+    def __init__(self, config, zookeeper_queue):
         self.config = config
         self.node = self.config['node']
         self.max_lines = self.config['node_log_lines']
-        self.queue = Queue()
-        self.zkhandler = None
-        self.start_zkhandler()
-        # Ensure the root keys for this are instantiated
-        self.zkhandler.write([
-            ('base.logs', ''),
-            (('logs', self.node), '')
-        ])
+        self.zookeeper_queue = zookeeper_queue
+        self.connected = False
         self.running = False
+        self.zkhandler = None
         Thread.__init__(self, args=(), kwargs=None)
 
     def start_zkhandler(self):
@@ -179,10 +182,29 @@
             self.zkhandler.disconnect()
         except Exception:
             pass
-        self.zkhandler = ZKHandler(self.config, logger=None)
-        self.zkhandler.connect(persistent=True)
+
+        while True:
+            try:
+                self.zkhandler = ZKHandler(self.config, logger=None)
+                self.zkhandler.connect(persistent=True)
+                break
+            except Exception:
+                sleep(0.5)
+                continue
+
+        self.connected = True
+
+        # Ensure the root keys for this are instantiated
+        self.zkhandler.write([
+            ('base.logs', ''),
+            (('logs', self.node), '')
+        ])
 
     def run(self):
+        while not self.connected:
+            self.start_zkhandler()
+            sleep(1)
+
         self.running = True
         # Get the logs that are currently in Zookeeper and populate our deque
         raw_logs = self.zkhandler.read(('logs.messages', self.node))
@@ -192,7 +214,7 @@
         while self.running:
             # Get a new message
             try:
-                message = self.queue.get(timeout=1)
+                message = self.zookeeper_queue.get(timeout=1)
                 if not message:
                     continue
             except Exception:
@@ -205,8 +227,21 @@
                 date = ''
             # Add the message to the deque
             logs.append(f'{date}{message}')
-            # Write the updated messages into Zookeeper
-            self.zkhandler.write([(('logs.messages', self.node), '\n'.join(logs))])
+
+            tick_count = 0
+            while True:
+                try:
+                    # Write the updated messages into Zookeeper
+                    self.zkhandler.write([(('logs.messages', self.node), '\n'.join(logs))])
+                    break
+                except Exception:
+                    # The write failed (connection loss, etc.) so retry for 15 seconds
+                    sleep(0.5)
+                    tick_count += 1
+                    if tick_count > 30:
+                        break
+                    else:
+                        continue
         return
 
     def stop(self):
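
Both new loops above implement the same bounded retry: poll every 0.5 s for at most 30 ticks, roughly 15 seconds, then give up rather than block the writer thread or daemon shutdown forever. A standalone sketch of the pattern (the `attempt` callable is hypothetical):

```python
from time import sleep


def retry_bounded(attempt, tick=0.5, max_ticks=30):
    """Call attempt() until it succeeds or about tick*max_ticks seconds elapse."""
    for _ in range(max_ticks):
        try:
            attempt()
            return True
        except Exception:
            sleep(tick)
    return False
```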

View File

@@ -777,7 +777,7 @@ class ZKSchema(object):
                 logger.out(f'Key not found: {self.path(kpath)}', state='w')
                 result = False
 
-        for elem in ['logs', 'node', 'domain', 'network', 'osd', 'pool']:
+        for elem in ['node', 'domain', 'network', 'osd', 'pool']:
             # First read all the subelements of the key class
             for child in zkhandler.zk_conn.get_children(self.path(f'base.{elem}')):
                 # For each key in the schema for that particular elem
@@ -856,7 +856,7 @@ class ZKSchema(object):
                 data = ''
                 zkhandler.zk_conn.create(self.path(kpath), data.encode(zkhandler.encoding))
 
-        for elem in ['logs', 'node', 'domain', 'network', 'osd', 'pool']:
+        for elem in ['node', 'domain', 'network', 'osd', 'pool']:
             # First read all the subelements of the key class
             for child in zkhandler.zk_conn.get_children(self.path(f'base.{elem}')):
                 # For each key in the schema for that particular elem

debian/changelog
View File

@@ -1,3 +1,45 @@
+pvc (0.9.32-0) unstable; urgency=high
+
+  * [CLI Client] Fixes some incorrect colours in network lists
+  * [Documentation] Adds documentation screenshots of CLI client
+  * [Node Daemon] Fixes a bug if VM stats gathering fails
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Thu, 19 Aug 2021 12:37:58 -0400
+
+pvc (0.9.31-0) unstable; urgency=high
+
+  * [Packages] Cleans up obsolete Suggests lines
+  * [Node Daemon] Adjusts log text of VM migrations to show the correct source node
+  * [API Daemon] Adjusts the OVA importer to support floppy RASD types for compatability
+  * [API Daemon] Ensures that volume resize commands without a suffix get B appended
+  * [API Daemon] Removes the explicit setting of image-features in PVC; defaulting to the limited set has been moved to the ceph.conf configuration on nodes via PVC Ansible
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Fri, 30 Jul 2021 12:08:12 -0400
+
+pvc (0.9.30-0) unstable; urgency=high
+
+  * [Node Daemon] Fixes bug with schema validation
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Tue, 20 Jul 2021 00:01:45 -0400
+
+pvc (0.9.29-0) unstable; urgency=high
+
+  * [Node Daemon] Corrects numerous bugs with node logging framework
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Mon, 19 Jul 2021 16:55:41 -0400
+
+pvc (0.9.28-0) unstable; urgency=high
+
+  * [CLI Client] Revamp confirmation options for "vm modify" command
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Mon, 19 Jul 2021 09:29:34 -0400
+
+pvc (0.9.27-0) unstable; urgency=high
+
+  * [CLI Client] Fixes a bug with vm modify command when passed a file
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Mon, 19 Jul 2021 00:03:40 -0400
+
 pvc (0.9.26-0) unstable; urgency=high
 
   * [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures

debian/control
View File

@@ -9,7 +9,6 @@ X-Python3-Version: >= 3.2
 Package: pvc-daemon-node
 Architecture: all
 Depends: systemd, pvc-daemon-common, python3-kazoo, python3-psutil, python3-apscheduler, python3-libvirt, python3-psycopg2, python3-dnspython, python3-yaml, python3-distutils, python3-rados, python3-gevent, ipmitool, libvirt-daemon-system, arping, vlan, bridge-utils, dnsmasq, nftables, pdns-server, pdns-backend-pgsql
-Suggests: pvc-client-api, pvc-client-cli
 Description: Parallel Virtual Cluster node daemon (Python 3)
  A KVM/Zookeeper/Ceph-based VM and private cloud manager
  .

New binary image files (contents not shown in this view):

Two new image files, 88 KiB and 41 KiB (file names not preserved in this view)
docs/images/pvc-nodelog.png: new file, 300 KiB
docs/images/pvc-nodes.png: new file, 42 KiB

View File

@@ -40,8 +40,51 @@ The core node and API daemons, as well as the CLI API client, are written in Pyt
 To get started with PVC, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page for general information about the project, and the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) page for details on configuring your first cluster.
 
+## Screenshots
+
+While PVC's API and internals aren't very screenshot-worthy, here is some example output of the CLI tool.
+
+<p><img alt="Node listing" src="images/pvc-nodes.png"/><br/><i>Listing the nodes in a cluster</i></p>
+
+<p><img alt="Network listing" src="images/pvc-networks.png"/><br/><i>Listing the networks in a cluster, showing 3 bridged and 1 IPv4-only managed networks</i></p>
+
+<p><img alt="VM listing and migration" src="images/pvc-migration.png"/><br/><i>Listing a limited set of VMs and migrating one with status updates</i></p>
+
+<p><img alt="Node logs" src="images/pvc-nodelog.png"/><br/><i>Viewing the logs of a node (keepalives and VM [un]migration)</i></p>
+
 ## Changelog
 
+#### v0.9.32
+
+* [CLI Client] Fixes some incorrect colours in network lists
+* [Documentation] Adds documentation screenshots of CLI client
+* [Node Daemon] Fixes a bug if VM stats gathering fails
+
+#### v0.9.31
+
+* [Packages] Cleans up obsolete Suggests lines
+* [Node Daemon] Adjusts log text of VM migrations to show the correct source node
+* [API Daemon] Adjusts the OVA importer to support floppy RASD types for compatability
+* [API Daemon] Ensures that volume resize commands without a suffix get B appended
+* [API Daemon] Removes the explicit setting of image-features in PVC; defaulting to the limited set has been moved to the ceph.conf configuration on nodes via PVC Ansible
+
+#### v0.9.30
+
+* [Node Daemon] Fixes bug with schema validation
+
+#### v0.9.29
+
+* [Node Daemon] Corrects numerous bugs with node logging framework
+
+#### v0.9.28
+
+* [CLI Client] Revamp confirmation options for "vm modify" command
+
+#### v0.9.27
+
+* [CLI Client] Fixes a bug with vm modify command when passed a file
+
 #### v0.9.26
 
 * [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures

View File

@@ -56,7 +56,7 @@ import pvcnoded.CephInstance as CephInstance
 import pvcnoded.MetadataAPIInstance as MetadataAPIInstance
 
 # Version string for startup output
-version = '0.9.26'
+version = '0.9.32'
 
 ###############################################################################
 # PVCD - node daemon startup program
@@ -76,8 +76,11 @@ version = '0.9.26'
 # Daemon functions
 ###############################################################################
 
-# Ensure the update_timer is None until it's set for real
+# Ensure update_timer, this_node, and d_domain are None until they're set for real
+# Ensures cleanup() doesn't fail due to these items not being created yet
 update_timer = None
+this_node = None
+d_domain = None
 
 # Create timer to update this node in Zookeeper
@@ -110,7 +113,7 @@ try:
     pvcnoded_config_file = os.environ['PVCD_CONFIG_FILE']
 except Exception:
     print('ERROR: The "PVCD_CONFIG_FILE" environment variable must be set before starting pvcnoded.')
-    exit(1)
+    os._exit(1)
 
 # Set local hostname and domain variables
 myfqdn = gethostname()
@@ -142,7 +145,7 @@ def readConfig(pvcnoded_config_file, myhostname):
         o_config = yaml.load(cfgfile, Loader=yaml.SafeLoader)
     except Exception as e:
         print('ERROR: Failed to parse configuration file: {}'.format(e))
-        exit(1)
+        os._exit(1)
 
     # Handle the basic config (hypervisor-only)
     try:
@@ -179,7 +182,7 @@ def readConfig(pvcnoded_config_file, myhostname):
         }
     except Exception as e:
         print('ERROR: Failed to load configuration: {}'.format(e))
-        exit(1)
+        cleanup(failure=True)
 
     config = config_general
 
     # Handle debugging config
@@ -236,7 +239,7 @@ def readConfig(pvcnoded_config_file, myhostname):
     except Exception as e:
         print('ERROR: Failed to load configuration: {}'.format(e))
-        exit(1)
+        cleanup(failure=True)
 
     config = {**config, **config_networking}
 
     # Create the by-id address entries
@@ -250,7 +253,7 @@ def readConfig(pvcnoded_config_file, myhostname):
             network = ip_network(config[network_key])
         except Exception:
             print('ERROR: Network address {} for {} is not valid!'.format(config[network_key], network_key))
-            exit(1)
+            cleanup(failure=True)
 
         # If we should be autoselected
         if config[address_key] == 'by-id':
@@ -270,7 +273,7 @@ def readConfig(pvcnoded_config_file, myhostname):
                 raise
         except Exception:
             print('ERROR: Floating address {} for {} is not valid!'.format(config[floating_key], floating_key))
-            exit(1)
+            cleanup(failure=True)
 
     # Handle the storage config
     if config['enable_storage']:
@@ -281,7 +284,7 @@ def readConfig(pvcnoded_config_file, myhostname):
         }
     except Exception as e:
         print('ERROR: Failed to load configuration: {}'.format(e))
-        exit(1)
+        cleanup(failure=True)
 
     config = {**config, **config_storage}
 
     # Handle an empty ipmi_hostname
@@ -488,6 +491,9 @@ if enable_networking:
     else:
         common.run_os_command('ip route add default via {} dev {}'.format(upstream_gateway, 'brupstream'))
 
+    logger.out('Waiting 3s for networking to come up', state='s')
+    time.sleep(3)
+
 ###############################################################################
 # PHASE 2c - Prepare sysctl for pvcnoded
 ###############################################################################
@@ -559,8 +565,8 @@ if enable_storage:
         logger.out('Starting Ceph manager daemon', state='i')
         common.run_os_command('systemctl start ceph-mgr@{}'.format(myhostname))
 
-    logger.out('Waiting 5s for daemons to start', state='s')
-    time.sleep(5)
+    logger.out('Waiting 3s for daemons to start', state='s')
+    time.sleep(3)
 
 ###############################################################################
 # PHASE 4 - Attempt to connect to the coordinators and start zookeeper client
@@ -575,7 +581,7 @@ try:
     zkhandler.connect(persistent=True)
 except Exception as e:
     logger.out('ERROR: Failed to connect to Zookeeper cluster: {}'.format(e), state='e')
-    exit(1)
+    os._exit(1)
 
 logger.out('Validating Zookeeper schema', state='i')
@@ -696,7 +702,7 @@ else:
 
 # Cleanup function
-def cleanup():
+def cleanup(failure=False):
     global logger, zkhandler, update_timer, d_domain
 
     logger.out('Terminating pvcnoded and cleaning up', state='s')
@@ -708,19 +714,19 @@ def cleanup():
     # Waiting for any flushes to complete
     logger.out('Waiting for any active flushes', state='s')
-    while this_node.flush_thread is not None:
-        time.sleep(0.5)
+    if this_node is not None:
+        while this_node.flush_thread is not None:
+            time.sleep(0.5)
 
     # Stop console logging on all VMs
     logger.out('Stopping domain console watchers', state='s')
-    for domain in d_domain:
-        if d_domain[domain].getnode() == myhostname:
-            try:
-                d_domain[domain].console_log_instance.stop()
-            except NameError:
-                pass
-            except AttributeError:
-                pass
+    if d_domain is not None:
+        for domain in d_domain:
+            if d_domain[domain].getnode() == myhostname:
+                try:
+                    d_domain[domain].console_log_instance.stop()
+                except Exception:
+                    pass
 
     # Force into secondary coordinator state if needed
     try:
@@ -737,13 +743,11 @@ def cleanup():
     # Stop keepalive thread
     try:
         stopKeepaliveTimer()
-    except NameError:
-        pass
-    except AttributeError:
-        pass
 
-    logger.out('Performing final keepalive update', state='s')
-    node_keepalive()
+        logger.out('Performing final keepalive update', state='s')
+        node_keepalive()
+    except Exception:
+        pass
 
     # Set stop state in Zookeeper
     zkhandler.write([
@@ -763,12 +767,17 @@ def cleanup():
     logger.out('Terminated pvc daemon', state='s')
     logger.terminate()
 
-    os._exit(0)
+    if failure:
+        retcode = 1
+    else:
+        retcode = 0
+
+    os._exit(retcode)
 
 
 # Termination function
 def term(signum='', frame=''):
-    cleanup()
+    cleanup(failure=False)
 
 
 # Hangup (logrotate) function
@@ -868,7 +877,7 @@ if enable_hypervisor:
         lv_conn.close()
     except Exception as e:
         logger.out('ERROR: Failed to connect to Libvirt daemon: {}'.format(e), state='e')
-        exit(1)
+        cleanup(failure=True)
 
 ###############################################################################
 # PHASE 7c - Ensure NFT is running on the local host
@@ -1666,11 +1675,7 @@ def collect_vm_stats(queue):
             domain_memory_stats = domain.memoryStats()
             domain_cpu_stats = domain.getCPUStats(True)[0]
         except Exception as e:
-            if debug:
-                try:
-                    logger.out("Failed getting VM information for {}: {}".format(domain.name(), e), state='d', prefix='vm-thread')
-                except Exception:
-                    pass
+            logger.out("Failed getting VM information for {}: {}".format(domain.name(), e), state='w', prefix='vm-thread')
             continue
 
         # Ensure VM is present in the domain_list
@@ -1680,42 +1685,50 @@ def collect_vm_stats(queue):
         if debug:
             logger.out("Getting disk statistics for VM {}".format(domain_name), state='d', prefix='vm-thread')
         domain_disk_stats = []
-        for disk in tree.findall('devices/disk'):
-            disk_name = disk.find('source').get('name')
-            if not disk_name:
-                disk_name = disk.find('source').get('file')
-            disk_stats = domain.blockStats(disk.find('target').get('dev'))
-            domain_disk_stats.append({
-                "name": disk_name,
-                "rd_req": disk_stats[0],
-                "rd_bytes": disk_stats[1],
-                "wr_req": disk_stats[2],
-                "wr_bytes": disk_stats[3],
-                "err": disk_stats[4]
-            })
+        try:
+            for disk in tree.findall('devices/disk'):
+                disk_name = disk.find('source').get('name')
+                if not disk_name:
+                    disk_name = disk.find('source').get('file')
+                disk_stats = domain.blockStats(disk.find('target').get('dev'))
+                domain_disk_stats.append({
+                    "name": disk_name,
+                    "rd_req": disk_stats[0],
+                    "rd_bytes": disk_stats[1],
+                    "wr_req": disk_stats[2],
+                    "wr_bytes": disk_stats[3],
+                    "err": disk_stats[4]
+                })
+        except Exception as e:
+            logger.out("Failed to get disk stats for VM {}: {}".format(domain_name, e), state='w', prefix='vm-thread')
+            continue
 
         if debug:
             logger.out("Getting network statistics for VM {}".format(domain_name), state='d', prefix='vm-thread')
         domain_network_stats = []
-        for interface in tree.findall('devices/interface'):
-            interface_type = interface.get('type')
-            if interface_type not in ['bridge']:
-                continue
-            interface_name = interface.find('target').get('dev')
-            interface_bridge = interface.find('source').get('bridge')
-            interface_stats = domain.interfaceStats(interface_name)
-            domain_network_stats.append({
-                "name": interface_name,
-                "bridge": interface_bridge,
-                "rd_bytes": interface_stats[0],
-                "rd_packets": interface_stats[1],
-                "rd_errors": interface_stats[2],
-                "rd_drops": interface_stats[3],
-                "wr_bytes": interface_stats[4],
-                "wr_packets": interface_stats[5],
-                "wr_errors": interface_stats[6],
-                "wr_drops": interface_stats[7]
-            })
+        try:
+            for interface in tree.findall('devices/interface'):
+                interface_type = interface.get('type')
+                if interface_type not in ['bridge']:
+                    continue
+                interface_name = interface.find('target').get('dev')
+                interface_bridge = interface.find('source').get('bridge')
+                interface_stats = domain.interfaceStats(interface_name)
+                domain_network_stats.append({
+                    "name": interface_name,
+                    "bridge": interface_bridge,
+                    "rd_bytes": interface_stats[0],
+                    "rd_packets": interface_stats[1],
+                    "rd_errors": interface_stats[2],
+                    "rd_drops": interface_stats[3],
+                    "wr_bytes": interface_stats[4],
+                    "wr_packets": interface_stats[5],
+                    "wr_errors": interface_stats[6],
+                    "wr_drops": interface_stats[7]
+                })
+        except Exception as e:
+            logger.out("Failed to get network stats for VM {}: {}".format(domain_name, e), state='w', prefix='vm-thread')
+            continue
 
         # Create the final dictionary
         domain_stats = {
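
The hunks above wrap each libvirt query in its own try/except so that one unresponsive or vanishing domain only skips that VM's entry instead of aborting the whole stats pass (the 0.9.32 "Fixes a bug if VM stats gathering fails" item). A minimal standalone sketch of the same defensive pattern using libvirt-python; the connection URI and return shape are assumptions, and the real daemon reuses an existing connection and pushes results onto a queue:

```python
import libvirt


def safe_vm_stats(uri='qemu:///system'):
    """Collect per-domain memory/CPU stats, skipping domains that error out."""
    conn = libvirt.open(uri)
    stats = {}
    try:
        for dom in conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE):
            try:
                # Any of these calls can raise if the domain shuts down mid-query
                stats[dom.name()] = {
                    'memory': dom.memoryStats(),
                    'cpu': dom.getCPUStats(True)[0],
                }
            except libvirt.libvirtError:
                # Skip this VM rather than failing the entire collection run
                continue
    finally:
        conn.close()
    return stats
```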

View File

@@ -635,7 +635,7 @@ class VMInstance(object):
         self.inreceive = True
 
-        self.logger.out('Receiving VM migration from node "{}"'.format(self.node), state='i', prefix='Domain {}'.format(self.domuuid))
+        self.logger.out('Receiving VM migration from node "{}"'.format(self.last_currentnode), state='i', prefix='Domain {}'.format(self.domuuid))
 
         # Short delay to ensure sender is in sync
         time.sleep(0.5)