parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	e514eed414	Re-add success log output during migration	2021-09-27 11:50:55 -04:00
Joshua M. Boniface	b81e70ec18	Fix missing character in log message	2021-09-27 00:49:43 -04:00
Joshua M. Boniface	c2a473ed8b	Simplify VM migration down to 3 steps Remove two superfluous synchronization steps which are not needed here, since the exclusive lock handles that situation anyways. Still does not fix the weird flush->unflush lock timeout bug, but is better worked-around now due to the cancelling of the other wait freeing this up and continuing.	2021-09-27 00:03:20 -04:00
Joshua M. Boniface	5355f6ff48	Work around synchronization lock issues Make the block on stage C only wait for 900 seconds (15 minutes) to prevent indefinite blocking. The issue comes if a VM is being received, and the current unflush is cancelled for a flush. When this happens, this lock acquisition seems to block for no obvious reason, and no other changes seem to affect it. This is certainly some sort of locking bug within Kazoo but I can't diagnose it as-is. Leave a TODO to look into this again in the future.	2021-09-26 23:26:21 -04:00
Joshua M. Boniface	bf7823deb5	Improve log messages during VM migration	2021-09-26 23:15:38 -04:00
Joshua M. Boniface	8ba371723e	Use event to non-block wait and fix inf wait	2021-09-26 22:55:39 -04:00
Joshua M. Boniface	e10ac52116	Track status of VM state thread	2021-09-26 22:55:21 -04:00
Joshua M. Boniface	341073521b	Simplify locking process for VM migration Rather than using a cumbersome and overly complex ping-pong of read and write locks, instead move to a much simpler process using exclusive locks. Describing the process in ASCII or narrative is cumbersome, but the process ping-pongs via a set of exclusive locks and wait timers, so that the two sides are able to synchronize via blocking the exclusive lock. The end result is a much more streamlined migration (takes about half the time all things considered) which should be less error-prone.	2021-09-26 22:08:07 -04:00
Joshua M. Boniface	16c38da5ef	Fix failure to connect to libvirt in keepalive This should be caught and abort the thread rather than failing and holding up keepalives.	2021-09-26 20:42:01 -04:00
Joshua M. Boniface	c8134d3a1c	Fix several bugs in fence handling 1. Output from ipmitool was not being stripped, and stray newlines were throwing off the comparisons. Fixes this. 2. Several stages were lacking meaningful messages. Adds these in so the output is more clear about what is going on. 3. Reduce the sleep time after a fence to just 1x the keepalive_interval, rather than 2x, because this seemed excessively long even for slow IPMI interfaces, especially since we're checking the power state now anyways. 4. Set the node daemon state to an explicit 'fenced' state after a successful fence to indicate to users that the node was indeed fenced successfully and not still 'dead'.	2021-09-26 20:07:30 -04:00
Joshua M. Boniface	9f41373324	Ensure pvc-flush is after network-online	2021-09-26 17:40:42 -04:00
Joshua M. Boniface	8e62d5b30b	Fix typo in log message	2021-09-26 03:35:30 -04:00
Joshua M. Boniface	7a8eee244a	Tweak CLI helptext around OSD actions Adds some more detail about OSD commands and their values.	2021-09-26 01:29:23 -04:00
Joshua M. Boniface	7df5b8e52e	Fix typo in sgdisk command options	2021-09-26 00:59:05 -04:00
Joshua M. Boniface	6f96219023	Use re.search instead of re.match Required since we're not matching the start of the string.	2021-09-26 00:55:29 -04:00
Joshua M. Boniface	51967e164b	Raise basic exceptions in CephInstance Avoids "no exception to reraise" errors on failures.	2021-09-26 00:50:10 -04:00
Joshua M. Boniface	7a3a44d47c	Fix OSD creation for partition paths and fix gdisk The previous implementation did not work with /dev/nvme devices or any /dev/disk/by-* devices due to some logical failures in the partition naming scheme, so fix these, and be explicit about what is supported in the PVC CLI command output. The 'echo | gdisk' implementation of partition creation also did not work due to limitations of subprocess.run; instead, use sgdisk which allows these commands to be written out explicitly and is included in the same package as gdisk.	2021-09-26 00:12:28 -04:00
Joshua M. Boniface	44491dd988	Add support for configurable OSD DB ratios The default of 0.05 (5%) is likely ideal in the initial implementation, but allow this to be set explicitly for maximum flexibility in space-constrained or performance-critical use-cases.	2021-09-24 01:06:39 -04:00
Joshua M. Boniface	eba142f470	Bump version to 0.9.36	2021-09-23 14:01:38 -04:00
Joshua M. Boniface	6cef68d157	Add separate OSD DB device support Adds in three parts: 1. Create an API endpoint to create OSD DB volume groups on a device. Passed through to the node via the same command pipeline as creating/removing OSDs, and creates a volume group with a fixed name (osd-db). 2. Adds API support for specifying whether or not to use this DB volume group when creating a new OSD via the "ext_db" flag. Naming and sizing are fixed for simplicity and based on Ceph recommendations (5% of OSD size). The Zookeeper schema tracks the block device to use during removal. 3. Adds CLI support for the new and modified API endpoints, as well as displaying the block device and DB block device in the OSD list. While I debated supporting adding a DB device to an existing OSD, in practice this ended up being a very complex operation involving stopping the OSD and setting some options, so this is not supported; this can be specified during OSD creation only. Closes #142	2021-09-23 13:59:49 -04:00
Joshua M. Boniface	e8caf3369e	Move console watcher stop try up Could cause an exception if d_domain is not defined yet.	2021-09-22 16:02:04 -04:00
Joshua M. Boniface	3e3776a25b	Bump version to 0.9.35	2021-09-13 02:20:46 -04:00
Joshua M. Boniface	6e0d0e264e	Add memory and vCPU checks to VM define/modify Ensures that a VM won't: (a) Have provisioned more RAM than there is available on a given node. Due to memory overprovisioning, this is simply an "is the VM memory count more than the node count" check, and doesn't factor in free or used memory on a node, total cluster usage, etc. So if a node has 64GB total RAM, the VM limit is 64GB. It is up to an administrator to ensure sanity below that value. (b) Have provisioned more vCPUs than there are CPU cores on the node, minus 2 to account for hypervisor/storage processes. Will ensure there is no severe CPU contention caused by a single VM having more vCPUs than there are actual execution threads available. Closes #139	2021-09-13 01:51:21 -04:00
Joshua M. Boniface	1855d03a36	Add pool size check when resizing volumes Closes #140	2021-09-12 19:54:51 -04:00
Joshua M. Boniface	1a286dc8dd	Increase build-and-deploy sleep	2021-09-12 19:50:58 -04:00
Joshua M. Boniface	1b6d10e03a	Handle VM disk/network stats gathering exceptions	2021-09-12 19:41:07 -04:00
Joshua M. Boniface	73c96d1e93	Add VM device hot attach/detach support Adds a new API endpoint to support hot attach/detach of devices, and the corresponding client-side logic to use this endpoint when doing VM network/storage add/remove actions. The live attach is now the default behaviour for these types of additions and removals, and can be disabled if needed. Closes #141	2021-09-12 19:33:00 -04:00
Joshua M. Boniface	5841c98a59	Adjust lint script for newer linter	2021-09-12 15:40:38 -04:00
Joshua M. Boniface	bc6395c959	Don't crash cleanup if no this_node	2021-08-29 03:52:18 -04:00
Joshua M. Boniface	d582f87472	Change default node object state to flushed	2021-08-29 03:34:08 -04:00
Joshua M. Boniface	e9735113af	Bump version to 0.9.34	2021-08-24 16:15:25 -04:00
Joshua M. Boniface	722fd0a65d	Properly handle =-separated fsargs	2021-08-24 11:40:22 -04:00
Joshua M. Boniface	3b41beb0f3	Convert argument elements of task status to types	2021-08-23 14:28:12 -04:00
Joshua M. Boniface	d3392c0282	Fix typo in output message	2021-08-23 00:39:19 -04:00
Joshua M. Boniface	560c013e95	Bump version to 0.9.33	2021-08-21 03:28:48 -04:00
Joshua M. Boniface	384c6320ef	Avoid failing if no provisioner tasks	2021-08-21 03:25:16 -04:00
Joshua M. Boniface	445dec1c38	Ensure pycache files are removed on deb creation	2021-08-21 03:19:18 -04:00
Joshua M. Boniface	534c7cd7f0	Refactor pvcnoded to reduce Daemon.py size This branch commit refactors the pvcnoded component to better adhere to good programming practices. The previous Daemon.py was a massive file which contained almost 2000 lines of direct, root-level code which was directly imported. Not only was this poor practice, but this resulted in a nigh-unmaintainable file which was hard even for me to understand. This refactoring splits a large section of the code from Daemon.py into separate small modules and functions in the `util/` directory. This will hopefully make most of the functionality easy to find and modify without having to dig through a single large file. Further the existing subcomponents have been moved to the `objects/` directory which clearly separates them. Finally, the Daemon.py code has mostly been moved into a function, `entrypoint()`, which is then called from the `pvcnoded.py` stub. An additional item is that most format strings have been replaced by f-strings to make use of the Python 3.6 features in Daemon.py and the utility files.	2021-08-21 03:14:22 -04:00
Joshua M. Boniface	4014ef7714	Bump version to 0.9.32	2021-08-19 12:37:58 -04:00
Joshua M. Boniface	180f0445ac	Properly handle exceptions getting VM stats	2021-08-19 12:36:31 -04:00
Joshua Boniface	074664d4c1	Fix image dimensions and size	2021-08-18 19:51:55 -04:00
Joshua Boniface	418ac23d40	Add screenshots to docs	2021-08-18 19:49:53 -04:00
Joshua M. Boniface	13e309b450	Fix colours of network status elements	2021-08-18 19:41:53 -04:00
Joshua M. Boniface	7ecc6a2635	Bump version to 0.9.31 v0.9.31	2021-07-30 12:08:12 -04:00
Joshua M. Boniface	73e8149cb0	Remove explicit image-features from rbd cmd This should be managed in ceph.conf with the `rbd default features` configuration option instead, and thus can be tailored to the underlying OS version.	2021-07-30 11:33:59 -04:00
Joshua M. Boniface	4a7246b8c0	Ensure RBD resize has bytes appended If it isn't, the size will be interpreted as an MB value and result in an absurdly big volume instead. This is the same consistency validation that occurs on add.	2021-07-30 11:25:13 -04:00
Joshua M. Boniface	c49351469b	Revert "Ensure consistent sizing of volumes" This reverts commit `dc03e95bbf`.	2021-07-29 15:30:00 -04:00
Joshua M. Boniface	dc03e95bbf	Ensure consistent sizing of volumes Convert from human to bytes, then to megabytes and always pass this to the RBD command. This ensures consistency regardless of what is actually passed by the user.	2021-07-29 15:14:25 -04:00
Joshua M. Boniface	c460aa051a	Add missing floppy RASD type for compat	2021-07-27 16:32:32 -04:00
Joshua M. Boniface	3ab6365a53	Adjust receive output to show proper source	2021-07-22 15:43:08 -04:00
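
The limits described in commit 6e0d0e264e (VM memory no greater than the node's total memory, vCPUs no greater than the node's core count minus 2) can be sketched roughly as below. This is a minimal illustration, not the PVC implementation: the function name `check_vm_limits`, its parameters, and the message strings are all hypothetical.

```python
def check_vm_limits(vm_memory_mb, vm_vcpus, node_memory_mb, node_cpu_cores,
                    reserved_cores=2):
    """Return a list of limit violations; an empty list means the VM fits.

    Hypothetical helper: names and units are assumptions for illustration.
    """
    problems = []

    # Memory: compare against the node's total RAM only, ignoring free/used
    # memory, since memory may be overprovisioned.
    if vm_memory_mb > node_memory_mb:
        problems.append(
            f"VM memory {vm_memory_mb} MB exceeds node total {node_memory_mb} MB"
        )

    # vCPUs: leave `reserved_cores` for hypervisor/storage processes.
    vcpu_limit = node_cpu_cores - reserved_cores
    if vm_vcpus > vcpu_limit:
        problems.append(f"VM vCPUs {vm_vcpus} exceed limit of {vcpu_limit}")

    return problems


# A 64 GB, 16-core node accepts a 64 GB / 14-vCPU VM but rejects 15 vCPUs.
fits = check_vm_limits(65536, 14, 65536, 16)
too_many_vcpus = check_vm_limits(65536, 15, 65536, 16)
```

Returning a list of violations rather than a single boolean lets a caller report both problems at once.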


2451 Commits
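
For readers unfamiliar with the distinction behind commit 6f96219023 (`re.search` instead of `re.match`), the short sketch below shows the difference. The device path and pattern are hypothetical examples and are not taken from the PVC source.

```python
import re

# Hypothetical device path and pattern, for illustration only.
device = "/dev/nvme0n1p1"
pattern = r"nvme\d+n\d+"

# re.match only succeeds if the pattern matches at the start of the string,
# so it fails here because the string begins with "/dev/".
anchored = re.match(pattern, device)

# re.search scans the whole string and returns the first match found.
scanned = re.search(pattern, device)

print(anchored)          # None
print(scanned.group(0))  # nvme0n1
```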