2646 Commits

Author SHA1 Message Date
Joshua Boniface e7b6a3eac1 Implement SR-IOV PF and VF instances
Adds support for the node daemon managing SR-IOV PF and VF instances.

PFs are added to Zookeeper automatically based on the config at startup
during network configuration, and are otherwise completely static. PFs
are automatically removed from Zookeeper, along with all corresponding
VFs, should the PF phy device be removed from the configuration.

VFs are configured based on the (autocreated) VFs of each PF device,
added to Zookeeper, and then a new class instance, SRIOVVFInstance, is
used to watch them for configuration changes. This will enable the
runtime management of VF settings by the API. The set of keys ensures
that both configuration and details of the NIC can be tracked.

Most keys are self-explanatory, especially for PFs and the basic keys
for VFs. The configuration tree is also self-explanatory, being based
entirely on the options available in the `ip link set {dev} vf` command.

Two additional keys are also present: `used` and `used_by`, which will
track whether a VF is in use (a boolean) and which VM is using it. Since
the VM-side implementation will support both macvtap and direct
"hostdev" assignments, this will ensure that the state can be tracked on
both the VF and the VM side.
2021-06-17 01:33:03 -04:00
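Since the commit above says the VF configuration tree mirrors the options of `ip link set {dev} vf`, a minimal sketch of turning such a tree into a command line may help. The function name `build_vf_command` and the config key names (`vlan_id`, `vlan_qos`, `spoofchk`, `trust`, `query_rss`, `link_state`) are assumptions for illustration, not the keys used by the project; only the `ip link` option names themselves come from iproute2.

```python
def build_vf_command(pf, vf_index, config):
    """Build an `ip link set <pf> vf <n> ...` argument list from a config dict.

    The dict keys are hypothetical; the emitted option names are iproute2's.
    """
    cmd = ["ip", "link", "set", pf, "vf", str(vf_index)]

    # VLAN ID, optionally followed by a VLAN QoS priority.
    if "vlan_id" in config:
        cmd += ["vlan", str(config["vlan_id"])]
        if "vlan_qos" in config:
            cmd += ["qos", str(config["vlan_qos"])]

    # Boolean toggles map onto on/off arguments.
    for key in ("spoofchk", "trust", "query_rss"):
        if key in config:
            cmd += [key, "on" if config[key] else "off"]

    # Link state is one of: auto, enable, disable.
    if "link_state" in config:
        cmd += ["state", config["link_state"]]

    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, since it needs root and real hardware.
    print(" ".join(build_vf_command("ens1f0", 2, {"vlan_id": 100, "trust": True})))
```

Returning a list instead of a string keeps the result safe to pass to `subprocess.run()` without shell quoting.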
Joshua Boniface 0ad6d55dff Add initial SR-IOV support to node daemon
Adds configuration values for enabled flag and SR-IOV devices to the
configuration and sets up the initial SR-IOV configuration on daemon
startup (inserting the module, configuring the VF count, etc.).
2021-06-15 22:56:09 -04:00
Joshua Boniface eada5db5e4 Add diagram and info about invalid georedundancy 2021-06-15 10:20:42 -04:00
Joshua Boniface 164becd3ef Fix info and list matching 2021-06-15 02:32:34 -04:00
Joshua Boniface e4a65230a1 Just do the shutdown command itself 2021-06-15 02:32:14 -04:00
Joshua Boniface da48304d4a Avoid hackery in VNI list and support direct type 2021-06-15 00:31:13 -04:00
Joshua Boniface f540dd320b Allow VNI for "direct" type vNICs 2021-06-15 00:27:01 -04:00
Joshua Boniface 284c581845 Ensure shutdown migrations actually time out
Without this a VM that fails to respond to a shutdown will just spin
forever, blocking state changes.
2021-06-15 00:23:15 -04:00
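The timeout behaviour described in the commit above can be sketched as a bounded polling loop. This is an illustration under stated assumptions, not the daemon's actual code: `wait_for_shutdown`, the `is_running` callable, and the default timeout and interval values are all hypothetical.

```python
import time


def wait_for_shutdown(is_running, timeout=180.0, interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the VM stops or the timeout expires.

    Returns True if the VM stopped in time, False if the timeout was hit.
    `is_running` is a hypothetical callable reporting the VM's state.
    """
    deadline = clock() + timeout
    while is_running():
        if clock() >= deadline:
            # Give up instead of spinning forever; the caller decides what next.
            return False
        sleep(interval)
    return True


if __name__ == "__main__":
    # A VM that never responds to shutdown: the wait ends after the timeout.
    print(wait_for_shutdown(lambda: True, timeout=0.05, interval=0.01))
```

On a `False` result the caller could force the VM off or mark the migration as failed, so that state changes are no longer blocked.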
Joshua Boniface 7b85d5e3f3 Stop VM before removing 2021-06-14 21:44:17 -04:00
Joshua Boniface 23318524b9 Ensure validate writes a valid schema version 2021-06-14 21:27:37 -04:00
Joshua Boniface 5f11b3198b Fix base schema None issue in handler too 2021-06-14 21:13:40 -04:00
Joshua Boniface 953e46055a Fix issue with loading None version schema 2021-06-14 21:09:55 -04:00
Joshua Boniface 96f1d7df83 Fix bad quote 2021-06-14 20:36:28 -04:00
Joshua Boniface d2bcfe5cf7 Bump version to 0.9.20 2021-06-14 18:06:27 -04:00
Joshua Boniface ef1701b4c8 Handle an additional exception case 2021-06-14 17:15:40 -04:00
Joshua Boniface 08dc756549 Actually disable the pvcapid service
Prevents it from trying to start itself during updates or reboots on
non-primary coordinators.
2021-06-14 17:13:22 -04:00
Joshua Boniface 0a9c0c1ccb Use a nicer reload method on hot schema update
Instead of exiting and trusting systemd to restart us, leverage the
os.execv() call to reload the process in the current PID context.

Also improves the log messages so it's very clear what's going on.
2021-06-14 17:10:21 -04:00
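A minimal sketch of the `os.execv()` reload pattern named in the commit above, assuming the daemon is a Python script restarted with the same interpreter and arguments. The helper names `build_reexec_args` and `reload_in_place` and the log message text are hypothetical.

```python
import os
import sys


def build_reexec_args(executable=None, argv=None):
    """Return the (path, args) pair that os.execv() needs to restart this program."""
    executable = executable or sys.executable
    argv = list(sys.argv if argv is None else argv)
    # By convention args[0] is the program name; here that is the interpreter.
    return executable, [executable] + argv


def reload_in_place(logger=print):
    """Replace the running process image; the PID stays the same."""
    path, args = build_reexec_args()
    logger("Schema updated; re-executing daemon in the current PID context")
    # execv() skips normal interpreter shutdown, so flush buffered output first.
    sys.stdout.flush()
    sys.stderr.flush()
    os.execv(path, args)  # does not return on success
```

Because the PID is unchanged, the service manager sees no exit at all, which is the difference from the exit-and-restart approach.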
Joshua Boniface e34a7d4d2a Handle hot reloads properly
A hot reload isn't possible due to DataWatch and ChildrenWatch
constructs, so we instead need to terminate the daemon to "apply" the
schema update. Thus we use exit code 150 (Application defined in LSB)
and reorder some of the elements of the schema validation to ensure
things happen in the right order.
2021-06-14 12:52:43 -04:00
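The exit-to-apply approach in the commit above can be sketched as follows. Only the exit code 150 comes from the commit message; `EXIT_SCHEMA_RESTART`, `apply_schema_update`, and the `apply_update` callable are hypothetical names, and how the service manager reacts to the exit code is not stated in the source.

```python
import sys

# Exit code from the commit message; LSB leaves 150-199 to applications.
EXIT_SCHEMA_RESTART = 150


def apply_schema_update(apply_update, exit_func=sys.exit):
    """Apply the schema update first, then terminate with the dedicated code.

    `apply_update` is a hypothetical callable that performs the update.
    Doing the work before exiting mirrors the ordering concern in the commit.
    """
    apply_update()
    exit_func(EXIT_SCHEMA_RESTART)
```

One possible pairing, stated here as an assumption, is a systemd unit with `Restart=on-failure`, which restarts the service after any non-zero exit code.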
Joshua Boniface ddd3eeedda Remove needless literal_eval statements 2021-06-14 01:46:30 -04:00
Joshua Boniface 6fdc6674cf Fix grabbing existing version
The schema `version = ` now messes this up.
2021-06-14 01:40:10 -04:00
Joshua Boniface 78453a173c Add functional testing script
Since trying to unit test this monstrous program at this point is a
daunting task, create a functional testing script instead. It can
theoretically be run against any cluster with an appropriate "test"
provisioner profile, but I mostly just run it against my own.
2021-06-14 01:14:20 -04:00
Joshua Boniface 20c773413c Fix bug in snapshot rename 2021-06-14 00:55:26 -04:00
Joshua Boniface 49f4feb482 Fix typo bug in key rename 2021-06-14 00:51:45 -04:00
Joshua Boniface a2205bec13 Allow VM dump to file directly
Similar to the cluster backup task.
2021-06-13 22:32:54 -04:00
Joshua Boniface 7727221b59 Correctly use the Click file in backups 2021-06-13 22:17:35 -04:00
Joshua Boniface 30a160d5ff Fix invalid type_key 2021-06-13 21:20:10 -04:00
Joshua Boniface 1cbc66dccf Fix bugs in lease listing 2021-06-13 21:10:42 -04:00
Joshua Boniface bbd903e568 Fix bad schema name 2021-06-13 21:02:44 -04:00
Joshua Boniface 1f49bfa1b2 Fix name of schema element 2021-06-13 20:56:17 -04:00
Joshua Boniface 9511dc9864 Correct issue with invalid ACL ordering 2021-06-13 20:55:28 -04:00
Joshua Boniface 3013973975 Fix bad schema names 2021-06-13 20:32:41 -04:00
Joshua Boniface 8269930d40 Fix bad entry in network add 2021-06-13 18:22:13 -04:00
Joshua Boniface 647bce2a22 Ensure we don't grab None data 2021-06-13 16:43:25 -04:00
Joshua Boniface ae79113f7c Correct key typo and add error handler 2021-06-13 15:49:30 -04:00
Joshua Boniface 3bad3de720 Verify if key exists before reading 2021-06-13 15:39:43 -04:00
Joshua Boniface d2f93b3a2e Fix call to celery 2021-06-13 14:56:09 -04:00
Joshua Boniface 680c62a6e4 Fix schema path call and version check 2021-06-13 14:46:30 -04:00
Joshua Boniface 26b1f531e9 Fix bad variable interpolation 2021-06-13 14:37:23 -04:00
Joshua Boniface be9f1e8636 Use more compatible is_alive in thread 2021-06-13 14:36:27 -04:00
Joshua Boniface 88a1d89501 Fix bad key name 2021-06-13 14:29:54 -04:00
Joshua Boniface 7110a42e5f Add final schema elements after refactoring 2021-06-13 14:26:17 -04:00
Joshua Boniface 01c82f5d19 Move backup and restore into common 2021-06-13 14:25:51 -04:00
Joshua Boniface 059230d369 Convert vm.py to new ZK schema handler 2021-06-13 13:41:21 -04:00
Joshua Boniface f6e37906a9 Convert node.py to new ZK schema handler 2021-06-13 13:18:34 -04:00
Joshua Boniface 0a162b304a Convert network.py to new ZK schema handler 2021-06-12 18:40:25 -04:00
Joshua Boniface f071343333 Add DHCP lease schema and temp workaround 2021-06-12 18:22:43 -04:00
Joshua Boniface 01c762a362 Convert common.py to new ZK schema handler 2021-06-12 17:59:09 -04:00
Joshua Boniface 9b1bd8476f Convert cluster.py to new ZK schema handler 2021-06-12 17:11:32 -04:00
Joshua Boniface 6d00ec07b5 Convert ceph.py to new ZK schema handler 2021-06-12 17:09:29 -04:00
Joshua Boniface 247ae4fe2d Fix pre-refactor path bug 2021-06-10 01:18:33 -04:00