Commit Graph

514 Commits

Author SHA1 Message Date
Joshua Boniface 26dd24e3f5 Ensure MTU is set on VF when starting up 2021-06-22 02:26:14 -04:00
Joshua Boniface e623909a43 Store PHY MAC for VFs and restore after free 2021-06-22 00:56:47 -04:00
Joshua Boniface 60e1da09dd Don't try any shenannegans when updating NICs
Trying to do this on the VMInstance side had problems because we can't
differentiate the 3 types of migration there. So, just update this in
the API side and hope everything goes well.

This introduces an edge bug: if a VM is using a macvtap SR-IOV device,
and then tries to migrate, and the migrate is aborted, the NIC lists
will be inconsistent.

When I revamp the VMInstance in the future, I should be able to correct
this, but for now we'll have to live with that edgecase.
2021-06-22 00:00:50 -04:00
Joshua Boniface 7d42fba373 Ensure being in migrate doesn't abort shutdown 2021-06-21 23:28:53 -04:00
Joshua Boniface 24ce361a04 Ensure SR-IOV NIC states are updated on migration 2021-06-21 23:18:34 -04:00
Joshua Boniface eeb83da97d Add support for SR-IOV NICs to VMs 2021-06-21 23:18:22 -04:00
Joshua Boniface 64d1a37b3c Add PCIe device paths to SR-IOV VF information
This will be used when adding VM network interfaces of type hostdev.
2021-06-21 21:08:46 -04:00
Joshua Boniface 13cc0f986f Implement SR-IOV VF config set
Also fixes some random bugs, adds proper interface sorting, and assorted
tweaks.
2021-06-21 18:40:11 -04:00
Joshua Boniface ca11dbf491 Sort the list of VFs for easier parsing 2021-06-21 01:40:05 -04:00
Joshua Boniface e8bd1bf2c4 Ensure used/used_by are set on creation 2021-06-21 01:25:38 -04:00
Joshua Boniface bff6d71e18 Add logging to SRIOVVFInstance and fix bug 2021-06-17 02:02:41 -04:00
Joshua Boniface 57b041dc62 Ensure default for vLAN and QOS is 0 not empty 2021-06-17 01:54:37 -04:00
Joshua Boniface 5607a6bb62 Avoid overwriting VF data
Ensures that the configuration of a VF is not overwritten in Zookeeper
on a node restart. The SRIOVVFInstance handlers were modified to start
with None values, so that the DataWatch statements will always trigger
updates to the live system interfaces on daemon startup, thus ensuring
that the config stored in Zookeeper is applied to the system on startup
(mostly relevant after a cold boot or if the API changes them during a
daemon restart).
2021-06-17 01:45:22 -04:00
Joshua Boniface 8f1af2a642 Ignore hostdev interfaces in VM net stat gathering
Prevents errors if a SR-IOV hostdev interface is configured until this
is more defined.
2021-06-17 01:33:11 -04:00
Joshua Boniface e7b6a3eac1 Implement SR-IOV PF and VF instances
Adds support for the node daemon managing SR-IOV PF and VF instances.

PFs are added to Zookeeper automatically based on the config at startup
during network configuration, and are otherwise completely static. PFs
are automatically removed from Zookeeper, along with all coresponding
VFs, should the PF phy device be removed from the configuration.

VFs are configured based on the (autocreated) VFs of each PF device,
added to Zookeeper, and then a new class instance, SRIOVVFInstance, is
used to watch them for configuration changes. This will enable the
runtime management of VF settings by the API. The set of keys ensures
that both configuration and details of the NIC can be tracked.

Most keys are self-explanatory, especially for PFs and the basic keys
for VFs. The configuration tree is also self-explanatory, being based
entirely on the options available in the `ip link set {dev} vf` command.

Two additional keys are also present: `used` and `used_by`, which will
be able to track the (boolean) state of usage, as well as the VM that
uses a given VIF. Since the VM side implementation will support both
macvtap and direct "hostdev" assignments, this will ensure that this
state can be tracked on both the VF and the VM side.
2021-06-17 01:33:03 -04:00
Joshua Boniface 0ad6d55dff Add initial SR-IOV support to node daemon
Adds configuration values for enabled flag and SR-IOV devices to the
configuration and sets up the initial SR-IOV configuration on daemon
startup (inserting the module, configuring the VF count, etc.).
2021-06-15 22:56:09 -04:00
Joshua Boniface e4a65230a1 Just do the shutdown command itself 2021-06-15 02:32:14 -04:00
Joshua Boniface 284c581845 Ensure shutdown migrations actually time out
Without this a VM that fails to respond to a shutdown will just spin
forever, blocking state changes.
2021-06-15 00:23:15 -04:00
Joshua Boniface 953e46055a Fix issue with loading None version schema 2021-06-14 21:09:55 -04:00
Joshua Boniface d2bcfe5cf7 Bump version to 0.9.20 2021-06-14 18:06:27 -04:00
Joshua Boniface ef1701b4c8 Handle an additional exception case 2021-06-14 17:15:40 -04:00
Joshua Boniface 08dc756549 Actually disable the pvcapid service
Prevents it from trying to start itself during updates or reboots on
non-primary coordinators.
2021-06-14 17:13:22 -04:00
Joshua Boniface 0a9c0c1ccb Use a nicer reload method on hot schema update
Instead of exiting and trusting systemd to restart us, instead leverage
the os.execv() call to reload the process in the current PID context.

Also improves the log messages so it's very clear what's going on.
2021-06-14 17:10:21 -04:00
Joshua Boniface e34a7d4d2a Handle hot reloads properly
A hot reload isn't possible due to DataWatch and ChildrenWatch
constructs, so we instead need to terminate the daemon to "apply" the
schema update. Thus we use exit code 150 (Application defined in LSB)
and reorder some of the elements of the schema validation to ensure
things happen in the right order.
2021-06-14 12:52:43 -04:00
Joshua Boniface 1f49bfa1b2 Fix name of schema element 2021-06-13 20:56:17 -04:00
Joshua Boniface 647bce2a22 Ensure we don't grab None data 2021-06-13 16:43:25 -04:00
Joshua Boniface 26b1f531e9 Fix bad variable interpolation 2021-06-13 14:37:23 -04:00
Joshua Boniface be9f1e8636 Use more compatible is_alive in thread 2021-06-13 14:36:27 -04:00
Joshua Boniface b694945010 Fix incorrect name bug 2021-06-10 01:11:14 -04:00
Joshua Boniface 058c2ceef3 Convert VXNetworkInstance to new ZK schema handler 2021-06-10 00:36:18 -04:00
Joshua Boniface e7d60260a0 Fix typo in CephInstance path 2021-06-10 00:36:02 -04:00
Joshua Boniface 85aba7cc18 Convert VMInstance to new ZK schema handler 2021-06-09 23:15:08 -04:00
Joshua Boniface 7e42118e6f Adjust lock schema in NodeInstance and VMInstance
Removes a superfluous lock and puts the sync_lock keys in more usable
places.
2021-06-09 22:51:00 -04:00
Joshua Boniface 2704badfbe Convert VMConsole... to new ZK schema handler 2021-06-09 22:08:32 -04:00
Joshua Boniface 450bf6b153 Convert NodeInstance to new ZK schema handler 2021-06-09 22:07:32 -04:00
Joshua Boniface b94fe88405 Convert fencing to new ZK schema handler 2021-06-09 21:29:01 -04:00
Joshua Boniface 610f6e8f2c Convert CephInstance to new ZK schema handler 2021-06-09 21:17:09 -04:00
Joshua Boniface f913f42a6d Replace schema paths with updated zkhandler 2021-06-09 20:29:42 -04:00
Joshua Boniface e475552391 Fix some bugs with hot reload 2021-06-09 00:03:26 -04:00
Joshua Boniface 5540bdc86b Add automatic schema upgrade to nodes
Performs an automatic schema upgrade when all nodes are updated to the
latest version.

Addresses #129
2021-06-08 23:35:39 -04:00
Joshua Boniface 3c102b3769 Add per-node schema tracking
This will allow nodes to start with their own schema versions, and then
be updated simultaneously by the API.

References #129
2021-06-08 23:35:39 -04:00
Joshua Boniface a4aaf89681 Add ZKSchema loading and validation to Daemon
Also removes some previous hack migrations from pre-0.9.19.

Addresses #129
2021-06-08 23:35:39 -04:00
Joshua Boniface 126f0742cd Add Zookeeper schema manager to zkhandler
Adds a new class, ZKSchema, to handle schema management in Zookeeper in
an automated and consistent way. This should solve several issues:

1. Pain in managing changes to ZK keys
2. Pain in handling those changes during live upgrades
3. Simplifying the codebase to remove hardcoded ZK paths

The current master schema for PVC 0.9.19 is committed as version 0.

Addresses #129
2021-06-08 23:35:39 -04:00
Joshua Boniface 5843d8aff4 Fix fence call to findTargetNode 2021-06-08 23:34:49 -04:00
Joshua Boniface cf96bb009f Bump version to 0.9.19 2021-06-06 01:47:41 -04:00
Joshua Boniface 719954b70b Fix missing list comma 2021-06-06 01:39:43 -04:00
Joshua Boniface 7dea5d2fac Move logger to common, fix buffering 2021-06-01 18:50:26 -04:00
Joshua Boniface 3a5226b893 Add missing flushed output 2021-06-01 18:30:18 -04:00
Joshua Boniface de2ff2e01b Fix removed function args 2021-06-01 17:02:36 -04:00
Joshua Boniface cd75413667 Increase initial lock timer
With the new library the reader seems to be a little too quick, so hold
the write lock for 1 second instead of 1/2 second to ensure it is
caught.
2021-06-01 17:00:11 -04:00