parallelvirtualcluster/pvc - pvc

Commit Graph

Author	SHA1	Message	Date
Joshua Boniface	330cf14638	Remove return statements in keepalive collectors These seem to bork the keepalive timer process, so just remove them and let it continue to press on.	2021-07-09 13:04:17 -04:00
Joshua Boniface	65d14ccd92	Return to all command-based Ceph gathering Using the Rados module was very problematic, specifically because it had no sensible timeout parameters and thus would hang for many seconds. This has poor implications since it blocks further keepalives. Instead, remove the Rados usage entirely and go back completely to using manual OS commands to gather this information. While this may cause PID exhaustion more quickly it's worthwhile to avoid failure scenarios when Ceph stats time out. Closes #137	2021-07-06 11:30:45 -04:00
Joshua Boniface	7082982a33	Bump version to 0.9.23	2021-07-05 23:40:32 -04:00
Joshua Boniface	5b6ef71909	Ensure daemon mode is updated on startup Fixes the side effect of the previous bug during deploys of 0.9.22.	2021-07-05 23:39:23 -04:00
Joshua Boniface	37cd278bc2	Bump version to 0.9.22	2021-07-05 14:18:51 -04:00
Joshua Boniface	a69105569f	Add node PVC version data to Node information Allows API client to see the currently-active version of the node daemon.	2021-07-05 09:57:38 -04:00
Joshua Boniface	f12de6727d	Adjust logo slightly and add debug state	2021-07-02 02:32:08 -04:00
Joshua Boniface	e94f5354e6	Update startup messages with new ASCII logo	2021-07-02 02:21:30 -04:00
Joshua Boniface	c51023ba81	Add profiler to keepalive function	2021-07-02 01:55:15 -04:00
Joshua Boniface	39e82ee426	Cast base schema version to int Or all our comparisons will fail later and nodes can't start.	2021-06-30 09:40:33 -04:00
Joshua Boniface	fe0a1d582a	Bump version to 0.9.21	2021-06-29 19:21:31 -04:00
Joshua Boniface	e623909a43	Store PHY MAC for VFs and restore after free	2021-06-22 00:56:47 -04:00
Joshua Boniface	eeb83da97d	Add support for SR-IOV NICs to VMs	2021-06-21 23:18:22 -04:00
Joshua Boniface	64d1a37b3c	Add PCIe device paths to SR-IOV VF information This will be used when adding VM network interfaces of type hostdev.	2021-06-21 21:08:46 -04:00
Joshua Boniface	13cc0f986f	Implement SR-IOV VF config set Also fixes some random bugs, adds proper interface sorting, and assorted tweaks.	2021-06-21 18:40:11 -04:00
Joshua Boniface	ca11dbf491	Sort the list of VFs for easier parsing	2021-06-21 01:40:05 -04:00
Joshua Boniface	e8bd1bf2c4	Ensure used/used_by are set on creation	2021-06-21 01:25:38 -04:00
Joshua Boniface	57b041dc62	Ensure default for vLAN and QOS is 0 not empty	2021-06-17 01:54:37 -04:00
Joshua Boniface	5607a6bb62	Avoid overwriting VF data Ensures that the configuration of a VF is not overwritten in Zookeeper on a node restart. The SRIOVVFInstance handlers were modified to start with None values, so that the DataWatch statements will always trigger updates to the live system interfaces on daemon startup, thus ensuring that the config stored in Zookeeper is applied to the system on startup (mostly relevant after a cold boot or if the API changes them during a daemon restart).	2021-06-17 01:45:22 -04:00
Joshua Boniface	8f1af2a642	Ignore hostdev interfaces in VM net stat gathering Prevents errors if a SR-IOV hostdev interface is configured until this is more defined.	2021-06-17 01:33:11 -04:00
Joshua Boniface	e7b6a3eac1	Implement SR-IOV PF and VF instances Adds support for the node daemon managing SR-IOV PF and VF instances. PFs are added to Zookeeper automatically based on the config at startup during network configuration, and are otherwise completely static. PFs are automatically removed from Zookeeper, along with all corresponding VFs, should the PF phy device be removed from the configuration. VFs are configured based on the (autocreated) VFs of each PF device, added to Zookeeper, and then a new class instance, SRIOVVFInstance, is used to watch them for configuration changes. This will enable the runtime management of VF settings by the API. The set of keys ensures that both configuration and details of the NIC can be tracked. Most keys are self-explanatory, especially for PFs and the basic keys for VFs. The configuration tree is also self-explanatory, being based entirely on the options available in the `ip link set {dev} vf` command. Two additional keys are also present: `used` and `used_by`, which will be able to track the (boolean) state of usage, as well as the VM that uses a given VIF. Since the VM side implementation will support both macvtap and direct "hostdev" assignments, this will ensure that this state can be tracked on both the VF and the VM side.	2021-06-17 01:33:03 -04:00
Joshua Boniface	0ad6d55dff	Add initial SR-IOV support to node daemon Adds configuration values for enabled flag and SR-IOV devices to the configuration and sets up the initial SR-IOV configuration on daemon startup (inserting the module, configuring the VF count, etc.).	2021-06-15 22:56:09 -04:00
Joshua Boniface	953e46055a	Fix issue with loading None version schema	2021-06-14 21:09:55 -04:00
Joshua Boniface	d2bcfe5cf7	Bump version to 0.9.20	2021-06-14 18:06:27 -04:00
Joshua Boniface	0a9c0c1ccb	Use a nicer reload method on hot schema update Rather than exiting and trusting systemd to restart us, leverage the os.execv() call to reload the process in the current PID context. Also improves the log messages so it's very clear what's going on.	2021-06-14 17:10:21 -04:00
Joshua Boniface	e34a7d4d2a	Handle hot reloads properly A hot reload isn't possible due to DataWatch and ChildrenWatch constructs, so we instead need to terminate the daemon to "apply" the schema update. Thus we use exit code 150 (Application defined in LSB) and reorder some of the elements of the schema validation to ensure things happen in the right order.	2021-06-14 12:52:43 -04:00
Joshua Boniface	b694945010	Fix incorrect name bug	2021-06-10 01:11:14 -04:00
Joshua Boniface	f913f42a6d	Replace schema paths with updated zkhandler	2021-06-09 20:29:42 -04:00
Joshua Boniface	e475552391	Fix some bugs with hot reload	2021-06-09 00:03:26 -04:00
Joshua Boniface	5540bdc86b	Add automatic schema upgrade to nodes Performs an automatic schema upgrade when all nodes are updated to the latest version. Addresses #129	2021-06-08 23:35:39 -04:00
Joshua Boniface	3c102b3769	Add per-node schema tracking This will allow nodes to start with their own schema versions, and then be updated simultaneously by the API. References #129	2021-06-08 23:35:39 -04:00
Joshua Boniface	a4aaf89681	Add ZKSchema loading and validation to Daemon Also removes some previous hack migrations from pre-0.9.19. Addresses #129	2021-06-08 23:35:39 -04:00
Joshua Boniface	cf96bb009f	Bump version to 0.9.19	2021-06-06 01:47:41 -04:00
Joshua Boniface	7dea5d2fac	Move logger to common, fix buffering	2021-06-01 18:50:26 -04:00
Joshua Boniface	9764090d6d	Merge node common with daemon common	2021-06-01 12:22:11 -04:00
Joshua Boniface	12ac3686de	Convert missed elements to new zkhandler	2021-06-01 11:57:21 -04:00
Joshua Boniface	1c9a7a6479	Convert VXNetworkInstance to new zkhandler	2021-06-01 11:49:39 -04:00
Joshua Boniface	790098f181	Convert VMInstance to new zkhandler	2021-06-01 11:46:27 -04:00
Joshua Boniface	8a4a41e092	Convert NodeInstance to new zkhandler	2021-06-01 11:27:35 -04:00
Joshua Boniface	a0b9087167	Set Daemon migration selector in zookeeper	2021-06-01 10:52:41 -04:00
Joshua Boniface	33a54cf7f2	Move configuration keys to /config tree	2021-06-01 10:48:55 -04:00
Joshua Boniface	d6a8cf9780	Convert MetadataAPIInstance to new zkhandler	2021-05-31 19:55:09 -04:00
Joshua Boniface	abd619a3c1	Convert DNSAggregatorInstance to new zkhandler	2021-05-31 19:55:01 -04:00
Joshua Boniface	ef5fe78125	Convert CephInstance to new zkhandler	2021-05-31 19:51:27 -04:00
Joshua Boniface	f6d0e89568	Properly add absent node type	2021-05-31 19:26:27 -04:00
Joshua Boniface	ede3e88cd7	Modify node daemon root to use updated zkhandler	2021-05-31 03:14:09 -04:00
Joshua Boniface	0c75a127b2	Bump version to 0.9.18	2021-05-23 17:23:10 -04:00
Joshua Boniface	9de14c46fb	Bump version to 0.9.17	2021-05-19 17:06:29 -04:00
Joshua Boniface	fe15bdb854	Bump version to 0.9.16	2021-05-10 01:13:21 -04:00
Joshua Boniface	669338c22b	Bump version to 0.9.15	2021-04-08 13:37:47 -04:00
Joshua Boniface	c4ac75b973	Bump version to 0.9.14	2021-03-30 10:27:37 -04:00
Joshua Boniface	0bf276fd51	Update copyright year in headers	2021-03-25 17:01:55 -04:00
Joshua Boniface	f4ec161aa2	Update file copyright header. Remove the option to select a later version of the GPL.	2021-03-25 16:58:02 -04:00
Joshua Boniface	0ccfc41398	Bump version to 0.9.13	2021-02-17 11:37:59 -05:00
Joshua Boniface	9100c63e99	Add stored_bytes to pool stats information	2021-02-09 01:46:01 -05:00
Joshua Boniface	aba567d6c9	Add nice startup banners to both daemons Add nicer easy-to-find (yay ASCII art) banners for the startup printouts of both the node and API daemons. Also adds the safe loader to pvcnoded to prevent hassle messages and a version string in the API daemon file.	2021-02-08 02:51:43 -05:00
Joshua Boniface	0db8fd9da6	Bump version to 0.9.12	2021-01-28 16:29:58 -05:00
Joshua Boniface	9fbe35fd24	Bump version to 0.9.11	2021-01-05 15:58:26 -05:00
Joshua Boniface	a24724d9f0	Use external ceph cmd for ceph df	2020-12-26 14:04:21 -05:00
Joshua Boniface	518d699c15	Bump version to 0.9.10	2020-12-15 10:45:15 -05:00
Joshua Boniface	7c99a7bda7	Safely reset RBD locks on failed VMs Should correct issues on cold start as well as if a VM crashes uncleanly, which would prevent the VM from starting due to stale RBD locks. This implementation has four parts: 1. Update how IP addresses are handled, specifically by replacing all previous instances of "vni_ipaddr" with "vni_floatingipaddr", and then adding the "vni_ipaddr" with the real data for this node's IPs. Also include the storage IPs in this where they weren't before, so each this_node actually has the local IPs plus floating IPs. This enables the next two steps. 2. Modify flush_locks to take this_node as an argument, and update the run_command function to only operate against this node, rather than on the primary coordinator. 3. Have the flush_locks check each lock against the current node, to verify that the lock is actually held by the current node. This is the only way to do this safely. During fencing, we override this by not passing a this_node which bypasses this check. 4. Have the VM start do the check for VM failure/startup and execute a flush_locks before actually starting the VM.	2020-12-14 15:53:18 -05:00
Joshua Boniface	89c7e225a0	Move OSD stats uploading to primary only Rather than each node uploading its own OSD stats, which would not work if the PVC daemon wasn't running, have the primary upload stats for all OSDs in the cluster.	2020-12-09 02:46:09 -05:00
Joshua Boniface	b36ec43a2d	Bump version to 0.9.9	2020-12-09 02:20:20 -05:00
Joshua Boniface	ce5ee11841	Bump version to 0.9.8	2020-11-24 12:26:57 -05:00
Joshua Boniface	d4a28d7a58	Bump version to 0.9.7	2020-11-19 10:48:28 -05:00
Joshua Boniface	e69eb93cb3	Bump version to 0.9.6	2020-11-17 13:01:54 -05:00
Joshua Boniface	a4e5323e81	Bump version to 0.9.5	2020-11-17 12:34:04 -05:00
Joshua Boniface	9053edacd8	Bump version to 0.9.4	2020-11-10 15:33:50 -05:00
Joshua Boniface	baac8f24fd	Bump version to 0.9.3	2020-11-09 10:28:15 -05:00
Joshua Boniface	11702f4bc8	Bump version to 0.9.2	2020-11-08 02:03:29 -05:00
Joshua Boniface	6f66b77a00	Lint: E121/E126 continuation line under/over-indented for hanging indent	2020-11-07 15:06:21 -05:00
Joshua Boniface	260b39ebf2	Lint: E302 expected 2 blank lines, found X	2020-11-07 14:45:24 -05:00
Joshua Boniface	c3dfe2e381	Lint: F821 undefined name 'myshorthostname'	2020-11-07 13:31:19 -05:00
Joshua Boniface	961ebb4c01	Lint: E305 expected 2 blank lines after class or function definition, found X	2020-11-07 13:17:49 -05:00
Joshua Boniface	e553c5d42a	Lint: E122 continuation line missing indentation or outdented	2020-11-07 13:12:26 -05:00
Joshua Boniface	7932be3948	Lint: E261 at least two spaces before inline comment	2020-11-07 13:11:03 -05:00
Joshua Boniface	d2490419c5	Lint: E202 whitespace before ']'	2020-11-07 13:02:54 -05:00
Joshua Boniface	d2e5ede399	Lint: E202 whitespace before ')'	2020-11-07 12:58:54 -05:00
Joshua Boniface	3f242cd437	Lint: E202 whitespace before '}'	2020-11-07 12:57:42 -05:00
Joshua Boniface	b7daa8e1f6	E201 whitespace after '['	2020-11-07 12:39:59 -05:00
Joshua Boniface	c88965e898	Lint: E201 whitespace after '('	2020-11-07 12:39:27 -05:00
Joshua Boniface	e333f2b935	Lint: E201 whitespace after '{'	2020-11-07 12:38:31 -05:00
Joshua Boniface	8c623023d5	Lint: F811 redefinition of unused '<function>'	2020-11-07 12:14:29 -05:00
Joshua Boniface	2eef6a1c21	Lint: E265 block comment should start with '# '	2020-11-06 21:32:17 -05:00
Joshua Boniface	5da314902f	Lint: F841 local variable '<variable>' is assigned to but never used	2020-11-06 21:13:13 -05:00
Joshua Boniface	98a573bbc7	Lint: E402 module level import not at top of file	2020-11-06 20:40:32 -05:00
Joshua Boniface	aecb845d6a	Lint: E713 test for membership should be 'not in'	2020-11-06 20:37:52 -05:00
Joshua Boniface	57c51d3234	Lint: E711 comparison to None should be 'if cond is not None:'	2020-11-06 19:37:13 -05:00
Joshua Boniface	ce01b41d81	Lint: E711 comparison to None should be 'if cond is None:'	2020-11-06 19:36:36 -05:00
Joshua Boniface	ebf254f62d	Lint: W293 blank line contains whitespace	2020-11-06 19:11:07 -05:00
Joshua Boniface	63f4f9aed7	Lint: E722 do not use bare 'except'	2020-11-06 18:55:10 -05:00
Joshua Boniface	56ba7b1457	Bump version to 0.9.1	2020-10-29 12:16:38 -04:00
Joshua Boniface	0f299777f1	Modify version to 3-digit numbering I expect 0.9 will be fairly long-lived, so add another decimal place so I may continue adding tweaks to it. THIS IS NOT SEMVER.	2020-10-26 02:13:11 -04:00
Joshua Boniface	c6e34c7dc6	Bump base version to 0.9	2020-10-18 14:31:19 -04:00
Joshua Boniface	a4b80be5ed	Add provisioned memory to node info Adds a separate field to the node memory, "provisioned", which totals the amount of memory provisioned to all VMs on the node, regardless of state, and in contrast to "allocated" which only counts running VMs. Allows for the detection of potential overprovisioned states when factoring in non-running VMs. Includes the supporting code to get this data, since the original implementation of VM memory selection was dependent on the VM being running and getting this from libvirt. Now, if the VM is not active, it gets this from the domain XML instead.	2020-10-18 14:17:15 -04:00
Joshua Boniface	9366977fe6	Copy d_domain before iterating Prevents a bug where the thread can crash due to a change in the d_domain object while running the for loop. By copying and iterating over the copy, this becomes safer.	2020-09-16 15:12:37 -04:00
Joshua Boniface	65b44f2955	Avoid breaking keepalive during incoming migration The keepalive was getting stuck gathering memoryStats from the non-running VM, since it was in a paused state. Avoid this by just skipping past the rest of the stats gathering if the VM isn't running.	2020-08-28 01:47:36 -04:00
Joshua Boniface	78dec77987	Bump version to 0.8	2020-08-26 10:24:44 -04:00
Joshua Boniface	921e57ca78	Fix syntax error	2020-08-20 23:05:56 -04:00
Joshua Boniface	3cc7df63f2	Add configurable VM shutdown timeout Closes #102	2020-08-20 21:26:12 -04:00
Joshua Boniface	e8e65934e3	Use logger prefix for thread debug logs	2020-08-17 14:30:21 -04:00
Joshua Boniface	9b3ef6d610	Add connect timeout to Ceph This doesn't seem to actually do anything (like most of these timeouts...) but add it just for posterity.	2020-08-17 13:58:14 -04:00
Joshua Boniface	b451c0e8e3	Add additional start/finish debug messages	2020-08-17 13:11:03 -04:00
Joshua Boniface	553f96e7ef	Use logger for debug output Using simple print statements was annoying (lack of timing info and formatting), so move to using the debug logger for these instead with a custom state ('d') with white text to differentiate them. Also indicate which subthread of the keepalive each task is being executed in for easier tracing of issues.	2020-08-17 12:46:52 -04:00
Joshua Boniface	65add58c9a	Properly properly handle issue	2020-08-16 11:38:39 -04:00
Joshua Boniface	0a01d84290	Tie fence timers to keepalive_interval Also wait 2 full keepalive intervals after fencing before doing anything else, to give the Ceph cluster a chance to recover.	2020-08-15 12:38:03 -04:00
Joshua Boniface	4afb288429	Properly handle missing domain_name fail	2020-08-15 12:07:23 -04:00
Joshua Boniface	985ad5edc0	Warn if fencing will fail Verify our IPMI state on startup, and then warn if fencing will fail. For now, this is sufficient, but in future (requires refactoring) we might want to adjust how fencing occurs based on this information.	2020-08-13 14:42:18 -04:00
Joshua Boniface	0587bcbd67	Go back to manual command for OSD stats Using the Ceph library was a disaster here; it had no timeout or way to force it to continue, so keepalives would become stuck and trigger fence storms. Go back to the manual osd dump command with a 2s timeout which is far more reliable and can be adequately terminated if it runs long.	2020-08-12 22:31:25 -04:00
Joshua Boniface	e0cb4a58c3	Ensure zk_listener is readded after reconnect	2020-08-11 12:46:15 -04:00
Joshua Boniface	0e5c681ada	Clean up imports Make several imports more specific to reduce redundant code imports and improve memory utilization.	2020-08-11 12:09:10 -04:00
Joshua Boniface	46ffe352e3	Better handle subthread timeouts in keepalive Prevent the main keepalive thread from getting stuck due to a subthread taking an enormous time. If this happens, the rest of the main keepalive will continue onward, thus ensuring that the main keepalive does not fail for a significant number of cycles, which would cause a fence.	2020-08-11 11:37:26 -04:00
Joshua Boniface	876f2424e0	Ensure dead state isn't written erroneously	2020-08-05 21:57:11 -04:00
Joshua Boniface	5871380e1b	Avoid crashing VM stats thread if domain migrated	2020-06-10 17:10:46 -04:00
Joshua Boniface	654a3cb7fa	Improve debug output and use ceph df util data	2020-06-06 22:52:49 -04:00
Joshua Boniface	9b65d3271a	Improve handling of Ceph status gathering Use the Rados library instead of random OS commands, which massively improves the performance of these tasks. Closes #97	2020-06-06 22:30:25 -04:00
Joshua Boniface	598b2025e8	Use Rados and add Ceph entries to pvcnoded.yaml	2020-06-06 21:12:51 -04:00
Joshua Boniface	70b787d1fd	Move all VM functions into thread	2020-06-06 15:44:05 -04:00
Joshua Boniface	e1310a05f2	Implement recording of VM stats during keepalive	2020-06-06 15:34:03 -04:00
Joshua Boniface	2ad6860dfe	Move Ceph statistics gathering into thread	2020-06-06 13:25:02 -04:00
Joshua Boniface	cebb4bbc1a	Comment cleanup	2020-06-06 13:20:40 -04:00
Joshua Boniface	a672e06dd2	Move fencing to end of keepalive function	2020-06-06 13:19:11 -04:00
Joshua Boniface	1db73bb892	Move libvirt closure into previous section	2020-06-06 13:18:37 -04:00
Joshua Boniface	c1956072f0	Rename update_zookeeper function to node_keepalive	2020-06-06 12:49:50 -04:00
Joshua Boniface	5f9836f96d	Add error message to OSD parse fail	2020-05-12 11:04:38 -04:00
Joshua Boniface	b580760537	Add missing fmt_cyan variable	2020-05-08 18:15:02 -04:00
Joshua Boniface	331027d124	Add further tweaks to takeover state checks Just ensure that everything is proper state before proceeding	2020-04-22 11:16:19 -04:00
Joshua Boniface	611e0edd80	Reorder last keepalive during cleanup Make sure the stopping of the keepalive timer and final keepalive update are done as the last step before complete shutdown. The previous setup could conceivably result in a node being fenced should the cleanup operations take longer than ~45 seconds, for instance if primary node switchover took too long or blocked, or log watchers failed to stop quickly enough. Ensures that keepalives will continue to be run during the shutdown process until the last possible moment.	2020-04-12 03:49:29 -04:00
Joshua Boniface	b413e042a6	Improve handling of primary contention Previously, contention could occasionally cause a flap/dual primary contention state due to the lack of checking within this function. This could cause a state where a node transitions to primary then is almost immediately shifted away, which could cause undefined behaviour in the cluster. The solution includes several elements: * Implement an exclusive lock operation in zkhandler * Switch the become_primary function to use this exclusive lock * Implement exclusive locking during the contention process * As a failsafe, check stat versions before setting the node as the primary node, in case another node already has * Delay the start of takeover/relinquish operations by slightly longer than the lock timeout * Make the current router_state conditions more explicit (positive conditionals rather than negative conditionals) The new scenario ensures that during contention, only one secondary will ever succeed at acquiring the lock. Ideally, the other would then grab the lock and pass, but in testing this does not seem to be the case - the lock always times out, so the failsafe check is technically not needed but has been left as an added safety mechanism. With this setup, the node that fails the contention will never block the switchover nor will it try to force itself onto the cluster after another node has successfully won contention. Timeouts may need to be adjusted in the future, but the base timeout of 0.4 seconds (and transition delay of 0.5 seconds) seems to work reliably during preliminary tests.	2020-04-12 03:40:17 -04:00
Joshua Boniface	a671d9d457	Use consistent tense in messages	2020-04-08 22:00:51 -04:00
Joshua Boniface	fee1c7dd6c	Reorder cleanup and gracefully wait for flushes	2020-04-08 22:00:08 -04:00
Joshua Boniface	d2a5fe59c0	Use transitional takeover states for migration Use a pair of transitional states, "takeover" and "relinquish", when transitioning between primary and secondary coordinator states. This provides a cluster-wide record that the nodes are still working during their synchronous transition states, and should allow clients to determine when the node(s) have fully switched over. Also add an additional 2 seconds of wait at the end of the transition jobs to ensure everything has had a chance to start before proceeding. References #72	2020-02-19 14:06:54 -05:00
Joshua Boniface	9c7041f12c	Update package version to 0.7	2020-02-15 23:25:47 -05:00
Joshua Boniface	ce985234c3	Use consistent naming of components Rename "pvcd" to "pvcnoded", and "pvc-api" to "pvcapid" so names for the daemons are fully consistent. Update the names of the configuration files as well to match this new formatting. References #79	2020-02-08 19:34:07 -05:00
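
Commits 0587bcbd67 and 65d14ccd92 in the table above both describe moving Ceph statistics gathering back to external OS commands, because a command can be given a timeout and terminated if it runs long. The following is only a minimal sketch of that general pattern using Python's standard `subprocess` module; it is not taken from the PVC source. The helper name `run_with_timeout` and its return convention are hypothetical and chosen for illustration.

```python
import subprocess


def run_with_timeout(command, timeout=2.0):
    """Run an external command with a hard time limit.

    Hypothetical helper, not PVC's actual implementation.
    Returns (returncode, stdout), or (None, "") if the command timed out.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising, so the
        # caller (for example a keepalive loop) is free to carry on.
        return None, ""
    return result.returncode, result.stdout


# Example call, assuming the ceph CLI is installed on the node:
#   retcode, stdout = run_with_timeout(["ceph", "osd", "dump", "--format", "json"], timeout=2)
```

Returning a sentinel rather than raising lets the caller skip one round of statistics and continue, which matches the intent stated in those commit messages of never blocking further keepalives.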


234 Commits