parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	494c20263d	Move monitoring folder to top level	2023-12-27 11:37:49 -05:00
Joshua M. Boniface	3e4cc53fdd	Add node network statistics and utilization values Adds a new physical network interface stats parser to the node keepalives, and leverages this information to provide a network utilization overview in the Prometheus metrics.	2023-12-21 15:45:01 -05:00
Joshua M. Boniface	39f9f3640c	Rename health metrics and add resource metrics	2023-12-21 09:40:49 -05:00
Joshua M. Boniface	0a93f526e0	Bump version to 0.9.86	2023-12-14 14:46:29 -05:00
Joshua M. Boniface	38e43b46c3	Update health detail messages format	2023-12-13 03:17:47 -05:00
Joshua M. Boniface	0f24184b78	Explicitly clear resources of fenced node This actually solves the bug originally "fixed" in `5f1432ccdd` without breaking VM resource allocations for working nodes.	2023-12-11 12:14:56 -05:00
Joshua M. Boniface	1ba37fe33d	Restore VM resource allocation location Commit `5f1432ccdd` changed where these happen due to a bug after fencing. However this completely broke node resource reporting as only the final instance will be queried here. Revert this change and look further into the original bug.	2023-12-11 11:52:59 -05:00
Joshua M. Boniface	1a05077b10	Fix missing fstring	2023-12-11 11:29:49 -05:00
Joshua M. Boniface	9617660342	Update Prometheus Grafana dashboard	2023-12-11 00:23:08 -05:00
Joshua M. Boniface	9dc5097dbc	Bump version to 0.9.85	2023-12-10 01:00:33 -05:00
Joshua M. Boniface	53d632f283	Fix bug in example PVC Grafana dashboard	2023-12-10 00:50:05 -05:00
Joshua M. Boniface	7bc0760b78	Add time to "starting keepalive" message Matches the pvchealthd output and provides a useful message detail to this otherwise contextless message.	2023-12-10 00:40:32 -05:00
Joshua M. Boniface	9aee2a9075	Bump version to 0.9.84	2023-12-09 23:05:40 -05:00
Joshua M. Boniface	1f6347d24b	Add Prometheus monitoring examples	2023-12-09 17:42:51 -05:00
Joshua M. Boniface	988de1218f	Bump version to 0.9.83	2023-12-01 17:37:42 -05:00
Joshua M. Boniface	1fb0463dea	Adjust daemon service startup Add healthd, adjust workerd, lower waittime	2023-11-30 03:28:02 -05:00
Joshua M. Boniface	03a738f878	Move config parser into daemon_lib And reformat/add config values for API.	2023-11-30 00:05:37 -05:00
Joshua M. Boniface	4a2eba0961	Improve node output messages (from pvchealthd) 1. Output startup "list" entries in cyan with s state 2. Add start of keepalive run message	2023-11-29 21:21:51 -05:00
Joshua M. Boniface	647cba3cf5	Expand startup width for new daemon name	2023-11-29 21:21:51 -05:00
Joshua M. Boniface	41f4e4fb2f	Split health monitoring into discrete daemon/pkg	2023-11-29 21:21:51 -05:00
Joshua M. Boniface	83ceb41138	Add daemon name to Logger entries	2023-11-29 15:18:37 -05:00
Joshua M. Boniface	2545a7b744	Allow similar for IPMI hostnames	2023-11-28 16:09:01 -05:00
Joshua M. Boniface	ce907ff26a	Allow specifying static IPs instead of a file	2023-11-28 15:28:31 -05:00
Joshua M. Boniface	71e589e461	Remove superfluous debug output This is printed in the startup logo block anyways.	2023-11-27 13:46:30 -05:00
Joshua M. Boniface	fc3d292081	Add missing subdirectory configs	2023-11-27 13:40:07 -05:00
Joshua M. Boniface	eab1ae873b	Ensure upstream_gateway key will exist	2023-11-27 13:37:57 -05:00
Joshua M. Boniface	eaf93cdf96	Readd missing subsystem configurations	2023-11-27 13:33:41 -05:00
Joshua M. Boniface	c8f4cbb39e	Fix node entry keys	2023-11-27 13:24:01 -05:00
Joshua M. Boniface	786fae7769	Improve logo output	2023-11-27 13:01:43 -05:00
Joshua M. Boniface	bcc57638a9	Refactor pvcnoded to use new configuration	2023-11-26 15:41:25 -05:00
Joshua M. Boniface	2666e0603e	Update dnsmasq script to use new config file	2023-11-26 14:18:13 -05:00
Joshua M. Boniface	dab7396196	Move to unified pvc.conf configuration file	2023-11-26 14:16:21 -05:00
Joshua M. Boniface	460a2dd09f	Bump version to 0.9.82	2023-11-25 15:38:50 -05:00
Joshua M. Boniface	3e001b08b6	Bump version to 0.9.81	2023-11-17 01:29:41 -05:00
Joshua M. Boniface	e818df5dae	Use enable/disable --now instead of two commands Avoids needing two calls here especially for the stop.	2023-11-16 02:40:35 -05:00
Joshua M. Boniface	c76a5afd04	Avoid waits during node secondary Waiting for the daemons to stop took too much time on some nodes and could throw off the lockstep. Instead, leverage background=True to run the systemctl os_commands in the background (when they complete is irrelevant), stop the Metadata API first, and don't delay during its stop at all.	2023-11-16 02:34:12 -05:00
Joshua M. Boniface	18e43a9377	Adjust name in worker log output	2023-11-16 02:25:14 -05:00
Joshua M. Boniface	aef38639cf	Rename pvcapid-worker to pvcworkerd	2023-11-15 20:31:39 -05:00
Joshua M. Boniface	5f1432ccdd	Fix memory allocation updates and add more debug Previously, we were assigning memalloc/memprov/vcpualloc during an earlier phase using the main d_domain list. I'm not sure exactly why, but this was throwing off stats after a fence. Instead, set these values later on while parsing the actually-active VMs.	2023-11-10 10:29:32 -05:00
Joshua M. Boniface	d6b8808448	Clean up fencing handler 1. Remove all format strings in favour of f-strings 2. Ensure all logger messages have a prefix 3. Add a few more logger messages for clarity	2023-11-10 10:09:54 -05:00
Joshua M. Boniface	83c4c6633d	Readd RBD lock detection and clearing on startup This is still needed due to the nature of the locks and freeing them on startup, and to preserve lock=fail behaviour on VM startup. Also fixes the fencing lock flush to directly use the client library outside of Celery. I don't like this hack but it seems prudent until we move fencing to the workers as well.	2023-11-10 01:33:48 -05:00
Joshua M. Boniface	2a9bc632fa	Add node monitoring plugin for KeyDB/Redis	2023-11-10 00:56:46 -05:00
Joshua M. Boniface	08411708f6	Clean up dangling references to cmd pipes Also removes the schema references for these CMD pipes as they are no longer required.	2023-11-09 23:28:14 -05:00
Joshua M. Boniface	ce17c60a20	Port OSD on-node tasks to Celery worker system Adds Celery versions of the osd_add, osd_replace, osd_refresh, osd_remove, and osd_db_vg_add functions.	2023-11-09 23:28:08 -05:00
Joshua M. Boniface	89681d54b9	Port VM on-node tasks to Celery worker system Adds Celery versions of the flush_locks, device_attach, and device_detach functions.	2023-11-06 20:40:46 -05:00
Joshua M. Boniface	f0c2e9d295	Don't start pvcapid-worker on primary It will be running anyways	2023-11-05 19:44:00 -05:00
Joshua M. Boniface	2c15036f86	Add KeyDB to node startup services Also ensure API worker starts on all nodes, not just coordinators.	2023-11-05 19:26:38 -05:00
Joshua M. Boniface	30d7e49401	Start API worker with node daemon on coordinators	2023-11-04 13:08:16 -04:00
Joshua M. Boniface	7490f13b7c	Check for partition tables on new devices	2023-11-04 03:13:58 -04:00
Joshua M. Boniface	e32054be81	Refactor refresh as well	2023-11-04 02:44:52 -04:00

819 Commits