parallelvirtualcluster/pvc - pvc

Commit Graph

Author	SHA1	Message	Date
Joshua Boniface	3346ce9bb0	Add missing shutdown state from combinations	2023-12-27 13:40:30 -05:00
Joshua Boniface	e654fbba08	Move debug condition handling to Logger. Avoids many dozens of conditionals sprinkled throughout the code by centralizing this check into the main Logger instance.	2023-12-27 13:01:45 -05:00
Joshua Boniface	4375f66793	Use proper get() for invalid values	2023-12-27 12:03:48 -05:00
Joshua Boniface	3df3ca5b44	Fix value for OSD utilization. Ceph provides in KB; convert to bytes.	2023-12-27 11:56:50 -05:00
Joshua Boniface	431ee69620	Use proper percentage for pool util	2023-12-27 10:03:00 -05:00
Joshua Boniface	88f4d79d5a	Handle invalid values on older Libvirt versions	2023-12-27 09:51:24 -05:00
Joshua Boniface	84d22751d8	Fix bad JSON data handler	2023-12-27 09:43:37 -05:00
Joshua Boniface	40ff005a09	Fix handling of Ceph OSD bytes	2023-12-26 12:43:51 -05:00
Joshua Boniface	9604f655d0	Improve node utilization metrics and fix bugs	2023-12-25 02:47:41 -05:00
Joshua Boniface	3e4cc53fdd	Add node network statistics and utilization values. Adds a new physical network interface stats parser to the node keepalives, and leverages this information to provide a network utilization overview in the Prometheus metrics.	2023-12-21 15:45:01 -05:00
Joshua Boniface	d2d2a9c617	Include our newline atomically. Sometimes clashing log entries would print on the same line, likely due to some sort of race condition in Python's print() built-in. Instead, add a newline to our actual message and print without an end character. This ensures atomic printing of our log messages.	2023-12-21 13:12:43 -05:00
Joshua Boniface	6ed4efad33	Add new network.stats key to nodes	2023-12-21 12:48:48 -05:00
Joshua Boniface	39f9f3640c	Rename health metrics and add resource metrics	2023-12-21 09:40:49 -05:00
Joshua Boniface	c64e888d30	Fix incorrect cast of None	2023-12-14 16:00:53 -05:00
Joshua Boniface	f1249452e5	Fix bug if no nodes are present	2023-12-14 15:32:18 -05:00
Joshua Boniface	f41c5176be	Ensure health value is an int properly	2023-12-13 14:34:02 -05:00
Joshua Boniface	ed9c37982a	Move metric collection into daemon library	2023-12-11 19:20:30 -05:00
Joshua Boniface	57c28376a6	Port one final Ceph function to read_many	2023-12-11 10:25:36 -05:00
Joshua Boniface	e781d742e6	Fix bug with volume and snapshot listing	2023-12-11 10:21:46 -05:00
Joshua Boniface	741dafb26b	Port VM functions to read_many	2023-12-11 03:34:36 -05:00
Joshua Boniface	5d9e83e8ed	Fix output bugs in VM information	2023-12-11 03:04:46 -05:00
Joshua Boniface	7c116b2fbc	Ensure node health value is an int	2023-12-10 23:56:50 -05:00
Joshua Boniface	1023c55087	Fix bug in VM state list	2023-12-10 23:44:01 -05:00
Joshua Boniface	9235187c6f	Port Ceph functions to read_many. Only ports getOSDInformation, as all the others feature 3 or less reads which is acceptable sequentially.	2023-12-10 22:24:38 -05:00
Joshua Boniface	0c94f1b4f8	Port Network functions to read_many	2023-12-10 22:19:21 -05:00
Joshua Boniface	44a4f0e1f7	Use new info detail output instead of new lists. Avoids multiple additional ZK calls by using data that is now in the status detail output.	2023-12-10 22:19:09 -05:00
Joshua Boniface	5d53a3e529	Add state and faults detail to cluster information. We already parse this information out anyways, so might as well add it to the API output JSON. This can be leveraged by the Prometheus endpoint as well to avoid duplicate listings.	2023-12-10 17:29:32 -05:00
Joshua Boniface	35e22cb50f	Simplify cluster status handling. This significantly simplifies cluster state handling by removing most of the superfluous get_list() calls, replacing them with basic child reads since most of them are just for a count anyways. The ones that require states simplify this down to a child read plus direct reads for the exact items required while leveraging the new read_many() function.	2023-12-10 17:05:46 -05:00
Joshua Boniface	a3171b666b	Split node health into separate function	2023-12-10 16:52:10 -05:00
Joshua Boniface	48e41d7b05	Port Faults getFault and getAllFaults to read_many	2023-12-10 16:05:16 -05:00
Joshua Boniface	d6aecf195e	Port Node getNodeInformation to read_many	2023-12-10 15:53:28 -05:00
Joshua Boniface	9329784010	Implement async ZK read function. Adds a function, "read_many", which can take in multiple ZK keys and return the values from all of them, using asyncio to avoid reading sequentially. Initial tests show a marked improvement in read performance of multiple read()-heavy functions (e.g. "get_list()" functions) with this method.	2023-12-10 15:35:40 -05:00
Joshua Boniface	b9fbfe2ed5	Improve fault ID format. Instead of using random hex characters from an md5sum, use a nice name in all-caps similar to how Ceph does. This further helps prevent dupes but also permits a changing health delta within a single event (which would really only ever apply to plugin faults).	2023-12-09 16:48:14 -05:00
Joshua Boniface	7e6d922877	Improve fault detail handling further. Since we already had a "details" field, simply move where it gets added to the message later, in generate_fault, after the main message value was used to generate the ID.	2023-12-09 16:13:36 -05:00
Joshua Boniface	4003204f14	Remove bracketed text from fault_str. This ensures that certain faults e.g. Ceph status faults, will be combined despite the added text in brackets, while still keeping them mostly separate. Also ensure the health text is updated each time to assist with this, as this health text may now change independent of the fault ID.	2023-12-09 15:34:18 -05:00
Joshua Boniface	2bea78d25e	Make all remaining limits optional	2023-12-09 13:43:58 -05:00
Joshua Boniface	fd717b702d	Use external list of fault states	2023-12-09 12:51:41 -05:00
Joshua Boniface	317ca4b98c	Move defined state combinations into common	2023-12-09 12:36:32 -05:00
Joshua Boniface	0bda095571	Move libvirt_schema and fix other imports	2023-12-09 12:20:29 -05:00
Joshua Boniface	813aef1463	Fix incorrect UUID key name	2023-12-09 12:14:57 -05:00
Joshua Boniface	5a7ea25266	Fix incorrect database name entries	2023-12-09 12:12:00 -05:00
Joshua Boniface	61b39d0739	Fix incorrect cluster health calculation	2023-12-07 11:13:36 -05:00
Joshua Boniface	4bf80a5913	Fix missing datetime shrink	2023-12-06 17:15:36 -05:00
Joshua Boniface	e0bf7f7d1a	Fix bad ID values in acknowledge	2023-12-06 14:18:31 -05:00
Joshua Boniface	20acf3295f	Add mass ack/delete of faults	2023-12-06 13:59:39 -05:00
Joshua Boniface	d1e34e7333	Store fault times only to the second. Any more precision is unnecessary and saves 6 chars when displaying these times elsewhere.	2023-12-06 13:20:18 -05:00
Joshua Boniface	79eb54d5da	Move fault generation to common library	2023-12-06 13:17:10 -05:00
Joshua Boniface	2267a9c85d	Improve output formatting for simplicity	2023-12-05 10:37:35 -05:00
Joshua Boniface	672e58133f	Implement interfaces to faults	2023-12-04 01:37:54 -05:00
Joshua Boniface	3dc48c1783	Lower default monitoring interval to 15s. Faults are also reported on the monitoring interval, so 60s seems like too long. Lower this to 15 seconds by default instead.	2023-12-01 17:38:28 -05:00


382 Commits
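
Commit 9329784010 describes a "read_many" function that takes multiple ZK keys and returns all their values, using asyncio to avoid reading sequentially. The following is only a minimal sketch of that general idea, not the code from the commit: the `read` function and the `FAKE_STORE` dictionary are hypothetical stand-ins for a blocking ZooKeeper read, and the key names are invented for illustration.

```python
import asyncio
import time

# Hypothetical in-memory stand-in for ZooKeeper data (assumption, not PVC's schema).
FAKE_STORE = {
    "/nodes/n1/state": "run",
    "/nodes/n2/state": "run",
    "/nodes/n3/state": "stop",
}


def read(key):
    """Hypothetical blocking single-key read; the sleep simulates latency."""
    time.sleep(0.05)
    return FAKE_STORE.get(key)


async def _read_all(keys):
    loop = asyncio.get_running_loop()
    # Run each blocking read in the default thread pool so they overlap.
    tasks = [loop.run_in_executor(None, read, key) for key in keys]
    # gather() returns results in the same order as the input keys.
    return list(await asyncio.gather(*tasks))


def read_many(keys):
    """Read several keys concurrently and return their values in order."""
    return asyncio.run(_read_all(keys))


if __name__ == "__main__":
    print(read_many(["/nodes/n1/state", "/nodes/n2/state", "/nodes/n3/state"]))
```

Because the reads overlap instead of running one after another, the total wait in this sketch is closer to one read's latency than to the sum of all of them, which is the kind of improvement the commit message reports for read()-heavy functions.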