Joshua Boniface
0bcf8cfe19
Add Zookeeper metrics proxy
2023-12-28 13:53:15 -05:00
Joshua Boniface
2bb24d3b57
Update Prometheus dashboard and add README
2023-12-27 15:57:12 -05:00
Joshua Boniface
8083b7a3e6
Bump version to 0.9.87
2023-12-27 13:40:51 -05:00
Joshua Boniface
3346ce9bb0
Add missing shutdown state from combinations
2023-12-27 13:40:30 -05:00
Joshua Boniface
572596c575
Fix missing f-string placeholder
2023-12-27 13:21:20 -05:00
Joshua Boniface
e654fbba08
Move debug condition handling to Logger
...
Avoids many dozens of conditionals sprinkled throughout the code by
centralizing this check into the main Logger instance.
2023-12-27 13:01:45 -05:00
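A minimal sketch of the idea in this commit, assuming a logger with a single output method: the debug check lives inside the logger, so callers no longer wrap each call in their own conditional. The class layout, the `out` method, the `state="d"` convention and the `lines` list are hypothetical illustrations, not the project's actual Logger.

```python
class Logger:
    """Hypothetical logger that owns the debug check itself."""

    def __init__(self, debug=False):
        self.debug = debug
        self.lines = []  # collected output, kept in memory for illustration

    def out(self, message, state="i"):
        # Debug-level messages are dropped here when debug is off,
        # so call sites never need "if debug:" around their log calls.
        if state == "d" and not self.debug:
            return
        self.lines.append(f"[{state}] {message}")


quiet = Logger(debug=False)
quiet.out("starting keepalive", state="i")
quiet.out("raw keepalive payload", state="d")  # silently ignored

verbose = Logger(debug=True)
verbose.out("raw keepalive payload", state="d")  # recorded
```

Call sites can then log debug messages unconditionally and leave the decision to the logger instance.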
Joshua Boniface
52bf5ad0ef
Update store_path set location
...
Prevents a bug if no cluster is selected while doing connection list
commands.
2023-12-27 12:42:19 -05:00
Joshua Boniface
576afc1e94
Update Grafana dashboard layouts
2023-12-27 12:24:46 -05:00
Joshua Boniface
4375f66793
Use proper get() for invalid values
2023-12-27 12:03:48 -05:00
Joshua Boniface
3df3ca5b44
Fix value for OSD utilization
...
Ceph provides in KB; convert to bytes.
2023-12-27 11:56:50 -05:00
Joshua Boniface
cb3c2cd86d
Adjust name of PVC cluster dashboard
2023-12-27 11:42:58 -05:00
Joshua Boniface
d0de4f1825
Update Grafana dashboard to overview
...
Adds resource utilization in addition to health.
2023-12-27 11:38:39 -05:00
Joshua Boniface
494c20263d
Move monitoring folder to top level
2023-12-27 11:37:49 -05:00
Joshua Boniface
431ee69620
Use proper percentage for pool util
2023-12-27 10:03:00 -05:00
Joshua Boniface
88f4d79d5a
Handle invalid values on older Libvirt versions
2023-12-27 09:51:24 -05:00
Joshua Boniface
84d22751d8
Fix bad JSON data handler
2023-12-27 09:43:37 -05:00
Joshua Boniface
40ff005a09
Fix handling of Ceph OSD bytes
2023-12-26 12:43:51 -05:00
Joshua Boniface
ab4ec7a5fa
Remove WebUI from README
2023-12-25 02:48:44 -05:00
Joshua Boniface
9604f655d0
Improve node utilization metrics and fix bugs
2023-12-25 02:47:41 -05:00
Joshua Boniface
3e4cc53fdd
Add node network statistics and utilization values
...
Adds a new physical network interface stats parser to the node
keepalives, and leverages this information to provide a network
utilization overview in the Prometheus metrics.
2023-12-21 15:45:01 -05:00
Joshua Boniface
d2d2a9c617
Include our newline atomically
...
Sometimes clashing log entries would print on the same line, likely due
to some sort of race condition in Python's print() built-in.
Instead, add a newline to our actual message and print without an end
character. This ensures atomic printing of our log messages.
2023-12-21 13:12:43 -05:00
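The approach described in this commit can be sketched as follows, assuming a plain function that writes to a stream. The name `log_line` and the sample messages are hypothetical; only the technique (append the newline to the message and pass an empty `end` to `print()`) comes from the commit text.

```python
import io
import sys


def log_line(message, stream=None):
    """Write one log entry, newline included, in a single print payload."""
    stream = sys.stdout if stream is None else stream
    # The newline is part of the message itself, and print() is told not to
    # add its own end character, so text and newline travel together.
    print(message + "\n", end="", file=stream)


buf = io.StringIO()
log_line("node hv1: keepalive sent", stream=buf)
log_line("node hv2: keepalive sent", stream=buf)
```

Each entry still ends with exactly one newline, so the output looks identical to a normal `print()` call when there is no contention.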
Joshua Boniface
6ed4efad33
Add new network.stats key to nodes
2023-12-21 12:48:48 -05:00
Joshua Boniface
39f9f3640c
Rename health metrics and add resource metrics
2023-12-21 09:40:49 -05:00
Joshua Boniface
c64e888d30
Fix incorrect cast of None
2023-12-14 16:00:53 -05:00
Joshua Boniface
f1249452e5
Fix bug if no nodes are present
2023-12-14 15:32:18 -05:00
Joshua Boniface
0a93f526e0
Bump version to 0.9.86
2023-12-14 14:46:29 -05:00
Joshua Boniface
7c9512fb22
Fix broken config file in API migration script
2023-12-14 14:45:58 -05:00
Joshua Boniface
e88b97f3a9
Print fenced state in red
2023-12-13 15:02:18 -05:00
Joshua Boniface
709c9cb73e
Pause pvchealthd startup until node daemon is run
...
If the health daemon starts too soon during a node bootup, it will
generate tons of erroneous faults while the node starts up.
Adds a conditional wait for the current node daemon to be in "run"
state before the health daemon really starts up.
2023-12-13 14:53:54 -05:00
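A rough sketch of a conditional wait like the one this commit describes, assuming the daemon state can be polled through a callable. The function name, the `get_state` callable, the polling interval and the optional timeout are all hypothetical; the commit only states that the health daemon waits for the node daemon to reach the "run" state.

```python
import time


def wait_for_state(get_state, wanted="run", interval=1.0, timeout=None,
                   sleep=time.sleep):
    """Poll get_state() until it returns `wanted`.

    Returns True once the state matches, or False if `timeout` seconds of
    waiting elapse first. With timeout=None it waits indefinitely.
    """
    waited = 0.0
    while get_state() != wanted:
        if timeout is not None and waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True


# Simulated node daemon that reaches "run" on the third poll.
states = iter(["init", "init", "run"])
ready = wait_for_state(lambda: next(states), sleep=lambda s: None)

# A daemon that never leaves "init" gives up after the timeout.
stuck = wait_for_state(lambda: "init", timeout=2.0, sleep=lambda s: None)
```

Health checks would start only after the wait returns True, which avoids reporting faults for services that simply have not come up yet.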
Joshua Boniface
f41c5176be
Ensure health value is an int properly
2023-12-13 14:34:02 -05:00
Joshua Boniface
38e43b46c3
Update health detail messages format
2023-12-13 03:17:47 -05:00
Joshua Boniface
ed9c37982a
Move metric collection into daemon library
2023-12-11 19:20:30 -05:00
Joshua Boniface
0f24184b78
Explicitly clear resources of fenced node
...
This actually solves the bug originally "fixed" in 5f1432ccdd without
breaking VM resource allocations for working nodes.
2023-12-11 12:14:56 -05:00
Joshua Boniface
1ba37fe33d
Restore VM resource allocation location
...
Commit 5f1432ccdd changed where these happen due to a bug after
fencing. However, this completely broke node resource reporting, as
only the final instance will be queried here.
Revert this change and look further into the original bug.
2023-12-11 11:52:59 -05:00
Joshua Boniface
1a05077b10
Fix missing fstring
2023-12-11 11:29:49 -05:00
Joshua Boniface
57c28376a6
Port one final Ceph function to read_many
2023-12-11 10:25:36 -05:00
Joshua Boniface
e781d742e6
Fix bug with volume and snapshot listing
2023-12-11 10:21:46 -05:00
Joshua Boniface
6c6d1508a1
Add VNC info to screenshots
2023-12-11 03:40:49 -05:00
Joshua Boniface
741dafb26b
Port VM functions to read_many
2023-12-11 03:34:36 -05:00
Joshua Boniface
032d3ebf18
Remove debug output from image
2023-12-11 03:23:10 -05:00
Joshua Boniface
5d9e83e8ed
Fix output bugs in VM information
2023-12-11 03:04:46 -05:00
Joshua Boniface
ad0bd8649f
Finish missing sentence
2023-12-11 02:39:39 -05:00
Joshua Boniface
9b5e53e4b6
Add Grafana dashboard screenshot
2023-12-11 00:39:24 -05:00
Joshua Boniface
9617660342
Update Prometheus Grafana dashboard
2023-12-11 00:23:08 -05:00
Joshua Boniface
ab0a1e0946
Update and streamline README and update images
2023-12-10 23:57:01 -05:00
Joshua Boniface
7c116b2fbc
Ensure node health value is an int
2023-12-10 23:56:50 -05:00
Joshua Boniface
1023c55087
Fix bug in VM state list
2023-12-10 23:44:01 -05:00
Joshua Boniface
9235187c6f
Port Ceph functions to read_many
...
Only ports getOSDInformation, as all the others feature three or fewer
reads, which is acceptable sequentially.
2023-12-10 22:24:38 -05:00
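The log does not show how read_many is implemented, so the following is only a hypothetical sketch of the general idea behind batching many key reads instead of issuing them one after another. The signature, the `read_one` callable, the thread pool and the key paths are assumptions made for illustration, not the project's API.

```python
from concurrent.futures import ThreadPoolExecutor


def read_many(read_one, keys, max_workers=8):
    """Return the values for `keys`, in the same order, reading concurrently.

    `read_one` is any callable that fetches a single key (hypothetical).
    """
    keys = list(keys)
    if not keys:
        return []
    workers = min(max_workers, len(keys))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order in its results.
        return list(pool.map(read_one, keys))


# In-memory stand-in for the real key store, with made-up key paths.
fake_store = {
    "/osd/0/node": "hv1",
    "/osd/0/device": "/dev/sdb",
    "/osd/0/stats": "{}",
    "/osd/0/db_device": "",
}
values = read_many(fake_store.get, list(fake_store))
```

With only a handful of keys the overhead of batching buys little, which matches the commit's note that three or fewer reads are left sequential.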
Joshua Boniface
0c94f1b4f8
Port Network functions to read_many
2023-12-10 22:19:21 -05:00
Joshua Boniface
44a4f0e1f7
Use new info detail output instead of new lists
...
Avoids multiple additional ZK calls by using data that is now in the
status detail output.
2023-12-10 22:19:09 -05:00