Joshua Boniface
51b9f062b7
Add descriptions for each panel and reset version.
2023-12-29 10:30:28 -05:00
Joshua Boniface
e4ca74c201
Add Zookeeper performance to Grafana dashboard
2023-12-29 09:44:40 -05:00
Joshua Boniface
4969e90f8a
Allow enable/disable of Prometheus endpoints
...
Since these are unauthenticated, it might be the case that an
administrator wishes to completely disable these metrics endpoints.
Provide that option via pvc.conf through pvc-ansible's existing
enable_prometheus_exporters option and the new enable_prometheus
configuration flag.
Defaults to "yes" to provide all functionality unless explicitly
disabled, as the author assumes that the PVC API is secured in other
ways as well and that metric information is not completely sensitive.
2023-12-29 09:25:10 -05:00
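As an illustration of the commit above, a minimal sketch of gating an unauthenticated metrics endpoint behind a configuration flag that defaults to enabled. Only the flag name enable_prometheus comes from the commit message; the config dictionary layout, the function name and the return shape are assumptions made for this example and are not taken from the PVC source.

```python
def metrics_response(config, collect):
    """Return (body, status) for a metrics request.

    `config` is a hypothetical dict of parsed settings and `collect` is a
    hypothetical callable that produces the metrics text.
    """
    # Default to enabled when the flag is absent, matching the
    # "yes unless explicitly disabled" behaviour the commit describes.
    if not config.get("enable_prometheus", True):
        return ("", 404)
    return (collect(), 200)
```

With this shape, an administrator who sets the flag to a false value gets a 404 from the endpoint, and a configuration without the key behaves as before.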
Joshua Boniface
52f68909f6
Update Grafana dashboard
2023-12-28 14:55:43 -05:00
Joshua Boniface
0bcf8cfe19
Add Zookeeper metrics proxy
2023-12-28 13:53:15 -05:00
Joshua Boniface
2bb24d3b57
Update Prometheus dashboard and add README
2023-12-27 15:57:12 -05:00
Joshua Boniface
8083b7a3e6
Bump version to 0.9.87
2023-12-27 13:40:51 -05:00
Joshua Boniface
3346ce9bb0
Add missing shutdown state from combinations
2023-12-27 13:40:30 -05:00
Joshua Boniface
572596c575
Fix missing f-string placeholder
2023-12-27 13:21:20 -05:00
Joshua Boniface
e654fbba08
Move debug condition handling to Logger
...
Avoids many dozens of conditionals sprinkled throughout the code by
centralizing this check into the main Logger instance.
2023-12-27 13:01:45 -05:00
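The idea in the commit above can be sketched as follows: the logger itself decides whether a debug message is emitted, so callers no longer wrap each call in a conditional. The class layout, the `out` method signature and the "d" state code are assumptions for illustration, not the project's actual Logger API.

```python
class Logger:
    """Hypothetical logger that filters debug messages centrally."""

    def __init__(self, debug=False):
        self.debug = debug

    def out(self, message, state="i"):
        # The single debug check lives here instead of at every call site.
        if state == "d" and not self.debug:
            return False
        print(f"[{state}] {message}")
        return True
```

A call site then reduces from `if debug: logger.out(...)` to a plain `logger.out("...", state="d")`.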
Joshua Boniface
52bf5ad0ef
Update store_path set location
...
Prevents a bug if no cluster is selected while doing connection list
commands.
2023-12-27 12:42:19 -05:00
Joshua Boniface
576afc1e94
Update Grafana dashboard layouts
2023-12-27 12:24:46 -05:00
Joshua Boniface
4375f66793
Use proper get() for invalid values
2023-12-27 12:03:48 -05:00
Joshua Boniface
3df3ca5b44
Fix value for OSD utilization
...
Ceph provides in KB; convert to bytes.
2023-12-27 11:56:50 -05:00
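A small sketch of the unit conversion described in the commit above. The commit does not say whether "KB" means 1000 or 1024 bytes; this example assumes 1024, and the function name is hypothetical.

```python
BYTES_PER_KB = 1024  # assumption: the reported KB values are 1024-byte units


def kb_to_bytes(value_kb):
    """Convert a utilization figure reported in KB to bytes."""
    return int(value_kb) * BYTES_PER_KB
```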
Joshua Boniface
cb3c2cd86d
Adjust name of PVC cluster dashboard
2023-12-27 11:42:58 -05:00
Joshua Boniface
d0de4f1825
Update Grafana dashboard to overview
...
Adds resource utilization in addition to health.
2023-12-27 11:38:39 -05:00
Joshua Boniface
494c20263d
Move monitoring folder to top level
2023-12-27 11:37:49 -05:00
Joshua Boniface
431ee69620
Use proper percentage for pool util
2023-12-27 10:03:00 -05:00
Joshua Boniface
88f4d79d5a
Handle invalid values on older Libvirt versions
2023-12-27 09:51:24 -05:00
Joshua Boniface
84d22751d8
Fix bad JSON data handler
2023-12-27 09:43:37 -05:00
Joshua Boniface
40ff005a09
Fix handling of Ceph OSD bytes
2023-12-26 12:43:51 -05:00
Joshua Boniface
ab4ec7a5fa
Remove WebUI from README
2023-12-25 02:48:44 -05:00
Joshua Boniface
9604f655d0
Improve node utilization metrics and fix bugs
2023-12-25 02:47:41 -05:00
Joshua Boniface
3e4cc53fdd
Add node network statistics and utilization values
...
Adds a new physical network interface stats parser to the node
keepalives, and leverages this information to provide a network
utilization overview in the Prometheus metrics.
2023-12-21 15:45:01 -05:00
Joshua Boniface
d2d2a9c617
Include our newline atomically
...
Sometimes clashing log entries would print on the same line, likely due
to some sort of race condition in Python's print() built-in.
Instead, add a newline to our actual message and print without an end
character. This ensures atomic printing of our log messages.
2023-12-21 13:12:43 -05:00
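The change described in the commit above can be sketched like this: the newline becomes part of the message and print() is called with an empty end string, so the text and its line terminator are handed over together. The function name and the stream parameter are assumptions for this example.

```python
import sys


def emit(message, stream=None):
    """Print a log message with its newline included in the message itself."""
    stream = sys.stdout if stream is None else stream
    # Append the newline to the message and suppress print()'s own end
    # character, rather than letting print() add the newline separately.
    print(message + "\n", end="", file=stream)
```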
Joshua Boniface
6ed4efad33
Add new network.stats key to nodes
2023-12-21 12:48:48 -05:00
Joshua Boniface
39f9f3640c
Rename health metrics and add resource metrics
2023-12-21 09:40:49 -05:00
Joshua Boniface
c64e888d30
Fix incorrect cast of None
2023-12-14 16:00:53 -05:00
Joshua Boniface
f1249452e5
Fix bug if no nodes are present
2023-12-14 15:32:18 -05:00
Joshua Boniface
0a93f526e0
Bump version to 0.9.86
2023-12-14 14:46:29 -05:00
Joshua Boniface
7c9512fb22
Fix broken config file in API migration script
2023-12-14 14:45:58 -05:00
Joshua Boniface
e88b97f3a9
Print fenced state in red
2023-12-13 15:02:18 -05:00
Joshua Boniface
709c9cb73e
Pause pvchealthd startup until node daemon is run
...
If the health daemon starts too soon during a node bootup, it will
generate tons of erroneous faults while the node starts up.
Adds a conditional wait for the current node daemon to be in "run"
state before the health daemon really starts up.
2023-12-13 14:53:54 -05:00
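The conditional wait described in the commit above could look roughly like the following polling loop. The `get_state` callable, the optional timeout and the injectable `sleep` are all assumptions for this sketch; the commit only states that startup pauses until the node daemon is in the "run" state.

```python
import time


def wait_for_state(get_state, wanted="run", interval=1.0, timeout=None,
                   sleep=time.sleep):
    """Block until get_state() returns `wanted`.

    Returns True once the state matches, or False if an optional timeout
    (in seconds of accumulated sleep) is reached first.
    """
    waited = 0.0
    while get_state() != wanted:
        if timeout is not None and waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True
```

A health daemon would call this before starting its checks, passing a function that reads the current node daemon state from wherever the cluster stores it.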
Joshua Boniface
f41c5176be
Ensure health value is an int properly
2023-12-13 14:34:02 -05:00
Joshua Boniface
38e43b46c3
Update health detail messages format
2023-12-13 03:17:47 -05:00
Joshua Boniface
ed9c37982a
Move metric collection into daemon library
2023-12-11 19:20:30 -05:00
Joshua Boniface
0f24184b78
Explicitly clear resources of fenced node
...
This actually solves the bug originally "fixed" in 5f1432ccdd without
breaking VM resource allocations for working nodes.
2023-12-11 12:14:56 -05:00
Joshua Boniface
1ba37fe33d
Restore VM resource allocation location
...
Commit 5f1432ccdd changed where these happen due to a bug after fencing.
However this completely broke node resource reporting as only the final
instance will be queried here.
Revert this change and look further into the original bug.
2023-12-11 11:52:59 -05:00
Joshua Boniface
1a05077b10
Fix missing fstring
2023-12-11 11:29:49 -05:00
Joshua Boniface
57c28376a6
Port one final Ceph function to read_many
2023-12-11 10:25:36 -05:00
Joshua Boniface
e781d742e6
Fix bug with volume and snapshot listing
2023-12-11 10:21:46 -05:00
Joshua Boniface
6c6d1508a1
Add VNC info to screenshots
2023-12-11 03:40:49 -05:00
Joshua Boniface
741dafb26b
Port VM functions to read_many
2023-12-11 03:34:36 -05:00
Joshua Boniface
032d3ebf18
Remove debug output from image
2023-12-11 03:23:10 -05:00
Joshua Boniface
5d9e83e8ed
Fix output bugs in VM information
2023-12-11 03:04:46 -05:00
Joshua Boniface
ad0bd8649f
Finish missing sentence
2023-12-11 02:39:39 -05:00
Joshua Boniface
9b5e53e4b6
Add Grafana dashboard screenshot
2023-12-11 00:39:24 -05:00
Joshua Boniface
9617660342
Update Prometheus Grafana dashboard
2023-12-11 00:23:08 -05:00
Joshua Boniface
ab0a1e0946
Update and streamline README and update images
2023-12-10 23:57:01 -05:00
Joshua Boniface
7c116b2fbc
Ensure node health value is an int
2023-12-10 23:56:50 -05:00