Joshua Boniface
c08c3b2d7d
Improve thread timeouts in keepalive
...
Prevents parts of the keepalive from deadlocking while waiting on data
that will never arrive when internal processes fail. Based on testing,
this should ensure that the keepalive always finishes in <5 seconds.
2024-10-10 15:33:47 -04:00
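The idea of bounding each keepalive thread with a timeout can be sketched roughly as follows. This is only an illustration using the standard library, not the actual pvcnoded code: the helper `run_with_timeout` and the two collector functions are hypothetical names, and the timeout values are arbitrary.

```python
import threading
import time


def run_with_timeout(target, timeout):
    """Run target in a daemon thread; return True if it finished in time."""
    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    # join() returns once the thread ends or after `timeout` seconds,
    # whichever comes first, so the caller can never block indefinitely.
    thread.join(timeout=timeout)
    return not thread.is_alive()


def fast_collector():
    # Hypothetical collector that completes quickly.
    time.sleep(0.01)


def stuck_collector():
    # Hypothetical collector standing in for a call whose data never arrives.
    time.sleep(2)


finished_fast = run_with_timeout(fast_collector, timeout=1.0)
finished_stuck = run_with_timeout(stuck_collector, timeout=0.1)
print(finished_fast, finished_stuck)
```

With this pattern the caller decides what to do when a worker overruns (log it, skip the result, and so on) instead of waiting on it forever.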
Joshua Boniface
4c0d90b517
Add read lock timeouts to prevent deadlocks
2024-10-10 15:19:05 -04:00
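A lock timeout of this kind might look like the sketch below. It uses a plain `threading.Lock` as a stand-in for whatever lock the daemon actually takes, and `read_with_timeout` is a hypothetical helper, so treat it as an illustration of the pattern only.

```python
import threading


def read_with_timeout(lock, read_func, timeout):
    """Call read_func under lock, or return None if the lock is not free in time."""
    # acquire() with a timeout returns False instead of blocking forever.
    acquired = lock.acquire(timeout=timeout)
    if not acquired:
        return None
    try:
        return read_func()
    finally:
        # Always release, even if read_func raises.
        lock.release()


demo_lock = threading.Lock()
free_result = read_with_timeout(demo_lock, lambda: 42, timeout=0.1)

# Simulate another holder that never releases the lock.
demo_lock.acquire()
blocked_result = read_with_timeout(demo_lock, lambda: 42, timeout=0.1)
demo_lock.release()
print(free_result, blocked_result)
```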
Joshua Boniface
8cb44c0c5d
Bump version to 0.9.100
2024-08-30 11:03:33 -04:00
Joshua Boniface
02a775c99b
Bump version to 0.9.99
2024-08-28 11:15:55 -04:00
Joshua Boniface
97329bb90d
Sort Ceph pool data by name
...
There is no guarantee that both commands output the pools in the same
order, so sort them by name first so the iteration over the pools by ID
is successful.
2024-07-22 13:26:27 -04:00
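The ordering problem described above can be illustrated with a small sketch. The sample data, field names, and the `merge_pool_data` helper are all hypothetical; the point is only that two lists sorted by the same key can be walked in lockstep safely.

```python
# Two hypothetical command outputs describing the same pools in different orders.
pool_stats = [
    {"name": "vms", "used_bytes": 1024},
    {"name": "backups", "used_bytes": 4096},
]
pool_usage = [
    {"name": "backups", "free_bytes": 100},
    {"name": "vms", "free_bytes": 200},
]


def merge_pool_data(stats, usage):
    """Sort both lists by pool name, then merge entries position by position."""
    stats_sorted = sorted(stats, key=lambda pool: pool["name"])
    usage_sorted = sorted(usage, key=lambda pool: pool["name"])
    merged = []
    for stat_entry, usage_entry in zip(stats_sorted, usage_sorted):
        # Guard against the two sources describing different pools.
        if stat_entry["name"] != usage_entry["name"]:
            raise ValueError("pool lists do not match")
        merged.append({**stat_entry, **usage_entry})
    return merged


merged_pools = merge_pool_data(pool_stats, pool_usage)
print(merged_pools)
```

Without the sort, the first entry of each list here would refer to different pools and the positional merge would silently mix their values.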
Joshua Boniface
1aa5999109
Bump version to 0.9.98
2024-06-05 12:01:31 -04:00
Joshua Boniface
570460e5ee
Add --version flag to pvcnoded.py for info
2024-06-05 11:57:47 -04:00
Joshua Boniface
dcb9c0d12c
Improve fence handling conditions
...
Use the intermediate output text when judging the fence status, rather
than the retcode of the stop, as this should be more reliable.
2024-05-08 10:55:15 -04:00
Joshua Boniface
f1fe0c63f5
Bump version to 0.9.97
2024-04-19 10:32:16 -04:00
Joshua Boniface
9714ac20b2
Update formatting for Black 24.4.0
2024-04-19 10:26:06 -04:00
Joshua Boniface
79ad09ae59
Switch virtual memory free to allocated
...
Avoids incorrect reporting if cache/buffers exceeds normal.
2024-04-19 10:25:33 -04:00
Joshua Boniface
4c6aabec6a
Fix bug if d_network changes
2024-04-05 14:05:51 -04:00
Joshua Boniface
78c774b607
Bump version to 0.9.96
2024-03-08 14:23:07 -05:00
Joshua Boniface
dee8d186cf
Bump version to 0.9.95
2024-02-12 13:12:48 -05:00
Joshua Boniface
d63cc2e661
Bump version to 0.9.94
2024-02-06 13:31:50 -05:00
Joshua Boniface
18f09196be
Bump version to 0.9.93
2024-01-30 09:51:21 -05:00
Joshua Boniface
df40b779af
Bump version to 0.9.92
2024-01-29 09:39:10 -05:00
Joshua Boniface
f29b4c2755
Bump version to 0.9.91
2024-01-23 10:40:59 -05:00
Joshua Boniface
86ca363697
Bump version to 0.9.90
2024-01-11 10:22:48 -05:00
Joshua Boniface
a5763c9d25
Fix possible race condition applying schemas
...
Found an instance where two of these fired too close together and
caused a fatal error. Use a write lock, and then catch exceptions from
the schema.apply function in case it fails anyway.
2024-01-11 10:21:01 -05:00
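The combination of serializing the apply and catching its failure could be sketched like this. A local `threading.Lock` stands in for the real write lock, and `apply_schema_safely` with its callable argument is a hypothetical wrapper, not the project's actual function.

```python
import threading

# Stand-in for the write lock; the real lock type is not shown in the log.
schema_lock = threading.Lock()


def apply_schema_safely(apply_func):
    """Run apply_func under the lock; return False instead of crashing on error."""
    with schema_lock:
        try:
            apply_func()
            return True
        except Exception as exc:
            # A failed apply is reported rather than becoming a fatal error.
            print(f"Schema apply failed: {exc}")
            return False


def good_apply():
    # Hypothetical apply that succeeds.
    pass


def bad_apply():
    # Hypothetical apply that fails partway through.
    raise RuntimeError("schema version conflict")


good_result = apply_schema_safely(good_apply)
bad_result = apply_schema_safely(bad_apply)
print(good_result, bad_result)
```

The lock prevents two applies from overlapping, while the `try`/`except` covers the case where an apply still fails for some other reason.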
Joshua Boniface
09269f182c
Add live migrate max downtime selector meta field
...
Adds a new flag to VM metadata to allow setting the VM live migration
max downtime. This will enable very busy VMs that hang live migration to
have this value changed.
2024-01-11 00:05:50 -05:00
Joshua Boniface
e9b6072fa0
Bump version to 0.9.89
2024-01-09 12:15:53 -05:00
Joshua Boniface
1d480f5629
Bump version to 0.9.88
2023-12-29 14:56:33 -05:00
Joshua Boniface
123c7ce857
Update copyright header on all files for 2024
...
Last release of 2023 is probably the best time to do this.
2023-12-29 11:16:59 -05:00
Joshua Boniface
8083b7a3e6
Bump version to 0.9.87
2023-12-27 13:40:51 -05:00
Joshua Boniface
e654fbba08
Move debug condition handling to Logger
...
Avoids many dozens of conditionals sprinkled throughout the code by
centralizing this check into the main Logger instance.
2023-12-27 13:01:45 -05:00
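Centralizing the debug check might look roughly like the sketch below. The class shape, the `out` method, and the `"d"` state code are assumptions made for illustration and may not match the real Logger.

```python
class Logger:
    """Minimal logger that decides in one place whether debug lines are shown."""

    def __init__(self, debug=False):
        self.debug = debug
        self.lines = []  # kept so the behaviour is easy to inspect

    def out(self, message, state="i"):
        # The single debug check: callers no longer need their own conditionals.
        if state == "d" and not self.debug:
            return
        line = f"[{state}] {message}"
        self.lines.append(line)
        print(line)


quiet = Logger(debug=False)
quiet.out("Starting keepalive", state="i")
quiet.out("Raw interface counters", state="d")  # dropped silently

verbose = Logger(debug=True)
verbose.out("Raw interface counters", state="d")  # emitted
```

Call sites then log debug messages unconditionally and let the Logger instance decide whether they are emitted.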
Joshua Boniface
494c20263d
Move monitoring folder to top level
2023-12-27 11:37:49 -05:00
Joshua Boniface
3e4cc53fdd
Add node network statistics and utilization values
...
Adds a new physical network interface stats parser to the node
keepalives, and leverages this information to provide a network
utilization overview in the Prometheus metrics.
2023-12-21 15:45:01 -05:00
Joshua Boniface
39f9f3640c
Rename health metrics and add resource metrics
2023-12-21 09:40:49 -05:00
Joshua Boniface
0a93f526e0
Bump version to 0.9.86
2023-12-14 14:46:29 -05:00
Joshua Boniface
38e43b46c3
Update health detail messages format
2023-12-13 03:17:47 -05:00
Joshua Boniface
0f24184b78
Explicitly clear resources of fenced node
...
This actually solves the bug originally "fixed" in 5f1432ccdd without
breaking VM resource allocations for working nodes.
2023-12-11 12:14:56 -05:00
Joshua Boniface
1ba37fe33d
Restore VM resource allocation location
...
Commit 5f1432ccdd changed where these happen due to a bug after
fencing. However, this completely broke node resource reporting as only
the final instance will be queried here. Revert this change and look
further into the original bug.
2023-12-11 11:52:59 -05:00
Joshua Boniface
1a05077b10
Fix missing fstring
2023-12-11 11:29:49 -05:00
Joshua Boniface
9617660342
Update Prometheus Grafana dashboard
2023-12-11 00:23:08 -05:00
Joshua Boniface
9dc5097dbc
Bump version to 0.9.85
2023-12-10 01:00:33 -05:00
Joshua Boniface
53d632f283
Fix bug in example PVC Grafana dashboard
2023-12-10 00:50:05 -05:00
Joshua Boniface
7bc0760b78
Add time to "starting keepalive" message
...
Matches the pvchealthd output and provides a useful message detail to
this otherwise contextless message.
2023-12-10 00:40:32 -05:00
Joshua Boniface
9aee2a9075
Bump version to 0.9.84
2023-12-09 23:05:40 -05:00
Joshua Boniface
1f6347d24b
Add Prometheus monitoring examples
2023-12-09 17:42:51 -05:00
Joshua Boniface
988de1218f
Bump version to 0.9.83
2023-12-01 17:37:42 -05:00
Joshua Boniface
1fb0463dea
Adjust daemon service startup
...
Add healthd, adjust workerd, lower waittime
2023-11-30 03:28:02 -05:00
Joshua Boniface
03a738f878
Move config parser into daemon_lib
...
And reformat/add config values for API.
2023-11-30 00:05:37 -05:00
Joshua Boniface
4a2eba0961
Improve node output messages (from pvchealthd)
...
1. Output startup "list" entries in cyan with s state
2. Add start of keepalive run message
2023-11-29 21:21:51 -05:00
Joshua Boniface
647cba3cf5
Expand startup width for new daemon name
2023-11-29 21:21:51 -05:00
Joshua Boniface
41f4e4fb2f
Split health monitoring into discrete daemon/pkg
2023-11-29 21:21:51 -05:00
Joshua Boniface
83ceb41138
Add daemon name to Logger entries
2023-11-29 15:18:37 -05:00
Joshua Boniface
2545a7b744
Allow similar for IPMI hostnames
2023-11-28 16:09:01 -05:00
Joshua Boniface
ce907ff26a
Allow specifying static IPs instead of a file
2023-11-28 15:28:31 -05:00
Joshua Boniface
71e589e461
Remove superfluous debug output
...
This is printed in the startup logo block anyway.
2023-11-27 13:46:30 -05:00