Commit Graph

392 Commits

Author SHA1 Message Date
Joshua Boniface 6c407d54c3 Bump version to 0.9.70 2023-08-31 14:15:54 -04:00
Joshua Boniface cb413e5ce6 [Bookworm] Fix Ceph 16 OSD stat parsing 2023-08-31 00:45:03 -04:00
Joshua Boniface 123499f75f [Bookworm] Specify YAML loader explicitly 2023-08-31 00:16:19 -04:00
Joshua Boniface 83b8ce7b62 Bump version to 0.9.69 (nice) 2023-08-29 22:02:13 -04:00
Joshua Boniface 5e43f9bd7c Ensure Patroni failures do not block takeover 2023-08-29 22:00:11 -04:00
Joshua Boniface ed087d83c2 Round cpuload to 2 decimal places 2023-08-29 21:41:44 -04:00
Joshua Boniface 83d475bd15 Bump version to 0.9.68 2023-08-27 20:59:23 -04:00
Joshua Boniface 705ec802a3 Bump version to 0.9.67 2023-08-27 14:47:20 -04:00
Joshua Boniface 0b90f37518 Bump version to 0.9.66 2023-08-27 11:41:22 -04:00
Joshua Boniface 1e083d7652 Bump version to 0.9.65 2023-08-23 01:56:57 -04:00
Joshua Boniface 075dbe7cc9 Bump version to 0.9.64 2023-08-18 12:34:27 -04:00
Joshua Boniface b5f996febd Fix bugs for node flush for stop/shutdown/restart
Previously VMs in stop/shutdown/restart states wouldn't be properly
handled during a node flush. This fixes the bugs and ensures that the
transient VM states (shutdown/restart) are completed before proceeding,
and then avoids setting a stopped/shutdown VM to shutdown/autostart.
2023-08-18 11:25:59 -04:00
Joshua Boniface 3a90fda109 Bump version to 0.9.63 2023-04-28 14:47:04 -04:00
Joshua Boniface 2c3a3cdf52 Use try when watching health value in NodeInstance 2023-03-07 09:53:01 -05:00
Joshua Boniface 7c07fbefff Adjust keepalive health printing and ordering 2023-02-24 11:08:30 -05:00
Joshua Boniface 202dc3ed59 Correct error handling if monitoring plugins fail 2023-02-24 10:19:41 -05:00
Joshua Boniface 45ad3b9a17 Bump version to 0.9.62 2023-02-22 18:13:45 -05:00
Joshua Boniface e45b3108a2 Add health delta change to message output 2023-02-22 15:02:08 -05:00
Joshua Boniface 118237a53b Fix bad string value for message 2023-02-22 15:02:08 -05:00
Joshua Boniface 1093ca6264 Disallow health less than 0 2023-02-15 16:50:24 -05:00
Joshua Boniface f4eef30770 Add JSON health to cluster data 2023-02-15 15:26:57 -05:00
Joshua Boniface 0ecf219910 Run setup during plugin loads 2023-02-15 10:11:38 -05:00
Joshua Boniface 0f4edc54d1 Use percentage in keepalive output 2023-02-15 01:56:02 -05:00
Joshua Boniface 14d29f2986 Adjust text on log message 2023-02-13 22:21:23 -05:00
Joshua Boniface bc88d764b0 Add logging flag for monitoring plugin output 2023-02-13 22:04:39 -05:00
Joshua Boniface b07396c39a Fix bugs if plugins fail to load 2023-02-13 21:51:48 -05:00
Joshua Boniface 1ea4800212 Set node health to None when restarting 2023-02-13 15:54:46 -05:00
Joshua Boniface 9c14d84bfc Add node health value and send out API 2023-02-13 15:53:39 -05:00
Joshua Boniface d8f346abdd Move Ceph cluster health reporting to plugin
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 13:29:40 -05:00
Joshua Boniface 2ee52e44d3 Move Ceph cluster health reporting to plugin
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 12:13:56 -05:00
Joshua Boniface 3c742a827b Initial implementation of monitoring plugin system 2023-02-13 12:06:26 -05:00
Joshua Boniface aeb238f43c Bump version to 0.9.61 2023-02-08 10:08:05 -05:00
Joshua Boniface a49510ecc8 Bump version to 0.9.60 2022-12-06 15:42:55 -05:00
Joshua Boniface 92feeefd26 Bump version to 0.9.59 2022-11-15 15:50:15 -05:00
Joshua Boniface 095bcb2373 Bump version to 0.9.58 2022-11-07 12:27:48 -05:00
Joshua Boniface d65f512897 Bump version to 0.9.57 2022-11-06 01:39:50 -04:00
Joshua Boniface c3bc55eff8 Bump version to 0.9.56 2022-10-27 14:21:04 -04:00
Joshua Boniface 726d0a562b Update copyright header year 2022-10-06 11:55:27 -04:00
Joshua Boniface f1df1cfe93 Bump version to 0.9.55 2022-10-04 13:21:40 -04:00
Joshua Boniface 5942aa50fc Avoid raise/handle deadlocks
Can cause log flooding in some edge cases and isn't really needed any
longer. Use a proper conditional followed by an actual error handler.
2022-10-03 14:04:12 -04:00
Joshua Boniface 239c392892 Bump version to 0.9.54 2022-08-23 11:01:05 -04:00
Joshua Boniface 9b499b9f48 Bump version to 0.9.53 2022-08-12 17:47:11 -04:00
Joshua Boniface 2a21d48128 Bump version to 0.9.52 2022-08-12 11:09:25 -04:00
Joshua Boniface 8d0f26ff7a Add additional kb_ values to OSD stats
Allows for easier parsing later to get e.g. % values and more details on
the used amounts.
2022-08-11 11:06:36 -04:00
Joshua Boniface 645b525ad7 Bump version to 0.9.51 2022-07-25 23:25:41 -04:00
Joshua Boniface 932b3c55a3 Bump version to 0.9.50 2022-07-06 16:01:14 -04:00
Joshua Boniface 92e2ff7449 Fix bug with space-containing detect strings 2022-07-06 15:58:57 -04:00
Joshua Boniface 51ad2058ed Bump version to 0.9.49 2022-05-06 15:49:39 -04:00
Joshua Boniface 7a40c7a55b Add support for replacing/refreshing OSDs
Adds commands to both replace an OSD disk, and refresh (reimport) an
existing OSD disk on a new node. This handles the cases where an OSD
disk should be replaced (either due to upgrades or failures) or where a
node is rebuilt in-place and an existing OSD must be re-imported to it.

This should avoid the need to do a full remove/add sequence for either
case.

Also cleans up some aspects of OSD removal that are identical between
methods (e.g. using safe-to-destroy and sleeping after stopping) and
fixes a bug if an OSD does not truly exist when the daemon starts up.
2022-05-06 15:32:06 -04:00
Joshua Boniface 3801fcc07b Fix bug with initial JSON for stats 2022-05-02 13:28:19 -04:00