e8f0005894
Bump version to 0.9.69 (nice)
2023-08-29 22:02:13 -04:00
e15f9ed509
Ensure Patroni failures do not block takeover
2023-08-29 22:00:11 -04:00
26921d81cc
Round cpuload to 2 decimal places
2023-08-29 21:41:44 -04:00
2e1269eaae
Bump version to 0.9.68
2023-08-27 20:59:23 -04:00
1c79ce05ac
Bump version to 0.9.67
2023-08-27 14:47:20 -04:00
d08b90f90d
Bump version to 0.9.66
2023-08-27 11:41:22 -04:00
ce9eaaac8e
Bump version to 0.9.65
2023-08-23 01:56:57 -04:00
529ecfdcf0
Bump version to 0.9.64
2023-08-18 12:34:27 -04:00
36558c73b8
Fix bugs for node flush for stop/shutdown/restart
...
Previously VMs in stop/shutdown/restart states wouldn't be properly
handled during a node flush. This fixes the bugs and ensures that the
transient VM states (shutdown/restart) are completed before proceeding,
and then avoids setting a stopped/shutdown VM to shutdown/autostart.
2023-08-18 11:25:59 -04:00
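A minimal sketch of the flush behaviour described in the commit above, not the project's actual code. The function names, the `get_state` callable, and the state strings used here are all hypothetical and chosen only to illustrate the idea of waiting for a transient VM state to settle before deciding whether to act on the VM.

```python
import time

# Assumption: these are the transient states named in the commit message.
TRANSIENT_STATES = {"shutdown", "restart"}


def wait_for_settled_state(get_state, poll_interval=0.5, timeout=30.0):
    """Poll a (hypothetical) state getter until the VM leaves a transient state.

    Returns the last observed state, which may still be transient if the
    timeout expired first.
    """
    deadline = time.monotonic() + timeout
    state = get_state()
    while state in TRANSIENT_STATES and time.monotonic() < deadline:
        time.sleep(poll_interval)
        state = get_state()
    return state


def should_migrate_on_flush(settled_state):
    """Only running VMs are moved; stopped VMs are left untouched.

    Assumption: "start" denotes a running VM in this sketch.
    """
    return settled_state == "start"
```

With a getter that reports `shutdown` twice and then `stop`, `wait_for_settled_state` returns `stop`, and `should_migrate_on_flush` then leaves the VM alone instead of changing its state.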
3fa111aba5
Bump version to 0.9.63
2023-04-28 14:47:04 -04:00
2af217ced1
Use try when watching health value in NodeInstance
2023-03-07 09:53:01 -05:00
6ac4b7a54e
Adjust keepalive health printing and ordering
2023-02-24 11:08:30 -05:00
faa96ff6c4
Correct error handling if monitoring plugins fail
2023-02-24 10:19:41 -05:00
646785b7f8
Bump version to 0.9.62
2023-02-22 18:13:45 -05:00
a9e7713abf
Add health delta change to message output
2023-02-22 15:02:08 -05:00
0f3cd13da1
Fix bad string value for message
2023-02-22 15:02:08 -05:00
4ab0bdd9e8
Disallow health less than 0
2023-02-15 16:50:24 -05:00
3a1b8f0e7a
Add JSON health to cluster data
2023-02-15 15:26:57 -05:00
fc16e26f23
Run setup during plugin loads
2023-02-15 10:11:38 -05:00
8aa74aae62
Use percentage in keepalive output
2023-02-15 01:56:02 -05:00
8e6632bf10
Adjust text on log message
2023-02-13 22:21:23 -05:00
96d3aff7ad
Add logging flag for monitoring plugin output
2023-02-13 22:04:39 -05:00
54373c5bec
Fix bugs if plugins fail to load
2023-02-13 21:51:48 -05:00
af436a93cc
Set node health to None when restarting
2023-02-13 15:54:46 -05:00
edb3aea990
Add node health value and send out API
2023-02-13 15:53:39 -05:00
4d786c11e3
Move Ceph cluster health reporting to plugin
...
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 13:29:40 -05:00
25f3faa08f
Move Ceph cluster health reporting to plugin
...
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 12:13:56 -05:00
3ad6ff2d9c
Initial implementation of monitoring plugin system
2023-02-13 12:06:26 -05:00
c7c47d9f86
Bump version to 0.9.61
2023-02-08 10:08:05 -05:00
0b8d26081b
Bump version to 0.9.60
2022-12-06 15:42:55 -05:00
f3ba4b6294
Bump version to 0.9.59
2022-11-15 15:50:15 -05:00
a28df75a5d
Bump version to 0.9.58
2022-11-07 12:27:48 -05:00
d63e80675a
Bump version to 0.9.57
2022-11-06 01:39:50 -04:00
ef3c22d793
Bump version to 0.9.56
2022-10-27 14:21:04 -04:00
a81d419a2e
Update copyright header year
2022-10-06 11:55:27 -04:00
c84ee0f4f1
Bump version to 0.9.55
2022-10-04 13:21:40 -04:00
76c51460b0
Avoid raise/handle deadlocks
...
Can cause log flooding in some edge cases and isn't really needed any
longer. Use a proper conditional followed by an actual error handler.
2022-10-03 14:04:12 -04:00
4b41ee2817
Bump version to 0.9.54
2022-08-23 11:01:05 -04:00
6146b062d6
Bump version to 0.9.53
2022-08-12 17:47:11 -04:00
73c1ac732e
Bump version to 0.9.52
2022-08-12 11:09:25 -04:00
58dd5830eb
Add additional kb_ values to OSD stats
...
Allows for easier parsing later to get e.g. % values and more details on
the used amounts.
2022-08-11 11:06:36 -04:00
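As an illustration of the kind of parsing the commit above has in mind, here is a small sketch that derives a used-space percentage from kilobyte counters. The key names `kb` and `kb_used` and the function name are assumptions made for this example; the commit does not list the exact fields it adds.

```python
def osd_used_percent(stats):
    """Return used space as a percentage, rounded to 2 decimal places.

    Assumption: `stats` is a dict with a total size under "kb" and the
    used amount under "kb_used", both in kilobytes.
    """
    total_kb = stats.get("kb", 0)
    used_kb = stats.get("kb_used", 0)
    if total_kb <= 0:
        # An OSD that has never reported stats has no meaningful percentage.
        return 0.0
    return round(100.0 * used_kb / total_kb, 2)
```

Guarding against a zero total keeps the helper usable for an OSD whose stats have not been populated yet.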
5ae430e1c5
Bump version to 0.9.51
2022-07-25 23:25:41 -04:00
e464dcb483
Bump version to 0.9.50
2022-07-06 16:01:14 -04:00
27214c8190
Fix bug with space-containing detect strings
2022-07-06 15:58:57 -04:00
baf5a132ff
Bump version to 0.9.49
2022-05-06 15:49:39 -04:00
21bbb0393f
Add support for replacing/refreshing OSDs
...
Adds commands to both replace an OSD disk, and refresh (reimport) an
existing OSD disk on a new node. This handles the cases where an OSD
disk should be replaced (either due to upgrades or failures) or where a
node is rebuilt in-place and an existing OSD must be re-imported to it.
This should avoid the need to do a full remove/add sequence for either
case.
Also cleans up some aspects of OSD removal that are identical between
methods (e.g. using safe-to-destroy and sleeping after stopping) and
fixes a bug if an OSD does not truly exist when the daemon starts up.
2022-05-06 15:32:06 -04:00
1f8f3252a6
Fix bug with initial JSON for stats
2022-05-02 13:28:19 -04:00
b47c9832b7
Refactor OSD removal to use new ZK data
...
With the OSD LVM information stored in Zookeeper, we can use this to
determine the actual block device to zap rather than relying on runtime
determination and guesstimation.
2022-05-02 12:52:22 -04:00
d2757004db
Store additional OSD information in ZK
...
Ensures that information like the FSIDs and the OSD LVM volume are
stored in Zookeeper at creation time and updated at daemon start time
(to ensure the data is populated at least once, or if the /dev/sdX
path changes).
This will allow safer operation of OSD removals and the potential
implementation of re-activation after node replacements.
2022-05-02 12:11:39 -04:00
7323269775
Ensure initial OSD stats is populated
...
Values are all invalid but this ensures the client won't error out when
trying to show an OSD that has never checked in yet.
2022-04-29 16:50:30 -04:00