373 Commits

Author SHA1 Message Date
4ab0bdd9e8 Disallow health less than 0 2023-02-15 16:50:24 -05:00
3a1b8f0e7a Add JSON health to cluster data 2023-02-15 15:26:57 -05:00
fc16e26f23 Run setup during plugin loads 2023-02-15 10:11:38 -05:00
8aa74aae62 Use percentage in keepalive output 2023-02-15 01:56:02 -05:00
8e6632bf10 Adjust text on log message 2023-02-13 22:21:23 -05:00
96d3aff7ad Add logging flag for monitoring plugin output 2023-02-13 22:04:39 -05:00
54373c5bec Fix bugs if plugins fail to load 2023-02-13 21:51:48 -05:00
af436a93cc Set node health to None when restarting 2023-02-13 15:54:46 -05:00
edb3aea990 Add node health value and send out API 2023-02-13 15:53:39 -05:00
4d786c11e3 Move Ceph cluster health reporting to plugin
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 13:29:40 -05:00
25f3faa08f Move Ceph cluster health reporting to plugin
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 12:13:56 -05:00
3ad6ff2d9c Initial implementation of monitoring plugin system 2023-02-13 12:06:26 -05:00
c7c47d9f86 Bump version to 0.9.61 2023-02-08 10:08:05 -05:00
0b8d26081b Bump version to 0.9.60 2022-12-06 15:42:55 -05:00
f3ba4b6294 Bump version to 0.9.59 2022-11-15 15:50:15 -05:00
a28df75a5d Bump version to 0.9.58 2022-11-07 12:27:48 -05:00
d63e80675a Bump version to 0.9.57 2022-11-06 01:39:50 -04:00
ef3c22d793 Bump version to 0.9.56 2022-10-27 14:21:04 -04:00
a81d419a2e Update copyright header year 2022-10-06 11:55:27 -04:00
c84ee0f4f1 Bump version to 0.9.55 2022-10-04 13:21:40 -04:00
76c51460b0 Avoid raise/handle deadlocks
Can cause log flooding in some edge cases and isn't really needed any
longer. Use a proper conditional followed by an actual error handler.
2022-10-03 14:04:12 -04:00
4b41ee2817 Bump version to 0.9.54 2022-08-23 11:01:05 -04:00
6146b062d6 Bump version to 0.9.53 2022-08-12 17:47:11 -04:00
73c1ac732e Bump version to 0.9.52 2022-08-12 11:09:25 -04:00
58dd5830eb Add additional kb_ values to OSD stats
Allows for easier parsing later to get e.g. % values and more details on
the used amounts.
2022-08-11 11:06:36 -04:00
5ae430e1c5 Bump version to 0.9.51 2022-07-25 23:25:41 -04:00
e464dcb483 Bump version to 0.9.50 2022-07-06 16:01:14 -04:00
27214c8190 Fix bug with space-containing detect strings 2022-07-06 15:58:57 -04:00
baf5a132ff Bump version to 0.9.49 2022-05-06 15:49:39 -04:00
21bbb0393f Add support for replacing/refreshing OSDs
Adds commands to both replace an OSD disk, and refresh (reimport) an
existing OSD disk on a new node. This handles the cases where an OSD
disk should be replaced (either due to upgrades or failures) or where a
node is rebuilt in-place and an existing OSD must be re-imported to it.

This should avoid the need to do a full remove/add sequence for either
case.

Also cleans up some aspects of OSD removal that are identical between
methods (e.g. using safe-to-destroy and sleeping after stopping) and
fixes a bug if an OSD does not truly exist when the daemon starts up.
2022-05-06 15:32:06 -04:00
1f8f3252a6 Fix bug with initial JSON for stats 2022-05-02 13:28:19 -04:00
b47c9832b7 Refactor OSD removal to use new ZK data
With the OSD LVM information stored in Zookeeper, we can use this to
determine the actual block device to zap rather than relying on runtime
determination and guesstimation.
2022-05-02 12:52:22 -04:00
d2757004db Store additional OSD information in ZK
Ensures that information like the FSIDs and the OSD LVM volume are
stored in Zookeeper at creation time and updated at daemon start time
(to ensure the data is populated at least once, or if the /dev/sdX
path changes).

This will allow safer operation of OSD removals and the potential
implementation of re-activation after node replacements.
2022-05-02 12:11:39 -04:00
7323269775 Ensure initial OSD stats is populated
Values are all invalid but this ensures the client won't error out when
trying to show an OSD that has never checked in yet.
2022-04-29 16:50:30 -04:00
85463f9aec Bump version to 0.9.48 2022-04-29 15:03:52 -04:00
19c37c3ed5 Fix bugs with forced removal 2022-04-29 14:03:07 -04:00
cb50eee2a9 Add OSD removal force option
Ensures a removal can continue even in situations where some step(s)
might fail, for instance removing an obsolete OSD from a replaced node.
2022-04-29 11:16:33 -04:00
313a5d1c7d Bump version to 0.9.47 2021-12-28 22:03:08 -05:00
c3d255be65 Bump version to 0.9.46 2021-12-28 15:02:14 -05:00
45fc8a47a3 Allow single-node clusters to restart and timeout
Prevents a daemon from waiting forever to terminate if it is primary,
and avoids this entirely if there is only a single node in the cluster.
2021-12-28 03:06:03 -05:00
07f2006f68 Fix bug when removing OSDs
Ensure the OSD is down as well as out or purge might fail.
2021-12-28 03:05:34 -05:00
f4c7fdffb8 Handle detect strings as arguments for blockdevs
Allows specifying blockdevs in the OSD and OSD-DB addition commands as
detect strings rather than actual block device paths. This provides
greater flexibility for automation with pvcbootstrapd (which originates
the concept of detect strings) and in general usage as well.
2021-12-28 02:53:02 -05:00
02a2f6a27a Bump version to 0.9.45 2021-11-25 09:34:20 -05:00
3aa20fbaa3 Bump version to 0.9.44 2021-11-11 16:20:38 -05:00
6febcfdd97 Bump version to 0.9.43 2021-11-08 02:29:17 -05:00
16544227eb Reformat recent changes with Black 2021-11-06 03:27:07 -04:00
73e3746885 Fix linting error F541 f-string placeholders 2021-11-06 03:26:03 -04:00
66230ce971 Fix linting errors F522/F523 unused args 2021-11-06 03:24:50 -04:00
2083fd824a Reformat code with Black code formatter
Unify the code style along PEP and Black principles using the tool.
2021-11-06 03:02:43 -04:00
3b02034b70 Add some delay and additional tries to fencing 2021-10-27 16:24:17 -04:00