Joshua Boniface
bc88d764b0
Add logging flag for monitoring plugin output
2023-02-13 22:04:39 -05:00
Joshua Boniface
b07396c39a
Fix bugs if plugins fail to load
2023-02-13 21:51:48 -05:00
Joshua Boniface
1ea4800212
Set node health to None when restarting
2023-02-13 15:54:46 -05:00
Joshua Boniface
9c14d84bfc
Add node health value and send out API
2023-02-13 15:53:39 -05:00
Joshua Boniface
d8f346abdd
Move Ceph cluster health reporting to plugin
...
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 13:29:40 -05:00
Joshua Boniface
2ee52e44d3
Move Ceph cluster health reporting to plugin
...
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 12:13:56 -05:00
Joshua Boniface
3c742a827b
Initial implementation of monitoring plugin system
2023-02-13 12:06:26 -05:00
Joshua Boniface
aeb238f43c
Bump version to 0.9.61
2023-02-08 10:08:05 -05:00
Joshua Boniface
a49510ecc8
Bump version to 0.9.60
2022-12-06 15:42:55 -05:00
Joshua Boniface
92feeefd26
Bump version to 0.9.59
2022-11-15 15:50:15 -05:00
Joshua Boniface
095bcb2373
Bump version to 0.9.58
2022-11-07 12:27:48 -05:00
Joshua Boniface
d65f512897
Bump version to 0.9.57
2022-11-06 01:39:50 -04:00
Joshua Boniface
c3bc55eff8
Bump version to 0.9.56
2022-10-27 14:21:04 -04:00
Joshua Boniface
726d0a562b
Update copyright header year
2022-10-06 11:55:27 -04:00
Joshua Boniface
f1df1cfe93
Bump version to 0.9.55
2022-10-04 13:21:40 -04:00
Joshua Boniface
5942aa50fc
Avoid raise/handle deadlocks
...
Can cause log flooding in some edge cases and isn't really needed any
longer. Use a proper conditional followed by an actual error handler.
2022-10-03 14:04:12 -04:00
Joshua Boniface
239c392892
Bump version to 0.9.54
2022-08-23 11:01:05 -04:00
Joshua Boniface
9b499b9f48
Bump version to 0.9.53
2022-08-12 17:47:11 -04:00
Joshua Boniface
2a21d48128
Bump version to 0.9.52
2022-08-12 11:09:25 -04:00
Joshua Boniface
8d0f26ff7a
Add additional kb_ values to OSD stats
...
Allows for easier parsing later to get e.g. % values and more details on
the used amounts.
2022-08-11 11:06:36 -04:00
Joshua Boniface
645b525ad7
Bump version to 0.9.51
2022-07-25 23:25:41 -04:00
Joshua Boniface
932b3c55a3
Bump version to 0.9.50
2022-07-06 16:01:14 -04:00
Joshua Boniface
92e2ff7449
Fix bug with space-containing detect strings
2022-07-06 15:58:57 -04:00
Joshua Boniface
51ad2058ed
Bump version to 0.9.49
2022-05-06 15:49:39 -04:00
Joshua Boniface
7a40c7a55b
Add support for replacing/refreshing OSDs
...
Adds commands to both replace an OSD disk, and refresh (reimport) an
existing OSD disk on a new node. This handles the cases where an OSD
disk should be replaced (either due to upgrades or failures) or where a
node is rebuilt in-place and an existing OSD must be re-imported to it.
This should avoid the need to do a full remove/add sequence for either
case.
Also cleans up some aspects of OSD removal that are identical between
methods (e.g. using safe-to-destroy and sleeping after stopping) and
fixes a bug if an OSD does not truly exist when the daemon starts up.
2022-05-06 15:32:06 -04:00
Joshua Boniface
3801fcc07b
Fix bug with initial JSON for stats
2022-05-02 13:28:19 -04:00
Joshua Boniface
c741900baf
Refactor OSD removal to use new ZK data
...
With the OSD LVM information stored in Zookeeper, we can use this to
determine the actual block device to zap rather than relying on runtime
determination and guesstimation.
2022-05-02 12:52:22 -04:00
Joshua Boniface
464f0e0356
Store additional OSD information in ZK
...
Ensures that information like the FSIDs and the OSD LVM volume are
stored in Zookeeper at creation time and updated at daemon start time
(to ensure the data is populated at least once, or if the /dev/sdX
path changes).
This will allow safer operation of OSD removals and the potential
implementation of re-activation after node replacements.
2022-05-02 12:11:39 -04:00
Joshua Boniface
cea8832f90
Ensure initial OSD stats is populated
...
Values are all invalid but this ensures the client won't error out when
trying to show an OSD that has never checked in yet.
2022-04-29 16:50:30 -04:00
Joshua Boniface
5807351405
Bump version to 0.9.48
2022-04-29 15:03:52 -04:00
Joshua Boniface
d6ca74376a
Fix bugs with forced removal
2022-04-29 14:03:07 -04:00
Joshua Boniface
4d698be34b
Add OSD removal force option
...
Ensures a removal can continue even in situations where some step(s)
might fail, for instance removing an obsolete OSD from a replaced node.
2022-04-29 11:16:33 -04:00
Joshua Boniface
ea709f573f
Bump version to 0.9.47
2021-12-28 22:03:08 -05:00
Joshua Boniface
58d57d7037
Bump version to 0.9.46
2021-12-28 15:02:14 -05:00
Joshua Boniface
00d2c67c41
Allow single-node clusters to restart and timeout
...
Prevents a daemon from waiting forever to terminate if it is primary,
and avoids this entirely if there is only a single node in the cluster.
2021-12-28 03:06:03 -05:00
Joshua Boniface
67131de4f6
Fix bug when removing OSDs
...
Ensure the OSD is down as well as out or purge might fail.
2021-12-28 03:05:34 -05:00
Joshua Boniface
abc23ebb18
Handle detect strings as arguments for blockdevs
...
Allows specifying blockdevs in the OSD and OSD-DB addition commands as
detect strings rather than actual block device paths. This provides
greater flexibility for automation with pvcbootstrapd (which originates
the concept of detect strings) and in general usage as well.
2021-12-28 02:53:02 -05:00
Joshua Boniface
f164d898c1
Bump version to 0.9.45
2021-11-25 09:34:20 -05:00
Joshua Boniface
817dffcf30
Bump version to 0.9.44
2021-11-11 16:20:38 -05:00
Joshua Boniface
6e9fcd38a3
Bump version to 0.9.43
2021-11-08 02:29:17 -05:00
Joshua Boniface
78faa90139
Reformat recent changes with Black
2021-11-06 03:27:07 -04:00
Joshua Boniface
23b1501f40
Fix linting error F541 f-string placeholders
2021-11-06 03:26:03 -04:00
Joshua Boniface
66bfad3109
Fix linting errors F522/F523 unused args
2021-11-06 03:24:50 -04:00
Joshua Boniface
c41664d2da
Reformat code with Black code formatter
...
Unify the code style along PEP 8 and Black principles using the tool.
2021-11-06 03:02:43 -04:00
Joshua Boniface
2e7b9b28b3
Add some delay and additional tries to fencing
2021-10-27 16:24:17 -04:00
Joshua Boniface
55f397a347
Fix bad location of config sets
2021-10-12 17:23:04 -04:00
Joshua Boniface
dfebb2d3e5
Also validate on failures
2021-10-12 17:11:03 -04:00
Joshua Boniface
e88147db4a
Bump version to 0.9.42
2021-10-12 15:25:42 -04:00
Joshua Boniface
b8204d89ac
Go back to passing if exception
...
Validation already happened and the set happens again later.
2021-10-12 14:21:52 -04:00
Joshua Boniface
fe73dfbdc9
Use current live value for bridge_mtu
...
This will ensure that upgrading without the bridge_mtu config key set
will keep things as they are.
2021-10-12 12:24:03 -04:00