Joshua Boniface
3cb8a70f04
Add forcing to OSD purge
2023-11-02 23:20:48 -04:00
Joshua Boniface
f53af510c1
Avoid startup failures if OSD removed
2023-11-02 22:24:39 -04:00
Joshua Boniface
d5d783fad3
Set proper split flag
2023-11-02 22:20:22 -04:00
Joshua Boniface
980ea6a9e9
Adjust handling of ext_db and _count options
...
Avoid the use of superfluous flag options, default them to none, and add
support for fixed-size DB LVs.
2023-11-02 13:29:47 -04:00
Joshua Boniface
8780044be6
Ensure db_device is an empty string
2023-11-02 00:52:18 -04:00
Joshua Boniface
f08c654f22
Fix missing fstring
2023-11-01 21:41:06 -04:00
Joshua Boniface
8b93f9a80e
Handle OSD index errors during stats collection
2023-11-01 21:33:40 -04:00
Joshua Boniface
526a5f4a74
Add support for split OSD adds
...
Allows creating multiple OSDs on a single (NVMe) block device,
leveraging the "ceph-volume lvm batch" command. Replaces the previous
method of creating OSDs.
Also adds a new ZK item for each OSD indicating if it is split or not.
2023-11-01 21:31:35 -04:00
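A minimal sketch of how a split OSD add could hand work to "ceph-volume lvm batch", as the commit above describes. The helper name build_split_osd_command and the example device path are hypothetical and are not taken from the PVC source; only the ceph-volume subcommand and its --osds-per-device, --bluestore, and --yes options are real.

```python
import shlex
from typing import List


def build_split_osd_command(device: str, osd_count: int) -> List[str]:
    """Build a ceph-volume command that creates osd_count OSDs on one device.

    Hypothetical helper for illustration; PVC's real code may differ.
    """
    if osd_count < 1:
        raise ValueError("osd_count must be at least 1")
    return [
        "ceph-volume", "lvm", "batch",
        "--yes",                              # do not prompt for confirmation
        "--bluestore",                        # create BlueStore OSDs
        "--osds-per-device", str(osd_count),  # split the device into N OSDs
        device,
    ]


# Print the command instead of executing it, so the sketch is safe to run.
cmd = build_split_osd_command("/dev/nvme0n1", 2)
print(shlex.join(cmd))
```

The command is only printed here; actually running it would need root privileges and a Ceph node.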
Joshua Boniface
aa0b1f504f
Fix output bug
2023-11-01 15:46:38 -04:00
Joshua Boniface
5b4dd61754
Bump version to 0.9.80
2023-10-27 09:56:31 -04:00
Joshua Boniface
221af3f241
Bump version to 0.9.79
2023-10-24 02:10:24 -04:00
Joshua Boniface
0769f1ea52
Increase service start time to 10s
2023-10-23 22:24:03 -04:00
Joshua Boniface
50aabde320
Ensure bond count is compared with actual qty
2023-10-22 02:28:04 -04:00
Joshua Boniface
6e83300d78
Increase ipmi plugin timeout
2023-10-04 19:21:59 -04:00
Joshua Boniface
c6c44bf775
Bump version to 0.9.78
2023-09-30 12:57:55 -04:00
Joshua Boniface
7c0f12750e
Bump version to 0.9.77
2023-09-19 11:05:55 -04:00
Joshua Boniface
51e78480fa
Bump version to 0.9.76
2023-09-18 10:15:52 -04:00
Joshua Boniface
f46bfc962f
Bump version to 0.9.75
2023-09-16 23:06:38 -04:00
Joshua Boniface
714d4b6005
Revert float conversion of cpu_cores
...
Results in much uglier output, and there are no decimal core counts.
2023-09-16 23:06:07 -04:00
Joshua Boniface
fa8329ac3d
Explicitly round load avg in load plugin
2023-09-16 22:58:49 -04:00
Joshua Boniface
457b7bed3d
Handle exceptions in fence migrations
2023-09-16 22:56:09 -04:00
Joshua Boniface
86115b2928
Add startup message for IPMI reachability
...
It's good to know that this succeeded in addition to knowing if it
failed.
2023-09-16 22:41:58 -04:00
Joshua Boniface
1a906b589e
Bump version to 0.9.74
2023-09-16 00:18:13 -04:00
Joshua Boniface
7b230d8bd5
Add monitoring plugin for hardware RAID arrays
2023-09-16 00:02:53 -04:00
Joshua Boniface
48662e90c1
Remove obsolete monitoring_instance passing
2023-09-15 22:47:45 -04:00
Joshua Boniface
079381c03e
Move printing to end and add runtime
2023-09-15 22:40:09 -04:00
Joshua Boniface
794cea4a02
Reverse ordering, run checks before starting timer
2023-09-15 22:25:37 -04:00
Joshua Boniface
fa24f3ba75
Fix bad fstring in psur check
2023-09-15 22:19:49 -04:00
Joshua Boniface
caadafa80d
Add PSU redundancy sensor check
2023-09-15 19:07:29 -04:00
Joshua Boniface
479e156234
Run monitoring plugins once on startup
2023-09-15 17:53:16 -04:00
Joshua Boniface
86830286f3
Adjust message printing to be on one line
2023-09-15 17:00:34 -04:00
Joshua Boniface
4d51318a40
Make monitoring interval configurable
2023-09-15 16:54:51 -04:00
Joshua Boniface
cba6f5be48
Fix wording of non-coordinator state
2023-09-15 16:51:04 -04:00
Joshua Boniface
254303b9d4
Use coordinator_state instead of router_state
...
Makes it much clearer what this variable represents.
2023-09-15 16:47:56 -04:00
Joshua Boniface
40b7d68853
Separate monitoring and move to 60s interval
...
Removes the dependency of the monitoring subsystem on the node
keepalives, and runs the plugins at a 60s interval to avoid excessive
backups if a plugin takes too long.
Adds its own logs and related items as required.
Finally adds a new required argument to the run() of plugins, the
coordinator state, which can be used by a plugin to determine actions
based on whether the node is a primary, secondary, or non-coordinator.
2023-09-15 16:47:11 -04:00
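A rough sketch of the design described in the commit above: plugins receive the coordinator state as an argument to run(), and a separate loop runs them on their own interval instead of inside the node keepalive. The class and function names, and the state strings "primary" and "secondary", are assumptions made for illustration and are not PVC's actual plugin API.

```python
import threading


class ExamplePlugin:
    """Hypothetical plugin; not PVC's real plugin base class."""

    def run(self, coordinator_state):
        # Choose what to check based on this node's coordinator role.
        if coordinator_state == "primary":
            return "cluster-wide checks"
        if coordinator_state == "secondary":
            return "standby checks"
        return "node-local checks only"


def run_plugins(plugins, coordinator_state):
    """Run every plugin once; one failing plugin does not stop the rest."""
    results = {}
    for plugin in plugins:
        name = type(plugin).__name__
        try:
            results[name] = plugin.run(coordinator_state)
        except Exception as exc:
            results[name] = f"failed: {exc}"
    return results


def monitoring_loop(plugins, get_state, stop_event, interval=60.0):
    """Run plugins every `interval` seconds, independent of keepalives."""
    while not stop_event.is_set():
        run_plugins(plugins, get_state())
        # Wait for the interval, waking early if a stop is requested.
        stop_event.wait(interval)


print(run_plugins([ExamplePlugin()], "primary"))
```

In a daemon, monitoring_loop would typically run in its own thread with a threading.Event used to request shutdown.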
Joshua Boniface
a8115cafd1
Bump version to 0.9.73
2023-09-02 02:16:19 -04:00
Joshua Boniface
570da99605
Avoid failures if no children found
2023-09-02 01:36:17 -04:00
Joshua Boniface
fdda47e8a2
Bump version to 0.9.72
2023-09-01 16:34:45 -04:00
Joshua Boniface
bb2aac145d
Bump version to 0.9.71
2023-09-01 00:36:38 -04:00
Joshua Boniface
6c407d54c3
Bump version to 0.9.70
2023-08-31 14:15:54 -04:00
Joshua Boniface
cb413e5ce6
[Bookworm] Fix Ceph 16 OSD stat parsing
2023-08-31 00:45:03 -04:00
Joshua Boniface
123499f75f
[Bookworm] Specify YAML loader explicitly
2023-08-31 00:16:19 -04:00
Joshua Boniface
83b8ce7b62
Bump version to 0.9.69 (nice)
2023-08-29 22:02:13 -04:00
Joshua Boniface
5e43f9bd7c
Ensure Patroni failures do not block takeover
2023-08-29 22:00:11 -04:00
Joshua Boniface
ed087d83c2
Round cpuload to 2 decimal places
2023-08-29 21:41:44 -04:00
Joshua Boniface
83d475bd15
Bump version to 0.9.68
2023-08-27 20:59:23 -04:00
Joshua Boniface
705ec802a3
Bump version to 0.9.67
2023-08-27 14:47:20 -04:00
Joshua Boniface
0b90f37518
Bump version to 0.9.66
2023-08-27 11:41:22 -04:00
Joshua Boniface
1e083d7652
Bump version to 0.9.65
2023-08-23 01:56:57 -04:00
Joshua Boniface
075dbe7cc9
Bump version to 0.9.64
2023-08-18 12:34:27 -04:00
Joshua Boniface
b5f996febd
Fix bugs for node flush for stop/shutdown/restart
...
Previously VMs in stop/shutdown/restart states wouldn't be properly
handled during a node flush. This fixes the bugs and ensures that the
transient VM states (shutdown/restart) are completed before proceeding,
and then avoids setting a stopped/shutdown VM to shutdown/autostart.
2023-08-18 11:25:59 -04:00
Joshua Boniface
3a90fda109
Bump version to 0.9.63
2023-04-28 14:47:04 -04:00
Joshua Boniface
9114255af5
Add *.update-* obsolete configs to dpkg plugin
2023-04-10 15:39:40 -04:00
Joshua Boniface
2c3a3cdf52
Use try when watching health value in NodeInstance
2023-03-07 09:53:01 -05:00
Joshua Boniface
0b583bfdaf
Bump IPMI timeout to 2 seconds
2023-03-07 09:25:27 -05:00
Joshua Boniface
7c07fbefff
Adjust keepalive health printing and ordering
2023-02-24 11:08:30 -05:00
Joshua Boniface
202dc3ed59
Correct error handling if monitoring plugins fail
2023-02-24 10:19:41 -05:00
Joshua Boniface
4c2d99f8a6
Fix bug with SMART info
2023-02-23 13:21:23 -05:00
Joshua Boniface
bcff6650d0
Set timeout on IPMI command
2023-02-23 11:10:09 -05:00
Joshua Boniface
a11206253d
Fix ZK check location
2023-02-23 11:04:02 -05:00
Joshua Boniface
45ad3b9a17
Bump version to 0.9.62
2023-02-22 18:13:45 -05:00
Joshua Boniface
dc4e56db4b
Add IPMI monitoring check
2023-02-22 15:02:08 -05:00
Joshua Boniface
e45b3108a2
Add health delta change to message output
2023-02-22 15:02:08 -05:00
Joshua Boniface
118237a53b
Fix bad string value for message
2023-02-22 15:02:08 -05:00
Joshua Boniface
9805681f94
Use consistent connection with other checks
2023-02-22 15:02:08 -05:00
Joshua Boniface
6c9abb2abe
Add Libvirtd monitoring check
2023-02-22 15:02:08 -05:00
Joshua Boniface
a1122c6e71
Add Zookeeper monitoring check
2023-02-22 15:02:08 -05:00
Joshua Boniface
3696f81597
Add PostgreSQL monitoring check
2023-02-22 15:02:08 -05:00
Joshua Boniface
5ca0d903b6
Adjust comment message
2023-02-22 15:02:08 -05:00
Joshua Boniface
626424b74a
Adjust Munin threshold values
2023-02-22 10:42:43 -05:00
Joshua Boniface
c9ceb3159b
Remove obsolete LINKSPEED variable
2023-02-22 01:04:25 -05:00
Joshua Boniface
6525a2568b
Adjust health delta of load to 50
...
This is a very bad situation and should be critical.
2023-02-22 01:03:12 -05:00
Joshua Boniface
09a005d3d7
Adjust health delta of EDAC Uncorrected to 50
...
This is a very bad situation and should be critical.
2023-02-22 01:01:54 -05:00
Joshua Boniface
fb0fcc0597
Update readme for Munin plugin
2023-02-18 00:00:04 -05:00
Joshua Boniface
3009f24910
Fix typo in var and flip conditional
2023-02-17 16:18:42 -05:00
Joshua Boniface
5ae836f1c5
Fix various issues with PVC Munin plugin
2023-02-17 15:41:16 -05:00
Joshua Boniface
eda1b95d5f
Update Munin plugin example
2023-02-16 16:06:00 -05:00
Joshua Boniface
3bd93563e6
Add CheckMK monitoring example plugins
2023-02-16 16:05:47 -05:00
Joshua Boniface
1093ca6264
Disallow health less than 0
2023-02-15 16:50:24 -05:00
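One way to picture the change above, as a sketch only: assume node health starts at 100 and each failing check subtracts its health delta, with the result clamped so it never drops below 0. The function name node_health and the starting value of 100 are assumptions, not taken from the PVC source.

```python
def node_health(deltas):
    """Return a health value in the range 0..100.

    Assumes health starts at 100 and each entry in `deltas` is the
    non-negative penalty reported by one failing check.
    """
    return max(0, 100 - sum(deltas))


print(node_health([]))            # no failing checks
print(node_health([10, 25]))      # two minor problems
print(node_health([50, 50, 25]))  # penalties exceed 100, clamped at 0
```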
Joshua Boniface
388f6556c0
Remove extra text from packages plugin
2023-02-15 16:28:41 -05:00
Joshua Boniface
6c7be492b8
Move Ceph health to global cluster health
2023-02-15 15:46:13 -05:00
Joshua Boniface
f4eef30770
Add JSON health to cluster data
2023-02-15 15:26:57 -05:00
Joshua Boniface
8565cf26b3
Add disk monitoring plugin
2023-02-15 11:30:49 -05:00
Joshua Boniface
0ecf219910
Run setup during plugin loads
2023-02-15 10:11:38 -05:00
Joshua Boniface
0f4edc54d1
Use percentage in keepalive output
2023-02-15 01:56:02 -05:00
Joshua Boniface
ca91be51e1
Improve ethtool parsing speeds
2023-02-14 15:49:58 -05:00
Joshua Boniface
e29d0e89eb
Add NIC monitoring plugin
2023-02-14 15:43:52 -05:00
Joshua Boniface
14d29f2986
Adjust text on log message
2023-02-13 22:21:23 -05:00
Joshua Boniface
bc88d764b0
Add logging flag for monitoring plugin output
2023-02-13 22:04:39 -05:00
Joshua Boniface
a3c31564ca
Flip condition in EDAC check
2023-02-13 21:58:56 -05:00
Joshua Boniface
b07396c39a
Fix bugs if plugins fail to load
2023-02-13 21:51:48 -05:00
Joshua Boniface
71139fa66d
Add EDAC check plugin
2023-02-13 21:43:13 -05:00
Joshua Boniface
1ea4800212
Set node health to None when restarting
2023-02-13 15:54:46 -05:00
Joshua Boniface
9c14d84bfc
Add node health value and send out API
2023-02-13 15:53:39 -05:00
Joshua Boniface
d8f346abdd
Move Ceph cluster health reporting to plugin
...
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 13:29:40 -05:00
Joshua Boniface
2ee52e44d3
Move Ceph cluster health reporting to plugin
...
Also removes several outputs from the normal keepalive that were
superfluous/static so that the main output fits on one line.
2023-02-13 12:13:56 -05:00
Joshua Boniface
3c742a827b
Initial implementation of monitoring plugin system
2023-02-13 12:06:26 -05:00
Joshua Boniface
aeb238f43c
Bump version to 0.9.61
2023-02-08 10:08:05 -05:00
Joshua Boniface
a49510ecc8
Bump version to 0.9.60
2022-12-06 15:42:55 -05:00
Joshua Boniface
92feeefd26
Bump version to 0.9.59
2022-11-15 15:50:15 -05:00