431 Commits

Author SHA1 Message Date
b0909aed61 Get proper FSID value 2023-11-03 23:38:24 -04:00
f418b40527 Use proper FSID instead of hack 2023-11-03 16:38:19 -04:00
dd0177ce10 Rework replacement procedure again
Avoid calling other functions; replicate the actual process from Ceph
docs (https://docs.ceph.com/en/pacific/rados/operations/add-or-rm-osds/)
to ensure things work out well (e.g. preserving OSD IDs).
2023-11-03 16:31:56 -04:00
ed5bc9fb43 Fix numerous formatting and function bugs 2023-11-03 14:00:05 -04:00
94d8d2cf75 Fix skip_zap_flag anomaly and add crush rm 2023-11-03 02:35:12 -04:00
20497cf89d Fix bugs and skip safe_to_destroy on force 2023-11-03 02:29:50 -04:00
64e37ae963 Update OSD replacement functionality
1. Simplify this by leveraging the existing remove_osd/add_osd
functions, since its task was functionally identical to running those
two in sequence.
2. Add support for split OSDs within the command (replacing all OSDs on
the block device(s) as required).
3. Add additional configurability and flexibility around the old device,
weight, and external DB LVs.
2023-11-03 01:45:49 -04:00
3cb8a70f04 Add forcing to OSD purge 2023-11-02 23:20:48 -04:00
f53af510c1 Avoid startup failures if OSD removed 2023-11-02 22:24:39 -04:00
d5d783fad3 Set proper split flag 2023-11-02 22:20:22 -04:00
980ea6a9e9 Adjust handling of ext_db and _count options
Avoid the use of superfluous flag options, default them to none, and add
support for fixed-size DB LVs.
2023-11-02 13:29:47 -04:00
8780044be6 Ensure db_device is an empty string 2023-11-02 00:52:18 -04:00
f08c654f22 Fix missing f-string 2023-11-01 21:41:06 -04:00
8b93f9a80e Handle OSD index errors during stats collection 2023-11-01 21:33:40 -04:00
526a5f4a74 Add support for split OSD adds
Allows creating multiple OSDs on a single (NVMe) block device,
leveraging the "ceph-volume lvm batch" command. Replaces the previous
method of creating OSDs.

Also adds a new ZK item for each OSD indicating if it is split or not.
2023-11-01 21:31:35 -04:00
aa0b1f504f Fix output bug 2023-11-01 15:46:38 -04:00
5b4dd61754 Bump version to 0.9.80 2023-10-27 09:56:31 -04:00
221af3f241 Bump version to 0.9.79 2023-10-24 02:10:24 -04:00
0769f1ea52 Increase service start time to 10s 2023-10-23 22:24:03 -04:00
c6c44bf775 Bump version to 0.9.78 2023-09-30 12:57:55 -04:00
7c0f12750e Bump version to 0.9.77 2023-09-19 11:05:55 -04:00
51e78480fa Bump version to 0.9.76 2023-09-18 10:15:52 -04:00
f46bfc962f Bump version to 0.9.75 2023-09-16 23:06:38 -04:00
457b7bed3d Handle exceptions in fence migrations 2023-09-16 22:56:09 -04:00
86115b2928 Add startup message for IPMI reachability
It's good to know that this succeeded in addition to knowing if it
failed.
2023-09-16 22:41:58 -04:00
1a906b589e Bump version to 0.9.74 2023-09-16 00:18:13 -04:00
48662e90c1 Remove obsolete monitoring_instance passing 2023-09-15 22:47:45 -04:00
079381c03e Move printing to end and add runtime 2023-09-15 22:40:09 -04:00
794cea4a02 Reverse ordering, run checks before starting timer 2023-09-15 22:25:37 -04:00
479e156234 Run monitoring plugins once on startup 2023-09-15 17:53:16 -04:00
86830286f3 Adjust message printing to be on one line 2023-09-15 17:00:34 -04:00
4d51318a40 Make monitoring interval configurable 2023-09-15 16:54:51 -04:00
cba6f5be48 Fix wording of non-coordinator state 2023-09-15 16:51:04 -04:00
254303b9d4 Use coordinator_state instead of router_state
Makes it much clearer what this variable represents.
2023-09-15 16:47:56 -04:00
40b7d68853 Separate monitoring and move to 60s interval
Removes the monitoring subsystem's dependency on the node
keepalives, and runs the plugins at a 60s interval to avoid excessive
backups if a plugin takes too long.

Adds its own logs and related items as required.

Finally, adds a new required argument to the run() of plugins, the
coordinator state, which a plugin can use to determine actions
based on whether the node is a primary, secondary, or non-coordinator.
2023-09-15 16:47:11 -04:00
a8115cafd1 Bump version to 0.9.73 2023-09-02 02:16:19 -04:00
570da99605 Avoid failures if no children found 2023-09-02 01:36:17 -04:00
fdda47e8a2 Bump version to 0.9.72 2023-09-01 16:34:45 -04:00
bb2aac145d Bump version to 0.9.71 2023-09-01 00:36:38 -04:00
6c407d54c3 Bump version to 0.9.70 2023-08-31 14:15:54 -04:00
cb413e5ce6 [Bookworm] Fix Ceph 16 OSD stat parsing 2023-08-31 00:45:03 -04:00
123499f75f [Bookworm] Specify YAML loader explicitly 2023-08-31 00:16:19 -04:00
83b8ce7b62 Bump version to 0.9.69 (nice) 2023-08-29 22:02:13 -04:00
5e43f9bd7c Ensure Patroni failures do not block takeover 2023-08-29 22:00:11 -04:00
ed087d83c2 Round cpuload to 2 decimal places 2023-08-29 21:41:44 -04:00
83d475bd15 Bump version to 0.9.68 2023-08-27 20:59:23 -04:00
705ec802a3 Bump version to 0.9.67 2023-08-27 14:47:20 -04:00
0b90f37518 Bump version to 0.9.66 2023-08-27 11:41:22 -04:00
1e083d7652 Bump version to 0.9.65 2023-08-23 01:56:57 -04:00
075dbe7cc9 Bump version to 0.9.64 2023-08-18 12:34:27 -04:00