988de1218f
Bump version to 0.9.83
2023-12-01 17:37:42 -05:00
1fb0463dea
Adjust daemon service startup
...
Add healthd, adjust workerd, lower waittime
2023-11-30 03:28:02 -05:00
03a738f878
Move config parser into daemon_lib
...
And reformat/add config values for API.
2023-11-30 00:05:37 -05:00
4a2eba0961
Improve node output messages (from pvchealthd)
...
1. Output startup "list" entries in cyan with s state
2. Add start of keepalive run message
2023-11-29 21:21:51 -05:00
647cba3cf5
Expand startup width for new daemon name
2023-11-29 21:21:51 -05:00
41f4e4fb2f
Split health monitoring into discrete daemon/pkg
2023-11-29 21:21:51 -05:00
83ceb41138
Add daemon name to Logger entries
2023-11-29 15:18:37 -05:00
2545a7b744
Allow similar for IPMI hostnames
2023-11-28 16:09:01 -05:00
ce907ff26a
Allow specifying static IPs instead of a file
2023-11-28 15:28:31 -05:00
71e589e461
Remove superfluous debug output
...
This is printed in the startup logo block anyways.
2023-11-27 13:46:30 -05:00
fc3d292081
Add missing subdirectory configs
2023-11-27 13:40:07 -05:00
eab1ae873b
Ensure upstream_gateway key will exist
2023-11-27 13:37:57 -05:00
eaf93cdf96
Readd missing subsystem configurations
2023-11-27 13:33:41 -05:00
c8f4cbb39e
Fix node entry keys
2023-11-27 13:24:01 -05:00
786fae7769
Improve logo output
2023-11-27 13:01:43 -05:00
bcc57638a9
Refactor pvcnoded to use new configuration
2023-11-26 15:41:25 -05:00
2666e0603e
Update dnsmasq script to use new config file
2023-11-26 14:18:13 -05:00
460a2dd09f
Bump version to 0.9.82
2023-11-25 15:38:50 -05:00
3e001b08b6
Bump version to 0.9.81
2023-11-17 01:29:41 -05:00
e818df5dae
Use enable/disable --now instead of two commands
...
Avoids needing two calls here especially for the stop.
2023-11-16 02:40:35 -05:00
c76a5afd04
Avoid waits during node secondary
...
Waiting for the daemons to stop took too much time on some nodes and
could throw off the lockstep. Instead, leverage background=True to run
the systemctl os_commands in the background (when they complete is
irrelevant), stop the Metadata API first, and don't delay during its
stop at all.
2023-11-16 02:34:12 -05:00
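The background-execution approach described in the "Avoid waits during node secondary" entry can be illustrated with a minimal sketch. This is not the project's code: `run_os_command` and its `background` parameter are hypothetical stand-ins modelled on the `background=True` wording of the commit message, and the unit names in the usage note are only examples.

```python
import shlex
import subprocess


def run_os_command(command, background=False, timeout=None):
    """Hypothetical helper: run a command, optionally without waiting for it.

    Returns (returncode, stdout, stderr). In background mode the command is
    only launched, so the returned values say nothing about its outcome.
    """
    argv = shlex.split(command)
    if background:
        # Fire and forget: the caller continues immediately, so a slow
        # service stop cannot delay the next step of a lockstep sequence.
        subprocess.Popen(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return 0, "", ""
    # Foreground mode: block until the command exits and capture its output.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout, result.stderr
```

With a helper of this shape, a call such as `run_os_command("systemctl stop some-daemon.service", background=True)` returns at once. The trade-off is that failures of the backgrounded command go unnoticed, which is acceptable only where, as the commit message puts it, when the commands complete is irrelevant.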
18e43a9377
Adjust name in worker log output
2023-11-16 02:25:14 -05:00
aef38639cf
Rename pvcapid-worker to pvcworkerd
2023-11-15 20:31:39 -05:00
5f1432ccdd
Fix memory allocation updates and add more debug
...
Previously, we were assigning memalloc/memprov/vcpualloc during an
earlier phase using the main d_domain list. I'm not sure exactly why,
but this was throwing off stats after a fence. Instead, set these values
later on while parsing the actually-active VMs.
2023-11-10 10:29:32 -05:00
d6b8808448
Clean up fencing handler
...
1. Remove all format strings in favour of f-strings
2. Ensure all logger messages have a prefix
3. Add a few more logger messages for clarity
2023-11-10 10:09:54 -05:00
83c4c6633d
Readd RBD lock detection and clearing on startup
...
This is still needed due to the nature of the locks and freeing them on
startup, and to preserve lock=fail behaviour on VM startup.
Also fixes the fencing lock flush to directly use the client library
outside of Celery. I don't like this hack but it seems prudent until we
move fencing to the workers as well.
2023-11-10 01:33:48 -05:00
08411708f6
Clean up dangling references to cmd pipes
...
Also removes the schema references for these CMD pipes as they are no
longer required.
2023-11-09 23:28:14 -05:00
ce17c60a20
Port OSD on-node tasks to Celery worker system
...
Adds Celery versions of the osd_add, osd_replace, osd_refresh,
osd_remove, and osd_db_vg_add functions.
2023-11-09 23:28:08 -05:00
89681d54b9
Port VM on-node tasks to Celery worker system
...
Adds Celery versions of the flush_locks, device_attach, and
device_detach functions.
2023-11-06 20:40:46 -05:00
f0c2e9d295
Don't start pvcapid-worker on primary
...
It will be running anyways
2023-11-05 19:44:00 -05:00
2c15036f86
Add KeyDB to node startup services
...
Also ensure API worker starts on all nodes, not just coordinators.
2023-11-05 19:26:38 -05:00
30d7e49401
Start API worker with node daemon on coordinators
2023-11-04 13:08:16 -04:00
7490f13b7c
Check for partition tables on new devices
2023-11-04 03:13:58 -04:00
e32054be81
Refactor refresh as well
2023-11-04 02:44:52 -04:00
b3d13fe9be
Add log message for zap
2023-11-04 01:02:51 -04:00
48b2ccbd95
Add timeout for safe-to-destroy
...
Continuously take the OSD down and out while doing so.
2023-11-04 00:55:05 -04:00
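The behaviour described in the "Add timeout for safe-to-destroy" entry amounts to a bounded polling loop. The sketch below shows one way to structure it. It is an assumption-laden illustration, not the project's implementation: `wait_safe_to_destroy`, `is_safe`, and `mark_down_and_out` are hypothetical names, and the two callables stand in for whatever actually queries Ceph and marks the OSD down and out.

```python
import time


def wait_safe_to_destroy(is_safe, mark_down_and_out, timeout=300.0, interval=1.0):
    """Poll until is_safe() returns True or the timeout expires.

    Both callables are hypothetical placeholders supplied by the caller.
    mark_down_and_out() is invoked on every iteration, so the OSD is
    repeatedly pushed down and out while waiting.
    Returns True if the OSD became safe to destroy, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        mark_down_and_out()
        if is_safe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Returning a boolean rather than raising leaves the caller free to decide whether a timeout should abort the operation or proceed anyway, for example when a force option is set.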
1535078842
Fix lvremove, lvcreate, and update ZK details
2023-11-04 00:30:14 -04:00
0e45613634
Use right key with correct data
2023-11-04 00:02:00 -04:00
7f5dd385b5
Use right key for FSID elsewhere
2023-11-03 23:51:01 -04:00
befce62925
Add OSD destroy before purge
2023-11-03 23:44:27 -04:00
b0909aed61
Get proper FSID value
2023-11-03 23:38:24 -04:00
f418b40527
Use proper FSID instead of hack
2023-11-03 16:38:19 -04:00
dd0177ce10
Rework replacement procedure again
...
Avoid calling other functions; replicate the actual process from Ceph
docs (https://docs.ceph.com/en/pacific/rados/operations/add-or-rm-osds/ )
to ensure things work out well (e.g. preserving OSD IDs).
2023-11-03 16:31:56 -04:00
ed5bc9fb43
Fix numerous formatting and function bugs
2023-11-03 14:00:05 -04:00
94d8d2cf75
Fix skip_zap_flag anomaly and add crush rm
2023-11-03 02:35:12 -04:00
20497cf89d
Fix bugs and skip safe_to_destroy on force
2023-11-03 02:29:50 -04:00
64e37ae963
Update OSD replacement functionality
...
1. Simplify this by leveraging the existing remove_osd/add_osd
functions, since its task was functionally identical to those two in
sequential order.
2. Add support for split OSDs within the command (replacing all OSDs on
the block device(s) as required).
3. Add additional configurability and flexibility around the old device,
weight, and external DB LVs.
2023-11-03 01:45:49 -04:00
3cb8a70f04
Add forcing to OSD purge
2023-11-02 23:20:48 -04:00
f53af510c1
Avoid startup failures if OSD removed
2023-11-02 22:24:39 -04:00
d5d783fad3
Set proper split flag
2023-11-02 22:20:22 -04:00