Joshua Boniface
f34b2a5f7e
Add cleanup to update oneshot playbook
2023-09-01 15:42:25 -04:00
Joshua Boniface
1f6cb077fa
Update tags and add kernel-cleanup script
2023-09-01 15:42:25 -04:00
Joshua Boniface
0bf9c6209c
Fix incorrect systemd enabling in Patroni
2023-09-01 15:42:25 -04:00
Joshua Boniface
8d41650619
Add reboot to purge
2023-09-01 15:42:25 -04:00
Joshua Boniface
c634823ba5
Remove log dirs during purge
2023-09-01 15:42:25 -04:00
Joshua Boniface
c0dc6fad4e
Add some additional compression libraries
2023-09-01 15:42:25 -04:00
Joshua Boniface
a4be011884
Add local domain to resolver config
2023-09-01 15:42:25 -04:00
Joshua Boniface
4f5dbee8ee
Correct bugs during bootstrap
...
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2023-09-01 15:42:25 -04:00
Joshua Boniface
05a2c1949d
Add removal of Zookeeper keys too
2023-09-01 15:42:25 -04:00
Joshua Boniface
a4f1d6eedc
Update purge script
2023-09-01 15:42:25 -04:00
Joshua Boniface
26dbd082ef
Retry pgsql bootstrap startup 6 times
...
This will sometimes fail, so retry it several times.
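
The retry behaviour described here can be sketched as follows. This is only an illustration, not the playbook's actual task: the `retry` helper, its `delay` parameter, and the `flaky_start` stand-in are hypothetical names. Only the count of 6 comes from the commit subject.

```python
import time


def retry(action, attempts=6, delay=0.0):
    """Call `action` until it succeeds, up to `attempts` times.

    Hypothetical helper: `attempts=6` mirrors the commit subject;
    `delay` (seconds between tries) is an assumption.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # any failure counts as "try again"
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


# Stand-in for a startup step that fails a few times before succeeding.
calls = {"count": 0}


def flaky_start():
    calls["count"] += 1
    if calls["count"] < 4:
        raise RuntimeError("not ready yet")
    return "started"


result = retry(flaky_start)
```

With the stand-in above, the step fails three times and succeeds on the fourth call, well inside the six-attempt budget.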
2023-09-01 15:42:25 -04:00
Joshua Boniface
e9f08ad100
Retry msgr2 enabling 6 times
...
This will sometimes fail, so retry it several times.
2023-09-01 15:42:25 -04:00
Joshua Boniface
a77e41bf7c
Remove invalid timezone entries in postgres conf
2023-09-01 15:42:25 -04:00
Joshua Boniface
0ddd11844e
Reorder Ceph stop and lower some waits
2023-09-01 15:42:25 -04:00
Joshua Boniface
cf609eb609
Add tasks to verify node has finished (un)flushing
2023-09-01 15:42:25 -04:00
Joshua Boniface
c29cdd5305
Increase all wait timeouts to 30s
...
Ensure that even on slow(er) clusters, these waits have enough time to
complete before proceeding, so the task won't fail.
2023-09-01 15:42:24 -04:00
Joshua Boniface
cba276e248
Add default values
2023-09-01 15:42:24 -04:00
Joshua Boniface
be94bc134f
Add configurable ZK memory limits
2023-09-01 15:42:24 -04:00
Joshua Boniface
6e74ac44a5
Remove libjemalloc package
2023-09-01 15:42:24 -04:00
Joshua Boniface
2bd5cc5a25
Tune Zookeeper memory usage
...
Use Xms and Xmx=128M to reduce overall Zookeeper memory usage.
2023-09-01 15:42:24 -04:00
Joshua Boniface
b4e36d146a
Add tuning for Ceph OSDs
2023-09-01 15:42:24 -04:00
Joshua Boniface
24764fe704
Don't use libjemalloc for Ceph daemons
...
This was an artifact of a much, much older Ceph configuration I ran, and
is not relevant with newer Ceph versions like those used in PVC.
Performance testing with Nautilus and Bluestore reveals a minimal
performance hit, and using `jemalloc` prevents cache autotuning from
being effective, so remove it.
2023-09-01 15:42:24 -04:00
Joshua Boniface
750cb4b55c
Disable pvc-flush service while rebooting
...
Prevents the flush daemon from starting on node boot, before the
playbook is actually ready to unflush the node.
2023-09-01 15:42:24 -04:00
Joshua Boniface
cdc7e3377b
Tweak oneshot script
...
Cleanly stop daemons; check if OSDs are back before continuing; wait
less
2023-09-01 15:42:24 -04:00
Joshua Boniface
458e7b4872
Use new init command location
...
Command was renamed in the PVC CLI to facilitate other "task" actions
like backup/restore.
2023-09-01 15:42:24 -04:00
Joshua Boniface
bcb5962353
Add jute.maxbuffer to Zookeeper environment ops
...
Adds this option based on the findings of
https://github.com/python-zk/kazoo/issues/630 , whereby restores of >1MB
in size would fail. This is considered an unsafe option, but given our
use case no actual znode should ever exceed this limit; this is purely
for the large transactions that come from a `pvc task restore` action to
an empty Zookeeper instance.
2023-09-01 15:42:24 -04:00
Joshua Boniface
075ce8ea22
Add PVC status MOTD script
2023-09-01 15:42:24 -04:00
Joshua Boniface
68a475ccf9
Set proper mode on agent plugins
2023-09-01 15:42:24 -04:00
Joshua Boniface
9962ceaf0a
Add cluster safe update playbook
...
This playbook will perform a oneshot upgrade of the systems in the
cluster, including performing a clean and safe reboot of the node(s) if
required (either due to services needing a restart, or the kernel
changing). It runs in serial=1 and only reboots if needed.
2023-09-01 15:42:24 -04:00
Joshua Boniface
f86ec62416
Add check-mk-agent plugin installs
...
These are used by various Ansible tasks, even if the administrator is
not using Check_MK for monitoring.
2023-09-01 15:42:24 -04:00
Joshua Boniface
62d53b0c9c
Add PCI and USB utils
2023-09-01 15:42:24 -04:00
Joshua Boniface
f79fb605de
Support using existing SSL certs on system
...
Add the additional pvc_api_ssl_cert_path and pvc_api_ssl_key_path
group_vars options, which can be used to set the SSL details to existing
files on the filesystem if desired. If these are empty (or nonexistent),
the original pvc_api_ssl_cert and pvc_api_ssl_key raw format options
will be used as they were.
Allows the administrator to use outside methods (such as Let's Encrypt)
to obtain the certs locally on the system, avoiding changes to the
group_vars and redeployment to manage SSL keys.
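
A minimal sketch of the fallback described above, written in Python purely for illustration (the real logic lives in the Ansible role and may differ). The function name `select_ssl_source` and the returned dictionary shape are hypothetical. The four variable names come from this commit message. It also assumes that both path options must be set together.

```python
def select_ssl_source(group_vars):
    """Decide where the API's SSL cert and key should come from.

    Hypothetical helper. Assumption: the path options are only used
    when BOTH are set and non-empty; otherwise the raw options apply.
    """
    cert_path = group_vars.get("pvc_api_ssl_cert_path")
    key_path = group_vars.get("pvc_api_ssl_key_path")

    if cert_path and key_path:
        # Existing files on the system, e.g. obtained by an outside tool.
        return {"mode": "existing_files", "cert": cert_path, "key": key_path}

    # Path options empty or not defined: fall back to the raw values.
    return {
        "mode": "raw",
        "cert": group_vars.get("pvc_api_ssl_cert"),
        "key": group_vars.get("pvc_api_ssl_key"),
    }


# Example paths are made up for illustration.
with_paths = select_ssl_source({
    "pvc_api_ssl_cert_path": "/etc/ssl/example/fullchain.pem",
    "pvc_api_ssl_key_path": "/etc/ssl/example/privkey.pem",
    "pvc_api_ssl_cert": "RAW CERT",
    "pvc_api_ssl_key": "RAW KEY",
})

without_paths = select_ssl_source({
    "pvc_api_ssl_cert_path": "",
    "pvc_api_ssl_cert": "RAW CERT",
    "pvc_api_ssl_key": "RAW KEY",
})
```

In the first call the existing files win; in the second, the empty and missing path options cause the raw values to be used as before.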
2023-09-01 15:42:24 -04:00
Joshua Boniface
a8419be587
Use generic Debian repos and PVC component
2023-09-01 15:42:24 -04:00
Joshua Boniface
2caed2ae12
Rename remaining "pvc_prov" items to pvc_api
2023-09-01 15:42:24 -04:00
Joshua Boniface
2a2d318dbc
Change name of default API database
...
From pvcprov to pvcapi to reflect the changing use of this database.
2023-09-01 15:42:24 -04:00
Joshua Boniface
833d99a360
Add comments to defaults
2023-09-01 15:42:24 -04:00
Joshua Boniface
503d99d0fe
Add more detailed comments
2023-09-01 15:42:24 -04:00
Joshua Boniface
8109f13386
Add additional configuration to group_vars
...
Also include defaults and the new pvc_vm_shutdown_timeout option.
2023-09-01 15:42:24 -04:00
Joshua Boniface
98c3586511
Add nice warning to purge script
2023-09-01 15:42:24 -04:00
Joshua Boniface
72df058684
Ensure ZK prioritizes IPv4
2023-09-01 15:42:24 -04:00
Joshua Boniface
457e18a850
Use FQDN for Zookeeper server entries
2023-09-01 15:42:24 -04:00
Joshua Boniface
777a4693a1
Improve SSH configuration for nodes
...
Ensure hostbased auth works with configs, remove erroneous old
conditional for authtypes, remove obsolete config option.
2023-09-01 15:42:24 -04:00
Joshua Boniface
88209a2b70
Use Google DNS instead of Cloudflare
...
For some reason Cloudflare works in fewer places than Google, so just
use Google instead.
2023-09-01 15:42:24 -04:00
Joshua Boniface
fbbf5ffe09
Use cluster_group variable for paths
...
Instead of trying to automagic this group out of the Ansible hostvars,
just make it explicitly defined in the group_vars to avoid any
confusion.
2023-09-01 15:42:23 -04:00
Joshua Boniface
a925e4bd40
Ignore errors in bringing up bootstrap interfaces
2023-09-01 15:42:23 -04:00
Joshua Boniface
e3ad750412
Add storage components to default pvcnoded.yaml
2023-09-01 15:42:23 -04:00
Joshua Boniface
715fa103cd
Ensure uuid-runtime is installed
2023-09-01 15:42:23 -04:00
Joshua Boniface
7dc6efdf9a
Add update to purge command
2023-09-01 15:42:23 -04:00
Joshua Boniface
12d50cfca6
Use correct syntax for init command
2023-09-01 15:42:23 -04:00
Joshua Boniface
92ccc0a737
Use consistent naming in patroni.yml
2023-09-01 15:42:23 -04:00