8fa8590eb8
Ensure apt-update runs if configs update
2021-06-05 01:03:35 -04:00
9dc0949b47
Add bullseye support
2021-06-05 00:56:02 -04:00
998e5a8752
Add directory creation to backup script
2021-06-01 10:16:08 -04:00
0aa328e350
Add PostgreSQL to daily backup script
2021-06-01 10:10:22 -04:00
9deee94332
Update tags and fix backup keys to var
2021-05-27 12:29:19 -04:00
cae8cfc4cb
Add norestart policy for apt updates
2021-05-27 01:38:43 -04:00
491ea77306
Add README and daemon upgrade playbook, cleanups
2021-05-20 11:02:47 -04:00
e76832de91
Allow inter-cluster orphan NTP sync
...
Ceph requires all peer nodes to be tightly synchronized with each other
before it will come online, so PVC nodes need a way to synchronize to
each other even in the absence of an external time reference. This is
especially relevant if a set of nodes is left offline for an extended
period (>1-2 weeks), since their hardware clocks will drift. If
Internet connectivity then depends on a VM hosted on that cluster, this
creates a catch-22 and the cluster will not start properly.
This configuration accomplishes that: if no suitable >6 stratum peers
are found, the hosts enter orphan mode. Since they are now all
configured as "peers" of each other, they collectively select one of
themselves as the source and sync to it. A local stratum 10 fudge is
added so that at least one of the nodes can become this source.
While this is not an ideal use of NTP, it is by far the cleanest
solution to this problem, and it does not impact normal functionality
when the two configured stratum-2 servers are reachable.
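A minimal sketch of what such a per-node configuration might look like, generated by a small shell function. The node names, the upstream server names, and the orphan stratum value of 6 are assumptions for illustration only; the actual template in the role may differ.

```shell
#!/bin/sh
# Sketch only: prints an ntp.conf fragment for one node of a cluster.
# Hostnames and the orphan stratum value are illustrative assumptions,
# not values taken from the actual templates.
gen_ntp_conf() {
    self="$1"; shift
    # Upstream servers used during normal operation (placeholder names).
    echo "server upstream1.example.org iburst"
    echo "server upstream2.example.org iburst"
    # Enable orphan mode; the stratum value 6 is assumed.
    echo "tos orphan 6"
    # Peer with every other node in the cluster, skipping ourselves.
    for node in "$@"; do
        [ "$node" = "$self" ] && continue
        echo "peer $node"
    done
    # Local clock, fudged to stratum 10, as a last-resort source.
    echo "server 127.127.1.0"
    echo "fudge 127.127.1.0 stratum 10"
}

# Print the fragment for node1 in a hypothetical three-node cluster.
gen_ntp_conf node1 node1 node2 node3
```

Generating the peer list from the full node list keeps every node's file identical apart from the one omitted self-entry.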
2021-05-19 11:03:18 -04:00
238449904f
Move some other tasks to bootstrap role
...
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2021-05-13 10:17:38 -04:00
7536732f30
Remove GRUB config from base role
...
This is not actually ideal.
2021-05-12 14:55:57 -04:00
04bc9730a0
Fix version sorting bugs in kernel-cleanup.sh
2021-05-12 14:40:18 -04:00
45322e0f9e
Add additional items to base role
...
Backups, GRUB configuration, and IPMI configuration.
2021-05-12 13:53:15 -04:00
da9eafcdfa
Fix sudoers to use conditional deploy_username
2021-04-13 16:50:05 -04:00
70ba4b240f
Allow configurable fail2ban IPs
2021-04-13 16:44:49 -04:00
ce3554b530
Allow customization of deploy username
2021-04-13 11:30:42 -04:00
593a81e07c
Fix group_vars to match new setup
2021-04-08 14:15:11 -04:00
3819cd87fd
Move to more dynamic apt configs
...
Allow specifying repository URLs in the group_vars, and add
release-specific template files to support future version changes.
2021-04-08 14:14:25 -04:00
3e1d3a90b0
Update root password in default group_vars
2021-04-08 14:08:21 -04:00
404751f695
Update relative path to bootstrap files
2021-04-08 14:04:56 -04:00
622cef1586
Remove superfluous symlink
2021-04-08 13:50:47 -04:00
6589a9cd38
Add sensible sorting of kernel removals
2021-04-08 13:46:43 -04:00
6598637e91
Remove cruft and add mkpasswd setup
2021-04-08 13:46:30 -04:00
25674731cd
Update file copyright header
2021-03-25 16:58:58 -04:00
dcd0b48d94
Correct bad indentation in base role
2021-03-18 09:36:49 -04:00
82fa85834a
Add libguestfs-tools to libvirt role deps
2021-03-15 13:39:37 -04:00
510db0df58
Add cleanup to update oneshot playbook
2021-02-02 15:41:38 -05:00
ca3a5e144f
Update tags and add kernel-cleanup script
2021-02-02 15:41:38 -05:00
1c05c8729f
Fix incorrect systemd enabling in Patroni
2021-01-28 16:28:02 -05:00
4b179b66ed
Add reboot to purge
2021-01-28 14:13:15 -05:00
71edb9db15
Remove log dirs during purge
2021-01-28 14:12:40 -05:00
f4974d648d
Add some additional compression libraries
2021-01-28 13:34:58 -05:00
fa0aeec88e
Add local domain to resolver config
2021-01-28 13:34:26 -05:00
04ca8f73d2
Correct bugs during bootstrap
...
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2021-01-28 13:32:36 -05:00
21e3e0e172
Add removal of Zookeeper keys too
2021-01-28 13:26:46 -05:00
20d802f0b0
Update purge script
2021-01-27 17:08:38 -05:00
b7f251ea16
Retry pgsql bootstrap startup 6 times
...
This will sometimes fail, so retry it several times.
2021-01-27 15:45:36 -05:00
7b08610efa
Retry msgr2 enabling 6 times
...
This will sometimes fail, so retry it several times.
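The retry behaviour can be illustrated with a small shell loop. This is a sketch only: the playbook presumably relies on Ansible's own task retry mechanism rather than a shell function, and the `retry` helper and its delay argument are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only: run a command up to N times, pausing between attempts.
# retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts="$1"; delay="$2"; shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Do not sleep after the final failed attempt.
        [ "$n" -lt "$attempts" ] && sleep "$delay"
        n=$((n + 1))
    done
    # Every attempt failed.
    return 1
}

# Example: `true` succeeds on the first attempt, so no delay occurs.
retry 6 5 true
```

A command that fails intermittently would be passed in place of `true`; the function returns non-zero only if all six attempts fail.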
2021-01-27 14:13:09 -05:00
c4c285c7b3
Remove invalid timezone entries in postgres conf
2021-01-26 15:20:25 -05:00
97869ca5c3
Reorder Ceph stop and lower some waits
2021-01-07 11:11:16 -05:00
d35250b870
Add tasks to verify node has finished (un)flushing
2021-01-07 10:49:23 -05:00
cd164d1984
Increase all wait timeouts to 30s
...
Ensure that even on slow(er) clusters, the operations being waited on
have more time to complete before the task proceeds, so it won't fail.
2021-01-05 16:17:19 -05:00
7585553225
Add default values
2020-12-21 00:20:45 -05:00
ac071f4bf0
Add configurable ZK memory limits
2020-12-21 00:20:45 -05:00
98e3e39570
Remove libjemalloc package
2020-12-21 00:20:45 -05:00
8e104113d7
Tune Zookeeper memory usage
...
Use Xms and Xmx=128M to reduce overall Zookeeper memory usage.
2020-12-21 00:20:45 -05:00
de04105a38
Add tuning for Ceph OSDs
2020-12-21 00:20:45 -05:00
28c86d170f
Don't use libjemalloc for Ceph daemons
...
This was an artifact of a much, much older Ceph configuration I ran, and
is not relevant with newer Ceph versions like those used in PVC.
Performance testing with Nautilus and BlueStore reveals only a minimal
performance hit from removing it, and using `jemalloc` prevents cache
autotuning from being effective, so remove it.
2020-12-21 00:20:45 -05:00
f277acc974
Disable pvc-flush service while rebooting
...
Prevents the flush daemon from starting on node boot, before the
playbook is actually ready to unflush the node.
2020-12-15 14:32:50 -05:00
8b474760ed
Tweak oneshot script
...
Cleanly stop daemons; check if OSDs are back before continuing; wait less.
2020-11-26 10:51:54 -05:00
cb96ef4e7a
Use new init command location
...
Command was renamed in the PVC CLI to facilitate other "task" actions
like backup/restore.
2020-11-24 12:22:34 -05:00