310 Commits

Author SHA1 Message Date
Joshua Boniface 3adacf3107 Fix excessive whitespace 2021-07-18 22:13:09 -04:00
Joshua Boniface 764c2c3928 Fix memory tuning issues 2021-07-18 18:51:21 -04:00
Joshua Boniface 10a1754285 Adjust package lists per Debian version 2021-07-18 18:36:58 -04:00
Joshua Boniface b33096202e Fix bad Ansible variable name 2021-07-18 17:49:42 -04:00
Joshua Boniface 0e046b48d4 Add Zookeeper logging configs 2021-07-18 17:47:02 -04:00
Joshua Boniface a1362c4363 Don't fail if IPMI tasks fail 2021-07-07 10:42:30 -04:00
Joshua Boniface 96544aabb8 Add GRUB, Plymouth themes and issue for PVC 2021-06-30 02:50:18 -04:00
Joshua Boniface 9d4eb89bde Fix zkcli for good 2021-06-29 18:16:02 -04:00
Joshua Boniface c0ad9740f4 Fix bootstrap collection path for Ceph 2021-06-29 17:52:21 -04:00
Joshua Boniface 3d47b12b76 Add GRUB configuration to Ansible role 2021-06-29 17:48:55 -04:00
Joshua Boniface 120871ee45 Support both versions of psycopg2 and kazoo 2021-06-29 17:29:01 -04:00
Joshua Boniface 231cb7b2aa Fix Patroni ACL to use subnet mask 2021-06-29 16:47:55 -04:00
Joshua Boniface d794197633 Fix zkcli alias to use hostname 2021-06-29 16:47:42 -04:00
Joshua Boniface 9855088a8e Use short ansible_hostname in ipmi fragment 2021-06-29 15:38:19 -04:00
Joshua Boniface 10e8947cb0 Add ipmitool to packages list 2021-06-29 15:30:54 -04:00
Joshua Boniface 53872c0056 Add generic SR-IOV configuration 2021-06-22 03:47:03 -04:00
Joshua Boniface d88ba7272d Ensure we can connect to Patroni 2021-06-22 03:28:36 -04:00
Joshua Boniface 84bf1d7efa Use IPs for Patroni configuration 2021-06-22 03:27:01 -04:00
Joshua Boniface ae45da3f85 Bump max connections in Zookeeper to 200 2021-06-22 03:15:23 -04:00
Joshua Boniface c6590f8ab9 Configure Zookeeper only on Cluster address 2021-06-22 03:15:23 -04:00
Joshua Boniface 6396eaa5ff Ensure libvirtd restarts when unit changes 2021-06-22 03:15:23 -04:00
Joshua Boniface 73bc005c0b Ensure deb-src is present for bullseye 2021-06-22 03:15:23 -04:00
Joshua Boniface ec879f4e3c Add override custom libvirtd.service unit
This has no functional change on Buster, but on Bullseye this overrides
the stupid socket-based activation shenanigans that the default unit
tries to do, as well as the breaking replacement of the
/etc/default/libvirt variable names.
2021-06-22 03:15:23 -04:00
Joshua Boniface b4e9ed5d39 Ensure DEBIAN_FRONTEND is noninteractive 2021-06-22 03:15:23 -04:00
Joshua Boniface 4ccc23bd85 Add python3 version of psycopg2 explicitly 2021-06-22 03:15:23 -04:00
Joshua Boniface 8a140f70dc Use inventory_hostname for IPMI dict 2021-06-22 03:15:23 -04:00
Joshua Boniface 836c946c72 Use independent fact to work around codename 2021-06-07 10:54:55 -04:00
Joshua Boniface 69c037c136 Ensure backup_keys isn't empty 2021-06-06 00:41:53 -04:00
Joshua Boniface 6b79e5db31 Avoid writing hosts if empty 2021-06-05 01:12:00 -04:00
Joshua Boniface 8fa8590eb8 Ensure apt-update runs if configs update 2021-06-05 01:03:35 -04:00
Joshua Boniface 9dc0949b47 Add bullseye support 2021-06-05 00:56:02 -04:00
Joshua Boniface 998e5a8752 Add directory creation to backup script 2021-06-01 10:16:08 -04:00
Joshua Boniface 0aa328e350 Add PostgreSQL to daily backup script 2021-06-01 10:10:22 -04:00
Joshua Boniface 9deee94332 Update tags and fix backup keys to var 2021-05-27 12:29:19 -04:00
Joshua Boniface e76832de91 Allow intra-cluster orphan NTP sync
Due to the requirement of Ceph to have all peer nodes tightly
synchronized with each other to come online, PVC nodes need a way to
synchronize to each other even in the absence of an external time
reference. This is especially prevalent if a set of nodes are left
offline for an extended period (>1-2 weeks), since their hardware clocks
will drift. If the resulting Internet connectivity is then dependent on
a VM, this will cause a catch-22 and the cluster will not properly
start.

This configuration will accomplish that: if no suitable peers better than
stratum 6 are found, the hosts will enter orphan mode. Since they are now
all configured as "peers" with each other, they will collectively decide
on one of them to become the source and sync to it. A local stratum 10
fudge is added so that at least one of the nodes can become this source.

While this is not an ideal use of NTP, it is by far the cleanest
solution to this problem, and does not impact normal functionality when
the two configured stratum-2 servers are reachable.
2021-05-19 11:03:18 -04:00
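The orphan-mode setup described in commit e76832de91 can be sketched as a small generator for an ntp.conf fragment. This is only an illustration under stated assumptions, not the template the role actually ships: the function name, the host names, the `iburst` option, and the exact directive layout are hypothetical, and the orphan stratum of 6 and the stratum-10 local clock fudge are inferred from the commit message.

```python
def render_ntp_orphan_conf(upstream_servers, cluster_nodes, this_node,
                           orphan_stratum=6, local_stratum=10):
    """Return an ntp.conf fragment for one node (hypothetical layout)."""
    lines = []
    # Normal operation: sync to the configured upstream servers.
    for server in upstream_servers:
        lines.append(f"server {server} iburst")
    # Enter orphan mode when no source better than this stratum is reachable.
    lines.append(f"tos orphan {orphan_stratum}")
    # Every other cluster node is configured as a symmetric peer.
    for node in cluster_nodes:
        if node != this_node:
            lines.append(f"peer {node}")
    # Local clock as a last resort, fudged to a deliberately poor stratum.
    lines.append("server 127.127.1.0")
    lines.append(f"fudge 127.127.1.0 stratum {local_stratum}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Host names below are placeholders, not taken from the repository.
    print(render_ntp_orphan_conf(
        ["ntp1.example.com", "ntp2.example.com"],
        ["node1", "node2", "node3"],
        "node2",
    ))
```

Each node lists every other node as a peer but never itself, so the same function can be called once per node with a different `this_node`.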
Joshua Boniface 238449904f Move some other tasks to bootstrap role
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2021-05-13 10:17:38 -04:00
Joshua Boniface 7536732f30 Remove GRUB config from base role
This is not actually ideal.
2021-05-12 14:55:57 -04:00
Joshua Boniface 04bc9730a0 Fix version sorting bugs in kernel-cleanup.sh 2021-05-12 14:40:18 -04:00
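The kind of version sorting bug that commit 04bc9730a0 refers to can be illustrated with a short sketch. kernel-cleanup.sh is a shell script and its actual sorting logic is not shown here, so this Python example only demonstrates the general pitfall (plain string sorting puts "10" before "9") and one common remedy, a natural sort key. The package names are made-up examples.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so numbers compare numerically."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so chunks at the same position always have the same type.
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]


if __name__ == "__main__":
    kernels = [
        "linux-image-4.19.0-9-amd64",
        "linux-image-4.19.0-16-amd64",
        "linux-image-4.19.0-10-amd64",
    ]
    print(sorted(kernels))                   # lexical order
    print(sorted(kernels, key=natural_key))  # numeric-aware order
```

With a numeric-aware key, the last element of the sorted list is the newest kernel, which is the one a cleanup script would want to keep.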
Joshua Boniface 45322e0f9e Add additional items to base role
Backups, GRUB configuration, and IPMI configuration.
2021-05-12 13:53:15 -04:00
Joshua Boniface da9eafcdfa Fix sudoers to use conditional deploy_username 2021-04-13 16:50:05 -04:00
Joshua Boniface 70ba4b240f Allow configurable fail2ban IPs 2021-04-13 16:44:49 -04:00
Joshua Boniface ce3554b530 Allow customization of deploy username 2021-04-13 11:30:42 -04:00
Joshua Boniface 3819cd87fd Move to more dynamic apt configs
Allow specifying repository URLs in the group_vars, and add
release-specific template files to support future version changes.
2021-04-08 14:14:25 -04:00
Joshua Boniface 404751f695 Update relative path to bootstrap files 2021-04-08 14:04:56 -04:00
Joshua Boniface 622cef1586 Remove superfluous symlink 2021-04-08 13:50:47 -04:00
Joshua Boniface 6589a9cd38 Add sensible sorting of kernel removals 2021-04-08 13:46:43 -04:00
Joshua Boniface 6598637e91 Remove cruft and add mkpasswd setup 2021-04-08 13:46:30 -04:00
Joshua Boniface dcd0b48d94 Correct bad indentation in base role 2021-03-18 09:36:49 -04:00
Joshua Boniface 82fa85834a Add libguestfs-tools to libvirt role deps 2021-03-15 13:39:37 -04:00
Joshua Boniface ca3a5e144f Update tags and add kernel-cleanup script 2021-02-02 15:41:38 -05:00
Joshua Boniface 1c05c8729f Fix incorrect systemd enabling in Patroni 2021-01-28 16:28:02 -05:00
Joshua Boniface f4974d648d Add some additional compression libraries 2021-01-28 13:34:58 -05:00
Joshua Boniface fa0aeec88e Add local domain to resolver config 2021-01-28 13:34:26 -05:00
Joshua Boniface 04ca8f73d2 Correct bugs during bootstrap
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2021-01-28 13:32:36 -05:00
Joshua Boniface b7f251ea16 Retry pgsql bootstrap startup 6 times
This will sometimes fail, so retry it several times.
2021-01-27 15:45:36 -05:00
Joshua Boniface 7b08610efa Retry msgr2 enabling 6 times
This will sometimes fail, so retry it several times.
2021-01-27 14:13:09 -05:00
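The retry behaviour described in commits b7f251ea16 and 7b08610efa (try a step that sometimes fails up to 6 times) can be sketched generically as follows. This is not the role's implementation; an Ansible role would more likely use the task-level `retries`, `delay`, and `until` keywords. The helper name and the 5-second delay are assumptions chosen for illustration.

```python
import time


def retry(action, attempts=6, delay=5.0):
    """Call action() until it succeeds or the attempts run out.

    Returns the result of the first successful call; re-raises the last
    exception if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            # Wait before the next attempt (the delay value is an assumption).
            time.sleep(delay)
```

A caller would wrap the flaky step, for example `retry(lambda: start_database())`, where `start_database` is a hypothetical function standing in for the real task.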
Joshua Boniface c4c285c7b3 Remove invalid timezone entries in postgres conf 2021-01-26 15:20:25 -05:00
Joshua Boniface 7585553225 Add default values 2020-12-21 00:20:45 -05:00
Joshua Boniface ac071f4bf0 Add configurable ZK memory limits 2020-12-21 00:20:45 -05:00
Joshua Boniface 98e3e39570 Remove libjemalloc package 2020-12-21 00:20:45 -05:00
Joshua Boniface 8e104113d7 Tune Zookeeper memory usage
Use Xms and Xmx=128M to reduce overall Zookeeper memory usage.
2020-12-21 00:20:45 -05:00
Joshua Boniface de04105a38 Add tuning for Ceph OSDs 2020-12-21 00:20:45 -05:00
Joshua Boniface 28c86d170f Don't use libjemalloc for Ceph daemons
This was an artifact of a much, much older Ceph configuration I ran, and
is not relevant with newer Ceph versions like those used in PVC.
Performance testing with Nautilus and Bluestore reveals a minimal
performance hit, and using `jemalloc` prevents cache autotuning from
being effective, so remove it.
2020-12-21 00:20:45 -05:00
Joshua Boniface cb96ef4e7a Use new init command location
Command was renamed in the PVC CLI to facilitate other "task" actions
like backup/restore.
2020-11-24 12:22:34 -05:00
Joshua Boniface 3c0c3e8e56 Add jute.maxbuffer to Zookeeper environment ops
Adds this option based on the findings of
https://github.com/python-zk/kazoo/issues/630, whereby restores of >1MB
in size would fail. This is considered an unsafe option, but given our
use case no actual znode should ever exceed this limit; this is purely
for the large transactions that come from a `pvc task restore` action to
an empty Zookeeper instance.
2020-11-24 12:20:25 -05:00
Joshua Boniface da8c357d38 Add PVC status MOTD script 2020-11-17 12:48:53 -05:00
Joshua Boniface 9f84609808 Set proper mode on agent plugins 2020-10-27 15:48:57 -04:00
Joshua Boniface 2d1b76ecdf Add check-mk-agent plugin installs
These are used by various Ansible tasks, even if the administrator is
not using Check_MK for monitoring.
2020-10-27 15:41:20 -04:00
Joshua Boniface 2b0398dec8 Add PCI and USB utils 2020-10-05 16:10:10 -04:00
Joshua Boniface 934f73af0f Support using existing SSL certs on system
Add the additional pvc_api_ssl_cert_path and pvc_api_ssl_key_path
group_vars options, which can be used to set the SSL details to existing
files on the filesystem if desired. If these are empty (or nonexistent),
the original pvc_api_ssl_cert and pvc_api_ssl_key raw format options
will be used as they were.

Allows the administrator to use outside methods (such as Let's Encrypt)
to obtain the certs locally on the system, avoiding changes to the
group_vars and redeployment to manage SSL keys.
2020-08-26 14:11:14 -04:00
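The fallback described in commit 934f73af0f can be sketched as a small selection function. The four variable names come from the commit message; everything else is an assumption, including the function name, the returned structure, and the choice to require both path options to be set before the existing files are used.

```python
def select_ssl_source(group_vars):
    """Decide where the API SSL certificate and key come from.

    Uses the *_path options when both are set and non-empty; otherwise
    falls back to the raw-format options.
    """
    cert_path = group_vars.get("pvc_api_ssl_cert_path") or ""
    key_path = group_vars.get("pvc_api_ssl_key_path") or ""
    if cert_path and key_path:
        # Existing files on the node, e.g. managed by an outside tool.
        return {"mode": "existing_files", "cert": cert_path, "key": key_path}
    # Original behaviour: raw certificate and key content from group_vars.
    return {
        "mode": "raw",
        "cert": group_vars.get("pvc_api_ssl_cert", ""),
        "key": group_vars.get("pvc_api_ssl_key", ""),
    }
```

Treating a missing key and an empty string the same way mirrors the "empty (or nonexistent)" wording of the commit message.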
Joshua Boniface 2edea75fbe Use generic Debian repos and PVC component 2020-08-26 12:16:39 -04:00
Joshua Boniface 2f2123b70e Rename remaining "pvc_prov" items to pvc_api 2020-08-25 13:01:48 -04:00
Joshua Boniface d79c587384 Change name of default API database
From pvcprov to pvcapi to reflect the changing use of this database.
2020-08-25 02:00:29 -04:00
Joshua Boniface 663d525bb1 Add comments to defaults 2020-08-21 09:40:51 -04:00
Joshua Boniface e32dfe6200 Add additional configuration to group_vars
Also include defaults and the new pvc_vm_shutdown_timeout option.
2020-08-20 21:39:44 -04:00
Joshua Boniface 774595cdb7 Ensure ZK prioritizes IPv4 2020-08-19 13:10:03 -04:00
Joshua Boniface c9b487f5e6 Use FQDN for Zookeeper server entries 2020-08-19 12:47:06 -04:00
Joshua Boniface a0e4f3bd30 Improve SSH configuration for nodes
Ensure hostbased auth works with configs, remove erroneous old
conditional for authtypes, remove obsolete config option.
2020-08-06 15:56:01 -04:00
Joshua Boniface 6851d42885 Use Google DNS instead of Cloudflare
For some reason Cloudflare works in fewer places than Google, so just
use Google instead.
2020-08-06 13:22:30 -04:00
Joshua Boniface 6b8232d38e Use cluster_group variable for paths
Instead of trying to automagic this group out of the Ansible hostvars,
just make it explicitly defined in the group_vars to avoid any
confusion.
2020-08-06 13:20:14 -04:00
Joshua Boniface a488f62ef8 Ignore errors in bringing up bootstrap interfaces 2020-07-27 13:08:24 -04:00
Joshua Boniface 69b0590b54 Add storage components to default pvcnoded.yaml 2020-06-06 21:15:10 -04:00
Joshua Boniface 646219737c Ensure uuid-runtime is installed 2020-05-12 11:15:01 -04:00
Joshua Boniface b0186b85c2 Use correct syntax for init command 2020-04-06 15:19:49 -04:00
Joshua Boniface af1927e384 Use consistent naming in patroni.yml 2020-04-06 14:33:13 -04:00
Joshua Boniface 417dde5b1b Remove obsolete issue-gen script on install 2020-04-06 13:55:51 -04:00
Joshua Boniface f90f8f33da Use short names in PVC configs 2020-04-06 13:54:39 -04:00
Joshua Boniface f560f55010 Use shortname for Zookeeper 2020-04-06 13:45:29 -04:00
Joshua Boniface c591b1e39f Include upstream and short names in hosts 2020-04-06 13:36:38 -04:00
Joshua Boniface e37f2af6cd Use local CLI command instead of API to init 2020-04-06 13:36:38 -04:00
Joshua Boniface b9f6284e36 Use only short names in Ceph MON config 2020-04-06 13:36:38 -04:00
Joshua Boniface fe40811f2b Fix conditional checks with inventory_hostname 2020-04-06 13:36:38 -04:00
Joshua Boniface 2afccf44fb Handle bridge creation more sensibly 2020-04-06 13:36:38 -04:00
Joshua Boniface d60eabf63d Don't restart pvcd.service on bootstrap 2020-02-20 14:34:48 -05:00
Joshua Boniface a79aef90fa Allow deb migrations to be installed 2020-02-15 23:30:11 -05:00
Joshua Boniface eaf9467b75 Add symlink for pvc files dir 2020-02-15 23:02:33 -05:00
Joshua Boniface f5cd8a94c2 Handle creation and collection on bootstrap better 2020-02-15 23:01:32 -05:00
Joshua Boniface b922d47458 Use new in-built database migrations in API 2020-02-15 22:49:48 -05:00
Joshua Boniface 67d1f6761a Use new package and file names
References parallelvirtualcluster/pvc#79
2020-02-08 19:47:47 -05:00
Joshua Boniface 94f2cd5c86 Don't mess with upstream at all during bootstrap
This caused some major breakage and is not required.
2020-01-13 15:12:54 -05:00
Joshua Boniface 129219faff Don't remove nano 2020-01-13 09:17:38 -05:00
Joshua Boniface 7d6052f9cb Modify add_cluster_ips to support new bridges 2020-01-12 19:46:27 -05:00
Joshua Boniface 00315e01c3 Enable and start vhostmd service 2020-01-07 10:45:12 -05:00
Joshua Boniface d9b3f15381 Add source_volume column to storage table 2020-01-06 23:54:48 -05:00
Joshua Boniface 03779056c7 Add new empty script entry 2020-01-06 23:54:48 -05:00
Joshua Boniface cd7cdf2719 Add bridge_device entry to config
Used to properly allow bridged networks to be formed.

Ref parallelvirtualcluster/pvc#64
2020-01-06 14:35:25 -05:00
Joshua Boniface a1efa2f01a Fix additional reference to userdata_template 2020-01-04 13:41:03 -05:00
Joshua Boniface 761715d015 Adjust provisioner database schema 2020-01-04 12:13:11 -05:00
Joshua Boniface dcd3194432 Set msgr2 mode on Ceph monitors 2019-12-30 09:13:50 -05:00
Joshua Boniface a66d17252f Apply fix with some tweaks to other serial handlers 2019-12-25 13:45:29 -05:00
Joshua Boniface af606ac49c Change ordering of networks in file 2019-12-25 13:31:02 -05:00
Joshua Boniface a30edbfa54 Replace broken "serial" restarts with a new method 2019-12-25 13:30:37 -05:00
Joshua Boniface bf4de842d8 Correct bad address in pvcd.yaml 2019-12-25 12:57:51 -05:00
Joshua Boniface c5da6381c9 Set provisioner database in pvcd.yaml 2019-12-25 12:37:32 -05:00
Joshua Boniface 1dda60d301 Add and remove floating IP during cluster bootstrap 2019-12-25 12:12:53 -05:00
Joshua Boniface ee948cb91c Move netmask to separate config part 3 2019-12-24 14:27:31 -05:00
Joshua Boniface 79dd0cd4bc Ensure the Patroni ZK is clean for bootstrap 2019-12-24 14:17:41 -05:00
Joshua Boniface 06467b64ea Move netmask to separate config part 2 2019-12-24 14:16:20 -05:00
Joshua Boniface 22c6c13f0f Don't try to do crazy restart ordering, it fails 2019-12-24 14:15:52 -05:00
Joshua Boniface 73617fa1a6 Change Patroni scope to just pvc 2019-12-24 14:15:39 -05:00
Joshua Boniface a10fc7eb3f Move netmask to separate config 2019-12-24 14:15:14 -05:00
Joshua Boniface 1570ccd370 Set timezone to be a variable 2019-12-24 09:09:11 -05:00
Joshua Boniface ebee10747c Use API endpoint to bootstrap PVC cluster 2019-12-24 09:08:21 -05:00
Joshua Boniface 93f44dd9dc Add additional API configuration 2019-12-23 23:25:27 -05:00
Joshua Boniface a37f511241 Install Provisioner schema to database 2019-12-23 23:19:09 -05:00
Joshua Boniface e08e19ee64 Add provisioner database schema 2019-12-23 12:58:03 -05:00
Joshua Boniface 4d2ef3b622 Remove empty newline 2019-12-23 12:57:46 -05:00
Joshua Boniface 64157e8c89 Remove invalid flag to ceph-authtool 2019-12-14 14:10:35 -05:00
Joshua Boniface 6bfc83e8f3 Don't become for uuidgen 2019-12-14 13:51:43 -05:00
Joshua Boniface 1c2f972e93 Move Ceph access to storage network 2019-12-14 13:14:21 -05:00
Joshua Boniface 80fdc88235 Improve script to run ZK cleanup on all hosts 2019-12-01 20:29:47 -05:00
Joshua Boniface d78d682fe5 Add jq dependency 2019-12-01 20:26:08 -05:00
Joshua Boniface 6a29400525 Make vacuum script more comprehensive 2019-12-01 20:24:18 -05:00
Joshua Boniface 273c048e6a Add check_mk check for PVC status 2019-10-24 09:46:10 -04:00
Joshua Boniface 0336cd998f Improve daily vacuum script 2019-10-24 09:43:24 -04:00
Joshua Boniface 7f5a7e48f8 Add daily Zookeeper data cleanup 2019-08-26 11:09:23 -04:00
Joshua Boniface ef9673de02 Add custom systemd unit for Zookeeper
We're 100% systemd here, and the lack of control/information that the
old-school ZK initscript provides is frustrating. Replace it with our
own simple unit file.
2019-08-26 11:06:30 -04:00
Joshua Boniface 9b2e12e69b Add support for arbitrary /etc/hosts entries 2019-08-26 11:06:30 -04:00
Joshua Boniface b75e84a124 Add logrotate configuration 2019-08-11 15:41:10 -04:00
Joshua Boniface 030a3ded99 Add daily Postgres vacuum script 2019-08-11 15:29:00 -04:00
Joshua Boniface 91509720ac Add Zookeeper autopurge @72h 2019-08-05 13:16:09 -04:00
Joshua Boniface 005ba71fc8 Update config template with recent changes
1) Add debug flag
2) Move intervals config up one level
2019-08-01 13:21:12 -04:00
Joshua Boniface ada3cb1d87 Set debug value in API config 2019-07-26 11:44:08 -04:00
Joshua Boniface dc27564157 Limit database tasks to coordinators only
Non-coordinators don't need these configurations as they shouldn't run
there.
2019-07-11 19:58:56 -04:00
Joshua Boniface 2b54feb4bf Always perform the apt-update 2019-07-10 22:56:58 -04:00
Joshua Boniface db2c77d330 Support new log flags and update default log conf
Tweak the defaults a bit; pvc-ansible assumes we're running under
systemd, so set a log format that's better for it (no colour or date).
2019-07-10 21:49:38 -04:00
Joshua Boniface 4217a92750 Allow sysrq triggers in nodes 2019-07-09 14:13:44 -04:00
Joshua Boniface 0d562b829c Replace tabs with spaces 2019-07-08 19:24:59 -04:00
Joshua Boniface 6319241df9 Remove bad content from pvc-api.yml 2019-07-08 19:03:08 -04:00
Joshua Boniface e98649c417 Add quote around ZK nodes in Patroni 2019-07-08 16:59:12 -04:00