Commit Graph

248 Commits

Author SHA1 Message Date
Joshua Boniface e92a57606d Use better forceful arping command
Send ARP responses with the source IP in it to force update even if the
old primary did not cleanly terminate (during fencing for instance).
2019-08-07 11:29:38 -04:00
Joshua Boniface ef3b6b3723 Arping 3 times instead of 2
During a fence, 2 is not always enough for the network to recognize the
change in primary coordinator.
2019-08-07 11:15:36 -04:00
Joshua Boniface 3b27a88128 Allow abort of shutdown state
Adds some logic to allow an active shutdown state to be aborted by
changing the VM to another state. Useful mostly if a VM is doing funky
things and not responding to the shutdown, but the administrator either
doesn't want to wait for the timer to expire (forcing an immediate
termination) or wishes to abort the shutdown attempt.

Fixes #49
2019-08-07 10:58:18 -04:00
Joshua Boniface e2ae58b62c Add the missing newline to the string compare 2019-08-04 17:00:33 -04:00
Joshua Boniface d0d5ab4425 Fix bug if the switchover target is the same 2019-08-04 16:51:11 -04:00
Joshua Boniface a329376d33 Lock primary_node key during primary switchover
Also implements a loop to switch over the Patroni leader to ensure
this always follows the primary, and cleans up the code around here a bit.
2019-08-04 16:42:06 -04:00
Joshua Boniface 710d2cf9c2 Fix record duplication bug and general cleanup
Fixes #47
2019-08-01 13:11:45 -04:00
Joshua Boniface 8bdec03cf1 Properly support debug logging via config 2019-08-01 11:22:27 -04:00
Joshua Boniface c6e58796ba Clean up redundant return section 2019-07-31 23:57:31 -04:00
Joshua Boniface 7380f45b1b Improve dnsmasq interface handling
listen-address is enough; adding interface too causes weird issues where
dnsmasq is listening on an IPv6 global wildcard too which conflicts with
the PowerDNS instance.
2019-07-31 10:03:56 -04:00
Joshua Boniface 324990739e Make DNS aggregator listen on port 53
Using the non-standard port was a pain. Now that all the DNSMasq stuff
works, move back to the default port.
2019-07-30 09:20:01 -04:00
Joshua Boniface 717d00cfcf Implement snapshot rename in node daemon
[4/2] Implements #44
2019-07-28 23:06:12 -04:00
Joshua Boniface 83b806d0b5 Move intervals config one level up
Makes for a slightly-better-organized configuration and explanation.
2019-07-28 19:33:23 -04:00
Joshua Boniface 68ca493b3b Fix bad error code 2019-07-26 20:53:01 -04:00
Joshua Boniface 837666a15e Revamp renamekey function
The function had numerous bugs and didn't work. Fix them up.
2019-07-26 16:38:05 -04:00
Joshua Boniface 35363671a0 Implement Ceph volume resize and rename
Includes a simple implementation of a zookeeper "rename" facility,
allowing a key and all data to be replaced by a new key with a different
name but containing all the same child elements and data.

[2/2] Implements #44
2019-07-26 15:13:21 -04:00
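The "rename" facility described in commit 35363671a0 (copy a key and all of its children under a new name, then drop the old key) can be sketched roughly as follows. This is a minimal illustration only: it uses a plain dict mapping paths to data as a stand-in for Zookeeper, and `rename_key` and the example paths are hypothetical names, not the daemon's actual code.

```python
def rename_key(store, old_key, new_key):
    """Move old_key and every child path beneath it to new_key.

    `store` is a dict of path -> data, an in-memory stand-in for a
    Zookeeper tree (assumption for illustration only).
    """
    prefix = old_key + "/"
    moved = {}
    # Iterate over a snapshot so entries can be popped while looping.
    for path in list(store):
        if path == old_key or path.startswith(prefix):
            # Keep the suffix (child elements) and swap only the key name.
            moved[new_key + path[len(old_key):]] = store.pop(path)
    if not moved:
        raise KeyError(old_key)
    store.update(moved)


volumes = {
    "/volumes/pool/vol1": "data",
    "/volumes/pool/vol1/stats": "{}",
    "/volumes/pool/vol10": "other",
}
rename_key(volumes, "/volumes/pool/vol1", "/volumes/pool/vol2")
```

Matching on the key itself plus the `old_key + "/"` prefix keeps siblings such as `vol10` from being renamed by accident.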
Joshua Boniface 50367c9190 Improve OSD create messages 2019-07-26 11:41:51 -04:00
Joshua Boniface 96bc181877 Set the routerstate on daemon startup
Allows switching from coordinator to not coordinator with a service
restart.
2019-07-12 09:51:56 -04:00
Joshua Boniface 2a220cd16e Nicer colour output for coordinator state client 2019-07-12 09:31:42 -04:00
Joshua Boniface 439c5f18c3 Add router_state to output of keepalives 2019-07-11 20:11:05 -04:00
Joshua Boniface f30be555c1 Improve message output for logging
Improve some formatting of the messages being printed to make it nicer
for long-term logging.
2019-07-10 22:38:32 -04:00
Joshua Boniface ac36870a86 Implement hup for log rotation
This function was long-existent, but never used; implement it.
2019-07-10 22:22:02 -04:00
Joshua Boniface 58f4222ee7 Support disabling log colours and dates
For usecases such as a pure-syslog, allow disabling of dates or colours
in the log messages (separately).
2019-07-10 22:17:23 -04:00
Joshua Boniface 32a6369de2 Add nicer message when live migrate fails 2019-07-10 17:42:24 -04:00
Joshua Boniface 8a28738bff Use consistent terminology in fence message 2019-07-10 11:54:56 -04:00
Joshua Boniface 8f160abf90 Handle cancelling flushes when new ones run
Store the flush_thread of a node as a class object. Before starting a
new flush thread (either flush or unflush), stop the existing one if it
exists to prevent further migrations, then start the new thread. Set the
object to None on init and again once the task actually finishes. Remove
the inflush flag as this is not required when using these threads and
functionally does nothing any longer, but add the flush_stopper flag to
trigger cancellation of the current job.
2019-07-10 11:54:34 -04:00
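The flush cancellation flow from commit 8f160abf90 can be sketched as below. This is a hedged illustration, not the daemon's implementation: the attribute names `flush_thread` and `flush_stopper` come from the commit message, while `NodeSketch`, `start_flush`, `wait`, and the sleep standing in for a migration are assumptions. It also assumes a single controlling thread calls `start_flush`.

```python
import threading
import time


class NodeSketch:
    def __init__(self):
        self.flush_thread = None    # set to None on init, per the commit
        self.flush_stopper = False  # flag that cancels the current job
        self.migrated = []

    def start_flush(self, vms):
        # Stop the existing flush thread, if any, before starting a new one.
        existing = self.flush_thread
        if existing is not None:
            self.flush_stopper = True
            existing.join()
        self.flush_stopper = False
        thread = threading.Thread(target=self._flush, args=(list(vms),))
        self.flush_thread = thread
        thread.start()

    def wait(self):
        # Block until the current flush (if any) has finished.
        thread = self.flush_thread
        if thread is not None:
            thread.join()

    def _flush(self, vms):
        for vm in vms:
            if self.flush_stopper:
                break  # prevent further migrations
            time.sleep(0.01)  # stand-in for migrating one VM
            self.migrated.append(vm)
        # Reset once the task actually finishes (or is cancelled).
        self.flush_thread = None


node = NodeSketch()
node.start_flush("vm%d" % i for i in range(50))
node.start_flush(["replacement-vm"])  # cancels the first flush
node.wait()
```

Because the old thread is joined before the new one starts, at most one flush job is migrating VMs at any time.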
Joshua Boniface c7c8c8bcbb Fix bug with flush 2019-07-10 00:43:55 -04:00
Joshua Boniface 7a8aee9fe7 Remove flush locking functionality
This just seemed like more trouble than it was worth. Flush locks were
originally intended as a way to counteract the weird issues around
flushing that were mostly fixed by the code refactoring, so this will
help test if those issues are truly gone. If not, will look into a
cleaner solution that doesn't result in unchangeable states.
2019-07-09 23:59:17 -04:00
Joshua Boniface ad284b13bc Fix bugs with fencing 2019-07-09 19:17:53 -04:00
Joshua Boniface 7df200ac44 Improve ZK connection loss handling 2019-07-09 19:17:32 -04:00
Joshua Boniface 47f86475f8 Handle failures of Ceph commands gracefully
If these commands fail, catch the error, print a message, and set up
empty lists. Also handle later data parsing in this case.
2019-07-09 16:43:38 -04:00
Joshua Boniface 1a8e7509f7 Support run_os_command timeout; use timeouts 2019-07-09 15:09:13 -04:00
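A timeout-aware command runner of the kind commit 1a8e7509f7 refers to might look like the sketch below. The name `run_os_command` is taken from the commit subject, but the signature, the list-of-arguments input, and the `128` return code on timeout are assumptions made for illustration.

```python
import subprocess
import sys


def run_os_command(command, timeout=None):
    """Run a command (list of arguments) and return (retcode, stdout, stderr).

    On timeout the child is killed and a placeholder return code of 128
    is reported (an arbitrary choice for this sketch).
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return 128, "", ""
    return result.returncode, result.stdout.decode(), result.stderr.decode()


# Example: a command that would hang for 5 seconds is cut off after 0.5.
retcode, stdout, stderr = run_os_command(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5
)
```

Returning a code instead of raising lets callers treat a hung command the same way as any other failure.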
Joshua Boniface 83a4140703 Allow enabling debug mode in config
Makes debugging easier without modifying code.
2019-07-09 14:59:00 -04:00
Joshua Boniface 8eeba9bc9b Make Ceph commands time out if needed 2019-07-09 14:35:53 -04:00
Joshua Boniface 19701c66e4 Move fencing to after keepalive output
Just makes the messages a little easier to read when triggered.
2019-07-09 14:24:31 -04:00
Joshua Boniface 17dfaf43c5 Move hypervisor selection out to common 2019-07-09 14:20:58 -04:00
Joshua Boniface b551b54642 Rename message when contending 2019-07-09 14:03:48 -04:00
Joshua Boniface 4249d5d982 Always load and store IPMI on daemon start
Without this, the IPMI information set during initial node creation can
never be changed, which can cause issues later. Instead, always set it
fresh on each node boot.
2019-07-09 14:00:31 -04:00
Joshua Boniface 7f828a27a5 Free RBD locks when fencing node 2019-07-09 10:59:31 -04:00
Joshua Boniface bc54ea2449 Log message when starting or stopping API client 2019-07-08 19:29:49 -04:00
Joshua Boniface cda690e94f Set RADOS df information in ZK 2019-07-08 10:19:56 -04:00
Joshua Boniface d9ebd04264 Fix missing dom_uuid values in data reads 2019-07-07 15:30:28 -04:00
Joshua Boniface b82ccaa84d Improve flush handling
Similar to recent client changes, don't replace the previous node record
of an already-migrated VM. Wait for shutdown if required. Use a
continue statement instead of a needless else block.
2019-07-07 15:27:37 -04:00
Joshua Boniface 0d398f663b Rename "Domain" to "VM" in various class names
The name "Domain", though technically correct from a Libvirt
perspective, was unnecessarily confusing. Call the class instances what
they are, VMs.
2019-07-07 15:20:37 -04:00
Joshua Boniface 8216125b02 Enable autostart of API client on Primary
Adds a config flag that turns on the API client following the Primary
coordinator. The retcode of the start/stop commands is ignored so this
can fail gracefully if e.g. the client isn't installed.
2019-07-06 02:42:56 -04:00
Joshua Boniface e6012965f1 Add YAML header to sample config files 2019-07-06 02:24:35 -04:00
Joshua Boniface 3e591bd09e Remove extra whitespaces on blank lines 2019-06-25 22:33:23 -04:00
Joshua Boniface 08cb16bfbc Revamp VM migration handling
This was very old code that was hard to follow and quite fragile, with
failures and infinite loops occurring fairly frequently. These changes
make the code more robust, including the addition of timeouts, some code
cleanup, and some improvements to the logical flow.

Also forces the libvirt migration to occur on the cluster network, which
couples to changes in the libvirtd listen (via pvc-ansible) and in
Daemon.py via the previous commit.
2019-06-25 22:23:48 -04:00
Joshua Boniface d336fce253 Connect to actual IP not localhost for Libvirt 2019-06-25 22:09:32 -04:00
Joshua Boniface 75d0e7f989 Revert "Only perform fencing duties on primary"
This reverts commit 464c69aac6.

Actually, yea, this made sense - if the primary fails, it can't
fence itself.
2019-06-25 12:36:48 -04:00
Joshua Boniface 85a5a8a0c9 Disable tx offloading on bridge interfaces
Reference: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=717215#68

Without this, DHCP fails when traversing only the local bridge, for
Debian Jessie or earlier (and possibly other OSes as well), due to the
missing UDP checksums. This disables the offload and hence reenables
the checksums even on the software-only bridge.

Also rearranged the steps and added comments around this section to
better clarify what each command is doing.
2019-06-25 12:36:37 -04:00
Joshua Boniface 464c69aac6 Only perform fencing duties on primary
There was really no need for this to be shared among all the
coordinators, which seemed more fragile. This way only the primary will
try to fence dead nodes.
2019-06-24 20:17:51 -04:00
Joshua Boniface 249611b161 Remove duplicate import 2019-06-24 20:14:43 -04:00
Joshua Boniface ef272b0b7d Add removal confirmations and zap disk before add 2019-06-21 15:52:28 -04:00
Joshua Boniface 867ad1fc1b Support human-readable biconversion and in volumes 2019-06-21 09:23:52 -04:00
Joshua Boniface ddedb1a992 Set image features to supported values 2019-06-19 15:19:36 -04:00
Joshua Boniface 0f15e7cda5 Set shutdown state after final keepalive 2019-06-19 14:52:47 -04:00
Joshua Boniface 0060c0313b Put daemonstate to shutdown when stopping
This way it isn't "run" all the way until it shuts down.
2019-06-19 14:23:07 -04:00
Joshua Boniface 9a0554fdbe Remove all volumes from pool on removal
Technically not needed, but otherwise random errors may be thrown,
so best to be explicit.
2019-06-19 12:49:03 -04:00
Joshua Boniface 87907d4ce8 Remove size field from volume objects
This data is just in the stats anyways.
2019-06-19 10:45:14 -04:00
Joshua Boniface 09562fdc06 Output in json format instead 2019-06-19 10:32:01 -04:00
Joshua Boniface a940d03959 Fix some bugs and add RBD volume stats 2019-06-19 10:25:22 -04:00
Joshua Boniface db0b382b3d Don't bother with snapshot management by Daemon
This is *definitely* not needed in the end, and just uses RAM for
no conceivable purpose. Snapshots are fully client-managed.
2019-06-19 09:43:04 -04:00
Joshua Boniface 1c9f606480 Implement volume and snapshot handling by daemon
This seems like a super-gross way to do this, but at the moment
I don't have a better way. Maybe just remove this component since
none of the volume/snapshot stuff is dynamic; will see as this
progresses.
2019-06-19 09:40:32 -04:00
Joshua Boniface 784b428ed0 Add creation of volume and snapshot lists 2019-06-19 09:29:36 -04:00
Joshua Boniface 064e6455bc Correct some more bugs 2019-06-19 00:29:21 -04:00
Joshua Boniface a4ab3075ab Correct some bugs around new code 2019-06-19 00:23:25 -04:00
Joshua Boniface 01959cb9e3 Implementation of RBD volumes and snapshots
Adds the ability to manage RBD volumes (add/remove) and RBD
snapshots (add/remove). (Working) list functions to come.
2019-06-19 00:12:44 -04:00
Joshua Boniface 2bbbda3da5 Only trigger pool updates on primary 2019-06-18 21:26:05 -04:00
Joshua Boniface 612f5ab52c Strip pv_block from stdout 2019-06-18 20:34:25 -04:00
Joshua Boniface 1622226c32 Add more logging during OSD creation/deletion 2019-06-18 20:31:04 -04:00
Joshua Boniface 3adeef6fdd Use the fsid to activate new OSDs 2019-06-18 20:22:28 -04:00
Joshua Boniface 443108f53d Add support for enable/disable keepalive detail 2019-06-18 19:54:42 -04:00
Joshua Boniface 79f284a0a9 Pass logger into run_command 2019-06-18 13:45:59 -04:00
Joshua Boniface 080ca3201c Correct actual problem with this_node 2019-06-18 13:43:54 -04:00
Joshua Boniface d076f9f4eb Use self.this_node everywhere 2019-06-18 13:25:16 -04:00
Joshua Boniface aee078f3eb Support disabling keepalive logging 2019-06-18 12:44:07 -04:00
Joshua Boniface b0411e8e1a Remove "error" message from Ceph commands
This triggers at every node start and isn't useful.
2019-06-18 12:41:38 -04:00
Joshua Boniface 8d9007f697 Remove OSD stat collection if count is zero
Otherwise, ceph osd df will hang indefinitely trying to get data
for the zero OSDs.
2019-06-18 12:36:53 -04:00
Joshua Boniface 5a327dc41a Clean up Ceph pipeline and add more debug logs 2019-06-18 11:19:03 -04:00
Joshua Boniface 46a416bc78 Use a proper variable for vni_mtu 2019-06-18 00:01:12 -04:00
Joshua Boniface 1f92b90a3e Don't encode initial data as we're using zkhandler 2019-06-17 23:53:16 -04:00
Joshua Boniface d4ebe63d9b Rename network device field
It seems much nicer and more consistent as "device" rather than as
"name".
2019-06-17 23:44:41 -04:00
Joshua Boniface 1d3f868206 Unify network devices and addresses in config
The old way of doing this was a little cumbersome, with an upper YAML
tree split between "devices" (name and MTU) and addresses. This commit
unifies these under the root "networking" section to make this section
clearer.
2019-06-17 23:41:07 -04:00
Joshua Boniface e70255dbd6 Support configurable interface MTUs
MTUs were hardcoded at 9000, which breaks if the underlying interface
or network switch does not support jumbo frames, a possible deployment
limitation. This has non-obvious consequences due to MTU mismatches
for certain services (Ceph, Zookeeper, etc.).

This commit adds support for configurable MTUs for each interface,
set in pvcd.yaml. The example has been updated to reflect this, with
a default of 1500 (the Ethernet standard).

This commit also adds autoconfiguration of the VNI device MTU based
on the `vni_mtu` value, the same for bridge networks and minus 50
(rather than 200 from the hardcoded value, based on the following
resource [1]) for VXLAN networks.

[1] http://ipengineer.net/2014/06/vxlan-mtu-vs-ip-mtu-consideration/
2019-06-17 23:34:48 -04:00
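The MTU autoconfiguration described in commit e70255dbd6 amounts to a small calculation, sketched here. Only `vni_mtu` and the 50-byte VXLAN reduction come from the commit message; the function name and the `"bridge"`/`"vxlan"` type strings are hypothetical.

```python
VXLAN_OVERHEAD = 50  # bytes subtracted for VXLAN networks, per the commit


def vni_device_mtu(vni_mtu, network_type):
    """Return the MTU for a network device built on the VNI interface."""
    if network_type == "bridge":
        # Bridge networks use the same MTU as the VNI device.
        return vni_mtu
    if network_type == "vxlan":
        # VXLAN networks lose 50 bytes to encapsulation headers.
        return vni_mtu - VXLAN_OVERHEAD
    raise ValueError("unknown network type: %s" % network_type)


print(vni_device_mtu(1500, "bridge"))  # 1500
print(vni_device_mtu(1500, "vxlan"))   # 1450
```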
Joshua Boniface c583ee1709 Revert "Wait a little longer"
This reverts commit bd7a55e9e1.

This is not really needed, but do keep the 5s wait
2019-06-17 21:56:06 -04:00
Joshua Boniface bd7a55e9e1 Wait a little longer 2019-06-17 12:14:13 -04:00
Joshua Boniface 23994f8a11 Increase wait time for daemons and log message 2019-06-17 10:30:46 -04:00
Joshua Boniface fe654aa5a2 Correct typo in daemon 2019-06-16 19:27:20 -04:00
Joshua Boniface 14e9ba892c Wait on both sides for 30s
Still finding issues with the flush
2019-05-24 01:23:18 -04:00
Joshua Boniface ae37afcf75 Wait 10 seconds when starting pvc-flush
Without waiting, the unflush will trigger too soon, before the
daemon is fully ready, and as such it fails in odd ways.
2019-05-23 23:35:01 -04:00
Joshua Boniface e8b666708c Add one final keepalive update before exiting 2019-05-23 23:23:03 -04:00
Joshua Boniface 4c5ce9b995 Perform additional tweaks to units
Use RemainAfterExit to avoid pvc-flush from auto-stopping immediately.

Use PartOf to tie services to the target itself.

Use --wait on flush to avoid daemon stopping before flush is complete.
2019-05-23 23:18:28 -04:00
Joshua Boniface e46aa22989 Remove invalid Restart in pvc-flush.service 2019-05-23 22:51:36 -04:00
Joshua Boniface 7c6132f7dd Add node autoflush service and target
Add a systemd service to manage node flush/unflush, useful during
system startup and shutdown to avoid requiring administrator
intervention for this to occur. This is optional and the service is
not enabled by default, and the postinst script informs the
administrator of this.

Also adds a systemd target to collect the two service units together
and provide an easy way to flush+shutdown or startup+unflush the
entire PVC system.

Closes #28
2019-05-23 22:42:51 -04:00
Joshua Boniface 8ef21cf9f2 Sleep longer before removing gateways
1 second was just slightly too little time to wait and packets would
occasionally be lost on primary switchover. Increase this to 2
seconds to provide more time for arping to run on the new primary.
2019-05-23 22:20:38 -04:00
Joshua Boniface 8881b97e8b Correct a missing capitalization 2019-05-21 23:19:19 -04:00
Joshua Boniface 3893666507 Improve performance by removing spurious actions
1. Remove a number of time.sleep commands which don't really seem
necessary any longer and which significantly increased the startup
time while parsing the VM list.
2. Handle some variable sets during initialization of the object,
rather than waiting for a management command, enabling...
3. Know when a state change, and the corresponding Libvirt lookup,
is unnecessary due to the target node not matching the current node.
This also removes a number of unremovable errors from Libvirt on the
console which were annoying.

This reduces the total time taken by the VM startup segment (lines
760-762 of Daemon.py) from 17.117s down to 0.976s for 82 VMs.
2019-05-21 22:56:40 -04:00
Joshua Boniface 595cf1782c Switch DNS aggregator to PostgreSQL
MariaDB+Galera was terribly unstable, with the cluster failing to
start or dying randomly, and generally seemed incredibly unsuitable
for an HA solution. This commit switches the DNS aggregator SQL
backend to PostgreSQL, implemented via Patroni HA.

It also manages the Patroni state, forcing the primary instance to
follow the PVC coordinator, such that the active DNS Aggregator
instance is always able to communicate read+write with the local
system.

This required some logic changes to how the DNS Aggregator worked,
specifically ensuring that database changes aren't attempted while
the instance isn't actively running - to be honest this was a bug
anyways that had just never been noticed.

Closes #34
2019-05-21 01:07:41 -04:00
Joshua Boniface 9e806d30f9 Only stop log parser if it's actually running 2019-05-11 12:09:42 -04:00