267a3d16e5
Bump version to 0.5
2019-08-08 20:56:27 -04:00
2880a761c0
Move Ceph command pipe to new location
...
Matching the new /cmd/domain pipe, move Ceph pipe to /cmd/ceph.
2019-08-07 14:47:27 -04:00
b7546e3711
Fix bugs in command pipeline for VMs
2019-08-07 14:13:01 -04:00
0ff2d7d537
Use shlex for command splitting
...
This will preserve quoted strings, required for the rbd lock commands.
2019-08-07 14:02:57 -04:00
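A minimal sketch of why shlex matters here, not the commit's actual code. The rbd command string and lock ID below are hypothetical, chosen only because the quoted argument contains a space.

```python
import shlex

# Hypothetical command string; the quoted lock ID contains a space.
command = 'rbd lock remove vms/disk0 "auto 123456" client.4567'

# Naive whitespace splitting breaks the quoted lock ID into two arguments
# and leaves the quote characters embedded in them.
naive_args = command.split()

# shlex.split follows POSIX shell quoting rules, so the quoted string
# survives as a single argument with the quotes removed.
shlex_args = shlex.split(command)

print(naive_args)
print(shlex_args)
```

The shlex result can be passed directly as an argument list to a process-spawning call, with no shell involved.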
a2a630f6a0
Add pipeline for VM lock flush cmd
2019-08-07 13:49:33 -04:00
496216321e
Move lock flushing to VMInstance
...
Prepares for reuse of this function via client commands.
2019-08-07 13:36:56 -04:00
0446b2db02
Catch exceptions if Patroni is not up
2019-08-07 11:46:58 -04:00
7e77752ce5
Add limit to Patroni switchover attempts
2019-08-07 11:46:42 -04:00
33a963c2af
Improve fence output on failure and increase delay
2019-08-07 11:35:49 -04:00
e92a57606d
Use better forceful arping command
...
Send ARP responses with the source IP in them to force an update even if the
old primary did not cleanly terminate (during fencing for instance).
2019-08-07 11:29:38 -04:00
ef3b6b3723
Arping 3 times instead of 2
...
During a fence, 2 is not always enough for the network to recognize the
change in primary coordinator.
2019-08-07 11:15:36 -04:00
3b27a88128
Allow abort of shutdown state
...
Adds some logic to allow an active shutdown state to be aborted by
changing the VM to another state. Useful mostly if a VM is doing funky
things and not responding to the shutdown, but the administrator either
doesn't want to wait for the timer to expire (forcing an immediate
termination) or wishes to abort the shutdown attempt.
Fixes #49
2019-08-07 10:58:18 -04:00
e2ae58b62c
Add the missing newline to the string compare
2019-08-04 17:00:33 -04:00
d0d5ab4425
Fix bug if the switchover target is the same
2019-08-04 16:51:11 -04:00
a329376d33
Lock primary_node key during primary switchover
...
Also implements a loop to switch over the Patroni leader, ensuring that it
always follows the primary, and cleans up the surrounding code a bit.
2019-08-04 16:42:06 -04:00
710d2cf9c2
Fix record duplication bug and general cleanup
...
Fixes #47
2019-08-01 13:11:45 -04:00
8bdec03cf1
Properly support debug logging via config
2019-08-01 11:22:27 -04:00
c6e58796ba
Clean up redundant return section
2019-07-31 23:57:31 -04:00
7380f45b1b
Improve dnsmasq interface handling
...
listen-address is enough; adding interface too causes weird issues where
dnsmasq is listening on an IPv6 global wildcard too which conflicts with
the PowerDNS instance.
2019-07-31 10:03:56 -04:00
324990739e
Make DNS aggregator listen on port 53
...
Using the non-standard port was a pain. Now that all the DNSMasq stuff
works, move back to the default port.
2019-07-30 09:20:01 -04:00
717d00cfcf
Implement snapshot rename in node daemon
...
[4/2] Implements #44
2019-07-28 23:06:12 -04:00
83b806d0b5
Move intervals config one level up
...
Makes for a slightly-better-organized configuration and explanation.
2019-07-28 19:33:23 -04:00
68ca493b3b
Fix bad error code
2019-07-26 20:53:01 -04:00
837666a15e
Revamp renamekey function
...
The function had numerous bugs and didn't work. Fix them up.
2019-07-26 16:38:05 -04:00
35363671a0
Implement Ceph volume resize and rename
...
Includes a simple implementation of a zookeeper "rename" facility,
allowing a key and all data to be replaced by a new key with a different
name but containing all the same child elements and data.
[2/2] Implements #44
2019-07-26 15:13:21 -04:00
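The "rename" facility described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the commit's implementation: a plain dict mapping key paths to data stands in for the ZooKeeper tree, and rename_key_sketch is a hypothetical name.

```python
def rename_key_sketch(store, old_key, new_key):
    """Replace old_key and all of its children with new_key, keeping the data.

    `store` is a plain dict of path -> data standing in for the ZooKeeper
    tree (an assumption for illustration; no real client is used here).
    """
    prefix = old_key + "/"
    moved = {}
    for path, data in store.items():
        if path == old_key:
            # The key itself moves to the new name.
            moved[new_key] = data
        elif path.startswith(prefix):
            # Children keep their relative path under the new name.
            moved[new_key + "/" + path[len(prefix):]] = data
    if not moved:
        raise KeyError(old_key)
    # Remove the old key and its children, then write the copies.
    for path in [p for p in store if p == old_key or p.startswith(prefix)]:
        del store[path]
    store.update(moved)
    return store


tree = {
    "/volumes/a": "volume-data",
    "/volumes/a/stats": "stats-data",
    "/volumes/ab": "unrelated",
}
rename_key_sketch(tree, "/volumes/a", "/volumes/b")
```

Matching on the prefix with a trailing slash keeps sibling keys that merely share a name prefix (such as /volumes/ab here) from being moved by mistake.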
50367c9190
Improve OSD create messages
2019-07-26 11:41:51 -04:00
96bc181877
Set the routerstate on daemon startup
...
Allows switching from coordinator to non-coordinator with a service
restart.
2019-07-12 09:51:56 -04:00
2a220cd16e
Nicer colour output for coordinator state client
2019-07-12 09:31:42 -04:00
439c5f18c3
Add router_state to output of keepalives
2019-07-11 20:11:05 -04:00
f30be555c1
Improve message output for logging
...
Improve some formatting of the messages being printed to make it nicer
for long-term logging.
2019-07-10 22:38:32 -04:00
ac36870a86
Implement hup for log rotation
...
This function was long-existent, but never used; implement it.
2019-07-10 22:22:02 -04:00
58f4222ee7
Support disabling log colours and dates
...
For use cases such as pure syslog, allow disabling of dates or colours
in the log messages (separately).
2019-07-10 22:17:23 -04:00
32a6369de2
Add nicer message when live migrate fails
2019-07-10 17:42:24 -04:00
8a28738bff
Use consistent terminology in fence message
2019-07-10 11:54:56 -04:00
8f160abf90
Handle cancelling flushes when new ones run
...
Store the flush_thread of a node as a class object. Before starting a
new flush thread (either flush or unflush), stop the existing one if it
exists to prevent further migrations, then start the new thread. Set the
object to None on init and again once the task actually finishes. Remove
the inflush flag as this is not required when using these threads and
functionally does nothing any longer, but add the flush_stopper flag to
trigger cancellation of the current job.
2019-07-10 11:54:34 -04:00
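The cancellation pattern described above might look roughly like this. It is a sketch under assumptions: the class, the method names, and the use of threading.Event for flush_stopper are illustrative, and a list append with a short sleep stands in for an actual VM migration.

```python
import threading
import time


class NodeSketch:
    """Hypothetical stand-in for a node object; not the daemon's real class."""

    def __init__(self):
        self.flush_thread = None                # None while no flush is running
        self.flush_stopper = threading.Event()  # set to cancel the current job
        self.migrated = []

    def _flush(self, vms, stopper):
        for vm in vms:
            if stopper.is_set():
                break                           # cancelled: no further migrations
            self.migrated.append(vm)            # placeholder for a real migration
            time.sleep(0.2)
        # Reset to None once the task actually finishes.
        if self.flush_thread is threading.current_thread():
            self.flush_thread = None

    def start_flush(self, vms):
        old_thread = self.flush_thread
        if old_thread is not None:
            # Stop the existing flush before starting the new one.
            self.flush_stopper.set()
            old_thread.join()
        self.flush_stopper = threading.Event()
        thread = threading.Thread(target=self._flush, args=(vms, self.flush_stopper))
        self.flush_thread = thread
        thread.start()

    def wait(self):
        thread = self.flush_thread
        if thread is not None:
            thread.join()


node = NodeSketch()
node.start_flush(["vm1", "vm2", "vm3", "vm4", "vm5"])
node.start_flush(["vm9"])   # cancels the first flush part-way through
node.wait()
```

Each thread receives its own stopper object, so cancelling an old flush cannot accidentally stop the one that replaces it.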
c7c8c8bcbb
Fix bug with flush
2019-07-10 00:43:55 -04:00
7a8aee9fe7
Remove flush locking functionality
...
This just seemed like more trouble than it was worth. Flush locks were
originally intended as a way to counteract the weird issues around
flushing that were mostly fixed by the code refactoring, so this will
help test if those issues are truly gone. If not, will look into a
cleaner solution that doesn't result in unchangeable states.
2019-07-09 23:59:17 -04:00
ad284b13bc
Fix bugs with fencing
2019-07-09 19:17:53 -04:00
7df200ac44
Improve ZK connection loss handling
2019-07-09 19:17:32 -04:00
47f86475f8
Handle failures of Ceph commands gracefully
...
If these commands fail, catch the error, print a message, and set up
empty lists. Also handle later data parsing in this case.
2019-07-09 16:43:38 -04:00
1a8e7509f7
Support run_os_command timeout; use timeouts
2019-07-09 15:09:13 -04:00
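One way a command runner with an optional timeout could be written is sketched below. The function name, its return shape, and the return code used for a timeout are assumptions for illustration; the real run_os_command signature is not shown in this log.

```python
import shlex
import subprocess


def run_os_command_sketch(command_string, timeout=None):
    """Run a command and return (returncode, stdout, stderr).

    Hypothetical helper: 128 is an arbitrary code chosen here to signal
    that the command was killed after exceeding `timeout` seconds.
    """
    args = shlex.split(command_string)
    try:
        result = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=timeout,   # None means wait indefinitely
        )
    except subprocess.TimeoutExpired:
        return 128, "", "command timed out"
    return result.returncode, result.stdout, result.stderr
```

A caller can then treat a non-zero return code, including the timeout case, as a failure and fall back to empty results instead of blocking forever on a hung command.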
83a4140703
Allow enabling debug mode in config
...
Makes debugging easier without modifying code.
2019-07-09 14:59:00 -04:00
8eeba9bc9b
Make Ceph commands time out if needed
2019-07-09 14:35:53 -04:00
19701c66e4
Move fencing to after keepalive output
...
Just makes the messages a little easier to read when triggered.
2019-07-09 14:24:31 -04:00
17dfaf43c5
Move hypervisor selection out to common
2019-07-09 14:20:58 -04:00
b551b54642
Rename message when contending
2019-07-09 14:03:48 -04:00
4249d5d982
Always load and store IPMI on daemon start
...
Without this, the IPMI information set during initial node creation can
never be changed, which can cause issues later. Instead, always set it
fresh on each node boot.
2019-07-09 14:00:31 -04:00
7f828a27a5
Free RBD locks when fencing node
2019-07-09 10:59:31 -04:00
bc54ea2449
Log message when starting or stopping API client
2019-07-08 19:29:49 -04:00
cda690e94f
Set RADOS df information in ZK
2019-07-08 10:19:56 -04:00