Commit Graph

41 Commits

Author SHA1 Message Date
Joshua Boniface 123c7ce857 Update copyright header on all files for 2024
Last release of 2023 is probably the best time to do this.
2023-12-29 11:16:59 -05:00
Joshua Boniface 8083b7a3e6 Bump version to 0.9.87 2023-12-27 13:40:51 -05:00
Joshua Boniface 572596c575 Fix missing f-string placeholder 2023-12-27 13:21:20 -05:00
Joshua Boniface e654fbba08 Move debug condition handling to Logger
Avoids many dozens of conditionals sprinkled throughout the code by
centralizing this check into the main Logger instance.
2023-12-27 13:01:45 -05:00
Joshua Boniface 0a93f526e0 Bump version to 0.9.86 2023-12-14 14:46:29 -05:00
Joshua Boniface 709c9cb73e Pause pvchealthd startup until node daemon is run
If the health daemon starts too soon during a node bootup, it will
generate generate tons of erroneous faults while the node starts up.
Adds a conditional wait for the current node daemon to be in "run"
state before the health daemon really starts up.
2023-12-13 14:53:54 -05:00
Joshua Boniface 9dc5097dbc Bump version to 0.9.85 2023-12-10 01:00:33 -05:00
Joshua Boniface 9aee2a9075 Bump version to 0.9.84 2023-12-09 23:05:40 -05:00
Joshua Boniface b0557edb76 Ensure entry in name is uppercase 2023-12-09 17:01:41 -05:00
Joshua Boniface 47bd7bf2f5 Only run cluster-wide health checks on primary
Avoids multiple coordinators trying to write updated cluster-wide fault
events. Instead, they are now only written by the primary (or the
incoming primary if still in a transition).
2023-12-09 16:50:51 -05:00
Joshua Boniface b9fbfe2ed5 Improve fault ID format
Instead of using random hex characters from an md5sum, use a nice name
in all-caps similar to how Ceph does. This further helps prevent dupes
but also permits a changing health delta within a single event (which
would really only ever apply to plugin faults).
2023-12-09 16:48:14 -05:00
Joshua Boniface 7e6d922877 Improve fault detail handling further
Since we already had a "details" field, simply move where it gets added
to the message later, in generate_fault, after the main message value
was used to generate the ID.
2023-12-09 16:13:36 -05:00
Joshua Boniface 82a7fd3c80 Add more debugging info to psql 2023-12-07 21:36:05 -05:00
Joshua Boniface ddd9d9ee07 Adjust psql check to avoid weird failures 2023-12-07 15:07:59 -05:00
Joshua Boniface 9e2e749c55 Combine pvchealthd output into single log message 2023-12-07 14:00:43 -05:00
Joshua Boniface 157b8c20bf Add Patroni output to debug logs 2023-12-07 14:00:35 -05:00
Joshua Boniface bf158dc2d9 Shorten debug output 2023-12-07 13:31:20 -05:00
Joshua Boniface 1b84553405 Use passed coordinator state 2023-12-07 11:19:26 -05:00
Joshua Boniface 60dac143f2 Use simpler health calculation 2023-12-07 11:17:31 -05:00
Joshua Boniface a13273335d Add colon to result text 2023-12-07 11:15:42 -05:00
Joshua Boniface e7f21b7058 Enhance and fix bugs in psql plugin
1. Check Patronictl statuses
2. Don't error during node primary transitions
2023-12-07 11:14:16 -05:00
Joshua Boniface 9dbadfdd6e Move back to per-plugin fault reporting 2023-12-07 11:13:56 -05:00
Joshua Boniface 5691f75ac9 Fix bad import 2023-12-06 14:28:32 -05:00
Joshua Boniface 4a02c2c8e3 Add additional faults 2023-12-06 13:27:39 -05:00
Joshua Boniface 79eb54d5da Move fault generation to common library 2023-12-06 13:17:10 -05:00
Joshua Boniface 067e73337f Shorten health IDs to 8 characters 2023-12-04 15:48:27 -05:00
Joshua Boniface b59f743690 Improve logging and handling of fault entries 2023-12-01 17:38:28 -05:00
Joshua Boniface 4c3f235e05 Avoid running fault updates in maintenance mode
When the cluster is in maintenance mode, all faults should be ignored.
2023-12-01 17:38:28 -05:00
Joshua Boniface 9c2b1b29ee Add node health to fault states
Adjusts ordering and ensures that node health states are included in
faults if they are less than 50%.

Also adjusts fault ID generation and runs fault checks only coordinator
nodes to avoid too many runs.
2023-12-01 17:38:28 -05:00
Joshua Boniface 8594eb697f Add initial fault generation in pvchealthd
References: #164
2023-12-01 17:38:27 -05:00
Joshua Boniface 988de1218f Bump version to 0.9.83 2023-12-01 17:37:42 -05:00
Joshua Boniface 915a84ee3c Fix psql check for new configs 2023-12-01 03:58:21 -05:00
Joshua Boniface 03a738f878 Move config parser into daemon_lib
And reformat/add config values for API.
2023-11-30 00:05:37 -05:00
Joshua Boniface 97eb63ebab Clean up config naming and dead files 2023-11-29 21:21:51 -05:00
Joshua Boniface 077dd8708f Add check start message 2023-11-29 21:21:51 -05:00
Joshua Boniface b6b5786c3b Output list in cyan (s state) 2023-11-29 21:21:51 -05:00
Joshua Boniface d2b764a2c7 Output more details on startup 2023-11-29 21:21:51 -05:00
Joshua Boniface 7a7c975eff Ensure return from health shutdown 2023-11-29 21:21:51 -05:00
Joshua Boniface 647cba3cf5 Expand startup width for new daemon name 2023-11-29 21:21:51 -05:00
Joshua Boniface 921ecb3a05 Fix name in kydb plugin 2023-11-29 21:21:51 -05:00
Joshua Boniface 41f4e4fb2f Split health monitoring into discrete daemon/pkg 2023-11-29 21:21:51 -05:00