Compare commits
25 Commits
9e0e2f0c76
...
4018f32460
Author | SHA1 | Date | |
---|---|---|---|
4018f32460 | |||
a72384631b | |||
564a9cf2a0 | |||
a04f12d4c3 | |||
377e692f7a | |||
4e3b1e462c | |||
6e4cd34881 | |||
12d4690e86 | |||
651a2fa45e | |||
32082581af | |||
3fba1f6de7 | |||
1a9bd09788 | |||
b7859189eb | |||
ddccc91645 | |||
2b180bd5c4 | |||
c0b868896e | |||
f02b128284 | |||
871955e5b6 | |||
511df41fa4 | |||
5c2ec9ce78 | |||
f2a6a4ac1f | |||
31c7c2522f | |||
35a5052e2b | |||
390b0c6257 | |||
3b5b1f258d |
21
README.md
@ -60,7 +60,7 @@ The PVC Bootstrap system is designed to heavily leverage Redfish in its automati
|
||||
|
||||
1. Connect power to the servers, but do not manually power on the servers - Redfish will handle this aspect after characterizing each host, as well as manage boot, RAID array creation (as documented in `bootstrap.yml`), BIOS configuration, etc.
|
||||
|
||||
1. Wait for the cluster bootstrapping to complete; you can watch the output of the `pvcbootstrapd` and `pvcbootstrapd-worker` services on the Bootstrap host to see progress. If supported, the indicator LEDs of the nodes will be lit during setup and will be disabled upon completion to provide a physical indication of the process.
|
||||
1. Wait for the cluster bootstrapping to complete; you can watch the output of the `pvcbootstrapd` and `pvcbootstrapd-worker` services on the Bootstrap host to see progress, or configure the system to send webhooks to a remote target (e.g. Slack/Mattermost messages). If supported, the indicator LEDs of the nodes will be lit during setup and will be disabled upon completion to provide a physical indication of the process.
|
||||
|
||||
1. Verify and power off the servers and put them into production; you may need to complete several post-install tasks (for instance setting the production BMC networking via `sudo ifup ipmi` on each node) before the cluster is completely finished.
|
||||
|
||||
@ -84,7 +84,7 @@ The PVC Bootstrap system can still handle nodes without Redfish support, for ins
|
||||
|
||||
1. Power on the servers and set them to boot temporarily (one time) from PXE.
|
||||
|
||||
1. Wait for the cluster bootstrapping to complete; you can watch the output of the `pvcbootstrapd` and `pvcbootstrapd-worker` services on the Bootstrap host to see progress. If supported, the indicator LEDs of the nodes will be lit during setup and will be disabled upon completion to provide a physical indication of the process.
|
||||
1. Wait for the cluster bootstrapping to complete; you can watch the output of the `pvcbootstrapd` and `pvcbootstrapd-worker` services on the Bootstrap host to see progress, or configure the system to send webhooks to a remote target (e.g. Slack/Mattermost messages). If supported, the indicator LEDs of the nodes will be lit during setup and will be disabled upon completion to provide a physical indication of the process.
|
||||
|
||||
1. Verify and power off the servers and put them into production; you may need to complete several post-install tasks (for instance setting the production BMC networking via `sudo ifup ipmi` on each node) before the cluster is completely finished.
|
||||
|
||||
@ -150,13 +150,22 @@ filesystem="ext4"
|
||||
# The hostname of the system (set per-run)
|
||||
target_hostname="hv1.example.tld"
|
||||
|
||||
# The target system disk path
|
||||
# The target system disk path; must be a single disk (mdadm/software RAID is not supported)
|
||||
# This will usually use a `detect` string. A "detect" string is a string in the form "detect:<NAME>:<HUMAN-SIZE>:<ID>".
|
||||
# Detect strings allow for automatic determination of Linux block device paths from known basic information
|
||||
# about disks by leveraging "lsscsi" on the target host. The "NAME" should be some descriptive identifier,
|
||||
# for instance the manufacturer (e.g. "INTEL"), the "HUMAN-SIZE" should be the labeled human-readable size
|
||||
# of the device (e.g. "480GB", "1.92TB"), and "ID" specifies the Nth 0-indexed device which matches the
|
||||
# "NAME" and "HUMAN-SIZE" values (e.g. "2" would match the third device with the corresponding "NAME" and
|
||||
# "HUMAN-SIZE"). When matching against sizes, there is +/- 3% flexibility to account for base-1000 vs.
|
||||
# base-1024 differences and rounding errors. The "NAME" may contain whitespace but if so the entire detect
|
||||
# string should be quoted, and is case-insensitive.
|
||||
target_disk="detect:LOGICAL:146GB:0"
|
||||
|
||||
# SSH key method (usually tftp)
|
||||
# SSH key fetch method (usually tftp)
|
||||
target_keys_method="tftp"
|
||||
|
||||
# SSH key path (usually keys.txt)
|
||||
# SSH key fetch path (usually keys.txt)
|
||||
target_keys_path="keys.txt"
|
||||
|
||||
# Deploy username (usually deploy)
|
||||
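The detect string format described in the comments of this hunk (`detect:<NAME>:<HUMAN-SIZE>:<ID>`, case-insensitive name, 0-indexed ID, +/- 3% size tolerance) can be illustrated with a minimal sketch. This is not the installer's actual implementation, which relies on `lsscsi` on the target host; the helper names `human_to_bytes`, `parse_detect_string` and `find_disk`, the `(path, model, size)` tuple layout, the substring match on the model, and the base-1000 unit table are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple

# Assumption: labeled sizes are interpreted as base-1000 units.
UNITS = {"KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}


class DetectSpec(NamedTuple):
    name: str          # upper-cased for case-insensitive matching
    size_bytes: float  # labeled size converted to bytes
    index: int         # Nth (0-indexed) matching device


def human_to_bytes(human_size: str) -> float:
    """Convert a labeled size such as '480GB' or '1.92TB' to bytes."""
    human_size = human_size.strip().upper()
    for unit, factor in UNITS.items():
        if human_size.endswith(unit):
            return float(human_size[: -len(unit)]) * factor
    raise ValueError(f"Unrecognized size: {human_size}")


def parse_detect_string(detect: str) -> DetectSpec:
    """Split 'detect:<NAME>:<HUMAN-SIZE>:<ID>' into its components."""
    prefix, name, human_size, index = detect.split(":")
    if prefix != "detect":
        raise ValueError(f"Not a detect string: {detect}")
    return DetectSpec(name.strip().upper(), human_to_bytes(human_size), int(index))


def find_disk(
    detect: str,
    disks: List[Tuple[str, str, int]],
    tolerance: float = 0.03,
) -> Optional[str]:
    """Return the path of the Nth disk matching name and size, or None."""
    spec = parse_detect_string(detect)
    matches = [
        path
        for path, model, size in disks
        if spec.name in model.upper()
        and abs(size - spec.size_bytes) <= spec.size_bytes * tolerance
    ]
    return matches[spec.index] if spec.index < len(matches) else None
```

With a hypothetical disk list containing two 480 GB "INTEL" devices, `find_disk("detect:INTEL:480GB:1", disks)` would select the second of them, and an ID beyond the number of matches yields `None`.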
@ -179,6 +188,6 @@ pvcbootstrapd_checkin_uri="http://10.255.255.1:9999/checkin/host"
|
||||
|
||||
## Bootstrap Process
|
||||
|
||||
This diagram outlines the various states the nodes and clusters will be in throughout the setup process along with the individual steps for reference.
|
||||
This diagram outlines the various states the nodes and clusters will be in throughout the setup process along with the individual steps for reference. Which node starts characterizing first can be random, but is shown as `node1` for clarity. For non-Redfish installs, the first several steps must be completed manually as referenced above.
|
||||
|
||||

|
||||
|
@ -58,6 +58,15 @@ pvc:
|
||||
# Per-host TFTP path (almost always "/host" under "root_path"; must be writable)
|
||||
host_path: "/srv/tftp/pvc-installer/host"
|
||||
|
||||
# Debian repository configuration
|
||||
repo:
|
||||
# Mirror path; defaults to using the apt-cacher-ng instance located on the current machine
|
||||
# Replace "10.199.199.254" if you change "dhcp" -> "address" above
|
||||
mirror: http://10.199.199.254:3142/ftp.debian.org/debian
|
||||
|
||||
# Default Debian release for new clusters. Must be supported by PVC ("buster", "bullseye", "bookworm").
|
||||
release: bookworm
|
||||
|
||||
# PVC Ansible repository configuration
|
||||
# Note: If "path" does not exist, "remote" will be cloned to it via Git using SSH private key "keyfile".
|
||||
# Note: The VCS will be refreshed regularly via the API in response to webhooks.
|
||||
@ -66,7 +75,7 @@ pvc:
|
||||
path: "/var/home/joshua/pvc"
|
||||
|
||||
# Path to the deploy key (if applicable) used to clone and pull the repository
|
||||
keyfile: "/var/home/joshua/id_ed25519.joshua.key"
|
||||
key_file: "/var/home/joshua/id_ed25519.joshua.key"
|
||||
|
||||
# Git remote URI for the repository
|
||||
remote: "ssh://git@git.bonifacelabs.ca:2222/bonifacelabs/pvc.git"
|
||||
@ -77,6 +86,9 @@ pvc:
|
||||
# Clusters configuration file
|
||||
clusters_file: "clusters.yml"
|
||||
|
||||
# Lock file to use for Git interaction
|
||||
lock_file: "/run/pvcbootstrapd.lock"
|
||||
|
||||
# Filenames of the various group_vars components of a cluster
|
||||
# Generally with pvc-ansible this will contain 2 files: "base.yml", and "pvc.yml"; refer to the
|
||||
# pvc-ansible documentation and examples for details on these files.
|
||||
|
@ -21,6 +21,9 @@ pvc:
|
||||
tftp:
|
||||
root_path: "ROOT_DIRECTORY/tftp"
|
||||
host_path: "ROOT_DIRECTORY/tftp/host"
|
||||
repo:
|
||||
mirror: http://BOOTSTRAP_ADDRESS:3142/UPSTREAM_MIRROR
|
||||
release: DEBIAN_RELEASE
|
||||
ansible:
|
||||
path: "ROOT_DIRECTORY/repo"
|
||||
keyfile: "ROOT_DIRECTORY/id_ed25519"
|
||||
|
@ -121,6 +121,7 @@ def read_config():
|
||||
o_queue = o_base["queue"]
|
||||
o_dhcp = o_base["dhcp"]
|
||||
o_tftp = o_base["tftp"]
|
||||
o_repo = o_base["repo"]
|
||||
o_ansible = o_base["ansible"]
|
||||
o_notifications = o_base["notifications"]
|
||||
except KeyError as k:
|
||||
@ -178,8 +179,17 @@ def read_config():
|
||||
f"Missing second-level key '{key}' under 'tftp'"
|
||||
)
|
||||
|
||||
# Get the Repo configuration
|
||||
for key in ["mirror", "release"]:
|
||||
try:
|
||||
config[f"repo_{key}"] = o_repo[key]
|
||||
except Exception:
|
||||
raise MalformedConfigurationError(
|
||||
f"Missing second-level key '{key}' under 'repo'"
|
||||
)
|
||||
|
||||
# Get the Ansible configuration
|
||||
for key in ["path", "keyfile", "remote", "branch", "clusters_file"]:
|
||||
for key in ["path", "key_file", "remote", "branch", "clusters_file", "lock_file"]:
|
||||
try:
|
||||
config[f"ansible_{key}"] = o_ansible[key]
|
||||
except Exception:
|
||||
|
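The per-section key loops in `read_config()` above all follow the same pattern: read each required second-level key into a flat `config` dictionary under a `<section>_<key>` name, and raise `MalformedConfigurationError` if one is missing. A minimal standalone sketch of that pattern follows. The helper `read_section` is hypothetical (the daemon inlines one loop per section), the exception class here is a local stand-in, and the first-level error message is an assumption since the diff does not show it.

```python
class MalformedConfigurationError(Exception):
    """Local stand-in for the daemon's configuration error type."""


def read_section(config: dict, o_base: dict, section: str, keys: list) -> dict:
    """Copy required keys of one section into a flat config dictionary."""
    try:
        o_section = o_base[section]
    except KeyError:
        # Assumption: the real message for a missing section is not shown.
        raise MalformedConfigurationError(f"Missing first-level key '{section}'")

    for key in keys:
        try:
            config[f"{section}_{key}"] = o_section[key]
        except KeyError:
            raise MalformedConfigurationError(
                f"Missing second-level key '{key}' under '{section}'"
            )
    return config
```

Under this pattern, a configuration file that still uses the old `keyfile` name would fail at startup with a message naming `key_file`, which is one way the rename in this change set surfaces to operators.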
@ -54,7 +54,7 @@ def run_bootstrap(config, cspec, cluster, nodes):
|
||||
logger.info("Waiting 60s before starting Ansible bootstrap.")
|
||||
sleep(60)
|
||||
|
||||
logger.info("Starting Ansible bootstrap of cluster {cluster.name}")
|
||||
logger.info(f"Starting Ansible bootstrap of cluster {cluster.name}")
|
||||
notifications.send_webhook(config, "begin", f"Cluster {cluster.name}: Starting Ansible bootstrap")
|
||||
|
||||
# Run the Ansible playbooks
|
||||
@ -66,8 +66,8 @@ def run_bootstrap(config, cspec, cluster, nodes):
|
||||
limit=f"{cluster.name}",
|
||||
playbook=f"{config['ansible_path']}/pvc.yml",
|
||||
extravars={
|
||||
"ansible_ssh_private_key_file": config["ansible_keyfile"],
|
||||
"bootstrap": "yes",
|
||||
"ansible_ssh_private_key_file": config["ansible_key_file"],
|
||||
"do_bootstrap": "yes",
|
||||
},
|
||||
forks=len(nodes),
|
||||
verbosity=2,
|
||||
@ -76,7 +76,7 @@ def run_bootstrap(config, cspec, cluster, nodes):
|
||||
logger.info("{}: {}".format(r.status, r.rc))
|
||||
logger.info(r.stats)
|
||||
if r.rc == 0:
|
||||
git.commit_repository(config)
|
||||
git.commit_repository(config, f"Generated files for cluster '{cluster.name}'")
|
||||
git.push_repository(config)
|
||||
notifications.send_webhook(config, "success", f"Cluster {cluster.name}: Completed Ansible bootstrap")
|
||||
else:
|
||||
|
@ -67,7 +67,7 @@ def init_database(config):
|
||||
(id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
cluster INTEGER NOT NULL,
|
||||
state TEXT NOT NULL,
|
||||
name TEXT UNIQUE NOT NULL,
|
||||
name TEXT NOT NULL,
|
||||
nodeid INTEGER NOT NULL,
|
||||
bmc_macaddr TEXT NOT NULL,
|
||||
bmc_ipaddr TEXT NOT NULL,
|
||||
|
@ -22,6 +22,7 @@
|
||||
import os.path
|
||||
import git
|
||||
import yaml
|
||||
from filelock import FileLock
|
||||
|
||||
import pvcbootstrapd.lib.notifications as notifications
|
||||
|
||||
@ -36,7 +37,7 @@ def init_repository(config):
|
||||
Clone the Ansible git repository
|
||||
"""
|
||||
try:
|
||||
git_ssh_cmd = f"ssh -i {config['ansible_keyfile']} -o StrictHostKeyChecking=no"
|
||||
git_ssh_cmd = f"ssh -i {config['ansible_key_file']} -o StrictHostKeyChecking=no"
|
||||
if not os.path.exists(config["ansible_path"]):
|
||||
print(
|
||||
f"First run: cloning repository {config['ansible_remote']} branch {config['ansible_branch']} to {config['ansible_path']}"
|
||||
@ -60,61 +61,67 @@ def pull_repository(config):
|
||||
"""
|
||||
Pull (with rebase) the Ansible git repository
|
||||
"""
|
||||
logger.info(f"Updating local configuration repository {config['ansible_path']}")
|
||||
try:
|
||||
git_ssh_cmd = f"ssh -i {config['ansible_keyfile']} -o StrictHostKeyChecking=no"
|
||||
g = git.cmd.Git(f"{config['ansible_path']}")
|
||||
g.pull(rebase=True, env=dict(GIT_SSH_COMMAND=git_ssh_cmd))
|
||||
g.submodule("update", "--init", env=dict(GIT_SSH_COMMAND=git_ssh_cmd))
|
||||
except Exception as e:
|
||||
logger.warn(e)
|
||||
notifications.send_webhook(config, "failure", "Failed to update Git repository")
|
||||
with FileLock(config['ansible_lock_file']):
|
||||
logger.info(f"Updating local configuration repository {config['ansible_path']}")
|
||||
try:
|
||||
git_ssh_cmd = f"ssh -i {config['ansible_key_file']} -o StrictHostKeyChecking=no"
|
||||
g = git.cmd.Git(f"{config['ansible_path']}")
|
||||
logger.debug("Performing git pull")
|
||||
g.pull(rebase=True, env=dict(GIT_SSH_COMMAND=git_ssh_cmd))
|
||||
logger.debug("Performing git submodule update")
|
||||
g.submodule("update", "--init", env=dict(GIT_SSH_COMMAND=git_ssh_cmd))
|
||||
except Exception as e:
|
||||
logger.warn(e)
|
||||
notifications.send_webhook(config, "failure", "Failed to update Git repository")
|
||||
logger.info("Completed repository synchronization")
|
||||
|
||||
|
||||
def commit_repository(config):
|
||||
def commit_repository(config, message="Generic commit"):
|
||||
"""
|
||||
Commit uncommitted changes to the Ansible git repository
|
||||
"""
|
||||
logger.info(
|
||||
f"Committing changes to local configuration repository {config['ansible_path']}"
|
||||
)
|
||||
|
||||
try:
|
||||
g = git.cmd.Git(f"{config['ansible_path']}")
|
||||
g.add("--all")
|
||||
commit_env = {
|
||||
"GIT_COMMITTER_NAME": "PVC Bootstrap",
|
||||
"GIT_COMMITTER_EMAIL": "git@pvcbootstrapd",
|
||||
}
|
||||
g.commit(
|
||||
"-m",
|
||||
"Automated commit from PVC Bootstrap Ansible subsystem",
|
||||
author="PVC Bootstrap <git@pvcbootstrapd>",
|
||||
env=commit_env,
|
||||
with FileLock(config['ansible_lock_file']):
|
||||
logger.info(
|
||||
f"Committing changes to local configuration repository {config['ansible_path']}"
|
||||
)
|
||||
notifications.send_webhook(config, "success", "Successfully committed to Git repository")
|
||||
except Exception as e:
|
||||
logger.warn(e)
|
||||
notifications.send_webhook(config, "failure", "Failed to commit to Git repository")
|
||||
try:
|
||||
g = git.cmd.Git(f"{config['ansible_path']}")
|
||||
g.add("--all")
|
||||
commit_env = {
|
||||
"GIT_COMMITTER_NAME": "PVC Bootstrap",
|
||||
"GIT_COMMITTER_EMAIL": "git@pvcbootstrapd",
|
||||
}
|
||||
g.commit(
|
||||
"-m",
|
||||
"Automated commit from PVC Bootstrap Ansible subsystem",
|
||||
"-m",
|
||||
message,
|
||||
author="PVC Bootstrap <git@pvcbootstrapd>",
|
||||
env=commit_env,
|
||||
)
|
||||
notifications.send_webhook(config, "success", "Successfully committed to Git repository")
|
||||
except Exception as e:
|
||||
logger.warn(e)
|
||||
notifications.send_webhook(config, "failure", "Failed to commit to Git repository")
|
||||
|
||||
|
||||
def push_repository(config):
|
||||
"""
|
||||
Push changes to the default remote
|
||||
"""
|
||||
logger.info(
|
||||
f"Pushing changes from local configuration repository {config['ansible_path']}"
|
||||
)
|
||||
|
||||
try:
|
||||
git_ssh_cmd = f"ssh -i {config['ansible_keyfile']} -o StrictHostKeyChecking=no"
|
||||
g = git.Repo(f"{config['ansible_path']}")
|
||||
origin = g.remote(name="origin")
|
||||
origin.push(env=dict(GIT_SSH_COMMAND=git_ssh_cmd))
|
||||
notifications.send_webhook(config, "success", "Successfully pushed Git repository")
|
||||
except Exception as e:
|
||||
logger.warn(e)
|
||||
notifications.send_webhook(config, "failure", "Failed to push Git repository")
|
||||
with FileLock(config['ansible_lock_file']):
|
||||
logger.info(
|
||||
f"Pushing changes from local configuration repository {config['ansible_path']}"
|
||||
)
|
||||
try:
|
||||
git_ssh_cmd = f"ssh -i {config['ansible_key_file']} -o StrictHostKeyChecking=no"
|
||||
g = git.Repo(f"{config['ansible_path']}")
|
||||
origin = g.remote(name="origin")
|
||||
origin.push(env=dict(GIT_SSH_COMMAND=git_ssh_cmd))
|
||||
notifications.send_webhook(config, "success", "Successfully pushed Git repository")
|
||||
except Exception as e:
|
||||
logger.warn(e)
|
||||
notifications.send_webhook(config, "failure", "Failed to push Git repository")
|
||||
|
||||
|
||||
def load_cspec_yaml(config):
|
||||
|
@ -43,7 +43,7 @@ def run_paramiko(config, node_address):
|
||||
ssh_client.connect(
|
||||
hostname=node_address,
|
||||
username=config["deploy_username"],
|
||||
key_filename=config["ansible_keyfile"],
|
||||
key_filename=config["ansible_key_file"],
|
||||
)
|
||||
yield ssh_client
|
||||
ssh_client.close()
|
||||
@ -69,6 +69,7 @@ def run_hook_osddb(config, targets, args):
|
||||
stdin, stdout, stderr = c.exec_command(pvc_cmd_string)
|
||||
logger.debug(stdout.readlines())
|
||||
logger.debug(stderr.readlines())
|
||||
return stdout.channel.recv_exit_status()
|
||||
|
||||
|
||||
def run_hook_osd(config, targets, args):
|
||||
@ -98,6 +99,7 @@ def run_hook_osd(config, targets, args):
|
||||
stdin, stdout, stderr = c.exec_command(pvc_cmd_string)
|
||||
logger.debug(stdout.readlines())
|
||||
logger.debug(stderr.readlines())
|
||||
return stdout.channel.recv_exit_status()
|
||||
|
||||
|
||||
def run_hook_pool(config, targets, args):
|
||||
@ -127,7 +129,7 @@ def run_hook_pool(config, targets, args):
|
||||
logger.debug(stderr.readlines())
|
||||
|
||||
# This only runs once on whatever the first node is
|
||||
break
|
||||
return stdout.channel.recv_exit_status()
|
||||
|
||||
|
||||
def run_hook_network(config, targets, args):
|
||||
@ -191,7 +193,7 @@ def run_hook_network(config, targets, args):
|
||||
logger.debug(stderr.readlines())
|
||||
|
||||
# This only runs once on whatever the first node is
|
||||
break
|
||||
return stdout.channel.recv_exit_status()
|
||||
|
||||
|
||||
def run_hook_copy(config, targets, args):
|
||||
@ -217,11 +219,14 @@ def run_hook_copy(config, targets, args):
|
||||
tc.chmod(dfile, int(dmode, 8))
|
||||
tc.close()
|
||||
|
||||
return 0
|
||||
|
||||
|
||||
def run_hook_script(config, targets, args):
|
||||
"""
|
||||
Run a script on the targets
|
||||
"""
|
||||
return_status = 0
|
||||
for node in targets:
|
||||
node_name = node.name
|
||||
node_address = node.host_ipaddr
|
||||
@ -272,6 +277,10 @@ def run_hook_script(config, targets, args):
|
||||
stdin, stdout, stderr = c.exec_command(remote_command)
|
||||
logger.debug(stdout.readlines())
|
||||
logger.debug(stderr.readlines())
|
||||
if stdout.channel.recv_exit_status() != 0:
|
||||
return_status = stdout.channel.recv_exit_status()
|
||||
|
||||
return return_status
|
||||
|
||||
|
||||
def run_hook_webhook(config, targets, args):
|
||||
@ -345,7 +354,9 @@ def run_hooks(config, cspec, cluster, nodes):
|
||||
# Run the hook function
|
||||
try:
|
||||
notifications.send_webhook(config, "begin", f"Cluster {cluster.name}: Running hook task '{hook_name}'")
|
||||
hook_functions[hook_type](config, target_nodes, hook_args)
|
||||
retcode = hook_functions[hook_type](config, target_nodes, hook_args)
|
||||
if retcode > 0:
|
||||
raise Exception(f"Hook returned with code {retcode}")
|
||||
notifications.send_webhook(config, "success", f"Cluster {cluster.name}: Completed hook task '{hook_name}'")
|
||||
except Exception as e:
|
||||
logger.warning(f"Error running hook: {e}")
|
||||
|
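The hook changes above make each hook function return an exit status, which `run_hooks()` then checks and converts into an exception on failure. A minimal sketch of that contract, outside the daemon, might look like the following. The two hook functions and the `run_hook` wrapper are hypothetical stand-ins; the real hooks run commands over SSH and send webhooks around the call.

```python
import logging

logger = logging.getLogger("hooks-sketch")


def hook_ok(config, targets, args):
    """Hypothetical hook that succeeds."""
    return 0


def hook_fail(config, targets, args):
    """Hypothetical hook whose remote command exited with status 2."""
    return 2


hook_functions = {"ok": hook_ok, "fail": hook_fail}


def run_hook(config, hook_type, targets, args):
    """Run one hook; return True on success, False on any failure."""
    try:
        retcode = hook_functions[hook_type](config, targets, args)
        if retcode > 0:
            raise Exception(f"Hook returned with code {retcode}")
        return True
    except Exception as e:
        logger.warning(f"Error running hook: {e}")
        return False
```

One consequence of this design worth keeping in mind: in Python 3, a hook function that falls through without returning an integer yields `None`, and `None > 0` raises a `TypeError`, which a broad `except Exception` would then report as a hook failure.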
@ -84,3 +84,16 @@ def set_boot_state(config, cspec, data, state):
|
||||
db.update_node_state(config, cspec_cluster, cspec_hostname, state)
|
||||
node = db.get_node(config, cspec_cluster, name=cspec_hostname)
|
||||
logger.debug(node)
|
||||
|
||||
|
||||
def set_completed(config, cspec, cluster):
|
||||
nodes = list()
|
||||
for bmc_macaddr in cspec["bootstrap"]:
|
||||
if cspec["bootstrap"][bmc_macaddr]["node"]["cluster"] == cluster:
|
||||
nodes.append(cspec["bootstrap"][bmc_macaddr])
|
||||
for node in nodes:
|
||||
cspec_cluster = node["node"]["cluster"]
|
||||
cspec_hostname = node["node"]["hostname"]
|
||||
db.update_node_state(config, cspec_cluster, cspec_hostname, "completed")
|
||||
node = db.get_node(config, cspec_cluster, name=cspec_hostname)
|
||||
logger.debug(node)
|
||||
|
@ -66,8 +66,8 @@ def add_preseed(config, cspec_node, host_macaddr, system_drive_target):
|
||||
|
||||
# We use the dhcp_address here to allow the listen_address to be 0.0.0.0
|
||||
rendered = template.render(
|
||||
debrelease=cspec_node.get("config", {}).get("release"),
|
||||
debmirror=cspec_node.get("config", {}).get("mirror"),
|
||||
debrelease=config.get("repo_release"),
|
||||
debmirror=config.get("repo_mirror"),
|
||||
addpkglist=add_packages,
|
||||
filesystem=cspec_node.get("config", {}).get("filesystem"),
|
||||
skip_blockcheck=False,
|
||||
|
@ -50,24 +50,38 @@ def dnsmasq_checkin(config, data):
|
||||
)
|
||||
cspec = git.load_cspec_yaml(config)
|
||||
is_in_bootstrap_map = True if data["macaddr"] in cspec["bootstrap"] else False
|
||||
if is_in_bootstrap_map:
|
||||
notifications.send_webhook(config, "info", f"New host checkin from MAC {data['macaddr']} as host {cspec['bootstrap'][data['macaddr']]['node']['fqdn']} in cluster {cspec['bootstrap'][data['macaddr']]['node']['cluster']}")
|
||||
if (
|
||||
cspec["bootstrap"][data["macaddr"]]["bmc"].get("redfish", None)
|
||||
is not None
|
||||
):
|
||||
if cspec["bootstrap"][data["macaddr"]]["bmc"]["redfish"]:
|
||||
is_redfish = True
|
||||
else:
|
||||
is_redfish = False
|
||||
try:
|
||||
if is_in_bootstrap_map:
|
||||
cspec_cluster = cspec["bootstrap"][data["macaddr"]]["node"]["cluster"]
|
||||
is_registered = True if data["macaddr"] in [x.bmc_macaddr for x in db.get_nodes_in_cluster(config, cspec_cluster)] else False
|
||||
else:
|
||||
is_redfish = redfish.check_redfish(config, data)
|
||||
is_registered = False
|
||||
except Exception:
|
||||
is_registered = False
|
||||
|
||||
logger.info(f"Is device '{data['macaddr']}' Redfish capable? {is_redfish}")
|
||||
if is_redfish:
|
||||
redfish.redfish_init(config, cspec, data)
|
||||
else:
|
||||
if not is_in_bootstrap_map:
|
||||
logger.warn(f"Device '{data['macaddr']}' not in bootstrap map; ignoring.")
|
||||
return
|
||||
|
||||
if is_registered:
|
||||
logger.info(f"Device '{data['macaddr']}' has already been bootstrapped; ignoring.")
|
||||
return
|
||||
|
||||
notifications.send_webhook(config, "info", f"New host checkin from MAC {data['macaddr']} as host {cspec['bootstrap'][data['macaddr']]['node']['fqdn']} in cluster {cspec['bootstrap'][data['macaddr']]['node']['cluster']}")
|
||||
if (
|
||||
cspec["bootstrap"][data["macaddr"]]["bmc"].get("redfish", None)
|
||||
is not None
|
||||
):
|
||||
if cspec["bootstrap"][data["macaddr"]]["bmc"]["redfish"]:
|
||||
is_redfish = True
|
||||
else:
|
||||
is_redfish = False
|
||||
else:
|
||||
is_redfish = redfish.check_redfish(config, data)
|
||||
|
||||
logger.info(f"Is device '{data['macaddr']}' Redfish capable? {is_redfish}")
|
||||
if is_redfish:
|
||||
redfish.redfish_init(config, cspec, data)
|
||||
|
||||
return
|
||||
|
||||
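The reworked `dnsmasq_checkin()` flow above reduces to two guard checks before any Redfish work begins: ignore MAC addresses that are not in the bootstrap map, and ignore hosts already registered in the database. A minimal sketch of that decision, using plain collections in place of the cspec and database lookups, is shown below. The function name `checkin_action` and its return strings are hypothetical.

```python
def checkin_action(macaddr: str, bootstrap_map: dict, registered_macs: set) -> str:
    """Decide what to do with a DHCP check-in from a BMC MAC address."""
    if macaddr not in bootstrap_map:
        # Unknown device: not part of any cluster specification.
        return "ignore: not in bootstrap map"
    if macaddr in registered_macs:
        # Known device that has already been through bootstrap.
        return "ignore: already bootstrapped"
    # Known, unregistered device: continue to the Redfish capability check.
    return "proceed"
```

Ordering the checks this way means the database is only consulted for devices that are actually part of a cluster specification.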
@ -140,11 +154,9 @@ def host_checkin(config, data):
|
||||
|
||||
hooks.run_hooks(config, cspec, cluster, ready_nodes)
|
||||
|
||||
target_state = "completed"
|
||||
for node in all_nodes:
|
||||
host.set_boot_state(config, cspec, data, target_state)
|
||||
host.set_completed(config, cspec, cspec_cluster)
|
||||
|
||||
# Hosts will now power down ready for real activation in production
|
||||
sleep(60)
|
||||
sleep(300)
|
||||
cluster = db.update_cluster_state(config, cspec_cluster, "completed")
|
||||
notifications.send_webhook(config, "completed", f"Cluster {cspec_cluster}: PVC bootstrap deployment completed")
|
||||
|
@ -715,8 +715,8 @@ def redfish_init(config, cspec, data):
|
||||
cspec_hostname = cspec_node["node"]["hostname"]
|
||||
cspec_fqdn = cspec_node["node"]["fqdn"]
|
||||
|
||||
logger.info("Waiting 60 seconds for system normalization")
|
||||
sleep(60)
|
||||
logger.info("Waiting 30 seconds for system normalization")
|
||||
sleep(30)
|
||||
|
||||
notifications.send_webhook(config, "begin", f"Cluster {cspec_cluster}: Beginning Redfish initialization of host {cspec_fqdn}")
|
||||
|
||||
@ -748,10 +748,11 @@ def redfish_init(config, cspec, data):
|
||||
return
|
||||
notifications.send_webhook(config, "success", f"Cluster {cspec_cluster}: Logged in to Redfish for host {cspec_fqdn} at {bmc_host}")
|
||||
|
||||
logger.info("Waiting 60 seconds for system normalization")
|
||||
sleep(60)
|
||||
logger.info("Waiting 30 seconds for system normalization")
|
||||
sleep(30)
|
||||
|
||||
logger.info("Characterizing node...")
|
||||
notifications.send_webhook(config, "begin", f"Cluster {cspec_cluster}: Beginning Redfish characterization of host {cspec_fqdn} at {bmc_host}")
|
||||
try:
|
||||
|
||||
# Get Redfish bases
|
||||
@ -791,24 +792,29 @@ def redfish_init(config, cspec, data):
|
||||
try:
|
||||
ethernet_root = system_detail["EthernetInterfaces"]["@odata.id"].rstrip("/")
|
||||
ethernet_detail = session.get(ethernet_root)
|
||||
logger.debug(f"Found Ethernet detail: {ethernet_detail}")
|
||||
embedded_ethernet_detail_members = [e for e in ethernet_detail["Members"] if "Embedded" in e["@odata.id"]]
|
||||
embedded_ethernet_detail_members.sort(key = lambda k: k["@odata.id"])
|
||||
logger.debug(f"Found Ethernet members: {embedded_ethernet_detail_members}")
|
||||
first_interface_root = embedded_ethernet_detail_members[0]["@odata.id"].rstrip("/")
|
||||
first_interface_detail = session.get(first_interface_root)
|
||||
# Something went wrong, so fall back
|
||||
except KeyError:
|
||||
except Exception:
|
||||
first_interface_detail = dict()
|
||||
|
||||
logger.debug(f"First interface detail: {first_interface_detail}")
|
||||
logger.debug(f"HostCorrelation detail: {system_detail.get('HostCorrelation', {})}")
|
||||
# Try to get the MAC address directly from the interface detail (Redfish standard)
|
||||
logger.debug("Try to get the MAC address directly from the interface detail (Redfish standard)")
|
||||
if first_interface_detail.get("MACAddress") is not None:
|
||||
logger.debug("Try to get the MAC address directly from the interface detail (Redfish standard)")
|
||||
bootstrap_mac_address = first_interface_detail["MACAddress"].strip().lower()
|
||||
# Try to get the MAC address from the HostCorrelation->HostMACAddress (HP DL360x G8)
|
||||
elif len(system_detail.get("HostCorrelation", {}).get("HostMACAddress", [])) > 0:
|
||||
logger.debug("Try to get the MAC address from the HostCorrelation (HP iLO)")
|
||||
bootstrap_mac_address = (
|
||||
system_detail["HostCorrelation"]["HostMACAddress"][0].strip().lower()
|
||||
)
|
||||
# We can't find it, so use a dummy value
|
||||
# We can't find it, so abort
|
||||
else:
|
||||
logger.error("Could not find a valid MAC address for the bootstrap interface.")
|
||||
return
|
||||
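The MAC address lookup in this hunk tries the standard Redfish `MACAddress` field of the first embedded interface, then falls back to the `HostCorrelation` data, and now aborts if neither is present. The fallback chain can be sketched as a small pure function. The name `find_bootstrap_mac` is hypothetical, and the dictionaries stand in for the parsed Redfish responses.

```python
from typing import Optional


def find_bootstrap_mac(first_interface_detail: dict, system_detail: dict) -> Optional[str]:
    """Return the normalized bootstrap MAC address, or None if not found."""
    # Preferred source: the interface detail itself (Redfish standard).
    mac = first_interface_detail.get("MACAddress")
    if mac is not None:
        return mac.strip().lower()

    # Fallback: HostCorrelation -> HostMACAddress list.
    host_macs = system_detail.get("HostCorrelation", {}).get("HostMACAddress", [])
    if len(host_macs) > 0:
        return host_macs[0].strip().lower()

    # Nothing usable; the caller is expected to log an error and abort.
    return None
```

Returning `None` rather than a dummy value mirrors the change from "use a dummy value" to "abort" in the hunk above.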
@ -877,43 +883,43 @@ def redfish_init(config, cspec, data):
|
||||
return
|
||||
|
||||
# Adjust any BIOS settings
|
||||
logger.info("Adjusting BIOS settings...")
|
||||
try:
|
||||
bios_root = system_detail.get("Bios", {}).get("@odata.id")
|
||||
if bios_root is not None:
|
||||
bios_detail = session.get(bios_root)
|
||||
bios_attributes = list(bios_detail["Attributes"].keys())
|
||||
for setting, value in cspec_node["bmc"].get("bios_settings", {}).items():
|
||||
if setting not in bios_attributes:
|
||||
continue
|
||||
|
||||
payload = {"Attributes": {setting: value}}
|
||||
session.patch(f"{bios_root}/Settings", payload)
|
||||
except Exception as e:
|
||||
notifications.send_webhook(config, "failure", f"Cluster {cspec_cluster}: Failed to set BIOS settings for host {cspec_fqdn} at {bmc_host}. Check pvcbootstrapd logs and reset this host's BMC to retry.")
|
||||
logger.error(f"Cluster {cspec_cluster}: Failed to set BIOS settings for host {cspec_fqdn} at {bmc_host}: {e}")
|
||||
logger.error("Aborting Redfish configuration; reset BMC to retry.")
|
||||
del session
|
||||
return
|
||||
if len(cspec_node["bmc"].get("bios_settings", {}).items()) > 0:
|
||||
logger.info("Adjusting BIOS settings...")
|
||||
try:
|
||||
bios_root = system_detail.get("Bios", {}).get("@odata.id")
|
||||
if bios_root is not None:
|
||||
bios_detail = session.get(bios_root)
|
||||
bios_attributes = list(bios_detail["Attributes"].keys())
|
||||
for setting, value in cspec_node["bmc"].get("bios_settings", {}).items():
|
||||
if setting not in bios_attributes:
|
||||
continue
|
||||
payload = {"Attributes": {setting: value}}
|
||||
session.patch(f"{bios_root}/Settings", payload)
|
||||
except Exception as e:
|
||||
notifications.send_webhook(config, "failure", f"Cluster {cspec_cluster}: Failed to set BIOS settings for host {cspec_fqdn} at {bmc_host}. Check pvcbootstrapd logs and reset this host's BMC to retry.")
|
||||
logger.error(f"Cluster {cspec_cluster}: Failed to set BIOS settings for host {cspec_fqdn} at {bmc_host}: {e}")
|
||||
logger.error("Aborting Redfish configuration; reset BMC to retry.")
|
||||
del session
|
||||
return
|
||||
|
||||
# Adjust any Manager settings
|
||||
logger.info("Adjusting Manager settings...")
|
||||
try:
|
||||
mgrattribute_root = f"{manager_root}/Attributes"
|
||||
mgrattribute_detail = session.get(mgrattribute_root)
|
||||
mgrattribute_attributes = list(mgrattribute_detail["Attributes"].keys())
|
||||
for setting, value in cspec_node["bmc"].get("manager_settings", {}).items():
|
||||
if setting not in mgrattribute_attributes:
|
||||
continue
|
||||
|
||||
payload = {"Attributes": {setting: value}}
|
||||
session.patch(mgrattribute_root, payload)
|
||||
except Exception as e:
|
||||
notifications.send_webhook(config, "failure", f"Cluster {cspec_cluster}: Failed to set BMC settings for host {cspec_fqdn} at {bmc_host}. Check pvcbootstrapd logs and reset this host's BMC to retry.")
|
||||
logger.error(f"Cluster {cspec_cluster}: Failed to set BMC settings for host {cspec_fqdn} at {bmc_host}: {e}")
|
||||
logger.error("Aborting Redfish configuration; reset BMC to retry.")
|
||||
del session
|
||||
return
|
||||
if len(cspec_node["bmc"].get("manager_settings", {}).items()) > 0:
|
||||
logger.info("Adjusting Manager settings...")
|
||||
try:
|
||||
mgrattribute_root = f"{manager_root}/Attributes"
|
||||
mgrattribute_detail = session.get(mgrattribute_root)
|
||||
mgrattribute_attributes = list(mgrattribute_detail["Attributes"].keys())
|
||||
for setting, value in cspec_node["bmc"].get("manager_settings", {}).items():
|
||||
if setting not in mgrattribute_attributes:
|
||||
continue
|
||||
payload = {"Attributes": {setting: value}}
|
||||
session.patch(mgrattribute_root, payload)
|
||||
except Exception as e:
|
||||
notifications.send_webhook(config, "failure", f"Cluster {cspec_cluster}: Failed to set BMC settings for host {cspec_fqdn} at {bmc_host}. Check pvcbootstrapd logs and reset this host's BMC to retry.")
|
||||
logger.error(f"Cluster {cspec_cluster}: Failed to set BMC settings for host {cspec_fqdn} at {bmc_host}: {e}")
|
||||
logger.error("Aborting Redfish configuration; reset BMC to retry.")
|
||||
del session
|
||||
return
|
||||
|
||||
# Set boot override to Pxe for the installer boot
|
||||
logger.info("Setting temporary PXE boot...")
|
||||
@ -952,7 +958,7 @@ def redfish_init(config, cspec, data):
|
||||
node = db.get_node(config, cspec_cluster, name=cspec_hostname)
|
||||
|
||||
# Graceful shutdown of the machine
|
||||
notifications.send_webhook(config, "info", f"Cluster {cspec_cluster}: Powering off host {cspec_fqdn}")
|
||||
notifications.send_webhook(config, "info", f"Cluster {cspec_cluster}: Shutting down host {cspec_fqdn}")
|
||||
set_power_state(session, system_root, redfish_vendor, "GracefulShutdown")
|
||||
system_power_state = "On"
|
||||
while system_power_state != "Off":
|
||||
@ -964,6 +970,8 @@ def redfish_init(config, cspec, data):
|
||||
# Turn off the indicator to indicate bootstrap has completed
|
||||
set_indicator_state(session, system_root, redfish_vendor, "off")
|
||||
|
||||
notifications.send_webhook(config, "success", f"Cluster {cspec_cluster}: Powered off host {cspec_fqdn}")
|
||||
|
||||
# We must delete the session
|
||||
del session
|
||||
return
|
||||
|
@ -21,16 +21,18 @@
|
||||
|
||||
import os.path
|
||||
import shutil
|
||||
from subprocess import run
|
||||
|
||||
import pvcbootstrapd.lib.notifications as notifications
|
||||
|
||||
|
||||
def build_tftp_repository(config):
|
||||
# Generate an installer config
|
||||
build_cmd = f"{config['ansible_path']}/pvc-installer/buildpxe.sh -o {config['tftp_root_path']} -u {config['deploy_username']}"
|
||||
print(f"Building TFTP contents via pvc-installer command: {build_cmd}")
|
||||
notifications.send_webhook(config, "begin", f"Building TFTP contents via pvc-installer command: {build_cmd}")
|
||||
os.system(build_cmd)
|
||||
build_cmd = [ f"{config['ansible_path']}/pvc-installer/buildpxe.sh", "-o", config['tftp_root_path'], "-u", config['deploy_username'], "-m", config["repo_mirror"] ]
|
||||
print(f"Building TFTP contents via pvc-installer command: {' '.join(build_cmd)}")
|
||||
notifications.send_webhook(config, "begin", f"Building TFTP contents via pvc-installer command: {' '.join(build_cmd)}")
|
||||
ret = run(build_cmd)
|
||||
return True if ret.returncode == 0 else False
|
||||
|
||||
|
||||
def init_tftp(config):
|
||||
@ -43,8 +45,13 @@ def init_tftp(config):
|
||||
os.makedirs(config["tftp_root_path"])
|
||||
os.makedirs(config["tftp_host_path"])
|
||||
shutil.copyfile(
|
||||
f"{config['ansible_keyfile']}.pub", f"{config['tftp_root_path']}/keys.txt"
|
||||
f"{config['ansible_key_file']}.pub", f"{config['tftp_root_path']}/keys.txt"
|
||||
)
|
||||
|
||||
build_tftp_repository(config)
|
||||
notifications.send_webhook(config, "success", "First run: successfully initialized TFTP root and contents")
|
||||
result = build_tftp_repository(config)
|
||||
if result:
|
||||
print("First run: successfully initialized TFTP root and contents")
|
||||
notifications.send_webhook(config, "success", "First run: successfully initialized TFTP root and contents")
|
||||
else:
|
||||
print("First run: failed to initialize TFTP root and contents; see logs above")
|
||||
notifications.send_webhook(config, "failure", "First run: failed to initialize TFTP root and contents; check pvcbootstrapd logs")
|
||||
|
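The TFTP build step above moves from `os.system()` with a formatted string to `subprocess.run()` with an argument list, and returns a boolean derived from the exit status. A minimal sketch of that pattern follows. The helper `run_build` is hypothetical, and the Python interpreter stands in for `buildpxe.sh` so the example can run anywhere.

```python
import subprocess
import sys


def run_build(build_cmd: list) -> bool:
    """Run a command given as an argument list; True if it exited with 0."""
    print(f"Running: {' '.join(build_cmd)}")
    # No shell is involved, so arguments containing spaces or shell
    # metacharacters are passed to the program unchanged.
    ret = subprocess.run(build_cmd)
    return ret.returncode == 0


# Stand-in commands: one succeeds, one exits with status 3.
ok = run_build([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_build([sys.executable, "-c", "raise SystemExit(3)"])
```

Passing a list avoids quoting problems with values such as the mirror URL, and the boolean result lets the caller choose between a success and a failure notification.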
@ -95,12 +95,35 @@ if [[ -z ${deploy_username} ]]; then
|
||||
fi
|
||||
echo
|
||||
|
||||
echo "Please enter an upstream Debian mirror (hostname+directory without scheme) to use (e.g. ftp.debian.org/debian):"
|
||||
echo -n "[ftp.debian.org/debian] > "
|
||||
read upstream_mirror
|
||||
if [[ -z ${upstream_mirror} ]]; then
|
||||
upstream_mirror="ftp.debian.org/debian"
|
||||
fi
|
||||
echo
|
||||
|
||||
echo "Please enter the default Debian release for new clusters (e.g. 'bullseye', 'bookworm'):"
|
||||
echo -n "[bookworm] > "
|
||||
read debian_release
|
||||
if [[ -z ${debian_release} ]]; then
|
||||
debian_release="bookworm"
|
||||
fi
|
||||
echo
|
||||
|
||||
echo "Proceeding with setup!"
|
||||
echo
|
||||
|
||||
echo "Installing APT dependencies..."
|
||||
sudo apt-get update
|
||||
sudo apt-get install --yes vlan iptables dnsmasq redis python3 python3-pip python3-requests sqlite3 celery pxelinux syslinux-common live-build debootstrap uuid-runtime qemu-user-static
|
||||
sudo apt-get install --yes vlan iptables dnsmasq redis python3 python3-pip python3-requests sqlite3 celery pxelinux syslinux-common live-build debootstrap uuid-runtime qemu-user-static apt-cacher-ng
|
||||
|
||||
echo "Configuring apt-cacher-ng..."
|
||||
sudo systemctl enable --now apt-cacher-ng
|
||||
if ! grep -q ${upstream_mirror} /etc/apt-cacher-ng/backends_debian; then
|
||||
echo "http://${upstream_mirror}" | sudo tee /etc/apt-cacher-ng/backends_debian &>/dev/null
|
||||
sudo systemctl restart apt-cacher-ng
|
||||
fi
|
||||
|
||||
echo "Configuring dnsmasq..."
|
||||
sudo systemctl disable --now dnsmasq
|
||||
@ -131,6 +154,8 @@ sed -i "s|BOOTSTRAP_DHCPSTART|${bootstrap_dhcpstart}|" ${root_directory}/pvcboot
|
||||
sed -i "s|BOOTSTRAP_DHCPEND|${bootstrap_dhcpend}|" ${root_directory}/pvcbootstrapd/pvcbootstrapd.yaml
|
||||
sed -i "s|GIT_REMOTE|${git_remote}|" ${root_directory}/pvcbootstrapd/pvcbootstrapd.yaml
|
||||
sed -i "s|GIT_BRANCH|${git_branch}|" ${root_directory}/pvcbootstrapd/pvcbootstrapd.yaml
|
||||
sed -i "s|UPSTREAM_MIRROR|${upstream_mirror}|" ${root_directory}/pvcbootstrapd/pvcbootstrapd.yaml
|
||||
sed -i "s|DEBIAN_RELEASE|${debian_release}|" ${root_directory}/pvcbootstrapd/pvcbootstrapd.yaml
|
||||
|
||||
echo "Creating network configuration for interface ${bootstrap_interface} (is vLAN? ${is_bootstrap_interface_vlan})..."
|
||||
if [[ "${is_bootstrap_interface_vlan}" == "yes" ]]; then
|
||||
|