Create a None cluster
This document explains how to create HostedClusters and NodePools using the 'None' platform to provision bare metal worker nodes.
HyperShift Operator requirements
- cluster-admin access to an OpenShift cluster (tested with 4.9.17) to deploy the CRDs + operator
- 1 x filesystem type Persistent Volume to store the etcd database for demo purposes (3x for 'production' environments)
Versions used
- OCP compact cluster (3 masters) version 4.9.17
- HyperShift Operator built from sources (Commit ID 0371f889)
Prerequisites: Building the HyperShift Operator
Currently, the HyperShift operator is deployed using the hypershift binary, which needs to be compiled manually.
RHEL8 doesn't include go1.18 officially, but it can be installed via gvm by following these steps:
# Install prerequisites
sudo dnf install -y curl git make bison gcc glibc-devel
git clone https://github.com/openshift/hypershift.git
pushd hypershift
# Install gvm to install go 1.18
bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
source ${HOME}/.gvm/scripts/gvm
gvm install go1.18
gvm use go1.18
# build the binary
make hypershift
popd
Then, the hypershift binary can be moved to a convenient place as:
sudo install -m 0755 -o root -g root hypershift/bin/hypershift /usr/local/bin/hypershift
Alternatively, it can be compiled using a container as:
# Install prerequisites
sudo dnf install podman -y
# Compile hypershift
mkdir -p ./tmp/ && \
podman run -it -v ${PWD}/tmp:/var/tmp/hypershift-bin/:Z --rm docker.io/golang:1.18 sh -c \
'git clone --depth 1 https://github.com/openshift/hypershift.git /var/tmp/hypershift/ && \
cd /var/tmp/hypershift && \
make hypershift && \
cp bin/hypershift /var/tmp/hypershift-bin/'
sudo install -m 0755 -o root -g root ./tmp/hypershift /usr/local/bin/hypershift
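Either way, a quick sanity check (assuming the binary was installed to /usr/local/bin as above) confirms the build works:
# Verify the binary is on the PATH and executes
command -v hypershift
hypershift --help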
WARNING: At the time of writing this document, some issues had already been fixed in HyperShift but unfortunately were not yet included in the latest release of the container image.
Prerequisite (optional): Create a custom HyperShift image
The official container image containing the HyperShift bits is hosted at quay.io/hypershift/hypershift, but if a custom HyperShift image is needed, the following steps can be performed:
QUAY_ACCOUNT='testuser'
podman login -u ${QUAY_ACCOUNT} -p testpassword quay.io
sudo dnf install -y curl git make bison gcc glibc-devel
git clone https://github.com/openshift/hypershift.git
pushd hypershift
# Install gvm to install go 1.18
bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
source ${HOME}/.gvm/scripts/gvm
gvm install go1.18 -B
gvm use go1.18
# Build the binaries and the container
make build
make RUNTIME=podman IMG=quay.io/${QUAY_ACCOUNT}/hypershift:latest docker-build docker-push
sudo install -m 0755 -o root -g root bin/hypershift /usr/local/bin/hypershift
popd
Deploy HyperShift
Once the binary is in place, the operator deployment is performed as:
hypershift install
Or if using a custom image:
hypershift install --hypershift-image quay.io/${QUAY_ACCOUNT}/hypershift:latest
Alternatively, hypershift install --render > hypershift-install.yaml will create a YAML file with all the assets required to deploy HyperShift, which can then be applied as:
oc apply -f ./hypershift-install.yaml
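To confirm the operator came up (it is deployed into the hypershift namespace by default; adjust if a different namespace was used):
# The operator pod should reach Running state before creating any HostedCluster
oc get pods -n hypershift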
Deploy a hosted cluster
There are two main CRDs that describe a hosted cluster:
- HostedCluster defines the control plane hosted in the management OpenShift cluster
- NodePool defines the nodes that will be created/attached to a hosted cluster
The hostedcluster.spec.platform field specifies the underlying infrastructure provider for the cluster and is used to configure platform-specific behavior, so it must be configured properly for the target environment.
In this repo we will cover the 'None' provider.
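As a quick check that the CRDs are installed in the management cluster (a minimal sketch; the exact CRD list depends on the HyperShift version):
# List the HyperShift CRDs, including HostedCluster and NodePool
oc get crd | grep hypershift.openshift.io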
Requirements
- Proper DNS entries for the workers (if the worker uses localhost it won't work)
- A DNS entry pointing api.${cluster}.${domain} to each of the nodes where the hosted cluster control plane will be running. This is because the hosted cluster API is exposed as a NodePort. For example:
api.hosted0.example.com. IN A 10.19.138.32
api.hosted0.example.com. IN A 10.19.138.33
api.hosted0.example.com. IN A 10.19.138.37
- A DNS entry pointing *.apps.${cluster}.${domain} to a load balancer deployed to redirect incoming traffic to the ingress pods (the OpenShift documentation provides some instructions about this). NOTE: This is not strictly required to deploy a sample cluster, but it is needed to access the routes exposed there. Alternatively, it can simply be an A record pointing to a worker IP where the ingress pods are running, combined with enabling the hostedcluster.spec.infrastructureAvailabilityPolicy: SingleReplica configuration parameter. (The DNS records can be verified as shown after this list.)
- Pull secret (available at cloud.redhat.com)
- SSH public key already available (it can be created as ssh-keygen -t rsa -f /tmp/sshkey -q -N "")
- Any httpd server available to host an ignition file (text) and a modified RHCOS ISO
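The DNS entries can be verified from the management cluster or any host on the network, for example (a sketch assuming dig from bind-utils is installed and using the hosted0/example.com names from the records above):
# api.<cluster>.<domain> should resolve to every node that will run the hosted control plane
dig +short api.hosted0.example.com
# *.apps.<cluster>.<domain> should resolve to the load balancer (or worker) fronting the ingress pods
dig +short test.apps.hosted0.example.com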
Procedure
- Create a file containing all the variables depending on the environment:
cat <<'EOF' > ./myvars
export CLUSTERS_NAMESPACE="clusters"
export HOSTED="hosted0"
export HOSTED_CLUSTER_NS="clusters-${HOSTED}"
export PULL_SECRET_NAME="${HOSTED}-pull-secret"
export MACHINE_CIDR="10.19.138.0/24"
export OCP_RELEASE_VERSION="4.9.17"
export OCP_ARCH="x86_64"
export BASEDOMAIN="example.com"
export PULL_SECRET_CONTENT=$(cat ~/clusterconfigs/pull-secret.txt)
export SSH_PUB=$(cat ~/.ssh/id_rsa.pub)
EOF
source ./myvars
- Create a namespace to host the HostedCluster and secrets
envsubst <<"EOF" | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: ${CLUSTERS_NAMESPACE}
EOF
export PS64=$(echo -n ${PULL_SECRET_CONTENT} | base64 -w0)
envsubst <<"EOF" | oc apply -f -
apiVersion: v1
data:
  .dockerconfigjson: ${PS64}
kind: Secret
metadata:
  name: ${PULL_SECRET_NAME}
  namespace: ${CLUSTERS_NAMESPACE}
type: kubernetes.io/dockerconfigjson
EOF
envsubst <<"EOF" | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: ${HOSTED}-ssh-key
  namespace: ${CLUSTERS_NAMESPACE}
stringData:
  id_rsa.pub: ${SSH_PUB}
EOF
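Before creating the cluster objects, it can be useful to verify that both secrets landed in the namespace:
# Both the pull secret and the ssh-key secret should be listed
oc get secrets -n ${CLUSTERS_NAMESPACE}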
- Create the hostedcluster and the nodepool objects:
envsubst <<"EOF" | oc apply -f -
apiVersion: hypershift.openshift.io/v1alpha1
kind: HostedCluster
metadata:
  name: ${HOSTED}
  namespace: ${CLUSTERS_NAMESPACE}
spec:
  release:
    image: "quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_VERSION}-${OCP_ARCH}"
  pullSecret:
    name: ${PULL_SECRET_NAME}
  sshKey:
    name: "${HOSTED}-ssh-key"
  networking:
    serviceCIDR: "172.31.0.0/16"
    podCIDR: "10.132.0.0/14"
    machineCIDR: "${MACHINE_CIDR}"
  platform:
    type: None
  infraID: ${HOSTED}
  dns:
    baseDomain: ${BASEDOMAIN}
  services:
  - service: APIServer
    servicePublishingStrategy:
      nodePort:
        address: api.${HOSTED}.${BASEDOMAIN}
      type: NodePort
  - service: OAuthServer
    servicePublishingStrategy:
      nodePort:
        address: api.${HOSTED}.${BASEDOMAIN}
      type: NodePort
  - service: OIDC
    servicePublishingStrategy:
      nodePort:
        address: api.${HOSTED}.${BASEDOMAIN}
      type: None
  - service: Konnectivity
    servicePublishingStrategy:
      nodePort:
        address: api.${HOSTED}.${BASEDOMAIN}
      type: NodePort
  - service: Ignition
    servicePublishingStrategy:
      nodePort:
        address: api.${HOSTED}.${BASEDOMAIN}
      type: NodePort
EOF
envsubst <<"EOF" | oc apply -f -
apiVersion: hypershift.openshift.io/v1alpha1
kind: NodePool
metadata:
  name: ${HOSTED}-workers
  namespace: ${CLUSTERS_NAMESPACE}
spec:
  clusterName: ${HOSTED}
  replicas: 0
  management:
    autoRepair: false
    upgradeType: Replace
  platform:
    type: None
  release:
    image: "quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_VERSION}-${OCP_ARCH}"
EOF
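The creation can be verified immediately as (a minimal check; detailed status conditions are available with -o yaml):
# Both objects should be listed; the node pool starts with 0 replicas
oc get hostedcluster,nodepool -n ${CLUSTERS_NAMESPACE}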
NOTE: The HostedCluster and NodePool objects can also be created using the hypershift binary as hypershift create cluster. See the hypershift create cluster -h output for more information.
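For reference, a roughly equivalent CLI invocation might look like the sketch below; the subcommand and flag names are assumptions based on recent hypershift versions and may differ for the commit used here, so check hypershift create cluster none -h before relying on them:
# Hypothetical example: create a similar HostedCluster/NodePool via the CLI
hypershift create cluster none \
  --name ${HOSTED} \
  --namespace ${CLUSTERS_NAMESPACE} \
  --base-domain ${BASEDOMAIN} \
  --pull-secret ~/clusterconfigs/pull-secret.txt \
  --ssh-key ~/.ssh/id_rsa.pub \
  --release-image quay.io/openshift-release-dev/ocp-release:${OCP_RELEASE_VERSION}-${OCP_ARCH} \
  --node-pool-replicas 0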
After a while, a number of pods will be created in the ${HOSTED_CLUSTER_NS} namespace. Those pods are the control plane of the hosted cluster.
oc get pods -n ${HOSTED_CLUSTER_NS}
NAME READY STATUS RESTARTS AGE
catalog-operator-54d47cbbdb-29mzf 2/2 Running 0 6m24s
certified-operators-catalog-78db79f86-6hlk9 1/1 Running 0 6m30s
cluster-api-655c8ff4fb-598zs 1/1 Running 1 (5m57s ago) 7m26s
cluster-autoscaler-86d9474fcf-rmwzr 0/1 Running 0 6m11s
cluster-policy-controller-bf87c9858-nnlgw 1/1 Running 0 6m37s
cluster-version-operator-ff9475794-dc9hf 2/2 Running 0 6m37s
community-operators-catalog-6f5797cdc4-2hlcp 1/1 Running 0 6m29s
control-plane-operator-749b94cf54-p2lg2 1/1 Running 0 7m23s
etcd-0 1/1 Running 0 6m46s
hosted-cluster-config-operator-6646d8f868-h9r2w 0/1 Running 0 6m34s
ignition-server-7797c5f7-vkb2b 1/1 Running 0 7m20s
ingress-operator-5dc47b99b7-jttpg 0/2 Init:0/1 0 6m35s
konnectivity-agent-85f979fcb4-67c5h 1/1 Running 0 6m45s
konnectivity-server-576dc7b8b7-rxgms 1/1 Running 0 6m46s
kube-apiserver-66d99fd9fb-dvslc 2/2 Running 0 6m43s
kube-controller-manager-68dd9fb75f-mgd22 1/1 Running 0 6m42s
kube-scheduler-748d9f5bcb-mlk52 0/1 Running 0 6m42s
machine-approver-c8c68ffb9-psc6n 0/1 Running 0 6m11s
oauth-openshift-7fc7dc9c66-fg258 1/1 Running 0 6m8s
olm-operator-54d7d78b89-f9dng 2/2 Running 0 6m22s
openshift-apiserver-64b4669d54-ffpw2 2/2 Running 0 6m41s
openshift-controller-manager-7847ddf4fb-x5659 1/1 Running 0 6m38s
openshift-oauth-apiserver-554c449b8f-lk97w 1/1 Running 0 6m41s
packageserver-6fd9f8479-pbvzl 0/2 Init:0/1 0 6m22s
redhat-marketplace-catalog-8cc88f5cb-hbxv9 1/1 Running 0 6m29s
redhat-operators-catalog-b749d6945-2bx8k 1/1 Running 0 6m29s
The hosted cluster's kubeconfig can be extracted as:
oc extract -n ${CLUSTERS_NAMESPACE} secret/${HOSTED}-admin-kubeconfig --to=- > ${HOSTED}-kubeconfig
oc get clusterversion --kubeconfig=${HOSTED}-kubeconfig
Adding a bare metal worker
- Download the RHCOS live ISO into the httpd server (for example into /var/www/html/hypershift-none/ on an Apache server hosted at www.example.com):
mkdir -p /var/www/html/hypershift-none/
curl -s -o /var/www/html/hypershift-none/rhcos-live.x86_64.iso https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.9/4.9.0/rhcos-live.x86_64.iso
- Download the ignition file generated by the hosted cluster:
IGNITION_ENDPOINT=$(oc get hc ${HOSTED} -n ${CLUSTERS_NAMESPACE} -o json | jq -r '.status.ignitionEndpoint')
IGNITION_TOKEN_SECRET=$(oc -n clusters-${HOSTED} get secret | grep token-${HOSTED} | awk '{print $1}')
set +x
IGNITION_TOKEN=$(oc -n clusters-${HOSTED} get secret ${IGNITION_TOKEN_SECRET} -o jsonpath={.data.token})
curl -s -k -H "Authorization: Bearer ${IGNITION_TOKEN}" https://${IGNITION_ENDPOINT}/ignition > /var/www/html/hypershift-none/worker.ign
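Optionally, a quick sanity check that the downloaded file is a valid ignition config (assuming jq is available on the httpd server):
# Should print the ignition spec version instead of a parse error
jq -r '.ignition.version' /var/www/html/hypershift-none/worker.ign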
- Modify the RHCOS live ISO to install the worker using that ignition file onto the /dev/sda device (your mileage may vary):
podman run --rm -it -v /var/www/html/hypershift-none/:/data:z --workdir /data \
  quay.io/coreos/coreos-installer:release iso customize \
  --live-karg-append="coreos.inst.ignition_url=http://www.example.com/hypershift-none/worker.ign coreos.inst.install_dev=/dev/sda" \
  -o ./rhcos.iso ./rhcos-live.x86_64.iso
podman run --rm -it -v /var/www/html/hypershift-none/:/data:z --workdir /data \
  quay.io/coreos/coreos-installer:release iso kargs show ./rhcos.iso
chmod a+r /var/www/html/hypershift-none/rhcos.iso
- (Optional) Check that the ISO can be downloaded:
curl -v -o rhcos.iso http://www.example.com/hypershift-none/rhcos.iso
- Attach the ISO to the BMC and boot from it once.
This step is highly dependent on the hardware used. In this example, using Dell hardware, the following steps can be performed, but your mileage may vary:
export IDRACIP=10.19.136.22
export IDRACUSER="root"
export IDRACPASS="calvin"
curl -s -L -k https://raw.githubusercontent.com/dell/iDRAC-Redfish-Scripting/master/Redfish%20Python/SetNextOneTimeBootVirtualMediaDeviceOemREDFISH.py -O
curl -s -L -k https://raw.githubusercontent.com/dell/iDRAC-Redfish-Scripting/master/Redfish%20Python/InsertEjectVirtualMediaREDFISH.py -O
curl -s -L -k https://raw.githubusercontent.com/dell/iDRAC-Redfish-Scripting/master/Redfish%20Python/GetSetPowerStateREDFISH.py -O
# Turn the server off
python3 ./GetSetPowerStateREDFISH.py -ip ${IDRACIP} -u ${IDRACUSER} -p ${IDRACPASS} -r Off
# Insert the ISO as virtual media
python3 ./InsertEjectVirtualMediaREDFISH.py -ip ${IDRACIP} -u ${IDRACUSER} -p ${IDRACPASS} -o 1 -d 1 -i http://www.example.com/hypershift-none/rhcos.iso
# Set boot once using the Virtual media previously attached
python3 ./SetNextOneTimeBootVirtualMediaDeviceOemREDFISH.py -ip ${IDRACIP} -u ${IDRACUSER} -p ${IDRACPASS} -d 1 -r y
After a while, the worker will be installed.
- Approve the pending CSRs:
oc get csr --kubeconfig=${HOSTED}-kubeconfig -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve --kubeconfig=${HOSTED}-kubeconfig
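The node typically requests a second (kubelet serving) CSR shortly after the first one is approved, so the approval may need to be repeated until no Pending requests remain:
# Re-check for pending CSRs; re-run the approve command above if any show up
oc get csr --kubeconfig=${HOSTED}-kubeconfig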
- Then the worker is added to the cluster:
oc get nodes --kubeconfig=${HOSTED}-kubeconfig
NAME STATUS ROLES AGE VERSION
kni1-worker-0.cloud.lab.eng.bos.redhat.com Ready worker 28m v1.22.3+e790d7f
oc get co --kubeconfig=${HOSTED}-kubeconfig
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
console 4.9.17 True False False 17m
csi-snapshot-controller 4.9.17 True False False 24m
dns 4.9.17 True False False 24m
image-registry 4.9.17 False False False 4m49s NodeCADaemonAvailable: The daemon set node-ca does not have available replicas...
ingress 4.9.17 True False False 14m
kube-apiserver 4.9.17 True False False 3h45m
kube-controller-manager 4.9.17 True False False 3h45m
kube-scheduler 4.9.17 True False False 3h45m
kube-storage-version-migrator 4.9.17 True False False 24m
monitoring False True True 9m Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
network 4.9.17 True False False 25m
node-tuning 4.9.17 True False False 24m
openshift-apiserver 4.9.17 True False False 3h45m
openshift-controller-manager 4.9.17 True False False 3h45m
openshift-samples 4.9.17 True False False 23m
operator-lifecycle-manager 4.9.17 True False False 3h45m
operator-lifecycle-manager-catalog 4.9.17 True False False 3h45m
operator-lifecycle-manager-packageserver 4.9.17 True False False 3h45m
service-ca 4.9.17 True False False 25m
storage 4.9.17 True False False 25m
oc get clusterversion --kubeconfig=${HOSTED}-kubeconfig
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version False True 3h46m Unable to apply 4.9.17: some cluster operators have not yet rolled out
NOTE: Some cluster operators are degraded because there is only a single worker and they require at least 2. However, setting the hostedcluster.spec.infrastructureAvailabilityPolicy: SingleReplica configuration parameter disables that requirement and makes the cluster operators available with a single worker (see the sketch below).
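A minimal sketch of enabling that parameter on an existing hosted cluster (assuming the field can still be changed at this point; otherwise set it in the HostedCluster manifest before creation):
# Switch the infrastructure availability policy to SingleReplica
oc patch hostedcluster ${HOSTED} -n ${CLUSTERS_NAMESPACE} --type merge \
  -p '{"spec":{"infrastructureAvailabilityPolicy":"SingleReplica"}}'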
- After adding 2 workers, the hosted cluster is fully available, as are all the cluster operators:
oc get hostedcluster -n clusters hosted0
NAME VERSION KUBECONFIG PROGRESS AVAILABLE REASON
hosted0 4.9.17 hosted0-admin-kubeconfig Completed True HostedClusterAsExpected
KUBECONFIG=./hosted0-kubeconfig oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
console 4.9.17 True False False 64m
csi-snapshot-controller 4.9.17 True False False 71m
dns 4.9.17 True False False 71m
image-registry 4.9.17 True False False 11m
ingress 4.9.17 True False False 61m
kube-apiserver 4.9.17 True False False 4h32m
kube-controller-manager 4.9.17 True False False 4h32m
kube-scheduler 4.9.17 True False False 4h32m
kube-storage-version-migrator 4.9.17 True False False 71m
monitoring 4.9.17 True False False 6m51s
network 4.9.17 True False False 72m
node-tuning 4.9.17 True False False 71m
openshift-apiserver 4.9.17 True False False 4h32m
openshift-controller-manager 4.9.17 True False False 4h32m
openshift-samples 4.9.17 True False False 70m
operator-lifecycle-manager 4.9.17 True False False 4h32m
operator-lifecycle-manager-catalog 4.9.17 True False False 4h32m
operator-lifecycle-manager-packageserver 4.9.17 True False False 4h32m
service-ca 4.9.17 True False False 72m
storage 4.9.17 True False False 72m
KUBECONFIG=./hosted0-kubeconfig oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.17 True False 8m42s Cluster version is 4.9.17