Part 4/4 - Deploying Openshift/OKD 4.5 on Proxmox VE Homelab
This is the last part of a 4-part series on Running Openshift at Home. Some information here references the previous parts of the series. You can find links to the previous posts below.
Part 1/4 - Homelab Hardware
Part 2/4 - Building a Silent Server Cabinet
Part 3/4 - Installing a Two-Node Proxmox VE Cluster
The installation process uses OKD 4.5, the upstream community project of Openshift. OKD is to Openshift what Fedora Linux is to Red Hat Enterprise Linux.
Openshift has two types of installation: IPI (installer-provisioned infrastructure), a fully automated deployment on known cloud providers such as AWS, Azure, and GCP, and UPI (user-provisioned infrastructure), a partially automated process, which is what we will cover here.
The installation is driven by a bootstrap process. A bootstrap machine is a temporary machine used by Openshift/Kubernetes to host the services required during bootstrapping. The bootstrap machine creates an etcd cluster and starts a temporary set of Kubernetes control-plane services. The master machines then join the etcd cluster through Ignition, and the Kubernetes services are transferred from the bootstrap machine to the master nodes as soon as they become ready. In the last step of the bootstrap process, the bootstrap machine is removed from the etcd cluster. At this point, the bootstrap machine can be shut down and deleted forever.
Though the bootstrap process is automatic, the preparation for the installation has to be done manually. Note that 4.6, which was released a couple of days ago, adds support for automated installation on bare-metal infrastructure using IPMI/BMC. We will not cover that here.
Infrastructure
As described in the previous post, my homelab infrastructure looks like this.
The servers are running Proxmox Virtualization Environment, an open-source hypervisor. I also have a physical router and a physical DNS server. We will configure these devices as well for the OKD bootstrap process to work. You will need a good amount of RAM on the host to run the following configuration. In Proxmox, we can over-provision RAM, so even if the total RAM of the VMs is 100GB, the setup should run if you have at least 64GB of RAM available on the host. In my case, the total RAM usage after installing a 5-node Openshift was around 56GB.
Virtual Machines
For a 5-node Openshift/OKD cluster you will need to spin up 6x Fedora CoreOS VMs and 1x CentOS 8 VM, assuming you have a physical router and an external DNS server. Otherwise, you may also run your router and DNS server in VMs, but this will eat up even more RAM on your host.
Start by creating the following VMs in Proxmox VE as detailed in the following sections and take note of their MAC addresses after creation. We will use the table below as a reference for VM creation and DHCP address reservation configuration.
Download the OS Images
Download the latest Fedora Core OS installer from https://getfedora.org/en/coreos/download. Select the Bare Metal & Virtualized Tab. Download the Bare Metal ISO package.
Upload the installer to the local storage of the Proxmox node where you will create the VMs (just in case you have multiple nodes).
Create the VMs
From the Proxmox VE web interface, right-click on the node and select Create VM. Name the VM according to the table above, starting with okd4-bootstrap.
Select the Fedora Core OS image we uploaded earlier.
Leave the system tab with default values. Proxmox VE has already pre-selected the optimum setting for the selected Guest OS type. Set the size of the disk according to the table above.
Select 4 cores in the CPU tab as per the table above. Leave the rest of the settings at their defaults unless you know what you are doing.
Set the memory to 16GB (16384 MiB).
If you followed the above instructions correctly, you should see the following values in the confirmation screen. Then just click finish.
After the VM is created, Proxmox will generate a MAC address for the virtual network interface card. Take note of the MAC address. Create a table similar to the one above, but with a MAC Addresses column. You will need this later.
Repeat the above procedure for the rest of the VMs. Take note that the last VM in the table, okd4-services, is a CentOS 8 VM. You need to download the CentOS 8 installer ISO and upload it to Proxmox local storage. The latest CentOS 8 release can be downloaded here: http://isoredirect.centos.org/centos/8/isos/x86_64.
Download the one that ends with dvd1.iso.
Upload this file to the local storage of Proxmox VE and create the okd4-services VM as per the above procedure. Note that this VM only gets 4GB of RAM, not 16GB.
You should have the following list of VMs at the end.
DHCP Address Reservation
Using the list of MAC Addresses of the VMs created earlier, we need to assign IP addresses to these MAC Addresses through DHCP Address reservation. Depending on the router, the process may be slightly different.
In my case, I have an ASUS router, and this is what the address reservation looks like: just a table of MAC addresses and their pre-assigned IP addresses.
When the VMs are started, they will get these IP Addresses via DHCP.
If you do not have a physical router or don't want to use your home router, you can run a pfSense router in a VM and put the above VMs behind it. You then need to configure the same DHCP address reservations there. We will also need to revisit this router configuration after setting up a DNS server.
OKD Services
The okd4-services VM will run several things required to install and run Openshift/OKD.
- DNS Server - If you do not have an external/Raspberry Pi DNS server
- HA Proxy - Load Balancer
- Apache Web Server (httpd) - to host the OS images and ignition files during PXE booting. This service can be stopped after the installation.
- NFS server - to be used as Persistent Volume by Openshift Image Registry
Start the okd4-services VM and navigate to its console to go through the CentOS installer.
In the Installation Destination option, select Custom, then Done.
Delete the /home partition and leave the desired capacity for / empty so it uses all the remaining space.
Select Network & Host Name. Enable the Ethernet adapter, set the hostname, and tick "Connect automatically".
Then click Begin Installation and set the root password.
After installation is complete, run the following to add the EPEL repository to DNF/yum and update the OS.
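A minimal sketch of the commands on CentOS 8 (the original post's exact commands may differ slightly):

```bash
# Add the EPEL repository and bring the OS up to date
sudo dnf install -y epel-release
sudo dnf update -y
sudo reboot
```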
From this point on, we will do the rest of the installation and configuration from this VM. SSH to this new VM.
Create a DNS Server
Install git and clone the git repo https://github.com/rossbrigoli/okd4_files
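A rough sketch of the DNS setup, assuming the cloned repository provides a named.conf and zone files for your cluster domain (the file names below are illustrative; use whatever the repo actually ships):

```bash
# Install Git and BIND, then clone the repository with the config files
sudo dnf install -y git bind bind-utils
git clone https://github.com/rossbrigoli/okd4_files.git
cd okd4_files

# Copy the BIND configuration and zone files into place
# (adjust the file names to match the repo layout)
sudo cp named.conf /etc/named.conf
sudo cp *.zone /var/named/

# Open the DNS port and start the name server
sudo firewall-cmd --permanent --add-service=dns
sudo firewall-cmd --reload
sudo systemctl enable --now named
```

Remember to point your router's DHCP/DNS settings at this server (or at your external DNS server) so the VMs can resolve the cluster names.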
Create a Load Balancer
Copy the HAProxy configuration file to /etc/haproxy and then start the HAProxy service.
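Roughly, assuming you are still inside the cloned okd4_files directory and it contains the haproxy.cfg (note that HAProxy is not installed by default on CentOS 8):

```bash
# Install HAProxy
sudo dnf install -y haproxy

# Copy the load balancer configuration from the cloned repo
sudo cp haproxy.cfg /etc/haproxy/haproxy.cfg

# Allow HAProxy to bind to the non-standard OpenShift ports under SELinux
sudo setsebool -P haproxy_connect_any 1

# Start and enable the service
sudo systemctl enable --now haproxy
sudo systemctl status haproxy
```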
Open TCP ports for Openshift/etcd clustering.
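For reference, these are the ports the load balancer listens on (6443 for the Kubernetes API, 22623 for the machine config server, 80/443 for the ingress routers); a sketch using firewalld:

```bash
# Open the load balancer ports on the okd4-services VM
sudo firewall-cmd --permanent --add-port=6443/tcp     # Kubernetes API
sudo firewall-cmd --permanent --add-port=22623/tcp    # machine config server
sudo firewall-cmd --permanent --add-service=http      # ingress HTTP
sudo firewall-cmd --permanent --add-service=https     # ingress HTTPS
sudo firewall-cmd --reload
```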
Serve the Installation Files Over HTTP
Now we need to install the Apache web server (httpd). It will host the files we need to PXE-boot the nodes, served on HTTP port 8080.
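A sketch of the web server setup; port 8080 is used so it does not clash with the HAProxy frontends already bound to 80 and 443:

```bash
# Install Apache and make it listen on port 8080 instead of 80
sudo dnf install -y httpd
sudo sed -i 's/Listen 80/Listen 0.0.0.0:8080/' /etc/httpd/conf/httpd.conf

# Open the port and start the service
sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --reload
sudo systemctl enable --now httpd

# Quick test - should return the Apache welcome page
curl localhost:8080
```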
A quick recap of what we just did.
- We created the VMs we need for the cluster
- We configured a router/DHCP server
- We installed Centos 8 on okd4-service VM
- We configured a DNS server (named) on the okd4-services VM
- We created a load balancer using HA Proxy running in okd4-services VM
- We created an Apache webserver to host installation files in okd4-services VM
Installing OKD
Now that the infrastructure is ready, it's time to install OKD, the open-source upstream project of Openshift. The latest OKD release can be found at https://github.com/openshift/okd/releases. Update the links below accordingly to get the latest version.
Download the Openshift installer and the OC client. Extract the downloaded file and move the extracted binaries to /usr/local/bin. SSH to okd4-services VM and execute the following.
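A sketch of those steps; the version string below is only an example, so substitute the latest release tag from the OKD releases page:

```bash
# Example release tag - replace with the latest from https://github.com/openshift/okd/releases
VERSION=4.5.0-0.okd-2020-10-15-235428

# Download the client and the installer
wget https://github.com/openshift/okd/releases/download/$VERSION/openshift-client-linux-$VERSION.tar.gz
wget https://github.com/openshift/okd/releases/download/$VERSION/openshift-install-linux-$VERSION.tar.gz

# Extract and move the binaries onto the PATH
tar -xzf openshift-client-linux-$VERSION.tar.gz
tar -xzf openshift-install-linux-$VERSION.tar.gz
sudo mv kubectl oc openshift-install /usr/local/bin/

# Verify
oc version
openshift-install version
```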
You may also check the status of the release builds at https://origin-release.apps.ci.l2s4.p1.openshiftapps.com/.
Setup the Openshift Installer
If you haven't done so, create an SSH key without a passphrase. We will provide the public key in install-config.yaml so that we can log in to the nodes without password prompts.
Your SSH public key is usually located at ~/.ssh/id_rsa.pub.
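If you need to generate one, something like this works (empty passphrase so the installer and nodes can be accessed non-interactively):

```bash
# Generate an SSH key pair with no passphrase
ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa

# This is the public key you will paste into install-config.yaml
cat ~/.ssh/id_rsa.pub
```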
Create an installation directory and copy the installer config file into it. We will use this directory to hold the files generated by the openshift-install command.
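Roughly, assuming the config file ships in the cloned repo as install-config.yaml and the working directory is named install_dir (both names are assumptions; adjust to your layout):

```bash
# Create a working directory for the installer-generated files
mkdir ~/install_dir

# Copy the config file from the cloned repo; keep the original as a backup
# because the installer consumes (deletes) it when generating manifests
cp ~/okd4_files/install-config.yaml ~/install_dir/
```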
Get a Red Hat pull secret by logging in to https://cloud.redhat.com. Navigate to Cluster Manager > Create Cluster > Red Hat Openshift Container Platform > Run on Baremetal > User-Provisioned Infrastructure > Copy pull secret.
Edit install-config.yaml. Replace the value of the pullSecret field with your pull secret, or leave it as is if you don't have a Red Hat pull secret. Then replace the value of the sshKey field with your SSH public key.
The last two lines of your install-config.yaml file should look like this.
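Something along these lines (the values shown are placeholders, not real credentials):

```bash
tail -n 2 ~/install_dir/install-config.yaml
# pullSecret: '{"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}'
# sshKey: 'ssh-rsa AAAAB3Nza...your-public-key... root@okd4-services'
```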
Generate the Ignition Files
Run the installer to generate the manifests and ignition files, then host those files in httpd.
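A sketch of the commands, assuming the working directory is ~/install_dir and the web root is /var/www/html:

```bash
# Generate the Kubernetes manifests first (this consumes install-config.yaml,
# which is why we kept a backup copy in the repo directory)
openshift-install create manifests --dir=$HOME/install_dir

# Optionally keep user workloads off the master nodes
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' \
  $HOME/install_dir/manifests/cluster-scheduler-02-config.yml

# Generate the ignition files from the manifests
openshift-install create ignition-configs --dir=$HOME/install_dir

# Host the ignition files (and auth files) under the okd4 directory of the web server
sudo mkdir -p /var/www/html/okd4
sudo cp -R $HOME/install_dir/* /var/www/html/okd4/
sudo chown -R apache:apache /var/www/html/
sudo chmod -R 755 /var/www/html/

# Quick test
curl localhost:8080/okd4/metadata.json
```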
Download the Fedora CoreOS image and signature. Rename them with shorter names and host the image and signature files in httpd under the okd4 directory.
The latest Fedora Core OS release is available at https://getfedora.org/coreos/download?tab=cloud_launchable&stream=stable. Update the wget links accordingly to get the latest versions.
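For example (the version in the file names below is only an example; copy the current "Bare Metal raw" links from the FCOS download page):

```bash
# Example version - replace with the latest stable bare-metal raw image
FCOS_VER=32.20200923.3.0
BASE=https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/$FCOS_VER/x86_64

wget $BASE/fedora-coreos-$FCOS_VER-metal.x86_64.raw.xz
wget $BASE/fedora-coreos-$FCOS_VER-metal.x86_64.raw.xz.sig

# Rename to something short enough to type at the boot prompt
sudo mv fedora-coreos-$FCOS_VER-metal.x86_64.raw.xz /var/www/html/okd4/fcos.raw.xz
sudo mv fedora-coreos-$FCOS_VER-metal.x86_64.raw.xz.sig /var/www/html/okd4/fcos.raw.xz.sig
sudo chown -R apache:apache /var/www/html/
```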
Starting the VMs
Now that the ignition files are generated, it's time to start the VMs. Select the okd4-bootstrap VM and navigate to Console. Start the VM. When you see the Fedora CoreOS startup screen, press TAB on the keyboard to edit the boot options. This initiates the PXE boot (booting over the network). On the kernel command line at the bottom of the screen, append the following arguments. This will install Fedora CoreOS to the /dev/sda disk using the image file and the ignition file we hosted over HTTP in the earlier steps.
Bootstrap Node
In the console screen, it should look like this.
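As a rough example, assuming the okd4-services VM (running httpd) got the IP 192.168.1.210 from your DHCP reservation and the files were named as in the earlier steps, the appended arguments would look like this (typed on one line, separated by spaces):

```
coreos.inst.install_dev=/dev/sda
coreos.inst.image_url=http://192.168.1.210:8080/okd4/fcos.raw.xz
coreos.inst.ignition_url=http://192.168.1.210:8080/okd4/bootstrap.ign
```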
This step is painful if you make typos. Review what you typed before pressing Enter. You also cannot copy and paste the arguments here because Proxmox VE has no way of forwarding the clipboard to the VNC console session.
Master Nodes
Repeat the above step for all the other VMs, starting with the master nodes, okd4-control-plane-X. For the master nodes, change the ignition file name in the last argument to master.ign.
Worker Nodes
Repeat the above steps for the worker nodes, using worker.ign as the ignition file.
Bootstrap Progress
You can monitor the installation progress by running the following command.
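Assuming the same installation directory as before:

```bash
# Watch the bootstrap progress from the okd4-services VM
openshift-install wait-for bootstrap-complete --dir=$HOME/install_dir --log-level=debug
```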
Once the bootstrap process completes, you should see the following messages.
Removing the Bootstrap Node
You can now shut down/stop the okd4-bootstrap VM. Then we need to remove or comment out the bootstrap node from the load balancer so that API requests do not get routed to the bootstrap IP. Edit the /etc/haproxy/haproxy.cfg file and reload the HAProxy configuration.
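Assuming the backend entries in haproxy.cfg reference the bootstrap node by the name okd4-bootstrap (adjust the pattern to your config):

```bash
# Comment out the bootstrap server lines in the backend sections, then reload
sudo sed -i '/okd4-bootstrap/s/^/#/' /etc/haproxy/haproxy.cfg
sudo systemctl reload haproxy
```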
Approving Certificate Signing Requests
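While the worker nodes join the cluster, their certificate signing requests have to be approved manually. A quick way to do this from the okd4-services VM, using the kubeconfig generated by the installer:

```bash
# Use the kubeconfig created by the installer
export KUBECONFIG=$HOME/install_dir/auth/kubeconfig

# List pending CSRs
oc get csr

# Approve everything pending (repeat until all nodes appear in 'oc get nodes')
oc get csr -o name | xargs oc adm certificate approve

# Verify the nodes joined
oc get nodes
```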
Check the Status of the Cluster Operators
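Wait until all cluster operators report as available; for instance:

```bash
# All operators should eventually show AVAILABLE=True, PROGRESSING=False, DEGRADED=False
oc get clusteroperators

# Or let the installer confirm the whole cluster is up
openshift-install wait-for install-complete --dir=$HOME/install_dir
```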
The web console is available at https://console-openshift-console.apps.<cluster_name>.<base_domain>. You will have to bypass the SSL warning twice because both the web console and the OAuth domains use an invalid (self-signed) certificate. After ignoring the SSL warnings for both the web console and OAuth, you should see a login screen.
Create a Cluster Admin User
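The original commands are not reproduced here, but a typical way to do this is with an HTPasswd identity provider. A sketch, using a user name and password that match the login below:

```bash
# htpasswd lives in httpd-tools; install it if it is not already present
sudo dnf install -y httpd-tools

# Create an htpasswd file with the test user
htpasswd -c -B -b users.htpasswd testuser testpassword

# Store it as a secret in the openshift-config namespace
oc create secret generic htpass-secret \
  --from-file=htpasswd=users.htpasswd -n openshift-config

# Register an HTPasswd identity provider with the cluster OAuth config
cat <<EOF | oc apply -f -
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret
EOF
```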
Log in with the credentials testuser/testpassword. After authentication succeeds, we need to give this new user a cluster admin role. Note that the command below does not work before the first login.
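```bash
# Grant the cluster-admin role to the new user (run after the first login)
oc adm policy add-cluster-role-to-user cluster-admin testuser
```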
Now that we have a proper cluster admin user, we can delete the temporary kubeadmin user.
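```bash
# Remove the temporary kubeadmin user by deleting its secret
oc delete secrets kubeadmin -n kube-system
```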
You should not see the kube:admin login option the next time you log in.
Setup Image Registry
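The original steps are not shown here, but broadly the idea is to export an NFS share from the okd4-services VM, create a PersistentVolume backed by it, and point the image registry operator at it. A rough sketch, assuming the export path /mnt/data and the services VM IP 192.168.1.210 (both are assumptions; adjust to your setup):

```bash
# On okd4-services: export an NFS share for the registry
sudo dnf install -y nfs-utils
sudo mkdir -p /mnt/data
sudo chmod 775 /mnt/data
echo '/mnt/data *(rw,sync,no_wdelay,no_root_squash,insecure)' | sudo tee -a /etc/exports
sudo systemctl enable --now nfs-server
sudo firewall-cmd --permanent --add-service=nfs --add-service=mountd --add-service=rpc-bind
sudo firewall-cmd --reload

# Create a PersistentVolume backed by the NFS share
cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: registry-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.210
    path: /mnt/data
EOF

# Tell the registry operator to manage the registry and claim a volume
oc patch configs.imageregistry.operator.openshift.io cluster --type merge \
  --patch '{"spec":{"managementState":"Managed","storage":{"pvc":{"claim":""}}}}'
```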
Deploying our First Application
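As a quick smoke test, something like this deploys a sample application and exposes it through the ingress router (the project and application names are arbitrary examples):

```bash
# Create a project and deploy a well-known sample image
oc new-project demo
oc new-app --docker-image=openshift/hello-openshift --name=hello

# Expose it via a route and check that it responds
oc expose svc/hello
oc get route hello
curl http://$(oc get route hello -o jsonpath='{.spec.host}')
```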