OpenShift (OKD) is a distribution of Kubernetes optimized for continuous application development and multi-tenant deployment. It adds developer and operations-centric tools on top of Kubernetes to enable rapid application development, easy deployment and scaling, and long-term lifecycle maintenance for small and large teams. It incorporates and extends Kubernetes with security and other integrated concepts. OKD is a sibling Kubernetes distribution to Red Hat OpenShift.
The purpose of this roadmap is to install a Private Cloud on a private server ("bare-metal"), that is, on a User-Provisioned Infrastructure (UPI). This Private Cloud will be provided entirely by OKD which, basically, is the open and free community version of OpenShift. Several additional features like Prometheus, Grafana and Alertmanager will be available out of the box. Cool, isn't it? 😎
[Ref(s).: https://www.okd.io/#v4 , https://docs.okd.io/4.9/monitoring/accessing-third-party-uis.html , https://thanos.io/tip/components/query.md/ , https://grafana.com/ , https://prometheus.io/ , https://docs.okd.io/latest/monitoring/managing-alerts.html ]
IMPORTANT: My life, my work and my passion is free software. We've come across people from Red Hat themselves using this roadmap to build infrastructure with OpenShift (OKD). Corrections, tweaks and improvements are very welcome (pull requests 😉)! Please consider giving us a ⭐, fork, support this project or even visit our professional profile (see About). Thanks! 🤗
Please consider making a donation through PayPal.
This is free software and you are equally free to specify any amount of money you want.
Support free software and my work! ❤️👨👩👧🐧
- IMPORTANT things to know before getting started and some facts about OpenShift (OKD) 4.X
- Cluster overview
- Create a "very private"/"very isolated" network on KVM (N_OKD_LAN) (HYPERVISOR)
- Create the OKD_SERVICES server
- Create the virtual machine (HYPERVISOR)
- Install operating system (OKD_SERVICES)
- Network configuration (OKD_SERVICES)
- Setup the gateway (OKD_SERVICES)
- Setup the DHCP service (ISC DHCP service) (OKD_SERVICES)
- Setup the DNS service (ISC BIND 9) (OKD_SERVICES)
- Setup the load balancer service (HAProxy) (OKD_SERVICES)
- Setup the NTP service (Chrony) (OKD_SERVICES)
- Setup the web server (Apache/httpd) (OKD_SERVICES)
- Setup the OpenShift (OKD) (installer and client) (OKD_SERVICES)
- Install the "libvirt" dependency
- Download the OpenShift (OKD) installer and the "oc" client
- Test the OpenShift (OKD) installer and client
- Setup the OpenShift (OKD) installer
- Setup the cluster's install directory
- Generate the Kubernetes manifests
- Create the ignition configuration files
- Add the ignition and the Fedora CoreOS/FCOS files to the web server
- Create the OKD_BOOTSTRAP, OKD_MASTER_Xs and OKD_WORKER_Xs nodes
- Create the virtual machines (OKD_BOOTSTRAP, OKD_MASTER_Xs and OKD_WORKER_Xs) (HYPERVISOR)
- Create the bootstrap node (OKD_BOOTSTRAP)
- Create the master nodes (OKD_MASTER_1, OKD_MASTER_2 and OKD_MASTER_3)
- Create the worker nodes (OKD_WORKER_1 and OKD_WORKER_2)
- Follow the bootstrap process evolution (OKD_SERVICES)
- Remove the bootstrap node (OKD_SERVICES) (HYPERVISOR)
- Finalize setup on the OKD_SERVICES server
- Access OpenShift (OKD) web resources behind a Nginx reverse proxy (NGINX_REVERSE_PROXY)
- Access OpenShift (OKD) using OpenLDAP (LDAP) as identity provider (OKD_SERVICES)
- Test the cluster (OKD_SERVICES)
- Seeking help and helping others
- KVM network - Create a "very private"/"very isolated" network
- KVM Nested Virtualization - Virtualization Support or Hardware-Assisted Virtualization for guests
- Setup Let's Encrypt Wildcard SSL Certificate with Nginx Reverse Proxy
- OpenLDAP (LDAP) and OpenShift (OKD) - Configuring OpenLDAP (LDAP) as identity provider for OpenShift (OKD)
- About
NOTE: The narratives described here - while entirely true - may not reflect the current state of OpenShift (OKD). We hope that this is no longer the reality. Okay, Red Hat (IBM)? 😉
This guide targets a User-Provisioned Infrastructure (UPI) built on top of KVM, but regarding KVM itself we will only cover the most crucial points or those that raise the most doubts in the process. However, with some adaptations we can use this guide with any hypervisor or even physical machines (real "bare-metal"). While we can't talk about "bare-metal" for a UPI built on a hypervisor, we'll agree to use this terminology just for the sake of convenience.
The virtual machines (guests) that will be created need access to Hardware Virtualization Support or Hardware-Assisted Virtualization - basically Intel's VT-x or AMD's AMD-V - as we are going to work with virtual machines that will perform some "Nested Virtualization". Therefore the hypervisor (host) in question needs to provide the virtual machines (guests) with access to the Hardware Virtualization Support.
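A quick way to see whether a machine exposes hardware virtualization is to look for the "vmx" (Intel VT-x) or "svm" (AMD-V) CPU flag. The sketch below is only an illustration: "has_hw_virt" is a hypothetical helper name, and it assumes a Linux system where the flags are listed in "/proc/cpuinfo".

```shell
#!/usr/bin/env bash
# Report whether a cpuinfo-style file advertises hardware virtualization.
# "vmx" is the Intel VT-x CPU flag and "svm" is the AMD-V CPU flag.
# has_hw_virt is a hypothetical helper name, not part of this guide's repository.
has_hw_virt() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    grep -Eqw 'vmx|svm' "$cpuinfo" 2>/dev/null
}

if has_hw_virt; then
    echo "Hardware virtualization flags found."
else
    echo "No vmx/svm flag found; virtualization support is probably not exposed."
fi
```

Run it on the hypervisor (host) first and later inside a guest; if the flag is missing inside the guest, Nested Virtualization is most likely not enabled for it.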
OpenShift (OKD) is a memory-consuming beast. It is a cluster built to run and manage other systems, presumably with good performance, and it includes a series of auxiliary services for the administration, control and management of these systems. To install and work, the cluster needs the minimum hardware requirements given in the table below ("Minimum hardware requirements..."). If you don't have these minimum requirements, stop here and don't waste your time. These resources are the bare minimum for the cluster to work; without them the cluster will not install. Without the required amount of memory the cluster cannot bring up all its necessary resources, and a myriad of errors will occur.
Another point that must be settled from the start is the base domain used by OpenShift (OKD), which must be the same "internally" (LAN) and "externally" (WAN). In other words, don't expect to install OpenShift (OKD) using the domain "myinternaldomain.org" (LAN) and access its web resources on the internet (WAN) via the domain "mywebdomain.net". Also note that the base domain used to install the cluster cannot be changed. We have already told the OpenShift (OKD) development team - including the development lead - that this is an important limitation and that it can become dramatic in the case of certain maintenance tasks or changes. Full information about this limitation can be found at "OpenShift/OKD cluster - Use with external/outside domain" ( okd-project/okd#716 (comment) ).
Keep in mind that using and running a cluster is something the vast majority of people are not used to dealing with. So, don't expect a cluster to work like the multithreaded and/or multiprocess systems that you might be used to. The cluster bootstrap process is a bit difficult and can fail even if you do all the setup correctly. Things are naturally more complex to handle and sometimes to use.
OpenShift (OKD) is a good solution, but despite all the Red Hat hype (Red Hat OpenShift, specifically), it still needs to mature and still has a way to go as free software, as a community and as a product (are all these absurd RAM requirements just for running the cluster really necessary, or is it simply a lack of cluster optimization?). Be aware that OpenShift (OKD) 4.X is not yet widely adopted by the open source community (help can be difficult to find) and is ultimately something centered on the Red Hat universe. So, if you look at Red Hat's open source policy for OpenShift (OKD) with a certain suspicion at this point, you're probably right... Finally, we don't usually trust or recommend open source products that aren't good and reliable free products widely adopted by free software communities. Therefore, if you want to venture out with OpenShift (OKD), do it at your own risk.
In our experience, OpenShift (OKD) showed constant instability and we were unable to use it in production. We don't know how this platform would behave with absurdly abundant, high-performance hardware, but in that scenario it would prove to be, compared to others, an expensive and inefficient solution. We also noticed that some crucial documentation and official help channels are vague, incomplete, or out of date (including Red Hat OpenShift). Information and help for any problem are very difficult to obtain.
This project seems to us disconnected from the general public and from the free software community, which is the mother cell of any serious project based largely on free software. The free software community cannot be seen as a nuisance or a problem; if it is seen as a problem, then there really is a problem. This is what experience teaches us and what is stated here https://www.redhat.com/en ("Red Hat - We make open source technologies for the enterprise"). After all, there is no reason to talk about open source if there is no involvement of free software communities. It would be nonsense.
We need to make it clear that our intention is not to disqualify the OpenShift (OKD) project or the work of the people involved in it, much less to foment conflicts. We want to present another vision. These conclusions were built through months of experience with this product. We are convinced that our vision does not reflect the vision of the few, but that of the many. It is up to the project leaders to define a good path for the project.
Installing OpenShift (OKD) on a UPI is a long process with a lot of pitfalls and details. You also need some knowledge of network infrastructure, among other things (DNS, DHCP, load balancing, etc.). Also note that hostnames and domains (DNS, basically) are critical for the functioning of the cluster, and if there is any error in these configurations the cluster will not even install.
Unfortunately, there is no easy way to deploy this thing and sometimes this process - without the correct information and without help from the community - is hard as hell... We made this guide because we found several points of difficulty in installing OpenShift (OKD), and the guides we found on the internet are complex and omit important and crucial information about this process.
So, stay on that track and don't look back! Let's work!
Minimum hardware requirements...
.--------------------------------------------------------------.
| 1 X BOOTSTRAP NODE   4~8 CPU  8192~16384 MB RAM   60 GB DISK |
| 3 X MASTER NODES     4~8 CPU  8192~16384 MB RAM   25 GB DISK |
| 2 X WORKER NODES     4~8 CPU  12288~16384 MB RAM  30 GB DISK |
| 1 X SERVICES SERVER  4 CPU    4096 MB RAM         60 GB DISK |
'--------------------------------------------------------------'
IMPORTANT: The more RAM the nodes have, the better the cluster will work. The amounts of RAM above are the strict minimum for the cluster to work.
Virtual machines...
.-----------------------------------------------------------------------------------------------.
| NAME           ROLE              OS             CPU   RAM  DISK  IP(OKD)    MAC(OKD)          |
| OKD_SERVICES   (DNS/DHCP/GW/LB/  CentOS 8       4[V]  4    60    10.3.0.3   52:54:00:3a:fd:a2 |
|                NTP/NFS/WEB/CIA)  (IP(INT) / MAC(INT) | 10.2.0.18 / 52:54:00:92:ce:78)         |
| OKD_MASTER_1   master            Fedora CoreOS  4[V]  8    25    10.3.0.4   52:54:00:7d:97:70 |
| OKD_MASTER_2   master            Fedora CoreOS  4[V]  8    25    10.3.0.5   52:54:00:6e:52:85 |
| OKD_MASTER_3   master            Fedora CoreOS  4[V]  8    25    10.3.0.6   52:54:00:a3:65:d9 |
| OKD_WORKER_1   worker            Fedora CoreOS  4[V]  12   30    10.3.0.12  52:54:00:e3:7c:fb |
| OKD_WORKER_2   worker            Fedora CoreOS  4[V]  12   30    10.3.0.13  52:54:00:20:ec:4f |
| OKD_BOOTSTRAP  bootstrap         Fedora CoreOS  4[V]  8    60    10.3.0.19  52:54:00:07:80:62 |
'-----------------------------------------------------------------------------------------------'
NOTES: In addition to the various tests we did to define the machine settings above we tried to base it on the default KubeInit settings (inventory file) and the "Quickstart Guide: Installing OpenShift Container Platform on Red Hat Virtualization" (page 5).
[Ref(s).: https://access.redhat.com/sites/default/files/attachments/quickstart_guide_for_installing_ocp_on_rhv_1.4.pdf (page 5), https://github.com/Kubeinit/kubeinit/blob/main/kubeinit/hosts/okd/inventory , okd-project/okd#152 (comment) ]
Hardware requirements and other information...
- TOTAL DISK - 60+25+25+25+30+30+60 = 255GB (195GB without OKD_BOOTSTRAP);
- TOTAL RAM - 8+8+8+8+12+12+4 = 60GB (52GB without OKD_BOOTSTRAP);
- RAM and DISK are in gigabytes (GB);
- CPU is a shared resource;
- V - Nested Virtualization required.
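If you change the size of any virtual machine, the totals above have to be recomputed. A minimal sketch of that arithmetic follows; the rows copy the RAM and DISK columns of the "Virtual machines..." table, while "vm_table" and "totals" are hypothetical helper names used only for this illustration.

```shell
#!/usr/bin/env bash
# Sum RAM and disk (both in GB) from "name ram disk" rows.
# The rows mirror the "Virtual machines..." table of this guide.
vm_table() {
    cat << 'EOF'
OKD_SERVICES 4 60
OKD_MASTER_1 8 25
OKD_MASTER_2 8 25
OKD_MASTER_3 8 25
OKD_WORKER_1 12 30
OKD_WORKER_2 12 30
OKD_BOOTSTRAP 8 60
EOF
}

# Print "<total_ram> <total_disk>", optionally skipping one VM by name.
totals() {
    local skip="${1:-}"
    vm_table | awk -v skip="$skip" \
        '$1 != skip { ram += $2; disk += $3 } END { print ram, disk }'
}

echo "With OKD_BOOTSTRAP (RAM DISK):    $(totals)"
echo "Without OKD_BOOTSTRAP (RAM DISK): $(totals OKD_BOOTSTRAP)"
```

Edit the rows in "vm_table" to match your own sizing and compare the result with the memory and disk actually available on the hypervisor (host).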
NOTE: The bootstrap node (OKD_BOOTSTRAP) is only used during the OKD installation and will be destroyed at the end of the installation.
[Ref(s).: https://docs.openshift.com/container-platform/4.2/architecture/architecture-installation.html ]
Some acronyms...
- DNS - Domain Name System;
- DHCP - Dynamic Host Configuration Protocol;
- GW - Gateway;
- LB - Load Balancing;
- NTP - Network Time Protocol;
- NFS - Network File System;
- WEB - Web server;
- CIA - Cluster Installation and Administration.
NOTE: The first IP ("10.3.0.1") is by default reserved for the (KVM) hypervisor.
IMPORTANT: There are defined names for the services server and the nodes in the DNS, DHCP and LB settings. We do not recommend modifying these names. However, if you can't resist changing these names, just change the names referring to the services server and nodes.
Network configuration X tags used in this documentation...
.--------------------------------------------------------------------------------------------------.
| 10.3.0.3 -- 52:54:00:3a:fd:a2 X <OKD_LAN_24>.<OKD_SERVICES_LST_OCT> ---- <OKD_SERVICES_MAC> |
| 10.2.0.18 - 52:54:00:92:ce:78 X <INT_LAN_24>.<OKD_SERVICES_IL_LST_OCT> - <OKD_SERVICES_IL_MAC> |
| 10.3.0.4 -- 52:54:00:7d:97:70 X <OKD_LAN_24>.<OKD_MASTER_1_LST_OCT> ---- <OKD_MASTER_1_MAC> |
| 10.3.0.5 -- 52:54:00:6e:52:85 X <OKD_LAN_24>.<OKD_MASTER_2_LST_OCT> ---- <OKD_MASTER_2_MAC> |
| 10.3.0.6 -- 52:54:00:a3:65:d9 X <OKD_LAN_24>.<OKD_MASTER_3_LST_OCT> ---- <OKD_MASTER_3_MAC> |
| 10.3.0.12 - 52:54:00:e3:7c:fb X <OKD_LAN_24>.<OKD_WORKER_1_LST_OCT> ---- <OKD_WORKER_1_MAC> |
| 10.3.0.13 - 52:54:00:20:ec:4f X <OKD_LAN_24>.<OKD_WORKER_2_LST_OCT> ---- <OKD_WORKER_2_MAC> |
| 10.3.0.19 - 52:54:00:07:80:62 X <OKD_LAN_24>.<OKD_BOOTSTRAP_LST_OCT> --- <OKD_BOOTSTRAP_MAC> |
'--------------------------------------------------------------------------------------------------'
- "_LST_OCT" - Last Octet;
- "_MAC" - MAC address;
- "_IL" - Internet LAN.
Network layout...
.---------------------------------------------------------------------------.
|          N_INT_LAN(WAN)(R_DHCP) (10.2.0.0/24)(<INT_LAN_24>.0/24)          |
|                                     ↕                                     |
|                              I_INT_LAN(WAN)                               |
|                     V_OKD_SERVICES(R_DHCP)(R_GATEWAY)                     |
|                              I_OKD_LAN(LAN)                               |
|                                     ↕                                     |
|              N_OKD_LAN(LAN) (10.3.0.0/24)(<OKD_LAN_24>.0/24)              |
|                                     ↕                                     |
|                    ..................................                     |
|                    ↕                ↕               ↕                     |
|             V_OKD_BOOTSTRAP  V_OKD_MASTER_1  V_OKD_WORKER_1               |
|                              V_OKD_MASTER_2  V_OKD_WORKER_2               |
|                              V_OKD_MASTER_3                               |
'---------------------------------------------------------------------------'
- N - Network;
- R - Network Resource;
- I - Network Interface Controller;
- V - Virtual Machine.
[Ref(s).: https://www.alt-codes.net/arrow_alt_codes.php ]
NOTES:
- The N_INT_LAN is (normally) also the "default" network and N_OKD_LAN is also the "okd_network" network;
- The N_INT_LAN is (normally) a NAT network with communication with the internet (WAN) and the hypervisor (host);
- The N_OKD_LAN is a private/isolated network without communication with the internet (WAN) and the hypervisor (host). All external communication will be done through the gateway established in the OKD_SERVICES via the network N_INT_LAN.
The cluster will use a private/isolated network without communication with the internet (WAN) and the hypervisor (host). All external communication will be done through the gateway (OKD_SERVICES).
The procedures required to create this network on the hypervisor (host) are in the "KVM network - Create a "very private"/"very isolated" network" section.
The OKD_SERVICES server provides DNS, DHCP, gateway, load balancing, NTP, NFS, web server and cluster installation and administration.
Download a CentOS 8 ISO...
NOTE: This resource can be found here https://www.centos.org/download/ in the "CentOS Linux" section, "8 (XXXX)" tab and "x86_64" link.
MODEL
wget http://<MIRROR_URL>/centos/8/isos/x86_64/CentOS-<LAST_CENTOS8_VER>-x86_64-boot.iso
EXAMPLE
wget http://mirrors.mit.edu/centos/8/isos/x86_64/CentOS-8.4.2105-x86_64-boot.iso
TIPS:
- Choose the "boot" version;
- Choose a mirror close to your geographic location for better download performance.
[Ref(s).: https://serverfault.com/a/1011659/276753 ]
Enable Nested Virtualization. For more details see the section "KVM Nested Virtualization - Virtualization Support or Hardware-Assisted Virtualization for guests".
It must have access to N_INT_LAN ("default") and N_OKD_LAN ("okd_network") networks. See "Network layout..." for more details.
Installation can be done according to the following...
- "WELCOME TO CENTOS LINUX 8" screen.
- In "What language would you like to use during the installation process?" make sure "English" is selected in the first list and "English (United States)" is selected in the second - or those of your choice, obviously.
- Click on "Continue".
- "INSTALLATION SUMMARY" screen.
- Click on "Installation Destination" ("SYSTEM").
- Probably the disk available under "Local Standard Disks" will already be selected (marked with a "V").
- In "Storage Configuration" choose "Custom".
- Click on "Done".
- "MANUAL PARTITIONING" screen.
- Check that "New mount points will use the following partitioning scheme:" is set to "LVM".
- Click on "Click here to create them automatically."
- Select the "/home" partition and click "-" to delete it.
- Select the "/" partition and, under "Desired Capacity:", enter double the disk capacity (in reality any value above its capacity).
- NOTE: The procedure above causes the installer to use all unoccupied disk space for the selected partition. [Ref(s).: https://docs.centos.org/en-US/8-docs/standard-install/assembly_graphical-installation/ ]
- Click on "Done".
- A message box ("SUMMARY OF CHANGES") will appear.
- Click on "Accept Changes".
- Click on "Network & Host Name" ("SYSTEM").
- Select the first network interface ("Ethernet (ens3)" probably) and click the "OFF" button so that it switches to "ON".
- NOTE: The interface above is the NIC ens3 (I_INT_LAN), which obtains its settings from the DHCP on the network N_INT_LAN (10.2.0.0/24)(<INT_LAN_24>.0/24); therefore perform the needed configuration in that DHCP. If you don't use DHCP then you will have to configure these settings locally (manually). See "Network layout..." for more details.
- Select the second network interface ("Ethernet (ens4)" probably) and click the "OFF" button so that it switches to "ON".
- Click on "Done".
- Click on "Software Selection" ("SOFTWARE").
- NOTE: It may be necessary to wait a while for the status of this option to change once the network interfaces have been enabled.
- Select "Minimal Install".
- Click on "Done".
- Click on "Keyboard" ("LOCALIZATION").
- NOTE: Do this step only if you need to adjust the layout of your keyboard.
- Click on "+".
- A message box ("ADD A KEYBOARD LAYOUT") will appear.
- Select "Portuguese (Brazil)" - or one of your choice, obviously.
- Click on "Add".
- Select "Portuguese (Brazil)" from the list and click "^" to put it as the first option - or one of your choice, obviously.
- Click on "Done".
- Click on "Root Password" ("USER SETTINGS").
- Enter the "Root Password:" and "Confirm:".
- Click on "Done" (twice if necessary).
- Click on "Begin Installation".
- When finished click on "Reboot System".
After the reboot, configure the OKD_SERVICES virtual machine to boot from its hard disk, that is, disable the ISO as a boot option.
Access the virtual machine's terminal.
Adjust the NIC ens3 (I_INT_LAN) to use a minimal setup...
MODEL
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
BOOTPROTO=dhcp
DEVICE=<I_INT_LAN>
IPV6INIT=no
ONBOOT=yes
ZONE=public
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/etc/sysconfig/network-scripts/ifcfg-<I_INT_LAN>'
EXAMPLE
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
BOOTPROTO=dhcp
DEVICE=ens3
IPV6INIT=no
ONBOOT=yes
ZONE=public
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/etc/sysconfig/network-scripts/ifcfg-ens3'
NOTE: If you haven't done it already, make the appropriate DHCP settings for the NIC ens3 (I_INT_LAN). If you don't use DHCP then you will have to configure these settings locally (manually). See "Network layout..." for more details.
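The "BEGIN"/"END" lines in the commands above are sentinels, not file content. As far as we can tell, they are there because "read" trims leading and trailing whitespace from the captured text, so the sentinels protect the first line's indentation and the final line break. The sketch below rehearses the idiom on a temporary file instead of a real "ifcfg-*" file; the "|| true" is our addition for scripts that run under "set -e".

```shell
#!/usr/bin/env bash
# Demonstrate the BEGIN/END heredoc idiom on a temporary file.
demo_file="$(mktemp)"

# "read -d ''" reads up to a NUL byte that never arrives, so it returns a
# non-zero status even though the variable is filled; "|| true" keeps this
# safe under "set -e".
read -r -d '' FILE_CONTENT << 'HEREDOC' || true
BEGIN
BOOTPROTO=dhcp
ONBOOT=yes
END
HEREDOC

# Offset 6 skips "BEGIN" plus its line break; length -3 drops the trailing
# "END". Negative lengths need bash 4.2 or newer.
echo -n "${FILE_CONTENT:6:-3}" > "$demo_file"

cat "$demo_file"
```

The temporary file ends up with only the two settings lines, each terminated by a line break.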
Define a hostname and a domain locally for the server...
MODEL
echo "127.0.0.1 okd-services.<YOUR_DOMAIN> okd-services" | tee -a /etc/hosts > /dev/null 2>&1
echo "HOSTNAME=okd-services" | tee -a /etc/sysconfig/network > /dev/null 2>&1
hostnamectl set-hostname "okd-services" --static
EXAMPLE
echo "127.0.0.1 okd-services.domain.abc okd-services" | tee -a /etc/hosts > /dev/null 2>&1
echo "HOSTNAME=okd-services" | tee -a /etc/sysconfig/network > /dev/null 2>&1
hostnamectl set-hostname "okd-services" --static
NOTE: If the N_INT_LAN ("default") network has a local DNS these settings may be different.
[Ref(s).: https://unix.stackexchange.com/a/239950/61742 ]
Restart the server...
reboot
dnf install -y epel-release
dnf update -y
This repository contains configuration files for OpenShift (OKD) (configuration/bootstrap files), ISC BIND 9 (DNS service), ISC DHCP Server (DHCP service), "chrony" (NTP clients) and HAProxy (load balancer service).
Clone the "okd_bare_metal" repository...
dnf install -y git-core
cd "/usr/local/src"
git clone https://github.com/eduardolucioac/okd_bare_metal.git
Open the "setup.bash" script file...
vi "/usr/local/src/okd_bare_metal/setup.bash"
... , configure the parameters in the "SETUP PARAMETERS" section according to your environment and to the guidelines given in the file...
[...]
# > -------------------
# SETUP PARAMETERS
# The domain for the OpenShift (OKD) cluster.
# IMPORTANT: The domain used to install the cluster CANNOT BE CHANGED! See documentation!
# By Questor
OKD_DOMAIN="domain.abc"
# First 3 octets of OpenShift (OKD) cluster network (forward and reverse).
OKD_LAN_24="10.3.0"
OKD_LAN_24_REVERSE="0.3.10"
# Last octet of the OKD_SERVICES server IP.
OKD_SERVICES_LST_OCT="3"
# Last octet of the OKD_BOOTSTRAP node IP and its MAC address.
OKD_BOOTSTRAP_LST_OCT="19"
OKD_BOOTSTRAP_MAC="52:54:00:07:80:62"
# Last octet of the OKD_MASTER_1 node IP and its MAC address.
OKD_MASTER_1_LST_OCT="4"
OKD_MASTER_1_MAC="52:54:00:7d:97:70"
# Last octet of the OKD_MASTER_2 node IP and its MAC address.
OKD_MASTER_2_LST_OCT="5"
OKD_MASTER_2_MAC="52:54:00:6e:52:85"
# Last octet of the OKD_MASTER_3 node IP and its MAC address.
OKD_MASTER_3_LST_OCT="6"
OKD_MASTER_3_MAC="52:54:00:a3:65:d9"
# Last octet of the OKD_WORKER_1 node IP and its MAC address.
OKD_WORKER_1_LST_OCT="12"
OKD_WORKER_1_MAC="52:54:00:e3:7c:fb"
# Last octet of the OKD_WORKER_2 node IP and its MAC address.
OKD_WORKER_2_LST_OCT="13"
OKD_WORKER_2_MAC="52:54:00:20:ec:4f"
# NOTES:
# I - In case you want to add new master or worker nodes, in the examples above we
# left a gap for 5 sequential IPs for new master nodes (last octets 7, 8, 9, 10 and
# 11) and a gap for 5 sequential IPs for new worker nodes (last octets 14, 15, 16,
# 17 and 18);
# II - All network settings refer to OpenShift (OKD) cluster network ([N]OKD_LAN).
# By Questor
# Available disk space (in GB) for OKD_SERVICES server minus 15. E.g.: 60-15=45.
OKD_SERVICES_STRG_SZ="45"
# < -------------------
[...]
... , execute the script file with this command...
cd "/usr/local/src/okd_bare_metal"
bash setup.bash
... and it will automatically configure all its resources.
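The "OKD_LAN_24_REVERSE" parameter is simply "OKD_LAN_24" with its three octets in reverse order, and every node IP is "OKD_LAN_24" followed by the node's last octet. A minimal sketch of both derivations is shown here; "reverse_lan_24" and "node_ip" are hypothetical helper names and are not part of "setup.bash".

```shell
#!/usr/bin/env bash
# reverse_lan_24 is a hypothetical helper: it turns the first three octets of
# a /24 network ("10.3.0") into the reversed form used by the
# OKD_LAN_24_REVERSE parameter ("0.3.10").
reverse_lan_24() {
    local o1 o2 o3
    IFS='.' read -r o1 o2 o3 <<< "$1"
    echo "${o3}.${o2}.${o1}"
}

# node_ip joins the network prefix and a last octet into a full address.
node_ip() {
    echo "${1}.${2}"
}

OKD_LAN_24="10.3.0"
echo "OKD_LAN_24_REVERSE=$(reverse_lan_24 "$OKD_LAN_24")"
echo "OKD_SERVICES IP=$(node_ip "$OKD_LAN_24" 3)"
```

This can help to double-check the two values when you use a network other than "10.3.0".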
TIP: To obtain the values for the "OKD_VM_NAME_MAC" parameters you can pre-create the virtual machines (guests), that is, just provision the hardware resources on the hypervisor (host) without installing anything. Alternatively, you can generate new ones at the URL https://miniwebtool.com/mac-address-generator/ using the MAC address prefix "52:54:00" (always the same for KVM), the MAC address format with ":" and the case "Lowercase". Then use these MAC addresses when creating the virtual machines (guests).
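MAC addresses with the "52:54:00" prefix can also be generated locally. This is only a sketch: "random_kvm_mac" is a hypothetical helper name, and it relies on bash's "$RANDOM", which is fine for a lab but does not guarantee uniqueness, so check the result against the MAC addresses already in use.

```shell
#!/usr/bin/env bash
# Print a random, lowercase MAC address with the "52:54:00" prefix.
# random_kvm_mac is a hypothetical helper name.
random_kvm_mac() {
    printf '52:54:00:%02x:%02x:%02x\n' \
        "$((RANDOM % 256))" "$((RANDOM % 256))" "$((RANDOM % 256))"
}

random_kvm_mac
```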
Since the N_OKD_LAN network does not allow outbound network traffic (WAN, internet), we need to configure OKD_SERVICES to work as a gateway for the servers that are on this network.
Enable "IP forwarding"...
tee "/etc/sysctl.d/ip_forward.conf" << EOF
net.ipv4.ip_forward=1
EOF
sysctl -w net.ipv4.ip_forward=1
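The file under "/etc/sysctl.d" makes the setting persistent across reboots, while "sysctl -w" applies it immediately. To confirm the live value, something like the sketch below can be used; "ip_forward_enabled" is a hypothetical helper name, and the default path assumes a Linux kernel exposing "/proc/sys".

```shell
#!/usr/bin/env bash
# Return success when the given file contains "1".
# The default path is the kernel's live IPv4 forwarding switch.
ip_forward_enabled() {
    local path="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ "$(cat "$path" 2>/dev/null)" = "1" ]
}

if ip_forward_enabled; then
    echo "IPv4 forwarding is enabled."
else
    echo "IPv4 forwarding is disabled."
fi
```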
Set up an outbound NAT gateway that sends traffic out through NIC ens3 (<I_INT_LAN>), masquerading the devices attached to NIC ens4 (<I_OKD_LAN>)...
MODEL
firewall-cmd --permanent --zone public --add-masquerade
firewall-cmd --permanent --direct --add-rule ipv4 nat POSTROUTING 0 -o <I_INT_LAN> -j MASQUERADE
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -i <I_OKD_LAN> -o <I_INT_LAN> -j ACCEPT
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -i <I_INT_LAN> -o <I_OKD_LAN> -m state --state RELATED,ESTABLISHED -j ACCEPT
firewall-cmd --reload
EXAMPLE
firewall-cmd --permanent --zone public --add-masquerade
firewall-cmd --permanent --direct --add-rule ipv4 nat POSTROUTING 0 -o ens3 -j MASQUERADE
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -i ens4 -o ens3 -j ACCEPT
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -i ens3 -o ens4 -m state --state RELATED,ESTABLISHED -j ACCEPT
firewall-cmd --reload
[Ref(s).: https://superuser.com/a/1659586/195840 , https://devops.ionos.com/tutorials/deploy-outbound-nat-gateway-on-centos-7/ , https://www.server-world.info/en/note?os=CentOS_Stream_8&p=firewalld&f=2 , https://www.comdivision.com/blog/centos-7-nat-router-basic-configuration , https://blog.redbranch.net/2015/07/30/centos-7-as-nat-gateway-for-private-network/ , https://unix.stackexchange.com/a/550064/61742 , https://forums.centos.org/viewtopic.php?t=53819#p227743 , https://www.server-world.info/en/note?os=CentOS_7&p=firewalld&f=2 , https://serverfault.com/q/870902/276753 ]
Install the package...
dnf install -y dhcp-server
Create a static network configuration for the NIC ens4 (<I_OKD_LAN>) as the DHCP service will be bound to it...
MODEL
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
BOOTPROTO=static
ONBOOT=yes
DEVICE=<I_OKD_LAN>
IPADDR=<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>
NETMASK=255.255.255.0
ZONE=public
IPV6INIT=no
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/etc/sysconfig/network-scripts/ifcfg-<I_OKD_LAN>'
EXAMPLE
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
BOOTPROTO=static
ONBOOT=yes
DEVICE=ens4
IPADDR=10.3.0.3
NETMASK=255.255.255.0
ZONE=public
IPV6INIT=no
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/etc/sysconfig/network-scripts/ifcfg-ens4'
Restart the network service...
systemctl restart NetworkManager.service
Copy the "dhcpd.conf" DHCP configuration file...
cd "/usr/local/src/okd_bare_metal"
mv /etc/dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf_BAK
cp ./dhcpd.conf /etc/dhcp/
NOTE: The DHCP service will be active only on the interface that is part of the subnet 10.3.0.0(<OKD_LAN_24>.0)/255.255.255.0, hence NIC ens4 (<I_OKD_LAN>).
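Each node receives a fixed IP because the DHCP service maps its MAC address to an address. In ISC DHCP this is normally expressed with a "host" declaration. The sketch below only prints such a declaration for illustration: "dhcp_host_entry" is a hypothetical helper, and the host name "okd-master-1" is an assumption that may differ from the names used in the repository's "dhcpd.conf".

```shell
#!/usr/bin/env bash
# Print an ISC dhcpd "host" declaration that pins an IP address to a MAC.
# dhcp_host_entry is a hypothetical helper; the host name used below is
# illustrative and may differ from those in the repository's "dhcpd.conf".
dhcp_host_entry() {
    local name="$1" mac="$2" ip="$3"
    printf 'host %s {\n    hardware ethernet %s;\n    fixed-address %s;\n}\n' \
        "$name" "$mac" "$ip"
}

dhcp_host_entry "okd-master-1" "52:54:00:7d:97:70" "10.3.0.4"
```

After editing "/etc/dhcp/dhcpd.conf" by hand, its syntax can usually be checked with "dhcpd -t -cf /etc/dhcp/dhcpd.conf" before restarting the service.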
Create firewall rules...
firewall-cmd --zone public --add-service=dhcp --permanent
firewall-cmd --reload
Enable the service "dhcpd" (DHCP) to automatically start at server boot, start it and watch its log in sequence...
systemctl enable dhcpd.service
systemctl restart dhcpd.service
journalctl -u dhcpd.service --no-pager | less +F
[Ref(s).: https://bytefreaks.net/gnulinux/centos-7-setup-a-dhcp-server-and-provide-specific-ip-based-on-mac-address , https://www.unixmen.com/how-to-install-dhcp-server-in-centos-and-ubuntu/ , https://www.appservgrid.com/paw92/index.php/2019/03/14/how-to-setup-dhcp-server-and-client-on-centos-and-ubuntu/ , https://tecadmin.net/configuring-dhcp-server-on-centos-redhat/ , https://elearning.wsldp.com/pcmagazine/install-centos-7-dhcp-server/ , https://elearningsurasakblog.wordpress.com/2019/09/24/how-to-install-and-configure-dhcp-server-on-centos7/ , https://linuxhint.com/dhcp_server_centos8/ , https://www.tecmint.com/install-dhcp-server-in-centos-rhel-fedora/ , https://ask.fedoraproject.org/t/dhcp-does-not-recognise-mac-address-of-interface/1290/36 ]
Install the packages...
dnf -y install bind
dnf -y install bind-utils
Copy the config files...
cd "/usr/local/src/okd_bare_metal"
mv /etc/named.conf /etc/named.conf_BAK
cp ./named.conf /etc/
cp ./named.conf.local /etc/named/
mkdir "/etc/named/zones"
cp ./rv.okd_domain /etc/named/zones/
cp ./fw.okd_domain /etc/named/zones/
Create the firewall rules...
firewall-cmd --zone public --add-port=53/udp --permanent
firewall-cmd --reload
In the DHCP server that serves the N_INT_LAN network, modify the OKD_SERVICES network settings for the NIC ens3 (<I_INT_LAN>), changing the DNS IP to "127.0.0.1" (localhost). If you don't use DHCP then you will have to configure these settings locally (manually). See "Network layout..." for more details.
Update the OKD_SERVICES's DHCP client settings...
MODEL
dhclient <I_INT_LAN>
EXAMPLE
dhclient ens3
[Ref(s).: https://computingforgeeks.com/install-and-configure-dhcp-server-on-centos-rhel-linux/ ]
Enable the service "named" (BIND 9) to automatically start at server boot, start it and watch its log in sequence...
systemctl enable named.service
systemctl restart named.service
journalctl -u named.service --no-pager | less +F
Test DNS on the OKD_SERVICES...
MODEL
dig <YOUR_DOMAIN>
dig -x <OKD_LAN_24>.<OKD_SERVICES_LST_OCT>
EXAMPLE
dig domain.abc
dig -x 10.3.0.3
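Besides the base domain, a user-provisioned cluster depends on the "api", "api-int" and wildcard "*.apps" records under "<CLUSTER_NAME>.<YOUR_DOMAIN>". The sketch below lists those names and, optionally, resolves them with "dig". The cluster name "mycluster" is hypothetical (use the one configured in your zone files), and both function names are ours.

```shell
#!/usr/bin/env bash
# List the DNS names a user-provisioned OKD cluster is expected to resolve.
# "mycluster" is a hypothetical cluster name; "test" stands in for any host
# under the "*.apps" wildcard record.
required_dns_names() {
    local cluster="$1" domain="$2"
    printf '%s\n' \
        "api.${cluster}.${domain}" \
        "api-int.${cluster}.${domain}" \
        "test.apps.${cluster}.${domain}"
}

# Resolve each name with dig and report the ones that return no answer.
# This function is defined but not called here; call it on OKD_SERVICES.
check_dns_names() {
    local name
    while read -r name; do
        if [ -z "$(dig +short "$name")" ]; then
            echo "UNRESOLVED: $name"
        fi
    done < <(required_dns_names "$1" "$2")
}

required_dns_names "mycluster" "domain.abc"
```

On OKD_SERVICES, running check_dns_names "mycluster" "domain.abc" should print nothing when all three records resolve.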
Test internet access...
curl http://www.google.com
Install the package...
dnf -y install haproxy
Copy the config files...
cd "/usr/local/src/okd_bare_metal"
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg_BAK
cp ./haproxy.cfg /etc/haproxy/
Allow HAProxy to bind to non-local IP addresses...
tee "/etc/sysctl.d/ip_nonlocal_bind.conf" << EOF
net.ipv4.ip_nonlocal_bind=1
EOF
sysctl -w net.ipv4.ip_nonlocal_bind=1
[Ref(s).: https://rahmatawe.com/blog/deploying-upi-okd/ ]
If SELinux is in enforcing mode, allow HAProxy to proxy any port...
setsebool -P haproxy_connect_any 1
Create firewall rules...
firewall-cmd --zone public --add-port=6443/tcp --permanent
firewall-cmd --zone public --add-port=22623/tcp --permanent
firewall-cmd --zone public --add-port=80/tcp --permanent
firewall-cmd --zone public --add-port=443/tcp --permanent
firewall-cmd --reload
Enable the service "haproxy" (HAProxy) to automatically start at server boot, start it and watch its log in sequence...
systemctl enable haproxy.service
systemctl restart haproxy.service
journalctl -u haproxy.service --no-pager | less +F
[Ref(s).: https://github.com/pshchelo/stackdev/blob/master/dib_elements/aws-loadbalancer/install.d/11-haproxy ]
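The four ports opened above are 6443 (Kubernetes API), 22623 (machine config server), and 80 and 443 (HTTP and HTTPS ingress). A simple way to see whether the load balancer is accepting connections on them is sketched below. It uses bash's "/dev/tcp" redirection; "port_open" and "report_lb_ports" are hypothetical helper names. Note that an open port only shows that the HAProxy frontend is listening, not that any backend node is up.

```shell
#!/usr/bin/env bash
# Return success when a TCP connection to host:port can be opened.
# Uses bash's built-in /dev/tcp redirection, so no extra package is needed.
port_open() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Ports handled by the load balancer: 6443 (Kubernetes API),
# 22623 (machine config server), 80 and 443 (HTTP and HTTPS ingress).
report_lb_ports() {
    local host="$1" port
    for port in 6443 22623 80 443; do
        if port_open "$host" "$port"; then
            echo "${port} open"
        else
            echo "${port} closed"
        fi
    done
}

report_lb_ports 127.0.0.1
```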
Install the package...
dnf install -y chrony
Set the timezone...
MODEL
timedatectl set-timezone <TIMEZONE>
EXAMPLE
timedatectl set-timezone America/Sao_Paulo
TIP: To list the available timezones use the command...
timedatectl list-timezones
Create the firewall rules...
firewall-cmd --zone public --add-service=ntp --permanent
firewall-cmd --reload
Modify the "/etc/chrony.conf" configuration file to allow NTP clients access from local network...
MODEL
sed -i 's/#allow 192.168.0.0\/16/#allow 192.168.0.0\/16\nallow <OKD_LAN_24>.0\/24/g' /etc/chrony.conf
EXAMPLE
sed -i 's/#allow 192.168.0.0\/16/#allow 192.168.0.0\/16\nallow 10.3.0.0\/24/g' /etc/chrony.conf
... and to enable the Chrony server (NTP) to continue to act as if it were connected to the remote reference servers even if the connection (internet, basically) to them fails...
sed -i 's/#local stratum 10/local stratum 10/g' /etc/chrony.conf
... this, also, enables the host to continue to be an NTP server to other hosts on the local network.
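Because "sed" silently does nothing when its pattern is not found, it can be worth rehearsing these edits on a scratch file first. The sketch below assumes GNU sed (the "\n" in the replacement is a GNU extension) and uses two sample lines that imitate a stock "/etc/chrony.conf"; a real file may differ.

```shell
#!/usr/bin/env bash
# Rehearse the two chrony.conf edits on a scratch file.
# The two commented lines below imitate the relevant lines of a stock
# "/etc/chrony.conf"; a real file may differ.
scratch="$(mktemp)"
printf '%s\n' '#allow 192.168.0.0/16' '#local stratum 10' > "$scratch"

# GNU sed expands "\n" in the replacement into a line break.
sed -i 's/#allow 192.168.0.0\/16/#allow 192.168.0.0\/16\nallow 10.3.0.0\/24/g' "$scratch"
sed -i 's/#local stratum 10/local stratum 10/g' "$scratch"

cat "$scratch"
```

On the real file, a follow-up such as grep -E '^(allow|local)' /etc/chrony.conf confirms that both lines are actually present.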
Enable the service "chronyd" (Chrony) to automatically start at server boot, start it and watch its log in sequence...
systemctl enable chronyd.service
systemctl restart chronyd.service
journalctl -u chronyd.service --no-pager | less +F
Check if the NTP servers are accessible...
chronyc sources -v
TIP: At least 2 or 3 servers must be available. If none are available, there may be some network blocking for the NTP (123/UDP) protocol.
Force clock resynchronization...
chronyc -a "burst 3/5"
chronyc makestep 1 -1
... and then check that the "Leap status" field is "Normal"...
chronyc tracking
.
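The "Leap status" check can also be scripted. The sketch below assumes the usual "Field : Value" layout printed by "chronyc tracking"; the helper name "leap_status" is hypothetical.

```shell
# Hypothetical helper: read "chronyc tracking" output on stdin and print the
# value of the "Leap status" field.
leap_status() {
  awk -F' *: *' '$1 == "Leap status" { print $2 }'
}

# Query the local daemon only where chrony is installed.
if command -v chronyc >/dev/null 2>&1; then
  chronyc tracking | leap_status
fi
```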
[Ref.: https://www.server-world.info/en/note?os=CentOS_7&p=ntp&f=3 , https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-configuring_ntp_using_the_chrony_suite ]
A web service will be created to host the resources needed to create the cluster.
Install the package...
dnf -y install httpd
Change Apache ("httpd") to listen on port 8080...
sed -i 's/Listen 80/# Listen 80\nListen 8080/g' /etc/httpd/conf/httpd.conf
To avoid the error "[...]httpd: Could not reliably determine the server's fully qualified domain name[...]"...
sed -i 's/#ServerName www.example.com:80/#ServerName www.example.com:80\nServerName 127.0.0.1:8080/g' /etc/httpd/conf/httpd.conf
[Ref(s).: https://stackoverflow.com/a/46240707/3223785 ]
If SELinux is in enforcing mode, allow Apache ("httpd") to read user content...
setsebool -P httpd_read_user_content 1
[Ref(s).: https://linux.die.net/man/8/apache_selinux ]
Create firewall rules...
firewall-cmd --zone public --add-port=8080/tcp --permanent
firewall-cmd --reload
Enable the service "httpd" (Apache) to automatically start at server boot, start it and watch its log in sequence...
systemctl enable httpd.service
systemctl restart httpd.service
journalctl -u httpd.service --no-pager | less +F
Test the web server...
curl http://127.0.0.1:8080
dnf -y install libvirt
[Ref(s).: okd-project/okd#535 (comment) , https://www.gitmemory.com/cgruver , okd-project/okd#633 (review) ]
The latest version ("Latest release") of openshift-client and openshift-install can be found at https://github.com/openshift/okd/releases .
dnf -y install wget
cd "/usr/local/src/"
wget https://github.com/openshift/okd/releases/download/4.7.0-0.okd-2021-08-07-063045/openshift-client-linux-4.7.0-0.okd-2021-08-07-063045.tar.gz
wget https://github.com/openshift/okd/releases/download/4.7.0-0.okd-2021-08-07-063045/openshift-install-linux-4.7.0-0.okd-2021-08-07-063045.tar.gz
NOTE: The latest version ("Latest release") in the creation of this tutorial was "4.7.0-0.okd-2021-08-07-063045".
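Since the release tag appears several times in the commands above, it may be easier to keep it in one variable. The sketch below only assembles the two download URLs following the pattern of the "wget" commands above; the helper name "okd_artifact_url" and the variable names are hypothetical.

```shell
# Release tag taken from the NOTE above; replace it with the current one.
OKD_VERSION="4.7.0-0.okd-2021-08-07-063045"
OKD_BASE_URL="https://github.com/openshift/okd/releases/download"

# Hypothetical helper: print the download URL of one release artifact
# ("client" or "install") for the given version.
okd_artifact_url() {
  echo "${OKD_BASE_URL}/$2/openshift-$1-linux-$2.tar.gz"
}

okd_artifact_url client "$OKD_VERSION"
okd_artifact_url install "$OKD_VERSION"
```

Each printed URL can then be passed to "wget".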
Extract the OpenShift (OKD) installer and the "oc" client...
dnf -y install tar
cd "/usr/local/src/"
tar -zxvf openshift-client-linux-4.7.0-0.okd-2021-08-07-063045.tar.gz
tar -zxvf openshift-install-linux-4.7.0-0.okd-2021-08-07-063045.tar.gz
Move the "kubectl", "oc" and "openshift-install" binaries to the "/usr/local/bin" folder...
cd "/usr/local/src/"
mv ./kubectl /usr/local/bin/
mv ./oc /usr/local/bin/
mv ./openshift-install /usr/local/bin/
rm -f README.md
Test showing the versions of "oc" and "openshift-install"...
oc version
openshift-install version
Generate a SSH key...
MODEL
ssh-keygen -t rsa -C "okd-services.<YOUR_DOMAIN>"
EXAMPLE
ssh-keygen -t rsa -C "okd-services.domain.abc"
... using the default options, that is, with an empty passphrase.
Then add the SSH private key to the SSH authentication agent to implement single sign-on with SSH...
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
[Ref(s).: https://www.ssh.com/academy/ssh/add , https://docs.gitlab.com/ee/ssh/ , https://git-scm.com/book/pt-pt/v2/Git-no-Servidor-Generating-Your-SSH-Public-Key , https://docs.okd.io/latest/installing/installing_gcp/installing-gcp-customizations.html#ssh-agent-using_installing-gcp-customizations ]
Set the "sshKey" parameter in the "install-config.yaml" file with your public key ("id_rsa.pub") value generated in the previous step ...
TARGET_ARG="<SSH_PUB_KEY>"
REPLACE_ARG=$(cat ~/.ssh/id_rsa.pub)
FILE_ARG="/usr/local/src/okd_bare_metal/install-config.yaml"
REPLACE_ARG=$(echo "'${REPLACE_ARG}'" | sed 's/[]\/$*.^|[]/\\&/g' | sed 's/\t/\\t/g' | sed ':a;N;$!ba;s/\n/\\n/g')
REPLACE_ARG=${REPLACE_ARG%?}
REPLACE_ARG=${REPLACE_ARG#?}
SED_ARGS="'s/$TARGET_ARG/$REPLACE_ARG/g'"
eval "sed -i $SED_ARGS $FILE_ARG"
NOTE: The above commands automatically set the "sshKey" parameter value, escaping the arguments passed to the "sed" command that actually does the job.
[Ref(s).: https://rahmatawe.com/blog/deploying-upi-okd/ ]
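The idea behind the escaping can be shown in a smaller form. The sketch below is an alternative, not the commands used in this roadmap: it escapes only the characters that are special on the replacement side of "sed" (backslash, slash and ampersand) and assumes GNU sed, a single-line value and a placeholder without regex metacharacters. The helper name "replace_placeholder" and the key shown are made up.

```shell
# Hypothetical helper: replace every occurrence of a placeholder in a file
# with an arbitrary single-line value, escaping the characters that are
# special on the replacement side of "sed" (backslash, slash, ampersand).
replace_placeholder() {
  placeholder="$1"
  value="$2"
  file="$3"
  escaped=$(printf '%s' "$value" | sed 's|[\\/&]|\\&|g')
  sed -i "s/${placeholder}/${escaped}/g" "$file"
}

# Demonstration on a throwaway file with a made-up key.
DEMO_FILE=$(mktemp)
printf "sshKey: '<SSH_PUB_KEY>'\n" > "$DEMO_FILE"
replace_placeholder '<SSH_PUB_KEY>' 'ssh-rsa AAAAB3Nza/C1+yc2E= okd-services.domain.abc' "$DEMO_FILE"
cat "$DEMO_FILE"
```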
Create the cluster's install directory and copy the "install-config.yaml" file...
mkdir "/usr/local/okd"
cp /usr/local/src/okd_bare_metal/install-config.yaml /usr/local/okd/
TIP: If you need to reuse the "/usr/local/okd/" folder, make sure it is empty. Hidden files are created after generating the configs and they should be removed before you use the same folder on a new attempt.
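A plain "ls" does not show hidden files, so the folder may look empty when it is not. The sketch below checks for emptiness with hidden files included; the helper name "dir_is_empty" is hypothetical.

```shell
# Hypothetical helper: succeed only if the directory exists and contains no
# entries at all, hidden ones included ("ls -A" lists dotfiles too).
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

INSTALL_DIR="/usr/local/okd"
if [ -d "$INSTALL_DIR" ]; then
  if dir_is_empty "$INSTALL_DIR"; then
    echo "$INSTALL_DIR is empty and can be reused"
  else
    echo "$INSTALL_DIR still has content (hidden files included):"
    ls -A "$INSTALL_DIR"
  fi
fi
```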
Generate the Kubernetes manifests for the cluster...
openshift-install create manifests --dir=/usr/local/okd/
NOTE: Ignore the warning.
Modify the "cluster-scheduler-02-config.yml" manifest file to prevent pods from being scheduled ("mastersSchedulable") on the master nodes...
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/g' /usr/local/okd/manifests/cluster-scheduler-02-config.yml
NOTE: The above procedure is required for an installation over a User-Provisioned Infrastructure (UPI)/"bare-metal".
Create the "ignition-configs"...
openshift-install create ignition-configs --dir=/usr/local/okd/
Create the web service's "okd" folder in the "/var/www/html" path...
mkdir "/var/www/html/okd"
Copy the "/usr/local/okd/" folder contents to the "/var/www/html/okd" folder...
cp -r /usr/local/okd/* /var/www/html/okd/
Download the Fedora CoreOS/FCOS Stable (Bare Metal) bios image (look for "Raw"/"raw.xz") and the sig files (look for "raw.xz.sig"), shorten the file names and set the necessary permissions...
NOTE: These resources can be found here https://getfedora.org/coreos/download?tab=cloud_launchable&stream=stable in the "Bare Metal & Virtualized" tab.
cd "/var/www/html/okd"
wget https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/34.20210725.3.0/x86_64/fedora-coreos-34.20210725.3.0-metal.x86_64.raw.xz
wget https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/34.20210725.3.0/x86_64/fedora-coreos-34.20210725.3.0-metal.x86_64.raw.xz.sig
mv ./fedora-coreos-34.20210725.3.0-metal.x86_64.raw.xz ./fcos.raw.xz
mv ./fedora-coreos-34.20210725.3.0-metal.x86_64.raw.xz.sig ./fcos.raw.xz.sig
chown -R apache: /var/www/html/
chmod -R 755 /var/www/html/
NOTE: The stable version at creation of this tutorial was "34.20210725.3.0".
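Before booting any node, it may be worth confirming that every file the nodes will request is present in the web folder. The sketch below lists the missing ones; the helper name "missing_files" is hypothetical and the file names are the ones used in this roadmap.

```shell
# Hypothetical helper: print the name of every expected file that is missing
# from the given directory.
missing_files() {
  dir="$1"
  shift
  for name in "$@"; do
    [ -f "$dir/$name" ] || echo "$name"
  done
}

missing_files /var/www/html/okd \
  bootstrap.ign master.ign worker.ign fcos.raw.xz fcos.raw.xz.sig
```

An empty output means that all five files are in place.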
Restart the Apache ("httpd") service...
systemctl restart httpd.service
Download a Fedora CoreOS/FCOS Stable (Bare Metal) ISO...
These resources can be found here https://getfedora.org/coreos/download?tab=cloud_launchable&stream=stable in the "Bare Metal & Virtualized" tab.
wget https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/34.20210725.3.0/x86_64/fedora-coreos-34.20210725.3.0-live.x86_64.iso
IMPORTANT: The stable version at creation of this tutorial was "34.20210725.3.0". Use the SAME VERSION used when the bios image and sig files were downloaded.
Enable Nested Virtualization. For more details see the section "KVM Nested Virtualization - Virtualization Support or Hardware-Assisted Virtualization for guests".
They must have access to N_OKD_LAN ("okd_network") network. See "Network layout..." for more details.
In this step we will not install any operating system. We will just provision the hardware resources on the hypervisor (host), without installing anything.
Configure OKD_BOOTSTRAP to use ISO "fedora-coreos-34.20210725.3.0-live.x86_64.iso" on boot.
Once the virtual machine starts press the "TAB" key during boot to edit the kernel boot options and add the following...
MODEL
coreos.inst.install_dev=<DISK_DEVICE_PATH> coreos.inst.image_url=http://<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>:8080/okd/bootstrap.ign
EXAMPLE
coreos.inst.install_dev=/dev/vda coreos.inst.image_url=http://10.3.0.3:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://10.3.0.3:8080/okd/bootstrap.ign
... right after the existing settings, which look something like this...
/images/pxeboot/vmlinuz initrd=/images/pxeboot/initrd.img,/images/ignition.img systemd.unified_cgroup_hierarchy=0 mitigations=auto,nosmt coreos.liveiso=fedora-coreos-34.20210725.3.0 ignition.firstboot ignition.platform.id=metal
... so the final appearance will be something like this...
/images/pxeboot/vmlinuz initrd=/images/pxeboot/initrd.img,/images/ignition.img systemd.unified_cgroup_hierarchy=0 mitigations=auto,nosmt coreos.liveiso=fedora-coreos-34.20210725.3.0 ignition.firstboot ignition.platform.id=metal coreos.inst.install_dev=/dev/vda coreos.inst.image_url=http://10.3.0.3:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://10.3.0.3:8080/okd/bootstrap.ign
Tag values...
- <DISK_DEVICE_PATH> - Path to disk device. E.g.: "/dev/sda", "/dev/vda";
- <OKD_SERVICES_LST_OCT> - Last octet of machine IP OKD_SERVICES. E.g.: "3";
- <OKD_LAN_24> - First three octets of N_OKD_LAN. E.g.: "10.3.0".
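Because the same three parameters are typed for every node, with only the ignition file changing, the line can be generated from the tag values instead of being written by hand. The sketch below does that with the example values; the helper name "fcos_kernel_args" is hypothetical.

```shell
# Values from the EXAMPLE above; adjust them to your environment.
DISK_DEVICE_PATH="/dev/vda"
OKD_LAN_24="10.3.0"
OKD_SERVICES_LST_OCT="3"

# Hypothetical helper: print the kernel arguments for a node role
# ("bootstrap", "master" or "worker").
fcos_kernel_args() {
  base_url="http://${OKD_LAN_24}.${OKD_SERVICES_LST_OCT}:8080/okd"
  echo "coreos.inst.install_dev=${DISK_DEVICE_PATH} coreos.inst.image_url=${base_url}/fcos.raw.xz coreos.inst.ignition_url=${base_url}/$1.ign"
}

fcos_kernel_args bootstrap
```

The printed line can then be pasted at the end of the kernel boot options.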
After adding the necessary parameters, press "ENTER". The process will then start downloading the resources from the services server (OKD_SERVICES) (the image "fcos.raw.xz" and its signature).
TIP: To insert the above entry using the "virt-manager" do as follows. From "View" > "Console" (virtual machine/guest management), go (again) to "View" > "Consoles" > "Serial 1". This mode will allow you to select the "Paste" option with the second mouse button.
When the process above ends, the system will reboot. After the reboot, configure the OKD_BOOTSTRAP to use its hard disk at boot, that is, disable the ISO as a boot option.
If everything goes well, the system will boot and the start screen will appear asking for login. It will look like this...
Fedora CoreOS 34.20210725.3.0
Kernel 5.13.4-200.fc34.x86_64 on an x86_64 (tty1)
okd-bootstrap login:
NOTE: The OS will print a lot of other information on screen that will get mixed up with the traditional components of the login screen described above. As far as we know this is "normal".
IMPORTANT: At this point, you might want, right now, to take a look at the guidelines in the "Follow the bootstrap process evolution (OKD_SERVICES)" (Especially in the "Some relevant guidelines" section). You might be able to avoid a lot of problems with the guidelines that are there.
Follow the same instructions as in "Starting the bootstrap node (OKD_BOOTSTRAP)" with the exception of what is in the section "Configure parameters on boot" that must be done according to these specific settings...
MODEL
coreos.inst.install_dev=<DISK_DEVICE_PATH> coreos.inst.image_url=http://<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>:8080/okd/master.ign
EXAMPLE
coreos.inst.install_dev=/dev/vda coreos.inst.image_url=http://10.3.0.3:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://10.3.0.3:8080/okd/master.ign
IMPORTANT: It is normal that, at the beginning, a master node continuously displays an error similar to the following...
[ 83.933709] ignition[531]: GET https://api-int.mbr.domain.abc:22623/config/master: attempt #16
[ 83.939340] ignition[531]: GET error: Get "https://api-int.mbr.domain.abc:22623/config/master": EOF
... , however with the bootstrap node (OKD_BOOTSTRAP) running (after passing the ignition phase) the above situation MUST NOT EXTEND FOR MORE THAN 10 MINUTES. If you exceed this amount of time, something has certainly gone wrong.
IMPORTANT: It is not mandatory to run the bootstrap process with only the master nodes running. As a matter of practicality, however, it is better to do it this way, since only they are needed for the bootstrap process to occur.
Follow the same instructions as in "Starting the bootstrap node (OKD_BOOTSTRAP)" with the exception of what is in the section "Configure parameters on boot" that must be done according to these specific settings...
MODEL
coreos.inst.install_dev=<DISK_DEVICE_PATH> coreos.inst.image_url=http://<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://<OKD_LAN_24>.<OKD_SERVICES_LST_OCT>:8080/okd/worker.ign
EXAMPLE
coreos.inst.install_dev=/dev/vda coreos.inst.image_url=http://10.3.0.3:8080/okd/fcos.raw.xz coreos.inst.ignition_url=http://10.3.0.3:8080/okd/worker.ign
IMPORTANT: It is normal that, at the beginning, a worker node continuously displays an error similar to the following...
[ 139.524957] ignition[532]: GET https://api-int.mbr.domain.abc:22623/config/worker: attempt #31
[ 139.528886] ignition[532]: GET result: Internal Server Error
... , however with all nodes running (after passing the ignition phase) the above situation MUST NOT EXTEND FOR MORE THAN 30 MINUTES. If you exceed this amount of time, something has certainly gone wrong.
To follow the evolution of the bootstrap process in the OKD_SERVICES use the command...
openshift-install wait-for bootstrap-complete --log-level=info --dir=/usr/local/okd/
... and to follow with more details on bootstrap node (OKD_BOOTSTRAP), access via ssh (no password required)...
MODEL
ssh core@<OKD_LAN_24>.<OKD_BOOTSTRAP_LST_OCT>
EXAMPLE
... and run this command...
journalctl -b -f -u release-image.service -u bootkube.service
.
IMPORTANT: The bootstrap process only ends when "all" the nodes - in fact only the master nodes need to be - have been created and an output similar to the one below is observed for the openshift-install wait-for bootstrap-complete --log-level=info --dir=/usr/local/okd/
command...
[...]
INFO Waiting up to 20m0s for the Kubernetes API at https://api.mbr.domain.abc:6443...
INFO API v1.20.0-1077+2817867655bb7b-dirty up
INFO Waiting up to 30m0s for bootstrapping to complete...
INFO It is now safe to remove the bootstrap resources
INFO Time elapsed: 1350s
... and an output similar to the one below is observed for the journalctl -b -f -u release-image.service -u bootkube.service
command...
[...]
Jul 19 01:57:25 okd-bootstrap bootkube.sh[15619]: I0719 01:57:25.084944 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
Jul 19 01:57:26 okd-bootstrap bootkube.sh[15619]: I0719 01:57:26.369145 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
Jul 19 01:57:32 okd-bootstrap bootkube.sh[15619]: I0719 01:57:32.655288 1 waitforceo.go:64] Cluster etcd operator bootstrapped successfully
Jul 19 01:57:32 okd-bootstrap bootkube.sh[15619]: I0719 01:57:32.657829 1 waitforceo.go:58] cluster-etcd-operator bootstrap etcd
Jul 19 01:57:32 okd-bootstrap podman[15619]: 2021-07-19 01:57:32.878374998 +0000 UTC m=+1110.936765616 container died 4c828c19c702ee51fc6eda68d52859ca2c477eb90aca9266273ff778aa1048ed (image=quay.io/openshift/okd-content@sha256:7707231ca5ce9c574cbcc8cd10b5b311fb98d59907f01696a76ea75e5ee65f09, name=reverent_morse)
Jul 19 01:57:33 okd-bootstrap bootkube.sh[9228]: bootkube.service complete
Jul 19 01:57:33 okd-bootstrap systemd[1]: bootkube.service: Deactivated successfully.
Jul 19 01:57:33 okd-bootstrap systemd[1]: bootkube.service: Consumed 17.878s CPU time.
.
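The end of the bootstrap can also be detected by searching the journal for the "bootkube.service complete" message shown above. The sketch below is a minimal version of that check; the helper name "bootkube_finished" is hypothetical.

```shell
# Hypothetical helper: read journal lines on stdin and succeed if the
# "bootkube.service complete" message is among them.
bootkube_finished() {
  grep -qF 'bootkube.service complete'
}

# Example with a line copied from the output above.
if echo 'Jul 19 01:57:33 okd-bootstrap bootkube.sh[9228]: bootkube.service complete' | bootkube_finished; then
  echo "bootstrap finished"
fi
```

On the bootstrap node (OKD_BOOTSTRAP) the input would come from "journalctl -b -u bootkube.service --no-pager" instead of "echo".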
NOTE: The worker nodes are created from the master nodes.
- It is "normal" to see a lot of errors in the output of the
journalctl -b -f -u release-image.service -u bootkube.service
command. Note, however, that some actions fail at first but end up succeeding later;
- Although it is not necessary to wait for this event to start the master nodes (OKD_MASTER_X), the OKD_BOOTSTRAP node will be ready for the master nodes when the log (command above) reaches output similar to this...
[...]
Jul 18 23:12:02 okd-bootstrap bootkube.sh[5146]: Created "99-okd-worker-disable-mitigations.yaml" machineconfigs.v1.machineconfiguration.openshift.io/99-okd-worker-disable-mitigations -n
Jul 18 23:12:02 okd-bootstrap bootkube.sh[5146]: Created "99_openshift-cluster-api_master-user-data-secret.yaml" secrets.v1./master-user-data -n openshift-machine-api
Jul 18 23:12:03 okd-bootstrap bootkube.sh[5146]: Created "99_openshift-cluster-api_worker-user-data-secret.yaml" secrets.v1./worker-user-data -n openshift-machine-api
Jul 18 23:12:03 okd-bootstrap bootkube.sh[5146]: Created "99_openshift-machineconfig_99-master-ssh.yaml" machineconfigs.v1.machineconfiguration.openshift.io/99-master-ssh -n
Jul 18 23:12:03 okd-bootstrap bootkube.sh[5146]: Created "99_openshift-machineconfig_99-worker-ssh.yaml" machineconfigs.v1.machineconfiguration.openshift.io/99-worker-ssh -n
user-data-secret
... which will happen after the bootstrap node (OKD_BOOTSTRAP) reboot. New entries will stop appearing in the log until you add the master nodes;
- The same reasoning above applies to the worker nodes when the log (command above) reaches output similar to the one below and stops printing for some minutes...
[...]
-version/cluster-version-operator Ready
Jul 18 23:58:37 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-kube-apiserver/kube-apiserver RunningNotReady
Jul 18 23:58:37 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-kube-scheduler/openshift-kube-scheduler RunningNotReady
Jul 18 23:58:37 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-kube-controller-manager/kube-controller-manager Ready
Jul 18 23:58:37 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-cluster-version/cluster-version-operator Ready
Jul 18 23:59:07 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-kube-controller-manager/kube-controller-manager Ready
Jul 18 23:59:07 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-cluster-version/cluster-version-operator Ready
Jul 18 23:59:07 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-kube-apiserver/kube-apiserver RunningNotReady
Jul 18 23:59:07 okd-bootstrap bootkube.sh[17729]: Pod Status:openshift-kube-scheduler/openshift-kube-scheduler Ready
... all master nodes are expected to reboot twice, so that everything is ready to create the worker nodes;
- The nodes must remain ALL THE TIME with the names they got from DHCP service (transient name). This is a sign that everything is going well. If a node shows "localhost" as its name, the process has failed. There may be a problem with the DHCP service (OKD_SERVICES) settings;
- It is also expected that all worker nodes will be rebooted twice, so that everything is effectively finished;
TIPS:
- If your bootstrap process is failing repeatedly, you may need to increase the nodes' RAM beyond the minimum requirements, although at the moment we don't recommend doing so. Keep in mind, however, that repeated failures may also be caused by problems with the remote resources consumed during the process. We've had cases where the bootstrap process repeatedly failed with a certain configuration and two days later worked perfectly with the same configuration. If these remote resources are in trouble, your cluster is at risk of being compromised forever;
- You can wait for the bootstrap process to complete with only the master nodes running, since only they are needed for it, and then add the worker nodes. This is not mandatory, but as a matter of practicality it is better to do the process this way;
- If a new attempt is needed (bootstrap the cluster again) remove the folders
rm -rf /usr/local/okd/
and
rm -rf /var/www/html/okd
in the OKD_SERVICES server, recreate all nodes - clean/recreate their hard disks - and restart the process from the subsection "Setup the cluster's install directory" of the section "Setup the OpenShift (OKD) (installer and client) (OKD_SERVICES)".
TIP: You can connect to any node with "ssh" from the machine where you generated the SSH key (OKD_SERVICES)...
MODEL
ssh core@<OKD_LAN_24>.<OKD_NODENAME_LST_OCT>
EXAMPLE
... and to login as "root" use sudo su
command.
Once the bootstrap process is finished, then the "httpd" (Apache) service (OKD_SERVICES) and the bootstrap node (OKD_BOOTSTRAP) will no longer be needed.
Remove firewall rules...
firewall-cmd --zone public --remove-port=8080/tcp
firewall-cmd --runtime-to-permanent
firewall-cmd --reload
Stop the "httpd" (Apache) service and disable its automatic start at server boot...
systemctl stop httpd.service
systemctl disable httpd.service
Comment out the bootstrap node and restart the HAProxy (load balancer) service (OKD_SERVICES)...
sed -i 's/server okd-bootstrap/# server okd-bootstrap/g' /etc/haproxy/haproxy.cfg
systemctl restart haproxy.service
Remove the bootstrap machine itself from the hypervisor (host) (HYPERVISOR).
[Ref(s).: https://docs.okd.io/latest/installing/installing_bare_metal/installing-bare-metal.html ]
Now that the master nodes are online, you should be able to login with the "oc" client. Use the following commands to log in and check some information about your cluster...
export KUBECONFIG=/usr/local/okd/auth/kubeconfig
oc whoami
oc get nodes
oc get csr
NOTE: With the oc whoami
command you should see the default administrator username, with the oc get nodes
command you should see only the master nodes and with the oc get csr
command you should see several CSRs (including worker nodes) awaiting approval.
TIP: You can add ("export") the environment variable "KUBECONFIG" globally and permanently using the file "/etc/environment". This way you won't need to export this variable every time you log in...
tee "/etc/environment" << EOF
KUBECONFIG=/usr/local/okd/auth/kubeconfig
EOF
[Ref(s).: https://stackoverflow.com/a/31546962/3223785 , https://unix.stackexchange.com/a/117473/61742 ]
Install the "jq" package to assist you with some maintenance operations...
cd "/usr/local/src"
wget -O jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64
mv ./jq /usr/local/bin/
chmod +x /usr/local/bin/jq
# To test/show version.
jq --version
NOTE: The latest release version at the time this tutorial was created was 1.6.
Approve all pending CSRs...
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
IMPORTANT:
Once you approve the first set of CSRs additional CSRs will be created. These must be approved too. If you do not see pending requests wait a moment.
Watch and wait for the worker nodes to join the cluster and enter a "Ready" status. This can take 5~10 minutes...
watch -n5 oc get nodes
TIP: Alternatively you can use "jq" in the same way and approve all the pending CSRs...
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
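Since new CSRs keep appearing after the first approval, the approval can be repeated in a few rounds. The sketch below is one way to do it, assuming that the last column of the "oc get csr" table is the condition; the helper name "pending_csrs", the three rounds and the 30 second pause are arbitrary assumptions.

```shell
# Hypothetical helper: read the table printed by "oc get csr" on stdin and
# print the names of the requests whose last column (CONDITION) is "Pending".
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Approve pending requests in a few rounds (only where "oc" is available).
if command -v oc >/dev/null 2>&1; then
  for round in 1 2 3; do
    names=$(oc get csr | pending_csrs)
    if [ -n "$names" ]; then
      echo "$names" | xargs oc adm certificate approve
    fi
    sleep 30
  done
fi
```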
Let's configure our OKD_SERVICES as an NFS (Network File System) server and use it for persistent storage.
Install the package...
dnf install -y nfs-utils
Create an NFS share registry directory...
mkdir -p /var/nfsshare/registry
Create an NFS export...
Add the entry below to the "/etc/exports" file...
MODEL
echo "/var/nfsshare <OKD_LAN_24>.0/24(rw,sync,no_root_squash,no_all_squash,no_wdelay)" | tee /etc/exports
EXAMPLE
echo "/var/nfsshare 10.3.0.0/24(rw,sync,no_root_squash,no_all_squash,no_wdelay)" | tee /etc/exports
Set the necessary permissions...
chmod -R 777 /var/nfsshare
chown -R nobody:nobody /var/nfsshare
If SELinux is in enforcing mode, allow NFS ("nfs-server") to export all...
setsebool -P nfs_export_all_rw 1
Create firewall rules...
firewall-cmd --zone public --add-service mountd --permanent
firewall-cmd --zone public --add-service rpc-bind --permanent
firewall-cmd --zone public --add-service nfs --permanent
firewall-cmd --reload
Enable the services "nfs-server"/"rpcbind" (NFS) to automatically start at server boot, start them and watch their logs in sequence...
systemctl enable nfs-server.service
systemctl restart nfs-server.service
journalctl -u nfs-server.service --no-pager | less +F
systemctl enable rpcbind.service
systemctl restart rpcbind.service
journalctl -u rpcbind.service --no-pager | less +F
Verify the image registry operator ("cluster-image-registry-operator-*") is running in the openshift-image-registry namespace...
oc get pod -n openshift-image-registry | grep "cluster-image-registry-operator-*"
Change "managementState" (Image Registry Operator) configuration from "Removed" to "Managed"...
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState": "Managed"}}'
Use the following command to create the PV using the YAML file...
oc create -f /usr/local/src/okd_bare_metal/registry_pv.yaml
Verify the status of the newly created PV...
oc get pv
... and check if the "STATUS" attribute is "Available" for "registry-pv" ("NAME").
Use the following command to create the PVC using the YAML file...
oc create -n openshift-image-registry -f /usr/local/src/okd_bare_metal/registry_pvc.yaml
Verify the status of the newly created PVC...
oc get pvc -n openshift-image-registry
... and check if the "STATUS" attribute is "Bound" for "registry-pvc" ("NAME").
Add the PVC to the cluster...
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"managementState":"Managed","pvc":{"claim":"registry-pvc"}}}}'
Verify, again, the status of PV...
oc get pv
... and check if the "STATUS" attribute is "Bound" for "registry-pv" ("NAME").
[Ref(s).: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/flexpod_openshift_platform_4.html , https://www.walkersblog.net/documents/openshift/openshift.html ]
Configure the cluster nodes as NTP (chrony) clients.
Open an SSH connection to a master node (OKD_MASTER_X)...
MODEL
ssh core@<OKD_LAN_24>.<OKD_MASTER_X_LST_OCT>
EXAMPLE
... and switch to root...
sudo su
Create a "chrony.conf" from an existing one...
cd "/usr/local/src"
grep -v -e '^#' -e '^$' /etc/chrony.conf > chrony.conf
Copy the contents of the generated "chrony.conf" configuration file...
cd "/usr/local/src"; cat chrony.conf
... and remove it...
cd "/usr/local/src"
rm -f chrony.conf
Close the master node's SSH connection (OKD_MASTER_X).
Create a "chrony.conf" file at the service server (OKD_SERVICES) using the content obtained above and make the necessary adjustments as below...
MODEL
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
pool 2.fedora.pool.ntp.org iburst
sourcedir /run/chrony-dhcp
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
keyfile /etc/chrony.keys
ntsdumpdir /var/lib/chrony
leapsectz right/UTC
logdir /var/log/chrony
server <OKD_LAN_24>.<OKD_SERVICES_LST_OCT> iburst prefer
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/usr/local/src/chrony.conf'
EXAMPLE
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
pool 2.fedora.pool.ntp.org iburst
sourcedir /run/chrony-dhcp
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
keyfile /etc/chrony.keys
ntsdumpdir /var/lib/chrony
leapsectz right/UTC
logdir /var/log/chrony
server 10.3.0.3 iburst prefer
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/usr/local/src/chrony.conf'
NOTE: The above configuration makes an NTP (chrony) client use the service server (OKD_SERVICES) as its preferred reference source.
Encode the "chrony.conf" file...
cd "/usr/local/src"
base64 chrony.conf > chrony.conf.encoded
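By default "base64" wraps its output into several lines, which is why the next step joins them again. The sketch below shows an alternative that produces a single line directly and verifies it with a round trip. It assumes GNU coreutils ("-w0" and "-d" options) and uses a throwaway file instead of the real "chrony.conf".

```shell
# Throwaway input standing in for /usr/local/src/chrony.conf.
SRC=$(mktemp)
printf 'server 10.3.0.3 iburst prefer\n' > "$SRC"

# "-w0" (GNU coreutils) disables line wrapping, so the result is one line.
ENCODED=$(base64 -w0 "$SRC")

# Decode again and compare with the original to confirm nothing was lost.
if printf '%s' "$ENCODED" | base64 -d | cmp -s - "$SRC"; then
  echo "round trip OK"
fi
```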
Set the "source" parameter in the "chrony_conf_master.yaml" and "chrony_conf_worker.yaml" files ("okd_bare_metal" folder) with the contents of the "chrony.conf.encoded" file generated in the previous step...
TARGET_ARG="<SOURCE_CHRONY_CONF>"
REPLACE_ARG=$(cat /usr/local/src/chrony.conf.encoded | awk '{print}' ORS='')
FILE_ARG_MASTER="/usr/local/src/okd_bare_metal/chrony_conf_master.yaml"
FILE_ARG_WORKER="/usr/local/src/okd_bare_metal/chrony_conf_worker.yaml"
REPLACE_ARG=$(echo "'${REPLACE_ARG}'" | sed 's/[]\/$*.^|[]/\\&/g' | sed 's/\t/\\t/g' | sed ':a;N;$!ba;s/\n/\\n/g')
REPLACE_ARG=${REPLACE_ARG%?}
REPLACE_ARG=${REPLACE_ARG#?}
SED_ARGS="'s/$TARGET_ARG/$REPLACE_ARG/g'"
eval "sed -i $SED_ARGS $FILE_ARG_MASTER"
eval "sed -i $SED_ARGS $FILE_ARG_WORKER"
NOTE: The above commands automatically set the "source" parameter value, escaping the arguments passed to the "sed" command that actually does the job.
Set the new NTP (chrony) client configuration for the worker nodes (OKD_WORKER_Xs)...
cd "/usr/local/src/okd_bare_metal"
oc apply -f /usr/local/src/okd_bare_metal/chrony_conf_worker.yaml
Set the new NTP (chrony) client configuration for the master nodes (OKD_MASTER_Xs)...
cd "/usr/local/src/okd_bare_metal"
oc apply -f /usr/local/src/okd_bare_metal/chrony_conf_master.yaml
NOTE: Creating the configurations causes each master and worker node to schedule a reboot.
This procedure must be performed on each node.
Open an SSH connection...
MODEL
ssh core@<OKD_LAN_24>.<OKD_NODE_LST_OCT>
EXAMPLE
... and switch to root...
sudo su
Set the timezone...
MODEL
timedatectl set-timezone <TIMEZONE>
EXAMPLE
timedatectl set-timezone America/Sao_Paulo
Force clock resynchronization...
chronyc -a "burst 3/5"
chronyc makestep 1 -1
... and then check that the "Leap status" field is "Normal"...
chronyc tracking
.
IMPORTANT: The cluster nodes take about 10~15 minutes to start consuming the service server (OKD_SERVICES) as source. This can be seen using the command...
chronyc sources -v
.
[Ref(s).: https://www.walkersblog.net/documents/openshift/openshift.html#_ntpchrony , https://docs.openshift.com/container-platform/4.4/installing/install_config/installing-customizing.html#installation-special-config-chrony_installing-customizing , https://opensource.com/article/18/12/manage-ntp-chrony , https://wiki.crowncloud.net/?How_to_Sync_Time_in_CentOS_8_using_Chrony , https://rahmatawe.com/blog/deploying-upi-okd/ , https://computingforgeeks.com/configure-chrony-ntp-service-on-openshift-okd/ ]
cat /usr/local/okd/auth/kubeadmin-password
NOTE: The user "kubeadmin" can be used to log in to the Web Console.
The "kubeadmin" is just an initial user. The easiest way to set up a local user is with "htpasswd"...
MODEL
cd "/usr/local/src/okd_bare_metal"
htpasswd -c -B -b users.htpasswd <OKD_ADM_USR> <OKD_ADM_PWD>
EXAMPLE
cd "/usr/local/src/okd_bare_metal"
htpasswd -c -B -b users.htpasswd okdadmusr MySeCREtvalUE
Create a secret in the "openshift-config" project using the "users.htpasswd" file you generated...
cd "/usr/local/src/okd_bare_metal"
oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd -n openshift-config
Add the identity provider...
cd "/usr/local/src/okd_bare_metal"
oc apply -f /usr/local/src/okd_bare_metal/htpasswd_provider.yaml
NOTE: Ignore the warning.
Give "cluster-admin" access to the new user...
MODEL
oc adm policy add-cluster-role-to-user cluster-admin <OKD_ADM_USR>
EXAMPLE
oc adm policy add-cluster-role-to-user cluster-admin okdadmusr
NOTE: Now, the created user has cluster administrator level access.
Test whether the web console is available...
MODEL
curl -k https://console-openshift-console.apps.mbr.<YOUR_DOMAIN>
EXAMPLE
curl -k https://console-openshift-console.apps.mbr.domain.abc
NOTE: The Web Console may take a few minutes to become available. It is normal for the above command to initially display an error like below...
[root@okd-services ~]# curl -k https://console-openshift-console.apps.mbr.domain.abc
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to console-openshift-console.apps.mbr.domain.abc:443
... or an error like below...
The application is currently not serving requests at this endpoint. It may not have been started or is still starting.
.
TIP: To check all available routes...
oc get routes --all-namespaces
.
From your desktop browser, test access to the Web Console using a URL like this example...
https://console-openshift-console.apps.mbr.domain.abc
... and on the first login screen use the "htpasswd_provider" option and then the username and password created in the "HTPasswd Setup (OKD_SERVICES)" section.
TIP: Test Web Console access from your desktop without having to publish the OpenShift (OKD) web resources on an external DNS on the Internet.
Add an entry as this example...
127.0.0.1 alertmanager-main-openshift-monitoring.apps.mbr.domain.abc canary-openshift-ingress-canary.apps.mbr.domain.abc console-openshift-console.apps.mbr.domain.abc downloads-openshift-console.apps.mbr.domain.abc grafana-openshift-monitoring.apps.mbr.domain.abc oauth-openshift.apps.mbr.domain.abc prometheus-k8s-openshift-monitoring.apps.mbr.domain.abc thanos-querier-openshift-monitoring.apps.mbr.domain.abc wordpress-wordpress-test.apps.mbr.domain.abc
... in your...
sudo vi "/etc/hosts"
... file, then create an SSH tunnel to the remote web service (ports 80/http and 443/https) with the ssh command...
MODEL
sudo ssh root@<SSH_INT_IP> -p 573 -CNL :80:<INT_LAN_24>.<OKD_SERVICES_IL_LST_OCT>:80 -CNL :443:<INT_LAN_24>.<OKD_SERVICES_IL_LST_OCT>:443
EXAMPLE
sudo ssh [email protected] -p 573 -CNL :80:10.2.0.18:80 -CNL :443:10.2.0.18:443
[Ref(s).: https://stackoverflow.com/a/29937009/3223785 ]
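The long "/etc/hosts" entry above does not have to be typed by hand. As a sketch only ("hosts_line" is a hypothetical helper, not an oc feature), the entry can be generated from a list of hostnames, one per line...

```shell
# hosts_line [IP]
# Hypothetical helper: reads one hostname per line from stdin and prints a
# single /etc/hosts entry that points all of them to IP (default 127.0.0.1).
hosts_line() {
    local ip="${1:-127.0.0.1}" names
    # Drop blank lines, remove duplicates, then join the names with spaces.
    names="$(grep -v '^[[:space:]]*$' | sort -u | paste -sd ' ' -)"
    if [ -z "$names" ]; then
        echo "no hostnames given" >&2
        return 1
    fi
    printf '%s %s\n' "$ip" "$names"
}
```

USAGE (on OKD_SERVICES, assuming you want every route host in the entry): oc get routes --all-namespaces -o jsonpath='{range .items[*]}{.spec.host}{"\n"}{end}' | hosts_line 127.0.0.1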
As this is a very common need and setup in network infrastructures, we will show how to allow access to OpenShift (OKD) web resources behind an Nginx reverse proxy using a wildcard setting for your DNS (external) and a Let's Encrypt wildcard SSL certificate.
In a simplified way the updated network layout will look like below.
Network layout updated with Nginx reverse proxy...
.---------------------.
| WAN |
| ↕ |
| NGINX_REVERSE_PROXY |
| ↕ |
| OKD_SERVICES |
| ↕ |
| [...] |
'---------------------'
The required procedures are in the "Setup Let's Encrypt Wildcard SSL Certificate with Nginx Reverse Proxy" section.
The NGINX_REVERSE_PROXY must have access to N_INT_LAN ("default") network.
As this is a very common need and setup, we will show you how to allow access to OpenShift (OKD) resources using OpenLDAP (LDAP) as identity provider.
The required procedures are in the "OpenLDAP (LDAP) and OpenShift (OKD) - Configuring OpenLDAP (LDAP) as identity provider for OpenShift (OKD)" section.
Create a new project...
oc new-project wordpress-test
Create a new app using a CentOS 7 ("php-73-centos7") S2I ("source-to-image") image from Docker Hub and use the WordPress GitHub repo as the source...
oc new-app centos/php-73-centos7~https://github.com/WordPress/WordPress.git
NOTES:
- If an error occurs stating "[...] You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit [...]", this is not an OpenShift (OKD) error, but a limitation imposed by Docker, whereby anonymous and free Docker Hub users are limited to 100~200 container image pull requests per six hours;
- This step tends to fail due to problems in the repositories used. If it fails, you can delete the project...
oc delete project wordpress-test
... and try to create it again.
[Ref(s).: https://www.docker.com/increase-rate-limit ]
Track the build progress...
oc logs -f buildconfig/wordpress
... and wait for outputs similar to these...
Writing manifest to image destination
Storing signatures
Successfully pushed image-registry.openshift-image-registry.svc:5000/wordpress-test/wordpress@sha256:0acafe4f78ea4b6c3dda75837d47e6a4bb3b5d20b02c2597b77afa73ff1355cf
Push successful
Expose the service to create a route...
oc expose service/wordpress
Create a new app using the CentOS 7 MariaDB image with some environment variables...
oc new-app centos/mariadb-103-centos7 --name mariadb --env MYSQL_DATABASE=wordpress --env MYSQL_USER=wordpress --env MYSQL_PASSWORD=wordpress
Open the Web Console and change from "Administrator" to "</> Developer" (top left corner "↓"), then on "Topology" select and click on the "wordpress-test" project.
Click on the "wordpress" object, then on the link in the "Routes" section.
After accessing the indicated route, proceed as follows...
- 1ST SCREEN: You should see the WordPress setup config. Click "Let's go!".
- 2ND SCREEN: Fill the "Database Name" ("wordpress"), "Username" ("wordpress"), "Password" ("wordpress") and "Database Host" ("mariadb"). Click "Submit".
- 3RD SCREEN: Click "Run the installation".
- 4TH SCREEN: Fill the "Site Title" ("wp_test"), "Username" ("wp_test"), "Password" ("wp_test"), "Your Email" ("[email protected]") and check "Confirm use of weak password". Click "Install WordPress".
- 6TH SCREEN: Click "Log In".
- 7TH SCREEN: Fill the "Username or Email Address" ("wp_test") and "Password" ("wp_test"). Click "Log In".
- 8TH SCREEN: Create/edit something if you want and/or click "View your site".
Check the size of the image registry directory on the NFS share (OKD_SERVICES). It should be around 250~300MB in size...
du -sh /var/nfsshare/registry
... , so your persistent volume is working.
Congratulations Dante! You've gone through all the nine circles of torment and installed OpenShift (OKD)! Wow!
Here are some resources available to help you...
To report issues, use the OKD GitHub repo ( https://github.com/openshift/okd ).
For support check out the "#openshift-users" channel on K8S (Kubernetes) Slack ( https://slack.k8s.io/ ).
The OKD Working Group ( https://github.com/openshift/community#okd-working-group-meetings ) meets bi-weekly to discuss the development and next steps. The meeting schedule and location are tracked in the openshift/community repo ( https://github.com/openshift/community/projects/1#card-28309038 ).
Google group for OKD-WG ( https://groups.google.com/forum/#!forum/okd-wg ).
This type of network can be used for a "very private" or "very isolated" network, since it will not be possible for the virtual machines (guests) to communicate with the hypervisor (host) and the internet (WAN) through this network. However, this virtual network interface can be used for communication between virtual machines (guests).
NOTE: Tested on CentOS 8.
[Ref(s).: https://libvirt.org/formatnetwork.html#examplesNoGateway ]
Create a new network config with no gateway addresses on KVM ("very private" or "very isolated") (HYPERVISOR)
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
<network>
<name>okd_network</name>
<uuid>[MY_NETWORK_UUID]</uuid>
<bridge name='virbr[MY_NETWORK_NUMBER]' stp='on' delay='0'/>
<mac address='52:54:00:[MY_NETWORK_MAC_FINAL]'/>
</network>
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/usr/share/libvirt/networks/okd_network.xml'
Tag values...
- [MY_NETWORK_UUID] ("uuid" is OPTIONAL) - You can generate a new one at the URL https://www.uuidgenerator.net/version4 (version 4 UUID);
- [MY_NETWORK_NUMBER] - We use the "virbr" prefix to follow the existing naming "convention". The suggested value is 1;
- [MY_NETWORK_MAC_FINAL] ("mac" is OPTIONAL) - You can generate one at the URL https://miniwebtool.com/mac-address-generator/ . Use the MAC address prefix "52:54:00" (it is always the same for KVM), the MAC address format with ":" and the case "Lowercase".
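If you prefer not to depend on external websites, both tag values can also be produced locally. This is a minimal sketch, assuming a Linux host with the usual "uuidgen" (util-linux) and "od" tools. The names "new_uuid" and "kvm_mac_final" are hypothetical helpers of ours; "uuidgen" normally produces a random-based UUID by default, but check your distribution if the version matters to you...

```shell
# new_uuid: prints a lowercase UUID for the [MY_NETWORK_UUID] tag.
new_uuid() {
    if command -v uuidgen >/dev/null 2>&1; then
        uuidgen | tr 'A-Z' 'a-z'
    else
        # Fallback: the Linux kernel's own random UUID generator.
        cat /proc/sys/kernel/random/uuid
    fi
}

# kvm_mac_final: prints three random lowercase hex groups ("ab:cd:ef")
# for the [MY_NETWORK_MAC_FINAL] tag (the part after "52:54:00:").
kvm_mac_final() {
    local bytes
    # od prints three random bytes as " ab cd ef"; "set --" splits them.
    bytes="$(od -An -N3 -tx1 /dev/urandom)"
    set -- $bytes
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}
```

USAGE: echo "uuid: $(new_uuid) mac: 52:54:00:$(kvm_mac_final)"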
virsh net-define "/usr/share/libvirt/networks/okd_network.xml"
virsh net-start okd_network
virsh net-autostart okd_network
Nested Virtualization is a technique to run virtual machines in other virtual machines (more than one level of virtualization).
Hardware Virtualization Support or Hardware-Assisted Virtualization is a set of processor extensions that address issues with the virtualization of some privileged instructions and the performance of virtualized system memory. So Hardware Virtualization Support is required on the host processor.
Intel's implementation is called VT-x and AMD's implementation is called AMD-V. This feature is available in most current CPUs. However, this feature might be disabled in the BIOS.
NOTE: Tested on CentOS 8.
[Ref(s).: https://storpool.com/blog/nested-virtualization-with-kvm-and-opennebula , https://docs.fedoraproject.org/en-US/quick-docs/using-nested-virtualization-in-kvm/ , https://www.linux-kvm.org/page/Nested_Guests ]
To make sure that Hardware Virtualization Extensions (Intel VT-X) are present on the host CPU and enabled in its BIOS you can use the following commands...
lscpu | grep Virtualization
... expected output is "Virtualization: VT-x".
NOTES:
- In this process we are only covering Intel's VT-X which is the most common. The process for AMD's AMD-V is very similar;
- The Hardware Virtualization Extensions also need to be enabled in the BIOS.
[Ref(s).: https://stackoverflow.com/a/56973830/3223785 ]
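The same check can be done by looking at the CPU flags directly, which also covers AMD processors. A minimal sketch, assuming a Linux host where the flags are listed in "/proc/cpuinfo" ("vmx" for Intel VT-x, "svm" for AMD-V). The name "virt_extension" is a hypothetical helper of ours...

```shell
# virt_extension [CPUINFO_FILE]
# Hypothetical helper: prints "VT-x", "AMD-V" or "none" according to the
# CPU flags found in CPUINFO_FILE (default /proc/cpuinfo).
virt_extension() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches whole words only, so "vmx" does not match inside other flags.
    if grep -qw 'vmx' "$cpuinfo"; then
        echo "VT-x"
    elif grep -qw 'svm' "$cpuinfo"; then
        echo "AMD-V"
    else
        echo "none"
        return 1
    fi
}
```

USAGE: virt_extension || echo "Enable the virtualization extensions in the BIOS."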
For Intel processors use the command...
cat "/sys/module/kvm_intel/parameters/nested"
... expected output is "1" or "Y".
NOTE: The KVM kernel modules do not enable nesting by default, though your distribution may override this default.
To enable Nested Virtualization use this command...
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
options kvm-intel nested=Y
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > '/etc/modprobe.d/kvm_intel.conf'
... and reboot the hypervisor (host).
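After the reboot, the result can be confirmed with a small script that works for either vendor. This is only a sketch: "nested_status" is a hypothetical helper of ours, and it assumes the parameter lives under "/sys/module/kvm_intel" (Intel) or "/sys/module/kvm_amd" (AMD), where the value is shown as "1" or "Y" when enabled. For AMD the analogous modprobe option line would use "kvm-amd" instead of "kvm-intel", but we did not test that case here...

```shell
# nested_status [SYSFS_MODULE_ROOT]
# Hypothetical helper: reports whether nested virtualization is enabled
# for the loaded KVM module. Returns 0 enabled, 1 disabled, 2 no module.
nested_status() {
    local root="${1:-/sys/module}" mod value
    for mod in kvm_intel kvm_amd; do
        if [ -r "$root/$mod/parameters/nested" ]; then
            value="$(cat "$root/$mod/parameters/nested")"
            case "$value" in
                1|Y|y) echo "$mod: nested virtualization enabled"; return 0 ;;
                *)     echo "$mod: nested virtualization disabled"; return 1 ;;
            esac
        fi
    done
    echo "no KVM module with a nested parameter found"
    return 2
}
```

USAGE: nested_status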
Open "virt-manager" on a desktop computer...
virt-manager
The configuration can be done according to the following guide...
- Select the virtual machine (guest) in the hypervisor (host) connection.
- NOTE: The virtual machine (guest) must be turned off.
- Right-click on it.
- Click "Open" in the context menu.
- A new window ("<VM_NAME> on QEMU/KVM: XXX.XXX.XXX.XXX") will appear.
- Follow "View" > "Details".
- Click on "CPUs" and on the "XML" tab look for the "<cpu>" element, configuring it according to this template...
<cpu [...]> [...] <feature policy="require" name="vmx"/> </cpu>
... adding the "<feature>" element with the appropriate parameters.
- Click on "Apply".
Access the virtual machine's terminal and proceed with the same checks used for the hypervisor (host) in the section "Check Hardware Virtualization Extensions".
Let's Encrypt is a Certificate Authority (CA) that provides an easy way to obtain and install free TLS/SSL certificates, thus enabling encrypted HTTPS on web servers. It simplifies the process by providing a software client (Certbot) that tries to automate most (if not all) of the necessary steps, especially for Apache and Nginx.
An Nginx reverse proxy is an intermediary proxy service which takes a client request, passes it on to one or more servers, and subsequently delivers the server's response back to the client.
NOTES:
- Tested on CentOS 8;
- We will not cover here the Nginx reverse proxy installation and basic configuration. We will just cover how to create a Let's Encrypt wildcard SSL certificate and configure Nginx as a reverse proxy with it to the OpenShift (OKD) web resources. Therefore, the explanations provided here assume that you already have your Nginx reverse proxy installed and working.
Create an "A" wildcard record in your DNS (external) as below...
MODEL
*.apps.mbr.<YOUR_DOMAIN>
EXAMPLE
*.apps.mbr.domain.abc
NOTE: In our DNS (external) we had to look for the section where "Type A" appears. Then we entered "*.apps.mbr.domain.abc" for "Entry" and our internet IP for "Value". This will vary according to each reality.
dnf install -y epel-release
dnf update -y
dnf install -y certbot python3-certbot-nginx
MODEL
certbot certonly \
--agree-tos \
--email <YOUR_ADMIN_EMAIL> \
--manual \
--preferred-challenges=dns \
-d '*.apps.mbr.<YOUR_DOMAIN>' \
--server https://acme-v02.api.letsencrypt.org/directory
EXAMPLE
certbot certonly \
--agree-tos \
--email [email protected] \
--manual \
--preferred-challenges=dns \
-d '*.apps.mbr.domain.abc' \
--server https://acme-v02.api.letsencrypt.org/directory
You will receive a "TXT" record which you need to add to your DNS (external) server. The record will look as below...
[...]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please deploy a DNS TXT record under the name
_acme-challenge.apps.mbr.domain.abc with the following value:
dL2prHMK152EdcZkcvUA18rsqCJihKoBIkXxyMK3VH5
Before continuing, verify the record is deployed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Press Enter to Continue
[...]
NOTE: In our DNS (external) we had to look for the section where the "Type TXT" appears. Then we entered "_acme-challenge.apps.mbr.domain.abc" for "Entry" and "dL2prHMK152EdcZkcvUA18rsqCJihKoBIkXxyMK3VH5" for "Value". This will vary according to each reality.
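Certbot asks you to verify that the record is deployed before continuing. One way to do that from a second terminal is to query the TXT record with "dig" (package "bind-utils" on CentOS). A minimal sketch, where "txt_record_deployed" is a hypothetical helper of ours and the resolver used is whatever your machine is configured with...

```shell
# txt_record_deployed NAME EXPECTED_VALUE
# Hypothetical helper: succeeds when the TXT record NAME currently
# resolves to EXPECTED_VALUE.
txt_record_deployed() {
    local name="$1" expected="$2"
    # "dig +short TXT" prints each value wrapped in double quotes, one per
    # line; -F matches literally and -x requires the whole line to match.
    dig +short TXT "$name" | grep -Fqx -- "\"$expected\""
}
```

USAGE: txt_record_deployed "_acme-challenge.apps.mbr.domain.abc" "dL2prHMK152EdcZkcvUA18rsqCJihKoBIkXxyMK3VH5" && echo "Deployed, press Enter in Certbot."

DNS changes may take some time to propagate, so a negative answer right after creating the record is not necessarily an error.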
Once the record has been deployed, press Enter to obtain the certificate. You should get feedback like below...
[...]
Waiting for verification...
Cleaning up challenges
IMPORTANT NOTES:
- Congratulations! Your certificate and chain have been saved at:
/etc/letsencrypt/live/apps.mbr.domain.abc/fullchain.pem
Your key file has been saved at:
/etc/letsencrypt/live/apps.mbr.domain.abc/privkey.pem
Your certificate will expire on 2021-10-19. To obtain a new or
tweaked version of this certificate in the future, simply run
certbot again. To non-interactively renew *all* of your
certificates, run "certbot renew"
- If you like Certbot, please consider supporting our work by:
Donating to ISRG / Let's Encrypt: https://letsencrypt.org/donate
Donating to EFF: https://eff.org/donate-le
Your new certificates will be in the folder...
MODEL
ls /etc/letsencrypt/live/apps.mbr.<YOUR_DOMAIN>
EXAMPLE
ls /etc/letsencrypt/live/apps.mbr.domain.abc
As the certificate expires every 3 months, we need to add a job (crontab) to renew the certificate automatically, so the user does not face an invalid digital certificate screen.
Check if there is already a schedule with the command...
crontab -l
... if there is no job, use the command below to add the schedule...
(crontab -l 2>/dev/null; printf "PATH=$PATH\n30 4 * * * /usr/bin/certbot renew --quiet --no-self-upgrade\n") | crontab -
... if there is already a job, use the command (behaves like vi/vim)...
crontab -e
...and add the line...
30 4 * * * /usr/bin/certbot renew --quiet --no-self-upgrade
IMPORTANT: Since cron jobs do not inherit the shell's environment variables, we need to add the current user's (root) PATH definition to the crontab. To do so, add (if it doesn't already exist) as the first line (before any schedule) the output of the command below...
echo "PATH=$PATH"
... that will be something like this...
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
.
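It is worth checking from time to time that the renewal is really happening. As far as we know, Certbot's documentation states that certificates issued with "--manual" are not renewed unattended by "certbot renew" unless an authentication hook script is supplied, so do not rely on the schedule alone. The sketch below uses the "-checkend" option of "openssl x509"; the name "cert_valid_for" is a hypothetical helper of ours and the 30 days default is an arbitrary choice...

```shell
# cert_valid_for CERT_FILE [DAYS]
# Hypothetical helper: succeeds when CERT_FILE will still be valid
# DAYS days from now (default 30).
cert_valid_for() {
    local cert="$1" days="${2:-30}"
    # -checkend N exits 0 when the certificate does not expire within the
    # next N seconds, and non-zero otherwise.
    openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null
}
```

USAGE: cert_valid_for "/etc/letsencrypt/live/apps.mbr.domain.abc/fullchain.pem" 30 || echo "Certificate expires in less than 30 days!"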
Adjust the settings in the "SETUP PARAMETERS" section (below) according to your reality and according to its guidelines...
NOTE: After adjusting the settings in the "SETUP PARAMETERS" section (below), copy all the content and paste it into the terminal. The commands below will result in the creation of a configuration file for the Nginx reverse proxy with the settings in the "SETUP PARAMETERS" section.
# > -------------------
# SETUP PARAMETERS
# The domain for the OpenShift (OKD) cluster.
OKD_DOMAIN="domain.abc"
# First 3 octets of OpenShift (OKD) internet network.
INT_LAN_24="10.2.0"
# Last octet of the OKD_SERVICES server IP.
OKD_SERVICES_IL_LST_OCT="18"
# Path where the Nginx reverse proxy configuration file should be created.
NGINX_RP_S_AVAL_PATH="/etc/nginx/sites-available"
# < -------------------
# > -------------------
# Nginx reverse proxy configuration file
read -r -d '' FILE_CONTENT << HEREDOC
BEGIN
server {
access_log /var/log/nginx/apps.mbr.$OKD_DOMAIN-ssl-access.log;
error_log /var/log/nginx/apps.mbr.$OKD_DOMAIN-ssl-error.log;
server_name *.apps.mbr.$OKD_DOMAIN;
location / {
proxy_pass https://$INT_LAN_24.$OKD_SERVICES_IL_LST_OCT:443;
proxy_set_header Host \$host;
proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
proxy_set_header X-Real-IP \$remote_addr;
proxy_ssl_name \$host;
proxy_ssl_server_name on;
}
listen 443 ssl;
ssl_certificate /etc/letsencrypt/live/apps.mbr.$OKD_DOMAIN/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/apps.mbr.$OKD_DOMAIN/privkey.pem;
include /etc/letsencrypt/options-ssl-nginx.conf;
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}
server {
access_log /var/log/nginx/apps.mbr.$OKD_DOMAIN-access.log;
error_log /var/log/nginx/apps.mbr.$OKD_DOMAIN-error.log;
server_name ~^(?<subdomain>[^.]+).apps.mbr.$OKD_DOMAIN;
# Redirect HTTP routes to OpenShift (OKD) subdomains (routes) known as HTTPS.
if (\$subdomain = "\\
alertmanager-main-openshift-monitoring|\\
canary-openshift-ingress-canary|\\
console-openshift-console|\\
downloads-openshift-console|\\
grafana-openshift-monitoring|\\
oauth-openshift|\\
prometheus-k8s-openshift-monitoring|\\
thanos-querier-openshift-monitoring\\
") {
# [Ref(s).: https://stackoverflow.com/a/45504231/3223785 ]
return 301 https://\$host\$request_uri;
}
# Redirection to HTTPS is not possible for all routes because some OpenShift
# (OKD) routes use HTTP.
location / {
proxy_pass http://$INT_LAN_24.$OKD_SERVICES_IL_LST_OCT:80;
proxy_set_header Host \$host;
proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
proxy_set_header X-Real-IP \$remote_addr;
}
listen 80;
}
END
HEREDOC
echo -n "${FILE_CONTENT:6:-3}" > "$NGINX_RP_S_AVAL_PATH/apps.mbr.$OKD_DOMAIN"
# < -------------------
NOTES:
- Note that the most important configuration item above is...
proxy_ssl_name $host;
... , without it the "oauth-openshift.apps.mbr.<YOUR_DOMAIN>" route will not work;
- There are different ways to maintain and organize settings for an Nginx reverse proxy. Our approach uses the "sites-available/sites-enabled" scheme and symbolic links.
[Ref(s).: https://stackoverflow.com/q/68538099/3223785 ]
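With the "sites-available/sites-enabled" scheme, the generated file only takes effect once it is linked into "sites-enabled". A minimal sketch, assuming your "nginx.conf" already includes the "sites-enabled" directory (this is not the default on every distribution). The name "enable_site" is a hypothetical helper of ours...

```shell
# enable_site NAME [AVAILABLE_DIR] [ENABLED_DIR]
# Hypothetical helper: links AVAILABLE_DIR/NAME into ENABLED_DIR.
enable_site() {
    local name="$1"
    local avail="${2:-/etc/nginx/sites-available}"
    local enabled="${3:-/etc/nginx/sites-enabled}"
    if [ ! -f "$avail/$name" ]; then
        echo "missing: $avail/$name" >&2
        return 1
    fi
    # -s symbolic link, -f replace an existing link, -n do not follow it.
    ln -sfn "$avail/$name" "$enabled/$name"
}
```

USAGE: enable_site "apps.mbr.domain.abc" && nginx -t

The "nginx -t" command only tests the configuration syntax, so it is safe to run before restarting the service.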
Restart the service "nginx" (Nginx) and watch its log in sequence...
systemctl restart nginx.service
journalctl -u nginx.service --no-pager | less +F
TIP: If the DNS (external) configuration with wildcard is correct, any subdomain should get an answer according to the examples below. To test, run these commands on your desktop...
ping -c 2 any.apps.mbr.domain.abc
ping -c 2 other.apps.mbr.domain.abc
ping -c 2 subdomain.apps.mbr.domain.abc
ping -c 2 that.apps.mbr.domain.abc
ping -c 2 exists.apps.mbr.domain.abc
ping -c 2 for.apps.mbr.domain.abc
ping -c 2 your.apps.mbr.domain.abc
ping -c 2 domain.apps.mbr.domain.abc
.
OpenLDAP (LDAP) and OpenShift (OKD) - Configuring OpenLDAP (LDAP) as identity provider for OpenShift (OKD)
Here we explain how to configure OpenLDAP (LDAP) as an identity provider to validate user names (UIDs) and passwords against an LDAPv3 server, using simple bind authentication.
LDAP (Lightweight Directory Access Protocol) is an open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network.
OpenLDAP (LDAP) is a free, open-source implementation of the LDAP Protocol developed by the OpenLDAP (LDAP) Project.
NOTE: Tested on CentOS 8.
[Ref(s).: https://docs.okd.io/latest/authentication/identity_providers/configuring-ldap-identity-provider.html , https://access.redhat.com/documentation/en-us/openshift_container_platform/4.7/html/authentication_and_authorization/configuring-identity-providers#configuring-ldap-identity-provider , okd-project/okd#797 , https://www.ammeonsolutions.com/insights/2020/3/2/configuring-openshift-v41-ldap-configuration , https://blog.pichuang.com.tw/20200427-openshift-with-coreos-part-5/ , https://www.linuxstudio.com/rh/apps-certificates.html ]
The OpenShift (OKD) cluster has a "stalking mania" and doesn't even believe in its own resources. So we have to extract the certificate used by its access API and add it as trusted, to avoid certificate validation errors.
List the pods used by the cluster's access API...
oc get pods -n openshift-authentication
... and take note of the identifier of one of them (e.g.: "oauth-openshift-56678dbbfb-dzq2j").
Extract the certificate...
MODEL
cd "/etc/pki/ca-trust/source/anchors"
oc rsh -n openshift-authentication <OAUTH_OPENSHIFT_POD_NAME> cat /run/secrets/kubernetes.io/serviceaccount/ca.crt > ingress-ca.crt
EXAMPLE
cd "/etc/pki/ca-trust/source/anchors"
oc rsh -n openshift-authentication oauth-openshift-56678dbbfb-dzq2j cat /run/secrets/kubernetes.io/serviceaccount/ca.crt > ingress-ca.crt
... and add it as a trusted Certificate Authority (CA)...
update-ca-trust extract
[Ref(s).: https://www.mankier.com/8/update-ca-trust ]
To specify an identity provider, you must create a custom resource (CR) that describes that identity provider and add it to the cluster.
Configure the custom resource (CR)...
EXAMPLE
cat <<EOF | oc apply -f -
---
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: ldapidp
mappingMethod: claim
type: LDAP
ldap:
attributes:
id:
- dn
email:
- mail
name:
- cn
preferredUsername:
- uid
bindDN: ""
insecure: true
url: "ldap://10.2.0.5:389/dc=domain,dc=abc?uid?sub?(pgmemberof=cn=openshift_okd,ou=groups,dc=domain,dc=abc)"
EOF
NOTES:
- In the example above we do not use LDAPS (LDAP over SSL, port 636) nor a bind DN with password. If this is not your case see the documentation at https://docs.okd.io/latest/authentication/identity_providers/configuring-ldap-identity-provider.html for more details;
- For the "url" parameter we are using a "memberOf" ("pgmemberof") group filter, so that only users of the "openshift_okd" group can log into OpenShift (OKD). Here's another valid model/example...
MODEL
url: "<LDAP_PROTOCOL>://<LDAP_SRV_NM_OR_IP>:<LDAP_SRV_PORT>/<LDAP_BASE_DN>?<LDAP_URI_PARAMETERS>"
EXAMPLE
url: "ldap://10.2.0.5:389/dc=domain,dc=abc?uid"
.
PLUS: If you use POSIX Groups in your OpenLDAP (LDAP) and want to use them as access groups (see parameter "pgmemberof" above), then you might want to take a look at this solution https://github.com/eduardolucioac/psx-grp-flt .
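The "url" value follows the layout shown in the MODEL above (base DN, then attribute, scope and filter separated by "?"). As an illustration only, the sketch below assembles it from its parts; "ldap_url" is a hypothetical helper of ours, and it does not escape special characters, so it assumes plain values like the ones in this guide...

```shell
# ldap_url SCHEME HOST PORT BASE_DN [ATTRIBUTE] [SCOPE] [FILTER]
# Hypothetical helper: prints an LDAP URL for the "url" parameter.
ldap_url() {
    local scheme="$1" host="$2" port="$3" base_dn="$4"
    local attr="${5:-uid}" scope="${6:-}" filter="${7:-}"
    local url="$scheme://$host:$port/$base_dn?$attr"
    # Scope and filter are optional; the scope position must be kept
    # (even if empty) whenever a filter is given.
    if [ -n "$scope" ] || [ -n "$filter" ]; then
        url="$url?$scope"
    fi
    if [ -n "$filter" ]; then
        url="$url?$filter"
    fi
    printf '%s\n' "$url"
}
```

USAGE: ldap_url ldap 10.2.0.5 389 "dc=domain,dc=abc" uid sub "(pgmemberof=cn=openshift_okd,ou=groups,dc=domain,dc=abc)"

Before applying the custom resource (CR), the filter itself can be tried against the server, assuming it accepts anonymous simple binds as in our example: ldapsearch -x -H ldap://10.2.0.5:389 -b "dc=domain,dc=abc" -s sub "(&(uid=myusruid)(pgmemberof=cn=openshift_okd,ou=groups,dc=domain,dc=abc))" uid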
Log in as an OpenLDAP (LDAP) user...
MODEL
oc login -u <LDAP_USER_UID>
EXAMPLE
oc login -u myusruid
Check the logged in user...
oc whoami
Logout...
oc logout
Return to the default administrator user...
oc login -u "system:admin"
Give "cluster-admin" access to an OpenLDAP (LDAP) user...
MODEL
oc adm policy add-cluster-role-to-user cluster-admin <LDAP_USER_UID>
EXAMPLE
oc adm policy add-cluster-role-to-user cluster-admin myusruid
NOTE: Now, the OpenLDAP (LDAP) user has cluster administrator level access.
From your desktop browser, test access to the Web Console using a URL like this example...
https://console-openshift-console.apps.mbr.domain.abc
... and on the first login screen use the "ldapidp" option and then an OpenLDAP (LDAP) user and password.
TIP: If you change a user's CN (change the "Last Name" or "First Name", for example) in OpenLDAP (LDAP), it will be necessary to "delete" this user in OpenShift (OKD)...
MODEL
oc delete user <LDAP_USER_UID>
EXAMPLE
oc delete user myusruid
... otherwise the error "Error from server (InternalError): Internal error occurred: unexpected response: 500" will occur when trying to log in via the command "oc login -u <LDAP_USER_UID>", and logging in to the Web Console will also fail.
This solution implies losing resources linked to the deleted user, so assess whether this is the best solution for you.
[Ref(s).: https://docs.openshift.com/enterprise/3.2/admin_guide/manage_users.html#managing-users-deleting-a-user ]
okd_bare_metal 🄯 BSD-3-Clause
Eduardo Lúcio Amorim Costa
Brazil-DF
https://www.linkedin.com/in/eduardo-software-livre/