In this post, I’ll briefly describe my personal experience with shutting down and starting up an OpenShift 4.9 cluster. This process is essential for system maintenance, updates, or other scenarios where a controlled restart is necessary.

Preparation

Before proceeding with the shutdown, it’s crucial to ensure that all nodes and pods are running correctly.

Check Node Status:

oc get nodes --show-labels

This command will display the status and labels of all nodes in the cluster.
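
On a larger cluster it is easy to overlook a single unhealthy node in that listing. As a minimal sketch, the helper below prints only the nodes whose status does not start with Ready. The function name `not_ready_nodes` is my own invention, and it assumes the default column layout of `oc get nodes`, where STATUS is the second column.

```shell
# Print the names of nodes whose STATUS does not start with "Ready".
# Assumes the default `oc get nodes` columns: NAME STATUS ROLES AGE VERSION.
# A cordoned node ("Ready,SchedulingDisabled") still counts as Ready here.
not_ready_nodes() {
  oc get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}
```

If `not_ready_nodes` prints nothing, every node reported Ready at that moment.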

Check Pod Status:

oc get pods --all-namespaces

This provides an overview of all pods across all namespaces, which is useful to verify their current state. It’s a good practice to save this output to compare the state of the cluster post-restart.
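
One way to keep that output around is to write it to files before the shutdown. The following is only a sketch: the function name `save_cluster_state` and the `cluster-state-<timestamp>` directory name are arbitrary choices of mine.

```shell
# Save node and pod listings into a directory and print the directory path.
# Usage: save_cluster_state [<directory>]
# The default directory name "cluster-state-<timestamp>" is an arbitrary choice.
save_cluster_state() {
  dir="${1:-cluster-state-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$dir" || return 1
  oc get nodes -o wide > "$dir/nodes.txt" || return 1
  oc get pods --all-namespaces -o wide > "$dir/pods.txt" || return 1
  echo "$dir"
}
```

Running `save_cluster_state` before the shutdown leaves `nodes.txt` and `pods.txt` in the printed directory for later comparison.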

Backup Namespaces:

Consider backing up your namespaces before the shutdown, for example with Velero.
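
As a rough illustration, a Velero backup of selected namespaces could be wrapped like this. It assumes Velero is already installed in the cluster and has a working backup storage location. The function name `backup_namespaces` and the backup and namespace names in the usage line are hypothetical.

```shell
# Create one Velero backup covering the given namespaces and wait for it.
# Usage: backup_namespaces <backup-name> <namespace> [<namespace> ...]
backup_namespaces() {
  if [ "$#" -lt 2 ]; then
    echo "usage: backup_namespaces <backup-name> <namespace>..." >&2
    return 1
  fi
  name="$1"
  shift
  # Join the remaining arguments with commas: "ns1 ns2" -> "ns1,ns2".
  namespaces=$(printf '%s,' "$@")
  namespaces="${namespaces%,}"
  velero backup create "$name" --include-namespaces "$namespaces" --wait
}
```

For example, `backup_namespaces pre-shutdown my-app my-db` would request a single backup named `pre-shutdown` that covers both namespaces.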

Shut Down Cluster

Shutdown Worker Nodes:

Shut down each worker node gracefully. Log in to the node (for example over SSH) and run the following as root:

shutdown -h now

Repeat this on every worker node.
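
Logging in to every node by hand gets tedious on bigger clusters. An alternative, which is not what I did but follows the pattern used in the official shutdown procedure, is to schedule the shutdown through `oc debug`. The sketch below is hedged accordingly: the function name `shutdown_nodes` is made up, and the default selector assumes your workers carry the standard `node-role.kubernetes.io/worker` label.

```shell
# Schedule a shutdown on every node matching a label selector.
# Usage: shutdown_nodes [<label-selector>]
shutdown_nodes() {
  selector="${1:-node-role.kubernetes.io/worker}"
  # `-o name` prints entries like "node/worker-0", which `oc debug` accepts.
  for node in $(oc get nodes -l "$selector" -o name); do
    echo "Shutting down $node"
    # `shutdown -h 1` halts after one minute, giving the debug pod time to exit.
    oc debug "$node" -- chroot /host shutdown -h 1
  done
}
```

Infra nodes often carry the worker label as well. If that is the case in your cluster, a selector such as `'node-role.kubernetes.io/worker,!node-role.kubernetes.io/infra'` (in single quotes) should leave them out of this first round.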

Shutdown Infra Nodes:

After the worker nodes, proceed with the infra nodes using the same command:

shutdown -h now

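Infra nodes typically host components such as the router, registry, and monitoring, so stopping them after the workers keeps those services available while the workloads go down.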

Shutdown Master Nodes:

Finally, shut down all the master nodes:

shutdown -h now

Shut the master nodes down last so that the control plane (API server and etcd) stays available while the other nodes are going down.
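
To double-check the order before powering anything off, the node list can be printed as a checklist: workers first, then infra nodes, then masters. This is a sketch only. The function name `shutdown_plan` is hypothetical, and it assumes the standard `node-role.kubernetes.io/...` labels are set on your nodes.

```shell
# Print a shutdown checklist: workers first, then infra nodes, then masters.
# Each entry is "<role>:<label-selector>"; "!key" means "label not present".
shutdown_plan() {
  step=1
  for entry in \
    'worker:node-role.kubernetes.io/worker,!node-role.kubernetes.io/infra,!node-role.kubernetes.io/master' \
    'infra:node-role.kubernetes.io/infra,!node-role.kubernetes.io/master' \
    'master:node-role.kubernetes.io/master'; do
    role="${entry%%:*}"
    selector="${entry#*:}"
    for node in $(oc get nodes -l "$selector" -o name); do
      echo "$step. [$role] $node"
      step=$((step + 1))
    done
  done
}
```

The exclusions keep a node with several role labels from showing up twice, so each node appears once, in the tier where it should be shut down.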

Start Cluster

Starting the cluster reverses the shutdown order.

  1. Start Master Nodes:

Boot up all the master nodes first. This is crucial as they control the cluster.

  2. Start Infra Nodes:

After the master nodes are online, start the infra nodes.

  3. Start Worker Nodes:

Finally, bring up all the worker nodes.
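
After the masters are powered on, it can take a while before the API answers again, and there is little point in checking the other nodes before that. A simple polling loop is one way to wait for it. The function name `wait_for_api` and the default of 60 attempts with 10 seconds between them are my own assumptions, not recommended values.

```shell
# Poll until the API server answers `oc get nodes`, or give up.
# Usage: wait_for_api [<attempts>] [<seconds-between-attempts>]
wait_for_api() {
  attempts="${1:-60}"
  delay="${2:-10}"
  n=1
  while [ "$n" -le "$attempts" ]; do
    if oc get nodes >/dev/null 2>&1; then
      echo "API reachable after $n attempt(s)"
      return 0
    fi
    sleep "$delay"
    n=$((n + 1))
  done
  echo "API still unreachable after $attempts attempts" >&2
  return 1
}
```

Running `wait_for_api` between the first and the second step gives a clear signal that the control plane is back before the remaining nodes are started.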

Verify Node Startup:

Check that every node has rejoined the cluster and reports Ready:

oc get nodes
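
If a node stays NotReady, pending certificate signing requests (CSRs) are worth a look, since the official restart procedure also asks you to check for them. The helper below is a sketch: the function name `pending_csrs` is hypothetical, and it assumes CONDITION is the last column of the `oc get csr` output.

```shell
# Print the names of certificate signing requests still in Pending state.
# Assumes CONDITION is the last column of `oc get csr` output.
pending_csrs() {
  oc get csr --no-headers | awk '$NF == "Pending" { print $1 }'
}
```

After reviewing a request, it can be approved with `oc adm certificate approve <csr_name>`.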

Verify Pod Status:

Ensure that all pods are running correctly:

oc get pods --all-namespaces

This verification step is important to confirm that the cluster is fully operational and that all services are running as expected.
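
If the pod listing was saved to a file before the shutdown, comparing it with a fresh listing is easier on a summary than on the raw output, because restart counts and ages always differ. The sketch below counts pods per namespace and status and diffs the two summaries. Both function names are made up, and the files are assumed to contain `oc get pods --all-namespaces` output including the header line.

```shell
# Summarise a saved `oc get pods --all-namespaces` listing as
# "<namespace> <status> <count>" lines, sorted for stable diffs.
pod_status_summary() {
  awk 'NR > 1 { count[$1 " " $4]++ }
       END { for (k in count) print k, count[k] }' "$1" | sort
}

# Show how the summary changed between two saved listings.
# Returns 0 when the summaries match, non-zero otherwise (like diff).
compare_pod_state() {
  before=$(mktemp) && after=$(mktemp) || return 1
  pod_status_summary "$1" > "$before"
  pod_status_summary "$2" > "$after"
  diff "$before" "$after"
  status=$?
  rm -f "$before" "$after"
  return "$status"
}
```

For example, `compare_pod_state pods-before.txt pods-after.txt` (both file names are placeholders) prints nothing when every namespace has the same number of pods in each status as before.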

Conclusion

This guide is based on my personal experience with managing an OpenShift 4.9 cluster. For authoritative, version-specific procedures, follow the official OpenShift documentation.

Reference:

How To: Stop and start a production OpenShift Cluster