NetApp: Using Ocean’s Cluster Roll to Update Nodes – The Spot.io Blog

Kubernetes has what you can think of as an aggressive release cycle. There have been three to four releases per year since 1.0 shipped in July 2015, so it is all too easy to fall a few releases behind. Running a recent version helps protect your organization from security issues, because releases stop being supported once they are three minor versions behind the latest. Staying up to date is not just a matter of security, either: you also gain access to new features and improvements.

When using a managed Kubernetes service like Amazon’s EKS, Google’s GKE, or Microsoft’s AKS, the control plane can be updated through a UI, CLI, or API call. However, this still leaves your worker nodes running an older version. Each cloud provider has options to upgrade your worker nodes to the current version. In this example, we’ll be using Amazon’s EKS. Amazon has an excellent article covering the cluster upgrade process. The following note from that article is especially relevant if you are using Spot Ocean to manage the lifecycle of your worker nodes:

We also recommend that you update your self-managed nodes to the same version as your control plane before updating the control plane.

Spot Ocean is here for you. You can update your worker nodes using a feature called “Cluster Roll”. This feature allows you to update all nodes that are part of a Virtual Node Group (VNG) in an orderly fashion. The new nodes are put in place with the changes you request. (In this case, we’ll replace the AMI with one that matches the new version of Kubernetes.) Existing nodes are marked as “NoSchedule” and are drained so that pods are transferred to the new nodes.


Let’s walk through the process together.


The demonstration environment

We have an EKS cluster running K8s v1.19 and want to upgrade to version 1.20. Kubernetes upgrades should be done incrementally, one minor version at a time. We can check the current version using the AWS CLI, kubectl, or a web UI. Provided the AWS CLI is configured with the correct credentials, running:

% aws eks describe-cluster --name knauer-eks-Zi4XDZoO

{
    "cluster": {
        "name": "knauer-eks-Zi4XDZoO",
        ...
        "version": "1.19",
        ...
    }
}

returns the version. An alternative method, assuming a valid kubeconfig file is available, would be to use kubectl:

% kubectl version --short

Client Version: v1.21.2
Server Version: v1.19.8-eks-96780e
WARNING: version difference between client (1.21) and server (1.19) exceeds the supported minor version skew of +/-1

Note: You should also regularly update the version of kubectl you use to manage your clusters. Typically, you want to stay within one minor version of the cluster you are managing. In this example, we get a warning because the cluster is two minor versions behind the client.
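The skew that kubectl warns about can also be computed directly from the two version strings. The following is a minimal sketch in plain POSIX shell; the helper names minor_of and version_skew are hypothetical and not part of kubectl.

```shell
# Extract the minor number from a version string such as "v1.19.8-eks-96780e".
minor_of() {
  v=${1#v}          # drop a leading "v" if present
  v=${v#*.}         # drop the major number and the first dot
  echo "${v%%.*}"   # keep everything up to the next dot
}

# Print the absolute minor-version difference between client and server.
version_skew() {
  c=$(minor_of "$1")
  s=$(minor_of "$2")
  if [ "$c" -ge "$s" ]; then echo $((c - s)); else echo $((s - c)); fi
}
```

With the versions shown above, version_skew v1.21.2 v1.19.8-eks-96780e prints 2, which is why kubectl reports that the supported skew of +/-1 is exceeded.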

This cluster currently has two worker nodes.

% kubectl get nodes

NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-0-1-127.us-west-2.compute.internal   Ready    <none>   49m   v1.19.6-eks-49a6c0
ip-10-0-3-146.us-west-2.compute.internal   Ready    <none>   45s   v1.19.6-eks-49a6c0

The first one listed is part of an AWS Auto Scaling group (ASG). Note: This ASG is not an EKS managed node group. The second is part of a VNG managed by Spot Ocean. The second node will be replaced during the cluster roll.

Upgrade the control plane

There are several ways to upgrade our EKS cluster version. For this example cluster, we’ll use the AWS CLI to manage the upgrade.

From the AWS documentation, we need to run:

% aws eks update-cluster-version \
  --region <region> \
  --name <name> \
  --kubernetes-version <kubernetes-version>

and substitute in some values. The “region” is the AWS Region in which this EKS cluster is provisioned. The “name” is the name of the EKS cluster. Set “kubernetes-version” to the desired version, which should be the running version plus one. Since we are at 1.19, we will use “1.20”.
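Because upgrades move one minor version at a time, the target version can be derived from the running one instead of typed by hand. This is a small sketch in POSIX shell; next_minor is a hypothetical helper name, and it assumes the major version stays the same.

```shell
# Print the next minor release for a version string, e.g. 1.19 -> 1.20.
next_minor() {
  major=${1%%.*}       # text before the first dot
  minor=${1#*.}        # text after the first dot
  minor=${minor%%.*}   # tolerate a patch suffix such as 1.19.8
  echo "$major.$((minor + 1))"
}
```

For the cluster in this example, next_minor 1.19 prints 1.20, so the result could be passed as the value of --kubernetes-version.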

Once all the required variables have been replaced, we will run:

% aws eks update-cluster-version \
  --region us-west-2 \
  --name knauer-eks-Zi4XDZoO \
  --kubernetes-version 1.20

This will return an ID which can be used to check the status of the upgrade.

% aws eks describe-update \
  --region us-west-2 \
  --name knauer-eks-Zi4XDZoO \
  --update-id ffe2232e-6389-4880-9ecc-4a6d65c1e42d

  {
    "update": {
        "id": "ffe2232e-6389-4880-9ecc-4a6d65c1e42d",
        "status": "InProgress",
        "type": "VersionUpdate",
        "params": [
            {
                "type": "Version",
                "value": "1.20"
            },
            {
                "type": "PlatformVersion",
                "value": "eks.1"
            }
        ],
       ...
    }
  }

The upgrade process will take several minutes. Eventually, the status will come back as “Successful”.

{
    "update": {
        "id": "ffe2232e-6389-4880-9ecc-4a6d65c1e42d",
        "status": "Successful",
        "type": "VersionUpdate",
        ...
    }
}
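Rather than re-running describe-update by hand, the status check can be polled in a loop. The sketch below assumes the AWS CLI is installed and configured; wait_for_update is a hypothetical helper name, and the 30-second default interval is an arbitrary choice.

```shell
# Poll an EKS update until it finishes.
# Arguments: region, cluster name, update ID, seconds between polls (default 30).
# Returns 0 on success, 1 if the update failed or was cancelled,
# and 2 if the AWS CLI call itself fails.
wait_for_update() {
  while :; do
    status=$(aws eks describe-update \
      --region "$1" --name "$2" --update-id "$3" \
      --query 'update.status' --output text) || return 2
    echo "update $3: $status" >&2
    case $status in
      Successful) return 0 ;;
      Failed|Cancelled) return 1 ;;
    esac
    sleep "${4:-30}"
  done
}
```

For this walkthrough it could be called as wait_for_update us-west-2 knauer-eks-Zi4XDZoO ffe2232e-6389-4880-9ecc-4a6d65c1e42d, and the return code then tells a script whether it is safe to move on to the worker nodes.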

Once it has completed successfully, we can verify that the control plane has been upgraded. The returned value for “Server Version” has been updated and now shows 1.20.x instead of 1.19.x.

% kubectl version --short

Client Version: v1.21.2
Server Version: v1.20.4-eks-6b7464


Upgrade worker nodes

While the control plane has been upgraded to version 1.20, our data plane or worker nodes are still running version 1.19.

% kubectl get nodes

NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-0-1-127.us-west-2.compute.internal   Ready    <none>   90m   v1.19.6-eks-49a6c0
ip-10-0-3-146.us-west-2.compute.internal   Ready    <none>   42m   v1.19.6-eks-49a6c0

Get a new AMI ID

Amazon publishes updated AMIs, but we need the AMI ID to move forward. One way to get it is to run a query using the AWS CLI.

% aws ssm get-parameter --name /aws/service/eks/optimized-ami/1.20/amazon-linux-2/recommended/image_id --region us-west-2 --query "Parameter.Value" --output text

ami-0b05016e79e1e54c6
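The parameter name in that query embeds the Kubernetes version, so the lookup is easy to parameterize for later upgrades. This is a sketch only: eks_ami_param and eks_ami_id are hypothetical helper names, and the path layout is copied from the command above rather than from any broader guarantee.

```shell
# Build the SSM parameter name for the recommended EKS-optimized
# Amazon Linux 2 AMI of a given Kubernetes version (e.g. 1.20).
eks_ami_param() {
  echo "/aws/service/eks/optimized-ami/$1/amazon-linux-2/recommended/image_id"
}

# Look up the AMI ID for a Kubernetes version ($1) in a region ($2).
# Requires a configured AWS CLI.
eks_ami_id() {
  aws ssm get-parameter \
    --name "$(eks_ami_param "$1")" \
    --region "$2" \
    --query 'Parameter.Value' --output text
}
```

Calling eks_ami_id 1.20 us-west-2 runs the same query as above; AMI IDs differ per region and change as new images are released, so the value should be looked up at upgrade time.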

Now that we have the AMI ID, we can proceed to upgrade the worker nodes that are part of the Ocean VNG.

Edit VNG

There are several ways to start a cluster roll with Spot Ocean.

  1. Spot user interface

  2. spotctl CLI

  3. Spot API

  4. SDK

We will use the Spot user interface for this example, but the other options lend themselves well to automation.

Note: You can find additional information about this process in the Ocean documentation.

Log in to the Spot UI and navigate to the “Virtual Node Groups” tab. Edit the virtual node group by clicking on the VNG name.

Now replace the “Image” value with the new AMI ID.

Note: We can use the “View AMI Details” link to verify that this is the AMI we wanted.

Finally, don’t forget to press the “Save” button after pasting the new AMI ID.

Start cluster roll

Starting the cluster roll is a quick process. First, go to the “Cluster Rolls” tab.

Second, select “Cluster Roll” from the “Actions” drop-down menu.

Since this VNG has only one node, there is no need to divide the roll into smaller batches. The third step is to click on the “Roll” button. This will submit the request and start the cluster roll.

Finally, we can switch to the “Log” tab and verify that the cluster roll has started.

What is happening in the cluster?

Now that the cluster roll is in progress, let’s take a closer look at what’s going on inside the cluster.

Running kubectl get nodes shows that we still have our two nodes. Note that scheduling has been disabled for the worker node that the cluster roll is about to replace.

% kubectl get nodes

NAME                                       STATUS                     ROLES    AGE    VERSION
ip-10-0-1-127.us-west-2.compute.internal   Ready                      <none>   121m   v1.19.6-eks-49a6c0
ip-10-0-3-146.us-west-2.compute.internal   Ready,SchedulingDisabled   <none>   72m    v1.19.6-eks-49a6c0

Wait a minute or so, then rerun the same command. Spot Ocean has provisioned a new node in the VNG with the updated AMI. The “NotReady” STATUS tells us that it is not yet ready for pods.

% kubectl get nodes

NAME                                       STATUS                     ROLES    AGE    VERSION
ip-10-0-1-127.us-west-2.compute.internal   Ready                      <none>   122m   v1.19.6-eks-49a6c0
ip-10-0-1-33.us-west-2.compute.internal    NotReady                   <none>   16s    v1.20.4-eks-6b7464
ip-10-0-3-146.us-west-2.compute.internal   Ready,SchedulingDisabled   <none>   74m    v1.19.6-eks-49a6c0

After another minute or two, the new node is ready to go.

% kubectl get nodes

NAME                                       STATUS                     ROLES    AGE    VERSION
ip-10-0-1-127.us-west-2.compute.internal   Ready                      <none>   123m   v1.19.6-eks-49a6c0
ip-10-0-1-33.us-west-2.compute.internal    Ready                      <none>   98s    v1.20.4-eks-6b7464
ip-10-0-3-146.us-west-2.compute.internal   Ready,SchedulingDisabled   <none>   75m    v1.19.6-eks-49a6c0
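While a roll is running, it can be handy to pull out just the nodes that are waiting to be replaced. A minimal sketch, assuming the default kubectl get nodes column layout with STATUS as the second column; cordoned_nodes is a hypothetical helper name.

```shell
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose STATUS column includes SchedulingDisabled (cordoned nodes).
cordoned_nodes() {
  awk '$2 ~ /SchedulingDisabled/ { print $1 }'
}
```

Piping the listing above through it, as in kubectl get nodes | cordoned_nodes, prints only ip-10-0-3-146.us-west-2.compute.internal.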

Once its pods have been drained, the original Ocean-managed node running version 1.19 is removed from the EKS cluster. The new node is running v1.20.x, the same version as the updated control plane. The node that is part of the ASG, and not the Ocean VNG, is still running the previous version.

% kubectl get nodes

NAME                                       STATUS   ROLES    AGE    VERSION
ip-10-0-1-127.us-west-2.compute.internal   Ready    <none>   136m   v1.19.6-eks-49a6c0
ip-10-0-1-33.us-west-2.compute.internal    Ready    <none>   13m    v1.20.4-eks-6b7464
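In a larger cluster, spotting leftover nodes by eye gets tedious. The sketch below filters the kubectl get nodes listing for nodes whose kubelet version does not match a given minor release; outdated_nodes is a hypothetical helper name, and it assumes NAME is the first column and VERSION the last.

```shell
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose VERSION (last column) does not start with "v<major.minor>.".
# Argument: the expected major.minor version, e.g. 1.20.
outdated_nodes() {
  awk -v want="v$1." 'NR > 1 && index($NF, want) != 1 { print $1 }'
}
```

Run as kubectl get nodes | outdated_nodes 1.20 against the final listing above, it prints ip-10-0-1-127.us-west-2.compute.internal, the ASG node that the cluster roll did not touch.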

Summary

We have successfully walked through the process of upgrading worker nodes in an EKS cluster to a new version of Kubernetes using Spot Ocean’s Cluster Roll feature. Kubernetes version upgrades aren’t the only use case for a cluster roll. You might need to update to a new AMI in response to a security CVE, or make other changes to worker nodes even when the control plane is not being upgraded. Please stay tuned for more articles highlighting Spot Ocean’s time-saving features. Thanks for following!

Disclaimer

NetApp Inc. published this content on September 24, 2021 and is solely responsible for the information it contains. Distributed by Public, unedited and unmodified, on September 24, 2021 06:11:04 AM UTC.
