AWS Kubernetes version update

I need some help.
AWS is forcing me to update my production cluster from 1.21 to 1.22.
So, is it safe, and what pitfalls might I encounter?
If I understood correctly, pressing the update button only updates the control plane? And if so, can I keep using my existing worker nodes and old workloads (YAML files) with the updated control plane? Or can I update the control plane, create a new group of worker nodes, and move the pods to the updated nodes? And what about StatefulSets: they have PVCs, and if I move a stateful pod to another node, how will it find its PVC?
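For reference, a hedged sketch of the flow described above, assuming the cluster and node groups are managed with eksctl (the cluster and node group names here are placeholders, not taken from the question):

    # Upgrade the control plane one minor version (1.21 -> 1.22)
    eksctl upgrade cluster --name my-cluster --version 1.22 --approve

    # Create a new node group whose AMI matches the new version
    eksctl create nodegroup --cluster my-cluster --name ng-1-22 --nodes 3

    # Drain the old nodes so workloads reschedule onto the new group, then remove it
    kubectl drain <old-node-name> --ignore-daemonsets --delete-emptydir-data
    eksctl delete nodegroup --cluster my-cluster --name ng-1-21

A StatefulSet pod keeps the same PVC by name when it is rescheduled; since EBS-backed volumes are zonal, the replacement node just needs to be in the same availability zone for the volume to re-attach.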

Related

Stop Kubernetes cluster in Autopilot mode

I have a kubernetes cluster set up and I want to stop it so it doesn't generate additional costs, but keep my deployments and configurations saved so that it will work when I start it again. I tried disabling autoscaling and resizing the node pool, but I get the error INVALID_ARGUMENT: Autopilot clusters do not support mutating node pools.
With GKE (Autopilot or not) you pay for two things:
The control plane, fully managed by Google.
The workers: node pools on standard GKE, the running pods on GKE Autopilot.
In both cases, you can't stop the control plane; you don't manage it. The only solution is to delete the cluster.
In both cases, you can scale your pods/node pools to 0 and thereby remove the worker cost.
That being said, in your case you have no option other than deleting your Autopilot cluster and saving your configuration in config files (the YAML manifests). The next time you want to start your Autopilot cluster, create a new one, load your config, and that's all; see the sketch below.
For persistent data, you have to save it somewhere external (on GCS, for instance) and reload it as well. That's the boring part.
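As a rough illustration of that save-and-recreate flow (the cluster name, region, bucket, and file names below are placeholders, not taken from the question):

    # Export the workload definitions you want to keep
    kubectl get deployments,services,configmaps,secrets,statefulsets -A -o yaml > backup.yaml

    # Copy any exported persistent data to a GCS bucket (application-specific step)
    gsutil -m cp -r /path/to/exported-data gs://my-backup-bucket/

    # Delete the cluster to stop all billing
    gcloud container clusters delete my-autopilot-cluster --region europe-west1

    # Later: recreate the Autopilot cluster and reload the config
    gcloud container clusters create-auto my-autopilot-cluster --region europe-west1
    kubectl apply -f backup.yaml

Note that the exported YAML contains cluster-specific fields (status, resourceVersion, clusterIP, and so on) that you may need to strip; keeping the original manifests in version control from the start is cleaner.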
Note: you have 1 cluster free per billing account

Security patches for Kubernetes Nodes

I have access to a kops-built Kubernetes cluster on AWS EC2 instances. I would like to make sure that all available security patches from the corresponding package manager are applied. Unfortunately, after searching the whole internet for hours I am unable to find any clue on how this should be done. Looking at the user data of the launch configurations, I did not find a line invoking the package manager, so I am not sure whether a simple node restart will do the trick, and I also want to make sure that new nodes come up with current packages.
How do I apply security patches to newly launched nodes of a Kubernetes cluster, and how do I make sure that all nodes are and stay up to date?
You might want to explore https://github.com/weaveworks/kured
Kured (KUbernetes REboot Daemon) is a Kubernetes daemonset that performs safe automatic node reboots when the need to do so is indicated by the package management system of the underlying OS.
Watches for the presence of a reboot sentinel e.g. /var/run/reboot-required
Utilises a lock in the API server to ensure only one node reboots at a time
Optionally defers reboots in the presence of active Prometheus alerts or selected pods
Cordons & drains worker nodes before reboot, uncordoning them after
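A hedged sketch of how this fits together on Debian/Ubuntu-based nodes (the manifest URL and version are illustrative; check the kured releases page for the current one):

    # Install kured as a DaemonSet, picking the release that matches your cluster version
    kubectl apply -f https://github.com/weaveworks/kured/releases/download/<version>/kured-<version>-dockerhub.yaml

    # On each node (for example via a custom AMI or kops additionalUserData):
    apt-get install -y unattended-upgrades
    # unattended-upgrades applies security patches and writes /var/run/reboot-required
    # when a reboot is needed; kured sees the sentinel, cordons and drains the node,
    # reboots it, and uncordons it afterwards.

Because new nodes are launched from the same image and user data, they pick up unattended-upgrades too, which covers the "upcoming nodes" part of the question.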

How to resize a K8s cluster with kops and cluster-autoscaler to dynamically increase masters

We have configured a Kubernetes cluster on EC2 machines in our AWS account using the kops tool (https://github.com/kubernetes/kops), based on AWS posts (https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-kops/) as well as other resources.
We want to set up a K8s cluster of masters and slaves such that:
It will automatically resize (both masters as well as nodes/slaves) based on system load.
It runs in multi-AZ mode, i.e. at least one master and one slave in every AZ (availability zone) of the same region, e.g. us-east-1a, us-east-1b, us-east-1c, and so on.
We tried to configure the cluster in the following ways to achieve the above.
Created a K8s cluster on AWS EC2 machines using kops with the following configuration: node count=3, master count=3, zones=us-east-1c, us-east-1b, us-east-1a. We observed that a K8s cluster was created with 3 master and 3 slave nodes, with one master and one slave server in each of the 3 AZs.
Then we tried to resize the nodes/slaves in the cluster using the cluster-autoscaler example (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-run-on-master.yaml). We set node_asg_min to 3 and node_asg_max to 5. When we increased the workload on the slaves enough to trigger the autoscaling policy, we saw that additional slave nodes (beyond the default 3 created during setup) were spawned, and they joined the cluster in various AZs. This worked as expected; there is no question here.
We also wanted to set up the cluster such that the number of masters increases based on system load. Is there some way to achieve this? We tried a couple of approaches and results are shared below:
A) We were not sure whether the cluster-autoscaler helps here, but nevertheless tried to resize the masters in the cluster using the same example (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-run-on-master.yaml). This is useful while creating a new cluster but was not useful for resizing the number of masters in an existing cluster. We did not find a parameter to specify node_asg_min and node_asg_max for masters the way it is present for slave nodes. Is there some way to achieve this?
B) We increased the MIN count from 1 to 3 in the ASG (auto-scaling group) associated with one of the three IGs (instance groups) for the masters. We found that new instances were created; however, they did not join the master cluster. Is there some way to achieve this?
Could you please point us to steps and resources on how to do this correctly, so that we can configure the number of masters to resize automatically based on system load while running in multi-AZ mode?
Kind regards,
Shashi
There is no need to scale Master nodes.
Master components provide the cluster’s control plane. Master components make global decisions about the cluster (for example, scheduling) and detect and respond to cluster events (starting up a new pod when a replication controller’s ‘replicas’ field is unsatisfied).
Master components can be run on any machine in the cluster. However, for simplicity, set up scripts typically start all master components on the same machine, and do not run user containers on this machine. See Building High-Availability Clusters for an example multi-master-VM setup.
Master node consists of the following components:
kube-apiserver
Component on the master that exposes the Kubernetes API. It is the front-end for the Kubernetes control plane.
etcd
Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.
kube-scheduler
Component on the master that watches newly created pods that have no node assigned, and selects a node for them to run on.
kube-controller-manager
Component on the master that runs controllers.
cloud-controller-manager
Component on the master that runs controllers that interact with the underlying cloud providers. The cloud-controller-manager binary is an alpha feature introduced in Kubernetes release 1.6.
For more detailed explanation please read the Kubernetes Components docs.
Also, if you are thinking about HA, you can read about Creating Highly Available Clusters with kubeadm.
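To see these components on a kops cluster yourself, a hedged example (kops runs them as static pods mirrored into kube-system, and the exact label names can vary by version):

    # Control-plane components appear as pods pinned to the masters
    kubectl get pods -n kube-system -o wide | grep -E 'kube-apiserver|kube-controller|kube-scheduler|etcd'

    # The masters themselves carry the master role label
    kubectl get nodes -l kubernetes.io/role=master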
I think your assumption is that, similar to Kubernetes nodes, masters divide the work between each other. That is not the case, because the main task of the masters is to maintain consensus with each other. This is done with etcd, which is a distributed key-value store. Maintaining such a store is easy for 1 machine but gets harder the more machines you add.
The advantage of adding masters is being able to survive more master failures, at the cost of having to make all masters fatter (more CPU/RAM, etc.) so that they perform well enough.
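To make that concrete: etcd needs a quorum of floor(n/2)+1 members, so 3 masters tolerate 1 failure and 5 tolerate 2, but every write still has to reach a majority, so extra masters add fault tolerance rather than capacity. If the control plane is overloaded, the usual kops approach is to make the existing master instance groups bigger rather than more numerous; a hedged sketch (cluster and IG names are the kops defaults, adjust to yours):

    # List the instance groups; kops creates one master IG per AZ
    kops get instancegroups --name mycluster.example.com

    # Scale a master vertically by editing its IG (e.g. set spec.machineType to a larger type)
    kops edit ig master-us-east-1a --name mycluster.example.com
    kops update cluster mycluster.example.com --yes
    kops rolling-update cluster mycluster.example.com --yes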

How Does Container Optimized OS Handle Security Updates?

If there is a security patch for Google's Container Optimized OS itself, how does the update get applied?
Google's information on the subject is vague
https://cloud.google.com/container-optimized-os/docs/concepts/security#automatic_updates
Google claims the updates are automatic, but how?
Do I have to set a config option to update automatically?
Does the node need to have access to the internet, where is the update coming from? Or is Google Cloud smart enough to let Container Optimized OS update itself when it is in a private VPC?
Do I have to set a config option to update automatically?
The automatic update behavior for Compute Engine (GCE) Container-Optimized OS (COS) VMs (i.e. those instances you created directly from GCE) is controlled via the "cos-update-strategy" GCE metadata key. See the documentation here.
The current documented default behavior is: "If not set all updates from the current channel are automatically downloaded and installed."
The download will happen in the background, and the update will take effect when the VM reboots.
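For illustration, the metadata can be inspected and set with gcloud (the instance name and zone are placeholders, and the value shown for disabling updates is an assumption based on the COS docs; verify it there):

    # Check whether the update strategy is set (absent means: auto-update from the current channel)
    gcloud compute instances describe my-cos-vm --zone us-central1-a --format='yaml(metadata)'

    # Explicitly opt a VM out of automatic updates
    gcloud compute instances add-metadata my-cos-vm --zone us-central1-a \
        --metadata cos-update-strategy=update_disabled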
Does the node need to have access to the internet, where is the update coming from? Or is Google Cloud smart enough to let Container Optimized OS update itself when it is in a private VPC?
Yes, the VM needs access to the internet. If you disable all egress network traffic, COS VMs won't be able to update themselves.
When operated as part of Kubernetes Engine, the auto-upgrade functionality of Container-Optimized OS (COS) is disabled. Updates to COS are applied by upgrading the image version of the nodes using the GKE upgrade functionality: upgrade the master, followed by the node pool, or use the GKE auto-upgrade features.
The guidance on upgrading a Kubernetes Engine cluster describes the upgrade process used for manual and automatic upgrades: https://cloud.google.com/kubernetes-engine/docs/how-to/upgrading-a-cluster.
In summary, the following process is followed:
Nodes have scheduling disabled (so they will not be considered for scheduling new pods admitted to the cluster).
Pods assigned to the node under upgrade are drained. They may be recreated elsewhere if attached to a replication controller or equivalent manager which reschedules a replacement, and there is cluster capacity to schedule the replacement on another node.
The node's Compute Engine instance is upgraded with the new COS image, using the same name.
The node is started, re-added to the cluster, and scheduling is re-enabled. (Except under certain conditions, pods will not automatically move back to the upgraded node.)
This process is repeated for subsequent nodes in the cluster.
When you run an upgrade, Kubernetes Engine stops scheduling, drains, and deletes all of the cluster's nodes and their Pods one at a time. Replacement nodes are recreated with the same name as their predecessors. Each node must be recreated successfully for the upgrade to complete. When the new nodes register with the master, Kubernetes Engine marks the nodes as schedulable.
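The matching gcloud commands for a manual upgrade look roughly like this (cluster, zone, and node-pool names are placeholders):

    # Upgrade the control plane (master) first
    gcloud container clusters upgrade my-cluster --zone us-central1-a --master

    # Then upgrade the nodes; GKE cordons, drains and recreates them one at a time
    gcloud container clusters upgrade my-cluster --zone us-central1-a --node-pool default-pool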

Kubernetes - adding more nodes

I have a basic cluster, which has a master and 2 nodes. The 2 nodes are part of an aws autoscaling group - asg1. These 2 nodes are running application1.
I need to be able to have further nodes, that are running application2 be added to the cluster.
Ideally, I'm looking to maybe have a multi-region setup, whereby application2 can run in multiple regions but be part of the same cluster (not sure if that is possible).
So my question is, how do I add nodes to a cluster, more specifically in AWS?
I've seen a couple of articles where people have spun up the instances and then manually logged in to install the kubelet and various other things, but I was wondering if it could be done in a more automatic way?
Thanks
If you followed these instructions, you should have an autoscaling group for your minions.
Go to the AWS console and scale up the autoscaling group. That should do it.
If you did it manually somehow, you can clone a machine by selecting an existing minion/slave and choosing "Launch more like this".
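If you prefer the CLI to the console, the same scale-up looks roughly like this, using the asg1 group mentioned in the question (the min/max/desired values are illustrative):

    # Confirm the group name and current size
    aws autoscaling describe-auto-scaling-groups \
        --query 'AutoScalingGroups[].[AutoScalingGroupName,DesiredCapacity]'

    # Add capacity; new instances boot from the same launch configuration
    # and join the cluster automatically
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name asg1 \
        --min-size 2 --max-size 5 --desired-capacity 4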
As Pablo said, you should be able to add new nodes (in the same availability zone) by scaling up your existing ASG. This will provision new nodes that will be available for you to run application2. Unless your applications can't share the same nodes, you may also be able to run application2 on your existing nodes without provisioning new nodes if your nodes are big enough. In some cases this can be more cost effective than adding additional small nodes to your cluster.
To your other question, Kubernetes isn't designed to be run across regions. You can run a multi-zone configuration (in the same region) for higher availability applications (which is called Ubernetes Lite). Support for cross-region application deployments (Ubernetes) is currently being designed.
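If application2 must land on the new nodes rather than share the existing ones, a common pattern is to label those nodes and pin the workload with a nodeSelector; a minimal sketch (the label key/value and node name are illustrative):

    # Label the nodes provisioned for application2
    kubectl label nodes <new-node-name> workload=application2

    # and in the application2 pod spec:
    #   spec:
    #     nodeSelector:
    #       workload: application2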