Managing volume rollbacks in K8s using persistent volumes - amazon-web-services

I have a Kubernetes deployment managed by a Helm chart that I am planning to upgrade. The app has 2 persistent volumes attached, which are EBS volumes in AWS. If the deployment goes wrong and needs rolling back, I might also need to roll back the EBS volumes. How would one manage that in K8s? I can easily create the volume manually in AWS from the snapshot I've taken pre-deployment, but for the deployment to use it, would I need to edit the PV YAML file to point to my new volume ID? Or would I need to create a new PV using the volume ID and a new PVC, and then edit my deployment to use that claim name?
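To make the second option concrete, I imagine the statically provisioned PV/PVC pair would look roughly like this, assuming the in-tree EBS plugin (with the EBS CSI driver the volume reference would use a csi block instead); the volume ID, names and size below are placeholders, not my real values:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: myapp-data-restored
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  awsElasticBlockStore:
    # EBS volume created from the pre-deployment snapshot (placeholder ID)
    volumeID: vol-0123456789abcdef0
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-data-restored
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ""
  volumeName: myapp-data-restored
  resources:
    requests:
      storage: 20Gi
The deployment would then just reference claimName: myapp-data-restored, but I'm not sure whether that's the intended pattern or whether editing the existing PV is preferable.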

First you need to define a storage class with reclaimPolicy: Delete
https://kubernetes.io/docs/concepts/storage/storage-classes/
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
- debug
volumeBindingMode: Immediate
Then, in your Helm chart, use that storage class for the app's claims. When you delete the Helm release, the persistent volume claims are deleted, and because the storage class has reclaimPolicy: Delete, the corresponding persistent volumes are deleted as well.
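For illustration, a claim templated in the chart and using that class might look something like this (the claim name and size here are placeholders, not taken from the question):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 20Gi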
Be careful though: once the PV is deleted, you will not be able to recover that volume's data. There is no "recycle bin".

Related

Migrating StorageClass from gp2 to gp3 - AWS EKS

We are using EKS, and we have a StatefulSet that uses a storage class in its volumeClaimTemplates.
Our storage classes are of type gp2 and use the ebs.csi.aws.com provisioner.
We need to convert the storage classes from gp2 to gp3 without any downtime. Is it possible to do this? We are using kubectl and Kustomize, and when we tried to change the type using a Kustomize layer it gave me an error:
Forbidden: Updates to parameters are forbidden.
I also referred to this documentation which also includes steps to change the provisioner.
https://aws.amazon.com/blogs/containers/migrating-amazon-eks-clusters-from-gp2-to-gp3-ebs-volumes/
I was looking to understand whether we could easily change the PVC storage class type from gp2 to gp3, as we can with the AWS console for a regular volume.
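For reference, the kind of gp3 class we would want to end up with would look something like this (the class name and binding mode here are just our placeholders):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer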
Thank you

How to configure max price for managed node group for spot instances while creating AWS EKS cluster using eksctl config schema?

I'm creating an EKS cluster using eksctl. While developing the YAML configurations for the underlying resources, I came to know that spot instances are also supported with an AWS EKS cluster (here). However, while going through the documentation/schema, I didn't find anything to limit the bid price for spot instances. So by default it will bid at the on-demand price, which is not ideal. Am I missing anything here, or is it just not possible at the moment?
Sample YAML config for spot (cluster-config-spot.yaml) -
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: spot-cluster
  region: us-east-2
  version: "1.23"
managedNodeGroups:
- name: spot-managed-node-group-1
  instanceTypes: ["c7g.xlarge", "c6g.xlarge"]
  minSize: 1
  maxSize: 10
  spot: true
AWS EKS cluster creation command -
eksctl create cluster -f cluster-config-spot.yaml
maxPrice can be set for a self-managed node group this way (see the sketch below), but it is not supported for managed node groups. You can upvote the feature request here.
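Roughly, that is done via the instancesDistribution block for self-managed node groups; the group name, price ceiling and allocation strategy below are only illustrative:
nodeGroups:
- name: spot-self-managed-node-group-1
  minSize: 1
  maxSize: 10
  instancesDistribution:
    # illustrative ceiling in USD/hour; omitting it defaults to the on-demand price
    maxPrice: 0.20
    instanceTypes: ["c7g.xlarge", "c6g.xlarge"]
    onDemandBaseCapacity: 0
    onDemandPercentageAboveBaseCapacity: 0
    spotAllocationStrategy: capacity-optimized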

EksCtl : Update node-definitions via cluster config file not working

I am using eksctl to create our EKS cluster.
For the first run it works out well, but if I want to upgrade the cluster config later on, it doesn't work.
I have a cluster config file with me, but any changes made to it are not reflected by the update/upgrade commands.
What am I missing?
Cluster.yaml :
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: supplier-service
  region: eu-central-1
vpc:
  subnets:
    public:
      eu-central-1a: {id: subnet-1}
      eu-central-1b: {id: subnet-2}
      eu-central-1c: {id: subnet-2}
nodeGroups:
- name: ng-1
  instanceType: t2.medium
  desiredCapacity: 3
  ssh:
    allow: true
  securityGroups:
    withShared: true
    withLocal: true
    attachIDs: ['sg-1', 'sg-2']
  iam:
    withAddonPolicies:
      autoScaler: true
Now, if in the future I want to change the instance type or the number of replicas, I have to destroy the entire cluster and recreate it, which becomes quite cumbersome.
How can I do in-place upgrades of clusters created with eksctl? Thank you.
I'm looking into the exact same issue as yours.
After a lot of searching on the Internet, I found that it is not yet possible to upgrade an existing node group in place in EKS.
First, eksctl update has been deprecated. When I executed eksctl update cluster --help, it gave a warning like this:
DEPRECATED: use 'upgrade cluster' instead. Upgrade control plane to the next version.
Second, as mentioned in this GitHub issue and the eksctl documentation, up to now eksctl upgrade nodegroup is used only for upgrading the Kubernetes version of a managed node group.
So unfortunately, you'll have to create a new node group to apply your changes, migrate your workload/switch your traffic to the new node group, and decommission the old one. In your case it's not necessary to nuke the entire cluster and recreate it.
If you're looking for a seamless upgrade/migration with minimal or zero downtime, I suggest you try managed node groups, where the graceful draining of workloads seems promising:
Node updates and terminations gracefully drain nodes to ensure that your applications stay available.
Note: in your config file above, if you specify nodeGroups rather than managedNodeGroups, an unmanaged node group will be provisioned.
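For comparison, a managed version of your ng-1 group might be declared roughly like this; it is only a sketch reusing the values from your config, and I haven't checked every one of your fields (e.g. the securityGroups block) against the managed node group schema:
managedNodeGroups:
- name: ng-1
  instanceType: t2.medium
  desiredCapacity: 3
  ssh:
    allow: true
  iam:
    withAddonPolicies:
      autoScaler: true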
However, don't lose hope. An active issue in the eksctl GitHub repository has been lodged to add an eksctl apply option. At this stage it's not yet released; it would be really nice if this came true.
To upgrade the cluster using eksctl:
Upgrade the control plane version
Upgrade coredns, kube-proxy and aws-node
Upgrade the worker nodes
If you just want to update a nodegroup and keep the same configuration, you can just change the nodegroup name, e.g. append -v2 to it. [0]
If you want to change the node group configuration (e.g. the instance type), you also need to create a new node group (sketched below, after the links): eksctl create nodegroup --config-file=dev-cluster.yaml [1]
[0] https://eksctl.io/usage/cluster-upgrade/#updating-multiple-nodegroups-with-config-file
[1] https://eksctl.io/usage/managing-nodegroups/#creating-a-nodegroup-from-a-config-file
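As a rough sketch of that rename-and-replace approach applied to the config above (the new name and instance type are just examples, not something prescribed by [0] or [1]):
nodeGroups:
- name: ng-1-v2             # renamed copy of ng-1
  instanceType: t3.medium   # example of a changed setting
  desiredCapacity: 3
  ssh:
    allow: true
  iam:
    withAddonPolicies:
      autoScaler: true
  # ...remaining settings carried over from ng-1
Running eksctl create nodegroup --config-file=dev-cluster.yaml would then bring up ng-1-v2, and, as I understand the docs, eksctl delete nodegroup --config-file=dev-cluster.yaml --only-missing --approve would drain and remove the old ng-1 once the workload has moved.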

Control GPU machine to start and stop from one function?

Thanks to Google Cloud we get free credits for running GPUs in the cloud, but we are getting stuck at the very beginning.
We receive images daily for processing through a machine learning model, but the GPU system is not being used throughout the day. Is there any way we can control this system so it starts and stops once all the images are processed, via one function that we can call through cron on a specific day and time?
I have heard about AWS Lambda, but I am not sure what Google Cloud provides for this problem.
Thanks in advance.
If you are willing to spend the effort, you can achieve this using Google Kubernetes Engine. As far as I know, this is currently the only way to have self-starting and self-stopping GPU instances on GCP. To achieve this, you have to add a GPU node pool with autoscaling to your Kubernetes cluster.
gcloud container node-pools create gpu-pool \
  --cluster=${GKE_CLUSTER_NAME} \
  --machine-type=n1-highmem-96 \
  --accelerator=type=nvidia-tesla-v100,count=8 \
  --node-taints=reserved-pool=true:NoSchedule \
  --enable-autoscaling \
  --min-nodes=0 \
  --max-nodes=4 \
  --zone=${GCP_ZONE} \
  --project=${PROJECT_ID}
Make sure to substitute the env variables with your actual project ID etc., and also make sure to use a GCP zone that actually has the GPU types you want available (not all zones have all GPU types). Also specify the zone like europe-west1-b, not europe-west1.
This command will start all the nodes at once, but they will be shut down automatically after whatever the default timeout for autoscaling nodes is in your cluster configuration (for me I think it was 5 minutes). However, you can change that setting.
You can then start a Kubernetes Job (NOT a Deployment) that explicitly requests GPU resources, either from the CLI or using any of the available Kubernetes API client libraries.
Here is an example job.yaml with the main necessary components; however, you will need to tweak it according to your cluster config:
apiVersion: batch/v1
kind: Job
metadata:
  name: some-job
spec:
  parallelism: 1
  template:
    metadata:
      name: some-job
      labels:
        app: some-app
    spec:
      containers:
      - name: some-image
        image: gcr.io/<project-id>/some-image:latest
        resources:
          limits:
            cpu: 3500m
            nvidia.com/gpu: 1
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - some-app
            topologyKey: "kubernetes.io/hostname"
      tolerations:
      - key: reserved-pool
        operator: Equal
        value: "true"
        effect: NoSchedule
      - key: nvidia.com/gpu
        operator: Equal
        value: "present"
        effect: NoSchedule
      restartPolicy: OnFailure
It is vital that the tolerations are set up like this and that the resource limit is set to however many GPUs you want; otherwise it won't work.
The nodes will then be started (if none are available) and the job will be computed. Idle nodes will once again be shut down after the specified autoscaling timeout.
I got the idea from here.
You can use Cloud Scheduler for this use case, or you can trigger a Cloud Function when images are available and process them there.
However, the free $300 quota is intended for training and innovation purposes, not for an actual production application.
You can try to optimize the GPU usage of the instances by following the guide over here; however, you would need to manage it through cron or something similar on the instance.
Also, watch out for your credit usage when using GPUs on the free trial. The free trial gives you only $300 USD in credits; however, as seen over here, GPU usage is expensive and you may spend all your credits in one or two weeks if you are not careful.
Hope you find this useful!

Openshift deployment-config template fails to deploy pod if a volume is declared

I have an OpenShift deployment configuration template that I generated from a working deployment (using "oc export"). The original pod has a persistent volume claim (PVC) mounted on /data. When I try to deploy using the template, the pod never starts up. If I remove all mention of the volume and volume mount from the template, the pod does start. I then have to manually attach the volume. I want to be able to do it all from the template though. Here is the partial template showing only relevant items:
apiVersion: v1
kind: Template
metadata:
  name: myapp
objects:
- apiVersion: v1
  kind: DeploymentConfig
  metadata:
    name: myapp-service
  spec:
    template:
      spec:
        containers:
        - name: myapp-service
          image: my-private-registry/myapp-service:latest
          volumeMounts:
          - mountPath: /data
            name: volume-001
        volumes:
        - persistentVolumeClaim:
            claimName: nfs-pvc
          name: volume-001
When deployed with this template, the deployment sits waiting for the pod to be created ("Status: Container creating"). If the persistentVolumeClaim item is replaced with an ephemeral volume declaration:
volumes:
- emptyDir: {}
  name: volume-001
it starts up fine. So it seems that the problem is specific to the persistentVolumeClaim entry. Both the PV and PVC were set up beforehand, as shown here:
> oc get pv
NAME       CAPACITY   ACCESSMODES   STATUS   CLAIM                REASON   AGE
nfs-pv00   50Gi       RWX           Bound    my-project/nfs-pvc            1h
> oc get pvc
NAME      STATUS   VOLUME     CAPACITY   ACCESSMODES   AGE
nfs-pvc   Bound    nfs-pv00   50Gi       RWX           1h
EDIT: Another data point (that perhaps I should have started with) is that if an existing deployment configuration is modified from the openshift console (Applications->Deployments->my-dc) by selecting "Attach storage" and specifying the PVC, the same behavior is observed as with the templated deployment: a new deployment launches (due to config change) but its associated pod never starts.
SOLVED: My bad. I discovered that I had a bad mount point on the NFS server that I had set up to serve the PV. Once I specified the correct mount point, the pods started up fine (well, on to the next problem anyway).
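For anyone hitting something similar: the mount point in question lives in the NFS-backed PV definition rather than in the template. In my setup it is the part that looks roughly like this (the server and path below are placeholders, not my real values):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv00
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: nfs.example.com   # NFS server address (placeholder)
    path: /exports/myapp      # export the PV points at; in my case this mount point on the server was bad
The export has to exist and be mountable from the nodes, otherwise the pod sits in "Container creating" exactly as described above.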