GKE how to use existing compute engine disk as persistent volumes? - google-cloud-platform

I might have to rebuild the GKE cluster, but the Compute Engine disks won't be deleted and need to be reused as persistent volumes for the pods. I haven't found documentation showing how to link an existing GCP Compute Engine disk as a persistent volume for pods.
Is it possible to use existing GCP Compute Engine disks with a GKE storage class and PersistentVolumes?

Yes, it's possible to reuse a Persistent Disk as a PersistentVolume in another cluster; however, there is one limitation:
The persistent disk must be in the same zone as the cluster nodes.
If the PD is in a different zone, the cluster will not find the disk.
In the documentation Using preexisting persistent disks as PersistentVolumes you can find information and examples of how to reuse persistent disks.
If you haven't created the Persistent Disk yet, you can create one based on the Creating and attaching a disk documentation. For these tests, I used the disk below:
gcloud compute disks create pd-name \
--size 10G \
--type pd-standard \
--zone europe-west3-b
If you create a PD smaller than 200G you will get the warning below; whether that matters depends on your needs. In zone europe-west3-b, the pd-standard type can have storage between 10GB and 65536GB.
You have selected a disk size of under [200GB]. This may result in poor I/O performance. For more information, see: https://developers.google.com/compute/docs/disks#performance.
Keep in mind that the available Persistent Disk types can differ between zones. For more details, check the Disk Types documentation or run $ gcloud compute disk-types list.
Once you have Persistent Disk you can create PersistentVolume and PersistentVolumeClaim.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv
spec:
  storageClassName: "test"
  capacity:
    storage: 10G
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: default
    name: pv-claim
  gcePersistentDisk:
    pdName: pd-name
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  storageClassName: "test"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10G
---
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/data"
          name: task-pv-storage
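For a statically created PersistentVolume like this one, the reclaim policy defaults to Retain, but it does no harm to spell it out so that releasing the claim can never delete the underlying Compute Engine disk. A minimal sketch of the relevant spec fields:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv
spec:
  # Retain keeps the underlying GCE disk when the claim is released
  persistentVolumeReclaimPolicy: Retain
  storageClassName: "test"
  capacity:
    storage: 10G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: pd-name
    fsType: ext4
```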
Tests
$ kubectl get pv,pvc,pod
NAME                  CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
persistentvolume/pv   10G        RWO            Retain           Bound    default/pv-claim   test                    22s

NAME                             STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/pv-claim   Bound    pv       10G        RWO            test           22s

NAME              READY   STATUS    RESTARTS   AGE
pod/task-pv-pod   1/1     Running   0          21s
Write some information to the disk:
$ kubectl exec -ti task-pv-pod -- /bin/bash
root@task-pv-pod:/# cd /usr/data
root@task-pv-pod:/usr/data# echo "This is test message from Nginx pod" >> message.txt
Now I removed all previous resources: pv, pvc and pod.
$ kubectl get pv,pvc,pod
No resources found
Now, if I recreate the pv and pvc with a small change in the pod, for example using busybox:
containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10; done"]
    volumeMounts:
      - mountPath: "/usr/data"
        name: task-pv-storage
It will be rebound
$ kubectl get pv,pvc,po
NAME                  CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
persistentvolume/pv   10G        RWO            Retain           Bound    default/pv-claim                           43m

NAME                             STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/pv-claim   Bound    pv       10G        RWO                           43m

NAME          READY   STATUS    RESTARTS   AGE
pod/busybox   1/1     Running   0          3m43s
And in the busybox pod I am able to find message.txt:
$ kubectl exec -ti busybox -- /bin/sh
/ # cd /usr/data
/usr/data # ls
lost+found    message.txt
/usr/data # cat message.txt
This is test message from Nginx pod
As additional information, you won't be able to use the disk in two clusters at the same time. If you try, you will get an error:
AttachVolume.Attach failed for volume "pv" : googleapi: Error 400: RESOURCE_IN_USE_BY_ANOTHER_RESOURCE - The disk resource 'projects/<myproject>/zones/europe-west3-b/disks/pd-name' is already being used by 'projects/<myproject>/zones/europe-west3-b/instances/gke-cluster-3-default-pool-bb545f05-t5hc'
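If the data only needs to be read, GCE Persistent Disks do support being attached to many nodes read-only, so a read-only PV is one way to share the disk more widely. A sketch using the same pd-name disk (whether read-only access is enough is an assumption about your workload):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-readonly
spec:
  storageClassName: "test"
  capacity:
    storage: 10G
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: pd-name
    fsType: ext4
    readOnly: true   # attach the disk read-only so multiple nodes can mount it
```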

Related

How to copy s3 objects into EKS pods directly

I am working with Kubernetes pods and I have some pods running an application. Now I want to copy some files into my pods, but the files are in my S3 bucket. So I want to know an automated way to copy my S3 files directly into my pod's directory. I don't know how to do this.
If anyone knows how, please reply.
Thanks
There are multiple ways to achieve this; a few of them are:
Using kubectl (very basic, not automated): first download the S3 files locally, then copy them into the application pods. You can use the kubectl command line to copy from the host filesystem to the pod filesystem.
kubectl cp <local-file> <pod-name>:<file-path>
Using an init container (automated): write a small script that downloads the files from S3/your cloud provider and places them into a volume shared between the init container and the main container. That way, whenever your pod spins up, the init container prepares the volume before the actual container runs.
Sample Pod yaml:
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
    - name: myapp-container
      image: busybox:1.28
      command: ['sh', '-c', 'echo The app is running! && sleep 3600']
      volumeMounts:
        - mountPath: /tmp
          name: data
  initContainers:
    - name: init-downloader
      image: amazon/aws-cli   # busybox has no aws binary; use an image that ships the AWS CLI
      command: ['sh', '-c', 'aws s3 cp s3://xyz/ /tmp --recursive']
      volumeMounts:
        - mountPath: /tmp
          name: data
  volumes:
    - name: data
      emptyDir: {}
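Whatever image is used, the init container also needs AWS credentials to read the bucket. One common pattern, sketched here with a hypothetical Secret named aws-credentials that holds the usual AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY keys, is injecting them as environment variables:

```yaml
  initContainers:
    - name: init-downloader
      image: amazon/aws-cli
      command: ['sh', '-c', 'aws s3 cp s3://xyz/ /tmp --recursive']
      envFrom:
        - secretRef:
            name: aws-credentials   # hypothetical Secret with AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
      volumeMounts:
        - mountPath: /tmp
          name: data
```

On EKS, IAM Roles for Service Accounts (IRSA) is the cleaner alternative to storing static keys in a Secret.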
Using a CronJob (periodic updates): if the data in the volume needs to be updated periodically and does not depend on pod restarts, an additional utility can perform this action.
Either a script that downloads locally and performs a kubectl cp to all pods,
or an open-source project that does it for you, for example
https://github.com/maorfr/skbn - Skbn is a tool for copying files and directories between Kubernetes and cloud storage providers.
Sample Job Yaml
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: skbn
  name: skbn
spec:
  template:
    metadata:
      labels:
        app: skbn
      annotations:
        iam.amazonaws.com/role: skbn
    spec:
      serviceAccountName: skbn
      restartPolicy: Never   # required for Job pods
      containers:
        - name: skbn
          image: maorfr/skbn
          command: ["skbn"]
          args:
            - cp
            - --src
            - k8s://namespace/pod/container/path/to/copy/from
            - --dst
            - s3://bucket/path/to/copy/to
          env:
            - name: AWS_REGION
              value: us-east-1
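For the periodic case, the same pod template can be wrapped in a CronJob instead of a one-off Job. A sketch (the hourly schedule and the copy direction, S3 into the pod path, are assumptions to adapt):

```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: skbn-sync
spec:
  schedule: "0 * * * *"   # hourly; adjust to your needs
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: skbn
          restartPolicy: Never
          containers:
            - name: skbn
              image: maorfr/skbn
              command: ["skbn"]
              args:
                - cp
                - --src
                - s3://bucket/path/to/copy/from
                - --dst
                - k8s://namespace/pod/container/path/to/copy/to
```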
There can be multiple ways to implement this requirement, but mostly it will use the methods mentioned above.
Thanks
You can use an init container and CronJobs with the AWS CLI to copy the files from the S3 bucket.
To add further, you can also use a volume mount with datashim; it's a nice option:
apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: Dataset
metadata:
  name: example-dataset
spec:
  local:
    type: "COS"
    accessKeyID: "{AWS_ACCESS_KEY_ID}"
    secretAccessKey: "{AWS_SECRET_ACCESS_KEY}"
    endpoint: "{S3_SERVICE_URL}"
    bucket: "{BUCKET_NAME}"
    readonly: "true" # OPTIONAL, default is false
    region: "" # OPTIONAL
https://github.com/datashim-io/datashim
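If I read the datashim docs correctly, the operator then exposes the Dataset as a PVC of the same name, so the bucket can be mounted like any other volume. A sketch of the pod side:

```yaml
spec:
  containers:
    - name: myapp-container
      image: busybox
      volumeMounts:
        - mountPath: /data
          name: s3-data
  volumes:
    - name: s3-data
      persistentVolumeClaim:
        claimName: example-dataset   # PVC created by datashim for the Dataset above
```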
If you are looking for something FUSE-based, you should also check out https://github.com/s3fs-fuse/s3fs-fuse

How can I set `max_map_count` and `ulimit` for an Elasticsearch node when running in Kubernetes on EKS?

I am deploying Elasticsearch 7.10.1 to AWS EKS Fargate, but I got the error below when running it:
ERROR: [2] bootstrap checks failed
[1]: max number of threads [1024] for user [elasticsearch] is too low, increase to at least [4096]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
I found solutions for them: "max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]" and "Elasticsearch: Max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]".
But both require a change on the host machine. I am using EKS Fargate, which means I don't have access to the Kubernetes cluster's host machines. What else can I do to solve this issue?
Your best bet is to set these via privileged init containers within your Elasticsearch pod/deployment/statefulset, for example:
apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch-node
spec:
  initContainers:
    - name: increase-vm-max-map
      image: busybox
      command: ["sysctl", "-w", "vm.max_map_count=262144"]
      securityContext:
        privileged: true
    - name: increase-fd-ulimit
      image: busybox
      command: ["sh", "-c", "ulimit -n 65536"]
      securityContext:
        privileged: true
  containers:
    - name: elasticsearch-node
      ...
You could also do this through DaemonSets, although DaemonSets aren't well suited to one-time tasks (it's possible to hack around this).
But the init container approach guarantees that your expected settings are in effect right before the Elasticsearch container is launched.
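For reference, the DaemonSet route mentioned above could look roughly like this: one privileged pod per node applies the sysctl and then idles (a sketch; the image and naming are my own choices):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: sysctl-tuner
spec:
  selector:
    matchLabels:
      app: sysctl-tuner
  template:
    metadata:
      labels:
        app: sysctl-tuner
    spec:
      containers:
        - name: sysctl
          image: busybox
          # apply the setting on the node, then keep the pod alive
          command: ["sh", "-c", "sysctl -w vm.max_map_count=262144 && while true; do sleep 3600; done"]
          securityContext:
            privileged: true
```

Keep in mind that Fargate does not run DaemonSets (or privileged containers) at all, so this variant only applies to clusters with regular nodes.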

AWS EKS K8s Service and CronJob/Job on the same node

I have a k8s deployment which consists of a cron job (runs hourly), a service (runs the HTTP service) and a storage class (a PVC to store data, using gp2).
The issue I am seeing is that gp2 is ReadWriteOnce only.
I notice that when the cron job creates a job that lands on the same node as the service, it can mount the volume fine.
Is there something I can do in the service, deployment or cron job YAML to ensure the cron job and service always land on the same node? It can be any node, as long as the cron job goes to the same node as the service.
This isn't an issue in my lower environment, as we have very few nodes, but in our production environments, where we have more nodes, it is an issue.
In short, I want my cron job, which creates a job and then a pod, to run that pod on the same node my service's pod is on.
I know this isn't best practice, but our web service reads data from the PVC and serves it. The cron job pulls new data in from other sources and leaves it for the webserver.
Happy to hear other ideas/approaches.
Thanks
Focusing only on the part:
How can I schedule a workload (Pod, Job, Cronjob) on a specific set of Nodes
You can spawn your Cronjob/Job either with:
nodeSelector
nodeAffinity
nodeSelector
nodeSelector is the simplest recommended form of node selection constraint. nodeSelector is a field of PodSpec. It specifies a map of key-value pairs. For the pod to be eligible to run on a node, the node must have each of the indicated key-value pairs as labels (it can have additional labels as well). The most common usage is one key-value pair.
-- Kubernetes.io: Docs: Concepts: Scheduling eviction: Assign pod node: Node selector
An example could be the following (assuming that your node has the specific label referenced in .spec.jobTemplate.spec.template.spec.nodeSelector, e.g. added with kubectl label node <node-name> schedule=here):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:        # <-- IMPORTANT
            schedule: "here"   # <-- IMPORTANT
          containers:
            - name: hello
              image: busybox
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Running the above manifest will schedule your Pod (CronJob) on a node that has a schedule=here label:
$ kubectl get pods -o wide
NAME                     READY   STATUS      RESTARTS   AGE     IP          NODE        NOMINATED NODE   READINESS GATES
hello-1616323740-mqdmq   0/1     Completed   0          2m33s   10.4.2.67   node-ffb5   <none>           <none>
hello-1616323800-wv98r   0/1     Completed   0          93s     10.4.2.68   node-ffb5   <none>           <none>
hello-1616323860-66vfj   0/1     Completed   0          32s     10.4.2.69   node-ffb5   <none>           <none>
nodeAffinity
Node affinity is conceptually similar to nodeSelector -- it allows you to constrain which nodes your pod is eligible to be scheduled on, based on labels on the node.
There are currently two types of node affinity, called requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution. You can think of them as "hard" and "soft" respectively, in the sense that the former specifies rules that must be met for a pod to be scheduled onto a node (just like nodeSelector but using a more expressive syntax), while the latter specifies preferences that the scheduler will try to enforce but will not guarantee.
-- Kubernetes.io: Docs: Concepts: Scheduling eviction: Assign pod node: Node affinity
An example could be the following (assuming that your node has the specific label referenced in .spec.jobTemplate.spec.template.spec.affinity):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          # --- nodeAffinity part
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: schedule
                        operator: In
                        values:
                          - here
          # --- nodeAffinity part
          containers:
            - name: hello
              image: busybox
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
$ kubectl get pods -o wide
NAME                     READY   STATUS      RESTARTS   AGE     IP           NODE        NOMINATED NODE   READINESS GATES
hello-1616325840-5zkbk   0/1     Completed   0          2m14s   10.4.2.102   node-ffb5   <none>           <none>
hello-1616325900-lwndf   0/1     Completed   0          74s     10.4.2.103   node-ffb5   <none>           <none>
hello-1616325960-j9kz9   0/1     Completed   0          14s     10.4.2.104   node-ffb5   <none>           <none>
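Since the underlying goal in the question is to land on the same node as the service's pod rather than on one fixed node, inter-pod affinity may be the more direct tool. A sketch of the relevant part of the CronJob spec, assuming the service's pods carry a hypothetical app: webservice label:

```yaml
spec:
  jobTemplate:
    spec:
      template:
        spec:
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchLabels:
                      app: webservice   # assumed label on the service's pods
                  topologyKey: kubernetes.io/hostname   # "same node"
```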
Additional resources:
Kubernetes.io: Docs: Concepts: Overview: Working with objects: Labels
I'd reckon you could also take a look on this StackOverflow answer:
Stackoverflow.com: Questions: Kubernetes PVC with readwritemany on AWS

How to deploy dask-kubernetes adaptive cluster onto aws kubernetes instance

I am attempting to deploy an adaptive dask-kubernetes cluster to my AWS K8s instance (I want to use the kubeControl interface found here). It is unclear to me where and how I execute this code so that it is active on my existing cluster. In addition, I want an ingress rule so that another EC2 instance I have can connect to the cluster and execute code within an AWS VPC, to maintain security and network performance.
So far I have managed to get a functional k8s cluster running with dask and jupyterhub on it. I am using the sample helm chart found here, which references the docker image here. I can see this image does not even install dask-kubernetes. That said, I am able to connect to this cluster from my other EC2 instance using the exposed AWS DNS name and execute custom code, but this is not the Kubernetes-native dask cluster.
I have worked on modifying the deploy YAML for kubernetes, but it is unclear to me what I would need to change to have it use the proper Kubernetes cluster/schedulers. I do know I need to modify the docker image I am using to install dask-kubernetes, but this still does not help me. Below is the sample helm deploy chart I am using:
---
# nameOverride: dask
# fullnameOverride: dask

scheduler:
  name: scheduler
  image:
    repository: "daskdev/dask"
    tag: 2.3.0
    pullPolicy: IfNotPresent
    # See https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
    pullSecrets:
    #  - name: regcred
  replicas: 1
  # serviceType: "ClusterIP"
  # serviceType: "NodePort"
  serviceType: "LoadBalancer"
  servicePort: 8786
  resources: {}
  #  limits:
  #    cpu: 1.8
  #    memory: 6G
  #  requests:
  #    cpu: 1.8
  #    memory: 6G
  tolerations: []
  nodeSelector: {}
  affinity: {}

webUI:
  name: webui
  servicePort: 80

worker:
  name: worker
  image:
    repository: "daskdev/dask"
    tag: 2.3.0
    pullPolicy: IfNotPresent
    # dask_worker: "dask-cuda-worker"
    dask_worker: "dask-worker"
    pullSecrets:
    #  - name: regcred
  replicas: 3
  aptPackages: >-
  default_resources: # overwritten by resource limits if they exist
    cpu: 1
    memory: "4GiB"
  env:
  #  - name: EXTRA_CONDA_PACKAGES
  #    value: numba xarray -c conda-forge
  #  - name: EXTRA_PIP_PACKAGES
  #    value: s3fs dask-ml --upgrade
  resources: {}
  #  limits:
  #    cpu: 1
  #    memory: 3G
  #    nvidia.com/gpu: 1
  #  requests:
  #    cpu: 1
  #    memory: 3G
  #    nvidia.com/gpu: 1
  tolerations: []
  nodeSelector: {}
  affinity: {}

jupyter:
  name: jupyter
  enabled: true
  image:
    repository: "daskdev/dask-notebook"
    tag: 2.3.0
    pullPolicy: IfNotPresent
    pullSecrets:
    #  - name: regcred
  replicas: 1
  # serviceType: "ClusterIP"
  # serviceType: "NodePort"
  serviceType: "LoadBalancer"
  servicePort: 80
  # This hash corresponds to the password 'dask'
  password: 'sha1:aae8550c0a44:9507d45e087d5ee481a5ce9f4f16f37a0867318c'
  env:
  #  - name: EXTRA_CONDA_PACKAGES
  #    value: "numba xarray -c conda-forge"
  #  - name: EXTRA_PIP_PACKAGES
  #    value: "s3fs dask-ml --upgrade"
  resources: {}
  #  limits:
  #    cpu: 2
  #    memory: 6G
  #  requests:
  #    cpu: 2
  #    memory: 6G
  tolerations: []
  nodeSelector: {}
  affinity: {}
To run a Dask cluster on Kubernetes there are three recommended approaches. Each of them requires an existing Kubernetes cluster with credentials correctly configured (kubectl works locally).
Dask Helm Chart
You can deploy a standalone Dask cluster using the Dask helm chart.
helm repo add dask https://helm.dask.org/
helm repo update
helm install --name my-release dask/dask
Note that this is not an adaptive cluster but you can scale it by modifying the size of the deployment via kubectl.
kubectl scale deployment dask-worker --replicas=10
Helm Chart Documentation
Python dask-kubernetes API
You can also use dask-kubernetes which is a Python library for creating ad-hoc clusters on the fly.
pip install dask-kubernetes
from dask_kubernetes import KubeCluster
cluster = KubeCluster()
cluster.scale(10) # specify number of nodes explicitly
cluster.adapt(minimum=1, maximum=100) # or dynamically scale based on current workload
This will create a Dask cluster from scratch and will tear it down when the cluster object is garbage collected (most likely on exit).
dask-kubernetes Documentation
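Depending on the dask-kubernetes version, KubeCluster() may need an explicit description of the worker pods, typically a worker-spec.yaml passed via KubeCluster.from_yaml('worker-spec.yaml'). A sketch (the image tag and resource numbers are assumptions to adapt):

```yaml
kind: Pod
metadata:
  labels:
    app: dask-worker
spec:
  restartPolicy: Never
  containers:
    - name: dask-worker
      image: daskdev/dask:2.3.0
      args: [dask-worker, --nthreads, '2', --memory-limit, 4GB, --death-timeout, '60']
      resources:
        limits:
          cpu: "2"
          memory: 4G
```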
Dask Gateway
Dask Gateway provides a secure, multi-tenant server for managing Dask clusters.
To get started on Kubernetes you need to create a Helm configuration file (config.yaml) with a gateway proxy token.
gateway:
  proxyToken: "<RANDOM TOKEN>"
Hint: You can generate a suitable token with openssl rand -hex 32.
Then install the chart.
helm repo add dask-gateway https://dask.org/dask-gateway-helm-repo/
helm repo update
helm install --values config.yaml my-release dask-gateway/dask-gateway
Dask Gateway Documentation

vsystem-vrep of vora at Waiting: CrashLoopBackOff

Trying to set up Vora 2 on an AWS kops k8s cluster.
The pod vsystem-vrep cannot start.
In the log file on the node I see:
sudo cat vsystem-vrep_30.log
{"log":"2018-03-27 12:54:04.164349|+0000|INFO |Starting Kernel NFS Server||vrep|1|Start|server.go(41)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.164897827Z"}
{"log":"2018-03-27 12:54:04.164405|+0000|INFO |Creating directory /exports||dir-handler|1|makeDir|dir_handler.go(40)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.164919387Z"}
{"log":"2018-03-27 12:54:04.164423|+0000|INFO |Listening for private API on port 8738||vrep|18|func1|server.go(45)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.164923893Z"}
{"log":"2018-03-27 12:54:04.166992|+0000|INFO |Configuring Kernel NFS Server||vrep|1|configure|server.go(126)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.167109138Z"}
{"log":"2018-03-27 12:54:04.219089|+0000|INFO |Configuring Kernel NFS Server||vrep|1|configure|server.go(126)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.219235263Z"}
{"log":"2018-03-27 12:54:04.230256|+0000|FATAL|Error starting NFS server: RPC service for NFS server has not been correctly registered||vrep|1|main|server.go(51)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.230526346Z"}
How can I solve this?
When installing Vora 2.1 on AWS with kops, you first need to set up an RWX storage class, which is needed by vsystem (the default AWS storage class only supports ReadWriteOnce). During installation, you need to point to that storage class using the parameter --vsystem-storage-class. Additionally, the parameter --vsystem-load-nfs-modules needs to be set. I suspect the error happened because that last parameter was missing.
An example of how a call of install.sh would look:
./install.sh --accept-license --deployment-type=cloud --namespace=xxx \
  --docker-registry=123456789.dkr.ecr.us-west-1.amazonaws.com \
  --vora-admin-username=xxx --vora-admin-password=xxx \
  --cert-domain=my.host.domain.com --interactive-security-configuration=no \
  --vsystem-storage-class=aws-efs --vsystem-load-nfs-modules
An RWX storage class can, for example, be created as follows:
Create an EFS file system in the same region as the kops cluster - see https://us-west-2.console.aws.amazon.com/efs/home?region=us-west-2#/filesystems
Create the file system
Select the VPC of the kops cluster
Add the kops master and worker security groups to the mount target
Optionally give it a name (e.g. the same as your kops cluster, so you know what it is used for)
Use the default options for the rest
Once created, note the DNS name (similar to fs-1234e567.efs.us-west-2.amazonaws.com).
Create a persistent volume and storage class for Vora.
E.g. use YAML files similar to those below and point them to the newly created EFS file system.
$ cat create_pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: vsystem-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: aws-efs
  nfs:
    path: /
    server: fs-1234e567.efs.us-west-2.amazonaws.com
$ cat create_sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: xyz.com/aws-efs
kubectl create -f create_pv.yaml
kubectl create -f create_sc.yaml
# check if the newly created pv and sc exist
kubectl get pv
kubectl get storageclasses
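To verify the storage side before running the installer, you can create a test claim against the new class. A sketch (the claim name is arbitrary):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-test-claim   # hypothetical name, only for verification
spec:
  storageClassName: aws-efs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```

If kubectl get pvc shows it Bound to vsystem-pv, the class is wired up correctly for --vsystem-storage-class=aws-efs. Delete the test claim afterwards; with the Retain policy you may also need to clear the PV's claimRef so it becomes Available again for the installer.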