Difficulty using Kubernetes Persistent Volume Claims with Amazon EBS

I'm attempting to follow the instructions at https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/registry to add a private docker registry to Kubernetes, but the pod created by the rc isn't able to mount the persistent volume claim.
First I'm creating a volume on EBS like so:
aws ec2 create-volume --region us-west-1 --availability-zone us-west-1a --size 32 --volume-type gp2
(us-west-1a is also the availability zone that all of my kube minions are running in.)
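A quick way to confirm the volume is available and in the expected zone (using the same placeholder ID as below):
$ aws ec2 describe-volumes --region us-west-1 --volume-ids vol-XXXXXXXX \
    --query 'Volumes[0].{State:State,AZ:AvailabilityZone}'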
Then I create a persistent volume like so:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: kube-system-kube-registry-pv
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: vol-XXXXXXXX
    fsType: ext4
And a claim on the persistent volume like so:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: kube-registry-pvc
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
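For what it's worth, whether the claim actually binds to the volume can be checked with:
$ kubectl get pv kube-system-kube-registry-pv
$ kubectl get pvc kube-registry-pvc --namespace=kube-system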
The replication controller is specified like so:
apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-registry-v0
  namespace: kube-system
  labels:
    k8s-app: kube-registry
    version: v0
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kube-registry
    version: v0
  template:
    metadata:
      labels:
        k8s-app: kube-registry
        version: v0
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
        - name: registry
          image: registry:2
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
          env:
            - name: REGISTRY_HTTP_ADDR
              value: :5000
            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
              value: /var/lib/registry
          volumeMounts:
            - name: image-store
              mountPath: /var/lib/registry
          ports:
            - containerPort: 5000
              name: registry
              protocol: TCP
      volumes:
        - name: image-store
          persistentVolumeClaim:
            claimName: kube-registry-pvc
When I create the rc, it successfully starts a pod, but the pod is unable to mount the volume:
$ kubectl describe po kube-registry --namespace=kube-system
...
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
1m 1m 1 {scheduler } Scheduled Successfully assigned kube-registry-v0-3jobf to XXXXXXXXXXXXXXX.us-west-1.compute.internal
22s 22s 1 {kubelet XXXXXXXXXXXXXXX.us-west-1.compute.internal} FailedMount Unable to mount volumes for pod "kube-registry-v0-3jobf_kube-system": Timeout waiting for volume state
22s 22s 1 {kubelet XXXXXXXXXXXXXXX.us-west-1.compute.internal} FailedSync Error syncing pod, skipping: Timeout waiting for volume state
I'm able to successfully mount EBS volumes if I don't use persistent volumes and persistent volume claims. The following works without error, for example:
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
    - image: gcr.io/google_containers/test-webserver
      name: test-container
      volumeMounts:
        - mountPath: /test-ebs
          name: test-volume
  volumes:
    - name: test-volume
      awsElasticBlockStore:
        volumeID: vol-XXXXXXXX
        fsType: ext4
My two questions are:
Does anyone know what might be going wrong and how to fix it?
In general, where can I look for more details on errors like these? I haven't been able to find more detailed log messages anywhere, and "Unable to mount volumes...Timeout waiting for volume state" isn't terribly helpful.

I think I was likely running into https://github.com/kubernetes/kubernetes/issues/15073. (If I create a new EBS volume, I first get a different failure; then, after the pod has been killed, if I try to re-create the rc I get the failure I mentioned in my question.)
Also, for anyone else wondering where to look for logs: /var/log/syslog and /var/log/containers/XXX on the kubelet's node were where I ended up having to look.
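On systemd-based nodes, the same details can usually be pulled without hunting for files:
# Follow the kubelet's log on the node the pod was scheduled to:
journalctl -u kubelet -f | grep -i volume
# And watch the attach/mount events at the cluster level:
kubectl get events --namespace=kube-system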

Related

Unable to attach or mount volumes : timed out waiting for the condition

While mounting my EBS volume in the Kubernetes cluster I was getting this error:
Warning FailedMount 64s kubelet Unable to attach or mount volumes: unmounted volumes=[ebs-volume], unattached volumes=[ebs-volume kube-api-access-rq86p]: timed out waiting for the condition
Below are my SC, PV, PVC, and Deployment files
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
volumeBindingMode: Immediate
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: ebs-pv
  labels:
    type: ebs-pv
spec:
  storageClassName: standard
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: vol-0221ed06914dbc8fd
    fsType: ext4
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ebs-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: gitea
  name: gitea
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gitea
  template:
    metadata:
      labels:
        app: gitea
    spec:
      volumes:
        - name: ebs-volume
          persistentVolumeClaim:
            claimName: ebs-pvc
      containers:
        - image: gitea/gitea:latest
          name: gitea
          volumeMounts:
            - mountPath: "/data"
              name: ebs-volume
These are my PV and PVC, which I believe are bound correctly:
NAME                      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
persistentvolume/ebs-pv   1Gi        RWO            Retain           Bound    default/ebs-pvc   standard                18m
NAME                            STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/ebs-pvc   Bound    ebs-pv   1Gi        RWO            standard       18m
This is my storage class
NAME       PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
standard   kubernetes.io/aws-ebs   Retain          Immediate           false                  145m
This is my pod description
Name:           gitea-bb86dd6b8-6264h
Namespace:      default
Priority:       0
Node:           worker01/172.31.91.105
Start Time:     Fri, 04 Feb 2022 12:36:15 +0000
Labels:         app=gitea
                pod-template-hash=bb86dd6b8
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/gitea-bb86dd6b8
Containers:
  gitea:
    Container ID:
    Image:          gitea/gitea:latest
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data from ebs-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rq86p (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  ebs-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  ebs-pvc
    ReadOnly:   false
  kube-api-access-rq86p:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    20m                  default-scheduler  Successfully assigned default/gitea-bb86dd6b8-6264h to worker01
  Warning  FailedMount  4m47s (x2 over 16m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[ebs-volume], unattached volumes=[kube-api-access-rq86p ebs-volume]: timed out waiting for the condition
  Warning  FailedMount  19s (x7 over 18m)    kubelet            Unable to attach or mount volumes: unmounted volumes=[ebs-volume], unattached volumes=[ebs-volume kube-api-access-rq86p]: timed out waiting for the condition
This is my block-device listing; the last entry (xvdf) is the EBS volume, which I have attached to the master node where I am performing these operations right now...
NAME      FSTYPE     LABEL            UUID                                  MOUNTPOINT
loop0     squashfs                                                          /snap/core18/2253
loop1     squashfs                                                          /snap/snapd/14066
loop2     squashfs                                                          /snap/amazon-ssm-agent/4046
xvda
└─xvda1   ext4       cloudimg-rootfs  c1ce24a2-4987-4450-ae15-62eb028ff1cd  /
xvdf      ext4                        36609bbf-3248-41f1-84c3-777eb1d6f364
I created the cluster manually: 2 worker nodes and 1 master node, all running on AWS Ubuntu 18 instances.
Below are the commands which I have used to create the EBS volume.
aws ec2 create-volume --availability-zone=us-east-1c --size=10 --volume-type=gp2
aws ec2 attach-volume --device /dev/xvdf --instance-id <MASTER INSTANCE ID> --volume-id <MY VOLUME ID>
sudo mkfs -t ext4 /dev/xvdf
After this the volume was successfully created, attached, and formatted, so I don't think the problem lies in that part.
One thing I have not done, and I don't know whether it is necessary, is the following:
The cluster also needs to have the flag --cloud-provider=aws enabled on the kubelet, api-server, and the controller-manager during the cluster's creation
I found this on a blog, but by that point my cluster was already set up, so I didn't do it. If that is the problem, please notify me, and please give some guidance on how to do it.
I have used Flannel as my network plugin while creating the cluster.
I don't think I left out any information, but if there is something additional you want to know, please ask.
Thank you in advance!
This is my block-device listing; the last entry (xvdf) is the EBS volume, which I have attached to the master node...
A pod that wants to mount this volume must run on the node to which the EBS volume is currently attached. Given the scenario you described, the volume is currently attached to your Ubuntu-based master node, so a pod has to run on that node in order to mount it. Otherwise, you need to release the volume from the master node (detach it from the underlying EC2 instance) and re-deploy your PVC/PV/Pod so that they can settle on a worker node instead of the master node.
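A minimal sketch of that release step with the AWS CLI, reusing the placeholder ID from the question:
# Unmount the filesystem on the master first, then detach the volume so the
# Kubernetes attach/detach controller is free to attach it to a worker.
sudo umount /dev/xvdf   # only if it is currently mounted
aws ec2 detach-volume --volume-id <MY VOLUME ID>
aws ec2 wait volume-available --volume-ids <MY VOLUME ID>
Note that with the in-tree awsElasticBlockStore plugin, Kubernetes attaches the volume (and formats it if it carries no filesystem) on its own, so pre-attaching it to a node is unnecessary.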

AWS EBS Volume with kubernetes issue

Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/2e47e8b4-4755-46d6-9bc4-461ea02a6cb9/volumes/kubernetes.io~aws-ebs/pv --scope -- mount -o bind /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-2a/vol-011d7bb42da888b82 /var/lib/kubelet/pods/2e47e8b4-4755-46d6-9bc4-461ea02a6cb9/volumes/kubernetes.io~aws-ebs/pv
Output: Running scope as unit run-20000.scope.
mount: /var/lib/kubelet/pods/2e47e8b4-4755-46d6-9bc4-461ea02a6cb9/volumes/kubernetes.io~aws-ebs/pv: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-2a/vol-011d7bb42da888b82 does not exist.
Warning FailedAttachVolume 7s (x6 over 23s) attachdetach-controller AttachVolume.NewAttacher failed for volume "pv" : Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Warning FailedMount 7s kubelet, ip-172-31-3-191.us-east-2.compute.internal MountVolume.SetUp failed for volume "pv" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/2e47e8b4-4755-46d6-9bc4-461ea02a6cb9/volumes/kubernetes.io~aws-ebs/pv --scope -- mount -o bind /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-2a/vol-011d7bb42da888b82 /var/lib/kubelet/pods/2e47e8b4-4755-46d6-9bc4-461ea02a6cb9/volumes/kubernetes.io~aws-ebs/pv
Output: Running scope as unit run-20058.scope.
mount: /var/lib/kubelet/pods/2e47e8b4-4755-46d6-9bc4-461ea02a6cb9/volumes/kubernetes.io~aws-ebs/pv: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-2a/vol-011d7bb42da888b82 does not exist.
I have a Kubernetes cluster running in the same availability zone where the EBS volume is available.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-retain
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: asvignesh
  name: _PVC_
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp2-retain
  volumeMode: Filesystem
  volumeName: _PV_
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: _PV_
spec:
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    fsType: xfs
    volumeID: aws://us-east-1a/vol-xxxxxxxxx
  capacity:
    storage: 10Gi
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-retain
  volumeMode: Filesystem
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: asvignesh
spec:
  ports:
    - port: 3306
      targetPort: 3306
  selector:
    app: asvignesh
    tier: mysql
  clusterIP: None
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  labels:
    app: asvignesh
spec:
  selector:
    matchLabels:
      app: asvignesh
      tier: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: asvignesh
        tier: mysql
    spec:
      containers:
        - image: mysql:5.6
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: _PVC_
Are you running the cluster on managed K8s or on bare metal? Because:
On the Kubernetes side of the house, you'll need to make sure that the --cloud-provider=aws command-line flag is present for the API server, controller manager, and every kubelet in the cluster.
Document to refer to: https://blog.scottlowe.org/2018/09/28/setting-up-the-kubernetes-aws-cloud-provider/
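If the cluster was built by hand with kubeadm, a minimal sketch of adding the flag after the fact (file paths are common defaults, not from the thread; adjust per distro):
# Kubelet: set the flag via a systemd drop-in on every node, then restart.
cat <<'EOF' | sudo tee /etc/systemd/system/kubelet.service.d/20-aws.conf
[Service]
Environment="KUBELET_EXTRA_ARGS=--cloud-provider=aws"
EOF
sudo systemctl daemon-reload && sudo systemctl restart kubelet
# API server and controller manager: add "--cloud-provider=aws" to the
# command lists in /etc/kubernetes/manifests/kube-apiserver.yaml and
# kube-controller-manager.yaml; the kubelet restarts those static pods
# automatically when the manifests change.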
Example YAML
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
Ref: https://faun.pub/mysql-pod-with-persistent-ebs-volume-in-eks-150af369ff94

AWS EKS "0/3 nodes are available: 3 Too many pods" Error

I have a node group of 3 t3a.micro instances, and I have installed the EBS CSI provisioner and a storage class.
I want to deploy a MySQL StatefulSet.
This is my manifest:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-statefulset
spec:
  serviceName: mysql-service
  replicas: 1
  selector:
    matchLabels:
      app: mysql-pod
  template:
    metadata:
      labels:
        app: mysql-pod
    spec:
      containers:
        - name: mysql
          image: mysql
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: pvc-test
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: pvc-test
      spec:
        storageClassName: gp2-retain
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
Warning FailedScheduling 20s (x16 over 20m) default-scheduler 0/3 nodes are available: 3 Too many pods.
As mentioned in AWS EKS - Only 2 pod can be launched - Too many pods error
According to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI, t3a.micro type has
Maximum network interfaces: 2
Private IPv4 addresses per interface: 2
IPv6 addresses per interface: 2
But EKS deploys DaemonSets such as CoreDNS and kube-proxy, so some IP addresses on each node are already allocated.
A possible fix is simply to upgrade your instances to a more capable type.
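To see the cap in effect, compare each node's advertised pod capacity against the ENI formula, ENIs * (IPv4 addresses per ENI - 1) + 2, which gives 2 * (2 - 1) + 2 = 4 pods for a t3a.micro:
# Each node advertises its pod budget; DaemonSet pods (kube-proxy, CoreDNS,
# the VPC CNI itself) consume slots from it before your workloads schedule.
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods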

How to remove sub path folder of pvc when call delete pods, ... with client-go k8s

I am using the k8s/client-go library to control and develop my application (https://github.com/kubernetes/client-go).
I have an issue when using sub-paths of a persistent volume claim.
For example, I have two containers in a pod and mount each container's data to the two subpaths ORG1/DIR1 and ORG1/DIR2 on a persistent volume claim (an EFS file system); details below:
apiVersion: v1
kind: Pod
metadata:
  name: my-lamp-site
spec:
  containers:
    - name: mysql
      image: mysql
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: "rootpasswd"
      volumeMounts:
        - mountPath: /var/lib/mysql
          name: site-data
          subPath: ORG1/DIR1
    - name: php
      image: php:7.0-apache
      volumeMounts:
        - mountPath: /var/www/html
          name: site-data
          subPath: ORG1/DIR2
  volumes:
    - name: site-data
      persistentVolumeClaim:
        claimName: hpc-vinhha-test
When I delete this pod, Kubernetes currently only deletes the pod itself; the core library does not delete the pod's data on the persistent volume claim. So the data on the PVC becomes garbage and grows bigger and bigger.
I want to delete all data in the subpaths ORG1/DIR1 and ORG1/DIR2 when the pod is deleted.
This is the YAML of the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"efs-claim","namespace":"default"},"spec":{"accessModes":["ReadWriteMany"],"resources":{"requests":{"storage":"5Gi"}},"storageClassName":"efs-sc"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2020-07-10T04:02:51Z"
  finalizers:
    - kubernetes.io/pvc-protection
  name: efs-claim
  namespace: default
  resourceVersion: "887409"
  selfLink: /api/v1/namespaces/default/persistentvolumeclaims/efs-claim
  uid: ab66c2f7-744c-4d6f-a508-2bc90f0b1897
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: efs-sc
  volumeMode: Filesystem
  volumeName: efs-pv-shared
status:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 5Gi
  phase: Bound
So, can you help me with this problem? I'm a newbie to k8s and AWS EFS, so I don't have much experience with them :(
Thanks so much.
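One workaround to consider, sketched under the assumption that a graceful pod deletion is acceptable: Kubernetes never cleans up subPath directories itself, but each container can clear its own directory from a preStop hook, which runs whenever the pod is deleted gracefully:
# Hypothetical addition to each container in the pod spec above; the path
# must match that container's mountPath (shown here for the mysql container).
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "rm -rf /var/lib/mysql/*"]
An alternative is to delete the directories from your client-go code after the pod deletion completes, but that requires the process running your code to have the EFS volume mounted as well.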

Google Cloud, Kubernetes and Volumes

I'm new to GCE and K8s and I'm trying to figure out my first deployment, but I get an error with my volumes:
Failed to attach volume "pv0001" on node "xxxxx" with: GCE persistent disk not found: diskName="pd-disk-1" zone="europe-west1-b"
Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "xxx". list of unattached/unmounted volumes=[registrator-claim0]
This is my storage yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    fsType: ext4
    pdName: pd-disk-1
This is my Claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  name: registrator-claim0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
status: {}
This is my Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: consul
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        service: consul
    spec:
      restartPolicy: Always
      containers:
        - name: consul
          image: eu.gcr.io/xxxx/consul
          ports:
            - containerPort: 8300
              protocol: TCP
            - containerPort: 8400
              protocol: TCP
            - containerPort: 8500
              protocol: TCP
            - containerPort: 53
              protocol: UDP
          env:
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          args:
            - -server
            - -bootstrap
            - -advertise=$(MY_POD_IP)
        - name: registrator
          args:
            - -internal
            - -ip=192.168.99.101
            - consul://localhost:8500
          image: eu.gcr.io/xxxx/registrator
          volumeMounts:
            - mountPath: /tmp/docker.sock
              name: registrator-claim0
      volumes:
        - name: registrator-claim0
          persistentVolumeClaim:
            claimName: registrator-claim0
status: {}
What am I doing wrong? Figuring out K8s and GCE isn't that easy. These errors are not exactly helping. Hope someone can help me.
You have to create the actual disk before you define the PV; this can be done with something like:
# make sure you're in the right zone
$ gcloud config set compute/zone europe-west1-b
# create the disk
$ gcloud compute disks create pd-disk-1 --size 10GB
Once that's available you can create the PV and the PVC.
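A quick way to confirm the disk exists in the zone the PV expects, using the names from above:
$ gcloud compute disks describe pd-disk-1 --zone europe-west1-b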