I want to run a StatefulSet in AWS EKS Fargate and attach an EFS volume to it, but I am getting errors mounting the volume in the pod.
These are the errors I am getting from describe pod.
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal LoggingEnabled 114s fargate-scheduler Successfully enabled logging for pod
Normal Scheduled 75s fargate-scheduler Successfully assigned default/app1 to fargate-10.0.2.123
Warning FailedMount 43s (x7 over 75s) kubelet MountVolume.SetUp failed for volume "efs-pv" : rpc error: code = Internal desc = Could not mount "fs-xxxxxxxxxxxxxxxxx:/" at "/var/lib/kubelet/pods/b799a6d6-fe9e-4f80-ac2d-8ccf8834d7c4/volumes/kubernetes.io~csi/efs-pv/mount": mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t efs -o tls fs-xxxxxxxxxxxxxxxxx:/ /var/lib/kubelet/pods/b799a6d6-fe9e-4f80-ac2d-8ccf8834d7c4/volumes/kubernetes.io~csi/efs-pv/mount
Output: Failed to resolve "fs-xxxxxxxxxxxxxxxxx.efs.us-east-1.amazonaws.com" - check that your file system ID is correct, and ensure that the VPC has an EFS mount target for this file system ID.
See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail.
Attempting to lookup mount target ip address using botocore. Failed to import necessary dependency botocore, please install botocore first.
Warning: config file does not have fall_back_to_mount_target_ip_address_enabled item in section mount.. You should be able to find a new config file in the same folder as current config file /etc/amazon/efs/efs-utils.conf. Consider update the new config file to latest config file. Use the default value [fall_back_to_mount_target_ip_address_enabled = True].
If anyone has set up an EFS volume with an EKS Fargate cluster, please have a look. I have been stuck on this for a long time.
What I have set up
Created an EFS volume
CSIDriver Object
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: efs.csi.aws.com
spec:
  attachRequired: false
Storage Class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <EFS filesystem ID>
PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
Pod Configuration
apiVersion: v1
kind: Pod
metadata:
  name: app1
spec:
  containers:
    - name: app1
      image: busybox
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out1.txt; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-claim
I had the same question as you literally a day later and have been working on the error nonstop since then! Did you check to make sure your VPC had DNS hostnames enabled? That is what fixed it for me.
Just an FYI: if you are using Fargate and you want to change this, I had to go as far as deleting the entire cluster after changing the DNS hostnames flag in order for the change to propagate. I'm not sure if you're familiar with the DHCP options of a normal EC2 instance, but usually it takes something like renewing the ipconfig to force the flag to propagate; since Fargate is a managed system, I was unable to find a way to do so from the node itself. I have created another post here attempting to answer that question.
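In case it helps, the VPC attribute can be checked and flipped from the AWS CLI; a rough sketch, where vpc-xxxxxxxx is a placeholder for your cluster's VPC ID:

$ aws ec2 describe-vpc-attribute --vpc-id vpc-xxxxxxxx --attribute enableDnsHostnames
$ aws ec2 modify-vpc-attribute --vpc-id vpc-xxxxxxxx --enable-dns-hostnames "{\"Value\":true}"

As noted above, though, I still had to recreate the cluster before Fargate pods picked up the change.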
Another quick FYI: if your pod execution role doesn't have access to EFS, you will need to add a policy that allows access (I just used the default AmazonElasticFileSystemFullAccess managed policy for the time being to get things working). Once again, you will have to relaunch your whole cluster for this role change to propagate, if you haven't already done so!
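Attaching that managed policy to the pod execution role from the CLI looks roughly like this (a sketch; my-fargate-pod-execution-role is a placeholder for whatever role name your Fargate profile actually uses):

$ aws iam attach-role-policy \
    --role-name my-fargate-pod-execution-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonElasticFileSystemFullAccess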
I am trying to mount a PV based on an existing GCP persistent disk onto my pod as read-only.
My configs look like this (parts of them are masked for confidentiality):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-and-models-pv
  namespace: lipsync
spec:
  storageClassName: ""
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany
  claimRef:
    namespace: lipsync
    name: data-and-models-pvc
  # csi:
  #   driver: pd.csi.storage.gke.io
  #   volumeHandle: projects/***/zones/***/disks/g-lipsync-data-and-models
  gcePersistentDisk:
    pdName: g-lipsync-data-and-models
    fsType: ext4
    readOnly: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-and-models-pvc
  namespace: lipsync
spec:
  storageClassName: ""
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 10Gi
And then, in the pod definition:
volumeMounts:
  - mountPath: /app/models
    subPath: models
    name: data-and-models-v
    readOnly: true
  - [...]
volumes:
  - name: data-and-models-v
    persistentVolumeClaim:
      claimName: data-and-models-pvc
      readOnly: true
However, when I do kubectl apply, the pod never gets created, and I am met with this event:
0s Warning FailedMount pod/lipsync-api-67c784dfb7-4tlln MountVolume.MountDevice failed for volume "data-and-models-pv" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-g-lipsync-data-and-models") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/data-and-models-pv/globalmount") with fstype ("ext4") and options ([]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/by-id/google-g-lipsync-data-and-models /var/lib/kubelet/plugins/kubernetes.io/csi/pv/data-and-models-pv/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/data-and-models-pv/globalmount: cannot mount /dev/sdb read-only.
If I manually SSH into the VM that acts as the node backing the pod, I can see that adding noload to the mount options lets me mount the disk successfully:
sudo mount -o ro,noload,defaults /dev/sdb .
But I have not found a way to make Kubernetes apply this extra mount option.
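The closest thing I can see is the PV's spec.mountOptions field; a sketch of what I mean (I have not been able to confirm whether the GCE PD driver actually passes noload through):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-and-models-pv
spec:
  storageClassName: ""
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany
  mountOptions:
    - ro
    - noload
  gcePersistentDisk:
    pdName: g-lipsync-data-and-models
    fsType: ext4
    readOnly: true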
How can I successfully make GKE mount this disk to my pod?
I had the same problem with readOnly mode today. Hope my experience gives you some ideas.
I have two snapshots, one created by someone else, and the other created by me. The files on both disks are identical.
But when I mounted the disk provisioned from the snapshot that the other person created, readOnly worked fine, while mine just kept hitting the same error you have.
So I logged into the pod without readOnly mode to see the difference between the two disks. The only difference I found is that the good one is owned by userId 1003 and groupId 1004, while mine is root:root. So I ran chown 1003:1004 -R ./ on my disk, used it to create a new snapshot, and then it worked smoothly...
I haven't found the reason yet, but it worked at least. I'll let you know if I figure it out.
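In case it helps, the rough shape of that workflow, with placeholder disk/snapshot names and zone (my-disk, my-disk-snap, my-disk-fixed, us-central1-a), was:

# wherever the disk is mounted read-write (inside a pod in my case, or a throwaway VM):
chown 1003:1004 -R ./

# then snapshot the corrected disk and build a new disk from that snapshot:
$ gcloud compute disks snapshot my-disk --zone=us-central1-a --snapshot-names=my-disk-snap
$ gcloud compute disks create my-disk-fixed --zone=us-central1-a --source-snapshot=my-disk-snap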
I have an AWS EFS file system created, and I have also created an access point: /ap
I want to mount that access point in a Kubernetes deployment, but it's failing, although when I use / it works.
These are the manifests I am using.
PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Mi
  mountOptions:
    - rsize=1048576
    - wsize=1048576
    - hard
    - timeo=600
    - retrans=2
    - noresvport
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /ap
    server: fs-xxx.efs.region.amazonaws.com
  claimRef:
    name: efs-pvc
    namespace: product
PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-pvc
  namespace: product
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
And I receive this upon starting a deployment.
Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data default-token-qwclp]: timed out waiting for the condition
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/97e66236-cb08-4bee-82a7-6f6cf1db9353/volumes/kubernetes.io~nfs/efs-pv --scope -- mount -t nfs -o hard,noresvport,retrans=2,rsize=1048576,timeo=600,wsize=1048576 fs-559a4f0e.efs.eu-central-1.amazonaws.com:/atc /var/lib/kubelet/pods/97e66236-cb08-4bee-82a7-6f6cf1db9353/volumes/kubernetes.io~nfs/efs-pv
Output: Running scope as unit run-4806.scope.
mount.nfs: Connection timed out
Am I missing something? Or should I use the CSI driver instead?
I strongly suggest you follow these steps:
Step 1: Deploy the EFS CSI driver on your nodes
Link: https://github.com/kubernetes-sigs/aws-efs-csi-driver
Step 2: Create a new PV and PVC using this tutorial
Link: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/volume_path/README.md
Now, if you want to specify a path to a folder, you can follow this example:
Link: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/volume_path/specs/example.yaml
It worked for me.
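For the access point case specifically, the driver's examples show a volumeHandle that combines the file system ID and the access point ID. A rough sketch of a static PV built that way, where fs-xxxxxxxx and fsap-xxxxxxxx are placeholders and efs-sc is the StorageClass from the linked tutorial (double-check the exact volumeHandle format against the docs for the driver version you deploy):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 1Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-xxxxxxxx::fsap-xxxxxxxx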
I installed the EFS CSI driver and got their Static Provisioning example to work: I was able to start a pod that appended to a file on the EFS volume. I could delete the pod and start another one to inspect that file and confirm the data written by the first pod was still there. But what I actually need to do is mount the volume read-only, and I am having no luck there.
Note that after I successfully ran that example, I launched an EC2 instance and in it, I mounted the EFS filesystem, then added the data that my pods need to access in a read-only fashion. Then I unmounted the EFS filesystem and terminated the instance.
Using the configuration below, which is based on the Static Provisioning example referenced above, my pod does not start Running; it remains in ContainerCreating.
Storage class:
$ kubectl get sc efs-sc -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"efs-sc"},"provisioner":"efs.csi.aws.com"}
  creationTimestamp: "2020-01-12T05:36:13Z"
  name: efs-sc
  resourceVersion: "809880"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/efs-sc
  uid: 71ecce62-34fd-11ea-8a5f-124f4ee64e8d
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
Persistent Volume (this is the only PV in the cluster that uses the EFS Storage Class):
$ kubectl get pv efs-pv-ro -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"efs-pv-ro"},"spec":{"accessModes":["ReadOnlyMany"],"capacity":{"storage":"5Gi"},"csi":{"driver":"efs.csi.aws.com","volumeHandle":"fs-26120da7"},"persistentVolumeReclaimPolicy":"Retain","storageClassName":"efs-sc","volumeMode":"Filesystem"}}
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2020-01-12T05:36:59Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: efs-pv-ro
  resourceVersion: "810231"
  selfLink: /api/v1/persistentvolumes/efs-pv-ro
  uid: 8d54a80e-34fd-11ea-8a5f-124f4ee64e8d
spec:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: efs-claim-ro
    namespace: default
    resourceVersion: "810229"
    uid: e0498cae-34fd-11ea-8a5f-124f4ee64e8d
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-26120da7
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  volumeMode: Filesystem
status:
  phase: Bound
Persistent Volume Claim (this is the only PVC in the cluster attempting to use the EFS storage class):
$ kubectl get pvc efs-claim-ro -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"efs-claim-ro","namespace":"default"},"spec":{"accessModes":["ReadOnlyMany"],"resources":{"requests":{"storage":"5Gi"}},"storageClassName":"efs-sc"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2020-01-12T05:39:18Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: efs-claim-ro
  namespace: default
  resourceVersion: "810234"
  selfLink: /api/v1/namespaces/default/persistentvolumeclaims/efs-claim-ro
  uid: e0498cae-34fd-11ea-8a5f-124f4ee64e8d
spec:
  accessModes:
  - ReadOnlyMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: efs-sc
  volumeMode: Filesystem
  volumeName: efs-pv-ro
status:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 5Gi
  phase: Bound
And here is the Pod. It remains in ContainerCreating and does not switch to Running:
$ kubectl get pod efs-app -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"efs-app","namespace":"default"},"spec":{"containers":[{"args":["infinity"],"command":["sleep"],"image":"centos","name":"app","volumeMounts":[{"mountPath":"/data","name":"persistent-storage","subPath":"mmad"}]}],"volumes":[{"name":"persistent-storage","persistentVolumeClaim":{"claimName":"efs-claim-ro"}}]}}
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2020-01-12T06:07:08Z"
  name: efs-app
  namespace: default
  resourceVersion: "813420"
  selfLink: /api/v1/namespaces/default/pods/efs-app
  uid: c3b8421b-3501-11ea-b164-0a9483e894ed
spec:
  containers:
  - args:
    - infinity
    command:
    - sleep
    image: centos
    imagePullPolicy: Always
    name: app
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /data
      name: persistent-storage
      subPath: mmad
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-z97dh
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: ip-192-168-254-51.ec2.internal
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: efs-claim-ro
  - name: default-token-z97dh
    secret:
      defaultMode: 420
      secretName: default-token-z97dh
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-01-12T06:07:08Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-01-12T06:07:08Z"
    message: 'containers with unready status: [app]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-01-12T06:07:08Z"
    message: 'containers with unready status: [app]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-01-12T06:07:08Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: centos
    imageID: ""
    lastState: {}
    name: app
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: ContainerCreating
  hostIP: 192.168.254.51
  phase: Pending
  qosClass: BestEffort
  startTime: "2020-01-12T06:07:08Z"
I am not sure if subPath will work with this configuration or not, but the same problem happens whether or not subPath is in the Pod configuration.
The problem does seem to be with the volume. If I comment out the volumes and volumeMounts section, the pod runs.
It seems that the PVC has bound with the correct PV, but the pod is not starting.
I'm not seeing a clue in any of the output above, but maybe I'm missing something?
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.8", GitCommit:"211047e9a1922595eaa3a1127ed365e9299a6c23", GitTreeState:"clean", BuildDate:"2019-10-15T12:11:03Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.9-eks-c0eccc", GitCommit:"c0eccca51d7500bb03b2f163dd8d534ffeb2f7a2", GitTreeState:"clean", BuildDate:"2019-12-22T23:14:11Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
aws-efs-csi-driver version: v.0.2.0.
Note that one of the requirements is to have Golang installed in version 1.13.4+, but you have go1.12.12, so you have to update it. If you are upgrading from an older version of Go, you must first remove the existing version.
Take a look here: upgrading-golang.
This driver is supported on Kubernetes version 1.14 and later Amazon EKS clusters and worker nodes. Alpha features of the Amazon EFS CSI Driver are not supported on Amazon EKS clusters.
Regarding being unable to mount a read-only volume in a Kubernetes pod (using the EFS CSI driver in AWS EKS): try changing the access mode to:
accessModes:
  - ReadWriteMany
You can find more information here: efs-csi-driver.
Make sure that when creating the EFS filesystem, it is accessible from the Kubernetes cluster. This can be achieved by creating the filesystem inside the same VPC as the Kubernetes cluster or by using VPC peering.
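One quick way to sanity-check this (a sketch; fs-xxxxxxxx is a placeholder) is to list the file system's mount targets and confirm they sit in subnets and security groups reachable from the worker nodes:

$ aws efs describe-mount-targets --file-system-id fs-xxxxxxxx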
Static provisioning - the EFS filesystem needs to be created manually first; then it can be mounted inside a container as a persistent volume (PV) using the driver.
Mount options - mount options can be specified in the persistent volume (PV) to define how the volume should be mounted. Aside from normal mount options, you can also specify tls as a mount option to enable encryption in transit for the EFS filesystem.
Because Amazon EFS is an elastic file system, it does not enforce any file system capacity limits. The actual storage capacity value in persistent volumes and persistent volume claims is not used when creating the file system. However, since storage capacity is a required field in Kubernetes, you must specify a valid value, such as 5Gi in this example. This value does not limit the size of your Amazon EFS file system.
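Putting the mount options point into a manifest, a static PV with tls enabled might look like the sketch below (fs-xxxxxxxx is a placeholder file system ID; the rest mirrors the static provisioning example):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv-tls
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadOnlyMany
  mountOptions:
    - tls
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-xxxxxxxx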
I'm trying to create a Cassandra cluster in Kubernetes. I want to use awsElasticBlockStore to make the data persistent. As a result, I've written a YAML file like the following for the corresponding Replication Controller:
apiVersion: v1
kind: ReplicationController
metadata:
  name: cassandra-rc
spec:
  # Question: How can I do this?
  replicas: 2
  selector:
    name: cassandra
  template:
    metadata:
      labels:
        name: cassandra
    spec:
      containers:
        - resources:
            limits:
              cpu: 1.0
          image: cassandra:2.2.6
          name: cassandra
          ports:
            - containerPort: 7000
              name: comm
            - containerPort: 9042
              name: cql
            - containerPort: 9160
              name: thrift
          volumeMounts:
            - name: cassandra-persistent-storage
              mountPath: /cassandra_data
      volumes:
        - name: cassandra-persistent-storage
          awsElasticBlockStore:
            volumeID: aws://ap-northeast-1c/vol-xxxxxxxx
            fsType: ext4
However, only one pod can be properly launched with this configuration.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cassandra-rc-xxxxx 0/1 ContainerCreating 0 5m
cassandra-rc-yyyyy 1/1 Running 0 5m
When I run $ kubectl describe pod cassandra-rc-xxxxx, I see an error like following:
Error syncing pod, skipping: Could not attach EBS Disk "aws://ap-northeast-1c/vol-xxxxxxxx": Error attaching EBS volume: VolumeInUse: vol-xxxxxxxx is already attached to an instance
It's understandable because an EBS volume can be mounted on only one node at a time. So only one pod can successfully mount the volume and boot up, while the others just fail.
Is there any good solution for this? Do I need to create multiple Replication Controllers for each pod?
You are correct: one EBS volume can only be mounted on a single EC2 instance at a given time. To solve this, you have the following options:
Use multiple EBS volumes with multiple Replication Controllers
Use a distributed file system (e.g. Gluster) and avoid EBS issue
Follow along with PetSet (https://github.com/kubernetes/kubernetes/issues/260)
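If you go the PetSet route (it later shipped as StatefulSet), the key piece is volumeClaimTemplates, which gives each replica its own dynamically provisioned EBS-backed claim instead of sharing one volume. A rough sketch, assuming the cluster has a default EBS-backed StorageClass:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 2
  selector:
    matchLabels:
      name: cassandra
  template:
    metadata:
      labels:
        name: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:2.2.6
          volumeMounts:
            - name: cassandra-persistent-storage
              mountPath: /cassandra_data
  volumeClaimTemplates:
    - metadata:
        name: cassandra-persistent-storage
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi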