I have an AWS EFS file system created, and I have also created an access point: /ap
I want to mount that access point into a Kubernetes deployment, but it fails, although when I use / it works.
These are the manifests I am using.
PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Mi
  mountOptions:
    - rsize=1048576
    - wsize=1048576
    - hard
    - timeo=600
    - retrans=2
    - noresvport
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /ap
    server: fs-xxx.efs.region.amazonaws.com
  claimRef:
    name: efs-pvc
    namespace: product
PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-pvc
  namespace: product
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
And I receive this upon starting a deployment.
Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data default-token-qwclp]: timed out waiting for the condition
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/97e66236-cb08-4bee-82a7-6f6cf1db9353/volumes/kubernetes.io~nfs/efs-pv --scope -- mount -t nfs -o hard,noresvport,retrans=2,rsize=1048576,timeo=600,wsize=1048576 fs-559a4f0e.efs.eu-central-1.amazonaws.com:/atc /var/lib/kubelet/pods/97e66236-cb08-4bee-82a7-6f6cf1db9353/volumes/kubernetes.io~nfs/efs-pv
Output: Running scope as unit run-4806.scope.
mount.nfs: Connection timed out
Am I missing something? Or should I use CSI driver instead?
I strongly suggest you follow these steps:
Step 1: Deploy the EFS CSI driver on your nodes.
Link: https://github.com/kubernetes-sigs/aws-efs-csi-driver
Step 2: Create a new PV and PVC using this tutorial.
Link: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/volume_path/README.md
Now, if you want to specify a path to a folder, you can follow this example:
Link: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/volume_path/specs/example.yaml
It worked for me.
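Since the question mounts an access point rather than a sub-path, here is a minimal sketch of a CSI-backed PV that targets an access point once the driver is installed; the file system ID (fs-xxx) and access point ID (fsap-xxx) are placeholders to replace with your own:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-ap-pv
spec:
  capacity:
    storage: 1Mi            # EFS ignores the size, but the API requires a value
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  csi:
    driver: efs.csi.aws.com
    # "<filesystem-id>::<access-point-id>" tells the driver to mount through the access point
    volumeHandle: fs-xxx::fsap-xxx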
Related
I want to run a StatefulSet in AWS EKS Fargate and attach an EFS volume to it, but I am getting errors when mounting the volume in the pod.
These are the errors I am getting from describe pod.
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal LoggingEnabled 114s fargate-scheduler Successfully enabled logging for pod
Normal Scheduled 75s fargate-scheduler Successfully assigned default/app1 to fargate-10.0.2.123
Warning FailedMount 43s (x7 over 75s) kubelet MountVolume.SetUp failed for volume "efs-pv" : rpc error: code = Internal desc = Could not mount "fs-xxxxxxxxxxxxxxxxx:/" at "/var/lib/kubelet/pods/b799a6d6-fe9e-4f80-ac2d-8ccf8834d7c4/volumes/kubernetes.io~csi/efs-pv/mount": mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t efs -o tls fs-xxxxxxxxxxxxxxxxx:/ /var/lib/kubelet/pods/b799a6d6-fe9e-4f80-ac2d-8ccf8834d7c4/volumes/kubernetes.io~csi/efs-pv/mount
Output: Failed to resolve "fs-xxxxxxxxxxxxxxxxx.efs.us-east-1.amazonaws.com" - check that your file system ID is correct, and ensure that the VPC has an EFS mount target for this file system ID.
See https://docs.aws.amazon.com/console/efs/mount-dns-name for more detail.
Attempting to lookup mount target ip address using botocore. Failed to import necessary dependency botocore, please install botocore first.
Warning: config file does not have fall_back_to_mount_target_ip_address_enabled item in section mount.. You should be able to find a new config file in the same folder as current config file /etc/amazon/efs/efs-utils.conf. Consider update the new config file to latest config file. Use the default value [fall_back_to_mount_target_ip_address_enabled = True].
If anyone has set up an EFS volume with an EKS Fargate cluster, please take a look at this. I have been stuck on it for a long time.
What I have set up
Created an EFS volume
CSIDriver Object
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: efs.csi.aws.com
spec:
  attachRequired: false
Storage Class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <EFS filesystem ID>
PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
Pod Configuration
apiVersion: v1
kind: Pod
metadata:
  name: app1
spec:
  containers:
    - name: app1
      image: busybox
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out1.txt; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-claim
I had the same question as you, literally a day later, and have been working on the error nonstop since then! Did you check to make sure your VPC has DNS hostnames enabled? That is what fixed it for me.
Just an FYI: if you are using Fargate and you want to change this, I had to go as far as deleting the entire cluster after changing the DNS hostnames flag in order for the change to propagate. If you're familiar with the DHCP options of a normal EC2 instance, it usually takes something like renewing the IP configuration to force the flag to propagate, but since Fargate is a managed system, I was unable to find a way to do that from the node itself. I have created another post here attempting to answer that question.
Another quick FYI: if your pod execution role doesn't have access to EFS, you will need to attach a policy that allows access (I just used the default AmazonElasticFileSystemFullAccess managed policy for the time being to get things working). Once again, you will have to relaunch your whole cluster for this role change to propagate, if you haven't already done so!
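For reference, a quick sketch of checking and enabling the DNS flags with the AWS CLI; the VPC ID is a placeholder, and each modify-vpc-attribute call changes only one attribute:
# check whether DNS hostnames are enabled on the VPC (vpc-xxxxxxxx is a placeholder)
aws ec2 describe-vpc-attribute --vpc-id vpc-xxxxxxxx --attribute enableDnsHostnames
# enable DNS support and DNS hostnames (one attribute per call)
aws ec2 modify-vpc-attribute --vpc-id vpc-xxxxxxxx --enable-dns-support "{\"Value\":true}"
aws ec2 modify-vpc-attribute --vpc-id vpc-xxxxxxxx --enable-dns-hostnames "{\"Value\":true}"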
I'm trying to figure out how to share data between a CronJob and a Kubernetes Deployment.
I'm running Kubernetes hosted on AWS EKS.
I've created a persistent volume with a claim and have tried to reference the claim in both the CronJob and the Deployment containers, however after the CronJob runs on its schedule the data still isn't in the other container where it should be.
I've seen some threads about using AWS EBS, but I'm not so sure that's the way to go.
Another thread talked about running on different schedules to get the persistent volume.
- name: +vars.cust_id+-sophoscentral-logs
  persistentVolumeClaim:
    claimName: +vars.cust_id+-sophoscentral-logs-pvc
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: +vars.cust_id+-sp-logs-pv
spec:
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    name: +vars.cust_id+-sp-logs-pvc
    namespace: +vars.namespace+
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/var/lib/+vars.cust_id+-sophosdata"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: +vars.cust_id+-sp-logs-pvc
  namespace: +vars.namespace+
  labels:
    component: sp
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  volumeName: +vars.cust_id+-sp-logs-pv
EBS volumes do not support ReadWriteMany as an access mode. If you want to stay within the AWS ecosystem, you need to use EFS, which is a hosted NFS product. Other options include self-hosted Ceph or Gluster and their related CephFS and GlusterFS tools.
This should generally be avoided if possible. NFS brings a whole host of problems to the table, and while CephFS (and probably GlusterFS, though I'm less familiar with that one personally) is better, it's still a far cry from a "normal" network block device volume. Make sure you understand the limitations this brings before you include it in a system design.
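To make the manifests above shareable across nodes, the hostPath PV would need to be replaced with an EFS-backed (NFS) one. A minimal sketch, keeping the templated names from the question and using a placeholder EFS DNS name:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: +vars.cust_id+-sp-logs-pv
spec:
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  # NFS/EFS instead of hostPath, so pods on any node see the same data
  nfs:
    server: fs-xxxxxxxx.efs.<region>.amazonaws.com  # placeholder EFS DNS name
    path: /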
I am trying to mount a PV based on an existing GCP persistent disk onto my pod as read-only.
My configs look like this (Parts of them are masked for confidentiality)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-and-models-pv
  namespace: lipsync
spec:
  storageClassName: ""
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany
  claimRef:
    namespace: lipsync
    name: data-and-models-pvc
  # csi:
  #   driver: pd.csi.storage.gke.io
  #   volumeHandle: projects/***/zones/***/disks/g-lipsync-data-and-models
  gcePersistentDisk:
    pdName: g-lipsync-data-and-models
    fsType: ext4
    readOnly: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-and-models-pvc
  namespace: lipsync
spec:
  storageClassName: ""
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 10Gi
And then, in the pod definition:
volumeMounts:
  - mountPath: /app/models
    subPath: models
    name: data-and-models-v
    readOnly: true
  - [...]
volumes:
  - name: data-and-models-v
    persistentVolumeClaim:
      claimName: data-and-models-pvc
      readOnly: true
However, when I do kubectl apply, the pod never gets created, and I am met with this event:
0s Warning FailedMount pod/lipsync-api-67c784dfb7-4tlln MountVolume.MountDevice failed for volume "data-and-models-pv" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-g-lipsync-data-and-models") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/data-and-models-pv/globalmount") with fstype ("ext4") and options ([]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/by-id/google-g-lipsync-data-and-models /var/lib/kubelet/plugins/kubernetes.io/csi/pv/data-and-models-pv/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/data-and-models-pv/globalmount: cannot mount /dev/sdb read-only.
If I manually SSH into the VM acting as the node that backs the pod, I can confirm that adding noload to the mount options lets me mount the disk successfully:
sudo mount -o ro,noload,defaults /dev/sdb .
But I am not aware of any way to make Kubernetes use this extra mount option.
How can I successfully make GKE mount this disk to my pod?
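For what it's worth, PersistentVolume objects do expose a spec.mountOptions list; a minimal sketch of passing noload that way is below, though I cannot guarantee the GCE PD plugin or the pd.csi.storage.gke.io driver honors it in this case:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-and-models-pv
spec:
  storageClassName: ""
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany
  # extra options handed to the mount call, if the volume plugin supports them
  mountOptions:
    - noload
  gcePersistentDisk:
    pdName: g-lipsync-data-and-models
    fsType: ext4
    readOnly: true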
I had the same problem with readOnly mode today. I hope my experience gives you some ideas.
I have two snapshots: one was created by someone else, and the other was created by me. The files on both disks are identical.
But when I mounted the disk provisioned from the snapshot that the other person created, readOnly worked fine, while mine just kept producing the same error you have.
So I logged into the pod without readOnly mode to see the difference between the two disks. The only difference I found is that the good one is owned by userId 1003 and groupId 1004, while mine is root:root. So I ran chown 1003:1004 -R ./ on my disk, used it to create a new snapshot, and then it worked smoothly...
I haven't found the reason yet, but it worked at least. I'll let you know if I figure it out.
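If it helps, a rough sketch of that fix-and-resnapshot workflow from a scratch VM; the mount point, snapshot name, disk names and zone are placeholders:
# attach the disk read-write to a scratch VM, fix ownership, then unmount
sudo mkdir -p /mnt/disk
sudo mount /dev/sdb /mnt/disk
sudo chown -R 1003:1004 /mnt/disk
sudo umount /mnt/disk
# snapshot the fixed disk and create a fresh disk from that snapshot (placeholder names/zone)
gcloud compute disks snapshot g-lipsync-data-and-models --zone=us-central1-a --snapshot-names=models-fixed
gcloud compute disks create g-lipsync-data-and-models-ro --source-snapshot=models-fixed --zone=us-central1-a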
I am running my Docker containers with the help of a Kubernetes cluster on AWS EKS. Two of my Docker containers use a shared volume, and both of these containers run inside two different pods. So I want a common volume on AWS which can be used by both pods.
I created an EFS volume and mounted it. I am following this link to create the PersistentVolumeClaim, but I am getting a timeout error when the efs-provisioner pod tries to attach the mounted EFS volume. The VolumeId and region are correct.
Detailed error message from the pod describe:
timeout expired waiting for volumes to attach or mount for pod "default"/"efs-provisioner-55dcf9f58d-r547q". list of unmounted volumes=[pv-volume]. list of unattached volumes=[pv-volume default-token-lccdw]
MountVolume.SetUp failed for volume "pv-volume" : mount failed: exit status 32
AWS EFS uses the NFS volume plugin, and as per
Kubernetes Storage Classes
the NFS volume plugin does not come with an internal provisioner like EBS does.
So the steps are:
Create an external provisioner for the NFS volume plugin.
Create a storage class.
Create a volume claim.
Use the volume claim in a Deployment.
In the ConfigMap section, change file.system.id: and aws.region: to match the details of the EFS you created.
In the Deployment section, change server: to the DNS endpoint of the EFS you created.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: efs-provisioner
data:
  file.system.id: yourEFSsystemid
  aws.region: regionyourEFSisin
  provisioner.name: example.com/aws-efs
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: efs-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: efs-provisioner
    spec:
      containers:
        - name: efs-provisioner
          image: quay.io/external_storage/efs-provisioner:latest
          env:
            - name: FILE_SYSTEM_ID
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: file.system.id
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: aws.region
            - name: PROVISIONER_NAME
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: provisioner.name
          volumeMounts:
            - name: pv-volume
              mountPath: /persistentvolumes
      volumes:
        - name: pv-volume
          nfs:
            server: yourEFSsystemID.efs.yourEFSregion.amazonaws.com
            path: /
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: example.com/aws-efs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
For more explanation and details go to https://github.com/kubernetes-incubator/external-storage/tree/master/aws/efs
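To round out the last step (use the volume claim in a Deployment), a minimal sketch that mounts the efs claim from above; the image, command and mount path are placeholders:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: app-using-efs
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: app-using-efs
    spec:
      containers:
        - name: app
          image: busybox                      # placeholder image
          command: ["sh", "-c", "sleep 3600"] # placeholder command
          volumeMounts:
            - name: shared-data
              mountPath: /data                # placeholder mount path
      volumes:
        - name: shared-data
          persistentVolumeClaim:
            claimName: efs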
The problem for me was that I was specifying a path in my PV other than /, and the directory on the NFS server referenced by that path did not yet exist. I had to manually create that directory first.
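One way to create that directory is to mount the EFS root from any EC2 instance in the same VPC and create it there. A rough sketch; the file system DNS name and sub-directory are placeholders, and the instance's security group must be allowed through to the mount target on NFS (port 2049):
# mount the EFS root on a helper EC2 instance
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1 fs-xxxxxxxx.efs.<region>.amazonaws.com:/ /mnt/efs
# create the directory the PV path points at, then unmount
sudo mkdir -p /mnt/efs/your-subdir
sudo umount /mnt/efs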
The issue was that I had 2 EC2 instances running, but I had mounted the EFS volume on only one of them, and kubectl was always deploying pods to the EC2 instance that didn't have the mounted volume. I then mounted the same volume on both instances and used a PVC and PV like below. It is working fine.
ec2 mounting: AWS EFS mounting with EC2
PV.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs
spec:
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteMany
  nfs:
    server: efs_public_dns.amazonaws.com
    path: "/"
PVC.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi
replicaset.yml
----- only volume section -----
volumes:
  - name: test-volume
    persistentVolumeClaim:
      claimName: efs
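For completeness, the container side of the ReplicaSet also needs a matching volumeMounts entry; a minimal sketch, where the container name, image and mount path are placeholders:
containers:
  - name: app                # placeholder container name
    image: busybox           # placeholder image
    volumeMounts:
      - name: test-volume
        mountPath: /shared   # placeholder mount path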
I am trying to deploy a stateful set mounted on a Persistent Volume.
I installed Kubernetes on AWS via kops.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T12:22:21Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T11:55:20Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
According to this issue I need to create the PVC first:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: zk-data-claim
spec:
  storageClassName: default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: zk-logs-claim
spec:
  storageClassName: default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
The default storage class exists, and the PVC binds a PV successfully:
$ kubectl get sc
NAME PROVISIONER AGE
default kubernetes.io/aws-ebs 20d
gp2 (default) kubernetes.io/aws-ebs 20d
ssd (default) kubernetes.io/aws-ebs 20d
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
zk-data-claim Bound pvc-5584fdf7-3853-11e8-a73b-02bb35448afe 2Gi RWO default 11m
zk-logs-claim Bound pvc-5593e249-3853-11e8-a73b-02bb35448afe 2Gi RWO default 11m
I can see these two volumes in the EC2 EBS Volumes list as "available" at first, but they later become "in-use".
And then I reference them in my StatefulSet:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-cluster
  replicas: 3
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      volumes:
        - name: zk-data
          persistentVolumeClaim:
            claimName: zk-data-claim
        - name: zk-logs
          persistentVolumeClaim:
            claimName: zk-logs-claim
      containers:
        - ....
          volumeMounts:
            - name: zk-data
              mountPath: /opt/zookeeper/data
            - name: zk-logs
              mountPath: /opt/zookeeper/logs
Which fails with
Unable to mount volumes for pod "zk-0_default(83b8dc93-3850-11e8-a73b-02bb35448afe)": timeout expired waiting for volumes to attach/mount for pod "default"/"zk-0". list of unattached/unmounted volumes=[zk-data zk-logs]
I'm working in the default namespace.
Any ideas what could be causing this failure?
The problem was that my cluster was made with C5 nodes. C5 and M5 nodes follow a different device naming convention (NVMe), and that naming is not recognised.
Recreate the cluster with t2-type nodes.
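If the cluster was built with kops, rolling the node instance group onto a different machine type may be enough instead of a full recreate; a rough sketch, assuming the instance group is named nodes and the kops state store is already configured:
# edit the instance group and change spec.machineType (e.g. c5.large -> t2.medium)
kops edit ig nodes
# apply the change and replace the nodes with the new instance type
kops update cluster --yes
kops rolling-update cluster --yes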
Yeah, that is a very, very well-known problem with AWS and Kubernetes. Most often it is caused by a stale directory on another node making the EBS volume still look "in use" from that node's perspective, so the Linux machine will not release the device when requested by the AWS API. You will see plenty of chatter about it in the kubelet.service journals on both the machine that has the EBS volume and the machine that wants it.
It has been my experience that only SSH-ing into the node where the EBS volume is currently attached, finding the mount(s), unmounting them, and then waiting for the exponential back-off timer to expire will solve it :-(
The hand-waving version is:
## cleaning up stale docker containers might not be a terrible idea
docker rm $(docker ps -aq -f status=exited)
## identify any (and there could very well be multiple) mounts
## of the EBS device-name
mount | awk '/dev\/xvdf/ {print $2}' | xargs umount
## or sometimes kubernetes will actually name the on-disk directory ebs~ so:
mount | awk '/ebs~something/{print $2}' | xargs umount
You may experience some success involving lsof in that process, too, but hopefully(!) cleaning up the exited containers will remove the need for such a thing.
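If you do reach for lsof, a small sketch of what that might look like; the pod UID path is a placeholder, and the device name follows the xvdf example above:
## which processes still hold files open under a stale mount path? (+D recurses and can be slow)
sudo lsof +D /var/lib/kubelet/pods/<pod-uid>/volumes/
## or ask by device name, matching the xvdf example above
sudo lsof /dev/xvdf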