vsystem-vrep of Vora at Waiting: CrashLoopBackOff

I am trying to set up Vora 2 on an AWS kops k8s cluster.
The pod vsystem-vrep cannot start.
In the log file on the node I see:
sudo cat vsystem-vrep_30.log
{"log":"2018-03-27 12:54:04.164349|+0000|INFO |Starting Kernel NFS Server||vrep|1|Start|server.go(41)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.164897827Z"}
{"log":"2018-03-27 12:54:04.164405|+0000|INFO |Creating directory /exports||dir-handler|1|makeDir|dir_handler.go(40)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.164919387Z"}
{"log":"2018-03-27 12:54:04.164423|+0000|INFO |Listening for private API on port 8738||vrep|18|func1|server.go(45)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.164923893Z"}
{"log":"2018-03-27 12:54:04.166992|+0000|INFO |Configuring Kernel NFS Server||vrep|1|configure|server.go(126)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.167109138Z"}
{"log":"2018-03-27 12:54:04.219089|+0000|INFO |Configuring Kernel NFS Server||vrep|1|configure|server.go(126)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.219235263Z"}
{"log":"2018-03-27 12:54:04.230256|+0000|FATAL|Error starting NFS server: RPC service for NFS server has not been correctly registered||vrep|1|main|server.go(51)\u001e\n","stream":"stderr","time":"2018-03-27T12:54:04.230526346Z"}
How can I solve this?

When installing Vora 2.1 on AWS with kops, you first need to set up an RWX storage class, which is needed by vsystem (the default AWS storage class only supports ReadWriteOnce). During installation, you need to point to that storage class using the parameter --vsystem-storage-class. Additionally, the parameter --vsystem-load-nfs-modules needs to be set. I suspect that the error happened because that last parameter was missing.
Here is an example of how a call to install.sh could look:
./install.sh --accept-license --deployment-type=cloud --namespace=xxx \
  --docker-registry=123456789.dkr.ecr.us-west-1.amazonaws.com \
  --vora-admin-username=xxx --vora-admin-password=xxx \
  --cert-domain=my.host.domain.com --interactive-security-configuration=no \
  --vsystem-storage-class=aws-efs --vsystem-load-nfs-modules
An RWX storage class can, for example, be created as follows:
Create an EFS file system in the same region as the kops cluster - see https://us-west-2.console.aws.amazon.com/efs/home?region=us-west-2#/filesystems
Create the file system
Select the VPC of the kops cluster
Add the kops master and worker security groups to the mount target
Optionally give it a name (e.g. the same as your kops cluster, so you know what it is used for)
Use the default options for the rest
Once created, note the DNS name (similar to fs-1234e567.efs.us-west-2.amazonaws.com).
Create a persistent volume and a storage class for Vora
E.g. use YAML files similar to the ones below and point to the newly created EFS file system.
$ cat create_pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: vsystem-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: aws-efs
  nfs:
    path: /
    server: fs-1234e567.efs.us-west-2.amazonaws.com
$ cat create_sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: xyz.com/aws-efs
kubectl create -f create_pv.yaml
kubectl create -f create_sc.yaml
# check whether the newly created PV and storage class exist
kubectl get pv
kubectl get storageclasses
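Optionally, you can verify that a claim actually binds to the EFS-backed PV before running install.sh. This is only a minimal sketch with a hypothetical claim name; the vsystem installer creates its own claims:
# Hypothetical test claim against the aws-efs class; it should reach status
# Bound if the PV above was created correctly.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-test-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: aws-efs
  resources:
    requests:
      storage: 1Gi
Delete the test claim again and recreate the PV before installing: with persistentVolumeReclaimPolicy: Retain, a released PV does not return to Available on its own, so the installer's claim could otherwise not bind to it.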

Related

GKE how to use existing compute engine disk as persistent volumes?

I might have to rebuild the GKE cluster, but the Compute Engine disks won't be deleted and need to be reused as persistent volumes for the pods. I haven't found documentation showing how to link an existing GCP Compute Engine disk as a persistent volume for the pods.
Is it possible to use existing GCP Compute Engine disks with a GKE storage class and persistent volumes?
Yes, it's possible to reuse a Persistent Disk as a Persistent Volume in another cluster, however there is one limitation:
The persistent disk must be in the same zone as the cluster nodes.
If the PD is in a different zone, the cluster will not find the disk.
In the documentation Using preexisting persistent disks as PersistentVolumes you can find information and examples of how to reuse persistent disks.
If you haven't created a Persistent Disk yet, you can create one based on the Creating and attaching a disk documentation. For this test, I've used the disk below:
gcloud compute disks create pd-name \
    --size 10G \
    --type pd-standard \
    --zone europe-west3-b
If you create a PD with less than 200GB you will get the warning below; whether that matters depends on your needs. In zone europe-west3-b, the pd-standard type can have storage between 10GB and 65536GB.
You have selected a disk size of under [200GB]. This may result in poor I/O performance. For more information, see: https://developers.google.com/compute/docs/disks#performance.
Keep in mind that different zones might offer different Persistent Disk types. For more details you can check the Disk Types documentation or run $ gcloud compute disk-types list.
Once you have the Persistent Disk, you can create a PersistentVolume and a PersistentVolumeClaim.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv
spec:
  storageClassName: "test"
  capacity:
    storage: 10G
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: default
    name: pv-claim
  gcePersistentDisk:
    pdName: pd-name
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  storageClassName: "test"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10G
---
kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/data"
          name: task-pv-storage
Tests
$ kubectl get pv,pvc,pod
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pv 10G RWO Retain Bound default/pv-claim test 22s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/pv-claim Bound pv 10G RWO test 22s
NAME READY STATUS RESTARTS AGE
pod/task-pv-pod 1/1 Running 0 21s
Write some information to disk
$ kubectl exec -ti task-pv-pod -- /bin/bash
root@task-pv-pod:/# cd /usr/data
root@task-pv-pod:/usr/data# echo "This is test message from Nginx pod" >> message.txt
Now I removed all previous resources: pv, pvc and pod.
$ kubectl get pv,pvc,pod
No resources found
Now, if I recreate the PV and PVC with a small change in the pod, for example using busybox:
  containers:
    - name: busybox
      image: busybox
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo hello; sleep 10; done"]
      volumeMounts:
        - mountPath: "/usr/data"
          name: task-pv-storage
It will be rebound:
$ kubectl get pv,pvc,po
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pv 10G RWO Retain Bound default/pv-claim 43m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/pv-claim Bound pv 10G RWO 43m
NAME READY STATUS RESTARTS AGE
pod/busybox 1/1 Running 0 3m43s
And in the busybox pod I am able to find message.txt:
$ kubectl exec -ti busybox -- /bin/sh
/ # cd /usr/data
/usr/data # ls
lost+found    message.txt
/usr/data # cat message.txt
This is test message from Nginx pod
As additional information, you won't be able to use it in two clusters at the same time; if you try, you will get an error:
AttachVolume.Attach failed for volume "pv" : googleapi: Error 400: RESOURCE_IN_USE_BY_ANOTHER_RESOURCE - The disk resource 'projects/<myproject>/zones/europe-west3-b/disks/pd-name' is already being used by 'projects/<myproject>/zones/europe-west3-b/instances/gke-cluster-3-default-pool-bb545f05-t5hc'

How to access AWS ServiceAccount token as non-root in a Fargate container?

I set up an EKS cluster which entirely uses pods on Fargate. I want to run something as a non-root user in a container which needs access to S3. For this, I created a ServiceAccount and added an IAM role with the appropriate S3 policies.
I started a bare-bones pod which just waits indefinitely and used kubectl exec to drop to a bash in the container as root. There I installed the AWS CLI and tried some s3 operations on the command line, which works fine, so the pod can talk to S3 and get data.
Now, my real workload runs as non-root and has to access stuff on S3, but when it tries to, it fails because the token's permissions are set to 600 and it is owned by root. The non-root user in the container also can't sudo, and this is intended. This means I get "permission denied".
Is it possible to give a non-root user access to the serviceaccount token in a Fargate pod or do I have to allow my user to sudo and chmod the token to 644 in the startup script?
The fact that the token is mounted with permissions 600 is actually a known issue. A workaround is to specify an fsGroup.
Something like this works for me:
---
apiVersion: v1
kind: Pod
metadata:
  name: foo
  labels:
    name: foo
spec:
  containers:
    - name: foo
      image: foo:bar
      resources:
        limits:
          memory: "128Mi"
          cpu: "500m"
      command:
        - "/do-something.sh"
  securityContext:
    fsGroup: 65534
  serviceAccountName: serviceAccountWithAccessToS3
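As an aside, the service account referenced above is what carries the IAM role via IAM Roles for Service Accounts (IRSA). A minimal sketch of what it could look like, with a hypothetical name (lowercase, as Kubernetes requires) and a hypothetical role ARN:
# Hypothetical ServiceAccount annotated with the IAM role that holds the S3
# policies; the pod would reference it via serviceAccountName.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-access-sa
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-s3-access-role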

Unable to get aws-iam-authenticator in config-map while applying through AWS CodeBuild

I am building a CI/CD pipeline, using AWS CodeBuild to build and deploy an application (service) to an AWS EKS cluster. I have installed kubectl and aws-iam-authenticator properly,
but I am getting aws instead of aws-iam-authenticator as the command:
kind: Config
preferences: {}
users:
- name: arn:aws:eks:ap-south-1:*******:cluster/DevCluster
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - eks
      - get-token
      - --cluster-name
      - DevCluster
      command: aws
      env: null
[Container] 2019/05/14 04:32:09 Running command kubectl get svc 
error: the server doesn't have a resource type "svc"
I do not want to edit the configmap manually because it comes through the pipeline.
As @Priya Rani said in the comments, they found the solution.
There is no issue with the configmap file. It's all right.
1) The CloudFormation (cluster + node instance) trusted role needs to be edited so that it can communicate with CodeBuild.
2) A userdata section needs to be added so that the node instances can communicate with the cluster.
Why don't you just load a proper/dedicated kubeconfig file by setting the KUBECONFIG env variable inside your CI/CD pipeline, like this:
export KUBECONFIG=$KUBECONFIG:~/.kube/config-devel
which would include the right command to use with aws-iam-authenticator:
#
# config-devel
#
...
kind: Config
preferences: {}
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
      - "token"
      - "-i"
      - "<cluster-name>"

EKS AWS: Can't connect Worker Node

I am very stuck on the "Launching worker nodes" step of the AWS EKS guide. And to be honest, at this point I don't know what's wrong.
When I do kubectl get svc, I get my cluster so that's good news.
I have this in my aws-auth-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::Account:role/rolename
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
Here is my config in .kube
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: CERTIFICATE
    server: server
  name: arn:aws:eks:region:account:cluster/clustername
contexts:
- context:
    cluster: arn:aws:eks:region:account:cluster/clustername
    user: arn:aws:eks:region:account:cluster/clustername
  name: arn:aws:eks:region:account:cluster/clustername
current-context: arn:aws:eks:region:account:cluster/clustername
kind: Config
preferences: {}
users:
- name: arn:aws:eks:region:account:cluster/clustername
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - token
      - -i
      - clustername
      command: aws-iam-authenticator.exe
I have launched an EC2 instance with the advised AMI.
Some things to note:
I launched my cluster with the CLI,
I created a key pair,
I am not using the CloudFormation Stack,
I attached these policies to the role of my EC2 instance: AmazonEKS_CNI_Policy, AmazonEC2ContainerRegistryReadOnly, AmazonEKSWorkerNodePolicy.
It is my first attempt at kubernetes and EKS, so please keep that in mind :). Thanks for your help!
Your config file and auth file look right. Maybe there is some issue with the security group assignments? Can you share the exact steps you followed to create the cluster and the worker nodes?
Also, is there any special reason why you had to use the CLI instead of the console? If it's your first attempt at EKS, you should probably try to set up a cluster using the console at least once.
Sometimes, for whatever reason, the aws-auth ConfigMap does not get applied automatically, so it needs to be added manually. I had this issue, so I'm leaving this here in case it helps someone.
Check to see if you have already applied the aws-auth ConfigMap.
kubectl describe configmap -n kube-system aws-auth
If you receive an error stating "Error from server (NotFound): configmaps "aws-auth" not found", then proceed
Download the configuration map.
curl -o aws-auth-cm.yaml https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/aws-auth-cm.yaml
Open the file with your favorite text editor. Replace <ARN of instance role (not instance profile)> with the Amazon Resource Name (ARN) of the IAM role associated with your nodes, and save the file.
Apply the configuration.
kubectl apply -f aws-auth-cm.yaml
Watch the status of your nodes and wait for them to reach the Ready status.
kubectl get nodes --watch
You can also go to your AWS console and see the worker nodes being added.
Find more info here

kubernetes: using Petset in bare metal environment

I was trying to set up a Cassandra cluster using the Kubernetes 1.3.4 new alpha feature PetSet, following the YAML file posted here:
http://blog.kubernetes.io/2016/07/thousand-instances-of-cassandra-using-kubernetes-pet-set.html
My Kubernetes cluster is based on 1.3.4 in a bare metal environment with 10 powerful physical machines. However, after I created the PetSet, I get nothing from kubectl get pv.
Running kubectl get pvc, I get the following:
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
cass-volume-cassandra-0 Pending 4h
cass-volume-cassandra-1 Pending 4h
cass-volume-cassandra-2 Pending 4h
Reading the README here: https://github.com/kubernetes/kubernetes/blob/b829d4d4ef68e64b9b7ae42b46877ee75bb2bfd9/examples/experimental/persistent-volume-provisioning/README.md
it says the persistent volume will be automatically created if Kubernetes is running on AWS, GCE or Cinder. Is there any way I can create such a persistent volume and PVC in a bare metal environment?
Another question: as long as I run the Kubernetes cluster on a few EC2 machines in AWS, will the persistent volume above be automatically created from AWS EBS with these clauses in the YAML file, or do I have to allocate the EBS volume first?
volumeClaimTemplates:
- metadata:
    name: cassandra-data
    annotations:
      volume.alpha.kubernetes.io/storage-class: anything
  spec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 380Gi
PetSet uses dynamic volume provisioning: the volumeClaimTemplates in the PetSet definition request storage from Kubernetes, and if storage is available the PVC is bound and the pod (pet) runs. For now, however, Kubernetes only supports dynamic volume provisioning with cloud providers like GCE or AWS.
If you use Kubernetes in a bare metal cluster, another way is to use network storage like Ceph or Gluster, which requires setting up that network storage in your cluster.
If you want to use bare metal hard disks, an existing solution is to use the hostPath type of persistent volume; see the sketch below.
By default, the host path provisioner is set to false in cluster/local-up-cluster.sh. You can enable it by running ENABLE_HOSTPATH_PROVISIONER=true cluster/local-up-cluster.sh. This enables the provisioner and the PV gets created.
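A minimal sketch of such a manually created hostPath PV, under the assumption that in Kubernetes 1.3 a pending claim binds to any available PV with a matching access mode and enough capacity; the PV name and host path below are hypothetical, and you would create one such PV (each on its own disk/path) per pet:
# Hypothetical hostPath PV sized to satisfy one cassandra-data claim
# (380Gi, ReadWriteOnce) from the volumeClaimTemplates above.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cassandra-data-pv-0
spec:
  capacity:
    storage: 380Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/disks/cassandra-0
Keep in mind that a hostPath volume lives on a single node, so the data only follows the pet as long as it is scheduled to that node; network storage like Ceph or Gluster avoids that limitation.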