pod-identity-webhook missing after EKS 1.16 upgrade - amazon-web-services

After upgrading to EKS 1.16, IAM Roles for Service Accounts stopped working.
It was configured as described in the article on configuring and assigning service accounts to pods, and it worked with EKS 1.14 and 1.15.
Applying service-account.yaml and test-pod.yaml on EKS 1.15 (qa env) does inject the following env variables:
AWS_ROLE_ARN=arn:aws:iam::xxx:role/oidc-my-service-api-qa
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
When running the same resources on EKS 1.16 (test env), they are not added.
service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxx:role/oidc-my-service-test
  name: oidc-my-service-service-account
test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - name: test
    image: busybox
    command: ["/bin/sh", "-c", "env | grep AWS"]
  securityContext:
    fsGroup: 1000
  serviceAccountName: "oidc-my-service-service-account"
UPDATE
It turns out I'm missing the Amazon EKS Pod Identity Webhook, but where did it go?
EKS 1.15
kubectl get mutatingwebhookconfigurations pod-identity-webhook
NAME                   CREATED AT
pod-identity-webhook   2020-01-11T17:01:52Z
EKS 1.16
kubectl get mutatingwebhookconfigurations pod-identity-webhook
Error from server (NotFound): mutatingwebhookconfigurations.admissionregistration.k8s.io "pod-identity-webhook" not found

I was able to re-install amazon-eks-pod-identity-webhook in my EKS 1.16 cluster by cloning the code from GitHub, building the Docker image, pushing it to my ECR repo, and running
make cluster-up IMAGE=my-account-id.dkr.ecr.my-region.amazonaws.com/pod-identity-webhook:latest
It's still an open question where it went, since it's part of the managed service, as stated in this comment.
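For anyone double-checking after reinstalling, a quick verification sketch (reusing the pod and service account from the manifests above) would be:

kubectl get mutatingwebhookconfigurations pod-identity-webhook   # should exist again
kubectl apply -f test-pod.yaml
kubectl logs test | grep AWS_      # should now show AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE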

Related

How to connect AWS EKS cluster from Azure Devops pipeline - No user credentials found for cluster in KubeConfig content

I have to set up CI in Microsoft Azure DevOps to deploy and manage AWS EKS cluster resources. As a first step, I found a few Kubernetes tasks to connect to a Kubernetes cluster (in my case, AWS EKS), but in the "kubectlapply" task in Azure DevOps I can only pass the kubeconfig file or an Azure subscription to reach the cluster.
In my case, I have the kubeconfig file, but I also need to pass the AWS user credentials that are authorized to access the AWS EKS cluster. However, there is no option in the task, when adding the new "k8s end point", to provide the AWS credentials that can be used to access the EKS cluster. Because of that, I am seeing the below error while verifying the connection to the EKS cluster.
At runtime I can pass the AWS credentials via environment variables in the pipeline, but I cannot add the kubeconfig file in the task and save it.
Azure and AWS are big players in the cloud, and there should be ways to connect to AWS resources from any CI platform. Has anyone faced this kind of issue, and what is the best approach to connect to AWS and the EKS cluster for deployments from Azure DevOps CI?
No user credentials found for cluster in KubeConfig content. Make sure that the credentials exist and try again.
Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through the AWS IAM Authenticator for Kubernetes. You may update your config file referring to the following format:
apiVersion: v1
clusters:
- cluster:
    server: ${server}
    certificate-authority-data: ${cert}
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: aws
  name: aws
current-context: aws
kind: Config
preferences: {}
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      env:
      - name: "AWS_PROFILE"
        value: "dev"
      args:
      - "token"
      - "-i"
      - "mycluster"
Useful links:
https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html
https://github.com/kubernetes-sigs/aws-iam-authenticator#specifying-credentials--using-aws-profiles
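As a side note, recent versions of the AWS CLI can generate a kubeconfig with a similar exec-based user entry for you; assuming your cluster is called mycluster, lives in us-west-2, and you use the dev profile, it would look roughly like:

aws eks update-kubeconfig --name mycluster --region us-west-2 --profile dev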
I got the solution by using a ServiceAccount, following this post: How to deploy to AWS Kubernetes from Azure DevOps
For anyone who is still having this issue, I had to set this up for the startup I worked for, and it was pretty simple.
After your cluster is created, create the service account:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-robot
EOF
Then apply the ClusterRoleBinding:
$ kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: build-robot
  name: build-robot
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- kind: ServiceAccount
  name: build-robot
  namespace: default
EOF
Be careful with the above, as it gives full access; check out https://kubernetes.io/docs/reference/access-authn-authz/rbac/ for more info on scoping the access (a namespace-scoped alternative is sketched below).
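If you would rather not grant cluster-wide admin, a namespace-scoped RoleBinding to the built-in edit ClusterRole is often enough for deployments. A rough sketch, assuming everything lives in the default namespace:

$ kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: build-robot-edit
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: build-robot
  namespace: default
EOF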
From there, head over to ADO and follow the steps using build-robot as the SA name.
$ kubectl get serviceAccounts build-robot -n default -o='jsonpath={.secrets[*].name}'
xyz........
$ kubectl get secret xyz........ -n default -o json
...
...
...
Paste the output into the last box when adding the Kubernetes resource to the environment and select Accept UnTrusted Certificates. Then click apply and validate, and you should be good to go.
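If you only need the individual values rather than the whole secret JSON, you can pull the token and CA certificate straight out of the secret (a sketch; xyz........ stands for the secret name returned above):

$ kubectl get secret xyz........ -n default -o jsonpath='{.data.token}' | base64 -d
$ kubectl get secret xyz........ -n default -o jsonpath='{.data.ca\.crt}' | base64 -d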

ALB Ingress Controller on AWS

I'm trying to set up an ALB Ingress Controller on AWS EKS, exactly as the following tutorial describes: ingress_controller_alb, but I cannot get an ingress address.
Indeed, if I run the following command, kubectl get ingress/2048-ingress -n 2048-game, after 10 minutes I still get no address. Any idea?
The problem may be the version of the ALB ingress controller you are using: you are on an old version (1.0.0), while the newer one is 1.1.3.
I advise you to take a look at this documentation: ingress-controller-alb.
1. Download sample ALB ingress controller manifest
wget https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/alb-ingress-controller.yaml
2. Configure the ALB ingress controller manifest
At minimum, edit the following variables:
--cluster-name=devCluster: name of the cluster. AWS resources will be tagged with kubernetes.io/cluster/devCluster:owned
If ec2metadata is unavailable from the controller pod, edit the following variables:
--aws-vpc-id=vpc-xxxxxx: vpc ID of the cluster.
--aws-region=us-west-1: AWS region of the cluster.
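For example, the cluster name flag can be patched in place with sed before applying the manifest (a rough sketch; myCluster, the VPC ID and the region are placeholders for your own values):

sed -i.bak 's/--cluster-name=devCluster/--cluster-name=myCluster/' alb-ingress-controller.yaml
# only if ec2metadata is unreachable from the controller pod, also set:
#   --aws-vpc-id=vpc-xxxxxx
#   --aws-region=us-west-1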
3. Deploy the RBAC roles manifest
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/rbac-role.yaml
4. Deploy the ALB ingress controller manifest
kubectl apply -f alb-ingress-controller.yaml
5. Verify the deployment was successful and the controller started
kubectl logs -n kube-system $(kubectl get po -n kube-system | egrep -o "alb-ingress[a-zA-Z0-9-]+")
You should be able to display output similar to the following:
-------------------------------------------------------------------------------
AWS ALB Ingress controller
Release: 1.0.0
Build: git-7bc1850b
Repository: https://github.com/kubernetes-sigs/aws-alb-ingress-controller.git
-------------------------------------------------------------------------------
Then you can deploy the sample application.
Execute the following commands:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-service.yaml
Deploy an Ingress resource for the 2048 game:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-ingress.yaml
After a few seconds, verify that the Ingress resource was created:
kubectl get ingress/2048-ingress -n 2048-game
I was struggling with the same issue, but finally got it working after following @MaggieO's steps above. A couple of things to consider:
Add public and private subnets to your EKS cluster. Make sure your public subnets are tagged with "kubernetes.io/role/elb":"1" (see the tagging sketch after these notes). If creating a managed node group, only select private subnets for placement of your worker nodes.
Make sure the IAM role for your worker nodes has the policies AmazonEKSWorkerNodePolicy, AmazonEC2ContainerRegistryReadOnly, AmazonEKS_CNI_Policy, and the custom policy defined here: https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.2/docs/examples/iam-policy.json.
Examine your ingress controller logs; they are helpful.
kubectl logs -n kube-system [name of your ingress controller]
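The subnet tag mentioned above can also be added from the CLI (a sketch; subnet-aaaa and subnet-bbbb are placeholders for your public subnet IDs):

aws ec2 create-tags \
  --resources subnet-aaaa subnet-bbbb \
  --tags Key=kubernetes.io/role/elb,Value=1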
Thank you for your replies!
I think the problem is the cluster creation, which results in a cluster without EC2 instances, using the command eksctl create cluster -f cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: test
  region: eu-central-1
  version: "1.14"
vpc:
  id: vpc-50b17738
  subnets:
    private:
      eu-central-1a: { id: subnet-aee763c6 }
      eu-central-1b: { id: subnet-bc2ee6c6 }
      eu-central-1c: { id: subnet-24734d6e }
nodeGroups:
  - name: ng-1-workers
    labels: { role: workers }
    instanceType: t3.medium
    desiredCapacity: 2
    volumeSize: 5
    privateNetworking: true
I tried with node groups and with managed node groups, but I get the following timeout error:
...
[ℹ] nodegroup "ng-1-workers" has 0 node(s)
[ℹ] waiting for at least 2 node(s) to become ready in "ng-1-workers"
Error: timed out (after 25m0s) waiting for at least 2 nodes to join the cluster and become ready in "ng-1-workers"
If you succeed in creating the controller, you will find it running:
$ kubectl get po -n kube-system | grep alb
alb-ingress-controller-669b958f64-p69fw 1/1 Running 0 3m7s
and its logs:
$ kubectl logs -n kube-system $(kubectl get po -n kube-system | egrep -o alb-ingress[a-zA-Z0-9-]+)
-------------------------------------------------------------------------------
AWS ALB Ingress controller
Release: v1.1.8
Build: git-ec387ad1
Repository: https://github.com/kubernetes-sigs/aws-alb-ingress-controller.git
-------------------------------------------------------------------------------
W0720 13:31:21.242868 1 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.

How to add insecure Docker registry certificate to kubeadm config

I'm quite new to Kubernetes, and I managed to get an Angular app deployed locally using minikube. But now I'm working on a Bitnami Kubernetes Sandbox EC2 instance, and I've run into issues pulling from my docker registry on another EC2 instance.
Whenever I attempt to apply the deployment, the pods log the following error
Failed to pull image "registry-url.net:5000/app": no available registry endpoint:
failed to do request: Head https://registry-url.net/v2/app/manifests/latest:
x509: certificate signed by unknown authority
The docker registry certificate is signed by a CA (Comodo RSA), but I had to add the registry's .crt and .key files to /etc/docker/certs.d/registry-url.net:5000/ for my local copy of minikube and docker.
However, the Bitnami instance doesn't have an /etc/docker/ directory and there is no daemon.json file to add insecure registry exceptions, and I'm not sure where the cert files are meant to be located for kubeadm.
So is there a similar location to place .crt and .key files for kubeadm, or is there a command I can run to add my docker registry to a list of exceptions?
Or better yet, is there a way to get Kubernetes/docker to recognize the CA of the registry's SSL certs?
Thanks
Edit: I've included my deployment and secret files below:
app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: registry-url.net:5000/app
        ports:
        - containerPort: 80
        env:
          ...
      imagePullSecrets:
      - name: registry-pull-secret
registry-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: registry-pull-secret
data:
  .dockerconfigjson: <base-64 JSON>
type: kubernetes.io/dockerconfigjson
You need to create a secret with the details for the repository.
This might be an example of uploading an image to your Docker repo:
docker login _my-registry-url_:5000
Username (admin):
Password:
Login Succeeded
docker tag _user_/_my-cool-image_ _my-registry-url_:5000/_my-cool-image_:0.1
docker push _my-registry-url_:5000/_my-cool-image_:0.1
From that host, create the base64 of ~/.docker/config.json like so:
cat ~/.docker/config.json | base64
Then you will be able to add it to the secret, so create a YAML that might look like the following:
apiVersion: v1
kind: Secret
metadata:
  name: registrypullsecret
data:
  .dockerconfigjson: <base-64-encoded-json-here>
type: kubernetes.io/dockerconfigjson
Once done you can apply the secret using kubectl create -f my-secret.yaml && kubectl get secrets.
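Alternatively, kubectl can build the same kind of secret for you without hand-crafting the base64 (a sketch; the registry URL and credentials are placeholders):

kubectl create secret docker-registry registrypullsecret \
  --docker-server=my-registry-url:5000 \
  --docker-username=admin \
  --docker-password=<password> \
  --docker-email=<email>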
As for your pod, it should look like this:
apiVersion: v1
kind: Pod
metadata:
  name: jss
spec:
  imagePullSecrets:
  - name: registrypullsecret
  containers:
  - name: jss
    image: my-registry-url:5000/my-cool-image:0.1
So I ended up solving my issue by manually installing docker via the following commands:
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get install docker-ce docker-ce-cli containerd.io
Then I had to create the directory structure /etc/docker/certs.d/registry-url:5000/ and copy the registry's .crt and .key files into the directory.
However, this still didn't work; but after stopping the EC2 instance and starting it again, it appears to pull from the remote registry with no issues.
When I initially ran service kubelet restart, the changes didn't seem to take effect, but restarting the instance did the trick. I'm not sure if there's a better way of fixing my issue, but this was the only solution that worked for me.
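For reference, the certificate setup described above amounts to something like the following (a sketch; registry.crt and registry.key are placeholders for the files issued for the registry):

sudo mkdir -p /etc/docker/certs.d/registry-url.net:5000
sudo cp registry.crt registry.key /etc/docker/certs.d/registry-url.net:5000/
sudo systemctl restart docker   # in my case only a full instance restart made it stick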

Unable to configure kubernetes URL with kubernetes-Jenkins plugin

I am new to Kubernetes and am trying out the Jenkins Kubernetes plugin. I have created a K8s cluster and a namespace called jenkins-pl in AWS. Below are my Jenkins deployment and service YAML files:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: contactsai123/my-jenkins-image:1.0
        env:
        - name: JAVA_OPTS
          value: -Djenkins.install.runSetupWizard=false
        ports:
        - name: http-port
          containerPort: 8080
        - name: jnlp-port
          containerPort: 50000
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
      volumes:
      - name: jenkins-home
        emptyDir: {}
Here is my jenkins-service.yaml file
apiVersion: v1
kind: Service
metadata:
  name: jenkins
spec:
  type: LoadBalancer
  ports:
    - port: 8080
      targetPort: 8080
  selector:
    app: jenkins
I am able to launch Jenkins successfully but am unsure what I should provide as the Kubernetes URL.
I gave "https://kubernetes.default.svc.cluster.local" and got the error message:
Error testing connection https://kubernetes.default.svc.cluster.local: Failure executing: GET at: https://kubernetes.default.svc.cluster.local/api/v1/namespaces/jenkins-pl/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:jenkins-pl:default" cannot list pods in the namespace "jenkins-pl".
I executed the command:
$ kubectl cluster-info | grep master
and got the following output:
https://api-selegrid-k8s-loca-m23tbb-1891259367.us-west-2.elb.amazonaws.com
I provided the above as the Kubernetes URL, and I get a similar error as before.
I'm not sure how to move forward.
Your cluster has RBAC enabled. You have to give your deployment the necessary RBAC permission to list pods.
Consider your deployment as a user who needs to perform some tasks in your cluster, so you have to grant it the necessary permissions.
First, you have to create a role. It could be a ClusterRole or a Role. The role defines what can be done under it: a ClusterRole grants permissions cluster-wide, whereas a Role grants permissions only in a particular namespace.
Then, you have to create a ServiceAccount. Think of a service account as a user, but for an application instead of a person.
Finally, you have to bind the Role or ClusterRole to the service account through a RoleBinding or ClusterRoleBinding. This ties together which user/service account gets the permissions defined in which role.
Check this nice post to understand RBAC: Configuring permissions in Kubernetes with RBAC
Also this video might help you to understand the basics: Role Based Access Control (RBAC) with Kubernetes
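To make that concrete, a minimal sketch for the jenkins-pl namespace might look like the following (the names jenkins-sa and jenkins-pod-manager are just examples, not anything the plugin requires):

kubectl apply -n jenkins-pl -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-pod-manager
rules:
- apiGroups: [""]
  resources: ["pods", "pods/exec", "pods/log"]
  verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins-pod-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins-pod-manager
subjects:
- kind: ServiceAccount
  name: jenkins-sa
  namespace: jenkins-pl
EOF

Then configure the Jenkins Kubernetes plugin to use credentials for jenkins-sa instead of the default service account.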

Trouble mounting an EBS to a Pod in a Kubernetes cluster

The cluster that I use is bootstrapped using kubeadm and it's deployed on AWS.
sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:51:33Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
I am trying to configure a pod to mount a persistent volume (I'm not thinking about PVs and PVCs for the moment); this is the manifest I used:
apiVersion: v1
kind: Pod
metadata:
  name: mongodb-aws
spec:
  volumes:
  - name: mongodb-data
    awsElasticBlockStore:
      volumeID: vol-xxxxxx
      fsType: ext4
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data
      mountPath: /data/db
    ports:
    - containerPort: 27017
      protocol: TCP
At first I had this error in the logs of the pod:
mount: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-xxxx does not exist
After some research, I discovered that I have to set a cloud provider, and that is what I've been trying to do for the past 10 hours; I tested many suggestions, but none worked. I tried to tag all the resources used by the cluster as mentioned in https://github.com/kubernetes/kubernetes/issues/53538#issuecomment-345942305, and I also tried this official solution to run in-tree cloud providers with kubeadm: https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/
kubeadm_config.yml file:
apiVersion: kubeadm.k8s.io/v1alpha3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: "aws"
    cloud-config: "/etc/kubernetes/cloud.conf"
---
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1alpha3
kubernetesVersion: v1.12.0
apiServerExtraArgs:
  cloud-provider: "aws"
  cloud-config: "/etc/kubernetes/cloud.conf"
apiServerExtraVolumes:
- name: cloud
  hostPath: "/etc/kubernetes/cloud.conf"
  mountPath: "/etc/kubernetes/cloud.conf"
controllerManagerExtraArgs:
  cloud-provider: "aws"
  cloud-config: "/etc/kubernetes/cloud.conf"
controllerManagerExtraVolumes:
- name: cloud
  hostPath: "/etc/kubernetes/cloud.conf"
  mountPath: "/etc/kubernetes/cloud.conf"
In /etc/kubernetes/cloud.conf I put:
[Global]
KubernetesClusterTag=kubernetes
KubernetesClusterID=kubernetes
After running kubeadm init --config kubeadm_config.yml I had these errors:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
couldn't initialize a Kubernetes cluster
The Control Plane is not created
When I removed:
apiVersion: kubeadm.k8s.io/v1alpha3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: "aws"
    cloud-config: "/etc/kubernetes/cloud.conf"
from kubeadm_config.yml and ran kubeadm init --config kubeadm_config.yml, the Kubernetes master initialized successfully. But when I executed kubectl get pods --all-namespaces, I got:
NAMESPACE     NAME                                       READY   STATUS             RESTARTS   AGE
kube-system   etcd-ip-172-31-31-160                      1/1     Running            0          11m
kube-system   kube-apiserver-ip-172-31-31-160            1/1     Running            0          11m
kube-system   kube-controller-manager-ip-172-31-31-160   0/1     CrashLoopBackOff   6          11m
kube-system   kube-scheduler-ip-172-31-31-160            1/1     Running            0          10m
The controller manager didn't run. However, the --cloud-provider=aws command-line flag is present for the API server (in /etc/kubernetes/manifests/kube-apiserver.yaml) and also for the controller manager (/etc/kubernetes/manifests/kube-controller-manager.yaml).
When I ran sudo kubectl logs kube-controller-manager-ip-172-31-13-85 -n kube-system, I got:
Flag --address has been deprecated, see --bind-address instead.
I1126 11:27:35.006433 1 serving.go:293] Generated self-signed cert (/var/run/kubernetes/kube-controller-manager.crt, /var/run/kubernetes/kube-controller-manager.key)
I1126 11:27:35.811493 1 controllermanager.go:143] Version: v1.12.0
I1126 11:27:35.812091 1 secure_serving.go:116] Serving securely on [::]:10257
I1126 11:27:35.812605 1 deprecated_insecure_serving.go:50] Serving insecurely on 127.0.0.1:10252
I1126 11:27:35.812760 1 leaderelection.go:187] attempting to acquire leader lease kube-system/kube-controller-manager...
I1126 11:27:53.260484 1 leaderelection.go:196] successfully acquired lease kube-system/kube-controller-manager
I1126 11:27:53.261474 1 event.go:221] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-controller-manager", UID:"b0da1291-f16d-11e8-baeb-02a38a37cfd6", APIVersion:"v1", ResourceVersion:"449", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' ip-172-31-13-85_4603714e-f16e-11e8-8d9d-02a38a37cfd6 became leader
I1126 11:27:53.290493 1 aws.go:1042] Building AWS cloudprovider
I1126 11:27:53.290642 1 aws.go:1004] Zone not specified in configuration file; querying AWS metadata service
F1126 11:27:53.296760 1 controllermanager.go:192] error building controller context: cloud provider could not be initialized: could not init cloud provider "aws": error finding instance i-0b063e2a3c9797398: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""
I didn’t try to downgrade kubeadm (to be able to use manifests with only kind: MasterConfiguration)
If you need more information, please feel free to ask.