I'm trying to set up the ALB Ingress Controller on AWS EKS, exactly as the following tutorial describes: ingress_controller_alb, but I cannot get an ingress address.
Indeed, if I run the following command: kubectl get ingress/2048-ingress -n 2048-game, even after 10 minutes I get no address. Any idea?
The problem may be the version of the controller you are using: you are running the old version of the ingress controller (1.0.0), while the current one is 1.1.3.
I advise you to take a look at this documentation: ingress-controller-alb.
1. Download sample ALB ingress controller manifest
wget https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/alb-ingress-controller.yaml
2. Configure the ALB ingress controller manifest
At minimum, edit the following variables:
--cluster-name=devCluster: name of the cluster. AWS resources will be tagged with kubernetes.io/cluster/devCluster:owned
If ec2metadata is unavailable from the controller pod, edit the following variables:
--aws-vpc-id=vpc-xxxxxx: vpc ID of the cluster.
--aws-region=us-west-1: AWS region of the cluster.
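For illustration, a minimal sketch of what the edited container args in alb-ingress-controller.yaml could end up looking like (the values are the placeholders from above; the last two flags are only needed when ec2metadata is unreachable):
# excerpt: container args in alb-ingress-controller.yaml
args:
  - --ingress-class=alb
  - --cluster-name=devCluster
  - --aws-vpc-id=vpc-xxxxxx
  - --aws-region=us-west-1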
3. Deploy the RBAC roles manifest
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/rbac-role.yaml
4. Deploy the ALB ingress controller manifest
kubectl apply -f alb-ingress-controller.yaml
5. Verify the deployment was successful and the controller started
kubectl logs -n kube-system $(kubectl get po -n kube-system | egrep -o "alb-ingress[a-zA-Z0-9-]+")
You should see output similar to the following:
-------------------------------------------------------------------------------
AWS ALB Ingress controller
Release: 1.0.0
Build: git-7bc1850b
Repository: https://github.com/kubernetes-sigs/aws-alb-ingress-controller.git
-------------------------------------------------------------------------------
Then you can deploy the sample application.
Execute the following commands:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-service.yaml
Deploy an Ingress resource for the 2048 game:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.3/docs/examples/2048/2048-ingress.yaml
After a few seconds, verify that the Ingress resource is enabled:
kubectl get ingress/2048-ingress -n 2048-game
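Once the controller has provisioned the ALB, the ADDRESS column should show the load balancer's DNS name; hypothetical output:
NAME           HOSTS   ADDRESS                                                                 PORTS   AGE
2048-ingress   *       example-2048game-2048ingr-6fa0-123456789.us-west-1.elb.amazonaws.com   80      2m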
I was struggling with the same issue, but finally got it working after following @MaggieO's steps above. A couple of things to consider:
Add public and private subnets to your EKS cluster. Make sure your public subnets are tagged with "kubernetes.io/role/elb":"1". If creating a managed node group, only select private subnets for placement of your worker nodes.
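If the tag is missing, it can be added with the AWS CLI, for example (the subnet ID is a placeholder):
aws ec2 create-tags --resources subnet-xxxxxxxx --tags Key=kubernetes.io/role/elb,Value=1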
Make sure the IAM role for your worker nodes has the policies AmazonEKSWorkerNodePolicy, AmazonEC2ContainerRegistryReadOnly, AmazonEKS_CNI_Policy, and the custom policy defined here: https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.2/docs/examples/iam-policy.json.
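A sketch of creating and attaching that custom policy with the AWS CLI (the policy name, node role name, and account ID are placeholders):
curl -o iam-policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.1.2/docs/examples/iam-policy.json
aws iam create-policy --policy-name ALBIngressControllerIAMPolicy --policy-document file://iam-policy.json
aws iam attach-role-policy --role-name <your-node-instance-role> --policy-arn arn:aws:iam::<account-id>:policy/ALBIngressControllerIAMPolicy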
Examine your ingress controller logs; they are helpful:
kubectl logs -n kube-system [name of your ingress controller]
Thank you for your replies!
I think the problem is that the cluster gets created without EC2 instances when I use the command eksctl create cluster -f cluster.yaml:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: test
  region: eu-central-1
  version: "1.14"

vpc:
  id: vpc-50b17738
  subnets:
    private:
      eu-central-1a: { id: subnet-aee763c6 }
      eu-central-1b: { id: subnet-bc2ee6c6 }
      eu-central-1c: { id: subnet-24734d6e }

nodeGroups:
  - name: ng-1-workers
    labels: { role: workers }
    instanceType: t3.medium
    desiredCapacity: 2
    volumeSize: 5
    privateNetworking: true
I tried with node groups and with managed node groups, but I get the following timeout error:
...
[ℹ] nodegroup "ng-1-workers" has 0 node(s)
[ℹ] waiting for at least 2 node(s) to become ready in "ng-1-workers"
Error: timed out (after 25m0s) waiting for at least 2 nodes to join the cluster and become ready in "ng-1-workers"
If you succeed in creating the controller, you will find it like this:
$ kubectl get po -n kube-system | grep alb
alb-ingress-controller-669b958f64-p69fw 1/1 Running 0 3m7s
and its logs:
$ kubectl logs -n kube-system $(kubectl get po -n kube-system | egrep -o alb-ingress[a-zA-Z0-9-]+)
-------------------------------------------------------------------------------
AWS ALB Ingress controller
Release: v1.1.8
Build: git-ec387ad1
Repository: https://github.com/kubernetes-sigs/aws-alb-ingress-controller.git
-------------------------------------------------------------------------------
W0720 13:31:21.242868 1 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I am following this AWS guide: https://aws.amazon.com/premiumsupport/knowledge-center/eks-alb-ingress-controller-fargate/ to set up my Kubernetes cluster behind an ALB.
I installed the AWS ALB controller on my EKS cluster by following the steps below:
helm repo add eks https://aws.github.io/eks-charts
kubectl apply -k "github.com/aws/eks-charts/stable/aws-load-balancer-controller//crds?ref=master"
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
--set clusterName=YOUR_CLUSTER_NAME \
--set serviceAccount.create=false \
--set region=YOUR_REGION_CODE \
--set vpcId=<VPC_ID> \
--set serviceAccount.name=aws-load-balancer-controller \
-n kube-system
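Since the chart is installed with serviceAccount.create=false, the aws-load-balancer-controller service account has to exist beforehand; a sketch of creating it per the guide (the cluster name and policy ARN are placeholders):
eksctl create iamserviceaccount \
  --cluster=YOUR_CLUSTER_NAME \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --attach-policy-arn=arn:aws:iam::<account-id>:policy/AWSLoadBalancerControllerIAMPolicy \
  --override-existing-serviceaccounts \
  --approve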
I want to deploy my ingress configurations:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/success-codes: 200,302
    alb.ingress.kubernetes.io/target-type: instance
    kubernetes.io/ingress.class: alb
  name: staging-ingress
  namespace: staging
  finalizers:
    - ingress.k8s.aws/resources
spec:
  rules:
    - http:
        paths:
          - backend:
              serviceName: my-service
              servicePort: 80
            path: /api/v1/price
Everything looks fine. However, when I run the command below to deploy my ingress:
kubectl apply -f ingress.staging.yaml -n staging
I am having below error:
Error from server (InternalError): error when creating "ingress.staging.yaml": Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": the server could not find the requested resource
There are very few similar issues on Google and none of them helped me. Any idea what the problem is?
K8s version: 1.18
Adding this security group rule solved it for me:
node_security_group_additional_rules = {
  ingress_allow_access_from_control_plane = {
    type                          = "ingress"
    protocol                      = "tcp"
    from_port                     = 9443
    to_port                       = 9443
    source_cluster_security_group = true
    description                   = "Allow access from control plane to webhook port of AWS load balancer controller"
  }
}
I would suggest taking a look at the ALB controller logs. The CRDs that you are using are for the v1beta1 API group, while the latest chart (aws-load-balancer-controller v2.4.0) registers a webhook for the v1 API group.
If you look at the ALB controller startup logs, you should see a line similar to one of the messages below:
v1beta1
{"level":"info","ts":164178.5920634,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1beta1-ingress"}
v1
{"level":"info","ts":164683.0114837,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1-ingress"}
If that is the case, you can fix the problem by using an earlier version of the controller or by installing the newer version of the CRDs.
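If you end up on a cluster that serves networking.k8s.io/v1 (Kubernetes 1.19+) and keep the newer controller, the Ingress above would also need converting to the v1 schema; a rough sketch (on 1.18 you would instead keep v1beta1 and pick an older controller/CRD version):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/success-codes: 200,302
    alb.ingress.kubernetes.io/target-type: instance
    kubernetes.io/ingress.class: alb
  name: staging-ingress
  namespace: staging
spec:
  rules:
    - http:
        paths:
          - path: /api/v1/price
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80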
I'm trying to deploy a simple tutorial app to a new Fargate-based Kubernetes cluster.
Unfortunately I'm stuck on ImagePullBackOff for the coredns pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning LoggingDisabled 5m51s fargate-scheduler Disabled logging because aws-logging configmap was not found. configmap "aws-logging" not found
Normal Scheduled 4m11s fargate-scheduler Successfully assigned kube-system/coredns-86cb968586-mcdpj to fargate-ip-172-31-55-205.eu-central-1.compute.internal
Warning Failed 100s kubelet Failed to pull image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1": rpc error: code = Unknown desc = failed to pull and unpack image "602
401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1": failed to resolve reference "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1": failed to do request: Head "https://602401143452.dkr.
ecr.eu-central-1.amazonaws.com/v2/eks/coredns/manifests/v1.8.0-eksbuild.1": dial tcp 3.122.9.124:443: i/o timeout
Warning Failed 100s kubelet Error: ErrImagePull
Normal BackOff 99s kubelet Back-off pulling image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1"
Warning Failed 99s kubelet Error: ImagePullBackOff
Normal Pulling 87s (x2 over 4m10s) kubelet Pulling image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1"
While googling I found https://aws.amazon.com/premiumsupport/knowledge-center/eks-ecr-troubleshooting/
It contains the following list:
To resolve this error, confirm the following:
- The subnet for your worker node has a route to the internet. Check the route table associated with your subnet.
- The security group associated with your worker node allows outbound internet traffic.
- The ingress and egress rule for your network access control lists (ACLs) allows access to the internet.
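A few AWS CLI checks along those lines (the subnet and security-group IDs are placeholders); for a private subnet, the route table should contain a 0.0.0.0/0 route pointing at a NAT gateway:
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-xxxxxxxx --query 'RouteTables[].Routes'
aws ec2 describe-security-groups --group-ids sg-xxxxxxxx --query 'SecurityGroups[].IpPermissionsEgress'
aws ec2 describe-network-acls --filters Name=association.subnet-id,Values=subnet-xxxxxxxx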
Since I created both my private subnets as well as their NAT gateways manually, I tried to locate an issue there but couldn't find anything. They, as well as the security groups and ACLs, look fine to me.
I even added AmazonEC2ContainerRegistryReadOnly to my EKS role, but after issuing the command kubectl rollout restart -n kube-system deployment coredns the result is unfortunately the same: ImagePullBackOff.
Unfortunately I've run out of ideas and I'm stuck. Any help troubleshooting this would be greatly appreciated. Thanks
Edit:
After creating a new cluster via eksctl, as @mreferre suggested in his comment, I get an RBAC error with this link: https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting_iam.html#security-iam-troubleshoot-cannot-view-nodes-or-workloads
I'm not sure what is going on since I already have
Edit 2:
The cluster created via the AWS Console (web interface) doesn't have the aws-auth ConfigMap. I retrieved the ConfigMap below using the command kubectl edit configmap aws-auth -n kube-system:
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      - system:node-proxier
      rolearn: arn:aws:iam::370179080679:role/eksctl-tutorial-cluster-FargatePodExecutionRole-1J605HWNTGS2Q
      username: system:node:{{SessionName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2021-04-08T18:42:59Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "918"
  selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
  uid: d9a21964-a8bf-49e9-800f-650320b7444e
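According to the troubleshooting page linked above, the console RBAC error usually means the IAM identity used in the console is not mapped in aws-auth; presumably an entry along these lines would be needed under data (the account ID and user name are placeholders):
  mapUsers: |
    - userarn: arn:aws:iam::<account-id>:user/<console-user>   # placeholder for the IAM identity used in the console
      username: <console-user>
      groups:
        - system:masters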
Creating an answer to sum up the discussion in the comments, which seemed to be acceptable. The most common (and arguably easiest) way to set up an EKS cluster with Fargate support is to use EKSCTL and create the cluster with eksctl create cluster --fargate. This builds all the plumbing for you, and you get a cluster with no EC2 instances nor managed node groups, with the two CoreDNS pods deployed on two Fargate instances. Note that when you deploy with EKSCTL via the command line you may end up using different roles/users between your CLI and the console, which may result in access denied issues. The best course of action would be to use a non-root user to log in to the AWS console and use CloudShell to deploy with EKSCTL (CloudShell inherits the same console user identity). (More info in the comments.)
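For reference, a minimal sketch of that EKSCTL invocation (the cluster name and region are placeholders):
eksctl create cluster --name my-fargate-cluster --region eu-central-1 --fargate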
After upgrading to EKS 1.16, IAM Roles for Service Accounts stopped working.
It was configured as described in the article on configuring and assigning service accounts to pods, and worked with EKS 1.14 and 1.15.
Running service-account.yaml and test-pod.yaml on EKS 1.15 (qa env) does mount the following env variables
AWS_ROLE_ARN=arn:aws:iam::xxx:role/oidc-my-service-api-qa
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
While running the same resources on EKS 1.16 (test env), they are not added.
service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxx:role/oidc-my-service-test
  name: oidc-my-service-service-account
test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
    - name: test
      image: busybox
      command: ["/bin/sh", "-c", "env | grep AWS"]
  securityContext:
    fsGroup: 1000
  serviceAccountName: "oidc-my-service-service-account"
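For context, the check on each cluster was roughly the following (the busybox pod just prints its AWS_* environment variables and exits):
kubectl apply -f service-account.yaml
kubectl apply -f test-pod.yaml
kubectl logs test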
UPDATE
Turns out I'm missing Amazon EKS Pod Identity Webhook, but where did it go?
EKS 1.15
kubectl get mutatingwebhookconfigurations pod-identity-webhook
NAME CREATED AT
pod-identity-webhook 2020-01-11T17:01:52Z
EKS 1.16
kubectl get mutatingwebhookconfigurations pod-identity-webhook
Error from server (NotFound): mutatingwebhookconfigurations.admissionregistration.k8s.io "pod-identity-webhook" not found
I was able to re-install amazon-eks-pod-identity-webhook in my EKS 1.16 cluster by cloning the code from GitHub, building the Docker image, pushing it to my ECR repo, and running
make cluster-up IMAGE=my-account-id.dkr.ecr.my-region.amazonaws.com/pod-identity-webhook:latest
It's still an open question where it went, as it's part of the managed service, as stated in this comment.
I have a Kubernetes application using AWS EKS, with the details below:
Cluster:
+ Kubernetes version: 1.15
+ Platform version: eks.1
Node Groups:
+ Instance Type: t3.medium
+ 2(Minimum) - 2(Maximum) - 2(Desired) configuration
[Pods]
+ 2 active pods
[Service]
+ Configured Type: ClusterIP
+ metadata.name: k8s-eks-api-service
[rbac-role.yaml]
https://pastebin.com/Ksapy7vK
[alb-ingress-controller.yaml]
https://pastebin.com/95CwMtg0
[ingress.yaml]
https://pastebin.com/S3gbEzez
When I pull the ingress details, below are the values (NO ADDRESS):
Host: *
ADDRESS:
My goal is to know why the address has no value. I expect to have a private or public address to be used by other services in my application.
The solution that fitted my case was adding ingressClassName in ingress.yaml, or configuring a default IngressClass.
Add ingressClassName in ingress.yaml:
# ingress.yaml
metadata:
  name: ingress-nginx
  ...
spec:
  ingressClassName: nginx    # <-- add this
  rules:
  ...
or
edit the IngressClass YAML:
$ kubectl edit ingressclass <ingressClass Name> -n <ingressClass namespace>
# ingressClass.yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  annotations:
    ingressclass.kubernetes.io/is-default-class: "true"    # <-- add this
  ...
In order for your Kubernetes cluster to get an address, you need to be able to manage Route 53 from within the cluster; for this task I would recommend using ExternalDNS.
In a broader sense, ExternalDNS allows you to control DNS records dynamically via Kubernetes resources in a DNS provider-agnostic way.
source: ExternalDNS
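As a rough illustration (the service name and hostname are placeholders), ExternalDNS watches annotated resources and creates the corresponding Route 53 records, e.g.:
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    # ExternalDNS will create a Route 53 record for this hostname
    external-dns.alpha.kubernetes.io/hostname: my-app.example.com
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80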
This happened to me too: after all the setup, I was not able to see the ingress address. The best way to debug this issue is to check the logs of the ingress controller. You can do this as follows:
Get the ingress controller pod name using: kubectl get po -n kube-system
Check the logs of the pod using: kubectl logs <po_name> -n kube-system
This will point you to the exact issue as to why you are not seeing the address.
I have started 2 Ubuntu 16 EC2 instances (one for the master and the other for the worker). Everything is working OK.
I need to set up the dashboard so I can view it on my machine. I have copied admin.conf and executed the following command in my machine's terminal:
kubectl --kubeconfig ./admin.conf proxy --address='0.0.0.0' --port=8002 --accept-hosts='.*'
Everything is fine.
But in browser when I use below link
http://localhost:8002/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
I am getting Error: 'dial tcp 192.168.1.23:8443: i/o timeout'
Trying to reach: 'https://192.168.1.23:8443/'
I have enabled all traffic in the AWS security group. What am I missing? Please point me to a solution.
If you only want to reach the dashboard then it is pretty easy: get the IP address of your EC2 instance and the port on which the dashboard is being served (kubectl get services --all-namespaces), and then reach it using:
First:
kubectl proxy --address 0.0.0.0 --accept-hosts '.*'
And in your browser:
http://<IP>:<PORT>/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login
Note that this is a possible security vulnerability, as you are accepting all traffic (AWS firewall rules) and also all connections for your kubectl proxy (--address 0.0.0.0 --accept-hosts '.*'), so please narrow it down or use a different approach. If you have more questions, feel free to ask.
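One narrower alternative (a sketch; the default proxy port and the ubuntu login are assumptions based on a stock setup) is to leave the proxy bound to localhost on the master and reach it through an SSH tunnel instead of exposing it:
# on the master: default proxy, bound to 127.0.0.1:8001
kubectl proxy
# on your local machine: forward local port 8001 to the master's proxy
ssh -L 8001:127.0.0.1:8001 ubuntu@<master-public-ip>
# then browse to:
# http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/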
Have you tried putting http:// in front of localhost?
I don't have enough rep to comment, else I would.
To bypass the dashboard token login, you have to execute the code below:
cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
EOF
After this you can skip the login without providing a token. But this will cause security issues.
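If you would rather keep the token login, a sketch of retrieving the token of the dashboard's service account instead (the secret name pattern assumes the standard kube-system dashboard install referenced above):
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep kubernetes-dashboard-token | awk '{print $1}')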