Kubectl commands to EKS, from EC2 in a private network, are timing out - amazon-web-services

This EKS cluster has a private endpoint only. My end goal is to deploy Helm charts on the EKS cluster. I connect to an EC2 machine via SSM, and I have already installed Helm and kubectl on that machine. The trouble is that in a private network the AWS APIs can't be called, so instead of calling aws eks update-kubeconfig --region region-code --name cluster-name I created the kubeconfig shown below.
apiVersion: v1
clusters:
- cluster:
    server: 1111111111111111.gr7.eu-west-1.eks.amazonaws.com
    certificate-authority-data: JTiBDRVJU111111111
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: aws
  name: aws
current-context: aws
kind: Config
preferences: {}
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws
      args:
        - "eks"
        - "get-token"
        - "--cluster-name"
        - "this-is-my-cluster"
      # - "--role-arn"
      # - "role-arn"
      # env:
      #   - name: AWS_PROFILE
      #     value: "aws-profile"
Getting the following error:
I0127 21:24:26.336266 3849 loader.go:372] Config loaded from file: /tmp/.kube/config-eks-demo
I0127 21:24:26.337081 3849 round_trippers.go:435] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.21.2 (linux/amd64) kubernetes/d2965f0" 'http://1111111111111111.gr7.eu-west-1.eks.amazonaws.com/api?timeout=32s'
I0127 21:24:56.338147 3849 round_trippers.go:454] GET http://1111111111111111.gr7.eu-west-1.eks.amazonaws.com/api?timeout=32s in 30001 milliseconds
I0127 21:24:56.338171 3849 round_trippers.go:460] Response Headers:
I0127 21:24:56.338238 3849 cached_discovery.go:121] skipped caching discovery info due to Get "http://1111111111111111.gr7.eu-west-1.eks.amazonaws.com/api?timeout=32s": dial tcp 10.1.1.193:80: i/o timeout
There is connectivity in the VPC; there are no issues with the NACLs, the security groups, or port 80.
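For reference, raw reachability of the private endpoint can be checked from the EC2 instance with something like the following (endpoint name is the placeholder from the kubeconfig above; the API server listens on 443, so even a 401/403 response proves the network path works):
# Resolve the private endpoint to its VPC IPs
nslookup 1111111111111111.gr7.eu-west-1.eks.amazonaws.com
# Try the API server over TLS on 443; a 401/403 JSON response still means
# the network path is fine
curl -vk --max-time 10 https://1111111111111111.gr7.eu-west-1.eks.amazonaws.com/version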

That looks like this open EKS issue: https://github.com/aws/containers-roadmap/issues/298
If that’s the case, upvote it so that the product team can prioritize it. If you have Enterprise support, your TAM can help there as well.

Related

How to integrate Custom CA (AWS PCA) using Kubernetes CSR in Istio

I am trying to set up Custom CA (AWS PCA) integration using Kubernetes CSR in Istio, following this doc (Istio / Custom CA Integration using Kubernetes CSR). Steps followed:
i) Enable feature gate on cert-manager controller: --feature-gates=ExperimentalCertificateSigningRequestControllers=true
ii) AWS PCA and the aws-privateca-issuer plugin are already in place.
iii) awspcaclusterissuers object in place with AWS PCA arn (arn:aws:acm-pca:us-west-2:<account_id>:certificate-authority/)
iv) Modified Istio operator with defaultConfig and caCertificates of AWS PCA issuer (awspcaclusterissuers.awspca.cert-manager.io/)
v) Modified istiod deployment and added env vars (as mentioned in the doc along with cluster role).
istiod pod is failing with this error:
Generating K8S-signed cert for [istiod.istio-system.svc istiod-remote.istio-system.svc istio-pilot.istio-system.svc] using signer awspcaclusterissuers.awspca.cert-manager.io/cert-manager-aws-root-ca
2023-01-04T07:25:26.942944Z error failed to create discovery service: failed generating key and cert by kubernetes: no certificate returned for the CSR: "csr-workload-lg6kct8nh6r9vx4ld4"
Error: failed to create discovery service: failed generating key and cert by kubernetes: no certificate returned for the CSR: "csr-workload-lg6kct8nh6r9vx4ld4"
K8s Version: 1.22
Istio Version: 1.13.5
Note: Our integration of cert-manager and AWS PCA works fine, as we generate private certificates using cert-manager and PCA with the ‘Certificates’ object. It’s the integration of the Kubernetes CSR method with Istio that is failing!
I would really appreciate it if anybody with knowledge of this could help me out here, as there are nearly zero docs on this integration.
I haven't done this with Kubernetes CSR, but I have done it with Istio CSR. Here are the steps to accomplish it with this approach.
Create a certificate in AWS Private CA and download the public root certificate either via the console or AWS CLI (aws acm-pca get-certificate-authority-certificate --certificate-authority-arn <certificate-authority-arn> --region af-south-1 --output text > ca.pem).
Create a secret to store this root certificate. Cert manager will use this public root cert to communicate with the root CA (AWS PCA).
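A minimal sketch of that step, assuming the root certificate was saved as ca.pem and using the istio-root-ca secret name that the Helm values further down reference (the cert-manager namespace is created here because the next components are installed into it):
# Namespace that cert-manager, the PCA issuer plugin and istio-csr will live in
kubectl create namespace cert-manager
# Store the AWS PCA root certificate where istio-csr can mount it
kubectl create secret generic istio-root-ca \
  --from-file=ca.pem=ca.pem \
  -n cert-manager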
Install cert-manager. Cert manager will essentially function as the intermediate CA in place of istiod.
Install AWS PCA Issuer plugin.
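Something like the following should cover those two installs (chart repositories and values may differ slightly between versions):
# cert-manager, which will act as the intermediate CA once istio-csr is added
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  -n cert-manager \
  --set installCRDs=true
# AWS Private CA issuer plugin for cert-manager
helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
helm install aws-privateca-issuer awspca/aws-privateca-issuer -n cert-manager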
Make sure you have the necessary permissions in place for the workload to communicate with AWS Private CA. The recommended approach would be to use OIDC with IRSA. The other approach is to grant permissions to the node role. The problem with this is that any pod running on your nodes essentially has access to AWS Private CA, which isn't a least privilege approach.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "awspcaissuer",
      "Action": [
        "acm-pca:DescribeCertificateAuthority",
        "acm-pca:GetCertificate",
        "acm-pca:IssueCertificate"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:acm-pca:<region>:<account_id>:certificate-authority/<resource_id>"
    }
  ]
}
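If you go the IRSA route, binding a policy based on the JSON above to the plugin's service account can be done with eksctl, roughly like this (cluster name, policy name and account ID are placeholders; the service account name and namespace assume the chart defaults, and the cluster needs an associated OIDC provider):
eksctl create iamserviceaccount \
  --cluster <cluster-name> \
  --namespace cert-manager \
  --name aws-privateca-issuer \
  --attach-policy-arn arn:aws:iam::<account_id>:policy/<awspcaissuer-policy-name> \
  --override-existing-serviceaccounts \
  --approve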
Once the permissions are in place, you can create a cluster issuer or an issuer that will represent the root CA in the cluster.
apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAClusterIssuer
metadata:
  name: aws-pca-root-ca
spec:
  arn: <aws-pca-arn-goes-here>
  region: <region-where-ca-was-created-in-aws>
Create the istio-system namespace
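That step is simply:
kubectl create namespace istio-system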
Install Istio CSR and update the Helm values for the issuer so that cert-manager knows to communicate with the AWS PCA issuer.
helm install -n cert-manager cert-manager-istio-csr jetstack/cert-manager-istio-csr \
--set "app.certmanager.issuer.group=awspca.cert-manager.io" \
--set "app.certmanager.issuer.kind=AWSPCAClusterIssuer" \
--set "app.certmanager.issuer.name=aws-pca-root-ca" \
--set "app.certmanager.preserveCertificateRequests=true" \
--set "app.server.maxCertificateDuration=48h" \
--set "app.tls.certificateDuration=24h" \
--set "app.tls.istiodCertificateDuration=24h" \
--set "app.tls.rootCAFile=/var/run/secrets/istio-csr/ca.pem" \
--set "volumeMounts[0].name=root-ca" \
--set "volumeMounts[0].mountPath=/var/run/secrets/istio-csr" \
--set "volumes[0].name=root-ca" \
--set "volumes[0].secret.secretName=istio-root-ca"
I would also recommend setting preserveCertificateRequests to true, at least for the first time you set this up, so that you can actually see the CSRs and whether or not the certificates were successfully issued.
When you install Istio CSR, this will create a certificate called istiod as well as a corresponding secret that stores the cert. The secret is called istiod-tls. This is the cert for the intermediate CA (Cert manager with Istio CSR).
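A quick way to confirm that (resource and secret names as described above; the istio-system namespace is where istio-csr places them by default):
# The istiod certificate and its istiod-tls secret should exist and be Ready
kubectl get certificate istiod -n istio-system
kubectl get secret istiod-tls -n istio-system
# With preserveCertificateRequests=true you can also inspect the request flow
kubectl get certificaterequests -n istio-system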
Install Istio with the following custom configurations:
Update the CA address to Istio CSR (the new intermediate CA)
Disable istiod from functioning as the CA
Mount istiod with the cert-manager certificate details
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio
  namespace: istio-system
spec:
  profile: "demo"
  hub: gcr.io/istio-release
  values:
    global:
      # Change certificate provider to cert-manager istio agent for istio agent
      caAddress: cert-manager-istio-csr.cert-manager.svc:443
  components:
    pilot:
      k8s:
        env:
        # Disable istiod CA Server functionality
        - name: ENABLE_CA_SERVER
          value: "false"
        overlays:
        - apiVersion: apps/v1
          kind: Deployment
          name: istiod
          patches:
          # Mount istiod serving and webhook certificate from Secret mount
          - path: spec.template.spec.containers.[name:discovery].args[7]
            value: "--tlsCertFile=/etc/cert-manager/tls/tls.crt"
          - path: spec.template.spec.containers.[name:discovery].args[8]
            value: "--tlsKeyFile=/etc/cert-manager/tls/tls.key"
          - path: spec.template.spec.containers.[name:discovery].args[9]
            value: "--caCertFile=/etc/cert-manager/ca/root-cert.pem"
          - path: spec.template.spec.containers.[name:discovery].volumeMounts[6]
            value:
              name: cert-manager
              mountPath: "/etc/cert-manager/tls"
              readOnly: true
          - path: spec.template.spec.containers.[name:discovery].volumeMounts[7]
            value:
              name: ca-root-cert
              mountPath: "/etc/cert-manager/ca"
              readOnly: true
          - path: spec.template.spec.volumes[6]
            value:
              name: cert-manager
              secret:
                secretName: istiod-tls
          - path: spec.template.spec.volumes[7]
            value:
              name: ca-root-cert
              configMap:
                defaultMode: 420
                name: istio-ca-root-cert
If you want to watch a detailed walk-through on how the different components communicate, you can watch this video:
https://youtu.be/jWOfRR4DK8k
In the video, I also show the CSRs and the certs being successfully issued, as well as test that mTLS is working as expected.
The video is long, but you can skip to 17:08 to verify that the solution works.
Here's a repo with these same steps, the relevant manifest files and architecture diagrams describing the flow: https://github.com/LukeMwila/how-to-setup-external-ca-in-istio

Can't deploy Ingress object on EKS: failed calling webhook vingress.elbv2.k8s.aws: the server could not find the requested resource

I am following this AWS guide: https://aws.amazon.com/premiumsupport/knowledge-center/eks-alb-ingress-controller-fargate/ to set up my Kubernetes cluster behind an ALB.
After installing the AWS ALB controller on my EKS cluster following the steps below:
helm repo add eks https://aws.github.io/eks-charts
kubectl apply -k "github.com/aws/eks-charts/stable/aws-load-balancer-controller//crds?ref=master"
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
--set clusterName=YOUR_CLUSTER_NAME \
--set serviceAccount.create=false \
--set region=YOUR_REGION_CODE \
--set vpcId=<VPC_ID> \
--set serviceAccount.name=aws-load-balancer-controller \
-n kube-system
I want to deploy my ingress configurations:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/success-codes: 200,302
    alb.ingress.kubernetes.io/target-type: instance
    kubernetes.io/ingress.class: alb
  name: staging-ingress
  namespace: staging
  finalizers:
    - ingress.k8s.aws/resources
spec:
  rules:
    - http:
        paths:
          - backend:
              serviceName: my-service
              servicePort: 80
            path: /api/v1/price
Everything looks fine. However, when I run the below command to deploy my ingress:
kubectl apply -f ingress.staging.yaml -n staging
I get the error below:
Error from server (InternalError): error when creating "ingress.staging.yaml": Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": the server could not find the requested resource
There are very few similar issues on Google, and none of them helped me. Any ideas what the problem is?
K8s version: 1.18
Adding this security group rule solved it for me:
node_security_group_additional_rules = {
  ingress_allow_access_from_control_plane = {
    type                          = "ingress"
    protocol                      = "tcp"
    from_port                     = 9443
    to_port                       = 9443
    source_cluster_security_group = true
    description                   = "Allow access from control plane to webhook port of AWS load balancer controller"
  }
}
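That block looks like the terraform-aws-eks module syntax; if you're not using Terraform, the equivalent one-off rule can be added with the AWS CLI (security group IDs are placeholders):
# Allow the cluster (control plane) security group to reach the webhook
# port 9443 on the worker nodes
aws ec2 authorize-security-group-ingress \
  --group-id <node-security-group-id> \
  --protocol tcp \
  --port 9443 \
  --source-group <cluster-security-group-id>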
I would suggest taking a look at the ALB controller logs. The CRDs that you are using are for the v1beta1 API group, while the latest chart (aws-load-balancer-controller v2.4.0) registers a v1 API group webhook.
If you look at the ALB controller startup logs, you should see a line similar to the messages below.
v1beta1
{"level":"info","ts":164178.5920634,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1beta1-ingress"}
v1
{"level":"info","ts":164683.0114837,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1-ingress"}
If that is the case, you can fix the problem by using an earlier version of the controller or by getting the newer version of the CRDs.
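To check which situation you're in, something like this should do (assuming the chart's default deployment name and namespace):
# Which API group did the running controller register its webhook for?
kubectl logs -n kube-system deployment/aws-load-balancer-controller | grep "registering webhook"
# Which controller CRDs are actually installed?
kubectl get crd | grep elbv2.k8s.aws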

How to authenticate into AWS ECR in Kubernetes Yaml

I have the following pod.yaml file that simply describes the creation of a Kubernetes pod:
apiVersion: v1
kind: Pod
metadata:
  name: dotnet-console-producer-poc.pod
  labels:
    app: helloworld
spec:
  containers:
    - name: dotnet-console-producer-pod
      image: 442285873998.dkr.ecr.us-east-1.amazonaws.com/dotnet-console-producer-benchmark-docker:latest
      ports:
        - containerPort: 8001
The image referenced is in AWS ECR (442285873998.dkr.ecr.us-east-1.amazonaws.com/dotnet-console-producer-benchmark-docker:latest).
When running the create resource command (kubectl create -f pod.yaml), the pod gets created but it crashes because it cannot access the image from AWS ECR. The Kubernetes error is shown below:
Failed to pull image "442285873998.dkr.ecr.us-east-1.amazonaws.com/mcflow-dotnet-console-producer-benchmark-docker:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for 442285873998.dkr.ecr.us-east-1.amazonaws.com/mcflow-dotnet-console-producer-benchmark-docker, repository does not exist or may require 'docker login': denied: User: arn:aws:sts::607546651489:assumed-role/nodes.dev.vet-dev.digitalecp.mcd.com/i-055276c817ba7a096 is not authorized to perform: ecr:BatchGetImage on resource: arn:aws:ecr:us-east-1:442285873998:repository/mcflow-dotnet-console-producer-benchmark-docker
My Kubernetes instance is running on an EC2 instance. How can I authenticate into ECR so that Kubernetes can retrieve the image and run it in the pod?
We have created a Helm chart to solve this problem, hope this helps - https://github.com/relizaio/helm-charts/#1-ecr-regcred
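For reference, and separate from the chart linked above, the two usual manual approaches are to grant the node's IAM role (or an IRSA role) ECR read permissions such as AmazonEC2ContainerRegistryReadOnly, or to create an image pull secret from a temporary ECR token, roughly like this (the secret name is arbitrary, and the token expires after about 12 hours, which is why refreshing it is usually automated):
# Create a docker-registry secret from a short-lived ECR auth token
kubectl create secret docker-registry ecr-regcred \
  --docker-server=442285873998.dkr.ecr.us-east-1.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region us-east-1)"
The pod spec then needs imagePullSecrets with the name ecr-regcred under spec so the kubelet uses it when pulling the image.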

kubectl returns "Unable to connect to the server: x509: certificate is valid for..."

I started an instance of minikube on a remote machine (k8_host). I'm trying to connect to it from a local machine (client_comp). I followed the instructions given here to set it up and move over the certificates.
It appears that I can successfully reach the server with kubectl from client_comp, but I am getting a cert error:
$ kubectl get pods
Unable to connect to the server: x509: certificate is valid for 192.168.49.2, 10.96.0.1, 127.0.0.1, 10.0.0.1, not 192.168.1.69
When I check the IP setup for minikube I get
$minikube ip
192.168.49.2
The ip of k8_host is 192.168.1.69.
If I understand correctly, it appears that when minikube was started, it auto-generated a set of certs, which required a domain. So it created the certs using the minikube local IP (192.168.49.2) on k8_host. And when I try to connect from client_comp, it sets the host to the network IP of k8_host (192.168.1.69).
Do I need to update the certs? I'm guessing that, since nginx is set up to just pass the SSL connection through (using stream), I can't just add the correct host in the nginx config.
For future reference, is there maybe something I did wrong during minikube setup?
For reference:
~/.kube/config (on client_comp)
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: [redacted]
    server: [redacted]
  name: docker-desktop
- cluster:
    certificate-authority: home_computer/ca.crt
    server: https://192.168.1.69:51999
  name: home_computer
contexts:
- context:
    cluster: docker-desktop
    user: docker-desktop
  name: docker-desktop
- context:
    cluster: home_computer
    user: home_computer
  name: home_computer
current-context: home_computer
kind: Config
preferences: {}
users:
- name: docker-desktop
  user:
    client-certificate-data: [redacted]
    client-key-data: [redacted]
- name: home_computer
  user:
    client-certificate: home_computer/client.crt
    client-key: home_computer/client.key
~/.minikube/config (on k8 host)
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /home/coopers/.minikube/ca.crt
    extensions:
    - extension:
        last-update: Thu, 25 Mar 2021 22:27:27 EDT
        provider: minikube.sigs.k8s.io
        version: v1.18.1
      name: cluster_info
    server: https://192.168.49.2:8443
  name: minikube
contexts:
- context:
    cluster: minikube
    extensions:
    - extension:
        last-update: Thu, 25 Mar 2021 22:27:27 EDT
        provider: minikube.sigs.k8s.io
        version: v1.18.1
      name: context_info
    namespace: default
    user: minikube
  name: minikube
current-context: minikube
kind: Config
preferences: {}
users:
- name: minikube
  user:
    client-certificate: /home/coopers/.minikube/profiles/minikube/client.crt
    client-key: /home/coopers/.minikube/profiles/minikube/client.key
/etc/nginx/nginx.conf (on k8 host)
stream {
    server {
        listen 192.168.1.69:51999;
        proxy_pass 192.168.49.2:8443;
    }
}
I saw this question, but it seems to have a different root issue.
Thank you for any help or guidance.
Alright, I found an approach. This is a deep-6-ish approach and should only be used if you are ok with losing the state of your k8s cluster.
First, I stopped the cluster, and deleted all the cluster definitions:
$ minikube stop
$ minikube delete --all
I then restarted the cluster with
$ minikube start --apiserver-ips=<k8_host ip>
This recreated the client key and cert, but kept the same ca cert. So, I just needed to copy over ~/.minikube/profiles/minikube/client.crt and ~/.minikube/profiles/minikube/client.key from k8_host to client_comp.
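Copying them over can be as simple as the following, run from client_comp (the SSH user is a placeholder; the destination paths match the client_comp kubeconfig above):
# Pull the regenerated client cert/key from k8_host
scp <user>@192.168.1.69:~/.minikube/profiles/minikube/client.crt home_computer/client.crt
scp <user>@192.168.1.69:~/.minikube/profiles/minikube/client.key home_computer/client.key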
Hope this helps someone else in the future.

How to connect AWS EKS cluster from Azure Devops pipeline - No user credentials found for cluster in KubeConfig content

I have to set up CI in Microsoft Azure DevOps to deploy and manage AWS EKS cluster resources. As a first step, I found a few Kubernetes tasks to make a connection to a Kubernetes cluster (in my case, AWS EKS), but in the "kubectlapply" task in Azure DevOps I can only pass the kubeconfig file or an Azure subscription to reach the cluster.
In my case, I have the kubeconfig file, but I also need to pass the AWS user credentials that are authorized to access the AWS EKS cluster. There is no option in the task, when adding the new "k8s end point", to provide the AWS credentials that can be used to access the EKS cluster. Because of that, I am seeing the below error while verifying the connection to the EKS cluster.
At runtime I can pass the AWS credentials via environment variables in the pipeline, but I cannot add the kubeconfig file in the task and SAVE it.
Azure and AWS are big players in the cloud, and there should be ways to connect to AWS resources from any CI platform. Has anyone faced this kind of issue, and what is the best approach to connect to AWS and the EKS cluster for deployments in Azure DevOps CI?
No user credentials found for cluster in KubeConfig content. Make sure that the credentials exist and try again.
Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through the AWS IAM Authenticator for Kubernetes. You may update your config file referring to the following format:
apiVersion: v1
clusters:
- cluster:
    server: ${server}
    certificate-authority-data: ${cert}
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: aws
  name: aws
current-context: aws
kind: Config
preferences: {}
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      env:
        - name: "AWS_PROFILE"
          value: "dev"
      args:
        - "token"
        - "-i"
        - "mycluster"
Useful links:
https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html
https://github.com/kubernetes-sigs/aws-iam-authenticator#specifying-credentials--using-aws-profiles
I got the solution by using a ServiceAccount, following this post: How to deploy to AWS Kubernetes from Azure DevOps
For anyone who is still having this issue, I had to set this up for the startup I worked for, and it was pretty simple.
After your cluster is created, create the service account:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-robot
EOF
Then apply the ClusterRoleBinding:
$ kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: build-robot
  name: build-robot
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- kind: ServiceAccount
  name: build-robot
  namespace: default
EOF
Be careful with the above, as it gives full access; check out https://kubernetes.io/docs/reference/access-authn-authz/rbac/ for more info on scoping the access.
From there, head over to ADO and follow the steps using build-robot as the SA name:
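For example, a rough sketch of scoping the account to a single namespace with the built-in edit role instead of a cluster-wide admin binding (the target namespace is a placeholder):
$ kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: build-robot-edit
  namespace: <app-namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: build-robot
  namespace: default
EOF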
$ kubectl get serviceAccounts build-robot -n default -o='jsonpath={.secrets[*].name}'
xyz........
$ kubectl get secret xyz........ -n default -o json
...
...
...
Paste the output into the last box when adding the Kubernetes resource to the environment and select "Accept untrusted certificates". Then click apply and validate, and you should be good to go.