Certificate error when deploying Hashicorp Vault with Kubernetes Auth Method on AWS EKS - amazon-web-services

I am trying to deploy Hashicorp Vault with Kubernetes Auth Method on AWS EKS.
Hashicorp Auth Method:
https://www.vaultproject.io/docs/auth/kubernetes.html
The procedure I used is derived from the CoreOS Vault Operator docs, though I am not actually using their operator:
https://github.com/coreos/vault-operator/blob/master/doc/user/kubernetes-auth-backend.md
Below is a summary of the procedure I used, with some additional content. Essentially, after following the needed steps, I get a certificate error when attempting to actually log in to Vault. Any help is appreciated.
Create the service account and clusterrolebinding for tokenreview:
$kubectl -n default create serviceaccount vault-tokenreview
$kubectl -n default create -f example/k8s_auth/vault-tokenreview-binding.yaml
Contents of vault-tokenreview-binding.yaml file
=========================================
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: vault-tokenreview-binding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: vault-tokenreview
  namespace: default
Enable vault auth and add Kubernetes cluster to vault:
$SECRET_NAME=$(kubectl -n default get serviceaccount vault-tokenreview -o jsonpath='{.secrets[0].name}')
$TR_ACCOUNT_TOKEN=$(kubectl -n default get secret ${SECRET_NAME} -o jsonpath='{.data.token}' | base64 --decode)
$vault auth-enable kubernetes
$vault write auth/kubernetes/config kubernetes_host=XXXXXXXXXX kubernetes_ca_cert=#ca.crt token_reviewer_jwt=$TR_ACCOUNT_TOKEN
Contents of ca.crt file
NOTE: I retrieved the certificate from the AWS EKS console, where it is shown base64-encoded in the "Certificate authority" field. I base64-decoded it and placed it here.
=================
-----BEGIN CERTIFICATE-----
* encoded entry *
-----END CERTIFICATE-----
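For reference, instead of copying the value out of the console by hand, the cluster endpoint and CA can also be pulled with the AWS CLI; a sketch, assuming the cluster is named my-eks-cluster (Vault's @file syntax then reads the PEM from disk):
# Fetch the API server endpoint and the base64-encoded CA, then decode the CA to a PEM file.
aws eks describe-cluster --name my-eks-cluster --query "cluster.endpoint" --output text
aws eks describe-cluster --name my-eks-cluster --query "cluster.certificateAuthority.data" --output text | base64 --decode > ca.crt
# Point the auth backend at the decoded PEM (@ reads the file contents).
vault write auth/kubernetes/config \
    kubernetes_host="https://XXXXXXXXXX" \
    kubernetes_ca_cert=@ca.crt \
    token_reviewer_jwt="$TR_ACCOUNT_TOKEN"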
Create the vault policy and role:
$vault write sys/policy/demo-policy policy=#example/k8s_auth/policy.hcl
Contents of policy.hcl file
=====================
path "secret/demo/*" {
capabilities = ["create", "read", "update", "delete", "list"]
}
$vault write auth/kubernetes/role/demo-role \
bound_service_account_names=default \
bound_service_account_namespaces=default \
policies=demo-policy \
ttl=1h
Attempt to log in to Vault using the service account created in the last step:
$SECRET_NAME=$(kubectl -n default get serviceaccount default -o jsonpath='{.secrets[0].name}')
$DEFAULT_ACCOUNT_TOKEN=$(kubectl -n default get secret ${SECRET_NAME} -o jsonpath='{.data.token}' | base64 --decode)
$vault write auth/kubernetes/login role=demo-role jwt=${DEFAULT_ACCOUNT_TOKEN}
Error writing data to auth/kubernetes/login: Error making API request.
URL: PUT http://localhost:8200/v1/auth/kubernetes/login
Code: 500. Errors:
* Post https://XXXXXXXXX.sk1.us-west-2.eks.amazonaws.com/apis/authentication.k8s.io/v1/tokenreviews: x509: certificate signed by unknown authority

Your Kubernetes URL https://XXXXXXXXX.sk1.us-west-2.eks.amazonaws.com has the bad cert; try adding -tls-skip-verify:
vault write -tls-skip-verify auth/kubernetes/login .......
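Note that -tls-skip-verify only affects the connection from the vault CLI to the Vault server; the x509 error above is returned by Vault itself when it calls the EKS TokenReview API. If skipping verification is not an option, it is also worth sanity-checking that the PEM passed as kubernetes_ca_cert really is the cluster CA; a rough check, assuming ca.crt is the decoded file from earlier:
# If the CA is correct, curl should reach the API server without a certificate error
# (a 401/403 response body is fine here; only a TLS failure matters).
curl --cacert ca.crt https://XXXXXXXXX.sk1.us-west-2.eks.amazonaws.com/version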

Related

AWS EKS Fargate logging to AWS Cloudwatch: log groups are not creating

I used the official procedure from AWS and this one to enable logging.
Here are the YAML files I've applied:
---
kind: Namespace
apiVersion: v1
metadata:
  name: aws-observability
  labels:
    aws-observability: enabled
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  flb_log_cw: "true"
  output.conf: |
    [OUTPUT]
        Name cloudwatch_logs
        Match *
        region us-east-1
        log_group_name fluent-bit-cloudwatch
        log_stream_prefix from-fluent-bit-
        auto_create_group true
        log_key log
  parsers.conf: |
    [PARSER]
        Name crio
        Format Regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
  filters.conf: |
    [FILTER]
        Name parser
        Match *
        Key_name log
        Parser crio
Inside the pod I can see that logging was enabled:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    CapacityProvisioned: 2vCPU 4GB
    Logging: LoggingEnabled
    kubectl.kubernetes.io/restartedAt: "2023-01-17T19:31:20+01:00"
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2023-01-17T18:31:28Z"
Logs exist inside the container:
kubectl logs dev-768647846c-hbmv7 -n dev-fargate
But in AWS CloudWatch the log groups are not created, not even for Fluent Bit itself.
From the pod CLI I can create log groups in AWS CloudWatch, so the permissions are fine.
I also tried the cloudwatch plugin instead of cloudwatch_logs, but no luck.
I've solved my issue.
The tricky thing is that the IAM policy must be attached to the default pod execution role, which is created automatically with the namespace; it has no relation to the service account or a custom pod execution role.
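As a sketch of that fix (role, policy, and profile names below are placeholders; the real pod execution role ARN comes from your Fargate profile):
# Look up the pod execution role used by the Fargate profile, then attach
# the CloudWatch logging policy to that role.
aws eks describe-fargate-profile --cluster-name my-cluster \
    --fargate-profile-name dev-fargate \
    --query "fargateProfile.podExecutionRoleArn"
aws iam attach-role-policy \
    --role-name <pod-execution-role-name> \
    --policy-arn arn:aws:iam::<account-id>:policy/fluent-bit-cloudwatch-policy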

How to use k8s ServiceAccount to assume correct AWS Role (IRSA)? Running a Spring Boot app on EKS

I have a Spring Boot (2.6.6) app running in an EKS cluster that tries to authenticate to AWS by assuming an AWS role. I've followed this doc so far.
<dependencies>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-sts</artifactId>
    <version>1.12.9</version>
  </dependency>
</dependencies>
And in my application helm/k8s setup:
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      ...
    spec:
      serviceAccountName: myapp-service-account
      securityContext:
        fsGroup: 123456
      initContainers:
      ...
ServiceAccount setup:
~ % kubectl get serviceaccounts -n dev
NAME                    SECRETS   AGE
default                 1         2y1d
myapp-service-account   1         7d2h
..
~ % kubectl get serviceaccounts/myapp-service-account -n dev -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account_id>:role/<aws_role_to_assume>
..
But my app doesn't seem to assume the correct role:
2022-06-21 17:50:25.284 WARN [my-app,,] 1 --- [ main] s.AwsSecretsManagerPropertySourceLocator : Unable to load AWS secret from /secret/my-app_dev. User: arn:aws:sts::<account_id>:assumed-role/<cluster_generated_default_role> is
not authorized to perform: secretsmanager:GetSecretValue on resource: /secret/my-app_dev because no identity-based policy allows the secretsmanager:GetSecretValue action (Service: AWSSecretsManager; Status Code: 400; Error Code: AccessDeniedException; Request ID: 1111111-2222-33333-444444; Proxy: null)
In the above, I thought adding serviceAccountName: myapp-service-account would allow the app to pick up the new ServiceAccount and thus assume a different role. What am I misconfiguring here?
EDIT
Environment variables:
~ % kubectl exec my-app-pod-1234-abcd -n dev -- env
...
JAVA_OPTS=
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_ROLE_ARN=arn:aws:iam::<aws_account_id>:role/<aws_role_to_assume>
...
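A quick way to confirm which identity the pod actually ends up with (assuming the AWS CLI is available inside the image, which it may not be for a plain Spring Boot container):
# If IRSA is working, the ARN should be the assumed <aws_role_to_assume>,
# not the cluster-generated default role from the error message above.
kubectl exec my-app-pod-1234-abcd -n dev -- aws sts get-caller-identity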

EKS: can't see nodes and nodes are not join to the cluster

I have read all the AWS articles and followed each one, but none of them worked. Let me briefly summarize my situation. I created the EKS setup with Terraform: 1 VPC, 3 public subnets, 3 private subnets, 3 security groups, 1 NAT gateway (on the public subnets), and 2 autoscaled worker node groups. I checked all the infrastructure created with Terraform; there are no problems.
My main problem is that after the installation I can't see the nodes and the nodes do not join the cluster. I applied the steps below but they didn't work. What should I do? By the way, please don't tag my question as a duplicate; I checked all the similar questions on Stack Overflow. My steps look correct but do not work.
kubectl get nodes
No resources found
Before checking the nodes with the above command, I first applied the command below to set up my kubeconfig.
aws eks update-kubeconfig --name eks-DS7h --region us-east-1
Here my kubeconfig:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJfgzsfhadfzasdfrzsd.........
    server: https://0F97E579A.gr7.us-east-1.eks.amazonaws.com
  name: arn:aws:eks:us-east-1:545153234644:cluster/eks-DS7h
contexts:
- context:
    cluster: arn:aws:eks:us-east-1:545153234644:cluster/eks-DS7h
    user: arn:aws:eks:us-east-1:545153234644:cluster/eks-DS7h
  name: arn:aws:eks:us-east-1:545153234644:cluster/eks-DS7h
current-context: arn:aws:eks:us-east-1:545153234644:cluster/eks-DS7h
kind: Config
preferences: {}
users:
- name: arn:aws:eks:us-east-1:545153234644:cluster/eks-DS7h
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - us-east-1
      - eks
      - get-token
      - --cluster-name
      - eks-DS7h
      command: aws
After this I checked the nodes again, but I still get "No resources found". Then I tried to edit aws-auth. Before the edit, I checked my user in the terminal where I ran all the Terraform steps.
aws sts get-caller-identity
{
    "UserId": "ASDFGSDFGDGSDGDFHSFDSDC",
    "Account": "545153234644",
    "Arn": "arn:aws:iam::545153234644:user/white"
}
I took my user info and added it to the blank mapUsers area in aws-auth, but I am still getting "No resources found".
kubectl get cm -n kube-system aws-auth
apiVersion: v1
data:
  mapAccounts: |
    []
  mapRoles: |
    - "groups":
      - "system:bootstrappers"
      - "system:nodes"
      - "system:masters"
      "rolearn": "arn:aws:iam::545153234644:role/eks-DS7h22060508195731770000000e"
      "username": "system:node:{{EC2PrivateDNSName}}"
  mapUsers: "- \"userarn\": \"arn:aws:iam::545153234644:user/white\"\n \"username\":
    \"white\"\n \"groups\":\n - \"system:masters\"\n - \"system:nodes\" \n"
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-05T08:20:02Z"
  labels:
    app.kubernetes.io/managed-by: Terraform
    terraform.io/module: terraform-aws-modules.eks.aws
  name: aws-auth
  namespace: kube-system
  resourceVersion: "4976"
  uid: b12341-33ff-4f78-af0a-758f88
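For readability, the escaped mapUsers string in that output is intended to express roughly the following YAML (same data, just unescaped):
mapUsers: |
  - "userarn": "arn:aws:iam::545153234644:user/white"
    "username": "white"
    "groups":
    - "system:masters"
    - "system:nodes"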
Also, when I check the EKS cluster in the dashboard, I see the warning below too. I don't know if it is relevant or not, but I want to share it in case it helps.
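One thing worth double-checking here, assuming the worker node groups are EKS managed node groups (the node group name below is a placeholder), is that the rolearn mapped in aws-auth matches the IAM role actually attached to the node groups:
# List the node groups, then compare each node group's IAM role with the
# "rolearn" entry in the aws-auth ConfigMap shown above.
aws eks list-nodegroups --cluster-name eks-DS7h --region us-east-1
aws eks describe-nodegroup --cluster-name eks-DS7h --region us-east-1 \
    --nodegroup-name <nodegroup-name> --query "nodegroup.nodeRole"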

Not authorized to perform sts:AssumeRoleWithWebIdentity- 403

I have been trying to run an external-dns pod using the guide provided by k8s-sig group. I have followed every step of the guide, and getting the below error.
time="2021-02-27T13:27:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 87a3ca86-ceb0-47be-8f90-25d0c2de9f48"
I had created the AWS IAM policy using Terraform, and it was successfully created. Except for the IAM role for the service account, for which I used eksctl, everything else was spun up via Terraform.
But then I got hold of this article, which says that creating the AWS IAM policy with the AWS CLI would eliminate this error. So I deleted the policy created using Terraform and recreated it with the AWS CLI. Yet it throws the same error.
Below is my external-dns YAML file.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  # If you're using Amazon EKS with IAM Roles for Service Accounts, specify the following annotation.
  # Otherwise, you may safely omit it.
  annotations:
    # Substitute your account ID and IAM service role name below.
    eks.amazonaws.com/role-arn: arn:aws:iam::268xxxxxxx:role/eksctl-ats-Eks1-addon-iamserviceaccoun-Role1-WMLL93xxxx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=xyz.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=Z0471542U7WSPZxxxx
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
I am scratching my head, as there is no proper solution to this error anywhere on the net. I am hoping to find a solution to this issue in this forum.
The end result should show something like the line below and fill up the records in the hosted zone.
time="2020-05-05T02:57:31Z" level=info msg="All records are already up to date"
I also struggled with this error.
The problem was in the definition of the trust relationship.
You can see the following setup in some official AWS tutorials (like this one):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>"
        }
      }
    }
  ]
}
Option 1 for failure
My problem was that I passed a wrong value for my-service-account at the end of ${OIDC_PROVIDER}:sub in the Condition part.
Option 2 for failure
After the previous fix I still faced the same error. It was solved by following this AWS tutorial, which shows the output of using eksctl with the command below:
eksctl create iamserviceaccount \
--name my-serviceaccount \
--namespace <your-ns> \
--cluster <your-cluster-name> \
--attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
--approve
When you look at the output in the trust relationship tab in the AWS web console, you can see that an additional condition was added, with the postfix :aud and the value sts.amazonaws.com.
So this needs to be added after the "${OIDC_PROVIDER}:sub" condition, as shown below.
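A sketch of the resulting Condition block with both keys (the OIDC provider, namespace, and service account values are placeholders):
"Condition": {
  "StringEquals": {
    "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>",
    "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
  }
}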
I was able to get help from the Kubernetes Slack (shout out to #Rob Del), and this is what we came up with. There's nothing wrong with the k8s RBAC from the article; the issue is the way the IAM role is written. I am using Terraform v0.12.24, but I believe something similar to the following .tf should work for Terraform v0.14:
data "aws_caller_identity" "current" {}
resource "aws_iam_role" "external_dns_role" {
name = "external-dns"
assume_role_policy = jsonencode({
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": format(
"arn:aws:iam::${data.aws_caller_identity.current.account_id}:%s",
replace(
"${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}",
"https://",
"oidc-provider/"
)
)
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
format(
"%s:sub",
trimprefix(
"${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}",
"https://"
)
) : "system:serviceaccount:default:external-dns"
}
}
}
]
})
}
The above .tf assumes you created your EKS cluster using Terraform and that you use the RBAC manifest from the external-dns tutorial.
I have a few possibilities here.
Before anything else, does your cluster have an OIDC provider associated with it? IRSA won't work without it.
You can check that in the AWS console, or via the CLI with:
aws eks describe-cluster --name {name} --query "cluster.identity.oidc.issuer"
First
Delete the iamserviceaccount, recreate it, remove the ServiceAccount definition from your ExternalDNS manifest (the entire first section), and re-apply it.
eksctl delete iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster}
eksctl create iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster} \
    --attach-policy-arn {policy-arn} --approve --override-existing-serviceaccounts
kubectl apply -n {namespace} -f {your-externaldns-manifest.yaml}
It may be that there is some conflict going on, as you have overwritten what you created with eksctl create iamserviceaccount by also specifying a ServiceAccount in your ExternalDNS manifest.
Second
Upgrade your cluster to v1.19 (if it's not there already):
eksctl upgrade cluster --name {name} will show you what will be done;
eksctl upgrade cluster --name {name} --approve will do it
Third
Some documentation suggests that in addition to setting securityContext.fsGroup: 65534, you also need to set securityContext.runAsUser: 0.
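A minimal sketch of that pod-level securityContext, assuming it goes in the ExternalDNS Deployment's pod spec:
securityContext:
  fsGroup: 65534   # lets ExternalDNS read the mounted service account / AWS token files
  runAsUser: 0     # some docs suggest running as root in addition to setting fsGroup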
I've been struggling with a similar issue after following the setup suggested here
I ended up with the exception below in the deploy logs.
time="2021-05-10T06:40:17Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3fda6c69-2a0a-4bc9-b478-521b5131af9b"
time="2021-05-10T06:41:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 7d3e07a2-c514-44fa-8e79-d49314d9adb6"
In my case, it was an issue with the wrong service account name being mapped to the newly created role.
Here is a step-by-step approach to get this done without many hiccups.
Create the IAM Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": [
        "arn:aws:route53:::hostedzone/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:ListHostedZones",
        "route53:ListResourceRecordSets"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
Create the IAM role and the service account for your EKS cluster.
eksctl create iamserviceaccount \
    --name external-dns-sa-eks \
    --namespace default \
    --cluster aecops-grpc-test \
    --attach-policy-arn arn:aws:iam::xxxxxxxx:policy/external-dns-policy-eks \
    --approve \
    --override-existing-serviceaccounts
Create a new hosted zone.
aws route53 create-hosted-zone --name "hosted.domain.com." --caller-reference "grpc-endpoint-external-dns-test-$(date +%s)"
Deploy ExternalDNS after creating the ClusterRole and ClusterRoleBinding bound to the previously created service account.
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns-sa-eks
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
      # If you're using kiam or kube2iam, specify the following annotation.
      # Otherwise, you may safely omit it.
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::***********:role/eksctl-eks-cluster-name-addon-iamserviceacco-Role1-156KP94SN7D7
    spec:
      serviceAccountName: external-dns-sa-eks
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=hosted.domain.com. # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=my-hostedzone-identifier
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
Update the Ingress resource with the domain name and reapply the manifest.
For Ingress objects, ExternalDNS will create a DNS record based on the host specified for the Ingress object, as in the sketch below.
- host: myapp.hosted.domain.com
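For context, a minimal Ingress sketch showing where that host value goes (the ingress class annotation and backend service name here are illustrative placeholders, not from the original setup):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    kubernetes.io/ingress.class: alb   # placeholder; use whatever ingress controller you run
spec:
  rules:
  - host: myapp.hosted.domain.com      # ExternalDNS creates the DNS record from this host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp
            port:
              number: 80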
Validate that the new records were created.
BASH-3.2$ aws route53 list-resource-record-sets --output json \
    --hosted-zone-id "/hostedzone/Z065*********" --query "ResourceRecordSets[?Name == 'hosted.domain.com..']|[?Type == 'A']"
[
    {
        "Name": "myapp.hosted.domain.com..",
        "Type": "A",
        "AliasTarget": {
            "HostedZoneId": "ZCT6F*******",
            "DNSName": "****************.elb.ap-southeast-2.amazonaws.com.",
            "EvaluateTargetHealth": true
        }
    }
]
In our case this issue occurred when using the Terraform module to create the EKS cluster, and eksctl to create the iamserviceaccount for the aws-load-balancer controller. It all works fine the first go-round, but if you do a terraform destroy, you need to do some cleanup, like deleting the CloudFormation stack created by eksctl. Somehow things got crossed, and the CloudTrail was passing along a resource role that was no longer valid. So check the annotation on the service account to ensure it's valid, and update it if necessary. In my case I then deleted and redeployed the aws-load-balancer-controller.
%> kubectl describe serviceaccount aws-load-balancer-controller -n kube-system
Name: aws-load-balancer-controller
Namespace: kube-system
Labels: app.kubernetes.io/managed-by=eksctl
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-JQL4R3JM7I1A
Image pull secrets: <none>
Mountable secrets: aws-load-balancer-controller-token-b8hw7
Tokens: aws-load-balancer-controller-token-b8hw7
Events: <none>
%>
%> kubectl annotate --overwrite serviceaccount aws-load-balancer-controller eks.amazonaws.com/role-arn='arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-17A92GGXZRY6O' -n kube-system
In my case, I was able to attach the OIDC role with a Route53 permissions policy, and that resolved the error.
https://medium.com/swlh/amazon-eks-setup-external-dns-with-oidc-provider-and-kube2iam-f2487c77b2a1
and then the external-dns service account used that instead of the cluster role:
annotations:
  # Substitute your account ID and IAM service role name below.
  eks.amazonaws.com/role-arn: arn:aws:iam::<account>:role/external-dns-service-account-oidc-role
For me the issue was that the trust relationship was (correctly) set up using one partition, whereas the ServiceAccount was annotated with a different partition, like so:
...
"Principal": {
  "Federated": "arn:aws-us-gov:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
},
...
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::{{ .Values.aws.account }}:role/{{ .Values.aws.roleName }}
Notice arn:aws:iam vs arn:aws-us-gov:iam

Authenticate an AWS SQS scaler in Keda

I have a Keda deployment that I've been trying to get to work for about a month now. At the moment, my scaler looks like this:
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: {service-name}-scaler
spec:
  scaleTargetRef:
    deploymentName: {service-name}
    containerName: {service-name}
  pollingInterval: 30
  cooldownPeriod: 600
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
  - type: aws-sqs-queue
    authenticationRef:
      name: keda-trigger-authentication
    metadata:
      queueURL: https://sqs.ap-northeast-1.amazonaws.com/{AWS ID}/{Queue-name}
      queueLength: "1"
      awsRegion: "ap-northeast-1"
      identityOwner: pod
The associated trigger authentication and secret are:
apiVersion: v1
kind: Secret
metadata:
  name: keda-secrets
data:
  AWS_ACCESS_KEY_ID: {base64-encoded-string}
  AWS_SECRET_ACCESS_KEY: {base64-encoded-string}
  KEDA_ROLE_ARN: {base64-encoded-string}
---
apiVersion: keda.k8s.io/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-authentication
spec:
  env:
  - parameter: awsRegion
    name: AWS_REGION
  - parameter: awsAccessKeyID
    name: AWS_ACCESS_KEY_ID
  - parameter: awsSecretAccessKey
    name: AWS_SECRET_ACCESS_KEY
  - parameter: awsRoleArn
    name: KEDA_ROLE_ARN
  secretTargetRef:
  - parameter: awsRoleArn
    name: keda-secrets
    key: KEDA_ROLE_ARN
I understand that the KEDA_ROLE_ARN value is repeated here; I left both for debugging purposes. The order of deploying this is as follows:
Install common environment variables. This is where the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and KEDA_ROLE_ARN values are stored. The AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY values are listed as AWS_ACCESS_KEY_ID_ASSUME and AWS_SECRET_ACCESS_KEY_ASSUME respectively in the file and will assume their appropriate names on the container. Again, these are duplicated for debugging purposes; I would prefer to use these values rather than a separate secret.
Install Keda pods with Helm
Deploy the keda-secrets secret and the keda-trigger-authentication trigger authentication
Deploy the container that should be scaled. This is where the AWS_ACCESS_KEY_ID_ASSUME value will assume the name of AWS_ACCESS_KEY_ID and the AWS_SECRET_ACCESS_KEY_ASSUME value will assume the name of AWS_SECRET_ACCESS_KEY and where the AWS_REGION value is defined.
The scaled object is deployed
For some reason, I keep getting an error from AWS when the scaler attempts to scale saying that there are no credential providers in the chain. It appears that the AWS credentials are not being sent. What am I doing wrong here?
I will show you two ways to successfully scale a deployment based on AWS SQS.
First way: Using the AWS IAM role attached to the node
If your IAM role (node role) has permission to access SQS, this becomes easier: you just have to change the identityOwner: pod field to identityOwner: operator so that KEDA can use the node role to access AWS SQS.
Sample ScaledObject file with SQS trigger
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: aws-sqs-queue-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: test-deployment
  minReplicaCount: 0
  maxReplicaCount: 2
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/3243234432432/Queue
      queueLength: "5"
      awsRegion: "us-east-1"
      identityOwner: operator
Second way: Using IAM user
In this approach, we need to create the objects below:
Create IAM user in AWS.
Create secret in Kubernetes.
Create TriggerAuthentication in Kubernetes.
Create scaledObject in Kubernetes.
Create an IAM user and give SQS permissions to this IAM user.
First, encode the IAM user's access key and secret key using base64; these will be required when creating the Kubernetes Secret (see the sketch below).
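A minimal sketch of that encoding step (the values shown are placeholders, not real credentials):
# Encode the IAM user's credentials for the Secret's data fields.
echo -n '<ACCESS_KEY_ID>' | base64          # value for AWS_ACCESS_KEY_ID
echo -n '<SECRET_ACCESS_KEY>' | base64      # value for AWS_SECRET_ACCESS_KEY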
Create secret
apiVersion: v1
kind: Secret
metadata:
  name: test-secrets
  namespace: default
data:
  AWS_ACCESS_KEY_ID: <base64-encoded-key>
  AWS_SECRET_ACCESS_KEY: <base64-encoded-secret-key>
Create the TriggerAuthentication; this will be used in the ScaledObject.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-aws-credentials
  namespace: default
spec:
  secretTargetRef:
  - parameter: awsAccessKeyID     # Required.
    name: test-secrets            # Required.
    key: AWS_ACCESS_KEY_ID        # Required.
  - parameter: awsSecretAccessKey # Required.
    name: test-secrets            # Required.
    key: AWS_SECRET_ACCESS_KEY    # Required.
Create the ScaledObject to map KEDA to the deployment you want to scale based on the SQS trigger.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: aws-sqs-queue-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: test-deployment
  minReplicaCount: 0
  maxReplicaCount: 2
  triggers:
  - type: aws-sqs-queue
    authenticationRef:
      name: keda-trigger-auth-aws-credentials
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/012345678912/Queue
      queueLength: "5"
      awsRegion: "us-east-1"