How to assign AWS IAM Role to Service Account with Terraform? - amazon-web-services

I have a Kubernetes EKS cluster on AWS, and my goal is to be able to watch particular config maps in my Spring Boot application.
On my local environment everything works correctly, but when I use this setup inside AWS I get a Forbidden response and my application fails to run.
I've created a Service Account but don't understand how to write a Terraform script that assigns the needed IAM Role to it.
Any help would be appreciated.

This depends on several things.
An AWS IAM Role can be provided to Pods in different ways, but the recommended way now is to use IAM Roles for Service Accounts (IRSA).
Depending on how you provision the Kubernetes cluster with Terraform, this is also done in different ways. If you use AWS EKS and provision the cluster using the Terraform AWS EKS module, then you should set enable_irsa to true.
You then need to create an IAM Role for your application (Pods), and you need to return the ARN of the IAM Role. This can be done using the aws_iam_role resource.
You need to create a Kubernetes ServiceAccount for your Pod. It can be created with Terraform, but many prefer to use YAML for Kubernetes resources. The ServiceAccount needs to be annotated with the IAM Role ARN, like:
annotations:
  eks.amazonaws.com/role-arn: arn:aws:iam::14xxx84:role/my-iam-role
See the IAM Roles for Service Accounts lesson in the EKS workshop for a walkthrough. Note, however, that it does not use Terraform.
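If you would rather keep everything in Terraform, the steps above could be sketched roughly as follows. This is only a sketch under assumptions: the AWS and Kubernetes providers are already configured against the cluster, the OIDC provider ARN and issuer URL are passed in as the hypothetical variables oidc_provider_arn and oidc_issuer_url (for example from your EKS module outputs), and the namespace, ServiceAccount name and role name are placeholders.

```hcl
# Assumption: both values come from wherever the cluster is provisioned.
variable "oidc_provider_arn" {
  type        = string
  description = "ARN of the cluster's IAM OIDC provider"
}

variable "oidc_issuer_url" {
  type        = string
  description = "OIDC issuer URL of the cluster, including https://"
}

# Placeholder names; adjust to your application.
locals {
  namespace            = "default"
  service_account_name = "my-app"
}

# Trust policy: only the named ServiceAccount may assume the role.
data "aws_iam_policy_document" "my_app_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${replace(var.oidc_issuer_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:${local.namespace}:${local.service_account_name}"]
    }
  }
}

resource "aws_iam_role" "my_app" {
  name               = "my-app-irsa"
  assume_role_policy = data.aws_iam_policy_document.my_app_assume.json
}

# ServiceAccount annotated with the IAM Role ARN.
resource "kubernetes_service_account" "my_app" {
  metadata {
    name      = local.service_account_name
    namespace = local.namespace
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.my_app.arn
    }
  }
}
```

The Pod spec then needs to reference this ServiceAccount by name, and whatever AWS permissions the application needs still have to be attached to the role separately.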

First I created the necessary role using below code:
data "aws_iam_policy_document" "eks_pods" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"
    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:kube-system:aws-node"]
    }
    principals {
      identifiers = [aws_iam_openid_connect_provider.eks.arn]
      type        = "Federated"
    }
  }
}
# create a role that can be attached to pods.
resource "aws_iam_role" "eks_pods" {
  assume_role_policy = data.aws_iam_policy_document.eks_pods.json
  name               = "eks-pods-iam-role01"
  depends_on         = [aws_iam_openid_connect_provider.eks]
}
resource "aws_iam_role_policy_attachment" "aws_pods" {
  role       = aws_iam_role.eks_pods.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  depends_on = [aws_iam_role.eks_pods]
}
Then I used the below command to attach the role created to the service account. I have not found any way to do it from within terraform:
kubectl annotate serviceaccount -n kube-system aws-node eks.amazonaws.com/role-arn=arn:aws:iam::<your_account>:role/eks-pods-iam-role01
Then you can verify the service account; it should show the new annotation.
kubectl describe sa aws-node -n kube-system
Name: aws-node
Namespace: kube-system
Labels: <none>
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::<your_account>:role/eks-pods-iam-role01
Image pull secrets: <none>
Mountable secrets: aws-node-token-xxxxx
Tokens: aws-node-token-xxxxx
Events: <none>
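For what it's worth, the annotation step can probably be moved into Terraform as well, assuming your version of the hashicorp/kubernetes provider is recent enough to include the kubernetes_annotations resource and the provider is configured against the cluster. A minimal sketch that annotates the existing aws-node ServiceAccount with the role created above (the resource label aws_node_irsa is just a placeholder):

```hcl
# Assumption: the Kubernetes provider is configured and offers kubernetes_annotations.
# This manages only the listed annotation on the pre-existing aws-node ServiceAccount.
resource "kubernetes_annotations" "aws_node_irsa" {
  api_version = "v1"
  kind        = "ServiceAccount"

  metadata {
    name      = "aws-node"
    namespace = "kube-system"
  }

  annotations = {
    "eks.amazonaws.com/role-arn" = aws_iam_role.eks_pods.arn
  }
}
```

As with the kubectl approach, already running aws-node pods would most likely need to be recreated before they pick up the role.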

I think you missed something here: you should add a trust relationship between the role and the OIDC provider, as described here:

Related

finding out what HCL literals GCP roles map to

I have the following code:
resource "google_project_iam_binding" "px_kubernetes_engine_cluster_viewer" {
  project = var.project_id
  role    = "roles/kubernetesEngineCluster.viewer"
  members = [
    "serviceAccount:${google_service_account.px.email}",
  ]
}
My aim is to assign the Kubernetes Engine Cluster Viewer role to a service account. However, I cannot work out which string literal represents it: I have tried "roles/kubernetesEngineCluster.viewer" and "roles/kubernetesEngineClusterViewer" without success, and the GCP provider rejects both.
How can I find out what Kubernetes Engine Cluster Viewer maps to in HCL?
The role is roles/container.clusterViewer.
The HCL supports the same definitions that Google Cloud IAM uses. For Kubernetes they are here:
Predefined GKE Roles
The CLI can list all predefined roles:
gcloud iam roles list
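Applied to the block from the question, that would look something like the following. Only the role string changes; var.project_id and google_service_account.px are the asker's own names and are assumed to be defined elsewhere in the configuration.

```hcl
# Same binding as in the question, with the predefined GKE role ID.
resource "google_project_iam_binding" "px_kubernetes_engine_cluster_viewer" {
  project = var.project_id
  role    = "roles/container.clusterViewer"

  members = [
    "serviceAccount:${google_service_account.px.email}",
  ]
}
```

Keep in mind that google_project_iam_binding is authoritative for the given role, so any other members that already hold roles/container.clusterViewer on the project would be removed from it.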

How do I create an EKS cluster with nodes via CDK?

I'm able to deploy a Kubernetes Fargate cluster via CDK on my desired VPC:
const vpc = ec2.Vpc.fromLookup(this, 'vpc', {
  vpcId: 'vpc-abcdefg'
})
const cluster = new eks.FargateCluster(this, 'sample-eks', {
  version: eks.KubernetesVersion.V1_21,
  vpc,
})
cluster.addNodegroupCapacity('node-group-capacity', {
  minSize: 2,
  maxSize: 2,
})
However, there are no nodes within this cluster:
$ kubectl config get-clusters
NAME
minikube
arn:aws:eks:us-east-1:<account_number>:cluster/<cluster_name>
$ kubectl get nodes
No resources found
Very confused as to why this is happening, as I thought the addNodegroupCapacity method is supposed to add nodes to the cluster. I think I can add nodes post-hoc via eksctl, but I was wondering if it'd be possible to deploy with nodes via CDK.
My mistake was not adding a role/user with sufficient permissions to the aws-auth ConfigMap. This meant that the cluster did not have proper permissions to create nodes. The following fixed my issue:
const role = iam.Role.fromRoleName(this, 'admin-role', '<my-admin-role>');
cluster.awsAuth.addRoleMapping(role, { groups: [ 'system:masters' ]});
The <my-admin-role> argument is the name of the role that I assume when I log in to AWS. I found it by running aws sts get-caller-identity, which returns a JSON doc that provides your assumed role's ARN. For me it was arn:aws:sts::<account-number>:assumed-role/<my-admin-role>/<my-username>.
This also resolved another issue, as I was not able to interact with the cluster via kubectl. I would get the following error message: error: You must be logged in to the server (Unauthorized). Adding my assumed role to the aws-auth ConfigMap gave me permission to access the cluster via my terminal.
Not sure why I haven't seen this bit of configuration in the tutorials I've used, would appreciate any comments that could help explain this to me.

Terraform google_project_iam_binding deletes GCP compute engine default service account from IAM principals

Problem
Using the Terraform google_service_account and google_project_iam_binding resources to attach roles/editor removed the Google APIs Service Agent and the Compute Engine default service account from the IAM principals. As a result, GKE clusters can no longer be deleted or created, although the service account still appears under IAM Service Accounts.
To be precise, the account disappears from the IAM principals (which is what I mean by "deleted"). The Compute Engine default service account is left broken and can no longer manage Compute Engine resources, including GKE clusters and nodes.
Terraform GCP provider GitHub issue #10903
Question
I believe this is a Terraform bug, but please help me understand whether I am missing something that could prevent the problem.
Please also advise whether there is a way to restore the Compute Engine default service account to the IAM principals with the Editor role.
Environment
$ terraform version
Terraform v1.0.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v4.6.0
.terraform.lock.hcl
# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.
provider "registry.terraform.io/hashicorp/google" {
version = "4.6.0"
hashes = [
"h1:QbO4yjDrnoSpiYKSHrICNL1ZuWsl5J2rVRFj2kNg7xA=",
"zh:005a28a2c79f6b29680b0f57260c69c85d8a992688007b6e5645149bd379951f",
"zh:2604d825de72cf99b4899d7880837adeb19d371f48e419666e32c4c3cf6a72e9",
"zh:290da4eb18e44469480cf299bebce89f54e4d301f856cdffe2837b498878c7ec",
"zh:3e5ba1a55d38fa17533a18fc14a612e781ded76c6309734d3dc0a937be27eec1",
"zh:4a85de3cdb33c092d8ccfced3d7302934de0dd4f72bbcebd79d45afe0a0b6f85",
"zh:5fb1a79800833ae922aaba594a8b2bc83be1d254052e12e0ce8330ca0d8933d9",
"zh:679b9f50c6fe0476e74d37935f7598d46d6e9612f75b26a8ef1ca3c13144d06a",
"zh:893216e32378839668c51ef135af1676cd887d63e2edb6625cf9adad7bfa346f",
"zh:ad8f2fd19adbe4c10281ba9b3c8d5100877a9c541d3580bbbe9357714aa77619",
"zh:bff5d6fd15e98c12ee9ed98b0338761dc4a9ba671a37834926daeabf73c71783",
"zh:debdf15fbed8d63e397cd004bf65586bd2b93ce04e47ca51a7c70c1fe9168b87",
]
}
Reproduction Steps
Tested twice in different GCP projects and the issue was reproduced in the same manner.
Start
Start with a GCP project that does not have Compute Engine enabled, and hence has no Compute Engine default service account.
Enable the Compute Engine API.
The Compute Engine default service account gets created and appears in both IAM Principals and IAM Service Accounts.
Terraform apply
Apply the terraform script to create a service account with IAM bindings.
variable "PROJECT_ID" {
  type        = string
  description = "GCP Project ID"
  default     = "test-tf-sa"
}
variable "REGION" {
  type        = string
  description = "GCP Region"
  default     = "us-central1"
}
variable "roles_to_grant_to_service_account" {
  description = "IAM roles to grant to the service account"
  type        = list(string)
  default = [
    "roles/editor",
    "roles/iam.serviceAccountAdmin",
    "roles/resourcemanager.projectIamAdmin"
  ]
}
provider "google" {
  project = var.PROJECT_ID
  region  = var.REGION
}
resource "google_service_account" "terraform" {
  account_id   = "terraform"
  display_name = "terraform service account"
}
resource "google_project_iam_binding" "terraform" {
  project = var.PROJECT_ID
  #--------------------------------------------------------------------------------
  # Grant the service account to have the roles
  #--------------------------------------------------------------------------------
  members = [
    "serviceAccount:${google_service_account.terraform.email}"
  ]
  for_each = toset(var.roles_to_grant_to_service_account)
  role     = each.value
}
$ terraform apply --auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# google_project_iam_binding.terraform["roles/editor"] will be created
+ resource "google_project_iam_binding" "terraform" {
+ etag = (known after apply)
+ id = (known after apply)
+ members = (known after apply)
+ project = "test-tf-sa"
+ role = "roles/editor"
}
# google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be created
+ resource "google_project_iam_binding" "terraform" {
+ etag = (known after apply)
+ id = (known after apply)
+ members = (known after apply)
+ project = "test-tf-sa"
+ role = "roles/iam.serviceAccountAdmin"
}
# google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be created
+ resource "google_project_iam_binding" "terraform" {
+ etag = (known after apply)
+ id = (known after apply)
+ members = (known after apply)
+ project = "test-tf-sa"
+ role = "roles/resourcemanager.projectIamAdmin"
}
# google_service_account.terraform will be created
+ resource "google_service_account" "terraform" {
+ account_id = "terraform"
+ disabled = false
+ display_name = "terraform service account"
+ email = (known after apply)
+ id = (known after apply)
+ name = (known after apply)
+ project = (known after apply)
+ unique_id = (known after apply)
}
Plan: 4 to add, 0 to change, 0 to destroy.
google_service_account.terraform: Creating...
google_service_account.terraform: Creation complete after 2s [id=projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creating...
google_project_iam_binding.terraform["roles/editor"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creation complete after 9s [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/editor"]: Creation complete after 9s [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Still creating... [10s elapsed]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creation complete after 10s [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Terraform has deleted the Compute Engine default service account from the IAM principals
Immediately after the terraform apply, check the IAM principals: the Compute Engine default service account no longer appears in the IAM principals view.
As suggested by @JohnHanley, I clicked Include Google-provided role grants to unhide Google-managed service accounts. The original Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com is gone from the IAM principals view.
The gcloud projects get-iam-policy command does not show the Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com either.
$ GCP_PROJECT_ID=test-tf-sa
$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/compute.admin
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/compute.instanceAdmin
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/compute.serviceAgent
- members:
  - serviceAccount:service-1079157603081@container-engine-robot.iam.gserviceaccount.com
  role: roles/container.serviceAgent
- members:
  - serviceAccount:service-1079157603081@containerregistry.iam.gserviceaccount.com
  role: roles/containerregistry.ServiceAgent
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/editor
- members:
  - user:****@gmail.com
  role: roles/owner
- members:
  - serviceAccount:service-1079157603081@gcp-sa-pubsub.iam.gserviceaccount.com
  role: roles/pubsub.serviceAgent
etag: BwXVf2S5fCQ=
version: 1
The service account though still remains in the IAM Service Accounts menu.
Create GKE
Enable the Kubernetes Engine API and create a GKE cluster. At this point, the missing Compute Engine default service account did not yet hinder the GKE creation, possibly because of eventual consistency.
terraform destroy
Run terraform destroy.
$ terraform destroy --auto-approve
google_service_account.terraform: Refreshing state... [id=projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com]
google_project_iam_binding.terraform["roles/editor"]: Refreshing state... [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Refreshing state... [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Refreshing state... [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply":
# google_project_iam_binding.terraform["roles/editor"] has been changed
~ resource "google_project_iam_binding" "terraform" {
~ etag = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
id = "test-tf-sa/roles/editor"
~ members = [
+ "serviceAccount:1079157603081@cloudservices.gserviceaccount.com",
# (1 unchanged element hidden)
]
# (2 unchanged attributes hidden)
}
# google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] has been changed
~ resource "google_project_iam_binding" "terraform" {
~ etag = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
id = "test-tf-sa/roles/iam.serviceAccountAdmin"
# (3 unchanged attributes hidden)
}
# google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] has been changed
~ resource "google_project_iam_binding" "terraform" {
~ etag = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
id = "test-tf-sa/roles/resourcemanager.projectIamAdmin"
# (3 unchanged attributes hidden)
}
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to
undo or respond to these changes.
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
# google_project_iam_binding.terraform["roles/editor"] will be destroyed
- resource "google_project_iam_binding" "terraform" {
- etag = "BwXVfBieTDw=" -> null
- id = "test-tf-sa/roles/editor" -> null
- members = [
- "serviceAccount:1079157603081@cloudservices.gserviceaccount.com",
- "serviceAccount:terraform@test-tf-sa.iam.gserviceaccount.com",
] -> null
- project = "test-tf-sa" -> null
- role = "roles/editor" -> null
}
# google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be destroyed
- resource "google_project_iam_binding" "terraform" {
- etag = "BwXVfBieTDw=" -> null
- id = "test-tf-sa/roles/iam.serviceAccountAdmin" -> null
- members = [
- "serviceAccount:terraform@test-tf-sa.iam.gserviceaccount.com",
] -> null
- project = "test-tf-sa" -> null
- role = "roles/iam.serviceAccountAdmin" -> null
}
# google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be destroyed
- resource "google_project_iam_binding" "terraform" {
- etag = "BwXVfBieTDw=" -> null
- id = "test-tf-sa/roles/resourcemanager.projectIamAdmin" -> null
- members = [
- "serviceAccount:terraform@test-tf-sa.iam.gserviceaccount.com",
] -> null
- project = "test-tf-sa" -> null
- role = "roles/resourcemanager.projectIamAdmin" -> null
}
# google_service_account.terraform will be destroyed
- resource "google_service_account" "terraform" {
- account_id = "terraform" -> null
- disabled = false -> null
- display_name = "terraform service account" -> null
- email = "terraform@test-tf-sa.iam.gserviceaccount.com" -> null
- id = "projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com" -> null
- name = "projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com" -> null
- project = "test-tf-sa" -> null
- unique_id = "107173424725895843752" -> null
}
Plan: 0 to add, 0 to change, 4 to destroy.
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Destroying... [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]
google_project_iam_binding.terraform["roles/editor"]: Destroying... [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Destroying... [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Destruction complete after 10s
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Destruction complete after 10s
google_project_iam_binding.terraform["roles/editor"]: Still destroying... [id=test-tf-sa/roles/editor, 10s elapsed]
google_project_iam_binding.terraform["roles/editor"]: Destruction complete after 11s
google_service_account.terraform: Destroying... [id=projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com]
google_service_account.terraform: Destruction complete after 1s
Destroy complete! Resources: 4 destroyed.
Problems
Cannot delete GKE
The impact of the Compute Engine default service account being removed from IAM principals now starts to show.
The GKE cluster cannot be deleted; the attempt fails with this error:
Google Compute Engine: Required 'compute.instanceGroups.update' permission for 'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp'.
$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
- [cluster-1] in [us-central1-c]
Do you want to continue (Y/n)? Y
Deleting cluster cluster-1...done.
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
- args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n endTime: \'2022-01-14T00:20:54.190004708Z\'\n error: <Status\n code: 7\n details: []\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n name: \'operation-1642119632548-20038ec5\'\n nodepoolConditions: []\n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642119632548-20038ec5\'\n startTime: \'2022-01-14T00:20:32.548792723Z\'\n status: StatusValueValuesEnum(DONE, 3)\n statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
exit_code: 1
Cannot create GKE
Try to create another GKE cluster.
GKE clusters cannot be created anymore. This is the original issue I encountered (GCP GKE - Google Compute Engine: Not all instances running in IGM), which led to this troubleshooting.
cluster-2
Google Compute Engine: Not all instances running in IGM after 18.798524988s. Expected 3, running 0, transitioning 3. Current errors: [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.instances.create' permission for 'projects/1079157603081/zones/us-central1-c/instances/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.create' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.setLabels' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.use' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '1079157603081@cloudservices.gserviceaccount.com') (truncated).
Attempts to fix
Tried these measures but no luck.
Reassign roles/Editor to the service account
GCP_PROJECT_ID=test-tf-sa
GCP_SVC_ACC="serviceAccount:1079157603081-compute@developer.gserviceaccount.com"
gcloud projects add-iam-policy-binding ${GCP_PROJECT_ID} \
--member=serviceAccount:${GCP_SVC_ACC} \
--role=roles/Editor
-----
ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/Editor is not supported for this resource.
Apply undelete service account
$ gcloud beta iam service-accounts undelete 109558708367309276392
restoredAccount:
  email: 1079157603081-compute@developer.gserviceaccount.com
  etag: MDEwMjE5MjA=
  name: projects/test-tf-sa/serviceAccounts/1079157603081-compute@developer.gserviceaccount.com
  oauth2ClientId: '109558708367309276392'
  projectId: test-tf-sa
  uniqueId: '109558708367309276392'
Neither attempt brought the Compute Engine default service account back into the IAM principals.
Disable Compute Engine API
Tried to disable the Compute Engine API but as GKE nodes cannot be deleted, it cannot be disabled.
Manually add back the service account
Manually added the Compute Engine account 1079157603081-compute@developer.gserviceaccount.com and granted it roles/editor in IAM. It now appears in the gcloud projects get-iam-policy output, but I still cannot delete the GKE cluster.
$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
...
- members:
  - serviceAccount:1079157603081-compute@developer.gserviceaccount.com <-----
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/editor
...
etag: BwXVf9cVnaU=
version: 1
$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
- [cluster-1] in [us-central1-c]
Do you want to continue (Y/n)? Y
Deleting cluster cluster-1...done.
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
- args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n
message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for
\'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n
detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for
\'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n
endTime: \'2022-01-14T00:33:38.746564953Z\'\n error: <Status\n code: 7\n details: []\n
message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for
\'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n
name: \'operation-1642120382096-034b0eb7\'\n nodepoolConditions: []
\n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n
selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642120382096-034b0eb7\'\n
startTime: \'2022-01-14T00:33:02.096736326Z\'\n status: StatusValueValuesEnum(DONE, 3)\n
statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for
\'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n
targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n
zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for
\'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
exit_code: 1
Another service account for GKE
Created another service account that has the compute.admin role and used it to create and delete the GKE cluster(s). However, once the Compute Engine default service account has been compromised, I keep hitting the GCP GKE - Google Compute Engine: Not all instances running in IGM issue.
Goal to achieve
Bring the Compute Engine default service account back into the IAM principals, and be able to manage Compute Engine instances and GKE nodes again.
Related issues
I wish I had read these before getting into this issue as another bites the sand.
Usability improvements for *_iam_policy and *_iam_binding resources #8354
Description
I'm sure you know by now there is a decent amount of care required when using the *_iam_policy and *_iam_binding versions of IAM resources. There are a number of "be careful!" and "note" warnings in the resources that outline some of the potential pitfalls, but there are hidden dangers as well. For example, using the google_project_iam_policy resource may inadvertently remove Google's service agents' (https://cloud.google.com/iam/docs/service-agents) IAM roles from the project. Or, the dangers of using google_storage_bucket_iam_policy and google_storage_bucket_iam_binding, which may remove the default IAM roles granted to projectViewers:, projectEditors:, and projectOwners: of the containing project.
The largest issue I encounter with people running into the above situations is that the initial terraform plan does not show that anything is being removed. While the documentation for google_project_iam_policy notes that it's best to terraform import the resource beforehand, this is in fact applicable to all *_iam_policy and *_iam_binding resources. Unfortunately this is tedious, potentially forgotten, and not something that you can abstract away in a Terraform module.
terraform/gcp - In what use cases we have no choice but to use authoritative resources?
Cause
As @toteem pointed out:
The google_project_iam_binding resource is authoritative, which means it will delete any binding that is NOT explicitly specified in the Terraform configuration.
google_project_iam_binding
Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the project are preserved.
It is not obvious from this description what Terraform actually does with google_project_iam_binding, but as GCP identified, google_project_iam_binding removed the roles/editor role from every account not listed in the members attribute.
Still, I believe this is a Terraform defect.
According to the Google APIs Service Agent documentation, these are essential service accounts that GCP manages internally. Terraform should not remove such GCP-managed internal service accounts from the policy, as doing so brings the GCP project down. I doubt there is any use case where we need this to happen.
Google APIs Service Agent
Some Google Cloud services need access to your resources so that they can act on your behalf. For example, when you use Cloud Run to run a container, the service needs access to any Pub/Sub topics that can trigger the container.
To meet this need, Google creates and manages service accounts for many Google Cloud services. These service accounts are known as Google-managed service accounts. You might see Google-managed service accounts in your project's IAM policy, in audit logs, or on the IAM page in the Cloud Console.
Google-managed service accounts are not listed in the Service accounts page in the Cloud Console.
Google APIs Service Agent. Your project is likely to contain a service account named the Google APIs Service Agent, with an email address that uses the following format: project-number@cloudservices.gserviceaccount.com
This service account runs internal Google processes on your behalf. It is automatically granted the Editor role (roles/editor) on the project.
Solution
Use google_project_iam_member.
#--------------------------------------------------------------------------------
# Service Account Roles
# Need roles/resourcemanager.projectIamAdmin to be able to execute this.
#--------------------------------------------------------------------------------
# resource "google_project_iam_binding" "terraform" {
# project = var.PROJECT_ID
#
# #--------------------------------------------------------------------------------
# # Grant the service account to have the roles
# #--------------------------------------------------------------------------------
# members = [
# "serviceAccount:${google_service_account.terraform.email}"
# ]
# for_each = toset(var.roles_to_grant_to_service_account)
# role = each.value
# }
#--------------------------------------------------------------------------------
# Service Account Roles
# Need roles/resourcemanager.projectIamAdmin to be able to execute this.
#--------------------------------------------------------------------------------
resource "google_project_iam_member" "terraform" {
  # One membership per role in the list, granted to the service account.
  for_each = toset(var.roles_to_grant_to_service_account)

  project = local.PROJECT_ID
  role    = each.value
  member  = "serviceAccount:${google_service_account.terraform.email}"
}
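The block above refers to a variable and a service account that are not shown. A minimal sketch of what they might look like follows; the account_id, the display name, and the role list are placeholders, not values from the original setup, and local.PROJECT_ID is assumed to be defined elsewhere.

```hcl
# Hypothetical role list; replace with the roles your setup actually needs.
variable "roles_to_grant_to_service_account" {
  description = "Project-level roles to grant to the Terraform service account."
  type        = list(string)
  default = [
    "roles/container.admin",
    "roles/compute.networkAdmin",
  ]
}

# Hypothetical service account that receives the roles.
resource "google_service_account" "terraform" {
  project      = local.PROJECT_ID
  account_id   = "terraform"
  display_name = "Terraform service account"
}
```

google_project_iam_member only adds the one member to the role. google_project_iam_binding is authoritative for the role, so it removes every member that is not in its members list, including Google-managed service agents.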
Fix
Use this in case google_project_iam_binding has already removed the Google-managed service accounts from the project's roles.
According to GCP:
To fix this issue you can add the service agent in the IAM page using the Add option at the top. The principal will be "${PROJECT_ID}@cloudservices.gserviceaccount.com" and add the editor role.
As per the error message, add '1079157603081@cloudservices.gserviceaccount.com' in IAM.
'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '1079157603081@cloudservices.gserviceaccount.com') (truncated).
The Google APIs Service Agent is restored in the view.
Create GKE.
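The same repair can probably be expressed in Terraform instead of the console. This is a sketch, assuming the google provider is already configured for the affected project; the resource and data source names are illustrative.

```hcl
# Look up the current project to get its numeric project number.
data "google_project" "current" {}

# Re-grant the Editor role to the Google APIs Service Agent.
# google_project_iam_member is additive, so other members of the role are untouched.
resource "google_project_iam_member" "google_apis_service_agent" {
  project = data.google_project.current.project_id
  role    = "roles/editor"
  member  = "serviceAccount:${data.google_project.current.number}@cloudservices.gserviceaccount.com"
}
```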
Conclusion
I would never use them, as I doubt any use case exists in which we need to remove other accounts that hold the same roles.
google_project_iam_member
google_service_account_iam_binding
You can restore the service accounts using the “gcloud beta iam service-accounts undelete” command.
If you accidentally delete a service account, you can try to undelete the service account instead of creating a new service account.
Please review this link if you need more info.
Note that to restore a deleted account you may need its 21-digit unique ID.
If you do not have this ID for the account, you could try this command:
gcloud logging read --freshness=30d --format='table(timestamp,resource.labels.email_id,resource.labels.project_id,resource.labels.unique_id)' 'protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount" AND resource.type="service_account" AND logName:"cloudaudit.googleapis.com%2Factivity"'
or this command:
gcloud logging read --freshness=30d 'protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount"' | grep -E 'email_id|unique_id'

How to create AWS IAM role with ServiceAccount and attach to Kubernetes DaemonSet

I found documentation showing that we can add an AWS IAM role to a Kubernetes service account and attach it to Pods. What I want to do is attach that service account to a DaemonSet instead of granting Pod-level permissions. I configured everything as described in that documentation and attached the service account to the DaemonSet, but then I encountered the following error message:
Aws::STS::Errors::AccessDenied error="Not authorized to perform sts:AssumeRoleWithWebIdentity
Does that mean this type of service account with an IAM role cannot be attached to a DaemonSet?
Does that mean this type of service account with an IAM role cannot be attached to a DaemonSet?
No, there shouldn't be any issues with that. I checked here and there is an example with a service account in a deployment.
As @PPShein mentioned in the comments, the issue occurred because he forgot to add the openid_url.
Please refer to this and this documentation.
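For reference, here is a sketch of how the OIDC provider might be registered with Terraform, assuming the missing openid_url refers to the cluster's OIDC issuer URL. The resource name aws_eks_cluster.this is hypothetical, and the tls provider is assumed to be available for reading the certificate thumbprint.

```hcl
# Read the certificate of the cluster's OIDC issuer to obtain its thumbprint.
# aws_eks_cluster.this is a hypothetical name for your cluster resource.
data "tls_certificate" "eks" {
  url = aws_eks_cluster.this.identity[0].oidc[0].issuer
}

# Register the issuer as an IAM OIDC identity provider so that
# sts:AssumeRoleWithWebIdentity can validate service account tokens.
resource "aws_iam_openid_connect_provider" "eks" {
  url             = aws_eks_cluster.this.identity[0].oidc[0].issuer
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
}
```

The role's trust policy also has to match the service account the DaemonSet runs as: the :sub condition is expected to be system:serviceaccount:<namespace>:<service account name>.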

Use IAM role instead of credentials to create aws resource from an EC2 instance using terraform

We are working on a requirement where terraform apply, running on an AWS EC2 instance, should use an IAM role instead of credentials (access key/secret key) in the aws provider to create Route 53 records in AWS.
NOTE: The IAM role attached to the instance has a policy that gives the role Route 53 full access.
When we use the syntax below in terraform.tf, it works fine and we are able to create the record.
SYNTAX:
provider "aws" {
access_key = "${var.aws_accesskey}"
secret_key = "${var.aws_secretkey}"
region = "us-east-1"
}
resource "aws_route53_record" {}
But we want the Terraform script to run with the IAM role and not with credentials (we do not want to maintain a credentials file).
STEPS TRIED:
1. Removed the provider block from the terraform.tf file and ran the build.
SYNTAX:
resource "aws_route53_record" {}
2. Got the error below.
Provider.aws :InvalidClientTokenid.
3. Went through the official Terraform documentation on using an IAM role. It says to use the metadata API, but there is no working sample. (https://www.terraform.io/docs/providers/aws/index.html)
I am new to Terraform, so pardon me if this is a basic question. Can someone help with code or a working sample to achieve this?
You need to supply the profile ARN in the "provider" block, not the role, like so:
provider "aws" {
profile = "arn:aws:iam::<your account>:instance-profile/<your role name>"
}
The 'role_arn' key mentioned in the answer above is actually invalid in the 'provider' context.
Insert the following line for the IAM role in your Terraform script, in the provider block:
role_arn = "arn:aws:iam::<your account>:role/SQS-Role-demo"
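For comparison, here is a minimal sketch of the credential-free setup the question asks about. It assumes an instance profile with the needed permissions is attached to the EC2 instance, and that no AWS_* environment variables or shared credentials file take precedence. With no keys configured, the AWS provider falls back to the instance metadata service on its own. The second block shows where role_arn is accepted, namely inside assume_role; the alias and the role ARN are placeholders.

```hcl
# No access_key/secret_key: credentials are resolved from the
# EC2 instance profile through the instance metadata service.
provider "aws" {
  region = "us-east-1"
}

# Optional: assume a different role using the instance's credentials.
# The alias and the role ARN are placeholders for illustration.
provider "aws" {
  alias  = "dns"
  region = "us-east-1"

  assume_role {
    role_arn = "arn:aws:iam::<your account>:role/<role to assume>"
  }
}
```

A resource opts into the second configuration with provider = aws.dns; resources without that argument use the default provider.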