Dataflow setting Controller Service Account - google-cloud-platform

I'm trying to set up the controller service account for Dataflow. In my Dataflow options I have:
options.setGcpCredential(GoogleCredentials.fromStream(
new FileInputStream("key.json")).createScoped(someArrays));
options.setServiceAccount("xxx@yyy.iam.gserviceaccount.com");
But I'm getting:
WARNING: Request failed with code 403, performed 0 retries due to IOExceptions,
performed 0 retries due to unsuccessful status codes, HTTP framework says
request can be retried, (caller responsible for retrying):
https://dataflow.googleapis.com/v1b3/projects/MYPROJECT/locations/MYLOCATION/jobs
Exception in thread "main" java.lang.RuntimeException: Failed to create a workflow
job: (CODE): Current user cannot act as
service account "xxx#yyy.iam.gserviceaccount.com.
Causes: (CODE): Current user cannot act as
service account "xxx#yyy.iam.gserviceaccount.com.
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:791)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:173)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
...
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
{
"code" : 403,
"errors" : [ {
"domain" : "global",
"message" : "(CODE): Current user cannot act as service account
xxx#yyy.iam.gserviceaccount.com. Causes: (CODE): Current user
cannot act as service account xxx#yyy.iam.gserviceaccount.com.",
"reason" : "forbidden"
} ],
"message" : "(CODE): Current user cannot act as service account
xxx#yyy.iam.gserviceaccount.com. Causes: (CODE): Current user
cannot act as service account xxx#yyy.iam.gserviceaccount.com.",
"status" : "PERMISSION_DENIED"
}
Am I missing some Roles or permissions?

Maybe someone will find this helpful:
For the controller service account the roles were: Dataflow Worker and Storage Object Admin (as described in Google's documentation).
For the identity that launches the job it was: Service Account User.
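For reference, a minimal sketch of granting those roles with gcloud (the controller-sa name and deployer@example.com identity below are placeholders, not taken from the original setup):
# Controller service account: runs the workers and reads/writes staging data
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:controller-sa@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/dataflow.worker"
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:controller-sa@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
# Identity that submits the job: must be able to act as the controller service account
gcloud iam service-accounts add-iam-policy-binding \
  controller-sa@${PROJECT}.iam.gserviceaccount.com \
  --member="user:deployer@example.com" \
  --role="roles/iam.serviceAccountUser"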

I've been hitting this error and thought it worth sharing my experiences (partly because I suspect I'll encounter this again in the future).
The Terraform code to create my Dataflow job is:
resource "google_dataflow_job" "wordcount" {
# https://stackoverflow.com/a/59931467/201657
name = "wordcount"
template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
temp_gcs_location = "gs://${local.name-prefix}-functions/temp"
parameters = {
inputFile = "gs://dataflow-samples/shakespeare/kinglear.txt"
output = "gs://${local.name-prefix}-functions/wordcount/output"
}
service_account_email = "serviceAccount:${data.google_service_account.sa.email}"
}
The error message:
Error: googleapi: Error 400: (c3c0d991927a8658): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com., badRequest
was returned from running terraform apply. Checking out the logs provided a lot more info:
gcloud logging read 'timestamp >= "2020-12-31T13:39:58.733249492Z" AND timestamp <= "2020-12-31T13:45:58.733249492Z"' --format="csv(timestamp,severity,textPayload)" --order=asc
which returned various log records, including this:
Permissions verification for controller service account failed. IAM role roles/dataflow.worker should be granted to controller service account dataflowdemo#redacted.iam.gserviceaccount.com.
so I granted the missing role:
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:dataflowdemo@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/dataflow.worker"
and ran terraform apply again. This time I got the same error in the terraform output but there were no errors to be seen in the logs.
I then followed the advice given at https://cloud.google.com/dataflow/docs/concepts/access-control#creating_jobs and also granted roles/dataflow.admin:
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:dataflowdemo@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/dataflow.admin"
but there was no discernible difference from the previous attempt.
I then tried turning on terraform debug logging which provided this info:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ---[ REQUEST ]---------------------------------------
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: POST /v1b3/projects/redacted/locations/europe-west1/templates?alt=json&prettyPrint=false HTTP/1.1
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Host: dataflow.googleapis.com
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: User-Agent: google-api-go-client/0.5 Terraform/0.14.2 (+https://www.terraform.io) Terraform-Plugin-SDK/2.1.0 terraform-provider-google/dev
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Length: 385
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Type: application/json
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Goog-Api-Client: gl-go/1.14.5 gdcl/20201023
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Accept-Encoding: gzip
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "environment": {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "serviceAccountEmail": "serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "tempLocation": "gs://jamiet-demo-functions/temp"
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: },
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "gcsPath": "gs://dataflow-templates/latest/Word_Count",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "jobName": "wordcount",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "parameters": {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "inputFile": "gs://dataflow-samples/shakespeare/kinglear.txt",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "output": "gs://jamiet-demo-functions/wordcount/output"
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: -----------------------------------------------------
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 2020/12/31 16:04:14 [DEBUG] Google API Response Details:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ---[ RESPONSE ]--------------------------------------
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: HTTP/1.1 400 Bad Request
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Connection: close
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Transfer-Encoding: chunked
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Alt-Svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Cache-Control: private
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Type: application/json; charset=UTF-8
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Date: Thu, 31 Dec 2020 16:04:15 GMT
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Server: ESF
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: Origin
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: X-Origin
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: Referer
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Content-Type-Options: nosniff
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Frame-Options: SAMEORIGIN
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Xss-Protection: 0
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 1f9
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "error": {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "code": 400,
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "message": "(dbacb1c39beb28c9): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com.",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "errors": [
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "message": "(dbacb1c39beb28c9): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com.",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "domain": "global",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "reason": "badRequest"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ],
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "status": "INVALID_ARGUMENT"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 0
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: -----------------------------------------------------
The error being returned from dataflow.googleapis.com is clearly evident:
Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com
At this stage I was puzzled as to why I could see an error being returned from Google's Dataflow API yet there was nothing in the GCP logs indicating that an error had occurred.
Then I had a bit of a lightbulb moment: why does that error message mention "service account serviceAccount"? It hit me: I'd defined the service account incorrectly. The Terraform code should have been:
resource "google_dataflow_job" "wordcount" {
# https://stackoverflow.com/a/59931467/201657
name = "wordcount"
template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
temp_gcs_location = "gs://${local.name-prefix}-functions/temp"
parameters = {
inputFile = "gs://dataflow-samples/shakespeare/kinglear.txt"
output = "gs://${local.name-prefix}-functions/wordcount/output"
}
service_account_email = data.google_service_account.sa.email
}
I corrected it and it worked straight away. User error!!!
I then set about removing the various permissions that I'd added:
gcloud projects remove-iam-policy-binding $PROJECT \
  --member="serviceAccount:dataflowdemo@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/dataflow.admin"
gcloud projects remove-iam-policy-binding $PROJECT \
  --member="serviceAccount:dataflowdemo@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/dataflow.worker"
and terraform apply still worked. However, after removing the roles/dataflow.worker grant the job failed with this error:
Workflow failed. Causes: Permissions verification for controller service account failed. IAM role roles/dataflow.worker should be granted to controller service account dataflowdemo#redacted.iam.gserviceaccount.com.
so clearly the documentation regarding the appropriate roles to grant (https://cloud.google.com/dataflow/docs/concepts/access-control#creating_jobs) is spot on.
As may be apparent, I started writing this post before I knew what the problem was, and I thought it might be useful to document my investigation somewhere. Now that I've finished the investigation and the problem turns out to be one of PEBCAK, it's probably not so relevant to this thread anymore, and certainly shouldn't be accepted as an answer. Nevertheless, there is probably some useful information in here about how to investigate issues with Terraform calling Google APIs, and it also reiterates the required role grants, so I'll leave it here in case it ever turns out to be useful.

I just hit this problem again, so I'm posting my solution here as I fully expect I'll get bitten by this again at some point.
I was getting error:
Error: googleapi: Error 403: (a00eba23d59c1fa3): Current user cannot act as service account dataflow-controller-sa#myproject.iam.gserviceaccount.com. Causes: (a00eba23d59c15ac): Current user cannot act as service account dataflow-controller-sa#myproject.iam.gserviceaccount.com., forbidden
I was deploying the Dataflow job, via Terraform, using a different service account, deployer@myproject.iam.gserviceaccount.com.
The solution was to grant that service account the roles/iam.serviceAccountUser role:
gcloud projects add-iam-policy-binding myproject \
  --member=serviceAccount:deployer@myproject.iam.gserviceaccount.com \
  --role=roles/iam.serviceAccountUser
For those who prefer custom IAM roles over predefined ones, the specific permission that was missing was iam.serviceAccounts.actAs.
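If you want to scope this more tightly, a sketch (using the same account names as above) of granting the role on the controller service account itself rather than project-wide:
# Grant iam.serviceAccountUser only on the controller service account
gcloud iam service-accounts add-iam-policy-binding \
  dataflow-controller-sa@myproject.iam.gserviceaccount.com \
  --member=serviceAccount:deployer@myproject.iam.gserviceaccount.com \
  --role=roles/iam.serviceAccountUser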

Issue resolved!
Go to the GCP Console -> IAM -> the service account email -> Add permission -> Service Account User.
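To confirm the grant took effect from the CLI, something like this should list the new binding (the service account email is the placeholder from the question):
gcloud iam service-accounts get-iam-policy xxx@yyy.iam.gserviceaccount.com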

Related

EKS fluent-bit unable to assume AWS role from service account

I'm going mad over a Fluent Bit DaemonSet, installed via Helm in EKS on AWS account yyyyyyy, that is unable to send data to Kinesis in AWS account xxxxxxxxxx.
It looks as if EKS does not have an OIDC provider in IAM, but that's false! Can you help?
fluent bit logs:
[2022/06/29 15:22:34] [debug] [output:kinesis_firehose:kinesis_firehose.0] firehose:PutRecordBatch: events=157, payload=71245 bytes
[2022/06/29 15:22:34] [debug] [output:kinesis_firehose:kinesis_firehose.0] Sending log records to delivery stream kinesis_backend
[2022/06/29 15:22:34] [debug] [http_client] not using http_proxy for header
[2022/06/29 15:22:34] [debug] [aws_credentials] Requesting credentials from the EC2 provider..
[2022/06/29 15:22:34] [debug] [input:tail:tail.0] inode=19100461 events: IN_MODIFY
[2022/06/29 15:22:34] [debug] [input chunk] update output instances with new chunk size diff=693
[2022/06/29 15:22:34] [debug] [input:tail:tail.0] inode=19100461 events: IN_MODIFY
[2022/06/29 15:22:34] [debug] [http_client] server firehose.eu-west-1.amazonaws.com:443 will close connection #74
[2022/06/29 15:22:34] [debug] [aws_client] firehose.eu-west-1.amazonaws.com: http_do=0, HTTP Status: 400
[2022/06/29 15:22:34] [error] [aws_client] auth error, refreshing creds
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the env provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the profile provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Reading shared config file.
[2022/06/29 15:22:34] [debug] [aws_credentials] Shared config file /root/.aws/config does not exist
[2022/06/29 15:22:34] [debug] [aws_credentials] Reading shared credentials file.
[2022/06/29 15:22:34] [error] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the EKS provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Calling STS..
[2022/06/29 15:22:34] [debug] [http_client] not using http_proxy for header
[2022/06/29 15:22:34] [debug] [http_client] server sts.eu-west-1.amazonaws.com:443 will close connection #74
[2022/06/29 15:22:34] [debug] [aws_client] sts.eu-west-1.amazonaws.com: http_do=0, HTTP Status: 400
[2022/06/29 15:22:34] [debug] [aws_client] Unable to parse API response- response is not valid JSON.
[2022/06/29 15:22:34] [debug] [aws_credentials] STS raw response:
<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
<Error>
<Type>Sender</Type>
<Code>InvalidIdentityToken</Code>
<Message>No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA</Message>
</Error>
<RequestId>c517249d-c018-43c3-a712-d0e5080ded86</RequestId>
</ErrorResponse>
The fluent-bit service account in namespace newrelic (created by the fluent-bit Helm chart):
kubectl -n newrelic describe sa fluent-bit
Name: fluent-bit
Namespace: newrelic
Labels: app.kubernetes.io/instance=fluent-bit
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=fluent-bit
app.kubernetes.io/version=1.9.4
helm.sh/chart=fluent-bit-0.20.2
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxx:role/kinesis-write
meta.helm.sh/release-name: fluent-bit
meta.helm.sh/release-namespace: newrelic
Policy permissions attached to role arn:aws:iam::xxxxxxxxxx:role/kinesis-write:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "firehose:PutRecord",
        "firehose:PutRecordBatch"
      ],
      "Resource": "arn:aws:firehose:region:xxxxxxxxxx:deliverystream/kinesis-backend"
    }
  ]
}
Role arn:aws:iam::xxxxxxxxxx:role/kinesis-write trust relationship (I included the OIDC provider for my EKS cluster in account yyyyyyyyyy):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::yyyyyyyyy:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA"
      },
      "Action": [
        "sts:AssumeRole",
        "sts:AssumeRoleWithWebIdentity"
      ],
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA:sub": "system:serviceaccount:newrelic:fluent-bit"
        }
      }
    }
  ]
}
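A sketch of the kind of cross-check that can be run here (my-cluster is a hypothetical name; run with credentials for whichever account is being checked):
# OIDC identity providers registered in the account the CLI is authenticated against
aws iam list-open-id-connect-providers
# The cluster's OIDC issuer URL, which should correspond to one of the provider ARNs above
aws eks describe-cluster --name my-cluster \
  --query "cluster.identity.oidc.issuer" --output text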

terraform ecs/CreateCapacityProvider request 500

I am getting the following error while trying to create an ECS cluster, at the capacity provider creation phase.
2022-01-05T09:15:20.480-0800 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/05 09:15:20 [DEBUG] [aws-sdk-go] DEBUG: Request ecs/CreateCapacityProvider Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: ecs.us-east-1.amazonaws.com
User-Agent: APN/1.0 HashiCorp/1.0 Terraform/0.12.31 (+https://www.terraform.io) terraform-provider-aws/3.70.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.42.23 (go1.16; darwin; amd64)
Content-Length: 370
Authorization: AWS4-HMAC-SHA256 Credential=AKIAI2AFJ6MZHHPZ2HTA/20220105/us-east-1/ecs/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-target, Signature=e3f52e06669323c64df6ca485fcb7fae41c6941237fb5dbb0ba6e63478a6eb28
Content-Type: application/x-amz-json-1.1
X-Amz-Date: 20220105T171520Z
X-Amz-Target: AmazonEC2ContainerServiceV20141113.CreateCapacityProvider
Accept-Encoding: gzip
{"autoScalingGroupProvider":{"autoScalingGroupArn":"arn:aws:autoscaling:us-east-1:009710336282:autoScalingGroup:2a7b4cd4-919c-4f59-b33d-bc9486033e17:autoScalingGroupName/terraform-20220105170805411900000007","managedScaling":{"maximumScalingStepSize":1000,"minimumScalingStepSize":1,"status":"ENABLED","targetCapacity":80}},"name":"app-client-capacity-provider"}
-----------------------------------------------------: timestamp=2022-01-05T09:15:20.479-0800
2022/01/05 09:15:20 [TRACE] dag/walk: vertex "aws_appautoscaling_policy.ecs_service_policy_scaling" is waiting for "aws_appautoscaling_target.ecs_service_target"
2022-01-05T09:15:21.075-0800 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/05 09:15:21 [DEBUG] [aws-sdk-go] DEBUG: Response ecs/CreateCapacityProvider Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 500
Connection: close
Content-Length: 85
Content-Type: application/x-amz-json-1.1
Date: Wed, 05 Jan 2022 17:15:20 GMT
X-Amzn-Requestid: 67bc83ad-103a-451d-baf5-4697df2e44cb
-----------------------------------------------------: timestamp=2022-01-05T09:15:21.075-0800
2022-01-05T09:15:21.075-0800 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/05 09:15:21 [DEBUG] [aws-sdk-go] {"__type":"ServerException","message":"Service Unavailable. Please try again later."}: timestamp=2022-01-05T09:15:21.075-0800
There are no details about the error; any idea what might be wrong here?
Here is a snippet from my Terraform for the capacity provider:
... (other stuff like ASG launch template etc)
resource "aws_autoscaling_group" "cluster-asg" {
desired_capacity = 3
min_size = 3
max_size = 50
availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
launch_template {
id = aws_launch_template.as_launch_template.id
version = "$Latest"
}
tag {
key = "AmazonECSManaged"
value = true
propagate_at_launch = true
}
}
resource "aws_ecs_capacity_provider" "capacity_provider" {
name = "app-client-capacity-provider"
auto_scaling_group_provider {
auto_scaling_group_arn = aws_autoscaling_group.cluster-asg.arn
managed_scaling {
instance_warmup_period = 120
maximum_scaling_step_size = 1000
minimum_scaling_step_size = 1
status = "ENABLED"
target_capacity = 80
}
}
}
resource "aws_ecs_cluster" "cluster" {
name = "app-client" # Naming the cluster
capacity_providers = [aws_ecs_capacity_provider.capacity_provider.name]
}
I fixed this by creating the cluster without a capacity provider first, then modifying it to have one.
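Roughly, the two-pass apply looks like this (a sketch, assuming the resource names from the snippet above):
# Pass 1: apply with the capacity_providers argument removed (or commented out)
#         from aws_ecs_cluster.cluster, so the cluster is created on its own
terraform apply
# Pass 2: re-add capacity_providers = [aws_ecs_capacity_provider.capacity_provider.name]
#         to the cluster resource and apply again to attach the capacity provider
terraform apply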

Can Terraform provider for GSuite access the Admin SDK Directory API when executed in Google Cloud Build from the default Cloud Build Service Account?

I decided to automate the creation of GCP projects using Terraform.
One resource that Terraform creates during the run is a new GSuite user. This is done using terraform-provider-gsuite. So I set everything up (service account, domain-wide delegation, etc.) and it all works fine when I run the Terraform steps from my command line.
Next, instead of relying on my command line, I decided to have a Cloud Build trigger that would execute Terraform init-plan-apply. As you all know, Cloud Build runs under the identity of the Cloud Build service account. This means we need to give that SA the permissions that Terraform might need during the execution. So far so good.
So I ran the build and saw that the only resource Terraform was not able to create was the GSuite user. Digging through the logs I found these two requests (and their responses):
GET /admin/directory/v1/users?alt=json&customer=my_customer&prettyPrint=false&query=email%3Alolloso-admin%40codedby.pm HTTP/1.1
Host: www.googleapis.com
User-Agent: google-api-go-client/0.5 (linux amd64) Terraform/0.14.7
X-Goog-Api-Client: gl-go/1.15.6 gdcl/20200514
Accept-Encoding: gzip
HTTP/2.0 400 Bad Request
Cache-Control: private
Content-Type: application/json; charset=UTF-8
Date: Sun, 28 Feb 2021 12:58:25 GMT
Server: ESF
Vary: Origin
Vary: X-Origin
Vary: Referer
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 0
{
"error": {
"code": 400,
"message": "Invalid Input",
"errors": [
{
"domain": "global",
"reason": "invalid"
}
]
}
}
POST /admin/directory/v1/users?alt=json&prettyPrint=false HTTP/1.1
Host: www.googleapis.com
User-Agent: google-api-go-client/0.5 (linux amd64) Terraform/0.14.7
Content-Length: 276
Content-Type: application/json
X-Goog-Api-Client: gl-go/1.15.6 gdcl/20200514
Accept-Encoding: gzip
{
"changePasswordAtNextLogin": true,
"externalIds": [],
"includeInGlobalAddressList": true,
"name": {
"familyName": "********",
"givenName": "*******"
},
"orgUnitPath": "/",
"password": "********",
"primaryEmail": "*********",
"sshPublicKeys": []
}
HTTP/2.0 403 Forbidden
Cache-Control: private
Content-Type: application/json; charset=UTF-8
Date: Sun, 28 Feb 2021 12:58:25 GMT
Server: ESF
Vary: Origin
Vary: X-Origin
Vary: Referer
Www-Authenticate: Bearer realm="https://accounts.google.com/", error="insufficient_scope", scope="https://www.googleapis.com/auth/admin.directory.user https://www.googleapis.com/auth/directory.user"
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 0
{
"error": {
"code": 403,
"message": "Request had insufficient authentication scopes.",
"errors": [
{
"message": "Insufficient Permission",
"domain": "global",
"reason": "insufficientPermissions"
}
],
"status": "PERMISSION_DENIED"
}
}
I think this is the API complaining that the Cloud Build Service Account does not have enough rights to access the Directory API. And here is where the situation gets wild.
To fix that, I thought of granting domain-wide delegation to the Cloud Build SA, but that SA is special and I could not find a way to grant it.
I then tried to give the serviceAccountUser role to the Cloud Build SA on my SA (the one which has domain-wide delegation), but I did not succeed: the build still throws the same insufficient-permission error.
I then tried to use my SA (with domain-wide delegation) as a custom Cloud Build service account. Again, no luck.
Is it even possible from a Cloud Build to access certain resources for which normally one would use domain-wide delegation?
Thanks
UPDATE 1 (using custom build service account)
As per John's comment, I tried to use a user-specified service account to execute my build. The necessary setup info was taken from the official guide.
This is my cloudbuild.yaml file
steps:
- id: 'tf init'
  name: 'hashicorp/terraform'
  entrypoint: 'sh'
  args:
  - '-c'
  - |
    terraform init
- id: 'tf plan'
  name: 'hashicorp/terraform'
  entrypoint: 'sh'
  args:
  - '-c'
  - |
    terraform plan
- id: 'tf apply'
  name: 'hashicorp/terraform'
  entrypoint: 'sh'
  args:
  - '-c'
  - |
    terraform apply -auto-approve
logsBucket: 'gs://tf-project-creator-cloudbuild-logs'
serviceAccount: 'projects/tf-project-creator/serviceAccounts/sa-terraform-project-creator@tf-project-creator.iam.gserviceaccount.com'
options:
  env:
  - 'TF_LOG=DEBUG'
where sa-terraform-project-creator@tf-project-creator.iam.gserviceaccount.com is the service account which has domain-wide delegation on my Google Workspace.
I then executed the build manually
export GOOGLE_APPLICATION_CREDENTIALS=.secrets/sa-terraform-project-creator.json; gcloud builds submit --config cloudbuild.yaml
specifying the JSON private key of the same SA as above.
I would have expected the build to pass, but I still get the same error as above:
POST /admin/directory/v1/users?alt=json&prettyPrint=false HTTP/1.1
Host: www.googleapis.com
User-Agent: google-api-go-client/0.5 (linux amd64) Terraform/0.14.7
Content-Length: 276
Content-Type: application/json
X-Goog-Api-Client: gl-go/1.15.6 gdcl/20200514
Accept-Encoding: gzip
{
"changePasswordAtNextLogin": true,
"externalIds": [],
"includeInGlobalAddressList": true,
"name": {
"familyName": "REDACTED",
"givenName": "REDACTED"
},
"orgUnitPath": "/",
"organizations": [],
"password": "REDACTED",
"primaryEmail": "REDACTED",
"sshPublicKeys": []
}
-----------------------------------------------------
2021/03/06 17:26:19 [DEBUG] Google API Response Details:
---[ RESPONSE ]--------------------------------------
HTTP/2.0 403 Forbidden
Cache-Control: private
Content-Type: application/json; charset=UTF-8
Date: Sat, 06 Mar 2021 17:26:19 GMT
Server: ESF
Vary: Origin
Vary: X-Origin
Vary: Referer
Www-Authenticate: Bearer realm="https://accounts.google.com/", error="insufficient_scope", scope="https://www.googleapis.com/auth/admin.directory.user https://www.googleapis.com/auth/directory.user"
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 0
{
"error": {
"code": 403,
"message": "Request had insufficient authentication scopes.",
{
"message": "Insufficient Permission",
"domain": "global",
"reason": "insufficientPermissions"
}
],
"status": "PERMISSION_DENIED"
}
}
Is there anything I am missing?
UPDATE 2 (check on active identity when submitting a build)
As deviavir pointed out in their comment, I tried:
- enabling "Service Accounts" in the GCB settings, but as suspected it did not work.
- double-checking the active identity while submitting the build. One of the limitations of using a custom build SA is that the build must be manually triggered. So using gcloud, that means
gcloud builds submit --config cloudbuild.yaml
Until now, when executing this command I have always prefixed it by setting the GOOGLE_APPLICATION_CREDENTIALS variable like this:
export GOOGLE_APPLICATION_CREDENTIALS=.secrets/sa-terraform-project-creator.json
The specified private key is the key of my build SA (the one with domain-wide delegation). While doing that, I was always logged in to gcloud with another account (the Owner of the project), which does not have the domain-wide delegation permission. But I thought that by setting GOOGLE_APPLICATION_CREDENTIALS, gcloud would pick up those credentials. I still think that is the case, but I then tried to submit the build while logged in to gcloud with that same build SA.
So I did
gcloud auth activate-service-account sa-terraform-project-creator#tf-project-creator.iam.gserviceaccount.com --key-file='.secrets/sa-terraform-project-creator.json'
and right after
gcloud builds submit --config cloudbuild.yaml
Yet again, I hit the same permission problem when accessing the Directory API.
As deviavir suspected, I am starting to think that during the execution of the build, the call to the Directory API is made with the wrong credentials.
Is there a way to log the identity used while executing certain Terraform plugin API calls? That would help a lot.
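A diagnostic step along these lines might at least surface which identity gcloud sees inside the build (a sketch only; the gcr.io/cloud-builders/gcloud image and step ordering are illustrative, and what gcloud reports may still differ from the credentials the Terraform provider ends up using):
# Show the account gcloud is authenticated as inside the build step
gcloud auth list
# Inspect the access token gcloud produces; the response includes the email and scopes
curl -s "https://www.googleapis.com/oauth2/v3/tokeninfo?access_token=$(gcloud auth print-access-token)"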

Unable to create new s3 bucket in terraform

I'm attempting to create a new S3 bucket and getting a conflict, though I know the bucket name is new and unique, and it has been many hours (8+) since that name was in use. Details attached. I've even tried with a new name that I know was never a bucket in my account (and likely never a bucket at all).
The name in the logs below is made up and not the one I was using, which was unique and namespaced to my domain.
If I use the aws s3 cli to make the bucket (i.e. aws s3 mb s3://{same-bucket-name} --region us-east-2) where {same-bucket-name} is the name of the bucket I want to create, it works fine.
2019-07-07T00:12:19.463-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] Trying to create new S3 bucket: "my-unique-s3-bucket-name"
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Request s3/CreateBucket Details:
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: PUT /my-unique-s3-bucket-name HTTP/1.1
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Host: s3.us-east-2.amazonaws.com
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: User-Agent: aws-sdk-go/1.20.12 (go1.12.5; darwin; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.2
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Content-Length: 153
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=MYCREDS/20190707/us-east-2/s3/aws4_request, SignedHeaders=content-length;host;x-amz-acl;x-amz-content-sha256;x-amz-date, Signature=b5acd2dbcaf09eda51b4ea8448f1991d26c8eb8249a85e7ac28044864df377b9
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Acl: public-read
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Content-Sha256: 70cae86320841ea73b0bdc759f99920c7caa405e61af2742575750c6586272c9
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Date: 20190707T041219Z
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Accept-Encoding: gzip
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: <CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><LocationConstraint>us-east-2</LocationConstraint></CreateBucketConfiguration>
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: -----------------------------------------------------
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Response s3/CreateBucket Details:
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: ---[ RESPONSE ]--------------------------------------
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: HTTP/1.1 409 Conflict
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Connection: close
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Transfer-Encoding: chunked
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Content-Type: application/xml
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Date: Sun, 07 Jul 2019 04:12:19 GMT
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Server: AmazonS3
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Id-2: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Request-Id: 835B636D828335A1
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: -----------------------------------------------------
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] <?xml version="1.0" encoding="UTF-8"?>
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: <Error><Code>OperationAborted</Code><Message>A conflicting conditional operation is currently in progress against this resource. Please try again.</Message><RequestId>835B636D828335A1</RequestId><HostId>v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=</HostId></Error>
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Validate Response s3/CreateBucket failed, attempt 0/25, error OperationAborted: A conflicting conditional operation is currently in progress against this resource. Please try again.
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: status code: 409, request id: 835B636D828335A1, host id: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [WARN] Got an error while trying to create S3 bucket my-unique-s3-bucket-name: OperationAborted: A conflicting conditional operation is currently in progress against this resource. Please try again.
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: status code: 409, request id: 835B636D828335A1, host id: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [TRACE] Waiting 10s before next try
If the bucket did previously exist then there is an indeterminate amount of time before that bucket name is released.
Unfortunately the AWS docs aren't very specific here:
Important
If you want to continue to use the same bucket name, don't delete the
bucket. We recommend that you empty the bucket and keep it. After a
bucket is deleted, the name becomes available to reuse, but the name
might not be available for you to reuse for various reasons. For
example, it might take some time before the name can be reused, and
some other account could create a bucket with that name before you do.
You can talk to AWS support to confirm what's happening (and check that another AWS account doesn't have the bucket) but ultimately you just need to wait. If the S3 bucket matches a domain name that you control and you intend to use it for website hosting and someone else already has that S3 bucket then there is a process for getting that bucket name back to you, just as there is with CloudFront CNAMEs which are also globally unique.
You should also be able to check if the bucket name is available by running the following command:
aws s3api head-bucket --bucket [bucket name]
Ages back, when we briefly tried deleting S3 buckets in test environments overnight (along with everything else), we would occasionally see this error for over 48 hours, while sometimes the bucket name was available again within a few hours. Unfortunately, AWS provides no guarantees here.
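For what it's worth, a sketch of how the head-bucket result can be read (the bucket name is illustrative):
# exit code 0        -> the bucket exists and your credentials can access it
# "Not Found" (404)  -> no bucket with that name exists, so the name should be claimable
# "Forbidden" (403)  -> a bucket with that name exists but is owned by another account
aws s3api head-bucket --bucket my-unique-s3-bucket-name
echo "exit code: $?"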

AWS C++ S3 SDK PutObjectRequest unable to connect to endpoint

In working with the AWS C++ SDK I ran into an issue where trying to execute a PutObjectRequest complains that it is "unable to connect to endpoint" when uploading more than ~400KB.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
Aws::S3::S3Client s3Client(clientConfig);
Aws::S3::Model::PutObjectRequest putObjectRequest;
putObjectRequest.SetBucket("mybucket");
putObjectRequest.SetKey("mykey");
typedef boost::iostreams::basic_array_source<char> Device;
boost::iostreams::stream_buffer<Device> stmbuf(compressedData, dataSize);
std::iostream *stm = new std::iostream(&stmbuf);
putObjectRequest.SetBody(std::shared_ptr<Aws::IOStream>(stm));
putObjectRequest.SetContentLength(dataSize);
Aws::S3::Model::PutObjectOutcome outcome = s3Client.PutObject(putObjectRequest);
As long as my data is less than ~400KB it gets uploaded into a file on S3 but beyond that it is unable to connect to endpoint. I should be able to upload up to 5GB in one PutObjectRequest.
Any thoughts?
Edit:
Responding to @JonathanHenson's comment, the AWS log shows this timeout error repeatedly:
[DEBUG] 2016-08-04 13:42:03 AWSClient [0x700000081000] Request Successfully signed
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Making request to https://s3.amazonaws.com/mybucket/myfile
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Including headers:
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] content-length: 3151261
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] content-type: binary/octet-stream
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] host: s3.amazonaws.com
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] user-agent: aws-sdk-cpp/0.13.9 Darwin/15.6.0 x86_64
[DEBUG] 2016-08-04 13:42:03 CurlHandleContainer [0x700000081000] Attempting to acquire curl connection.
[DEBUG] 2016-08-04 13:42:03 CurlHandleContainer [0x700000081000] Returning connection handle 0x10b09cc00
[DEBUG] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Obtained connection handle 0x10b09cc00
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] HTTP/1.1 100 Continue
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000]
[ERROR] 2016-08-04 13:42:06 CurlHttpClient [0x700000081000] Curl returned error code 28
[DEBUG] 2016-08-04 13:42:06 CurlHandleContainer [0x700000081000] Releasing curl handle 0x10b09cc00
[DEBUG] 2016-08-04 13:42:06 CurlHandleContainer [0x700000081000] Notifying waiting threads.
[DEBUG] 2016-08-04 13:42:06 AWSClient [0x700000081000] Request returned error. Attempting to generate appropriate error codes from response
[WARN] 2016-08-04 13:42:06 AWSClient [0x700000081000] Request failed, now waiting 12800 ms before attempting again.
[DEBUG] 2016-08-04 13:42:19 InstanceProfileCredentialsProvider [0x700000081000] Checking if latest credential pull has expired.
Ultimately what fixed this for me was setting the request timeout. The request timeout needs to be long enough for your entire transfer to finish. If you are transferring large files over a slow internet connection, make sure the request timeout is long enough to allow those files to transfer.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
clientConfig.connectTimeoutMs = 30000;
clientConfig.requestTimeoutMs = 600000;
Tweak your config as below and it will work:
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
clientConfig.connectTimeoutMs = 30000;
Aws::S3::S3Client s3Client(clientConfig);
Aws::S3::Model::PutObjectRequest putObjectRequest;
putObjectRequest.SetBucket("mybucket");
putObjectRequest.SetKey("mykey");
typedef boost::iostreams::basic_array_source<char> Device;
boost::iostreams::stream_buffer<Device> stmbuf(compressedData, dataSize);
std::iostream *stm = new std::iostream(&stmbuf);
putObjectRequest.SetBody(std::shared_ptr<Aws::IOStream>(stm));
putObjectRequest.SetContentLength(dataSize);
Aws::S3::Model::PutObjectOutcome outcome = s3Client.PutObject(putObjectRequest);