I'm attempting to create a new S3 bucket and getting a conflict, even though I know the bucket name is new and unique, and it has been many hours (8+) since that name was last in use. Details attached. I've even tried with a new name that I know was never a bucket in my account (and likely never a bucket at all).
The name in the logs below is made up and not the one I was using, which was unique and namespaced to my domain.
If I use the AWS S3 CLI to make the bucket (i.e. aws s3 mb s3://{same-bucket-name} --region us-east-2, where {same-bucket-name} is the name of the bucket I want to create), it works fine.
2019-07-07T00:12:19.463-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] Trying to create new S3 bucket: "my-unique-s3-bucket-name"
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Request s3/CreateBucket Details:
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: PUT /my-unique-s3-bucket-name HTTP/1.1
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Host: s3.us-east-2.amazonaws.com
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: User-Agent: aws-sdk-go/1.20.12 (go1.12.5; darwin; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.2
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Content-Length: 153
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=MYCREDS/20190707/us-east-2/s3/aws4_request, SignedHeaders=content-length;host;x-amz-acl;x-amz-content-sha256;x-amz-date, Signature=b5acd2dbcaf09eda51b4ea8448f1991d26c8eb8249a85e7ac28044864df377b9
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Acl: public-read
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Content-Sha256: 70cae86320841ea73b0bdc759f99920c7caa405e61af2742575750c6586272c9
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Date: 20190707T041219Z
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Accept-Encoding: gzip
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: <CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><LocationConstraint>us-east-2</LocationConstraint></CreateBucketConfiguration>
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: -----------------------------------------------------
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Response s3/CreateBucket Details:
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: ---[ RESPONSE ]--------------------------------------
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: HTTP/1.1 409 Conflict
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Connection: close
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Transfer-Encoding: chunked
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Content-Type: application/xml
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Date: Sun, 07 Jul 2019 04:12:19 GMT
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Server: AmazonS3
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Id-2: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Request-Id: 835B636D828335A1
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: -----------------------------------------------------
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] <?xml version="1.0" encoding="UTF-8"?>
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: <Error><Code>OperationAborted</Code><Message>A conflicting conditional operation is currently in progress against this resource. Please try again.</Message><RequestId>835B636D828335A1</RequestId><HostId>v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=</HostId></Error>
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Validate Response s3/CreateBucket failed, attempt 0/25, error OperationAborted: A conflicting conditional operation is currently in progress against this resource. Please try again.
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: status code: 409, request id: 835B636D828335A1, host id: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [WARN] Got an error while trying to create S3 bucket my-unique-s3-bucket-name: OperationAborted: A conflicting conditional operation is currently in progress against this resource. Please try again.
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: status code: 409, request id: 835B636D828335A1, host id: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [TRACE] Waiting 10s before next try
If the bucket did previously exist, then there is an indeterminate amount of time before that bucket name is released.
Unfortunately, the AWS docs aren't very specific here:
Important
If you want to continue to use the same bucket name, don't delete the
bucket. We recommend that you empty the bucket and keep it. After a
bucket is deleted, the name becomes available to reuse, but the name
might not be available for you to reuse for various reasons. For
example, it might take some time before the name can be reused, and
some other account could create a bucket with that name before you do.
You can talk to AWS Support to confirm what's happening (and to check that another AWS account doesn't hold the bucket), but ultimately you just need to wait. If the S3 bucket matches a domain name that you control, you intend to use it for website hosting, and someone else already has that bucket, then there is a process for getting that bucket name back to you, just as there is with CloudFront CNAMEs, which are also globally unique.
You should also be able to check whether the bucket name is available by running the following command:
aws s3api head-bucket --bucket [bucket name]
A 404 means the name is free, a 403 means the bucket exists in another account, and a 200 means it exists and you can access it.
Ages back, when we briefly tried deleting S3 buckets in test environments overnight (along with everything else), we would occasionally see this error for over 48 hours, while at other times the bucket name was available again within a few hours. Unfortunately, AWS provides no guarantees here.
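If you are managing the bucket with Terraform, one way to follow that documented advice (keep and empty the bucket rather than delete it) is to guard the resource against destruction. A minimal sketch, assuming the name, ACL and region from the logs above:
resource "aws_s3_bucket" "site" {
  bucket = "my-unique-s3-bucket-name"
  acl    = "public-read"

  # Refuse to destroy the bucket so the globally-unique name is never
  # released; empty its contents instead if you need to reclaim it.
  lifecycle {
    prevent_destroy = true
  }
}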
Related
I'm going mad over a Fluent Bit DaemonSet, installed via Helm in EKS in AWS account yyyyyyy, that is unable to send data to Kinesis in AWS account xxxxxxxxxx.
The error claims that EKS has no OIDC provider in IAM, but that's false! Can you help?
Fluent Bit logs:
[2022/06/29 15:22:34] [debug] [output:kinesis_firehose:kinesis_firehose.0] firehose:PutRecordBatch: events=157, payload=71245 bytes
[2022/06/29 15:22:34] [debug] [output:kinesis_firehose:kinesis_firehose.0] Sending log records to delivery stream kinesis_backend
[2022/06/29 15:22:34] [debug] [http_client] not using http_proxy for header
[2022/06/29 15:22:34] [debug] [aws_credentials] Requesting credentials from the EC2 provider..
[2022/06/29 15:22:34] [debug] [input:tail:tail.0] inode=19100461 events: IN_MODIFY
[2022/06/29 15:22:34] [debug] [input chunk] update output instances with new chunk size diff=693
[2022/06/29 15:22:34] [debug] [input:tail:tail.0] inode=19100461 events: IN_MODIFY
[2022/06/29 15:22:34] [debug] [http_client] server firehose.eu-west-1.amazonaws.com:443 will close connection #74
[2022/06/29 15:22:34] [debug] [aws_client] firehose.eu-west-1.amazonaws.com: http_do=0, HTTP Status: 400
[2022/06/29 15:22:34] [error] [aws_client] auth error, refreshing creds
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the env provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the profile provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Reading shared config file.
[2022/06/29 15:22:34] [debug] [aws_credentials] Shared config file /root/.aws/config does not exist
[2022/06/29 15:22:34] [debug] [aws_credentials] Reading shared credentials file.
[2022/06/29 15:22:34] [error] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the EKS provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Calling STS..
[2022/06/29 15:22:34] [debug] [http_client] not using http_proxy for header
[2022/06/29 15:22:34] [debug] [http_client] server sts.eu-west-1.amazonaws.com:443 will close connection #74
[2022/06/29 15:22:34] [debug] [aws_client] sts.eu-west-1.amazonaws.com: http_do=0, HTTP Status: 400
[2022/06/29 15:22:34] [debug] [aws_client] Unable to parse API response- response is not valid JSON.
[2022/06/29 15:22:34] [debug] [aws_credentials] STS raw response:
<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
<Error>
<Type>Sender</Type>
<Code>InvalidIdentityToken</Code>
<Message>No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA</Message>
</Error>
<RequestId>c517249d-c018-43c3-a712-d0e5080ded86</RequestId>
</ErrorResponse>
The fluent-bit service account in the newrelic namespace (created by the fluent-bit Helm chart):
kubectl -n newrelic describe sa fluent-bit
Name: fluent-bit
Namespace: newrelic
Labels: app.kubernetes.io/instance=fluent-bit
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=fluent-bit
app.kubernetes.io/version=1.9.4
helm.sh/chart=fluent-bit-0.20.2
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxx:role/kinesis-write
meta.helm.sh/release-name: fluent-bit
meta.helm.sh/release-namespace: newrelic
Policy attached to role arn:aws:iam::xxxxxxxxxx:role/kinesis-write:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"firehose:PutRecord",
"firehose:PutRecordBatch"
],
"Resource": "arn:aws:firehose:region:xxxxxxxxxx:deliverystream/kinesis-backend"
}
]
}
Trust relationships of role arn:aws:iam::xxxxxxxxxx:role/kinesis-write (I included the OIDC provider for my EKS cluster in account yyyyyyyyyy):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::yyyyyyyyy:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA"
},
"Action": [
"sts:AssumeRole",
"sts:AssumeRoleWithWebIdentity"
],
"Condition": {
"StringEquals": {
"oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA:sub": "system:serviceaccount:newrelic:fluent-bit"
}
}
}
]
}
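For what it's worth, registering an EKS cluster's OIDC issuer as an IAM OIDC provider looks roughly like this in Terraform (a sketch only; the thumbprint lookup via the tls provider is just one way to obtain it). STS validates the web-identity token against the providers registered in the account that owns the role, so in a cross-account setup like this the provider generally has to exist in account xxxxxxxxxx as well, not only in the cluster account yyyyyyyyyy:
# Apply in account xxxxxxxxxx, the account that owns the kinesis-write role.
data "tls_certificate" "eks_oidc" {
  url = "https://oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA"
}

resource "aws_iam_openid_connect_provider" "eks" {
  url             = "https://oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks_oidc.certificates[0].sha1_fingerprint]
}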
I am getting the following error while trying to create an ECS cluster, at the capacity provider creation phase.
2022-01-05T09:15:20.480-0800 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/05 09:15:20 [DEBUG] [aws-sdk-go] DEBUG: Request ecs/CreateCapacityProvider Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: ecs.us-east-1.amazonaws.com
User-Agent: APN/1.0 HashiCorp/1.0 Terraform/0.12.31 (+https://www.terraform.io) terraform-provider-aws/3.70.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.42.23 (go1.16; darwin; amd64)
Content-Length: 370
Authorization: AWS4-HMAC-SHA256 Credential=AKIAI2AFJ6MZHHPZ2HTA/20220105/us-east-1/ecs/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-target, Signature=e3f52e06669323c64df6ca485fcb7fae41c6941237fb5dbb0ba6e63478a6eb28
Content-Type: application/x-amz-json-1.1
X-Amz-Date: 20220105T171520Z
X-Amz-Target: AmazonEC2ContainerServiceV20141113.CreateCapacityProvider
Accept-Encoding: gzip
{"autoScalingGroupProvider":{"autoScalingGroupArn":"arn:aws:autoscaling:us-east-1:009710336282:autoScalingGroup:2a7b4cd4-919c-4f59-b33d-bc9486033e17:autoScalingGroupName/terraform-20220105170805411900000007","managedScaling":{"maximumScalingStepSize":1000,"minimumScalingStepSize":1,"status":"ENABLED","targetCapacity":80}},"name":"app-client-capacity-provider"}
-----------------------------------------------------: timestamp=2022-01-05T09:15:20.479-0800
2022/01/05 09:15:20 [TRACE] dag/walk: vertex "aws_appautoscaling_policy.ecs_service_policy_scaling" is waiting for "aws_appautoscaling_target.ecs_service_target"
2022-01-05T09:15:21.075-0800 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/05 09:15:21 [DEBUG] [aws-sdk-go] DEBUG: Response ecs/CreateCapacityProvider Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 500
Connection: close
Content-Length: 85
Content-Type: application/x-amz-json-1.1
Date: Wed, 05 Jan 2022 17:15:20 GMT
X-Amzn-Requestid: 67bc83ad-103a-451d-baf5-4697df2e44cb
-----------------------------------------------------: timestamp=2022-01-05T09:15:21.075-0800
2022-01-05T09:15:21.075-0800 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/05 09:15:21 [DEBUG] [aws-sdk-go] {"__type":"ServerException","message":"Service Unavailable. Please try again later."}: timestamp=2022-01-05T09:15:21.075-0800
There are no details about the error; any idea what might be wrong here?
Here is a snippet from my Terraform for the capacity provider:
# ... (other stuff like ASG launch template etc.)
resource "aws_autoscaling_group" "cluster-asg" {
desired_capacity = 3
min_size = 3
max_size = 50
availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
launch_template {
id = aws_launch_template.as_launch_template.id
version = "$Latest"
}
tag {
key = "AmazonECSManaged"
value = true
propagate_at_launch = true
}
}
resource "aws_ecs_capacity_provider" "capacity_provider" {
name = "app-client-capacity-provider"
auto_scaling_group_provider {
auto_scaling_group_arn = aws_autoscaling_group.cluster-asg.arn
managed_scaling {
instance_warmup_period = 120
maximum_scaling_step_size = 1000
minimum_scaling_step_size = 1
status = "ENABLED"
target_capacity = 80
}
}
}
resource "aws_ecs_cluster" "cluster" {
name = "app-client" # Naming the cluster
capacity_providers = [aws_ecs_capacity_provider.capacity_provider.name]
}
I fixed this by creating the cluster without a capacity provider first, then modifying it to add one.
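For reference, a sketch of that two-step shape in Terraform, assuming a newer AWS provider version that exposes the cluster/capacity-provider attachment as its own resource:
# Create the cluster on its own first, with no capacity provider.
resource "aws_ecs_cluster" "cluster" {
  name = "app-client"
}

# Attach the capacity provider as a separate step once the cluster exists.
resource "aws_ecs_cluster_capacity_providers" "cluster" {
  cluster_name       = aws_ecs_cluster.cluster.name
  capacity_providers = [aws_ecs_capacity_provider.capacity_provider.name]

  default_capacity_provider_strategy {
    base              = 0
    weight            = 100
    capacity_provider = aws_ecs_capacity_provider.capacity_provider.name
  }
}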
I'm trying to make X-Ray work as a sidecar container on ECS Fargate.
However, the daemon is shutting down and stopping the task.
These are the logs:
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Shutdown Initiated. Current epoch in nanoseconds: 1619452799073953600
2021-04-26 08:59:592021-04-26T15:59:59Z [Info] Got shutdown signal: terminated
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Skipped telemetry data as no segments found
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] telemetry: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] processor: done!
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Trace segment: received: 0, truncated: 0, processed: 0
2021-04-26 08:59:592021-04-26T15:59:59Z [Debug] Shutdown finished. Current epoch in nanoseconds: 1619452799074286437
2021-04-26 08:59:582021-04-26T15:59:58Z [Info] Starting proxy http server on 0.0.0.0:2000
2021-04-26 08:59:582021-04-26T15:59:58Z [Error] Get instance id metadata failed: RequestError: send request failed
2021-04-26 08:59:58caused by: Get http://169.254.169.254/latest/meta-data/instance-id: dial tcp 169.254.169.254:80: connect: invalid argument
2021-04-26 08:59:582021-04-26T15:59:58Z [Debug] Using Endpoint: https://xray.us-east-1.amazonaws.com
2021-04-26 08:59:582021-04-26T15:59:58Z [Debug] Telemetry initiated
2021-04-26 08:59:582021-04-26T15:59:58Z [Info] HTTP Proxy server using X-Ray Endpoint : https://xray.us-east-1.amazonaws.com
2021-04-26 08:59:582021-04-26T15:59:58Z [Debug] Using Endpoint: https://xray.us-east-1.amazonaws.com
2021-04-26 08:59:582021-04-26T15:59:58Z [Debug] Batch size: 50
2021-04-26 08:59:572021-04-26T15:59:57Z [Debug] Get hostname metadata failed: RequestError: send request failed
2021-04-26 08:59:57caused by: Get http://169.254.169.254/latest/meta-data/hostname: dial tcp 169.254.169.254:80: connect: invalid argument
2021-04-26 08:59:572021-04-26T15:59:57Z [Debug] Using proxy address:
2021-04-26 08:59:572021-04-26T15:59:57Z [Debug] Fetch region us-east-1 from environment variables
2021-04-26 08:59:572021-04-26T15:59:57Z [Info] Using region: us-east-1
2021-04-26 08:59:572021-04-26T15:59:57Z [Debug] ARN of the AWS resource running the daemon:
2021-04-26 08:59:572021-04-26T15:59:57Z [Info] Initializing AWS X-Ray daemon 3.2.0
2021-04-26 08:59:572021-04-26T15:59:57Z [Debug] Listening on UDP 0.0.0.0:2000
2021-04-26 08:59:572021-04-26T15:59:57Z [Info] Using buffer memory limit of 37 MB
2021-04-26 08:59:572021-04-26T15:59:57Z [Info] 592 segment buffers allocated
I found this and checked I have everything I needed: https://github.com/aws/aws-app-mesh-examples/blob/db4a8d49ab61c62dbc254cd4d35a3911df4cc32c/walkthroughs/howto-alb/app.yaml#L61, particularly for the TASK role (and permissions).
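For context, the sidecar shape that example describes looks roughly like this in Terraform (a sketch only; the app image, role references and sizes here are placeholders rather than my actual task definition):
resource "aws_ecs_task_definition" "app" {
  family                   = "app-with-xray"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"
  task_role_arn            = aws_iam_role.task.arn      # needs X-Ray write permissions
  execution_role_arn       = aws_iam_role.execution.arn

  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "my-app:latest"
      essential = true
    },
    {
      # X-Ray daemon sidecar; with awsvpc networking the app reaches it on 127.0.0.1:2000.
      name      = "xray-daemon"
      image     = "amazon/aws-xray-daemon"
      essential = false
      portMappings = [
        { containerPort = 2000, protocol = "udp" }
      ]
    }
  ])
}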
I don't understand what's going on, and googling doesn't provide good hints either.
I would appreciate any help.
Thanks
I'm trying to set up a controller service account for Dataflow. In my Dataflow options I have:
options.setGcpCredential(GoogleCredentials.fromStream(
new FileInputStream("key.json")).createScoped(someArrays));
options.setServiceAccount("xxx#yyy.iam.gserviceaccount.com");
But I'm getting:
WARNING: Request failed with code 403, performed 0 retries due to IOExceptions,
performed 0 retries due to unsuccessful status codes, HTTP framework says
request can be retried, (caller responsible for retrying):
https://dataflow.googleapis.com/v1b3/projects/MYPROJECT/locations/MYLOCATION/jobs
Exception in thread "main" java.lang.RuntimeException: Failed to create a workflow
job: (CODE): Current user cannot act as
service account "xxx#yyy.iam.gserviceaccount.com.
Causes: (CODE): Current user cannot act as
service account "xxx#yyy.iam.gserviceaccount.com.
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:791)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:173)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
...
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
{
"code" : 403,
"errors" : [ {
"domain" : "global",
"message" : "(CODE): Current user cannot act as service account
xxx#yyy.iam.gserviceaccount.com. Causes: (CODE): Current user
cannot act as service account xxx#yyy.iam.gserviceaccount.com.",
"reason" : "forbidden"
} ],
"message" : "(CODE): Current user cannot act as service account
xxx#yyy.iam.gserviceaccount.com. Causes: (CODE): Current user
cannot act as service account xxx#yyy.iam.gserviceaccount.com.",
"status" : "PERMISSION_DENIED"
}
Am I missing some roles or permissions?
Maybe someone is going to find this helpful:
For the controller service account it was: Dataflow Worker and Storage Object Admin (that was found in Google's documentation).
For the executing (job-submitting) account it was: Service Account User.
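If you are setting this up with Terraform rather than through the console, a sketch of those grants might look like the following (the project, controller service account and deployer identity are assumptions, not values from this question):
# Controller service account: runs the Dataflow workers.
resource "google_project_iam_member" "controller_dataflow_worker" {
  project = "my-project"
  role    = "roles/dataflow.worker"
  member  = "serviceAccount:controller@my-project.iam.gserviceaccount.com"
}

resource "google_project_iam_member" "controller_storage_object_admin" {
  project = "my-project"
  role    = "roles/storage.objectAdmin"
  member  = "serviceAccount:controller@my-project.iam.gserviceaccount.com"
}

# Executing identity: must be allowed to act as the controller service account.
resource "google_service_account_iam_member" "executor_sa_user" {
  service_account_id = "projects/my-project/serviceAccounts/controller@my-project.iam.gserviceaccount.com"
  role               = "roles/iam.serviceAccountUser"
  member             = "user:deployer@example.com"
}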
I've been hitting this error and thought it worth sharing my experiences (partly because I suspect I'll encounter this again in the future).
The Terraform code to create my Dataflow job is:
resource "google_dataflow_job" "wordcount" {
# https://stackoverflow.com/a/59931467/201657
name = "wordcount"
template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
temp_gcs_location = "gs://${local.name-prefix}-functions/temp"
parameters = {
inputFile = "gs://dataflow-samples/shakespeare/kinglear.txt"
output = "gs://${local.name-prefix}-functions/wordcount/output"
}
service_account_email = "serviceAccount:${data.google_service_account.sa.email}"
}
The error message:
Error: googleapi: Error 400: (c3c0d991927a8658): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com., badRequest
was returned from running terraform apply. Checking out the logs provided a lot more info:
gcloud logging read 'timestamp >= "2020-12-31T13:39:58.733249492Z" AND timestamp <= "2020-12-31T13:45:58.733249492Z"' --format="csv(timestamp,severity,textPayload)" --order=asc
which returned various log records, including this:
Permissions verification for controller service account failed. IAM role roles/dataflow.worker should be granted to controller service account dataflowdemo#redacted.iam.gserviceaccount.com.
so I granted that missing role:
gcloud projects add-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.worker"
and ran terraform apply again. This time I got the same error in the terraform output but there were no errors to be seen in the logs.
I then followed the advice given at https://cloud.google.com/dataflow/docs/concepts/access-control#creating_jobs to also grant the roles/dataflow.admin:
gcloud projects add-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.admin"
but there was no discernible difference from the previous attempt.
I then tried turning on terraform debug logging which provided this info:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ---[ REQUEST ]---------------------------------------
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: POST /v1b3/projects/redacted/locations/europe-west1/templates?alt=json&prettyPrint=false HTTP/1.1
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Host: dataflow.googleapis.com
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: User-Agent: google-api-go-client/0.5 Terraform/0.14.2 (+https://www.terraform.io) Terraform-Plugin-SDK/2.1.0 terraform-provider-google/dev
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Length: 385
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Type: application/json
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Goog-Api-Client: gl-go/1.14.5 gdcl/20201023
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Accept-Encoding: gzip
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "environment": {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "serviceAccountEmail": "serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "tempLocation": "gs://jamiet-demo-functions/temp"
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: },
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "gcsPath": "gs://dataflow-templates/latest/Word_Count",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "jobName": "wordcount",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "parameters": {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "inputFile": "gs://dataflow-samples/shakespeare/kinglear.txt",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "output": "gs://jamiet-demo-functions/wordcount/output"
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: -----------------------------------------------------
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 2020/12/31 16:04:14 [DEBUG] Google API Response Details:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ---[ RESPONSE ]--------------------------------------
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: HTTP/1.1 400 Bad Request
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Connection: close
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Transfer-Encoding: chunked
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Alt-Svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Cache-Control: private
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Type: application/json; charset=UTF-8
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Date: Thu, 31 Dec 2020 16:04:15 GMT
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Server: ESF
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: Origin
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: X-Origin
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: Referer
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Content-Type-Options: nosniff
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Frame-Options: SAMEORIGIN
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Xss-Protection: 0
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 1f9
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "error": {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "code": 400,
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "message": "(dbacb1c39beb28c9): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com.",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "errors": [
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "message": "(dbacb1c39beb28c9): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com.",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "domain": "global",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "reason": "badRequest"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ],
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "status": "INVALID_ARGUMENT"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 0
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: -----------------------------------------------------
The error being returned from dataflow.googleapis.com is clearly evident:
Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com
At this stage I was puzzled as to why I could see an error being returned from Google's Dataflow API yet there was nothing in the GCP logs indicating that an error had occurred.
But then I had a bit of a lightbulb moment: why does that error message mention "service account serviceAccount"? Then it hit me, I'd defined the service account incorrectly. The Terraform code should have been:
resource "google_dataflow_job" "wordcount" {
# https://stackoverflow.com/a/59931467/201657
name = "wordcount"
template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
temp_gcs_location = "gs://${local.name-prefix}-functions/temp"
parameters = {
inputFile = "gs://dataflow-samples/shakespeare/kinglear.txt"
output = "gs://${local.name-prefix}-functions/wordcount/output"
}
service_account_email = data.google_service_account.sa.email
}
I corrected it and it worked straight away. User error!!!
I then set about removing the various permissions that I'd added:
gcloud projects remove-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.admin"
gcloud projects remove-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.worker"
and terraform apply still worked. However, after removing the grant of role roles/dataflow.worker the job failed with error:
Workflow failed. Causes: Permissions verification for controller service account failed. IAM role roles/dataflow.worker should be granted to controller service account dataflowdemo#redacted.iam.gserviceaccount.com.
so clearly the documentation regarding the appropriate roles to grant (https://cloud.google.com/dataflow/docs/concepts/access-control#creating_jobs) is spot on.
As may be apparent, I started writing this post before I knew what the problem was, and I thought it might be useful to document my investigation somewhere. Now that I've finished the investigation and the problem turns out to be a case of PEBCAK, it's probably not so relevant to this thread anymore, and certainly shouldn't be accepted as an answer. Nevertheless, there is probably some useful information in here about how to investigate issues with Terraform calling Google APIs, and it also reiterates the required role grants, so I'll leave it here in case it ever turns out to be useful.
I just hit this problem again, so I'm posting my solution up here as I fully expect I'll get bitten by this again at some point.
I was getting this error:
Error: googleapi: Error 403: (a00eba23d59c1fa3): Current user cannot act as service account dataflow-controller-sa#myproject.iam.gserviceaccount.com. Causes: (a00eba23d59c15ac): Current user cannot act as service account dataflow-controller-sa#myproject.iam.gserviceaccount.com., forbidden
I was deploying the dataflow job, via terraform, using a different service account, deployer#myproject.iam.gserviceaccount.com
The solution was to grant that service account the roles/iam.serviceAccountUser role:
gcloud projects add-iam-policy-binding myproject \
--member=serviceAccount:deployer#myproject.iam.gserviceaccount.com \
--role=roles/iam.serviceAccountUser
For those who prefer custom IAM roles over predefined IAM roles, the specific permission that was missing was iam.serviceAccounts.actAs.
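For example, a sketch of a custom role carrying just that permission, granted to the deploying service account (project and account names here are placeholders):
# A narrow custom role containing only the missing permission.
resource "google_project_iam_custom_role" "service_account_actor" {
  role_id     = "serviceAccountActor"
  title       = "Service Account Actor"
  permissions = ["iam.serviceAccounts.actAs"]
}

resource "google_project_iam_member" "deployer_act_as" {
  project = "myproject"
  role    = google_project_iam_custom_role.service_account_actor.id
  member  = "serviceAccount:deployer@myproject.iam.gserviceaccount.com"
}
Either way, the binding can also be made on the individual service account rather than the whole project if you want to keep it narrow.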
Issue got resolved!
Go to the GCP Console -> IAM -> the service account's email -> Add Permission -> Service Account User.
While working with the AWS C++ SDK, I ran into an issue where trying to execute a PutObjectRequest fails with "unable to connect to endpoint" when uploading more than ~400 KB.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
Aws::S3::S3Client s3Client(clientConfig);
Aws::S3::Model::PutObjectRequest putObjectRequest;
putObjectRequest.SetBucket("mybucket");
putObjectRequest.SetKey("mykey");
typedef boost::iostreams::basic_array_source<char> Device;
boost::iostreams::stream_buffer<Device> stmbuf(compressedData, dataSize);
std::iostream *stm = new std::iostream(&stmbuf);
putObjectRequest.SetBody(std::shared_ptr<Aws::IOStream>(stm));
putObjectRequest.SetContentLength(dataSize);
Aws::S3::Model::PutObjectOutcome outcome = s3Client.PutObject(putObjectRequest);
As long as my data is less than ~400 KB it gets uploaded into a file on S3, but beyond that the client is unable to connect to the endpoint. I should be able to upload up to 5 GB in a single PutObjectRequest.
Any thoughts?
Edit:
Responding to @JonathanHenson's comment, the AWS log shows this timeout error repeatedly:
[DEBUG] 2016-08-04 13:42:03 AWSClient [0x700000081000] Request Successfully signed
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Making request to https://s3.amazonaws.com/mybucket/myfile
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Including headers:
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] content-length: 3151261
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] content-type: binary/octet-stream
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] host: s3.amazonaws.com
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] user-agent: aws-sdk-cpp/0.13.9 Darwin/15.6.0 x86_64
[DEBUG] 2016-08-04 13:42:03 CurlHandleContainer [0x700000081000] Attempting to acquire curl connection.
[DEBUG] 2016-08-04 13:42:03 CurlHandleContainer [0x700000081000] Returning connection handle 0x10b09cc00
[DEBUG] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Obtained connection handle 0x10b09cc00
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] HTTP/1.1 100 Continue
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000]
[ERROR] 2016-08-04 13:42:06 CurlHttpClient [0x700000081000] Curl returned error code 28
[DEBUG] 2016-08-04 13:42:06 CurlHandleContainer [0x700000081000] Releasing curl handle 0x10b09cc00
[DEBUG] 2016-08-04 13:42:06 CurlHandleContainer [0x700000081000] Notifying waiting threads.
[DEBUG] 2016-08-04 13:42:06 AWSClient [0x700000081000] Request returned error. Attempting to generate appropriate error codes from response
[WARN] 2016-08-04 13:42:06 AWSClient [0x700000081000] Request failed, now waiting 12800 ms before attempting again.
[DEBUG] 2016-08-04 13:42:19 InstanceProfileCredentialsProvider [0x700000081000] Checking if latest credential pull has expired.
Ultimately what fixed this for me was setting the request timeout. The request timeout needs to be long enough for your entire transfer to finish. If you are transferring large files on a slow internet connection, make sure the request timeout is long enough to allow those files to transfer.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
clientConfig.connectTimeoutMs = 30000;
clientConfig.requestTimeoutMs = 600000;
Tweak your client configuration as below and it will work.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
clientConfig.connectTimeoutMs = 30000;
Aws::S3::S3Client s3Client(clientConfig);
Aws::S3::Model::PutObjectRequest putObjectRequest;
putObjectRequest.SetBucket("mybucket");
putObjectRequest.SetKey("mykey");
typedef boost::iostreams::basic_array_source<char> Device;
boost::iostreams::stream_buffer<Device> stmbuf(compressedData, dataSize);
std::iostream *stm = new std::iostream(&stmbuf);
putObjectRequest.SetBody(std::shared_ptr<Aws::IOStream>(stm));
putObjectRequest.SetContentLength(dataSize);
Aws::S3::Model::PutObjectOutcome outcome = s3Client.PutObject(putObjectRequest);