I'm trying to get the AWS X-Ray daemon working as a sidecar container on ECS Fargate.
However, it shuts down and stops the task.
These are the logs:
2021-04-26T15:59:59Z [Debug] Shutdown Initiated. Current epoch in nanoseconds: 1619452799073953600
2021-04-26T15:59:59Z [Info] Got shutdown signal: terminated
2021-04-26T15:59:59Z [Debug] Skipped telemetry data as no segments found
2021-04-26T15:59:59Z [Debug] telemetry: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] Segment batch: done!
2021-04-26T15:59:59Z [Debug] processor: done!
2021-04-26T15:59:59Z [Debug] Trace segment: received: 0, truncated: 0, processed: 0
2021-04-26T15:59:59Z [Debug] Shutdown finished. Current epoch in nanoseconds: 1619452799074286437
2021-04-26T15:59:58Z [Info] Starting proxy http server on 0.0.0.0:2000
2021-04-26T15:59:58Z [Error] Get instance id metadata failed: RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/instance-id: dial tcp 169.254.169.254:80: connect: invalid argument
2021-04-26T15:59:58Z [Debug] Using Endpoint: https://xray.us-east-1.amazonaws.com
2021-04-26T15:59:58Z [Debug] Telemetry initiated
2021-04-26T15:59:58Z [Info] HTTP Proxy server using X-Ray Endpoint : https://xray.us-east-1.amazonaws.com
2021-04-26T15:59:58Z [Debug] Using Endpoint: https://xray.us-east-1.amazonaws.com
2021-04-26T15:59:58Z [Debug] Batch size: 50
2021-04-26T15:59:57Z [Debug] Get hostname metadata failed: RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/hostname: dial tcp 169.254.169.254:80: connect: invalid argument
2021-04-26T15:59:57Z [Debug] Using proxy address:
2021-04-26T15:59:57Z [Debug] Fetch region us-east-1 from environment variables
2021-04-26T15:59:57Z [Info] Using region: us-east-1
2021-04-26T15:59:57Z [Debug] ARN of the AWS resource running the daemon:
2021-04-26T15:59:57Z [Info] Initializing AWS X-Ray daemon 3.2.0
2021-04-26T15:59:57Z [Debug] Listening on UDP 0.0.0.0:2000
2021-04-26T15:59:57Z [Info] Using buffer memory limit of 37 MB
2021-04-26T15:59:57Z [Info] 592 segment buffers allocated
I found this example and checked that I have everything it needs: https://github.com/aws/aws-app-mesh-examples/blob/db4a8d49ab61c62dbc254cd4d35a3911df4cc32c/walkthroughs/howto-alb/app.yaml#L61, particularly the task role (and its permissions).
I don't understand what's going on, and googling doesn't provide good hints either.
I would appreciate any help.
Thanks
Related
A Lambda backend function is invoked every 20 seconds from an S3 static website: the GET request goes via API Gateway to the Lambda, which tries to get an item from a DynamoDB table and should return it. It returns a 500 error, so something is wrong on the server side.
Enough time (9 seconds) is given for the Lambda to finish executing, so the Lambda's own timeout should not be an issue.
The Lambda has a role attached that allows get_item on the DynamoDB table, so IAM is not an issue.
However, when troubleshooting I only see logs in CloudWatch up to the point where it tries to get_item. I put in lots of print and logging calls, but it does not get past that line. I even nested try..except, and it does not catch any errors. I don't see how to detect what's wrong. I set the logging level to DEBUG, and it prints some output.
import logging
import boto3
import sys

logging.getLogger().setLevel(logging.DEBUG)

def lambda_handler(event, context):
    logging.info('doing retrieving from table votes')
    try:
        logging.info('********************* TRYING retrieving from table votes')
        # dynamodb = boto3.client('dynamodb')
        table = boto3.resource('dynamodb', region_name='us-east-1').Table('Votes')
        print(table)
        logging.info('event')
        logging.info(event)
        print(type(table))
        logging.info(table)
        logging.info(type(table))
        # logging.error(table)
        # logging.error(type(table))
        # ************************
        # I don't see the result of count anywhere in the CloudWatch logs
        try:
            count = table.get_item(Key={'voter': {'S': 'count'}})
        except Exception as e:
            logging.info('catching it here - if you see it then something wrong with get_item count')
            logging.info('********************BAD******************')
            logging.error('********************BAD******************')
            e = sys.exc_info()[0]
            exception_type = e.__class__.__name__
            exception_message = str(e)
            logging.error('--------------------------------')
            logging.error(exception_message)
            logging.error(exception_type)
        # ################## BELOW DOESN'T GET PRINTED AT ALL in CloudWatch logs
        logging.info('count')
        logging.info(count)
        print('****************COUNT*********************')
        print(count)
        print('----------------------------------------------------')
        logging.info(count)
        a = count["Item"]["a"]
        b = count["Item"]["b"]
        logging.info('count [Item]')
        logging.info(count["Item"])
        logging.info('------------------------------------')
        logging.info('ok retrieve from table votes')
        logging.info('a is ' + a)
        logging.info('b is ' + b)
        logging.info('************************************ success! a: ' + a + ' and b: ' + b)
        return {'statusCode': 200, 'body': '{"a": ' + a + ', "b": ' + b + '}'}
    except Exception as e:
        logging.info('********************BAD******************')
        e = sys.exc_info()[0]
        exception_type = e.__class__.__name__
        exception_message = str(e)
        logging.error('--------------------------------')
        logging.error(exception_message)
        logging.error(exception_type)
        logging.error('---------------------------------------------')
        return {'statusCode': 500, 'body': '{"status": "error getting from table votes"}'}
Here are the full logs of one request ID from start to finish. For some reason it never prints the result of count = table.get_item(...):
START RequestId: 55a5e428-c8d6-4914-908c-20ccec1153dd Version: $LATEST
2023-01-11T21:29:08.805+05:00
[INFO] 2023-01-11T16:29:08.805Z 55a5e428-c8d6-4914-908c-20ccec1153dd doing retrieving from table votes
2023-01-11T21:29:08.805+05:00
[INFO] 2023-01-11T16:29:08.805Z 55a5e428-c8d6-4914-908c-20ccec1153dd ********************* TRYING retrieving from table votes
2023-01-11T21:29:08.806+05:00
[DEBUG] 2023-01-11T16:29:08.806Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
2023-01-11T21:29:08.824+05:00
[DEBUG] 2023-01-11T16:29:08.824Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from before-call.apigateway to before-call.api-gateway
2023-01-11T21:29:08.825+05:00
[DEBUG] 2023-01-11T16:29:08.825Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
2023-01-11T21:29:08.826+05:00
[DEBUG] 2023-01-11T16:29:08.826Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
2023-01-11T21:29:08.827+05:00
[DEBUG] 2023-01-11T16:29:08.827Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
2023-01-11T21:29:08.827+05:00
[DEBUG] 2023-01-11T16:29:08.827Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
2023-01-11T21:29:08.884+05:00
[DEBUG] 2023-01-11T16:29:08.884Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
2023-01-11T21:29:08.925+05:00
[DEBUG] 2023-01-11T16:29:08.924Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
2023-01-11T21:29:08.925+05:00
[DEBUG] 2023-01-11T16:29:08.925Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
2023-01-11T21:29:08.925+05:00
[DEBUG] 2023-01-11T16:29:08.925Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
2023-01-11T21:29:08.925+05:00
[DEBUG] 2023-01-11T16:29:08.925Z 55a5e428-c8d6-4914-908c-20ccec1153dd Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
2023-01-11T21:29:09.044+05:00
[DEBUG] 2023-01-11T16:29:09.044Z 55a5e428-c8d6-4914-908c-20ccec1153dd Loading JSON file: /var/runtime/boto3/data/dynamodb/2012-08-10/resources-1.json
2023-01-11T21:29:09.047+05:00
[DEBUG] 2023-01-11T16:29:09.047Z 55a5e428-c8d6-4914-908c-20ccec1153dd IMDS ENDPOINT: http://169.254.169.254/
2023-01-11T21:29:09.105+05:00
[DEBUG] 2023-01-11T16:29:09.105Z 55a5e428-c8d6-4914-908c-20ccec1153dd Looking for credentials via: env
2023-01-11T21:29:09.105+05:00
[INFO] 2023-01-11T16:29:09.105Z 55a5e428-c8d6-4914-908c-20ccec1153dd Found credentials in environment variables.
2023-01-11T21:29:09.106+05:00
[DEBUG] 2023-01-11T16:29:09.106Z 55a5e428-c8d6-4914-908c-20ccec1153dd Loading JSON file: /var/runtime/botocore/data/endpoints.json
2023-01-11T21:29:09.266+05:00
[DEBUG] 2023-01-11T16:29:09.266Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event choose-service-name: calling handler <function handle_service_name_alias at 0x7fa39bdec160>
2023-01-11T21:29:09.365+05:00
[DEBUG] 2023-01-11T16:29:09.364Z 55a5e428-c8d6-4914-908c-20ccec1153dd Loading JSON file: /var/runtime/botocore/data/dynamodb/2012-08-10/service-2.json
2023-01-11T21:29:09.427+05:00
[DEBUG] 2023-01-11T16:29:09.426Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event creating-client-class.dynamodb: calling handler <function add_generate_presigned_url at 0x7fa39be91c10>
2023-01-11T21:29:09.485+05:00
[DEBUG] 2023-01-11T16:29:09.484Z 55a5e428-c8d6-4914-908c-20ccec1153dd Setting dynamodb timeout as (60, 60)
2023-01-11T21:29:09.485+05:00
[DEBUG] 2023-01-11T16:29:09.485Z 55a5e428-c8d6-4914-908c-20ccec1153dd Loading JSON file: /var/runtime/botocore/data/_retry.json
2023-01-11T21:29:09.486+05:00
[DEBUG] 2023-01-11T16:29:09.486Z 55a5e428-c8d6-4914-908c-20ccec1153dd Registering retry handlers for service: dynamodb
2023-01-11T21:29:09.487+05:00
[DEBUG] 2023-01-11T16:29:09.486Z 55a5e428-c8d6-4914-908c-20ccec1153dd Loading dynamodb:dynamodb
2023-01-11T21:29:09.487+05:00
[DEBUG] 2023-01-11T16:29:09.487Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event creating-resource-class.dynamodb.ServiceResource: calling handler <function lazy_call.<locals>._handler at 0x7fa39bd7aca0>
2023-01-11T21:29:09.546+05:00
[DEBUG] 2023-01-11T16:29:09.546Z 55a5e428-c8d6-4914-908c-20ccec1153dd Loading dynamodb:Table
2023-01-11T21:29:09.547+05:00
[DEBUG] 2023-01-11T16:29:09.547Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event creating-resource-class.dynamodb.Table: calling handler <function lazy_call.<locals>._handler at 0x7fa39bd7ad30>
2023-01-11T21:29:09.584+05:00
[DEBUG] 2023-01-11T16:29:09.584Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event creating-resource-class.dynamodb.Table: calling handler <function lazy_call.<locals>._handler at 0x7fa39bd7aca0>
2023-01-11T21:29:09.584+05:00
dynamodb.Table(name='Votes')
2023-01-11T21:29:09.584+05:00
[INFO] 2023-01-11T16:29:09.584Z 55a5e428-c8d6-4914-908c-20ccec1153dd event
2023-01-11T21:29:09.584+05:00
[INFO] 2023-01-11T16:29:09.584Z 55a5e428-c8d6-4914-908c-20ccec1153dd {'version': '2.0', 'routeKey': 'GET /results', 'rawPath': '/results', 'rawQueryString': '', 'headers': {'accept': 'application/json','user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36', 'x-amzn-trace-id': 'Root=1-63bee3d4-679c4d201abf991d1f331f33', 'x-forwarded-for': '164.40.37.179', 'x-forwarded-port': '443', 'x-forwarded-proto': 'https'}, 'requestContext': {'accountId': '025416187662', 'apiId': '5y7dfynd34', 'domainName': '5y7dfynd34.execute-api.us-east-1.amazonaws.com', 'domainPrefix': '5y7dfynd34', 'http': {'method': 'GET', 'path': '/results', 'protocol': 'HTTP/1.1', 'sourceIp': '164.40.37.179', 'userAgent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36'}, 'requestId': 'eliJRin9oAMEc9Q=', 'routeKey': 'GET /results', 'stage': '$default', 'time': '11/Jan/2023:16:29:08 +0000', 'timeEpoch': 1673454548760}, 'isBase64Encoded': False}
2023-01-11T21:29:09.584+05:00
<class 'boto3.resources.factory.dynamodb.Table'>
2023-01-11T21:29:09.585+05:00
[INFO] 2023-01-11T16:29:09.584Z 55a5e428-c8d6-4914-908c-20ccec1153dd dynamodb.Table(name='Votes')
2023-01-11T21:29:09.585+05:00
[INFO] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd <class 'boto3.resources.factory.dynamodb.Table'>
2023-01-11T21:29:09.585+05:00
[DEBUG] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd Calling dynamodb:get_item with {'TableName': 'Votes', 'Key': {'voter': {'S': 'count'}}}
2023-01-11T21:29:09.585+05:00
[DEBUG] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event provide-client-params.dynamodb.GetItem: calling handler <function _dynamodb_params at 0x7fa39b870ca0>
2023-01-11T21:29:09.585+05:00
[DEBUG] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event before-parameter-build.dynamodb.GetItem: calling handler <bound method TransformationInjector.inject_condition_expressions of <boto3.dynamodb.transform.TransformationInjector object at 0x7fa39b852730>>
2023-01-11T21:29:09.585+05:00
[DEBUG] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event before-parameter-build.dynamodb.GetItem: calling handler <bound method TransformationInjector.inject_attribute_value_input of <boto3.dynamodb.transform.TransformationInjector object at 0x7fa39b852730>>
2023-01-11T21:29:09.585+05:00
[DEBUG] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event before-parameter-build.dynamodb.GetItem: calling handler <function generate_idempotent_uuid at 0x7fa39be0d3a0>
2023-01-11T21:29:09.585+05:00
[DEBUG] 2023-01-11T16:29:09.585Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event before-parameter-build.dynamodb.GetItem: calling handler <function block_endpoint_discovery_required_operations at 0x7fa39be32d30>
2023-01-11T21:29:09.586+05:00
[DEBUG] 2023-01-11T16:29:09.586Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event before-call.dynamodb.GetItem: calling handler <function inject_api_version_header_if_needed at 0x7fa39be11c10>
2023-01-11T21:29:09.586+05:00
[DEBUG] 2023-01-11T16:29:09.586Z 55a5e428-c8d6-4914-908c-20ccec1153dd Making request for OperationModel(name=GetItem) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target': 'DynamoDB_20120810.GetItem', 'Content-Type': 'application/x-amz-json-1.0', 'User-Agent': 'Boto3/1.20.32 Python/3.9.13 Linux/4.14.255-296-236.539.amzn2.x86_64 exec-env/AWS_Lambda_python3.9 Botocore/1.23.32 Resource'}, 'body': b'{"TableName": "Votes", "Key": {"voter": {"M": {"S": {"S": "count"}}}}}', 'url': 'https://dynamodb.us-east-1.amazonaws.com/', 'context': {'client_region': 'us-east-1', 'client_config': <botocore.config.Config object at 0x7fa39b897c40>, 'has_streaming_input': False, 'auth_type': None}}
2023-01-11T21:29:09.586+05:00
[DEBUG] 2023-01-11T16:29:09.586Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event request-created.dynamodb.GetItem: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7fa39b897a90>>
2023-01-11T21:29:09.586+05:00
[DEBUG] 2023-01-11T16:29:09.586Z 55a5e428-c8d6-4914-908c-20ccec1153dd Event choose-signer.dynamodb.GetItem: calling handler <function set_operation_specific_signer at 0x7fa39be0d280>
2023-01-11T21:29:09.587+05:00
[DEBUG] 2023-01-11T16:29:09.587Z 55a5e428-c8d6-4914-908c-20ccec1153dd Calculating signature using v4 auth.
2023-01-11T21:29:09.587+05:00
[DEBUG] 2023-01-11T16:29:09.587Z 55a5e428-c8d6-4914-908c-20ccec1153dd CanonicalRequest:
POST
/
content-type:application/x-amz-json-1.0
host:dynamodb.us-east-1.amazonaws.com
x-amz-date:20230111T162909Z
x-amz-security- x-amz-target:DynamoDB_20120810.GetItem
content-type;host;x-amz-date;x-amz-security-token;x-amz-target
6dd016d6033694be300988a73dded6cba15ade0cf920e8bafb56369e3719c397
[DEBUG] 2023-01-11T16:29:09.587Z 55a5e428-c8d6-4914-908c-20ccec1153dd CanonicalRequest: POST / content-type:application/x-amz-json-1.0 host:dynamodb.us-east-1.amazonaws.com x-amz-date:20230111T162909Z x-amz-target:DynamoDB_20120810.GetItem content-type;host;x-amz-date;x-amz-security-token;x-amz-target 6dd016d6033694be300988a73dded6cba15ade0cf920e8bafb56369e3719c397
2023-01-11T21:29:09.587+05:00
[DEBUG] 2023-01-11T16:29:09.587Z 55a5e428-c8d6-4914-908c-20ccec1153dd StringToSign:
AWS4-HMAC-SHA256
20230111T162909Z
20230111/us-east-1/dynamodb/aws4_request
33bbba9cdeb906cc5b3ddc600b02d47f0a73e019d5f3efa0627ea82e05e86eee
[DEBUG] 2023-01-11T16:29:09.587Z 55a5e428-c8d6-4914-908c-20ccec1153dd StringToSign: AWS4-HMAC-SHA256 20230111T162909Z 20230111/us-east-1/dynamodb/aws4_request 33bbba9cdeb906cc5b3ddc600b02d47f0a73e019d5f3efa0627ea82e05e86eee
2023-01-11T21:29:09.587+05:00
[DEBUG] 2023-01-11T16:29:09.587Z 55a5e428-c8d6-4914-908c-20ccec1153dd Signature: f36e8c5a9c7d47f1ef41c1ecce566a988f7243f8f95bf7f7c43b951a87e488eb
2023-01-11T21:29:09.644+05:00
[DEBUG] 2023-01-11T16:29:09.643Z 55a5e428-c8d6-4914-908c-20ccec1153dd Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://dynamodb.us-east-1.amazonaws.com/, headers={'X-Amz-Target': b'DynamoDB_20120810.GetItem', 'Content-Type': b'application/x-amz-json-1.0', 'User-Agent': b'Boto3/1.20.32 Python/3.9.13 Linux/4.14.255-296-236.539.amzn2.x86_64 exec-env/AWS_Lambda_python3.9 Botocore/1.23.32 Resource', 'X-Amz-Date': b'20230111T162909Z', 'Authorization': b'AWS4-HMAC-SHA256 Credential=ASIAQL2XMH4HN6PEJXPA/20230111/us-east-1/dynamodb/aws4_request, SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target, Signature=f36e8c5a9c7d47f1ef41c1ecce566a988f7243f8f95bf7f7c43b951a87e488eb', 'Content-Length': '70'}>
[DEBUG] 2023-01-11T16:29:09.644Z 55a5e428-c8d6-4914-908c-20ccec1153dd Certificate path: /var/runtime/botocore/cacert.pem
2023-01-11T21:29:09.644+05:00
[DEBUG] 2023-01-11T16:29:09.644Z 55a5e428-c8d6-4914-908c-20ccec1153dd Starting new HTTPS connection (1): dynamodb.us-east-1.amazonaws.com:443
2023-01-11T21:29:17.816+05:00
2023-01-11T16:29:17.815Z 55a5e428-c8d6-4914-908c-20ccec1153dd Task timed out after 9.01 seconds
2023-01-11T21:29:17.816+05:00
END RequestId: 55a5e428-c8d6-4914-908c-20ccec1153dd
It looks like the issue is that the table.get_item(Key={'voter':{'S': 'count'}}) call is failing, but your exception-handling code is not surfacing the details. One possibility is that the call raises a botocore.exceptions.ClientError, and the way the except block rebinds e to sys.exc_info()[0] throws away the error information. You can modify your except block to catch botocore.exceptions.ClientError explicitly and log details such as the request ID and the error message in that block.
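For example, a minimal sketch of that pattern (the table name Votes and the key name voter come from the question; everything else is illustrative):
import logging
import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb', region_name='us-east-1').Table('Votes')

def get_count():
    try:
        # the high-level Table resource usually takes plain Python values for the key
        return table.get_item(Key={'voter': 'count'})
    except ClientError as err:
        # ClientError carries the service error code, message and request ID
        logging.error('get_item failed: %s - %s (request id: %s)',
                      err.response['Error']['Code'],
                      err.response['Error']['Message'],
                      err.response['ResponseMetadata'].get('RequestId'))
        raise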
Another way to detect the error is to enable enhanced monitoring for your Lambda function; this gives you more detailed metrics on the function's performance and errors.
One thing you could try is to use sys.exc_info() to get the current exception inside the except block and print out the exception's class and message. That way you can see exactly what type of exception is being raised and what the error message is.
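Note that sys.exc_info() returns a (type, value, traceback) tuple, so taking [0] as in the question gives you the exception class rather than the exception instance. A small, self-contained sketch of what that could look like (the raised ValueError just stands in for the failing call):
import sys
import traceback

try:
    raise ValueError('something went wrong')  # stand-in for the failing get_item call
except Exception:
    exc_type, exc_value, _ = sys.exc_info()
    print(exc_type.__name__)       # ValueError
    print(str(exc_value))          # something went wrong
    print(traceback.format_exc())  # the full stack trace as a string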
You could also try calling get_item separately in a local environment to see whether it raises an exception; that will help you figure out whether the issue is in your code, the IAM policy, or the resource configuration.
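For instance, a quick local check might look like this (assuming the named profile on your machine is allowed to read the table; the profile name is just an example):
import boto3

session = boto3.Session(profile_name='default', region_name='us-east-1')
table = session.resource('dynamodb').Table('Votes')

item = table.get_item(Key={'voter': 'count'})
print(item.get('Item'))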
I'm going mad over a Fluent Bit DaemonSet, installed via Helm in EKS in AWS account yyyyyyy, that is unable to send data to Kinesis Firehose in AWS account xxxxxxxxxx.
The error claims that EKS has no OIDC provider in IAM, but that's not true! Can you help?
Fluent Bit logs:
[2022/06/29 15:22:34] [debug] [output:kinesis_firehose:kinesis_firehose.0] firehose:PutRecordBatch: events=157, payload=71245 bytes
[2022/06/29 15:22:34] [debug] [output:kinesis_firehose:kinesis_firehose.0] Sending log records to delivery stream kinesis_backend
[2022/06/29 15:22:34] [debug] [http_client] not using http_proxy for header
[2022/06/29 15:22:34] [debug] [aws_credentials] Requesting credentials from the EC2 provider..
[2022/06/29 15:22:34] [debug] [input:tail:tail.0] inode=19100461 events: IN_MODIFY
[2022/06/29 15:22:34] [debug] [input chunk] update output instances with new chunk size diff=693
[2022/06/29 15:22:34] [debug] [input:tail:tail.0] inode=19100461 events: IN_MODIFY
[2022/06/29 15:22:34] [debug] [http_client] server firehose.eu-west-1.amazonaws.com:443 will close connection #74
[2022/06/29 15:22:34] [debug] [aws_client] firehose.eu-west-1.amazonaws.com: http_do=0, HTTP Status: 400
[2022/06/29 15:22:34] [error] [aws_client] auth error, refreshing creds
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the env provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the profile provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Reading shared config file.
[2022/06/29 15:22:34] [debug] [aws_credentials] Shared config file /root/.aws/config does not exist
[2022/06/29 15:22:34] [debug] [aws_credentials] Reading shared credentials file.
[2022/06/29 15:22:34] [error] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
[2022/06/29 15:22:34] [debug] [aws_credentials] Refresh called on the EKS provider
[2022/06/29 15:22:34] [debug] [aws_credentials] Calling STS..
[2022/06/29 15:22:34] [debug] [http_client] not using http_proxy for header
[2022/06/29 15:22:34] [debug] [http_client] server sts.eu-west-1.amazonaws.com:443 will close connection #74
[2022/06/29 15:22:34] [debug] [aws_client] sts.eu-west-1.amazonaws.com: http_do=0, HTTP Status: 400
[2022/06/29 15:22:34] [debug] [aws_client] Unable to parse API response- response is not valid JSON.
[2022/06/29 15:22:34] [debug] [aws_credentials] STS raw response:
<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
<Error>
<Type>Sender</Type>
<Code>InvalidIdentityToken</Code>
<Message>No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA</Message>
</Error>
<RequestId>c517249d-c018-43c3-a712-d0e5080ded86</RequestId>
</ErrorResponse>
The fluent-bit service account in the newrelic namespace (created by the fluent-bit Helm chart):
kubectl -n newrelic describe sa fluent-bit
Name: fluent-bit
Namespace: newrelic
Labels: app.kubernetes.io/instance=fluent-bit
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=fluent-bit
app.kubernetes.io/version=1.9.4
helm.sh/chart=fluent-bit-0.20.2
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxx:role/kinesis-write
meta.helm.sh/release-name: fluent-bit
meta.helm.sh/release-namespace: newrelic
Policy permissions attached to role arn:aws:iam::xxxxxxxxxx:role/kinesis-write
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"firehose:PutRecord",
"firehose:PutRecordBatch"
],
"Resource": "arn:aws:firehose:region:xxxxxxxxxx:deliverystream/kinesis-backend"
}
]
}
Trust relationships on role arn:aws:iam::xxxxxxxxxx:role/kinesis-write (I included the OIDC provider for my EKS cluster in account yyyyyyyyyy):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::yyyyyyyyy:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA"
},
"Action": [
"sts:AssumeRole",
"sts:AssumeRoleWithWebIdentity"
],
"Condition": {
"StringEquals": {
"oidc.eks.eu-west-1.amazonaws.com/id/AAAAAAAAAAAAAAAAAA:sub": "system:serviceaccount:newrelic:fluent-bit"
}
}
}
]
}
I'm attempting to create a new S3 bucket and getting a conflict, even though I know the bucket name is new and unique, and it has been many hours (8+) since that name was last in use. Details attached. I've even tried with a new name that I know was never a bucket in my account (and likely never a bucket at all).
The name in the logs below is made up and not the one I was using, which was unique and namespaced to my domain.
If I use the aws s3 cli to make the bucket (i.e. aws s3 mb s3://{same-bucket-name} --region us-east-2) where {same-bucket-name} is the name of the bucket I want to create, it works fine.
2019-07-07T00:12:19.463-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] Trying to create new S3 bucket: "my-unique-s3-bucket-name"
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Request s3/CreateBucket Details:
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: PUT /my-unique-s3-bucket-name HTTP/1.1
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Host: s3.us-east-2.amazonaws.com
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: User-Agent: aws-sdk-go/1.20.12 (go1.12.5; darwin; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.2
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Content-Length: 153
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=MYCREDS/20190707/us-east-2/s3/aws4_request, SignedHeaders=content-length;host;x-amz-acl;x-amz-content-sha256;x-amz-date, Signature=b5acd2dbcaf09eda51b4ea8448f1991d26c8eb8249a85e7ac28044864df377b9
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Acl: public-read
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Content-Sha256: 70cae86320841ea73b0bdc759f99920c7caa405e61af2742575750c6586272c9
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Date: 20190707T041219Z
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Accept-Encoding: gzip
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: <CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><LocationConstraint>us-east-2</LocationConstraint></CreateBucketConfiguration>
2019-07-07T00:12:19.464-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: -----------------------------------------------------
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Response s3/CreateBucket Details:
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: ---[ RESPONSE ]--------------------------------------
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: HTTP/1.1 409 Conflict
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Connection: close
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Transfer-Encoding: chunked
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Content-Type: application/xml
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Date: Sun, 07 Jul 2019 04:12:19 GMT
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: Server: AmazonS3
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Id-2: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: X-Amz-Request-Id: 835B636D828335A1
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.697-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4:
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: -----------------------------------------------------
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] <?xml version="1.0" encoding="UTF-8"?>
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: <Error><Code>OperationAborted</Code><Message>A conflicting conditional operation is currently in progress against this resource. Please try again.</Message><RequestId>835B636D828335A1</RequestId><HostId>v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=</HostId></Error>
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [DEBUG] [aws-sdk-go] DEBUG: Validate Response s3/CreateBucket failed, attempt 0/25, error OperationAborted: A conflicting conditional operation is currently in progress against this resource. Please try again.
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: status code: 409, request id: 835B636D828335A1, host id: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [WARN] Got an error while trying to create S3 bucket my-unique-s3-bucket-name: OperationAborted: A conflicting conditional operation is currently in progress against this resource. Please try again.
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: status code: 409, request id: 835B636D828335A1, host id: v5M1x31BcVCS4DLIgqmCR4KRHipO3ZRbTSXF1PCS9+q9nyT8O5/3s04Z22o8t4x8JZ0HF9HWkO4=
2019-07-07T00:12:19.698-0400 [DEBUG] plugin.terraform-provider-aws_v2.18.0_x4: 2019/07/07 00:12:19 [TRACE] Waiting 10s before next try
If the bucket did previously exist then there is an indeterminate amount of time before that bucket name is released.
Unfortunately the AWS docs aren't very specific here:
Important
If you want to continue to use the same bucket name, don't delete the
bucket. We recommend that you empty the bucket and keep it. After a
bucket is deleted, the name becomes available to reuse, but the name
might not be available for you to reuse for various reasons. For
example, it might take some time before the name can be reused, and
some other account could create a bucket with that name before you do.
You can talk to AWS support to confirm what's happening (and check that another AWS account doesn't have the bucket) but ultimately you just need to wait. If the S3 bucket matches a domain name that you control and you intend to use it for website hosting and someone else already has that S3 bucket then there is a process for getting that bucket name back to you, just as there is with CloudFront CNAMEs which are also globally unique.
You should also be able to check if the bucket name is available by running the following command:
aws s3api head-bucket --bucket [bucket name]
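If you prefer to check from code, a rough boto3 equivalent of that check could look like this (the bucket name is just a placeholder):
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def bucket_name_taken(name):
    # Returns True if the bucket exists (owned by you or by someone else)
    try:
        s3.head_bucket(Bucket=name)
        return True                  # 200: the bucket exists and you own it
    except ClientError as err:
        code = err.response['Error']['Code']
        if code == '404':
            return False             # no such bucket: the name is free
        if code == '403':
            return True              # the bucket exists but belongs to another account
        raise

print(bucket_name_taken('my-unique-s3-bucket-name'))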
Ages back, when we briefly tried deleting S3 buckets in test environments overnight (along with everything else), we would occasionally see this error for over 48 hours, while sometimes the bucket name was available again within a few hours. Unfortunately, AWS provides no guarantees here.
I'm trying to set up the controller service account for Dataflow. In my Dataflow options I have:
options.setGcpCredential(GoogleCredentials.fromStream(
new FileInputStream("key.json")).createScoped(someArrays));
options.setServiceAccount("xxx#yyy.iam.gserviceaccount.com");
But I'm getting:
WARNING: Request failed with code 403, performed 0 retries due to IOExceptions,
performed 0 retries due to unsuccessful status codes, HTTP framework says
request can be retried, (caller responsible for retrying):
https://dataflow.googleapis.com/v1b3/projects/MYPROJECT/locations/MYLOCATION/jobs
Exception in thread "main" java.lang.RuntimeException: Failed to create a workflow
job: (CODE): Current user cannot act as
service account "xxx#yyy.iam.gserviceaccount.com.
Causes: (CODE): Current user cannot act as
service account "xxx#yyy.iam.gserviceaccount.com.
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:791)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:173)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
...
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
{
"code" : 403,
"errors" : [ {
"domain" : "global",
"message" : "(CODE): Current user cannot act as service account
xxx#yyy.iam.gserviceaccount.com. Causes: (CODE): Current user
cannot act as service account xxx#yyy.iam.gserviceaccount.com.",
"reason" : "forbidden"
} ],
"message" : "(CODE): Current user cannot act as service account
xxx#yyy.iam.gserviceaccount.com. Causes: (CODE): Current user
cannot act as service account xxx#yyy.iam.gserviceaccount.com.",
"status" : "PERMISSION_DENIED"
}
Am I missing some Roles or permissions?
Maybe someone will find this helpful:
For the controller service account it was: Dataflow Worker and Storage Object Admin (found in Google's documentation).
For the account executing the job it was: Service Account User.
I've been hitting this error and thought it worth sharing my experiences (partly because I suspect I'll encounter this again in the future).
The terraform code to create my dataflow job is:
resource "google_dataflow_job" "wordcount" {
# https://stackoverflow.com/a/59931467/201657
name = "wordcount"
template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
temp_gcs_location = "gs://${local.name-prefix}-functions/temp"
parameters = {
inputFile = "gs://dataflow-samples/shakespeare/kinglear.txt"
output = "gs://${local.name-prefix}-functions/wordcount/output"
}
service_account_email = "serviceAccount:${data.google_service_account.sa.email}"
}
The error message:
Error: googleapi: Error 400: (c3c0d991927a8658): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com., badRequest
was returned from running terraform apply. Checking out the logs provided a lot more info:
gcloud logging read 'timestamp >= "2020-12-31T13:39:58.733249492Z" AND timestamp <= "2020-12-31T13:45:58.733249492Z"' --format="csv(timestamp,severity,textPayload)" --order=asc
which returned various log records, including this:
Permissions verification for controller service account failed. IAM role roles/dataflow.worker should be granted to controller service account dataflowdemo#redacted.iam.gserviceaccount.com.
so I granted the missing role:
gcloud projects add-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.worker"
and ran terraform apply again. This time I got the same error in the terraform output but there were no errors to be seen in the logs.
I then followed the advice given at https://cloud.google.com/dataflow/docs/concepts/access-control#creating_jobs to also grant the roles/dataflow.admin:
gcloud projects add-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.admin"
but there was no discernible difference from the previous attempt.
I then tried turning on terraform debug logging which provided this info:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ---[ REQUEST ]---------------------------------------
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: POST /v1b3/projects/redacted/locations/europe-west1/templates?alt=json&prettyPrint=false HTTP/1.1
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Host: dataflow.googleapis.com
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: User-Agent: google-api-go-client/0.5 Terraform/0.14.2 (+https://www.terraform.io) Terraform-Plugin-SDK/2.1.0 terraform-provider-google/dev
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Length: 385
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Type: application/json
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Goog-Api-Client: gl-go/1.14.5 gdcl/20201023
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Accept-Encoding: gzip
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "environment": {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "serviceAccountEmail": "serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "tempLocation": "gs://jamiet-demo-functions/temp"
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: },
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "gcsPath": "gs://dataflow-templates/latest/Word_Count",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "jobName": "wordcount",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "parameters": {
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "inputFile": "gs://dataflow-samples/shakespeare/kinglear.txt",
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "output": "gs://jamiet-demo-functions/wordcount/output"
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:13.129Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: -----------------------------------------------------
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 2020/12/31 16:04:14 [DEBUG] Google API Response Details:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ---[ RESPONSE ]--------------------------------------
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: HTTP/1.1 400 Bad Request
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Connection: close
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Transfer-Encoding: chunked
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Alt-Svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Cache-Control: private
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Content-Type: application/json; charset=UTF-8
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Date: Thu, 31 Dec 2020 16:04:15 GMT
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Server: ESF
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: Origin
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: X-Origin
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: Vary: Referer
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Content-Type-Options: nosniff
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Frame-Options: SAMEORIGIN
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: X-Xss-Protection: 0
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 1f9
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "error": {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "code": 400,
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "message": "(dbacb1c39beb28c9): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com.",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "errors": [
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: {
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "message": "(dbacb1c39beb28c9): Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com.",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "domain": "global",
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "reason": "badRequest"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: ],
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: "status": "INVALID_ARGUMENT"
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: }
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: 0
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5:
2020-12-31T16:04:14.647Z [DEBUG] plugin.terraform-provider-google_v3.51.0_x5: -----------------------------------------------------
The error being returned from dataflow.googleapis.com is clearly evident:
Current user cannot act as service account serviceAccount:dataflowdemo#redacted.iam.gserviceaccount.com
At this stage I am puzzled as to why I can see an error being returned from the Google's dataflow API but there is nothing in the GCP logs indicating that an error occurred.
Then, though, I had a bit of a lightbulb moment. Why does that error message mention "service account serviceAccount"? Then it hit me: I'd defined the service account incorrectly. The Terraform code should have been:
resource "google_dataflow_job" "wordcount" {
# https://stackoverflow.com/a/59931467/201657
name = "wordcount"
template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
temp_gcs_location = "gs://${local.name-prefix}-functions/temp"
parameters = {
inputFile = "gs://dataflow-samples/shakespeare/kinglear.txt"
output = "gs://${local.name-prefix}-functions/wordcount/output"
}
service_account_email = data.google_service_account.sa.email
}
I corrected it and it worked straight away. User error!!!
I then set about removing the various permissions that I'd added:
gcloud projects remove-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.admin"
gcloud projects remove-iam-policy-binding $PROJECT \
--member="serviceAccount:dataflowdemo#${PROJECT}.iam.gserviceaccount.com" \
--role="roles/dataflow.worker"
and terraform apply still worked. However, after removing the grant of role roles/dataflow.worker the job failed with error:
Workflow failed. Causes: Permissions verification for controller service account failed. IAM role roles/dataflow.worker should be granted to controller service account dataflowdemo#redacted.iam.gserviceaccount.com.
so clearly the documentation regarding the appropriate roles to grant (https://cloud.google.com/dataflow/docs/concepts/access-control#creating_jobs) is spot on.
As may be apparent, I started writing this post before I knew what the problem was, and I thought it might be useful to document my investigation somewhere. Now that I've finished the investigation and the problem turns out to be one of PEBCAK, it's probably not so relevant to this thread anymore, and certainly shouldn't be accepted as an answer. Nevertheless, there is probably some useful information in here about how to go about investigating issues with Terraform calling Google APIs, and it also reiterates the required role grants, so I'll leave it here in case it ever turns out to be useful.
I just hit this problem again, so I'm posting my solution here as I fully expect I'll get bitten by this again at some point.
I was getting error:
Error: googleapi: Error 403: (a00eba23d59c1fa3): Current user cannot act as service account dataflow-controller-sa#myproject.iam.gserviceaccount.com. Causes: (a00eba23d59c15ac): Current user cannot act as service account dataflow-controller-sa#myproject.iam.gserviceaccount.com., forbidden
I was deploying the dataflow job, via terraform, using a different service account, deployer#myproject.iam.gserviceaccount.com
The solution was to grant that service account the roles/iam.serviceAccountUser role:
gcloud projects add-iam-policy-binding myproject \
--member=serviceAccount:deployer#myproject.iam.gserviceaccount.com \
--role=roles/iam.serviceAccountUser
For those that prefer custom IAM roles over predefined IAM roles the specific permission that was missing was iam.serviceAccounts.actAs.
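If you want to verify that permission programmatically, one option is the IAM API's testIamPermissions method on the service account. A minimal sketch (it assumes the google-api-python-client package and Application Default Credentials for the deploying identity; the project and service account email are the ones from the error above):
# pip install google-api-python-client google-auth
import googleapiclient.discovery

iam = googleapiclient.discovery.build('iam', 'v1')

sa_email = 'dataflow-controller-sa@myproject.iam.gserviceaccount.com'
resp = iam.projects().serviceAccounts().testIamPermissions(
    resource=f'projects/myproject/serviceAccounts/{sa_email}',
    body={'permissions': ['iam.serviceAccounts.actAs']},
).execute()

# the response lists only the permissions the caller actually holds
print(resp.get('permissions', []))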
The issue got resolved!
Go to GCP Console -> IAM -> the service account email -> Add Permission -> Service Account User.
While working with the AWS C++ SDK I ran into an issue where trying to execute a PutObjectRequest complains that it is "unable to connect to endpoint" when uploading more than ~400 KB.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
Aws::S3::S3Client s3Client(clientConfig);
Aws::S3::Model::PutObjectRequest putObjectRequest;
putObjectRequest.SetBucket("mybucket");
putObjectRequest.SetKey("mykey");
typedef boost::iostreams::basic_array_source<char> Device;
boost::iostreams::stream_buffer<Device> stmbuf(compressedData, dataSize);
std::iostream *stm = new std::iostream(&stmbuf);
putObjectRequest.SetBody(std::shared_ptr<Aws::IOStream>(stm));
putObjectRequest.SetContentLength(dataSize);
Aws::S3::Model::PutObjectOutcome outcome = s3Client.PutObject(putObjectRequest);
As long as my data is less than ~400 KB it gets uploaded into a file on S3, but beyond that it is unable to connect to the endpoint. I should be able to upload up to 5 GB in one PutObjectRequest.
Any thoughts?
Edit:
Responding to #JonathanHenson's comment, the AWS log shows this timeout error repeatedly:
[DEBUG] 2016-08-04 13:42:03 AWSClient [0x700000081000] Request Successfully signed
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Making request to https://s3.amazonaws.com/mybucket/myfile
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Including headers:
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] content-length: 3151261
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] content-type: binary/octet-stream
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] host: s3.amazonaws.com
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] user-agent: aws-sdk-cpp/0.13.9 Darwin/15.6.0 x86_64
[DEBUG] 2016-08-04 13:42:03 CurlHandleContainer [0x700000081000] Attempting to acquire curl connection.
[DEBUG] 2016-08-04 13:42:03 CurlHandleContainer [0x700000081000] Returning connection handle 0x10b09cc00
[DEBUG] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] Obtained connection handle 0x10b09cc00
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000] HTTP/1.1 100 Continue
[TRACE] 2016-08-04 13:42:03 CurlHttpClient [0x700000081000]
[ERROR] 2016-08-04 13:42:06 CurlHttpClient [0x700000081000] Curl returned error code 28
[DEBUG] 2016-08-04 13:42:06 CurlHandleContainer [0x700000081000] Releasing curl handle 0x10b09cc00
[DEBUG] 2016-08-04 13:42:06 CurlHandleContainer [0x700000081000] Notifying waiting threads.
[DEBUG] 2016-08-04 13:42:06 AWSClient [0x700000081000] Request returned error. Attempting to generate appropriate error codes from response
[WARN] 2016-08-04 13:42:06 AWSClient [0x700000081000] Request failed, now waiting 12800 ms before attempting again.
[DEBUG] 2016-08-04 13:42:19 InstanceProfileCredentialsProvider [0x700000081000] Checking if latest credential pull has expired.
Ultimately, what fixed this for me was setting the request timeout. The request timeout needs to be long enough for your entire transfer to finish. If you are transferring large files on a slow internet connection, make sure the request timeout is long enough to allow those files to transfer.
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
clientConfig.connectTimeoutMs = 30000;
clientConfig.requestTimeoutMs = 600000;
Tweak your client configuration as below and see if it works:
Aws::Client::ClientConfiguration clientConfig;
clientConfig.scheme = Aws::Http::Scheme::HTTPS;
clientConfig.region = Aws::Region::US_EAST_1;
clientConfig.connectTimeoutMs = 30000;
Aws::S3::S3Client s3Client(clientConfig);
Aws::S3::Model::PutObjectRequest putObjectRequest;
putObjectRequest.SetBucket("mybucket");
putObjectRequest.SetKey("mykey");
typedef boost::iostreams::basic_array_source<char> Device;
boost::iostreams::stream_buffer<Device> stmbuf(compressedData, dataSize);
std::iostream *stm = new std::iostream(&stmbuf);
putObjectRequest.SetBody(std::shared_ptr<Aws::IOStream>(stm));
putObjectRequest.SetContentLength(dataSize);
Aws::S3::Model::PutObjectOutcome outcome = s3Client.PutObject(putObjectRequest);