Amazon Kinesis: Caught exception while sync'ing Kinesis shards and leases

I am trying to make Snowplow work on AWS. When I run the stream-enrich service on an instance, I get this exception:
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Syncing Kinesis shard info
[main] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardSyncTask - Caught exception while sync'ing Kinesis shards and leases
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 4 datums to CloudWatch
I don't think the error is due to CloudWatch:
Caught exception while sync'ing Kinesis shards and leases

As mentioned in the comments above, this error will crop up when you're lacking permissions to the AWS resources required by the Kinesis Client Library (KCL). These can be DynamoDB, CloudWatch, or Kinesis itself. For the Stream Enrich component of Snowplow, you'll need the following permissions:
Read permission on the input Kinesis stream (collector good)
Write permission on the output Kinesis streams (enrich good & enrich bad)
List permission on Kinesis streams
Read/write/create permission on the DynamoDB state table (the table name is the "appName" value in your Stream Enrich application.conf)
PutMetricData permission on CloudWatch
A templated version of an IAM policy that meets these needs is as follows:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:DescribeStream",
        "kinesis:GetShardIterator",
        "kinesis:GetRecords",
        "kinesis:ListShards"
      ],
      "Resource": [
        "${collector_stream_out_good}"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:ListStreams"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:DescribeStream",
        "kinesis:PutRecord",
        "kinesis:PutRecords"
      ],
      "Resource": [
        "${enricher_stream_out_good}",
        "${enricher_stream_out_bad}"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:CreateTable",
        "dynamodb:DescribeTable",
        "dynamodb:Scan",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": [
        "${enricher_state_table}"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricData"
      ],
      "Resource": "*"
    }
  ]
}
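If you want to sanity-check these permissions before restarting Stream Enrich, you can exercise them with the AWS CLI from the instance itself. A minimal sketch, assuming the CLI picks up the instance role credentials; the stream and table names are placeholders for your own values, not Snowplow defaults:

# Each call should succeed if the policy above is attached correctly.
aws kinesis describe-stream --stream-name <collector_stream_out_good>
aws dynamodb describe-table --table-name <appName>
# CloudWatch PutMetricData is not resource-scoped, so any test namespace works.
aws cloudwatch put-metric-data --namespace "PermissionTest" --metric-name Ping --value 1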
I've written up a blog post covering the required IAM permissions for Stream Enrich and other Snowplow components, since the official Snowplow documentation on the exact required permissions was sparse to non-existent.
Hope that helps!

So I had this problem when setting up Snowplow. I'm using terraform to automate the infrastructure and got this error after a destroy and re-apply. Here's what I learned.
You give the enricher DynamoDB privileges so it can create a table. If that table already exists before the enricher tries to create it (in my case because terraform had not destroyed it), the enricher is not able to create a table with the same name, and it seemingly won't link to an existing table either.
My solution was to delete the existing DynamoDB table via the AWS console, terminate my enricher, and start up a new one. The error no longer appeared and my enricher worked as intended.
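If you prefer the CLI to the console for that cleanup, a sketch (hedged: the table name is whatever appName your enricher config uses, and deleting the table also discards the KCL checkpoint state):

# Remove the stale KCL lease table so the new enricher can recreate it.
aws dynamodb delete-table --table-name <appName>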

I faced this issue today. For me, the problem was that I changed the Kinesis stream names without changing the appName in the enrich configuration.
Once I changed the appName to a new name and deployed an update to Snowplow enrich, I was able to get rid of the error.
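For illustration, the relevant fragment of application.conf might look roughly like this. This is a sketch only: the exact key names vary between Stream Enrich versions, so treat the structure as an assumption and check your own config.

enrich {
  streams {
    in { raw = "collector-good-v2" }      # renamed input stream
    out {
      enriched = "enrich-good-v2"         # renamed output streams
      bad = "enrich-bad-v2"
    }
    # When the streams are renamed, appName must change too; otherwise the
    # KCL keeps using the old DynamoDB lease table and its stale checkpoints.
    appName = "snowplow-stream-enrich-v2"
  }
}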

Related

AWS Athena Federated query gives permission error while running in AWS Batch

I have set up a MySQL datasource in Athena (it required creating a Lambda for RDB access) and can run federated queries successfully in the Athena console - I can do joins between RDB tables and Athena/Glue tables (when an RDB table is referenced, it must be specified as <datasource_name>.<db_name>.<table_name>) and get the results.
Now I am trying to run the same federated query in my AWS Batch application, and getting the following error:
The Amazon Athena query failed to run with error message: Amazon Athena experienced a permission error. Please provide proper permission and submitting the query again. If the issue reoccurs, contact AWS support for further assistance. You will not be charged for this query. We apologize for the inconvenience.
I can successfully run ordinary (non-federated) Athena queries that only use Athena/Glue tables in AWS Batch.
My AWS Batch job definition uses ecsTaskExecutionRole as "execution role" and "job role ARN".
I have added the following policies into both ecsTaskExecutionRole and ecsInstanceRole. Is there any policy that I am missing?
policy for all Athena actions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "athena:*"
      ],
      "Resource": [
        "arn:aws:athena:<my_region>:<my_acc_id>:*"
      ]
    }
  ]
}
policy for all Glue actions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:*"
      ],
      "Resource": [
        "arn:aws:glue:<my_region>:<my_acc_id>:*"
      ]
    }
  ]
}
policy for all actions of Lambda that was created for accessing MySQL datasource:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lambda:*"
      ],
      "Resource": [
        "arn:aws:lambda:<my_region>:<my_acc_id>:function:<my_lambda_name>:*"
      ]
    }
  ]
}
policy for S3 buckets - the one with table data and the one for storing Athena output:
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::<table_bucket>",
        "arn:aws:s3:::<table_bucket>/*",
        "arn:aws:s3:::<athena_output_bucket>",
        "arn:aws:s3:::<athena_output_bucket>/*"
      ]
    }
  ]
}
UPD: just for convenience, I added the following policy according to this doc: https://docs.aws.amazon.com/athena/latest/ug/federated-query-iam-access.html#fed-using-iam
{
  "Effect": "Allow",
  "Action": "athena:ListWorkGroups",
  "Resource": "*"
}
and also added the resource "arn:aws:s3:::<athena_output_bucket>/athena-spill*" (the spill bucket is the same as the Athena output bucket) to the S3 policy. Still no success.
Figured out the reason - the Lambda resource should be specified without a wildcard at the end:
arn:aws:lambda:<my_region>:<my_acc_id>:function:<my_lambda_name>
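In other words, the working version of the Lambda policy from the question looks like this (same placeholders as above, just without the trailing :*):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lambda:*"
      ],
      "Resource": [
        "arn:aws:lambda:<my_region>:<my_acc_id>:function:<my_lambda_name>"
      ]
    }
  ]
}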

AWS Tutorial Code - Lambda Access Denied, incorrect server location showing in error message

I am very new to AWS and have only just started learning it. I am following AWS's full-stack tutorial; however, when I test module 4, my Lambda function is not authorized to perform dynamodb:PutItem. In the error message I can see the ARN has us-east-1 in it, yet the ARN I passed into the JSON for the IAM policy is eu-west-2. I have set everything up in eu-west-2.
Here is the JSON used in the IAM policy. I have replaced my account ID with xxxxx, but it is the same as what's listed in the table details on the DynamoDB dashboard.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "dynamodb:PutItem",
        "dynamodb:DeleteItem",
        "dynamodb:GetItem",
        "dynamodb:Scan",
        "dynamodb:Query",
        "dynamodb:UpdateItem"
      ],
      "Resource": "arn:aws:dynamodb:eu-west-2:xxxxxxxxx:table/HelloWorldDatabase/*"
    }
  ]
}
Is there anything I should be checking elsewhere that could be wrong?
EDIT:
Having changed some of the JSON based on the comments, it now looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListAndDescribe",
      "Effect": "Allow",
      "Action": [
        "dynamodb:List*",
        "dynamodb:DescribeReservedCapacity*",
        "dynamodb:DescribeLimits",
        "dynamodb:DescribeTimeToLive"
      ],
      "Resource": "*"
    },
    {
      "Sid": "SpecificTable",
      "Effect": "Allow",
      "Action": [
        "dynamodb:BatchGet*",
        "dynamodb:DescribeStream",
        "dynamodb:DescribeTable",
        "dynamodb:Get*",
        "dynamodb:Query",
        "dynamodb:Scan",
        "dynamodb:BatchWrite*",
        "dynamodb:CreateTable",
        "dynamodb:Delete*",
        "dynamodb:Update*",
        "dynamodb:PutItem"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/HelloWorldDatabase"
    }
  ]
}
This is the full stack trace I am now getting:
Requested resource not found (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: S62KLPBAGKNLA66SSI77RC1AC7VV4KQNSO5AEMVJF66Q9ASUAAJG)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1799)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5110)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5077)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executePutItem(AmazonDynamoDBClient.java:2721)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.putItem(AmazonDynamoDBClient.java:2687)
at com.amazonaws.services.dynamodbv2.document.internal.PutItemImpl.doPutItem(PutItemImpl.java:85)
at com.amazonaws.services.dynamodbv2.document.internal.PutItemImpl.putItem(PutItemImpl.java:63)
at com.amazonaws.services.dynamodbv2.document.Table.putItem(Table.java:168)
at com.example.app.SavePersonHandler.persistData(SavePersonHandler.java:38)
at com.example.app.SavePersonHandler.handleRequest(SavePersonHandler.java:27)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.base/java.lang.reflect.Method.invoke(Unknown Source)
These are the table details from DynamoDB:
Region EU (London)
Amazon Resource Name (ARN) arn:aws:dynamodb:eu-west-2:xxxxxxxxx:table/HelloWorldDatabase
The problem is the region name. See the 3rd module step named Create a WebApp With Amplify Console; quoting from that step:
In a new browser window, log into the Amplify Console. NOTE: We will be using the Oregon (us-west-2) region for this tutorial.
Please use the "Amazon DynamoDB: Allows access to a specific table" policy template.
The policy below shows how you might allow full access to the HelloWorldDatabase DynamoDB table. It grants the permissions necessary to complete this action from the AWS API or AWS CLI only.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListAndDescribe",
      "Effect": "Allow",
      "Action": [
        "dynamodb:List*",
        "dynamodb:DescribeReservedCapacity*",
        "dynamodb:DescribeLimits",
        "dynamodb:DescribeTimeToLive"
      ],
      "Resource": "*"
    },
    {
      "Sid": "SpecificTable",
      "Effect": "Allow",
      "Action": [
        "dynamodb:BatchGet*",
        "dynamodb:DescribeStream",
        "dynamodb:DescribeTable",
        "dynamodb:Get*",
        "dynamodb:Query",
        "dynamodb:Scan",
        "dynamodb:BatchWrite*",
        "dynamodb:CreateTable",
        "dynamodb:Delete*",
        "dynamodb:Update*",
        "dynamodb:PutItem"
      ],
      "Resource": "arn:aws:dynamodb:eu-west-2:xxxxxx:table/HelloWorldDatabase"
    }
  ]
}
If you want to learn how to build Lambda functions that interact with AWS services, such as Amazon DynamoDB, you can use the Lambda Java runtime API. This gives you full control over exactly what you want the Lambda function to perform.
To interact with AWS services, you have to use an IAM role (as discussed in this tutorial). For example, to use DynamoDB, the IAM role has to have a policy that allows it to use Amazon DynamoDB.
All of these concepts are covered in this API development tutorial. In addition, this tutorial shows you how to schedule the Lambda function using scheduled events:
Creating scheduled events to invoke Lambda functions
I met the same issue. To solve it, use only the us-east-1 region throughout the tutorial; the jar file seems to hard-code the region.
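A quick way to confirm this kind of region mismatch is to ask each region where the table actually lives; a sketch, assuming your CLI credentials are configured:

# The table only exists in the region where it was created:
aws dynamodb describe-table --table-name HelloWorldDatabase --region eu-west-2   # returns the table
aws dynamodb describe-table --table-name HelloWorldDatabase --region us-east-1   # ResourceNotFoundException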

AWS Sagemaker + AWS Lambda

I am trying to use AWS SageMaker following the documentation. I successfully loaded data, trained, and deployed the model.
My next step is to use AWS Lambda and connect it to this SageMaker endpoint.
I saw that I need to give the Lambda IAM execution role permission to invoke the model endpoint.
I added some data to the IAM policy JSON, and it now looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "logs:CreateLogGroup",
      "Resource": "arn:aws:logs:us-east-1:<my-account>:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:us-east-1:<my-account>:log-group:/aws/lambda/test-sagemaker:*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "*"
    }
  ]
}
The problem is that even with a role that has permission to invoke the SageMaker endpoint, my Lambda function didn't see it:
An error occurred (ValidationError) when calling the InvokeEndpoint operation: Endpoint xgboost-2020-10-02-12-15-36-097 of account <my-account> not found.: ValidationError
I found the error myself. The problem was different regions: for training and deploying the model I used us-east-2, while for Lambda I used us-east-1. Creating everything in the same region fixed this issue!
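To spot this kind of mismatch quickly, you can list SageMaker endpoints per region; a sketch, assuming configured CLI credentials:

# The endpoint only shows up in the region where the model was deployed:
aws sagemaker list-endpoints --region us-east-2   # lists xgboost-2020-10-02-12-15-36-097
aws sagemaker list-endpoints --region us-east-1   # empty - this is where the Lambda was looking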

Unable to get aws:PrincipalOrgID to work with creating subscription filter

I have an AWS account with Organizations enabled. I want to ensure that certain logs from my child accounts go to my Kinesis stream in a logging account. The idea is that if I create a new child account in Organizations in the future, its logs should also go to Kinesis.
For this, I have created a Kinesis log destination in my logging account using the aws logs put-destination command and added a destination policy to it. The policy I used was:
{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "PutSubscriptionFilter",
    "Effect": "Allow",
    "Principal": {
      "AWS": ["*"]
    },
    "Action": "logs:PutSubscriptionFilter",
    "Resource": "arn:aws:logs:us-east-1:123456789012:destination:mytestLogDestination",
    "Condition": {
      "StringEquals": {
        "aws:PrincipalOrgID": "o-abcde12345"
      }
    }
  }
}
The command I used to add the destination policy was:
aws logs put-destination-policy \
--destination-name mytestLogDestination \
--access-policy file://destination_policy.json
This added the destination policy successfully, which I can confirm by running aws logs describe-destinations --destination-name-prefix mytestLogDestination. But when I try to create a new subscription filter in one of my member accounts, the following command errors out:
aws logs put-subscription-filter \
--log-group-name "/aws/lambda/GetOrgIdFunction" \
--filter-name randomsubscriptionfilter --filter-pattern "" \
--destination-arn arn:aws:logs:us-east-1:123456789012:destination:mytestLogDestination
The error message is:
An error occurred (AccessDeniedException) when calling the PutSubscriptionFilter operation: User with accountId: 210987654321 is not authorized to perform: logs:PutSubscriptionFilter on resource: arn:aws:logs:us-east-1:123456789012:destination:mytestLogDestination
When I remove the condition and restrict the Principal to just my account (210987654321), it works fine. Is it possible to get this setup working or does AWS currently not support it?
As of August 02, 2019
After talking to AWS Support, this is a CloudWatch Logs limitation, as they don't yet support aws:PrincipalOrgID. We would have to add each account separately when creating the log destination policy.
Marking this as an answer for now.
Update: January 06, 2021
According to a new AWS release, this is now supported. AWS documentation for reference: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CreateDestination.html
How annoying, I wasted so much time testing different methods to try and get this running. Glad I eventually found your answer!
I don't suppose they gave you any further information about when it might be supported? I'm assuming no conditions work with these policies, as I tried aws:PrincipalArn and was having the same issue.
I was only able to get it to work with the aws:SourceArn condition, which is fairly frustrating:
{
  "Version": "2008-10-17",
  "Id": "__default_policy_ID",
  "Statement": [
    {
      "Sid": "__default_statement_ID",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudwatch.amazonaws.com",
        "AWS": "*"
      },
      "Action": [
        "SNS:GetTopicAttributes",
        "SNS:SetTopicAttributes",
        "SNS:AddPermission",
        "SNS:RemovePermission",
        "SNS:DeleteTopic",
        "SNS:Subscribe",
        "SNS:ListSubscriptionsByTopic",
        "SNS:Publish",
        "SNS:Receive"
      ],
      "Resource": "<topic arn>",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": [
            "arn:aws:cloudwatch:<region>:<account a>:alarm:*",
            "arn:aws:cloudwatch:<region>:<account b>:alarm:*",
            "arn:aws:cloudwatch:<region>:<account c>:alarm:*"
          ]
        }
      }
    }
  ]
}

Filtering list of tables through IAM in DynamoDB Admin Interface

I'm trying to filter the list of table names in the DynamoDB admin UI using IAM.
When I use this policy it shows all tables:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "XXXXXXX",
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable",
        "dynamodb:ListTables"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
When I use this policy it shows nothing (just a "Not Authorized" message):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "XXXXXXX",
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable",
        "dynamodb:ListTables"
      ],
      "Resource": [
        "us-east-1:XXXXXXXXXXX:table/table_to_show"
      ]
    }
  ]
}
Anyone know if this is possible?
There are two problems with your policy.
First, as pointed out in the comment above, the ARN is wrong; it should start with
arn:aws:dynamodb:*
Second, the ListTables operation in DynamoDB cannot be restricted at the resource level.
Quote from the AWS documentation:
You can use resource-level ARNs in IAM policies for all DynamoDB actions, with the exception of ListTables. The ListTables action returns the table names owned by the current account making the request for the current region; it is the only DynamoDB action that does not support resource-level ARN policies.
So if you try any other operation, like DescribeTable, it will pass.
The solution is to add a statement just for the ListTables operation with "Resource": ["*"].
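Putting both fixes together, a corrected version of the question's policy might look like this (a sketch reusing the question's placeholders):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListAllTables",
      "Effect": "Allow",
      "Action": [
        "dynamodb:ListTables"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DescribeSpecificTable",
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:XXXXXXXXXXX:table/table_to_show"
    }
  ]
}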
Looks like you need to change the ARN of the DynamoDB table to this format:
"Resource": "arn:aws:dynamodb:us-west-2:123456789012:table/Books"
Using IAM to Control Access to DynamoDB Resources.
Another point: to list something in the admin console, you need more permissions than just Describe and List, since the listing shows table attributes etc. Try adding all of these (effectively full read on the specific table); a statement sketch follows the list:
"dynamodb:ListTables",
"dynamodb:DescribeTable",
"dynamodb:GetItem",
"dynamodb:BatchGetItem",
"dynamodb:Query",
"dynamodb:Scan"