AWS-CDK - DynamoDB Initial Data

I'm using the AWS CDK for a serverless project, but I've hit a sticking point. My project deploys a DynamoDB table which I need to populate with data before my Lambda function executes.
The data that needs to be loaded is generated by making API calls and isn't static data that can be loaded from a .json file or something similarly simple.
Any ideas on how to approach this requirement for a production workload?

You can use AwsCustomResource to make a PutItem call against the table.
AwsSdkCall initializeData = AwsSdkCall.builder()
        .service("DynamoDB")
        .action("putItem")
        .physicalResourceId(PhysicalResourceId.of(tableName + "_initialization"))
        .parameters(Map.ofEntries(
                Map.entry("TableName", tableName),
                Map.entry("Item", Map.ofEntries(
                        Map.entry("id", Map.of("S", "0")),
                        Map.entry("data", Map.of("S", data))
                )),
                Map.entry("ConditionExpression", "attribute_not_exists(id)")
        ))
        .build();

AwsCustomResource tableInitializationResource = AwsCustomResource.Builder.create(this, "TableInitializationResource")
        .policy(AwsCustomResourcePolicy.fromStatements(List.of(
                PolicyStatement.Builder.create()
                        .effect(Effect.ALLOW)
                        .actions(List.of("dynamodb:PutItem"))
                        .resources(List.of(table.getTableArn()))
                        .build()
        )))
        .onCreate(initializeData)
        .onUpdate(initializeData)
        .build();

tableInitializationResource.getNode().addDependency(table);
tableInitializationResource.getNode().addDependency(table);
The PutItem operation will be triggered when the stack is created or when the table is updated (the tableName is expected to be different in that case). If that doesn't work for some reason, you can set physicalResourceId to a random value, e.g. a UUID, to trigger the operation on every stack update (the operation is idempotent thanks to the ConditionExpression).

CustomResource allows you to write custom provisioning logic. In this case you could use something like an AWS Lambda function in a custom resource to read in the custom JSON and update DynamoDB.
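For illustration, here is a minimal sketch of that approach using the CDK's Python bindings (the same wiring exists in Java); the construct IDs, the seed_data.handler module and the fetch_items_from_your_apis helper are hypothetical placeholders for your own code:
from aws_cdk import CustomResource, Duration, aws_lambda as _lambda, custom_resources as cr

# Lambda function that calls your APIs and writes the generated items to the table.
seed_fn = _lambda.Function(
    self, "SeedDataFunction",
    runtime=_lambda.Runtime.PYTHON_3_11,
    handler="seed_data.handler",              # hypothetical handler module
    code=_lambda.Code.from_asset("lambda"),
    timeout=Duration.minutes(5),
)
table.grant_write_data(seed_fn)

# Custom resource that invokes the function on stack create/update.
provider = cr.Provider(self, "SeedDataProvider", on_event_handler=seed_fn)
CustomResource(
    self, "SeedDataResource",
    service_token=provider.service_token,
    properties={"TableName": table.table_name},
)
The handler itself could then look roughly like this (again a sketch; fetch_items_from_your_apis is a placeholder for your API calls):
import boto3

def handler(event, context):
    if event["RequestType"] in ("Create", "Update"):
        table = boto3.resource("dynamodb").Table(event["ResourceProperties"]["TableName"])
        for item in fetch_items_from_your_apis():   # placeholder for your API calls
            table.put_item(Item=item)
    return {"PhysicalResourceId": "table-initial-data"}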

Related

Return all items or a single item from DynamoDB from AWS API Gateway endpoint

I am using AWS proxy with AWS API Gateway to interact with a DynamoDB table. I have an API resource, under which I have a GET method with the below configuration:
The API uses the Scan action as seen above to fetch all the items from the DynamoDB table. I also have the following request integration mapping template:
{
    "TableName": tableName
}
It's really simple. But my problem is that I would like to add another GET method to get each item by its id, which will be supplied in the URL as a param. However, since I have already set up one GET method, I am not able to set up another to fetch only a single item. I am aware I can use mapping templates and Scan as given in the docs to conditionally fetch items if a param is given, but that would mean scanning the entire table, which is a waste each time I want to fetch a single item.
Is there any other way to do this?

List all LogGroups using cdk

I am quite new to the CDK, but I'm adding a LogQueryWidget to my CloudWatch Dashboard through the CDK, and I need a way to add all LogGroups ending with a suffix to the query.
Is there a way to either loop through all existing LogGroups and find the ones with the correct suffix, or a way to search through LogGroups?
const queryWidget = new LogQueryWidget({
    title: "Error Rate",
    logGroupNames: ['/aws/lambda/someLogGroup'],
    view: LogQueryVisualizationType.TABLE,
    queryLines: [
        'fields #message',
        'filter #message like /(?i)error/'
    ],
})
Is there any way I can add it so that logGroupNames contains all LogGroups that end with a specific suffix?
You cannot do that dynamically (i.e. you can't make this work such that if you add a new LogGroup, the query automatically adjusts) without using something like an AWS Lambda function that periodically updates your Log Query.
However, because CDK is just code, there is nothing stopping you from making an AWS SDK API call inside the code to retrieve all the log groups (see https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CloudWatchLogs.html#describeLogGroups-property) and then populating logGroupNames accordingly.
That way, when the CDK app synthesizes, it will make an API call to fetch the LogGroups, and the generated CloudFormation template will contain the log groups you need. Note that this list will only be updated when you re-synthesize and re-deploy your stack.
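As a hedged sketch of that idea, assuming the CDK app were written in Python (the question uses TypeScript, where the JavaScript SDK call linked above plays the same role); the suffix value here is just an example:
import boto3
from aws_cdk import aws_cloudwatch as cw

suffix = "-errors"   # example suffix to match

# Fetch matching log group names at synth time with the AWS SDK.
logs = boto3.client("logs")
log_group_names = []
for page in logs.get_paginator("describe_log_groups").paginate():
    log_group_names.extend(
        group["logGroupName"]
        for group in page["logGroups"]
        if group["logGroupName"].endswith(suffix)
    )

query_widget = cw.LogQueryWidget(
    title="Error Rate",
    log_group_names=log_group_names,
    view=cw.LogQueryVisualizationType.TABLE,
    query_lines=[
        "fields #message",
        "filter #message like /(?i)error/",
    ],
)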
Finally, note that there is a limit on how many Log Groups you can query with Log Insights (20 according to https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html).
If you want to achieve this, you can create a custom resource using the AwsCustomResource and AwsSdkCall classes to make the AWS SDK API call (as mentioned by @Tofig above) as part of the deployment. You can read data from the API call response as well and act on it as you want.

Can AWS Glue write to DynamoDB?

I need to run a grouping job over a Source DynamoDB table, then write each resulting Item to another, Target DynamoDB table (or to a secondary index of the Source one).
Here I see that DynamoDB can be used as a Source (as is also reported in Connection Types).
However, it's not clear to me if a DynamoDB table can be used as Target as well.
Note: each resulting grouping item must be written into a separate DynamoDB Item (i.e., if there are X objects resulting from grouping, X Items must be written to Target DynamoDB table).
Glue can now read from and write to DynamoDB. The option to write is not available via the console, but it can be done by editing the script.
Example:
Datasink1 = glueContext.write_dynamic_frame_from_options(
    frame=ApplyMapping_Frame1,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": "myDDBTable",
        "dynamodb.throughput.write.percent": "1.0"
    }
)
As per:
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect.html#etl-connect-dynamodb-as-sink
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-dynamo-db-cross-account.html
The Glue job scripts can be customized to write to any data source. If you are using the auto-generated scripts, you can add the boto3 library to write to DynamoDB tables.
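For instance, a sketch of what that could look like inside a Glue (Python) job script; the target table name and item attributes are placeholders, and grouped_rows stands in for the output of your grouping step:
import boto3

dynamodb = boto3.resource("dynamodb")
target_table = dynamodb.Table("target-table")   # placeholder table name

# grouped_rows is a placeholder for your grouping output,
# e.g. grouped_dynamic_frame.toDF().collect()
with target_table.batch_writer() as writer:
    for row in grouped_rows:
        writer.put_item(Item={
            "group_key": row["group_key"],   # placeholder attributes
            "count": row["count"],
        })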
If you want to test the scripts easily, you can create a dev endpoint through the AWS console and launch a Jupyter notebook to write and test your Glue job scripts.

Daily AWS Lambda not creating Athena partition, however the command runs successfully

I have an Athena database set up pointing at an S3 bucket containing ALB logs, and it all works correctly. I partition the table by a column called datetime and the idea is that it has the format YYYY/MM/DD.
I can manually create partitions through the Athena console, using the following command:
ALTER TABLE alb_logs ADD IF NOT EXISTS PARTITION (datetime='2019-08-01') LOCATION 's3://mybucket/AWSLogs/myaccountid/elasticloadbalancing/eu-west-1/2019/08/01/'
I have created a Lambda that runs daily to create a new partition; however, this doesn't seem to work. I use the boto3 Python client and execute the following:
result = athena.start_query_execution(
    QueryString = "ALTER TABLE alb_logs ADD IF NOT EXISTS PARTITION (datetime='2019-08-01') LOCATION 's3://mybucket/AWSLogs/myaccountid/elasticloadbalancing/eu-west-1/2019/08/01/'",
    QueryExecutionContext = {
        'Database': 'web'
    },
    ResultConfiguration = {
        "OutputLocation" : "s3://aws-athena-query-results-093305704519-eu-west-1/Unsaved/"
    }
)
This appears to run successfully without any errors, and the query execution even returns a QueryExecutionId as it should. However, if I run SHOW PARTITIONS web.alb_logs; in the Athena console, the partition has not been created.
I have a feeling it could be down to permissions; however, I have given the Lambda execution role full permissions to all resources on S3 and full permissions to all resources on Athena, and it still doesn't seem to work.
Since Athena query execution is asynchronous, your Lambda function never sees the result of the query execution; it just gets the result of starting the query.
I would be very surprised if this wasn't a permissions issue, but because of the above, the error will not appear in the Lambda logs. What you can do is log the query execution ID and look it up with the GetQueryExecution API call to check whether the query succeeded.
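A sketch of that check, reusing the athena client and the result variable from the question's code:
import time

query_execution_id = result["QueryExecutionId"]
print("Started Athena query %s" % query_execution_id)

# Poll GetQueryExecution until the query reaches a terminal state.
while True:
    execution = athena.get_query_execution(QueryExecutionId=query_execution_id)
    state = execution["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state != "SUCCEEDED":
    # A permissions problem would surface here as the state change reason.
    raise Exception(execution["QueryExecution"]["Status"].get("StateChangeReason", state))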
Even better would be to rewrite your code to use the Glue APIs directly to add the partitions. Adding a partition is a quick and synchronous operation in Glue, which means you can make the API call and get a status in the same Lambda execution. Have a look at the APIs for working with partitions: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-partitions.html
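For example, a sketch using boto3: it copies the table's existing storage descriptor and points its location at the day's prefix; the database, table and S3 path follow the question, and the date is just illustrative:
import boto3

glue = boto3.client("glue")

# Reuse the table's storage descriptor so the partition matches its format settings.
table = glue.get_table(DatabaseName="web", Name="alb_logs")["Table"]
storage_descriptor = dict(table["StorageDescriptor"])
storage_descriptor["Location"] = (
    "s3://mybucket/AWSLogs/myaccountid/elasticloadbalancing/eu-west-1/2019/08/01/"
)

try:
    glue.create_partition(
        DatabaseName="web",
        TableName="alb_logs",
        PartitionInput={
            "Values": ["2019-08-01"],
            "StorageDescriptor": storage_descriptor,
        },
    )
except glue.exceptions.AlreadyExistsException:
    pass  # the partition was already added on a previous run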

How can I setup a delete Trigger on AWS RDS

I have a setup on AWS RDS with MariaDB 10.3. I have several DBs on the RDS instance. I'm trying to replicate a table (routes) from one DB (att) to another DB (pro) using triggers. I have triggers for create, update and delete. The create and update triggers work fine, while the delete trigger gives the error message below. I've tested all triggers locally and they work.
My trigger looks like this.
CREATE DEFINER=`root`@`%` TRIGGER routes_delete AFTER DELETE ON
`routes` FOR EACH ROW
BEGIN
    DELETE FROM `pro`.`routes`
    WHERE `route_id` = OLD.route_id;
END
Error message
Query execution failed
Reason:
SQL Error [1442] [HY000]: (conn:349208) Can't update table 'routes' in
stored function/trigger because it is already used by statement which
invoked this stored function/trigger
Query is : DELETE FROM `att`.routes WHERE route_code = 78 AND company_id = 3
I don't understand what other statement is using the routes table, since there is nothing else linked to it. What adjustment is needed to get this to work on AWS RDS?
what other statement is using the routes table
The "other" query is the query that invoked the trigger.
...is already used by [the] statement which invoked this stored function/trigger
A trigger is not allowed to modify its own table. BEFORE INSERT and BEFORE UPDATE triggers can modify the current row before it is written to the table using the NEW alias, but that is the extent to which a trigger can modify the table where it is defined.
Triggers are subject to all the same limitations as stored functions, and a stored function...
Cannot make changes to a table that is already in use (reading or writing) by the statement invoking the stored function.
https://mariadb.com/kb/en/library/stored-function-limitations/