AWS SAM and DynamoDB table in production - amazon-web-services

I am using AWS SAM to define my app, and I am defining a DynamoDB table using this: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-dynamodb-table.html#cfn-dynamodb-table-tablename
However, I am worried that in Prod, this will lead to deleting the table and its content.
How do others handle this? Is there a way to keep the table and not drop and recreate it?

Use change sets and review them carefully to ensure that you are not causing a replacement of the table.
Use the DeletionPolicy and UpdateReplacePolicy attributes to ensure you do not lose data, even if you do replace the table by accident.
Use a stack policy to block updates to the resource.
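For example, in a SAM/CloudFormation template the policies can be attached directly to the table resource. A minimal sketch, where the logical ID and table name are placeholders:

```yaml
Resources:
  MyTable:
    Type: AWS::DynamoDB::Table
    # Keep the table (and its data) if it is removed from the stack
    # or replaced during an update.
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      TableName: my-prod-table
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
```

With both attributes set to Retain, an accidental replacement leaves the old table orphaned in your account rather than deleted, so the data can still be recovered.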

Related

How does removalPolicy: cdk.RemovalPolicy.DESTROY work?

I am using removalPolicy: cdk.RemovalPolicy.DESTROY.
The other two options are RETAIN and SNAPSHOT.
If I delete my table from the console and then try to create it using the CDK, it gives an error saying it could not find the resource.
Question: which option can I use so that if the script cannot find the table, it will create it?
The RemovalPolicy has nothing to do with things you remove yourself outside of CDK. This is a bad idea (and in some cases not permitted) as you are supposed to delete resources by updating your CDK code to remove that resource, and then redeploying it.
The RemovalPolicy tells CloudFormation what to do if you change your CDK code so that the resource is no longer part of your CDK stack.
For example if you have an S3 bucket you cannot rename it, but you can still change its name in your CDK stack. If you do, CDK will need to remove the old S3 bucket and create a new one with the new name. The same applies for a number of other resources that can't be renamed, such as DynamoDB tables.
How CDK handles removing this old resource is what the RemovalPolicy is for. If you set it to RETAIN it will just forget about it and leave it up to you to clean up manually later. Using the DESTROY policy tells CDK to try to delete your resource automatically along with all the data it contains.
Usually you would use DESTROY if the data is not important and can be easily recreated (e.g. cache data), and RETAIN if the data is important and you would not want to lose it (e.g. user data).
Often it is a good idea just to always use RETAIN. This way if you accidentally make a typo in your CDK stack and rename a resource by mistake, all the data in it won't get deleted!
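Following that advice, a minimal sketch of pinning the policy on a table in CDK (TypeScript; the construct ID and key name are placeholders):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

// Retain the table (and its data) even if it is removed from the
// stack or accidentally replaced by a rename.
const table = new dynamodb.Table(this, 'UsersTable', {
  partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
  removalPolicy: cdk.RemovalPolicy.RETAIN,
});
```

A retained table becomes orphaned from the stack, so cleaning it up later (or re-importing it) is a manual step, but no data is lost in the meantime.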
which option can I use so that if the script cannot find the table, it will create it?
You just create the resource normally. When you write a CDK stack, you are not telling it what to do (create this S3 bucket, create this DynamoDB table) but rather you are telling it what you want (I want an S3 bucket with this name, and a DynamoDB table with that name). CDK will figure out which resources need to be created to meet your request, and if CDK has already created those resources in an earlier deploy, it will just update them if changes are needed or leave them untouched if no changes are required.
The reason you got an error after you deleted the resource manually is because CDK was trying to find it to figure out whether it needed to be updated or not. This is why you should never change any AWS resources manually that were configured by CDK - always update the CDK template and redeploy. If you fiddle with the resources manually it is very easy to break CDK, and in that situation the only solution is to destroy the stack, manually clean up any resources that couldn't be destroyed, then redeploy it from scratch (and then also reupload any user data you might have, often a big deal!)
The CDK RemovalPolicy is equivalent to the CloudFormation DeletionPolicy, which takes effect when a resource is removed from the CDK/CloudFormation code.
DESTROY: The default for most resources; the actual resource is deleted if its definition is removed from the CDK code (though some stateful constructs, such as DynamoDB tables, default to RETAIN).
RETAIN: The actual resource is kept (orphaned from the stack) if its definition is removed from the CDK code.
SNAPSHOT: The resource is also deleted if its definition is removed from the CDK code, but a snapshot is taken before deletion. Only applicable to resources that support snapshots, e.g. an RDS cluster or an EC2 volume.
These options apply when the resource is removed from the CDK code, not from AWS directly. If a resource created by CDK/CloudFormation is deleted manually, it can no longer be maintained by CDK and will result in errors unless its construct id is changed, e.g. MyQueue is changed to MyQueueSomething. This results in the creation of a new queue and the removal of the old one; since the old queue no longer exists, its deletion is simply skipped.
new sqs.Queue(this, 'MyQueue', {
encryption: sqs.QueueEncryption.KMS_MANAGED
});
If we mistakenly delete a resource manually outside CDK/CloudFormation and we want to continue managing it via CDK/CloudFormation, we have to manually recreate the resource with the same physical ID. Here are some more details.

restoring DynamoDB table from AWS Backup

I am using AWS Backup to back up some DynamoDB tables. Using the AWS Backup console to restore the back-ups I am prompted to restore to a new table. This works fine but my tables are deployed using CloudFormation, so I need the restored data in the existing table.
What is the process to get the restored data into the existing table? It looks like there are some third-party tools to copy data between tables but I'm looking for something within AWS itself.
I recently had this issue and actually got CloudFormation to work quite seamlessly. The process was:
Delete the existing tables directly in DynamoDB (do not delete them from CloudFormation).
Restore the backup to a new table, using the name of the deleted table.
In CloudFormation, detect drift, manually fix any drift errors in DynamoDB, and then detect drift again.
After this, the CFN template was healthy.
At this time, AWS has no direct way to do this (though it looks like you can export to some service, then import from that service into an existing table).
I ended up writing my own code to do this.
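For what it's worth, here is a rough sketch of such a copy script using boto3. The table names are placeholders, and it assumes both tables already exist and share a key schema; note that a full scan of a large table consumes read capacity:

```python
"""Sketch: copy every item from a restored DynamoDB table back into
the original table managed by CloudFormation."""

def copy_pages(pages, put_item):
    """Feed each item from an iterable of scan pages into put_item."""
    count = 0
    for page in pages:
        for item in page:
            put_item(item)
            count += 1
    return count

def copy_table(source="my-table-restored", target="my-table"):
    import boto3  # imported lazily so copy_pages stays testable offline

    dynamodb = boto3.resource("dynamodb")
    src, dst = dynamodb.Table(source), dynamodb.Table(target)

    def scan_pages():
        kwargs = {}
        while True:
            resp = src.scan(**kwargs)
            yield resp["Items"]
            if "LastEvaluatedKey" not in resp:
                return
            kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]

    # batch_writer groups the puts into 25-item BatchWriteItem calls
    with dst.batch_writer() as writer:
        copied = copy_pages(scan_pages(),
                            lambda item: writer.put_item(Item=item))
    print(f"copied {copied} items from {source} to {target}")
```

Call copy_table() from a one-off job once the restore has finished; the pagination loop matters because a single Scan call returns at most 1 MB of data.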

DynamoDB - restoring table using PITR for DynamoDB table managed by CloudFormation

I would like to be able to perform PITR restoration without losing benefit of Infrastructure-as-a-code with CloudFormation.
Specifically, if I perform PITR restoration manually and then point application to the new database, won't that result in new DynamoDB table falling out of CloudFormation managed infrastructure? AFAIK, there is no mechanism at the moment to add a resource to CloudFormation after it was already created.
Has anyone solved this problem?
There is now a way to import existing resources into CloudFormation.
This means that you can do a PiTR and then import the newly created table into your stack.
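For reference, the import works through a change set of type IMPORT (aws cloudformation create-change-set --change-set-type IMPORT --resources-to-import file://import.json); the template must declare the table with a DeletionPolicy, and import.json describes the existing table to adopt. A sketch, with placeholder names:

```json
[
  {
    "ResourceType": "AWS::DynamoDB::Table",
    "LogicalResourceId": "MyTable",
    "ResourceIdentifier": { "TableName": "my-table-restored" }
  }
]
```

After the import change set is executed, it is worth running drift detection so the template and the restored table's actual configuration agree.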
You are correct, the restored table will be outside CloudFormation's control. The only solution that I know of is to write a script that copies the data from the recovered table to the original table. Obviously there is a cost and time involved in that, and it is less than ideal.
As ever, there is always the option to write a custom resource, but that somewhat undermines the point of using CloudFormation in the first place.

Create AWS dynamodb table via aws cli and attach it to a cloudformation stack

Is it possible to create an AWS DynamoDB resource and attach it to a CloudFormation stack after stack creation?
Use case: I have a DynamoDB table that I want to wipe clean (delete all items). The two ways to do this are deleting and then recreating the table, or deleting each item individually, which is costly. As such, I would like to opt for deleting and recreating the whole table. However, the resource belongs to a CloudFormation stack and I'd like to keep it that way.
Any ideas?
It's easy enough to remove the table from the stack resources, either by simply removing the resource from the template or, as a slightly cleaner solution, by using a Condition on the CloudFormation resource to toggle the table on or off. You can then toggle it off and deploy the stack (removing the table), then toggle it on and deploy again (recreating the table).
The real challenge with this technique is not the table itself, but all the references to that table in the CloudFormation stack. It's likely that you'll be referring to the table elsewhere - for example, as resources in your IAM Policies allowing access, in your application config to specify the table, etc. If this is the case, you'll have to change those places too, to use Fn::If to control the creation of the reference with the same condition that creates the table. This ends up being rather complicated, but can be done with a combination of Fn::If and {"Ref": "AWS::NoValue" }.
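A sketch of the toggle described above in a SAM template; the parameter, condition, and resource names are placeholders:

```yaml
Parameters:
  CreateTable:
    Type: String
    AllowedValues: ["true", "false"]
    Default: "true"

Conditions:
  TableEnabled: !Equals [!Ref CreateTable, "true"]

Resources:
  MyTable:
    Type: AWS::DynamoDB::Table
    Condition: TableEnabled
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - { AttributeName: pk, AttributeType: S }
      KeySchema:
        - { AttributeName: pk, KeyType: HASH }

  AppFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      CodeUri: src/
      Environment:
        Variables:
          # Only reference the table while it exists
          TABLE_NAME: !If [TableEnabled, !Ref MyTable, !Ref "AWS::NoValue"]
```

Every !Ref, !GetAtt, and IAM policy statement that touches the table needs the same !If treatment, which is exactly the complexity the answer warns about.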
I've done DevOps in AWS for quite a few years, and overall I'd strongly recommend my developers build an efficient script to clear DynamoDB tables and use that. It's not trivial to purge a table by deleting all of its items, but it's a lot simpler than conditionalizing the creation of every reference to the table in your stack. At the end of the day, resetting the table data is an operational task distinct from infrastructure management; I'd suggest you keep it that way. What is the recommended way to delete a large number of items from DynamoDB? might get you started.
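As a starting point, here is a hedged sketch of such a purge script with boto3. The table name is a placeholder, and it assumes the key attribute names are not DynamoDB reserved words, since they are used directly in a ProjectionExpression:

```python
"""Sketch: delete every item from a DynamoDB table without dropping
the table itself (so the CloudFormation stack is untouched)."""

def key_projection(key_schema):
    """Return the key attribute names from a DescribeTable KeySchema."""
    return [k["AttributeName"] for k in key_schema]

def extract_keys(items, key_names):
    """Reduce full items to just their key attributes for deletion."""
    return [{name: item[name] for name in key_names} for item in items]

def purge_table(table_name="my-table"):
    import boto3  # imported lazily so the helpers above stay testable offline

    table = boto3.resource("dynamodb").Table(table_name)
    key_names = key_projection(table.key_schema)

    deleted = 0
    # Scan only the key attributes; deleting needs nothing else.
    kwargs = {"ProjectionExpression": ", ".join(key_names)}
    # batch_writer groups the deletes into 25-item BatchWriteItem calls
    with table.batch_writer() as writer:
        while True:
            resp = table.scan(**kwargs)
            for key in extract_keys(resp["Items"], key_names):
                writer.delete_item(Key=key)
                deleted += 1
            if "LastEvaluatedKey" not in resp:
                break
            kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]
    print(f"deleted {deleted} items from {table_name}")
```

This still pays one write per item, as the answer notes, but it leaves the table, its ARN, and every stack reference to it intact.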

I don't want to index DynamoDB secondary indexes in Elasticsearch

I have one DynamoDB table, and there is a secondary index on the table.
But I faced a duplication problem when querying.
I don't want my Lambda function to process the secondary index...
I looked at IAM policies, but there is no policy for that.
How can I solve this problem? This is my Lambda function: aws-dynamodb-to-elasticsearch/dynamodb-to-es.py at master · vladhoncharenko/aws-dynamodb-to-elasticsearch
This is probably because you have many Lambda functions or many Lambda Function versions in your account for that region.
Total size of all the deployment packages that can be uploaded per region: 75 GB
Looks like this is a pretty common problem for serverless and someone has developed a plugin to help alleviate this issue: https://github.com/claygregory/serverless-prune-plugin
If you want to deal with this manually, you'll need to use either the console or an SDK/CLI to delete old Lambda versions. https://docs.aws.amazon.com/cli/latest/reference/lambda/delete-function.html