How to force deploy to lambda via SAM cli - amazon-web-services

I am using the sam deploy command to deploy my Lambda to AWS. Sometimes I get this error:
An error occurred (ValidationError) when calling the CreateChangeSet operation: Stack:arn:aws:cloudformation:ap-southeast-2:xxxx:stack/xxxx/xxxx is in ROLLBACK_COMPLETE state and can not be updated.
I know this means the previous deployment failed. I can manually delete the stack in the AWS CloudFormation console and retry the command, but is there a way to force the command to delete any stack that is in a rollback state?
I know I can delete the failed stack via the AWS CLI or the console. But my deploy script runs on CI, and I'd like CI to use the deploy command to override the failed stack. So the scenario is:
1. CI fails while deploying the Lambda function
2. My team analyzes the issue and fixes it in the CloudFormation template file
3. The fix is pushed to GitHub to trigger CI
4. CI is triggered and uses the latest change to override the failed stack.
I don't want the team to manually delete the stack.

The ROLLBACK_COMPLETE status exists only after a failed stack creation. The only option is to delete the stack. This is to give you a chance to correctly analyze the reason behind the failure.
You can delete the stack from the command line with:
aws cloudformation delete-stack --stack-name <value>
From the documentation of ROLLBACK_COMPLETE:
Successful removal of one or more stacks after a failed stack creation or after an explicitly canceled stack creation. Any resources that were created during the create stack action are deleted.
This status exists only after a failed stack creation. It signifies that all operations from the partially created stack have been appropriately cleaned up. When in this state, only a delete operation can be performed.
Normally the ROLLBACK_COMPLETE status should not occur in production. I would suggest validating your stack in a development environment, or having one successful stack creation in your production environment, before continuously deploying your stack.
Still, you could have a custom script in your CI that checks the stack status (DescribeStacks) and if it's ROLLBACK_COMPLETE delete it (DeleteStack). This script would run before sam deploy.
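The pre-deploy check described above could be sketched roughly as follows. This is only a minimal sketch, assuming Python with boto3 is available on the CI runner; the function name delete_if_rollback_complete is hypothetical, and cfn is expected to be a client created with boto3.client("cloudformation").

```python
def delete_if_rollback_complete(cfn, stack_name):
    """Delete the stack if it is stuck in ROLLBACK_COMPLETE.

    `cfn` is assumed to be a boto3 CloudFormation client.
    Returns True if the stack was deleted, False otherwise.
    """
    try:
        stacks = cfn.describe_stacks(StackName=stack_name)["Stacks"]
    except cfn.exceptions.ClientError as err:
        # DescribeStacks fails with a ValidationError when the stack does not
        # exist yet, which is the normal case for a first deployment.
        if "does not exist" in str(err):
            return False
        raise

    if stacks[0]["StackStatus"] != "ROLLBACK_COMPLETE":
        return False

    cfn.delete_stack(StackName=stack_name)
    # Wait for the deletion to finish so that the following
    # `sam deploy` can create the stack from scratch.
    cfn.get_waiter("stack_delete_complete").wait(StackName=stack_name)
    return True
```

In CI this would run as a step before sam deploy, for example delete_if_rollback_complete(boto3.client("cloudformation"), "my-stack"), where my-stack is a placeholder for your stack name.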

Related

How to launch AWS cloud formation stack with glue?

I'm trying to get this repo going: https://github.com/mydatastack/google-analytics-to-s3.
A link is provided to launch the AWS CloudFormation stack. It's meant to be a one-click launch, but it no longer works because the S3 bucket containing the template is no longer active.
As a result I'm trying to launch the stack myself via sam deploy --guided --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_IAM, since all the resources for the stack are within the repo. I've added this Lambda layer for the paramiko package referenced by collector-ga.yaml to fix this error.
Frustratingly, I'm not quite up and running yet: GlueConfigurationLambda, an AWS Lambda function (line 691), failed to create:
Waiting for changeset to be created..
CloudFormation stack changeset
---------------------------------------------------------------------------------------------------------------------
Operation LogicalResourceId ResourceType Replacement
---------------------------------------------------------------------------------------------------------------------
+ Add GoogleAnalyticsCollectorStack AWS::CloudFormation::Stack N/A
---------------------------------------------------------------------------------------------------------------------
Changeset created successfully. arn:aws:cloudformation:eu-central-1:XXXXXX:changeSet/samcli-deploy1628597635/4ee26e-46b5-4131-bdba-1b9fc34f99d6
Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]: y
2021-08-10 13:14:04 - Waiting for stack create/update to complete
CloudFormation events from changeset
---------------------------------------------------------------------------------------------------------------------------------------------------------
ResourceStatus ResourceType LogicalResourceId ResourceStatusReason
---------------------------------------------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS AWS::CloudFormation::Stack GoogleAnalyticsCollectorStack -
CREATE_IN_PROGRESS AWS::CloudFormation::Stack GoogleAnalyticsCollectorStack Resource creation Initiated
CREATE_FAILED AWS::CloudFormation::Stack GoogleAnalyticsCollectorStack Embedded stack arn:aws:cloudformation:eu-central-1:XXXXXX:stack/GAN2S3-GoogleAnalyticsCollectorStack-JUATDT3EBD82/e19a4950-ff27-11ea-943e-06072e1f2808 was not successfully created: The following resource(s) failed to create: [GlueConfigurationLambda].
Full Trace - https://pastebin.pl/view/50b3e402
My first question is if there's anywhere to get a more in-depth log of the error?
My second question is if anyone knows how to fix this error.
Can you have a look at the AWS Console CloudFormation application? You should be able to opt to view the Deleted stacks, after which you should be able to select the substack that has failed. In the events list of that deleted stack, you should be able to view a more precise error of what went wrong.
If it's still unclear from that precise error, feel free to edit the question to add the specific error and add a comment to this answer to draw my attention to it.
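The same events can also be read without the console. The following is a rough sketch, assuming boto3; failed_events is a hypothetical helper name and cfn is a client from boto3.client("cloudformation"). For a stack that has already been deleted, the full stack ID (the ARN shown in the error output) has to be passed rather than the stack name.

```python
def failed_events(cfn, stack_id):
    """Return (logical id, status, reason) for every *_FAILED event of a stack.

    `cfn` is assumed to be a boto3 CloudFormation client. For deleted
    stacks, `stack_id` must be the stack ARN, not the stack name.
    """
    paginator = cfn.get_paginator("describe_stack_events")
    failures = []
    for page in paginator.paginate(StackName=stack_id):
        for event in page["StackEvents"]:
            if event["ResourceStatus"].endswith("_FAILED"):
                failures.append(
                    (
                        event["LogicalResourceId"],
                        event["ResourceStatus"],
                        # The reason field is absent on some events.
                        event.get("ResourceStatusReason", ""),
                    )
                )
    return failures
```

Called with the ARN of the embedded stack, this should surface the ResourceStatusReason recorded for GlueConfigurationLambda.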
(Edit)
I've looked through the template file again and noticed that the failing Lambda is still configured to use Node.js 8, a runtime that Lambda deprecated some time ago. You should change it to a currently supported version (Node.js 14 at the time of writing).
Find the currently supported runtimes here: Lambda runtimes

AWS CodePipeline is failing with InternalFailure

I migrated existing AWS resources from one CloudFormation (CFT) stack to another CFT stack by following the guide below.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resource-import-new-stack.html
After the migration, my new CFT stack's status was "IMPORT_COMPLETE". I then created an AWS CodePipeline whose source is AWS CodeCommit, and I am trying to deploy to the CloudFormation stack through it.
The pipeline targets the new CFT stack that holds the migrated resources. In the same template I added an SQS queue policy and pushed the updated code to CodeCommit.
When the pipeline is triggered, it fails with an "InternalFailure" error and gives no specific reason for the failure.
I also checked the CloudTrail logs, where I can see that the pipeline fails after the "UploadArchive" event, which belongs to CodeCommit, and does not move any further. I also tried giving administrator permissions to both the pipeline service role and the CloudFormation role, but the error is the same.
Later I observed that when I update the new CloudFormation stack from the AWS CloudFormation console, the stack's status changes to "UPDATE_COMPLETE". If I then push a change to CodeCommit, the pipeline completes successfully.
So I am not sure why the pipeline fails with "InternalFailure" when the stack's status is "IMPORT_COMPLETE". Could you please help me understand whether I am missing a step that causes this failure while the stack is in the "IMPORT_COMPLETE" state?
It's a bug in CodePipeline. I'd recommend submitting a ticket to AWS Support in the hope that they fix it. I only found this out through support myself.

AWS SAM deploy failure

I was testing the AWS SAM functionality and encountered an issue.
If I manually delete a resource that was originally created by the SAM template, subsequent SAM deployments fail. I understand that manually deleting a resource created by SAM is not good practice, but this was only a test.
Error
Is there any way to fix this?
AWS SAM uses Cloudformation underneath to create various resources.
How do I update an AWS CloudFormation stack that's failing because of a resource that I manually deleted?
If you delete a resource from an AWS CloudFormation stack, then you must remove the resource from your AWS CloudFormation template. Otherwise, your stack fails to update, and you get an error message.
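To find out which resources were removed by hand before editing the template, CloudFormation drift detection may help. This is not part of the quoted answer, just a rough sketch assuming boto3: find_deleted_resources is a hypothetical helper, cfn is a client from boto3.client("cloudformation"), only the first page of results is read, and not every resource type supports drift detection.

```python
import time


def find_deleted_resources(cfn, stack_name, poll_seconds=5):
    """Return logical IDs of stack resources that drift detection reports as DELETED.

    `cfn` is assumed to be a boto3 CloudFormation client.
    """
    detection_id = cfn.detect_stack_drift(StackName=stack_name)[
        "StackDriftDetectionId"
    ]

    # Drift detection is asynchronous, so poll until it leaves IN_PROGRESS.
    while True:
        status = cfn.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        if status["DetectionStatus"] != "DETECTION_IN_PROGRESS":
            break
        time.sleep(poll_seconds)

    if status["DetectionStatus"] != "DETECTION_COMPLETE":
        raise RuntimeError("drift detection did not complete")

    drifts = cfn.describe_stack_resource_drifts(
        StackName=stack_name,
        StackResourceDriftStatusFilters=["DELETED"],
    )["StackResourceDrifts"]
    return [drift["LogicalResourceId"] for drift in drifts]
```

The returned logical IDs are the candidates to remove from the template (or to recreate) before the next deployment.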
Similar post: Function not found after manually deleting a function in a SAM CloudFormation stack

ValidationError Stack:arn aws cloudformation stack is in ROLLBACK_COMPLETE state and can not be updated

When I deploy using CloudFormation:
aws cloudformation deploy --region $region --stack-name ABC
I get the error:
An error occurred (ValidationError) when calling the CreateChangeSet
operation:
Stack:arn:aws:cloudformation:stack/service/7e1d8c70-d60f-11e9-9728-0a4501e4ce4c
is in ROLLBACK_COMPLETE state and can not be updated.
This happens when stack creation fails. By default the stack will remain in place with a status of ROLLBACK_COMPLETE. This means it's successfully rolled back (deleted) all the resources which the stack had created. The only thing remaining is the empty stack itself. You cannot update this stack; you must manually delete it, after which you can attempt to deploy it again.
If you set "Rollback on failure" to disabled in the console (or set --on-failure to DO_NOTHING in the CLI command, if using create-stack), stack creation failure will instead result in a status of CREATE_FAILED. Any resources created before the point of failure won't have been rolled back.
If instead you were deploying updates to an existing (successfully created) stack, and the updates failed but were successfully rolled back, it will go back into its previous valid state (with a status of UPDATE_ROLLBACK_COMPLETE), allowing you to reattempt updates.
As @SteffenOpel points out, you can now specify that a stack should be deleted on failure by setting the --on-failure option (for create-stack only, not deploy) to DELETE in the CLI. This option is not yet available in the console at the time of writing (13/11/20).
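For scripts that call the API directly, the equivalent of --on-failure DELETE is the OnFailure parameter of CreateStack. A minimal sketch, assuming boto3 and a plain CloudFormation template; create_stack_delete_on_failure is a hypothetical helper name and cfn is a client from boto3.client("cloudformation"):

```python
def create_stack_delete_on_failure(cfn, stack_name, template_body):
    """Create a stack that CloudFormation deletes by itself if creation fails.

    `cfn` is assumed to be a boto3 CloudFormation client. OnFailure accepts
    DO_NOTHING, ROLLBACK or DELETE and cannot be combined with DisableRollback.
    """
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        OnFailure="DELETE",
    )
    return response["StackId"]
```

With OnFailure set to DELETE, a failed creation should not leave a stack behind in ROLLBACK_COMPLETE, so the next attempt can reuse the same stack name.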
Run the following AWS CLI command to delete your stack:
aws cloudformation delete-stack --stack-name <stack-name>
Deleting the stack may take less than a minute. Once it is gone, try deploying again.
Two possible solutions:
1. Manually delete all the objects in the S3 bucket.
(If the error still occurs, i.e. Stack:arn:aws:cloudformation:eu-west-3:624140032431:stack/as*****cbucket/f57c54f0-618a-11ec-afd7-06fc90426f3e is in ROLLBACK_COMPLETE state and can not be updated., move on to the second solution.)
2. Create a new bucket with a different name.
The underlying cause is that S3 bucket names are globally unique. The same thing happened to me while I was using CloudFormation.
In my case the bucket name in my template was already taken, so I changed the name of the bucket and it worked.

AWS Secrets Manager and Cloud Formation - can not create secret because it already exists

I have a CF template with a simple secret inside, like this:
Credentials:
  Type: 'AWS::SecretsManager::Secret'
  Properties:
    Name: !Sub ${ProjectKey}.${StageName}.${ComponentId}.credentials
    Description: client credentials
    SecretString: !Sub
      '{"client_id":"${ClientId}","client_secret":"${ClientSecret}"}'
The stack is created successfully and the secret is correctly generated.
However when I delete the stack and recreate it again I get the following error message:
The operation failed because the secret pk.stage.compid.credentials
already exists. (Service: AWSSecretsManager; Status Code: 400; Error
Code: ResourceExistsException; Request ID: ###)
I guess this is because the secret is not really deleted but only marked for deletion for x days.
It is possible to delete a secret immediately via CLI, but how can this be done within the CF Template?
I need to delete and recreate the stacks because this is part of a continuous integration/delivery pipeline that is automatically triggered on source code commits.
Normally when you delete a stack the secret should be deleted also; and CFN does the aforementioned immediate delete. This should succeed even if the secret was scheduled for deletion outside of the CFN stack.
If (after your stack was deleted) the secret was created by another cloud formation stack or the same test running in another CI pipeline re-created the secret, you might see this error. Also, most AWS systems (Secrets Manager included) are eventually consistent, and you may see a delay between the stack being deleted and the actual secret deletion. If your tests run quick enough, or the same secret name is re-used in multiple tests, the previous delete may not have completed before the next create.
We have faced similar problems in our CI stacks, and the way we work around it is to generate a random name per test. You could, for example, pass a randomly generated string to your stacks as a parameter and use it to construct the secret name, so that each test uses a unique name.
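One way to generate such a per-test name is sketched below. The helper unique_secret_name is hypothetical, and the name layout simply mirrors the ${ProjectKey}.${StageName}.${ComponentId}.credentials pattern from the question with a random suffix appended; in practice the suffix would more likely be passed to the stack as a parameter.

```python
import secrets


def unique_secret_name(project_key, stage_name, component_id):
    """Build a secret name that is unique for each call.

    The random suffix avoids clashing with a secret of the same name
    that is still being deleted from a previous test run.
    """
    suffix = secrets.token_hex(4)  # 8 random hexadecimal characters
    return f"{project_key}.{stage_name}.{component_id}.credentials.{suffix}"
```

Each CI run would call this once and hand the result (or just the suffix) to the template, so consecutive runs never reuse a secret name.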
BTW - you can test if a secret was scheduled for deletion or is actually not there by running get-secret-value on the secret. If it is scheduled for deletion you will see the error "...You can’t perform this operation on the secret because it was deleted", whereas if the secret is actually deleted you will see "Secrets Manager can’t find the specified secret". If you schedule a secret for deletion and then delete it with --force-delete-without-recovery you may see a short multi-second lag between the two states.
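The same distinction can be made from a script. The sketch below deliberately swaps in DescribeSecret instead of get-secret-value, since DescribeSecret reports a DeletedDate for secrets that are scheduled for deletion and does not need to read the secret value. It assumes boto3; secret_state is a hypothetical helper and sm is a client from boto3.client("secretsmanager").

```python
def secret_state(sm, secret_id):
    """Classify a secret as 'active', 'scheduled_for_deletion' or 'absent'.

    `sm` is assumed to be a boto3 Secrets Manager client.
    """
    try:
        info = sm.describe_secret(SecretId=secret_id)
    except sm.exceptions.ResourceNotFoundException:
        # The secret is really gone (or never existed).
        return "absent"
    # DeletedDate is only present while the secret waits out its recovery window.
    if "DeletedDate" in info:
        return "scheduled_for_deletion"
    return "active"
```

A CI job could poll this until it returns absent before recreating a stack that reuses the same secret name.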
Another option is to delete the secret immediately through the CLI. This skips the recovery window (7 to 30 days, 30 by default) that otherwise has to pass after the secret is marked for deletion before it is actually gone. This command-line option does the trick:
aws secretsmanager delete-secret --secret-id your-secret --force-delete-without-recovery --region your-region
Do replace your-secret and your-region accordingly.
See this reference page: https://aws.amazon.com/premiumsupport/knowledge-center/delete-secrets-manager-secret/