How to use CloudFormation to update AWS Glue Jobs

We have many AWS Glue jobs and we only ever update the job code, which consists of scripts stored in S3.
The problem is that CloudFormation can't tell when to update our Glue jobs, because all of the CloudFormation template parameters remain the same after a script change; even the script location still points to the same S3 object.

You can use the CloudFormation package command. It lets you reference local files in your Git repository as the scripts for Glue jobs: package uploads each referenced script to S3 under a key derived from its content hash and rewrites the template to point at that object, so a changed script produces a changed template and CloudFormation knows to update the job. Just run the package command every time before you deploy.
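A minimal sketch, assuming template.yml declares the Glue job with Command.ScriptLocation pointing at a local file; the bucket and stack names here are placeholders:

aws cloudformation package \
  --template-file template.yml \
  --s3-bucket my-artifact-bucket \
  --output-template-file packaged.yml

aws cloudformation deploy \
  --template-file packaged.yml \
  --stack-name my-glue-stack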

As this is similar to packaging Lambda code, you can parameterize the Glue script path and keep different versions of the Glue script file.
The Glue job resource in the CloudFormation template has a "Command" property taking a "JobCommand" value, which includes the "ScriptLocation" attribute; make that a template parameter so the script path stays dynamic. The JobCommand structure is:
{
  "Name" : String,
  "PythonVersion" : String,
  "ScriptLocation" : String
}
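A minimal hypothetical template fragment wiring that up (the logical ID, role ARN, and parameter name are placeholders):

{
  "Parameters" : {
    "GlueScriptLocation" : { "Type" : "String" }
  },
  "Resources" : {
    "MyGlueJob" : {
      "Type" : "AWS::Glue::Job",
      "Properties" : {
        "Role" : "arn:aws:iam::123456789012:role/my-glue-role",
        "Command" : {
          "Name" : "glueetl",
          "PythonVersion" : "3",
          "ScriptLocation" : { "Ref" : "GlueScriptLocation" }
        }
      }
    }
  }
}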
You can probably set up a CI/CD pipeline, using AWS CodePipeline or a third-party CI/CD tool, with the steps below:
Pull the new code from your SCM (e.g. GitHub) and deploy the script to S3.
Update the CloudFormation stack with the new S3 script path (with versions like v1, v2, etc.), as sketched below.
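For instance, a sketch of the update step, assuming the GlueScriptLocation parameter from the fragment above (stack name, bucket, and path are placeholders):

aws cloudformation update-stack \
  --stack-name my-glue-stack \
  --use-previous-template \
  --parameters ParameterKey=GlueScriptLocation,ParameterValue=s3://my-script-bucket/scripts/v2/job.py
# add --capabilities CAPABILITY_NAMED_IAM if the template also creates IAM resources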

Related

How to create a CloudFormation template from a SAM project?

I am trying to convert a SAM project to a cloudformation template in order to call
cloudformation.createStack()
to create multiple stacks when a lambda is invoked. So far I can upload the SAM project with
sam build
sam package
But the S3 upload is too big and I am getting errors. What are the steps to correctly upload the CloudFormation template?
These prerequisites need to be met before continuing:
1. Install the SAM CLI.
2. Create an Amazon S3 bucket to store the serverless code artifacts that the SAM template generates. At a minimum, you will need permission to put objects into the bucket.
3. The permissions applied to your IAM identity must include iam:ListPolicies.
4. You must have AWS credentials configured either via the AWS CLI or in your shell's environment via the AWS_* environment variables.
5. Git installed.
6. Python 3.x installed.
7. (Optional) Install Python's virtualenvwrapper.
Source: https://www.packetmischief.ca/2020/12/30/converting-from-aws-sam-to-cloudformation/
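With those prerequisites met, the packaging steps are roughly as follows (the bucket and stack names are placeholders); sam package moves the code artifacts into S3 so the template itself stays small, and aws cloudformation deploy handles the Serverless transform via a change set:

sam build
sam package \
  --s3-bucket my-artifact-bucket \
  --output-template-file packaged.yaml
aws cloudformation deploy \
  --template-file packaged.yaml \
  --stack-name my-stack \
  --capabilities CAPABILITY_IAM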

SAM/Serverless/CodeBuild clarification

I am hoping for some clarification around some terms I have been seeing on the web as they pertain to AWS, and specifically Lambdas. For starters, I would like to know how the commands sam build/sam deploy work versus setting up a CodeBuild job. Do I need a CodeBuild job to run those commands? What files specifically does the sam deploy command look for? Does it look for serverless.yml or template.yml, or both? What is a sam.yml file, or are they antiquated?
I have an app with a CodeBuild pipeline for a lambda, but I am expanding my repo to contain multiple lambdas and thinking about putting a serverless.yml file in each lambda directory, but I don't want to create a CodeBuild job and buildspec for each one. I assume sam deploy searches for all template.yml and serverless.yml files and constructs your stack as a whole (and updates only what needs to be updated?)
The app is in Node, if curious, using API Gateway. Any insight would be appreciated.
I will try to give brief answers:
What does sam deploy do? It zips the code, writes the generated CloudFormation template into the .aws-sam folder, and runs a CloudFormation deployment.
Do we need CodeBuild to run sam deploy? You still need some machine with Node installed to run sam build and sam deploy; that could be a local machine, a remote server, or a CodeBuild environment.
Do we need multiple templates? All Lambdas can be created in a single template, but CloudFormation limits the number of resources per stack (500 today; it was 200 for a long time), and with many functions and APIs in a single template you can hit that limit quickly: each function may expand into several CloudFormation resources, e.g. IAM roles, CloudWatch log groups, API routes, methods, integrations, event sources, etc. (see the sketch below).
Does sam deploy always look for template.yaml? By default yes, but this is easily overridden by passing --template-file: sam deploy --template-file template-x.yml
Are only changed resources updated? CloudFormation's update-stack updates only the resources that changed.
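To illustrate the single-template point, a minimal hypothetical template.yaml with two Node functions behind API Gateway (paths, handlers, and runtime version are placeholders):

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  FunctionA:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambdas/function-a/
      Handler: index.handler
      Runtime: nodejs18.x
      Events:
        GetA:
          Type: Api
          Properties:
            Path: /a
            Method: get
  FunctionB:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambdas/function-b/
      Handler: index.handler
      Runtime: nodejs18.x
      Events:
        GetB:
          Type: Api
          Properties:
            Path: /b
            Method: get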

Purpose and scope of AWS CDK bootstrap stack?

The docs on AWS CDK bootstrapping say of the cdk bootstrap command:
cdk bootstrap
Deploys a CDKToolkit CloudFormation stack into the specified environment(s), that provides an S3 bucket that cdk deploy will use to store synthesized templates and the related assets, before triggering a CloudFormation stack update. The name of the deployed stack can be configured using the --toolkit-stack-name argument.
$ # Deploys to all environments
$ cdk bootstrap --app='node bin/main.js'
$ # Deploys only to environments foo and bar
$ cdk bootstrap --app='node bin/main.js' foo bar
However, how often does CDK need to be bootstrapped? Is it:
once for each AWS account?
once for each application in each AWS account?
once for each application in each AWS account that requires assets?
something else?
background:
cdk bootstrap is a tool in the AWS CDK command-line interface
responsible for populating a given environment (that is, a combination
of AWS account and region) with resources required by the CDK to
perform deployments into that environment.
When you run cdk bootstrap, the CDK deploys the CDK Toolkit stack into an AWS environment.
The bootstrap command creates a CloudFormation stack in the environment passed on the command line. Currently, the only resource in that stack is an S3 bucket that holds the file assets and the resulting CloudFormation template to deploy.
The cdk bootstrap command is run once per account/region combination, i.e. once per environment.
A simple scenario to sum up (sketched below):
Run cdk bootstrap - creates a new S3 bucket, IAM roles, etc.
Run cdk deploy - deploys your stack for the first time; the new template is added to the bootstrap S3 bucket.
Apply a change to your CDK stack.
Run cdk diff - to view the differences. Behind the scenes, CDK generates the new template and compares it with the template that exists in the bootstrap bucket.
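A sketch of that flow (the account ID and region are placeholders):

# once per environment (account + region)
cdk bootstrap aws://123456789012/us-east-1
# then, per application stack, as often as you deploy
cdk deploy
# after changing your stack code
cdk diff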
More about cdk bootstrap.

How to add a wait time before executing a step in Bitbucket Pipeline

I have a Bitbucket pipeline that creates AWS resources using CloudFormation and deploys a website to them. But the deployment fails even though CloudFormation creates the stack correctly. I think the issue is that when the deployment happens, the CloudFormation S3 bucket creation may not have finished yet.
I have a Hugo website and I have created a Bitbucket pipeline to deploy it. The pipeline creates an S3 bucket using CloudFormation to host the website and then uploads the Hugo site to it. When I ran the steps of the pipeline manually in a terminal, with a delay between each step, everything succeeded. But when it runs in the Bitbucket pipeline, it gives an error saying the S3 bucket I'm trying to upload content to is not available. When I checked in AWS, that bucket is actually there, which means CloudFormation worked correctly. But when the files started to copy, the bucket may not yet have been available to receive them. That's my assumption. When doing it locally I can wait between the two commands (CloudFormation creation and file copying), but how do I handle this in the Bitbucket pipeline environment? Following is my pipeline code.
pipelines:
  pull-requests:
    '**':
      - step:
          script:
            - aws cloudformation create-stack --stack-name demo-web --template-body file://cloudformation.json --parameters ParameterKey=S3BucketName,ParameterValue=demo-web
            - hugo
            - aws s3 cp public/ s3://demo-web/ --recursive
How do I handle this scenario correctly? Is there a workaround for this situation, or is the problem I have identified not the actual problem?
First, to wait in Bitbucket Pipelines you should be able to just use sleep x, where x is the number of seconds you want to sleep.
On a different note - bear in mind that subsequent runs of this will potentially fail, because you are using create-stack, which fails if the stack already exists...
Using the AWS CloudFormation API for this is best practice.
There is a wait command just as there is a create-stack command; the wait command and its options halt your processing until the stack is in a COMPLETE status before continuing.
Option 1:
Put your stack creation command into a shell script, and as a second step in that shell script invoke the wait stack-create-complete command. Then invoke the shell script in the pipeline instead of the direct command (see the sketch after the documentation link below).
Option 2:
In your Bitbucket pipeline, right after your create command, invoke aws cloudformation wait stack-create-complete before the hugo/upload commands.
I prefer option one because it allows you to manipulate and trace the creation as you wish.
See documentation: https://docs.aws.amazon.com/cli/latest/reference/cloudformation/wait/stack-create-complete.html
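A minimal sketch of Option 1's shell script, reusing the stack and bucket names from the question:

#!/usr/bin/env bash
set -euo pipefail

# Step 1: create the stack
aws cloudformation create-stack \
  --stack-name demo-web \
  --template-body file://cloudformation.json \
  --parameters ParameterKey=S3BucketName,ParameterValue=demo-web

# Step 2: block until the stack (and therefore the bucket) reaches
# CREATE_COMPLETE; this exits non-zero if creation fails
aws cloudformation wait stack-create-complete --stack-name demo-web

# Step 3: build and upload the site
hugo
aws s3 cp public/ s3://demo-web/ --recursive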
Using the CloudFormation API is great because you don't have to guess how long to wait; the bucket is guaranteed to be available once the wait command returns.

"BundleType must be either YAML or Json" Error using Jenkins and AWS CodeDeploy

I am trying to deploy revisions to my AWS lambda functions using Jenkins and the AWS CodeDeploy add-on. I am able to build the project successfully and upload a zip of the project to an S3 bucket. At this point I receive the error:
BundleType must be either YAML or JSON
I have an appspec.yml file in my code directory. I am unsure if I need to instruct Jenkins to do something different, or if I need to instruct AWS to unzip the file and use it.
Today, CodeDeploy Lambda deployments only take a YAML or JSON file as the deployment revision input (which is just your AppSpec file). The CodeDeploy Jenkins plugin would need to be updated to support uploading a YAML or JSON file without zipping it: https://github.com/jenkinsci/aws-codedeploy-plugin/blob/master/src/main/java/com/amazonaws/codedeploy/AWSCodeDeployPublisher.java#L230
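Until the plugin supports that, a possible workaround is to upload the AppSpec file from a Jenkins shell step and register it with CodeDeploy directly; the bucket, application, and deployment-group names below are placeholders:

# upload the AppSpec file unzipped
aws s3 cp appspec.yml s3://my-deploy-bucket/appspec.yml

# create the deployment, telling CodeDeploy the bundle is plain YAML
aws deploy create-deployment \
  --application-name my-lambda-app \
  --deployment-group-name my-deployment-group \
  --revision '{"revisionType":"S3","s3Location":{"bucket":"my-deploy-bucket","key":"appspec.yml","bundleType":"YAML"}}'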