Create Glue jobs through an AWS CLI command via GitHub Actions - amazon-web-services

I have created a config file that contains the details of a Glue job. Running the following command in the AWS CLI works fine --> aws glue create-job --cli-input-json file://test.json
This creates the job in Glue. But now I would like to run this command through GitHub Actions: once the config file is committed to the Git repo and the command is given in the GitHub Actions workflow file, it should create the Glue job.
I am getting an error that the file does not exist; it is not picking up the JSON file that I mention in the create-job statement. The command run through GH Actions -->
- name: create glue job
  run: aws glue create-job --cli-input-json test.json --debug
Question: how can I pass the config file name to this command in GH Actions so that it reads the file and creates the Glue job? Can the path be anywhere, be it AWS S3 or a Git location?
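For what it's worth, a likely cause is that the workflow never checks out the repository, so test.json is not present in the runner's workspace; the file also has to be referenced with the file:// prefix. A minimal workflow sketch, assuming test.json sits at the repository root and credentials come from repository secrets (the workflow name, secret names, and region are assumptions):
# Minimal sketch; workflow name, secret names, and region are assumptions.
name: create glue job
on:
  push:
    paths:
      - test.json
jobs:
  create-glue-job:
    runs-on: ubuntu-latest
    steps:
      # Check out the repo so test.json exists in the runner's workspace
      - uses: actions/checkout@v4
      # The AWS CLI is preinstalled on GitHub-hosted runners; it still needs credentials
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1          # assumed region
      - name: create glue job
        run: aws glue create-job --cli-input-json file://test.json
If the JSON lives in S3 instead of the repository, one option is to copy it into the workspace first (for example with an extra run step such as aws s3 cp s3://my-bucket/test.json ., the bucket name being a placeholder) and then reference it the same way.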

Related

How to use CloudFormation to update AWS Glue Jobs

We have many AWS Glue jobs and we only update the job code, which consists of scripts stored in S3.
The problem is that CloudFormation cannot tell when it should and should not update our Glue jobs, because all CloudFormation template parameters remain the same after a script change; even the script location still points to the same S3 object.
You can use the CloudFormation package command. This enables you to reference local files in your Git repository as the scripts for Glue jobs. Every time before you deploy to CloudFormation, you just run the package command, as in the sketch below.
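A sketch of that flow, assuming the template is named glue-template.yaml, the artifact bucket is my-artifact-bucket, and the Glue job's ScriptLocation in the template points at a local path in the repository (all names are placeholders):
# Placeholder names throughout. package uploads the local script to S3 under a
# content-hash key and rewrites ScriptLocation in the output template.
aws cloudformation package \
  --template-file glue-template.yaml \
  --s3-bucket my-artifact-bucket \
  --output-template-file packaged.yaml

# Deploy the transformed template; when the script content changes, its S3 key
# changes too, so CloudFormation sees a property change and updates the job.
aws cloudformation deploy \
  --template-file packaged.yaml \
  --stack-name my-glue-jobs \
  --capabilities CAPABILITY_NAMED_IAM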
As this is similar to packaging Lambda code, you can parameterize the Glue script path and keep different versions of the Glue script file.
The Glue CloudFormation resource has a "Command" property that takes a "JobCommand" type value, which includes a "ScriptLocation" attribute; make this a template parameter so the script location stays dynamic:
{
  "Name" : String,
  "PythonVersion" : String,
  "ScriptLocation" : String
}
You can then set up a CI/CD pipeline using AWS CodePipeline or a third-party CI/CD tool with the steps below:
Pull the new code from your SCM (such as GitHub) and deploy the script to S3.
Update the CloudFormation stack with the new S3 script path (with versions like v1, v2, etc.), as in the sketch that follows.
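A minimal template sketch of that idea, in YAML; the logical names, role parameter, and Glue version are assumptions:
# Sketch only; logical names, defaults, and GlueVersion are assumptions.
Parameters:
  GlueScriptLocation:
    Type: String            # e.g. s3://my-bucket/scripts/v2/my-job.py
  GlueJobRoleArn:
    Type: String            # IAM role the Glue job runs as
Resources:
  MyGlueJob:
    Type: AWS::Glue::Job
    Properties:
      Name: my-glue-job
      Role: !Ref GlueJobRoleArn
      GlueVersion: "3.0"
      Command:
        Name: glueetl
        PythonVersion: "3"
        ScriptLocation: !Ref GlueScriptLocation
Publishing a new script version then comes down to something like aws cloudformation deploy --template-file glue-jobs.yaml --stack-name my-glue-stack --parameter-overrides GlueScriptLocation=s3://my-bucket/scripts/v2/my-job.py, which changes the parameter value and forces CloudFormation to update the job.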

aws emr add-steps a spark application

I'd like to add a step as a Spark application using the AWS CLI, but I cannot find a working command. The official AWS doc, https://docs.aws.amazon.com/cli/latest/reference/emr/add-steps.html, lists six examples, none of which is for Spark.
I can configure it through the AWS Console UI and it runs fine, but for efficiency I'd like to be able to do the same via the AWS CLI.
The closest that I could come up with is this command:
aws emr add-steps --cluster-id j-cluster-id --steps Type=SPARK,Name='SPARK APP',ActionOnFailure=CONTINUE,Jar=s3://my-test/RandomJava-1.0-SNAPSHOT.jar,MainClass=JavaParquetExample1,Args=s3://my-test/my-file_0000_part_00.parquet,my-test --profile my-test --region us-west-2
but this resulted in the following configuration of the EMR step:
JAR location : command-runner.jar
Main class : None
Arguments : spark-submit s3://my-test/my-file_0000_part_00.parquet my-test
Action on failure: Continue
which resulted in failure.
The correct one (completed successfully, configured through AWS Console UI) looks like this:
JAR location : command-runner.jar
Main class : None
Arguments : spark-submit --deploy-mode cluster --class sparkExamples.JavaParquetExample1 s3://my-test/RandomJava-1.0-SNAPSHOT.jar --s3://my-test/my-file_0000_part_00.parquet --my-test
Action on failure: Continue
Any help is greatly appreciated!
This seems to be working for me. I am adding a Spark application to a cluster with the step name My step name. Let's say you name the file step-addition.sh. Its content is the following:
#!/bin/bash
set -x

# Positional arguments: the cluster id, plus two values forwarded to the job via --conf
clusterId=$1
startDate=$2
endDate=$3

# Submit a Spark step; the Args list is what spark-submit receives,
# ending with the S3 path of the application jar.
aws emr add-steps --cluster-id "$clusterId" --steps Type=Spark,Name='My step name',\
ActionOnFailure=TERMINATE_CLUSTER,Args=[\
"--deploy-mode","cluster","--executor-cores","1","--num-executors","20","--driver-memory","10g","--executor-memory","3g",\
"--class","your-package-structure-like-com.a.b.c.JavaParquetExample1",\
"--master","yarn",\
"--conf","spark.driver.my.custom.config1=my-value-1",\
"--conf","spark.driver.my.custom.config2=my-value-2",\
"--conf","spark.driver.my.custom.config.startDate=${startDate}",\
"--conf","spark.driver.my.custom.config.endDate=${endDate}",\
"s3://my-bucket/my-prefix/path-to-your-actual-application.jar"]
You can execute the above script simply like this:
bash $WORK_DIR/step-addition.sh $clusterId $startDate $endDate
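Applying the same pattern to the jar and class from the question, a single inline command would look roughly like this (untested sketch; the cluster id, jar path, class name, and arguments are taken from the question as-is):
# Untested sketch; the values are the question's own placeholders.
aws emr add-steps \
  --cluster-id j-cluster-id \
  --steps Type=Spark,Name='SPARK APP',ActionOnFailure=CONTINUE,Args=[\
"--deploy-mode","cluster",\
"--class","sparkExamples.JavaParquetExample1",\
"s3://my-test/RandomJava-1.0-SNAPSHOT.jar",\
"s3://my-test/my-file_0000_part_00.parquet","my-test"] \
  --profile my-test --region us-west-2
EMR still shows command-runner.jar as the JAR location for such a step; the difference from the failing attempt is that --deploy-mode, --class, and the application jar are passed inside Args rather than through the Jar= and MainClass= fields.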

Error! No valid credentials source for S3 bucket, Terraform AWS docker-compose

I am trying to set up Terraform to create resources.
I need to add an AWS S3 bucket for storing Terraform state, a DynamoDB table for handling state locking, and an AWS ECR repository, so we can build and push our images.
I will set up the project to run Terraform through docker-compose to avoid dependencies.
I've created the S3 bucket and enabled versioning on it, so every time we add a new file to the bucket it stores the previous version of that file; if we update the file, we can revert to its previous version.
I didn't install Terraform on my local machine.
I have set up my credentials with aws-vault, using "aws-vault exec fouednajari --duration=12h".
[docker-compose file, main.tf, ~/.aws/config, and ~/.aws/credentials contents omitted]
But I got this error when trying to run my docker-compose command to initialize Terraform:
[error output omitted]
Please help me.
I have solved this by adding the access key and the secret key in the backend and in the provider "aws" section, and then exporting those variables along with the region variable. Thank you, God!
Terraform looks for credentials in the following order:
Static credentials
Environment variables
Shared credentials/configuration file
CodeBuild, ECS, and EKS Roles
EC2 Instance Metadata Service (IMDS and IMDSv2)
Try configuring your credentials with the "aws configure" command, or export those variables into the environment (and, since Terraform runs in a container here, pass them through to it, as in the sketch below).
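Since Terraform here runs inside a docker-compose container, the variables that aws-vault exec sets in the host shell also need to be forwarded into that container. A minimal sketch, assuming the service is called terraform and uses the official hashicorp/terraform image (service name, image tag, and mount paths are assumptions):
# Sketch only; service name, image tag, and mount paths are assumptions.
# Listing the variables without values passes the host's values (set by
# aws-vault exec) through to the container, where the S3 backend and the
# AWS provider can pick them up.
services:
  terraform:
    image: hashicorp/terraform:1.5
    working_dir: /workspace
    volumes:
      - .:/workspace
    environment:
      - AWS_ACCESS_KEY_ID
      - AWS_SECRET_ACCESS_KEY
      - AWS_SESSION_TOKEN
      - AWS_REGION
With that in place, something like aws-vault exec fouednajari -- docker-compose run --rm terraform init should let the backend find credentials without hard-coding keys.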

Transferring Cloud Custodian output json file to S3

I have a requirement: I am using Cloud Custodian to get resource metadata in a dev environment. I created one sample policy.yml file for EC2, like below:
policies:
  - name: my-first-policy
    resource: ec2
When I run this command from an EC2 instance:
custodian run --dryrun -s . policy.yml
I can see that a directory named "my-first-policy" has been created in the root directory. In this directory there is a file, resource.json, which includes all the details of the EC2 instances. I want to send this file to S3 whenever I run the Cloud Custodian command. How can I do this from the command line?
Is there any policy that could be written that would transfer the resource.json file to S3 whenever I run the command?
You can supply the S3 bucket as a value to the -s / --output-dir argument
custodian run --dryrun -s s3://mys3bucketpath policy.yml
Then you can see the output stored in S3 directly:
aws s3 ls s3://mys3bucketpath
References:
https://cloudcustodian.io/docs/aws/usage.html#s3-logs-records

"BundleType must be either YAML or Json" Error using Jenkins and AWS CodeDeploy

I am trying to deploy revisions to my AWS lambda functions using Jenkins and the AWS CodeDeploy add-on. I am able to build the project successfully and upload a zip of the project to an S3 bucket. At this point I receive the error:
BundleType must be either YAML or JSON
I have an appspec.yml file in my code directory. I am unsure if I need to instruct Jenkins to do something different, or if I need to instruct AWS to unzip the file and use it.
Today, CodeDeploy Lambda deployments only take a YAML or JSON file as the deployment revision input (which is just your AppSpec file). The CodeDeploy Jenkins plugin would need to be updated to support uploading a YAML or JSON file without zipping it: https://github.com/jenkinsci/aws-codedeploy-plugin/blob/master/src/main/java/com/amazonaws/codedeploy/AWSCodeDeployPublisher.java#L230
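As a possible workaround until the plugin handles this, a Jenkins shell step can upload the AppSpec file unzipped and create the deployment through the CLI directly. A rough sketch; the bucket, application, and deployment group names are placeholders:
# Rough sketch; bucket, application, and deployment group names are placeholders.
# The AppSpec file is uploaded as-is (not zipped), so bundleType can be YAML.
aws s3 cp appspec.yml s3://my-deploy-bucket/my-lambda-app/appspec.yml

aws deploy create-deployment \
  --application-name my-lambda-app \
  --deployment-group-name my-deployment-group \
  --revision 'revisionType=S3,s3Location={bucket=my-deploy-bucket,key=my-lambda-app/appspec.yml,bundleType=YAML}'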