We are currently updating a Glue job using CLI commands. In the console, we have the ability to add job parameters as key/value pairs.
I would like to replicate this in the CLI command. Currently, I have the following:
- name: Update Glue job
  run: |
    aws glue update-job --job-name "${{ env.notebook_name }}-job" \
      --job-update "Role=${{ env.glue_service_role }}, Command={Name=glueetl, ScriptLocation=${{ env.aws_s3_bucket }}/etl/${{ env.notebook_name }}_${GITHUB_SHA}.py}, DefaultArguments={'--job-bookmark-option':'job-bookmark-enable', '--enable-metrics': 'enable', '--enable-continuous-cloudwatch-log': 'enable'}" \
      --region ${{ env.region }}
My assumption is that I cannot add this job parameter under "DefaultArguments". I was using the following AWS doc: https://docs.aws.amazon.com/cli/latest/reference/glue/update-job.html. I did not see a job-parameters option.
What am I missing? Thank you!
Use default arguments if the values won't change. Otherwise, pass the arguments when triggering the Glue job from the CLI, something like this:
aws glue start-job-run --job-name my-job --arguments myarg='myvalue'
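For the quoting problems that come with the shorthand --job-update syntax in the question, one workaround is to pass the whole update as a JSON payload. This is a minimal sketch; the job name, role, and script path are placeholders, not values from the question's environment. The console's "Job parameters" map onto the DefaultArguments object:

```shell
#!/bin/bash
# Placeholder job name, role, and script path; the console's "Job parameters"
# correspond to the DefaultArguments object in the update payload.
cat > job-update.json <<'EOF'
{
  "Role": "my-glue-service-role",
  "Command": {
    "Name": "glueetl",
    "ScriptLocation": "s3://my-bucket/etl/my-notebook_abc123.py"
  },
  "DefaultArguments": {
    "--job-bookmark-option": "job-bookmark-enable",
    "--enable-metrics": "enable",
    "--enable-continuous-cloudwatch-log": "enable"
  }
}
EOF
# Guarded behind DRY_RUN so the payload can be inspected without AWS credentials.
if [ "${DRY_RUN:-1}" = "0" ]; then
  aws glue update-job --job-name "my-notebook-job" --job-update file://job-update.json
fi
```

A JSON file sidesteps the shell-quoting rules of the shorthand syntax entirely.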
I have created a config file containing the details of a Glue job. Running the following command on the AWS CLI works fine and creates the job in Glue: aws glue create-job --cli-input-json file://test.json
Now I would like to run this command through GitHub Actions: once the config file is committed to the repo, the workflow should create the Glue job. However, I am getting an error that the file does not exist; the command is not picking up the JSON file I mention in the create-job statement. The command running through GH Actions:
- name: create glue job
  run: aws glue create-job --cli-input-json test.json --debug
Question: how can I pass the config file name to this command in GH Actions so that it reads the file and creates the Glue job? Can the path be an S3 or Git location?
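A likely cause of the error: --cli-input-json expects a file:// URL for local files (the GH Actions command above omits the prefix), and the workflow also needs an actions/checkout step first so that test.json exists in the runner's workspace. A minimal sketch with a placeholder job definition standing in for the committed file:

```shell
#!/bin/bash
# Placeholder job definition standing in for the committed test.json.
cat > test.json <<'EOF'
{
  "Name": "my-glue-job",
  "Role": "my-glue-service-role",
  "Command": {
    "Name": "glueetl",
    "ScriptLocation": "s3://my-bucket/etl/script.py"
  }
}
EOF
# Note the file:// prefix; without it the CLI does not treat the value as a local file.
if [ "${DRY_RUN:-1}" = "0" ]; then
  aws glue create-job --cli-input-json file://test.json --debug
fi
```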
I'd like to add a step as a Spark application using the AWS CLI, but I cannot find a working command. The official AWS doc, https://docs.aws.amazon.com/cli/latest/reference/emr/add-steps.html, lists six examples, none of which is for Spark.
I can configure it through the AWS Console UI and it runs fine, but for efficiency I'd like to be able to do so via the AWS CLI.
The closest that I could come up with is this command:
aws emr add-steps --cluster-id j-cluster-id --steps Type=SPARK,Name='SPARK APP',ActionOnFailure=CONTINUE,Jar=s3://my-test/RandomJava-1.0-SNAPSHOT.jar,MainClass=JavaParquetExample1,Args=s3://my-test/my-file_0000_part_00.parquet,my-test --profile my-test --region us-west-2
but this resulted in this configuration on AWS EMR step:
JAR location : command-runner.jar
Main class : None
Arguments : spark-submit s3://my-test/my-file_0000_part_00.parquet my-test
Action on failure: Continue
which resulted in failure.
The correct one (completed successfully, configured through AWS Console UI) looks like this:
JAR location : command-runner.jar
Main class : None
Arguments : spark-submit --deploy-mode cluster --class sparkExamples.JavaParquetExample1 s3://my-test/RandomJava-1.0-SNAPSHOT.jar --s3://my-test/my-file_0000_part_00.parquet --my-test
Action on failure: Continue
Any help is greatly appreciated!
This works for me. I am adding a Spark application to a cluster with the step name My step name. Say you name the file step-addition.sh; its content is the following:
#!/bin/bash
set -x
#cluster id
clusterId=$1
startDate=$2
endDate=$3
aws emr add-steps --cluster-id $clusterId --steps Type=Spark,Name='My step name',\
ActionOnFailure=TERMINATE_CLUSTER,Args=[\
"--deploy-mode","cluster","--executor-cores","1","--num-executors","20","--driver-memory","10g","--executor-memory","3g",\
"--class","your-package-structure-like-com.a.b.c.JavaParquetExample1",\
"--master","yarn",\
"--conf","spark.driver.my.custom.config1=my-value-1",\
"--conf","spark.driver.my.custom.config2=my-value-2",\
"--conf","spark.driver.my.custom.config.startDate=${startDate}",\
"--conf","spark.driver.my.custom.config.endDate=${endDate}",\
"s3://my-bucket/my-prefix/path-to-your-actual-application.jar"]
You can execute the above script simply like this:
bash $WORK_DIR/step-addition.sh $clusterId $startDate $endDate
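As an optional follow-up (not part of the script above): add-steps prints the new step ids, which you can capture and poll until the step finishes. The cluster id, step id, class name, and jar path below are placeholders; the aws calls are guarded behind DRY_RUN so the sketch can be read without a cluster or credentials.

```shell
#!/bin/bash
# Placeholder ids; replace with real values before running with DRY_RUN=0.
clusterId="j-XXXXXXXX"
stepId="s-XXXXXXXX"
if [ "${DRY_RUN:-1}" = "0" ]; then
  # add-steps returns the new StepIds; capture the first one with --query
  stepId=$(aws emr add-steps --cluster-id "$clusterId" \
    --steps Type=Spark,Name='My step name',ActionOnFailure=CONTINUE,\
Args=["--deploy-mode","cluster","--class","com.example.Main","s3://my-bucket/app.jar"] \
    --query 'StepIds[0]' --output text)
  # block until the step finishes, then print its final state
  aws emr wait step-complete --cluster-id "$clusterId" --step-id "$stepId"
  aws emr describe-step --cluster-id "$clusterId" --step-id "$stepId" \
    --query 'Step.Status.State' --output text
fi
```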
How can we create a Glue job using CLI commands? Can I have some sample code? Thanks!
Refer to this link, which talks about creating AWS Glue resources using the CLI (the blog is in Japanese). The following is a sample that creates a Glue job using the CLI:
aws glue create-job \
--name ${GLUE_JOB_NAME} \
--role ${ROLE_NAME} \
--command "Name=glueetl,ScriptLocation=s3://${SCRIPT_BUCKET_NAME}/${ETL_SCRIPT_FILE}" \
--connections Connections=${GLUE_CONN_NAME} \
--default-arguments file://${DEFAULT_ARGUMENT_FILE}
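The file behind ${DEFAULT_ARGUMENT_FILE} is plain JSON mapping argument names to values. The keys and values below are illustrative only, not taken from the blog:

```shell
#!/bin/bash
# Illustrative content for the default-arguments file; keys and values are examples.
cat > default-arguments.json <<'EOF'
{
  "--job-bookmark-option": "job-bookmark-enable",
  "--enable-metrics": "true",
  "--TempDir": "s3://my-bucket/temp/"
}
EOF
python3 -m json.tool default-arguments.json > /dev/null  # sanity-check the JSON
```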
Follow the documentation and post the error, if any.
Link to docs
https://docs.aws.amazon.com/cli/latest/reference/glue/create-job.html
I have multiple AWS SQS queues; some have tags and some do not. Now I want to add tags to the ones that do not have them. We have the CLI command aws sqs tag-queue to add tags to a single queue. Is it possible to add multiple tags to multiple queues through a single CLI command, or would I have to write a script?
The CLI command aws sqs tag-queue can only target a single queue. However, you can write a script that loops over all queues and calls tag-queue for each:
#!/bin/bash
for url in $(aws sqs list-queues --output text --query 'QueueUrls')
do
  aws sqs tag-queue --queue-url "$url" --tags YourKey=YourValue
done
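Since the goal is to tag only the queues that do not already have tags, the loop can check list-queue-tags first. A sketch, with a placeholder tag key/value; the aws calls are guarded behind DRY_RUN so the script can be read without credentials:

```shell
#!/bin/bash
# Placeholder tag; replace before running with DRY_RUN=0.
TAG_KEY="Team"
TAG_VALUE="MyTeam"
if [ "${DRY_RUN:-1}" = "0" ]; then
  for url in $(aws sqs list-queues --output text --query 'QueueUrls'); do
    # an untagged queue yields empty/None output from this query
    tags=$(aws sqs list-queue-tags --queue-url "$url" --query 'Tags' --output text)
    if [ -z "$tags" ] || [ "$tags" = "None" ]; then
      aws sqs tag-queue --queue-url "$url" --tags "${TAG_KEY}=${TAG_VALUE}"
    fi
  done
fi
```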
You can use the resourcegroupstaggingapi tag-resources command to tag (almost) any arbitrary resources you'd like, not just SQS queues!
For example,
aws resourcegroupstaggingapi tag-resources \
--resource-arn-list arn:aws:sqs:us-east-2:123456789:foobarqueue arn:aws:sqs:us-east-2:123456789:fizzbuzzqueue \
--tags Foo=Bar,Fizz=Buzz
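To build the --resource-arn-list in the first place, the same API can discover the queue ARNs: get-resources filtered to the sqs service returns each ARN along with its current tags. A sketch, guarded so it can be inspected without credentials:

```shell
#!/bin/bash
# Assemble the invocation; the service filter "sqs" limits results to queue ARNs.
cmd=(aws resourcegroupstaggingapi get-resources
     --resource-type-filters sqs
     --query 'ResourceTagMappingList[].ResourceARN'
     --output text)
if [ "${DRY_RUN:-1}" = "0" ]; then
  "${cmd[@]}"
fi
echo "${cmd[*]}"
```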
Note that you can also do this in the console, if you find that easier, by clicking Resource Groups > Tag Editor at the top of the console.
This worked for me using the AWS CLI:
queueurl=https://sqs.us-west-2.amazonaws.com/<yourAWSaccountnumber>/L7Cn
aws sqs tag-queue --queue-url $queueurl --tags Key1=Value1,Key2=Value2
I recommend creating tags when initially creating the SQS queue using the AWS CLI:
queuename=trymeout
aws sqs create-queue --queue-name $queuename --tags Key1=Value1,Key2=Value2
Then check your tags after creating the SQS queue using the AWS CLI:
aws sqs list-queue-tags --queue-url https://sqs.us-west-2.amazonaws.com/<yourAWSaccountnumber>/trymeout
Make sure you have the newest AWS CLI installed, since older versions can contain bugs. To get my AWS CLI version I run aws --version; today I get:
aws-cli/2.7.14 Python/3.9.11 Darwin/21.5.0 exe/x86_64 prompt/off
I am using the following AWS CLI CloudFormation commands to create and then execute a change set:
aws cloudformation create-change-set --change-set-name change-set-1
aws cloudformation execute-change-set --change-set-name change-set-1
However, the first command returns before the change set has been created, so if I execute the second command immediately, it fails.
Solutions I have considered:
Adding a delay between the two commands.
Repeating the second command until it succeeds.
Both of these have their problems.
Ideally there would be an option on the create-change-set command to execute immediately, or to run synchronously and not return until the change set has been created.
Has anyone ever tried this and come up with a better solution than me?
I haven't personally tried it, but maybe you could use the list-change-sets command to loop until your change set has status CREATE_COMPLETE, and then execute your second command.
Hope this helps.
I solved this issue by using the following sequence:
aws cloudformation create-change-set
aws cloudformation wait change-set-create-complete
aws cloudformation execute-change-set
aws cloudformation wait stack-create-complete
Hope it helps.
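Spelled out with the required flags, that sequence looks like this (stack name, change set name, and template path are placeholders); wait change-set-create-complete blocks until the change set is ready, so no sleeps or retries are needed:

```shell
#!/bin/bash
# Placeholder names; replace before running with DRY_RUN=0.
STACK="myStack"
CHANGESET="change-set-1"
if [ "${DRY_RUN:-1}" = "0" ]; then
  aws cloudformation create-change-set \
    --stack-name "$STACK" \
    --change-set-name "$CHANGESET" \
    --template-body file://template.yaml
  # blocks until the change set reaches CREATE_COMPLETE
  aws cloudformation wait change-set-create-complete \
    --stack-name "$STACK" --change-set-name "$CHANGESET"
  aws cloudformation execute-change-set \
    --stack-name "$STACK" --change-set-name "$CHANGESET"
  # use stack-update-complete instead when changing an existing stack
  aws cloudformation wait stack-create-complete --stack-name "$STACK"
fi
```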
If you don't require the intermediate step of creating a change set and then executing it (as we didn't), then use the update-stack subcommand.
aws cloudformation update-stack --stack-name myStack --template-url ...
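If you take the update-stack route, the analogous synchronous step is the stack-update-complete waiter. A sketch with a placeholder stack name and a hypothetical template URL:

```shell
#!/bin/bash
# Placeholder stack name; the template URL is hypothetical.
STACK_NAME="myStack"
TEMPLATE_URL="https://s3.amazonaws.com/my-bucket/template.yaml"
if [ "${DRY_RUN:-1}" = "0" ]; then
  aws cloudformation update-stack --stack-name "$STACK_NAME" --template-url "$TEMPLATE_URL"
  # returns once the update has finished (or fails if the stack rolled back)
  aws cloudformation wait stack-update-complete --stack-name "$STACK_NAME"
fi
```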