I'm looking to trigger a Glue job from SNS without using Lambda. Is this possible?
Related
I am running several ETL jobs in an AWS Glue workflow, and I want to notify the business by email when any of them fails. I need help getting the name of the failed job and the error that caused the failure, and passing them to a job that sends an email through Amazon SES.
It has to be done using only a Glue workflow: a trigger starts a second job that reads the output message from the first job and sends the email. This needs to work without EventBridge.
Is it possible to call a Glue job or a Python script from within another Glue job without going through the Glue endpoint and adding a new rule to the security group?
You can use EventBridge for that. EventBridge supports Glue events.
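As a minimal sketch of what that could look like in Python, the helper below builds an event pattern for failed Glue job runs and registers it as a rule. The pattern fields (`source`, `detail-type`, `detail.state`, `detail.jobName`) follow the Glue job state change events as I understand them; the rule name and job names are hypothetical, and the rule still needs a target (for example via `put_targets`) before it does anything.

```python
import json


def failed_glue_job_pattern(job_names=None):
    """Build an EventBridge event pattern matching failed Glue job runs."""
    detail = {"state": ["FAILED"]}
    if job_names:
        # Optionally restrict the rule to specific (hypothetical) job names.
        detail["jobName"] = list(job_names)
    return {
        "source": ["aws.glue"],
        "detail-type": ["Glue Job State Change"],
        "detail": detail,
    }


def create_failure_rule(events_client, rule_name, job_names=None):
    """Create or update the rule; events_client is a boto3 'events' client."""
    return events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(failed_glue_job_pattern(job_names)),
        State="ENABLED",
    )


if __name__ == "__main__":
    # Prints the pattern only; no AWS call is made here.
    print(json.dumps(failed_glue_job_pattern(["my_etl_job"]), indent=2))
```

With real credentials you would pass `boto3.client("events")` as `events_client`.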
I have created an AWS Glue Trigger as part of the AWS Glue Workflow that runs on a periodic basis. I have successfully set the periodic schedule via the trigger with no problems, but now I need to adjust the schedule. Is there a way for me to directly edit the schedule of the trigger without recreating the entire AWS Glue Workflow?
I tried modifying it directly from the AWS Glue trigger console, but I couldn't get it done because the console requires me to choose a Glue job for the trigger to run. That does not apply to my case, since the trigger should start a crawler instead of a Glue job.
Answering my own question for others' reference:
Currently, there is no way to edit it directly in the AWS Glue console. But I was able to do it without recreating the entire Glue workflow by using the AWS CLI for Glue:
aws glue update-trigger --name "us_im_bol-cl-t0-prod-tg" --cli-input-json '{"TriggerUpdate":{"Name":"us_im_bol-cl-t0-prod-tg","Schedule":"cron(0 14 * * ? *)","Actions":[{"CrawlerName":"us_im_bol-t0-prod-cl"}]}}'
Just update the cron rule for the "Schedule" property.
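If you prefer Python over the CLI, the same call can be sketched with boto3's `update_trigger`. This is only a rough equivalent of the command above: the helper names are my own, and the trigger and crawler names are the ones from my example.

```python
def build_trigger_update(trigger_name, cron_expression, crawler_name):
    """Build the keyword arguments for glue.update_trigger."""
    return {
        "Name": trigger_name,
        "TriggerUpdate": {
            "Name": trigger_name,
            "Schedule": cron_expression,
            # The trigger starts a crawler, not a job.
            "Actions": [{"CrawlerName": crawler_name}],
        },
    }


def reschedule_crawler_trigger(glue_client, trigger_name, cron_expression, crawler_name):
    """Apply the new schedule; glue_client is a boto3 'glue' client."""
    return glue_client.update_trigger(
        **build_trigger_update(trigger_name, cron_expression, crawler_name)
    )


if __name__ == "__main__":
    import boto3  # requires AWS credentials and a region to be configured

    reschedule_crawler_trigger(
        boto3.client("glue"),
        "us_im_bol-cl-t0-prod-tg",
        "cron(0 14 * * ? *)",
        "us_im_bol-t0-prod-cl",
    )
```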
I am learning about a wonderful tool called AWS CloudFormation and I am having a hard time finding resources that explain how to trigger an AWS Glue job via SQS.
I learnt about Glue Triggers from here. How do I trigger a Glue job whenever a message is sent to an SQS queue?
Any help or guidance is appreciated.
Currently, SQS cannot trigger a Glue job directly.
What you could do, though, is write a Lambda function that is triggered by your SQS queue.
In this Lambda function you can call the Glue SDK to start your Glue job.
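A minimal sketch of such a Lambda function in Python could look like this. It assumes the job name comes from a hypothetical `GLUE_JOB_NAME` environment variable and that you want to pass each message body to the job as a hypothetical `--sqs_message_body` argument; adjust both to your setup.

```python
import os


def make_handler(glue_client, job_name):
    """Return a handler that starts one Glue job run per SQS record."""

    def handler(event, context):
        run_ids = []
        for record in event.get("Records", []):
            response = glue_client.start_job_run(
                JobName=job_name,
                # Job argument values must be strings; the SQS body already is one.
                Arguments={"--sqs_message_body": record["body"]},
            )
            run_ids.append(response["JobRunId"])
        return {"started": run_ids}

    return handler


def lambda_handler(event, context):
    """Lambda entry point; boto3 is available in the Lambda Python runtime."""
    import boto3

    handler = make_handler(boto3.client("glue"), os.environ["GLUE_JOB_NAME"])
    return handler(event, context)
```

Passing the client into `make_handler` keeps the logic testable without AWS access.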
I have 30 Glue jobs that I want to run in parallel. If one job fails, the others must continue. I started with Step Functions, creating a state machine that executes a runner Lambda function, which in turn triggers a Glue job depending on a parameter (the name of the Glue job). For a single job there is a decent amount of Step Functions logic implemented (retries, error handling, etc.).
Is there any way to execute a state machine from another state machine? That way I could have 30 parallel tasks that each execute another state machine. If you have any suggestions, please feel free to share.
AWS recommends using SNS for a fan-out architecture to run parallel jobs from a single S3 event, as you get an overlap error if two Lambda functions try to use the same S3 event.
You basically send the S3 event to SNS and subscribe your 30 Lambda functions to the topic, so they all trigger from the SNS notification (containing the details of the S3 event) when it's published.
Create the Topic
Update the Topic Policy to allow Event Notifications from an S3 Bucket
Configure the S3 Bucket to send Event Notifications to the SNS Topic
Create the parallel Lambda functions, one for each job
Modify the Lambda functions to process SNS messages of S3 event notifications instead of the S3 event itself
https://aws.amazon.com/blogs/compute/fanout-s3-event-notifications-to-multiple-endpoints/
There is also another nice example with a CloudFormation template: https://aws.amazon.com/blogs/compute/messaging-fanout-pattern-for-serverless-architectures-using-amazon-sns/
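The last step above (processing the SNS message instead of the S3 event itself) can be sketched roughly as follows. The S3 event notification arrives as a JSON string in `Records[].Sns.Message`, so the function has to parse it before reading the bucket and key. The function names are my own, and what you do with each object is up to you.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sns_event):
    """Return (bucket, key) pairs from an SNS event that wraps S3 notifications."""
    objects = []
    for sns_record in sns_event.get("Records", []):
        # The original S3 event is a JSON string inside the SNS message.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        # Messages without "Records" (e.g. a test notification) yield nothing.
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded in S3 event notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Lambda entry point: list the objects this invocation should process."""
    objects = extract_s3_objects(event)
    return {"objects": [{"bucket": b, "key": k} for b, k in objects]}
```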
I'm trying to figure out how to automatically kick off an AWS Glue Job when an AWS Glue Crawler completes. I see that the Crawlers send events when they complete, but I'm struggling to parse through the documentation to figure out how to listen to that event and then launch the AWS Glue Job.
This seems like a fairly simple question, but I haven't been able to find any leads so far. I'd appreciate some help. Thanks in advance!
You can create a CloudWatch Events rule, choose Glue Crawler State Change as the event source and a Lambda function as the event target, and in the Lambda function use boto3 (or another language's SDK) to start the job.
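A rough sketch of that Lambda function in Python is below. It assumes the event carries `detail.crawlerName` and `detail.state` (with `Succeeded` on success), which is how I understand the crawler state change events; the `CRAWLER_NAME` and `GLUE_JOB_NAME` environment variables are hypothetical.

```python
import os


def should_start_job(event, crawler_name):
    """True if the event says the given crawler finished successfully."""
    detail = event.get("detail", {})
    return (
        event.get("source") == "aws.glue"
        and event.get("detail-type") == "Glue Crawler State Change"
        and detail.get("crawlerName") == crawler_name
        and detail.get("state") == "Succeeded"
    )


def handle_crawler_event(event, glue_client, crawler_name, job_name):
    """Start the job and return its run id, or None if the event is ignored."""
    if not should_start_job(event, crawler_name):
        return None
    return glue_client.start_job_run(JobName=job_name)["JobRunId"]


def lambda_handler(event, context):
    """Lambda entry point; boto3 is available in the Lambda Python runtime."""
    import boto3

    return handle_crawler_event(
        event,
        boto3.client("glue"),
        os.environ["CRAWLER_NAME"],
        os.environ["GLUE_JOB_NAME"],
    )
```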
Use an AWS Glue trigger.
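For example, a conditional trigger that starts a job once a crawler succeeds could be sketched with boto3 like this. All names are hypothetical. One caveat, as far as I understand the Glue documentation: a conditional trigger only fires when the crawler it watches was itself started by a trigger, not when it was run ad hoc.

```python
def build_conditional_trigger(trigger_name, crawler_name, job_name):
    """Build the keyword arguments for glue.create_trigger."""
    return {
        "Name": trigger_name,
        "Type": "CONDITIONAL",
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "CrawlerName": crawler_name,
                    "CrawlState": "SUCCEEDED",
                }
            ]
        },
        # Start this job when the condition above is met.
        "Actions": [{"JobName": job_name}],
        # Activate the trigger immediately instead of leaving it deactivated.
        "StartOnCreation": True,
    }


def create_crawler_success_trigger(glue_client, trigger_name, crawler_name, job_name):
    """Create the trigger; glue_client is a boto3 'glue' client."""
    return glue_client.create_trigger(
        **build_conditional_trigger(trigger_name, crawler_name, job_name)
    )
```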
For anything involving more than two steps, I'd recommend using AWS Glue Workflows. They are formed by chaining Glue jobs, crawlers and triggers together into a workflow that can be visualised and monitored easily.