Trigger Lambda based on Crawler output


I have a setup in which I need to trigger a Lambda function once my Glue crawler has run and the data is ready in Redshift. Is there a way to create such a trigger?
Edit:
I added an EventBridge rule for crawler state changes. It works and triggers the Lambda function, but it fires when any of my crawlers runs. I want it to trigger only after a specific crawler has run. I tested with the pattern below, but it doesn't seem to pick up my crawler name. Is there another way to specify the crawler name in the rule, or am I making a syntax error?
{
"source": ["aws.glue"],
"detail-type": ["Glue Crawler State Change"],
"eventName": "crawler_name",
"detail": {
"state": ["Succeeded"]
}
}

Solution: Add an EventBridge rule with the following event pattern. The crawler name goes inside "detail" as "crawlerName", with its value given as an array:
{
"source": ["aws.glue"],
"detail-type": ["Glue Crawler State Change"],
"detail": {
"crawlerName": ["newton_pfi_new_raw_to_source"],
"state": ["Succeeded"]
}
}
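To illustrate why the filter has to sit under "detail", here is a minimal sketch of a Lambda handler for this rule. The handler name, the return shape, and the sample event values are assumptions made for illustration; only the "crawlerName" and "state" fields come from the pattern above.

```python
def handle_crawler_event(event, context=None):
    """Summarise a Glue Crawler State Change event (illustrative handler name)."""
    detail = event.get("detail", {})
    # crawlerName and state are nested under "detail", which is why the
    # rule has to filter on them there rather than at the top level.
    crawler = detail.get("crawlerName")
    state = detail.get("state")
    if event.get("source") != "aws.glue" or state != "Succeeded":
        return {"handled": False, "crawler": crawler, "state": state}
    # Placeholder: the real work (for example querying Redshift) would go here.
    return {"handled": True, "crawler": crawler, "state": state}


# Trimmed-down sample event; real events carry more fields.
sample_event = {
    "source": "aws.glue",
    "detail-type": "Glue Crawler State Change",
    "detail": {"crawlerName": "newton_pfi_new_raw_to_source", "state": "Succeeded"},
}
result = handle_crawler_event(sample_event)
```

With the rule above in place, the handler should only ever see the one crawler; the checks inside it are a defensive second layer rather than a replacement for the rule.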

Related

Trigger script automatically on EC2 creation (no user data)

Every time an EC2 instance gets created, I want to run a script on that instance. I understand this could be done using the user_data parameter, but some of these instances are created manually, so people may sometimes forget to fill in that parameter. I want to rely on something automatic instead.
My idea was to do it with EventBridge: catch an event indicating that an instance has been created, then trigger a Lambda function that runs the script. But when looking through the documentation I couldn't find any event that relates to "EC2 created"; see https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/monitoring-instance-state-changes.html.
Any idea how to get this done?

Create an EventBridge rule with the following pattern to catch the event:
{
"source": ["aws.ec2"],
"detail-type": ["AWS API Call via CloudTrail"],
"detail": {
"eventSource": ["ec2.amazonaws.com"],
"eventName": ["RunInstances"]
}
}
and configure the target of the rule to be an AWS Lambda function. Configure the Lambda function to parse the event and invoke an SSM Run Command against the instance.
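A minimal sketch of such a Lambda function follows. It assumes the CloudTrail record lists the new instances under detail.responseElements.instancesSet.items, and it uses the AWS-RunShellScript document with a hypothetical script path; adjust both to your setup. A freshly launched instance may not yet be registered with SSM when this event arrives, so a real function would likely need a retry or a delay.

```python
def instance_ids_from_run_instances(event):
    """Extract instance IDs from a RunInstances CloudTrail event."""
    detail = event.get("detail", {})
    if detail.get("eventName") != "RunInstances":
        return []
    # responseElements can be null when the API call itself failed.
    response = detail.get("responseElements") or {}
    items = (response.get("instancesSet") or {}).get("items", [])
    return [item["instanceId"] for item in items if "instanceId" in item]


def lambda_handler(event, context):
    # Imported here so the parsing helper can be exercised without the AWS SDK.
    import boto3

    instance_ids = instance_ids_from_run_instances(event)
    if not instance_ids:
        return {"commanded": []}
    ssm = boto3.client("ssm")
    ssm.send_command(
        InstanceIds=instance_ids,
        DocumentName="AWS-RunShellScript",
        # Hypothetical script path, used only for illustration.
        Parameters={"commands": ["/opt/bootstrap.sh"]},
    )
    return {"commanded": instance_ids}
```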

In my case I have an EventBridge Rule with the following detail:
{
"detail-type": ["EC2 Instance State-change Notification"],
"detail": {
"state": ["running"]
},
"source": ["aws.ec2"]
}
And my target is a lambda function that runs an SSM document on that instance.
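For the state-change variant, the instance ID is carried in the "instance-id" field of "detail". The helper below is a small sketch (its name is an assumption) that pulls it out. Note that the "running" state is also reported when a stopped instance is started again, so this rule fires on more than first creation.

```python
def running_instance_id(event):
    """Return the instance ID if the event reports a transition to 'running'."""
    if event.get("detail-type") != "EC2 Instance State-change Notification":
        return None
    detail = event.get("detail", {})
    if detail.get("state") != "running":
        return None
    return detail.get("instance-id")


# Trimmed-down sample event with a made-up instance ID.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}
instance_id = running_instance_id(sample_event)
```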

How do you make one AWS sagemaker pipeline trigger another one?

I know you can trigger SageMaker pipelines with all kinds of events. Can you also trigger a SageMaker pipeline when another pipeline finishes its execution?

Yes, use an Amazon EventBridge event, such as
{
"source": [
"aws.sagemaker"
],
"detail-type": [
"SageMaker Model Building Pipeline Execution Status Change"
],
"detail": {
"currentPipelineExecutionStatus": [
"Succeeded"
]
}
}
Then set the next pipeline as the target of the EventBridge rule.

You can use any EventBridge event to trigger a pipeline. Since EventBridge supports the SageMaker pipeline execution status change as an event, you should be able to trigger one pipeline from another.

Yes, use an Amazon EventBridge event, such as
{
"source": ["aws.sagemaker"],
"detail-type": ["SageMaker Model Building Pipeline Execution Status Change"],
"detail": {
"currentPipelineExecutionStatus": ["Succeeded"],
"previousPipelineExecutionStatus": ["Executing"],
"pipelineArn": ["your-pipelineArn"]
}
}
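The answers above point the rule directly at the next pipeline. As an alternative, if some logic is needed in between, the rule can target a Lambda function that starts the next pipeline itself. The sketch below assumes the pipeline name is the last segment of "pipelineArn", and the pipeline names in PIPELINE_CHAIN are hypothetical.

```python
# Hypothetical mapping: finished pipeline name -> pipeline to start next.
PIPELINE_CHAIN = {"preprocess-pipeline": "training-pipeline"}


def next_pipeline_for(event, chain):
    """Return the name of the pipeline to start, or None if nothing should run."""
    detail = event.get("detail", {})
    if detail.get("currentPipelineExecutionStatus") != "Succeeded":
        return None
    # Assumes the ARN ends with "pipeline/<name>".
    name = detail.get("pipelineArn", "").rsplit("/", 1)[-1]
    return chain.get(name)


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without the AWS SDK.
    import boto3

    next_name = next_pipeline_for(event, PIPELINE_CHAIN)
    if next_name is None:
        return {"started": None}
    sagemaker = boto3.client("sagemaker")
    response = sagemaker.start_pipeline_execution(PipelineName=next_name)
    return {"started": response["PipelineExecutionArn"]}
```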

Execute my lambda when all the glue crawlers have run

I have a requirement to trigger my Lambda function only when all of my Glue crawlers have run and my data is ready to be queried in Redshift.
I have set up the following AWS CloudWatch rule, but it triggers the Lambda function as soon as any one of the crawlers has succeeded.
{
"detail-type": [
"Glue Crawler State Change"
],
"source": [
"aws.glue"
],
"detail": {
"crawlerName": [
"crw-raw-recon-the-hive-ces-cashflow",
"crw-raw-recon-the-hive-ces-position",
"crw-raw-recon-the-hive-ces-trade",
"crw-raw-recon-the-hive-ces-movement",
"crw-raw-recon-the-hive-ces-inventory"
],
"state": [
"Succeeded"
]
}
}
My question: is there a way to ensure the Lambda function is triggered only when all of them have succeeded?
Also, I am not sure whether Redshift generates any similar events when it receives data.
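An event pattern matches one event at a time, so a rule listing several crawler names fires for each of them. One possible workaround, sketched here under assumptions rather than taken from the thread, is to let the Lambda function run on every "Succeeded" event and continue only once every crawler's last run has succeeded. The function names are illustrative, and the check can be fooled by results left over from an earlier batch, so a real version would probably also compare completion times.

```python
CRAWLERS = [
    "crw-raw-recon-the-hive-ces-cashflow",
    "crw-raw-recon-the-hive-ces-position",
    "crw-raw-recon-the-hive-ces-trade",
    "crw-raw-recon-the-hive-ces-movement",
    "crw-raw-recon-the-hive-ces-inventory",
]


def all_crawlers_succeeded(crawler_names, get_last_status):
    """True only if every crawler's most recent run reports SUCCEEDED."""
    return all(get_last_status(name) == "SUCCEEDED" for name in crawler_names)


def glue_last_status(name):
    """Look up the status of a crawler's last run through the Glue API."""
    # Imported here so the check above can be exercised without the AWS SDK.
    import boto3

    crawler = boto3.client("glue").get_crawler(Name=name)["Crawler"]
    return crawler.get("LastCrawl", {}).get("Status")


def lambda_handler(event, context):
    if not all_crawlers_succeeded(CRAWLERS, glue_last_status):
        return {"ready": False}
    # Placeholder: all crawlers are done, so the downstream work would start here.
    return {"ready": True}
```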

How can I view the log of cloudwatch rule?

I created a rule in CloudWatch to trigger a Lambda function when a Glue job changes state. The rule pattern is defined as:
{
"detail-type": [
"Glue Job State Change"
],
"source": [
"aws.glue"
]
}
In the "Show metrics for the rule" view I can see that there is one FailedInvocation, but I can't find a way to see why the invocation failed. I have checked the Lambda function's log, but the function is not being called. So how can I view the log of the failed invocation?

Trigger Lambda when new message arrives to SQS

I'm new to AWS, and here is the task I'm trying to solve.
An SQS queue is set up, and from time to time new messages arrive in it. I want to set up a Lambda function that retrieves those messages and performs some business logic on their content.
From searching the AWS site and the Internet in general, I understood that SQS itself can't be a trigger for Lambda, so I need to set up CloudWatch to trigger the Lambda function on a schedule (every minute, for example). Here is a code example from the AWS GitHub showing how to consume a message.
So far so good. Now, when creating the Lambda function itself, I need to specify the input type in order to implement the RequestHandler interface:
public interface RequestHandler<I, O> {
O handleRequest(I var1, Context var2);
}
But my Lambda function is not expecting any input; it will go to SQS on its own and pull the messages. Does it make any sense to have an input?
Can I leave it void, or even use some other method signature altogether (not implementing that interface in this case, of course)?

Here your Lambda function will receive the CloudWatch event that triggered it.
You might not be interested in it, but there can be cases where the function wants to know the trigger details, even when the trigger is just a CloudWatch scheduled rule.
The following is an example event:
{
"version": "0",
"id": "53dc4d37-cffa-4f76-80c9-8b7d4a4d2eaa",
"detail-type": "Scheduled Event",
"source": "aws.events",
"account": "123456789012",
"time": "2015-10-08T16:53:06Z",
"region": "us-east-1",
"resources": ["arn:aws:events:us-east-1:123456789012:rule/my-scheduled-rule"],
"detail": {}
}
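The polling approach described in the question can be sketched as follows, in Python rather than Java for brevity. The handler accepts the scheduled event but ignores it and pulls from the queue itself. QUEUE_URL and process_body are hypothetical placeholders. Lambda has since gained native SQS event source mappings, which may make the schedule unnecessary.

```python
# Hypothetical queue URL, used only for illustration.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"


def drain_queue(sqs, queue_url, process, max_batches=5):
    """Receive up to max_batches batches, process each message, then delete it."""
    handled = 0
    for _ in range(max_batches):
        response = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=1
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # queue is empty for now
        for message in messages:
            process(message["Body"])
            # Delete only after processing so a failure leaves the message queued.
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
            handled += 1
    return handled


def process_body(body):
    # Placeholder for the business logic.
    print(body)


def lambda_handler(event, context):
    # The scheduled event carries nothing needed here, so it is ignored.
    import boto3  # imported here so drain_queue can be tested with a stand-in client

    return {"handled": drain_queue(boto3.client("sqs"), QUEUE_URL, process_body)}
```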
