Quick question: is it possible to trigger the execution of a Step Function after an SQS message was sent? If so, how would you specify it in the CloudFormation YAML file?
Thanks in advance.
The first thing to consider is this: do you really need to use SQS to start a Step Functions state machine? Could you use API Gateway instead? Or could you write your messages to an S3 bucket and use CloudWatch Events to start a state machine?
If you must use SQS, then you will need a Lambda function to act as a proxy: set up the queue as a Lambda trigger, then write a Lambda function that parses the SQS message and makes the appropriate call to the Step Functions StartExecution API.
I’m on mobile, so I can’t type up the YAML right now, but if you need it, I can try to update with it later. For now, here is a detailed walkthrough of how to invoke a Step Functions state machine from Lambda (including example YAML), and here is a walkthrough of how to use CloudFormation to set up SQS to trigger a Lambda.
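Update: here is a minimal CloudFormation sketch of that setup. All resource names are placeholders, the state machine ARN is passed in as a parameter, and the inline handler assumes the message body is valid JSON; adjust for your own stack:

Parameters:
  StateMachineArn:
    Type: String   # ARN of the state machine to start (placeholder)

Resources:
  MyQueue:
    Type: AWS::SQS::Queue

  ProxyFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { Service: lambda.amazonaws.com }
            Action: sts:AssumeRole
      ManagedPolicyArns:
        # Grants the SQS permissions the trigger needs (ReceiveMessage, DeleteMessage, GetQueueAttributes)
        - arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole
      Policies:
        - PolicyName: StartExecution
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: states:StartExecution
                Resource: !Ref StateMachineArn

  ProxyFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt ProxyFunctionRole.Arn
      Environment:
        Variables:
          STATE_MACHINE_ARN: !Ref StateMachineArn
      Code:
        ZipFile: |
          import boto3, os
          sfn = boto3.client("stepfunctions")
          def handler(event, context):
              # One SQS batch can contain several messages; start one execution per message.
              # Assumes each message body is already valid JSON.
              for record in event["Records"]:
                  sfn.start_execution(
                      stateMachineArn=os.environ["STATE_MACHINE_ARN"],
                      input=record["body"])

  QueueTrigger:
    Type: AWS::Lambda::EventSourceMapping
    Properties:
      EventSourceArn: !GetAtt MyQueue.Arn
      FunctionName: !Ref ProxyFunction
      BatchSize: 1

With BatchSize: 1 each message starts exactly one execution; with larger batches the loop still starts one execution per message, but a failure in the handler will retry the whole batch.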
EventBridge Pipes (launched at re:Invent 2022) lets you trigger Step Functions state machines without the need for a Lambda function.
https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-pipes.html
You can find an example here:
https://github.com/aws-samples/aws-stepfunctions-examples/blob/main/sam/demo-trigger-stepfunctions-from-sqs/template.yaml
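The heart of that template is a single AWS::Pipes::Pipe resource. A trimmed sketch, where the queue, state machine, and pipe role are assumed to be defined elsewhere in the stack:

Resources:
  SqsToStepFunctionsPipe:
    Type: AWS::Pipes::Pipe
    Properties:
      RoleArn: !GetAtt PipeRole.Arn      # role must allow reading from the queue and states:StartExecution
      Source: !GetAtt SourceQueue.Arn    # the SQS queue defined elsewhere in the stack
      Target: !Ref TargetStateMachine    # the state machine defined elsewhere in the stack
      TargetParameters:
        StepFunctionStateMachineParameters:
          InvocationType: FIRE_AND_FORGET   # start the execution asynchronously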
In my existing solution I have two Lambda functions, each triggered by a different SQS message, that create a folder structure in S3.
Now I have a requirement to use a single SQS message to trigger both Lambda functions.
Is it possible to trigger multiple Lambda functions via a single SQS message? If yes, can you please explain the process and how efficient it would be?
If there is any other approach I can follow, please let me know.
Thanks!
No, you can't do that directly. The best way is to create a fan-out setup with SNS + two SQS queues.
Otherwise, you would have to develop another solution, e.g. one Lambda is triggered by SQS and then invokes the second one, passing the message as input.
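A sketch of the fan-out wiring in CloudFormation (all names are placeholders; your two existing Lambda triggers would then point at the two new queues instead of the original one):

Resources:
  FanOutTopic:
    Type: AWS::SNS::Topic

  QueueA:
    Type: AWS::SQS::Queue
  QueueB:
    Type: AWS::SQS::Queue

  SubscriptionA:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref FanOutTopic
      Protocol: sqs
      Endpoint: !GetAtt QueueA.Arn
      RawMessageDelivery: true   # deliver the message body unchanged, as SQS consumers usually expect
  SubscriptionB:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref FanOutTopic
      Protocol: sqs
      Endpoint: !GetAtt QueueB.Arn
      RawMessageDelivery: true

  # Both queues must explicitly allow the topic to deliver messages to them
  FanOutQueuePolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      Queues: [!Ref QueueA, !Ref QueueB]
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { Service: sns.amazonaws.com }
            Action: sqs:SendMessage
            Resource: [!GetAtt QueueA.Arn, !GetAtt QueueB.Arn]
            Condition:
              ArnEquals: { aws:SourceArn: !Ref FanOutTopic }

The producer then publishes to the topic instead of sending to a queue, and each of your two Lambda functions keeps its own SQS trigger, one per queue. Efficiency-wise this is the standard pattern: SNS delivers the copies in parallel, so both functions fire at effectively the same time.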
I am new to the AWS ecosystem. I'm building a (near) real-time system, where data comes from an external API. The API is updated every 10 seconds, so I would like to consume the data and populate my Kinesis pipeline as soon as new data appears.
However, I'm not sure which tool to use for that. I did some research and I think I have two options:
An AWS Lambda function which is triggered every 10 seconds and puts data on Kinesis
AWS Step Functions
What is the standard approach for a given use case?
AWS Step Functions workflows are typically built from Lambda functions; that is, each step in a workflow can invoke a Lambda function (Step Functions also supports direct integrations with other AWS services). You can think of a workflow created by AWS Step Functions as a chain of Lambda functions.
If you are not familiar with how to create a workflow see this AWS tutorial:
Create AWS serverless workflows by using the AWS SDK for Java
(You can create a Lambda function in any supported programming language; this one happens to use Java.)
Now, to answer your question: using a workflow to populate a Kinesis data stream is possible. You can build a Lambda function that gathers data (using logic in your Lambda function) and then invokes the putRecord operation of Kinesis to populate the data stream. You can create a scheduled event that fires every x minutes based on a cron expression.
If you do use a cron expression, you can use the AWS Step Functions API to fire off the workflow. That is, create another Lambda function that is scheduled to fire, say, every 10 minutes. Then, in this Lambda function, use the Step Functions API to invoke the workflow; the workflow can then populate the Kinesis data stream with data.
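For the scheduling part, a hedged CloudFormation sketch of the rule and invoke permission (function names are placeholders; the cron expression below fires every 10 minutes):

Resources:
  WorkflowStarterSchedule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: cron(0/10 * * * ? *)   # every 10 minutes
      State: ENABLED
      Targets:
        - Arn: !GetAtt WorkflowStarterFunction.Arn   # the Lambda that calls the Step Functions API
          Id: workflow-starter

  AllowEventsToInvoke:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref WorkflowStarterFunction
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt WorkflowStarterSchedule.Arn

As an aside, a rule's target can also be the state machine itself (the target then needs a RoleArn with states:StartExecution), which would let you skip the extra Lambda entirely.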
I have an AWS Python Lambda function that connects to a DB, checks data integrity, and sends alerts to a Slack channel (that's already done).
I want to execute that lambda every XX minutes.
What's the best way to do it?
You can build this with AWS EventBridge.
The documentation contains an example for this exact use case:
Tutorial: Schedule AWS Lambda Functions Using EventBridge
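If you want to codify the tutorial's result, it is roughly this CloudFormation shape (the function name is a placeholder; substitute your XX in the rate expression):

Resources:
  IntegrityCheckSchedule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: rate(5 minutes)   # substitute your XX minutes here
      Targets:
        - Arn: !GetAtt IntegrityCheckFunction.Arn
          Id: integrity-check

  AllowEventsToInvoke:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref IntegrityCheckFunction
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt IntegrityCheckSchedule.Arn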
Given a REST API, outside of my AWS environment, which can be queried for json data:
https://someExternalApi.com/?date=20190814
How can I set up a serverless job in AWS to hit the external endpoint on a periodic basis and store the results in S3?
I know that I can instantiate an EC2 instance and just set up a cron job. But I am looking for a serverless solution, which seems more idiomatic.
Thank you in advance for your consideration and response.
Yes, you absolutely can do this, and probably in several different ways!
The pieces I would use would be:
A CloudWatch Events rule using a cron-like schedule, which then triggers...
A Lambda function (with the right IAM permissions) that calls the API using e.g. Python's requests or an equivalent HTTP library and then uses the AWS SDK to write the results to an S3 bucket of your choice:
An S3 bucket ready to receive!
This should be all you need to achieve what you want.
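A rough CloudFormation sketch of those pieces, with all names as placeholders. The inline handler uses only the standard library plus boto3, since third-party packages such as requests are not bundled with the Lambda Python runtime:

Resources:
  ResultsBucket:
    Type: AWS::S3::Bucket

  FetcherRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { Service: lambda.amazonaws.com }
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: WriteResults
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: s3:PutObject
                Resource: !Sub "${ResultsBucket.Arn}/*"

  FetcherFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt FetcherRole.Arn
      Timeout: 30
      Environment:
        Variables:
          BUCKET: !Ref ResultsBucket
      Code:
        ZipFile: |
          import boto3, os, urllib.request
          from datetime import date
          s3 = boto3.client("s3")
          def handler(event, context):
              # Hit the external API for today's date and store the raw response in S3
              today = date.today().strftime("%Y%m%d")
              url = f"https://someExternalApi.com/?date={today}"
              with urllib.request.urlopen(url) as resp:
                  body = resp.read()
              s3.put_object(Bucket=os.environ["BUCKET"], Key=f"{today}.json", Body=body)

  DailySchedule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: rate(1 day)   # adjust the period to taste
      Targets:
        - Arn: !GetAtt FetcherFunction.Arn
          Id: fetcher

  AllowEventsToInvoke:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref FetcherFunction
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt DailySchedule.Arn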
I'm going to skip the implementation details, as they are largely outside the scope of your question. As such, I'm going to assume your function is already written and targets Node.js.
AWS can do this on its own, but to make it simpler, I'd recommend using the Serverless Framework. We're going to assume you're using it.
Assuming you're entirely new to serverless, the first thing you'll need to do is to create a handler:
serverless create --template "aws-nodejs" --path my-service
This creates a service based on the aws-nodejs template at the provided path. In there, you will find serverless.yml (the configuration for your function) and handler.js (the code itself).
Assuming your function is exported as crawlSomeExternalApi on the handler export (module.exports.crawlSomeExternalApi = () => {...}), the functions entry in your serverless file would look like this if you wanted to invoke it every 3 hours:
functions:
  crawl:
    handler: handler.crawlSomeExternalApi
    events:
      - schedule: rate(3 hours)
That's it! All you need now is to deploy it through serverless deploy -v
Under the hood, what this does is create a CloudWatch schedule entry for your function. An example can be found in the documentation.
The first thing you need is a Lambda function. Implement your logic, of hitting the API and writing data to S3 or whatever, inside the Lambda function. The next thing you need is a schedule to periodically trigger your Lambda function. A schedule expression can be used to trigger an event periodically, using either a cron expression or a rate expression. The Lambda function you created earlier should be configured as the target of this CloudWatch rule.
The resulting flow will be: CloudWatch invokes the Lambda function whenever there's a trigger (depending on your CloudWatch rule), and Lambda then performs your logic.
I have a service that uses a JSON file on an S3 bucket for its configuration.
I would like to be able to modify this file, but I'm going to run into a concurrency issue as multiple administrators will be able to write in this file at the same time.
I'm going to use an SNS Topic to trigger a lambda that will write the config changes.
For the moment, I'm going to check the queue every minute and then handle the messages, so that I am sure I don't have multiple instances of the Lambda running at the same time and writing to the same file.
Is there any way to have an SNS topic trigger a Lambda function for each message, wait for that message to be handled, and then move on to the next one?
Cheers,
Julien
You can achieve this by setting the reserved concurrency of your Lambda function to 1, so that at most one instance runs at a time. See the documentation for more details about managing concurrency for Lambda.
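In CloudFormation that is a single property on the function; a minimal sketch with placeholder names is below. Because SNS invokes Lambda asynchronously, messages that arrive while the single instance is busy are throttled and retried by Lambda for a limited time rather than lost:

Resources:
  ConfigWriterFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt ConfigWriterRole.Arn   # role defined elsewhere in the stack
      ReservedConcurrentExecutions: 1      # never more than one instance writes the file at a time
      Code:
        ZipFile: |
          def handler(event, context):
              # apply the config change carried by the SNS message here
              pass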