I am building a service that will use data coming from another service. I am thinking of using the following pipeline:
Other service ----> SNS topic ----> SQS ----> AWS Lambda ----> DynamoDB
In this flow, the other service pushes data to an SNS topic to which an SQS queue is subscribed. An AWS Lambda function triggered by this queue reads the messages and writes them to DynamoDB. This looks workable, but now I am wondering whether I really need SQS, or whether I can avoid it and have the Lambda triggered directly by SNS. One case concerns me if I drop SQS: what happens if DynamoDB fails? I think with SNS alone I would lose messages during the time DynamoDB is in a failed state, whereas with SQS those messages would be retained in the queue.
Please let me know if my understanding is correct.
Thanks a lot for your help.
Couldn't answer as much in the comments so I'll try here.
The architecture you linked to is pretty common. The two biggest downsides are that you're going to be billed for Lambda usage even if there is nothing to do, and your data may be delayed by up to the polling interval, which is a minimum of 1 minute. Neither of these may matter for your problem, though.
SQS could be used as a temporary store for data in the event of a DynamoDB failure. But what exactly are you going to do if it fails? What if SQS fails and loses your messages? What if Lambda fails and never runs your code? DynamoDB is a hosted service just like SQS and Lambda - Amazon is going to work very hard to keep it running just like their other services. Trying to architect around every possible failure scenario will mean you never deliver code. I'd focus on the simplest architecture you can and put some trust in the services you're paying for.
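To make the flow concrete, here is a minimal sketch of the SQS-triggered Lambda end of that pipeline, assuming JSON message bodies and a hypothetical table name `my-table`. With SNS in front of SQS (and raw message delivery left at its default), each SQS body is an SNS envelope, so the handler unwraps it first:

```python
import json

def parse_sqs_records(event):
    """Extract the original payloads from an SQS-triggered Lambda event.
    With SNS -> SQS, each SQS body is an SNS envelope whose 'Message'
    field holds the actual JSON payload."""
    items = []
    for record in event.get("Records", []):
        envelope = json.loads(record["body"])
        items.append(json.loads(envelope["Message"]))
    return items

def handler(event, context):
    import boto3  # imported lazily so parse_sqs_records stays testable offline
    table = boto3.resource("dynamodb").Table("my-table")  # hypothetical table name
    items = parse_sqs_records(event)
    for item in items:
        # note: the boto3 DynamoDB resource rejects floats; use Decimal for numbers
        table.put_item(Item=item)
    return {"written": len(items)}
```

If the `put_item` call raises (e.g. DynamoDB is unavailable), the Lambda invocation fails and SQS makes the batch visible again for a retry, which is exactly the buffering behaviour the question is after.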
Related
I'm trying to understand the implementation flow while designing the blueprint for one of our use cases. Per existing articles/blogs, AWS now supports self-hosted Kafka as a Lambda event source. Scheduled Lambdas also exist. But does anyone know where EventBridge stands here?
Basically, I want to trigger a Lambda every time there's an event change in the topic it's subscribed to. So should the Lambda act as a consumer that listens to topics? If yes, since it's serverless, something has to tell it there's a change. Would CloudWatch be the one to do so?
Again, if yes, does CloudWatch also need to act as a consumer and listen to topics?
Please help me understand. This might sound like an opinion question, but I really couldn't find the correct answer anywhere.
P.S. I know there's MSK and Kinesis, but the recommendation is to use only Lambda, EventBridge, SQS, SNS, S3, etc. The goal is to read the data from topics and send out emails to recipients.
The Lambda service manages the integration with Kafka itself. You configure how it interacts, but ultimately your function receives an event just like with any other integration, and that event includes the messages from Kafka.
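As a sketch of what that event looks like: the Kafka event source (self-managed or MSK) delivers records grouped under `topic-partition` keys, with each record's `value` base64-encoded. A handler for the email use case could decode them like this (the email-sending step itself is left as a placeholder):

```python
import base64
import json

def decode_kafka_event(event):
    """Flatten a Lambda Kafka event into a list of message payloads.
    Records arrive grouped by 'topic-partition' keys, and each record's
    'value' field is base64-encoded; payloads here are assumed to be JSON."""
    messages = []
    for _topic_partition, records in event.get("records", {}).items():
        for rec in records:
            payload = base64.b64decode(rec["value"]).decode("utf-8")
            messages.append(json.loads(payload))
    return messages

def handler(event, context):
    messages = decode_kafka_event(event)
    for msg in messages:
        # hand each message to your email-sending code here (e.g. via SES)
        pass
    return {"processed": len(messages)}
```

No CloudWatch consumer is involved: the Lambda service itself polls the brokers and invokes the function with batches like the one above.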
Trying to design a solution for error handling. We have a Lambda that receives data from an SNS topic and sends it to a legacy service that has been known to be unavailable at times.
When the legacy service is down, I want to send the messages to a DynamoDB table to be replayed.
I want to use a circuit breaker pattern. So at the minute I am thinking of spinning up a service that constantly polls the legacy service, with some pseudocode that looks like this:
if (legacy service changes from dead to alive) {
    send all entries from DynamoDB to the SNS topic;
    // this triggers the Lambda again, which hits the legacy service we now know is up
}
The thing is, we like using serverless technologies, and I'm not sure I can have a serverless service that constantly polls; it makes more sense for that to run on a server.
I am looking for a nice way to do this, so I am wondering: is it possible to configure DynamoDB to poll the legacy service and, on the condition that it changes from dead to alive, populate the SNS topic? Or is there another solution using serverless technologies?
P.S. I don't like the idea of running a Lambda at intervals to check the DB, as we could miss some downtime; also, reading data from the DB and sending it to SNS could be a lengthy operation.
Update: I've been reading into the circuit breaker pattern more and realise I don't need to poll constantly; I can just check the number of failed calls in the last XX seconds in my DynamoDB table. So a new question has arisen: can I send a message from DynamoDB to SNS depending on a condition on one of its entries? E.g. when FailsInLastMinute changes from 3 to below 3, we send all the messages from a column in DynamoDB to SNS. Or do I need a service for this part?
I don't think DynamoDB can do this; it's a database, after all, not an integration platform.
That said, a possible solution would be to use DynamoDB as a queue between SNS and the legacy app, via DynamoDB Streams. Any message from SNS gets inserted into DynamoDB by a Lambda; the DynamoDB stream then triggers another Lambda that sends the message to the legacy app.
If the legacy app is down, the Lambda function fails because it cannot connect, and the stream event source will retry the Lambda until it succeeds.
Note that you are probably better off using an SQS FIFO queue. This will do the same, but without the overhead of DynamoDB.
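The streams-based variant can be sketched as follows. The stream delivers items in DynamoDB's AttributeValue format, so the Lambda first flattens the `NewImage` into a plain dict (this sketch assumes items use only string and number attributes, and the legacy endpoint URL is hypothetical):

```python
import json

def from_stream_image(image):
    """Convert a DynamoDB Streams NewImage (AttributeValue map) into a
    plain dict. Only S and N types are handled, which is all this sketch
    assumes the queued items contain."""
    out = {}
    for key, av in image.items():
        if "S" in av:
            out[key] = av["S"]
        elif "N" in av:
            out[key] = float(av["N"])  # stream numbers arrive as strings
    return out

def handler(event, context):
    import urllib.request  # network call kept inside the handler
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue
        payload = from_stream_image(record["dynamodb"]["NewImage"])
        req = urllib.request.Request(
            "https://legacy.example.com/ingest",  # hypothetical legacy endpoint
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        # if the legacy app is down this raises, the invocation fails,
        # and the stream event source retries the batch
        urllib.request.urlopen(req)
```

The retry-until-success behaviour falls out of the failure semantics: an unhandled exception makes the event source re-deliver the same stream records.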
I have a PHP web application running on an EC2 server. The app is integrated with another service, which involves subscribing to a number of webhooks.
The number of requests the server is receiving from these webhooks has become unmanageable, and I'm looking for a more efficient way to deal with data coming from these webhooks.
My initial thought was to use API Gateway to put these requests into an SQS queue and read from this queue in batches.
However, I would like these batches to be read by the EC2 instance, because the code used to process the webhooks is reused throughout my application.
Is this possible, or am I forced to use a Lambda function with SQS? Is there a better way?
The approach you suggested (API Gateway + SQS) will work just fine. There is no need to use AWS Lambda. You'll want to use the AWS SDK for PHP when writing the application code that receives messages from your SQS queue.
I've used this pattern before and it's a great solution.
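The receive loop you'd write against that queue has the same shape in every SDK; here is a sketch in Python with a boto3-style SQS client (the PHP SDK's `receiveMessage`/`deleteMessage` calls are analogous, and the queue URL would be your own). The client is passed in rather than created inside, which also makes the loop easy to test:

```python
def drain_queue(sqs, queue_url, handle, max_batches=10):
    """Long-poll an SQS queue in batches of up to 10 messages, deleting
    each message only after handle() has processed it successfully.
    'sqs' is a boto3 SQS client (or any stub with the same two methods)."""
    for _ in range(max_batches):
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,   # the SQS per-request maximum
            WaitTimeSeconds=20,       # long polling: fewer empty receives
        )
        messages = resp.get("Messages", [])
        if not messages:
            break  # queue drained for now
        for msg in messages:
            handle(msg["Body"])
            sqs.delete_message(QueueUrl=queue_url,
                               ReceiptHandle=msg["ReceiptHandle"])
```

Long polling (`WaitTimeSeconds=20`) matters here: it keeps the number of billable empty receives low while your EC2 process loops.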
. . . am I forced to use a Lambda function with SQS?
SQS plus Lambda is basically free. At this time, you get 1M (million) Lambda invocations and 1M (million) SQS requests per month. Remember that an SQS request may contain up to 10 messages, so that's potentially 10M messages, all inside the free tier. Your EC2 instance is likely always on; your Lambda function is not. Even if you only use Lambda to push the SQS data into a data store such as a relational database for your EC2 instance to poll periodically, the operation would be bullet-proof and very inexpensive. With the introduction of SQS, you could also transition the common EC2 code to Lambda function(s), which now have a maximum run time of 15 minutes.
To cite my sources:
SQS pricing for reference: https://aws.amazon.com/sqs/pricing/
Lambda pricing for reference: https://aws.amazon.com/lambda/pricing/
I am working with PHP.
I have a program that writes messages to Amazon SQS.
Can anybody tell me how I can use Lambda to get data from SQS and push it into MySQL? The Lambda should be triggered whenever a new record is added to the queue.
Can somebody share the steps or code that will help me get through this task?
There isn't any official way to link SQS and Lambda at the moment. Have you looked into using an SNS topic instead of an SQS queue?
Agree with Mark B.
Ways to get events over to Lambda:
Use SNS: http://docs.aws.amazon.com/sns/latest/dg/sns-lambda.html
Use SNS -> SQS, and have the Lambda launched by the SNS notification just load whatever is in the SQS queue.
Use Kinesis.
Alternatively, have a Lambda run by a cron job read SQS. This depends on the latency you need: if messages must be processed immediately, this is not the solution, because you would be running the Lambda all the time.
Important note for using SQS: you are charged when you poll even if no messages are waiting, so do not do fast polls, even in your Lambdas. It is easy to run up a huge bill doing nothing. This is also a good reason to set up CloudWatch on the account to monitor usage and charges.
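The second option above (SNS notification launches a Lambda that drains the SQS queue into MySQL) can be sketched like this. The queue URL, database coordinates, and table layout are all hypothetical, and `pymysql` is just one MySQL driver choice; the row-building step is separated out so it stands on its own:

```python
import json

def rows_from_sqs_messages(bodies):
    """Turn raw SQS message bodies (assumed JSON with an 'id' field)
    into (id, payload) tuples for a parameterized MySQL insert."""
    rows = []
    for body in bodies:
        doc = json.loads(body)
        rows.append((doc["id"], json.dumps(doc)))
    return rows

def handler(event, context):
    # fired by the SNS notification; drain the queue, then insert the batch
    import boto3
    import pymysql  # assumption: any MySQL driver works the same way
    sqs = boto3.client("sqs")
    queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # hypothetical
    bodies = []
    while True:
        resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)
        messages = resp.get("Messages", [])
        if not messages:
            break
        for msg in messages:
            bodies.append(msg["Body"])
            sqs.delete_message(QueueUrl=queue_url,
                               ReceiptHandle=msg["ReceiptHandle"])
    conn = pymysql.connect(host="db-host", user="app",
                           password="app-secret", database="app")  # hypothetical
    with conn.cursor() as cur:
        cur.executemany("INSERT INTO events (id, payload) VALUES (%s, %s)",
                        rows_from_sqs_messages(bodies))
    conn.commit()
```

Note this predates the native SQS-to-Lambda trigger; the SNS notification is only there to wake the function up.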
I want to use an AWS Lambda function to fan out and insert activity stream info into a Firebase endpoint for every user.
Should I be using Kinesis, SQS or SNS to trigger the lambda function for this use case? The updates to the activity stream can be triggered from the server and clients should receive the update near real time (within 60 seconds or so).
I think I have a pretty good idea on what SQS is, and have used Kinesis in the past but not quite sure about SNS.
If we created an SNS topic for each user, and each follower subscribed to these topics with an AWS Lambda function, would that work?
Does it make sense to programmatically create topics and subscriptions for every user and follow relationship respectively?
As usual, the answer to such a question is mostly "it depends on your use case".
Kinesis vs SQS:
If your clients care about relative (e.g. timestamp-based) ordering between events, you'll almost certainly have to go with Kinesis. Standard SQS provides only best-effort ordering, meaning events can arrive out of order, and it would be up to your client to manage relative ordering.
As far as latencies are concerned, I have seen data ingested into Kinesis become visible to its consumer in as little as 300 ms.
When can SNS be interesting to you?
(Even with SNS, you'd still have to use SQS.) If you use SNS, it will be easy to add a new application that can process your events. For example, if in the future you decide to ingest all events into, say, Elasticsearch to provide real-time analytics, all you'd have to do is add another SQS queue to your existing topic(s) and write a consumer.
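The programmatic topic-and-subscription setup from the question can be sketched with boto3-style calls; the client is injected so the wiring is testable, and the naming scheme is just an assumption (also note SNS has per-account quotas on topics, so per-user topics need checking against your expected user count):

```python
def fan_out_subscribe(sns, user_id, follower_queue_arns):
    """Create one SNS topic per user and subscribe each follower's SQS
    queue to it. 'sns' is a boto3 SNS client (or a stub with the same
    methods); the 'activity-<user>' naming scheme is hypothetical."""
    topic_arn = sns.create_topic(Name=f"activity-{user_id}")["TopicArn"]
    for queue_arn in follower_queue_arns:
        # for real queues you must also set a queue policy allowing
        # this topic to send messages
        sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
    return topic_arn
```

`create_topic` is idempotent for a given name, so re-running this for an existing user returns the same topic ARN rather than failing.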