TLDR: Is there a way to trigger an AWS lambda or step function based on an external system's websocket message?
I'm building a synchronization service which connects to a system which supports websockets. I can use timers in step functions to wake periodically and call lambda functions to perform the sync, but I would prefer to subscribe to the websocket and perform the sync only when a message is received.
There are plenty of ways to expose websockets in AWS, but I haven't found a way to consume them short of something like an EC2 instance with a custom service running on it. I'm trying to stay in the serverless ecosystem.
It seems like consuming a websocket is a fairly common requirement; have I overlooked something?
Lambdas are ephemeral. They can't be sitting there waiting for a websocket message.
However, I think what you can do is use an Activity task. Once the step function gets to that state it will wait. The activity worker will run on an EC2 instance and subscribe to a websocket. When a message is received it will poll the State Machine for an activity token and call SendTaskSuccess. The state machine will then continue execution and call the lambda that performs the sync.
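The worker side of that Activity pattern could be sketched roughly as follows. This is only an illustration under stated assumptions: `sfn` is assumed to be a Step Functions client (for example `boto3.client("stepfunctions")`), and the function name, the worker name, and the shape of the output are made up for the example. The websocket subscription itself is left out; the function is what the worker would call for each message it receives.

```python
import json


def complete_activity_on_message(sfn, activity_arn, message, worker_name="ws-worker"):
    """Hand one websocket message to a state machine waiting on an Activity task.

    `sfn` is assumed to be a Step Functions client such as
    boto3.client("stepfunctions"); it is passed in so this sketch has no
    hard dependency on boto3. The names here are hypothetical.
    """
    # GetActivityTask long-polls and returns an empty or missing taskToken
    # when no execution is currently waiting on the activity.
    task = sfn.get_activity_task(activityArn=activity_arn, workerName=worker_name)
    token = task.get("taskToken")
    if not token:
        return False  # nothing is waiting; the caller decides whether to retry

    # The output must be a JSON string; it becomes the Activity state's result.
    sfn.send_task_success(taskToken=token, output=json.dumps({"message": message}))
    return True
```

The return value lets the worker tell apart "the state machine was resumed" from "no execution was waiting", which matters if messages can arrive while the state machine is busy running the previous sync.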
You can use the AWS API Gateway service and Lambda. It supports WebSockets and can trigger a Lambda function on each request.
I have a PHP web application that is running on an ec2 server. The app is integrated with another service which involves subscribing to a number of webhooks.
The number of requests the server is receiving from these webhooks has become unmanageable, and I'm looking for a more efficient way to deal with data coming from these webhooks.
My initial thought was to use API gateway and put these requests into an SQS queue and read from this queue in batches.
However, I would like these batches to be read by the ec2 instance because the code used to process the webhooks is code reused throughout my application.
Is this possible or am I forced to use a lambda function with SQS? Is there a better way?
The approach you suggested (API Gateway + SQS) will work just fine. There is no need to use AWS Lambda. You'll want to use the AWS SDK for PHP when writing the application code that receives messages from your SQS queue.
I've used this pattern before and it's a great solution.
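For illustration, the batch-reading loop on the EC2 side might look something like the sketch below. It is written in Python rather than PHP purely as an example; the SDK for PHP exposes the same SQS operations. `sqs` is assumed to be an SQS client (for example `boto3.client("sqs")`), and `process` is a hypothetical callback standing in for the existing webhook-handling code.

```python
def drain_batch(sqs, queue_url, process, wait_seconds=20):
    """Receive up to 10 messages, process them, and delete the ones that succeed.

    `sqs` is assumed to be an SQS client such as boto3.client("sqs").
    `process` is a hypothetical callback that receives the message body
    as a string and raises on failure.
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # SQS returns at most 10 messages per call
        WaitTimeSeconds=wait_seconds,  # long polling; 20 seconds is the maximum
    )
    messages = resp.get("Messages", [])
    done = []
    for i, msg in enumerate(messages):
        try:
            process(msg["Body"])
        except Exception:
            # Leave the message on the queue; it becomes visible again
            # after the visibility timeout and can be retried.
            continue
        done.append({"Id": str(i), "ReceiptHandle": msg["ReceiptHandle"]})
    if done:
        sqs.delete_message_batch(QueueUrl=queue_url, Entries=done)
    return len(done)
```

Calling `drain_batch` in a loop from a long-running process on the instance gives batched, throttled consumption; only messages that were processed successfully are deleted.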
. . . am I forced to use a lambda function with SQS?
SQS plus Lambda is basically free. At this time, the free tier gives you 1 million Lambda invocations and 1 million SQS requests per month. Remember that each SQS request may carry up to 10 messages, so that's potentially 10 million messages, all inside the free tier. Your EC2 instance is likely always on. Your Lambda function is not. Even if you only use Lambda to push the SQS data to a data store such as an RDBMS for your EC2 instance to poll periodically, the operation would be bullet-proof and very inexpensive. With the introduction of SQS you could also transition the common EC2 code to Lambda function(s). These now have a maximum run time of 15 minutes.
To cite my sources:
SQS pricing for reference: https://aws.amazon.com/sqs/pricing/
Lambda pricing for reference: https://aws.amazon.com/lambda/pricing/
I'm looking for help with an architectural design decision I'm making with a product.
We've got multiple producers (initiated by API Gateway calls into Lambda) that put messages on a SQS queue (the request queue). There can be multiple simultaneous calls, so there would be multiple Lambda instances running in parallel.
Then we have consumers (let's say twenty EC2 instances) that long-poll the SQS queue for messages to process. Each message takes about 30-45 seconds to process.
I would then ideally like to send the response back to the producer that issued the request, and this is the part I'm struggling with in SQS. I would in theory have a separate response queue that the initial Lambda producers would then consume, but there doesn't seem to be a way to cherry-pick the specific correlated response. That is, each Lambda function might pick up another function's response. I'm looking for something similar to this design pattern: http://soapatterns.org/design_patterns/asynchronous_queuing
The only option that I can see is to create a new SQS response queue for each Lambda API call, passing its ARN in the message so the consumers know where to put the response, but I can't imagine that's very efficient, especially when there are potentially hundreds of messages a minute. Am I missing something obvious?
I suppose the only other alternative would be setting up a bigger message broker (e.g. RabbitMQ/ActiveMQ) environment, but I'd like to avoid that if possible.
Thanks!
Create a (Temporary) Response Queue For Every Request
Too late to the party, but I was thinking that I might find some help with what I want to achieve (#MattHouser, #Zaheer Ally), or give an idea to someone working on a related issue.
I am facing a similar challenge. I have an API that upon request by a client, needs to communicate to multiple external APIs and collect (delayed) results.
Since my PHP API is synchronous, it can only perform these requests sequentially. So I was thinking of using a request queue to which the producer (the API) would send messages. Multiple workers would then consume these messages, each of them performing one of the external API calls.
To get the results back, the producer would first create a temporary response queue, whose name would be embedded in each message sent to the workers. Each worker would then 'publish' its results to this temporary queue.
In the meantime, the producer would keep polling the temporary queue until it had received the expected number of messages. Finally, it would delete the queue and send the collected results back to the client.
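The flow described above could be sketched like this, in Python for illustration. `sqs` is assumed to be an SQS client (for example `boto3.client("sqs")`); the function name, the `reply-` queue naming, and the `job`/`reply_to` message fields are hypothetical choices for the example, not anything the original design prescribes.

```python
import json
import time
import uuid


def scatter_gather(sqs, request_queue_url, jobs, timeout_seconds=60):
    """Send one request per job and collect replies from a temporary queue.

    `sqs` is assumed to be an SQS client such as boto3.client("sqs").
    Workers are assumed to send a JSON body to the queue URL found in
    the hypothetical "reply_to" field of each request.
    """
    reply_url = sqs.create_queue(QueueName=f"reply-{uuid.uuid4().hex}")["QueueUrl"]
    try:
        for job in jobs:
            sqs.send_message(
                QueueUrl=request_queue_url,
                MessageBody=json.dumps({"job": job, "reply_to": reply_url}),
            )
        results = []
        deadline = time.monotonic() + timeout_seconds
        # Poll until every job has answered or the deadline passes.
        while len(results) < len(jobs) and time.monotonic() < deadline:
            resp = sqs.receive_message(
                QueueUrl=reply_url, MaxNumberOfMessages=10, WaitTimeSeconds=5
            )
            for msg in resp.get("Messages", []):
                results.append(json.loads(msg["Body"]))
                sqs.delete_message(QueueUrl=reply_url, ReceiptHandle=msg["ReceiptHandle"])
        return results
    finally:
        # Always remove the temporary queue, even on timeout or error.
        sqs.delete_queue(QueueUrl=reply_url)
```

The `finally` block matters here: without it, every request that times out or fails would leave an orphaned queue behind.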
Yes, you could use RabbitMQ for a more "rpc" queue pattern.
But if you want to stay within AWS, try using something other than SQS for the response.
Instead, you could use S3 for the response. When your producer puts the item into SQS, include in the message an S3 destination for the response. When your consumer completes the tasks, put the response in the desired S3 location.
Then you can check S3 for the response.
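A rough sketch of that S3 check, in Python for illustration: `s3` is assumed to be an S3 client (for example `boto3.client("s3")`), and the function name, the polling interval, and the injectable `sleep` parameter are assumptions made for the example. The bucket and key would be whatever destination the producer put in the SQS message.

```python
import time


def wait_for_s3_response(s3, bucket, key, timeout_seconds=60,
                         interval_seconds=2, sleep=time.sleep):
    """Poll S3 until the response object exists, then return its bytes.

    `s3` is assumed to be an S3 client such as boto3.client("s3").
    Returns None if the object has not appeared before the timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        # Listing with the full key as the prefix avoids having to catch a
        # "no such key" error; the exact key sorts first among its matches.
        listing = s3.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
        if any(obj["Key"] == key for obj in listing.get("Contents", [])):
            return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        if time.monotonic() >= deadline:
            return None
        sleep(interval_seconds)
```

The producer would generate a unique key per request (a UUID, say), so that no two requests ever look for the same object.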
Update
You may be able to accomplish an RPC-like message queue using Redis.
https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis
Then, you can use AWS ElastiCache for your Redis cluster. This would completely replace the use of SQS.
Another option would be to use Redis' pub/sub mechanism to asynchronously notify your Lambda function that the backend work is done. You can use Amazon ElastiCache for Redis for an all-AWS-managed solution. Your Lambda function would generate a UUID for each request, use that as the channel name to subscribe to, and pass it along in the SQS message; the backend workers would then publish a notification to that channel when the work is done.
I was facing this same problem so I tried it out, and it does work. Whether it's worth the effort over just polling S3 is another question. You have to configure the lambda functions to run inside your VPC, so they can access your Redis. I was going to have to do this anyway since I'd want the workers, in my case also lambda functions, to be able to access my Elasticsearch and RDS. But there are some considerations: most importantly, you need to use a private subnet with a NAT Gateway (or your own NAT Instance), so it can get out to the Internet and AWS managed services (including SQS).
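The pub/sub hand-off described above might look roughly like this. It is a sketch, not the code I actually ran: `redis_client` is assumed to be a redis-py client (for example `redis.Redis(host=...)`), `enqueue` is a hypothetical callback that sends the body to SQS, and the `reply_channel` field name and the 25-second default are arbitrary choices for the example.

```python
import json
import time
import uuid


def request_and_wait(redis_client, enqueue, payload, timeout_seconds=25):
    """Enqueue a job and wait for the worker's reply on a per-request channel.

    `redis_client` is assumed to be a redis-py client; `enqueue` is a
    hypothetical callback that puts the given string on the SQS queue.
    """
    channel = f"reply:{uuid.uuid4()}"
    pubsub = redis_client.pubsub()
    # Subscribe before enqueueing so a fast worker's reply cannot be missed.
    pubsub.subscribe(channel)
    try:
        enqueue(json.dumps({"payload": payload, "reply_channel": channel}))
        deadline = time.monotonic() + timeout_seconds
        while time.monotonic() < deadline:
            msg = pubsub.get_message(ignore_subscribe_messages=True, timeout=1.0)
            if msg and msg["type"] == "message":
                return json.loads(msg["data"])
        return None  # timed out without a reply
    finally:
        pubsub.close()


def worker_reply(redis_client, request_body, result):
    """Worker side: publish the result to the channel named in the request."""
    request = json.loads(request_body)
    redis_client.publish(request["reply_channel"], json.dumps(result))
```

Note that pub/sub messages are not stored: if the requester has already timed out and unsubscribed, the worker's notification is simply dropped, so larger results may still need to go somewhere durable.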
One other thing I just stumbled across is that requests through API Gateway currently cannot take longer than 29 seconds, and this cannot be increased by AWS. You mentioned your jobs take 30 or more seconds, so this could be a showstopper for you using API Gateway and Lambda in this way anyway.
AWS now provides a Java client that supports temporary queues. This is useful for request/response patterns. I can't see a non-Java version.
What is a good way to deploy a WebSockets client on AWS?
I'm building an app on AWS which needs to subscribe to several WebSockets and several REST sources and process incoming messages (WebSockets) or make periodic requests (REST). I'm trying to go server-less and maximize use of AWS platform services, to eliminate the need to manage VMs, OS patches, etc. (and hopefully reduce cost).
My idea so far is to trigger a Lambda function every time a message arrives. The function can then transform/normalize the message and push it to an SQS queue for further processing by other subsystems.
There would be two types of such Lambda clients: one that subscribes to WebSockets messages and another that makes HTTP requests periodically when invoked by a CloudWatch schedule.
This approach seems reasonable for my REST clients, but I haven't been able to determine if it's possible to subscribe to WebSockets messages using Lambda. Lambdas can be triggered by IoT, and apparently IoT supports WebSockets now, but apparently only as a transport for the MQTT protocol:
AWS IoT Now Supports WebSockets, Custom Keepalive Intervals, and Enhanced Console
What is the best/easiest/cheapest way to deploy a WebSockets client without deploying an entire EC2 or Docker instance?
I have an external data source as an ActiveMQ topic. I can only connect and consume messages. They come pretty rarely, about 1 message per 10-30 seconds.
I want to collect all the messages and put them into the database.
Also I'd like to have an active web page that can receive the new messages over WebSockets and draw a chart.
I have a prototype built with Python/Flask/MongoDB/SocketIO, BUT...
I would like to use Amazon AWS cloud infrastructure to avoid processing the data on servers.
I believe that AWS Lambda can accept the messages and store them in the database (DynamoDB?) and also send a notification (maybe using SQS) that is transformed into a WebSocket message. (Not everything is clear there yet; maybe simple AJAX polling will be enough.)
Here is the question: how would it be possible to consume the messages from an external ActiveMQ topic and process them with AWS Lambda?
I was looking at Kinesis, but it looks like it only supports data being pushed to it, not polling for data over some protocol...
You can use Lambda as a cron-like facility and poll on a schedule. See Lambda Scheduled Events.