Build a Firebase / fanout.io-like service on Amazon Web Services (AWS)

I am using firebase to notify web browsers (javascript clients) about changes on specific topics. I am very happy with it. However I would really like to (only) use aws web services.
Unfortunately I am not able to determine whether it is possible to build such a service on aws. I am not talking about having EC2 instances running some firebase / fanout.io alternatives. I am talking about utilizing services such as lambda, dynamodb streams, SNS & SQS.
Are there any socket notification services available or is it possible to achieve something similar by using the provided services?

I looked into this very recently with the same idea, but eventually I came down on just using fanout. AWS does not provide server-push HTTP notification services out of the box.
Lambda functions are billed per 100 ms, so any long-polling against lambda will end up billing for the entirety of the time the client is connected.
SNS does not provide long polling to browsers; the available clients are geared towards mobile, email, HTTP/S, and other Amazon products like Lambda and SQS.
SQS would require a dedicated queue per client as it does not support broadcast.
Now, if the lambda pricing doesn't bother you, you could possibly do this:
Write a lambda function that is called via the API service that opens up a connection to SQS and waits for a message. The key is to start the lambda call from HTTP, but within the function wait on the queue (using Boto, for example, if you are writing this in Python). This code would need to create a queue dedicated to servicing one particular client, uniquely keyed by something like a GUID that is passed in by the client.
Link to the lambda function using the Amazon API service.
Call the lambda function via the API from the browser and wait for it to either receive a message on the dedicated SQS queue or time out, probably using long-polling both in the API connection and the SQS connection. Fully draining the queue (or at least taking as many messages in a batch as some limit allows) would be advisable here as well in order to reduce the number of calls to the API.
Publish your event to the dedicated SQS queue associated with the client. This will require the publisher to know the client's unique key.
Return the event read from SQS as the result of the lambda call.
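The steps above could be sketched roughly as follows. This is only a minimal illustration in Python with boto3, not a tested implementation: the `clientId` event field, the `client-` queue name prefix, and the optional `sqs` parameter (there so the function can be exercised with a stub) are all assumptions, not anything prescribed by AWS.

```python
def _default_sqs():
    # Imported lazily so the sketch can be exercised with a stub client.
    import boto3
    return boto3.client("sqs")


def handler(event, context, sqs=None):
    """Long-poll the caller's dedicated queue and return whatever arrived."""
    sqs = sqs or _default_sqs()
    client_id = event["clientId"]  # assumed: GUID passed in by the browser

    # create_queue returns the existing queue's URL when a queue with the
    # same name and attributes already exists.
    queue_url = sqs.create_queue(QueueName=f"client-{client_id}")["QueueUrl"]

    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # take a batch to reduce the number of API calls
        WaitTimeSeconds=20,      # SQS long-polling maximum
    )
    messages = resp.get("Messages", [])

    # Delete what we are about to return so it is not redelivered.
    for m in messages:
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"])

    return {"events": [m["Body"] for m in messages]}
```

The browser would call the API endpoint in a loop; an empty `events` list simply means the long poll timed out and the client should call again.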
Some problems with this approach:
Lambda pricing - not terribly expensive, but something like fanout is basically free
You would need a dedicated SQS queue per client; cleanup might become a problem
SQS bills on number of calls, which includes checking for a message. Long-polling SQS will alleviate some of this
You would need to write the JavaScript client to call the lambda API endpoint repeatedly in a long-polling fashion
Lambda is currently limited as to the number of concurrently running functions it supports (100 right now but you can contact support to bump that up)
Some benefits with this approach:
SQS queues are persistent, so unless a message is processed successfully it will go back on the queue after the visibility timeout
You can set up CloudWatch to monitor all of the API, Lambda, and SQS events
Other Notes
You could call the SQS APIs directly from the browser by using Lambda to issue temporary security credentials via STS. Receiving a message in JavaScript is documented here: http://docs.aws.amazon.com/AWSJavaScriptSDK/guide/browser-examples.html#Receiving_a_message
I do not, however, know off the top of my head if you would run into cross-domain issues.
Your only other option, if it must be all AWS, is to use load-balanced EC2 instances running something like fanout as you mentioned.
Using fanout is very little work: it's both extremely affordable and already built and tested.
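The note above about issuing temporary credentials via STS could look something like the sketch below. It is only an illustration: the role ARN is a placeholder, the `clientId` event field is an assumption, and the role would need a policy restricting it to the client's own queue, which is not shown here.

```python
# Hypothetical role that is allowed to read only from the client's queue.
ROLE_ARN = "arn:aws:iam::123456789012:role/browser-sqs-reader"


def _default_sts():
    # Imported lazily so the sketch can be exercised with a stub client.
    import boto3
    return boto3.client("sts")


def handler(event, context, sts=None):
    """Return short-lived credentials the browser can pass to the AWS SDK."""
    sts = sts or _default_sts()
    resp = sts.assume_role(
        RoleArn=ROLE_ARN,
        RoleSessionName=f"browser-{event['clientId']}",  # assumed event field
        DurationSeconds=900,  # the shortest lifetime AssumeRole accepts
    )
    creds = resp["Credentials"]
    return {
        "accessKeyId": creds["AccessKeyId"],
        "secretAccessKey": creds["SecretAccessKey"],
        "sessionToken": creds["SessionToken"],
        "expiration": creds["Expiration"].isoformat(),
    }
```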

Related

Is there a way to invoke an AWS Step Function or Lambda in response to a websocket message?

TLDR: Is there a way to trigger an AWS lambda or step function based on an external system's websocket message?
I'm building a synchronization service which connects to a system which supports websockets. I can use timers in step functions to wake periodically and call lambda functions to perform the sync, but I would prefer to subscribe to the websocket and perform the sync only when a message is received.
There are plenty of ways to expose websockets in AWS, but I haven't found a way to consume them short of something like an EC2 instance with a custom service running on it. I'm trying to stay in the serverless ecosystem.
It seems like consuming a websocket is a fairly common requirement; have I overlooked something?
Lambdas are ephemeral. They can't be sitting there waiting for a websocket message.
However, I think what you can do is use an Activity task. Once the step function gets to that state it will wait. The activity worker will run on an EC2 instance and subscribe to a websocket. When a message is received it will poll the State Machine for an activity token and call SendTaskSuccess. The state machine will then continue execution and call the lambda that performs the sync.
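A minimal sketch of the activity-worker side of that idea, in Python with boto3, might look like this. The websocket subscription itself is left out; `on_websocket_message` is a hypothetical callback you would wire into whatever websocket client library you use, and the worker name is arbitrary.

```python
import json


def _default_sfn():
    # Imported lazily so the sketch can be exercised with a stub client.
    # Note: GetActivityTask long-polls for up to 60 seconds, so a real client
    # should be configured with a read timeout somewhat longer than that.
    import boto3
    return boto3.client("stepfunctions")


def on_websocket_message(message, activity_arn, sfn=None):
    """Hand one websocket message to a state machine waiting on the activity."""
    sfn = sfn or _default_sfn()

    # Ask Step Functions for a task scheduled on this activity.
    task = sfn.get_activity_task(activityArn=activity_arn, workerName="ws-worker")
    token = task.get("taskToken")
    if not token:
        return False  # no execution is currently waiting at the activity state

    # Completing the task lets the state machine move on to the sync Lambda.
    sfn.send_task_success(taskToken=token, output=json.dumps({"message": message}))
    return True
```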
You can use the AWS API Gateway service with Lambda. API Gateway supports WebSockets and can trigger a Lambda function on each request.

AWS Reduce webhooks ec2 impact with queue

I have a PHP web application that is running on an ec2 server. The app is integrated with another service which involves subscribing to a number of webhooks.
The number of requests the server is receiving from these webhooks has become unmanageable, and I'm looking for a more efficient way to deal with data coming from these webhooks.
My initial thought was to use API gateway and put these requests into an SQS queue and read from this queue in batches.
However, I would like these batches to be read by the ec2 instance because the code used to process the webhooks is code reused throughout my application.
Is this possible or am I forced to use a lambda function with SQS? Is there a better way?
The approach you suggested (API Gateway + SQS) will work just fine. There is no need to use AWS Lambda. You'll want to use the AWS SDK for PHP when writing the application code that receives messages from your SQS queue.
I've used this pattern before and it's a great solution.
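For illustration, the batch-reading loop on the EC2 side could be sketched as below. The answer recommends the AWS SDK for PHP; Python with boto3 is used here only to keep the example short, and the same ReceiveMessage and DeleteMessageBatch operations exist in the PHP SDK. The `process` callback stands in for the existing webhook-handling code and is an assumption.

```python
def _default_sqs():
    # Imported lazily so the sketch can be exercised with a stub client.
    import boto3
    return boto3.client("sqs")


def drain_batch(queue_url, process, sqs=None):
    """Receive up to 10 queued webhooks, process them, delete the successes."""
    sqs = sqs or _default_sqs()
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # the per-call maximum
        WaitTimeSeconds=20,      # long poll instead of hammering the queue
    )
    done = []
    for m in resp.get("Messages", []):
        try:
            process(m["Body"])
        except Exception:
            # Leave it on the queue; it reappears after the visibility timeout.
            continue
        done.append({"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]})

    if done:
        sqs.delete_message_batch(QueueUrl=queue_url, Entries=done)
    return len(done)
```

Running `drain_batch` in a loop from a long-lived worker process on the instance lets the application consume webhooks at its own pace rather than at the sender's.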
. . . am I forced to use a lambda function with SQS?
SQS plus Lambda is basically free. At this time, you get 1M (million) Lambda calls and 1M (million) SQS requests per month. Remember that each SQS request may contain up to 10 messages, so that's a potential 10M messages, all inside the free tier. Your EC2 instance is likely always on. Your lambda function is not. Even if you only use Lambda to push the SQS data to a data store such as an RDBMS for your EC2 instance to poll periodically, the operation would be bullet-proof and very inexpensive. With the introduction of SQS you could transition the common EC2 code to Lambda function(s). Lambda functions now have a maximum run time of 15 minutes.
To cite my sources:
SQS pricing for reference: https://aws.amazon.com/sqs/pricing/
Lambda pricing for reference: https://aws.amazon.com/lambda/pricing/

AWS SQS Asynchronous Queuing Pattern (Request/Response)

I'm looking for help with an architectural design decision I'm making with a product.
We've got multiple producers (initiated by API Gateway calls into Lambda) that put messages on a SQS queue (the request queue). There can be multiple simultaneous calls, so there would be multiple Lambda instances running in parallel.
Then we have consumers (let's say twenty EC2 instances) that long-poll on the SQS queue for messages to process. They take about 30-45 seconds to process each message.
I would then ideally like to send the response back to the producer that issued the request - and this is the part I'm struggling with in SQS. I would in theory have a separate response queue that the initial Lambda producers would then be consuming, but there doesn't seem to be a way to cherry-pick the specific correlated response. That is, each Lambda function might pick up another function's response. I'm looking for something similar to this design pattern: http://soapatterns.org/design_patterns/asynchronous_queuing
The only option that I can see is to create a new SQS response queue for each Lambda API call, passing in its ARN in the message for the consumers to put the response on, but I can't imagine that's very efficient - especially when there are potentially hundreds of messages a minute. Am I missing something obvious?
I suppose the only other alternative would be setting up a bigger message broker (e.g. RabbitMQ or Apache ActiveMQ) environment, but I'd like to avoid that if possible.
Thanks!
Create a (Temporary) Response Queue For Every Request
Too late to the party, but I was thinking that I might find some help with what I want to achieve, #MattHouser #Zaheer Ally, or give an idea to someone working on a related issue.
I am facing a similar challenge. I have an API that, upon request by a client, needs to communicate with multiple external APIs and collect (delayed) results.
Since my PHP API is synchronous, it can only perform these requests sequentially. So, I was thinking of using a request queue, to which the producer (API) would send messages. Then, multiple workers would consume these messages, each of them performing one of these external API calls.
To get the results back, the producer would have created a temporary response queue, the name-identifier of which would be embedded in the message sent to workers. Hence, each worker would 'publish' his results on this temporary queue.
In the meantime, the producer would keep polling the temporary queue until he received the expected number of messages. Finally, he would delete the queue and send the collected results back to the client.
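The temporary-response-queue flow described above could be sketched in Python with boto3 roughly as follows. The `job` and `replyTo` message fields, the `reply-` queue name prefix, and the function name are all made up for illustration; workers are assumed to send their result to the queue URL found in `replyTo`.

```python
import json
import time
import uuid


def _default_sqs():
    # Imported lazily so the sketch can be exercised with a stub client.
    import boto3
    return boto3.client("sqs")


def request_and_collect(request_queue_url, jobs, sqs=None, timeout=30.0):
    """Fan jobs out on the request queue and gather replies from a temp queue."""
    sqs = sqs or _default_sqs()
    reply_url = sqs.create_queue(QueueName=f"reply-{uuid.uuid4()}")["QueueUrl"]
    try:
        for job in jobs:
            body = json.dumps({"job": job, "replyTo": reply_url})
            sqs.send_message(QueueUrl=request_queue_url, MessageBody=body)

        results = []
        deadline = time.monotonic() + timeout
        # Keep polling until every job has answered or we run out of time.
        while len(results) < len(jobs) and time.monotonic() < deadline:
            resp = sqs.receive_message(
                QueueUrl=reply_url, MaxNumberOfMessages=10, WaitTimeSeconds=5
            )
            for m in resp.get("Messages", []):
                results.append(m["Body"])
                sqs.delete_message(
                    QueueUrl=reply_url, ReceiptHandle=m["ReceiptHandle"]
                )
        return results
    finally:
        # Clean up the temporary queue even if collection failed or timed out.
        sqs.delete_queue(QueueUrl=reply_url)
```

The `finally` block addresses the cleanup concern: a queue per request is only workable if queues are reliably deleted afterwards.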
Yes, you could use RabbitMQ for a more "rpc" queue pattern.
But if you want to stay within AWS, try using something other than SQS for the response.
Instead, you could use S3 for the response. When your producer puts the item into SQS, include in the message an S3 destination for the response. When your consumer completes the tasks, put the response in the desired S3 location.
Then you can check S3 for the response.
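As a rough illustration of the "check S3 for the response" step, the producer could poll for the agreed key like this. It is a sketch under assumptions: the bucket and key are whatever the producer placed in the SQS message, and the timeout and polling interval are arbitrary.

```python
import time


def _default_s3():
    # Imported lazily so the sketch can be exercised with a stub client.
    import boto3
    return boto3.client("s3")


def wait_for_response(bucket, key, s3=None, timeout=60.0, interval=1.0,
                      sleep=time.sleep):
    """Poll S3 until the consumer has written the response object, or give up."""
    s3 = s3 or _default_s3()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Listing by prefix avoids having to catch a "no such key" error.
        listing = s3.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
        if any(obj["Key"] == key for obj in listing.get("Contents", [])):
            return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        sleep(interval)
    return None  # the consumer did not respond in time
```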
Update
You may be able to accomplish an RPC-like message queue using Redis.
https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis
Then, you can use AWS ElastiCache for your Redis cluster. This would completely replace the use of SQS.
Another option would be to use Redis' pub/sub mechanism to asynchronously notify your lambda that the backend work is done. You can use AWS's ElastiCache for Redis for an all-AWS-managed solution. Your lambda function would generate a UUID for each request, use that as the channel name to subscribe to, and pass it along in the SQS message; the backend workers would then publish a notification to that channel when the work is done.
I was facing this same problem so I tried it out, and it does work. Whether it's worth the effort over just polling S3 is another question. You have to configure the lambda functions to run inside your VPC, so they can access your Redis. I was going to have to do this anyway since I'd want the workers, in my case also lambda functions, to be able to access my Elasticsearch and RDS. But there are some considerations: most importantly, you need to use a private subnet with a NAT Gateway (or your own NAT Instance), so it can get out to the Internet and AWS managed services (including SQS).
One other thing I just stumbled across is that requests through API Gateway currently cannot take longer than 29 seconds, and this cannot be increased by AWS. You mentioned your jobs take 30 or more seconds, so this could be a showstopper for you using API Gateway and Lambda in this way anyway.
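The Redis pub/sub approach described in this answer could be sketched as follows, assuming a redis-py client and a boto3 SQS client are passed in. The `channel` and `payload` message fields and both function names are invented for the example; the 25-second default is chosen only to stay under the 29-second API Gateway limit mentioned above.

```python
import json
import time
import uuid


def submit_and_wait(redis_client, sqs, queue_url, payload, timeout=25.0):
    """Enqueue work on SQS, then wait on a per-request Redis channel."""
    channel = f"job-{uuid.uuid4()}"
    pubsub = redis_client.pubsub()
    # Subscribe before enqueueing so a fast worker's notification is not missed.
    pubsub.subscribe(channel)
    try:
        sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps({"channel": channel, "payload": payload}),
        )
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            msg = pubsub.get_message(ignore_subscribe_messages=True, timeout=1.0)
            if msg and msg["type"] == "message":
                return msg["data"]
        return None  # timed out waiting for the worker
    finally:
        pubsub.close()


def worker_done(redis_client, channel, result):
    """Called by a backend worker once it has finished the job."""
    redis_client.publish(channel, json.dumps(result))
```

Because pub/sub messages are not stored, a notification published while nobody is subscribed is lost, which is why the subscription is set up before the SQS message is sent.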
AWS now provides a Java client that supports temporary queues. This is useful for request/response patterns. I can't see a non-Java version.

What is the simplest way to process WebSockets messages on AWS?

What is a good way to deploy a WebSockets client on AWS?
I'm building an app on AWS which needs to subscribe to several WebSockets and several REST sources and process incoming messages (WebSockets) or make periodic requests (REST). I'm trying to go server-less and maximize use of AWS platform services, to eliminate the need to manage VMs, OS patches, etc. (and hopefully reduce cost).
My idea so far is to trigger a Lambda function every time a message arrives. The function can then transform/normalize the message and push it to an SQS queue for further processing by other subsystems.
There would be two types of such Lambda clients: one that subscribes to WebSockets messages, and another that makes HTTP requests periodically when invoked by a CloudWatch schedule. It would look something like this:
This approach seems reasonable for my REST clients, but I haven't been able to determine if it's possible to subscribe to WebSockets messages using Lambda. Lambdas can be triggered by IoT, and apparently IoT supports WebSockets now, but apparently only as a transport for the MQTT protocol:
AWS IoT Now Supports WebSockets, Custom Keepalive Intervals, and Enhanced Console
What is the best/easiest/cheapest way to deploy a WebSockets client without deploying an entire EC2 or Docker instance?

AWS Lambda fetch from ActiveMQ topic

I have an external data source as an ActiveMQ topic. I can only connect and consume messages. They come pretty rarely, about 1 message per 10-30 seconds.
I want to collect all the messages and put them into the database.
Also I'd like to have an active web page that can receive the new messages over WebSockets and draw a chart.
I have a prototype built with Python/Flask/MongoDB/SocketIO, BUT...
I would like to use Amazon AWS cloud infrastructure to avoid processing the data on servers.
I believe that AWS Lambda can accept the messages, store them in the database (DynamoDB?), and also send a notification (maybe using SQS) that is then transformed into a WebSocket message. (Not everything is clear there yet; maybe simple ajax polling will be enough.)
Here is the question: how would it be possible to consume the messages from an external ActiveMQ topic and process them with AWS Lambda?
I was looking at Kinesis, but it looks like it only supports data being pushed to it, not polling for data over some protocol...
You can use Lambda as a cron-like facility and poll on a schedule. See Lambda Scheduled Events.
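A scheduled poller along those lines might be structured like the sketch below. Both `fetch_next` and `store` are hypothetical callables that you would implement yourself (for example with a STOMP client and a DynamoDB write); only `context.get_remaining_time_in_millis()` is part of the Lambda runtime. It also assumes the broker retains messages between invocations, which for a topic typically means a durable subscription.

```python
def make_handler(fetch_next, store, reserve_ms=2000):
    """Build a Lambda handler that drains pending messages on each scheduled run.

    fetch_next() -> next pending message, or None when nothing is waiting
    store(msg)   -> persist one message
    Both are hypothetical and supplied by the caller.
    """
    def handler(event, context):
        stored = 0
        # Stop early enough that the function never hits its own timeout.
        while context.get_remaining_time_in_millis() > reserve_ms:
            message = fetch_next()
            if message is None:
                break
            store(message)
            stored += 1
        return {"stored": stored}

    return handler
```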