I'm looking to connect to a public websocket and receive the data into AWS Lambda or SNS.
Looking at similar posts, it seems the only way to do this is via EC2, ECS etc. These posts were from a few years ago so I'd first like to see if anyone has found a way to do this in a serverless manor?
Everything you are thinking of doing in EC2/ECS you should be able to do in AWS Lambda + a Cloudwatch rule to run this websocket data ingestion logic on whatever schedule you need. That said Lambda functions are optimized for running burst workloads and handling non-steady traffic, if you need to maintain a consistent real-time connection to this websocket and dont want to manage servers, your best option will be Fargate, which offers serverless compute for containers that you can use to build an application like yours.
Also, I highly recommend looking into AWS Copilot for a simple way to manage/deploy your application: https://aws.github.io/copilot-cli/
Related
I've been learning more and more about AWS lately. I've been reading through the white papers and working my way through the various services. I've been working on PHP applications and front-end dev for a while now. Two things really stuck out to me. Those two things are server-less architecture using Lambdas with event-triggers and SQS (queues). The last three years I have been working with REST over HTTP with frameworks like Angular.
It occurred to me though that one could create an entire back-end/service layer through Lambda's and message queues alone. Perhaps I'm naive as I have never used that type of architecture for a real world project but it seems like a very simple means to build a service layer.
Has anyone built a web application back-end consisting of only Lambdas and message queues as opposed to "traditional" http request with REST. If so what types of drawbacks are there to this type of architecture besides relying so heavily on a vendor like AWS?
For example, wouldn't it be entirely possible to build a CMS using these technologies where the scripts create the AWS assets programmatically given a key with full admin rights to an account?
Yes, you can practically create the entire backend service using serverless architecture.
There are a lot of AWS services that usually play into the serverless gambit of things.
DynamoDB, SNS, SQS, S3 to name a few.
AWS Lambda is the backbone and sort of acts as a glue to bind these services.
Serverless doesn't mean you move away fromĀ "traditional" http request to message queues. If you need the web interface you would still need to use HTTP. You would primarily use message queues to decouple your services.
So, if you want the service to be accessible over HTTP just like your REST services and still be serverless then you can do that as well. And for that you will need to use AWS API Gateway in conjunction with AWS Lambda
One primary drawback/limitation is that debugging is not very straightforward. You cannot login to the system and cannot attach remote debuggers. And then obviously you get tied into the vendor.
Then there are limitations on the resources. E.g. Lambda can offer you a maximum memory footprint of 5GB, so if you need to do some compute intensive job that needs more memory and can't be broken down into sub tasks then serverless (AWS Lambda) is not an option for you.
Is it possible to integrate AWS Lambda with Apache Kafka ?
I want to put a consumer in a lambda function. When a consumer receive a message the lambda function execute.
Continuing the point by Arafat. We have successfully built an infrastructure to consume from Kafka using AWS Lambdas. Here are some gotcha's:
Make sure to consistently batch and commit while reading when consuming.
If you are storing the batches to s3, make sure to clean your file descriptors.
If you are forwarding the batches to another service make sure to clean the variables. Variable caching in AWS Lambda might result in memory overflows.
A good idea is to check how much time you have left while from the context object in the Lambda and give yourself some wiggle room to do something with the buffer you populated in your consumer which might not be read to a file unless you call close().
We are using Apache Airflow for scheduling. I hear cloudwatch can do that too.
Here is AWS article on scheduled lambdas.
Given your Kafka installation will be running in a VPC, best practise is to configure your Lambda to run within the VPC as well - this will simplify the security group configuration for the EC2 instances running Kafka.
Here is the AWS blog article on configuring Lambdas to run in a VPC.
Yes it is very much possible to have a Kafka consumer in AWS Lambda function.
However note that you would not be able to invoke the lambda using some sort of notification. You will rather have to poll the Kafka topic. And the easiest way can be to use a Scheduled Lambda
If you are using managed apache kafka in AWS (MSK):
Since august 2020 you can connect AWS Managed Streaming for Kafka (MSK) as event source. Not your own installed kafka cluster but if you already uses AWS managed kafka this could be useful.
More in the announcement https://aws.amazon.com/about-aws/whats-new/2020/08/aws-lambda-now-supports-amazon-managed-streaming-for-apache-kafka-as-an-event-source/
Screenshot from AWS Console:
AWS now supports "self-hosted Apache Kafka as an event source for AWS Lambda"
When you create a new Lambda, in the "Configuration" tab, click "Add trigger", you can now select and configure your self-hosted Apache Kafka.
Feel free to read more here:
https://aws.amazon.com/blogs/compute/using-self-hosted-apache-kafka-as-an-event-source-for-aws-lambda/
https://docs.aws.amazon.com/lambda/latest/dg/kafka-smaa.html
There is a community-provided Kafka Connector for AWS Lambda. This solution would require you to run the connector somewhere such as EC2 or ECS.
I'd like to use AWS IoT to manage a grid of devices. Data by device must be sent to a queue service (RabbitMQ) hosted on an EC2 instance that is the starting point for a real time control application. I read how to make a rule to write data to other Service: Here
However there isn't an example for EC2. Using the AWS IoT service, how can I connect to a service on EC2?
Edit:
I have a real time application developed with storm that consume data from RabbitMQ and puts the result of computation in another RabbitMQ queue. RabbitMQ and storm are on EC2. I have devices producing data and connected to IoT. Data produced by devices must be redirected to the queue on EC2 that is the starting point of my application.
I'm sorry if I was not clear.
The AWS IoT supports pushing the data directly to other AWS services. As you have probably figured out by now publishing to third party APIs isn't directly supported.
From the choices AWS offers Lambda, SQS, SNS and Kinesis would probably work best for you.
With Lambda you could directly forward the incoming message using the one of Rabbit MQs APIs.
With SQS you would put it into an AWS queue first and than poll this queue transfering it to RabbitMQ.
Kinesis would allow more sophisticated processing, but is probably too complex.
I suggest you program a Lamba with the programming language of your choice using one of the numerous RabbitMQ APIs.
I'm working on a new game project at the moment that will consist of a React Native front-end and a Lambda-based back-end. The app requires some real time features such as active user records, geofencing, etc.
I was looking at Firebase's Realtime Database that looks like a really elegant solution for real-time data sync but I don't think AWS has anything quite like it.
The 3 options I could think of for "serverless" realtime using only AWS services are:
Option 1: AWS IoT Messaging over WebSockets
This one is quite obvious, a managed WebSockets connection through the IoT SDK. I was thinking of triggering Lambdas in response to inbound and outbound events and just use WebSockets as the realtime layer, building custom handling logic on the app client as you typically would.
The downside to this, at least compared to Firebase, is that I will have to handle the data in the events myself which will add another layer of management on top of WebSockets and will have to be standardized with the API data layer in the application's stores.
Pros:
Scalable bi-directional realtime connection
Cons:
Only works when the app is open
Message structure needs to be implemented
Multiple transport layers to be managed
Option 2: Push-triggered re-fetch
Another option is to use push notifications as real-time triggers but use a regular HTTP request to API Gateway to actually get the updated payload.
I like this approach because it sticks to only one transport layer and a single source of truth for application state. It will also trigger updates when the app is not open since these are Push Notifications.
The downside is that this is a lot of custom work with potentially difficult mappings between push notifications to the data that needs to be fetched.
Pros:
Push notifications work even when app is closed
Single source of truth, transport layer
Cons:
Most custom solution
Will involve many more HTTP requests overall
Option 3: Cognito Sync
This is newer to me and I'm not sure if it can actually be interfaced with from the server.
Cognito Sync offers user state sync. across devices complete with offline support and is part of the Cognito SDK which I'll be using anyway. It sounds like just what I'm looking for but couldn't find any conclusive evidence as to whether it is possible to modify, or "trigger", updates from AWS and not just from one of the devices.
Pros:
Provides an abstracted real-time data model
Connected to Cognito user records OOTB
Cons:
Not sure if can be modified or updated from Lambdas
I'm wondering if anyone has experience doing real-time on AWS as part of a Lambda-based architecture and if you have an opinion on what is the best way to proceed?
I asked a similar question to the AWS Support, and this was their response.
My question to them:
What's the group of AWS services (if it's possible) to give that same
in-browser real-time DBaaS feel like Firebase?
AWS Cognito seems to be great for user-accounts. Is there anything
similar for the WebSockets / real-time DB part?
Their response:
To your question, Firebase is closest to the AWS service AWS
MobileHub. You can check out more details below about mobilehub from
below link.
https://aws.amazon.com/mobile/details/
https://aws.amazon.com/mobile/getting-started/
"AWS Cognito seems to be great for user-accounts. Is there anything
similar for the WebSockets / real-time DB part?"
Amazon Dynamodb is a fast and flexible NoSQL database service for all
applications that need consistent, single-digit millisecond latency at
any scale. It is a fully managed cloud database and supports both
document and key-value store models. Its flexible data model, reliable
performance, and automatic scaling of throughput capacity, makes it a
great fit for mobile, web, gaming, ad tech, IoT, and many other
applications.
Amazon Dynamodb can be further optimized with Amazon DynamoDB
Accelerator (DAX) which is a fully managed, highly available,
in-memory cache that can reduce Amazon DynamoDB response times from
milliseconds to microseconds, even at millions of requests per second.
For more information, please see below documentation.
https://aws.amazon.com/dynamodb/getting-started/
https://aws.amazon.com/dynamodb/dax/
Should you have any further questions, please do not hesitate to let
me know.
Thanks.
Best regards,
Tayo O. Amazon Web Services
Check out the AWS Support Knowledge Center, a knowledge base of
articles and videos that answer customer questions about AWS services:
https://aws.amazon.com/premiumsupport/knowledge-center/?icmpid=support_email_category
Also while researching this answer I also found this, looks interesting:
https://aws.amazon.com/blogs/database/how-to-build-a-chat-application-with-amazon-elasticache-for-redis/
The comments to that article is interesting as well.
Jacob Wakeem:
What advantage this
approach have over using aws iot? It seems that iot has all these
functionality without writing a single line of code and with
server-less architecture.
Sam Dengler:
The managed PubSub feature in the AWS IoT
service is also a good approach to message-based applications, like
the one demonstrated in the article. With Elasticache (Redis),
customers who use Pub/Sub are typically also using Redis as a data
store for other use cases such as caching, leaderboards, etc. With
that said, you could also use ElastiCache (Redis) with the AWS IoT
service by triggering an AWS Lambda function via the AWS IoT rules
engine. Depending on how the message-based application is architected
and how the data is leveraged, one solution may be a better fit than
the other.
Check out AWS AppSync for some of these realtime and offline features using different data sources, including databases search and compute.
AWS Amplify is AWS's modern answer to Firebase.
Fastest way to build mobile and web applications
AWS Amplify is a development platform for building secure, scalable
mobile and web applications. It makes it easy for you to authenticate
users, securely store data and user metadata, authorize selective
access to data, integrate machine learning, analyze application
metrics, and execute server-side code. Amplify covers the complete
mobile application development workflow from version control, code
testing, to production deployment, and it easily scales with your
business from thousands of users to tens of millions. The Amplify
libraries and CLI, part of the Amplify Framework, are open source and
offer a pluggable interface that enables you to customize and create
your own plugins.
Sounds like AWS Serverless is most suited alternative.
Also wondering: AWS vs Firebase - Is It Even a Fair Fight?
AWS Amplify. You can find more information here: AWS Amplify
You could consider using supabase.
It is opensource and can be installed onto ec2 / docker containers.
https://supabase.com/docs/guides/hosting/docker
I've found the hosted solution / free really poewrful to get up and running quickly. (yet to deploy to aws)
I know this is an old question, but nowadays AWS offers AppSync... a service that destroys Firebase RDB in every aspect
What is a good way to deploy a WebSockets client on AWS?
I'm building an app on AWS which needs to subscribe to several WebSockets and several REST sources and process incoming messages (WebSockets) or make periodic requests (REST). I'm trying to go server-less and maximize use of AWS platform services, to eliminate the need to manage VMs, OS patches, etc. (and hopefully reduce cost).
My idea so far is to trigger a Lambda function every time a message arrives. The function can then transform/normalize the message and push it to an SQS queue for further processing by other subsystems.
There would be two types of such Lambda clients, one that subscribes to WebSockets messages and another that makes HTTP request periodically when invoked by a CloudWatch schedule. It would look something like this:
This approach seems reasonable for my REST clients, but I haven't been able to determine if it's possible to subscribe to WebSockets messages using Lambda. Lambdas can be triggered by IoT, and apparently IoT supports WebSockets now, but apparently only as a transport for the MQTT protocol:
AWS IoT Now Supports WebSockets, Custom Keepalive Intervals, and Enhanced Console
What is the best/easiest/cheapest way to deploy a WebSockets client without deploying an entire EC2 or Docker instance?