Serverless web socket server? [closed] - amazon-web-services

Closed 3 years ago.
Are there any serverless technologies that allow a serverless websocket server to be built?
I know the nature of long running connections is that they are stateful, but if the only state is the connection itself at the transport layer then it seems like there could be a serverless product that abstracts this away so you only deal with the application layer. Is there a cloud provider (AWS, Azure, etc) that allows this? I can't see a way for AWS Lambda or Azure Functions to achieve this.
Anyone got any ideas? Just checking.
Thanks

With the release of WebSocket support for AWS API Gateway, you can create a serverless WebSocket API.
A WebSocket API is composed of one or more routes. To determine which route a particular inbound request should use, you provide a route selection expression. The expression is evaluated against an inbound request to produce a value that corresponds to one of your route’s routeKey values. API Gateway then routes the request to the corresponding Lambda function.
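As a rough model of that lookup, assuming the route selection expression is `$request.body.action` and messages are JSON: the function name `select_route` and the example route keys are hypothetical, and API Gateway performs this evaluation itself, so this only illustrates the behaviour and is not something you deploy.

```python
import json
from typing import Set


# Hypothetical model of route selection; API Gateway does this internally.
def select_route(message: str, route_keys: Set[str], field: str = "action") -> str:
    """Return the route key named by message[field], or "$default" if none matches."""
    try:
        body = json.loads(message)
    except json.JSONDecodeError:
        return "$default"  # the expression cannot be evaluated against non-JSON input
    value = body.get(field) if isinstance(body, dict) else None
    if isinstance(value, str) and value in route_keys:
        return value
    return "$default"


routes = {"sendMessage", "joinRoom"}
print(select_route('{"action": "sendMessage", "data": "hi"}', routes))
print(select_route("not json", routes))
```

Messages whose action is missing or unknown land on `$default`, which is why it is usually worth defining that route.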
From the AWS blog example:
The application is composed of the WebSocket API in API Gateway that handles the connectivity between the client and servers (1). Two AWS Lambda functions react when clients connect (2) or disconnect (5) from the API. The sendMessage function (3) is invoked when the clients send messages to the server. The server sends the message to all connected clients (4) using the new API Gateway Management API. To track each of the connected clients, use a DynamoDB table to persist the connection identifiers (You can also use it to store other state information about the connection).
You can extend this to send messages to clients whenever data changes by using DynamoDB Streams.
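A minimal sketch of the connect / disconnect / sendMessage flow described above, under some loud assumptions: the in-memory `connections` set stands in for the DynamoDB table, the `send` callable stands in for the API Gateway Management API, and a single `handler` replaces the separate Lambda functions. The event fields (`requestContext.connectionId`, `requestContext.routeKey`, `body`) follow the shape API Gateway passes to a Lambda proxy integration; the `data` field in the message body is an assumption.

```python
import json
from typing import Callable, Dict, Set

# Stand-in for the DynamoDB table of connection identifiers (assumption).
connections: Set[str] = set()


def handler(event: Dict, send: Callable[[str, str], None]) -> Dict:
    """Dispatch one WebSocket event by route key.

    `send(connection_id, data)` is a placeholder; in a real deployment it
    would wrap the Management API, e.g. boto3's post_to_connection call.
    """
    ctx = event["requestContext"]
    connection_id, route = ctx["connectionId"], ctx["routeKey"]

    if route == "$connect":
        connections.add(connection_id)        # (2) remember the new client
    elif route == "$disconnect":
        connections.discard(connection_id)    # (5) forget the client
    elif route == "sendMessage":
        data = json.loads(event["body"])["data"]
        for target in sorted(connections):    # (4) broadcast to every client
            send(target, data)
    return {"statusCode": 200}
```

Because the function holds no per-connection state of its own, any invocation can serve any client, which is what lets this run on Lambda.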

Currently AWS Lambda and Azure Functions don't support this. If you plan to set up a scalable environment in AWS with WebSockets, you can put an Application Load Balancer in front of an ECS cluster or EC2 instances running a server with WebSocket support, such as Node.js.
Another solution is to use a fully managed service, such as Google Firebase or PubNub, in your architecture to handle the real-time part.

if the only state is the connection itself at the transport layer
That's not really the case. Web socket connections exchange keep-alives as layer 7 payload. Others might argue that it's more accurately described as a sublayer somewhere between layers 6 and 7... but in any event, it is well above the transport layer.
And many applications use web sockets in other ways that are also not stateless. Once connected, then authenticated, there's no need to continually re-authenticate, because the client on the socket now will be the same client 15 minutes from now, and this is overhead that would not be avoidable in a serverless environment -- every action on a websocket would need to be re-authenticated. For another example, with a constant data stream, the server might keep track of what has been sent or what specific subset of the stream the client is interested in.
If you aren't maintaining (or don't need) a persistent connection to a server, the question could be asked "why are you using a web socket?"
Perhaps also relevant: HAProxy, a commonly used load balancer with web socket support, maintains a persistent connection to a single back-end server for each current web socket connection. If that backend server goes offline, there's no provision in the balancer to choose another back-end for the existing connection. The client will need to reconnect.

AWS IoT provides MQTT endpoints and it supports MQTT + WebSocket on port 443. This might be the closest thing you can get as a hosted service on AWS.
Check this link: AWS IoT Protocols
You can define rules that trigger Lambdas on AWS IoT or pass them to Kinesis and process streams through Lambdas.

Fanout can do this. It works as a proxy that can translate WebSocket client activity into a series of HTTP requests. This allows FaaS backends like Lambda to manage raw WebSockets. The function is invoked only when there is activity to react to.
Docs: https://fanout.io/docs/devguide.html#custom-websocket-api
WebSocket-over-HTTP protocol: http://pushpin.org/docs/protocols/websocket-over-http/
Python helper library: https://github.com/fanout/python-faas-grip
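To give a feel for what the function receives, here is a sketch of decoding and encoding the event body. The framing used (an event type, an optional hexadecimal content size, CRLF separators) is my reading of the WebSocket-over-HTTP document linked above, so treat it as an assumption and check the spec before relying on it. The helper names are hypothetical and this is not the python-faas-grip API.

```python
from typing import Iterable, List, Tuple

Event = Tuple[str, bytes]  # (event type, content)


def decode_events(body: bytes) -> List[Event]:
    """Split a request body such as b"OPEN\r\nTEXT 5\r\nhello\r\n" into events."""
    events: List[Event] = []
    pos = 0
    while pos < len(body):
        end = body.index(b"\r\n", pos)
        header = body[pos:end].decode("ascii")
        pos = end + 2
        if " " in header:
            etype, size_hex = header.split(" ", 1)
            size = int(size_hex, 16)          # content size is hexadecimal
            content = body[pos:pos + size]
            pos += size + 2                   # skip content and its trailing CRLF
        else:
            etype, content = header, b""
        events.append((etype, content))
    return events


def encode_events(events: Iterable[Event]) -> bytes:
    out = bytearray()
    for etype, content in events:
        if content:
            out += f"{etype} {len(content):X}\r\n".encode("ascii") + content + b"\r\n"
        else:
            out += etype.encode("ascii") + b"\r\n"
    return bytes(out)


def handle(body: bytes) -> bytes:
    """Echo handler: accept the connection and send every text message back."""
    replies: List[Event] = []
    for etype, content in decode_events(body):
        if etype == "OPEN":
            replies.append(("OPEN", b""))     # replying OPEN accepts the socket
        elif etype == "TEXT":
            replies.append(("TEXT", content))
    return encode_events(replies)
```

The function body is a plain HTTP request and response, so nothing has to stay running between messages.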

AWS Lambda, Azure Functions, and GCP Functions don't support the WebSocket protocol, so it's normal to use third-party services for this purpose.
In addition to the comments above, I can propose Ably: https://www.ably.io/
It provides APIs and libraries for handling WebSockets without pain.

Related

Is it possible to use AWS Application Loadbalancer with RSocket?

Is it possible to use AWS Application Loadbalancer for RSocket?
An AWS Application Load Balancer can also be used for WebSocket connections, and my project uses RSocket with WebSocket as its transport. This made me wonder whether it is possible to use this load balancer for RSocket as well.
On one hand, I would think it is possible to use this load balancer, as it only receives a connection and passes it to the target RSocket server.
On the other hand, if all RSocket frames go through the load balancer, it might not know how to handle these frames, which would make it impossible to use.
I couldn't find much about RSocket and load balancing online besides this post. But that is client-side load balancing, and I was looking for server-side load balancing.
And this post. But that uses LoadBalanceSocketClient, while I want to find out whether an AWS Application Load Balancer can be used.
Here follows a simple diagram of what I would like to have (if possible):
The RSocket client connects to the load balancer, which passes the connection to an RSocket server (for example server A). Then the client and RSocket server A can communicate.
AWS will see this as a typical WebSocket service. So as long as it lets HTTP/1.1 connections through and lets them upgrade to WebSocket, there shouldn't be a problem; this is very standard. Ideally it won't see individual frames of the traffic, and your app will handle all frames on a single WebSocket connection. But it looks like the API Gateway support does deal with individual messages: https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-set-up-websocket-deployment.html
You should ignore the RSocket client load balancing and focus on AWS WebSocket routing.
As an example, with GCP (instead of AWS) the complexity is that this bumps you up from AppEngine Standard to Flexible. The demo site https://demo.rsocket.io/ is deployed to GCP and exposes websockets.
The additional kink is that you possibly want stateful routing if you want client resumption.

AWS API Gateway integration with Socket.io

I want to map an API Gateway endpoint with a Socket.io server endpoint, in order to authenticate users through Cognito and, if successful, redirect to the Socket.io server and establish a socket with optional namespace and rooms.
Does that make sense? I didn't find any examples, and API Gateway has only recently enabled a WebSocket API, but without support for Socket.io.
Your question has two parts:
First, the API Gateway using Cognito to authenticate your client;
Second, assuming you are using an EC2 running Node.JS with Socket.IO using API Gateway as an endpoint for your clients.
For the First part, you may use the following reference from AWS documentation.
There are several sub-parts when you talk about AWS Cognito, for example IAM permissions for Method Execution, which enable an HTTP method on an API resource endpoint.
For the second part, to enable API Gateway to establish a synchronous connection with the EC2 port running Socket.io, you may read some references like this one.
You should configure your API Gateway:
Protocol WebSocket connection
Select your Route Selection Expression, e.g. $default
Map the target backend for each $connect, $disconnect and $default
Use integration type AWS Service
Select EC2 and fill the rest of configs.
The answer by Rafael focuses more on using the WebSocket API Gateway, which in my opinion is still relatively new and has some room for improvement. Plus, I don't like having Lambda integrations with database access, because without RDS Proxy they exhaust the DB connections really fast. I also don't think HTTP integration adds anything, because you're performing an HTTP request in the end, only called through the WebSocket API.
One thing I agree on with Rafael is that you need to have an EC2 instance running socket.io, whether it's in Node.js or Python (I used Python with Flask in my case).
I managed to connect to my socket.io server by using the HTTP API Gateway and setting allow_upgrades=False so that the HTTP protocol won't be upgraded to the ws protocol, because the HTTP API Gateway doesn't support ws. My HTTP API Gateway just forwards socket.io requests to the load balancer, and the good thing about that is that you can define access control on each route defined in the HTTP API Gateway.
The socket.io on my EC2 instance is defined like this:
from flask_socketio import SocketIO
socketio = SocketIO(async_handlers=True, allow_upgrades=False, cors_allowed_origins='*')
And my client connects to it by simply calling the route defined in the HTTP API Gateway which has proxy integration enabled.
https://xxxxxxxxx.execute-api.us-west-2.amazonaws.com/socket.io/{proxy}
Final result - client connected to socket
Before websocket technology, if you wanted real-time data in your browser, you needed a wasteful polling strategy. That's why websocket technology was introduced. However, it took some time before browsers supported it. On top of that, it wasn't that good at handling reconnects.
Socket-io gave us early access to a reliable solution by combining multiple protocols and adding several features to improve stability and recover from errors. With new releases, the protocol changed, and more flags and options were added.
That evolution made socket-io what it is today, which isn't exactly an "open standard". For that reason, it will probably never be decently supported on AWS.
Some possible solutions:
Having said that, browsers have evolved and most of them support websockets now. So, you could consider migrating (back) from socket-io to plain old websockets. Nevertheless, you probably want to add a "heartbeat" that sends ping/pong messages back and forth to detect disconnects (which is one of those things that socket-io has built in).
However, if you like GraphQL, then you should certainly consider AWS AppSync, which among other things supports GraphQL subscriptions to push notifications to the client. Apollo client is extremely popular and reliable.
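For the plain-websockets route, the server-side bookkeeping behind such a heartbeat could be sketched as follows. The class and method names are hypothetical, the 30-second timeout is an arbitrary assumption, and sending the actual ping frames is left to whichever websocket library you use.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks the last pong per connection and reports connections that went quiet."""

    def __init__(self, timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout
        self.clock = clock                      # injectable so it can be tested
        self.last_seen: Dict[str, float] = {}

    def on_connect(self, conn_id: str) -> None:
        self.last_seen[conn_id] = self.clock()

    def on_pong(self, conn_id: str) -> None:
        if conn_id in self.last_seen:
            self.last_seen[conn_id] = self.clock()

    def reap(self) -> List[str]:
        """Return and forget every connection with no pong within the timeout."""
        now = self.clock()
        dead = [c for c, seen in self.last_seen.items() if now - seen > self.timeout]
        for conn_id in dead:
            del self.last_seen[conn_id]
        return dead
```

You would call `reap()` on a timer, close whatever it returns, and let those clients reconnect.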

best architecture to deploy TCP/IP and UDP service on amazon AWS (Without EC2 instances)

I am trying to figure out the best way to deploy a TCP/IP and UDP service on Amazon AWS.
I researched this before asking and could not find anything. I found other protocols like HTTP and MQTT, but nothing for TCP or UDP.
I need to refactor a GPS tracking service that currently runs on Amazon EC2. The GPS devices send their position data using the UDP and TCP protocols. Every time a message is received, the server has to respond with an ACKNOWLEDGE message, confirming reception to the GPS device.
The problem I am facing right now, and the motivation for the refactor, is:
When the traffic increases, the server is not able to keep up with all the messages.
I tried to solve this issue with a load balancer and autoscaling, but UDP is not supported.
I was wondering if there is something like API Gateway that gives me a TCP or UDP endpoint, leaves the message on an SQS queue, and processes it with a Lambda function.
Thanks in advance!
Your question really doesn't make a lot of sense - you are asking how to run a service without running a server.
If you have reached the limits of a single instance, and you need to grow, look at using the AWS Network Load Balancer with an autoscaled group of EC2 instances. However, this will not support UDP - if you really need that, then you may have to look at 3rd party support in the AWS Marketplace.
Edit: Serverless architectures are designed for HTTP-based applications, where you send a request and get a response. Since your app is TCP-based and uses persistent connections, most existing serverless implementations simply won't support it. You will need to rewrite your app to support HTTP, or use traditional server-based infrastructure that can support persistent connections.
Edit #2: As of Dec. 2018, API gateway supports WebSockets. This probably doesn't help with the original question, but opens up other alternatives if you need to run lambda code behind a long running connection.
If you want to go more serverless, I think the ECS container service has instances that accept TCP and UDP. Also take a look at running Docker containers with Kubernetes. I am not sure whether they support those protocols, but I believe they do.
If not, some EC2 instances with load balancing may be your best bet.
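For reference, the receive-and-acknowledge loop from the question, as it might run on each instance or container, can be sketched with the standard library. The ACK format in `build_ack` is made up for illustration, since the real device protocol is not given, and the port number is an arbitrary assumption.

```python
import socketserver


def build_ack(datagram: bytes) -> bytes:
    """Hypothetical ACK: b"ACK:" plus the first 8 bytes of the message in hex."""
    return b"ACK:" + datagram[:8].hex().encode("ascii")


class TrackerHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        # For UDP servers, self.request is a (data, socket) pair.
        data, sock = self.request
        sock.sendto(build_ack(data), self.client_address)


def serve(host: str = "0.0.0.0", port: int = 9000) -> None:
    """Answer each datagram in its own thread until interrupted."""
    with socketserver.ThreadingUDPServer((host, port), TrackerHandler) as server:
        server.serve_forever()
```

Keeping the handler free of per-device state is what makes it possible to run several copies side by side.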

Is any aws service suitable for sending real time updates to browser?

I'm developing a stocks app and have to keep users' browsers updated with pricing changes.
I don't need to access past data; the browser just has to get the current data whenever it changes.
Is it possible to filter a DynamoDB stream and expose an endpoint (behind API Gateway) that could be used with a JavaScript EventSource?
I realize this is not using Server Sent Events but AWS just announced Serverless WebSockets for API Gateway. Pricing is based on minutes connected and number of messages sent.
Product Launch Article: https://aws.amazon.com/about-aws/whats-new/2018/12/amazon-api-gateway-launches-support-for-websocket-apis/
Documentation: https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-websocket-api.html
Pricing: https://aws.amazon.com/api-gateway/pricing/
API Gateway is a store-and-forward service. It collects the response from whatever the back-end may happen to be (Lambda, an HTTP server, etc.) and then returns it en bloc to the browser -- it doesn't stream the response, so it would not be suited for use as an EventSource.
AWS doesn't currently have a managed service offering that is obviously suited to this use case... you'd need a server (or more than one) on EC2, consuming the data stream and relaying it back to the connected browsers.
Assuming that running EC2 servers is an acceptable option, you then need HTTPS and load balancing. Application Load Balancer supports web sockets, so it might also support an EventSource. A Classic ELB in TCP (not HTTP) mode should support an EventSource without a problem, though it might not correctly signal to the back-end when the browser connection is lost. Both of those balancers can also offload HTTPS for you. Network Load Balancer would definitely work for balancing an EventSource, but your instances would need to provide the HTTPS, since NLB doesn't offload it for you.
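On such a server, the part that turns price updates into an EventSource stream is small. Here is a sketch, assuming ticks arrive as `(symbol, price)` pairs and that the response is served with `Content-Type: text/event-stream`. The function names and the `price` event name are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, Optional, Tuple


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Encode one Server-Sent Events frame; a blank line terminates the frame."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Multi-line payloads need one "data:" field per line.
    lines.extend(f"data: {line}" for line in (data.splitlines() or [""]))
    return "\n".join(lines) + "\n\n"


def price_events(ticks: Iterable[Tuple[str, float]]) -> Iterator[str]:
    """Yield a frame only when a symbol's price actually changes."""
    latest: Dict[str, float] = {}
    for symbol, price in ticks:
        if latest.get(symbol) != price:
            latest[symbol] = price
            payload = json.dumps({"symbol": symbol, "price": price})
            yield format_sse(payload, event="price")
```

Since no history is needed, the only state is the latest price per symbol, and a newly connected browser just picks up from the next change.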
A somewhat unorthodox alternative might actually be AWS IoT, which has built-in websocket support... Not the same as eventsource, of course, but a streaming connection nonetheless... in such an environment, I suppose each browser user could be an addressable "thing."

How to implement service as app in DEA?

I am trying to create a clustered cache service for Cloud Foundry. I understand that I need to implement the Service Broker API. However, I want this service to be clustered and to run in the Cloud Foundry environment. As you know, container-to-container connections (TCP) are not supported yet, and I don't want to host my backend in another environment.
Basically my question is almost the same as this one: http://grokbase.com/t/cloudfoundry.org/vcap-dev/142mvn6y2f/distributed-caches-how-to-make-it-work-multicast
And I am trying to achieve the solution he advised:
B) is to create a CF Service by implementing the Service Broker API as some of the examples show at the bottom of this doc page [1]. Services have no inherent network restrictions, so you could have a CF Caching Service that uses multicast in the cluster, then you would have local cache clients on your apps that could connect to this cluster using outbound protocols like TCP.
First of all, where does this service live? In the DEA? Will the backend implementation be in the broker itself? How can I implement the backend so that the cluster can scale? Do I start the same service broker over again?
The second, and really important, question is: how do the other services work if TCP connections are not allowed for apps? For example, how does a MySQL service communicate with the app?
There are a few different ways to solve this; the more robust the solution, the more complicated it is.
The simplest solution is to have a fixed number of backend cache servers, each with their own distinct route, and let your client applications implement (HTTP) multicast to these routes at the application layer. If you want the backend cache servers to run as CF applications, then for now, all solutions will require something to perform the HTTP multicast logic at the application layer.
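A sketch of that application-layer multicast on the client side, assuming each backend exposes a hypothetical `/cache/<key>` endpoint. The `put` callable is a placeholder for whatever HTTP client you use, and it is expected to return True on success.

```python
from typing import Callable, Dict, List


def multicast_put(routes: List[str], key: str, value: bytes,
                  put: Callable[[str, bytes], bool]) -> Dict[str, bool]:
    """Send the same cache write to every backend route; report per-route success."""
    results: Dict[str, bool] = {}
    for route in routes:
        url = f"{route.rstrip('/')}/cache/{key}"   # hypothetical endpoint layout
        try:
            results[route] = put(url, value)
        except OSError:
            # One unreachable backend should not abort the remaining writes.
            results[route] = False
    return results
```

The caller still has to decide what a partial failure means, for example retrying or evicting the key on the backends that missed the write.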
The next step would be to introduce an intermediate service broker, so that your client apps can all just bind to the one service to get the list of routes of the backend cache servers. So you would deploy the backends, then deploy your service broker API instances with the knowledge of the backends, and then when client apps bind they will get this information in the user-provided service metadata.
What happens when you want to scale the backends up or down? You can then get more sophisticated, where the backends are basically registering themselves with some sort of central metadata/config/discovery service, and your client apps bind to this service and can periodically query it for live updates of the cache server list.
You could alternatively move the multicast logic into a single (clustered) service, so:
backend caches register with the config/metadata/discovery service
multicaster periodically queries the discovery service for list of cache server routes
client apps make requests to the multicaster service
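The discovery piece in that arrangement can be sketched as a registry where backends re-register periodically and entries expire after a time-to-live. This is an in-memory, single-node illustration only: the class name and the 30-second TTL are assumptions, and it sidesteps the clustering problem entirely.

```python
import time
from typing import Callable, Dict, List


class Registry:
    """Backends call register() periodically; routes expire after `ttl` seconds."""

    def __init__(self, ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock                    # injectable so it can be tested
        self._seen: Dict[str, float] = {}

    def register(self, route: str) -> None:
        self._seen[route] = self.clock()

    def live_routes(self) -> List[str]:
        """Routes that registered within the TTL, for the multicaster to query."""
        now = self.clock()
        return sorted(r for r, seen in self._seen.items() if now - seen <= self.ttl)
```

The multicaster would poll `live_routes()` and fan each client request out to whatever list comes back.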
One difficulty is implementing the metadata service if you're doing it yourself. If you want it clustered, you need to implement a highly-available-ish, consistent-ish datastore. That is almost the original problem you're solving, except that the service handles replicating data to all nodes in the cluster, so you don't have to multicast.
You can look at https://github.com/cloudfoundry-samples/github-service-broker-ruby for an example service broker that runs as a CF application.