AWS WebSocket Connection Time Limit - Other options? - amazon-web-services

I am creating a panic button app that requires clients to remain connected for several hours in case a panic alert is triggered. AWS (API Gateway) WebSocket connections are disconnected after 2 hours, which defeats the purpose of my app.
I talked to AWS and they confirmed that there is no way to extend that, so I need to find an alternative. What is the best way to keep clients connected for a long time? I have computers that sometimes need to remain connected for up to 8 or 10 hours, and they can't disconnect, because if they do and a panic alert is triggered they would not get it.
I wanted to go serverless, but I do not know if that is possible given the limitation of WebSockets disconnecting after 2 hours. I did some research and people mentioned that I should use an ELB and EC2. Would I be able to deploy a Socket.io app this way? If not Socket.io, are there any other options? And how expensive can it get to keep clients connected to an EC2 instance with Socket.io installed on it?
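To be clear about the requirement: whatever transport I end up on, the client side will have to reconnect automatically so that a drop (at the 2-hour mark or otherwise) never leaves a machine unreachable. A rough sketch of that kind of client, using Python's websockets library purely as an illustration (the endpoint URL and the alert handling are placeholders, not part of any existing setup):

    # Sketch of an always-reconnecting WebSocket client (illustrative only).
    # The endpoint URL is a placeholder; any server-side stack (Socket.io on
    # EC2, API Gateway, etc.) could sit behind it.
    import asyncio
    import websockets

    PANIC_WS_URL = "wss://ws.example.com/alerts"  # placeholder endpoint

    def handle_alert(message: str) -> None:
        print("PANIC ALERT:", message)            # application-specific handling

    async def listen_for_alerts():
        while True:  # reconnect forever so a 2-hour cutoff never leaves us deaf
            try:
                async with websockets.connect(PANIC_WS_URL) as ws:
                    async for message in ws:
                        handle_alert(message)
            except Exception as exc:
                print(f"connection dropped ({exc}); reconnecting in 5s")
                await asyncio.sleep(5)            # simple backoff before retrying

    if __name__ == "__main__":
        asyncio.run(listen_for_alerts())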

Related

ECS graceful shutdown

I am a beginner to ECS. My team has a RESTful API service that processes asynchronous jobs on top of ECS using EC2. The main API creates some assets n times, and depending on the size of n a call can take up to many hours. During deployment these processes are abruptly terminated, so we currently do the manual work of noting down the processes that are ongoing and re-sending those requests after the deployment so that no job requests are lost. I read the article Graceful shutdowns with ECS, and it seems one option is to listen for the SIGTERM signal, make a note of the assets remaining to be created, and then let the hosts terminate. After deployment, when the hosts start up, they can check whether any such requests were pending before the deployment and do those first. Could someone please tell me if there is a better way to handle this? If not, could someone provide an example, some references, or more details on how the above approach can be implemented?
Thanks
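A rough sketch of the SIGTERM-checkpoint approach described above (the local JSON file is only a placeholder; in practice the pending work would be written to something durable such as DynamoDB, S3 or an SQS queue, since the host itself is going away):

    # Sketch of the SIGTERM-checkpoint idea: on shutdown, persist the assets
    # that are still pending so a freshly deployed task can pick them up again.
    import json
    import signal
    import sys

    pending_assets = []                           # names of assets still to be created
    CHECKPOINT_PATH = "/tmp/pending_assets.json"  # placeholder; use a durable store

    def handle_sigterm(signum, frame):
        # ECS sends SIGTERM first and SIGKILL after the stop timeout elapses,
        # so this handler must finish quickly.
        with open(CHECKPOINT_PATH, "w") as fh:
            json.dump(pending_assets, fh)
        sys.exit(0)

    signal.signal(signal.SIGTERM, handle_sigterm)

    def create_asset(name):
        print("creating asset", name)             # placeholder for the real work

    def resume_pending_work():
        # On startup, finish whatever a previous task left behind.
        try:
            with open(CHECKPOINT_PATH) as fh:
                leftovers = json.load(fh)
        except FileNotFoundError:
            leftovers = []
        for asset in leftovers:
            create_asset(asset)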

Load balance Postfix within AWS without SES?

I'm working on a project for a client who does message pre- and post-processing at very high volumes. I'm trying to figure out a reliable configuration in which two or three API servers push outgoing email messages to any of two or more instances of Postfix. This is for outbound only, and it's preferred not to have a single point of failure.
I am a bit lost within the AWS ecosystem. All I know is that we cannot use SES, and the client is set up for high-volume SMTP with Amazon, so throttling is not an issue.
I've looked into ELB, HAProxy, and a few other things, but the whole thing has gotten muddy and I'm not sure if I'm just overthinking it.
Any quick thoughts would be appreciated.
Thanks
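One option, if a load balancer turns out to be overkill for this, is to let the API servers themselves fail over between the Postfix relays. A rough sketch of that client-side failover (the relay hostnames and addresses are placeholders, not the client's actual setup):

    # Sketch of client-side failover across several Postfix relays: try each
    # relay in turn and stop at the first one that accepts the message.
    import smtplib
    from email.message import EmailMessage

    POSTFIX_RELAYS = ["postfix-1.internal.example.com",   # placeholder hosts
                      "postfix-2.internal.example.com"]

    def send_via_any_relay(msg):
        last_error = None
        for host in POSTFIX_RELAYS:
            try:
                with smtplib.SMTP(host, 25, timeout=10) as smtp:
                    smtp.send_message(msg)
                return                        # delivered to this relay
            except (smtplib.SMTPException, OSError) as exc:
                last_error = exc              # remember and try the next relay
        raise RuntimeError("all relays failed: %s" % last_error)

    msg = EmailMessage()
    msg["From"] = "noreply@example.com"
    msg["To"] = "user@example.com"
    msg["Subject"] = "test"
    msg.set_content("hello")
    send_via_any_relay(msg)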

Amazon EC2 Servers getting frozen sporadically

I've been working with Amazon EC2 servers for 3+ years and I have noticed a recurring behaviour: some servers freeze sporadically (between 1 and 5 times a year).
When this happens, I can't connect to the server (tried HTTP, MySQL and SSH connections) until the server is restarted.
The server comes back to work after a restart.
Sometimes the server stays online for 6+ months, sometimes it freezes about 1 month after a restart.
All the servers where I noticed this behavior were micro instances (North Virginia and São Paulo).
The servers have an ordinary Apache 2, MySQL 5, PHP 7 environment with Ubuntu 16 or 18. The PHP/MySQL web application is not CPU intensive and is not accessed by more than 30 users/hour.
The same environment and application on DigitalOcean servers does NOT reproduce the behaviour (I have two DigitalOcean servers running uninterrupted for 2+ years).
I like Amazon EC2 servers, mainly because Amazon has a lot of useful additional services (like SES), but this behaviour is really frustrating. Sometimes I get customer calls complaining about systems being down, and all I need is an instance restart to solve the problem.
Does anybody have a tip about solving this problem?
UPDATE 1
They are t2.micro instances (1 GB RAM, 1 vCPU).
MySQL SHOW GLOBAL VARIABLES: pastebin.com/m65ieAAb
UPDATE 2
There is a CPU utilization peak in the logs near the time the server went down, at 3 AM. At that time a daily crontab task makes a database backup. But considering this task runs every day, why would it only sometimes freeze the server?
I have not seen this exact issue, but on any cloud platform I assume any instance can fail at any time, so we design for failure. For example, we have auto scaling on all customer-facing instances; any time an instance fails, it is automatically replaced.
If a customer is calling to advise you a server is down, you may need to consider more automated methods of monitoring instance health and taking automated action to recover the instance.
CloudWatch also has server recovery actions available that can be triggered if certain metric thresholds are reached.
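For example, a recovery alarm on the EC2 system status check could be set up roughly like this with boto3 (the instance ID and region are placeholders):

    # Sketch: a CloudWatch alarm that automatically recovers the instance when
    # the EC2 system status check fails. Instance ID and region are placeholders.
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    cloudwatch.put_metric_alarm(
        AlarmName="auto-recover-i-0123456789abcdef0",
        Namespace="AWS/EC2",
        MetricName="StatusCheckFailed_System",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        Statistic="Maximum",
        Period=60,
        EvaluationPeriods=2,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        # The recover action moves the instance to healthy underlying hardware
        # while keeping its instance ID, private IPs and EBS volumes.
        AlarmActions=["arn:aws:automate:us-east-1:ec2:recover"],
    )

Note that this only catches hardware/hypervisor-level failures; a freeze caused by the instance itself (for example running out of memory during the 3 AM backup on a 1 GB t2.micro) would need a StatusCheckFailed_Instance alarm with a reboot action, or application-level monitoring.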

Scaling ActiveMQ on AWS

First of all, my knowledge of ActiveMQ, AMQPS and AWS auto scaling is fairly limited, and I have been handed this task where I need to create a scalable broker architecture for messaging over AMQPS using ActiveMQ.
In my current architecture I have a single-machine ActiveMQ broker where the messaging happens over AMQP + SSL, and as a product requirement there is publisher/subscriber authentication (mutual TLS) to ensure the right parties are talking to each other. That part is working fine.
Now the problem is that I need to scale the whole broker setup on the AWS cloud, with auto scaling in mind. Without auto scaling, I assume I can create a master/slave architecture using EC2 instances, but then adding more slaves would be more of a manual process than an automatic one.
I want to understand whether the two options below can serve the purpose:
ELB + ActiveMQ nodes being auto scaled
Something like a Bitnami-powered ActiveMQ AMI running with auto scaling enabled.
In the first case, where an ELB is in front, I understand that the ELB terminates SSL, which will break my mutual authentication. Also, I am not sure whether my pub/sub model will still work when different ActiveMQ instances run independently with no shared DB as such. If it can, a pointer or some reference material would be a help, as I am not able to find any myself.
In the second case, again my concern is how multiple ActiveMQ instances will coordinate with each other and ensure that every node has access to the data held in the queues.
The questions may be naive, but any pointers would be helpful.
AJ
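One note on the SSL-termination concern in the first option: a load balancer with a plain TCP listener on the AMQPS port passes TLS straight through to the brokers, so the mutual authentication stays end-to-end. A rough boto3 sketch of that, using a Network Load Balancer purely as an illustration (it is not part of the question, and the load balancer and target group ARNs are placeholders):

    # Sketch: an NLB listener that forwards raw TCP on the AMQPS port (5671),
    # so TLS is not terminated at the load balancer and the brokers still see
    # the clients' certificates for mutual authentication.
    import boto3

    elbv2 = boto3.client("elbv2", region_name="us-east-1")

    elbv2.create_listener(
        LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                        "loadbalancer/net/amq-nlb/0123456789abcdef",    # placeholder
        Protocol="TCP",          # plain TCP pass-through, no SSL termination
        Port=5671,               # standard AMQPS port
        DefaultActions=[{
            "Type": "forward",
            "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                              "targetgroup/amq-brokers/0123456789abcdef",  # placeholder
        }],
    )

This only addresses how connections reach the brokers; how independently running brokers share queue data (a shared store or a network of brokers) is a separate question.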

AWS auto-scaling websocket cluster

So I am changing lots of my sites' "live" updates (currently working over AJAX) to use WebSockets. I tried Pusher.com, but the pricing is ridiculously high for my amount of traffic, so I got slanger (cheers Steve!) up on a big fat EC2 instance plus an ElastiCache Redis instance, all good. For about 100M frames/day it seems to work fine, but I'd like to think ahead and consider what happens when I have even more traffic.
AWS ELBs do not support WebSocket communication, as I have read around here and on the AWS forums so far (quite lame considering people have been asking for this since WS first popped up, thanks AWS!). So I am thinking to:
0) start with one instance, ws.mydomain.com
1) set up an auto-scaling group
2) set up a CloudWatch alarm on average CPU/memory usage
3) when it goes above 75%, fire an SQS message saying something like "scale up now"
4) when the message from #3 is received by some other random server polling the queue, fire up a new instance, add it to the group (ohnoes, that AWS API again!) and add its public IP to the Route 53 DNS for ws.mydomain.com, so there will be 2 of them (rough sketch at the end of this question)
5) when load drops, fire another message, basically doing everything the other way around
So the question is: could this work, or would it be easier to go with an ELB in front of the slanger nodes?
TIA
Later edit:
1) don't care if we don't get the client IP
2) the slanger docs advertise that connection state is stored in Redis, so it does not matter which node the clients connect to; we don't need any session stickiness
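For what it's worth, a rough boto3 sketch of the worker in steps #3/#4 (the queue URL, AMI ID and hosted zone ID are placeholders, and the multivalue DNS record is just one way of adding the new IP to ws.mydomain.com):

    # Sketch of the scale-up worker from steps 3-4: poll the queue, launch a new
    # slanger node from a prebuilt AMI, and add its public IP as another A
    # record for ws.mydomain.com.
    import boto3

    sqs = boto3.client("sqs")
    ec2 = boto3.client("ec2")
    route53 = boto3.client("route53")

    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/scale-events"  # placeholder
    HOSTED_ZONE_ID = "Z0000000000000"                                            # placeholder

    def handle_scale_up():
        # Launch one more slanger node (AMI ID and instance type are placeholders).
        reservation = ec2.run_instances(ImageId="ami-0123456789abcdef0",
                                        InstanceType="c5.large",
                                        MinCount=1, MaxCount=1)
        instance_id = reservation["Instances"][0]["InstanceId"]
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        public_ip = ec2.describe_instances(InstanceIds=[instance_id])[
            "Reservations"][0]["Instances"][0]["PublicIpAddress"]

        # Add the new node to the DNS entry as a multivalue A record.
        route53.change_resource_record_sets(
            HostedZoneId=HOSTED_ZONE_ID,
            ChangeBatch={"Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "ws.mydomain.com",
                    "Type": "A",
                    "TTL": 60,
                    "SetIdentifier": instance_id,   # one record per node
                    "MultiValueAnswer": True,
                    "ResourceRecords": [{"Value": public_ip}],
                },
            }]},
        )

    def poll_forever():
        while True:
            resp = sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20)
            for msg in resp.get("Messages", []):
                if msg["Body"] == "scale up now":
                    handle_scale_up()
                sqs.delete_message(QueueUrl=QUEUE_URL,
                                   ReceiptHandle=msg["ReceiptHandle"])

Scaling down works the same way in reverse: terminate a node and delete its record, keeping the DNS TTL low so clients stop resolving the removed IP quickly.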