Trigger instance launch on inbound IP traffic on its interface

AWS spot instances are started when the price drops; this can happen at any time of day (or not for many days, depending on the configuration). I'm looking for a similar way to start an instance when there's a user request, e.g. an HTTP request, an SSH connection, etc.
I'm willing to monitor inbound IP traffic to my Elastic IPs (on my internet gateway), and if the target instance is stopped, launch it: a kind of wake-on-inbound-traffic.
The monitoring can work in one of two ways:
Passive: mirror the traffic, analyze it offline with some libpcap-based tool, and launch instances based on the capture analysis. The downside is that the earliest IP requests to the instance will time out until it starts and bootstraps.
Active: analyze inbound traffic inline (using a libpcap-based tool as well) and start the target instance, redirecting HTTP requests to a "Please wait while we warm up your instance" page until the instance has started.
Has anyone thought of doing this? Any guidelines for doing it?
Thanks!
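For illustration, here is a minimal sketch of the "active" variant in Python with boto3: a tiny TCP proxy that starts the stopped instance on the first inbound connection and then forwards traffic to it. The region, instance ID, and ports are placeholder assumptions, and a real setup would also need the "please wait" page and proper error handling.

import socket
import threading
import boto3

REGION = "us-east-1"                  # assumption: adjust to your region
INSTANCE_ID = "i-0123456789abcdef0"   # placeholder instance ID
LISTEN_PORT = 8080                    # port the proxy listens on
BACKEND_PORT = 80                     # port the real service listens on

ec2 = boto3.client("ec2", region_name=REGION)

def ensure_running():
    """Start the instance if it is stopped, wait until it is running,
    and return its private IP (the proxy is assumed to share the VPC)."""
    desc = ec2.describe_instances(InstanceIds=[INSTANCE_ID])
    instance = desc["Reservations"][0]["Instances"][0]
    if instance["State"]["Name"] == "stopped":
        ec2.start_instances(InstanceIds=[INSTANCE_ID])
    ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])
    desc = ec2.describe_instances(InstanceIds=[INSTANCE_ID])
    return desc["Reservations"][0]["Instances"][0]["PrivateIpAddress"]

def pump(src, dst):
    """Copy bytes one way until either side closes."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        dst.close()

def handle(client):
    backend_ip = ensure_running()  # blocks until the instance is up
    backend = socket.create_connection((backend_ip, BACKEND_PORT))
    threading.Thread(target=pump, args=(client, backend), daemon=True).start()
    pump(backend, client)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", LISTEN_PORT))
server.listen(5)
while True:
    conn, _ = server.accept()
    threading.Thread(target=handle, args=(conn,), daemon=True).start()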

Related

AWS classic LB changing IPs/dropping connections results in lost messages on RabbitMQ

I run a RabbitMQ HA cluster with 3 nodes and a classic AWS load balancer (LB) in front of them. There are two apps: one publishes and the other consumes through the LB.
When the publisher app starts sending 3 million messages, after a short period its connection is put into the Flow Control state. After the publishing is finished, the publisher app's logs show that all 3 million messages were sent. On the other hand, the consumer app's log shows only 500K-1M messages (it varies between runs), which means that a large number of messages are lost.
So what is happening is that in the middle of a run, the classic LB decides to change its IP address or drop connections, thus losing a lot of messages (see my update for more details).
The issue does not occur if I skip the LB and hit the nodes directly, doing load balancing on the app side. Of course, in this case I lose all the benefits of ELB.
My questions are:
Why is the LB changing IP addresses and dropping connections? Is that related to the high message rate from the publisher, or to the Flow Control state?
How do I configure the LB so that this issue doesn't occur?
UPDATE:
This is my understanding of what is happening:
I use AMQP 0-9-1 and publish without 'publish confirms', so a message is considered sent as soon as it's put on the wire. Also, the connection on the RabbitMQ node is between the LB and the node, not between the publisher app and the node.
Before the communication enters Flow Control, messages are passed from the LB to a node immediately.
Then the connection between the LB and the node enters Flow Control. The publisher app's connection is not blocked, so it continues to publish at the same rate, causing messages to pile up on the LB.
Then the LB decides to change its IP(s), or to drop the connection for whatever reason and create a new one, causing all the piled-up messages to be lost. This is clearly visible in the RabbitMQ logs:
=WARNING REPORT==== 6-Jan-2018::10:35:50 ===
closing AMQP connection <0.30342.375> (10.1.1.250:29564 -> 10.1.1.223:5672):
client unexpectedly closed TCP connection
=INFO REPORT==== 6-Jan-2018::10:35:51 ===
accepting AMQP connection <0.29123.375> (10.1.1.22:1886 -> 10.1.1.223:5672)
The solution is to use an AWS Network Load Balancer. The Network LB creates a connection between the publisher app and the RabbitMQ node, so if the connection is blocked or dropped, the publisher is aware of it and can act accordingly. I have run the same test with 3M messages and not a single message was lost.
In the AWS docs, there's this line which explains the behaviour:
Preserve source IP address: the Network Load Balancer preserves the client-side source IP, allowing the back end to see the IP address of the client. This can then be used by applications for further processing.
From: https://aws.amazon.com/elasticloadbalancing/details/
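Independently of the load balancer choice, enabling publisher confirms makes "sent" mean "acknowledged by the broker" rather than "put on the wire". A hedged sketch with the pika client; the host, queue name, and payload are placeholders:

import pika

conn = pika.BlockingConnection(
    pika.ConnectionParameters(host="rabbit.example.com"))  # placeholder host
channel = conn.channel()
channel.queue_declare(queue="events", durable=True)
channel.confirm_delivery()  # the broker must now ack every publish

try:
    channel.basic_publish(
        exchange="",
        routing_key="events",
        body=b"payload",
        properties=pika.BasicProperties(delivery_mode=2),  # persistent
        mandatory=True,
    )
except (pika.exceptions.UnroutableError, pika.exceptions.NackError):
    # The broker did not accept the message; retry or log it here
    # instead of silently losing it, as happens without confirms.
    pass
conn.close()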
ELBs will change their addresses when they scale in reaction to traffic. New nodes come up and appear in DNS; old nodes may eventually go away, or they may stay online.
It increases capacity by utilizing either larger resources (resources with higher performance characteristics) or more individual resources. The Elastic Load Balancing service will update the Domain Name System (DNS) record of the load balancer when it scales so that the new resources have their respective IP addresses registered in DNS. The DNS record that is created includes a Time-to-Live (TTL) setting of 60 seconds, with the expectation that clients will re-lookup the DNS at least every 60 seconds. (emphasis added)
— from “Best Practices in Evaluating Elastic Load Balancing”
You may find more useful information in that "best practices" guide, including the concept of pre-warming a balancer with the help of AWS Support, and how to ramp up your test traffic so that the balancer's scaling can keep up.
The behavior of a classic ELB is automatic, and not configurable by the user.
But it also sounds as if you have configuration issues with your queue, because it seems like it should be more resilient to dropped connections.
Note also that an AWS Network Load Balancer does not change its IP addresses and does not need to scale by replacing resources the way ELB does, because unlike ELB, it doesn't appear to run on hidden instances -- it's part of the network infrastructure, or at least appears that way. This might be a viable alternative.
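If you want to try the Network Load Balancer alternative, here is a hedged sketch with boto3; the subnet, VPC, and instance IDs are placeholders, and the listener uses the AMQP port seen in the logs above.

import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Create an internal NLB in the placeholder subnet.
nlb = elbv2.create_load_balancer(
    Name="rabbitmq-nlb",
    Type="network",
    Scheme="internal",
    Subnets=["subnet-0123456789abcdef0"],  # placeholder
)["LoadBalancers"][0]

# TCP target group pointing at the RabbitMQ nodes.
tg = elbv2.create_target_group(
    Name="rabbitmq-nodes",
    Protocol="TCP",
    Port=5672,
    VpcId="vpc-0123456789abcdef0",  # placeholder
    TargetType="instance",
)["TargetGroups"][0]

elbv2.register_targets(
    TargetGroupArn=tg["TargetGroupArn"],
    Targets=[{"Id": "i-0123456789abcdef0"}],  # placeholder node
)

# Forward TCP 5672 straight through to the nodes.
elbv2.create_listener(
    LoadBalancerArn=nlb["LoadBalancerArn"],
    Protocol="TCP",
    Port=5672,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)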

How to launch a single EC2 instance in response to an incoming network connection

I'm planning to host some private services on AWS. These services will only be used by me and possibly 1-2 other people, but never more.
I would like the single EC2 instance that I'm using to only run when I'm using it. I don't want to manually start and stop it on the AWS console.
Ideally, I would need two things to happen:
Automatically shut down the single EC2 instance, if there have been no requests for the past hour.
Automatically start the instance, when there is an incoming request, i.e. I visit the URL of a service I'm hosting.
I've been able to configure a load balancer to shut down the instance depending on incoming network traffic. I guess this is OK, if there is no better solution that can filter by IP region or by number of incoming connections.
However, I can't figure out how to automatically launch an instance, just by visiting the corresponding URL. Is this even possible?
I could probably write a simple bash script to launch an instance, but I would prefer it to be automated.
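For the auto-stop half, a hedged sketch of a CloudWatch alarm that stops the instance after an hour of near-zero inbound traffic; the region, instance ID, and byte threshold are placeholder assumptions. The auto-start half still needs something sitting in the request path (for example, a small proxy like the one sketched in the first question above), since a stopped instance cannot react to traffic by itself.

import boto3

REGION = "us-east-1"                  # assumption
INSTANCE_ID = "i-0123456789abcdef0"   # placeholder

cw = boto3.client("cloudwatch", region_name=REGION)
cw.put_metric_alarm(
    AlarmName="stop-when-idle",
    Namespace="AWS/EC2",
    MetricName="NetworkIn",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    Statistic="Sum",
    Period=3600,               # evaluate one hour of traffic
    EvaluationPeriods=1,
    Threshold=50000,           # bytes/hour; tune to "no real requests"
    ComparisonOperator="LessThanOrEqualToThreshold",
    # Built-in EC2 alarm action: stop this instance when the alarm fires.
    AlarmActions=[f"arn:aws:automate:{REGION}:ec2:stop"],
)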

Trying to understand how AWS scaling works

There is one thing about scaling that I do not yet understand. Assume a simple scenario: ELB -> EC2 front end -> EC2 back end.
When there is high traffic, new front-end instances are created, but how is the connection to the back end established?
How does the back-end application keep track of which EC2 instance it is receiving from, so that it can respond to the right end user?
Moreover, what happens if a connection was established from one of the automatically created instances, and then traffic is low again and the instance is removed? Is the connection to the end user lost?
FWIW, the connection between the servers is through WebSocket.
Assuming that, for example, your EC2 'front ends' are web servers and your back end is a database server, when new front-end instances are spun up they must either be created from a 'gold' AMI that you previously set up with all the required software and configuration information, OR install all of your customizations as part of machine startup (either approach is valid). With either approach, they will know how to find the back-end server, either by IP address or perhaps via a DNS record, from the configuration information on the newly started machine.
You don't need to worry about the back end keeping track of the clients: every client talking to the back end will have an IP address, and TCP/IP will take care of that handshaking for you.
As for shutting down instances, you can enable connection draining to make sure existing conversations/connections are not lost:
When Connection Draining is enabled and configured, the process of deregistering an instance from an Elastic Load Balancer gains an additional step. For the duration of the configured timeout, the load balancer will allow existing, in-flight requests made to an instance to complete, but it will not send any new requests to the instance. During this time, the API will report the status of the instance as InService, along with a message stating that "Instance deregistration currently in progress." Once the timeout is reached, any remaining connections will be forcibly closed.
https://aws.amazon.com/blogs/aws/elb-connection-draining-remove-instances-from-service-with-care/
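For reference, connection draining on a classic ELB can be switched on with a single boto3 call; the load balancer name and timeout below are placeholders.

import boto3

elb = boto3.client("elb", region_name="us-east-1")
elb.modify_load_balancer_attributes(
    LoadBalancerName="my-classic-elb",  # placeholder name
    LoadBalancerAttributes={
        # Give in-flight requests up to 300 seconds to finish
        # before the deregistering instance is cut off.
        "ConnectionDraining": {"Enabled": True, "Timeout": 300}
    },
)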

Using Redis behind an AWS load balancer

We're using Redis (pub/sub based) behind an AWS ELB to collect events from our web application.
We're looking for a solution that will give us scale-up and high availability across the different servers. We do not wish to have these two servers in a Redis cluster; our plan is to monitor them using CloudWatch and switch between them if necessary.
We tried a simple test of placing two Redis servers behind the ELB, telnetting to the ELB DNS name, and watching with 'redis-cli monitor', but we see nothing. (When we try the same thing without the ELB, it works fine.)
Any suggestions?
Thanks!
I came across this while looking into a similar question, but I disagree with the accepted answer. Even though this is pretty old, hopefully it will help someone in the future.
It's more appropriate for your question to use DNS failover together with a Redis replication auto-failover configuration. DNS failover provides groups of availability (if you need that level of scale), and the replication group provides cache uptime.
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover-configuring.html
Active-passive failover should provide the high availability you want:
Active-passive failover: Use this failover configuration when you want a primary group of resources to be available the majority of the time and you want a secondary group of resources to be on standby in case all of the primary resources become unavailable. When responding to queries, Amazon Route 53 includes only the healthy primary resources. If all of the primary resources are unhealthy, Amazon Route 53 begins to include only the healthy secondary resources in response to DNS queries.
After you set up the DNS, point it to the ElastiCache Redis failover group's URL and add multiple groups for higher availability during a failover operation.
However, you might need to set up your application to write to and read from different endpoints to maximize the architecture's scalability.
Sources:
http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/Replication.html
http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/AutoFailover.html
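As a concrete illustration of the active-passive setup described above, a hedged sketch that upserts a primary/secondary failover pair in Route 53; the hosted zone ID, record name, health check ID, and ElastiCache endpoints are all placeholders.

import boto3

r53 = boto3.client("route53")
r53.change_resource_record_sets(
    HostedZoneId="Z_PLACEHOLDER",
    ChangeBatch={"Changes": [
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "redis.example.com", "Type": "CNAME", "TTL": 60,
            "SetIdentifier": "primary", "Failover": "PRIMARY",
            # The primary record needs a health check so Route 53
            # knows when to fail over to the secondary.
            "HealthCheckId": "placeholder-health-check-id",
            "ResourceRecords": [
                {"Value": "primary.abcdef.use1.cache.amazonaws.com"}],
        }},
        {"Action": "UPSERT", "ResourceRecordSet": {
            "Name": "redis.example.com", "Type": "CNAME", "TTL": 60,
            "SetIdentifier": "secondary", "Failover": "SECONDARY",
            "ResourceRecords": [
                {"Value": "secondary.abcdef.use1.cache.amazonaws.com"}],
        }},
    ]},
)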
Placing a pair of independent Redis nodes behind an LB will likely not do what you want. The ELB will try to balance connections across the instances, sending half to one and half to the other. This means that commands issued over one connection may not be seen by another, and no data is shared. So client A could publish a message, and client B, subscribed to the other server, won't see the message.
For pub/sub behind an ELB you have a secondary problem: the ELB closes idle connections. So if you subscribe to a channel that isn't busy, the ELB will close your connection. The idle timeout defaults to 60 seconds (it can be raised on a classic ELB, but only so far), meaning that if you don't publish messages regularly, your clients will be disconnected.
How much of a problem that is depends on your client library; frankly, in my experience most don't handle it well, in that they are unaware of the need to re-subscribe upon re-establishing the connection, meaning you would have to code that yourself, as in the sketch below.
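A minimal sketch of that re-subscription loop with redis-py, assuming a placeholder host and channel: reconnect and subscribe again whenever the connection drops, since the library will not re-subscribe for you.

import time
import redis

def listen_forever(host, channel):
    while True:
        try:
            client = redis.Redis(host=host, port=6379)
            pubsub = client.pubsub()
            pubsub.subscribe(channel)  # must be redone after every reconnect
            for message in pubsub.listen():
                if message["type"] == "message":
                    print(message["data"])
        except redis.exceptions.ConnectionError:
            time.sleep(1)  # brief back-off, then reconnect and resubscribe

listen_forever("redis.example.com", "events")  # placeholder host/channel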
That said, a Sentinel + Redis solution would be quite ideal if your client has proper Sentinel support. In this scenario, your client asks the sentinels for the master to talk to, and on a connection failure it repeats the process. This would handle the setup you describe without the problems of being behind an ELB.
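A hedged sketch of that Sentinel pattern with redis-py; the sentinel hostnames and the service name "mymaster" are placeholders.

from redis.sentinel import Sentinel

sentinel = Sentinel(
    [("sentinel-1.example.com", 26379),
     ("sentinel-2.example.com", 26379)],  # placeholder sentinels
    socket_timeout=0.5,
)
# master_for returns a client that re-asks the sentinels for the
# current master after a failover, so publishes keep working.
master = sentinel.master_for("mymaster", socket_timeout=0.5)
master.publish("events", "hello")

# Reads can go to a replica discovered the same way.
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)
print(replica.get("some-key"))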
Assuming you are running in a VPC:
did you register the EC2 instances with the ELB?
did you add the correct security group setting to the ELB (allowing inbound port 23)?
did you add an ELB listener that maps port 23 on the ELB to port 23 on the instances?
did you set sensible ELB health checks (e.g. TCP on port 23) so that ELB thinks the EC2 instances are healthy?
If the ELB thinks the servers behind it are not healthy then ELB will not send them any traffic.

Prevent machine on Amazon from shutting down before all users finished tasks

I'm planning a server environment on AWS with Auto Scaling in a VPC.
My application has a process that is done in several steps on the server, and the user should stick to the same server via the ELB's sticky sessions.
The problem is that when the Auto Scaling group decides to shut down a server, some users may be in the middle of the process (the process takes multiple requests; for example:
1. create an album
2. upload photos to the album each at a time
3. convert photos to movie and delete photos
4. store movie on S3)
Is it possible to configure the ELB to stop passing NEW users to the server that is about to shut down, while still passing previous users (those that have the sticky session set)? And is it possible to tell the server to wait, let's say, 10 minutes after the shutdown rule applies before it actually shuts down?
Thank you very much
This feature wasn't available in Elastic Load Balancing at the time of your question; however, AWS has since addressed the main part of it by adding ELB Connection Draining to avoid breaking open network connections while taking an instance out of service, updating its software, or replacing it with a fresh instance that contains updated software.
Please note that you still need to specify a sufficiently large timeout based on the maximum time you expect users to need to finish their activity; see Connection Draining:
When you enable connection draining for your load balancer, you can set a maximum time for the load balancer to continue serving in-flight requests to the deregistering instance before the load balancer closes the connection. The load balancer forcibly closes connections to the deregistering instance when the maximum time limit is reached.
[...]
If your instances are part of an Auto Scaling group and if connection draining is enabled for your load balancer, Auto Scaling will wait for the in-flight requests to complete or for the maximum timeout to expire, whichever comes first, before terminating instances due to a scaling event or health check replacement. [...] [emphasis mine]
The emphasized part confirms that it is not possible to specify an additional timeout that applies only after the last connection has been drained.