balanced partition placement with auto scaling in EC2 - amazon-web-services

I have a regional application which is made up of several duplicate zonal applications for HA. Each zonal application is made up of many instances, and I want each individual zonal application to be as highly available as possible. Ideally, no more than a certain percentage of VMs should fail at once in a given AZ (otherwise we would treat it as a total zonal failure and fail away from that AZ).
I want to use partition placement groups with as close as possible to an equal number of instances in each partition to improve our chances of meeting the “no more than X% simultaneous failures” goal. I also want to use autoscaling groups to scale the zonal application in and out according to demand.
Is there a way to configure autoscaling groups to achieve my goal of roughly-equal partition sizes within a zone? I considered configuring one ASG per partition and relying on my load balancing to spread load equally across them, but if they get unbalanced for some reason I think they would stay unbalanced.
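To make the idea concrete, here is a rough sketch (Python/boto3) of the "one ASG per partition" setup I am considering; the placement group, AMI, instance type and subnet ID are placeholders, and the partition placement group is assumed to already exist with 3 partitions:

import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

PLACEMENT_GROUP = "zonal-app-pg"   # hypothetical partition placement group, created beforehand
SUBNET_ID = "subnet-aaaa1111"      # hypothetical single-AZ subnet for this zonal application

for partition in (1, 2, 3):
    name = f"zonal-app-partition-{partition}"
    # One launch template per partition, pinned to that partition number.
    ec2.create_launch_template(
        LaunchTemplateName=name,
        LaunchTemplateData={
            "ImageId": "ami-52009e3b",     # placeholder AMI
            "InstanceType": "c5.large",    # placeholder instance type
            "Placement": {"GroupName": PLACEMENT_GROUP, "PartitionNumber": partition},
        },
    )
    # One ASG per partition; keeping min/max/desired identical across the three
    # groups keeps the partitions roughly the same size, as long as they scale together.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName=name,
        LaunchTemplate={"LaunchTemplateName": name, "Version": "$Latest"},
        MinSize=2,
        MaxSize=10,
        DesiredCapacity=2,
        VPCZoneIdentifier=SUBNET_ID,
    )

The open question remains whether the per-partition groups stay balanced once scaling activity drifts.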

Related

How to autoscale a Cassandra cluster on AWS, or how to add nodes to the cluster dynamically depending upon the load

I have been trying to auto-scale a 3-node Cassandra cluster with Replication Factor 3 and Consistency Level 1 on Amazon EC2 instances.
What steps do I need to perform to add/remove nodes to the cluster dynamically based on the load on the application?
Unfortunately, scaling up and down in response to the current load is not straightforward, and if you have a cluster with a large amount of data, this won't be possible:
You can't add multiple nodes simultaneously to a cluster; all the operations need to be sequential.
Adding or removing a node requires streaming data in or out of the node; how long this takes depends on the size of your data as well as the EC2 instance type you are using (for the network bandwidth limit). There will also be differences depending on whether you are using instance storage or EBS (EBS will limit you in IOPS).
You mentioned that you are using AWS and a replication factor of 3; are you also using different availability zones (AZs)? If you are, the EC2Snitch will work to ensure that the information is balanced between them. In order to be resilient, when you are scaling up and down you will need to keep an even distribution between AZs.
The scale operations will cause a rearrangement in the distribution of tokens; once that is completed you will need to do a cleanup (nodetool cleanup) to remove data that is no longer in use by the node. This operation will also take time, which is important to keep in mind if you are scaling up because you are running out of space.
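For what it's worth, here is a minimal orchestration sketch of the "one node at a time, then cleanup" sequence, assuming SSH access to the nodes, placeholder hostnames, and a new node whose cassandra.yaml is already configured to join the cluster on boot:

import subprocess
import time

EXISTING_NODES = ["cassandra-1.internal", "cassandra-2.internal", "cassandra-3.internal"]  # hypothetical hostnames

def wait_until_joined(seed_host, new_node_ip, timeout_s=3600):
    # Poll `nodetool status` on an existing node until the new node shows as Up/Normal ("UN").
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        out = subprocess.run(["ssh", seed_host, "nodetool", "status"],
                             capture_output=True, text=True).stdout
        if any(line.startswith("UN") and new_node_ip in line for line in out.splitlines()):
            return True
        time.sleep(30)
    return False

def add_node(new_node_ip):
    # Nodes must be added one at a time: wait for the join/streaming to finish...
    if not wait_until_joined(EXISTING_NODES[0], new_node_ip):
        raise RuntimeError("new node did not reach UN state in time")
    # ...then reclaim space on the existing nodes; this also takes time.
    for host in EXISTING_NODES:
        subprocess.run(["ssh", host, "nodetool", "cleanup"], check=True)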
For our use case, we are getting good results with a proactive approach: we have set up an aggressive alerting/monitoring strategy for early detection, so we can start the scale-up operations before there is any performance impact. If your application or use case has a predictable usage pattern, that can also help you take action in preparation for periods of high workload.

AWS latency between Zones within the same Region

I have an EC2 instance and an RDS instance in the same region, US East (N. Virginia), but the two resources are in different zones: RDS in us-east-1a and EC2 in us-east-1b.
The question is: if I put both resources in the same zone, would that speed up data transfer to/from the DB? I receive around 20k-30k entries daily from the app to this instance.
EDIT
I read here that:
Each Availability Zone is isolated, but the Availability Zones in a region are connected through low-latency links.
Now I am wondering whether the latency over these links is negligible, or whether I should consider shifting my resources into the same zone to speed up data transfer.
Conclusion
As discussed in answers and comments:
Since I have only one instance each of EC2 and RDS, failure of one service in a zone will affect the whole system, so there is no advantage to keeping them in separate zones.
Even though zones are connected with low-latency links, there is still some latency, which is negligible in my case.
There is also a minor data transfer charge of USD 0.01/GB between EC2 and RDS in different zones.
What are typical values for Interzone data transfers in the same region?
Although AWS will not guarantee, state, or otherwise commit to hard numbers, typical measurements are sub-10 ms, with numbers around 3 ms being what I have seen.
How does latency affect data transfer throughput?
The higher the latency, the lower the maximum achievable throughput. There are a number of factors to consider here; an excellent paper on the topic was written by Brad Hedlund.
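As a rough illustration of why latency caps throughput: for a single TCP stream limited only by its window size, maximum throughput is roughly window size divided by round-trip time.

# Back-of-the-envelope: max single-stream TCP throughput ~= window_size / RTT.
def max_throughput_mbps(window_bytes, rtt_ms):
    return (window_bytes * 8) / (rtt_ms / 1000.0) / 1_000_000

print(max_throughput_mbps(64 * 1024, 1))    # 64 KB window, 1 ms RTT  -> ~524 Mbps
print(max_throughput_mbps(64 * 1024, 10))   # 64 KB window, 10 ms RTT -> ~52 Mbps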
Should I worry about latency in AWS networks between zones in the same region?
Unless you are using the latest instances with very high-performance network adapters (10 Gb or higher), I would not worry about it. The benefits of fault tolerance should take precedence except in the most specialized cases.
For your use case, database transactions, the difference between 1 ms and 10 ms will have minimal impact, if any, on your transaction performance.
However, unless you are using multiple EC2 instances in multiple zones, you want your single EC2 instance in the same zone as RDS. If you are in two zones, the failure of either zone brings down your configuration.
There are times where latency and network bandwidth are very important. For this specialized case, AWS offers placement groups so that the EC2 instances are basically in the same rack close together to minimize latency to the absolute minimum.
Moving the resources to the same AZ would decrease latency by very little. See here for some unofficial benchmarks. For your use-case of 20k reads/writes per day, this will NOT make a huge difference.
However, moving resources to the same AZ would significantly increase reliability in your case. If you only have 1 DB and 1 Compute Instance that depend on each other, then there is no reason to put them in separate availability zones. With your current architecture, a failure in either us-east-1a or us-east-1b would bring down your project. Unless you plan on scaling out your project to have multiple DBs and Compute Instances, they should both reside in the same AZ.
According to some tests, I see around 600 microseconds (0.6 ms) of latency between availability zones within the same region. Fiber has about 5 microseconds of delay (latency) per km, and AZs are less than 100 km apart, so the result matches.

reduce price on AWS (EC2 and spot instances)

I have a queue of jobs and running AWS EC2 instances which process the jobs. We have an Auto Scaling group for each c4.* instance type, in spot and on-demand versions.
Each instance has a power, which is a number equal to its number of CPUs (for example, c4.large has power=2 since it has 2 CPUs).
The exact power we need is simply calculated from the number of jobs in the queue.
I would like to implement an algorithm which periodically checks the number of jobs in the queue and changes the desired capacity of the particular Auto Scaling groups via the AWS SDK, to save as much money as possible while maintaining enough total instance power to keep the jobs processed.
Specifically:
I prefer spot instances to on-demand since they are cheaper.
EC2 instances are charged per hour, so we would like to turn an instance off only in the very last minutes of its billing hour.
We would like to replace on-demand instances with spot instances when possible. So, at minute 55 increase the spot group; at minute 58 check that the new spot instance is running, and if so, decrease the on-demand group.
We would like to replace spot instances with on-demand instances if the bid would be too high: just turn off the spot one and turn on an on-demand one.
The problem seems really difficult to handle. Does anybody have experience with this, or a similar solution already implemented?
You could certainly write your own code to do this, effectively telling your Auto Scaling groups when to add/remove instances.
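For example, here is a minimal sketch of that periodic check in Python/boto3, assuming the job queue is SQS and using a hypothetical queue URL, group name and jobs-per-CPU ratio; the actual spot/on-demand handover logic (the 55/58-minute dance) would layer on top of this:

import boto3

sqs = boto3.client("sqs")
autoscaling = boto3.client("autoscaling")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/jobs"  # placeholder
JOBS_PER_CPU = 10        # assumption: one unit of "power" (one CPU) handles ~10 queued jobs
POWER_PER_INSTANCE = 2   # c4.large has power=2, per the question

def required_power():
    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )["Attributes"]
    backlog = int(attrs["ApproximateNumberOfMessages"])
    return -(-backlog // JOBS_PER_CPU)   # ceiling division

def rescale():
    instances_needed = -(-required_power() // POWER_PER_INSTANCE)
    # Prefer the (cheaper) spot group; grow the on-demand group only when spot
    # capacity cannot be obtained.
    autoscaling.set_desired_capacity(
        AutoScalingGroupName="c4-large-spot",   # placeholder group name
        DesiredCapacity=instances_needed,
        HonorCooldown=False,
    )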
Also, please note that a good strategy for lowering costs with Spot Instances is to appreciate that the price for a spot instance varies by:
Region
Availability Zone
Instance Type
So, if the spot price for a c4.xlarge goes up in one AZ, it might still be the same cost in another AZ. Also, the price of a c4.2xlarge might then be lower than a c4.xlarge, with twice the power.
Therefore, you should aim to diversify your spot instances across multiple AZs and multiple instance types. This means that spot price changes will impact only a small portion of your fleet rather than all of it at once.
You could use Spot Fleet to assist with this, or even third-party products such as SpotInst.
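If you go the Spot Fleet route, that diversification can be expressed in a single request. A hedged sketch follows; the fleet role ARN, AMI and subnet IDs are placeholders:

import boto3

ec2 = boto3.client("ec2")

response = ec2.request_spot_fleet(
    SpotFleetRequestConfig={
        "IamFleetRole": "arn:aws:iam::123456789012:role/spot-fleet-role",  # placeholder
        "AllocationStrategy": "diversified",   # spread capacity across the specifications below
        "TargetCapacity": 8,
        "LaunchSpecifications": [
            {"ImageId": "ami-52009e3b", "InstanceType": "c4.xlarge",  "SubnetId": "subnet-aaaa1111"},
            {"ImageId": "ami-52009e3b", "InstanceType": "c4.xlarge",  "SubnetId": "subnet-bbbb2222"},
            {"ImageId": "ami-52009e3b", "InstanceType": "c4.2xlarge", "SubnetId": "subnet-aaaa1111"},
            {"ImageId": "ami-52009e3b", "InstanceType": "c4.2xlarge", "SubnetId": "subnet-bbbb2222"},
        ],
    }
)
print(response["SpotFleetRequestId"])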
It's also worth looking at AWS Batch (not currently available in every region), which is designed to intelligently provide capacity for batch jobs.
Autoscaling groups allow you to use alarms and metrics that are defined outside of the autoscaling group.
If you are using SQS, you should be able to set up an alarm on your queue and use that to scale your scaling group out and in.
If you are using a custom queue system, you can push metrics to cloudwatch to create a similar alarm.
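A minimal sketch of that pattern with boto3 (the namespace, metric name, threshold and policy ARN are placeholders):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish the current backlog of the custom queue as a CloudWatch metric.
cloudwatch.put_metric_data(
    Namespace="MyApp/Jobs",                 # placeholder namespace
    MetricData=[{"MetricName": "QueueDepth", "Value": 42, "Unit": "Count"}],
)

# Alarm on that metric; its action points at a scaling policy of the Auto Scaling group.
cloudwatch.put_metric_alarm(
    AlarmName="jobs-backlog-high",
    Namespace="MyApp/Jobs",
    MetricName="QueueDepth",
    Statistic="Average",
    Period=60,
    EvaluationPeriods=2,
    Threshold=100,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:..."],  # placeholder policy ARN
)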
You can control how often scaling actions occur, but it may be difficult to get the run time to be exactly one hour.

AWS Scale Out, Scale Up

In AWS, we come across scaling up (moving to a larger instance type, i.e. from t1.small to t2.medium or t2.large) and scaling out (adding more instances, e.g. more EC2 instances). How do these relate to horizontal scaling and vertical scaling? Also, which is preferred for recovery, backups, and volume management when the goal is to minimize the cost of infrastructure maintenance?
Scaling up is when you change the instance types within your Auto Scaling Group to a higher type (for example: changing an instance from a m4.large to a m4.xlarge), scaling down is to do the reverse.
Scaling out is when you add more instances to your Auto Scaling Group and scaling in is when you reduce the number of instances in your Auto Scaling Group.
When you scale out, you distribute your load and risk, which in turn provides a more resilient solution. Here is an example:
Let's say you have an ASG with 4x m4.xlarge instances. If one fails, you have lost 25% of your processing capability. It doesn't matter that these are sizeable instances with a good amount of CPU and RAM; the fact is that by having bigger instance types but fewer of them, you increase the impact of a failure.
However, if you had, say, 8x m4.large instead, your total compute is the same as 4x m4.xlarge, but if one instance dies you only lose 12.5% of your resources.
Typically it's better to use more, smaller instances than fewer, larger ones, so you will see that it's more common to "scale out" to meet demand than it is to "scale up".
One last consideration: in order to scale up or down you have to restart the instance, so there is a service impact when you do so. There is no such impact when you scale in or out.
I hope this helps!
This might help to give a better picture of scaling in AWS:
Any application with a considerable amount of business logic typically follows a three-tier architecture (client, server and data storage) with multiple TSL. The right combination of AWS services can help achieve the scalability goal. Let's focus on each layer individually and come up with an infrastructure plan for scalability.
Full Article is Here

Automatic recovery from an availability zone outage?

Are there any tools or techniques available to automatically create new instances in a different availability zone in the event that an availability zone suffers an outage in Amazon Web Services/EC2?
I think I understand how to do automatic fail over in the event of an availability zone (AZ) outage, but what about automatic recovery (create new instances in a new AZ) from an outage? Is that possible?
Example scenario:
1. We have a three-instance cluster.
2. An ELB round-robins traffic to the cluster.
3. We can lose any one instance, but not two instances in the cluster, and still be fully functional.
4. Because of (3), each instance is in a different AZ. Call them AZs A, B and C.
5. The ELB health check is configured so that the ELB can ensure each instance is healthy.
6. Assume that one instance is lost due to an AZ outage in AZ A.
At this point the ELB will see that the lost instance is no longer responding to health checks and will stop routing traffic to that instance. All requests will go to the two remaining healthy instances. Failover is successful.
Recovery is where I am not clear. Is there a way to automatically (i.e. no human intervention) replace the lost instance in a new AZ (e.g. AZ D)? This will avoid the AZ that had the outage (A) and not use an AZ that already has an instance in it (AZs B and C).
AutoScaling Groups?
AutoScaling Groups seem like a promising place to start, but I don't know if they can deal with this use case properly.
Questions:
In an AutoScaling Group there doesn't seem to be a way to specify that the new instances that replace dead/unhealthy instances should be created in a new AZ (e.g. create it in AZ D, not in AZ A). Is this really true?
In an AutoScaling Group there doesn't seem to be a way to tell the ELB to remove the failed AZ and automatically add a new AZ. Is that right?
Are these true shortcomings in AutoScaling Groups, or am I missing something?
If this can't be done with AutoScaling Groups, is there some other tool that will do this for me automatically?
In 2011 FourSquare, Reddit and others were caught out by being reliant on a single availability zone (http://www.informationweek.com/cloud-computing/infrastructure/amazon-outage-multiple-zones-a-smart-str/240009598). It seems like tools would have come a long way since then, yet I have been surprised by the lack of automated recovery solutions. Is each company just rolling its own solution and/or doing the recovery manually? Or maybe they're just rolling the dice and hoping it doesn't happen again?
Update:
@Steffen Opel, thanks for the detailed explanation. Auto scaling groups are looking better, but I think there is still an issue with them when used with an ELB.
Suppose I create a single auto scaling group with min, max & desired set to 3, spread across 4 AZs. Auto scaling would create 1 instance in each of 3 different AZs, with the 4th AZ left empty. How do I configure the ELB? If it forwards to all 4 AZs, that won't work, because one AZ will always have zero instances and the ELB will still route traffic to it. This will result in HTTP 503s being returned when traffic goes to the empty AZ. I have experienced this myself in the past. Here is an example of what I saw before.
This seems to require manually updating the ELB's AZs to just those with instances running in them. This would need to happen every time auto scaling results in a different mix of AZs. Is that right, or am I missing something?
Is there a way to automatically (i.e. no human intervention) replace the lost instance in a new AZ (e.g. AZ D)?
Auto Scaling is indeed the appropriate service for your use case - to answer your respective questions:
In an AutoScaling Group there doesn't seem to be a way to specify that the new instances that replace dead/unhealthy instances should be created in a new AZ (e.g. create it in AZ D, not in AZ A). Is this really true? In an AutoScaling Group there doesn't seem to be a way to tell the ELB to remove the failed AZ and automatically add a new AZ. Is that right?
You don't have to specify/tell anything of that explicitly, it's implied in how Auto Scaling works (See Auto Scaling Concepts and Terminology) - You simply configure an Auto Scaling group with a) the number of instances you want to run (by defining the minimum, maximum, and desired number of running EC2 instances the group must have) and b) which AZs are appropriate targets for your instances (usually/ideally all AZs available in your account within a region).
Auto Scaling then takes care of a) starting the requested number of instances and b) balancing these instances across the configured AZs. An AZ outage is handled automatically; see Availability Zones and Regions:
Auto Scaling lets you take advantage of the safety and reliability of geographic redundancy by spanning Auto Scaling groups across multiple Availability Zones within a region. When one Availability Zone becomes unhealthy or unavailable, Auto Scaling launches new instances in an unaffected Availability Zone. When the unhealthy Availability Zone returns to a healthy state, Auto Scaling automatically redistributes the application instances evenly across all of the designated Availability Zones. [emphasis mine]
The subsequent section Instance Distribution and Balance Across Multiple Zones explains the algorithm further:
Auto Scaling attempts to distribute instances evenly between the Availability Zones that are enabled for your Auto Scaling group. Auto Scaling does this by attempting to launch new instances in the Availability Zone with the fewest instances. If the attempt fails, however, Auto Scaling will attempt to launch in other zones until it succeeds. [emphasis mine]
Please check the linked documentation for even more details and how edge cases are handled.
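To make that concrete, here is a minimal sketch of such a group spanning three AZs and registered with a (classic) ELB, using placeholder names for the group, launch configuration and load balancer:

import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-cluster",            # placeholder
    LaunchConfigurationName="web-launch-config",   # placeholder, created beforehand
    MinSize=3,
    MaxSize=3,
    DesiredCapacity=3,
    AvailabilityZones=["us-east-1a", "us-east-1b", "us-east-1c"],
    LoadBalancerNames=["web-elb"],                 # placeholder classic ELB
    HealthCheckType="ELB",         # replace instances the ELB marks unhealthy
    HealthCheckGracePeriod=300,
)

With this in place, an AZ outage is handled as described in the quoted documentation: replacement instances simply come up in the remaining healthy AZs, with no per-AZ configuration beyond listing them.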
Update
Regarding your follow-up question about the number of AZs being higher than the number of instances, I think you need to take a pragmatic approach:
You should simply select a number of AZs equal to or lower than the number of instances you want to run; in case of an AZ outage, Auto Scaling will happily balance your instances across the remaining healthy AZs, which means you'd be able to survive the outage of 2 out of 3 AZs in your example and still have all 3 instances running in the remaining AZ.
Please note that while it might be intriguing to use as many AZs as are available, new customers can only access three EC2 Availability Zones in US East (Northern Virginia) and two in US West (Northern California) anyway (see Global Infrastructure); i.e., only older accounts might actually have access to all 5 AZs in us-east-1, some just 4, and newer ones 3 at most.
I consider this to be a legacy issue, i.e. AWS is apparently rotating older AZs out of operation. For example, even if you have access to all 5 AZs in us-east-1, some instance types might not be available in all of them (e.g. the new EC2 Second Generation Standard Instances m3.xlarge and m3.2xlarge are only available in 3 out of 5 AZs in one of the accounts I'm using).
Put another way, 2-3 AZs are considered a fairly good compromise for fault tolerance within a region; if anything, cross-region fault tolerance would probably be the next thing I'd worry about.
There are many ways to solve this problem, even without knowing the particulars of what your "cluster" is and how a new node bootstraps (maybe it registers with a master, loads data, etc.). For instance, on Hadoop a new slave node needs to be registered with the namenode that will be serving it content. But ignoring that, let's just focus on the startup of a new node.
You can use the CLI tools for Windows or Linux instances. I fire them off from my dev box in both OSs and on the servers in both OSs. Here is the link for Linux, for example:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/setting_up_ec2_command_linux.html#set_aes_home_linux
They consist of scores of commands that you can execute at the DOS or Linux shell to do things like fire off an instance or terminate one. They require configuring environment variables such as your AWS credentials and the path to Java. Here is an example input and output for creating an instance in availability zone us-east-1d:
sample command:
ec2-request-spot-instances ami-52009e3b -p 0.02 -z us-east-1d --key DrewKP3 --group linux --instance-type m1.medium -n 1 --type one-time
sample output:
SPOTINSTANCEREQUEST sir-0fd0dc32 0.020000 one-time Linux/UNIX open 2013-05-01T09:22:18-0400 ami-52009e3b m1.medium DrewKP3 linux us-east-1d monitoring-disabled
Note that I am being a cheapskate and using a 2-cent spot instance, whereas you would be using a standard instance and not spot; but then again, I am creating hundreds of servers.
Alright, so you have a database. For argument's sake, let's say you have an AWS RDS MySQL micro instance running in Multi-AZ mode for an extra half a cent an hour; that is 72 cents a day. It contains a table, call it zonepref (AZ, preference), such as:
us-west-1b,1
us-west-1c,2
us-west-2b,3
us-east-1d,4
eu-west-1b,5
ap-southeast-1a,6
You get the idea: the preference order of zones.
There is another table in RDS, something like "active_nodes", with columns ipaddr, instance-id, zone, lastcontact, status (string, string, string, datetime, char). Let's say it contains the following active-node info:
'10.70.132.101','i-2c55bb41','us-east-1d','2013-05-01 11:18:09','A'
'10.70.132.102','i-2c66bb42','us-west-1b','2013-05-01 11:14:34','A'
'10.70.132.103','i-2c77bb43','us-west-2b','2013-05-01 11:17:17','A'
'A'=Alive and healthy, 'G'=going dead, 'D'=Dead
Now, on startup your node either establishes a cron job or runs a service; let's call it a server, written in any language of your liking, like Java or Ruby. This is baked into your AMI to run at startup, and on initialization it inserts its data into the active_nodes table so that its row is there. At a minimum it runs every, say, 5 minutes (depending on how mission-critical this whole thing is): the cron job would run at that interval, or the Java/Ruby server would create a thread that sleeps for that amount of time. When it wakes up, it grabs its ipaddr, instance id and AZ, and makes a call to RDS to update its row where status='A', using UTC time for lastcontact, which is consistent across time zones. If its status is not 'A', no update will occur.
In addition, it updates the status column of any other ipaddr row that has status='A', changing it to status='G' (going dead) for any other ipaddr where now()-lastcontact is greater than, say, 6 or 7 minutes. It can also use sockets (pick a port) to contact that going-dead server and ask: hey, are you there? If so, maybe that going-dead server merely can't access RDS (even though it is Multi-AZ) but can still handle other traffic. If there is no contact, change the other server's status to 'D'=Dead. Refine as needed.
The 'server' that runs on each node has a housekeeping thread that sleeps and a main thread that blocks/listens on a port. The whole thing can be written in Ruby in less than 50 to 70 lines of code.
The servers can use the CLI to terminate the instance IDs of other servers, but before doing so a server would issue a select against table zonepref, ordered by preference, for the first row that is not in active_nodes. It now has the next zone; it then runs ec2-run-instances with the correct AMI ID, the next zone, etc., passing along user data if necessary. You don't want both of the alive servers to create a new instance, so either wrap the create with a row lock in MySQL or push the request onto a queue or a stack so that only one of them performs it.
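A rough sketch of the heartbeat update and that zone-selection step, assuming the two tables described above, a placeholder MySQL RDS endpoint and the pymysql driver; the actual launch would then go through ec2-run-instances (or the SDK) as shown earlier:

import pymysql  # assumption: the RDS instance is MySQL

conn = pymysql.connect(host="mydb.xxxx.us-east-1.rds.amazonaws.com",  # placeholder endpoint
                       user="app", password="secret", database="cluster", autocommit=False)

def heartbeat(ipaddr):
    # Refresh this node's lastcontact; only rows still marked 'A' are touched.
    with conn.cursor() as cur:
        cur.execute("UPDATE active_nodes SET lastcontact = UTC_TIMESTAMP() "
                    "WHERE ipaddr = %s AND status = 'A'", (ipaddr,))
    conn.commit()

def next_zone():
    # Pick the most-preferred zone that has no live node in it yet; the row lock
    # (FOR UPDATE) keeps two surviving servers from launching replacements at once.
    with conn.cursor() as cur:
        cur.execute("SELECT AZ FROM zonepref "
                    "WHERE AZ NOT IN (SELECT zone FROM active_nodes WHERE status = 'A') "
                    "ORDER BY preference LIMIT 1 FOR UPDATE")
        row = cur.fetchone()
        return row[0] if row else None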
Anyway, this might seem like overkill, but I do a lot of cluster work where nodes have to talk to one another directly. Note that I am not suggesting that just because a node seems to have lost its heartbeat, its AZ has gone down :> Maybe that instance just lost its lunch.
Not enough rep to comment.
I wanted to add that an ELB will not route traffic to an empty AZ. This is because ELBs route traffic to instances, not AZs.
Attaching AZs to an ELB merely creates an Elastic Network Interface in a subnet in that AZ so that traffic could be routed if an instance in that AZ is added. It's adding instances (for which the AZ associated with the instance must also be associated with the ELB) that creates the routing.