Is it good practice to set up one Auto Scaling group with multiple target groups, all registered with the same load balancer?
The scenario: Application Load Balancer LB1 listens on ports 80 and 443 and has two target groups:
"open", on HTTP/80
"secure", on HTTPS/443
If the Auto Scaling group has a target tracking policy on average CPU utilisation, and the "open" target group has higher CPU utilisation than "secure", would there be no auto scaling?
If the alarm is breached, how does the Auto Scaling group determine which target group should receive the new instance?
Do I have to create a separate Auto Scaling group for each target group? I could not find any Amazon docs for this scenario of multiple target groups under one Auto Scaling group.
Please let me know
According to the AWS Documentation
If you attach multiple load balancer target groups or Classic Load Balancers to the group, all of them must report that the instance is healthy in order for it to consider the instance healthy. If any one of them reports an instance as unhealthy, the Auto Scaling group replaces the instance, even if other ones report it as healthy.
My test confirms that behaviour, with one exception. People usually configure an Auto Scaling group with the default settings, which means the health check type is EC2 by default. For multiple target groups attached to one Auto Scaling group to work properly, the health check type should be set to ELB. If you change it after the ASG has started up, existing instances do not obey the new setting and remain in the group.
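A minimal sketch of that change with the AWS CLI, assuming a placeholder group name of my-asg:

aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-asg --health-check-type ELB --health-check-grace-period 300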
It may lead to an incorrect conclusion like this one:
I have tried it. The instances are retained even if marked unhealthy by one target group. #prassank
So the answer is:
It is not a good idea to attach multiple target groups to a single Auto Scaling group, unless you want stricter health checking across multiple target groups.
Related
There is an ability to forward requests to multiple weighted target groups. Is there a way to set up priorities for target groups, rather than weights?
What I'm trying to achieve: I want to have a rule with two target groups, one with ECS services scaled down to 0 and another with Lambdas. If I need a performance boost (to avoid Lambda warm-ups), I want to be able to scale up the ECS cluster and have it take over request handling. That would work nicely if I could set up the two target groups as described: when the number of healthy targets in ECS is 0, it can't handle any requests and the infrastructure would route everything to the Lambda target group; when the number of healthy ECS targets is 1 or more, based on the priority, all requests would go to it. Is that possible? If not, what is the alternative approach to achieve what I have described?
You are basically asking for a failover mechanism: route requests to the primary Target Group, and if it goes unhealthy, route to the next Target Group based on a priority score. However, this article specifically mentions that failover between target groups is not supported by default.
The Application Load Balancer distributes traffic to the target groups based on weights. However, if all targets in a particular target group fail health checks, then the Application Load Balancer doesn't automatically route the requests to another target group or failover to another target group. In this case, if a target group has only unhealthy registered targets, then the load balancer nodes route requests across its unhealthy targets. A weighted target group should not be used as a failover mechanism when all the targets are unhealthy in a specific target group.
However, if your problem is just the Lambda cold start, you can always schedule a ping to your Lambda every 5 minutes. I would think this is much easier than having to duplicate the same endpoint in ECS, especially from an ops and maintenance perspective.
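A rough sketch of such a keep-warm schedule with CloudWatch Events/EventBridge, assuming a hypothetical function named my-function (region, account ID and rule ARN are placeholders):

# Rule that fires every 5 minutes
aws events put-rule --name lambda-keep-warm --schedule-expression "rate(5 minutes)"
# Point the rule at the Lambda function
aws events put-targets --rule lambda-keep-warm --targets "Id"="1","Arn"="arn:aws:lambda:<region>:<account-id>:function:my-function"
# Allow the rule to invoke the function
aws lambda add-permission --function-name my-function --statement-id keep-warm --action lambda:InvokeFunction --principal events.amazonaws.com --source-arn <rule-arn>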
Finally, it may be obvious, but if you really want this behaviour, you can always implement a script to do it yourself.
EDIT
There is another way using Route 53, since it has a failover mechanism. Create two ALBs, one backed by ECS and the other by Lambda, and use failover routing.
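A sketch of the failover records (domain, hosted zone IDs and ALB DNS names are placeholders; the primary relies on EvaluateTargetHealth):

aws route53 change-resource-record-sets --hosted-zone-id <zone-id> --change-batch file://failover.json

where failover.json contains something like:

{
  "Changes": [
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "api.example.com", "Type": "A", "SetIdentifier": "ecs-primary", "Failover": "PRIMARY",
      "AliasTarget": {"HostedZoneId": "<ecs-alb-hosted-zone-id>", "DNSName": "<ecs-alb-dns-name>", "EvaluateTargetHealth": true}}},
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "api.example.com", "Type": "A", "SetIdentifier": "lambda-secondary", "Failover": "SECONDARY",
      "AliasTarget": {"HostedZoneId": "<lambda-alb-hosted-zone-id>", "DNSName": "<lambda-alb-dns-name>", "EvaluateTargetHealth": true}}}
  ]
}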
I'm a little confused by these terms and their usage. Can you please help me understand how they are used with Load Balancers?
I referred to the AWS docs for this in vain :(
Target groups are just groups of EC2 instances. Target groups are closely associated with the ELB, not the ASG.
ELB -> TG -> Group of instances
We can just use an ELB and target groups to route requests to EC2 instances. With this setup there is no autoscaling, which means instances cannot be added or removed when your load increases or decreases.
ELB -> TG -> ASG -> Group of instances
If you want autoscaling, you can attach a TG to an ASG, which in turn gets associated with the ELB. With this setup you get request routing and autoscaling together; real-world use cases follow this pattern. If you detach the target group from the Auto Scaling group, the instances are automatically deregistered from the target group.
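For illustration (the group name and ARN are placeholders), attaching a target group to an existing ASG with the CLI looks like this:

aws autoscaling attach-load-balancer-target-groups --auto-scaling-group-name my-asg --target-group-arns <target-group-arn>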
Hope this helps.
What is a target group?
A target group contains EC2 instances to which a load balancer distributes workload.
A load balancer paired with a target group does NOT yet have auto scaling capability.
What is an Auto Scaling Group (ASG)?
This is where auto scaling comes in. An auto scaling group (ASG) can be attached to a load balancer.
We can attach auto scaling rules to an ASG. Then, when thresholds are met (e.g. CPU utilization), the number of instances will be adjusted programmatically.
How to attach an ASG to a load balancer?
For a Classic Load Balancer, link the ASG with the load balancer directly.
For an Application Load Balancer, link the ASG with the target group (which itself is attached to the load balancer).
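A rough CLI sketch of both cases, with placeholder names:

# Classic Load Balancer: attach the load balancer itself to the ASG
aws autoscaling attach-load-balancers --auto-scaling-group-name my-asg --load-balancer-names my-classic-elb

# Application Load Balancer: attach the target group that the ALB forwards to
aws autoscaling attach-load-balancer-target-groups --auto-scaling-group-name my-asg --target-group-arns <target-group-arn>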
An Auto Scaling Group is just a group of identical instances that AWS can scale out (add a new one) or in (remove one) automatically, based on configuration you've specified. You use this to ensure that at any point in time there is a specific number of instances running your application, and when a threshold is reached (like CPU utilization), the group scales out or in.
A Target Group is a way of routing network traffic via specified protocols and ports to specified instances. It's basically load balancing at the port level. This is mostly used to allow access to many applications running on different ports of the same instance.
Then there are the Classic Load Balancers, where network traffic is routed directly to instances.
The doc you referred to is about attaching load balancers (either Classic Load Balancers or target groups) to an Auto Scaling group. This is done so that instances can be auto-managed (by the Auto Scaling group) while network traffic is still routed to them by the load balancer.
Target groups
They listen for HTTP/S requests from a Load Balancer.
They are the Load Balancer's targets, available to handle HTTP/S requests from any kind of client (browser, mobile, Lambda, etc.). A target group has a specific purpose, like mobile API processing or web app processing. Further, these target groups can contain instances with any kind of characteristics.
AWS Docs
Each target group is used to route requests to one or more registered targets. When you create each listener rule, you specify a target group and conditions. When a rule condition is met, traffic is forwarded to the corresponding target group. You can create different target groups for different types of requests. For example, create one target group for general requests and other target groups for requests to the microservices for your application. Reference
So, a Target Group provides a set of instances to process specific HTTP/S requests.
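As a hypothetical sketch of such a listener rule (the listener ARN, path and target group ARN are placeholders), a path-based rule forwarding to a dedicated target group could look like:

aws elbv2 create-rule --listener-arn <listener-arn> --priority 10 --conditions Field=path-pattern,Values='/api/*' --actions Type=forward,TargetGroupArn=<api-target-group-arn>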
Auto Scaling groups
They are a set of instances started up to handle a specific workload, e.g. HTTP requests, SQS messages, jobs to process any kind of task, etc.
On this side, these groups are a set of instances that were started up because a metric exceeded a specific threshold and triggered an alarm. The main difference is that an Auto Scaling group's instances are temporary and available to process anything, from HTTP/S requests to SQS messages; they can be terminated at any time according to the configured metric. Likewise, the instances in an Auto Scaling group share the same characteristics because they follow something called a Launch Configuration.
AWS Docs
An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of instance scaling and management. For example, if a single application operates across multiple instances, you might want to increase the number of instances in that group to improve the performance of the application or decrease the number of instances to reduce costs when demand is low. Reference
So, an Auto Scaling group will not only be able to process HTTP/S requests but can also handle backend work, like jobs to send emails or to process tasks.
As I understand it, Target Groups are the connection between the ELB and EC2 instances, a kind of service-discovery rule. This layer also allows Target Groups to be used for ECS services, for instance, where it's possible to have more than one container per instance.
Auto Scaling Groups are an abstraction for aggregating EC2 metrics and taking actions based on that data.
Also, bear in mind that the ability to attach Auto Scaling Groups directly to an ELB comes from the previous generation of ELBs. You may compare the first and second generations in the CloudFormation docs.
I have created a few Auto Scaling groups, but for some of them I don't see any instances. How can I check whether it is an insufficient-capacity issue?
First, check the minimum number of instances you set for the Auto Scaling group. Then you can also look at the load balancer's Target Groups for registered instances, to see whether there were any issues when running the instances.
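The scaling activity history usually shows the reason (for example, capacity errors). A sketch with a placeholder group name:

aws autoscaling describe-scaling-activities --auto-scaling-group-name my-asg --max-records 10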
I have an Auto Scaling group with EC2 instances that implement a custom health check.
From time to time, the health check fails and instances are terminated and replaced.
The health check itself is implemented as a shell script that runs on the instances. If it detects problems, it will inform the auto scaling group via the AWS API:
aws autoscaling set-instance-health --instance-id $instance --health-status Unhealthy
The only problem is that I have no information about which check failed, besides the notification:
Cause: At 2017-06-13T09:11:47Z an instance was taken out of service in response to a user health-check
What is the recommended way to debug this type of problem? Is there a way to make AWS only stop instances rather than terminate them, so their disk state can be inspected?
(First I thought about "enable termination protection", but from my understanding this will not make a difference here. The Auto Scaling group will still terminate instances when the shutdown is requested by a failing custom health check.)
Using the set-instance-health command tells Auto Scaling that the instance is unhealthy and needs to be replaced. Auto Scaling will then terminate the unhealthy instance and launch a new instance to replace it.
If you wish to perform forensic analysis on an unhealthy instance, remove it from the Auto Scaling group with the aws autoscaling detach-instances command:
Removes one or more instances from the specified Auto Scaling group. After the instances are detached, you can manage them independent of the Auto Scaling group.
If you do not specify the option to decrement the desired capacity, Auto Scaling launches instances to replace the ones that are detached.
If there is a Classic Load Balancer attached to the Auto Scaling group, the instances are deregistered from the load balancer. If there are target groups attached to the Auto Scaling group, the instances are deregistered from the target groups.
So, instead of calling set-instance-health, call detach-instances (and optionally replace it). You can then debug the instance. If you wish to send it back into service, use aws autoscaling attach-instances.
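A sketch of that sequence, with a placeholder instance ID and group name:

# Detach the suspect instance; without the decrement option the ASG launches a replacement
aws autoscaling detach-instances --instance-ids i-0123456789abcdef0 --auto-scaling-group-name my-asg --no-should-decrement-desired-capacity

# ... inspect the instance ...

# Re-attach it later if it turns out to be healthy (capacity permitting)
aws autoscaling attach-instances --instance-ids i-0123456789abcdef0 --auto-scaling-group-name my-asg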
I am looking for how to specify the zone I want to deploy to in a single-instance deployment, with autoscaling, while also having automatic failover to another zone. Do any options exist to achieve this?
More context
Due to how reserved instances are linked to a single availability zone (AZ), we find it to be a good strategy (from an "ease of management"/simplicity perspective), when buying reserved instances for our dev environment, to buy them all in a single zone and then launch all dev instances in that single zone. (In production, we buy across zones and run with autoscale groups that specify to deploy across all zones).
I am looking for how to:
Specify the AZ that I want an instance to be deployed to, so that I can leverage the reserved instances that are tied to a single (and consistent) AZ.
while also having
The ability to failover to an alternate zone if the primary zone fails (yes, you will pay more money until you move the reserved instances, but presumably the failover is temporary e.g. 8 hours, and you can fail back once the zone is back online).
The issue is that I can see how you can achieve 1 or 2, but not 1 and 2 at the same time.
To achieve 1, I would specify a single subnet (and therefore AZ) to deploy to, as part of the autoscale group config.
To achieve 2, I would specify more than one subnet in different AZs, while keeping the min/max/capacity setting at 1. If the AZ that the instance non-deterministically got deployed to fails, the autoscale group will spin up an instance in the other AZ.
One cannot do 1 and 2 together, i.e. have a preference for which zone an autoscale group with min/max/capacity of 1 gets deployed to while also having automatic failover if the zone the server is in fails; they are competing solutions.
This solution uses only built-in AWS mechanisms to achieve the desired effect:
Launch the instance into the preferred zone by specifying that zone's subnet in the 1st autoscale group's config; this group's min/max/capacity is set to 1/1/1.
Create a second autoscale group with the same launch config as the 1st, but this other autoscale group is set to a min/max/desired of 0/1/0; this group should be configured with the subnets in every available zone in the region except the one specified in the 1st autoscale group.
Associate the 2nd autoscale group with the same ELB that is associated with the 1st autoscale group.
Set up a CloudWatch alarm that triggers on the unhealthy host metric for #1's autoscale group; have the alarm change the #2 autoscale group's min/max/desired to 1/1/1 (as well as send out a notification so that you know this happened).
If you don't expect to get unhealthy host alarms except in the cases where there is an actual host failure or if the AZ goes down -- which is true in our case -- this is a workable solution.
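Alarms act through scaling policies rather than setting group sizes directly, so one hedged sketch of step 4 (group, load balancer and policy names are placeholders, assuming a Classic Load Balancer) is:

# Simple scaling policy on the standby group (#2) that brings up one instance
aws autoscaling put-scaling-policy --auto-scaling-group-name standby-asg --policy-name failover-scale-out --adjustment-type ChangeInCapacity --scaling-adjustment 1

# Alarm on the primary load balancer's UnHealthyHostCount; its action is the policy ARN returned above
aws cloudwatch put-metric-alarm --alarm-name primary-az-failover --namespace AWS/ELB --metric-name UnHealthyHostCount --dimensions Name=LoadBalancerName,Value=my-elb --statistic Maximum --period 60 --evaluation-periods 3 --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold --alarm-actions <policy-arn>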
As you have already figured out, (as of mid-2015) that's not possible. Auto Scaling doesn't have the concept of failover, strictly speaking. It expects you to provide more than one AZ, with enough machines in each, if you want high availability. If you don't, then you aren't going to get it.
The only possible workaround I can imagine is setting up a watchdog yourself that changes the Auto Scaling group's subnet once an AZ becomes unavailable. Not that hard to do, but not that reliable either.