Can someone help me with GCP autoscaling. I want to achive Auto Scaling Without using Load Balancer in GCP because the service which is running on the VM does not need any endpoint its more likely a kafka consumer where its only fetch the data from cluster and send it to DB so there is no load balancing.
so far i have successfully created instaces template and have define the minimum and maximum state there but thats only maintaining the running state not perfroming autoscaling.
You can use instance groups which is a collection of virtual machine (VM) instances that you can manage as a single entity.
Autoscaling groups of instances have managed instance groups which will autoscale as per requirement by using the following policies.
CPU Usage: The size of the group is controlled to keep the average processor load of the virtual machines in the group at the required level
HTTP Load Balancing Usage: The size of the group is controlled to keep the load of the HTTP traffic balancer at the required level
Stackdriver Monitoring Metric: The size of the group is controlled to keep the selected metric from the Stackdriver Monitoring instrument at the required level .
Multiple Metrics: The decision to change the size of the group is made on the basis of multiple metrics.
Select your required policy and create a managed group of instances which will autoscale your VM.Here in this document you can find the steps to create scaling based on CPU usage, similarly you can create a required group.
For understanding attaching a Blog refer to it for more information.
I am making an AWS ECS cluster using EC2 and trying to use capacity providers. I don't really understand why do I need to enable instance scale-in protection inside my AWS Auto Scaling group.
Isn't the point of Auto Scaling the termination of needless EC2 instances?
why do I need to enable instance scale-in protection
This is only needed when you chose to use managed scaling:
When managed scaling is enabled, Amazon ECS manages the scale-in and scale-out actions of the Auto Scaling group used when creating the capacity provider. On your behalf, Amazon ECS creates an AWS Auto Scaling scaling plan with a target tracking scaling policy based on the target capacity value you specify.
The managed scaling ensures that ECS controlled when instances are removed. By doing this it protects any instances that have some tasks running on it from being terminated:
When managed termination protection is enabled, Amazon ECS prevents Amazon EC2 instances that contain tasks and that are in an Auto Scaling group from being terminated during a scale-in action.
The entire idea is that you enable instance scale-in protection on your ASG so that ECS has control over which instances to terminate based on tasks they run. Without this, your ASG could terminate instances based on other criteria, not nervelessly related to "needless EC2 instances". For example, ASG can choose to terminate instances based due to AZRebalance process. This could lead to ASG terminating instances with running tasks, which may not be what you want.
Is there a way like assigning a specific tag for the EC2 instances to automatically attached to the load balancer on AWS?
I believe I had done that in the past but unable to find that option now.
Since you say you've done it in the past, I believe you're thinking of a feature offered by EC2 auto-scaling groups (ASGs). ASG is a capability of the EC2 infrastructure that scales machine counts up and down based on workload or maintains a set number of healthy instances always running (destroying and replacing failed instances). When an ASG is attached to a load balancer, the instances controlled by the ASG are automatically registered and deregistered from the balancer.
Amazon EC2 Auto Scaling integrates with Elastic Load Balancing to enable you to attach one or more load balancers to an existing Auto Scaling group. After you attach the load balancer, it automatically registers the instances in the group and distributes incoming traffic across the instances.
https://docs.aws.amazon.com/autoscaling/ec2/userguide/attach-load-balancer-asg.html
I'm a little too confused on the terms and its usage. Can you please help me understand how are these used with Load Balancers?
I referred the aws-doc in vain for this :(
Target groups are just a group of Ec2 instances. Target groups are closely associated with ELB and not ASG.
ELB -> TG - > Group of Instances
We can just use ELB and Target groups to route requests to EC2 instances. With this setup, there is no autoscaling which means instances cannot be added or removed when your load increases/decreases.
ELB -> TG - > ASG -> Group of Instances
If you want autoscaling, you can attach a TG to ASG which in turn gets associated to ELB. Now with this setup, you get request routing and autoscaling together. Real world usecases follow this pattern. If you detach the target group from the Auto Scaling group, the instances are automatically deregistered from the target group
Hope this helps.
What is a target group?
A target group contains EC2 instances to which a load balancer distributes workload.
A load balancer paired with a target group does NOT yet have auto scaling capability.
What is an Auto Scaling Group (ASG)?
This is where auto scaling comes in. An auto scaling group (ASG) can be attached to a load balancer.
We can attach auto scaling rules to an ASG. Then, when thresholds are met (e.g. CPU utilization), the number of instances will be adjusted programatically.
How to attach an ASG to a load balancer?
For Classic load balancer, link ASG with the load balancer directly
For Application load balancer, link ASG with the target group (which itself is attached to the load balancer)
Auto Scaling Group is just a group of identical instances that AWS can scale out (add a new one) or in (remove) automatically based on some configurations you've specified. You use this to ensure at any point in time, there is the specific number of instances running your application, and when a threshold is reached (like CPU utilization), it scales up or down.
Target Group is a way of getting network traffic routed via specified protocols and ports to specified instances. It's basically load balancing on a port level. This is used mostly to allow accessing many applications running on different ports but the same instance.
Then there are the classical Load Balancers where network traffic is routed between instances.
The doc you referred to is about attaching load balancers (either classical or target group) to an auto-scaling group. This is done so scaling instances can be auto-managed (by the auto scaling group) while still having network traffic routed to these instances based on the load balancer.
Target groups
They listen to HTTP/S request from a Load Balancer
Are the Load Balancer's targets which will be available to handle an HTTP/S request from any kind of clients (Browser, Mobile, Lambda, Etc). A target has a specific purpose like Mobile API processing, Web App processing, Etc. Further, these target groups could contain instances with any kind of characteristics.
AWS Docs
Each target group is used to route requests to one or more registered targets. When you create each listener rule, you specify a target group and conditions. When a rule condition is met, traffic is forwarded to the corresponding target group. You can create different target groups for different types of requests. For example, create one target group for general requests and other target groups for requests to the microservices for your application. Reference
So, a Target Group provides a set of instances to process specific HTTP/S requests.
AutoScaling groups
They are a set of instances who were started up to handle a specific workload, i.e: HTTP requests, SQS' message, Jobs to process any kind of tasks, Etc.
On this side, these groups are a set of instances who were started up by a metric which exceeded a specific threshold and triggered an alarm. The main difference is that Autoscaling groups' instances are temporary and they are available to process anything, from HTTP/S requests until SQS' messages. Further, the instances here are temporary and can be terminated at any time according to the configured metric. Likewise , the Autoscaling groups share the same characteristics because the follow something called Launch Configuration.
AWS Docs
An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of instance scaling and management. For example, if a single application operates across multiple instances, you might want to increase the number of instances in that group to improve the performance of the application or decrease the number of instances to reduce costs when demand is low. Reference
So, an Autoscaling group not only will be able to process HTTP/S requests but also can process backend stuff, like Jobs to send emails, jobs to process tasks, Etc.
As I understand it, Target Groups is a connection between ELB and EC2 instances. Some kind of a service discovery rules. This layer allows to Target Groups for ECS Services for instance when it's possible to have more than one container per instance.
Auto-Scaling Groups is an abstraction for aggregation of EC2 metrics and taking some actions based on that data.
Also, bear in mind, that the possibility of attaching of Auto-Scaling Groups to ELB comes from the previous generation of ELBs. You may compare the first generation and the second one in the CloudFormation docs.
I have an Auto Scaling Group and I want to stop that instance from Auto Scaling Group rather than terminating, Is it possible to do so?
No. From the official definition:
Auto Scaling is a web service designed to launch or terminate Amazon EC2 instances automatically based on user-defined policies, schedules, and health checks.
When scaling-out, new instances are launched into the Auto Scaling group.
When scaling-in, instances are terminated.
Auto Scaling does not start/start instances.
Some benefits of this approach are:
Instances can be launched in different Availability Zones in case there is a failure in a particular AZ
Failed instances can be easily replaced
There is no limit to the number of instances that could be launched (compared to running out of 'stopped' instances)
Launch Configurations can be updated, so any newly-launched instances will use the new configuration (as opposed to recycling old instances)