Can someone help me with GCP autoscaling? I want to achieve autoscaling without using a load balancer in GCP, because the service running on the VM does not need any endpoint. It is essentially a Kafka consumer that only fetches data from the cluster and sends it to a DB, so there is nothing to load balance.
So far I have successfully created an instance template and defined the minimum and maximum number of instances there, but that only maintains the running state; it does not perform autoscaling.
You can use instance groups. An instance group is a collection of virtual machine (VM) instances that you can manage as a single entity.
Managed instance groups (MIGs) support autoscaling and will scale as required using one of the following policies:
CPU Usage: The size of the group is controlled to keep the average processor load of the virtual machines in the group at the required level.
HTTP Load Balancing Usage: The size of the group is controlled to keep the load of the HTTP traffic balancer at the required level.
Stackdriver Monitoring Metric: The size of the group is controlled to keep the selected metric from the Stackdriver Monitoring instrument at the required level.
Multiple Metrics: The decision to change the size of the group is made on the basis of multiple metrics.
Select the policy you need and create a managed instance group, which will autoscale your VMs. The documentation describes the steps to set up scaling based on CPU usage; you can create a group with a different policy in the same way.
For a better understanding, refer to the attached blog post for more information.
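For a Kafka consumer, a monitoring metric such as consumer lag may be a more useful signal than CPU. Here is a minimal sketch of the sizing arithmetic behind a metric-based policy, assuming you export total lag as a custom metric. The function name, the parameters and the numbers are hypothetical, and this is a simplified model rather than GCP's actual autoscaler algorithm.

```python
import math


def recommended_size(total_lag, target_lag_per_vm, min_size, max_size):
    """Return a group size that keeps the backlog per VM near the target.

    All names here are illustrative; the real autoscaler also applies
    stabilization and cool-down periods that this sketch ignores.
    """
    if target_lag_per_vm <= 0:
        raise ValueError("target_lag_per_vm must be positive")
    wanted = math.ceil(total_lag / target_lag_per_vm)
    # Clamp to the minimum and maximum configured for the group.
    return max(min_size, min(max_size, wanted))


# 45,000 unprocessed messages, aiming for about 10,000 per VM.
print(recommended_size(45_000, 10_000, min_size=1, max_size=10))
```

No load balancer is involved in this calculation: the group size follows the metric alone, which is the point for a consumer-only workload.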
I'm learning about load balancers and managed instance group (MIG) autoscaling. I do not understand how a MIG autoscales when using HTTP load balancing utilization.
So, in the MIG autoscaling settings, I set Target HTTP load balancing utilization to 10%:
And in the external HTTP load balancer settings, I have the following two options:
utilization:
rate:
I can understand CPU-based MIG autoscaling: if the average CPU usage is greater than the number I entered, the MIG will add more VMs to bring the average down. It's very simple and straightforward.
But I do not know when the MIG will autoscale when using HTTP load balancing utilization.
GCP managed instance groups offer three types of autoscaling policy:
You can choose to scale using the following policies:
Average CPU utilization.
HTTP load balancing serving capacity, which can be based on either utilization or requests per second.
Cloud Monitoring metrics (not supported by regional autoscalers)
The first one, as you said yourself, is pretty self-explanatory.
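For completeness, the CPU case can be sketched as a proportional calculation. This is a simplified model with hypothetical names, not the autoscaler's real implementation: it assumes the total load stays the same and is spread evenly over the VMs.

```python
import math


def cpu_target_size(current_size, avg_cpu_percent, target_cpu_percent,
                    min_size, max_size):
    """Size the group so that average CPU lands at or below the target."""
    # Total load expressed in "VM-percent": 4 VMs at 90% is 360.
    total_load = current_size * avg_cpu_percent
    wanted = math.ceil(total_load / target_cpu_percent)
    return max(min_size, min(max_size, wanted))


# 4 VMs averaging 90% CPU with a 60% target need 6 VMs.
print(cpu_target_size(4, 90, 60, min_size=1, max_size=10))
```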
And this is what the official documentation says about requests per second (RPS) based autoscaling:
With RATE, you must specify a target number of requests per second on
a per-instance basis or a per-group basis. (Only zonal instance groups
support specifying a maximum rate for the whole group.)
But there is a limitation to the RPS based autoscaling:
Autoscaling does not work with maximum requests per group because this
setting is independent of the number of instances in the instance
group. The load balancer continuously sends the maximum number of
requests per group to the instance group, regardless of how many
instances are in the group.
For example, if you set the backend to handle 100 maximum requests per
group per second, the load balancer sends 100 requests per second to
the group, whether the group has two instances or 100 instances.
Because this value cannot be adjusted, autoscaling does not work with
a load balancing configuration that uses the maximum number of
requests per second per group.
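To make the 10% target from the question concrete, here is a rough sketch for the RATE mode with a per-instance maximum. It assumes the target utilization is a fraction of the serving capacity configured on the backend, so a maximum of 100 RPS per instance with a 10% target means the autoscaler aims for about 10 RPS per instance. The function and its parameters are hypothetical and the model is simplified.

```python
import math


def lb_utilization_size(total_rps, max_rps_per_instance, target_percent,
                        min_size, max_size):
    """Size the group so each instance serves about target% of its max rate."""
    # Requests per second that each instance should receive at the target.
    rps_goal_per_instance = max_rps_per_instance * target_percent / 100
    wanted = math.ceil(total_rps / rps_goal_per_instance)
    return max(min_size, min(max_size, wanted))


# 250 RPS of traffic, 100 RPS max per instance, 10% target utilization.
print(lb_utilization_size(250, 100, 10, min_size=1, max_size=50))
```

With a per-group maximum the capacity figure does not change as instances are added, so there is nothing for a calculation like this to converge on. That matches the limitation quoted above.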
You may also find it useful to have a look at the types of GCP load balancing supported in various scenarios.
This document also describes when it's best not to use some types of load balancing.
I set up an Elastic Load Balancer on AWS to reach a target group made of 3 EC2 instances in 3 different zones.
I saw that I can view CloudWatch load balancer metrics, target group metrics, or EC2 metrics. I'd like to know whether there is some kind of plugin to display metrics for all the hosts in the target group, like Grafana/Prometheus.
In addition, I'd like to know whether there are best practices for gathering application logs from EC2 instances so I can consult them if an error occurs.
Thank you very much.
It depends on what kind of monitoring you want to use, but assuming you just want to gather logs, you can do the following:
Pre-bake the AMI, based on your OS, with the CloudWatch Logs agent.
Specify the log group name in the agent configuration and enable the agent on startup.
Launch the instance group from that AMI.
This way, logs from different instances are collected in one log group, with a separate stream for each instance.
You can also use third-party services, like the ELK stack, but the idea is the same: an AMI with a log agent.
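The agent configuration step could look roughly like this. The sketch assumes the JSON layout used by the unified CloudWatch agent (the older awslogs agent uses an INI-style file instead), and the file path and log group name are made up, so check it against the agent documentation before baking it into an AMI.

```python
import json


def build_agent_config(log_file, log_group):
    """Build a minimal logs section for the CloudWatch agent config file."""
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": log_file,
                            "log_group_name": log_group,
                            # The agent replaces this placeholder with the
                            # EC2 instance ID: one stream per instance.
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        }
    }


# Hypothetical application log path and log group name.
config = build_agent_config("/var/log/myapp/app.log", "myapp-logs")
print(json.dumps(config, indent=2))
```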
I'm a little confused about the terms and their usage. Can you please help me understand how these are used with load balancers?
I consulted the AWS docs in vain for this :(
Target groups are just a group of EC2 instances. Target groups are closely associated with the ELB, not with the ASG.
ELB -> TG -> Group of Instances
We can use just an ELB and target groups to route requests to EC2 instances. With this setup there is no autoscaling, which means instances cannot be added or removed when your load increases or decreases.
ELB -> TG -> ASG -> Group of Instances
If you want autoscaling, you can attach a TG to an ASG, which in turn gets associated with the ELB. With this setup you get request routing and autoscaling together. Real-world use cases follow this pattern. If you detach the target group from the Auto Scaling group, the instances are automatically deregistered from the target group.
Hope this helps.
What is a target group?
A target group contains EC2 instances to which a load balancer distributes workload.
A load balancer paired with a target group does NOT yet have auto scaling capability.
What is an Auto Scaling Group (ASG)?
This is where auto scaling comes in. An auto scaling group (ASG) can be attached to a load balancer.
We can attach auto scaling rules to an ASG. Then, when thresholds are met (e.g. CPU utilization), the number of instances is adjusted automatically.
How to attach an ASG to a load balancer?
For a Classic Load Balancer, link the ASG with the load balancer directly.
For an Application Load Balancer, link the ASG with the target group (which is itself attached to the load balancer).
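In code, the two cases map onto two different calls on the boto3 Auto Scaling client. A small sketch: the helper name, the ASG name and the ARN placeholder are hypothetical, and a recording stand-in is used instead of a real client so the example runs without AWS credentials.

```python
def attach_asg(autoscaling_client, asg_name,
               target_group_arns=None, classic_lb_names=None):
    """Attach an ASG to ALB target groups and/or Classic Load Balancers.

    `autoscaling_client` is expected to behave like
    boto3.client("autoscaling").
    """
    if target_group_arns:
        # Application Load Balancer: attach via the target group.
        autoscaling_client.attach_load_balancer_target_groups(
            AutoScalingGroupName=asg_name,
            TargetGroupARNs=list(target_group_arns),
        )
    if classic_lb_names:
        # Classic Load Balancer: attach to the load balancer directly.
        autoscaling_client.attach_load_balancers(
            AutoScalingGroupName=asg_name,
            LoadBalancerNames=list(classic_lb_names),
        )


class RecordingClient:
    """Stand-in for the real client; it only records the calls made."""

    def __init__(self):
        self.calls = []

    def attach_load_balancer_target_groups(self, **kwargs):
        self.calls.append(("attach_load_balancer_target_groups", kwargs))

    def attach_load_balancers(self, **kwargs):
        self.calls.append(("attach_load_balancers", kwargs))


client = RecordingClient()
attach_asg(client, "web-asg", target_group_arns=["example-target-group-arn"])
print(client.calls)
```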
An Auto Scaling group is just a group of identical instances that AWS can scale out (add instances) or in (remove instances) automatically, based on the configuration you've specified. You use it to ensure that a specific number of instances is running your application at any point in time, and that the group scales out or in when a threshold (like CPU utilization) is reached.
A target group is a way of getting network traffic routed via specified protocols and ports to specified instances. It's basically load balancing at the port level. This is mostly used to allow access to many applications running on different ports of the same instance.
Then there are the Classic Load Balancers, where network traffic is routed between instances.
The doc you referred to is about attaching load balancers (either a Classic Load Balancer or a target group) to an Auto Scaling group. This is done so that scaling can be managed automatically by the Auto Scaling group while network traffic is still routed to these instances by the load balancer.
Target groups
They receive HTTP/S requests from a load balancer.
They are the load balancer's targets, available to handle HTTP/S requests from any kind of client (browser, mobile, Lambda, etc.). A target group has a specific purpose, such as mobile API processing or web app processing. Further, these target groups can contain instances with any kind of characteristics.
AWS Docs
Each target group is used to route requests to one or more registered targets. When you create each listener rule, you specify a target group and conditions. When a rule condition is met, traffic is forwarded to the corresponding target group. You can create different target groups for different types of requests. For example, create one target group for general requests and other target groups for requests to the microservices for your application. Reference
So, a Target Group provides a set of instances to process specific HTTP/S requests.
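A toy model of the listener rule idea from the quoted docs: rules are checked in priority order, and the first match decides which target group gets the request. The rule list and target group names are hypothetical, and fnmatch is only a stand-in for the load balancer's own path pattern matching.

```python
from fnmatch import fnmatchcase

# (priority, path pattern, target group name): all values are made up.
RULES = [
    (10, "/api/mobile/*", "mobile-api-targets"),
    (20, "/app/*", "web-app-targets"),
]
DEFAULT_TARGET_GROUP = "general-targets"


def pick_target_group(path, rules=RULES, default=DEFAULT_TARGET_GROUP):
    """Return the target group of the first matching rule, else the default."""
    for _priority, pattern, target_group in sorted(rules):
        if fnmatchcase(path, pattern):
            return target_group
    return default


print(pick_target_group("/api/mobile/v1/users"))
print(pick_target_group("/health"))
```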
AutoScaling groups
They are a set of instances that were started to handle a specific workload, e.g. HTTP requests, SQS messages, or jobs that process any kind of task.
These groups are a set of instances that were started because a metric exceeded a specific threshold and triggered an alarm. The main difference is that an Auto Scaling group's instances are temporary: they can process anything, from HTTP/S requests to SQS messages, and can be terminated at any time according to the configured metric. Likewise, the instances in an Auto Scaling group share the same characteristics because they follow something called a Launch Configuration.
AWS Docs
An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of instance scaling and management. For example, if a single application operates across multiple instances, you might want to increase the number of instances in that group to improve the performance of the application or decrease the number of instances to reduce costs when demand is low. Reference
So, an Auto Scaling group can process not only HTTP/S requests but also backend work, such as jobs that send emails or process tasks.
As I understand it, a target group is the connection between an ELB and EC2 instances, a kind of service discovery rule. This layer makes it possible to use target groups for ECS services, for instance, where there can be more than one container per instance.
Auto Scaling groups are an abstraction for aggregating EC2 metrics and taking actions based on that data.
Also bear in mind that the ability to attach Auto Scaling groups to an ELB comes from the previous generation of ELBs. You can compare the first generation and the second one in the CloudFormation docs.
Does anyone know what the difference between Automatic Load-based Scaling vs having explicit auto scaling groups on OpsWorks is?
this: http://docs.aws.amazon.com/opsworks/latest/userguide/workinginstances-autoscaling-loadbased.html
vs https://aws.amazon.com/blogs/devops/auto-scaling-aws-opsworks-instances/
With load-based instances, how does one add one to a target group?
Can you have multiple auto scaling groups in one layer of OpsWorks?
I’m looking at going with an ALB to route our traffic, which cannot act as an independent layer in OpsWorks.
So I would need to pipe requests to 1 auto scaling group for one type of request and the rest to the other auto scaling group.
I just am not sure what load-based instances are and am perplexed by them not providing a default number of machines to start with.
Which one should I use for ALB routing traffic between the two groups?
OpsWorks is a configuration management tool that uses Chef to configure your infrastructure. OpsWorks takes a different approach to scaling out than an Auto Scaling group does.
Unlike with an Auto Scaling group, the instances are pre-defined on your OpsWorks stack (layer), and they are started when a certain metric threshold is crossed (CloudWatch data: CPU, memory, load, etc.).
OpsWorks will not spawn (create) any new instances; it can only start instances that you have previously created and set as load-based instances. This is also only available within OpsWorks and cannot be used for any service outside of it.
AWS EC2 Auto Scaling, by contrast, can launch a very large number of instances (which do not need to be created beforehand) into your AWS environment and, like OpsWorks load-based scaling, can be triggered by CloudWatch alarms (CPU, memory, load, etc.).
Auto Scaling is not available on OpsWorks by default, and there is no built-in way to associate an Auto Scaling group with your OpsWorks stack, but it's possible with a bit of work. Read about it here.
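The difference can be illustrated with a toy model. This is pure Python with hypothetical names and it does not call any AWS API: load-based scaling can only start instances from a pool you defined in advance, while an Auto Scaling group launches new ones up to its maximum size.

```python
def opsworks_load_based_start(stopped_pool, wanted):
    """Start up to `wanted` instances, but only from the pre-created pool."""
    started = stopped_pool[:wanted]
    remaining = stopped_pool[wanted:]
    return started, remaining


def asg_scale_out(current_count, wanted, max_size):
    """Launch brand-new instances, limited only by the group's max size."""
    return min(max_size, current_count + wanted)


# Only two load-based instances were defined, so asking for three
# can start at most two of them.
started, remaining = opsworks_load_based_start(["lb-1", "lb-2"], wanted=3)
print(started, remaining)

# The Auto Scaling group simply grows from 2 to 5 instances.
print(asg_scale_out(current_count=2, wanted=3, max_size=10))
```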
Let me divide the answer for you.
Does anyone know what the difference between Automatic Load-based
Scaling vs having explicit auto scaling groups on OpsWorks is?
Automatic Load-based Scaling:
The AWS OpsWorks service provides automatic load-based scaling, where you add instances to a layer in your stack and set the auto scaling configuration policies directly.
Load-based scaling scales the instances up or down based on the load you have configured them to handle. You need to set the thresholds using the parameters and define the scaling policies.
Explicit Auto Scaling groups on OpsWorks:
The AWS OpsWorks service also allows you to add existing instances to a layer in your stack. This means you can create an Auto Scaling launch configuration, define scale-up and scale-down events based on the load, then create an Auto Scaling group and launch instances in it. After that, you can go to OpsWorks and add these existing instances to your layer. When the load rises above or falls below the configured threshold, the Auto Scaling group handles the scaling.
With load-based instances, how does one add one to a target group?
Once you have the load-based instances ready, whether you launched them directly through automatic load-based scaling in OpsWorks or explicitly using Auto Scaling groups, you can go to the Application Load Balancer in the EC2 console, set up the necessary configuration, and then register the load-based instances you have just created with the ALB in the Register targets tab.
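If you would rather script the registration than use the console, the request for the elbv2 `register_targets` call can be built like this. A sketch only: the helper name, the ARN placeholder and the instance IDs are hypothetical, and the actual call is left as a comment so the example runs without AWS credentials.

```python
def register_targets_request(target_group_arn, instance_ids, port=None):
    """Build keyword arguments for the elbv2 register_targets call."""
    targets = []
    for instance_id in instance_ids:
        target = {"Id": instance_id}
        if port is not None:
            # Optional: overrides the target group's default port.
            target["Port"] = port
        targets.append(target)
    return {"TargetGroupArn": target_group_arn, "Targets": targets}


# Placeholder values, not real resources.
request = register_targets_request(
    "example-target-group-arn", ["i-aaaa1111", "i-bbbb2222"], port=80
)
print(request)

# With boto3 installed and credentials configured, the call would be:
#   boto3.client("elbv2").register_targets(**request)
```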
Can you have multiple auto scaling groups in one layer of OpsWorks?
Yes, you can have multiple auto scaling groups in one layer of OpsWorks.
Which one should I use for ALB routing traffic between the two groups?
You can use either of the groups,
so that you can pipe requests to 1 auto scaling group for one type of
request and the rest to the other auto scaling group.
Please refer to the Auto Scaling documentation.
I just am not sure what load-based instances are
Load-based instances are instances configured with a load-based scaling configuration. You need to set the thresholds, the configuration, and the events that define when to scale up and scale down.
Example: suppose you have 5 instances running initially, and you want your application to keep running when the load increases, to minimize downtime. You would set the auto scaling configuration so that if the average CPU utilization of the instances rises above 70%, 2 more instances are launched. You can scale up and down based on many other factors as well.
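The example above amounts to a simple step rule. Here is a sketch of it: the 70% threshold and the step of 2 come from the example, while the 30% scale-down threshold and the maximum of 11 instances are assumptions added only to make the function complete.

```python
def step_scale(current, avg_cpu_percent, high=70, low=30, step=2,
               min_size=5, max_size=11):
    """Add `step` instances above `high`, remove `step` below `low`."""
    if avg_cpu_percent > high:
        return min(max_size, current + step)
    if avg_cpu_percent < low:
        return max(min_size, current - step)
    return current


print(step_scale(5, 85))  # above 70%: scale up by 2
print(step_scale(7, 20))  # below the assumed 30%: scale back down
print(step_scale(5, 50))  # in between: no change
```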
Hope it Helps:)
I am new to using web services, but we have built a simple web service hosted in IIS on an Amazon EC2 instance with an Amazon RDS hosted database server. This all works fine as a prototype for our mobile application.
The next stage is to look at scale. We expect a high number of calls to the web service, so I need to know how we can have a cluster of instances handling the calls and how to scale the number of instances.
I am pretty new to this. At the moment we use an IP address in the call to the web service, which implies it is directed at a specific server. How do we build an architecture on Amazon where a request from the mobile device can be handled by any one of a number of servers, and where we can scale capacity to handle more web service calls just by adding more servers?
Thanks for any help
Steve
You'll want to use load balancing, which AWS conveniently also offers:
http://aws.amazon.com/elasticloadbalancing/
Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances. It enables you to achieve even greater fault tolerance in your applications, seamlessly providing the amount of load balancing capacity needed in response to incoming application traffic. Elastic Load Balancing detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored. Customers can enable Elastic Load Balancing within a single Availability Zone or across multiple zones for even more consistent application performance.
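To picture what that means for your mobile clients, here is a toy round-robin balancer that skips unhealthy instances. It is only an illustration with made-up instance names. The real service does this for you behind a single DNS name, so clients no longer call one server's IP address.

```python
from itertools import cycle


class ToyLoadBalancer:
    """Round-robin over a fixed instance list, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._ring = cycle(self.instances)

    def mark_unhealthy(self, instance):
        self.healthy.discard(instance)

    def mark_healthy(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def route(self):
        """Return the next healthy instance in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy instances available")
        # One full lap is enough to find a healthy instance.
        for _ in range(len(self.instances)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


lb = ToyLoadBalancer(["web-a", "web-b", "web-c"])
lb.mark_unhealthy("web-b")
print([lb.route() for _ in range(4)])
```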
In addition to Elastic Load Balancing, you'll want to have an Amazon Machine Image created, so you can launch instances on-demand without having to do manual configuration on each instance you launch. The EC2 documentation describes that process.
There's also Auto Scaling, which lets you set specific metrics to watch and automatically provision more instances. I believe it's throttled, so you don't have to worry about creating way too many, assuming you set reasonable thresholds at which to start and stop launching more instances.
Last (for a simple overview), you'll want to consider being in multiple availability zones so you're resilient to any potential outages. They aren't frequent, but they do happen. There's no guarantee you'll be available if you're only in one AZ.