Google cloud load balancer unevenly distributes traffic - google-cloud-platform

I have created instance groups with 3 instances and a GCP load balancer of type HTTP/2.
When I hit the load balancer's IP, the requests get distributed seemingly at random. Say I send 12 requests: since there are 3 instances, the load distribution should have been 4 per VM, but it doesn't happen in a round-robin way. Is there any possibility that I can achieve this in GCP?

The algorithm used by GCP Load Balancing is intended to distribute load according to the geographic location of the clients. If more than one zone in a region is configured with backends, the traffic is distributed across the instance groups in each zone according to each group's capacity.
Round-robin distribution applies only when you create a backend made of instances within the same zone; in that case the requests are spread evenly over the instances.
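If you want to verify the distribution yourself, here is a rough Python sketch. It assumes each backend reports its own hostname in a hypothetical X-Backend response header, which your application would have to set; the LB IP is a placeholder:

    import collections
    import requests  # third-party: pip install requests

    LB_URL = "http://203.0.113.10/"  # placeholder load balancer IP

    # Send a batch of requests and tally which instance answered,
    # based on the hypothetical X-Backend header set by the app.
    counts = collections.Counter()
    for _ in range(12):
        resp = requests.get(LB_URL, timeout=5)
        counts[resp.headers.get("X-Backend", "unknown")] += 1

    # Small batches will look uneven; the spread only evens out
    # statistically at higher request volumes.
    print(counts)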

Related

Improving latency times for EC2 from different geographic areas

We have an EC2 instance in US East, and our latency for users in the UK + AU is about 1-2 seconds higher. We are only dealing with text data and an RDS server in the same zone. Suppose we want to go the route of creating another instance from an image of the primary EC2. How does the process work for lowering latency? Per our understanding:
1. Create a replica from the image in the AU/UK zones.
2. Add the external IPs of these two servers to the domain's nameserver, which will automatically help route the user to the closest server?
Or does it involve creating some sort of load balancer with a geographic rule, where the load balancer's IP is what our NS will point to?
TLDR: How do we route UK users to the UK EC2 server, what does the setup look like?
I recommend that you use Latency-based routing - Amazon Route 53:
If your application is hosted in multiple AWS Regions, you can improve performance for your users by serving their requests from the AWS Region that provides the lowest latency.
You would configure one DNS Name to route to multiple IP addresses. Route 53 will examine the location of the incoming request and route the traffic to the destination with the lowest latency.
This is not quite the same as geographic routing because some countries have better Internet connectivity and traffic will be routed according to latency rather than distance.
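As a sketch of that configuration with boto3 (the hosted zone ID, record name, and IPs below are placeholders): each regional endpoint gets the same record name, plus a Region that marks it as a latency record and a unique SetIdentifier.

    import boto3

    r53 = boto3.client("route53")

    ZONE_ID = "Z0000000000000"  # placeholder hosted zone
    ENDPOINTS = {"us-east-1": "203.0.113.10", "eu-west-2": "203.0.113.20"}

    for region, ip in ENDPOINTS.items():
        r53.change_resource_record_sets(
            HostedZoneId=ZONE_ID,
            ChangeBatch={"Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": region,  # must be unique per record
                    "Region": region,         # makes this a latency record
                    "TTL": 60,
                    "ResourceRecords": [{"Value": ip}],
                },
            }]},
        )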
Alternatively, you can use AWS Global Accelerator, which routes traffic across the AWS global network. It uses a single IP address that exists in multiple locations (known as anycast) to redirect traffic to the closest (fewest-hop) endpoint of the AWS global network, then sends traffic over that network to the closest AWS location where you have provisioned services. This can achieve lower latency than routing across the Internet, but incurs a per GB cost for traffic.
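If you go the Global Accelerator route instead, the basic shape is one accelerator, a listener, and an endpoint group per region. A minimal boto3 sketch, with placeholder names and a placeholder endpoint ARN:

    import boto3

    # The Global Accelerator API is served from us-west-2, regardless of
    # where your endpoints live.
    ga = boto3.client("globalaccelerator", region_name="us-west-2")

    acc = ga.create_accelerator(Name="my-app", Enabled=True)["Accelerator"]

    listener = ga.create_listener(
        AcceleratorArn=acc["AcceleratorArn"],
        Protocol="TCP",
        PortRanges=[{"FromPort": 443, "ToPort": 443}],
    )["Listener"]

    # One endpoint group per region; EndpointId is a placeholder ALB ARN.
    ga.create_endpoint_group(
        ListenerArn=listener["ListenerArn"],
        EndpointGroupRegion="eu-west-2",
        EndpointConfigurations=[{
            "EndpointId": "arn:aws:elasticloadbalancing:eu-west-2:"
                          "111122223333:loadbalancer/app/my-alb/0123456789abcdef",
        }],
    )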

Load Balancing of 2 instances in AWS

I have two VMs (in the AWS cloud) connected to a single DB. Each VM runs the same application. I want to load balance those two VMs and route based on the traffic (i.e. if traffic is high on one VM instance, it should switch to the other VM).
Currently I am accessing the 2 instances via 2 different IP addresses over HTTP. Now I want to access those 2 VMs over HTTPS, routed under the same DNS name, like (https://dns name/service1/),
(https://dns name/service2/).
How can I do load balancing using an nginx ingress?
I am new to the AWS cloud. Can someone help me, guide me, or suggest some appropriate references for getting to a solution?
AWS offers an Elastic Load Balancing service.
From What is Elastic Load Balancing? - Elastic Load Balancing:
Elastic Load Balancing automatically distributes your incoming traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more Availability Zones. It monitors the health of its registered targets, and routes traffic only to the healthy targets. Elastic Load Balancing scales your load balancer as your incoming traffic changes over time. It can automatically scale to the vast majority of workloads.
You can use this ELB service instead of running another Amazon EC2 instance with nginx. (Charges apply.)
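A rough boto3 sketch of that setup (all IDs and ARNs below are placeholders): an Application Load Balancer terminates HTTPS for you and forwards to both instances.

    import boto3

    elbv2 = boto3.client("elbv2", region_name="us-east-1")  # placeholder region

    alb = elbv2.create_load_balancer(
        Name="my-alb",
        Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
        SecurityGroups=["sg-0123456789"],
        Scheme="internet-facing",
        Type="application",
    )["LoadBalancers"][0]

    tg = elbv2.create_target_group(
        Name="my-targets",
        Protocol="HTTP",
        Port=80,
        VpcId="vpc-0123456789",
        TargetType="instance",
    )["TargetGroups"][0]

    # Register both VMs; the ALB health-checks them and spreads requests.
    elbv2.register_targets(
        TargetGroupArn=tg["TargetGroupArn"],
        Targets=[{"Id": "i-aaaa1111"}, {"Id": "i-bbbb2222"}],
    )

    # HTTPS listener terminating TLS at the ALB (certificate from ACM;
    # the ARN here is a placeholder).
    elbv2.create_listener(
        LoadBalancerArn=alb["LoadBalancerArn"],
        Protocol="HTTPS",
        Port=443,
        Certificates=[{"CertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/placeholder"}],
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
    )

Path-based rules (e.g. /service1/* and /service2/* going to different target groups) can then be added with create_rule on the listener, which covers the two URL paths you mention.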
Alternatively, you could configure your domain name on Amazon Route 53 to use Weighted routing:
Weighted routing lets you associate multiple resources with a single domain name (example.com) or subdomain name (acme.example.com) and choose how much traffic is routed to each resource. This can be useful for a variety of purposes, including load balancing and testing new versions of software.
This would distribute the traffic when resolving the DNS Name rather than using a Load Balancer. It's not quite the same because DNS information is cached, so the same client would continue to be redirected to the same server until the cache is cleared. However, it is practically free to use.
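A sketch of weighted records with boto3 (the zone ID and IPs are placeholders); equal weights give a roughly even split, subject to the DNS caching caveat above:

    import boto3

    r53 = boto3.client("route53")

    for ident, ip in [("vm-1", "203.0.113.10"), ("vm-2", "203.0.113.20")]:
        r53.change_resource_record_sets(
            HostedZoneId="Z0000000000000",  # placeholder hosted zone
            ChangeBatch={"Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": ident,
                    "Weight": 1,  # equal weights: roughly 50/50
                    "TTL": 60,
                    "ResourceRecords": [{"Value": ip}],
                },
            }]},
        )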

CPU and memory utilization discrepancies for ejabberd and Riak clusters on AWS

I'm running a 2-node ejabberd cluster (behind an elastic load balancer) that in turn connects with a 3-node Riak cluster (again, via an ELB) on AWS. When I load-test the platform via Tsung (creating 0.5 million user registrations), I notice that the CPU utilization for the ejabberd nodes differs amongst themselves by around 10%. For the Riak nodes, the CPU and memory utilization amongst nodes differs by around 5%.
The nodes have identical configurations, so I am wondering what could be leading to these non-trivial differences in utilization. Can anyone shed some light here, please?
Is it due to the load balancer? Or a network effect? I would expect that once a cluster is formed (either of ejabberd or of Riak KV), the nodes all behave identically, especially for ejabberd, where the entire database is replicated across the cluster.
Not that these differences are a problem, but it would be good to understand the inner workings of the clusters here...
Many thanks.
Elastic Load Balancing mechanism
1. The DNS server uses DNS round robin to determine which load balancer node in a specific Availability Zone will receive the request.
2. The selected load balancer checks for a "sticky session" cookie.
3. The selected load balancer sends the request to the least loaded instance.
And in greater detail:
Availability Zones (unlikely your case)
By default, the load balancer node routes traffic to back-end instances within the same Availability Zone. To ensure that your back-end instances are able to handle the request load in each Availability Zone, it is important to have approximately equivalent numbers of instances in each zone. For example, if you have ten instances in Availability Zone us-east-1a and two instances in us-east-1b, the traffic will still be equally distributed between the two Availability Zones. As a result, the two instances in us-east-1b will have to serve the same amount of traffic as the ten instances in us-east-1a.
Sessions (most likely your case)
By default a load balancer routes each request independently to the server instance with the smallest load. By comparison, a sticky session binds a user's session to a specific server instance so that all requests coming from the user during the session will be sent to the same server instance.
AWS Elastic Beanstalk uses load balancer-generated HTTP cookies when sticky sessions are enabled for an application. The load balancer uses a special load balancer–generated cookie to track the application instance for each request. When the load balancer receives a request, it first checks to see if this cookie is present in the request. If so, the request is sent to the application instance specified in the cookie. If there is no cookie, the load balancer chooses an application instance based on the existing load balancing algorithm. A cookie is inserted into the response for binding subsequent requests from the same user to that application instance. The policy configuration defines a cookie expiry, which establishes the duration of validity for each cookie.
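If you want to confirm whether stickiness is enabled on your ELB, a boto3 sketch (the load balancer name and region are placeholders):

    import boto3

    # List the policies attached to each listener of a Classic Load
    # Balancer. An LBCookieStickinessPolicy or AppCookieStickinessPolicy
    # here means requests are being pinned to specific back-end instances.
    elb = boto3.client("elb", region_name="us-east-1")
    desc = elb.describe_load_balancers(LoadBalancerNames=["my-elb"])
    for listener in desc["LoadBalancerDescriptions"][0]["ListenerDescriptions"]:
        print(listener["Listener"]["LoadBalancerPort"], listener["PolicyNames"])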
Routing Algorithm (less likely your case)
Load balancer node sends the request to healthy instances within the same Availability Zone using the leastconns routing algorithm. The leastconns routing algorithm favors back-end instances with the fewest connections or outstanding requests.
Source: Elastic Load Balancing Terminology And Key Concepts
Hope it helps.

AWS ELB doesn't distribute requests to auto scaling group EC2 instances in some cases

I'm trying to do performance testing for my AWS auto scaling group using jmeter.
Firstly, I did scale-in/out testing. I set the threshold to 70% CPU utilization for 2 periods, each period being 2 minutes. The ELB worked fine, and the requests were distributed to all EC2 instances in the auto scaling group after the system scaled out, albeit unevenly.
Next, I wanted to test whether two instances can take twice the load of one instance.
I fixed the instance count of the auto scaling group by setting the min/max/desired instance count to 2. When I push load from a single JMeter, only one instance ever does any work, and its CPU utilization reaches almost 100 percent, while the CPU utilization of the other instance stays at zero... If I push load from a JMeter cluster which contains several slaves, all instances take load.
Somebody said maybe the load is not heavy enough, so the ELB considers that just one instance can handle it and doesn't dispatch requests to the other instance. I don't think so, because when I push load from just one slave of this JMeter cluster, only one instance handles requests however much I increase the load.
I found a blog post which said ELB is great at HA but not at load balancing.
https://www.stackdriver.com/elb-affinity-problems
But I don't think this behavior, where only one instance handles requests, is normal.
What exactly is the ELB load balancing mechanism doing? I'm confused.
I had this issue with unbalanced ELB traffic when back-end instances were in different Availability Zones and the ELB was receiving requests from a small number of clients. In our case we were using an internal ELB within the application tiers. In your case, "push load from a single JMeter" likely means a small number of clients as seen by the ELB. The solution was to enable cross-zone load balancing using the CLI, similar to this fragment:
elb-modify-lb-attributes ${ELB} --region ${REGION} --crosszoneloadbalancing "enabled=true"
http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/enable-disable-crosszone-lb.html
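The same attribute can also be set from boto3; a sketch where the load balancer name and region are placeholders:

    import boto3

    # Enable cross-zone load balancing on a Classic Load Balancer.
    elb = boto3.client("elb", region_name="us-east-1")
    elb.modify_load_balancer_attributes(
        LoadBalancerName="my-elb",
        LoadBalancerAttributes={"CrossZoneLoadBalancing": {"Enabled": True}},
    )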

Unusual behavior of AWS Elastic Load Balancer

I have a running e-commerce Ruby on Rails application on the AWS stack. I am running Ubuntu 10.04 EC2 instances behind an Elastic Load Balancer, and I have maintained an equal number of instances in both Availability Zones, 1a and 1b. But from my observation, the ELB seems to be pushing more traffic to 1a rather than dividing it equally. The health of the instances running in 1b is good, and I have also disabled sticky sessions on the ELB. I have 2 large and 1 medium instances running in both Availability Zones.
What is the cause of the unequal distribution of the load?
In my experience, this can happen if a disproportionate amount of traffic is coming from a single network or ip address.
ELB uses different layers of balancing. DNS load balancing will send it to a set of IP addresses in one of the two zones, and the software load balancer will distribute traffic between instances in the zone.
If you have a lot of traffic coming from the same network, its likely that a lot of users are getting the same DNS resolution on your load balancer and ending up in the same zone.
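You can watch this DNS tier directly by resolving the ELB's name repeatedly; a small Python sketch (the hostname is a placeholder):

    import socket
    from collections import Counter

    # Resolve the ELB's DNS name repeatedly and tally the first A record
    # returned. Note that your OS resolver may cache answers, which is
    # exactly the effect described above: clients behind the same caching
    # resolver converge on the same address and land in the same zone.
    counts = Counter()
    for _ in range(20):
        infos = socket.getaddrinfo(
            "my-elb-1234567890.us-east-1.elb.amazonaws.com", 80,
            proto=socket.IPPROTO_TCP,
        )
        counts[infos[0][4][0]] += 1
    print(counts)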
If the source traffic is coming from a single network/IP range or IP address, ELB might load balance the traffic disproportionately to the backend. I have discussed this point, as well as a couple of other details to note on ELB, in my blog post "Dissecting ELB". I have also noticed this behavior in some popular open-source load balancer implementations, where a "source" balancing algorithm can be combined with session stickiness: if the session ID is not sent over HTTP, the load balancing falls back to "source".