I have an Aurora cluster with a reader and writer instances. And the reader instance has high hits and hits 100% every now and then. I was wondering about possible options to reduce the load on the same. The current instance type is db.r4.4xlarge .
I also read about adding multiple reader instances which uses the same endpoint and AWS load balances the traffic between them automatically. I would love to know if all I have to do is add another reader instance and all the load balancing happens automatically? And does creating a new reader affect the performance of the cluster while the new one is being created?
What about using a Redis ElastiCache instance? How can I use this with RDS to reduce load o the same instance?
Which of the above 2 would be the best way to go forward ??? Please suggest
Adding more reader instances to Aurora cluster or scaling up reader instance are way to avoid high CPU. While using readeronly endpoint, you have to keep few things in mind
Load Balancing with the Aurora Reader Endpoint
The Aurora reader endpoint contains all Aurora Replicas, it can provide DNS-based, round-robin load balancing for new connections. Every time you resolve the reader endpoint, you'll get an instance IP that you can connect to, chosen in round-robin fashion.
DNS load balancing works at the connection level (not the individual query level). You must keep resolving the endpoint without caching DNS to get a different instance IP on each resolution. If you only resolve the endpoint once and then keep the connection in your pool, every query on that connection goes to the same instance. If you cache DNS, you receive the same instance IP each time you resolve the endpoint.
DNS Caching
Aurora Replicas can experience unequal utilization because of DNS caching.
Unless you use a smart database driver, you depend on DNS record updates and DNS propagation for failovers, instance scaling, and load balancing across Aurora Replicas.
Currently, Aurora DNS zones use a short Time-To-Live (TTL) of 5 seconds. Ensure that your network and client configurations don’t further increase the DNS cache TTL.
Remember that DNS caching can occur anywhere from your network layer, through the operating system, to the application container. For example, Java virtual machines (JVMs) are notorious for caching DNS indefinitely unless configured otherwise.
Another good read on the same topic.
Related
What I have:
One VPC with 2 EC2 Ubuntu instances in it: One with phpmyadmin,
another one with mysql database. I am able to connect from one
instance to another.
What I need to achieve:
Set up the Disaster recovery for those instances. In case of networking issues or if the first VPC is not available for any reason all requests sent to the first VPC are
redirected to the second one. If I got it right it can be achieved
with VPC endpoints. Cannot find any guide on how to proceed with
this. (I have 2 VPCs with 2 ec2 instances in each of them)
Edit:
Currently I have 2 VPC with 2 EC2 instances in each of them.
Yes, ideally I need to have 2 databases running and sync the date between them. Not it is just 2 separate db instances with no sync.
First ec2 instance in each VPC has web app running. So external requests to the web app should be sent to the first VPC if it is available and to the second VPC if smth is wrong with the first one. Same with the DBs: if DB instance in the first VPC is available - web app requests should update data in this DB. If not requests should access the data from the second DB instance
Traditionally, Disaster Recovery (DR) involves having a secondary copy of 'everything' (eg servers in a different data center). Then, if something goes wrong, failover would involve pointing to the secondary copy.
However, the modern cloud emphasises High Availability rather than Disaster Recovery. An HA architecture actually has multiple systems continually running in separate Availability Zones (AZs) (which are effectively Data Centers). When something goes wrong, the remaining systems continue to service requests without needing to 'failover' to alternate infrastructure. Then, additional infrastructure is brought online to make up for the failed portion.
High Availability can also operate at multiple levels. For example:
High Availability for the database would involve running the database under Amazon RDS "Multi-AZ" configuration. There is one 'primary' database that is servicing requests, but the data is being continually copied to a 'secondary database in a different AZ. If the database or AZ should fail, then the secondary database takes over as the primary database. No data is lost.
High Availability for web apps running on Amazon EC2 instances involves using a Load Balancer to distribute requests to Amazon EC2 instances running in multiple AZs. If an instance or AZ should fail, then the Load Balancer will continue serving traffic to the remaining instances. Auto Scaling would automatically launch new instances to make up for the lost capacity.
To compare:
Disaster Recovery is about having a second set of infrastructure that isn't being used. When something fails, the second set of infrastructure is 'switched on' and traffic is redirected there.
High Availability is all about continually handling loads across multiple Data Centers (AZs). When something fails, it keeps going and new infrastructure is launched. There should be no 'outage period'.
You might think that running multiple EC2 instances simultaneously to provide High Availability is more expensive. However, each instance would only need to handle a portion of the load. A single 'Large' instance costs the same as two 'Medium' instances, so splitting the workload between multiple instances does not need to cost more.
Also, please note that VPCs are logical network configurations. A VPC can have multiple Subnets, and each Subnet can be in a different AZ. Therefore, there is no need for two VPCs -- one is perfectly sufficient.
VPC Endpoints are not relevant for DR or HA. They are a means of connecting from a VPC to AWS Services, and operate across multiple AZs already.
See also:
High availability is not disaster recovery - Disaster Recovery of Workloads on AWS: Recovery in the Cloud
High Availability Application Architectures in Amazon VPC (ARC202) | AWS re:Invent 2013 - YouTube
In addition to the previous answers, you might wanna take a look in migrating your DBs to RDS or Aurora.
It would provide HA for your DB tier via multi-AZ configuration, and you would not have to figure out how to sync the data between the databases.
That being said, you also have to decide what level of availability is acceptable for you:
multi AZ - data & services span across multiple data centers in one region -> if the whole region goes down, your application goes down.
multi region - data & services span across multiple data centers in multiple regions -> single region failure won't put you out of business, but it requires some more bucks & effort to configure
I have web application which is communicating with AWS RDS (MYSQL). now we want to scale our database instance to reduce master traffic. so AWS RDS has provided "Read Replica" feature , through which we distributes traffic to reduce load on master. Now the question here is how my web application knows to redirect read traffic to read replica instance. what is best approach through which we can achieve this functionality.
There is no magic bullet; you have an endpoint for the master db and an endpoint for EACH read replica. You have to account for this in your application code, i.e. have a pool of read DB connections that point to your read replicas and a pool of write DB connections that point to your master db. In your application code, when you are doing reads, you use the read connection and writes, the master connection. You can use general design techniques to hide this from the rest of your application code somewhat (in the Persistence layer) but the other layers have to inform the Persistence layer what type of connection they want.
As an aside, why RDS + MySQL rather than with Aurora? One advantage of Aurora is that you get a single endpoint for all your read replicas and Aurora will take on the responsibility of load balancing across the available replicas. With RDS + MySQL, you have an endpoint for each replica and have to do the load balancing on your own.
The Heimdall Proxy can split reads/writes with Strong Consistency. It is available on the AWS, Azure and GCP Marketplace. The proxy speaks monitors the RDS status via APIs to ensure proper failover and query routing. It also does not depend on DNS when tends to unevenly distribute load across read replicas. The Heimdall Proxy makes better use of your replicas.
Check out their blog: https://aws.amazon.com/blogs/apn/using-the-heimdall-proxy-to-split-reads-and-writes-for-amazon-aurora-and-amazon-rds/
I'm running a 2-node ejabberd cluster (behind an elastic load balancer) that in turn connects with a 3-node Riak cluster (again, via an ELB) on AWS. When I load-test the platform via Tsung (creating 0.5 million user registrations), I notice that the CPU utilization for the ejabberd nodes differs amongst themselves by around 10%. For the Riak nodes, the CPU and memory utilization amongst nodes differs by around 5%.
The nodes are of identical configuration, so wondering what could be leading to these non-trivial differences in utilization. Can anyone throw some light here please / educate me?
Is it due to the load balancer? Or a network impact? I expect that once a cluster is formed (either of ejabberd or of Riak KV), the nodes are all identical in behavior, especially for ejabberd where the entire database is replicated across the cluster.
Not that these differences are a problem, but would be good to understand the inner workings of the clusters here...
Many thanks.
Elastic Load Balancing mechanism
DNS server uses DNS round robin to determine which load balancer node in a specific Availablity Zone will receive the request
The selected load balancer checks for "sticky session" cookie
The selected load balancer sends the request to the least loaded instance
And in greater details:
Availability Zones (unlikely your case)
By default, the load balancer node routes traffic to back-end instances within the same Availability Zone. To ensure that your back-end instances are able to handle the request load in each Availability Zone, it is important to have approximately equivalent numbers of instances in each zone. For example, if you have ten instances in Availability Zone us-east-1a and two instances in us-east-1b, the traffic will still be equally distributed between the two Availability Zones. As a result, the two instances in us-east-1b will have to serve the same amount of traffic as the ten instances in us-east-1a.
Sessions (most likely your case)
By default a load balancer routes each request independently to the server instance with the smallest load. By comparison, a sticky session binds a user's session to a specific server instance so that all requests coming from the user during the session will be sent to the same server instance.
AWS Elastic Beanstalk uses load balancer-generated HTTP cookies when sticky sessions are enabled for an application. The load balancer uses a special load balancer–generated cookie to track the application instance for each request. When the load balancer receives a request, it first checks to see if this cookie is present in the request. If so, the request is sent to the application instance specified in the cookie. If there is no cookie, the load balancer chooses an application instance based on the existing load balancing algorithm. A cookie is inserted into the response for binding subsequent requests from the same user to that application instance. The policy configuration defines a cookie expiry, which establishes the duration of validity for each cookie.
Routing Algorithm (less likely your case)
Load balancer node sends the request to healthy instances within the same Availability Zone using the leastconns routing algorithm. The leastconns routing algorithm favors back-end instances with the fewest connections or outstanding requests.
Source: Elastic Load Balancing Terminology And Key Concepts
Hope it helps.
I recently started reading about and playing around with AWS. I have particular interest in the different high availability architectures that can be acheived using the platform. Specifically, I am looking for a reliable poor man's solution that can be implemented using the least amount of servers.
So far, I am satisfied with solutions for the main HA concerns: load balancing, redundancy, auto recovery, scalability ...
The only sticking point I have is with failover solutions.
Using an ELB might seem great, however ELB actually uses DNS balancing under the hood. See Is AWS's Elastic Load Balancer a single point of failure?. Also from a Netflix blog post: Lessons Netflix Learned from the AWS Outage
This is because the ELB is a two tier load balancing scheme. The first tier consists of basic DNS based round robin load balancing. This gets a client to an ELB endpoint in the cloud that is in one of the zones that your ELB is configured to use.
Now, I have learned DNS failover is not an ideal solution, as others have pointed out, mainly because of unpredictable DNS caching. See for example: Why is DNS failover not recommended?.
Other than ELBs, it seems to me that most AWS HA architectures rely on DNS failover using route 53.
Finally, the floating IP/Elastic IP (EIP) strategy has popped up in a very small number of articles, such as Leveraging Multiple IP Addresses for Virtual IP Address Fail-over and I'm having a hard time figuring out if this is a viable solution for production systems. Also, all examples I came across implemented this using a set of active-passive instances. It seems like a waste to have a passive for every active to achieve this.
In light of this, I would like to ask you what is a faster and more reliable way to perform failover?
More specifically, please discuss how to perform failover without using DNS for the following 2 setups:
2 active-active EC2 instances in seperate AZs. Active-active, because this is a budget setup, were we can't afford to have an instance sitting around.
1 ELB with 2 EC2 instances in region A, 1 ELB with 2 EC2 instances in region B. Again, both regions are active and serving traffic. How do you handle the failover from 1 ELB to the other?
You'll understand ELB better by playing with it, if you are the inquisitive type, as I am.
"1" ELB provisioned in 2 availability zones is billed as 1 but deployed as 2. There are 2 IP addresses assigned, one to each balancer, and 2 A records auto-created, one for each, with very short TTLs.
Each of these 2 balancers will forward traffic to the instance in its same AZ, or you can enable cross-AZ load balancing (and you should, if you only have 1 server instance in each AZ).
These IP addresses do not change often and though it stands to reason that ELBs fail like anything else, I have maybe 30 of them and have never knowingly had a dead one on my hands, presumably because the ELB infrastructure will replace a dead instance and change the DNS without your intervention.
For 2 regions, you have little choice other than using DNS at some level. Latency-based routing from Route 53 can send people to the closest site in normal operations and route all traffic to the other site in the event of an outage of an entire region (as detected by Route 53 health checks), but with this is somewhat more likely to encounter issues with DNS caching when the entire region is unavailable.
Of course, part of the active/passive dilemma in a single region using Elastic IP is easily remedied with HAProxy on both app servers. It's an http request router and load balancer like ELB, but with a broader set of features. The code is so tight that you can likely run it on your app servers with negligible CPU consumption. The instance with the EIP would then balance traffic between its local app server and the peer. Across regions, HAProxy behind ELB could forward traffic to a mate in a remote region, if the local region is up but for whatever reason the application can't serve requests from the local region. (I have used such a setup to increase availability of external services, by bouncing the request to a remote AWS region when the direct Internet path from the local region is not working.)
I'm trying to do performance testing for my AWS auto scaling group using jmeter.
Firstly, I did scale-in/out testing. I set the threshold to be 70% cpu utilization for 2 periods, each period is 2 minutes. The ELB works fine, and the requests was distributed to all EC2 instances in auto scaling group, in spite of un-equality, after the system scale-out.
In next, I want to test whether the two instances' load can be twice of one instance's.
I fixed the instance number of auto scaling group, I set the min/max/desired instance count to be 2. When I push load from single JMeter, there are always just only one instance works and its cpu utilization reach almost 100 percent, but the cpu utilization of the other instance is still zero.... If I push load from an JMeter cluster which contains several slaves, all instances take load.
Somebody said, maybe the load is not heavy enough, so the ELB considered that just one instance can handle it and didn't dispatch requests to other instance. I don't think so, because I push load from just one slave of this JMeter cluster, however I increase the load, just only one instance handle requests.
I found an blog which said ELB is great in HA but not load balancing.
https://www.stackdriver.com/elb-affinity-problems
But, I don't think the behavior, only one instance handle requests, is normal.
What the hell in the ELB load balance mechanism? I'm confused.
Elastic Load Balancing mechanism
DNS server uses DNS round robin to determine which load balancer node in a specific Availablity Zone will receive the request
The selected load balancer checks for "sticky session" cookie
The selected oad balancer sends the request to the least loaded instance
And in greater details:
Availability Zones (unlikely your case)
By default, the load balancer node routes traffic to back-end instances within the same Availability Zone. To ensure that your back-end instances are able to handle the request load in each Availability Zone, it is important to have approximately equivalent numbers of instances in each zone. For example, if you have ten instances in Availability Zone us-east-1a and two instances in us-east-1b, the traffic will still be equally distributed between the two Availability Zones. As a result, the two instances in us-east-1b will have to serve the same amount of traffic as the ten instances in us-east-1a.
Sessions (most likely your case)
By default a load balancer routes each request independently to the server instance with the smallest load. By comparison, a sticky session binds a user's session to a specific server instance so that all requests coming from the user during the session will be sent to the same server instance.
AWS Elastic Beanstalk uses load balancer-generated HTTP cookies when sticky sessions are enabled for an application. The load balancer uses a special load balancer–generated cookie to track the application instance for each request. When the load balancer receives a request, it first checks to see if this cookie is present in the request. If so, the request is sent to the application instance specified in the cookie. If there is no cookie, the load balancer chooses an application instance based on the existing load balancing algorithm. A cookie is inserted into the response for binding subsequent requests from the same user to that application instance. The policy configuration defines a cookie expiry, which establishes the duration of validity for each cookie.
Routing Algorithm (less likely your case)
Load balancer node sends the request to healthy instances within the same Availability Zone using the leastconns routing algorithm. The leastconns routing algorithm favors back-end instances with the fewest connections or outstanding requests.
Source: Elastic Load Balancing Terminology And Key Concepts
Hope it helps.
I had this issue with unbalanced ELB traffic when back end instances where in different availability zones and the ELB was receiving requests from a small number of clients. In our case we were using an internal ELB within the application tiers. In your case "push load from single JMeter" likely means a small number of clients as seen by the ELB. The solution was to enable cross zone load balancing using the API similar to this fragment:-
elb-modify-lb-attributes ${ELB} --region ${REGION} --crosszoneloadbalancing "enabled=true"
http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/enable-disable-crosszone-lb.html