RDS cross-region read delays - amazon-web-services

We have an RDS instance in us-east-1. The applications that access it run in us-east-1 and us-east-2, and the two regions are connected with VPC peering. We load-balance incoming requests using a Route 53 weighted routing policy. We see about 10 ms of delay when communicating across regions; these 10 ms delays are currently acceptable between microservices. But when the applications in region 2 access RDS, we face a considerable delay (due to the large number of database calls Hibernate makes). Are there any ways to reduce this database latency?

If you have a cross-region database and are reading from a slave / read replica, expect some delay. This is not a technology issue but a physics problem: data has to physically travel from A to B, and the latency grows as you move those two points further apart.
A way to reduce this latency is to add a cache layer. The first read still comes from the database (cross-region), but subsequent reads can be served from a cache such as Redis.
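As a rough illustration of the cache-aside pattern described above, here is a minimal Python sketch. The Redis endpoint, database host, key naming, and TTL are assumptions for the example, not part of the original setup:

```python
import json
import redis
import psycopg2  # assuming a PostgreSQL RDS instance; swap in your own driver

# Cache lives in the same region as the application; the database is cross-region.
cache = redis.Redis(host="my-local-redis.example.internal", port=6379)
db = psycopg2.connect(host="mydb.us-east-1.rds.example.com", dbname="app",
                      user="app", password="example-password")

def get_customer(customer_id: int) -> dict:
    key = f"customer:{customer_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # local read, sub-millisecond

    with db.cursor() as cur:               # cross-region read, ~10 ms RTT
        cur.execute("SELECT id, name FROM customers WHERE id = %s", (customer_id,))
        row = cur.fetchone()

    result = {"id": row[0], "name": row[1]}
    cache.set(key, json.dumps(result), ex=300)  # cache for 5 minutes
    return result
```

With Hibernate specifically, enabling the second-level cache achieves the same effect without hand-written cache code.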

Related

Scale ECS Task by open network connections?

We have a service that connects to a database. The memory and CPU consumption is overall quite low.
However, we find we're running out of concurrent requests with, say, one container. Scaling to 2+ allows more throughput.
What metric, if any, can we latch onto to autoscale based on concurrent requests? I think they're handles, but I'm out of my lane a bit there.
I would look into using your load balancer's RequestCount or ActiveConnectionCount metrics as your auto-scaling trigger.
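For example, a target-tracking policy on the ALB's request count per target can be attached to the ECS service. A minimal boto3 sketch follows; the cluster, service, and target-group names and the target value are placeholders:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Hypothetical identifiers -- replace with your cluster/service and ALB target group.
resource_id = "service/my-cluster/my-service"
resource_label = "app/my-alb/1234567890abcdef/targetgroup/my-tg/0123456789abcdef"

# Register the ECS service's desired count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=1,
    MaxCapacity=10,
)

# Track ALB requests per target so tasks are added when each task gets too busy.
autoscaling.put_scaling_policy(
    PolicyName="alb-request-count-per-target",
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,  # example target; tune to your workload
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            "ResourceLabel": resource_label,
        },
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 120,
    },
)
```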

Is there enough capacity in all AWS regions for disaster recovery?

In case of a disaster, when an entire AWS region fails and all its customers want to move their workloads to the next closest region in a disaster recovery scenario, is AWS ready for this?
I imagine millions of servers running in each region. Is AWS ready to provision them in another region the next day? Do they have that capacity at the ready?
AWS global infrastructure uses the concept of Availability Zones inside each region to partition resources, isolate risks, and ultimately reduce the blast radius of an eventual failure. AZs are groups of data centers within a region that are designed to be independent of each other in terms of risk (i.e. separate connections to the power grid, redundant and isolated network infrastructure, and isolation from geographical risks such as earthquakes, flooding, etc.).
Some services are designed to automatically take advantage of this redundant infrastructure (Amazon S3, Amazon DynamoDB, ELB, etc.); customers do not need to configure anything, as redundancy and failover at the regional level are handled by the service. Other services operate at the AZ level (Amazon EC2, EBS, RDS, etc.). For these services, the best practice is to design a multi-AZ architecture with data replication.
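As one concrete example of that best practice, RDS can maintain a synchronous standby in another AZ. A hedged boto3 sketch, with the instance identifier as a placeholder:

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Enable Multi-AZ on an existing instance; RDS keeps a synchronous standby
# in another Availability Zone and fails over to it automatically.
rds.modify_db_instance(
    DBInstanceIdentifier="my-database",   # placeholder name
    MultiAZ=True,
    ApplyImmediately=False,               # apply during the next maintenance window
)
```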
In the very unlikely case that a service is not available in an AZ, a well-architected application will transparently fail over to another AZ, without any noticeable customer impact.
Back to your question: the architecture is designed to avoid a region-wide failure of all services. That has never happened since we launched AWS in 2006. And yes, we have a lot of capacity. I suggest watching this keynote from James Hamilton to learn more about it: https://www.youtube.com/watch?v=AyOAjFNPAbA

AWS latency between Zones within the same Region

I have an EC2 instance and an RDS instance in the same region, US East (N. Virginia), but the two resources are in different zones: RDS in us-east-1a and EC2 in us-east-1b.
Now the question is: if I put both resources in the same zone, would it speed up data transfer to/from the DB? I receive around 20k-30k entries per day from the app to this instance.
EDIT
I read here that:
Each Availability Zone is isolated, but the Availability Zones in a region are connected through low-latency links.
Now I am wondering whether the latency over these links is negligible, or whether I should consider moving my resources into the same zone to speed up data transfer.
Conclusion
As discussed in answers and comments:
Since I have only one EC2 instance and one RDS instance, the failure of either zone would affect the whole system, so there is no advantage to keeping them in separate zones.
Even though zones are connected with low-latency links, there is still some latency, which is negligible in my case.
There is also a minor data transfer charge of USD 0.01/GB between EC2 and RDS in different zones.
What are typical values for Interzone data transfers in the same region?
Although AWS will not guarantee, state, or otherwise commit to hard numbers, typical measurements are under 10 ms; numbers around 3 ms are what I have seen.
How does latency affect data transfer throughput?
The higher the latency, the lower the maximum throughput you can achieve over a single connection. There are a number of factors to consider here. An excellent paper was written by Brad Hedlund.
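To make the relationship concrete, here is a back-of-the-envelope sketch: with a fixed TCP window, single-stream throughput is roughly window size divided by round-trip time (window scaling and parallel connections raise this ceiling). The 64 KB window below is an illustrative assumption only:

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput for a fixed window size."""
    return (window_bytes * 8) / (rtt_ms / 1000) / 1_000_000

for rtt in (1, 3, 10):
    print(f"{rtt} ms RTT -> {max_throughput_mbps(64 * 1024, rtt):.0f} Mb/s")
# 1 ms RTT  -> 524 Mb/s
# 3 ms RTT  -> 175 Mb/s
# 10 ms RTT -> 52 Mb/s
```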
Should I worry about latency in AWS networks between zones in the same region?
Unless you are using the latest instances with very high performance network adapters (10 Gb or higher), I would not worry about it. The benefits of fault tolerance should take precedence except in the most specialized cases.
For your use case, database transactions, the difference between 1 ms and 10 ms will have minimal impact, if at all, on your transaction performance.
However, unless you are using multiple EC2 instances in multiple zones, you want your single EC2 instance in the same zone as RDS. If you are in two zones, the failure of either zone brings down your configuration.
There are times when latency and network bandwidth are very important. For these specialized cases, AWS offers placement groups, which put EC2 instances physically close together (essentially in the same rack) to reduce latency to the absolute minimum.
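For reference, a cluster placement group can be created and used at launch time. A minimal boto3 sketch, with the group name, AMI, and instance type as placeholder values:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a cluster placement group: instances launched into it are packed
# close together on low-latency network hardware within a single AZ.
ec2.create_placement_group(GroupName="low-latency-group", Strategy="cluster")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="c5.large",           # cluster groups require supported instance types
    MinCount=2,
    MaxCount=2,
    Placement={"GroupName": "low-latency-group"},
)
```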
Moving the resources to the same AZ would decrease latency by very little. See here for some unofficial benchmarks. For your use-case of 20k reads/writes per day, this will NOT make a huge difference.
However, moving resources to the same AZ would significantly increase reliability in your case. If you only have 1 DB and 1 Compute Instance that depend on each other, then there is no reason to put them in separate availability zones. With your current architecture, a failure in either us-east-1a or us-east-1b would bring down your project. Unless you plan on scaling out your project to have multiple DBs and Compute Instances, they should both reside in the same AZ.
According to some tests, I see around 600 microseconds (0.6 ms) of latency between availability zones inside the same region. Fiber adds roughly 5 microseconds of latency per km, and AZs are less than 100 km apart, so the measured result is in the expected range once switching overhead is added.

Direct Connect RDS Latency Expectations

Recently we moved our databases to AWS RDS while our applications remain on-prem. Obviously latency was huge, so we provisioned Direct Connect with Megaport between the AWS Oregon region (RDS) and our data centers (applications) in San Francisco.
But surprisingly we did not see any major difference in latency (see the data below); it is almost the same as the connection over the internet.
OnPrem App - OnPrem DB: Insert 0.112 s
OnPrem App - AWS DB over Direct Connect: Insert 1.332 s
OnPrem App - AWS DB over Internet: Insert 1.50 s
Is this expected?
Do we have any options to improve latency?
Please provide any checkpoints for improvements.
Appreciate your support.
Latency on uncongested routes is largely a function of distance, along with the number of hops in the link.
Assuming your DC has an uncongested connection to the Internet (and AWS certainly does), congestion is not going to be an issue for small requests and latency will be relatively low.
However, that is not guaranteed and can vary from reasonable to terrible at any time. The Internet tends to suffer a degree of packet loss, causing retransmissions that increase latency. These effects become more noticeable with larger traffic volumes.
What Direct Connect gets you, along with security improvements, is guaranteed latency and bandwidth. Not only is your request marginally quicker, you can also ramp up the volume and be sure that performance won't get any worse.
Megaport publishes latency figures for their part of the network.
Unfortunately there are not any RDS options that would reduce the latency further. Other strategies such as local read-replicas or local caching may be suitable for your application.
Disclaimer: I work for Megaport
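Expanding on the read-replica strategy mentioned above: a replica can be created in a region closer to the readers so that read traffic stays local while writes still go to the primary. This does not move data on-premises, but the general idea looks roughly like this in boto3 (all identifiers and the "closer" region are hypothetical):

```python
import boto3

# Create the replica from the region where the replica should live.
rds = boto3.client("rds", region_name="us-west-1")  # hypothetical region nearer the readers

rds.create_db_instance_read_replica(
    DBInstanceIdentifier="mydb-replica-west",
    # For cross-region replicas, the source must be referenced by ARN.
    SourceDBInstanceIdentifier="arn:aws:rds:us-west-2:123456789012:db:mydb",
    DBInstanceClass="db.r5.large",
    SourceRegion="us-west-2",   # lets boto3 generate the pre-signed URL for you
)
```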

Ensuring consistent network throughput from AWS EC2 instance?

I have created a few AWS EC2 instances; however, sometimes my data throughput (both upload and download) becomes severely limited on certain servers.
For example, I typically get about 15-17 MB/s throughput from an instance in the US West (Oregon) region. However, sometimes, especially when I transfer a large amount of data in a single day, my throughput drops to 1-2 MB/s. When this happens on one server, the other servers still have their typical network throughput (as expected).
How can I avoid it? And what can cause this?
If it is due to the amount of data I upload/download, how can I avoid it?
At the moment, I am using t2.micro type instances.
Simple answer: don't use micro instances.
AWS is a multi-tenant environment, so resources are shared. When it comes to network performance, larger instance sizes get higher priority; only the largest instances get any sort of dedicated performance.
Micro and nano instances get the lowest priority out of all instances types.
This matrix will show you what priority each instance size gets:
https://aws.amazon.com/ec2/instance-types/#instance-type-matrix
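If it helps, the same information is available programmatically. A small boto3 sketch, with the instance types chosen arbitrarily for comparison:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Compare the advertised network performance of a few instance types.
resp = ec2.describe_instance_types(
    InstanceTypes=["t2.micro", "m5.large", "c5n.18xlarge"]
)
for it in resp["InstanceTypes"]:
    print(it["InstanceType"], "->", it["NetworkInfo"]["NetworkPerformance"])
# Typical output:
# t2.micro     -> Low to Moderate
# m5.large     -> Up to 10 Gigabit
# c5n.18xlarge -> 100 Gigabit
```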