Optimizing Google Cloud DNS for users from a specific location - google-cloud-platform

My website receives most of its traffic from a specific country. Though I am using Google Cloud DNS, I am observing DNS lookup time to hover in the range of 200 to 300 ms, which is quite slow I guess. Is there any extra optimization on Cloud DNS that will help to resolve DNS queries faster in my country of interest?

I am observing DNS lookup time to hover in the range of 200 to 300 ms
which is quite slow I guess.
This is neither slow nor fast. However, the performance of your Internet connection and provider can have a big impact on this number. First remove your Internet connection from the equation, so that you can determine how much of the delay is caused by Google -> your provider and how much by your provider -> your computer.
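One way to split that measurement, assuming you have Python and the dnspython library handy, is to time a lookup through your normal resolver and then time one sent directly to one of the authoritative name servers Cloud DNS lists for your zone. This is only a rough sketch; the domain and name server host below are placeholders for your own zone's values.

```python
# Rough sketch: compare lookup latency via the local resolver vs. directly
# against an authoritative Cloud DNS name server (requires dnspython).
# "example.com" and "ns-cloud-a1.googledomains.com" are placeholders; use
# the zone and name servers shown for your managed zone in Cloud DNS.
import time
import dns.resolver

DOMAIN = "example.com"
AUTHORITATIVE_NS_HOST = "ns-cloud-a1.googledomains.com"

def timed_lookup(nameserver_ip=None):
    resolver = dns.resolver.Resolver()
    if nameserver_ip:
        resolver.nameservers = [nameserver_ip]   # bypass the local resolver entirely
    start = time.monotonic()
    resolver.resolve(DOMAIN, "A")
    return (time.monotonic() - start) * 1000

# Find the authoritative server's IP once, then time both paths.
ns_ip = dns.resolver.resolve(AUTHORITATIVE_NS_HOST, "A")[0].to_text()
print(f"via local resolver:    {timed_lookup():6.1f} ms")
print(f"via authoritative NS:  {timed_lookup(ns_ip):6.1f} ms")
```

If the authoritative path is fast but the local-resolver path is slow, the delay sits between you (or your users) and the resolver, not in Cloud DNS.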
Is there any extra optimization on Cloud DNS that will help to resolve
DNS queries faster in my country of interest?
You are leaving out details. What country? How are you testing queries?
Use an Internet DNS testing service that will report response time from points around the world. This will provide you with more realistic data. Testing from a single Internet connection is pointless unless that is the source of all your traffic.
However, there are no settings you can change in Google Cloud DNS that will affect geolocation response times.
The response time for DNS has very little to do with the Internet performance of a website unless you set the TTL on your DNS resource records to be very short. A client, and the DNS resolvers around the world, will look up your DNS resource records once, cache the result, and use the resulting IP address for traffic until the TTL expires. This means that for most traffic, your DNS server is not involved. I am oversimplifying the process of resource record lookup and caching around the world.
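For instance, if you want to see how long resolvers are being told to cache your records, you can read the TTL off the answer. A minimal sketch with dnspython, where the record name is a placeholder:

```python
# Minimal sketch: print the TTL on an A record, i.e. how long resolvers may
# cache it before asking the authoritative servers again (requires dnspython;
# "www.example.com" is a placeholder).
import dns.resolver

answer = dns.resolver.resolve("www.example.com", "A")
print(f"TTL: {answer.rrset.ttl} seconds")
for rdata in answer:
    print("A record:", rdata.to_text())
```

With a 300-second TTL, for example, each resolver asks your authoritative servers at most roughly once every five minutes, no matter how many clients sit behind it.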
In summary, you are focussing on something that really does not matter and that you cannot change.

Related

Switching between on-prem and cloud server IPs without load balancing

I own a something.com domain and want to switch from an old on-premises server to a new Google Cloud VM. I can do that by changing the A record under DNS settings. If the new server fails I need to be able to switch back to the old server.
The problem with using A records is that DNS doesn't propagate fast even if you use Cloudflare. Google Chrome in particular sticks to its DNS table like crazy and if it first learned that something.com resolves to X.X.X.X it will not let go of it.
I need to be able to direct all traffic going to the Google Cloud static IP back to the old server's IP. I'm looking for a proxy/routing rule I can apply - not a full-blown load-balancing setup that will cost extra per month.
The solution is to get rid of the old server and build a more robust solution on GCP. There are multiple ways to do this, but one obvious way is to use a Managed Instance Group (https://cloud.google.com/compute/docs/instance-groups). MIGs can be configured to be autohealing (https://cloud.google.com/compute/docs/tutorials/high-availability-autohealing) and autoscaling (if needed).
In this case you should be particularly looking at stateful MIGs I guess (https://cloud.google.com/compute/docs/instance-groups/stateful-migs).
You have two options for switching your DNS from one IP to another dynamically:
Either use a DNS failover service, which is not offered on GCP today. Use a low TTL on your DNS record, otherwise you will wait a long time before the automatic switch takes effect.
Or implement it yourself with a proxy server that you have to manage.
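For the DNS side of either option, the switch itself is just an A-record update, and what matters most is keeping the TTL low so caches expire quickly. Below is a rough sketch of such a swap using the google-cloud-dns Python client; the project, zone, hostname, and IPs are made up for illustration, and the calls should be checked against the current client library documentation.

```python
# Rough sketch: swap an A record from the old server's IP to the new one
# using the google-cloud-dns client library. Project, zone name, hostname
# and IPs are placeholders; verify the calls against the library docs.
import time
from google.cloud import dns

client = dns.Client(project="my-project")                # placeholder project
zone = client.zone("my-zone", "something.com.")          # placeholder managed zone

old = zone.resource_record_set("something.com.", "A", 60, ["203.0.113.10"])   # on-prem IP
new = zone.resource_record_set("something.com.", "A", 60, ["198.51.100.20"])  # GCP static IP

change = zone.changes()
change.delete_record_set(old)   # the delete must match the existing record exactly
change.add_record_set(new)
change.create()

while change.status != "done":  # poll until Cloud DNS has applied the change
    time.sleep(2)
    change.reload()
print("Record swapped; clients will follow once the 60 s TTL expires.")
```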

AWS - Abnormal Data Transfer OUT

I'm consistently being charged for a surprisingly high amount of data transfer out (from Amazon to the Internet).
I looked into the Usage Reports of the past few months and found out that the Data Transfer Out was coming out of an Application Load Balancer (ALB) between the Internet and multiple nodes of my application (internal IPs).
Also noticed that DataTransfer-Out-Bytes is very close to the DataTransfer-In-Bytes in the same load balancer, which is weird (coincidence?). I was expecting the response to each request to be way smaller than the request itself.
So, I enabled flow logs in the ALB for a few minutes and found out the following:
Requests coming from the Internet (public IPs) in to ALB = ~0.47 GB;
Requests coming from ALB to application servers in the same availability zone = ~0.47 GB - ALB simply passing requests through to application servers, as expected. So, about the same amount of traffic.
Responses from application servers back into the same ALB = ~0.04 GB – As expected, responses generate way less traffic back into ALB. Usually a 1K request gets a simple “HTTP 200 OK” response.
Responses from ALB back to the external IP addresses => ~0.43 GB – this was mind-blowing. I was expecting ~0.04GB, the same amount received from the application servers.
Unfortunately, ALB does not allow me to use packet sniffers (e.g. tcpdump) to see what is actually coming in and out. Is there anything I'm missing? Any help will be much appreciated. Thanks in advance!
Ricardo.
I believe the next step in your investigation would be to enable ALB access logs and see whether you can correlate the "sent_bytes" in the ALB access log to either your Flow log or your bill.
For information on ALB access logs see: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-access-logs.html
There is more than one way to analyze the ALB access logs, but I've always been happy using Athena; please see: https://aws.amazon.com/premiumsupport/knowledge-center/athena-analyze-access-logs/
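If you want a quick sanity check before setting up Athena, you can also download a handful of access log files from the logging bucket and total received_bytes and sent_bytes per client yourself. The sketch below assumes the files are already downloaded and gunzipped locally; the field positions follow my reading of the documented log format, so verify them against the page linked above before trusting the numbers.

```python
# Rough sketch: sum received_bytes / sent_bytes per client from ALB access
# log files already downloaded (and gunzipped) from the logging S3 bucket.
# Field positions are based on the documented ALB log format -- verify them.
import glob
import shlex
from collections import defaultdict

received = defaultdict(int)
sent = defaultdict(int)

for path in glob.glob("alb-logs/*.log"):              # placeholder directory
    with open(path) as f:
        for line in f:
            fields = shlex.split(line)                # shlex keeps quoted fields intact
            client_ip = fields[3].rsplit(":", 1)[0]   # "client:port" field
            received[client_ip] += int(fields[10])    # received_bytes
            sent[client_ip] += int(fields[11])        # sent_bytes

for ip in sorted(sent, key=sent.get, reverse=True)[:20]:
    print(f"{ip:>15}  in={received[ip]:>12}  out={sent[ip]:>12}")
```

If the sent_bytes totals here line up with the flow log and the bill, the next question is what those responses contain, which the quoted request lines in the same log will usually reveal.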

ELB cross-AZ balancing DNS resolution with Sticky sessions

I am preparing for AWS certification and came across a question about ELB with sticky session enabled for instances in 2 AZs. The problem is that requests from a software-based load tester in one of the AZs end up in the instances in that AZ only instead of being distributed across AZs. At the same time regular requests from customers are evenly distributed across AZs.
The correct answers to fix the load tester issue are:
Force the software-based load tester to re-resolve DNS before every request;
Use a third-party load-testing service to send requests from globally distributed clients.
I'm not sure I can understand this scenario. What is the default behaviour of Route 53 when it comes to ELB IPs resolution? In any case, those DNS records have 60 seconds TTL. Isn't it redundant to re-resolve DNS on every request? Besides, DNS resolution is a responsibility of DNS service itself, not load-testing software, isn't it?
I can understand that requests from the same instance, with load testing software on it, will go to the same LBed EC2, but why does it have to be an instance in the same AZ? It can be achieved only by Geolocation- or Latency-based routing, but I can't find anything in the specs whether those are the default ones.
When an ELB is in more than one availability zone, it always has more than one public IP address -- at least one per zone.
When you request these records in a DNS lookup, you get all of these records (assuming there are not very many) or a subset of them (if there are a large number, which would be the case in an active cluster with significant traffic) but they are unordered.
If the load testing software resolves the IP address of the endpoint and holds onto exactly one of the IP addresses -- as is a likely outcome -- then all of the traffic will go to one node of the balancer, which is in one zone, and that node will send traffic to instances in that zone.
But what about...
Cross-Zone Load Balancing
The nodes for your load balancer distribute requests from clients to registered targets. When cross-zone load balancing is enabled, each load balancer node distributes traffic across the registered targets in all enabled Availability Zones. When cross-zone load balancing is disabled, each load balancer node distributes traffic across the registered targets in its Availability Zone only.
https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html
If stickiness is configured, those sessions will initially land in one AZ and then stick to that AZ because they stick to the initial instance where they landed. If cross-zone is enabled, the outcome is not quite as clear: either balancer nodes may prefer instances in their own zone when first establishing stickiness, or this wasn't really the point of the question. Stickiness requires coordination, and cross-AZ traffic takes a non-zero amount of time due to distance (typically <10 ms), so it would make sense for a balancer to prefer instances in its local zone for sessions with no established affinity.
In fact, configuring the load test software to re-resolve the endpoint for each request is not really the focus of the solution -- the point is to ensure that (1) the load test software uses all of the addresses and does not latch onto exactly one, and (2) if more addresses become available because the balancer scales out under load, the load test software expands its pool of targets.
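As an illustration of what that buys you, here is a rough sketch of a client loop that re-resolves the balancer's hostname on every pass and spreads requests over every address returned instead of latching onto one. The hostname is a placeholder, and note that socket.getaddrinfo may still be answered from an OS-level cache; a direct DNS query (e.g. with dnspython) is the stricter approach.

```python
# Rough sketch: a load-generation loop that re-resolves the ELB hostname on
# every pass and spreads requests across all returned addresses, rather than
# connecting to the first one forever. "my-elb.example.com" is a placeholder.
import socket
import urllib.request

HOSTNAME = "my-elb.example.com"

def resolve_all(host):
    """Return the current, unordered set of A records for the host."""
    infos = socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

for _ in range(100):
    for ip in resolve_all(HOSTNAME):                      # fresh lookup every pass
        req = urllib.request.Request(f"http://{ip}/", headers={"Host": HOSTNAME})
        with urllib.request.urlopen(req, timeout=5) as resp:
            print(ip, resp.status, len(resp.read()), "bytes")
```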
In any case, those DNS records have 60 seconds TTL. Isn't it redundant to re-resolve DNS on every request?
The software may not see the TTL, may not honor the TTL, and, as noted above, may stick to one answer even if multiple are available, because it only needs one in order to make the connection. Re-resolving on every request is not strictly necessary, but it does solve the problem.
Besides, DNS resolution is a responsibility of DNS service itself, not load-testing software, isn't it?
To "resolve DNS" in this context simply means to do a DNS lookup, whatever that means in the specific instance, whether using the OS's DNS resolver or making a direct query to a recursive DNS server. When software establishes a connection to a hostname, it "resolves" (looks up) the associated IP address.
The other solution, "use third party load-testing service to send requests from globally distributed clients," solves the problem by accident, since the distributed clients -- even if they stick to the first address they see -- are more likely to see all of the available addresses. The "global" distribution aspect is a distraction.
ELB relies on random arrival of requests across its external-facing nodes as part of the balancing strategy. Load testing software whose design overlooks this is not properly testing the ELB. Both solutions mitigate the problem in different ways.
Stickiness is the issue; see here: https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-sticky-sessions.html
The load balancer uses a special cookie to associate the session with the instance that handled the initial request, but follows the lifetime of the application cookie specified in the policy configuration. The load balancer only inserts a new stickiness cookie if the application response includes a new application cookie. The load balancer stickiness cookie does not update with each request. If the application cookie is explicitly removed or expires, the session stops being sticky until a new application cookie is issued.
The first solution, re-resolving DNS, will create new sessions and thereby break the stickiness of the ELB. The second solution is to use multiple clients; stickiness is not an issue if the number of globally distributed clients is large.
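One way to see the stickiness cookie at work, assuming the Classic ELB is using duration-based stickiness (whose cookie is typically named AWSELB), is to compare a client that keeps its cookies between requests with one that starts fresh every time. A rough sketch with the Python requests library; the URL is a placeholder:

```python
# Rough sketch: observe ELB stickiness by comparing a cookie-keeping session
# with fresh, cookie-less requests. "my-elb.example.com" is a placeholder and
# AWSELB is the usual name of the Classic ELB duration-based stickiness cookie.
import requests

URL = "http://my-elb.example.com/"

# One persistent session: the AWSELB cookie is stored and re-sent, so every
# request should keep landing on the same backend instance.
sticky = requests.Session()
for _ in range(3):
    sticky.get(URL)
    print("sticky session cookie:", sticky.cookies.get("AWSELB"))

# Fresh request each time: no cookie is re-sent, so requests may be routed
# to different instances and each response may set a different cookie.
for _ in range(3):
    r = requests.get(URL)
    print("fresh request cookie: ", r.cookies.get("AWSELB"))
```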
PART 2 (could not add as a comment, it is too long):
Yes, my answer was too simple and incomplete.
What we know is that the ELB spans 2 AZs and will have 2 nodes with different IPs. It is not clear how many IPs; that depends on the number of requests and the number of servers in each AZ. Route 53 rotates the IPs for every new request: the first time NodeA-IP, NodeB-IP; the second time NodeB-IP, NodeA-IP. The load-testing application will take the first IP with every new request, balancing between the 2 AZs. Because a node can route only inside its AZ, if the sticky cookie is for NodeA and the request arrives at NodeB, NodeB will send it to one of its servers in AZ 2, ignoring the cookie for a server in AZ 1.
I need to run some tests. I quickly tested Route 53 with a Classic ELB and 2 AZs, and it rotates the IPs every time. What I want to test is whether, when I have a sticky cookie for AZ 1 and I reach Node 2, it will forward me to Node 1 (the doc describes this interesting flow for the case of no available servers). I hope to have updates shortly.
Just found another piece of evidence that Route 53 returns multiple IPs and rotates them for ELB scaling scenarios:
By default, Elastic Load Balancing will return multiple IP addresses when clients perform a DNS resolution, with the records being randomly ordered on each DNS resolution request. As the traffic profile changes, the controller service will scale the load balancers to handle more requests, scaling equally in all Availability Zones.
And then:
To ensure that clients are taking advantage of the increased capacity, Elastic Load Balancing uses a TTL setting on the DNS record of 60 seconds. It is critical that you factor this changing DNS record into your tests. If you do not ensure that DNS is re-resolved or use multiple test clients to simulate increased load, the test may continue to hit a single IP address when Elastic Load Balancing has actually allocated many more IP addresses.
What I didn't realize at first is that even if regular traffic is distributed evenly across AZs, it doesn't mean that Cross-Zone Load Balancing is enabled. As Michael pointed out, regular traffic will naturally come from various locations and end up in different AZs.
And as it was not explicitly mentioned in the test, Cross-AZ balancing might not have been in place.
https://aws.amazon.com/articles/best-practices-in-evaluating-elastic-load-balancing/

Why is CloudFront serving me from a datacenter far from me?

I'm located 1000 miles from Singapore. I use S3 in Singapore region with CloudFront to serve data.
When I download content, CloudFront serves me from a US Washington server (checking the IP addresses).
Why doesn't it serve from Singapore instead?
GeoIP lookups for IP addresses associated with CDNs are notoriously inaccurate.
Services that provide a GeoIP lookup gather information about the geographical assignment of each IP address from a wide range of sources and do their best to provide an accurate geographic assignment. In my experience, cheaper services are 80%-85% accurate, while the most expensive services are not more than around 90% accurate.
AWS does not publish the assignment of IP addresses to specific regions. Instead, they designate the IP addresses merely as GLOBAL. As a result, the specific geography of each IP is likely unknown to the GeoIP service you are using. It makes the best guess it can.
Additionally, a CDN will attempt to use the node with the least latency to you. Latency generally increases with geographic distance, but at times the longer route may offer lower latency due to a faster or less congested connection.
In your case, I suspect that you are receiving data from Singapore and your GeoIP provider is just getting the location of the IP wrong.
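One way to check, rather than trusting a GeoIP database, is to look at the x-amz-cf-pop response header that CloudFront normally returns: it names the edge location that actually served the request (the codes are based on airport codes, e.g. SIN for Singapore, IAD for Washington, D.C.). A small sketch; the URL is a placeholder for an object on your own distribution:

```python
# Rough sketch: read which CloudFront edge location actually served a request
# from the x-amz-cf-pop response header. The URL is a placeholder for an
# object served by your own distribution.
import urllib.request

req = urllib.request.Request("https://dxxxxxxxxxxxx.cloudfront.net/some-object")
with urllib.request.urlopen(req) as resp:
    print("Edge POP:   ", resp.headers.get("x-amz-cf-pop"))   # e.g. SIN2-C1 or IAD89-C1
    print("Cache state:", resp.headers.get("x-cache"))        # Hit/Miss from cloudfront
```

If the POP code points at a nearby edge, the US location you saw was just the GeoIP database guessing wrong about a GLOBAL address.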

A better usage of Weighted Round Robin Routing in Amazon Route 53

The question might not be as fundamental as you thought. First of all, thanks for reading it. I am a computer science student. I just begin to learn about AWS, especially the Route 53 so please forgive me if there is anything that hurts your eyes :)
We all know that Amazon Route 53 provides customers with the ability to route users to EC2 instances, S3 buckets, and Elastic Load Balancers across multiple availability zones and regions, and that there are different forms of DNS load balancing, including:
LBR/Latency Based Routing, to route to the region with the lowest latency;
WRR/Weighted Round Robin, to assign weights to different targets.
Also, user-specified configurations that combine both are possible (LBR+WRR).
Route 53's flexibility allows users to save costs; however, manual configuration can become increasingly complex for end users. Finding the best non-probabilistic policy (such as the WRR weights) is NP-complete.
What are the possible cases in which we need to give server IP addresses different weights, given that there can be EC2 servers across multiple availability zones and that instances can contain both the front end and back end, or contain only application tiers or only databases? Are there any ideas for finding a better usage of Route 53 in combination with other AWS services, in order to improve the performance of interactive multi-tier cloud applications?
Sorry for the lengthy question. I am looking for thoughts and ideas about the best way/starting point to experiment with better usage of Route 53, in combination with other AWS services, for a multi-tier cloud application. Not necessarily a 100% correct answer; any ideas or suggestions are welcome. Many thanks in advance!
UPDATE:
I should probably rephrase the question: what is the purpose of having weighted record sets in Route 53, i.e. in a DNS service? Obviously, WRR in DNS can control portions of traffic, but if we simply rely on this DNS load balancing (or load distribution) we are going to put a heavy workload on many other DNS servers. One case I could think of is that websites like Google or Facebook potentially get tons of domain name queries; WRR DNS load balancing can be useful there, and there has to be some sort of session stickiness since sharing sessions across servers seems to be a bad idea.
Are there any other ways of, or purposes for, using weighted records in Route 53?
Thank you very much for considering my question!
Another use case to consider is A/B testing of frontend or backend services. Let me illustrate: Let's say we've just CI-tested version 1.0.1 of our web application (which runs in a Docker container), and we've deployed the container but we're not yet routing traffic to it. We don't want to flip a switch and immediately dump our one million daily active users (woohoo!) onto v1.0.1 until we can give it a little real-world testing. So we decide to use the Weighted Round Robin load balancing available in Route 53 to send 0.25% of our users to the v1.0.1 container(s), allowing us to feel out the new version with real-world users before flipping the switch. We can do the same thing with virtually any service that uses hostname lookup to find resources.
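A rough sketch of what that weighted split might look like with boto3 follows; the hosted zone ID, hostnames, and IPs are placeholders. Note that Route 53 weights are integers from 0 to 255, so the 0.25% above can only be approximated (weights of 255 and 1 give roughly 0.4%).

```python
# Rough sketch: create two weighted A records for the same name so a small
# slice of traffic goes to the new version. Zone ID, hostname and IPs are
# placeholders for illustration only.
import boto3

route53 = boto3.client("route53")

def weighted_record(identifier, ip, weight):
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com",
            "Type": "A",
            "SetIdentifier": identifier,   # distinguishes records sharing the same name
            "Weight": weight,              # 0-255; traffic share = weight / sum of weights
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="Z0000000000000EXAMPLE",  # placeholder hosted zone ID
    ChangeBatch={
        "Comment": "Canary a small fraction of traffic to v1.0.1",
        "Changes": [
            weighted_record("v1.0.0", "203.0.113.10", 255),
            weighted_record("v1.0.1", "203.0.113.20", 1),
        ],
    },
)
```

Once v1.0.1 looks healthy, you shift the split by re-running the same UPSERT with adjusted weights, and remember that clients only follow the new split as their cached answers expire (hence the short TTL).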
One use case is to load balance internal services that can't be balanced using an Elastic Load Balancer, like RDS or ElastiCache read replicas. Instead of creating an EC2 instance with, for example, HAProxy to load balance your services, you can create a Route 53-level balancer based on weights or latency.
My guess is that internally they use a custom load balancer at the DNS server that balances requests based on domain aliases and the selected balancing policy.