Requests are not served from the nearest region in AWS


I have a video streaming application and my client is in the Maldives. The nearest AWS region to the Maldives is Asia Pacific (Mumbai). I have set up a CDN, and most of the requests from the Maldives are now served from the Singapore region even though Mumbai is nearer. Why is that?

I presume that you are using Amazon CloudFront to serve the video.
CloudFront uses several methods, such as anycast, to determine where to serve traffic from. Typically, it serves content from the edge location that provides the best performance in terms of network latency, which is not necessarily the physically closest location.
In fact, some cities have more than one edge location so that major ISPs can be better serviced via fast network connections.

A good way to start troubleshooting the problem is:
Get the IP address of an India POP (BOM/DEL/HYD/BLR) and run MTR against it to see whether latency is better than when you connect to Singapore.
If so:
Try 8.8.8.8, which supports EDNS, as the resolver on a test machine and check whether you are given an India POP.
Finally, run dig or nslookup against resolver-identity.cloudfront.net and give your resolver IP to AWS, so they can tell you which edge location is best for you or whether there is anything they can fix.
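
As a rough complement to MTR, the latency comparison in the first step can be sketched in Python. This is only an illustration under stated assumptions: it times TCP handshakes on port 443 rather than tracing the route, the function names are made up for this example, and you have to supply the POP IP addresses yourself.

```python
import socket
import statistics
import time


def tcp_connect_ms(ip, port=443, timeout=3.0):
    """Time a single TCP handshake to ip:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((ip, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def collect_samples(edges, attempts=5):
    """Probe each labelled IP several times; failed probes are skipped."""
    samples = {label: [] for label in edges}
    for label, ip in edges.items():
        for _ in range(attempts):
            try:
                samples[label].append(tcp_connect_ms(ip))
            except OSError:
                continue
    return samples


def fastest_edge(samples):
    """Return (label, median_ms) for the edge with the lowest median latency."""
    medians = {label: statistics.median(ms) for label, ms in samples.items() if ms}
    if not medians:
        raise ValueError("no successful probes")
    best = min(medians, key=medians.get)
    return best, medians[best]
```

Calling `fastest_edge(collect_samples({"BOM": india_pop_ip, "SIN": singapore_pop_ip}))` from the test machine gives a first indication of which POP is actually faster from that network. The median is used so that a single slow probe does not decide the result.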

Related

Best architecture practice to improve performance of ecommerce website hosted in India to overcome the Great Firewall of China

Not sure if this is the right platform but would like to give it a shot here as there are many legendary folks that live in stackoverflow domain.
We have a typical ecommerce website hosted in India. The architectural touch points are summarized below:
Static content is loaded from the AWS CDN
The first point of entry is the AWS Application Load Balancer
Search functionality uses Elasticsearch
Redis caching is enabled
Business functionality is implemented using Node.js, TypeScript and JavaScript, and hosted in a Kubernetes cluster
The website is pretty fast everywhere except China, because of the Great Firewall of China.
Solutions we have tried:
We have tried using the Alibaba Global Accelerator with an AWS geolocation record routing policy, so that traffic originating from China comes through the Global Accelerator instead of the public internet.
We have also tried Huawei Cloud Connect, again with geolocation routing, to route traffic via Cloud Connect through the NAT gateway from Hong Kong to Singapore, so that it behaves as if the traffic originated in Singapore.
Performance-wise, Huawei Cloud Connect seems to be performing better.
However, the static content loaded from the AWS CDN is still slow, as it is fetched from the nearest AWS CDN node, which is in Japan.
How can we accelerate the CDN for the images?
Any thoughts are welcome.

You're right. Since the website is fast everywhere except China, the slowdown most probably has to do with China's firewall. Optimizing TTL and cache headers won't get you far. I believe this is quite expected for websites hosted outside China's network.
I'm not an expert on China's network, but you'll need an ICP license (ICP Filing or ICP Commercial License) to legally operate your website in mainland China. Without the license, your website may get blocked at any point. There are certain requirements for getting the license, so this might be quite challenging.
You'll probably want to host some of your servers in a China region while getting the ICP license. Alibaba Cloud, Tencent Cloud, and Huawei Cloud are popular cloud providers in mainland China, and they provide ICP registration as a service as well (Alibaba Cloud GoChina ICP Filing Assistant / Huawei Cloud ICP License Service). AWS also has China regions operated by third-party companies. I think it'd be best to stick with a single provider for better service support and to avoid architectural complexity.
Usually I recommend using Cloudflare as a CDN, as it's going to save you tons on bandwidth cost and it comes with a bunch of features as well. However, in this case it wouldn't be of much help unless you subscribe to Cloudflare's China Network, which is only available to Enterprise customers (you'd still need an ICP license).
The bottom line is that you'd need an ICP license, comply with the regulations, and host your servers in mainland China to properly serve your Chinese customers.
Here are a few good reads that might help you:
How Your Web App Can Serve the Chinese Market
SANS - Doing Cloud in China
Alibaba Cloud - FAQ about ICP filings
AWS China Support - ICP Recordal

Collect more browser-side data
Please collect better data from a person based inside China using Chrome's Lighthouse feature accessible via Dev Tools or as an extension, and share key metrics that Lighthouse flags as "needing attention" on this thread.
Check at least the following common (mis)configurations
Is the origin server setting the correct time-to-live for every image, JS, and CSS file (see the AWS docs)? Check S3 file-level metadata specifically if S3 is in use, or your application server's HTTP response to AWS CloudFront for static resources.
Are all intermediary CDNs and application-level proxies passing through the above Cache-Control and max-age directives to downstream users?
Using your browser's dev tools, are you able to observe the above headers once you set or edit them at the origin server? Is a typical user in mainland China able to observe these headers?
Examine non-standard X-From-Cache headers (or similar) inserted by CDN networks or caching proxies to see which intermediary is misconfigured.
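
The header checks above can be partly automated. The following is a minimal sketch, not a complete HTTP caching implementation: the function names are made up for this example, and treating `no-store`, `no-cache` and `private` as "TTL 0" is a simplification chosen for this check. It reports the time-to-live a shared cache such as a CDN would see, preferring `s-maxage` over `max-age`.

```python
import urllib.request


def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def effective_ttl(headers):
    """Return the TTL in seconds that a shared cache would see.

    Returns 0 when caching is restricted and None when no TTL is set.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(lowered.get("cache-control", ""))
    if "no-store" in cc or "no-cache" in cc or "private" in cc:
        return 0
    for key in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        value = cc.get(key)
        if isinstance(value, str) and value.isdigit():
            return int(value)
    return None


def fetch_headers(url):
    """Issue a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())
```

Running `effective_ttl(fetch_headers(url))` against the same asset at the origin, through the CDN, and from a machine inside mainland China shows at which hop the directives are dropped or rewritten.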

Blocking a spam attack using WAF

I have a website hosted on S3 behind CloudFront. I get continuous spamming for a couple of hours every evening. The spammer uses different IPs/subnets to launch the attack. Going through the access logs, I cannot identify any common pattern or special request origin. What kind of further digging is needed to create a WAF rule to prevent this? What is the maximum damage I am looking at here, apart from CloudFront transfer costs?
The IPs being used belong to Amazon, which indicates that new instances or containers are being spun up to launch the attack.

Amazon publishes its IP address ranges. If you are not expecting traffic from AWS instances, you can either create a WAF rule that denies traffic from those IP address ranges, or create a viewer-request Lambda@Edge function that does the same. Depending on the number of requests and the amount of traffic, the Lambda may prove to be substantially more cost-effective.
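
To confirm that the offending addresses really fall inside the published ranges before writing a rule, a sketch like the following can help. It assumes the published file is the `ip-ranges.json` document at the URL below, with `prefixes` and `ipv6_prefixes` lists whose entries carry a `service` field; the function names are made up for this example.

```python
import ipaddress
import json
import urllib.request

IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def load_ip_ranges(url=IP_RANGES_URL):
    """Download and decode the published AWS IP ranges document."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def networks_for_service(ip_ranges, service="EC2"):
    """Collect the IPv4 and IPv6 networks listed for one service."""
    networks = []
    for entry in ip_ranges.get("prefixes", []):
        if entry.get("service") == service:
            networks.append(ipaddress.ip_network(entry["ip_prefix"]))
    for entry in ip_ranges.get("ipv6_prefixes", []):
        if entry.get("service") == service:
            networks.append(ipaddress.ip_network(entry["ipv6_prefix"]))
    return networks


def is_from_service(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)
```

Feeding the client IPs from the CloudFront access logs through `is_from_service(ip, networks_for_service(load_ip_ranges()))` shows what share of the spam actually originates from EC2, and the matching networks are the candidates for a WAF IP set.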

Network Latency between GCP server and linux data centres

What are the best practices for avoiding network latency between a GCP server and a Unix server? My client application, which runs on Linux, accesses a GCP endpoint but is facing network latency. How can I avoid it?

Do you suspect that part of the latency is not due to the distance between your server and GCP? If not, then obviously all you can do is (1) place your server closer to your GCP node and (2) maybe cluster or parallelize your GCP requests if you have many of them.
So I suggest that you determine the distance between the two sites and compare the round-trip time that distance implies with the round-trip time you actually measure for your requests. If the measured time is significantly larger, then you will indeed have to analyze the structure of your requests.
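
The distance comparison can be made concrete with a small calculation. This sketch assumes a signal speed of roughly 200,000 km/s in optical fibre (about two thirds of the speed of light in vacuum) and a straight great-circle path, so the result is a lower bound rather than a prediction; real routes are longer and add queuing and processing delay.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBRE_KM_PER_MS = 200.0  # assumed: ~200,000 km/s signal speed in fibre


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def min_rtt_ms(distance_km):
    """Lower bound on the round-trip time over the given one-way distance."""
    return 2 * distance_km / FIBRE_KM_PER_MS
```

For example, two sites 1,000 km apart cannot see a round trip below about 10 ms. If the measured round-trip time is several times `min_rtt_ms(great_circle_km(...))`, distance alone does not explain the latency.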

Latency is not related to the OS you are using. Network latency is a measure of the time delay required for information to travel across a network. Of all the factors that may affect this delay, the one you can manage in the cloud is the distance from the source to the destination. You can find other latency factors in this previous answer.
If you are looking to optimize latency, you could use a Cloud Load Balancer. With the Google Cloud Platform HTTP(S) load balancer, requests are always routed to the instance group that is closest to the user. With the load balancer you can also use Cloud CDN, which reduces latency by serving assets directly at Google's network edge.

How to calculate the data transfer rate of a particular site(virtual host) in AWS

I have virtual hosts (sites) set up on two Linux EC2 instances that are behind an ELB. I would like to know the data transfer rate of one particular site hosted on these EC2 instances. There are almost 30 virtual hosts on each EC2 instance, and I need to calculate the average data transfer rate of all these sites. From CloudWatch I could only gather the information at service level, not for a particular site. Is there any way to accomplish this?

In this case, I recommend two things:
Do not use ELB or EC2 directly for data transfer.
If you want to know the exact MB/GB/TB figures:
Put CloudFront in front of your ELB.
Register one CloudFront distribution for each site (domain, subdomain or whatever).
If you do that, you will have more control and save money on data transfer, although this depends on the region (sometimes it can be a bit more expensive).

Why is Cloudfront serving me from datacenter far from me?

I'm located 1000 miles from Singapore. I use S3 in the Singapore region with CloudFront to serve data.
When I download content, CloudFront serves me from a server in Washington, US (judging by the IP addresses).
Why doesn't it serve from Singapore instead?

GeoIP lookups for IP addresses associated with CDNs are notoriously inaccurate.
Services that provide a GeoIP lookup gather information about the geographical assignment of each IP address from a wide range of sources and do their best to provide an accurate geographic assignment. In my experience, cheaper services are 80%-85% accurate, while the most expensive services are not more than around 90% accurate.
AWS does not publish the assignment of these IP addresses to specific regions. Instead, it designates the IP addresses merely as GLOBAL. As a result, the specific geography of each IP is likely unknown to the GeoIP service you are using, which makes the best guess it can.
Additionally, a CDN will attempt to use the node with the least latency to you. Latency generally increases with geographic distance, but at times the longer route may offer lower latency due to a faster or less congested connection.
In your case, I suspect that you are receiving data from Singapore and your GeoIP provider is just getting the location of the IP wrong.
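
A more direct check than GeoIP is to look at the response itself. CloudFront responses carry an `X-Amz-Cf-Pop` header whose value starts with a three-letter code that typically corresponds to the IATA code of an airport near the edge location, for example `SIN2-C1` for Singapore. The sketch below only extracts that code; the function names are made up for this example, and the exact suffix format is treated as opaque.

```python
import re

# Three letters, then an optional number and optional suffix such as "-C1".
POP_PATTERN = re.compile(r"^([A-Za-z]{3})[A-Za-z0-9-]*$")


def pop_from_headers(headers):
    """Return the X-Amz-Cf-Pop value from a header dict, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "x-amz-cf-pop":
            return value
    return None


def edge_airport_code(pop):
    """Return the leading three-letter code of a POP value, or None."""
    match = POP_PATTERN.match(pop.strip())
    return match.group(1).upper() if match else None
```

With the headers of a download in hand (for instance from the browser's dev tools), `edge_airport_code(pop_from_headers(headers))` returning `SIN` would indicate that the content is served from Singapore regardless of what a GeoIP service reports for the IP address.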
