How do I ping the original webpage, bypassing the CDN?

I live in Warsaw, Poland, and I am pinging a US webpage (www.nba.com, for example):
$ ping www.nba.com
PING a1570.gd.akamai.net (213.155.152.161) 56(84) bytes of data.
64 bytes from 213-155-152-161.customer.teliacarrier.com (213.155.152.161): icmp_req=1 ttl=58 time=6.90 ms
64 bytes from 213-155-152-161.customer.teliacarrier.com (213.155.152.161): icmp_req=2 ttl=58 time=5.68 ms
The times I get are around 7-10 ms, while the distance from Poland to the US and back (packets travel both ways) is around 16,000 km (16×10^6 m). With c = 3×10^8 m/s, distance/c ≈ 0.05 s = 50 ms.
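Spelled out, my back-of-the-envelope arithmetic is just this (a quick sketch; real packets also travel through fibre at roughly 2/3 of c, so the true minimum is even higher):

# Minimum possible round-trip time over 16,000 km at the speed of light
distance_m = 16e6   # Poland -> US -> Poland, roughly 16,000 km in metres
c = 3e8             # speed of light in m/s (vacuum)
print(f"minimum RTT: {distance_m / c * 1000:.0f} ms")   # ~53 ms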
So I suppose that the webpage is cached on some closer server, e.g. in Western Europe (5 ms means less than 750 km from my location). How can I ping the original US server, then?
Or did I miss something?
EDIT1: OK, I missed something: I am actually pinging a1570.gd.akamai.net in London, but that is still too far away (>750 km). Is the ping timer in error?

You are not pinging www.nba.com but one of the CDN servers they are using, namely:
a1570.gd.akamai.net (213.155.152.161)
This Akamai server is located in London. Hence your ping is so fast, which proves the CDN actually works.
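You can see this directly from DNS (a minimal Python sketch; the canonical name and the addresses you get back depend on where you resolve from):

import socket

# gethostbyname_ex follows the CNAME chain; the canonical name points at
# the CDN edge rather than any "original" NBA server.
canonical, aliases, addresses = socket.gethostbyname_ex("www.nba.com")
print("canonical:", canonical)    # e.g. a1570.gd.akamai.net
print("aliases:  ", aliases)
print("addresses:", addresses)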

Simple HelloWorld app on Cloud Run (or Knative) seems too slow

I deployed a sample HelloWorld app on Google Cloud Run, which is basically Knative, and every call to the API takes 1.4 seconds at best, end to end. Is it supposed to be this slow?
The sample app is at https://cloud.google.com/run/docs/quickstarts/build-and-deploy
I deployed the very same app on my localhost as a docker container and it takes about 22ms, end-to-end.
The same app on my GKE cluster takes about 150 ms, end-to-end.
import os

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    target = os.environ.get('TARGET', 'World')
    return 'Hello {}!\n'.format(target)

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
I have a little experience with FaaS, and I expected the API calls to get faster as I invoked them in a row (cold start vs. warm start).
But no matter how many times I execute the command it doesn't go below 1.4 seconds.
I don't think network distance is the dominant factor here; the round-trip time via ping to the API endpoint is only about 50 ms.
So my questions are as follows:
Is it potentially an unintended bug? Is it a technical difficulty that will be resolved eventually? Or maybe nothing's wrong and it's just the SLA of Knative?
If nothing's wrong with Google Cloud Run and/or Knative, what is the dominant time-consuming factor here for my API call? I'd love to learn the mechanism.
Additional Details:
Where I am located at: Seoul/Asia
The region for my Cloud Run app: us-central1
type of Internet connection I am testing under: Business, Wired
app's container image size: 343.3MB
the bucket location that Container Registry is using: gcr.io
WebPageTest from Seoul/Asia (warmup time):
Content Type: text/html
Request Start: 0.44 s
DNS Lookup: 249 ms
Initial Connection: 59 ms
SSL Negotiation: 106 ms
Time to First Byte: 961 ms
Content Download: 2 ms
WebPageTest from Chicago/US (warmup time):
Content Type: text/html
Request Start: 0.171 s
DNS Lookup: 41 ms
Initial Connection: 29 ms
SSL Negotiation: 57 ms
Time to First Byte: 61 ms
Content Download: 3 ms
[Update: This person has a networking problem in his area. I tested his endpoint from Seattle with no problems. Details in the comments below.]
I have worked with Cloud Run constantly for the past several months. I have deployed several production applications and dozens of test services. I am in Seattle, Cloud Run is in us-central1. I have never noticed a delay. Actually, I am impressed with how fast a container starts up.
For one of my services, I am seeing cold start time to first byte of 485ms. Next invocation 266ms, 360ms. My container is checking SSL certificates (2) on the Internet. The response time is very good.
For another service which is a PHP website, time to first byte on cold start is 312ms, then 94ms, 112ms.
What could be factors that are different for you?
How large is your container image? Check Container Registry for the size. My containers are under 100 MB. The larger the container, the longer the cold-start time.
Where is the bucket located that Container Registry is using? You want the bucket to be in us-central1, or at least in the US. This will change when new Cloud Run regions are announced.
What type of Internet connection are you testing under? Home or business? Wireless or wired? Where in the world are you testing from? Launch a temporary Compute Engine instance and repeat your tests to Cloud Run from there (see the probe sketch after this list); this removes your ISP from the equation.
Increase the memory allocated to the container. Does this affect performance? Python/Flask does not require much memory; my containers are typically 128 MB or 256 MB. Container images are loaded into memory, so if you have a bloated container you might not have enough memory left, reducing performance.
What do the Stackdriver logs show you? You can see container starts, requests, and container terminations.
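For a quick end-to-end probe without WebPageTest, something like this is enough to compare cold and warm calls (a sketch; the URL is a placeholder for your own Cloud Run endpoint):

import time
import urllib.request

URL = "https://hello-xxxxx-uc.a.run.app/"   # placeholder service URL

# Back-to-back requests: the first may hit a cold container, the rest should be warm.
for i in range(5):
    start = time.monotonic()
    with urllib.request.urlopen(URL) as resp:
        resp.read()
    print(f"request {i + 1}: {(time.monotonic() - start) * 1000:.0f} ms")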
(Cloud Run product manager here)
We have detected high latency when calling Cloud Run services from some particular regions in the world. Sadly, Seoul seems to be one of them.
We will explicitly capture this as a Known issue, and we are working on fixing it before General Availability. Feel free to open a new issue in our public issue tracker.

AWS WAF: How to rate limit a path by IP below the minimum of 2,000 requests per 5 minutes

I have a path (mysite.com/myapiendpoint, for the sake of example) that is both resource-intensive to serve and very prone to bot abuse. I need to rate limit access to that specific path to something like 10 requests per minute per client IP address. How can this be done?
I'm hosting off an EC2 instance with CloudFront and AWS WAF in front. I have the standard "Rate Based Rule" enabled, but its minimum of 2,000 requests per 5 minutes per IP address is absolutely unusable for my application.
I was considering using API Gateway for this, and have used it in the past, but as I understand it its rate limiting is not per IP address, so bots would simply use up the limit and legitimate users would be constantly denied access to the endpoint.
My site does not use sessions of any sort, so I don't think I could do any sort of rate limiting in the server itself. Also, please bear in mind that my site is a one-man operation and I'm somewhat new to AWS :)
How can I limit the usage per IP to something like 10 requests per minute, preferably in WAF?
[Edit]
After more research, I'm wondering if I could enable header forwarding to the origin (running node/express) and use a rate-limiter package there. Is that a viable solution?
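Sketched out, that origin-side idea would look something like this (illustrated in Python/Flask rather than node/express, purely for the sake of a concrete example; /myapiendpoint is the example path from above, and a node rate-limiter package would play the same role):

import time
from collections import defaultdict, deque

from flask import Flask, abort, request

app = Flask(__name__)

WINDOW_S = 60               # sliding one-minute window
LIMIT = 10                  # max requests per IP per window
hits = defaultdict(deque)   # client IP -> timestamps of recent requests

@app.route('/myapiendpoint')
def my_api_endpoint():
    # CloudFront appends the viewer's address to X-Forwarded-For when the
    # header is forwarded to the origin; fall back to the peer address.
    ip = request.headers.get('X-Forwarded-For', request.remote_addr).split(',')[0].strip()
    now = time.monotonic()
    recent = hits[ip]
    while recent and now - recent[0] > WINDOW_S:
        recent.popleft()
    if len(recent) >= LIMIT:
        abort(429)          # Too Many Requests
    recent.append(now)
    return 'expensive result\n'

Note that an in-memory counter like this only works for a single origin process; anything load-balanced would need shared state (Redis or similar).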
I don't know if this is still useful to you, but I just got a tip from AWS support. If you add the rate-limit rule multiple times, it effectively reduces the threshold each time. Basically, each time you add the rule, it counts an extra request for each IP. Say an IP makes a single request: if you have two rate-limit rules applied, that request is counted twice. So instead of 2,000 requests, the IP only has to make 1,000 before it gets blocked. If you add three rules, each request is counted three times, so the IP will be blocked at about 667 requests.
The other thing they clarified is that the "window" is 5 minutes, but if the total is breached anywhere in that window, the IP is blocked immediately. I had thought WAF would only evaluate the requests after each 5-minute period. For example, say you have a single rule allowing 2,000 requests in 5 minutes, and an IP makes 2,000 requests in the first minute, then only 10 requests over the next 4 minutes. I initially understood that the IP would only be blocked after minute 5 (because WAF evaluates a 5-minute window), but apparently, if the IP exceeds the limit anywhere in that window, it is blocked immediately. So if that IP makes 2,000 requests in minute 1, it will actually be blocked during minutes 2, 3, 4, and 5, and then allowed again from minute 6 onward.
This clarified a lot for me. That said, I haven't tested this yet. I assume the AWS support techie knows what he's talking about, but it's definitely worth testing first.
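Spelled out, the tip implies the effective per-IP threshold shrinks with the number of duplicated rules (just the arithmetic from the tip above, not verified against WAF itself):

base_limit = 2000
for n_rules in (1, 2, 3):
    print(f"{n_rules} rule(s): blocked after ~{round(base_limit / n_rules)} requests")
# 1 rule(s): blocked after ~2000 requests
# 2 rule(s): blocked after ~1000 requests
# 3 rule(s): blocked after ~667 requests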
AWS has now finally released an update that allows the rate limit to go as low as 100 requests every 5 minutes.
Announcement post: https://aws.amazon.com/about-aws/whats-new/2019/08/lower-threshold-for-aws-waf-rate-based-rules/
Adding the rule twice will not work, because WAF rate-based rules count on a CloudWatch-logs basis; both rules would count their 2,000 requests separately, so it would not work for you.
You can use the AWS WAF Security Automations CloudFront template and choose the Lambda/Athena log parser. That way, the request count is computed from the S3 access logs, and you will also be able to block SQL injection, XSS, and bad-bot requests.

Why is AWS EC2 CPU usage shooting up to 100% momentarily from IOWait?

I have a large web-based application running in AWS with numerous EC2 instances. Occasionally -- about twice or thrice per week -- I receive an alarm notification from my Sensu monitoring system notifying me that one of my instances has hit 100% CPU.
This is the notification:
CheckCPU TOTAL WARNING: total=100.0 user=0.0 nice=0.0 system=0.0 idle=25.0 iowait=100.0 irq=0.0 softirq=0.0 steal=0.0 guest=0.0
Host: my_host_name
Timestamp: 2016-09-28 13:38:57 +0000
Address: XX.XX.XX.XX
Check Name: check-cpu-usage
Command: /etc/sensu/plugins/check-cpu.rb -w 70 -c 90
Status: 1
Occurrences: 1
This seems to be a momentary occurrence, and the CPU goes back down to normal levels within seconds, so it seems like nothing to get too worried about. But I'm still curious why it is happening. Notice that the CPU is entirely taken up by IOWait.
FYI, Amazon's monitoring system doesn't notice this blip. See the images below showing the CPU & IO levels at 13:38.
Interestingly, AWS tells me that this instance will be retired soon. Might the two be related?
AWS is only displaying a 5-minute period, and it looks like your CPU check is set to send alarms after a single occurrence. If your CPU check's interval is less than 5 minutes, the AWS console may be rolling up the average and masking the actual CPU spike.
I'd recommend narrowing the AWS monitoring console down to a smaller period to see if the spike shows up there.
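If you want to catch the spike at a finer granularity than the console gives you, a tiny sampler on the instance itself can log iowait every second (a Linux-only sketch that reads /proc/stat):

import time

def cpu_times():
    # First line of /proc/stat: "cpu  user nice system idle iowait irq softirq steal ..."
    with open('/proc/stat') as f:
        return [int(v) for v in f.readline().split()[1:]]

prev = cpu_times()
for _ in range(60):         # sample for one minute
    time.sleep(1)
    cur = cpu_times()
    delta = [c - p for c, p in zip(cur, prev)]
    total = sum(delta) or 1
    print(f"iowait: {100.0 * delta[4] / total:.1f}%")   # field 5 is iowait
    prev = cur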
I would add this as a comment, but I don't have the reputation to do so.
I have noticed my EC2 instances doing this too, but for far longer, and after apt-get update and upgrade.
I thought it was an Apache thing, so I started using Nginx on a new instance to test, and it did the same thing: I ran apt-get a few hours ago, then came back to find the instance using full CPU -- for hours! Good thing it is just a test machine, but I wonder what it is about Ubuntu/apt-get that might have caused this. From now on I guess I will have to reboot the machine after apt-get, as that seems to be the only way to bring it back to normal.

Objects are getting expired from CloudFront

CloudFront is configured to cache the images from our app. I found that the images were evicted from the cache really quickly. Since the images are generated dynamically on the fly, this is pretty hard on our server. In order to solve the issue I set up a test case.
Origin headers
The image is served from our origin server with correct Last-Modified and Expires headers.
Cloudfront cache behaviour
Since the site is HTTPS only I set the Viewer Protocol Policy to HTTPS. Forward Headers is set to None and Object Caching to Use Origin Cache Headers.
The initial image request
I requested an image at 11:25:11. This returned the following status and headers:
Code: 200 (OK)
Cached: No
Expires: Thu, 29 Sep 2016 09:24:31 GMT
Last-Modified: Wed, 30 Sep 2015 09:24:31 GMT
X-Cache: Miss from cloudfront
A subsequent request
A reload a little while later (11:25:43) returned the image with:
Code: 304 (Not Modified)
Cached: Yes
Expires: Thu, 29 Sep 2016 09:24:31 GMT
X-Cache: Hit from cloudfront
A request a few hours later
Nearly three hours later (at 14:16:11) I went to the same page and the image loaded with:
Code: 200 (OK)
Cached: Yes
Expires: Thu, 29 Sep 2016 09:24:31 GMT
Last-Modified: Wed, 30 Sep 2015 09:24:31 GMT
X-Cache: Miss from cloudfront
Since the image was still cached by the browser, it loaded quickly. But I cannot understand how CloudFront could fail to return the cached image. Therefore the app had to generate the image again.
I read that CloudFront evicts files from its cache after a few days of being inactive. This is not the case, as demonstrated above. How could this be?
I read that CloudFront evicts files from its cache after a few days of being inactive.
Do you have an official source for that?
Here's the official answer:
If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that have been requested more recently.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
There is no guaranteed retention time for cached objects, and objects with low demand are more likely to be evicted... but that isn't the only factor you may not have considered. Eviction may not be the issue, or the only issue.
Objects cached by CloudFront are like Schrödinger's cat. It's a loose analogy, but I'm running with it: whether an object is "in the cloudfront cache" at any given instant is not a yes-or-no question.
CloudFront has somewhere around 53 edge locations (where your browser connects and the content is physically stored) in 37 cities. Some major cities have 2 or 3. Each request that hits CloudFront is routed (via DNS) to the theoretically most optimal location -- for simplicity, we'll call it the "closest" edge to where you are.
The internal workings of CloudFront are not public information, but the general consensus, based on observations and presumably authoritative sources, is that these edge locations are all independent. They don't share caches.
If, for example, you are in Texas (US) and your request was routed through and cached in Dallas/Fort Worth, TX, and if the odds are equal that any request from you could hit either of the Dallas edge locations, then until you get two misses of the same object, the odds are about 50/50 that your next request will be a miss. If I request that same object from my location, which I know from experience tends to route through South Bend, IN, then the odds of my first request being a miss are 100%, even though the object is cached in Dallas.
So an object is not either in, or not in, the cache because there is no "the" (single, global) cache.
It is also possible that CloudFront's determination of the "closest" edge to your browser will change over time.
CloudFront's mechanism for determining the closest edge appears to be dynamic and adaptive. Changes in the topology of the Internet at large can shift which edge location tends to receive requests sent from a given IP address, so it is possible that over the course of a few hours the edge you are connecting to will change. Maintenance, outages, or other issues impacting a particular edge could also cause requests from a given source IP address to be sent to a different edge than the typical one, and this could also give you the impression of objects being evicted, since the new edge's cache would be different from the old one.
Looking at the response headers, it isn't possible to determine which edge location handled each request. However, this information is provided in the CloudFront access logs.
I have a fetch-and-resize image service that handles around 750,000 images per day. It's behind CloudFront, and my hit/miss ratio is about 50/50. That is certainly not all CloudFront's fault, since my pool of images exceeds 8 million, the viewers are all over the world, and my max-age directive is shorter than yours. It has been quite some time since I last analyzed the logs to determine which "misses" seemed unexpected and how many there were (though when I did, there definitely were some, their number was not unreasonable). That is done easily enough, since the logs tell you whether each response was a hit or a miss, as well as identifying the edge location... so you could analyze them to see if there's really a pattern here.
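Counting hits and misses per edge from the access logs takes only a few lines (a sketch; "access.log.gz" is a placeholder for a log file CloudFront delivered to your S3 bucket, and the field positions assume the standard tab-separated web-distribution log format, where x-edge-location is the 3rd field and x-edge-result-type the 14th):

import gzip
from collections import Counter

counts = Counter()
with gzip.open("access.log.gz", "rt") as f:
    for line in f:
        if line.startswith("#"):     # skip the #Version / #Fields header lines
            continue
        fields = line.rstrip("\n").split("\t")
        edge, result = fields[2], fields[13]
        counts[(edge, result)] += 1

for (edge, result), n in sorted(counts.items()):
    print(f"{edge:10s} {result:12s} {n:8d}")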
My service stores all of its output content in S3, and when a new request comes in, it first sends a quick request to the S3 bucket to see if there is work that can be avoided. If a result is returned by S3, then that result is returned to CloudFront instead of doing all the fetching and resizing work again. Mind you, I did not implement that capability because of the number of CloudFront misses... I designed that in from the beginning, before I ever even tested it behind CloudFront, because -- after all -- CloudFront is a cache, and the contents of a cache are pretty much volatile and ephemeral, by definition.
Update: I stated above that it does not appear possible to identify the edge location forwarding a particular request by examining the request headers from CloudFront... however, it appears that it is possible with some degree of accuracy by examining the source IP address of the incoming request.
For example, a test request sent to one of my origin servers through CloudFront arrives from 54.240.144.13 if I hit my site from home, or 205.251.252.153 when I hit the site from my office -- the locations are only a few miles apart, but on opposite sides of a state boundary and using two different ISPs. A reverse DNS lookup of these addresses shows these hostnames:
server-54-240-144-13.iad12.r.cloudfront.net.
server-205-251-252-153.ind6.r.cloudfront.net.
CloudFront edge locations are named after the nearest major airport, plus an arbitrarily chosen number. For iad12, "IAD" is the International Air Transport Association (IATA) code for Washington Dulles airport, so this is likely to be one of the edge locations in Ashburn, VA (which has three, presumably with different numerical suffixes, but I can't confirm that from this data alone). For ind6, "IND" matches the airport at Indianapolis, Indiana, so this strongly suggests that the request came through the South Bend, IN, edge location.
The reliability of this test depends on the consistency with which CloudFront maintains its reverse DNS entries. It is not documented how many independent caches might exist at any given edge location; the assumption is that there is only one, but there might be more than one, which would have the effect of increasing the miss ratio for very small numbers of requests but would disappear into the mix for large numbers of requests.
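The lookup is easy to script if you are grepping origin logs for CloudFront source addresses (a sketch; the two IPs are the examples above, and the parsing relies on the server-<ip>.<edge>.r.cloudfront.net naming pattern just described, so it fails if the PTR records ever change):

import socket

for ip in ("54.240.144.13", "205.251.252.153"):
    hostname = socket.gethostbyaddr(ip)[0]   # e.g. server-54-240-144-13.iad12.r.cloudfront.net
    edge_code = hostname.split(".")[1]       # 'iad12', 'ind6', ...
    print(ip, "->", edge_code)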

How does Amazon tier/instance time work?

I want to get the free tier plus a cheap medium instance, which I was told I can get for $5 and use for 100 hours. Here's what I'm confused about: does that 100 hours count from the time the instance starts working and responding, or only while I actually use it? For example, if I use it for 2 hours today, will I still have 98 hours left a week later?
Kind Regards
When you start a server, you are "using" it until you shut it down. You are charged for the entire time the server is running (in 1-hour increments). Whether or not you are logged into the server and actively using it is irrelevant for billing: as long as the server exists, it is using resources, and you will be billed for those resources.
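As a toy illustration of how the hours are consumed (the running periods are hypothetical; the 100-hour figure is from the question, and partial hours round up because billing is in 1-hour increments):

import math

hours_purchased = 100
running_periods_h = [2, 5.5, 1.2]   # hypothetical times the instance was up

used = sum(math.ceil(h) for h in running_periods_h)             # billed per started hour
print(f"used {used} h, {hours_purchased - used} h remaining")   # used 10 h, 90 h remaining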