AWS S3 + CDN Speed, significantly slower - amazon-web-services

Ok,
So I've been playing around with Amazon Web Services, as I am trying to speed up my website and save resources on my server by using AWS S3 and CloudFront.
I ran a page speed test initially, and the page loaded in 1.89ms. I then put all assets into an S3 bucket, made that bucket available to CloudFront, and then used the CloudFront URL on the page.
When I ran the page speed test again with this tool, using all of its server options, I got these results:
Server with the lowest load time: 3.66ms
Server with the highest load time: 5.41ms
As you can see, the load time has increased considerably. Have I missed something, is it configured wrong? I thought the CDN was supposed to make the page speed load faster.

Have I missed something, is it configured wrong? I thought the CDN was supposed to make the page speed load faster.
Maybe, in answer to both of those statements.
CDNs provide two things:
Ability to scale horizontally as request volume increases by keeping copies of objects
Ability to improve delivery experience for users who are 'far away' (either in network or geographic terms) from the origin, either by having content closer to the user, or having a better path to the origin
By introducing a CDN, there are (again two) things you need to keep in mind:
Firstly, the CDN generally holds a copy of your content. If the CDN is "cold", acceleration is unlikely, especially when the test user is close to the origin.
Secondly, you're changing the infrastructure to add an additional 'hop' into the route. If the cache is cold, the origin isn't busy, and you're already close to it, you'll almost always see an increase in latency, not a decrease.
Whether a CDN is right for you or not depends on the type of traffic you're seeing.
If I'm a distributor based in the US with customers all over the world, even for dynamic, completely uncachable content, then using a CDN can help me improve performance for those users.
If I'm a distributor based in Northern Virginia with customers only in Northern Virginia, and I only see one request an hour, then I'm likely to see no visible performance improvement, because the cache is unlikely to stay populated, the network path isn't preferable, and I don't have to deal with scale.
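To make the "extra hop" point concrete, here is a minimal sketch of the average latency seen through a CDN as a function of the cache hit ratio. The function name and all the millisecond figures are hypothetical, chosen only for illustration, and the model ignores everything except the two network legs.

```python
def expected_latency_ms(hit_ratio, edge_ms, origin_ms):
    """Average latency through a CDN under a simple two-leg model.

    A hit costs only the user-to-edge trip; a miss costs that trip plus
    the edge-to-origin fetch (the additional 'hop').
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit_cost = edge_ms
    miss_cost = edge_ms + origin_ms
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


# Hypothetical numbers: the tester sits close to the origin (20 ms direct),
# the edge is 15 ms away, and the edge needs 25 ms to fetch from the origin.
direct_ms = 20.0
for hit_ratio in (0.0, 0.5, 1.0):
    via_cdn = expected_latency_ms(hit_ratio, edge_ms=15.0, origin_ms=25.0)
    print(f"hit ratio {hit_ratio:.0%}: CDN {via_cdn:.1f} ms vs direct {direct_ms:.1f} ms")
```

With these made-up inputs the CDN only wins once the cache is mostly warm, which matches the cold-cache case described above.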

Generally, yes, a CDN is faster. But 1.89ms is scorchingly fast; you probably won't beat that, certainly not under any kind of load.
Don't over-optimize here. Whatever you're doing, you have bigger fish to fry than 1.77ms in load time based on only three samples and no load testing.

I've seen the same thing. I initially moved some low-traffic static sites to S3/CloudFront to improve performance, but found that even a small Linux EC2 instance running nginx gave better response times in my use cases.
For a high-traffic, geographically dispersed set of clients, S3/CloudFront would probably outperform it.
BTW: I suspect you don't mean 1.89ms, but instead 1.89 seconds, correct?

Related

AWS having longer wait time than Siteground fo

For my own education, I'm currently trying to achieve the fastest possible site speed for a landing page. I have hosted it once on Siteground with all speed optimizations on (Minify, Level 3 Supercacher (Memcached), Lazy Loading, and Cloudflare).
I have set up a 99% identical site on AWS (100% is not possible, since SG has its own optimizer).
I assumed AWS would be faster. But when I look in my Developer Toolbar, Pingdom, or GTMetrix, SG wins. The reason is that all files have a longer waiting time. I know it is minimal, but given that I want to achieve maximum speed, I am wondering what the reason is. I tried a bigger instance, but changing from t2.micro to m4.16xlarge didn't make a difference. I am wondering whether it would make one anyway without visitors.
This is the loading time on Siteground:
This is on my AWS Site:
The difference is that the JS and CSS files on SG get loaded in 20-30ms and the files on AWS in 40-50ms.
The last options I could try would be Varnish Cache or move to Lightsail, but I'm not sure if this will help.

How to use CloudFront efficiently for less popular website?

We are building a website which contains a lot of images and data. We have optimized a lot to make the website faster. Then we decided to use AWS CloudFront also to make it faster for all regions around the world. The app works faster after the integration of CloudFront.
But later we found that the data will load to CloudFront cache only when the website asks for it. So we are afraid that the initial load will take the same time as it used to take without the CDN because it loads from S3 to CDN first and then to the user.
Also, we used the default TTL value (i.e., 24 hours). In our case, a user may log in to this website once or twice per week. So the advantage of caching won't apply here either, because the cache expires after 24 hours. Will raising the TTL (Maximum TTL) to a larger value solve the issue? Does it cost more money? I also read that increasing the TTL is not a good idea, as it has some disadvantages when updating the data in S3.
CloudFront will cache the response only after the first user requests it. So it will be slow for the first user, but significantly faster for every user after that. So it does make sense to use CloudFront.
Using the default TTL value is okay, since most users will see the same content and the website has a lot of static components as well. Every user except the first will see a fast response from your website. You could even reduce the TTL to 10-12 hours, depending on how often you expect your data to change.
There is no additional cost for increasing your TTL. However, invalidation requests are charged, so removing an object from the cache adds a cost. I would therefore keep the TTL about as short as the interval at which your data is expected to change, so you don't have to invalidate existing caches when your data changes. At the same time, the maximum number of users can still benefit from your CDN.
No additional charge for the first 1,000 paths requested for invalidation each month. Thereafter, $0.005 per path requested for invalidation.
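As a rough sketch of what that pricing note means in practice, the function below applies the two figures quoted above (1,000 free paths per month, then $0.005 per path). The function and constant names are made up for illustration, and the prices may have changed since this was written, so check the current pricing page before relying on them.

```python
FREE_PATHS_PER_MONTH = 1000    # free tier quoted in the pricing note above
PRICE_PER_EXTRA_PATH = 0.005   # USD per path beyond the free tier, as quoted above


def invalidation_cost_usd(paths_this_month):
    """Monthly invalidation charge under the quoted pricing (illustrative only)."""
    if paths_this_month < 0:
        raise ValueError("paths_this_month must not be negative")
    billable_paths = max(0, paths_this_month - FREE_PATHS_PER_MONTH)
    return billable_paths * PRICE_PER_EXTRA_PATH


for paths in (500, 1000, 3000):
    print(f"{paths} paths -> ${invalidation_cost_usd(paths):.2f}")
```

Occasional invalidations stay free under these assumptions; the charge only matters if something invalidates thousands of paths per month.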
UPDATE: In the event that you only have 1 user using the website over a long period of time (1 week or so), it might not be of much benefit to use CloudFront at all. CloudFront and all caching services are only effective when multiple users request the same resources.
However you might still have a marginal benefit using CloudFront, as the requests will be routed from the edge location to S3 over AWS's backbone network which is much faster than the internet. But whether this is cost effective for you or not depends on how many users are using the website and how slow it is.
Aside from using CloudFront, you could also try S3 Cross-Region Replication to increase your overall speed. Cross-Region Replication copies objects to a bucket in a different region as they are added to the source bucket. This can help to minimize latency for users in other regions.

GCP CDN not caching data

We have deployed our website on a GCP VM and enabled GCP CDN in front of the VM. When we browse the website, in most cases GCP CDN makes requests to the origin VM.
I am using the Stackdriver query below to check the cache hits.
resource.type="http_load_balancer"
resource.labels.forwarding_rule_name="rule_name"
httpRequest.serverIp="gcpvmip"
httpRequest.requestUrl="request_url"
httpRequest.cacheFillBytes > 0
Based on your latest comment, it sounds like you're expecting all requests to your site to be served from Cloud CDN's caches without contacting your origin server. However, it's normal to see cache misses when using a CDN. Each CDN operates numerous caches, not one big global cache. The fact the content for one URL has been inserted into one cache does not mean it will be present in all caches everywhere. Further, unpopular cache entries are routinely evicted from cache to make room for more popular content.
Here are some relevant excerpts from the Cloud CDN docs:
Cloud CDN uses caches in numerous locations around the world. Caching
is reactive in that an object is stored in a particular cache if a
request goes through that cache and if the response is cacheable. An
object stored in one cache does not automatically replicate into other
caches; cache-to-cache fill happens only in response to a
client-initiated request.
https://cloud.google.com/cdn/docs/overview
Note that the expiration time is an upper bound on how long a cache
entry remains valid. There is no guarantee that a cache entry will
remain in the cache until it expires.
https://cloud.google.com/cdn/docs/caching
Note, though, that Cloud CDN operates numerous caches around the
world, and old cache entries are routinely evicted to make room for
new content. As a result, multiple cache fills per resource are
expected as part of normal operation.
https://cloud.google.com/cdn/docs/support#low-hit-rate
If you're seeing low cache hit rates for popular content, that last link has suggestions that should help.
I know exactly what the problem is... GCP CDN does not have Origin Shield feature. Even worse, with GCP almost every request comes from a different one of its massive number of CDN PoPs around the world. Without Origin Shield, your app server is the origin server and it has to fill the cache of every CDN edge point.
In my experience you should use GCP CDN only for DoS protection and for caching and improving the TTFB of HTML requests (especially to offload the SSL handshake). Use another CDN with a better cache hit ratio for caching other assets.
Some CDN providers have Origin Shield which helps with the cache hit ratio. E.g. create cdn.yourdomain.com with a CDN provider that has Origin Shield Feature and serve all other static content from there.
I know it may sound crazy to put a CDN in front of your CDN, but trust me, it works amazingly well, and you can even save money if you go with a CDN that charges less for bandwidth. Also, GCP CDN only caches content up to 10MB.

How to reduce Amazon Cloudfront costs?

I have a site whose traffic has exploded over the last few days. I'm using WordPress with the W3 Total Cache plugin and Amazon CloudFront to deliver the images and files from the site.
The problem is that the CloudFront cost is huge: nearly $500 in the past week alone. Is there a way to reduce the costs? Maybe using another CDN service?
I'm new to CDN, so I might not be implementing this well. I've created a cloudfront distribution and configured it on W3 Total Cache Plugin. However, I'm not using S3 and don't know if I should or how. To be honest, I'm not quite sure what's the difference between Cloudfront and S3.
Can anyone give me some hints here?
I'm not quite sure what's the difference between Cloudfront and S3.
That's easy. S3 is a data store. It stores files and is super-scalable (easily scaling to serve thousands of people at once). The problem is that it's centralized (i.e. served from one place in the world).
CloudFront is a CDN. It caches your files all over the world so they can be served faster. If you squint, it looks like they are 'storing' your files, but a cached copy can be lost at any time (and a newly booted node starts empty), so you still need the files at your origin.
CF may actually hurt you if you have too few hits per file. For example, in Tokyo, CF may have 20 nodes. It may take 100 requests to a file before all 20 CF nodes have cached your file (requests are randomly distributed). Of those 100 requests, 20 will hit an empty cache and see an additional 200ms of latency while the node fetches the file. Once cached, files generally stay cached for a long time.
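The "20 nodes, about 100 requests" estimate can be sanity-checked with a small sketch. It assumes, as the example above does, that every request lands on a uniformly random node; real CDN routing is not that simple, so treat this as an illustration of the arithmetic only. The function names are made up.

```python
import random


def expected_requests_to_warm(nodes):
    """Expected requests until every node has been hit at least once,
    assuming uniformly random routing (the coupon collector formula)."""
    return nodes * sum(1.0 / k for k in range(1, nodes + 1))


def simulate_requests_to_warm(nodes, rng):
    """Count random requests until all nodes hold a cached copy."""
    warmed = set()
    requests = 0
    while len(warmed) < nodes:
        warmed.add(rng.randrange(nodes))  # this request lands on a random node
        requests += 1
    return requests


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
trials = [simulate_requests_to_warm(20, rng) for _ in range(10_000)]
print(f"analytic mean:  {expected_requests_to_warm(20):.1f} requests")
print(f"simulated mean: {sum(trials) / len(trials):.1f} requests")
```

Under this assumption the average is somewhat below 100, but individual runs vary widely, so 100 is a reasonable ballpark. However many requests it takes, exactly one request per node is a miss.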
I'm not using S3 and don't know if I should
Probably not. Consider using S3 if you expect your site to grow massively in media (i.e. lots of user photo uploads).
Is there a way to reduce the costs? Maybe using another CDN service?
That entirely depends on your site. Some ideas:
1) Make sure you are serving the appropriate cache headers, and make sure your expiry time isn't too short (it should be days or weeks, or ideally months).
The "best practice" is to never expire pages, except maybe your index page, which should expire every X minutes, hours, or days (depending on how fast you want it updated). Make sure every page/image says how long it can be cached.
2) As stated above, CF is only useful if each page is requested hundreds of times per cache lifetime. If you have millions of pages, each requested a few times, CF may not be useful.
3) Requests from Asia are much more expensive than those from the US. Consider launching your server in Tokyo if you're more popular there.
4) Look at your web server log and see how often CF is requesting each of your assets. If it's more often than you expect, your cache headers are set up wrong. If you set up "cache this for months", you should only see a handful of requests per day (as they boot new servers, etc.), and a few hundred requests when you publish a new file (i.e. one request per CF edge node).
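For point 1, here is a minimal sketch of one way to decide which Cache-Control value to send per file. The extension list, the 30-day and 10-minute lifetimes, and the function name are all assumptions for illustration; how you actually attach the header depends on your web server or plugin.

```python
# Hypothetical policy: static assets are cached for 30 days, everything else
# (e.g. HTML pages) for 10 minutes. Tune both lifetimes to your own site.
LONG_LIVED_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg")
LONG_MAX_AGE_SECONDS = 30 * 24 * 3600
SHORT_MAX_AGE_SECONDS = 10 * 60


def cache_control_for(path):
    """Return a Cache-Control header value for the given URL path."""
    if path.lower().endswith(LONG_LIVED_EXTENSIONS):
        return f"public, max-age={LONG_MAX_AGE_SECONDS}"
    return f"public, max-age={SHORT_MAX_AGE_SECONDS}"


for path in ("/wp-content/uploads/photo.JPG", "/style.css", "/index.html"):
    print(f"{path}: Cache-Control: {cache_control_for(path)}")
```

Long lifetimes like this work best when a changed file gets a new name or a version query string; otherwise edges and browsers may keep serving the old copy until it expires.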
Depending on your setup, other CDNs may be cheaper. And depending on your server, other setups may be less expensive. (i.e. if you serve lots of small files, you might be better off doing your own caching on EC2.)
You could give Cloudflare a go. It's not a full CDN, so it might not have all the features of CloudFront, but the basic package is free and it will offload a lot of traffic from your server.
https://www.cloudflare.com
Amazon CloudFront costs are based on two factors:
Number of requests
Data transferred in GB
Solution:
Reduce image requests by combining small images into a single image (an image sprite) and using that image.
https://www.w3schools.com/css/tryit.asp?filename=trycss_sprites_img (image sprites)
Don't use the CDN for video files: videos are large, which drives the CDN cost up.
What components make up your bill? One thing to check with the W3 Total Cache plugin is the number of invalidation requests it is sending to CloudFront. It's known to send a large number of invalidation paths on each change, which can add up.
Aside from that, if your spend is predictable, one option is to use the CloudFront Security Savings Bundle to save up to 30% by committing to a minimum amount for a one-year period. It's self-service, so you can sign up in the console and purchase additional commitments as your usage grows.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/savings-bundle.html
Don't forget that CloudFront has 3 different price classes. A cheaper class limits how widely your data is replicated, but it also lowers your cost.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/PriceClass.html
The key here is this:
"If you choose a price class that doesn’t include all edge locations, CloudFront might still occasionally serve requests from an edge location in a region that is not included in your price class. When this happens, you are not charged the rate for the more expensive region. Instead, you’re charged the rate for the least expensive region in your price class."
It means that you could use price class 100 (the cheapest one) and still occasionally be served from regions you are not paying for <3

Can a CDN be beneficial over a DEDICATED SERVER when there seems to be no CDN in our region?

I am considering trying out Amazon's CloudFront CDN, which uses their S3 service for file storage and serves data from the servers closest to the browser. However, we have a dedicated server in South Africa (Johannesburg, to be exact), so my question is this:
Amazon's CloudFront seems to give you the option to have your base server in the EU, America, or Japan, but nowhere near South Africa. I guess the EU is the closest? So, will I still benefit from using the CDN to serve static files (CSS, images, JavaScript when not using Google's AJAX API, and media files) rather than serving them from the same dedicated server? Bear in mind that although we have a dedicated server, it is STILL a shared hosting environment, as we host multiple clients' websites on the server.
Secondly, if I DO use a CDN like Amazon's CloudFront, can I benefit from caching my content and using far-future Expires headers, compression, etc.?
Many thanks
Well, the CDN would take load off your main dedicated server. This is quite advantageous, because it means that you can downgrade your server to something a little cheaper.
It also provides many, many more levels of redundancy than a small-to-medium business could ever hope to achieve (especially with only a single server).
The fact that a CDN has mirrors all over the world is only 1 feature of such a network. It's a good one, but it is by no means the only one.
Really, you have to think about what you are hoping to achieve by using the CDN, and what you are using it for.
If your application relies on the response time and nothing else matters, then it might be worth keeping your dedicated server running everything because of the proximity to your location. That said, quite often applications that require quick response time only require it for a very small subsection of the overall functionality of the server, so you could unload only the non-time-critical elements to the CDN and still get the best of both worlds.
In short, CDNs have many functions. They save businesses time and money, and provide infrastructure that is otherwise unattainable for the large majority of developers and administrators.
I don't think it matters where your server is. You're still uploading stuff to Amazon (although, yes you'd probably want to make use of the EU servers so it's quicker if you're sending stuff from South Africa) but the idea of a CDN is that it has end points around the world so that you don't have to worry about having servers in those locations.
In short, yes a CDN such as Amazon's CloudFront is what you want to be able to serve static content quickly.
The main idea is that the closer the server is to the user (or the better the route is), the better the download speed. A CDN is a network of multiple geographically distributed servers. This means that, in contrast to a single dedicated server, static content is served from the closest point of presence. This allows better speeds in different regions. And yes, you can use different headers with a CDN.