I am trying to take advantage of the built-in Cloud Storage edge caching feature. When a valid Cache-Control header is set, files can be stored at edge locations, without having to set up a Cloud Load Balancer and Cloud CDN. This built-in behavior is touched on in this Cloud Next '18 video.
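For reference, setting that header on an object looks roughly like this (a minimal sketch using the google-cloud-storage Python client; the bucket and object names are placeholders):

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-public-assets")   # placeholder bucket name
blob = bucket.blob("images/logo.png")        # placeholder object name

# A valid Cache-Control header makes a publicly readable object
# eligible for Google's built-in edge caching.
blob.cache_control = "public, max-age=3600"
blob.patch()  # push the metadata change to Cloud Storage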
What I am seeing, though, is a hard limit of 10 MB. When I store a file over 10 MB and then download it, it's missing the Age response header; a 9 MB file will have it. The 10 MB limit is mentioned in the CDN docs here. What doesn't make sense to me is why files over 10 MB don't get cached at the edge. After all, the Cloud Storage server meets all the requirements; the documentation even says: Cloud Storage supports byte range requests for most objects.
Does anyone know more about the default caching limits? I can't seem to find any limits documented for Cloud Storage.
At the moment, the cache limit that applies to objects in Cloud Storage is only documented in the Cloud CDN documentation, and what you describe is expected behavior.
Origin server does not support byte range requests: 10 MB (10,485,760 bytes)
Performing tests on my side, files of exactly 10,485,760 bytes include the Age header; however, files above that limit, such as 10,485,770 bytes, no longer include it.
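A check like that can be reproduced with a plain HEAD request and a look at the response headers (a sketch using the Python requests library; the URL is a placeholder):

import requests

url = "https://storage.googleapis.com/my-bucket/file.bin"  # placeholder

resp = requests.head(url)
# Age is only present when the response was served from an edge cache.
print("Age:", resp.headers.get("Age"))
print("Cache-Control:", resp.headers.get("Cache-Control"))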
I recommend you create a feature request here in order to improve the Google Cloud Storage documentation. That way you will have direct communication with the team responsible for the documentation, and your request may be supported by other members of the community.
We want to download objects from Google Cloud Storage in our application over a 5G network, but the download speed does not rise above 130 Mbps, even though a speed test on the device shows 400 Mbps. Can you please tell me if there are any restrictions on download speed in this service?
Google Cloud Storage doesn't have any hard limit, but the network between your device and Google's infrastructure may affect your download speeds.
Check your ping to GCP regions with this tool. If your data is stored in a location that has very high latency, try moving your storage bucket somewhere closer.
You can also take a look at this article to find out how you can improve your Google Cloud Storage performance.
5G implies a mobile device, and I do not know how to tune TCP settings on a mobile device, but one of the fundamental limits on the performance of a TCP connection is:
Throughput <= WindowSize / RoundTripTime
If you cannot make the RoundTripTime smaller by accessing things more closely to you, you can seek to increase WindowSize.
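To make the bound concrete, here is a tiny worked example (the window size and RTT are illustrative assumptions, not measurements from the question):

window_size = 256 * 1024   # assumed 256 KiB TCP receive window, in bytes
rtt = 0.05                 # assumed 50 ms round-trip time, in seconds

max_throughput = window_size / rtt             # bytes per second
print(f"{max_throughput * 8 / 1e6:.1f} Mbps")  # ~41.9 Mbps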
I have a video streaming application that streams video from a Google Cloud Storage bucket. None of the files in the bucket are public. Every time a user clicks a video on the front end, I generate a signed URL via the API and load it into the HTML5 video player.
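For context, the signed URL generation looks roughly like this (a minimal sketch assuming the google-cloud-storage Python client; the bucket and object names are placeholders):

from datetime import timedelta
from google.cloud import storage

client = storage.Client()
blob = client.bucket("private-videos").blob("lesson-01.mp4")  # placeholders

url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(minutes=15),  # short-lived link for the player
    method="GET",
)
# The front end loads `url` into the HTML5 video element.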
Problem
If the file size is more than 100 MB, it takes around 30-40 seconds to load the video on the front end.
When I googled to resolve this problem, some articles suggested using Cloud CDN in front of the storage bucket to cache the file. As far as I know, to cache a file it has to be publicly available, and I can't make the files public.
So my question is: is there any way to make this scalable and reduce the initial load time?
Cloud CDN will help your latency for sure. Also, with that amount of latency, it might be good to look into the actual requests being sent to Cloud Storage, to make sure chunks are being requested and the whole video file isn't being loaded before playback starts.
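One quick way to verify that is to request a small byte range and confirm the response is 206 Partial Content rather than the full file (a sketch using the requests library; the URL is a placeholder):

import requests

signed_url = "https://storage.googleapis.com/..."  # placeholder signed URL

# A server that honors Range requests answers 206 with just the slice,
# instead of 200 with the entire video.
resp = requests.get(signed_url, headers={"Range": "bytes=0-1023"})
print(resp.status_code)                   # expect 206
print(resp.headers.get("Content-Range"))  # e.g. "bytes 0-1023/104857600"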
Caching the file does not require that the file is public. You can make the file private and add the Cloud CDN service into your Cloud Storage ACLs (https://cloud.google.com/cdn/docs/using-signed-urls#configuring_permissions). Also, as Kolban noted above, signed cookies might be better for your application to streamline the requests.
Not an exact answer, but this site is useful for designing solutions on GCP.
https://gcp.solutions/diagram/media-transcoding
As mentioned earlier, a CDN is the right way to go for video streaming with low latency.
This question already has answers here: Is Google Cloud Storage an automagical global CDN? (4 answers). Closed 3 years ago.
Based on the discussions here, I made my Google Cloud Storage image public, but it's still taking a lot of time to first byte (TTFB). Any idea why? How can I reduce the TTFB when calling Google Cloud Storage? My URL and a snapshot of what I see in developer tools are given below.
Public image URL
OK, now I understand your question. Your concern is how to reduce the TTFB when requesting an image from Google Cloud Storage. There is no magic way to reduce the TTFB to 0; that is near to impossible. Time to First Byte is how long the browser has to wait before it starts receiving data. For the specific case of Google Cloud Storage, it is (in general terms) the time between you requesting an image, the request being delivered to the Google server where your image is stored, and that server locating the image and starting to deliver it to you.
This will depend on 2 main factors:
The speed at which the message is transported to/from the server. This depends on the speed of your connection and the distance between you and the server. It is not the same to fetch an image from the USA as from India; the two will give you very different TTFBs.
You can see this in the following example, where I get the same image from 2 different buckets with a public policy. For reference, I'm in Europe.
Here is my result calling the image from a bucket in Europe:
And here is my result calling the image from India:
As you can see, my download time doesn't increase much, while my TTFB is doubled.
The second factor, if you want to reduce your TTFB, is the speed at which the request is processed by the server. In this case you don't have much influence, since you are requesting the image directly from Google Cloud Storage and can't modify the code. The only way to influence this is by removing load from the request. Making the image public helps here, because the server doesn't have to check certificates or permissions; it just sends you back the image.
So, in conclusion, there is not much you can do here to reduce the TTFB beyond selecting a server location closer to your users and improving your connection speed.
I found this article really useful; it can help you better understand TTFB and how to measure it.
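If you want to measure TTFB yourself, a minimal sketch with the Python requests library looks like this (the URL is a placeholder):

import time
import requests

url = "https://storage.googleapis.com/my-bucket/image.jpg"  # placeholder

start = time.monotonic()
resp = requests.get(url, stream=True)   # don't read the body yet
next(resp.iter_content(chunk_size=1))   # block until the first byte arrives
print(f"TTFB: {time.monotonic() - start:.3f} s")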
Thanks, I have moved my bucket to a location near my users. That has reduced the time.
Currently, Cloud Run has a request limit of 32 MB per request, which makes it impossible to upload files like videos (which are placed unchanged into GCP Storage). Meanwhile, the All Quotas page doesn't list this limitation as one you can request an increase for from support. So the question is: does anyone know how to increase this limit, or how to make it possible to upload videos and bigger files to Cloud Run given this limitation?
Google's recommended best practice is to use signed URLs to upload files, which is likely to be more scalable and reliable (over flaky networks) for file uploads.
See this URL for further information:
https://cloud.google.com/storage/docs/access-control/signed-urls
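A sketch of what that looks like with the google-cloud-storage Python client (bucket and object names are placeholders); the client then PUTs the file straight to Cloud Storage, so the bytes never pass through Cloud Run:

from datetime import timedelta
from google.cloud import storage

client = storage.Client()
blob = client.bucket("my-uploads").blob("videos/raw.mp4")  # placeholders

upload_url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(minutes=30),
    method="PUT",
    content_type="video/mp4",  # the client must send the same Content-Type
)
# Hand `upload_url` to the browser/app; it uploads directly to Cloud
# Storage, bypassing Cloud Run's 32 MB request limit entirely.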
As per official GCP documentation, the maximum request size limit for Cloud Run (which is 32 MB) cannot be increased.
Update since the other answers were posted: the request size can be unlimited if using HTTP/2.
See https://cloud.google.com/run/quotas#cloud_run_limits, which says the maximum HTTP/1 request size is 32 MB if using an HTTP/1 server, with no limit if using an HTTP/2 server.
I have a site that has exploded in traffic the last few days. I'm using Wordpress with W3 Total Cache plugin and Amazon Cloudfront to deliver the images and files from the site.
The problem is that the CloudFront cost is quite large, nearly $500 just in the past week. Is there a way to reduce the costs? Maybe by using another CDN service?
I'm new to CDNs, so I might not be implementing this well. I've created a CloudFront distribution and configured it in the W3 Total Cache plugin. However, I'm not using S3 and don't know if I should, or how. To be honest, I'm not quite sure what the difference is between CloudFront and S3.
Can anyone give me some hints here?
I'm not quite sure what's the difference between Cloudfront and S3.
That's easy. S3 is a data store. It stores files and is super-scalable (easily serving thousands of people at once). The problem is that it's centralized (i.e. served from one place in the world).
CloudFront is a CDN. It caches your files all over the world so they can be served faster. If you squint, it looks like they are 'storing' your files, but the cache can be lost at any time (or if they boot up a new node), so you still need the files at your origin.
CF may actually hurt you if you have too few hits per file. For example, in Tokyo, CF may have 20 nodes. It may take 100 requests to a file before all 20 CF nodes have cached it (requests are randomly distributed). Of those 100 requests, 20 of them will hit an empty cache and see an additional 200 ms of latency while the file is fetched. Nodes generally cache your file for a long time.
I'm not using S3 and don't know if I should
Probably not. Consider using S3 if you expect your site's media to grow massively (i.e. lots of user photo uploads).
Is there a way to reduce the costs? Maybe using another CDN service?
That entirely depends on your site. Some ideas:
1) Make sure you are serving the appropriate headers, and make sure your expires time isn't too short (ideally it should be days, weeks, or months).
The "best practice" is to never expire pages, except maybe your index page, which should expire every X minutes, hours, or days (depending on how quickly you want it updated). Make sure every page/image says how long it can be cached; see the sketch after this list.
2) As stated above, CF is only useful if each page is requested hundreds of times per cache period. If you have millions of pages, each requested only a few times, CF may not be useful.
3) Requests from Asia are much more expensive than those from the US. Consider launching your server in Tokyo if you're more popular there.
4) Look at your web server log and see how often CF is requesting each of your assets. If it's more often than you expect, your cache headers are set up wrong. If you set "cache this for months", you should only see a handful of requests per day (as they boot new servers, etc.), and a few hundred requests when you publish a new file (i.e. one request per CF edge node).
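As an illustration of point 1: whatever your origin is (WordPress/PHP here), every asset response should carry an explicit long-lived Cache-Control header. A minimal, hypothetical sketch of the idea in Python/Flask (the route and filename are placeholders):

from flask import Flask, send_file

app = Flask(__name__)

@app.route("/img/header.png")
def header_image():
    # A long max-age lets CloudFront (and browsers) cache the asset
    # for 30 days instead of hitting the origin on every miss.
    resp = send_file("header.png")
    resp.headers["Cache-Control"] = "public, max-age=2592000"
    return resp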
Depending on your setup, other CDNs may be cheaper. And depending on your server, other setups may be less expensive. (i.e. if you serve lots of small files, you might be better off doing your own caching on EC2.)
You could give Cloudflare a go. It's not a full CDN, so it might not have all the features of CloudFront, but the basic package is free and it will offload a lot of traffic from your server.
https://www.cloudflare.com
Amazon CloudFront costs are based on 2 factors:
Number of requests
Data transferred in GB
Solution
Reduce image requests. To do that, combine small images into one sprite image and use that image:
https://www.w3schools.com/css/tryit.asp?filename=trycss_sprites_img (image sprites)
Don't use the CDN for video files, because videos are large and are responsible for very high CDN costs.
What components make up your bill? One thing to check with the W3 Total Cache plugin is the number of invalidation requests it sends to CloudFront. It's known to send a large number of invalidation paths on each change, which can add up.
Aside from that, if your spend is predictable, one option is to use CloudFront Security Savings Bundle to save up to 30% by committing to a minimum amount for a one year period. It's self-service, so you can sign up in the console and purchase additional commitments as your usage grows.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/savings-bundle.html
Don't forget that CloudFront has 3 different price classes, which influence how widely your data is replicated but, at the same time, can make it cheaper.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/PriceClass.html
The key here is this:
"If you choose a price class that doesn’t include all edge locations, CloudFront might still occasionally serve requests from an edge location in a region that is not included in your price class. When this happens, you are not charged the rate for the more expensive region. Instead, you’re charged the rate for the least expensive region in your price class."
It means that you could use price class 100 (the cheapest one) and still occasionally be served from regions you are not paying for <3