My company uses signed URLs to upload directly to S3 with Transfer Acceleration, and CloudFront for downloads. Does the bucket need to be in the same region as our web app, or can I put it in US West (Oregon)? The reason is to save cost, since some regions cost more than others (we use S3 Infrequent Access).
By using CloudFront, you will use AWS' dedicated backbone to connect to the bucket wherever it is anyway. A distant bucket does mean increased latency, though, so you should decide what you want (best performance or best cost). Be aware, too, that if you download the same file multiple times the latency is much less of an issue, as the file will be cached at the CloudFront edge node.
Agree with what #henry said.
Also consider data transfer costs: sometimes, together with storage, they add up to more when the data is kept in a region with a low storage price.
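To make the storage-versus-transfer trade-off concrete, here is a minimal sketch. The prices and volumes are placeholders chosen for illustration, not real AWS prices, and `monthly_cost` is a hypothetical helper; plug in the figures from the current pricing pages for the regions you are comparing.

```python
def monthly_cost(storage_gb, storage_price, transfer_gb, transfer_price):
    """Return storage plus inter-region transfer cost in dollars per month."""
    return storage_gb * storage_price + transfer_gb * transfer_price


# Placeholder per-GB prices for illustration only; NOT real AWS prices.
HOME_REGION_STORAGE = 0.025
CHEAP_REGION_STORAGE = 0.0125
INTER_REGION_TRANSFER = 0.02

# 1000 GB stored; assume the app reads 500 GB per month from the bucket.
home = monthly_cost(1000, HOME_REGION_STORAGE, 0, INTER_REGION_TRANSFER)
remote = monthly_cost(1000, CHEAP_REGION_STORAGE, 500, INTER_REGION_TRANSFER)
print(f"same region: ${home:.2f}, cheaper region: ${remote:.2f}")
```

With these placeholder numbers the cheaper region still wins, but the saving disappears once the cross-region volume passes 625 GB per month.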
Related
I read this and understand the difference between CloudFront and S3 Transfer Acceleration. Since both can improve download speed (S3 Transfer Acceleration is mainly for uploading, but it also improves downloads), can I use them together? Based on my test, it seems to be impossible, as CloudFront always takes the S3 bucket URL, like xxx.s3.amazonaws.com/, as the origin. It cannot take xxx.s3-accelerate.amazonaws.com as the origin.
Amazon CloudFront caches content in 225+ Points of Presence.
The first user who requests a piece of content in a particular location will trigger the edge location to 'pull' the content from the origin. Future requests for that content will be served from the cache. It is an excellent way to reduce latency for users spread around the world.
Amazon S3 Transfer Acceleration always makes a request back to the source bucket. This is good for uploading, but does not reduce latency for users spread around the world since all of their requests would need to go to the source bucket.
You might be able to have CloudFront in front of S3 Transfer Acceleration, but CloudFront uses the AWS network to reach origin buckets, so it is unlikely to make any difference. If you do experiment with this, let us know your findings!
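For anyone who does want to experiment, the two hostnames from the question differ only in the suffix. A small sketch of that difference follows; `s3_endpoint` is a hypothetical helper, not part of any AWS SDK. The check on periods reflects the documented requirement that a bucket name must not contain periods to be used with Transfer Acceleration.

```python
def s3_endpoint(bucket: str, accelerate: bool = False) -> str:
    """Build the virtual-hosted hostname for a bucket (hypothetical helper)."""
    if accelerate and "." in bucket:
        # Transfer Acceleration does not work with bucket names containing periods.
        raise ValueError("Transfer Acceleration needs a bucket name without periods")
    suffix = "s3-accelerate.amazonaws.com" if accelerate else "s3.amazonaws.com"
    return f"{bucket}.{suffix}"


print(s3_endpoint("xxx"))
print(s3_endpoint("xxx", accelerate=True))
```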
We keep user-specific downloadable files in AWS S3 buckets in the N. Virginia region. Our clients download the files from these buckets all over the world. File sizes range from 1 to 20 GB. For the larger files, clients outside the US complain about slow or interrupted downloads. How can we optimize these downloads?
We are thinking about the following approaches:
1. Accelerated downloads (higher costs)
2. Use of CloudFront CDN with an S3 origin (since our downloads are of different files, each downloaded just once or twice, will a CDN help, given that the first request always fetches the data from the US bucket?)
3. Use of Akamai as CDN (same concern as with CloudFront; the only difference is that we have a better price deal with Akamai at org level)
4. Depending on the user's location (we know where the download will happen), keep the file in a bucket created in the AWS region closest to that user.
So, I want recommendations in terms of cost+download speed. Which may be a better option to explore further?
As each file will only be downloaded a few times, you won't benefit from CloudFront's caching: the likelihood that the download requests all hit the same CloudFront node, and that this node hasn't yet evicted the file from its cache, is probably near zero, especially for such large files.
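A rough way to see why the hit rate is so low is a toy model. It assumes each download lands on one of the edge nodes uniformly at random and that nothing is ever evicted. Both are simplifications: real requests are routed by user location, so two users in the same city would share a node, and eviction only lowers the hit rate further. The node count of 225 is an illustrative figure, and `expected_cache_hits` is a hypothetical helper.

```python
def expected_cache_hits(downloads: int, edge_nodes: int) -> float:
    """Expected number of downloads served from cache in the toy model."""
    # Expected number of distinct nodes touched; each one is a cache miss.
    distinct_nodes = edge_nodes * (1 - (1 - 1 / edge_nodes) ** downloads)
    return downloads - distinct_nodes


for n in (1, 2, 5):
    print(n, "downloads ->", round(expected_cache_hits(n, 225), 4), "expected hits")
```

For two downloads the model gives an expected 1/225 of a hit, which is the chance that the second request lands on the same node as the first.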
On the other hand you gain something else by using CloudFront or S3 Transfer Acceleration (the latter one being essentially the same as the first one without caching): The requests enter AWS' network already at the edge, so you can avoid using congested networks from the location of the user to the location of your S3 bucket, which is usually the main reason for slow and interrupted downloads.
Storing the data depending on the user's location would improve the situation as well, although CloudFront edge locations are usually closer to a user than the nearest AWS region with S3. Another reason for not distributing the files to different S3 buckets depending on the user's location is the management overhead: you need to manage multiple S3 buckets, store each file in the correct bucket and point each user to the correct bucket. While storing could be simplified by using S3 Replication (you could use a filter to replicate only the objects meant for a specific target bucket), the overhead of managing multiple endpoints for multiple customers remains. Also, while you state that you know the location of your customers, what happens if a customer changes location and suddenly wants to download an object which is now stored on the other side of the world? You'd have the same problem again.
In your situation I'd probably choose option 2 and set up CloudFront in front of S3. I'd prefer CloudFront over S3 Transfer Acceleration, as it gives you more flexibility: You can use your own domain with HTTPS, you can later on reconfigure origins when the location of the files changes, etc. Depending on how far you want to go you could even combine that with S3 replication and have multiple origins for your CloudFront distribution to direct requests for different files to S3 buckets in different regions.
Which solution to choose depends on your use case and constraints. Cost seems to be one constraint for you; another could be, for example, CloudFront's maximum file size (20 GB at the time of writing), if you have larger files to distribute.
I have a static site hosted in S3 that I need to front with CloudFront. In other words, I have no option but to put CloudFront in front of it. I would like to reduce my S3 costs by changing the objects' storage class to S3 Infrequent Access (IA). This would reduce my S3 costs by about 45%, which is nice since I now have to spend money on CloudFront. Is this good practice, given that the resources will be cached by CloudFront anyway? S3 IA has 99.9% availability, which means it can have as much as about 8.76 hours of downtime per year.
First, don't worry about the downtime. Unless you are using Reduced Redundancy or One-Zone Storage, all data on S3 has pretty much the same redundancy and therefore very high availability.
S3 Standard-IA is roughly half the price for storage ($0.0125 per GB per month) compared to S3 Standard ($0.023 per GB per month). However, data retrieval from Standard-IA costs $0.01 per GB. Thus, if the data is retrieved more than once per month, Standard-IA is more expensive.
While using Amazon CloudFront in front of S3 would reduce data access frequency, it's worth noting that CloudFront caches separately in each region. So, if users in Singapore, Sydney and Tokyo all requested the data, it would be fetched three times from S3. So, data stored as Standard-IA would incur 3 x $0.01 per GB charges, making it much more expensive.
See: Announcing Regional Edge Caches for Amazon CloudFront
Bottom line: if the data is going to be retrieved from S3 more than once per month, it is cheaper to use Standard storage instead of Standard-Infrequent Access.
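The break-even point can be checked with a few lines of arithmetic. This sketch uses only the per-GB prices quoted above and ignores per-request charges and Standard-IA's minimum object size and minimum storage duration, so treat it as an approximation. The function name is hypothetical.

```python
STANDARD_STORAGE = 0.023   # $ per GB per month, as quoted above
IA_STORAGE = 0.0125        # $ per GB per month, as quoted above
IA_RETRIEVAL = 0.01        # $ per GB retrieved, as quoted above


def monthly_cost_per_gb(storage_class: str, retrievals_per_month: int) -> float:
    """Approximate monthly cost of storing and retrieving 1 GB."""
    if storage_class == "STANDARD":
        return STANDARD_STORAGE  # no retrieval fee
    if storage_class == "STANDARD_IA":
        return IA_STORAGE + IA_RETRIEVAL * retrievals_per_month
    raise ValueError(f"unknown storage class: {storage_class}")


for n in range(4):
    ia = monthly_cost_per_gb("STANDARD_IA", n)
    std = monthly_cost_per_gb("STANDARD", n)
    print(f"{n} retrievals: Standard ${std:.4f}, Standard-IA ${ia:.4f}")
```

With these prices, a single retrieval per month leaves Standard-IA marginally cheaper, two retrievals make it clearly more expensive, and the three regional fetches from the Singapore, Sydney and Tokyo example cost $0.0425 per GB against $0.023 for Standard.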
Background
We have developed an e-commerce application, and I want to use a CDN to improve the speed of the app and to reduce the load on the host.
The application is hosted on an EC2 server, and we are now going to use CloudFront.
Questions
After reading a lot of articles and documents, I created a distribution for my sample site. After experimenting with it, I have come to the following conclusions, and I want to be sure whether I am right about them.
When we create a distribution, it pulls the accessible data from the given origin path. We don't need to copy or sync our files to CloudFront.
We just have to change the paths in our application to point at the distribution's CNAME (if a CNAME is given).
There is no difference between placing the images/JS/CSS files on S3 or on our own host. CloudFront will just fetch them by itself.
The application will have thousands of product pictures. Should we place them on S3, or is it OK if they stay on the host itself? Please share any good article that explains the difference between the two approaches.
If S3 is significantly better, then I'll have to write a program to sync all such data to S3.
Thanks for the help.
Some reasons to store the images on Amazon S3 rather than your own host (and then serve them via Amazon CloudFront):
Less load on your servers
Even though content is cached in Amazon CloudFront, your servers will still be hit with requests for the first access of each object from every edge location (each edge location maintains its own cache), repeated every time the object expires. (Refreshes are sent as conditional requests, so content is only re-downloaded if it has changed or been flushed from the cache.)
More durable storage
Amazon S3 keeps copies of your data across multiple Availability Zones within the same Region. You could also replicate data between your servers to improve durability but then you would need to manage the replication and pay for storage on every server.
Lower storage cost
Storing data on Amazon S3 costs less than storing it on Amazon EBS volumes. If you are planning on keeping your data in both locations, then obviously using S3 is more expensive, but you should also consider storing it only on S3, which makes it lower cost, more durable and leaves less for you to back up on your server.
Reasons to NOT use S3:
More moving parts -- maintaining code to move files to S3
Not as convenient as using a local file system
Having to merge log files from S3 and your own servers to gather usage information
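To put a rough number on the "less load on your servers" point above, here is a back-of-the-envelope upper bound on the requests that still reach the origin. It assumes every object is requested at every edge location and refreshed every time it expires, which is a worst case. The function name and the input figures are hypothetical.

```python
import math


def worst_case_origin_requests(objects, edge_locations, ttl_hours, period_hours):
    """Upper bound on origin requests over a period (first fetch plus refreshes)."""
    # Number of times each cached copy expires and is re-requested in the period.
    fetches_per_copy = math.ceil(period_hours / ttl_hours)
    return objects * edge_locations * fetches_per_copy


# Illustrative inputs: 1000 images, 10 active edge locations, 24 h TTL, 30 days.
print(worst_case_origin_requests(1000, 10, 24, 24 * 30))
```

Many of those requests are cheap refreshes rather than full downloads, but they still land on your own servers unless the origin is S3.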
After reading some AWS documentation, I am wondering what the difference is between the following approaches if I want to deliver content (JS, CSS, images and API requests) in Asia (including China), the US and the EU.
Store my images and static files in S3 in a US region and set up cross-region replication to EU and Asia (Japan or Singapore) buckets to sync with the US bucket.
Store my images and static files in S3 in a US region and set up the CloudFront CDN to cache my content in different locations after the initial request.
Do both of the above (if there is a significant performance improvement).
What is the most cost-effective solution if I need to achieve global deployment? And how can I make requests from China consistent and stable? (I tried CloudFront + S3 (us-west); it's fast, but the performance is not consistent.)
PS. In the early stage, I don't expect too many user requests, but users are spread globally and I want them to have a similar experience. The majority of my content is panorama images, and I'd expect each visit to load ~30 MB of data (10 high-res images) sequentially.
Cross region replication will copy everything in a bucket in one region to a different bucket in another region. This is really only for extra backup/redundancy in case an entire AWS region goes down. It has nothing to do with performance. Note that it replicates to a different bucket, so you would need to use different URLs to access the files in each bucket.
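Because each replica lives in its own bucket, the application has to pick the URL per user. A minimal sketch of that mapping is below; the bucket names are made up, the region codes are only examples, and `object_url` is a hypothetical helper. It builds virtual-hosted-style S3 URLs and falls back to a default region when the user's region has no replica.

```python
from urllib.parse import quote

# Hypothetical replica buckets, one per region (names are made up).
REGION_BUCKETS = {
    "us-east-1": "example-assets-us",
    "eu-west-1": "example-assets-eu",
    "ap-southeast-1": "example-assets-ap",
}


def object_url(user_region: str, key: str, default_region: str = "us-east-1") -> str:
    """Return the URL of the replica for the user's region, or the default one."""
    region = user_region if user_region in REGION_BUCKETS else default_region
    bucket = REGION_BUCKETS[region]
    # quote() percent-encodes unsafe characters but keeps "/" in the key.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(object_url("eu-west-1", "img/pano 1.jpg"))
print(object_url("sa-east-1", "img/pano 1.jpg"))  # no replica, falls back
```

This is the extra bookkeeping that a CDN in front of a single bucket avoids.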
CloudFront is a Content Delivery Network. S3 is simply a file storage service. Serving a file directly from S3 can have performance issues, which is why it is a good idea to put a CDN in front of S3. It sounds like you definitely need a CDN, and it sounds like you have tested CloudFront and are unimpressed. It also sounds like you need a CDN with a larger presence in China.
There is no reason you have to choose CloudFront as your CDN just because you are using other AWS services. You should look at other CDN services and see what their edge networks look like. Given your requirements, I would highly recommend you take a look at CloudFlare. They have quite a few edge network locations in China.
Another option might be to use a CDN that you can actually push your files to. I've used this feature in the past with MaxCDN. You would push your files to the CDN via FTP, and the files would automatically be pushed to all edge network locations and cached until you push an update. For your use case of large image downloads, this might provide a more performant caching mechanism. MaxCDN doesn't appear to have a large China presence though, and the bandwidth charges would be more expensive than CloudFlare.
If you want to serve the files in your S3 buckets to users all around the world, then you may consider using S3 Transfer Acceleration. It can be used both for uploads to and downloads from your S3 bucket. You may also try AWS Global Accelerator.
CloudFront's job is to cache content at hundreds of caches ("edge locations") around the world, making them more quickly accessible to users around the world. By caching content at locations close to users, users can get responses to their requests more quickly than they otherwise would.
S3 Cross-Region Replication (CRR) simply copies an S3 bucket from one region to another. This is useful for backing up data, and it also can be used to speed up content delivery for a particular region. Unlike CloudFront, CRR supports real-time updating of bucket data, which may be important in situations where data needs to be current (e.g. a website with frequently-changing content). However, it's also more of a hassle to manage than CloudFront is, and more expensive on a multi-region scale.
If you want to achieve global deployment in a cost-effective way, then CloudFront would probably be the better of the two, except in the special situation outlined in the previous paragraph.