upload files via cloudfront distribution - amazon-web-services

how to upload files directly to cloudfront distribution ?
now I use the putobject method in the s3 class in the javascript sdk
According to documentation we can upload to the distribution directly
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/AddingObjects.html
when I send the put request to distributionname.cloudfront.net It says 403 forbidden
although I enabled the CORS configuration in s3
is there any similar method to s3.putobject for uploading to the cloudfront directly ?
or should I keep sending to the s3 origin of distribution buketname.s3.amazonaws.com/?

Actually AWS released a feature later called as
"Amazon S3 Transfer Acceleration"
to Transfer Acceleration takes advantage of Amazon CloudFront’s globally distributed edge locations
http://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html

There is no such concept as uploading files "to" Cloudfront.
The link you cited only really discusses adding objects to your origin -- not to your distribution, in spite of the ambiguous title in the Amazon documentation. As discussed there, you're adding objects to your origin so that they will be accessible via your distribution... not actually adding objects "to" the distribution.
Cloudfront does not provide persistent storage -- it only stores (caches) the objects that are requested through it, after the objects are fetched from the origin (which can be S3, or not).
Once an object is requested and cached at a Cloudfront edge location, it still isn't in any real sense "in" Cloudfront. It's only stored at the particular edge locations where it's been requested, and only until it either expires or is otherwise evicted from the Cloudfront cache at that location.
“[...] when space is needed at an edge location, the Amazon CloudFront will remove less popular objects in order to make room for more popular ones. This means that your static objects that aren’t accessed frequently are less likely to remain in Amazon CloudFront’s edge locations’ caches.”
— https://aws.amazon.com/cloudfront/details/
Now, with all of that said... it is technically possible to upload objects to S3 through Cloudfront, but this technique doesn't put the object "into" Cloudfront... it only allows you to put the object into S3 using Cloudfront as a proxy, which can offer some performance improvement in less than ideal network conditions, but has no impact on subsequent behavior on the part of Cloudfront when fetching the object, and doesn't invalidate the old copy of the object that might already be cached in Cloudfront at the various edge locations around the globe.

As I understand, if you use Transfer Acceleration on a bucket then the objects uploaded to that bucket will be first uploaded to CloudFront and then it will be transferred to the actual S3 bucket.

Cloud Front can support those requests if you enable them: POST, PUT, DELETE, OPTIONS, and PATCH.
Aws recommend to use CloudFront for upload/download file < 1Gb in size. For larger files, the S3 Transfer Accelerator is recommended.
see: https://aws.amazon.com/blogs/aws/amazon-cloudfront-content-uploads-post-put-other-methods/
https://aws.amazon.com/s3/faqs/#Amazon_S3_Transfer_Acceleration

Related

Use Amazon CloudFront together with Amazon S3 Transfer Acceleration?

I read this and understand the difference between CloudFront and S3 Transfer Acceleration. Since both can improve the download speed(though S3 Transfer Acceleration is mainly for uploading, it also improving download), can I use them together? Based on my test, it seems to be impossible as CloudFront will always take the S3 bucket URL like xxx.s3.amazonaws.com/ as the source. It cannot take xxx.s3-accelerate.amazonaws.com as the source.
Amazon CloudFront caches content in 225+ Points of Presence.
The first user who requests a piece of content in a particular location will trigger the edge location to 'pull' the content from the origin. Future requests for that content will be served from the cache. It is an excellent way to reduce latency for users spread around the world.
Amazon S3 Transfer Acceleration always makes a request back to the source bucket. This is good for uploading, but does not reduce latency for users spread around the world since all of their requests would need to go to the source bucket.
You might be able to have CloudFront in front of S3 Transfer Acceleration, but CloudFront uses the AWS network to reach origin buckets, so it is unlikely to make any difference. If you do experiment with this, let us know your findings!

S3 Transfer Acceleration Semantics

I have a rather simple question which I cannot find an explicit answer to, but anyone using the subject should be able to answer.
Does S3 Transfer Acceleration follow an eventual model i.e. clients upload to a CF edge location, get the response back and then the data is eventually moved to a bucket OR is the performance (speed) gain is simply because of the AWS internal network usage and upon the request completion the data is always 100% IN the S3 bucket?
If it's the former is there any SLA regarding how fast this eventual process is?
S3 Transfer Acceleration uses CloudFront and CloudFront doesn't cache POST/PUT request which means the data gets uploaded to the S3 at the same time. CloudFront doesn't buffer it, it simply saves your RTT (round trip time) by letting you connect to the nearest edge location compare to when you connect to S3 endpoint situated far from you.
And, since transfer between CloudFront and S3 is in AWS Network, it should be faster.
(buffer in the sense you can consider Acceleration endpoint as proxy).
S3 Transfer Acceleration uses portions of the CloudFront infrastructure to provide low-latency, performance-optimized connectivity from browser to edge to bucket.
It does not use any storage or caching components of CloudFront.
The acceleration is only TLS and transport (buffer and routing) related; all HTTP interactions are ultimately end-to-end with the actual S3 bucket, with CloudFront edge servers providing termination for the browser-facing TLS session and a reverse-proxy function.
Nothing stored outside the bucket, so S3's standard consistency model applies.

Amazon S3 as cdn copying images from my server

I have searched a lot on this but all I get is using CloudFront (CDN) in collaboration with S3.
I want to do something different.
CloudFront works as a CDN with its Origin set to either my domain where images are, or S3.
If I set it to my domain, there is an issue of having my hosting space used.
If I use it with S3, the question is, how to get my images to S3 without much hassle? In case of CDN, this is automatic, as every call to CloudFront copies the image from my server automatically.
Is it possible that CloudFront works with S3 but if image is not present on S3, it copies it from my server to S3?
Or may be S3 itself works as CDN (best solution). I have seen on some sites that they use s3 urls for hosting their images, like this:
https://retsimages.s3.amazonaws.com/14/A10363214_6.jpg
How is that possible?
If I set it to my domain, there is an issue of having my hosting space used.
More expensive than the storage space is the cost of having a server sitting there ready to handle the request. Your application logic konws when the images change; that's the time to put them in S3.
how to get my images to S3 without much hassle?
There's an SDK for just about every language, so upload the image as it comes in. Use s3cmd sync to move the images you have. Then you can just turn off your server.
Or may be S3 itself works as CDN
CloudFront can use a customer provided dns name and matching certificate so that you can use a custom domain with https. It can integrate into AWS WAF which S3 cannot directly. Otherwise, CDN behaves similarly to s3. CloudFront should provide better caching and endpoint locality, but you'll see little functional difference at low volumes. Neither is read-after-write consistent, but Cloudfront caches additionally. Pricing is unlikely to make CloudFront cheaper for most uses.
Is it possible that CloudFront works with S3 but if image is not present on S3, it copies it from my server to S3?
Close.
CloudFront does have a feature that would help move you in this direction -- origin groups. Create an origin group with S3 as primary and your server as secondary. Any time CloudFront encounters a cache miss, it will first check S3, and only if the image is not there, it will retry by sending the request to your server. It will cache the response, but it will not remember the source of the object -- so subsequent requests on future cache misses for the same object will always try S3 first.
This means something on your server needs to be responsible for ultimately moving images to S3 -- but as long as the image exists in one place or the other, the image will be served by CloudFront and cached in CloudFront in the edge or edges (up to two -- one global/outer, one regional/inner) that handled the request.

Multi region active-active using S3 bucket in each region

I'm trying to design an architecture for a "simple" problem but for the moment I did not found the solution.
The problem:
I have a S3 bucket (one in each region with bucket replication in order to have the same thing in each bucket) and I would like to have a CloudFront in front of it to cache objects.
My need: to have the lowest latency for each user in the world when displaying an object from S3 bucket.
I wanted to have a CloudFront distribution in front of each S3 bucket and a Route53 to route based on the latency to the nearest CF. The problem is that we cannot have many distribution for the same cname.
Here bellow the architecture I have so far (which is not good).
Any idea how to achieve this ?
Thanks.
C.C.
Just keep one of your buckets, AWS CloudFront does all of them for you.
How CloudFront Delivers Content to Your Users
After you configure CloudFront to deliver your content, here's what happens when users request your objects:
1-A user accesses your website or application and requests one or more objects, such as an image file and an HTML file.
2-DNS routes the request to the CloudFront edge location that can best serve the request—typically the nearest CloudFront edge location in terms of latency—and routes the request to that edge location.
3-In the edge location, CloudFront checks its cache for the requested files. If the files are in the cache, CloudFront returns them to the user. If the files are not in the cache, it does the following:
CloudFront compares the request with the specifications in your
distribution and forwards the request for the files to the applicable
origin server for the corresponding file type—for example, to your
Amazon S3 bucket for image files and to your HTTP server for the HTML
files.
The origin servers send the files back to the CloudFront edge
location.
As soon as the first byte arrives from the origin, CloudFront
begins to forward the files to the user. CloudFront also adds the
files to the cache in the edge location for the next time someone
requests those files.
For more info read the following doc:
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/HowCloudFrontWorks.html
To deliver content to end users with lower latency, Amazon CloudFront uses a global network of 138 Points of Presence (127 Edge Locations and 11 Regional Edge Caches) in 63 cities across 29 countries. Amazon CloudFront Edge locations are located in:
One thing you can do is that you can create one single CloudFront distribution and you can attach a Lambda#Edge to it and use it to rewrite the host header in the request. Inside the Lambda you can access all the headers and you can rewrite them at will, based on any logic you want. When you rewrite the host header, the request will be sent to another bucket in another region.
We used this solution to build multi-region active-active delivery from replicated buckets from two regions.
The original idea is from here: https://medium.com/buildit/a-b-testing-on-aws-cloudfront-with-lambda-edge-a22dd82e9d12
This seems to be the same solution for a different problem: https://aws.amazon.com/blogs/apn/using-amazon-cloudfront-with-multi-region-amazon-s3-origins/
We presented our solution on the AWS Summit in Berlin this year, but haven't posted about it yet anywhere.
The answer seems to be pretty elaborate as provided by #Reza Mousavi. The point of AWS CloudFront distribution is to cache objects on the Edge locations worldwide (see options while configuring-attached snapshot).
Best practice (at least what I do -no complaints so far) is to configure a single distribution for each application origin. The option while configuring gives you the regions to choose based on your customer origin.
The AWS solutions has launched new solution to address the S3 replication across the regions.
For example, you can create objects in Oregon, rename them in Singapore, and delete them in Dublin, and the changes are replicated to all other regions. This solution is designed for workloads that can tolerate lost events and variations in replication speed.
https://aws.amazon.com/solutions/multi-region-asynchronous-object-replication-solution/

Difference between Amazon S3 cross region replication and Cloudfront

After reading some AWS documentations, I am wondering what's the difference between these different use cases if I want to delivery (js, css, images and api request) content in Asia (including China), US, and EU.
Store my images and static files on S3 US region and setup EU and Asia(Japan or Singapore) cross region replication to sync with US region S3.
Store my images and static files on S3 US region and setup cloudfront CDN to cache my content in different locations after initial request.
Do both above (if there is significant performance improvement).
What is the most cost effective solution if I need to achieve global deployment? And how to make request from China consistent and stable (I tried cloudfront+s3(us-west), it's fast but the performance is not consistent)?
PS. In early stage, I don't expect too many user requests, but users spread globally and I want them to have similar experience. The majority of my content are panorama images which I'd expect to load ~30MB (10 high res images) data sequentially in each visit.
Cross region replication will copy everything in a bucket in one region to a different bucket in another region. This is really only for extra backup/redundancy in case an entire AWS region goes down. It has nothing to do with performance. Note that it replicates to a different bucket, so you would need to use different URLs to access the files in each bucket.
CloudFront is a Content Delivery Network. S3 is simply a file storage service. Serving a file directly from S3 can have performance issues, which is why it is a good idea to put a CDN in front of S3. It sounds like you definitely need a CDN, and it sounds like you have tested CloudFront and are unimpressed. It also sounds like you need a CDN with a larger presence in China.
There is no reason you have to chose CloudFront as your CDN just because you are using other AWS services. You should look at other CDN services and see what their edge networks looks like. Given your requirements I would highly recommend you take a look at CloudFlare. They have quite a few edge network locations in China.
Another option might be to use a CDN that you can actually push your files to. I've used this feature in the past with MaxCDN. You would push your files to the CDN via FTP, and the files would automatically be pushed to all edge network locations and cached until you push an update. For your use case of large image downloads, this might provide a more performant caching mechanism. MaxCDN doesn't appear to have a large China presence though, and the bandwidth charges would be more expensive than CloudFlare.
If you want to serve your files in S3 buckets to all around the world, then I believe you may consider using S3 Transfer acceleration. It can be used in cases where you either upload to or download from your S3 bucket . Or you may also try AWS Global Accelerator
CloudFront's job is to cache content at hundreds of caches ("edge locations") around the world, making them more quickly accessible to users around the world. By caching content at locations close to users, users can get responses to their requests more quickly than they otherwise would.
S3 Cross-Region Replication (CRR) simply copies an S3 bucket from one region to another. This is useful for backing up data, and it also can be used to speed up content delivery for a particular region. Unlike CloudFront, CRR supports real-time updating of bucket data, which may be important in situations where data needs to be current (e.g. a website with frequently-changing content). However, it's also more of a hassle to manage than CloudFront is, and more expensive on a multi-region scale.
If you want to achieve global deployment in a cost-effective way, then CloudFront would probably be the better of the two, except in the special situation outlined in the previous paragraph.