Protecting S3 assets without CloudFront - amazon-web-services

I am a beginner with AWS. I am currently hosting the assets for a web application (built on a microservices-based architecture) in an S3 bucket, and I want the browsers using the application to be able to access those assets. But all over the internet it is stated that it's highly recommended to prevent public access to the S3 bucket.
How can I do that without CloudFront? I won't be using it since all of the users are in the same region.

You can't use S3 for static website hosting and also follow AWS' best practice of keeping S3 buckets private - you need to pick one.
The recommended structure is a private S3 bucket with a public CloudFront distribution in front of it, and an origin access identity (OAI) to control access to the origin bucket (see the bucket policy sketch further below). Honestly, if you configure your bucket with just GET access and enable static web hosting it's not terrible, but CloudFront offers several significant benefits over S3 static website hosting:
Private S3 with public CloudFront is a better default security stance, and you're less likely to make several common mistakes - hence why you see this guidance all over the internet.
Hosting files over S3+CloudFront will, on average, reduce latency and increase download speed compared to S3 alone, even within the same region. There are many edge locations all over the world, interconnected by very high-speed links, and end users connecting via an edge location effectively take a shorter route to the origin S3 bucket than going directly to the regional S3 bucket over the public internet.
Using CloudFront will probably work out cheaper than S3 alone.
Flexibility - CloudFront can use multiple origins (buckets or load balancers) and serve different paths from different origins.
If you do go down the CF route (I recommend you do), you get many benefits for the extra effort.
Bear in mind that CF respects any caching headers associated with your objects in S3, or falls back to the defaults set on the CF distribution. Be careful setting long cache times on files - you can clear the CloudFront cache (called an invalidation), but end users' browsers that have already downloaded a file will also likely respect its cache headers (this is where "cache busting" query strings come in...).
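For concreteness, the bucket-policy side of the private-bucket + OAI setup mentioned above can be sketched with boto3 roughly as follows; the bucket name and OAI ID are hypothetical placeholders:

```python
import json
import boto3

s3 = boto3.client("s3")

# Hypothetical values - replace with your own bucket and OAI ID.
BUCKET = "my-static-assets-bucket"
OAI_ID = "E2EXAMPLE1OAI"  # CloudFront origin access identity ID

# Bucket policy that lets only the CloudFront OAI read objects, so browsers
# have to come through the distribution rather than hitting S3 directly.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontOAIReadOnly",
            "Effect": "Allow",
            "Principal": {
                "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {OAI_ID}"
            },
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```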

Related

Is it best practice to enable both CloudFront and S3 access logs?

We are implementing a static site hosted in S3 behind CloudFront with OAI. Both CloudFront and S3 can have logging enabled and I'm trying to determine whether it is necessary/practical to have logging enabled for both.
The only S3 access that would not come from CloudFront should be site deployments from our CI/CD pipeline, which may still be useful to log. For that exact reason it may also be useful for spotting any unintended access that did not come from CloudFront or deployments. The downside, of course, is that there would be two sets of access logs that mostly overlap, adding to the monthly bill.
We will need to make a decision on this, but curious what the consensus is out there.
Thanks,
John
If you are using CloudFront with an origin access identity (OAI), then your bucket can be private to the world.
Only the OAI, and any other users you explicitly allow, have read access; everyone else is denied access to the S3 bucket and the files inside it that host the static website.
That means users around the world must come through CloudFront, and direct access to the S3 bucket and its files is denied.
So if you have implemented it correctly, you do not need S3 access logging enabled.
However, the value of security only becomes clear once you face an incident, so weigh it up and make a decision.
References:
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-s3.html
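If you decide CloudFront logging alone is enough, here is a minimal boto3 sketch of turning standard logging on for an existing distribution; the distribution ID and log bucket are hypothetical placeholders:

```python
import boto3

cf = boto3.client("cloudfront")

# Hypothetical values - replace with your own distribution ID and log bucket.
DISTRIBUTION_ID = "E1EXAMPLEDIST"
LOG_BUCKET = "my-access-logs-bucket.s3.amazonaws.com"

# Fetch the current distribution config; the ETag is required for updates.
resp = cf.get_distribution_config(Id=DISTRIBUTION_ID)
config = resp["DistributionConfig"]
etag = resp["ETag"]

# Enable CloudFront standard (access) logging. With an OAI-locked bucket,
# this single log source covers all end-user traffic.
config["Logging"] = {
    "Enabled": True,
    "IncludeCookies": False,
    "Bucket": LOG_BUCKET,
    "Prefix": "cloudfront/",
}

cf.update_distribution(DistributionConfig=config, Id=DISTRIBUTION_ID, IfMatch=etag)
```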

give public access to s3 bucket rather than serving static website for aws serverless application

I am new to AWS serverless and I am trying to host a Django app there.
Right now the serverless setup uses an S3 bucket for static website hosting, which costs around $0.50 (I am in the free tier).
My question is: instead of hosting a static website, can I not just give public access to the S3 bucket, since that would save me money? Is it possible to use a public bucket for an AWS serverless application?
Yes, hosting static content on S3 is the most cost-effective way to serve content. I would suggest keeping your bucket private and putting a CloudFront distribution (CDN) in front of S3. That keeps a cache at the edge, close to your customers, and slightly lowers the outgoing bandwidth cost (CloudFront outgoing bandwidth is cheaper than S3: in the US, $0.085/GB vs $0.090/GB).
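As a rough back-of-the-envelope comparison (assuming the US prices quoted above, ignoring per-request charges, and picking a hypothetical 500 GB/month of traffic):

```python
# Rough data-transfer-out cost comparison using the US prices quoted above.
# The 500 GB/month figure is purely an illustrative assumption.
GB_PER_MONTH = 500

s3_rate = 0.090          # USD per GB, S3 -> internet
cloudfront_rate = 0.085  # USD per GB, CloudFront -> internet (US edges)

s3_cost = GB_PER_MONTH * s3_rate                   # 45.00
cloudfront_cost = GB_PER_MONTH * cloudfront_rate   # 42.50

print(f"S3 direct:  ${s3_cost:.2f}/month")
print(f"CloudFront: ${cloudfront_cost:.2f}/month")
print(f"Savings:    ${s3_cost - cloudfront_cost:.2f}/month")
```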
This article gives detailed instructions on how to do so: https://aws.amazon.com/blogs/networking-and-content-delivery/amazon-s3-amazon-cloudfront-a-match-made-in-the-cloud/
I explained the high-level steps on my blog too: https://www.stormacq.com/2018/10/17/migrated-to-serverless.html

Use S3 for website in multi-regions

I want to host my website on S3 (because it seems cheaper and I don't have any server-side scripts). I have a domain, and I want my domain to point to my S3 website. So far, what I have done is enable static website hosting on my S3 bucket and set a Route 53 record set's alias target to my S3 website endpoint. It's working, but it's not good enough - I want it to handle multiple regions.
I know that Transfer Acceleration can automatically sync files to other regions so access is faster from those regions, but I don't know how to make it work with Route 53. I hear that some people use CloudFront to do this, but I don't quite understand how, and I don't want to manually create buckets in several regions and set each one up by hand.
Do you have any suggestions for me?
If your goal is to reduce latency for users worldwide, then Amazon CloudFront is definitely the way to go.
Amazon CloudFront has over 100 edge locations globally, so it has more coverage than merely using AWS regions.
You simply create a CloudFront distribution, point it to your S3 bucket and then point your domain name to CloudFront.
Whenever somebody accesses your content, CloudFront will retrieve it from S3 and cache it in the edge location closest to that user. Then, other users who want the data will receive it from the local cache. Thus, your website appears fast for many users.
See also: Amazon CloudFront pricing
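For the "point your domain name to CloudFront" step, a minimal boto3 sketch could look like the following. The hosted zone ID, domain, and distribution domain name are hypothetical placeholders; Z2FDTNDATAQYW2 is the fixed hosted zone ID that CloudFront alias records use:

```python
import boto3

route53 = boto3.client("route53")

# Hypothetical values - replace with your own zone, domain, and distribution.
HOSTED_ZONE_ID = "Z1EXAMPLEZONE"
DOMAIN_NAME = "www.example.com."
CLOUDFRONT_DOMAIN = "d1234abcdef8.cloudfront.net."

# Create (or update) an alias A record pointing the domain at CloudFront.
route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": DOMAIN_NAME,
                    "Type": "A",
                    "AliasTarget": {
                        # Fixed hosted zone ID used for all CloudFront aliases.
                        "HostedZoneId": "Z2FDTNDATAQYW2",
                        "DNSName": CLOUDFRONT_DOMAIN,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    },
)
```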

With Amazon S3, can I prevent trolls/grievers from making millions of GET-requests with bots?

I'm working on a website that contains photo galleries, and these images are stored on Amazon S3. Since Amazon charges like $0.01 per 10k GET-requests, it seems that a potential troll could seriously drive up my costs with a bot that makes millions of page requests per day.
Is there an easy way to protect myself from this?
The simplest strategy would be to create randomized URLs for your images.
You can serve these URLs with your page information, but they cannot be guessed by a brute-forcer, so guessed requests will usually just lead to a 404.
So the URLs look something like yoursite/images/long_random_string
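For illustration, a small Python sketch of uploading objects under unguessable keys (the bucket name and key layout are hypothetical):

```python
import secrets
import boto3

s3 = boto3.client("s3")
BUCKET = "my-photo-gallery-bucket"  # hypothetical bucket name

def upload_with_random_key(local_path: str) -> str:
    """Upload an image under an unguessable key and return that key."""
    # 32 bytes of randomness -> roughly a 43-character URL-safe token.
    key = f"images/{secrets.token_urlsafe(32)}.jpg"
    s3.upload_file(local_path, BUCKET, key)
    return key

# Store the returned key with the gallery record so your pages can build
# the full image URL from it.
```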
Add the AWS CloudFront service in front of your S3 image objects, so requests are served from cached data at the edge locations.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/MigrateS3ToCloudFront.html
As @mohan-shanmugam pointed out, you should use a CloudFront CDN with the S3 bucket as your origin. It is considered bad practice for external entities to hit S3 buckets directly.
http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
With a CloudFront distribution, you can alter your S3 bucket's security policy to only allow access from the distribution. This will block direct access to S3 even if the URLs are known.
In reality, you would likely suffer from degraded website performance well before needing to worry about additional charges, as a direct DDoS attempt against S3 should result in AWS throttling API requests.
In addition, you can set up AWS WAF in front of your CloudFront distribution and use it for advanced control of security related concerns.
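For the WAF piece, a rate-based rule is the usual starting point against this kind of abuse. A minimal boto3 sketch (the web ACL name and the 2000-requests-per-5-minutes limit are illustrative assumptions):

```python
import boto3

# WAF rules for CloudFront must be created in us-east-1 with Scope="CLOUDFRONT".
waf = boto3.client("wafv2", region_name="us-east-1")

# Rate-based rule: block any single IP exceeding ~2000 requests per 5 minutes.
waf.create_web_acl(
    Name="gallery-rate-limit",  # hypothetical name
    Scope="CLOUDFRONT",
    DefaultAction={"Allow": {}},
    Rules=[
        {
            "Name": "rate-limit-per-ip",
            "Priority": 1,
            "Statement": {
                "RateBasedStatement": {"Limit": 2000, "AggregateKeyType": "IP"}
            },
            "Action": {"Block": {}},
            "VisibilityConfig": {
                "SampledRequestsEnabled": True,
                "CloudWatchMetricsEnabled": True,
                "MetricName": "RateLimitPerIP",
            },
        }
    ],
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "GalleryRateLimitACL",
    },
)
# The returned web ACL ARN can then be associated with the CloudFront distribution.
```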

Websites hosted on Amazon S3 loading very slowly

I have an application which is a static website builder. Users can create their websites and publish them to their custom domains. I am using Amazon S3 to host these sites and an nginx proxy server to route requests to the S3 bucket hosting the sites.
I am facing a load time issue. As S3 specifically is not associated with any region and the content is entirely HTML, there shouldn't ideally be any delay. I have a few CSS and JS files which are not too heavy.
What optimization techniques could give better performance? E.g., will setting headers or leveraging caching help? I have added an image of a Pingdom analysis for reference.
Also, I cannot use CloudFront because when a user updates an image, the edge locations take a few minutes before the new image is reflected. It is not an instant update, which rules it out for me. Any suggestions on improving this?
S3 HTTPS access from a different region is extremely slow, especially the TLS handshake. To solve the problem we built an Nginx S3 proxy, which can be found on the web. S3 is great as an origin source but not as a transport endpoint.
By the way, try to avoid addressing your bucket/"folder" as a subdomain; specify the S3 regional(!) endpoint URL instead, using the long path-style form of the endpoint URL, and never use https://s3.amazonaws.com
A good example that reduces the number of DNS calls is the following:
https://s3-eu-west-1.amazonaws.com/folder/file.jpg
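A minimal boto3 sketch of forcing that style of addressing from Python (the bucket and region are taken from the example URL above and may need adjusting):

```python
import boto3
from botocore.config import Config

# Point the client at the bucket's own regional endpoint and use path-style
# addressing, matching the long regional URL form shown above.
s3 = boto3.client(
    "s3",
    region_name="eu-west-1",
    endpoint_url="https://s3-eu-west-1.amazonaws.com",
    config=Config(s3={"addressing_style": "path"}),
)

# Produces a URL of the form https://s3-eu-west-1.amazonaws.com/folder/file.jpg
# (with the signature appended as query parameters).
url = s3.generate_presigned_url(
    "get_object", Params={"Bucket": "folder", "Key": "file.jpg"}, ExpiresIn=3600
)
print(url)
```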
Your S3 buckets are associated with a specific region that you can choose when you create them. They are not geographically distributed. Please see AWS doc about S3 regions: https://aws.amazon.com/s3/faqs/
As we can see in your screenshot, it looks like your bucket is located in Singapore (ap-southeast-1).
Are your clients located in Asia? If they are not, you should try to create buckets nearer, in order to reduce data access latency.
About CloudFront: it should still be possible to use it if you invalidate your objects after each update, or simply use a new filename for each modification, as tedder42 suggested.
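For reference, an invalidation after an update is a single boto3 call; the distribution ID and path below are hypothetical placeholders:

```python
import time
import boto3

cf = boto3.client("cloudfront")

# Hypothetical distribution ID and object path - replace with your own.
cf.create_invalidation(
    DistributionId="E1EXAMPLEDIST",
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/updated-photo.jpg"]},
        # CallerReference must be unique per invalidation; a timestamp works.
        "CallerReference": str(time.time()),
    },
)
```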