I have a Python/Flask application that I've deployed on Elastic Beanstalk. I have been deploying updates via git aws.push, which includes my static JS libraries, CSS, and images.
I now have about 1 GB of static content in the form of images. I want to serve that content from the same location as my application, that is, from the same place I was serving them before, in a /static/img/ folder. However, I obviously don't want to add the images to source control or deploy them with the git macro.
Ideally, I would like to connect to the instance where the files are hosted and upload them manually. However, I do not know how to do this. I have searched through the S3 bucket associated with the Elastic Beanstalk app, but there is no sign of my app there, only a repository of zipped deployments.
I could create a new bucket and handle things that way, but I haven't been able to map a domain to a new bucket. Whenever I try to add a CNAME record to the bucket, it is rejected because "URL/IP cannot be added as a CNAME." In any case, the process that seems most intuitive is to manually put unversioned static content in place next to versioned, deployed code.
You're correct: this type of static content should not be part of your repository, and certainly not stored on an EC2 instance's volumes.
AWS' best practice for this use case is to use S3 and link directly to the S3 objects from your HTML code. S3 is an object storage service with native HTTP support.
In order to use S3 as a web server, you must first create a bucket on S3.
You can then either use the S3-provided URL <bucket-name>.s3-website-<AWS-region>.amazonaws.com to link to your content from your web pages,
or you can use your own domain name. In that case, your bucket must be named after your domain, and you must enable the "Website Hosting" option at the bucket level. This is required to let S3 know how to map HTTP requests to buckets.
A high-level scenario is described here: http://docs.aws.amazon.com/gettingstarted/latest/swh/website-hosting-intro.html
And more details are provided by S3 documentation.
As an added benefit, storage in S3 costs less money than EBS storage.
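As a concrete illustration, the S3-provided website URL mentioned above can be derived mechanically from the bucket name and region. A minimal sketch (the bucket name and region below are hypothetical examples):

```python
def website_endpoint(bucket: str, region: str) -> str:
    """Build the S3-provided static website URL for a bucket."""
    # Format: <bucket-name>.s3-website-<AWS-region>.amazonaws.com
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"

# Hypothetical bucket named after a custom domain, as described above;
# a CNAME record for img.example.com would point at this endpoint.
print(website_endpoint("img.example.com", "us-east-1"))
# → http://img.example.com.s3-website-us-east-1.amazonaws.com
```

This is why the bucket must be named after your domain: the CNAME target embeds the bucket name, and S3 uses the Host header to find the right bucket.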
The backend has been built using Strapi (3.1.3), with the Media Library installed for file uploads. The app has been deployed on Heroku, and since Heroku uses an ephemeral filesystem, the uploaded files do not persist across re-deployments or dyno restarts.
To overcome this I am using "strapi-provider-upload-aws-s3", which stores the uploaded files in an S3 bucket, which is perfect.
A bucket policy lets us restrict read access to a certain list of IP addresses, but how do I restrict upload access to a certain domain? Using IP addresses to grant upload access is not possible, because Heroku manages the dynos and their IP addresses keep changing.
Can I use a domain name instead of an IP address?
Any help would be great
Thanks
I have created an S3 bucket and completed the steps to enable static website hosting on it.
I have verified it works by going to the URL, which looks something like the following: https://my-bucket.s3.aws.com
I want to put my web assets in a subfolder now, so I put them in a folder I called foobar.
Now if I want to access them, I have to explicitly enter the URL as follows:
https://my-bucket.s3.aws.com/foobar/index.html
So my question is: do I need to use some other service such as CloudFront so that I can reach the content with the URL https://my-bucket.s3.aws.com/foobar instead? That is, I don't want to have to explicitly say index.html at the end.
You can't do this with a CloudFront default root object for a subfolder. The documentation says:
However, if you define a default root object, an end-user request for a subdirectory of your distribution does not return the default root object. For example, suppose index.html is your default root object and that CloudFront receives an end-user request for the install directory under your CloudFront distribution:
http://d111111abcdef8.cloudfront.net/install/
CloudFront does not return the default root object even if a copy of index.html appears in the install directory.
But that same page also says:
The behavior of CloudFront default root objects is different from the behavior of Amazon S3 index documents. When you configure an Amazon S3 bucket as a website and specify the index document, Amazon S3 returns the index document even if a user requests a subdirectory in the bucket. (A copy of the index document must appear in every subdirectory.) For more information about configuring Amazon S3 buckets as websites and about index documents, see the Hosting Websites on Amazon S3 chapter in the Amazon Simple Storage Service Developer Guide.
So check out that referenced guide, and the section on Configuring an Index Document in particular.
We are running a static website that is deployed by CI automatically to a public S3 bucket. The website is a Jekyll site with multiple folders. We are very happy with the setup because of the ease of deployment and the lack of infrastructure.
But we now have traffic to our website, and we want to add a staging phase.
This phase should be reachable by selected non-technical people from known IPs. We are not able to achieve this using an S3 bucket, as the bucket needs to be public.
So we are looking for a way to deploy the static website with a staging area that is not public. Is this possible with an AWS service or another cloud offering?
The first part is relatively easy: just set up another bucket, deploy there for staging, and deploy from there to your production bucket to go live.
The second part turns out to be straightforward too: you can specify a policy on an S3 bucket that restricts access to an IP range. See the example here: http://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html#example-bucket-policies-use-case-3
Personally, I'd suggest that a login-based restriction would be better if at all possible (the person who needs to sign off being out of the office is a classic example of where IP address restrictions get you into trouble). Either way, you have sufficiently fine-grained control over S3 bucket permissions to do what you need.
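A sketch of such an IP-restricted read policy, built with Python's json module (the bucket name and CIDR range below are placeholders for your own values):

```python
import json

def ip_restricted_read_policy(bucket: str, allowed_cidrs: list[str]) -> str:
    """Return bucket policy JSON allowing object reads only from known IP ranges."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromKnownIPs",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Requests from outside these CIDR ranges are denied by default
                "Condition": {"IpAddress": {"aws:SourceIp": allowed_cidrs}},
            }
        ],
    }
    return json.dumps(policy, indent=2)

# Placeholder staging bucket and office IP range:
print(ip_restricted_read_policy("my-staging-bucket", ["203.0.113.0/24"]))
```

Apply the output with `aws s3api put-bucket-policy --bucket my-staging-bucket --policy file://policy.json`.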
We solved this issue by having subfolders in the S3 bucket with unguessable names. The names allow the subfolders to be publicly available while acting as a shared secret password to the static website. Every pull request gets automatically deployed to a subfolder of this bucket.
Example:
s3-staging-bucket
└ ed567c0e-dca9-44fc-b1bc-18ed5237f598/
  └ index.html
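The unguessable folder names can be generated with a random UUID per deployment. A minimal sketch (the bucket URL below is a placeholder):

```python
import uuid

# Placeholder website endpoint for the staging bucket
STAGING_BUCKET_URL = "http://s3-staging-bucket.s3-website-us-east-1.amazonaws.com"

def staging_url() -> str:
    """Return the shared-secret URL for one pull-request deployment."""
    prefix = str(uuid.uuid4())  # e.g. ed567c0e-dca9-44fc-b1bc-18ed5237f598
    # The CI job would sync the built site into <bucket>/<prefix>/ here,
    # then post this URL on the pull request for reviewers.
    return f"{STAGING_BUCKET_URL}/{prefix}/index.html"

print(staging_url())
```

A UUID4 has 122 random bits, so the prefix is effectively unguessable; just make sure bucket listing is disabled, or the "secret" folder names become enumerable.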
I am making an app that uses S3 to store a lot of user media which I then display to the user at a later time. I am trying to figure out the best and most secure way to accomplish this.
I read that storing the media on S3 and then using the S3 URL to load it might be a bad choice, because it can expose information that you might not want out in the open. Is it right to download all media from S3 to the server before loading it on a page? That seems like it defeats the purpose of S3 in the first place, if I have to keep downloading media from there in order to display it.
What is the best practice for accomplishing this?
I would appreciate a little advice.
Thanks!
There are many different ways to use S3. There isn't a single "best practice".
Serve all content through web server:
In this scenario, S3 is used simply as a storage medium. Content is requested through the web server, which downloads it from S3 and forwards it to the client.
In this scenario, the client is not aware of the S3 bucket.
This does not defeat the purpose of S3 because the purpose of S3 (in this scenario) is content storage, not delivery.
Serve content from a public S3 bucket:
In this scenario, you setup your S3 bucket to serve up the content directly. In this case, all of the content is public, so direct linking to the content from the web app is used. All content in the S3 bucket can be downloaded by anyone.
The bucket can be referenced as <bucket>.s3-website-<region>.amazonaws.com, or under your own domain.
This scenario has the benefit that it offloads the delivery of the content from your web server.
Serve content from a private S3 bucket:
In this scenario, your bucket is private, but you still serve up the content directly. Using this system, you create expiring pre-signed URLs to protect the private content. The content is downloaded directly from S3, but not all content can be downloaded by everyone.
Like the previous scenario, this scenario has the benefit that it offloads the delivery of the content from your web server.
CloudFront:
You can use CloudFront in front of your app and/or S3 buckets to do any of the following:
cache the content, speeding up global delivery,
protect your web server, in conjunction with WAF
Final thoughts:
The setup you choose depends on your application.
I have a Java application deployed on Elastic Beanstalk (Tomcat), and the purpose of the application is to serve resources from S3 in zipped bundles. For instance, I have 30 audio files that I zip up and return in the response.
I've used the getObject request from the AWS SDK; however, it's super slow, and I assume it's requesting each object over the network. Is it possible to access the S3 resources directly? The bucket with my resources is located next to the Beanstalk bucket.
Transfer from S3 to EC2 is fast if they are in the same region.
If you still want faster (and more reliable) delivery of files, consider keeping the files pre-zipped on S3 and serving them from S3 directly rather than going through your server. You can use a signed-URL scheme here, so that the bucket need not be public.
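The pre-zipping step can be done once at upload time with nothing but the standard library. A sketch (the file names and byte contents below are made up):

```python
import io
import zipfile

def bundle(files: dict[str, bytes]) -> bytes:
    """Zip a set of named files into one in-memory archive, ready to upload to S3."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

# Made-up audio payloads; in practice these would be your 30 audio files.
archive = bundle({"track01.mp3": b"...", "track02.mp3": b"..."})
# Upload `archive` once (e.g. via put_object) and serve that single key
# directly from S3, instead of fetching 30 objects per request.
```

This turns 30 getObject round trips per request into zero: the zip is built once and S3 delivers it as a single object.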
The next level of speed-up is to keep S3 behind CloudFront as an origin server. There, files are cached at locations near your users. See Serving Private Content through CloudFront.