What is the best practice to restrict data in the cloud - amazon-web-services

I tried to use DigitalOcean Spaces, which is similar to AWS S3, to let certain users view a file such as a video. I can only give them a custom link (to a single file, not a whole directory) that is valid for a defined period of time.
I would like to know the best practice in the cloud for sharing files privately with registered users only.

You can create a private S3 bucket and, using the SDK, create a pre-signed URL for a file and set the expiry time of the link.
https://docs.aws.amazon.com/AmazonS3/latest/dev/ShareObjectPreSignedURL.html

Related

How to safely share files via amazon S3 bucket

I need to share ~10K files with ~10K people (one-to-one matching) and I would like to use Amazon S3 for this (by giving public access to these files).
The caveat is that I do not want anyone to be able to download all these files together. What are the right permissions for this?
Currently, my plan is:
Create non-public buckets (foo)
Name each file with a long string so one cannot guess the link (bar)
Make all files public
Share links of the form https://foo.s3.amazonaws.com/bar
It seems that by having a non-public bucket, I ensure that no one can list files in my bucket and hence won't be able to guess names of the files inside. Is it correct?
I would approach this using pre-signed URLs, as they let you grant access at the object level even when your bucket and objects are kept private. This means that the only way to access an object in the bucket is by using the link you provide to each individual user.
Therefore, to follow best practice, you should block all public access and make all objects private. This will prevent anyone from listing bucket objects.
To automate this, you could upload the files naming them after each user, or some other identifying string like an ID number. You can then use some kind of loop to generate a pre-signed URL for each file, giving the user a limited time to retrieve it without granting them access to the bucket as a whole.
I use bash, so that's the example I'll give, but there's probably a similar PowerShell solution for this too.
The easiest way to do this would be with the aws-cli:
aws s3 presign s3://<YOUR-BUCKET-NAME>/<userIdNumber>.file \
--expires-in 604800
Put all of your user IDs, or whatever you've used to identify your users' files, in a text file and loop over them with bash to generate all your pre-signed URLs like so:
Contents of users.txt:
user1
user2
user3
user4
user5
The loop:
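# Read one user ID per line; this avoids the word-splitting pitfalls of $(cat ...)
while IFS= read -r user; do
  echo "$user"
  # 604800 seconds = 7 days
  aws s3 presign "s3://my-bucket/$user.file" --expires-in 604800
done < users.txt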
This should spit out a list of usernames with a url below each user. Just send the link to each user and they will be able to get their document.

AWS S3 filename

I’m trying to build an application with a Java backend that allows users to create a text with images in it (something like a personal blog). I’m planning to store these images in an S3 bucket. When uploading image files to the bucket, I hash the original name and store the file under the hashed name. Images are for display purposes only; no user will be able to download them. The frontend displays these images by getting a path to them from the server. So the question is: is there any need to store the original name of the image file in the database? And what are the reasons, if any, for doing so?
I guess in general it is not needed because what is more important is how these resources are used or managed in the system.
Assuming your service is something like data access (similar to google drive), I don't think it's necessary to store it in DB, unless you want to make faster search queries.

how to restrict google cloud storage upload

I have a mobile application that uses Google Cloud Storage. The application allows each registered user to upload a specific number of files.
My question is, is there a way to do some kind of checks before the storage upload? Or do I need to implement a separate reservation API of sorts that OKs an upload step?
Any alternative suggestions are welcome too, of course.
warning: Not an authoritative answer. Happy to accept removal or update requests.
I am not aware of any GCS or Firebase Cloud Storage mechanisms that will inherently limit the number of files (objects) that a given user can create. If it were me, this is how I would approach the puzzle.
I would create a database (eg. Firestore / Datastore) that has a key for each user and a value which is the number of files they have uploaded. When a user wants to upload a new file, the app would first make a REST call to a Cloud Function that I would write. This Cloud Function would implicitly know the identity of the calling user. It would look up the record in the database and determine whether the user is allowed to upload a new file. If not, it returns an error and that is the end of the story. If so, it increments the value in the database and then creates a GCS "signed URL" that permits an upload. The Cloud Function returns that signed URL, and the app that wishes to upload can use it to perform the actual upload.
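As a rough illustration of the check-then-increment step only, here is a minimal shell sketch. It is not how a Cloud Function would be written: a directory of per-user counter files stands in for the Firestore / Datastore record, and the function name `reserve_upload_slot`, the `.count` file layout, and the limit argument are all hypothetical. It is also not atomic; a real implementation would want the read, check, and increment to happen inside a database transaction.

```shell
# Hypothetical sketch: counter files stand in for the per-user database record.

# reserve_upload_slot COUNTS_DIR USER_ID MAX_FILES
# Returns 0 and increments the user's counter if they are under the limit;
# returns 1 and leaves the counter untouched otherwise.
reserve_upload_slot() {
  local counts_dir="$1"
  local user_id="$2"
  local max_files="$3"
  local counter_file="$counts_dir/$user_id.count"
  local current=0

  # A missing counter file means the user has not uploaded anything yet.
  if [ -f "$counter_file" ]; then
    current=$(cat "$counter_file")
  fi

  if [ "$current" -ge "$max_files" ]; then
    echo "upload limit reached for $user_id" >&2
    return 1
  fi

  # Record the reservation before any signed URL would be handed out.
  echo $((current + 1)) > "$counter_file"
}
```

Only when this step succeeds would the function go on to create and return the signed URL; on failure it would return the error to the app instead.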
I would also add metadata to each file uploaded to identify the logical uploader (user) of the file. That can be then used for reconciliation if needed. We could examine all the files in the bucket and re-build the database of how many files each user had uploaded.
A possible alternative to this story is for the Cloud Function to not return a signed-url but instead receive the data to be uploaded in the same request. If the check on number of files passes, then the Cloud Function could be a proxy to a GCS write to create the file directly. This alternative needs to be carefully examined as a function of the sizes of the files to be uploaded. If the files are large this may be a very poor solution. We want to be in and out of Cloud Functions as quickly as possible and holding a Cloud Function "around" to service data pass through isn't great. We may want to look at Cloud Run in that case as it supports concurrency in the instance without increasing the cost per call.

What is the correct way to set up S3 for loading content in the browser?

I want to do the following: a user in a browser types some text and after he presses a 'Save' button, the text should be saved in a file (for example: content.txt) in a folder (for example: /username_text) on the root of an S3 bucket.
Also, I want the user to be able to load the content from S3 when they visit the same page again and continue working on the file. Then, when they are done, save the file to S3 again.
Probably important to mention, but I plan on using NodeJS for my back-end...
My question now is: What is the best way to set this storing-and-retrieving thing up? Do I create an API gateway + Lambda function to GET and POST files through that? Or do I for example use the aws-sdk in Node to directly push and pull files from S3? Or is there a better way to do this?
I looked at the following two guides:
Using AWS S3 Buckets in a NodeJS App – Codebase – Medium
Image Upload and Retrieval from S3 Using AWS API Gateway and Lambda
Welcome to StackOverflow!
I think you are worrying too much about the not-so-important stuff. S3 is nothing but a storage system. You could have decided to store the content of these files in DynamoDB, RDS, etc. What would you do if you stored the contents in one of these real databases? You'd fetch the data and display it to the user, wouldn't you?
This is what you need to do with S3! S3 is a smart choice in your scenario because your "file" can grow very big and S3 is a great place for storing files. However, apparently, you're not actually storing files (think of .pdf, .mp4, .mov, etc.); you're essentially only storing human-readable text.
So here's one approach on how to solve your problem:
FETCHING FILE CONTENT
User logs in
You fetch the user's personal information based on some token. You can store all the metadata in DynamoDB, where given a user_id, fetch all the "files" from this user. These "files" (metadata only) would be the bucket and key for the actual file on S3.
You use the getObject API from S3 to fetch the file based on your query and display the body of your file to your user in a RESTful way. Your response should look something like this:
{
  "content": "some content"
}
SAVING FILE CONTENT
User logs in
The user writes anything in a form and submits it. In your Lambda function, you grab the content of this form and process it. This request should look something like this:
{
  "file_id": "some-id",
  "user_id": "some-id",
  "content": "some-content"
}
If the file_id exists, update the content in S3. Otherwise, upload a new file to S3 and then create a new entry in DynamoDB. You would also, of course, have to check that the user submitting the changes actually owns the file. If you're using UUIDs this is unlikely to be a problem, but it is still worth checking in case an ID is leaked somehow.
This way, you don't need to worry about uploading/downloading files, as these are CPU-intensive tasks, so you can keep your costs low and use very little RAM in your functions (128MB should be more than enough); after all, you're now only serving text. Not only will this simplify your design, it will also make things simpler both in API Gateway and in your code, as you won't have to deal with binary types. The most you'll do is convert the buffer from S3 to a String when serving some content, but this should be completely fine.
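To make the read/write round trip concrete, here is a small sketch that uses the AWS CLI from a shell rather than the Lambda and getObject setup described above, so treat it as an illustration of the data flow and not as the suggested implementation. The helper names are hypothetical, and the key layout (`<username>_text/content.txt`) is taken from the question; the bucket name is whatever you pass in.

```shell
# Hypothetical helpers; the bucket and user names are supplied by the caller.

# content_key USERNAME
# Prints the object key for that user's text file.
content_key() {
  printf '%s_text/content.txt' "$1"
}

# save_content BUCKET USERNAME TEXT
# Pipes TEXT to S3; "-" as the source tells `aws s3 cp` to read stdin.
save_content() {
  printf '%s' "$3" | aws s3 cp - "s3://$1/$(content_key "$2")"
}

# fetch_content BUCKET USERNAME
# Streams the object to stdout; "-" as the destination means stdout.
fetch_content() {
  aws s3 cp "s3://$1/$(content_key "$2")" -
}
```

Because only plain text moves through these calls, nothing here needs to touch the local filesystem, which mirrors the point about not having to deal with binary types.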
EDIT
On your question regarding whether you should upload it from the browser or not, I suggest you take a look into this answer where I cover the pros/cons of doing it via API Gateway vs from the Browser.

S3 bucket policy to list multiple objects in public bucket

I have set up a public bucket in S3 and copied multiple objects into it. In this case they are jpeg photos.
I want to share all these objects with anonymous public users (friends), but I want to send them one static website address for the bucket and for the objects to show up as a list (or at least show all the images) when they click on that one address link.
Is this possible to display the objects this way using S3 to public users who don't have an S3 account?
The alternative I know of is to send them a unique link to each of the objects in the bucket (which would take forever!).
Any advice would be helpful.
S3 doesn't have anything built-in to do a "directory index" like nginx and Apache can do. It can be done with AWS Lambda, though.
I built a rudimentary image index with lambda, you might be able to adapt it to solve your problem.
Yes.
You can host a static web page inside an S3 bucket: http://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteHosting.html
Just generate a static HTML page with links to all the photos, upload it to the bucket, set the bucket to serve as a static website, and give out the link to it.
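One way to generate such a page is sketched below. The function name `make_index` and the file names in the usage comment are hypothetical, and the sketch assumes the object keys contain no characters that need HTML or URL escaping. The links are relative, so the page is assumed to sit in the same bucket (and at the same level) as the photos.

```shell
# make_index: reads object keys (one per line) from stdin and prints a
# minimal HTML page with one relative link per key.
make_index() {
  echo '<!DOCTYPE html>'
  echo '<html><body><ul>'
  while IFS= read -r key; do
    # Skip blank lines in the input.
    [ -n "$key" ] || continue
    printf '  <li><a href="%s">%s</a></li>\n' "$key" "$key"
  done
  echo '</ul></body></html>'
}

# Example usage (hypothetical file names):
#   printf '%s\n' beach.jpg party.jpg | make_index > index.html
#   aws s3 cp index.html "s3://<your bucket name>/index.html"
```

The page has to be regenerated and re-uploaded whenever photos are added or removed, which is the trade-off against a script that lists the bucket dynamically.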
Or, for the extra lazy :) https://github.com/rgrp/s3-bucket-listing
Thanks for your answers, they helped me to find a really simple solution. On a different forum I found that someone has written a script, available at the link below, that you upload straight into your bucket and that puts all the objects into a simple list... genius!
This is the link:
http://regexp.s3.amazonaws.com/list.html
So for the less techy people (like me) you literally upload that link above into your bucket. Even if you haven't downloaded it onto your PC, just copy and paste it into the upload file path.
When I uploaded it, the file appeared in the S3 bucket as list.html
Make sure the file is readable and you've set the ACL appropriately. And make sure your bucket has a policy that allows anyone to access it.
Your bucket's objects (content) are then shown at the URL below.
http://<your bucket name>.s3.amazonaws.com/list.html
Where <your bucket name> is written above, replace that part with just the name of your bucket.
And you should be able to click on that link and see the list of objects in your bucket. Once you get your head around it, it is actually very simple.