allow open access to S3 bucket with certain restrictions

allow open access to S3 bucket with certain restrictions - amazon-web-services

How do I set permissions in such a way that anyone can upload files to my bucket?
Here is an example that has these 3 features:
I can upload any file and download my file from anywhere.
But I am not able to download files uploaded by others.
However, I can delete files uploaded by others.
I will like to know how this bucket (abc) was set up and who owns it.
1) I can upload:
[root#localhost ~]# aws s3 cp test.txt s3://abc/
upload: ./test.txt to s3://abc/test.txt
2) I can list contents:
[root#localhost ~]# aws s3 ls s3://abc | head
PRE doubleverify-iqm/
PRE folder400/
PRE ngcsc/
PRE out/
PRE pd/
PRE pit/
PRE soap1/
PRE some-subdir/
PRE swoo/
2018-06-15 12:06:27 2351 0Sw5xyknAcVaqShdROBSfCfa7sdA27WbFMm4QNdUHWqf2vymo5.json
3) I can download my file from anywhere:
[root#localhost ~]# aws s3 cp s3://abc/test.txt .
download: s3://abc/test.txt to ./test.txt
4) But not able to download other's file
[root#localhost ~]# aws s3 cp s3://abc/zQhAqmwIUfIeDnEEHpiaGhXuERgO3bR84jkjhbei1aLiV1758t.json .
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
5) however, I can delete the file not uploaded by me:
[root#localhost ~]# aws s3 rm s3://abc/zQhAqmwIUfIeDnEEHpiaGhXuERgO3bR84jkjhbei1aLiV1758t.json
delete: s3://abc/zQhAqmwIUfIeDnEEHpiaGhXuERgO3bR84jkjhbei1aLiV1758t.json
I am not sure how to set-up such a bucket.

It is not advisable to setup a bucket in this manner.
The fact that anyone can upload to the bucket means that somebody could store, potentially, TBs of data and you would be liable for the cost. For example, somebody could host large video files, using your bucket for free storage and bandwidth.
Similarly, it is not good security practice to grant permissions for anyone to list the contents of your bucket. They might find sensitive data that was not intended to be released.
It would also be unwise to allow anyone to delete objects from your bucket, because somebody could delete everything!
There are two primary ways to grant access to objects:
Bucket Policy
A Bucket Policy can grant permissions on the whole bucket, or specific paths within a bucket. For example, granting GetObject to the whole bucket means that anyone can download any object.
See: Bucket Policy Examples - Amazon Simple Storage Service
Object-level permissions
Basic permissions can also be granted on a per-object basis. For example, when an object is copied to a bucket, the Access Control List (ACL) can specify who can access the object.
For example, this would grant ownership of the object to the bucket owner:
aws s3 cp foo.txt s3://my-bucket/foo.txt --acl bucket-owner-full-control
If the --acl is excluded, then the object 'belongs' to the identity that uploaded the file, which is why you were download your own file. This is not recommended, because it could lead to a situation where the bucket owner cannot access (and potentially cannot even delete!) the object.
Bottom line: Think about your security before implementing rules that grant other people, or anyone, permissions on your buckets.

Related

Copy objects between buckets using aws sdk

Using AWS CLI; we can copy or sync files directly from one bucket to other. Using SDK; I can see api for download and upload. But can we directly copy files from one bucket to other bucket ( in different aws account) using sdk !

Yes. The CopyObject API call can copy an object between Amazon S3 buckets, including bucket in different regions and different accounts.
To copy objects between accounts, the one set of credentials requires sufficient permission to Read from the source bucket and Write to the destination bucket. You can either:
Use credentials from the destination account, and use a Bucket Policy on the source bucket to grant Read access, or
Use credentials from the source account, and use a Bucket policy on the destination bucket to grant Write access. Make sure you set ACL=public-read to pass ownership of the object to the destination Account.
Please note that it only copies one object at a time, so you would need to loop through a list of objects and call CopyObject for each one individually if you wish to copy multiple objects.

It's easy, see all the CLI commands with the help:
aws s3 --help
Upload a file:
aws s3 cp <path-to-file-from-local> s3://<S3_BUCKET_NAME>/<file-name>
Download a file:
aws s3 cp s3://<S3_BUCKET_NAME>/<file-name> <path-to-file-from-local>
Move a file:
aws s3 mv s3://<S3_BUCKET_NAME>/<file-name> s3://<S3_BUCKET_NAME>/<file-name>
You can use . to specify the current directory, eg:
aws s3 cp s3://MyBucket/Test.txt .

AWS S3 sync creates objects with different permissions than bucket

I'm trying to use an S3 bucket to upload files to as part of a build, it is configured to provide files as a static site and the content is protected using a Lambda and CloudFront. When I manually create files in the bucket they are all visible and everything is happy, but when the files are uploaded what is created are not available, resulting in an access denied response.
The user that's pushing to the bucket does not belong in the same AWS environment, but it has been set up with an ACL that allows it to push to the bucket, and the bucket with a policy that allows it to be pushed to by that user.
The command that I'm using is:
aws s3 sync --no-progress --delete docs/_build/html "s3://my-bucket" --acl bucket-owner-full-control
Is there something else that I can try that basically uses the bucket permissions for anything that's created?

According to OP's feedback in the comment section, setting Object Ownership to Bucket owner preferred fixed the issue.

cross-account-access between 3 AWS accounts using assumeRule

We have a service in AWS-Account-A which will copy some files with ACL: 'bucket-owner-full-control' to a s3 bucket in AWS-Account-B . Now there is a AWS-Account-C which already have a assumeRule ( which a S3 Read access policy is attached to it ) from AWS-Account-B, and S3 bucket policy already gave read access to AWS-Account-C rules, So the problem is, AWS-Account-C : Can't read those files which uploaded from AWS-Account-A and only CAN read files which uploaded using AWS-Account-B itself.
I know it's a reallay compliated secnario, but as far as I understand, it's a ownership problem. The bucket policy applies only to objects that are owned by the bucket owner, So it's like , X own some files, and he copy it to Y, now Z can't get it from Y, because it's not owned by Y.
If anyone faced to this kind of sencarios before and have solution, I really appreciate it to give some guidance.

Your problem is that you used Account-A to copy files to a bucket owned by Account-B but now the copied files are owned by Account-A. This is why Account-C cannot access them. Account-C does not have the required permission.
The correct procedure is to create a role in Account-B to be assumed by Account-A. Then before Account-A copies file to the bucket in Account-B, it assumes the Account-B role. Now files copied to the bucket will be owned by Account-B.
For the files currently in the Account-B bucket, while using Account-B's credentials do an inplace copy. This will switch the ownership to Account-B.
Here is an example inplace copy. Note: No data is transferred over the internet just within S3 so it executes quickly.
aws s3 cp s3://mybucket/mykey s3://mybucket/mykey --storage-class STANDARD
The '--recursive' argument to apply to an entire folder of keys.
Warnings:
1) All custom metadata and existing permissions will be lost.
2) Ensure you have backups of your data prior to executing a command such as this.

Amazon S3 File Permissions, Access Denied when copied from another account

I have a set of video files that were copied from one AWS Bucket from another account to my account in my own bucket.
I'm running into a problem now with all of the files where i am receiving Access Denied errors when I try to make all of the files public.
Specifically, I login to my AWS account, go into S3, drill down through the folder structures to locate one of the videos files.
When I look at this specificfile, the permissions tab on the files does not show any permissions assigned to anyone. No users, groups, or system permissions have been assigned.
At the bottom of the Permissions tab, I see a small box that says "Error: Access Denied". I can't change anything about the file. I can't add meta-data. I can't add a user to the file. I cannot make the file Public.
Is there a way i can gain control of these files so that I can make them public? There are over 15,000 files / around 60GBs of files. I'd like to avoid downloading and reuploading all of the files.
With some assistance and suggestions from the folks here I have tried the following. I made a new folder in my bucket called "media".
I tried this command:
aws s3 cp s3://mybucket/2014/09/17/thumb.jpg s3://mybucket/media --grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=my_aws_account_email_address
I receive a fatal error 403 when calling the HeadObject operation: Forbidden.

A very interesting conundrum! Fortunately, there is a solution.
First, a recap:
Bucket A in Account A
Bucket B in Account B
User in Account A copies objects to Bucket B (having been granted appropriate permissions to do so)
Objects in Bucket B still belong to Account A and cannot be accessed by Account B
I managed to reproduce this and can confirm that users in Account B cannot access the file -- not even the root user in Account B!
Fortunately, things can be fixed. The aws s3 cp command in the AWS Command-Line Interface (CLI) can update permissions on a file when copied to the same name. However, to trigger this, you also have to update something else otherwise you get this error:
This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.
Therefore, the permissions can be updated with this command:
aws s3 cp s3://my-bucket/ s3://my-bucket/ --recursive --acl bucket-owner-full-control --metadata "One=Two"
Must be run by an Account A user that has access permissions to the objects (eg the user who originally copied the objects to Bucket B)
The metadata content is unimportant, but needed to force the update
--acl bucket-owner-full-control will grant permission to Account B so you'll be able to use the objects as normal
End result: A bucket you can use!

aws s3 cp s3://account1/ s3://accountb/ --recursive --acl bucket-owner-full-control

To correctly set the appropriate permissions for newly added files, add this bucket policy:
[...]
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::123456789012::user/their-user"
},
"Action": [
"s3:PutObject",
"s3:PutObjectAcl"
],
"Resource": "arn:aws:s3:::my-bucket/*"
}
And set ACL for newly created files in code. Python example:
import boto3
client = boto3.client('s3')
local_file_path = '/home/me/data.csv'
bucket_name = 'my-bucket'
bucket_file_path = 'exports/data.csv'
client.upload_file(
local_file_path,
bucket_name,
bucket_file_path,
ExtraArgs={'ACL':'bucket-owner-full-control'}
)
source: https://medium.com/artificial-industry/how-to-download-files-that-others-put-in-your-aws-s3-bucket-2269e20ed041 (disclaimer: written by me)

In case anyone trying to do the same but using Hadoop/Spark job instead of AWS CLI.
Step 1: Grant user in Account A appropriate permissions to copy
objects to Bucket B. (mentioned in above answer)
Step 2: Set the fs.s3a.acl.default configuration option using Hadoop Configuration. This can be set in conf file or in program:
Conf File:
<property>
<name>fs.s3a.acl.default</name>
<description>Set a canned ACL for newly created and copied objects. Value may be Private,
PublicRead, PublicReadWrite, AuthenticatedRead, LogDeliveryWrite, BucketOwnerRead,
or BucketOwnerFullControl.</description>
<value>$chooseOneFromDescription</value>
</property>
Programmatically:
spark.sparkContext.hadoopConfiguration.set("fs.s3a.acl.default", "BucketOwnerFullControl")

by putting
--acl bucket-owner-full-control
made it to work.

I'm afraid you won't be able to transfer ownership as you wish. Here's what you did:
Old account copies objects into new account.
The "right" way of doing it (assuming you wanted to assume ownership on the new account) would be:
New account copies objects from old account.
See the small but important difference? S3 docs kind of explain it.
I think you might get away with it without needing to download the whole thing by just copying all of the files within the same bucket, and then deleting the old files. Make sure you can change the permissions after doing the copy. This should save you some money too, as you won't have to pay for the data transfer costs of downloading everything.

boto3 "copy_object" solution :
Providing Grant control to the destination bucket owner
client.copy_object(CopySource=copy_source, Bucket=target_bucket, Key=key, GrantFullControl='id=<bucket owner Canonical ID>')
Get for console
Select bucket, Permission tab, "Access Control List" tab

How to download data from Amazon's requester pay buckets?

I have been struggling for about a week to download arXiv articles as mentioned here: http://arxiv.org/help/bulk_data_s3#src.
I have tried lots of things: s3Browser, s3cmd. I am able to login to my buckets but I am unable to download data from arXiv bucket.
I tried:
s3cmd get s3://arxiv/pdf/arXiv_pdf_1001_001.tar
See:
$ s3cmd get s3://arxiv/pdf/arXiv_pdf_1001_001.tar
s3://arxiv/pdf/arXiv_pdf_1001_001.tar -> ./arXiv_pdf_1001_001.tar [1 of 1]
s3://arxiv/pdf/arXiv_pdf_1001_001.tar -> ./arXiv_pdf_1001_001.tar [1 of 1]
ERROR: S3 error: Unknown error
s3cmd get with x-amz-request-payer:requester
It gave me same error again:
$ s3cmd get --add-header="x-amz-request-payer:requester" s3://arxiv/pdf/arXiv_pdf_manifest.xml
s3://arxiv/pdf/arXiv_pdf_manifest.xml -> ./arXiv_pdf_manifest.xml [1 of 1]
s3://arxiv/pdf/arXiv_pdf_manifest.xml -> ./arXiv_pdf_manifest.xml [1 of 1]
ERROR: S3 error: Unknown error
Copying
I have tried copying files from that folder too.
$ aws s3 cp s3://arxiv/pdf/arXiv_pdf_1001_001.tar .
A client error (403) occurred when calling the HeadObject operation: Forbidden
Completed 1 part(s) with ... file(s) remaining
This probably means that I made a mistake. The problem is I don't know how and what to add that will convey my permission to pay for download.
I am unable to figure out what should I do for downloading data from S3. I have been reading a lot on AWS sites, but nowhere I can get pinpoint solution to my problem.
How can I bulk download the arXiv data?

Try downloading s3cmd version 1.6.0: http://sourceforge.net/projects/s3tools/files/s3cmd/
$ s3cmd --configure
Enter your credentials found in the account management tab of the Amazon AWS website interface.
$ s3cmd get --recursive --skip-existing s3://arxiv/src/ --requester-pays

Requester Pays is a feature on Amazon S3 buckets that requires the user of the bucket to pay Data Transfer costs associated with accessing data.
Normally, the owner of an S3 bucket pays Data Transfer costs, but this can be expensive for free / Open Source projects. Thus, the bucket owner can activated Requester Pays to reduce the portion of costs they will be charged.
Therefore, when accessing a Requester Pays bucket, you will need to authenticate yourself so that S3 knows whom to charge.
I recommend using the official AWS Command-Line Interface (CLI) to access AWS services. You can provide your credentials via:
aws configure
and then view the bucket via:
aws s3 ls s3://arxiv/pdf/
and download via:
aws s3 cp s3://arxiv/pdf/arXiv_pdf_1001_001.tar .
UPDATE: I just tried the above myself, and received Access Denied error messages (both on the bucket listing and the download command). When using s3cmd, it says ERROR: S3 error: Access Denied. It would appear that the permissions on the bucket no longer permit access. You should contact the owners of the bucket to request access.

At the bottom of this page arXiv explains that s3cmd gets denied because it does not support access to requester pays bucket as a non-owner and you have to apply a patch to the source code of s3cmd. However, the version of s3cmd they used is outdated and the patch does not apply to the latest version of s3cmd.
Basically you need to allow s3cmd to add "x-amz-request-payer" header to its HTTP request to buckets. Here is how to fix it:
Download the source code of s3cmd.
Open S3/S3.py with a text editor.
Add this two lines of code at the bottom of __init__ function:
if self.s3.config.extra_headers:
self.headers.update(self.s3.config.extra_headers)
Install s3cmd as instructed.

For me the problem was that my IAM user didn't have enough permissions.
Setting AmazonS3FullAccess was the solution for me.
Hope it'll save time to someone

Don't want to steal the thunder, but OttoV's comment actually gave the right command that works for me.
aws s3 ls --request-payer requester s3://arxiv/src/
My EC2 is in Region us-east-2, but the arXiv s3 buckets are in Region us-east-1, so I think that's why the --request-payer requester is needed.
From https://aws.amazon.com/s3/pricing/?nc=sn&loc=4 :
You pay for all bandwidth into and out of Amazon S3, except for the following:
• Data transferred in from the internet.
• Data transferred out to an Amazon Elastic Compute Cloud (Amazon EC2) instance, when the instance is in the same AWS Region as the S3 bucket (including to a different account in the same AWS region).
• Data transferred out to Amazon CloudFront (CloudFront).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js