Copied S3 Bucket is not public by default - amazon-web-services

I have two S3 buckets -
production
staging
I want to periodically refresh the staging bucket so it has all the latest production objects for testing, so I used the aws-cli as follows -
aws s3 sync s3://production s3://staging
So now both buckets have the exact same files.
However, for any given file/object, the production link works and the staging one doesn't
e.g.
This works: https://s3-us-west-1.amazonaws.com/production/users/photos/000/001/001/medium/my_file.jpg
This doesn't: https://s3-us-west-1.amazonaws.com/staging/users/photos/000/001/001/medium/my_file.jpg
The staging bucket's objects are not publicly accessible; they are private by default.
Is there a way to correct this or avoid this with the aws-cli? I know I can change the bucket policy itself, but it was previously working with all the files that were there. So I'm wondering what it is about copying files over that changed their visibility.
Thanks!

You should be able to add the --acl flag:
aws s3 sync s3://production s3://staging --acl public-read
As mentioned in the docs, private is the default ACL.

Just did some more research.
Frédéric's answer is correct, but just wanted to expand on that a bit more.
aws s3 sync isn't really a true "sync" by default. It just goes through each file in the source bucket and copies files into the target bucket. It will not overwrite a file if a target file with the same name already exists; I looked for a --force flag to force the overwrite, but apparently none exists.
It won't delete "extra" files in the target directory by default (i.e. a file that does not exist in the source directory). The --delete flag will allow you to do that.
It does not copy over permissions by default. It's true that --acl public-read will set the target permissions to publicly readable, but that has 2 problems: (1) it blindly sets that for all files, which you may not want, and (2) it doesn't work when you have several files with varying permissions (a per-object approach is sketched after this answer).
There's an issue about it here, and a PR that's open but still un-merged as of today.
So if you're trying to do a full blind refresh like me for testing purposes, the best option is to:
Completely empty the target staging bucket by right-clicking it in the console and clicking Empty.
Run the sync and blindly set everything as public-read (other visibility options are available; see the documentation here): aws s3 sync s3://production s3://staging --acl public-read
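If blindly applying public-read is too coarse, one alternative is to copy each object and re-apply its source ACL. A minimal boto3 sketch of that idea, under the assumption that both buckets live in the same account; the bucket names are taken from the question:

import boto3

s3 = boto3.client("s3")
source_bucket = "production"
target_bucket = "staging"

# Walk every object in the source bucket, copy it, then re-apply the
# source object's ACL so per-object visibility is preserved instead of
# being reset to private (or forced to public-read).
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=source_bucket):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        acl = s3.get_object_acl(Bucket=source_bucket, Key=key)
        s3.copy_object(
            Bucket=target_bucket,
            Key=key,
            CopySource={"Bucket": source_bucket, "Key": key},
        )
        s3.put_object_acl(
            Bucket=target_bucket,
            Key=key,
            AccessControlPolicy={"Grants": acl["Grants"], "Owner": acl["Owner"]},
        )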


AWS S3 sync creates objects with different permissions than bucket

I'm trying to use an S3 bucket to upload files to as part of a build. It is configured to serve files as a static site, and the content is protected using a Lambda and CloudFront. When I manually create files in the bucket they are all visible and everything is happy, but when the files are uploaded by the build, the objects that are created are not accessible, resulting in an access denied response.
The user that's pushing to the bucket does not belong to the same AWS account, but it has been set up with an ACL that allows it to push to the bucket, and the bucket has a policy that allows that user to push to it.
The command that I'm using is:
aws s3 sync --no-progress --delete docs/_build/html "s3://my-bucket" --acl bucket-owner-full-control
Is there something else that I can try that basically uses the bucket permissions for anything that's created?
According to OP's feedback in the comment section, setting Object Ownership to Bucket owner preferred fixed the issue.
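For reference, the same Object Ownership setting can be applied outside the console. A minimal boto3 sketch, assuming the placeholder bucket name from the question's command:

import boto3

s3 = boto3.client("s3")

# Equivalent of setting Object Ownership to "Bucket owner preferred" in the
# console: objects uploaded with the bucket-owner-full-control ACL become
# owned by the bucket owner rather than the uploading account.
s3.put_bucket_ownership_controls(
    Bucket="my-bucket",  # placeholder bucket name from the question
    OwnershipControls={"Rules": [{"ObjectOwnership": "BucketOwnerPreferred"}]},
)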

How to disable directory listing in Google Cloud [duplicate]

We're using google cloud storage as our CDN.
However, any visitors can list all files by typing: http://ourcdn.storage.googleapis.com/
How can we disable this while keeping all files under the bucket publicly readable by default?
We previously set the acl using
gsutil defacl ch -g AllUsers:READ
In the GCP dashboard:
Go into your bucket.
Click the "Permissions" tab.
In the member list, find "allUsers" and change its role from Storage Object Viewer to Storage Legacy Object Reader.
Then listing should be disabled.
Update:
As per Devy's comment, see the note in the documentation:
Note: roles/storage.objectViewer includes permission to list the objects in the bucket. If you don't want to grant listing publicly, use roles/storage.legacyObjectReader.
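The role swap can also be scripted. A minimal sketch with the google-cloud-storage Python client, assuming a hypothetical bucket name; it removes allUsers from roles/storage.objectViewer and grants roles/storage.legacyObjectReader instead:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("ourcdn")  # hypothetical bucket name

policy = bucket.get_iam_policy(requested_policy_version=3)

# Drop the public binding that allows listing...
for binding in policy.bindings:
    if binding["role"] == "roles/storage.objectViewer":
        binding["members"].discard("allUsers")

# ...and keep objects publicly readable via the legacy reader role,
# which does not include the list-objects permission.
policy.bindings.append({
    "role": "roles/storage.legacyObjectReader",
    "members": {"allUsers"},
})

bucket.set_iam_policy(policy)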
Upload an empty index.html file to the root of your bucket. Open the bucket settings and click Edit website configuration, then set index.html as the Main Page.
This will prevent the listing of the directory.
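A minimal sketch of the same configuration with the google-cloud-storage Python client, assuming a hypothetical bucket name:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("ourcdn")  # hypothetical bucket name

# Upload an empty index.html and serve it for requests to the bucket root
# instead of an object listing.
bucket.blob("index.html").upload_from_string("", content_type="text/html")
bucket.configure_website(main_page_suffix="index.html")
bucket.patch()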
Your defacl looks good. The problem is most likely that for some reason AllUsers must also have READ, WRITE, or FULL_CONTROL on the bucket itself. You can clear those with a command like this:
gsutil acl ch -d AllUsers gs://bucketname
Your command set the default object ACL on the bucket to READ, which means that objects will be accessible by anyone. To prevent users from listing the objects, you need to make sure users don't have an ACL on the bucket itself.
gsutil acl ch -d AllUsers gs://yourbucket
should accomplish this. You may need to run a similar command for AllAuthenticatedUsers; just take a look at the bucket ACL with
gsutil acl get gs://yourbucket
and it should be clear.
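A minimal Python sketch of the same cleanup with the google-cloud-storage client, assuming a hypothetical bucket name; it clears allUsers grants on the bucket ACL itself while leaving object ACLs and the default object ACL untouched:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("yourbucket")  # hypothetical bucket name

# Revoke every permission allUsers may hold on the bucket ACL; objects
# remain readable through their own (default) object ACLs.
bucket.acl.reload()
entity = bucket.acl.all()
entity.revoke_read()
entity.revoke_write()
entity.revoke_owner()
bucket.acl.save()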

Amazon S3 File Permissions, Access Denied when copied from another account

I have a set of video files that were copied from a bucket in another AWS account into a bucket in my own account.
I'm running into a problem now with all of the files where I am receiving Access Denied errors when I try to make all of the files public.
Specifically, I log in to my AWS account, go into S3, and drill down through the folder structure to locate one of the video files.
When I look at this specific file, the Permissions tab on the file does not show any permissions assigned to anyone. No users, groups, or system permissions have been assigned.
At the bottom of the Permissions tab, I see a small box that says "Error: Access Denied". I can't change anything about the file. I can't add metadata. I can't add a user to the file. I cannot make the file Public.
Is there a way I can gain control of these files so that I can make them public? There are over 15,000 files / around 60 GB of files. I'd like to avoid downloading and re-uploading all of them.
With some assistance and suggestions from the folks here I have tried the following. I made a new folder in my bucket called "media".
I tried this command:
aws s3 cp s3://mybucket/2014/09/17/thumb.jpg s3://mybucket/media --grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=my_aws_account_email_address
I receive a fatal error 403 when calling the HeadObject operation: Forbidden.
A very interesting conundrum! Fortunately, there is a solution.
First, a recap:
Bucket A in Account A
Bucket B in Account B
User in Account A copies objects to Bucket B (having been granted appropriate permissions to do so)
Objects in Bucket B still belong to Account A and cannot be accessed by Account B
I managed to reproduce this and can confirm that users in Account B cannot access the file -- not even the root user in Account B!
Fortunately, things can be fixed. The aws s3 cp command in the AWS Command-Line Interface (CLI) can update permissions on a file when copied to the same name. However, to trigger this, you also have to update something else otherwise you get this error:
This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.
Therefore, the permissions can be updated with this command:
aws s3 cp s3://my-bucket/ s3://my-bucket/ --recursive --acl bucket-owner-full-control --metadata "One=Two"
Must be run by an Account A user that has access permissions to the objects (e.g. the user who originally copied the objects to Bucket B)
The metadata content is unimportant, but it is needed to force the update
--acl bucket-owner-full-control will grant permission to Account B so you'll be able to use the objects as normal
End result: A bucket you can use!
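The same in-place copy can be done per object from Python. A minimal boto3 sketch, to be run with Account A credentials; the bucket name and key are placeholders:

import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"        # Bucket B (placeholder)
key = "path/to/object.jpg"  # placeholder key

# Copy the object onto itself. MetadataDirective=REPLACE (with any dummy
# metadata) makes S3 accept the in-place copy, and the ACL hands full
# control of the object to the bucket owner (Account B).
s3.copy_object(
    Bucket=bucket,
    Key=key,
    CopySource={"Bucket": bucket, "Key": key},
    ACL="bucket-owner-full-control",
    Metadata={"One": "Two"},
    MetadataDirective="REPLACE",
)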
aws s3 cp s3://account1/ s3://accountb/ --recursive --acl bucket-owner-full-control
To correctly set the appropriate permissions for newly added files, add this bucket policy:
[...]
{
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/their-user"
    },
    "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
    ],
    "Resource": "arn:aws:s3:::my-bucket/*"
}
And set the ACL for newly created files in code. Python example:
import boto3

client = boto3.client('s3')

local_file_path = '/home/me/data.csv'
bucket_name = 'my-bucket'
bucket_file_path = 'exports/data.csv'

# Upload the file and grant the bucket owner full control of the new object
client.upload_file(
    local_file_path,
    bucket_name,
    bucket_file_path,
    ExtraArgs={'ACL': 'bucket-owner-full-control'}
)
source: https://medium.com/artificial-industry/how-to-download-files-that-others-put-in-your-aws-s3-bucket-2269e20ed041 (disclaimer: written by me)
In case anyone is trying to do the same thing using a Hadoop/Spark job instead of the AWS CLI:
Step 1: Grant the user in Account A the appropriate permissions to copy objects to Bucket B (mentioned in the answer above).
Step 2: Set the fs.s3a.acl.default configuration option using the Hadoop configuration. It can be set in a conf file or programmatically:
Conf File:
<property>
  <name>fs.s3a.acl.default</name>
  <description>Set a canned ACL for newly created and copied objects. Value may be Private,
    PublicRead, PublicReadWrite, AuthenticatedRead, LogDeliveryWrite, BucketOwnerRead,
    or BucketOwnerFullControl.</description>
  <value>$chooseOneFromDescription</value>
</property>
Programmatically:
spark.sparkContext.hadoopConfiguration.set("fs.s3a.acl.default", "BucketOwnerFullControl")
Putting
--acl bucket-owner-full-control
made it work.
I'm afraid you won't be able to transfer ownership as you wish. Here's what you did:
Old account copies objects into new account.
The "right" way of doing it (assuming you wanted to assume ownership on the new account) would be:
New account copies objects from old account.
See the small but important difference? S3 docs kind of explain it.
I think you might get away with it without needing to download the whole thing by just copying all of the files within the same bucket, and then deleting the old files. Make sure you can change the permissions after doing the copy. This should save you some money too, as you won't have to pay for the data transfer costs of downloading everything.
boto3 "copy_object" solution :
Providing Grant control to the destination bucket owner
client.copy_object(CopySource=copy_source, Bucket=target_bucket, Key=key, GrantFullControl='id=<bucket owner Canonical ID>')
Get for console
Select bucket, Permission tab, "Access Control List" tab
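If you prefer not to use the console, the canonical ID can also be fetched with boto3. A small sketch, assuming the client runs as the destination bucket owner:

import boto3

s3 = boto3.client("s3")

# The Owner field of ListBuckets is the canonical user ID of the calling
# account, which is what GrantFullControl='id=...' expects.
canonical_id = s3.list_buckets()["Owner"]["ID"]
print(canonical_id)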

Upload nested directories to S3 with the AWS CLI?

I have been trying to upload a static website to s3 with the following cli command:
aws s3 sync . s3://my-website-bucket --acl public-read
It successfully uploads every file in the root directory but fails on the nested directories with the following:
An error occurred (InvalidRequest) when calling the ListObjects operation: Missing required header for this request: x-amz-content-sha256
I have found references to this issue on GitHub but no clear instruction of how to solve it.
The s3 sync command recursively copies local folders to folder-like S3 objects.
Even though S3 doesn't really support folders, the sync command creates S3 objects whose keys contain the folder names.
As reported in the Amazon support thread "forums.aws.amazon.com/thread.jspa?threadID=235135", the issue should be solved by setting the region correctly.
S3 has no concept of directories.
S3 is an object store where each object is identified by a key.
The key might be a string like "dir1/dir2/dir3/test.txt"
AWS graphical user interfaces on top of S3 interpret the "/" characters as a directory separator and present the file list as if it were in a directory structure.
However, internally, there is no concept of directory, S3 has a flat namespace.
See http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html for more details.
This is the reason directories are not synced: there are no directories on S3.
Also the feature request is open in https://github.com/aws/aws-cli/issues/912 but has not been added yet.
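To illustrate the flat namespace, a minimal boto3 sketch with an assumed bucket name and prefix; the Delimiter parameter only groups common key prefixes so tools can render them as folders:

import boto3

s3 = boto3.client("s3")

# Every object lives under a flat key such as "dir1/dir2/dir3/test.txt";
# the Delimiter parameter merely groups shared prefixes into
# CommonPrefixes so that clients can display a folder-like view.
resp = s3.list_objects_v2(
    Bucket="my-website-bucket",  # assumed name from the question
    Prefix="dir1/",
    Delimiter="/",
)
for prefix in resp.get("CommonPrefixes", []):
    print("pseudo-folder:", prefix["Prefix"])
for obj in resp.get("Contents", []):
    print("object key:", obj["Key"])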

access multiple buckets in s3

My manager has an AWS account, and using his credentials we create buckets per employee. Now I want to access another bucket through the command line. Is it possible for me to access two buckets (mine and one more)? I have the access keys for both buckets, but I am still not able to access both buckets simultaneously so that I could upload and download my files to whichever bucket I want.
I have already tried changing the access key and secret key in my s3config, but it didn't serve the purpose.
I have already been granted the ACL for that new bucket.
Thanks
The best you can do without having a single access key that has permissions for both buckets is create a separate .s3cfg file. I'm assuming you're using s3cmd.
s3cmd --configure -c .s3cfg_bucketname
Will allow you to create a new configuration in the config file .s3cfg_bucketname. From then on when you are trying to access that bucket you just have to add the command line flag to specify which configuration to use:
s3cmd -c .s3cfg_bucketname ls
Of course you could add a bash function to your .bashrc (now I'm assuming bash... lots of assumptions! Let me know if I'm in the wrong, please) to make it even simpler:
function s3bucketname(){
    s3cmd -c ~/.s3cfg_bucketname "$@"
}
Usage:
s3bucketname ls
I'm not sure which command line tool you are using. If you are using Timothy Kay's tool, then you will find that the documentation allows you to set the access key and secret key as environment variables, not only in a config file, so you can set them on the command line before the put command.
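A hedged boto3 sketch of the same idea, with entirely hypothetical credential and bucket names, using one session per set of credentials:

import boto3

# One session per set of credentials (hypothetical values), so buckets
# owned by different principals can be used from the same script.
mine = boto3.Session(
    aws_access_key_id="MY_ACCESS_KEY",
    aws_secret_access_key="MY_SECRET_KEY",
)
other = boto3.Session(
    aws_access_key_id="OTHER_ACCESS_KEY",
    aws_secret_access_key="OTHER_SECRET_KEY",
)

mine.client("s3").upload_file("report.csv", "my-bucket", "report.csv")
other.client("s3").download_file("other-bucket", "shared.csv", "shared.csv")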
I am one of the developers of Bucket Explorer. You can use its Transfer panel with two sets of credentials and perform operations between your bucket and the other bucket.
For more details, read Copy Move Between two different account.