How to access AWS S3 current bucket list content info - amazon-web-services

I have been provided with the access and secret key for an Amazon S3 container. No more details were provided other than to drop some files into some specific folder.
I downloaded the AWS CLI and also the AWS SDK. So far, there seems to be no way for me to check the bucket name or list the folders where I'm supposed to drop my files. Every single command seems to require knowledge of a bucket name.
Trying to list with aws s3 ls gives me the error:
An error occurred (AccessDenied) when calling the ListBuckets operation: Access Denied
Is there a way to list the contents of my current location? (I'm guessing the credentials I was given are linked directly to a bucket.) I'd like to see at least the folders where I'm supposed to drop my files, but the SDK client for the console app I'm building seems to always require a bucket name.
Was I provided incomplete info or limited rights?

Do you know the bucket name or not? If you don't, and you don't have permission for ListAllMyBuckets and GetBucketLocation on *, or ListBucket on the bucket in question, then you can't discover the bucket name. That's how it is supposed to work. If you know the bucket name, you can run aws s3 ls s3://bucket-name/ to list the objects in the bucket.
Note that S3 buckets don't have the concept of a "folder". It's user-interface "sugar" that makes it look like folders and files; internally, it's just keys and objects.
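If you do know the bucket name, a rough boto3 sketch of listing the keys under a given prefix ("folder") could look like this; the bucket and prefix names are placeholders, and it requires ListBucket permission on that bucket:

import boto3

s3 = boto3.client('s3')

# Placeholder bucket and prefix; "folders" are just key prefixes.
response = s3.list_objects_v2(
    Bucket='bucket-name',
    Prefix='some/folder/',
    Delimiter='/',
)

# Objects directly under the prefix.
for obj in response.get('Contents', []):
    print(obj['Key'])

# Sub-"folders" (common key prefixes).
for cp in response.get('CommonPrefixes', []):
    print(cp['Prefix'])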

Looks like it was just not possible without enhanced rights or without the actual bucket name. I was able to procure both later on from the client and complete the task. Thanks for the comments.

Related

Boto3 access denied when calling the ListObjects operation on an S3 bucket directory

I'm trying to access a bucket via a cross-account reference. The connection is established, but the put/list permissions are granted only on a specific directory (folder), i.e. bucketname/folder_name/*
import boto3

# Upload into the allowed prefix with the bucket-owner-full-control ACL.
s3 = boto3.client('s3')
s3.upload_file("filename.csv", "bucketname", "folder_name/file.csv",
               ExtraArgs={'ACL': 'bucket-owner-full-control'})
Not sure how to do the same via code; it throws access denied on both list and put. Nothing seems wrong with the permissions as such; I have verified the access via the AWS CLI and it works.
Let me know if I'm missing something here, thanks!
There was an issue with the assumed role; I followed the documentation here, https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-api.html, along with the code mentioned above.
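For reference, a minimal sketch of switching roles via the API as that documentation describes; the role ARN and session name below are placeholders, not values from the post:

import boto3

# Assume the cross-account role, then build an S3 client from the temporary credentials.
sts = boto3.client('sts')
assumed = sts.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/cross-account-role',  # placeholder
    RoleSessionName='upload-session',
)
creds = assumed['Credentials']

s3 = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)

# Upload into the prefix that the cross-account policy allows.
s3.upload_file("filename.csv", "bucketname", "folder_name/file.csv",
               ExtraArgs={'ACL': 'bucket-owner-full-control'})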

Accessing s3 bucket on AWS ParallelCluster

I have a requirement to access an S3 bucket on the AWS ParallelCluster nodes. I did explore the s3_read_write_resource option in the ParallelCluster documentation, but it is not clear how we can access the bucket. For example, will it be mounted on the nodes, or will users be able to access it by default? I did test the latter by trying to access a bucket I declared using the s3_read_write_resource option in the config file, but was not able to access it (aws s3 ls s3://<name-of-the-bucket>).
I did go through this GitHub issue talking about mounting an S3 bucket using s3fs. In my experience, accessing objects via s3fs is very slow.
So, my question is:
How can we access the S3 bucket when using the s3_read_write_resource option in the AWS ParallelCluster config file?
These parameters are used by ParallelCluster to add S3 permissions to the instance role that is created for cluster instances. They're mapped to the CloudFormation template parameters S3ReadResource and S3ReadWriteResource and later used in the CloudFormation template (for example, here and here). There's no special way of accessing S3 objects.
To access S3 from a cluster instance, use the AWS CLI or any SDK. Credentials will be obtained automatically from the instance role through the instance metadata service.
Please note that ParallelCluster doesn't grant permissions to list S3 objects.
Retrieving existing objects from the S3 bucket defined in s3_read_resource, as well as retrieving and writing objects to the S3 bucket defined in s3_read_write_resource, should work.
However, "aws s3 ls" or "aws s3 ls s3://name-of-the-bucket" need additional permissions. See https://aws.amazon.com/premiumsupport/knowledge-center/s3-access-denied-listobjects-sync/.
I wouldn't use s3fs: it's not supported by AWS and it has been reported to be slow (as you've already noticed), among other reasons.
You might want to check the FSx section. It can create and attach an FSx for Lustre filesystem, which can import/export files to/from S3 natively. You just need to set import_path and export_path in this section.

How to check permissions on folders in S3?

I want to simply check the permissions that I have on buckets/folders/files in AWS S3. Something like:
ls -l
Sounds like it should be pretty easy, but I cannot find any information on the subject. I just want to know whether I have read access to some content, or whether I can load a file locally, without actually trying to load the data and having an "Error Code: 403 Forbidden" thrown at me.
Note: I am using Databricks and want to check the permission from there.
Thanks!
You can check the permissions using this command:
aws s3api get-object-acl --bucket my-bucket --key index.html
The ACL for each object can vary across your bucket.
More documentation at:
https://docs.aws.amazon.com/cli/latest/reference/s3api/get-object-acl.html
Hope it helps.
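Since the question mentions Databricks, a rough boto3 equivalent of that CLI call might look like this; the bucket and key are placeholders:

import boto3

s3 = boto3.client('s3')

# Equivalent of: aws s3api get-object-acl --bucket my-bucket --key index.html
# Requires GetObjectAcl permission; otherwise this raises an AccessDenied error.
acl = s3.get_object_acl(Bucket='my-bucket', Key='index.html')

for grant in acl['Grants']:
    print(grant['Grantee'], grant['Permission'])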
There are several different ways to grant access to objects in Amazon S3.
Permissions can be granted on a whole bucket, or a path within a bucket, via a Bucket Policy.
Permissions can also be granted to an IAM User or Role, giving that specific user/role permissions similar to a bucket policy.
Then there are permissions on the object itself, such as making it publicly readable.
So, there is no simple way to say "what are the permissions on this particular object", because it depends on who you are. Also, policies can restrict by IP address and time of day, so there isn't always one answer.
You could use the IAM Policy Simulator to test whether a certain call (eg PutObject or GetObject) would work for a given user.
Some commands in the AWS Command-Line Interface (CLI) come with a --dryrun option that will simply test whether the command would have worked, without actually executing the command.
Or, sometimes it is just easiest to try to access the object and see what happens!
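If you want to script the Policy Simulator check, boto3 exposes it through the IAM API. A sketch, where the user ARN, bucket, and key are placeholders rather than values from the question:

import boto3

iam = boto3.client('iam')

# Ask the Policy Simulator whether this principal could get/put a specific object.
response = iam.simulate_principal_policy(
    PolicySourceArn='arn:aws:iam::123456789012:user/my-user',  # placeholder
    ActionNames=['s3:GetObject', 's3:PutObject'],
    ResourceArns=['arn:aws:s3:::my-bucket/path/to/file.csv'],  # placeholder
)

for result in response['EvaluationResults']:
    # EvalDecision is e.g. 'allowed' or 'implicitDeny'
    print(result['EvalActionName'], result['EvalDecision'])

Note that this simulates the identity-based policies attached to that principal; a bucket policy would have to be supplied separately for the simulator to take it into account.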

Upload nested directories to S3 with the AWS CLI?

I have been trying to upload a static website to S3 with the following CLI command:
aws s3 sync . s3://my-website-bucket --acl public-read
It successfully uploads every file in the root directory but fails on the nested directories with the following:
An error occurred (InvalidRequest) when calling the ListObjects operation: Missing required header for this request: x-amz-content-sha256
I have found references to this issue on GitHub but no clear instructions on how to solve it.
The s3 sync command recursively copies local folders to folder-like S3 objects.
Even though S3 doesn't really support folders, the sync command creates S3 objects whose keys contain the folder names.
As reported in the following Amazon support thread, forums.aws.amazon.com/thread.jspa?threadID=235135, the issue should be solved by setting the region correctly.
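For context, this error typically appears when requests are signed for the wrong region (regions that only accept Signature Version 4 reject the older signing scheme), which is why setting the region fixes it. The same kind of upload via boto3 with the region and signing version pinned explicitly, where the region, bucket, and paths are assumptions for illustration:

import boto3
from botocore.config import Config

# Pin the bucket's actual region and force SigV4 signing.
# 'eu-central-1' and the names below are placeholders.
s3 = boto3.client(
    's3',
    region_name='eu-central-1',
    config=Config(signature_version='s3v4'),
)

# Nested "directories" are just key prefixes, so this works for nested paths too.
s3.upload_file('site/css/style.css', 'my-website-bucket', 'css/style.css',
               ExtraArgs={'ACL': 'public-read'})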
S3 has no concept of directories.
S3 is an object store where each object is identified by a key.
The key might be a string like "dir1/dir2/dir3/test.txt"
AWS graphical user interfaces on top of S3 interpret the "/" characters as a directory separator and present the file list as if it were in a directory structure.
However, internally, there is no concept of directory, S3 has a flat namespace.
See http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html for more details.
This is the reason directories are not synced: there are no directories on S3.
Also, a feature request is open at https://github.com/aws/aws-cli/issues/912 but has not been implemented yet.

How to automate S3 HTTP URL permission grant

I have a URL (in the form https://s3.amazonaws.com/...) pointing to a file in S3, and I want it to be downloadable by a user, so I set the permission from the S3 dashboard in the AWS console. But I learned that the permission gets reset whenever the file is re-written (the filename remains the same).
Is there a way to automatically set the permission right after the file is created? I looked at the boto library but couldn't figure it out. Thanks in advance!
This is a very common operation.
With the Boto library, you can set an ACL. Assuming you have a Key:
key.set_acl('public-read')
If you don't have a Key, you'll need to have a Bucket:
bucket.set_acl('public-read', 'path/to/key')
You can use non-canned ACLs also. The documentation links through to that.
In boto3, you can also set an ACL.
Bucket syntax:
s3client.put_bucket_acl(ACL='public-read', Bucket='bucketname')
Key syntax:
s3client.put_object_acl(ACL='public-read', Bucket='bucketname', Key='path/to/key')
Non-canned ACLs are a little easier in boto3.
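Putting that together with the original question (keeping the object public after it is re-written), a minimal boto3 sketch; the bucket, key, and file names are placeholders:

import boto3

s3 = boto3.client('s3')

bucket = 'bucketname'   # placeholder
key = 'path/to/key'     # placeholder

# Option 1: apply the ACL at upload time, so every re-write stays public.
s3.upload_file('local-file.dat', bucket, key,
               ExtraArgs={'ACL': 'public-read'})

# Option 2: re-apply the ACL immediately after an upload that didn't set one.
s3.put_object_acl(ACL='public-read', Bucket=bucket, Key=key)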