Possible to access an AWS public dataset using Cyberduck?

Cyberduck version: 7.9.2
Cyberduck is designed to access non-public AWS buckets. It asks for:
Server
Port
Access Key ID
Secret Access Key
The Registry of Open Data on AWS provides this information for an open dataset (using the example at https://registry.opendata.aws/target/):
Resource type: S3 Bucket
Amazon Resource Name (ARN): arn:aws:s3:::gdc-target-phs000218-2-open
AWS Region: us-east-1
AWS CLI Access (No AWS account required): aws s3 ls s3://gdc-target-phs000218-2-open/ --no-sign-request
Is there a version of s3://gdc-target-phs000218-2-open that can be used in Cyberduck to connect to the data?

If the bucket is public, any AWS credentials will suffice. So as long as you can create an AWS account, you only need to create an IAM user for yourself with programmatic access, and you are all set.
No doubt, it's a pain because creating an AWS account requires a credit (or debit) card! But see https://stackoverflow.com/a/44825406/1094109
I tried this with s3://gdc-target-phs000218-2-open and it worked.
For RODA buckets that provide public access only to specific prefixes, you'd need to edit the path to suit, e.g. s3://cellpainting-gallery/cpg0000-jump-pilot/source_4/ (this is a RODA bucket maintained by us, yet to be released fully).
NOTE: The screenshots in the original answer show a different URL that's no longer operational. The correct URL is s3://cellpainting-gallery/cpg0000-jump-pilot/source_4/
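As a quick sanity check outside of Cyberduck, the same public bucket can be read anonymously with boto3 using unsigned requests, the SDK equivalent of the CLI's --no-sign-request flag (a minimal sketch; the bucket name comes from the question):

import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous (unsigned) access -- no AWS account or credentials needed
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
resp = s3.list_objects_v2(Bucket='gdc-target-phs000218-2-open', MaxKeys=5)
for obj in resp.get('Contents', []):
    print(obj['Key'])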

No, it's explicitly stated in the documentation that
You must obtain the login credentials [in order to connect to Amazon S3 in Cyberduck]

Related

How to get access to s3 for .NET SDK with the same credentials used for awscli?

I am on a federated account that only allows for 60-minute access tokens. This makes using AWS difficult since I have to constantly re-log in with MFA, even for the AWS CLI on my machine. I'm fairly certain that any programmatic secret access key and token I generate would be useless after an hour.
I am writing a .NET program (.NET Framework 4.8) that will run on an EC2 instance to read and write from an S3 bucket. The documentation gives this example to initialize the AmazonS3Client:
// Before running this app:
// - Credentials must be specified in an AWS profile. If you use a profile other than
// the [default] profile, also set the AWS_PROFILE environment variable.
// - An AWS Region must be specified either in the [default] profile
// or by setting the AWS_REGION environment variable.
var s3client = new AmazonS3Client();
I've looked into Secrets Manager and Parameter Store, but those wouldn't help if the programmatic access keys go inactive after an hour. Perhaps there is another way to give the program access to S3 through the SDK...
If I cannot use access keys and tokens stored in a file, could I use the IAM access that the AWS CLI uses? For example, I can type aws s3 ls s3://mybucket into PowerShell to list and read files from S3 on the EC2 instance. Could the .NET SDK use the same credentials to access the S3 bucket?

Put Object to S3 Bucket of another account

We are able to put objects into our S3 bucket.
But now we have a requirement to put these objects directly into an S3 bucket that belongs to a different account and a different region.
Here we have a few questions:
Is this possible?
If possible, what changes do we need to make for this?
They have provided us with the Access Key, Secret Key, Region, and Bucket details.
Any comments and suggestions will be appreciated.
IAM credentials are associated with a single AWS Account.
When you launch your own Amazon EC2 instance with an assigned IAM Role, it will receive access credentials that are associated with your account.
To write to another account's Amazon S3 bucket, you have two options:
Option 1: Your credentials + Bucket Policy
The owner of the destination Amazon S3 bucket can add a Bucket Policy on the bucket that permits access by your IAM Role. This way, you can just use the normal credentials available on the EC2 instance.
Option 2: Their credentials
It appears that you have been given access credentials for their account. You can use these credentials to access their Amazon S3 bucket.
As detailed on Working with AWS Credentials - AWS SDK for Java, you can provide these credentials in several ways. However, if you are using BOTH the credentials provided by the IAM Role AND the credentials that have been given to you, it can be difficult to 'switch between' them. (I'm not sure if there is a way to tell the Credentials Provider to switch between a profile stored in the ~/.aws/credentials file and those provided via instance metadata.)
Thus, the easiest way is to specify the Access Key and Secret Key when creating the S3 client:
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
BasicAWSCredentials awsCreds = new BasicAWSCredentials("access_key_id", "secret_access_key"); // keys provided by the bucket owner
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
        .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
        .build();
It is generally not a good idea to put credentials in your code. You should load them from a configuration file.
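For illustration, here is a minimal sketch of loading the other account's keys from a named profile instead of hard-coding them, shown in Python/boto3 (the profile name is made up; the Java SDK's ProfileCredentialsProvider reads the same ~/.aws/credentials file):

import boto3

# Assumes ~/.aws/credentials contains a [partner-account] section holding
# the Access Key and Secret Key that the other account provided
session = boto3.Session(profile_name='partner-account')
s3 = session.resource('s3')
s3.Object('their-bucket', 'some/key.txt').put(Body=b'hello')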
Yes, it's possible. You need to allow cross-account S3 put operations in the bucket's policy.
Here is a blog by AWS. It should help you in setting up the cross-account put action.
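To give an idea of what that bucket policy looks like, here is a minimal sketch (the account ID, role name, and bucket name are placeholders); the owner of the destination bucket would attach it, shown here with boto3's put_bucket_policy:

import json
import boto3

# Policy attached by the destination bucket's owner, allowing an IAM role
# from the source account to put objects into the bucket
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowCrossAccountPut",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:role/my-ec2-role"},
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::destination-bucket/*"
    }]
}

s3 = boto3.client('s3')
s3.put_bucket_policy(Bucket='destination-bucket', Policy=json.dumps(policy))

When writing cross-account this way, it is also common to set the object ACL to bucket-owner-full-control so the destination account can read the objects it receives.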

Accessing s3 bucket on AWS ParallelCluster

I have a requirement to access an S3 bucket on the AWS ParallelCluster nodes. I did explore the s3_read_write_resource option in the ParallelCluster documentation, but it is not clear how we can access the bucket. For example, will it be mounted on the nodes, or will the users be able to access it by default? I did test the latter by trying to access a bucket I declared using the s3_read_write_resource option in the config file, but was not able to access it (aws s3 ls s3://<name-of-the-bucket>).
I did go through this GitHub issue talking about mounting an S3 bucket using s3fs. In my experience, accessing objects using s3fs is very slow.
So, my question is:
How can we access the S3 bucket when using the s3_read_write_resource option in the AWS ParallelCluster config file?
These parameters are used in ParallelCluster to include S3 permissions in the instance role that is created for cluster instances. They're mapped to the CloudFormation template parameters S3ReadResource and S3ReadWriteResource, which are then used in the CloudFormation template (for example, here and here). There's no special way of accessing S3 objects.
To access S3 from a cluster instance, use the AWS CLI or any SDK. Credentials are obtained automatically from the instance role via the instance metadata service.
Please note that ParallelCluster doesn't grant permissions to list S3 objects.
Retrieving existing objects from the S3 bucket defined in s3_read_resource, as well as retrieving and writing objects to the S3 bucket defined in s3_read_write_resource, should work.
However, "aws s3 ls" or "aws s3 ls s3://name-of-the-bucket" need additional permissions. See https://aws.amazon.com/premiumsupport/knowledge-center/s3-access-denied-listobjects-sync/.
I wouldn't use s3fs: it's not supported by AWS, it has been reported to be slow (as you've already noticed), and there are other reasons.
You might want to check the FSx section. It can create and attach an FSx for Lustre filesystem, which can import/export files to/from S3 natively. You just need to set import_path and export_path in that section.

S3 credentials for a public bucket?

I have this snippet to upload a file to S3:
import boto3

s3 = boto3.resource('s3')
s3.Object('bucketname', timestamped_filename).put(Body=open(FILE_SAVE_PATH, 'rb'))
My bucket has delete/upload permissions for everyone, so this works on my Windows machine.
However, when I try to run the same code on my Mac, it throws
botocore.exceptions.NoCredentialsError: Unable to locate credentials
Is this behavior normal?
And what kind of credentials can I possibly provide if I'm accessing a public bucket?
Thank you.
When making an API call to AWS, valid credentials must be provided. These credentials are associated with an IAM User and grant access to AWS services.
When making API calls (or using the AWS Command-Line Interface (CLI)) from an Amazon EC2 instance, these credentials can be granted to the EC2 instance by assigning an IAM Role to the instance at launch time.
When making calls from a non-EC2 computer, credentials must be provided via a configuration file or environment variables.
It appears that your Windows machine is either an EC2 instance with a role, or it has a local configuration file with valid credentials; and it appears that your Mac has neither of these.
See: boto3 Credentials documentation
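For illustration, a minimal sketch of the two usual options on a non-EC2 machine (the key values and bucket/key names are placeholders): pass the keys explicitly, or let boto3 discover them from the environment or ~/.aws/credentials:

import boto3

# Option A: pass keys explicitly (in practice, load them from configuration
# rather than hard-coding them)
s3 = boto3.resource(
    's3',
    aws_access_key_id='AKIA...',
    aws_secret_access_key='...',
)

# Option B: rely on the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment
# variables or ~/.aws/credentials, in which case no arguments are needed:
# s3 = boto3.resource('s3')

s3.Object('bucketname', 'some_filename').put(Body=open('local_file', 'rb'))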

Mounting AWS S3 bucket using AWS IAM roles instead of using a passwd file

I am mounting an AWS S3 bucket as a filesystem using s3fs-fuse. It requires a file that contains the AWS Access Key ID and AWS Secret Access Key.
How do I avoid using this file, and instead use AWS IAM roles?
As per the Fuse Over Amazon documentation, you can specify the credentials using four methods. If you don't want to use a file, you can set the AWSACCESSKEYID and AWSSECRETACCESSKEY environment variables.
Also, if your goal is to use an AWS IAM instance profile, you need to run s3fs-fuse on an EC2 instance. In that case, you don't have to set these credential files/environment variables, because if you attach the instance role and policy when creating the instance, the EC2 instance will get the credentials at boot time. Please see the section 'Using Instance Profiles' on page 190 of the AWS IAM User Guide.
There is an option, -o iam_role, which takes the name of an IAM role and lets you avoid supplying an Access Key and Secret Access Key.
The full steps to configure this are given at
https://www.nxtcloud.io/mount-s3-bucket-on-ec2-using-s3fs-and-iam-role/