access amazon S3 bucket from hadoop specifying SecretAccessKey from command line - amazon-web-services

I am trying to access amazon S3 bucket using hdfs command. Here is command that I run:
$ hadoop fs -ls s3n://<ACCESSKEYID>:<SecretAccessKey>#<bucket-name>/tpt_files/
-ls: Invalid hostname in URI s3n://<ACCESSKEYID>:<SecretAccessKey>#<bucket-name>/tpt_files
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
My SecretAccessKey includes “/”. Could it be cause of such behavior?
In the same time I have aws cli installed in this server and I can access my bucket using aws cli without any issues (AccessKeyId and SecretAccessKey configured in .aws/credentials):
aws s3 ls s3:// <bucket-name>/tpt_files/
If there any way how to access amazon S3 bucket using Hadoop command without specifying Keys in core-site.xml? I’d prefer to specify Keys in command line.
Any suggestions will be very helpful.

The best practice is run hadoop on an instance created with an EC2 Instance profile role, and the S3 access is specified as a policy of the assigned role. Keys are no longer needed when using the instance profile.
http://docs.aws.amazon.com/java-sdk/latest/developer-guide/credentials.html
You can also launch AMIs with an Instance profile role and the CLI and SDKs will use it. If your code uses the DefaultAWSCredentialsProviderChain class, then credentials can be obtained through environment variables, system properties, or credential profiles file (as well as EC2 Instance profile role).

Related

How to upload local system files in my Linux server to Amazon S3 Bucket using ssh?

I am trying to upload a file which I have on my Linux server onto my AWS S3 bucket. Can anyone please advise on how to do so as I only find documentations which is related to upload the files to EC2 instead.
I do have the .pem certificate present on my server directory.
I tried to run the following command but it doesn't solve the issue
scp -i My_PEM_FILE.pem "MY_FILE_TO_BE_UPLOADED.txt" MY_USER#S3-INSTANCE.IP.ADDRESS.0.compute.amazonaws.com
It is not possible to upload to Amazon S3 by using SSH.
The easiest way to upload from anywhere to an Amazon S3 bucket is to use the AWS Command-Line Interface (CLI):
aws s3 cp MY_FILE_TO_BE_UPLOADED.txt s3://my-bucket/
This will require an Access Key and a Secret Key to be stored via the aws configure command. You can obtain these keys from your IAM User in the IAM management console (Security Credentials tab).
See: aws s3 cp — AWS CLI Command Reference

AWS: Hot to configure AWS credentials for multiple accounts in Mac OS terminal

I found way to configure AWS credentials by
aws configure
command. But this is not very comfortable for me since I'm using multiple AWS accounts. Is there any way to make it easy to configure AWS credentials and switch between them?
Yes. You can configure multiple profiles.
The easiest way is to use:
aws configure --profile <name>
You can then use it with:
aws s3 ls --profile <name>
If --profile is not specified, it will use the default profile.
All configuration information is stored in the ~/.aws/credentials and ~/.aws/config files.
See: Named profiles - AWS Command Line Interface

Have to delete environment variables for aws cli to work without --profile flag

ok so I am baffled by this aws cli behavior. Basically what is going on is that when I set my AWS creds related in environment variable, AWS CLI forces me to pass --profile flag each time I use the CLI.
So basically when AWS_ACCESS_KEY_ID AND AWS_SECRET_ACCESS_KEY then I cannot run commands like aws s3 ls without passing --profile flag to it even though my profile is [default]
Also, jus to note the environment variable values and the values inside my /.aws/credentials
file is exactly same. Also, I tried to set both AWS_PROFILE and AWS_DEFAULT_PROFILE to default hoping that if all values such as keys,secret and profile are set in environment variable then I do not have to pass any --profile flag explicitly. Not having to pass this flag explicitly is very important for me at this point because if I am running an application which connects with aws and picks up default credentials, there is no easy way to pass profile information to that app.
my credentials file look like following:
[default]
aws_access_key_id = AKIA****
aws_secret_access_key = VpR***
My config file looks like following:
[default]
region = us-west-1
output = json
And my environment variables do have the same values for corresponding entries. for key, secret and profile at least.
Any idea on how to solve this issue?
The AWS CLI looks for credentials using a series of providers in a particular order. (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html#config-settings-and-precedence)
Specifically:
Command line options – You can specify --region, --output, and --profile as parameters on the command line.
Environment variables – You can store values in the environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN. If they are present, they are used.
CLI credentials file – This is one of the files that is updated when you run the command aws configure. The file is located at ~/.aws/credentials on Linux or macOS, or at C:\Users\USERNAME\.aws\credentials on Windows. This file can contain the credential details for the default profile and any named profiles.
CLI configuration file – This is another file that is updated when you run the command aws configure. The file is located at ~/.aws/config on Linux or macOS, or at C:\Users\USERNAME\.aws\config on Windows. This file contains the configuration settings for the default profile and any named profiles.
Container credentials – You can associate an IAM role with each of your Amazon Elastic Container Service (Amazon ECS) task definitions. Temporary credentials for that role are then available to that task's containers. For more information, see IAM Roles for Tasks in the Amazon Elastic Container Service Developer Guide.
Instance profile credentials – You can associate an IAM role with each of your Amazon Elastic Compute Cloud (Amazon EC2) instances. Temporary credentials for that role are then available to code running in the instance. The credentials are delivered through the Amazon EC2 metadata service. For more information, see IAM Roles for Amazon EC2 in the Amazon EC2 User Guide for Linux Instances and Using Instance Profiles in the IAM User Guide.
Another potential option for you would be to unset any colliding variables in your env and rely on the aws credentials file to provide the appropriate access credentials from the default entry.

User Data script to call aws cli

I am trying to get some files from S3 on startup in an EC2 instance by using a User Data script and the command
/usr/bin/aws s3 cp ...
The log tells me that permission was denied and I believe it is due to aws cli not finding any credentials when executing the user data script.
Running the command with sudo after the instance has started works fine.
I have run aws configure both with sudo and without.
I do not want to use cronjob to run something on startup since I am working with an AMI and often need to change the script, therefore it is more convenient for me to change the user data instead of creating a new AMI everytime the script changes.
If possible, I would also like to avoid writing the credentials into the script.
How can I configure awscli in such a way that the credentials are used when running a user data script?
I suggest you remove the AWS credentials from the instance/AMI. Your userdata script will be supplied with temporary credentials when needed by the AWS metadata server.
See: IAM Roles for Amazon EC2
Clear/delete AWS credentials configurations from your instance and create an AMI
Create a policy that has the minimum privileges to run your script
Create a IAM role and attach the policy you just created
Attach the IAM role when you launch the instance (very important)
Have your userdata script call /usr/bin/aws s3 cp ... without supplying credentials explicitly or using credentials file
You can configure your EC2 instance to receive a pre-defined IAM Role that has its credentials "baked-in" to the instance that it fetches upon instantiation, which it turn will allow it to call aws-cli commands in your User Data script without the need to configure credentials at all.
Here's more info on IAM Roles for EC2:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
It's worth noting here that you'll need to attach the appropriate policies to the IAM Role that you assign to your instance in order for the aws-cli commands to succeed. More information on that can be found here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#working-with-iam-roles

How to use multiple AWS accounts from the command line?

I've got two different apps that I am hosting (well the second one is about to go up) on Amazon EC2.
How can I work with both accounts at the command line (Mac OS X) but keep the EC2 keys & certificates separate? Do I need to change my environment variables before each ec2-* command?
Would using an alias and having it to the setting of the environment in-line work? Something like: alias ec2-describe-instances1 = export EC2_PRIVATE_KEY=/path; ec2-describe-instances
You can work with two accounts by creating two profiles on the aws command line.
It will prompt you for your AWS Access Key ID, AWS Secret Access Key and desired region, so have them ready.
Examples:
$ aws configure --profile account1
$ aws configure --profile account2
You can then switch between the accounts by passing the profile on the command.
$ aws dynamodb list-tables --profile account1
$ aws s3 ls --profile account2
Note:
If you name the profile to be default it will become default profile i.e. when no --profile param in the command.
More on default profile
If you spend more time using account1, you can make it the default by setting the AWS_DEFAULT_PROFILE environment variable. When the default environment variable is set, you do not need to specify the profile on each command.
Linux, OS X Example:
$ export AWS_DEFAULT_PROFILE=account1
$ aws dynamodb list-tables
Windows Example:
$ set AWS_DEFAULT_PROFILE=account1
$ aws s3 ls
How to set "manually" multiple AWS accounts ?
1) Get access - key
AWS Console > Identity and Access Management (IAM) > Your Security Credentials > Access Keys
2) Set access - file and content
~/.aws/credentials
[default]
aws_access_key_id={{aws_access_key_id}}
aws_secret_access_key={{aws_secret_access_key}}
[{{profile_name}}]
aws_access_key_id={{aws_access_key_id}}
aws_secret_access_key={{aws_secret_access_key}}
3) Set profile - file and content
~/.aws/config
[default]
region={{region}}
output={{output:"json||text"}}
[profile {{profile_name}}]
region={{region}}
output={{output:"json||text"}}
4) Run - file with params
Install command-line app - and use AWS Command Line it, for example for product AWS EC2
aws ec2 describe-instances -- default
aws ec2 describe-instances --profile {{profile_name}} -- [{{profile_name}}]
Ref
https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html
IMHO, the easiest way is to edit .aws/credentials and .aws/config files manually.
It's easy and it works for Linux, Mac and Windows. Just read this for more detail (1 minute read).
.aws/credentials file:
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
[user1]
aws_access_key_id=AKIAI44QH8DHBEXAMPLE
aws_secret_access_key=je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
.aws/config file:
[default]
region=us-west-2
output=json
[profile user1] <-- 'profile' in front of 'profile_name' (not for default)!!
region=us-east-1
output=text
You should be able to use the following command-options in lieu of the EC2_PRIVATE_KEY (and even EC2_CERT) environment variables:
-K <private key>
-C <certificate>
You can put these inside aliases, e.g.
alias ec2-describe-instances1 ec2-describe-instances -K /path/to/key.pem
Create or edit this file:
vim ~/.aws/credentials
List as many key pairs as you like:
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
[user1]
aws_access_key_id=AKIAI44QH8DHBEXAMPLE
aws_secret_access_key=je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
Set a local variable to select the pair of keys you want to use:
export AWS_PROFILE=user1
Do what you like:
aws s3api list-buckets # any aws cli command now using user1 pair of keys
You can also do it command by command by including --profile user1 with each command:
aws s3api list-buckets --profile user1
# any aws cli command now using user1 pair of keys
More details: Named profiles for the AWS CLI
The new aws tools now support multiple profiles.
If you configure access with the tools, it automatically creates a default in ~/.aws/config.
You can then add additional profiles - more details at: Getting started with the AWS CLI
I created a simple tool, aaws, to switch between AWS accounts.
It works by setting the AWS_DEFAULT_PROFILE in your shell. Just make sure you have some entries in your ~/.aws/credentials file and it will easily switch between multiple accounts.
/tmp
$ aws s3 ls
Unable to locate credentials. You can configure credentials by running "aws configure".
/tmp
$ aaws luk3
[luk3] 🔐 /tmp
$ aws s3 ls
2013-11-05 21:40:04 luk3thomas.com
I wrote a toolkit to switch default AWS profile.
The mechanism is physically moving the profile key to the default section in config and credentials files.
The better solution today should be one of the following ways:
Use aws command option --profile.
Use environment variable AWS_PROFILE.
I don't remember why I didn't use the solution of --profile, maybe I was not realized its existence.
However the toolkit can still be useful by doing other things. I'll add a soft switch flag by using the way of AWS_PROFILE in the future.
$ xsh list aws/cfg
[functions] aws/cfg/move
[functions] aws/cfg/set
[functions] aws/cfg/activate
[functions] aws/cfg/get
[functions] aws/cfg/delete
[functions] aws/cfg/list
[functions] aws/cfg/copy
Repo: https://github.com/xsh-lib/aws
Install:
curl -s https://raw.githubusercontent.com/alexzhangs/xsh/master/boot | bash && . ~/.xshrc
xsh load xsh-lib/aws
Usage:
xsh aws/cfg/list
xsh aws/cfg/activate <profilename>
You can write shell script to set corresponding values of environment variables for each account based on user input. Doing so, you don't need to create any aliases and, furthermore, tools like ELB tools, Auto Scaling Command Line Tools will work under multiple accounts as well.
To use an IAM role, you have to make an API call to STS:AssumeRole, which will return a temporary access key ID, secret key, and security token that can then be used to sign future API calls. Formerly, to achieve secure cross-account, role-based access from the AWS Command Line Interface (CLI), an explicit call to STS:AssumeRole was required, and your long-term credentials were used. The resulting temporary credentials were captured and stored in your profile, and that profile was used for subsequent AWS API calls. This process had to be repeated when the temporary credentials expired (after 1 hour, by default).
More details: How to Use a Single IAM User to Easily Access All Your Accounts by Using the AWS CLI