When to use boto3 sessions explicitly

By default, boto3 creates sessions whenever required. According to the documentation,
it is possible and recommended to maintain your own session(s) in some
scenarios.
My understanding is that if I create a session myself, I can reuse that one session across the application instead of boto3 automatically creating multiple sessions, and I can also pass credentials from code.
Has anyone ever maintained sessions on their own? If so, what advantage did it provide apart from the ones mentioned above?
# Option 1: module-level client (boto3 manages a default session)
secrets_manager = boto3.client('secretsmanager')
# Option 2: client created from an explicit session
session = boto3.session.Session()
secrets_manager = session.client('secretsmanager')
Is there any advantage to using one over the other, and which one is recommended in this case?
References: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/session.html

I have seen the second method used when you wish to provide specific credentials without using the standard Credentials Provider Chain.
For example, when assuming a role, you can use the new temporary credentials to create a session, then create a client from the session.
From boto3 sessions and aws_session_token management:
import boto3

role_info = {
    'RoleArn': 'arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/<AWS_ROLE_NAME>',
    'RoleSessionName': '<SOME_SESSION_NAME>'
}

client = boto3.client('sts')
credentials = client.assume_role(**role_info)

session = boto3.session.Session(
    aws_access_key_id=credentials['Credentials']['AccessKeyId'],
    aws_secret_access_key=credentials['Credentials']['SecretAccessKey'],
    aws_session_token=credentials['Credentials']['SessionToken']
)
You could then use: s3 = session.client('s3')
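As a quick illustration of that last line (just a sketch; any client or resource created from this session signs its requests with the assumed-role credentials):
# every client/resource built from the session inherits the temporary credentials
s3 = session.client('s3')
bucket_names = [b['Name'] for b in s3.list_buckets().get('Buckets', [])]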

Here is an example where I needed to use the session object with both boto3 and AWS Data Wrangler to set the region for both:
import os
import boto3
import awswrangler as wr

REGION = os.environ.get("REGION")
session = boto3.Session(region_name=REGION)
client_rds = session.client("rds-data")
df = wr.s3.read_parquet(path=path, boto3_session=session)

Related

boto3/aws: resource vs session

I can use a resource like this:
s3_resource = boto3.resource('s3')
s3_bucket = s3_resource.Bucket(bucket)
I can also use a session like this:
session = boto3.session.Session()
s3_session = session.resource("s3", endpoint_url=self.endpoint_url)
s3_obj = s3_session.Object(self.bucket, key)
Internally, does session.resource("s3") use boto3.resource('s3')?
Normally, people ask about boto3 client vs resource.
Calls using client are direct API calls to AWS, while resource is a higher-level Pythonic way of accessing the same information.
In your examples, you are using session, which is merely a way of caching credentials. The session can then be used for either client or resource.
For example, when calling the AWS STS assume_role() command, a set of temporary credentials is returned. These can be stored in a session and API calls can be made using these credentials.
There is effectively no difference between your code samples, since no specific information has been stored in the session object. If you have nothing to specifically configure in the session, then you can skip it entirely.
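To make that concrete, here is a minimal sketch (the profile name is a hypothetical entry in ~/.aws/credentials) of when the session actually changes behaviour:
import boto3

# effectively equivalent: the module-level helper lazily creates a default Session behind the scenes
s3_a = boto3.resource('s3')
s3_b = boto3.session.Session().resource('s3')

# the session matters once you configure it, e.g. with a named profile or region
configured = boto3.session.Session(profile_name='my-profile', region_name='eu-west-1')
s3_c = configured.resource('s3')  # inherits that profile's credentials and the region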

How to get python package `awswrangler` to accept a custom `endpoint_url`

I'm attempting to use the python package awswrangler to access a non-AWS S3 service.
The AWS Data Wrangler docs state that you need to create a boto3.Session() object.
The problem is that boto3.client() supports setting the endpoint_url, but boto3.Session() does not (docs here).
In my previous uses of boto3 I've always used the client for this reason.
Is there a way to create a boto3.Session() with a custom endpoint_url or otherwise configure awswrangler to accept the custom endpoint?
I finally found the configuration for awswrangler:
import awswrangler as wr
wr.config.s3_endpoint_url = 'https://custom.endpoint'
Any configuration variables for awswrangler can be overwritten directly using the wr.config config object as you stated in your answer, but it may be cleaner or preferable in some use cases to use environment variables.
In that case, simply set WR_S3_ENDPOINT_URL to your custom endpoint, and the configuration will reflect that when you import the library.
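For example (the endpoint URL below is just a placeholder), setting the variable before the import is enough:
import os
os.environ["WR_S3_ENDPOINT_URL"] = "https://custom.endpoint"  # placeholder endpoint

import awswrangler as wr  # the configuration picks up the environment variable on import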
Once you create your session, you can use client as well. For example:
import boto3
session = boto3.Session()
s3 = session.client('s3', endpoint_url='<custom-endpoint>')

How to get the value of aws iam list-account-aliases as variable?

I want to write a Python program to check which account I am in by using the account alias (I have multiple tenants set up on AWS). I think aws iam list-account-aliases returns exactly what I am looking for, but it is a command-line result and I am not sure of the best way to capture it as a variable in a Python program.
I was also reading about aws iam list-account-aliases, and the output section mentions AccountAliases -> (list). (https://docs.aws.amazon.com/cli/latest/reference/iam/list-account-aliases.html)
I wonder what this AccountAliases is: an option? a command? a variable? I am a little confused here.
Thank you!
Use Boto3 to get the account alias in Python. Your link points to the AWS CLI.
Here is the link for equivalent Boto3 command for python:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.list_account_aliases
Sample code:
import boto3
client = boto3.client('iam')
response = client.list_account_aliases()
Response:
{
    'AccountAliases': [
        'myawesomeaccount',
    ],
    'IsTruncated': True,
}
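Since the question was about capturing the value as a variable, here is a small sketch of pulling it out of that response (assuming the account may or may not have an alias set):
aliases = response['AccountAliases']
account_alias = aliases[0] if aliases else None  # None if no alias has been set
print(account_alias)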
An account alias on AWS is a readable alias created for an AWS account ID. You can find more info here:
https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html
AWS provides a well-documented library for Python called Boto3, which can be used to connect to your AWS account as a client, resource, or session (more information on these in the SO answer here: Difference in boto3 between resource, client, and session?)
For your use case you can connect to AWS as a client using the 'iam' service name:
import boto3

client = boto3.client(
    'iam',
    region_name="region_name",
    aws_access_key_id="your_aws_access_key_id",
    aws_secret_access_key="your_aws_secret_access_key")
account_aliases = client.list_account_aliases()
The response is a dictionary that can be traversed to get the desired information on aliases.

botocore.exceptions.NoCredentialsError: Unable to locate credentials , Even after passing credentials manually

Hi, I am a newbie at creating Flask applications. I have created a small GUI to upload files to an S3 bucket.
Here is the code snippet which is handling the same
s3 = boto3.client('s3', region_name="eu-west-1",
                  endpoint_url=S3_LOCATION, aws_access_key_id=S3_KEY, aws_secret_access_key=S3_SECRET)
myclient = boto3.resource('s3')

file = request.files['file[]']
filename = file.filename
data_files = request.files.getlist('file[]')
for data_file in data_files:
    file_contents = data_file.read()
    ts = time.gmtime()
    k = time.strftime("%Y-%m-%dT%H:%M:%S", ts)
    name = filename[0:-4]
    newfilename = (name + k + '.txt')
    myclient.Bucket(S3_BUCKET).put_object(Key=newfilename, Body=file_contents)
    message = 'File Uploaded Successfully'
    print('upload Successful')
That part works fine when I test it from my local system, but after uploading it to the EC2 instance, the line
myclient.Bucket(S3_BUCKET).put_object(Key=newfilename,Body=file_contents)
is where it is throwing the error:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I have created a file config.py where I am storing all the credentials and passing them at runtime.
I am not sure what is causing the error on the EC2 instance. Please help me with it.
You may be confusing boto3 service resource and client.
# !!!! this instantiates a client object called s3, with explicit credentials
s3 = boto3.client('s3', region_name="eu-west-1",
                  endpoint_url=S3_LOCATION, aws_access_key_id=S3_KEY, aws_secret_access_key=S3_SECRET)

# !!!! this instantiates a resource object called myclient that uses the
# credentials inside the .aws folder, since no explicit credentials are given
myclient = boto3.resource('s3')
It seems you are trying to pass explicit credentials to boto3.resource without using the .aws/credentials file and its default access key. If so, this is not the right way to do it. To pass credentials explicitly, it is recommended to use a boto3 Session (this works for boto3.client too). It also lets you connect to different AWS services from the one initialised session, instead of passing API keys for each service separately inside your program.
import boto3

session = boto3.Session(
    region_name='us-west-2',
    aws_access_key_id=S3_KEY,
    aws_secret_access_key=S3_SECRET)

# now instantiate the services
myclient = session.resource('s3')
# .... the rest of the code
Nevertheless, the better way is to make use of the .aws credentials file, because it is bad practice to hard-code any access key/password in the code. You can also use a profile name if you need to use a different API key in a different region, e.g.
~/.aws/credential
[default]
aws_access_key_id = XYZABC12345
aws_secret_access_key = SECRET12345
[appsinfinity]
aws_access_key_id = XYZABC12346
aws_secret_access_key = SECRET12346
~/.aws/config
[default]
region = us-west-1
[profile appsinfinity]
region = us-west-2
And the code:
import boto3

app_infinity_session = boto3.Session(profile_name='appsinfinity')
....
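To round that out, a small sketch (the service names are just examples) of using the profile-based session so that no keys appear in the code:
# credentials and region are resolved from the [appsinfinity] profile under ~/.aws
s3 = app_infinity_session.resource('s3')
rds = app_infinity_session.client('rds-data')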

Amazon S3 - Unable to create a datasource

I tried creating a datasource using boto for machine learning but ended up with an error.
Here's my code:
import boto

bucketname = 'mybucket'
filename = 'myfile.csv'
schema = 'myfile.csv.schema'
conn = boto.connect_s3()
datasource = 'my_datasource'
ml = boto.connect_machinelearning()

# create a data source
ds = ml.create_data_source_from_s3(
    data_source_id=datasource,
    data_spec={
        'DataLocationS3': 's3://' + bucketname + '/' + filename,
        'DataSchemaLocationS3': 's3://' + bucketname + '/' + schema},
    data_source_name=None,
    compute_statistics=True)
print ml.get_data_source(datasource, verbose=None)
I get this error as a result of the get_data_source call:
Could not access 's3://mybucket/myfile.csv'. Either there is no file at that location, or the file is empty, or you have not granted us read permission.
I have checked and I have FULL_CONTROL as my permissions. The bucket, file and schema all are present and are non-empty.
How do I solve this?
You may have FULL_CONTROL over that S3 resource but in order for this to work you have to grant the Machine Learning service the appropriate access to that S3 resource.
I know links to answers are frowned upon, but in this case I think it's best to link to the definitive documentation for the Machine Learning service, since the actual steps are complicated and could change in the future.
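For illustration only (a rough sketch, not the exact policy from the documentation; the bucket name is made up and the statements may differ from what Amazon ML actually requires), the general shape of a bucket policy granting the service read access, applied with boto3:
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "machinelearning.amazonaws.com"},
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::mybucket/*"
        },
        {
            "Effect": "Allow",
            "Principal": {"Service": "machinelearning.amazonaws.com"},
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::mybucket"
        }
    ]
}

s3 = boto3.client('s3')
s3.put_bucket_policy(Bucket='mybucket', Policy=json.dumps(policy))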