How to get python package `awswrangler` to accept a custom `endpoint_url` - amazon-web-services

I'm attempting to use the python package awswrangler to access a non-AWS S3 service.
The AWS Data Wrangler docs state that you need to create a boto3.Session() object.
The problem is that boto3.client() supports setting the endpoint_url, but boto3.Session() does not (docs here).
In my previous uses of boto3 I've always used the client for this reason.
Is there a way to create a boto3.Session() with a custom endpoint_url or otherwise configure awswrangler to accept the custom endpoint?

I finally found the configuration for awswrangler:
import awswrangler as wr
wr.config.s3_endpoint_url = 'https://custom.endpoint'

Any configuration variable for awswrangler can be overridden directly through the wr.config object, as you stated in your answer, but in some use cases it may be cleaner or preferable to use environment variables.
In that case, simply set WR_S3_ENDPOINT_URL to your custom endpoint, and the configuration will reflect that when you import the library.
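For example, a sketch of the environment-variable approach from Python (the endpoint URL is a placeholder):
import os

# Must be set before awswrangler is imported, since the value is picked up at import time
os.environ["WR_S3_ENDPOINT_URL"] = "https://custom.endpoint"  # placeholder endpoint

import awswrangler as wr
print(wr.config.s3_endpoint_url)  # should now reflect the environment variable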

Once you create your session, you can create a client from it as well. For example:
import boto3
session = boto3.Session()
s3 = session.client('s3', endpoint_url='<custom-endpoint>')
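Combining the two, a rough sketch for reading from a non-AWS S3 service with awswrangler through such a session (the endpoint and S3 path are placeholders):
import boto3
import awswrangler as wr

wr.config.s3_endpoint_url = 'https://custom.endpoint'  # placeholder endpoint
session = boto3.Session()

# awswrangler functions accept an explicit session via boto3_session
df = wr.s3.read_parquet(path='s3://my-bucket/data/', boto3_session=session)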

Related

Spring Cloud: using AWS Parameter Store only with certain profiles

I am integrating my application with AWS Parameter Store. For local development, which may have no access to AWS, I need to disable fetching property values from AWS and use the values from application.yml. The issue seems to be not application.yml but the dependencies: as soon as the AWS starter appears in the POM, the AWS integration is initialized and Spring tries to use AwsParamStorePropertySourceLocator. I guess what I need to do is force my application to use Spring's property source locator regardless of the AWS jar being on the classpath. I am not sure how to do that.
For Parameter Store it is quite easy: the AwsParamStoreBootstrapConfiguration bean is conditional on the property aws.paramstore.enabled. Creating an aws.paramstore.enabled environment variable and setting its value to false will disable the AWS Parameter Store integration.
I also tried disabling AWS Secrets Manager, and setting aws.secretsmanager.enabled to false is not sufficient. To fully disable it I had to exclude auto-configuration for a few classes:
import org.springframework.cloud.aws.autoconfigure.context.ContextCredentialsAutoConfiguration;
import org.springframework.cloud.aws.autoconfigure.context.ContextInstanceDataAutoConfiguration;
import org.springframework.cloud.aws.autoconfigure.context.ContextRegionProviderAutoConfiguration;
import org.springframework.cloud.aws.autoconfigure.context.ContextResourceLoaderAutoConfiguration;
import org.springframework.cloud.aws.autoconfigure.context.ContextStackAutoConfiguration;
import org.springframework.cloud.aws.autoconfigure.mail.MailSenderAutoConfiguration;
@Configuration
@Profile("local")
@EnableAutoConfiguration(exclude = { ContextCredentialsAutoConfiguration.class,
        ContextInstanceDataAutoConfiguration.class, ContextRegionProviderAutoConfiguration.class,
        ContextResourceLoaderAutoConfiguration.class, ContextStackAutoConfiguration.class,
        MailSenderAutoConfiguration.class })
public class LocalScanConfig {
}

When to use boto3 sessions explicitly

By default boto3 creates sessions whenever required. According to the documentation, it is possible and recommended to maintain your own session(s) in some scenarios.
My understanding is that if I create a session myself, I can reuse the same session across the application instead of boto3 automatically creating multiple sessions, or I can use it when I want to pass credentials from code.
Has anyone ever maintained sessions on their own? If so, what advantage did it provide apart from the one mentioned above?
# Implicit default session
secrets_manager = boto3.client('secretsmanager')

# Explicit session
session = boto3.session.Session()
secrets_manager = session.client('secretsmanager')
Is there any advantage to using one over the other, and which one is recommended in this case?
References: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/session.html
I have seen the second method used when you wish to provide specific credentials without using the standard Credentials Provider Chain.
For example, when assuming a role, you can use the new temporary credentials to create a session, then create a client from the session.
From boto3 sessions and aws_session_token management:
import boto3
role_info = {
    'RoleArn': 'arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/<AWS_ROLE_NAME>',
    'RoleSessionName': '<SOME_SESSION_NAME>'
}

client = boto3.client('sts')
credentials = client.assume_role(**role_info)

session = boto3.session.Session(
    aws_access_key_id=credentials['Credentials']['AccessKeyId'],
    aws_secret_access_key=credentials['Credentials']['SecretAccessKey'],
    aws_session_token=credentials['Credentials']['SessionToken']
)
You could then use: s3 = session.client('s3')
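If the point is to reuse the same credentials across the application, the same session can back several clients and resources; a small sketch building on the session above:
# All of these share the assumed-role credentials held by the session
s3 = session.client('s3')
sqs = session.client('sqs')
dynamodb = session.resource('dynamodb')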
Here is an example where I needed to use the session object with both Boto3 and AWS Data Wrangler to set the region for both:
import os
import boto3
import awswrangler as wr

REGION = os.environ.get("REGION")
session = boto3.Session(region_name=REGION)
client_rds = session.client("rds-data")
df = wr.s3.read_parquet(path=path, boto3_session=session)  # 'path' defined elsewhere

How to get the value of aws iam list-account-aliases as a variable?

I want to write a Python program to check which account I am in by using the account alias (I have multiple tenants set up on AWS). I think aws iam list-account-aliases returns exactly what I am looking for, but it is a command-line result and I am not sure of the best way to capture it as a variable in a Python program.
Also, I was reading about aws iam list-account-aliases and its output section mentions AccountAliases -> (list). (https://docs.aws.amazon.com/cli/latest/reference/iam/list-account-aliases.html)
I wonder what this AccountAliases is? An option? A command? A variable? I was a little confused here.
Thank you!
Use Boto3 to get the account alias in Python; your link points to the aws-cli documentation.
Here is the link to the equivalent Boto3 call for Python:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.list_account_aliases
Sample code:
import boto3
client = boto3.client('iam')
response = client.list_account_aliases()
Response:
{
    'AccountAliases': [
        'myawesomeaccount',
    ],
    'IsTruncated': True,
}
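To capture the alias as a variable in your program, index into that response (a minimal sketch assuming at least one alias exists):
account_alias = response['AccountAliases'][0]
print(account_alias)  # e.g. 'myawesomeaccount'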
An account alias on AWS is a readable alias associated with an AWS account ID. You can find more info here:
https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html
AWS provides a very well documented library for Python called Boto3, which can be used to connect to your AWS account as a client, resource, or session (more information on these in the SO answer here: Difference in boto3 between resource, client, and session?).
For your use case you can connect to AWS as a client using the 'iam' service name:
import boto3
client = boto3.client(
    'iam',
    region_name="region_name",
    aws_access_key_id="your_aws_access_key_id",
    aws_secret_access_key="your_aws_secret_access_key")

account_aliases = client.list_account_aliases()
The response is a JSON object which can be traversed to get the desired information on aliases.
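Since the result can be truncated (note IsTruncated in the sample response above), a paginator is a safer way to collect every alias; a sketch:
paginator = client.get_paginator('list_account_aliases')
aliases = []
for page in paginator.paginate():
    aliases.extend(page['AccountAliases'])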

How to create an AMI with a specific image-id using moto?

I'm using moto to mock AWS for my application. I'm wondering if it is possible to create an AMI in moto with a specific image-id (for example: ami-1a2b3c4d).
Thank you!
You want to use preloaded resources like this file: https://github.com/spulec/moto/blob/master/moto/ec2/resources/amis.json
You can use the following environment variable: MOTO_AMIS_PATH=/full/path/to/amis.json
This JSON file has to be in the same format as the one linked above. Note that the environment variable has to be set before Moto is initialised: these AMIs are loaded the moment you call from moto import mock_ec2, so the variable has to be set before that import takes place.
(Copied from https://stackoverflow.com/a/72270977/7224682)
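Putting it together, a sketch of a test that relies on such a preloaded AMI (it assumes /full/path/to/amis.json contains an entry for ami-1a2b3c4d in the same format as the linked file):
import os

# Must be set before moto is imported, as explained above
os.environ["MOTO_AMIS_PATH"] = "/full/path/to/amis.json"

import boto3
from moto import mock_ec2

@mock_ec2
def test_custom_ami_is_preloaded():
    client = boto3.client("ec2", region_name="us-east-1")
    image = client.describe_images(ImageIds=["ami-1a2b3c4d"])["Images"][0]
    assert image["ImageId"] == "ami-1a2b3c4d"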
Here is an example coming straight from the docs:
import boto3

from . import add_servers
from moto import mock_ec2

@mock_ec2
def test_add_servers():
    add_servers('ami-XXXXXXXX', 2)

    client = boto3.client('ec2', region_name='us-west-1')
    instances = client.describe_instances()['Reservations'][0]['Instances']
    assert len(instances) == 2

    instance1 = instances[0]
    assert instance1['ImageId'] == 'ami-XXXXXXXX'
You can choose the AMI ID to be whatever you want; there are no restrictions. I'm not sure I understand what the problem is, as these are "mock" resources, so they can be in any format and contain any name that you want.

kinesis stream account incorrect

I have set up my PC with Python and connections to AWS. This was successfully tested using the s3_sample.py file; I had to create an IAM user account with the credentials in a file, which worked fine for S3 buckets.
My next task was to create an MQTT bridge and put some data into a Kinesis stream using the AWS Labs project awslabs/mqtt-kinesis-bridge.
This all seems to be OK, except I get an error when I run bridge.py. The error is:
Could not find ACTIVE stream:my_first_stream error:Stream my_first_stream under account 673480824415 not found.
Strangely, this is not the account I use in the .boto file that is suggested to be set up for this bridge, which contains the same credentials I used for the S3 bucket:
[Credentials]
aws_access_key_id = AA1122BB
aws_secret_access_key = LlcKb61LTglis
It would seem to me that bridge.py has a hardcoded account, but I cannot see it, nor can I see where it points to the .boto file for credentials.
Thanks in Advance
So the issue of not finding the ACTIVE stream for the account is resolved by:
ensuring you are hooked into the US-EAST-1 data centre, as this is the default for bridge.py
creating your stream there; you will only need 1 shard (a boto3 sketch follows below)
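If you prefer to do that from code rather than the console, a boto3 sketch for creating the stream (the bridge itself uses the older boto library; the stream name matches the one in the error):
import boto3

kinesis = boto3.client('kinesis', region_name='us-east-1')
kinesis.create_stream(StreamName='my_first_stream', ShardCount=1)

# Wait until the stream is ACTIVE before starting bridge.py
kinesis.get_waiter('stream_exists').wait(StreamName='my_first_stream')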
The next problem stems from the specific versions of MQTT and the python library paho-mqtt I installed. The bridge application was written with the API of MQTT 1.2.1 and paho-mqtt 0.4.91 in mind.
The new version available for download on their website interacts with the paho-mqtt library differently, passing an additional "flags" object to the on_connect callback. This generates the error I was experiencing, since the callback is not expecting the fifth argument.
You should be able to fix it by making the following change to bridge.py
Line 104 currently looks like this:
def on_connect(self, mqttc, userdata, msg):
Simply add flags after userdata, so that the callback function looks like this:
def on_connect(self, mqttc, userdata, flags, msg):
This should resolve the issue of the final error of the incorrect number of arguments being passed.
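Outside of bridge.py, the same four-argument pattern applies to any paho-mqtt 1.x on_connect callback; a minimal sketch, assuming a broker on localhost:
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # 'flags' is the extra argument added by newer paho-mqtt versions
    print('Connected with result code', rc)

client = mqtt.Client()
client.on_connect = on_connect
client.connect('localhost', 1883)  # assumed broker address
client.loop_start()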
Hope this helps others; thanks for the efforts.
When you call the Python SDK for an AWS service, there is a line in bridge.py that imports the boto module:
import boto
By default, this points boto to the .boto file for credentials.
Here is the explanation from the Boto Config documentation:
A boto config file is a text file formatted like an .ini configuration file that specifies values for options that control the behavior of the boto library. In Unix/Linux systems, on startup, the boto library looks for configuration files in the following locations and in the following order:
/etc/boto.cfg - for site-wide settings that all users on this machine will use
~/.boto - for user-specific settings
~/.aws/credentials - for credentials shared between SDKs
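One way to confirm which credentials boto actually picked up is to inspect its config object after import; a sketch, assuming boto 2:
import boto

# boto.config is populated from /etc/boto.cfg and ~/.boto when boto is imported
print(boto.config.get('Credentials', 'aws_access_key_id'))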
Of course, you can also set the environment variables directly:
export AWS_ACCESS_KEY_ID="Your AWS Access Key ID"
export AWS_SECRET_ACCESS_KEY="Your AWS Secret Access Key"