Apache Zeppelin with Athena handling session token using jdbc Interpreter - amazon-web-services

I am trying to connect Athena with Apache Zeppelin. I need to handle the secret_key, Access_key, and Session_token, and I am having a hard time establishing the connection with the Zeppelin JDBC interpreter.
I am following the steps mentioned in this blog post.
If anyone can help me establish the connection using the AWS session token approach, that would be helpful.
Thank you

The main docs for this are here:
https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html
I found there are two driver versions, 1.1.0 and 1.0.1. I could only get Zeppelin working with 1.1.0, and the links on that page don't point to that file; the only way to get it was using the aws s3 cp command,
e.g.
aws s3 cp s3://athena-downloads/drivers/AthenaJDBC41-1.1.0.jar .
although I've given feedback on that page so it should be fixed soon.
Regarding the parameters, set default.user to the Access_Key and default.password to the Secret_key. The default.driver should be com.amazonaws.athena.jdbc.AthenaDriver.
The default.s3_staging_dir is actually the bucket where CSV results are written, so it needs to match your Athena settings.
There is no mention of where you might put a session token; however, you could always try putting it on the JDBC connection string (which goes in the default.url parameter value),
e.g.
jdbc:awsathena://athena.{REGION}.amazonaws.com:443?SessionToken=blahblahsomethingrealsessiontokengoeshere
but of course, replace {REGION} with the actual AWS region and use your real session token.
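Putting the above together, the interpreter settings would look roughly like this (the region and bucket are placeholders, and the SessionToken parameter is the untested guess described above, not a documented option), with AthenaJDBC41-1.1.0.jar added as an interpreter dependency:
default.driver           com.amazonaws.athena.jdbc.AthenaDriver
default.user             <your Access_Key>
default.password         <your Secret_key>
default.s3_staging_dir   s3://your-athena-query-results-bucket/
default.url              jdbc:awsathena://athena.us-east-1.amazonaws.com:443?SessionToken=<your session token>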

Related

AWS Service Quota: How to get service quota for Amazon S3 using boto3

I get the error "An error occurred (NoSuchResourceException) when calling the GetServiceQuota operation:" while trying to run the following boto3 Python code to get the value of the quota for "Buckets":
client_quota = boto3.client('service-quotas')
resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
In the above code, QuotaCode "L-89BABEE8" is for "Buckets". I presumed the value of ServiceCode for Amazon S3 would be "s3", so I put it there, but I guess that is wrong and is throwing the error. I tried to find documentation on the ServiceCode for S3 but could not find it. I even tried "S3" (uppercase 'S' here) and "Amazon S3", but those didn't work either.
What I tried:
client_quota = boto3.client('service-quotas')
resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
What I expected:
Output in the below format for S3. The example below is for EC2, which is the output of resp_ec2 = client_quota.get_service_quota(ServiceCode='ec2', QuotaCode='L-6DA43717')
I just played around with this and I'm seeing the same thing you are: empty responses from any Service Quotas list or get command for service s3. However, s3 is definitely the correct service code, because you see it come back from the Service Quotas list_services() call. Then I saw there are also list and get commands for AWS default service quotas, and when I tried those they came back with data. I'm not entirely sure, but based on the docs I think any quota that can't be adjusted, and possibly any quota your account hasn't requested an adjustment for, will come back with an empty response from get_service_quota(), and you'll need to run get_aws_default_service_quota() instead.
So I believe what you need to do is probably run this first:
client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
And if that throws an exception, then run the following:
client_quota.get_aws_default_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
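Putting both calls together, a small sketch of that fallback (catching the NoSuchResourceException from the question; I haven't verified every edge case) could look like this:
import boto3

client_quota = boto3.client('service-quotas')

try:
    # Applied (account-level) quota, if the service returns one
    resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
except client_quota.exceptions.NoSuchResourceException:
    # Fall back to the AWS default quota for the same quota code
    resp_s3 = client_quota.get_aws_default_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')

print(resp_s3['Quota']['Value'])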

Is there a way to specify AWS_SESSION_TOKEN when using SQLWorkbench and Athena JDBC driver?

I am using SQLWorkbench to connect to AWS Athena and the SQLWorkbench Variables section to specify AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. This works. However, when I have to connect to accounts that require AWS_SESSION_TOKEN, the connection fails. I can connect by modifying the credentials file, but that's inconvenient. Is there a better way?
I received an answer from AWS support, and at this point, according to them, it appears that the driver does not support an AWS_SESSION_TOKEN parameter.
To answer the question that appeared on the thread: if you have to use a session token, it appears that the only way is to modify your AWS credentials file. This can be done either by adding a section or by modifying default. Here is an example of a connection string for the former, where simba_session is a profile in the credentials file:
jdbc:awsathena://AwsRegion=us-west-2;AwsCredentialsProviderClass=com.simba.athena.amazonaws.auth.profile.ProfileCredentialsProvider;AwsCredentialsProviderArguments=simba_session;
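For reference, the simba_session profile referenced above would be a section in ~/.aws/credentials along these lines (placeholder values):
[simba_session]
aws_access_key_id = <your access key id>
aws_secret_access_key = <your secret access key>
aws_session_token = <your session token>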
If you don't need to use a session token, you can specify AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY by pressing the Variables button and adding the keys/values. In this case, the connection string can look like this:
jdbc:awsathena://AwsRegion=us-west-2;AwsCredentialsProviderClass=com.simba.athena.amazonaws.auth.DefaultAWSCredentialsProviderChain;
Also note that you can add an S3OutputLocation (if needed) and a Workgroup (if needed) by pressing the Extended Properties button and adding keys/values, rather than doing it in the connection string.

Export Data from AWS S3 bucket to MySql RDS instance using JDBC

I want to import CSVs from my S3 bucket into my MySQL RDS instance using JDBC. It is a one-time process, not an ongoing one. I am interested in knowing the end-to-end process.
Since you mentioned it's a one-time activity, I would suggest a direct CSV import to MySQL rather than using JDBC, unless you have a specific reason that you haven't mentioned in the question.
Here is an approach you could use:
Loop over your files in S3, then use the following command to import the data into MySQL RDS.
>mysqlimport [options] db_name textfile1 [textfile2 ...]
Please refer to the following for more details:
https://dev.mysql.com/doc/en/mysqlimport.html
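A rough Python sketch of that loop (using boto3 to pull the files down and shelling out to mysqlimport; the bucket, prefix, host, and credentials below are placeholders, and this is untested) could look like this:
import subprocess
import boto3

s3 = boto3.client('s3')
bucket = 'my-csv-bucket'   # placeholder bucket name
prefix = 'exports/'        # placeholder prefix holding the CSVs

for obj in s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get('Contents', []):
    key = obj['Key']
    if not key.endswith('.csv'):
        continue
    # mysqlimport derives the target table name from the file name,
    # so each CSV should be named after the table it loads into.
    local_file = key.split('/')[-1]
    s3.download_file(bucket, key, local_file)
    subprocess.run([
        'mysqlimport', '--local',
        '--host=my-rds-endpoint.rds.amazonaws.com',  # placeholder RDS endpoint
        '--user=admin', '--password=secret',         # placeholder credentials
        '--fields-terminated-by=,',
        'db_name', local_file,
    ], check=True)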
I hope this gives you a way to move forward. If I'm missing something, reframe your question and I can reattempt the answer.

InvalidSignatureException when using boto3 for dynamoDB on aws

I'm facing some sort of credentials issue when trying to connect to my DynamoDB on AWS. Locally it all works fine and I can connect using env variables for AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION and then
dynamoConnection = boto3.resource('dynamodb', endpoint_url='http://localhost:8000')
When changing to live creds in the env variables and setting the endpoint_url to the DynamoDB on AWS, this fails with:
"botocore.exceptions.ClientError: An error occurred (InvalidSignatureException) when calling the Query operation: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details."
The creds are valid, as they are used in a different app which talks to the same DynamoDB. I've also tried not using env variables but passing the credentials directly to the method, but the error persisted. Furthermore, to avoid any issues with trailing spaces, I've even used the credentials directly in the code. I'm using Python v3.4.4.
Is there maybe a header that should also be set that I'm not aware of? Any hints would be appreciated.
EDIT
I've now also created new credentials (to make sure there are only alphanumeric characters), but still no dice.
You shouldn't use the endpoint_url when you are connecting to the real DynamoDB service. That's really only for connecting to local services or non-standard endpoints. Instead, just specify the region you want:
dynamoConnection = boto3.resource('dynamodb', region_name='us-west-2')
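If it helps, here is a minimal sketch of a query using that connection (the table name and key below are placeholders for whatever your schema actually is):
import boto3
from boto3.dynamodb.conditions import Key

dynamoConnection = boto3.resource('dynamodb', region_name='us-west-2')
table = dynamoConnection.Table('my_table')  # placeholder table name
resp = table.query(KeyConditionExpression=Key('pk').eq('some-value'))  # placeholder key/value
print(resp['Items'])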
It's a sign that your system time is out of sync. Maybe you can check your:
1. Time zone
2. Time settings
If automatic time synchronization is available, use it to fix your time settings.
"sudo hwclock --hctosys" should do the trick.
Just wanted to point out that, accessing DynamoDB from a C# environment (using the AWS .NET SDK), I ran into this error, and the way I solved it was to create a new pair of AWS access/secret keys.
It worked immediately after I changed those keys in the code.

kinesis stream account incorrect

I have set up my PC with Python and connections to AWS. This has been successfully tested using the s3_sample.py file; I had to create an IAM user account with the credentials in a file, which worked fine for S3 buckets.
My next task was to create an MQTT bridge and put some data into a Kinesis stream using awslabs/mqtt-kinesis-bridge.
This all seems to be OK, except I get an error when I run bridge.py. The error is:
Could not find ACTIVE stream:my_first_stream error:Stream my_first_stream under account 673480824415 not found.
Strangely, this is not the account I use in the .boto file that is suggested to be set up for this bridge, which holds the same credentials I used for the S3 bucket:
[Credentials]
aws_access_key_id = AA1122BB
aws_secret_access_key = LlcKb61LTglis
It would seem to me that bridge.py has a hardcoded account, but I cannot see it, and I can't see where it is pointing to the .boto file for credentials.
Thanks in advance
So the issue of not finding the ACTIVE stream for the account is resolved by:
ensuring you are hooked into the US-EAST-1 data centre, as this is the default data centre for bridge.py
creating your stream; you will only need 1 shard
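For the second step, a quick way to create the stream (shown here with boto3, although bridge.py itself uses the older boto library; this is just a sketch) might be:
import boto3

# Create the stream in us-east-1, the region bridge.py expects, with a single shard
kinesis = boto3.client('kinesis', region_name='us-east-1')
kinesis.create_stream(StreamName='my_first_stream', ShardCount=1)

# The stream takes a short while to become ACTIVE; wait before running bridge.py
kinesis.get_waiter('stream_exists').wait(StreamName='my_first_stream')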
The next problem stems from the specific version of MQTT and the Python library paho-mqtt I installed. The bridge application was written with the API of MQTT 1.2.1 using paho-mqtt 0.4.91 in mind.
The new version, which is available for download on their website, interacts with the paho-mqtt library differently and passes an additional "flags" object to the on_connect callback. This generates the error I was experiencing, since the bridge's callback is not expecting the fifth argument.
You should be able to fix it by making the following change to bridge.py
Line 104 currently looks like this:
def on_connect(self, mqttc, userdata, msg):
Simply add flags, after userdata, so that the callback function looks like this:
def on_connect(self, mqttc, userdata, flags, msg):
This should resolve the final error about the incorrect number of arguments being passed.
Hope this helps others; thanks for the efforts.
When you call the Python SDK for an AWS service, there is a line in bridge.py that imports the boto module:
import boto
By default, boto points to the .boto file for credentials.
Here is the explanation from the Boto Config documentation:
Details
A boto config file is a text file formatted like an .ini configuration file that specifies values for options that control the behavior of the boto library. In Unix/Linux systems, on startup, the boto library looks for configuration files in the following locations and in the following order:
/etc/boto.cfg - for site-wide settings that all users on this machine will use
~/.boto - for user-specific settings
~/.aws/credentials - for credentials shared between SDKs
Of course, you can set the environment variables directly:
export AWS_ACCESS_KEY_ID="Your AWS Access Key ID"
export AWS_SECRET_ACCESS_KEY="Your AWS Secret Access Key"