I am attempting to connect to Athena from RStudio using DBI::dbConnect, but I am having trouble opening the driver.
con <- DBI::dbConnect(
  odbc::odbc(),
  Driver = "[Simba Athena ODBC Driver]",
  S3OutputLocation = "[s3://bucket-folder/]",
  AwsRegion = "[region]",
  AuthenticationType = "IAM Credentials",
  Schema = "[schema]",
  UID = rstudioapi::askForPassword("AWS Access Key"),
  PWD = rstudioapi::askForPassword("AWS Secret Key"))
Error: nanodbc/nanodbc.cpp:983: 00000: [unixODBC][Driver Manager]Can't open lib '[Simba Athena ODBC Driver]' : file not found
In addition, this code returns nothing.
sort((unique(odbcListDrivers()[[1]])))
character(0)
It appears that my ODBC driver is inaccessible or incorrectly installed, but I am having trouble understanding why. I have downloaded the driver and can see it in my library.
Any insight is greatly appreciated!
The function arguments look strange. Remove the [] from Driver, S3OutputLocation, AwsRegion, and Schema: the brackets are placeholder notation, so the driver manager is literally searching for a library named '[Simba Athena ODBC Driver]'.
I solved it by validating the list of drivers R recognizes with odbc::odbcListDrivers(), then adjusting the name in the Driver argument accordingly. If R still can't find the driver, setting ODBCSYSINI=/folder_that_contains_odbcinst.ini/ in .Renviron fixed it for me.
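If you want to sanity-check the setup outside of R, the same two steps look like this in Python with pyodbc; this is only a sketch, and the driver name, region, bucket, and schema are placeholders to adapt:

import pyodbc

# The pyodbc analogue of odbc::odbcListDrivers(); if this prints an empty
# list, ODBCSYSINI is probably not pointing at the folder with odbcinst.ini.
print(pyodbc.drivers())

# Curly braces are standard ODBC quoting for names containing spaces; note
# there are no square brackets anywhere in the values.
con = pyodbc.connect(
    "Driver={Simba Athena ODBC Driver};"     # must match a name printed above
    "AwsRegion=eu-west-1;"                   # placeholder region
    "S3OutputLocation=s3://bucket-folder/;"  # placeholder staging location
    "Schema=default;"                        # placeholder schema
    "AuthenticationType=IAM Credentials;"
    "UID=YOUR_ACCESS_KEY;"
    "PWD=YOUR_SECRET_KEY"
)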
Related
I am using JDBC to connect to Athena with a specific workgroup, but the connection is redirecting to the primary workgroup by default.
Below is the code snippet
Properties info = new Properties();
info.put("user", "access-key");
info.put("password", "secrect-access-key");
info.put("WorkGroup","test");
info.put("schema", "testschema");
info.put("s3_staging_dir", "s3://bucket/athena/temp");
info.put("aws_credentials_provider_class","com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
Class.forName("com.simba.athena.jdbc.Driver");
Connection connection = DriverManager.getConnection("jdbc:awsathena://athena.<region>.amazonaws.com:443/", info);
As you can see, I am passing "WorkGroup" as the property key. I also tried "workgroup" and "work-group". The connection never picks up the specified workgroup; it always goes to the default, i.e. the primary workgroup.
Kindly help. Thanks
If you look at the release notes of the Athena JDBC driver, workgroup support was added in v2.0.7.
If your jar is below this version, it will not work. Try upgrading the library to 2.0.7 or above.
You also need to enable "Override client-side settings" on the workgroup, then rerun the query via JDBC.
Check this doc for more information.
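For reference, a rough sketch of the same connection with a 2.0.7+ jar, driven from Python via JayDeBeApi instead of plain Java; the jar path and the workgroup, bucket, and credential values are placeholders, and the property appears to be spelled "Workgroup" in the 2.0.7+ documentation, so double-check it against your driver version:

import jaydebeapi

# Open the Athena JDBC connection through the 2.0.7+ Simba jar (placeholder path).
conn = jaydebeapi.connect(
    "com.simba.athena.jdbc.Driver",
    "jdbc:awsathena://athena.<region>.amazonaws.com:443/",
    {
        "user": "access-key",
        "password": "secret-access-key",
        "S3OutputLocation": "s3://bucket/athena/temp",
        "Workgroup": "test",  # silently ignored by jars older than 2.0.7
    },
    jars="/path/to/AthenaJDBC42-2.0.7.jar",
)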
I am trying to write a PySpark DataFrame to a Postgres database with the following code:
mode = "overwrite"
url = "jdbc:postgresql://host/database"
properties = {"user": "user", "password": "password", "driver": "org.postgresql.Driver"}
dfTestWrite.write.jdbc(url=url, table="test_result", mode=mode, properties=properties)
However, I am getting the following error:
An error occurred while calling o236.jdbc.
: java.lang.ClassNotFoundException: org.postgresql.Driver
I've found a few SO questions that address a similar issue, but nothing that helps. I followed the AWS docs here to add the configuration, and from the EMR console it looks as though it was applied successfully.
What am I doing wrong?
The document you followed adds a database connector for Presto; it is not a way to add a JDBC driver to Spark. A connector is not the same thing as a driver.
You should download the PostgreSQL JDBC driver and place it in Spark's lib directory, or somewhere you can point to through configuration.
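For example, a minimal PySpark sketch that ships the jar with the session itself; the jar path and version are placeholders for whatever you download:

from pyspark.sql import SparkSession

# Point Spark at the PostgreSQL JDBC jar explicitly (placeholder path).
spark = (
    SparkSession.builder
    .appName("postgres-write")
    .config("spark.jars", "/home/hadoop/postgresql-42.2.5.jar")
    .getOrCreate()
)

df = spark.createDataFrame([(1, "a")], ["id", "val"])
df.write.jdbc(
    url="jdbc:postgresql://host/database",
    table="test_result",
    mode="overwrite",
    properties={"user": "user", "password": "password",
                "driver": "org.postgresql.Driver"},
)

Passing --jars /home/hadoop/postgresql-42.2.5.jar to spark-submit achieves the same thing without touching the code.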
We have managed to get a valid connection from Azure Data Factory to our Azure Databricks cluster using the Spark (ODBC) connector. In the list of tables we get the expected entries, but querying a specific table raises an exception.
ERROR [HY000] [Microsoft][Hardy] (35) Error from server: error code:
'0' error message:
'com.databricks.backend.daemon.data.common.InvalidMountException:
Error while using path xxxx for resolving path xxxx within mount at
'/mnt/xxxx'.'.. Activity ID:050ac7b5-3e3f-4c8f-bcd1-106b158231f3
In our case the Databricks tables are backed by Parquet files mounted from Azure Data Lake Storage Gen2, which is what the exception above refers to. Any suggestions on how to solve this issue?
P.S. The same error appears when connecting from Power BI Desktop.
Thanks
Bart
In your configuration to mount the lake, can you add this setting:
"fs.azure.createRemoteFileSystemDuringInitialization": "true"
I haven't tried your exact scenario - however this solved a similar problem for me using Databricks-Connect.
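For reference, the setting goes into the extra_configs map of the mount call. A minimal sketch of an ADLS Gen2 mount from a Databricks notebook (Python), where the service principal, secret scope, and storage names are all placeholders:

# dbutils is available implicitly in Databricks notebooks.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="my-scope", key="sp-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    # The setting suggested above:
    "fs.azure.createRemoteFileSystemDuringInitialization": "true",
}

dbutils.fs.mount(
    source="abfss://container@storageaccount.dfs.core.windows.net/",
    mount_point="/mnt/xxxx",
    extra_configs=configs,
)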
I am trying to establish a connection between SAS & AWS Athena.
I am working on RHEL 6.7 with Java 1.8.0_71.
Could someone advise how to configure that please?
So far, after some reading on "Accessing Amazon Athena with JDBC", I have tried a naive 'maybe it will work' approach: I downloaded the Athena JDBC jar and set up a DSN in the odbc.ini files (outside of SAS), configuring the connection the same way I did for EMR.
odbc.ini:
[ODBC]
# Specify any global ODBC configuration here such as ODBC tracing.
[ODBC Data Sources]
ATHENA=Amazon Athena JDBC Driver
[ATHENA]
Driver=/opt/amazon/hiveodbc/lib/64/AthenaJDBC41-1.1.0.jar
HOST=jdbc:awsathena://athena.eu-west-1.amazonaws.com:443?s3_staging_dir=s3://aws-athena-query-results/sas/
odbcinst.ini:
[ODBC Drivers]
Amazon Athena JDBC Driver=Installed
[Amazon Athena JDBC Driver]
Description=Amazon Athena JDBC Driver
Driver=/opt/amazon/hiveodbc/lib/64/AthenaJDBC41-1.1.0.jar
## The option below is for using unixODBC when compiled with -DSQL_WCHART_CONVERT.
## Execute 'odbc_config --cflags' to determine if you need to uncomment it.
# IconvEncoding=UCS-4LE
iODBC throws the following:
iODBC Demonstration program
This program shows an interactive SQL processor
Driver Manager: 03.52.0709.0909
Enter ODBC connect string (? shows list): DSN=ATHENA
1: SQLDriverConnect = [iODBC][Driver Manager]/opt/amazon/hiveodbc/lib/64/AthenaJDBC41-1.1.0.jar: invalid ELF header (0) SQLSTATE=00000
2: SQLDriverConnect = [iODBC][Driver Manager]Specified driver could not be loaded (0) SQLSTATE=IM003
Any suggestion would be much appreciated!
I am a newbie to Informatica.
I am using Informatica 9.1.0 with Oracle 11g as the source and target database.
I created a table in the target database and tried to load data from source to target.
The table gets created in the target database, and my mapping and workflow are valid, but when I start the workflow it gives me the following error.
Message Code: RR_4036
Message: Error connecting to database [ Arun
ORA-00900: invalid SQL statement
Database driver error...
Function Name : executeDirect
SQL Stmt : Arun
Oracle Fatal Error
Database driver error...
Function Name : ExecuteDirect
Oracle Fatal Error
].
Please help me with a solution.
I found the solution.
Previously, while creating the relational connection for the session in the Relational Connection Editor, I had chosen "UTF-8 encoding of Unicode" for the code page option. I changed it to "MS Windows Latin 1 (ANSI), superset of Latin 1" and restarted the workflow, which then succeeded.
The following video shows how to create a relational connection for a session:
http://www.youtube.com/watch?v=oM2d-IHfRUw