Connect IntelliJ to Amazon Redshift - amazon-web-services

I'm using the latest version of IntelliJ and I've just created a cluster in Amazon Redshift. How do I connect IntelliJ to Redshift so that I can query it from my favorite IDE?

Download a jdbc driver:
http://docs.aws.amazon.com/redshift/latest/mgmt/configure-jdbc-connection.html#download-jdbc-driver
On IntelliJ: View |Tool Windows | Database
Click on "Data Source
Properties" ()
Click Add (+) and select "Database Driver":
Uncheck "JDBC drivers", and add a jdbc driver, select a class from the dropdown and select a PostgreSQL dialect:
6.Add a new connection, and use this datasource for your connection: (+ | Data Source | RedShift).
7.Set URL templates:
jdbc:redshift://[{host::localhost}[:{port::5439}]][/{database::postgres}?][\?<&,user={user:param},password={password:param},{:identifier}={:param}>]
jdbc:redshift://\[{host:ipv6:\:\:1}\][:{port::5439}][/{database::postgres}?][\?<&,user={user:param},password={password:param},{:identifier}={:param}>]
jdbc:redshift:{database::postgres}[\?<&,user={user:param},password={password:param},{:identifier}={:param}>]

You can connect IntelliJ to Redshift by the using the JDBC Driver supplied by Amazon. In the Redshift Console, go to "Connect Client" to get the driver.
Then, in the IntelliJ Data Source window, add the JAR as a Driver file, and use the following settings:
Class: com.amazon.redshift.jdbc41.Driver
URL template: jdbc:redshift://{host}:{port}/{database}
Common Pitfalls:
If the driver file is not readable or marked as in quarantine by OS X, you won't be able to select the driver class.
For a more detailed guide, see this blog post: Connecting IntelliJ to Redshift
Note: There is no native Redshift support in IntelliJ yet. IntelliJ Issue DBE-1459

Update for 2019: I've just created a PostgreSQL connection and then filled the usual Redshift settings (don't forget port: 5439), no need to download Amazon's JDBC driver.
Only little issue is that the syntax check doesn't know Redshift specificities such as AS and some functions, but queries execute correctly.

Update for 2020: PyCharm (and possibly all other JetBrains IDEs) now supports connecting to Redshift through IAM AWS credentials without manual driver installation.
Here are the detailed setup instructions:
Grant a redshift:GetClusterCredentials permission to your AWS user. Either create and attach a new policy (docs) or use an existing one such as AmazonRedshiftFullAccess (not recommended: too permissive).
Create an AWS access key (access key id + secret access key pair) for your user (docs).
Create a text configuration file ~/.aws/credentials (no extension) with the following content (docs):
[default] # arbitrary profile name, will be used later
region = <your region>
aws_access_key_id = <your access key id> # created on the previous step
aws_secret_access_key = <your secret access key>
Create a new PyCharm database connection of type Amazon Redshift and set it up (docs):
Choose connection type = IAM cluster/region (right under the «General» tab of the connection settings window).
Authentication = AWS Profile
User = {your AWS login}
Profile = default or the one you have used in credentials file.
The credentials can possibly be provided through AccessKeyID/SecretAccessKey connection settings on the «Advanced» tab but it did not work for me (due to NullPointerException if Profile field is empty).
Database = {your database}, choose an existing one to not face non descriptive errors from the driver.
Region = {your region}
Cluster = {cluster name}, get it from Redshift AWS console.
Setup the connection:
Check necessary databases in the «Schemas» tab.
«Advanced» tab: AutoCreate = true (literal lowercase true as the setting value). This will automatically create a new database user with your AWS login.
Test connection.

Related

Sequelize.js AWS RDS unknown database despite passing valid database identifier to configuration

I'm trying to connect to AWS RDS database from my code:
const {
DATABASE_HOSTNAME,
DATABASE_BASENAME, // ches
DATABASE_USERNAME,
DATABASE_PASSWORD,
} = process.env;
this.sequelize = new Sequelize(DATABASE_BASENAME, DATABASE_USERNAME, DATABASE_PASSWORD, {
dialect: 'mysql',
host: DATABASE_HOSTNAME,
});
This is my AWS RDS database. However when I run my code I'm getting error:
code: 'ER_BAD_DB_ERROR',
errno: 1049,
sqlState: '42000',
sqlMessage: "Unknown database 'ches'"
What I'm doing wrong? Is empty "DB name" field in instance configuration source of problems?
From your screenshot it can be seen that there is no default DB created.
DB Name:
-
By default, when you setup your RDS, it does not create a database for you. Instead you have a DB instance, which is listed under RDS->Databases.
When you setup your RDS you can use Additional configuration section to provide a name of database to create, e.g. ches. Without this you have a DB instance without an actual database.
For existing rds instances, you can use mysql client to connect to the instance and create a database using standard CREATE DATABASE ches; as you would normally do in a mysql.

Unable to connect to AWS Athena Workgroup using JDBC connection?

I am using JDBC to connect to Athena for a specific Workgroup. But it is by default redirecting to the primary workgroup
Below is the code snippet
Properties info = new Properties();
info.put("user", "access-key");
info.put("password", "secrect-access-key");
info.put("WorkGroup","test");
info.put("schema", "testschema");
info.put("s3_staging_dir", "s3://bucket/athena/temp");
info.put("aws_credentials_provider_class","com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
Class.forName("com.simba.athena.jdbc.Driver");
Connection connection = DriverManager.getConnection("jdbc:awsathena://athena.<region>.amazonaws.com:443/", info);
As you can see I am using "Workgroup" as the key for the properties. I also tried "workgroup", "work-group", "WorkGroup". It is not able to redirect to the specified Workgroup. Always going to the default one i.e primary workgroup.
Kindly help. Thanks
If you look at the release notes of Athena JDBC, the workgroup support is from v2.0.7.
If you jar is below this version, it will not work. Try to upgrade the library to 2.0.7 or above
You need to Override Client-Side Settings in workgroup.Enable below setting and rerun the query via JDBC.
Check this doc for more information.

Reading And Writing to MYSQL in AWS Glue

enter image description hereI'm able to connect to MYSQL while running my Pyspark Code Locally in juypter notebook, but the same code I am getting Communication error in AWS Glue while running the code. I have added MySQL jar in jar files required while creating the job in AWS Glue.
Reading from MYSQL
dataframe_mysql = sqlContext.read.format("jdbc").option("url", "jdbc:mysql://localhost/read").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "student").option("user", "root").option("password", "root").load()
Writing to MYSQL
df = sc.parallelize([[25, 'Prem'],
[20, 'Kate'],
[20, 'Kate'],
[40, 'Cheng']]).toDF(["Depy_id","Dept_name"])
df.write.format('jdbc').options(
url='jdbc:mysql://localhost/test',
driver='com.mysql.jdbc.Driver',
dbtable='dept',
user='root',
password='root').mode('overwrite').save()
Please note that you have to provide a valid database URL, not a localhost. I believe your jupyter notebook was run locally on a laptop, in the same local environment where your mysql is running as well.
AWS Glue runs within an AWS environment, and behind the scene it would launch number of EC2 instances depending upon the DPU configuration. If your URL is configured as LOCALHOST, then the EC2 instance where the pyspark code is running, would look for a mysql database on the same node.
Please make sure you have a valid public IP for the mysql database, and try to setup a connection in the AWS Glue as suggested by bdcloud, and retry again. If you dont want to create a connection, you can hard code the connection parameters in the code and try again. If you cannot get a public IP for the installed mysql database, maybe you can try setup an RDS Mysql on AWS, and use it for testing.
Sample code snippet:
conn = mysql.connector.connect(host=url, user=uname, password=pwd, database=dbase)
cur = conn.cursor()
insertQry = "INSERT INTO emp (id, emp_name, dept, designation, address1, city, state, active_start_date, is_active) SELECT (SELECT coalesce(MAX(ID),0) + 1 FROM atlas.emp) id, tmp.emp_name, tmp.dept, tmp.designation, tmp.address1, tmp.city, tmp.state, tmp.active_start_date, tmp.is_active from EMP_STG tmp ON DUPLICATE KEY UPDATE dept=tmp.dept, designation=tmp.designation, address1=tmp.address1, city=tmp.city, state=tmp.state, active_start_date=tmp.active_start_date, is_active =tmp.is_active ;"
n = cur.execute(insertQry)
print (" CURSOR status :", n)
Please refer to the AWS Glue connections section:
yes that's true I am able to connect it as above by just adding the connection to the job as well as changing the local host to the respective

Unable to do a Restore from S3 with RDS SQL Server

I am attempting to do a restore from S3 in AWS RDS (SQL Server). On the page when I can select the engine, I select SQL Server. But the options to select the Edition are all grayed out and I cannot select one and move on. You can see this from the screen shot below. Note, this does not happen if I simply attempt to create an instance of SQL Server in RDS. I can then select an Edition.
It looks like you can't do it straight from the console and the Database has to exist already.
Restoring a Database
To restore your database, you call the rds_restore_database stored
procedure.
The following parameters are required:
#restore_db_name – The name of the database to restore.
#s3_arn_to_restore_from – The Amazon S3 bucket that contains the backup file, and the name of the file.
The following parameters are optional:
#kms_master_key_arn – If you encrypted the backup file, the key to use to decrypt the file.
Example Without Encryption
exec msdb.dbo.rds_restore_database
#restore_db_name='database_name',
#s3_arn_to_restore_from='arn:aws:s3:::bucket_name/file_name_and_extension';
Example With Encryption
exec msdb.dbo.rds_restore_database
#restore_db_name='database_name',
#s3_arn_to_restore_from='arn:aws:s3:::bucket_name/file_name_and_extension',
#kms_master_key_arn='arn:aws:kms:region:account-id:key/key-id';
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/SQLServer.Procedural.Importing.html

How do I import a local MySQL db to RDS db instance?

I've created a RDS instance called realcardiodb (the engine is mysql)
and I've exported my database from my localhost. File is saved locally called localhostrealcardio.sql
Most research says to use mysqldump to import data from a local system to a web server, but my system doesn't even recognize mysqldump.
C:\xampp\mysql>mysqldump
'mysqldump' is not recognized as an internal or external command, operable program or batch file.
How do I resolve this error should I use mysqldump? (I definitely have mysql install on my system)
Is there a better utility I should use?
Any help is appreciated, especially if you have experience importing mysql to aws rds.
Thanks!
DK
Update 7/31/2012
So I got the error resolved. mysqldump is in the bin directory C:\xampp\mysql\bin>mysqldump
AWS provides the folloinwg instructions for uploading a local database to RDS:
mysqldump acme | mysql --host=hostname --user=username --password acme
Can someone break this down for me?
1) Is the first 'acme' (after mysqldump command) the name of my local database or the exported sql file I saved locally?
2)Is the hostname the IP address, Public DNS, RDS Endpoint or neither?
3)The username and password I assume is the RDS credentials and the second acme is the name of the database I created in RDS.
Thanks!
This is how I did it for a couple instances that had data in the MySQl tables.
The steps to creating an RDS database instance:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_GettingStarted.CreatingConnecting.MySQL.html
Note: Make sure the RDS instance has a security group configured that relates to the EC2 security group.
http://docs.amazonwebservices.com/AmazonRDS/latest/UserGuide/USER_Workin...
Before we go forward, let me provide a list of what some of the following placeholders are:
host.address.for.rds.server = this will be what is referred to as the "end point" in your RDS description/settings page.
rdsusername = the master user account which you created during RDS setup.
rdsdatabase = a blank database which you created inside the server on your RDS instance.
backupfile.sql = the sql dump file your made of your pre-existing installation's database.
Once you've created a fresh RDS database instance, and have configured its security settings, log into this server (from within an ssh session to your EC2 server) and then create an empty database inside the instance using basic SQL commands.
mysql -h host.address.for.rds.server -P 3306 -u rdsusername -p
(enter your password)
create database rdsdatabase;
Then quit out of the MySQL environment inside your RDS server.
\q
This tutorial assumes you already have a backup from your old database. If you don't, go create one now. After that, you’re ready to import that sql dump file into the empty database waiting on your RDS server.
mysql -h host.address.for.rds.server -u rdsusername -p rdsdatabase < backupfile.sql
It might take a few seconds to complete, depending on the size of the sql dump file. Your indication that it is finished is that the bash command prompt reappears.
Note: the command “mysqlimport” is used when imported data directly into an existing table inside a database. It might seem like we’re “importing” data, but this is not what we’re actually doing in this situation. The database we are migrating to has no tables yet, and the sql dump file we’re using contains the sql commands to generate the tables it needs.
Confirm the Transfer
Now, if you didn't get any error messages, then your sql transfer probably worked. If you want, you can double check to see if it did by connecting to your RDS database server, looking up the database you created, and check to see if the tables are now present.
mysql -h host.address.for.rds.server -P 3306 -u rdsusername -p
(enter your password)
use rdsdatabase;
show tables;
I prefer using MySQL workbench. It's much more easier & user friendly than the command line way.
It provides a simple GUI.
MySQL workbench or SQL Yog.
These are the steps that I did.
1) Install MySQL Workbench.
2) In AWS console, there must be a security group for your RDS instance.
Add an inbound rule to that group for allowing connections from your machine.
It's simple. Add your IP-address.
3) Open MySQL workbench, Add a new connection.
4) Give the connection a name you prefer.
5) Choose connection method- Standard TCP/IP
6) Enter your RDS endpoint in the field of Hostname.
7) Port:3306
8) Username: master username (the one which which you created during the creation of your RDS instance)
9)Password: master password
10) Click Test Connection to check your connection.
11) If connection is successful, click OK.
12) Open the connection.
13) you will see your database 'realcardiodb' there.
14) Now you can export your mysqldump file to this database. Go to-> Server. Click Data Import.
15) You can check whether the data has been migrated by simply opening a blank SQL file & typing in basic SQL commands like use database, select * from table;
That's it. Viola.
If you have a backup.sql in your PC, No need to transfer to EC2. Just give below line on your terminal in your PC.
$ mysql -h rdsinstance-hostaddress-ending.rds.amazonaws.com -u rds_username -p rds_database < /path/to/your/backup.sql
Enter password: paswd_mysql_user
That's all.
Import backup directly from existing remote server
SSH connect to your remote server
Get the remote server mysql backup (backup/path/backupfile.sql)
Import backup file to RDS mysql while you in remote server shell
mysql -h your-mysql-instance.region.rds.amazonaws.com -u db_username -p db_name < backup/path/backupfile.sql
Note:
I have tried all the above criteria to import my existing backup to new RDS database, including through EC2 as in AWS documentation. It was a 10GB backup. So I have tried tables by tables as well. It shows process completed but some data were missing for large tables. So I had to write a DB to DB data migration script.
Using work bench :
setup connection
go to management tab and click on data import/restore
click on import from self contained file .
choose your mysqlbackup.sql file.
select default database.
click on start import button.
Using command line (On Windows ) :
mysqldump -u <localuser>
--databases world
--single-transaction
--compress
--order-by-primary
-p<localpassword> | mysql -u <rds-user-name>
--port=3306
--host=ednpoint
-p<rds-password>
For more detail please refer :
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Procedural.Importing.SmallExisting.html
or
https://docs.bitnami.com/aws/how-to/migrate-database-rds/#using-phpmyadmin-110
Hope it helps.
The step by step instruction on how to migrate already existing db on mysql/mariadb to already running RDS instance.
Here is the AWS RDS Mysql document to import customer data into RDS
http://aws.amazon.com/articles/2933
Create flat files containing the data to be loaded
Stop any applications accessing the target DB Instance
Create a DB Snapshot
Disable Amazon RDS automated backups
Load the data using mysqlimport
Enable automated backups again