Are there any existing packages in R or elsewhere that can connect AWS Redshift clusters to R Shiny apps? I'm trying to build an interactive dashboard using Shiny, and the data source is primarily Amazon Redshift or S3. Any workable alternatives or suggestions are welcome too.
I am using R Shiny with Redshift and getting very good results.
First you have to install and load the packages:
library(RPostgreSQL)
library(shinydashboard) # only needed if you want the nice dashboard layout
# Redshift speaks the PostgreSQL wire protocol, so the PostgreSQL driver works
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, host="blabla.eu-west-1.redshift.amazonaws.com",
                  port="5439", dbname="xxx", user="aaaaa", password="xxxxx")
conn # print the connection object to check that it is valid
test <- data.frame(dbGetQuery(conn, "select * from yourtablename"))
That is working for me.
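To actually use that inside a Shiny app, the query result can feed an ordinary output. Below is a minimal sketch along those lines; the host, credentials, and table name are placeholders carried over from the snippet above, so substitute your own.
library(shiny)
library(RPostgreSQL)

drv <- dbDriver("PostgreSQL")

ui <- fluidPage(
  titlePanel("Redshift demo"),
  tableOutput("preview")
)

server <- function(input, output, session) {
  # One connection per user session
  conn <- dbConnect(drv, host="blabla.eu-west-1.redshift.amazonaws.com",
                    port="5439", dbname="xxx", user="aaaaa", password="xxxxx")
  session$onSessionEnded(function() dbDisconnect(conn))

  output$preview <- renderTable({
    # Keep result sets small for an interactive dashboard
    dbGetQuery(conn, "select * from yourtablename limit 100")
  })
}

shinyApp(ui, server)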
I know this is an old post but wanted to mention RPostgres.
Unlike RPostgreSQL, RPostgres supports SSL and parameterized queries. Plus, you don't have to download an additional driver, as you do with RJDBC.
More on this here:
https://auth0.com/blog/a-comprehensive-guide-for-connecting-with-r-to-redshift/
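For reference, a connection with RPostgres looks roughly like this. It is only a sketch: the host, database, credentials, and table are placeholders, and the sslmode setting and parameter binding are the features mentioned above.
library(DBI)
library(RPostgres)

con <- dbConnect(
  RPostgres::Postgres(),
  host     = "blabla.eu-west-1.redshift.amazonaws.com",
  port     = 5439,
  dbname   = "xxx",
  user     = "aaaaa",
  password = "xxxxx",
  sslmode  = "require"   # SSL support, one of the advantages mentioned above
)

# Parameterized query: the value is bound separately instead of pasted into the SQL string
res <- dbGetQuery(con,
                  "select * from yourtablename where some_column = $1",
                  params = list(42))

dbDisconnect(con)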
I've connected in the past using both RJDBC and RPostgreSQL - both work pretty well.
Bear in mind that neither the ODBC nor the JDBC Redshift driver is supported on Shinyapps.io (because Shinyapps.io is built on Ubuntu), so RPostgreSQL may be your best bet there.
It's very easy to get a working connection in either RJDBC or RPostgreSQL.
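If you do go the RJDBC route (for example, when not deploying to Shinyapps.io), the setup is roughly as follows. The driver class name and jar path are assumptions that depend on which version of the Amazon Redshift JDBC driver you download, so check them against the driver documentation.
library(RJDBC)

# Point RJDBC at the Redshift JDBC driver jar downloaded from AWS
drv <- JDBC(driverClass = "com.amazon.redshift.jdbc42.Driver",  # class name varies with driver version
            classPath   = "/path/to/RedshiftJDBC42.jar")

conn <- dbConnect(drv,
                  "jdbc:redshift://blabla.eu-west-1.redshift.amazonaws.com:5439/xxx",
                  "aaaaa",   # user
                  "xxxxx")   # password

head(dbGetQuery(conn, "select * from yourtablename"))
dbDisconnect(conn)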
I'm not experienced in Java or the Hadoop ecosystem. I configured my Spark cluster to connect to Amazon Keyspaces using the spark-cassandra-connector from DataStax, and I'm using PySpark to fetch data from Cassandra. I can successfully connect to the Keyspaces/Cassandra cluster. But when I try to fetch data from it with:
df = spark.sql("SELECT * FROM cass.tutorialkeyspace.tutorialtable")
print ("Table Row Count: ")
print (df.count())
I get this error:
Unsupported partitioner: com.amazonaws.cassandra.DefaultPartitioner
Yes, keyspace & table exists and has data. How can I fix/workaround this? Thanks!
As an FYI, Keyspaces now supports using the RandomPartitioner, which enables reading and writing data in Apache Spark by using the open-source Spark Cassandra Connector.
Docs: https://docs.aws.amazon.com/keyspaces/latest/devguide/spark-integrating.html
Launch announcement: https://aws.amazon.com/about-aws/whats-new/2022/04/amazon-keyspaces-read-write-data-apache-spark/
The Spark Cassandra Connector relies on a specific partitioner implementation to define data splits, etc. There is no workaround for this problem right now; it will stay that way until somebody adds an implementation of the corresponding TokenFactory to the connector's code. It shouldn't be very complex, but it has to be done by someone who is interested in it.
Thank you for the feedback. At this time, you can write to Keyspaces using the Cassandra Spark Connector. Reading requires support for token ranges. Please see the following doc page for the list of supported APIs: https://docs.aws.amazon.com/keyspaces/latest/devguide/cassandra-apis.html
Although we don't have timelines to share at the moment, we prioritize our roadmap based on customer feedback. We are releasing new features all the time. To learn more about our roadmap and upcoming features please contact your AWS Account manager.
I'm trying to connect to Athena using the JDBC drivers provided by Amazon, with SQL Developer as the client. So far, I haven't had any luck with Java 1.8.181 and AthenaJDBC42-2.0.7.jar. Has anyone had any luck on this front? Before I start mixing up versions of Java, the JDBC driver, and/or SQL Developer, I thought I'd at least ask whether anyone has been successful using SQL Developer with the Athena JDBC drivers.
No.
SQL Developer doesn't allow for just any JDBC driver to be added...we restrict connectivity to the platforms we officially support for database migrations to the Oracle Database platform.
Athena doesn't have migration support, hence the lack of connectivity. If you need assistance with a migration, please send me a note.
I have Teradata files on Server A and I need to copy them to HDFS on Server B. What options do I have?
distcp is ruled out because the Teradata data is not on HDFS.
scp is not feasible for huge files.
Flume and Kafka are meant for streaming and not for file movement. Even if I used Flume with a spooling directory source (Spool_dir), it would be overkill.
The only option I can think of is NiFi. Does anyone have suggestions on how I can utilize NiFi?
Or, if someone has already gone through this kind of scenario, what approach did you follow?
I haven't specifically worked with a Teradata dataflow in NiFi, but having worked with other SQL sources in NiFi, I believe it is possible and fairly straightforward to develop a dataflow that ingests data from Teradata into HDFS.
For starters, you can do a quick check with the ExecuteSQL processor available in NiFi. The SQL-related processors take a DBCPConnectionPool property, which is a NiFi controller service that should be configured with the JDBC URL of your Teradata server, the driver path, and the driver class name. Once you validate that the connection is fine, you can take a look at GenerateTableFetch/QueryDatabaseTable.
Hortonworks has an article about configuring DBCPConnectionPool with a Teradata server: https://community.hortonworks.com/articles/45427/using-teradata-jdbc-connector-in-nifi.html
This is quite a general one: I'm trying to move a database from a local PostgreSQL server on Windows to an AWS RDS instance I've set up. The database is small; I'm really just doing it for the sake of learning how.
The problems I'm having are as follows:
- It seems like such a simple thing, yet there are no simple solutions? (Although I know/think that if I had a Linux system I could use pg_dump and cat the dump.)
- I've looked at the AWS documentation on migrating a database using their Database Migration Service. However, I've run into problems so early on that I'm close to believing it won't be worth the effort at my level of experience.
It seems like there must be a simple way, and I'm hoping the S/O community can help!
Use a GUI admin tool (phpMyAdmin for MySQL, pgAdmin for PostgreSQL) and use its import/restore feature to load your database dump into AWS RDS.
If you wish to use the CLI, then use an FTP/SCP tool to upload the dump to a machine that can reach the database, connect to it via SSH, and import the dump from the command line.
I don't know whether it was a mistake to use the Firebird database. It has lots of good features, but I can't figure out why my query (stored procedure) didn't work.
Is there any profiler/monitoring tool for Firebird?
The Firebird database is running standalone, so it is an embedded DB, and it doesn't allow two users to connect at the same time. If there is a profiler, I wonder how it will connect while I'm executing my queries.
IBExpert and Database Workbench have a stored procedure debugger.
There are also many monitoring tools: http://www.firebirdfaq.org/faq95/
I advise you to install the server version if you want to have more than 2 users.