How to create a working JDBC connection in Google Cloud Composer? - google-cloud-platform

To get the JDBC Hook working, I first added the jaydebeapi package on the PyPI packages page in Composer.
However, that alone does not allow a JDBC connection to work:
1) How do I specify the .jar driver path for the JDBC driver I have?
I was thinking it would be something like "/home/airflow/gcs/drivers/xxx.jar" (assuming I've created a drivers folder in the gcs directory)... but I haven't been able to verify or find documentation on this.
2) How do I install or point to the Java JRE? On Ubuntu I run this command to install a JRE: sudo apt-get install default-jre libc6-i386. Is a JRE, or the ability to install a JRE, available in Cloud Composer? This is the current error message I get in the Ad Hoc Query window with the JDBC connection: [Errno 2] No such file or directory: '/usr/lib/jvm'
If either of the above options is not currently available, are there any workarounds to get a JDBC connection working with Composer?

There are known issues with JDBC in Airflow 1.9 (https://github.com/apache/incubator-airflow/pull/3257); hopefully, we should be able to backport these fixes into Composer by GA!
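In the meantime, one possible workaround is to skip the JdbcHook and call jaydebeapi directly from a PythonOperator task. The sketch below is only an illustration: the driver class, JDBC URL, and credentials are placeholders, the .jar path assumes a drivers folder created in the environment's GCS bucket (which Composer mounts at /home/airflow/gcs on the workers), and it still needs a JRE on the workers, so it only helps once the JVM issue is resolved.

import jaydebeapi

def query_over_jdbc(**kwargs):
    # Placeholder driver class, URL, and credentials; the .jar path assumes a
    # "drivers" folder created in the bucket mounted at /home/airflow/gcs.
    conn = jaydebeapi.connect(
        "com.example.jdbc.Driver",            # hypothetical driver class
        "jdbc:example://db-host:1234/mydb",   # hypothetical JDBC URL
        ["db_user", "db_password"],
        "/home/airflow/gcs/drivers/xxx.jar",
    )
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT 1")
        return cursor.fetchall()
    finally:
        conn.close()

The function could then be wired into a DAG with a PythonOperator whose python_callable is query_over_jdbc.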

Related

Coldfusion - JEE Installation - JVM Arguments which and how?

We are trying to change how we manage our QA environments.
Until now we have installed ColdFusion using the Adobe installer, which installs a modified Tomcat.
We want to manage the QA server using the CommandBox tool, which installs ColdFusion in JEE install mode from a war file, but the situation would be similar with any of the supported JEE application servers (EAP, WildFly, Tomcat).
We have little experience with ColdFusion in JEE install mode, so several questions arise regarding the JVM settings.
Below are the JVM settings we got after installing CF with the Adobe Standard installer, as shown in the ColdFusion (CF) Administrator (here for a fresh CF2018 installation):
--add-modules=java.xml.ws
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/sun.util.cldr=ALL-UNNAMED
--add-opens=java.base/sun.util.locale.provider=ALL-UNNAMED
--add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED
-Dcoldfusion.classPath={application.home}/lib/updates,{application.home}/lib,{application.home}/lib/axis2,{application.home}/gateway/lib/,{application.home}/wwwroot/WEB-INF/cfform/jars,{application.home}/wwwroot/WEB-INF/flex/jars,{application.home}/lib/oosdk/lib,{application.home}/lib/oosdk/classes
-Dcoldfusion.home={application.home}
-Dcoldfusion.jsafe.defaultalgo=FIPS186Random
-Dcoldfusion.libPath={application.home}/lib
-Dcoldfusion.rootDir={application.home}
-Dcom.sun.xml.bind.v2.bytecode.ClassTailor.noOptimize=true
-Djava.awt.headless=true
-Djava.locale.providers=COMPAT,SPI
-Djava.security.auth.policy={application.home}/lib/neo_jaas.policy
-Djava.security.policy={application.home}/lib/coldfusion.policy
-Djava.util.logging.config.file={application.home}/lib/logging.properties
-Djdk.attach.allowAttachSelf=true
-Dorg.apache.coyote.USE_CUSTOM_STATUS_MSG_IN_HEADER=true
-Dorg.eclipse.jetty.util.log.class=org.eclipse.jetty.util.log.JavaUtilLog
-Dsun.font.layoutengine=icu
-XX:+UseParallelGC
-XX:MaxMetaspaceSize=192m
-Xbatch
-Xdebug
-Xms256m
-Xmx1024m
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=6006
-server
Q1. Does the ColdFusion war contain the JVM settings we listed above, and does it apply them during deployment?
Q2. If the answer is no, which settings are applicable depending on the JEE application server used (i.e. some settings are fine for Tomcat but meaningless for WildFly)?
We were able to find specific documentation from Adobe for ColdFusion 2018 (https://helpx.adobe.com/it/coldfusion/installing/coldfusion-2018-install-jee-configuration.html).
The number of settings there is incredibly low compared with the Standard installation:
-Djava.locale.providers=COMPAT,SPI
-Dcoldfusion.disablejsafe=true
-Djdk.attach.allowAttachSelf=true
-Djdk.serialFilter= !org.mozilla.**;!com.sun.syndication.**
We were not able to find similar documentation for ColdFusion 2021 (the latest release).
So far, trying to get help from Adobe has not produced effective results.
Thanks in advance
Adobe isn't going to help you support an installation via CommandBox; you need to talk to Ortus Solutions about that. Not every JVM setting available in the CF Admin for Standard is displayed in the UI for CF Enterprise. Look in the CF install folder for the instance you have (the non-CommandBox version) and find a file called jvm.config.
https://www.cfguide.io/coldfusion-administrator/server-settings-java-jvm/
With CommandBox, you can set everything via the command line or use a server.json file to do that same configuration.
https://commandbox.ortusbooks.com/embedded-server/configuring-your-server/jvm-args
Here's an example from a CB install I have on a Mac where I'm running an older version of CF that requires an older JDK, but I've also set the heapSize.
{
  "jvm": {
    "javaHome": "/Library/Java/JavaVirtualMachines/jdk1.7.0_15.jdk/Contents/Home",
    "heapSize": 1024
  }
}
What I would suggest is:
1) export your existing CF Admin settings to a .car file
2) log in to your CommandBox server's CF Admin
3) import the .car file
4) adjust all the settings (DSN, etc.) for your QA server
5) go to the command line and install the CFConfig plugin
6) export all the CF Admin settings to a JSON file (instructions are in their docs)
7) save that to a private repo or your secrets / passwords vault
Join the BoxTeam Slack, where you can find the CommandBox community and ask more questions.

Connect to a mysql source in Cloud Dataflow without requirements

Is there a package that is currently installed in the Python SDK that would allow me to connect to a mysql source? If not, I'll need to add in a requirements.txt file, which I'm trying to eliminate, as it drastically increases the setup time for things.
Update: I suppose pandas can, though I believe it needs an additional 'binding' for each SQL source it connects to, if I'm not mistaken?
Since you are trying to connect to MySQL, you need a specific client that will establish a channel between you and the database. Therefore, you will have to use the requirements.txt file to install this library.
You can refer to this Stack Overflow link that has a similar question. The answer specifies that "You must install a MySQL driver before doing anything. Unlike PHP, Only the SQLite driver is installed by default with Python. ...".
So only the SQLite driver is installed with Python SDK, not the MySQL one.
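To make that concrete, here is a minimal sketch (not from the original answer) of what reading from MySQL inside a Dataflow pipeline might look like once a driver such as PyMySQL is listed in requirements.txt; the host, credentials, and query are placeholders.

import apache_beam as beam

class ReadFromMySQL(beam.DoFn):
    """Runs a query per input element; connection details are placeholders."""

    def __init__(self, host, user, password, database, query):
        self.host = host
        self.user = user
        self.password = password
        self.database = database
        self.query = query

    def start_bundle(self):
        # Imported here so the worker only resolves the driver after the
        # requirements.txt dependencies have been installed on it.
        import pymysql
        self.connection = pymysql.connect(
            host=self.host, user=self.user,
            password=self.password, database=self.database)

    def process(self, element):
        with self.connection.cursor() as cursor:
            cursor.execute(self.query)
            for row in cursor.fetchall():
                yield row

    def finish_bundle(self):
        self.connection.close()

It would be applied with something like beam.Create([None]) | beam.ParDo(ReadFromMySQL(...)), and the job still has to be launched with --requirements_file requirements.txt so the driver gets installed on the workers.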

Zeppelin: how to download or save the Zeppelin notebook?

I am using the Zeppelin sandbox with AWS EMR.
Is there a way to download or save the Zeppelin notebook so that it can be imported into another Zeppelin server?
As noted in the comments above, this feature is available starting in version 0.5.6. You can find more details in the release notes. Downloading and installing this version would solve that issue.
Given that you are using EMR, it looks like you will have to work with the version available. As Samuel mentioned above, you can back up the contents of the incubator-zeppelin/notebook folder and make the transfer.
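For the manual transfer, something along these lines could archive the notebook folder so it can be copied to the other server; the paths below are assumptions based on the default incubator-zeppelin layout.

import tarfile

# Paths are assumptions; adjust to where Zeppelin lives on each host.
NOTEBOOK_DIR = "/home/hadoop/incubator-zeppelin/notebook"
ARCHIVE = "/tmp/zeppelin-notebooks.tar.gz"

# Pack every notebook (each note is a folder containing a note.json).
with tarfile.open(ARCHIVE, "w:gz") as tar:
    tar.add(NOTEBOOK_DIR, arcname="notebook")

# Copy ARCHIVE to the target server (scp, S3, etc.), then on that server:
# with tarfile.open(ARCHIVE, "r:gz") as tar:
#     tar.extractall("/path/to/other/zeppelin")
# and restart Zeppelin so it picks up the imported notes.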

Running EMR with Cascading SDK failed

I was following this tutorial for installing Cascading to EMR:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/CreateCascading.html
But it failed because of the bootstrap action that installs the cascading-sdk. The corresponding log is here: http://pastebin.com/jybHssTQ. As seen from the logs, it failed because apt-get was not found. Seriously?
I also checked the SDK installation script and found an option to disable installing screen with --no-screen. It still failed, with a different error: http://pastebin.com/T6CvA2H1
And now it is because of permission denied. What?
It's the official guide, but I can't seem to run it. Any ideas?
Rather than changing the script first, try a different EMR AMI version.
AMI versions up until 2.4.8 use Debian OS, where apt-get will work, but this runs Hadoop 1.x. AMI versions 3.0.x run Hadoop 2.2 and use Amazon Linux, which uses Yum.
See Below:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/ami-versions-supported.html
Also, try to add the "--tmpdir" option to get around the "Permission Denied" error.

Unable to download distribution package for Django

I'm currently testing a Django web application using version 1.6 with Python 3.4.1 and need to install some packages on my machine. From what I've observed, we are currently connected to a proxy server, which is why I'm having issues downloading some of them. Below are the actions that I've taken so far.
1) I've updated my http_proxy connection to http://innoproxy:8083/proxy.pac, which is our current proxy connection.
2) Below is the error that occurs when I try to install the South package.
C:\Users\fx0.MANDAUE>pip install South
Downloading/unpacking South
Cannot fetch index base URL https://pypi.python.org/simple/
Could not find any downloads that satisfy the requirement South
Cleaning up...
No distributions at all found for South
Storing debug log for failure in C:\Users\fx0.MANDAUE\pip\pip.log
My question is: would it be possible for me to install that package without using the command prompt (manual download), or am I still missing some steps on my end for the download to work? I've already checked other possible solutions, but so far to no avail. Thanks!
I'm in a similar situation behind my corporate proxy. You may first want to check whether your proxy requires authentication, in which case setting your connection string to http://username:password@proxyserver:port/ may help. In my case, however, our authentication relies on Windows Active Directory, which I've yet to overcome on my Linux box.
If all else fails, as in my case, you can manually download the source tar.bz (or similar compressed archive) from PyPI and use pip install path/to/source. This will mean manually downloading all dependencies and installing them the same way. It can be a pain, but it works.
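If the proxy is the only blocker, a rough sketch of that manual download step is below. Note that http_proxy normally expects a plain host:port address rather than a .pac auto-config URL, so the proxy address here is an assumption, and the sdist URL is a placeholder to be copied from the package's PyPI page.

import urllib.request

# Proxy address is an assumption: the innoproxy host is used directly
# with a port, rather than the proxy.pac auto-config URL.
proxy = urllib.request.ProxyHandler({
    "http": "http://innoproxy:8083",
    "https": "http://innoproxy:8083",
})
opener = urllib.request.build_opener(proxy)

# Placeholder URL; copy the actual sdist link from the package's PyPI page.
sdist_url = "https://pypi.python.org/packages/source/S/South/South-1.0.tar.gz"
with opener.open(sdist_url) as response, open("South-1.0.tar.gz", "wb") as out:
    out.write(response.read())

# Then install from the local file (and repeat for any dependencies):
#   pip install South-1.0.tar.gz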