I'm trying to run some code locally to AWS, but I'm not sure where to store the jdbc drivers. The goal is to have my pyspark application, read a rds database to do an ELT process from a cluster.
I'm getting two sets of errors:
First: Cannot locate jar files
Two: Error: Missing Additional resource
Here is what my code looks like:
import pyspark
from pyspark.sql import SparkSession
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--driver-class-path /path/to/driver/jars --jars /file/path/to/jars'
post_df = spark.read\
.format("djbc") \
.option("url", "jdbc:postgres:url-to-rds-amazonaws.com") \
.option("dbtable", "mytable") \
.option("user", "myuser") \
.option("password", 'passowrd') \
.query("select * from my table").load()
post_df.createOrReplaceTempView("post_fin_v")
transformed_df = spark.sql('''
perform more aggregation here
''')
transformed_df.write.format("jdbc").mode("append").option("url", "jdbc:sql_server:url-to-rds-amazonaws.com") \
.option("dbtable", "mytable") \
.option("user", "myuser") \
.option("password", 'passowrd')
I've developed a shiny app and i'm trying to do a first lightweight deploy using shinyproxy.
All installation seems fine.
I've installed docker, java.
I thought that building a package that wraps the app and other function would be a good idea.
So I developed a package (CI) and CI::launch_application is basically a wrapper around RunApp function of shiny package. This is the code:
launch_application <- function(launch.browser = interactive(), ...) {
runApp(
appDir = system.file("app", package = "CI"),
launch.browser = launch.browser,
...
)
}
I succesfully built the docker image with this Dockerfile
FROM rocker/r-base:latest
## Install required dependencies
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
## for R package 'curl'
libcurl4-gnutls-dev \
apt-utils \
## for R package 'xml2'
libxml2-dev \
## for R package 'openssl'
libssl-dev \
zlib1g-dev \
default-jdk \
default-jre \
&& apt-get clean \
&& R CMD javareconf \
&& rm -rf /var/lib/apt/lists/
## Install major fixed R dependencies
# - they will always be needed and we want them in a dedicated layer,
# as opposed to getting them dynamically via `remotes::install_local()`
RUN install2.r --error \
shiny \
dplyr \
devtools \
rJava \
RJDBC
# copy the app to the image
RUN mkdir /root/CI
COPY . /root/CI
# Install CI
RUN install2.r --error remotes \
&& R -e "remotes::install_local('/root/CI')"
# Set host and port
RUN echo "options(shiny.port = 80, shiny.host = '0.0.0.0')" >> /usr/local/lib/R/Rprofile.site
EXPOSE 80
CMD ["R", "-e", "CI::launch_application()"]
This is my application.yml file
proxy:
title:
logo-url: http://www.openanalytics.eu/sites/www.openanalytics.eu/themes/oa/logo.png
landing-page: /
heartbeat-rate: 10000
heartbeat-timeout: 60000
port: 8080
admin-groups: scientists
users:
- name: jack
password: password
groups: scientists
- name: jeff
password: password
groups: mathematicians
authentication: simple
# Docker configuration
docker:
cert-path: /home/none
url: http://localhost:2375
port-range-start: 20000
specs:
- id: home
display-name: Customer Intelligence
description: Segment your customer
container-cmd: ["R", "-e", "CI::launch_application()"]
container-image: company/image
access-groups: scientist
logging:
file:
shinyproxy.log
When I launch java shinyproxy.jar and i visited the url with the port I exposed, I see a login mask.
I logged in with simple authentication ( login is successful from shinyproxy.log) but neither an app is showing nor a list of app.
When I launch the app locally everything is fine.
Thanks
There is a misprint in the allowed user group in application.yml (should be scientists over scientist):
access-groups: scientists
Dzimitry is right. It was a typo error: scientists over scientist.
I am looking for a batch loader for a glue job to load into RDS using a PySpark script witht he DataFormatWriter.
I have this working for RedShift as follows:
df.write \
.format("com.databricks.spark.redshift") \
.option("url", jdbcconf.get("url") + '/' + DATABASE + '?user=' + jdbcconf.get('user') + '&password=' + jdbcconf.get('password')) \
.option("dbtable", TABLE_NAME) \
.option("tempdir", args["TempDir"]) \
.option("forward_spark_s3_credentials", "true") \
.mode("overwrite") \
.save()
Where df is defined above to read in a file. What is the best approach I could take to do this in RDS instead of in REDSHIFT?
In RDS would you be only APPEND / OVERWRITE, in such case you can create an RDS JDBC connection, and use something like below:
postgres_url="jdbc:postgresql://localhost:portnum/sakila?user=<user>&password=<pwd>"
df.write.jdbc(postgres_url,table="actor1",mode="append") #for append
df.write.jdbc(postgres_url,table="actor1",mode="overwrite") #for overwrite
If it involves UPSERTS, then probably you can use a MYSQL library as an external python library, and perform INSERT INTO ..... ON DUPLICATE KEY.
Please refer this url: How to use JDBC source to write and read data in (Py)Spark?
regards
Yuva
I learned that this can be only done through JDBC. Eg.
df.write.format("jdbc") \
.option("url", jdbcconf.get("url") + '/' + REDSHIFT_DATABASE + '?user=' + jdbcconf.get('user') + '&password=' + jdbcconf.get('password')) \
.option("dbtable", REDSHIFT_TABLE_NAME) \
.option("tempdir", args["TempDir"]) \
.option("forward_spark_s3_credentials", "true") \
.mode("overwrite") \
.save()
I am trying to use OpenSSL but I am stuck on the step of compiling. The OpenSSL project has very unfriendly (bad) documentation.
Is there any actual help how to build the latest OpenSSL version on Windows with Visual Studio 2017?
I didn't find any helpful information on the official OpenSSL site. Yes, there are a lot of posts on the Internet about OpenSSL compilation, but all of them are obsolete.
I've not used VS2017 but previous versions. I imagine it is much the same. Note the instructions below are for OpenSSL 1.1.0 or above. They do not work for OpenSSL 1.0.2. In brief the steps are:
Install Perl (either ActiveState or Strawberry)
[EDIT, see my (kritzel_sw) comment below: I would strongly recommend to use Strawberry)]
Install NASM
Make sure both Perl and NASM are on your %PATH%
Fire up a Visual Studio Developer Command Prompt with administrative privileges (make sure you use the 32-bit one if you are building 32-bit OpenSSL, or the 64-bit one if you are building 64-bit OpenSSL)
From the root of the OpenSSL source directory enter perl Configure VC-WIN32, if you want 32-bit OpenSSL or perl Configure VC-WIN64A if you want 64-bit OpenSSL
Enter nmake
Enter nmake test
Enter nmake install
[EDIT, unless you change the target directory in the configuration, nmake install needs administrator privileges. So the VC command prompt must be started as administrator for this final step]
If anything goes wrong at any stage, check the INSTALL file and the NOTES.WIN file.
Modified version of The Quantum Physicist python script
It can compile OpenSSL 1.0.x or OpenSSL 1.1.x
It can compile with multiple version of Visual Studio 2017/2019 included.
1) Create the file: CompileOpenSSL.py
import os
import os.path
from subprocess import call
import shutil
import sys
import re
import argparse
# args
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--filename", help="First argument must be the tar.gz file of OpenSSL source", required=True)
parser.add_argument("-a", "--arch", help="Second argument must be x86 or amd64", required=True)
parser.add_argument("-v", "--vs_version", help="Visual Studio version (eg:90, 140, 150)", required=True)
parser.set_defaults(writeVersionInfos=False)
args = parser.parse_args()
compile_flags = "-no-asm"
#compile_flags = "-no-asm -no-shared"
openssl_32_flag = "VC-WIN32"
openssl_64_flag = "VC-WIN64A"
working_dir = os.getcwd()
dirname = args.filename.replace(".tar.gz","")
src_32_suffix = "_" + "vs" + args.vs_version + "_32"
src_64_suffix = "_" + "vs" + args.vs_version + "_64"
vs_tools_env_var = "VS" + args.vs_version + "COMNTOOLS"
if args.arch != "x86" and args.arch != "amd64":
print("Second argument must be x86 or amd64")
exit(1)
if not bool(re.match("(openssl-){1}(\d)+(.)(\d)+(.)(\d)+(\w)+(.tar.gz)",args.filename)):
print("The file given doesn't seem to be an openssl source file. It must be in the form: openssl-x.y.zw.tar.gz")
exit(1)
call("7z x -y " + args.filename) #extract the .gz file
dirname_src_32 = dirname + src_32_suffix
dirname_src_64 = dirname + src_64_suffix
dirname_bin_32 = dirname + src_32_suffix + "_build"
dirname_bin_64 = dirname + src_64_suffix + "_build"
openssl_tar_file = args.filename[0:-3]
if args.arch == "x86":
#delete previous directories
shutil.rmtree(os.getcwd()+'/'+dirname, ignore_errors=True)
shutil.rmtree(os.getcwd()+'/'+dirname_src_32, ignore_errors=True)
#extract tar file for 32
call("7z x -y " + openssl_tar_file)
os.rename(dirname, dirname_src_32)
#Compile 32
os.chdir(dirname_src_32)
print("perl Configure " + openssl_32_flag + " --prefix=" + os.path.join(working_dir,dirname_bin_32) + " " + compile_flags)
call("perl Configure " + openssl_32_flag + " --prefix=" + os.path.join(working_dir,dirname_bin_32) + " " + compile_flags,shell=True)
if( os.path.exists("ms/do_ms.bat") ):
call("ms\do_ms.bat",shell=True)
print(os.getcwd())
call("nmake -f ms/ntdll.mak",shell=True)
call("nmake -f ms/ntdll.mak install",shell=True)
else:
call("nmake",shell=True)
call("nmake test",shell=True)
call("nmake install",shell=True)
print("32-bit compilation complete.")
#Go back to base dir
os.chdir(working_dir)
################
if args.arch == "amd64":
#delete previous directories
shutil.rmtree(os.getcwd()+'/'+dirname, ignore_errors=True)
shutil.rmtree(os.getcwd()+'/'+dirname_src_64, ignore_errors=True)
#extract for 64
call("7z x -y " + openssl_tar_file)
os.rename(dirname, dirname_src_64)
#Compile 64
os.chdir(dirname_src_64)
call("perl Configure " + openssl_64_flag + " --prefix=" + os.path.join(working_dir,dirname_bin_64) + " " + compile_flags,shell=True)
if( os.path.exists("ms\do_ms.bat") ):
call("ms\do_win64a.bat",shell=True)
call("nmake -f ms/ntdll.mak",shell=True)
call("nmake -f ms/ntdll.mak install",shell=True)
else:
call("nmake",shell=True)
call("nmake test",shell=True)
call("nmake install",shell=True)
print("64-bit compilation complete.")
#Go back to base dir
os.chdir(working_dir)
################
os.remove(openssl_tar_file)
2) Create the file: CompileOpenSSL_vs.cmd
ECHO --------------------------------------
ECHO Require Python, 7Zip, PERL and NASM in PATH
ECHO --------------------------------------
Rem ------------------------------------------------------
Rem TO CONFIGURE -----------------------------------------
Rem ------------------------------------------------------
Rem SET YOUR LOCAL PATHS-----------------------------------------
SET PATH=C:\Program Files (x86)\7-Zip;C:\Perl64\bin;M:\Backup\Coders\_tools\7-Zip\;%PATH%
Rem SET YOUR OPENSSL ARCHIVE-----------------------------------------
REM SET FILENAME=openssl-1.0.2r.tar.gz
SET FILENAME=openssl-1.1.1b.tar.gz
Rem SET THE VERSION OF YOUR VISUAL STUDIO-----------------------------------------
SET VSVERSION=%1
Rem ------------------------------------------------------
Rem COMPILATION LAUNCH -----------------------------------
Rem ------------------------------------------------------
Rem UTILS PATH-----
SET VSCOMNTOOLSNAME=VS%VSVERSION%COMNTOOLS
Rem Pick the good path for Visual Studio-----------------------------------------
IF %VSVERSION% GEQ 150 (
Echo DO NOT FORGET TO ADD A SYSTEM VARIABLE %VSCOMNTOOLSNAME% - like: "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\"
SET VCVARPATH="%%%VSCOMNTOOLSNAME%%%..\..\VC\Auxiliary\Build\vcvarsall.bat"
) ELSE (
SET VCVARPATH="%%%VSCOMNTOOLSNAME%%%..\..\VC\vcvarsall.bat"
)
Rem Set env -----------------------------------------
#pushd "%~dp0"
call %VCVARPATH% %2
#popd
Rem ------------------------------------------------------
Rem TEST APP EXIST -----------------------------------
Rem ------------------------------------------------------
where /q 7z.exe
IF ERRORLEVEL 1 (
ECHO The application "7z.exe" is missing. Ensure to add/install it to the PATH in beginning of this script, check SET PATH
PAUSE
EXIT /B
)
where /q perl.exe
IF ERRORLEVEL 1 (
ECHO The application "perl.exe" is missing. Ensure to add/install it to the PATH in beginning of this script, check SET PATH
PAUSE
EXIT /B
)
where /q nmake.exe
IF ERRORLEVEL 1 (
ECHO The application "nmake.exe" is missing. Ensure to add/install it to the PATH in beginning of this script, check SET PATH
PAUSE
EXIT /B
)
where /q py.exe
IF ERRORLEVEL 1 (
ECHO The application "py.exe" [shortcut of python] is missing. Ensure to add/install it to the PATH in beginning of this script, check SET PATH
PAUSE
EXIT /B
)
Rem Launch compilation -----------------------------------------
py CompileOpenSSL.py -f %FILENAME% -a %2 -v %VSVERSION%
PAUSE
3) Launch compilation from command line (Outside Visual Studio)
eg:
CompileOpenSSL_vs.cmd 150 x86
CompileOpenSSL_vs.cmd 150 amd64
CompileOpenSSL_vs.cmd 90 x86
For OpenSSL 1.0.2, I wrote a Python script that does the building for me. I have this habit of making these scripts, as I don't like to reinvent the wheel everytime I need to build something.
The script is made for OpenSSL 1.0.2. Probably the changes are minimal for OpenSSL 1.1.0.
Here's the script:
import os
from subprocess import call
import sys
import re
vs_version = "140"
compile_flags = "-no-asm -no-shared"
openssl_32_flag = "VC-WIN32"
openssl_64_flag = "VC-WIN64A"
src_32_suffix = "_" + "vs" + vs_version + "_32"
src_64_suffix = "_" + "vs" + vs_version + "_64"
vs_tools_env_var = "VS" + vs_version + "COMNTOOLS"
if len(sys.argv) < 2:
print("First argument must be the tar.gz file of OpenSSL source")
exit(1)
if len(sys.argv) < 3:
print("Second argument must be 32 or 64")
exit(1)
filename = sys.argv[1]
dirname = filename.replace(".tar.gz","")
working_dir = os.getcwd()
arch = sys.argv[2]
if arch != "32" and arch != "64":
print("Second argument must be 32 or 64")
exit(1)
if not bool(re.match("(openssl-){1}(\d)+(.)(\d)+(.)(\d)+(\w)+(.tar.gz)",filename)):
print("The file given doesn't seem to be an openssl source file. It must be in the form: openssl-x.y.zw.tar.gz")
exit(1)
call("7z x " + filename) #extract the .gz file
dirname_src_32 = dirname + src_32_suffix
dirname_src_64 = dirname + src_64_suffix
dirname_bin_32 = dirname + src_32_suffix + "_build"
dirname_bin_64 = dirname + src_64_suffix + "_build"
openssl_tar_file = filename[0:-3]
if arch == "32":
#extract tar file for 32
call("7z x " + openssl_tar_file)
os.rename(dirname, dirname_src_32)
#Compile 32
os.chdir(dirname_src_32)
call("perl Configure " + openssl_32_flag + " --prefix=" + os.path.join(working_dir,dirname_bin_32) + " " + compile_flags,shell=True)
call(r"ms\do_ms.bat",shell=True)
call(r"nmake -f ms\nt.mak",shell=True)
call(r"nmake -f ms\nt.mak instalL",shell=True)
print("32-bit compilation complete.")
#Go back to base dir
os.chdir(working_dir)
################
if arch == "64":
#extract for 64
call("7z x " + openssl_tar_file)
os.rename(dirname, dirname_src_64)
#Compile 64
os.chdir(dirname_src_64)
call("perl Configure " + openssl_64_flag + " --prefix=" + os.path.join(working_dir,dirname_bin_64) + " " + compile_flags,shell=True)
call(r"ms\do_ms.bat",shell=True)
call(r"nmake -f ms\nt.mak",shell=True)
call(r"nmake -f ms\nt.mak instalL",shell=True)
print("64-bit compilation complete.")
#Go back to base dir
os.chdir(working_dir)
################
os.remove(openssl_tar_file)
Option 1: Save the script to CompileOpenSSL.py, and download the OpenSSL source file that is expected to have the name format openssl-1.X.Y.tar.gz. Now assuming that 7zip and perl are accessible from the global scope on your command prompt and you have the correct MSVC variables loaded (with e.g. vsvars32.bat, or starting the right terminal), run the following:
python CompileOpenSSL.py openssl-1.X.Y.tar.gz 32
If you're using MSVC 32-bit, or
python CompileOpenSSL.py openssl-1.X.Y.tar.gz 64
for MSVC 64-bit.
Option 2: Do what the script does manually. The script simply extracts the archive, configures the sources and runs do_ms.bat then nmake. Follow the source and it'll work.
Good luck!
go into ssl directory using visual studio cmd and add perl and nasm to system path then type:
perl Configure --openssldir=D:OpenSSLdirectory VC-WIN32
ms\do_ms.bat
nmake -f ms\ntdll.mak
nmake -f ms\ntdll.mak install
( enjoy. )
I want to generate a file list with gcc -M which delivers something like the following which works fine:
../active_var/timervar.h \
../active_var/universal_var.h \
../chrono_timer/chrono_timer.cpp \
/home/bla/foreign_components/gmock-1.7.0/fused-src/gmock/gmock.h \
/home/bla/foreign_components/gmock-1.7.0/fused-src/gmock-gtest-all.cc \
/home/bla/foreign_components/gmock-1.7.0/fused-src/gmock_main.cc \
/home/bla/foreign_components/gmock-1.7.0/fused-src/gtest/gtest.h \
../../mtp/index_tuple.h \
../observer/observer_with_stop_marker.h \
test_bugfixing.cpp \
test_counter.cpp \
I give this files to the
INPUT = <files as listed above >
This works as expected.
Now I simply want to ignore the files coming from /home/bla/foreign_components/
I tried:
EXCLUDE = */home/bla/foreign_components/*
or
EXCLUDE = /home/bla/foreign_components/
or
EXCLUDE = /home/bla/foreign_components/*
nothing works!
I tried the same with all above listed patterns with the EXCLUDE_PATTERNS
Also no effect.
Is this feature simply broken in doxygen or can files which are explicitly listed in files not be filtered out?
I am using doxygen version: 1.8.10
can files which are explicitly listed in files not be filtered out?
I think so.
You can generate your INPUT list by filtering gcc -M output:
gcc -M | grep -v "/home/bla/foreign_components"