Failed to detect version from SPARK_HOME or SPARK_HOME_VERSION - data-science-experience

I'm trying to follow a tutorial for using spark from RStudio on DSX, but I'm running into the following error:
> library(sparklyr)
> sc <- spark_connect(master = "CS-DSX")
Error in spark_version_from_home(spark_home, default = spark_version) :
Failed to detect version from SPARK_HOME or SPARK_HOME_VERSION. Try passing the spark version explicitly.
I took the above code snippet from the connect to spark dialog in RStudio:
So I took a look at SPARK_HOME:
> Sys.getenv("SPARK_HOME")
[1] "/opt/spark"
Ok, Lets check that dir exists:
> dir("/opt")
[1] "ibm"
I'm guessing this is the cause of the problem?
NOTE: there are a few similar questions on stackoverflow, but none of them are about IBM's Data Science Experience (DSX).
Update 1:
I tried the following:
> sc <- spark_connect(config = "CS-DSX")
Error in config$spark.master : $ operator is invalid for atomic vectors
Update 2:
An extract from my config.yml. Note that I have many more spark services in my, I've just pasted the first one:
default:
method: "shell"
CS-DSX:
method: "bluemix"
spark.master: "spark.bluemix.net"
spark.instance.id: "7a4089bf-3594-4fdf-8dd1-7e9fd7607be5"
tenant.id: "sdd1-7e9fd7607be53e-39ca506ba762"
tenant.secret: "xxxxxx"
hsui.url: "https://cdsx.ng.bluemix.net"
Note that my config.yml was generated for me.
Update 3:
My .Rprofile looks like this:
# load sparklyr library
library(sparklyr)
# setup SPARK_HOME
if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
Sys.setenv(SPARK_HOME = "/opt/spark")
}
# setup SparkaaS instances
options(rstudio.spark.connections = c("CS-DSX","newspark","cleantest","4jan2017","Apache Spark-4l","Apache Spark-3a","ML SPAAS","Apache Spark-y9","Apache Spark-a8"))
Note that my .Rprofile was generated for me.
Update 4:
I uninstalled sparklyr and restarted the session twice. Next I tried to run:
library(sparklyr)
library(dplyr)
sc <- spark_connect(config = "CS-DSX")
However, the above command hung. I stopped the command and checked the version of sparklyr which seems to be ok:
> ip <- installed.packages()
> ip[ rownames(ip) == "sparklyr", c(0,1,3) ]
Package Version
"sparklyr" "0.4.36"

You cannot use master parameter to connect to bluemix spark service if that is the intent since your kernels are defined in config.yml file, you should be using config parameter instead to connect.
config.yml is loaded up with your available kernel information(spark instances).
Apache Spark-ic:
method: "bluemix"
spark.master: "spark.bluemix.net"
spark.instance.id: "41a2e5e9xxxxxx47ef-97b4-b98406426c07"
tenant.id: "s7b4-b9xxxxxxxx7e8-2c631c8ff999"
tenant.secret: "XXXXXXXXXX"
hsui.url: "https://cdsx.ng.bluemix.net"
Please use config
sc <- spark_connect(config = "Apache Spark-ic")
as suggested in tutorial:-
http://datascience.ibm.com/blog/access-ibm-analytics-for-apache-spark-from-rstudio/
FYI,
By Default, you are connected to , i am working on finding how to change version with config parameter.
> version <- invoke(spark_context(sc), "version")
print(version)
[1] "2.0.2"
Thanks,
Charles.

I had the same issue and fix it as follows:
go to C:\Users\USER_NAME\AppData\Local/spark/ and delete everything you'll find in the directory
Then, in the R console run:
if (!require(shiny)) install.packages("shiny");
library(shiny)
if (!require(sparklyr)) install.packages("sparklyr");
library(sparklyr)
spark_install()

Related

AWS_ACCESS_KEY_ID: Missing access token for source AWS

I use sits in R on Windows 10 in order to receive satellite images from Amazon Web Services. It worked out already, but now I get this error message:
bfmn <- sits_cube(
source="AWS",
collection="SENTINEL-S2-L2A",
dir="C:/temp/final2",
bands = c("B02","B03","B04","B05","B08","B11","B12","SCL"),
start_date = "2019-03-01",
end_date = "2019-06-04",
roi=AOI,
delim = "_",
multicores = 2,
progress = TRUE)
Error: sits_cube: AWS_ACCESS_KEY_ID: Missing access token for source AWS (nzchar(Sys.getenv(x)) is not TRUE)
What do I set as AWS_ACCESS_KEY_ID? Does this I mean I have to log in in order to able to use the service?
Thanks
> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8 LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C LC_TIME=German_Germany.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RPostgres_1.4.4
loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 lattice_0.20-45 here_1.0.1 png_0.1-8 withr_2.5.0 rprojroot_2.0.3 rappdirs_0.3.3 grid_4.2.2 lifecycle_1.0.3
[10] jsonlite_1.8.4 DBI_1.1.3 rlang_1.0.6 cli_3.4.1 rstudioapi_0.14 blob_1.2.3 Matrix_1.5-1 ellipsis_0.3.2 vctrs_0.5.1
[19] reticulate_1.27 tools_4.2.2 bit64_4.0.5 bit_4.0.5 hms_1.1.2 compiler_4.2.2 pkgconfig_2.0.3
it seems to me that your function is trying to access the environment variable AWS_ACCESS_KEY_ID.
double check that you have that value set using echo $AWS_ACCESS_KEY_ID for Linux/mac

AWS SSM RunCommand - Issue with RunRemoteScript Document to run PowerShell script with parameters

In AWS SSM, I use RunRemoteScript document to run a PowerShell script to install some software on SSM managed instances. The script is hosted in a public accessible S3 bucket.
The RunCommand works fine with the script not taking any parameters. Software was successfully deployed to managed instances. But my script has a unique CID embedded in the code. For security reasons, I need to take it out and set it as a parameter for the PS script. Ever since then, the RunCommand just keeps failing.
My script looks like below (with parameter CID):
param (
[Parameter(Position = 0, Mandatory = 1)]
[string]$CID
)
Start-Transcript -Path "$([System.Environment]::GetEnvironmentVariable('TEMP','Machine'))\app_install.log" -Append
function Install-App {
<#
Installs App
#>
[CmdletBinding()]
[OutputType([PSCustomObject])]
param (
[Parameter(Position = 0, Mandatory = 1)]
[string]$msiURL,
[Parameter(Position = 2, Mandatory = 1)]
[string]$InstallCheck,
[Parameter(Position = 3, Mandatory = 1)]
[string]$CustomerID
)
if ( -not(Test-Path $installCheck)) {
# Do stuff
...
}
else {
Write-Host ("$installCheck - Already Installed")
Return "Already Installed, Skipped $(($msiURL -split '([^\\/]+$)')[1])"
}
}
Install-App -msiURL "https://s3.amazonaws.com/app.foo.com/Windows/app.exe" -InstallCheck "C:\Program Files\App\app.exe" -CustomerID $CID
Stop-Transcript
By following AWS SSM documentation below, I run the command below to kick off the RunCommand.
https://docs.aws.amazon.com/systems-manager/latest/userguide/integration-remote-scripts.html
aws ssm send-command --document-name "AWS-RunRemoteScript" --targets "Key=instanceids,Values=mi-abc12345"
--parameters '{"sourceType":["S3"],"sourceInfo":["{\"path\": "https://s3.amazonaws.com/app.foo.com/Windows/app_install.ps1\"}"],"commandLine":["app_install.ps1 abcd123456"]}'
The RunCommand keeps failing with error below:
----------ERROR-------
app_install.ps1 : The term 'app_install.ps1' is not recognized
as the name of a cmdlet, function, script file, or operable program. Check the
spelling of the name, or if a path was included, verify that the path is
correct and try again.
At C:\ProgramData\Amazon\SSM\InstanceData\mi-abcd1234\document\orchest
ration\a6811111d-c411-411-a222-bad123456\runPowerShellScript\_script.ps1:4
char:2
+ app_install.ps1 abcd123456
+ ~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (app_install.ps1:String)
[], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
failed to run commands: exit status 255
I suspect this is to do with the way how RunCommand handles the argument for the PowerShell script. But I cannnot find any examples other than the official document, which I followed. Anyone can point out what the issue is here?
BTW, I already tried putting the ps1 after ".\" without luck.
I found out the cause of the issue. The IAM role attached to the instance did not have sufficient rights to access the S3 bucket holds the script. As a result SSM wasn't able to download the script to the instance, hence the error "...ps1 is not recognized".
So it's not related to the code actually.

IBM Data Science Experience (DSX): Using ibmdbR on RStudio

I’ve created a connection to Db2 Warehouse on Cloud: dashDB for Analytics-t1 / Database: BLUDB. I’ve given ‘dashdb connect’ as the connection name.
Then I've selected Tools / RStudio. In RStudio, I've run the following lines. The error message below.
Code snippet:
library(ibmdbR)
con <- idaConnect('BLUDB','','')
#Close the connection
idaClose(con)
Output:
con <- idaConnect('BLUDB','','')
Warning messages:
1: In RODBC::odbcDriverConnect("DSN=BLUDB", believeNRows = FALSE) : [RODBC] ERROR: state 08001, code -30082, message [unixODBC][IBM][CLI Driver] SQL30082N Security processing failed with reason "17" ("UNSUPPORTED FUNCTION"). SQLSTATE=08001
2: In RODBC::odbcDriverConnect("DSN=BLUDB", believeNRows = FALSE) : ODBC connection failed
Your code snippet will only work as is if you run it in the RStudio from the DB2 Warehouse console. If you launch RStudio within DSX you need to configure connectivity. The following worked for me:
install.packages("ibmdbR")
library(ibmdbR)
dsn_driver <- "BLUDB"
dsn_database <- "BLUDB"
dsn_hostname <- "..."
dsn_port <- "50000"
dsn_protocol <- "TCPIP"
dsn_uid <- "..."
dsn_pwd <- "..."
con_path <- paste(dsn_driver,";DATABASE=",dsn_database,";HOSTNAME=",dsn_hostname,";PORT=",dsn_port,";PROTOCOL=",dsn_protocol,";UID=",dsn_uid,";PWD=",dsn_pwd,sep="")
ch <-idaConnect(con_path)
idaInit(ch)
idaShowTables()
Replace "..." with your credentials and you should be good to go.
I followed the instructions from the video named "Connect to dashDB in RStudio" on this page: https://datascience.ibm.com/docs/content/analyze-data/rstudio-overview.html and found the following documentation: https://datascience.ibm.com/blog/dashdb-r-dsx/

Can't enable phar writing

I am actually using wamp 2.5 with PHP 5.5.12 and when I try to create a phar file it returns me the following message :
Uncaught exception 'UnexpectedValueException' with message 'creating archive "..." disabled by the php.ini setting phar.readonly'
even if I turn to off the phar.readonly option in php.ini.
So how can I enable the creation of phar files ?
I had this same problem and pieced together from info on this thread, here's what I did in over-simplified explanation:
in my PHP code that's generating this error, I added echo phpinfo(); (which displays a large table with all sort of PHP info) and in the first few rows verify the path of the php.ini file to make sure you're editing the correct php.ini.
locate on the phpinfo() table where it says phar.readonly and note that it is On.
open the php.ini file from step 1 and search for phar.readonly. Mine is on line 995 and reads ;phar.readonly = On
Change this line to phar.readonly = Off. Be sure that there is no semi-colon at the beginning of the line.
Restart your server
Confirm that you're phar project is now working as expected, and/or search on the phpinfo()table again to see that the phar.readonly setting has changed.
phar.readonly can only be disabled in php.ini due to security reasons.
If you want to check that it's is really not done using other method than php.ini then in terminal type this:-
$ php -r "ini_set('phar.readonly',0);print(ini_get('phar.readonly'));"
If it will give you 1 means phar.readonly is On.
More on phar.configuration
Need to disable in php.ini file
Type which php
Gives a different output depending on machine e.g.
/c/Apps/php/php-7.2.11/php
Then open the path given not the php file.
E.g. /c/Apps/php/php-7.2.11
Edit the php.ini file
could do
vi C:\Apps\php\php-7.2.11\php.ini
code C:\Apps\php\php-7.2.11\php.ini
[Phar]
; http://php.net/phar.readonly
phar.readonly = Off
; http://php.net/phar.require-hash
phar.require_hash = Off
Save
Using php-cli and a hashbang, we can set it on the fly without messing with the ini file.
testphar.php
#!/usr/bin/php -d phar.readonly=0
<?php
print(ini_get('phar.readonly')); // Must return 0
// make sure it doesn't exist
#unlink('brandnewphar.phar');
try {
$p = new Phar(dirname(__FILE__) . '/brandnewphar.phar', 0, 'brandnewphar.phar');
} catch (Exception $e) {
echo 'Could not create phar:', $e;
}
echo 'The new phar has ' . $p->count() . " entries\n";
$p->startBuffering();
$p['file.txt'] = 'hi';
$p['file2.txt'] = 'there';
$p['file2.txt']->compress(Phar::GZ);
$p['file3.txt'] = 'babyface';
$p['file3.txt']->setMetadata(42);
$p->setStub('<?php
function __autoload($class)
{
include "phar://myphar.phar/" . str_replace("_", "/", $class) . ".php";
}
Phar::mapPhar("myphar.phar");
include "phar://myphar.phar/startup.php";
__HALT_COMPILER();');
$p->stopBuffering();
// Test
$m = file_get_contents("phar://brandnewphar.phar/file2.txt");
$m = explode("\n",$m);
var_dump($m);
/* Output:
* there
**/
✓ Must be set executable:
chmod +x testphar.php
✓ Must be called like this:
./testphar.php
// OUTPUT there
⚠️ Must not be called like this:
php testphar.php
// Exception, phar is read only...
⚠️ Won't work called from a CGI web server
php -S localhost:8785 testphar.php
// Exception, phar is read only...
For anyone who has changed the php.ini file, but just doesn't see any changes. Try to use the CLI version of the file. For me, it was in /etc/php/7.4/cli/php.ini
Quick Solution!
Check:
cat /etc/php/7.4/apache2/php.ini | grep phar.readonly
Fix:
sed -i 's/;phar.readonly = On/;phar.readonly = Off/g' /etc/php/7.4/apache2/php.ini

Configuring Blackfire on a base virtual box using Chef

I'm trying to give Blackfire.io (by Sensiolabs) a try to profile an existing PHP application running on a Vagrant machine (with PHP 5.3) on Mac.
I'm using Chef to provision my machine with Blackfire, but when running "vagrant provision" I get the following error:
default: STDERR: The server ID parameter is not set. Please run
blackfire-agent -register to configure it.
..which I already did
This is my Vagrant file:
is_windows = (RbConfig::CONFIG['host_os'] =~ /mswin|mingw|cygwin/)
Vagrant.configure("2") do |config|
..
config.vm.box = "covex/ubuntu1204-x64"
config.omnibus.chef_version = :latest
config.vm.provision "chef_solo" do |chef|
chef.json = {
:blackfire => {
:'server-id' => "d4860b49-be67-404b-9fa1-b..",
:'server-token' => "c412751f30d6c724033d8408e.."
}
}
chef.add_recipe "blackfire"
end
end
I followed the installation steps on https://blackfire.io/getting-started, except for the Probe paragraph.
Is my Vagrant file wrongly configured, so it can't read the server ID and token? Is the "brew install blackfire-php53" needed for this, if so, is there a way to configure this through my Vagrant file?
Guessing you are using https://supermarket.chef.io/cookbooks/blackfire
You missed the agent node in the config tree
{
"blackfire" => {
"agent" => {
"server-id" => "your server-id",
"server-token" => "your server-token",
}
}
}