Export PCA nugget output to HTML in SPSS Modeler 16 using Python - pca

I'm trying to export a PCA nugget to an HTML file using Python, but I get this error:
Script error (Cannot export '"Factor_Analysis":factor[model#id5YWTDKXKEW9]' with the format 'HTML')
I used the following code to get the HTML output. It throws the error above, yet the same code successfully exported a PMML file for a K-Means node (with the file format changed to XML and the nugget ID changed).
taskrunner.exportModelToFile(stream.findByID("id7VBSU9JC1BY").createModelOutput(True), "D:\\factor_out.html", modeler.api.FileFormat.HTML)
I'm using Modeler 16. Any help on how to achieve this will be greatly appreciated.
Thanks,
Ron

Related

Export data from GBQ into CSV with specific encoding

I'm using GBQ and I want to export the results of a query to a CSV file.
The data is larger than 20M lines, so I'm using this option:
My query results contain some French text, which is saved to the CSV file with the wrong encoding.
Is there a way to define the encoding at the saving step in GBQ?
Thank you
You can write a simple Python script (or use another language you are comfortable with) to run the query and save the result. That way you can use any encoding you want when writing the result to the CSV file.
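As a minimal sketch of the saving step only: the helper below writes rows to a CSV file with an explicit encoding. The function name `save_rows_as_csv` and the sample data are hypothetical, and the query itself is not shown; the rows are assumed to come from whatever client library you use to run the query.

```python
import csv


def save_rows_as_csv(header, rows, path, encoding="utf-8-sig"):
    """Write a header and an iterable of rows to a CSV file.

    "utf-8-sig" prepends a byte order mark, which helps some spreadsheet
    tools detect UTF-8; pass a different codec name if you need another
    encoding.
    """
    count = 0
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding=encoding) as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample rows standing in for query results.
    sample = [["Montréal", "très bien"], ["Orléans", "déjà vu"]]
    written = save_rows_as_csv(["ville", "note"], sample, "result.csv")
    print(written, "rows written")
```

Because the rows are consumed one at a time, the same helper works with a streamed result set and does not need to hold all 20M lines in memory.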

WSO2 SP fails to use PMML file to make predictions

Following the examples presented in the WSO2 SP 4.1.0 documentation, I am trying to run an example where I read data from a CSV file, predict a result based on the data, and then export the predicted result to a CSV file.
So far, reading from and writing to a CSV file works fine, but when I add the PMML prediction part, I can't run the file and get the error "ERROR {org.wso2.extension.siddhi.gpl.execution.pmml.util.PMMLUtil} - Failed to unmarshal the pmml definition: null".
The model is a random forest regressor with 15 trees and max_depth=15 trained with sklearn and was exported using the sklearn2pmml 0.35.1 Python library.
I already copied the "siddhi-gpl-execution-pmml-4.0.13.jar" file to "{wso2_4.0.0 install dir}/lib".
I am wondering if there is a version mismatch between the PMML definition exported with sklearn2pmml (the model follows the PMML 4.3 definition) and the PMML definitions accepted by WSO2 SP.
EDIT:
The error isn't showing right now, so I am attaching an image of the WSO2 SP Editor running in Firefox where you can see that the query has an error, but the error box is empty! (This only happens with this error.)
Link to a screenshot. Note that the message box from the error is empty!
EDIT2:
I already tried the .jar proposed in "No Extension Exists for pmml:predict WSO2 Stream Processor" (siddhi-gpl-execution-pmml-4.0.11.jar) and also siddhi-gpl-execution-pmml-4.0.13.jar. Both give the same error (without any explanation in the error message box).
WSO2 SP's PMML extension supports PMML 4.3 definitions inherently.
Can you please verify the "pmml_model_path" provided as the parameter for the extension?
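One way to check the path and the PMML version outside WSO2 SP is a small standalone script. This is only a sketch: the function name `inspect_pmml` is hypothetical, and it checks just that the file exists, parses as XML, and has a PMML root element. It does not reproduce the extension's own unmarshalling.

```python
import os
import xml.etree.ElementTree as ET


def inspect_pmml(path):
    """Return the version and namespace declared on a PMML file's root element."""
    if not os.path.isfile(path):
        raise FileNotFoundError("No PMML file at: " + path)

    root = ET.parse(path).getroot()

    # A namespaced root tag looks like "{http://www.dmg.org/PMML-4_3}PMML".
    if root.tag.startswith("{"):
        namespace, _, local_name = root.tag[1:].partition("}")
    else:
        namespace, local_name = "", root.tag

    if local_name != "PMML":
        raise ValueError("Root element is %r, not PMML" % local_name)

    return {"version": root.get("version"), "namespace": namespace}


if __name__ == "__main__":
    # Hypothetical path; use the value passed to the extension instead.
    print(inspect_pmml("model.pmml"))
```

If this script cannot find or parse the file at the path given to the extension, the problem is the path or the file rather than the PMML version.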

How to open a tab-delimited file in Weka

When I try to open a tab-delimited file in Weka it says: "file format is not recognized". In the subsequent dialog box it shows weka.core.converters.CSVLoader and says "Reads a source that is in comma separated or tab separate format." When I click the OK button, it throws an error saying "wrong number of values. Read 11, expected 10 line 4." I checked the same file in Excel and confirmed that the line has 10 fields.
Could someone advise a workaround?
The data file cannot be converted to CSV format because some of the fields contain a comma.
After installing the unofficial Weka package common-csv-weka-package, you can load tab-delimited files using the CommonCSVLoader loader. Simply change the loader's format from DEFAULT to TDF (-F command-line option).
I had the same problem. So far the best solution I have found is using R to convert a tabular data file into ARFF. Google the two phrases "import data to R" and "export R data to weka arff". My second choice is using JMP or SAS to open a CSV file or Excel workbook and then export it as CSV.
I found a solution: for Windows 10, install the R language package from this url:
https://cran.r-project.org/web/packages/rio/index.html
install RStudio from:
https://www.rstudio.com/products/rstudio/download/#download
from the prompt in RStudio follow the Import, Export, and Convert Data Files instructions here:
https://cran.microsoft.com/snapshot/2015-11-15/web/packages/rio/vignettes/rio.html
Works a treat: it converted my .tsv files to Weka ARFF format with no problem. The only thing I haven't done yet is test the ARFF files in Weka (and compare with Python sklearn results), as I'm hoping there isn't a problem with commas embedded in the text message bodies. Scikit-Learn and TfidfVectorizer have no problems with embedded commas in a TSV file!
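For anyone who prefers Python to R for the conversion step, here is a minimal sketch using only the standard library. It rewrites a tab-delimited file as CSV, quoting any field that contains a comma, and reports lines whose field count differs from the first line. The function name `tsv_to_csv` is hypothetical, and it is an assumption that Weka's CSVLoader reads the quoted fields as intended, so check the result on a small sample first.

```python
import csv


def tsv_to_csv(tsv_path, csv_path, encoding="utf-8"):
    """Rewrite a tab-delimited file as CSV.

    Returns a list of (line_number, field_count) for every line whose
    number of fields differs from the first line.
    """
    mismatches = []
    expected = None
    with open(tsv_path, newline="", encoding=encoding) as src, \
            open(csv_path, "w", newline="", encoding=encoding) as dst:
        # QUOTE_NONE: treat quote characters in the input as ordinary text.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        # QUOTE_MINIMAL: quote only fields that contain commas, quotes or newlines.
        writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL)
        for line_number, row in enumerate(reader, start=1):
            if expected is None:
                expected = len(row)
            elif len(row) != expected:
                mismatches.append((line_number, len(row)))
            writer.writerow(row)
    return mismatches


if __name__ == "__main__":
    # Hypothetical file names.
    for line_number, count in tsv_to_csv("messages.tsv", "messages.csv"):
        print("line", line_number, "has", count, "fields")
```

The mismatch report can also help with errors such as "Read 11, expected 10", since it shows which lines split into an unexpected number of fields when only tabs are treated as separators.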

About the interpreter for AIML

I tried to build a chatbot in AIML. I downloaded the code from http://nlp-addiction.com/chatbot/mathbot/ but couldn't figure out how to run the program. Please help me.
An AIML file isn't program code, it's a data file (much like any other xml file).
You need to use an interpreter like Program-AB to load and use the file to answer queries.
If you just want to test the contents and formatting of the aiml file, you could use Pandorabots and load the file into a blank bot fairly easily.
Yes, an AIML file isn't program code. It's just a data format. You can learn more about it here: http://www.alicebot.org/aiml.html
AIML is a data encoding format that tells the bot what to do and when to do it. Many interpreters can be used to interpret the AIML tags.
One of them is PyAIML, a Python-based interpreter that is fairly simple to use.
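To illustrate the point that an AIML file is data rather than a program, here is a minimal sketch that reads one with Python's standard XML parser and lists its pattern/template pairs. The function name `list_categories` is hypothetical, and the sketch assumes the file uses plain `category`, `pattern` and `template` elements without an XML namespace. It does not answer queries; an interpreter is still needed for that.

```python
import xml.etree.ElementTree as ET


def list_categories(aiml_path):
    """Return (pattern, template) text pairs found in an AIML file."""
    root = ET.parse(aiml_path).getroot()
    pairs = []
    for category in root.iter("category"):
        pattern = category.find("pattern")
        template = category.find("template")
        if pattern is None or template is None:
            continue  # skip incomplete categories
        # itertext() also collects text nested inside child tags.
        pattern_text = "".join(pattern.itertext()).strip()
        template_text = "".join(template.itertext()).strip()
        pairs.append((pattern_text, template_text))
    return pairs


if __name__ == "__main__":
    # Hypothetical file name.
    for pattern_text, template_text in list_categories("mathbot.aiml"):
        print(pattern_text, "->", template_text)
```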

How can I read datasets in Weka?

I want to use some of the datasets available on the Weka website to perform some experiments with neural networks.
What do I have to do to read the data?
I downloaded the datasets and they were saved as .arff.txt, so I removed the .txt extension to leave only .arff. I then used this file as input, but an error occurs.
Which is the right way to read data?
Do I have to write code?
Please help me.
Thank you
I'm using Weka 3.6.6 and coc81.arff opens just fine. You are using Weka 3.7.x, which is the development branch of Weka. I suggest that you download 3.6.6 or 3.6.7 (the latest stable release) and try to open the file again.
There is also another simple way:
open your dataset file in Excel (MS Excel 2010 in my case) and format the fields by type,
save it as CSV,
then load that CSV file in the Weka Explorer and save it to the local drive in ARFF format.
Maybe this helps.
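Before converting anything, it can help to confirm that the renamed file really is a plain-text ARFF file. The following is a minimal sketch, not a full ARFF parser: the function name `summarize_arff_header` is hypothetical, and attribute names that are quoted and contain spaces are not handled.

```python
def summarize_arff_header(path, encoding="utf-8"):
    """Return (relation, attribute_names, has_data_section) for an ARFF file."""
    relation = None
    attributes = []
    has_data = False
    with open(path, encoding=encoding) as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines and ARFF comments, which start with "%".
            if not line or line.startswith("%"):
                continue
            parts = line.split(None, 2)
            keyword = parts[0].lower()  # ARFF keywords are case-insensitive
            if keyword == "@relation" and len(parts) > 1:
                relation = line.split(None, 1)[1]
            elif keyword == "@attribute" and len(parts) > 1:
                attributes.append(parts[1])
            elif keyword == "@data":
                has_data = True
                break  # the header ends where the data section starts
    return relation, attributes, has_data


if __name__ == "__main__":
    # File name taken from the answer above.
    print(summarize_arff_header("coc81.arff"))
```

If the relation comes back as `None` or no `@data` section is found, the download is probably not a valid ARFF file (for example, an HTML page saved by the browser), and renaming the extension will not fix it.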