Weka: Array Index out of bounds exception with CSV files


I get an unfriendly ArrayIndexOutOfBoundsException when I pass a CSV file to Weka on the command line, but the same file works fine in the GUI.
pvadrevu#MacPro~$ java -Xmx2048m -cp weka.jar weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -t "some.csv" -d temp.model
Refreshing GOE props...
[KnowledgeFlow] Loading properties and plugins...
[KnowledgeFlow] Initializing KF...
java.lang.ArrayIndexOutOfBoundsException: 1
weka.classifiers.evaluation.Evaluation.setPriors(Evaluation.java:3843)
weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1503)
weka.classifiers.Evaluation.evaluateModel(Evaluation.java:650)
weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:359)
weka.classifiers.functions.Logistic.main(Logistic.java:1134)
at weka.classifiers.evaluation.Evaluation.setPriors(Evaluation.java:3843)
at weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1503)
at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:650)
at weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:359)
at weka.classifiers.functions.Logistic.main(Logistic.java:1134)

It turns out that newer versions of Weka do not handle CSV files passed on the command line. There are two options:
Revert to an older version of Weka. 3.6.11 works fine for me, while 3.7.11 does not.
Convert the CSV files to ARFF. This can be done in the Weka GUI.
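
If the GUI is not convenient, the conversion can also be scripted. Weka itself ships a weka.core.converters.CSVLoader class for this; the block below is instead a minimal pure-Python sketch, not Weka's own converter. It assumes the CSV has a header row, no missing values, and no values containing commas, spaces, or quotes. The function name csv_to_arff and the default relation name are made up for illustration.

```python
import csv


def _is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_path, arff_path, relation="data"):
    """Write a minimal ARFF file from a CSV file that has a header row.

    Assumption: no missing values and no values that need ARFF quoting.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]

    lines = ["@relation " + relation, ""]
    for i, name in enumerate(header):
        column = [row[i] for row in data]
        if all(_is_number(v) for v in column):
            lines.append("@attribute %s numeric" % name)
        else:
            # Non-numeric columns become nominal attributes with sorted labels.
            labels = ",".join(sorted(set(column)))
            lines.append("@attribute %s {%s}" % (name, labels))

    lines += ["", "@data"] + [",".join(row) for row in data]
    with open(arff_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return arff_path

# Example call (file names are placeholders):
# csv_to_arff("some.csv", "some.arff")
```

The resulting .arff file can then be passed to -t in place of the CSV file.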

Related

Weka exception: Can't open file iris.arrf

I am trying to run a Weka classification task from the command line. It works perfectly for CSV training data, but it does not work for an Arrf file.
Command:
java -cp "C:\Program Files\Weka-3-8-5\weka.jar" weka.classifiers.trees.J48 -t iris.arrf
Error: Weka exception: Can't open file iris.arrf.
The Iris file is the sample data from the Weka/data folder, so I don't know where the issue is.

You misspelled the file name, using .arrf instead of .arff.

Postman Collection Format v1 is no longer supported and can not be imported directly. You may convert your collection to Format v2 and try importing again

I need your help. I have this error. I managed to convert the .json file from version 1.0.0 to version 2.0.0 with the following command at the prompt:
C:\Users\AC\Desktop\test>postman-collection-transformer convert -i test.json -o prueba1.json -j 1.0.0 -p 2.0.0 -P
The following URL has the collection that I converted. After I make changes in Visual Studio Code, the collection doesn't update: when I send the request, the file is not updated with the changes I specified. I don't know what could be going wrong. Why doesn't Postman allow version 1.0.0?
[Image: Capture Visual]
[Image: Anaconda run]
In this image the request should return an id, not that phrase. It's as if something is lost after the conversion.

The command won't update the file in place; it creates a new collection file:
postman-collection-transformer convert -i old_collection.json -o new_collection.json -j 1.0.0 -p 2.0.0 -P
The command above converts the v1 file "old_collection.json" and writes the result to a new file, new_collection.json.

How to snappy compress a file using a python script

I am trying to compress in snappy format a csv file using a python script and the python-snappy module. This is my code so far:
import snappy
d = snappy.compress("C:\\Users\\my_user\\Desktop\\Test\\Test_file.csv")
with open("compressed_file.snappy", 'w') as snappy_data:
    snappy_data.write(d)
    snappy_data.close()
This code actually creates a snappy file, but the snappy file created only contains a string: "C:\Users\my_user\Desktop\Test\Test_file.csv"
So I am a bit lost on how to get my csv compressed. I got it working from the Windows cmd with this command:
python -m snappy -c Test_file.csv compressed_file.snappy
But I need it to be done as part of a Python script, so running it from cmd does not work for me.
Thank you very much,
Álvaro

You are compressing the path string itself, because the compress function takes raw data, not a file name.
There are two ways to compress data with snappy: as a single block, or as streaming (framed) data.
This function compresses a file using the framed method:
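import snappy

def snappy_compress(path):
    # Store the framed (streaming) output next to the input file.
    path_to_store = path + '.snappy'
    # Open both files in binary mode: snappy reads and writes bytes.
    # The with statement closes both files, so no explicit close() is needed.
    with open(path, 'rb') as in_file, open(path_to_store, 'wb') as out_file:
        snappy.stream_compress(in_file, out_file)
    return path_to_store

snappy_compress('testfile.csv')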
You can decompress from command line using:
python -m snappy -d testfile.csv.snappy testfile_decompressed.csv
Note that the framing currently used by python-snappy is not compatible with the framing used by Hadoop.
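
For the other method mentioned above, compressing the file as a single block, a minimal sketch could look like the following. It reads the file's bytes and passes them to snappy.compress, which is the step the code in the question skipped. The function name snappy_compress_block is made up for illustration, and the optional compress parameter is a hypothetical hook so the helper can be exercised without python-snappy installed.

```python
from pathlib import Path


def snappy_compress_block(src, dst, compress=None):
    """Compress the whole file at src as one block and write it to dst."""
    if compress is None:
        # python-snappy; imported lazily so the default is only needed when used.
        import snappy
        compress = snappy.compress
    data = Path(src).read_bytes()          # raw file bytes, not the path string
    Path(dst).write_bytes(compress(data))  # compressed output is bytes as well
    return dst

# Example call (file names are placeholders):
# snappy_compress_block("Test_file.csv", "compressed_file.snappy")
```

A file written this way holds one raw snappy block rather than framed data, so it would be read back with snappy.uncompress on the file's bytes.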

How do I create a powershell script for windows that runs 2 python scripts

This is the first time I tried doing a shell script.
So this is what I got for now:
$pdf = read-host "enter the pdf name"
cmd /k C:\the path\\./PDF_ID.py $pdf /all> C:\the path\data.txt
C:\the path\pdf.py
So the first Python script is executed and it saves its output to data.txt, but I don't know how to run the second Python script, which analyzes data.txt and reports whether the pdf file contains malware or not.

If you are able to run Python from PowerShell, then python <scriptName>.py will get the job done.
If that does not work, you can give the full path of the Python executable on the command line, or check that the path to your Python executable is in your system variables.
If you are not looking for a concurrent solution, put these two lines directly in your samplePS1 file:
python firstscript.py ;
python secondscript.py ;
Note: If you want to pass arguments, you can add them to each command. The scripts run sequentially.
Hope it helps you.
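
As an alternative to a PowerShell file, the same sequential flow can be driven from a small Python script. This is only a sketch using the standard subprocess module, not the approach from the answer above: it runs the first command, saves its output to a file, and starts the second command only if the first one succeeded. The function name run_in_sequence and the script names in the usage comment are placeholders.

```python
import subprocess
import sys


def run_in_sequence(first_cmd, second_cmd, output_path):
    """Run first_cmd, save its stdout to output_path, then run second_cmd.

    Returns the stdout of second_cmd as text.
    """
    with open(output_path, "w") as out:
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(first_cmd, stdout=out, check=True)
    # Only reached if the first command exited with status 0.
    result = subprocess.run(second_cmd, capture_output=True, text=True, check=True)
    return result.stdout

# Example call (script and file names are placeholders):
# verdict = run_in_sequence(
#     [sys.executable, "firstscript.py", "sample.pdf"],
#     [sys.executable, "secondscript.py"],
#     "data.txt",
# )
```

Passing each command as a list avoids quoting problems with paths that contain spaces.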

Inspectdb in Oracle-Django gets OCI-22061: invalid format text [T

I'm using Oracle Database 10g xe universal Rel.10.2.0.1.0 against cx_Oracle-5.0.4-10g-unicode-py26-1.x86_64 on a django project on Ubuntu 10.04
My db is generated by Oracle 10gr2 enterprise edition (on Windows XP, import done in US7ASCII character set and AL16UTF16 NCHAR character set, import server uses AL32UTF8 character set, export client uses EL8MSWIN1253 character set)
When I try django-admin.py inspectdb I get the following error:
......."indexes = connection.introspection.get_indexes(cursor,
table_name) File
"/usr/lib/pymodules/python2.6/django/db/backends/oracle/introspection.py",
line 116, in get_indexes
for row in cursor.fetchall(): File "/usr/lib/pymodules/python2.6/django/db/backends/oracle/base.py", line
483, in fetchall
for r in self.cursor.fetchall()]) cx_Oracle.DatabaseError: OCI-22061: invalid format text [T".
I am aware of "inspectdb works with PostgreSQL, MySQL and SQLite" but as I understand from other posts it also works with Oracle somehow.
Does anyone know why I get this error or how I could fix it?

Try updating the cx_Oracle package to 5.1.1, then run:
python manage.py inspectdb --database dbname

You can fix the issue by downloading cx_Oracle-5.1.2 with the command below.
$ wget -c http://prdownloads.sourceforge.net/cx-oracle/cx_Oracle-5.1.2-11g-py27-1.x86_64.rpm
Command to install the rpm:
$ sudo yum install cx_Oracle-5.1.2-11g-py27-1.x86_64.rpm
Also download Oracle instantclient http://download.oracle.com/otn/linux/instantclient/11101/basic-11.1.0.6.0-linux-x86_64.zip and http://download.oracle.com/otn/linux/instantclient/11101/sdk-11.1.0.6.0-linux-x86_64.zip
Extract both downloaded zip files.
Copy the include folder from sdk-11.1.0.6.0-linux-x86_64 and paste it into basic-11.1.0.6.0-linux-x86_64.
Set the paths below in your .bashrc file:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/oracle_lib/oracle_instantclient_11_1
export ORACLE_HOME=/oracle_lib/oracle_instantclient_11_1
$ ls /oracle_lib/oracle_instantclient_11_1
You should find the include folder along with a list of files.
Then reload the .bashrc file using $ source ~/.bashrc
I have tested it.
