Error while reading a compressed csv file using Python2.7 - python-2.7

I am getting an error while reading a compressed csv file.
The error is as below:
"zlib.error: Error -3 while decompressing: invalid distances set"
Code :
import gzip

filename = 'testfile.gz'
with gzip.open(filename, 'rb') as reader:
    for line in reader:
        print(line)
I tried gunzip on the file and it worked without any issues.
I also ran gunzip -t on it, which returned exit code 0.

This appears to be a bug in the zlib library, version 1.2.7-15:
$ rpm -qa | grep zlib
zlib-1.2.7-15.el7.x86_64
After I had it updated to 1.2.7-17, the issue was resolved:
$ rpm -qa | grep zlib
zlib-1.2.7-17.el7.x86_64
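To tell a damaged file apart from a problem in the local zlib build, it can help to decompress the whole file from Python and record where it stops. The following is only a minimal sketch: check_gzip is a hypothetical helper name, and note that zlib.ZLIB_VERSION reports the upstream zlib version, not the RPM release suffix (-15 or -17) shown above.

```python
import gzip
import zlib

def check_gzip(path, chunk_size=64 * 1024):
    """Try to decompress the whole file; return (ok, bytes_out, error_text)."""
    total = 0
    try:
        reader = gzip.open(path, 'rb')
        try:
            while True:
                chunk = reader.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
        finally:
            reader.close()
    except (IOError, EOFError, zlib.error) as exc:
        # total holds the number of bytes decompressed before the failure
        return False, total, str(exc)
    return True, total, None

if __name__ == '__main__':
    print('zlib version Python was built against: %s' % zlib.ZLIB_VERSION)
    print(check_gzip('testfile.gz'))
```

If gunzip -t succeeds on the same file while this check fails, that points at the zlib library Python is using rather than at the file.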

Related

How to snappy compress a file using a python script

I am trying to compress a csv file in snappy format using a python script and the python-snappy module. This is my code so far:
import snappy
d = snappy.compress("C:\\Users\\my_user\\Desktop\\Test\\Test_file.csv")
with open("compressed_file.snappy", 'w') as snappy_data:
    snappy_data.write(d)
    snappy_data.close()
This code actually creates a snappy file, but the snappy file created only contains a string: "C:\Users\my_user\Desktop\Test\Test_file.csv"
So I am a bit lost on getting my csv compressed. I got it done working on windows cmd with this command:
python -m snappy -c Test_file.csv compressed_file.snappy
But I need it to be done as part of a python script, so running it from cmd is not an option for me.
Thank you very much,
Álvaro
You are compressing the path string itself: the compress function takes raw data, not a file name.
There are two ways to compress snappy data: as a single block, or as streaming (framed) data.
This function will compress a file using the framed method:
import snappy

def snappy_compress(path):
    path_to_store = path + '.snappy'
    # Open both files in binary mode: stream_compress reads and writes bytes.
    # The with statement closes both files, so no explicit close() is needed.
    with open(path, 'rb') as in_file, open(path_to_store, 'wb') as out_file:
        snappy.stream_compress(in_file, out_file)
    return path_to_store

snappy_compress('testfile.csv')
You can decompress from command line using:
python -m snappy -d testfile.csv.snappy testfile_decompressed.csv
Note that the framing currently used by python-snappy is not compatible with the framing used by Hadoop.
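For comparison, the single-block way mentioned above could look roughly like the sketch below. snappy_compress_block is a hypothetical name, the whole file is read into memory, and the result is a raw snappy block rather than framed data, so it is meant to be read back with snappy.uncompress in Python.

```python
def snappy_compress_block(src_path, dst_path):
    """Compress src_path into dst_path as one raw snappy block."""
    # python-snappy is imported here so this file still loads where it is absent.
    import snappy

    with open(src_path, 'rb') as in_file:
        data = in_file.read()  # the whole file is held in memory
    with open(dst_path, 'wb') as out_file:
        out_file.write(snappy.compress(data))
    return dst_path
```

This is only reasonable for files that fit comfortably in memory; for anything larger the framed function is the safer choice.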

gdal2tiles.py no input file specified, wrong formatting?

I'm having trouble running gdal2tiles.py from the command line. I followed the instructions on installing gdal from http://cartometric.com/blog/2011/10/17/install-gdal-on-windows/. I then verified through the command prompt that gdal was installed by typing in gdalinfo --version, and the correct version came up, which means that my path and variables are set.
So when I try to run this:
gdal2tiles.py -p raster -z 0-6 test.jpg abc
I keep getting an error that says "error: No input file was specified" and
"Usage: gdal2tiles.py [options] input_file(s) [output]"
I am able to run other gdal commands and they work just fine. I've also tried to run
gdal2tiles.py test.jpg
and this gives the same error.
I'm pretty sure I have the right formatting so if anyone has any suggestions or might have a solution to this please let me know. Thanks
In the command prompt, run the script through python explicitly:
python gdal2tiles.py -p raster -z 0-6 test.jpg abc
That corrected the problem for me.
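If the same call has to be made from a script, one possible sketch is to start gdal2tiles.py through the running interpreter, which mirrors the explicit python invocation above. The helper names and default values here are hypothetical, and script_path is assumed to be a path at which gdal2tiles.py can actually be found.

```python
import subprocess
import sys

def gdal2tiles_command(script_path, input_file, output_dir,
                       profile='raster', zoom='0-6'):
    """Build the argument list that runs gdal2tiles.py through the interpreter."""
    return [sys.executable, script_path,
            '-p', profile, '-z', zoom, input_file, output_dir]

def run_gdal2tiles(script_path, input_file, output_dir, **options):
    """Run gdal2tiles.py and return the exit status of the child process."""
    command = gdal2tiles_command(script_path, input_file, output_dir, **options)
    return subprocess.call(command)
```

Passing the arguments as a list avoids any dependence on how the shell or the .py file association forwards them.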

Weka: Array Index out of bounds exception with CSV files

I get an unfriendly ArrayIndexOutOfBoundsException when I try to input a CSV file to Weka from the command line. But it works fine when I use the same file in the GUI.
pvadrevu#MacPro~$ java -Xmx2048m -cp weka.jar weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -t "some.csv" -d temp.model
Refreshing GOE props...
[KnowledgeFlow] Loading properties and plugins...
[KnowledgeFlow] Initializing KF...
java.lang.ArrayIndexOutOfBoundsException: 1
weka.classifiers.evaluation.Evaluation.setPriors(Evaluation.java:3843)
weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1503)
weka.classifiers.Evaluation.evaluateModel(Evaluation.java:650)
weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:359)
weka.classifiers.functions.Logistic.main(Logistic.java:1134)
at weka.classifiers.evaluation.Evaluation.setPriors(Evaluation.java:3843)
at weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1503)
at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:650)
at weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:359)
at weka.classifiers.functions.Logistic.main(Logistic.java:1134)
It turns out that the newer versions of Weka do not handle CSV files through the command line. There are two options:
Revert to an older version of Weka. 3.6.11 works fine for me, while 3.7.11 does not.
Convert the CSV files to ARFF. This can be done using the Weka GUI.
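For simple files the conversion can also be scripted. The following is a rough sketch rather than a replacement for Weka's own converter: csv_to_arff is a hypothetical helper, and it assumes a header row, no missing values, and no spaces, commas or quotes inside names or values. Columns in which every value parses as a number are declared numeric, and all others become nominal attributes.

```python
import csv

def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def csv_to_arff(csv_path, arff_path, relation='data'):
    """Write a minimal ARFF file from a simple CSV file with a header row."""
    with open(csv_path, 'r') as src:
        rows = [row for row in csv.reader(src) if row]
    header, records = rows[0], rows[1:]
    with open(arff_path, 'w') as out:
        out.write('@relation %s\n\n' % relation)
        for index, name in enumerate(header):
            column = [record[index] for record in records]
            if all(_is_number(value) for value in column):
                kind = 'numeric'
            else:
                # nominal attribute: list the distinct values seen in the column
                kind = '{%s}' % ','.join(sorted(set(column)))
            out.write('@attribute %s %s\n' % (name, kind))
        out.write('\n@data\n')
        for record in records:
            out.write(','.join(record) + '\n')
    return arff_path
```

The resulting file can then be passed to -t in place of the CSV file.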

wodi64: ocamlopt issues an error

I installed wodi64 on windows 7. When I try to compile a simple hello world program with:
ocamlopt -o hello hello.ml
I get an error:
File "hello.ml", line 1:
Error: Corrupted compilation unit description
C:/wodi64/opt/wodi64/lib/ocaml/std-lib\pervasives.cmx
The contents of the hello.ml file are just:
print_string "Hello world!\n";;
Any idea on how to solve this?
Thanks.
First of all, check that your files are still intact. Some anti-virus programs don't like the ocaml compiler and modify or remove its files.
Instructions (from the installed cygwin shell):
cd /tmp # or: wget 'http://wodi.forge.ocamlcore.org/wodi64o.md5sum' -O /tmp/wodi64o.md5sum
godi_console wget 'http://wodi.forge.ocamlcore.org/wodi64o.md5sum'
cd /opt/wodi64
md5sum -c /tmp/wodi64o.md5sum
# install md5sum via cygwin's setup, if it's not already installed
There can be some mismatches, because configuration files are updated during operation (e.g. /opt/wodi64/lib/ocaml/std-lib/ld.conf and Makefile.config will differ), but binary files should be identical.
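Where md5sum is not available, the same check could be approximated with a short script. This sketch assumes the list uses the usual md5sum line layout (hash, a space, a space or asterisk, then a path relative to the directory being checked); check_md5_list and file_md5 are hypothetical helper names.

```python
import hashlib
import os

def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def check_md5_list(list_path, root='.'):
    """Return (relative_path, reason) pairs for entries that fail the check."""
    failures = []
    with open(list_path, 'r') as listing:
        for line in listing:
            line = line.rstrip('\r\n')
            if not line.strip():
                continue
            expected, _, name = line.partition(' ')
            name = name[1:]  # drop the ' ' (text) or '*' (binary) mode marker
            target = os.path.join(root, name)
            if not os.path.isfile(target):
                failures.append((name, 'missing'))
            elif file_md5(target) != expected.lower():
                failures.append((name, 'mismatch'))
    return failures
```

Called as check_md5_list('/tmp/wodi64o.md5sum', '/opt/wodi64'), it returns the entries that are missing or differ, which can then be compared against the expected configuration-file mismatches.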

dpkg fails when started by C++ but not when called from command line

I'm currently developing software in C++ that updates some packages on a Linux distribution (using dpkg, provided by Busybox).
All it does is download some files, check their MD5 checksums and install them using dpkg -i.
The code which runs dpkg is
stringstream packetcmdstream;
packetcmdstream << "dpkg -i " << filename;
string packetcmd = packetcmdstream.str();
int success = system(packetcmd.c_str());
The problem is that it always fails with a strange error such as:
Preparing to replace sqlite3 0.8-1 (using /tmp/sqlite3_0.8-1_arm.deb)...
dpkg: can't remove old file /usr/lib/libsqlite3.so.0: Directory not empty
But everything works well with the same .deb file if I do dpkg -i /tmp/sqlite3_0.8-1_arm.deb from the command line...
Do you have any idea what could cause this problem?
Thanks in advance !