I am creating a NetCDF file with mostly NaN values. Is there a way to have it compressed rather than taking up a large amount of disk space? I am using the University Corporation for Atmospheric Research C++ NetCDF library.
thanks!
Yes, but it depends on which netCDF C++ API you are using: the legacy (netcdf-cxx-4.2) C++ API or the newer netcdf-cxx4-4.2 C++ API.
With the netCDF-4 C++ library, documented here, just use the NcVar::setCompression method.
With the legacy netCDF-3 C++ library, there is no C++ method to do what you want directly. But that library is implemented as a thin layer over the netCDF C library, so it ought to be fairly straightforward to add an NcVar constructor that sets the compression level by calling the C function nc_def_var_deflate. Of course, you would have to make sure your legacy C++ library was built against a previously installed netCDF-4 library.
Obviously it is better to produce compressed netCDF-4 files at creation time, but I just wanted to add that if one already has code that writes the file uncompressed, it is possible to use CDO on existing files to convert to netCDF-4 and compress at the same time (-f nc4c selects the netCDF-4 classic format and -z zip_9 requests deflate level 9, the highest):
cdo -f nc4c -z zip_9 copy in.nc out.nc
We are developing a scientific application which has its interface in Python 2.7 and its computation routines written in Intel Visual Fortran. The source files are read using Python, then only the data required for the computations has to be passed to standalone Fortran algorithms. Once the computations are done, the data has to be read by Python once again.
Using formatted text files takes too long and is inefficient. Further, we would like to have a standard intermediate format. There can be about 20 arrays and those are huge (if written to formatted text, the file is about 500 MB).
Q1. In a situation like this, where data exchange between Python and Fortran is necessary, what would be the recommended way of interaction? (e.g. writing intermediate data to be read by the other side, calling Fortran from within Python, using numpy to create compatible arrays, etc.)
Q2. If writing intermediate structures is recommended, what format is good for data exchange? (We came across CDF, netCDF and binary streaming, but haven't tried any so far.)
The standard way of wrapping Fortran code in Python is with f2py (included in the numpy module).
For the output of intermediary results, a number of formats could work; it really depends on your requirements.
For simple datasets, from python, just use numpy.save.
If your datasets become large, HDF5 with, for instance, PyTables in Python and libhdf5 in Fortran could be used.
Otherwise, if you don't want to link your code to an external library, custom binary files written from Fortran and parsed with numpy could work too.
I would interface directly between Python and Fortran. It is relatively straightforward to allocate memory using Numpy and pass a pointer through to Fortran. You use iso_c_binding to write C-compatible wrapper routines for your Fortran routines and ctypes to load the Fortran .dll and call the wrappers. If you're interested I can throw together a simple example (but I am busy right this moment).
Is it possible to load a matrix in PETSc binary format from an external file at runtime using the Octave C++ API? I've found the Doxygen documentation, but I can't find anything useful among so many items.
Usually I use "PetscBinaryRead.m" when I want to load a PETSc matrix to Octave, but now in C++ I'm really completely lost.
PetscBinaryRead.m is not part of Octave, so we don't know where you got it or even what it does. You can:
reimplement it in C++, C, or Fortran (only you know what it does)
start the Octave interpreter in your C++ program and call PetscBinaryRead from there (see the Octave manual on how to create standalone programs or on calling Octave functions from oct-files)
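For the first option, here is a rough sketch of what a reimplementation might look like. It assumes the file holds a single sparse matrix written by a default PETSc build: big-endian data, 32-bit integers, real double-precision values, and a header consisting of the class id 1211216 followed by the row count, column count and nonzero count, then the per-row nonzero counts, the zero-based column indices and the values. Those layout details are my assumption about what PetscBinaryRead.m expects, so check them against your copy of the script. The SparseCsr struct and all function names are hypothetical, not part of PETSc or Octave.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <istream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the matrix in compressed-row form.
struct SparseCsr {
    std::int32_t rows = 0;
    std::int32_t cols = 0;
    std::vector<std::int32_t> row_nnz;    // number of nonzeros in each row
    std::vector<std::int32_t> col_index;  // zero-based column of each nonzero
    std::vector<double> values;           // value of each nonzero
};

// Read `count` bytes and assemble them most-significant byte first.
inline std::uint64_t read_big_endian(std::istream& in, int count) {
    std::uint64_t value = 0;
    for (int i = 0; i < count; ++i) {
        const int byte = in.get();
        if (byte == std::istream::traits_type::eof())
            throw std::runtime_error("unexpected end of PETSc file");
        value = (value << 8) | static_cast<std::uint64_t>(byte);
    }
    return value;
}

// Read a big-endian 32-bit signed integer.
inline std::int32_t read_int32(std::istream& in) {
    const std::uint32_t bits = static_cast<std::uint32_t>(read_big_endian(in, 4));
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Read a big-endian IEEE-754 double.
inline double read_double(std::istream& in) {
    const std::uint64_t bits = read_big_endian(in, 8);
    double value;
    static_assert(sizeof value == sizeof bits, "expects 64-bit doubles");
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Parse one matrix from the stream, following the layout assumed above.
inline SparseCsr read_petsc_matrix(std::istream& in) {
    const std::int32_t kMatFileClassId = 1211216;  // assumed marker for a matrix
    if (read_int32(in) != kMatFileClassId)
        throw std::runtime_error("not a PETSc binary matrix");
    SparseCsr m;
    m.rows = read_int32(in);
    m.cols = read_int32(in);
    const std::int32_t nnz = read_int32(in);
    if (m.rows < 0 || m.cols < 0 || nnz < 0)
        throw std::runtime_error("corrupt or unsupported PETSc matrix header");
    m.row_nnz.resize(static_cast<std::size_t>(m.rows));
    for (auto& n : m.row_nnz) n = read_int32(in);
    m.col_index.resize(static_cast<std::size_t>(nnz));
    for (auto& c : m.col_index) c = read_int32(in);
    m.values.resize(static_cast<std::size_t>(nnz));
    for (auto& v : m.values) v = read_double(in);
    assert(m.col_index.size() == m.values.size());
    return m;
}

// Convenience wrapper that opens the file in binary mode.
inline SparseCsr load_petsc_matrix(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return read_petsc_matrix(in);
}
```

Calling load_petsc_matrix("A.dat") (the file name is just an example) would give you plain arrays that could then be copied into whatever Octave matrix type you need; that last step is left out here because it depends on the Octave headers.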
I am currently working on a project in C++, and I am actually interested in using Matlab data structures, instead of having to create my own data types (such as matrices, arrays, etc.)
Is there a way to seamlessly use Matlab objects in C++? I don't mind having to run Matlab in the background while my program runs.
EDIT: A starting point is this: http://www.mathworks.co.uk/help/matlab/calling-matlab-engine-from-c-c-and-fortran-programs.html. I will continue reading this.
You can instead use the Armadillo C++ maths library, which is used by NASA, Boeing, Siemens, Deutsche Bank, MIT, CMU, Stanford, etc.
It has good documentation and examples aimed at people who are more familiar with MATLAB/Octave:
http://arma.sourceforge.net/docs.html#syntax
I would prefer using a native C++ library of some sort and not Matlab. This is likely to be faster for both development and execution.
From writing C++ extensions for Matlab I learned one thing: using Matlab objects in C++ is likely to give you a considerable headache.
Matlab data structures are not exposed as C++ classes. Instead, you get pointers that you can manipulate with C-like API functions.
I recommend using a native C++ library such as Eigen3.
The functionality you are looking at is not really intended to be used as seamless objects. In the past when I have used it I found it much simpler to do the C parts using either native arrays or a third party matrix library and then convert it into a Matlab matrix to return.
Mixing Matlab and C++ is typically done in one of two ways:
Having a C++ program call Matlab to do some specialist processing. This is chiefly useful for rapid development of complex matrix algorithms. You can do this either by calling the full Matlab engine, or by packaging your snippet of Matlab code into a shared library for distribution. (The distributed version bundles a redistributable copy of the Matlab runtime, which is what runs your scripts.)
Having a Matlab script call a C++ function to do some specialist processing. This is often used to embed C++ implementations of algorithms (such as machine learning models) or to handle specific optimizations.
Both of these use cases have some overhead transferring the data to/from Matlab.
If you are simply looking for some matrix code to use in C++ you would be better off looking into the various C++ matrix libraries, such as the one implemented in Boost.
You can do mixed programming with C++ and Matlab. There are two possible ways:
Call MATLAB Engine directly: Refer to this post for more info. Matlab will run in the background.
Compile the MATLAB code into an independent shared library: check out here on how to do this (with detailed steps and an example).
I was trying to create an LZW compression program, but I need to finish it today, so I want to use a DLL that takes a text file as input and writes a text file as output. I want to call it from the Turbo C++ code that handles the rest of my functionality.
Can anyone suggest a method?
Libzip isn't LZW (it uses an algorithm that's generally better), but it is probably the best standard answer. I don't know if there's a downloadable DLL for it in a standard location, so you might have to compile it from source.
Alternatively, a bit of Google-searching (on "lzw compression dll") found this C++ source code for doing LZW compression, which you may be able to use: http://zabkat.com/blog/24Jan10-lzw-compression-code.htm
This is an old question.
You can still look around the LHArc sources on the old Simtel archives. There's an implementation of the LZW algorithm there from before the Unisys patent on it was enforced.
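If no ready-made DLL turns up, the algorithm itself is short enough to write by hand. Below is a minimal sketch of textbook LZW, not taken from any of the sources mentioned above. It assumes a C++11 compiler with the standard library, so it would need reworking for Turbo C++, and it leaves the codes as plain integers rather than packing them into a bit stream. The function names lzw_compress and lzw_decompress are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Encode `input` as a sequence of LZW codes. Codes 0..255 stand for single bytes.
inline std::vector<int> lzw_compress(const std::string& input) {
    std::map<std::string, int> dict;
    for (int i = 0; i < 256; ++i)
        dict[std::string(1, static_cast<char>(i))] = i;

    std::vector<int> codes;
    std::string current;
    for (char c : input) {
        const std::string candidate = current + c;
        if (dict.count(candidate)) {
            current = candidate;                 // keep extending the match
        } else {
            codes.push_back(dict[current]);      // emit the longest known prefix
            const int next_code = static_cast<int>(dict.size());
            dict[candidate] = next_code;         // learn the new sequence
            current = std::string(1, c);
        }
    }
    if (!current.empty()) codes.push_back(dict[current]);
    return codes;
}

// Rebuild the original text from the codes produced by lzw_compress.
inline std::string lzw_decompress(const std::vector<int>& codes) {
    if (codes.empty()) return std::string();
    std::vector<std::string> dict(256);
    for (int i = 0; i < 256; ++i)
        dict[i] = std::string(1, static_cast<char>(i));

    if (codes[0] < 0 || codes[0] > 255)
        throw std::invalid_argument("first LZW code must be a single byte");
    std::string previous = dict[codes[0]];
    std::string output = previous;
    for (std::size_t i = 1; i < codes.size(); ++i) {
        const int code = codes[i];
        const int dict_size = static_cast<int>(dict.size());
        std::string entry;
        if (code >= 0 && code < dict_size) {
            entry = dict[code];
        } else if (code == dict_size) {
            entry = previous + previous[0];      // code defined by this very step
        } else {
            throw std::invalid_argument("invalid LZW code");
        }
        assert(!entry.empty());
        output += entry;
        dict.push_back(previous + entry[0]);     // mirror the encoder's dictionary
        previous = entry;
    }
    return output;
}
```

To get from this to files, you would read the input text into a std::string, call lzw_compress, and write the codes out in whatever form you need (one number per line would keep the output a text file, at the cost of size).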
I'm looking for a wrapper that distills zlib to:
OpenZipFile()
GetItemInfo(n)
UnzipItem(n) // Bonus points for unzipping recursively if item n is a directory.
I see a lot of wrappers around the zlib library on, say, codeproject.com but they are all platform-specific in order to provide the added platform-specific functionality of unzipping to file/memory buffer/pipe.
In boost::iostreams there is the possibility to use zlib, gzip and bzip2 formats.
You can find it at http://www.boost.org/
In the zlib source archive, there is a contribution named "minizip".
"minizip" is a set of files you can use to work with .zip files. The basic services you need are already there:
unzOpen
unzLocateFile
unzOpenCurrentFile
unzGetCurrentFileInfo
unzCloseCurrentFile
unzClose
Of course, this is not object oriented (and I'm sure that was not the goal of the creator of minizip), but writing a simple object oriented wrapper should be easy.
firstobject's easy zlib stays cross-platform; it has zlib in a single file easyzlib.c and exposes only ezcompress and ezuncompress functions with the added feature of determining the memory requirement before allocating the exact size.
You could try to grab the code from another FOSS project. ScummVM, for example, has a highly portable Zlib wrapper (implementation, header) with all the functions you need, plus an OO layer for interfacing generically with any other kind of archive.
Maybe that's a good starting point? The wrapper functions are totally standalone and portable (heck, they even work on a Nintendo DS), but the OO layer depends on many custom classes which may be hard to add to your own project.
GZStream is worth a look. This is a nice cross-platform wrapper around zlib which extends the STL iostream classes.
http://www.cs.unc.edu/Research/compgeom/gzstream/
What is good about this wrapper over some of the others is that if you're working with very large archives you don't need to load the whole dataset into memory.
If you use minizip, be aware that the version shipped with zlib 1.2.3 has a 2 GB limit on the resulting zip file. It will produce zip files larger than 2 GB, but you won't be able to open them.
This is an old thread, but I thought I'd throw in Boost's ZLib wrapper:
http://www.boost.org/doc/libs/1_47_0/libs/iostreams/doc/classes/zlib.html
You can also check out this C++ zlib wrapper with auto-detection of the input format:
https://github.com/mateidavid/zstr