Writing Output File Cuda C++ - c++

I need to write simulation data computed on GPU into an output .csv file. Normally I would just use the fstream library but that's not possible on GPU.
Are there any built-in functions or other libraries that I could use to write data to .csv or .txt files directly from device code? Right now performance is not that important; I am looking for an easy interim solution.

No, it's not possible to do direct file I/O from CUDA device code unless you are using something like GPUDirect Storage (GDS), which, judging from your question, you most likely are not at the current time. If you don't already have it set up, GDS might not be an "easy interim solution".
Copy the data to the host, then use whatever file I/O routines you are comfortable with.
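A minimal host-side sketch of that advice, assuming the results have already been copied into a std::vector<float> (in a real CUDA program that copy would be a cudaMemcpy with cudaMemcpyDeviceToHost, which is left out here). The function name write_csv and the row-major layout are assumptions made for illustration, not anything from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Writes a row-major host buffer to `path` as comma-separated values.
// The buffer is assumed to have been filled from the GPU beforehand
// (e.g. with cudaMemcpy); that step is omitted in this sketch.
// Returns false if the file could not be opened or written.
bool write_csv(const std::string& path, const std::vector<float>& host,
               std::size_t cols) {
    assert(cols > 0 && host.size() % cols == 0);
    std::ofstream out(path);
    if (!out) return false;
    for (std::size_t i = 0; i < host.size(); ++i) {
        out << host[i];
        // End the line after the last column, otherwise add a separator.
        out << ((i + 1) % cols == 0 ? '\n' : ',');
    }
    return static_cast<bool>(out);
}
```

Calling write_csv("out.csv", results, 3) would then produce one line per group of three values.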
Note that requests for library recommendations are specifically off-topic for SO.

Use printf in the CUDA kernel, redirect the program's output to a text file, and then parse that text file to convert it to CSV.


FILE IO to InMemory IO

I have a legacy C library which accepts a file, works on the file payload and writes the processed payload to an output file. The functions in the library are tightly coupled with FILE i.e. it passes around FILE handle to the functions and functions do file IO to retrieve the necessary data.
I want to modify this library so that it works with in-memory data (no file IO), i.e. pass in a binary array and get back a binary array.
I have 2 solutions in mind:
1) Implement an in-memory file module (which implements all the operations available on a C FILE) and override the default file operations with the new implementation using typedef or #define.
2) Pass the binary array around to all the functions of the library and retrieve the necessary data from it.
Which of these is better, or is there a better way to solve the problem?
I would suggest not to change legacy code if any other code depends on it.
If you are building for a somewhat POSIX compliant platform, you can use fmemopen http://pubs.opengroup.org/onlinepubs/9699919799/functions/fmemopen.html
For Windows maybe this might help
C - create file in memory
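A small sketch of the fmemopen route on a POSIX system. The function legacy_read_all is a hypothetical stand-in for one of the library's FILE-based routines, since the real ones are not shown in the question; the point is only that it receives an ordinary FILE handle backed by memory.

```cpp
#include <cassert>
#include <cstddef>
#include <stdio.h>   // fmemopen is declared here on POSIX systems
#include <string>
#include <vector>

// Hypothetical stand-in for a legacy routine that only reads from a FILE*.
std::string legacy_read_all(FILE* in) {
    std::string result;
    char chunk[64];
    std::size_t n;
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        result.append(chunk, n);
    }
    return result;
}

// Wraps an in-memory buffer in a read-only FILE* and hands it to the
// legacy routine, so no file on disk is involved.
std::string process_in_memory(std::vector<char>& payload) {
    assert(!payload.empty());  // a zero-size buffer may be rejected
    FILE* in = fmemopen(payload.data(), payload.size(), "r");
    if (!in) return std::string();
    std::string out = legacy_read_all(in);
    fclose(in);
    return out;
}
```

The legacy code itself stays untouched; only the place where the FILE handle is created changes.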
I don't know the exact purpose of changing the legacy code. The issue, as I understand it, is the overhead caused by reading and writing. There are several methods available to reduce that overhead:
As already mentioned, you can use fmemopen.
You may also use mmap, although compared with plain read/write it should make little difference; either way, everything happens through the filesystem cache/buffers.
You can also use tmpfs (also known as a RAM disk) to use memory as storage for (temporary) files. The contents are easily lost, but the files are temporary in nature anyway.
Another solution: you can use an in-memory database (TimesTen, for example).

How to save a c++ readable .mat file

I am running DCT code in MATLAB and I would like to read the compressed file (.mat) into C code. However, I am not sure this is the right approach. I have not yet finished my code, but I would like an explanation of how to create a C++-readable file from my .mat file.
I am rather confused when it comes to .mat, .txt, and then the binary and float details of files. Could someone please explain this to me?
It seems that you have a lot of options here, depending on your exact needs, time, and skill level (in both Matlab and C++). The obvious ones are:
ASCII files
You can generate ASCII files in Matlab either using the save(filename, variablename, '-ascii') syntax, or you can create a more custom format using c-style fprintf commands. Then, within a C or C++ program the files are read using an fscanf.
This is often easiest, and good enough in many cases. The fact that a human can read the files using notepad++, emacs, etc. is a nice sanity check, (although this is often overrated).
There are two big downsides. First, the files are very large (an 8 byte double number requires about 19 bytes to store in ASCII). Second, you have to be very careful to minimize the inevitable loss of precision.
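On the C++ side, reading such an ASCII file and avoiding precision loss could look roughly like the sketch below. The helper names are made up for illustration. The key point is that 17 significant digits (max_digits10 for a double) are enough for an IEEE 754 double to survive a text round trip; on the writing side that corresponds to a format such as '%.17g'.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated numbers from any input stream
// (a std::ifstream opened on the exported file, in practice).
std::vector<double> read_ascii_doubles(std::istream& in) {
    assert(in.good());  // the stream must be open and readable
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);
    return values;
}

// Formats a double with max_digits10 significant digits so that
// parsing the text again yields exactly the same value.
std::string format_exact(double v) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << v;
    return out.str();
}
```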
Bytes-on-a-disk
For a simple array of numbers (for example, a 32-by-32 array of doubles) you can simply use the fwrite Matlab function to write the array to a disk. Then within C/C++ use the parallel fread function.
This has no loss of precision, is pretty fast, and relatively small size on disk.
The downside with this approach is that complex Matlab structures cannot necessarily be saved.
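A sketch of the reading side, assuming the array was written with an explicit 'double' precision (Matlab's fwrite defaults to uint8 otherwise) and that both machines use the same byte order. The names read_raw_doubles and at are illustrative only. Matlab writes arrays column by column, so the index helper uses column-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Reads rows * cols raw 8-byte doubles from `path`.
// Returns an empty vector on any error or short read.
std::vector<double> read_raw_doubles(const char* path, std::size_t rows,
                                     std::size_t cols) {
    assert(rows > 0 && cols > 0);
    std::vector<double> data(rows * cols);
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return std::vector<double>();
    std::size_t got = std::fread(data.data(), sizeof(double), data.size(), f);
    std::fclose(f);
    if (got != data.size()) return std::vector<double>();
    return data;
}

// Column-major lookup: element (r, c) of a matrix with `rows` rows
// sits at index c * rows + r in the flat buffer.
double at(const std::vector<double>& data, std::size_t rows,
          std::size_t r, std::size_t c) {
    return data[c * rows + r];
}
```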
Mathworks provided C library
Since this is a pretty common problem, the Mathworks has actually solved this by a direct C implementation of the functions needed to read/write to *.mat files. I have not used this particular library, but generally the libraries they provide are pretty easy to integrate. Some documentation can be found starting here: http://www.mathworks.com/help/matlab/read-and-write-matlab-mat-files-in-c-c-and-fortran.html
This should be a pretty robust solution, and relatively insensitive to changes, since it is part of the mainstream, supported Matlab toolset.
HDF5 based *.mat file
With recent versions of Matlab, you can use the notation save(filename, variablename, '-v7.3'); to force Matlab to save the file in an HDF5 based format. Then you can use tools from the HDF5 group to handle the file. Note a decent, java-based GUI viewer (http://www.hdfgroup.org/hdf-java-html/hdfview/index.html#download_hdfview) and libraries for C, C++ and Fortran.
This is a non-fragile method to store binary data. It is also a bit of work to get the libraries working in your code.
One downside is that the Mathworks may change the details of how they map Matlab data types into the HDF5 file. If you really want to be robust, you may want to try ...
Custom HDF5 file
Instead of just taking whatever format the Mathworks decides to use, it's not that hard to create an HDF5 file directly and push data into it from Matlab. This lets you control things like compression, chunk sizing, dataset hierarchy and names. It also insulates you from any future changes in the default *.mat file format. See the h5write command in Matlab.
It is still a bit of effort to get running from the C/C++ end, so I would only go down this path if your project warranted it.
.mat is a special format for MATLAB itself.
What you can do is to load your .mat file in the MATLAB workspace:
load file.mat
Then use fopen and fprintf to write the data to file.txt and then you can read the content of that file in C.
You can also use MATLAB's dlmwrite to write a delimited ASCII file, which is easy to read in C (and human-readable too), although it may not be as compact, if that is core to the issue.
Adding to what has already been mentioned you can save your data from MATLAB using -ascii.
save x.mat x
Becomes:
save x.txt x -ascii

multiple access to file in r/w

I'm planning to write a program which has to access a certain file many times in r/w.
So I decided to use fstream, since I can use this class for both reading and writing purpose.
My idea is to open the file at the startup of the application and then close it as the application is closed too.
Since the file can be arbitrarily big, I was planning to use a "paging" structure, in which:
1) preallocate a fixed amount of memory for each page and a fixed number of pages
2) load part of the file into the first free page
3) if there is no free page, I select a non-empty one by a certain criterion, commit all edits in it (if there are any), and then load the part of the file into that page.
That's not so hard to code. But I was wondering whether I'm going to reinvent the wheel... maybe fstream itself is written in a smart way so that it also implements a similar paging mechanism. In that case, I would not need to take care of it, and could just write and read at any time.
Any suggestions?
Don't do this yourself. Unless you are using a very exotic implementation, the fstream class already implements such a mechanism efficiently.
Check out http://www.cplusplus.com/doc/tutorial/files/ "Buffers and Synchronization"
There are possible issues if you are seeking in a file larger than 2GB with an old kernel or an old implementation of the standard library. Check this:
Large file support in C++
or use Boost.Filesystem
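For illustration, a sketch of plain random-access reads and writes on a single fstream that stays open, relying on the stream's own buffer and the operating system's file cache instead of hand-written pages. The helper names write_at and read_at are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Overwrites bytes at `offset` in an already-open read/write stream.
bool write_at(std::fstream& file, std::streamoff offset,
              const std::string& bytes) {
    assert(file.is_open());
    file.clear();                       // clear a previous EOF/fail state
    file.seekp(offset, std::ios::beg);
    file.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    file.flush();                       // push the stream buffer to the OS
    return static_cast<bool>(file);
}

// Reads up to `count` bytes starting at `offset`; the result is shorter
// if the end of the file is reached first.
std::string read_at(std::fstream& file, std::streamoff offset,
                    std::size_t count) {
    assert(file.is_open());
    file.clear();
    file.seekg(offset, std::ios::beg);
    std::string out(count, '\0');
    file.read(&out[0], static_cast<std::streamsize>(count));
    out.resize(static_cast<std::size_t>(file.gcount()));
    return out;
}
```

The stream would be opened once with std::ios::in | std::ios::out | std::ios::binary and closed when the application exits.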
The internal workings of the standard C++ library vary by implementation. Hence a test would be needed to get some real data on your preferred platform. Generally, memory-mapped files are considered to be the fastest way to access data stored in a file (as Uflex has mentioned in his comment), but they have some drawbacks as well (see the linked wiki page). You can either use the standard (POSIX) C functions mmap() and munmap(), or the Boost C++ libraries, which also have a portable C++ interface for memory-mapped files.
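A minimal POSIX sketch of the mmap()/munmap() route: the whole file is mapped read/write and edited in place as if it were an array. The function replace_byte_mapped is only an illustration, and it assumes the file fits comfortably in the address space.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps the whole file read/write, replaces every `from` byte with `to`,
// and unmaps it. Returns the number of bytes changed, or -1 on error.
long replace_byte_mapped(const char* path, char from, char to) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    std::size_t len = static_cast<std::size_t>(st.st_size);
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return -1;
    char* bytes = static_cast<char*>(p);
    long changed = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (bytes[i] == from) {
            bytes[i] = to;
            ++changed;
        }
    }
    msync(p, len, MS_SYNC);  // flush modified pages back to the file
    munmap(p, len);
    return changed;
}
```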

How to create a virtual file?

I'd like to simulate a file without writing it to disk. I have a file at the end of my executable and I would like to give its path to a dll. Of course, since it doesn't have a real path, I have to fake it.
I first tried using named pipes under Windows to do it. That would allow for a path like \\.\pipe\mymemoryfile but I can't make it work, and I'm not sure the dll would support a path like this.
Second, I found CreateFileMapping and GetMappedFileName. Can they be used to simulate a file in a fragment of another? I'm not sure this is what this API does.
What I'm trying to do seems similar to BoxedApp. Any ideas about how they do it? I suppose it's something like API interception (like Detours), but that would be a lot of work. Is there another way to do it?
Why? I'm interested in this specific solution because I'd like to hide the data, for the benefit of distributing only one file, and also for the geeky reason of making it work that way ;)
I agree that copying data to a temporary file would work and be a much easier solution.
Use BoxedApp and do not worry.
You can store the data in an NTFS stream. That way you can get a real path pointing to your data that you can give to your dll in the form of
x:\myfile.exe:mystreamname
This works precisely like a normal file, however it only works if the file system used is NTFS. This is standard under Windows nowadays, but is of course not an option if you want to support older systems or would like to be able to run this from a usb-stick or similar. Note that any streams present in a file will be lost if the file is sent as an attachment in mail or simply copied from a NTFS partition to a FAT32 partition.
I'd say that the most compatible way would be to write your data to an actual file, but you can of course do it one way on NTFS systems and another on FAT systems. I do recommend against it because of the added complexity. The appropriate way would be to distribute your files separately of course, but since you've indicated that you don't want this, you should in that case write it to a temporary file and give the dll the path to that file. Make sure you write the temporary file to the users' temp directory (you can find the path using GetTempPath in C/C++).
Your other option would be to write a filesystem filter driver, but that is a road that I strongly advise against. That sort of defeats the purpose of using a single file as well...
Also, in case you want only a single file for distribution, how about using a zip file or an installer?
Pipes are for communication between processes running concurrently. They don't store data for later access, and they don't have the same semantics as files (you can't seek or rewind a pipe, for instance).
If you're after file-like behaviour, your best bet will always be to use a file. Under Windows, you can pass FILE_ATTRIBUTE_TEMPORARY to CreateFile as a hint to the system to avoid flushing data to disk if there's sufficient memory.
If you're worried about the performance hit of writing to disk, the above should be sufficient to avoid the performance impact in most cases. (If the system is low enough on memory to force the file data out to disk, it's probably also swapping heavily anyway -- you've already got a performance problem.)
If you're trying to avoid writing to disk for some other reason, can you explain why? In general, it's quite hard to stop data from ever hitting the disk -- the user can always hibernate the machine, for instance.
Since you don't have control over the DLL you have to assume that the DLL expects an actual file. It probably at some point makes that assumption which is why named pipes are failing on you.
The simplest solution is to create a temporary file in the temp directory, write the data from your EXE to the temp file and then delete the temporary file.
Is there a reason you are embedding this "pseudo-file" at the end of your EXE instead of just distributing it with your application? You are obviously already distributing this third-party DLL with your application, so one more file doesn't seem like it is going to hurt you.
Another question: will this data be changing? That is, are you expecting to write data back to this "pseudo-file" in your EXE? I don't think that will work well. Standard users may not have write access to the EXE, and that would probably drive anti-virus nuts.
And no CreateFileMapping and GetMappedFileName definitely won't work since they don't give you a file name that can be passed to CreateFile. If you could somehow get this DLL to accept a HANDLE then that would work.
And I wouldn't even bother with API interception. Just hand the DLL a path to an actual file.
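The temporary-file approach can be sketched with nothing but the C standard library. Here the target directory is passed in by the caller (on Windows it would typically come from GetTempPath); the function name extract_to_temp and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Writes an embedded payload to `<dir>/<name>` and returns the full path,
// or an empty string on failure. The caller passes the path to the DLL
// and removes the file (std::remove) once the DLL is done with it.
std::string extract_to_temp(const std::string& dir, const std::string& name,
                            const unsigned char* data, std::size_t size) {
    assert(data != nullptr);
    std::string path = dir + "/" + name;
    std::FILE* f = std::fopen(path.c_str(), "wb");
    if (!f) return std::string();
    std::size_t written = std::fwrite(data, 1, size, f);
    bool ok = (std::fclose(f) == 0) && (written == size);
    if (!ok) {
        std::remove(path.c_str());  // do not leave a partial file behind
        return std::string();
    }
    return path;
}
```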
Reading your question made me think: if you can pretend an area of memory is a file and have kind of "virtual path" to it, then this would allow loading a DLL directly from memory which is what LoadLibrary forbids by design by asking for a path name. And this is why people write their own PE loader when they want to achieve that.
I would say you can't achieve what you want with file mapping: the purpose of file mapping is to treat a portion of a file as if it was physical memory, and you're wanting the reciprocal.
Using Detours implies that you would have to replicate everything the intercepted DLL function does except for obtaining data from a real file; hence it's not generic. Or, even more intricate, let's pretend the DLL uses fopen; then you provide your own fopen that detects a special pattern in the path and you mimic the C runtime internals... Hmm, is it really worth all the pain? :D
Please explain why you can't extract the data from your EXE and write it to a temporary file. Many applications do this -- it's the classic solution to this problem.
If you really must provide a "virtual file", the cleanest solution is probably a filesystem filter driver. "clean" doesn't mean "good" -- a filter is a fully documented and supported solution, so it's cleaner than API hooking, injection, etc. However, filesystem filters are not easy.
OSR Online is the best place to find Windows filesystem information. The NTFSD mailing list is where filesystem developers hang out.
How about using some sort of RAM disk and writing the file to this disk? I have tried some RAM disks myself, though I never found a good one; tell me if you are successful.
Well, if you need to have the virtual file allocated in your exe, you will need to create a vector, stream or char array big enough to hold all of the virtual data you want to write.
That is the only solution I can think of without doing any I/O to disk (even if you don't write to file).
If you need to keep a file like path syntax, just write a class that mimics that behaviour and instead of writing to a file write to your memory buffer. It's as simple as it gets. Remember KISS.
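Something along these lines, as a rough sketch. The MemoryFile class and its write/read/seek/tell interface are invented for illustration and only help code you control; a third-party dll that expects a real path cannot use it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// A tiny file-like object backed by a growable memory buffer.
class MemoryFile {
public:
    // Writes `n` bytes at the current position, growing the buffer if needed.
    std::size_t write(const void* src, std::size_t n) {
        assert(src != nullptr);
        if (pos_ + n > data_.size()) data_.resize(pos_ + n);
        if (n > 0) std::memcpy(data_.data() + pos_, src, n);
        pos_ += n;
        return n;
    }
    // Reads up to `n` bytes; returns how many were actually available.
    std::size_t read(void* dst, std::size_t n) {
        assert(dst != nullptr);
        std::size_t got = std::min(n, data_.size() - pos_);
        if (got > 0) std::memcpy(dst, data_.data() + pos_, got);
        pos_ += got;
        return got;
    }
    // Moves the position, clamped to the current size.
    void seek(std::size_t offset) { pos_ = std::min(offset, data_.size()); }
    std::size_t tell() const { return pos_; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<char> data_;
    std::size_t pos_ = 0;
};
```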
Cheers
Open the file called "NUL:" for writing. It's writable, but the data are silently discarded. Kinda like /dev/null of *nix fame.
You cannot memory-map it though. Memory-mapping implies read/write access, and NUL is write-only.
I'm guessing that this dll can't take a stream? It's almost too simple to ask, BUT if it can, you could just use that.
Have you tried using the \\?\ prefix when using named pipes? Many APIs support using \\?\ to pass the remainder of the path directly through without any parsing/modification.
http://msdn.microsoft.com/en-us/library/aa365247(VS.85,lightweight).aspx
Why not just add it as a resource - http://msdn.microsoft.com/en-us/library/7k989cfy(VS.80).aspx - the same way you would add an icon.

How to decompress a file in fortran77?

I have a compressed file.
Let's ignore the tar command because I'm not sure it is compressed with that.
All I know is that it is compressed in fortran77 and that is what I should use to decompress it.
How can I do it?
Is decompression a one way road or do I need a certain header file that will lead (direct) the decompression?
It's not a .Z file. It ends at something else.
What do I need to decompress it? I know the format of the final decompressed archive.
Is it possible that the file is compressed thru a simple way but it appears with a different extension?
First, let's get the "fortran" part out of the equation. There is no standard (and by that, I mean the fortran standard) way to either compress or decompress files, since fortran doesn't have a compression utility as part of the language. Maybe someone has written one of their own, but that's entirely up to them.
So, you're stuck with publicly available compression utilities, and such. On systems which have those available, and on compilers which support it (it varies), you can use the SYSTEM function, which executes the system command by passing a command string to the operating system's command interpreter (I know it exists in cvf, probably ivf ... you should probably look it up in help of your compiler).
Since you asked a similar question already, I assume you're still having a problem with this. You mentioned that "it was compressed with fortran77". What do you mean by that? That someone built a compression utility in f77 and used it? So that would make it a custom solution?
If it's some kind of a custom solution, then it can practically be anything, since a lot of algorithms can serve as "compression algorithms" (writing file as binary compared to plain text will save a few bytes; voila, "compression")
Or have I misunderstood something ? Please, elaborate this a little.
My guess is that you have a binary file, which is output by a Fortran program. These can look like compressed files because they are not readable in a text editor.
Fortran allows you to write the in-memory data out to a file without formatting it, so that you can reload it later without having to parse it. The problem, however, is that you need that original source code in order to see what types of variables are written in the file.
If you have no access to the fortran source code, but a lot of time to spare, you could write some simple fortran program and guess what types of variables are being used. I wouldn't advise it, though, as Fortran is not very forgiving.
If you want some simple source code to try, look at this page which details binary read and write in Fortran, and includes a code sample. Just start by replacing reclength=reclength*4 with reclength=reclength*2 for a double precision real.
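If the file turns out to be Fortran sequential unformatted output, guessing at its layout from another language might start like the sketch below. It rests on several assumptions that may not hold for your file: each record is framed by 4-byte length markers before and after the payload (the gfortran default; other compilers and direct-access files differ), the byte order matches this machine, and the payload consists of 8-byte doubles. The function name is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Decodes the first record from the raw bytes of a file, ASSUMING
// 4-byte length markers around the payload and native byte order.
// Returns an empty vector if the bytes do not match that layout.
std::vector<double> read_first_record(const std::vector<unsigned char>& bytes) {
    std::uint32_t head = 0, tail = 0;
    if (bytes.size() < 2 * sizeof head) return std::vector<double>();
    std::memcpy(&head, bytes.data(), sizeof head);
    if (head % sizeof(double) != 0) return std::vector<double>();
    if (bytes.size() - 2 * sizeof head < head) return std::vector<double>();
    std::memcpy(&tail, bytes.data() + sizeof head + head, sizeof tail);
    if (tail != head) return std::vector<double>();  // markers must agree
    std::vector<double> values(head / sizeof(double));
    assert(values.size() * sizeof(double) == head);
    if (!values.empty()) {
        std::memcpy(values.data(), bytes.data() + sizeof head, head);
    }
    return values;
}
```

If the leading and trailing markers do not match, the file is probably not in this layout and a different guess is needed.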
There is no standard decompression method, there are tons. You will need to know the method used to compress it in order to decompress it.
You said that the file extension was not .Z, but something else. What was that something else?
If it's .gz (which is very common on Unix systems), "gunzip" is the proper command. If it's .tgz, you can gunzip and untar it. (Or you can read the man page for tar(1), since it probably has the ability to gunzip and extract together.)
If it's on Windows, see if Windows can read it directly, as the file system itself appears to support the ZIP format.
If something else, please just list the file name (or, if there are security implications, the file name beginning with the first period), and we might be able to figure it out.
You can check to see if it's a known compressed file type with the file command. If file reports only something generic (such as "data"), then you're almost certainly looking at plain binary data.
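The same kind of check can be done by hand on the first few bytes of the file. This is only a sketch covering a handful of well-known signatures (gzip, compress, zip, bzip2); the file command knows far more, and the function names here are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if `head` starts with the `n` signature bytes in `sig`.
static bool has_prefix(const std::vector<unsigned char>& head,
                       const unsigned char* sig, std::size_t n) {
    assert(sig != nullptr);
    if (head.size() < n) return false;
    for (std::size_t i = 0; i < n; ++i) {
        if (head[i] != sig[i]) return false;
    }
    return true;
}

// Guesses a compression format from the leading bytes of a file.
std::string guess_format(const std::vector<unsigned char>& head) {
    static const unsigned char gzip[] = {0x1F, 0x8B};
    static const unsigned char compress_z[] = {0x1F, 0x9D};
    static const unsigned char zip[] = {'P', 'K', 0x03, 0x04};
    static const unsigned char bzip2[] = {'B', 'Z', 'h'};
    if (has_prefix(head, gzip, sizeof gzip)) return "gzip";
    if (has_prefix(head, compress_z, sizeof compress_z)) return "compress (.Z)";
    if (has_prefix(head, zip, sizeof zip)) return "zip";
    if (has_prefix(head, bzip2, sizeof bzip2)) return "bzip2";
    return "unknown";
}
```

An "unknown" result does not prove the file is uncompressed; it only means none of these signatures matched.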