convert image buffer to filestream - c++

Something similar to this may have been asked before, but I could not find an exact answer to my problem, so I decided to ask here.
I am working with a 3rd party framework that has its own classes defined to handle image files. It only accepts a file name, and the whole implementation is built around opening these filestreams and performing reads/writes.
I'd like to take an image buffer (which I obtain through some pre-processing on an image opened earlier) and feed it to this framework. The problem is that I cannot feed it a buffer, only a filename string.
I am looking for the best way to convert my buffer to a filestream so that it is seekable and can be ingested by the framework. Please help me figure out what I should be looking at.
I tried reading about streambuf (filebuf and stringbuf) and tried assigning the buffer to these types, but no success so far.

If the framework only takes a file name, then you have to pass it a file name. Which means the data must reside in the file system.
The portable answer is "write your data to a temporary file and pass the name of that".
On Unix, you might be able to use a named pipe and spawn another thread to feed the data through the pipe...
But honestly, you are probably better off just using a temporary file. If you manage to open, read, and delete the file quickly enough, it most likely will never make it out to disk anyway, since the kernel will cache the data.
And if you are able to use a ramdisk (tmpfs), you can guarantee that everything happens in memory.
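A minimal sketch of the temporary-file approach, assuming C++17 (for std::filesystem). The helper name write_buffer_to_temp_file and the idea that the caller supplies a unique file name are assumptions made for illustration; they are not part of any framework.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: dumps a memory buffer to a file in the system temp
// directory and returns the path, so it can be handed to a filename-only API.
// The caller picks a unique name; this sketch does not generate one.
std::filesystem::path write_buffer_to_temp_file(const std::vector<unsigned char>& buffer,
                                                const std::string& file_name) {
    const std::filesystem::path path = std::filesystem::temp_directory_path() / file_name;

    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open temp file: " + path.string());

    out.write(reinterpret_cast<const char*>(buffer.data()),
              static_cast<std::streamsize>(buffer.size()));
    out.close();  // flush and close before anything else opens the file
    if (!out) throw std::runtime_error("cannot write temp file: " + path.string());

    return path;
}
```

The returned path (path.string()) would then be passed to the framework, and std::filesystem::remove(path) called once the framework is done with it.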
[edit]
One more thought. If you can modify your code base to operate on std::iostream instead of std::fstream, you can pass it a std::stringstream. They support all of the usual iostream operations on a memory buffer, including things like seeking.
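To illustrate the std::stringstream idea, here is a small sketch. read_at is a hypothetical stand-in for framework code that has been changed to take a std::istream&, and make_stream is an assumed helper name; note that wrapping the buffer this way copies it once.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for framework code that accepts any seekable
// std::istream instead of opening a file by name.
std::string read_at(std::istream& in, std::streamoff offset, std::size_t count) {
    in.clear();                       // drop any eof/fail state from earlier reads
    in.seekg(offset, std::ios::beg);  // seeking works on an in-memory stream too
    std::string out(count, '\0');
    in.read(&out[0], static_cast<std::streamsize>(count));
    out.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was read
    return out;
}

// Wraps a raw buffer in a stringstream (this copies the bytes once).
std::stringstream make_stream(const char* data, std::size_t size) {
    return std::stringstream(std::string(data, size),
                             std::ios::in | std::ios::out | std::ios::binary);
}
```

The same read_at would work unchanged with a std::ifstream, which is the point of programming against the iostream base classes.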

Related

Directly use c++ stream buffer to e.g. decompress

My use case is reading a compressed file from disk, de-compressing it incrementally and using the resulting data.
Currently I am reading file contents into a temporary buffer allocated by me, point the decompression API at this buffer, and so on. The question is about understanding whether the temporary buffer really is necessary or helpful in this case.
In an experiment, I opened a stream to a text file and called get() once. In the debugger I can see that the filebuffer in the stream already contains the following characters in the text stream, as expected. (On msvc I found it under std::ifstream::_Filebuffer::_IGfirst)
I am looking for a portable way to access this buffer I see in the debugger, feed the decompression API with it, and then continue reading the file.
I don't understand why I should copy the filebuffer to my buffer (with e.g. read()) in this particular case where I will promptly consume the buffer contents and move on. I'm not questioning the merits of buffered I/O in general.
EDIT:
I did further experiments, and it seems that the internal stream buffer doesn't get used if the target of read() is itself sufficiently large. Apparently the case I was worried about doesn't really come up.

Modifying and reading a big .txt file with MPI/C++?

I am using MPI together with C++. I want to read information from a file, modify it according to some rule, and then write the modified content back to the same file. I store the modified content in a temporary file and, at the end, overwrite the original with these commands:
temp_file.open("temporary.txt",ios::in);
ofstream output_file(output_name,ios::out);
output_file<<temp_file.rdbuf();
output_file.flush();
temp_file.close();
output_file.close();
remove("temporary.txt");
The function that modifies the file is executed by the MPI process with rank 0. After the function returns, MPI_Barrier(MPI_COMM_WORLD); is called to ensure synchronization.
Then all MPI processes should read the modified file and perform some computations. The problem is that, because the file is so big, the data is not completely written to the file when the function finishes, and I get wrong results. I also tried adding a sleep() call, but sometimes it works and sometimes it doesn't (it depends on the node where I perform the computations). Is there a general way to solve this problem?
I put MPI as a tag, but I think this problem is inherently connected with the C++ standard and how it handles storage. How do I deal with this latency between writing to the buffer and writing to the file on the storage medium?
Fun topic. You are dealing with two or maybe three consistency semantics here.
POSIX consistency says essentially when a byte is written to a file, it's visible.
NFS consistency says "woah, that's way too hard. You write to this file and I'll make it visible whenever I feel like it."
MPI-IO consistency semantics (which you aren't using, but are good to know) say that data is visible after specific synchronization events occur. Those two events are "close a file and reopen it" or "sync file, barrier, sync file again".
If you are using NFS, give up now. NFS is horrible. There are a lot of good parallel file systems you can use, several of which you can set up entirely in userspace (such as PVFS).
If you use MPI-IO here, you'll get more well-defined behavior, but the MPI-IO routines are more like C system calls than C++ iostream operators, so think more like open(2), read(2), write(2) and close(2). Text files are usually a headache to deal with, but in your case, where modifications are appended to the file, that shouldn't be too bad.

Can I create a Handle without a file?

I want to create a dump on Windows with the function MiniDumpWriteDump. The problem is that the function takes a handle to a file to write the result to. I want the data in memory so that I can send it over the internet. Therefore, I was wondering if there is a way to create a handle without a file backing it, so that I can just get a pointer to the data?
You can use memory mapped files. See here: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366537(v=vs.85).aspx
You need to pass hFile = INVALID_HANDLE_VALUE and specify maximal size of file. Please, check msdn for the details.
There are a couple of possibilities.
One would be to use CreateFile, but pass FILE_ATTRIBUTE_TEMPORARY. This will create a file, but tells Windows to attempt to keep as much of the file in the cache as possible. While this doesn't completely avoid creating a file, if you have enough memory it can often eliminate any (or much, anyway) I/O to/from the disk from happening.
Another possibility (though one I've never tested) would be to pass a handle to a named (or maybe even an anonymous) pipe. You can generally write to a pipe like you would a file, so as long as the crash dump writer just needs to be able to pass the handle to WriteFile, chances are pretty good this will work fine. From there, you could (for example) have another small program that would read the data from the pipe and write it to a socket. Obviously it would be nice to be able to avoid the extra processing to translate from pipe to socket, but such is life sometimes.
If you haven't tried it, you might want to test with just passing a socket handle to the crash dump writer. Although it's somewhat limited, Windows does support treating a socket handle like a normal file (or whatever) handle. There's certainly nothing close to a guarantee that it'll work, but it may be worth a shot anyway.
The crash dump is indeed the process's memory, so this doesn't make sense. Why don't you simply send the file and delete it after a successful send?
By the way, you can compress the file and send it, because crashdumps are usually big files.
The documentation says to pass a file handle, so if you do anything else you're breaking the contract and (if it works at all) the behaviour will not be reliable.
Pass a named pipe handle. Pipe the data back to yourself.

Read with File Mapping Objects in C++

I am trying to use Memory Mapped File (MMF) to read my .csv data file (very large and time consuming).
I've heard that MMF is very fast since it caches the content of the file, so users can access content on disk as if it were in memory.
May I know if MMF is any faster than using other reading methods?
If this is true, can anyone show me a simple example how to read a file from disk?
Many thanks in advance.
May I know if MMF is any faster than using other reading methods?
If you're reading the entire file sequentially in one pass, then a memory-mapped file is probably approximately the same as using conventional file I/O.
can anyone show me a simple example how to read a file from disk?
Memory mapped files are typically an operating system feature, so you'd have to tell us which platform you're on to get an example of using it.
If you want to read a file sequentially, you can use the C++ ifstream class or the C run-time functions like fopen, fread, and fclose.
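A minimal sketch of the sequential ifstream approach for a .csv file. The function names are assumptions, and the splitting is deliberately naive: it assumes no quoted fields or embedded commas.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Splits one CSV line on commas. Assumes no quoted fields or embedded commas.
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) fields.push_back(field);
    return fields;
}

// Reads the file sequentially, one line at a time.
std::vector<std::vector<std::string>> read_csv(const std::string& path) {
    std::ifstream in(path);
    if (!in) throw std::runtime_error("cannot open " + path);

    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) rows.push_back(split_csv_line(line));
    return rows;
}
```

For a very large file, it would probably be better to process each row as it is read rather than keep all rows in memory.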
Whether it's faster or not depends on many different factors (such as what data you are accessing, how you are accessing it, etc.). To determine what is right for YOUR case, you need to benchmark different solutions and see what works best.
The main benefit of memory mapped files is that the data can be copied directly from the filesystem into the user-accessible memory.
In traditional (fstream::read(), fread(), etc) types of file-reading, the content of the file is read into a temporary buffer in the OS, then (part of) that buffer is copied to the user supplied buffer. This is because the OS can't rely on the memory being there and it gets pretty messy pretty quickly. For memory mapped files, the OS knows directly where the memory is for the different sections (because it's the OS's task to assign that memory and keep track of where it is!) of the file, so the OS can just copy it straight in.
However, I strongly suspect that the method of reading the file is a minor part, and the actual interpretation/parsing/copying out of the file may well be a large part. [Speculation, we haven't seen your code, of course]. And of course, the I/O speed available from the DISK itself may play a large factor if the file is very large.

Non-blocking call to ofstream::open?

I have a C++ program which opens files in /tmp (on a *nix system) and reads their contents.
To do this, I am using:
ofstream dest;
dest.open(abs_path.c_str(), ios::app);
where abs_path is a string containing the absolute path to the file.
The problem is that some *nix programs create named pipes as files in /tmp. For example,
/tmp/vgdb-pipe-to-vgdb-from-23732-by-myusername-on-???
Is a pipe created by a debugging utility I am using.
The documentation for ofstream's open method says that it sets an error bit when opening the file fails. However, in my tests it instead hangs indefinitely trying to open the file (which is actually a pipe). I assume this is because the file is locked by another program (probably the debugger).
So, how can I force ofstream::open to block for a finite amount of time, or not at all? It's easy enough to clean up gracefully if it fails, but it needs to actually fail first..
The simple answer is that you can't. filebuf::open (called by ofstream) basically delegates to the OS, and supposes that the OS will do the right thing. And the interface it supports is very, very limited; many important options to open (O_SYNC, O_NONBLOCK, etc.) aren't mapped, and thus can't be used. The only solutions I've found to this are either to use std::ostringstream, then write the string to the file using system level calls, or to write my own streambuf, which does what I want (much simpler than it sounds, since you typically only need part of what filebuf offers: you often don't need bidirectionality, seeking or code translation).
Neither of these solutions is portable, of course.
Finally, I'm not sure why you're writing into /tmp. By convention, anything you put into /tmp should contain the process id. And for security reasons, I'd always create a subdirectory, with the process id in its name, and with very limited access rights, and create any temporary files in it.
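To give an idea of what "write my own streambuf" can look like, here is a minimal output-only sketch over a POSIX file descriptor. The class name FdOutBuf and the 4096-byte buffer size are assumptions. It does no seeking, input or code conversion, and it does not own or close the descriptor.

```cpp
#include <cerrno>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <unistd.h>

// Minimal output-only streambuf over a POSIX file descriptor.
class FdOutBuf : public std::streambuf {
public:
    explicit FdOutBuf(int fd) : fd_(fd) { setp(buf_, buf_ + sizeof buf_); }
    ~FdOutBuf() override { flush_buffer(); }  // best effort, errors ignored

protected:
    // Called when the put area is full: flush it, then store `ch`.
    int_type overflow(int_type ch) override {
        if (flush_buffer() != 0) return traits_type::eof();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called by std::ostream::flush().
    int sync() override { return flush_buffer(); }

private:
    // Writes the pending bytes with write(2); returns 0 on success, -1 on error.
    int flush_buffer() {
        const char* p = pbase();
        while (p < pptr()) {
            const ssize_t n = ::write(fd_, p, static_cast<std::size_t>(pptr() - p));
            if (n < 0) {
                if (errno == EINTR) continue;  // interrupted, retry
                return -1;                     // includes EAGAIN on a non-blocking fd
            }
            p += n;
        }
        setp(buf_, buf_ + sizeof buf_);  // reset the put area
        return 0;
    }

    int fd_;
    char buf_[4096];
};
```

The descriptor would come from a system-level open() with whatever flags are needed (O_NONBLOCK, for instance), and the buffer is then attached with std::ostream out(&buf);. Note that this sketch treats EAGAIN from a non-blocking descriptor as a plain write error.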
AFAIK, there is no such thing as non-blocking input defined by the C++ language. (There is a method std::streambuf::in_avail(), but still it can't help you)
You can consider using the POSIX open() call
int file_descr = open( "pipe_addr", O_RDONLY |O_NONBLOCK);
instead of std::ofstream.