stxxl save and read vector from disk file


I'm struggling to use the stxxl library so that I can not only store the data from its vector structure in a file, but also recover it from that file when I rerun my program.
I found out that you can construct a vector from a file ( http://stxxl.sourceforge.net/tags/master/classstxxl_1_1vector.html#a4d9029657cc11315cb77955fae70b877 ) but the class "file" only contains these functions ( http://stxxl.sourceforge.net/tags/master/classstxxl_1_1file.html ) with no way (that I can see) to actually access an existing file with some given path.
Does someone who worked with this library before have an idea how to do that?
Thanks in advance

stxxl::file is an interface base class. Depending on your operating system, you want one of the derived classes
stxxl::syscall_file for UNIX, Linux, and Mac OS X using POSIX read and write,
stxxl::wincall_file for Windows, or
stxxl::linuxaio_file for Linux using the SYS_io_* asynchronous I/O syscalls (see man 7 aio for details). This requires STXXL 1.4.1.
You can use the stxxl::create_file function to decide at runtime which backend to use. Set the io_impl parameter to "syscall", "wincall", or "linuxaio", respectively.

Related

C/C++: is there a way to create a nonzero-length file atomically?

I have a thread which watches a directory for file additions (using inotify if it exists, polling otherwise), and notifies a listener upon new files created in the watched directory. The listener has conditional logic based on the size of the created file, which it determines using int stat(const char *pathname, struct stat *statbuf).
In a separate thread, I create a nonzero-length file using std::ofstream; a simplified example of the file creation is:
std::ofstream ofs( "/path/to/file", std::ofstream::out );
ofs << "abc";
ofs.close();
Runtime behavior is that the listener, invoking stat(), sometimes sees the file as 0-length.
This is perfectly reasonable, since the file creation and content-addition are separate actions.
Question: Is there a way to atomically create a nonzero-length file using either C functions or C++03's stl?
Note: For the purpose of this question, I'm not interested in synchronization primitives, like mutexes or semaphores, to synchronize the two threads around the entire process of file-open, add content, close-file.

The base C language has no such concept, and I don't think C++ does either. If you're talking about this kind of thing, you must be assuming POSIX or some other operating-system-level behavior specification.
Under POSIX, the way to do this kind of operation is to create the file with a temporary name, then rename it only after you finish writing it. You can do that in a different directory if they're both on the same device; if they're on different devices, whether that works is implementation-defined. The most portable way is to do it in the same directory, which means that your inotify (Linux-specific, BTW) listener should ignore files not matching the naming pattern it's looking for or ignore files in a particular temp namespace you choose as your convention.

Is there a way to atomically create a nonzero-length file using either C functions or C++03's stl?
Best approach would be to create the file elsewhere on the same filesystem, and then std::rename the file into the target file.
The standard doesn't really give explicit guarantees beyond the postconditions (the file exists under either the new name or the old name). It says nothing about observable intermediate states, so in practice you're at the mercy of the file system. But if there is some standard operation that achieves what you want, then this is it. The POSIX standard does require rename to be atomic.
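A minimal sketch of the write-then-rename approach described above, restricted to C++03 library calls. The function name write_file_atomically and the ".tmp" suffix are assumptions made for this illustration, not part of any standard. The temporary file is created next to the target so that both are on the same filesystem.

```cpp
#include <cstdio>    // std::rename, std::remove
#include <fstream>
#include <string>

// Writes `content` to a temporary sibling of `target`, then renames it into
// place. Returns true on success. The ".tmp" suffix is an arbitrary
// convention chosen for this sketch.
bool write_file_atomically(const std::string& target, const std::string& content)
{
    const std::string tmp = target + ".tmp";
    {
        std::ofstream ofs(tmp.c_str(),
                          std::ios::out | std::ios::binary | std::ios::trunc);
        if (!ofs) return false;
        ofs << content;
        ofs.close();                       // flush everything to the temp file
        if (!ofs) {                        // write or close failed
            std::remove(tmp.c_str());
            return false;
        }
    }
    // On POSIX, rename() atomically replaces `target`; a watcher sees either
    // no file (or the old one) or the complete new file.
    if (std::rename(tmp.c_str(), target.c_str()) != 0) {
        std::remove(tmp.c_str());
        return false;
    }
    return true;
}
```

The directory watcher will still be notified about the temporary file, so the listener would have to ignore names ending in ".tmp" (or whatever convention you choose) and react only to the final name.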

fstream delete N bytes from the end of a binary file

Is it possible to delete N bytes from the end of a binary file in C++ using fstream (or something similar)? I don't want to read the whole file, cut it and write it again, but since it's from the end of a file it seems like it shouldn't be such a problem.

I'm not aware of a generic C++ (platform independent) way to do this without writing a new file. However, on POSIX systems (Linux, etc.) you can use the ftruncate() function. On Windows, you can use SetEndOfFile().
This also means you'll need to open the file using the native functions instead of fstream since you need the native descriptor/handle for those functions.
EDIT: If you are able to use the Boost library, it has a resize_file() function in its Filesystem library which would do what you want.
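A minimal sketch of the ftruncate() route, assuming a POSIX system. The helper name chop_tail is made up for this example; it opens the file with the native open() call because ftruncate() needs a file descriptor.

```cpp
#include <fcntl.h>      // open
#include <sys/stat.h>   // fstat, struct stat
#include <sys/types.h>  // off_t
#include <unistd.h>     // ftruncate, close

// Removes the last `n` bytes of the file at `path` without rewriting it.
// Returns true on success, false if the file cannot be opened, `n` is
// negative or larger than the file, or the truncation fails.
bool chop_tail(const char* path, off_t n)
{
    int fd = open(path, O_WRONLY);
    if (fd == -1) return false;

    struct stat st;
    bool ok = fstat(fd, &st) == 0
              && n >= 0 && n <= st.st_size
              && ftruncate(fd, st.st_size - n) == 0;
    close(fd);
    return ok;
}
```

For example, chop_tail("data.bin", 16) would drop the final 16 bytes of a (hypothetical) data.bin.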

Update:
Since C++17 you can use std::filesystem::resize_file from the <filesystem> header.

If you want to use Qt, QFile also provides two resize() methods that can truncate a file.

Redirect FILE handle to char-buffer

I'm using a third-party library that allows conversion between two file formats A and B. I would like to use this library to load a file of format A and convert it to format B, but I only need the converted representation in memory. So I would like to do the conversion without actually saving a file of the target format to disk, and instead obtain an unsigned char* buffer or something similar. Unfortunately, the library's only conversion function is of the form
void saveAsB(A& a, std::FILE *const file);
What can I do? Is there any way to redirect the write operations performed on the handle to some buffer?

If your platform supports it, use open_memstream(3). This will be available on Linux and BSD systems, and it's probably better than fmemopen() for your use case because open_memstream() allocates the output buffer dynamically rather than you having to know the maximum size in advance.
If your platform doesn't have those functions, you can always use a "RAM disk" approach, which again on Linux would be writing a "file" to /dev/shm/ which will never actually reach any disk, but rather be stored in memory.
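A minimal sketch of the open_memstream() approach, assuming a platform that provides it (Linux, the BSDs, or another POSIX.1-2008 system). write_to_stream is a hypothetical stand-in for the library's saveAsB(), and capture_output is a name invented for this example.

```cpp
#include <stdio.h>    // open_memstream, fputs, fclose
#include <stdlib.h>   // free
#include <string>

// Hypothetical stand-in for a third-party function that only accepts a FILE*.
void write_to_stream(FILE* file)
{
    fputs("converted data", file);
}

// Runs `writer` against an in-memory stream and returns everything it wrote.
// Returns an empty string if the stream cannot be created.
std::string capture_output(void (*writer)(FILE*))
{
    char* buffer = NULL;
    size_t size = 0;
    FILE* mem = open_memstream(&buffer, &size);
    if (!mem) return std::string();

    writer(mem);
    fclose(mem);                        // buffer and size are final after close

    std::string result(buffer, size);   // keeps embedded zero bytes intact
    free(buffer);                       // the caller owns the malloc'd buffer
    return result;
}
```

With the real library you would pass a small wrapper that calls saveAsB(a, file) instead of write_to_stream.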
Edit: OK, so you say you're using Windows. Here's an outline of what you can try:
Open a non-persisted memory-mapped file.
Use _open_osfhandle to convert the HANDLE to an int file descriptor.
Use _fdopen to convert the int file descriptor to FILE*.
Cross your fingers. I haven't tested any of this.
I found this reference useful in putting the pieces together: http://www.codeproject.com/Articles/1044/A-Handy-Guide-To-Handling-Handles
Edit 2: It looks like CreateFileMapping() and _open_osfhandle() may be incompatible with each other--you would be at least the third person to try it:
https://groups.google.com/forum/#!topic/comp.os.ms-windows.programmer.win32/NTGL3h7L1LY
http://www.progtown.com/topic178214-createfilemapping-and-file.html
So, you can try what the last link suggested, which is to use setvbuf() to "trick" the data into flowing to a buffer you control, but even that has potential problems, e.g. it won't work if the library seeks within the FILE*.
So, perhaps you can just write to a file on some temporary/scratch filesystem and be done with it? Or use a platform other than Windows? Or use some "RAM disk" software.

If you can rely on POSIX being available, then use fmemopen().
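A sketch of the fmemopen() variant, assuming POSIX.1-2008 and that you can pick an upper bound for the output size in advance. The function name write_into_buffer and the sample payload are assumptions for illustration; in real use the fputs() call would be replaced by the library's FILE*-based writer.

```cpp
#include <stdio.h>    // fmemopen, fputs, fclose
#include <string.h>

// Opens `buf` as a write-only stream and writes a sample payload into it.
// Returns true if both the write and the final flush on close succeeded.
bool write_into_buffer(char* buf, size_t capacity)
{
    FILE* mem = fmemopen(buf, capacity, "w");
    if (!mem) return false;

    bool ok = fputs("format B bytes", mem) >= 0;
    // Data may sit in the stdio buffer until close, so a too-small `buf`
    // can show up as a failure here rather than at fputs().
    ok = (fclose(mem) == 0) && ok;
    return ok;
}
```

Unlike open_memstream(), the buffer does not grow, so output that exceeds the capacity is lost or reported as an error.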

Is HANDLE similar to file descriptor in Linux?

Is a HANDLE similar to a file descriptor in Linux? As far as I know, a HANDLE is used for handling every kind of resource on Windows, such as fonts, icons, files, devices, and so on, and in essence is just a void pointer to a memory block holding the data of a specific resource.

Yes, Windows handles are very similar to Unix file descriptors (FDs).
Note that a HANDLE is not a pointer to a block of memory. Although HANDLE is typedef'd as void *, that's just to make it more opaque. In practice, a HANDLE is an index that is looked up in a table, just as an FD number is.
This blog post explores some of the similarities and differences:
http://lackingrhoticity.blogspot.com/2015/05/passing-fds-handles-between-processes.html

Yes, they are conceptually similar. File descriptors in unix map integers to a per-process table of pointers to other objects (which can be other things than files, too). File descriptors are not as unified though -- some things exist in a separate "namespace" (e.g., process timers). In that respect, Windows is more orthogonal -- CloseHandle will always free a resource regardless of what it is.
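To illustrate the "integer index into a per-process table" idea on the Unix side, here is a small POSIX sketch. The function name shared_offset_demo is made up for this example. It duplicates a descriptor with dup() and shows that both integers refer to the same underlying open file, because a write through one moves the offset seen through the other.

```cpp
#include <fcntl.h>      // open
#include <sys/types.h>  // off_t
#include <unistd.h>     // dup, write, lseek, close, unlink

// Writes 3 bytes through one descriptor and returns the file offset observed
// through a duplicate of it, or -1 on failure. The scratch file at `path` is
// removed afterwards.
off_t shared_offset_demo(const char* path)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd == -1) return -1;

    int copy = dup(fd);   // a second table slot referring to the same open file
    off_t pos = -1;
    if (copy != -1 && write(fd, "abc", 3) == 3)
        pos = lseek(copy, 0, SEEK_CUR);   // offset is shared between fd and copy

    if (copy != -1) close(copy);
    close(fd);
    unlink(path);
    return pos;
}
```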

Handles refer to a far broader concept on Windows, but even if we restrict the discussion to file handles, there are significant differences. The C run-time library on Windows has a function called _open_osfhandle(). Its purpose is, quote, "Associates a C run-time file descriptor with an existing operating-system file handle." That is, it is a glue function between kernel land and C run-time land. The function signature is as below:
int _open_osfhandle (
intptr_t osfhandle,
int flags
);
File handles on Windows are actually more feature-rich than file descriptors in C: many options can be configured when a file handle is created with CreateFileA (ANSI version) or CreateFileW (UTF-16 version), reflecting the design differences between *nix and Windows. The resulting handle carries all this information around, with all its implications.

A HANDLE is a void pointer
typedef PVOID HANDLE;
typedef void *PVOID;
Windows Data Types

How can I create a temporary file for writing in C++ on a Linux platform?

In C++, on Linux, how can I write a function to return a temporary filename that I can then open for writing?
The filename should be as unique as possible, so that another process using the same function won't get the same name.

Use one of the standard library "mktemp" functions: mktemp/mkstemp/mkstemps/mkdtemp.
Edit: plain mktemp can be insecure - mkstemp is preferred.

tmpnam(), or anything that gives you a name is going to be vulnerable to race conditions. Use something designed for this purpose that returns a handle, such as tmpfile():
#include <stdio.h>
FILE *tmpfile(void);

The GNU libc manual discusses the various options available and their caveats:
http://www.gnu.org/s/libc/manual/html_node/Temporary-Files.html
Long story short, only mkstemp() or tmpfile() should be used, as others have mentioned.
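A minimal sketch of the mkstemp() route on Linux. The function name open_temp_file and the "/tmp/example" prefix are assumptions for illustration. mkstemp() replaces the trailing XXXXXX in the template and opens the file in one step, which is what avoids the race between picking a name and creating the file.

```cpp
#include <stdlib.h>   // mkstemp
#include <string>
#include <unistd.h>   // close, unlink (used by callers)
#include <vector>

// Creates and opens a uniquely named temporary file.
// Returns the open file descriptor (or -1 on failure) and stores the
// generated file name in `name_out`.
int open_temp_file(std::string& name_out)
{
    const std::string pattern = "/tmp/exampleXXXXXX";   // must end in XXXXXX
    std::vector<char> buf(pattern.begin(), pattern.end());
    buf.push_back('\0');           // mkstemp() needs a writable C string

    int fd = mkstemp(&buf[0]);     // fills in the X's and opens the file
    if (fd != -1) name_out = &buf[0];
    return fd;
}
```

The caller is responsible for close()-ing the descriptor and unlink()-ing the file when it is no longer needed.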

man tmpfile
The tmpfile() function opens a unique temporary file in binary
read/write (w+b) mode. The file will be automatically deleted when it
is closed or the program terminates.
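A short sketch of tmpfile() in use. The function name round_trip_through_tmpfile is invented for this example. Note that tmpfile() gives you an open stream but no file name, so it only fits when nothing else needs to open the file by path.

```cpp
#include <cstdio>
#include <string>

// Writes `text` to an anonymous temporary file, rewinds, and reads it back.
// Returns an empty string if the temporary file cannot be created.
std::string round_trip_through_tmpfile(const std::string& text)
{
    std::FILE* f = std::tmpfile();        // opened in "w+b" mode
    if (!f) return std::string();

    std::fwrite(text.data(), 1, text.size(), f);
    std::rewind(f);

    std::string out(text.size(), '\0');
    const std::size_t n = std::fread(&out[0], 1, out.size(), f);
    out.resize(n);

    std::fclose(f);                       // the file is deleted automatically
    return out;
}
```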

mktemp should work or else get one of the plenty of available libraries to generate a UUID.

The tmpnam() function in the C standard library is designed to solve just this problem. There's also tmpfile(), which returns an open file handle (and automatically deletes it when you close it).

You should simply check if the file you're trying to write to already exists.
This is a locking problem.
Files also have owners so if you're doing it right the wrong process will not be able to write to it.
