Is HANDLE similar to a file descriptor in Linux? As far as I know, HANDLE is used for handling every kind of resource on Windows, such as fonts, icons, files, devices..., and in essence it is just a void pointer pointing to a memory block that holds the data of a specific resource.
Yes, Windows handles are very similar to Unix file descriptors (FDs).
Note that a HANDLE is not a pointer to a block of memory. Although HANDLE is typedef'd as void *, that's just to make it more opaque. In practice, a HANDLE is an index that is looked up in a table, just as an FD number is.
This blog post explores some of the similarities and differences:
http://lackingrhoticity.blogspot.com/2015/05/passing-fds-handles-between-processes.html
Yes, they are conceptually similar. File descriptors in Unix are integers that index into a per-process table of pointers to other objects (which can be things other than files, too). File descriptors are not as unified though -- some things exist in a separate "namespace" (e.g., process timers). In that respect, Windows is more orthogonal -- CloseHandle will release a handle to any kernel object regardless of what it is (although some other handle kinds, such as GDI objects and find-file search handles, have their own release functions).
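To make the "index into a per-process table" idea concrete on the Unix side, here is a minimal POSIX-only sketch. The names DescriptorPair and reuse_descriptor_slot are made up for illustration, and the example assumes a single-threaded program and a readable /dev/null. It relies on the POSIX rule that open() returns the lowest-numbered unused descriptor, so a slot that was just freed is handed out again.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// The two descriptor numbers observed by the demo (hypothetical helper type).
struct DescriptorPair {
    int first;
    int reopened;
};

// Opens /dev/null, closes it, and opens it again. POSIX requires open() to
// return the lowest-numbered unused descriptor, so in a single-threaded
// program the table slot that was just freed is reused.
DescriptorPair reuse_descriptor_slot() {
    const int first = open("/dev/null", O_RDONLY);
    assert(first >= 0);
    close(first);  // frees the slot in the per-process table

    const int reopened = open("/dev/null", O_RDONLY);
    assert(reopened >= 0);
    close(reopened);

    DescriptorPair result = {first, reopened};
    return result;
}
```

The descriptor is only a small integer naming a table slot. The object behind it lives in the kernel, which is the same relationship a Windows kernel HANDLE has to its object.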
Handles refer to a far broader concept on Windows. Even if we restrict the discussion to file handles only, there are significant differences. There is a function called _open_osfhandle() in the C run-time library on Windows. Its purpose is, to quote, "Associates a C run-time file descriptor with an existing operating-system file handle." That is, it is a glue function between kernel land and C run-time land. The function signature is as follows:
int _open_osfhandle (
intptr_t osfhandle,
int flags
);
File handles on Windows are actually more feature-rich than C run-time file descriptors: many options can be configured when a file handle is created with CreateFileA (ANSI version) or CreateFileW (UTF-16 version; CreateFile is a macro that resolves to one of the two), reflecting the design difference between *nix and Windows. The resulting handle carries all of this information around, with all its implications.
A HANDLE is a void pointer
typedef PVOID HANDLE;
typedef void *PVOID;
Windows Data Types
I have a thread which watches a directory for file additions (using inotify if it exists, polling otherwise), and notifies a listener upon new files created in the watched directory. The listener has conditional logic based on the size of the created file, which it determines using int stat(const char *pathname, struct stat *statbuf).
In a separate thread, I create a nonzero-length file using std::ofstream; a simplified example of the file creation is:
std::ofstream ofs( "/path/to/file", std::ofstream::out );
ofs << "abc";
ofs.close();
Runtime behavior is that the listener, invoking stat(), sometimes sees the file as 0-length.
This is perfectly reasonable, since the file creation and content-addition are separate actions.
Question: Is there a way to atomically create a nonzero-length file using either C functions or C++03's stl?
Note: For the purpose of this question, I'm not interested in synchronization primitives, like mutexes or semaphores, to synchronize the two threads around the entire process of file-open, add content, close-file.
The base C language has no such concepts, and I don't think C++ does either. If you're talking about this type of thing, you must be assuming POSIX or some other operating-system-level behavior specification.
Under POSIX, the way to do this kind of operation is to create the file with a temporary name, then rename it only after you finish writing it. You can do that in a different directory if they're both on the same device; if they're on different devices, whether that works is implementation-defined. The most portable way is to do it in the same directory, which means that your inotify (Linux-specific, BTW) listener should ignore files not matching the naming pattern it's looking for or ignore files in a particular temp namespace you choose as your convention.
Is there a way to atomically create a nonzero-length file using either C functions or C++03's stl?
Best approach would be to create the file elsewhere on the same filesystem, and then std::rename the file into the target file.
The standard doesn't really give explicit guarantees except for the postconditions (the file exists under either the new name or the old name). It says nothing about observable intermediate states. In practice, you're at the mercy of the file system. But if there is some standard operation that achieves what you want, then this is it. The POSIX standard does require rename to be atomic.
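The write-then-rename approach from the answers above can be sketched as follows. This is a minimal illustration, not a hardened implementation: the function name publish_file and its parameters are made up, the two paths are assumed to be on the same filesystem, and the atomicity comes from POSIX rename semantics rather than from the C or C++ standard. The code sticks to C++03 library calls.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Writes `content` to `temp_path`, then renames the finished file to
// `final_path`. A watcher on the final name never sees a partial file,
// provided rename is atomic (as POSIX requires on one filesystem).
bool publish_file(const std::string& temp_path,
                  const std::string& final_path,
                  const std::string& content) {
    assert(temp_path != final_path);
    {
        std::ofstream ofs(temp_path.c_str(),
                          std::ofstream::out | std::ofstream::trunc);
        if (!ofs) return false;
        ofs << content;
        ofs.close();  // flush before the file becomes visible
        if (!ofs) {   // write or close failed
            std::remove(temp_path.c_str());
            return false;
        }
    }
    if (std::rename(temp_path.c_str(), final_path.c_str()) != 0) {
        std::remove(temp_path.c_str());
        return false;
    }
    return true;
}
```

If the temporary file lives in the watched directory, the listener still has to ignore it, for example by filtering on a naming convention such as a ".tmp" suffix.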
Given a HANDLE to a file (e.g. C:\\FolderA\\file.txt), I want a function which will return a HANDLE to the containing directory (in the previous example, it would be a HANDLE to C:\\FolderA). For example:
HANDLE hFile = CreateFileA(
"C:\\FolderA\\file.txt",
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
HANDLE hDirectory = someFunc(hFile);
Possible implementation for someFunc:
HANDLE someFunc(HANDLE h)
{
char *path = getPath(h); // "C:\\FolderA\\file.txt"
char *parent = getParentPath(path); // "C:\\FolderA"
HANDLE hFile = CreateFileA(
parent,
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
free(parent);
free(path);
return hFile;
}
But is there a way to implement someFunc without getParentPath or without making it look at the string and removing everything after the last directory separator (because this is terrible from a performance point of view)?
I don't know what getParentPath is. I assume it's a function that searches for the trailing backslash in the string and uses that to strip off the file specification. You don't have to define such a function yourself; Windows already provides one for you—PathCchRemoveFileSpec. (Note that this assumes the specified path actually contains a file name to remove. If the path doesn't contain a file name, it will remove the trailing directory name. There are other functions you can use to verify whether a path contains a file specification.)
The older version of this function is PathRemoveFileSpec, which is what you would use on downlevel operating systems where the newer, safer function is not available.
Outside of the Windows API, there are other ways of doing the same thing. If you're targeting C++17, there is the filesystem::path class. Boost provides something similar. Or you could write it yourself with the find_last_of member function of the std::string class, if you absolutely have to. (But prefer not to re-invent the wheel. There are lots of edge cases when it comes to path manipulation that you probably won't think of, and that your testing probably won't reveal.)
You express concerns about the performance of this approach. This is nonsense. Stripping some characters from a string is not a slow operation. It wouldn't even be slow if you started searching from the beginning of the string and then, once you found the file specification, made a second copy of the string, again starting from the beginning of the string. It's a simple loop searching through the characters of a reasonable-length string, and then a simple memcpy. There is absolutely no way that this operation could be a performance bottleneck in code that does file I/O.
But, the implementation probably isn't even going to be that naïve. You can optimize it by starting the search from the end of the path string, reducing the number of characters that you have to iterate through, and you can avoid any type of memory copy altogether if you're allowed to manipulate the original string. With a C-style string, you just replace the trailing path separator (the one that demarcates the beginning of the file specification) with a NUL character (\0). With a C++-style string, you just call the erase member function.
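The in-place stripping described above can be sketched with std::string. The function name remove_file_spec is made up for illustration, and the sketch deliberately ignores the edge cases mentioned earlier (drive roots such as "C:\file.txt", UNC prefixes, trailing separators), which is exactly why a library function is preferable in production code.

```cpp
#include <string>

// Removes the last path component in place by erasing everything from the
// final separator onward. Returns false if the path has no separator.
// Illustrative only: roots and UNC paths are not handled.
bool remove_file_spec(std::string& path) {
    // find_last_of searches backward from the end of the string.
    const std::string::size_type pos = path.find_last_of("\\/");
    if (pos == std::string::npos) return false;
    path.erase(pos);  // no second buffer, no copy of the prefix
    return true;
}
```

For "C:\\FolderA\\file.txt" this leaves "C:\\FolderA" in the same string object, with no allocation and a scan that touches only the trailing file name.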
In fact, if you really care about performance, this is virtually guaranteed to be faster than making a system call to retrieve the containing folder from a file object. System calls are a lot slower than some compiler-generated, inlinable code to iterate through a string and strip out a sub-string.
Once you have the path to the directory, you can obtain a HANDLE to it by calling the CreateFile function with the FILE_FLAG_BACKUP_SEMANTICS flag. (It is necessary to pass that flag if you want to retrieve a handle to a directory.)
I have measured that this is slow and am looking for a faster way.
Your measurements are wrong. Either you've made the common mistake of benchmarking a debugging build, where the standard library functionality (e.g., std::string) is not optimized, or the real performance bottleneck is the file I/O (or both). CreateFile is not a speedy function by any stretch of the imagination. I can almost guarantee that is going to be your hotspot.
Note that if you don't already have the path, it is straightforward to obtain the path from a HANDLE to a file. As was pointed out in the comments, on Windows Vista and later, you simply need to call the GetFinalPathNameByHandle function. More details are available in this article on MSDN, including sample code and an alternative for use on downlevel versions of Windows.
As was mentioned already in the comments to the question, you can optimize this further by allocating a buffer of length MAX_PATH (or perhaps even larger) on the stack. That compiles to a single instruction to adjust the stack pointer, so it won't be a performance bottleneck, either. (Okay, I lied: you actually will need two instructions—one to create space on the stack, and the other to free the allocated space on the stack. Still not a performance problem.) That way, you don't even have to do any dynamic memory allocation.
Note that for maximum robustness, especially on Windows 10, you want to handle the case that a path is longer than MAX_PATH. In such cases, your stack-allocated buffer will be too small, and the function you call to fill it will return an error. Handle that error, and allocate a larger buffer on the free store. That will be slower, but this is an edge case and probably not one that is worth optimizing. The 99% common case will use the stack-allocated buffer.
Furthermore, eryksun points out (in comments to this answer) that, although it is convenient, GetFinalPathNameByHandle requires multiple system calls to map the file object between the NT and DOS namespaces and to normalize the path. I haven't disassembled this function, so I can't confirm his claims, but I have no reason to doubt them. Under normal circumstances, you wouldn't worry about this sort of overhead or possible performance costs, but since this seems to be a big concern for your application, you can use eryksun's alternative suggestion of calling GetFileInformationByHandleEx and requesting the FileNameInfo class. GetFileInformationByHandleEx is a general, multi-purpose function that can retrieve all different sorts of information about a file, including the path. Its implementation is simpler, calling directly down to the native NtQueryInformationFile function. I would have thought GetFinalPathNameByHandle was just a user-mode wrapper providing exactly this service, but eryksun's research suggests it is doing extra work that you might want to avoid if this is truly a performance hot-spot. I have to qualify this slightly by noting that GetFileInformationByHandleEx, in order to retrieve the FileNameInfo, is going to have to create an I/O Request Packet (IRP) and call down to the underlying device driver. That's not a cheap operation, so I'm not sure that the additional overhead of normalizing the path is really going to matter. But in this case, there's no real harm in using the GetFileInformationByHandleEx approach, since it's a documented function.
If you've written the code as described but are still having measurable performance problems, then please post that code for someone to review and help you optimize. The Code Review Stack Exchange site is a great place to get help like that on working code. Feel free to leave me a link to such a question in a comment under this answer so that I don't miss it.
Whatever you do, please stop calling the ANSI versions of the Windows API functions (the ones that end with an A suffix). You want the wide-character (Unicode) versions. These end with a W suffix, and work with strings composed of WCHAR (== wchar_t) characters. Aside from the fact that the ANSI versions have been deprecated for decades now because they do not provide Unicode support (it is not optional for any application written after the year 2000 to support Unicode characters in paths), as much as you care about performance, you should be aware of the fact that all A-suffixed API functions are just stubs that convert the passed-in ANSI string to a Unicode string and then delegate to the W-suffixed version. If the function returns a string, a second conversion also must be done by the A-suffixed version, since all native APIs work with Unicode strings. Performance isn't the real reason why you should avoid calling ANSI functions, but perhaps it's one that you'll find more convincing.
There might be a way to do what you want (map a file object via a HANDLE to its containing directory), but it would require undocumented usage of the NT native API. I don't see anything at all in the documented functions that would allow you to obtain this information. It certainly isn't accessible via the GetFileInformationByHandleEx function. For better or worse, the user-mode file system API is almost entirely path-based. Presumably, it is tracked internally, but even the documented NT native API functions that take a root directory HANDLE (e.g., NtDeleteFile via the OBJECT_ATTRIBUTES structure) allow this field to be NULL, in which case the full path string is used.
As always, if you had provided more details on the bigger picture, we could probably provide a more appropriate solution. This is what the commenters were driving at when they mentioned an XY problem. Yes, people are questioning your motives because that's how we provide the most appropriate help.
I'm struggling to use the stxxl library in such a way that I can not only store the data from its vector structure in a file but also recover it from that file when I rerun my program.
I found out that you can construct a vector from a file ( http://stxxl.sourceforge.net/tags/master/classstxxl_1_1vector.html#a4d9029657cc11315cb77955fae70b877 ) but the class "file" only contains these functions ( http://stxxl.sourceforge.net/tags/master/classstxxl_1_1file.html ) with no way (that I can see) to actually access an existing file with some given path.
Does someone who worked with this library before have an idea how to do that?
Thanks in advance
stxxl::file is an interface base class. Depending on your operating system, you want one of the derived classes:
stxxl::syscall_file for UNIX, Linux, and Mac OS X using POSIX read and write,
stxxl::wincall_file for Windows, or
stxxl::linuxaio_file for Linux using the SYS_io_* asynchronous I/O syscalls (see man 7 aio for details). This requires STXXL 1.4.1.
You can use the stxxl::create_file function to decide at runtime which backend to use. Set the io_impl parameter to "syscall", "wincall", or "linuxaio", respectively.
I'm using a third-party library that allows conversion between two file formats A and B. I would like to use this library to load a file of format A and convert it to format B, but I only need the converted representation in memory. So I would like to do the conversion without actually saving a file of the target format to disk, and instead obtain an unsigned char* buffer or something similar. Unfortunately the library's only conversion function is of the form
void saveAsB(A& a, std::FILE *const file);
What can I do? Is there any way to redirect the write operations performed on the handle to some buffer?
If your platform supports it, use open_memstream(3). This will be available on Linux and BSD systems, and it's probably better than fmemopen() for your use case because open_memstream() allocates the output buffer dynamically rather than you having to know the maximum size in advance.
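A minimal sketch of the open_memstream() route, assuming a POSIX.1-2008 platform such as Linux with glibc. The names save_to_stream and capture_to_memory are made up for illustration; save_to_stream merely stands in for a library routine such as saveAsB that insists on a FILE*.

```cpp
#include <stdio.h>   // open_memstream is POSIX, declared in <stdio.h>
#include <stdlib.h>
#include <string>

// Hypothetical stand-in for a library function that only writes to a FILE*.
void save_to_stream(FILE* file) {
    fputs("converted-bytes", file);
}

// Runs `writer` against an in-memory stream and returns what it wrote.
std::string capture_to_memory(void (*writer)(FILE*)) {
    char* buffer = NULL;
    size_t size = 0;
    FILE* stream = open_memstream(&buffer, &size);
    if (stream == NULL) return std::string();

    writer(stream);
    fclose(stream);  // buffer and size are final once the stream is closed

    std::string result(buffer, size);  // explicit size keeps embedded NULs
    free(buffer);                      // the caller owns the malloc'd buffer
    return result;
}
```

The buffer grows as the writer produces output, so no maximum size has to be guessed up front.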
If your platform doesn't have those functions, you can always use a "RAM disk" approach, which again on Linux would be writing a "file" to /dev/shm/ which will never actually reach any disk, but rather be stored in memory.
Edit: OK, so you say you're using Windows. Here's an outline of what you can try:
Open a non-persisted memory-mapped file.
Use _open_osfhandle to convert the HANDLE to an int file descriptor.
Use _fdopen to convert the int file descriptor to FILE*.
Cross your fingers. I haven't tested any of this.
I found this reference useful in putting the pieces together: http://www.codeproject.com/Articles/1044/A-Handy-Guide-To-Handling-Handles
Edit 2: It looks like CreateFileMapping() and _open_osfhandle() may be incompatible with each other--you would be at least the third person to try it:
https://groups.google.com/forum/#!topic/comp.os.ms-windows.programmer.win32/NTGL3h7L1LY
http://www.progtown.com/topic178214-createfilemapping-and-file.html
So, you can try what the last link suggested, which is to use setvbuf() to "trick" the data into flowing to a buffer you control, but even that has potential problems, e.g. it won't work if the library seeks within the FILE*.
So, perhaps you can just write to a file on some temporary/scratch filesystem and be done with it? Or use a platform other than Windows? Or use some "RAM disk" software.
If you can rely on POSIX being available, then use fmemopen().
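For comparison, a minimal fmemopen() sketch, again assuming a POSIX platform. The function name write_into_fixed_buffer and the 64-byte capacity are arbitrary choices for illustration. Unlike open_memstream(), the caller supplies a buffer of fixed size, so output beyond that size is lost.

```cpp
#include <stdio.h>   // fmemopen is POSIX, declared in <stdio.h>
#include <string>

// Writes `text` through a FILE* that is backed by a fixed-size array and
// returns the bytes that landed in the array.
std::string write_into_fixed_buffer(const char* text) {
    // One extra zero byte beyond the stream's capacity keeps the array
    // NUL-terminated even if the stream fills up completely.
    char storage[65] = {};
    FILE* stream = fmemopen(storage, sizeof storage - 1, "w");
    if (stream == NULL) return std::string();

    fputs(text, stream);
    fclose(stream);  // flushes buffered output into `storage`

    return std::string(storage);
}
```

This fits when an upper bound on the output size is known in advance. Otherwise the dynamically growing stream is the safer choice.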
I want to run another program from my C++ code. system() returns int, as every program can only return an int to the OS. However, the other program I want to call will generate a string that I need in my base app. How can I send it to the parent process?
The two apps will be in the same folder, so I think the child app could save the string to "temp.txt" and the main app could then read and delete it (this is not a performance-critical process; I will call the other process just to show an open-file dialog from my main OpenGL app). However, this is a somewhat ugly solution. Are there better cross-platform solutions?
Thanks
You could use popen(), which opens a process that you can write data to or read data from. AFAIK this is also cross-platform (on Windows it is spelled _popen):
// crt_popen.c
/* This program uses _popen and _pclose to receive a
 * stream of text from a system process.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char psBuffer[128];
    FILE *pPipe;

    /* Run DIR so that it writes its output to a pipe. Open this
     * pipe with read text attribute so that we can read it
     * like a text file.
     */
    if ((pPipe = _popen("dir *.c /on /p", "rt")) == NULL)
        exit(1);

    /* Read pipe until end of file or until a read error occurs. */
    while (fgets(psBuffer, sizeof psBuffer, pPipe) != NULL)
        printf("%s", psBuffer); /* never pass data as the format string */

    /* Close pipe and print the return value of the child process. */
    printf("\nProcess returned %d\n", _pclose(pPipe));
    return 0;
}
Although it's not part of the C++ standard, nearly all reasonably current systems provide a popen (or _popen, in Microsoft's case) that will let you spawn a child process and read from its standard output as a C-style FILE * in the parent. At least if memory serves, popen is included in POSIX, so you can expect it to be present in essentially all Unix-like systems (and, as implied above, it's also available on Windows, at least with most compilers).
In other words, about the only place you'd likely encounter that it's not available would be something like a small embedded system where it might well be pretty meaningless (e.g., no file system to find the other executable in, and quite possibly no ability to create new processes either).
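A minimal C++ sketch of reading a child's standard output through popen, as described above. The function name read_command_output is made up for illustration, and the macro shim assumes that a compiler defining _WIN32 provides Microsoft's _popen and _pclose. Error handling is kept deliberately thin.

```cpp
#include <cassert>
#include <stdio.h>
#include <string>

#ifdef _WIN32
// Microsoft's C run-time spells these with a leading underscore.
#define popen _popen
#define pclose _pclose
#endif

// Runs `command` through the system shell and returns everything it wrote
// to standard output. Returns an empty string if the pipe cannot be opened.
std::string read_command_output(const char* command) {
    assert(command != NULL);
    std::string output;

    FILE* pipe = popen(command, "r");
    if (pipe == NULL) return output;

    char chunk[256];
    while (fgets(chunk, sizeof chunk, pipe) != NULL)
        output += chunk;  // fgets keeps the newline, so lines stay intact

    pclose(pipe);  // waits for the child and releases the stream
    return output;
}
```

The parent could call this with the path of the helper executable and parse the returned string, which avoids the temporary file entirely.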
Though there is no standard way of achieving interprocess communication, there is a relatively pain free library, ported to many OS/compilers: Boost.Interprocess. It covers most necessities:
Shared memory.
Memory-mapped files.
Semaphores, mutexes, condition variables and upgradable mutex types to place them in shared memory and memory mapped files.
Named versions of those synchronization objects, similar to UNIX/Windows sem_open/CreateSemaphore API.
File locking.
Relative pointers.
Message queues.