Recover stdio stream open mode - C++

Is there a way for a function receiving a value of type FILE * to get the open mode used on the call to fopen() used to create the stream?
This question was motivated by the need to extend a C++ class that wraps stdio's FILE pointers, so that I can clone an already open stream into a new wrapped one while the original continues to be used, unwrapped, by other parts of the program.
Under POSIX, I know that I can use fileno() to get the stream's underlying file descriptor in order to clone (dup()) it, but using the underlying descriptor's file flags would not be an exact replacement for the stream open mode, since it is possible that the stream would have stricter access restrictions than the descriptor it's bound to. So, do you have any suggestions?
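For reference, here is a minimal POSIX sketch of that dup()-based cloning, reconstructing an approximate mode string from fcntl(F_GETFL). clone_stream is a hypothetical helper name, and the recovered mode can be broader than the one originally passed to fopen(), which is exactly the caveat above:

#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: clone an open FILE* into a new, independently
// buffered stream. The mode string is reconstructed from the descriptor's
// access flags, which may be broader than the mode originally passed to
// fopen() -- the limitation described in the question.
FILE *clone_stream(FILE *fp)
{
    int fd = dup(fileno(fp));          // duplicate the underlying descriptor
    if (fd == -1)
        return nullptr;

    int flags = fcntl(fd, F_GETFL);    // recover the descriptor's flags
    if (flags == -1) {
        close(fd);
        return nullptr;
    }

    const char *mode;
    switch (flags & O_ACCMODE) {
    case O_RDONLY: mode = "r"; break;
    case O_WRONLY: mode = (flags & O_APPEND) ? "a" : "w"; break;
    default:       mode = (flags & O_APPEND) ? "a+" : "r+"; break;
    }

    FILE *copy = fdopen(fd, mode);     // wrap the duplicate in a new stream
    if (!copy)
        close(fd);
    return copy;                       // note: fdopen() does not truncate
}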

Related

Is it possible to fan-out an istream to multiple readers?

I have an std::istream to work with. Is it possible to somehow pass it on to multiple readers which will potentially seek to and read from different positions?
If not, what if I restrict it to the case of an std::ifstream?
You already answered your own question. If it is a file stream (ifstream) you get random access (read-only; you can set the open mode), so there should be no problem with multiple threads accessing the same file by opening multiple ifstreams, one per thread. The C++ standard says nothing about the thread safety of ifstream.
For a generic istream (socket, cin), calling get() consumes the input stream, and I don't see any documentation guaranteeing thread safety for istream. The peek() method will not consume the input stream, but it will still change the internal state of the istream. If multiple threads perform seek() on the same istream, the behavior is undefined; you are not assured of an internal lock by the C++ language. seek() is basically dereferencing some sort of pointer into an internal buffer.
I would suggest having one thread read the istream into some buffer (constructed objects or plain raw memory) as the producer, and then letting multiple threads consume the results. This is the classic producer/consumer synchronization problem; any multi-threading textbook will teach you how to do it, and a sketch follows below.
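Here is a minimal C++11 sketch of that pattern; the LineQueue name and its interface are purely illustrative. One thread owns the istream and produces lines, and any number of threads consume them:

#include <condition_variable>
#include <istream>
#include <mutex>
#include <queue>
#include <string>

// One producer thread calls produce() with the shared istream; any
// number of threads call consume(). Only the producer touches the stream.
class LineQueue {
    std::queue<std::string> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;

public:
    // Producer: reads the stream to exhaustion, then signals completion.
    void produce(std::istream &in) {
        std::string line;
        while (std::getline(in, line)) {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(line));
            cv_.notify_one();
        }
        std::lock_guard<std::mutex> lock(m_);
        done_ = true;
        cv_.notify_all();
    }

    // Consumer: returns false once the stream is exhausted and drained.
    bool consume(std::string &out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || done_; });
        if (q_.empty())
            return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
};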

Concurrent File write between processes

I need to write log data into a single file from different processes.
I am using a Windows Mutex, which needs Common Language Runtime support:
Mutex^ m = gcnew Mutex( false,"MyMutex" );
m->WaitOne();
//... File Open and Write ..
m->ReleaseMutex();
Do I really need to change from C++ to C++/CLI for synchronization?
It is OK if atomics are not used, but I need to know whether using this named Mutex will slow down performance compared to a local mutex.
Adding CLR support to your C++ application just to get the Mutex class is overkill. There are several options available to you to synchronize your file access between two applications.
Option 1: Mutex
If you need to write a file from multiple processes, using a mutex is a good way to do it. Use the mutex functions in the Win32 API. (The .Net Mutex class is just a wrapper around those functions anyway.)
HANDLE mutex = CreateMutex(NULL, FALSE, TEXT("MyMutex")); // named: shared across processes
DWORD waitResult = WaitForSingleObject(mutex, INFINITE);
if (waitResult == WAIT_OBJECT_0)
{
    // TODO: Write the file
    WriteFile(...);
    ReleaseMutex(mutex);
}
CloseHandle(mutex); // release the handle once you are done with the mutex
As the other answer noted, you will need to open the file with sharing, so that both of your applications can open it at once. However, that by itself may not be enough: If both of your applications are trying to write to the same area of the file, then you'll still need to make sure that only one application writes at a time. Imagine if both applications look at the size of the file, then both try to write to that byte offset at the same time: Even though both tried to just append to the end of the file, they ended up clobbering each other.
Option 2: Open as append only
If you're purely writing to the end of the file, and never attempting to read anything or to write anywhere other than the very end of the file, then there is a special mode you can use that lets you avoid the mutex. If you open the file with dwDesiredAccess set to FILE_APPEND_DATA | SYNCHRONIZE and nothing else (don't include FILE_WRITE_DATA), then the OS will take care of making sure that all data written to the file goes at the end, and the two applications writing data do not overwrite each other. This behavior is documented on MSDN:
If only the FILE_APPEND_DATA and SYNCHRONIZE flags are set, the caller can write only to the end of the file, and any offset information about writes to the file is ignored. However, the file will automatically be extended as necessary for this type of write operation.
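A minimal sketch of that open mode (the file name is a placeholder); note that dwDesiredAccess is FILE_APPEND_DATA | SYNCHRONIZE, not GENERIC_WRITE, and that the share mode lets other writers open the file too:

#include <windows.h>

void append_log_line(void)
{
    HANDLE h = CreateFile(TEXT("log.txt"),
                          FILE_APPEND_DATA | SYNCHRONIZE,   // append-only access
                          FILE_SHARE_READ | FILE_SHARE_WRITE,
                          NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h != INVALID_HANDLE_VALUE)
    {
        const char msg[] = "log line\r\n";
        DWORD written;
        // The OS appends atomically; any file-pointer offset is ignored.
        WriteFile(h, msg, sizeof msg - 1, &written, NULL);
        CloseHandle(h);
    }
}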
Option 3: LockFile
One other path you can take is to use the LockFile method. With LockFile (or LockFileEx), you can have both applications open the file, and have each app lock the section of the file that it wants to write to. This gives you more granularity than the mutex, allowing non-overlapping writes to happen at the same time. (Using LockFile on the entire file will give you the same basic effect as the mutex, with the added benefit that it will prevent other applications from writing the file while you're doing so.) There's a good example of how to use LockFile on Raymond Chen's blog.
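For illustration, a hedged sketch of LockFileEx/UnlockFileEx. It assumes a handle h opened with GENERIC_WRITE and a sharing mode that lets the other process open the file, plus a 4 KB buffer to write; the names are placeholders:

#include <windows.h>

void write_locked_region(HANDLE h, const char *buffer)
{
    OVERLAPPED ov = {};   // Offset/OffsetHigh are zero: lock starts at byte 0
    if (LockFileEx(h, LOCKFILE_EXCLUSIVE_LOCK, 0, 4096, 0, &ov))
    {
        DWORD written;
        SetFilePointer(h, 0, NULL, FILE_BEGIN);     // write inside the locked region
        WriteFile(h, buffer, 4096, &written, NULL); // safe: the region is ours
        UnlockFileEx(h, 0, 4096, 0, &ov);
    }
}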
Actually you don't need to use a separate mutex at all; you can just use the file itself. When a file is opened with the CreateFile API call (see https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396), the call takes a parameter called dwShareMode which specifies what concurrent access other processes are allowed. A value of 0 would prevent other processes from opening the file at all.
Pretty much all APIs that open a file map to CreateFile under the hood, so the CLR might already be doing the right thing for you when you open a file for writing.
In the C runtime there is also _fsopen, which allows you to open a file with sharing flags.
I'd recommend testing what the default sharing mode is when you open your file from C#. If it does not prevent simultaneous opening for writing by default, use _fsopen from C (or maybe there is an appropriate C# function).
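For example, a small sketch of _fsopen with a deny-write sharing flag (the file name is a placeholder):

#include <share.h>
#include <stdio.h>

int main(void)
{
    // _SH_DENYWR: other processes may read the log while we have it
    // open, but may not write to it.
    FILE *fp = _fsopen("shared.log", "a", _SH_DENYWR);
    if (fp)
    {
        fputs("one line\n", fp);
        fclose(fp);
    }
    return 0;
}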

Functions responsibility on data in C

Recently I ran into a problem at work where you have two functions: one opens a file descriptor (a local variable in that function) and passes it to another function, where it is used for reading or writing. Now, when one of the read/write operations fails, the function doing the read/write closes this file descriptor and returns.
The question is, whose responsibility is to close the file descriptor, or let's say do cleanup:
the function which created the fd
the function which experienced the error while read/write
Is there a design rule for these kinds of cases, i.e. creation and cleanup?
BTW, the problem was that both functions attempted to close the fd, which resulted in a crash on the second call to close.
There are two parts to this answer — the general design issue and the detailed mechanics for your situation.
General Design
Handling resources such as file descriptors correctly, and making sure they are released correctly, is an important design issue. There are multiple ways to manage the problem that work. There are some others that don't.
Your tags use C and C++; be aware that C++ has extra mechanisms available to it.
In C++, the RAII idiom (Resource Acquisition Is Initialization) is a great help. When you acquire a resource, you wrap it in an object whose destructor releases the resource, so cleanup happens automatically however the scope is exited.
In both languages, it is generally best if the function responsible for allocating the resource also releases it. If a function opens a file, it should close it. If a function is given an open file, it should not close the file.
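For instance, here is a minimal RAII sketch over FILE * (the unique_file and process names are illustrative, not a library API): the unique_ptr owns the stream and guarantees exactly one fclose(), no matter how the function exits.

#include <cstdio>
#include <memory>
#include <stdexcept>

struct FileCloser {
    void operator()(FILE *fp) const { std::fclose(fp); }
};
using unique_file = std::unique_ptr<FILE, FileCloser>;

void process(const char *path)
{
    unique_file fp(std::fopen(path, "r"));
    if (!fp)
        throw std::runtime_error("cannot open file");

    // ... read via fp.get(); helper functions may take the FILE* but must
    // not close it -- ownership stays with this function ...
}   // fclose() runs here automatically, even if an exception is thrown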
In the comments, I wrote:
Generally, the function that opened the file should close it; the function that experienced the error should report the error, but not close the file. However, you can work it how you like as long as the contract is documented and enforced — the calling code needs to know when the called code closed the file to avoid double closes.
It would generally be a bad design for the called function to close the file sometimes (on error), but not other times (no error). If you must go that way, then it is crucial that the calling function is informed that the file is closed; the called function must return an error indication that tells the calling code that the file is no longer valid and should neither be used nor closed. As long as the information is relayed and handled, there isn't a problem — but the functions are harder to use.
Note that if a function is designed to return an opened resource (it is a function that's responsible for opening a file and making it available to the function that called it), then the responsibility for closing the file falls on the code that calls the opening function. That is a legitimate design; you just have to make sure that there is a function that knows how to close it, and that the calling code does close it.
Similar comments apply to memory allocation. If a function allocates memory, you must know when the memory will be freed, and ensure that it is freed. If it was allocated for the purposes of the current function and functions it calls, then the memory should be released before return. If it was allocated for use by the calling functions, then the responsibility for release transfers to the calling functions.
Detailed mechanics
Are you sure you're using file descriptors and not FILE * (file streams)? It's unlikely that closing a file descriptor twice would cause a crash (error, yes, but not a crash). OTOH, calling fclose() on an already closed file stream could cause problems.
In general, in C, you pass file descriptors, which are small integers, by value, so there isn't a way to tell the calling function that the file descriptor is no longer valid. In C++, you could pass them by reference, though it is not conventional to do so. Similarly with FILE *: they're most usually passed by value, not by reference, so there isn't a way to tell the calling code that the file is no longer usable by modifying the value passed to the function.
You can invalidate a file descriptor by setting it to -1; that is never a valid file descriptor. Using 0 is a bad idea; it is equivalent to using standard input. You can invalidate a file stream by setting it to 0 (aka NULL). Passing the null pointer to functions that try to use the file stream will tend to cause crashes. Passing an invalid file descriptor typically won't cause crashes — the calls may fail with EBADF set in errno, but that's the limit of the damage, usually.
Using file descriptors, you will seldom get a crash because the file descriptor is no longer valid. Using file streams, all sorts of things can go wrong if you try using an invalid file stream pointer.
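A small sketch of that invalidate-after-close convention (the helper names are illustrative; a C caller would take int * and FILE ** instead of references):

#include <cstdio>
#include <unistd.h>

void close_fd(int &fd)
{
    if (fd >= 0) {
        close(fd);
        fd = -1;            // -1 is never a valid descriptor
    }
}

void close_stream(FILE *&fp)
{
    if (fp) {
        std::fclose(fp);
        fp = nullptr;       // a null pointer marks the stream as dead
    }
}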

Multiple Pointers To Same FILE With Different Access Mode C++

Is it possible to have multiple FILE *s pointing to the same file with different access modes? For example,
let's say I called fp1 = fopen("File1.bin", "wb") and performed write operations, and then, WITHOUT closing the file using fclose, I called fp2 = fopen("File1.bin", "rb") and tried to use write operations on it. This should fail, but fp2 still writes content to it when I use a different access mode. Why?
fopen() opens a file stream, which is an abstraction of a file. Sure, a file handle is opened underneath but it is perfectly acceptable to have concurrent access to the same file through different handles (which may even be in different processes).
A file is a shared resource.
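To illustrate, here is a small self-contained sketch of two simultaneous streams on one file. Note that writing through a stream opened "rb" is undefined behavior per the standard; on common implementations the write simply fails rather than reaching the file:

#include <cstdio>

int main()
{
    FILE *fp1 = std::fopen("File1.bin", "wb");
    FILE *fp2 = std::fopen("File1.bin", "rb");
    if (!fp1 || !fp2)
        return 1;

    std::fputs("written via fp1", fp1);
    std::fflush(fp1);                          // hand the bytes to the OS

    char buf[32] = {};
    std::fread(buf, 1, sizeof buf - 1, fp2);   // fp2 sees fp1's data

    // fp2 was opened read-only; on most implementations this returns EOF
    // instead of writing anything.
    if (std::fputs("x", fp2) == EOF)
        std::puts("write on read-only stream rejected");

    std::fclose(fp2);
    std::fclose(fp1);
    return 0;
}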

Can I use fstream in C++ to read or write file when I'm implementing a disk management component of DBMS

In C++, I know I can read or write a file using system calls like read or write, and I can also do it with fstream's help.
Now I'm implementing a disk manager, which is a component of a DBMS. For simplicity I only use the disk manager to manage the space of a Unix file.
All I know is that fstream wraps system calls like read and write and provides some buffering.
However, I was wondering whether this affects atomicity and synchronization or not.
My question is: which way should I use, and why?
No, particularly not with Unix. A DBMS is going to want contiguous files; that means either a Unix variant that supports them or creating a disk partition.
You're also going to want to handle the buffering yourself, not rely on the C++ library's buffering.
I could go on, but streams are for streams of data, not for secure, reliable, structured data.
The following information about synchronization and thread safety of fstream can be found in the ISO C++ standard:
27.2.3 Thread safety [iostreams.threadsafety]
Concurrent access to a stream object (27.8, 27.9), stream buffer object (27.6), or C Library stream (27.9.2) by multiple threads may result in a data race (1.10) unless otherwise specified (27.4). [Note: Data races result in undefined behavior (1.10). —end note]
If one thread makes a library call a that writes a value to a stream and, as a result, another thread reads this value from the stream through a library call b such that this does not result in a data race, then a's write synchronizes with b's read.
C/C++ file I/O operations are not thread-safe by default. So if you are planning to use fstream or the open/write/read system calls, you will have to provide a synchronization mechanism yourself in your implementation. You may use the std::mutex mechanism provided in the new C++ standard (i.e. C++11) to synchronize your file I/O, as in the sketch below.
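A minimal sketch of that approach (SyncedFile is an illustrative name; also note it only serializes threads within one process, not separate processes):

#include <fstream>
#include <mutex>
#include <string>

class SyncedFile {
    std::fstream file_;
    std::mutex m_;

public:
    // Opens an existing file for reading and writing.
    explicit SyncedFile(const std::string &path)
        : file_(path, std::ios::in | std::ios::out | std::ios::binary) {}

    void write_at(std::streamoff pos, const std::string &data) {
        std::lock_guard<std::mutex> lock(m_);   // one thread at a time
        file_.seekp(pos);
        file_.write(data.data(), static_cast<std::streamsize>(data.size()));
        file_.flush();
    }
};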