What does ifstream::open() really do? - c++

Consider this code:
ifstream filein;
filein.open("y.txt");
When I use the open() function, what happens?
Does the file stream itself get opened?
or does the object's state change to open?
or both?

It's not clear if you want to know implementation details or standard requirements - but as for implementation details - it will call the underlying open system call on the operating system. On Linux for example this is called open. On Windows it is called CreateFile.

The filestream being open or closed is represented by it's state. So if you change the state to open, the filestream is now open. Like a doorway. If you open it, you've changed it's state to the open position. Then you can later close it, which involves changing it's state to the closed position. Changing its state to open and opening the stream are the exact same thing.

The std::ifstream is set up to own a std::filebuf which is a class derived from std::streambuf. The stream buffer is managing buffering for streams in a generic way and abstracts the details of how a stream is accessed. For a std::filebuf the underlying stream is an operating system file accessed as needed. When std::ifstream::open() is called this call is mainly delegated to std::filebuf::open() which does the actual work. However, the std::ifstream will clear() its state bits if the call to std::filebuf::open() succeeds and set std::ios_base::failbit if the call fails. The file buffer will call the system's method to allocate a file handle and, if successful, arrange for this file handle to be released in its destructor or in the std::filebuf::close() function - whatever comes first. When calling std::ifstream::open() with the default arguments the system call will check that the file exists, is accessible, not too many file handles are open, etc. There is an std::ios_base::openmode parameter which can be used to modify the behavior in some ways and when different flags are used when calling std::ofstream::open().
Whether the call to std::filebuf::open() has any other effects is up to the implementation. For example, the implementation could choose to obtain a sequence of bytes and convert them into characters. Since the user can override certain setting, in particular the std::locale (see the std::streambuf::pubimbue() function), it is unlikely that much will happen prior to the first read, though. In any case, none of the state flags would be affected by the outcome of any operation after opening the file itself.
BTW, the mentioned classes are actually all templates (std::basic_ifstream, std::basic_filebuf, std::basic_streambuf, and std::basic_ofstream) which are typedef'ed to the names used above for the instantiation working on char as a character type. There are similar typedefs using a w prefix for instantiations working on wchar_t. Interestingly, there are no typedefs for the char16_t and char32_t versions and it seems it will be a bit of work to get them instantiated as well.

If you think logically, ifstream is just the stream in which we will get our file contents. The parameters, we provide to ifstream.open() will open the file and mark it as open. When the file is marked as open, it will not allow you to do some operations on file like renaming a file as it is opened by some program. It will allow you to do the same after you close the stream. ifstream - imo is only the helper class to access files.

Related

Is opening the SAME file in two different fstreams Undefined Behaviour?

This recently asked question has raised another interesting issue, as discussed in the comments to one of its answers.
To summarize: the OP there was having issues with code like that below, when subsequently attempting to read and write data from/to the two streams 'concurrently':
ifstream infile;
infile.open("accounts.txt");
ofstream outfile;
outfile.open("accounts.txt");
Although the issue, in itself, was successfully resolved, it has raised a question to which I cannot find an authoritative answer (and I've made some quite extensive searches of Stack Overflow and the wider web).
It is very clearly stated what happens when calling the open() method of a stream that is already associated with a file (cppreference), but what I cannot find an answer to is what happens when (as in this case) the file is already associated with a (different) stream.
If the stream is already associated with a file (i.e., it is already
open), calling this function fails.
I can see several possible scenarios here:
The second open call will fail and any attempted writes to it will also fail (but that is not the case in the cited question).
The second open call will 'override' the first, effectively closing it (this could explain the issues encountered in said code).
Both streams remain open but enter into a 'mutual clobbering' match regarding their internal file pointers and buffers.
We enter the realm of undefined (or implementation-defined) behaviour.
Note that, as the first open() call is made by an input stream, the operating system will not necessarily 'lock' the file, as it probably would for an output stream.
So, does anyone have a definitive answer to this? Or a citation from the Standard (cppreference will be 'acceptable' if nothing more authoritative can be found)?
basic_filebuf::open (and all things that depend on it, like fstream::open) has no statement about what will happen in this case. A filesystem may allow it or it may not.
What the standard says is that, if the file successfully opens, then you can play with it in accord with the interface. And if it doesn't successfully open, then there will be an error. That is, the standard allows a filesystem to permit it or forbid it, but it doesn't say which must happen. The implementation can even randomly forbid it. Or forbid you from opening any files in any way. All are (theoretically) valid.
To me, this falls even out of the 'implementation defined' field. The very same code will have different behaviour depending of the underlying filesystem or OS (some OSes forbid to open a file twice).
No.
Such a scenario is not discussed by the standard.
It's not even managed by the implementation (your compiler, standard library implementation etc).
The stream ultimately asks the operating system for access to that file in the desired mode, and it's up to the operating system to decide whether that access shall be granted at that time.
A simple analogy would be your program making some API call to a web application over a network. Perhaps the web application does not permit more than ten calls per minute, and returns some error code if you attempt more than that. But that doesn't mean your program has undefined behaviour in such a case.
C implementations exist for many different platforms, whose underlying file systems may handle such corner cases differently. For the Standard to mandate any particular corner-case behavior would have made the language practical only on platforms whose file systems behave in such fashion. Instead, the Standard regards such issues as being outside its jurisdiction (i.e. to use its own terminology, "Undefined Behavior"). That doesn't mean that implementations whose target OS offers useful guarantees shouldn't make such guarantees to programs when practical, but implementation designers are presumed to know more than the Committee about how best to serve their customers.
On the other hand, it may sometime be helpful for an implementation not to expose the underlying OS behavior. On an OS that doesn't have a distinct "append" mode, for example, but code needing an "open for append" could do an "open existing file for write" followed by "seek to end of file", an attempt to open two streams for appending to the same file could result in data corruption when one stream writes part of a file, and the other stream then rewrites that same portion. It may be helpful for an implementation that detects that condition to either inject its own logic to either ensure smooth merging of the data or block the second open request. Either course of action might be better, depending upon an application's purpose, but--as noted above--the choice is outside the Standard's jurisdiction.
I open the zip file as stream twice.The zip file contains some XML files.
std::ifstream("filename") file;
zipstream *p1 = new zipstream(file);
zipstream *p2 = new zipstream(file);
p1->getNextEntry();
auto p3 = p1.rdbuf();
autp p4 = p2.rdbuf();
Then see p3 address = p4 address, but the member variables are different between them. Such as _IGfirst.
The contents of one of the XML files are as follows:
<test>
<one value="0.00001"/>
</test>
When the contents of file are read in two thread at the same time.error happend.
string One = p1.getPropertyValue("one");
// one = "0001two"

What does it mean to open an output file as both input and output?

I see code like this sometimes:
ofstream of("out.txt", ofstream::in | ofstream::out);
Why is it being opened as both input and output? What sense does it make to do that?
Sometimes one needs to read a file, do some processing, then update the original file with some new information. Opening the file for both input and output allows one to do that without closing and re-opening.
It also makes explicit one's intentions with regards to the file.
EDIT: My original answer didn't take into account that the OP's example uses ofstream instead of fstream. So …
Using ofstream with std::in doesn't make any sense. Are there reads performed on the stream? If not, perhaps the code when originally written used an fstream, which was later changed to an ofstream. It's the kind of thing that can creep into a code base that's in maintenance mode.
Since you're using an ofstream, none of the input functions are available to you on the stream. Then opening the file in read-write mode isn't harmful, but it is pointless unless you're going to start hacking about with the underlying stream.
It's probably boilerplate, copy/pasted from some tutorial on the internet without the author actually understanding it.
This means nothing for the actual streams. The std::ios_base::in openmode will simply be ignored. But for the underlying stream buffers, it does mean something to open as both input and output: For example, file stream and string stream buffers allow putback functionality, but it can only be used if the buffers' std::ios_base::openmode specifies output.

C++ how to check if a stream (iostream) is seekable

Is there a way I can check if an istream of ostream is seekable?
I suspect doing a test seek and checking for failbit is not correct
since the seek can fail for unrelated reasons.
I need this to work on Linux and mac, if that makes a difference.
Iostreams doesn't give you much. Stream objects are just wrappers around a buffer object derived from class std::streambuf. (Assuming "narrow" characters.) The standard derived buffer classes are std::stringbuf for strings and std::filebuf for files. Supposing you're only interested in files, std::filebuf is just a simple wrapper around the C library functionality. The C library does not define a way to determine if a FILE object supports seeking or not, besides attempting to do so, so neither does C++.
For what it's worth, the semantics of seek vary a bit. Some platforms might allow you to "seek" a pipe but only to the current position, to determine how many characters have been read or written. Seeking past the end might resize the file, or might cause the next write operation to resize the file, or something in between.
You might also try checking errno if badbit is set (or, as I prefer, use exceptions instead of flags).

Getting a HANDLE from a std::ofstream

Is it possible to get the underlying file HANDLE from a std::ofstream (Visual C++ 2005)?
This is the opposite of this question:
Can I use CreateFile, but force the handle into a std::ofstream?
The reason I want to so this is to modify attributes of the file (e.g. creation time) without having to open the file with CreateFile.
The C++ standard does not provide any means for specifying or retrieving the raw file descriptors of an ofstream, so I don't believe this is possible. What is possible, though, would be to build a custom streambuf class that implements stream buffering to and from a HANDLE, then to define a custom ostream type that uses that buffer. I'm not sure if that's really what you're looking for, but it is a viable option.
My answer should be prefaced with "I am a Unix developer, not a Windows developer." But I had the same problem that you did, and this is how I chose to address it. I would love to have a better answer. My answer below will make your skin crawl, but it worked.
First off, we'll need the _Filet* from the fdbuf. This is a private member, so we can't just create a new class that gives us visibility into it. So, I modify the header for fstream to add a new friend function in filebuf, so that the specific function will let us cheat and get access to that member (I added it just below the definition "_Filet *_Myfile;"):
friend HANDLE __HACK_getFilebufHANDLE(filebuf*);
Now we have a public function to access the private member. Step two is to write the function:
namespace std {
HANDLE __HACK_getFilebufHANDLE(filebuf*in) {
return (HANDLE) _get_osfhandle(_fileno(in->_Myfile));
}
};
Lastly, you just need to call it, except that rdbuf returns the wrong type (iobuf rather than filebuf). Since we're already off in "here there be dragons" for this entire process, we may as well make everyone's skin crawl (but in real life, do type checking here to validate the cast to the derived type):
__HACK_getFilebufHANDLE((filebuf*)fopoutstrm.rdbuf())
Sorry that I don't have a cleaner answer.
No. You can't even get at the FILE* (or _Filet* as it's internally known) inside std::basic_filebuf.
This is not possible in standard C++. However, with Boost.IOStreams library it is not that hard. Create a Device, wrap it in a boost::iostreams::stream_buffer<> and add appropriate stream using boost::iostreams::stream<>.
With the VisualC++ 2010 libraries, the following should work. I assume it's the same for VisualC++ 2005, but you will have to verify:
FILE* fh = fopen(...);
HANDLE hFile = (HANDLE)_get_osfhandle(_fileno(fh));
// do something on hFile
// create iostream from FILE
std::ifstream ifs(fh);
// use it...
// close FILE
_close(fh);
No. I try many ways.
this line: "std::ifstream ifs(fh);" may not wrok in some msvs, such as 2008.
I find another way, you can enumerate handle in your process, and find the handle that releated to the filename.
In this way, I get the handle.

Why can't I use fopen?

In the mold of a previous question I asked about the so-called safe library deprecations, I find myself similarly bemused as to why fopen() should be deprecated.
The function takes two C strings, and returns a FILE* ptr, or NULL on failure. Where are the thread-safety problems / string overrun problems? Or is it something else?
Thanks in advance
You can use fopen(). Seriously, don't take any notice of Microsoft here, they're doing programmers a real disservice by deviating from the ISO standards . They seem to think that people writing code are somehow brain-dead and don't know how to check parameters before calling library functions.
If someone isn't willing to learn the intricacies of C programming, they really have no business doing it. They should move on to a safer language.
This appears to be just another attempt at vendor lock-in by Microsoft of developers (although they're not the only ones who try it, so I'm not specifically berating them). I usually add:
#define _CRT_SECURE_NO_WARNINGS
(or the "-D" variant on the command line) to most of my projects to ensure I'm not bothered by the compiler when writing perfectly valid, legal C code.
Microsoft has provided extra functionality in the fopen_s() function (file encodings, for one) as well as changing how things are returned. This may make it better for Windows programmers but makes the code inherently unportable.
If you're only ever going to code for Windows, by all means use it. I myself prefer the ability to compile and run my code anywhere (with as little change as possible).
As of C11, these safe functions are now a part of the standard, though optional. Look into Annex K for full details.
There is an official ISO/IEC JTC1/SC22/WG14 (C Language) technical report TR24731-1 (bounds checking interfaces) and its rationale available at:
http://www.open-std.org/jtc1/sc22/wg14
There is also work towards TR24731-2 (dynamic allocation functions).
The stated rationale for fopen_s() is:
6.5.2 File access functions
When creating a file, the fopen_s and freopen_s functions improve security by protecting the file from unauthorized access by setting its file protection and opening the file with exclusive access.
The specification says:
6.5.2.1 The fopen_s function
Synopsis
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
errno_t fopen_s(FILE * restrict * restrict streamptr,
const char * restrict filename,
const char * restrict mode);
Runtime-constraints
None of streamptr, filename, or mode shall be a null pointer.
If there is a runtime-constraint violation, fopen_s does not attempt to open a file.
Furthermore, if streamptr is not a null pointer, fopen_s sets *streamptr to the
null pointer.
Description
The fopen_s function opens the file whose name is the string pointed to by
filename, and associates a stream with it.
The mode string shall be as described for fopen, with the addition that modes starting
with the character ’w’ or ’a’ may be preceded by the character ’u’, see below:
uw truncate to zero length or create text file for writing, default permissions
ua append; open or create text file for writing at end-of-file, default permissions
uwb truncate to zero length or create binary file for writing, default permissions
uab append; open or create binary file for writing at end-of-file, default
permissions
uw+ truncate to zero length or create text file for update, default permissions
ua+ append; open or create text file for update, writing at end-of-file, default
permissions
uw+b or uwb+ truncate to zero length or create binary file for update, default
permissions
ua+b or uab+ append; open or create binary file for update, writing at end-of-file,
default permissions
To the extent that the underlying system supports the concepts, files opened for writing
shall be opened with exclusive (also known as non-shared) access. If the file is being
created, and the first character of the mode string is not ’u’, to the extent that the
underlying system supports it, the file shall have a file permission that prevents other
users on the system from accessing the file. If the file is being created and first character
of the mode string is ’u’, then by the time the file has been closed, it shall have the
system default file access permissions10).
If the file was opened successfully, then the pointer to FILE pointed to by streamptr
will be set to the pointer to the object controlling the opened file. Otherwise, the pointer
to FILE pointed to by streamptr will be set to a null pointer.
Returns
The fopen_s function returns zero if it opened the file. If it did not open the file or if
there was a runtime-constraint violation, fopen_s returns a non-zero value.
10) These are the same permissions that the file would have been created with by fopen.
The fopen_s() function has been added by Microsoft to the C runtime with the following fundamental differences from fopen():
if the file is opened for writing ("w" or "a" specified in the mode) then the file is opened for exclusive (non-shared) access (if the platform supports it).
if the "u" specifier is used in the mode argument with the "w" or "a" specifiers, then by the time the file is closed, it will have system default permissions for others users to access the file (which may be no access if that's the system default).
if the "u" specified is not used in those cases, then when the file is closed (or before) the permissions for the file will be set such that other users will not have access to the file.
Essentially it means that files the application writes are protected from other users by default.
They did not do this to fopen() due to the likelyhood that existing code would break.
Microsoft has chosen to deprecate fopen() to encourage developers for Windows to make conscious decisions about whether the files their applications use will have loose permissions or not.
Jonathan Leffler's answer provides the proposed standardization language for fopen_s(). I added this answer hoping to make clear the rationale.
Or is it something else?
Some implementations of the FILE structure used by 'fopen' has the file descriptor defined as 'unsigned short'. This leaves you with a maximum of 255 simultaneously open files, minus stdin, stdout, and stderr.
While the value of being able to have 255 open files is debatable, of course, this implementation detail materializes on the Solaris 8 platform when you have more than 252 socket connections! What first appeared as a seemingly random failure to establish an SSL connection using libcurl in my application turned out to be caused by this, but it took deploying debug versions of libcurl and openssl and stepping the customer through debugger script to finally figure it out.
While it's not entirely the fault of 'fopen', one can see the virtues of throwing off the shackles of old interfaces; the choice to deprecate might be based on the pain of maintaining binary compatibility with an antiquated implementation.
The new versions do parameter validation whereas the old ones didn't.
See this SO thread for more information.
Thread safety. fopen() uses a global variable, errno, while the fopen_s() replacement returns an errno_t and takes a FILE** argument to store the file pointer to.