Why can't I use fopen? - c++

In the mold of a previous question I asked about the so-called safe library deprecations, I find myself similarly bemused as to why fopen() should be deprecated.
The function takes two C strings, and returns a FILE* ptr, or NULL on failure. Where are the thread-safety problems / string overrun problems? Or is it something else?
Thanks in advance

You can use fopen(). Seriously, don't take any notice of Microsoft here, they're doing programmers a real disservice by deviating from the ISO standards. They seem to think that people writing code are somehow brain-dead and don't know how to check parameters before calling library functions.
If someone isn't willing to learn the intricacies of C programming, they really have no business doing it. They should move on to a safer language.
This appears to be just another attempt at vendor lock-in by Microsoft of developers (although they're not the only ones who try it, so I'm not specifically berating them). I usually add:
#define _CRT_SECURE_NO_WARNINGS
(or the "-D" variant on the command line) to most of my projects to ensure I'm not bothered by the compiler when writing perfectly valid, legal C code.
Microsoft has provided extra functionality in the fopen_s() function (file encodings, for one) as well as changing how things are returned. This may make it better for Windows programmers but makes the code inherently unportable.
If you're only ever going to code for Windows, by all means use it. I myself prefer the ability to compile and run my code anywhere (with as little change as possible).
As of C11, these safe functions are now a part of the standard, though optional. Look into Annex K for full details.
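As a rough illustration of the point above, here is a minimal sketch of plain ISO fopen() with the parameter and return-value checks done by the caller. The helper name count_bytes is made up for this example, and the macro only matters when compiling with MSVC.

```cpp
// Define before any CRT header; only MSVC looks at this macro.
#define _CRT_SECURE_NO_WARNINGS
#include <cassert>
#include <cstdio>

// Hypothetical helper: counts the bytes in a file using only ISO C I/O.
// Returns -1 if the file cannot be opened.
long count_bytes(const char* path) {
    assert(path != nullptr);            // validate arguments ourselves
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr) {
        return -1;                      // fopen reports failure with a null pointer
    }
    long n = 0;
    while (std::fgetc(fp) != EOF) {
        ++n;
    }
    std::fclose(fp);
    return n;
}
```

The same source should compile unchanged on other toolchains, which simply ignore the macro.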

There is an official ISO/IEC JTC1/SC22/WG14 (C Language) technical report TR24731-1 (bounds checking interfaces) and its rationale available at:
http://www.open-std.org/jtc1/sc22/wg14
There is also work towards TR24731-2 (dynamic allocation functions).
The stated rationale for fopen_s() is:
6.5.2 File access functions
When creating a file, the fopen_s and freopen_s functions improve security by protecting the file from unauthorized access by setting its file protection and opening the file with exclusive access.
The specification says:
6.5.2.1 The fopen_s function
Synopsis
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
errno_t fopen_s(FILE * restrict * restrict streamptr,
const char * restrict filename,
const char * restrict mode);
Runtime-constraints
None of streamptr, filename, or mode shall be a null pointer.
If there is a runtime-constraint violation, fopen_s does not attempt to open a file.
Furthermore, if streamptr is not a null pointer, fopen_s sets *streamptr to the
null pointer.
Description
The fopen_s function opens the file whose name is the string pointed to by
filename, and associates a stream with it.
The mode string shall be as described for fopen, with the addition that modes starting
with the character ’w’ or ’a’ may be preceded by the character ’u’, see below:
uw truncate to zero length or create text file for writing, default permissions
ua append; open or create text file for writing at end-of-file, default permissions
uwb truncate to zero length or create binary file for writing, default permissions
uab append; open or create binary file for writing at end-of-file, default
permissions
uw+ truncate to zero length or create text file for update, default permissions
ua+ append; open or create text file for update, writing at end-of-file, default
permissions
uw+b or uwb+ truncate to zero length or create binary file for update, default
permissions
ua+b or uab+ append; open or create binary file for update, writing at end-of-file,
default permissions
To the extent that the underlying system supports the concepts, files opened for writing
shall be opened with exclusive (also known as non-shared) access. If the file is being
created, and the first character of the mode string is not ’u’, to the extent that the
underlying system supports it, the file shall have a file permission that prevents other
users on the system from accessing the file. If the file is being created and first character
of the mode string is ’u’, then by the time the file has been closed, it shall have the
system default file access permissions (footnote 10).
If the file was opened successfully, then the pointer to FILE pointed to by streamptr
will be set to the pointer to the object controlling the opened file. Otherwise, the pointer
to FILE pointed to by streamptr will be set to a null pointer.
Returns
The fopen_s function returns zero if it opened the file. If it did not open the file or if
there was a runtime-constraint violation, fopen_s returns a non-zero value.
10) These are the same permissions that the file would have been created with by fopen.
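To show how the interface quoted above looks from calling code, here is a small sketch. The wrapper name open_for_write is hypothetical, and the preprocessor guard is an assumption about where fopen_s is available (MSVC, or a C library that advertises Annex K); elsewhere it falls back to fopen().

```cpp
#define __STDC_WANT_LIB_EXT1__ 1
#include <cstdio>

// Hypothetical wrapper: opens a file for writing, preferring fopen_s where the
// C library provides it and falling back to fopen elsewhere.
std::FILE* open_for_write(const char* filename) {
    std::FILE* fp = nullptr;
#if defined(_MSC_VER) || defined(__STDC_LIB_EXT1__)
    // Per the quoted text, a mode without the 'u' prefix asks for restrictive
    // permissions on a newly created file, where the system supports that.
    if (fopen_s(&fp, filename, "w") != 0) {
        return nullptr;   // on failure fopen_s also sets fp to a null pointer
    }
#else
    fp = std::fopen(filename, "w");
#endif
    return fp;
}
```

Note that the result comes back through the FILE** argument and the return value is only an error code, which is the main difference in calling convention from fopen().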

The fopen_s() function has been added by Microsoft to the C runtime with the following fundamental differences from fopen():
if the file is opened for writing ("w" or "a" specified in the mode) then the file is opened for exclusive (non-shared) access (if the platform supports it).
if the "u" specifier is used in the mode argument with the "w" or "a" specifiers, then by the time the file is closed, it will have system default permissions for other users to access the file (which may be no access if that's the system default).
if the "u" specifier is not used in those cases, then when the file is closed (or before) the permissions for the file will be set such that other users will not have access to the file.
Essentially it means that files the application writes are protected from other users by default.
They did not do this to fopen() due to the likelihood that existing code would break.
Microsoft has chosen to deprecate fopen() to encourage developers for Windows to make conscious decisions about whether the files their applications use will have loose permissions or not.
Jonathan Leffler's answer provides the proposed standardization language for fopen_s(). I added this answer hoping to make clear the rationale.

Or is it something else?
Some implementations of the FILE structure used by 'fopen' have the file descriptor defined as 'unsigned char'. This leaves you with a maximum of 255 simultaneously open files, minus stdin, stdout, and stderr.
While the value of being able to have 255 open files is debatable, of course, this implementation detail materializes on the Solaris 8 platform when you have more than 252 socket connections! What first appeared as a seemingly random failure to establish an SSL connection using libcurl in my application turned out to be caused by this, but it took deploying debug versions of libcurl and openssl and stepping the customer through a debugger script to finally figure it out.
While it's not entirely the fault of 'fopen', one can see the virtues of throwing off the shackles of old interfaces; the choice to deprecate might be based on the pain of maintaining binary compatibility with an antiquated implementation.

The new versions do parameter validation whereas the old ones didn't.
See this SO thread for more information.

Thread safety. fopen() uses a global variable, errno, while the fopen_s() replacement returns an errno_t and takes a FILE** argument to store the file pointer to.

Related

Retrieve information about an open file

Can I retrieve information about a file previously opened with fopen() using only the pointer it returned?
The reason I ask is that I am trying to write a RAII-style wrapper class for FILE *s, and I want to make it as general as possible, and one of the functions I imagined for it was a copy-like operation, that would take a FILE * as an argument, and create a new reference to the same file.
Under POSIX, I can create a duplicate of a file descriptor with dup()/dup2(), and even get how the file is being accessed with fcntl()'s F_GETFL operation. However, even if I do that to the underlying descriptor of a FILE *, it isn't enough for guessing properties such as whether the stream is text or binary (under POSIX, there is no real difference, but I want to be general), or its orientation towards char- or wchar_t-based text.
So, is there a way of learning about the stream I'm about to create a wrapper for, how far can I go, and how should I do it?
Thank you for your attention.

Is opening the SAME file in two different fstreams Undefined Behaviour?

This recently asked question has raised another interesting issue, as discussed in the comments to one of its answers.
To summarize: the OP there was having issues with code like that below, when subsequently attempting to read and write data from/to the two streams 'concurrently':
ifstream infile;
infile.open("accounts.txt");
ofstream outfile;
outfile.open("accounts.txt");
Although the issue, in itself, was successfully resolved, it has raised a question to which I cannot find an authoritative answer (and I've made some quite extensive searches of Stack Overflow and the wider web).
It is very clearly stated what happens when calling the open() method of a stream that is already associated with a file (cppreference), but what I cannot find an answer to is what happens when (as in this case) the file is already associated with a (different) stream.
If the stream is already associated with a file (i.e., it is already
open), calling this function fails.
I can see several possible scenarios here:
The second open call will fail and any attempted writes to it will also fail (but that is not the case in the cited question).
The second open call will 'override' the first, effectively closing it (this could explain the issues encountered in said code).
Both streams remain open but enter into a 'mutual clobbering' match regarding their internal file pointers and buffers.
We enter the realm of undefined (or implementation-defined) behaviour.
Note that, as the first open() call is made by an input stream, the operating system will not necessarily 'lock' the file, as it probably would for an output stream.
So, does anyone have a definitive answer to this? Or a citation from the Standard (cppreference will be 'acceptable' if nothing more authoritative can be found)?
basic_filebuf::open (and all things that depend on it, like fstream::open) has no statement about what will happen in this case. A filesystem may allow it or it may not.
What the standard says is that, if the file successfully opens, then you can play with it in accord with the interface. And if it doesn't successfully open, then there will be an error. That is, the standard allows a filesystem to permit it or forbid it, but it doesn't say which must happen. The implementation can even randomly forbid it. Or forbid you from opening any files in any way. All are (theoretically) valid.
To me, this falls even outside the 'implementation defined' field. The very same code will have different behaviour depending on the underlying filesystem or OS (some OSes forbid opening a file twice).
No.
Such a scenario is not discussed by the standard.
It's not even managed by the implementation (your compiler, standard library implementation etc).
The stream ultimately asks the operating system for access to that file in the desired mode, and it's up to the operating system to decide whether that access shall be granted at that time.
A simple analogy would be your program making some API call to a web application over a network. Perhaps the web application does not permit more than ten calls per minute, and returns some error code if you attempt more than that. But that doesn't mean your program has undefined behaviour in such a case.
C implementations exist for many different platforms, whose underlying file systems may handle such corner cases differently. For the Standard to mandate any particular corner-case behavior would have made the language practical only on platforms whose file systems behave in such fashion. Instead, the Standard regards such issues as being outside its jurisdiction (i.e. to use its own terminology, "Undefined Behavior"). That doesn't mean that implementations whose target OS offers useful guarantees shouldn't make such guarantees to programs when practical, but implementation designers are presumed to know more than the Committee about how best to serve their customers.
On the other hand, it may sometimes be helpful for an implementation not to expose the underlying OS behavior. Consider, for example, an OS that has no distinct "append" mode, so that code needing "open for append" must do an "open existing file for write" followed by a "seek to end of file". There, an attempt to open two streams for appending to the same file could result in data corruption when one stream writes part of a file and the other stream then rewrites that same portion. It may be helpful for an implementation that detects that condition to inject its own logic, either to ensure smooth merging of the data or to block the second open request. Either course of action might be better, depending upon an application's purpose, but--as noted above--the choice is outside the Standard's jurisdiction.
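Since the answers above agree that the outcome is decided by the platform rather than by the language, the practical approach is to check each stream after opening it. Below is a minimal sketch of such a probe; the names OpenResult and open_same_file_twice are invented for illustration, and the second stream uses append mode only so the probe does not truncate the file.

```cpp
#include <fstream>
#include <string>

// Result of trying to attach two independent streams to one file.
struct OpenResult {
    bool input_open;
    bool output_open;
};

// Hypothetical probe: opens `path` for reading, then again for appending,
// and reports what the platform allowed.
OpenResult open_same_file_twice(const std::string& path) {
    std::ifstream infile(path);
    std::ofstream outfile(path, std::ios::app);  // app avoids truncating the file
    return {infile.is_open(), outfile.is_open()};
}
```

Whether output_open comes back true depends on the operating system and its file-sharing rules, so code that relies on it should test both flags rather than assume.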
I open the zip file as a stream twice. The zip file contains some XML files.
std::ifstream file("filename");
zipstream *p1 = new zipstream(file);
zipstream *p2 = new zipstream(file);
p1->getNextEntry();
auto p3 = p1->rdbuf();
auto p4 = p2->rdbuf();
Then I see that the p3 address equals the p4 address, but the member variables differ between them, such as _IGfirst.
The contents of one of the XML files are as follows:
<test>
<one value="0.00001"/>
</test>
When the contents of the file are read in two threads at the same time, an error happens.
string One = p1->getPropertyValue("one");
// one = "0001two"

Creating a new file avoiding race conditions

I need to develop a C++ routine performing this apparently trivial task: create a file only if it does not exist, else do nothing/raise error.
As I need to avoid race conditions, I want to use the "ask forgiveness not permission" principle (i.e. attempting the intended operation and checking if it succeeded, as opposed to checking preconditions in advance), which, to my knowledge, is the only robust and portable method for this purpose [Wikipedia article][an example with getline].
Still, I could not find a way to implement it in my case. The best I could come up with is opening a fstream in app mode (or fopening with "a"), checking the output position with tellp (C++) or ftell (C) and aborting if such position is not zero. This has however two disadvantages, namely that if the file exists it gets locked (although for a short time) and its modification date is altered.
I checked other possible combinations of ios_base::openmode for fstream, as well as the mode strings of fopen but found no option that suited my needs. Further search in the C and C++ standard libraries, as well as Boost Filesystem, proved unfruitful.
Can someone point out a method to perform my task in a robust way (no collateral effects, no race conditions) without relying on OS-specific functions?
My specific problem is in Windows, but portable solutions would be preferred.
EDIT: The answer by BitWhistler completely solves the problem for C programs. Still, I am amazed that no C++ idiomatic solution seems to exist. Either one uses open with the O_EXCL attribute as proposed by Andrew Henle, which is however OS-specific (in Windows the attribute seems to be called _O_EXCL with an additional underscore [MSDN]) or one separately compiles a C11 file and links it from the C++ code. Moreover, the file descriptor obtained cannot be converted to a stream except with nonstandard extensions (e.g. GCC's __gnu_cxx::stdio_filebuf). I hope a future version of C++ will implement the "x" subattribute and possibly also a corresponding ios:: modifier for file streams.
The new C standard (C2011, which is not part of C++) adds a new standard subspecifier ("x"), that can be appended to any "w" specifier (to form "wx", "wbx", "w+x" or "w+bx"/"wb+x"). This subspecifier forces the function to fail if the file exists, instead of overwriting it.
source: http://www.cplusplus.com/reference/cstdio/fopen/
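For illustration, here is a minimal sketch of the "x" subspecifier in use. It assumes the C library behind <cstdio> accepts the C11 "wx" mode when called from C++, which is worth verifying on your toolchain; the helper name create_if_absent is made up for this example.

```cpp
#include <cstdio>

// Hypothetical helper: creates `path` only if it does not already exist.
// Returns true if this call created the file, false otherwise.
bool create_if_absent(const char* path) {
    // "wx": create for writing, fail if the file exists (C11 exclusive mode).
    std::FILE* fp = std::fopen(path, "wx");
    if (fp == nullptr) {
        return false;   // the file exists, or creation failed for another reason
    }
    std::fclose(fp);
    return true;
}
```

Because the existence check and the creation happen inside one fopen() call, there is no window between "check" and "create" for another process to slip into, and an existing file is neither opened for writing nor has its modification date touched.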

What does ifstream::open() really do?

Consider this code:
ifstream filein;
filein.open("y.txt");
When I use the open() function, what happens?
Does the file stream itself get opened?
or does the object's state change to open?
or both?
It's not clear if you want to know implementation details or standard requirements - but as for implementation details - it will call the underlying open system call on the operating system. On Linux for example this is called open. On Windows it is called CreateFile.
The filestream being open or closed is represented by its state. So if you change the state to open, the filestream is now open. Like a doorway. If you open it, you've changed its state to the open position. Then you can later close it, which involves changing its state to the closed position. Changing its state to open and opening the stream are the exact same thing.
The std::ifstream is set up to own a std::filebuf which is a class derived from std::streambuf. The stream buffer is managing buffering for streams in a generic way and abstracts the details of how a stream is accessed. For a std::filebuf the underlying stream is an operating system file accessed as needed. When std::ifstream::open() is called this call is mainly delegated to std::filebuf::open() which does the actual work. However, the std::ifstream will clear() its state bits if the call to std::filebuf::open() succeeds and set std::ios_base::failbit if the call fails. The file buffer will call the system's method to allocate a file handle and, if successful, arrange for this file handle to be released in its destructor or in the std::filebuf::close() function - whichever comes first. When calling std::ifstream::open() with the default arguments the system call will check that the file exists, is accessible, not too many file handles are open, etc. There is a std::ios_base::openmode parameter which can be used to modify the behavior in some ways, and different default flags are used when calling std::ofstream::open().
Whether the call to std::filebuf::open() has any other effects is up to the implementation. For example, the implementation could choose to obtain a sequence of bytes and convert them into characters. Since the user can override certain settings, in particular the std::locale (see the std::streambuf::pubimbue() function), it is unlikely that much will happen prior to the first read, though. In any case, none of the state flags would be affected by the outcome of any operation after opening the file itself.
BTW, the mentioned classes are actually all templates (std::basic_ifstream, std::basic_filebuf, std::basic_streambuf, and std::basic_ofstream) which are typedef'ed to the names used above for the instantiation working on char as a character type. There are similar typedefs using a w prefix for instantiations working on wchar_t. Interestingly, there are no typedefs for the char16_t and char32_t versions and it seems it will be a bit of work to get them instantiated as well.
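The state handling described in this answer can be observed directly. The following is a small sketch; OpenState and try_open are names invented for this example.

```cpp
#include <fstream>
#include <string>

// Hypothetical probe: reports what open() did to the stream.
struct OpenState {
    bool is_open;   // whether the underlying filebuf holds an open file
    bool failed;    // whether failbit (or badbit) is set on the stream
};

OpenState try_open(const std::string& path) {
    std::ifstream filein;
    filein.open(path);
    return {filein.is_open(), filein.fail()};
}
```

For a path that cannot be opened, is_open is false and failed is true; for a readable file it is the other way around.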
If you think logically, ifstream is just the stream in which we will get our file contents. The parameters, we provide to ifstream.open() will open the file and mark it as open. When the file is marked as open, it will not allow you to do some operations on file like renaming a file as it is opened by some program. It will allow you to do the same after you close the stream. ifstream - imo is only the helper class to access files.

How can I create a temporary file for writing in C++ on a Linux platform?

In C++, on Linux, how can I write a function to return a temporary filename that I can then open for writing?
The filename should be as unique as possible, so that another process using the same function won't get the same name.
Use one of the standard library "mktemp" functions: mktemp/mkstemp/mkstemps/mkdtemp.
Edit: plain mktemp can be insecure - mkstemp is preferred.
tmpnam(), or anything that gives you a name is going to be vulnerable to race conditions. Use something designed for this purpose that returns a handle, such as tmpfile():
#include <stdio.h>
FILE *tmpfile(void);
The GNU libc manual discusses the various options available and their caveats:
http://www.gnu.org/s/libc/manual/html_node/Temporary-Files.html
Long story short, only mkstemp() or tmpfile() should be used, as others have mentioned.
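A minimal sketch of the mkstemp() route on Linux follows. The helper name make_temp_file and the /tmp/example prefix are assumptions made for illustration; mkstemp() itself is the POSIX function, which creates and opens the file in one step so no other process can claim the name in between.

```cpp
#include <cstdlib>    // mkstemp (POSIX, declared in <stdlib.h>)
#include <string>
#include <unistd.h>   // close, unlink

// Hypothetical helper: creates and opens a unique temporary file.
// Returns the open descriptor (or -1) and stores the chosen name in `path_out`.
int make_temp_file(std::string& path_out) {
    std::string tmpl = "/tmp/exampleXXXXXX";   // assumed prefix; must end in XXXXXX
    int fd = mkstemp(&tmpl[0]);                // replaces the X's in place
    if (fd != -1) {
        path_out = tmpl;
    }
    return fd;
}
```

The caller owns the descriptor and the file: write through the descriptor (or wrap it with fdopen()), then close() it and unlink() the path when done.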
man tmpfile
The tmpfile() function opens a unique temporary file in binary
read/write (w+b) mode. The file will be automatically deleted when it
is closed or the program terminates.
mktemp should work or else get one of the plenty of available libraries to generate a UUID.
The tmpnam() function in the C standard library is designed to solve just this problem. There's also tmpfile(), which returns an open file handle (and automatically deletes it when you close it).
You should simply check if the file you're trying to write to already exists.
This is a locking problem.
Files also have owners so if you're doing it right the wrong process will not be able to write to it.