Recently I ran into a problem at work where you have two functions: one opens a file descriptor (a local variable in that function) and passes it to another function, where it is used for reading or writing. When one of the read/write operations fails, the function doing the read/write closes this file descriptor and returns.
The question is: whose responsibility is it to close the file descriptor, or let's say do the cleanup:
the function which created the fd
the function which experienced the error while read/write
Is there a design rule for this kind of case, i.e. creation and cleanup?
BTW, the problem was that both functions attempted to close the fd, which resulted in a crash on the second call to close.
There are two parts to this answer — the general design issue and the detailed mechanics for your situation.
General Design
Handling resources such as file descriptors correctly, and making sure they are released correctly, is an important design issue. There are multiple ways to manage the problem that work. There are some others that don't.
Your tags use C and C++; be aware that C++ has extra mechanisms available to it.
In C++, the RAII — Resource Acquisition Is Initialization — idiom is a great help. When you acquire a resource, you ensure that whatever acquires the resource initializes a value that will be properly destructed and will release the resource when destructed.
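As a rough sketch of what that can look like for a file descriptor (the class name and interface here are made up for illustration, and error handling is kept minimal):

#include <fcntl.h>    // ::open
#include <unistd.h>   // ::close
#include <stdexcept>
#include <string>

class FdGuard {
public:
    FdGuard(const std::string& path, int flags)
        : fd_(::open(path.c_str(), flags))
    {
        if (fd_ == -1)
            throw std::runtime_error("cannot open " + path);
    }

    ~FdGuard() { if (fd_ != -1) ::close(fd_); }   // release happens here, exactly once

    FdGuard(const FdGuard&) = delete;             // a single owner, so no copies
    FdGuard& operator=(const FdGuard&) = delete;

    int fd() const { return fd_; }                // callees borrow the descriptor

private:
    int fd_;
};

A callee that is handed guard.fd() can read, write, and report errors, but it never closes the descriptor; the destructor of the guard is the one place where close happens.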
In both languages, it is generally best if the function responsible for allocating the resource also releases it. If a function opens a file, it should close it. If a function is given an open file, it should not close the file.
In the comments, I wrote:
Generally, the function that opened the file should close it; the function that experienced the error should report the error, but not close the file. However, you can work it how you like as long as the contract is documented and enforced — the calling code needs to know when the called code closed the file to avoid double closes.
It would generally be a bad design for the called function to close the file sometimes (on error), but not other times (no error). If you must go that way, then it is crucial that the calling function is informed that the file is closed; the called function must return an error indication that tells the calling code that the file is no longer valid and should neither be used nor closed. As long as the information is relayed and handled, there isn't a problem — but the functions are harder to use.
Note that if a function is designed to return an opened resource (it is a function that's responsible for opening a file and making it available to the function that called it), then the responsibility for closing the file falls on the code that calls the opening function. That is a legitimate design; you just have to make sure that there is a function that knows how to close it, and that the calling code does close it.
Similar comments apply to memory allocation. If a function allocates memory, you must know when the memory will be freed, and ensure that it is freed. If it was allocated for the purposes of the current function and functions it calls, then the memory should be released before return. If it was allocated for use by the calling functions, then the responsibility for release transfers to the calling functions.
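For example, a hypothetical pair of functions where ownership of the allocation transfers to the caller (the names are made up):

#include <cstdio>
#include <cstdlib>
#include <cstring>

// Allocates the result, so ownership transfers to the caller.
char* make_greeting(const char* name)
{
    std::size_t len = std::strlen("Hello, ") + std::strlen(name) + 1;
    char* msg = static_cast<char*>(std::malloc(len));
    if (msg != nullptr)
    {
        std::strcpy(msg, "Hello, ");
        std::strcat(msg, name);
    }
    return msg;
}

void caller()
{
    char* msg = make_greeting("world");
    if (msg != nullptr)
    {
        std::puts(msg);   // use it ...
        std::free(msg);   // ... and, as the new owner, release it
    }
}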
Detailed mechanics
Are you sure you're using file descriptors and not FILE * (file streams)? It's unlikely that closing a file descriptor twice would cause a crash (error, yes, but not a crash). OTOH, calling fclose() on an already closed file stream could cause problems.
In general, in C, you pass file descriptors, which are small integers, by value, so there isn't a way to tell the calling function that the file descriptor is no longer valid. In C++, you could pass them by reference, though it is not conventional to do so. Similarly with FILE *; they're most usually passed by value, not by reference, so there isn't a way to tell the calling code that the file is not usable any more by modifying the value passed to the function.
You can invalidate a file descriptor by setting it to -1; that is never a valid file descriptor. Using 0 is a bad idea; it is equivalent to using standard input. You can invalidate a file stream by setting it to 0 (aka NULL). Passing the null pointer to functions that try to use the file stream will tend to cause crashes. Passing an invalid file descriptor typically won't cause crashes — the calls may fail with EBADF set in errno, but that's the limit of the damage, usually.
Using file descriptors, you will seldom get a crash because the file descriptor is no longer valid. Using file streams, all sorts of things can go wrong if you try using an invalid file stream pointer.
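A small sketch of that invalidation idea, using the (non-conventional) pass-by-reference option mentioned above; the function names are made up:

#include <unistd.h>   // ::close
#include <cstdio>     // std::fclose, FILE

// Close the descriptor and mark it invalid for the caller.
void close_and_invalidate(int& fd)
{
    if (fd != -1)
    {
        ::close(fd);
        fd = -1;        // -1 is never a valid descriptor
    }
}

// The same idea for a file stream: close it and null the pointer.
void fclose_and_invalidate(std::FILE*& fp)
{
    if (fp != nullptr)
    {
        std::fclose(fp);
        fp = nullptr;   // callers can test for nullptr before any further use
    }
}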
Related
Copying files from one directory to another directory using this code:
auto const copyOption = std::filesystem::copy_options::recursive | std::filesystem::copy_options::skip_symlinks;
std::filesystem::copy("/mnt/iso", "/mnt/usb", copyOption);
Copying big files can take a long time.
So how can I check when the copy has ended?
How to check if std::filesystem::copy is ended?
A call to a function ends either by (1) returning, (2) throwing, or (3) terminating the program.
If the execution of the program proceeds to the next statement (or next sibling expression), then you know that the function call has ended by (1) returning. If the execution proceeds to unwind and eventually enters a catch block, then you know that the function has ended by (2) throwing. If the program no longer runs, then you know that the function has (3) terminated the program (std::filesystem::copy won't do this one directly, although it can happen if it throws without being caught). If none of those have happened, then you know that the function call hasn't ended yet.
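So, as a minimal sketch using the call from the question, "the copy has ended" simply means "the call returned or threw":

#include <filesystem>
#include <iostream>

int main()
{
    auto const copyOption = std::filesystem::copy_options::recursive
                          | std::filesystem::copy_options::skip_symlinks;
    try
    {
        std::filesystem::copy("/mnt/iso", "/mnt/usb", copyOption);
        // Reaching this line means the call returned, i.e. the copy has ended.
        std::cout << "copy finished\n";
    }
    catch (const std::filesystem::filesystem_error& e)
    {
        // The call also ended here, but by throwing.
        std::cerr << "copy failed: " << e.what() << '\n';
    }
}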
For example, trying to unmount a USB key just after the copy has finished can take one or two minutes longer if you have big files.
There is no standard way in C++ to verify when data has physically been written onto a device.
You've tagged [ubuntu], so you may be interested in the fsync function from the POSIX standard. You cannot use it in conjunction with std::filesystem::copy; you'll need to use the other POSIX file manipulation functions instead. fsync should guarantee that any writes have been passed on to the device by the kernel. From then on, you rely on the hardware.
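A rough sketch of what that POSIX-level copy might look like (the function name and buffer size are arbitrary, and error handling is compressed):

#include <fcntl.h>    // ::open
#include <unistd.h>   // ::read, ::write, ::fsync, ::close

// Copy src to dst with POSIX calls, then push the data out to the device.
bool copy_and_sync(const char* src, const char* dst)
{
    int in = ::open(src, O_RDONLY);
    if (in == -1)
        return false;
    int out = ::open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) { ::close(in); return false; }

    char buf[64 * 1024];
    ssize_t n;
    bool ok = true;
    while ((n = ::read(in, buf, sizeof buf)) > 0)
        if (::write(out, buf, static_cast<size_t>(n)) != n) { ok = false; break; }

    if (n < 0) ok = false;                     // read error
    if (ok && ::fsync(out) != 0) ok = false;   // fsync: data handed over to the device
    ::close(in);
    ::close(out);
    return ok;
}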
The title pretty much says it all. I'm new to Winsock, and I need to know what the scope of a SOCKET object is.
Do I need to worry about it going out of scope when using it in a class member variable (since when it's returned, it's not dynamic memory)?
Thanks.
I'm pretty sure the answer to this is no, but since I can't find the info, I figured I would put it out there, for quick reference to others in the future.
The MSDN documentation for socket says the following:
When a session has been completed, a closesocket must be performed.
And the accompanying sample does just that. The documentation for closesocket is more forceful:
An application should always have a matching call to closesocket for each successful call to socket to return any socket resources to the system.
So as long as you keep the SOCKET descriptor somewhere you can use it until you call closesocket. You could consider putting it inside your own RAII type (or use an existing one) to avoid leaks. If you "forget" the descriptor, you will leak the internal resources.
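A minimal sketch of such an RAII wrapper (the class name is invented, and it assumes WSAStartup has already been called):

#include <winsock2.h>

class ScopedSocket {
public:
    explicit ScopedSocket(SOCKET s = INVALID_SOCKET) : s_(s) {}
    ~ScopedSocket() { if (s_ != INVALID_SOCKET) closesocket(s_); }   // the matching closesocket

    ScopedSocket(const ScopedSocket&) = delete;              // one owner only
    ScopedSocket& operator=(const ScopedSocket&) = delete;

    SOCKET get() const { return s_; }                        // borrow the descriptor

private:
    SOCKET s_;
};

// Usage:
// ScopedSocket sock(socket(AF_INET, SOCK_STREAM, IPPROTO_TCP));
// if (sock.get() == INVALID_SOCKET) { /* handle the error */ }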
Internally, SOCKET is just an ID that refers to some internal Windows structure. You can work with it much as you would with a HANDLE or an ordinary pointer.
That is, nothing will happen if it goes out of scope (although it can leak resources, just as a HANDLE does if you forget CloseHandle), and if you copy it you get two identical socket values referring to the same Windows structure, and so on.
I'm creating a file format where I'd like to write an explicit message into the file indicating that the writer ran to completion. I've had problems in the past with generating files where the generating program crashed and the file was truncated without me realizing, since without an explicit marker there's no way for reading programs to detect a file is incomplete.
So I have a class that is used for writing these files. Now usually if you have an "open" operation and a "close" operation you want to use RAII, so I would put the code to write the end of file marker in the destructor. This way the user can't forget. But in a situation where writing doesn't complete because an exception is thrown the destructor will still be run -- in which case we don't want to write the message so readers will know the file is incomplete.
This seems like something that could happen any time there's a "commit" sort of operation. You want RAII so you can't forget to commit, but you also don't want to commit when an exception occurs. The temptation here is to use std::uncaught_exceptions, but I think that's a code smell.
What's the usual solution to this? Just require that people remember? I'm concerned this will be a stumbling block every time someone tries to use my API.
One way to tackle this problem is to implement a simple framing system where you can define a header that is only filled in completely at the end of the write. Include a SHA256 hash to make the header useful for verifying the contents of the file. This is usually a lot more convenient than having to read bytes at the end of a file.
In terms of implementation you write out a header with some fields deliberately zeroed out, write the contents of the payload while feeding that data through your hashing method, and then seek back to the header and re-write that with the final values. The file starts out in an obviously invalid state and ends up valid only if everything ran to completion.
You could wrap up all of this in a stream handle that handles the implementation details so as far as the calling code is concerned it's just opening a regular file. Your reading version would throw an exception if the header is incomplete or invalid.
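A stripped-down sketch of that framing idea, with the hash omitted for brevity and a single "complete" flag standing in for it (all names and the magic value are hypothetical):

#include <cstdint>
#include <fstream>
#include <string>

// Header written at offset 0; `complete` stays 0 until the very end.
struct FileHeader {
    std::uint32_t magic;
    std::uint32_t complete;      // 0 = truncated/invalid, 1 = fully written
    std::uint64_t payloadSize;
};

void write_file(const std::string& path, const std::string& payload)
{
    std::ofstream out(path, std::ios::binary);

    FileHeader hdr{0x46544D59u, 0, 0};                        // deliberately invalid at first
    out.write(reinterpret_cast<const char*>(&hdr), sizeof hdr);

    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));

    hdr.complete = 1;                                          // only now mark it valid
    hdr.payloadSize = payload.size();
    out.seekp(0);
    out.write(reinterpret_cast<const char*>(&hdr), sizeof hdr);
}

A reader that finds complete == 0 (or a wrong magic value) knows the writer never ran to completion.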
For your example, it seems like RAII would work fine if you add a commit method which the user of your class calls when they are done writing to a file.
#include <string>

class MyFileFormat {
public:
    MyFileFormat() : committed_(false) {}

    ~MyFileFormat() {
        if (committed_) {
            // write the completion footer (I hope this doesn't throw!)
        }
        // close the underlying stream...
    }

    bool open(const std::string& path) {
        committed_ = false;
        // open the underlying stream...
        return true;  // report whether the underlying stream opened successfully
    }

    void commit() {
        committed_ = true;
    }

private:
    bool committed_;  // set only once the caller says the file is complete
};
The onus is on the user to call commit when they're done, but at least you can be sure that resources get closed.
For a more general pattern for cases like this, take a look at ScopeGuards.
ScopeGuards would move the responsibility for cleanup out of your class, and can be used to specify an arbitrary "cleanup" callback in the event that the ScopeGuard goes out of scope and is destroyed before being explicitly dismissed. In your case, you might extend the idea to support callback for both failure-cleanup (e.g. close file handles), and success-cleanup (e.g. write completion footer and close file handles).
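A minimal sketch of that two-callback guard (the names are invented; note that the success callback runs in a destructor, so it should not be allowed to throw):

#include <functional>
#include <utility>

// Runs one of two callbacks at scope exit, depending on whether commit() was called.
class CommitGuard {
public:
    CommitGuard(std::function<void()> onSuccess, std::function<void()> onFailure)
        : onSuccess_(std::move(onSuccess)), onFailure_(std::move(onFailure)) {}

    ~CommitGuard()
    {
        if (committed_) onSuccess_();   // e.g. write the completion footer, then close
        else            onFailure_();   // e.g. just close (or delete) the file
    }

    void commit() { committed_ = true; }

private:
    std::function<void()> onSuccess_;
    std::function<void()> onFailure_;
    bool committed_ = false;
};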
I've handled situations like that by writing to a temporary file. Even if you're appending to a file, append to a temporary copy of the file.
In your destructor, you can check std::uncaught_exception() to decide whether your temporary file should be moved to its intended location.
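A sketch of that approach, using the C++17 std::uncaught_exceptions() counter (with the older std::uncaught_exception() the check is just a boolean); the class name is made up and error handling in the destructor is omitted:

#include <exception>
#include <filesystem>
#include <fstream>
#include <string>

class AtomicFileWriter {
public:
    explicit AtomicFileWriter(std::string path)
        : path_(std::move(path)), tmp_(path_ + ".tmp"),
          out_(tmp_, std::ios::binary),
          exceptionsAtStart_(std::uncaught_exceptions()) {}

    std::ofstream& stream() { return out_; }

    ~AtomicFileWriter()
    {
        out_.close();
        // Only move the file into place if no new exception is in flight.
        if (std::uncaught_exceptions() == exceptionsAtStart_)
            std::filesystem::rename(tmp_, path_);
        else
            std::filesystem::remove(tmp_);
    }

private:
    std::string path_, tmp_;
    std::ofstream out_;
    int exceptionsAtStart_;
};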
Why do C++ Standard Library streams use open()/close() semantics decoupled from object lifetime? Closing on destruction might still technically make the classes RAII, but decoupling acquisition from release leaves holes in a scope where a handle can refer to nothing, and run-time checks are needed to catch that.
Why did the library designers choose their approach over having opening only in constructors that throw on a failure?
void foo() {
    std::ofstream ofs;
    ofs << "Can't do this!\n"; // XXX
    ofs.open("foo.txt");
    // Safe access requires explicit checking after open().
    if (ofs) {
        // Other calls still need checks but must be shielded by an initial one.
    }
    ofs.close();
    ofs << "Whoops!\n"; // XXX
}

// This approach would seem better IMO:
void bar() {
    std_raii::ofstream ofs("foo.txt"); // throw on failure and catch wherever
    // do whatever, then close ofs on destruction ...
}
A better wording of the question might be why access to a non-opened fstream is ever worth having. Controlling open file duration via handle lifetime does not seem to me to be a burden at all, but actually a safety benefit.
Although the other answers are valid and useful, I think the real reason is simpler.
The iostreams design is much older than a lot of the Standard Library, and predates wide use of exceptions. I suspect that in order to be compatible with existing code, the use of exceptions was made optional, not the default for failure to open a file.
Also, your question is only really relevant to file streams, the other types of standard stream don't have open() or close() member functions, so their constructors don't throw if a file can't be opened :-)
For files, you may want to check that the close() call succeeded, so you know if the data got written to disk, so that's a good reason not to do it in the destructor, because by the time the object is destroyed it is too late to do anything useful with it and you almost certainly don't want to throw an exception from the destructor. So a filebuf will call close in its destructor, but you can also do it manually before destruction if you want to.
In any case, I don't agree that it doesn't follow RAII conventions...
Why did the library designers choose their approach over having opening only in constructors that throw on a failure?
N.B. RAII doesn't mean you can't have a separate open() member in addition to a resource-acquiring constructor, or you can't clean up the resource before destruction e.g. unique_ptr has a reset() member.
Also, RAII doesn't mean you must throw on failure, or an object can't be in an empty state e.g. unique_ptr can be constructed with a null pointer or default-constructed, and so can also point to nothing and so in some cases you need to check it before dereferencing.
File streams acquire a resource on construction and release it on destruction - that is RAII as far as I'm concerned. What you are objecting to is requiring a check, which smells of two-stage initialization, and I agree that is a bit smelly. It doesn't make it not RAII though.
In the past I have solved the smell with a CheckedFstream class, which is a simple wrapper that adds a single feature: throwing in the constructor if the stream couldn't be opened. In C++11 that's as simple as this:
#include <fstream>
#include <string>

struct CheckedFstream : std::fstream
{
    CheckedFstream() = default;
    CheckedFstream(std::string const& path, std::ios::openmode m = std::ios::in | std::ios::out)
        : std::fstream(path, m)
    { if (!is_open()) throw std::ios::failure("Could not open " + path); }
};
This way you get more and nothing less.
You get the same: you can still open the file via the constructor, and you still get RAII: the file is automatically closed when the object is destroyed.
You get more: you can use the same stream to reopen another file, and you can close the file when you want, without being restricted to waiting for the object to go out of scope or be destroyed (this is very important).
You get nothing less: the advantage you see is not real. You say that your way you don't have to check at each operation. This is false: the stream can fail at any time, even if it successfully opened the file.
As for error checking vs throwing exceptions, see #PiotrS's answer. Conceptually I see no difference between having to check the return status and having to catch an error. The error is still there; the difference is how you detect it. But as #PiotrS points out, you can opt for both.
The library designers gave you an alternative:
std::ifstream file{};
file.exceptions(std::ifstream::failbit | std::ifstream::badbit);
try
{
    file.open(path); // now it will throw on failure
}
catch (const std::ifstream::failure& e)
{
}
The standard library file streams do provide RAII, in the sense that calling the destructor on one will close any file which happens to be open. At least in the case of output, however, this is an emergency measure, which should only be used if you have encountered another error, and are not going to use the file which was being written anyway. (Good programming practice would be to delete it.) Generally, you need to check the status of the stream after you've closed it, and this is an operation which can fail, so it shouldn't be done in the destructor.
For input, it's not so critical, since you'll have checked the status after the last input anyway, and most of the time will have read until an input fails. But it does seem reasonable to have the same interface for both; from a programming point of view, however, you can usually just let the close in the destructor do its job on input.
With regards to open: you can just as easily do the open in the constructor, and for isolated uses like you show, this is probably the preferred solution. But there are cases where you might want to reuse an std::filebuf, opening it and closing it explicitly, and of course, in almost all cases, you will want to handle a failure to open the file immediately, rather than through some exception.
It depends on what you are doing, reading or writing.
You can encapsulate an input stream in an RAII way, but the same is not true for output streams. If the destination is a disk file or a network socket, never put fclose/close in the destructor, because you need to check the return value of fclose and there is no way to report an error that occurs in a destructor. See How can I handle a destructor that fails.
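In other words, for output you want the close to be an explicit, checkable operation, something like this sketch:

#include <cstdio>

// Returns false if either the writes or the final flush/close failed.
bool write_and_close(const char* path, const char* data, std::size_t len)
{
    std::FILE* fp = std::fopen(path, "wb");
    if (fp == nullptr)
        return false;

    bool ok = (std::fwrite(data, 1, len, fp) == len);

    // fclose can fail (e.g. when the final flush hits a full disk),
    // so its return value has to be checked, which a destructor cannot report.
    if (std::fclose(fp) != 0)
        ok = false;

    return ok;
}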
I am working with a program where my code calls a third party library which uses boost and shared_pointers to create a large and complex structure. This structure is created in a method that I call and at the end of the method I know that the program is finished.
For a large sample that I am handling the code to handle the processing takes 30 minutes and the boost code called automatically at exit takes many hours. Exiting the program without releasing the memory and spending all that time would be a perfectly acceptable outcome.
I tried
vector *iddListV = new vector(); // this WILL leak memory
with all the relevant structures added to the vector but this does not help.
I also tried calling exit(0); before reaching the end of the subroutine. This also causes the boost code to spend many hours trying to release pointers.
How do I get a C++ program (Microsoft C++ on Windows, if that matters) to exit abruptly without calling the boost destructors?
My constraints are: I can call any function before the boost structures are allocated, but I cannot modify the code once it starts running.
_Exit quits without calling any destructors.
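For example (a minimal sketch; note that _Exit does not flush stdio buffers either):

#include <cstdlib>

int main()
{
    // ... build the large boost structures and do the real work here ...

    std::_Exit(0);   // ends the process now: no destructors, no atexit handlers
}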
If you're unconcerned about portability, you can call TerminateProcess(). But remember to take care that you are absolutely sure that your program is in a state which is ready to terminate. For example, if you terminate before I/O has had a chance to flush, then your file data and network streams may become invalid.
It is possible, in a portable manner, to do:
#include <exception>
...
std::terminate();
However, there's a big gotcha, in that, at least on linux, this may cause a core dump. (I'm really not sure what the behavior is on Windows).
It should be noted that the behavior is implementation-defined as far as whether or not destructors are called. Citing §15.5.1 p2:
In the situation where the search for a handler (15.3) encounters the outermost block of a function with a noexcept-specification that does not allow the exception (15.4), it is implementation-defined whether the stack is unwound, unwound partially, or not unwound at all before std::terminate() is called.
Additionally in §18.8.3.4 P1:
Remarks: Called by the implementation when exception handling must be abandoned for any of several reasons (15.5.1), in effect immediately after evaluating the throw-expression (18.8.3.1). May also be called directly by the program.
C++11 also defines the function std::quick_exit(int status) that can be used in a similar manner (presumably without a coredump). This function is available from <cstdlib>.
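A sketch of that variant; handlers registered with std::at_quick_exit still run, but destructors and atexit handlers do not:

#include <cstdio>
#include <cstdlib>

int main()
{
    // Handlers registered here still run on quick_exit, so you can flush
    // anything you genuinely care about before the process goes away.
    std::at_quick_exit([] { std::puts("final bookkeeping"); });

    // ... long-running work with the big boost structure ...

    std::quick_exit(0);   // skips static/local destructors and atexit handlers
}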