Correct way of using fdopen - c++

I mean to associate a file descriptor with a file pointer and use that for writing.
I put together program io.cc below:
#include <cstdio>
#include <cstring>
#include <iostream>
#include <unistd.h>
using namespace std;

int main() {
    ssize_t nbytes;
    const int fd = 3;
    char c[100] = "Testing\n";
    nbytes = write(fd, (void *) c, strlen(c)); // Line #1
    FILE * fp = fdopen(fd, "a");
    fprintf(fp, "Writing to file descriptor %d\n", fd);
    cout << "Testing alternate writing to stdout and to another fd" << endl;
    fprintf(fp, "Writing again to file descriptor %d\n", fd);
    close(fd); // Line #2
    return 0;
}
I can alternately comment out line 1 and/or line 2, compile, and run
./io 3> io_redirect.txt
and check the contents of io_redirect.txt.
Whenever line 1 is not commented out, it produces the expected line Testing\n in io_redirect.txt.
If line 2 is commented out, I get the expected lines
Writing to file descriptor 3
Writing again to file descriptor 3
in io_redirect.txt.
But if line 2 is not commented out, those lines do not show up in io_redirect.txt.
Why is that?
What is the correct way of using fdopen?
NOTE.
This seems to be the right approach for a (partial) answer to Smart-write to arbitrary file descriptor from C/C++
I say "partial" since I would be able to use C-style fprintf.
I still would like to also use C++-style stream<<.
EDIT:
I was forgetting about fclose(fp).
That "closes" part of the question.

Why is that?
The opened stream ("stream" is an opened FILE*) is block buffered, so nothing gets written to the destination before the file is flushed. Exiting from an application closes all open streams, which flushes the stream.
Because you close the underlying file descriptor before flushing the stream, the behavior of your program is undefined. I would really recommend reading POSIX 2.5.1, Interaction of File Descriptors and Standard I/O Streams (which is written in horrible language, nonetheless), from which:
... if two or more handles are used, and any one of them is a stream, the application shall ensure that their actions are coordinated as described below. If this is not done, the result is undefined.
...
For the first handle, the first applicable condition below applies. ...
...
If it is a stream which is open for writing or appending (but not also open for reading), the application shall either perform an fflush(), or the stream shall be closed.
A "handle" is a file descriptor or a stream. An "active handle" is the last handle that you did something with.
The fp stream is the active handle that is open for appending to file descriptor 3. Because fp is an active handle and is not flushed and you switch the active handle to fd with close(fd), the behavior of your program is undefined.
My guess at what most probably happens: your C standard library implementation calls fflush(fp) after main returns. Because fd is already closed, the internal write(3, ...) call returns an error and nothing is written to the output.
What is the correct way of using fdopen?
The usage you presented is the correct way of using fdopen.

Related

How do I get the content of llvm::MemoryBuffer when reading STDIN?

I am using llvm::MemoryBuffer::getFileOrSTDIN("-") and, according to the specification, it should Open the specified file as a MemoryBuffer, or open stdin if the Filename is "-".
Now, in the following context:
auto Source = llvm::MemoryBuffer::getFileOrSTDIN(File);
if (std::error_code err = Source.getError()) {
llvm::errs() << err.message();
} else{
someFunction(std::move(*Source), File, makeOutputWriter(Format, llvm::outs()),
IdentifiersOnly, DumpAST);
}
it blocks on the first line (when File == "-"); as expected as the STDIN never closes.
When a special char appears in STDIN, let's say <END_CHAR>, I know that I am finished reading for a given task. How could I close STDIN in this situation and move on to someFunction?
You can always close the stdin file descriptor using close, i.e. close(0). If you check llvm::MemoryBuffer's source, you'll see that getFileOrSTDIN() basically boils down to a call to llvm::MemoryBuffer::getMemoryBufferForStream() with the first argument (the file descriptor) set to 0.
Also, see this SO answer.
The key combination that signals end-of-input on standard input is Ctrl-D (on *nix at least) when typing at the command line (have a look here).

How should I manage ::std::cout after changing file descriptor 1 to refer to a different file?

I would like to do dup2(fd, 1); close(fd); and have ::std::cout write to the new fd 1. How can I reset the state of ::std::cout so nothing goes funny? For example, is flushing beforehand sufficient? Or is there more to do than that?
I'm also curious about the same thing with ::std::cin.
Is there a standard mechanism for resetting these if you change out the file descriptors they're using underneath them?
To be clear, my goal here is basically to redirect my own input and output someplace else. I want to not have the process inadvertently burping up something on its parent's stdout or attempting to consume anything from its parent's stdin. And I never want to touch my parent's stdin or stdout ever again. I want to forget they ever existed.
And I most especially do not want to inadvertently ever send output to the same device my parent is using on a different file descriptor.
My goal is to have cin and cout lead to completely different places than they did when the process started, and to never ever touch in any way the places where they used to lead. Ever!
Option 1: set stdin & stdout
According to cppreference.com:
By default, all eight standard C++ streams are synchronized with their respective C streams.
And as long as you don't explicitly call sync_with_stdio(false), they'll stay that way. What does that mean? The following:
In practice, this means that the synchronized C++ streams are unbuffered, and each I/O operation on a C++ stream is immediately applied to the corresponding C stream's buffer. This makes it possible to freely mix C++ and C I/O.
So, flush()-ing your cin & cout before dup()-ing them should be enough, since they should be in a consistent state.
If you wish to work with files for example, you could use:
if (freopen("input.txt", "r", stdin) == NULL) {
// Handle error, errno is set to indicate error
}
if (freopen("output.txt", "w", stdout) == NULL) {
// Handle error, errno is set to indicate error
}
Note 1: Setting the global extern FILE * stdin or stdout won't work because it simply changes a single instance of a pointer to the relevant FILE struct maintained by the C library. Any module that copied this pointer at any moment prior to this change will continue using the old FILE. A specific example is libc++'s implementation of cout, which copies FILE * stdout to a private member during the object's initialization. freopen, on the other hand, changes the internal FILE structure itself to use the newly opened file, affecting anyone who holds a FILE * to it.
Note 2: When using dup() flavors (rather than freopen()), we are changing the underlying fd, rather than the FILE*. The freopen() method does more than that. From POSIX:
The freopen() function shall first attempt to flush the stream associated with stream as if by a call to fflush(stream). Failure to flush the stream successfully shall be ignored. If pathname is not a null pointer, freopen() shall close any file descriptor associated with stream. Failure to close the file descriptor successfully shall be ignored. The error and end-of-file indicators for the stream shall be cleared.
dup()-ing might work, but it can be tricky, since it won't affect other properties of the FILE*, including: character width, buffering state, the buffer, I/O mode, binary/text mode indicator, end-of-file status indicator, error status indicator, file position indicator and (since C++17) the reentrant lock used to prevent data races.
When possible, I suggest using freopen. Otherwise, you could follow the steps described by yourself (fflush(), clearerr()). Skipping fclose() will be wise, since we won't be able to reopen the same internal FILE by any of the API methods.
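For the dup2() route from the question, the steps above can be sketched as follows. This is a minimal sketch assuming the default sync_with_stdio(true) state; redirect_stdout_to is a hypothetical helper name, and the open() flags and file mode are illustrative choices rather than anything the question specified.

```cpp
#include <cstdio>
#include <fcntl.h>
#include <iostream>
#include <unistd.h>

// Makes file descriptor 1 refer to `path`. Returns true on success.
bool redirect_stdout_to(const char *path) {
    std::cout.flush();          // push C++-side output to the C stream
    if (fflush(stdout) != 0) {  // push buffered C output to the old fd 1
        return false;
    }
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return false;
    }
    bool ok = dup2(fd, STDOUT_FILENO) >= 0;
    if (fd != STDOUT_FILENO) {
        close(fd);              // fd 1 now refers to the file; drop the extra descriptor
    }
    clearerr(stdout);           // reset error/EOF indicators left from the old target
    return ok;
}
```

Flushing before the switch matters because anything still sitting in the stdio buffer would otherwise be written to the new target later instead of the old one.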
Option 2: set cin's & cout's rdbuf()
Another approach, as some comments proposed, is to replace cin's and cout's underlying buffer using rdbuf().
What are your options here?
File streams: Open ifstream & ofstream and use them:
std::ifstream fin("input.txt");
if (!fin) {
// Handle error
}
cin.rdbuf(fin.rdbuf());
std::ofstream fout("output.txt");
if (!fout) {
// Handle error
}
cout.rdbuf(fout.rdbuf());
Network streams: Use Boost's boost::asio::ip::tcp::iostream (its rdbuf() returns a buffer derived from std::streambuf, so it will work):
boost::asio::ip::tcp::iostream stream("www.boost.org", "http");
if (!stream) {
// Handle error
}
cin.rdbuf(stream.rdbuf());
cout.rdbuf(stream.rdbuf());
// GET request example
cout << "GET /LICENSE_1_0.txt HTTP/1.0\r\n";
cout << "Host: www.boost.org\r\n";
cout << "Accept: */*\r\n";
cout << "Connection: close\r\n\r\n";
cout.flush();
std::string response;
std::getline(cin, response);
Custom streams: Use your own custom wrapper for std::streambuf. See an example here.
You may create (or use an existing) library for a socket_streambuf class and associate this to std::cout/std::cin:
socket_streambuf<char> buffer{ "127.0.0.1:8888" }; // will call socket(), connect() or throw on failure
std::cout.rdbuf(&buffer); // re-direct cout to the network connection
std::cout << "Hello, World!\n"; // may call send() on basic_streambuf::overflow()
This way, you wouldn't have to bother about manipulating the state of the (global) C-stream buffers.
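As an illustration of the custom-streambuf idea, here is a minimal sketch of an unbuffered output buffer backed by a POSIX file descriptor. The class name fd_streambuf is hypothetical and not part of any library; it only handles output and does not own the descriptor.

```cpp
#include <cerrno>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <unistd.h>

// Unbuffered output streambuf that forwards every character to a POSIX
// file descriptor. It does not own the descriptor and never closes it.
class fd_streambuf : public std::streambuf {
public:
    explicit fd_streambuf(int fd) : fd_(fd) {}

protected:
    // Called for single characters because no put area is set up.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof())) {
            return traits_type::not_eof(ch);
        }
        char c = traits_type::to_char_type(ch);
        return write_all(&c, 1) ? ch : traits_type::eof();
    }

    // Called for bulk writes such as string literals.
    std::streamsize xsputn(const char *s, std::streamsize n) override {
        return write_all(s, n) ? n : 0;
    }

private:
    // Loops until all bytes are written, since write() may be partial.
    bool write_all(const char *s, std::streamsize n) {
        while (n > 0) {
            ssize_t written = ::write(fd_, s, static_cast<std::size_t>(n));
            if (written < 0) {
                if (errno == EINTR) continue;  // interrupted, retry
                return false;
            }
            s += written;
            n -= written;
        }
        return true;
    }

    int fd_;
};
```

With this in place, std::ostream out(&buf); out << "text"; or std::cout.rdbuf(&buf) sends stream<< output to the descriptor. Because nothing is buffered inside the class, there is no pending data to flush before the descriptor is closed, at the cost of one write() call per insertion.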

What does fd represent when typing: int fd = open("file");?

I am looking at I/O operations in C++ and I have a question.
When opening a file like:
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    unsigned char buffer[16];
    int fd = open(argv[1], O_RDONLY);
    read(fd, buffer, sizeof(buffer));
    return 0;
}
How can the variable fd represent a file as an integer when passing it to the open method? Is it representing a file in the current folder? If I print the fd variable, it prints 3. What does that mean?
P.S. I know there are several other ways to handle files, like stdio.h, fstream etc., but that is outside the scope of this question.
How can the variable fd represent a file as an integer when passing it to the open method?
It's a handle that identifies the open file; it's generally called a file descriptor, hence the name fd.
When you open the file, the operating system creates some resources that are needed to access it. These are stored in some kind of data structure (perhaps a simple array) that uses an integer as a key; the call to open returns that integer so that when you pass it read, the operating system can use it to find the resources it needs.
Is it representing a file in the current folder?
It's representing the file that you opened; its filename was argv[1], the first of the arguments that was passed to the program when it was launched. If that file doesn't exist, or open failed for some reason, then it has the value -1 and doesn't represent any file; you really should check for that before you try to do anything with it.
If I print the fd variable, it prints 3. What does that mean?
It doesn't have any particular meaning; but it has that value because it was the fourth file (or file-like thing) that was opened, after the input (0), output (1) and error (2) streams that are used by cin, cout and cerr in C++.
Because that is the index into the table of resources stored for your current process.
Each process has its own resource table, so you just need to pass the index to the read/write/etc. functions.
Generally, a file descriptor is an index for an entry in a kernel-resident data structure containing the details of all open files. In POSIX this data structure is called a file descriptor table, and each process has its own file descriptor table. The user application passes the abstract key to the kernel through a system call, and the kernel will access the file on behalf of the application, based on the key. The application itself cannot read or write the file descriptor table directly.
from: http://en.wikipedia.org/wiki/File_descriptor
open() returns the file descriptor of the file, which has the C type int. To learn more about file descriptors, refer to http://en.wikipedia.org/wiki/File_descriptor.
"fd" stands for file descriptor. It is a value identifying a file. It is often an index (in the global table), an offset, or a pointer. Different APIs use different types. WinAPI, for example, uses different types of handles (HANDLE, HGDIOBJ, etc.), which are essentially typedefs for int/void*/long, and so on.
Using naked types like "int" is usually not a good idea, but if the implementation tells you to do so (like POSIX in this case), you should keep it.
The simplified answer is that fd is just an index into some array of file descriptors.
When most processes are started, they are given three open file descriptors to begin with: stdin (0), stdout (1), and stderr (2). So when you open your first file, the next available array entry is 3.
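The "next available entry" behaviour can be observed directly: POSIX specifies that open() returns the lowest-numbered descriptor not currently open, so closing a descriptor and opening again yields the same number. A small sketch, with descriptor_number_is_reused as a hypothetical helper name and assuming no other thread opens or closes descriptors in between:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Opens `path` read-only, closes it, and opens it again.
// Returns true if both calls produced the same descriptor number.
// A return value of -1 from open() means failure and is checked first.
bool descriptor_number_is_reused(const char *path) {
    int first = open(path, O_RDONLY);
    if (first < 0) {
        return false;
    }
    close(first);
    int second = open(path, O_RDONLY);
    if (second < 0) {
        return false;
    }
    close(second);
    return first == second;
}
```

In a process started with only descriptors 0, 1 and 2 open, both calls would return 3, which matches what the question observed.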

EOF before EOF in Visual Studio

I had this snippet in a program (in Visual Studio 2005):
if(_eof(fp->_file))
{
break;
}
It broke the enclosing loop when eof was reached. But the program was not able to parse the last few thousand chars in file. So, in order to find out what was happening, I did this:
if(_eof(fp->_file))
{
cout<<ftell(fp)<<endl;
break;
}
Now the answer that I got from ftell was different (and smaller) than the actual file-size (which isn't expected). I thought that Windows might have some problem with the file, then I did this:
if(_eof(fp->_file))
{
cout<<ftell(fp)<<endl;
fseek(fp, 0 , SEEK_END);
cout<<ftell(fp)<<endl;
break;
}
Well, the ftell() after fseek() gave the right answer (equal to the file size), while the initial ftell() still reported the smaller value (as described above).
Any idea about what could be wrong here?
EDIT: The file is open in "rb" mode.
You can't reliably use _eof() on a file descriptor obtained from a FILE*, because FILE* streams are buffered. It means that fp has sucked fp->_file dry and stores the remaining bytes in its internal buffer. Eventually fp->_file is at the eof position, while fp still has bytes for you to read. Use feof() after a read operation to determine whether you are at the end of a file, and be careful if you mix functions which operate on FILE* with those operating on integer file descriptors.
You should not be using _eof() directly on the descriptor if your file I/O operations are on the FILE stream that wraps it. There is buffering that takes place and the underlying descriptor will hit end-of-file before your application has read all the data from the FILE stream.
In this case, ftell(fp) is reporting the state of the stream and you should be using feof(fp) to keep them in the same I/O domain.
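A read loop that stays entirely in the FILE* domain might look like the sketch below. It uses only standard C I/O calls; count_bytes is a hypothetical helper name, and the 4096-byte chunk size is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdio>

// Reads `fp` to the end and returns the number of bytes read,
// or -1 if reading stopped because of an error rather than end-of-file.
long count_bytes(FILE *fp) {
    char chunk[4096];
    long total = 0;
    std::size_t n;
    while ((n = fread(chunk, 1, sizeof(chunk), fp)) > 0) {
        total += static_cast<long>(n);
    }
    // feof() is only meaningful after a read has come up short.
    return feof(fp) ? total : -1;
}
```

The loop ends when fread() returns 0, and feof() is consulted only afterwards to tell end-of-file apart from a read error, so the stream's internal buffer is fully drained before the loop exits.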

Question about STDIN STDOUT STDERR

I'm designing a MIPS simulator in C++ and my simplified OS must be able to run stat() occasionally (when a program being executed on my simulator requires an input or an output or something).
The problem is, I need to be able to pass STDIN, STDOUT, and STDERR as parameters to stat, as in stat("stdin", buff), where buff points to the location in memory where the returned struct data is stored. In reality I'll be using fstat(), which uses file descriptors to identify the file to be stat-ed. My file descriptor table in my simple OS reserves 0, 1, and 2 for stdin, stdout, and stderr. I'm a bit confused about what STDIN, etc. are. They're streams, I realize that, and they're defined in stdio.h, but how in the world do I get a stat struct with all of the relevant information about the file for each of these streams?
On a POSIX system, you can use fileno() to convert from a FILE* (e.g. stdin, stdout, stderr) to an integer file descriptor. That file descriptor can be sent to fstat().
Here is a well-known example of how to determine whether standard output is redirected to a file, to illustrate the usage of POSIX's fileno function:
if (!isatty(fileno(stdout))){
fprintf(stdout, "argv, argc, someone is redirecting me elsewhere...\n");
return 1;
}
If a program containing the above code is executed like this:
foobar_program > foobar_program.output
then 'foobar_program.output' will contain
argv, argc, someone is redirecting me elsewhere...
A file stream pointer is nothing more than a pointer to a FILE structure, i.e. FILE *. fileno takes that pointer and returns the corresponding file descriptor, according to the manual page for fileno:
The function fileno() examines the argument stream and returns
its integer descriptor.
The POSIX manual page likewise describes it as: fileno - map a stream pointer to a file descriptor.
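Putting the two calls together, a stat struct for any stdio stream can be obtained as in this sketch. fileno() and fstat() are the POSIX calls named above; stat_stream is a hypothetical wrapper name.

```cpp
#include <cstdio>
#include <sys/stat.h>

// Fills `out` with the stat information of the file behind `stream`.
// Returns true on success.
bool stat_stream(FILE *stream, struct stat *out) {
    int fd = fileno(stream);  // POSIX: map the stream to its descriptor
    if (fd < 0) {
        return false;
    }
    return fstat(fd, out) == 0;
}
```

For example, stat_stream(stdin, &st) fills st for whatever descriptor 0 currently refers to, whether that is a terminal, a pipe or a regular file.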