What does fd represent when typing: int fd = open("file");? - c++

I am looking at I/O operations in C++ and I have a question.
When opening a file like:
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
    unsigned char buffer[16];
    int fd = open(argv[1], O_RDONLY);
    read(fd, buffer, sizeof(buffer));
    return 0;
}
How can the variable fd represent a file as an integer when passing it to the open method? Is it representing a file in current folder? If I print the fd variable, it prints 3. What does that mean?
P.S. I know there are several other ways to handle files, such as stdio.h and fstream, but those are outside the scope of this question.

How can the variable fd represent a file as an integer when passing it to the open method?
It's a handle that identifies the open file; it's generally called a file descriptor, hence the name fd.
When you open the file, the operating system creates some resources that are needed to access it. These are stored in some kind of data structure (perhaps a simple array) that uses an integer as a key; the call to open returns that integer so that when you pass it to read, the operating system can use it to find the resources it needs.
Is it representing a file in current folder?
It's representing the file that you opened; its filename was argv[1], the first of the arguments that was passed to the program when it was launched. If that file doesn't exist, or open failed for some reason, then it has the value -1 and doesn't represent any file; you really should check for that before you try to do anything with it.
If I print the fd variable, it prints 3. What does that mean?
It doesn't have any particular meaning; but it has that value because it was the fourth file (or file-like thing) that was opened, after the input (0), output (1) and error (2) streams that are used by cin, cout and cerr in C++.
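As a rough sketch of the check recommended above, the following helper tests the value returned by open for -1 before using it. The helper name read_prefix and its parameters are made up for illustration; only open, read, close and perror are real POSIX/C calls.

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: read up to `size` bytes from the start of `path`.
// Returns the number of bytes read, or -1 if open() or read() failed.
ssize_t read_prefix(const char* path, unsigned char* buffer, size_t size) {
    assert(path != nullptr && buffer != nullptr);

    int fd = open(path, O_RDONLY);
    if (fd == -1) {              // open failed: fd does not refer to any file
        std::perror("open");
        return -1;
    }
    std::printf("descriptor for %s: %d\n", path, fd);

    ssize_t n = read(fd, buffer, size);
    if (n == -1)
        std::perror("read");

    close(fd);                   // release the descriptor when done
    return n;
}
```

In the question's program this would be called as read_prefix(argv[1], buffer, sizeof(buffer)), after first checking that argc is at least 2.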

Because that is the index into the table of resources stored for your current process.
Each process has its own resource table, so you just need to pass the index to functions such as read and write.
Generally, a file descriptor is an index for an entry in a kernel-resident data structure containing the details of all open files. In POSIX this data structure is called a file descriptor table, and each process has its own file descriptor table. The user application passes the abstract key to the kernel through a system call, and the kernel will access the file on behalf of the application, based on the key. The application itself cannot read or write the file descriptor table directly.
from: http://en.wikipedia.org/wiki/File_descriptor

open() returns the file descriptor of the file, which has the C type int. To learn more about file descriptors, see http://en.wikipedia.org/wiki/File_descriptor.

"fd" stands for file descriptor. It is a value identifying a file. It is often an index (in the global table), an offset, or a pointer. Different APIs use different types. WinAPI, for example, uses different types of handles (HANDLE, HGDI, etc.), which are essentially typedefs for int/void*/long, and so on.
Using naked types like "int" is usually not a good idea, but if the implementation tells you to do so (like POSIX in this case), you should keep it.

The simplified answer is that fd is just an index into some array of file descriptors.
When most processes are started, they are given three open file descriptors to begin with: stdin (0), stdout (1), and stderr (2). So when you open your first file, the next available array entry is 3.
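One way to see the "next available entry" behaviour is to open a file, close it, and open it again: POSIX specifies that open returns the lowest-numbered descriptor not currently open, so in a single-threaded program the number should come back. The function name below is invented for this sketch.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical check: returns true if the descriptor number freed by close()
// is handed out again by the next open(). Assumes no other thread opens or
// closes files in between.
bool closed_descriptor_is_reused(const char* path) {
    assert(path != nullptr);

    int first = open(path, O_RDONLY);
    if (first == -1) return false;
    close(first);                        // the slot `first` is free again

    int second = open(path, O_RDONLY);   // takes the lowest free slot
    if (second == -1) return false;
    close(second);

    return first == second;
}
```

Calling closed_descriptor_is_reused("/dev/null") from a freshly started program would typically see both opens return 3.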

Related

Correct way of using fdopen

I mean to associate a file descriptor with a file pointer and use that for writing.
I put together program io.cc below:
#include <cstdio>
#include <cstring>
#include <iostream>
#include <unistd.h>
using namespace std;

int main() {
    ssize_t nbytes;
    const int fd = 3;
    char c[100] = "Testing\n";
    nbytes = write(fd, (void *) c, strlen(c)); // Line #1
    FILE * fp = fdopen(fd, "a");
    fprintf(fp, "Writing to file descriptor %d\n", fd);
    cout << "Testing alternate writing to stdout and to another fd" << endl;
    fprintf(fp, "Writing again to file descriptor %d\n", fd);
    close(fd); // Line #2
    return 0;
}
I can alternately comment out the lines marked #1 and/or #2, then compile and run
./io 3> io_redirect.txt
and check the contents of io_redirect.txt.
Whenever line 1 is not commented out, it produces the expected line Testing\n in io_redirect.txt.
If line 2 is commented out, I get the expected lines
Writing to file descriptor 3
Writing again to file descriptor 3
in io_redirect.txt.
But if line 2 is not commented out, those lines do not show up in io_redirect.txt.
Why is that?
What is the correct way of using fdopen?
NOTE.
This seems to be the right approach for a (partial) answer to Smart-write to arbitrary file descriptor from C/C++
I say "partial" since I would be able to use C-style fprintf.
I still would like to also use C++-style stream<<.
EDIT:
I was forgetting about fclose(fp).
That "closes" part of the question.
Why is that?
The opened stream ("stream" is an opened FILE*) is block buffered, so nothing gets written to the destination before the file is flushed. Exiting from an application closes all open streams, which flushes the stream.
Because you close the underlying file descriptor before flushing the stream, the behavior of your program is undefined. I strongly recommend reading POSIX section 2.5.1, Interaction of File Descriptors and Standard I/O Streams (which is written in horrible language, admittedly), from which:
... if two or more handles are used, and any one of them is a stream, the application shall ensure that their actions are coordinated as described below. If this is not done, the result is undefined.
...
For the first handle, the first applicable condition below applies. ...
...
If it is a stream which is open for writing or appending (but not also open for reading), the application shall either perform an fflush(), or the stream shall be closed.
A "handle" is a file descriptor or a stream. An "active handle" is the last handle that you did something with.
The fp stream is the active handle that is open for appending to file descriptor 3. Because fp is an active handle and is not flushed and you switch the active handle to fd with close(fd), the behavior of your program is undefined.
My guess at what most probably happens is this: your C standard library implementation calls fflush(fp) after main returns. Because fd is already closed by then, the internal write(3, ...) call returns an error and nothing is written to the output.
What is the correct way of using fdopen?
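The usage you presented is the correct way of using fdopen. What needs to change is the cleanup: flush the stream with fflush(fp) before touching fd again, or call fclose(fp) instead of close(fd), since fclose flushes the stream and closes the underlying descriptor.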
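A minimal sketch of that ordering, assuming the caller hands over a descriptor that is open for writing. The function name write_both_ways is invented here; the point is that the stream is closed with fclose, which flushes it and closes the descriptor, instead of calling close on the descriptor directly.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <unistd.h>

// Hypothetical helper: writes `raw` with write(), then `text` through a stdio
// stream wrapped around the same descriptor. Takes ownership of fd.
// Returns true if every step succeeded.
bool write_both_ways(int fd, const char* raw, const char* text) {
    assert(raw != nullptr && text != nullptr);

    // The descriptor is the active handle here; nothing is buffered yet.
    if (write(fd, raw, std::strlen(raw)) == -1) {
        close(fd);
        return false;
    }

    FILE* fp = fdopen(fd, "a");
    if (fp == nullptr) {
        close(fd);
        return false;
    }

    bool ok = std::fputs(text, fp) != EOF;   // goes into the stream's buffer

    // fclose() flushes the buffer and closes fd as well, so there must be
    // no separate close(fd) afterwards.
    if (std::fclose(fp) != 0) ok = false;
    return ok;
}
```

With the question's setup this would be called as write_both_ways(3, "Testing\n", "Writing to file descriptor 3\n") and run with ./io 3> io_redirect.txt.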

C++ how to copy a FILE * pointer

I found a solution here Duplicating file pointers?
FILE *fp2 = fdopen (dup (fileno (fp)), "r");
but according to http://man7.org/linux/man-pages/man2/dup.2.html,
the new file descriptor created by dup refers to the same open file description as the old one, and thus the two share status (file offset and file status flags). That's not what I want. I want to create a totally new IO object which refers to the file pointed to by the old FILE *
Is there any way to do this?
Add:
I don't have the filename actually. I'm doing a deep copy of an object, which hold an open FILE pointer, so I have to copy that also.
I want to create a totally new IO object which refers to the file pointed by the old FILE *
You're assuming that the file associated with the original FILE * has some form of identity distinct from the IO object by which it is accessed. That is true for regular files and some other objects, but false for others, such as sockets and pipes. Thus, there is no general-purpose mechanism for doing what you ask.
For the special case of objects that can be accessed via the file system, the way to create a new IO object associated with the same file is to open() or fopen() the file via a path to it. That's what these functions do. There is no standard way to get a path from a FILE * or file descriptor number, but on Linux (since you tagged that) you can use readlink() on the open file's entry under /proc/self/fd.
Do be aware that even for regular files, the readlink approach is not guaranteed to work. In particular, it will not work if the path with which the original file was opened has since been unlinked, and in fact, in that case it could lead to the wrong file being opened instead. You can check for that by running fstat() on both the old and new file descriptor numbers -- if the files are in fact the same, then they will have the same inode numbers on the same host device.
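The readlink and fstat steps described above could look roughly like this on Linux. The function names are invented for this sketch, it relies on /proc being mounted, and the 4096-byte buffer is an arbitrary choice, so treat it as an illustration and not a general solution.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// True if both descriptors refer to the same file (same device and inode).
bool same_file(int fd_a, int fd_b) {
    assert(fd_a >= 0 && fd_b >= 0);
    struct stat a, b;
    if (fstat(fd_a, &a) == -1 || fstat(fd_b, &b) == -1) return false;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

// Linux-specific: the path currently recorded for fd, or "" on failure.
std::string path_of(int fd) {
    std::string link = "/proc/self/fd/" + std::to_string(fd);
    char buf[4096];                       // arbitrary size for this sketch
    ssize_t n = readlink(link.c_str(), buf, sizeof buf);
    if (n == -1) return std::string();
    return std::string(buf, static_cast<size_t>(n));  // readlink does not add '\0'
}

// Opens the file behind old_fd a second time, read-only. Returns the new
// descriptor, or -1 if the path could not be resolved, could not be opened,
// or now names a different file.
int reopen_readonly(int old_fd) {
    std::string path = path_of(old_fd);
    if (path.empty()) return -1;

    int new_fd = open(path.c_str(), O_RDONLY);
    if (new_fd == -1) return -1;

    if (!same_file(old_fd, new_fd)) {     // path was unlinked or replaced
        close(new_fd);
        return -1;
    }
    return new_fd;
}
```

For a FILE *fp, the descriptor to pass would be fileno(fp), and the result could be wrapped again with fdopen. Unlike dup, the descriptor returned here has its own file offset.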

Write to a file stream without a file existing on the hard drive in C

Is this possible in the C language? Or even C++? I prefer to know for C.
For example, say I had a function that reads a text file and does something with it. If the user did not specify an input text file and I wanted to use that function for stdin, is it possible to write stdin to a file stream as if it were coming from a file read, so it can be used in the same function that normally just takes input files?
A way around this of course is that I could take stdin, write it to a temp file, then pass the temp file to the function that normally would take an input file. I've searched online and asked tutors at university but am not getting any solutions. Has anyone ever accomplished this?
If your function has a prototype such as
void add(FILE *fp,<rest of the argument>)
{
}
then you can directly pass
add(stdin,<rest of the arguments>);
because stdin is of type FILE *:
FILE *stdin;
There is no need to read from stdin, store the data in some file, and later pass that file pointer to your API.
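To make that concrete, here is a small sketch in which one function works on any FILE *, and a wrapper picks either a named file or stdin. Both function names are made up for the example.

```cpp
#include <cassert>
#include <cstdio>

// Counts newline characters in any readable stream: a file, a pipe or stdin.
long count_lines(FILE* fp) {
    assert(fp != nullptr);
    long lines = 0;
    int ch;
    while ((ch = std::fgetc(fp)) != EOF)
        if (ch == '\n') ++lines;
    return lines;
}

// Uses the named file if a path is given, otherwise standard input.
// Returns -1 if the file cannot be opened.
long count_lines_in(const char* path_or_null) {
    if (path_or_null == nullptr)
        return count_lines(stdin);       // stdin is already a FILE *

    FILE* fp = std::fopen(path_or_null, "r");
    if (fp == nullptr) return -1;
    long lines = count_lines(fp);
    std::fclose(fp);
    return lines;
}
```

From main this might be called as count_lines_in(argc > 1 ? argv[1] : nullptr), so no temporary file is needed when the input comes from stdin.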
Standard input is opened for you automatically, so you can just read from it. There are several ways of doing it, but basically standard input is file descriptor 0 (STDIN_FILENO in unistd.h).
int filedes = STDIN_FILENO;                /* 0: default to standard input */
if (argc > 1)                              /* a file name was given as argv[1] */
    filedes = open(argv[1], O_RDONLY);
read(filedes, bufr, size);

Does constructing an iostream (c++) read data from the hard drive into memory?

When I construct an iostream, say by opening a file, will this always read the entire file from the hard disk into memory, or is it streamed in and buffered by the OS on demand?
I ask because one way to check if a file exists is to see if opening it fails, but I fear that if the files I am opening are very large, this will take a long time if iostream must read the entire file in on open.
If you want to use Boost, you can check whether a file exists like this:
#include <boost/filesystem.hpp>
bool fileExists = boost::filesystem::exists("foo.txt");
No, it will not read the entire file into memory when you open it. It will read your file in chunks though, but I believe this process will not start until you read the first byte. Also these chunks are relatively small (on the order of 4-128 kibibytes in size), and the fact it does this will speed things up greatly if you are reading the file sequentially.
In a test on my Linux box (well, Linux VM) simply opening the file only results in the OS open system call, but no read system call. It doesn't start reading anything from the file until the first attempt to read from the stream. And then it reads 8191 (why 8191? that seems a very strange number) byte chunks as I read the file in.
Opening a file is a bad way of testing if the file exists - all it does is tell you if you can open it. Opening might fail for a number of reasons, typically because you don't have read permission, but the file will still exist. It is usually better to use an operating system specific function to test for existence. And no, opening an fstream will not cause the contents to be read.
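On POSIX systems, one such operating-system-specific function is stat, and errno can separate "does not exist" from "could not be examined". The enum and function names below are invented for this sketch.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>

enum class Presence { Exists, Missing, Unknown };

// Hypothetical helper: asks the OS about `path` without opening the file.
Presence probe(const std::string& path) {
    assert(!path.empty());

    struct stat info;
    if (stat(path.c_str(), &info) == 0)
        return Presence::Exists;

    // ENOENT: no such entry. ENOTDIR: a component of the path is not a directory.
    if (errno == ENOENT || errno == ENOTDIR)
        return Presence::Missing;

    // For example EACCES: the file may exist, but we are not allowed to look.
    return Presence::Unknown;
}
```

The Unknown case is the one that a simple "did open fail?" test cannot tell apart from a missing file.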
As I understand it, when you open a file, the corresponding data structures for the process opening the file are populated; these include the file pointer, file descriptor, vnode, etc.
One can then read and write the file using buffered streams (fwrite, fread) or using system calls (read and write).
When we use buffered streams, the data is buffered and then written or read (this is done for efficiency). This by itself means that the whole file is not read into memory; only a certain number of bytes are read into the buffer and then made available.
In the case of system calls such as read and write, kernel-level buffering is done (using fsync one can flush the kernel buffer too), but the data is actually read from and written to the device.
Checking for the existence of a file:
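#include <sys/stat.h>
#include <iostream>
#include <string>
int main() {
    struct stat file_i;
    std::string f("myfile.txt");
    // stat() returns non-zero if the file cannot be examined, e.g. because it does not exist
    if (stat(f.c_str(), &file_i) != 0) {
        std::cout << "File not found" << std::endl;
    }
    return 0;
}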
Hope this clarifies a bit.

Question about STDIN STDOUT STDERR

I'm designing a MIPS simulator in C++, and my simplified OS must be able to run stat() occasionally (when a program being executed on my simulator requires an input or an output or something).
The problem is, I need to be able to pass STDIN, STDOUT, and STDERR as parameters to stat, as in stat("stdin", buff), where buff is a pointer to the place in memory where the returned struct data is stored. In reality I'll be using fstat(), which uses file descriptors to identify the file to be stat-ed. The file descriptor table in my simple OS reserves 0, 1, and 2 for stdin, stdout, and stderr. I'm a bit confused about what STDIN, etc. are. They're streams, I realize that, and they're defined in stdio.h, but how in the world do I get a stat struct with all of the relevant information about the file for each of these streams?
On a POSIX system, you can use fileno() to convert from a FILE* (e.g. stdin, stdout, stderr) to an integer file descriptor. That file descriptor can be sent to fstat().
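As an illustration of the fileno plus fstat combination, the sketch below fills a struct stat for a stream and reports what kind of file is behind it. The function name stream_kind and the returned labels are made up for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: describes what a stdio stream is connected to.
std::string stream_kind(FILE* stream) {
    assert(stream != nullptr);

    int fd = fileno(stream);             // FILE * -> integer descriptor
    if (fd == -1) return "no descriptor";

    struct stat info;
    if (fstat(fd, &info) == -1) return "fstat failed";

    if (isatty(fd))             return "terminal";
    if (S_ISREG(info.st_mode))  return "regular file";
    if (S_ISFIFO(info.st_mode)) return "pipe";
    if (S_ISCHR(info.st_mode))  return "character device";
    if (S_ISDIR(info.st_mode))  return "directory";
    return "other";
}
```

Calling stream_kind(stdin), stream_kind(stdout) and stream_kind(stderr) covers the three streams from the question; the info structure holds the remaining fields (size, timestamps and so on) if they are needed.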
Here is a well-known example that uses POSIX's fileno function to determine whether standard output has been redirected away from the terminal:
if (!isatty(fileno(stdout))){
fprintf(stdout, "argv, argc, someone is redirecting me elsewhere...\n");
return 1;
}
If the above code is used in a program that is executed like this
foobar_program > foobar_program.output
then 'foobar_program.output' will contain
argv, argc, someone is redirecting me elsewhere...
A file stream pointer is nothing more than a pointer to a FILE structure, i.e. FILE *. fileno takes that pointer and returns the corresponding file descriptor. According to the manual page for fileno:
The function fileno() examines the argument stream and returns
its integer descriptor.
The POSIX manual page describes it similarly: fileno - map a stream pointer to a file descriptor.