Will File I/O In Current Working Directory Ever Fail? - c++

On my home Linux laptop, I like to write wrapper programs and GUI helpers for things I use frequently. I don't like Bash scripting very much, so I do a lot of this in C++, which often requires the system() function from cstdlib.
The system() function is awesome, but I wanted a way to call system() and receive the stdout/stderr. system() only returns the return code from the command. So, in a Bash script, one can do:
myVar=$(ls -a | grep 'search string')
echo $myVar
and myVar will output whatever the stdout was for the command. So I began writing a wrapper class that will add a pipe-to-file to the end of the command, open the file, read all of the piped stdout, and return it as either one long string or as a vector of strings. The intricacies of the class are not really relevant here (I don't think anyway), but the above example would be done like this:
SystemCommand systemCommand;
systemCommand.setCommand("ls -a | grep \'search string\' ");
systemCommand.execute();
std::cout << systemCommand.outputAsString() << std::endl;
Behind the scenes, when systemCommand.execute() is called, the class ensures that the command will properly pipe all stdout/stderr to a randomly generated filename, in the current working directory. So for example, the above command would end up being
"ls -a | grep 'search string' >> 1452-24566.txt 2>&1".
The class then attempts to open and read from that file, using ifstream:
std::ifstream readFromFile;
readFromFile.open(_outputFilename);
if (readFromFile.is_open()) {
    //Read all contents of file into class member vector
    ...
    readFromFile.close();
    //Remove temporary file
    ...
} else {
    //Handle read failure
}
So here is my main question: will std::ifstream ever fail to open a recently created file in the current working directory? If so, what would be a way to make it more robust (specifically on Linux)?
A side/secondary question: Is there a very simplified way to achieve what I'm trying to achieve without using file pipes? Perhaps some stuff available in unistd.h? Thanks for your time.

So here is my main question will std::ifstream ever fail to open a recently created file in the current working directory?
Yes.
Mount a USB thumb drive (or some other removable media)
cd to the mount
Execute your program. While it's executing, remove the drive.
Watch the IO error happen.
There's a ton of other reasons too. Filesystem corruption, hitting the file descriptor limit, etc.
If so, what would be a way to make it more robust (specifically on Linux)?
Make temporary files in /tmp, whose entire purpose is for temporary files. Or don't create a file at all, and use pipes for communication instead (Like what popen does, like harmic suggested). Even so, there are no guarantees; try to gracefully handle errors.
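As a rough sketch of the popen approach mentioned above: the helper below runs a command through the shell and collects its stdout into a string, with no temporary file involved. The name captureOutput is made up for illustration and is not part of the asker's SystemCommand class; popen and pclose are POSIX functions, so this assumes Linux or another Unix-like system.

```cpp
#include <array>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the original SystemCommand class):
// runs `command` through the shell and returns everything it wrote to stdout.
// Append " 2>&1" to the command if stderr should be captured as well.
std::string captureOutput(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");  // POSIX, declared in <stdio.h>
    if (!pipe) {
        throw std::runtime_error("popen failed for: " + command);
    }

    std::string output;
    std::array<char, 4096> buffer;
    std::size_t n;
    // fread rather than fgets, so embedded NUL bytes do not truncate the result.
    while ((n = std::fread(buffer.data(), 1, buffer.size(), pipe)) > 0) {
        output.append(buffer.data(), n);
    }

    // pclose waits for the command and returns its wait status, or -1 on error.
    if (pclose(pipe) == -1) {
        throw std::runtime_error("pclose failed for: " + command);
    }
    return output;
}
```

With this, something like captureOutput("ls -a | grep 'search string' 2>&1") would return the combined output directly. Errors can still happen (for example when the file descriptor limit is hit), which is why the sketch throws instead of silently returning an empty string.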

Related

Check/list all bash commands in C++?

Basically, is there a simple way to get a list of all bash commands in the PATH environment variable in C++? My current solution is to run a command beforehand that lists all the commands into a .txt, which is then read into the C++ program. I want to be able to cut out this step, if possible.
ls ${PATH//:/ } > commands.txt
If you do NOT need to use stdin in your C++ program
This is the easy solution. Just pipe the output of the ls command to your C++ program. Then, in your C++ program, read the contents of the file from stdin like you would read from a normal file. Literally use stdin wherever you need to provide a file descriptor. So, your command would look something like
ls ${PATH//:/ } | ./a.out
The | denotes a pipe in bash. It takes stdout from the first program (here ls) and redirects it to stdin of the second program (here your C++ program).
If you do need to use stdin in your C++ program
This is going to be tricky. You essentially need to make your C++ program do everything itself. The first way to do this that comes to mind is
Read $PATH using getenv().
Parse $PATH by replacing all occurrences of : with ' ' (a blank space). This is easy enough to do in a loop, but you could also use std::replace.
Now that you have the directory paths from $PATH, you simply need the contents of each directory. This post will help you get the contents of a directory.
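The three steps above could be sketched roughly as follows. This is only one possible take, assuming a C++17 compiler: it splits the string on ':' directly instead of replacing it with spaces, and it uses std::filesystem for the directory listing. The function names listCommands and listPathCommands are made up for this example. Like the ls command in the question, it lists every entry in each directory and does not check the execute permission.

```cpp
#include <cstdlib>
#include <filesystem>
#include <set>
#include <sstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: splits a PATH-style string on ':' and collects the
// names of all entries found in each listed directory.
std::set<std::string> listCommands(const std::string& pathVar) {
    std::set<std::string> names;
    std::istringstream dirs(pathVar);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty()) continue;
        std::error_code ec;
        // Directories that are missing or unreadable are skipped via `ec`.
        fs::directory_iterator it(fs::path(dir), ec), end;
        for (; !ec && it != end; it.increment(ec)) {
            names.insert(it->path().filename().string());
        }
    }
    return names;
}

// Convenience wrapper that reads the real environment variable.
std::set<std::string> listPathCommands() {
    const char* path = std::getenv("PATH");
    return path ? listCommands(path) : std::set<std::string>{};
}
```

Using a std::set removes duplicates when the same name appears in several directories and keeps the result sorted.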
UPDATE: Another Approach
I've thought of another way to approach your problem that allows you to use IO redirection (ie. use the pipe), and also use stdin at the same time. The problem is that it is probably not portable.
The basic idea is that you read the output of ls from stdin (using the pipe operator in bash). Next, you essentially reset stdin using freopen. Something along the lines of
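#include <stdio.h>

int main(void)
{
    char buf[BUFSIZ];

    puts("Reading from stdin...");
    while (fgets(buf, sizeof(buf), stdin))
        fputs(buf, stdout);

    /* Reattach stdin to the controlling terminal. "r" is the valid read mode; "rw" is not a standard mode string. */
    if (!freopen("/dev/tty", "r", stdin)) {
        perror("freopen");
        return 1;
    }

    puts("Reading from stdin again...");
    while (fgets(buf, sizeof(buf), stdin))
        fputs(buf, stdout);
    return 0;
}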
The above code is from here. It reads stdin, resets stdin, and reads from stdin again. I would suggest not using this approach for anything important, or for something that needs to work on several platforms. While it is more convenient since it allows you to use IO redirection while retaining the ability to use stdin, it is not portable.

Deleting Lines after reading them in C++ program using system()

I am trying to understand how basic I/O with files is handled in c++ or c. My aim is to read file line by line and send the lines across to a remote server. If the line is sent, I want to delete it from the file.
One way I tried was to keep a count of the lines read and call system() to delete that number of lines. I used the shell command: sed -i -e 1,'count'd filename.
After that I continued reading the file and surprisingly it worked as planned.
I have two questions:
Is this way reliable?
And why does this work at all, when I deleted a part of the file while reading it? What if I did a seek to a previous position, what then?
Best,Digvijay
PS:
I would be glad if somebody could suggest a better way.
Also here is the code for the program I wrote:
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <cstdlib>

int main() {
    std::ifstream f;
    std::string line;
    std::stringstream ss;
    int i = 0;
    f.open("in.txt");
    if (f.is_open()) {
        // Read and print the first two lines
        while (getline(f, line)) {
            std::cout << line << std::endl;
            i++;
            if (i == 2) break;
        }
        // Delete the lines read so far with sed
        ss << "sed -i -e 1," << i << "d in.txt";
        system(ss.str().c_str());
        // Keep reading from the still-open stream
        while (getline(f, line)) {
            std::cout << line << std::endl;
        }
    }
    return 0;
}
Edit:
Firstly, thanks for taking the time to write answers. Here is some extra information that I left out earlier. The files I am dealing with are log files, so devices are constantly appending information to them. I want to avoid creating a copy because the log files themselves are (at times) very big, and deleting sent lines would also help keep the log file short, since the lines would be divided into parts and archived on the server.
Solution
I have found a way to deal with the problem. Apparently Thomas is right that sed creates a new file, so the old file remains as is. Using this, I can read n lines, call the system function, close the file pointer and open it again. I do this on small chunks of the log, repeatedly, until it becomes small and hence efficient to deal with. Meanwhile, the server archives the logs in 1 GB files.
However I have a new question, due to memory constraint, I need to know if it is possible to split a log file into two efficiently. (Which possibly would be another question on SO)
Most modern file systems don't support deleting lines at the beginning of the file, so doing so would be very inefficient.
The normal solution to your actual problem is to stop writing to your log file when it reaches some size, then start writing to a new file. The code that copies the files can delete a whole file once it has been written (this is an efficient operation).
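A minimal sketch of that rotation idea, assuming the program writing the log can be changed: the writer switches to a new numbered file once the current one would exceed a size limit, so the uploader can send and delete whole files. The class name RotatingLog and the "base.N.log" naming scheme are invented for this example.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical log writer: starts a new file ("<base>.<n>.log") whenever the
// current file would grow beyond maxBytes.
class RotatingLog {
public:
    RotatingLog(std::string baseName, std::size_t maxBytes)
        : baseName_(std::move(baseName)), maxBytes_(maxBytes) {
        openNext();
    }

    void write(const std::string& line) {
        const std::size_t needed = line.size() + 1;  // +1 for the newline
        if (written_ > 0 && written_ + needed > maxBytes_) {
            openNext();  // the previous file is now complete
        }
        out_ << line << '\n';
        out_.flush();
        written_ += needed;
    }

    // Index of the file currently being written; lower indices are complete.
    int currentIndex() const { return index_; }

private:
    void openNext() {
        if (out_.is_open()) out_.close();
        out_.clear();
        ++index_;
        out_.open(baseName_ + "." + std::to_string(index_) + ".log",
                  std::ios::trunc);
        written_ = 0;
    }

    std::string baseName_;
    std::size_t maxBytes_;
    std::ofstream out_;
    std::size_t written_ = 0;
    int index_ = 0;
};
```

Every file with an index lower than currentIndex() is finished, so it can be sent and then removed in a single cheap operation.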
sed writes a new version of the file, while the program keeps reading the same version that it opened. This is the usual behavior of Unix and Linux when a program writes a file that another program has open.
You can see this for yourself with this small C program:
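#include <stdlib.h>
#include <stdio.h>

int main(void) {
    FILE *f = fopen("in.txt", "r");
    if (!f) {
        perror("in.txt");
        return EXIT_FAILURE;
    }
    /* Loops forever, recounting the lines of the file that was opened above. */
    while (1) {
        rewind(f);
        int lines = 0;
        int c;
        while ((c = getc(f)) != EOF)
            if (c == '\n')
                ++lines;
        printf("Number of lines in file: %d\n", lines);
    }
    return 0;
}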
Run that program in one window, and then use sed in another window to edit the file. The number of lines printed by the program will stay the same, even if the file on disk has been edited, and this is because Unix keeps the old, open version, even if it is no longer accessible to other programs.
As to your first question, how reliable your solution is: as far as I can see it should be reliable, except for the usual caveats about the system crashing or running out of memory in the middle of an update, someone else accessing the file, and of course all the problems with the system() call. It is not very efficient, though, and for large data sets you might want to do it differently.
sujin's comment about using a temporary file for the lines you want to keep seems reasonable. It would be both faster and safer. Keep the original file, so if the system crashes you'll still have your data, and wait until you have finished to rename the old file to "in.txt.bak", and then rename your temporary file to "in.txt".
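One way that temporary-file approach might look in C++ is sketched below. The function name dropFirstLines and the ".tmp" suffix are assumptions made for this example; the ".bak" name follows the suggestion above. The remaining lines are copied to a scratch file first, and the original is only renamed once that copy has been written successfully.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: drops the first `count` lines of `filename` by copying
// the remaining lines to a temporary file and renaming it over the original.
// Returns false if any step fails.
bool dropFirstLines(const std::string& filename, int count) {
    const std::string tmpName = filename + ".tmp";  // assumed scratch name
    const std::string bakName = filename + ".bak";

    std::ifstream in(filename);
    if (!in) return false;
    std::ofstream out(tmpName, std::ios::trunc);
    if (!out) return false;

    std::string line;
    for (int i = 0; std::getline(in, line); ++i) {
        if (i >= count) out << line << '\n';
    }
    out.close();
    if (out.fail()) {  // a write or the close failed: keep the original
        std::remove(tmpName.c_str());
        return false;
    }
    in.close();

    // Keep the old contents as a backup, then move the new file into place.
    if (std::rename(filename.c_str(), bakName.c_str()) != 0) return false;
    return std::rename(tmpName.c_str(), filename.c_str()) == 0;
}
```

Note that this does not help with a log that another process keeps appending to: lines written between the copy and the rename would end up only in the backup file.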
First off, avoid calling system() as much as you can (if possible, don't use it at all), as it creates race conditions and other problems which can drastically, and often detrimentally, affect the outcome of your program. This is especially true if access to files is involved.
Given your problem, there are a number of ways to do this, each with its own caveats.
I'll cover three possible solutions:
1) If the file is small enough:
you can read in the entire thing in a data structure (vector, list, deque, etc.)
delete the original file
determine how many lines to read (and send off via server protocol)
then write the remaining lines as the name of the original file.
If you intend to parallelize your program later on, this may be a better solution, provided that the file is small. Note: small is a relative term, but is generally limited by how much memory you have available.
2) If the file is quite large or you're limited by memory constraints, you will have to get creative by using buffers. Once you've read a line and successfully sent it off via your program, you determine where the file pointer is and copy the remaining information until the end of the current file as a new file. Once done, close and delete the old file, then close and rename the new file the same name as the old file.
3) If your solution doesn't have to be in C++, you can use shell-scripting or (controversially) another language to get the job done.
1) No, it's not reliable.
2) The C++ runtime library reads your file in blocks (internally) which are then parceled out to your (higher level) input requests until the block(s) is(are) exhausted, forcing it to (internally) read more blocks from disk. Since one or more physical blocks are read in before you make any call to sed, it/they cannot be altered if sed happens to change that first part of the file.
To see your code fail, you would need to make the input file big enough that there are remaining blocks of the file that have not been read in (internally by the runtime library) before you call sed. By "fail" I mean your program would not see all the characters that were originally in the file before sed clobbered some lines.
As the others said, you have to write another file containing the records you still need after reading the original file, and then delete the original. For this application, though, a FIFO may be more useful than a regular file. If you are on a *NIX platform, look into the mkfifo command.
A FIFO behaves like a file, with the peculiarity that data is gone from it once it has been read.

printing to a network printer using fstream c++ in mac

I wish to print some text directly to a network printer from my c++ code (I am coding with xcode 4). I do know that everything on unix is a file and believe that it would not be impossible to redirect the text using fstream method in c++ to the printer device file. The only problem is I don't know the device file in /dev associated with my network printer.
Is it possible to achieve printing using fstream method? Something like
std::fstream printFile;
printFile.open("//PATH/TO/PRINTER/DEV", std::ios::out);
printFile << "This must go to printer" << std::endl;
printFile.close();
And, if so
How to obtain the file in /dev corresponding to a particular printer?
Thanks in advance,
Nikhil
Opening and writing directly to a file used to be possible back in the days of serial printers; however, this is not the approach available today.
The CUPS daemon provides print queuing, scheduling, and administrative interfaces on OS X and many other Unix systems. You can use the lp(1) or lpr(1) commands to print files. (The different commands come from different versions of print spoolers available in Unix systems over the years; one was derived from the BSD-sources and the other derived from the AT&T sources. For compatibility, CUPS provides both programs.)
You can probably achieve something like you were after with popen(3). In shell, it'd be something like:
echo hello | lp -
The - says to print from standard input.
I haven't tested this, but the popen(3) equivalent would probably look like this:
FILE *f = popen("lp -", "w");
if (!f)
    exit(1);
fprintf(f, "output to the printer\n");
pclose(f); /* closes the pipe and waits for lp to finish */
I recommend testing some inputs at the shell first to make sure that CUPS is prepared to handle the formatting of the content you intend to send. You might need to terminate lines with CRLF rather than just \n, otherwise the printer may "stair-step" the output. Or, if you're sending PDF or PS or PCL data, it'd be worthwhile testing that in the cheapest possible manner to make sure the print system works as you expect.

How to read the disk usage (du) in a C variable

I would like to do the following in a "c" program.
I need to get the disk usage of the following directory and should be able to read it in a variable.
du -sb /home/mann | awk '{print$1}'
I would like to do the above in C program and copy the output in a variable. I need to do this for this directory alone not for the "/" or "/home".
Pipe the output of your command to a file on the disk. Run your command using system
Read the file using standard C functions
Update your variable
Another option is to use popen/pclose to launch your command. popen returns a FILE* stream from which you can read.
Yet another option is to hunt your system for any library function that provides the information you desire

Bash input/output in C++

I'm writing a program in C++ (for XAMPP communication) and I want to execute a command that I have in a string (I know that this is simply system("command")), but I want to get the output from bash back into a C++ string. I've found several threads about this, but none that solved Bash -> C++.
You can call the FILE *popen(const char *command, const char *mode) function. Then, you can read the file it returns to get the output of your call.
It's like redirecting the output of the command to a file on the hard drive and then reading that file, except that no file is actually created on the hard drive.
The documentation of the popen() is here.
You need to call the popen function, and read the output from the FILE it returns.
You can try standard output redirection (see dup()) to redirect the standard output to a file stream and then read it into a string.