C++: Getting size of all files inside current directory

I'm new to C++ programming, and I'm trying to practice file reading and writing. I'm trying to get the sizes of all the files of the current directory. Thing is, after getting the names of the files in the current directory, I place them inside of a text file. So now I'm stuck, and don't know where to go from here.
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include <algorithm>
using namespace std;
// FILE FUNCTION
void fileStuff(){
}
// MAIN FUNCTION
int main(int argc, char const *argv[])
{
// ERROR CHECKING
if(argc != 3){ // IF USER DOESN'T TYPE ./nameOfFile, AND THE OTHER REQUIRED ARGUMENTS.
cout << "Incorrect. Try Again" << endl;
exit(-1);
}
ifstream file;
string fileContents;
system("find . -type f > temp.txt");
file.open("temp.txt");
if (!file){
cout << "Unable to open file: temp.txt" << endl;
exit(-1);
}
// Test the read itself so nothing is printed after the final failed read
while(getline(file, fileContents)){
cout << fileContents << endl;
}
file.close();
return 0;
}

C++14 (and earlier versions, notably C++11) does not know about file systems and directories (yet). For C++17, see its file system library. Otherwise, your code is operating-system specific, though the Boost library has some file system support.
I am assuming you are running on Linux or some POSIX system.
Your program just uses an external command (find(1)); if you want to read from such a command, you might use popen(3) with pclose, then you won't need a temporary file. BTW, you could use find . -type f -ls.
However, you don't need to use an external command, and it is safer (and faster) to avoid that.
Pedantically, a file name could contain a newline character, and with your approach you would need to special-case that. A file name could also contain a tab character (or other control characters); find . -type f treats those in its own way, and you would need to special-case that as well. In practice, it is extremely poor taste and very unlikely to have a newline or tab character in a file name, so you might choose to ignore these weird cases.
You could use nftw(3). You could recursively use opendir(3) & loop on readdir(3) (and later closedir).
Once you have a file path, you would use stat(2) to get that file's metadata, including its size (field st_size). BTW the /bin/ls and /usr/bin/find programs use that.
The readdir(3) function returns a pointer to a struct dirent whose d_name field holds the entry's name; you probably want to skip the two entries for . and .. (so use strcmp(3) to compare with "." and "..", or do the comparison the hard way). Then you'll build a complete file path using string concatenation. You might use (in genuine C++) std::string, or you could use snprintf(3) or asprintf(3) for that. If you readdir the current directory . you could call stat(2) directly on the d_name field.
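BTW exit(-1) is incorrect (and certainly poor taste): on POSIX systems only the low 8 bits of the status reach the parent process, so -1 is reported as 255. See exit(3). A much more readable alternative is exit(EXIT_FAILURE).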

Related

Is there a way to read in a folder of files in C++?

I have a folder containing close to 200 word documents, and I want to read them in to C++ using ifstream fin from library fstream. I have two problems:
1) fin is able to read in .doc files, but nonsense is printed to the screen because .doc files are not plain text.
2) I know of no way to get a program to automatically read in multiple files with unrelated file names.
Because of these two problems, I am manually going through each of my .doc files and changing them to .txt files. In addition, I am calling them 1.txt, 2.txt, 3.txt, etc, so that I can use a for loop in C++ to read them all in (I would convert the loop control variable i to a string x in each iteration, and read in "x.txt").
While this will work, I've only finished going through 83 files and it's taken around an hour. Is there a way for me to get C++ to automatically read all these files in? C++ would have to first change each one to a .txt file as well, so that I can print meaningful text to the screen.
The Boost library is very rich for this type of file and filesystem operation. Please check the code below. It goes to the folder (ws) where you keep all your doc files and iterates through all the files in it. The code assumes that the folder 'ws' contains only files, no folders. Once you have the name of a file you can do all kinds of manipulation on it.
I didn't get why you want to change the extension to txt, but I included a few lines that do this. Changing the extension won't affect the file's content.
#include <cstdlib>
#include <sstream>
#include <iostream>
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
int main(){
// ref : https://theboostcpplibraries.com/boost.filesystem-paths
// ws : workspace where you keep all the files
fs::path ws = fs::path(getenv("HOME")) / "ws";
// ref : https://theboostcpplibraries.com/boost.filesystem-iterators
fs::directory_iterator it{ws};
while (it != fs::directory_iterator{}){
std::cout << "Processing file < " << *it << " >" << std::endl;
// ... do other stuff
// Parse the current filename into its parts, then change the extension to txt
// ref : https://theboostcpplibraries.com/boost.filesystem-paths
std::stringstream ss;
ss << (ws / fs::path(*it).stem()).native() << ".txt";
fs::path new_path(ss.str());
std::cout << "Copying into < " << new_path << " >" << std::endl;
// ref : http://www.boost.org/doc/libs/1_53_0/libs/filesystem/doc/reference.html
fs::copy_file(*it++, new_path, fs::copy_option::overwrite_if_exists);
}
return 0;
}
You can compile with this :
g++ -std=c++14 -o main main.cc -lboost_filesystem -lboost_system
Given that you are talking about Microsoft Word and "folder", I guess you are running Windows.
The Windows API provides the FindFirstFile / FindNextFile pair of functions, which allow your program to automatically find the names of existing files. The official example is named "Listing the Files in a Directory".
On Linux and Unix platforms, there are functions named opendir and readdir which serve the same purpose.
If you want to write cross-platform code, there are libraries that provide an abstraction layer above the OS functions such as boost::filesystem.

Brought a Linux C++ Console Application to a Win32 C++ App using VS2010 and the search function from <algorithm> is no longer working

Just like the title says, I've been working on a fairly large program and have come upon this bug. I'm also open to alternatives for searching a file for a string instead of using <algorithm>. Here is my code narrowed down:
istreambuf_iterator<char> eof;
ifstream fin;
fin.clear();
fin.open(filename.c_str());
if(fin.good()){
//I outputted text to a file to make sure opening the file worked, which it does
}
//term was not found.
if(eof == search(istreambuf_iterator<char>(fin), eof, term.begin(), term.end())){
//PROBLEM: this code always executes even when the string term is in the file.
}
So just to clarify, my program worked correctly in Linux, but now that I have it in a Win32 app project in VS2010, the application builds just fine but the search function isn't working like it normally did. (What I mean by normal is that the code in the if statement didn't execute when the term was in the file, whereas now it always executes.)
NOTE: The file is a .xml file and the string term is simply "administration."
One thing that might or might not be important is that filename (from the code above) is an XML file I have created in the program myself using the code below. Pretty much, I create an XML file identical to the pre-existing one, except that it is all lower case and in a new location.
void toLowerFile(string filename, string newloc, string& newfilename){
//variables
ifstream fin;
ofstream fout;
string temp = "/";
newfilename = newloc + temp + newfilename;
//open file to read
fin.open(filename.c_str());
//open file to write
fout.open(newfilename.c_str());
//loop through and read line, lower case, and write
while (getline(fin, temp)){
//write lower case version
toLowerString(temp);
fout << temp << endl;
}
//close files
fout.close();
fin.close();
}
void toLowerString(string& data){
std::transform(data.begin(), data.end(), data.begin(), ::tolower);
}
I'm afraid your code is invalid - the search algorithm requires forward iterators, but istreambuf_iterator is only an input iterator.
Conceptually that makes sense - the algorithm needs to backtrack on a partial match, but the stream may not support backtracking.
The actual behaviour is undefined - so the implementation is allowed to be helpful and make it seem to work, but doesn't have to.
I think you either need to copy the input, or use a smarter search algorithm (single-pass is possible) or a smarter iterator.
(In an ideal world at least one of the compilers would have warned you about this.)
Generally, with Microsoft's compiler, if your program compiles and links a main() function rather than a wmain() function, everything defaults to char. It would be wchar_t or WCHAR if you have a wmain(). If you have _tmain() instead, then you are at the mercy of your compiler/make settings, and it's the UNICODE macro that determines which flavor your program uses. But I doubt that a char/wchar_t mismatch is actually the issue here, because I think you would have got a warning or error if all four of the search parameters didn't use the same character width.
This is a bit of a guess, but try this:
if(eof == search(istreambuf_iterator<char>(fin.rdbuf()), eof, term.begin(), term.end()))

C++ text file I/O

This is a very simple question but wherever I look I get a different answer (is this because it's changed or will change in c++0x?):
In c++ how do I read two numbers from a text file and output them in another text file?
Additionally, where do I put the input file? Just in the project directory? And do I need to already have the output file? Or will one be created?
You're probably getting different answers because there are many different ways to do this.
Reading and writing two numbers can be pretty simple:
std::ifstream infile("input_file.txt");
std::ofstream outfile("output_file.txt");
int a, b;
infile >> a >> b;
outfile << a << "\t" << b;
You (obviously) need to replace "input_file.txt" with the name of a real text file. You can specify that file with an absolute or relative path, if you want. If you only specify the file name, not a path, that means it'll look for the file in the "current directory" (which may or may not be the same as the directory containing the executable).
When you open a file just for writing as I have above, by default any existing data will be erased, and replaced with what you write. If no file by that name (and again, you can specify the path to the file) exists, a new one will be created. You can also specify append mode, which adds new data to the end of the existing file, or (for an std::fstream) update mode, where you can read existing data and write new data.
If your program is a filter, i.e. it reads input from somewhere and writes output elsewhere, you will benefit from using standard input and standard output instead of named files. That lets you use shell redirection to work with files, sparing your program from handling the file operations itself.
#include <iostream>
int main()
{
int a, b;
std::cin >> a >> b;
std::cout << a << " " << b;
}
Then use it from the shell.
> cat my_input_file | my_program > my_output_file
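Put the input file in the program's current working directory (if you launch the program from its own folder, that is the folder containing the executable), or use a file path to point at it.
The output file will be created if it does not exist.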

Copying contents of one file to another in C++

I am using the following program to try to copy the contents of a file, src, to another, dest, in C++. The simplified code is given below:
#include <fstream>
using namespace std;
int main()
{
fstream src("c:\\tplat\\test\\secClassMf19.txt", fstream::binary);
ofstream dest("c:\\tplat\\test\\mf19b.txt", fstream::trunc|fstream::binary);
dest << src.rdbuf();
return 0;
}
When I built and executed the program using CODEBLOCKS ide with GCC Compiler in windows, a new file named "....mf19.txt" was created, but no data was copied into it, and filesize = 0kb. I am positive I have some data in "...secClassMf19.txt".
I experienced the same problem when I compiled the same program in Visual C++ 2008 on Windows.
Can anyone please help explain why I am getting this unexpected behaviour, and more importantly, how to solve the problem?
You need to check whether opening the files actually succeeds before using those streams. Also, it never hurts to check if everything went right afterwards. Change your code to this and report back:
#include <cstdlib>
#include <fstream>
#include <iostream>
int main()
{
// Note the doubled backslash before "test": a single \t would be a tab character
std::fstream src("c:\\tplat\\test\\secClassMf19.txt", std::ios::binary);
if(!src.good())
{
std::cerr << "error opening input file\n";
std::exit(1);
}
std::ofstream dest("c:\\tplat\\test\\mf19b.txt", std::ios::trunc|std::ios::binary);
if(!dest.good())
{
std::cerr << "error opening output file\n";
std::exit(2);
}
dest << src.rdbuf();
// operator<< reads from the stream buffer directly, so src's own state flags
// are not updated; a failed (or empty) copy shows up as failbit on dest.
if(!dest.good())
std::cerr << "copying failed\n";
return 0;
}
I bet you will report that one of the first two checks hits.
If opening the input file fails, try opening it using std::ios::in|std::ios::binary instead of just std::ios::binary.
Do you have any reason not to use the CopyFile function?
As it is written, your src instance is a regular fstream, and you are not specifying an open mode for input. The simple solution is to make src an instance of ifstream, and your code works. (Just by adding one byte!)
If you had tested the input stream (as sbi suggests), you would have found that it was not opened correctly, which is why your destination file was of zero size. It was opened in write mode (since it was an ofstream) with the truncation option to make it zero, but writing the result of rdbuf() simply failed, with nothing written.
Another thing to note is that while this works fine for small files, it may be inefficient for large files. How much is buffered at a time when you stream rdbuf() is left to the standard library implementation. If you want control over memory use, read in chunks (say 1 MB, a reasonable size for a disk cache) and write a chunk at a time, with the last one being the remainder of the size. To determine the source's size, you can seek to the end and query the file offset; then you know how many bytes you are processing.
And you will probably find your OS is even more efficient at copying files if you use the native APIs, but then it becomes less portable. You may want to look at the Boost filesystem module for a portable solution.

Globbing in C++/C, on Windows

Is there a smooth way to glob in C or C++ in Windows?
E.g., myprogram.exe *.txt sends my program an ARGV list that has...ARGV[1]=*.txt in it.
I would like to be able to have a function (let's call it readglob) that takes a string and returns a vector of strings, each containing a filename.
This way, if I have files a.txt b.txt c.txt in my directory and readglob gets an argument *.txt, it returns the above filelist.
//Prototype of this hypothetical function.
vector<string> readglob(string);
Does such exist?
Link with setargv.obj (or wsetargv.obj) and argv[] will be globbed for you similar to how the Unix shells do it:
http://msdn.microsoft.com/en-us/library/8bch7bkk.aspx
I can't vouch for how well it does it though.
This is very Windows-specific. I don't know how you'd write this to be cross-platform. But I've used this in Windows programs and it works well for me.
// Change to the specified working directory
string path;
cout << "Enter the path to report: ";
cin >> path;
_chdir(path.c_str());
// Get the file description
string desc;
cout << "Enter the file description: ";
cin >> desc;
// List the files in the directory
intptr_t file;
_finddata_t filedata;
file = _findfirst(desc.c_str(),&filedata);
if (file != -1)
{
do
{
cout << filedata.name << endl;
// Or put the file name in a vector here
} while (_findnext(file,&filedata) == 0);
// Only close the handle when _findfirst succeeded
_findclose(file);
}
else
{
cout << "No described files found" << endl;
}
There was talk about having it in Boost.Filesystem, but it was dropped in favor of using Boost.Regex.
For Win32-specific (MFC) code, you can use the CFileFind class.
There may be a better way now, but last time I had to deal with this problem I ended up including Henry Spencer's regex library statically linked into my program (his library is BSD licensed), and then I made a wrapper class that converted the user's glob-expressions into regular expressions to feed to the regex code. You can view/grab the wrapper class here if you like.
Once you have those parts in place, the final thing to do is actually read the directory, and pass each entry name into the matching function to see if it matches the expression or not. The filenames that match, you add to your vector; the ones that don't you discard. Reading the directory is fairly straightforward to do using the DOS _findfirst() and _findnext() functions, but if you want a nicer C++ interface I have a portable wrapper class for that also...
Ehw. I had to implement something like this in ANSI C about 15 years ago. Start with the POSIX opendir/readdir routines, I guess. Globs aren't exactly regexes, so you will have to implement your own filtering.