C++ folder opening and file counting - c++

So this is my code, but I can't prevent it from printing . and .. and it counts them as files. I couldn't understand why.
The output is:
.
1files.
..
2files.
course3.txt
3files.
course2.txt
4files.
course1.txt
5files.
But there are only 3 files. It should say 3 files, but instead it also counts . and .. and I don't know what they mean.
int folderO(){
DIR *dir;
struct dirent *ent;
int nFiles=0;
if ((dir = opendir ("sampleFolder")) != NULL) {
/* print all the files and directories within directory */
while ((ent = readdir (dir)) != NULL) {
std::cout << ent->d_name << std::endl;
nFiles++;
std::cout << nFiles << "files." << std::endl;
}
closedir (dir);
}
else {
/* could not open directory */
perror ("");
return EXIT_FAILURE;
}
}

. and .. are meta directories, current directory and parent directory respectively.
What you have found is that subdirectories are being printed along with files. And so are symlinks and other "weird" Unix-y stuff. Couple ways to filter those out if you don't want them printed:
If your system supports d_type in the dirent structure, check that d_type == DT_REG (the constant for a regular file) before printing. (GNU page on dirent listing possible d_types)
if (ent->d_type == DT_REG)
{
std::cout << ent->d_name << std::endl;
nFiles++;
std::cout << nFiles << "files." << std::endl;
}
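If d_type is not supported, stat the file and check that it is a regular file with the S_ISREG macro on st_mode. Note that stat resolves a relative name against the current working directory, so build the path from the directory name first.
struct stat statresult;
std::string fullPath = std::string("sampleFolder/") + ent->d_name;
if (stat(fullPath.c_str(), &statresult) == 0)
{
if (S_ISREG(statresult.st_mode))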
{
std::cout << ent->d_name << std::endl;
nFiles++;
std::cout << nFiles << "files." << std::endl;
}
}
And of course there is the dumb-simple if statement that compares the name against "." and ".." (with std::string operator== since this is C++, or strcmp in C), but this will still list all other subdirectories.
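The two checks above can be combined into one function. The following is a minimal sketch, assuming a POSIX system; the name countRegularFiles is hypothetical and not part of the original code. It uses d_type when the platform reports it and falls back to lstat when the type is unknown, building the full path because a bare entry name would be resolved against the current working directory.

```cpp
#include <dirent.h>
#include <sys/stat.h>
#include <cstdio>
#include <string>

// Hypothetical helper: counts the regular files directly inside dirPath.
// Returns -1 if the directory cannot be opened.
int countRegularFiles(const std::string& dirPath)
{
    DIR* dir = opendir(dirPath.c_str());
    if (dir == NULL) {
        std::perror("opendir");
        return -1;
    }
    int nFiles = 0;
    struct dirent* ent;
    while ((ent = readdir(dir)) != NULL) {
        bool isRegular = false;
#ifdef _DIRENT_HAVE_D_TYPE
        // glibc defines this macro when struct dirent has a d_type member.
        if (ent->d_type != DT_UNKNOWN) {
            isRegular = (ent->d_type == DT_REG);
        } else
#endif
        {
            // Fallback: ask the filesystem. lstat() does not follow symlinks,
            // which matches what d_type reports for a link (DT_LNK).
            struct stat st;
            const std::string fullPath = dirPath + "/" + ent->d_name;
            isRegular = (lstat(fullPath.c_str(), &st) == 0) && S_ISREG(st.st_mode);
        }
        if (isRegular) {
            ++nFiles; // "." and ".." are directories, so they never get here
        }
    }
    closedir(dir);
    return nFiles;
}
```

For the folder in the question, countRegularFiles("sampleFolder") would be expected to report only the three course files, since . and .. are directories.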

. is current directory inode (technically, a hardlink), .. is parent directory.
These are there for navigation. They're directories, perhaps you can ignore them if they are directories?

A Google search would have revealed that these are special folder names with these meanings:
. the current directory
.. the parent directory
Any tutorial on iterating a directory shows you how to filter these out with a simple "if" statement.
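As an illustration of that simple "if" statement, here is a small sketch. The helper name isDotEntry is hypothetical; it only recognizes the two special names and deliberately leaves other names that start with a dot alone.

```cpp
#include <cstring>

// Hypothetical helper: true only for the special entries "." and "..".
bool isDotEntry(const char* name)
{
    return std::strcmp(name, ".") == 0 || std::strcmp(name, "..") == 0;
}
```

Inside a readdir loop it could be used as: if (isDotEntry(ent->d_name)) continue; placed before the entry is printed or counted. Note that this filter alone still lets real subdirectories through.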

Related

Error opening existing directory using opendir in c++

I am making use of opendir() as below to access a directory.
DIR *dp;
if((dp = opendir(dir.c_str())) == NULL) {
cout << "Error(" << errno << ") opening " << dir << endl;
return errno;
}
However, I keep getting the error below, even though the directory exists.
Error(2) opening /path/to/folder/
I am able to get a list of file names when I do ls /path/to/folder
Be aware that /path/to/folder is different from /path/to/folder/
errno value 2 means ENOENT (an abbreviation for Error NO ENTry), that is, "Directory does not exist, or name is an empty string".
How do you define dir in your code?
std::string dir = "/path/to/folder/";
DIR* dp = opendir(dir.c_str());
if (dp == NULL)
{
std::cout << "Error(" << errno << ") opening " << dir << std::endl;
perror("opendir");
return errno;
}
closedir(dp);
Update #1:
Try calling your shell script:
main.sh folder/ foldername
Where main.sh contains:
#!/bin/sh
path="$1$2"
echo "$path"
ls -l "$path"
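When the reported errno looks implausible, it can help to capture it immediately and print the path in quotes, so that stray whitespace or an unexpected concatenation becomes visible. The sketch below assumes a POSIX system; describeOpendirFailure is a hypothetical name. It copies errno right after opendir fails, because later library calls (including stream output) are allowed to change it.

```cpp
#include <dirent.h>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: returns an empty string if path can be opened as a
// directory, otherwise a message with the quoted path and the errno text.
std::string describeOpendirFailure(const std::string& path)
{
    errno = 0;
    DIR* dp = opendir(path.c_str());
    if (dp != NULL) {
        closedir(dp);
        return "";
    }
    const int savedErrno = errno; // copy before any other call can overwrite it
    return "opendir(\"" + path + "\") failed: " + std::strerror(savedErrno) +
           " (errno " + std::to_string(savedErrno) + ")";
}
```

Printing the returned message for the path that is actually built at run time shows exactly which string reached opendir, which is often where the mismatch with the ls command lies.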

Find all files in a directory and its subdirectory

I would like to list all files in a given directory and its different subdirectory.
I found some code that I modified, but it loops endlessly and I don't understand why.
int getdir (string dir, vector<string> &files)
{
DIR *dp;
struct dirent *dirp;
if((dp = opendir(dir.c_str())) == NULL) {
cout << "Error(" << errno << ") opening " << dir << endl;
return errno;
}
while ((dirp = readdir(dp)) != NULL) {
files.push_back(string(dirp->d_name));
string test=dir+"/"+dirp->d_name;
getdir(test,files);
}
closedir(dp);
return 0;
}
My main:
int main()
{
string dir = string(".");
vector<string> files = vector<string>();
getdir(dir,files);
for (unsigned int i = 0;i < files.size();i++) {
cout << files[i] << endl;
}
return 0;
}
How could I fix it?
This is likely due to the "." directory entry returned as the first entry which represents the current directory.
This causes your algorithm to try to list the entries for ./. and then ././. and so on, endlessly repeating until your program eventually crashes when it runs out of memory.
There's also a ".." directory entry which represents the parent directory and can cause a similar recursive problem.
As noted by Jerry Coffin, symbolic links can also cause a very similar issue if you have links which point to a directory which is the parent or ancestor of the symbolic link. This could be avoided with a much more complicated check or by simply excluding DT_LNK type entries altogether.
Another issue is that you're trying to call getdir on files as well as subdirectories.
Try the following changes
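while ((dirp = readdir(dp)) != NULL) {
// dirp is the directory entry; dir is the std::string path of the directory
string name(dirp->d_name);
if (name != "." && name != "..") {
string test=dir+"/"+name;
files.push_back(test);
// recurse only into real subdirectories, never into plain files
if (dirp->d_type == DT_DIR) {
getdir(test,files);
}
}
}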
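For reference, a complete version of the function with these points applied might look like the sketch below. It is an assumption-laden variant rather than the original answer's code: instead of d_type it uses lstat to decide whether an entry is a directory, which also works on filesystems that report DT_UNKNOWN and, because lstat does not follow symbolic links, never descends through a link.

```cpp
#include <dirent.h>
#include <sys/stat.h>
#include <cerrno>
#include <string>
#include <vector>

// Appends the path of every entry below dir (files and subdirectories) to
// files. Returns 0 on success, or the errno value if dir itself cannot be
// opened. Failures inside subdirectories are ignored in this sketch.
int getdir(const std::string& dir, std::vector<std::string>& files)
{
    DIR* dp = opendir(dir.c_str());
    if (dp == NULL) {
        return errno;
    }
    struct dirent* dirp;
    while ((dirp = readdir(dp)) != NULL) {
        const std::string name(dirp->d_name);
        if (name == "." || name == "..") {
            continue; // self and parent entries would recurse forever
        }
        const std::string path = dir + "/" + name;
        files.push_back(path);
        struct stat st;
        // lstat() reports a symbolic link as a link, not as its target,
        // so a link to a directory is listed but not entered.
        if (lstat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode)) {
            getdir(path, files);
        }
    }
    closedir(dp);
    return 0;
}
```

The main function from the question can call this unchanged, since the signature still takes the start directory and the output vector.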

Reading file names from a directory

I'm reading all file names from a certain directory using this function:
void getdir(std::string dir, std::list<std::string>& files)
{
DIR *dp;
struct dirent *dirp;
if((dp = opendir(dir.c_str())) == NULL)
{
std::cout<< "Error: path " << dir << " onbekend!\n";
}
else
{
while ((dirp = readdir(dp)) != NULL)
{
files.push_back(std::string(dirp->d_name));
}
closedir(dp);
}
}
When I print them out, I get '.' or '..' too with the filenames. But the file '.' or '..' is not in the directory.
I'm using ubuntu 12.04 :)
. is current directory, and .. is parent directory, you will find them in every directory.

How do I ignore hidden files (and files in hidden directories) with Boost Filesystem?

I am iterating through all files in a directory recursively using the following:
try
{
for ( bf::recursive_directory_iterator end, dir("./");
dir != end; ++dir )
{
const bf::path &p = dir->path();
if(bf::is_regular_file(p))
{
std::cout << "File found: " << p.string() << std::endl;
}
}
} catch (const bf::filesystem_error& ex) {
std::cerr << ex.what() << '\n';
}
But this includes hidden files and files in hidden directories.
How do I filter out these files? If needed I can limit myself to platforms where hidden files and directories begin with the '.' character.
Unfortunately there doesn't seem to be a cross-platform way of handling "hidden". The following works on Unix-like platforms:
First define:
bool isHidden(const bf::path &p)
{
bf::path::string_type name = p.filename();
if(name != ".." &&
name != "." &&
name[0] == '.')
{
return true;
}
return false;
}
Then traversing the files becomes:
try
{
for ( bf::recursive_directory_iterator end, dir("./");
dir != end; ++dir)
{
const bf::path &p = dir->path();
//Hidden directory, don't recurse into it
if(bf::is_directory(p) && isHidden(p))
{
dir.no_push();
continue;
}
if(bf::is_regular_file(p) && !isHidden(p))
{
std::cout << "File found: " << p.string() << std::endl;
}
}
} catch (const bf::filesystem_error& ex) {
std::cerr << ex.what() << '\n';
}
Let's assume for now that you want to ignore files which start with a '.'. This is the standard indication in Unix for a hidden file. I suggest writing a recursive function to visit each file. In pseudocode, it looks something like this:
visitDirectory dir
    for each file in dir
        if the filename of file does not begin with a '.'
            if file is a directory
                visitDirectory file
            else
                do something with file (perhaps as a separate function call?)
This avoids the need to search the whole path of a file to determine whether or not we want to deal with it. Instead, we simply skip any directories which are "hidden."
I can think of several iterative solutions as well, if that's what you prefer. One is to have a stack or queue to keep track of which directory to visit next. Basically this emulates the recursive version with your own data structure. Alternatively, if you are stuck on parsing the full path of the file, simply make sure you get the absolute path. This will guarantee that you don't encounter a directory with a name like './' or '../', which would cause problems with checking for a hidden file.
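The stack-based iterative variant can be sketched as follows. This is only an illustration under stated assumptions: it uses C++17 std::filesystem in place of Boost Filesystem, the names isHiddenName and visibleFiles are hypothetical, and "hidden" means the Unix convention of a leading dot. std::filesystem::directory_iterator never yields . or .., so they need no special case here, and it throws std::filesystem::filesystem_error if a directory cannot be read.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Unix convention only: a file name that starts with '.' counts as hidden.
// Intended for entries produced by a directory iterator, not for "." itself.
bool isHiddenName(const fs::path& p)
{
    const std::string name = p.filename().string();
    return !name.empty() && name[0] == '.';
}

// Hypothetical helper: returns every non-hidden regular file below root,
// without descending into hidden directories.
std::vector<fs::path> visibleFiles(const fs::path& root)
{
    std::vector<fs::path> result;
    std::vector<fs::path> pending; // explicit stack that replaces recursion
    pending.push_back(root);
    while (!pending.empty()) {
        const fs::path dir = pending.back();
        pending.pop_back();
        for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
            if (isHiddenName(entry.path())) {
                continue; // skips hidden files and whole hidden directories
            }
            if (entry.is_symlink()) {
                continue; // not followed, to avoid cycles through links
            }
            if (entry.is_directory()) {
                pending.push_back(entry.path());
            } else if (entry.is_regular_file()) {
                result.push_back(entry.path());
            }
        }
    }
    return result;
}
```

Because a hidden directory is never pushed onto the stack, nothing beneath it is examined, which is the same effect as the no_push() call in the Boost version.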

C++ Looping Through Files In Directory and Writing to a Different Directory

I am attempting to modify some existing C++ code to work with my needs, but having never used C++ before, I am having some difficulties.
My goal is:
--> time and memory-intensive processes for preparation
for each file in directory:
open file;
generate a tagged representation; //the current code just does this
write file; //different directory but same filename
The reason I do not want to just call the C++ program for each file (with, for instance, a shell script) is that prior to the below code running, time and memory-intensive pre-processing steps are performed. (These take about 45-60sec. while the code only takes about 2-5sec. to run.)
I have pasted the section of the code below. I want to read the arguments from the command line.
int main(int argc, char** argv) {
/*
pre-processing stuff
*/
/* for each file */
HANDLE hFind = INVALID_HANDLE_VALUE;
string path = argv[1];
string outpath = argv[2];
WIN32_FIND_DATA ffd;
//EDIT 2:
cout << "Path: " << path << '\n';
cout << "Outpath: " << outpath << '\n';
hFind = FindFirstFile(path.c_str(), &ffd);
if (hFind == INVALID_HANDLE_VALUE) {
cout << "error searching directory\n";
return false;
}
do {
//istream *is(&std::cin);
string filePath = path + ffd.cFileName;
ifstream in( filePath.c_str() );
if (in) {
/* for each line */
string line;
int n = 1;
string str;
string fullOutpath = outpath + ffd.cFileName;
ofstream File;
File.open(fullOutpath);
while (getline(in, line)) {
if (line.size() > 1024) {
cerr << "warning: the sentence seems to be too long at line " << n;
cerr << " (please note that the input should be one-sentence-per-line)." << endl;
}
string postagged = bidir_postag(line, vme, vme_chunking, dont_tokenize);
/* output to file */
File << postagged << endl;
//cout << postagged << endl;
/* increment counter */
n++;
}
File.close();
} else {
cout << "Problem opening file " << ffd.cFileName << "\n";
}
} while (FindNextFile(hFind, &ffd) != 0);
if (GetLastError() != ERROR_NO_MORE_FILES) {
cout << "Something went wrong during searching\n";
}
return true;
}
Currently, I am getting a compiler error: EDIT: compiler error fixed, thanks Blood!, but see below...
error: no matching function for call to 'std::basic_ofstream<char>::open<std::string&>
Any thoughts? Please let me know if you need more code/information. Also, I should add that I'm running these on Windows XP using command prompt.
Thanks.
EDIT:
It now compiles (thanks Blood), though when it runs it is only attempting to open the directory, not the files in the directory.
Problem opening file directory_name.
The ifstream should be opening the files in the directory, not the directory itself.
EDIT 2:
I am running the executable from the command line with the following prompt:
.\tag.exe C:\indir C:\outdir
I have also tried:
.\tag.exe C:\indir\* C:\outdir\
This enumerates all the files, but how can I capture them? Also, is there a simpler way to modify my code/input?
I have also tried:
.\tag.exe C:\indir\ C:\outdir\
This gives: error searching directory.
EDIT 3:
Using:
.\tag.exe "C:\indir\*" C:\outdir\
I get the output:
Problem opening file .
Problem opening file ..
Problem opening file 2967
Problem opening file 2966
Problem opening file 4707
etc. (100s)
Solution:
Here are the key changes to the code (thanks Nate Kohl!):
string path = argv[1];
path += "\\*";
hFind = FindFirstFile(path.c_str(),&ffd);
// in the 'do-while' loop
string filePath = argv[1];
filePath += "\\";
filePath += ffd.cFileName;
ifstream in(filePath.c_str());
//regarding the outpath
fullOutpath = outpath + "\\";
fullOutpath += ffd.cFileName;
File.open(fullOutpath.c_str());
and from the command line:
.\tag.exe C:\indir C:\outdir
The help was very much appreciated.
Make sure you're passing the right path format to FindFirstFile.
From the documentation:
To examine a directory that is not a root directory, use the path to
that directory, without a trailing backslash. For example, an argument
of "C:\Windows" returns information about the directory "C:\Windows",
not about a directory or file in "C:\Windows". To examine the files
and directories in "C:\Windows", use an lpFileName of "C:\Windows\*".
Edit:
I'm not near a windows box right now (so this may not compile!) but I imagine that "loop over each file in a directory" would look something like this:
// argv[1] is the input path with no trailing characters, e.g. "c:\indir"
// add a wildcard because FindFirstFile expects e.g. "c:\indir\*"
// the ANSI ("A") variants are used because argv holds char strings;
// requires <windows.h> and <shlwapi.h> (link with shlwapi.lib)
char wildcard_path[MAX_PATH];
PathCombineA(wildcard_path, argv[1], "*");
// iterate over each file
WIN32_FIND_DATAA ffd;
HANDLE hFind = FindFirstFileA(wildcard_path, &ffd);
if (hFind == INVALID_HANDLE_VALUE) { return 1; } // error: nothing to iterate
do {
// ignore directories (this also skips "." and "..")
if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)) {
// create a full path for each file we find, e.g. "c:\indir\foo.txt"
char file_path[MAX_PATH];
PathCombineA(file_path, argv[1], ffd.cFileName);
// ...and do something with file_path.
}
} while (FindNextFileA(hFind, &ffd) != 0);
FindClose(hFind);
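If a C++17 compiler is available (which the Windows XP setup in the question may well not have), the same loop can be written portably with std::filesystem. The sketch below is only an illustration: processDirectory and the transform parameter are hypothetical names, with transform standing in for the tagging call in the question. It reads every regular file in the input directory line by line and writes the transformed lines to a file of the same name in the output directory.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: applies transform to each line of each regular file
// in inDir and writes the result to outDir under the same file name.
// Returns the number of files written. inDir and outDir should differ.
int processDirectory(const fs::path& inDir, const fs::path& outDir,
                     const std::function<std::string(const std::string&)>& transform)
{
    fs::create_directories(outDir); // no-op if it already exists
    int written = 0;
    for (const fs::directory_entry& entry : fs::directory_iterator(inDir)) {
        if (!entry.is_regular_file()) {
            continue; // subdirectories are skipped; "." and ".." never appear
        }
        std::ifstream in(entry.path());
        std::ofstream out(outDir / entry.path().filename());
        if (!in || !out) {
            continue; // one of the two files could not be opened
        }
        std::string line;
        while (std::getline(in, line)) {
            out << transform(line) << '\n';
        }
        ++written;
    }
    return written;
}
```

The expensive pre-processing would run once before the call, and the per-line tagger would be passed in as transform, so the program is still started only once for the whole directory.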