How to skip a directory while reading using dirent.h - C++

I am trying to recursively open files using the functionality provided in dirent.h.
My problem is:
I could not make it skip directories that fail to open. I want it to open the directories it can, skip those it can't, and move on to the next directory instead of exiting with a failure.
What should I do to fix this?
Here is the simple code I tried to use:
int acessdirs(const char *path)
{
struct dirent *entry;
DIR *dp;
char fpath[300];
if(dp=opendir(path))
{
while((entry=readdir(dp)))
{
// do things here
}
closedir(dp);
}
else
{
std::cout<<"error opening directory";
return 0;
}
return 1;
}
I used this same style on Windows 7 and it works fine. But it crashed on Windows XP, and when I debugged it I found that it crashes while trying to open "System Volume Information".
I really don't need to access this folder, and I was hoping there is a way to skip it.
Here is my real code.
It is a little long.
int listdir(const char *path)
{
struct dirent *entry;
DIR *dp;
if(dp = opendir(path))
{
struct stat buf ;
while((entry = readdir(dp)))
{
std::string p(path);
p += "\\";
p += entry->d_name;
char fpath[300];
if(!stat(p.c_str(), &buf))
{
if(S_ISREG(buf.st_mode))
{
sprintf(fpath,"%s\\%s",path,entry->d_name);
stat(fpath, &buf);
std::cout<<"\n Size of \t"<<fpath<<"\t"<<buf.st_size;
fmd5=MDFile (fpath);
}//inner second if
if(S_ISDIR(buf.st_mode) &&
// the following is to ensure we do not dive into directories "." and ".."
strcmp(entry->d_name, ".") && strcmp(entry->d_name, "..") )
{
listdir(p.c_str());
}
}//inner first if
else
std::cout << "ERROR in stat\n";
}//end while
closedir(dp);
}//first if
else
{
std::cout << "ERROR in opendir\n";
return 0;
}
return 1;
}//listdir()

Your biggest problem seems to be here:
sprintf(fpath,"%s\\%s",path,entry->d_name);
stat(fpath, &buf);
Without seeing fpath's declaration, it's tough to tell for certain, but you're either:
1. Overflowing fpath in the sprintf call, leading to undefined behavior. "System Volume Information" is a long name. You should really use snprintf.
2. Not checking the return value of the stat call. If it returns -1, I'm not sure what the contents of buf will be.
More importantly, if you can use POSIX stuff, the function ftw is standard and should provide most of the functionality you're trying to implement here.

Related

How can I open directories using a similar function like opendir, but using the filesystem library?

How can I list directories using a function similar to opendir, but using the filesystem library from C++?
The opendir function from dirent opens the directory without showing anything, which is okay.
for (const auto & entry : fs::recursive_directory_iterator(dir))
Basically, this is the code I use to loop through the directories; fs is the filesystem namespace.
Now, opendir opens the directory silently. If it doesn't have enough permissions, it can't open the directory, and the function below prints
"No such directory". (I didn't write this function myself.)
void SearchFiles(std::string Directory)
{
DIR* pDir;
if((pDir = opendir(Directory.c_str())) != NULL)
{
struct dirent* pEntry;
/* print all the files and directories within directory */
while((pEntry = readdir(pDir)) != NULL)
{
if(pEntry->d_type == DT_DIR)
{
std::string Name = pEntry->d_name;
if(Name != "." && Name != "..")
SearchFiles(Directory + Name + '\\');
}
else if(pEntry->d_type == DT_REG)
{
g_vFiles.push_back(Directory + pEntry->d_name);
}
}
closedir(pDir);
}
else
{
printf("No such directory: '%s'\n", Directory.c_str());
}
}
Now, I don't understand much of the code above, but yeah...
I'm not sure how new the filesystem library is, but does it have a function that does what the code above does?
Because when I use the filesystem loop above, it lists everything, even the directories I do not have permissions for, I guess.
In case that isn't possible without dirent.h: do I really have to use the * for pDir? Can't I just write DIR pDir? The pointer part is not really clear to me.

c++ - fopen() internally changes my filename?

I use fopen() in my C++ program to open a .aff file.
I want to parse a file named car_wheel.aff, and after if(ifp=fopen(path,"r")) has executed, it seems that fopen() changes my path variable.
I have added some detail in the code comments.
Code (since the variable path is constructed by my code, I include the whole piece here, which may seem a bit redundant):
char* dir = "../kitchen/";
char filename[100];
char* path;
FILE *ifp;
int detail_level;
if(fscanf(fp,"%d %s",&detail_level,filename)!=2)
{
printf("Error: could not parse include.\n");
exit(0);
}
path = (char*)malloc(strlen(dir)+strlen(filename));
strcpy(path, dir);
strcat(path, filename); // path is "../kitchen/car_wheel.aff"
if(detail_level<=gDetailLevel)
{
if(ifp=fopen(path,"r"))
{
viParseFile(ifp);
fclose(ifp);
}
else
{
// jumped here and path became "../kitchen/car_wheel.aff1\002"
if (ifp == NULL) {
perror(path);
exit(EXIT_FAILURE);
}
printf("Error: could not open include file: <%s>.\n",filename);
exit(1);
}
}
I debugged the code in my IDE and inspected the filename char array:
there is no '1\002' after the file name in it. What happened?
The problem is here:
path = (char*)malloc(strlen(dir)+strlen(filename));
You don't allocate space for the terminating zero character. Change it to this:
path = (char*)malloc(strlen(dir)+strlen(filename)+1);

readdir on AWS EFS doesn't return all files in directory

After many files (10k or so) have been written to a series of folders on EFS, readdir stops returning all of the files in each directory.
I have a C++ application that, in one part of its process, generates a lot of files and creates a symlink for each one. After that I need to get a list of the files in a folder and then select a subset to rename. When I run the function that gets the list of files, it does not return all the files that are actually there. This code runs fine on my local machine, but on an AWS server with a mounted EFS drive, it stops working after a while.
In order to troubleshoot this issue, I have made my code write only one file at a time. I have also set up my code to use getFiles() to give me a count of how many files there are in a folder after writing each batch of files (around 17 files). When the number of files reaches ~950, getFiles() starts listing ~910 files and no longer increments. The files being written vary in size but are fairly small (2 bytes to 300K), and the code writes about 200 files a second. Each file also has a symlink created to it.
When reading and writing files I am using POSIX open(), write(), read() and close(). I have verified that I do in fact close all files after reading or writing.
I am trying to figure out:
1. Why is readdir not working? Or why is it not listing all the files?
2. What is different about EFS that could be causing issues?
These are the functions I am using to get the list of files in a folder:
DIR * FileUtil::getDirStream(std::string path) {
bool success = false;
if (!folderExists(path)){
return NULL;
}
DIR * dir = opendir(path.c_str());
success = dir != NULL;
int count = 0;
while(!success){
int fileRetryDelay = BlazingConfig::getInstance()->getFileRetryDelay();
const int sleep_milliseconds = (count+1)*fileRetryDelay;
std::this_thread::sleep_for(std::chrono::milliseconds(sleep_milliseconds));
std::cout<<"Was unable to get Dir stream for "<<path<<std::endl;
dir = opendir(path.c_str());
success = dir != NULL;
count++;
if(count > 6){
break;
}
}
if(!success){
std::cout<<"Can't get Dir stream for "<<path<<". Error was: "<<errno<<std::endl;
}
return dir;
}
int FileUtil::getDirEntry(DIR * dirp, struct dirent * & prevDirEntry, struct dirent * & dirEntry){
bool success = false;
if (dirp == NULL){
return -1;
}
int returnCode = readdir_r(dirp, prevDirEntry, &dirEntry);
success = (dirEntry == NULL && returnCode == 0) || dirEntry != NULL;
int count = 0;
while(!success){
int fileRetryDelay = BlazingConfig::getInstance()->getFileRetryDelay();
const int sleep_milliseconds = (count+1)*fileRetryDelay;
std::this_thread::sleep_for(std::chrono::milliseconds(sleep_milliseconds));
std::cout<<"Was unable to get dirent with readdir"<<std::endl;
returnCode = readdir_r(dirp, prevDirEntry, &dirEntry);
success = (dirEntry == NULL && returnCode == 0) || dirEntry != NULL;
count++;
if(count > 6){
break;
}
}
if(!success){
std::cout<<"Can't get dirent with readdir. Error was: "<<errno<<std::endl;
}
return returnCode;
}
std::vector<std::string> FileUtil::getFiles(std::string baseFolder){
DIR *dir = getDirStream(baseFolder);
std::vector <std::string> subFolders;
if (dir != NULL) {
struct dirent *prevDirEntry = NULL;
struct dirent *dirEntry = NULL;
int len_entry = offsetof(struct dirent, d_name) + fpathconf(dirfd(dir), _PC_NAME_MAX) + 1;
prevDirEntry = (struct dirent *)malloc(len_entry);
int returnCode = getDirEntry(dir, prevDirEntry, dirEntry);
while (dirEntry != NULL) {
if( dirEntry->d_type == DT_REG || dirEntry->d_type == DT_LNK){
std::string name(dirEntry->d_name);
subFolders.push_back(name);
}
returnCode = getDirEntry(dir, prevDirEntry, dirEntry);
}
free(prevDirEntry);
closedir (dir);
} else {
std::cout<<"Could not open directory err num is"<<errno<<std::endl;
/* could not open directory */
perror ("");
}
return subFolders;
}
The functions were written this way to be as robust as possible: since many threads can be performing file operations, I wanted the code to retry on any failure. Unfortunately, when getFiles() returns the wrong result, it gives me no indication of failure.
Note: when I use readdir instead of readdir_r, I still have the same issue.

DIR Functions returning false after a few days

I have a daemon (running on Ubuntu Server 16.04, compiled with g++ -std=c++11) that relies on two functions to know if a directory exists and if it's empty:
bool DirectoryExists ( const char* path ) {
if( path == NULL ) return false;
DIR *d;
d = opendir(path);
if (d){
closedir(d);
return true;
}
return false;
}
bool isEmpty(const char* path) {
int n = 0;
//Directory scan
DIR *d;
struct dirent *dir;
d = opendir(path);
if (d){
while ((dir = readdir(d)) != NULL){
if(dir->d_name[0] == '.')continue;
if(++n > 0) break;
}
closedir(d);
}
else{
return false;
}
if (n == 0) //Directory Empty
return true;
else
return false;
}
The problem is that after a day or two of the daemon working, these functions start constantly returning FALSE (both of them) when they should return TRUE. I have the suspicion that the DIR * pointer is not closing correctly but I couldn't manage to fix it.
What am I doing wrong here?
EDIT:
In some parts of my code, I use DirectoryExists to check if a removed directory is actually gone, or if it's still there. When that checking is done, the errno is set to "No such file or directory", which is correct, but I don't know if that could be the source of my problem.
system(("rm -rf " + fullpath).c_str());
if(DirectoryExists(std::string(fullpath).c_str())){
syslog(LOG_ERR, "ERROR: Directory %s couldn't be removed", fullpath.c_str());
return false;
}
EDIT 2:
As I suspected, when these functions start failing, errno is set to "Too many open files".
My guess would be the offending line is the one containing std::string(fullpath).c_str():
if(DirectoryExists(std::string(fullpath).c_str())) {
...
}
You are using a temporary which gets destroyed before the function even enters. By luck, the memory pointed to by c_str() seems to contain the string you want, but at some point this ceases to be the case (perhaps because of memory fragmentation increasing pressure on the memory allocator to reuse memory).

File count in a directory using C++

How do I get the total number of files in a directory by using C++ standard library?
If you don't exclude the basically always available C standard library, you can use that one.
Because it's available everywhere anyway, unlike Boost, it's a pretty usable option!
Here is an example:
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>
int main (void)
{
DIR *dp;
int i = 0;
struct dirent *ep;
dp = opendir ("./");
if (dp != NULL)
{
while (ep = readdir (dp))
i++;
(void) closedir (dp);
}
else
perror ("Couldn't open the directory");
printf("There's %d files in the current directory.\n", i);
return 0;
}
And sure enough
> $ ls -a | wc -l
138
> $ ./count
There's 138 files in the current directory.
This isn't C++ at all, but it is available on most, if not all, operating systems, and will work in C++ regardless.
UPDATE: I'll correct my previous statement about this being part of the C standard library: it's not. But you can carry this concept over to other operating systems, because they all have their own ways of dealing with files without having to pull in additional libraries.
EDIT: Added initialization of i.
You can't. The closest you are going to be able to get is to use something like Boost.Filesystem.
EDIT: It is possible with C++17 using the standard library's filesystem library.
As of C++17 it can be done with the standard library:
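#include <algorithm>
#include <filesystem>
// Counts the regular files directly inside "directory_path" (not recursive).
auto dirIter = std::filesystem::directory_iterator("directory_path");
int fileCount = std::count_if(
begin(dirIter),
end(dirIter),
[](const auto& entry) { return entry.is_regular_file(); }
);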
A simple for-loop works, too:
auto dirIter = std::filesystem::directory_iterator("directory_path");
int fileCount = 0;
for (auto& entry : dirIter)
{
if (entry.is_regular_file())
{
++fileCount;
}
}
See https://en.cppreference.com/w/cpp/filesystem/directory_iterator
An old question, but since it appears first on Google search, I thought to add my answer since I had a need for something like that.
int findNumberOfFilesInDirectory(const std::string& path)
{
int counter = 0;
WIN32_FIND_DATAA ffd; // ANSI struct, to match FindFirstFileA/FindNextFileA
HANDLE hFind = INVALID_HANDLE_VALUE;
// Start iterating over the entries that match path. To enumerate a
// directory's contents, path must end in a wildcard such as "\\*".
hFind = ::FindFirstFileA (path.c_str(), &ffd);
if (hFind != INVALID_HANDLE_VALUE)
{
do // Managed to locate and create a handle for that search.
{
counter++; // note: this also counts ".", ".." and subdirectories
} while (::FindNextFileA(hFind, &ffd));
::FindClose(hFind);
} else {
printf("Failed to find path: %s", path.c_str());
}
return counter;
}
If they are well named, sorted, and have the same extension, you could simply count them with the standard C++ library.
Assume the file names are like "img_0.jpg..img_10000.jpg..img_n.jpg".
Just check whether each one is in the folder or not.
int Trainer::fileCounter(string dir, string prefix, string extension)
{
int returnedCount = 0;
int possibleMax = 5000000; //some number you can expect.
for (int istarter = 0; istarter < possibleMax; istarter++){
string fileName = "";
fileName.append(dir);
fileName.append(prefix);
fileName.append(to_string(istarter));
fileName.append(extension);
bool status = FileExistenceCheck(fileName);
returnedCount = istarter;
if (!status)
break;
}
return returnedCount;
}
bool Trainer::FileExistenceCheck(const std::string& name) {
struct stat buffer;
return (stat(name.c_str(), &buffer) == 0);
}
You would need to use a native API or framework.