readdir(): re-reading certain files - c++

I have a function whose task is to rename all files in a folder; however, it renames certain files a second time:
http://i.imgur.com/JjN8Qb2.png. The same kind of "error" keeps occurring for every tenth number onwards. What exactly is causing this "error"?
The two arguments to the function are the path of the folder and the start value for the first file's number.
int lookup(std::string path, int *start){
int number_of_chars;
std::string old_s, file_format, new_s;
std::stringstream out;
DIR *dir;
struct dirent *ent;
dir = opendir (path.c_str());
if (dir != NULL) {
// Read pass "." and ".."
ent = readdir(dir);
ent = readdir(dir);
// Change name of all the files in the folder
while((ent = readdir (dir)) != NULL){
// Old string value
old_s = path;
old_s.append(ent->d_name);
// Get the format of the image
file_format = ent->d_name;
number_of_chars = file_format.rfind(".");
file_format.erase(0,number_of_chars);
// New string value
new_s = path;
out << *start;
new_s += out.str();
new_s.append(file_format);
std::cout << "Successfully changed name on " << ent->d_name << "\tto:\t" << *start << file_format << std::endl;
// Switch name on the file from old string to new string
rename(old_s.c_str(), new_s.c_str());
out.str("");
*start = *start+1;
}
closedir (dir);
}
// Couldn't open
else{
std::cerr << "\nCouldn't open folder, check admin privileges and/or provided file path\n" << std::endl;
return 1;
}
return 0;
}

You are renaming files within the same folder you are iterating over, so readdir() can hand you files you have already renamed. You renamed 04.png to 4.png, but since you are iterating over all files in the folder, at some point you are going to reach the "new" 4.png file (in your sample, on the 40th iteration) and rename that file to 40.png, and so on.
The easiest way to resolve this with minimal changes to the existing code is to "rename" (move) the files to a temporary folder with their new names. Something like:
new_s = temp_path;
out << *start;
new_s += out.str();
new_s.append(file_format);
// Switch name on the file from old string to new string
rename(old_s.c_str(), new_s.c_str());
and when you are done renaming all the files in path (outside the while loop), delete the folder and "rename" (move) temp_path to path:
closedir (dir);
// The original folder is empty by now, so POSIX rmdir() can remove it
rmdir(path.c_str());
rename(temp_path.c_str(), path.c_str());
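An alternative to the temporary folder is to read all entry names first and rename only afterwards, so the directory stream is already closed when the first rename happens. The following is a minimal sketch of that idea, not the asker's code: snapshot_names and renumber are hypothetical names, and it assumes that no new name equals the name of a file that has not been renamed yet (otherwise rename() may overwrite that file or fail, depending on the platform, and the temporary folder is the safer route).

```cpp
#include <dirent.h>

#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Phase 1: read every entry name before any rename happens.
// Returns an empty vector if the directory cannot be opened.
std::vector<std::string> snapshot_names(const std::string& path) {
    std::vector<std::string> names;
    DIR* dir = opendir(path.c_str());
    if (dir == NULL) return names;
    struct dirent* ent;
    while ((ent = readdir(dir)) != NULL) {
        std::string name = ent->d_name;
        // Filter by name: "." and ".." are not guaranteed to be the first two entries.
        if (name != "." && name != "..") names.push_back(name);
    }
    closedir(dir);
    std::sort(names.begin(), names.end());  // readdir order is unspecified
    return names;
}

// Phase 2: rename from the snapshot, so freshly renamed files are never revisited.
// Assumes `path` ends with a separator, as in the question.
// Returns the number of successful renames.
int renumber(const std::string& path, int start) {
    const std::vector<std::string> names = snapshot_names(path);
    int renamed = 0;
    for (std::size_t i = 0; i < names.size(); ++i) {
        const std::string& name = names[i];
        std::string::size_type dot = name.rfind('.');
        std::string ext = (dot == std::string::npos) ? std::string() : name.substr(dot);
        std::string target = path + std::to_string(start + renamed) + ext;
        if (std::rename((path + name).c_str(), target.c_str()) == 0) ++renamed;
    }
    return renamed;
}
```

Filtering "." and ".." by name also removes the need for the two blind readdir() calls at the top of the original loop.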

Possible problems I see:
Renaming files causes them to be fed to your algorithm twice.
Your algorithm for computing the new filename is wrong.
You should be able to write a test for this easily, which in turn should help you fix the problem or write a more specific question. Other than that, I don't see any grave issues, but it would help if you reduced the scope of variables a bit, which would make sure that different iterations don't influence each other.
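To make the name computation testable, it helps to pull it out of the loop into a function that touches no files. A minimal sketch, assuming the intended result is path + number + original extension; numbered_name is a hypothetical helper, not something from the question:

```cpp
#include <string>

// Hypothetical helper: builds "<path><index><extension>" from the old file name.
// All variables are local, so one iteration cannot influence the next.
std::string numbered_name(const std::string& path, int index, const std::string& old_name) {
    // rfind returns npos when there is no '.', so that case is handled explicitly.
    std::string::size_type dot = old_name.rfind('.');
    std::string ext = (dot == std::string::npos) ? std::string() : old_name.substr(dot);
    return path + std::to_string(index) + ext;
}
```

With this in place, a few asserts on inputs such as "04.png" or a name without any extension can confirm or rule out the second possibility listed above.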

Related

C++: Reading from multiple files with spaces in their names

Essential to this problem is that I am programming in Xcode. I wrote a function that reads a given number of text files into my sensor vector. To get the text file paths, I also wrote a function that collects the file names, including their paths, and stores them in a list. The text files contain tab-delimited data that will be stored in Excel later on. The problem is that the file names include spaces, and the Mac seems to have a problem with spaces in file names. I tried to replace the spaces with "\ ", which is what the terminal does with a space when I echo a file name containing one. I can't open the files to read from them. I appreciate your help.
The path function:
void get_filelist(list<string>& list_in)
{
string full_path;
DIR *dir;
struct dirent *ent;
if((dir = opendir (dir_target.c_str()))!=NULL)
{
while((ent = readdir (dir)) != NULL)
{
if(strstr(ent->d_name, ".txt") && !strstr(ent->d_name, "Summary"))
{
full_path = dir_target;
full_path = full_path + ent->d_name;
list_in.push_back(full_path);
}
}
closedir(dir);
}
else{
printf("could not open directory");
perror("");
}
}
Now here is the function that writes into my 3D vector:
void fill_vector(list<string> list_in, data_vec& sensors)
{
ifstream myfile;
size_t found;
for(list<string>::iterator it = list_in.begin(); it!= list_in.end(); it++)
{
string tab = "";
vector<vector<string> > temp_matrix;
cout << *it << endl;
myfile.open(*it);
if(myfile.is_open())
{
vector<string> temp_row;
while(myfile.is_open())
{
getline(myfile, tab, '\t');
found = tab.find('\n');
if(found == string::npos) temp_row.push_back(tab);
else{
temp_row.push_back(tab.substr(0, found));
temp_matrix.push_back(temp_row);
temp_row.clear();
temp_row.push_back(tab.substr(found+1));
}
}
myfile.close();
}
else cout << "unable to open file" ;
sensors.push_back(temp_matrix);
}
}
Don't use a backslash to escape spaces in filenames, or you'll get into an even worse mess. The backslash before a space is shell syntax, not part of the filename, and on MS-DOS the backslash is the directory separator.
If
fopen("path/my file.txt", "w");
and
fopen("path/my file.txt", "r");
both work as expected (creating a file with a space in its name and opening it) you don't really have a problem. The rest of the system has the problem, but if you must have files with spaces in their names, you can read and write them.
Of course, convert the spaces to hyphens or underscores, or simply concatenate the words, as soon as possible; spaces in filenames make for endless problems.
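The same check can be done with the C++ stream classes the question uses. This is a minimal sketch under the assumption that the directory is writable; round_trips is a hypothetical name. The path is passed exactly as it is, with no escaping, because a space is an ordinary character to the file APIs.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes one line to `path` and reads it back; returns true if the text survives.
// The path is used unescaped.
bool round_trips(const std::string& path) {
    {
        std::ofstream out(path.c_str());
        if (!out) return false;
        out << "hello\n";
    }  // the stream is closed here, before the file is read back
    std::ifstream in(path.c_str());
    std::string line;
    bool ok = static_cast<bool>(std::getline(in, line)) && line == "hello";
    in.close();
    std::remove(path.c_str());  // clean up the scratch file
    return ok;
}
```

If this returns true for a name such as "my file.txt" but the real files still fail to open, the path being built is the more likely culprit than the spaces.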
The function
std::string spacesToUnderscores(std::string const &nasty)
is easy enough to write.
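One possible way to write it, as a sketch using std::replace from the standard library:

```cpp
#include <algorithm>
#include <string>

// Returns a copy of `nasty` with every space replaced by an underscore.
std::string spacesToUnderscores(std::string const& nasty) {
    std::string clean = nasty;
    std::replace(clean.begin(), clean.end(), ' ', '_');
    return clean;
}
```

Note that this only produces the new name; the file on disk still has to be renamed to it.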

Recursive listing files in C++ doesn't enter all subdirectories

!!!Solved!!!
Thank you guys for your help, it's all working now. I made changes to my code as suggested by @RSahu and got it to work.
Thanks for all your input, I've been really stuck with this.
To @Basile: I will definitely check that out, but for this particular piece of code I'm not going to use it because it looks way too complicated :) But thanks for the suggestion.
Original question
I'm trying to write C++ code to list all files in a given directory and its subdirectories.
Quick explanation
Idea is that function list_dirs(_dir, _files, _current_dir) will start in top directory and put files into vector _files and when it find a directory it will call itself on this directory. The _current_dir is there to be prepended to file name if in subdirectory because I need to know the path structure (it's supposed to generate sitemap.xml).
In list_dirs there is a call to list_dir which simply returns all files in current directory, not making difference between file and directory.
My problem
What the code does now is list all files in the original directory and then all files in one subdirectory, while skipping all other subdirectories. It lists those directories, but not the files in them.
And to be even more cryptic, it lists files only in this one specific directory and no other. I tried running it in multiple locations, but it never went into any other directory.
Thanks in advance and please note that I am beginner at C++ so don't be harsh ;)
LIST_DIR
int list_dir(const std::string& dir, std::vector<std::string>& files){
DIR *dp;
struct dirent *dirp;
unsigned fileCount = 0;
if ((dp = opendir(dir.c_str())) == NULL){
std::cout << "Error opening dir." << std::endl;
}
while ((dirp = readdir(dp)) != NULL){
files.push_back(std::string (dirp->d_name));
fileCount++;
}
closedir(dp);
return fileCount;
}
and LIST_DIRS
int list_dirs (const std::string& _dir, std::vector<std::string>& _files, std::string _current_dir){
std::vector<std::string> __files_or_dirs;
list_dir(_dir, __files_or_dirs);
std::vector<std::string>::iterator it = __files_or_dirs.begin();
struct stat sb;
while (it != __files_or_dirs.end()){
if (lstat((&*it)->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
/* how to do this better? */
if (*it == "." || *it == ".."){
__files_or_dirs.erase(it);
continue;
}
/* here it should go into sub-directory */
list_dirs(_dir + *it, _files, _current_dir + *it);
__files_or_dirs.erase(it);
} else {
if (_current_dir.empty()){
_files.push_back(*it);
} else {
_files.push_back(_current_dir + "/" + *it);
}
++it;
}
}
}
The main problem is in the line:
if (lstat((&*it)->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
You are using the name of a directory entry in the call to lstat. When the function is dealing with a sub-directory, the entry name does not represent a valid path. You need to use something like:
std::string entry = *it;
std::string full_path = _dir + "/" + entry;
if (lstat(full_path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
Suggestions for improvement
Update list_dir so that it doesn't include "." or ".." in the output. It makes sense to me to exclude those files to start with.
int list_dir(const std::string& dir, std::vector<std::string>& files){
DIR *dp;
struct dirent *dirp;
unsigned fileCount = 0;
if ((dp = opendir(dir.c_str())) == NULL){
std::cout << "Error opening dir." << std::endl;
return 0; // nothing to read; readdir must not be called on a null DIR*
}
while ((dirp = readdir(dp)) != NULL){
std::string entry = dirp->d_name;
if ( entry == "." or entry == ".." )
{
continue;
}
files.push_back(entry);
fileCount++;
}
closedir(dp);
return fileCount;
}
In list_dirs, there is no need to erase items from __files_or_dirs. The code can be simplified by using a for loop and dropping the erase calls.
It's not clear to me what the purpose of _current_dir is. Perhaps it can be removed.
Here's an updated version of the function. _current_dir is used only to construct the value of the argument in the recursive call.
int list_dirs (const std::string& _dir,
std::vector<std::string>& _files, std::string _current_dir){
std::vector<std::string> __files_or_dirs;
list_dir(_dir, __files_or_dirs);
std::vector<std::string>::iterator it = __files_or_dirs.begin();
struct stat sb;
for (; it != __files_or_dirs.end() ; ++it){
std::string entry = *it;
std::string full_path = _dir + "/" + entry;
if (lstat(full_path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
/* how to do this better? */
/* here it should go into sub-directory */
list_dirs(full_path, _files, _current_dir + "/" + entry);
} else {
_files.push_back(full_path);
}
}
return 0; // the function is declared int, so it must return a value
}
For this line:
if (lstat((&*it)->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
Note that readdir and consequently list_dir only return the file name, not the full file path. So at this point (&*it)->c_str() only has a file name (e.g. "input.txt"), not the full path, so when you call lstat on a file in a subdirectory, the system can't find it.
To fix this, you will need to add in the file path before calling lstat. Something like:
string fullFileName;
if (dir.empty()){
fullFileName = *it;
} else {
fullFileName = dir + "/" + *it;
}
if (lstat(fullFileName.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
You may have to use _current_dir instead of dir, depending on what they are actually for (I couldn't follow your explanation).
I am not sure of all the problems in your code, but I can tell you that this line, and the other one like it, are going to cause you problems:
__files_or_dirs.erase(it);
When you call erase, you invalidate iterators and references at or after the point of the erase, including the end() iterator (see this erase reference). You call erase without storing the returned iterator and then use it again afterwards, which is undefined behavior. You should at least change the line to the following, so that you capture the returned iterator, which points to the element just after the erased one (or is end() if the erased element was the last):
it = __files_or_dirs.erase(it);
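Put into a complete loop, the pattern looks roughly like this. It is a minimal sketch; drop_dot_entries is a hypothetical helper that only removes "." and "..", not the full list_dirs logic:

```cpp
#include <string>
#include <vector>

// Removes "." and ".." in place, always continuing from the iterator erase() returns.
void drop_dot_entries(std::vector<std::string>& entries) {
    std::vector<std::string>::iterator it = entries.begin();
    while (it != entries.end()) {
        if (*it == "." || *it == "..") {
            it = entries.erase(it);  // erase() returns the next valid iterator
        } else {
            ++it;  // only advance when nothing was erased
        }
    }
}
```

The key point is that the loop never increments after an erase, since the returned iterator already refers to the next element.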
It also appears from the code you posted that you have a redundancy between _dir and _current_dir. You do not modify either of them. You pass them in as the same value and they stay the same value throughout the function execution. Unless this is simplified code and you are doing something else, I would recommend you remove the _current_dir one and just stick with _dir. You can replace the line in the while loop with _dir where you are building the file name and you will have simplified your code which is always a good thing.
A simpler way on Linux is to use the nftw(3) function. It recursively scans the file tree and calls a handler function that you supply for each entry.

C++: Rename all files in a directory

Here is a list of what has been achieved and what I'm stuck on, to help with understanding what I am asking.
What I have achieved:
Open a user specified directory, display all files within this directory.
What I haven't yet achieved:
Rename all files within this directory automatically according to a predefined name - Files are currently named as random characters, I wish to automatically rename them to "August 1", "August 2", "August 3" etc. Files have different extensions though, and I wish the extensions to remain the same.
So this is how I am opening and displaying the directory:
void DirectorySelector::OpenDirectory(void)
{
// convert directory string to const char
DIRECTORY = directory.c_str();
pdir = opendir (DIRECTORY);
}
void DirectorySelector::DisplayDirectory(void)
{
// read directory
while (pent = readdir (pdir))
{
std::cout << pent->d_name << "\n";
}
}
And this is what I am stuck on, renaming the files (files have different extensions, not sure if this will cause problems later on?)
I get the following error as soon as the program hits the while loop:
Unhandled exception at 0x009657C1 in MultipleRename.exe: 0xC0000005: Access violation reading location 0xCCCCCDE0.
void DirectoryOperator::StandardRename(void)
{
i = 1;
while (pent = readdir (pdir))
{
oldname = pent->d_name;
newname = "August " + i;
OLDNAME = oldname.c_str();
NEWNAME = newname.c_str();
rename(OLDNAME, NEWNAME);
i++;
}
}
Note: All declarations handled elsewhere and have removed validation for simplicity, if you need the code I can post it.
Also I have already checked that the directory is still open in the DirectoryOperator class and I am using MSVS2012 on Windows.
Thanks in advance.
There is a problem with the line:
newname = "August " + i;
"August " is a string literal, which decays to a const char*, so i is added to the pointer before the result is converted into a std::string.
So, when i == 1, your string will be "ugust ", and when it is 2, it will be "gust ". Once i reaches 8, the pointer is past the terminating null character, and building a std::string from it is undefined behavior.
Solutions:
newname = "August " + std::to_string(i); // c++11
or
#include <sstream>
...
std::stringstream ss;
ss << "August " << i;
newname = ss.str();
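The sketch below shows both halves: what the pointer arithmetic actually produces, and one way to build the new name while keeping the old extension, which the question asks for. literal_plus and august_name are hypothetical helpers, and std::to_string assumes C++11.

```cpp
#include <string>

// Shows what the original expression computes: the literal's address plus i.
// Only valid for 0 <= i <= 7; beyond that the pointer leaves the literal.
std::string literal_plus(int i) {
    const char* literal = "August ";
    return std::string(literal + i);
}

// Hypothetical helper: "August <n>" followed by the old file's extension, if any.
std::string august_name(int n, const std::string& old_name) {
    std::string::size_type dot = old_name.rfind('.');
    std::string ext = (dot == std::string::npos) ? std::string() : old_name.substr(dot);
    return "August " + std::to_string(n) + ext;
}
```

Keep in mind that rename() resolves plain file names against the current working directory, so unless the program runs inside the target directory, both the old and the new name need the directory path in front.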
"I get the following error as soon as the program hits the while loop:"
Unhandled exception at 0x009657C1 in MultipleRename.exe: 0xC0000005: Access violation reading location 0xCCCCCDE0.
Most probably pdir isn't correctly initialized when the code
while (pent = readdir (pdir))
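is called. The exception code 0xC0000005 is an access violation, and the faulting address 0xCCCCCDE0 is close to 0xCCCCCCCC, the pattern MSVC debug builds use to fill uninitialized stack memory, so you are most likely dereferencing an uninitialized pointer somewhere.
Are you sure that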
pdir = opendir (DIRECTORY);
was called in sequence as intended, and the result was valid (pdir != nullptr)?
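A defensive version of the open-then-read sequence might look like the sketch below. count_entries is a hypothetical function, and on Windows it assumes a dirent.h port is available, as the question's code implies.

```cpp
#include <dirent.h>

#include <cerrno>
#include <cstring>
#include <iostream>
#include <string>

// Counts the entries in `directory`, or returns -1 if it cannot be opened.
int count_entries(const std::string& directory) {
    DIR* pdir = opendir(directory.c_str());
    if (pdir == NULL) {
        // Report the reason instead of passing an invalid pointer to readdir().
        std::cerr << "opendir(" << directory << ") failed: " << std::strerror(errno) << "\n";
        return -1;
    }
    int count = 0;
    while (readdir(pdir) != NULL) ++count;
    closedir(pdir);
    return count;
}
```

Keeping the DIR* local to one function, rather than sharing it between classes, also rules out reading from a handle that another object never opened.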

list top 10 files by size in a unix directory

I am trying to read a Unix directory (including all subdirectories) using C++ and list the top 10 largest files.
I have read that I can use #include <dirent.h> and struct dirent, but I am having trouble passing the directory name as a variable to opendir/readdir.
Basically it doesn't recognise it and says file/directory not found.
Please can you help me with how I can do this in c++ and print out the top 10 largest files in the directory? Thanks
DIR *dir;
struct dirent *ent;
dir = opendir ("homedir");
if (dir != NULL) {
while ((ent = readdir (dir)) != NULL) {
cout << ent->d_name <<endl;
}
closedir (dir);
} else {
cout << "Can't open directory" << endl;
}
You don't really give enough details, but when you are reading
recursively, are you appending the names you read to the path of
the directory they came from? Reading a directory doesn't change the
current directory, so your function should look more or less like:
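std::vector<FileInfo>
readDirectoriesRecursively( std::string const& path )
{
std::vector<FileInfo> results;
// pseudocode: the loop and the directory test stand in for readdir and stat
for each filename in path
if is directory {
std::vector<FileInfo> sub = readDirectoriesRecursively( path + '/' + filename );
results.insert( results.end(), sub.begin(), sub.end() );
}
else
results.push_back( FileInfo( path + '/' + filename ) );
return results;
}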
In the constructor of FileInfo, use stat to obtain the size. Once you have the results, sort by size, and output the first 10.
You're almost there. You have all the filenames. With these, you can do a stat to obtain the filesize for each file. When you sort the filesizes descending, you have the ten largest files.
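struct stat buf;
// d_name is only the entry name, so build the full path before calling stat
std::string full_path = std::string("homedir") + "/" + ent->d_name;
stat(full_path.c_str(), &buf);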
See the detailed example in the man page.
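Combining the two answers, a complete version could look like the following sketch. FileInfo, collect_files and largest_files are names chosen for this example; the code uses lstat so that symbolic links are not followed, and it silently skips directories it cannot open.

```cpp
#include <dirent.h>
#include <sys/stat.h>

#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct FileInfo {
    std::string path;
    off_t size;
};

// Appends every regular file under `dir` (recursively) to `out`.
void collect_files(const std::string& dir, std::vector<FileInfo>& out) {
    DIR* d = opendir(dir.c_str());
    if (d == NULL) return;  // unreadable directories are skipped
    struct dirent* ent;
    while ((ent = readdir(d)) != NULL) {
        std::string name = ent->d_name;
        if (name == "." || name == "..") continue;
        std::string full = dir + "/" + name;  // stat needs the full path, not just d_name
        struct stat sb;
        if (lstat(full.c_str(), &sb) != 0) continue;
        if (S_ISDIR(sb.st_mode)) {
            collect_files(full, out);
        } else if (S_ISREG(sb.st_mode)) {
            FileInfo info = {full, sb.st_size};
            out.push_back(info);
        }
    }
    closedir(d);
}

static bool larger_first(const FileInfo& a, const FileInfo& b) { return a.size > b.size; }

// Returns up to `n` files under `root`, largest first.
std::vector<FileInfo> largest_files(const std::string& root, std::size_t n) {
    std::vector<FileInfo> files;
    collect_files(root, files);
    std::sort(files.begin(), files.end(), larger_first);
    if (files.size() > n) files.resize(n);
    return files;
}
```

Calling largest_files(dir_name, 10) with the directory name held in a std::string variable and printing path and size for each element would give the list the question asks for.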

directory_iterator file_iter to rename files in a folder

I want to rename the files in a directory. There are 52 folders in the directory. Each folder has a different name and contains around 40 files. I want to extract the name of each folder and attach it to the names of the files in that folder.
It worked fine when there were 31 or fewer files in each folder. But whenever the number of files in a folder was above 31, the rename algorithm I wrote failed. I am not able to figure out why it crashes when there are more files. Do enlighten me if you understand why!
I'm attaching the code:
int main( int argc, char** argv ){
directory_iterator end_iter;
directory_iterator file_itr;
string inputName;
string checkName;
inputName.assign(argv[1]);
if (is_directory(inputName))
{
for (directory_iterator dir_itr(inputName); dir_itr != end_iter; ++dir_itr)
{
if (is_directory(*dir_itr))
{
for (directory_iterator file_itr(*dir_itr); file_itr != end_iter; ++file_itr)
{
string folderName(dir_itr->path().filename().string());
if (is_regular_file(*file_itr))
{
std::string fileType = file_itr->path().extension().string();
std::transform(fileType.begin(), fileType.end(), fileType.begin(), (int(*)(int))std::toupper);
if (fileType == ".JPG" || fileType == ".JPEG" || fileType == ".JPG" || fileType == ".PGM")
{
string filename(file_itr->path().string());
string pathName(file_itr->path().parent_path().string());
string oldName(file_itr->path().filename().string());
cout << folderName << endl;
folderName += "_";
folderName += oldName;
string newPathName = pathName + "\\" + folderName;
cout << pathName <<"\\"<< folderName << endl;
//RENAMING function
rename(file_itr->path(), path(newPathName.c_str()));
}
}
}
}
}
}
}
It's likely that Boost's directory_iterator implementation is getting confused by you renaming files that are in the directory listing.
From the docs:
Warning: If a file or sub-directory is removed from or added to a directory after the construction of a directory_iterator for the directory, it is unspecified whether or not subsequent incrementing of the iterator will ever result in an iterator whose value is the removed or added directory entry.
I recommend trying it in two phases. In the first phase, use the code you have now to build a vector<pair<string, string> > instead of renaming the file. Then, once you've scanned the directory, it should just be a matter of iterating through the list performing the actual renames.
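The two phases could be separated roughly as in this sketch. To keep it free of Boost, the scan itself is left out: plan_renames takes the file names that the directory_iterator loop would have collected, which is an assumption of this example, as are the names RenamePlan, plan_renames and apply_renames and the use of "/" as separator.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > RenamePlan;

// Phase 1: decide every rename up front; nothing on disk changes here.
// `file_names` stands for the names gathered while iterating over `dir`.
RenamePlan plan_renames(const std::string& dir, const std::string& folder_name,
                        const std::vector<std::string>& file_names) {
    RenamePlan plan;
    for (std::size_t i = 0; i < file_names.size(); ++i) {
        std::string from = dir + "/" + file_names[i];
        std::string to = dir + "/" + folder_name + "_" + file_names[i];
        plan.push_back(std::make_pair(from, to));
    }
    return plan;
}

// Phase 2: perform the renames once iteration has finished.
// Returns how many renames succeeded.
int apply_renames(const RenamePlan& plan) {
    int done = 0;
    for (std::size_t i = 0; i < plan.size(); ++i) {
        if (std::rename(plan[i].first.c_str(), plan[i].second.c_str()) == 0) ++done;
    }
    return done;
}
```

Because the plan is plain data, it can be printed and checked before anything is renamed, which also makes it easy to see whether a file would be prefixed twice.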