C++ multi-threaded directory scan code

I am looking for a way to write multi-threaded C++ code that scans a directory and returns a list of all files underneath it. I have written a single-threaded version, shown below.
#include <sys/types.h>
#include <dirent.h>
#include <errno.h>
#include <vector>
#include <string>
#include <iostream>
#include <sys/stat.h> /* for stat() */
#include <cstring> /* for strcmp() */
using namespace std;
int isDir(string path);
/*function... might want it in some class?*/
int getdir (string dir, vector<string> &dirlist, vector<string> &fileList)
{
DIR *dp;
struct dirent *dirp, *dirFp ;
if((dp = opendir(dir.c_str())) == NULL) {
int err = errno; /* save errno before the stream output can overwrite it */
cout << "Error(" << err << ") opening " << dir << endl;
return err;
}
while ((dirp = readdir(dp)) != NULL) {
if (strcmp (dirp->d_name, ".") != 0 && strcmp(dirp->d_name, "..") != 0) {
//dirlist.push_back(string(dirp->d_name));
string Tmp = dir.c_str()+ string("/") + string(dirp->d_name);
if(isDir(Tmp)) {
//if(isDir(string(dir.c_str() + dirp->d_name))) {
dirlist.push_back(Tmp);
getdir(Tmp,dirlist,fileList);
} else {
// cout << "Files :"<<dirp->d_name << endl;
fileList.push_back(string(Tmp));
}
}
}
closedir(dp);
return 0;
}
int isDir(string path)
{
struct stat stat_buf;
/* a path that cannot be stat()ed is treated as "not a directory" */
if (stat( path.c_str(), &stat_buf) != 0)
return 0;
return S_ISDIR( stat_buf.st_mode) ? 1 : 0;
}
int main()
{
string dir = string("/test1/mfs");
vector<string> dirlist = vector<string>();
vector<string> fileList = vector<string>();
getdir(dir,dirlist,fileList);
#if 0
for (unsigned int i = 0;i < dirlist.size();i++) {
cout << "Dir LIst" <<dirlist[i] << endl;
//string dirF = dir + "/" + dirlist[i];
//getdir(dirF,fileList);
}
#endif
for (unsigned int i = 0; i < fileList.size(); i++)
cout << "Files :"<<fileList[i]<< endl;
return 0;
}
The problem is that this is single-threaded, and I need to scan about 8000 directories that may contain files. I am not sure how to split the work, because the number of directories varies: it is determined by an N-dimensional matrix.
Any help would be appreciated. Thanks in advance.

boost::filesystem provides directory_iterator and recursive_directory_iterator. The former lists the contents of a single directory without descending into subdirectories; the latter also recurses into subdirectories.
With regard to thread safety, you could lock a mutex while copying the results into a std::vector (or two vectors, one for files and one for directories), so that you at least have a local snapshot.
Actually "freezing" the file system at that point, so that no other process can modify it, is not something you can normally do. You could try setting the file attributes to read-only and changing them back later, but you need permission to do that.
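As a rough illustration of the mutex-and-copy idea, here is a minimal sketch. It swaps in C++17 std::filesystem for boost::filesystem (the iterators have the same names) and uses a fixed number of std::thread workers. ScanResult, scanSubtree and scanParallel are hypothetical names, and splitting the work by top-level subdirectory is an assumption that only pays off when the tree is reasonably balanced.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <mutex>
#include <string>
#include <system_error>
#include <thread>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical result type: one list for directories, one for everything else.
struct ScanResult {
    std::vector<std::string> dirs;
    std::vector<std::string> files;
};

// Walk one subtree sequentially, appending to a result owned by the caller.
// Unreadable directories are skipped; any other iteration error ends the walk.
void scanSubtree(const fs::path& root, ScanResult& out) {
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    while (!ec && it != end) {
        std::error_code typeEc;
        if (it->is_directory(typeEc)) out.dirs.push_back(it->path().string());
        else out.files.push_back(it->path().string());
        it.increment(ec);
    }
}

// Assumption: work is split by top-level subdirectory of `root`.
ScanResult scanParallel(const fs::path& root, unsigned numThreads) {
    assert(numThreads > 0);
    ScanResult total;
    std::vector<fs::path> subtrees;

    // List the top level on the calling thread.
    std::error_code ec;
    fs::directory_iterator it(root, ec);
    const fs::directory_iterator end;
    while (!ec && it != end) {
        std::error_code typeEc;
        if (it->is_directory(typeEc) && !it->is_symlink(typeEc)) {
            total.dirs.push_back(it->path().string());
            subtrees.push_back(it->path());
        } else {
            total.files.push_back(it->path().string());
        }
        it.increment(ec);
    }

    std::atomic<std::size_t> next{0};  // index of the next unclaimed subtree
    std::mutex totalMutex;             // guards `total` during the merge

    auto worker = [&]() {
        ScanResult local;  // filled without any locking
        for (std::size_t i = next.fetch_add(1); i < subtrees.size();
             i = next.fetch_add(1)) {
            scanSubtree(subtrees[i], local);
        }
        // Lock once per worker, only to copy the local snapshot.
        std::lock_guard<std::mutex> lock(totalMutex);
        total.dirs.insert(total.dirs.end(), local.dirs.begin(), local.dirs.end());
        total.files.insert(total.files.end(), local.files.begin(), local.files.end());
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < numThreads; ++t) pool.emplace_back(worker);
    for (std::thread& th : pool) th.join();
    return total;
}
```

A call such as scanParallel("/test1/mfs", 8) would return both lists in unspecified order. Directory scanning is usually limited by the disk rather than the CPU, so it is worth measuring before assuming that more threads help. Some toolchains need -pthread when linking.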

Related

issue while creating binary files

I've written this code, which takes a directory and looks for the files in it. The aim is to create a binary file for each file found, so that I can write data into it later. However, the code does not run as expected: the binary files are not created.
The directory contains two images, and the output I get is as follows:
Creating bin files
C:\repo\1.bin
Error: failed to create file
Press <RETURN> to close this window...
I really do not know what I am missing. Any advice would be welcome.
#include <vector>
#include <string>
#include <iostream> // for standard I/O
#include <string> // for strings
#include <iomanip> // for controlling float print precision
#include <sstream> // string to number conversion
#include <fstream>
using namespace std;
void getDir(string d, vector<string> & f)
{
FILE* pipe = NULL;
string pCmd = "dir /B /S " + string(d);
char buf[256];
if( NULL == (pipe = _popen(pCmd.c_str(),"rt")))
{
cout<<"Error"<<endl;
return;
}
while (!feof(pipe))
{
if(fgets(buf,256,pipe) != NULL)
{
f.push_back(string(buf));
}
}
_pclose(pipe);
}
void replaceExt(string& s, const string& newExt) {
string::size_type i = s.rfind('.', s.length());
if (i != string::npos) {
s.replace(i+1, newExt.length(), newExt);
}
}
using namespace std;
int main(int argc, char* argv[])
{
vector<string> files;
string path = "C:\\repo";
getDir(path, files);
vector<string>::const_iterator it = files.begin();
cout<<"Creating bin files "<<endl;
ofstream myOfstream;
while( it != files.end())
{
string fileName = (string) *it;
replaceExt(fileName, "bin");
cout << fileName << '\n';
std::stringstream ss;
ss << fileName << "" ;
myOfstream.open(ss.str(), fstream::binary);
if ( !myOfstream )
{
std::cerr << "Error: failed to create file " << '\n';
break;
}
myOfstream.close();
it++;
}
return 0;
}
First, if the directory you are looking for doesn't exist or is empty, the program hangs; it would be worth fixing that before building anything bigger on it.
Then, for your case, I don't see the point of the stringstream, so I replaced it with a plain string and removed the trailing \n character that comes from reading the file names:
cout << fileName << '\n';
string ss = fileName.substr(0, fileName.size() - 1);
myOfstream.open(ss.c_str(), fstream::binary);
if (!myOfstream)
{
hope it helps
I found the issue after debugging ;D
The problem is the newline: the string fileName has a "\n" at the end, and that is what causes your error. You have to erase it. I used this statement: fileName.erase(std::remove(fileName.begin(), fileName.end(), '\n'), fileName.end());
and included the <algorithm> header.
The working code is as follows:
#include <vector>
#include <string>
#include <iostream> // for standard I/O
#include <cstdio> // for FILE, _popen, fgets
#include <iomanip> // for controlling float print precision
#include <sstream> // string to number conversion
#include <fstream>
#include <algorithm>
using namespace std;
void getDir(string d, vector<string> & f)
{
FILE* pipe = NULL;
string pCmd = "dir /B /S " + string(d);
char buf[256];
if( NULL == (pipe = _popen(pCmd.c_str(),"rt")))
{
cout<<"Error"<<endl;
return;
}
while (!feof(pipe))
{
if(fgets(buf,256,pipe) != NULL)
{
f.push_back(string(buf));
}
}
_pclose(pipe);
}
void replaceExt(string& s, const string& newExt) {
string::size_type i = s.rfind('.', s.length());
if (i != string::npos) {
s.replace(i+1, newExt.length(), newExt);
}
}
using namespace std;
int main(int argc, char* argv[])
{
vector<string> files;
string path = "C:\\repo";
getDir(path, files);
vector<string>::const_iterator it = files.begin();
cout<<"Creating bin files "<<endl;
ofstream myOfstream;
while( it != files.end())
{
string fileName = (string) *it;
replaceExt(fileName, "bin");
cout << fileName << '\n';
fileName.erase(std::remove(fileName.begin(), fileName.end(), '\n'), fileName.end());
std::stringstream ss;
ss << fileName << "" ;
myOfstream.open(ss.str(), fstream::binary);
if ( !myOfstream )
{
std::cerr << "Error: failed to create file " << '\n';
break;
}
myOfstream.close();
it++;
}
return 0;
}
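The newline fix can also be isolated into a small helper, which makes it easy to apply to every line read with fgets. This is only a sketch: stripLineEnding is a hypothetical name, and stripping '\r' as well is an assumption about what a text-mode pipe may leave behind.

```cpp
#include <string>

// Remove trailing '\n' and '\r' characters, such as those left by fgets().
std::string stripLineEnding(std::string line) {
    const std::string::size_type last = line.find_last_not_of("\r\n");
    if (last == std::string::npos) {
        line.clear();          // the line held nothing but line endings
    } else {
        line.erase(last + 1);  // drop everything after the last real character
    }
    return line;
}
```

In getDir, the push could then become f.push_back(stripLineEnding(buf)); so that main never sees the newline at all.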

Copy directory content

I want to copy the contents of a directory (tmp1) to another directory (tmp2). tmp1 may contain files and other directories. I want to copy the contents of tmp1 (including the mode) using C/C++. If tmp1 contains a tree of directories, I want to copy them recursively.
What is the simplest solution?
I found a solution that opens the directory, reads every entry, and copies it with the cp command. Are there any simpler solutions?
I recommend using std::filesystem (part of ISO C++ since C++17).
Copied from http://en.cppreference.com/w/cpp/filesystem/copy:
std::filesystem::copy("/dir1", "/dir3", std::filesystem::copy_options::recursive);
Read more about it:
https://gcc.gnu.org/onlinedocs/gcc-6.1.0/libstdc++/api/a01832.html
experimental::filesystem linker error
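For a slightly fuller picture, here is a minimal sketch that wraps the same call with error reporting through std::error_code instead of exceptions. copyTree is a hypothetical name, and adding overwrite_existing is an assumption about the desired behavior when the destination already holds files.

```cpp
#include <filesystem>
#include <iostream>
#include <system_error>

namespace fs = std::filesystem;

// Copy the contents of `from` into `to`, recursing into subdirectories.
// Returns true on success; on failure prints the reason and returns false.
bool copyTree(const fs::path& from, const fs::path& to) {
    std::error_code ec;
    fs::copy(from, to,
             fs::copy_options::recursive | fs::copy_options::overwrite_existing,
             ec);
    if (ec) {
        std::cerr << "copy " << from << " -> " << to
                  << " failed: " << ec.message() << '\n';
        return false;
    }
    return true;
}
```

A call such as copyTree("tmp1", "tmp2") creates tmp2 if it does not exist. How much of the file mode is carried over depends on the platform and the standard library implementation, so it is worth checking the result if permissions matter.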
I recently had the same need, so I wrote the following code to solve the problem. I hope it helps other people in the same situation.
#include <iostream>
#include <dirent.h>
#include <string.h>
#include <sys/stat.h>
#include <windows.h>
using namespace std;
bool is_dir(const char* path);
void copyFile_(string inDir, string outDir);
void copyDir_(const char *inputDir, string outDir);
int main()
{
string srcDir = "C:\\testDirectory";
string destDir = "C:\\destDir";
copyDir_(srcDir.c_str(), destDir);
return 0;
}
void copyDir_(const char *inputDir, string outDir)
{
DIR *pDIR;
struct dirent *entry;
string tmpStr, tmpStrPath, outStrPath, inputDir_str = inputDir;
if (is_dir(inputDir) == false)
{
cout << "This is not a folder " << endl;
return;
}
if( pDIR = opendir(inputDir_str.c_str()) )
{
while(entry = readdir(pDIR)) // get folders and files names
{
tmpStr = entry->d_name;
if( strcmp(entry->d_name, ".") != 0 && strcmp(entry->d_name, "..") != 0 )
{
tmpStrPath = inputDir_str;
tmpStrPath.append( "\\" );
tmpStrPath.append( tmpStr );
cout << entry->d_name;
if (is_dir(tmpStrPath.c_str()))
{
cout << "--> It's a folder" << "\n";
// Create Folder on the destination path
outStrPath = outDir;
outStrPath.append( "\\" );
outStrPath.append( tmpStr );
mkdir(outStrPath.c_str());
copyDir_(tmpStrPath.c_str(), outStrPath);
}
else
{
cout << "--> It's a file" << "\n";
// copy file on the destination path
outStrPath = outDir;
outStrPath.append( "\\" );
outStrPath.append( tmpStr );
copyFile_(tmpStrPath.c_str(), outStrPath.c_str());
}
}
}
closedir(pDIR);
}
}
bool is_dir(const char* path)
{
struct stat buf;
// a path that cannot be stat()ed is treated as "not a directory"
if (stat(path, &buf) != 0)
return false;
return S_ISDIR(buf.st_mode);
}
void copyFile_(string inDir, string outDir)
{
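// third argument TRUE: fail instead of overwriting an existing file
if (!CopyFile(inDir.c_str(), outDir.c_str(), TRUE))
cout << "CopyFile failed with error " << GetLastError() << endl;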
}

reading files in a directory C++

I'm reading the files in a directory and passing each one to a function. I think I'm doing something wrong, but I can't figure out what.
Here is my code. It first reads the files in a folder and then sends them to a function for further processing.
#include <dirent.h>
#include <stdio.h>
#include <vector>
#include <string>
#include <iostream>
using namespace std;
std::vector<std::string> fileName;
int main(void)
{
DIR *d;
struct dirent *dir;
vector<string> fileList;
int i=0;
d = opendir("files");
if (d)
{
while ((dir = readdir(d)) != NULL)
{
i++;
fileList.push_back(dir->d_name);
}
for(int i=0;i<fileList.size();i++) {
cout<<fileList[i]<<endl;
doSomething(fileList[i]);
}
closedir(d);
}
return(0);
}
int doSomething(fileName) {
//do something
}
Error
main.cpp: In function ‘int main()’:
main.cpp:29:28: error: ‘doSomething’ was not declared in this scope
doSomething(fileList[i]);
^
main.cpp: At global scope:
main.cpp:37:26: error: cannot convert ‘std::vector<std::basic_string<char> >’ to ‘int’ in initialization
int doSomething(fileName) {
^
main.cpp:37:28: error: expected ‘,’ or ‘;’ before ‘{’ token
int doSomething(fileName) {
^
Since your doSomething function is defined after main, it is not visible at the point of the call, which causes the first error. The correct way is to at least declare the function first:
int doSomething(); //declaration
int main()
{
doSomething(); //now the function is declared
}
//definition
int doSomething()
{
}
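The second and third errors are emitted because you didn't include the type of the fileName parameter in your function definition. Since a global variable named fileName already exists, the compiler reads int doSomething(fileName) as the definition of an int variable initialized from that vector, hence "cannot convert". Based on your code, the parameter should be a string: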
int doSomething(string fileName)
{
}
I also noticed that, while this function returns int, you are not using its return value. Nevertheless, don't forget to return something from doSomething; otherwise the behavior is undefined.
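Putting both fixes together, a corrected version of the program might look like the sketch below. listDirectory and processDirectory are hypothetical helpers introduced only to keep the example small, and skipping "." and ".." is an assumption about what the original intended.

```cpp
#include <dirent.h>
#include <iostream>
#include <string>
#include <vector>

// Declared before its first use, so the call in processDirectory() compiles.
int doSomething(const std::string& fileName);

// Collect the entry names in `path`, skipping "." and "..".
// Returns an empty vector if the directory cannot be opened.
std::vector<std::string> listDirectory(const std::string& path) {
    std::vector<std::string> names;
    DIR* d = opendir(path.c_str());
    if (!d) return names;
    while (dirent* entry = readdir(d)) {
        const std::string name = entry->d_name;
        if (name != "." && name != "..") names.push_back(name);
    }
    closedir(d);
    return names;
}

// Call doSomething() on every entry and add up the return values.
int processDirectory(const std::string& path) {
    int processed = 0;
    for (const std::string& name : listDirectory(path)) {
        processed += doSomething(name);
    }
    return processed;
}

// Definition: the parameter now has a type, and every path returns a value.
int doSomething(const std::string& fileName) {
    std::cout << fileName << '\n';
    return 1;
}
```

With this layout, main only needs to call processDirectory("files").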
Yes, Boost is great, but it's a bit bloated. So, just for completeness, here is a version applied to reading images in a directory for OpenCV:
// you need these includes for the function
//#include <windows.h> // for windows systems
#include <dirent.h> // for linux systems
#include <sys/stat.h> // for linux systems
#include <algorithm> // std::sort
#include <opencv2/opencv.hpp>
#include <iostream> //cout
#include <string>
#include <vector>
using namespace std;
/* Returns a list of files in a directory (except the ones that begin with a dot) */
int readFilenames(std::vector<string> &filenames, const string &directory)
{
#ifdef WINDOWS
HANDLE dir;
WIN32_FIND_DATA file_data;
if ((dir = FindFirstFile((directory + "/*").c_str(), &file_data)) == INVALID_HANDLE_VALUE)
return 0; /* No files found */
do {
const string file_name = file_data.cFileName;
const string full_file_name = directory + "/" + file_name;
const bool is_directory = (file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
if (file_name[0] == '.')
continue;
if (is_directory)
continue;
filenames.push_back(file_name); // just the filename, matching the non-Windows branch
} while (FindNextFile(dir, &file_data));
FindClose(dir);
#else
DIR *dir;
struct dirent *ent;
struct stat st;
dir = opendir(directory.c_str());
if (dir == NULL)
return 0; /* directory could not be opened */
while ((ent = readdir(dir)) != NULL) {
const string file_name = ent->d_name;
const string full_file_name = directory + "/" + file_name;
if (file_name[0] == '.')
continue;
if (stat(full_file_name.c_str(), &st) == -1)
continue;
const bool is_directory = (st.st_mode & S_IFDIR) != 0;
if (is_directory)
continue;
// filenames.push_back(full_file_name); // returns full path
filenames.push_back(file_name); // returns just filename
}
closedir(dir);
#endif
std::sort (filenames.begin(), filenames.end()); //optional, sort the filenames
return(filenames.size()); //Return how many we found
} // GetFilesInDirectory
void help(const char **argv) {
cout << "\n\n"
<< "Call:\n" << argv[0] << " <directory path>\n\n"
<< "Given a directory of images, create a vector of\n"
<< "their names, read and display them. Filter out\n"
<< "non-images\n"
<< endl;
}
int main( int argc, const char** argv )
{
if(argc != 2) {
cerr << "\nIncorrect number of parameters: " << argc << ", should be 2\n" << endl;
help(argv);
return -1;
}
string folder = argv[1];
cout << "Reading in directory " << folder << endl;
vector<string> filenames;
int num_files = readFilenames(filenames, folder);
cout << "Number of files = " << num_files << endl;
cv::namedWindow( "image", 1 );
for(size_t i = 0; i < filenames.size(); ++i)
{
const string path = folder + "/" + filenames[i]; // the list holds bare names, so join with a separator
cout << path << " #" << i << endl;
cv::Mat src = cv::imread(path);
if(!src.data) { //Protect against no file
cerr << path << ", file #" << i << ", is not an image" << endl;
continue;
}
cv::imshow("image", src);
cv::waitKey(250); //For fun, wait 250ms, or a quarter of a second, but you can put in "0" for no wait or -1 to wait for keypresses
/* do whatever you want with your images here */
}
}

POSIX Program to search entire file system for a file

Hey everyone. I need to write a POSIX program that searches an entire file system for a specified file, starting at the top directory. My code is far from done, but when I run it and check whether a particular file is a directory, it reports that a file which is not a directory at all is one, tries to move into it, and fails with an error. I'm not sure how to tell it that this type of file isn't a directory.
Here's my code. I know it's not perfect and I could probably do some things differently in the way of getting the directory names and passing them into the function. Either way, I'm pretty sure I have to do this recursively.
The file in question is /dev/dri/card0 and I'm running this from a Debian virtual machine.
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <time.h>
#include <stdint.h>
#include <locale.h>
#include <langinfo.h>
#include <fcntl.h>
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;
void SearchDirectory(string file_Name, string directory){
string new_Directory = directory;
DIR *dirp;
dirp = opendir(directory.c_str());
struct dirent *dptr;
struct stat statStruct;
while(dptr = readdir(dirp)){
stat(dptr->d_name, &statStruct);
if( S_ISDIR(statStruct.st_mode) ){
string check = dptr->d_name;
if ( check.compare(".") == 0 || check.compare("..") == 0 ){
continue;
}
else{
cout << dptr->d_name << " is is a directory" << endl;
new_Directory.append("/");
new_Directory.append(dptr->d_name);
SearchDirectory(file_Name, new_Directory);
}
}
else if( S_ISREG(statStruct.st_mode)){
string check = dptr->d_name;
if( check.compare(file_Name) == 0){
cout << "Found " << file_Name << " in " << directory << "/" << endl;
}
}
}
}
int main(int argc, char *argv[]){
if(argc < 2 || argc > 2){
cerr << "This program will find the specified file." << endl;
cerr << "Usage: mysearch <filename>" << endl;
return 1;
}
string file_Name = argv[1];
SearchDirectory(file_Name, "/");
return 0;
}
POSIX.2 requires a working "find" command.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv)
{
if (argc != 2) {
fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
return EXIT_FAILURE;
}
execlp("find", "find", "/", "-name", argv[1], "-print", (char *)NULL);
perror("execlp"); /* only reached if the exec failed */
exit(EXIT_FAILURE);
}
d_name holds just the name of the file, not the path to it. You need to stat the full path (the not yet constructed new_Directory) instead of dptr->d_name.
You also have a problem if a directory contains more than one subdirectory: new_Directory keeps growing, so it is incorrect for every subdirectory after the first.
You never closedir your directory handle, so you run out of resources. You should also consider loading the entire directory into an array before recursing, to avoid running out of handles.
void SearchDirectory(string directory, string target_File_Name){
DIR *dirp = opendir(directory.c_str());
if (!dirp) {
perror(("opendir " + directory).c_str());
return;
}
struct dirent *dptr;
while(dptr = readdir(dirp)){
string file_Name = dptr->d_name;
string file_Path = directory + "/" + file_Name;
struct stat statStruct;
if (stat(file_Path.c_str(), &statStruct) != 0)
continue; // entry vanished or cannot be examined
if( S_ISDIR(statStruct.st_mode) ){
if ( file_Name.compare(".") == 0 || file_Name.compare("..") == 0 ){
continue;
}
SearchDirectory(file_Path, target_File_Name);
}
else if( S_ISREG(statStruct.st_mode)){
if( file_Name.compare(target_File_Name) == 0){
cout << file_Path << endl;
}
}
}
closedir(dirp);
}
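The suggestion to load the directory before recursing can be sketched as follows. This variant is an illustration only: searchDirectory with a results vector is a hypothetical signature, and using lstat (so that symbolic links are never followed) is an assumption that goes beyond the code above.

```cpp
#include <dirent.h>
#include <sys/stat.h>
#include <cstdio>
#include <string>
#include <vector>

// Appends to `matches` the path of every regular file named `target`
// found under `directory`.
void searchDirectory(const std::string& directory, const std::string& target,
                     std::vector<std::string>& matches) {
    DIR* dirp = opendir(directory.c_str());
    if (!dirp) {
        std::perror(("opendir " + directory).c_str());
        return;
    }

    // First pass: remember the subdirectories instead of recursing right away.
    std::vector<std::string> subdirs;
    while (dirent* entry = readdir(dirp)) {
        const std::string name = entry->d_name;
        if (name == "." || name == "..") continue;
        const std::string path = directory + "/" + name;
        struct stat st;
        if (lstat(path.c_str(), &st) != 0) continue;  // lstat: links are not followed
        if (S_ISDIR(st.st_mode)) {
            subdirs.push_back(path);
        } else if (S_ISREG(st.st_mode) && name == target) {
            matches.push_back(path);
        }
    }
    closedir(dirp);  // the handle is released before any recursion starts

    // Second pass: at most one directory handle is open at any time.
    for (const std::string& sub : subdirs) {
        searchDirectory(sub, target, matches);
    }
}
```

Because the handle is closed before descending, the depth of the tree no longer determines how many descriptors are held open.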
Not for the benefit of the OP, who writes "The point is to come up with a way to do it myself," but rather for the benefit of posterity, here is a way to use Boost.Filesystem:
#include <boost/filesystem.hpp>
#include <iostream>
#include <string>
namespace fs = boost::filesystem;
// sample usage: find_file("/home", ".profile");
void find_file( const fs::path& dirPath, const std::string& fileName) {
fs::recursive_directory_iterator end;
for(fs::recursive_directory_iterator it(dirPath); it != end; ++it) {
if(it->path().filename() == fileName)
std::cout << it->path() << "\n";
if(fs::is_symlink(it->symlink_status()))
it.no_push();
}
}
Use fork, execv and the Unix implemented /usr/bin/find process and redirect its output for your result area?
I'm not sure if it's POSIX or not but the nftw library function is widely available on UNIX (HP-UX, AIX, Linux).
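For reference, nftw is specified by POSIX as part of the XSI option. A minimal sketch of a search built on it is shown below. findByName, visit and the two file-scope variables are hypothetical names; the globals are needed because the nftw callback has no user-data parameter, which also makes this version unsuitable for concurrent use.

```cpp
#include <ftw.h>
#include <sys/stat.h>
#include <cstdio>
#include <string>
#include <vector>

namespace {
// State shared with the callback (nftw offers no way to pass a context).
std::string g_target;
std::vector<std::string> g_matches;

// Called by nftw() once per entry; info->base is the offset of the
// file name within `path`.
int visit(const char* path, const struct stat*, int typeflag, struct FTW* info) {
    if (typeflag == FTW_F && g_target == path + info->base) {
        g_matches.push_back(path);
    }
    return 0;  // zero tells nftw() to keep walking
}
}  // namespace

// Returns the paths of all non-directory entries named `target` under `root`.
std::vector<std::string> findByName(const std::string& root,
                                    const std::string& target) {
    g_target = target;
    g_matches.clear();
    // FTW_PHYS: do not follow symbolic links.
    // 16: upper bound on directories nftw() keeps open at once.
    if (nftw(root.c_str(), visit, 16, FTW_PHYS) == -1) {
        std::perror(("nftw " + root).c_str());
    }
    return g_matches;
}
```

A call such as findByName("/", "card0") walks the whole file system, so it can take a while and will skip directories the process cannot read.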
Your problem is "search a tree for a match"
BFS and DFS are the canonical basic algorithms. Give them a start node and go.
You will get into trouble if you follow symlinks; so test for them and don't follow them.
You should be able to map each point in the *FS algorithms to a directory operation.
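To make the mapping concrete, here is a minimal breadth-first sketch in which the queue holds directories and "visiting a node" means listing one directory. It uses C++17 std::filesystem rather than raw POSIX calls, which is a substitution made for brevity, and findBreadthFirst is a hypothetical name. Like find -name, it matches entries of any type.

```cpp
#include <filesystem>
#include <queue>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Breadth-first search: returns every entry under `root` whose name is `name`.
std::vector<fs::path> findBreadthFirst(const fs::path& root,
                                       const std::string& name) {
    std::vector<fs::path> matches;
    std::queue<fs::path> pending;  // directories discovered but not yet listed
    pending.push(root);

    while (!pending.empty()) {
        const fs::path dir = pending.front();
        pending.pop();

        std::error_code ec;
        fs::directory_iterator it(dir, ec);  // unreadable directories are skipped
        const fs::directory_iterator end;
        while (!ec && it != end) {
            if (it->path().filename().string() == name) {
                matches.push_back(it->path());
            }
            // symlink_status() describes the link itself, so a symlink that
            // points at a directory is never queued and cannot cause a cycle.
            std::error_code typeEc;
            if (fs::is_directory(it->symlink_status(typeEc))) {
                pending.push(it->path());
            }
            it.increment(ec);
        }
    }
    return matches;
}
```

Replacing the queue with a stack turns the same loop into a depth-first search.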
Since C++ is an option, why not use something like Boost.Filesystem? The Boost.Filesystem two-minute tutorial gives an example of how to implement your search using directory iterators.

How can I automatically open the first file in a folder using C++?

How can I automatically open and read the content of a file within a given directory from a C++ application without knowing the file's name?
For example (a rough description of the program):
#include <iomanip>
#include <dirent.h>
#include <fstream>
#include <iostream>
#include <stdlib.h>
#include <stdio.h>
using namespace std;
int main()
{
DIR* dir;
struct dirent* entry;
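dir=opendir("C:\\Users\\Toshiba\\Desktop\\links\\");
if (dir == NULL) return 1; // directory could not be opened
printf("Directory contents: ");
for(int i=0; i<3; i++)
{
entry=readdir(dir);
if (entry == NULL) break; // fewer than three entries
printf("%s\n",entry->d_name);
}
closedir(dir);
return 0;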
}
This will print the name of the first file in that directory. My problem is how to read that particular file's content and save it in a .txt document. Can ifstream do that? (Sorry for my bad English.)
this should do it
#include <cstdlib> // exit
#include <fstream>
#include <iostream>
#include <string>
#include <boost/filesystem/operations.hpp>
using namespace boost::filesystem;
using namespace std;
void copyfiles(string s); // defined below, declared here so show_files can call it
void show_files( const path & directory, bool recurse_into_subdirs = true )
{
if( exists( directory ) )
{
directory_iterator end ;
for( directory_iterator iter(directory) ; iter != end ; ++iter )
{
if ( is_directory( iter->status() ) )
{
cout << iter->path().string() << " (directory)\n" ;
if( recurse_into_subdirs ) show_files( iter->path() ) ;
}
else
{
cout << iter->path().string() << " (file)\n" ;
copyfiles( iter->path().string() ); // only regular entries are read
}
}
}
}
void copyfiles(string s)
{
ifstream inFile;
inFile.open(s);
if (!inFile.is_open())
{
cout << "Unable to open file";
exit(1); // terminate with error
}
//Display contents
string line = "";
//Getline to loop through all lines in file
while(getline(inFile,line))
{
cout<<line<<endl; // line buffers for every line
//here add your code to store this content in any file you want.
}
inFile.close();
}
int main()
{
show_files( "/usr/share/doc/bind9" ) ;
return 0;
}
If you're on Windows you can use the FindFirstFile in the Windows API. Here is a short example:
HANDLE myHandle;
WIN32_FIND_DATA findData;
myHandle = FindFirstFile("C:\\Users\\Toshiba\\Desktop\\links\\*", &findData);
if (myHandle != INVALID_HANDLE_VALUE) {
do {
if (findData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY){
cout << "Directoryname is " << findData.cFileName << endl;
}
else{
cout << "Filename is " << findData.cFileName << endl;
}
} while (FindNextFile(myHandle, &findData));
FindClose(myHandle); // release the search handle
}
Otherwise I'd go with ayush's answer; Boost works on Unix systems as well.
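For the specific case in the question (take whichever file comes first and save its content to a .txt document), a portable sketch with C++17 std::filesystem and plain fstream could look like this. firstRegularFile and saveCopy are hypothetical names, and "first" here simply means the first regular file the iterator happens to report, since directory order is unspecified.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Returns the first regular file reported for `dir`, or an empty path
// if there is none or the directory cannot be read.
fs::path firstRegularFile(const fs::path& dir) {
    std::error_code ec;
    fs::directory_iterator it(dir, ec);
    const fs::directory_iterator end;
    while (!ec && it != end) {
        std::error_code typeEc;
        if (it->is_regular_file(typeEc)) return it->path();
        it.increment(ec);
    }
    return fs::path();
}

// Copies the bytes of `source` into `destination`.
// Returns false if either file cannot be opened or the write fails.
bool saveCopy(const fs::path& source, const fs::path& destination) {
    std::ifstream in(source, std::ios::binary);
    std::ofstream out(destination, std::ios::binary);
    if (!in || !out) return false;
    // Streaming an empty buffer would set failbit on `out`, so check first.
    if (in.peek() != std::ifstream::traits_type::eof()) {
        out << in.rdbuf();
    }
    return out.good();
}
```

So yes, ifstream can do it: saveCopy(firstRegularFile(folder), "output.txt") reads the file and writes its content out, where folder and output.txt stand for whatever paths you use.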