Recursive Directories and File streaming and Searching Strings - c++

I have an issue with the recursive call in walkThroughDirectory.
The code is supposed to walk through directories, count the subdirectories, and, when it finds a file, open it and search it for a given string.
It only goes through one directory. Can someone help me with this? The braces and indentation are a little off; kindly ignore that.
int directories=0;
void walkThroughDirectory(char *directory_name,char *searchString){
DIR * directory;
struct dirent * walker;
char d_name[PATH_MAX];
int path_length;
char path[PATH_MAX];
directory=opendir(directory_name);
if(directory==NULL){
cout<<"Error"<<endl;
cout<<directory_name<<" Cannot be Opened"<<endl;
exit(10000);
}
while((walker=readdir(directory)) !=NULL){
strcpy(d_name,walker->d_name);
cout<<directory_name<<"/"<<endl;
if (strcmp (d_name, "..") == 0 &&
strcmp (d_name, ".") == 0){
continue;
}
else{
path_length = snprintf(path,PATH_MAX,"%s/%s\n",directory_name,d_name);
cout<<"HELLO"<<endl;
cout<<path<<endl;
if (path_length >= PATH_MAX){
cout<<"Path is too long"<<endl;
exit (1000);
}
if(walker->d_type==DT_DIR){
cout<<"Hello"<<endl;
directories++;
walkThroughDirectory (path,searchString);
}
else if(walker->d_type==DT_REG){
ifstream openFile;
openFile.open(path);
char line[1500];
int currentLine = 0;
if (openFile.is_open()){
while (openFile.good()){
currentLine++;
openFile.getline(line, 1500);
if (strstr(line, searchString) != NULL)
cout<<path<<": "<<currentLine<<": "<<line<<endl;
}
}
openFile.close();
}
/*
struct stat directory_stat;
if (stat(path, &directory_stat) == -1){
return;
}
if (S_ISDIR(directory_stat.st_mode)){
cout<<"HELLO"<<endl;
directories++;
walkThroughDirectory(path, searchString);
}
else if (S_ISREG(directory_stat.st_mode)){
ifstream openFile;
openFile.open(path);
char line[1500];
int currentLine = 0;
if (openFile.is_open()){
while (openFile.good()){
currentLine++;
openFile.getline(line, 1500);
if (strstr(line, searchString) != NULL)
cout<<path<<": "<<currentLine<<": "<<line<<endl;
}
}
// it's a file so search for text in file
}
*/
}
}
if (closedir (directory))
{
cout<<"Unable to close "<<directory_name<<endl;
exit (1000);
}
}
int main(){
char * name;
name=new char;
cout<<"Total Directories "<< directories<<endl;
name=get_current_dir_name();
cout<<"Current Directory is: "<<name<<endl;
/*
cout<<"Now Enter The Desired Directory from the root or the current path"<<endl;
char *desiredDirectory;
desiredDirectory=new char;
cin>>desiredDirectory;
cout<<"Enter The String You want to search"<<endl;
char *searchString;
searchString=new char;
cin>>searchString;
*/
char ourpath[400];
strcpy(ourpath,name);
walkThroughDirectory(ourpath,"diminutive");
cout<<"Total Directories "<< directories<<endl;
return 0;
}

This code has several problems. First, when you use strcmp to check whether d_name is "." or "..", you need an OR, not an AND: a name can never equal both at once, so the original condition is never true and the two entries are never skipped. Second, when you call snprintf to build the path, the format string should not end with a newline. The trailing "\n" becomes part of the path, so opendir fails on the first subdirectory; this is what was causing your code to only go one level deep. Third, get_current_dir_name does all the malloc work for you. Allocating a single char with new beforehand is not a correct use of the API: that allocation is too small to hold a path, and it leaks as soon as name is reassigned. Release the returned pointer with free when you are done. See the man page for get_current_dir_name.
The code below addresses these issues (and also has proper indentation).
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dirent.h>
#include <limits.h>
#include <string.h>
int directories=0;
// const char* so that string literals can be passed in.
void walkThroughDirectory(const char *directory_name, const char *searchString)
{
    DIR *directory;
    struct dirent *walker;
    char d_name[PATH_MAX];
    int path_length;
    char path[PATH_MAX];
    directory = opendir(directory_name);
    if(directory == NULL)
    {
        std::cout << directory_name << " Cannot be Opened" << std::endl;
        exit(1);
    }
    while((walker=readdir(directory)) != NULL)
    {
        strcpy(d_name, walker->d_name);
        // Needs to be || not &&
        if (strcmp(d_name, "..") == 0 || strcmp(d_name, ".") == 0)
        {
            continue;
        }
        else
        {
            // No newline on the path name.
            path_length = snprintf(path, PATH_MAX, "%s/%s", directory_name, d_name);
            if (path_length >= PATH_MAX)
            {
                std::cout << "Path is too long" << std::endl;
                exit(2);
            }
            if(walker->d_type == DT_DIR)
            {
                directories++;
                walkThroughDirectory(path, searchString);
            }
            else if(walker->d_type==DT_REG)
            {
                std::ifstream openFile;
                openFile.open(path);
                char line[1500];
                int currentLine = 0;
                if (openFile.is_open())
                {
                    // Loop on getline itself so a failed read is never processed.
                    while (openFile.getline(line, 1500))
                    {
                        currentLine++;
                        if (strstr(line, searchString) != NULL)
                            std::cout << path << ": " << currentLine << ": " << line << std::endl;
                    }
                }
                openFile.close();
            }
        }
    }
    if (closedir(directory))
    {
        std::cout << "Unable to close " << directory_name << std::endl;
        exit(3);
    }
}
int main()
{
    // get_current_dir_name() mallocs a string for you.
    char *name;
    name = get_current_dir_name();
    walkThroughDirectory(name, "matthew");
    free(name);
    std::cout << "Total Directories: " << directories << std::endl;
    return 0;
}
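The per-file search in the code above uses a fixed 1500-character buffer, so a longer line stops the scan of that file. As a rough sketch of one way around that (not part of the original answer), the search can be written with std::string and std::getline. The function name findMatchingLines is made up for this illustration.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based numbers of the lines in `path`
// that contain `needle`. A file that cannot be opened yields an empty result.
std::vector<int> findMatchingLines(const std::string &path, const std::string &needle)
{
    std::vector<int> matches;
    std::ifstream in(path.c_str());
    std::string line;
    int lineNumber = 0;
    // std::getline grows `line` as needed, so there is no length limit to manage.
    while (std::getline(in, line)) {
        ++lineNumber;
        if (line.find(needle) != std::string::npos)
            matches.push_back(lineNumber);
    }
    return matches;
}
```

Inside walkThroughDirectory, the DT_REG branch could call findMatchingLines(path, searchString) and print each returned line number.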

Related

reading files in a directory C++

I'm reading the files in a directory and passing each one to a function. I think I'm doing it the wrong way, but I can't figure out what is wrong.
Here is my code. It first reads the file names in a folder, then sends each one to a function for further processing.
#include <dirent.h>
#include <stdio.h>
#include <vector>
#include <string>
#include <iostream>
using namespace std;
std::vector<std::string> fileName;
int main(void)
{
DIR *d;
struct dirent *dir;
vector<string> fileList;
int i=0;
d = opendir("files");
if (d)
{
while ((dir = readdir(d)) != NULL)
{
i++;
fileList.push_back(dir->d_name);
}
for(int i=0;i<fileList.size();i++) {
cout<<fileList[i]<<endl;
doSomething(fileList[i]);
}
closedir(d);
}
return(0);
}
int doSomething(fileName) {
//do something
}
Error
main.cpp: In function ‘int main()’:
main.cpp:29:28: error: ‘doSomething’ was not declared in this scope
doSomething(fileList[i]);
^
main.cpp: At global scope:
main.cpp:37:26: error: cannot convert ‘std::vector<std::basic_string<char> >’ to ‘int’ in initialization
int doSomething(fileName) {
^
main.cpp:37:28: error: expected ‘,’ or ‘;’ before ‘{’ token
int doSomething(fileName) {
^
Since your doSomething function is defined after main, it is not visible at the point of the call, which causes the first error. The fix is to at least declare the function before main:
int doSomething(); //declaration
int main()
{
doSomething(); //now the function is declared
}
//definition
int doSomething()
{
}
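The second and third errors are emitted because you didn't include the type of the fileName parameter in your function definition. Without a type, the compiler reads the line as a declaration of an int variable named doSomething initialized from the global vector fileName, hence the conversion error. Based on your code, the parameter should be a string: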
int doSomething(string fileName)
{
}
I also noticed that, although this function returns int, you are not using its return value. Nevertheless, don't forget to return something from doSomething; flowing off the end of a non-void function causes undefined behavior.
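Putting the declaration, the typed parameter, and the return value together, a minimal sketch might look like this. The processAll helper and the placeholder body (which just returns the length of the name) are invented for illustration; the real work would go where the comment indicates.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Declaration: makes doSomething visible to code that appears before its definition.
int doSomething(const std::string &fileName);

// Hypothetical caller: invokes doSomething for every name and sums the results.
int processAll(const std::vector<std::string> &fileList)
{
    int total = 0;
    for (std::size_t i = 0; i < fileList.size(); ++i)
        total += doSomething(fileList[i]);
    return total;
}

// Definition: the parameter has a type, and every path returns a value.
int doSomething(const std::string &fileName)
{
    // Placeholder for the real work: report the length of the name.
    return static_cast<int>(fileName.size());
}
```

Taking the name by const reference avoids copying each string on every call.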
Yes, Boost is great, but it's a bit bloated. So, just for completeness, here is a version without Boost, applied to reading images in a directory for OpenCV:
// you need these includes for the function
//#include <windows.h> // for windows systems
#include <dirent.h> // for linux systems
#include <sys/stat.h> // for linux systems
#include <algorithm> // std::sort
#include <string> // std::string
#include <vector> // std::vector
#include <opencv2/opencv.hpp>
#include <iostream> //cout
using namespace std;
/* Returns a list of files in a directory (except the ones that begin with a dot) */
int readFilenames(std::vector<string> &filenames, const string &directory)
{
#ifdef WINDOWS
HANDLE dir;
WIN32_FIND_DATA file_data;
if ((dir = FindFirstFile((directory + "/*").c_str(), &file_data)) == INVALID_HANDLE_VALUE)
return 0; /* No files found; the function returns int, so return a count */
do {
const string file_name = file_data.cFileName;
const string full_file_name = directory + "/" + file_name;
const bool is_directory = (file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
if (file_name[0] == '.')
continue;
if (is_directory)
continue;
filenames.push_back(full_file_name);
} while (FindNextFile(dir, &file_data));
FindClose(dir);
#else
DIR *dir;
struct dirent *ent;
struct stat st;
dir = opendir(directory.c_str());
if (dir == NULL)
return 0; /* Directory could not be opened */
while ((ent = readdir(dir)) != NULL) {
const string file_name = ent->d_name;
const string full_file_name = directory + "/" + file_name;
if (file_name[0] == '.')
continue;
if (stat(full_file_name.c_str(), &st) == -1)
continue;
const bool is_directory = (st.st_mode & S_IFDIR) != 0;
if (is_directory)
continue;
// filenames.push_back(full_file_name); // returns full path
filenames.push_back(file_name); // returns just filename
}
closedir(dir);
#endif
std::sort (filenames.begin(), filenames.end()); //optional, sort the filenames
return(filenames.size()); //Return how many we found
} // GetFilesInDirectory
void help(const char **argv) {
cout << "\n\n"
<< "Call:\n" << argv[0] << " <directory path>\n\n"
<< "Given a directory of images, create a vector of\n"
<< "their names, read and display them. Filter out\n"
<< "non-images\n"
<< endl;
}
int main( int argc, const char** argv )
{
if(argc != 2) {
cerr << "\nIncorrect number of parameters: " << argc << ", should be 2\n" << endl;
help(argv);
return -1;
}
string folder = argv[1];
cout << "Reading in directory " << folder << endl;
vector<string> filenames;
int num_files = readFilenames(filenames, folder);
cout << "Number of files = " << num_files << endl;
cv::namedWindow( "image", 1 );
for(size_t i = 0; i < filenames.size(); ++i)
{
const string full_path = folder + "/" + filenames[i]; // readFilenames returns bare names
cout << full_path << " #" << i << endl;
cv::Mat src = cv::imread(full_path);
if(!src.data) { //Protect against no file
cerr << full_path << ", file #" << i << ", is not an image" << endl;
continue;
}
cv::imshow("image", src);
cv::waitKey(250); //For fun, wait 250ms, or a quarter of a second, but you can put in "0" for no wait or -1 to wait for keypresses
/* do whatever you want with your images here */
}
}

Traversing directory and iterators in c++

I am an absolute newbie to C++ and only started programming with it 3 days ago.
I am trying to do the following:
Traverse a directory for X.X files (typically *.*), and for each file, do the following:
Search within the file for a string (findFirst) and then search until another string (findLast). The files will be in HTML format.
Within this selection, I want to perform several tasks (yet to be written), which will be the following:
One of the strings will be the filename I want to write to, so extract this field and create an output file with this name.
Some of the lines will be manufacturer part numbers; extract these and format the output file accordingly.
Most of it will be the description of the product. Again, this will be in an HTML construct, so extract this and format the output file.
So far, I have managed to get the directory traversal working, along with the selection between the start and finish keywords, using some help from the internet.
My problem is here:
processFiles(inputFileName, "testing", "finish");
I need inputFileName to be the name of the file currently being traversed.
All the examples I have found simply print the filename using cout.
I need to pass it into the processFiles function.
Can someone tell me what I need to use? I have tried it->c_Str() and other variations of (*it), .at, .begin, etc.
My non-printing example is below:
// Chomp.cpp : Defines the entry point for the console application.
//
#include <stdafx.h>
#include <windows.h>
#include <string>
#include <iostream>
#include <fstream>
#include <stdlib.h>
#include <cctype>
#include <algorithm>
#include <vector>
#include <stack>
//std::ifstream inFile ( "c:/temp/input.txt" ) ;
std::ofstream outFile( "c:/temp/output.txt") ;
using namespace std;
/////////////////////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////
void openFiles()
{
if (!(outFile.is_open()))
{
printf ("Could not Create Output file\n");
exit(0);
}
}
/////////////////////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////
bool ListFiles(wstring path, wstring mask, vector<wstring>& files)
{
HANDLE hFind = INVALID_HANDLE_VALUE;
WIN32_FIND_DATA ffd;
wstring spec;
stack<wstring> directories;
directories.push(path);
files.clear();
while (!directories.empty())
{
path = directories.top();
spec = path + L"\\" + mask;
directories.pop();
hFind = FindFirstFile(spec.c_str(), &ffd);
if (hFind == INVALID_HANDLE_VALUE)
return false;
do
{
if (wcscmp(ffd.cFileName, L".") != 0 && wcscmp(ffd.cFileName, L"..") != 0)
{
if (ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
directories.push(path + L"\\" + ffd.cFileName);
else
files.push_back(path + L"\\" + ffd.cFileName);
}
} while (FindNextFile(hFind, &ffd) != 0);
if (GetLastError() != ERROR_NO_MORE_FILES)
{
FindClose(hFind);
return false;
}
FindClose(hFind);
hFind = INVALID_HANDLE_VALUE;
}
return true;
}
/////////////////////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////
void processFiles(const wchar_t *inFileName, std::string findFirst,std::string findLast )
{
/*
std::string findFirst = "testing" ;
std::string findLast = "finish" ;
*/
std::string inputLine ;
int lineNum = 0 ;
char buffer[2048];
size_t found = 0;
std::ifstream inFile;
inFile.open (inFileName); // Open The file
if (inFile.is_open())
{
while( std::getline( inFile, inputLine ))
{
++lineNum ;
// printf ("Line len = %d\n ", inputLine.length());
if( (found = inputLine.find(findFirst)) != std::string::npos )
{
std::cout << "###Line " << lineNum << " At Position [ " << found << " ]\n" ;
sprintf_s(buffer, 2048, "[%-5.5d] %s\n", lineNum, inputLine.c_str());
outFile << buffer ;
bool foundLast = 0;
while( std::getline( inFile, inputLine ))
{
++lineNum ;
sprintf_s(buffer, 2048, "[%-5.5d] %s\n", lineNum, inputLine.c_str());
if( (found = inputLine.find(findLast)) != std::string::npos )
{
outFile << buffer ;
break; // Found last string - so stop after printing last line
}
else
outFile << buffer ;
}
}
else
{
// std::cout << "=>" << inputLine << '\n' ;
}
}
}
else
{
printf ("Cant open file \n");
exit(0);
}
inFile.close() ; // Close The file
}
/////////////////////////////////////////////////////////////////////////////////////////////
/// M A I N
/////////////////////////////////////////////////////////////////////////////////////////////
int main()
{
std::ifstream inFile ;
int startLine = 0;
int endLine = 0;
int lineSize = 0;
char buffer[512];
vector<wstring> files; // For Parsing Directory structure
openFiles();
// Start The Recursive parsing of Directory Structure
if (ListFiles(L"C:\\temp", L"*.*", files))
{
for (vector<wstring>::iterator it = files.begin(); it != files.end(); ++it)
{
printf ("Filename1 is %s\n", it->c_str());
printf ("Filename2 is %s\n", files.begin());
outFile << "\n------------------------------\n";
//outFile << it << endl;
wcout << it->c_str() << endl;
outFile << "\n------------------------------\n";
const wchar_t *inputFileName = it->c_str();
// processFiles(inputFileName, "testing", "finish");
// getchar();
}
}
outFile.close();
getchar();
}
Make your processFiles accept a wstring, viz:
void processFiles(wstring inFileName, std::string findFirst,std::string findLast )
{
// Make the necessary changes so that you use a wstring for inFileName
}
Call it from main() using:
processFiles(*it, "testing", "finish");
You need to change processFiles to use a wifstream instead of an ifstream, and you should change all of your narrow strings to wide strings (or vice versa). Narrow and wide strings are not compatible with each other; to use one with the other, you must go through a conversion function such as mbstowcs.
Edit:
You can find an example that should compile here.
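As a rough illustration of the conversion step mentioned above, here is a sketch that goes in the wide-to-narrow direction using std::wcsrtombs (the counterpart of mbstowcs). The function name toNarrow is made up for this example, and the result depends on the current C locale: in the default "C" locale, only ASCII characters are guaranteed to convert.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts a wide string to a narrow (multibyte) string
// according to the current C locale. Throws if a character cannot be converted.
std::string toNarrow(const std::wstring &wide)
{
    std::mbstate_t state = std::mbstate_t();
    const wchar_t *src = wide.c_str();
    // First pass with a null destination: only measures the required length.
    std::size_t needed = std::wcsrtombs(NULL, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        throw std::runtime_error("wide string is not convertible in this locale");
    std::vector<char> buffer(needed + 1);   // +1 for the terminating null
    src = wide.c_str();
    state = std::mbstate_t();
    // Second pass: performs the actual conversion into the buffer.
    std::wcsrtombs(buffer.data(), &src, buffer.size(), &state);
    return std::string(buffer.data(), needed);
}
```

With such a helper, the loop in main could call processFiles(toNarrow(*it).c_str(), "testing", "finish") if processFiles were changed to take a const char* file name; that signature change is an assumption, not something the answer prescribes.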

How To Fix C++ File Or Directory Program

In an effort to learn plain ANSI C++, I am creating a console program that will hopefully:
1. Check whether a file/folder exists.
2. Differentiate between a file and a folder.
3. Count the number of lines in a file.
4. Recurse through all the files in a folder.
I don't want to include any libraries; I'm just trying to get my head around basic C++ for the moment.
The problem I am having is with stage 2 of the application. I was struggling to find a way to tell if I had a file or directory, and after doing some reading I came across the library.
When the program calls the
checkIsDir()
function I get thrown an error
thread1 : EXEC_BAD_ACCESS(code = 1, address = 0x8)
I presume it is a memory problem, but I don't really have the hang of the basics of C++, never mind memory management. Here is my code:
#include <iostream>
#include <fstream>
#include <dirent.h>
#include <sys/stat.h>
using namespace std;
bool checkIsDir(const char *fileData);
bool checkFileExists(const char *fileName, const char *directory);
int countTheNumberOfLines(const char *fileName);
int main(int argc, const char * argv[])
{
string directory, fileName;
while(fileName != "-1"){
cout << "Please enter your filename or enter -1 to quit:" << endl;
getline(cin, fileName);
cout << "Please enter your directory :" << endl;
getline(cin, directory);
if(checkFileExists(fileName.c_str(), directory.c_str())){
cout << fileName << " : exists" << endl;
if(checkIsDir(fileName.c_str())){ //code breaks when calling this function
cout << "==================| ";
cout << fileName << " is a Directory";
cout << " |==================";
}
if(checkIsDir(fileName.c_str())){
cout << fileName << " is a directory" << endl;
}
} else {
cout << fileName << " : not found";
cout << endl;
}
}
cout << "\nGoodbye";
return 0;
}
bool checkFileExists(const char *fileName, const char *directory){
const char* dirContent;
dirent* dirStruct;
DIR* dir;
dir = opendir(directory);
if(dir == NULL) return false;
while((dirStruct = readdir(dir))){
dirContent = dirStruct->d_name;
if(strcmp(fileName, dirContent) == 0){
closedir(dir);
return true;
}
}
closedir(dir);
return false;
}
bool checkIsDir(const char * fileData){
struct stat data;
struct dirent *file;
DIR *dir;
dir = opendir(fileData);
while((file = readdir(dir))){
stat(file->d_name, &data);
if(S_ISDIR(data.st_mode)){
closedir(dir);
return true;
} else {
closedir(dir);
return false;
}
}
closedir(dir);
return false;
}
int countTheLines(const char *fileName){
int lineNums = 0;
//implement
return lineNums;
}
StackExchange is the best place for help on the internet and I appreciate all the help that everyone gives.
Thanks in advance
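One likely cause of the crash (an assumption, since it cannot be confirmed without a debugger): checkIsDir passes the name to opendir, which returns NULL when the name is a regular file or is not found relative to the current working directory, and readdir is then called on that NULL pointer. A directory check does not need opendir at all. The sketch below calls stat on the full path and tests the result; the names isDirectory and joinPath are made up for this illustration.

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical helper: true only if `path` exists and is a directory.
bool isDirectory(const std::string &path)
{
    struct stat info;
    if (stat(path.c_str(), &info) != 0)
        return false;              // stat failed: missing or inaccessible path
    return S_ISDIR(info.st_mode);
}

// Hypothetical helper: joins a directory and an entry name with a single '/'.
std::string joinPath(const std::string &directory, const std::string &name)
{
    if (directory.empty() || directory[directory.size() - 1] == '/')
        return directory + name;
    return directory + "/" + name;
}
```

In main, the check would then be isDirectory(joinPath(directory, fileName)), so that the name is resolved inside the directory the user entered rather than in the current working directory.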

How can I automatically open the first file in a folder using C++?

How can I automatically open and read the content of a file within a given directory from a C++ application without knowing the file's name?
For example (a rough description of the program):
#include <iomanip>
#include <dirent.h>
#include <fstream>
#include <iostream>
#include <stdlib.h>
using namespace std;
int main()
{
DIR* dir;
struct dirent* entry;
dir=opendir("C:\\Users\\Toshiba\\Desktop\\links\\");
printf("Directory contents: ");
for(int i=0; i<3; i++)
{
entry=readdir(dir);
printf("%s\n",entry->d_name);
}
return 0;
}
This will print the name of the first file in that directory. My problem is how to read that particular file's content and save it in a .txt document. Can ifstream do that? (Sorry for my bad English.)
This should do it:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <boost/filesystem.hpp>
using namespace boost::filesystem;
using namespace std;
void copyfiles(const string &s); // declared before show_files uses it
void show_files( const path & directory, bool recurse_into_subdirs = true )
{
    if( exists( directory ) )
    {
        directory_iterator end ;
        for( directory_iterator iter(directory) ; iter != end ; ++iter )
        {
            if ( is_directory( iter->path() ) )
            {
                cout << iter->path().string() << " (directory)\n" ;
                if( recurse_into_subdirs ) show_files( iter->path() ) ;
            }
            else
            {
                cout << iter->path().string() << " (file)\n" ;
                copyfiles( iter->path().string() ) ; // files only, inside the loop
            }
        }
    }
}
void copyfiles(const string &s)
{
    // std:: is spelled out because boost::filesystem also defines an ifstream.
    std::ifstream inFile;
    inFile.open(s.c_str());
    if (!inFile.is_open())
    {
        cout << "Unable to open file";
        exit(1); // terminate with error
    }
    //Display contents
    string line = "";
    //Getline to loop through all lines in file
    while(getline(inFile,line))
    {
        cout<<line<<endl; // line buffers for every line
        //here add your code to store this content in any file you want.
    }
    inFile.close();
}
int main()
{
    show_files( "/usr/share/doc/bind9" ) ;
    return 0;
}
If you're on Windows, you can use FindFirstFile from the Windows API. Here is a short example:
HANDLE myHandle;
WIN32_FIND_DATAA findData;
// The A variants take narrow strings regardless of the UNICODE setting.
myHandle = FindFirstFileA("C:\\Users\\Toshiba\\Desktop\\links\\*", &findData);
if (myHandle != INVALID_HANDLE_VALUE) { // the search fails if the folder is missing
    do {
        if (findData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY){
            cout << "Directoryname is " << findData.cFileName << endl;
        }
        else{
            cout << "Filename is " << findData.cFileName << endl;
        }
    } while (FindNextFileA(myHandle, &findData));
    FindClose(myHandle); // release the search handle
}
Otherwise I'd go with ayush's answer; Boost works on Unix systems as well.
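Neither answer above shows the step the question actually asks about: picking the first file and reading its content. Here is a rough sketch of that step, assuming a C++17 compiler so that std::filesystem (the standardized successor to Boost.Filesystem) is available. The names firstRegularFile and readWholeFile are made up for this example. Note that directory iteration order is unspecified, so "first" means whichever regular file the system reports first.

```cpp
#include <filesystem>
#include <fstream>
#include <sstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns the first regular file reported in `directory`,
// or an empty path if there is none or the directory cannot be read.
fs::path firstRegularFile(const fs::path &directory)
{
    std::error_code ec;
    for (fs::directory_iterator it(directory, ec), end; !ec && it != end; it.increment(ec)) {
        if (it->is_regular_file(ec))
            return it->path();
    }
    return fs::path();
}

// Hypothetical helper: reads the whole file into a string (empty on failure).
std::string readWholeFile(const fs::path &file)
{
    std::ifstream in(file, std::ios::binary);
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}
```

Saving the content to a .txt document is then a matter of opening a std::ofstream on the target name and writing the returned string to it.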

C++ Multi threaded directory scan code

I was looking into how to write multi-threaded C++ code for scanning a directory and getting a list of all the files underneath it. I have written single-threaded code that does this, shown below.
#include <sys/types.h>
#include <dirent.h>
#include <errno.h>
#include <vector>
#include <string>
#include <iostream>
#include <sys/stat.h> /* for stat() */
#include <cstring> /* for strcmp() */
using namespace std;
int isDir(string path);
/*function... might want it in some class?*/
int getdir (string dir, vector<string> &dirlist, vector<string> &fileList)
{
DIR *dp;
struct dirent *dirp, *dirFp ;
if((dp = opendir(dir.c_str())) == NULL) {
cout << "Error(" << errno << ") opening " << dir << endl;
return errno;
}
while ((dirp = readdir(dp)) != NULL) {
if (strcmp (dirp->d_name, ".") != 0 && strcmp(dirp->d_name, "..") != 0) {
//dirlist.push_back(string(dirp->d_name));
string Tmp = dir.c_str()+ string("/") + string(dirp->d_name);
if(isDir(Tmp)) {
//if(isDir(string(dir.c_str() + dirp->d_name))) {
dirlist.push_back(Tmp);
getdir(Tmp,dirlist,fileList);
} else {
// cout << "Files :"<<dirp->d_name << endl;
fileList.push_back(string(Tmp));
}
}
}
closedir(dp);
return 0;
}
int isDir(string path)
{
struct stat stat_buf;
stat( path.c_str(), &stat_buf);
int is_dir = S_ISDIR( stat_buf.st_mode);
// cout <<"isDir :Path "<<path.c_str()<<endl;
return ( is_dir ? 1: 0);
}
int main()
{
string dir = string("/test1/mfs");
vector<string> dirlist = vector<string>();
vector<string> fileList = vector<string>();
getdir(dir,dirlist,fileList);
#if 0
for (unsigned int i = 0;i < dirlist.size();i++) {
cout << "Dir LIst" <<dirlist[i] << endl;
//string dirF = dir + "/" + dirlist[i];
//getdir(dirF,fileList);
}
#endif
for (unsigned int i = 0; i < fileList.size(); i++)
cout << "Files :"<<fileList[i]<< endl;
return 0;
}
The issue is that it is single-threaded, and I need to scan about 8000 directories under which files can be present. I can't work out how to do this, since the number of directories varies: it is determined by an N-dimensional matrix.
Any help in this regard would be great. Thanks in advance.
boost::filesystem has directory_iterator and recursive_directory_iterator. The former gets all the contents of a directory but does not recurse into subdirectories; the latter recurses into subdirectories as well.
With regard to thread safety, you could lock a mutex and then copy the results into a std::vector, or into two vector instances, one for files and one for directories, in which case you will at least have a local snapshot copy.
To actually "freeze" the file system at that point, so that no other process can modify it, is not something you can normally do. You could try setting the file attributes to read-only and changing them back later, but you would need permission to do that first.