read only .txt files from directory which has subfolders too - c++

I am trying to read only the .txt files in a directory.
I am not using arrays.
I am using opendir() to open the directory.
d->d_name lists all the files and also the subfolders.
I want to read only the .txt files, not the subfolders.
Please help.
Thanks

Can you not use FindFirstFile and FindNextFile for this?

Well, something like:
call opendir() to open the directory
in a loop, call readdir to read each entry
for each entry, examine the name to see if the last 4 characters are ".txt"
if they are, do something
at the end, call closedir to close the directory

You can use the stat() function to determine the type of file your struct dirent represents.
struct stat sb;
int rc = stat(filename, &sb);
// error handling if stat failed
if (S_ISREG(sb.st_mode)) {
    // it's a regular file, process it
} else {
    // it's not a regular file, skip it
}
Read the man pages for details. Also take care that the filename in d_name does not contain the directory part. If you're in a different directory than what you opendir'd, you'll need to prepend the directory name (and a directory separator if required).
For a C++ alternative, please see boost::filesystem.
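The two points above (check the type with stat(), and prepend the directory because d_name holds only the entry name) can be combined in one small helper. This is a minimal sketch for a POSIX system; the name isRegularFile is hypothetical and not part of any library.

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical helper: true if dir/name exists and is a regular file.
bool isRegularFile(const std::string& dir, const std::string& name)
{
    std::string full = dir;
    if (!full.empty() && full.back() != '/')
        full += '/';              // add the separator only when it is missing
    full += name;

    struct stat sb;
    if (stat(full.c_str(), &sb) != 0)
        return false;             // stat failed (for example, no such entry): skip it
    return S_ISREG(sb.st_mode);   // false for directories, sockets, devices, ...
}
```

Inside the readdir loop it would be called as isRegularFile(directoryName, entry->d_name) before looking at the extension.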

You could try putting the filenames into a simple structure (a string array or vector, for example), then pass a reference to that structure to a function that prunes the names that don't use the .txt extension.
In the function, look at each filename (a for loop would be handy), and use the find function in the String library to check whether the last four characters are ".txt". You can set the starting position of the search to string_name.length() - 4 so that only the last four characters are compared; check first that the name is at least four characters long.
Cplusplus.com is a great reference for things like the String library: http://www.cplusplus.com/reference/string/string/find/
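A pruning function along those lines might look like the sketch below. It uses std::string::compare on the tail of the name rather than find, which avoids matching ".txt" in the middle of a name; endsWith and keepTxtOnly are hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// True if name ends with suffix; the size check protects short names.
bool endsWith(const std::string& name, const std::string& suffix)
{
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Remove, in place, every entry that does not end in ".txt".
void keepTxtOnly(std::vector<std::string>& names)
{
    names.erase(std::remove_if(names.begin(), names.end(),
                               [](const std::string& n) { return !endsWith(n, ".txt"); }),
                names.end());
}
```

Note that this only looks at names, so a subfolder called "notes.txt" would survive the pruning; a type check is still needed to exclude directories.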

Assuming you are on a Linux/Posix system, you can use scandir(...). You can find the details on the manual page, but in short, you have to provide a filter function that takes a dirent pointer as argument, and returns non-zero if the entry is to be included (in your case, you would check for the name ending in .txt, and possibly the file type in the dirent struct).
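As a rough illustration of the scandir approach, here is a sketch for Linux with glibc. The names txtFilter and listTxtFiles are made up for the example, and the d_type field is an extension that not every filesystem fills in, so entries of unknown type are let through.

```cpp
#include <cstdlib>
#include <cstring>
#include <dirent.h>
#include <string>
#include <vector>

// Filter passed to scandir: a non-zero return value keeps the entry.
static int txtFilter(const struct dirent* entry)
{
    // Drop entries known not to be regular files. DT_UNKNOWN is kept because
    // some filesystems do not report a type; use stat() there if it matters.
    if (entry->d_type != DT_REG && entry->d_type != DT_UNKNOWN)
        return 0;
    std::size_t len = std::strlen(entry->d_name);
    return len >= 4 && std::strcmp(entry->d_name + len - 4, ".txt") == 0;
}

// Hypothetical wrapper: alphabetically sorted names of the .txt entries in dir.
std::vector<std::string> listTxtFiles(const std::string& dir)
{
    std::vector<std::string> names;
    struct dirent** list = nullptr;
    int n = scandir(dir.c_str(), &list, txtFilter, alphasort);
    if (n < 0)
        return names;                 // scandir failed; errno holds the reason
    for (int i = 0; i < n; ++i) {
        names.push_back(list[i]->d_name);
        std::free(list[i]);           // each entry was allocated by scandir
    }
    std::free(list);
    return names;
}
```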

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>
#include <errno.h>
int main(int argc, char *argv[])
{
    DIR *dir;
    struct dirent *entry;
    size_t len;
    if (argc < 2)
    {
        printf("Usage: %s <directory>\n", argv[0]);
        return 1;
    }
    if ((dir = opendir(argv[1])) == NULL)
    {
        perror("opendir");
        return 1;
    }
    while ((entry = readdir(dir)) != NULL)
    {
        /* skip subfolders and other non-regular entries; some filesystems
           report DT_UNKNOWN here, in which case stat() is needed instead */
        if (entry->d_type != DT_REG)
            continue;
        /* compare the last 4 characters, guarding against names shorter than 4 */
        len = strlen(entry->d_name);
        if (len >= 4 && ! strcmp(&entry->d_name[len - 4], ".txt"))
        {
            printf("%s\n", entry->d_name);
        }
    }
    if (closedir(dir) == -1)
    {
        perror("closedir");
        return 1;
    }
    return 0;
}

Related

ifstream not working with dirent.h

I'm testing optimizations for Dijkstra's algorithm, and to make it easier to open files I used "dirent.h" to get all the test files in the running path and then ifstream to open the chosen file.
The selectDirec function reads all the entries in the directory, ignores folders, and puts the file names in a vector called files.
void selectDirec(){
    files.clear();
    DIR *dir;
    struct dirent *ent;
    if ((dir = opendir (".")) != NULL) {
        while ((ent = readdir (dir)) != NULL) {
            if(opendir(ent->d_name) == NULL){
                files.push_back(ent->d_name);
            }
        }
        closedir (dir);
    } else {
        cout<<"directory error"<<endl;
    }
}
After that I use a function called selectFile, which assigns the name of the file the user chooses to a variable called fileName.
void selectFile(){
    selectDirec();
    for(int i = 0 ; i < files.size() ; i++){
        cout<<i+1<<" : "<<files[i]<<endl;
    }
    int choice = 0;
    do{
        cout<<"enter file number"<<endl;
        cin>>choice;
    }while(choice > files.size());
    choice--;
    fileName = files[choice];
    cout<<fileName<<":"<<endl;
}
After that I enter my readGraph function, which opens the file and continues with the graph operations:
void readGraph(){
    ifstream ifile; ifile.open(fileName);
    if(!ifile.is_open()){
        cout<<"no file with the name specified"<<endl;
        eflag = true;
        return;
    }
    ...
    ...
}
initialization:
vector<char *> files;
char * fileName ;
Now I have these 5 files to test, which I got from http://algs4.cs.princeton.edu/44sp/:
tinyEWD.txt contains 8 vertices and 15 edges [140B]
mediumEWD.txt contains 250 vertices and 2,546 edges [40KB]
1000EWG.txt contains 1,000 vertices and 16,866 edges [313KB]
10000EWG.txt contains 10,000 vertices and 123,462 edges [2.4MB]
NYC.txt contains 264,346 vertices and 733,846 edges [12.7MB]
But there's a weird problem with these 3 files:
'mediumEWD.txt', '10000EWG.txt', 'NYC.txt'
When I choose any of them, the code prints "no file with the name specified", the message from the is_open() check in readGraph.
But when I enter their names manually and comment out selectDirec and selectFile, the program opens them successfully.
P.S. I checked the file names, spacing and everything.
P.S.2 I am currently running this code on Ubuntu 14.04 LTS.
Thanks in advance.
if(opendir(ent->d_name) == NULL){
    files.push_back(ent->d_name);
}
What is files? I suspect that you are using a std::vector<const char *>, or something along the same lines.
This won't work. d_name is part of the dirent structure that readdir() returns. That structure may be overwritten by the next readdir() call on the same directory stream, and it is certainly invalid after closedir(), so the stored pointer ends up pointing at memory that has been reused or deallocated.
It looks to me like you then proceed and attempt to use the no-longer-valid pointer as the filename parameter to std::ifstream.
You should use a std::vector<std::string> to store the filenames, and use the c_str() member function to extract a pointer to a C-style string for the open() call.
You can't be using a vector of std::strings here, this must be a vector of raw character pointers. That's because you're assigning one of its values to fileName, whatever it is, and then passing it directly to open() without using c_str(). So it can't be a vector of strings.
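A sketch of that fix, assuming a POSIX system: each name is copied into a std::string while the dirent is still valid, and stat() replaces the opendir() test so no directory handles are leaked. The function names listRegularFiles and canOpen are hypothetical.

```cpp
#include <dirent.h>
#include <fstream>
#include <string>
#include <sys/stat.h>
#include <vector>

// Collect the names of the regular files in dir as std::string copies.
std::vector<std::string> listRegularFiles(const std::string& dir)
{
    std::vector<std::string> files;
    DIR* d = opendir(dir.c_str());
    if (d == nullptr)
        return files;
    while (dirent* ent = readdir(d)) {
        std::string full = dir + "/" + ent->d_name;
        struct stat sb;
        if (stat(full.c_str(), &sb) == 0 && S_ISREG(sb.st_mode))
            files.push_back(ent->d_name);   // copies the characters, not the pointer
    }
    closedir(d);
    return files;
}

// The stored std::string is still valid here, long after closedir().
bool canOpen(const std::string& dir, const std::string& name)
{
    std::ifstream in((dir + "/" + name).c_str());
    return in.is_open();
}
```

With this change fileName would become a std::string as well, and readGraph would call ifile.open(fileName.c_str()).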

scan directory to find and open the file

I want to make a program that lets the user enter a drive or folder (C:\ or f:\folder\) and a file name (test.exe); the program then searches for that file in the given drive or folder and opens it. I managed to write the function that opens the file, but I cannot figure out how to search for the file and pass the location of the file found to that function. Can anyone help me?
You can use boost::filesystem. Here is the documentation: http://www.boost.org/doc/libs/1_55_0/libs/filesystem/doc/index.htm
EDIT: after some time, I realized that my answer was slightly off topic. To check whether a file exists you can use this boost::filesystem function:
bool exists(const path& p);
/EDIT
And a directory iterator example: http://www.boost.org/doc/libs/1_55_0/libs/filesystem/doc/tutorial.html#Directory-iteration
That example uses std::copy, but you need filenames, so you can do something like this:
#include <boost/filesystem.hpp>
#include <string>
namespace bfs = boost::filesystem;
std::string dirPath = "."; // target directory path
// pass dirPath directly: "itt(bfs::path(dirPath))" would be parsed as a function declaration
bfs::directory_iterator itt(dirPath); // iterator for dir entries
for ( ; itt != bfs::directory_iterator(); ++itt)
{
    const bfs::path & curP = itt->path();
    if (bfs::is_regular_file(curP)) // make sure it is a regular file, not a directory or something else
    {
        std::string filename = curP.string(); // here it is: the path of a file in the directory
        // do some stuff
    }
}
If you are not experienced with boost, building it can be complicated.
You can obtain prebuilt boost binaries for your compiler and platform at boost.teeks99.com
Also, if you can't use boost for some reason, there are platform-specific ways of iterating over a directory, but I don't know which platform you are on, so I can't provide an example.
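If a C++17 compiler is available, the standard library's std::filesystem (which grew out of boost::filesystem) can do the search without building boost. The sketch below is one possible shape, not the answer's own code: findFile and findAndOpen are hypothetical names, and the walk is recursive so that subfolders of the given drive or folder are searched too.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: path of the first regular file named fileName under root.
std::optional<fs::path> findFile(const fs::path& root, const std::string& fileName)
{
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    for (; !ec && it != end; it.increment(ec)) {
        if (it->is_regular_file(ec) && it->path().filename().string() == fileName)
            return it->path();
    }
    return std::nullopt;    // not found, or the walk stopped on an error
}

// Usage sketch: locate the file, then hand its full path to the open call.
bool findAndOpen(const fs::path& root, const std::string& fileName)
{
    std::optional<fs::path> found = findFile(root, fileName);
    if (!found)
        return false;
    std::ifstream in(*found, std::ios::binary);
    return in.is_open();
}
```

The name comparison is case-sensitive, which differs from how Windows itself treats file names; a case-insensitive comparison would be needed to mirror that.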
Try this:
char com[50]="ls "; // "ls" assumes a Unix-like shell; on Windows the equivalent command is "dir"
char path[50]="F:\\folder\\";
char file[50]="test.exe";
strcat(com,path);
strcat(com,file);
if (system(com) != 0) // system returns the exit status of the command; ls exits with 0 when the file exists
    cout<<"file not present\n";
else
{
    cout<<"file is present\n";
    strcat(path,file);
    FILE* f = fopen(path,"r");
    if (f != NULL)
    {
        //do your file operations here
        fclose(f);
    }
}

list top 10 files by size in a unix directory

I am trying to read a unix directory (including all subdirectories) using c++ and list the top 10 largest files.
I have read that I can #include <dirent.h> and use struct dirent, but I am having trouble passing the directory name as a variable to opendir/readdir.
Basically it doesn't recognise it and says file/directory not found.
Please can you help me with how I can do this in c++ and print out the top 10 largest files in the directory? Thanks
DIR *dir;
struct dirent *ent;
dir = opendir ("homedir");
if (dir != NULL) {
    while ((ent = readdir (dir)) != NULL) {
        cout << ent->d_name <<endl;
    }
    closedir (dir);
} else {
    cout << "Can't open directory" << endl;
}
You don't really give enough details, but when you are reading
recursively, are you appending the names you read to the path of
the directory they came from? Reading a directory doesn't change the
current directory, so your function should look more or less like:
std::vector<FileInfo>
readDirectoriesRecursively( std::string const& path )
{
    std::vector<FileInfo> results;
    for each filename in path
        if is directory {
            std::vector<FileInfo> sub = readDirectoriesRecursively( path + '/' + filename );
            results.insert( results.end(), sub.begin(), sub.end() );
        } else
            results.push_back( FileInfo( path + '/' + filename ) );
    return results;
}
In the constructor of FileInfo, use stat to obtain the size. Once you have the results, sort by size, and output the first 10.
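The outline above could be fleshed out roughly as follows on a POSIX system. This is a sketch under a few assumptions: FileInfo is reduced to a path plus size, the size is taken with lstat so symbolic links are not followed, and the names collectFiles, largestFiles and printLargest are invented for the example.

```cpp
#include <algorithm>
#include <cstdio>
#include <cstring>
#include <dirent.h>
#include <string>
#include <sys/stat.h>
#include <vector>

struct FileInfo {
    std::string path;
    off_t size;
};

// Append every regular file under path (recursively) to results.
void collectFiles(const std::string& path, std::vector<FileInfo>& results)
{
    DIR* dir = opendir(path.c_str());
    if (dir == nullptr)
        return;                                  // unreadable directory: skip it
    while (dirent* ent = readdir(dir)) {
        if (std::strcmp(ent->d_name, ".") == 0 || std::strcmp(ent->d_name, "..") == 0)
            continue;                            // would recurse forever otherwise
        std::string full = path + '/' + ent->d_name;   // d_name has no directory part
        struct stat sb;
        if (lstat(full.c_str(), &sb) != 0)
            continue;
        if (S_ISDIR(sb.st_mode))
            collectFiles(full, results);
        else if (S_ISREG(sb.st_mode))
            results.push_back(FileInfo{full, sb.st_size});
    }
    closedir(dir);
}

// Return up to n of the largest files under root, biggest first.
std::vector<FileInfo> largestFiles(const std::string& root, std::size_t n)
{
    std::vector<FileInfo> all;
    collectFiles(root, all);
    std::size_t k = std::min(n, all.size());
    std::partial_sort(all.begin(), all.begin() + k, all.end(),
                      [](const FileInfo& a, const FileInfo& b) { return a.size > b.size; });
    all.erase(all.begin() + k, all.end());
    return all;
}

// Print "size  path" for the n largest files.
void printLargest(const std::string& root, std::size_t n)
{
    for (const FileInfo& f : largestFiles(root, n))
        std::printf("%lld  %s\n", static_cast<long long>(f.size), f.path.c_str());
}
```

Calling printLargest(directoryName, 10) prints the ten largest files. std::partial_sort is used because only the first ten positions need to be ordered; a full std::sort would work equally well.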
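You're almost there. You have all the filenames. With these, you can call stat to obtain the file size of each file. Once you sort the sizes in descending order, the first ten entries are the ten largest files. Keep in mind that d_name holds only the entry name, so the call below works as written only when the directory being read is the current directory; otherwise prepend the directory path.
struct stat buf;
stat(ent->d_name, &buf);
See the detailed example in the man page.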

Writing c++ output into xlsx file

I have a function in my c++ code which compares 2 .bmp files bit by bit (a reference file is compared with almost 100 files in another directory, one at a time) and reports the bit errors properly when run in a terminal window. The function that does that is as follows:
void getBitErrors(char *filename, char *dirName, int height, int width){
    DIR *dir;
    struct dirent *ent;
    //char *f = "";
    dir = opendir (dirName);
    if (dir != NULL) {
        /* print all the files and directories within directory */
        while ((ent = readdir (dir)) != NULL) {
            if(strcmp(ent->d_name,".") && strcmp(ent->d_name,".."))
            {
                char f[255]="";
                strcat(f,dirName);
                strcat(f,ent->d_name);
                printf ("reading image file %s\n", f);
                cout<<"Bit Error "<<getBitError(filename,f,height,width)<<endl;
            }
        }
        closedir (dir);
    }
    else {
        perror ("");
    }
}
I wish to have a function in my code that writes the 100 respective comparison values into an xlsx/ods file (as opposed to having the output displayed by std::cout in a terminal window). I have looked into 2 different options.
1) libXL, which is self-explanatory but is a paid library, and I don't really have $199 to pay for it.
2) SimpleXlsx, which is slightly hazy.
I would be awfully obliged if someone could explain to me how I could go about achieving this.
OS : Linux Ubuntu 10.10 Maverick.
I suggest you look at the .csv (Comma-Separated Values) format. Spreadsheet programs open it directly, so you get a spreadsheet-like result with much less complexity.
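Writing CSV needs nothing beyond std::ofstream. The sketch below assumes that each comparison yields a file name and an integer bit-error count (the return type of getBitError is not shown in the question), and the names csvField and writeBitErrorsCsv are hypothetical.

```cpp
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Quote a field when it contains a comma, a quote, or a newline.
std::string csvField(const std::string& s)
{
    if (s.find_first_of(",\"\n") == std::string::npos)
        return s;
    std::string out = "\"";
    for (char c : s) {
        if (c == '"')
            out += '"';          // embedded quotes are doubled
        out += c;
    }
    out += '"';
    return out;
}

// Write one "file,bit_errors" row per comparison; false if the file cannot be written.
bool writeBitErrorsCsv(const std::string& csvPath,
                       const std::vector<std::pair<std::string, long>>& rows)
{
    std::ofstream out(csvPath.c_str());
    if (!out)
        return false;
    out << "file,bit_errors\n";  // header row
    for (const auto& row : rows)
        out << csvField(row.first) << ',' << row.second << '\n';
    out.flush();
    return out.good();
}
```

In getBitErrors, the cout line would be replaced by pushing (f, getBitError(...)) into a vector, with a single call to writeBitErrorsCsv after the loop.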

Example of using FindFirstFileEx() with specific search criteria

I asked about finding in subdirs with criteria. First answer was use FindFirstFileEx(). It seems the function is no good for this purpose or I'm using it wrong.
So can someone explain how I would go about searching in a folder, and all its subfolders, for files that match (to give some sample criteria) *.doc;*.txt;*.wri; and are newer than 2009-01-01?
Please give a specific code example for those criteria so I know how to use it.
If it isn't possible, is there an alternative for doing this not-at-all-obscure task??? I am becoming quite baffled that so far there aren't well known/obvious tools/ways to do this.
From MSDN:
If you refer to the code fragment in that page:
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
void _tmain(int argc, TCHAR *argv[])
{
    WIN32_FIND_DATA FindFileData;
    HANDLE hFind;
    if( argc != 2 )
    {
        _tprintf(TEXT("Usage: %s [target_file]\n"), argv[0]);
        return;
    }
    _tprintf (TEXT("Target file is %s\n"), argv[1]);
    hFind = FindFirstFileEx(argv[1], FindExInfoStandard, &FindFileData,
                            FindExSearchNameMatch, NULL, 0);
    if (hFind == INVALID_HANDLE_VALUE)
    {
        printf ("FindFirstFileEx failed (%d)\n", GetLastError());
        return;
    }
    else
    {
        _tprintf (TEXT("The first file found is %s\n"),
                  FindFileData.cFileName);
        FindClose(hFind);
    }
}
You'll see that you can call FindFirstFileEx, where argv[1] is a string (LPCTSTR) pattern to look for, and &FindFileData is a data structure that receives the file info of the match. hFind is the handle you use on subsequent calls to FindNextFile. I think you can also add more search parameters by using the fourth and sixth parameters of FindFirstFileEx.
Good luck!
EDIT: BTW, I think you can check a file's or directory's attributes by using GetFileAttributes(). Just pass the filename found in FindFileData (the filename can refer to a file or a directory, I think).
EDIT: MrVimes, here's what you could do (in pseudocode):
find the first file (match with *)
check the file find data to see whether it is "." or ".." (these are the current and parent directory entries, so skip them)
if that check passed, check whether the file find data has the attributes you are looking for (i.e. check the filename, the file attributes, even the file creation time, and what not)
if that check passed too, do whatever you need to do with the file
in either case, call FindNextFile to continue or stop, up to you
Something like that.
I think you use FindFirstFile to find all files and ignore the ones whose WIN32_FIND_DATA values don't match your search criteria.
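The same "enumerate everything, then filter" idea can be shown in portable form. The sketch below swaps the Win32 calls for C++17 std::filesystem, so it illustrates the approach rather than the FindFirstFile API. The helper names hasWantedExtension and findMatching are hypothetical, and it compares the last-write time because std::filesystem does not expose the creation time. Building the cutoff from a calendar date such as 2009-01-01 is left to the caller, since C++17 has no portable conversion to file_time_type.

```cpp
#include <filesystem>
#include <set>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// True if the extension of p (including the dot) is one of exts; case-sensitive.
bool hasWantedExtension(const fs::path& p, const std::set<std::string>& exts)
{
    return exts.count(p.extension().string()) != 0;
}

// Walk root and all subfolders; keep regular files that have a wanted
// extension and were last written after cutoff.
std::vector<fs::path> findMatching(const fs::path& root,
                                   const std::set<std::string>& exts,
                                   fs::file_time_type cutoff)
{
    std::vector<fs::path> hits;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    for (; !ec && it != end; it.increment(ec)) {
        if (!it->is_regular_file(ec))
            continue;                       // directories and special files are ignored
        if (!hasWantedExtension(it->path(), exts))
            continue;
        fs::file_time_type written = it->last_write_time(ec);
        if (!ec && written > cutoff)
            hits.push_back(it->path());
    }
    return hits;
}
```

For the sample criteria the extension set would be {".doc", ".txt", ".wri"}.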
Well you could use it to search for *.doc, *.txt and *.wri by passing those values as the name to search for:
FindFirstFileEx("*.doc", FindExInfoStandard, &fileData, FindExSearchNameMatch, NULL, 0);
To search by date is a little more complicated, but not overly so:
WIN32_FIND_DATA fileData;
SYSTEMTIME searchDate = {0}; // zero the fields that are not set below
FILETIME compareTime;
HANDLE searchHandle;
searchDate.wYear = 2009;
searchDate.wMonth = 1;
searchDate.wDay = 1;
SystemTimeToFileTime(&searchDate, &compareTime);
searchHandle = FindFirstFileEx(TEXT("*"), FindExInfoStandard, &fileData, FindExSearchNameMatch, NULL, 0);
if (searchHandle != INVALID_HANDLE_VALUE)
{
    do
    {
        // CompareFileTime returns 1 when the first time is later than the second
        if (CompareFileTime(&fileData.ftCreationTime, &compareTime) > 0)
            _tprintf(TEXT("%s matches date criteria\n"), fileData.cFileName);
    } while (FindNextFile(searchHandle, &fileData)); // returns FALSE when no more files are left
    FindClose(searchHandle);
}
You need to do two searches. The first is just to find the subdirs, and you do that without any file spec. The second search for the files uses the file spec.