localtime alternative that won't overwrite the supplied struct - c++

Essentially, what I'm trying to do is to check the last access time of a file and compare it with a string. Here's the relevant block:
struct stat file;
char timeStr[ 100 ];
stat(nodes.at(0), &file);
strftime(timeStr, 100, "%H:%M:%S-%m/%d/%y", localtime(&file.st_atime)); /* problem */
nodes is a vector of file paths; I'm not sure if it's relevant but I'll include the code that I'm using to set nodes:
vector<char*> nodes;
DIR *dir;
struct dirent *cur;
if((dir = opendir(searchPath.c_str())) == NULL) {
    cout << "Error opening search path. Are you sure '"
         << searchPath.c_str() << "' is a valid search path?" << endl;
    return 0;
}
while((cur = readdir(dir)) != NULL) {
    if(string(cur->d_name) == "." || string(cur->d_name) == "..") continue;
    nodes.push_back(cur->d_name);
}
closedir(dir);
Where searchPath is a user-inputted string.
The problem: when the 'problem' line is run, from there on nodes is a vector of garbage. I'm wondering if I can accomplish this task without turning nodes into garbage.
Since this is homework, and as you can probably tell I'm not used to C++, a solid push in the right direction will be given the 'accept'.
Thank you.

It has nothing to do with your strftime call but with the fact that (from here):
The pointer returned by readdir() points to data which may be overwritten by another call to readdir() on the same directory stream.
Since you're simply pushing a character pointer that points to data that may be overwritten by subsequent calls to readdir, you may well end up with garbage.
You can probably fix it by using a copy of the C string with something like:
nodes.push_back (strdup (cur->d_name)); // plus error handling if need be.
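And, if your implementation doesn't have strdup (it comes from POSIX and is not part of the C++ standard), you can use mine (found here). Either way, each copy is allocated with malloc, so free() it when you're done with it.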

nodes.push_back(cur->d_name);
You're storing pointers in the vector that immediately become invalid (cur is valid until the next readdir or closedir call). The best fix is to code what you want -- make nodes a vector of strings. The easiest fix:
nodes.push_back(strdup(cur->d_name));
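For the "vector of strings" route, here is a minimal sketch. The helper name list_names is made up for illustration; the point is only that each name is copied into a std::string before the next readdir() call, so nothing in the vector points into the dirent buffer.

```cpp
#include <cassert>
#include <dirent.h>
#include <string>
#include <vector>

// Hypothetical helper: returns the entry names in `path`, skipping "." and "..".
std::vector<std::string> list_names(const std::string& path) {
    std::vector<std::string> names;
    DIR* dir = opendir(path.c_str());
    if (dir == nullptr) {
        return names;  // caller sees an empty list on failure
    }
    struct dirent* cur;
    while ((cur = readdir(dir)) != nullptr) {
        std::string name = cur->d_name;  // deep copy of the C string
        if (name == "." || name == "..") continue;
        names.push_back(name);
    }
    closedir(dir);
    return names;
}
```

With this, the stat call would take something like (searchPath + "/" + names.at(0)).c_str(), assuming searchPath has no trailing slash, since readdir only gives you the bare entry name.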

Related

Recursive listing files in C++ doesn't enter all subdirectories

!!!Solved!!!
Thank you guys for your help, it's all working now. I made the changes to my code suggested by @RSahu and got it to work.
Thanks for all your input; I had been really stuck on this.
To @Basile: I will definitely check that out, but I'm not going to use it for this particular piece of code because it looks way too complicated :) Thanks for the suggestion, though.
Original question
I'm trying to write C++ code that lists all files in a given directory and its subdirectories.
Quick explanation
The idea is that the function list_dirs(_dir, _files, _current_dir) starts in the top directory and puts files into the vector _files; when it finds a directory, it calls itself on that directory. _current_dir is there to be prepended to the file name when in a subdirectory, because I need to know the path structure (it's supposed to generate sitemap.xml).
list_dirs calls list_dir, which simply returns all entries in the current directory without distinguishing between files and directories.
My problem
What the code does now is list all files in the original directory and then all files in one subdirectory, skipping all other subdirectories. It lists the subdirectories themselves but not the files in them.
Even more cryptically, it lists files only in this one specific subdirectory and no other. I tried running it in multiple locations, but it never went into any other directory.
Thanks in advance, and please note that I am a beginner at C++, so don't be harsh ;)
LIST_DIR
int list_dir(const std::string& dir, std::vector<std::string>& files){
    DIR *dp;
    struct dirent *dirp;
    unsigned fileCount = 0;
    if ((dp = opendir(dir.c_str())) == NULL){
        std::cout << "Error opening dir." << std::endl;
    }
    while ((dirp = readdir(dp)) != NULL){
        files.push_back(std::string (dirp->d_name));
        fileCount++;
    }
    closedir(dp);
    return fileCount;
}
and LIST_DIRS
int list_dirs (const std::string& _dir, std::vector<std::string>& _files, std::string _current_dir){
    std::vector<std::string> __files_or_dirs;
    list_dir(_dir, __files_or_dirs);
    std::vector<std::string>::iterator it = __files_or_dirs.begin();
    struct stat sb;
    while (it != __files_or_dirs.end()){
        if (lstat((&*it)->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
            /* how to do this better? */
            if (*it == "." || *it == ".."){
                __files_or_dirs.erase(it);
                continue;
            }
            /* here it should go into sub-directory */
            list_dirs(_dir + *it, _files, _current_dir + *it);
            __files_or_dirs.erase(it);
        } else {
            if (_current_dir.empty()){
                _files.push_back(*it);
            } else {
                _files.push_back(_current_dir + "/" + *it);
            }
            ++it;
        }
    }
}
The main problem is in the line:
if (lstat((&*it)->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
You are using the name of a directory entry in the call to lstat. When the function is dealing with a sub-directory, the entry name does not represent a valid path. You need to use something like:
std::string entry = *it;
std::string full_path = _dir + "/" + entry;
if (lstat(full_path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
Suggestions for improvement
Update list_dir so that it doesn't include "." or ".." in the output. It makes sense to me to exclude those files to start with.
int list_dir(const std::string& dir, std::vector<std::string>& files){
    DIR *dp;
    struct dirent *dirp;
    unsigned fileCount = 0;
    if ((dp = opendir(dir.c_str())) == NULL){
        std::cout << "Error opening dir." << std::endl;
        return 0; // nothing to read; passing a null DIR* to readdir is invalid
    }
    while ((dirp = readdir(dp)) != NULL){
        std::string entry = dirp->d_name;
        if ( entry == "." or entry == ".." )
        {
            continue;
        }
        files.push_back(entry);
        fileCount++;
    }
    closedir(dp);
    return fileCount;
}
In list_dirs, there is no need to erase items from __files_or_dirs. The code can be simplified by using a for loop and removing the erase calls.
It's not clear to me what the purpose of _current_dir is. Perhaps it can be removed.
Here's an updated version of the function. _current_dir is used only to construct the value of the argument in the recursive call.
int list_dirs (const std::string& _dir,
               std::vector<std::string>& _files, std::string _current_dir){
    std::vector<std::string> __files_or_dirs;
    list_dir(_dir, __files_or_dirs);
    std::vector<std::string>::iterator it = __files_or_dirs.begin();
    struct stat sb;
    for (; it != __files_or_dirs.end() ; ++it){
        std::string entry = *it;
        std::string full_path = _dir + "/" + entry;
        if (lstat(full_path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
            /* how to do this better? */
            /* here it should go into sub-directory */
            list_dirs(full_path, _files, _current_dir + "/" + entry);
        } else {
            _files.push_back(full_path);
        }
    }
    /* the posted function had no return statement; returning the number of
       files collected so far is one reasonable choice for an int function */
    return static_cast<int>(_files.size());
}
For this line:
if (lstat((&*it)->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
Note that readdir and consequently list_dir only return the file name, not the full file path. So at this point (&*it)->c_str() only has a file name (e.g. "input.txt"), not the full path, so when you call lstat on a file in a subdirectory, the system can't find it.
To fix this, you will need to add in the file path before calling lstat. Something like:
string fullFileName;
if (dir.empty()){
    fullFileName = *it;
} else {
    fullFileName = dir + "/" + *it;
}
if (lstat(fullFileName.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode)){
You may have to use _current_dir instead of dir, depending on what they are actually for (I couldn't follow your explanation).
I am not sure about all of the problems in your code, but I can tell you that this line, and the other one like it, are going to cause you problems:
__files_or_dirs.erase(it);
When you call erase, you invalidate iterators and references at or after the point of the erase, including the end() iterator (see this erase reference). You call erase without storing the returned iterator and then use the old iterator again, which is undefined behavior. At a minimum, change the line as follows so that you capture the returned iterator, which points to the element just after the erased one (or equals end() if the erased element was the last):
it = __files_or_dirs.erase(it);
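To show the whole loop shape rather than just the one line, here is a small sketch. The function name drop_dot_entries is invented for the example; the idiom is that the iterator is either reassigned from erase() or incremented, never both.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Removes "." and ".." while iterating, always continuing from the
// iterator that erase() returns.
void drop_dot_entries(std::vector<std::string>& names) {
    for (auto it = names.begin(); it != names.end(); ) {
        if (*it == "." || *it == "..") {
            it = names.erase(it);  // now refers to the element after the erased one
        } else {
            ++it;                  // only advance when nothing was erased
        }
    }
}
```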
It also appears from the code you posted that you have a redundancy between _dir and _current_dir. You do not modify either of them. You pass them in as the same value and they stay the same value throughout the function execution. Unless this is simplified code and you are doing something else, I would recommend you remove the _current_dir one and just stick with _dir. You can replace the line in the while loop with _dir where you are building the file name and you will have simplified your code which is always a good thing.
A simpler way on Linux is to use the nftw(3) function. It recursively scans the file tree and calls a handler function that you supply for each entry.
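A rough sketch of what that could look like follows. The names g_found, on_entry and collect_files are made up for the example, and it assumes a Linux/glibc toolchain where g++ exposes nftw() by default. Since nftw() has no user-data parameter, the handler writes into a file-scope vector, which is a simplification and not thread-safe.

```cpp
#include <cassert>
#include <ftw.h>
#include <string>
#include <sys/stat.h>
#include <vector>

// Assumption of this sketch: results are collected in a file-scope vector
// because the nftw() handler cannot receive extra arguments.
static std::vector<std::string> g_found;

// Handler called once per visited entry.
static int on_entry(const char* fpath, const struct stat* /*sb*/,
                    int typeflag, struct FTW* /*ftwbuf*/) {
    if (typeflag == FTW_F) {      // a file that is not a directory
        g_found.push_back(fpath); // fpath already includes the directory part
    }
    return 0;                     // 0 tells nftw() to keep walking
}

// Hypothetical wrapper: returns every non-directory file below `root`.
std::vector<std::string> collect_files(const std::string& root) {
    g_found.clear();
    // 16 = maximum number of directories nftw() keeps open at once;
    // FTW_PHYS = do not follow symbolic links.
    if (nftw(root.c_str(), on_entry, 16, FTW_PHYS) != 0) {
        g_found.clear();          // walk failed, e.g. root does not exist
    }
    return g_found;
}
```

The paths handed to the handler are already prefixed with the starting directory, so there is no need to build them by hand as in the recursive version.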

Using dirent->d_name together with string fails

I'm writing a C++ application that uses the dirent.h library to read files from a directory. At one point I want to distinguish between files and directories. To achieve that, I added the following piece of code:
entry = readdir(used_directory); //read next object from directory stream
DIR* directory_test = opendir((path + entry->d_name).c_str()); //try to open object as directory
if ( directory_test != nullptr) { //object is directory
    if (entry != nullptr) { //reading from directory succeeded
        dirs.push_back(entry->d_name); //add filename to file list
        ++dircounter;
    }
}
else { //object is file
path is of type string and entry is of type dirent *.
With this code the program causes a memory access error; without it, it doesn't.
I figured out that the error is caused by
(path + entry->d_name)
But it is not the implicit conversion to string in that expression, because other tests such as cout << entry->d_name; or path += entry->d_name fail with the same error. So apparently something goes wrong when using entry->d_name as a char *, although it is defined as one (in the documentation of dirent.h).
Why is this failure occurring?
EDIT:
Later in the program I add entry->d_name to a vector<string>, and that doesn't cause any problems.
The failure was accessing entry before checking whether it is equal to nullptr.
Because my loop iterating through the directory stops when entry is equal to nullptr, the last iteration causes the error.
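For anyone hitting the same thing, here is a sketch of the loop with the null check moved in front of any use of entry. The helper names is_directory and list_subdirs are invented for illustration, and it swaps the opendir() probe for stat() with S_ISDIR, which is a different technique from the original code but avoids leaving a DIR handle open for every directory found.

```cpp
#include <cassert>
#include <dirent.h>
#include <string>
#include <sys/stat.h>
#include <vector>

// True if `path` names a directory (stat() follows symbolic links).
bool is_directory(const std::string& path) {
    struct stat sb;
    return stat(path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode);
}

// Hypothetical helper: names of the sub-directories directly inside `path`.
std::vector<std::string> list_subdirs(const std::string& path) {
    std::vector<std::string> dirs;
    DIR* used_directory = opendir(path.c_str());
    if (used_directory == nullptr) return dirs;

    struct dirent* entry;
    // The null check happens here, before entry->d_name is ever touched.
    while ((entry = readdir(used_directory)) != nullptr) {
        std::string name = entry->d_name;
        if (name == "." || name == "..") continue;
        if (is_directory(path + "/" + name)) {
            dirs.push_back(name);
        }
    }
    closedir(used_directory);
    return dirs;
}
```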

ifstream not working with dirent.h

I'm testing optimizations for Dijkstra's algorithm, and to make it easier to open files I used "dirent.h" to get all the test files in the current directory and then ifstream to open the chosen file.
The selectDirec function reads all the entries in the directory, ignores folders, and puts the file names in a vector called files.
void selectDirec(){
    files.clear();
    DIR *dir;
    struct dirent *ent;
    if ((dir = opendir (".")) != NULL) {
        while ((ent = readdir (dir)) != NULL) {
            if(opendir(ent->d_name) == NULL){
                files.push_back(ent->d_name);
            }
        }
        closedir (dir);
    } else {
        cout<<"directory error"<<endl;
    }
}
After that I use a function called selectFile, which assigns the name of the file the user chooses to a variable called fileName.
void selectFile(){
    selectDirec();
    for(int i = 0 ; i < files.size() ; i++){
        cout<<i+1<<" : "<<files[i]<<endl;
    }
    int choice = 0;
    do{
        cout<<"enter file number"<<endl;
        cin>>choice;
    }while(choice > files.size());
    choice--;
    fileName = files[choice];
    cout<<fileName<<":"<<endl;
}
After that I enter my readGraph function, which opens the file and continues with the graph operations.
void readGraph(){
    ifstream ifile; ifile.open(fileName);
    if(!ifile.is_open()){
        cout<<"no file with the name specified"<<endl;
        eflag = true;
        return;
    }
    ...
    ...
}
initialization:
vector<char *> files;
char * fileName ;
Now I have these 5 files to test with, which I got from http://algs4.cs.princeton.edu/44sp/:
tinyEWD.txt contains 8 vertices and 15 edges [140B]
mediumEWD.txt contains 250 vertices and 2,546 edges [40KB]
1000EWG.txt contains 1,000 vertices and 16,866 edges [313KB]
10000EWG.txt contains 10,000 vertices and 123,462 edges [2.4MB]
NYC.txt contains 264,346 vertices and 733,846 edges [12.7MB]
But there's a weird problem with these 3 files:
'mediumEWD' , '10000EWD.txt' , 'NYC.txt'
When I choose any of them, the code shows "no file with the name specified", the message printed by the is_open() check in readGraph.
But when I enter their names manually and comment out selectDirec and selectFile, the program opens them successfully.
P.S. I checked the file names, spacing, and everything.
P.S.2 I'm currently running this code on Ubuntu 14.04 LTS.
Thanks in advance.
if(opendir(ent->d_name) == NULL){
    files.push_back(ent->d_name);
}
What is files? I suspect that you are using a std::vector<const char *>, or something along the same lines.
This won't work. d_name is part of the dirent structure that readdir() returns. After the next readdir() call, and certainly after closedir(), that pointer is no longer valid and may point to overwritten or deallocated memory.
Looks to me like you then proceed and attempt to use the no-longer valid pointer as the filename parameter to std::ifstream.
You should use a std::vector<std::string> to store the filenames, and use the c_str() member function to extract a pointer to a C-style string, for the open() call.
You can't be using a vector of std::strings here, this must be a vector of raw character pointers. That's because you're assigning one of its values to fileName, whatever it is, and then passing it directly to open() without using c_str(). So it can't be a vector of strings.

Pull out data from a file and store it in strings in C++

I have a file which contains records of students in the following format.
Umar|Ejaz|12345|umar#umar.com
Majid|Hussain|12345|majid#majid.com
Ali|Akbar|12345|ali#geeks-inn.com
Mahtab|Maqsood|12345|mahtab#myself.com
Juanid|Asghar|12345|junaid#junaid.com
The data has been stored according to the following format:
firstName|lastName|contactNumber|email
The total number of lines (records) cannot exceed 100. In my program, I've defined the following string variables.
#define MAX_SIZE 100
// other code
string firstName[MAX_SIZE];
string lastName[MAX_SIZE];
string contactNumber[MAX_SIZE];
string email[MAX_SIZE];
Now, I want to pull data from the file, and using the delimiter '|', I want to put data in the corresponding strings. I'm using the following strategy to put back data into string variables.
ifstream readFromFile;
readFromFile.open("output.txt");
// other code
int x = 0;
string temp;
while(getline(readFromFile, temp)) {
    int charPosition = 0;
    while(temp[charPosition] != '|') {
        firstName[x] += temp[charPosition];
        charPosition++;
    }
    while(temp[charPosition] != '|') {
        lastName[x] += temp[charPosition];
        charPosition++;
    }
    while(temp[charPosition] != '|') {
        contactNumber[x] += temp[charPosition];
        charPosition++;
    }
    while(temp[charPosition] != endl) {
        email[x] += temp[charPosition];
        charPosition++;
    }
    x++;
}
Is it necessary to append a null character '\0' at the end of each string? If I don't, will it cause problems when I actually use those string variables in my program? I'm new to C++, and this is the solution I've come up with. If anybody has a better technique, it is surely welcome.
Edit: Also, I can't compare a char(acter) with endl. How can I do that?
Edit: The code that I've written isn't working. It gives me the following error.
Segmentation fault (core dumped)
Note: I can only use a .txt file. A .csv file can't be used.
There are many techniques to do this. I suggest searching Stack Overflow for "[C++] read file" to see some more methods.
Find and Substring
You could use the std::string::find method to find the delimiter and then use std::string::substr to return a substring between the position and the delimiter.
std::string::size_type position = 0;
position = temp.find('|');
if (position != std::string::npos)
{
    firstName[x] = temp.substr(0, position);
}
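Extending that idea to all four fields, a minimal sketch of a find/substr splitter might look like this. The function name split is an assumption for the example, not something from the question's code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits `line` on `delim`; an empty field between two delimiters is kept.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = line.find(delim, start)) != std::string::npos) {
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;  // skip past the delimiter
    }
    fields.push_back(line.substr(start));  // last field has no trailing delimiter
    return fields;
}
```

After splitting a record you would check that you got exactly four fields before assigning them to firstName[x], lastName[x], and so on.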
If you don't terminate a C-style string with a null character, there is no way to determine where the string ends, so such strings do need the terminator. std::string objects track their own length, so you never append '\0' to them yourself.
I would personally read the data into std::string objects:
std::string first, last, etc;
while (std::getline(readFromFile, first, '|')
    && std::getline(readFromFile, last, '|')
    && std::getline(readFromFile, etc)) {
    // do something with the input
}
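Spelled out for the four-field records in the question, the same std::getline approach could be sketched as below. The Student struct and the read_students name are invented here; the sketch reads one line at a time and then splits that line through a std::istringstream, which is a slight variation on the loop above that keeps a short line from swallowing part of the next record.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for one line of the file.
struct Student {
    std::string firstName, lastName, contactNumber, email;
};

// Reads records of the form firstName|lastName|contactNumber|email,
// one per line; lines with fewer than four fields are skipped.
std::vector<Student> read_students(std::istream& in) {
    std::vector<Student> students;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Student s;
        if (std::getline(fields, s.firstName, '|') &&
            std::getline(fields, s.lastName, '|') &&
            std::getline(fields, s.contactNumber, '|') &&
            std::getline(fields, s.email)) {
            students.push_back(s);
        }
    }
    return students;
}
```

You would pass the opened ifstream (readFromFile in the question) as the argument.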
std::endl is a manipulator implemented as a function template. You can't compare a char with that. There is also hardly ever a reason to use std::endl because it flushes the stream after adding a newline which makes writing really slow. You probably meant to compare to a newline character, i.e., to '\n'. However, since you read the string with std::getline() the line break character will already be removed! You need to make sure you don't access more than temp.size() characters otherwise.
Your record also contains arrays of strings rather than arrays of characters, and you append individual chars to them. You either wanted to use char something[SIZE] or you'd store strings!

How to know if the next character is EOF in C++

I need to know if the next char in an ifstream is the end of file. I'm trying to do this with .peek():
if (file.peek() == -1)
and
if (file.peek() == file.eof())
But neither works. Is there a way to do this?
Edit: What I'm trying to do is to add a letter to the end of each word in a file. In order to do so I ask if the next char is a punctuation mark, but in this way the last word is left without an extra letter. I'm working just with char, not string.
istream::peek() returns the constant EOF (which is not guaranteed to be equal to -1) when it detects end-of-file or error. To check robustly for end-of-file, do this:
int c = file.peek();
if (c == EOF) {
    if (file.eof()) {
        // end of file
    } else {
        // error
    }
} else {
    // do something with 'c'
}
You should know that the underlying OS primitive, read(2), only signals EOF when you try to read past the end of the file. Therefore, file.eof() will not be true when you have merely read up to the last character in the file. In other words, file.eof() being false does not mean the next read operation will succeed.
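Wrapped up as a small helper, the check could be sketched like this. The name next_is_end is made up, and the sketch compares against std::char_traits<char>::eof(), the value peek() is specified to return for a char stream, instead of spelling out EOF or -1.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// True if no further character can be read from `in`.
// peek() returns Traits::eof() at end-of-file or on error, and sets
// eofbit when it hits the end, so in.eof() can be consulted afterwards.
bool next_is_end(std::istream& in) {
    return in.peek() == std::char_traits<char>::eof();
}
```

Note that this only tells you the next read would fail; to tell end-of-file apart from an error you still need the eof() check shown above.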
This should work:
if (file.peek(), file.eof())
But why not just check for errors after making an attempt to read useful data?
file.eof() returns a flag value. It is set to TRUE if you can no longer read from file. EOF is not an actual character, it's a marker for the OS. So when you're there - file.eof() should be true.
So, instead of if (file.peek() == file.eof()) you should have if (true == file.eof()) after a read (or peek) to check if you reached the end of file (which is what you're trying to do, if I understand correctly).
For a stream connected to the keyboard the eof condition is that I intend to type Ctrl+D/Ctrl+Z during the next input.
peek() is totally unable to see that. :-)
Usually, to check for end of file, I use:
if(cin.fail())
{
    // Do whatever here
}
Another way to implement that would be:
while(!cin.fail())
{
    // Do whatever here
}
Additional information would be helpful so we know what you want to do.
There is no way of telling if the next character is the end of the file, and trying to do so is one of the commonest errors that new C and C++ programmers make, because there is no end-of-file character in most operating systems. What you can tell is that reading past the current position in a stream will read past the end of file, but this is in general pretty useless information. You should instead test all read operations for success or failure, and act on that status.
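Applied to the task in the question's edit (adding a letter to the end of each word), that advice could look roughly like the sketch below. The function name append_to_words is invented, and treating a "word" as a run of alphabetic characters is an assumption. The read itself is the loop condition, and the last word is handled after the loop once the read has failed, so no look-ahead is needed.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>

// Copies `in` to `out`, writing `suffix` after every run of letters.
void append_to_words(std::istream& in, std::ostream& out, char suffix) {
    bool in_word = false;
    char c;
    while (in.get(c)) {  // the read itself is the loop condition
        bool is_letter = std::isalpha(static_cast<unsigned char>(c)) != 0;
        if (in_word && !is_letter) {
            out.put(suffix);  // the word ended just before this character
        }
        out.put(c);
        in_word = is_letter;
    }
    if (in_word) {
        out.put(suffix);  // input ended in the middle of a word
    }
}
```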
You didn't show any of the code you are working with, so there is some guessing on my part. You don't usually need low-level facilities (like peek()) when working with streams. What you are probably interested in is istream_iterator. Here is an example:
cout << "enter value";
for(istream_iterator<double> it(cin), end;
    it != end; ++it)
{
    cout << "\nyou entered value " << *it;
    cout << "\nTry again ...";
}
You can also use istreambuf_iterator to work on buffer directly:
cout << "Please, enter your name: ";
string name;
for(istreambuf_iterator<char> it(cin.rdbuf()), end;
    it != end && *it != '\n'; ++it)
{
    name += *it;
}
cout << "\nyour name is " << name;
Just use this code on Mac OS X:
if (true == file.eof())
It works for me on Mac OS X!