Stop recursion for recursive_directory_iterator in C++

So let's say I have this code that looks for my documents and prints the path to each of them.
#include <iostream>
#include <experimental/filesystem>
#include <string>
#include <stdio.h>
#include <Shlwapi.h>
using namespace std;
string extensions[3] = { ".doc", ".docx", ".txt" }; // things to look for
string ignoredirs[2] = { "Windows", "Program Files" }; // and other ones that i was too
// lazy to write in there
using namespace std::experimental::filesystem;
path yee;
void main()
{
for (recursive_directory_iterator i("c:\\"), end; i != end; ++i)
if (!is_directory(i->path()) && i->path().has_extension()) // checks if the file
// even has an extension
{
for (int x = 0; x <= 3; ++x)
if (i->path().extension().string() == extensions[x])// checks if the
// extension is equal
// to current
// extension in loop
cout << "found document at :" ;
cout << i->path().string() << endl; // print out the path
}
}
I would like to avoid iterating into the directories listed in ignoredirs[], because it takes ages to find my docs on the filesystem.
I saw this code on cppreference.com.
Could someone explain the code to me and/or how to use it in my use case?
Or could you suggest a better solution than that?
P.S. I don't want to use Boost in this program; I just want to see how it works in experimental::filesystem.

Proposed solution:
#include <iostream>
#include <experimental/filesystem>
#include <string>
#include <stdio.h>
#include <Shlwapi.h>
#include <set>
using namespace std;
set<string> extensions = { ".doc", ".docx", ".txt" }; //things to look for
set<string> ignoredirs = { "Windows", "Program Files" }; //and other ones that i was too lazy to write in there
using namespace std::experimental::filesystem;
int main()
{
for (recursive_directory_iterator i("c:\\"), end; i != end; ++i)
{
if (!is_directory(i->path()) && i->path().has_extension()) // checks if the file even has an extension
{
if (extensions.find(i->path().extension().string()) != extensions.end())
{
cout << "found document at: ";
cout << i->path().string() << endl; // print out the path
}
}
if (ignoredirs.find(i->path().filename().string()) != ignoredirs.end())
i.disable_recursion_pending(); // do not descend into this directory
}
}
Explanation:
i->path().filename() returns the last component of the path, which for a directory is the directory name. When that name is in the ignoredirs set, i.disable_recursion_pending() is called, and the recursive_directory_iterator then does not descend into that directory on the next increment.
set<string> replaces the inner for loop, which needed the size of the array and was therefore error prone. It also simplifies the code and can make the lookup faster.
Notice:
On Windows, directory names should be compared case-insensitively. Also, 64-bit Windows has both "Program Files" and "Program Files (x86)", so both need to be in the ignore list (or the string comparison needs to be improved).

Related

How to read and output how many Bytes a folder has in it using C++ 17 filesystem on windows

I have a C++ program that takes a directory, like "D:\P4Test", and attempts to tell me how many bytes are in each subfolder and file within that directory. The code I currently have looks like this:
#define _CRT_SECURE_NO_WARNINGS
#include <iostream>
#include <fstream>
#include <string>
#include <sys/stat.h>
#include <fcntl.h>
#include <cstdint>
#include <sstream>
#include <filesystem>
using namespace std;
namespace fs = std::filesystem;
using namespace std::filesystem;
int main()
{
string path = "D:\\P4Test";
for (const auto& entry : directory_iterator(path)) {
cout << entry.path() << std::endl;
uintmax_t fsize = file_size(entry.path());
cout << " ||| " << fsize << endl;
}
}
Yes, it has a lot of unnecessary includes and such, but that's for future things.
When I run this code, I don't get what I want. Here's a picture of what's in that directory, and the output.
As you can see, the output looks good, but it does not give me the bytes for what's in the folders called "Two" & "Three".
Both folders have a text file in them that's 5 bytes, but they report back 0.
Can anyone help me figure out why, and show me how to make the folders show their bytes, or direct me to where I can figure this out?
It looks like you are trying to do a recursive file size check, but you do not actually recurse into the directories. One way to do this is to start with a function that goes through a directory and prints the size of every entry:
void folder_size(std::filesystem::path path) {
for (const auto& entry : directory_iterator(path)) {
cout << entry.path() << std::endl;
uintmax_t fsize = file_size(entry.path());
cout << " ||| " << fsize << endl;
}
}
Now we simply add a special case for entries that are directories. We can detect those with std::filesystem::directory_entry::is_directory:
if (entry.is_directory()) {
// Handle the directory
}
So how do we handle the directory? We already have a function that takes a directory path and goes through it, so let's call that:
if (std::filesystem::is_directory(entry.path())) {
folder_size(entry.path());
}
Putting it all together:
void folder_size(std::filesystem::path path) {
for (const auto& entry : directory_iterator(path)) {
cout << entry.path() << std::endl;
uintmax_t fsize = file_size(entry.path());
cout << " ||| " << fsize << endl;
if (std::filesystem::is_directory(entry.path())) {
folder_size(entry.path());
}
}
}
NOTE: All of the above is example code. No compilation check has been conducted.

Printing file directory into a data structure

This code is supposed to read and print every single file in the directory, which it does. Now I want to be able to put those files in a data structure. I chose a list, but each entry is cut off before it gets to the actual file name. How can I fix this?
"C:/Users/deonh/Downloads/intranets/intranet1\\page99.html" - What I want in the list
C:/Users/deonh/Downloads/intranets/intranet1 - What I get.
#include <iostream>
#include<vector>
#include<list>
#include<map>
#include<queue>
#include<fstream>
#include<string>
#include <filesystem>
namespace fs = std::filesystem;
using namespace std;
string path = "C:/Users/deonh/Downloads/intranets/intranet1"; //This gets every single file in the directory
string path5 = "C:/Users/deonh/Downloads/intranets/intranet5";
string path7 = "C:/Users/deonh/Downloads/intranets/intranet7";
int main()
{
list<string>pages;
map<string, int> page;
//Here I am printing the files to make sure the above code works.
for (const auto& entry : fs::directory_iterator(path)) {
std::cout << entry.path() << std::endl;
pages.push_back(path);
}
for (list<string> ::iterator it = pages.begin(); it != pages.end(); it++) {
cout << *it << endl;
}
return 0;
}
You are just adding the wrong value to the list:
pages.push_back(path);
should be
pages.push_back(entry.path().generic_string());

Assertion failure in std::ispunct in a simple program

I'm using the book C++ Primer by Stanley B. Lippman, and this error comes from my solution to Exercise 3.10 in Section 3.2.3. The exercise asks for a program that reads a string of characters including punctuation and writes what was read with the punctuation removed.
Here's the code:
#include "stdafx.h"
#include <iostream>
#include <string>
#include <cctype>
using namespace std;
int main() {
string s;
cout << "Please input a string of characters including punctuation:" << endl;
getline(cin, s);
for (auto c : s) {
if (!ispunct(c))
cout << c;
}
cout << endl;
return 0;
}
When I run this code in Visual Studio 2017, it shows this:
Debug Assertion Failed.
Expression: c >= -1 && c <= 255
For information on how your program can cause an assertion failure, see the Visual C++ documentation on asserts.
Why does it show this? I can't understand it.
Although the assertion failure you get is due to a bad call to std::ispunct() (you should iterate over the string with an unsigned char), the proper solution would be to use std::iswpunct:
#include <iostream>
#include <string>
#include <locale>
#include <cwctype> // std::iswpunct
int main()
{
std::wstring s;
do {
std::wcout << "Please input a string of characters including punctuation:\n";
} while (!std::getline(std::wcin, s));
for (auto c : s) {
if (!std::iswpunct(c))
std::wcout << c;
}
std::wcout << std::endl;
}
On a Windows platform, the combination of std::wstring [1] and std::iswpunct will let you handle Chinese characters correctly. Note that I assumed your system locale is "zh_CN.UTF-8". If it is not, you'll need to imbue your streams.
[1] See this excellent answer about the difference between string and wstring.
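For the narrow-character version, the unsigned char point mentioned at the start of this answer can be sketched like this. strip_punct is a hypothetical helper name; the sketch only avoids the assertion, and it does not make std::ispunct aware of multi-byte characters.

```cpp
#include <cctype>
#include <string>

// Sketch: return `s` with the punctuation characters removed.
std::string strip_punct(const std::string& s) {
    std::string out;
    for (char c : s) {
        // Cast first: passing a negative char value to std::ispunct is undefined
        // behaviour, and that is what triggers the debug assertion.
        if (!std::ispunct(static_cast<unsigned char>(c)))
            out += c;
    }
    return out;
}
```

In the original loop, the equivalent change is to write ispunct(static_cast<unsigned char>(c)) instead of ispunct(c).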

How to group data while iterating through a directory in c++

I have a directory with 15 folders, and each folder has 100 text files. Each text file contains a column of numbers.
I need those numbers to do some calculations, but I cannot figure out how to obtain them. I was thinking about a 2D vector, but I need a data structure that holds different types (a string for the name of the folder and integers for the numbers).
What is my best solution?
What I have so far is code that searches all the files under a given path.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sstream>
#include <algorithm>
#include <tuple>
#include <boost/filesystem.hpp>
#include<dirent.h>
using namespace std;
namespace fs = boost::filesystem;
// prototype to search all the files by given it a path
vector<double> getFilesFromDirectory(const fs::path& startDirectory);
int main()
{ // the directory
string dir = "/home/...";
// testing to call my methode
vector<double> myDataStructure = getFilesFromDirectory(dir);
// print out the value of myDataStructure
for (auto it = myDataStructure.begin(); it != myDataStructure.end(); it++)
{
cout << *it << " " << endl;
}
return 0;
}
// methode to search all the files by given it a path
vector<double> getFilesFromDirectory(const fs::path& startDirectory)
{
vector<double> di;
// First check if the start path exists
if (!fs::exists(startDirectory) || !fs::is_directory(startDirectory))
{
cout << "Given path not a directory or does not exist" << endl;
exit(1);
}
// Create iterators for iterating all entries in the directory
fs::recursive_directory_iterator it(startDirectory); // Directory iterator at the start of the directory
fs::recursive_directory_iterator end; // Directory iterator by default at the end
// Iterate all entries in the directory and sub directories
while (it != end)
{
// Print leading spaces
for (int i = 0; i < it.level(); i++)
cout << "";
// Check if the directory entry is an directory
// When directory, print directory name.
// Else print just the file name.
if (fs::is_directory(it->status()))
{
// print out the path file
cout << it->path() << endl;
}
else
{
cout << it->path().filename() << endl;
// test
di = getValueFromFile(it->path().c_str());
// test, here I want to group the numbers of the file
// and each name of the folder
for(int i = 0; i < 15; i++)
{
di.push_back(mi(fs::basename(it->path()), it->path().c_str());
}
}
// When a symbolic link, don't iterate it. Can cause infinite loop.
if (fs::is_symlink(it->status()))
it.no_push();
// Next directory entry
it++;
}
return di;
}
If I understand the problem correctly, I'd write a class (or struct) to hold the contents of each file:
A string containing the path.
A vector containing every value in the column of that file.
Then, in your main program, keep a vector containing each object you create.
Definition:
#ifndef __COLVALS_HPP__
#define __COLVALS_HPP__
#include <vector>
#include <string>
class ColVals {
private:
std::vector<double> _colValues;
std::string _pathName;
public:
ColVals(const std::string& pathName);
~ColVals() {}
void appendValue(const double colValue);
std::vector<double> getValues();
std::string getPath();
};
#endif // __COLVALS_HPP__
Implementation:
#include "colvals.hpp"
using namespace std;
ColVals::ColVals(const string& pathName) {
_pathName = pathName;
}
void ColVals::appendValue(const double colValue) {
_colValues.push_back(colValue);
}
vector<double> ColVals::getValues() {
return _colValues;
}
string ColVals::getPath() {
return _pathName;
}
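To show how such records could be filled while walking the directory, here is a minimal sketch. It is an illustration under several assumptions: it uses C++17 std::filesystem rather than Boost, a plain struct (FileColumn, a hypothetical name) in place of the ColVals class, and a hypothetical load_columns helper that treats every regular file as a single column of numbers.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Plain-struct stand-in for ColVals: one record per text file.
struct FileColumn {
    std::string folder;          // name of the folder the file sits in
    std::string path;            // full path of the file
    std::vector<double> values;  // the numbers read from its column
};

// Sketch: read every regular file below `root` into one record each.
std::vector<FileColumn> load_columns(const fs::path& root) {
    std::vector<FileColumn> records;
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
        if (!entry.is_regular_file()) continue;
        FileColumn rec;
        rec.folder = entry.path().parent_path().filename().string();
        rec.path = entry.path().string();
        std::ifstream in(entry.path());
        for (double v; in >> v; )        // stops at end of file or at a non-number
            rec.values.push_back(v);
        records.push_back(std::move(rec));
    }
    return records;
}
```

Grouping by folder is then a matter of comparing the folder member of each record, or of inserting the records into a std::map keyed by folder name.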

How often should one open/close fstream object c++

So the writeElm and getFileID functions depend on the cursor position in the file
(writeElm appends to the end, getFileID prints the lines from first to last):
#ifndef file_operations_header
#define file_operations_header
#include <string>
#include <iostream>
#include <vector>
#include <fstream>
#include "First_classes_header.h"
class fileOPerations
{
private:
string line;
fstream f_myFileOut;
public:
fileOPerations();
void closeFile()
{
f_myFileOut.close();
}
int getFileID()
{
int counter = 0;
if (f_myFileOut.is_open())
{
while(f_myFileOut.good()){
getline(f_myFileOut,line);
++counter;
cout << line << endl;
}
}f_myFileOut.close();
return counter;
}
int writeElm(makeVector& mV,int i)
{
f_myFileOut.open("file.txt",ios::out|ios::app|ios::ate);
if (f_myFileOut.is_open())
{
f_myFileOut << mV.str_vector[i].counter << "\t";
f_myFileOut << mV.str_vector[i].name << endl;
}
else{
cout << "can't open file." << endl;
}
return 0;
}
friend class makeVector;
};
fileOPerations::fileOPerations():f_myFileOut("file.txt",ios::out|ios::app|ios::in){}
#endif // file_operations_header
The call to getFileID in my main doesn't print anything because writeElm()
sets the cursor position to the end of the file.
#include <iostream>
#include <string.h>
#include <vector>
#include "First_classes_header.h"
#include "file_operations.h"
using namespace std;
int main()
{
fileOPerations fpObject;
makeVector vecObject;
int fileID = fpObject.getFileID();
while(true){
IDgenerator();
int genID = IDgenerator::GetID();
int currentID = fileID + genID;
string workingName = nameGetter::setName();
vecObject.vecSetter(currentID,workingName);
fpObject.writeElm(vecObject, currentID); // error within this function
fpObject.getFileID();
}fpObject.closeFile();
return 0;
}
Is it safe/efficient/effective to call f_myFileOut.open() with different parameters
in each separate function?
int getFileID()
{
f_myFileOut.open("file.txt",ios::out|ios::app|ios::in);
int counter = 0;
...
...
int writeElm(makeVector& mV,int i)
{
f_myFileOut.open("file.txt",ios::out|ios::app|ios::ate);
Or should I set the cursor pos manually?
While it is certainly not efficient to open and close the same file over and over again, it would be safe, and I'd even call it better coding style: currently you open the file in one method and close it in another, and in neither case is it obvious from the function name that this is a side effect (contrary to e.g. closeFile()). Also, you are already opening and closing the file in every iteration, so this would "only" double the number of open/close operations.
In general, however, I'd definitely recommend opening the file once at the beginning of your program, closing it at the end, and using e.g. f_myFileOut.seekg(0, f_myFileOut.beg) and f_myFileOut.seekg(0, f_myFileOut.end) in between to move the file position around.
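The open-once approach could look roughly like the sketch below. count_lines is a hypothetical stand-in for getFileID (it counts the lines without printing them). One detail the seek calls alone do not cover is that a stream which has already read to the end of the file may still carry the eof/fail state, so the sketch calls clear() before each seek.

```cpp
#include <fstream>
#include <string>

// Sketch: count the lines in an already-open stream, then leave it positioned
// at the end so that the next write appends.
int count_lines(std::fstream& f) {
    f.clear();                   // drop any eof/fail state left by earlier reads
    f.seekg(0, std::ios::beg);   // rewind to the first line
    int count = 0;
    std::string line;
    while (std::getline(f, line))
        ++count;
    f.clear();                   // getline ran into EOF; clear before seeking again
    f.seekp(0, std::ios::end);   // back to the end for appending
    return count;
}
```

The stream itself would be opened once, for example in the class constructor, and closed once in closeFile() or the destructor.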