Iterating through all the directories and subdirectories in C++

I wanted to use the std::filesystem::recursive_directory_iterator class to create a class method iterating through all subdirectories and processing found xml files.
The only way I have found on the internet to do this was using a for loop like this:
for (fs::directory_entry p : fs::recursive_directory_iterator("my_file"))
do_something(p);
The problem is that I need to store my iterator (or at least where it's pointing) in between function calls, as I can only process one file at a time. I tried implementing it like this:
class C {
private:
std::filesystem::recursive_directory_iterator it;
std::filesystem::directory_entry p;
public:
C(std::filesystem::path);
std::string find_file();
};
C::C(std::filesystem::path path)
{
it = fs::recursive_directory_iterator(path);
p = fs::directory_entry(it.begin());
}
std::string C::find_file()
{
do { //using do while so my function won't load the same file twice
++p;
} while (!is_xml(p.path()) && p != it.end());
}
But it seems that std::filesystem::recursive_directory_iterator doesn't have begin() and end() methods and can't be compared.
I have no idea how my code differs from the working range-based for loop, except that it stores the iterator and has an extra condition.

If you look at the std::filesystem::recursive_directory_iterator Non-member functions you can see that there is:
// range-based for loop support
begin(std::filesystem::recursive_directory_iterator)
end(std::filesystem::recursive_directory_iterator)
And then std::filesystem::begin(recursive_directory_iterator), std::filesystem::end(recursive_directory_iterator) with more details:
end(recursive_directory_iterator) Returns a default-constructed recursive_directory_iterator, which serves as the end iterator. The argument is ignored.
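So you check whether it is not equal to end(it) to see if there are any more elements. Note that these are non-member functions in namespace std::filesystem (found through argument-dependent lookup), not std::end. And you have to increment it and not p.
You also need to check it != end(it) before you call is_xml(it->path()), and the function has to return a value:
std::string C::find_file()
{
    do { // using do-while so the function won't load the same file twice
        ++it;
    } while (it != end(it) && !is_xml(it->path()));
    // empty string signals that no further XML file was found
    return it != end(it) ? it->path().string() : std::string();
}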

recursive_directory_iterator is already an iterator by itself (it says so right in its name), so you don't need to use begin() and end() at all. It implements operator==, operator!=, operator->, and operator++, which are all you need in this case.
Also, there is no reason for p to be a class member at all. It should be a local variable of find_file() instead (actually, in this case, it can be eliminated completely). And the loop should be a while loop instead of a do..while loop, in case the iterator is already at its "end" when find_file() is entered.
Try this instead:
class C {
private:
std::filesystem::recursive_directory_iterator it;
public:
C(std::filesystem::path);
std::string find_file();
};
C::C(std::filesystem::path path)
: it(path)
{
}
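std::string C::find_file()
{
    // a default-constructed iterator of the same type acts as the end iterator
    static const std::filesystem::recursive_directory_iterator end;
    while (it != end) {
        auto p = it->path();
        ++it; // advance before returning so the next call doesn't return the same file
        if (is_xml(p))
            return p.string();
    }
    return "";
}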

Related

How to properly handle a function that exits without encountering a return

I have a function that searches a vector and returns the item if it is found. But I want to know the best software approach to handle the case where it is not found.
I have created a function and could return -1 or something but that wouldn't match the return type.
koalaGraph::PVertex Koala::lookUpVertexbyName(const std::string&vertexName, const std::vector<koalaGraph::PVertex>& koalaVertices) {
for (size_t i = 0; i < koalaVertices.size(); i++) {
if(koalaVertices[i]->info.name == vertexName)
return koalaVertices[i];
}
}
If a situation is encountered where the item being searched for is not in the vector then program will exit.
You can use std::optional
#include <optional>
std::optional<koalaGraph::PVertex>
Koala::lookUpVertexbyName(const std::string&vertexName,
const std::vector<koalaGraph::PVertex>& koalaVertices) {
for (size_t i = 0; i < koalaVertices.size(); i++) {
if(koalaVertices[i]->info.name == vertexName)
return koalaVertices[i];
}
return {};
}
int main()
{
    Koala k;
    //...
    auto maybeVertex = k.lookUpVertexbyName("foo", vertices);
    if (maybeVertex) {
        koalaGraph::PVertex& vertex = *maybeVertex;
        // use vertex
    }
    // alternatively
    if (maybeVertex.has_value()) {
        // found
    }
}
You could use a for-loop and return a iterator.
std::vector<koalaGraph::PVertex>::const_iterator
Koala::lookUpVertexbyName(
    const std::string& vertexName,
    const std::vector<koalaGraph::PVertex>& koalaVertices)
{
    for (auto iter = koalaVertices.begin(); iter != koalaVertices.end(); ++iter) {
        if ((*iter)->info.name == vertexName) { // compare the element the iterator points to
            return iter;
        }
    }
    return koalaVertices.end();
}
Then check whether you got end back: an iterator equal to end() means the value was not found.
auto iter = lookUpVertexbyName(vertexName, koalaVertices);
if (iter == koalaVertices.end()) {
    // abort or do whatever you want
}
To use the value you have to dereference the iterator. DON'T dereference the end iterator; it will lead you to neverland (undefined behavior).
koalaGraph::PVertex vertex = *iter;
Why not use std::find_if instead of reinventing the wheel? You can give it a function object as the predicate:
struct equal
{
equal(const std::string& vertexName) : vertexName_(vertexName) { }
bool operator()(const koalaGraph::PVertex& pVertex) const
{
return pVertex->info.name == vertexName_;
}
private:
std::string vertexName_;
};
And then:
std::find_if(koalaVertices.begin(), koalaVertices.end(), equal(vertexName));
Regarding error handling in the function: as has already been stated, there are multiple approaches one can take. Returning an iterator instead of an object (you will avoid copying this way too) is one of them. The end() iterator would then indicate that the name was not found.
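With a lambda, the predicate struct is not needed. Below is a minimal sketch under stated assumptions: VertexInfo, Vertex, PVertex and findVertexByName are hypothetical stand-ins, since the real koalaGraph types are not shown in the question.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Koala types used in the question.
struct VertexInfo { std::string name; };
struct Vertex { VertexInfo info; };
using PVertex = std::shared_ptr<Vertex>;

// Returns an iterator to the first vertex with the given name, or
// vertices.end() when no vertex matches.
std::vector<PVertex>::const_iterator
findVertexByName(const std::vector<PVertex>& vertices, const std::string& name) {
    return std::find_if(vertices.begin(), vertices.end(),
                        [&name](const PVertex& v) { return v->info.name == name; });
}
```

The caller compares the result against `vertices.end()` before dereferencing it.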
There are three ways to exit a function:
1. Return a value
2. Throw a value
3. Call std::abort or std::exit (possibly indirectly)
(There is also std::longjmp, which you shouldn't use.)
If you don't do any of the above, then behaviour will be undefined. If you don't want to do 1., then your options are 2. or 3. Abort and exit will terminate the process. A throw can be caught, but an uncaught throw will cause std::abort.
Note that just because you don't find a value, it's not necessarily impossible to return some value. What you can do is return a "sentinel" value that represents "not found". For example, std::string functions that return an index will return std::string::npos when there is no result. Functions returning a pointer might return null, and functions returning an iterator would return an iterator to the end of the range.
If there is no representation of your return type that could be reserved for a sentinel, there is a way to add such representation by wrapping the return type with additional state. The standard library has a generic wrapper for this: std::optional.
Another wrapper is std::expected (standardized in C++23; before that, plenty of non-standard implementations existed). It allows storing information about the reason for not returning a proper value, which is similar to what you can do with exceptions.
P.S. Your function appears to be nearly identical to std::find_if. Use standard algorithms when possible. Also consider a data structure that is more efficient for searching if the search space is large.

How to handle a function that is not guaranteed to return anything?

I have a class that stores & manages a vector containing a number of objects.
I'm finding myself writing a number of functions similar to the following:
Object* ObjectManager::getObject(std::string name){
for(auto it = object_store.begin(); it != object_store.end(); ++it){
if(it->isCalled(name))
return &(*it);
}
return nullptr;
}
I think I would rather return by reference, as here the caller would have to remember to check for null! Is there a way I can change my design to better handle this?
Your alternatives are outlined below
Change your API to the following
object_manager.execute_if_has_object("something", [](auto& object) {
use_object(object);
});
This API is much easier to use, conveys intent perfectly and removes the thought process of error handling, return types, etc from the user's mind
Throw an exception.
Object& ObjectManager::getObject(const std::string& name){
for(auto& object : object_store){
if(object.isCalled(name))
return object;
}
// throw an exception
throw std::runtime_error{"Object not found"};
}
Return a bool, pass the Object by reference and get a copy
bool ObjectManager::getObject(const std::string& name, Object& object_out){
for(auto& object : object_store){
if(object.isCalled(name)) {
object_out = object;
return true;
}
}
return false;
}
Let the user do the finding
auto iter = std::find_if(object_store.begin(), object_store.end(), [&name](auto& element) {
    return element.isCalled(name);
});
if (iter != object_store.end()) { ... }
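The first alternative above, execute_if_has_object, is only shown from the caller's side. One possible implementation might look like the sketch below. The Object class, the add() helper, and the bool return value are assumptions made for illustration; only the names execute_if_has_object, isCalled and object_store come from the text.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal Object, just enough for the sketch.
class Object {
public:
    explicit Object(std::string name) : name_(std::move(name)) {}
    bool isCalled(const std::string& name) const { return name_ == name; }
private:
    std::string name_;
};

class ObjectManager {
public:
    // Hypothetical helper for filling the store.
    void add(Object object) { object_store.push_back(std::move(object)); }

    // Calls func(object) for the first object with the given name.
    // Returns true if such an object existed, false otherwise.
    template <typename Func>
    bool execute_if_has_object(const std::string& name, Func&& func) {
        for (auto& object : object_store) {
            if (object.isCalled(name)) {
                std::forward<Func>(func)(object);
                return true;
            }
        }
        return false;
    }

private:
    std::vector<Object> object_store;
};
```

Returning bool is optional; it lets callers that care about the "not found" case react to it without forcing everyone else to check anything.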
Also
Pass that string by const reference. When C++17 is available change that const reference to a std::string_view
Use range based for loops in this situation, they are a more readable alternative for what you are doing
Look at the design of the STL (e.g. the find function): it is not at all bad to return the iterator you just searched for, and to return .end() otherwise.
auto ObjectManager::getObject(std::string name){
for(auto it = object_store.begin(); it != object_store.end(); ++it){
if(it->isCalled(name))
return it;
}
return object_store.end();
}
More: Of course object_store.end() may be inaccessible from outside the class but that is not an excuse, because you can do this (note the more slick code also)
auto ObjectManager::getObject(std::string name){
    auto it = object_store.begin();
    // stop at end() so a missing name never dereferences the end iterator
    while(it != object_store.end() && !it->isCalled(name)) ++it;
    return it;
}
auto ObjectManager::nullObject(){return object_store.end();}
Less code is better code. You can use it like this:
auto result = *om.getObject("pizza"); // search, not check (if you know what you are doing)
or
auto it = om.getObject("pizza");
if(it != om.nullObject() ){ ... do something with *it... }
or
auto it = om.getObject("pizza");
if(it != om.nullObject() ){ ... do something with *it... }
else throw java_like_ridiculous_error("I can't find the object, the universe will collapse and it will jump to another plane of existence");
Of course, at this point it is better to call the functions findObject and noposObject, and also to question why you are not using std::find_if directly on the object_store container.
I think you are already handling the return value properly and your current solution is optimal.
The fact is you can not avoid checking for something in order to discover if your find operation succeeded. If you throw an exception then your try{}catch{} is your check. Also an exception should not be used when not finding an item is a legitimate result. If you return a bool and use an out parameter you have made the situation more complicated to do the same job. Same with returning an iterator. A std::optional returns values.
So IMO you can't improve upon returning a pointer you can just make the same job more complicated.
An alternative to exceptions or optional is to implement a "Null object", which can be used as a regular object but will "do nothing". Depending on the case, it can sometimes be used as is and does not need to be checked (explicitly), especially where ignoring the "not found" situation is acceptable.
(the null object can be a static global, so it is also possible to return a reference to it)
Even if a check is needed, an isNull() method can be implemented, which returns true for the null object and false for a valid object (or there can be isValid() method, etc.).
Example:
class Object {
public:
virtual void doSomething();
};
class NullObject: public Object {
public:
virtual void doSomething() {
// doing nothing - ignoring the null object
}
};
class ObjectManager {
public:
Object& getObject(const std::string& name);
private:
static NullObject s_nullObject;
};
Object& ObjectManager::getObject(const std::string& name){
for(auto it = object_store.begin(); it != object_store.end(); ++it){
if(it->isCalled(name))
return *it;
}
return s_nullObject;
}
ObjectManager mgr;
Object& obj = mgr.getObject(name);
obj.doSomething(); // does nothing if the object is NullObject
// (without having to check!)

C++ `vector iterators incompatible` error only in Visual Studio

I have a class representing a string of space-delimited words via a vector of those words and an iterator over the vector.
class WordCrawler{
public:
WordCrawler(std::string, bool reversed=false);
WordCrawler& operator--();
std::string operator* () const;
bool atBeginning() const;
private:
std::vector<std::string> words;
std::vector<std::string>::iterator it;
};
I am trying to print out the words in reverse order, using this function:
void print_in_reverse(std::string in) {
WordCrawler wc = WordCrawler(in, true);
while (!wc.atBeginning()) {
--wc;
std::cout << *wc << " ";
}
}
I construct my WordCrawler object with this constructor:
WordCrawler::WordCrawler(std::string in, bool reversed) {
std::istringstream iss(in);
std::string token;
while (std::getline(iss, token, ' '))
{
words.push_back(token);
}
if (reversed) {
it = words.end();
} else {
it = words.begin();
}
}
The rest of the member functions are pretty simple:
/**
True if pointer is at the beginning of vector
*/
bool WordCrawler::atBeginning() const {
return it == words.begin();
}
/**
Function that returns the string stored at the pointer's address
*/
std::string WordCrawler::operator*() const {
return *it;
}
/**
Function that increments the pointer back by one
*/
WordCrawler& WordCrawler::operator--() {
if (!atBeginning())
--it;
return *this;
}
I'm finding that everything works fine on Xcode and cpp.sh, but on Visual Studio it throws a runtime error saying vector iterators incompatible at atBeginning() function. My assumption would be that this is because the code is reliant on some sort of undefined behavior, but as I am relatively new to C++ I'm not sure what it is.
I know that it is always an iterator of the words vector, and I know that words does not change after it has been initialized, so I'm not sure what the issue is.
Full code at: http://codepad.org/mkN2cGaM
Your object has a rule of three violation - on copy/move construction the iterator will still point to the vector in the old object.
The line WordCrawler wc = WordCrawler(in, true); specifies a copy/move operation, triggering this problem. Most compilers perform copy elision here but I heard that older versions of MSVC don't, in debug mode anyway.
To fix this properly, I would recommend using an index instead of an iterator in the class. If you really want to use the iterator you will need to implement your own copy-constructor and move-constructor.
Changing that line to WordCrawler wc(in, true); would probably fix this particular program but the same problem would be lurking still, and might show up when you make further modifications later.
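As a rough sketch of the index-based fix recommended above: the class below stores a position instead of an iterator, so the compiler-generated copy and move operations remain valid. The member name pos and the reversed_words function are assumptions for illustration; reversed_words mirrors print_in_reverse but returns the text so it is easy to check.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Variant of WordCrawler that stores a position instead of an iterator, so
// a copied object never refers to another object's vector.
class WordCrawler {
public:
    explicit WordCrawler(const std::string& in, bool reversed = false) {
        std::istringstream iss(in);
        std::string token;
        while (std::getline(iss, token, ' '))
            words.push_back(token);
        pos = reversed ? words.size() : 0;
    }

    WordCrawler& operator--() {
        if (!atBeginning())
            --pos;
        return *this;
    }

    // Precondition: the crawler is not positioned one past the last word.
    std::string operator*() const { return words[pos]; }

    bool atBeginning() const { return pos == 0; }

private:
    std::vector<std::string> words;
    std::size_t pos = 0;
};

// Collects the words in reverse order, each followed by a space.
std::string reversed_words(const std::string& in) {
    WordCrawler wc = WordCrawler(in, true);  // copying is now harmless
    std::string out;
    while (!wc.atBeginning()) {
        --wc;
        out += *wc;
        out += ' ';
    }
    return out;
}
```

Because an index is relative to whichever vector it is used with, no user-defined copy constructor, move constructor, or assignment operator is needed.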

function returning iterator in C++

Following is a Java method that returns an iterator
vector<string> types;
// some code here
Iterator Union::types()
{
return types.iterator();
}
I want to translate this code to C++. How can i return an iterator of vector from this method?
This will return an iterator to the beginning of types:
std::vector<string>::iterator Union::types()
{
return types.begin();
}
However, the caller needs to know the end() of vector types as well.
Java's Iterator has a method hasNext(): this does not exist in C++.
You could change Union::types() to return a range:
std::pair<std::vector<std::string>::iterator,
std::vector<std::string>::iterator> Union::types()
{
return std::make_pair(types.begin(), types.end());
}
std::pair<std::vector<std::string>::iterator,
std::vector<std::string>::iterator> p = Union::types();
for (; p.first != p.second; p.first++)
{
}
You'll want to have a begin and end method:
std::vector<string>::iterator Union::begin()
{
return types.begin();
}
std::vector<string>::iterator Union::end()
{
return types.end();
}
For completeness you might also want to have const versions
std::vector<string>::const_iterator Union::begin()const
{
return types.begin();
}
std::vector<string>::const_iterator Union::end()const
{
return types.end();
}
Assuming that types is an attribute of the class Union, a nice, STL compliant, way to handle this is:
class Union
{
std::vector<std::string> types;
public:
typedef std::vector< std::string >::iterator iterator;
iterator begin() { return types.begin(); }
iterator end() { return types.end(); }
};
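One benefit of exposing begin() and end() is that the class then works directly with range-based for loops and standard algorithms. A small sketch, assuming a read-only view is enough; add_type and join_types are hypothetical helpers added only to make the example self-contained.

```cpp
#include <string>
#include <utility>
#include <vector>

class Union {
public:
    using const_iterator = std::vector<std::string>::const_iterator;

    // Hypothetical helper for filling the container.
    void add_type(std::string type) { types_.push_back(std::move(type)); }

    const_iterator begin() const { return types_.begin(); }
    const_iterator end() const { return types_.end(); }

private:
    std::vector<std::string> types_;
};

// Joins all type names with commas; the range-based for loop calls
// u.begin() and u.end() behind the scenes.
std::string join_types(const Union& u) {
    std::string out;
    for (const std::string& type : u) {
        if (!out.empty())
            out += ',';
        out += type;
    }
    return out;
}
```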
A union is a container of its members. I would use begin and end to give back iterators to the first and one-past-the-last members, respectively.
The list of types is not, IMO, the primary iterable property of a union. So I would use the following myself, and reserve the plain begin and end for the member data itself.
std::vector<string>::const_iterator Union::types_begin() const {
return types.begin();
}
std::vector<string>::const_iterator Union::types_end() const {
return types.end();
}
Returning an iterator is easy. For example, you can return the first iterator in the vector types:
std::vector<std::string> types;
// some code here
std::vector<std::string>::iterator Union::returnTheBeginIterator()
{
return types.begin();
}
Java vs. C++
But C++ iterators are not Java iterators: They are not used the same way.
In Java (IIRC), you have something more like an enumerator, that is, you use the method "next" to iterate from one item to the next. Thus, returning the Java iterator is enough to iterate from the beginning to the end.
In C++, the iterator is designed to behave like a super-pointer. Thus, it usually "points" to the value, and using the operator ++, --, etc. (depending on the exact type of the iterator), you can move the iterator to "point" to the next, previous, etc. value in the container.
Let's iterate!
Usually, you want to iterate from the beginning to the end.
Thus, you can return the whole collection (as "const", if you want it to be read-only), and let the user iterate the way he/she wants.
Or you can return two iterators, one for the beginning, and one for the end. So you could have:
std::vector<std::string>::iterator Union::typesBegin()
{
return types.begin();
}
std::vector<std::string>::iterator Union::typesEnd()
{
return types.end();
}
And then, you can iterate from the beginning to the end, in C++03:
// alias, because the full declaration is too long
typedef std::vector<std::string> VecStr ;
void foo(Union & p_union)
{
VecStr::iterator it = p_union.typesBegin() ;
VecStr::iterator itEnd = p_union.typesEnd() ;
for(; it != itEnd; ++it)
{
// here, "*it" is the current string item
std::cout << "The current value is " << *it << ".\n" ;
}
}
C++11 version
If you provide the full container instead of only its iterators, in C++11, it becomes easier, as you can use the range-for loop (as the foreach in Java and C#):
void foo(std::vector<std::string> & p_types)
{
for(std::string & item : p_types)
{
// here, "item " is the current string item
std::cout << "The current value is " << item << ".\n" ;
}
}
P.S.: Johannes Schaub - litb is right in using the "const" qualifier whenever possible. I did not because I wanted to avoid to dilute the code, but in the end, "const" is your friend.
You can do it as below
std::vector<std::string> types;
std::vector<std::string>::iterator Union::types(){
return types.begin();
}

Overloading [] operator in C++

I'm trying to overload the [] operator in C++ so that I can assign/get values from my data structure the way a dictionary is used in C#:
Array["myString"] = etc.
Is this possible in C++?
I attempted to overload the operator, but it doesn't seem to work:
Record& MyDictionary::operator[] (string& _Key)
{
for (int i = 0; i < used; ++i)
{
if (Records[i].Key == _Key)
{
return Records[i];
}
}
}
Thanks.
Your code is on the right track - you've got the right function signature - but your logic is a bit flawed. In particular, suppose that you go through this loop without finding the key you're looking for:
for (int i = 0; i < used; ++i)
{
if (Records[i].Key == _Key)
{
return Records[i];
}
}
If this happens, your function doesn't return a value, which leads to undefined behavior. Since it's returning a reference, this is probably going to cause a nasty crash the second that you try using the reference.
To fix this, you'll need to add some behavior to ensure that you don't fall off of the end of the function. One option would be to add the key to the table, then to return a reference to that new table entry. This is the behavior of the STL std::map class's operator[] function. Another would be to throw an exception saying that the key wasn't there, which does have the drawback of being a bit counterintuitive.
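The first option (add the key, then return a reference to the new entry) could be sketched as follows. The Record layout with an int Value, the std::vector storage, and the size() helper are assumptions for illustration; the question only shows the Key member and an array-like Records.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Record layout; the question only shows the Key member.
struct Record {
    std::string Key;
    int Value;
};

class MyDictionary {
public:
    // Returns the record for key, appending a zero-valued record when the
    // key is not present yet (the same idea as std::map::operator[]).
    Record& operator[](const std::string& key) {
        for (Record& record : records_) {
            if (record.Key == key)
                return record;
        }
        records_.push_back(Record{key, 0});
        return records_.back();
    }

    std::size_t size() const { return records_.size(); }

private:
    std::vector<Record> records_;
};
```

With this, `dict["myString"].Value = 42;` both creates and assigns. Note that with vector storage, a reference returned earlier may be invalidated when a later lookup inserts a new record.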
On a totally unrelated note, I should point out that technically speaking, you should not name the parameter to this function _Key. The C++ standard says that any identifier name that starts with two underscores (i.e. __myFunction), or a single underscore followed by a capital letter (as in your _Key example) is reserved by the implementation for whatever purposes they might deem necessary. They could #define the identifier to something nonsensical, or have it map to some compiler intrinsic. This could potentially cause your program to stop compiling if you move from one platform to another. To fix this, either make the K lower-case (_key), or remove the underscore entirely (Key).
Hope this helps!
On a related note, one of the problems with operator[](const Key& key) is that, as templatetypedef states, in order to return a reference it needs to be non-const.
To have a const accessor, you need a method that can return a fail case value. In STL this is done through using find() and the use of iterators and having end() indicate a fail.
An alternative is to return a pointer, with null indicating a failure. This is probably justified where a default-constructed Record is meaningless. This can also be done with the array operator:
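// returns a pointer to const because the member function itself is const
const Record* MyDictionary::operator[] (const string& keyToFind) const
{
    for (int i = 0; i < used; ++i)
    {
        if (Records[i].Key == keyToFind)
        {
            return &Records[i];
        }
    }
    return 0; // null pointer: key not found
}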
There is certainly a view that operator[] should return a reference. In that case, you'd most likely implement find() as well and implement operator[] in terms of it.
To implement find() you need to define an iterator type. The convenient type will depend on the implementation. For example, if Records[] is a plain old array:
typedef Record* iterator;
typedef const Record* const_iterator;
const_iterator MyDictionary::end()const
{
return Records + used;
}
const_iterator MyDictionary::begin() const
{
return Records;
}
const_iterator MyDictionary::find(const string& keyToFind) const
{
    // begin() and end() are const here, so iterate with const_iterator
    for (const_iterator it = begin(); it != end(); ++it)
    {
        if (it->Key == keyToFind)
        {
            return it;
        }
    }
    return end();
}
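To illustrate implementing operator[] in terms of find(), here is a self-contained sketch of a const accessor that throws when the key is missing. The fixed Capacity, the int Value member, the insert() helper, and the choice of std::out_of_range are assumptions, not something the answers above prescribe.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical Record layout; only Key appears in the question.
struct Record {
    std::string Key;
    int Value;
};

class MyDictionary {
public:
    typedef const Record* const_iterator;

    const_iterator begin() const { return Records; }
    const_iterator end() const { return Records + used; }

    const_iterator find(const std::string& key) const {
        for (const_iterator it = begin(); it != end(); ++it) {
            if (it->Key == key)
                return it;
        }
        return end();
    }

    // Const lookup built on find(); throws when the key is missing.
    const Record& operator[](const std::string& key) const {
        const_iterator it = find(key);
        if (it == end())
            throw std::out_of_range("key not found: " + key);
        return *it;
    }

    // Hypothetical helper for filling the table; returns false when full.
    bool insert(const std::string& key, int value) {
        if (used == Capacity)
            return false;
        Records[used].Key = key;
        Records[used].Value = value;
        ++used;
        return true;
    }

private:
    static const std::size_t Capacity = 16;
    Record Records[Capacity];
    std::size_t used = 0;
};
```

Callers that consider a missing key a normal outcome can call find() and compare against end(); callers that consider it an error can use operator[] and let the exception propagate.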