Make my C++ Class iterable via BOOST_FOREACH - c++

I have a class that should expose a list of structs (which just contain some integers).
I don't want outside code to modify this data, just iterate over it and read it.
Example:
struct TestData
{
    int x;
    int y;
    // other data as well
};
class IterableTest
{
public:
    // expose TestData here
};
Now, in my code, I want to use my class like this:
IterableTest test;
BOOST_FOREACH(const TestData& data, test.data())
{
    // do something with data
}
I've already read this article http://accu.org/index.php/journals/1527 about memberspaces.
However, I don't want to (or can't) save all TestData in an internal vector or something.
This is because the class itself doesn't own the storage, i.e. there is actually no underlying container which can be accessed directly by the class. The class itself can query an external component to get the next, previous or ith element, though.
So basically I want my class to behave as if it had a collection, but in fact it doesn't have one.
Any ideas?

It sounds like you have to write your own iterators.
The Boost.Iterator library has a number of helpful templates. I've used their Iterator Facade base class a couple of times, and it's nice and easy to define your own iterators using it.
But even without it, iterators aren't rocket science. They just have to expose the right operators and typedefs. In your case, they're just going to be wrappers around the query function they have to call when they're incremented.
Once you have defined an iterator class, you just have to add begin() and end() member functions to your class.
It sounds like the basic idea is going to have to be to call your query function when the iterator is incremented, to get the next value.
And dereference should then return the value retrieved from the last query call.
It may help to take a look at the standard library stream_iterators for some of the semantics, since they also have to work around some fishy "we don't really have a container, and we can't create iterators pointing anywhere other than at the current stream position" issues.
For example, assuming you need to call a query() function which returns NULL when you've reached the end of the sequence, creating an "end-iterator" is going to be tricky. But really, all you need is to define equality so that "iterators are equal if they both store NULL as their cached value". So initialize the "end" iterator with NULL.
It may help to look up the required semantics for input iterators, or if you're reading the documentation for Boost.Iterator, for single-pass iterators specifically. You probably won't be able to create multipass iterators. So look up exactly what behavior is required for a single-pass iterator, and stick to that.
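A minimal sketch of the idea described above, without Boost.Iterator. ExternalSource and its query(i) function are hypothetical stand-ins for the external component, and the sketch assumes query returns a null pointer past the end. For brevity the class itself is made iterable rather than going through a data() member. The end iterator simply caches a null pointer, and equality compares the cached values.

```cpp
#include <cstddef>
#include <iterator>

struct TestData {
    int x;
    int y;
};

// Hypothetical stand-in for the external component: it owns the data and
// returns a null pointer once the index runs past the end.
class ExternalSource {
public:
    const TestData* query(std::size_t i) const {
        static const TestData items[] = {{1, 2}, {3, 4}, {5, 6}};
        return i < sizeof(items) / sizeof(items[0]) ? &items[i] : nullptr;
    }
};

class IterableTest {
public:
    class const_iterator {
    public:
        typedef std::input_iterator_tag iterator_category;  // single-pass only
        typedef TestData value_type;
        typedef std::ptrdiff_t difference_type;
        typedef const TestData* pointer;
        typedef const TestData& reference;

        // The end iterator caches a null pointer.
        const_iterator() : source_(nullptr), index_(0), current_(nullptr) {}
        const_iterator(const ExternalSource* source, std::size_t index)
            : source_(source), index_(index), current_(source->query(index)) {}

        // Dereferencing returns the value retrieved by the last query call.
        reference operator*() const { return *current_; }
        pointer operator->() const { return current_; }

        // Incrementing asks the external component for the next element.
        const_iterator& operator++() {
            current_ = source_->query(++index_);
            return *this;
        }
        const_iterator operator++(int) {
            const_iterator old(*this);
            ++*this;
            return old;
        }

        // Iterators are equal when their cached values are equal, so an
        // exhausted iterator compares equal to the end iterator.
        friend bool operator==(const const_iterator& a, const const_iterator& b) {
            return a.current_ == b.current_;
        }
        friend bool operator!=(const const_iterator& a, const const_iterator& b) {
            return !(a == b);
        }

    private:
        const ExternalSource* source_;
        std::size_t index_;
        const TestData* current_;
    };
    typedef const_iterator iterator;  // read-only access, so both are the same type

    const_iterator begin() const { return const_iterator(&source_, 0); }
    const_iterator end() const { return const_iterator(); }

private:
    ExternalSource source_;
};

// Usage: sums x + y over all elements with a range-based for loop.
int sumAll(const IterableTest& test) {
    int total = 0;
    for (const TestData& data : test) {
        total += data.x + data.y;
    }
    return total;
}
```

Since the class provides iterator and const_iterator typedefs plus begin() and end(), BOOST_FOREACH(const TestData& data, test) should be able to iterate it the same way, although that is not exercised here.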

If your collection type presents a standard container interface, you don't need to do anything to make BOOST_FOREACH work with your type. In other words, if your type has iterator and const_iterator nested typedefs, and begin() and end() member functions, BOOST_FOREACH already knows how to iterate over your type. No further action is required.
http://boost-sandbox.sourceforge.net/libs/foreach/doc/html/boost_foreach/extending_boost_foreach.html

From the BOOST_FOREACH documentation page:
BOOST_FOREACH iterates over sequences. But what qualifies as a sequence, exactly? Since BOOST_FOREACH is built on top of Boost.Range, it automatically supports those types which Boost.Range recognizes as sequences. Specifically, BOOST_FOREACH works with types that satisfy the Single Pass Range Concept. For example, we can use BOOST_FOREACH with:

Related

Is there a type in the standard for storing the begin- and end-iterators of a container?

My question is really simple: Is there a type in the standard whose purpose is to store the begin-iterator and the end-iterator for a container?
I want to return both iterators from a single function. I'm aware I could use std::pair for this but it feels like there would be some type in the standard for this exact purpose. I've scanned the <iterator> header but can't seem to find a type like this. I'm not that good at iterator terminology so I'm not sure if one of those actually does what I'm looking for. Is there one that does?
(If you're interested in why I even want to do it in the first place, I have constant arrays in a class template that derives from a base class. Using polymorphism I want to iterate over their class constants, but I obviously can't have the virtual functions of the base class return the actual arrays since those are templated. Hence, I have virtual functions returning plain pointers, called someArrayBegin(), someArrayEnd(), otherArrayBegin(), etc, that I override in the derived class template to return the correct addresses.)
Is there a type in the standard for storing the begin- and end-iterators of a container?
Your description matches closely with the concept of a range.
std::ranges::subrange can be created from a pair of iterators. That will be templated based on the iterator type.
If you need to hide the iterator type for runtime polymorphism, you would need something like ranges::any_view from the range-v3 library, which unfortunately isn't part of the standard ranges library. Furthermore, the type erasure has a runtime cost.
For contiguous containers, another alternative is std::span which can point to any contiguous range without the cost of type erasure. For strings in particular, yet another alternative is std::string_view.

Why don't almost all the type aliases in std::iterator_traits have defaults?

When creating a new iterator pre-C++20 without the help of libraries like boost.iterator, it's necessary to specify the type aliases difference_type, value_type, pointer, reference and iterator_category.
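For illustration, a hand-written pre-C++20 iterator with all five aliases might look roughly like this. CountingIterator is a made-up example type that steps through consecutive integers; it is tagged as an input iterator because it returns a reference to its own member.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// A made-up input iterator over consecutive integers, written without
// C++20 features or helper libraries.
class CountingIterator {
public:
    // The five aliases that std::iterator_traits picks up.
    typedef std::ptrdiff_t difference_type;
    typedef int value_type;
    typedef const int* pointer;
    typedef const int& reference;
    typedef std::input_iterator_tag iterator_category;

    CountingIterator() : value_(0) {}
    explicit CountingIterator(int value) : value_(value) {}

    reference operator*() const { return value_; }
    pointer operator->() const { return &value_; }
    CountingIterator& operator++() {
        ++value_;
        return *this;
    }
    CountingIterator operator++(int) {
        CountingIterator old(*this);
        ++value_;
        return old;
    }
    friend bool operator==(const CountingIterator& a, const CountingIterator& b) {
        return a.value_ == b.value_;
    }
    friend bool operator!=(const CountingIterator& a, const CountingIterator& b) {
        return !(a == b);
    }

private:
    int value_;
};

// iterator_traits simply forwards the member aliases.
static_assert(std::is_same<std::iterator_traits<CountingIterator>::value_type, int>::value,
              "value_type is taken from the member alias");
static_assert(std::is_same<std::iterator_traits<CountingIterator>::reference, const int&>::value,
              "reference is taken from the member alias");
```

With these aliases in place, standard algorithms can use the type, for example std::distance(CountingIterator(2), CountingIterator(7)).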
According to cppreference, with C++20, it's only necessary to specify difference_type and value_type, which I think is great!
But why are there defaults for exactly these 3 aliases?
There are 2 things I don't understand about this (and one thing that seems to me like an oversight):
1. Why are there no default values for value_type and difference_type? Wouldn't it make sense to use something like std::remove_reference_t<reference> as a default for value_type?
As a default for difference_type for random access iterators, it could arguably make sense to use the result type of the - operator taking two iterators.
2. C++20 adds the contiguous_iterator_tag. Just like with input_iterator_tag versus forward_iterator_tag, I don't see how it should be possible for the compiler to correctly distinguish between a contiguous iterator and a random access iterator, which I guess is why it apparently never selects contiguous_iterator_tag. Is this intended? It also seems somewhat dangerous to misclassify an input iterator as a forward iterator, so why not require programmers to specify this alias themselves?
3. On a somewhat unrelated note, I'm not sure if it's a good idea to silently generate a value for iterator_category even if the programmer has explicitly stated another category, and generating a value for iterator_category that's completely different from the concept seems strange as well. Consider this unrealistic example:
#include <iostream>
#include <iterator>
// With the == operator, this is an input iterator, but nothing else.
struct WeirdIterator {
    // Not an output iterator because you can't assign to a const reference
    const int& operator*() const {
        static const int value = 42; // static, so the returned reference stays valid
        return value;
    }
    WeirdIterator& operator++() { return *this; } // unimportant
    WeirdIterator operator++(int) { return *this; } // unimportant
    // bool operator==(const WeirdIterator&) const = default;
    using iterator_category = std::random_access_iterator_tag;
    using value_type = int;
    using difference_type = int;
};
void iteratorConcept(std::input_iterator auto) {
    std::cout << "input iterator concept" << std::endl;
}
void iteratorConcept(std::random_access_iterator auto) {
    std::cout << "random access iterator concept" << std::endl;
}
void iteratorTag(std::output_iterator_tag) {
    std::cout << "output iterator tag" << std::endl;
}
void iteratorTag(std::input_iterator_tag) {
    std::cout << "input iterator tag" << std::endl;
}
void iteratorTag(std::random_access_iterator_tag) {
    std::cout << "random access iterator tag" << std::endl;
}
int main() {
    WeirdIterator iter;
    iteratorConcept(iter);
    iteratorTag(std::iterator_traits<WeirdIterator>::iterator_category{});
    return 0;
}
This prints "input iterator concept" and "output iterator tag" because it's missing the comparison operator (which isn't required for the concept).
If I add the commented line, this now prints "input iterator concept" and "random access iterator tag", even though it clearly isn't a random access iterator. To be fair, writing the wrong iterator_category (i.e. random_access_iterator_tag) like this is a pretty stupid example, but I still think it would make sense to check if the concept is satisfied, especially in the case of the "fall-back" output_iterator_tag: Forgetting to write the == operator shouldn't turn an input iterator into an unusable output iterator. Would it be possible and make sense to check that the corresponding concepts are satisfied?
Edit
A few points in my question seem to be unclear, or maybe I've made some incorrect but unstated assumptions. I'll try to be more explicit about them and rephrase my current understanding (after reading the answer by Nicol Bolas):
Regarding Point 3: As I understand it, it's possible that a type T may have some std::iterator_traits<T>::iterator_category alias even if it doesn't model the corresponding C++20 concept or the C++17 named requirement. This is intended. So, let's forget about this, because it's probably a better fit for a separate question.
I think that the std::iterator_traits aliases defined if I don't explicitly write them down (e.g. reference when I only write value_type) can be incorrect for some iterators and are meant as sensible default values. Is this correct? If this is incorrect, my question is pretty much answered.
If T::reference isn't defined for an input iterator T, then std::iterator_traits::reference is defined as decltype(*std::declval<T&>()). Is this correct?
If reference can be defined based on operator*, wouldn't it make sense to then also define value_type based on *? Assuming that 5. is correct, the only input iterator I can think of where this would go wrong is the iterator from std::vector<bool>, and there were several proposals to deprecate it because of this difference. So most input iterators would work with this definition, and those that didn't could simply specify value_type. Am I missing something?
Regarding Point 2: It's not in general decidable into what category an iterator falls.
Using e.g. an input iterator as if it were a more general forward iterator would be a bug. It can happen that the iterator_traits::iterator_category of a valid iterator where the programmer did not specify the iterator_category is incorrect. This doesn't affect the concept or named requirement (they take semantics into account), but in practical terms, it's possible that STL functions don't work correctly with this iterator, without generating a (run- or compile-time) error. Therefore, I think it would be a good idea to require the programmer to explicitly state the category. Is there a problem in this reasoning or did I miss something?
I hope I don't come across as overly pedantic or as insisting on my personal opinion, but I genuinely don't know if and where there's an error in the points above, and I'm guessing that this isn't just confusing to me.
It's important to understand something at this point, as certain different things are being conflated here.
In C++20, there are two classifications of iterators: the old C++17 named requirements, and the new C++20 concept-based iterators. Most of the old requirements map to the latter, but the concept requirements allow for more things to be considered iterators than what the C++17 requirements allowed.
std::iterator_traits however is used for both of them, since they do use many of the same moving parts. The point of this is that it should be possible to write an iterator that fulfills both the C++17 named requirement and the similar C++20 concept. That is, you can write a type that satisfies Cpp17RandomAccessIterator and std::random_access_iterator without too much trouble.
I bring this up because many of the things under discussion will matter a lot more to one set of requirements than the other.
Why are there no default values for value_type and difference_type? Wouldn't it make sense to use something like std::remove_reference_t<reference> as a default for value_type?
Obviously, that would require you to specify reference. So you'd still have to specify two things. value_type is the one that the creator of the iterator is thinking in terms of anyway. And if they're thinking of it, it's probably because reference needs to be something other than a value_type&, so they'll need to specify both anyway.
C++20 adds the contiguous_iterator_tag. Just like with input_iterator_tag versus forward_iterator_tag, I don't see how it should be possible for the compiler to correctly distinguish between a contiguous iterator and a random access iterator, which I guess is why it apparently never selects contiguous_iterator_tag. Is this intended?
In C++17, there was no such thing as a "contiguous iterator". Not in the same sense as a RandomAccessIterator. There's a whole section in the standard that explains the requirements of a RandomAccessIterator, while "contiguous iterator" gets a one paragraph statement with no additional information about it and very few actual uses.
And of course, "contiguous iterator" gets no iterator tag. This was done deliberately to avoid adding another iterator tag and possibly making a lot of code that could work non-functional because a contiguous iterator instead advertised itself as random access.
C++20 changes things. It adds a std::contiguous_iterator_tag, but it does so because std::contiguous_iterator now has syntactical differences from std::random_access_iterator. Namely, a contiguous iterator must permit conversion into a pointer to its value_type via std::to_address. This allows you to turn an iterator pair into a pointer pair without having to dereference a potentially non-dereferenceable iterator (such as a past-the-end iterator).
Note also that automatic assignment of iterator categories is based on satisfying the C++17 named requirements, not of the C++20 concepts. Since there is no "contiguous iterator" named requirement (and even if there was, it wouldn't be syntactically determinable), there can be no auto assignment of it.
The reason automatic assignment only works for the C++17 requirements is because the C++20 concepts are defined in terms of std::iterator_traits. So it cannot use the concepts without creating a circular definition.
On a somewhat unrelated note, I'm not sure if it's a good idea to silently generate a value for iterator_category even if the programmer has explicitly stated another category
That's not what the standard does. It only provides one if you don't specify one (outside of one odd quirk mentioned below).
This prints "input iterator concept" and "output iterator tag" because it's missing the comparison operator (which isn't required for the concept).
This is an odd quirk of the new definition of iterator_category, but the quirk does ultimately correctly represent the incoherence of your type.
The primary std::iterator_traits template has 3 possible versions, depending on how you defined your iterator type. If your iterator provides all of the member type aliases except pointer, then it just uses them. If it only provides some of them, then it does a concept check against an exposition-only version of Cpp17InputIterator. If your type fits that, then it uses your type's iterator_category (and if you don't provide one, then it computes one).
However, if your iterator isn't an input iterator, then it checks against the basic Cpp17Iterator. If that fits, then iterator_traits::iterator_category is fixed to be output_iterator_tag. That is certainly a strange choice.
If I add the commented line, this now prints "input iterator concept" and "random access iterator tag", even though it clearly isn't a random access iterator.
But you said it was a random access iterator. The system isn't supposed to override what you said; that was just a quirk of what happens if your type doesn't match input-iterator but still happens to be some kind of iterator.
In any case, if you lie, you lied. Garbage in, garbage out.
I still think it would make sense to check if the concept is satisfied, especially in the case of the "fall-back" output_iterator_tag: Forgetting to write the == operator shouldn't turn an input iterator into an unusable output iterator.
But... that's what it is. Equality testing isn't optional for input iterators. If you can't test it for equality, then it is not an input iterator. Indeed, if the system did as you suggested, that's exactly the tag you would get: an output iterator.
So what's your problem? If you accidentally failed to make your type an input iterator, do you want the system to correctly categorize it as what it is in accord with its behavior or do you want it to forward your mistaken category onward?

C++ for each in on custom collections

Ever since it was introduced, I have been loving the for each in keywords to iterate STL collections. (I'm a very, very big fan of syntactic sugar.)
My question is how can I write a custom collection that can be iterated using these keywords?
Essentially, what API do I need to expose for my collections to be iterable using these keywords?
I apologize if this sounds blunt, but please do not respond with "use boost", "don't write your own collections", or the like. Pursuit of knowledge, my friends. If it's not possible, hey, I can handle that.
I'd also very much prefer not to have to inject an STL iterator into my collections.
Thanks in advance!
Here is a good explanation of iterable data structures (Range-Based loops):
In order to make a data structure iterable, it must work similarly to the existing STL iterators.
There must be begin and end methods that operate on that structure,
either as members or as stand-alone functions, and that return iterators to
the beginning and end of the structure.
The iterator itself must support an operator* method, an operator != method, and an operator++ method, either as members or as stand-alone functions.
Note that C++11 has built-in support for range-based loops without the use of the STL, though the above conditions hold for this as well. You can read about it at the same link above.
It's not really clear from your question whether you're talking about std::for_each defined in the <algorithm> header, or the range-based for loop introduced in C++11.
However, the answer is similar for both.
Both operate on iterators, rather than the collection itself.
So you need to
1. define an iterator type which satisfies the requirements placed on it by the STL (the C++ standard, really). (The main things are that it must define operator++ and operator*, and a couple of other operations and typedefs.)
2. for std::for_each, there is no 2. You're done. You simply pass two such iterators to std::for_each. For the range-based for loop, you need to expose a pair of these iterators via the begin() and end() functions.
And... that's it.
The only tricky part is really creating an iterator which complies with the requirements. Boost (even though you said you didn't want to use it) has a library which aids in implementing custom iterators (Boost.Iterator). There is also the std::iterator class (deprecated since C++17), which is intended as a base class for custom iterator implementations. But neither of these is necessary. Both are just convenience tools to make it easier to create your own iterator.
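To make the requirements for the range-based for loop concrete, here is a small sketch that uses no standard library iterator machinery at all. IntRange and its nested Iterator are made-up names; the iterator provides only operator*, operator++ and operator!=, and the collection provides begin() and end().

```cpp
// A half-open integer range [first, last) usable in a range-based for loop.
class IntRange {
public:
    class Iterator {
    public:
        explicit Iterator(int value) : value_(value) {}

        // The three operations the range-based for loop relies on.
        int operator*() const { return value_; }
        Iterator& operator++() {
            ++value_;
            return *this;
        }
        bool operator!=(const Iterator& other) const { return value_ != other.value_; }

    private:
        int value_;
    };

    IntRange(int first, int last) : first_(first), last_(last) {}

    Iterator begin() const { return Iterator(first_); }
    Iterator end() const { return Iterator(last_); }

private:
    int first_;
    int last_;
};

// Usage: the loop calls begin() and end(), then !=, * and ++ on the iterators.
int sumRange(const IntRange& range) {
    int total = 0;
    for (int v : range) {
        total += v;
    }
    return total;
}
```

Because this iterator has none of the usual typedefs, it is enough for the range-based for loop but not necessarily for the standard algorithms such as std::for_each.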

Most efficient way to process all items in an unknown container?

I'm doing a computation in C++ and it has to be as fast as possible (it is executed 60 times per second with possibly large data). During the computation, a certain set of items has to be processed. However, in different cases, different implementations of the item storage are optimal, so I need to use an abstract class for that.
My question is, what is the most common and most efficient way to do an action with each of the items in C++? (I don't need to change the structure of the container during that.) I have thought of two possible solutions:
1. Make iterators for the storage classes. (They're also mine, so I can add it.) This is common in Java, but doesn't seem very 'C' to me:
class Iterator {
public:
    bool more() const;
    Item * next();
};
2. Add a sort of abstract handler, which would be overridden in the computation part and would include the code to be called on each item:
class Handler {
public:
    virtual void process(Item &item) = 0;
};
(Only a function pointer wouldn't be enough because it has to also bring some other data.)
3. Something completely different?
The second option seems a bit better to me since the items could in fact be processed in a single loop without interruption, but it makes the code quite messy as I would have to make quite a lot of derived classes. What would you suggest?
Thanks.
Edit: To be more exact, the storage data type isn't exactly just an ADT; it has a means of finding only a specific subset of the elements in it based on some parameters, which I then need to process, so I can't prepare all of them in an array or something.
#include <algorithm>
Have a look at the existing containers provided by the C++ standard, and functions such as for_each.
For a comparison of C++ container iteration to interfaces in "modern" languages, see this answer of mine. The other answers have good examples of what the idiomatic C++ way looks like in practice.
Using templated functors, as the standard containers and algorithms do, will definitely give you a speed advantage over virtual dispatch (although sometimes the compiler can devirtualize calls, don't count on it).
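A rough sketch of the templated-functor approach, under made-up assumptions: Item holds a single int, EvenStorage is a hypothetical storage class, and its selection rule (even values only) stands in for the real parameter-based lookup. The callable is a template parameter, so a lambda can carry extra data through its captures and no virtual call is made per item.

```cpp
#include <utility>
#include <vector>

struct Item {
    int value;
};

// Hypothetical storage class; the selection rule is made up for illustration.
class EvenStorage {
public:
    explicit EvenStorage(std::vector<Item> items) : items_(std::move(items)) {}

    // Calls f(item) for every selected item. F is a template parameter, so
    // the compiler sees the concrete callable and can inline the call.
    template <typename F>
    void forEachSelected(F f) {
        for (Item& item : items_) {
            if (item.value % 2 == 0) {
                f(item);
            }
        }
    }

private:
    std::vector<Item> items_;
};

// Usage: the lambda captures 'total' and 'scale', which a plain function
// pointer could not carry along.
int sumSelected(EvenStorage& storage, int scale) {
    int total = 0;
    storage.forEachSelected([&](Item& item) { total += item.value * scale; });
    return total;
}
```

The trade-off is that code calling forEachSelected must know the concrete storage type at compile time, so the computation itself would typically become a template over the storage class rather than taking an abstract base.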
C++ has iterators already. It's not a particularly "Java" thing. (Note that their interface is different, though, and they're much more efficient than their Java equivalents)
As for the second approach, calling a virtual function for every element is going to hurt performance if you're worried about throughput.
If you can (pre-)sort your data so that all objects of the same type are stored consecutively, then you can select the function to call once, and then apply it to all elements of that type. Otherwise, you'll have to go through the indirection/type check of a virtual function or another mechanism to perform the appropriate action for every individual element.
What gave you the impression that iterators are not very C++-like? The standard library is full of them (see this), and includes a wide range of algorithms that can be used to effectively perform tasks on a wide range of standard container types.
If you use the STL containers you can save re-inventing the wheel and get easy access to a wide variety of pre-defined algorithms. This is almost always better than writing your own equivalent container with an ad-hoc iteration solution.
A function template perhaps:
template <typename C>
void process(C & c)
{
    typedef typename C::value_type type;
    for (type & x : c) { do_something_with(x); }
}
The iteration will use the container's iterators, which is generally as efficient as you can get.
You can specialize the template for specific containers.

public: typedef other_class::const_iterator my_class::const_iterator - acceptable?

I'm looking for a way to use std::set::const_iterator as const_iterator of my own class.
My code (which actually behaves correctly and compiles fine) goes like this:
class MyClass{
public:
    typedef std::set<T>::const_iterator const_iterator;
    const_iterator begin() const {
        return mySet->begin();
    }
    const_iterator end() const {
        return mySet->end();
    }
}; // MyClass
So, my question is: Is my way of using the const_iterator acceptable, and is it in line with how the STL is intended to be used?
Edit: Related question and answers at How to typedef the iterator of a nested container?
The idea of having the internal type is so that generic code can use your class without prior knowledge of the implementation. The main point is that there is a const_iterator, and that it has the semantics of a constant iterator. The fact that you are borrowing the iterator from the internal type is just an implementation detail that calling code should not care about.
That is, the problem would be not defining the const_iterator internal type, as that would increase coupling in user code, where they would have to use std::set<T>::const_iterator explicitly, and that in turn makes the type of the member part of the interface (i.e. you can no longer change the implementation of the member without breaking user code).
Have a look at other C++ Standard Library container adapters like std::queue. std::queue is just an adapter on top of an underlying container (by default a std::deque).
This is exactly the method used in those adapters. Given that, I would say your implementation is fine.
Yes, of course. You may store a std::set<T>::const_iterator anywhere you like, and making an alias of the type is even more okay.
It's not particularly encapsulated, but a perfectly valid technique.
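For reference, a self-contained variant of the class from the question could look like the sketch below. It assumes MyClass is a template over T and, unlike the question, holds the set by value; insert() and countElements() are hypothetical helpers added only so the example can be exercised.

```cpp
#include <set>

template <typename T>
class MyClass {
public:
    // 'typename' is needed because std::set<T>::const_iterator depends on T.
    typedef typename std::set<T>::const_iterator const_iterator;

    void insert(const T& value) { mySet.insert(value); }  // hypothetical helper

    const_iterator begin() const { return mySet.begin(); }
    const_iterator end() const { return mySet.end(); }

private:
    std::set<T> mySet;  // held by value here; the question uses a pointer
};

// Generic code only relies on the nested const_iterator, begin() and end(),
// not on the fact that a std::set is used internally.
template <typename Container>
int countElements(const Container& c) {
    int n = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        ++n;
    }
    return n;
}
```

If the internal container later changes to, say, a sorted std::vector, only the typedef and the member need to change; callers written against MyClass<T>::const_iterator keep compiling.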