What is the meaning of "generic programming" in C++?

What is the meaning of generic programming in c++?
Also, I am trying to figure out what container, iterator, and different types of them mean.

Generic programming means that you do not write source code that is compiled as-is; instead you write "templates" of source code that the compiler turns into actual code during compilation. The simplest examples of generic programming are container classes such as arrays, lists or maps that hold a collection of other objects. But there is much more to generic programming. In C++ the same mechanism also enables template metaprogramming, that is, writing programs that are evaluated at compile time.
A basic example of generic programming is a container template: in a statically typed language like C++ you would otherwise have to declare separate containers for integers, floats and other types, or deal with pointers to void and thereby lose all type information. Templates, the C++ mechanism for generic programming, lift this constraint by letting you define classes in which one or more types are left unspecified when you define the class. When you instantiate the template later, you tell the compiler which type it should use to create the class from the template. Example:
template<typename T>
class MyContainer
{
// Container that deals with an arbitrary type T
};
int main()
{
// Make MyContainer take just ints.
MyContainer<int> intContainer;
}
Templates are generic because the compiler translates the template into actual code. Note that if you never instantiate your template, no code is generated for it at all. On the other hand, if you declare a MyContainer<int>, a MyContainer<float>, and a MyContainer<String>, the compiler creates three versions of your code, each with a different type. Some optimizations may be involved, but basically your template is instantiated as three new types.
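To make the instantiation step concrete, here is a minimal sketch of what MyContainer could look like with a few members filled in. The member names (add, get, size) and the use of std::vector for storage are assumptions made for illustration; they are not part of the original example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal container: stores values of one arbitrary type T.
template <typename T>
class MyContainer {
public:
    // Append a copy of value.
    void add(const T& value) { items_.push_back(value); }

    // Number of stored elements.
    std::size_t size() const { return items_.size(); }

    // Bounds-checked read access.
    const T& get(std::size_t index) const { return items_.at(index); }

private:
    std::vector<T> items_;  // assumption: std::vector does the actual storage
};
```

Declaring a MyContainer<int> and a MyContainer<std::string> makes the compiler generate two unrelated classes from this one definition; member functions you never call are not instantiated.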
Iterators are a design pattern popularized by the seminal book "Design Patterns" by Gamma et al. It is a pattern for iterating over the contents of a container class. Unlike a for-loop with an index, an iterator is an instance of a class that points to a member of the container and gives you a unified interface for traversing the container as well as accessing its members. Take a look at this example:
// Instantiate template QList with type int
QList<int> myList;
// put some ints into myList
// Copy-construct an iterator that points to the
// first member of the list.
QList<int>::iterator i = myList.begin();
// Iterate through the list
while (i != myList.end()) {
std::cout << *i << std::endl;
i++;
}
In this C++ example I'm instantiating the template QList with type int. QList is a container class that stores a list of objects. In this example we use it to store integers.
Then I create an iterator i to traverse the list. myList.begin() returns an iterator that points to the first element of the list. We can compare the iterator with another iterator, myList.end(), that points just past the last element of the list. If both iterators are equal, we know that we have passed the last element. In the loop we print the element by accessing it with *i and move to the next element with i++.
Note that in this example * and ++ are overloaded operators, implemented by the iterator class. In a programming language without operator overloading there could be methods like i.element() or i.next() that do the same job. It's important to see that i is not a pointer but an object of a class that merely mimics the behaviour of a pointer.
What's the benefit of iterators? They provide a unified way to access the members of a container class, completely independent of how the container class is implemented internally. No matter whether you want to traverse a list, map or tree, the iterator classes (should) always work the same way.
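As a rough illustration of that uniformity, the sketch below uses standard library containers instead of QList (an assumption, so that it compiles without Qt). The helper name sum_all is hypothetical.

```cpp
#include <list>
#include <set>
#include <vector>

// Sums the elements of any container of ints that offers begin()/end().
// The loop never needs to know whether the container is an array,
// a linked list or a tree.
template <typename Container>
int sum_all(const Container& c) {
    int total = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}
```

The same function body serves std::vector<int>, std::list<int> and std::set<int>; only the iterator type the compiler substitutes is different.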

Container
In C++, a container is a class that allows you to store objects. For example the standard library std::vector<T> is a resizable array which stores objects of some type T. In order to be formally considered a container class, it must expose certain functionality in order to facilitate generic programming. I could quote the exact requirements from the C++ standard, but for most purposes, the container classes that matter are the ones from the standard library: vector, deque, list, map, set and multimap/multiset.
One of the important requirements is that they must allow iterator access.
Iterator
"Iterator" can mean two things here: It is the name of a design pattern, but in C++ it is also the name of a specific expression of that design pattern. A C++ iterator is a type that allows traversal over a sequence of elements using a pointer-like syntax.
For example, if you have an array int a[10], you can use a plain pointer as an iterator:
int* first = a; // create an iterator that points to the beginning of the array
++first; // make the iterator point to the second element
int i = *first; // get the value of the element pointed to by the iterator
int* last = a+10; //create an "end" iterator, one which points one past the end of the array
If I had a linked list, such as std::list<int> l, I could do much the same, although now my iterators are no longer just pointers, but instead a class type implemented to work specifically with std::list:
std::list<int>::iterator first = l.begin(); // create an iterator that points to the beginning of the list
++first; // make the iterator point to the second element
int i = *first; // get the value of the element pointed to by the iterator
std::list<int>::iterator last = l.end(); //create an "end" iterator, one which points one past the end of the list
or with a vector std::vector<int> v:
std::vector<int>::iterator first = v.begin(); // create an iterator that points to the beginning of the vector
++first; // make the iterator point to the second element
int i = *first; // get the value of the element pointed to by the iterator
std::vector<int>::iterator last = v.end(); //create an "end" iterator, one which points one past the end of the vector
The important thing about iterators is that they give us a uniform syntax for traversing sequences of elements, regardless of how the sequence is stored in memory (or even whether it is stored in memory at all). An iterator could be written to iterate over the contents of a database on disk. Or we can use iterator wrappers to make a stream such as std::cin look like a sequence of objects too:
std::istream_iterator<int> first(std::cin); // create an iterator that reads ints from std::cin
++first; // make the iterator point to the second element
int i = *first; // get the value of the element pointed to by the iterator
std::istream_iterator<int> last; //create an "end" iterator, which marks the end of the stream
although because this wraps a regular stream, it is a more limited type of iterator (you can't move backwards, for example, which means not all of the following algorithms work with stream iterators).
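A small sketch of the stream case, assuming we read from a std::istringstream instead of std::cin so that the input is fixed. The function name sum_from_text is hypothetical.

```cpp
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>

// Reads whitespace-separated ints from text and adds them up.
int sum_from_text(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<int> first(in);  // attempts to read the first int right away
    std::istream_iterator<int> last;       // default-constructed: marks end of stream
    // std::accumulate only needs single-pass input iterators,
    // so it works on a stream.
    return std::accumulate(first, last, 0);
}
```

For the text "1 2 3" this yields 6. An algorithm that needs to revisit elements, such as std::sort, would not compile with these iterators.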
Now, given any of these iterator types, we can use all the standard library algorithms which are designed to work with iterators. For example, to find the first element in the sequence with value 4:
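std::find(first, last, 4); // return an iterator to the first element equal to 4 in the interval [first, last), or last if there is none
Or we can sort the sequence (std::sort needs random-access iterators, so this works with the array and the vector, but not with std::list or stream iterators):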
std::sort(first, last);
or if we write a function which squares an int, such as this:
int square(int i) { return i * i; }
then we can apply it to the entire sequence:
// for every element in the range [first, last), apply the square function, and output the result into the sequence starting with first
std::transform(first, last, first, square);
That's the advantage of iterators: they abstract away the details of the container, so that we can apply generic operations to any sequence. Thanks to iterators, the same find or transform implementation works with linked lists as well as arrays, or even with your own home-made container classes.
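The following sketch puts the square/transform example into a form that can be tried on several sequence types. The wrapper name square_range is hypothetical; std::transform is the standard algorithm used above.

```cpp
#include <algorithm>
#include <list>
#include <vector>

int square(int i) { return i * i; }

// Squares every element of [first, last) in place.
// Iterator can be a raw pointer, a list iterator, a vector iterator, ...
template <typename Iterator>
void square_range(Iterator first, Iterator last) {
    std::transform(first, last, first, square);
}
```

Calling square_range(a, a + 3) on a plain array and square_range(l.begin(), l.end()) on a std::list<int> runs the same algorithm code; only the iterator type differs.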
Generic Programming
Generic programming is basically the idea that your code should be as generic as possible. As shown in the iterator examples above, we come up with a common set of functionality that a type must support in order to be called an iterator, and then we write algorithms that work with any iterator type.
Compare this with traditional object-oriented programming, where iterators would have to "prove" that they're iterators by inheriting from some kind of IIterator interface. That would prevent us from using raw pointers as iterators, so we'd lose genericity.
In C++, with generic programming, we don't need the official interface. We just write the algorithms using templates, so they accept any type which just so happens to look like an iterator, regardless of where, when and how they're defined, and whether or not they derive from a common base class or interface.
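A minimal sketch of that point, assuming a hypothetical algorithm count_equal and a hypothetical iterator type Counter. Neither comes from the standard library, and Counter derives from nothing.

```cpp
#include <cstddef>

// Works with anything that supports !=, prefix ++ and unary *.
template <typename Iterator, typename T>
std::size_t count_equal(Iterator first, Iterator last, const T& value) {
    std::size_t n = 0;
    for (; first != last; ++first)
        if (*first == value) ++n;
    return n;
}

// Hypothetical iterator over the integers value, value + 1, ...
// It has no base class; it just happens to look like an iterator.
struct Counter {
    int value;
    int operator*() const { return value; }
    Counter& operator++() { ++value; return *this; }
    bool operator!=(const Counter& other) const { return value != other.value; }
};
```

count_equal accepts both int* (a raw pointer into an array) and Counter, even though the two types share no interface in the object-oriented sense.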

In the simplest definition, generic programming is a style of computer programming in which algorithms are written in terms of to-be-specified-later types that are then instantiated when needed for specific types provided as parameters.

As a point of historical interest, the versions of C++ that came before templates were part of the language had a "generic.h" that contained preprocessor macros which could be expanded to class declarations. So you could have a generic schema ("template") for a class which you could vary by passing certain parameters into the macros when you expanded them to actual class declarations.
However, preprocessor macros are not type safe and a bit clumsy to handle, and their use in C++ code significantly declined due to these reasons; C++ adopted the more versatile templates as elements of the language, but the term "generic" programming lived on. "Generics" are now used in other programming languages as glorified type casts.
Other than that, the question has already been expertly answered.

generic programming: pretty much just involves templates.
container: A struct or class, which contains its own data and methods that act on that data.
Iterator: It is a pointer to some memory address that you can iterate through (like an array).
Correct me if wrong on any of the above.

Generic programming relies on the concept of type parameters, which make it possible to design classes and methods that defer the specification of one or more types until the class or method is declared and instantiated by client code.

Alex Stepanov, pioneer of generic programming and author of the STL, says in From Mathematics to Generic Programming (Stepanov + Rose):
"Generic programming is an approach to programming that focuses on
designing algorithms and data structures so that they work in the most
general setting without loss of efficiency...
"What about all that stuff about templates and iterator traits?" Those
are tools that...support generic programming...But generic programming
itself is more of an attitude toward programming than a particular set
of tools...
"The components of a well-written generic program are easier to use
and modify than those of a program whose data structures, algorithms,
and interfaces hardcode unnecessary assumptions about a specific
application
"Although the essence of generic programming is abstraction,
abstractions do not spring into existence fully formed. To see how to
make something more general, you need to start with something
concrete. In particular, you need to understand the specifics of a
particular domain to discover the right abstractions."
"So where does this generic programming attitude come from, and how do
you learn it? It comes from mathematics, and especially from a branch
of mathematics called abstract algebra."
Let's start with a concrete algorithm and abstract away the non-essential details.
Let's use linear search as an example. We're looking for an int in the range between the pointers begin (inclusive) and end (exclusive) aka [begin,end):
int* find_int(int* begin, int* end, int target){
for(; begin != end; ++begin)
if(*begin == target) break;
return begin;
}
But the code to find a char or a float would be the same (s/int/float/)
float* find_float(float* begin, float* end, float target){
for(; begin != end; ++begin)
if(*begin == target) break;
return begin;
}
Using templates, we can generalize this:
template<class T>
T* find_array(T* begin, T* end, T target){
for(; begin != end; ++begin)
if(*begin == target) break;
return begin;
}
What if you want to search in a singly linked list? Let's ignore memory management for now and consider a linked list struct like
template<class T>
struct cell {
T elt;
cell* next;
};
Then find_list would look like
template<class T>
cell<T>* find_list(cell<T>* lst, T x){
for(; lst != nullptr; lst = lst->next)
if(lst->elt == x) break;
return lst;
}
This looks superficially different, but the core process is the same: single step through the search space until we find x or reach the end. Here, cell<T>* fills the role T* did in find_array, nullptr fills the role end did, lst = lst->next fills the role of ++begin, and lst->elt fills the role of *begin.
Instead of rewriting find for each data structure, look at what guarantees you need for find to work (the abstract guarantees on your input types are called a concept in analogy to an algebraic structure). You need a way to refer to a place in a data structure, called an iterator. For find, our iterator only needs three things:
we can read the data it points to (operator*)
we can advance it by a single step (operator++)
we can check if it reached the end (operator==).
An iterator with these capabilities is called a std::input_iterator.
What about T? We just need to know we can compare by ==. I wrote it to pass by copy, but if we pass by reference instead we can get rid of that:
#include <concepts> // for equality_comparable_with
#include <iterator> // for input_iterator
template<class I, class T>
requires std::input_iterator<I>
&& std::equality_comparable_with<std::iter_value_t<I>, T>
// ensures we can compare with ==
I find(I begin, I end, T const& target){
for(; begin != end; ++begin)
if(*begin == target) break;
return begin;
}
This version of find will work on any data structure that exposes input iterators. All the standard library's containers do, usually through methods called begin and end.
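To tie this back to the linked list, here is a sketch of an iterator for the cell struct from earlier, so that one generic loop replaces find_list. The names cell_iterator and find_any are hypothetical, and find_any deliberately omits the C++20 requires clause so that it also compiles with older standards.

```cpp
template <class T>
struct cell {
    T elt;
    cell* next;
};

// Hypothetical iterator over a chain of cell<T>; a null node marks the end.
template <class T>
struct cell_iterator {
    cell<T>* node;
    T& operator*() const { return node->elt; }                       // read the data
    cell_iterator& operator++() { node = node->next; return *this; } // single step
    bool operator==(const cell_iterator& o) const { return node == o.node; }
    bool operator!=(const cell_iterator& o) const { return node != o.node; }
};

// Same loop as find, written without concepts (an assumption for portability).
template <class I, class T>
I find_any(I begin, I end, const T& target) {
    for (; begin != end; ++begin)
        if (*begin == target) break;
    return begin;
}
```

With this, find_any(cell_iterator<int>{head}, cell_iterator<int>{nullptr}, x) searches the list, and find_any(a, a + n, x) searches an array with the same code.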
If we wanted to do something more complicated, like sorting, we can use stronger guarantees on our iterators, like random access.

Related

How to define a C++ iterator that skips tombstones

I am implementing a container that presents a map-like interface. The physical implementation is a std::vector<std::pair<K*, T>>. A K object remembers its assigned position in the vector. It is possible for a K object to get destroyed. In that case its remembered index is used to zero out its corresponding key pointer within the vector, creating a tombstone.
I would like to expose the full traditional collection of iterators, though I think that they need only claim to be forward_iterators (see next).
I want to be able to use range-based for loop iteration to return only the non-tombstoned elements. Further, I would like the implementation of my iterators to be a single pointer (i.e. no back pointer to the container).
Since the range-based for loop is pretested I think that I can implement tombstone skipping within the inequality predicate.
bool operator != (MyIterator& cursor, MyIterator stop) {
while (cursor != stop) {
if (cursor->first)
return true;
++cursor;
}
return false;
}
Is this a reasonable approach? If yes, is there a simple way for me to override the inequality operator of std::vector's iterators instead of implementing my iterators from scratch?
If this is not a reasonable approach, what would be better?
Is this a reasonable approach?
No. (Keep in mind that operator!= can be used outside a range-based for loop.)
Your operator does not accept a const object as its first parameter (meaning a const vector::iterator).
You have undefined behavior if the first parameter comes after the second (e.g. if someone tests end != cur instead of cur != end).
You get this weird case where, given iterators a and b, it might be that *a is different than *b, but if you check if (a != b) then you find that the iterators are equal and then *a is the same as *b. This probably wreaks havoc with the multipass guarantee of forward iterators (but the situation is bizarre enough that I would want to check the standard's precise wording before passing judgement). Messing with people's expectations is inadvisable.
There is no simple way to override the inequality operator of std::vector's iterators.
If this is not a reasonable approach, what would be better?
You already know what would be better. You're just shying away from it.
Implement your own iterators from scratch. Wrapping your vector in your own class has the benefit that only the code for that class has to be aware that tombstones exist.
Caveat: Document that the conditions that create a tombstone also invalidate iterators to that element. (Invalid iterators are excluded from most iterator requirements, such as the multipass guarantee.)
OR
While your implementation makes a poor operator!=, it could be a fine update or check function. There's this little-known secret that C++ has more looping structures than just range-based for loops. You could make use of one of these, for example:
for ( cur = vec.begin(); skip_tombstones(cur, vec.end()); ++cur ) {
auto& element = *cur;
// ... use element ...
}
where skip_tombstones() is basically your operator!= renamed. If not much code needs to iterate over the vector, this might be a reasonable option, even in the long term.
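For the "from scratch" route, one possible shape is sketched below. It is only a sketch under assumptions: the class name LiveIterator and the helper count_live are hypothetical, and the iterator stores two pointers (current position and end of storage) rather than the single pointer the question hoped for, because skipping needs to know where to stop.

```cpp
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical forward iterator over (key pointer, value) pairs that
// skips entries whose key pointer is null (tombstones).
template <typename K, typename T>
class LiveIterator {
public:
    typedef std::pair<K*, T> value_type;
    typedef value_type& reference;
    typedef value_type* pointer;
    typedef std::ptrdiff_t difference_type;
    typedef std::forward_iterator_tag iterator_category;

    LiveIterator() : cur_(nullptr), end_(nullptr) {}
    LiveIterator(pointer cur, pointer end) : cur_(cur), end_(end) { skip(); }

    reference operator*() const { return *cur_; }
    pointer operator->() const { return cur_; }
    LiveIterator& operator++() { ++cur_; skip(); return *this; }
    LiveIterator operator++(int) { LiveIterator old(*this); ++*this; return old; }
    // Comparison has no side effects; skipping happens on construction and ++.
    bool operator==(const LiveIterator& o) const { return cur_ == o.cur_; }
    bool operator!=(const LiveIterator& o) const { return cur_ != o.cur_; }

private:
    // Advance past tombstones, never beyond the end of storage.
    void skip() { while (cur_ != end_ && cur_->first == nullptr) ++cur_; }
    pointer cur_;
    pointer end_;
};

// Counts the live (non-tombstoned) entries of v.
template <typename K, typename T>
std::size_t count_live(std::vector<std::pair<K*, T> >& v) {
    std::pair<K*, T>* first = v.data();
    std::pair<K*, T>* last = v.data() + v.size();
    std::size_t n = 0;
    for (LiveIterator<K, T> it(first, last), end(last, last); it != end; ++it) ++n;
    return n;
}
```

A wrapper class would return LiveIterator objects from its begin() and end(), which is enough for a range-based for loop. As the caveat above says, creating a tombstone should be documented as invalidating iterators to that element.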

Designing iterators for a Matrix class

I am creating a Matrix<T> class. While implementing iterators for it I stumbled upon a design conundrum.
Internally, the matrix holds the data in a std::vector<T> (row-major).
One way of iterating through the matrix is by double iterators (nested) (specified by the row_double template parameter):
for (auto it_i = mat.Begin<row_double>(); it_i < mat.End<row_double>(); ++it_i) {
for (auto it_j = it_i->Begin(); it_j < it_i->End(); ++it_j) {
cout << *it_j << " ";
}
cout << endl;
}
The first iterator Matrix<T>::Iterator<row_double>:
iterates through the rows of the matrix
has a RowProxy member
Dereferencing it returns a RowProxy.
RowProxy returns std::vector<T>::iterator iterators via methods like Begin() and End().
My idea is for the RowProxy to know the beginning of the line and the size of the line (number of matrix columns).
The problem is how RowProxy holds the beginning of the line reference:
My first approach was to make the beginning of the line a std::vector<T>::iterator.
The problem is that in Visual Studio the iterator is aware of the vector and there are debug checks for iterator arithmetic. It throws an error when constructing the ReverseEnd iterator (for the line before the first line): the beginning of the line is num_columns before the vector start. Please note that this has nothing to do with dereferencing (which is UB). I can't even create the iterator.
My second approach was to make the beginning of the line a raw pointer T *.
The problem here is that RowProxy needs to return std::vector<T>::iterator (via its own Begin etc.) and I cannot (don't know how to) construct a std::vector<T>::iterator from a T * in a standard way. (I didn't find any reference specific to std::vector<T>::iterator, just iterator concepts. In Visual Studio there seems to be a constructor (T *, std::vector<T> *), in gcc a (T *); neither one works in the other compiler.)
The solution that I see now is to make my own iterator, identical to std::vector<T>::iterator but not bound to any vector and constructible from T *, and make RowProxy return that. But this really seems like reinventing the wheel.
Since this is part of a library (and the code resides in headers), compiler options are out of the question (including macros that control compiler options because they modify the whole behaviour of the program that includes the header, not just the behaviour of the library code). Also the solution must be conforming to the standard. The language is C++11.
The simplest way to do this, as stated above, is to use Eigen; it is really a very nice package. The next simplest thing would be to store the size information in your class; then you have a very nice way of getting a specific element out of your matrix. Just write an (i,j) operator that returns vector[i*rowlength + j] (row i, column j, for the row-major layout described in the question). Iterators should work well for looping over the whole vector in one loop; I'm not sure how much sense it makes to loop in two.
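A minimal sketch of the second suggestion, assuming row-major storage as in the question, so that the element in row i and column j lives at index i * cols + j. The class layout and member names are hypothetical and far smaller than a real matrix class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix: row-major storage in one std::vector<T>.
template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Element in row i, column j (no bounds checking).
    T& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    const T& operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    // Iterators over the whole matrix in storage order.
    typename std::vector<T>::iterator begin() { return data_.begin(); }
    typename std::vector<T>::iterator end() { return data_.end(); }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<T> data_;
};
```

Element access goes through m(i, j), while a single loop from m.begin() to m.end() visits every element without any proxy objects or out-of-range iterators.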

Need clarification about C++ std::iterator

Reading a C++ book I encountered the following example on using iterators:
vector<string::iterator> find_all(string& s, char c)
{
vector<string::iterator> res;
for(auto p = s.begin(); p != s.end(); ++p)
if(*p == c)
res.push_back(p);
return res;
}
void test()
{
string m {"Mary had a little lamb"};
for(auto p : find_all(m, 'a'))
if(*p != 'a')
cerr << "a bug!\n";
}
I'm a little confused about what the vector returned by find_all() contains. Is it essentially "pointers" to the elements of the string m created above it?
Thanks.
I'm a little confused about what the vector returned by find_all() contains. Is it essentially "pointers" to the elements of the string m created above it?
Mostly; iterators aren't (necessarily) pointers, they are somewhat a generalization of the pointer concept. They are used to point to specific objects stored inside containers (in this case, characters inside a string), you can use them to move between the elements of the string (via the usual arithmetic operators - when they are supported) and you "dereference" them with * to get a reference to the pointed object.
Notice that, depending on the container, they are implemented differently and provide different features; an iterator to a std::list, for example, will allow ++, -- and *, but not moving to arbitrary locations, and an iterator to a singly-linked list won't even support --, while typically iterators to array-like data structures (like vector or string) will allow completely free movement.
To refer to elements in array-like structures often one just stores indexes, since they are cheap to store and use; for other structures, instead, storing iterators may be more convenient.
For example, just yesterday I had some code which walked an unordered_map<string, int> (a hashtable that mapped some words to their occurrences) to "take note" of some of the (string, int) pairs to use them later.
The equivalent of storing vector indexes here would have been storing the hashtable's keys, but (1) they are strings (so they are moderately costly to allocate and handle), and (2) to use them to reach the corresponding object I had to do another hashtable lookup later. Instead, storing iterators in a vector guarantees no hassle for storing strings (iterators are intended to be cheap to handle) and no need to perform a lookup again.
Yes, iterators are like pointers. std::string::iterator can even be an alias for char *, although it's usually not.
In general, iterators provide a subset of pointer functionality. Which subset depends on the iterator. Your book probably covers this, but all iterators can be dereferenced (*, but there is never a reference & operation) and incremented (++), then some additionally provide --, and some add + and - on top of that.
In this case, the function seems to assume you will only be querying the values of the iterators without modifying the string. Because the allocation block used for string storage may change as the string grows, iterators (like pointers) into the string may be invalidated. This is why std::string member functions like string::find return index numbers, not iterators.
A vector of indexes could be a better design choice, but this is good enough for an example.
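A sketch of that index-based alternative; the name find_all_indexes is hypothetical and not from the book.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Variant of find_all that records positions instead of iterators.
// Positions stay meaningful even if the string later reallocates.
std::vector<std::size_t> find_all_indexes(const std::string& s, char c) {
    std::vector<std::size_t> res;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (s[i] == c)
            res.push_back(i);
    return res;
}
```

For "Mary had a little lamb" and 'a' this returns the positions 1, 6, 9 and 19. Appending to the string afterwards does not disturb them, whereas stored iterators could be invalidated by the same append.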

Get a pointer to STL container an iterator is referencing?

For example, the following is possible:
std::set<int> s;
std::set<int>::iterator it = s.begin();
I wonder if the opposite is possible, say,
std::set<int>* pSet = it->getContainer(); // something like this...
No, there is no portable way to do this.
An iterator may not even have a reference to the container. For example, an implementation could use T* as the iterator type for both std::array<T, N> and std::vector<T>, since both store their elements as arrays.
In addition, iterators are far more general than containers, and not all iterators point into containers (for example, there are input and output iterators that read from and write to streams).
No. You must remember the container that an iterator came from, at the time that you find the iterator.
A possible reason for this restriction is that pointers were meant to be valid iterators and there's no way to ask a pointer to figure out where it came from (e.g. if you point 4 elements into an array, how from that pointer alone can you tell where the beginning of the array is?).
It is possible with at least one of the std iterators and some trickery.
The std::back_insert_iterator needs a pointer to the container to call its push_back method. Moreover this pointer is protected only.
#include <iterator>
#include <vector>
template <typename Container>
struct get_a_pointer_iterator : std::back_insert_iterator<Container> {
typedef std::back_insert_iterator<Container> base;
get_a_pointer_iterator(Container& c) : base(c) {}
Container* getPointer(){ return base::container;}
};
#include <iostream>
int main() {
std::vector<int> x{1};
auto p = get_a_pointer_iterator<std::vector<int>>(x);
std::cout << (*p.getPointer()).at(0);
}
This is of course of no practical use, but merely an example of a std iterator that indeed carries a pointer to its container, though a quite special one (e.g. incrementing a std::back_insert_iterator is a no-op). The whole point of using iterators is not to know where the elements are coming from. On the other hand, if you ever wanted an iterator that lets you get a pointer to the container, you could write one.

C++ vector insights

I am a little bit frustrated with how to use vectors in C++. I use them widely, though I am not exactly certain how I use them. Below are my questions:
1. If I have a vector, let's say std::vector<CString> v_strMyVector, with (int)v_strMyVector.size() > i, can I access the i-th member with v_strMyVector[i] == "xxxx"? (It works, though why?)
2. Do I always need to define an iterator to go to the beginning of the vector and loop over its members?
3. What is the purpose of an iterator if I have access to all members of the vector directly (see 1)?
Thanks in advance,
Sun
It works only because there's no bounds checking for operator[], for performance reasons. Accessing an index outside the vector's bounds results in undefined behavior. If you use the safer v_strMyVector.at(i), an out-of-range index throws a std::out_of_range exception.
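A small sketch of the at() behaviour, using std::string instead of CString (an assumption, so that it compiles without MFC). The helper name at_throws is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Returns true if v.at(i) rejects the index by throwing std::out_of_range.
bool at_throws(const std::vector<std::string>& v, std::size_t i) {
    try {
        (void)v.at(i);  // bounds-checked access; the result is not needed here
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

For a vector with two elements, at_throws(v, 1) is false and at_throws(v, 2) is true, whereas v[2] would be undefined behavior with no diagnostic required.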
It's because the operator[] returns a reference.
Since vectors can be accessed randomly in O(1) time, looping by index or iterator makes no performance difference.
The iterator lets you write an algorithm independent of the container. This iterator pattern is used a lot in the <algorithm> library to make writing generic code easier, e.g. instead of needing N members for each of the M containers (i.e. writing M*N functions)
std::vector<T>::find(x)
std::list<T>::find(x)
std::deque<T>::find(x)
...
std::vector<T>::count(x)
std::list<T>::count(x)
std::deque<T>::count(x)
...
we just need N templates
find(iter_begin, iter_end, x);
count(iter_begin, iter_end, x);
...
and each of the M containers provides the iterators, reducing the number of functions needed to just M+N.
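The M+N idea can be sketched with the real std::find and std::count. The wrapper names contains and occurrences are hypothetical conveniences, not standard functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <list>
#include <vector>

// One helper covers every container that exposes begin()/end().
template <typename Container, typename T>
bool contains(const Container& c, const T& x) {
    return std::find(c.begin(), c.end(), x) != c.end();
}

// Likewise for counting: written once, usable with vector, list, deque, ...
template <typename Container, typename T>
std::size_t occurrences(const Container& c, const T& x) {
    return static_cast<std::size_t>(std::count(c.begin(), c.end(), x));
}
```

Neither std::vector, std::list nor std::deque has a find or count member; each only supplies iterators, and the two algorithms are written once.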
1. It returns a reference.
2. No, because vector has random access. However, you do for other types (e.g. list, which is a doubly-linked list).
3. To unify all the collections (along with other types, like arrays). That way you can use algorithms like std::copy on any type that meets the requirements.
Regarding your second point, the idiomatic C++ way is not to loop at all, but to use algorithms (if feasible).
Manual looping for output:
for (std::vector<std::string>::iterator it = vec.begin(); it != vec.end(); ++it)
{
std::cout << *it << "\n";
}
Algorithm:
std::copy(vec.begin(), vec.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
Manual looping for calling a member function:
for (std::vector<Drawable*>::iterator it = vec.begin(); it != vec.end(); ++it)
{
(*it)->draw();
}
Algorithm:
std::for_each(vec.begin(), vec.end(), std::mem_fn(&Drawable::draw)); // std::mem_fn (from <functional>) replaces std::mem_fun, which was removed in C++17
Hope that helps.
Works because the [] operator is overloaded:
reference operator[](size_type n)
See http://www.sgi.com/tech/stl/Vector.html
Traversing any STL collection with an iterator is the de facto standard.
I think one advantage is if you replace vector by another collection, all of your code would continue to work.
That's the idea of vectors, they provide direct access to all items, much as regular arrays. Internally, vectors are represented as dynamically allocated, contiguous memory areas. The operator [] is defined to mimic semantics of the regular array.
Having an iterator is not really required; you may as well use an index variable that goes from 0 to v_strMyVector.size()-1, as you would do with a regular array:
for (std::size_t i = 0; i < v_strMyVector.size(); ++i) {
...
}
That said, using an iterator is considered to be a good style by many, because...
Using an iterator makes it easier to replace the underlying container type, e.g. from std::vector<> to std::list<>. Iterators may also be used with STL algorithms, such as std::sort().
std::vector is a sequence type that provides constant-time random access. You can get a reference to any item in constant time, but you pay for it when inserting into or deleting from the vector (other than at the end), as these can be very expensive operations. You do not need to use iterators when accessing the contents of the vector, but it does support them.