I want to find the minimum element of a filtered list. In Python, I would write:
it = (x for x in [1, 8, 4, 3] if x % 2 == 0)
min(it, default=None)
I hoped that the C++ equivalent would read something like:
const std::vector<int> array {1, 8, 4, 3};
const auto arr_end = std::end(array);
auto it = std::find_if(std::begin(array), arr_end, [](int value) { return value % 2 == 0; });
auto jt = std::min_element(it, arr_end);
if (jt != arr_end) {
std::cout << "Min even element is: " << *jt << std::endl;
} else {
std::cout << "No even element exists!" << std::endl;
}
The expected result is 4, but of course the actual result is 3. The reason: find_if skips to 8. Then from 8 to end the min element is chosen, which is 3.
My question: Is there a way to create an iterator over all even values that can be used to find the minimum element? I am not allowed to use Boost, create a copy, or write to array. We are using C++17.
There isn't an answer in std as of C++17. In C++20 you can use std::ranges::filter_view. Outside of std you can use ranges::filter_view from the range-v3 library, which was the demonstration implementation for the C++20 ranges proposal.
auto filtered = ranges::filter_view(array, [](int value) { return value % 2 == 0; });
auto it = std::min_element(filtered.begin(), filtered.end());
if (it != filtered.end()) {
std::cout << "Min even element is: " << *it << std::endl;
} else {
std::cout << "No even element exists!" << std::endl;
}
My question: Is there a way to create an iterator over all even values that can be used to find the minimum element?
Yes!
It's slightly unfortunate that you're limited to C++17 with no Boost, because you ideally want ranges - specifically ranges::filter_view etc. which was added in C++20, and preceded by the Boost.Range library.
You may possibly be able to use the intermediate experimental range extension.
If none of those are viable, you can of course write your own filtered_iterator to use with std::min_element.
It's not much fun: although it's probably more reusable (and easier to test) than encoding all the logic into a single lambda, it's a lot of work if you're not planning to reuse it. Also, C++ iterators aren't ideally suited to emulating a Python-style generator, as demonstrated by the redundant end iterator e_ and the copy-assignment operator. You can't elide the end & predicate members of the filtered end iterator either, because both iterators usually need to be the same type.
template <typename BaseIterator, typename UnaryPredicate>
class filter_iterator
{
BaseIterator i_;
BaseIterator e_;
UnaryPredicate pred_;
public:
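// Member types that std::iterator_traits looks for (from <iterator>).
// forward_iterator_tag assumes BaseIterator is at least a forward iterator.
using iterator_category = std::forward_iterator_tag;
using difference_type = typename std::iterator_traits<BaseIterator>::difference_type;
using pointer = typename std::iterator_traits<BaseIterator>::pointer;
using reference = typename std::iterator_traits<BaseIterator>::reference;
using value_type = typename std::iterator_traits<BaseIterator>::value_type;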
filter_iterator(filter_iterator &&) = default;
filter_iterator(filter_iterator const&) = default;
filter_iterator(BaseIterator i, BaseIterator e, UnaryPredicate p)
: i_(i), e_(e), pred_(p)
{}
filter_iterator& operator=(filter_iterator &&) = default;
filter_iterator& operator=(filter_iterator const& other) {
i_ = other.i_;
e_ = other.e_;
// This is questionable, because we can't copy the predicate without adding
// a level of indirection (ie, always wrapping it in std::function).
// For now, just assume it is stateless for convenience.
return *this;
}
bool operator==(filter_iterator const& other) const
{
return i_ == other.i_;
}
filter_iterator& operator++() {
// We could check i_ is not already e_ here,
// but the caller is required to check this outside anyway
i_ = std::find_if(std::next(i_), e_, pred_);
return *this;
}
filter_iterator operator++(int) {
// Postfix: advance *this, but return the value it had before.
filter_iterator old(*this);
++*this;
return old;
}
reference operator*() { return *i_; }
std::add_const_t<reference> operator*() const { return *i_; }
};
template <typename BaseIterator, typename UnaryPredicate>
bool operator!=(filter_iterator<BaseIterator, UnaryPredicate> const& a,
filter_iterator<BaseIterator, UnaryPredicate> const& b)
{
return !(a == b);
}
Then the wrapper function hides most of this ugliness for us:
template <typename BaseIterator, typename UnaryPredicate>
std::pair<filter_iterator<BaseIterator, UnaryPredicate>,
filter_iterator<BaseIterator, UnaryPredicate>>
filter(BaseIterator b, BaseIterator e, UnaryPredicate p)
{
using f = filter_iterator<BaseIterator, UnaryPredicate>;
auto fbegin = std::find_if(b, e, p);
return {f{fbegin, e, p}, {e, e, p}};
}
and we can use it like:
int main() {
std::vector<int> a {7, 1, 8, 4, 3, 2};
auto be = filter(a.begin(), a.end(),
[](int i){ return (i%2) == 0;});
auto min = std::min_element(be.first, be.second);
return *min;
}
If you are limited to C++17, the standard library offers no ready-made way to do this without making a copy.
If you can move to C++20, the solution is fairly easy. C++20 added the <ranges> library and, with it, the range adaptors in namespace std::views. A view does not copy the underlying container and does not modify its elements. Behind the scenes a view is essentially a pair of iterators (there is a bit more to it, but let's stay with the basics).
So in your case you could do something like this:
const std::vector<int> array {1, 8, 4, 3};
auto isEven = [](auto i) { return i % 2 == 0; };
//This is actually an iterator pair(begin, end)
//No copies of the container ever made, the container does not change
auto filtered = array | std::views::filter(isEven);
auto min = std::ranges::min_element(filtered);
if (min != filtered.end())
std::cout << "Min " << *min << std::endl;
else
std::cout << "No min\n";
//You can try to print the vector, it will be unchanged!!!
std::find_if does not filter the vector. It only returns the first element for which the predicate is true. I suppose there is an elegant solution using ranges. The rather inelegant way is to use a custom comparator with min_element:
#include <vector>
#include <algorithm>
#include <iostream>
int main() {
const std::vector<int> array {1, 8, 4, 3};
if (!array.empty()) {
auto it = std::min_element(begin(array),end(array),
[](auto a, auto b){
if ((a % 2) && (b % 2)) return a < b;
if (a % 2) return false;
if (b % 2) return true;
return a < b;
});
if (*it % 2 == 0) std::cout << *it;
}
}
Odd elements are never considered < an even element. When both are odd or both are even the "normal" < is used. Output is:
4
Note that I have to check if (*it % 2 == 0) because when there is no even element then the call to min_element will return an iterator to the smallest odd element.
PS: The tricky part of custom comparators is getting the strict weak ordering right. The above comparator can be written more concisely (thanks to Jarod42) like this, given #include <tuple>:
return std::tuple{ a % 2 != 0, a } < std::tuple{ b % 2 != 0, b };
Tuples have an operator< that implements a strict weak ordering (given that the element types provide one), so written this way it is much easier to convince yourself that the comparator really is a strict weak ordering.
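If the comparator trick feels too indirect, the filtering can also be folded into a single pass by hand. The following is only a sketch: min_element_if and min_even are hypothetical names, not standard-library functions, and the predicate is assumed to be stateless and cheap to call twice.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the standard library): returns an iterator
// to the smallest element satisfying pred, or last if no element does.
// No copy of the range is made and nothing is written to it.
template <typename FwdIt, typename Pred>
FwdIt min_element_if(FwdIt first, FwdIt last, Pred pred)
{
    FwdIt best = last;                       // "nothing found yet"
    for (; first != last; ++first) {
        if (!pred(*first)) continue;         // skip filtered-out elements
        if (best == last || *first < *best)  // first match, or a smaller one
            best = first;
    }
    assert(best == last || pred(*best));     // postcondition: result passes the filter
    return best;
}

// Convenience wrapper for the question's use case: smallest even value.
inline std::vector<int>::const_iterator min_even(const std::vector<int>& v)
{
    return min_element_if(v.begin(), v.end(), [](int x) { return x % 2 == 0; });
}
```

Unlike the comparator version, the "no even element" case is reported directly as the end iterator, so no extra parity check on the result is needed.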
Is there a better way to write code like this:
if (var == "first case" or var == "second case" or var == "third case" or ...)
In Python I can write:
if var in ("first case", "second case", "third case", ...)
which also gives me the opportunity to easily pass the list of good options:
good_values = "first case", "second case", "third case"
if var in good_values
This is just an example: the type of var may be different from a string, but I am only interested in alternative (or) comparisons (==). var may be non-const, while the list of options is known at compile time.
Pro bonus:
laziness of or
compile time loop unrolling
easy to extend to other operators than ==
If you want it expanded at compile time, you can use something like this:
template<class T1, class T2>
bool isin(T1&& t1, T2&& t2) {
return t1 == t2;
}
template<class T1, class T2, class... Ts>
bool isin(T1&& t1 , T2&& t2, Ts&&... ts) {
return t1 == t2 || isin(t1, ts...);
}
std::string my_var = ...; // somewhere in the code
...
bool b = isin(my_var, "fun", "gun", "hun");
I did not test it actually, and the idea comes from Alexandrescu's 'Variadic templates are funadic' talk. So for the details (and proper implementation) watch that.
Edit:
C++17 introduced a nice fold-expression syntax:
template<typename... Args>
bool all(Args... args) { return (... && args); }
bool b = all(true, true, true, false);
// within all(), the unary left fold expands as
// return ((true && true) && true) && false;
// b is false
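Applied to the membership test from the question, the same fold syntax might look like the sketch below. The name is_in is made up for this example, and it assumes a C++17 compiler and that value == candidate is valid for every candidate type.

```cpp
#include <cassert>
#include <string>

// Hypothetical C++17 helper: true if value compares equal to any candidate.
// The fold over || short-circuits just like a hand-written chain of
// comparisons, and an empty candidate list yields false.
template <typename T, typename... Candidates>
constexpr bool is_in(const T& value, const Candidates&... candidates)
{
    return ((value == candidates) || ...);
}
```

A call then reads is_in(my_var, "fun", "gun", "hun"), with no recursion and no container being built.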
The any_of algorithm could work reasonably well here:
#include <algorithm>
#include <initializer_list>
auto tokens = { "abc", "def", "ghi" };
bool b = std::any_of(tokens.begin(), tokens.end(),
[&var](const char * s) { return s == var; });
(You may wish to constrain the scope of tokens to the minimal required context.)
Or you create a wrapper template:
#include <algorithm>
#include <initializer_list>
#include <utility>
template <typename T, typename F>
bool any_of_c(const std::initializer_list<T> & il, F && f)
{
return std::any_of(il.begin(), il.end(), std::forward<F>(f));
}
Usage:
bool b = any_of_c({"abc", "def", "ghi"},
[&var](const char * s) { return s == var; });
Alrighty then, you want Radical Language Modification. Specifically, you want to create your own operator. Ready?
Syntax
I'm going to amend the syntax to use a C and C++-styled list:
if (x in {x0, ...}) ...
Additionally, we'll let our new in operator apply to any container for which begin() and end() are defined:
if (x in my_vector) ...
There is one caveat: it is not a true operator and so it must always be parenthesized as its own expression:
bool ok = (x in my_array);
my_function( (x in some_sequence) );
The code
The first thing to be aware of is that RLM often requires some macro and operator abuse. Fortunately, for a simple membership predicate, the abuse is actually not that bad.
#ifndef DUTHOMHAS_IN_OPERATOR_HPP
#define DUTHOMHAS_IN_OPERATOR_HPP
#include <algorithm>
#include <initializer_list>
#include <iterator>
#include <type_traits>
#include <vector>
//----------------------------------------------------------------------------
// The 'in' operator is magically defined to operate on any container you give it
#define in , in_container() =
//----------------------------------------------------------------------------
// The reverse-argument membership predicate is defined as the lowest-precedence
// operator available. And conveniently, it will not likely collide with anything.
template <typename T, typename Container>
typename std::enable_if <!std::is_same <Container, T> ::value, bool> ::type
operator , ( const T& x, const Container& xs )
{
using std::begin;
using std::end;
return std::find( begin(xs), end(xs), x ) != end(xs);
}
template <typename T, typename Container>
typename std::enable_if <std::is_same <Container, T> ::value, bool> ::type
operator , ( const T& x, const Container& y )
{
return x == y;
}
//----------------------------------------------------------------------------
// This thunk is used to accept any type of container without need for
// special syntax when used.
struct in_container
{
template <typename Container>
const Container& operator = ( const Container& container )
{
return container;
}
template <typename T>
std::vector <T> operator = ( std::initializer_list <T> xs )
{
return std::vector <T> ( xs );
}
};
#endif
Usage
Great! Now we can use it in all the ways you would expect an in operator to be useful. According to your particular interest, see example 3:
#include <iostream>
#include <set>
#include <string>
using namespace std;
void f( const string& s, const vector <string> & ss ) { cout << "nope\n\n"; }
void f( bool b ) { cout << "fooey!\n\n"; }
int main()
{
cout <<
"I understand three primes by digit or by name.\n"
"Type \"q\" to \"quit\".\n\n";
while (true)
{
string s;
cout << "s? ";
getline( cin, s );
// Example 1: arrays
const char* quits[] = { "quit", "q" };
if (s in quits)
break;
// Example 2: vectors
vector <string> digits { "2", "3", "5" };
if (s in digits)
{
cout << "a prime digit\n\n";
continue;
}
// Example 3: literals
if (s in {"two", "three", "five"})
{
cout << "a prime name!\n\n";
continue;
}
// Example 4: sets
set <const char*> favorites{ "7", "seven" };
if (s in favorites)
{
cout << "a favorite prime!\n\n";
continue;
}
// Example 5: sets, part deux
if (s in set <string> { "TWO", "THREE", "FIVE", "SEVEN" })
{
cout << "(ouch! don't shout!)\n\n";
continue;
}
// Example 6: operator weirdness
if (s[0] in string("014") + "689")
{
cout << "not prime\n\n";
continue;
}
// Example 7: argument lists unaffected
f( s, digits );
}
cout << "bye\n";
}
Potential improvements
There are always things that can be done to improve the code for your specific purposes. You can add a ni (not-in) operator (Add a new thunk container type). You can wrap the thunk containers in a namespace (a good idea). You can specialize on things like std::set to use the .count() member function instead of the O(n) search. Etc.
Your other concerns
const vs mutable : not an issue; both are usable with the operator
laziness of or : Technically, or is not lazy, it is short-circuited. The std::find() algorithm also short-circuits in the same way.
compile time loop unrolling : not really applicable here. Your original code did not use loops; while std::find() does, any loop unrolling that may occur is up to the compiler.
easy to extend to operators other than == : That actually is a separate issue; you are no longer looking at a simple membership predicate, but are now considering a functional fold-filter. It is entirely possible to create an algorithm that does that, but the Standard Library provides the any_of() function, which does exactly that. (It's just not as pretty as our RLM 'in' operator. That said, any C++ programmer will understand it easily. Such answers have already been proffered here.)
Hope this helps.
First, I recommend using a for loop, which is both the easiest and
most readable solution:
for (i = 0; i < n; i++) {
if (var == eq[i]) {
// if true
break;
}
}
However, some other methods are also available, e.g., std::all_of, std::any_of, std::none_of (from <algorithm>).
Let us look at a simple example program that uses all of the above functions:
#include <vector>
#include <numeric>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <functional>
int main()
{
std::vector<int> v(10, 2);
std::partial_sum(v.cbegin(), v.cend(), v.begin());
std::cout << "Among the numbers: ";
std::copy(v.cbegin(), v.cend(), std::ostream_iterator<int>(std::cout, " "));
std::cout << '\n';
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; }))
{
std::cout << "All numbers are even\n";
}
if (std::none_of(v.cbegin(), v.cend(), std::bind(std::modulus<int>(),
std::placeholders::_1, 2)))
{
std::cout << "None of them are odd\n";
}
struct DivisibleBy
{
const int d;
DivisibleBy(int n) : d(n) {}
bool operator()(int n) const { return n % d == 0; }
};
if (std::any_of(v.cbegin(), v.cend(), DivisibleBy(7)))
{
std::cout << "At least one number is divisible by 7\n";
}
}
You may use std::set to test if var belongs to it. (Compile with C++11 enabled.)
#include <iostream>
#include <set>
#include <string>
int main()
{
std::string el = "abc";
if (std::set<std::string>({"abc", "def", "ghi"}).count(el))
std::cout << "abc belongs to {\"abc\", \"def\", \"ghi\"}" << std::endl;
return 0;
}
The advantage is that std::set<std::string>::count works in O(log(n)) time (where n is the number of strings to test), compared to the non-compact if, which is O(n) in general. The disadvantage is that construction of the set takes O(n*log(n)). So, construct it once, like:
static std::set<std::string> the_set = {"abc", "def", "ghi"};
But IMO it would be better to leave the condition as is, unless it contains more than 10 strings to check. The performance advantage of using std::set for such a test appears only for big n. Also, the simple non-compact if is easier to read for the average C++ developer.
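For completeness, the construct-once variant could be wrapped in a small function so the set's scope stays minimal. This is only a sketch; is_known_token is a hypothetical name and the three strings are the ones used in the example above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: the set is built once, on the first call, and reused
// by every later call, so only the O(log n) lookup is paid per test.
inline bool is_known_token(const std::string& var)
{
    static const std::set<std::string> tokens = {"abc", "def", "ghi"};
    return tokens.count(var) != 0;
}
```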
The closest thing would be something like:
template <class K, class U, class = decltype(std::declval<K>() == std::declval<U>())>
bool in(K&& key, std::initializer_list<U> vals)
{
return std::find(vals.begin(), vals.end(), key) != vals.end();
}
We need to take an argument of type initializer_list<U> so that we can pass in a braced-init-list like {a,b,c}. This copies the elements, but presumably we're doing this because we're providing literals, so it's probably not a big deal.
We can use that like so:
std::string var = "hi";
bool b = in(var, {"abc", "def", "ghi", "hi"});
std::cout << b << std::endl; // prints 1 (true); use std::boolalpha to print "true"
If you have access to C++14 (not sure if this works with C++11) you could write something like this:
template <typename T, typename L = std::initializer_list<T>>
constexpr bool is_one_of(const T& value, const L& list)
{
return std::any_of(std::begin(list), std::end(list), [&value](const T& element) { return element == value; });
};
A call would look like this:
std::string test_case = ...;
if (is_one_of<std::string>(test_case, { "first case", "second case", "third case" })) {...}
or like this
std::string test_case = ...;
std::vector<std::string> allowedCases{ "first case", "second case", "third case" };
if (is_one_of<std::string>(test_case, allowedCases)) {...}
If you don't like to "wrap" the allowed cases into a list type you can also write a little helper function like this:
template <typename T, typename...L>
constexpr bool is_one_of(const T& value, const T& first, const L&... next) //First is used to be distinct
{
return is_one_of(value, std::initializer_list<T>{first, next...});
};
This will allow you to call it like this:
std::string test_case = ...;
if (is_one_of<std::string>(test_case, "first case", "second case", "third case" )) {...}
Complete example on Coliru
Worth noting that in most Java and C++ code I've seen, listing 3 or so conditionals out is the accepted practice. It's certainly more readable than "clever" solutions. If this happens so often it's a major drag, that's a design smell anyway and a templated or polymorphic approach would probably help avoid this.
So my answer is the "null" operation. Just keep doing the more verbose thing, it's most accepted.
You could use a switch statement. Instead of having a list of separate cases you could have:
#include <iostream>
using namespace std;
int main ()
{
char grade = 'B';
switch(grade)
{
case 'A' :
case 'B' :
case 'C' :
cout << "Well done" << endl;
break;
case 'D' :
cout << "You passed" << endl;
break;
case 'F' :
cout << "Better try again" << endl;
break;
default :
cout << "Invalid grade" << endl;
}
cout << "Your grade is " << grade << endl;
return 0;
}
So you can group your results together: A, B and C will output "well done".
I took this example from Tutorials Point:
http://www.tutorialspoint.com/cplusplus/cpp_switch_statement.htm
I want to write my own is_sorted function template implementation instead of using std::is_sorted. Could you give me any idea about how to do it?
I want to use it only for arrays. So I want to make a declaration like this:
template <typename T, size_t N>
bool is_sorted (const T (&array)[N]);
and
bool operator>(const A &, const A &);
is declared.
The obvious way would be to compare each item to the one after it, and see if it's <= to that one.
You probably don't want to do that directly though. First of all, for sorting, the client is typically only required to ensure that a < b is defined, so you want to use < instead of <=. Second, you want to allow (but not require) the user to pass a comparator of their own, in case < isn't defined directly for some type or the desired sort uses different criteria than < defines.
As such, you probably want to define two versions of is_sorted, one using < directly, the other using a comparator passed by the user.
#include <iterator>
#include <functional>
template <class InIt>
bool is_sorted(InIt b, InIt e) {
if (b == e) // No items -- sorted by definition
return true;
typename std::iterator_traits<InIt>::value_type first = *b;
++b;
while (b != e) { // skip if e-b == 1 (single item is sorted)
if (*b < first)
return false;
first = *b;
++b;
}
return true;
}
template <class InIt, class Cmp>
bool is_sorted(InIt b, InIt e, Cmp cmp) {
if (b == e)
return true;
typename std::iterator_traits<InIt>::value_type first = *b;
++b;
while (b != e) { // skip if e-b == 1 (single item is sorted)
if (cmp(*b, first))
return false;
first = *b;
++b;
}
return true;
}
To keep myself honest, a bit of test code, with sorted, unsorted, identical, and reversed elements, using std::vector, std::deque, std::array, and a built-in array:
#ifdef TEST
#include <array>
#include <vector>
#include <iostream>
#include <deque>
int main() {
std::vector<int> sorted{1, 2, 3, 4, 6, 100};
std::deque<int> unsorted{1, 5, 2, 7, 4};
std::array<int, 7> ident = {1, 1, 1, 1, 1, 3, 3};
int rev[] = {5, 4, 3, 2, 1};
if (!is_sorted(std::begin(sorted), std::end(sorted)))
std::cout << "Sorted array detected as un-sorted\n";
if (is_sorted(std::begin(unsorted), std::end(unsorted)))
std::cout << "Un-sorted array detected as sorted\n";
if (!is_sorted(std::begin(ident), std::end(ident)))
std::cout << "sorted array with duplicates detected as un-sorted\n";
if (!is_sorted(std::begin(rev), std::end(rev), std::greater<int>()))
std::cout << "Reverse sorted array detected as un-sorted\n";
return 0;
}
#endif
This works fine for me with gcc 4.7.2. The is_sorted code seems to work fine with VC++ 2012 as well (though the test code requires some minor modifications, e.g., to eliminate use of uniform initialization, which it doesn't support yet).
Edit: if you don't mind a slightly tighter requirement on the iterators (forward iterators instead of input iterators), you can make the code simpler and often more efficient. For example, the code can be reduced to something like this:
template <class FwdIt>
bool is_sorted(FwdIt b, FwdIt e) {
if (b == e) // No items -- sorted by definition
return true;
for (FwdIt first = b; ++b != e; first = b)
if (*b < *first)
return false;
return true;
}
Make sure each element in the container is <= the next element.
If you only have a < comparator (like most STL algorithms), make sure there are no elements in the container where the given element is < the previous element.
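For the array-only signature asked about in the question, that check could be sketched as follows. The function is named is_sorted_array here (a made-up name, chosen to avoid any confusion with std::is_sorted), ascending order is assumed, and only operator> is used because that is the operator the question says is declared.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the array-only overload from the question. An array counts as
// sorted when no element is greater than its successor.
template <typename T, std::size_t N>
bool is_sorted_array(const T (&array)[N])
{
    for (std::size_t i = 1; i < N; ++i)
        if (array[i - 1] > array[i])   // predecessor larger: out of order
            return false;
    return true;                       // also covers N == 1
}
```

Because the array is taken by reference, N is deduced at the call site and no separate length argument is needed.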
I'm new to C/C++ programming, but I've been programming in C# for 1.5 years now. I like C# and I like the List class, so I thought about making a List class in C++ as an exercise.
List<int> ls;
int whatever = 123;
ls.Add(1);
ls.Add(235445);
ls.Add(whatever);
The implementation is similar to any Array List class out there. I have a T* vector member where I store the items, and when this storage is about to be completely filled, I resize it.
Please notice that this is not to be used in production, this is only an exercise. I'm well aware of vector<T> and friends.
Now I want to loop through the items of my list. I don't like to use for(int i=0;i<n; i++). I typed for in Visual Studio, waited for IntelliSense, and it suggested this:
for each (object var in collection_to_loop)
{
}
This obviously won't work with my List implementation. I figured I could do some macro magic, but this feels like a huge hack. Actually, what bothers me the most is passing the type like that:
#define foreach(type, var, list)\
int _i_ = 0;\
##type var;\
for (_i_ = 0, var=list[_i_]; _i_<list.Length();_i_++,var=list[_i_])
foreach(int,i,ls){
doWork(i);
}
My question is: is there a way to make this custom List class work with a foreach-like loop?
Firstly, the syntax of a for-each loop in C++ is different from C# (it's also called a range-based for loop). It has the form:
for(<type> <name> : <collection>) { ... }
So for example, with an std::vector<int> vec, it would be something like:
for(int i : vec) { ... }
Under the covers, this effectively uses the begin() and end() member functions, which return iterators. Hence, to allow your custom class to utilize a for-each loop, you need to provide a begin() and an end() function. These are generally overloaded, returning either an iterator or a const_iterator. Implementing iterators can be tricky, although with a vector-like class it's not too hard.
template <typename T>
struct List
{
T* store;
std::size_t size;
typedef T* iterator;
typedef const T* const_iterator;
....
iterator begin() { return &store[0]; }
const_iterator begin() const { return &store[0]; }
iterator end() { return &store[size]; }
const_iterator end() const { return &store[size]; }
...
};
With these implemented, you can utilize a range based loop as above.
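To make that concrete, here is a minimal sketch of such a class. Add and Length mirror the names used in the question; everything else (the growth policy, the deleted copy operations, the sum_of helper) is an assumption made to keep the example short and self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Minimal growable list, just enough to show what range-based for needs.
template <typename T>
class List {
    T* store_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
public:
    typedef T* iterator;
    typedef const T* const_iterator;

    List() = default;
    List(const List&) = delete;             // copying omitted to keep the sketch short
    List& operator=(const List&) = delete;
    ~List() { delete[] store_; }

    void Add(const T& value) {
        if (size_ == capacity_) {           // storage full: grow geometrically
            std::size_t new_cap = capacity_ ? capacity_ * 2 : 4;
            T* bigger = new T[new_cap];
            for (std::size_t i = 0; i < size_; ++i) bigger[i] = store_[i];
            delete[] store_;
            store_ = bigger;
            capacity_ = new_cap;
        }
        store_[size_++] = value;
    }
    std::size_t Length() const { return size_; }
    T& operator[](std::size_t i) { assert(i < size_); return store_[i]; }

    // Raw pointers already satisfy the iterator requirements.
    iterator begin() { return store_; }
    iterator end() { return store_ + size_; }
    const_iterator begin() const { return store_; }
    const_iterator end() const { return store_ + size_; }
};

// Works with any type exposing begin()/end(), including List<int>.
template <typename Container>
long sum_of(const Container& c) {
    long total = 0;
    for (const auto& x : c) total += x;     // range-based for over the container
    return total;
}
```

With this in place, both for (int x : ls) and for (int& x : ls) compile, the latter allowing the elements to be modified in place.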
Let iterable be of type Iterable.
Then, in order to make
for (Type x : iterable)
compile, there must be types called Type and IType and there must be functions
IType Iterable::begin()
IType Iterable::end()
IType must provide the functions
Type operator*()
void operator++()
bool operator!=(IType)
The whole construction is really sophisticated syntactic sugar for something like
for (IType it = iterable.begin(); it != iterable.end(); ++it) {
Type x = *it;
...
}
where instead of Type, any compatible type (such as const Type or Type&) can be used, which will have the expected implications (constness, reference-instead-of-copy etc.).
Since the whole expansion happens syntactically, you can also change the declaration of the operators a bit, e.g. having *it return a reference or having != take a const IType& rhs as needed.
Note that you cannot use the for (Type& x : iterable) form if *it does not return a reference (but if it returns a reference, you can also use the copy version).
Note also that operator++() defines the prefix version of the ++ operator, which is the one the ranged-for uses. In standard C++ it is not used for postfix it++ (some older compilers accepted that as an extension), so define a postfix ++ explicitly if you need one; it can be declared as operator++(int) (dummy int argument). The ranged-for will not compile if you only supply a postfix ++.
Minimal working example:
#include <stdio.h>
typedef int Type;
struct IType {
Type* p;
IType(Type* p) : p(p) {}
bool operator!=(IType rhs) {return p != rhs.p;}
Type& operator*() {return *p;}
void operator++() {++p;}
};
const int SIZE = 10;
struct Iterable {
Type data[SIZE];
IType begin() {return IType(data); }
IType end() {return IType(data + SIZE);}
};
Iterable iterable;
int main() {
int i = 0;
for (Type& x : iterable) {
x = i++;
}
for (Type x : iterable) {
printf("%d", x);
}
}
output
0123456789
You can fake the ranged-for-each (e.g. for older C++ compilers) with the following macro:
#define ln(l, x) x##l // creates unique labels
#define l(x,y) ln(x,y)
#define for_each(T,x,iterable) for (bool _run = true;_run;_run = false) for (auto it = iterable.begin(); it != iterable.end(); ++it)\
if (1) {\
_run = true; goto l(__LINE__,body); l(__LINE__,cont): _run = true; continue; l(__LINE__,finish): break;\
} else\
while (1) \
if (1) {\
if (!_run) goto l(__LINE__,cont);/* we reach here if the block terminated normally/via continue */ \
goto l(__LINE__,finish);/* we reach here if the block terminated by break */\
} \
else\
l(__LINE__,body): for (T x = *it;_run;_run=false) /* block following the expanded macro */
int main() {
int i = 0;
for_each(Type&, x, iterable) {
i++;
if (i > 5) break;
x = i;
}
for_each(Type, x, iterable) {
printf("%d", x);
}
while (1);
}
(use declspec or pass IType if your compiler doesn't even have auto).
Output:
1234500000
As you can see, continue and break will work with this thanks to its complicated construction.
See http://www.chiark.greenend.org.uk/~sgtatham/mp/ for more C-preprocessor hacking to create custom control structures.
The syntax IntelliSense suggested is not standard C++; it is an MSVC extension.
C++11 has range-based for loops for iterating over the elements of a container. You need to implement begin() and end() member functions for your class that will return iterators to the first element, and one past the last element respectively. That, of course, means you need to implement suitable iterators for your class as well. If you really want to go this route, you may want to look at Boost.IteratorFacade; it reduces a lot of the pain of implementing iterators yourself.
After that you'll be able to write this:
for( auto const& l : ls ) {
// do something with l
}
Also, since you're new to C++, I want to make sure that you know the standard library has several container classes.
C++ has no for_each keyword in its syntax. You can use the C++11 range-based for loop or the function template std::for_each.
#include <vector>
#include <algorithm>
#include <iostream>
struct Sum {
Sum() { sum = 0; }
void operator()(int n) { sum += n; }
int sum;
};
int main()
{
std::vector<int> nums{3, 4, 2, 9, 15, 267};
std::cout << "before: ";
for (auto n : nums) {
std::cout << n << " ";
}
std::cout << '\n';
std::for_each(nums.begin(), nums.end(), [](int &n){ n++; });
Sum s = std::for_each(nums.begin(), nums.end(), Sum());
std::cout << "after: ";
for (auto n : nums) {
std::cout << n << " ";
}
std::cout << '\n';
std::cout << "sum: " << s.sum << '\n';
}
As @yngum suggests, you can get the VC++ for each extension to work with any arbitrary collection type by defining begin() and end() methods on the collection to return a custom iterator. Your iterator in turn has to implement the necessary interface (dereference operator, increment operator, etc). I've done this to wrap all of the MFC collection classes for legacy code. It's a bit of work, but can be done.
C++ does not have native support for lazy evaluation (as Haskell does).
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
EDIT: I like Konrad Rudolph's answer.
I'm wondering if it's possible to implement it in a more generic fashion, for example by using a parametrized class lazy<T> that essentially works for T the way matrix_add works for matrix.
Any operation on T would return lazy<T> instead. The only problem is to store the arguments and operation code inside lazy<T> itself. Can anyone see how to improve this?
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
Yes, this is possible and quite often done, e.g. for matrix calculations. The main mechanism to facilitate this is operator overloading. Consider the case of matrix addition. The signature of the function would usually look something like this:
matrix operator +(matrix const& a, matrix const& b);
Now, to make this function lazy, it's enough to return a proxy instead of the actual result:
struct matrix_add; // forward declaration; the full definition (below) must be visible before operator+ is defined
matrix_add operator +(matrix const& a, matrix const& b) {
return matrix_add(a, b);
}
Now all that needs to be done is to write this proxy:
struct matrix_add {
matrix_add(matrix const& a, matrix const& b) : a(a), b(b) { }
operator matrix() const {
matrix result;
// Do the addition.
return result;
}
private:
matrix const& a;
matrix const& b; // declared separately: "matrix const& a, b;" would make b a copy
};
The magic lies in the method operator matrix() which is an implicit conversion operator from matrix_add to plain matrix. This way, you can chain multiple operations (by providing appropriate overloads of course). The evaluation takes place only when the final result is assigned to a matrix instance.
EDIT I should have been more explicit. As it is, the code makes no sense because although evaluation happens lazily, it still happens in the same expression. In particular, another addition will evaluate this code unless the matrix_add structure is changed to allow chained addition. C++0x greatly facilitates this by allowing variadic templates (i.e. template lists of variable length).
However, one very simple case where this code would actually have a real, direct benefit is the following:
int value = (A + B)(2, 3);
Here, it is assumed that A and B are two-dimensional matrices and that dereferencing is done in Fortran notation, i.e. the above calculates one element out of a matrix sum. It's of course wasteful to add the whole matrices. matrix_add to the rescue:
struct matrix_add {
// … yadda, yadda, yadda …
int operator ()(unsigned int x, unsigned int y) {
// Calculate *just one* element:
return a(x, y) + b(x, y);
}
};
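Putting the two fragments together, a compilable version might look like the sketch below. The matrix class here is a toy stand-in invented for this example (row-major storage of int), not the type the answer had in mind; only the proxy idea is taken from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dense matrix, assumed here only to make the proxy idea concrete.
class matrix {
    std::size_t rows_, cols_;
    std::vector<int> data_;
public:
    matrix(std::size_t rows, std::size_t cols, int fill = 0)
        : rows_(rows), cols_(cols), data_(rows * cols, fill) {}
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    int& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    int operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }
};

// Proxy returned by operator+. It only remembers its operands.
class matrix_add {
    matrix const& a_;
    matrix const& b_;
public:
    matrix_add(matrix const& a, matrix const& b) : a_(a), b_(b) {
        assert(a.rows() == b.rows() && a.cols() == b.cols());
    }
    // Lazy path: compute a single element on demand.
    int operator()(std::size_t r, std::size_t c) const { return a_(r, c) + b_(r, c); }
    // Eager path: materialise the whole sum when a matrix is required.
    operator matrix() const {
        matrix result(a_.rows(), a_.cols());
        for (std::size_t r = 0; r < a_.rows(); ++r)
            for (std::size_t c = 0; c < a_.cols(); ++c)
                result(r, c) = (*this)(r, c);
        return result;
    }
};

inline matrix_add operator+(matrix const& a, matrix const& b) { return matrix_add(a, b); }
```

Since the proxy stores references, it must not outlive its operands; using it within the same full expression, as in (A + B)(2, 3), is safe.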
Other examples abound. I've just remembered that I have implemented something related not long ago. Basically, I had to implement a string class that should adhere to a fixed, pre-defined interface. However, my particular string class dealt with huge strings that weren't actually stored in memory. Usually, the user would just access small substrings from the original string using a function infix. I overloaded this function for my string type to return a proxy that held a reference to my string, along with the desired start and end position. Only when this substring was actually used did it query a C API to retrieve this portion of the string.
Boost.Lambda is very nice, but Boost.Proto is exactly what you are looking for. It already has overloads of all C++ operators, which by default perform their usual function when proto::eval() is called, but can be changed.
What Konrad already explained can be taken further to support nested invocations of operators, all executed lazily. In Konrad's example, he has an expression object that can store exactly two arguments, for exactly two operands of one operation. The problem is that it will only execute one subexpression lazily, which nicely explains the concept of lazy evaluation in simple terms, but doesn't improve performance substantially. The other example also shows well how one can apply operator() to add only some elements using that expression object. But to evaluate arbitrarily complex expressions, we need some mechanism that can store the structure of the expression too. We can't get around templates to do that. And the name for that is expression templates. The idea is that one templated expression object can store the structure of some arbitrary sub-expression recursively, like a tree, where the operations are the nodes and the operands are the child nodes. For a very good explanation I just found today (some days after I wrote the code below) see here.
template<typename Lhs, typename Rhs>
struct AddOp {
Lhs const& lhs;
Rhs const& rhs;
AddOp(Lhs const& lhs, Rhs const& rhs):lhs(lhs), rhs(rhs) {
// empty body
}
Lhs const& get_lhs() const { return lhs; }
Rhs const& get_rhs() const { return rhs; }
};
That will store any addition operation, even a nested one, as can be seen from the following definition of an operator+ for a simple point type:
struct Point { int x, y; };
// add expression template with point at the right
template<typename Lhs, typename Rhs> AddOp<AddOp<Lhs, Rhs>, Point>
operator+(AddOp<Lhs, Rhs> const& lhs, Point const& p) {
return AddOp<AddOp<Lhs, Rhs>, Point>(lhs, p);
}
// add expression template with point at the left
template<typename Lhs, typename Rhs> AddOp< Point, AddOp<Lhs, Rhs> >
operator+(Point const& p, AddOp<Lhs, Rhs> const& rhs) {
return AddOp< Point, AddOp<Lhs, Rhs> >(p, rhs);
}
// add two points, yield an expression template
AddOp< Point, Point >
operator+(Point const& lhs, Point const& rhs) {
return AddOp<Point, Point>(lhs, rhs);
}
Now, if you have
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 };
p1 + (p2 + p3); // returns AddOp< Point, AddOp<Point, Point> >
Now you just need to give the Point type an operator= and a suitable constructor that accept an AddOp. Change its definition to:
struct Point {
int x, y;
Point(int x = 0, int y = 0):x(x), y(y) { }
template<typename Lhs, typename Rhs>
Point(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
}
template<typename Lhs, typename Rhs>
Point& operator=(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
return *this;
}
int get_x() const { return x; }
int get_y() const { return y; }
};
And add the appropriate get_x and get_y into AddOp as member functions:
int get_x() const {
return lhs.get_x() + rhs.get_x();
}
int get_y() const {
return lhs.get_y() + rhs.get_y();
}
Note how we haven't created any temporaries of type Point. It could have been a big matrix with many fields. But at the time the result is needed, we calculate it lazily.
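For reference, here are the pieces above assembled into one translation unit. This is a condensed sketch rather than new machinery: operator= is left out for brevity, and sum_of_three is a hypothetical helper added only to show a call site.

```cpp
// Expression node: stores references to its operands, computes nothing yet.
template<typename Lhs, typename Rhs>
struct AddOp {
    Lhs const& lhs;
    Rhs const& rhs;
    AddOp(Lhs const& lhs, Rhs const& rhs) : lhs(lhs), rhs(rhs) {}
    // Evaluation walks the expression tree only when a coordinate is requested.
    int get_x() const { return lhs.get_x() + rhs.get_x(); }
    int get_y() const { return lhs.get_y() + rhs.get_y(); }
};

struct Point {
    int x, y;
    Point(int x = 0, int y = 0) : x(x), y(y) {}
    // Converting constructor: this is where the whole tree is evaluated.
    template<typename Lhs, typename Rhs>
    Point(AddOp<Lhs, Rhs> const& op) : x(op.get_x()), y(op.get_y()) {}
    int get_x() const { return x; }
    int get_y() const { return y; }
};

// Point + Point
inline AddOp<Point, Point> operator+(Point const& lhs, Point const& rhs) {
    return AddOp<Point, Point>(lhs, rhs);
}

// expression + Point
template<typename Lhs, typename Rhs>
AddOp<AddOp<Lhs, Rhs>, Point> operator+(AddOp<Lhs, Rhs> const& lhs, Point const& p) {
    return AddOp<AddOp<Lhs, Rhs>, Point>(lhs, p);
}

// Point + expression
template<typename Lhs, typename Rhs>
AddOp<Point, AddOp<Lhs, Rhs> > operator+(Point const& p, AddOp<Lhs, Rhs> const& rhs) {
    return AddOp<Point, AddOp<Lhs, Rhs> >(p, rhs);
}

// Hypothetical helper: the AddOp tree is built and consumed within one
// full expression, so the references inside it stay valid.
inline Point sum_of_three(Point const& a, Point const& b, Point const& c) {
    return a + (b + c);
}
```

One caveat worth keeping in mind: because AddOp holds references, writing auto e = p1 + (p2 + p3); and evaluating e on a later line leaves e pointing at a destroyed temporary. Converting to Point in the same expression, as sum_of_three does, avoids that.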
I have nothing to add to Konrad's post, but you can look at Eigen for an example of lazy evaluation done right, in a real world app. It is pretty awe inspiring.
I'm thinking about implementing a template class, that uses std::function. The class should, more or less, look like this:
template <typename Value>
class Lazy
{
public:
Lazy(std::function<Value()> function) : _function(function), _evaluated(false) {}
Value &operator*() { Evaluate(); return _value; }
Value *operator->() { Evaluate(); return &_value; }
private:
void Evaluate()
{
if (!_evaluated)
{
_value = _function();
_evaluated = true;
}
}
std::function<Value()> _function;
Value _value;
bool _evaluated;
};
For example usage:
class Noisy
{
public:
Noisy(int i = 0) : _i(i)
{
std::cout << "Noisy(" << _i << ")" << std::endl;
}
Noisy(const Noisy &that) : _i(that._i)
{
std::cout << "Noisy(const Noisy &)" << std::endl;
}
~Noisy()
{
std::cout << "~Noisy(" << _i << ")" << std::endl;
}
void MakeNoise()
{
std::cout << "MakeNoise(" << _i << ")" << std::endl;
}
private:
int _i;
};
int main()
{
Lazy<Noisy> n([] () { return Noisy(10); }); // direct-init: "= lambda" would need two user-defined conversions
std::cout << "about to make noise" << std::endl;
n->MakeNoise();
(*n).MakeNoise();
auto &nn = *n;
nn.MakeNoise();
}
The code above should print the following to the console:
Noisy(0)
about to make noise
Noisy(10)
~Noisy(10)
MakeNoise(10)
MakeNoise(10)
MakeNoise(10)
~Noisy(10)
Note that the constructor printing Noisy(10) will not be called until the variable is accessed.
This class is far from perfect, though. The first problem is that the default constructor of Value has to be called on member initialization (printing Noisy(0) in this case). We could hold _value through a pointer instead, but I'm not sure how that would affect performance.
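The pointer variant mentioned above might look like the following sketch. LazyPtr is a hypothetical name chosen to keep it apart from Lazy, and std::unique_ptr is my assumption for "a pointer"; it costs one heap allocation on first access, but Value no longer needs a default constructor.

```cpp
#include <functional>
#include <memory>
#include <utility>

template <typename Value>
class LazyPtr
{
public:
    explicit LazyPtr(std::function<Value()> function)
        : _function(std::move(function)) {}

    Value &operator*() { Evaluate(); return *_value; }
    Value *operator->() { Evaluate(); return _value.get(); }

private:
    void Evaluate()
    {
        if (!_value)
        {
            // Heap-allocates and constructs the value on first access only.
            _value.reset(new Value(_function()));
        }
    }

    std::function<Value()> _function;
    std::unique_ptr<Value> _value; // null until the first access
};
```

With Noisy from the example, LazyPtr<Noisy> n([] () { return Noisy(10); }); would no longer print Noisy(0), since no Noisy exists before the first dereference.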
Johannes' answer works, but once more parentheses are involved it doesn't behave as one would wish. Here is an example:
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 }, p4 = { 7, 8 };
(p1 + p2) + (p3 + p4); // no overload takes two AddOps
This is because the three overloaded + operators don't cover the case
AddOp<Llhs,Lrhs> + AddOp<Rlhs,Rrhs>
So the compiler has to convert either (p1 + p2) or (p3 + p4) to Point, which is not lazy enough. And when the compiler has to decide which one to convert, it complains, because neither conversion is better than the other.
Here comes my extension: add yet another overloaded operator +
template <typename LLhs, typename LRhs, typename RLhs, typename RRhs>
AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>> operator+(const AddOp<LLhs, LRhs> & leftOperand, const AddOp<RLhs, RRhs> & rightOperand)
{
return AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>>(leftOperand, rightOperand);
}
Now the compiler can handle the case above correctly, with no implicit conversion. Voilà!
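If the number of overloads keeps growing, one alternative is a single constrained template. The following is only a sketch of that idea, not part of either answer: the trait is_point_expr is a hypothetical name, and std::enable_if restricts the operator to Point and AddOp operands.

```cpp
#include <type_traits>

struct Point;
template<typename Lhs, typename Rhs> struct AddOp;

// Hypothetical trait: true for Point and for any AddOp<...>, false otherwise.
template<typename T> struct is_point_expr : std::false_type {};
template<> struct is_point_expr<Point> : std::true_type {};
template<typename L, typename R>
struct is_point_expr< AddOp<L, R> > : std::true_type {};

template<typename Lhs, typename Rhs>
struct AddOp {
    Lhs const& lhs;
    Rhs const& rhs;
    AddOp(Lhs const& lhs, Rhs const& rhs) : lhs(lhs), rhs(rhs) {}
    int get_x() const { return lhs.get_x() + rhs.get_x(); }
    int get_y() const { return lhs.get_y() + rhs.get_y(); }
};

struct Point {
    int x, y;
    Point(int x = 0, int y = 0) : x(x), y(y) {}
    template<typename Lhs, typename Rhs>
    Point(AddOp<Lhs, Rhs> const& op) : x(op.get_x()), y(op.get_y()) {}
    int get_x() const { return x; }
    int get_y() const { return y; }
};

// One operator+ for every combination of Point and AddOp operands.
// Any other operand type is removed from overload resolution by enable_if.
template<typename L, typename R>
typename std::enable_if<is_point_expr<L>::value && is_point_expr<R>::value,
                        AddOp<L, R> >::type
operator+(L const& lhs, R const& rhs) {
    return AddOp<L, R>(lhs, rhs);
}
```

With this, (p1 + p2) + (p3 + p4) builds an AddOp of two AddOps and is evaluated only when it is converted to Point.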
As it's going to be done in C++0x, by lambda expressions.
Anything is possible.
It depends on exactly what you mean:
class X
{
public: static X& getObjectA()
{
static X instanceA;
return instanceA;
}
};
Here we have the effect of a global variable that is lazily evaluated at the point of first use.
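To make the "first use" part observable, here is a small sketch that records when the constructor runs. The event_log helper is a hypothetical addition; the getObjectA pattern is the one shown above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records events so the construction order is visible.
inline std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

class X
{
public:
    static X& getObjectA()
    {
        static X instanceA; // constructed during the first call only
        return instanceA;
    }

private:
    X() { event_log().push_back("X constructed"); }
};
```

If something is pushed to event_log() before the first call to X::getObjectA(), it ends up ahead of "X constructed" in the log, and further calls add nothing. Since C++11 the initialization of such a function-local static is also guaranteed to happen exactly once even when several threads call the function concurrently.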
As newly requested in the question.
Stealing Konrad Rudolph's design and extending it.
The Lazy object:
template<typename O,typename T1,typename T2>
struct Lazy
{
Lazy(T1 const& l,T2 const& r)
:lhs(l),rhs(r) {}
typedef typename O::Result Result;
operator Result() const
{
O op;
return op(lhs,rhs);
}
private:
T1 const& lhs;
T2 const& rhs;
};
How to use it:
namespace M
{
class Matrix
{
};
struct MatrixAdd
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
struct MatrixSub
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
template<typename T1,typename T2>
Lazy<MatrixAdd,T1,T2> operator+(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixAdd,T1,T2>(lhs,rhs);
}
template<typename T1,typename T2>
Lazy<MatrixSub,T1,T2> operator-(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixSub,T1,T2>(lhs,rhs);
}
}
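A possible way to exercise this, as a sketch: the Matrix here is a hypothetical one-cell stand-in with a single int, and add_calls is a counter I added to show when the additions actually run. Lazy itself is unchanged apart from layout.

```cpp
template<typename O, typename T1, typename T2>
struct Lazy
{
    Lazy(T1 const& l, T2 const& r) : lhs(l), rhs(r) {}
    typedef typename O::Result Result;
    // The operation runs only when the expression is converted to Result.
    operator Result() const { return O()(lhs, rhs); }
private:
    T1 const& lhs;
    T2 const& rhs;
};

namespace M
{
    struct Matrix { int value; }; // hypothetical one-cell "matrix"

    // Counts how many real additions were performed (added for illustration).
    inline int& add_calls() { static int n = 0; return n; }

    struct MatrixAdd
    {
        typedef Matrix Result;
        Result operator()(Matrix const& lhs, Matrix const& rhs) const
        {
            ++add_calls();
            return Matrix{ lhs.value + rhs.value };
        }
    };

    template<typename T1, typename T2>
    Lazy<MatrixAdd, T1, T2> operator+(T1 const& lhs, T2 const& rhs)
    {
        return Lazy<MatrixAdd, T1, T2>(lhs, rhs);
    }
}
```

Writing M::Matrix sum = a + b + c; builds a nested Lazy first and performs both additions only during the conversion to Matrix. Note that the operator+ template is unconstrained, so it matches any pair of operand types for which argument-dependent lookup reaches namespace M; real code would probably want to restrict it.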
In C++11 lazy evaluation similar to hiapay's answer can be achieved using std::shared_future. You still have to encapsulate calculations in lambdas but memoization is taken care of:
std::shared_future<int> a = std::async(std::launch::deferred, [](){ return 1+1; });
Here's a full example:
#include <iostream>
#include <future>
#define LAZY(EXPR, ...) std::async(std::launch::deferred, [__VA_ARGS__](){ std::cout << "evaluating "#EXPR << std::endl; return EXPR; })
int main() {
std::shared_future<int> f1 = LAZY(8);
std::shared_future<int> f2 = LAZY(2);
std::shared_future<int> f3 = LAZY(f1.get() * f2.get(), f1, f2);
std::cout << "f3 = " << f3.get() << std::endl;
std::cout << "f2 = " << f2.get() << std::endl;
std::cout << "f1 = " << f1.get() << std::endl;
return 0;
}
C++0x is nice and all... but for those of us living in the present there are the Boost Lambda library and Boost Phoenix, both with the intent of bringing large amounts of functional programming to C++.
Let's take Haskell as our inspiration - it being lazy to the core.
Also, let's keep in mind how Linq in C# uses Enumerators in a monadic (urgh - here is the word - sorry) way.
Last but not least, let's keep in mind what coroutines are supposed to provide to programmers, namely the decoupling of computational steps (e.g. producer and consumer) from each other.
And let's try to think about how coroutines relate to lazy evaluation.
All of the above appears to be somehow related.
Next, let's try to extract our personal definition of what "lazy" comes down to.
One interpretation is: We want to state our computation in a composable way, before executing it. Some of the parts we use to compose our complete solution might very well draw upon huge (sometimes infinite) data sources, with our full computation also producing either a finite or an infinite result.
Let's get concrete and into some code. We need an example for that! Here, I choose the fizzbuzz "problem" as an example, just for the reason that there is a nice, lazy solution to it.
In Haskell, it looks like this:
module FizzBuzz
( fb
)
where
fb n =
fmap merge fizzBuzzAndNumbers
where
fizz = cycle ["","","fizz"]
buzz = cycle ["","","","","buzz"]
fizzBuzz = zipWith (++) fizz buzz
fizzBuzzAndNumbers = zip [1..n] fizzBuzz
merge (x,s) = if length s == 0 then show x else s
The Haskell function cycle creates an infinite list (lazy, of course!) from a finite list by simply repeating the values in the finite list forever. In an eager programming style, writing something like that would ring alarm bells (memory overflow, endless loops!). But not so in a lazy language. The trick is that lazy lists are not computed right away. Maybe never. Normally only as much as subsequent code requires.
The third line in the where block above creates another lazy list by combining the infinite lists fizz and buzz with a simple two-element recipe: "concatenate one string element from each input list into a single string". Again, if this were to be evaluated immediately, we would have to wait for our computer to run out of resources.
In the 4th line, we create tuples of the members of a finite lazy list [1..n] with our infinite lazy list fizzBuzz. The result is still lazy.
Even in the main body of our fb function, there is no need to get eager. The whole function returns a list with the solution, which itself is -again- lazy. You could as well think of the result of fb 50 as a computation which you can (partially) evaluate later. Or combine with other stuff, leading to an even larger (lazy) evaluation.
So, in order to get started with our C++ version of "fizzbuzz", we need to think of ways how to combine partial steps of our computation into larger bits of computations, each drawing data from previous steps as required.
You can see the full story in a gist of mine.
Here are the basic ideas behind the code:
Borrowing from C# and Linq, we "invent" a stateful, generic type Enumerator, which holds
- The current value of the partial computation
- The state of a partial computation (so we can produce subsequent values)
- The worker function, which produces the next state, the next value and a bool which states if there is more data or if the enumeration has come to an end.
In order to be able to compose Enumerator<T,S> instances by means of the power of the . (dot), this class also contains functions borrowed from Haskell type classes such as Functor and Applicative.
The worker function for an enumerator is always of the form: S -> std::tuple<bool,S,T> where S is the generic type variable representing the state and T is the generic type variable representing a value - the result of a computation step.
All this is already visible in the first lines of the Enumerator class definition.
template <class T, class S>
class Enumerator
{
public:
typedef S State_t; // no "typename" here: S and T are not dependent qualified names
typedef T Value_t;
typedef std::function<
std::tuple<bool, State_t, Value_t>
(const State_t&
)
> Worker_t;
Enumerator(Worker_t worker, State_t s0)
: m_worker(worker)
, m_state(s0)
, m_value{}
{
}
// ...
};
So, to create a specific enumerator instance, all we need is a worker function and an initial state, which we pass to the Enumerator constructor as its two arguments.
Here is an example - the function range(first,last) creates a finite range of values. This corresponds to a lazy list in the Haskell world.
template <class T>
Enumerator<T, T> range(const T& first, const T& last)
{
auto finiteRange =
[first, last](const T& state)
{
T v = state;
T s1 = (state < last) ? (state + 1) : state;
bool active = state != s1;
return std::make_tuple(active, s1, v);
};
return Enumerator<T,T>(finiteRange, first);
}
And we can make use of this function, for example like this: auto r1 = range(size_t{1}, size_t{10}); (both arguments must have the same type, otherwise T cannot be deduced) - We have created ourselves a lazy list with 10 elements!
Now all that is missing for our "wow" experience is to see how we can compose enumerators.
Coming back to Haskell's cycle function, which is kind of cool. How would it look in our C++ world? Here it is:
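To see the worker protocol in isolation, here is a sketch that drives such a worker by hand. It does not use the Enumerator class from the gist: RangeWorker, make_range_worker and drain are hypothetical names, and the loop assumes that the value returned together with active == false is still a valid last element (which is how cycle treats it as well).

```cpp
#include <cstddef>
#include <functional>
#include <tuple>
#include <vector>

// Worker in the form described above: state -> (active, next state, value).
typedef std::function<
    std::tuple<bool, std::size_t, std::size_t>(const std::size_t&)
> RangeWorker;

// Same logic as the finiteRange lambda, fixed to std::size_t for brevity.
inline RangeWorker make_range_worker(std::size_t last)
{
    return [last](const std::size_t& state)
    {
        std::size_t v = state;
        std::size_t s1 = (state < last) ? (state + 1) : state;
        bool active = state != s1;
        return std::make_tuple(active, s1, v);
    };
}

// Hypothetical driver: pulls one value per step until the worker is done.
inline std::vector<std::size_t> drain(const RangeWorker& worker, std::size_t state)
{
    std::vector<std::size_t> out;
    for (;;)
    {
        bool active = false;
        std::size_t next = 0, value = 0;
        std::tie(active, next, value) = worker(state);
        out.push_back(value); // assumption: the final value is emitted too
        if (!active) break;
        state = next;
    }
    return out;
}
```

Nothing is computed when the worker is created; each value comes into existence only when the driver asks for the next step, which is the whole point of the design.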
template <class T, class S>
auto
cycle
( Enumerator<T, S> values
) -> Enumerator<T, S>
{
auto eternally =
[values](const S& state) -> std::tuple<bool, S, T>
{
auto[active, s1, v] = values.step(state);
if (active)
{
return std::make_tuple(active, s1, v);
}
else
{
return std::make_tuple(true, values.state(), v);
}
};
return Enumerator<T, S>(eternally, values.state());
}
It takes an enumerator as input and returns an enumerator. The local (lambda) function eternally simply resets the input enumeration to its start value whenever it runs out of values and voilà - we have an infinite, ever-repeating version of the list we gave as an argument: auto foo = cycle(range(size_t{1}, size_t{3})); And we can already shamelessly compose our lazy "computations".
zip is a good example, showing that we can also create a new enumerator from two input enumerators. The resulting enumerator yields as many values as the smaller of the two input enumerators (tuples with 2 elements, one for each input enumerator). I have implemented zip inside class Enumerator itself. Here is what it looks like:
// member function of class Enumerator<T,S>
template <class T1, class S1>
auto
zip
( Enumerator<T1, S1> other
) -> Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
{
auto worker0 = this->m_worker;
auto worker1 = other.worker();
auto combine =
[worker0,worker1](std::tuple<S, S1> state) ->
std::tuple<bool, std::tuple<S, S1>, std::tuple<T, T1> >
{
auto[s0, s1] = state;
auto[active0, newS0, v0] = worker0(s0);
auto[active1, newS1, v1] = worker1(s1);
return std::make_tuple
( active0 && active1
, std::make_tuple(newS0, newS1)
, std::make_tuple(v0, v1)
);
};
return Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
( combine
, std::make_tuple(m_state, other.state())
);
}
Please note how the "combining" also ends up combining the states of both sources and the values of both sources.
As this post is already TL;DR for many, here is the...
Summary
Yes, lazy evaluation can be implemented in C++. Here, I did it by borrowing the function names from Haskell and the paradigm from C# enumerators and Linq. There might be similarities to Python's itertools, btw. I think they followed a similar approach.
My implementation (see the gist link above) is just a prototype - not production code, btw. So no warranties whatsoever from my side. It serves well as demo code to get the general idea across, though.
And what would this answer be without the final C++ version of fizzbuzz, eh? Here it is:
std::string fizzbuzz(size_t n)
{
typedef std::vector<std::string> SVec;
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
SVec fizzes{ "","","fizz" };
SVec buzzes{ "","","","","buzz" };
return
range(size_t{ 1 }, n)
.zip
( cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
( std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
.statefulFold<std::ostringstream&>
(
[](std::ostringstream& oss, const std::string& s)
{
if (0 == oss.tellp())
{
oss << s;
}
else
{
oss << "," << s;
}
}
, std::ostringstream()
)
.str();
}
And... to drive the point home even further - here is a variation of fizzbuzz which returns an "infinite list" to the caller:
typedef std::vector<std::string> SVec;
static const SVec fizzes{ "","","fizz" };
static const SVec buzzes{ "","","","","buzz" };
auto fizzbuzzInfinite() -> decltype(auto)
{
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
auto result =
range(size_t{ 1 })
.zip
(cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
(std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
;
return result;
}
It is worth showing, since you can learn from it how to dodge the question of what the exact return type of that function is (it depends on the implementation of the function alone, namely on how the code combines the enumerators).
It also demonstrates that we had to move the vectors fizzes and buzzes outside the scope of the function, so that they are still around when the lazy mechanism eventually produces values on the outside. If we had not done that, the iterRange(..) code would have stored iterators into vectors which are long gone.
Using a very simple definition of lazy evaluation, namely that a value is not evaluated until it is needed, I would say that one could implement this through the use of a pointer and macros (for syntactic sugar).
#include <stdatomic.h>
#define lazy(var_type) lazy_ ## var_type
#define def_lazy_type( var_type ) \
typedef _Atomic var_type _atomic_ ## var_type; \
typedef _atomic_ ## var_type * lazy(var_type); //pointer to atomic type
#define def_lazy_variable(var_type, var_name ) \
_atomic_ ## var_type _ ## var_name; \
lazy_ ## var_type var_name = & _ ## var_name;
#define assign_lazy( var_name, val ) atomic_store( & _ ## var_name, val )
#define eval_lazy(var_name) atomic_load( &(*var_name) )
#include <stdio.h>
def_lazy_type(int)
void print_power2 ( lazy(int) i )
{
printf( "%d\n", eval_lazy(i) * eval_lazy(i) );
}
typedef struct {
int a;
} simple;
def_lazy_type(simple)
void print_simple ( lazy(simple) s )
{
simple temp = eval_lazy(s);
printf("%d\n", temp.a );
}
#define def_lazy_array1( var_type, nElements, var_name ) \
_atomic_ ## var_type _ ## var_name [ nElements ]; \
lazy(var_type) var_name = _ ## var_name;
int main ( )
{
//declarations
def_lazy_variable( int, X )
def_lazy_variable( simple, Y)
def_lazy_array1(int,10,Z)
simple new_simple;
//first the lazy int
assign_lazy(X,111);
print_power2(X);
//second the lazy struct
new_simple.a = 555;
assign_lazy(Y,new_simple);
print_simple ( Y );
//third the array of lazy ints
for(int i=0; i < 10; i++)
{
assign_lazy( Z[i], i );
}
for(int i=0; i < 10; i++)
{
int r = eval_lazy( &Z[i] ); //must pass with &
printf("%d\n", r );
}
return 0;
}
You'll notice in the function print_power2 there is a macro called eval_lazy which does nothing more than dereference a pointer to get the value just prior to when it's actually needed. The lazy type is accessed atomically, so each individual load and store is free of data races; a sequence of them (such as the two loads in print_power2) is not atomic as a whole.