Bidirectional static value mapping in C++17

I want to efficiently bidirectionally map some values of different types in C++17 (1:1 mapping of only very few values). Consider for example mapping enum values and integers, though the problem is applicable to other types as well. Currently, I'm doing it like this:
#include <optional>
enum class ExampleEnum { A, B, C, D, E };
class MyMapping {
public:
std::optional<int> enumToInt(ExampleEnum v) {
switch(v) {
case ExampleEnum::A:
return 1;
case ExampleEnum::B:
return 5;
case ExampleEnum::D:
return 42;
}
return std::nullopt;
}
std::optional<ExampleEnum> intToEnum(int v) {
switch(v) {
case 1:
return ExampleEnum::A;
case 5:
return ExampleEnum::B;
case 42:
return ExampleEnum::D;
}
return std::nullopt;
}
};
This has the obvious disadvantage of having to write everything twice, and forgetting to update one of the functions will lead to inconsistencies. Is there a better method?
I need:
Consistency. It shouldn't be possible to have different semantics in mapping and reverse mapping.
Compile-time definition. The values which are mapped are known in advance, and will not change at runtime.
Runtime lookup. Which values will be looked up is not known at compile-time, and may even not contain a mapping at all (returning an empty optional instead).
I would like to have:
No additional memory allocations
Basically the same performance as the double-switch-method
An implementation which makes the mapping definition easily extendable (i.e. adding more values in the future or applying it to other types)

I've taken a shot at a very naive and simple implementation: https://godbolt.org/z/MtcHw8
#include <cstdlib> // rand
#include <optional>
enum class ExampleEnum { A, B, C, D, E };
template<typename Enum, int N>
struct Mapping
{
Enum keys[N];
int values[N];
constexpr std::optional<Enum> at(int x) const noexcept
{
for(int i = 0; i < N; i++)
if(values[i] == x) return keys[i];
return std::nullopt;
}
constexpr std::optional<int> at(Enum x) const noexcept
{
for(int i = 0; i < N; i++)
if(keys[i] == x) return values[i];
return std::nullopt;
}
};
constexpr Mapping<ExampleEnum, 3> mapping{{ExampleEnum::A, ExampleEnum::B, ExampleEnum::D},
{111, 222, 333}};
int main()
{
int x = rand(); // Force runtime implementation
auto optEnum = mapping.at(x);
if(optEnum.has_value())
return *mapping.at(ExampleEnum::B); // Returns 222, (asm line 3) constexpr works
auto y = (ExampleEnum)rand(); // Force runtime implementation
auto optInt = mapping.at(y);
if(optInt.has_value())
return (int)*mapping.at(333); // Returns 3, constexpr works
return 0;
}
It relies on loop unrolling to achieve switch-like performance for the int -> ExampleEnum mapping.
The assembly for the ExampleEnum -> int mapping is rather obscure, because the optimizer exploits the fact that the enum values are sequential and prefers a jump table over an if-else chain.
Anyway, the interface requires no duplication: just create a constexpr object initialized from two arrays. You can have multiple mappings for the same types, and the enum type is a template parameter.
It can also be easily extended to map between two enum classes instead of only enum and int.
I've also created a snippet with the raw switch implementations for assembly comparison:
https://godbolt.org/z/CbEcnZ
PS. I believe the syntax constexpr Mapping<ExampleEnum, 3> mapping could be simplified with a proper template deduction guide, but I have not figured out how to do it.
PPS. I went up to N = 15, and the loops are still unrolled: https://godbolt.org/z/-Cpmgm
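On the deduction question raised in the PS, one C++17 option is a small factory function that counts its arguments, so the size never has to be spelled out. This is only a sketch under assumptions: PairMapping and makeMapping are hypothetical names, and the entries are stored as an array of pairs rather than as the two parallel arrays used above.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <utility>

enum class ExampleEnum { A, B, C, D, E };

// Hypothetical pair-based variant of Mapping: each entry is written once.
template <typename K, typename V, std::size_t N>
struct PairMapping {
    std::array<std::pair<K, V>, N> entries;

    // Forward lookup: key -> value, empty optional if the key is unmapped.
    constexpr std::optional<V> toValue(K key) const noexcept {
        for (const auto& e : entries)
            if (e.first == key) return e.second;
        return std::nullopt;
    }

    // Reverse lookup: value -> key, empty optional if the value is unmapped.
    constexpr std::optional<K> toKey(V value) const noexcept {
        for (const auto& e : entries)
            if (e.second == value) return e.first;
        return std::nullopt;
    }
};

// Hypothetical factory: N is deduced from the number of arguments.
template <typename K, typename V, typename... Rest>
constexpr PairMapping<K, V, 1 + sizeof...(Rest)>
makeMapping(std::pair<K, V> first, Rest... rest) {
    return PairMapping<K, V, 1 + sizeof...(Rest)>{{{first, rest...}}};
}

// Size (3) and types (ExampleEnum, int) are deduced from the arguments.
constexpr auto mapping = makeMapping(std::pair{ExampleEnum::A, 1},
                                     std::pair{ExampleEnum::B, 5},
                                     std::pair{ExampleEnum::D, 42});

// Lookups with constant arguments can still be evaluated at compile time.
static_assert(mapping.toValue(ExampleEnum::B) == 5);
static_assert(!mapping.toKey(7).has_value());
```

The two lookups have different names here so that the sketch also works when both sides of the mapping have the same type.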

It is better to avoid such code. It tends to violate one of the fundamental principles of software development, the Open-Closed Principle.
You can improve MyMapping by making it general. Let a higher level class/function define the mappings.
class MyMapping {
public:
void registerItem(ExampleEnum eValue, int intValue)
{
enumToIntMap[eValue] = intValue;
intToEnumMap[intValue] = eValue;
}
std::optional<int> enumToInt(ExampleEnum v) {
auto iter = enumToIntMap.find(v);
if ( iter != enumToIntMap.end() )
{
return iter->second;
}
else
{
return std::nullopt;
}
}
std::optional<ExampleEnum> intToEnum(int v) {
auto iter = intToEnumMap.find(v);
if ( iter != intToEnumMap.end() )
{
return iter->second;
}
else
{
return std::nullopt;
}
}
std::map<ExampleEnum, int> enumToIntMap;
std::map<int, ExampleEnum> intToEnumMap;
};
A higher level function can be:
void initMyMapping(MyMapping& mapping)
{
mapping.registerItem(ExampleEnum::A, 1);
mapping.registerItem(ExampleEnum::B, 2);
mapping.registerItem(ExampleEnum::D, 42);
}
I understand that this still violates the open-closed principle, but to a lesser degree. If you want to add mapping data for C and E, you'll have to add code for that. However, you can do that without changing MyMapping. You also have the option of doing that in a second function, without changing initMyMapping.
void initMyMapping_extend(MyMapping& mapping)
{
mapping.registerItem(ExampleEnum::C, 22);
mapping.registerItem(ExampleEnum::E, 38);
}

Related

Relaxing constexpr requirement for enum class and bool template parameters

Consider code in which a function accumulate does the heavy lifting and a dispatcher function process calls it. The accumulate function tests a parameter in a hot loop, so the parameter is templated.
enum class Op {
Multiply,
Add
};
template<Op op>
int accumulate(const std::vector<int> &vec) {
int a = op == Op::Multiply ? 1 : 0;
for (int v : vec)
if constexpr (op == Op::Add) a += v; else a *= v;
return a;
}
int process(const std::vector<int> &vec, Op op) {
return accumulate<op>(vec);
}
You may have noticed that this code won't compile as the template parameter passed from process is not a constexpr. However, when the template parameter is a bool or, especially, an enum class there is no reason why this shouldn't be compiled.
Such code arises a lot in practice, when a function doing the heavy lifting has several variants (and we only want to keep a single copy in the code base). Is there a proposal or discussion to make such code valid in the future?
(C++ has way too many features, though this one particular one that I need is missing :p )
However, when the template parameter is a bool or, especially, an enum class there is no reason why this shouldn't be compiled.
When the parameter is a bool or an enum with few values, nothing forbids you from selecting the case with an if or a switch:
int process1 (const std::vector<int> &vec, bool b) {
if ( b == true )
return accumulate<true>(vec);
else
return accumulate<false>(vec);
}
int process2 (const std::vector<int> &vec, Op op) {
switch ( op ) {
case Op::Multiply:
return accumulate<Op::Multiply>(vec);
break;
case Op::Add:
return accumulate<Op::Add>(vec);
break;
// other cases
// default
}
}
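If several functions need the same runtime-to-compile-time dispatch, the switch can be written once and reused. The following is a sketch, not an established idiom from the answer above: withOp is a hypothetical helper that wraps each enumerator in a std::integral_constant and passes it to a generic lambda, which can then use the value as a template argument.

```cpp
#include <stdexcept>
#include <type_traits>
#include <vector>

enum class Op { Multiply, Add };

template <Op op>
int accumulate(const std::vector<int>& vec) {
    int a = (op == Op::Multiply) ? 1 : 0;
    for (int v : vec) {
        if constexpr (op == Op::Add) a += v; else a *= v;
    }
    return a;
}

// Hypothetical helper: turns a runtime Op into a compile-time tag and hands
// it to the callable. The switch lives here and nowhere else.
template <typename F>
auto withOp(Op op, F&& f) {
    switch (op) {
        case Op::Multiply: return f(std::integral_constant<Op, Op::Multiply>{});
        case Op::Add:      return f(std::integral_constant<Op, Op::Add>{});
    }
    // Reached only if op holds a value that is not a named enumerator.
    throw std::invalid_argument("unknown Op value");
}

int process(const std::vector<int>& vec, Op op) {
    // decltype(tag)::value is a constant expression, so it can be used as
    // the template argument even though op itself is a runtime value.
    return withOp(op, [&](auto tag) {
        return accumulate<decltype(tag)::value>(vec);
    });
}
```

Every instantiation is still compiled, exactly as with the hand-written switch; the helper only removes the repetition when more than one function needs dispatching.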

How does one convert generator functions to single-return functions? [duplicate]

I've got some example Python code that I need to mimic in C++. I do not require any specific solution (such as co-routine based yield solutions, although they would be acceptable answers as well), I simply need to reproduce the semantics in some manner.
Python
This is a basic sequence generator, clearly too large to store a materialized version.
def pair_sequence():
for i in range(2**32):
for j in range(2**32):
yield (i, j)
The goal is to maintain two instances of the sequence above, and iterate over them in semi-lockstep, but in chunks. In the example below the first_pass uses the sequence of pairs to initialize the buffer, and the second_pass regenerates the same exact sequence and processes the buffer again.
def run():
seq1 = pair_sequence()
seq2 = pair_sequence()
buffer = [0] * 1000
first_pass(seq1, buffer)
second_pass(seq2, buffer)
... repeat ...
C++
The only thing I can find for a solution in C++ is to mimic yield with C++ coroutines, but I haven't found any good reference on how to do this. I'm also interested in alternative (non general) solutions for this problem. I do not have enough memory budget to keep a copy of the sequence between passes.
Generators exist in C++, just under another name: Input Iterators. For example, reading from std::cin is similar to having a generator of char.
You simply need to understand what a generator does:
there is a blob of data: the local variables define a state
there is an init method
there is a "next" method
there is a way to signal termination
In your trivial example, it's easy enough. Conceptually:
struct State { unsigned i, j; };
State make();
void next(State&);
bool isDone(State const&);
Of course, we wrap this as a proper class:
class PairSequence:
// (implicit aliases)
public std::iterator<
std::input_iterator_tag,
std::pair<unsigned, unsigned>
>
{
// C++03
typedef void (PairSequence::*BoolLike)();
void non_comparable();
public:
// C++11 (explicit aliases)
using iterator_category = std::input_iterator_tag;
using value_type = std::pair<unsigned, unsigned>;
using reference = value_type const&;
using pointer = value_type const*;
using difference_type = ptrdiff_t;
// C++03 (explicit aliases)
typedef std::input_iterator_tag iterator_category;
typedef std::pair<unsigned, unsigned> value_type;
typedef value_type const& reference;
typedef value_type const* pointer;
typedef ptrdiff_t difference_type;
PairSequence(): done(false) {}
// C++11
explicit operator bool() const { return !done; }
// C++03
// Safe Bool idiom
operator BoolLike() const {
return done ? 0 : &PairSequence::non_comparable;
}
reference operator*() const { return ij; }
pointer operator->() const { return &ij; }
PairSequence& operator++() {
static unsigned const Max = std::numeric_limits<unsigned>::max();
assert(!done);
if (ij.second != Max) { ++ij.second; return *this; }
if (ij.first != Max) { ij.second = 0; ++ij.first; return *this; }
done = true;
return *this;
}
PairSequence operator++(int) {
PairSequence const tmp(*this);
++*this;
return tmp;
}
private:
bool done;
value_type ij;
};
So hum yeah... might be that C++ is a tad more verbose :)
In C++ there are iterators, but implementing an iterator isn't straightforward: one has to consult the iterator concepts and carefully design the new iterator class to implement them. Thankfully, Boost has an iterator_facade template which should help implementing the iterators and iterator-compatible generators.
Sometimes a stackless coroutine can be used to implement an iterator.
P.S. See also this article which mentions both a switch hack by Christopher M. Kohlhoff and Boost.Coroutine by Oliver Kowalke. Oliver Kowalke's work is a followup on Boost.Coroutine by Giovanni P. Deretta.
P.S. I think you can also write a kind of generator with lambdas:
std::function<int()> generator = []{
int i = 0;
return [=]() mutable {
return i < 10 ? i++ : -1;
};
}();
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;
Or with a functor:
struct generator_t {
int i = 0;
int operator() () {
return i < 10 ? i++ : -1;
}
} generator;
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;
P.S. Here's a generator implemented with the Mordor coroutines:
#include <iostream>
using std::cout; using std::endl;
#include <mordor/coroutine.h>
using Mordor::Coroutine; using Mordor::Fiber;
void testMordor() {
Coroutine<int> coro ([](Coroutine<int>& self) {
int i = 0; while (i < 9) self.yield (i++);
});
for (int i = coro.call(); coro.state() != Fiber::TERM; i = coro.call()) cout << i << endl;
}
Since Boost.Coroutine2 now supports it very well (I found it because I wanted to solve exactly the same yield problem), I am posting the C++ code that matches your original intention:
#include <stdint.h>
#include <iostream>
#include <memory>
#include <boost/coroutine2/all.hpp>
typedef boost::coroutines2::coroutine<std::pair<uint16_t, uint16_t>> coro_t;
void pair_sequence(coro_t::push_type& yield)
{
uint16_t i = 0;
uint16_t j = 0;
for (;;) {
for (;;) {
yield(std::make_pair(i, j));
if (++j == 0)
break;
}
if (++i == 0)
break;
}
}
int main()
{
coro_t::pull_type seq(boost::coroutines2::fixedsize_stack(),
pair_sequence);
for (auto pair : seq) {
print_pair(pair);
}
//while (seq) {
// print_pair(seq.get());
// seq();
//}
}
In this example, pair_sequence does not take additional arguments. If it needs to, std::bind or a lambda should be used to generate a function object that takes only one argument (of push_type), when it is passed to the coro_t::pull_type constructor.
All answers that involve writing your own iterator are completely wrong. Such answers entirely miss the point of Python generators (one of the language's greatest and unique features). The most important thing about generators is that execution picks up where it left off. This does not happen to iterators. Instead, you must manually store state information such that when operator++ or operator* is called anew, the right information is in place at the very beginning of the next function call. This is why writing your own C++ iterator is a gigantic pain; whereas, generators are elegant, and easy to read+write.
I don't think there is a good analog for Python generators in native C++, at least not yet (there is a rumor that yield will land in C++17). You can get something similarish by resorting to third-party (e.g. Yongwei's Boost suggestion), or rolling your own.
I would say the closest thing in native C++ is threads. A thread can maintain a suspended set of local variables, and can continue execution where it left off, very much like generators, but you need to roll a little bit of additional infrastructure to support communication between the generator object and its caller. E.g.
// Infrastructure
template <typename Element>
class Channel { ... };
// Application
using IntPair = std::pair<int, int>;
void yield_pairs(int end_i, int end_j, Channel<IntPair>* out) {
for (int i = 0; i < end_i; ++i) {
for (int j = 0; j < end_j; ++j) {
out->send(IntPair{i, j}); // "yield"
}
}
out->close();
}
void MyApp() {
Channel<IntPair> pairs;
std::thread generator(yield_pairs, 32, 32, &pairs);
for (IntPair pair : pairs) {
UsePair(pair);
}
generator.join();
}
This solution has several downsides though:
1. Threads are "expensive". Most people would consider this to be an "extravagant" use of threads, especially when your generator is so simple.
2. There are a couple of clean up actions that you need to remember. These could be automated, but you'd need even more infrastructure, which again, is likely to be seen as "too extravagant". Anyway, the clean ups that you need are:
   out->close()
   generator.join()
3. This does not allow you to stop generator. You could make some modifications to add that ability, but it adds clutter to the code. It would never be as clean as Python's yield statement.
4. In addition to 2, there are other bits of boilerplate that are needed each time you want to "instantiate" a generator object:
   Channel* out parameter
   Additional variables in main: pairs, generator
Using range-v3:
#include <iostream>
#include <tuple>
#include <range/v3/all.hpp>
using namespace std;
using namespace ranges;
auto generator = [x = view::iota(0) | view::take(3)] {
return view::cartesian_product(x, x);
};
int main () {
for (auto x : generator()) {
cout << get<0>(x) << ", " << get<1>(x) << endl;
}
return 0;
}
You should probably check generators in std::experimental in Visual Studio 2015 e.g: https://blogs.msdn.microsoft.com/vcblog/2014/11/12/resumable-functions-in-c/
I think it's exactly what you are looking for. Generators in general should become available in C++17; for now this is only an experimental Microsoft VC++ feature.
If you only need to do this for a relatively small number of specific generators, you can implement each as a class, where the member data is equivalent to the local variables of the Python generator function. Then you have a next function that returns the next thing the generator would yield, updating the internal state as it does so.
This is basically similar to how Python generators are implemented, I believe. The major difference being they can remember an offset into the bytecode for the generator function as part of the "internal state", which means the generators can be written as loops containing yields. You would have to instead calculate the next value from the previous. In the case of your pair_sequence, that's pretty trivial. It may not be for complex generators.
You also need some way of indicating termination. If what you're returning is "pointer-like", and NULL should not be a valid yieldable value you could use a NULL pointer as a termination indicator. Otherwise you need an out-of-band signal.
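As a rough illustration of the class-based approach just described, here is a sketch that uses std::optional (C++17) as the out-of-band termination signal. PairGenerator and takeChunk are hypothetical names, and the limits are constructor parameters (an assumption) so the sketch can be tried with small ranges instead of 2**32.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical class-based counterpart of pair_sequence(). The members play
// the role of the Python generator's local variables.
class PairGenerator {
public:
    using value_type = std::pair<unsigned, unsigned>;

    // An empty range in either dimension yields nothing at all.
    PairGenerator(unsigned endI, unsigned endJ)
        : endI_(endI), endJ_(endJ), done_(endI == 0 || endJ == 0) {}

    // Returns the next pair, or std::nullopt once the sequence is exhausted.
    std::optional<value_type> next() {
        if (done_) return std::nullopt;
        value_type result{i_, j_};
        if (++j_ == endJ_) {          // inner loop finished: advance outer loop
            j_ = 0;
            if (++i_ == endI_) done_ = true;
        }
        return result;
    }

private:
    unsigned endI_, endJ_;
    unsigned i_ = 0, j_ = 0;
    bool done_;
};

// Pulls at most `count` pairs, mirroring how first_pass/second_pass would
// consume the sequence one buffer-sized chunk at a time.
std::vector<PairGenerator::value_type> takeChunk(PairGenerator& gen, std::size_t count) {
    std::vector<PairGenerator::value_type> chunk;
    while (chunk.size() < count) {
        auto p = gen.next();
        if (!p) break;
        chunk.push_back(*p);
    }
    return chunk;
}
```

Two independent PairGenerator objects constructed with the same limits produce the same sequence, which is what the two-pass scheme in the question needs; only the current (i, j) is kept in memory.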
Something like this is very similar:
struct pair_sequence
{
typedef pair<unsigned int, unsigned int> result_type;
static const unsigned int limit = numeric_limits<unsigned int>::max();
pair_sequence() : i(0), j(0), done(false) {}
result_type operator()()
{
if(done) throw out_of_range("end of iteration");
result_type r(i, j);
if(j < limit) j++;
else if(i < limit)
{
j = 0;
i++;
}
else done = true; // (limit, limit) is the last pair; the next call throws
return r;
}
private:
unsigned int i;
unsigned int j;
bool done;
};
Using the operator() is only a question of what you want to do with this generator, you could also build it as a stream and make sure it adapts to an istream_iterator, for example.
Well, today I was also looking for an easy collection implementation in C++11. I was disappointed, because everything I found was either too far from things like Python generators or the C# yield operator, or too complicated.
The purpose is to make a collection that emits its items only when they are required.
I wanted it to be like this:
auto emitter = on_range<int>(a, b).yield(
[](int i) {
/* do something with i */
return i * 2;
});
I found this post; IMHO the best answer was the one about Boost.Coroutine2 by Yongwei Wu, since it is the nearest to what the author wanted.
It is worth learning Boost coroutines, and I'll perhaps do that on a weekend. But so far I'm using my own very small implementation. Hope it helps someone else.
Below is an example of use, followed by the implementation.
Example.cpp
#include <iostream>
#include "Generator.h"
int main() {
typedef std::pair<int, int> res_t;
auto emitter = Generator<res_t, int>::on_range(0, 3)
.yield([](int i) {
return std::make_pair(i, i * i);
});
for (auto kv : emitter) {
std::cout << kv.first << "^2 = " << kv.second << std::endl;
}
return 0;
}
Generator.h
#include <functional> // std::function
template<typename ResTy, typename IndexTy>
struct yield_function{
typedef std::function<ResTy(IndexTy)> type;
};
template<typename ResTy, typename IndexTy>
class YieldConstIterator {
public:
typedef IndexTy index_t;
typedef ResTy res_t;
typedef typename yield_function<res_t, index_t>::type yield_function_t;
typedef YieldConstIterator<ResTy, IndexTy> mytype_t;
typedef ResTy value_type;
YieldConstIterator(index_t index, yield_function_t yieldFunction) :
mIndex(index),
mYieldFunction(yieldFunction) {}
mytype_t &operator++() {
++mIndex;
return *this;
}
const value_type operator*() const {
return mYieldFunction(mIndex);
}
bool operator!=(const mytype_t &r) const {
return mIndex != r.mIndex;
}
protected:
index_t mIndex;
yield_function_t mYieldFunction;
};
template<typename ResTy, typename IndexTy>
class YieldIterator : public YieldConstIterator<ResTy, IndexTy> {
public:
typedef YieldConstIterator<ResTy, IndexTy> parent_t;
typedef IndexTy index_t;
typedef ResTy res_t;
typedef typename yield_function<res_t, index_t>::type yield_function_t;
typedef ResTy value_type;
YieldIterator(index_t index, yield_function_t yieldFunction) :
parent_t(index, yieldFunction) {}
value_type operator*() {
return parent_t::mYieldFunction(parent_t::mIndex);
}
};
template<typename IndexTy>
struct Range {
public:
typedef IndexTy index_t;
typedef Range<IndexTy> mytype_t;
index_t begin;
index_t end;
};
template<typename ResTy, typename IndexTy>
class GeneratorCollection {
public:
typedef Range<IndexTy> range_t;
typedef IndexTy index_t;
typedef ResTy res_t;
typedef typename yield_function<res_t, index_t>::type yield_function_t;
typedef YieldIterator<ResTy, IndexTy> iterator;
typedef YieldConstIterator<ResTy, IndexTy> const_iterator;
GeneratorCollection(range_t range, const yield_function_t &yieldF) :
mRange(range),
mYieldFunction(yieldF) {}
iterator begin() {
return iterator(mRange.begin, mYieldFunction);
}
iterator end() {
return iterator(mRange.end, mYieldFunction);
}
const_iterator begin() const {
return const_iterator(mRange.begin, mYieldFunction);
}
const_iterator end() const {
return const_iterator(mRange.end, mYieldFunction);
}
private:
range_t mRange;
yield_function_t mYieldFunction;
};
template<typename ResTy, typename IndexTy>
class Generator {
public:
typedef IndexTy index_t;
typedef ResTy res_t;
typedef typename yield_function<res_t, index_t>::type yield_function_t;
typedef Generator<ResTy, IndexTy> mytype_t;
typedef Range<IndexTy> parent_t;
typedef GeneratorCollection<ResTy, IndexTy> finalized_emitter_t;
typedef Range<IndexTy> range_t;
protected:
Generator(range_t range) : mRange(range) {}
public:
static mytype_t on_range(index_t begin, index_t end) {
return mytype_t({ begin, end });
}
finalized_emitter_t yield(yield_function_t f) {
return finalized_emitter_t(mRange, f);
}
protected:
range_t mRange;
};
This answer works in C (and hence I think works in C++ too)
#include<stdint.h>
//#include<stdio.h>
#define MAX (1ll << 32) //2^32
typedef struct {
uint64_t i, j;
} Pair;
int generate_pairs(Pair* p)
{
static uint64_t i = 0;
static uint64_t j = 0;
p->i = i;
p->j = j;
if(++j == MAX)
{
j = 0;
if(++i == MAX)
{
return -1; // return -1 to indicate generator finished.
}
}
return 1; // return non -1 to indicate generator not finished.
}
int main()
{
while(1)
{
Pair p;
int fin = generate_pairs(&p);
//printf("%lld, %lld\n", p.i, p.j);
if(fin == -1)
{
//printf("end");
break;
}
}
return 0;
}
This is a simple, non-object-oriented way to mimic a generator. It worked as expected for me.
Edit: Previous code was erroneous and I have updated it.
Note: This code can be improved to use just uint32_t instead of uint64_t for the given question.
It is possible to get yield-like behavior with a simple goto statement. As it is simple, I wrote it in C.
All you have to do in your generator function is:
all variables are declared as static
the last yield exit is remembered with a label
variables are reinitialized at the end of the function
Example:
#include <stdio.h>
typedef struct {
int i, j;
} Pair;
// the function generate_pairs can generate values in successive calls.
// - all variables are declared as static
// - last yield exit is memorized with a label
// - variables are reinitialized at the end of function
Pair* generate_pairs(int imax, int jmax)
{
// all local variable are declared static. So they are declared at the beginning
static int i = 0;
static int j = 0;
static Pair p;
// the exit position is marked with a label
static enum {EBEGIN, EYIELD1} tag_goto = EBEGIN;
// I goto to the last exit position
if (tag_goto == EYIELD1)
goto TYIELD1;
for (i=0; i<imax; i++) {
for (j=0; j<jmax; j++) {
p.i = i; p.j = -j;
// I manage the yield comportment
tag_goto = EYIELD1;
return &p;
TYIELD1 : ;
}
j = 0;
}
// reinitialization of variables
i = 0; j = 0; // in fact this reinitialization is not useful in this example
tag_goto = EBEGIN;
// NULL means ends of generator
return NULL;
}
int main()
{
for (Pair *p = generate_pairs(2,4); p != NULL; p = generate_pairs(2,4))
{
printf("%d,%d\n",p->i,p->j);
}
printf("end\n");
return 0;
}
Something like this:
Example use:
using ull = unsigned long long;
auto main() -> int {
for (ull val : range_t<ull>(100)) {
std::cout << val << std::endl;
}
return 0;
}
Will print the numbers from 0 to 99
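The definition of range_t is not shown above. Purely as an assumption about what it might look like, here is a minimal hypothetical stand-in that supports the usage shown: a half-open range [0, end) that range-based for can walk without materializing the values.

```cpp
// Hypothetical stand-in for range_t: only what range-based for requires
// (begin(), end(), and an iterator with *, prefix ++, and !=).
template <typename T>
class range_t {
public:
    class iterator {
    public:
        explicit iterator(T value) : value_(value) {}
        T operator*() const { return value_; }
        iterator& operator++() { ++value_; return *this; }
        bool operator!=(const iterator& other) const { return value_ != other.value_; }
    private:
        T value_;
    };

    // The range starts at T{} (zero for integers) and stops before `end`.
    explicit range_t(T end) : end_(end) {}

    iterator begin() const { return iterator(T{}); }
    iterator end() const { return iterator(end_); }

private:
    T end_;
};
```

Each value is computed on demand by operator++, so the loop holds only the current value and the end marker.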
Just as a function simulates the concept of a stack, generators simulate the concept of a queue. The rest is semantics.
As a side note, you can always simulate a queue with a stack by using a stack of operations instead of data. What that practically means is that you can implement a queue-like behavior by returning a pair, the second value of which either has the next function to be called or indicates that we are out of values. But this is more general than what yield vs return does. It allows to simulate a queue of any values rather than homogeneous values that you expect from a generator, but without keeping a full internal queue.
More specifically, since C++ does not have a natural abstraction for a queue, you need to use constructs which implement a queue internally. So the answer which gave the example with iterators is a decent implementation of the concept.
What this practically means is that you can implement something with bare-bones queue functionality if you just want something quick and then consume queue's values just as you would consume values yielded from a generator.

How can I avoid "for" loops with an "if" condition inside them with C++?

With almost all code I write, I am often dealing with set reduction problems on collections that ultimately end up with naive "if" conditions inside of them. Here's a simple example:
for(int i=0; i<myCollection.size(); i++)
{
if (myCollection[i] == SOMETHING)
{
DoStuff();
}
}
With functional languages, I can solve the problem by reducing the collection to another collection (easily) and then perform all operations on my reduced set. In pseudocode:
newCollection <- myCollection where <x=true
map DoStuff newCollection
And in other C variants, like C#, I could reduce with a where clause like
foreach (var x in myCollection.Where(c=> c == SOMETHING))
{
DoStuff();
}
Or better (at least to my eyes)
myCollection.Where(c=>c == Something).ToList().ForEach(d=> DoStuff(d));
Admittedly, I am doing a lot of paradigm mixing and subjective/opinion based style, but I can't help but feel that I am missing something really fundamental that could allow me to use this preferred technique with C++. Could someone enlighten me?
IMHO it's more straightforward and more readable to use a for loop with an if inside it. However, if this annoys you, you could use a for_each_if like the one below:
template<typename Iter, typename Pred, typename Op>
void for_each_if(Iter first, Iter last, Pred p, Op op) {
while(first != last) {
if (p(*first)) op(*first);
++first;
}
}
Use case:
std::vector<int> v {10, 2, 10, 3};
for_each_if(v.begin(), v.end(), [](int i){ return i > 5; }, [](int &i){ ++i; });
Boost provides ranges that can be used with range-based for. Ranges have the advantage that they don't copy the underlying data structure; they merely provide a 'view' (that is, begin() and end() for the range, and operator++() and operator==() for the iterator). This might be of interest to you: http://www.boost.org/libs/range/doc/html/range/reference/adaptors/reference/filtered.html
#include <boost/range/adaptor/filtered.hpp>
#include <iostream>
#include <vector>
struct is_even
{
bool operator()( int x ) const { return x % 2 == 0; }
};
int main(int argc, const char* argv[])
{
using namespace boost::adaptors;
std::vector<int> myCollection{1,2,3,4,5,6,7,8,9};
for( int i: myCollection | filtered( is_even() ) )
{
std::cout << i;
}
}
Instead of creating a new algorithm, as the accepted answer does, you can use an existing one with a function that applies the condition:
std::for_each(first, last, [](auto&& x){ if (cond(x)) { ... } });
Or if you really want a new algorithm, at least reuse for_each there instead of duplicating the iteration logic:
template<typename Iter, typename Pred, typename Op>
void
for_each_if(Iter first, Iter last, Pred p, Op op) {
std::for_each(first, last, [&](auto& x) { if (p(x)) op(x); });
}
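The "reduce first, then process" shape from the question can also be written with an existing algorithm, std::copy_if. This is a sketch only: where is a hypothetical helper named after the C# method, and unlike a filtered view it allocates and copies the matching elements.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns a new vector holding the elements of `input`
// for which `pred` returns true, in their original order.
template <typename T, typename Pred>
std::vector<T> where(const std::vector<T>& input, Pred pred) {
    std::vector<T> result;
    std::copy_if(input.begin(), input.end(), std::back_inserter(result), pred);
    return result;
}
```

With std::vector<int> v {10, 2, 10, 3}, the call where(v, [](int i){ return i > 5; }) yields a vector holding 10 and 10, which can then be passed to std::for_each or a range-based for. The copy is the price of getting a real collection back, so this fits best when the reduced set is reused.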
The idea of avoiding
for(...)
if(...)
constructs as an antipattern is too broad.
It is completely fine to process multiple items that match a certain expression from inside a loop, and the code cannot get much clearer than that. If the processing grows too large to fit on screen, that is a good reason to use a subroutine, but still the conditional is best placed inside the loop, i.e.
for(...)
if(...)
do_process(...);
is vastly preferable to
for(...)
maybe_process(...);
It becomes an antipattern when only one element will match, because then it would be clearer to first search for the element, and perform the processing outside of the loop.
for(int i = 0; i < size; ++i)
if(i == 5)
is an extreme and obvious example of this. More subtle, and thus more common, is a factory pattern like
for(creator &c : creators)
if(c.name == requested_name)
{
unique_ptr<object> obj = c.create_object();
obj->owner = this;
return std::move(obj);
}
This is hard to read, because it isn't obvious that the body code will be executed once only. In this case, it would be better to separate the lookup:
creator &lookup(string const &requested_name)
{
for(creator &c : creators)
if(c.name == requested_name)
return c;
throw std::out_of_range("no creator named " + requested_name); // not found
}
creator &c = lookup(requested_name);
unique_ptr<object> obj = c.create_object();
There is still an if within a for, but from the context it becomes clear what it does, there is no need to change this code unless the lookup changes (e.g. to a map), and it is immediately clear that create_object() is called only once, because it is not inside a loop.
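The separate lookup step can also be expressed with std::find_if, which makes the "find at most one element" intent explicit. This is a sketch under assumptions: the creator struct below is a hypothetical minimal stand-in (the real type is not shown), the collection is passed in as a parameter, and "not found" is reported with a null pointer.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical minimal creator type; only the name matters for the lookup.
struct creator {
    std::string name;
};

// Returns a pointer to the first creator with the requested name,
// or nullptr if no creator matches.
creator* lookup(std::vector<creator>& creators, const std::string& requested_name) {
    auto it = std::find_if(creators.begin(), creators.end(),
                           [&](const creator& c) { return c.name == requested_name; });
    return it != creators.end() ? &*it : nullptr;
}
```

The caller checks the result once and then performs the one-time processing outside any loop.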
Here is a quick relatively minimal filter function.
It takes a predicate. It returns a function object that takes an iterable.
It returns an iterable that can be used in a for(:) loop.
template<class It>
struct range_t {
It b, e;
It begin() const { return b; }
It end() const { return e; }
bool empty() const { return begin()==end(); }
};
template<class It>
range_t<It> range( It b, It e ) { return {std::move(b), std::move(e)}; }
template<class It, class F>
struct filter_helper:range_t<It> {
F f;
void advance() {
while(true) {
(range_t<It>&)*this = range( std::next(this->begin()), this->end() );
if (this->empty())
return;
if (f(*this->begin()))
return;
}
}
filter_helper(range_t<It> r, F fin):
range_t<It>(r), f(std::move(fin))
{
while(true)
{
if (this->empty()) return;
if (f(*this->begin())) return;
(range_t<It>&)*this = range( std::next(this->begin()), this->end() );
}
}
};
template<class It, class F>
struct filter_psuedo_iterator {
using iterator_category=std::input_iterator_tag;
filter_helper<It, F>* helper = nullptr;
bool m_is_end = true;
bool is_end() const {
return m_is_end || !helper || helper->empty();
}
void operator++() {
helper->advance();
}
typename std::iterator_traits<It>::reference
operator*() const {
return *(helper->begin());
}
It base() const {
if (!helper) return {};
if (is_end()) return helper->end();
return helper->begin();
}
friend bool operator==(filter_psuedo_iterator const& lhs, filter_psuedo_iterator const& rhs) {
if (lhs.is_end() && rhs.is_end()) return true;
if (lhs.is_end() || rhs.is_end()) return false;
return lhs.helper->begin() == rhs.helper->begin();
}
friend bool operator!=(filter_psuedo_iterator const& lhs, filter_psuedo_iterator const& rhs) {
return !(lhs==rhs);
}
};
template<class It, class F>
struct filter_range:
private filter_helper<It, F>,
range_t<filter_psuedo_iterator<It, F>>
{
using helper=filter_helper<It, F>;
using range=range_t<filter_psuedo_iterator<It, F>>;
using range::begin; using range::end; using range::empty;
filter_range( range_t<It> r, F f ):
helper{{r}, std::forward<F>(f)},
range{ {this, false}, {this, true} }
{}
};
template<class F>
auto filter( F&& f ) {
return [f=std::forward<F>(f)](auto&& r)
{
using std::begin; using std::end;
using iterator = decltype(begin(r));
return filter_range<iterator, std::decay_t<decltype(f)>>{
range(begin(r), end(r)), f
};
};
};
I took shortcuts. A real library should make real iterators, not the for(:)-qualifying pseudo-facades I did.
At point of use, it looks like this:
int main()
{
std::vector<int> test = {1,2,3,4,5};
for( auto i: filter([](auto x){return x%2;})( test ) )
std::cout << i << '\n';
}
which is pretty nice, and prints
1
3
5
There is a proposed addition to C++ called Ranges v3 which does this kind of thing and more. Boost also has filter ranges/iterators available, as well as helpers that make writing the above much shorter.
One style that gets used enough to mention, but hasn't been mentioned yet, is:
for(int i=0; i<myCollection.size(); i++) {
if (myCollection[i] != SOMETHING)
continue;
DoStuff();
}
Advantages:
Doesn't change the indentation level of DoStuff(); when condition complexity increases. Logically, DoStuff(); should be at the top-level of the for loop, and it is.
Immediately makes it clear that the loop iterates over the SOMETHINGs of the collection, without requiring the reader to verify that there is nothing after the closing } of the if block.
Doesn't require any libraries or helper macros or functions.
Disadvantages:
continue, like other flow control statements, is misused often enough to produce hard-to-follow code that some people oppose any use of it. There is a valid coding style that avoids continue, avoids break other than in a switch, and avoids return other than at the end of a function.
for(auto const &x: myCollection) if(x == something) doStuff();
Looks pretty much like a C++-specific for comprehension to me. To you?
If DoStuff() would be dependent on i somehow in the future then I'd propose this guaranteed branch-free bit-masking variant.
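unsigned int times = 0;
const int kSize = sizeof(unsigned int)*8;
for(int i = 0; i < myCollection.size()/kSize; i++){
unsigned int mask = 0;
for (int j = 0; j<kSize; j++){
// cast first so the shift is done on an unsigned value
mask |= static_cast<unsigned int>(myCollection[i*kSize+j]==SOMETHING) << j;
}
times+=popcount(mask);
}
// elements left over when the size is not a multiple of kSize
for(int i = (myCollection.size()/kSize)*kSize; i < myCollection.size(); i++)
times += (myCollection[i]==SOMETHING);
for(unsigned int i=0;i<times;i++)
DoStuff();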
Where popcount is any function doing a population count ( count number of bits = 1 ). There will be some freedom to put more advanced constraints with i and their neighbors. If that is not needed we can strip the inner loop and remake the outer loop
for(int i = 0; i < myCollection.size(); i++)
times += (myCollection[i]==SOMETHING);
followed by a
for(int i=0;i<times;i++)
DoStuff();
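The simplified counting loop above does what std::count does. A minimal sketch, assuming the collection is a std::vector<int> and that DoStuff() does not need the index; count_matches is a hypothetical helper name, not part of the original answer:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many elements equal `something`.
// Whether the generated code is actually branch-free is up to the optimizer.
std::size_t count_matches(const std::vector<int>& collection, int something) {
    return static_cast<std::size_t>(
        std::count(collection.begin(), collection.end(), something));
}
```

The result can then drive the second loop, calling DoStuff() that many times.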
Also, if you don't mind reordering the collection, std::partition is cheap.
#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>
void DoStuff(int i)
{
std::cout << i << '\n';
}
int main()
{
using namespace std::placeholders;
std::vector<int> v {1, 2, 5, 0, 9, 5, 5};
const int SOMETHING = 5;
std::for_each(v.begin(),
std::partition(v.begin(), v.end(),
std::bind(std::equal_to<int> {}, _1, SOMETHING)), // some condition
DoStuff); // action
}
I am in awe of the complexity of the above solutions. I was going to suggest a simple #define foreach(a,b,c,d) for(a; b; c)if(d) but it has a few obvious deficits, for example, you have to remember to use commas instead of semicolons in your loop, and you can't use the comma operator in a or c.
#include <list>
#include <iostream>
using namespace std;
#define foreach(a,b,c,d) for(a; b; c)if(d)
int main(){
list<int> a;
for(int i=0; i<10; i++)
a.push_back(i);
for(auto i=a.begin(); i!=a.end(); i++)
if((*i)&1)
cout << *i << ' ';
cout << endl;
foreach(auto i=a.begin(), i!=a.end(), i++, (*i)&1)
cout << *i << ' ';
cout << endl;
return 0;
}
Another solution in case the indices matter. This one builds a list of the indices for which doStuff() should be called. Once again the main point is to avoid the branching and trade it for pipelineable arithmetic costs.
int buffer[someSafeSize];
int cnt = 0; // counter to keep track where we are in list.
for( int i = 0; i < container.size(); i++ ){
int lDecision = (container[i] == SOMETHING);
buffer[cnt] = lDecision*i + (1-lDecision)*buffer[cnt];
cnt += lDecision;
}
for( int i=0; i<cnt; i++ )
doStuff(buffer[i]); // now we could pass the index or a pointer as an argument.
The "magical" line is the buffer loading line, which arithmetically calculates whether to keep the old value and stay in position or to store the new index and advance the position. So we trade away a potential branch for some logic and arithmetic and maybe some cache hits. A typical scenario where this would be useful is if doStuff() does a small amount of pipelineable calculations and any branch in between calls could interrupt those pipelines.
Then just loop over the buffer and run doStuff() until we reach cnt. This time we will have the current i stored in the buffer so we can use it in the call to doStuff() if we would need to.
One can describe your code pattern as applying some function to a subset of a range, or in other words: applying it to the result of applying a filter to the whole range.
This is achievable in the most straightforward manner with Eric Niebler's range-v3 library, although it's a bit of an eyesore because you want to work with indices:
using namespace ranges;
auto mycollection_has_something =
[&](std::size_t i) { return myCollection[i] == SOMETHING; };
auto filtered_view =
views::iota(std::size_t{0}, myCollection.size()) |
views::filter(mycollection_has_something);
for (auto i : filtered_view) { DoStuff(); }
But if you're willing to forgo indices, you'd get:
auto is_something = [&SOMETHING](const decltype(SOMETHING)& x) { return x == SOMETHING; };
auto filtered_collection = myCollection | views::filter(is_something);
for (const auto& x : filtered_collection) { DoStuff(); }
which is nicer IMHO.
PS - The ranges library is mostly going into the C++ standard in C++20.
I'll just mention Mike Acton, he would definitely say:
If you have to do that, you have a problem with your data. Sort your data!
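One way to read that advice, as a sketch: once the collection is sorted, all elements equal to SOMETHING sit in one contiguous block, which std::equal_range finds without testing every element. The function name for_each_equal and the int element type are illustrative assumptions, not from the original answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts the data, then visits only the block of
// elements equal to `something`. Returns how many elements were visited.
template <class Action>
std::size_t for_each_equal(std::vector<int>& data, int something, Action action) {
    std::sort(data.begin(), data.end());
    auto block = std::equal_range(data.begin(), data.end(), something);
    std::for_each(block.first, block.second, action);
    return static_cast<std::size_t>(block.second - block.first);
}
```

In practice the sort would be done once, when the data is built, so that each later query only pays for the binary search.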

Array access on a Getter that returns a pointer, is that bad practice?

Imagine the following scenario:
class A
{
int a[50];
int* GetAPtr() { return a; };
};
...
A b;
if(b.GetAPtr()[22] == SOME_RANDOM_DEFINE) do_this_and_that();
Is this kind of access considered bad practice? b.GetAPtr()[22]
To clarify my situation:
1. I cannot use new/malloc in this case, the array must be static
2. This is meant to encapsulate older C code that uses multiple arrays, where this comes in extremely handy
3. I know that returning a pointer can possibly return a NULL pointer, we do not talk about that issue here
If you really need such a constant expression, you could wrap it in a member function:
class A
{
int a[50];
public:
bool check_this_and_that() { return a[22] == SOME_RANDOM_DEFINE; };
};
...
A b;
if(b.check_this_and_that()) do_this_and_that();
Magic numbers are bad in general, but inside a class's own logic they are more forgivable, and outsiders don't have to see them.
Yes, it is bad practice, because you have no way of knowing how long the array is. You could follow the idiomatic standard library approach and return begin and end pointers, pointing to the first and one-past-last elements.
class A
{
int a[50];
public:
int* begin() { return &a[0]; };
int* end() { return &a[50]; };
const int* begin() const { return &a[0]; };
const int* end() const { return &a[50]; };
size_t size() const { return 50; } // this could be handy too
};
As well as giving you the tools to iterate over the elements like you would over a standard library container, this allows you to check whether any pointer to an element of the array is < b.end(). For example
int* it = b.begin() + 22;
if(it < b.end() && *it == SOME_RANDOM_DEFINE) do_this_and_that();
This makes it trivial to use standard library algorithms:
A b;
// fill with increasing numbers, starting at 0 (std::iota is declared in <numeric>)
std::iota(b.begin(), b.end(), 0);
// sort in descending order
std::sort(b.begin(), b.end(), std::greater<int>());
// C++11 range based for loop
for (auto i : b)
std::cout << i << " ";
std::cout << std::endl;
GetAPtr is a method for accessing a private data member. Now ask yourself what are the advantages of b.GetAPtr()[22] over b.a[22]?
Encapsulating data is a good way to maintain constraints on and between data members. In your case there is at least a correlation between the a array and its length 50.
Depending on the use of A you could build an interface providing different access patterns:
class A {
int a[50];
public:
// low level
int atA(unsigned i) const { return a[i]; }
// or "mid" level
int getA(unsigned i) const { if(i >= 50) throw OutOfRange(); return a[i]; };
// or high level
bool checkSomething() const { return a[22] == SOME_RANDOM_DEFINE; }
};
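If the member does not have to remain a raw C array (an assumption; the question does not say whether this is allowed), std::array carries its length in the type and still needs no heap allocation. A minimal sketch; the class name Samples and its member names are hypothetical:

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

class Samples {
    std::array<int, 50> a{};  // fixed size, zero-initialized, no new/malloc
public:
    // Checked access: std::array::at throws std::out_of_range on a bad index.
    int get(std::size_t i) const { return a.at(i); }
    void set(std::size_t i, int value) { a.at(i) = value; }
    // Raw pointer for legacy C code; size() tells the caller how far it may go.
    int* data() { return a.data(); }
    std::size_t size() const { return a.size(); }
};
```

Legacy functions that expect an int* can still be fed data(), while new code goes through the checked accessors.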

function which is able to return different types?

I am trying to create a function in C++ and am wondering whether it can return different types of vectors, e.g. depending on the case it returns a vector of string, int, double, or anything else.
Is it possible in c++? (I do not want to use overload function with different arg(S) and different returns)
I am very new to C++ and my question may seem to be stupid.
here is a piece of my code:
//zero here means intersection
std::vector<??????> findZeros(const mesh::Region& s, char *model) const
{
//Point
if( model == "point" )
{
std::vector<Vertex> zeros;
for(Region::pointIterator it = s.beginPoint(); itv != s.endPoint(); ++itv )
{
if( abs(Val(*it)) < 1.e-12 )
zeros.push_back(*it);
}
std::vector<point> zerosP(zeros.begin(), zeros.end());
return zerosP;
}
//line
else if (EntityS == "line")
{
std::vector<line> zerosE;
std::vector<Point&> PointE;
for(Region::lineIterator ite = s.beginLine(); ite != s.endLine(); ++ite )
{
Line ed = *ite;
Point P0 = ed.point(0);
Point P1 = e.point(1);
if( ......... ) zerosE.push_back(ed);
else if ( ....... )
{
PontE.push_back( P0, P1);
zerosE.push_back(ed);
}
}
//here I want to return "point" or "line with its points" or in upper level our surface.
//I want to do all in one function!
}
Templates
Try this:
template <typename T>
std::vector<T> func( /* arguments */ )
{
std::vector<T> v;
// ... do some stuff to the vector ...
return v;
}
You can call this function with different type in this way:
std::vector<int> vi = func<int>( args );
std::vector<double> vd = func<double>( args );
Alternatives
This is one way, if you know the types at compile-time. If you don't know the type at compile-time but at run-time only, then you have different choices:
Use unions. I can only recommend this, if you have very simple C-struct-like types which are called PODs (plain old data) in the C++ standard.
Use some type of variant. For example there is boost::variant from the Boost libraries or QVariant from the Qt library. They are a safe kind of unions on more general types. They also allow some conversions between different types. For example setting something to an integer value will make it possible to read the same value as floating point number.
Use boost::any which can wrap any type but does not allow conversions between them.
Use inheritance and polymorphism. For this case you need a common base class, say Base. Then you create an array of pointers to that base, preferably with std::shared_ptr. So the array type would be std::vector<std::shared_ptr<Base>>. The std::shared_ptr is better than built-in pointers in this case because it manages your memory automatically by reference counting.
Use a dynamic language that doesn't care about types and performance.
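The inheritance option from the list above can be sketched as follows. Base, Point, Line and make_entities are hypothetical names chosen for illustration; the asker's real mesh types would take their place.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const = 0;
};
struct Point : Base { std::string name() const override { return "point"; } };
struct Line  : Base { std::string name() const override { return "line"; } };

// One return type, whatever concrete objects end up inside the vector.
std::vector<std::shared_ptr<Base>> make_entities(const std::string& model) {
    std::vector<std::shared_ptr<Base>> result;
    if (model == "point")
        result.push_back(std::make_shared<Point>());
    else if (model == "line")
        result.push_back(std::make_shared<Line>());
    return result;  // empty when the model is not recognised
}
```

The caller works through the virtual interface, e.g. make_entities("line").front()->name(), and never needs to know the concrete type.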
C++17 Update
If you know the type at compile time, you can use templates as illustrated in this answer.
If the type is known at runtime only, C++17 offers std::variant as an alternative to boost::variant.
Here is a working example:
#include <iostream>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>
using variant_vector = std::variant<std::vector<int>, std::vector<std::string>>;
auto get_vector(int i)
{
if (i < 0)
return variant_vector(std::vector<int>(3, 1));
else
return variant_vector(std::vector<std::string>(3, "hello"));
}
int main()
{
auto visit_vec = [](const auto& vec) {
using vec_type = typename std::remove_reference_t<decltype(vec)>::value_type;
if constexpr (std::is_same_v<vec_type, int>)
std::cout << "vector of int:" << std::endl;
else if constexpr (std::is_same_v<vec_type, std::string>)
std::cout << "vector of string:" << std::endl;
for (const auto& x : vec)
std::cout << x << std::endl;
};
std::visit(visit_vec, get_vector(-1));
std::visit(visit_vec, get_vector(1));
return 0;
}
See it live on Coliru.
In the code above, the function get_vector returns a std::variant object that either holds a std::vector<int> or a std::vector<std::string>. The contents of the returned object are inspected using std::visit.
It depends on exactly what you're trying to accomplish, but there are multiple possibilities for how to do this. Here are a few that come to mind:
If one of a specific list of return types is decided inside the function:
Since you edited your question, this seems to be what you want. You might try boost::variant:
boost::variant<int, double, std::string> foo() {
if (something)
//set type to int
else if (something else)
//set type to double
else
//set type to std::string
}
If the return type depends on a template argument:
You can use SFINAE to manipulate overload resolution:
// enable_if sits in a non-type parameter so that the two templates have different signatures
template<typename T, typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
std::vector<int> foo() {...}
template<typename T, typename std::enable_if<std::is_floating_point<T>::value, int>::type = 0>
std::vector<std::string> foo() {...}
If the return type can be anything:
A boost::any would work well:
boost::any foo() {...}
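Since C++17 the standard library provides std::any, which plays the same role as boost::any. A minimal sketch, where the selector parameter is a hypothetical stand-in for whatever decides the type at run time:

```cpp
#include <any>
#include <string>
#include <vector>

// Returns a vector of int or a vector of string, wrapped in std::any.
std::any foo(int selector) {
    if (selector == 0)
        return std::vector<int>{1, 2, 3};
    return std::vector<std::string>{"a", "b"};
}
```

The caller recovers the value with std::any_cast<std::vector<int>>(result), which throws std::bad_any_cast if a different type is stored; the pointer form std::any_cast<std::vector<int>>(&result) returns nullptr instead.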
If the return type is always derived from a specific class:
Return a smart pointer to the base class:
std::unique_ptr<Base> foo() {
if (something)
return std::unique_ptr<Base>{new Derived1};
if (something else)
return std::unique_ptr<Base>{new Derived2};
}
You can use templates if you know which type to return before you call the function. But you can't have a function that decides internally which type to return.
What you can do is create a class that acts as a container for the returned data, fill an object of this class with the desired data, and then return this object.
typedef enum { VSTRING, VINT, V_WHATEVER ... } datatype;
class MyReturnClass {
public:
datatype d; // tells the caller which pointer is valid
// now either
vector<string> * vs;
vector<int> * vi;
// or
void * vector;
};
MyReturnClass * thisIsTheFunction () {
MyReturnClass * return_me = new MyReturnClass();
return_me->d = VSTRING;
return_me->vs = new vector<string>;
return return_me;
}
To update @chris' answer, since C++17 you can use std::variant:
#include <variant>
std::variant<int, double, std::string> foo() {
if (something)
//set type to int
else if (something else)
//set type to double
else
//set type to std::string
}
auto result = foo();
if (std::holds_alternative<int>(result)) {
int value = std::get<int>(result);
}
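A filled-in version of the skeleton above, as a sketch: the selector parameter, the returned values and the describe helper are made up for illustration.

```cpp
#include <string>
#include <variant>

std::variant<int, double, std::string> foo(int selector) {
    if (selector == 0)
        return 42;                   // variant now holds an int
    else if (selector == 1)
        return 3.5;                  // variant now holds a double
    else
        return std::string("text");  // variant now holds a std::string
}

// Reports which alternative is currently active.
std::string describe(const std::variant<int, double, std::string>& v) {
    if (std::holds_alternative<int>(v))
        return "int";
    if (std::holds_alternative<double>(v))
        return "double";
    return "string";
}
```

std::get<T> throws std::bad_variant_access when T is not the active alternative, so checking with std::holds_alternative first (or using std::visit) keeps the access safe.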