Most elegant way to write a one-shot 'if' - c++

Since C++17 one can write an if block that will be executed exactly once, like this:
#include <iostream>
int main() {
    for (unsigned i = 0; i < 10; ++i) {
        if (static bool do_once = true; do_once) { // Enter only once
            std::cout << "hello one-shot" << std::endl;
            // Possibly much more code
            do_once = false;
        }
    }
}
I know I might be overthinking this, and there are other ways to solve it, but still: is it possible to write this in roughly the following form, so that there is no need for the do_once = false at the end?
if (DO_ONCE) {
// Do stuff
}
I'm thinking of a helper function, do_once(), containing the static bool do_once, but what if I wanted to use that same function in different places? Might this be the time and place for a #define? I hope not.
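To illustrate why the plain helper-function idea falls short: a single static flag inside a free do_once() is shared by every caller, so it fires only once per program run rather than once per call site (a minimal sketch; the helper name is hypothetical):
#include <utility>
bool do_once() {
    static bool first = true;           // one flag shared by all call sites
    return std::exchange(first, false); // true exactly once per program run
}
// if (do_once()) { ... }  // only the first call site that gets here sees true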

Use std::exchange:
if (static bool do_once = true; std::exchange(do_once, false))
You can make it shorter by reversing the truth value:
if (static bool do_once; !std::exchange(do_once, true))
But if you are using this a lot, don't be fancy and create a wrapper instead:
struct Once {
bool b = true;
explicit operator bool() { return std::exchange(b, false); }
};
And use it like:
if (static Once once; once)
The variable is not supposed to be referenced outside the condition, so the name does not buy us much. Taking inspiration from other languages like Python which give a special meaning to the _ identifier, we may write:
if (static Once _; _)
Further improvements: take advantage of the BSS section (@Deduplicator), avoid the memory write when we have already run (@ShadowRanger), and give a branch prediction hint if you are going to test many times (e.g. like in the question):
// GCC, Clang, icc only; use [[likely]] in C++20 instead
#define likely(x) __builtin_expect(!!(x), 1)
struct Once {
bool b = false;
explicit operator bool()
{
if (likely(b))
return false;
b = true;
return true;
}
};
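For reference, a C++20 sketch of the same wrapper using the standard [[likely]] attribute (mentioned in the comment above) instead of the compiler-specific macro:
struct Once {
    bool b = false;
    explicit operator bool()
    {
        if (b) [[likely]]   // after the first pass, the early-out is the hot path
            return false;
        b = true;
        return true;
    }
};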

Maybe not the most elegant solution, and you don't see any actual if, but the standard library actually covers this case; see std::call_once.
#include <cstdio>  // std::puts
#include <mutex>   // std::once_flag, std::call_once
std::once_flag flag;
int main() {
    for (int i = 0; i < 10; ++i)
        std::call_once(flag, [](){ std::puts("once"); }); // body runs exactly once
}
The advantage here is that this is thread safe.

C++ does have a built-in control-flow primitive that consists of "(before-block; condition; after-block)" already:
for (static bool b = true; b; b = false)
Or hackier, but shorter:
for (static bool b; !b; b = !b)
However, I think any of the techniques presented here should be used with care, as they are not (yet?) very common.

In C++17 you can write
if (static int i; i == 0 && (i = 1)){
in order to avoid playing around with i in the loop body. i starts at 0 (static variables are zero-initialized, as guaranteed by the standard), and (i = 1) sets it to 1 the first time the condition is evaluated.
Note that in C++11 you could achieve the same with a lambda function
if ([]{static int i; return i == 0 && (i = 1);}()){
which also carries a slight advantage in that i is not leaked into the loop body.

static bool once = [] {
std::cout << "Hello one-shot\n";
return false;
}();
This solution is thread safe (unlike many of the other suggestions).
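For context, the declaration above is meant to live inside the function (or block) whose first execution should trigger the side effect; a minimal sketch of the placement:
#include <iostream>
void do_work() {
    // The lambda runs when control first passes through this declaration,
    // i.e. on the first call to do_work(); later calls skip it.
    static bool once = [] {
        std::cout << "Hello one-shot\n";
        return false;
    }();
    (void)once;  // the value itself is never needed
    // ... code that runs on every call ...
}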

You could wrap the one-time action in the constructor of a static object that you instantiate in place of the conditional.
Example:
#include <iostream>
#include <functional>
struct do_once {
do_once(std::function<void(void)> fun) {
fun();
}
};
int main()
{
for (int i = 0; i < 3; ++i) {
static do_once action([](){ std::cout << "once\n"; });
std::cout << "Hello World\n";
}
}
Or you may indeed stick with a macro, that may look something like this:
#include <iostream>
#define DO_ONCE(exp) \
do { \
static bool used_before = false; \
if (used_before) break; \
used_before = true; \
{ exp; } \
} while(0)
int main()
{
for (int i = 0; i < 3; ++i) {
DO_ONCE(std::cout << "once\n");
std::cout << "Hello World\n";
}
}

As @damon said, you can avoid using std::exchange by using a decrementing integer, but you have to remember that negative values resolve to true, so the condition must stop decrementing at zero. The way to use this would be:
if (static int n_times = 3; n_times && n_times--)
{
std::cout << "Hello world x3" << std::endl;
}
Translating this to @Acorn's fancy wrapper would look like this:
struct n_times {
int n;
n_times(int number) {
n = number;
};
explicit operator bool() {
return n && n--;
};
};
...
if(static n_times _(2); _)
{
std::cout << "Hello world twice" << std::endl;
}

While using std::exchange as suggested by @Acorn is probably the most idiomatic way, an exchange operation is not necessarily cheap. Of course, static initialization is guaranteed to be thread-safe (unless you tell your compiler not to do it), so any considerations about performance are somewhat futile in the presence of the static keyword anyway.
If you are concerned about micro-optimization (as people using C++ often are), you could as well scratch bool and use an int instead, which allows you to use a post-increment directly in the condition:
if(static int do_once = 0; !do_once++)
bool used to have an increment operator (it never had a decrement), but it was deprecated long ago, back in C++98, and is removed altogether in C++17, so you cannot write this with a bool. An int, however, increments just fine and still works as a Boolean condition.
Bonus: You can implement do_twice or do_thrice similarly...
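For instance, a hedged sketch of that bonus idea with an illustrative counter wrapper (the name AtMost is not from the answer above):
// Run the guarded block at most N times instead of exactly once.
template <int N>
struct AtMost {
    int n = N;
    explicit operator bool() { return n && n--; }
};
// for (unsigned i = 0; i < 10; ++i)
//     if (static AtMost<2> twice; twice)
//         std::cout << "hello (at most) twice\n";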

Based on @Bathsheba's great answer for this - just made it even simpler.
In C++ 17, you can simply do:
if (static int i; !i++) {
cout << "Execute once";
}
(In previous versions, just declare int i outside the block. Also works in C :) ).
In simple words: you declare i, a static local variable that is zero-initialized.
Zero is falsy, so we use the exclamation mark (!) operator to negate it.
We then rely on the post-increment property of i++: the old value is what gets used in the expression, and only afterwards is i incremented.
Therefore i is 0 only the first time the condition is evaluated, making !i++ true exactly once; on every later pass i is already non-zero, so the block is skipped.

Related

How to select functions called inside nested loops before getting into loops?

As shown in the following code, one of several atomic routines is called in the function messagePassing.
Which one to use is determined before diving into the nested loops.
In the current implementation, several while loops are used for the sake of runtime performance.
I want to avoid repeating myself (repeating the shared operations in the nested loops) for the sake of readability and maintainability, and achieve something like messagePassingCleanButSlower.
Is there an approach which does not sacrifice runtime performance?
I need to deal with two scenarios.
In the first one, the atomic routines are small and only involve 3 plus/minus operations, thus I guess they will be inlined.
In the second one, the atomic routines are big (about 200 lines) and hence unlikely to be inlined.
#include <vector>
template<typename Uint, typename Real>
class Graph {
public:
void messagePassing(Uint nit, Uint type);
void messagePassingCleanButSlower(Uint nit, Uint type);
private:
struct Vertex {}; // Details are hidden since they are distracting.
std::vector< Vertex > vertices;
void atomicMessagePassingType1(Vertex &v);
void atomicMessagePassingType2(Vertex &v);
void atomicMessagePassingType3(Vertex &v);
// ...
// may have other types
};
template<typename Uint, typename Real>
void
Graph<Uint, Real>::
messagePassing(Uint nit, Uint type)
{
Uint count = 0; // round counter
if (type == 1) {
while (count < nit) {
++count;
// many operations
for (auto &v : vertices) {
// many other operations
atomicMessagePassingType1(v);
}
}
}
else if (type == 2) {
while (count < nit) {
++count;
// many operations
for (auto &v : vertices) {
// many other operations
atomicMessagePassingType2(v);
}
}
}
else {
while (count < nit) {
++count;
// many operations
for (auto &v : vertices) {
// many other operations
atomicMessagePassingType3(v);
}
}
}
}
template<typename Uint, typename Real>
void
Graph<Uint, Real>::
messagePassingCleanButSlower(Uint nit, Uint type)
{
Uint count = 0; // round counter
while (count < nit) {
++count;
// many operations
for (auto &v : vertices) {
// many other operations
if (type == 1) {
atomicMessagePassingType1(v);
}
else if (type == 2) {
atomicMessagePassingType2(v);
}
else {
atomicMessagePassingType3(v);
}
}
}
}
See the benchmarks here:
http://quick-bench.com/rMsSb0Fg4I0WNFX8QbKugCe3hkc
For 1., I have set up a test scenario where the operations in atomicMessagePassingTypeX are really short (only an optimization barrier). I chose roughly 100 elements for vertices and 100 iterations of the outer while. These conditions will be different for your actual code, so whether my benchmark results apply to your case is something you must verify by benchmarking your own code.
The four test cases are: Your two variants, the one with a function pointer mentioned in the other answers and one where the function pointer is called through a dispatching lambda, like this:
template<typename Uint, typename Real>
void
Graph<Uint, Real>::
messagePassingLambda(Uint nit, Uint type)
{
using ftype = decltype(&Graph::atomicMessagePassingType1);
auto lambda = [&](ftype what_to_call) {
Uint count = 0; // round counter
while (count < nit) {
++count;
// many operations
for (auto &v : vertices) {
// many other operations
(this->*what_to_call)(v);
}
}
};
if(type == 1) lambda(&Graph::atomicMessagePassingType1);
else if(type == 2) lambda(&Graph::atomicMessagePassingType2);
else lambda(&Graph::atomicMessagePassingType3);
}
Try all combinations of GCC 9.1/Clang 8.0 and O2/O3. You will see that at O3 both compilers give roughly the same performance for your "slow" variant; in the case of GCC, it is actually the best. The compiler does hoist the if/else statements out of at least the inner loops and then, for some reason that is not completely clear to me, GCC reorders the instructions in the inner loop differently than it does for the first variant, making it even slightly faster.
The function pointer variant is consistently slowest.
The lambda variant is effectively equal to your first variant in performance. I guess it is clear why they are essentially the same if the lambda is inlined.
If it is not inlined, then there might be a significant performance penalty due to the indirect call of what_to_call. This can be avoided by forcing a different type with appropriate direct call at each call site of lambda:
With C++14 or later you can make a generic lambda:
auto lambda = [&](auto what_to_call) {
adjust the call form (this->*what_to_call)(v); to what_to_call(v); and call it with another lambda:
lambda([&](Vertex &v){ atomicMessagePassingType1(v); });
which will force the compiler to instantiate one function per dispatch and that should remove any potential indirect calls.
With C++11 you cannot make a generic lambda or variable template and so you would need to write an actual function template taking the secondary lambda as argument.
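A rough C++11 sketch of that idea, reusing the names from the question; the member template messagePassingImpl is illustrative and would have to be added to Graph:
template<typename Uint, typename Real>
class Graph {
    // ... members as in the question ...
    template <typename F>
    void messagePassingImpl(Uint nit, F what_to_call)
    {
        Uint count = 0; // round counter
        while (count < nit) {
            ++count;
            // many operations
            for (auto &v : vertices) {
                // many other operations
                what_to_call(v); // direct call, inlinable per instantiation
            }
        }
    }
public:
    void messagePassing(Uint nit, Uint type)
    {
        if (type == 1)
            messagePassingImpl(nit, [this](Vertex &v) { atomicMessagePassingType1(v); });
        else if (type == 2)
            messagePassingImpl(nit, [this](Vertex &v) { atomicMessagePassingType2(v); });
        else
            messagePassingImpl(nit, [this](Vertex &v) { atomicMessagePassingType3(v); });
    }
};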
You can use a function pointer to make the decision before entering the loop, like so:
template<typename Uint, typename Real>
void
Graph<Uint, Real>::
messagePassingV2(Uint nit, bool isType1)
{
    void (Graph::*aMPT_Ptr)(Vertex &); // Thanks to @uneven_mark for the correct
    if (isType1)
        aMPT_Ptr = &Graph<Uint, Real>::atomicMessagePassingType1; // syntax here
    else
        aMPT_Ptr = &Graph<Uint, Real>::atomicMessagePassingType2;
    Uint count = 0; // round counter
    while (count < nit) {
        ++count;
        for (auto& v : vertices) {
            (this->*aMPT_Ptr)(v); // Again, thanks to @uneven_mark for the syntax!
        }
    }
}
The one thing that remains as a potential issue is what happens if either of the functions 'assigned' to the pointer is inlined. I'm thinking that, as there is code taking the address of these functions, the compiler will probably not inline them.
There are a couple ways.
1) Bool param. This really just moves the if/else into the function... but that's a good thing when you use the function[s] in multiple places, and a bad thing if you're trying to move the test out of the loop. OTOH, speculative execution should mitigate that.
2) Member function pointers. Nasty syntax in the raw, but auto can bury all that for us.
#include <functional>
#include <iostream>
class Foo
{
public:
void bar() { std::cout << "bar\n"; }
void baz() { std::cout << "baz\n"; }
};
void callOneABunch(Foo& foo, bool callBar)
{
auto whichToCall = callBar ? &Foo::bar : &Foo::baz;
// without the auto, this would be "void(Foo::*)()"
// typedef void(Foo::*TypedefNameGoesHereWeirdRight)();
for (int i = 0; i < 4; ++i)
{
std::invoke(whichToCall, foo); // C++17
(foo.*whichToCall)(); // ugly, several have recommended wrapping it in a macro
Foo* foop = &foo;
(foop->*whichToCall)(); // yep, still ugly
}
}
int main() {
Foo myFoo;
callOneABunch(myFoo, true);
}
You can also take a swing at this with std::function or std::bind, but after arguing with std::function for a bit, I fell back on the bare syntax.
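For completeness, a hedged sketch of that std::function route, reusing the same Foo; it buys a plain call syntax at the cost of type erasure:
#include <functional>
void callOneABunchTypeErased(Foo& foo, bool callBar)
{
    // A pointer to member function is a valid target for std::function
    // when the object is passed as the first argument.
    std::function<void(Foo&)> whichToCall = callBar ? &Foo::bar : &Foo::baz;
    for (int i = 0; i < 4; ++i)
        whichToCall(foo); // no .* or ->* needed at the call site
}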

Can (a==1)&&(a==2)&&(a==3) evaluate to true? (and can it be useful?)

Inspired by another question regarding the JavaScript language: can the expression
(a==1)&&(a==2)&&(a==3)
evaluate to true in C++? (And if so, can it actually be useful?)
Yes it can:
class Foo
{
public:
bool operator==(int a)
{
return true;
}
};
Then, let a be of type Foo and voila.
Can this actually be useful? I don't really see it being useful, no.
Can the expression evaluate to true in C++?
Yes, nothing is impossible...
struct A {
int x = 0;
};
bool operator==(A& a, int num) {
return ++a.x == num;
}
Then:
if ((a == 1) && (a == 2) && (a == 3)) {
std::cout << "meow" << std::endl;
}
prints meow.
But I have no idea of any practical usage of such weird overloading and, hopefully, will never see such code in production.
Could be somewhat useful.
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
struct Foo {
std::vector<int> v = {1,2,3};
};
bool operator==(const Foo& foo, int i) {
return std::any_of(foo.v.begin(), foo.v.end(), [=](int v){ return v == i; });
}
int main() {
Foo a;
if (a==1 && a==2 && a==3)
cout << "Really??" << endl;
return 0;
}
As was noticed before, this trick can be performed with volatile. It is a more honest approach compared to changing the operator. Just let us use two threads:
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <thread>
volatile int a;
void changingValue() {
    std::srand(unsigned(std::time(0)));
    while (true) {
        a = (rand() % 3 + 1);
    }
}
void checkingValue() {
    while (true) {
        if (a == 1 && a == 2 && a == 3) {
            std::cout << "Good choice!" << std::endl;
        }
    }
}
int main() {
    std::thread changeValue = std::thread(changingValue);
    changeValue.detach();
    std::thread checkValue = std::thread(checkingValue);
    checkValue.detach();
    while (true) {
        continue;
    }
}
Moreover, in my case this code also works with no volatile declaration. As far as I understand, that should depend on the compiler and the processor. Maybe someone could correct me if I'm wrong.
Other things not mentioned yet (source):
a might have an overloaded operator int(), the operator for implicit conversion to int (instead of operator== as covered by other answers); a sketch of this follows after the macro example below.
a might be a preprocessor macro.
Example of the latter:
int b = 0;
#define a ++b
if ((a==1)&&(a==2)&&(a==3))
std::cout << "aha\n";
Operator overloading and macros are trivial solutions to such a riddle.
And if so, can it actually be useful?
Well, with some imagination... One possible use case I can think of is unit or integration tests in which you want to make sure that your overloaded operator== for some class works correctly, and you know for sure that it works incorrectly if it reports equality for different operands when it is not supposed to:
class A {
public:
A(int i);
bool operator==(int i) const;
// ...
};
A a(1);
if ((a==1)&&(a==2)&&(a==3)) {
// failed test
// ...
}
I assume a requirement is a valid program free of undefined behaviour. Otherwise simply introduce something like a data race and wait for the right circumstances to occur.
In a nutshell: Yes, it is possible for user-defined types. C++ has operator overloading, so the related answers from the JavaScript question apply. a must be a user-defined type because we compare against integers and you cannot implement operator overloads where all parameters are built-in types. Given that, a trivial solution could look like:
struct A {};
bool operator==(A, int) { return true; }
bool operator==(int, A) { return true; }
Can something like this be useful? As the question is stated: almost certainly not. There is a strong semantic meaning implied by the operator symbols used in their usual context. In the case of == that’s equality comparison. Changing that meaning makes for a surprising API, and that’s bad because it encourages incorrect usage.
However there are libraries that explicitly use operators for completely different purposes.
We have an example from the STL itself: iostream’s usage of << and >>.
Another one is Boost Spirit. They use operators for writing parsers in an EBNF like syntax.
Such redefinitions of the operator symbols are fine because they make it perfectly obvious that the usual operator symbols are used for a very different purpose.
Just to show my own idea: I was thinking of some data structure like a stream buffer or a ring buffer.
With template inheritance we can "hide away the operator": nothing is altered in the data structure itself, and all the checking is done in the template superclass.
template<class C>
class Iterable{
C operator==(int a);
public:
Iterable(){pos = 0;}
int pos;
};
template<class C>
class Stream : public Iterable<Stream<C>>{
public:
Stream(C a){ m[0] = a; }
C m[32];
int operator==(int a){
return (m[this->pos++]==a); }
};
int main(){
Stream<int> a(1);
a.m[0] = 1; a.m[1] = 3; a.m[2] = 2;
if(a==1 && a==3 && a==2)
std::cout<<"Really!?"<<std::endl;
return 0;
}
In this case the == to an integer could be a short-cut for "is the next packet/element in the stream of this ID number" for example.

C/C++ pattern: exiting a for() loop on elapsed timeout

In embedded C/C++ programming it is quite common to write for loops of this type:
...
for(int16_t i=0; i<5; i++)
{
if(true == unlockSIM(simPinNumber))
return true;
wait_ms(1000);
}
return false;
...
or like this while() loop:
bool CGps::waitGpsFix(int8_t fixVal)
{
int16_t iRunStatus=0, iFixStatus=0;
_timeWaitGpsFix = CSysTick::instance().set(TIME_TO_WAIT_GPS);
while(1)
{
bool bRet = _pGsm->GPSstatus(&iRunStatus, &iFixStatus);
if( (true == bRet) && (1 == iRunStatus) && (iFixStatus >= fixVal) )
return true;
if(CSysTick::instance().isElapsed(_timeWaitGpsFix))
return false;
wait_ms(500);
}
return false;
}
//---------------------------------------------------------------------------
Is there any well-known pattern that avoids writing so many lines each time and boils this down to a single function or method call?
For the for loop, you could use a function template that accepts a function (which must return a boolean) and returns when it succeeds. For the while loop, things get more complicated, but I guess that lambdas could be used as the true and false conditions.
For loop:
#include <iostream>
template<int retries, int wait_time, typename FUNC, typename ...Args>
bool retry(FUNC& f, Args &&... args)
{
for (int i = 0; i < retries; ++i)
{
if (f(std::forward<Args>(args)...)) return true;
if (i < retries - 1)
{
std::cout << "waiting " << wait_time << "\n";
}
}
return false;
}
bool func(int i)
{
return (i > 0);
}
bool func2(int i, int j)
{
return (i > j);
}
int main()
{
bool result = retry<5, 500>(func, 0);
std::cout << result << "\n";
result = retry<5, 500>(func, 1);
std::cout << result << "\n";
result = retry<5, 500>(func2, 1, 2);
std::cout << result << "\n";
result = retry<5, 500>(func2, 1, 0);
std::cout << result << "\n";
}
See example in coliru
This is simple enough with the execute-around idiom, which executes a given piece of code in an environment/set of circumstances controlled by the function the piece of code is passed in to. Here, we'll simply be calling the piece of code in a loop once every n milliseconds, either for a set amount of time, or for a set number of times.
Since you're working in an embedded environment and seem to be using a set of timing mechanisms different from those provided by <chrono> and <thread>, I've tried to adjust my answer so you can do the same thing with the methods you appear to have access to. These are the functions I've used in my solution:
// similar functions to what you seem to have access to
namespace timing{
// interval defined as some arbitrary type
interval getInterval(int msCount);
bool intervalElapsed(interval i);
void await(interval i);
}
A note on the await function there: it takes an interval and pauses execution until the interval has passed. It seems like the closest you can get to this might be simply waiting for the total number of milliseconds instead, though that strategy will introduce a certain amount of drift in your timings. If you can tolerate that (and given you're using it, it seems you can), then great.
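For concreteness, here is one guessed mapping of that interface onto the primitives visible in the question (CSysTick and wait_ms); the interval alias and the polling await are assumptions, not the question's real API:
namespace timing {
    // assume CSysTick::set() returns some copyable handle type
    using interval = decltype(CSysTick::instance().set(0));
    interval getInterval(int msCount)    { return CSysTick::instance().set(msCount); }
    bool     intervalElapsed(interval i) { return CSysTick::instance().isElapsed(i); }
    void     await(interval i)
    {
        while (!CSysTick::instance().isElapsed(i))
            wait_ms(1); // coarse polling; accepts the drift discussed above
    }
}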
The retry-for variant would look like this, given the above function signatures:
template <typename Func>
bool pollRetries(
int retryLimit,
int msInterval,
Func func){
for(int i = 0; i < retryLimit; ++i){
auto interval = timing::getInterval(msInterval);
if (func()){return true;}
timing::await(interval);
}
return false;
}
and the retry-while would look like this:
template <typename Func>
bool pollDuration(
int msLimit,
int msInterval,
Func func){
auto limit = timing::getInterval(msLimit);
while(!timing::intervalElapsed(limit)){
auto interval = timing::getInterval(msInterval);
if (func()){return true;}
timing::await(interval);
}
return false;
}
Live demo on Coliru
Both functions take a single functor which will be called with no arguments, and which returns either a boolean or something convertible to a boolean value. If the evaluation ever returns true, the loop will exit and the function will return true. Otherwise it'll return false at the end of the retry count or period.
Your sample code would convert to this:
retry for loop:
return pollRetries(5,1000,[simPinNumber](){
return unlockSIM(simPinNumber);
});
retry while loop:
return pollDuration(TIME_TO_WAIT_GPS, 500, [fixVal, this](){
int16_t
iRunStatus = 0,
iFixStatus = 0;
bool bRet = this->_pGsm->GPSstatus(&iRunStatus, &iFixStatus);
return bRet && (1 == iRunStatus) && (iFixStatus >= fixVal);
});
You can pass either functors or function pointers into these methods and the execution will occur, so you can also simply directly pass in lambdas to be executed. A few notes about that:
Lambdas without a capture group can be converted to function pointers with the unary + operator (see the small sketch after these notes), allowing the template to use the function-pointer instantiation of the template rather than one based off the lambda. You might want to do this because:
Every lambda in every function has an anonymous type. Passing the lambda into the template function will result in an additional template instantiation which might increase binary size.
You can also mitigate the above problem by defining your own functor class for uses that share a set of persistent state.
You might try making the timing functions into variadic templates per @stefaanv's solution. If you went this route, you could remove the capture groups from your lambdas and pass that information in manually, which would allow you to convert the lambdas into function pointers as though they were stateless.
Were most of these retry loops for a single class you could simply define the retry mechanisms as member function templates, and then subject yourself to member function pointers, thereby passing state around using the called object. I'd not recommend this though, as member function pointers are a bit of a pain to deal with, and you could get the same result by making a stateless lambda take a reference to the object, and passing in *this to the call. You'd also have to define all the bits of code as their own functions, rather than simply defining them within the function where they were used.
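A tiny illustration of the unary-plus note above (sensorReady is a hypothetical predicate):
bool sensorReady(); // hypothetical predicate
void example()
{
    // '+' forces conversion to bool (*)(), so both calls share one
    // pollRetries<bool (*)()> instantiation instead of two lambda-typed ones.
    pollRetries(3, 100, +[] { return sensorReady(); });
    pollRetries(3, 100, +[] { return !sensorReady(); });
}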

Can a function return the same value inside a loop, and return different values outside of loops?

It acts like this.
fun();//return 1;
for (int i = 0; i < 100; i++)
fun();//return 2;
fun();//return 3;
I don't want to do it manually, like:
static int i=0;
fun(){return i};
main()
{
i++;
fun();//return 1;
i++;
for (int i = 0; i < 100; i++)
fun();//return 2;
i++;
fun();//return 3;
}
New classes and static variables are allowed.
I am trying to design a cache replacement algorithm. Most of the time I use the LRU algorithm, but if I use the LRU algorithm inside a loop I will very likely get cache thrashing.
https://en.wikipedia.org/wiki/Thrashing_(computer_science)
I need to know if I am inside a loop. Then I can use the LFU algorithm to avoid thrashing.
An obvious way of doing this would be using the __LINE__ macro. It will return the source code line number, which will be different throughout your function.
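A minimal sketch of that idea (the macro wrapper and the assumption that calls from the same line happen back to back are mine, not part of the answer): repeated calls from one source line, such as a loop body, see the same value, while a call from a new line advances it.
#include <iostream>
int fun_impl(int line)
{
    static int last_line = -1;
    static int value = 0;
    if (line != last_line) { // first call from a new call site
        last_line = line;
        ++value;
    }
    return value;            // same line (e.g. a loop body) -> same value
}
#define fun() fun_impl(__LINE__)
int main()
{
    std::cout << fun() << '\n'; // 1
    for (int i = 0; i < 100; ++i)
        fun();                  // 2 on every iteration
    std::cout << fun() << '\n'; // 3
}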
It is not possible in C++ for a function to know whether or not it is inside a loop 100% of the time. However, if you are happy to do some manual coding to tell the function that it is inside a loop, then you can implement a simple solution using C++'s default parameters. For more information on default parameters see http://www.learncpp.com/cpp-tutorial/77-default-parameters/. Also, because global variables are generally frowned upon, I have placed the variable in a separate namespace in order to prevent clashes.
#include <iostream>
namespace global_variables {
    int i = 0;
}
int func(bool is_in_loop = false) {
    if (is_in_loop)
    {
        // do_something;
        return global_variables::i;
    }
    else
    {
        // do_something_else;
        return global_variables::i++;
    }
}
int main()
{
    // Calling the function outside of a loop
    std::cout << func();
    // Calling the function inside of a loop
    for (int j = 0; j < 100; j++)
    {
        // is_in_loop is overridden to be true.
        std::cout << func(true);
    }
    return 0;
}

"yield" keyword for C++, How to Return an Iterator from my Function?

Consider the following code.
std::vector<result_data> do_processing()
{
pqxx::result input_data = get_data_from_database();
return process_data(input_data);
}
std::vector<result_data> process_data(pqxx::result const & input_data)
{
std::vector<result_data> ret;
pqxx::result::const_iterator row;
for (row = input_data.begin(); row != input_data.end(); ++row)
{
// somehow populate output vector
}
return ret;
}
While I was thinking about whether or not I could expect Return Value Optimization (RVO) to happen, I found this answer by Jerry Coffin [emphasis mine]:
At least IMO, it's usually a poor idea, but not for efficiency reasons. It's a poor idea because the function in question should usually be written as a generic algorithm that produces its output via an iterator. Almost any code that accepts or returns a container instead of operating on iterators should be considered suspect.
Don't get me wrong: there are times it makes sense to pass around collection-like objects (e.g., strings) but for the example cited, I'd consider passing or returning the vector a poor idea.
Having some Python background, I like generators very much. Actually, if it were Python, I would have written the above function as a generator, i.e. to avoid the necessity of processing the entire data before anything else can happen. For example like this:
def process_data(input_data):
    for item in input_data:
        # somehow process items
        yield result_data
If I correctly interpreted Jerry Coffin's note, this is what he suggested, isn't it? If so, how can I implement this in C++?
No, that’s not what Jerry means, at least not directly.
yield in Python implements coroutines. C++ doesn’t have them (they can of course be emulated, but that’s a bit involved if done cleanly).
But what Jerry meant is simply that you should pass in an output iterator which is then written to:
template <typename O>
void process_data(pqxx::result const & input_data, O iter) {
    for (auto row = input_data.begin(); row != input_data.end(); ++row)
        *iter++ = some_value; // some_value stands for whatever is computed from *row
}
And call it:
std::vector<result_data> result;
process_data(input, std::back_inserter(result));
I’m not convinced though that this is generally better than just returning the vector.
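As a side note, C++20 added coroutines to the language, and with C++23's std::generator the Python version translates almost directly; a minimal sketch, assuming a standard library that ships <generator>:
#include <generator>
#include <iostream>
#include <vector>
std::generator<int> process_data(const std::vector<int>& input)
{
    for (int item : input)
        co_yield item * 2; // items are produced lazily, one at a time
}
int main()
{
    std::vector<int> input{1, 2, 3};
    for (int value : process_data(input))
        std::cout << value << '\n';
}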
There is a blog post by Boost.Asio author Chris Kohlhoff about this: http://blog.think-async.com/2009/08/secret-sauce-revealed.html
He simulates yield with a macro
#define yield \
if ((_coro_value = __LINE__) == 0) \
{ \
case __LINE__: ; \
(void)&you_forgot_to_add_the_entry_label; \
} \
else \
for (bool _coro_bool = false;; \
_coro_bool = !_coro_bool) \
if (_coro_bool) \
goto bail_out_of_coroutine; \
else
This has to be used in conjunction with a coroutine class. See the blog for more details.
When you parse something recursively, or when the processing has state, the generator pattern can be a good idea and can simplify the code greatly: one cannot easily iterate in such cases, and callbacks are normally the alternative. I want to have yield, and I find that Boost.Coroutine2 seems good to use now.
The code below is an example that cats files. Of course it is meaningless, until the point when you want to process the text lines further:
#include <fstream>
#include <functional>
#include <iostream>
#include <string>
#include <boost/coroutine2/all.hpp>
using namespace std;
typedef boost::coroutines2::coroutine<const string&> coro_t;
void cat(coro_t::push_type& yield, int argc, char* argv[])
{
for (int i = 1; i < argc; ++i) {
ifstream ifs(argv[i]);
for (;;) {
string line;
if (getline(ifs, line)) {
yield(line);
} else {
break;
}
}
}
}
int main(int argc, char* argv[])
{
using namespace std::placeholders;
coro_t::pull_type seq(
boost::coroutines2::fixedsize_stack(),
bind(cat, _1, argc, argv));
for (auto& line : seq) {
cout << line << endl;
}
}
I found that an istream-like behavior would come close to what I had in mind. Consider the following (untested) code:
struct data_source {
public:
// for delivering data items
data_source& operator>>(input_data_t & i) {
i = input_data.front();
input_data.pop_front();
return *this;
}
// for boolean evaluation
operator void*() { return input_data.empty() ? 0 : this; }
private:
std::deque<input_data_t> input_data;
// appends new data to private input_data
// potentially asynchronously
void get_data_from_database();
};
Now I can do as the following example shows:
int main () {
data_source d;
input_data_t i;
while (d >> i) {
// somehow process items
result_data_t r(i);
cout << r << endl;
}
}
This way the data acquisition is decoupled from the processing and is thereby allowed to happen lazily/asynchronously. That is, I could process the items as they arrive and would not have to wait until the vector is filled completely, as in the other example.