Generic type vector performances - c++

I need a vector class that can manage a set of predefined types and has the same performance as the standard std::vector.
I can't just rely on a simple template because the type isn't defined until run time. Below are two implementations I tested, but the first is about 4 times slower and the second 3 times slower than a simple std::vector. I also tested having a function pointer to the proper setValue that is assigned in the constructor, but it was 10 times slower.
#include <chrono>
#include <iostream>
#include <vector>
class TypeBase
{
public:
virtual void setValue(int id, int value) = 0;
virtual void setValue(int id, float value) = 0;
};
class TypeInt : public TypeBase
{
public:
int* data;
TypeInt(char* ptr)
{
data = (int*)ptr;
}
void setValue(int id, int value)
{
data[id] = value;
}
void setValue(int id, float value)
{
data[id] = (int)value;
}
};
class TypeFloat : public TypeBase
{
public:
float* data;
TypeFloat(char* ptr)
{
data = (float*)ptr;
}
void setValue(int id, int value)
{
data[id] = (float)value;
}
void setValue(int id, float value)
{
data[id] = value;
}
};
class genericTypeVector1
{
public:
// Allow to get the right setValue
TypeBase* typeMng = nullptr;
// Vector storage
std::vector<char> data;
enum class Types {Int, Float};
genericTypeVector1(Types type, size_t size)
{
switch (type)
{
case Types::Int:
data.resize(size * sizeof(int));
typeMng = new TypeInt(data.data());
break;
case Types::Float:
data.resize(size * sizeof(float));
typeMng = new TypeFloat(data.data());
break;
}
}
~genericTypeVector1()
{
if (typeMng != nullptr)
delete typeMng;
}
void setValue(int id, int value)
{
typeMng->setValue(id, value);
}
void setValue(int id, float value)
{
typeMng->setValue(id, value);
}
};
class genericTypeVector2
{
public:
// Vector storage
std::vector<char> data;
enum class Types { Int, Float };
// Current type of the vector
Types type;
genericTypeVector2(Types type, size_t size)
:type(type)
{
switch (type)
{
case Types::Int:
data.resize(size * sizeof(int));
break;
case Types::Float:
data.resize(size * sizeof(float));
break;
}
}
void setValue(int id, int value)
{
switch (type)
{
case Types::Int:
((int*)data.data())[id] = value;
break;
case Types::Float:
((float*)data.data())[id] = (float)value;
break;
}
}
void setValue(int id, float value)
{
switch (type)
{
case Types::Int:
((int*)data.data())[id] = (int)value;
break;
case Types::Float:
((float*)data.data())[id] = value;
break;
}
}
};
template<class T>
inline void setValue(T* ptr, int id, float value)
{
ptr[id] = value;
}
int main()
{
int N = 100000000;
{
std::vector<float> v(N);
auto begin = std::chrono::steady_clock::now();
for (int i = 0; i < N; ++i)
v[i] = 2 * i - N;
std::cout << "std::vector : " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - begin).count() << "ms" << std::endl;
}
{
genericTypeVector1 a(genericTypeVector1::Types::Float, N);
auto begin = std::chrono::steady_clock::now();
for (int i = 0; i < N; ++i)
a.setValue(i, 2 * i - N);
std::cout << "genericTypeVector with type manager : " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - begin).count() << "ms" << std::endl;
}
{
genericTypeVector2 a(genericTypeVector2::Types::Float, N);
auto begin = std::chrono::steady_clock::now();
for (int i = 0; i < N; ++i)
a.setValue(i, 2 * i - N);
std::cout << "genericTypeVector with switch-case in setValue : " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - begin).count() << "ms" << std::endl;
}
{
genericTypeVector2 a(genericTypeVector2::Types::Float, N);
auto begin = std::chrono::steady_clock::now();
switch (a.type)
{
case genericTypeVector2::Types::Int:
for (int i = 0; i < N; ++i)
setValue((int*)a.data.data(), i, 2 * i - N);
break;
case genericTypeVector2::Types::Float:
for (int i = 0; i < N; ++i)
setValue((float*)a.data.data(), i, 2 * i - N);
break;
}
std::cout << "genericTypeVector with switch case outside for loop : " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - begin).count() << "ms" << std::endl;
}
}
I don't know what to try now. Perhaps I'm on a completely wrong path.
I added a last example that shows that it's possible to get the same performance as with a raw std::vector, but I don't want to have this switch-case every time I call setValue.

[Edit]
I've redone my whole answer now that I've looked at the code and run it! One thing, BTW, from a benchmarking standpoint -- you have code like this:
genericTypeVector2 a(genericTypeVector2::Types::Float, N);
auto begin = std::chrono::steady_clock::now();
for (int i = 0; i < N; ++i)
a.setValue(i, 2 * i - N);
std::cout << "genericTypeVector with switch-case in setValue : " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - begin).count() << "ms" << std::endl;
The time measurements don't begin until after the data structure is initialized. I was wondering why on earth your final version was beating std::vector when it still involved dynamic dispatch:
std::vector : 46ms
genericTypeVector with type manager : 182ms
genericTypeVector with switch-case in setValue : 115ms
genericTypeVector with switch case outside for loop : 44ms
... until I looked at your code more carefully and realized you weren't factoring in the time to construct the structure. std::vector is often aggressively optimized in terms of the functions it uses for constructing and destroying PODs, able to do things like a calloc if the type is trivially default-constructible (although it still often just uses a plain old placement new loop for fill constructors), in which case it might not even need to touch the memory until you begin to access it via operator[]. So with the way you're measuring, the bulk of the paging and cache-miss overhead is likely incurred by vector inside its timed region (I'd need to profile to make sure, but this seems most likely), while your versions incur it before you actually begin measuring.
So I fixed the test and got these times, which are closer to my expectations from having repeatedly benchmarked and profiled similar code:
std::vector : 109ms
genericTypeVector with type manager : 276ms
genericTypeVector with switch-case in setValue : 191ms
genericTypeVector with switch case outside for loop : 131ms
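The exact change to the benchmark isn't shown, but the point about measurement can be sketched as follows: start the clock before the container is constructed, so that allocation and first-touch page faults land inside the timed region for every variant. elapsedMs and timeVectorFill are hypothetical helper names, not part of the original code.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` once and returns the elapsed wall time in ms.
template <class Work>
long long elapsedMs(Work work)
{
    const auto begin = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
}

// Times construction AND the fill loop together, so no variant gets its
// allocation or page-touching cost for free.
inline long long timeVectorFill(std::size_t n, float& lastValue)
{
    return elapsedMs([&] {
        std::vector<float> v(n);               // construction is inside the timed region
        for (std::size_t i = 0; i < n; ++i)
            v[i] = static_cast<float>(i);
        lastValue = v.empty() ? 0.0f : v.back(); // keep a result so the loop is not dead code
    });
}
```

The same wrapper would be applied to each of the generic vector variants so that all of them are measured over the same span of work.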
And that's not too bad unless you have a real-world case where just setting values of array elements in a loop is showing up as one of your bigger hotspots. It's easy to get cross-eyed trying to do these ultra granular performance tests and think 3-4x slower is so bad when most domains involve more meaty processing over each element (unless we're talking about image processing or something like that).
But the main overhead I suspect is just the extra branching for the dynamic dispatch in all your cases (two layers for version 1, one for version 2) + switch overhead.
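One way to pay for the dispatch once per bulk operation instead of once per element, in the spirit of the question's last variant, is to hide the switch inside a member function template. This is only a sketch under assumptions: TypedBuffer, fill and fillTyped are hypothetical names, and it stores two typed vectors rather than the question's std::vector<char> to avoid pointer casts.

```cpp
#include <cstddef>
#include <vector>

class TypedBuffer
{
public:
    enum class Type { Int, Float };

    TypedBuffer(Type type, std::size_t size) : type_(type)
    {
        if (type_ == Type::Int) ints_.resize(size);
        else                    floats_.resize(size);
    }

    // The runtime type is inspected once; the loop itself is statically typed.
    template <class Generator>
    void fill(Generator gen)
    {
        switch (type_)
        {
        case Type::Int:   fillTyped(ints_, gen);   break;
        case Type::Float: fillTyped(floats_, gen); break;
        }
    }

    const std::vector<int>&   ints()   const { return ints_; }
    const std::vector<float>& floats() const { return floats_; }

private:
    template <class T, class Generator>
    static void fillTyped(std::vector<T>& v, Generator gen)
    {
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] = static_cast<T>(gen(i)); // no branch on the type in here
    }

    Type type_;
    std::vector<int> ints_;
    std::vector<float> floats_;
};
```

Client code then calls something like buffer.fill([](std::size_t i) { return 2.0 * i; }) and never sees the switch. Whether this matches raw std::vector speed would still have to be measured.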
One thing you can possibly do about the switch versions is to use __assume(0) in MSVC or computed gotos (labels as values) in GCC. It's a trick I learned optimizing interpreters. You can get a big improvement in performance if the switch branching is dominating the times, since both of these solutions eliminate the extra branch that would otherwise be required for a default case when you handle all possible cases.
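A minimal sketch of the "no default case" hint, assuming the two types from the question. UNREACHABLE is a hypothetical macro name; it maps to __assume(0) on MSVC and, as a swapped-in alternative to computed gotos, to __builtin_unreachable() on GCC and Clang. Whether the compiler actually drops a branch depends on the compiler and flags, so treat it as something to benchmark rather than a guaranteed win.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability macro; falls back to a no-op on unknown compilers.
#if defined(_MSC_VER)
#  define UNREACHABLE() __assume(0)
#elif defined(__GNUC__) || defined(__clang__)
#  define UNREACHABLE() __builtin_unreachable()
#else
#  define UNREACHABLE() ((void)0)
#endif

enum class Type { Int, Float };

inline void setValue(Type type, void* data, std::size_t id, float value)
{
    assert(data != nullptr);
    switch (type)
    {
    case Type::Int:
        static_cast<int*>(data)[id] = static_cast<int>(value);
        break;
    case Type::Float:
        static_cast<float*>(data)[id] = value;
        break;
    default:
        // Promise to the optimizer that no other value can occur.
        // Passing any other value is undefined behavior.
        UNREACHABLE();
    }
}
```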


Cyclic splitting of execution into several threads (1-N-1-N-1...)

Consider this case:
for (...)
{
const size_t count = ...
for (size_t i = 0; i < count; ++i)
{
calculate(i); // thread-safe function
}
}
What is the most elegant solution to maximize performance using C++17 and/or boost?
Cyclic "create + join" threads makes no sense because of huge overhead (which in my case exactly equals possible gain).
So I have to create N threads only once and keep them synchronized with the main one (using mutex, shared_mutex, condition_variable, atomic, etc.). That turned out to be quite a difficult task for such a common and clear situation (in order to make everything really safe and fast). After working on it for days, I have the feeling that I'm reinventing the wheel...
Update 1: calculate(x) and calculate(y) can (and should) run in parallel
Update 2: std::atomic::fetch_add (or something similar) is preferable to a queue (or something similar)
Update 3: extreme computations (i.e. millions of "outer" calls and hundreds of "inner")
Update 4: calculate() changes internal object's data without returning a value
Intermediate solution
For some reason "async + wait" is much faster than "create + join" threads. So these two examples give a 100% speed increase:
Example 1
for (...)
{
const size_t count = ...
future<void> execution[cpu_cores];
for (size_t x = 0; x < cpu_cores; ++x)
{
execution[x] = async(launch::async, ref(*this), x, count);
}
for (size_t x = 0; x < cpu_cores; ++x)
{
execution[x].wait();
}
}
void operator()(const size_t x, const size_t count)
{
for (size_t i = x; i < count; i += cpu_cores)
{
calculate(i);
}
}
Example 2
for (...)
{
index = 0;
const size_t count = ...
future<void> execution[cpu_cores];
for (size_t x = 0; x < cpu_cores; ++x)
{
execution[x] = async(launch::async, ref(*this), count);
}
for (size_t x = 0; x < cpu_cores; ++x)
{
execution[x].wait();
}
}
atomic<size_t> index;
void operator()(const size_t count)
{
for (size_t i = index.fetch_add(1); i < count; i = index.fetch_add(1))
{
calculate(i);
}
}
Is it possible to make it even faster by creating threads only once and then synchronize them with a small overhead?
Final solution
Additional +20% of speed increase in comparison to std::async!
for (size_t i = 0; i < _countof(index); ++i) { index[i] = i; }
for_each_n(par_unseq, index, count, [&](const size_t i) { calculate(i); });
Is it possible to avoid redundant array "index"?
Yes:
for_each_n(par_unseq, counting_iterator<size_t>(0), count,
[&](const size_t i)
{
calculate(i);
});
In the past, you'd use OpenMP, GNU Parallel, Intel TBB.¹
If you have C++17², I'd suggest using execution policies with standard algorithms.
It's really better than you can expect to do things yourself, although it requires some forethought to choose your types to be amenable to standard algorithms, and it still helps if you know what will happen under the hood.
Here's a simple example without further ado:
Live On Compiler Explorer
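#include <algorithm>
#include <chrono>
#include <execution>
#include <functional>
#include <iostream>
#include <iterator>
#include <numeric>
#include <random>
#include <string>
#include <thread>
#include <vector>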
using namespace std::chrono_literals;
static size_t s_random_seed = std::random_device{}();
static auto generate_param() {
static std::mt19937 prng {s_random_seed};
static std::uniform_int_distribution<> dist;
return dist(prng);
}
struct Task {
Task(int p = generate_param()) : param(p), output(0) {}
int param;
int output;
struct ByParam { bool operator()(Task const& a, Task const& b) const { return a.param < b.param; } };
struct ByOutput { bool operator()(Task const& a, Task const& b) const { return a.output < b.output; } };
};
static void calculate(Task& task) {
//std::this_thread::sleep_for(1us);
task.output = task.param ^ 0xf0f0f0f0;
}
int main(int argc, char** argv) {
if (argc>1) {
s_random_seed = std::stoull(argv[1]);
}
std::vector<Task> jobs;
auto now = std::chrono::high_resolution_clock::now;
auto start = now();
std::generate_n(
std::execution::par_unseq,
back_inserter(jobs),
1ull << 28, // reduce for small RAM!
generate_param);
auto laptime = [&](auto caption) {
std::cout << caption << " in " << (now() - start)/1.0s << "s" << std::endl;
start = now();
};
laptime("generate randum input");
std::sort(
std::execution::par_unseq,
begin(jobs), end(jobs),
Task::ByParam{});
laptime("sort by param");
std::for_each(
std::execution::par_unseq,
begin(jobs), end(jobs),
calculate);
laptime("calculate");
std::sort(
std::execution::par_unseq,
begin(jobs), end(jobs),
Task::ByOutput{});
laptime("sort by output");
auto const checksum = std::transform_reduce(
std::execution::par_unseq,
begin(jobs), end(jobs),
0, std::bit_xor<>{},
std::mem_fn(&Task::output)
);
laptime("reduce");
std::cout << "Checksum: " << checksum << "\n";
}
When run with the seed 42, prints:
generate randum input in 10.8819s
sort by param in 8.29467s
calculate in 0.22513s
sort by output in 5.64708s
reduce in 0.108768s
Checksum: 683872090
CPU utilization is 100% on all cores except for the first (random-generation) step.
¹ (I think I have answers demoing all of these on this site).
² See Are C++17 Parallel Algorithms implemented already?
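For readers who cannot use the parallel algorithms, the question's idea of creating the threads once and handing out indices with fetch_add can be sketched with a mutex and two condition variables. This is an illustration under assumptions, not a tuned implementation: ParallelFor and its members are hypothetical names, the calling thread only waits and does no work itself, and fn must not throw.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class ParallelFor
{
public:
    explicit ParallelFor(unsigned workers)
    {
        assert(workers > 0);
        for (unsigned w = 0; w < workers; ++w)
            threads_.emplace_back([this] { workerLoop(); });
    }

    ~ParallelFor()
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        start_cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    // Calls fn(i) for every i in [0, count) on the workers; blocks until all are done.
    void run(std::size_t count, const std::function<void(std::size_t)>& fn)
    {
        std::unique_lock<std::mutex> lock(m_);
        fn_ = &fn;
        count_ = count;
        next_.store(0);
        active_ = threads_.size();
        ++generation_;                       // tells every worker a new batch exists
        start_cv_.notify_all();
        done_cv_.wait(lock, [this] { return active_ == 0; });
    }

private:
    void workerLoop()
    {
        std::size_t seen = 0;                // last batch this worker took part in
        for (;;)
        {
            const std::function<void(std::size_t)>* fn = nullptr;
            std::size_t count = 0;
            {
                std::unique_lock<std::mutex> lock(m_);
                start_cv_.wait(lock, [&] { return stop_ || generation_ != seen; });
                if (stop_) return;
                seen = generation_;
                fn = fn_;
                count = count_;
            }
            // Indices are handed out with fetch_add, as in the question's Example 2.
            for (std::size_t i = next_.fetch_add(1); i < count; i = next_.fetch_add(1))
                (*fn)(i);
            {
                std::lock_guard<std::mutex> lock(m_);
                if (--active_ == 0) done_cv_.notify_one();
            }
        }
    }

    std::mutex m_;
    std::condition_variable start_cv_, done_cv_;
    std::vector<std::thread> threads_;
    const std::function<void(std::size_t)>* fn_ = nullptr;
    std::size_t count_ = 0;
    std::atomic<std::size_t> next_{0};
    std::size_t active_ = 0;
    std::size_t generation_ = 0;
    bool stop_ = false;
};
```

The pool would be constructed once outside the outer loop, with pool.run(count, ...) called inside it. With only hundreds of inner iterations per call, the wake-up latency of the condition variables may still dominate, so this needs measuring against the for_each_n version.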

How to calculate the length of a mpz_class in bytes?

I want to implement RSA with padding but first I have to find out the length in bytes of the message which is a mpz_class item. Which function would be useful in cpp to accomplish this?
const mpz_class m(argv[1])
What is the length of m in bytes?
Thank you!
@Shawn's comment is correct: the bytes occupied in memory by your class are not what you should be concerned about. Not only does the layout of the bytes in memory depend on how your compiler decides to pack them, but their order can also depend on the hardware used.
So, instead of doing some awkward and very fragile memcpy-like thing that is almost guaranteed to break at some point, you should construct the message yourself (search keyword: serialization). This also has the advantage that your class can contain stuff that you don't want to add to the message (caches with temp results, or other implementation/optimization stuff).
To the best of my knowledge, C++ (unlike, e.g., C#) does not come with built-in serialization support, but there are likely libraries that can do a lot of it for you. Otherwise, you just have to write your "data member to byte array" functions yourself.
Super simple example:
#include <cstdint>
#include <iostream>
#include <vector>
class myClass
{
int32_t a;
public:
myClass(int32_t val) : a(val) {}
// Deserializer
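bool initFromBytes(const std::vector<uint8_t>& msg)
{
if (msg.size() < 4)
return false;
// Assemble in an unsigned accumulator: shifting a byte into the sign bit of an int is not portable
uint32_t bits = 0;
for (int i = 0; i < 4; ++i)
{
bits |= static_cast<uint32_t>(msg[i]) << (i * 8);
}
a = static_cast<int32_t>(bits);
return true;
}
// Serializer (little-endian: least significant byte first)
std::vector<uint8_t> toBytes() const
{
std::vector<uint8_t> res;
const uint32_t bits = static_cast<uint32_t>(a);
for (int i = 0; i < 4; ++i)
{
res.push_back(static_cast<uint8_t>(bits >> (i * 8)));
}
return res;
}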
void print() { std::cout << "myClass: " << a << std::endl; }
};
int main()
{
myClass myC(123456789);
myC.print();
std::vector<uint8_t> message = myC.toBytes();
myClass recreate(0);
if (recreate.initFromBytes(message))
recreate.print();
else
std::cout << "Error" << std::endl;
return 0;
}
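As for the question as literally asked (how many bytes the numeric value needs, e.g. for RSA padding), the arithmetic is just ceil(bits / 8). The sketch below shows it on a plain 64-bit integer so that it has no dependencies; bitLength and byteLength are hypothetical names. For an mpz_class the bit count would, as an assumption about how you'd wire it up, come from GMP's mpz_sizeinbase(m.get_mpz_t(), 2) instead of the loop.

```cpp
#include <cstddef>
#include <cstdint>

// Number of significant bits in v (0 for v == 0).
inline std::size_t bitLength(std::uint64_t v)
{
    std::size_t bits = 0;
    while (v != 0)
    {
        ++bits;
        v >>= 1;
    }
    return bits;
}

// Bytes needed to store v: ceil(bits / 8), with zero still taking one byte.
inline std::size_t byteLength(std::uint64_t v)
{
    const std::size_t bits = bitLength(v);
    return bits == 0 ? 1 : (bits + 7) / 8;
}
```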

Is there a way to replace two similar functions that differ only by type with a single solution in this case?

There's a class named PlotCurve. It describes a chart as a container of points and operations on them. The data for PlotCurve comes from the class RVDataProvider. The important thing is that the number of points provided by RVDataProvider may be large (more than 1 million), so RVDataProvider returns a read-only pointer to the Y data (the X data can be calculated from the index into the pointer) to improve performance.
The main problem is that RVDataProvider has two different methods for two types:
class RVDataProvider : public QObject, public IRVImmutableProvider
{
public:
// ...
ReadonlyPointer<float> getSignalDataFloat(int signalIndex, quint64 start, quint64 count) override;
ReadonlyPointer<double> getSignalDataDouble(int signalIndex, quint64 start, quint64 count) override;
// ...
}
ReadonlyPointer<T> is only a read-only wrapper of a C-style pointer.
In order to get a curve's range of values (for looking for min-max, painting them on the canvas, etc) I am supposed to declare different functions too.
class PlotCurve : public QObject
{
public:
// ...
virtual ReadonlyPointer<float> getFloatPointer(quint64 begin, quint64 length) const;
virtual ReadonlyPointer<double> getDoublePointer(quint64 begin, quint64 length) const;
// ...
}
This leads to a switch statement in the client code, which has to change whenever a new data type is added.
switch (dataType())
{
case RVSignalInfo::DataType::Float: {
auto pointer = getFloatPointer(begin, length);
Q_ASSERT(!(pointer).isNull());
for (quint64 i = 0; i < (length); ++i) {
auto y = (pointer)[i];
if (y < (minY)) { (minY) = y; continue; }
if (y > (maxY)) { (maxY) = y; }
}
} break;
case RVSignalInfo::DataType::Double: {
auto pointer = getDoublePointer(begin, length);
Q_ASSERT(!(pointer).isNull());
for (quint64 i = 0; i < (length); ++i) {
auto y = (pointer)[i];
if (y < (minY)) { (minY) = y; continue; }
if (y > (maxY)) { (maxY) = y; }
}
} break;
// ...
}
Is there a way to get rid of this dependency in the client code? Three things came to mind:
1) Create an Iterator type that wraps ReadonlyPointer. Nope: performance drops by a factor of 10 or more because of the iterator's virtual functions.
2) Create a traverse method that applies some function to every value in a range. Nope again: the most optimized version using function pointers is two times slower than the switch statement in client code.
3) Make the PlotCurve class a template. That way I can't add different PlotCurves to one container as I can now.
Unfortunately, I don't see much that can be done for OP's problem.
At best, the similar-looking parts of the cases could be moved into a macro or a function template to prevent code duplication.
For demonstration, I modeled OP's problem with the following sample code:
enum DataType { Float, Double };
struct Data {
std::vector<float> dataFloat;
std::vector<double> dataDouble;
DataType type;
Data(const std::vector<float> &data): dataFloat(data), type(Float) { }
Data(const std::vector<double> &data): dataDouble(data), type(Double) { }
};
With a function template, processing could look like this:
namespace {
// helper function template for process()
template <typename T>
std::pair<double, double> getMinMax(const std::vector<T> &values)
{
assert(values.size());
double min = values[0], max = values[0];
for (const T &value : values) {
if (min > value) min = value;
else if (max < value) max = value;
}
return std::make_pair(min, max);
}
} // namespace
void process(const Data &data)
{
std::pair<double, double> minMax;
switch (data.type) {
case Float: minMax = getMinMax(data.dataFloat); break;
case Double: minMax = getMinMax(data.dataDouble); break;
}
std::cout << "range: " << minMax.first << ", " << minMax.second << '\n';
}
Live Demo on coliru
With a macro it would appear even more compact:
void process(const Data &data)
{
std::pair<double, double> minMax;
switch (data.type) {
#define CASE(TYPE) \
case TYPE: { \
assert(data.data##TYPE.size()); \
minMax.first = minMax.second = data.data##TYPE[0]; \
for (const double value : data.data##TYPE) { \
if (minMax.first > value) minMax.first = value; \
else if (minMax.second < value) minMax.second = value; \
} \
} break
CASE(Float);
CASE(Double);
#undef CASE
}
std::cout << "range: " << minMax.first << ", " << minMax.second << '\n';
}
Live Demo on coliru
Many people (me included) consider macros in C++ dangerous. Unlike everything else, macros are not subject to namespaces or scopes. This can cause confusion if an identifier unexpectedly becomes subject to preprocessing. In the worst case, the unintentionally modified code passes the compiler and leads to unexpected behavior at runtime. (My sad experience.)
However, this is not expected in this case (assuming the code would be part of a source file).
I would've preferred a third alternative which places the repeated code inside of process(). A lambda came to mind, but C++11 lambdas cannot be templated (generic lambdas with auto parameters arrived in C++14, and explicit template parameter lists in C++20): SO: Can lambda functions be templated?.
A local template (functor) isn't an alternative either. It's prohibited as well: SO: Why can't templates be declared in a function?.
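If C++14 is available, the "repeated code inside the function" alternative can be sketched with a generic lambda. This is an illustration only: it reuses the Data struct from the sample above, and minMaxOf is a hypothetical name chosen here (it returns the range instead of printing it).

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum DataType { Float, Double };

struct Data {
    std::vector<float> dataFloat;
    std::vector<double> dataDouble;
    DataType type;
    Data(const std::vector<float> &data): dataFloat(data), type(Float) { }
    Data(const std::vector<double> &data): dataDouble(data), type(Double) { }
};

std::pair<double, double> minMaxOf(const Data &data)
{
    // Generic lambda (C++14): instantiated once per element type it is called with.
    auto getMinMax = [](const auto &values) {
        assert(!values.empty());
        double min = values[0], max = values[0];
        for (const auto &value : values) {
            if (min > value) min = value;
            else if (max < value) max = value;
        }
        return std::make_pair(min, max);
    };
    switch (data.type) {
    case Float: return getMinMax(data.dataFloat);
    case Double: return getMinMax(data.dataDouble);
    }
    return std::make_pair(0.0, 0.0); // not reached for valid DataType values
}
```

The switch itself remains, so this only removes the duplicated loop body, just like the function template version.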
After feedback from OP, a note about X macros: it's an ancient technique in C to prevent redundancy of data.
A "data table" is defined where each row is a "call" of a macro X (not defined here) which contains all features.
To use the data table:
1) define a macro X which uses just the arguments needed in the individual case (and ignores the rest)
2) #include the data table
3) #undef X.
The sample again:
void process(const Data &data)
{
std::pair<double, double> minMax;
switch (data.type) {
#define X(TYPE_ID, TYPE) \
case TYPE_ID: { \
assert(data.data##TYPE_ID.size()); \
minMax.first = minMax.second = data.data##TYPE_ID[0]; \
for (const double value : data.data##TYPE_ID) { \
if (minMax.first > value) minMax.first = value; \
else if (minMax.second < value) minMax.second = value; \
} \
} break;
#include "Data.inc"
#undef X
}
std::cout << "range: " << minMax.first << ", " << minMax.second << '\n';
}
where Data.inc is:
X(Float, float)
X(Double, double)
X(Int, int)
Live Demo on coliru
Although this macro trickery looks a bit scary, it is very convenient for maintenance. If a new data type has to be added, a new X() line in Data.inc (and, of course, a recompile) is all that's necessary. (The compiler / build chain will hopefully track the dependencies of all sources on Data.inc. We never faced problems with this in Visual Studio.)
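The same X-macro idea can be tried out in a single file by keeping the table in a macro instead of a separate Data.inc. The following sketch makes that substitution; DATA_TYPES, name and elementSize are hypothetical names, and the table generates the enum plus two lookup functions from one list of rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The "data table": one row per supported type (stands in for Data.inc).
#define DATA_TYPES(X) \
    X(Float, float)   \
    X(Double, double) \
    X(Int, int)

// Use 1: generate the enumerators (only the first column is used).
enum class DataType {
#define X(TYPE_ID, TYPE) TYPE_ID,
    DATA_TYPES(X)
#undef X
};

// Use 2: generate a name lookup.
inline const char *name(DataType type)
{
    switch (type) {
#define X(TYPE_ID, TYPE) case DataType::TYPE_ID: return #TYPE_ID;
    DATA_TYPES(X)
#undef X
    }
    assert(false && "unknown DataType");
    return "?";
}

// Use 3: generate an element-size lookup (only the second column is used).
inline std::size_t elementSize(DataType type)
{
    switch (type) {
#define X(TYPE_ID, TYPE) case DataType::TYPE_ID: return sizeof(TYPE);
    DATA_TYPES(X)
#undef X
    }
    assert(false && "unknown DataType");
    return 0;
}
```

Adding a row such as X(Short, short) to DATA_TYPES extends the enum and both functions at once.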

Return the size of the hash table?

Excuse me in advance if I'm not explaining this clearly.
Okay so I have declared a hash table using a vector like so:
class HashTable{
private:
vector<string> arrayofbuckets[100];
public:
void insertelement(string input);
void deleteelement(string remove);
bool lookupelement(string search);
int tablesize();
}; // end of class
I have also created a menu using a switch statement to insert elements into the hash table:
case 'I':
{
cout << " Which element would you like to insert?: ";
cin >> Element;
hash.insertelement(Element);
}
break;
It then gets passed on to this function:
void HashTable::insertelement(string input){
int hashValue = 0;
for(int i = 0; i<input.length(); i++){
hashValue = hashValue + int(input[i]);
}
hashValue = hashValue % 100;
arrayofbuckets[hashValue].push_back(input);
cout << " The element " << input << " has been put into value " << hashValue << endl;
}
Does anyone have any idea how to write a function to obtain and display the size of the table?
The best way is to keep track of the size inside functions that should initialise or modify it:
HashTable::HashTable() : size_(0) { }
void HashTable::insertelement(string input){
...do all the existing stuff...
++size_;
}
// similarly --size_ inside deleteelement...
int HashTable::tablesize() const { return size_; }
Make sure you add an int size_; data member.
Do note that bool lookupelement(string search) const; and int tablesize() const; should be const - I've inserted the keyword here so you know where to put it, and used it above when defining tablesize().
If you were really determined to avoid an extra member variable, you could also do this...
int HashTable::tablesize() const {
int size = 0;
for (const std::vector<std::string>& vs : arrayofbuckets)
size += vs.size();
return size;
}
...but most users will expect a constant-time and fast size() function: they may call it every time through their loops, so keep it cheap.
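Putting the counter approach together, a compact version of the class might look like the sketch below. It keeps the question's member function names and hash function; the bodies of deleteelement and lookupelement are assumptions, since the question does not show them. Note that the counter is only decremented when an element was actually removed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

class HashTable {
public:
    void insertelement(const std::string& input) {
        buckets_[bucketOf(input)].push_back(input);
        ++size_;
    }

    void deleteelement(const std::string& remove) {
        std::vector<std::string>& bucket = buckets_[bucketOf(remove)];
        auto it = std::find(bucket.begin(), bucket.end(), remove);
        if (it != bucket.end()) {
            bucket.erase(it);
            --size_;   // only count elements that were really removed
        }
    }

    bool lookupelement(const std::string& search) const {
        const std::vector<std::string>& bucket = buckets_[bucketOf(search)];
        return std::find(bucket.begin(), bucket.end(), search) != bucket.end();
    }

    int tablesize() const { return size_; }   // constant time

private:
    static constexpr std::size_t kBuckets = 100;

    // Same hash as in the question: sum of the character codes modulo 100.
    static std::size_t bucketOf(const std::string& s) {
        std::size_t hash = 0;
        for (unsigned char c : s) hash += c;
        return hash % kBuckets;
    }

    std::vector<std::string> buckets_[kBuckets];
    int size_ = 0;
};
```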

Lazy transform in C++

I have the following Python snippet that I would like to reproduce using C++:
from itertools import count, imap
source = count(1)
pipe1 = imap(lambda x: 2 * x, source)
pipe2 = imap(lambda x: x + 1, pipe1)
sink = imap(lambda x: 3 * x, pipe2)
for i in sink:
print i
I've heard of Boost Phoenix, but I couldn't find an example of a lazy transform behaving in the same way as Python's imap.
Edit: to clarify my question, the idea is not only to apply functions in sequence using a for loop, but rather to be able to use algorithms like std::transform on infinite generators. The way the functions are composed (in a more functional-language-like dialect) is also important, as the next step is function composition.
Update: thanks bradgonesurfing, David Brown, and Xeo for the amazing answers! I chose Xeo's because it's the most concise and it gets me right where I wanted to be, but David's was very important for getting the concepts across. Also, bradgonesurfing's pointed me to Boost.Range :).
Employing Boost.Range:
int main(){
auto map = boost::adaptors::transformed; // shorten the name
auto sink = generate(1) | map([](int x){ return 2*x; })
| map([](int x){ return x+1; })
| map([](int x){ return 3*x; });
for(auto i : sink)
std::cout << i << "\n";
}
Live example including the generate function.
I think the most idiomatic way to do this in C++ is with iterators. Here is a basic iterator class that takes an iterator and applies a function to its result:
template<class Iterator, class Function>
class LazyIterMap
{
private:
Iterator i;
Function f;
public:
LazyIterMap(Iterator i, Function f) : i(i), f(f) {}
decltype(f(*i)) operator* () { return f(*i); }
void operator++ () { ++i; }
};
template<class Iterator, class Function>
LazyIterMap<Iterator, Function> makeLazyIterMap(Iterator i, Function f)
{
return LazyIterMap<Iterator, Function>(i, f);
}
This is just a basic example and is still incomplete as it has no way to check if you've reached the end of the iterable sequence.
Here's a recreation of your example python code (also defining a simple infinite counter class).
#include <iostream>
class Counter
{
public:
Counter (int start) : value(start) {}
int operator* () { return value; }
void operator++ () { ++value; }
private:
int value;
};
int main(int argc, char const *argv[])
{
Counter source(1); // count(1) in the Python version starts at 1
auto pipe1 = makeLazyIterMap(source, [](int n) { return 2 * n; });
auto pipe2 = makeLazyIterMap(pipe1, [](int n) { return n + 1; });
auto sink = makeLazyIterMap(pipe2, [](int n) { return 3 * n; });
for (int i = 0; i < 10; ++i, ++sink)
{
std::cout << *sink << std::endl;
}
}
Apart from the class definitions (which are just reproducing what the python library functions do), the code is about as long as the python version.
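The missing end check mentioned above can be added by comparing the underlying iterators, which also makes the adapter usable in a range-based for loop. The sketch below (C++14, for return type deduction) does this for finite ranges such as std::vector; MapIterator, MapRange and lazyMap are hypothetical names, and an infinite source would additionally need a range type whose end is never reached.

```cpp
#include <utility>
#include <vector>

template <class Iterator, class Function>
class MapIterator
{
public:
    MapIterator(Iterator it, const Function* f) : it_(it), f_(f) {}
    auto operator*() const { return (*f_)(*it_); }   // the function runs only on dereference
    MapIterator& operator++() { ++it_; return *this; }
    bool operator!=(const MapIterator& other) const { return it_ != other.it_; }
private:
    Iterator it_;
    const Function* f_;   // points into the owning MapRange
};

template <class Range, class Function>
class MapRange
{
public:
    MapRange(Range range, Function f) : range_(std::move(range)), f_(std::move(f)) {}
    auto begin() const
    {
        return MapIterator<decltype(range_.begin()), Function>(range_.begin(), &f_);
    }
    auto end() const
    {
        return MapIterator<decltype(range_.end()), Function>(range_.end(), &f_);
    }
private:
    Range range_;   // stored by value to keep the sketch simple
    Function f_;
};

template <class Range, class Function>
MapRange<Range, Function> lazyMap(Range range, Function f)
{
    return MapRange<Range, Function>(std::move(range), std::move(f));
}
```

Stages are stacked by passing one MapRange to the next lazyMap call; nothing is computed until the final range is iterated. Copying the source range into each stage is a simplification that a real implementation would avoid.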
I think the boost::rangex library is what you are looking for. It should work nicely with the new C++ lambda syntax.
int pipe1(int val) {
return 2*val;
}
int pipe2(int val) {
return val+1;
}
int sink(int val) {
return val*3;
}
for(int i=0; i < SOME_MAX; ++i)
{
cout << sink(pipe2(pipe1(i))) << endl;
}
I know, it's not quite what you were expecting, but it certainly evaluates at the time you want it to, although not with an iterator interface. A very related article is this:
Component programming in D
Edit 6/Nov/12:
An alternative, still sticking to bare C++, is to use function pointers and construct your own piping for the above functions (vector of function pointers from SO q: How can I store function pointer in vector?):
typedef std::vector<int (*)(int)> funcVec;
int runPipe(funcVec funcs, int sinkVal) {
int running = sinkVal;
for(funcVec::iterator it = funcs.begin(); it != funcs.end(); ++it) {
running = (*(*it))(running); // not sure of the braces and asterisks here
}
return running;
}
This is intended to run through all the functions in a vector of such and return the resulting value. Then you can:
funcVec funcs;
funcs.push_back(&pipe1);
funcs.push_back(&pipe2);
funcs.push_back(&sink);
for(int i=0; i < SOME_MAX; ++i)
{
cout << runPipe(funcs, i) << endl;
}
Of course you could also construct a wrapper for that via a struct (or, since C++11, a lambda closure):
struct pipeWork {
funcVec funcs;
int run(int i);
};
int pipeWork::run(int i) {
//... guts as runPipe, or keep it separate and call:
return runPipe(funcs, i);
}
// later...
pipeWork kitchen;
kitchen.funcs = someFuncs;
// A plain function pointer cannot refer to a member function bound to an object; a lambda can capture it
auto foo = [&kitchen](int i) { return kitchen.run(i); };
cout << foo(5) << endl;
Or something like that. Caveat: No idea what this will do if the pointers are passed between threads.
Extra caveat: If you want to do this with varying function interfaces, you will end up having to have a load of void *(*)(void *) functions so that they can take whatever and emit whatever, or lots of templating to fix the kind of pipe you have. I suppose ideally you'd construct different kinds of pipe for different interfaces between functions, so that a | b | c works even when they are passing different types between them. But I'm going to guess that that's largely what the Boost stuff is doing.
Depending on the simplicity of the functions :
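// Parenthesize arguments and results; otherwise sink(j) expands to 2*j+1*3
#define pipe1(x) (2 * (x))
#define pipe2(x) (pipe1(x) + 1)
#define sink(x) (pipe2(x) * 3)
int j = 0;
while( ++j > 0 ) // starts at 1; running past INT_MAX is signed overflow (undefined behavior)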
{
std::cout << sink(j) << std::endl;
}