I'm writing a function that calculates integrals recursively using the trapezoid rule. For some f(x) on the interval (a,b), the method is to calculate the area of the big trapezoid with width (b-a) and then compare it with the sum of the small trapezoids formed after dividing the interval into n parts. If the difference is larger than some given error, the function is called again for each small trapezoid and the results are summed. If the difference is smaller, it returns the arithmetic mean of the two values.
The function takes two parameters: a function pointer to the function to be integrated, and a constant reference to an auxiliary structure containing information such as the interval (a,b), the number of partitions, etc.:
struct Config{
    double min, max;
    int partitions;
    double precision;
};
The problem arises when I want to change the amount of partitions with each iteration, for the moment let's say just increment by one. I see no way of doing this without resorting to calling the current depth of the recurrence:
double integrate(const Config &conf, funptr f){
    double a = conf.min, b = conf.max;
    int n = conf.partitions;
    //calculating the trapezoid areas here
    if(std::abs(bigTrapezoid - sumOfSmallTrapezoids) > conf.precision){
        double s = 0.;
        Config *configs = new Config[n];
        int newpartitions = n + (calls);
        for(int i = 0; i < n; ++i){
            configs[i] = { a + i*(b-a)/n, a + (i+1)*(b-a)/n, newpartitions, conf.precision };
            s += integrate(configs[i], f);
        }
        delete [] configs;
        return s;
    }
    else{
        return 0.5*(bigTrapezoid + sumOfSmallTrapezoids);
    }
}
The part I'm missing here is of course a way to find (calls). I have tried doing something similar to this answer, but it does not work; in fact it freezes the PC until the makefile kills the process. But perhaps I'm doing it wrong. I do not want to add an extra parameter to the function or an additional variable to the structure. How should I proceed?
You cannot "find" calls, but you can definitely pass it yourself, like this:
double integrate(const Config &conf, funptr f, int calls=0) {
    ...
    s += integrate(configs[i], f, calls+1);
    ...
}
It seems to me that 'int newpartitions = n + 1;' would be enough, no? At every recursion level, the number of partitions increases by one. Say conf.partitions starts off at 1. If the routine needs to recurse down a new level, newpartitions is 2, and you will build 2 new Config instances each with '2' as the value for partitions. Recursing down another level, newpartitions is 3, and you build 3 Configs, each with '3' as 'partitions', and so on.
The trick here is to make sure your code is robust enough to avoid infinite recursion.
By the way, it seems inefficient to me to use dynamic allocation for Config instances that have to be destroyed after the loop. Why not build a single Config instance on the stack inside the loop? Your code should run much faster that way.
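For example, a minimal sketch of that, reusing the names from the question (a, b, n, s, newpartitions, conf) and the extra calls parameter from the other answer:

for(int i = 0; i < n; ++i){
    // one Config per sub-interval lives on the stack; no new/delete needed
    Config sub{ a + i*(b-a)/n, a + (i+1)*(b-a)/n, newpartitions, conf.precision };
    s += integrate(sub, f, calls + 1);
}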
I am trying to improve the speed of a computational (biological) model written in C++ (previous version is on my github: Prokaryotes). The most time-consuming function is where I calculate binding affinities between transcription factors and binding sites on a single genome.
Background: In my model, binding affinity is given by the Hamming distance between the binding domain of a transcription factor (a 20-bool array) and the sequence of a binding site (also a 20-bool array). For a single genome, I need to calculate the affinities between all active transcription factors (typically 5-10) and all binding sites (typically 10-50). I do this every timestep for more than 10,000 cells in the population to update their gene expression states. Having to calculate up to half a million comparisons of 20-bool arrays to simulate just one timestep of my model means that typical experiments take several months (2M--10M timesteps).
For the previous version of the model (link above) genomes remained fairly small, so I could calculate binding affinities once for every cell (at birth) and store and re-use these numbers during the cell's lifetime. However, in the latest version, genomes expand considerably and multiple genomes reside within the same cell. Thus, storing the affinities of all transcription factor--binding site pairs in a cell becomes impractical.
In the current implementation I defined an inline function belonging to the Bead class (which is a base class for transcription factor class "Regulator" and binding site class "Bsite"). It is written directly in the header file Bead.hh:
inline int Bead::BindingAffinity(const bool* sequenceA, const bool* sequenceB, int seqlen) const
{
    int affinity = 0;
    for (int i = 0; i < seqlen; i++)
    {
        affinity += (int)(sequenceA[i] ^ sequenceB[i]);
    }
    return affinity;
}
The above function accepts two pointers to boolean arrays (sequenceA and sequenceB), and an integer specifying their length (seqlen). Using a simple for-loop I then check at how many positions the arrays differ (sequenceA[i]^sequenceB[i]), summing into the variable affinity.
Given a binding site (bsite) on the genome, we can then iterate through the genome and for every transcription factor (reg) calculate its affinity to this particular binding site like so:
affinity = (double)reg->BindingAffinity(bsite->sequence, reg->sequence, regulator_length);
So, this is how streamlined I managed to make it. Since I don't have a programming background, I wonder whether there are better ways to write the above function or to structure the code (e.g. should BindingAffinity be a function of the base Bead class?). Suggestions are greatly appreciated.
Thanks to @PaulMcKenzie and @eike for your suggestions. I tested both ideas against my previous implementation. Below are the results. In short, both answers work very well.
My previous implementation yielded an average runtime of 5m40 +/- 7 (n=3) for 1000 timesteps of the model. Profiling analysis with GPROF showed that the function BindingAffinity() took 24.3% of total runtime. [see Question for the code].
The bitset implementation yielded an average runtime of 5m11 +/- 6 (n=3), corresponding to a ~9% speed increase. Only 3.5% of total runtime is spent in BindingAffinity().
//Function definition in Bead.hh
inline int Bead::BindingAffinity(std::bitset<regulator_length> sequenceA, const std::bitset<regulator_length>& sequenceB) const
{
    return (int)(sequenceA ^= sequenceB).count();
}
//Function call in Genome.cc
affinity = (double)reg->BindingAffinity(bsite->sequence, reg->sequence);
The main downside of the bitset implementation is that, unlike with boolean arrays (my previous implementation), I have to specify the length of the bitset that goes into the function. I am occasionally comparing bitsets of different lengths, so for these I now have to specify separate functions (as I understand from https://www.cplusplus.com/doc/oldtutorial/templates/, a template cannot be split between a header and a source file; it would have to be defined entirely in the header).
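That said, a single function template defined directly in Bead.hh would cover every length, since its definition is then visible to every translation unit. A sketch of that alternative (written as a free function for brevity, not code from my model):

template<std::size_t N>
inline int BindingAffinity(std::bitset<N> sequenceA, const std::bitset<N>& sequenceB)
{
    // same XOR-and-count as above, but N is deduced at each call site
    return (int)(sequenceA ^= sequenceB).count();
}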
For the integer implementation I tried two alternatives to the std::popcount(seq1^seq2) function suggested by @eike, since I am working with a C++ standard older than C++20, which introduced std::popcount.
Alternative #1:
inline int Bead::BindingAffinity(int sequenceA, int sequenceB) const
{
    int i = sequenceA ^ sequenceB;
    return (int)std::bitset<32>(i).count(); // count the set bits of the XOR
}
Alternative #2:
inline int Bead::BindingAffinity(int sequenceA, int sequenceB) const
{
    int i = sequenceA ^ sequenceB;
    //SWAR algorithm, copied from https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer
    i = i - ((i >> 1) & 0x55555555);                 // add pairs of bits
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);  // quads
    i = (i + (i >> 4)) & 0x0F0F0F0F;                 // groups of 8
    return (i * 0x01010101) >> 24;                   // horizontal sum of bytes
}
These yielded average runtimes of 5m06 +/- 6 (n=3) and 5m06 +/- 3 (n=3), respectively, corresponding to a ~10% speed increase compared to my previous implementation. I only profiled Alternative #2, which showed that only 2.2% of total runtime was spent in BindingAffinity(). The downside of using integers for bitstrings is that I have to be very careful whenever I change any of the code. Single-bit mutations are definitely possible as mentioned by @eike, but everything is just a little bit trickier.
Conclusion:
Both the bitset and integer implementations for comparing bitstrings achieve impressive speed improvements. So much so, that BindingAffinity() is no longer the bottleneck in my code.
I am trying to find the one element in an array that has the minimum absolute value. For example, in the array [5.1, -2.2, 8.2, -1, 4, 3, -5, 6], I want to get the value -1. I use the following code (myarray is a 1D array and not sorted):
for (int i = 1; i < 8; ++i)
{
    if (fabsf(myarray[i]) < fabsf(myarray[0])) myarray[0] = myarray[i];
}
Then, the target value is in myarray[0].
Because I have to repeat this procedure many times, this piece of code becomes the bottleneck in my program. Does anyone know how to improve this code? Thanks in advance!
BTW, the size of the array is always eight. Could this be used to optimize this code?
Update: so far, following code works slightly better on my machine:
float absMin = fabsf(myarray[0]); int index = 0;
for (int i = 1; i < 8; ++i)
{
    if (fabsf(myarray[i]) < absMin) { absMin = fabsf(myarray[i]); index = i; }
}
float result = myarray[index];
I am wondering how to avoid fabsf, because I just want to compare the absolute values instead of computing them. Does anyone have any idea?
There are some urban myths, like manual inlining or hand-unrolling loops, which are supposed to make your code faster. The good news is you don't have to do any of that, at least if you use the -O3 compiler optimization.
The bad news is that if you already use -O3, there is nothing you can do to speed up this function: the compiler will optimize the hell out of your code! For example, it will surely cache fabsf(myarray[0]) as some have suggested. The only thing you can achieve with this kind of "refactoring" is to build bugs into your program and make it less readable.
My advice is to look somewhere else for improvements:
try to reduce the number of invocations of this code
if this code is the bottleneck, then my guess would be that you recalculate the minimal value over and over again (otherwise filling the values into the array would take approximately the same time) - so cache the results of the search
shift costs to changing the elements of the array, for example by using some fancy data structure (heap, priority_queue) or by tracking the minimum as elements change (see the sketch after this list). Let's say your array has only two elements, [1,2], so the minimum is 1. Now if you change
2 to 3, you don't have to do anything
2 to 0, you can easily update your minimum to 0
1 to 3, you have to loop through all elements. But maybe this case is not that often.
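A minimal sketch of that tracking idea for the fixed size of eight (MinTracker and its members are illustrative names, not from the question):

#include <cmath>

struct MinTracker {
    float vals[8];
    int   minIdx;                                   // index of the current min-|value| element
    MinTracker() : minIdx(0) { for (int j = 0; j < 8; ++j) vals[j] = 0.f; }
    void set(int i, float v) {
        if (i == minIdx) {                          // overwriting the current minimum: rescan
            vals[i] = v;
            minIdx = 0;
            for (int j = 1; j < 8; ++j)
                if (fabsf(vals[j]) < fabsf(vals[minIdx])) minIdx = j;
        } else {
            vals[i] = v;
            if (fabsf(v) < fabsf(vals[minIdx])) minIdx = i;  // cheap common case
        }
    }
    float min() const { return vals[minIdx]; }      // the signed element, as asked
};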
Can you store the values pre-fabbed, i.e. with fabsf already applied?
Also, as @Gerstrong mentions, storing the number outside the loop and only calculating it when the array changes will give you a boost.
Calling partial_sort or nth_element will rearrange the array just enough that the sought value ends up in the right location. With the nth position set to begin(), this places the element with the minimal absolute value at the front:
std::nth_element(v.begin(), v.begin(), v.end(), [](float lhs, float rhs){
    return fabsf(lhs) < fabsf(rhs);
});
Let me give some ideas that could help:
float minVal = fabsf(myarray[0]);
float minElem = myarray[0]; // keep the signed element, since -1 (not 1) is wanted
for (int i = 1; i < 8; ++i)
{
    if (fabsf(myarray[i]) < minVal) { minVal = fabsf(myarray[i]); minElem = myarray[i]; }
}
myarray[0] = minElem;
But compilers nowadays are very smart and you might not get any more speed, as you already get optimized code. It depends on how your mentioned piece of code is called.
Another way to optimize this may be to use C++ and the STL: you can do the following with std::set, which is typically implemented as a binary search tree:
// Comparator ordering by absolute value, for std::set
struct absless_compare
{
    bool operator()(const float a, const float b) const
    {
        return fabsf(a) < fabsf(b);
    }
};

std::set<float, absless_compare> mySet = {5.1, -2.2, 8.2, -1, 4, 3, -5, 6};
const float minVal = *(mySet.begin());
With this approach your numbers are already sorted in ascending order of absolute value as you insert them. std::less is the usual comparator for a std::set, but you can swap in something different like in this example (note that two elements with equal absolute value would be treated as duplicates by this comparator). This might help on larger datasets, but you mentioned you only have eight values to compare, so it really will not help.
Eight elements is a very small number that can easily live on the stack, for example by declaring std::array<float,8> myarray close to your search function before filling it with data. You should test these variants on your full codebase and observe what helps. Of course, whether you declare std::array<float,8> myarray or float myarray[8], you should get the same results at runtime.
What you could also check is whether fabsf really takes a float parameter and does not convert your variable to double, which would degrade performance. There is also std::abs(), which to my understanding deduces the data type via its overloads.
If you don't want to use fabsf, an obvious alternative is a call like this:
float myAbs(const float val)
{
    return (val < 0) ? -val : val;
}
or you mask the sign bit to zero, which is the bit that makes your number negative. Either way, I'm pretty sure that fabsf is already fully aware of that, and I don't think code like this will make it faster.
So I would check whether the argument is converted to double. If you have the C99 standard on your system, though, you should not have that issue.
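For completeness, a sketch of the sign-bit masking idea (this assumes IEEE-754 floats and uses memcpy to avoid strict-aliasing problems; for non-negative finite floats, the bit patterns order the same way as the values, so the masked integers can be compared directly):

#include <cstdint>
#include <cstring>

inline uint32_t absBits(float v)
{
    uint32_t u;
    std::memcpy(&u, &v, sizeof u); // reinterpret the float's bits
    return u & 0x7FFFFFFFu;        // clear the sign bit
}
// absBits(a) < absBits(b) is then equivalent to fabsf(a) < fabsf(b) for finite values.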
One thought would be to do your comparisons "tournament" style, instead of linearly. In other words, you first compare 1 with 2, 3 with 4, etc. Then you take those 4 elements and do the same thing, and then again, until you only have one element left.
This does not change the number of comparisons. Since each comparison eliminates one element from the running, you will have exactly 7 comparisons no matter what. So why do I suggest this? Because it removes data dependencies from your code. Modern processors have multiple pipelines and can retire multiple instructions simultaneously. However, when you do the comparisons in a loop, each loop iteration depends on the previous one. When you do it tournament style, the first four comparisons are completely independent, so the processor may be able to do them all at once.
In addition to doing that, you can compute all the fabs at once in a trivial loop and put it in a new array. Since the fabs computations are independent, this can get sped up pretty easily. You would do this first, and then the tournament style comparisons to get the index. It should be exactly the same number of operations, it's just changing the order around so that the compiler can more easily see larger blocks that lack data dependencies.
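A sketch of both suggestions combined for the fixed size of eight (absv is a hypothetical scratch array holding the precomputed fabsf values; myarray comes from the question):

float absv[8];
for (int i = 0; i < 8; ++i) absv[i] = fabsf(myarray[i]); // independent, easy to vectorize

int b01 = absv[0] < absv[1] ? 0 : 1;           // first round: four independent comparisons
int b23 = absv[2] < absv[3] ? 2 : 3;
int b45 = absv[4] < absv[5] ? 4 : 5;
int b67 = absv[6] < absv[7] ? 6 : 7;
int b03 = absv[b01] < absv[b23] ? b01 : b23;   // semifinals
int b47 = absv[b45] < absv[b67] ? b45 : b67;
int index = absv[b03] < absv[b47] ? b03 : b47; // final
float result = myarray[index];                 // signed element with minimal |value|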
The element of an array with minimal absolute value
Let the array be A:
A = [5.1, -2.2, 8.2, -1, 4, 3, -5, 6]
The minimal absolute value of A is (using Eigen's array operations):
double miniAbsValue = A.array().abs().minCoeff();
int i_minimum = 0; // to find the position of the minimum absolute value
for (int i = 0; i < 8; i++)
{
    if (fabs(A(i)) == miniAbsValue)
    {
        i_minimum = i;
    }
}
Now the element of A with minimal absolute value is
A(i_minimum)
I have a grid of 1000 x 1000 in which a person Q travels from a start point A to a stop point B. When Q starts out from A, he walks randomly till he reaches B. By walking randomly, I mean that for any position (i,j) where Q is currently, Q can travel to (i+1,j) , (i-1,j) , (i,j+1), (i,j-1) with equal probability. If Q reaches B in this manner, he gets a treasure stored at B and now he wants to retrace the exact same path he followed from A to B , only backwards.
Is there a way to implement this in C++ without explicitly storing the path in a vector?
You might be able to do something like this:
Store the random number seed
Get a random number between 1 and 4 for a directional move
Store a move count, beginning with 0 (already at destination)
For each move where you don't get to your destination, increment the count.
Subtract a fixed number from your random number each time.
Once you reach your destination, traverse the moves in reverse, going from count down to 0, taking the opposite move each time.
The point is to relate the move count and the seed. Assuming the random generator is a deterministic function of its seed, given the same input you will always get the same output. You could store the initial time, fix the time step, and then let your seed be the current time at each time step, but the idea is to let your seed be related to a count.
Using this method, you should be able to extract your path using only the begin time and the number of ticks it took to reach the target. Also, an added bonus: you can store how long it took to get to your destination in ticks, and derive other variables dependent on that time state.
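A sketch of this idea with <random> (the helper name moveAt is illustrative; each backward step replays the generator from the stored seed, so retracing the whole path costs O(count^2) generator calls):

#include <random>

// regenerate the k-th move (0-based) of the walk from the stored seed
int moveAt(unsigned seed, int k)
{
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dir(0, 3); // 0..3 = the four directions
    int m = 0;
    for (int i = 0; i <= k; ++i) m = dir(gen);
    return m;
}
// Walking back: for k = count-1 down to 0, take the move opposite to moveAt(seed, k).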
Use a reversible pseudo-random generator.
For instance, with a linear congruential generator Y = (a*X + b) mod c, the relation can be inverted as X = (a'*Y + b') mod c, where a' is the modular inverse of a modulo c and b' = -a'*b (this requires a and c to be coprime, which holds for any sensible LCG).
With such a generator, you can go back and forth freely along the path.
Suggestion for a quick (but not supported by theory) approach: use an accumulator and add an arbitrary constant, ignoring the overflows; this process is exactly inverted by subtraction. Take two independent bits of the accumulator to form your random number.
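A minimal sketch of that accumulator approach (the constant and the choice of bits are arbitrary, as the suggestion says):

#include <cstdint>

struct ReversibleRng {
    uint64_t state;                                     // the accumulator
    static const uint64_t step = 0x9E3779B97F4A7C15ull; // arbitrary odd constant
    // advance, then derive the move (0..3) from the top two bits
    int forward()  { state += step; return (int)(state >> 62); }
    // read the move produced by the most recent forward(), then step back
    int backward() { int m = (int)(state >> 62); state -= step; return m; }
};
// Repeated backward() calls return the same moves as forward() did, in reverse order.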
Try recursion:
void travel(){
    if(treasureFound())
        return;
    else {
        NextStep p;
        chooseNextStep(&p);
        travel();
        moveBackwards(&p);
        return;
    }
}
You could also store your path, but you don't have to store all the coordinates: 1 char per move is enough to describe it, 'N' 'S' 'E' 'W' for example.
Also, NextStep in my example could simply be a char.
And if you prefer to store the data on the heap and not on the stack, use a pointer!
You can store it implicitly via recursion.
The idea is fairly simple: You test if you are at the treasure and return true if you are. Otherwise you randomly pick a different route and return its result to either process the path or backtrace if necessary. Changing this approach to match exactly what your loop is doing should not be too hard (e.g. maybe you only want to backtrace once all options are exhausted).
Upon returning from each function, you immediately have the path in reverse.
bool walk_around(size_t x, size_t y) {
    if (treasure(x, y)) return true;
    if (better_abort()) return false;
    size_t x2, y2;
    randomly_choose(x, y, &x2, &y2); // pick a random neighbour of (x, y)
    if (walk_around(x2, y2))
    {
        std::cout << x << "," << y << "\n";
        return true;
    }
    else return false;
}
Note that there is a danger in this approach: the thread stack (where the data to return from functions is stored) is usually limited to a few MB. You are in territory where it is possible to require enough space (1000*1000 recursive calls if your maze creates a Hamiltonian path) that you might need to increase this limit. A quick Google search will turn up an approach appropriate for your OS.
I am given
struct point
{
    int x;
    int y;
};
and the table of points:
point tab[MAX];
The program should return the minimal distance between the centers of gravity of any possible pair of subsets from tab. A subset can be any size (of course >= 1 and < MAX).
I am obliged to write this program using recursion.
So my function will be of type int, because I have to return an int.
I globally set a variable min (because during the recursion I have to compare some values with this min):
int min = 0;
My function should certainly take the number of elements I have added, the sum of the Y coordinates and the sum of the X coordinates:
int return_min_distance(int sY, int sX, int number, bool iftaken[])
I will be glad for any further help.
I thought about another table of bools, passed as a parameter, to mark whether or not each value from the table has been taken. Still, my problem is how to implement this; I do not know even how to start.
I think you need a function that can iterate through all subsets of the table, starting with either nothing or an existing iterator. The code then gets easy:
int min_distance = MAXINT;
SubsetIterator si1(0, tab);
while (si1.hasNext())
{
    SubsetIterator si2(&si1, tab);
    while (si2.hasNext())
    {
        int d = subsetDistance(tab, si1.subset(), si2.subset());
        if (d < min_distance)
        {
            min_distance = d;
        }
    }
}
The SubsetIterators can be simple base-2 numbers capable of counting up to 2^MAX, where a 1 bit indicates membership in the subset. Yes, it's an O(N^2) algorithm in the number of subsets N = 2^MAX, but I think it has to be.
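A minimal sketch of such an iterator, matching the usage above (illustrative only; a real one would also carry the precomputed sums mentioned in the update below, and this assumes MAX < 32):

class SubsetIterator
{
    unsigned mBits; // base-2 counter; a 1 bit means membership in the subset
public:
    SubsetIterator(const SubsetIterator* inAfter, const point*)
        : mBits(inAfter ? inAfter->mBits : 0) {}
    bool hasNext() { return ++mBits < (1u << MAX); } // advance to the next subset
    unsigned subset() const { return mBits; }        // current membership bitmap
};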
The trick is incorporating recursion. Sorry, I just don't see how it helps here. If I can think of a way to use it, I'll edit my answer.
Update: I thought about this some more, and while I still can't see a use for recursion, I found a way to make the subset processing easier. Rather than run through the entire table for every distance computation, the SubsetIterators could store precomputed sums of the x and y values for easy distance computation. Then, on every iteration, you subtract the values that are leaving the subset and add the values that are joining. A simple bit-AND operation can reveal these. To be even more efficient, you could enumerate the membership bitmaps in Gray-code order instead of plain binary order. This guarantees that at each iteration exactly one value enters or leaves the subset. Minimal work.
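A sketch of that Gray-code walk (illustrative; __builtin_ctz is a GCC/Clang intrinsic that finds the toggled bit, and the running sums give each subset's center of gravity in O(1) per step):

long sumX = 0, sumY = 0;
int count = 0;
unsigned prev = 0;
for (unsigned k = 1; k < (1u << MAX); ++k) {
    unsigned gray = k ^ (k >> 1);  // k-th Gray code
    unsigned diff = gray ^ prev;   // exactly one bit differs from the previous code
    int idx = __builtin_ctz(diff); // index of the element that toggled
    if (gray & diff) { sumX += tab[idx].x; sumY += tab[idx].y; ++count; }
    else             { sumX -= tab[idx].x; sumY -= tab[idx].y; --count; }
    // center of gravity of this subset: (sumX/count, sumY/count)
    prev = gray;
}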
Imagine you have a pretty big array of double and a simple function avg(double*, size_t) that computes the average value (just a simple example: both the array and the function could be whatever data structure and algorithm). I would like that, if the function is called a second time and the array has not changed in the meanwhile, the return value comes directly from the previous one, without going through the unchanged data.
Holding the previous value looks simple: I just need a static variable inside the function, right? But what about detecting changes in the array? Do I need to write an interface to access the array which sets a flag to be read by the function? Can something smarter and more portable be done?
As Kerrek SB so astutely put it, this is known as "memoization." I'll cover my personal favorite method at the end (both with a double* array and the much easier DoubleArray), so you can skip to there if you just want to see code. However, there are many ways to solve this problem, and I wanted to cover them all, including those suggested by others.
The first part is some theory and alternate approaches. There are fundamentally four parts to the problem:
Prove the function is idempotent (calling a function once is the same as calling it any number of times)
Cache results keyed to the inputs
Search cached results given a new set of inputs
Invalidate cached results which are no longer accurate/current
The first step is easy for you: average is idempotent. It has no side effects.
Caching the results is a fun step. You obviously are going to create some "key" for the inputs that you can compare against the cached "keys." In Kerrek SB's memoization example, the key is a tuple of all of the arguments, compared against other keys with ==. In your system, the equivalent solution would be to have the key be the contents of the entire array. This means each key comparison is O(n), which is expensive. If the function was more expensive to calculate than the average function is, this price may be acceptable. However in the case of averaging, this key is terribly expensive.
This leads one on the open-ended search for good keys. Dieter Lücking's answer was to key on the array pointer. This is O(1) and wicked fast to boot. However, it also assumes that once you've calculated the average for an array, that array's values never change, and that the memory address is never re-used for another array. Solutions for this come later, in the invalidation portion of the task.
Another popular key is HotLick's (1) in the comments. You use a unique identifier for the array (pointer or, better yet, a unique integer idx that will never be used again) as your key. Each array then has a "dirty bit for avg" that they are expected to set to true whenever a value is changed. Caches first look for the dirty bit. If it is true, they ignore the cached value, calculate the new value, cache the new value, then clear the dirty bit indicating that the cached value is now valid. (this is really invalidation, but it fit well in this part of the answer)
This technique assumes that there are more calls to avg than updates to the data. If the array is constantly dirty, then avg still has to keep recalculating, but we still pay the price of setting the dirty bit on every write (slowing it down).
This technique also assumes that there is only one function, avg, which needs cached results. If you have many functions, it starts to get expensive to keep all of the dirty bits up to date. The solution is an "epoch" counter. Instead of a dirty bit, you have an integer, which starts at 0. Every write increments it. When you cache a result, you cache not only the identity of the array, but its epoch as well. When you check to see if you have a cached value, you also check to see if the epoch changed. If it did change, you can't prove your old results are current, and have to throw them out.
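Here is a minimal sketch of that epoch idea (all names are illustrative; the cache is a single static slot, just enough to show the mechanism):

struct TrackedArray {
    double*  data;
    size_t   n;
    unsigned epoch;                        // bumped on every write
    void set(size_t i, double v) { data[i] = v; ++epoch; }
};

double avgCached(const TrackedArray& a) {
    static const TrackedArray* lastArray = 0;
    static unsigned lastEpoch  = 0;
    static double   lastResult = 0.0;
    if (&a == lastArray && a.epoch == lastEpoch)
        return lastResult;                 // cache hit: same array, unchanged epoch
    double sum = 0.0;
    for (size_t i = 0; i < a.n; ++i) sum += a.data[i];
    lastResult = a.n ? sum / a.n : 0.0;    // recompute and remember
    lastArray  = &a;
    lastEpoch  = a.epoch;
    return lastResult;
}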
Storing the results is an interesting task. It is very easy to write a storing algorithm that uses up gobs of memory by remembering hundreds of thousands of old results to avg. Generally speaking, there needs to be a way to let the caching code know that an array has been destroyed, or a way to slowly remove old unused cache results. In the former case, the deallocator of the double arrays needs to let the cache code know that the array is being deallocated. In the latter case, it is common to limit a cache to 10 or 100 entries and evict old cache results.
The last piece is invalidation of caches. I spoke earlier regarding the dirty bit. The general pattern is that a value inside a cache must be marked invalid if the key it was stored under didn't change, but the values in the array did change. This can obviously never happen if the key is a copy of the array, but it can occur when the key is an identifying integer or a pointer.
Generally speaking, invalidation is a way to add a requirement to your caller: if you want to use avg with caching, here's the extra work you are required to do to help the caching code.
Recently I implemented a system with such a caching invalidation scheme. It was very simple, and stemmed from one philosophy: the code which is calling avg is in a better position to determine if the array has changed than avg is itself.
There were two versions of the equivalent of avg: double avg(double* array, int n) and double avg(double* array, int n, CacheValidityObject& validity).
Calling the 2-argument version of avg never cached, because it had no guarantees that array had not changed.
Calling the 3-argument version of avg activated caching. The caller guarantees that, if it passes the same CacheValidityObject to avg without marking it dirty, then the arrays must be the same.
Putting the onus on the caller makes average trivial. CacheValidityObject is a very simple class to hold on to the results:
class CacheValidityObject
{
public:
    CacheValidityObject(); // creates a new dirty CacheValidityObject
    void invalidate();     // marks this object as dirty

    // this function is used only by the `avg` algorithm. "friend" may
    // be used here, but this example makes it public
    boost::shared_ptr<void>& getData();

private:
    boost::shared_ptr<void> mData;
};

inline void CacheValidityObject::invalidate()
{
    mData.reset(); // blow away any cached data
}
double avg(double* array, int n); // defined as usual
double avg(double* array, int n, CacheValidityObject& validity)
{
    // this function assumes validity.mData is null or a shared_ptr to a double
    boost::shared_ptr<void>& data = validity.getData();
    if (data) {
        // The cached result, stored on the validity object, is still valid
        return *boost::static_pointer_cast<double>(data);
    } else {
        // There was no cached result, or it was invalidated
        double result = avg(array, n);
        data = boost::make_shared<double>(result); // cache the result
        return result;
    }
}
// usage
{
    double data[100];
    fillWithRandom(data, 100);
    CacheValidityObject dataCacheValidity;
    double a = avg(data, 100, dataCacheValidity); // caches the average
    double b = avg(data, 100, dataCacheValidity); // cache hit... uses cached result
    data[0] = 0;
    dataCacheValidity.invalidate();
    double c = avg(data, 100, dataCacheValidity); // dirty... caches new result
    double d = avg(data, 100, dataCacheValidity); // cache hit... uses cached result
    // CacheValidityObject::~CacheValidityObject() will destroy the shared_ptr,
    // freeing the memory used to cache the result
}
Advantages
Nearly the fastest caching possible (within a few opcodes)
Trivial to implement
Doesn't leak memory, saving cached values only when the caller thinks it may want to use them again
Disadvantages
Requires the caller to handle caching, instead of doing it implicitly for them.
If you wrap the double* array in a class, you can minimize the disadvantage. Assign each algorithm an index (this can be done at run time), and have the DoubleArray class maintain a map of cached values. Each modification to the DoubleArray invalidates the cached results. This is the easiest version to use, but it doesn't work with a naked array... you need a class to help you out:
class DoubleArray
{
public:
    // all of the getters and setters and constructors.
    // Special note: all setters MUST call invalidate()

    CacheValidityObject getCache(int inIdx)
    {
        return mCaches[inIdx];
    }
    void setCache(int inIdx, const CacheValidityObject& inObj)
    {
        mCaches[inIdx] = inObj;
    }

private:
    void invalidate()
    {
        mCaches.clear();
    }

    std::map<int, CacheValidityObject> mCaches;
    double* mArray;
    int mSize;
};
inline int getNextAlgorithmIdx()
{
    static int nextIdx = 1;
    return nextIdx++;
}
static const int avgAlgorithmIdx = getNextAlgorithmIdx();
double avg(DoubleArray& inArray)
{
    CacheValidityObject valid = inArray.getCache(avgAlgorithmIdx);
    // use the 3 argument avg in the previous example
    double result = avg(inArray.getArray(), inArray.getSize(), valid);
    inArray.setCache(avgAlgorithmIdx, valid);
    return result;
}
// usage
DoubleArray array(100);
fillRandom(array);
double a = avg(array); // calculates, and caches
double b = avg(array); // cache hit
array.set(0, 5); // invalidates caches
double c = avg(array); // calculates, and caches
double d = avg(array); // cache hit
#include <limits>
#include <map>

// Note: You have to manage cached results - release one with avg(p, 0)!
double avg(double* p, std::size_t n) {
    typedef std::map<double*, double> map;
    static map results;
    map::iterator pos = results.find(p);
    if (n) {
        // Calculate or get a cached value
        if (pos == results.end()) {
            double sum = 0.0;
            for (std::size_t i = 0; i < n; ++i) sum += p[i]; // calculate it
            pos = results.insert(map::value_type(p, sum / n)).first;
        }
        return pos->second;
    }
    // Erase a cached value
    if (pos != results.end()) results.erase(pos);
    return std::numeric_limits<double>::quiet_NaN();
}