Using: VC++ 2013

concurrency::concurrent_vector<datanode*> dtnodelst;

Occasionally, when I do dtnodelst.at(i), I get an invalid address (0xCDCDCDCD, of course), which shouldn't be the case: after I push_back an item, I never delete or remove any of the items. (Even if I did delete, at(i) would return the old, now-freed address; but I am never deleting, so that is not even the case.)
datanode* itm = new datanode();
....
dtnodelst.push_back(itm);

Any ideas on what might be happening?

P.S. I am using the Windows thread pool. Sometimes I can do 8 million inserts and finds and everything goes fine, but sometimes even 200 inserts and finds will fail. I am kind of lost. Any help would be awesomely appreciated!

Thanks and best regards.

Actual code, as an FYI:

P.S. Am I missing something, or is it a pain to paste code with proper formatting? I remember it auto-aligning before... -_-
struct datanode {
    volatile int nodeval;
    T val;
};

concurrency::concurrent_vector<datanode*> lst;

inline T find(UINT32 key)
{
    for (size_t i = 0; i < lst.size(); i++)
    {
        datanode* nd = lst.at(i);
        // nd is invalid sometimes
        if (nd)
            if (nd->nodeval == key)
            {
                return (nd->val);
            }
    }
    return NULL;
}

inline T insert_nonunique(UINT32 key, T val) {
    datanode* itm = new datanode();
    itm->val = val;
    itm->nodeval = key;
    lst.push_back(itm);
    _updated(lst);
    return val;
}
The problem is the use of concurrent_vector::size(), which is not fully thread-safe: you can get references to not-yet-constructed elements (where the memory contains garbage). The Microsoft PPL library (which provides it in the concurrency:: namespace) uses the Intel TBB implementation of concurrent_vector, and the TBB Reference says:

size_type size() const

Returns: Number of elements in the vector. The result may include elements that are allocated but still under construction by concurrent calls to any of the growth methods.
Please see my blog for more explanation and possible solutions.
In TBB, the most reasonable solution is to use tbb::zero_allocator as the underlying allocator of concurrent_vector, in order to fill newly allocated memory with zeroes before size() counts it:
concurrent_vector<datanode*, tbb::zero_allocator<datanode*> > lst;
Then, the condition if (nd) will filter out not-yet-ready elements.
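For illustration, here is a minimal sketch of the question's find with that declaration in place (T, UINT32, and datanode are as defined in the question; this is a sketch, not a drop-in fix for every race):

inline T find(UINT32 key)
{
    size_t n = lst.size();             // snapshot once; may undercount new
                                       // elements, but never touches garbage
    for (size_t i = 0; i < n; i++)
    {
        datanode* nd = lst.at(i);
        if (nd && nd->nodeval == key)  // NULL => slot still under construction
            return nd->val;
    }
    return NULL;
}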
volatile is no substitute for atomic<T>. Do not use volatile in some attempt to provide synchronization.
The whole idea of your find call doesn't make sense in a concurrent context. As soon as the function iterates over one value, it could be mutated by another thread to be the value you're looking for. Or it could be the value you want, but mutated to be some other value. Or as soon as it returns false, the value you're seeking is added. The return value of such a function would be totally meaningless. size() has all the same problems, which is a good part of why your implementation would never work.
Inspecting the state of concurrent data structures is a very bad idea, because the information becomes invalid the moment you have it. You should design operations that do not require knowing the state of the structure to execute correctly, or, block all mutations whilst you operate.
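For example, a minimal sketch of the "block all mutations" option, using a plain vector guarded by a std::mutex (this replaces concurrent_vector entirely; names mirror the question's code, and this is only one possible design):

#include <mutex>
#include <vector>

std::vector<datanode*> lst;   // a plain vector: the lock provides all the safety
std::mutex lst_mutex;

T find(UINT32 key)
{
    std::lock_guard<std::mutex> guard(lst_mutex); // blocks all mutations
    for (size_t i = 0; i < lst.size(); i++)
    {
        datanode* nd = lst[i];
        if (nd->nodeval == key)
            return nd->val;
    }
    return NULL;
}

T insert_nonunique(UINT32 key, T val)
{
    datanode* itm = new datanode();
    itm->val = val;
    itm->nodeval = key;
    std::lock_guard<std::mutex> guard(lst_mutex);
    lst.push_back(itm);
    return val;
}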
This question is about owning pointers, consuming pointers, smart pointers, vectors, and allocators.
I am a little bit lost in my thoughts about code architecture. Furthermore, if this question already has an answer somewhere: 1. sorry, I haven't found a satisfying answer so far, and 2. please point me to it.
My problem is the following:
I have several "things" stored in a vector and several "consumers" of those "things". So, my first try was as follows:
std::vector<thing> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return &i_am_the_owner_of_things[5]; // 5 is just an example
}

...

// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }

    thing* m_thing;
};
In my application, this would be safe because the "things" outlive the "consumers" in any case. However, more "things" can be added during runtime and that can become a problem because if the std::vector<thing> i_am_the_owner_of_things; gets reallocated, all the thing* m_thing pointers become invalid.
A fix to this scenario would be to store unique pointers to "things" instead of the "things" directly, i.e. as follows:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return i_am_the_owner_of_things[5].get(); // 5 is just an example
}

...

// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }

    thing* m_thing;
};
The downside here is that memory coherency between "things" is lost. Can this memory coherency be re-established by using custom allocators somehow? I am thinking of something like an allocator which would always allocate memory for, e.g., 10 elements at a time and, whenever required, adds another 10-element-sized chunk of memory.
Example:
initially:
v = ☐☐☐☐☐☐☐☐☐☐
more elements:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
and again:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
Using such an allocator, I wouldn't even have to use std::unique_ptrs of "things" because at std::vector's reallocation time, the memory addresses of the already existing elements would not change.
As an alternative, I can only think of referencing the "thing" in the "consumer" via a std::shared_ptr<thing> m_thing, as opposed to the current thing* m_thing, but that seems like the worst approach to me, because a "thing" shall not own a "consumer", and with shared pointers I would create shared ownership.
So, is the allocator-approach a good one? And if so, how can it be done? Do I have to implement the allocator by myself or is there an existing one?
If you are able to treat thing as a value type, do so. It simplifies things; you don't need a smart pointer to circumvent the pointer/reference invalidation issue. The latter can be tackled differently:
If new thing instances are inserted via push_front and push_back during the program, use std::deque instead of std::vector (a short sketch follows after this list). Then no pointers or references to elements in this container are invalidated (iterators are invalidated, though; thanks to @odyss-jii for pointing that out). If you fear that you heavily rely on the performance benefit of the completely contiguous memory layout of std::vector: create a benchmark and profile.
If new thing instances are inserted in the middle of the container during the program, consider using std::list. No pointers/iterators/references are invalidated when inserting or removing container elements. Iteration over a std::list is much slower than over a std::vector, but make sure this is an actual issue in your scenario before worrying too much about it.
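A minimal sketch of the std::deque variant (assuming thing is defined as in the question; add_thing is a hypothetical helper, not from the original code):

#include <deque>
#include <utility>

std::deque<thing> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return &i_am_the_owner_of_things[5]; // 5 is just an example
}

void add_thing(thing t) {
    // Unlike vector::push_back, deque::push_back never invalidates
    // pointers or references to existing elements, so any previously
    // handed-out thing* stays valid.
    i_am_the_owner_of_things.push_back(std::move(t));
}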
There is no single right answer to this question, since it depends a lot on the exact access patterns and desired performance characteristics.
Having said that, here is my recommendation:
Continue storing the data contiguously as you are, but do not store aliasing pointers to that data. Instead, consider a safer alternative (this is a proven method) where you fetch the pointer based on an ID right before using it -- as a side note, in a multi-threaded application you can lock attempts to resize the underlying store while such a weak reference lives.
So your consumer will store an ID, and will fetch a pointer to the data from the "store" on demand. This also gives you control over all "fetches", so that you can track them, implement safety measures, etc.
void consumer::foo() {
    thing *t = m_thing_store.get(m_thing_id);
    if (t) {
        // do something with t
    }
}
Or a more advanced alternative, to help with synchronization in multi-threaded scenarios:
void consumer::foo() {
    reference<thing> t = m_thing_store.get(m_thing_id);
    if (!t.empty()) {
        // do something with t
    }
}
Where reference would be some thread-safe RAII "weak pointer".
There are multiple ways of implementing this. You can use an open-addressing hash table with the ID as the key; this will give you roughly O(1) access time if you balance it properly.
Another alternative (best-case O(1), worst-case O(N)) is to use a "reference" structure, with a 32-bit ID and a 32-bit index (so the same size as a 64-bit pointer); the index serves as a sort of cache. When you fetch, you first try the index: if the element at the index has the expected ID, you are done. Otherwise, you get a "cache miss" and do a linear scan of the store to find the element by ID, and then store the last-known index value in your reference.
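A rough sketch of that second scheme (all names here are hypothetical, for illustration only):

#include <cstdint>
#include <vector>

struct thing { std::uint32_t id; /* ... payload ... */ };

// The store owns the things contiguously; consumers keep only an ID
// plus a cached index, never a raw pointer.
class thing_store {
public:
    thing* get(std::uint32_t id, std::uint32_t& index_hint) {
        // Fast path: the cached index still names the right element.
        if (index_hint < items.size() && items[index_hint].id == id)
            return &items[index_hint];
        // Cache miss: linear scan by ID, then refresh the hint.
        for (std::uint32_t i = 0; i < items.size(); ++i) {
            if (items[i].id == id) { index_hint = i; return &items[i]; }
        }
        return nullptr; // the thing no longer exists
    }
    std::vector<thing> items;
};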
IMO the best approach would be to create a new container which behaves in a safe way.
Pros:
the change will be done on a separate level of abstraction
changes to old code will be minimal (just replace std::vector with the new container)
it will be the "clean code" way to do it
Cons:
it may look like there is a bit more work to do
Another answer proposes the use of std::list, which will do the job, but with a larger number of allocations and slower random access. So IMO it is better to compose your own container from a couple of std::vectors.
So it may start to look more or less like this (minimal example):
#include <stdexcept>
#include <vector>

template<typename T>
class cluster_vector
{
public:
    static constexpr size_t cluster_size = 16;

    cluster_vector() {
        clusters.reserve(1024);
        add_cluster();
    }

    ...

    size_t size() const {
        if (clusters.empty()) return 0;
        return (clusters.size() - 1) * cluster_size + clusters.back().size();
    }

    T& operator[](size_t index) {
        throwIfIndexTooBig(index);
        return clusters[index / cluster_size][index % cluster_size];
    }

    void push_back(T&& x) {
        if_last_is_full_add_cluster();
        clusters.back().push_back(std::move(x));
    }

private:
    void throwIfIndexTooBig(size_t index) const {
        if (index >= size()) {
            throw std::out_of_range("cluster_vector out of range");
        }
    }

    void add_cluster() {
        clusters.push_back({});
        clusters.back().reserve(cluster_size);
    }

    void if_last_is_full_add_cluster() {
        if (clusters.back().size() == cluster_size) {
            add_cluster();
        }
    }

private:
    std::vector<std::vector<T>> clusters;
};
This way you provide a container which never reallocates existing items. It doesn't matter what T is.
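A quick usage sketch (assuming the container above and a default-constructible thing):

cluster_vector<thing> things;
things.push_back(thing{});
thing* p = &things[0];         // stays valid: full clusters are never moved

for (size_t i = 0; i < 100; ++i)
    things.push_back(thing{}); // growth only appends new clusters

// p still points at the original element here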
[A shared pointer] seems like the worst approach to me, because a "thing" shall not own a "consumer" and with shared pointers I would create shared ownership.
So what? Maybe the code is a little less self-documenting, but it will solve all your problems.
(And by the way you are muddling things by using the word "consumer", which in a traditional producer/consumer paradigm would take ownership.)
Also, returning a raw pointer, as your current code does, is already entirely ambiguous as to ownership. In general, I'd say it's good practice to avoid raw pointers if you can (e.g. when you don't need to call delete). I would return a reference if you go with unique_ptr:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;

thing& get_thing_for_consumer() {
    // some thing-selection logic
    return *i_am_the_owner_of_things[5]; // 5 is just an example
}
Brief Preface
I recognize that there are many nuances and requirements for a standard-compatible allocator. There are a number of questions here covering a range of topics associated with allocators. I realize that the requirements set out by the standard are critical to ensuring that the allocator functions correctly in all cases, doesn't leak memory, doesn't cause undefined-behaviour, etc. This is particularly true where the allocator is meant to be used (or at least, can be used) in a wide range of use cases, with a variety of underlying types and different standard containers, object sizes, etc.
In contrast, I have a very specific use case where I personally strictly control all of the conditions associated with its use, as I describe in detail below. Consequently, I believe that what I've done is perfectly acceptable given the highly-specific nature of what I'm trying to implement.
I'm hoping someone with far more experience and understanding than me can either confirm that the description below is acceptable or point out the problems (and, ideally, how to fix them too).
Overview / Specific Requirements
In a nutshell, I'm trying to write an allocator that is to be used within my own code and for a single, specific purpose:
I need "a few" std::vector (probably uint16_t), with a fixed (at runtime) number of elements. I'm benchmarking to determine the best tradeoff of performance/space for the exact integer type[1]
As noted, the number of elements is always the same, but it depends on some runtime configuration data passed to the application
The number of vectors is also either fixed or at least bounded. The exact number is handled by a library providing an implementation of parallel::for(execution::par_unseq, ...)
The vectors are constructed by me (i.e. so I know with certainty that they will always be constructed with N elements)
[1] The values of the vectors are used to conditionally copy a float from one of 2 vectors to a target: c[i] = rand_vec[i] < threshold ? a[i] : b[i], where a, b, c are contiguous arrays of float, rand_vec is the std::vector I'm trying to figure out here, and threshold is a single variable of type integer_tbd. The code compiles to SSE SIMD operations. I do not remember the details, but I believe this requires additional shifting instructions if the ints are smaller than the floats.
On this basis, I've written a very simple allocator, with a single static boost::lockfree::queue as the free-list. Given that I will construct the vectors myself and they will go out of scope when I'm finished with them, I know with certainty that all calls to alloc::deallocate(T*, size_t) will always hand back blocks of the same size, so I believe I can simply push them back onto the queue without worrying about a pointer to a differently-sized allocation being pushed onto the free-list.
As noted in the code below, I've added runtime tests to both the allocate and deallocate functions for now, while I've been confirming for myself that these situations cannot and will not occur. Again, I believe it is unquestionably safe to delete these runtime tests, although some advice would be appreciated here too -- considering the surrounding code, I think they should be handled adequately by the branch predictor, so they shouldn't have a significant runtime cost (though without instrumenting it's hard to say for 100% certain).
In a nutshell, as far as I can tell, everything here is completely within my control, completely deterministic in behaviour, and thus completely safe. This is also suggested when running the code under typical conditions: there are no segfaults, etc. I haven't yet tried running with sanitizers; I was hoping to get some feedback and guidance before doing so.
I should point out that my code runs 2x faster compared to using std::allocator, which is at least qualitatively to be expected.
CR_Vector_Allocator.hpp
#include <boost/lockfree/queue.hpp>

class CR_Vector_Allocator {
    using T = CR_Range_t; // probably uint16_t or uint32_t, set elsewhere.
private:
    using free_list_type = boost::lockfree::queue<T*>;
    static free_list_type free_list;
public:
    T* allocate(size_t);
    void deallocate(T* p, size_t) noexcept;

    using value_type = T;
    using pointer = T*;
    using reference = T&;
    template <typename U> struct rebind { using other = CR_Vector_Allocator; };
};
CR_Vector_Allocator.cc
CR_Vector_Allocator::T* CR_Vector_Allocator::allocate(size_t n) {
    if (n <= 1)
        throw std::runtime_error("Unexpected number of elements to initialize: " +
                                 std::to_string(n));
    T* addr_;
    if (free_list.pop(addr_)) return addr_;
    addr_ = reinterpret_cast<T*>(std::malloc(n * sizeof(T)));
    if (!addr_) throw std::bad_alloc(); // an allocator must not return null
    return addr_;
}

void CR_Vector_Allocator::deallocate(T* p, size_t n) noexcept {
    if (n <= 1) // should never happen, but just in case, I don't want to leak
        free(p);
    else
        free_list.push(p);
}

CR_Vector_Allocator::free_list_type CR_Vector_Allocator::free_list;
It is used in the following manner:
using CR_Vector_t = std::vector<uint16_t, CR_Vector_Allocator>;

CR_Vector_t Generate_CR_Vector() {
    /* total_parameters is a member of the same class
       as this member function and is defined elsewhere */
    CR_Vector_t cr_vec(total_parameters);

    std::uniform_int_distribution<uint16_t> dist_;
    /* urng_ is a member variable of type std::mt19937_64 in the class */
    std::generate(cr_vec.begin(), cr_vec.end(),
                  [this, &dist_]() { return dist_(this->urng_); });
    return cr_vec;
}
void Prepare_Next_Generation(...) {
    /*
    ...
    */
    using hpx::parallel::execution::par_unseq;
    hpx::parallel::for_loop_n(par_unseq, 0l, pop_size, [this](int64_t idx) {
        auto crossovers = Generate_CR_Vector();
        auto new_parameters = Generate_New_Parameters(/* ... */, std::move(crossovers));
    });
}
Any feedback, guidance or rebukes would be greatly appreciated.
Thank you!!
I am currently running into a nasty problem. Suppose there is a list aList of objects (whose type we call Object), and I want to iterate through it. Basically, the code would be like this:
for (int i = 0; i < aList.Size(); ++i)
{
    aList[i].DoSth();
}
The difficult part here is that the DoSth() method could change the caller's position in the list! So two consequences could occur: first, the iteration might never come to an end; second, some elements might be skipped (the iteration is not necessarily like the above, since it might be a linked list). Of course, the first one is the major concern.
The problem must be solved with these constraints:
1) The possibility of doing position-exchanging operations cannot be excluded;
2) The position-exchanging operations can be delayed until the iteration finishes, if necessary and doable;
3) Since it happens quite often, the iteration can be modified only minimally (so actions like creating a copy of the list are not recommended).
The language I'm using is C++, but I think similar problems arise in Java and C#, etc.
The following is what I've tried:
a) Try forbidding the position-exchanging operations during the iteration. However, that involves too many client code files, and it's just not practical to find and modify all of them.
b) Modify every single method (e.g., Method()) of Object that can change its own position and that will be called by DoSth(), directly or indirectly, in this way: first we can know that aList is doing the iteration, and we'll treat Method() accordingly. If the iteration is in progress, then we delay what Method() wants to do; otherwise, it does what it wants right now. The question here is: what is the best (easy-to-use, yet efficient enough) way of delaying a function call here (a sketch of one possibility appears below)? The parameters of Method() could be rather complex. Moreover, this approach will involve quite a few functions, too!
c) Try modifying the iteration process. The real situation I encounter here is quite complex because it involves two layers of iteration: the first is a plain array iteration, while the second is a typical linked-list iteration lying in a recursive function. The best I can do about the second layer of iteration, for now, is to limit its iteration count and prevent the same element from being iterated more than once.
So I guess there could be some better way to tackle this problem? Maybe some awesome data structure will help?
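For what it's worth, here is one minimal sketch of the deferral idea from option b), assuming C++11 lambdas are available; all names here are illustrative, not from the actual code base:

#include <functional>
#include <vector>

std::vector<std::function<void()>> deferredCalls; // pending position changes
bool iterationInProgress = false;

// Inside any position-changing Method(): if an iteration is running,
// capture the (possibly complex) arguments by value and queue the call.
void Object::Method(int arg1, const Params& arg2) {
    if (iterationInProgress) {
        deferredCalls.push_back([this, arg1, arg2]() { Method(arg1, arg2); });
        return;
    }
    // ... the actual position-exchanging work ...
}

// After the iteration finishes, replay everything that was deferred.
void flushDeferredCalls() {
    iterationInProgress = false;
    for (auto& call : deferredCalls) call();
    deferredCalls.clear();
}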
Your question is a little light on detail, but from what you have written it seems that you are making the mistake of mixing concerns.
It is likely that your object can perform some action that causes it to either continue to exist or not. The decision that it should no longer exist is a separate concern to that of actually storing it in a container.
So let's split those concerns out:
#include <vector>

enum class ActionResult {
    Dies,
    Lives,
};

struct Object
{
    ActionResult performAction();
};

using Container = std::vector<Object>;

void actions(Container& cont)
{
    // note: re-evaluate end(cont) each time around the loop, because
    // erase() invalidates any iterator at or after the erased element
    for (auto first = begin(cont); first != end(cont); )
    {
        auto result = first->performAction();
        switch (result)
        {
        case ActionResult::Dies:
            first = cont.erase(first); // object wants to die so remove it
            break;
        case ActionResult::Lives:      // object wants to live, so continue
            ++first;
            break;
        }
    }
}
If there are indeed only two results of the operation, lives and dies, then we could express this iteration idiomatically:
#include <algorithm>

// ...

void actions(Container& cont)
{
    auto actionResultsInDeath = [](Object& o)
    {
        auto result = o.performAction();
        return result == ActionResult::Dies;
    };

    cont.erase(remove_if(begin(cont), end(cont),
                         actionResultsInDeath),
               end(cont));
}
Well, problem solved, at least with regard to the situation I'm interested in right now. In my situation, aList is really a linked list and the Object elements are accessed through pointers. If the size of aList is relatively small, then we have an elegant solution just like this:
void Object::DoSthBig()
{
    Object* pNext = GetNext();
    if (pNext)
        pNext->DoSthBig();
    DoSth();
}
This relies on the underlying hypothesis that each pNext stays valid during the process. But if the element-deletion operation has already been dealt with discreetly, then everything is fine.
Of course, this is a very special example and is unable to be applied to other situations.
Imagine you have a pretty big array of double and a simple function avg(double*, size_t) that computes the average value (just a simple example: both the array and the function could be whatever data structure and algorithm). I would like that, if the function is called a second time and the array has not changed in the meanwhile, the return value comes directly from the previous call, without going through the unchanged data.
Holding the previous value looks simple: I just need a static variable inside the function, right? But what about detecting changes in the array? Do I need to write an interface to access the array which sets a flag to be read by the function? Can something smarter and more portable be done?
As Kerrek SB so astutely put it, this is known as "memoization." I'll cover my personal favorite method at the end (both with a double* array and the much easier DoubleArray), so you can skip to there if you just want to see code. However, there are many ways to solve this problem, and I wanted to cover them all, including those suggested by others.
The first part is some theory and alternate approaches. There are fundamentally four parts to the problem:
Prove the function is idempotent (calling a function once is the same as calling it any number of times)
Cache results keyed to the inputs
Search cached results given a new set of inputs
Invalidating cached results which are no longer accurate/current
The first step is easy for you: average is idempotent. It has no side effects.
Caching the results is a fun step. You obviously are going to create some "key" for the inputs that you can compare against the cached "keys". In Kerrek SB's memoization example, the key is a tuple of all of the arguments, compared against other keys with ==. In your system, the equivalent solution would be for the key to be the contents of the entire array. This means each key comparison is O(n), which is expensive. If the function were more expensive to calculate than the average function is, this price might be acceptable. However, in the case of averaging, this key is terribly expensive.
This leads one on the open-ended search for good keys. Dieter Lücking's answer was to key the array pointer. This is O(1), and wicked fast to boot. However, it also makes the assumption that once you've calculated the average for an array, that array's values never change, and that memory address is never re-used for another array. Solutions for this come later, in the invalidation portion of the task.
Another popular key is HotLick's (1) in the comments. You use a unique identifier for the array (a pointer or, better yet, a unique integer idx that will never be used again) as your key. Each array then has a "dirty bit for avg" that it is expected to set to true whenever a value is changed. Caches first look at the dirty bit: if it is true, they ignore the cached value, calculate the new value, cache it, then clear the dirty bit, indicating that the cached value is now valid. (This is really invalidation, but it fits well in this part of the answer.)
This technique assumes that there are more calls to avg than updates to the data. If the array is constantly dirty, then avg still has to keep recalculating, but we still pay the price of setting the dirty bit on every write (slowing it down).
This technique also assumes that there is only one function, avg, which needs cached results. If you have many functions, it starts to get expensive to keep all of the dirty bits up to date. The solution is an "epoch" counter. Instead of a dirty bit, you have an integer, which starts at 0. Every write increments it. When you cache a result, you cache not only the identity of the array, but its epoch as well. When you check to see if you have a cached value, you also check to see if the epoch changed. If it did change, you can't prove your old results are current, and have to throw them out.
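For concreteness, a minimal sketch of the epoch-counter variant (hypothetical names; single-threaded for simplicity):

#include <cstddef>
#include <cstdint>

struct TrackedArray {
    double*  data;
    size_t   n;
    uint64_t epoch = 0;                          // incremented on every write
    void set(size_t i, double v) { data[i] = v; ++epoch; }
};

double avg(const TrackedArray& a) {
    static const TrackedArray* cachedArray = nullptr;
    static uint64_t cachedEpoch = 0;
    static double   cachedResult = 0.0;

    // The cache is valid only for the same array with an unchanged epoch.
    if (cachedArray == &a && cachedEpoch == a.epoch)
        return cachedResult;

    double sum = 0.0;
    for (size_t i = 0; i < a.n; ++i) sum += a.data[i];

    cachedArray  = &a;
    cachedEpoch  = a.epoch;
    cachedResult = (a.n != 0) ? sum / a.n : 0.0;
    return cachedResult;
}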
Storing the results is an interesting task. It is very easy to write a storing algorithm which uses up gobs of memory by remembering hundreds of thousands of old results to avg. Generally speaking, there needs to be a way to let the caching code know that an array has been destroyed, or a way to slowly remove old unused cache results. In the former case, the deallocator of the double arrays needs to let the cache code know that the array is being deallocated. In the latter case, it is common to limit a cache to 10 or 100 entries and evict old cache results.
The last piece is invalidation of caches. I spoke earlier regarding the dirty bit. The general pattern for this is that a value inside a cache must be marked invalid if the key it was stored under didn't change, but the values in the array did change. This can obviously never happen if the key is a copy of the array, but it can occur when the key is an identifying integer or a pointer.
Generally speaking, invalidation is a way to add a requirement to your caller: if you want to use avg with caching, here's the extra work you are required to do to help the caching code.
Recently I implemented a system with such a cache-invalidation scheme. It was very simple, and stemmed from one philosophy: the code which is calling avg is in a better position to determine if the array has changed than avg is itself.
There were two versions of the equivalent of avg: double avg(double* array, int n) and double avg(double* array, int n, CacheValidityObject& validity).
Calling the 2-argument version of avg never cached, because it had no guarantee that array had not changed.
Calling the 3-argument version of avg activated caching. The caller guarantees that, if it passes the same CacheValidityObject to avg without marking it dirty, then the arrays must be the same.
Putting the onus on the caller makes average trivial. CacheValidityObject is a very simple class to hold on to the results:
class CacheValidityObject
{
public:
    CacheValidityObject(); // creates a new dirty CacheValidityObject

    void invalidate(); // marks this object as dirty

    // this function is used only by the `avg` algorithm. "friend" may
    // be used here, but this example makes it public
    boost::shared_ptr<void>& getData();

private:
    boost::shared_ptr<void> mData;
};

inline void CacheValidityObject::invalidate()
{
    mData.reset(); // blow away any cached data
}
double avg(double* array, int n); // defined as usual

double avg(double* array, int n, CacheValidityObject& validity)
{
    // this function assumes validity.mData is null or a shared_ptr to a double
    boost::shared_ptr<void>& data = validity.getData();
    if (data) {
        // The cached result, stored on the validity object, is still valid
        return *boost::static_pointer_cast<double>(data);
    } else {
        // There was no cached result, or it was invalidated
        double result = avg(array, n);
        data = boost::make_shared<double>(result); // cache the result
        return result;
    }
}
// usage
{
    double data[100];
    fillWithRandom(data, 100);
    CacheValidityObject dataCacheValidity;

    double a = avg(data, 100, dataCacheValidity); // caches the average
    double b = avg(data, 100, dataCacheValidity); // cache hit... uses cached result

    data[0] = 0;
    dataCacheValidity.invalidate();
    double c = avg(data, 100, dataCacheValidity); // dirty... caches new result
    double d = avg(data, 100, dataCacheValidity); // cache hit... uses cached result

    // CacheValidityObject::~CacheValidityObject() will destroy the shared_ptr,
    // freeing the memory used to cache the result
}
Advantages
Nearly the fastest caching possible (within a few opcodes)
Trivial to implement
Doesn't leak memory, saving cached values only when the caller thinks it may want to use them again
Disadvantages
Requires the caller to handle caching, instead of doing it implicitly for them.
If you wrap the double* array in a class, you can minimize the disadvantage. Assign each algorithm an index (this can be done at run time), and have the DoubleArray class maintain a map of cached values. Each modification to the DoubleArray invalidates the cached results. This is the easiest version to use, but it doesn't work with a naked array: you need a class to help you out.
class DoubleArray
{
public:
    // all of the getters and setters and constructors.
    // Special note: all setters MUST call invalidate()

    CacheValidityObject getCache(int inIdx)
    {
        return mCaches[inIdx];
    }

    void setCache(int inIdx, const CacheValidityObject& inObj)
    {
        mCaches[inIdx] = inObj;
    }

private:
    void invalidate()
    {
        mCaches.clear();
    }

    std::map<int, CacheValidityObject> mCaches;
    double* mArray;
    int mSize;
};

inline int getNextAlgorithmIdx()
{
    static int nextIdx = 1;
    return nextIdx++;
}

static const int avgAlgorithmIdx = getNextAlgorithmIdx();

double avg(DoubleArray& inArray)
{
    CacheValidityObject valid = inArray.getCache(avgAlgorithmIdx);
    // use the 3 argument avg in the previous example
    double result = avg(inArray.getArray(), inArray.getSize(), valid);
    inArray.setCache(avgAlgorithmIdx, valid);
    return result;
}
// usage
DoubleArray array(100);
fillRandom(array);
double a = avg(array); // calculates, and caches
double b = avg(array); // cache hit
array.set(0, 5); // invalidates caches
double c = avg(array); // calculates, and caches
double d = avg(array); // cache hit
#include <limits>
#include <map>

// Note: You have to manage cached results - release them with avg(p, 0)!
double avg(double* p, std::size_t n) {
    typedef std::map<double*, double> map;
    static map results;
    map::iterator pos = results.find(p);
    if (n) {
        // Calculate or get a cached value
        if (pos == results.end()) {
            pos = results.insert(map::value_type(p, 0.5)).first; // calculate it
        }
        return pos->second;
    }
    // Erase a cached value (guard against erasing end() when nothing was cached)
    if (pos != results.end())
        results.erase(pos);
    return std::numeric_limits<double>::quiet_NaN();
}
Should I be worried about having too many levels of vectors in vectors?
For example, I have a hierarchy of 5 levels and I have this kind of code
all over my project:
rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d]
where each element is a vector. The whole thing is a vector of vectors of vectors ...
Using this should still be a lot faster than copying the object, like this:
Block b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
// use b
The second approach is much nicer, but slower, I guess.
Please give me any suggestions on whether I should worry about performance issues related to this, or anything else.
Thanks!
Efficiency won't really be affected in your code (the cost of a vector random access is basically nothing); what you should be concerned with is the abuse of the vector data structure.
There's little reason that you should be using a vector over a class for something as complex as this. Classes with properly defined interfaces won't make your code any more efficient, but they WILL make maintenance much easier in the future.
Your current solution can also run into undefined behaviour. Take for example the code you posted:
Block b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
Now what happens if the vector indexes referred to by pos.a, pos.b, pos.c, pos.d don't exist in one of those vectors? You'll get undefined behaviour, and your application will probably segfault (if you're lucky).
To fix that, you'll need to compare the size of ALL vectors before trying to retrieve the Block object.
e.g.
Block b;
if ((pos.a < rawSheets.size()) &&
(pos.b < rawSheets[pos.a].countries.size()) &&
(pos.c < rawSheets[pos.a].countries[pos.b].cities.size()) &&
(pos.d < rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks.size()))
{
b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
}
Are you really going to do that every time you need a block?!!
You could do that, or you can, at the very least, wrap it up in a class...
Example:
class RawSheet
{
    Block & FindBlock(const Pos &pos);

    std::vector<Country> m_countries;
};

Block & RawSheet::FindBlock(const Pos &pos)
{
    if ((pos.b < m_countries.size()) &&
        (pos.c < m_countries[pos.b].cities.size()) &&
        (pos.d < m_countries[pos.b].cities[pos.c].blocks.size()))
    {
        return m_countries[pos.b].cities[pos.c].blocks[pos.d];
    }
    else
    {
        throw <some exception type here>;
    }
}
Then you could use it like this:
try
{
    Block &b = rawSheets[pos.a].FindBlock(pos);
    // Do stuff with b.
}
catch (const <some exception type here>& ex)
{
    std::cout << "Unable to find block in sheet " << pos.a << std::endl;
}
At the very least, you can continue to use vectors inside the RawSheet class, but with the lookup being inside a method, you can remove the vector abuse at a later date, without having to change any code elsewhere (see: Law of Demeter)!
Use references instead. This doesn't copy the object but just creates an alias, making it more usable, so performance is not affected.
Block& b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
(note the ampersand). When you use b you will be working with the original vector.
But as @delnan notes, you should be more worried about your code structure; I'm sure you could rewrite it in a more appropriate and maintainable way.
It's hard to give specific answers, since we don't know what the constraints are for your program or even what it does.
The code you've given isn't that bad given what little we know.
The first and second approaches you've shown are functionally identical. Both by default will return an object reference, but depending on the assignment a copy may be made. The second certainly makes one.
Sasha is right in that you probably want a reference rather than a copy of the object. Depending on how you're using it, you may want to make it const.
Since you're working with vectors, each access is constant-time and should be quite fast. If you're really concerned, time the call and consider how often it is made per second.
You should also consider the size of your dataset and think about whether another data structure (a database, perhaps) would be more appropriate.