RAII with array of COM objects - c++

Problem:
In COM you occasionally find functions with signatures like this:
HRESULT STDMETHODCALLTYPE GetColorContexts(
UINT cCount,
IWICColorContext **ppIColorContexts,
UINT *pcActualCount)
The problem this presents for me is that ppIColorContexts must be an initialized array of IWICColorContext*. I have tried passing the address of the first element of a std::vector of ATL::CComPtr<IWICColorContext>, with no luck: that doesn't trigger CComPtr's conversion operator, so the compiler complains about a type mismatch.
Attempted solutions:
vector<ATL::CComPtr<IWICColorContext>> failed due to the type mismatch. As noted in the comments, this approach has other issues as well: CComPtr overloads operator&, which historically broke STL containers. It seems that this was fixed in C++11, and the fix was already included in the standard library shipped with VC2010.
BOOST_SCOPE_EXIT_ALL works but still means I'm manually managing the lifetime of the COM objects which is something I'd like to get away from.
Unattempted solutions:
Custom data structure - this is likely what I'll have to do if there is not a more elegant solution, but at least it would allow me to take advantage of destruction semantics properly.
Attach a CComPtr after this call - I dislike this solution because it leaves me with a period of execution where the resource may not get released if something goes wrong.
std::unique_ptr<IWICColorContext[]> with a custom deleter - I have yet to fully explore this possibility but it would ensure that the COM objects would always get released.

I would do it by passing a vector of raw pointers to the function, then copying to another vector of CComPtr.
std::vector<IWICColorContext *> vec(5, NULL);
UINT nActualCount = 0;
GetColorContexts(vec.size(), &vec[0], &nActualCount);
std::vector<CComPtr<IWICColorContext> > results(vec.begin(), vec.begin() + nActualCount);
The only unfortunate part is that the CComPtr constructor performs an AddRef so you must do a corresponding Release on the raw pointers before they're lost.
for (auto it = vec.begin(); it != vec.end(); ++it)
    if (*it != NULL)
        (*it)->Release();
vec.clear();

Ultimately the solution was described by Igor Tandetnik in the comments above:
Basically, in VC2010+ an ATL::CComPtr has the same size as the pointer it wraps (e.g. sizeof(ATL::CComPtr<IWICColorContext>) == sizeof(IWICColorContext*)); as best I can tell this is because it has no virtual functions and thus needs no vtable. This is, however, highly dangerous, as it relies on a compiler implementation detail. Thus the following works:
std::vector<ATL::CComPtr<IWICColorContext> > vec(5);
// CComPtrs are created and initialized here
GetColorContexts(vec.size(), &vec[0].p, ...);
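If you rely on that layout detail anyway, a compile-time check at least documents the assumption (this is a suggestion of mine, not part of the original comments):
// Guard the implementation-detail assumption: a CComPtr must be nothing more
// than the wrapped raw pointer for the array reinterpretation above to be valid.
static_assert(sizeof(ATL::CComPtr<IWICColorContext>) == sizeof(IWICColorContext*),
              "CComPtr layout assumption does not hold on this compiler");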
Mark brought up a very good point: the solution above is completely dependent on a compiler implementation detail, which is dangerous. However, the alternative of only attaching the ATL::CComPtr after the GetColorContexts call was not palatable either, as it would not have been exception safe.
Ultimately my solution (tested this morning) is to create a temporary vector<IWICColorContext*> from the vector<CComPtr<IWICColorContext>>. This temporary vector does not increment the ref count and allows me to maintain exception safety.
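A sketch of that final approach, assuming frame is the IWICBitmapFrameDecode (or similar source) being queried and factory is an IWICImagingFactory; error handling omitted for brevity:
// Ask how many color contexts are available.
UINT count = 0;
frame->GetColorContexts(0, NULL, &count);

// Create the contexts up front; each CComPtr owns its context from birth,
// so an exception anywhere below cannot leak them.
std::vector<ATL::CComPtr<IWICColorContext>> contexts(count);
for (UINT i = 0; i < count; ++i)
    factory->CreateColorContext(&contexts[i]);

// Temporary array of raw pointers: copying the pointer values does not AddRef,
// and ownership stays with the CComPtrs.
std::vector<IWICColorContext*> raw(contexts.begin(), contexts.end());

UINT actual = 0;
frame->GetColorContexts(static_cast<UINT>(raw.size()), raw.empty() ? NULL : &raw[0], &actual);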

I think that you need something like this:
UINT lSize = 0;
ptr->GetColorContexts(0, NULL, &lSize); // returns the required number of contexts
IWICColorContext** ppColorContexts = new IWICColorContext*[lSize];
ptr->GetColorContexts(lSize, ppColorContexts, &lSize);
// use something to wrap the received raw interfaces with CComPtr -
// for example, a for loop that moves them into a new container
// which stores CComPtr<IWICColorContext> (see the sketch below)
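If the call really does hand back referenced interfaces, as this answer assumes, the wrapping loop might look like the following; Attach() adopts a pointer without an extra AddRef, so each interface is released exactly once when its CComPtr is destroyed:
std::vector<ATL::CComPtr<IWICColorContext>> owned(lSize);
for (UINT i = 0; i < lSize; ++i)
    owned[i].Attach(ppColorContexts[i]);  // owned[i] now owns the interface
delete[] ppColorContexts;                 // only the raw pointer array is freed here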

Related

Custom allocators as alternatives to vector of smart pointers?

This question is about owning pointers, consuming pointers, smart pointers, vectors, and allocators.
I am a little bit lost in my thoughts about code architecture. Furthermore, if this question already has an answer somewhere: 1. sorry, but I haven't found a satisfying answer so far, and 2. please point me to it.
My problem is the following:
I have several "things" stored in a vector and several "consumers" of those "things". So, my first try was like follows:
std::vector<thing> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return &i_am_the_owner_of_things[5]; // 5 is just an example
}
...
// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }

    thing* m_thing;
};
In my application, this would be safe because the "things" outlive the "consumers" in any case. However, more "things" can be added during runtime and that can become a problem because if the std::vector<thing> i_am_the_owner_of_things; gets reallocated, all the thing* m_thing pointers become invalid.
A fix to this scenario would be to store unique pointers to "things" instead of "things" directly, i.e. like follows:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return i_am_the_owner_of_things[5].get(); // 5 is just an example
}
...
// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }

    thing* m_thing;
};
The downside here is that memory coherency between "things" is lost. Can this memory coherency be re-established by using custom allocators somehow? I am thinking of something like an allocator which would always allocate memory for, e.g., 10 elements at a time and, whenever required, add another 10-element chunk of memory.
Example:
initially:
v = ☐☐☐☐☐☐☐☐☐☐
more elements:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
and again:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
Using such an allocator, I wouldn't even have to use std::unique_ptrs of "things" because at std::vector's reallocation time, the memory addresses of the already existing elements would not change.
As alternative, I can only think of referencing the "thing" in "consumer" via a std::shared_ptr<thing> m_thing, as opposed to the current thing* m_thing but that seems like the worst approach to me, because a "thing" shall not own a "consumer" and with shared pointers I would create shared ownership.
So, is the allocator-approach a good one? And if so, how can it be done? Do I have to implement the allocator by myself or is there an existing one?
If you are able to treat thing as a value type, do so. It simplifies things; you don't need a smart pointer to circumvent the pointer/reference invalidation issue. The latter can be tackled differently:
If new thing instances are inserted via push_front and push_back during the program, use std::deque instead of std::vector. Then no pointers or references to elements in this container are invalidated (iterators are invalidated, though - thanks to #odyss-jii for pointing that out); see the sketch after these two suggestions. If you fear that you heavily rely on the performance benefit of the completely contiguous memory layout of std::vector: create a benchmark and profile.
If new thing instances are inserted in the middle of the container during the program, consider using std::list. No pointers/iterators/references are invalidated when inserting or removing container elements. Iteration over a std::list is much slower than a std::vector, but make sure this is an actual issue in your scenario before worrying too much about that.
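A minimal sketch of the first suggestion, just to show that element addresses in a std::deque stay put as the container grows at either end:
#include <cassert>
#include <deque>

struct thing { int value; };

int main() {
    std::deque<thing> store;
    store.push_back(thing{1});
    thing* p = &store.front();       // pointer to the first element

    for (int i = 2; i <= 1000; ++i)  // grow at the back...
        store.push_back(thing{i});
    store.push_front(thing{0});      // ...and at the front

    assert(p->value == 1);           // still valid: push_back/push_front never move existing elements
}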
There is no single right answer to this question, since it depends a lot on the exact access patterns and desired performance characteristics.
Having said that, here is my recommendation:
Continue storing the data contiguously as you are, but do not store aliasing pointers to that data. Instead, consider a safer alternative (this is a proven method) where you fetch the pointer based on an ID right before using it -- as a side-note, in a multi-threaded application you can lock attempts to resize the underlying store whilst such a weak reference lives.
So your consumer will store an ID, and will fetch a pointer to the data from the "store" on demand. This also gives you control over all "fetches", so that you can track them, implement safety measures, etc.
void consumer::foo() {
    thing *t = m_thing_store.get(m_thing_id);
    if (t) {
        // do something with t
    }
}
Or a more advanced alternative to help with synchronization in a multi-threaded scenario:
void consumer::foo() {
    reference<thing> t = m_thing_store.get(m_thing_id);
    if (!t.empty()) {
        // do something with t
    }
}
Where reference would be some thread-safe RAII "weak pointer".
There are multiple ways of implementing this. One option is an open-addressing hash table with the ID as the key; this will give you roughly O(1) access time if you balance it properly.
Another alternative (best-case O(1), worst-case O(N)) is to use a "reference" structure, with a 32-bit ID and a 32-bit index (so same size as 64-bit pointer) -- the index serves as a sort-of cache. When you fetch, you first try the index, if the element in the index has the expected ID you are done. Otherwise, you get a "cache miss" and you do a linear scan of the store to find the element based on ID, and then you store the last-known index value in your reference.
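A rough sketch of that second alternative, with hypothetical names (each thing carries its own ID here so the fallback scan can find it):
#include <cstdint>
#include <vector>

struct thing { std::uint32_t id; /* payload ... */ };

struct thing_store {
    std::vector<thing> data;

    // 32-bit ID + 32-bit cached index: the same size as one 64-bit pointer.
    struct ref { std::uint32_t id; std::uint32_t last_index; };

    thing* get(ref& r) {
        // Fast path: the cached index still points at the element with the expected ID.
        if (r.last_index < data.size() && data[r.last_index].id == r.id)
            return &data[r.last_index];
        // Cache miss: linear scan by ID, then refresh the cached index.
        for (std::uint32_t i = 0; i < data.size(); ++i) {
            if (data[i].id == r.id) {
                r.last_index = i;
                return &data[i];
            }
        }
        return nullptr;  // the ID is no longer in the store
    }
};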
IMO the best approach would be to create a new container which behaves in a safe way.
Pros:
the change is done at a separate level of abstraction
changes to the old code will be minimal (just replace std::vector with the new container)
it is the "clean code" way to do it
Cons:
it may look like there is a bit more work to do
Another answer proposes using std::list, which will do the job, but with a larger number of allocations and slower random access. So IMO it is better to compose your own container from a couple of std::vectors.
So it may start to look more or less like this (minimal example):
#include <stdexcept>
#include <vector>

template<typename T>
class cluster_vector
{
public:
    static constexpr size_t cluster_size = 16;

    cluster_vector() {
        clusters.reserve(1024);
        add_cluster();
    }

    ...

    size_t size() const {
        if (clusters.empty()) return 0;
        return (clusters.size() - 1) * cluster_size + clusters.back().size();
    }

    T& operator[](size_t index) {
        throwIfIndexTooBig(index);
        return clusters[index / cluster_size][index % cluster_size];
    }

    void push_back(T&& x) {
        if_last_is_full_add_cluster();
        clusters.back().push_back(std::forward<T>(x));
    }

private:
    void throwIfIndexTooBig(size_t index) const {
        if (index >= size()) {
            throw std::out_of_range("cluster_vector out of range");
        }
    }

    void add_cluster() {
        clusters.push_back({});
        clusters.back().reserve(cluster_size);
    }

    void if_last_is_full_add_cluster() {
        if (clusters.back().size() == cluster_size) {
            add_cluster();
        }
    }

private:
    std::vector<std::vector<T>> clusters;
};
This way you provide a container which never reallocates existing items. It doesn't matter what T does.
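A short usage sketch, relying only on the members shown in the minimal example above:
cluster_vector<int> cv;
cv.push_back(1);
cv.push_back(2);

int* p = &cv[0];           // address of the first element

for (int i = 3; i < 100; ++i)
    cv.push_back(int(i));  // growth only appends new clusters

// p is still valid: existing elements were never reallocated or moved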
[A shared pointer] seems like the worst approach to me, because a "thing" shall not own a "consumer" and with shared pointers I would create shared ownership.
So what? Maybe the code is a little less self-documenting, but it will solve all your problems.
(And by the way you are muddling things by using the word "consumer", which in a traditional producer/consumer paradigm would take ownership.)
Also, returning a raw pointer in your current code is already entirely ambiguous as to ownership. In general, I'd say it's good practice to avoid raw pointers if you can (e.g. so you never need to call delete). I would return a reference if you go with unique_ptr:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;

thing& get_thing_for_consumer() {
    // some thing-selection logic
    return *i_am_the_owner_of_things[5]; // 5 is just an example
}

Fastest way to allocate temporary elements (knowing maximum number) in a vector?

In a function I need to store some integers in a vector. The function is called a lot of times. I know that there are fewer than 10 of them, but the exact number varies for each call of the function. Which choice gives better performance?
For example, I found that this:
std::vector<int> list(10);
std::vector<int>::iterator it = list.begin();
unsigned int num_of_elements_stored = 0;
for ( ... iterate on some structures ... ) {
    if (... a specific condition ...) {
        *it = integer from structures;
        it++;
        num_of_elements_stored++;
    }
}
is slower than:
std::vector<int> list;
unsigned int num_of_elements_stored(0);
for ( ... iterate on some structures ... ) {
    if (... a specific condition ...) {
        list.push_back( integer from structures );
    }
}
num_of_elements_stored = list.size();
I'm going to go down an extremely uncool route here. At the risk of being crucified, I would suggest that std::vector isn't so great here. An exception would be if you get lucky with the memory allocator and get that temporal locality through the allocator that creating and destroying a bunch of teeny vectors normally wouldn't provide.
Wait!
Before people kill me, I want to say that vector is awesome, generally speaking, as one of the most well-rounded data structures available. But when you're looking at a hotspot like this (hopefully with a profiler) as a result of creating a bunch of teeny vectors repeatedly in a tight loop, that's where this kind of straightforward usage of vector can bite you.
The trouble is that it's a heap-allocated structure (basically a dynamic array), and when we're dealing with a boatload of teeny arrays like this, we really want to use that often-cached memory at the top of the stack that's so cheap to allocate/free when we can.
One way to mitigate this is to reuse the same vector across repeated calls. Store it in the outside caller function's scope and pass it in by reference, clear it, do your push_backs, rinse and repeat. It's worth noting that clear doesn't free any memory in the vector, so it keeps that former capacity around (useful here when we want to reuse the same memory and play to temporal locality).
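A minimal sketch of that reuse pattern (the names here are made up for illustration):
#include <vector>

// The scratch vector lives in the caller and is passed in by reference,
// so its capacity is reused across calls instead of being reallocated every time.
void collect_matching(std::vector<int>& scratch /*, ... the structures ... */) {
    scratch.clear();                 // keeps the previously allocated capacity
    // for ( ... iterate on some structures ... )
    //     if (... a specific condition ...)
    //         scratch.push_back(value);
}

void caller() {
    std::vector<int> scratch;        // allocated once, reused on every iteration
    for (int i = 0; i < 100000; ++i) {
        collect_matching(scratch);
        // ... use scratch ...
    }
}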
But here we can play to that stack. As a simplified example (using C-style code that isn't very kosher in C++ and doesn't even bother with exception safety, but is easier to illustrate):
int stack_mem[32];
int num = 0;
int cap = 32;
int* ptr = stack_mem;

for ( ... iterate on some structures ... )
{
    if (... a specific condition ...)
    {
        if (num == cap)
        {
            cap *= 2;
            int* new_ptr = static_cast<int*>(malloc(cap * sizeof(int)));
            memcpy(new_ptr, ptr, num * sizeof(int));
            if (ptr != stack_mem)
                free(ptr);
            ptr = new_ptr;
        }
        ptr[num++] = your_int;
    }
}

if (ptr != stack_mem)
    free(ptr);
Of course if you use something like this, you should properly wrap it in a reusable class template that does bounds-checking, doesn't use memcpy, has exception-safety, a formal push_back method, emplace_back, copy ctor, move ctor, swap, possibly a fill ctor, range ctor, erase, range erase, insert, range insert, size, empty, iterators/begin/end, uses placement new to avoid requiring copy assignment or default ctor, etc.
The solution uses the stack when N <= 32 (can use a different number suited for your common-case needs) and then switches to heap when exceeded. This allows it to handle your common case scenarios efficiently but also not just go kablooey in those rare case scenarios when N might be huge in some pathological case. That makes it somewhat comparable to variable-length arrays in C (something I actually wish we had in C++, at least until std::dynarray is available) but without the stack overflow tendencies VLAs could have since it switches to heap in rare case scenarios.
I applied all these standard-compliant formalities with a structure based on this idea with a class template that accepts <T, FixedN>, and now use it almost as much as vector since I work with so many cases like this with teeny arrays being repeatedly created that should, in the vast majority of common cases, fit on the stack (but always with those ultra rare exceptional possibilities). It wiped off many profiler hotspots I was getting related to memory off the map.
... but applying this basic idea might give you quite a boost. You can apply that kind of effort above of wrapping it into a safe container preserving C++ object semantics if it pays off in your measurements, and I think it should quite a bit in your case.
I would probably go with sort of a middle ground:
std::vector<int> list;
list.reserve(10);
...and the rest could be pretty much like your second version. To be honest, however, it's probably open to question whether this will really make much of a difference.
If you use a static vector, it will be allocated only once.
The first example works slower because it allocates and destroys the vector on each call.
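For what it's worth, a minimal sketch of the static-vector idea (note that a function-local static is shared across calls, so it is not safe to use from multiple threads at once):
#include <vector>

void process() {
    static std::vector<int> list;   // constructed once; capacity persists between calls
    list.clear();                   // clear() keeps the capacity, so push_back rarely allocates
    // for ( ... iterate on some structures ... )
    //     if (... a specific condition ...)
    //         list.push_back(value);
}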

concurrent_vector invalid data

using : VC++ 2013
concurrency::concurrent_vector<datanode*> dtnodelst
Occasionally when I do dtnodelst->at(i) ... I am getting an invalid address (0XCDCD.., of course),
which shouldn't be the case, because after I push back I never delete or remove any of the items (even if I did delete, it should have returned the old, deleted address... but I am never deleting, so that is not even the case).
datanode* itm = new datanode();
....
dtnodelst->push_back(itm);
Any ideas on what might be happening?
P.S. I am using the Windows thread pool. Sometimes I can do 8 million inserts and finds and everything goes fine... but sometimes even 200 inserts and finds will fail. I am kind of lost. Any help would be awesomely appreciated!
thanks and best regards
Actual code as an FYI:
struct datanode {
    volatile int nodeval;
    T val;
};

concurrency::concurrent_vector<datanode*> lst;

inline T find(UINT32 key)
{
    for (int i = 0; i < lst->size(); i++)
    {
        datanode* nd = lst->at(i);
        // nd is invalid sometimes
        if (nd)
            if (nd->nodeval == key)
            {
                return (nd->val);
            }
    }
    return NULL;
}

inline T insert_nonunique(UINT32 key, T val) {
    datanode* itm = new datanode();
    itm->val = val;
    itm->nodeval = key;
    lst->push_back(itm);
    _updated(lst);
    return val;
}
The problem is the use of concurrent_vector::size(), which is not fully thread-safe: you can get a reference to not-yet-constructed elements (where the memory still contains garbage). The Microsoft PPL library (which provides it in the concurrency:: namespace) uses the Intel TBB implementation of concurrent_vector, and the TBB Reference says:
size_type size() const
Returns: Number of elements in the vector. The result may include elements that are allocated but still under construction by concurrent calls to any of the growth methods.
Please see my blog for more explanation and possible solutions.
In TBB, the most reasonable solution is to use tbb::zero_allocator as the underlying allocator of the concurrent_vector, so that newly allocated memory is filled with zeroes before size() counts it:
concurrent_vector<datanode*, tbb::zero_allocator<datanode*> > lst;
Then, the condition if (nd) will filter out not-yet-ready elements.
volatile is no substitute for atomic<T>. Do not use volatile in some attempt to provide synchronization.
The whole idea of your find call doesn't make sense in a concurrent context. As soon as the function iterates over one value, it could be mutated by another thread to be the value you're looking for. Or it could be the value you want, but mutated to be some other value. Or as soon as it returns false, the value you're seeking is added. The return value of such a function would be totally meaningless. size() has all the same problems, which is a good part of why your implementation would never work.
Inspecting the state of concurrent data structures is a very bad idea, because the information becomes invalid the moment you have it. You should design operations that do not require knowing the state of the structure to execute correctly, or, block all mutations whilst you operate.

Should I be worried about having too many levels of vectors in vectors?

Should I be worried about having too many levels of vectors in vectors?
For example, I have a hierarchy of 5 levels and I have this kind of code
all over my project:
rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d]
where each element is a vector. The whole thing is a vector of vectors of vectors ...
Using this should still be a lot faster than copying the object like this:
Block b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
// use b
The second approach is much nicer, but slower, I guess.
Please give me any suggestions on whether I should worry about performance issues related to this,
or anything else...
Thanks
Efficiency won't really be affected in your code (the cost of a vector random access is basically nothing); what you should be concerned with is the abuse of the vector data structure.
There's little reason that you should be using a vector over a class for something as complex as this. Classes with properly defined interfaces won't make your code any more efficient, but they WILL make maintenance much easier in the future.
Your current solution can also run into undefined behaviour. Take for example the code you posted:
Block b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
Now what happens if the vector indexes referred to by pos.a, pos.b, pos.c, pos.d don't exist in one of those vectors? You'll go into undefined behaviour and your application will probably segfault (if you're lucky).
To fix that, you'll need to compare the size of ALL vectors before trying to retrieve the Block object.
e.g.
Block b;
if ((pos.a < rawSheets.size()) &&
    (pos.b < rawSheets[pos.a].countries.size()) &&
    (pos.c < rawSheets[pos.a].countries[pos.b].cities.size()) &&
    (pos.d < rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks.size()))
{
    b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
}
Are you really going to do that every time you need a block?!!
You could do that, or you can, at the very least, wrap it up in a class...
Example:
class RawSheet
{
    Block & FindBlock(const Pos &pos);

    std::vector<Country> m_countries;
};
Block & RawSheet::FindBlock(const Pos &pos)
{
    if ((pos.b < m_countries.size()) &&
        (pos.c < m_countries[pos.b].cities.size()) &&
        (pos.d < m_countries[pos.b].cities[pos.c].blocks.size()))
    {
        return m_countries[pos.b].cities[pos.c].blocks[pos.d];
    }
    else
    {
        throw <some exception type here>;
    }
}
Then you could use it like this:
try
{
    Block &b = rawSheets[pos.a].FindBlock(pos);
    // Do stuff with b.
}
catch (const <some exception type here>& ex)
{
    std::cout << "Unable to find block in sheet " << pos.a << std::endl;
}
At the very least, you can continue to use vectors inside the RawSheet class, but with it being inside a method, you can remove the vector abuse at a later date, without having to change any code elsewhere (see: Law Of Demeter)!
Use a reference instead. This doesn't copy the object; it just creates an alias, so performance is not affected.
Block& b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
(watch the ampersand). When you use b you will be working with the original object.
But as #delnan notes you should be worried more about your code structure - I'm sure you could rewrite it in a more appropriate and maintainable way.
You should be wary of specific answers, since we don't know what the constraints are for your program or even what it does.
The code you've given isn't that bad given what little we know.
The first and second approaches you've shown are functionally identical. Both by default will return an object reference but depending on assignment may result in a copy being made. The second certainly will.
Sasha is right in that you probably want a reference rather than a copy of the object. Depending on how you're using it you may want to make it const.
Since you're working with vectors, each call is fixed time and should be quite fast. If you're really concerned, time the call and consider how often the call is made per second.
You should also consider the size of your dataset and think about whether another data structure (a database, perhaps) would be more appropriate.

std::vector Assertion failed (vector iterators incompatible)

I have this struct:
struct MxMInstanceData
{
    D3DXVECTOR2 mTransform;
    float mSpacing;
};
Then I create a vector of MxMInstanceData:
std::vector<MxMInstanceData> instInFrustumData;
If I call instInFrustumData.clear() I get this error:
Assertion failed (vector iterators incompatible)
Vector creation code:
instInFrustumData.reserve(mNumInstances);
Vector update code:
void Terrain::updateInstances()
{
    mNumInstancesInFrustum = 0;

    if (instInFrustumData.size() != 0)
        instInFrustumData.clear();

    mpMxMInstInFrustumB->Map(D3D10_MAP_WRITE_DISCARD, NULL, (void**) &instInFrustumData);

    for (int x = 0; x < mNumInstances; x++)
    {
        if (mpCamera->point2DInFrustum(instData[x].mTransform +
            D3DXVECTOR2(instData[x].mSpacing/2 + mpCamera->getPosition().x, instData[x].mSpacing/2 + mpCamera->getPosition().z), instData[x].mSpacing/2)
            != OUTSIDE)
        {
            instInFrustumData.push_back(instData[x]);
            mNumInstancesInFrustum++;
        }
    }

    mpMxMInstInFrustumB->Unmap();
}
What can make this happen?
And in the destructor of my class I also call clear()
You may want to check out a reference on using std::vector like http://www.cplusplus.com/reference/stl/vector/ or buy a good STL book. You are using some methods in what I would consider unorthodox ways.
Use empty() to check whether a vector has elements (if (!empty()) clear(); just reads better than checking size() != 0)
Use locally scoped variables when possible (things that don't need to stay in scope shouldn't)
Use STL iterators or container sizes in loops (is having two incrementing integers in one loop needed?)
Use the "best" STL container for your implementation (do you want vectors or maps here?)
Avoid C-style casts and misuse of objects ((void**) &instInFrustumData is a very bad idea)
You have so many member variables whose definitions are unknown, as well as the unknown methods Map() and Unmap(), and you still haven't shown any code using iterators related to your original error. I would guess your use of instData[x] is dangerous and problematic, as is the way that loop is constructed in general. You also really don't want to be treating STL containers as anything but STL containers. Things like (void**) &instInFrustumData should be avoided as they can only cause problems.
I highly suggest you learn C++ first before tackling DirectX or graphics and game engines written in both.
Kind of guessing here, but maybe your problem is this line:
mpMxMInstInFrustumB->Map(D3D10_MAP_WRITE_DISCARD, NULL, (void**) &instInFrustumData);
You're passing a pointer to the vector itself to this Map function, which I'm guessing might be overwriting some of its internals? I don't have its documentation, but it doesn't look like a function that's expecting a pointer to a vector :)
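If that is the case, the usual pattern is to map the buffer into its own pointer and copy the finished vector's contents into the mapped memory, rather than handing the vector itself to Map(). A hedged sketch of what that might look like here, reusing the member names from the question and assuming mpMxMInstInFrustumB is an ID3D10Buffer* large enough to hold the instances:
// Build the CPU-side list first (the frustum loop from the question).
instInFrustumData.clear();
// ... push_back the instances that pass the frustum test ...

// Then map the GPU buffer and copy the data into the mapped memory.
void* mapped = NULL;
HRESULT hr = mpMxMInstInFrustumB->Map(D3D10_MAP_WRITE_DISCARD, 0, &mapped);
if (SUCCEEDED(hr))
{
    if (!instInFrustumData.empty())
        memcpy(mapped, &instInFrustumData[0],
               instInFrustumData.size() * sizeof(MxMInstanceData));
    mpMxMInstInFrustumB->Unmap();
}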