Keeping track of (stack-allocated) objects - c++

In a rather large application, I want to keep track of some statistics about objects of a certain class. In order to not degrade performance, I want the stats to be updated in a pull-configuration. Hence, I need to have a reference to each live object in some location. Is there an idiomatic way to:
Create, search, iterate such references
Manage it automatically (i.e. remove the reference upon destruction)
I am thinking in terms of a set of smart pointers here, but the memory management would be somewhat inverted: Instead of destroying the object when the smart pointer is destroyed, I'd want the smart pointer to be removed, when the object is destroyed. Ideally, I do not want to reinvent the wheel.
I could live with a delay in the removal of the pointers, I'd just need a way to invalidate them quickly.
edit: Because paddy asked for it: The reason for pull-based collection is that obtaining the information may be relatively costly. Pushing is obviously a clean solution but considered too expensive.

There is no special feature of the language that will allow you to do this. Sometimes object tracking is handled by rolling your own memory allocator, but this doesn't work easily on the stack.
But if you're using only the stack it actually makes your problem easier, assuming that the objects being tracked are on a single thread. C++ makes special guarantees about the order of construction and destruction on the stack. That is, the destruction order is exactly the reverse of construction order.
And so, you can leverage this to store a single pointer in each object, plus one static pointer to track the most recent one. Now you have an object stack represented as a linked list.
template <typename T>
class Trackable
{
public:
    Trackable()
        : previous( current() )
    {
        current() = this;
    }
    ~Trackable()
    {
        current() = previous;
    }
    // External interface. static_cast is used because Trackable has no virtual
    // functions, and a static member function cannot be const-qualified.
    static const T *head() { return static_cast<const T*>( current() ); }
    const T *next() const { return static_cast<const T*>( previous ); }
private:
    static Trackable * & current()
    {
        static Trackable *ptr = nullptr;
        return ptr;
    }
    Trackable *previous;
};
Example:
struct Foo : Trackable<Foo> {};
struct Bar : Trackable<Bar> {};
// :::
// Walk linked list of Foo objects currently on stack.
for( const Foo *foo = Foo::head(); foo; foo = foo->next() )
{
// Do kung foo
}
Now, admittedly this is a very simplistic solution. In a large application you may have multiple stacks using your objects. You could handle stacks on multiple threads by making current() use thread_local semantics, although that needs some extra machinery: head() would then have to point at a registry of threads, and that registry would require synchronization.
You definitely don't want to synchronize all stacks into a single list, because that will kill your program's performance scalability.
As for your pull-requirement, I presume it's a separate thread wanting to walk over the list. You would need a way to synchronize such that all new object construction or destruction is blocked inside Trackable<T> while the list is being iterated. Or similar.
But at least you could take this basic idea and extend it to your needs.
Remember, you can't use this simple list approach if you allocate your objects dynamically. For that you would need a bi-directional list.
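To make that last point a bit more concrete, here is a minimal single-threaded sketch of the bi-directional variant. The names TrackableDL, head_ref() and live_foos() are made up for illustration, and no synchronization is attempted, so treat it as a starting point rather than a finished solution.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical doubly linked variant of Trackable: objects may be destroyed
// in any order, so each node unlinks itself from both neighbours.
template <typename T>
class TrackableDL
{
public:
    TrackableDL() : prev_(nullptr), next_(head_ref())
    {
        if (next_) next_->prev_ = this;
        head_ref() = this;
    }
    // A copy is a new live object, so it links itself in as well.
    TrackableDL(const TrackableDL &) : TrackableDL() {}
    // Assignment must not touch the links of either object.
    TrackableDL &operator=(const TrackableDL &) { return *this; }
    ~TrackableDL()
    {
        if (prev_) prev_->next_ = next_;
        else head_ref() = next_;
        if (next_) next_->prev_ = prev_;
    }

    static const T *head() { return static_cast<const T *>(head_ref()); }
    const T *next() const { return static_cast<const T *>(next_); }

private:
    static TrackableDL *&head_ref()
    {
        static TrackableDL *ptr = nullptr;
        return ptr;
    }
    TrackableDL *prev_;
    TrackableDL *next_;
};

struct Foo : TrackableDL<Foo> {};

// Counts the Foo objects that are currently alive.
std::size_t live_foos()
{
    std::size_t n = 0;
    for (const Foo *f = Foo::head(); f; f = f->next()) ++n;
    return n;
}
```

The extra pointer per object is the price for being able to remove an element from the middle of the list in constant time.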

The simplest approach is to have code inside each object so that it registers itself on instantiation and removes itself upon destruction. This code can easily be injected using a CRTP:
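#include <set>
template <class T>
struct AutoRef {
    static auto &all() {
        static std::set<T*> theSet;
        return theSet;
    }
private:
    friend T;
    AutoRef() { all().insert(static_cast<T*>(this)); }
    // Copies are live objects too, so they register themselves as well.
    AutoRef(const AutoRef &) : AutoRef() {}
    ~AutoRef() { all().erase(static_cast<T*>(this)); }
};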
Now a Foo class can inherit from AutoRef<Foo> to have its instances referenced inside AutoRef<Foo>::all().
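As a rough illustration of the pull side, the sketch below repeats the helper so that it compiles on its own and then walks the set on demand. Foo, its value member and sum_of_live_values() are hypothetical names chosen for the example, and like the original it assumes single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

template <class T>
struct AutoRef {
    static std::set<T*> &all() {
        static std::set<T*> theSet;
        return theSet;
    }
private:
    friend T;
    AutoRef() { all().insert(static_cast<T*>(this)); }
    AutoRef(const AutoRef &) : AutoRef() {} // copies register themselves too
    ~AutoRef() { all().erase(static_cast<T*>(this)); }
};

// Hypothetical tracked class with one statistic to pull.
struct Foo : AutoRef<Foo> {
    explicit Foo(int v) : value(v) {}
    int value;
};

// Pull-style collection: every live Foo is visited only when asked.
int sum_of_live_values() {
    int sum = 0;
    for (const Foo *f : AutoRef<Foo>::all()) sum += f->value;
    return sum;
}
```

Construction and destruction each cost one set operation, while the potentially expensive statistics are only computed when sum_of_live_values() is called.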

Related

How to safely implement reusable scratch memory in C++?

It is very common that even pure functions require some additional scratch memory for their operations. If the size of this memory is known at compile time, we can allocate this memory on the stack with std::array or a C array. But the size often depends on the input, so we often resort to dynamic allocations on the heap through std::vector.
Consider a simple example of building a wrapper around some C api:
void addShapes(std::span<const Shape> shapes) {
std::vector<CShape> cShapes;
cShapes.reserve(shapes.size());
// Convert shapes to a form accepted by the API
for (const Shape& shape : shapes) {
cShapes.push_back(static_cast<CShape>(shape));
}
cAddShapes(context, cShapes.data(), cShapes.size());
}
Let's say that we call this function repeatedly and that we identify that the overhead of std::vector memory allocations is significant, even with the call to reserve(). So what can we do?
We could declare the vector as static to reuse the allocated space between calls, but that comes with several problems. First, it is no longer thread safe, but that can be fixed easily enough by using thread_local instead. Second, the memory doesn't get released until the program or thread terminates. Let's say we are fine with that. And lastly, we have to remember to clear the vector every time, because it's not just the memory that will persist between function calls, but the data as well.
void addShapes(std::span<const Shape> shapes) {
thread_local std::vector<CShape> cShapes;
cShapes.clear();
// Convert shapes to a form accepted by the API
for (const Shape& shape : shapes) {
cShapes.push_back(static_cast<CShape>(shape));
}
cAddShapes(context, cShapes.data(), cShapes.size());
}
This is the pattern I use whenever I would like to avoid the dynamic allocation on every call. The issue is, I don't think the semantics of this are very apparent if you aren't aware of the pattern. thread_local looks scary, you have to remember to clear the vector and even though the lifetime of the object now extends beyond the scope of the function, it is unsafe to return a reference to it, because another call to the same function would modify it.
My first attempt to make this a bit easier was to define a helper function like this:
template <typename T, typename Cleaner = void (T&)>
T& getScratch(Cleaner cleaner = [] (T& o) { o.clear(); }) {
thread_local T scratchObj;
cleaner(scratchObj);
return scratchObj;
}
void addShapes(std::span<const Shape> shapes) {
std::vector<CShape>& cShapes = getScratch<std::vector<CShape>>();
// Convert shapes to a form accepted by the API
for (const Shape& shape : shapes) {
cShapes.push_back(static_cast<CShape>(shape));
}
cAddShapes(context, cShapes.data(), cShapes.size());
}
But of course, that creates a thread_local variable for each template instantiation of the getScratch function, rather than for each place the function is called. So if we asked for two vectors of the same type at once, we'd get two references to the same vector. Not good.
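The aliasing can be shown with a stripped-down version of the helper. This is only a sketch: the cleaner parameter is dropped for brevity and scratchAliases() is a hypothetical function written just to expose the problem.

```cpp
#include <cassert>
#include <vector>

// Simplified helper: one thread_local object per type T and thread.
template <typename T>
T& getScratch() {
    thread_local T scratchObj;
    scratchObj.clear();
    return scratchObj;
}

// Returns true if two "separate" scratch vectors are in fact the same object.
bool scratchAliases() {
    std::vector<int>& a = getScratch<std::vector<int>>();
    a.push_back(1);
    // The second request hands out the same vector and clears it,
    // which silently wipes the element that was just stored through a.
    std::vector<int>& b = getScratch<std::vector<int>>();
    return &a == &b && a.empty();
}
```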
What would be a good way to implement this sort of a reusable memory safely and cleanly? Are there already existing solutions? Or should we not use thread local storage in this way and just use local allocations despite the performance benefits that reusing them brings: https://quick-bench.com/q/VgkPLveFL_K5wT5wX6NL1MRSE8c ?
To answer my own question, I came up with a solution that builds upon the last example. Rather than keeping only one object for each thread and type, let's keep a free list of them. Upon request, we either reuse an object from the free list or create a new one. The user keeps a RAII-style handle that returns the object to the free list when it leaves the scope. Since we still use thread_local, this is thread safe without any effort. We can wrap all this into a simple class:
template <typename T>
class Scratch {
public:
template <typename Cleaner = void (T&)>
explicit Scratch(Cleaner cleaner = [] (T& o) { o.clear(); }) : borrowedObj(acquire()) {
cleaner(borrowedObj);
}
T& operator*() {
return borrowedObj;
}
T* operator->() {
return &borrowedObj;
}
~Scratch() {
release(std::move(borrowedObj));
}
private:
static inline thread_local std::vector<T> freeList; // inline (C++17) makes this declaration a definition as well
T borrowedObj;
static T acquire() {
if (!freeList.empty()) {
T obj = std::move(freeList.back());
freeList.pop_back();
return obj;
} else {
return T();
}
}
static void release(T&& obj) {
freeList.push_back(std::move(obj));
}
};
That can be used simply as:
void addShapes(std::span<const Shape> shapes) {
Scratch<std::vector<CShape>> cShapes;
// Convert shapes to a form accepted by the API
for (const Shape& shape : shapes) {
cShapes->push_back(static_cast<CShape>(shape));
}
cAddShapes(context, cShapes->data(), cShapes->size());
}
You might want to extend this as needed, perhaps add a [] operator for convenience if it's going to be used with containers. You could keep its intended use to be a local object in a function and explicitly make it non-copyable and non-movable, or it could be turned into a general purpose handle like unique_ptr. But beware that the object must be destroyed by the same thread that created it.
In both cases it addresses my issues with a raw thread_local. The clear is implicit and returning a reference to the scratch object or its data is now obviously wrong. It still doesn't automatically free memory, which is what we want after all, but at least it's now easier to implement the functionality to free it on demand as needed.
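For what it's worth, freeing on demand could look roughly like the sketch below. The names releaseCached() and cachedCount() are made up for this example, the cleaner parameter is dropped for brevity, and the free list lives in a function-local thread_local so that the sketch does not depend on inline variables.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

template <typename T>
class Scratch {
public:
    Scratch() : borrowedObj(acquire()) { borrowedObj.clear(); }
    ~Scratch() { freeList().push_back(std::move(borrowedObj)); }

    // Intended as a function-local handle only.
    Scratch(const Scratch&) = delete;
    Scratch& operator=(const Scratch&) = delete;

    T& operator*() { return borrowedObj; }
    T* operator->() { return &borrowedObj; }

    // Hypothetical addition: drop every cached object of the calling thread.
    static void releaseCached() {
        freeList().clear();
        freeList().shrink_to_fit();
    }
    static std::size_t cachedCount() { return freeList().size(); }

private:
    static std::vector<T>& freeList() {
        thread_local std::vector<T> list;
        return list;
    }
    static T acquire() {
        if (freeList().empty()) return T();
        T obj = std::move(freeList().back());
        freeList().pop_back();
        return obj;
    }
    T borrowedObj;
};
```

A caller that knows a burst of work is over can then call Scratch<std::vector<CShape>>::releaseCached() on the same thread to give the memory back.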
In general, it should have lower memory usage than the raw thread_local method, too, since allocations of the same type can be reused across different call sites. But there is a scenario in which this behavior will result in a higher memory usage, too. Let's say we have a function that needs a std::vector<int> of size 10000. If we call this function and then ask for a vector of the same type, we will get the one with capacity 10000. If we then call the function again while holding this vector, it will have to create another one, resizing it to 10000 elements, too.
For those reasons I would recommend using it only where you don't expect to see large amounts of data, but rather want to avoid lots of small, but frequent and short-lived allocations.
"We could declare the vector as static to reuse the allocated space between calls, but that comes with several problems. First, it is no longer thread safe, but that can be fixed easily enough by using thread_local instead. Second, the memory doesn't get released until the program or thread terminates."
Exactly. Only the caller knows how and when the function will be called, so only the caller should be responsible for reusing the space and for releasing it, because only the caller knows whether it will be needed again. So pass a cache object into your function and keep the reusable state there.
void addShapes(std::span<const Shape> shapes, std::vector<CShape>& cache) {
    cache.clear(); // keep the capacity from earlier calls, drop the old elements
    cache.reserve(shapes.size());
    // Convert shapes to a form accepted by the API
    for (const Shape& shape : shapes) {
        cache.push_back(static_cast<CShape>(shape));
    }
    cAddShapes(context, cache.data(), cache.size());
}
Or you could objectify it a bit, like:
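class shapes {
    std::vector<CShape> cache;
public:
    void add(std::span<const Shape> shapes) {
        cache.clear(); // reuse the capacity, drop the previous call's elements
        cache.reserve(shapes.size());
        // Convert shapes to a form accepted by the API
        for (const Shape& shape : shapes) {
            cache.push_back(static_cast<CShape>(shape));
        }
        cAddShapes(context, cache.data(), cache.size());
    }
    // Asks the vector to give its memory back once it is no longer needed.
    void clear_cache() {
        cache.clear();
        cache.shrink_to_fit();
    }
};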

Creating template types without new/delete

I have a C++ Object class like this:
#include <map>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <utility>
class Component {};
template <typename T>
concept component = std::is_base_of_v<Component, T>;
class Object
{
    std::map<std::type_index, Component*> components;
public:
    template<component T>
    T* add()
    {
        auto it = components.find(typeid(T));
        if(it == components.cend())
        {
            T* value{new T{}};
            components[typeid(T)] = static_cast<Component*>(value);
            return value;
        }
        return static_cast<T*>(it->second); // already present
    }
    template<component T, typename... Args>
    T* add(Args &&... args)
    {
        auto it = components.find(typeid(T));
        if(it == components.cend())
        {
            T* value{new T{std::forward<Args>(args)...}};
            components[typeid(T)] = static_cast<Component*>(value);
            return value;
        }
        return static_cast<T*>(it->second); // already present
    }
};
Components that are added to Object are deleted in another function that is not related to my question. AFAIK a lot of new/delete calls (heap allocations) hurt performance, and there will be around 20-30 (or even more) Objects with 3-10 Object::add calls on each one. I thought that I could just call T's constructor without new and then store static_cast<Component*>(&value), but the Component added to the map is "invalid", meaning all of T's members are wrong (e.g. in a class with some int members, they are all equal to 0 instead of the custom values passed to its constructor). I am aware that value goes out of scope and the pointer in the map becomes a dangling one, but I can't find a way to instantiate T objects without calling new or without declaring them as static. Is there any way to do this?
EDIT: If I declare value as static, everything works as expected, so I guess it's a lifetime issue related to value.
I suppose, you think of this as the alternative way of creating your objects
T value{std::forward<Args>(args)...};
components[typeid(T)] = static_cast<Component*>(&value);
This creates a local variable on the stack. Doing the assignment then, stores a pointer to a local variable in the map.
When you leave method add(), the local object will be destroyed, and you have a dangling pointer in the map. This, in turn, will bite you eventually.
As long as you want to store pointers, there's no way around new and delete. You can mitigate this a bit with some sort of memory pool.
If you may also store objects instead of pointers in the map, you could create the components in place with std::map::emplace. When you do this, you must also remove the call to delete and clean up the objects some other way.
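If you stay with pointers, the delete side can at least be automated. The following is a minimal sketch, assuming Component gets a virtual destructor and that the map may own its components through std::unique_ptr; the Health type is a hypothetical component used only for the demonstration, and a static_assert stands in for the concept to keep the sketch short.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct Component {
    virtual ~Component() = default; // needed to delete through a base pointer
};

class Object {
    std::map<std::type_index, std::unique_ptr<Component>> components;
public:
    // Creates the component if absent; returns the stored instance either way.
    template <typename T, typename... Args>
    T* add(Args&&... args) {
        static_assert(std::is_base_of<Component, T>::value,
                      "T must derive from Component");
        auto& slot = components[std::type_index(typeid(T))];
        if (!slot)
            slot = std::make_unique<T>(std::forward<Args>(args)...);
        return static_cast<T*>(slot.get());
    }
};

// Hypothetical component used only for the demonstration.
struct Health : Component {
    explicit Health(int hp) : hp(hp) {}
    int hp;
};
```

This still allocates once per component, but nothing has to call delete by hand: the components are released when the Object is destroyed.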
Trying to avoid heap allocations before you've proven that they indeed hurt your programs' performance is not a good approach in my opinion. If that was the case, you should probably get rid of std::map in your code as well. That being said, if you really want to have no new/delete calls there, it can be done, but requires explicit enumeration of the Component types. Something like this could be what you are looking for:
#include <array>
#include <variant>
// Note that components no longer have to implement any specific interface, which might actually be useful.
struct Component1 {};
struct Component2 {};
// Component now is a variant enumerating all known component types.
using Component = std::variant<std::monostate, Component1, Component2>;
struct Object {
// Now there is no need for std::map, as we can use variant size
// and indexes to create and access a std::array, which avoids more
// dynamic allocations.
std::array<Component, std::variant_size_v<Component> - 1> components;
bool add (Component component) {
// components elements hold std::monostate by default, and holding std::monostate
// is indicated by returning index() == 0.
if (component.index() > 0 && components[component.index() - 1].index() == 0) {
components[component.index() - 1] = std::move(component);
return true;
}
return false;
}
};
Component enumerates all known component types, this allows to avoid dynamic allocation in Object, but can increase memory usage, as the memory used for single Object is roughly number_of_component_types * size_of_largest_component.
While the other answers made clear what the problem is I want to make a proposition how you could get around this in its entirety.
You know at compile time which types can be in the map at most, since you know which instantiations of the add template were used. Hence you can get rid of the map and do it all at compile time.
template<component... Comps>
struct object{
    std::tuple<std::optional<Comps>...> components;
    template<component Comp, class... Args>
    void add(Args&&... args) {
        std::get<std::optional<Comp>>(components).emplace(std::forward<Args>(args)...);
    }
};
Of course this forces you to list all the possible component types when you declare the object. That is no more information than you already need to have; it is just less convenient to supply.
You could add the following overload for add to make the errors easier to read
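template<component T>
void add(...) {
    // The condition depends on T so that it only fires on instantiation;
    // a plain static_assert(false) in a template is rejected by many compilers before C++23.
    static_assert(sizeof(T) == 0, "Please add T to the component list of this object");
}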

Enforce no bald pointers in C++

Generally speaking, in this crazy modern world full of smart pointers, I'm coming round to the fact that bald/bare pointers shouldn't be in method signatures.
I suppose there might be some cases where you need bald pointers for performance reasons but certainly I've always managed with references or references to smart pointers.
My question is this, can anyone suggest an automated way to enforce this? I can't see a way with clang, GCC or vera++
I think it is ok for non-owning raw-pointers to be used in method signatures. It is not ok to use "owning" raw-pointers.
For example this is quite ok:
void bar(const Widget* widget) {
if (widget)
widget->doSomething();
}
void foo() {
Widget w;
bar(&w);
}
If you want the Widget to be nullable you can't use a reference and using a smart pointer would be slower and not express the ownership model. The Widget can be guaranteed to be alive for the whole scope of foo. bar simply wants to "observe" it so a smart pointer is not needed. Another example where a non-owning raw-pointer is appropriate is something like:
class Node {
private:
std::vector<std::unique_ptr<Node>> children_; // Unique ownership of children
Node* parent_;
public:
// ...
};
The Node "owns" its children and when a Node dies its children die with it. Providing encapsulation of Node hasn't been broken, a Node can guarantee its parent is alive (if it has one). So a non-owning raw-pointer is appropriate for a back reference to the parent. Using smart pointers where they are not required makes code harder to reason about, can result in memory leaks and is premature pessimization.
If smart pointers are used throughout instead, all kinds of things can go wrong:
// Bad!
class Node {
private:
std::vector<std::shared_ptr<Node>> children_; // Can't have unique ownership anymore
std::shared_ptr<Node> parent_; // Cyclic reference!
public:
// Allow someone to take ownership of a child from its parent.
std::shared_ptr<Node> getChild(size_t i) { return children_.at(i); }
// ...
};
Some of these problems can be solved by using weak_ptr instead of shared_ptr, but they still need to be used with care. A weak_ptr does not own the object it refers to; it can only be promoted to an owning shared_ptr, via lock(), for as long as the object is still alive.
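A minimal sketch of that weak_ptr variant, assuming shared ownership of children is really wanted. The public members and the attach() helper are simplifications made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    std::vector<std::shared_ptr<Node>> children; // owning
    std::weak_ptr<Node> parent;                  // non-owning back reference
};

// Links child under parent without creating an ownership cycle.
void attach(const std::shared_ptr<Node>& parent,
            const std::shared_ptr<Node>& child) {
    child->parent = parent;
    parent->children.push_back(child);
}
```

Because the back reference does not count as an owner, releasing the last shared_ptr to the parent destroys it, and the child can detect that through parent.expired() or an empty result from parent.lock().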
The point is to use the appropriate kind of pointer in the appropriate context and to have an ownership model that is as simple as possible.
And no, I don't think there is an automated way of enforcing this, it is part of the design. There are tricks you can use like Cheersandhth.-Alf said to try and enforce a particular class is used in the way you want by restricting access.
I don't know of any automated tool to do this, but such a check wouldn't be too hard to implement in cppcheck, which is open source and AIUI offers an easy way to add new rules.
I don't think it's a good idea to completely stay away from raw (bald, bare) pointers even for function arguments. But it can sometimes be a good idea to make sure that all instances of a certain type are dynamic and owned by smart pointers. This involves two measures:
Ensure (to the degree practical) that the type can only be instantiated dynamically.
Ensure (to the degree practical) that instantiation yields a smart pointer.
The first point is easy: just make the destructor non-public, and voilà, the few remaining ways to instantiate that type non-dynamically are not very natural. Any simple attempt to declare a local or namespace scope variable will fail, as will simple attempts to use that type directly as a data member type:
#include <memory>
template< class Type >
void destroy( Type* p ) { delete p; }
class Dynamic
{
template< class Type > friend void destroy( Type* );
protected:
virtual ~Dynamic() {}
};
//Dynamic x; //! Nope.
struct Blah
{
Dynamic x; //! Nope, sort of.
};
auto main()
-> int
{
//Dynamic x; //! Nope.
new Blah; //!Can't delete, but new is OK with MSVC 12...
}
Visual C++ 12.0 unfortunately accepts the above code as long as the out-commented statements remain out-commented. But it does issue a stern warning about its inability to generate a destructor.
So, Dynamic can only be created dynamically (without using low-level shenanigans, or ignoring that MSVC warning and accepting a memory leak), but how to ensure that every new instance is owned by a smart pointer?
Well you can restrict access to the constructors so that a factory function must be used. Another way to ensure use of the factory function is to define a placement allocation function, operator new, that an ordinary new-expression can't access. Since that's most esoteric and possibly non-trivial, and not of clear-cut value, the below code just restricts access to the constructors:
#include <memory> // std::shared_ptr
#include <utility> // std::forward
// A common "static" lifetime manager class that's easy to be-"friend".
struct Object
{
template< class Type >
static
void destroy( Type* p ) { delete p; }
template< class Type, class... Args>
static
auto create_shared( Args&&... args )
-> std::shared_ptr<Type>
{
return std::shared_ptr<Type>(
new Type( std::forward<Args>( args )... ),
&destroy<Type>
);
}
};
class Smart
{
friend class Object;
protected:
virtual ~Smart() {}
Smart( int ) {}
};
//Smart x; //! Nope.
struct Blah
{
//Smart x; //! Nope, sort of.
};
auto main()
-> int
{
//Smart x; //! Nope.
//new Blah; //!Can't delete.
//new Smart( 42 ); // Can't new.
Object::create_shared<Smart>( 42 ); // OK, std::shared_ptr
}
One problem with this approach is that the standard library's make_shared doesn't support a custom deleter, and shared_ptr doesn't use the common deleter of unique_ptr. One may hope that this will be rectified in a future standard.
Disclaimer: I just wrote this for this answer, it's not been extensively tested. But I did this in old times with C++98, then with a macro for the create-functionality. So the principle is known to be sound.
The problem with this idea is that in the smart pointer world, T* is just a shorthand for optional<T&>. It no longer implies memory management, and as a result it becomes a safe idiom to identify an optional non-owned parameter.

Accessing Members of Containing Objects from Contained Objects

If I have several levels of object containment (one object defines and instantiates another object which define and instantiate another object..), is it possible to get access to upper, containing - object variables and functions, please?
Example:
class CObjectOne
{
public:
CObjectOne::CObjectOne() { Create(); };
void Create();
std::vector<CObjectTwo> vObjectsTwo;
int nVariableOne;
}
bool CObjectOne::Create()
{
CObjectTwo ObjectTwo(this);
vObjectsTwo.push_back(ObjectTwo);
}
class CObjectTwo
{
public:
CObjectTwo::CObjectTwo(CObjectOne* pObject)
{
pObjectOne = pObject;
Create();
};
void Create();
CObjectOne* GetObjectOne(){return pObjectOne;};
std::vector<CObjectThree> vObjectsThree;
CObjectOne* pObjectOne;
int nVariableTwo;
}
bool CObjectTwo::Create()
{
CObjectThree ObjectThree(this);
vObjectsThree.push_back(ObjectThree);
}
class CObjectThree
{
public:
CObjectThree::CObjectThree(CObjectTwo* pObject)
{
pObjectTwo = pObject;
Create();
};
void Create();
CObjectTwo* GetObjectTwo(){return pObjectTwo;};
std::vector<CObjectFour> vObjectsFour;
CObjectTwo* pObjectTwo;
int nVariableThree;
}
bool CObjectThree::Create()
{
CObjectFour ObjectFour(this);
vObjectsFour.push_back(ObjectFour);
}
main()
{
CObjectOne myObject1;
}
Say, that from within CObjectThree I need to access nVariableOne in CObjectOne. I would like to do it as follows:
int nValue = vObjectsThree[index].GetObjectTwo()->GetObjectOne()->nVariableOne;
However, after compiling and running my application, I get Memory Access Violation error.
What is wrong with the code above(it is example, and might contain spelling mistakes)?
Do I have to create the objects dynamically instead of statically?
Is there any other way to access variables stored in containing objects from within contained objects?
When you pass a pointer that points back to the container object, this pointer is sometimes called a back pointer. I see this technique being used all the time in GUI libraries where a widget might want access to its parent widget.
That being said, you should ask yourself if there's a better design that doesn't involve circular dependencies (circular in the sense that the container depends on the containee and the containee depends on the container).
You don't strictly have to create the objects dynamically for the back pointer technique to work. You can always take the address of a stack-allocated (or statically-allocated) object. As long as the life of that object persists while others are using pointers to it. But in practice, this technique is usually used with dynamically-created objects.
Note that you might also be able to use a back-reference instead of a back-pointer.
I think I know what's causing your segmentation faults. When your vectors reallocate their memory (as the result of growing to a larger size), the addresses of the old vector elements become invalid. But the children (and grand-children) of these objects still hold the old addresses in their back-pointers!
For the back-pointer thing to work, you'll have to allocate each object dynamically and store their pointers in the vectors. This will make memory management a lot more messy, so you might want to use smart pointers or boost::ptr_containers.
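Here is a rough sketch of that arrangement using std::unique_ptr rather than a boost container. The Parent and Child classes are hypothetical stand-ins for the question's CObjectOne and CObjectTwo.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Parent;

class Child {
public:
    explicit Child(Parent* owner) : owner_(owner) {}
    Parent* owner() const { return owner_; }
private:
    Parent* owner_; // non-owning back-pointer
};

class Parent {
public:
    explicit Parent(int value) : value_(value) {}
    // Back-pointers to this object would dangle after a copy or move.
    Parent(const Parent&) = delete;
    Parent& operator=(const Parent&) = delete;

    // The child lives on the heap, so its address survives vector growth.
    Child& addChild() {
        children_.push_back(std::make_unique<Child>(this));
        return *children_.back();
    }
    int value() const { return value_; }
    Child& child(std::size_t i) { return *children_[i]; }
private:
    int value_;
    std::vector<std::unique_ptr<Child>> children_;
};
```

When the vector reallocates, only the unique_ptr handles move; the Child objects themselves stay where they are, so pointers to them remain valid.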
After seeing the comment you made in another answer, I now have a better idea of what you're trying to accomplish. You should research generic tree structures and the composite pattern. The composite pattern is usually what's used in the widget example I cited previously.
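Maybe all your objects can inherit from a common interface like:
class MyObject
{
public:
    virtual ~MyObject() {}
    virtual int getData() = 0;
};
After that you can build your structure as a tree of MyObject nodes. Note that the standard library has no general-purpose tree container (there is no std::tree), so you would write the node type yourself or use a third-party library.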
As Emile said, the segmentation fault can be caused by reallocation. But there is an earlier problem: the 'this' pointer of a local stack object is passed to the constructor of another object, and the local object is then copied into the vector container. When the 'Create()' function exits, the local object ceases to exist, and the back-pointers that refer to it become invalid.

Classes, constructor and pointer class members

I'm a bit confused about the object references. Please check the examples below:
class ListHandler {
public:
ListHandler(vector<int> &list);
private:
vector<int> list;
}
ListHandler::ListHandler(vector<int> &list) {
this->list = list;
}
Because of the internal
vector<int> list;
definition, here I would be wasting memory right? So the right one would be:
class ListHandler {
public:
ListHandler(vector<int>* list);
private:
vector<int>* list;
}
ListHandler::ListHandler(vector<int>* list) {
this->list = list;
}
ListHandler::~ListHandler() {
delete list;
}
Basically all I want is to create a vector and pass to ListHandler. This vector will not be used anywhere else than the ListHandler itself so I'm expecting ListHandler to do all the other things and cleanup etc. stuff.
It depends on whether you want to share the underlying vector or not. In general, I think it is a good practice to avoid sharing wherever possible, since it removes the question of object ownership. Without sharing:
class ListHandler
{
public:
ListHandler(const std::vector<int>& list) : _list(list) {}
~ListHandler(){}
private:
std::vector<int> _list;
};
Note that, unlike in your example, I make it const since the original will not be modified. If, however, we want to hang on to and share the same underlying object, then we could use something like this:
class ListHandler
{
public:
ListHandler(std::vector<int>& list) : _list(&list) {}
~ListHandler(){}
private:
std::vector<int>* _list;
};
Note that in this case, I choose to leave the caller as the owner of the object (so it is the caller's responsibility to ensure that the list is around for the lifetime of the list handler object and that the list is later deallocated). Your example, in which you take over the ownership, is also a possibility:
class ListHandler
{
public:
ListHandler(std::vector<int>* list) : _list(list) {}
ListHandler(const ListHandler& o) : _list(new std::vector<int>(*o._list)) {}
~ListHandler(){ delete _list; _list=0; }
ListHandler& swap(ListHandler& o){ std::swap(_list,o._list); return *this; }
ListHandler& operator=(const ListHandler& o){ ListHandler cpy(o); return swap(cpy); }
private:
std::vector<int>* _list;
};
While the above is certainly possible, I personally don't like it... I find it confusing for an object that isn't simply a smart pointer class to acquire ownership of a pointer to another object. If I were doing that, I would make it more explicit by wrapping the std::vector in a smart pointer container as in:
class ListHandler
{
public:
ListHandler(const boost::shared_ptr< std::vector<int> >& list) : _list(list) {}
~ListHandler(){}
private:
boost::shared_ptr< std::vector<int> > _list;
};
I find the above much clearer in communicating the ownership. However, all these different ways of passing along the list are acceptable... just make sure users know who will own what.
The first example isn't necessarily wasting memory, it's just making a copy of the entire vector at the "this->list = list;" line (which could be what you want, depending on the context). That is because vector's operator= is called at that point, which makes a full copy of the vector and all its contents.
The second example definitely isn't making a copy of the vector, merely assigning a memory address. However, the caller of the ListHandler constructor had better realize that ListHandler is taking over control of the pointer, since it will end up deallocating the memory in the end.
It depends on whether the caller expects to keep using their list (in which case you better not delete it, and need to worry about it changing when you least expect), and whether the caller is going to destroy it (in which case you better not keep a pointer to it).
If the documentation of your class is that the caller allocates a list with new and then turns ownership over to your class when calling your constructor, then keeping the pointer is fine (but use std::unique_ptr, or auto_ptr in pre-C++11 code, so you don't have to write "delete list" yourself and worry about exception safety).
It all depends what you want, and what policies you can ensure. There is nothing "wrong" with your first example (though I would avoid explicitly using this-> by choosing different names). It makes a copy of the vector, and that may be the right thing to do. It may be the safest thing to do.
But it looks like you would like to reuse the same vector. If the list is guaranteed to outlive any ListHandler, you can use a reference instead of a pointer. The trick is that the reference member variable must be initialized in an initialization list in the constructor, like so:
class ListHandler
{
public:
ListHandler(vector<int> &list) // non-const, so it can bind to the non-const reference member
: list_m(list)
{
}
private:
vector<int>& list_m;
};
The initialization list is the bit after the colon, but before the body.
However, this is not equivalent to your second example, which uses a pointer and calls delete in its destructor. That is a third way, in which the ListHandler assumes ownership of the list. But the code comes with dangers, because by calling delete it assumes the list was allocated with new. One way to clarify this policy is by using a naming convention (such as an "adopt" prefix) that identifies the change of ownership:
ListHandler::ListHandler(vector<int> *adoptList)
: list_m(adoptList)
{
}
(This is the same as yours, except for the name change, and the use of an initialization list.)
So now we have seen three choices:
Copy the list.
Keep a reference to a list that someone else owns.
Assume ownership of a list that someone created with new.
There are still more choices, such as smart pointers that do reference counting.
There's no single "right way." Your second example would be very poor style, however, because the ListHandler acquires ownership of the vector when it is constructed. Every new should be closely paired with its delete if at all possible — seriously, that is a very high priority.
If the vector lives as long as the ListHandler, it might as well live inside the ListHandler. It doesn't take up any less space if you put it on the heap. Indeed, the heap adds some overhead. So this is not a job for new at all.
You might also consider
ListHandler::ListHandler(vector<int> &list) {
this->list.swap( list );
}
if you want the caller's vector to be left empty and want to avoid the time and memory overhead of copying the vector's contents.
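With C++11 or later, a similar effect can be had by taking the vector by value and moving it into the member. The sketch below is one possible shape, not the only one; size() and data() are accessors added purely so the behaviour can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class ListHandler {
public:
    // By-value parameter: the caller chooses between copying (pass an
    // lvalue) and handing over the buffer (pass std::move(v) or a temporary).
    explicit ListHandler(std::vector<int> list) : list_(std::move(list)) {}

    std::size_t size() const { return list_.size(); }
    const int* data() const { return list_.data(); }
private:
    std::vector<int> list_;
};
```

Compared with the swap version, the constructor no longer modifies its argument behind the caller's back: giving up the vector requires an explicit std::move at the call site.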