I'm not sure if I'm using the right terminology here, but say I have a function that returns a vector:
std::vector<int> func()
{
std::vector<int> vec(100,1);
return vec;
}
And when I call this function I want to allocate the vector on the heap. Can I do this?
I'm thinking something along the lines of this:
std::shared_ptr<std::vector<int>> vec(new std::vector<int>);
vec->swap(func());
Is there a way of doing this that is less convoluted, without changing func()?
Don't add std::move to the return statement. Returning a local variable by name is a special case in the language: the compiler already treats it as an rvalue, so you can leave the rest to it.
std::vector<int> func()
{
std::vector<int> vec(100,1);
return vec; // NOT: return std::move(vec);
}
Why?
Because the automatic object vec is about to be destroyed once return executes, it is treated as an rvalue in this case, so the compiler will move it (or elide the move entirely). Writing std::move explicitly prevents the compiler from applying NRVO.
Simply returning the vector is already optimized, so don't worry about the performance.
The only improvement I can think of is to not use std::move, since the compiler applies RVO automatically.
The second expression can be shortened a bit:
std::vector<int>* vec2 = new std::vector<int>( func() );
And just like the others say, allocating the vector on the heap isn't really necessary.
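If shared ownership really is what you want, one way to write the question's snippet without the swap is std::make_shared, which constructs the heap vector directly from the value that func() returns. This is only a sketch, assuming C++11 or later; make_heap_vector is a hypothetical helper name, not something from the question.

```cpp
#include <memory>
#include <vector>

// func() is the unchanged function from the question.
std::vector<int> func()
{
    std::vector<int> vec(100, 1);
    return vec;
}

// Hypothetical helper: builds a heap-allocated vector from func()'s result.
// The temporary returned by func() is moved into the new vector, not copied.
std::shared_ptr<std::vector<int>> make_heap_vector()
{
    return std::make_shared<std::vector<int>>(func());
}
```

At the call site this is just auto vec = make_heap_vector(); and no explicit delete is needed.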
I don't know what you're doing, or how much you expect to gain by moving things off the stack without any trade-offs, or onto the heap at the cost of more fragmentation in system memory. If you have millions of records to allocate on "your heap", the first thing that comes to my mind is that the chance of corrupting it through other operations in your code is high. Second, heap size is limited, so you may need to write your own allocator to deal with storing large data in a small amount of memory.
If you insist on an optimization strategy built around heap management, then returning a pointer to the vector seems promising:
std::vector<int*>* yourfunc()
{
    std::vector<int*>* pVec = new std::vector<int*>();
    // do something
    return pVec;
}
The vector is then created with new, but the caller is still required to delete it at the end.
I am reading a lot of different things on C++ optimization and I am getting quite mixed up. I would appreciate some help. Basically, I want to clear up what needs to be a pointer or not when I am passing vectors and structures as parameters or returning vectors and structures.
Say I have a Structure that contains 2 elements: an int and then a vector of integers. I will be creating this structure locally in a function, and then returning it. This function will be called multiple times and generate a new structure every time. I would like to keep the last structure created in a class member (lastStruct_ for example). So before returning the struct I could update lastStruct_ in some way.
Now, what would be the best way to do this, knowing that the vector in the structure can be quite large (so copies should be avoided)? Does the vector in the struct need to be a pointer? If I want to share lastStruct_ with other classes by creating a get_lastStruct() method, should I return a reference to lastStruct_, a pointer, or not care about that? Should lastStruct_ be a shared pointer?
This is quite confusing to me because apparently C++ knows how to avoid copying, but I also see a lot of people recommending the use of pointers while others say a pointer to a vector makes no sense at all.
struct MyStruct {
    std::vector<int> pixels;
    int foo;
};
class MyClass {
    MyStruct lastStruct_;
public:
    MyStruct create_struct();
    MyStruct getLastStruct();
};
MyStruct MyClass::create_struct()
{
    MyStruct s = {std::vector<int>(100, 1), 1234};
    lastStruct_ = s;
    return s;
}
MyStruct MyClass::getLastStruct()
{
    return lastStruct_;
}
If the only copy you're trying to remove is the one that happens when you return the struct from your factory function, I'd say containing the vector directly will be faster all the time.
Why? Two things. First, return value optimisation (RVO/NRVO) removes any need for temporaries when returning. This is enough for almost all cases.
Second, when return value optimisation doesn't apply, move semantics will. Returning a named variable (e.g. return my_struct;) performs an implicit move when NRVO doesn't apply.
So why is it always faster than a shared pointer? Because when copying the shared pointer, you must dereference the control block to increase the owner count. And since that is an atomic operation, the increment is not free.
Also, using a shared pointer brings shared ownership and non-locality. If you were to use a shared pointer, use a pointer to const data to bring back value semantics.
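To see this for yourself, here is a minimal sketch with a hypothetical Tracked type (not part of the question) that counts copies and moves. Whether the move is elided depends on the compiler, but the copy constructor should not be selected when a named local is returned this way.

```cpp
#include <utility>
#include <vector>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::vector<int> pixels;

    Tracked() : pixels(100, 1) {}
    Tracked(const Tracked& other) : pixels(other.pixels) { ++copies; }
    Tracked(Tracked&& other) noexcept : pixels(std::move(other.pixels)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a named local: NRVO may elide the transfer entirely,
// and if it does not, the move constructor is used instead of a copy.
Tracked make_tracked()
{
    Tracked t;
    return t;
}
```

After Tracked r = make_tracked();, Tracked::copies stays at 0, while Tracked::moves is 0 or 1 depending on whether NRVO kicked in.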
Now that you added the code, it's much clearer what you're trying to do.
There's no way around the copy here. If you measure a performance degradation, then holding a std::shared_ptr<const std::vector<int>> might be the solution, since you keep value semantics but avoid copying the vector.
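A rough sketch of that variant, assuming C++11 and reusing the names from the question; returning a const reference from getLastStruct() is my own choice here, not something the question requires:

```cpp
#include <memory>
#include <vector>

struct MyStruct {
    std::shared_ptr<const std::vector<int>> pixels; // shared, read-only pixel data
    int foo;
};

class MyClass {
    MyStruct lastStruct_{};
public:
    MyStruct create_struct()
    {
        // The shared_ptr<vector<int>> converts implicitly to shared_ptr<const vector<int>>.
        MyStruct s{std::make_shared<std::vector<int>>(100, 1), 1234};
        lastStruct_ = s; // copies the pointer and bumps the count, not the 100 ints
        return s;
    }

    const MyStruct& getLastStruct() const { return lastStruct_; }
};
```

Both the returned struct and lastStruct_ then refer to the same vector, and because it is const neither can modify it behind the other's back.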
I just read this post on SO, that discusses where in memory, STL vectors are stored. According to the accepted answer,
vector<int> temp;
stores the header info of the vector on the stack but the contents on the heap.
In that case, would the following code be erroneous?
vector<int> some_function() {
vector<int> some_vector;
some_vector.push_back(10);
some_vector.push_back(20);
return some_vector;
}
Should I have used vector<int> *some_vector = new vector<int> instead? Would the above code result in some kind of memory allocation issue? Would this change if I used an instance of a custom class instead of int?
Your code is precisely fine.
Vectors manage all the memory they allocate for you.
It doesn't matter whether they store all their internal data using dynamic allocations, or hold some metadata as direct members (with automatic storage duration). Any dynamic allocations performed internally will be safely cleaned-up in the vector's destructor, copy constructor, and other similar special functions.
You do not need to do anything as all of that is abstracted away from your code. Your code has no visibility into that mechanism, and dynamically allocating the vector itself will not have any effect on it.
That is the purpose of them!
If you decide to allocate the vector dynamically, you will have a really hard time destroying it correctly, even in very simple cases (do not forget about exceptions!). Avoid dynamic allocation whenever possible.
In other words, your code is perfectly correct. I would not worry about copying the returned vector in memory. In these simple cases compilers (in release builds) should use return value optimization / RVO (http://en.wikipedia.org/wiki/Return_value_optimization) and construct some_vector directly in the memory of the returned object. In C++11 you can also rely on move semantics.
But if you really do not trust the compiler using RVO, you can always pass a reference to a vector and fill it in inside the function.
//function definition
void some_function(vector<int> &v) { v.push_back(10); v.push_back(20); }
//function usage
vector<int> vec;
some_function(vec);
And back to dynamic allocation, if you really need to use it, try the pattern called RAII. Or use smart pointers.
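As an illustration of the smart-pointer route, here is a minimal sketch assuming C++11; make_some_vector is a hypothetical name. The std::unique_ptr owns the heap-allocated vector and deletes it when it goes out of scope, including when an exception propagates.

```cpp
#include <memory>
#include <vector>

// Hypothetical factory: returns a heap-allocated vector owned by a unique_ptr,
// so no caller ever has to write delete.
std::unique_ptr<std::vector<int>> make_some_vector()
{
    std::unique_ptr<std::vector<int>> v(new std::vector<int>());
    v->push_back(10);
    v->push_back(20);
    return v; // ownership moves to the caller
}
```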
It is not important where vectors store their data internally, because you return the vector by value (as a copy). :) It is the same as if you returned an integer:
int some_function()
{
int x = 10;
return x;
}
Is this correct?:
std::vector<Enemy*> enemies;
enemies.push_back(new Enemy());
Enemy* enemy = enemies[0];
enemies.erase(enemies.begin() + 0);
delete enemy;
It works, yes, but it's not an ideal approach.
Firstly, adding 0 is just noise; you can remove that. (std::vector has no pop_front(), so erase(begin()) is how you remove the first element.) Also, there's no need for the intermediate step; you can delete before removing.
But std::vector isn't good at popping from the front, especially if it's large (because the remaining elements need to be shifted to fill the void). If you don't need contiguous memory, use a std::deque instead. Or, if order doesn't matter, you can use something like this:
template <typename T, typename A>
void unordered_pop_front(std::vector<T, A>& vec)
{
using std::swap;
swap(vec.front(), vec.back()); // constant time
vec.pop_back(); // constant time
}
It swaps the front element with the back element, then pops it off. Of course, order is not retained.
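To make the reordering concrete, here is a small sketch that repeats the function and applies it to {1, 2, 3, 4}; demo_unordered_pop_front is a hypothetical helper added only for illustration.

```cpp
#include <utility>
#include <vector>

template <typename T, typename A>
void unordered_pop_front(std::vector<T, A>& vec)
{
    using std::swap;
    swap(vec.front(), vec.back()); // constant time
    vec.pop_back();                // constant time
}

// Hypothetical demo helper: removes the first element of {1, 2, 3, 4}.
std::vector<int> demo_unordered_pop_front()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    unordered_pop_front(v); // 1 is gone; the old back element now sits at the front
    return v;
}
```

The result is {4, 2, 3}: the former last element takes the place of the removed one.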
The other problem is with your approach to memory management. Anytime you have explicit clean up code, you've done something wrong. It should be done automatically.
Use either Boost's ptr_vector, or a std::vector of smart pointers. (Note: do not use std::auto_ptr in a container, it's broken in this regard.) For a quick smart pointer suggestion, use either std::unique_ptr (if your compiler supports C++0x), or std::/boost::shared_ptr.
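For example, with std::unique_ptr the original snippet might look like the sketch below. The Enemy struct here is a hypothetical stand-in with an instance counter, added only so the effect can be observed; spawn_and_remove is likewise an invented name.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's Enemy class; counts live instances.
struct Enemy {
    static int alive;
    Enemy() { ++alive; }
    ~Enemy() { --alive; }
};
int Enemy::alive = 0;

// Creates one enemy, removes it again, and reports how many are still alive.
int spawn_and_remove()
{
    std::vector<std::unique_ptr<Enemy>> enemies;
    enemies.push_back(std::unique_ptr<Enemy>(new Enemy()));
    enemies.erase(enemies.begin()); // destroys the Enemy; no explicit delete
    return Enemy::alive;
}
```

Erasing the element (or letting the vector go out of scope) destroys the Enemy, so there is no clean-up code left to forget.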
std::vector<Enemy*> enemies;
enemies.push_back(new Enemy());
This isn't exception-safe. If push_back fails to allocate enough memory to accommodate the new pointer, then the Enemy object is leaked.
Using a vector of smart pointers can solve this, but failing that you should reserve the space in the vector before pushing back:
std::vector<Enemy*> enemies;
enemies.reserve(1); // or more generally, enemies.reserve(enemies.size()+1);
enemies.push_back(new Enemy());
Now we know that push_back can't fail to allocate memory, and if reserve fails then the exception is thrown before the Enemy is created.
Certainly looks fine to me. You don't need the + 0 in the enemies.erase line, but aside from that, it's perfectly OK.
Yes, that's fine. You can simplify it a little:
delete enemies[0];
enemies.clear();
For removing the element, you could also use:
enemies.pop_back();
or (very similar to yours):
enemies.erase(enemies.begin());
I'm trying to learn C++, and trying to understand returning objects. I seem to see 2 ways of doing this, and need to understand what is the best practice.
Option 1:
QList<Weight *> ret;
Weight *weight = new Weight(cname, "Weight");
ret.append(weight);
ret.append(c);
return &ret;
Option 2:
QList<Weight *> *ret = new QList();
Weight *weight = new Weight(cname, "Weight");
ret->append(weight);
ret->append(c);
return ret;
(of course, I may not understand this yet either).
Which way is considered best-practice, and should be followed?
Option 1 is defective. When you declare an object
QList<Weight *> ret;
it only lives in the local scope. It is destroyed when the function exits. However, you can make this work with
return ret; // no "&"
Now, although ret is destroyed, a copy is made first and passed back to the caller.
This is the generally preferred methodology. In fact, the copy-and-destroy operation (which accomplishes nothing, really) is usually elided, or optimized out and you get a fast, elegant program.
Option 2 works, but then you have a pointer to the heap. One way of looking at C++ is that the purpose of the language is to avoid manual memory management such as that. Sometimes you do want to manage objects on the heap, but option 1 still allows that:
QList<Weight *> *myList = new QList<Weight *>( getWeights() );
where getWeights is your example function. (In this case, you may have to define a copy constructor QList::QList( QList const & ), but like the previous example, it will probably not get called.)
Likewise, you probably should avoid having a list of pointers. The list should store the objects directly. Try using std::list… practice with the language features is more important than practice implementing data structures.
Use the option #1 with a slight change; instead of returning a reference to the locally created object, return its copy.
i.e. return ret;
Most C++ compilers perform Return value optimization (RVO) to optimize away the temporary object created to hold a function's return value.
In general, you should never return a reference or a pointer. Instead, return a copy of the object or return a smart pointer class which owns the object. In general, use static storage allocation unless the size varies at runtime or the lifetime of the object requires that it be allocated using dynamic storage allocation.
As has been pointed out, your example of returning by reference returns a reference to an object that no longer exists (since it has gone out of scope) and hence invokes undefined behavior. This is the reason you should never return a reference. You should never return a raw pointer, because ownership is unclear.
It should also be noted that returning by value is incredibly cheap due to return-value optimization (RVO), and will soon be even cheaper due to the introduction of rvalue references.
Passing and returning references invites responsibility! You need to take care that modifying values has no side effects. The same goes for pointers. I recommend that you return objects (but it very much depends on what exactly you want to do).
In your Option 1, you return the address, and that's very bad because it leads to undefined behaviour: ret will be deallocated, but you'll access ret's address in the calling function.
So use return ret;
It's generally bad practice to allocate memory that has to be freed elsewhere. That's one of the reasons we have C++ rather than just C. (But savvy programmers were writing object-oriented code in C long before the Age of Stroustrup.) Well-constructed objects have quick copy and assignment operators (sometimes using reference-counting), and they automatically free up the memory that they "own" when they are freed and their DTOR automatically is called. So you can toss them around cheerfully, rather than using pointers to them.
Therefore, depending on what you want to do, the best practice is very likely "none of the above." Whenever you are tempted to use "new" anywhere other than in a CTOR, think about it. Probably you don't want to use "new" at all. If you do, the resulting pointer should probably be wrapped in some kind of smart pointer. You can go for weeks and months without ever calling "new", because the "new" and "delete" are taken care of in standard classes or class templates like std::list and std::vector.
One exception is when you are using an old-fashioned library like OpenCV that sometimes requires that you create a new object and hand off a pointer to it to the system, which takes ownership.
If QList and Weight are properly written to clean up after themselves in their DTORS, what you want is,
QList<Weight> ret; // not "ret()", which would declare a function
Weight weight(cname, "Weight");
ret.append(weight);
ret.append(c);
return ret;
As already mentioned, it's better to avoid allocating memory which must be deallocated elsewhere. This is what I prefer doing (...these days):
void someFunc(QList<Weight *>& list){
// ... other code
Weight *weight = new Weight(cname, "Weight");
list.append(weight);
list.append(c);
}
// ... later ...
QList<Weight *> list;
someFunc(list);
Even better -- avoid new completely and use std::vector:
void someFunc(std::vector<Weight>& list){
// ... other code
Weight weight(cname, "Weight");
list.push_back(weight);
list.push_back(c);
}
// ... later ...
std::vector<Weight> list;
someFunc(list);
You can always use a bool or enum if you want to return a status flag.
Based on experience, do not use plain pointers because you can easily forget to add proper destruction mechanisms.
If you want to avoid copying, you can go for implementing the Weight class with copy constructor and copy operator disabled:
class Weight {
protected:
std::string name;
std::string desc;
public:
Weight (std::string n, std::string d)
: name(n), desc(d) {
std::cout << "W c-tor\n";
}
~Weight (void) {
std::cout << "W d-tor\n";
}
// disable them to prevent copying
// and generate an error when compiling
Weight(const Weight&) = delete;
Weight& operator=(const Weight&) = delete;
};
Then, for the class implementing the container, use shared_ptr or unique_ptr to implement the data member:
template <typename T>
class QList {
protected:
std::vector<std::shared_ptr<T>> v;
public:
QList (void) {
std::cout << "Q c-tor\n";
}
~QList (void) {
std::cout << "Q d-tor\n";
}
// disable them to prevent copying
QList(const QList&);
void operator=(const QList&);
void append(T& t) {
v.push_back(std::shared_ptr<T>(&t));
}
};
Your function for adding an element would make use of Return Value Optimization and would not call the copy constructor (which is not defined):
QList<Weight> create (void) {
QList<Weight> ret;
Weight& weight = *(new Weight("cname", "Weight"));
ret.append(weight);
return ret;
}
When adding an element, let the container take ownership of the object, so do not deallocate it yourself:
QList<Weight> ql = create();
ql.append(*(new Weight("aname", "Height")));
// this generates segmentation fault because
// the object would be deallocated twice
Weight w("aname", "Height");
ql.append(w);
Or, better, force the user to pass your QList implementation only smart pointers:
void append(std::shared_ptr<T> t) {
v.push_back(t);
}
And outside class QList you'll use it like:
Weight * pw = new Weight("aname", "Height");
ql.append(std::shared_ptr<Weight>(pw));
Using shared_ptr you could also 'take' objects from the collection, make copies, or remove them from the collection while still using them locally; behind the scenes it would always be one and the same object.
All of these are valid answers: avoid pointers, use copy constructors, etc. That is, unless you need to create a program that needs good performance; in my experience most performance-related problems come from copy constructors and the overhead they cause. (And smart pointers are not any better in this area; I had to remove all my Boost code and do the deletes manually because it was taking too many milliseconds to do its job.)
If you're creating a "simple" program (although "simple" means you should go with Java or C#), then use copy constructors, avoid pointers, and use smart pointers to deallocate the used memory. If you're creating a complex program or you need good performance, use pointers all over the place and avoid copy constructors (if possible); just create your own set of rules for deleting pointers and use valgrind to detect memory leaks.
Maybe I will get some negative points, but I think you'll need to get the full picture to take your design choices.
I think that saying "if you're returning pointers your design is wrong" is a little misleading. Output parameters tend to be confusing because they're not a natural choice for "returning" results.
I know this question is old, but I don't see any other answer pointing out the performance overhead of those design choices.
Is there a way to transfer ownership of the data contained in a std::vector (pointed to by, say T*data) into another construct, preventing having "data" become a dangling pointer after the vector goes out of scope?
EDIT: I DON'T WANT TO COPY THE DATA (which would be an easy but ineffective solution).
Specifically, I'd like to have something like:
template<typename T>
T* transfer_ownership(vector<T>&v){
T*data=&v[0];
v.clear();
...//<--I'd like to make v's capacity 0 without freeing data
}
int main(){
double*data=NULL;
{
vector<double>v;
...//grow v dynamically
data=transfer_ownership<double>(v);
}
...//do something useful with data (user responsible for freeing it later)
// for example mxSetData(mxArray*A,double*data) from matlab's C interface
}
The only thing that comes to my mind to emulate this is:
{
vector<double>*v=new vector<double>();
//grow *v...
data=&(*v)[0];
}
and then data will later either be freed or (in my case) used as mxSetData(mxArray*A,double*data). However, this results in a small memory leak (the data structure handling v's capacity, size, etc., but not the data itself of course).
Is it possible without leaking ?
A simple workaround would be swapping the vector with one you own:
vector<double> myown;
vector<double> someoneelses = foo();
std::swap( myown, someoneelses );
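A quick way to check that the swap hands over the buffer rather than copying it is to compare the data pointer before and after. This is a sketch assuming C++11 (for data()), with a hypothetical stand-in for foo(). Note that it only moves the buffer into another vector, which still owns it; it does not release a raw pointer the way the question asks.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the foo() used above.
std::vector<double> foo()
{
    return std::vector<double>(1000, 1.5);
}

// Returns true if myown ends up holding the very same buffer
// that someoneelses held before the swap.
bool swap_keeps_buffer()
{
    std::vector<double> myown;
    std::vector<double> someoneelses = foo();
    const double* before = someoneelses.data();
    std::swap(myown, someoneelses);
    return myown.data() == before
        && myown.size() == 1000
        && someoneelses.empty();
}
```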
A tougher but maybe better approach is to write your own allocator for the vector, and let it allocate out of a pool you maintain. No personal experience, but it's not too complicated.
The point of using a std::vector is not to have to worry about the data in it:
Keep your vector all along your application;
Pass it by const-ref to other functions (to avoid unnecessary copies);
And feed functions expecting a pointer-to-T with &v[0].
If you really don't want to keep your vector, you will have to copy your data -- you can't transfer ownership because std::vector guarantees it will destroy its content when going out-of-scope. In that case, use the std::copy() algorithm.
If your vector contains values, you can only copy them (which happens when you call std::copy, std::swap, etc.). If you keep non-primitive objects in a vector and don't want to copy them (and want to use them in another data structure), consider storing pointers.
Does something like this work for you?
int main()
{
double *data = 0;
{
vector<double> foo;
// insert some elements to foo
data = new double[foo.size()];
std::copy(foo.begin(), foo.end(), &data[0]);
}
// Pass data to Matlab function.
delete [] data;
return 0;
}
Since you don't want to copy data between containers, but want to transfer ownership of data between containers, I suggest using a container of smart pointers as follows.
void f()
{
std::vector<boost::shared_ptr<double> > doubles;
InitVector(doubles);
std::vector<boost::shared_ptr<double> > newDoubles(doubles);
}
You really can't transfer ownership of data between standard containers without making a copy of it (at least before C++11 move semantics), since standard containers copy the data they encapsulate. If you want to minimize the overhead of copying expensive objects, then it is a good idea to use a reference-counted smart pointer to wrap your expensive data structure. boost::shared_ptr is suitable for this task since it is fairly cheap to make a copy of it.