'auto_ptr' and STL containers: writing an example of erroneous usage - c++

This question raised after reading this tutorial:
http://www.cprogramming.com/tutorial/auto_ptr.html
There you can find the following statement: A subtle consequence of this behavior is that auto_ ptrs don't work well in all scenarios. For instance, using auto _ptr objects with the standard template library can lead to problems as some functions in the STL may make copies of the objects in containers such as the vector container class. One example is the sort function, which makes copies of some of the objects in the container being sorted. As a consequence, this copy can blithely delete the data in the container!
Most of the papers concerning 'auto_ptr' tell us something like following:
"Never use 'auto_ptr' with STL containers! They often copy their elements while performing intrinsic operations. For example consider sort on std::vector".
So my goal is to write the code sample that illustrates this point or prove that such examples are only theoretically true and weird on practice.
P.S. #everybody_who_also_knows_that_auto_ptr_is_deprecated
I also know this. But don't you consider technical reasons (legacy code or old compiler) that may not allow new pointer containers usage? And moreover this question is about old and bad (if you'd like) auto_ptr.

I don't have MSVC right now, but judging from the error from g++, I guess this is the reason:
auto_ptr<T> only has a "copy constructor" which takes mutable references (§D.10.1.1[auto.ptr.cons]/2­–6):
auto_ptr(auto_ptr& a) throw();
template<class Y> auto_ptr(auto_ptr<Y>& a) throw();
But vector::push_back will accept a const reference (§23.3.6.1[vector.overview]/2).
void push_back(const T& x);
So it is impossible to construct an auto_ptr via push_back because no constructor takes a const reference.

From what you write, it seems that you already know everything that there is to know about containers of auto_ptrs and why they are unsafe.
Therefore, I assume that your interest in containers of auto_ptrs is purely teaching oriented. I understand your frustration in attempting to build a deliberate counter-example: in fact, most implementers of standard containers have put in place work-arounds to avoid accidentally triggering the broken semantics of auto_ptrs.
So, here's an example that I have written myself precisely for teaching:
class MyClass {
int a;
public:
MyClass (int i) : a(i) { }
int get() const { return a; }
};
int main() {
constexpr unsigned size = 10;
std::vector< std::auto_ptr<MyClass> > coap;
coap.resize(size);
for (unsigned u=0; u<size; u++)
coap[u] = std::auto_ptr<MyClass>( new MyClass( rand() % 50 ));
std::sort( coap.begin(), coap.end(),
[]( std::auto_ptr<MyClass> a,
std::auto_ptr<MyClass> b) { return a->get() < b->get(); });
}
Compiling it with g++ 4.9.2 will lead to an executable that will segfault nicely.
You can rewrite the example above even more concisely by using type deduction:
std::sort( coap.begin(), coap.end(),
[]( auto a, auto b) { return a->get() < b->get(); });
Note that the problem is not in the specific implementation of std::sort, which seems to be auto_ptr-safe. It is rather in the comparison lambda function I am passing to std::sort, that deliberately accepts its arguments by value, thus destroying the objects in the container every time a comparison is performed.
If you changed the lambda so that it receives its arguments by reference, as shown below, most STL implementations would actually behave correctly, even if you are doing something that is conceptually wrong.
std::sort( coap.begin(), coap.end(),
[]( const std::auto_ptr<MyClass> & a,
const std::auto_ptr<MyClass> & b) { return a->get() < b->get(); });
Good luck!

STEP 1
Lets' solve this problem in a straight way:
#include <iostream>
#include <vector>
#include <algorithm>
template<> struct std::less<std::auto_ptr<int>>: public std::binary_function<std::auto_ptr<int>, std::auto_ptr<int>, bool> {
bool operator()(const std::auto_ptr<int>& _Left, const std::auto_ptr<int>& _Right) const
{ // apply operator< to operands
return *_Left < *_Right;
}
};
int wmain() {
using namespace std;
auto_ptr<int> apai(new int(1)), apai2(new int(2)), apai3(new int(3));
vector<auto_ptr<int>> vec;
vec.push_back(apai3);
vec.push_back(apai);
vec.push_back(apai2);
for ( vector<auto_ptr<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << i->get() << L'\t';
vector<int> vec2;
vec2.push_back(3);
vec2.push_back(2);
vec2.push_back(5);
sort(vec2.begin(), vec2.end(), less<int>());
sort(vec.begin(), vec.end(), less<auto_ptr<int>>());
return 0;
}
On MSVCPP11 the error text is following:
_Error 1 error C2558: class 'std::auto_ptr<Ty>': no copy constructor available or copy constructor is declared 'explicit' c:\program files (x86)\microsoft visual studio 11.0\vc\include\xmemory0 608
The conclusion is: I even cannot compile such example. Why do they prevent me to do something that I cannot compile?? Their preventions are not always true.
STEP 2
We cannot use auto_ptr as vector element type directly due to auto_ptr design. But we can wrap `auto_ptr' in the way presented below.
#include <iostream>
#include <vector>
#include <algorithm>
#include <memory>
#include <functional>
template<typename T> class auto_ptr_my: public std::auto_ptr<T> {
public:
explicit auto_ptr_my(T *ptr = 0) {
this->reset(ptr);
}
auto_ptr_my<T> &operator=(const auto_ptr_my<T> &right) {
*(static_cast<std::auto_ptr<T> *>(this)) = *(static_cast<std::auto_ptr<T> *>(const_cast<auto_ptr_my *>(&right)));
return *this;
}
auto_ptr_my(const auto_ptr_my<T>& right) {
*this = right;
}
};
namespace std
{
template<> struct less<auto_ptr_my<int> >: public std::binary_function<auto_ptr_my<int>, auto_ptr_my<int>, bool> {
bool operator()(const auto_ptr_my<int>& _Left, const auto_ptr_my<int>& _Right) const
{ // apply operator< to operands
return *_Left < *_Right;
}
};
}
int wmain() {
using namespace std;
auto_ptr_my<int> apai(new int(1)), apai2(new int(2)), apai3(new int(3));
vector<auto_ptr_my<int>> vec;
vec.push_back(apai3);
vec.push_back(apai);
vec.push_back(apai2);
for ( vector<auto_ptr_my<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << **i << L'\t';
sort(vec.begin(), vec.end(), less<auto_ptr_my<int>>());
for ( vector<auto_ptr_my<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << **i << L'\t';
return 0;
}
This code works well showing that auto_ptr can be used with vector and sort with no memory leaks and crashes.
STEP 3
As KennyTM posted below:
add this code before return 0; statement:
std::vector<auto_ptr_my<int>> vec2 = vec;
for ( vector<auto_ptr_my<int>>::const_iterator i(vec2.cbegin()) ; i != vec2.cend() ; ++i )
wcout << **i << L'\t';
wcout << std::endl;
for ( vector<auto_ptr_my<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << **i << L'\t';
wcout << std::endl;
...and get memory leaks!
CONCLUSION
Sometimes we can use auto_ptr with containers without visible crash, sometimes not. Anyway it is bad practice.
But don't forget that auto_ptr is designed in such way that you cannot use it straight with STL containers and algorithms: against you have to write some wrapper code. At last using auto_ptr with STL containers is for your own risk. For example, some implementations of sort will not lead to the crash while processing vector elements, but other implementations will lead directly to the crash.
This question has academic purposes.
Thanks to KennyTM for providing STEP 3 crash example!

The conclusion is: I even cannot compile such example. Why do they prevent me to do something that I cannot compile??
IIRC, it is the other way around: the compiler vendor takes steps to prevent you from compiling something that you shouldn't be able to do. The way the standard is written, they could implement the library in a way that the code compiles, and then fails to work properly. They can also implement it this way, which is seen as superior because it's one of those few times where the compiler is actually allowed to prevent you from doing something stupid :)

The right answer is "never use auto_ptr at all" -- its deprecated and never became part of the standard at all, for precisely the reasons outlined here. Use std::unique_ptr instead.

Related

Do I need to synchronize reads on elements in std::sort called with std::execution::par?

If I have the following code that makes use of execution policies, do I need to synchronize all accesses to Foo::value even when I'm just reading the variable?
#include <algorithm>
#include <execution>
#include <vector>
struct Foo { int value; int getValue() const { return value; } };
int main() {
std::vector<Foo> foos;
//fill foos here...
std::sort(std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
return left.getValue() > right.getValue();
});
return 0;
}
My concern is that std::sort() will move (or copy) elements asynchronously which is effectively equivalent to asynchronously writing to Foo::value and, therefore, all read and write operations on that variable need to be synchronized. Is this correct or does the sort function itself take care of this for me?
What if I were to use std::execution::par_unseq?
If you follow the rules, i.e. you don't modify anything or rely on the identity of the objects being sorted inside your callback, then you're safe.
The parallel algorithm is responsible for synchronizing access to the objects it modifies.
See [algorithms.parallel.exec]/2:
If an object is modified by an element access function, the algorithm will perform no other unsynchronized accesses to that object. The modifying element access functions are those which are specified as modifying the object. [ Note: For example, swap(), ++, --, #=, and assignments modify the object. For the assignment and #= operators, only the left argument is modified. — end note ]
In case of std::execution::par_unseq, there's the additional requirement on the user-provided callback that it isn't allowed to call vectorization-unsafe functions, so you can't even lock anything in there.
This is OK. After all, you have told std::sort what you want of it and you would expect it to behave sensibly as a result, given that it is presented with all the relevant information up front. There's not a lot of point to the execution policy parameter at all, otherwise.
Where there might be an issue (although not in your code, as written) is if the comparison function has side effects. Suppose we innocently wrote this:
int numCompares;
std::sort(std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
++numCompares;
return left.getValue() > right.getValue();
});
Now we have introduced a race condition, since two threads of execution might be passing through that code at the same time and access to numCompares is not synchronised (or, as I would put it, serialised).
But, in my slightly contrived example, we don't need to be so naive, because we can simply say:
std::atomic_int numCompares;
and then the problem goes away (and this particular example would also work with what appears to me to be the spectacularly useless std::execution::par_unseq, because std_atomic_int is lockless on any sensible platform, thank you Rusty).
So, in summary, don't be too concerned about what std::sort does (although I would certainly knock up a quick test program and hammer it a bit to see if it does actually work as I am claiming). Instead, be concerned about what you do.
More here.
Edit And while Rusty was digging that up, I did in fact write that quick test program (had to fix your lambda) and, sure enough, it works fine. I can't find an online compiler that supports execution (MSVC seems to think it is experimental) so I can't offer you a live demo, but when run on the latest version of MSVC, this code:
#define _SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING
#include <algorithm>
#include <execution>
#include <vector>
#include <cstdlib>
#include <iostream>
constexpr int num_foos = 100000;
struct Foo
{
Foo (int value) : value (value) { }
int value;
int getValue() const { return value; }
};
int main()
{
std::vector<Foo> foos;
foos.reserve (num_foos);
// fill foos
for (int i = 0; i < num_foos; ++i)
foos.emplace_back (rand ());
std::sort (std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
return left.getValue() < right.getValue();
});
int last_foo = 0;
for (auto foo : foos)
{
if (foo.getValue () < last_foo)
{
std::cout << "NOT sorted\n";
break;
}
last_foo = foo.getValue ();
}
return 0;
}
Generates the following output every time I run it:
<nothing>
QED.

Initialize a Container of unique_ptr with iota

To learn about the intricacies of C++11 I am playing aroung with unique_ptr a bit.
I wonder, is there any way to use iota to initialize an Container of unique_ptr?
I started with the unique-ptr-less solution which works fine:
std::vector<int> nums(98); // 98 x 0
std::iota(begin(nums), end(alleZahlen), 3); // 3..100
Now lets do it as far as we can using unique_ptr
std::vector<std::unique_ptr<int>> nums(98); // 98 x nullptr
std::unique_ptr three{ new int{3} };
std::iota(begin(nums), end(nums), std::move{three});
This fails obviously. Reasons:
Although I marked three with move as a && this may not be sufficient to copy/move the initial value into the container.
++initValue will also not work, because initValue is of type unique_ptr<int>, and there is no operator++ defined. But: we could define a free function unique_ptr<int> operator++(const unique_ptr<int>&); and that would take care of that at least.
But to copy/move the results from that operation is again not permitted in unique_ptr and this time I can not see how I could trick the compiler into using move.
Well, that's where I stopped. And I wonder if I miss some interesting idea on how to tell the compiler that he may move the results of the operator++. Or are there other hindrances, too?
In order to end up with 98 instances of unique_ptr, there must be 98 calls to new. You attempt to get away with just one - that can't possibly fly.
If you are really intent on pounding a square peg into a round hole, you could do something like this:
#include <algorithm>
#include <iostream>
#include <memory>
#include <vector>
class MakeIntPtr {
public:
explicit MakeIntPtr(int v) : value_(v) {}
operator std::unique_ptr<int>() {
return std::unique_ptr<int>(new int(value_));
}
MakeIntPtr& operator++() { ++value_; return *this; }
private:
int value_;
};
int main() {
std::vector<std::unique_ptr<int>> nums(98);
std::iota(begin(nums), end(nums), MakeIntPtr(3));
std::cout << *nums[0] << ' ' << *nums[1] << ' ' << *nums[2];
return 0;
}
Maybe std::generate_n is a better algorithm for this?
std::vector<std::unique_ptr<int>> v;
{
v.reserve(98);
int n = 2;
std::generate_n(std::back_inserter(v), 98,
[&n]() { return std::make_unique<int>(++n); });
}

return a vector vs use a parameter for the vector to return it

With the code below, the question is:
If you use the "returnIntVector()" function, is the vector copied from the local to the "outer" (global) scope? In other words is it a more time and memory consuming variation compared to the "getIntVector()"-function? (However providing the same functionality.)
#include <iostream>
#include <vector>
using namespace std;
vector<int> returnIntVector()
{
vector<int> vecInts(10);
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
return vecInts;
}
void getIntVector(vector<int> &vecInts)
{
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
}
int main()
{
vector<int> vecInts = returnIntVector();
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
cout << vecInts[ui] << endl;
cout << endl;
vector<int> vecInts2(10);
getIntVector(vecInts2);
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
return 0;
}
In theory, yes it's copied. In reality, no, most modern compilers take advantage of return value optimization.
So you can write code that acts semantically correct. If you want a function that modifies or inspects a value, you take it in by reference. Your code does not do that, it creates a new value not dependent upon anything else, so return by value.
Use the first form: the one which returns vector. And a good compiler will most likely optimize it. The optimization is popularly known as Return value optimization, or RVO in short.
Others have already pointed out that with a decent (not great, merely decent) compiler, the two will normally end up producing identical code, so the two give equivalent performance.
I think it's worth mentioning one or two other points though. First, returning the object does officially copy the object; even if the compiler optimizes the code so that copy never takes place, it still won't (or at least shouldn't) work if the copy ctor for that class isn't accessible. std::vector certainly supports copying, but it's entirely possible to create a class that you'd be able to modify like in getIntVector, but not return like in returnIntVector.
Second, and substantially more importantly, I'd generally advise against using either of these. Instead of passing or returning a (reference to) a vector, you should normally work with an iterator (or two). In this case, you have a couple of perfectly reasonable choices -- you could use either a special iterator, or create a small algorithm. The iterator version would look something like this:
#ifndef GEN_SEQ_INCLUDED_
#define GEN_SEQ_INCLUDED_
#include <iterator>
template <class T>
class sequence : public std::iterator<std::forward_iterator_tag, T>
{
T val;
public:
sequence(T init) : val(init) {}
T operator *() { return val; }
sequence &operator++() { ++val; return *this; }
bool operator!=(sequence const &other) { return val != other.val; }
};
template <class T>
sequence<T> gen_seq(T const &val) {
return sequence<T>(val);
}
#endif
You'd use this something like this:
#include "gen_seq"
std::vector<int> vecInts(gen_seq(0), gen_seq(10));
Although it's open to argument that this (sort of) abuses the concept of iterators a bit, I still find it preferable on practical grounds -- it lets you create an initialized vector instead of creating an empty vector and then filling it later.
The algorithm alternative would look something like this:
template <class T, class OutIt>
class fill_seq_n(OutIt result, T num, T start = 0) {
for (T i = start; i != num-start; ++i) {
*result = i;
++result;
}
}
...and you'd use it something like this:
std::vector<int> vecInts;
fill_seq_n(std::back_inserter(vecInts), 10);
You can also use a function object with std::generate_n, but at least IMO, this generally ends up more trouble than it's worth.
As long as we're talking about things like that, I'd also replace this:
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
...with something like this:
std::copy(vecInts2.begin(), vecInts2.end(),
std::ostream_iterator<int>(std::cout, "\n"));
In C++03 days, getIntVector() is recommended for most cases. In case of returnIntVector(), it might create some unncessary temporaries.
But by using return value optimization and swaptimization, most of them can be avoided. In era of C++11, the latter can be meaningful due to the move semantics.
In theory, the returnIntVector function returns the vector by value, so a copy will be made and it will be more time-consuming than the function which just populates an existing vector. More memory will also be used to store the copy, but only temporarily; since vecInts is locally scoped it will be stack-allocated and will be freed as soon as the returnIntVector returns. However, as others have pointed out, a modern compiler will optimize away these inefficiencies.
returnIntVector is more time consuming because it returns a copy of the vector, unless the vector implementation is realized with a single pointer in which case the performance is the same.
in general you should not rely on the implementation and use getIntVector instead.

storing mem_fun in a standard container

Is there a way to create a vector< mem_fun_t< ReturnType, MyClass > > ?
The error i'm seeing is:
error C2512: 'std::mem_fun1_t<_Result,_Ty,_Arg>' : no appropriate default constructor available
I really can't see why it would not work, but it's actually a pretty ugly solution. Just take vector<function<ReturnType(MyClass*)>> and be without those issues present in C++03 binders.
You certainly can create such a vector.
#include <vector>
#include <functional>
#include <iostream>
struct MyClass
{
int a() { return 1; }
int b() { return 2; }
};
int main()
{
std::vector<std::mem_fun_t<int, MyClass> > vec;
vec.push_back(std::mem_fun(&MyClass::a));
vec.push_back(std::mem_fun(&MyClass::b));
MyClass x;
for (size_t i = 0; i != vec.size(); ++i) {
std::cout << vec[i](&x) << '\n';
}
}
If you are having problems, read the error message carefully. For example, std::mem_fun can return all sorts of wrappers, depending on what you pass to it.
Or indeed, switch to boost's or C++0x's function.
Edit: With this particular error message, I assume that you are doing something that invokes the default constructor for contained type (e.g resize or specifying the size with the vector's constructor). You can't use those functions.
mem_fun_t meets the requirements to be stored in a container (it is copy-constructible and assignable), so the answer is yes.
However, it isn't default-constructible or comparable, so there are some things you can't do with a container of them, including:
Resizing, unless you provide a value to fill with
Constructing with a non-zero size, unless you provide a value to fill with
Comparing containers
The error you are seeing comes from trying to either resize, or construct with a size.

Code compiles locally on g++ 4.4.1, but not on codepad.org (g++ 4.1.2)? (reference to reference problem)?)

I was writing a test case out to tackle a bigger problem in my application. I ended trying some code out on codepad and discovered that some code that compiled on my local machine (g++ 4.4.1, with -Wall) didn't compile on codepad (g++ 4.1.2), even though my local machine has a newer version of g++.
Codepad calls this a reference to reference error, which I looked up and found a litle information on. It looks like it's not a good idea to have a stl container of references. Does this mean I need to define my own PairPages class? And if this is the case, why did it compile locally in the first place? What's going on?
codepad link: http://codepad.org/UAaJI1rl
#include <deque>
#include <utility>
#include <iostream>
using namespace std;
class Page {
public:
Page() : number_(++count) {}
int getNum() const { return number_; }
private:
static int count;
int number_;
};
int Page::count = 0;
class Book {
public:
Book() : currPageIdx_(3) {
int numPages = 5;
while (numPages > 0) {
pages_.push_back(Page());
numPages--; // oops
}
}
pair<const Page&, const Page&> currPages() { return pagesAt(currPageIdx_); }
pair<const Page&, const Page&> pagesAt(int pageNo) { return make_pair(pages_[pageNo - 1], pages_[pageNo]); }
//const Page& currPages() { return pagesAt(currPageIdx_); }
//const Page& pagesAt(int pageNo);
private:
deque<Page> pages_;
int currPageIdx_;
};
int main() {
Book book;
cout << book.pagesAt(3).first.getNum() << endl;
cout << book.currPages().first.getNum() << endl;
}
A vector (or any STL container) of references is indeed a bad idea, as obvious when you simply look at requirements for element type T of any STL container (ISO C++03 23.1[lib.container.requirements]). It starts off by saying that "containers are objects that store other objects". We can stop right here, because a reference is not an object in C++ (unlike, say, a pointer; note that "object" in C++ parlance doesn't mean "instance of class"!). But, furthermore, it requires T to be Assignable, the requirements for which refer to type T& - if T is itself some reference type U&, then the constructed type would be U& &, which (reference to reference) is illegal in C++.
If you really want to have a container that doesn't manage lifetimes of objects, then you should use a container of pointers. If you prefer the safety of references (e.g. lack of pointer arithmetic and null value), you can use std::tr1::reference_wrapper<T> class, which is copy constructible and assignable wrapper for a reference.
On the poster's infinite loop problem: Your code loops indefinitely because in the Book constructor the counter numPages is never decreased, thus the while statement never halts until you run out of memory.
It's usually easier to spot these kind of errors by just using a for-loop:
Book() : currPageIdx_(3) {
for(int numPages = 0; numPages < 5; ++numPages) {
pages_.push_back(Page());
}
}