Why can't a std::vector hold constant objects? - c++

I am not sure why the following code is not allowed to exist:
int main()
{
std::vector<const int> v;
v.reserve(2);
v.emplace_back(100);
v.emplace_back(200);
}
In theory, reserve() does not construct anything, as opposite to resize(). Also, emplace_back() is constructing "in place" objects, so none of this code is writing over an already constructed constant object.
Despite that, even writing the first line, std::vector<const int> v;, is already resulting in compilation error. Why is not allowed at all to have a std::vector of constants?

While the "why" is best read here there are some easy ways to get what you want:
template<class T>
class as_const {
T t;
public:
as_const(T& t_): t(t_) {}
as_const(const as_const&) = default;
as_const(as_const &&) = delete;
as_const& operator=(const as_const&) = default;
as_const& operator=(as_const&&) = delete;
operator const T&() const {
return t;
}
const T& operator*() const {
// Or just a get method, anyway it's nicer to also have an explicit getter
return t;
}
};
std::vector<as_const<int>> vec;
vec.reserve(2)
vec.emplace_back(100);
vec.emplace_back(200);
You even can decide "how constant" your wrapper should be and provide the move constructors (note that t is not constant per-se in as_const) if you think that is reasonable for your use-case.
Note that you cannot prohibit the reallocation of your vector in this way. If you want to do that (and don't want to use a compile-time size array since you now your size only at runtime) take a look at std::unique_ptr<T[]>. But note that in this case you first have to create a mutable variant and then reseat it in a const variant, since the underlying array gets default initialized and you cannot change anything after that.
Of course there is also the possibility of working with an allocator and disallowing a reallocation. But that has no stl-implementation. There are some implementations for this type of behavior out there (I myself gave it a try once but that is a bit of a mess) but I do not know if there is anything in boost.

As the link provided by Bathsheba tells you, the problem is that std::vector<T> is really std::vector<T, std::allocator<T>>. Up to C++20, std::allocator<T>::address() had two overloads, one which takes T& and one which takes T const&. The problem with T=const int should be obvious. C++ has no such thing as extra-const, so both overloads are equally good.

Related

Why is value taking setter member functions not recommended in Herb Sutter's CppCon 2014 talk (Back to Basics: Modern C++ Style)?

In Herb Sutter's CppCon 2014 talk Back to Basics: Modern C++ Style he refers on slide 28 (a web copy of the slides are here) to this pattern:
class employee {
std::string name_;
public:
void set_name(std::string name) noexcept { name_ = std::move(name); }
};
He says that this is problematic because when calling set_name() with a temporary, noexcept-ness isn't strong (he uses the phrase "noexcept-ish").
Now, I have been using the above pattern pretty heavily in my own recent C++ code mainly because it saves me typing two copies of set_name() every time - yes, I know that can be a bit inefficient by forcing a copy construction every time, but hey I am a lazy typer. However Herb's phrase "This noexcept is problematic" worries me as I don't get the problem here: std::string's move assignment operator is noexcept, as is its destructor, so set_name() above seems to me to be guaranteed noexcept. I do see a potential exception throw by the compiler before set_name() as it prepares the parameter, but I am struggling to see that as problematic.
Later on on slide 32 Herb clearly states the above is an anti-pattern. Can someone explain to me why lest I have been writing bad code by being lazy?
Others have covered the noexcept reasoning above.
Herb spent much more time in the talk on the efficiency aspects. The problem isn't with allocations, its with unnecessary deallocations. When you copy one std::string into another the copy routine will reuse the allocated storage of the destination string if there's enough space to hold the data being copied. When doing a move assignment the destination string's existing storage must be deallocated as it takes over the storage from the source string. The "copy and move" idiom forces the deallocation to always occur, even when you don't pass a temporary. This is the source of the horrible performance that is demonstrated later in the talk. His advice was to instead take a const ref and if you determine that you need it have an overload for r-value references. This will give you the best of both worlds: copy into existing storage for non-temporaries avoiding the deallocation and move for temporaries where you're going to pay for a deallocation one way or the other (either the destination deallocates prior to the move or the source deallocates after the copy).
The above doesn't apply to constructors since there's no storage in the member variable to deallocate. This is nice since constructors often take more than one argument and if you need to do const ref/r-value ref overloads for each argument you end up with a combinatorial explosion of constructor overloads.
The question now becomes: how many classes are there that reuse storage like std::string when copying? I'm guessing that std::vector does, but outside of that I'm not sure. I do know that I've never written a class that reuses storage like this, but I have written a lot of classes that contain strings and vectors. Following Herb's advice won't hurt you for classes that don't reuse storage, you'll be copying at first with the copying version of the sink function and if you determine that the copying is too much of a performance hit you'll then make an r-value reference overload to avoid the copy (just as you would for std::string). On the other hand, using "copy-and-move" does have a demonstrated performance hit for std::string and other types that reuse storage, and those types probably see a lot of use in most peoples code. I'm following Herb's advice for now, but need to think through some of this a bit more before I consider the issue totally settled (there's probably a blog post that I don't have time to write lurking in all this).
There were two reasons considered for why passing by value might be better than passing by const reference.
more efficient
noexcept
In the case of setters for members of type std::string, he debunked the claim that passing by value was more efficient by showing that passing by const reference typically produced fewer allocations (at least for std::string).
He also debunked the claim that it allows the setter be noexcept by showing that the noexcept declaration is misleading, since an exception can still occur in the process of copying the parameter.
He thus concluded that passing by const reference was to be preferred over passing by value, at least in this case. However, he did mention that passing by value was a potentially good approach for constructors.
I do think that the example for std::string alone is not sufficient to generalize to all types, but it does call into question the practice of passing expensive-to-copy but cheap-to-move parameters by value, at least for efficiency and exception reasons.
Herb has a point, in that taking by-value when you already have storage allocated within can be inefficient and cause a needless allocation. But taking by const& is almost as bad, as if you take a raw C string and pass it to the function, a needless allocation occurs.
What you should take is the abstraction of reading from a string, not a string itself, because that is what you need.
Now, you can do this as a template:
class employee {
std::string name_;
public:
template<class T>
void set_name(T&& name) noexcept { name_ = std::forward<T>(name); }
};
which is reasonably efficient. Then add some SFINAE maybe:
class employee {
std::string name_;
public:
template<class T>
std::enable_if_t<std::is_convertible<T,std::string>::value>
set_name(T&& name) noexcept { name_ = std::forward<T>(name); }
};
so we get errors at the interface and not the implementation.
This isn't always practical, as it requires exposing the implementation publically.
This is where a string_view type class can come in:
template<class C>
struct string_view {
// could be private:
C const* b=nullptr;
C const* e=nullptr;
// key component:
C const* begin() const { return b; }
C const* end() const { return e; }
// extra bonus utility:
C const& front() const { return *b; }
C const& back() const { return *std::prev(e); }
std::size_t size() const { return e-b; }
bool empty() const { return b==e; }
C const& operator[](std::size_t i){return b[i];}
// these just work:
string_view() = default;
string_view(string_view const&)=default;
string_view&operator=(string_view const&)=default;
// myriad of constructors:
string_view(C const* s, C const* f):b(s),e(f) {}
// known continuous memory containers:
template<std::size_t N>
string_view(const C(&arr)[N]):string_view(arr, arr+N){}
template<std::size_t N>
string_view(std::array<C, N> const& arr):string_view(arr.data(), arr.data()+N){}
template<std::size_t N>
string_view(std::array<C const, N> const& arr):string_view(arr.data(), arr.data()+N){}
template<class... Ts>
string_view(std::basic_string<C, Ts...> const& str):string_view(str.data(), str.data()+str.size()){}
template<class... Ts>
string_view(std::vector<C, Ts...> const& vec):string_view(vec.data(), vec.data()+vec.size()){}
string_view(C const* str):string_view(str, str+len(str)) {}
private:
// helper method:
static std::size_t len(C const* str) {
std::size_t r = 0;
if (!str) return r;
while (*str++) {
++r;
}
return r;
}
};
such an object can be constructed directly from a std::string or a "raw C string" and nearly costlessly store what you need to know in order to produce a new std::string from it.
class employee {
std::string name_;
public:
void set_name(string_view<char> name) noexcept { name_.assign(name.begin(),name.end()); }
};
and as now our set_name has a fixed interface (not a perfect forward one), it can have its implementation not visible.
The only inefficiency is that if you pass in a C-style string pointer, you somewhat needlessly go over its size twice (first time looking for the '\0', second time copying them). On the other hand, this gives your target information about how big it has to be, so it can pre-allocate rather than re-allocate.
You have two ways of calling that methods.
With rvalue parameter, as long as the move constructor of the parameter type is noexcept there is no problem (in case of std::string most probably is noexcept), in any case would be better use a conditional noexcept (to be sure the parameters is noexcept)
With lvalue parameter, in this case the copy constructor of the parameter type would be called and almost sure it will need some allocation (that could throw).
In cases like this that the use could be miss used, it's better to avoid. The client of the class assume that no exception is throw as specified, but in valid, compilable, not suspicious C++11 could throw.

assignment of class with const member

Consider the following code:
struct s
{
const int id;
s(int _id):
id(_id)
{}
};
// ...
vector<s> v; v.push_back(s(1));
I get a compiler error that 'const int id' cannot use default assignment operator.
Q1. Why does push_back() need an assignment operator?
A1. Because the current c++ standard says so.
Q2. What should I do?
I don't want to give up the const specifier
I want the data to be copied
A2. I will use smart pointers.
Q3. I came up with a "solution", which seems rather insane:
s& operator =(const s& m)
{
if(this == &m) return *this;
this->~s();
return *new(this) s(m);
}
Should I avoid this, and why (if so)? Is it safe to use placement new if the object is on the stack?
C++03 requires that elements stored in containers be CopyConstructible and Assignable (see §23.1). So implementations can decide to use copy construction and assignment as they see fit. These constraints are relaxed in C++11. Explicitly, the push_back operation requirement is that the type be CopyInsertable into the vector (see §23.2.3 Sequence Containers)
Furthermore, C++11 containers can use move semantics in insertion operations and do on.
I don't want to give up the const specifier
Well, you have no choice.
s& operator =(const s& m) {
return *new(this) s(m);
}
Undefined behaviour.
There's a reason why pretty much nobody uses const member variables, and it's because of this. There's nothing you can do about it. const member variables simply cannot be used in types you want to be assignable. Those types are immutable, and that's it, and your implementation of vector requires mutability.
s& operator =(const s& m)
{
if(this == &m) return *this;
this->~s();
return *new(this) s(m);
}
Should I avoid this, and why (if so)? Is it safe to use placement new if the object is on the stack?
You should avoid it if you can, not because it is ill-formed, but because it is quite hard for a reader to understand your goal and trust in this code. As a programmer, you should aim to reduce the number of WTF/line of code you write.
But, it is legal. According to
[new.delete.placement]/3
void* operator new(std::size_t size, void* ptr) noexcept;
3 Remarks: Intentionally performs no other action.
Invoking the placement new does not allocate or deallocate memory, and is equivalent to manually call the copy constructor of s, which according to [basic.life]/8 is legal if s has a trivial destructor.
Ok,
You should always think about a problem with simple steps.
std::vector<typename T>::push_back(args);
needs to reserve space in the vector data then assigns(or copy, or move) the value of the parameter to memory of the vector.data()[idx] at that position.
to understand why you cannot use your structure in the member function std::vector::push_back , try this:
std::vector<const int> v; // the compiler will hate you here,
// because this is considered ill formed.
The reason why is ill formed, is that the member functions of the class std::vector could call the assignment operator of its template argument, but in this case it's a constant type parameter "const int" which means it doesn't have an assignment operator ( it's none sense to assign to a const variable!!).
the same behavior is observed with a class type that has a const data member. Because the compiler will delete the default assignment operator, expel
struct S
{
const int _id; // automatically the default assignment operator is
// delete i.e. S& operator-(const S&) = delete;
};
// ... that's why you cannot do this
std::vector<S> v;
v.Push_back(S(1234));
But if you want to keep the intent and express it in a well formed code this is how you should do it:
class s
{
int _id;
public:
explicit s(const int& id) :
_id(id)
{};
const int& get() const
{
return _id;
}; // no user can modify the member variable after it's initialization
};
// this is called data encapsulation, basic technique!
// ...
std::vector<S> v ;
v.push_back(S(1234)); // in place construction
If you want to break the rules and impose an assignable constant class type, then do what the guys suggested above.
Q2. What should I do?
Store pointers, preferably smart.
vector<unique_ptr<s>> v;
v.emplace_back(new s(1));
It's not really a solution, but a workaround:
#include <vector>
struct s
{
const int id;
s(int _id):
id(_id)
{}
};
int main(){
std::vector<s*> v;
v.push_back(new s(1));
return 0;
}
This will store pointers of s instead of the object itself. At least it compiles... ;)
edit: you can enhance this with smart c++11 pointers. See Benjamin Lindley's answer.
Use a const_cast in the assignment operator:
S& operator=(const S& rhs)
{
if(this==&rhs) return *this;
int *pid=const_cast<int*>(&this->id);
*pid=rhs.id;
return *this;
}

Why couldn't push_back be overloaded to do the job of emplace_back?

Firstly, I'm aware of this question, but I don't believe I'm asking the same thing.
I know what std::vector<T>::emplace_back does - and I understand why I would use it over push_back(). It uses variadic templates allowing me to forward multiple arguments to the constructor of a new element.
But what I don't understand is why the C++ standard committee decided there was a need for a new member function. Why couldn't they simply extend the functionality of push_back(). As far as I can see, push_back could be overloaded in C++11 to be:
template <class... Args>
void push_back(Args&&... args);
This would not break backwards compatibility, while allowing you to pass N arguments, including arguments that would invoke a normal rvalue or copy constructor. In fact, the GCC C++11 implementation of push_back() simply calls emplace_back anyway:
void push_back(value_type&& __x)
{
emplace_back(std::move(__x));
}
So, the way I see it, there is no need for emplace_back(). All they needed to add was an overload for push_back() which accepts variadic arguments, and forwards the arguments to the element constructor.
Am I wrong here? Is there some reason that an entirely new function was needed here?
If T has an explicit conversion constructor, there is different behavior between emplace_back and push_back.
struct X
{
int val;
X() :val() {}
explicit X(int v) :val(v) {}
};
int main()
{
std::vector<X> v;
v.push_back(123); // this fails
v.emplace_back(123); // this is okay
}
Making the change you suggest would mean that push_back would be legal in that instance, and I suppose that was not desired behavior. I don't know if this is the reason, but it's the only thing I can come up with.
Here's another example.
Honestly, the two are semantically so different, that their similar behavior should be regarded as a mere coincidence (due to the fact that C++ has "copy constructors" with a particular syntax).
You should really not use emplace_back unless you want in-place-construction semantics.
It's rare that you'd need such a thing. Generally push_back is what you really, semantically want.
#include <vector>
struct X { X(struct Y const &); };
struct Y { Y(int const &); operator X(); };
int main()
{
std::vector<X> v;
v. push_back(Y(123)); // Calls Y::operator X() and Y::Y(int const &)
v.emplace_back(Y(123)); // Calls X::X(Y const &) and Y::Y(int const &)
}

Is this C++ pointer container safe?

I want a safe C++ pointer container similar to boost's scoped_ptr, but with value-like copy semantics. I intend to use this for a very-rarely used element of very-heavily used class in the innermost loop of an application to gain better memory locality. In other words, I don't care about performance of this class so long as its "in-line" memory load is small.
I've started out with the following, but I'm not that adept at this; is the following safe? Am I reinventing the wheel and if so, where should I look?
template <typename T>
class copy_ptr {
T* item;
public:
explicit copy_ptr() : item(0) {}
explicit copy_ptr(T const& existingItem) : item(new T(existingItem)) {}
copy_ptr(copy_ptr<T> const & other) : item(new T(*other.item)) {}
~copy_ptr() { delete item;item=0;}
T * get() const {return item;}
T & operator*() const {return *item;}
T * operator->() const {return item;}
};
Edit: yes, it's intentional that this behaves pretty much exactly like a normal value. Profiling shows that the algorithm is otherwise fairly efficient but is sometimes hampered by cache misses. As such, I'm trying to reduce the size of the object by extracting large blobs that are currently included by value but aren't actually used in the innermost loops. I'd prefer to do that without semantic changes - a simple template wrapper would be ideal.
No it is not.
You have forgotten the Assignment Operator.
You can choose to either forbid assignment (strange when copying is allowed) by declaring the Assignment Operator private (and not implementing it), or you can implement it thus:
copy_ptr& operator=(copy_ptr const& rhs)
{
using std::swap;
copy_ptr tmp(rhs);
swap(this->item, tmp.item);
return *this;
}
You have also forgotten in the copy constructor that other.item may be null (as a consequence of the default constructor), pick up your alternative:
// 1. Remove the default constructor
// 2. Implement the default constructor as
copy_ptr(): item(new T()) {}
// 3. Implement the copy constructor as
copy_ptr(copy_ptr const& rhs): item(other.item ? new T(*other.item) : 0) {}
For value-like behavior I would prefer 2, since a value cannot be null. If you go for allowing nullity, introduces assert(item); in both operator-> and operator* to ensure correctness (in debug mode) or throw an exception (whatever you prefer).
Finally the item = 0 in the destructor is useless: you cannot use the object once it's been destroyed anyway without invoking undefined behavior...
There's also Roger Pate's remark about const-ness propagation to be more "value-like" but it's more a matter of semantics than correctness.
You should "pass on" the const-ness of the copy_ptr type:
T* get() { return item; }
T& operator*() { return *item; }
T* operator->() { return item; }
T const* get() const { return item; }
T const& operator*() const { return *item; }
T const* operator->() const { return item; }
Specifying T isn't needed in the copy ctor:
copy_ptr(copy_ptr const &other) : item (new T(*other)) {}
Why did you make the default ctor explicit? Nulling the pointer in the dtor only makes sense if you plan on UB somewhere...
But these are all minor issues, what you have there is pretty much it. And yes, I've seen this invented many times over, but people tend to tweak the semantics just slightly each time. You might look at boost::optional, as that's almost what you have written here as you present it, unless you're adding move semantics and other operations.
In addition to what Roger has said, you can Google 'clone_ptr' for ideas/comparisons.

Allow member to be const while still supporting operator= on the class

I have several members in my class which are const and can therefore only be initialised via the initialiser list like so:
class MyItemT
{
public:
MyItemT(const MyPacketT& aMyPacket, const MyInfoT& aMyInfo)
: mMyPacket(aMyPacket),
mMyInfo(aMyInfo)
{
}
private:
const MyPacketT mMyPacket;
const MyInfoT mMyInfo;
};
My class can be used in some of our internally defined container classes (e.g. vectors), and these containers require that operator= is defined in the class.
Of course, my operator= needs to do something like this:
MyItemT&
MyItemT::operator=(const MyItemT& other)
{
mMyPacket = other.mPacket;
mMyInfo = other.mMyInfo;
return *this;
}
which of course doesn't work because mMyPacket and mMyInfo are const members.
Other than making these members non-const (which I don't want to do), any ideas about how I could fix this?
You're kind of violating the definition of const if you have an assignment operator that can change them after construction has finished. If you really need to, I think Potatoswatter's placement new method is probably best, but if you have an assignment operator your variables aren't really const, since someone could just make a new instance and use it to change their values
Rather than storing objects in your containers directly, you might be able to store pointers (or smart pointers). That way, you don't have to mutate any of the members of your class -- you get back exactly the same object as you passed in, const and all.
Of course, doing this will probably change the memory management of your application somewhat, which may well be a good enough reason not to want to.
It's a dirty hack, but you can destroy and reconstruct yourself:
MyItemT&
MyItemT::operator=(const MyItemT& other)
{
if ( this == &other ) return *this; // "suggested" by Herb Sutter ;v)
this->MyItemT::~MyItemT();
try {
new( this ) MyItemT( other );
} catch ( ... ) {
new( this ) MyItemT(); // nothrow
throw;
}
return *this;
}
Edit: lest I destroy my credibility, I don't actually do this myself, I would remove the const. However, I've been debating changing the practice, because const simply is useful and better to use wherever possible.
Sometimes there is a distinction between the resource and the value represented by an object. A member may be const through changes to value as long as the resource is the same, and it would be nice to get compile-time safety on that.
Edit 2: #Charles Bailey has provided this wonderful (and highly critical) link: http://gotw.ca/gotw/023.htm.
Semantics are tricky in any derived class operator=.
It may be inefficient because it doesn't invoke assignment operators that have been defined.
It's incompatible with wonky operator& overloads (whatever)
etc.
Edit 3: Thinking through the "which resource" vs "what value" distinction, it seems clear that operator= should always change the value and not the resource. The resource identifier may then be const. In the example, all the members are const. If the "info" is what's stored inside the "packet," then maybe the packet should be const and the info not.
So the problem isn't so much figuring out the semantics of assignment as lack of an obvious value in this example, if the "info" is actually metadata. If whatever class owns a MyItemT wants to switch it from one packet to another, it needs to either give up and use an auto_ptr<MyItemT> instead, or resort to a similar hack as above (the identity test being unnecessary but the catch remaining) implemented from outside. But operator= shouldn't change resource binding except as an extra-special feature which absolutely won't interfere with anything else.
Note that this convention plays well with Sutter's advice to implement copy construction in terms of assignment.
MyItemT::MyItemT( MyItemT const &in )
: mMyPacket( in.mMyPacket ) // initialize resource, const member
{ *this = in; } // assign value, non-const, via sole assignment method
I think you could get away with a special const proxy.
template <class T>
class Const
{
public:
// Optimal way of returning, by value for built-in and by const& for user types
typedef boost::call_traits<T>::const_reference const_reference;
typedef boost::call_traits<T>::param_type param_type;
Const(): mData() {}
Const(param_type data): mData(data) {}
Const(const Const& rhs): mData(rhs.mData) {}
operator const_reference() const { return mData; }
void reset(param_type data) { mData = data; } // explicit
private:
Const& operator=(const Const&); // deactivated
T mData;
};
Now, instead of const MyPacketT you would have Const<MyPacketT>. Not that the interface only provides one way to change it: through an explicit call to reset.
I think any use of mMyPacket.reset can easily be search for. As #MSalters said it protects against Murphy, not Machiavelli :)
You might consider making the MyPacketT and MyInfoT members be pointers to const (or smart pointers to const). This way the data itself is still marked const and immutable, but you can cleanly 'swap' to another set of const data in an assignment if that makes sense. In fact, you can use the swap idiom to perform the assignment in an exception safe manner.
So you get the benefit of const to help you prevent accidentally allowing changes that you want the design to prevent, but you still allow the object as a whole to be assigned from another object. For example, this will let you use objects of this class in STL containers.
You might look at this as a special application of the 'pimpl' idiom.
Something along the lines of:
#include <algorithm> // for std::swap
#include "boost/scoped_ptr.hpp"
using namespace boost;
class MyPacketT {};
class MyInfoT {};
class MyItemT
{
public:
MyItemT(const MyPacketT& aMyPacket, const MyInfoT& aMyInfo)
: pMyPacket(new MyPacketT( aMyPacket)),
pMyInfo(new MyInfoT( aMyInfo))
{
}
MyItemT( MyItemT const& other)
: pMyPacket(new MyPacketT( *(other.pMyPacket))),
pMyInfo(new MyInfoT( *(other.pMyInfo)))
{
}
void swap( MyItemT& other)
{
pMyPacket.swap( other.pMyPacket);
pMyInfo.swap( other.pMyInfo);
}
MyItemT const& operator=( MyItemT const& rhs)
{
MyItemT tmp( rhs);
swap( tmp);
return *this;
}
private:
scoped_ptr<MyPacketT const> pMyPacket;
scoped_ptr<MyInfoT const> pMyInfo;
};
Finally, I changed my example to use scoped_ptr<> instead of shared_ptr<> because I thought it was a more general representation of what the OP intended. However, if the 'reassignable' const members can be shared (and that's probably true, given my understanding of why the OP wants them const), then it might be an optimization to use shared_ptr<>'s and let the copy and assignment operations of the shared_ptr<> class take care of things for those objects - if you have no other members that require special copy or assign sematics, then your class just got a lot simpler, and you might even save a significant bit of memory usage by being able to share copies of the MyPacketT and MyInfoT objects.