I have a class similar to vector that is primarily a dynamically sized array. I am writing it for a resource-limited platform so I am required to not use exceptions.
It has become clear that, to use operator overloading to simplify the interface for this class, dynamic allocation would have to be performed in some of the operator overload functions. The assignment operator (=) is one example.
Without exceptions, though, it becomes rather challenging to inform the caller of a bad allocation error in a sensible way while still retaining strong error safety. I could have an error property of the class which the caller must check after every call that involves dynamic allocation, but this seems like a not-so-optimal solution.
EDIT:
This is the best idea I have got at the moment (highlighted as a not-so-optimal solution in the paragraph above), any improvements would be greatly appreciated:
dyn_arr & dyn_arr::operator=(dyn_arr const & rhs) {
if (reallocate(rhs.length)) // this does not destroy data on bad alloc
error |= bad_alloc; // set flag indicating the allocate has failed
else {
size_t i;
for (i = 0; i < rhs.length; ++i) // copy the array
arr[i] = rhs.arr[i]; // assume this won't throw an exception and won't fail
}
return *this;
}
then to call:
dyn_arr a = b;
if (a.error)
// handle it...
I haven't compiled this so there might be typos, but hopefully you get the idea.
There are two separate issues going on here.
The first is related to operator overloading. As CashCow mentions, overloaded operators in C++ are just syntactical sugar for function calls. In particular, operators are not required to return *this. That is merely a programming convention, created with the intention to facilitate operator chaining.
Now, chaining assignment operators (a = b = c = ...) is quite a corner case in C++ applications. So it's possible that you're better off explicitly forbidding the users of your dyn_arr class from ever chaining assignment operators. That would give you the freedom to instead return an error code from the operator, just like from a regular function:
error_t operator = (dyn_arr const & rhs) {
void *mem = realloc(...);
if (mem == NULL) {
return ERR_BAD_ALLOC; // memory allocation failed
}
...
return ERR_SUCCESS; // all ok
}
And then in caller code:
dyn_arr a, b;
if ((a = b) != ERR_SUCCESS) {
// handle error
}
The second issue is related to the actual example you're giving:
dyn_arr a = b;
This example will NOT call the overloaded assignment operator! Instead, it means "construct dyn_arr object a with b as argument to the constructor". So this line actually calls the copy constructor of dyn_arr. If you're interested in understanding why, think in terms of efficiency. If the semantics of that line included calling the assignment operator, the runtime system would have to do two things as a result of this line: construct a with some default state, and then immediately destroy that state by assigning to a the state of b. Instead, just doing one thing - calling the copy constructor - is sufficient. (And it leads to the same semantics, assuming any sane implementations of the copy constructor and the assignment operator.)
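A small sketch can make the distinction visible. The probe type and its counters are hypothetical, introduced only for illustration; they count how often each special member function runs.

```cpp
#include <cassert>

// Hypothetical type used only to observe which special member is called.
struct probe {
    static int copy_ctor_calls;
    static int copy_assign_calls;

    probe() {}
    probe(probe const&) { ++copy_ctor_calls; }
    probe& operator=(probe const&) { ++copy_assign_calls; return *this; }
};

int probe::copy_ctor_calls = 0;
int probe::copy_assign_calls = 0;

// Runs both forms and checks which member each one used.
inline void demo_init_vs_assign() {
    probe::copy_ctor_calls = 0;
    probe::copy_assign_calls = 0;

    probe b;
    probe a = b;  // initialization: copy constructor, not operator=
    assert(probe::copy_ctor_calls == 1 && probe::copy_assign_calls == 0);

    a = b;        // assignment to an existing object: operator=
    assert(probe::copy_ctor_calls == 1 && probe::copy_assign_calls == 1);
}
```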
Unfortunately, you're right to recognize that this issue is hard to deal with. There does not seem to be a really elegant way of handling failure in a constructor, other than throwing an exception. If you cannot do that, either:
set a flag in the constructor and require/suggest the user to check for it afterwards, or
require that a pointer to an already allocated memory area is passed as an argument to the constructor.
For more details, see How to handle failure in constructor in C++?
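One further possibility, not among the two options listed above, is two-phase initialization: keep the constructor trivial so it cannot fail, and move every allocation into member functions that return a status. The sketch below is only an illustration under that assumption; the member names (resize, assign, arr_, length_) are hypothetical, and it relies on new (std::nothrow) returning a null pointer on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical sketch; names are illustrative, not the asker's real class.
class dyn_arr {
public:
    dyn_arr() : arr_(nullptr), length_(0) {}  // cannot fail: no allocation
    ~dyn_arr() { delete[] arr_; }

    // Copy construction and operator= are disabled: neither can report
    // failure without exceptions, so callers must use the checked functions.
    dyn_arr(dyn_arr const&) = delete;
    dyn_arr& operator=(dyn_arr const&) = delete;

    // Resizes to n elements, keeping the old values that still fit.
    // Returns false on allocation failure and leaves *this unchanged.
    bool resize(std::size_t n) {
        int* fresh = new (std::nothrow) int[n]();  // new elements are zeroed
        if (fresh == nullptr) return false;
        std::size_t const keep = n < length_ ? n : length_;
        for (std::size_t i = 0; i < keep; ++i) fresh[i] = arr_[i];
        delete[] arr_;
        arr_ = fresh;
        length_ = n;
        return true;
    }

    // Copies rhs into *this. The new buffer is filled before the old one is
    // released, so a failed allocation leaves *this unchanged.
    bool assign(dyn_arr const& rhs) {
        if (this == &rhs) return true;
        int* fresh = new (std::nothrow) int[rhs.length_];
        if (fresh == nullptr) return false;
        for (std::size_t i = 0; i < rhs.length_; ++i) fresh[i] = rhs.arr_[i];
        delete[] arr_;
        arr_ = fresh;
        length_ = rhs.length_;
        return true;
    }

    std::size_t size() const { return length_; }
    int& operator[](std::size_t i) { return arr_[i]; }

private:
    int* arr_;
    std::size_t length_;
};
```

The cost is that a = b no longer compiles; callers write if (!a.assign(b)) and handle the error on the spot, which avoids a sticky error flag that is easy to forget to check.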
Operator overloading has nothing to do with exceptions, it is simply allowing a "function" to be invoked by means of use of operators.
e.g. if you were writing your own vector you could implement + to concatenate two vectors or add a single item to a vector (as an alias to push_back())
Of course any operation that requires assigning more memory could run out of it (and you would get bad_alloc and have to manage that if you cannot throw it, by setting some kind of error state).
I'm reading a book about C++, and in the "Copy Control" section the author teaches us to write operator= for a class, telling us that we have to make sure the method is safe against self-assignment when the class uses dynamic memory.
So, imagine that we have a class named "Bank_Client" with a std::string member created with new. The book teaches us to do this to handle the self-assignment case:
Bank_Client& Bank_Client::operator=(const Bank_Client &addres){
std::string *temp = new std::string(*addres.name);
delete name;
name = temp;
return *this;
}
So if I do
Bank_Client bob("Bobinsky");
bob = bob;
The program will not just blow up. But right when I thought that the temp variable was a waste of time, the writer of the book shows us another way to do it:
Bank_Client& Bank_Client::operator=(const Bank_Client &addres){
if (this != &addres){
delete name;
name = new std::string(*addres.name);
}
return *this;
}
As if he had read my mind. BUT right after that he tells us never to do that, and that it is better to do it the other way, without ever explaining why.
Why is the first way better? It is slower, isn't it?
What is the best way to do it?
What about using assert to check that there is no self-assignment (because we don't really need it)? The check could then be deactivated with NDEBUG, so no time is wasted on it.
The first way is slower if an object is being self-assigned. However, self-assignment is rare. In all other cases, the additional if check from the second approach is a waste.
That said, both approaches are poor ways to implement the copy-assignment operator: neither is exception-safe. If the assignment fails partway through, you'll be left with a half-assigned object in some inconsistent state. It's also bad that it partly duplicates logic from the copy-constructor. Instead, you should implement the copy-assignment operator using the copy-and-swap idiom.
You should use copy and swap. To that end you need a copy constructor (and maybe a move constructor as well if you also want to use move semantics).
class Bank_Client {
// Note that swap is a free function:
// This is important to allow it to be used along with std::swap
friend void swap(Bank_Client& c1, Bank_Client& c2) noexcept {
using std::swap;
swap(c1.name, c2.name);
// ...
}
// ...
};
// Note that we took the argument by value, not const reference
Bank_Client& Bank_Client::operator=(Bank_Client address) {
// Will call the swap function we defined above
swap(*this, address);
return *this;
}
Now, let's take a look at the client code:
Bank_Client bob("Bobinsky");
// 1. Copy constructor is called to construct the `address` parameter of `operator=()`
// 2. We swap that newly created copy with the current content of bob
// 3. operator=() returns, the temporary copy is destroyed, everything is cleaned up!
bob = bob;
Obviously, you make a useless copy when doing self assignment, but it has the merit of allowing you to reuse logic in your copy constructor (or move constructor) and it is exception safe: nothing is done if the initial copy throws an exception.
The other benefit is you only need one implementation for operator=() to handle both copy and move semantics (copy-and-swap and move-and-swap). If the performance is a big issue you can still have an rvalue overload operator=(Bank_Client&&) to avoid the extra move (though I discourage it).
As a final word, I would also advise you to rely on the rule of 0 to the best of your ability. In the example above, should the content of the class change, you must also update the swap function accordingly.
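For comparison, here is a minimal sketch of what the rule of 0 could look like for this class, assuming the name can be held as a plain std::string member instead of a pointer created with new. The constructor and accessor are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Rule of 0: no destructor, copy/move constructor, or operator= is written.
// std::string manages its own memory, so the compiler-generated members
// are already correct, including for self-assignment.
class Bank_Client {
public:
    explicit Bank_Client(std::string name) : name_(std::move(name)) {}

    std::string const& name() const { return name_; }

private:
    std::string name_;
};
```

With this layout there is no swap function to keep in sync when members are added or removed.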
Hello I have a class Truck with only one property of type int. I am not using any pointers in the whole class. I have written 2 versions of the operator=:
Truck& operator=( Truck &x)
{
if( this != &x)
{
price=x.getPrice();
}
return *this;
}
Truck operator=(Truck x)
{
if( this != &x)
{
price=x.getPrice();
}
return *this;
}
Both of them work, but is there any performance issue with either of them? And what if I used pointers to declare my properties: should I stick to the first type of declaration?
Both of them work, but is there any performance issue with either of them?
There is a potential performance issue with both of the code samples you've posted.
Since your class only has an int member, writing a user-defined assignment operator, regardless of how well-written it may look, could be slower than what the compiler default version would have achieved.
If your class does not require you to write a user-defined assignment operator (or copy constructor), then it is wiser not to write these functions yourself, as compilers these days know intrinsically how to optimize the routines they themselves generate.
The same thing with the destructor -- that seemingly harmless empty destructor that you see written almost as a kneejerk reaction can have an impact on performance, since again, it overrides the compiler's default destructor, which is optimized to do whatever it needs to do.
So the bottom line is: leave the compiler alone when it comes to these functions. If the compiler default versions of the copy / assignment functions are adequate, don't interfere by writing your own versions. There is a potential for writing the wrong things (such as forgetting to copy a member) or doing things less efficiently than what the compiler would have produced.
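One way to see the difference is through type traits. In the sketch below, Truck leaves assignment to the compiler and stays trivially copy-assignable, which lets the compiler and library treat it like a plain int; TruckWithUserAssign is a hypothetical variant with a hand-written operator=, which loses that property. Whether this translates into a measurable slowdown depends on the compiler and optimization level.

```cpp
#include <cassert>
#include <type_traits>

// No user-written copy operations: the compiler generates them.
class Truck {
public:
    explicit Truck(int price) : price_(price) {}
    int getPrice() const { return price_; }

private:
    int price_;
};

// Hypothetical variant with a hand-written assignment operator.
class TruckWithUserAssign {
public:
    explicit TruckWithUserAssign(int price) : price_(price) {}
    TruckWithUserAssign& operator=(TruckWithUserAssign const& x) {
        if (this != &x) price_ = x.price_;
        return *this;
    }
    int getPrice() const { return price_; }

private:
    int price_;
};

static_assert(std::is_trivially_copy_assignable<Truck>::value,
              "compiler-generated assignment is trivial");
static_assert(!std::is_trivially_copy_assignable<TruckWithUserAssign>::value,
              "a user-provided operator= is never trivial");
```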
Way 1 is a valid way to write the assignment operator, except that it is recommended to take the parameter by constant reference. It returns a reference to this object, i.e. something as lightweight as a pointer.
Way 2 can decrease performance: it constructs and returns a copy of this object. Furthermore, it does not behave the way callers expect. Why is returning a reference the standard signature for the assignment operator? It allows expressions like
copy1 = copy2 = original;
while ((one = two).condition())
doSomething();
Let's consider the following:
(copy = original).changeObject();
With way 1 this expression does what a programmer expects. With way 2 it is incorrect: you call changeObject on a temporary object returned by the assignment operator, not on copy.
You can say: "I don't want to use such ugly syntax". In that case, just don't allow it and return nothing (void) from operator=. Otherwise, it is recommended to return a reference to this object.
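The difference can be checked with a small sketch. The two types below are hypothetical and differ only in the return type of operator=; changeObject simply increments the stored value.

```cpp
#include <cassert>

// Way 1: operator= returns a reference to the assigned-to object.
struct by_ref {
    int value;
    by_ref& operator=(by_ref const& rhs) { value = rhs.value; return *this; }
    void changeObject() { ++value; }
};

// Way 2: operator= returns a copy of the assigned-to object.
struct by_val {
    int value;
    by_val operator=(by_val const& rhs) { value = rhs.value; return *this; }
    void changeObject() { ++value; }
};

inline void demo_return_type() {
    by_ref original1{41}, copy1{0};
    (copy1 = original1).changeObject();  // modifies copy1 itself
    assert(copy1.value == 42);

    by_val original2{41}, copy2{0};
    (copy2 = original2).changeObject();  // modifies a temporary; copy2 stays 41
    assert(copy2.value == 41);
}
```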
See also links in comments, they seem to be useful.
I made a sparse matrix class for some work I am doing. For the sparse structures, I used pointers, e.g. int* rowInd = new int[numNonZero]. For the class I wrote copy and move assignment operators and all works fine.
Reading about the move and copy semantics online, I have tangentially found an overwhelming opinion that in modern C++ I should probably not be using raw pointers. If this is the case, then I would like to modify my code to use vectors for good coding practice.
I mostly have read vectors over raw pointers. Is there any reason not to change to vectors?
If I change the data to be stored in vectors instead of new[] arrays, do I still need to manually write copy/move assignment and constructor operators for classes? Are there any important differences between vector and new[] move/copy operators?
Suppose I have a class called Levels, which contains several sparse matrix variables. I would like a function to create a vector of Levels, and return it:
vector<Levels> GetGridLevels(int &n, ... ) {
vector<Levels> grids(n);
// ... Define matrix variables for each Level object in grids ...
return grids;
}
Will move semantics prevent this from being an expensive copy? I would think so, but it's a vector of objects containing objects containing member vector variables, which seems like a lot...
Yes, use std::vector<T> instead of raw T *.
Also yes, the compiler will generate copy and move assignment operators for you and those will very likely have optimal performance, so don't write your own. If you want to be explicit, you can say that you want the generated defaults:
struct S
{
std::vector<int> numbers {};
// I want a default copy constructor
S(const S&) = default;
// I want a default move constructor
S(S &&) noexcept = default;
// I want a default copy-assignment operator
S& operator=(const S&) = default;
// I want a default move-assignment operator
S& operator=(S&&) noexcept = default;
};
Regarding your last question, if I understand correctly, you mean whether returning a move-aware type by-value will be efficient. Yes, it will. To get the most out of your compiler's optimizations, follow these rules:
Return by-value (not by const value, this will inhibit moving).
Don't return std::move(x), just return x (at least if your return type is decltype(x)) so as not to inhibit copy elision.
If you have more than one return statement, return the same object on every path to facilitate named return value optimization (NRVO).
std::string
good(const int a)
{
std::string answer {};
if (a % 7 > 3)
answer = "The argument modulo seven is greater than three.";
else
answer = "The argument modulo seven is less than or equal to three.";
return answer;
}
std::string
not_so_good(const int a)
{
std::string answer {"The argument modulo seven is less than or equal to three."};
if (a % 7 > 3)
return "The argument modulo seven is greater than three.";
return answer;
}
For those types where you write move constructors and assignment operators, make sure to declare them noexcept or some standard library containers (notably std::vector) will refuse to use them.
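The effect of noexcept can be observed by counting copies while a std::vector grows. The Tracked template below is a hypothetical probe: its move constructor is noexcept or not depending on the template argument, and the copy constructor bumps a counter. When the move constructor is not noexcept and a copy constructor is available, std::vector falls back to copying during reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical probe type: counts copy constructions.
template <bool NoexceptMove>
struct Tracked {
    static int copies;

    Tracked() = default;
    Tracked(Tracked const&) { ++copies; }
    Tracked(Tracked&&) noexcept(NoexceptMove) {}
    Tracked& operator=(Tracked const&) = default;
    Tracked& operator=(Tracked&&) = default;
};

template <bool NoexceptMove>
int Tracked<NoexceptMove>::copies = 0;

// Grows a vector until it reallocates at least once and reports how many
// element copies that caused.
template <bool NoexceptMove>
int copies_during_growth() {
    Tracked<NoexceptMove>::copies = 0;
    std::vector<Tracked<NoexceptMove>> v;
    v.emplace_back();                      // constructed in place, no copy
    std::size_t const cap = v.capacity();
    while (v.capacity() == cap) v.emplace_back();  // forces a reallocation
    return Tracked<NoexceptMove>::copies;
}
```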
Nothing related to correctness. Just be aware that constructing a vector of size n means it will initialize all of its elements, so you might prefer to construct an empty vector, then reserve(n), then push_back the elements.
No, the implicit move constructor/assignment should take care of it all - unless you suppress them.
Yes, if you don't write code to prevent the move, you'll get an efficient move from std::vector automatically.
Also, consider using an existing library such as Eigen, so you get some fairly optimized routines for free.
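The reserve-then-append pattern mentioned above might look like the sketch below. Level is a hypothetical stand-in for the asker's Levels class, holding a single index vector; the sizes chosen are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's Levels class.
struct Level {
    std::vector<int> rowInd;
    explicit Level(std::size_t numNonZero) : rowInd(numNonZero) {}
};

std::vector<Level> make_levels(std::size_t n) {
    std::vector<Level> grids;
    grids.reserve(n);  // one allocation up front, no default-constructed Levels
    for (std::size_t i = 0; i < n; ++i)
        grids.emplace_back(i + 1);  // build each Level in place (size is arbitrary)
    return grids;  // returned by value: elided or moved, not deep-copied
}
```

Because Level declares no copy or move operations of its own, the compiler-generated ones forward to std::vector, so moving a Level only transfers the internal buffers of its members.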
No. In 99% of the cases the simplest use of std::vector will do the job better and more safely than raw pointers, and in the less common cases where you need to manually manage memory, these classes can work with custom allocators/deallocators (for instance, if you want aligned memory for use with aligned SSE intrinsics). If you use custom allocators, the code will potentially be more complex than with raw pointers, but more maintainable and less prone to memory problems.
Depending on what your other members are, and what your class does, you may need to implement move/copy assignment/ctors. But this will be much simpler: even if you have to implement them yourself, for your vectors you just need to call the corresponding operators/ctors. The code will be simple and readable, and you will have no risk of segfaults or memory leaks.
Yes, but move semantics are not even necessary. Return value optimization will be responsible for the optimized copy (in fact there will be no copy). However this is compiler specific, and not guaranteed by the standard.
template<class T>
T Stack<T>::pop()
{
if (vused_ == 0)
{
throw "Popping empty stack";
}
else
{
T result = v_[used_ - 1];
--vused_;
return result;
}
}
I didn't understand all of it, or rather I understood none of it, but it was said that this code doesn't work, because it returns by value, I am guessing he was referring to result, and that calls the copy constructor and I have no idea how that's even possible. Can anyone care to explain?
Unlike the code in the question's example, std::stack<T>::pop does not return a value.
That's because if the item type needs to be copied, and the copying throws, then you have an operation failure that has changed the state of the object, with no means of re-establishing the original state.
I.e. the return-a-value-pop does not offer a strong exception guarantee (either succeed or no change).
Similarly, throwing a literal string is unconventional to say the least.
So while the code doesn't have any error in itself (modulo possible typing errors such as vused_ versus v_ etc.), it's weak on guarantees and so unconventional that it may lead to bugs in exception handling elsewhere.
A different viewpoint is that the non-value-returning pop of std::stack is impractical, leading to needlessly verbose client code.
And for using a stack object I prefer to have a value-returning pop.
But it's not either/or: a value-returning convenience method popped can be easily defined in terms of pop (state change) and top (inspection). This convenience method then has a weaker exception guarantee. But the client code programmer can choose. :-)
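A minimal sketch of such a convenience function, written here as a free function over std::stack (the name popped follows the text; the free-function form and the use of std::move are assumptions of this sketch):

```cpp
#include <cassert>
#include <stack>
#include <utility>

// Returns the top element and removes it from the stack.
// Weaker guarantee than top() + pop(): if moving the element throws,
// the top element may already have been modified.
template <class T, class Container>
T popped(std::stack<T, Container>& s) {
    T result = std::move(s.top());  // inspection (moves from the top element)
    s.pop();                        // state change; does not copy any T
    return result;                  // plain name: eligible for NRVO or a move
}
```

Client code that needs the strong guarantee can still call top() and pop() separately.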
An improvement within the existing design would be to support movable objects, that is, replace
return result;
with
return move( result );
helping the compiler a little.
↑ Correction:
Actually, the suggestion above, since withdrawn, has the opposite effect of the intended one: it inhibits RVO (guaranteeing a constructor call). Somehow my thinking got inverted here. As a rule, don't use move on a return expression that is just the name of a non-parameter automatic variable, because the default is optimization, and the added move cannot improve things but can inhibit RVO.
Yes, returning by value formally calls the copy constructor. But that's not a problem at all, because in practice, compilers will typically be able to optimize away the additional copy. This technique is called "Return-Value Optimization".
More than the return statement (which can work if the class is movable but not copyable, e.g. you can return std::unique_ptrs), the problem is the copy you do here:
T result = v_[used_ - 1];
To make this copy possible, the type T must be copyable (e.g. T should have public copy constructor - required by the above statement - and copy assignment operator=).
As a side note, throwing a string is really bad: you should throw an exception class, e.g.
throw std::runtime_error("Popping empty stack.");
or just define an ad hoc class for this case and throw it, e.g.:
class StackUnderflowException : public std::runtime_error
{
public:
StackUnderflowException()
: std::runtime_error("Popping empty stack.")
{ }
};
....
throw StackUnderflowException();
I've recently come across this rant.
I don't quite understand a few of the points mentioned in the article:
The author mentions the small annoyance of delete vs delete[], but seems to argue that it is actually necessary (for the compiler), without ever offering a solution. Did I miss something?
In the section 'Specialized allocators', in function f(), it seems the problems can be solved with replacing the allocations with: (omitting alignment)
// if you're going to the trouble to implement an entire Arena for memory,
// making an arena_ptr won't be much work. basically the same as an auto_ptr,
// except that it knows which arena to deallocate from when destructed.
arena_ptr<char> string(a); string.allocate(80);
// or: arena_ptr<char> string; string.allocate(a, 80);
arena_ptr<int> intp(a); intp.allocate();
// or: arena_ptr<int> intp; intp.allocate(a);
arena_ptr<foo> fp(a); fp.allocate();
// or: arena_ptr<foo> fp; fp.allocate(a);
// use templates in 'arena.allocate(...)' to determine that foo has
// a constructor which needs to be called. do something similar
// for destructors in '~arena_ptr()'.
In 'Dangers of overloading ::operator new[]', the author tries to do a new(p) obj[10]. Why not this instead (far less ambiguous):
obj *p = (obj *)special_malloc(sizeof(obj[10]));
for(int i = 0; i < 10; ++i, ++p)
new(p) obj;
'Debugging memory allocation in C++'. Can't argue here.
The entire article seems to revolve around classes with significant constructors and destructors located in a custom memory management scheme. While that could be useful, and I can't argue with it, it's pretty limited in commonality.
Basically, we have placement new and per-class allocators -- what problems can't be solved with these approaches?
Also, in case I'm just thick-skulled and crazy, in your ideal C++, what would replace operator new? Invent syntax as necessary -- what would be ideal, simply to help me understand these problems better.
Well, the ideal would probably be to not need delete of any kind. Have a garbage-collected environment, let the programmer avoid the whole problem.
The complaints in the rant seem to come down to
"I liked the way malloc does it"
"I don't like being forced to explicitly create objects of a known type"
He's right about the annoying fact that you have to implement both new and new[], but you're forced into that by Stroustrup's desire to maintain the core of C's semantics. Since you can't tell a pointer from an array, you have to tell the compiler yourself. You could fix that, but doing so would mean changing the semantics of the C part of the language radically; you could no longer make use of the identity
*(a+i) == a[i]
which would break a very large subset of all C code.
So, you could have a language which
implements a more complicated notion of an array, and eliminates the wonders of pointer arithmetic, implementing arrays with dope vectors or something similar.
is garbage collected, so you don't need your own delete discipline.
Which is to say, you could download Java. You could then extend that by changing the language so it
isn't strongly typed, so type checking the void * upcast is eliminated,
...but that means that you can write code that transforms a Foo into a Bar without the compiler seeing it. This would also enable ducktyping, if you want it.
The thing is, once you've done those things, you've got Python or Ruby with a C-ish syntax.
I've been writing C++ since Stroustrup sent out tapes of cfront 1.0; a lot of the history involved in C++ as it is now comes out of the desire to have an OO language that could fit into the C world. There were plenty of other, more satisfying, languages that came out around the same time, like Eiffel. C++ seems to have won. I suspect that it won because it could fit into the C world.
The rant, IMHO, is very misleading and it seems to me that the author does understand the finer details, it's just that he appears to want to mislead. IMHO, the key point that shows the flaw in argument is the following:
void* operator new(std::size_t size, void* ptr) throw();
The standard defines that the above function has the following properties:
Returns: ptr.
Notes: Intentionally performs no other action.
To restate that - this function intentionally performs no other action. This is very important, as it is the key to what placement new does: It is used to call the constructor for the object, and that's all it does. Notice explicitly that the size parameter is not even mentioned.
For those without time, to summarise my point: everything that 'malloc' does in C can be done in C++ using "::operator new". The only difference is that if you have non aggregate types, ie. types that need to have their destructors and constructors called, then you need to call those constructor and destructors. Such types do not explicitly exist in C, and so using the argument that "malloc does it better" is not valid. If you have a struct in 'C' that has a special "initializeMe" function which must be called with a corresponding "destroyMe" then all points made by the author apply equally to that struct as they do to a non-aggregate C++ struct.
Taking some of his points explicitly:
To implement multiple inheritance, the compiler must actually change the values of pointers during some casts. It can't know which value you eventually want when converting to a void * ... Thus, no ordinary function can perform the role of malloc in C++--there is no suitable return type.
This is not correct, again ::operator new performs the role of malloc:
class A1 { };
class A2 { };
class B : public A1, public A2 { };
void foo () {
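void * v = ::operator new (sizeof (B));
B * b = new (v) B(); // Placement new calls the constructor for B.
b->~B(); // Call the destructor explicitly...
::operator delete (v); // ...then release the raw storage.
v = ::operator new (sizeof(int));
int * i = reinterpret_cast <int*> (v);
::operator delete (v); // Raw storage is released the same way.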
}
As I mention above, we need placement new to call the constructor for B. In the case of 'i' we can cast from void* to int* without a problem, although again using placement new would improve type checking.
Another point he makes is about alignment requirements:
Memory returned by new char[...] will not necessarily meet the alignment requirements of a struct intlist.
The standard under 3.7.3.1/2 says:
The pointer returned shall be suitably aligned so that it can be converted to a
pointer of any complete object type and then used to access the object or array in the storage allocated (until
the storage is explicitly deallocated by a call to a corresponding deallocation function).
That to me appears pretty clear.
Under specialized allocators the author describes potential problems that you might have, e.g. you need to pass the allocator as an argument to any types which allocate memory themselves, and the constructed objects will need to have their destructors called explicitly. Again, how is this different from passing the allocator object through to an "initializeMe" call for a C struct?
Regarding calling the destructor, in C++ you can easily create a special kind of smart pointer, let's call it "placement_pointer" which we can define to call the destructor explicitly when it goes out of scope. As a result we could have:
template <typename T>
class placement_pointer {
// ...
~placement_pointer() {
if (*count == 0) {
m_b->~T();
}
}
// ...
T * m_b;
};
void
f ()
{
arena a;
// ...
foo *fp = new (a) foo; // must be destroyed
// ...
fp->~foo ();
placement_pointer<foo> pfp = new (a) foo; // automatically !!destructed!!
// ...
}
The last point I want to comment on is the following:
g++ comes with a "placement" operator new[] defined as follows:
inline void *
operator new[](size_t, void *place)
{
return place;
}
As noted above, not just implemented this way - but it is required to be so by the standard.
Let obj be a class with a destructor. Suppose you have sizeof (obj[10]) bytes of memory somewhere and would like to construct 10 objects of type obj at that location. (C++ defines sizeof (obj[10]) to be 10 * sizeof (obj).) Can you do so with this placement operator new[]? For example, the following code would seem to do so:
obj *
f ()
{
void *p = special_malloc (sizeof (obj[10]));
return new (p) obj[10]; // Serious trouble...
}
Unfortunately, this code is incorrect. In general, there is no guarantee that the size_t argument passed to operator new[] really corresponds to the size of the array being allocated.
But as he highlights by supplying the definition, the size argument is not used in the allocation function. The allocation function does nothing - and so the only effect of the above placement expression is to call the constructor for the 10 array elements, as you would expect.
There are other issues with this code, but not the one the author listed.
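A way to sidestep the array form altogether is to construct the elements one at a time with the single-object placement new, and to destroy them explicitly afterwards. The sketch below assumes ::operator new as the raw allocator in place of the article's special_malloc, and uses a hypothetical obj type that counts live instances.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with a non-trivial constructor and destructor.
struct obj {
    static int live;
    int id;
    obj() : id(live++) {}
    ~obj() { --live; }
};

int obj::live = 0;

// Constructs n objects in raw storage, then tears everything down again.
// Returns how many objects were alive at the peak.
int construct_and_destroy(std::size_t n) {
    void* raw = ::operator new(n * sizeof(obj));  // storage only, no constructors
    obj* first = static_cast<obj*>(raw);

    std::size_t built = 0;
    for (; built < n; ++built)
        new (first + built) obj;                  // single-object placement new

    int const peak = obj::live;

    while (built > 0)
        first[--built].~obj();                    // destroy in reverse order
    ::operator delete(raw);                       // release the raw storage
    return peak;
}
```

Note that the base pointer is kept separate from the loop index, so the start of the block is still available for destruction and deallocation.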