const input vs non-const output


I have a function which traverses a tree of objects and does not modify any of the objects in the tree.
The function looks something like this:
static Node* findMatchingNode(const Node& root, const SomeFilterData& d);
struct Node {
Node* left;
Node* right;
};
The function can return the root, any other object in the tree, or nothing. It's obvious that with the given declaration I have to perform a const_cast somewhere, which is forbidden in most cases.
Is it acceptable for a function to guarantee constness and at the same time allow anyone to modify its output?
EDIT. I haven't stated this clearly: my function really does not modify any of the nodes in the tree and doesn't create a new Node; it's a pure function. I'd like to always have the const qualifier there to tell everyone clearly that the function doesn't modify anything.
EDIT. The unspoken problem behind this is that there is no legal way to express constness of the input during the function execution (and only inside the function) without enforcing the output constness.

As always, you cannot modify const data because it is constant. In particular:
Even though const_cast may remove constness or volatility from any pointer or reference, using the resulting pointer or reference to write to an object that was declared const or to access an object that was declared volatile invokes undefined behavior.
(from here)
So if a pointer to the const input argument root is a sensible outcome, the return type should be const as well. You should be really careful with returning pointers anyway: As of now, your function can return a pointer to a temporary!
So better return a boost::optional<Node> or a std::optional<Node> (the latter is available since C++17, if you use that standard).
If your function only makes sense for a modifiable input root (e.g. if the outcome must be modifiable and the outcome can be the address of root), drop the const in your declaration. That would also prevent the input root from being a temporary.
Most likely the cleanest fix for your potential XY-problem:
If it fits your use case (I find it highly unlikely that it does not), an even better option would most likely be to define your algorithm on an iterable data structure such as a tree or a list (whatever your Node is a node of) and then return an iterator to the matching node, or a past-the-end iterator (with respect to the high-level structure) if no such node exists.
As for the const-vs.-non-const discussion: with the iterator option you can differentiate between the constness of the high-level structure and that of the reference node you compare against, by taking non-const iterators over the high-level structure and a const & argument for the reference input node (where a temporary would no longer be a problem).
From what I can see from your question, your function might even be replaceable with std::find_if. Then you would not have this problem in the first place.
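To illustrate the iterator idea, here is a minimal sketch, assuming the nodes live in a std::vector. The Item type, its key member and the findByKey name are hypothetical and not taken from the question. The constness of the result follows the constness of the container that is passed in, so no const_cast is needed.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type; the question does not show what a node stores.
struct Item {
    int key;
};

// Mutable container in, mutable iterator out.
std::vector<Item>::iterator findByKey(std::vector<Item>& items, int key) {
    return std::find_if(items.begin(), items.end(),
                        [key](const Item& item) { return item.key == key; });
}

// Const container in, const_iterator out: the caller cannot modify the match.
std::vector<Item>::const_iterator findByKey(const std::vector<Item>& items, int key) {
    return std::find_if(items.begin(), items.end(),
                        [key](const Item& item) { return item.key == key; });
}
```

A failed search returns the container's end() iterator, which plays the role of the "nothing" result.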

Your code is not const-correct, it drops a const. This is why you have the problem of a "required" const cast. Don't do that:
static const Node* findMatchingNode(const Node& root, const SomeFilterData& d);
You may point out that you may want to call this function from another function that does modify the nodes, and thus wants a non-const result. As such, that should be a new function with a totally different signature:
static Node* findMatchingNode(Node& root, const SomeFilterData& d);
You may point out that these have identical bodies, and there's the DRY principle (DRY = Don't Repeat Yourself = don't have copy-pasted code). Yes, so just take a shortcut here: const_cast. I think it's OK in this case, because its only purpose is to share code, and it's clear that it's not violating any const-correctness principles.
//This function does not modify anything
static Node* findMatchingNode(Node& root, const SomeFilterData& d) {
return const_cast<Node*>(findMatchingNode(const_cast<const Node&>(root),d));
}
Loki Asari suggests adding a third private function findMatchingNodeCommon() that both versions of findMatchingNode() call. Then you can get away with one less const_cast. If you want to go extreme then you can make findMatchingNodeCommon() templated and then all const_casts go away. I think it's not worth the bother, but they're very valid opinions, so worth a mention.
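A minimal sketch of the templated variant, assuming a Node with an int value member and a SomeFilterData with a single wanted field. Both are hypothetical, since the question shows neither a payload nor the filter. The template parameter is deduced as Node or const Node, so the returned pointer carries the same constness as the input and no const_cast is left.

```cpp
#include <type_traits>

// Hypothetical payload: the Node in the question only has left and right.
struct Node {
    int value = 0;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Hypothetical filter: match nodes whose value equals wanted.
struct SomeFilterData {
    int wanted;
};

// Shared pre-order traversal. NodeT is either Node or const Node.
template <typename NodeT>
NodeT* findMatchingNodeCommon(NodeT& root, const SomeFilterData& d) {
    static_assert(std::is_same<typename std::remove_const<NodeT>::type, Node>::value,
                  "NodeT must be Node or const Node");
    if (root.value == d.wanted) return &root;
    if (root.left) {
        if (NodeT* hit = findMatchingNodeCommon<NodeT>(*root.left, d)) return hit;
    }
    if (root.right) {
        if (NodeT* hit = findMatchingNodeCommon<NodeT>(*root.right, d)) return hit;
    }
    return nullptr;  // no match in this subtree
}

// Const tree in, pointer-to-const out.
const Node* findMatchingNode(const Node& root, const SomeFilterData& d) {
    return findMatchingNodeCommon(root, d);
}

// Mutable tree in, mutable pointer out.
Node* findMatchingNode(Node& root, const SomeFilterData& d) {
    return findMatchingNodeCommon(root, d);
}
```

The two public overloads stay trivial, and the compiler checks that the shared body never writes through a const Node.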

According to the Standard C++ Foundation:
What is the relationship between a return-by-reference and a const member function?
If you want to return a member of your this object by reference from an inspector method, you should return it using reference-to-const (const X& inspect() const) or by value (X inspect() const).
So yeah, you should be returning either by value or a const reference/pointer. You aren't using this but the pattern is the same.

I'd look at how the STL does it. There, you have a sequence (delimited by iterators), a predicate that defines a match, and a template function that searches. Since it's a template function, it automatically matches the return value's const-qualifiers to those of the parameters.
If, for some reason, you don't want to write a template function, you could also add two overloads, because you can overload functions with different const-qualifiers.

I'd like to answer this myself, as the original question was not phrased properly. So my answer can't be treated as a proper one.
There is no legit way to express constness of the input (during the function execution) without forcing a user to not modify the output (and without enforcing the output constness).
A way without const_cast:
static Node* findMatchingNode(Node& root, const SomeFilterData& d);
doesn't look like it guarantees the input tree constness.
The original function declaration allows hacks like this, which is also inappropriate:
const Node& root = ...
Node* result = findMatchingNode(root, filter);
result->doBadNonConstThings();

Related

C++ overloading the equality operator. Should I write my function to accept argument passed by reference or value?

I want to overload the == operator for a simple struct
struct MyStruct {
public:
int a;
float b;
bool operator==( ) { }
}
All the examples I'm seeing seem to pass the value by reference using a &.
But I really want to pass these structs by value.
Is there anything wrong with me writing this as
bool operator== (MyStruct another) { return ( (a==another.a) && (b==another.b) ); }

It should really not matter, except that you pay the penalty of a copy when you pass by value. This matters if the struct is really heavy. In the simple example you quote, there may not be a big difference.
That being said, passing by const reference makes more sense since it expresses the intent of the overloaded function == clearly. const makes sure that the overloaded function accidentally doesn't modify the object and passing by reference saves you from making a copy. For == operator, there is no need to pass a copy just for comparison purposes.
If you are concerned about consistency, it's better to switch the other pass by value instances to pass by const ref.
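For comparison, a minimal sketch of the const-reference form for the struct in the question. The trailing const on the member function is an addition that the question's version lacks; it lets the operator be called on const objects as well.

```cpp
struct MyStruct {
    int a;
    float b;

    // Parameter by const reference: no copy is made and `other` cannot be modified.
    // Trailing const: the left-hand operand is not modified either.
    bool operator==(const MyStruct& other) const {
        return a == other.a && b == other.b;
    }
};
```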

While being consistent is laudable goal, one shouldn't overdo it. A program containing only 'A' characters would be very consistent, but hardly useful. Argument passing mechanism is not something you do out of consistency, it is a technical decision based on certain technical aspects.
For example, in your case, passing by value could potentially lead to better performance, since the struct is small enough and on AMD64 ABI (the one which is used on any 64bit Intel/AMD chip) it will be passed in a register, thus saving time normally associated with dereferencing.
On the other hand, in your case, it is reasonable to assume that the function will be inlined, and the passing scheme will not matter at all (since nothing will be passed). This is shown by the codegen here (no call to operator== exists in the generated assembly): https://gcc.godbolt.org/z/G7oEgE

What in the world is T*& return type

I have been looking at vector implementations and stumbled upon a line that confuses me as a naive C++ learner.
What is T*& return type?
Is this merely a reference to a pointer?
Why would this be useful then?
link to code: https://github.com/questor/eastl/blob/56beffd7184d4d1b3deb6929f1a1cdbb4fd794fd/vector.h#L146
T*& internalCapacityPtr() EASTL_NOEXCEPT { return mCapacityAllocator.first(); }

It's a reference-to-a-pointer to a value of type T which is passed as a template argument, or rather:
There exists an instance of VectorBase<T> where T is specified by the program, T could be int, string or anything.
The T value exists as an item inside the vector.
A pointer to the item can be created: T* pointer = &this->itemValues[123]
You can then create a reference to this pointer: https://msdn.microsoft.com/en-us/library/1sf8shae.aspx?f=255&MSPPError=-2147217396
Correct
If you need to use a value "indirectly" then references-to-pointers are cheaper to use than pointer-to-pointers as the compiler/CPU doesn't need to perform a double-indirection.

http://c-faq.com/decl/spiral.anderson.html
This would be a reference to a pointer of type T. References to pointers can be a bit tricky but are used a lot with smart pointers when using a reference saves an increment to the reference counter.

Types in C++ should be read from right to left. Following this, it becomes a: Reference to a pointer of T. So your assumption is correct.
References to pointers are very useful, this is often used as an output argument or an in-out argument. Let's consider a specific case of std::swap
template <typename T>
void swap(T*& lhs, T*& rhs) {
T *tmp = rhs;
rhs = lhs;
lhs = tmp;
}
As with every type, it can be used as a return value. In the standard library, you can find this return type for std::vector<int *>::operator[], allowing v[0] = nullptr.
On the projects that I've worked on, I haven't seen many usages of this kind of getter that allows changing the internals. However, it does allow you to write a single method for reading and writing the value of the member.
In my opinion, it is a code smell, as it makes it harder to understand which callers actually modify the member.
The story is of course different when returning a const reference to a member, as that might prevent copies. Though preventing the copy of a pointer doesn't add value.
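A minimal sketch of such a getter, assuming a hypothetical TinyBuffer class with a single pointer member. This is not the EASTL code, only an illustration of what returning T*& allows: the caller can read the stored pointer and also assign a new one through the returned reference.

```cpp
// Hypothetical class, loosely modelled on the idea of an internal pointer getter.
template <typename T>
class TinyBuffer {
public:
    // Read-write access: the caller gets the member itself, not a copy of it.
    T*& endPtr() { return mEnd; }

    // Read-only access for const objects: the pointer cannot be reseated.
    T* const& endPtr() const { return mEnd; }

private:
    T* mEnd = nullptr;
};
```

With this, b.endPtr() = p; changes the member inside b, whereas a getter returning plain T* would hand out a copy and the assignment would not compile.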

About the implementations of c++ stl predicate

I wonder how c++ stl predicate is implemented? For example in copy_if()
http://www.cplusplus.com/reference/algorithm/copy_if/
According to Effective STL, predicate is passed by value. For the following code for int,
struct my_predicate{
int var_1;
float var_2;
bool operator()(const int& arg){
// some processing here
}
}
How is copy_if() implemented regarding to passing value of my_predicate? There are var_1 and var_2 here. For other predicates, there may be different variables in the struct.
If passing by reference or pointer, that is very reasonable to me.
Thanks a lot!

(I hope I'm not misunderstanding your question.)
The reason why it can be passed by value is that the 'my_predicate' struct has an implicit copy constructor automatically generated by the compiler. You can pass it by value because it has a copy constructor.
In practice, it is very likely the compiler will optimise away the copy. In fact, it is very likely the compiler will optimise away the entire predicate object and, for example in the case of std::copy_if, reduce the code to the equivalent of a for loop plus an if statement.
By convention predicates are passed by value. They are not meant to be heavy weight objects and for small objects even if the entire predicate isn't optimised away, it is faster to pass by value anyway.
Also generally you cannot pass temporary values by non-const reference (let alone pointer) so:
std::copy_if(begin(..),end(..),my_predicate{});
would not compile as your predicate is not a const function. With pass by value you can get away with this.
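To make the "for loop + if statement" point concrete, here is a sketch of how an algorithm like copy_if can be written. It is not the code of any particular standard library; copy_if_sketch and InRange are hypothetical names. The predicate is a template parameter taken by value, so any copyable type with a suitable operator() works, whatever member variables it has.

```cpp
#include <iterator>
#include <vector>

// Sketch of a copy_if-style algorithm: the predicate type is deduced,
// and the predicate object is copied into the parameter `pred`.
template <typename InputIt, typename OutputIt, typename UnaryPredicate>
OutputIt copy_if_sketch(InputIt first, InputIt last, OutputIt out, UnaryPredicate pred) {
    for (; first != last; ++first) {
        if (pred(*first)) {
            *out = *first;
            ++out;
        }
    }
    return out;
}

// Hypothetical predicate with state, similar in shape to my_predicate.
struct InRange {
    int low;
    int high;
    bool operator()(const int& x) const { return low <= x && x <= high; }
};
```

A call such as copy_if_sketch(src.begin(), src.end(), std::back_inserter(dst), InRange{4, 12}) passes a temporary predicate, which works because the parameter is taken by value.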

comparator for sorting a vector containing pointers to objects of custom class

By this question I am also trying to understand fundamentals of C++, as I am very new to C++. There are many good answers to problem of sorting a vector/list of custom classes, like this. In all of the examples the signature of comparator functions passed to sort are like this:
(const ClassType& obj1, const ClassType& obj2)
Is this signature mandatory for comparator functions? Or we can give some thing like this also:
(ClassType obj1, ClassType obj2)
Assuming I will modify the body of comparator accordingly.
If the first signature is mandatory, then why?
I want to understand the reasons behind using const and the reference '&'.
What I can think of is that const is there because you don't want the comparator function to be able to modify the elements, and the reference is there so that no extra copies are created.
How should my signature look if I want to sort a vector which contains pointers to objects of a custom class? Like (1) or (2) (see below), or will both work?
The vector to be sorted is of type vector<ClassType*>
(1)
(const ClassType*& ptr1, const ClassType*& ptr2)
(2)
(ClassType* ptr1, ClassType* ptr2)

I recommend looking through This Documentation.
It explains that the signature of the compare function must be equivalent to:
bool cmp(const Type1& a, const Type2& b);
To be more precise, it then goes on to explain that each parameter needs to be of a type that is implicitly convertible from the object obtained by dereferencing an iterator passed to the sort function.
So if your iterator is std::vector<ClassType*>::iterator then your arguments need to be implicitly convertible from ClassType*.
If you are using something relatively small like an int or a pointer then I would accept them by value:
bool cmp(const ClassType* ptr1, const ClassType* ptr2) // this is more efficient
NOTE: I made them pointers to const because a sort function should not modify the values it is sorting.

(ClassType obj1, ClassType obj2)
In most situations this signature will also work for comparators. The reason it is not used is that it passes the objects by value, which requires the objects to be copied.
This will be a complete waste. The comparator function does not need to have its own copies of its parameters. All it needs are references to two objects it needs to compare, that's it. Additionally, a comparator function does not need to modify the objects it is comparing. It should not do that. Hence, explicitly using a const reference forces the compiler to issue a compilation error, if the comparator function is coded, in error, to modify the object.
And one situation where this will definitely not work is for classes that have deleted copy constructors. Instances of those classes cannot be copied, at all. You can still emplace them into the containers, but they cannot be copied. But they still can be compared.

const is so you know not to change the values while you're comparing them. Reference is because you don't want to make a copy of the value while you're trying to compare them -- they may not even be copyable.
It should look like your first example -- it's always a reference to the const type of the elements of the vector.
If you have vector<T>, it's always:
T const & left, T const & right
So, if T is a pointer, then the signature for the comparison includes the pointer.

There's nothing really special about the STL. I use it for two main reasons, as a slightly more convenient array (std::vector) and because a balanced binary search tree is a hassle to implement. STL has a standard signature for comparators, so all the algorithms are written to operate on the '<' operation (so they test for equality with if(!( a < b || b < a)) ). They could just as easily have chosen the '>' operation or the C qsort() convention, and you can write your own templated sort routines to do that if you want. However it's easier to use C++ if everything uses the same conventions.
The comparators take const references because a comparator shouldn't modify what it is comparing, and because references are more efficient for objects than passing by value. If you just want to sort integers (rarely you need to sort just raw integers in a real program, though it's often done as an exercise) you can quite possibly write your own sort that passes by value and is a tiny bit faster than the STL sort as a consequence.

You can define the comparator with the following signature:
bool com(ClassType* const & lhs, ClassType* const & rhs);
Note the difference from your first option. (What is needed is a const reference to a ClassType* instead of a reference to a const ClassType*)
The second option should also be good.
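A minimal sketch of sorting a vector of pointers with a by-value pointer comparator. The rank member and the byRank and sortByRank names are hypothetical, since the question does not show ClassType. Option (1) from the question does not work because a ClassType* element cannot bind to a non-const reference to const ClassType*.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical class; the question does not show its members.
struct ClassType {
    int rank;
};

// Pointers are cheap to copy, so they are taken by value.
// Pointer-to-const: the comparator cannot modify the pointed-to objects.
bool byRank(const ClassType* lhs, const ClassType* rhs) {
    return lhs->rank < rhs->rank;
}

// Sorts the pointers by the rank of the objects they point to.
void sortByRank(std::vector<ClassType*>& v) {
    std::sort(v.begin(), v.end(), byRank);
}
```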

What is the difference between these two parameters in C++?

I am new to C++ and currently am learning about templates and iterators.
I saw some code implementing custom iterators and I'm curious to know what the difference between these two iterator parameters is:
iterator & operator=(iterator i) { ... i.someVar }
bool operator==(const iterator & i) { ... i.someVar }
They implement the = and == operators for the particular iterator. Assuming the iterator class has a member variable 'someVar', why is one operator implemented using "iterator i" and another with "iterator & i"? Is there any difference between the two "i.someVar" expressions?
I googled a little and found this question
Address of array - difference between having an ampersand and no ampersand
to which the answer was "the array is converted to a pointer and its value is the address of the first thing in the array." I'm not sure this is related, but it seems like the only valid explanation I could find.
Thank you!

operator= takes its argument by value (a.k.a. by copy). operator == takes its argument by const reference (a.k.a. by address, albeit with a guarantee that the object will not be modified).
An iterator may be/contain a pointer into an array but it is not itself an array.
The ampersand (&) has different contextual meanings. Used in an expression, it behaves as an operator. Used in a declaration such as iterator & i, it forms part of the type iterator & and indicates that i is a reference, as opposed to an object.
For more discussion (with pictures!), see Pass by Reference / Value in C++ and What's the difference between passing by reference vs. passing by value? (this one is language agnostic).

The assignment operator = takes the iterator i by value, which means a copy of the original iterator is made and passed to the function, so any changes applied to the iterator i inside the operator method won't affect the original.
The comparison operator == takes a constant reference, which denotes that the original object can't/shouldn't be changed in the method. This makes sense since a comparison operator usually only compares objects without changing them. The reference refers to the original iterator, which lives outside the method. This means that the actual object won't be copied, which is usually faster.

First, you don't have an address of an array here.
There's no semantic difference, unless you try to make a local change to the local variable i: iterator i will allow a local change, while const iterator & i will not.
Many people are used to writing const type & var for function parameters because passing by reference can be faster than by value, especially if type is big and expensive to copy, but in your case, an iterator should be small and cheap to copy, so there's no gain from avoiding copying. (Actually, having a local copy can enhance locality of reference and help optimization, so I would just pass small values by value (by copying).)
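A small sketch that puts the two parameter styles side by side, assuming a hypothetical iterator_sketch class whose someVar is a raw pointer (the question does not show the real class). Taking the argument of operator= by value is what makes the copy-and-swap style possible: the function may freely modify its local copy i.

```cpp
#include <utility>

// Hypothetical iterator-like class with the member named in the question.
class iterator_sketch {
public:
    explicit iterator_sketch(int* p = nullptr) : someVar(p) {}
    iterator_sketch(const iterator_sketch&) = default;

    // By value: i is a local copy, so swapping with it leaves the caller's object untouched.
    iterator_sketch& operator=(iterator_sketch i) {
        std::swap(someVar, i.someVar);
        return *this;
    }

    // By const reference: no copy is made, and i cannot be modified here.
    bool operator==(const iterator_sketch& i) const {
        return someVar == i.someVar;
    }

    int* someVar;
};
```

In both functions the expression i.someVar reads the same member; the only difference is whether i is a copy or an alias of the caller's object.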
