What exactly do the "==" and "is" operators compare in D?

From the "Programming in D" book I learnt that the == operator needs to access the objects in order to evaluate the expressions on the left and on the right before returning the boolean value. Thus it is not suitable for checking whether an object is null.
... the == operator may need to consult the values of the members of the objects and that attempting to access the members through a potentially null variable would cause a memory access error.
also
Value equality: The == operator that appears in many examples throughout
the book compares variables by their values. When two variables are said to
be equal in that sense, their values are equal.
So, let's try the following:
import std.experimental.all;
int[] arr = [1, 1, 2, 2, 3];
arr == arr.reverse.array; // --> true
Well, this is unexpected. In Scala for instance the same expression returns False.
It becomes more clear after you check the memory address of both arr and arr.reverse.array -- it does not change. So, now the result of == makes kind of sense although one would expect it to compare values of the arrays not their addresses, right?
Now, let's try the is operator, which is used to compare object references and should be used to check whether an object is null. It is also used to compare class variables.
arr is arr.reverse.array; // --> false
I would expect it to return true as well since it compares references. What is actually going on here? Why does is return false while == returns true?

== DOES compare values. is DOES compare references. Your big mistake is using the function reverse.
http://dpldocs.info/experimental-docs/std.algorithm.mutation.reverse.html
Reverses r in-place.
emphasis mine. That means it modifies the contents of the original.
I suspect you are also checking the memory address wrong. If you are using &arr, you are comparing the address of the local variables, not the array contents. That doesn't change because it is the same local variable, you are just binding to a different array. Check .ptr instead of & and you will see it changes - the .array function always allocates a new array for it.
So == passed because reverse changed the left-hand side at the same time! It wasn't because [1,2,3] == [3,2,1], but rather because after calling reverse, the [1,2,3] was itself modified to [3,2,1], which is == [3,2,1]!
Now, as to what these operators actually do: == checks for some abstract quality of equality. This varies by type: it can be overridden by member functions (which is why calling it on null classes is problematic) and frequently does a member-by-member comparison (e.g. array elements or struct pieces).
is, on the other hand, does something far simpler: it is a bit comparison of the variable directly, which is closer to an abstract idea of identity, but not quite (for example, int a = 3; int b = 3; assert(a is b); passes because both are 3, but is it the same identity? That is fuzzy for value types.)
is will never call a user-defined function, and will never descend into member references, it just compares the bit values.
(interestingly, float.nan is float.nan also returns true, whereas == would not, again just because it compares bit values. But not all nans have the same bit value, so it is not a substitute for isNaN in the math module!)
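The difference between a bit comparison and == can be sketched in C++ as well. This is only an analogy, not D code: std::memcmp over the raw bytes stands in for is, and the helper names same_bits and nan_is_identical_but_not_equal are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// Hypothetical helper: compares the raw bits of two doubles,
// roughly what D's `is` does for a floating-point value.
bool same_bits(double a, double b) {
    return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// True when a NaN has the same bits as its copy but still fails ==.
bool nan_is_identical_but_not_equal() {
    const double n = std::numeric_limits<double>::quiet_NaN();
    const double copy = n;
    return same_bits(n, copy) && !(n == copy);
}
```

The reverse case also exists: 0.0 and -0.0 compare equal with == but have different bits, so a bit comparison and value equality can disagree in both directions.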

Related

C++ unordered_map/map [] operator default initialization value [duplicate]

I have a std::map like this:
map<wstring,int> Scores;
It stores names of players and scores. When someone gets a score I would simply do:
Scores[wstrPlayerName]++;
When there is no element in the map with the key wstrPlayerName it will create one, but does it initialize to zero or null before the increment or is it left undefined?
Should I test if the element exists every time before increment?
I just wondered because I thought primitive-type things are always undefined when created.
If I write something like:
int i;
i++;
The compiler warns me that i is undefined and when I run the program it is usually not zero.
operator[] looks like this:
Value& map<Key, Value>::operator[](const Key& key);
If you call it with a key that's not yet in the map, it will default-construct a new instance of Value, put it in the map under key you passed in, and return a reference to it. In this case, you've got:
map<wstring,int> Scores;
Scores[wstrPlayerName]++;
Value here is int, and a newly inserted int is value-initialized to 0, as if you had initialized it with int(). Other primitive types are initialized similarly (e.g., double(), long(), bool(), etc.).
In the end, your code puts a new pair (wstrPlayerName, 0) in the map, then returns a reference to the int, which you then increment. So, there's no need to test if the element exists yet if you want things to start from 0.
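A minimal sketch of that behaviour, assuming a hypothetical helper named tally that is not part of the original question:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Tally one point per name. operator[] inserts an int with value 0 the
// first time a name is seen, so no existence check is needed before ++.
std::map<std::wstring, int> tally(const std::vector<std::wstring>& names) {
    std::map<std::wstring, int> scores;
    for (const auto& name : names) {
        scores[name]++;
    }
    return scores;
}
```

Note that merely reading scores[name] for an unknown name also creates an entry with value 0, which is worth keeping in mind if the map is later iterated.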
This will default-construct a new instance of value. For integers, the default construction is 0, so this works as intended.
You should not test if the item exists before incrementing it. The [] operator does exactly what you need it to do, as others have said.
But what if the default-constructed value wouldn't work for you? In your case the best way to find if the element already exists is to try to insert it. The insert member function for std::map returns a std::pair<iterator, bool>. Whether the insert succeeds or fails, the first element of the pair will point to the desired object (either your new one, or the one that was already present). You can then alter its value as you see fit.
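The insert-based approach might look like the sketch below. The function record_point and its starting_score parameter are assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: give a player one point, starting new players at
// starting_score instead of 0. Returns true if a new entry was created.
bool record_point(std::map<std::string, int>& scores,
                  const std::string& name, int starting_score) {
    // insert() leaves an existing entry untouched and reports that in .second.
    auto result = scores.insert(std::make_pair(name, starting_score));
    // .first is an iterator to the entry, whether it is new or pre-existing.
    result.first->second += 1;
    return result.second;
}
```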
Check rules for initialization.
See section 4.9.5 Initialization of C++ Prog Lang or C++ std book. Depending on whether your variable is local, static, user-defined or const default initialization can happen.
In your case, int is a POD (Plain Old Data) type. An automatic POD variable (a local variable, created on the stack) is not initialized to any particular value by default. Hence your "i" above will not reliably have the value zero.
Always make a habit of initializing POD variables when they are defined locally. You can even use int() to initialize the value.
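A few ways of giving a local int a defined starting value, as a small sketch (the function names are only for illustration):

```cpp
#include <cassert>

// Each of these forms gives the local variable a defined value of zero.
int zero_from_parens() { int i = int(); return i; }
int zero_from_braces() { int i{};       return i; }
int zero_explicit()    { int i = 0;     return i; }

// Incrementing is then well defined.
int one_after_increment() {
    int i{};
    i++;
    return i;
}
```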

Char array copy fails

I'm copying an array, and for some reason the values aren't the same after the copy. The code is below. In both cases, the _data variable is a char[4]. After the copy, the assert fires. If I examine the two values in the debugger, they show as: 0x00000000015700a8 and 0x00000000015700b0.
_data[0] = rhsG->_data[0];
_data[1] = rhsG->_data[1];
_data[2] = rhsG->_data[2];
_data[3] = rhsG->_data[3];
assert(_data == rhsG->_data);
You've made the mistake of thinking C++ is an easy-to-use high-level language (joke). operator == on C-style arrays compares their addresses, which of course are different here. You can use std::equal to compare the two arrays, or use a different data structure which supports a more intuitive operator ==, such as std::array or std::vector.
You could then also use their operator = to copy them, instead of each element one at a time, assuming the source and destination are the same size. There is std::copy if they are not, or they must be C-style arrays.
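With std::array, that could look like the following sketch. Block is a hypothetical stand-in for the class in the question, and the member is called data rather than _data.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the class in the question.
struct Block {
    std::array<char, 4> data;
};

// operator= copies all four elements; operator== compares them one by one.
bool copy_then_compare(Block& dst, const Block& src) {
    dst.data = src.data;
    return dst.data == src.data;
}
```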
If you compare with ==, you are just comparing two pointers, whose values are different. If you want to compare two arrays for equality, you can use memcmp():
assert( ! memcmp(_data, rhsG->_data, 4) );
When you use operator == in the assert "_data == rhsG->_data", both _data and rhsG->_data are converted to the addresses of the arrays. So, in your debugger, 0x00000000015700a8 is the address of _data and 0x00000000015700b0 is the address of rhsG->_data. Obviously, they are different, so the assert fires.
After all, in most expressions an array name decays to a pointer to the array's first element.
"_data == rhsG->_data" does not compare the individual elements of two arrays.
The "==" operator is not defined for arrays, so the two parameters are decayed to pointers, which == can work on.
Your assert is comparing the addresses of the two arrays which are different because they are in different memory locations.
If you really want to compare the values, then either loop over them or use memcmp.
assert(memcmp(_data, rhsG->_data, 4) == 0);
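If the members have to stay C-style arrays, both suggestions can be wrapped up as below. This is a sketch with made-up helper names; taking the arrays by reference avoids the decay to a pointer, so the length is still known inside the function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Element-wise comparison of two char[4] arrays with std::equal.
bool same_contents(const char (&a)[4], const char (&b)[4]) {
    return std::equal(std::begin(a), std::end(a), std::begin(b));
}

// The memcmp form, with the byte count taken from the array itself.
bool same_bytes(const char (&a)[4], const char (&b)[4]) {
    return std::memcmp(a, b, sizeof a) == 0;
}
```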

Mechanism of using sets as keys to map

I have this code which I do not understand why it works:
map<set<int>,int> states;
set<int> s1 = {5,1,3}, s2 = {1,5,3};
states[s1] = 42;
printf("%d", states[s2]); // 42
The output is 42, so the values of the states key are used for comparison somehow. How is this possible? I'd expect this not to work as in the similar example:
map<const char*,int> states;
char s1[]="foo", s2[]="foo";
states[s1] = 42;
printf("%d",states[s2]); // not 42
Here the address of the char pointer is used as the key, not the value of the memory where it points, right? Please explain what's the difference between these two samples.
Edit: I've just found something about comparison object which explains a lot. But how is the comparison object created for sets? I can't see how it could be the default less object.
You were onto something with the comparison object.
One of the template parameters for a C++ map defines what acts as the comparison predicate, by default std::less<Key>, in this case std::less<set<int>>.
From cplusplus.com:
The map object uses this expression to determine both the order the elements follow in the container and whether two element keys are equivalent (by comparing them reflexively: they are equivalent if !comp(a,b) && !comp(b,a)). No two elements in a map container can have equivalent keys.
std::less:
Binary function object class whose call returns whether its first argument compares less than the second (as returned by operator <).
std::set::key_comp:
By default, this is a less object, which returns the same as operator<.
Now, what does the less-than operator do?:
The less-than comparison (operator<) behaves as if using algorithm lexicographical_compare, which compares the elements sequentially using operator< in a reciprocal manner (i.e., checking both a<b and b<a) and stopping at the first occurrence.
Or from MSDN:
The comparison between set objects is based on a pairwise comparison of their elements. The less-than relationship between two objects is based on a comparison of the first pair of unequal elements.
So, since both sets are equivalent, because sets are ordered by key, using either as key refers to the same entry in the map.
But the second example uses a pointer as key, so the two equivalent values aren't equal as far as the map goes because the operator< in this case isn't defined in any special way, it's just a comparison between addresses. If you had used a std::string as key, they would have matched, though (because they do have their own operator<).
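The three key types can be put side by side in a short sketch (the function names are invented for the example):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Keys are std::set<int>: two sets with the same elements are equivalent keys.
int lookup_with_set_keys() {
    std::map<std::set<int>, int> states;
    std::set<int> s1 = {5, 1, 3}, s2 = {1, 5, 3};
    states[s1] = 42;
    return states[s2];   // finds the entry stored under s1
}

// Keys are const char*: only the addresses are compared, not the characters.
int lookup_with_pointer_keys() {
    std::map<const char*, int> states;
    char s1[] = "foo", s2[] = "foo";   // two distinct arrays
    states[s1] = 42;
    return states[s2];   // different address, so a new entry with value 0 is created
}

// Keys are std::string: operator< compares the characters.
int lookup_with_string_keys() {
    std::map<std::string, int> states;
    char s1[] = "foo", s2[] = "foo";
    states[s1] = 42;
    return states[s2];   // same text, so the entry stored under s1 is found
}
```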
A few reasons for this behavior:
Sets are kept sorted as you input values, so s1 and s2 actually look the same if you print them out.
Sets have overloaded operators so that s1 == s2 compares the contents instead.
The second example with char* explicitly puts the pointer (address) in as the key.
Declaring the two char arrays gives you different addresses for the two instances of "foo".
If you made s2 refer to the same array as s1 instead, you should get the same behavior.
looking at the documentation on cppreference.com:
template<
class Key,
class T,
class Compare = std::less<Key>,
class Allocator = std::allocator<std::pair<const Key, T> >
> class map;
so if you don't supply your own custom comparison function, it orders the elements by the < operator, and:
operator==,!=,<,<=,>,>=(std::set)...
Compares the contents of lhs and rhs
lexicographically. The comparison is
performed by a function equivalent to
std::lexicographical_compare.
so when you call < on two sets it compares their sorted elements lexicographically. your two sets, when sorted, are exactly the same.
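The equivalence rule that std::map applies to keys can be written out directly. This is a sketch with a hypothetical helper named equivalent, using the default std::less comparison:

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Two keys are equivalent when neither one orders before the other.
template <typename Key>
bool equivalent(const Key& a, const Key& b) {
    std::less<Key> comp;
    return !comp(a, b) && !comp(b, a);
}
```

Called with the sets {5,1,3} and {1,5,3} this returns true, because both hold 1, 3, 5 in sorted order. Called with two std::vector<int> holding the same numbers in those two orders it returns false, because a vector keeps its insertion order.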
When you do:
cout << (s1 == s2);
You get an output of 1. I don't know how sets in C++ work; I don't even know what they are. I do know that the == comparison operator does not return whether or not the values of the sets are in the same order. Likely, it returns whether or not the sets contain the same values, regardless of order.
Edit: Yeah, the sets are sorted. So when you create them, they get sorted, which means they become the same thing. Use a std::vector or a std::array
Edit #2: They do become equal. Let me create an analogy.
int v1 = 4 + 1;
int v2 = 1 + 4;
Now obviously, v1 and v2 would be the same value. Sets are no different; once created, the contents are sorted, and when you compare them using the == operator, it returns "yes, they are equal". (std::map itself identifies keys with the < comparison rather than ==, but for these sorted sets the outcome is the same.) That is why your code works how it works.

Is there a reason for zero sized std::array in C++11?

Consider the following piece of code, which is perfectly acceptable by a C++11 compiler:
#include <array>
#include <iostream>
auto main() -> int {
std::array<double, 0> A;
for(auto i : A) std::cout << i << std::endl;
return 0;
}
According to the standard § 23.3.2.8 [Zero sized arrays]:
1 Array shall provide support for the special case N == 0.
2 In the case that N == 0, begin() == end() == unique value. The return value of
data() is unspecified.
3 The effect of calling front() or back() for a zero-sized array is undefined.
4 Member function swap() shall have a noexcept-specification which is equivalent to
noexcept(true).
As displayed above, zero-sized std::arrays are perfectly allowable in C++11, in contrast with zero-sized built-in arrays (e.g., int A[0];), which are explicitly forbidden, yet are allowed by some compilers (e.g., GCC) at the cost of undefined behaviour.
Considering this "contradiction", I have the following questions:
Why did the C++ committee decide to allow zero-sized std::arrays?
Are there any valuable uses?
If you have a generic function, it is bad if that function randomly breaks for special parameters. For example, let's say you have a template function that takes N random elements from a vector:
template<typename T, size_t N>
std::array<T, N> choose(const std::vector<T> &v) {
...
}
Nothing is gained if this causes undefined behavior or compiler errors if N for some reason turns out to be zero.
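As a rough illustration, here is a simplified stand-in for choose. The name take_first is hypothetical, and it copies the first N elements rather than picking random ones; it also assumes the vector holds at least N elements. The point is only that no special case is needed for N == 0.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for choose(): returns the first N elements of v.
// Assumes v.size() >= N. Works unchanged when N is zero.
template <typename T, std::size_t N>
std::array<T, N> take_first(const std::vector<T>& v) {
    std::array<T, N> out{};
    std::copy_n(v.begin(), N, out.begin());
    return out;
}
```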
For raw arrays, a reason behind the restriction is that you don't want types with sizeof T == 0, because this leads to strange effects in combination with pointer arithmetic. An array with zero elements would have size zero if you didn't add any special rules for it.
But std::array<> is a class, and classes always have size > 0. So you don't run into those problems with std::array<>, and a consistent interface without an arbitrary restriction of the template parameter is preferable.
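Both points can be checked with a few lines. This is a sketch; count_elements is a made-up helper.

```cpp
#include <array>
#include <cassert>

// A zero-length std::array is still an ordinary class type with nonzero size.
static_assert(sizeof(std::array<double, 0>) > 0,
              "class types always occupy storage");

// Counts the elements visited by a range-based for loop.
template <typename Array>
int count_elements(const Array& a) {
    int n = 0;
    for (const auto& element : a) {
        (void)element;   // only the number of iterations matters here
        ++n;
    }
    return n;
}
```

For a std::array<double, 0> the loop body never runs, since begin() == end().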
One use that I can think of is that a function can return a zero-length array, and there is functionality to check for that case specifically.
For example see the documentation on the std::array function empty(). It has the following return value:
true if the array size is 0, false otherwise.
http://www.cplusplus.com/reference/array/array/empty/
I think the ability to return and check for zero-length arrays is in line with how other STL types (e.g., vectors and maps) behave, and is therefore useful.
As with other container classes, it is useful to be able to have an object that represents an array of things, and to have it possible for that array to be or become empty. If that were not possible, then one would need to create another object, or a managing class, to represent that state in a legal way. Having that ability already contained in all container classes, is very helpful. In using it, one then just needs to be in the habit of relating to the array as a container that might be empty, and checking the size or index before referring to a member of it in cases where it might not point to anything.
There are actually quite a few cases where you want to be able to do this. It's present in a lot of other languages too. For example Java actually has Collections.emptyList() which returns a list which is not only size zero but cannot be expanded or resized or modified.
An example usage might be if you had a class representing a bus and a list of passengers within that class. The list might be lazy initialized, only created when passengers board. If someone calls getPassengers() though then an empty list can be returned rather than creating a new list each time just to report empty.
Returning null would also work for the internal efficiency of the class - but would then make life a lot more complicated for everyone using the class since whenever you call getPassengers() you would need to null check the result. Instead if you get an empty list back then so long as your code doesn't make assumptions that the list is not empty you don't need any special code to handle it being null.

A few C++ vector questions

I'm trying to learn some C++; to start off I created some methods to handle outputting to and reading from a console.
I'm having 2 major problems, marked in the code, manipulating/accessing values within a std::vector of strings passed in by reference.
The method below takes in a question (a std::string) to ask the user and a vector of std::strings containing the user responses that are deemed acceptable. I also wanted, in the interest of learning, to access a string within the vector and change its value.
std::string My_Namespace::My_Class::ask(std::string question, std::vector<std::string> *validInputs){
bool val = false;
std::string response;
while(!val){
//Ask and get a response
response = ask(question);
//Iterate through the acceptable responses looking for a match
for(unsigned int i = 0; i < validInputs->size(); i++){
if(response == validInputs->at(i)){
////1) Above condition always returns true/////
val = true;
break;
}
}
}
//////////2) does not print anything//////////
println(validInputs->at(0)); //note the println method is just cout << param << "\n" << std::endl
//Really I want to manipulate its value (not the pointer the actual value)
//So I'd want something analogous to validInputs.set(index, newVal); from java
///////////////////////////////////////////
}
A few additional questions:
3) I'm using .at(index) on the vector to get the value, but I've read that [] should be used instead; however, I'm not sure what that should look like (validInputs[i] doesn't compile).
4) I assume that since a deep copy is unnecessary it's good practice to pass in a pointer to the vector as above. Can someone verify that?
5) I've heard that ++i is better practice than i++ in loops. Is that true? Why?
3) There should not be a significant difference using at and operator[] in this case. Note that you have a pointer-to-vector, not a vector (nor reference-to-vector) so you will have to use either (*validInputs)[i] or validInputs->operator[](i) to use the operator overload. Using validInputs->at(i) is fine if you don't want to use either of these other approaches. (The at method will throw an exception if the argument is out of the array bounds, while the operator[] method has undefined behavior when the argument is out of the array bounds. Since operator[] skips the bounds check, it is faster if you know for a fact that i is within the vector's bounds. If you are not sure, use at and be prepared to catch an exception.)
4) A pointer is good, but a reference would be better. And if you're not modifying the vector in the method, a reference-to-const-vector would be best (std::vector<std::string> const &). This ensures that you cannot be passed a null pointer (references cannot be null), while also ensuring that you don't accidentally modify the vector.
5) It usually is. i++ is post-increment, which means that the original value must be copied, then i is incremented and the copy of the original value is returned. ++i increments i and then returns i, so it is usually faster, especially when dealing with complex iterators. With an unsigned int the compiler should be smart enough to realize that a pre-increment will be fine, but it's good to get into the practice of using ++i if you don't need the original, unincremented value of i.
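Points 3 and 4 can be sketched as follows. The helper names set_at and try_set_at are hypothetical; the first mirrors the pointer parameter from the question, the second uses a reference together with the bounds-checked at().

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Overwrite one element through a pointer-to-vector, roughly what Java's
// list.set(index, newVal) does. No bounds check is performed.
void set_at(std::vector<std::string>* inputs, std::size_t index,
            const std::string& value) {
    (*inputs)[index] = value;   // dereference first, then subscript
}

// The same through a reference, using at(). Returns false for a bad index
// instead of letting std::out_of_range escape.
bool try_set_at(std::vector<std::string>& inputs, std::size_t index,
                const std::string& value) {
    try {
        inputs.at(index) = value;
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```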
I'd use a reference-to-const, and std::find. Note that I also take the string by reference (it gets deep copied otherwise) :
std::string My_Class::
ask (const std::string& question, const std::vector<std::string>& validInputs)
{
for (;;) {
auto response = ask (question);
auto i = std::find (validInputs.begin (), validInputs.end (), response);
if (i != validInputs.end ()) {
std::cout << *i << '\n'; // Prints the value found
return *i;
}
}
}
Read about iterators if you don't understand the code. Of course, feel free to ask other questions if you need.
I'm not going to address points 1 and 2 since we don't know what you are doing and we don't even see the code for ask and println.
I'm using .at(index) on the vector to get the value, but I've read that [] should be used instead; however, I'm not sure what that should look like (validInputs[i] doesn't compile).
Subscript access and the at member function are different things. They give you the very same thing, a reference to the indexed element, but they behave differently if you pass an out-of-bounds index: at will throw an exception while [] will invoke undefined behavior (as built-in arrays do). Using [] on a pointer is somewhat ugly, (*validInputs)[i], but you really should avoid pointers when possible.
I assume that since a deep copy is unnecessary it's good practice to pass in a pointer to the vector as above. Can someone verify that?
A deep copy is unnecessary, but so is a pointer. You want a reference instead, and a const one since I presume you shouldn't be modifying those:
ask(std::string const& question, std::vector<std::string> const& validInputs)
I've heard that ++i is better practice than i++ in loops, is that true? why?
It's true in the general case. The two operations are different: ++i increments i and returns the new value, while i++ increments i but returns the value before the increment, which requires a temporary to be held and returned. For ints this hardly matters, but for potentially fat iterators pre-increment is more efficient and a better choice if you don't need or care for its return value.
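The conventional shape of the two operators makes the extra copy visible. Counter is a hypothetical type used only to illustrate this:

```cpp
#include <cassert>

// Hypothetical counter type with both increment operators.
struct Counter {
    int value = 0;

    // Pre-increment: modify the object and return a reference to it, no copy.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: a copy of the old state must be kept so it can be returned.
    Counter operator++(int) {
        Counter old = *this;
        ++value;
        return old;
    }
};
```

For a type this small the compiler will most likely optimize the copy away when the result is unused, which matches the remark that for ints it hardly matters.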
To answer questions 1 and 2, we'll probably need more information, like: How did you initialize validInputs? What's the source of ask?
3) First dereference the pointer, then index the vector:
(*validInputs)[i]
4) References are considered better style, especially in place of pointers that are never NULL.
5) For integers, it doesn't matter (unless you evaluate the result of the expression). For other objects, with overloaded ++ operators (iterators, for example) it may be better to use ++i. But in practice, for inline definitions of the ++ operator, it will probably be optimized to the same code.