Why are string::append operations behaving strangely? - c++

Look at the following simple code:
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s("1234567890");
    string::iterator i1 = s.begin();
    string::iterator i2 = s.begin();
    string s1, s2;
    s1.append(i1, ++i1);
    s2.append(++i2, s.end());
    cout << s1 << endl;
    cout << s2 << endl;
}
what would you expect the output to be?
would you, like me, expect it to be:
1
234567890
wrong!
it is an empty line followed by:
234567890
i.e. the first string is empty.
Seems that the prefix increment operator is problematic with iterators. Or am I missing something?

You're missing something: this really has nothing to do with iterators. The order in which arguments to a function are evaluated is unspecified. As such, your append(i1, ++i1); depends on unspecified behavior regardless of the type of i1. Just for example, given something a lot simpler like:
#include <iostream>

void print(int a, int b) {
    std::cout << a << " " << b << "\n";
}

int main() {
    int a = 0;
    print(a, ++a);
    return 0;
}
Your output could perfectly reasonably be the "0 1" you seem to expect, but it could also perfectly reasonably be: "1 1". Since this is unspecified, it could change from one version of the compiler to the next, or even with the same compiler when you change flags, or (in theory) could vary depending on the phase of the moon...

C++ implementations are free to evaluate arguments in any order. In this case, if ++i1 is evaluated first, you will get an empty string.

Not a bug.
The order in which the arguments to
s1.append(i1, ++i1);
are evaluated is not specified by the standard. The compiler is free to use any order it chooses. In this case, it evaluates the second argument (++i1) before the first (i1), so you end up specifying an empty range to copy.
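To make the intent explicit regardless of evaluation order, the increment can be sequenced in its own statement before the call. A minimal sketch of that rewrite (same names as in the question, plus a helper iterator first added purely for illustration):
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s("1234567890");

    string::iterator i1 = s.begin();
    string::iterator first = i1;   // remember the old position...
    ++i1;                          // ...then increment in a separate, sequenced statement
    string s1;
    s1.append(first, i1);          // well-defined: appends "1"

    string::iterator i2 = s.begin();
    ++i2;                          // increment before the call
    string s2;
    s2.append(i2, s.end());        // appends "234567890"

    cout << s1 << endl;            // 1
    cout << s2 << endl;            // 234567890
}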

The C++ standard does not specify the order in which function arguments are evaluated, making it implementation dependent. C++ requires that the arguments to a function be completely evaluated (and all side effects applied) before the function is entered, but the implementation is free to evaluate the arguments in any order.
In your case ++i1 was evaluated first, making both arguments refer to the same position and resulting in an empty string.
More info on this behavior can be found on the comp.compilers newsgroup.

Related

Strange behavior when initializing template structure

I have a structure:
template <class L, class R> struct X {
    X()
    { }
    friend std::ostream& operator<<(std::ostream& str, X& __x)
    {
        return str << '(' << __x.__val1 << ", " << __x.__val2 << ')';
    }
private:
    L __val1;
    R __val2;
};
and create it without initializing anything:
X<std::size_t, std::string> x;
std::cout << x << std::endl;
It always gives output: (2, )
But, when i do:
X<std::string, std::size_t> x;
std::cout << x << std::endl;
I have "right" behaviour with uninitialized variable: (, 94690864442656).
Why?
There is no "right" value of an uninitialized variable.
The value is said to be "indeterminate". Using an indeterminate value leads to undefined behavior. Your program could output anything or nothing.
Assuming (, 94690864442656) to be "right", because it looks like some "uninitialized value" while (2, ) looks like something was initialized, is wrong.
2 is just as wrong as 94690864442656. When the behavior of your code is undefined, then it is undefined.
If it helps, think of it like this: You are supposed to calculate the result of 2*3. Instead of actually carrying out the calculation you call the number that comes to your mind in that moment. Most of the time you will say the wrong result. Once in a while you will answer with a result that looks meaningful, because you correctly guessed 6, or you said 5 or 7 which is just off by one. However, getting the expected result sometimes, does not imply that your way of getting the result is correct.
Or consider this: (but be careful with the use of randomness here. Uninitialized values are not random!) Suppose instead of calculating the result of 2*3 you use the wrong way of rolling a dice (instead of actually calculating the number). Now assume you roll a 6. Would you be surprised to get the "correct" result, even though your algorithm is wrong?
If you really care why you get 2 in one case and 94690864442656 in the other, you need to study the assembly generated by the compiler, because C++ does not specify what is the outcome of compiling code with undefined behavior. It just says: It is undefined.
Note that also using identifiers that contain a double underscore is not allowed, as such names are reserved (https://en.cppreference.com/w/cpp/language/identifiers).
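If the goal is simply deterministic output, a minimal sketch of one possible fix (assuming C++11 or later and that the template can be changed; the members are renamed here to avoid the reserved double-underscore identifiers) is to give the members default initializers so they are value-initialized:
#include <iostream>
#include <string>

template <class L, class R>
struct X {
    friend std::ostream& operator<<(std::ostream& str, const X& x)
    {
        return str << '(' << x.val1 << ", " << x.val2 << ')';
    }
private:
    L val1{};   // value-initialized: 0 for std::size_t, "" for std::string
    R val2{};
};

int main()
{
    X<std::size_t, std::string> x;
    std::cout << x << std::endl;   // prints "(0, )" on any conforming compiler
}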

printing out unordered map values prints out address [duplicate]

Why is the output of the below program what it is?
#include <iostream>
using namespace std;
int main(){
cout << "2+3 = " <<
cout << 2 + 3 << endl;
}
produces
2+3 = 15
instead of the expected
2+3 = 5
Whether intentionally or by accident, you have << at the end of the first output line, where you probably meant ;. So you essentially have
cout << "2+3 = "; // this, of course, prints "2+3 = "
cout << cout; // this prints "1"
cout << 2 + 3; // this prints "5"
cout << endl; // this finishes the line
So the question boils down to this: why does cout << cout; print "1"?
This turns out to be, perhaps surprisingly, subtle. std::cout, via its base class std::basic_ios, provides a certain type conversion operator that is intended to be used in boolean context, as in
while (cout) { PrintSomething(cout); }
This is a pretty poor example, as it's difficult to get output to fail - but std::basic_ios is actually a base class for both input and output streams, and for input it makes much more sense:
int value;
while (cin >> value) { DoSomethingWith(value); }
(gets out of the loop at end of stream, or when stream characters do not form a valid integer).
Now, the exact definition of this conversion operator has changed between C++03 and C++11 versions of the standard. In older versions, it was operator void*() const; (typically implemented as return fail() ? NULL : this;), while in newer it's explicit operator bool() const; (typically implemented simply as return !fail();). Both declarations work fine in a boolean context, but behave differently when (mis)used outside of such context.
In particular, under C++03 rules, cout << cout would be interpreted as cout << cout.operator void*() and print some address. Under C++11 rules, cout << cout should not compile at all, as the operator is declared explicit and thus cannot participate in implicit conversions. That was in fact the primary motivation for the change - preventing nonsensical code from compiling. A compiler that conforms to either standard would not produce a program that prints "1".
Apparently, certain C++ implementations allow mixing and matching the compiler and the library in such a way that produces a non-conforming outcome (quoting @StephanLechner: "I found a setting in xcode which produces 1, and another setting that yields an address: Language dialect c++98 combined with "Standard library libc++ (LLVM standard library with c++11 support)" yields 1, whereas c++98 combined with libstdc (gnu c++ standard library) yields an address"). You can have a C++03-style compiler that doesn't understand explicit conversion operators (which are new in C++11) combined with a C++11-style library that defines the conversion as operator bool(). With such a mix, it becomes possible for cout << cout to be interpreted as cout << cout.operator bool(), which in turn is simply cout << true and prints "1".
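To see what difference explicit makes, here is a small self-contained sketch (a hypothetical StreamLike class standing in for std::basic_ios, compiled as C++11 or later):
#include <iostream>

struct StreamLike {
    // C++11 style: usable in boolean contexts such as `if (s)`,
    // but not available for implicit conversions elsewhere.
    explicit operator bool() const { return true; }

    // The C++03 style was a non-explicit operator void*() const,
    // which is why `cout << cout` used to print an address.
};

int main()
{
    StreamLike s;
    if (s)                                        // fine: contextual conversion to bool
        std::cout << "stream is good\n";
    // std::cout << s;                            // error with a conforming compiler
    std::cout << static_cast<bool>(s) << '\n';    // prints 1 when spelled out explicitly
}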
As Igor says, you get this with a C++11 library, where std::basic_ios has the operator bool instead of the operator void*, but somehow isn't declared (or treated as) explicit. See here for the correct declaration.
For example, a conforming C++11 compiler will give the same result with
#include <iostream>
using namespace std;
int main() {
cout << "2+3 = " <<
static_cast<bool>(cout) << 2 + 3 << endl;
}
but in your case, the static_cast<bool> is being (wrongly) allowed as an implicit conversion.
Edit:
Since this isn't usual or expected behaviour, it might be useful to know your platform, compiler version, etc.
Edit 2: For reference, the code would usually be written either as
cout << "2+3 = "
<< 2 + 3 << endl;
or as
cout << "2+3 = ";
cout << 2 + 3 << endl;
and it's mixing the two styles together that exposed the bug.
The reason for the unexpected output is a typo. You probably meant
cout << "2+3 = "
<< 2 + 3 << endl;
If we ignore the strings that have the expected output, we are left with:
cout << cout;
Since C++11, this is ill-formed. std::cout is not implicitly convertible to anything that std::basic_ostream<char>::operator<< (or a non-member overload) would accept. Therefore a standards-conforming compiler must at least warn you for doing this. My compiler refused to compile your program.
std::cout would be convertible to bool, and the bool overload of the stream insertion operator would produce the observed output of 1. However, that conversion operator is declared explicit, so it shouldn't be applied implicitly. It appears that your compiler/standard library implementation doesn't strictly conform to the standard.
In a pre-C++11 standard, this is well formed. Back then std::cout had an implicit conversion operator to void*, which has a stream insertion operator overload. The output for that would, however, be different: it would print the memory address of the std::cout object.
The posted code should not compile for any conformant C++11 (or later) compiler, but it should compile without even a warning on pre-C++11 implementations.
The difference is that C++11 made the conversion of a stream to a bool explicit:
C.2.15 Clause 27: Input/output library [diff.cpp03.input.output]
27.7.2.1.3, 27.7.3.4, 27.5.5.4
Change: Specify use of explicit in existing boolean conversion operators.
Rationale: Clarify intentions, avoid workarounds.
Effect on original feature: Valid C++ 2003 code that relies on implicit boolean conversions will fail to compile with this International Standard. Such conversions occur in the following conditions:
passing a value to a function that takes an argument of type bool; ...
The ostream operator<< has an overload taking a bool parameter. Since a conversion to bool existed (and was not explicit) in pre-C++11, cout << cout was translated to cout << true, which yields 1.
And according to C.2.15, this should no longer compile starting with C++11.
You can easily debug your code this way. When you use cout your output is buffered, so you can analyse it like this:
Imagine the first occurrence of cout represents the buffer and operator << represents appending to the end of the buffer. The result of operator << is the output stream, in your case cout. You start from:
cout << "2+3 = " << cout << 2 + 3 << endl;
After applying the above stated rules you get a set of actions like this:
buffer.append("2+3 = ").append(cout).append(2 + 3).append(endl);
As I said before, the result of buffer.append() is buffer. At the beginning your buffer is empty and you have the following statement to process:
statement: buffer.append("2+3 = ").append(cout).append(2 + 3).append(endl);
buffer: empty
First you have buffer.append("2+3 = "), which puts the given string directly into the buffer and evaluates to the buffer itself. Now your state looks like this:
statement: buffer.append(cout).append(2 + 3).append(endl);
buffer: 2+3 =
After that you continue to analyze your statement and you come across cout as the argument to append to the end of the buffer. The cout is treated as 1, so you will append 1 to the end of your buffer. Now you are in this state:
statement: buffer.append(2 + 3).append(endl);
buffer: 2+3 = 1
The next thing you have is 2 + 3, and since addition has higher precedence than the output operator, you will first add these two numbers and then put the result in the buffer. After that you get:
statement: buffer.append(endl);
buffer: 2+3 = 15
Finally you add the value of endl to the end of the buffer and you have:
statement:
buffer: 2+3 = 15\n
After this process the characters from the buffer are printed to standard output one by one. So the result of your code is 2+3 = 15. If you look at this, you can see that the extra 1 comes from the cout you tried to print. By removing << cout from your statement you will get the desired output.

C++ reference on static variable

I just found out that this little piece of C++ code doesn't give me the same result with clang++ and with g++:
#include <iostream>
#include <string>
using namespace std;
const string& createString(char c) {
    static string s;
    s = "";
    for (int i = 0; i < 10; ++i) {
        s += c;
    }
    return s;
}

int main() {
    cout << createString('a') << ' ' << createString('z') << endl;
    return 0;
}
With clang++ it writes:
aaaaaaaaaa zzzzzzzzzz
like I want it to be, but with g++ it writes:
aaaaaaaaaa aaaaaaaaaa
Why is it so? Is the g++ implementation standard compliant?
And what should I do if I want a function to return a temporary "big" type by reference, like here, to avoid a useless copy?
Yes, both implementations are compliant. The order of evaluation of function arguments is not specified.
Therefore, createString('a') and createString('z') can be evaluated in any order. Furthermore, createString('z') can be evaluated before or after the result of createString('a') is written out.
Since the function is stateful, and returns the state by reference, both outputs are permissible, as is zzzzzzzzzz zzzzzzzzzz.
Finally, it is worth noting that having static state would be a major headache in a multithreaded environment.
And what should I do if I want a function to return a temporary "big" type by reference, like here, to avoid a useless copy?
It won't be a useless copy. RVO and NRVO can trivially take care of this, and move semantics help as well. In short, there's nothing problematic about returning a std::string by value at all.
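A minimal sketch of the by-value alternative (assuming C++11 or later, where moves and copy elision make the return cheap):
#include <iostream>
#include <string>

// Each call returns its own string, so two calls in the same output
// expression can no longer interfere with each other.
std::string createString(char c) {
    return std::string(10, c);   // e.g. "aaaaaaaaaa" for 'a'
}

int main() {
    std::cout << createString('a') << ' ' << createString('z') << std::endl;
    return 0;
}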
And what should I do if I want a function to return a temporary "big" type by reference, like here, to avoid a useless copy?
Call it only once per expression. For example, this will work fine:
std::cout << createString('a') << ' ';
std::cout << createString('z') << std::endl;

Does std::less have to be consistent with the equality operator for pointer types?

I've bumped into a problem yesterday, which I eventually distilled into the following minimal example.
#include <iostream>
#include <functional>
int main()
{
int i=0, j=0;
std::cout
<< (&i == &j)
<< std::less<int *>()(&i, &j)
<< std::less<int *>()(&j, &i)
<< std::endl;
}
This particular program, when compiled using MSVC 9.0 with optimizations enabled, outputs 000. This implies that
the pointers are not equal, and
neither of the pointers is ordered before the other according to std::less, implying that the two pointers are equal according to the total order imposed by std::less.
Is this behavior correct? Is the total order of std::less not required to be consistent with the equality operator?
Is the following program allowed to output 1?
#include <iostream>
#include <set>
int main()
{
int i=0, j=0;
std::set<int *> s;
s.insert(&i);
s.insert(&j);
std::cout << s.size() << std::endl;
}
Seems as we have a standard breach! Panic!
Following 20.3.3/8 (C++03) :
For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.
It seems a situation where eager optimizations lead to improper code...
Edit: C++0x also holds this one under 20.8.5/8
Edit 2: Curiously, as an answer to the second question:
Following 5.10/1 C++03:
Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address.
Something is wrong here... on many levels.
No, the result is obviously not correct.
However, MSVC is known not to follow the "unique address" rules to the letter. For example, it merges template functions that happen to generate identical code, so those different functions end up with the same address.
I guess that your example would work better if you actually did something to i and j, other than taking their address.
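For the second program, note that std::set never uses ==; it treats two keys as equivalent when neither orders before the other under its comparator. A small sketch of the check the container effectively performs (if this prints 1, the set would keep only one of the two pointers):
#include <functional>
#include <iostream>

int main()
{
    int i = 0, j = 0;
    std::less<int*> lt;

    // Equivalence as std::set sees it: neither key is less than the other.
    bool equivalent = !lt(&i, &j) && !lt(&j, &i);

    // On a conforming implementation this prints 0 (distinct objects,
    // distinct addresses); the MSVC behaviour described above implies 1.
    std::cout << equivalent << std::endl;
}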

C++ Output evaluation order with embedded function calls

I'm a TA for an intro C++ class. The following question was asked on a test last week:
What is the output from the following program:
#include <iostream>
using namespace std;

int myFunc(int &x) {
    int temp = x * x * x;
    x += 1;
    return temp;
}

int main() {
    int x = 2;
    cout << myFunc(x) << endl << myFunc(x) << endl << myFunc(x) << endl;
}
The answer, to me and all my colleagues, is obviously:
8
27
64
But now several students have pointed out that when they run this in certain environments they actually get the opposite:
64
27
8
When I run it in my linux environment using gcc I get what I would expect. Using MinGW on my Windows machine I get what they're talking about.
It seems to be evaluating the last call to myFunc first, then the second call and then the first, then once it has all the results it outputs them in the normal order, starting with the first. But because the calls were made out of order the numbers are opposite.
It seems to me to be a compiler optimization, choosing to evaluate the function calls in the opposite order, but I don't really know why. My question is: are my assumptions correct? Is that what's going on in the background? Or is there something totally different? Also, I don't really understand why there would be a benefit to evaluating the functions backwards and then evaluating output forward. Output would have to be forward because of the way ostream works, but it seems like evaluation of the functions should be forward as well.
Thanks for your help!
The C++ standard does not define what order the subexpressions of a full expression are evaluated, except for certain operators which introduce an order (the comma operator, ternary operator, short-circuiting logical operators), and the fact that the expressions which make up the arguments/operands of a function/operator are all evaluated before the function/operator itself.
GCC is not obliged to explain to you (or me) why it wants to order them as it does. It might be a performance optimisation, it might be because the compiler code came out a few lines shorter and simpler that way, it might be because one of the mingw coders personally hates you, and wants to ensure that if you make assumptions that aren't guaranteed by the standard, your code goes wrong. Welcome to the world of open standards :-)
Edit to add: litb makes a point below about (un)defined behavior. The standard says that if you modify a variable multiple times in an expression, and if there exists a valid order of evaluation for that expression, such that the variable is modified multiple times without a sequence point in between, then the expression has undefined behavior. That doesn't apply here, because the variable is modified in the call to the function, and there's a sequence point at the start of any function call (even if the compiler inlines it). However, if you'd manually inlined the code:
std::cout << pow(x++,3) << endl << pow(x++,3) << endl << pow(x++,3) << endl;
Then that would be undefined behavior. In this code, it is valid for the compiler to evaluate all three "x++" subexpressions, then the three calls to pow, then start on the various calls to operator<<. Because this order is valid and has no sequence points separating the modification of x, the results are completely undefined. In your code snippet, only the order of execution is unspecified.
Exactly why does this have unspecified behaviour?
When I first looked at this example I felt that the behaviour was well defined, because this expression is actually shorthand for a set of function calls.
Consider this more basic example:
cout << f1() << f2();
This is expanded into a sequence of function calls, where the exact form depends on whether the operators are members or non-members:
// Option 1: Both are members
cout.operator<<(f1()).operator<<(f2());
// Option 2: Both are non-members
operator<<(operator<<(cout, f1()), f2());
// Option 3: First is a member, second non-member
operator<<(cout.operator<<(f1()), f2());
// Option 4: First is a non-member, second is a member
operator<<(cout, f1()).operator<<(f2());
At the lowest level these will generate almost identical code, so I will refer only to the first option from now on.
There is a guarantee in the standard that the compiler must evaluate the arguments to each function call before the body of the function is entered. In this case, cout.operator<<(f1()) must be evaluated before the outer .operator<<(f2()) call, since the result of cout.operator<<(f1()) is the object on which the second operator<< is invoked.
The unspecified behaviour kicks in because although the calls to the operators must be ordered there is no such requirement on their arguments. Therefore, the resulting order can be one of:
f2()
f1()
cout.operator<<(f1())
cout.operator<<(f1()).operator<<(f2());
Or:
f1()
f2()
cout.operator<<(f1())
cout.operator<<(f1()).operator<<(f2());
Or finally:
f1()
cout.operator<<(f1())
f2()
cout.operator<<(f1()).operator<<(f2());
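One way to see which of these orders a particular compiler picks is to instrument the calls; a minimal sketch (f1/f2 are placeholder functions, and either ordering of the two stderr lines is conforming):
#include <iostream>

int f1() { std::cerr << "f1 evaluated\n"; return 1; }
int f2() { std::cerr << "f2 evaluated\n"; return 2; }

int main() {
    // stdout is always "1 2"; the relative order of the two stderr
    // messages depends on the compiler's choice of evaluation order.
    std::cout << f1() << ' ' << f2() << std::endl;
}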
The order in which function call parameters are evaluated is unspecified. In short, you shouldn't use arguments with side effects that affect the meaning and result of the statement.
Yeah, the order of evaluation of function arguments is "unspecified" according to the standard.
Hence the outputs differ on different platforms.
As has already been stated, you've wandered into the haunted forest of unspecified evaluation order. To get what is expected every time you can either remove the side effects:
#include <iostream>
using namespace std;

int myFunc(int x) {   // pass by value: no reference is needed once the side effect is gone
    int temp = x * x * x;
    return temp;
}

int main() {
    int x = 2;
    cout << myFunc(x) << endl << myFunc(x+1) << endl << myFunc(x+2) << endl;
    // Note that you can't use the increment operator (++) here. It has
    // side effects, so it would reintroduce the same problem.
}
or break the function calls up into separate statements:
#include <iostream>
using namespace std;

int myFunc(int &x) {
    int temp = x * x * x;
    x += 1;
    return temp;
}

int main() {
    int x = 2;
    cout << myFunc(x) << endl;
    cout << myFunc(x) << endl;
    cout << myFunc(x) << endl;
}
The second version is probably better for a test, since it forces them to consider the side effects.
And this is why, every time you write a function with a side-effect, God kills a kitten!