I just found out that this little piece of C++ code doesn't give me the same result with clang++ and with g++:
#include <iostream>
#include <string>
using namespace std;
const string& createString(char c) {
static string s;
s="";
for(int i=0; i<10; ++i) {
s+=c;
}
return s;
}
int main() {
cout << createString('a') << ' ' << createString('z') << endl;
return 0;
}
With clang++ it writes:
aaaaaaaaaa zzzzzzzzzz
like I want it to be, but with g++ it writes:
aaaaaaaaaa aaaaaaaaaa
Why is it so? Is the g++ implementation standard compliant?
And what should I do if I want a function to return a temporary "big" type by reference like here to avoid useless copy?
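Yes, both implementations are compliant. The order of evaluation of function arguments is unspecified, and before C++17 the same held for the operands of chained << operators (C++17 sequences them left to right).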
Therefore, createString('a') and createString('z') can be evaluated in any order. Furthermore, createString('z') can be evaluated before or after the result of createString('a') is written out.
Since the function is stateful, and returns the state by reference, both outputs are permissible, as is zzzzzzzzzz zzzzzzzzzz.
Finally, it is worth noting that having static state would be a major headache in a multithreaded environment.
And what should I do if I want a function to return a temporary "big"
type by reference like here to avoid useless copy ?
There is no useless copy. RVO and NRVO can trivially take care of this, and move semantics cover the cases where elision does not apply. In short, there's nothing problematic about returning a std::string by value at all.
And what should I do if I want a function to return a temporary "big" type by reference like here to avoid useless copy ?
Call it only once per expression. For example, this will work fine:
std::cout << createString('a') << ' ';
std::cout << createString('z') << std::endl;
I am looking for a string implementation with fixed upper size that can be used in memcopy environment and that is trivially constructible and copyable.
I found boost beast static_string, but I don't know whether my example works by accident or not.
#include <algorithm>
#include <iostream>
#include <type_traits> // for std::is_trivial_v
#include <boost/beast/core/static_string.hpp>
boost::beast::static_string<16> s1("abc");
int main(){
boost::beast::static_string<16> s2;
std::copy_n((char*)&s1, sizeof(s2), (char*)&s2);
s1.push_back('X');
std::cout << "--" << std::endl;
std::cout << s2 << std::endl;
s2.push_back('Y');
std::cout << s2 << std::endl;
std::cout << std::is_trivial_v<decltype(s2)> << std::endl;
}
Note: the last line says the type is not trivially copyable, but it could just be that Vinnie forgot to add a type trait.
P.S. I know this is generally a bad idea, but what I am replacing is even worse: just a plain C array. Modifying the allocation/copying to support std::string would be much, much more work.
Technically no: there are a user-defined copy constructor and copy assignment operator (both call assign), which means the class is not trivially copyable.
These appear to exist as an optimisation: if a static_string has a large capacity but only stores a small string, assign copies only the used portion of the string, plus a null terminator.
C++ does not allow std::is_trivially_copyable to be specialized by programs, so I don't believe there is a way to get both at present.
static_string just contains a size_t member and a CharT[N+1], so if those two functions were defaulted, it would be trivially copyable.
I tried to wrap the io manipulator std::put_money. Here's a reduced illustration:
#include <iomanip>
#include <iostream>
long double scale(long double f) { return f * 100.0L; }
namespace acm {
auto put_money(const long double &f, bool intl = false) {
return std::put_money(scale(f), intl);
}
}
int main() {
long double f(1234.567L);
std::cout << "1: " << acm::put_money(f) << '\n';
std::cout << "2: " << std::put_money(scale(f)) << '\n';
return 0;
}
The output is:
1: -92559631349317829570406876446720000000000000000000000000000000
2: 123457
I dug into both MSVC and libc++'s guts and learned that std::put_money returns a custom type that keeps a const reference to the value rather than making a copy.
Line 1 could be wrong because the reference is invalid when the custom object is streamed (i.e., the temporary value returned by scale inside my acm::put_money is already destructed).
Q: But then, why is line 2 correct?
Theory 1: "Bad luck." Keeping a const reference to the temporary is a bug, but it just so happens that the value still exists on the stack, possibly because it wasn't trampled by the extra function call. (This is supported by the fact that a Release build generally works, presumably because the extra function call is inlined.)
Theory 2: Lifetime extension of the temporary by the const reference is helping in the second case, but, for some reason, it doesn't apply in the first case. Perhaps the extra function call breaks the rules for lifetime extension?
Theory 3: ???
Finally located where in the standard this is specified (slightly edited for readability):
[ext.manip]
template <class moneyT> unspecified put_money(const moneyT& mon, bool intl = false);
Requires: The type moneyT shall be either long double
or a specialization of the basic_string template.
Returns: An object of unspecified type such that if out is an object of type
basic_ostream then the expression out << put_money(mon, intl)
behaves as a formatted output function that calls
f(out, mon, intl), where the function f is defined as:
[ example omitted ]
The expression out << put_money(mon, intl) shall have type
basic_ostream& and value out.
The long and the short of it is that std::put_money is only defined when it is on the right hand side of the << formatted output operator with the left hand side being a std::basic_ostream. Only your line 2 meets that requirement, and line 1 does not.
I have a function which modifies std::string& lvalue references in-place, returning a reference to the input parameter:
std::string& transform(std::string& input)
{
// transform the input string
...
return input;
}
I have a helper function which allows the same inline transformations to be performed on rvalue references:
std::string&& transform(std::string&& input)
{
return std::move(transform(input)); // calls the lvalue reference version
}
Notice that it returns an rvalue reference.
I have read several questions on SO relating to returning rvalue references (here and here for example), and have come to the conclusion that this is bad practice.
From what I have read, it seems the consensus is that since return values are rvalues, plus taking into account the RVO, just returning by value would be as efficient:
std::string transform(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
However, I have also read that returning function parameters prevents the RVO optimisation (for example here and here)
This leads me to believe a copy would happen from the std::string& return value of the lvalue reference version of transform(...) into the std::string return value.
Is that correct?
Is it better to keep my std::string&& transform(...) version?
There's no right answer, but returning by value is safer.
I have read several questions on SO relating to returning rvalue references, and have come to the conclusion that this is bad practice.
Returning a reference to a parameter foists a contract upon the caller that either
The parameter cannot be a temporary (which is just what rvalue references represent), or
The return value won't be retained past the next semicolon in the caller's context (when temporaries get destroyed).
If the caller passes a temporary and tries to save the result, they get a dangling reference.
From what I have read, it seems the consensus is that since return values are rvalues, plus taking into account the RVO, just returning by value would be as efficient:
Returning by value adds a move-construction operation. The cost of this is usually proportional to the size of the object. Whereas returning by reference only requires the machine to ensure that one address is in a register, returning by value requires zeroing a couple pointers in the parameter std::string and putting their values in a new std::string to be returned.
It's cheap, but nonzero.
The direction currently taken by the standard library is, somewhat surprisingly, to be fast and unsafe and return the reference. (The only function I know that actually does this is std::get from <tuple>.) As it happens, I've presented a proposal to the C++ core language committee toward the resolution of this issue, a revision is in the works, and just today I've started investigating implementation. But it's complicated, and not a sure thing.
std::string transform(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
The compiler won't generate a move here. If input weren't a reference at all, and you did return input; it would, but it has no reason to believe that transform will return input just because it was a parameter, and it won't deduce ownership from rvalue reference type anyway. (See C++14 §12.8/31-32.)
You need to do:
return std::move( transform( input ) );
or equivalently
transform( input );
return std::move( input );
Some (non-representative) runtimes for the above versions of transform:
#include <iostream>
#include <string> // std::string is used throughout
#include <time.h>
#include <sys/time.h>
#include <unistd.h>
using namespace std;
double GetTicks()
{
struct timeval tv;
if(!gettimeofday (&tv, NULL))
return (tv.tv_sec*1000 + tv.tv_usec/1000);
else
return -1;
}
std::string& transform(std::string& input)
{
// transform the input string
// e.g toggle first character
if(!input.empty())
{
if(input[0]=='A')
input[0] = 'B';
else
input[0] = 'A';
}
return input;
}
std::string&& transformA(std::string&& input)
{
return std::move(transform(input));
}
std::string transformB(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
std::string transformC(std::string&& input)
{
return std::move( transform( input ) ); // calls the lvalue reference version
}
string getSomeString()
{
return string("ABC");
}
int main()
{
const int MAX_LOOPS = 5000000;
{
double start = GetTicks();
for(int i=0; i<MAX_LOOPS; ++i)
string s = transformA(getSomeString());
double end = GetTicks();
cout << "\nRuntime transformA: " << end - start << " ms" << endl;
}
{
double start = GetTicks();
for(int i=0; i<MAX_LOOPS; ++i)
string s = transformB(getSomeString());
double end = GetTicks();
cout << "\nRuntime transformB: " << end - start << " ms" << endl;
}
{
double start = GetTicks();
for(int i=0; i<MAX_LOOPS; ++i)
string s = transformC(getSomeString());
double end = GetTicks();
cout << "\nRuntime transformC: " << end - start << " ms" << endl;
}
return 0;
}
output
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
Runtime transformA: 444 ms
Runtime transformB: 796 ms
Runtime transformC: 434 ms
This leads me to believe a copy would happen from the std::string&
return value of the lvalue reference version of transform(...) into
the std::string return value.
Is that correct?
The version that returns a reference does not copy the std::string, but the version that returns by value does copy if the compiler does not perform RVO. However, RVO has its limitations, so C++11 added rvalue references, move constructors / move assignment, and std::move to help handle this situation. Yes, RVO is more efficient than move semantics; a move is cheaper than a copy but more expensive than RVO.
Is it better to keep my std::string&& transform(...) version?
This is somewhat interesting and strange. As Potatoswatter answered, with
std::string transform(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
you should call std::move manually.
However, you can follow this developerWorks link, RVO V.S. std::move, for more detail; it explains your problem clearly.
If your question is purely optimization oriented, it's best not to worry about how to pass or return an argument. The compiler is smart enough to turn your code into pure reference passing, copy elision, function inlining, or even move semantics, whichever is fastest.
Basically, move semantics can benefit you in some esoteric cases. Let's say I have a Matrix object that holds a double** as a member variable, and this pointer points to a two-dimensional array of doubles. Now let's say I have this expression:
Matrix a = b+c;
A copy constructor will get the sum of b and c as a temporary, passed as a const reference, allocate m*n doubles for a's inner pointer, then run over the b+c sum array and copy its values one by one.
A simple computation shows that this can take up to O(nm) steps (which can be generalized to O(n^2)). Move semantics will only re-wire that hidden double** out of the temporary into a's inner pointer, which takes O(1).
Now let's think about std::string for a moment:
Passing it as a reference takes O(1) steps (take the memory address, pass it, dereference it, etc.; this is not linear in any sense).
Passing it as an rvalue reference and moving from it requires the program to pass it as a reference, re-wire the hidden underlying char* that holds the inner buffer, null the original one (or swap them), copy size and capacity, and more. We can see that although we're still in the O(1) zone, there can actually be MORE steps than simply passing it as a regular reference.
Well, the truth is that I didn't benchmark it, and the discussion here is purely theoretical. Nevertheless, my first paragraph is still true. We assume many things as developers, but unless we benchmark everything to death, the compiler simply knows better than us 99% of the time.
Taking this argument into account, I'd say keep it as a pass-by-reference and not move semantics, since it's backward compatible and much better understood by developers who haven't mastered C++11 yet.
So I have a piece of code with a class like this:
#include<iostream>
#include<cstring>
class stu
{
static int proba;
public:
stu();
static int no(){
return proba;
}
};
int stu::proba=0;
stu::stu()
{
proba=proba+1;
}
int main()
{
std::cout<< stu::no << std::endl;
}
The output is 1.
It does so even if I change stu::no so that it would be only {return 12;}
Why does it happen? How do I fix it?
Change it to std::cout<< stu::no() << std::endl;
Without the (), I believe it's evaluating as a pointer, and not doing what you're expecting.
Edit: As pointed out by @Loomchild, using g++ -Wall will provide further insight as to why it's always 1. The pointer to the static function is always evaluated as true in this context, hence the value being printed.
std::cout<< stu::no << std::endl; streams the function itself (converted to a pointer); you're not actually calling it.
std::cout<< stu::no() << std::endl;
calls the function and prints the return value.
In MSVS, this indeed produces a pointer value, with the overload operator << (void*).
Use stu::no() instead of stu::no.
Also, a minor thing really but if you put
using namespace std;
below the #includes you won't have to use std::
Just makes things a little more readable.
stu::no is a function that takes no arguments and returns int.
There is no operator<< that takes functions with your particular signature, so the available overloads are considered. Long story short, the operator<<(ostream&, bool) is the closest match, after function-to-pointer and pointer-to-bool conversions.
Since the function actually exists, its address is definitely non-zero, so the pointer to bool conversion always yields true, which you see as 1.
Make it std::cout<< std::boolalpha << stu::no << std::endl; to see for yourself that it's really a boolean output.
Make it std::cout<< stu::no() << std::endl; to print the result of the function call.
See How to print function pointers with cout? if you want to know what happened in more detail.
look at the following simple code:
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s("1234567890");
string::iterator i1 = s.begin();
string::iterator i2 = s.begin();
string s1, s2;
s1.append(i1, ++i1);
s2.append(++i2, s.end());
cout << s1 << endl;
cout << s2 << endl;
}
what would you expect the output to be?
would you, like me, expect it to be:
1
234567890
wrong!
it is:
234567890
i.e. the first string is empty.
It seems that the prefix increment operator is problematic with iterators. Or am I missing something?
You're missing something: this really has nothing to do with iterators. The order in which arguments to a function are evaluated is unspecified. As such your: append(i1, ++i1); would depend on unspecified behavior, regardless of the type of i1. Just for example, given something a lot simpler like:
void print(int a, int b) {
std::cout << a << " " << b << "\n";
}
int main() {
int a =0;
print(a, ++a);
return 0;
}
Your output could perfectly reasonably be the "0 1" you seem to expect, but it could also perfectly reasonably be "1 1". Since this is unspecified (and, for a plain int before C++17, strictly undefined, because a is read and modified without sequencing), it could change from one version of the compiler to the next, or even with the same compiler when you change flags, or (in theory) could vary depending on the phase of the moon...
C++ implementations are free to evaluate arguments in any order. In this case, if ++i1 is evaluated first, you will get an empty string.
Not a bug.
The order in which the arguments to
s1.append(i1, ++i1);
are evaluated is not specified by the standard. The compiler is free to use any order it chooses. In this case, it evaluates the second argument (++i1) before the first (i1), so you specify an empty range to copy.
The C++ standard does not specify the order in which function arguments are evaluated, leaving it implementation dependent. C++ requires that the arguments to a function be completely evaluated (and all side effects completed) before entering the function, but the implementation is free to evaluate the arguments in any order.
In your case ++i1 was evaluated first, making both arguments the same and resulting in an empty string.
More info on this behavior here on comp.compilers newsgroup