I was programming some test cases an noticed an odd behaviour.
An move assignment to a string did not erase the value of the first string, but assigned the value of the target string.
sample code:
#include <utility>
#include <string>
#include <iostream>
int main(void) {
std::string a = "foo";
std::string b = "bar";
std::cout << a << std::endl;
b = std::move(a);
std::cout << a << std::endl;
return 0;
}
result:
$ ./string.exe
foo
bar
expected result:
$ ./string.exe
foo
So to my questions:
Is that intentional?
Does this happen only with strings and/or STL objects?
Does this happen with custom objects (as in user defined)?
Environment:
Win10 64bit
msys2
g++ 5.2
EDIT
After reading the possible duplicate answer and the answer by #OMGtechy
i extended the test to check for small string optimizations.
#include <utility>
#include <string>
#include <iostream>
#include <cinttypes>
#include <sstream>
int main(void) {
std::ostringstream oss1;
oss1 << "foo ";
std::ostringstream oss2;
oss2 << "bar ";
for (std::uint64_t i(0);;++i) {
oss1 << i % 10;
oss2 << i % 10;
std::string a = oss1.str();
std::string b = oss2.str();
b = std::move(a);
if (a.size() < i) {
std::cout << "move operation origin was cleared at: " << i << std::endl;
break;
}
if (0 == i % 1000)
std::cout << i << std::endl;
}
return 0;
}
This ran on my machine up to 1 MB, which is not a small string anymore.
And it just stopped, so i could paste the source here (Read: i stopped it).
This is likely due to short string optimization; i.e. there's no internal pointer to "move" over, so it ends up acting just like a copy.
I suggest you try this with a string large number of characters; this should be enough to get around short string optimization and exhibit the behaviour you expected.
This is perfectly valid, because the C++ standard states that moved from objects (with some exceptions, strings are not one of them as of C++11) shall be in a valid but unspecified state.
Related
So here i got a small test program:
#include <string>
#include <iostream>
#include <memory>
#include <vector>
class Test
{
public:
Test(const std::vector<int>& a_, const std::string& b_)
: a(std::move(a_)),
b(std::move(b_)),
vBufAddr(reinterpret_cast<long long>(a.data())),
sBufAddr(reinterpret_cast<long long>(b.data()))
{}
Test(Test&& mv)
: a(std::move(mv.a)),
b(std::move(mv.b)),
vBufAddr(reinterpret_cast<long long>(a.data())),
sBufAddr(reinterpret_cast<long long>(b.data()))
{}
bool operator==(const Test& cmp)
{
if (vBufAddr != cmp.vBufAddr) {
std::cout << "Vector buffers differ: " << std::endl
<< "Ours: " << std::hex << vBufAddr << std::endl
<< "Theirs: " << cmp.vBufAddr << std::endl;
return false;
}
if (sBufAddr != cmp.sBufAddr) {
std::cout << "String buffers differ: " << std::endl
<< "Ours: " << std::hex << sBufAddr << std::endl
<< "Theirs: " << cmp.sBufAddr << std::endl;
return false;
}
}
private:
std::vector<int> a;
std::string b;
long long vBufAddr;
long long sBufAddr;
};
int main()
{
Test obj1 { {0x01, 0x02, 0x03, 0x04}, {0x01, 0x02, 0x03, 0x04}};
Test obj2(std::move(obj1));
obj1 == obj2;
return 0;
}
Software i used for test:
Compiler: gcc 7.3.0
Compiler flags: -std=c++11
OS: Linux Mint 19 (tara) with upstream release Ubuntu 18.04 LTS (bionic)
The results i see here, that after move, vector buffer still has the same address, but string buffer doesn't. So it looks to me, that it allocated fresh one, instead of just swapping buffer pointers. What causes such behavior?
You're likely seeing the effects of the small/short string optimization (SSO). To avoid unnecessary allocations for every tiny little string, many implementations of std::string include a small fixed size array to hold small strings without requiring new (this array usually repurposes some of the other members that aren't necessary when dynamic allocation has not been used, so it consumes little or no additional memory to provide it, either for small or large strings), and those strings don't benefit from std::move (but they're small, so it's fine). Larger strings will require dynamic allocation, and will transfer the pointer as you expect.
Just for demonstration, this code on g++:
void move_test(std::string&& s) {
std::string s2 = std::move(s);
std::cout << "; After move: " << std::hex << reinterpret_cast<uintptr_t>(s2.data()) << std::endl;
}
int main()
{
std::string sbase;
for (size_t len=0; len < 32; ++len) {
std::string s1 = sbase;
std::cout << "Length " << len << " - Before move: " << std::hex << reinterpret_cast<uintptr_t>(s1.data());
move_test(std::move(s1));
sbase += 'a';
}
}
Try it online!
produces high (stack) addresses that change on move construction for lengths of 15 or less (presumably varies with architecture pointer size), but switches to low (heap) addresses that remain unchanged after move construction once you hit length 16 or higher (the switch is at 16, not 17, because it is NUL-terminating the strings, since C++11 and higher require it).
To be 100% clear: This is an implementation detail. No part of the C++ spec requires this behavior, so you should not rely on it occurring at all, and when it occurs, you should not rely on it occurring for specific string lengths.
Suppose I have a code like this:
void printHex(std::ostream& x){
x<<std::hex<<123;
}
..
int main(){
std::cout<<100; // prints 100 base 10
printHex(std::cout); //prints 123 in hex
std::cout<<73; //problem! prints 73 in hex..
}
My question is if there is any way to 'restore' the state of cout to its original one after returning from the function? (Somewhat like std::boolalpha and std::noboolalpha..) ?
Thanks.
you need to #include <iostream> or #include <ios> then when required:
std::ios_base::fmtflags f( cout.flags() );
//Your code here...
cout.flags( f );
You can put these at the beginning and end of your function, or check out this answer on how to use this with RAII.
Note that the answers presented here won't restore the full state of std::cout. For example, std::setfill will "stick" even after calling .flags(). A better solution is to use .copyfmt:
std::ios oldState(nullptr);
oldState.copyfmt(std::cout);
std::cout
<< std::hex
<< std::setw(8)
<< std::setfill('0')
<< 0xDECEA5ED
<< std::endl;
std::cout.copyfmt(oldState);
std::cout
<< std::setw(15)
<< std::left
<< "case closed"
<< std::endl;
Will print:
case closed
rather than:
case closed0000
The Boost IO Stream State Saver seems exactly what you need. :-)
Example based on your code snippet:
void printHex(std::ostream& x) {
boost::io::ios_flags_saver ifs(x);
x << std::hex << 123;
}
I've created an RAII class using the example code from this answer. The big advantage to this technique comes if you have multiple return paths from a function that sets flags on an iostream. Whichever return path is used, the destructor will always be called and the flags will always get reset. There is no chance of forgetting to restore the flags when the function returns.
class IosFlagSaver {
public:
explicit IosFlagSaver(std::ostream& _ios):
ios(_ios),
f(_ios.flags()) {
}
~IosFlagSaver() {
ios.flags(f);
}
IosFlagSaver(const IosFlagSaver &rhs) = delete;
IosFlagSaver& operator= (const IosFlagSaver& rhs) = delete;
private:
std::ostream& ios;
std::ios::fmtflags f;
};
You would then use it by creating a local instance of IosFlagSaver whenever you wanted to save the current flag state. When this instance goes out of scope, the flag state will be restored.
void f(int i) {
IosFlagSaver iosfs(std::cout);
std::cout << i << " " << std::hex << i << " ";
if (i < 100) {
std::cout << std::endl;
return;
}
std::cout << std::oct << i << std::endl;
}
You can create another wrapper around the stdout buffer:
#include <iostream>
#include <iomanip>
int main() {
int x = 76;
std::ostream hexcout (std::cout.rdbuf());
hexcout << std::hex;
std::cout << x << "\n"; // still "76"
hexcout << x << "\n"; // "4c"
}
In a function:
void print(std::ostream& os) {
std::ostream copy (os.rdbuf());
copy << std::hex;
copy << 123;
}
Of course if performance is an issue this is a bit more expensive because it's copying the entire ios object (but not the buffer) including some stuff that you're paying for but unlikely to use such as the locale.
Otherwise I feel like if you're going to use .flags() it's better to be consistent and use .setf() as well rather than the << syntax (pure question of style).
void print(std::ostream& os) {
std::ios::fmtflags os_flags (os.flags());
os.setf(std::ios::hex);
os << 123;
os.flags(os_flags);
}
As others have said you can put the above (and .precision() and .fill(), but typically not the locale and words-related stuff that is usually not going to be modified and is heavier) in a class for convenience and to make it exception-safe; the constructor should accept std::ios&.
C++20 std::format will be a superior alternative to save restore in most cases
Once you can use it, you will e.g. be able to write hexadecimals simply as:
#include <format>
#include <string>
int main() {
std::cout << std::format("{:x} {:#x} {}\n", 16, 17, 18);
}
Expected output:
10 0x11 18
This will therefore completely overcome the madness of modifying std::cout state.
The existing fmt library implements it for before it gets official support: https://github.com/fmtlib/fmt Install on Ubuntu 22.04:
sudo apt install libfmt-dev
Modify source to replace:
<format> with <fmt/core.h>
std::format to fmt::format
main.cpp
#include <iostream>
#include <fmt/core.h>
int main() {
std::cout << fmt::format("{:x} {:#x} {}\n", 16, 17, 18);
}
and compile and run with:
g++ -std=c++11 -o main.out main.cpp -lfmt
./main.out
Output:
10 0x11 18
Related: std::string formatting like sprintf
With a little bit of modification to make the output more readable :
void printHex(std::ostream& x) {
ios::fmtflags f(x.flags());
x << std::hex << 123 << "\n";
x.flags(f);
}
int main() {
std::cout << 100 << "\n"; // prints 100 base 10
printHex(std::cout); // prints 123 in hex
std::cout << 73 << "\n"; // problem! prints 73 in hex..
}
Instead of injecting format into cout, the << way, adopting setf and unsetf could be a cleaner solution.
void printHex(std::ostream& x){
x.setf(std::ios::hex, std::ios::basefield);
x << 123;
x.unsetf(std::ios::basefield);
}
the ios_base namespace works fine too
void printHex(std::ostream& x){
x.setf(std::ios_base::hex, std::ios_base::basefield);
x << 123;
x.unsetf(std::ios_base::basefield);
}
Reference: http://www.cplusplus.com/reference/ios/ios_base/setf/
I would like to generalize the answer from qbert220 somewhat:
#include <ios>
class IoStreamFlagsRestorer
{
public:
IoStreamFlagsRestorer(std::ios_base & ioStream)
: ioStream_(ioStream)
, flags_(ioStream_.flags())
{
}
~IoStreamFlagsRestorer()
{
ioStream_.flags(flags_);
}
private:
std::ios_base & ioStream_;
std::ios_base::fmtflags const flags_;
};
This should work for input streams and others as well.
PS: I would have liked to make this simply a comment to above answer, stackoverflow however does not allow me to do so because of missing reputation. Thus make me clutter the answers here instead of a simple comment...
What is a good, safe way to extract a specific number of characters from a std::basic_istream and store it in a std::string?
In the following program I use a char[] to eventually obtain result, but I would like to avoid the POD types and ensure something safer and more maintainable:
#include <sstream>
#include <string>
#include <iostream>
#include <exception>
int main()
{
std::stringstream inss{std::string{R"(some/path/to/a/file/is/stored/in/50/chars Other data starts here.)"}};
char arr[50]{};
if (!inss.read(arr,50))
throw std::runtime_error("Could not read enough characters.\n");
//std::string result{arr}; // Will probably copy past the end of arr
std::string result{arr,arr+50};
std::cout << "Path is: " << result << '\n';
std::cout << "stringstream still has: " << inss.str() << '\n';
return 0;
}
Alternatives:
Convert entire stream to a string up front: std::string{inss.c_str()}
This seems wasteful as it would make a copy of the entire stream.
Write a template function to accept the char[]
This would still use an intermediate POD array.
Use std::basic_istream::get in a loop to read the required number of characters together with std::basic_string::push_back
The loop seems a bit unwieldy, but it does avoid the array.
Just read it directly into the result string.
#include <sstream>
#include <string>
#include <iostream>
#include <exception>
int main()
{
std::stringstream inss{std::string{R"(some/path/to/a/file/is/stored/in/50/chars Other data starts here.)"}};
std::string result(50, '\0');
if (!inss.read(&result[0], result.size()))
throw std::runtime_error("Could not read enough characters.\n");
std::cout << "Path is: " << result << '\n';
std::cout << "stringstream still has: " << inss.str() << '\n';
return 0;
}
Since C++11, the following guarantee about the memory layout of the std::string (from cppreference).
The elements of a basic_string are stored contiguously, that is, for a basic_string s, &*(s.begin() + n) == &*s.begin() + n for any n in [0, s.size()), or, equivalently, a pointer to s[0] can be passed to functions that expect a pointer to the first element of a CharT[] array.
(since C++11)
I'm new to C++ and I try to adapt a program snippet which generates "weak compositions" or Multisets found here on stackoverflow but I run - to be quite frank - since hours into problems.
First of all, the program runs without any complaint under MSVC - but not on gcc.
The point is, that I have read many articles like this one here on stackoverflow, about the different behaviour of gcc and msvc and I have understood, that msvc is a bit more "liberal" in dealing with this situation and gcc is more "strict". I have also understood, that one should "not bind a non-const reference to a temporary (internal) variable."
But I am sorry, I can not fix it and get this program to work under gcc - again since hours.
And - if possible - a second question: I have to introduce a global variable
total, which is said to be "evil", although it works well. I need this value of total, however I could not find a solution with a non-global scope.
Thank you all very much for your assistance.
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
int total = 0;
string & ListMultisets(unsigned au4Boxes, unsigned au4Balls, string & strOut = string(), string strBuild = string()) {
unsigned au4;
if (au4Boxes > 1) for (au4 = 0; au4 <= au4Balls; au4++)
{
stringstream ss;
ss << strBuild << (strBuild.size() == 0 ? "" : ",") << au4Balls - au4;
ListMultisets(au4Boxes - 1, au4, strOut, ss.str());
}
else
{
stringstream ss;
ss << mycount << ".\t" << "(" << strBuild << (strBuild.size() == 0 ? "" : ",") << au4Balls << ")\n";
strOut += ss.str();
total++;
}
return strOut;
}
int main() {
cout << endl << ListMultisets(5,3) << endl;
cout << "Total: " << total << " weak compositions." << endl;
return 0;
}
C++ demands that a reference parameter to an unnamed temporary (like string()) must either be a const reference or an r-value reference.
Either of those reference types will protect you from modifying a variable that you don't realize is going to be destroyed within the current expression.
Depending on your needs, it could would to make it a value parameter:
string ListMultisets( ... string strOut = string() ... ) {
or it could would to make it a function-local variable:
string ListMultisets(...) {
string strOut;
In your example program, either change would work.
Remove the default value for the strOut parameter.
Create a string in main and pass it to the function.
Change the return type of the function to be int.
Make total a local variable ListMultisets(). Return total rather than strOut (you are returning the string value strOut as a reference parameter.)
The signature of the new ListMultisets will look like:
int ListMultisets(unsigned au4Boxes, unsigned au4Balls, string & strOut)
I'll let you figure out the implementation. It will either be easy or educational.
Your new main function will look like:
int main() {
string result;
int total = ListMultisets(5,3, result);
cout << endl << result << endl;
cout << "Total: " << total << " weak compositions." << endl;
return 0;
}
I have a vector of vectors, and I want to connect them one by one to form a long vector. This could be done by inserting at the end. Inspired by this question, I was thinking that using make_move_iterator would replace copy with move and thus would be more efficient. But the following test demonstrates that make_move_iterator will cause a larger time consumption.
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
using namespace std;
int main()
{
string a = "veryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongstring";
vector<string> b(10,a);
vector<vector<string> > c(1000,b);
vector<string> d,e;
auto t1 = chrono::system_clock::now();
for(auto& item : c)
{
d.insert(d.end(),item.begin(),item.end());
}
cout << c[0][0].length() << endl;
auto t2 = chrono::system_clock::now();
for(auto& item:c)
{
e.insert(e.end(), std::make_move_iterator(item.begin()),std::make_move_iterator(item.end()));
}
auto t3 = chrono::system_clock::now();
cout << chrono::duration_cast<chrono::nanoseconds>(t2-t1).count() << endl;
cout << chrono::duration_cast<chrono::nanoseconds>(t3-t2).count() << endl;
cout << c[0][0].length() << endl;
cout << "To check that c has been moved from." <<endl;
}
//Output:
//122
//1212000
//1630000
//0
//To check that c has been moved from.
Thus I'm wondering, does this approach really help improve efficiency?
The test in the question description was conducted on cpp shell
I later tried on ideone and it turned out that make_move_iterator is obviously more efficient. So it seems to be a compiler-dependent thing.
122
320576
98434
0
To check that c has been moved from.