std::string::reserve() and std::string::clear() conundrum - c++

This question starts with a bit of code, just because I think it is easier to see what I am after:
/*static*/
void
Url::Split
(std::list<std::string> & url
, const std::string& stringUrl
)
{
std::string collector;
collector.reserve(stringUrl.length());
for (auto c : stringUrl)
{
if (PathSeparator == c)
{
url.push_back(collector);
collector.clear(); // Sabotages my optimization with reserve() above!
}
else
{
collector.push_back(c);
}
}
url.push_back(collector);
}
In the code above, the collector.reserve(stringUrl.length()); line is supposed to reduce the number of heap operations performed during the loop below. After all, no substring can be longer than the whole URL, so reserving that much capacity up front looks like a good idea.
But, once a substring is finished and I add it to the url parts list, I need to reset the string to length 0 one way or another. Brief "peek definition" inspection suggests to me that at least on my platform, the reserved buffer will be released and with that, the purpose of my reserve() call is compromised.
Internally it calls some _Eos(0) in case of clear.
I could as well accomplish the same with collector.resize(0) but peeking definition reveals it also calls _Eos(newsize) internally, so the behavior is the same as in case of calling clear().
Now the question is, if there is a portable way to establish the intended optimization and which std::string function would help me with that.
Of course I could write collector[0] = '\0'; but that looks very off to me.
Side note: While I found similar questions, I do not think this is a duplicate of any of them.
Thanks, in advance.

In the C++11 standard clear is defined in terms of erase, which is defined as value replacement. There is no obvious guarantee that the buffer isn't deallocated. It might be there, implicit in other stuff, but I failed to find any such.
Without a formal guarantee that clear doesn't deallocate, and it appears that at least as of C++11 it isn't there, you have the following options:
Ignore the problem.
After all, chances are that the micro-seconds incurred by dynamic buffer allocation will be absolutely irrelevant, and in addition, even without a formal guarantee the chance of clear deallocating is very low.
Require a C++ implementation where clear doesn't deallocate.
(You can add an assert to this effect, checking .capacity().)
Do your own buffer implementation.
Ignoring the problem appears to be safe even where the allocations (if performed) would be time critical, because with common implementations clear does not reduce the capacity.
E.g., here with g++ and Visual C++ as examples:
#include <iostream>
#include <string>
using namespace std;
auto main() -> int
{
string s = "Blah blah blah";
cout << s.capacity();
s.clear();
cout << ' ' << s.capacity() << endl;
}
C:\my\so\0284>g++ keep_capacity.cpp -std=c++11
C:\my\so\0284>a
14 14
C:\my\so\0284>cl keep_capacity.cpp /Feb
keep_capacity.cpp
C:\my\so\0284>b
15 15
C:\my\so\0284>_
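The capacity check suggested in the second option could be sketched roughly as follows. This is only an illustration: split is a hypothetical free-function variant of the question's Url::Split, and the assert documents an assumption about the implementation rather than anything the standard wording discussed above promises.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical free-function variant of the question's Url::Split.
void split(std::list<std::string>& parts, const std::string& url, char sep = '/')
{
    std::string collector;
    collector.reserve(url.length());
    const std::size_t reserved = collector.capacity();

    for (char c : url)
    {
        if (c == sep)
        {
            parts.push_back(collector);
            collector.clear();
            // Assumption, not a formal guarantee: clear() keeps the buffer.
            // The assert fires in debug builds if an implementation shrinks it.
            assert(collector.capacity() >= reserved);
        }
        else
        {
            collector.push_back(c);
        }
    }
    parts.push_back(collector);
}
```

Calling split(parts, "a/bc//d") on an empty list should leave the four parts "a", "bc", "" and "d" in it, and the assert stays silent on implementations that keep the capacity.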
Doing your own buffer management, if you really want to take it that far, can be done as follows:
#include <iostream>
#include <string>
#include <vector>
namespace my {
using std::string;
using std::vector;
class Collector
{
private:
vector<char> buffer_;
int size_;
public:
auto str() const
-> string
{ return string( buffer_.begin(), buffer_.begin() + size_ ); }
auto size() const -> int { return size_; }
void append( const char c )
{
if( size_ < int( buffer_.size() ) )
{
buffer_[size_++] = c;
}
else
{
buffer_.push_back( c );
buffer_.resize( buffer_.capacity() );
++size_;
}
}
void clear() { size_ = 0; }
explicit Collector( const int initial_capacity = 0 )
: buffer_( initial_capacity )
, size_( 0 )
{ buffer_.resize( buffer_.capacity() ); }
};
auto split( const string& url, const char pathSeparator = '/' )
-> vector<string>
{
vector<string> result;
Collector collector( url.length() );
for( const auto c : url )
{
if( pathSeparator == c )
{
result.push_back( collector.str() );
collector.clear();
}
else
{
collector.append( c );
}
}
if( collector.size() > 0 ) { result.push_back( collector.str() ); }
return result;
}
} // namespace my
auto main() -> int
{
using namespace std;
auto const url = "http://en.wikipedia.org/wiki/Uniform_resource_locator";
for( string const& part : my::split( url ) )
{
cout << '[' << part << ']' << endl;
}
}

Related

Chromium stack_container for StackString can't work (at least in VisualC++)

Symptoms
I was investigating the Chromium stack_container set, specifically StackString. I made a test program with the following:
#include <chromium/base/stack_container.h>
int main() {
StackString<300> s;
return 0;
}
This should create space on the stack and the string would reserve this space.
I was surprised to find, when I added some breakpoints to allocate in StackAllocator, that the stack buffer is never returned to anyone. That is, the general-purpose allocator is always called:
pointer allocate(size_type n, void* hint = 0) {
if (source_ != NULL && !source_->used_stack_buffer_
&& n <= stack_capacity) {
source_->used_stack_buffer_ = true; // source_ is always NULL
return source_->stack_buffer(); // and so this is never returned.
} else {
return std::allocator<T>::allocate(n, hint); // This is always called.
}
}
Problem
After further investigation, I found that this is because when the std::basic_string type is created (as part of the construction of StackString), the Visual C++ implementation stores the allocator in a pair. Then, when it needs to use it, it copies it into a proxy:
void _Alloc_proxy()
{ // construct proxy
typename _Alty::template rebind<_Container_proxy>::other
_Alproxy(_Getal()); // Copies the allocator!
_Myproxy() = _Unfancy(_Alproxy.allocate(1)); // NOTE this for a later point.
...
The copy constructor of the StackAllocator will set the copy's stack pointer to NULL. Hence the StackString could never work.
Furthermore, even if StackString didn't have this problem, it immediately allocates space of size 1, meaning that after you add anything, it will quickly grow and suffer the same problem anyway.
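The effect described above can be reproduced with a toy model. This is not Chromium's code: Source, ToyAllocator and hasStackBuffer are made-up names, and the sketch only assumes, following the description above, that an allocator created for a different element type (as in the rebind to _Container_proxy) starts without a stack buffer.

```cpp
#include <cassert>

// Toy stand-in for the shared stack buffer; not Chromium's actual type.
struct Source
{
    char buffer[64];
    bool used;
};

template <typename T>
struct ToyAllocator
{
    Source* source_;

    explicit ToyAllocator(Source* source) : source_(source) {}

    // Converting ("rebind") constructor: the new allocator gets no stack buffer.
    template <typename U>
    ToyAllocator(const ToyAllocator<U>&) : source_(nullptr) {}

    bool hasStackBuffer() const { return source_ != nullptr; }
};
```

Constructing a ToyAllocator<int> from a ToyAllocator<char> yields an object whose hasStackBuffer() is false, so any allocation routed through it would have to fall back to the heap.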
Questions
Is this a bug, and if so by whom, VisualC++ or chromium?
If the first symptom doesn't occur, wouldn't the second item be a problem for most compilers anyway?
It seems that StackString was removed from the Chromium project: https://bugs.chromium.org/p/chromium/issues/detail?id=709273
But it is not really a bug; rather, it is some kind of optimization for small strings.
Visual Studio 2015/2017 will allocate a 16-byte std::_Container_proxy in the heap when compiling in "Debug" even for an empty string.
In "Release", we won't use the heap memory for the StackString.
I tested it with this code:
#include <iostream>
#include <new>
#include "stack_container.h"
std::size_t memory = 0;
std::size_t alloc = 0;
void* operator new(std::size_t s) throw(std::bad_alloc) {
// Put breakpoint here
memory += s;
++alloc;
return malloc(s);
}
void operator delete(void* p) throw() {
--alloc;
free(p);
}
void PrintMem_Info()
{
std::cout << "memory = " << memory << '\n';
std::cout << "alloc = " << alloc << '\n';
}
int main()
{
StackString<256> str;
PrintMem_Info();
str->append("Hello World!");
PrintMem_Info();
str[0] = '1';
str[1] = '2';
}
And my solution is:
pointer allocate(size_type n, void* hint = 0) {
#if defined(_MSC_VER)
if (n > stack_capacity)
{
n = stack_capacity;
}
#endif // if defined(_MSC_VER)
if (source_ != nullptr && !source_->used_stack_buffer_
&& n <= stack_capacity) {
source_->used_stack_buffer_ = true;
return source_->stack_buffer();
}
else {
return std::allocator<T>::allocate(n, hint);
}
}

Char array in a struct - not renewing?

I have a for-loop and I'm creating a new instance of a struct on the stack each time. This struct just contains 2 variables: 2 char arrays of 64 bytes.
The code is below:
for (std::map<std::string, std::string>::iterator iter = m_mDevices.begin(); iter != m_mDevices.end(); ++iter)
{
Structs::SDeviceDetails sRecord;
if (false == GenerateDeviceCacheRecord(iter->first, iter->second, sRecord)) // could just pass iter in?
{
// Failed to create cache record
return false;
}
}
The really strange thing I am seeing in the debugger is that every time I loop round, I see the same value in sRecord's buffers, i.e. sRecord.m_strUsername and sRecord.m_strPassword are getting "written over" as opposed to belonging to a newly created struct.
If sRecord.m_strUsername was "abc" on the first loop round, then after the GenerateDeviceCacheRecord function (which just modifies sRecord), sRecord.m_strUsername might be "HIc", where c is the character left over from the first loop! I'm obviously expecting "abc" and "HI", not "abc" and "HIc". Does anyone know what might be going on here?
Thanks
Extra code:
namespace Constants
{
static const int64 MAX_HOSTNAME_BUFFER = 64;
static const int64 MAX_ILA_BUFFER = 64;
};
struct SDeviceRecordDetails
{
char m_strHostname[Constants::MAX_HOSTNAME_BUFFER];
char m_strILA[Constants::MAX_ILA_BUFFER];
};
bool GenerateDeviceCacheRecord(std::string strHostname, std::string strILA, Structs::SDeviceRecordDetails& sRecord)
{
// Convert strings to char arrays to store in the authentication cache manager records
if (strHostname.length() > Constants::MAX_HOSTNAME_BUFFER)
return false;
if (strILA.length() > Constants::MAX_ILA_BUFFER)
return false;
std::copy(strHostname.begin(), strHostname.end(), sRecord.m_strHostname);
std::copy(strILA.begin(), strILA.end(), sRecord.m_strILA);
return true;
}
//! #brief Devices retrieved from XML file
std::map<std::string, std::string> m_mDevicesAuthenticated;
So. I appreciate that you tried to get closer to a better question, so I'm going to take some next steps with you.
What you posted wasn't really an MCVE.
Here's an MCVE for your problem:
#include <iostream>
#include <cstdint>
#include <map>
#include <string>
#include <algorithm>
namespace Constants
{
static const int64_t MAX_HOSTNAME_BUFFER = 64;
static const int64_t MAX_ILA_BUFFER = 64;
};
struct SDeviceRecordDetails
{
char m_strHostname[Constants::MAX_HOSTNAME_BUFFER];
char m_strILA[Constants::MAX_ILA_BUFFER];
};
bool GenerateDeviceCacheRecord(std::string strHostname, std::string strILA, SDeviceRecordDetails& sRecord)
{
// Convert strings to char arrays to store in the authentication cache manager records
if (strHostname.length() > Constants::MAX_HOSTNAME_BUFFER)
return false;
if (strILA.length() > Constants::MAX_ILA_BUFFER)
return false;
std::copy(strHostname.begin(), strHostname.end(), sRecord.m_strHostname);
std::copy(strILA.begin(), strILA.end(), sRecord.m_strILA);
return true;
}
std::map<std::string, std::string> m_mDevices;
int main() {
m_mDevices["hello"] = "foo";
m_mDevices["buzz"] = "bear";
for (std::map<std::string, std::string>::iterator iter = m_mDevices.begin(); iter != m_mDevices.end(); ++iter) {
SDeviceRecordDetails sRecord;
const bool result = GenerateDeviceCacheRecord(iter->first, iter->second, sRecord);
if (result == false)
std::cout << "Failed\n";
else
std::cout << sRecord.m_strHostname << " " << sRecord.m_strILA << "\n";
}
}
Things to note:
I can take this as is (instead of two code blocks in your question) and throw it at a compiler.
I included the proper #include lines.
There were namespaces in your type names that weren't represented in your code.
m_mDevicesAuthenticated != m_mDevices.
You didn't include anything that actually had any output.
What is actually in m_mDevices? This is really important to include!
Among other small corrections I had to apply to the code to get it to build.
What did this code do?
This code almost produces the correct output. It has an error, in that the strings that are written to sRecord are not null terminated.
Because of how compilers generate code, and that you don't explicitly clear sRecord each loop, it's likely that this is the root cause of your problem.
Let's fix that:
Instead of:
std::copy(strHostname.begin(), strHostname.end(), sRecord.m_strHostname);
std::copy(strILA.begin(), strILA.end(), sRecord.m_strILA);
Let's do:
snprintf(sRecord.m_strHostname, Constants::MAX_HOSTNAME_BUFFER, "%s", strHostname.c_str());
snprintf(sRecord.m_strILA, Constants::MAX_ILA_BUFFER, "%s", strILA.c_str());
Or perhaps you are concerned about what sRecord starts each loop with:
In this case, sRecord is not initialized at the beginning of each loop. The compiler is free to have junk data in the struct for optimization purposes.
It happens that most compilers will place each iteration of the struct in that exact same spot in memory. This means that the junk data in the struct could be the data from the previous iteration. Or some other junk depending on how the compiler optimizations function.
You could fix this by initializing the struct to contain explicit data:
SDeviceRecordDetails sRecord = {};
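A minimal sketch of why the empty-brace initializer helps, using a smaller hypothetical struct (Record and makeRecord are not from the original code): because every byte starts out as '\0', a copy that stops before the last element always leaves a terminated string behind.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical, smaller stand-in for SDeviceRecordDetails.
struct Record
{
    char name[8];
};

inline Record makeRecord(const std::string& s)
{
    Record r = {};  // value-initialized: every byte of name is '\0'
    // Copy at most sizeof(name) - 1 characters so the last '\0' survives.
    const std::size_t n = std::min<std::size_t>(s.size(), sizeof r.name - 1);
    std::copy_n(s.begin(), n, r.name);
    return r;
}
```

With this, makeRecord("HI").name compares equal to "HI" no matter what was in that stack slot before, and an over-long input is cut to 7 characters instead of running past the array.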
What does all of this look like:
The finished code, with all the bug fixes looks like:
#include <iostream>
#include <cstdint>
#include <cstdio> // snprintf
#include <map>
#include <string>
#include <algorithm>
namespace Constants
{
static const int64_t MAX_HOSTNAME_BUFFER = 64;
static const int64_t MAX_ILA_BUFFER = 64;
};
struct SDeviceRecordDetails
{
char m_strHostname[Constants::MAX_HOSTNAME_BUFFER];
char m_strILA[Constants::MAX_ILA_BUFFER];
};
bool GenerateDeviceCacheRecord(std::string strHostname, std::string strILA, SDeviceRecordDetails& sRecord)
{
// Convert strings to char arrays to store in the authentication cache manager records
if (strHostname.length() >= Constants::MAX_HOSTNAME_BUFFER) // leave room for the terminating '\0'
return false;
if (strILA.length() >= Constants::MAX_ILA_BUFFER) // leave room for the terminating '\0'
return false;
snprintf(sRecord.m_strHostname, Constants::MAX_HOSTNAME_BUFFER, "%s", strHostname.c_str());
snprintf(sRecord.m_strILA, Constants::MAX_ILA_BUFFER, "%s", strILA.c_str());
return true;
}
std::map<std::string, std::string> m_mDevices;
int main() {
m_mDevices["hello"] = "foo";
m_mDevices["buzz"] = "bear";
m_mDevices["zed"] = "zoo";
for (std::map<std::string, std::string>::iterator iter = m_mDevices.begin(); iter != m_mDevices.end(); ++iter) {
SDeviceRecordDetails sRecord = {};
const bool result = GenerateDeviceCacheRecord(iter->first, iter->second, sRecord);
if (result == false)
std::cout << "Failed\n";
else
std::cout << sRecord.m_strHostname << " " << sRecord.m_strILA << "\n";
}
}
And outputs:
buzz bear
hello foo
zed zoo
Which looks correct to my eyes.
I don't see any initialisation here. You're seeing whatever happened to be at that place in memory before, which for you, today, happens to be the previous contents of those data members.

C++ for each, pulling from vector elements

I am trying to do a foreach on a vector of attacks, each attack has a unique ID say, 1-3.
The class method takes the keyboard input of 1-3.
I am trying to use a foreach to run through my elements in m_attack to see if the number matches, if it does... do something.
The problem I'm seeing is this:
a 'for each' statement cannot operate on an expression of type "std::vector<Attack
Am I going about this totally wrong? I have C# experience, which is kind of what I'm basing this on. Any help would be appreciated.
My code is as follows:
In header
vector<Attack> m_attack;
In class
int Player::useAttack (int input)
{
for each (Attack* attack in m_attack) // Problem part
{
//Pseudocode for the following action
if (attack->m_num == input)
{
//For the found attack, do its damage
attack->makeDamage();
}
}
}
The following examples assume that you use C++11.
Example with a range-based for loop:
for (auto &attack : m_attack) // access by reference to avoid copying
{
if (attack.m_num == input)
{
attack.makeDamage();
}
}
Prefer const auto &attack if makeDamage() is a const member function and does not modify the attack.
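For completeness, here is a self-contained sketch of the range-based loop. The Attack struct is a hypothetical stand-in (the question does not show the class), and since the question does not say what useAttack returns, this version returns the number of matching attacks as an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's Attack class.
struct Attack
{
    int m_num;
    int m_timesUsed;
    void makeDamage() { ++m_timesUsed; }
};

// Assumption: returns how many attacks matched the input.
inline int useAttack(std::vector<Attack>& attacks, int input)
{
    int matched = 0;
    for (auto& attack : attacks)  // elements are Attack objects, so use '.'
    {
        if (attack.m_num == input)
        {
            attack.makeDamage();
            ++matched;
        }
    }
    return matched;
}
```

Because the vector stores Attack objects rather than pointers, the loop variable is a reference to each element and members are reached with '.' instead of '->'.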
You can also use std::for_each from the standard library (header <algorithm>) with a lambda:
std::for_each(m_attack.begin(), m_attack.end(),
[input](Attack &attack) // capture input by value; elements are Attack, not Attack*
{
if (attack.m_num == input)
{
attack.makeDamage();
}
}
);
If you are uncomfortable using std::for_each, you can loop over m_attack using iterators:
for (auto attack = m_attack.begin(); attack != m_attack.end(); ++attack)
{
if (attack->m_num == input)
{
attack->makeDamage();
}
}
Use m_attack.cbegin() and m_attack.cend() to get const iterators.
This is how it would be done in a loop in C++(11):
for (auto& attack : m_attack) // non-const, assuming makeDamage() is not a const member
{
if (attack.m_num == input)
{
attack.makeDamage();
}
}
There is no for each in C++. Another option is to use std::for_each with a suitable functor (this could be anything that can be called with an Attack& as argument).
The for each syntax is supported as an extension to native C++ in Visual Studio.
The example provided on MSDN
#include <vector>
#include <iostream>
using namespace std;
int main()
{
int total = 0;
vector<int> v(6);
v[0] = 10; v[1] = 20; v[2] = 30;
v[3] = 40; v[4] = 50; v[5] = 60;
for each(int i in v) {
total += i;
}
cout << total << endl;
}
(works in VS2013) is not portable/cross platform but gives you an idea of how to use for each.
The standard alternatives (provided in the rest of the answers) apply everywhere. And it would be best to use those.
C++ does not have a for each loop in its syntax. You have to use a C++11 range-based for loop or the function template std::for_each:
struct Function {
int input;
Function(int input): input(input) {}
void operator()(Attack& attack) const {
if(attack.m_num == input) attack.makeDamage(); // attack is a reference, so use '.'
}
};
Function f(input);
std::for_each(m_attack.begin(), m_attack.end(), f);

Efficient way to convert int to string

I'm creating a game in which I have a main loop. During one cycle of this loop, I have to convert an int value to a string roughly 50-100 times. So far I've been using this function:
std::string Util::intToString(int val)
{
std::ostringstream s;
s << val;
return s.str();
}
But it doesn't seem to be quite efficient as I've encountered FPS drop from ~120 (without using this function) to ~95 (while using it).
Is there any other way to convert int to string that would be much more efficient than my function?
It's 1-72 range. I don't have to deal with negatives.
Pre-create an array/vector of 73 string objects, and use an index to get your string. Returning a const reference will let you save on allocations/deallocations, too:
// Initialize smallNumbers to strings "0", "1", "2", ...
static vector<string> smallNumbers;
const string& smallIntToString(unsigned int val) {
return smallNumbers[val < smallNumbers.size() ? val : 0];
}
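The initialization that the comment above leaves out might look like the sketch below. makeSmallNumbers is a hypothetical helper, and the table is moved into a function-local static so that it is built exactly once on first use; std::to_string assumes a C++11 compiler.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the table "0", "1", ..., up to count - 1.
inline std::vector<std::string> makeSmallNumbers(unsigned int count)
{
    std::vector<std::string> table;
    table.reserve(count);
    for (unsigned int i = 0; i < count; ++i)
    {
        table.push_back(std::to_string(i));
    }
    return table;
}

inline const std::string& smallIntToString(unsigned int val)
{
    // Built once, on the first call; 73 entries cover the range 0-72.
    static const std::vector<std::string> smallNumbers = makeSmallNumbers(73);
    // Out-of-range values fall back to entry 0, as in the snippet above.
    return smallNumbers[val < smallNumbers.size() ? val : 0];
}
```

Every call returns a reference to the same stored string, so the hot loop performs no allocation after the first call.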
The standard std::to_string function might be useful.
However, in this case I wonder whether copying the string when returning it might be just as big a bottleneck. If so, you could pass the destination string as a reference argument to the function instead. Then again, if you have std::to_string, the compiler is probably C++11 compatible and can use move semantics instead of copying.
Yep — fall back on functions from C, as explored in this previous answer:
namespace boost {
template<>
inline std::string lexical_cast(const int& arg)
{
char buffer[65]; // large enough for arg < 2^200
ltoa( arg, buffer, 10 );
return std::string( buffer ); // RVO will take place here
}
}//namespace boost
In theory, this new specialisation will take effect throughout the rest of the Translation Unit in which you defined it. ltoa is much faster (despite being non-standard) than constructing and using a stringstream.
However, I've experienced problems with name conflicts between instantiations of this specialisation, and instantiations of the original function template, between competing shared libraries.
In order to get around that, I actually just give this function a whole new name entirely:
template <typename T>
inline std::string fast_lexical_cast(const T& arg)
{
return boost::lexical_cast<std::string>(arg);
}
template <>
inline std::string fast_lexical_cast(const int& arg) // same name as the primary template above
{
char buffer[65];
if (!ltoa(arg, buffer, 10)) {
boost::throw_exception(boost::bad_lexical_cast(
typeid(std::string), typeid(int)
));
}
return std::string(buffer);
}
Usage: std::string myString = fast_lexical_cast(42);
Disclaimer: this modification is reverse-engineered from Kirill's original SO code, not the version that I created and put into production from my company codebase. I can't think right now, though, of any other significant modifications that I made to it.
Something like this:
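// Wrapped in a function so the fragment compiles; needs #include <string>.
std::string intToString(int val)
{
const int size = 12;
char buf[size+1];
buf[size] = 0;
int index = size;
bool neg = false;
if (val < 0) { // Obviously don't need this if val is always positive.
neg = true;
val = -val; // note: overflows for the most negative int
}
do
{
buf[--index] = (val % 10) + '0';
val /= 10;
} while(val);
if (neg)
{
buf[--index] = '-';
}
return std::string(&buf[index]);
}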
I use this:
void append_uint_to_str(string & s, unsigned int i)
{
if(i > 9)
append_uint_to_str(s, i / 10);
s += '0' + i % 10;
}
If you want to handle negative values, change the parameter type to int and insert:
if(i < 0)
{
s += '-';
i = -i;
}
at the beginning of the function.
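A sketch of the signed variant, under the assumption that a separate wrapper is acceptable: append_int_to_str is a hypothetical name, and it negates in unsigned arithmetic so that even the most negative int does not overflow.

```cpp
#include <cassert>
#include <climits>
#include <string>

inline void append_uint_to_str(std::string& s, unsigned int i)
{
    if (i > 9)
        append_uint_to_str(s, i / 10);
    s += static_cast<char>('0' + i % 10);
}

// Hypothetical signed wrapper around the function above.
inline void append_int_to_str(std::string& s, int i)
{
    if (i < 0)
    {
        s += '-';
        // Negate as unsigned: well defined even for INT_MIN.
        append_uint_to_str(s, 0u - static_cast<unsigned int>(i));
    }
    else
    {
        append_uint_to_str(s, static_cast<unsigned int>(i));
    }
}
```

Keeping the sign handling outside the recursive function means the check runs once per number rather than once per digit.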

STL container leak

I'm using a vector container to hold instances of an object which contains 3 ints and 2 std::strings. The vector is created on the stack and populated from a function in another class, but running the app through Deleaker shows that the std::strings from the object are all leaked. Here's the code:
// Populator function:
void PopulatorClass::populate(std::vector<MyClass>& list) {
// m_MainList contains a list of pointers to the master objects
for( std::vector<MyClass*>::iterator it = m_MainList.begin(); it != m_MainList.end(); it++ ) {
list.push_back(**it);
}
}
// Class definition
class MyClass {
private:
std::string m_Name;
std::string m_Description;
int m_nType;
int m_nCategory;
int m_nSubCategory;
};
// Code causing the problem:
std::vector<MyClass> list;
PopulatorClass.populate(list);
When this is run through deleaker the leaked memory is in the allocator for the std::string classes.
I'm using Visual Studio 2010 (CRT).
Is there anything special I need to do to make the strings delete properly when unwinding the stack and deleting the vector?
Thanks,
J
Maybe this is related to "Memory leak with std::vector<std::string>" or something like it.
Every time you have a problem with the STL implementation doing something strange or wrong, like a memory leak, try this:
Reproduce the most basic example of what you are trying to achieve. If it runs without a leak, then the problem is in the way you fill the data. That is the most probable source of the problem (I mean your own code).
An untested, simple, on-the-fly example for your specific problem:
#include <string>
#include <sstream>
#include <vector>
// Class definition
struct MyClass { // struct for convenience
std::string m_Name;
std::string m_Description;
int m_nType;
int m_nCategory;
int m_nSubCategory;
};
// Populator function:
void populate(std::vector<MyClass>& list)
{
const int MAX_TYPE_IDX = 4;
const int MAX_CATEGORY_IDX = 8;
const int MAX_SUB_CATEGORY_IDX = 6;
for( int type_idx = 0; type_idx < MAX_TYPE_IDX ; ++type_idx)
for( int category_idx = 0; category_idx < MAX_CATEGORY_IDX ; ++category_idx)
for( int sub_category_idx = 0; sub_category_idx < MAX_SUB_CATEGORY_IDX ; ++sub_category_idx)
{
std::stringstream name_stream;
name_stream << "object_" << type_idx << "_" << category_idx << "_" << sub_category_idx ;
std::stringstream desc_stream;
desc_stream << "This is an object of the type N°" << type_idx << ".\n";
desc_stream << "It is of category N°" << category_idx << ",\n";
desc_stream << "and of sub-category N°" << sub_category_idx << "!\n";
MyClass object;
object.m_Name = name_stream.str();
object.m_Description = desc_stream.str();
object.m_nType = type_idx;
object.m_nCategory = category_idx;
object.m_nSubCategory = sub_category_idx;
list.push_back( object );
}
}
int main()
{
// Code causing the problem:
std::vector<MyClass> list;
populate(list);
// memory leak check?
return 0;
}
If you still get the memory leak, first check that it's not a false positive from your leak detection software.
If it's not, search for memory leak problems with your STL implementation (most of the time on the compiler developer's website). The implementor might provide a bug tracker that you can search for the same problem and a potential solution.
If you still can't find the source of the leak, try building your project with a different compiler (if you can) and see whether it has the same effect. If the leak still occurs, the problem most likely comes from your code.
Probably the same root issue as in Alexey's link. The shipped version has broken move code for basic_string. MS abandoned us VC10 users, so you must fix it yourself. In the xstring file you have this:
_Myt& assign(_Myt&& _Right)
{ // assign by moving _Right
if (this == &_Right)
;
else if (get_allocator() != _Right.get_allocator()
&& this->_BUF_SIZE <= _Right._Myres)
*this = _Right;
else
{ // not same, clear this and steal from _Right
_Tidy(true);
if (_Right._Myres < this->_BUF_SIZE)
_Traits::move(this->_Bx._Buf, _Right._Bx._Buf,
_Right._Mysize + 1);
else
{ // copy pointer
this->_Bx._Ptr = _Right._Bx._Ptr;
_Right._Bx._Ptr = 0;
}
this->_Mysize = _Right._Mysize;
this->_Myres = _Right._Myres;
_Right._Mysize = 0;
_Right._Myres = 0;
}
return (*this);
}
Note the last
_Right._Myres = 0;
that should happen only under the last condition; in the short-string case, _Right is better left alone.
Because the capacity is set to 0 instead of 15, other code will take an unintended branch in the function Grow() when you assign another small string, and will allocate a block of memory just to trample over the pointer with the immediate string content.