I recently took an interest in std::allocator, thinking it might solve an issue I had with a design decision in some C++ code.
I've now read some documentation about it and watched some videos, such as Andrei Alexandrescu's talk at CppCon 2015, and I basically understand that I shouldn't use allocators, because they're not designed to work the way I thought they might.
That being said, before realising this, I wrote some test code to see how a custom subclass of std::allocator could work.
Obviously, it didn't work as expected... : )
So the question is not about how allocators should be used in C++, but I'm just curious to learn exactly why my test code (provided below) is not working.
Not because I want to use custom allocators. Just curious to see the exact reason...
typedef std::basic_string< char, std::char_traits< char >, TestAllocator< char > > TestString;
int main( void )
{
TestString s1( "hello" );
TestString s2( s1 );
s1 += ", world";
std::vector< int, TestAllocator< int > > v;
v.push_back( 42 );
return 0;
}
Complete code for TestAllocator is provided at the end of this question.
Here I'm simply using my custom allocator with some std::basic_string, and with std::vector.
With std::basic_string, I can see an instance of my allocator is actually created, but no method is called...
So it just looks like it's not used at all.
But with std::vector, my own allocate method is actually being called.
So why is there a difference here?
I did try with different compilers and C++ versions.
Looks like the old GCC versions, with C++98, do call allocate on my TestString type, but not the new ones with C++11 and later.
Clang doesn't call allocate either.
So just curious to see an explanation about these different behaviours.
Allocator code:
template< typename _T_ >
struct TestAllocator
{
public:
typedef _T_ value_type;
typedef _T_ * pointer;
typedef const _T_ * const_pointer;
typedef _T_ & reference;
typedef const _T_ & const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef std::true_type propagate_on_container_move_assignment;
typedef std::true_type is_always_equal;
template< class _U_ >
struct rebind
{
typedef TestAllocator< _U_ > other;
};
TestAllocator( void ) noexcept
{
std::cout << "CTOR" << std::endl;
}
TestAllocator( const TestAllocator & other ) noexcept
{
( void )other;
std::cout << "CCTOR" << std::endl;
}
template< class _U_ >
TestAllocator( const TestAllocator< _U_ > & other ) noexcept
{
( void )other;
std::cout << "CCTOR" << std::endl;
}
~TestAllocator( void )
{
std::cout << "DTOR" << std::endl;
}
pointer address( reference x ) const noexcept
{
return std::addressof( x );
}
pointer allocate( size_type n, std::allocator< void >::const_pointer hint = 0 )
{
pointer p;
( void )hint;
std::cout << "allocate" << std::endl;
p = new _T_[ n ]();
if( p == nullptr )
{
throw std::bad_alloc() ;
}
return p;
}
void deallocate( _T_ * p, std::size_t n )
{
( void )n;
std::cout << "deallocate" << std::endl;
delete[] p;
}
const_pointer address( const_reference x ) const noexcept
{
return std::addressof( x );
}
size_type max_size() const noexcept
{
return size_type( ~0 ) / sizeof( _T_ );
}
void construct( pointer p, const_reference val )
{
( void )p;
( void )val;
std::cout << "construct" << std::endl;
}
void destroy( pointer p )
{
( void )p;
std::cout << "destroy" << std::endl;
}
};
template< class _T1_, class _T2_ >
bool operator ==( const TestAllocator< _T1_ > & lhs, const TestAllocator< _T2_ > & rhs ) noexcept
{
( void )lhs;
( void )rhs;
return true;
}
template< class _T1_, class _T2_ >
bool operator !=( const TestAllocator< _T1_ > & lhs, const TestAllocator< _T2_ > & rhs ) noexcept
{
( void )lhs;
( void )rhs;
return false;
}
std::basic_string can be implemented using the small buffer optimization (a.k.a. SBO or SSO in the context of strings) - this means that it internally stores a small buffer that avoids allocations for small strings. This is very likely the reason your allocator is not being used.
Try changing "hello" to a longer string (more than 32 characters) and it will probably invoke allocate.
Also note that the C++11 standard forbids std::string to be implemented in a COW (copy-on-write) fashion - more information in this question: "Legality of COW std::string implementation in C++11"
The Standard forbids std::vector to make use of the small buffer optimization: more information can be found in this question: "May std::vector make use of small buffer optimization?".
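To see the small buffer optimization in action without all the logging, here is a minimal sketch using a counting allocator. The names CountingAllocator, CountedString, g_allocations and allocations_for_length are made up for this illustration; the only assumption is a C++11 (or later) standard library.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical global counter: incremented on every call to allocate().
static std::size_t g_allocations = 0;

// Minimal C++11 allocator; std::allocator_traits fills in everything else.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Returns how many times allocate() ran while building a string of `length` characters.
std::size_t allocations_for_length(std::size_t length) {
    const std::size_t before = g_allocations;
    CountedString s(length, 'x');
    return g_allocations - before;
}
```

On implementations that use the small buffer optimization, allocations_for_length(5) should report no allocation at all, while allocations_for_length(100) has to go through the allocator at least once.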
This is an excerpt from my code:
std::map<int, std::pair< const int, const std::vector<POINT_3d> > > m_srcHitData;
void addHit( const int edgeId, const int hit )
{
m_srcHitData[edgeId] = std::make_pair( hit, std::vector<POINT_3d>() );
}
I keep getting the error:
stl_pair.h(180): error: no operator "=" matches these operands
operand types are: const std::vector<POINT_3d, std::allocator<POINT_3d>> = const std::vector<POINT_3d, std::allocator<POINT_3d>>
second = __p.second;
^
detected during instantiation of "std::pair<_T1, _T2> &std::pair<_T1, _T2>::operator=(const std::pair<_U1, _U2> &)
What does that mean? I tried different approaches but I still get this or a similar error. Thank you!
Well, m_srcHitData[edgeId] is a pair with a const vector member. You can't simply assign to it, because that means assigning to the const vector, which is not possible...
As for what you can do about it, see:
How to create a std::map of constant values which is still accessible by the [] operator?
As @FrancisCugler suggests, that could be, for example, writing:
m_srcHitData.insert( std::make_pair( edgeId, std::make_pair( hit, std::vector<POINT_3d>() ) ) );
However, if your vectors are long, you might not actually want to copy all that data around.
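A small self-contained sketch of the insert approach follows. Point3d is a hypothetical stand-in for POINT_3d, and HitMap and the free function addHit are names invented for the example. Note that, unlike operator[], insert leaves an existing entry untouched, which is exactly why it never needs to assign to the const members.

```cpp
#include <map>
#include <utility>
#include <vector>

struct Point3d { double x, y, z; };  // hypothetical stand-in for POINT_3d

using HitMap = std::map<int, std::pair<const int, const std::vector<Point3d>>>;

// Inserts a new entry for edgeId; returns false if the key was already present.
// The mapped pair is constructed in place from the argument, never assigned to,
// so its const members are not a problem.
bool addHit(HitMap& hits, int edgeId, int hit) {
    return hits.insert(
        std::make_pair(edgeId, std::make_pair(hit, std::vector<Point3d>()))).second;
}
```

Reading works as usual, for example through hits.at(edgeId).first or hits.find(edgeId).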
This part in your code kind of looks ugly...
std::map<int, std::pair< const int, const std::vector<POINT_3d> > > m_srcHitData;
You could try restructuring your code a little.
struct Pair {
unsigned int key_;
std::vector<POINT_3d> points_;
Pair() {} // Empty Default
Pair( const unsigned int& key, const std::vector<POINT_3d>& points ) :
key_(key),
points_( points )
{}
};
Then...
std::map<unsigned, Pair> m_srcHitData;
void addHit( const int edgeId, const int hit ) {
m_srcHitData[edgeId] = Pair( hit, std::vector<POINT_3d>() );
}
I made this short program to simulate a similar structure only I used strings in place of your std::vector<POINT_3d>
#include <string>
#include <iostream>
#include <map>
struct Pair {
unsigned key_;
std::string value_;
Pair() {}
Pair( const unsigned int& key, const std::string& value ) :
key_( key ),
value_( value ) {}
};
class MyClass {
public:
std::map<unsigned, Pair> myMap_;
void addValue( const unsigned int& key, const std::string& value ) {
myMap_[key] = Pair( key, value );
}
};
int main() {
MyClass myClass;
myClass.addValue( 1, "Hello" );
myClass.addValue( 2, "World" );
typedef std::map<unsigned, Pair>::iterator Iter;
Iter it = myClass.myMap_.begin();
for ( ; it != myClass.myMap_.end(); ++it ) {
std::cout << "Key: " << it->first << " Pair-Key: " << it->second.key_ << " Pair-value: " << it->second.value_ << std::endl;
}
std::cout << "\nPress any key and enter to quit." << std::endl;
char c;
std::cin >> c;
}
You can use the above, but substitute your vector<T> objects for the strings.
I also used a public interface on both the struct and the class for simplicity of demonstration. Normally the container in the class would be either protected or private, with accessor functions.
EDIT This is to help construct the map first. Once you have the map working then you can modify it to add in the const storage types if needed, but they can be tricky to work with. Refer to the link in einpoklum's answer.
If you are working with the newer versions of C++ you can change these lines of code:
typedef std::map<unsigned, Pair>::iterator Iter;
Iter it = myClass.myMap_.begin();
into this:
auto it = myClass.myMap_.begin();
PROBLEM
I have this old piece of pre-stl C++ code that I want to translate into std C++11 without losing efficiency.
using T = unsigned; // but can be any POD
FILE* fp = fopen( outfile.c_str(), "r" );
T* x = new T[big_n];
fread( x, sizeof(T), big_n, fp );
delete[] x;
fclose( fp );
Note that big_n is really big - like millions of records big, so any inefficiencies are pronounced.
PREVIOUS SOLUTION
In this answer from my previous question, I accepted this solution:
std::vector<T> x(big_n);
fread(x.data(), sizeof(T), big_n, fp);
ISSUE AND ATTEMPTED SOLUTION
That previous solution works, but the constructor actually calls T's default constructor big_n times. This is very slow when big_n is really big (and totally unnecessary as I am about to fread() the entire chunk from disk). FWIW, in my test case for one file, it was taking 3 seconds instead of 200ms.
So I tried to use this instead:
std::vector<T> x;
x.reserve( big_n );
fread(x.data(), sizeof(T), big_n, fp);
This seems to work, but then I run into the issue that size() returns 0 and not big_n.
How do I correct this without losing too much efficiency?
ADDENDUM
I just noticed that std::vector<> can take a custom allocator. Could using that form of the constructor solve my problem? I'm looking into this approach now.
WHAT WORKS FOR ME
I've looked into Ali's custom allocator solution below in addition to jrok's simple array solution. I have decided to adapt jrok's solution for its ease of understanding and lower maintenance.
The working code I came up with is below:
#include <vector>
#include <utility>
#include <memory>
#include <fstream>
#include <iostream>
#include <cassert>
struct Foo
{
int m_i;
Foo() { }
Foo( int i ) : m_i( i ) { }
bool operator==( Foo const& rhs ) const { return m_i==rhs.m_i; }
bool operator!=( Foo const& rhs ) const { return m_i!=rhs.m_i; }
friend std::ostream& operator<<( std::ostream& os, Foo const& rhs )
{ return os << rhs.m_i; }
};
// DESIGN NOTES /*{{{*/
//
// LIMITATION T must be a POD so we can fread/fwrite quickly
//
// WHY DO WE NEED THIS CLASS?
//
// We want to write a large number of small PODs to disk and read them back without
// 1. spurious calls to default constructors by std::vector
// 2. writing to disk a gazillion times
//
// SOLUTION
// A hybrid class containing a std::vector<> for adding new items and a
// std::unique_ptr<T[]> for fast persistence. From the user's POV, it looks
// like a std::vector<>.
//
// Algorithm
// 1. add new items into:
// std::vector<T> m_v;
// 2. when writing to disk, write out m_v as a chunk
// 3. when reading from disk, read into m_chunk (m_v will start empty again)
// 4. m_chunk and m_v combined will represent all the data
/*}}}*/
template<typename T>
class vector_chunk
{
// STATE /*{{{*/
size_t m_n_in_chunk;
std::unique_ptr<T[]> m_chunk;
std::vector<T> m_v;
/*}}}*/
// CONSTRUCTOR, INITIALIZATION /*{{{*/
public:
vector_chunk() : m_n_in_chunk( 0 ) { }
/*}}}*/
// EQUALITY /*{{{*/
public:
bool operator==( vector_chunk const& rhs ) const
{
if ( rhs.size()!=size() )
return false;
for( size_t i=0; i<size(); ++i )
if ( operator[]( i )!=rhs[i] )
return false;
return true;
}
/*}}}*/
// OSTREAM /*{{{*/
public:
friend std::ostream& operator<<( std::ostream& os, vector_chunk const& rhs )
{
for( size_t i=0; i<rhs.m_n_in_chunk; ++i )
os << rhs.m_chunk[i] << "\n";
for( T const& t : rhs.m_v )
os << t << "\n";
return os;
}
/*}}}*/
// BINARY I/O /*{{{*/
public:
void write_as_binary( std::ostream& os ) const
{
// write everything out
size_t const n_total = size();
os.write( reinterpret_cast<const char*>( &n_total ), sizeof( n_total ));
os.write( reinterpret_cast<const char*>( &m_chunk[0] ), m_n_in_chunk * sizeof( T ));
os.write( reinterpret_cast<const char*>( m_v.data() ), m_v.size() * sizeof( T ));
}
void read_as_binary( std::istream& is )
{
// only read into m_chunk, clear m_v
is.read( reinterpret_cast<char*>( &m_n_in_chunk ), sizeof( m_n_in_chunk ));
m_chunk.reset( new T[ m_n_in_chunk ] );
is.read( reinterpret_cast<char*>( &m_chunk[0] ), m_n_in_chunk * sizeof( T ));
m_v.clear();
}
/*}}}*/
// DELEGATION to std::vector<T> /*{{{*/
public:
size_t size() const { return m_n_in_chunk + m_v.size(); }
void push_back( T const& value ) { m_v.push_back( value ); }
void push_back( T&& value ) { m_v.push_back( std::move( value )); }
template< class... Args >
void emplace_back( Args&&... args ) { m_v.emplace_back( std::forward<Args>( args )... ); }
typename std::vector<T>::const_reference
operator[]( size_t pos ) const
{ return ((pos < m_n_in_chunk) ? m_chunk[ pos ] : m_v[ pos - m_n_in_chunk]); }
typename std::vector<T>::reference
operator[]( size_t pos )
{ return ((pos < m_n_in_chunk) ? m_chunk[ pos ] : m_v[ pos - m_n_in_chunk]); }
/*}}}*/
};
int main()
{
size_t const n = 10;
vector_chunk<Foo> v, w;
for( int i=0; i<n; ++i )
v.emplace_back( Foo{ i } );
std::filebuf ofb, ifb;
std::unique_ptr<std::ostream> osp;
std::unique_ptr<std::istream> isp;
ofb.open( "/tmp/junk.bin", (std::ios::out | std::ios::binary));
osp.reset( new std::ostream( &ofb ));
v.write_as_binary( *osp );
ofb.close();
ifb.open( "/tmp/junk.bin", (std::ios::in | std::ios::binary));
isp.reset( new std::istream( &ifb ));
w.read_as_binary( *isp );
ifb.close();
assert( v==w );
}
Using vector::reserve() and then writing into vector::data() is a dirty hack and undefined behavior. Please don't do that.
The way to solve this problem is to use a custom allocator, such as the one in this answer. I have just tested it, works fine with clang 3.5 trunk but doesn't compile with gcc 4.7.2.
Although, as others have already pointed out, unique_ptr<T[]> will serve your needs just fine.
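For reference, one common shape for such an allocator is an adaptor that turns value-initialization into default-initialization. The sketch below is not necessarily the code from the linked answer; default_init_allocator and uninit_vector are names assumed for this illustration, and it requires C++11 or later.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adaptor: wraps allocator A and replaces value-initialization
// (which zeroes trivial types) with default-initialization (which does not).
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;  // inherit the wrapped allocator's constructors

    // Chosen when the container constructs an element with no arguments,
    // e.g. in vector(n) or resize(n): default-initialize, no zeroing.
    template <typename U>
    void construct(U* p) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(p)) U;
    }

    // Every other construction is forwarded to the wrapped allocator.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        traits::construct(static_cast<A&>(*this), p, std::forward<Args>(args)...);
    }
};

// Convenience alias (hypothetical name).
template <typename T>
using uninit_vector = std::vector<T, default_init_allocator<T>>;
```

With this, uninit_vector<T> x(big_n); followed by fread(x.data(), sizeof(T), big_n, fp); keeps size() == big_n without touching the elements first. The elements hold indeterminate values until they are written, so they must not be read before the fread.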
If you don't need the interface of the vector:
auto p = unique_ptr<T[]>{ new T[big_n] };
It won't initialize the array if T is POD, otherwise it calls default constructors (default-initialization).
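In C++14 (C++1y), you'll be able to use std::make_unique; note, however, that std::make_unique<T[]>(n) value-initializes the array, so PODs are zeroed. C++20 adds std::make_unique_for_overwrite, which default-initializes instead.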
If using boost is an option for you, since version 1.55 boost::container::vector has had support for explicitly default-initializing elements when resizing using the syntax:
using namespace boost::container;
vector<T> vector(37283, default_init);
at creation or
using namespace boost::container;
vector.resize(37283, default_init);
after creation. This results in the nice syntax:
using T = unsigned; // but can be any trivially copyable type
FILE* fp = fopen( outfile.c_str(), "r" );
boost::container::vector<T> x(big_n, boost::container::default_init);
fread( x.data(), sizeof(T), big_n, fp );
fclose( fp );
In my tests performance is identical to using std::vector with a default-initializing allocator.
EDIT: Unrelated aside, I'd use an RAII wrapper for FILE*:
struct FILE_deleter {
void operator () (FILE* f) const {
if (f) fclose(f);
}
};
using FILE_ptr = std::unique_ptr<FILE, FILE_deleter>;
using T = unsigned; // but can be any trivially copyable type
FILE_ptr fp{fopen( outfile.c_str(), "r" )};
boost::container::vector<T> x(big_n, boost::container::default_init);
fread( x.data(), sizeof(T), big_n, fp.get() );
I'm a bit OCD about RAII.
EDIT 2: Another option, if you absolutely MUST produce a std::vector<T>, and not a boost::container::vector<T> or std::vector<T, default_allocator<T>>, is to fill your std::vector<T> from a custom iterator pair. Here's one way to make an fread iterator:
template <typename T>
class fread_iterator :
public boost::iterator_facade<fread_iterator<T>, T,
std::input_iterator_tag, T> {
friend boost::iterator_core_access;
bool equal(const fread_iterator& other) const {
return (file_ && feof(file_)) || n_ <= other.n_;
}
T dereference() const {
// is_trivially_copyable is sufficient, but libstdc++
// (for whatever reason) doesn't have that trait.
static_assert(std::is_pod<T>::value,
"Jabberwocky is killing user.");
T result;
fread(&result, sizeof(result), 1, file_);
return result;
}
void increment() { --n_; }
FILE* file_;
std::size_t n_;
public:
fread_iterator() : file_(nullptr), n_(0) {}
fread_iterator(FILE* file, std::size_t n) : file_(file), n_(n) {}
};
(I've used boost::iterator_facade to reduce the iterator boilerplate.) The idea here is that the compiler can elide the move construction of dereference's return value so that fread will read directly into the vector's memory buffer. It will likely be less efficient due to calling fread once per item vs. just once for the allocator modification methods, but nothing too terrible since (a) the file data is still only copied once from the stdio buffer into the vector, and (b) the whole point of buffering IO is so that granularity has less impact. You would fill the vector using its assign(iterator, iterator) member:
using T = unsigned; // but can be any trivially copyable type
FILE_ptr fp{fopen( outfile.c_str(), "r" )};
std::vector<T> x;
x.reserve(big_n);
x.assign(fread_iterator<T>{fp.get(), big_n}, fread_iterator<T>{});
Throwing it all together and testing side-by-side, this iterator method is about 10% slower than using the custom allocator method or boost::container::vector. The allocator and boost method have virtually identical performance.
Since you are upgrading to C++11, why not use file streams as well? I just tried reading a 17 MB file into a char* using ifstream and then writing the contents to a file using ofstream.
I ran the same application in a loop 15 times and the maximum time it took is 320 ms and minimum is 120 ms.
#include <cstring>
#include <fstream>
#include <iostream>
#include <memory>
std::unique_ptr<char []> ReadToEnd(const char* filename)
{
std::ifstream inpfile(filename, std::ios::in | std::ios::binary | std::ios::ate);
std::unique_ptr<char[]> ret;
if (inpfile.is_open())
{
auto sz = static_cast<size_t>(inpfile.tellg());
inpfile.seekg(0, std::ios::beg);
ret.reset(new char[sz + 1]);
ret[sz] = '\0';
inpfile.read(ret.get(), sz);
}
return ret;
}
int main(int argc, char* argv [])
{
auto data = ReadToEnd(argv[1]);
std::cout << "Num of characters in file:" << strlen(data.get()) << "\n";
std::ofstream outfile("output.txt");
outfile.write(data.get(), strlen(data.get()));
}
Output
D:\code\cpp\ConsoleApplication1\Release>ConsoleApplication1.exe d:\code\cpp\SampleApp\Release\output.txt
Num of characters in file:18805057
Time taken to read the file, d:\code\cpp\SampleApp\Release\output.txt:152.008 ms.
Let’s consider the following snippet, and suppose that a, b, c and d are non-empty strings.
std::string a, b, c, d;
d = a + b + c;
When computing the sum of those three std::string instances, standard library implementations create a first temporary std::string object, copy the concatenated buffers of a and b into its internal buffer, then perform the same operations between the temporary string and c.
A fellow programmer was stressing that, instead of this behaviour, operator+(std::string, std::string) could be defined to return a std::string_helper.
This object’s sole role would be to defer the actual concatenations until the moment it is cast into a std::string. Obviously, operator+(std::string_helper, std::string) would be defined to return the same helper, which would "keep in mind" the fact that it has an additional concatenation to carry out.
Such behaviour would save the CPU cost of creating n-1 temporary objects, allocating their buffers, copying them, etc. So my question is: why doesn’t it already work like that? I can’t think of any drawback or limitation.
why doesn’t it already work like that?
I can only speculate about why it was originally designed like that. Perhaps the designers of the string library simply didn't think of it; perhaps they thought the extra type conversion (see below) might make the behaviour too surprising in some situations. It is one of the oldest C++ libraries, and a lot of wisdom that we take for granted simply didn't exist in past decades.
As to why it hasn't been changed to work like that: it could break existing code, by adding an extra user-defined type conversion. Implicit conversions can only involve at most one user-defined conversion. This is specified by C++11, 13.3.3.1.2/1:
A user-defined conversion sequence consists of an initial standard conversion sequence followed by a user-defined conversion followed by a second standard conversion sequence.
Consider the following:
struct thingy {
thingy(std::string);
};
void f(thingy);
f(some_string + another_string);
This code is fine if the type of some_string + another_string is std::string. That can be implicitly converted to thingy via the conversion constructor. However, if we were to change the definition of operator+ to give another type, then it would need two conversions (string_helper to string to thingy), and so would fail to compile.
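The one-user-defined-conversion rule can be checked at compile time. In the sketch below, string_helper is a hypothetical stand-in for the proposed proxy type (no such type exists in the standard library), and thingy mirrors the struct above.

```cpp
#include <string>
#include <type_traits>

// Hypothetical deferred-concatenation proxy, standing in for the proposed helper.
struct string_helper {
    std::string lhs;
    std::string rhs;
    operator std::string() const { return lhs + rhs; }  // user-defined conversion
};

struct thingy {
    thingy(std::string) {}  // converting constructor: another user-defined conversion
};

// One user-defined conversion is allowed in an implicit conversion sequence...
static_assert(std::is_convertible<std::string, thingy>::value,
              "string -> thingy is implicit");
static_assert(std::is_convertible<string_helper, std::string>::value,
              "string_helper -> string is implicit");

// ...but chaining two of them is not.
static_assert(!std::is_convertible<string_helper, thingy>::value,
              "string_helper -> string -> thingy would need two user-defined conversions");

// Spelling one step out explicitly (direct-initialization) still works.
static_assert(std::is_constructible<thingy, string_helper>::value,
              "thingy(helper) is fine");
```

So every call site like f(some_string + another_string) would have to be rewritten with an explicit conversion, which is the breakage described above.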
So, if the speed of string building is important, you'll need to use alternative methods like concatenation with +=. Or, according to Matthieu's answer, don't worry about it because C++11 fixes the inefficiency in a different way.
The obvious answer: because the standard doesn't allow it. It impacts code by introducing an additional user defined conversion in some cases: if C is a type having a user defined constructor taking an std::string, then it would make:
C obj = stringA + stringB;
illegal.
It depends.
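In C++03, it is true that there may be a slight inefficiency there (comparable to Java and C#, as they use string interning, by the way). This can be alleviated using the following (the parentheses are required: += groups right to left, so the unparenthesized form would modify a and b):
d = ((std::string("") += a) += b) += c;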
which is not really... idiomatic.
In C++11, operator+ is overloaded for rvalue references. Meaning that:
d = a + b + c;
is transformed into:
d.assign(std::move(operator+(a, b).append(c)));
which is (nearly) as efficient as you can get.
The only inefficiency left in the C++11 version is that the memory is not reserved once and for all at the beginning, so there might be reallocation and copies up to 2 times (for each new string). Still, because appending is amortized O(1), unless c is much longer than b, at worst a single reallocation + copy should take place. And of course, we are talking POD copy here (so a memcpy call).
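If that remaining reallocation matters, the memory can be reserved up front by hand. A minimal sketch, with concat3 being a hypothetical helper name:

```cpp
#include <string>

// Concatenates three strings with a single up-front reservation,
// so the appends below never need to reallocate.
std::string concat3(const std::string& a, const std::string& b, const std::string& c) {
    std::string result;
    result.reserve(a.size() + b.size() + c.size());
    result += a;
    result += b;
    result += c;
    return result;
}
```

Usage would be d = concat3(a, b, c); the returned temporary is then moved into d in C++11.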
Sounds to me like something like this already exists: std::stringstream.
Only you have << instead of +. Just because std::string::operator + exists, it doesn't make it the most efficient option.
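For illustration, a stream-based version of the same concatenation could look like this (join3 is a hypothetical name; whether it is actually faster than + or += depends on the implementation and should be measured):

```cpp
#include <sstream>
#include <string>

// Accumulates the pieces in a stream buffer and materializes one string at the end.
std::string join3(const std::string& a, const std::string& b, const std::string& c) {
    std::ostringstream out;
    out << a << b << c;
    return out.str();
}
```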
I think if you use +=, then it will be a little faster:
d += a;
d += b;
d += c;
It should be faster, as it doesn't create temporary objects. Or simply this:
d.append(a).append(b).append(c); //same as above: i.e using '+=' 3 times.
The main reason for not doing a string of individual + concatenations, and especially not doing that in a loop, is that it has O(n^2) complexity.
A reasonable alternative with O(n) complexity is to use a simple string builder, like
template< class Char >
class ConversionToString
{
public:
// Visual C++ 10.0 has some DLL linking problem with other types:
CPP_STATIC_ASSERT((
std::is_same< Char, char >::value || std::is_same< Char, wchar_t >::value
));
typedef std::basic_string< Char > String;
typedef std::basic_ostringstream< Char > OutStringStream;
// Just a default implementation, not particularly efficient.
template< class Type >
static String from( Type const& v )
{
OutStringStream stream;
stream << v;
return stream.str();
}
static String const& from( String const& s )
{
return s;
}
};
template< class Char, class RawChar = Char >
class StringBuilder;
template< class Char, class RawChar >
class StringBuilder
{
private:
typedef std::basic_string< Char > String;
typedef std::basic_string< RawChar > RawString;
RawString s_;
template< class Type >
static RawString fastStringFrom( Type const& v )
{
return ConversionToString< RawChar >::from( v );
}
static RawChar const* fastStringFrom( RawChar const* s )
{
assert( s != 0 );
return s;
}
static RawChar const* fastStringFrom( Char const* s )
{
assert( s != 0 );
CPP_STATIC_ASSERT( sizeof( RawChar ) == sizeof( Char ) );
return reinterpret_cast< RawChar const* >( s );
}
public:
enum ToString { toString };
enum ToPointer { toPointer };
String const& str() const { return reinterpret_cast< String const& >( s_ ); }
operator String const& () const { return str(); }
String const& operator<<( ToString ) { return str(); }
RawChar const* ptr() const { return s_.c_str(); }
operator RawChar const* () const { return ptr(); }
RawChar const* operator<<( ToPointer ) { return ptr(); }
template< class Type >
StringBuilder& operator<<( Type const& v )
{
s_ += fastStringFrom( v );
return *this;
}
};
template< class Char >
class StringBuilder< Char, Char >
{
private:
typedef std::basic_string< Char > String;
String s_;
template< class Type >
static String fastStringFrom( Type const& v )
{
return ConversionToString< Char >::from( v );
}
static Char const* fastStringFrom( Char const* s )
{
assert( s != 0 );
return s;
}
public:
enum ToString { toString };
enum ToPointer { toPointer };
String const& str() const { return s_; }
operator String const& () const { return str(); }
String const& operator<<( ToString ) { return str(); }
Char const* ptr() const { return s_.c_str(); }
operator Char const* () const { return ptr(); }
Char const* operator<<( ToPointer ) { return ptr(); }
template< class Type >
StringBuilder& operator<<( Type const& v )
{
s_ += fastStringFrom( v );
return *this;
}
};
namespace narrow {
typedef StringBuilder<char> S;
} // namespace narrow
namespace wide {
typedef StringBuilder<wchar_t> S;
} // namespace wide
Then you can write efficient and clear things like …
using narrow::S;
std::string a = S() << "The answer is " << 6*7;
foo( S() << "Hi, " << username << "!" );
I know vector< bool > is "evil", and dynamic_bitset is preferred (bitset is not suitable), but I am using C++ Builder 6 and I don't really want to pursue the Boost route for such an old version. I tried:
int RecordLen = 1;
int NoBits = 8;
std::ofstream Binary( FileNameBinary );
vector< bool > CaseBits( NoBits, 0 );
Binary.write( ( const char * ) & CaseBits[ 0 ], RecordLen);
but the results are incorrect. I suspect that the implementation may mean this is a stupid thing to try, but I don't know.
Operator[] for vector <bool> doesn't return a reference (because bits are not addressable), so taking the return value's address is going to be fraught with problems. Have you considered std::deque <bool>?
The bool vector specialization does not return a reference to bool.
See here, bottom of the page.
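Since the bits cannot be addressed directly, one portable way to get them onto disk is to pack them into bytes yourself first. The sketch below is an assumption about what is wanted, not something from the answers above: pack_bits is a hypothetical helper, and the choice of least-significant-bit-first ordering is arbitrary. It avoids C++11 features so that it should also suit older compilers.

```cpp
#include <cstddef>
#include <vector>

// Packs bits into bytes, least significant bit first within each byte.
// Unused bits in the last byte are left as zero.
std::vector<unsigned char> pack_bits(const std::vector<bool>& bits) {
    std::vector<unsigned char> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i]) {
            bytes[i / 8] |= static_cast<unsigned char>(1u << (i % 8));
        }
    }
    return bytes;
}
```

The resulting buffer is contiguous, so (when it is non-empty) it can be passed to Binary.write as ( const char * ) &bytes[ 0 ] with bytes.size() as the length. Opening the stream with std::ios::binary avoids newline translation on platforms that do it.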
It's too late for me to decide how compliant this is, but it works for me: give the bitvector a custom allocator to alias the bits to your own buffer.
Can someone weigh in with whether the rebound allocator inside the vector is required to be copy-constructed from the one passed in? Works on GCC 4.2.1. I seem to recall that the functionality is required for C++0x, and since it's not incompatible with anything in C++03 and is generally useful, support may already be widespread.
Of course, it's implementation-defined whether bits are stored forwards or backwards or left- or right-justified inside whatever storage vector<bool> uses, so take great care.
#include <vector>
#include <iostream>
#include <iomanip>
using namespace std;
template< class T >
struct my_alloc : allocator<T> {
template< class U > struct rebind {
typedef my_alloc<U> other;
};
template< class U >
my_alloc( my_alloc<U> const &o ) {
buf = o.buf;
}
my_alloc( void *b ) { buf = b; }
// noncompliant with C++03: no default constructor
T *allocate( size_t, const void *hint=0 ) {
return static_cast< T* >( buf );
}
void deallocate( T*, size_t ) { }
void *buf;
};
int main() {
unsigned long buf[ 2 ];
vector<bool, my_alloc<bool> > blah( 128, false, my_alloc<bool>( buf ) );
blah[3] = true;
blah[100] = true;
cerr << hex << setw(16) << buf[0] << " " << setw(16) << buf[1] << endl;
}
So I have this library code, see...
class Thing
{
public:
class Obj
{
public:
static const int len = 16;
explicit Obj(char *str)
{
strncpy(str_, str, len);
}
virtual void operator()() = 0;
private:
char str_[len];
};
explicit Thing(vector<Obj*> &objs) : objs_(objs) {}
~Thing() {
for(vector<Obj*>::iterator i = objs_.begin(); i != objs_.end(); ++i) {
delete *i;
}
}
private:
vector<Obj*> objs_;
};
And in my client code...
class FooObj : public Thing::Obj
{
virtual void operator()() {
//do stuff
}
};
class BarObj : public Thing::Obj
{
virtual void operator()() {
//do different stuff
}
};
vector<Thing::Obj*> objs;
int nStructs = system_call(*structs);
for(int i = 0; i < nStructs; i++) {
objs.push_back(new FooObj(structs[i].str));
}
objs.push_back(new BarObj("bar1"));
objs.push_back(new BarObj("bar2"));
Thing thing(objs);
// thing does stuff, including calling Obj::operator()() on the elements of the objs_ vector
The thing I'm stuck on is exception safety. As it stands, if any of the Obj constructors throw, or the Thing constructor throws, the Objs already in the vector will leak. The vector needs to contain pointers to Objs because they're being used polymorphically. And, I need to handle any exceptions here, because this is being invoked from an older codebase that is exception-unaware.
As I see it, my options are:
Wrap the client code in a giant try block, and clean up the vector in the catch block.
Put try blocks around all of the allocations, the catch blocks of which call a common cleanup function.
Some clever RAII-based idea that I haven't thought of yet.
Punt. Realistically, if the constructors throw, the application is about to go down in flames anyway, but I'd like to handle this more gracefully.
Take a look at boost::ptr_vector
Since your Thing destructor already knows how to clean up the vector, you're most of the way towards a RAII solution. Instead of creating the vector of Objs, and then passing it to Thing's constructor, you could initialize Thing with an empty vector and add a member function to add new Objs, by pointer, to the vector.
This way, if an Obj's constructor throws, the compiler will automatically invoke Thing's destructor, properly destroying any Objs that were already allocated.
Thing's constructor becomes a no-op:
explicit Thing() {}
Add a push_back member:
void push_back(Obj *new_obj) { objs_.push_back(new_obj); }
Then the allocation code in your client becomes:
Thing thing;
int nStructs = system_call(*structs);
for(int i = 0; i < nStructs; i++) {
thing.push_back(new FooObj(structs[i].str));
}
thing.push_back(new BarObj("bar1"));
thing.push_back(new BarObj("bar2"));
As another poster suggested, a smart pointer type inside your vector would also work well. Just don't use STL's auto_ptr; it doesn't follow normal copy semantics and is therefore unsuitable for use in STL containers. The shared_ptr provided by Boost and the forthcoming C++0x would be fine.
Answer 3 - Use smart pointers instead of Obj* in your vectors. I'd suggest boost::shared_ptr.
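A minimal sketch of the smart pointer approach, written against C++11's std::shared_ptr (boost::shared_ptr behaves the same way for this purpose). Obj and FooObj here are simplified, hypothetical stand-ins for the classes in the question, and g_live_objs exists only to make the cleanup observable.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

static int g_live_objs = 0;  // hypothetical instrumentation: objects currently alive

struct Obj {
    Obj() { ++g_live_objs; }
    virtual ~Obj() { --g_live_objs; }  // virtual: deleted through a base pointer
    virtual void operator()() = 0;
};

struct FooObj : Obj {
    explicit FooObj(bool fail) {
        if (fail) throw std::runtime_error("FooObj construction failed");
    }
    void operator()() override {}
};

// Builds n objects; construction of the last one throws when fail_last is true.
// If that happens, stack unwinding destroys `objs`, and every shared_ptr in it
// deletes the object it owns, so nothing leaks.
std::vector<std::shared_ptr<Obj>> build(std::size_t n, bool fail_last) {
    std::vector<std::shared_ptr<Obj>> objs;
    for (std::size_t i = 0; i < n; ++i) {
        objs.push_back(std::make_shared<FooObj>(fail_last && i + 1 == n));
    }
    return objs;
}
```

The caller only needs a single try/catch around build() to translate the exception for the exception-unaware code above it; no manual cleanup is required in the handler.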
A vector of Obj can be very poor for performance, since the vector may have to reallocate on resize and then has to copy every element. Pointers to objects are better.
I've used Pointainers which will do what you need. Here is the original code.
/*
* pointainer - auto-cleaning container of pointers
*
* Example usage:
* {
* pointainer< std::vector<int*> > v;
* // v can be manipulated like any std::vector<int*>.
*
* v.push_back(new int(42));
* v.push_back(new int(17));
* // v now owns the allocated int-s
*
* v.erase(v.begin());
* // frees the memory allocated for the int 42, and then removes the
* // first element of v.
* }
* // v's destructor is called, and it frees the memory allocated for
* // the int 17.
*
* Notes:
* 1. Assumes all elements are unique (you don't have two elements
* pointing to the same object, otherwise you might delete it twice).
* 2. Not usable with pair associative containers (map and multimap).
* 3. For ANSI-challenged compilers, you may want to #define
* NO_MEMBER_TEMPLATES.
*
* Written 10-Jan-1999 by Yonat Sharon <yonat##ootips.org>
* Last updated 07-Feb-1999
*
* Modified June 9, 2003 by Steve Fossen
* -- to fix g++ compiling problem with base class typenames
*/
#ifndef POINTAINER_H
#define POINTAINER_H
#ifdef NO_MEMBER_TEMPLATES
#include <functional> // for binder2nd
#endif
template <typename Cnt>
class pointainer : public Cnt
{
public:
// sf - change to fix g++ compiletime errors
#ifdef USE_USING_NOT_TYPEDEF
// I get compile errors with this
using typename Cnt::size_type;
using typename Cnt::difference_type;
using typename Cnt::reference;
using typename Cnt::const_reference;
using typename Cnt::value_type;
using typename Cnt::iterator;
using typename Cnt::const_iterator;
using typename Cnt::reverse_iterator;
using typename Cnt::const_reverse_iterator;
#else
// this way works
typedef typename Cnt::size_type size_type;
typedef typename Cnt::difference_type difference_type;
typedef typename Cnt::reference reference;
typedef typename Cnt::const_reference const_reference;
typedef typename Cnt::value_type value_type;
typedef typename Cnt::iterator iterator;
typedef typename Cnt::const_iterator const_iterator;
typedef typename Cnt::reverse_iterator reverse_iterator;
typedef typename Cnt::const_reverse_iterator const_reverse_iterator;
#endif
typedef pointainer< Cnt > its_type;
pointainer() {}
pointainer( const Cnt &c ) : Cnt( c ) {}
its_type &operator=( const Cnt &c ) { Cnt::operator=( c ); return *this; }
~pointainer() { clean_all(); }
void clear() { clean_all(); Cnt::clear(); }
iterator erase( iterator i ) { clean( i ); return Cnt::erase( i ); }
iterator erase( iterator f, iterator l ) { clean( f, l ); return Cnt::erase( f, l ); }
// for associative containers: erase() a value
size_type erase( const value_type& v )
{
iterator i = find( v );
size_type found( i != end() ); // can't have more than 1
if( found )
erase( i );
return found;
}
// for sequence containers: pop_front(), pop_back(), resize() and assign()
void pop_front() { clean( begin() ); Cnt::pop_front(); }
void pop_back() { iterator i( end() ); clean( --i ); Cnt::pop_back(); }
void resize( size_type s, value_type c = value_type() )
{
if( s < size() )
clean( begin() + s, end() );
Cnt::resize( s, c );
}
#ifndef NO_MEMBER_TEMPLATES
template <class InIter> void assign(InIter f, InIter l)
#else
void assign( iterator f, iterator l )
#endif
{
clean_all();
Cnt::assign( f, l );
}
#ifndef NO_MEMBER_TEMPLATES
template <class Size, class T> void assign( Size n, const T& t = T() )
#else
void assign( size_t n, const value_type& t = value_type() )
#endif
{
clean_all();
Cnt::assign( n, t );
}
// for std::list: remove() and remove_if()
void remove( const value_type& v )
{
clean( std::find( begin(), end(), v ));
Cnt::remove( v );
}
#ifndef NO_MEMBER_TEMPLATES
template <class Pred>
#else
typedef std::binder2nd< std::not_equal_to< value_type > > Pred;
#endif
void remove_if(Pred pr)
{
for( iterator i = begin(); i != end(); ++i )
if( pr( *i ))
clean( i );
Cnt::remove_if( pr );
}
private:
void clean( iterator i ) { delete *i; *i = 0; } // sf add *i = NULL so double deletes don't fail
void clean( iterator f, iterator l ) { while( f != l ) clean( f++ ); }
void clean_all() { clean( begin(), end() ); }
// we can't have two pointainers own the same objects:
pointainer( const its_type& ) {}
its_type& operator=( const its_type& ) { return *this; } // sf - return something to remove no return value warning..
};
#endif // POINTAINER_H
Instead of a vector of pointers to Obj, you can always use a vector of Obj. In that case you need to make sure you can copy Obj safely (dangerous when it contains pointers). As your Obj contains only a fixed-size char array it should be safe, though.
Without changing the type stored in objs to a type that can be copied and manages the Obj* itself, you need to add a try/catch block to clean up objs if an exception is thrown. The easiest way would be to do this:
vector<Obj *> objs;
try {...}
catch (...)
{
// delete all objs here
throw;
}
You'll also want to clean up objs when no exception is thrown, though.
Hmmm. I really like Commodore Jaeger's idea; it neatly clears up some of the funny smells this code was giving me. I'm reluctant to bring in the Boost libraries at this point; this is a somewhat conservative codebase, and I'd rather bring it into the 21st century squirming and complaining than kicking and screaming.