Optimize destructor size away - C++

I'm building code for an embedded system and I'm trying to save as much binary space as possible.
The code parses a protocol (MQTT, for what it's worth) where there are numerous packet types; they are all different, but share some common parts.
Currently, to simplify writing the code, I'm using this pattern:
template <PacketType type>
struct ControlPacket
{
FixedHeader<type> type;
VariableHeader<type> header;
Properties<type> props;
... and so on...
};
// Specialize for each type
template <>
struct FixedHeader<CONNECT>
{
uint8_t typeAndFlags;
PacketType getType() const { return static_cast<PacketType>(typeAndFlags >> 4); }
uint8_t getFlags() const { return 0; }
bool parseType(const uint8_t * buffer, int len)
{
if (len < 1) return false;
typeAndFlags = buffer[0];
return true;
}
...
};
template <>
struct FixedHeader<PUBLISH>
{
uint8_t typeAndFlags;
PacketType getType() const { return static_cast<PacketType>(typeAndFlags >> 4); }
uint8_t getFlags() const { return typeAndFlags & 0xF; }
bool parseType(const uint8_t * buffer, int len)
{
if (len < 1) return false;
typeAndFlags = buffer[0];
if (typeAndFlags & 0x1) return false; // Example of per packet specific check to perform
return true;
}
...
};
... For all packet types ...
This is working, and I'm now trying to reduce the binary impact of all those template specializations (otherwise the code is almost duplicated 16 times).
So I've come up with this design:
// Store the most common implementation in a base class
struct FixedHeaderBase
{
uint8_t typeAndFlags;
virtual PacketType getType() const { return static_cast<PacketType>(typeAndFlags >> 4); }
virtual uint8_t getFlags() const { return 0; } // Most common code here
virtual bool parseType(const uint8_t * buffer, int len)
{
if (len < 1) return false;
typeAndFlags = buffer[0];
return true;
}
virtual ~FixedHeaderBase() {}
};
// So that most classes end up empty
template <>
struct FixedHeader<CONNECT> final : public FixedHeaderBase
{
};
// And specialize only the specific classes
template <>
struct FixedHeader<PUBLISH> final : public FixedHeaderBase
{
uint8_t getFlags() const override { return typeAndFlags & 0xF; }
bool parseType(const uint8_t * buffer, int len) override
{
if (!FixedHeaderBase::parseType(buffer, len)) return false;
if (typeAndFlags & 0x1) return false; // Example of per packet specific check to perform
return true;
}
};
// Most of the code is shared here
struct ControlPacketBase
{
FixedHeaderBase & type;
...etc ...
virtual bool parsePacket(const uint8_t * packet, int packetLen)
{
if (!type.parseType(packet, packetLen)) return false;
...etc ...
}
ControlPacketBase(FixedHeaderBase & type, etc...) : type(type) {}
virtual ~ControlPacketBase() {}
};
// This is only there to tell which specific version to use for the generic code
template <PacketType type>
struct ControlPacket final : public ControlPacketBase
{
FixedHeader<type> type;
VariableHeader<type> header;
Properties<type> props;
... and so on...
ControlPacket() : ControlPacketBase(type, header, props, etc...) {}
};
This is working quite well and lets me shave off a lot of binary code space. By the way, I'm using final here so the compiler can devirtualize, and I'm compiling without RTTI (obviously also with -Os, and with each function in its own section so unused ones are garbage-collected at link time).
However, when I inspect the symbol table sizes, I'm finding a lot of duplication among the destructors: every template instance gets its own destructor, and they are clearly identical (same binary size) or empty.
Typically, I understand that ControlPacket<CONNECT> needs to call ~FixedHeader<CONNECT>() and that ControlPacket<PUBLISH> needs to call ~FixedHeader<PUBLISH>() upon destruction.
Yet, since all the destructors are virtual, is there a way for the ControlPacket specializations to avoid emitting their own destructors and instead have ControlPacketBase destruct the members virtually, so that I don't end up with 16 useless destructors but only one?

It's worth pointing out that this is related to a linker optimization called "identical COMDAT folding", or ICF: identical functions (for example, empty ones) are all merged into one.
Not every linker supports this, and not every linker is willing to do it (because the language says that different functions must have different addresses), but your toolchain may have it. It would be fast and easy to try.
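For reference, the relevant switches look roughly like this (my examples, exact spelling depends on the toolchain; GNU ld itself does not implement ICF, but gold and lld do, and MSVC's linker has an equivalent):
g++     -Os -ffunction-sections -fdata-sections -fuse-ld=gold -Wl,--gc-sections -Wl,--icf=all main.cpp
clang++ -Os -ffunction-sections -fdata-sections -fuse-ld=lld  -Wl,--gc-sections -Wl,--icf=all main.cpp
link    /OPT:REF /OPT:ICF main.obj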
I'm going to assume your problem is reproduced with this toy example:
#include <iostream>
#include <memory>
#include <variant>
extern unsigned nondet();
struct Base {
virtual const char* what() const = 0;
virtual ~Base() = default;
};
struct A final : Base {
const char* what() const override {
return "a";
}
};
struct B final : Base {
const char* what() const override {
return "b";
}
};
std::unique_ptr<Base> parse(unsigned v) {
if (v == 0) {
return std::make_unique<A>();
} else if (v == 1) {
return std::make_unique<B>();
} else {
__builtin_unreachable();
}
}
const char* what(const Base& b) {
return b.what(); // virtual dispatch
}
const char* what(const std::unique_ptr<Base>& b) {
return what(*b);
}
int main() {
unsigned v = nondet();
auto packet = parse(v);
std::cout << what(packet) << std::endl;
}
The disassembly shows that both A::~A and B::~B have (multiple) listings, even though they are empty and identical. This is with = default and final.
If one removes virtual, then these vacuous definitions go away and we achieve the goal - but now when the unique_ptr deletes the object, we invoke undefined behavior.
We have three choices for leaving the destructor non-virtual while maintaining well-defined behavior, two of which are useful and one is not.
Non-useful: the first option is to use shared_ptr. This works because shared_ptr actually type-erases its deleter function (see this question), so at no point does it delete via the base. In other words, when you make a shared_ptr<T>(u) for some u deriving from T, the shared_ptr stores a function pointer to U::~U directly.
However, this type erasure simply reintroduces the problem and generates even more empty virtual destructors. See the modified toy example to compare. I'm mentioning this for completeness, in case you happen to already be putting these into shared_ptr's on the side.
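To make that concrete, here is a minimal sketch (the names are mine, not part of the toy example): the deleter recorded at construction time is the derived one, so destruction stays well-defined even though the base destructor is non-virtual.
#include <cstdio>
#include <memory>

struct Base2 { ~Base2() { std::puts("~Base2"); } };   // note: non-virtual destructor
struct A2 : Base2 { ~A2() { std::puts("~A2"); } };

int main() {
    // make_shared<A2> records how to destroy an A2 in the control block,
    // so destroying through shared_ptr<Base2> still runs ~A2 first.
    std::shared_ptr<Base2> p = std::make_shared<A2>();
}   // prints "~A2" then "~Base2"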
Useful: the alternative is to avoid virtual dispatch for lifetime management, and use a variant. It's not really proper to make such a blanket statement, but generally you can achieve smaller code and even some speedup with tag dispatch, as you avoid specifying vtables and dynamic allocation.
This requires the largest change in your code, because the object representing your packet must be interacted with in a different way (it is no longer an is-a relationship):
#include <iostream>
#include <boost/variant.hpp>
extern unsigned nondet();
struct Base {
~Base() = default;
};
struct A final : Base {
const char* what() const {
return "a";
}
};
struct B final : Base {
const char* what() const {
return "b";
}
};
typedef boost::variant<A, B> packet_t;
packet_t parse(unsigned v) {
if (v == 0) {
return A();
} else if (v == 1) {
return B();
} else {
__builtin_unreachable();
}
}
const char* what(const packet_t& p) {
return boost::apply_visitor([](const auto& v){
return v.what();
}, p);
}
int main() {
unsigned v = nondet();
auto packet = parse(v);
std::cout << what(packet) << std::endl;
}
I used Boost.Variant because it produces the smallest code. Annoyingly, std::variant insists on producing some minor but present vtables for implementing itself - I feel like this defeats the purpose a bit, though even with the variant vtables the code remains much smaller overall.
I want to point out a nice result of modern optimizing compilers. Note the resulting implementation of what:
what(boost::variant<A, B> const&):
mov eax, DWORD PTR [rdi]
cdq
cmp eax, edx
mov edx, OFFSET FLAT:.LC1
mov eax, OFFSET FLAT:.LC0
cmove rax, rdx
ret
The compiler understands the closed set of options in the variant, the lambda duck-typing proved that each option really has a ...::what member function, and so it's really just picking out the string literal to return based on the variant value.
The trade-off with a variant is that you must have a closed set of options, and you no longer have a virtual interface enforcing certain functions exist. In return you get smaller code and the compiler can often see through the dispatch "wall".
However, if we define these simple visitor helper functions, one per "expected" member function, they act as an interface checker - plus you've already got your helper class templates to keep things in line.
Finally, as an extension of the above: you are always free to maintain some virtual functions within the base class. This can offer the best of both worlds, if the cost of vtables is acceptable to you:
#include <iostream>
#include <boost/variant.hpp>
extern unsigned nondet();
struct Base {
virtual const char* what() const = 0;
~Base() = default;
};
struct A final : Base {
const char* what() const override {
return "a";
}
};
struct B final : Base {
const char* what() const override {
return "b";
}
};
typedef boost::variant<A, B> packet_t;
packet_t parse(unsigned v) {
if (v == 0) {
return A();
} else if (v == 1) {
return B();
} else {
__builtin_unreachable();
}
}
const Base& to_base(const packet_t& p) {
return *boost::apply_visitor([](const auto& v){
return static_cast<const Base*>(&v);
}, p);
}
const char* what(const Base& b) {
return b.what(); // virtual dispatch
}
const char* what(const packet_t& p) {
return what(to_base(p));
}
int main() {
unsigned v = nondet();
auto packet = parse(v);
std::cout << what(packet) << std::endl;
}
This produces fairly compact code.
What we have here is a polymorphic base class (but no virtual destructor, as it is not needed), and a to_base function that takes a variant and returns the common base interface for you. (And in a hierarchy such as yours, you could have several of these, one per kind of base.)
From the common base you are free to perform virtual dispatch. This is sometimes easier to manage, and faster depending on workload, and the additional freedom only costs some vtables. In this example, I've implemented what as first converting to the base class and then performing virtual dispatch to the what member function.
Again, I want to point out the definition of a visit, this time in to_base:
to_base(boost::variant<A, B> const&):
lea rax, [rdi+8]
ret
The compiler understands that all the classes in the closed set inherit from Base, and so it doesn't have to examine the variant's type tag at all.
In the above I used Boost.Variant. Not everyone can or wants to use Boost, but the principles of the answer still apply: store an object and track what type of object is stored in an integer. When it's time to do something, peek at the integer and jump to the right place in code.
Implementing a variant is a whole different question. :)
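Purely to illustrate that principle, here is a minimal hand-rolled sketch (my own, far from a full variant; it assumes trivially copyable, trivially destructible alternatives):
#include <cstdint>

struct A { const char* what() const { return "a"; } };
struct B { const char* what() const { return "b"; } };

// Minimal hand-rolled "variant" for a closed set of trivial types:
// raw storage in a union plus an integer tag saying which member is live.
class packet_t {
    union { A a_; B b_; };
    std::uint8_t tag;              // 0 = A, 1 = B
public:
    packet_t(A a) : a_(a), tag(0) {}
    packet_t(B b) : b_(b), tag(1) {}
    const char* what() const {
        // Peek at the tag and jump to the right code path.
        return tag == 0 ? a_.what() : b_.what();
    }
    // A and B are trivially destructible, so the implicit destructor is fine here.
};

const char* describe(const packet_t& p) { return p.what(); }
A real implementation also has to handle construction, destruction and assignment of non-trivial alternatives, which is exactly the "whole different question" mentioned above.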

Related

Cast structs with certain common members

Let's say I have 2 structs:
typedef struct
{
uint8_t useThis;
uint8_t u8Byte2;
uint8_t u8Byte3;
uint8_t u8Byte4;
} tstr1;
and
typedef struct
{
uint8_t u8Byte1;
uint8_t u8Byte2;
uint8_t useThis;
} tstr2;
I will only need the useThis member inside a function, but in some cases I will need to cast to one struct or the other:
void someFunction()
{
someStuff();
SOMETHING MyInstance;
if(someVariable)
{
MyInstance = reinterpret_cast<tstr1*>(INFO_FROM_HARDWARE); //This line of course doesn't work
}
else
{
MyInstance = reinterpret_cast<tstr2*>(INFO_FROM_HARDWARE); //This line of course doesn't work
}
MyInstance->useThis; // Calling this member with no problem
moreStuff();
}
So I want to use useThis no matter what cast was done. How can this be done?
I want to avoid making someFunction() a template (just to avoid this kind of thing)
Note that questions like this have a kind of similar problem, but there the struct members are in the same order
EDIT:
In real life these structs are way larger and have several "same named" members. Directly casting a uint8_t as reinterpret_cast<tstr1*>(INFO_FROM_HARDWARE)->useThis would be tedious and require several reinterpret_casts (although it's a working solution for my question before this EDIT). This is why I insist on MyInstance being "complete".
This is what templates are for:
#include <cstdint>
#include <cstring>

template<class tstr>
std::uint8_t do_something(std::uint8_t* INFO_FROM_HARDWARE)
{
    tstr MyInstance;
    std::memcpy(&MyInstance, INFO_FROM_HARDWARE, sizeof MyInstance);
    // access MyInstance within the template; the member is there with no problem
    return MyInstance.useThis;
}
// usage
if(someVariable)
{
do_something<tstr1>(INFO_FROM_HARDWARE);
}
else
{
do_something<tstr2>(INFO_FROM_HARDWARE);
}
I want to avoid making someFunction() a template (just to avoid this kind of thing)
Why can’t I separate the definition of my templates class from its declaration and put it inside a .cpp file?
The linked problem isn't an issue for your use case because the potential set of template arguments is a finite set. The very next FAQ entry explains how: Use explicit instantiations of the template.
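With the do_something template above, the explicit instantiations would go in the .cpp file that defines the template, roughly like this:
// explicit instantiations - the only two argument types that can ever occur
template std::uint8_t do_something<tstr1>(std::uint8_t*);
template std::uint8_t do_something<tstr2>(std::uint8_t*);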
As suggested by AndyG, how about std::variant? (There's no mention of which C++ standard you are using, so maybe a C++17 solution is OK - it's also worth using <insert other variant implementation> if C++17 isn't available.)
Here's an example.
std::variant knows what type is stored in it, and you can use std::visit whenever you wish to use any of the members in there (snippet here for clarity):
// stolen from #eerrorika (sorry for that :( )
struct hardware {
uint8_t a = 'A';
uint8_t b = 'B';
uint8_t c = 'C';
uint8_t d = 'D';
};
struct tstr1 {
uint8_t useThis;
uint8_t u8Byte2;
uint8_t u8Byte3;
uint8_t u8Byte4;
};
struct tstr2 {
uint8_t u8Byte1;
uint8_t u8Byte2;
uint8_t useThis;
};
// stuff
std::variant<tstr1, tstr2> msg;
if(true) // i.e. someVariable
{
msg = *reinterpret_cast<tstr1*>(&hw);
}
else
{
msg = *reinterpret_cast<tstr2*>(&hw);
}
std::visit(overloaded {
[](tstr1 const& arg) { std::cout << arg.useThis << ' '; },
[](tstr2 const& arg) { std::cout << arg.useThis << ' '; }
}, msg);
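The overloaded helper used in the snippet isn't shown above; it's the usual C++17 idiom:
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;   // deduction guide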
EDIT: you can also do a variant of pointers
Using virtual dispatch is usually not what you want when mapping to hardware but it is an alternative.
Example:
// define a common interface
struct overlay_base {
virtual ~overlay_base() = default;
virtual uint8_t& useThis() = 0;
virtual uint8_t& useThat() = 0;
};
template<class T>
class wrapper : public overlay_base {
public:
template<class HW>
wrapper(HW* hw) : instance_ptr(reinterpret_cast<T*>(hw)) {}
uint8_t& useThis() { return instance_ptr->useThis; }
uint8_t& useThat() { return instance_ptr->useThat; }
private:
T* instance_ptr;
};
With that, you can declare a base class pointer, assign it, and use it after the if statement:
int main(int argc, char**) {
std::unique_ptr<overlay_base> MyInstance;
if(argc % 2) {
MyInstance.reset( new wrapper<tstr1>(INFO_FROM_HARDWARE) );
} else {
MyInstance.reset( new wrapper<tstr2>(INFO_FROM_HARDWARE) );
}
std::cout << MyInstance->useThis() << '\n';
std::cout << MyInstance->useThat() << '\n';
}
Demo
Explanation regarding my comment: "It works, but unless the compiler is really clever and can optimize away the virtual dispatch in your inner loops, it's going to be slower than if you actually take the time to type cast":
Think of virtual dispatch as having a lookup table (vtable) used at runtime (which is often what actually happens). When calling a virtual function, the program has to use that lookup table to find the address of the actual member function to call. When it's impossible to optimize away the lookup (as I made sure in my example above by using a value only available at runtime) it does take a few CPU cycles extra compared to what you'd get by doing a static cast.
A simple reference to the struct member might be what you need:
uint8_t &useThis = someVariable
    ? reinterpret_cast<tstr1*>(INFO_FROM_HARDWARE)->useThis
    : reinterpret_cast<tstr2*>(INFO_FROM_HARDWARE)->useThis;

Is it possible to enforce, at compile time, that two derived classes would always return different values for an overriding function?

Is it possible to enforce, at compile time, that the following is acceptable:
class B {
public:
virtual constexpr const char* getKeyStr() const = 0;
};
class D1 : public B {
public:
constexpr const char* getKeyStr() const override { return "D1"; }
};
class D2 : public B {
public:
constexpr const char* getKeyStr() const override { return "D2"; }
};
... But the following is not? We don't want D1 and D2 to return the same key string:
class B {
public:
virtual constexpr const char* getKeyStr() const = 0;
};
class D1 : public B {
public:
constexpr const char* getKeyStr() const override { return "D1"; }
};
class D2 : public B {
public:
constexpr const char* getKeyStr() const override { return "D1"; } // can we error out here at compile time?
};
Clarifications:
I am only showing two derived classes in this example, but what I am trying to achieve is to post this constraint on any number of derived classes.
The underlying problem to solve: I am thinking about a serializing/deserializing application, where each object with the same base class will be able to generate a textual/string representation of itself to be written to a file and, when given that string back (let's call it the content-string), will be able to reconstruct the corresponding data.
During deserialization, the application code should be able to tell from a key part of the content-string, let's call it the key-string, what derived object should be reconstructed. Therefore, the key-string needs to be unique for each derived class. I know type_info::name could be unique, but it is not customizable (or compiler independent?).
Regarding the getKeyStr() function (corresponding to the key-string mentioned above), it should always return the same string for all objects of the same derived class.
Yes, you can check at compile time whether the "strings" returned by getKeyStr of D1 and D2 are different.
First provide a function that compares 2 const char * at compile time:
constexpr bool different(const char *x, const char *y)
{
    while (*x != '\0')
        if (*x++ != *y++) return true;
    return *y != '\0';
}
and then compare the returned values:
// this will trigger, if returned strings are the same.
static_assert( different(D1{}.getKeyStr(), D2{}.getKeyStr()) );
Edit: As #Jarod42 points out, string_view comparison is constexpr, and so the comparison function can be written much more simply:
constexpr bool different(const char * x, const char * y)
{
return std::string_view{x} != std::string_view{y};
}
Here's a demo.
If you can have a list of all Derived classes, you might do something like:
#include <algorithm>
#include <iterator>
#include <string_view>

template <typename... Ts>
constexpr bool have_unique_keys()
{
    std::string_view keys[] = {Ts{}.getKeyStr()...}; // requires a constexpr default constructor
                                                     // (a static method might be more appropriate)
    std::sort(std::begin(keys), std::end(keys));     // constexpr in C++20
    auto it = std::adjacent_find(std::begin(keys), std::end(keys));
    return it == std::end(keys);                     // no adjacent duplicates means all keys are unique
}
static_assert(have_unique_keys<D1, D2, D3>());

Combining typesafe code with runtime decisions

I am in the process of rewriting some existing code where, previously, all answer information was stored in a string array in memory and the data was transformed in various places based on the datatype. Below is a quick mock-up of the setup I am aiming for. Essentially you have some questions, and the structure of the answers stored in the database depends on the datatype. Generally I avoid dealing with void* and casting it to an appropriate type, but I couldn't find a better solution that would allow me to run generic code (by means of lambdas), or be specific if the datatype is known. Templated classes won't help in this case, as all the answers need to be stored in the same vector (since some arithmetic is applied to all answers based on predefined rules).
Any advice is appreciated.
#include <vector>
#include <memory>
struct AddressData
{
wchar_t Line1[50];
wchar_t Line2[50];
long CountrySeqNo;
AddressData()
{
memset(this, 0, sizeof(*this));
};
};
struct GenericData
{
wchar_t value[200];
GenericData()
{
memset(this, 0, sizeof(*this));
};
};
enum class DataType
: short
{
GENERIC,
ADDRESS
};
class AnswerBase
{
protected:
const void* const data;
const DataType dataType;
protected:
AnswerBase(const DataType datatype, const void* const _data)
    : dataType(datatype), data(_data)
{
if (data == nullptr)
throw std::exception("Data may not be initialized as NULL");
};
public:
/*
Some generic methods here that would apply logic by means of lambdas etc. - these would be overridden in the derived classes
*/
template<typename T> const T& GetData() { static_assert(false, "The given type is not supported"); };
template<>
const GenericData& GetData()
{
if (DataType::GENERIC != dataType)
throw std::exception("The requested type does not match the value that initialised data");
return *static_cast<const GenericData* const>(data);
};
template<>
const AddressData& GetData()
{
if (DataType::ADDRESS != dataType)
throw std::exception("The requested type does not match the value that initialised data");
return *static_cast<const AddressData* const>(data);
};
};
class AddressAnswer
: public AnswerBase
{
public:
AddressAnswer()
: AnswerBase(DataType::ADDRESS, &answer)
{
};
protected:
AddressData answer;
};
class GenericAnswer
: public AnswerBase
{
public:
GenericAnswer()
: AnswerBase(DataType::GENERIC, &answer)
{
};
protected:
GenericData answer;
};
int main()
{
std::vector<std::shared_ptr<AnswerBase>> answers;
answers.push_back(std::make_shared<GenericAnswer>());
answers.push_back(std::make_shared<AddressAnswer>());
// In some parts of code - interact with generic methods without needing to check the underlying data type
// ....
// ....
// In parts of code where we know we are dealing with a given type - like saving to a DB
auto val1 = answers[0]->GetData<GenericData>().value;
auto val2 = answers[1]->GetData<AddressData>().Line1;
// this will give a runtime failure
//auto val3 = answers[0]->GetData<AddressData>().Line1;
return 0;
}
variant is the clean way to do this. Store it in the parent.
Alternatively, provide a variant<A,B> GetData() in the parent. Now visiting is encapsulated in the variant returned. The parent stores the data.
Alternatively, provide a virtual variant<A,B> GetData() = 0. The child type returns the data, either A or B, in the variant in question.
Alternatively, write virtual A* GetA() = 0; virtual B* GetB() = 0;. Then maybe write a template method called GetData<T> such that GetData<A>() calls GetA, etc. (a sketch of this appears below).
Alternatively, write virtual A* Get(tag_t<A>) = 0; virtual B* Get(tag_t<B>)=0;, where
template<class T>
struct tag_t {
using type=T;
constexpr tag_t(){}
};
template<class T>
constexpr tag_t<T> tag{};
is a tag used for dispatching. Now you can call the right virtual interface by doing a Get(tag<AddressData>).
In these virtual cases, the data is stored in the derived type.
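A rough sketch of the GetA/GetB plus GetData<T> option, using the question's data types (the class name AnswerBase2 is mine, just to avoid colliding with the question's AnswerBase):
struct AnswerBase2 {
    virtual const AddressData* GetAddress() const = 0;
    virtual const GenericData* GetGeneric() const = 0;
    virtual ~AnswerBase2() = default;

    template<class T> const T* GetData() const;   // specialized below
};

// Each specialization simply forwards to the matching virtual getter.
template<> inline const AddressData* AnswerBase2::GetData<AddressData>() const { return GetAddress(); }
template<> inline const GenericData* AnswerBase2::GetData<GenericData>() const { return GetGeneric(); }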

How to force return value optimization in MSVC

I have a function in a class that I want the compiler to use NRVO on...all the time...even in debug mode. Is there a pragma for this?
Here is my class that works great in "release" mode:
template <int _cbStack> class CBuffer {
public:
CBuffer(int cb) : m_p(0) {
m_p = (cb > _cbStack) ? (char*)malloc(cb) : m_pBuf;
}
template <typename T> operator T () const {
return static_cast<T>(m_p);
}
~CBuffer() {
if (m_p && m_p != m_pBuf)
free(m_p);
}
private:
char *m_p, m_pBuf[_cbStack];
};
The class is used to make a buffer on the stack unless more than _cbStack bytes are required. Then when it destructs, it frees memory if it allocated any. It's handy when interfacing with C functions that require a string buffer and you are not sure of the maximum size.
Anyway, I was trying to write a function that could return CBuffer, like in this test:
#include "stdafx.h"
#include <malloc.h>
#include <string.h>
template <int _cbStack> CBuffer<_cbStack> foo()
{
// return a Buf populated with something...
unsigned long cch = 500;
CBuffer<_cbStack> Buf(cch + 1);
memset(Buf, 'a', cch);
((char*)Buf)[cch] = 0;
return Buf;
}
int _tmain(int argc, _TCHAR* argv[])
{
auto Buf = foo<256>();
return 0;
}
I was counting on NRVO to make foo() fast. In release mode, it works great. In debug mode, it obviously fails, because there is no copy constructor in my class. I don't want a copy constructor, since CBuffer will be used by developers who like to copy everything 50 times. (Rant: these guys were using a dynamic array class to create a buffer of 20 chars to pass to WideCharToMultiByte(), because they seem to have forgotten that you can just allocate an array of chars on the stack. I don't know if they even know what the stack is...)
I don't really want to code up the copy constructor just so the code works in debug mode! It gets huge and complicated:
template <int _cbStack>
class CBuffer {
public:
CBuffer(int cb) : m_p(0) { Allocate(cb); }
CBuffer(CBuffer<_cbStack> &r) {
int cb = (r.m_p == r.m_pBuf) ? _cbStack : ((int*)r.m_p)[-1];
Allocate(cb);
memcpy(m_p, r.m_p, cb);
}
CBuffer(CBuffer<_cbStack> &&r) {
if (r.m_p == r.m_pBuf) {
m_p = m_pBuf;
memcpy(m_p, r.m_p, _cbStack);
} else {
m_p = r.m_p;
r.m_p = NULL;
}
}
template <typename T> operator T () const {
return static_cast<T>(m_p);
}
~CBuffer() {
if (m_p && m_p != m_pBuf)
free((int*)m_p - 1);
}
protected:
void Allocate(int cb) {
if (cb > _cbStack) {
m_p = (char*)malloc(cb + sizeof(int));
*(int*)m_p = cb;
m_p += sizeof(int);
} else {
m_p = m_pBuf;
}
}
char *m_p, m_pBuf[_cbStack];
};
This pragma does not work:
#pragma optimize("gf", on)
Any ideas?
It is not hard to make your code both standards-conforming and working.
First, wrap arrays of T with optional extra padding. Now you know the layout.
For ownership, use a unique_ptr instead of a raw pointer. If it is non-null, operator T* returns it, otherwise the stack buffer. Now your default move ctor works, and NRVO applies; if NRVO fails, the move is used instead.
If you want to support non-POD types, a bit of work will let you support ctors and dtors and move the array elements, copying the padding bit for bit.
The result will be a class that does not behave surprisingly and will not create bugs the first time someone tries to copy or move it - well, not the first time, that would be easy. The code as written will blow up in different ways at different times!
Obey the rule of three.
Here is an explicit example (now that I'm off my phone):
#include <cstddef>
#include <memory>

template <class T, size_t bufSize = sizeof(T)>
struct CBuffer {
    typedef T value_type;
    CBuffer() {}
    explicit CBuffer(size_t count, size_t extra = 0) {
        resize(count, extra);
    }
    void resize(size_t count, size_t extra = 0) {
        size_t amount = sizeof(value_type) * count + extra;
        if (amount > bufSize) {
            m_heapBuffer.reset( new char[amount] );
        } else {
            m_heapBuffer.reset();
        }
    }
    explicit operator value_type const* () const {
        return get();
    }
    explicit operator value_type* () {
        return get();
    }
    T* get() {
        return reinterpret_cast<value_type*>(getPtr());
    }
    T const* get() const {
        return reinterpret_cast<value_type const*>(getPtr());
    }
private:
    std::unique_ptr< char[] > m_heapBuffer;
    char m_Buffer[bufSize];
    char const* getPtr() const {
        if (m_heapBuffer)
            return m_heapBuffer.get();
        return &m_Buffer[0];
    }
    char* getPtr() {
        if (m_heapBuffer)
            return m_heapBuffer.get();
        return &m_Buffer[0];
    }
};
The above CBuffer supports move construction and move assignment, but not copy construction or copy assignment. This means you can return a local instance of these from a function. RVO may occur, but if it doesn't the above code is still safe and legal (assuming T is POD).
Before putting it into production myself, I would add some T must be POD asserts to the above, or handle non-POD T.
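For instance, such a guard might look like this (my sketch; std::is_trivially_copyable stands in for "POD enough" given the raw char storage):
#include <cstddef>
#include <type_traits>

template <class T, size_t bufSize = sizeof(T)>
struct CBufferChecked {          // same body as CBuffer above, plus the guard
    static_assert(std::is_trivially_copyable<T>::value,
                  "CBuffer's raw char storage assumes a trivially copyable T");
    // ... rest identical to CBuffer ...
};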
As an example of use:
#include <cstring>
#include <iostream>
size_t fill_buff(size_t len, char* buff) {
char const* src = "This is a string";
size_t needed = strlen(src)+1;
if (len < needed)
return needed;
strcpy( buff, src );
return needed;
}
void test1() {
size_t amt = fill_buff(0,0);
CBuffer<char, 100> strBuf(amt);
fill_buff( amt, strBuf.get() );
std::cout << strBuf.get() << "\n";
}
And, for the (hopefully) NRVO'd case:
template<size_t n>
CBuffer<char, n> test2() {
CBuffer<char, n> strBuf;
size_t amt = fill_buff(0,0);
strBuf.resize(amt);
fill_buff( amt, strBuf.get() );
return strBuf;
}
which, if NRVO occurs (as it should), won't need a move -- and if NRVO doesn't occur, the implicit move that occurs is logically equivalent to not doing the move.
The point is that NRVO isn't relied upon to have well defined behavior. However, NRVO is almost certainly going to occur, and when it does occur it does something logically equivalent to doing the move-constructor option.
I didn't have to write such a move-constructor, because unique_ptr is move-constructable, as are arrays inside structs. Also note that copy-construction is blocked, because unique_ptr cannot be copy-constructed: this aligns with your needs.
In debug, it is quite possibly true that you'll end up doing a move-construct. But there shouldn't be any harm in that.
I don't think there is a publicly available fine-grained compiler option that only triggers NRVO.
However, you can still manipulate compiler optimization flags per each source file via either changing options in Project settings, command line, and #pragma.
http://msdn.microsoft.com/en-us/library/chh3fb0k(v=vs.110).aspx
Try giving /O1 or /O2 to the file you want.
And the debug mode in Visual C++ is nothing but a configuration with no optimizations that generates debugging information (a PDB, program database file).
If you are using Visual C++ 2010 or later, you can use move semantics to achieve an equivalent result. See How to: Write a Move Constructor.

OneOfAType container -- storing one each of a given type in a container -- am I off base here?

I've got an interesting problem that's cropped up in a sort of pass based compiler of mine. Each pass knows nothing of other passes, and a common object is passed down the chain as it goes, following the chain of command pattern.
The object that is being passed along is a reference to a file.
Now, during one of the stages, one might wish to associate a large chunk of data with the file, such as its SHA512 hash, which takes a reasonable amount of time to compute. However, since that chunk of data is only used in that specific case, I don't want all file references to have to reserve space for the SHA512. I also don't want other passes to have to recalculate the SHA512 hash over and over again. For example, someone might only accept files which match a given list of SHA512s, but not want that value printed when the file reference gets to the end of the chain, or perhaps they want both, or... etc.
What I need is some sort of container which contain only one of a given type. If the container does not contain that type, it needs to create an instance of that type and store it somehow. It's basically a dictionary with the type being the thing used to look things up.
Here's what I've got so far, the relevant bit being the FileData::Get<T> method:
class FileData;
// Cache entry interface
struct FileDataCacheEntry
{
virtual void Initialize(FileData&)
{
}
virtual ~FileDataCacheEntry()
{
}
};
// Cache itself
class FileData
{
struct Entry
{
std::size_t identifier;
FileDataCacheEntry * data;
Entry(FileDataCacheEntry *dataToStore, std::size_t id)
: data(dataToStore), identifier(id)
{
}
std::size_t GetIdentifier() const
{
return identifier;
}
void DeleteData()
{
delete data;
}
};
WindowsApi::ReferenceCounter refCount;
std::wstring fileName_;
std::vector<Entry> cache;
public:
FileData(const std::wstring& fileName) : fileName_(fileName)
{
}
~FileData()
{
if (refCount.IsLastObject())
for_each(cache.begin(), cache.end(), std::mem_fun_ref(&Entry::DeleteData));
}
const std::wstring& GetFileName() const
{
return fileName_;
}
//RELEVANT METHOD HERE
template<typename T>
T& Get()
{
std::vector<Entry>::iterator foundItem =
std::find_if(cache.begin(), cache.end(), boost::bind(
std::equal_to<std::size_t>(), boost::bind(&Entry::GetIdentifier, _1), T::TypeId));
if (foundItem == cache.end())
{
std::auto_ptr<T> newCacheEntry(new T);
Entry toInsert(newCacheEntry.get(), T::TypeId);
cache.push_back(toInsert);
newCacheEntry.release();
T& result = *static_cast<T*>(cache.back().data);
result.Initialize(*this);
return result;
}
else
{
return *static_cast<T*>(foundItem->data);
}
}
};
// Example item you'd put in cache
class FileBasicData : public FileDataCacheEntry
{
DWORD dwFileAttributes;
FILETIME ftCreationTime;
FILETIME ftLastAccessTime;
FILETIME ftLastWriteTime;
unsigned __int64 size;
public:
enum
{
TypeId = 42
};
virtual void Initialize(FileData& input)
{
// Get file attributes and friends...
}
DWORD GetAttributes() const;
bool IsArchive() const;
bool IsCompressed() const;
bool IsDevice() const;
// More methods here
};
int main()
{
// Example use
FileData fd(L"somefile.bin"); // placeholder path; FileData requires a file name
FileBasicData& data = fd.Get<FileBasicData>();
// etc
}
For some reason though, this design feels wrong to me, namely because it's doing a whole bunch of things with untyped pointers. Am I severely off base here? Are there preexisting libraries (boost or otherwise) which would make this clearer/easier to understand?
As ergosys said already, std::map is the obvious solution to your problem. But I can see your concerns with RTTI (and the associated bloat). As a matter of fact, an "any" value container does not need RTTI to work. It is sufficient to provide a mapping between a type and a unique identifier. Here is a simple class that provides this mapping:
#include <stdexcept>
#include <boost/shared_ptr.hpp>
class typeinfo
{
private:
typeinfo(const typeinfo&);
void operator = (const typeinfo&);
protected:
typeinfo(){}
public:
bool operator != (const typeinfo &o) const { return this != &o; }
bool operator == (const typeinfo &o) const { return this == &o; }
template<class T>
static const typeinfo & get()
{
static struct _ti : public typeinfo {} _inst;
return _inst;
}
};
typeinfo::get<T>() returns a reference to a simple, stateless singleton which allows comparisons.
This singleton is created only for types T where typeinfo::get< T >() is issued anywhere in the program.
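As a quick sanity check of that identity-based comparison (my own snippet, relying only on the typeinfo class above):
#include <cassert>

int main()
{
    assert(typeinfo::get<int>()   == typeinfo::get<int>());     // same type, same singleton
    assert(typeinfo::get<int>()   != typeinfo::get<double>());  // different types, different singletons
    assert(&typeinfo::get<char>() == &typeinfo::get<char>());   // the reference is stable across calls
}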
Now we are using this to implement a top type we call value. value is a holder for a value_box which actually contains the data:
class value_box
{
public:
// returns the typeinfo of the most derived object
virtual const typeinfo& type() const =0;
virtual ~value_box(){}
};
template<class T>
class value_box_impl : public value_box
{
private:
friend class value;
T m_val;
value_box_impl(const T &t) : m_val(t) {}
virtual const typeinfo& type() const
{
return typeinfo::get< T >();
}
};
// specialization for void.
template<>
class value_box_impl<void> : public value_box
{
private:
friend class value;
virtual const typeinfo& type() const
{
return typeinfo::get< void >();
}
// This is an optimization to avoid heap pressure for the
// allocation of stateless value_box_impl<void> instances:
void* operator new(size_t)
{
static value_box_impl<void> inst;
return &inst;
}
void operator delete(void* d)
{
}
};
Here's the bad_value_cast exception:
class bad_value_cast : public std::runtime_error
{
public:
bad_value_cast(const char *w="") : std::runtime_error(w) {}
};
And here's value:
class value
{
private:
boost::shared_ptr<value_box> m_value_box;
public:
// a default value contains 'void'
value() : m_value_box( new value_box_impl<void>() ) {}
// embedd an object of type T.
template<class T>
value(const T &t) : m_value_box( new value_box_impl<T>(t) ) {}
// get the typeinfo of the embedded object
const typeinfo & type() const { return m_value_box->type(); }
// convenience type to simplify overloading on return values
template<class T> struct arg{};
template<class T>
T convert(arg<T>) const
{
if (type() != typeinfo::get<T>())
throw bad_value_cast();
// this is safe now
value_box_impl<T> *impl=
static_cast<value_box_impl<T>*>(m_value_box.get());
return impl->m_val;
}
void convert(arg<void>) const
{
if (type() != typeinfo::get<void>())
throw bad_value_cast();
}
};
The convenient casting syntax:
template<class T>
T value_cast(const value &v)
{
return v.convert(value::arg<T>());
}
And that's it. Here is how it looks like:
#include <string>
#include <map>
#include <iostream>
int main()
{
std::map<std::string,value> v;
v["zero"]=0;
v["pi"]=3.14159;
v["password"]=std::string("swordfish");
std::cout << value_cast<int>(v["zero"]) << std::endl;
std::cout << value_cast<double>(v["pi"]) << std::endl;
std::cout << value_cast<std::string>(v["password"]) << std::endl;
}
The nice thing about having your own implementation of any is that you can very easily tailor it to the features you actually need, which is quite tedious with boost::any. For example, there are few requirements on the types that value can store: they need to be copy-constructible and have a public destructor. What if all the types you use have an operator<<(ostream&, T) and you want a way to print your dictionaries? Just add a to_stream method to value_box and overload operator<< for value, and you can write:
std::cout << v["zero"] << std::endl;
std::cout << v["pi"] << std::endl;
std::cout << v["password"] << std::endl;
Here's a pastebin with the above, should compile out of the box with g++/boost: http://pastebin.com/v0nJwVLW
EDIT: Added an optimization to avoid the allocation of box_impl< void > from the heap:
http://pastebin.com/pqA5JXhA
You can create a hash map or std::map from string to boost::any. The string key can be extracted from any::type().name().
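For illustration, a sketch of that suggestion (class and member names are mine; it assumes RTTI is enabled, since typeid(T).name() supplies the string key, and that T is default-constructible and copyable):
#include <map>
#include <string>
#include <typeinfo>
#include <boost/any.hpp>

class TypeKeyedCache {
    std::map<std::string, boost::any> entries;
public:
    template<class T>
    T& get() {
        boost::any& slot = entries[typeid(T).name()];
        if (slot.empty())
            slot = T();                      // create the single instance of T lazily
        return *boost::any_cast<T>(&slot);   // pointer overload: returns a pointer, no throw
    }
};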