Better solution to data storage and passing - c++

I'm trying to find a more elegant solution for some code I'm working on at the moment. I have data that needs to be stored then moved around, but I don't really want to take up any more space than I need to for the data that is stored.
I have two solutions, but neither seems very nice.
Using inheritance and a tag
enum class data_type{
first, second, third
};
class data_base{
public:
virtual ~data_base() = default; // needed because objects are deleted through data_ptr
virtual data_type type() const noexcept = 0;
};
using data_ptr = data_base*;
class first_data: public data_base{
public:
data_type type() const noexcept{return data_type::first;}
// hold the first data type
};
// ...
Then you pass around a data_ptr and cast it to the appropriate type.
I really don't like this approach because it requires downcasting and using bare pointers.
Using a union and storing all data types
enum class data_type{
first, second, third
};
class data{
public:
data(data_type type_): type(type_){}
data_type type;
union{
// first, second and third data types stored
};
};
But I don't like this approach because then you start wasting a lot of memory when you have a large data type that may get passed around.
This data will then be passed onto a function that will parse it into a greater expression. Something like this:
class expression{/* ... */};
class expr_type0: public expression{/* ... */};
// every expression type
using expr_ptr = expression*;
// remember to 'delete'
expr_ptr get_expression(){
data_ptr dat = get_data();
// interpret data
// may call 'get_data()' many times
expr_ptr expr = new /* expr_type[0-n] */
delete dat;
return expr;
}
and the problem arises again, but it doesn't matter in this case because the expr_ptr doesn't need to be reinterpreted and will have a simple virtual function call.
What is a more elegant method of tagging and passing around the data to another function?

It's difficult to envisage exactly what you're looking for without more information. But if I wanted some framework that allowed me to store and retrieve data in some structured way, in as-yet-unknown storage devices, this is the kind of approach I'd be thinking of.
This may not be the answer you're looking for, but I think there'll be concepts here that will inspire you in the right direction.
#include <cstddef>
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <utility>
#include <boost/variant.hpp>
// define some concepts
// bigfoo is a class that's expensive to copy - so let's give it a shared-handle idiom
struct bigfoo {
struct impl {
impl(std::string data) : _data(std::move(data)) {}
void write(std::ostream& os) const {
os << "I am a big object. Don't copy me: " << _data;
}
private:
std::string _data;
};
bigfoo(std::string data) : _impl { std::make_shared<impl>(std::move(data)) } {};
friend std::ostream& operator<<(std::ostream&os, const bigfoo& bf) {
bf._impl->write(os);
return os;
}
private:
std::shared_ptr<impl> _impl;
};
// all the data types our framework handles
using abstract_data_type = boost::variant<int, std::string, double, bigfoo>;
// defines the general properties of a data table store concept
template<class...Columns>
struct table_definition
{
using row_type = std::tuple<Columns...>;
};
// the concept of being able to store some type of table data on some kind of storage medium
template<class IoDevice, class TableDefinition>
struct table_implementation
{
using io_device_type = IoDevice;
using row_writer_type = typename io_device_type::row_writer_type;
template<class...Args> table_implementation(Args&&...args)
: _io_device(std::forward<Args>(args)...)
{}
template<class...Args>
void add_row(Args&&...args) {
auto row_instance = _io_device.open_row();
set_row_args(row_instance,
std::make_tuple(std::forward<Args>(args)...),
std::index_sequence_for<Args...>());
row_instance.commit();
}
private:
template<class Tuple, size_t...Is>
void set_row_args(row_writer_type& row_writer, const Tuple& args, std::index_sequence<Is...>)
{
using expand = int[];
expand x { 0, (row_writer.set_value(Is, std::get<Is>(args)), 0)... };
(void)x; // mark expand as unused;
}
private:
io_device_type _io_device;
};
// model the concepts into a concrete specialisation
// this is a 'data store' implementation which simply stores data to stdout in a structured way
struct std_out_io
{
struct row_writer_type
{
void set_value(size_t column, abstract_data_type value) {
// roll on c++17 with its much-anticipated try_emplace...
auto ifind = _values.find(column);
if (ifind == end(_values)) {
ifind = _values.emplace(column, std::move(value)).first;
}
else {
ifind->second = std::move(value);
}
}
void commit()
{
std::cout << "{" << std::endl;
auto sep = "\t";
for (auto& item : _values) {
std::cout << sep << item.first << "=" << item.second;
sep = ",\n\t";
}
std::cout << "\n}";
}
private:
std::map<size_t, abstract_data_type> _values; // some value mapped by ascending column number
};
row_writer_type open_row() {
return row_writer_type();
}
};
// this is a model of a 'data table' concept
using my_table = table_definition<int, std::string, double, bigfoo>;
// here is a test
auto main() -> int
{
auto data_store = table_implementation<std_out_io, my_table>( /* std_out_io has default constructor */);
data_store.add_row(1, "hello", 6.6, bigfoo("lots and lots of data"));
return 0;
}
expected output:
{
0=1,
1=hello,
2=6.6,
3=I am a big object. Don't copy me: lots and lots of data
}
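For comparison, here is a minimal sketch of the same tagged-data idea using std::variant, assuming a C++17 compiler is available. The payload types small_data, text_data and big_data and the function render are hypothetical names made up for this illustration; they do not come from the question. The large payload sits behind a shared_ptr, so the variant is only as large as its biggest small alternative plus a tag.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <variant>

// Hypothetical payload types standing in for the "first, second, third" data.
struct small_data { int value; };
struct text_data  { std::string text; };
// The large payload is held through a shared handle, so copies are cheap.
struct big_data   { std::shared_ptr<const std::string> blob; };

using data = std::variant<small_data, text_data, big_data>;

// Visitor with one call operator per alternative.
struct describe {
    std::string operator()(const small_data& d) const {
        return "small:" + std::to_string(d.value);
    }
    std::string operator()(const text_data& d) const {
        return "text:" + d.text;
    }
    std::string operator()(const big_data& d) const {
        return "big:" + std::to_string(d.blob->size());
    }
};

// Dispatches on whichever alternative is stored; no manual tag, no cast.
inline std::string render(const data& d) {
    return std::visit(describe{}, d);
}
```

A caller would write something like render(data{text_data{"hi"}}) and get the text branch; adding a new alternative without a matching call operator in describe is a compile error rather than a silent fall-through.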

Related

Usage of empty structs in C++

In some code that I was reading, I found the usage of empty struct like so:
struct input_iterator_tag { };
struct bidirectional_iterator_tag { };
struct random_access_iterator_tag { };
So in the rest of the code, it was used as what they call tag dispatching.
I was wondering if there is other usage of empty structs.
From an older post I saw that the three major reasons we use empty structs in C++ are:
a base interface
a template parameter
a type to help overload resolution. (tag dispatching if I am not wrong)
Could someone explain that please?
a type to help overload resolution. (tag dispatching if I am not wrong)
When you want to use a complex template specialization pattern on some function, you don't try to go at it directly, but rather write:
template <typename T1, typename T2, other things maybe>
int foo(T1 param1, T2 param2 and so on)
{
using tag = put your complex stuff here, which produces an empty struct
detail::foo_impl(tag, std::forward<T1>(param1), std::forward<T2>(param2) and so on);
}
Now, the compiler doesn't have to decide between competing choices of template specialization, since with different tags you get incompatible functions.
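To make the pseudocode above concrete, here is a small sketch of the pattern. The names classify, classify_impl and the three tag structs are hypothetical and chosen only for this illustration; the "complex stuff" is reduced to two standard type traits.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace detail {
// Empty tag types: they carry no data and exist only to select an overload.
struct integral_tag {};
struct floating_tag {};
struct other_tag {};

template <typename T>
std::string classify_impl(integral_tag, const T&) { return "integral"; }

template <typename T>
std::string classify_impl(floating_tag, const T&) { return "floating"; }

template <typename T>
std::string classify_impl(other_tag, const T&) { return "other"; }
}  // namespace detail

template <typename T>
std::string classify(const T& value) {
    // Compute the tag type once; exactly one classify_impl overload accepts it.
    using tag = typename std::conditional<
        std::is_integral<T>::value, detail::integral_tag,
        typename std::conditional<std::is_floating_point<T>::value,
                                  detail::floating_tag,
                                  detail::other_tag>::type>::type;
    return detail::classify_impl(tag{}, value);
}
```

Because each implementation takes a different tag type as its first parameter, the three overloads never compete with each other during overload resolution.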
a base interface
struct vehicle {
// common members and methods,
// including (pure) virtual ones, e.g.
virtual std::size_t num_maximum_occupants() = 0;
virtual ~vehicle() = default;
};
namespace mixins {
struct named { std::string name; };
struct wheeled { int num_wheels; void rev() { } };
} // namespace mixins
struct private_sedan : public vehicle, public mixins::wheeled, public mixins::named {
// I dunno, put some car stuff here
//
// and also an override of `num_maximum_occupants()`
};
Making the base struct completely empty is perhaps not that common, but it's certainly possible if you use mixins a lot. And you could check for inheritance from vehicle (although I'm not sure I'd do that).
a template parameter
Not sure what this means, but venturing a guess:
template <typename T>
struct foo { };
template <typename T, std::size_t N>
struct foo<std::array<T, N>> {
static constexpr int value = 1;
};
If you now use foo<T>::value in a function, it will compile only if T is a specialization of std::array.
I also tried to come up with examples:
as a base interface
// collection of very abstract vehicles
#include <vector>
struct Vehicle {};
struct Car : Vehicle {
int count_of_windows;
};
struct Bike : Vehicle {
int size_of_wheels;
};
std::vector<Vehicle> v{Bike{}, Car{}}; // note: the elements are sliced to plain Vehicle objects
as a template parameter
// print same number in 3 different formats
#include <iostream>
struct dec {};
struct hex {};
struct octal {};
template<typename HOW = dec>
void print_me(int v);
template<>
void print_me<dec>(int v) {
auto f = std::cout.flags();
std::cout << std::dec << v << std::endl;
std::cout.flags(f);
}
template<>
void print_me<hex>(int v) {
auto f = std::cout.flags();
std::cout << std::hex << v << std::endl;
std::cout.flags( f );
}
template<>
void print_me<octal>(int v) {
auto f = std::cout.flags();
std::cout << std::oct << v << std::endl;
std::cout.flags(f);
}
int main() {
print_me(100);
print_me<hex>(100);
print_me<octal>(100);
}
a type to help overload resolution
// add a "noexcept" qualifier to overloaded function
// the noexcept version typically uses different functions
// and a custom "abort" handler
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <stdexcept>
struct disable_exceptions {};
void is_number_1() {
int v;
std::cin >> v;
if (v != 1) {
throw std::runtime_error("AAAA"); // throw by value, not a pointer from new
}
}
void is_number_1(disable_exceptions) noexcept {
int v;
// use C function - they don't throw
if (std::scanf("%d", &v) != 1) {
std::abort();
}
if (v != 1) {
std::abort();
}
}
int main() {
is_number_1();
is_number_1(disable_exceptions());
}
An example of tag dispatching can be found on the cppreference page for the iterator tags. The iterator_category member type of an iterator (obtained through std::iterator_traits) is used to pick a different overload. That way you can write a different algorithm when the iterator is, for example, a forward iterator, where you can only go forward, or a bidirectional iterator, where your algorithm could change because you may walk back.
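As a rough sketch of that idea, the function below advances an iterator and picks its strategy from the iterator category. The names my_advance and advance_impl are hypothetical; the return value (number of stepping operations performed) is only there to make the chosen overload observable.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

namespace detail {
// Random-access iterators can jump in a single operation.
template <typename It>
std::size_t advance_impl(It& it, std::size_t n, std::random_access_iterator_tag) {
    it += static_cast<typename std::iterator_traits<It>::difference_type>(n);
    return n == 0 ? 0 : 1;
}

// Any weaker category converts to input_iterator_tag and steps one by one.
template <typename It>
std::size_t advance_impl(It& it, std::size_t n, std::input_iterator_tag) {
    for (std::size_t i = 0; i < n; ++i) ++it;
    return n;
}
}  // namespace detail

// Moves `it` forward by n and reports how many stepping operations were used.
template <typename It>
std::size_t my_advance(It& it, std::size_t n) {
    typename std::iterator_traits<It>::iterator_category category{};
    return detail::advance_impl(it, n, category);
}
```

A std::vector iterator matches the random-access overload exactly, while a std::list iterator (bidirectional) reaches the second overload through the tag inheritance chain.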

RVO with a standard layout struct without any constructors

I have a struct representing a binary message. I want to write a function to get the next such record from a buffer (whether a file or a socket, doesn't matter):
template <typename Record>
Record getNext();
Now, I could write this like:
template <typename Record>
Record getNext() {
Record r;
populateNext(reinterpret_cast<char*>(&r), // maybe ::read()
sizeof(r)); // or equivalent
return r;
}
which is nice and gives me the benefits of RVO. However, it will invoke the default constructor of Record, which may be composed of types with non-trivial default constructors that do work I would like to avoid. These are not necessarily POD types, but they are standard layout.
Is there a way to write getNext() such that we avoid any constructors (default or copy/move) on Record? Ideally, when the user calls:
auto record = getNext<Record>();
The buffer is read directly into the memory of record. Is this possible?
no_init is a constant of type no_init_t.
If you construct a pod from a no_init_t, you get an uninitialized pod, and (assuming elision) there is nothing to be done.
If you construct a non-pod from a no_init_t, you have to override a constructor, and make it not initialize the data. Usually class_name(no_init_t):field1(no_init), field2(no_init){} will do it, and sometimes class_name(no_init_t){} will do it (assuming all contents are pod).
Constructing from no_init on each member can act as a sanity check that the members are indeed pod, however. A non-pod class constructed from no_init will fail to compile until you write the no_init_t constructor.
This (having to no_init each member constructor) does generate some annoying DRY failure, but we don't got reflection, so you are gonna repeat yourself and like it.
#include <cstddef>
#include <type_traits>
namespace {
struct no_init_t {
template<class T, class=std::enable_if_t<std::is_pod<T>{}>>
operator T()const{
T tmp;
return tmp;
}
static no_init_t instance() { return {}; }
no_init_t(no_init_t const&) = default;
private:
no_init_t() = default;
};
static const no_init_t no_init = no_init_t::instance();
}
struct Foo {
char buff[1000];
size_t hash;
Foo():Foo(""){}
template<size_t N, class=std::enable_if_t< (N<=sizeof(buff)) >>
Foo( char const(&in)[N] ) {
// some "expensive" copy and hash
}
Foo(no_init_t) {} // no initialization!
};
struct Record {
int x;
Foo foo;
Record()=default;
Record(no_init_t):
x(no_init), foo(no_init)
{}
};
Now we can construct Record with no_init and it won't be initialized.
Every POD class is not initialized. Every non-POD class must provide a no_init_t constructor (and presumably implement non-initialization, as best it can).
You then memcpy right over it.
This requires modifying your type, and the types it contains, to support non-initialization.
Something like this?
EDIT:
Addresses comment on alignment. Now uses anonymous union to ensure correct alignment.
TestRecord now incorporates another standard layout type egg
Added proof that even though egg has a default constructor, the class is not constructed prior to being filled by populateNextRecord()
I think this is about as fast as it can be isn't it?
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
struct egg {
egg(int i) : _val(i) {}
egg() {}
int _val = 6;
friend std::ostream& operator<<(std::ostream& os, const egg& e) {
return os << e._val;
}
};
struct TestRecord {
egg x;
double y;
};
void populateNext(uint8_t* first, size_t length)
{
// do work here
TestRecord data_source { 10, 100.2 };
auto source = reinterpret_cast<uint8_t*>(&data_source);
std::copy(source, source + length, first);
}
template<class Record>
struct RecordProxy
{
RecordProxy() {}
uint8_t* data() {
return _data;
}
static constexpr size_t size() {
return sizeof(Record);
}
Record& as_record() {
return _record;
}
union {
Record _record;
uint8_t _data[sizeof(Record)];
};
};
template <typename Record>
RecordProxy<Record> getNext() {
RecordProxy<Record> r;
populateNext(r.data(), // maybe ::read()
r.size()); // or equivalent
return r;
}
using namespace std;
int main()
{
RecordProxy<TestRecord> prove_not_initialised;
auto& r1 = prove_not_initialised.as_record();
cout << "x = " << r1.x << ", y = " << r1.y << endl;
auto buffer = getNext<TestRecord>();
auto& actual_record = buffer.as_record();
cout << "x = " << actual_record.x << ", y = " << actual_record.y << endl;
return 0;
}
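One caveat worth making explicit: copying raw bytes into an object is only well defined for trivially copyable types, which is a different property from standard layout. The sketch below adds a compile-time guard for that. WireRecord and from_bytes are hypothetical names for this illustration, and the function still default-initialises the record, so it does not by itself avoid a non-trivial default constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical wire-format record used only for illustration.
struct WireRecord {
    int id;
    double value;
};

// Copies sizeof(Record) bytes from a raw buffer into a fresh Record.
template <typename Record>
Record from_bytes(const unsigned char* buffer, std::size_t length) {
    // Refuse to compile for types where a byte-wise copy is not defined.
    static_assert(std::is_trivially_copyable<Record>::value,
                  "byte-wise population needs a trivially copyable type");
    assert(length >= sizeof(Record));
    Record r;  // default-initialised; a no-op for trivial types
    std::memcpy(&r, buffer, sizeof(Record));
    return r;
}
```

With this in place, from_bytes<WireRecord>(buf, sizeof buf) compiles, while a record containing, say, a std::string member is rejected at compile time.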

Dynamically define a function return type

I have a Message class that is able to pack its payload to binary and unpack it back. Like:
PayloadA p;
msg->Unpack(&p);
where PayloadA is a class.
The problem is that I have a bunch of payloads, so I need giant if or switch statement:
if (msg->PayloadType() == kPayloadTypeA)
{
PayloadA p;
msg->Unpack(&p); // void Unpack(IPayload *);
// do something with payload ...
}
else if ...
I want to write a helper function that unpacks payloads. But what would be the type of this function? Something like:
PayloadType UnpackPayload(IMessage *msg) { ... }
where PayloadType is a typedef of a proper payload class. I know it is impossible, but I am looking for solutions like this. Any ideas?
Thanks.
I would split one level higher to avoid the problem entirely:
#include <map>
#include <functional>
...
std::map<int, std::function<void(IMessage*)>> _actions;
...
// In some init section
_actions[kPayloadTypeA] = [](IMessage* msg) {
PayloadA p;
msg->Unpack(&p);
// do something with payload ...
};
// repeat for all payloads
...
// decoding function
void DecodeMsg(IMessage* msg) {
_actions[msg->PayloadType()](msg);
}
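Here is a self-contained sketch of that dispatch table, under stated assumptions: Message, PayloadA, PayloadB, the PayloadKind values and the string-returning handlers are simplified, hypothetical stand-ins for the question's types, and "unpacking" is reduced to parsing a string.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins for the question's types.
enum PayloadKind { kPayloadTypeA = 1, kPayloadTypeB = 2, kPayloadTypeC = 3 };

struct PayloadA { int number = 0; };
struct PayloadB { std::string text; };

struct Message {
    PayloadKind kind;
    std::string raw;  // packed payload, here just text

    PayloadKind PayloadType() const { return kind; }
    void Unpack(PayloadA* p) const { p->number = std::stoi(raw); }
    void Unpack(PayloadB* p) const { p->text = raw; }
};

using Handler = std::function<std::string(const Message&)>;

// Built once; each entry unpacks one payload type and handles it.
inline const std::map<int, Handler>& Actions() {
    static const std::map<int, Handler> table = {
        {kPayloadTypeA, [](const Message& m) {
             PayloadA p;
             m.Unpack(&p);
             return "A:" + std::to_string(p.number);
         }},
        {kPayloadTypeB, [](const Message& m) {
             PayloadB p;
             m.Unpack(&p);
             return "B:" + p.text;
         }},
    };
    return table;
}

// Looks the handler up with find(), so unknown types are not inserted.
inline std::string Decode(const Message& m) {
    auto it = Actions().find(m.PayloadType());
    if (it == Actions().end()) return "unknown";
    return it->second(m);
}
```

Using find() instead of operator[] keeps the table const and avoids calling an empty std::function for a payload type that has no handler (kPayloadTypeC here).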
To further reduce the code size, try to make Unpack a function template (easily possible only if it's not virtual; if it is, you can try to add one level of indirection so that it isn't):
class Message {
public:
template <class Payload>
Payload Unpack() { ... }
};
auto p = msg->Unpack<PayloadA>();
// do something with payload ...
EDIT
Now let's see how we can avoid writing the long list of _actions[kPayloadN]. This is highly non-trivial.
First you need a helper to run code during the static initialization (i.e. before main):
template <class T>
class Registrable
{
struct Registrar
{
Registrar()
{
T::Init();
}
};
static Registrar R;
template <Registrar& r>
struct Force{ };
static Force<R> F; // Required to force all compilers to instantiate R
// it won't be done without this
};
template <class T>
typename Registrable<T>::Registrar Registrable<T>::R;
Now we need to define our actual registration logic:
typedef std::map<int, std::function<void(IMessage*)>> PayloadActionsT;
inline PayloadActionsT& GetActions() // you may move this to a CPP
{
static PayloadActionsT all;
return all;
}
Then we factor in the parsing code:
template <class Payload>
struct BasePayload : Registrable<BasePayload<Payload>>
{
static void Init()
{
GetActions()[Payload::Id] = [](IMessage* msg) {
auto p = msg->Unpack<Payload>();
p.Action();
};
}
};
Then we define all the payloads one by one
struct PayloadA : BasePayload<PayloadA>
{
static const int Id = /* something unique */;
void Action()
{ /* what to do with this payload */ }
};
Finally we parse the incoming messages:
void DecodeMessage(IMessage* msg)
{
// at() throws std::out_of_range for a payload type that was never registered
GetActions().at(msg->PayloadType())(msg);
}
How about a Factory Method that creates a payload according to the type, combined with a payload constructor for each payload type, taking a message as a parameter?
There's no avoiding the switch (or some similar construct), but at least it's straightforward and the construction code is separate from the switch.
Example:
class PayloadA : public Payload
{
public:
PayloadA(const Message &m) {...} // unpacks from m
};
class PayloadB : public Payload
{
public:
PayloadB(const Message &m) {...} // as above
};
Payload * UnpackFromMessage(const Message &m)
{
switch (m.PayloadType) {
case TypeA : return new PayloadA(m);
case TypeB : return new PayloadB(m);
... etc...
}
}
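A compilable variant of the same factory, as a sketch only: it returns std::unique_ptr so the caller cannot forget to delete the payload. Message, PayloadKind and the Describe member are hypothetical simplifications introduced for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical message: a type tag plus the packed payload as text.
enum class PayloadKind { A, B };

struct Message {
    PayloadKind kind;
    std::string raw;
};

class Payload {
public:
    virtual ~Payload() = default;
    virtual std::string Describe() const = 0;
};

class PayloadA : public Payload {
public:
    explicit PayloadA(const Message& m) : number_(std::stoi(m.raw)) {}  // unpacks from m
    std::string Describe() const override { return "A:" + std::to_string(number_); }
private:
    int number_;
};

class PayloadB : public Payload {
public:
    explicit PayloadB(const Message& m) : text_(m.raw) {}  // unpacks from m
    std::string Describe() const override { return "B:" + text_; }
private:
    std::string text_;
};

// The switch is the only place that knows the mapping from tag to class.
inline std::unique_ptr<Payload> UnpackFromMessage(const Message& m) {
    switch (m.kind) {
        case PayloadKind::A: return std::unique_ptr<Payload>(new PayloadA(m));
        case PayloadKind::B: return std::unique_ptr<Payload>(new PayloadB(m));
    }
    return nullptr;  // reached only for a tag outside the enum
}
```

Callers then work through the common Payload interface, for example UnpackFromMessage(msg)->Describe().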
I have seen this solved with unions. The first member of the union is the type of packet contained.
Examples here: What is a union?
An important question is how the payloads differ, and how they are the same. A system whereby you produce objects of a type determined by the payload, then interact with them via a virtual interface that is common to all types of payload, is reasonable in some cases.
Another option, assuming you have a finite and fixed list of payload types, is to return a boost::variant, which is relatively easy. Then to process it, call apply_visitor with a functor that accepts every type in the variant.
If you only want to handle one type of payload differently, a "call and run the lambda if and only if the type matches T" function isn't that hard to write this way.
So you can get syntax like this:
struct State;
struct HandlePayload
{
typedef void result_type; // the nested name boost::apply_visitor expects
State* s;
HandlePayload(State* s_):s(s_) {}
void operator()( int const& payload ) const {
// handle int here
}
void operator()( std::shared_ptr<bob> const& payload ) const {
// handle bob ptrs here
}
template<typename T>
void operator()( T const& payload ) const {
// other types, maybe ignore them
}
};
which is cute and all, but you'll note it is quite indirect. However, you'll also note that you can write template code with a generic type T above to handle the payload, and use stuff like traits classes for some situations, or explicit specialization for others.
If you expect the payload to be one particular kind, and only want to do some special work in that case, writing a single-type handler on a boost::variant is easy.
template<typename T, typename Func>
struct Helper {
typedef bool result_type;
Func f;
Helper(Func f_):f(f_) {}
bool operator()(T const& t) const {f(t); return true; }
template<typename U>
bool operator()(U const& u) const { return false; }
};
template<typename T, typename Variant, typename Func>
bool ApplyFunc( Variant const& v, Func f )
{
return boost::apply_visitor( Helper<T, Func>(f), v );
}
which will call f on a variant v but only on the type T in the Variant, returning true iff the type is matched.
Using this, you can do stuff like:
boost::variant<int, double> v = 1.0;
boost::variant<int, double> v2 = int(1);
ApplyFunc<double>( v, [&](double d) { std::cout << "Double is " << d << "\n"; } );
ApplyFunc<double>( v2, [&](double d) { std::cout << "code is not run\n"; } );
ApplyFunc<int>( v2, [&](int i) { std::cout << "code is run\n"; } );
or some such variant.
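If a C++17 compiler is available, the same "run only if the type matches" helper can be sketched against std::variant with std::get_if, without a visitor class. The name apply_if is hypothetical; this is an illustration of the idea, not a drop-in replacement for the Boost version above.

```cpp
#include <cassert>
#include <variant>

// Calls f with the stored value only when the variant currently holds a T.
// Returns true if f was called, false otherwise.
template <typename T, typename Variant, typename Func>
bool apply_if(const Variant& v, Func f) {
    if (const T* p = std::get_if<T>(&v)) {
        f(*p);
        return true;
    }
    return false;
}
```

Usage mirrors ApplyFunc: apply_if<double>(v, [](double d) { /* ... */ }) runs the lambda only when v holds a double.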
One good solution is a common base class + all payloads inheriting from that class:
class PayLoadBase {
public:
virtual ~PayLoadBase() {}
virtual char get_ch(int i) const=0;
virtual int num_chs() const=0;
};
And then the unpack would look like this:
class Unpacker {
public:
PayLoadBase &Unpack(IMessage *msg) {
switch(msg->PayLoadType()) {
case A: a = *msg; return a;
case B: b = *msg; return b;
...
}
}
private:
PayLoadA a;
PayLoadB b;
PayLoadC c;
};
You can make the function return a void *. A void pointer can be cast to any other object pointer type, but the caller still has to know the real type in order to cast it back correctly.

C++ Push Multiple Types onto Vector

Note: I know similar questions to this have been asked on SO before, but I did not find them helpful or very clear.
Second note: For the scope of this project/assignment, I'm trying to avoid third party libraries, such as Boost.
I am trying to see if there is a way I can have a single vector hold multiple types, in each of its indices. For example, say I have the following code sample:
vector<something magical to hold various types> vec;
int x = 3;
string hi = "Hello World";
MyStruct s = {3, "Hi", 4.01};
vec.push_back(x);
vec.push_back(hi);
vec.push_back(s);
I've heard vector<void*> could work, but then it gets tricky with memory allocation, and there is always the possibility that nearby memory could be unintentionally overwritten if a value inserted at a certain index is larger than expected.
In my actual application, I know what possible types may be inserted into a vector, but these types do not all derive from the same super class, and there is no guarantee that all of these types will be pushed onto the vector or in what order.
Is there a way that I can safely accomplish the objective I demonstrated in my code sample?
Thank you for your time.
The objects held by a std::vector<T> need to be of a homogeneous type. If you need to put objects of different types into one vector, you need to somehow erase their type and make them all look similar. You could use the moral equivalent of boost::any or boost::variant<...>. The idea of boost::any is to encapsulate a type hierarchy, storing a pointer to the base but pointing to a templatized derived class. A very rough and incomplete outline looks something like this:
#include <algorithm>
#include <iostream>
class any
{
private:
struct base {
virtual ~base() {}
virtual base* clone() const = 0;
};
template <typename T>
struct data: base {
data(T const& value): value_(value) {}
base* clone() const { return new data<T>(*this); }
T value_;
};
base* ptr_;
public:
template <typename T> any(T const& value): ptr_(new data<T>(value)) {}
any(any const& other): ptr_(other.ptr_->clone()) {}
any& operator= (any const& other) {
any(other).swap(*this);
return *this;
}
~any() { delete this->ptr_; }
void swap(any& other) { std::swap(this->ptr_, other.ptr_); }
template <typename T>
T& get() {
return dynamic_cast<data<T>&>(*this->ptr_).value_;
}
};
int main()
{
any a0(17);
any a1(3.14);
try { a0.get<double>(); } catch (...) {}
a0 = a1;
std::cout << a0.get<double>() << "\n";
}
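Since the set of possible types is known up front, a variant is the other route. Here is a minimal sketch assuming C++17, where std::variant is part of the standard library and so needs no third-party dependency. MyStruct mirrors the struct in the question; Item, count_of and make_items are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical struct mirroring the question's MyStruct.
struct MyStruct {
    int a;
    std::string b;
    double c;
};

// Every element holds exactly one of these types, plus a tag.
using Item = std::variant<int, std::string, MyStruct>;

// Counts how many elements currently hold a value of type T.
template <typename T>
std::size_t count_of(const std::vector<Item>& items) {
    std::size_t n = 0;
    for (const Item& item : items) {
        if (std::holds_alternative<T>(item)) ++n;
    }
    return n;
}

// Builds the vector from the question's example values.
inline std::vector<Item> make_items() {
    std::vector<Item> vec;
    vec.push_back(3);
    vec.push_back(std::string("Hello World"));
    vec.push_back(MyStruct{3, "Hi", 4.01});
    return vec;
}
```

Reading a value back with std::get<T> throws std::bad_variant_access if the element holds a different type, so a mismatch cannot silently corrupt neighbouring memory.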
As suggested you can use various forms of unions, variants, etc. Depending on what you want to do with your stored objects, external polymorphism could do exactly what you want, if you can define all necessary operations in a base class interface.
Here's an example if all we want to do is print the objects to the console:
#include <iostream>
#include <string>
#include <vector>
#include <memory>
class any_type
{
public:
virtual ~any_type() {}
virtual void print() = 0;
};
template <class T>
class concrete_type : public any_type
{
public:
concrete_type(const T& value) : value_(value)
{}
virtual void print()
{
std::cout << value_ << '\n';
}
private:
T value_;
};
int main()
{
std::vector<std::unique_ptr<any_type>> v(2);
v[0].reset(new concrete_type<int>(99));
v[1].reset(new concrete_type<std::string>("Bottles of Beer"));
for(size_t x = 0; x < 2; ++x)
{
v[x]->print();
}
return 0;
}
In order to do that, you'll definitely need a wrapper class to somehow conceal the type information of your objects from the vector.
It's probably also good to have this class throw an exception when you try to get Type-A back when you have previously stored a Type-B into it.
Here is part of the Holder class from one of my projects. You can probably start from here.
Note: due to the use of unrestricted unions, this only works in C++11 or later. More information about this can be found here: What are Unrestricted Unions proposed in C++11?
class Holder {
public:
enum Type {
BOOL,
INT,
STRING,
// Other types you want to store into vector.
};
template<typename T>
Holder (Type type, T val);
~Holder () {
// You want to properly destroy
// union members below that have non-trivial constructors
}
operator bool () const {
if (type_ != BOOL) {
throw SomeException();
}
return impl_.bool_;
}
// Do the same for other operators
// Or maybe use templates?
private:
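union Impl {
bool bool_;
int int_;
string string_;
Impl() { new(&string_) string; }
~Impl() {} // required because string has a non-trivial destructor; Holder destroys the active member
} impl_;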
Type type_;
// Other stuff.
};
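To show the parts the excerpt leaves out, here is a reduced, self-contained sketch of such a holder with just two alternatives. It is not the original Holder class: the accessor names as_int and as_string, the use of std::logic_error, and the decision to disable copying are all assumptions made to keep the example short.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

class Holder {
public:
    enum Type { INT, STRING };

    explicit Holder(int v) : type_(INT) { impl_.int_ = v; }
    explicit Holder(const std::string& s) : type_(STRING) {
        new (&impl_.string_) std::string(s);  // start the string's lifetime
    }

    // Copying would need a per-type switch; disabled to keep the sketch short.
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;

    ~Holder() {
        using std::string;
        if (type_ == STRING) impl_.string_.~string();  // destroy the active member
    }

    int as_int() const {
        if (type_ != INT) throw std::logic_error("Holder does not contain an int");
        return impl_.int_;
    }
    const std::string& as_string() const {
        if (type_ != STRING) throw std::logic_error("Holder does not contain a string");
        return impl_.string_;
    }

private:
    union Impl {
        int int_;
        std::string string_;
        Impl() {}   // constructs no member; Holder's constructor picks one
        ~Impl() {}  // destroys no member; Holder's destructor handles it
    } impl_;
    Type type_;
};
```

The tag is checked on every read, so asking for the wrong type throws instead of reinterpreting the bytes.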

OneOfAType container -- storing one each of a given type in a container -- am I off base here?

I've got an interesting problem that's cropped up in a sort of pass-based compiler of mine. Each pass knows nothing of other passes, and a common object is passed down the chain as it goes, following the chain of responsibility pattern.
The object that is being passed along is a reference to a file.
Now, during one of the stages, one might wish to associate a large chunk of data, such as that file's SHA512 hash, which requires a reasonable amount of time to compute. However, since that chunk of data is only used in that specific case, I don't want all file references to need to reserve space for that SHA512. However, I also don't want other passes to have to recalculate the SHA512 hash over and over again. For example, someone might only accept files which match a given list of SHA512s, but they don't want that value printed when the file reference gets to the end of the chain, or perhaps they want both, and so on.
What I need is some sort of container which contains only one instance of a given type. If the container does not contain that type, it needs to create an instance of that type and store it somehow. It's basically a dictionary with the type being the thing used to look things up.
Here's what I've gotten so far, the relevant bit being the FileData::Get<t> method:
class FileData;
// Cache entry interface
struct FileDataCacheEntry
{
virtual void Initialize(FileData&)
{
}
virtual ~FileDataCacheEntry()
{
}
};
// Cache itself
class FileData
{
struct Entry
{
std::size_t identifier;
FileDataCacheEntry * data;
Entry(FileDataCacheEntry *dataToStore, std::size_t id)
: identifier(id), data(dataToStore)
{
}
std::size_t GetIdentifier() const
{
return identifier;
}
void DeleteData()
{
delete data;
}
};
WindowsApi::ReferenceCounter refCount;
std::wstring fileName_;
std::vector<Entry> cache;
public:
FileData(const std::wstring& fileName) : fileName_(fileName)
{
}
~FileData()
{
if (refCount.IsLastObject())
std::for_each(cache.begin(), cache.end(), std::mem_fun_ref(&Entry::DeleteData));
}
const std::wstring& GetFileName() const
{
return fileName_;
}
//RELEVANT METHOD HERE
template<typename T>
T& Get()
{
std::vector<Entry>::iterator foundItem =
std::find_if(cache.begin(), cache.end(), boost::bind(
std::equal_to<std::size_t>(), boost::bind(&Entry::GetIdentifier, _1), T::TypeId));
if (foundItem == cache.end())
{
std::auto_ptr<T> newCacheEntry(new T);
Entry toInsert(newCacheEntry.get(), T::TypeId);
cache.push_back(toInsert);
newCacheEntry.release();
T& result = *static_cast<T*>(cache.back().data);
result.Initialize(*this);
return result;
}
else
{
return *static_cast<T*>(foundItem->data);
}
}
};
// Example item you'd put in cache
class FileBasicData : public FileDataCacheEntry
{
DWORD dwFileAttributes;
FILETIME ftCreationTime;
FILETIME ftLastAccessTime;
FILETIME ftLastWriteTime;
unsigned __int64 size;
public:
enum
{
TypeId = 42
};
virtual void Initialize(FileData& input)
{
// Get file attributes and friends...
}
DWORD GetAttributes() const;
bool IsArchive() const;
bool IsCompressed() const;
bool IsDevice() const;
// More methods here
};
int main()
{
// Example use
FileData fd(L"example.txt"); // FileData has no default constructor; the file name is only illustrative
FileBasicData& data = fd.Get<FileBasicData>();
// etc
}
For some reason though, this design feels wrong to me, namely because it's doing a whole bunch of things with untyped pointers. Am I severely off base here? Are there preexisting libraries (boost or otherwise) which would make this clearer/easier to understand?
As ergosys said already, std::map is the obvious solution to your problem. But I can see your concerns with RTTI (and the associated bloat). As a matter of fact, an "any" value container does not need RTTI to work. It is sufficient to provide a mapping between a type and a unique identifier. Here is a simple class that provides this mapping:
#include <stdexcept>
#include <boost/shared_ptr.hpp>
class typeinfo
{
private:
typeinfo(const typeinfo&);
void operator = (const typeinfo&);
protected:
typeinfo(){}
public:
bool operator != (const typeinfo &o) const { return this != &o; }
bool operator == (const typeinfo &o) const { return this == &o; }
template<class T>
static const typeinfo & get()
{
static struct _ti : public typeinfo {} _inst;
return _inst;
}
};
typeinfo::get<T>() returns a reference to a simple, stateless singleton which allows comparisons.
This singleton is created only for types T where typeinfo::get< T >() is issued anywhere in the program.
Now we are using this to implement a top type we call value. value is a holder for a value_box which actually contains the data:
class value_box
{
public:
// returns the typeinfo of the most derived object
virtual const typeinfo& type() const =0;
virtual ~value_box(){}
};
template<class T>
class value_box_impl : public value_box
{
private:
friend class value;
T m_val;
value_box_impl(const T &t) : m_val(t) {}
virtual const typeinfo& type() const
{
return typeinfo::get< T >();
}
};
// specialization for void.
template<>
class value_box_impl<void> : public value_box
{
private:
friend class value;
virtual const typeinfo& type() const
{
return typeinfo::get< void >();
}
// This is an optimization to avoid heap pressure for the
// allocation of stateless value_box_impl<void> instances:
void* operator new(size_t)
{
static value_box_impl<void> inst;
return &inst;
}
void operator delete(void* d)
{
}
};
Here's the bad_value_cast exception:
class bad_value_cast : public std::runtime_error
{
public:
bad_value_cast(const char *w="") : std::runtime_error(w) {}
};
And here's value:
class value
{
private:
boost::shared_ptr<value_box> m_value_box;
public:
// a default value contains 'void'
value() : m_value_box( new value_box_impl<void>() ) {}
// embed an object of type T.
template<class T>
value(const T &t) : m_value_box( new value_box_impl<T>(t) ) {}
// get the typeinfo of the embedded object
const typeinfo & type() const { return m_value_box->type(); }
// convenience type to simplify overloading on return values
template<class T> struct arg{};
template<class T>
T convert(arg<T>) const
{
if (type() != typeinfo::get<T>())
throw bad_value_cast();
// this is safe now
value_box_impl<T> *impl=
static_cast<value_box_impl<T>*>(m_value_box.get());
return impl->m_val;
}
void convert(arg<void>) const
{
if (type() != typeinfo::get<void>())
throw bad_value_cast();
}
};
The convenient casting syntax:
template<class T>
T value_cast(const value &v)
{
return v.convert(value::arg<T>());
}
And that's it. Here is what it looks like:
#include <string>
#include <map>
#include <iostream>
int main()
{
std::map<std::string,value> v;
v["zero"]=0;
v["pi"]=3.14159;
v["password"]=std::string("swordfish");
std::cout << value_cast<int>(v["zero"]) << std::endl;
std::cout << value_cast<double>(v["pi"]) << std::endl;
std::cout << value_cast<std::string>(v["password"]) << std::endl;
}
The nice thing about having your own implementation of any is that you can very easily tailor it to the features you actually need, which is quite tedious with boost::any. For example, there are few requirements on the types that value can store: they need to be copy-constructible and have a public destructor. What if all types you use have an operator<<(ostream&,T) and you want a way to print your dictionaries? Just add a to_stream method to box and overload operator<< for value and you can write:
std::cout << v["zero"] << std::endl;
std::cout << v["pi"] << std::endl;
std::cout << v["password"] << std::endl;
Here's a pastebin with the above, should compile out of the box with g++/boost: http://pastebin.com/v0nJwVLW
EDIT: Added an optimization to avoid the allocation of box_impl< void > from the heap:
http://pastebin.com/pqA5JXhA
You can create a hash or map of string to boost::any. The string key can be derived from any::type(), for example through its name() member.
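A standard-library-only sketch of the "dictionary keyed by type" idea, assuming C++11 and that RTTI is acceptable (it relies on typeid): the map key is std::type_index and each slot is a type-erased shared_ptr<void>. OneOfEach, HashData and BasicData are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>

// Holds at most one lazily created instance per type.
class OneOfEach {
public:
    template <typename T>
    T& Get() {
        std::shared_ptr<void>& slot = entries_[std::type_index(typeid(T))];
        if (!slot) {
            slot = std::make_shared<T>();  // created on first request only
        }
        // Safe: the slot for typeid(T) is only ever filled with a T.
        return *static_cast<T*>(slot.get());
    }

    std::size_t Size() const { return entries_.size(); }

private:
    // shared_ptr<void> remembers the right deleter for the stored type.
    std::map<std::type_index, std::shared_ptr<void>> entries_;
};

// Hypothetical cache entries.
struct HashData  { int computed = 0; };
struct BasicData { long size = 0; };
```

Each type needs no TypeId constant and no common base class; the expensive work (such as computing a hash) would go in the entry type's default constructor or in a separate initialisation step.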