Collections holding different Types Simultaneously

Traditionally, I've programmed in C++ and Java, and I'm now beginning to learn Ruby.
My question then is, how do languages like Ruby internally implement their array and hash data structures in such a way that they can hold any type at the same time? I know that in Java, the fact that every class is derived from Object could be one way to implement this, but I was wondering if there was another way. For example, in C++, if I wanted to implement a dynamic array that could simultaneously hold multiple types of values (of no relation), how could I do this?
To clarify, I'm not referring to generic programming or templates, as those simply create a new collection interface for a type. I'm referring to a structure such as this:
array = [1, "hello", someClass];

Most of them do roughly the same as you'd get in C++ by creating a vector (or list, deque, etc.) of boost::any, or something similar.
That is to say, they basically attach some tag to each type of object as it's stored in memory. When they store an object, they store the tag. When they read an object, they look at the tag to figure out what kind of object that is. Of course, they also handle most of this internally, so you don't have to write the code to figure out what kind of object you've just retrieved from the collection.
In case it's not clear: the "tag" is just a unique number assigned to each type. If the system you're dealing with has primitive types, it'll normally pre-assign a type number to each of them. Likewise, each class you create gets a unique number assigned to it.
To do that in C++, you'd normally create a central registry of tags. When you register a type, you receive a unique number back that you use to tag objects of that type. When a language supports this directly, it automates the process of registering types and choosing a unique tag for each.
Although this is probably the most common method of implementing such things, it's definitely not the only one. Just for example, it's also possible to designate specific ranges of storage for particular types. When you allocate an object of a given type, it's always allocated from that type's address range. When you create a collection of "objects", you're really not storing the objects themselves, but instead storing something that contains the address of the object. Since objects are segregated by address you can figure out the type of the object based on the value of the pointer.
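The "central registry of tags" idea can be sketched in a few lines of C++. This is only an illustration under assumptions: the names next_tag, tag_of, Tagged, make_tagged and get_if are hypothetical, and the registry is not thread-aware beyond what function-local statics give you.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical registry: hands out one unique number per registered type.
inline std::size_t next_tag() {
    static std::size_t counter = 0;
    return counter++;
}

// The first call for a given T "registers" it; later calls return the same tag.
template <typename T>
std::size_t tag_of() {
    static const std::size_t tag = next_tag();
    return tag;
}

// What the collection actually stores: the tag plus an untyped pointer.
struct Tagged {
    std::size_t tag;
    void* object;
};

template <typename T>
Tagged make_tagged(T& object) {
    return Tagged{tag_of<T>(), &object};
}

// Returns the object if its tag matches T, otherwise nullptr.
template <typename T>
T* get_if(const Tagged& item) {
    return item.tag == tag_of<T>() ? static_cast<T*>(item.object) : nullptr;
}
```

A std::vector<Tagged> can then hold an int and a std::string side by side, and get_if<T> checks the tag before casting back.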

In the MRI interpreter, a Ruby value is stored as a pointer to a data structure that holds the class of the value and any data associated with it. Since pointers are always the same size (usually sizeof(unsigned long)), values of any class can sit side by side in one array. To answer your question about C++: it is impossible in C++ to determine the class of an object given only its location in memory, so it wouldn't be possible unless you had something like this:
enum object_class { STRING, ARRAY, MAP /* , etc. */ };
struct TaggedObject {
enum object_class klass;
void *value;
};
and passed around TaggedObject * values. That is pretty much what ruby does internally.

There are many ways to do that:
You can define a common interface for all the elements and make a container of those. For example:
class Common { /* ... */ }; // the common interface.
You can use a container of void*:
vector<void*> common; // this is rather low level;
// you have to cast a lot.
And then the best approach, I think, is to use an Any class, such as boost::any:
vector<boost::any> v;

You're looking for something called type erasure. The simplest way to do this in C++ is with boost::any:
std::vector<boost::any> stuff;
stuff.push_back(1);
stuff.push_back(std::string("hello"));
stuff.push_back(someClass);
Of course with any, you're extremely limited in what you can do with your stuff since you have to personally remember everything you put into it.
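To make the "you have to remember what you put in" point concrete, here is a small sketch of reading values back out. It assumes C++17's std::any, which follows the same design as boost::any; the function name describe is made up for this example.

```cpp
#include <any>
#include <string>
#include <typeinfo>
#include <vector>

// Describes one element by asking which type it currently holds.
std::string describe(const std::any& item) {
    if (item.type() == typeid(int)) {
        return "int: " + std::to_string(std::any_cast<int>(item));
    }
    // The pointer form of any_cast returns nullptr on a type mismatch.
    if (const auto* s = std::any_cast<std::string>(&item)) {
        return "string: " + *s;
    }
    return "unknown type";
}

std::vector<std::any> make_stuff() {
    std::vector<std::any> stuff;
    stuff.push_back(1);
    stuff.push_back(std::string("hello"));
    stuff.push_back(2.5);  // a double, which describe() does not know about
    return stuff;
}
```

Every type you want to handle needs its own branch, which is exactly the limitation mentioned above.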
A more common use-case of heterogeneous containers might be a series of callbacks. The standard class std::function<R(Args...)> is, in fact, a type-erased functor:
void foo() { .. }
struct SomeClass {
void operator()() { .. }
};
std::vector<std::function<void()>> callbacks;
callbacks.push_back(foo);
callbacks.push_back(SomeClass{});
callbacks.push_back([]{ .. });
Here, we're adding three objects of different types (a void(*)(), a SomeClass, and some lambda) to the same container - which we do by erasing the type. So we can still do:
for (auto& func : callbacks) {
func();
}
And that will do the right thing in each of the three objects... no virtuals needed!
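For the curious, one common way such a type-erased wrapper can be built is sketched below. This is not how any particular std::function implementation is written; Callback, Concept and Model are hypothetical names, and the wrapper is fixed to the signature int() to keep it short. The virtual call lives inside the wrapper, so the stored types themselves need no virtual functions.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for std::function<int()>.
class Callback {
    // Virtual interface hidden inside the wrapper.
    struct Concept {
        virtual ~Concept() = default;
        virtual int call() = 0;
    };
    // The compiler generates one Model<F> per stored callable type F.
    template <typename F>
    struct Model : Concept {
        explicit Model(F f) : fn(std::move(f)) {}
        int call() override { return fn(); }
        F fn;
    };
    std::unique_ptr<Concept> impl;

public:
    template <typename F>
    Callback(F f) : impl(std::make_unique<Model<F>>(std::move(f))) {}
    int operator()() { return impl->call(); }
};

int one() { return 1; }
struct Two {
    int operator()() { return 2; }
};

// Stores three unrelated callable types in one vector and calls them all.
int sum_all() {
    std::vector<Callback> callbacks;
    callbacks.push_back(Callback(one));               // function pointer
    callbacks.push_back(Callback(Two{}));             // function object
    callbacks.push_back(Callback([] { return 3; }));  // lambda
    int sum = 0;
    for (auto& cb : callbacks) sum += cb();
    return sum;
}
```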

Others have explained ways you can do this in C++.
There are various ways to solve this problem. To answer your question about how languages such as Ruby solve this, without going into the details of exactly how Ruby does it: they use a structure that contains type information. For example, we could do that in C++ something like this:
enum TypeKind { None, Int, Float, String }; // May need a few more?
class TypeBase
{
protected:
TypeKind kind;
public:
TypeBase(TypeKind k) : kind(k) { }
virtual ~TypeBase() {};
TypeKind type() { return kind; }
};
class TypeInt : public TypeBase
{
private:
int value;
public:
TypeInt(int v) : TypeBase(Int), value(v) {} // base class is initialized first
};
class TypeFloat : public TypeBase
{
private:
double value;
public:
TypeFloat(double v) : TypeBase(Float), value(v) {} // base class is initialized first
};
class TypeString : public TypeBase
{
private:
std::string value;
public:
TypeString(std::string v) : TypeBase(String), value(v) {} // base class is initialized first
};
(To make it useful, we probably need some more methods for the TypeXxx class, but I don't feel like typing for another hour... ;) )
And then somewhere, it determines the type, e.g.
Token t = getNextToken();
TypeBase *ty = nullptr; // stays null if the token is none of the kinds below
if (t.type == IntegerToken)
{
ty = new TypeInt(std::stoi(t.str));
}
else if (t.type == FloatToken)
{
ty = new TypeFloat(std::stod(t.str));
}
else if (t.type == StringToken)
{
ty = new TypeString(t.str);
}
Of course, we'd also need to deal with variables and various other scenarios, but the essence of it is that the language can keep track of (and sometimes mutate) the value that is stored.
Most languages in the general category where Ruby, PHP, Python, etc are, will have this sort of mechanism, and all variables are stored in some sort of indirect way. The above is just one possible solution, I can think of at least half a dozen other ways to do this, but they are variations on the theme of "store data together with type information".
(And by the way, boost::any also does something along the lines of the above, more or less....)

In Ruby, the answer is rather simple: that array doesn't contain values of different types, they are all of the same type. They are all objects.
Ruby is dynamically typed; the idea of an array that is statically constrained to only hold elements of the same type doesn't even make sense.
For a statically typed language, the question is, how much do you want it to be like Ruby? Do you want it to be actually dynamically typed? Then you need to implement a dynamic type in your language (if it doesn't already have one, like C♯’s dynamic).
Otherwise, if you want a statically typed heterogeneous list, such a thing is usually called an HList. There's a very nice implementation for Scala in the Shapeless library, for example.
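Since the question asks about C++: the closest standard-library analogue of a statically typed heterogeneous list is probably std::tuple, where each position has its own compile-time type. A minimal sketch (Row and make_row are names invented for this example):

```cpp
#include <string>
#include <tuple>

// Each position has its own static type, fixed at compile time.
using Row = std::tuple<int, std::string, double>;

Row make_row() {
    return Row{1, "hello", 2.5};
}

// Access is by position or by type, and is checked by the compiler, not at runtime.
int first_of(const Row& row) {
    return std::get<0>(row);
}
```

Unlike a Ruby array, the length and element types of such a tuple cannot change at runtime.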

Related

Proper design for C++ class wrapping multiple possible types

I am trying to implement a C++ class which will wrap a value (among other things). This value may be one of a number of types (string, memory buffer, number, vector).
The easy way to implement this would be to do something like this
class A {
Type type;
// Only one of these will be valid data; which one will be indicated by `type` (an enum)
std::wstring wData{};
long dwData{};
MemoryBuffer lpData{};
std::vector<std::wstring> vData{};
};
This feels inelegant and like it wastes memory.
I also tried implementing this as a union, but it came with significant development overhead (defining custom destructors/move constructors/copy constructors), and even with all of those, there were still some errors I encountered.
I've also considered making A a base class and making a derived class for each possible value it can hold. This also feels like it isn't a great way to solve the problem.
My last approach would be to make each member an std::optional, but this still adds some overhead.
Which approach would be the best? Or is there another design that works better than any of these?

Use std::variant. It is typesafe, tested and exactly the right thing for a finite number of possible types.
It also gets rid of the type enum.
class A {
std::variant<std::wstring, long, MemoryBuffer, std::vector<std::wstring>> m_data{}; // default initializes the wstring.
public:
template<class T>
void set_data(T&& data) {
m_data = std::forward<T>(data);
}
std::size_t get_index() const { // returns index of the active type.
return m_data.index();
}
long& get_ldata() {
return std::get<long>(m_data); // throws std::bad_variant_access if long is not the active type
}
// and the others, or
template<class T>
T& get_data() { // by type
return std::get<T>(m_data);
}
template<std::size_t N>
auto& get_data() { // by index
return std::get<N>(m_data);
}
};
// using:
A a;
a.get_index() == 0; // true
a.set_data(42);
a.get_index() == 1; // true
auto l = a.get_data<long>(); // l is now of type long, has value 42
a.get_data<long>() = 1;
l = a.get_data<1>();
PS: This example does not even include the coolest (in my opinion) feature of std::variant: std::visit. I am not sure what you want to do with your class, so I cannot create a meaningful example. If you let me know, I will think about it.
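As a rough idea of what std::visit looks like in practice, here is a sketch. It uses a smaller, hypothetical variant than the one above (MemoryBuffer is left out because its definition is not shown), and describe is a made-up function name.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical alias; MemoryBuffer from the question is omitted.
using Value = std::variant<std::wstring, long, std::vector<std::wstring>>;

// Returns a short description of whichever alternative is currently active.
std::string describe(const Value& v) {
    return std::visit([](const auto& held) -> std::string {
        using T = std::decay_t<decltype(held)>;
        if constexpr (std::is_same_v<T, std::wstring>) {
            return "string of length " + std::to_string(held.size());
        } else if constexpr (std::is_same_v<T, long>) {
            return "number " + std::to_string(held);
        } else {
            return "list of " + std::to_string(held.size()) + " strings";
        }
    }, v);
}
```

The visitor is instantiated once per alternative, so there is no manual switch on index().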

You basically want QVariant without the rest of Qt, then :)?
As others have mentioned, you could use std::variant and put using MyVariant = std::variant<t1, t2, ...> in some common header, and then use it everywhere it's called for. This isn't as inelegant as you may think - the specific types to be passed around are only provided in one place. It is the only way to do it without building a metatype machinery that can encapsulate operations on any type of an object.
That's where the any from Boost.TypeErasure (boost::type_erasure::any, not to be confused with the plain boost::any) comes in: it does precisely that. It wraps concepts, and thus supports any object that implements these concepts. Which concepts are required is up to you, but in general you'd want to choose enough of them to make the type usable and useful, yet not so many as to exclude some types prematurely. It's probably the way to go; you'd have: using MyVariant = any<construct, _a>; (where construct is a contract list, an example of which is given in the documentation, and _a is a type placeholder from boost::type_erasure).
The fundamental difference between std::variant and this any is that variant is parametrized on concrete types, whereas any is parametrized on contracts that the types are bound to. Then, any will happily store an arbitrary type that fulfills all of those contracts. The "central location" where you define an alias for the variant type will constantly grow with variant, as you need to encapsulate more types. With any, the central location will be mostly static, and would change rarely, since changing the contract requirements is likely to require fixes/adaptations to the carried types as well as points of use.

How to save a type of a pointer c++

Function1 can be called with any type T, which will be converted to (void*) so that it can be added to the list, but with this I lose the original pointer type (I need to store them in one linked list because I cannot create one for every possible type). So somehow I need to save the type of the pointer as well. I know that it can't be done using C++. Can anyone suggest an alternative solution?
class MyClass
{
template<class T>
void function1(T* arg1)
{
myList.add((void*)arg1);
}
void function2()
{
for(int i = 0; i < myList.size(); i++)
{
myList.get(i);
//restore the original pointer type
}
}
STLinkedlist<void*> myList;
};

The usual way to handle these kinds of problems is by using a public interface, in C++ this is done through inheritance. This can be a drag, especially in constrained situations, where a full class/interface hierarchy would provide too much code/runtime overhead.
In comes Boost.Variant, which allows you to use the same object to store different types. If you need even more freedom, use Boost.Any. For a comparison, see e.g. here.
At the end of the day (or better: rather sooner than later), I'd try to do things differently so you don't have this problem. It may well be worth it in the end.

If you lost the type info by going void* it is just gone. You can not just restore it.
So you either must store extra information along with the pointer, then use branching code to cast it back, or rather drive design to avoid the loss.
Your case looks suspiciously like you do not know what you really want.
A more usual case is that you want a polymorphic collection. That doesn't store any kind of pointers but those belonging to the same hierarchy. Collection has Base* pointers, and you can use the objects through Base's interface, all calling the proper virtual function without programmer's interaction. And if you really need to cast to the original type you can do it via dynamic_cast safely. Or add some type info support in the base interface.

Function1 can be called with any type T, which will be converted to (void*) so that it can be added to the list, but with this I lose the original pointer type (I need to store them in one linked list because I cannot create one for every possible type).
You're having the XY problem. The solution is not to decay your pointers to void* and store type information.
You simply can create a type for every possible type - you create a template type. You need to define an abstract interface for your "type for every object", then define a template class implementing this interface, that is particularized by type. Finally, you create your custom-type instance on your type of pointer received and store them by base class pointer (where the base class is your interface definition).
All that said, you (normally) shouldn't need to implement this at all, because the functionality is already implemented in boost::any or boost::variant (you will have to choose one of them).
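The "abstract interface plus template class" approach described above might look roughly like this. It is a sketch, not a drop-in replacement: HolderBase, Holder and get are hypothetical names, and std::vector stands in for the STLinkedlist from the question, whose interface is not shown.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Abstract interface shared by every stored element.
struct HolderBase {
    virtual ~HolderBase() = default;
};

// One Holder<T> is instantiated for each pointer type that gets stored.
template <class T>
struct Holder : HolderBase {
    explicit Holder(T* p) : ptr(p) {}
    T* ptr;
};

class MyClass {
public:
    template <class T>
    void function1(T* arg1) {
        myList.push_back(std::make_unique<Holder<T>>(arg1));
    }

    // Returns the i-th pointer if it was stored as T*, otherwise nullptr.
    template <class T>
    T* get(std::size_t i) const {
        auto* holder = dynamic_cast<Holder<T>*>(myList.at(i).get());
        return holder ? holder->ptr : nullptr;
    }

private:
    std::vector<std::unique_ptr<HolderBase>> myList;
};
```

The pointer type is never lost here: it is carried by the dynamic type of the Holder<T> object, and dynamic_cast checks it on the way out.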

General
Keep in mind that if you want to store different objects inside a std::vector<void *>, most likely your application has a bad design. In that case, I'd first ask whether it is really necessary (or whether it can be done another way), rather than searching for a way to do it.
However, there are no fully evil things in C++ (nor in any other language), so if you are absolutely certain, that this is the only solution, here are three possible ways to solve your problem.
Option 1
If you store only pointers to simple types, store the original type along with the pointer by an enum value or simply a string.
enum DataType
{
intType,
floatType,
doubleType
};
std::vector<std::pair<void *, DataType>> myData;
Option 2
If you store mixed data (classes and simple types), wrap your data in some kind of class.
class BaseData
{
public:
virtual ~BaseData() { }
};
class IntData : public BaseData
{
public:
int myData;
};
std::vector<BaseData *> myData;
Later, you'll be able to check the type of your data using dynamic_cast.
Option 3
If you store only classes, store them simply as a pointer to their base class and dynamic_cast your way out.
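For Option 1, reading the values back means branching on the stored enum and casting accordingly. A small sketch, where to_text is a hypothetical helper and the DataType enum is repeated from above:

```cpp
#include <string>
#include <utility>
#include <vector>

enum DataType { intType, floatType, doubleType };

// Casts the pointer back according to the stored tag and formats the value.
std::string to_text(const std::pair<void*, DataType>& item) {
    switch (item.second) {
        case intType:    return std::to_string(*static_cast<int*>(item.first));
        case floatType:  return std::to_string(*static_cast<float*>(item.first));
        case doubleType: return std::to_string(*static_cast<double*>(item.first));
    }
    return "";  // unreachable as long as every enumerator is handled above
}
```

Nothing stops a caller from storing the wrong tag next to a pointer, which is why the wrapper-class options are usually safer.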

You could use boost::any to store any type in your list instead of using void*. It's not exactly what you want, but I don't think you can restore the type at run time (as Kerrek said, it's not Java).
class MyClass
{
template<class T>
void function1(T arg1)
{
myList.add(arg1);
}
template<class T>
T get(int i)
{
return boost::any_cast<T>(myList.get(i));
}
STLinkedlist<boost::any> myList;
};

container of unrelated T in c++

If I have the following hypothetical class:
namespace System
{
template <class T>
class Container
{
public:
Container() { }
~Container() { }
};
}
If I instantiate two Containers with different T's, say:
Container<int> a;
Container<string> b;
I would like to create vector with pointers to a and b. Since a and b are different types, normally this wouldn't be possible. However, if I did something like:
std::stack<void*> _collection;
void *p = reinterpret_cast<void*>(&a);
void *q = reinterpret_cast<void*>(&b);
_collection.push(p);
_collection.push(q);
Then later on, I can get a and b back from _collection like so:
Container<string> b = *reinterpret_cast<Container<string>*>(_collection.pop());
Container<int> a = *reinterpret_cast<Container<int>*>(_collection.pop());
My question is, is this the best way for storing a collection of unrelated types? Also would this be the preferred way of storing and retrieving the pointers from the vector (the reinterpret cast)? I've looked around and seen that boost has a nicer way of solving this, Boost::Any, but since this is a learning project I am on I would like to do it myself (Also I have been curious to find a good reason to use a reinterpret_cast correctly).

Consider boost::any or boost::variant if you want to store objects of heterogeneous types.
And before deciding which one to use, have a look at the comparison:
Boost.Variant vs. Boost.Any
Hopefully, it will help you make the correct decision. Choose one, and use any of the containers from the standard library to store the objects: std::stack<boost::any>, std::stack<boost::variant>, or any other. Don't write your own container.
I repeat: don't write your own container. Use containers from the standard library. They're well-tested.

While it is possible to cast to void * and back, the problem is knowing which type you're popping. After all, you give the example:
Container<string> b = *reinterpret_cast<Container<string>*>(_collection.pop());
Container<int> a = *reinterpret_cast<Container<int>*>(_collection.pop());
However, if you were to accidentally do:
Container<int> a = *reinterpret_cast<Container<int>*>(_collection.pop());
Container<string> b = *reinterpret_cast<Container<string>*>(_collection.pop());
Now you've got pointers to the wrong type, and will likely see crashes - or worse.
If you want to do something like this, at least use dynamic_cast to check that you have the right types. With dynamic_cast, you can have C++ check, at runtime (using RTTI), that your cast is safe, as long as the types being cast (both before and after) have a common base type with at least one virtual method.
So, first create a common base type with a virtual destructor:
class ContainerBase {
public:
virtual ~ContainerBase() { }
};
Make your containers derive from it:
template <typename T>
class Container : public ContainerBase {
// ...
};
Now use a std::stack<ContainerBase *>. When you retrieve items from the stack, use dynamic_cast<Container<int>*>(stack.top()) or dynamic_cast<Container<string>*>(stack.top()) (std::stack::pop() itself returns nothing, so read top() before popping); if you have the types wrong, these will check, and will return NULL.
That said, heterogeneous containers are almost always the wrong thing to be using; at some level you need to know what's in the container so you can actually use it. What are you actually trying to accomplish by creating a container like this?
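Put together, the dynamic_cast approach could look like the sketch below. The value member and the pop_as helper are additions made up for this example; they are not part of the original classes.

```cpp
#include <stack>
#include <string>

class ContainerBase {
public:
    virtual ~ContainerBase() {}
};

template <typename T>
class Container : public ContainerBase {
public:
    T value{};  // hypothetical payload, only here so there is something to read back
};

// Pops the top element and returns it as Container<T>*, or nullptr on a type mismatch.
// Note that the element is removed from the stack in either case.
template <typename T>
Container<T>* pop_as(std::stack<ContainerBase*>& s) {
    ContainerBase* top = s.top();  // std::stack::pop() returns void, so read top() first
    s.pop();
    return dynamic_cast<Container<T>*>(top);
}
```

A wrong guess now yields a null pointer that can be checked, instead of a silently misinterpreted object.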

How to create a correct hierarchy of objects in C++

I'm building a hierarchy of objects that wrap primitive types, e.g. integers, booleans, floats, etc., as well as container types like vectors, maps and sets. I'm trying to (be able to) build an arbitrary hierarchy of objects, and be able to set/get their values with ease. This hierarchy will be passed to another class (not mentioned here) and an interface will be created from this representation. This is the purpose of this hierarchy: to be able to create a GUI representation from these objects. To be more precise, I have something like this:
class ValObject
{
public:
virtual ~ValObject() {}
};
class Int : public ValObject
{
public:
Int(int v) : val(v) {}
void set_int(int v) { val = v; }
int get_int() const { return val; }
private:
int val;
};
// other classes for floats, booleans, strings, etc
// ...
class Map : public ValObject
{
public:
void set_val_for_key(const string& key, ValObject* val);
ValObject* val_for_key(const string& key);
private:
map<string, ValObject*> keyvals;
};
// classes for other containers (vector and set) ...
The client should be able to create an arbitrary hierarchy of objects and set and get their values with ease, and I, as a junior programmer, should learn how to correctly create the classes for something like this.
The main problem I'm facing is how to set/get the values through a pointer to the base class ValObject. At first, I thought I could just create lots of functions in the base class, like set_int, get_int, set_string, get_string, set_value_for_key, get_value_for_key, etc., and make them work only for the correct types. But then I would have lots of cases where functions do nothing and just pollute my interface. My second thought was to create various proxy objects for setting and getting the various values, e.g.
class ValObject
{
public:
virtual ~ValObject() {}
virtual IntProxy* create_int_proxy(); // <-- my proxy
};
class Int : public ValObject
{
public:
Int (int v) : val(v) {}
IntProxy* create_int_proxy() { return new IntProxy(&val); }
private:
int val;
};
class String : public ValObject
{
public:
String(const string& s) : val(s) {}
IntProxy* create_int_proxy() { return 0; }
private:
string val;
};
The client could then use this proxy to set and get the values of an Int through an ValObject:
ValObject *val = ... // some object
IntProxy *ipr = val->create_int_proxy();
assert(ipr); // we know that val is an Int (somehow)
ipr->set_val(17);
But with this design, I still have too many classes to declare and implement in the various subclasses. Is this the correct way to go? Are there any alternatives?
Thank you.

Take a look at boost::any and boost::variant for existing solutions. The closest to what you propose is boost::any, and the code is simple enough to read and understand even if you want to build your own solution for learning purposes --if you need the code, don't reinvent the wheel, use boost::any.

One of the beauties of C++ is that these kinds of intrusive solutions often aren't necessary, yet unfortunately we still see similar ones being implemented today. This is probably due to the prevalence of Java, .NET, and Qt, which follow these kinds of models where we have a general object base class which is inherited by almost everything.
By intrusive, what's meant is that the types being used have to be modified to work with the aggregate system (inheriting from a base object in this case). One of the problems with intrusive solutions (though sometimes appropriate) is that they require coupling these types with the system used to aggregate them: the types become dependent on the system. For PODs it is impossible to use intrusive solutions directly as we cannot change the interface of an int, e.g.: a wrapper becomes necessary. This is also true of types outside your control like the standard C++ library or boost. The result is that you end up spending a lot of time and effort manually creating wrappers to all kinds of things when such wrappers could have been easily generated in C++. It can also be very pessimistic on your code if the intrusive solution is uniformly applied even in cases where unnecessary and incurs a runtime/memory overhead.
With C++, a plethora of non-intrusive solutions are available at your fingertips, but this is especially true when we know that we can combine static polymorphism using templates with dynamic polymorphism using virtual functions. Basically we can generate these base object-derived wrappers with virtual functions on the fly only for the cases in which this solution is needed without pessimizing the cases where this isn't necessary.
As already suggested, boost::any is a great model for what you want to achieve. If you can use it directly, you should use it. If you can't (ex: if you are providing an SDK and cannot depend on third parties to have matching versions of boost), then look at the solution as a working example.
The basic idea of boost::any is to do something similar to what you are doing, only these wrappers are generated at compile-time. If you want to store an int in boost::any, the class will generate an int wrapper class which inherits from a base object that provides the virtual interface required to make any work at runtime.
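A stripped-down version of that idea is sketched below. This is not Boost's actual code; SimpleAny, Placeholder, Holder and get_if are hypothetical names chosen for the illustration. The point is that Holder<T> is generated by the compiler for each stored type, so an int needs no hand-written wrapper class.

```cpp
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>

// Hypothetical, minimal "any"; not Boost's implementation.
class SimpleAny {
    // Base object providing the virtual interface used at runtime.
    struct Placeholder {
        virtual ~Placeholder() = default;
        virtual const std::type_info& type() const = 0;
    };
    // Wrapper generated at compile time, once per stored type T.
    template <typename T>
    struct Holder : Placeholder {
        explicit Holder(T v) : held(std::move(v)) {}
        const std::type_info& type() const override { return typeid(T); }
        T held;
    };
    std::unique_ptr<Placeholder> content;

public:
    template <typename T>
    SimpleAny(T value) : content(std::make_unique<Holder<T>>(std::move(value))) {}

    // Returns a pointer to the stored value if it is a T, otherwise nullptr.
    template <typename T>
    T* get_if() {
        if (content->type() != typeid(T)) return nullptr;
        return &static_cast<Holder<T>*>(content.get())->held;
    }
};
```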
The main problem I'm facing is how to set/get the values through a pointer to the base class ValObject. At first, i thought i could just create lots of functions in the base class, like set_int, get_int, set_string, get_string, set_value_for_key, get_value_for_key, etc, and make them work only for the correct types. But then, i would have lots of cases where functions do nothing and just pollute my interface.
As you already correctly deduced, this would generally be an inferior design. One tell-tale sign of inheritance being used improperly is when you have a lot of base functions which are not applicable to your subclasses.
Consider the design of I/O streams. We don't have ostreams with functions like output_int, output_float, output_foo, etc. as being directly methods in ostream. Instead, we can overload operator<< to output any data type we want in a non-intrusive fashion. A similar solution can be achieved for your base type. Do you want to associate widgets with custom types (ex: custom property editor)? We can allow that:
shared_ptr<Widget> create_widget(const shared_ptr<int>& val);
shared_ptr<Widget> create_widget(const shared_ptr<float>& val);
shared_ptr<Widget> create_widget(const shared_ptr<Foo>& val);
// etc.
Do you want to serialize these objects? We can use a solution like I/O streams. If you are adapting your own solution like boost::any, it can expect such auxiliary functions to already be there for the type being stored (the virtual functions in the generated wrapper class can call create_widget(T), e.g.).
If you cannot be this general, then provide some means of identifying the types being stored (a type ID, e.g.) and handle the getting/setting of various types appropriately in the client code based on this type ID. This way the client can see what's being stored and set/get values on it accordingly.
Anyway, it's up to you, but do consider a non-intrusive approach to this as it will generally be less problematic and a whole lot more flexible.

Use dynamic_cast to cast down the hierarchy (from ValObject* to the derived type). You don't need to provide an explicit interface for this - any reasonable C++ programmer can do that. If they can't do that, you could try enumerating the different types and creating an integral constant for each, which you can then provide a virtual function to return, and you can then static_cast down.
Finally, you could consider passing a function object, in double-dispatch style. This has a definite encapsulation advantage.
struct functor {
void operator()(Int& integral) {
...
}
void operator()(Bool& boo) {
...
}
};
template<typename Functor> void PerformOperationByFunctor(Functor func) {
if (Int* ptr = dynamic_cast<Int*>(this)) {
func(*ptr);
}
// Repeat
}
More finally, you should avoid creating types where they've basically been already covered. For example, there's little point providing a 64bit integral type and a 32bit integral type and ... it's just not worth the hassle. Same with double and float.
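A fuller version of the functor idea from this answer might look as follows. It is a sketch under assumptions: perform is a hypothetical member name, the val members are made public for brevity, and only Int and Bool are handled.

```cpp
#include <string>

class ValObject {
public:
    virtual ~ValObject() {}

    // Calls func with the object's concrete type, found via dynamic_cast.
    template <typename Functor>
    void perform(Functor& func);
};

class Int : public ValObject {
public:
    explicit Int(int v) : val(v) {}
    int val;
};

class Bool : public ValObject {
public:
    explicit Bool(bool v) : val(v) {}
    bool val;
};

// Defined after the derived classes so that they are complete types here.
template <typename Functor>
void ValObject::perform(Functor& func) {
    if (Int* i = dynamic_cast<Int*>(this)) {
        func(*i);
    } else if (Bool* b = dynamic_cast<Bool*>(this)) {
        func(*b);
    }
}

// Example functor: records a textual form of whatever it visits.
struct ToText {
    std::string text;
    void operator()(Int& i) { text = std::to_string(i.val); }
    void operator()(Bool& b) { text = b.val ? "true" : "false"; }
};
```

The client only writes one overload per type it cares about; the casting stays in one place.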

Maintain reference to any object type in C++?

I'm trying to teach myself C++, and one of the traditional "new language" exercises I've always used is to implement some data structure, like a binary tree or a linked list. In Java, this was relatively simple: I could define some class Node that maintained an instance variable Object data, so that someone could store any kind of object in every node of the list or tree. (Later I worked on modifying this using generics; that's not what this question is about.)
I can't find a similar, idiomatic C++ way of storing "any type of object." In C I'd use a void pointer; the same thing works for C++, obviously, but then I run into problems when I construct an instance of std::string and try to store it into the list/tree (something about an invalid cast from std::string& to void*). Is there such a way? Does C++ have an equivalent to Java's Object (or Objective-C's NSObject)?
Bonus question: If it doesn't, and I need to keep using void pointers, what's the "right" way to store a std::string into a void*? I stumbled upon static_cast<char*>(str.c_str()), but that seems kind of verbose for what I'm trying to do. Is there a better way?

C++ does not have a base object that all objects inherit from, unlike Java. The usual approach for what you want to do would be to use templates. All the containers in the standard C++ library use this approach.
Unlike Java, C++ does not rely on polymorphism/inheritance to implement generic containers. In Java, all objects inherit from Object, and so any class can be inserted into a container that takes an Object. C++ templates, however, are compile time constructs that instruct the compiler to actually generate a different class for each type you use. So, for example, if you have:
template <typename T>
class MyContainer { ... };
You can then create a MyContainer that takes std::string objects, and another MyContainer that takes ints.
MyContainer<std::string> stringContainer;
stringContainer.insert("Blah");
MyContainer<int> intContainer;
intContainer.insert(3342);

You can take a look at boost::any class. It is type safe, you can put it into standard collections and you don't need to link with any library, the class is implemented in header file.
It allows you to write code like this:
#include <list>
#include <boost/any.hpp>
typedef std::list<boost::any> collection_type;
void foo()
{
collection_type coll;
coll.push_back(boost::any(10));
coll.push_back(boost::any("test"));
coll.push_back(boost::any(1.1));
}
Full documentation is here: http://www.boost.org/doc/libs/1_40_0/doc/html/any.html

What you are looking for are templates. They allow you to make classes and functions that can take any data type whatsoever.

Templates are the static way to do this. They behave like Java and C# generics but are 100% static (compile time). If you don't need to store different types of objects in the same container, use this (other answers describe this very well).
However, if you need to store different types of objects in the same container, you can do it the dynamic way, by storing pointers on a base class. Of course, you have to define your own objects hierarchy, since there is no such "Object" class in C++ :
#include <list>
class Animal {
public:
virtual ~Animal() {}
};
class Dog : public Animal {
public:
virtual ~Dog() {}
};
class Cat : public Animal {
public:
virtual ~Cat() {}
};
int main() {
std::list<Animal*> l;
l.push_back(new Dog);
l.push_back(new Cat);
for (std::list<Animal*>::iterator i = l.begin(); i!= l.end(); ++i)
delete *i;
l.clear();
return 0;
}
A smart pointer is easier to use. Example with boost::shared_ptr:
std::list< boost::shared_ptr<Animal> > List;
List.push_back(boost::shared_ptr<Animal>(new Dog));
List.push_back(boost::shared_ptr<Animal>(new Cat));
List.clear(); // automatically calls delete on each stored pointer

You should be able to cast a void* into a string* using standard C-style casts. Remember that a reference is not treated like a pointer when used; it's treated like a normal object. So if you're passing a value by reference to a function, you still have to apply & to it to get its address.
However, as others have said, a better way to do this is with templates.

static_cast<char*>(str.c_str())
looks odd to me. str.c_str() retrieves the C-like string, but with type const char *, and to convert to char * you'd normally use const_cast<char *>(str.c_str()). Except that that's not good to do, since you'd be meddling with the internals of a string. Are you sure you didn't get a warning on that?
You should be able to use static_cast<void *>(&str). The error message you got suggests to me that you got something else wrong, so if you could post the code we could look at it. (The data type std::string& is a reference to a string, not a pointer to one, so the error message is correct. What I don't know is how you got a reference instead of a pointer.)
And, yes, this is verbose. It's intended to be. Casting is usually considered a bad smell in a C++ program, and Stroustrup wanted casts to be easy to find. As has been discussed in other answers, the right way to build a data structure of arbitrary base type is by using templates, not casts and pointers.
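For completeness, the void* round trip with static_cast can be sketched like this. Node, make_node and as_string are hypothetical names modelled on the design in the question; the node does not own the string, so the string must outlive it.

```cpp
#include <string>

// Hypothetical node type, mirroring the void*-based design from the question.
struct Node {
    void* data;
    Node* next;
};

// Stores the address of the string, not the string itself.
Node make_node(std::string& str) {
    return Node{static_cast<void*>(&str), nullptr};
}

// Reads the payload back; the caller must know it really is a std::string.
std::string& as_string(Node& node) {
    return *static_cast<std::string*>(node.data);
}
```

Nothing checks that the cast back is to the right type, which is the main argument for templates here.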
