How to deal with a generic object in C++ - c++

What is the proper way to deal with a generic value in C++11 or is it OK to use (void *)?
Basically, I am parsing json, and the node value can either be String, Integer, Double, Date, etc.
In C, just using void * is OK (not safe, but ok), and in C# we use Object. But what is the proper way in C++11 to do this? Do I have to build a wrapper class, or is there an easier way?

You can make a base class for the various types, or use a "discriminated union" class such as Boost.Variant which holds a known set of types and remembers which one it is holding.
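As a rough illustration of the base-class option, here is a minimal sketch that stays within C++11. The names JsonValue, JsonString, JsonInt and describe_all are hypothetical and not taken from any JSON library; a real parser would need more node kinds (double, date, arrays, and so on).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: every kind of JSON node value derives from it.
struct JsonValue {
    virtual ~JsonValue() = default;
    virtual std::string describe() const = 0;
};

struct JsonString : JsonValue {
    explicit JsonString(std::string v) : value(std::move(v)) {}
    std::string describe() const override { return "string:" + value; }
    std::string value;
};

struct JsonInt : JsonValue {
    explicit JsonInt(long v) : value(v) {}
    std::string describe() const override { return "int:" + std::to_string(value); }
    long value;
};

// Stores differently typed values behind one pointer type and describes each.
std::vector<std::string> describe_all() {
    std::vector<std::unique_ptr<JsonValue>> nodes;
    nodes.emplace_back(new JsonString("hello"));
    nodes.emplace_back(new JsonInt(42));

    std::vector<std::string> out;
    for (const auto& node : nodes) out.push_back(node->describe());
    return out;
}
```

The container only ever sees JsonValue pointers; the virtual call picks the right behaviour, so no void* and no cast is needed.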

Related

how to declare variable using typeinfo.name C++

I love coding, and generally do so in Python due to its simplicity and power.
However, for some time critical programs/tasks, I use C++.
Therefore, to get best of both worlds, I am making a Pythonesque list in C++.
AIM: I would like to be able to add any variable or value of any data type, including classes user has defined.
To do this, I have a structure item with a char * value, a char * type and an int size.
My List has an array of these item * s.
Now, I have taken the variable in a template function:
template<class T> item * encode(const T& var);
and declared a pointer to item item * i = new item;
And, I have stored the values of these variables as c style strings.
For example, 14675 in binary is 0000 0000 0000 0000 0011 1001 0101 0011
Therefore, I have dynamically created space, like so:
i->size = sizeof(var);
i->value = new char[i->size]; //4 in this case
and set each bit in value with respective bits in var.
I have also stored their types as
i->type = typeid(var).name();
So far so good!
Now, I am stuck with auto decode(item * i) -> decltype(/*What goes here???*/)
How do I specify the return type of the function?
Is there any possible way?
Preferably using the i->type?
Or, should I make changes in the basic design of this process?
Thanks in advance!
Answering your question
I would like to be able to add any variable or value of any data type, including classes user has defined.
Without cooperation from the user that’s impossible in C++.
Remember that C++ types are a compile-time concept only. They do not exist at runtime. The only type information available at runtime is the thin layer of RTTI provided by typeid(). Runtime duck-typing like in Python is not possible.
You can create a container of arbitrary objects quite easily.
std::vector<std::any> v; // requires C++17
However the user of that container has to know what index contains what type:
if (v[0].type() == typeid(ArbitraryUserType)) {
const auto& item = std::any_cast<ArbitraryUserType>(v[0]);
// work on item ...
}
Because of the compile-time nature of types you as the library writer cannot perform that any_cast. It has to be spelled out in the user’s source code.
In general, don’t try to shoehorn a pythonic mindset into C++. It never ends well, especially when you try to circumvent one of the most basic foundations of C++: its powerful static type system.
Notes:
Without C++17 you could use boost::any.
If you know the list of possible types at compile-time std::vector<std::variant<Type1, Type2, etc>> is a good alternative. With any the user is fully responsible to keep track of their types. Because all type checks happen at runtime the compiler cannot help. Variant on the other hand brings back a large chunk of the compile-time safety. And again there’s boost::variant as a non-C++17 alternative.
Notes on your encoding approach
Basically you’re trying to serialize (encode) and deserialize (decode) arbitrary types. Without cooperation from those types, that’s not possible.
Your approach only works for trivial types that can be copied bit by bit. C++ even has a type trait for that: std::is_trivially_copyable. In the end you support fundamental types and C-style structs of those, but nothing else.
Imagine the T for your encode() function was std::string. Simply put a std::string contains a pointer to a separately allocated piece of memory where the actual string data is stored. The string object itself is just a managing wrapper for that pointer. encode() only serializes the wrapper object, but not the pointed-to memory block with the actual data.
Even if during deserialization you could instantiate arbitrary types from a stream of bits, the stream is not complete. What you'd have to implement is a C++ version of Python's copy.deepcopy, which is impossible without cooperation from each type. Have a look at a C++ serialization library (take Cereal as a straightforward example) to see how that cooperation can look in practice.
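To make the limitation concrete, here is a hedged sketch of what the bit-copy approach can safely cover. The Item struct and the exact signatures of encode and decode are assumptions made for illustration; note that the caller of decode has to name T explicitly, because it cannot be recovered from the stored string.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <typeinfo>
#include <vector>

// Hypothetical item: raw bytes plus the implementation-defined type name.
struct Item {
    std::vector<char> bytes;
    std::string type_name;
};

template <class T>
Item encode(const T& var) {
    // Rejects std::string and other non-trivial types at compile time.
    static_assert(std::is_trivially_copyable<T>::value,
                  "encode() only supports trivially copyable types");
    Item it;
    it.bytes.resize(sizeof(T));
    std::memcpy(it.bytes.data(), &var, sizeof(T));
    it.type_name = typeid(T).name();
    return it;
}

// The caller must spell out T. Returns false when the stored type differs.
template <class T>
bool decode(const Item& it, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "decode() only supports trivially copyable types");
    if (it.type_name != typeid(T).name() || it.bytes.size() != sizeof(T)) {
        return false;
    }
    std::memcpy(&out, it.bytes.data(), sizeof(T));
    return true;
}
```

With this sketch, encode(std::string("x")) fails to compile, which is the point made above: such types need their own serialization logic.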

Is there a legitimate use for void*?

Is there a legitimate use of void* in C++? Or was this introduced because C had it?
Just to recap my thoughts:
Input: If we want to allow multiple input types we can overload functions and methods, alternatively we can define a common base class, or template (thanks for mentioning this in the answers). In both cases the code gets more descriptive and less error-prone (provided the base class is implemented in a sane way).
Output: I can't think of any situation where I would prefer to receive void* as opposed to something derived from a known base class.
Just to make it clear what I mean: I'm not specifically asking if there is a use-case for void*, but if there is a case where void* is the best or only available choice. Which has been perfectly answered by several people below.
void* is at least necessary as the result of ::operator new (also every operator new...) and of malloc and as the argument of the placement new operator.
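A small sketch of that first point, assuming a hypothetical Named struct: ::operator new hands back untyped memory as void*, and placement new accepts a void* to construct an object in it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical type with a non-trivial member, so the destructor call matters.
struct Named {
    std::string name;
};

// Allocates raw storage, constructs a Named in it, then tears both down.
std::size_t placement_demo() {
    void* raw = ::operator new(sizeof(Named));  // untyped memory
    Named* obj = new (raw) Named{"hello"};      // placement new takes a void*
    std::size_t length = obj->name.size();
    obj->~Named();                              // explicit destructor call
    ::operator delete(raw);                     // release the raw storage
    return length;
}
```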
void* can be thought of as the common supertype of every pointer type. So it does not exactly mean pointer to void, but pointer to anything.
BTW, if you wanted to keep some data for several unrelated global variables, you might use some std::map<void*,int> score; then, after having declared global int x; and double y; and std::string s; do score[&x]=1; and score[&y]=2; and score[&s]=3;
memset wants a void* address (the most generic ones)
Also, POSIX systems have dlsym and its return type evidently should be void*
There are multiple reasons to use void*, the 3 most common being:
interacting with a C library using void* in its interface
type-erasure
denoting un-typed memory
In reverse order, denoting un-typed memory with void* (3) instead of char* (or variants) helps prevent accidental pointer arithmetic; there are very few operations available on void*, so it usually requires casting before being useful. And of course, much like with char*, there is no issue with aliasing.
Type-erasure (2) is still used in C++, in conjunction with templates or not:
non-generic code helps reduce binary bloat, it's useful in cold paths even in generic code
non-generic code is necessary for storage sometimes, even in generic container such as std::function
And obviously, when the interface you deal with uses void* (1), you have little choice.
Oh yes. Even in C++ sometimes we go with void * rather than template<class T> because sometimes the extra code from the template expansion weighs too much.
Commonly I would use it as the actual implementation of the type, and the template type would inherit from it and wrap the casts.
Also, custom slab allocators (operator new implementations) must use void *. This is one of the reasons why g++ added an extension permitting pointer arithmetic on void * as though it were of size 1.
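The "void * implementation plus thin template wrapper" idea could look roughly like this. PtrStackImpl and PtrStack are hypothetical names; the sketch only stores non-owning pointers and assumes pop() is never called on an empty stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template implementation: compiled once, stores untyped pointers.
class PtrStackImpl {
public:
    std::size_t size() const { return items_.size(); }

protected:
    void push_raw(void* p) { items_.push_back(p); }
    void* pop_raw() {  // precondition: the stack is not empty
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

private:
    std::vector<void*> items_;
};

// Thin typed wrapper: only these small casting functions are instantiated per T.
template <class T>
class PtrStack : private PtrStackImpl {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using PtrStackImpl::size;
};
```

Every PtrStack<T> shares the same compiled PtrStackImpl code, which is where the binary-size saving would come from.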
Input: If we want to allow multiple input types we can overload
functions and methods
True.
alternatively we can define a common base
class.
This is partially true: what if you can't define a common base class, an interface or similar? To define those you need to have access to the source code, which is often not possible.
You didn't mention templates. However, templates cannot help you with polymorphism: they work with static types i.e. known at compile time.
void* may be considered the lowest common denominator. In C++, you typically don't need it because (i) you can't inherently do much with it and (ii) there are almost always better solutions.
Furthermore, you will typically end up converting it to other concrete types. That's why char * is usually better, although it may suggest that you're expecting a C-style string rather than a pure block of data. For a pure block of data, void* is better than char*, because other pointer types convert to it implicitly.
You're supposed to receive some data, work with it and produce an output; to achieve that, you need to know the data you're working with, otherwise you have a different problem which is not the one you were originally solving. Many languages don't have void* and have no problem with that, for instance.
Another legitimate use
When printing pointer addresses with functions like printf the pointer shall have void* type and, therefore, you may need a cast to void*
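A minimal sketch of that cast, using snprintf so the result can be inspected; format_address is a hypothetical helper name, and the exact text produced by %p is implementation-defined.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a pointer value; the %p conversion expects a void* argument.
std::string format_address(int* p) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%p", static_cast<void*>(p));
    return buf;
}
```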
Yes, it is as useful as any other thing in the language.
As an example, you can use it to erase the type of a class that you are able to statically cast to the right type when needed, in order to have a minimal and flexible interface.
An example of use from another response should give you an idea.
I copy and paste it below for the sake of clarity:
class Dispatcher {
Dispatcher() { }
template<class C, void(C::*M)() = &C::receive>
static void invoke(void *instance) {
(static_cast<C*>(instance)->*M)();
}
public:
template<class C, void(C::*M)() = &C::receive>
static Dispatcher create(C *instance) {
Dispatcher d;
d.fn = &invoke<C, M>;
d.instance = instance;
return d;
}
void operator()() {
(fn)(instance);
}
private:
using Fn = void(*)(void *);
Fn fn;
void *instance;
};
Obviously, this is only one of the bunch of uses of void*.
Interfacing with an external library function which returns a pointer. Here is one for an Ada application.
extern "C" { void* ada_function();}
void* m_status_ptr = ada_function();
This returns a pointer to whatever it was Ada wanted to tell you about. You don't have to do anything fancy with it, you can give it back to Ada to do the next thing.
In fact disentangling an Ada pointer in C++ is non-trivial.
In short, C++ as a strict language (not taking into account C relics like malloc()) requires void* since it has no common parent of all possible types. Unlike ObjC, for example, which has object.
The first thing that occurs to my mind (which I suspect is a concrete case of a couple of the answers above) is the capability to pass an object instance to a threadproc in Windows.
I've got a couple of C++ classes which need to do this. They have worker thread implementations: the CreateThread() API is given the address of a static method of the class as the thread procedure, and its LPVOID parameter carries the address of the instance, so the worker thread can do the work with a specific instance of the class. A simple static_cast back in the threadproc yields the instance to work with, allowing each instantiated object to have a worker thread from a single static method implementation.
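The pattern can be sketched portably without the Windows API. Here run_callback is a hypothetical stand-in for a C-style interface such as CreateThread; it simply calls the callback synchronously, which is an assumption made to keep the example self-contained.

```cpp
#include <cassert>

// Stand-in for a C-style API: it only knows a function pointer and an
// opaque void* argument. (Hypothetical; a real API would start a thread.)
void run_callback(void (*fn)(void*), void* arg) { fn(arg); }

class Worker {
public:
    explicit Worker(int id) : id_(id), result_(0) {}

    // Passes the shared static entry point plus this instance as void*.
    void start() { run_callback(&Worker::thread_proc, this); }
    int result() const { return result_; }

private:
    // Single static entry point used by every instance.
    static void thread_proc(void* arg) {
        Worker* self = static_cast<Worker*>(arg);  // recover the instance
        self->work();
    }
    void work() { result_ = id_ * 10; }

    int id_;
    int result_;
};
```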
In case of multiple inheritance, if you need to get a pointer to the first byte of a memory chunk occupied by an object, you may dynamic_cast to void*.
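A short sketch of that cast, with hypothetical types A, B and C. It only works for polymorphic types (ones with at least one virtual function), since dynamic_cast needs the runtime type information.

```cpp
#include <cassert>

struct A { virtual ~A() = default; int a = 0; };
struct B { virtual ~B() = default; int b = 0; };
struct C : A, B { int c = 0; };

// Returns true when dynamic_cast<void*> on a B* recovers the address of
// the complete C object rather than the address of the B subobject.
bool most_derived_demo() {
    C obj;
    B* as_b = &obj;  // usually points into the middle of obj
    void* start = dynamic_cast<void*>(as_b);
    return start == static_cast<void*>(&obj);
}
```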

Is there a way to have a function return a type?

I have a data class (a struct, actually) with two variables: a void pointer and a string containing the type of the object being pointed to.
struct data{
void* index;
std::string type;
data(): index(0), type("null"){}
data(void* index, std::string type): index(index), type(type){}};
Now I need to use the object being pointed to, by casting the void pointer to a type that is specified by the string, so I thought of using an std::map with strings and functions.
std::unordered_map<std::string, function> cast;
The problem is that the functions must always have the exact same return-type and can't return a type itself.
Edit:
Because I use the data class as a return-type and as arguments, templates won't suffice.
(also added some code to show what I mean)
data somefunction(data a){
//do stuff
return data();}
Currently, I use functions like this to do the trick, but I thought it could be done more easily:
void functionforstring(data a){
static_cast<std::string*>(a.index)->function();}
Neither thing is possible in C++:
Functions cannot return types (that is to say, types are not values).
Code cannot operate on objects whose type it doesn't know at compile-time (that is to say, C++ is statically typed). Of course there is dynamic polymorphism via virtual functions, but even with that, the type of the pointer you use to call them is known at compile time by the calling code.
So the operation you want, "convert to the pointer type indicated by a string" is not possible. If it were possible, then the result would be a pointer whose type is not known at compile time, and that cannot be.
There's nothing you could do with this "pointer of type unknown at compile time", that you can't do using the void* you started with. void* pretty much already is what C++ has in place of a pointer to unknown type.
While it's not possible to return a type from a function, you could use typeid to get information about the object, and use the string returned by typeid(*obj).name() as an argument to your constructor.
Keep in mind that this string would be implementation defined, so you would have to generate this string at runtime for every type that you might possibly use in the program in order to make your unordered_map useful.
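What can be done instead is to map each type-name string to code that already knows the static type. The following is only a sketch under that assumption: Data mirrors the question's struct, while handlers, register_type, make_data and dispatch are hypothetical helpers. The keys are produced by typeid at runtime, as suggested above, so they are never hard-coded.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <typeinfo>
#include <unordered_map>

// Mirrors the struct from the question: untyped pointer plus a type name.
struct Data {
    void* index;
    std::string type;
};

using Handler = std::function<std::string(void*)>;

// Table from type-name string to a handler that knows the static type.
std::unordered_map<std::string, Handler>& handlers() {
    static std::unordered_map<std::string, Handler> table;
    return table;
}

// Call as register_type<T>(...); the cast lives inside the stored lambda.
template <class T>
void register_type(std::function<std::string(T&)> fn) {
    handlers()[typeid(T).name()] = [fn](void* p) {
        return fn(*static_cast<T*>(p));
    };
}

template <class T>
Data make_data(T& obj) {
    return Data{&obj, typeid(T).name()};
}

// Looks up the handler by name; unknown types are reported, not cast.
std::string dispatch(const Data& d) {
    auto it = handlers().find(d.type);
    return it == handlers().end() ? "unknown" : it->second(d.index);
}
```

Each handler has the same signature, so the map is well-formed, and the type-specific cast happens in code where the type is known at compile time.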
There is almost certainly a much simpler and more idiomatic way to accomplish your goal in C++, however. Perhaps if you explained more about the goals of the program, someone might be able to suggest an alternative approach.

Creating a simple scripted 'language' - VARIANT-like value type

For a rules engine developed in C++, one of the core features is the value type. What I have so far is a bit like a COM-style VARIANT - each value knows its type. There are some rules for type conversion but it's a bit messy.
I wondered if there are nice drop-in value classes I could use which solve this, without requiring me to use a whole pre-built system. For instance maybe boost has something?
Looking for boost::any or boost::variant?
There are basically three types of variant implementations:
A type that can be freely cast between types (think untyped languages) -- boost::lexical_cast is your friend here, or boost::variant...
A type that can hold any type, but is typesafe -- e.g. initialized with an int, stays an int and doesn't allow to be treated implicitly like anything else -- this is the boost::any type
The evil allow anything type -- cast to what you want without error checking, no type information held -- think void*
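To illustrate the second kind, here is a sketch using std::any, the C++17 counterpart of boost::any (an assumption: a C++17 compiler is available). The helper names holds and throws_on_wrong_type are hypothetical.

```cpp
#include <any>
#include <cassert>

// Returns true if `value` currently stores exactly a T.
template <class T>
bool holds(const std::any& value) {
    // The pointer form of any_cast returns nullptr instead of throwing.
    return std::any_cast<T>(&value) != nullptr;
}

// Shows that an any initialized with an int cannot be read as a double.
bool throws_on_wrong_type() {
    std::any v = 42;
    try {
        (void)std::any_cast<double>(v);  // no implicit int -> double here
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```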

How to store variant data in C++

I'm in the process of creating a class that stores metadata about a particular data source. The metadata is structured in a tree, very similar to how XML is structured. The metadata values can be integer, decimal, or string values.
I'm curious if there is a good way in C++ to store variant data for a situation like this. I'd like for the variant to use standard libraries, so I'm avoiding the COM, Ole, and SQL VARIANT types that are available.
My current solution looks something like this:
enum MetaValueType
{
MetaChar,
MetaString,
MetaShort,
MetaInt,
MetaFloat,
MetaDouble
};
union MetaUnion
{
char cValue;
short sValue;
int iValue;
float fValue;
double dValue;
};
class MetaValue
{
...
private:
MetaValueType ValueType;
std::string StringValue;
MetaUnion VariantValue;
};
The MetaValue class has various Get functions for obtaining the currently stored variant value, but it ends up making every query for a value a big block of if/else if statements to figure out which value I'm looking for.
I've also explored storing the value as only a string, and performing conversions to get different variant types out, but as far as I've seen this leads to a bunch of internal string parsing and error handling which isn't pretty, opens up a big old can of precision and data loss issues with floating point values, and still doesn't eliminate the query if/else if issue stated above.
Has anybody implemented or seen something that's cleaner to use for a C++ variant data type using standard libraries?
As of C++17, there’s std::variant.
If you can’t use that yet, you might want Boost.Variant. A similar, but distinct, type for modelling polymorphism is provided by std::any (and, pre-C++17, Boost.Any).
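As a hedged sketch of how std::variant might replace the enum, union and string trio from the question: the alias MetaValue reuses the question's name, while ToText and to_text are hypothetical. A single visitor takes the place of the if/else chain.

```cpp
#include <cassert>
#include <string>
#include <variant>

// One type-safe member instead of enum + union + separate string.
using MetaValue = std::variant<char, std::string, short, int, float, double>;

// Visitor: overload resolution picks the branch for the stored alternative.
struct ToText {
    std::string operator()(const std::string& s) const { return s; }
    std::string operator()(char c) const { return std::string(1, c); }
    template <class Number>
    std::string operator()(Number n) const { return std::to_string(n); }
};

std::string to_text(const MetaValue& v) {
    return std::visit(ToText{}, v);
}
```

Adding a new alternative to MetaValue without a matching overload in a non-generic visitor becomes a compile error, which is the safety the hand-written union lacks.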
Just as an additional pointer, you can look for “type erasure”.
While Konrad's answer (using an existing standardized solution) is certainly preferable to writing your own bug-prone version, the boost variant has some overheads, especially in copy construction and memory.
A common customized approach is the following modified Factory Pattern:
Create a Base interface for a generic object that also encapsulates the object type (either as an enum or, preferably, using 'typeid').
Now implement the interface using a template Derived class.
Create a factory class with a templated create function with signature:
template <typename T> Base * Factory::create ();
This internally creates a Derived<T> object on the heap, and returns a pointer to it as a Base *. Specialize this for each class you want implemented.
Finally, define a Variant wrapper that contains this Base * pointer and defines template get and set functions. Utility functions like getType(), isEmpty(), assignment and equality operators, etc can be appropriately implemented here.
Depending on the utility functions and the factory implementation, supported classes will need to support some basic functions like assignment or copy construction.
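The steps above can be sketched roughly as follows. This is a simplified assumption of the design, not the author's code: the separate factory is folded into Variant::set, and only get, set, isEmpty and copying are shown. It requires the stored types to be copy constructible.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>

// Base interface: exposes the stored type and a way to copy the object.
struct Base {
    virtual ~Base() = default;
    virtual const std::type_info& type() const = 0;
    virtual std::unique_ptr<Base> clone() const = 0;
};

// Template implementation holding the actual value.
template <class T>
struct Derived : Base {
    explicit Derived(T v) : value(std::move(v)) {}
    const std::type_info& type() const override { return typeid(T); }
    std::unique_ptr<Base> clone() const override {
        return std::unique_ptr<Base>(new Derived<T>(value));
    }
    T value;
};

// Wrapper owning a Base pointer, with typed get and set.
class Variant {
public:
    Variant() = default;
    Variant(const Variant& other) {
        if (other.held_) held_ = other.held_->clone();
    }
    Variant& operator=(Variant other) {
        held_.swap(other.held_);
        return *this;
    }

    template <class T>
    void set(T v) { held_.reset(new Derived<T>(std::move(v))); }

    // Returns nullptr when empty or when T is not the stored type.
    template <class T>
    const T* get() const {
        if (!held_ || held_->type() != typeid(T)) return nullptr;
        return &static_cast<const Derived<T>*>(held_.get())->value;
    }

    bool isEmpty() const { return !held_; }

private:
    std::unique_ptr<Base> held_;
};
```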
You can also go down to a more C-ish solution, which would have a void* the size of a double on your system, plus an enum for which type you're using. It's reasonably clean, but definitely a solution for someone who feels wholly comfortable with the raw bytes of the system.
C++17 now has std::variant which is exactly what you're looking for.
std::variant
The class template std::variant represents a type-safe union. An
instance of std::variant at any given time either holds a value of one
of its alternative types, or in the case of error - no value (this
state is hard to achieve, see valueless_by_exception).
As with unions, if a variant holds a value of some object type T, the
object representation of T is allocated directly within the object
representation of the variant itself. Variant is not allowed to
allocate additional (dynamic) memory.
Although the question had been answered for a long time, for the record I would like to mention that QVariant in the Qt libraries also does this.
Because C++ forbids unions from including types that have non-default
constructors or destructors, most interesting Qt classes cannot be
used in unions. Without QVariant, this would be a problem for
QObject::property() and for database work, etc.
A QVariant object holds a single value of a single type() at a time.
(Some type()s are multi-valued, for example a string list.) You can
find out what type, T, the variant holds, convert it to a different
type using convert(), get its value using one of the toT() functions
(e.g., toSize()) and check whether the type can be converted to a
particular type using canConvert().