Serialize objects to JSON using custom reflection

I'm writing a custom runtime reflection system to serialize the objects of my application (a game, so game entities).
My reflection framework is quite simple. It lets me reflect data types and attach data members, constructors, and member functions to these meta types:
// reflect built-in types
Reflect::Reflect<int>("int");
Reflect::Reflect<float>("float");
Reflect::Reflect<std::string>("string");
// reflect TransformComponent
Reflect::Reflect<TransformComponent>("TransformComponent")
.AddConstructor<>()
.AddDataMember<&TransformComponent::SetPosition, &TransformComponent::GetPosition>("position")
.AddDataMember<&TransformComponent::SetId, &TransformComponent::GetId>("id")
.AddDataMember(&TransformComponent::mName, "name")
.AddDataMember(&TransformComponent::f, "f");
I can use reflected members to set and get data members (which can have attached setters and getters) or to construct instances of reflected types:
Reflect::any transformComponentAny = Reflect::Resolve("TransformComponent")->GetConstructor<>()->NewInstance();
// using reflection to set and get object fields
Reflect::any positionAny = Reflect::Resolve("Vector3D")->GetConstructor<>()->NewInstance();
Reflect::Resolve("Vector3D")->GetDataMember("x")->Set(positionAny, 22.22f);
Reflect::Resolve("Vector3D")->GetDataMember("y")->Set(positionAny, 130.12f);
Reflect::Resolve("Vector3D")->GetDataMember("z")->Set(positionAny, 545.12f);
Reflect::Resolve("TransformComponent")->GetDataMember("f")->Set(transformComponentAny, 0.009f);
Reflect::Resolve("TransformComponent")->GetDataMember("position")->Set(transformComponentAny, positionAny);
Reflect::Resolve("TransformComponent")->GetDataMember("id")->Set(transformComponentAny,32325);
Reflect::Resolve("TransformComponent")->GetDataMember("name")->Set(transformComponentAny, std::string("hey a transform, again..."));
All objects are wrapped in a class similar to std::any, which abstracts the real object and maintains a type descriptor of the object's type (to perform casts, conversions, etc.).
Now, this is all ordinary reflection stuff, I guess. What I'm stuck on is how to serialize my objects (which is the whole point of all this).
When I want to serialize a member of an object to JSON, I can easily retrieve the value given the type descriptor of the object, the meta data member, and the instance of the object:
auto typeDesc = object.GetType(); // get the type descriptor for the object (object is an any)
for (auto dataMember : typeDesc->GetDataMembers())
{
auto fieldName = dataMember->GetName();
auto fieldType = dataMember->GetType()->GetName();
any field = dataMember->Get(object);
if (fieldType == "int")
json[fieldName] = field.Get<int>();
else if (fieldType == "string")
....
}
What I want is to avoid the explicit check for the runtime type and the cast to that type.
I'd like something like this:
json[fieldName] = field; // field is an any; it has no type information at compile time
and let the meta type perform the necessary cast. I can do this when deserializing from JSON, since the meta types preserve the compile-time type of the data member they represent, whereas any only keeps a type descriptor of the object it contains (I need to call any.Get<> with the appropriate type at compile time).
Any ideas about how to do it?
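One possible direction, sketched here with std::any (C++17) standing in for the custom any class: since the static type is known at the point where a type is reflected, a conversion function can be captured there and looked up later by runtime type. The names converters, registerConverter, and toJson are hypothetical and not part of the framework above, and the JSON value is reduced to a plain string to keep the sketch small.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical converter signature: turns a type-erased value into JSON text.
using ToJson = std::function<std::string(const std::any&)>;

// Hypothetical registry keyed by runtime type.
inline std::map<std::type_index, ToJson>& converters() {
    static std::map<std::type_index, ToJson> registry;
    return registry;
}

// Would be called once per reflected type, next to the Reflect<T>() call,
// while T is still a compile-time type.
template <typename T>
void registerConverter(std::function<std::string(const T&)> convert) {
    converters()[std::type_index(typeid(T))] =
        [convert](const std::any& value) {
            // The cast lives here, where T is statically known.
            return convert(std::any_cast<const T&>(value));
        };
}

// Serializes a type-erased value without an if/else chain on type names.
inline std::string toJson(const std::any& value) {
    auto it = converters().find(std::type_index(value.type()));
    assert(it != converters().end() && "type was never registered");
    return it->second(value);
}
```

With something like this in place, the serialization loop would only need to call the lookup for each field; whether the converter is stored in a global map or directly in the type descriptor is a design choice.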

Related

Can list cast correctly in Kotlin?

I am new to Kotlin,
data class RewardDetail(
    val name: String,
    val isActivated: Boolean,
    val amountInCents: Int?,
    val Campain: String?,
    val expirationDate: Long?
)
class Rewards(val rewards: Array<Reward>)
class Reward(
    val name: String,
    val isActive: Boolean,
    val amountInCents: Int,
    val campaignId: String,
    val expirationDate: LocalDateTime
)
val details : List<RewardDetail> = blablabla
val rewards = Rewards(details)
Can details be cast to Rewards successfully?
Also note that the campaignId and Campain field names differ between Reward and RewardDetail, and some fields are nullable in RewardDetail.
What is the best way to handle situation like this?

Kotlin is strongly-typed. You can never successfully cast one thing into a different class. You can only cast an object into a type that it already satisfies. For example, if you have an Int that is currently only known to the compiler to be a Number, you can cast to Int to tell the compiler that it has an Int, so the compiler will allow you to use the functions that are specific to Int. But nothing but an Int can ever be cast to an Int.
So, unlike weakly typed languages, casting does not convert from one type to another. Casting is only you making a promise to the compiler that an object already is of the other type.
In your example, the only way to get a Reward from a RewardDetail is by writing a function that manually converts each property to the appropriate type.
The Rewards class above is largely redundant. There's no need for a wrapper class around a single Array or List unless you need to do validation of items added to or retrieved from the list. In that case, it would probably make more sense to create a subclass of ArrayList for that purpose, so you could still easily iterate the list and use all the List and Iterable helper functions on it.
Probably about 95% of the time, you should prefer using List over using Array. Arrays should be used only when you need a fixed size collection that is also mutable, or if you are working with highly performance-critical code. The reason it should be limited to these uses is that mutability should be avoided when possible for robustness and Arrays are more cumbersome to work with than MutableLists.
A typical way to convert from one type to another is to write an extension function RewardDetail.toReward(), or a toReward() function inside the RewardDetail class. However, in your case you need to decide what should happen when some of the values of RewardDetail are null. Maybe you just return null, in which case your conversion function should be named toRewardOrNull(), or you provide default values for the properties that have no value in RewardDetail.

How can I check at compile time that an ID is only used on the object instance that generated it?

Let's say that I have a class that manages "identifiers" that can be later used to look up data. For instance, if I wanted to manage some objects of class Foo, I could have:
Manager<Foo> manager;
Id<Foo> id = manager.create();
In this example, Id is just a generic identifier like a struct with some generic meta information that does not depend on the template parameter at all. The only thing the template parameter does is add some type safety by using a phantom type. The phantom type would prevent me from ever accidentally passing an Id<Foo> to a manager object of type Manager<Bar>, even though the underlying type of the Id is exactly the same struct. In contrast to a plain old int, the phantom type makes certain at compile time that I don't use Id with a Manager of the wrong type.
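A minimal sketch of the phantom-type arrangement described above, assuming the identifier is just an index into a vector (the question does not say what Id actually stores, and Foo and Bar here are placeholder types):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// T is never stored; it only makes Id<Foo> and Id<Bar> distinct types.
template <typename T>
struct Id {
    std::size_t index;  // assumed payload: position in the manager's storage
};

template <typename T>
class Manager {
public:
    Id<T> create() {
        items_.push_back(T());
        return Id<T>{items_.size() - 1};
    }

    // Only an Id<T> is accepted; passing an Id<U> does not compile.
    T& lookup(Id<T> id) {
        assert(id.index < items_.size());
        return items_[id.index];
    }

private:
    std::vector<T> items_;
};

struct Foo { int value = 0; };
struct Bar { int value = 0; };
```

Note that this only separates identifiers by managed type. Two Manager<Foo> instances still share the same Id<Foo> type, which is exactly the gap the rest of the question is about.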
Now, I want to take this idea a step further. Instead of enforcing that Manager<Foo> only accepts identifiers of type Id<Foo>, I want to enforce that the specific instance of Manager<Foo> only accepts identifiers created by itself in the call to create(). This would prevent an object of class Manager<T> from accepting an identifier of Id<T> that was created by a different instance of Manager<T>. Since an identifier is only valid in the context of the manager that created it, this would detect at compile time that an ID is passed to the wrong manager instance even if the type being managed happens to be the same.
Specifically, I would like usage akin to the below sample code to throw a compiler error. It doesn't have to be exactly like this, but I don't want to require too much hoop jumping by the class user (myself).
Manager<Foo> m1;
Manager<Foo> m2;
Id<Foo> id1 = m1.create();
Id<Foo> id2 = m2.create();
Foo myFoo = m2.lookup(id1); // <-- causes compiler error. id1 used on m2.
I would prefer something that works in C++11 or earlier, but C++14 is acceptable, too.

Pre-serialisation message objects - implementation?

I have a TCP client-server setup where I need to be able to pass messages of different formats at different times, using the same transmit/receive infrastructure.
Two different types of messages sent from client to server might be:
TIME_SYNC_REQUEST: Requesting server's game time. Contains no information other than the message type.
UPDATE: Describes all changes to game state that happened since the last update that was posted (if this is not the very first one after connecting), so that the server may update its data model where it sees fit.
(The message type to be included in the header, and any data to be included in the body of the message.)
In dynamic languages, I'd create an AbstractMessage type, and derive two different message types from it, with TimeSyncRequestMessage accommodating no extra data members, and UpdateMessage containing all necessary members (player position etc.), and use reflection to see what I need to actually serialise for socket send(). Since the class name describes the type, I would not even need an additional member for that.
In C++: I do not wish to use dynamic_cast to mirror the approach described above, for performance reasons. Should I use a compositional approach, with dummy members filling in for any possible data, and a char messageType? I guess another possibility is to keep different message types in differently-typed lists. Is this the only choice? Otherwise, what else could I do to store the message info until it's time to serialise it?

Maybe you can let the message class do the serialization: define a serialization interface and have each message implement it. Then, when you want to serialize and send, you call AbstractMessage::Serialize() to get the serialized data.
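As a rough sketch of that idea, assuming a text encoding and two integer position members (the wire format, the member names, and the serializeAll helper are all made up for illustration):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class AbstractMessage {
public:
    virtual ~AbstractMessage() = default;
    // Each message type knows how to turn itself into bytes.
    virtual std::string Serialize() const = 0;
};

class TimeSyncRequestMessage : public AbstractMessage {
public:
    // Carries no data beyond its type.
    std::string Serialize() const override { return "TIME_SYNC_REQUEST"; }
};

class UpdateMessage : public AbstractMessage {
public:
    UpdateMessage(int playerX, int playerY) : x_(playerX), y_(playerY) {}
    std::string Serialize() const override {
        return "UPDATE " + std::to_string(x_) + " " + std::to_string(y_);
    }

private:
    int x_;
    int y_;
};

// The send path only sees the interface; no dynamic_cast is needed.
std::vector<std::string> serializeAll(
    const std::vector<std::unique_ptr<AbstractMessage>>& queue) {
    std::vector<std::string> out;
    for (const auto& message : queue) {
        assert(message != nullptr);
        out.push_back(message->Serialize());
    }
    return out;
}
```

The cost per message is one virtual call, and messages of different types can share a single queue.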

Unless you have very demanding performance requirements, I would use a self-describing message format. This typically uses a common format (say key=value) but no specific structure; instead, known attributes describe the type of the message, and any other attributes can then be extracted from that message using logic specific to that message type.
I find this type of messaging retains better backward compatibility: if you have new attributes you want to add, you can add away and older clients will simply not see them. Messaging that uses fixed structures tends to fare less well.
EDIT: More information on self-describing message formats. Basically, the idea here is that you define a dictionary of fields; this is the universe of fields that your generic message can contain. A message by default must contain some mandatory fields, and then it's up to you what other fields are added to the message. The serialization/deserialization is pretty straightforward: you end up constructing a blob which has all the fields you want to add, and at the other end, you construct a container which has all the attributes (imagine a map). The mandatory fields can describe the type; for example, you can have a field in your dictionary which is the message type, and this is set for all messages. You interrogate this field to determine how to handle this message. Once you are in the handling logic, you simply extract the other attributes the logic needs from the container (map) and process them.
This approach affords the best flexibility and allows you to do things like only transmit fields that have really changed. How you keep this state on either side is up to you, but given you have a one-to-one mapping between message and handling logic, you need neither inheritance nor composition. The smartness in this type of system stems from how you serialize the fields (and deserialize them so that you know which attribute in the dictionary the field is). For an example of such a format, look at the FIX protocol. I wouldn't advocate this for gaming, but the idea should demonstrate what a self-describing message is.
EDIT2: I cannot provide a full implementation, but here is a sketch.
Firstly let me define a value type - this is the typical type of values which can exist for a field:
typedef boost::variant<int32, int64, double, std::string> value_type;
Now I describe a field
struct field
{
int field_key;
value_type field_value;
};
Now here is my message container
struct Message
{
field type;
field size;
container<field> fields; // I use a generic "container", you can use whatever you want (map/vector etc. depending on how you want to handle repeating fields etc.)
};
Now let's say that I want to construct a TIME_SYNC message. I use a factory to generate an appropriate skeleton:
std::unique_ptr<Message> getTimeSyncMessage()
{
    std::unique_ptr<Message> msg(new Message);
    msg->type = { dict::field_type, TIME_SYNC }; // set the type
    // set other default attributes for this message type
    return msg;
}
Now, I want to set more attributes, and this is where I need a dictionary of supported fields, for example:
namespace dict
{
    static const int field_type = 1; // message type field id
    // fields that you want
    static const int field_time = 2;
    :
}
So now I can say,
std::unique_ptr<Message> msg = getTimeSyncMessage();
msg->setField(dict::field_time, some_value);
msg->setField(dict::field_other, some_other_value);
: // etc.
: // etc.
Now the serialization of this message, when you are ready to send, is simply stepping through the container and adding each field to the blob. You can use ASCII encoding or binary encoding (I would start with the former and then move to the latter, depending on requirements). So an ASCII-encoded version of the above could be something like:
1=1|2=10:00:00.000|3=foo
Here, for argument's sake, I use a | to separate the fields; you can use something else that you can guarantee doesn't occur in your values. With a binary format this is not relevant, since the size of each field can be embedded in the data.
The deserialization would step through the blob, extract each field appropriately (by separating on | for example), use the factory methods to generate a skeleton (once you've got the type, field 1), then fill in all the attributes in the container. Later, when you want to get a specific attribute, you can do something like:
msg->getField(field_time); // this will return the variant - and you can use boost::get for the specific type.
I know this is only a sketch, but hopefully it conveys the idea behind a self-describing format. Once you've got the basic idea, there are lots of optimizations that can be done, but that's a whole other topic...
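To make the ASCII encoding concrete, here is a small sketch of the encode and decode steps. It simplifies the design above by storing every value as a string in a std::map instead of a boost::variant, and the names Fields, encode, and decode are invented for this example.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Simplified container: field key -> value, with all values kept as text.
using Fields = std::map<int, std::string>;

// Builds "key=value|key=value|..." by stepping through the container.
std::string encode(const Fields& fields) {
    std::string blob;
    for (const auto& entry : fields) {
        if (!blob.empty()) blob += '|';
        blob += std::to_string(entry.first) + '=' + entry.second;
    }
    return blob;
}

// Splits the blob on '|' and each token on the first '='.
Fields decode(const std::string& blob) {
    Fields fields;
    std::istringstream stream(blob);
    std::string token;
    while (std::getline(stream, token, '|')) {
        const std::string::size_type eq = token.find('=');
        assert(eq != std::string::npos && "malformed field");
        fields[std::stoi(token.substr(0, eq))] = token.substr(eq + 1);
    }
    return fields;
}
```

As noted above, this only works if | never appears inside a value; a real implementation would need escaping or a length-prefixed binary layout.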

A common approach is to simply have a header on all of your messages. For example, you might have a header structure that looks like this:
struct header
{
int msgid;
int len;
};
Then the stream would contain both the header and the message data. You could use the information in the header to read the correct amount of data from the stream and to determine which type it is.
How the rest of the data is encoded, and how the class structure is set up, depends greatly on your architecture. If you are using a private network where each host is the same and runs identical code, you can use a binary dump of a structure. Otherwise, which is the more likely case, you'll have a variable-length data structure for each type, serialized perhaps using Google Protobuf or Boost serialization.
In pseudo-code, the receiving end of a message looks like:
read_header( header );
switch( header.msgid )
{
case TIME_SYNC:
read_time_sync( ts );
process_time_sync( ts );
break;
case UPDATE:
read_update( up );
process_update( up );
break;
default:
emit error
skip header.len;
break;
}
What the "read" functions look like depends on your serialization. Google protobuf is pretty decent if you have basic data structures and need to work in a variety of languages. Boost serialization is good if you use only C++ and all code can share the same data structure headers.
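For illustration, the header framing could be sketched as follows. This assumes both header fields are sent as 32-bit big-endian integers (the struct above uses plain int, whose size and byte order are platform-dependent) and uses a std::string as a stand-in for the byte stream; frame, unframe, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

struct Header {
    std::uint32_t msgid;
    std::uint32_t len;  // number of body bytes that follow the header
};

const std::size_t kHeaderSize = 8;

// Appends a 32-bit value in big-endian byte order.
void appendU32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Reads a 32-bit big-endian value starting at offset.
std::uint32_t readU32(const std::string& in, std::size_t offset) {
    assert(offset + 4 <= in.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[offset + i]);
    return v;
}

// Sender side: header followed by the already-serialized body.
std::string frame(std::uint32_t msgid, const std::string& body) {
    std::string out;
    appendU32(out, msgid);
    appendU32(out, static_cast<std::uint32_t>(body.size()));
    out += body;
    return out;
}

// Receiver side: returns false until a complete frame is available.
bool unframe(const std::string& stream, Header& header, std::string& body) {
    if (stream.size() < kHeaderSize) return false;
    header.msgid = readU32(stream, 0);
    header.len = readU32(stream, 4);
    if (stream.size() - kHeaderSize < header.len) return false;
    body = stream.substr(kHeaderSize, header.len);
    return true;
}
```

The len field is what lets the receiver skip a message with an unknown msgid, as in the default branch of the pseudo-code.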

A normal approach is to send the message type and then send the serialized data.
On the receiving side, you receive the message type and based on that type, you instantiate the class via a factory method (using a map or a switch-case), and then let the object deserialize the data.
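A sketch of the receiving side under these assumptions: message types are small integers, the payload arrives as a string, and Describe() exists only so the example has something observable. All class and function names here are illustrative.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

enum MessageType { TIME_SYNC = 1, UPDATE = 2 };

class Message {
public:
    virtual ~Message() = default;
    virtual void Deserialize(const std::string& data) = 0;
    virtual std::string Describe() const = 0;
};

class TimeSyncMessage : public Message {
public:
    void Deserialize(const std::string&) override {}  // no body to read
    std::string Describe() const override { return "time_sync"; }
};

class UpdateMessage : public Message {
public:
    void Deserialize(const std::string& data) override { payload_ = data; }
    std::string Describe() const override { return "update:" + payload_; }

private:
    std::string payload_;
};

// Looks up a factory by message type, then lets the object read its data.
std::unique_ptr<Message> receive(int type, const std::string& data) {
    static const std::map<int, std::function<std::unique_ptr<Message>()>>
        factories = {
            {TIME_SYNC, [] { return std::unique_ptr<Message>(new TimeSyncMessage); }},
            {UPDATE, [] { return std::unique_ptr<Message>(new UpdateMessage); }},
        };
    auto it = factories.find(type);
    if (it == factories.end()) return nullptr;  // unknown message type
    std::unique_ptr<Message> message = it->second();
    assert(message != nullptr);
    message->Deserialize(data);
    return message;
}
```

A switch statement would work just as well for a handful of types; the map makes it easier to register new types without touching the dispatch code.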

Are your performance requirements really strict enough to rule out dynamic_cast? I do not see how testing a field on a general structure could be faster than that, so that leaves only the different lists for different messages: you have to know the type of your object by some other means in every case. But then you can have pointers to an abstract class and do a static cast on those pointers.
I recommend that you reassess the use of dynamic_cast; I do not think it would be prohibitively slow for network applications.

On the sending end of the connection, in order to construct our message, we keep the message ID and header separate from the message data:
Message is a type that holds only the messageCategory and messageID.
Each such Message is pushed onto a unified messageQueue.
Separate hashes are kept for the data pertaining to each messageCategory. In these, there is a record of data for each message of that type, keyed by messageID. The value type depends on the message category, so for a TIME_SYNC message we'd have a struct TimeSyncMessageData, for instance.
Serialisation:
Pop the message from the messageQueue, reference the appropriate hash for that message type, by messageID, to retrieve the data we want to serialise & send.
Serialise & send the data.
Advantages:
No potentially unused data members in a single, generic Message object.
An intuitive setup for data retrieval when the time comes for serialisation.

Get strongly typed list item data type?

Let's say I have a List<object> which is passed into a class as an argument. This list should contain a bunch of models for my application, all of the same type. Is it possible to somehow retrieve the type of the list that was passed in (without calling GetType() on an item in the list)?
For example, I pass in a List<User> which is stored as List<object>. Can I now retrieve the type User from the list without doing something like:
List<object> aList;
aList[0].GetType();

Well, you can use:
Type elementType = aList.GetType().GetGenericArguments()[0];
However, that will fail if you pass in FooList which derives from List<Foo> for example. You could walk the type hierarchy and work things out appropriately that way, but it would be a pain.
If at all possible, it would be better to use generics throughout your code instead, potentially making existing methods generic - e.g. instead of:
public void Foo(List<object> list)
you'd have
public void Foo<T>(List<T> list)
or even
public void Foo<T>(IList<T> list)
If you just need it for the very specific case where the execution-time type will always be exactly List<T> for some list, then using GetGenericArguments will work... but it's not terribly nice.

As I understand it, the purpose of generics is to not have to do the type checking manually. The compiler ensures that the items in the list are the type they claim to be, and therefore the items that come out will be that type.
If you have a List<object>, you're defeating the purpose of using generics at all. A List<object> is a list that can store any type of object, no matter what types you actually put into it. Therefore, the onus is upon you to detect what the actual type of the object you retrieve is.
In short: you have to use GetType.

Objects vs instance in python

In C++ there are just objects and classes, where objects are instances of classes.
In Python, a class definition (i.e., the body of a class) is called an object.
And, the object in C++ is called instance in python.
Check this
Am I wrong?
EDIT: Actually, can someone explain the difference between object and instance with an example?
EDIT: In Python, everything inherits from the object class, and hence everything is an object (i.e., an object of the object class).
A class is also an object (i.e., an object of the object class).
Instance is the name used for an object of any class (a.k.a. a C++ object).
Please refer this

In Python, a class definition (i.e., the body of a class) is called an object
Actually, this is still called a class in Python. That's why you define it like this:
class Foo(object):
    pass
The class keyword is used because the result is still called a class.
The word object is in parentheses to show that Foo is derived from the class called object. Don't be confused -- any existing class could be used here; more than one, in fact.
The reason you usually derive classes from object is a historical accident but probably is worth a detail. Python's original object implementation treated user-defined classes and built-in types as slightly different kinds of things. Then the language's designer decided to unify these two concepts. As a result, classes derived from object (or from a descendant of object) behave slightly differently from classes that are not derived from object and are called new-style classes. Old-style classes, on the other hand, were ones defined like this:
class Foo:
    pass
class Bar(Foo):
    pass
Note these do not inherit from object or from anything else that inherits from object. This makes them old-style classes.
When working with Python 2.x, your classes should almost always inherit from object, as the new-style objects are nicer to work with in several small but important ways.
To further confuse things, in Python 3.0 and later, there are no old-style classes, so you don't have to derive from object explicitly. In other words, all the above classes would be new-style classes in Python 3.x.
Now, back to the matter at hand. Classes are objects because everything is an object in Python. Lists, dictionaries, integers, strings, tuples... all of these are objects, and so are the building blocks of Python programs: modules, functions, and classes. You can create a class using the class keyword and then pass it to a function, modify it, etc. (For completeness, you can also create a class using the type() function.)
A class is a template for building objects, which are referred to as instances. This part you already know. You instantiate objects similar to calling a function, passing in the initial values and other parameters:
mylist = list("abc") # constructs ["a", "b", "c"]
Behind the scenes, this creates an instance, then calls the new instance's __init__() method to initialize it. Since everything's an object in Python, instances of a class are also objects.
One last thing you might want to know is that just as classes are templates for building objects, so it is possible to have templates for building classes. These are called metaclasses. The base metaclass is called type (that is, an ordinary new-style class is an instance of type).
(Yes, this is the same type that I mentioned earlier can be used to create classes, and the reason you can call it to create classes is that it's a metaclass.)
To create your own metaclass, you derive it from type like so:
class mymeta(type):
    pass
Metaclasses are a fairly advanced Python topic, so I won't go into what you might use them for or how to do it, but they should make it clear how far Python takes the "everything's an object" concept.

Terminology-wise, classes and instances are both called objects in Python, but for you as a regular Python programmer this is of no importance. You can see Python's classes and instances pretty much as C++'s classes and instances:
class MyClass:
    data = 1
mc = MyClass()
MyClass is a class and mc is an instance of class MyClass.
Python is much more dynamic in nature than C++ though, so its classes are also objects. But this isn't something programmers usually are exposed to, so you can just not worry about it.

Everything in Python is an object. Even classes, which are instances of metaclasses.

Since you asked for "english please", I'll try to make it simple at the cost of detail.
Let's ignore classes and instances at first, and just look at objects.
A Python object contains data and functions, just like objects in every other object oriented programming language. Functions attached to objects are called methods.
x = "hello" #now x is an object that contains the letters in "hello" as data
print x.upper() #but x also has methods, for example upper()
print "hello".upper() #In Python, unlike C++, everything is an object, so a string literal has methods.
print (5).bit_length() #as do integers (bit_length only works in 2.7+ and 3.1+, though)
A class is a description (or a recipe, if you will) of how to construct new objects. Objects constructed according to a class description are said to belong to that class. A fancy name for belonging to a class is to be an instance of that class.
Now, earlier I wrote that in Python everything is an object. Well, that holds for stuff like functions and classes as well. So a description of how to make new objects is itself an object.
class C: #C is a class and an object
    a = 1
x1 = C() #x1 is now an instance of C
print x1.a #and x1 will contain an object a
y = C #Since C is itself an object, it is perfectly ok to assign it to y, note the lack of ()
x2 = y() #and now we can make instances of C, using y instead.
print x2.a #x2 will also contain an object a
print C #since classes are objects, you can print them
print y #y is the same as C.
print y == C #really the same.
print y is C #exactly the same.
This means that you can treat classes (and functions) like everything else and, for example, send them as arguments to a function, which can use them to construct new objects of a class it never knew existed.

In a very real sense, everything in Python is an object: a class (or any
type) is an object, a function is an object, a number is an object...
And every object has a type. A "type" is a particular type of object (a
class, if you wish), with additional data describing the various
attributes of the type (functions, etc.). If you're used to C++, you
can think of it as something like:
struct Type;
struct Object // The base class of everything.
{
Type* myType;
// Some additional stuff, support for reference counting, etc.
};
struct Type : Object
{
// Lots of additional stuff defining type attributes...
};
When you define a new class in Python, you're really just creating a new
instance of Type; when you instantiate that class, Python initializes
the myType member with a pointer to the correct instance of Type.
Note, however, that everything is dynamic. When you define a type
Toto (by executing a class definition—even defining a type is a
runtime thing, not compile time, as in C++), the Python interpreter
creates an instance of Type, and puts it in a dictionary
(map<string, Object*>, in C++ parlance) somewhere. When the interpreter
encounters a statement like:
x = Toto()
, it looks up Toto in the dictionary: if the Object referred to has
the type Type, it constructs a new instance of that object, if it has
type Function (functions are also objects), it calls the function.
(More generally, a type/class may be callable or not; if the type of the
Object found in the dictionary under Toto is callable, the Python
interpreter does whatever the object has defined "call" to mean. Sort
of like overloading operator()() in C++. The overload of
operator()() for Type is to construct a new object of that type.)
And yes, if you come from a classical background—strictly procedural,
structured, fully-compiled languages, it can be pretty confusing at
first.
