Serializing a struct whose definition is not known - c++

I am using the GEOS library in my software as the geometry engine. I am currently using its C API (as that is the recommended API).
Now the problem is that I would like to serialize and deserialize the struct GEOSGeometry. The library itself is written in C++ and the C API is a wrapper around it, so the struct definition is not available per se. What are my options?
This is what the capi mentions
/* When we're included by geos_c.cpp, those are #defined to the original
* JTS definitions via preprocessor. We don't touch them to allow the
* compiler to cross-check the declarations. However, for all "normal"
* C-API users, we need to define them as "opaque" struct pointers, as
* those clients don't have access to the original C++ headers, by design.
*/
#ifndef GEOSGeometry
typedef struct GEOSGeom_t GEOSGeometry;
And this is how it is wrapped
// Some extra magic to make type declarations in geos_c.h work -
// for cross-checking of types in header.
#define GEOSGeometry geos::geom::Geometry
Any help is appreciated.

First of all, if you really can't access the struct's definition in a source file, I'd try to inspect it with the C++11 type_traits classes, e.g. is_pod, is_trivial, is_standard_layout, ...
This way, you can get an idea of what you are dealing with. If you see that the struct is quite simple, you can "hope" that it stores all its data inside itself, i.e. that it does not point to other memory areas. Sadly, as far as I know, there is no way to find out whether a class has a pointer member.
In the end, all you can do is try to serialize it by brute force, writing sizeof(GEOSGeometry) bytes (chars) to your output. Then read it back and... good luck!
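To make the idea above concrete, here is a minimal sketch. PlainPoint, to_bytes and from_bytes are hypothetical names, not part of GEOS. Note that the type traits need a complete type, so none of this compiles against an opaque handle whose definition is hidden; the sketch only shows what the check and the byte copy look like when the definition is visible.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a struct whose layout we can actually see.
struct PlainPoint {
    double x;
    double y;
    int srid;
};

// The traits require a complete type; they cannot inspect an opaque handle.
static_assert(std::is_trivially_copyable<PlainPoint>::value,
              "byte-wise copy only makes sense for trivially copyable types");
static_assert(std::is_standard_layout<PlainPoint>::value,
              "standard layout expected");

// Copy the object's bytes into a buffer.
std::vector<unsigned char> to_bytes(const PlainPoint& p) {
    std::vector<unsigned char> buf(sizeof(PlainPoint));
    std::memcpy(buf.data(), &p, sizeof(PlainPoint));
    return buf;
}

// Rebuild the object from a buffer produced by to_bytes().
PlainPoint from_bytes(const std::vector<unsigned char>& buf) {
    assert(buf.size() == sizeof(PlainPoint));
    PlainPoint p;
    std::memcpy(&p, buf.data(), sizeof(PlainPoint));
    return p;
}
```

Such a buffer is only meaningful on the same platform and compiler settings, and any pointer stored inside the struct would not survive the round trip. For GEOS specifically, the C API also offers WKB readers and writers (for example GEOSGeomToWKB_buf and GEOSGeomFromWKB_buf), which are probably a safer route than copying raw bytes.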

Related

SCIP: About the "SCIP_ReaderData" in the bin packing example

A question about the reader plugin defined in the binpacking example. I found the following declaration in the interface method (file reader_bpa.c),
SCIP_READERDATA* readerdata;
readerdata = NULL;
I know SCIP_READERDATA is defined in file type_reader.h:
typedef struct SCIP_ReaderData SCIP_READERDATA;
However, the struct SCIP_ReaderData is not defined in the binpacking reader, so which actual struct does "SCIP_READERDATA* readerdata;" refer to? What kind of pointer is readerdata?
PS: I noticed that the default readers in SCIP have similar usage.
That is more a C-question than a SCIP question if I am not mistaken. The interface functions SCIPincludeReader() and SCIPincludeReaderBasic() require a pointer to reader data as last argument. Reader data is supposed to allow the plugin author to connect arbitrary data with their reader plugin by declaring the corresponding struct SCIP_ReaderData as many other plugins do.
If you try to do anything with the pointer, e.g., allocate memory for it using SCIPallocMemory(scip, &readerdata), you will get compiler errors because the pointer refers to an incomplete type, namely struct SCIP_ReaderData.
More useful information on incomplete types is found, e.g., here
The point is that the example uses this to make it clearer which arguments are passed to the SCIPincludeReaderBasic() function, where you would otherwise just see NULL.
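The incomplete-type pattern described above can be sketched as follows. ReaderData, READERDATA, include_reader and files_read are hypothetical names that only mirror the shape of the SCIP declarations; this is not SCIP code.

```cpp
#include <cassert>
#include <cstddef>

// Forward declaration only: ReaderData is an incomplete type at this point.
struct ReaderData;
typedef struct ReaderData READERDATA;

// A pointer to an incomplete type can be declared, assigned and passed around.
// This hypothetical function stands in for an "include reader" call.
bool include_reader(READERDATA* data) {
    return data == NULL;  // reports whether the plugin supplied no data
}

// Not allowed while the type is still incomplete:
//   sizeof(READERDATA)   -> compile error
//   data->someField      -> compile error

// A plugin that wants its own data completes the type in its own source file.
struct ReaderData {
    int filesRead;
};

int files_read(const READERDATA* data) {
    return data->filesRead;  // fine now: the type is complete here
}
```

A plugin that needs no data, like the binpacking reader, never completes the type and simply passes NULL.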

UML representation for C/C++ function pointers

What would be the best representation of a C/C++ function pointer (fp) in an UML structural diagram?
I'm thinking about using an interface element, possibly a 'degenerate' one constrained to declare at most a single operation.
I found a proposal in this document: C and UML Synchronization User Guide, Section 5.7.4. But it sounds quite cumbersome and not very useful in practice, even if it is correct from a very low-level semantic point of view. Here's a diagram showing their concept briefly:
IMHO, in C and C++ function pointers are used as just such a narrowed view of an interface, one that provides only a single function and its signature. In C, function pointers are also used to implement more complex interfaces, by declaring a struct containing a set of function pointers.
I think I can even manage to get my particular UML tool (Enterprise Architect) to forward generate the correct code, and synchronizing with code changes without harm.
My questions are:
Would declaring fp's as part of interface elements in UML provide a correct semantic view?
What kind of stereotype should be used for a single fp declaration? At the very least I need to provide a typedef in code, so that would be my gut choice. (I found that this stereotype is proprietary to Enterprise Architect.) I also need to define an appropriate stereotype to get the code generation adapted. I have currently chosen the stereotype name 'delegate'; does this have any implications or semantic collisions?
As for C++, would nesting a 'delegate'-stereotyped interface within a class element be enough to express a class member function pointer correctly?
Here's a sample diagram of my thoughts for C language representation:
This is the C code that should be generated from the above model:
struct Interface1;
typedef int (*CallbackFunc)(struct Interface1*);
/* C does not allow typedefs inside a struct, so the pointer types come first. */
typedef void (*func1Ptr)(struct Interface1*, int, char*);
typedef int (*func2Ptr)(struct Interface1*, char*);
typedef int (*func3Ptr)(struct Interface1*, CallbackFunc);
typedef struct Interface1
{
func1Ptr func1;
func2Ptr func2;
func3Ptr func3;
void* instance;
} Interface1;
/* The following extern declarations are only dummies to satisfy code
* reverse engineering, and never should be called.
*/
extern void func1(struct Interface1* self, int p1, char* p2);
extern int func2(struct Interface1* self, char* p1);
extern int func3(struct Interface1* self, CallbackFunc p1);
EDIT:
The whole problem boils down to what the best way would be with the UML tool at hand and its specific code engineering capabilities. Thus I have added the enterprise-architect tag.
EA's help file has the following to say on the subject of function pointers:
When importing C++ source code, Enterprise Architect ignores function pointer declarations. To import them into your model you could create a typedef to define a function pointer type, then declare function pointers using that type. Function pointers declared in this way are imported as attributes of the function pointer type.
Note "could." This is from the C++ section, the C section doesn't mention function pointers at all. So they're not well supported, which in turn is of course due to the gap between the modelling and programming communities: non-trivial language concepts are simply not supported in UML, so any solution will by necessity be tool-specific.
My suggestion is a bit involved and it's a little bit hacky, but I think it should work pretty well.
Because in UML operations are not first-class and cannot be used as data types, my response is to create first-class entities for them - in other words, define function pointer types as classes.
These classes will serve two purposes: the class name will reflect the function's type signature so as to make it look familiar to the programmer in the diagrams, while a set of tagged values will represent the actual parameter and return types for use in code generation.
0) You may want to set up an MDG Technology for steps 1-4.
1) Define a tagged value type "retval" with the Detail "Type=RefGUID;Values=Class;"
2) Define a further set of tagged value types with the same Detail named "par1", "par2" and so on.
3) Define a profile with a Class stereotype "funptr" containing a "retval" tagged value (but no "par" tags).
4) Modify the code generation scripts Attribute Declaration and Parameter to retrieve the "retval" (always) and "par1" - "parN" (where defined) and generate correct syntax for them. This will be the tricky bit and I haven't actually done this. I think it can be done without too much effort, but you'll have to try it. You should also make sure that no code is generated for "funptr" class definitions as they represent anonymous types, not typedefs.
5) In your target project, define a set of classes to represent the primitive C types.
With this, you can define a function pointer type as a «funptr» class with a name like "long(*)(char)" for a function that takes a char and returns a long.
In the "retval" tag, select the "long" class you defined in step 5.
Add the "par1" tag manually, and select the "char" class as above.
You can now use this class as the type of an attribute or parameter, or anywhere else where EA allows a class reference (such as in the "par1" tag of a different «funptr» class; this allows you to easily create pointer types for functions where one of the parameters is itself of a function pointer type).
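For reference, this is roughly the code such a model is meant to describe. It is a hand-written sketch, not output of EA's code generator, and the names CharToLong, Applier, char_code, apply and Handler are made up for illustration.

```cpp
#include <cassert>

// The type that a «funptr» class named "long(*)(char)" stands for.
typedef long (*CharToLong)(char);

// A function pointer type whose first parameter is itself a function pointer,
// i.e. a «funptr» class whose "par1" tag refers to another «funptr» class.
typedef long (*Applier)(CharToLong, char);

// Two ordinary functions matching those signatures.
long char_code(char c) { return static_cast<long>(c); }
long apply(CharToLong f, char c) { return f(c); }

// A struct with attributes typed by the function pointer types.
struct Handler {
    CharToLong convert;
    Applier run;
};
```

Parameter order is part of the type, which is why the numbered "par1" to "parN" tags matter.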
The hackiest bit here is the numbered "par1" - "parN" tags. While it is possible in EA to define several tags with the same name (you may have to change the tagged value window options to see them), I don't think you could retrieve the different values in the code generation script (and even if you could I don't think the order would necessarily be preserved, and parameter order is important in C). So you'd need to decide the maximum number of parameters beforehand. Not a huge problem in practice; setting up say 20 parameters should be plenty.
This method is of no help for reverse engineering, as EA 9 does not allow you to customize the reverse-engineering process. However, the upcoming EA 10 (currently in RC 1) will allow this, although I haven't looked at it myself so I don't know what form this will take.
Defining function pointers is outside the scope of the UML specification. What is more, it is a language-specific feature that is not supported by many UML modeling tools. So I think the general answer to your first question is to avoid this feature. The tricks you provided are relevant to Enterprise Architect only and are not compatible with other UML modeling tools. Here is how function pointers are supported in some other UML software:
MagicDraw UML uses <<C++FunctionPtr>> stereotypes for FP class members and <<C++FunctionSignature>> for function prototype.
Sample of code (taken from official site -- see "Modeling typedef and function pointer for C++ code generation" viewlet):
class Pointer
{
void (*f)(int i);
};
Corresponding UML model:
Objecteering defines FP attributes with corresponding C++ TypeExpr note.
Rational Software Architect from IBM doesn't support function pointers. Users might add them to the generated code in user-defined sections, which are left untouched during code->UML and UML->code transformations.
Seems correct to me. I'm not sure you should dive into the low-level details of describing the type and relation of your single function pointer. I usually find that describing an interface gives enough detail, without the need to decompose its internal elements.
I think you could virtually wrap the function pointer in a class. I don't think UML has to be a blueprint of the code; documenting the concept is more important.
My feeling is that you desire to map UML interfaces to the struct-with-function-pointers C idiom.
Interface1 is the important element in your model. Declaring function pointer object types all over the place will make your diagrams illegible.
Enterprise Architect allows you to specify your own code generators. Look for the Code Template Framework. You should be able to modify the preexisting code generator for C with the aid of a new stereotype or two.
I have been able to get something sort of working with Enterprise Architect. It's a bit of a hacky solution, but it meets my needs. What I did:
Create a new class stereotype named FuncPtr. I followed the guide here: http://www.sparxsystems.com/enterprise_architect_user_guide/10/extending_uml_models/addingelementsandmetaclass.html
When I did this I made a new view for the profile. So I can keep it contained outside of my main project.
Modified the Class code templates. Basically selecting the C language and start with the Class Template and hit the 'Add New Stereotype Override' and add in FuncPtr as a new override.
Add in the following code to that new template:
%PI="\n"%
%ClassNotes%
typedef %classTag:"returnType"% (*%className%)(
%list="Attribute" #separator=",\n" #indent=" "%
);
Modified the Attribute Declaration code template. Same way as before, adding in a new Stereotype
Add in the following code to the new template:
%PI=""% %attConst=="T" ? "const" : ""%
%attType%
%attContainment=="By Reference" ? "*" : ""%
%attName%
That's all that I had to do to get function pointers in place in Enterprise Architect. When I want to define a function pointer I just:
Create a regular class
Add in the tag 'returnType' with the type of return I want
Add in attributes for the parameters.
This way it'll create a new type that can be used for attributes or parameters in other classes (structures) and operations. I didn't make it an operation itself because then it couldn't have been referenced inside the tool as a type you can select.
So it's a bit hacky, using special stereotyped classes as typedefs to function pointers.
Like your first example, I would use a Classifier but hide it away in a profile. I think they've included it to make the explanation of the concept clearer; but in practice the whole idea of stereotypes is to abstract details away into profiles to avoid the 'noise' problem. EA is pretty good at handling Profiles.
Where I differ from your first example is that I would classify it with the Primitive Type stereotype, not the Data Type stereotype. Data Type is a Domain scope object, while Primitive Type is an atomic element with semantics defined outside the scope of UML. That is not to say you cannot add notes, especially in the profile, or give it a very clear stereotype name like functionPointer.

Central Typedefs.h file - is it a good idea?

Coming from the Java world, in which there are no typedefs, I have a question for C++ developers:
My task is to rewrite a large MATLAB project in C++. In order to get to know the structure of the code, I have started rebuilding the module and class structure without actually implementing the functionality.
I know that I frequently need classes/types like Vector and ParameterList, which will be provided by some framework I have not decided on yet.
So I created a central header file Typedefs.h in which I have type definitions like
typedef void Vector; // TODO: set vector class
typedef void ParameterList; // TODO: set parameter list class
For now, these are set to void, but I can use these types to write class skeletons and method signatures. Later I can replace them with the actual types.
Is this something that makes sense? If yes, is there a way to avoid manually including the Typedefs.h in every file?
I doubt this would work, unless you use, for example Vector*. You wouldn't be able to have Vector objects or parameters, so it's pretty much pointless.
And for use as a pointer, you can very well do a forward declaration.
Anyway, I don't really see the need for any of this. You can declare an empty class without having to implement it, and it's even easier to write than a typedef:
typedef void Vector;
vs
struct Vector{};
Note that you will not be able to overload functions with typedefs that map to the same type:
void foo(Vector);
void foo(ParameterList); // error: foo(void) already declared
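A minimal sketch of the empty-struct alternative, using the Vector and ParameterList names from the question. The foo overloads and the Solver class are hypothetical, added only to show that distinct placeholder types can be passed by value and overloaded.

```cpp
#include <cassert>

// Typedefs.h (sketch): empty placeholder types instead of typedefs to void.
// The bodies are meant to be replaced by the real framework types later.
struct Vector {};
struct ParameterList {};

// Distinct types, so by-value parameters and overloads both work.
int foo(Vector) { return 1; }
int foo(ParameterList) { return 2; }

// A class skeleton that uses the placeholders in its signatures.
class Solver {
public:
    Vector solve(const ParameterList&) const { return Vector(); }
};
```

If only pointers or references are needed, plain forward declarations (struct Vector;) would do as well.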

I've done a shady thing

Are (seemingly) shady things ever acceptable for practical reasons?
First, a bit of background on my code. I'm writing the graphics module of my 2D game. My module contains more than two classes, but I'll only mention two in here: Font and GraphicsRenderer.
Font provides an interface through which to load (and release) files and nothing much more. In my Font header I don't want any implementation details to leak, and that includes the data types of the third-party library I'm using. The way I prevent the third-party lib from being visible in the header is through an incomplete type (I understand this is standard practice):
class Font
{
private:
struct FontData;
boost::shared_ptr<FontData> data_;
};
GraphicsRenderer is the (read: singleton) device that initializes and finalizes the third-party graphics library and also is used to render graphical objects (such as Fonts, Images, etc). The reason it's a singleton is because, as I've said, the class initializes the third-party library automatically; it does this when the singleton object is created and exits the library when the singleton is destroyed.
Anyway, in order for GR to be able to render Font it must obviously have access to its FontData object. One option would be to have a public getter, but that would expose the implementation of Font (no other class other than Font and GR should care about FontData). Instead I considered it's better to make GR a friend of Font.
Note: Until now I've done two things that some may consider shady (singleton and friend), but these are not the things I want to ask you about. Nevertheless, if you think my rationale for making GR a singleton and a friend of Font is wrong please do criticize me and maybe offer better solutions.
The shady thing. So GR has access to Font::data_ through friendship, but how does it know exactly what a FontData is (since it's not defined in the header, it's an incomplete type)? I'll just show the code and the comment that includes the rationale...
// =============================================================================
// graphics/font.cpp
// -----------------------------------------------------------------------------
struct Font::FontData
: public sf::Font
{
// Just a synonym of sf::Font
};
// A redefinition of FontData exists in GraphicsRenderer::printText(),
// which will have to be modified as well if this definition is modified.
// (The redefinition is called FontDataSurogate.)
// Why not have FontData defined only once in a separate header:
// If the definition of FontData changes, most likely printText() text will
// have to be altered also regardless. Considering that and also that FontData
// has (and should have) a very simple definition, a separate header was
// considered too much of an overhead and of little practical advantage.
// =============================================================================
// graphics/graphics_renderer.cpp
// -----------------------------------------------------------------------------
void GraphicsRenderer::printText(const Font& fnt /* ... */)
{
struct FontDataSurogate
: public sf::Font {
};
FontDataSurogate* suro = (FontDataSurogate*)fnt.data_.get();
sf::Font& font = (sf::Font)(*suro);
// ...
}
So that's the shady thing I'm trying to do. Basically what I want is a review of my rationale, so please tell me if you think I've done something horrendous or if not confirm my rationale so I can be a bit surer I'm doing the right thing. :) (This is my biggest project yet and I'm only at the beginning so I'm kinda feeling things in the dark atm.)
In general, if something looks sketchy, I've found that it's often worth going back a few times and trying to figure out exactly why that's necessary. In most cases, some kind of fix pops up (maybe not as "nice", but not relying on any kind of trick).
Now, the first issue I see in your example is this bit of code:
struct FontDataSurogate
: public sf::Font {
};
occurs twice, in different files (neither being a header). That may come back and be a bother when you change one but not the other in the future, and making sure both are identical will very likely be a pain.
To solve that, I would suggest putting the definition of FontDataSurogate and the appropriate includes (whatever library/header defines sf::Font) in a separate header. Include that definition header from the two files that need to use FontDataSurogate (not from any other code files or headers, just those two).
If you have a main class declaration header for your library, place the forward declaration for the class there, and use pointers in your objects and parameters (regular pointers or shared pointers).
You can then use friend or add a get method to retrieve the data, but by moving the class definition to its own header, you've created a single copy of that code and have a single object/file that's interfacing with the other library.
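The layout suggested above could look roughly like this. It is a single-file sketch with comments marking where each part would live. thirdparty::Font is a stand-in for sf::Font so that the example compiles on its own, font_data.h is a hypothetical file name, and std::shared_ptr replaces boost::shared_ptr purely for self-containment.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the third-party type (sf::Font in the question).
namespace thirdparty { struct Font { std::string face; }; }

// --- graphics/font.h (public header: no third-party types visible) ---
class GraphicsRenderer;
class Font {
public:
    explicit Font(const std::string& face);
private:
    friend class GraphicsRenderer;
    struct FontData;                  // still an incomplete type here
    std::shared_ptr<FontData> data_;
};

// --- graphics/font_data.h (private header, included only by font.cpp
//     and graphics_renderer.cpp) ---
struct Font::FontData : public thirdparty::Font {};

// --- graphics/font.cpp ---
Font::Font(const std::string& face) : data_(std::make_shared<FontData>()) {
    data_->face = face;
}

// --- graphics/graphics_renderer.cpp ---
class GraphicsRenderer {
public:
    std::string printText(const Font& fnt) const {
        // No cast to a look-alike struct: the single real definition is visible.
        const thirdparty::Font& font = *fnt.data_;
        return font.face;
    }
};
```

With one definition there is nothing to keep in sync by hand, and the conversion to the base class is an ordinary implicit one.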
Edit:
You commented on the question while I was writing this, so I'll add on a reply to your comment.
"Too much overhead" - more to document, one more thing to include, the complexity of the code grows, etc.
Not so. You will have one copy of the code, compared to the two that must remain identical now. The code exists either way, so it needs to be documented, but complexity and particularly maintenance are reduced. You do gain two #include statements, but is that such a high cost?
"Little practical advantage" - printText() would have to be modified every time FontData is modified regardless of whether or not it's defined in a separate header or not.
The advantage is less duplicate code, making it easier to maintain for you (and others). Modifying the function when the input data changes is not surprising or unusual really. Moving it to another header doesn't cost you anything but the mentioned includes.
friend is fine, and encouraged. See C++ FAQ Lite's rationale for more info: Do friends violate encapsulation?
This line is indeed horrendous, as it invokes undefined behavior: FontDataSurogate* suro = (FontDataSurogate*)fnt.data_.get();
You forward declare the existence of the FontData struct, and then go on to fully declare it in two locations: Font, and GraphicsRenderer. Ew. Now you have to manually keep these exactly binary compatible.
I'm sure it works, but you're right, it is kind of shady. But whenever we say such-and-such is eeevil, we mean to avoid a certain practice, with the caveat that sometimes it can be useful. That being said, I don't think this is one of those times.
One technique is to invert your handling. Instead of putting all of the logic inside GraphicsRenderer, put some of it inside Font. Like so:
class Font
{
public:
void do_something_with_fontdata(GraphicsRenderer& gr) const; // const, so it can be called through a const Font&
private:
struct FontData;
boost::shared_ptr<FontData> data_;
};
void GraphicsRenderer::printText(const Font& fnt /* ... */)
{
fnt.do_something_with_fontdata(*this);
}
This way, the Font details are kept within the Font class, and even GraphicsRenderer doesn't need to know the specifics of the implementation. This solves the friend issue too (although I don't think friend is all that bad to use).
Depending on how your code is laid out, and what it's doing, attempting to invert it like this may be quite difficult. If that is the case, simply move the real declaration of FontData to its own header file, and use it in both Font and GraphicsRenderer.
You've spent a lot more effort asking this question than you've supposedly saved by duplicating that code.
You state three reasons you didn't want to add the file:
Extra include
Extra Documentation
Extra Complexity
But I would have to say that 2 and 3 are increased by duplicating that code. Now you have to document what it's doing in the original place, and what the fried monkey it's doing defined again in another random place in the code base. And duplicating code can only increase the complexity of a project.
The only thing you are saving is an include file. But files are cheap. You should not be afraid of creating them. There is almost zero cost (or at least there should be) to add a new header file.
The advantages of doing this properly:
The compiler is under no obligation to make the two definitions you give compatible with each other.
Someday, somebody is going to modify the FontData class without modifying PrintText(). Maybe they should modify PrintText(), but they either haven't done it yet or don't know that they need to. Or perhaps FontData will change in a way that simply hasn't occurred to you, because additional data on FontData turns out to make sense. Regardless, the different pieces of code will operate on different assumptions, and that will explode into a very hard-to-trace bug.

Hide class type in header

I'm not sure if this is even possible, but here goes:
I have a library whose interface is, at best, complex. Unfortunately, not only is it a 3rd-party library (and far too big to rewrite), I'm using a few other libraries that are dependent on it. So that interface has to stay how it is.
To solve that, I'm trying to essentially wrap the interface and bundle all the dependencies' interfaces into fewer, more logical classes. That part is going fine and works great. Most of the wrapper classes hold a pointer to an object of one of the original classes. Like so:
class Node
{
public:
String GetName()
{
return this->llNode->getNodeName();
}
private:
OverlyComplicatedNodeClass * llNode; // low-level node
};
My only problem is the secondary point of this. Beside simplifying the interface, I'd like to remove the requirement for linking against the original headers/libraries.
That's the first difficulty. How can I wrap the classes in such a way that there's no need to include the original headers? The wrapper will be built as a shared-library (dll/so), if that makes it simpler.
The original classes are pointers and not used in any exported functions (although they are used in a few constructors).
I've toyed with a few ideas, including preprocessor stuff like:
#ifdef ACCESSLOWLEVEL
# define LLPtr(n) n *
#else
# define LLPtr(n) void *
#endif
Which is ugly, at best. It basically does what I need, but I'd rather have a real solution than that kind of mess.
Some kind of pointer-type magic works, until I ran into a few functions that use shared pointers (some kind of custom SharedPtr<> class providing reference count) and worse yet, a few class-specific shared pointers derived from the basic SharedPtr class (NodePtr, for example).
Is it at all possible to wrap the original library in such a way as to require only my headers to be included in order to link to my dynamic library? No need to link to the original library or call functions from it, just mine. Only problem I'm running into are the types/classes that are used.
The question might not be terribly clear. I can try to clean it up and add more code samples if it helps. I'm not really worried about any performance overhead or anything of this method, just trying to make it work first (premature optimization and all that).
Use the Pimpl (pointer to implementation) idiom. As described, OverlyComplicatedNodeClass is an implementation detail as far as the users of your library are concerned. They should not have to know the structure of this class, or even its name.
When you use the Pimpl idiom, you replace the OverlyComplicatedNodeClass pointer in your class with a pointer to void. Only you, the library writer, need to know that the void* is actually an OverlyComplicatedNodeClass*. So your class declaration becomes:
class Node
{
public:
Node(); // defined in the library, see below
String GetName();
private:
void * impl;
};
In your library's implementation, initialize impl with a pointer to the class that does the real work:
my_lib.cpp
Node::Node()
: impl(new OverlyComplicatedNodeClass)
{
// ...
}
...and users of your library need never know that OverlyComplicatedNodeClass exists.
There's one potential drawback to this approach. All the code which uses the impl class must be implemented in your library. None of it can be inline. Whether this is a drawback depends very much on your application, so judge for yourself.
In the case of your class, you did have GetName()'s implementation in the header. That must be moved to the library, as with all other code that uses the impl pointer.
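A common variation on the void* version above, sketched here as an assumption rather than as part of the original answer, forward-declares a nested Impl struct and holds it in a std::unique_ptr. That keeps the pointer typed and frees it automatically. The stand-in OverlyComplicatedNodeClass and the use of std::string instead of String are only there so the sketch compiles on its own.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- node.h (the only header users of the library would include) ---
class Node {
public:
    Node();
    ~Node();                      // defined where Impl is a complete type
    std::string GetName() const;
private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl;
};

// --- my_lib.cpp (compiled into the shared library) ---
// Stand-in for the third-party class; in a real build this would come from
// the original library's headers, included only in this file.
class OverlyComplicatedNodeClass {
public:
    std::string getNodeName() const { return "root"; }
};

struct Node::Impl {
    OverlyComplicatedNodeClass llNode;  // low-level node
};

Node::Node() : impl(new Impl) {}
Node::~Node() = default;

std::string Node::GetName() const {
    return impl->llNode.getNodeName();
}
```

The destructor has to be declared in the header and defined in the source file, because std::unique_ptr needs the complete Impl type at the point where it deletes it.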
Essentially, you need a separate set of headers for each use. One that you use to build your DLL and one with only the exported interfaces, and no mention at all of the encapsulated objects. Your example would look like:
class Node
{
public:
String GetName();
};
You can use preprocessor statements to get both versions in the same physical file if you don't mind the mess.