Strategy for Wrapping external library return types - unit-testing

I have been asked to unit test some legacy code.
Currently, the code is tightly coupled with a 3rd party library both in terms of method calls and types used.
I am planning on writing a wrapper around the library in the form of a Façade design pattern which will aid in testability, create a cleaner interface for the rest of the code and allow me to swap out the library in the future if required.
This works fine where the method calls have a void return type, because the library functions are self-contained. But what if the existing code uses library-specific types? Here is an example:
LibrarySpecificType[] myVar = wrappedLibrary.DoX();
Although I have wrapped my library call in the above example, it still returns a library specific type, so it is still somewhat coupled.
Does anybody know a way around this?

You can create wrapper classes around the types that are returned and have the wrappedLibrary return those wrapped types instead. This might be quite a lot of work if each of those types also exposes methods which accept and return other library types. Something like this:
WrappedLibrarySpecificType[] myVar = wrappedLibrary.DoX();
The library wrapper will then have to call the actual library, wrap the type the library returns, and return the wrapped type.
This ends up being a rabbit hole, though, and you will probably need to wrap every type.
If this is a large library you might find some benefit in writing (or using) a tool which can generate the wrappers for you by reflecting over the types in the third-party library.
You might also get some assistance in generating the delegating members, depending on your IDE.
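For what it's worth, a minimal sketch of that idea in C++ (the question's snippet looks like C#, but the shape is the same; LibrarySpecificType's internals and CallRealLibraryDoX are made up here for illustration):

#include <vector>

// Hypothetical stand-in for the third-party type the library returns.
class LibrarySpecificType {
public:
    int Value() const { return 42; }
};

// Your own type: the rest of the code base only ever sees this.
class WrappedLibrarySpecificType {
public:
    explicit WrappedLibrarySpecificType(const LibrarySpecificType& inner)
        : inner_(inner) {}

    int Value() const { return inner_.Value(); }   // delegate to the real type

private:
    LibrarySpecificType inner_;
};

class WrappedLibrary {
public:
    // Wrap every element before it leaves the facade.
    std::vector<WrappedLibrarySpecificType> DoX() {
        std::vector<LibrarySpecificType> raw = CallRealLibraryDoX();
        std::vector<WrappedLibrarySpecificType> wrapped;
        wrapped.reserve(raw.size());
        for (const LibrarySpecificType& item : raw)
            wrapped.push_back(WrappedLibrarySpecificType(item));
        return wrapped;
    }

private:
    // Stand-in for the real library call.
    std::vector<LibrarySpecificType> CallRealLibraryDoX() {
        return std::vector<LibrarySpecificType>(3);
    }
};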

Related

Late Binding COM objects with C++Builder

We're interfacing to some 3rd party COM objects from a C++Builder 2010 application.
Currently we import the type library and generate component wrappers, and then are able to make method calls and access properties in a fairly natural way.
object->myProperty = 42;
object->doSomething(666);
However, we've been bitten by changes to the COM object's interface (which is still being extended and developed) causing our own app to fail, because some method GUIDs seem to get invalidated even if the only change to the interface has been the addition of a new method.
Late Binding has been suggested as a way of addressing this. I think this requires our code to be changed rather like this:
object.OlePropertySet("myProperty", 42);
object.OleProcedure("doSomething", 666);
Obviously this is painful to read and write, so we'd have to write wrapper classes instead.
Is there any way of getting late binding wrappers generated automatically when we import the type library? And, if so, are they smart enough to only do the textual binding once when the object is created, rather than on every single method call?
When you import a TypeLibrary for a COM object that supports late-binding (when it implements the IDispatch interface), the importer can generate separate wrapper classes (not components) for both static-binding and late-binding.
Adding a new method to an existing interface should not invalidate your code. Methods do not have GUIDs. However, for an IDispatch-based interface, its methods do have DISPID values associated with them, and those DISPID values can be changed from one release to another. Though any respectable COM developer should never do that once an interface definition has been locked in.
After deep investigation of the code and headers generated by the TLIBIMP, this turns out to be fairly easy.
If your type library has a class Foo, then after importing the type library you would typically use the auto-generated smart-pointer class IFooPtr:
{
    IFooPtr f;           // smart-pointer wrapper generated by the importer
    ...
    f->myMethod(1,2);    // -> operator: static (vtable) binding
}
You should note at this point that the bindings are static - that is, they depend not just on the GUIDs of the objects and the DISPIDs of the methods, but on the exact layout of the vtable in the DLL. Any change that affects the vtable - for instance, adding an additional method to a base class of Foo - will cause the method call to fail.
To use dynamic bindings, you can use the IFooDisp classes instead of IFooPtr. Again, these are smart wrappers, handling object lifetimes automatically. Note that with these classes you should use the . operator to access methods, not the indirection -> operator. Using the indirection operator will call the method, but via a static binding.
{
    IFooDisp f;          // dispatch-based (late-binding) wrapper
    ...
    f.myMethod(1,2);     // . operator: dispatched by DISPID at runtime
}
By using these IDispatch-based wrappers, methods will be dispatched by their DISPIDs even if the object's vtable layout is changed. I think these classes also give a way to dispatch by function name rather than DISPID, but I haven't confirmed the details of that.

A collection of custom structures (a wrapper) with a single member (also a custom structure), to a collection of the single members

The problem is specific but the solution open ended. I'm a lone coder looking to bat some ideas around with some fellow programmers.
I have a wrapper for a maths library. The wrapper provides the system with a consistent interface, while allowing me to switch in/out math libraries for different platforms. The wrapper contains a single member, so say for my Matrix4x4 wrapper class there is an api_matrix_4x4 structure as the only member of the wrapper.
My current target platform has a nifty little optimised library, with a few of those nifty functions requiring a C-style array of the wrapper's embedded member, while my wrapper functions for those math API functions don't want to expose that member type to the rest of the system. So we have a collection of wrappers (by reference/pointer) going into the function, & the members of the wrappers being needed in a collection inside the function so they can be passed to the math API.
I'm predominantly using C++, including C++11 features, & can also go C-style. Ideally I want a no-exception solution, & to avoid as many dynamic allocations as possible, if not all. My wrapper functions can use standard library arrays or vectors, or C-style pointers to arrays, as parameters, & whatever is necessary internally - just no dynamic casting (Run-Time Type Information).
1) Can I cast a custom struct/class containing a single custom struct, to the custom struct? If so, what about if it was a standard library collection of them. I'm thinking about type slicing here.
2) Would you perhaps use a template to mask the type passed to the function, although the implementation can only act on a single type (based on the math API used), or is such usage of templates considered bad?
3) Can you think of a nifty solution, perhaps involving swaps/move semantics/emplacement? If so, please help by telling me about it.
4) Or am I resigned to the obvious, iterate through one collection, taking the member out into another, then using that for the API function?
An example of what I am doing is given by the wrapper struct & wrapper function signature, & an example of what I am trying to avoid doing is given by the function implementation:
struct Vector3dWrapper
{
    API_Specific_Vector_3d m_api_vector_3d;

    inline void operation_needing_vector_3d_wrappers(std::vector<Vector3dWrapper>& vectors)
    {
        // Now need a collection of API_Specific_Vector_3ds
        std::vector<API_Specific_Vector_3d> api_vectors;
        try
        {
            api_vectors.reserve(vectors.size());
            for (auto vectors_itr = vectors.begin(); vectors_itr != vectors.end(); ++vectors_itr)
            {
                // fill each wrapper's m_api_vector_3d into api_vectors
                api_vectors.push_back(vectors_itr->m_api_vector_3d);
            }
        }
        catch (std::bad_alloc& e)
        {
            // handle... though in reality, try/catch is done elsewhere in the system.
        }
        // Signature is API_Multiply_Vectors_With_Matrix_And_Project(API_Specific_Vector_3d* vectors, size_t vector_count)
        API_Multiply_Vectors_With_Matrix_And_Project(api_vectors.data(), api_vectors.size());
    }
};
You can cast a standard-layout struct (such as a struct compatible with C) to its first member, but what's the point? Just access the first member and apply &.
Templates usually allow uniform parameterization over a set of types. You can write a template that's only instantiated once, but again that seems pointless. What you really want is a different interface library for each platform. Perhaps templates could help define common code shared between them. Or you could do the same in plain C by setting typedefs before #include.
Solution to what? The default copy and move semantics should work for flat, C-style structs containing numbers. As for deep copies, if the underlying libraries have pointer-based structures, you need to be careful and implement all the semantics you'll need. Safe… simple… default… "nifty" sounds dirty.
Not sure I understand what you're doing with collections. You mean that every function requires its parameters to be first inserted into a generic container object? Constructing containers sounds expensive. Your functions should parallel the functions in the underlying libraries as well as possible.
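Coming back to point 1, a minimal sketch of what "just access the first member and apply &" looks like for a standard-layout wrapper (the type names here are illustrative, and treating a contiguous array of wrappers as an array of members is only reasonable when the sizes really match):

#include <cstddef>
#include <type_traits>
#include <vector>

struct ApiVector3d { float x, y, z; };           // stand-in for the API type
struct Vector3dWrapper { ApiVector3d m_api; };   // standard-layout, single member

// Stand-in for an API call that wants a C-style array of the raw type.
void api_call(const ApiVector3d* /*vectors*/, std::size_t /*count*/) {}

void use(std::vector<Vector3dWrapper>& wrappers)
{
    static_assert(std::is_standard_layout<Vector3dWrapper>::value,
                  "first-member access relies on standard layout");
    static_assert(sizeof(Vector3dWrapper) == sizeof(ApiVector3d),
                  "treating wrappers as raw elements also needs identical size");

    // No cast needed: take the address of the first member of the first element.
    // Strictly speaking, walking past the first element this way is not blessed
    // by the standard, but it is what the "cast the wrapper" idea amounts to in
    // practice, and it avoids the copy loop shown in the question.
    if (!wrappers.empty())
        api_call(&wrappers[0].m_api, wrappers.size());
}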

STL Containers and Binary Interface Compatibility

STL Binary Interfaces
I'm curious to know if anyone is working on compatible interface layers for STL objects across multiple compilers and platforms for C++.
The goal would be to support STL types as first-class or intrinsic data types.
Is there some inherent design limitation imposed by templating in general that prevents this? This seems like a major limitation of using the STL for binary distribution.
Theory - Perhaps the answer is pragmatic
Microsoft has put effort into .NET and doesn't really care about C++ STL support being "first class".
Open-source doesn't want to promote binary-only distribution and focuses on getting things right with a single compiler instead of a mishmash of 10 different versions.
This seems to be supported by my experience with Qt and other libraries - they generally provide a build for the environment you're going to be using. Qt 4.6 and VS2008 for example.
References:
http://code.google.com/p/stabi/
Binary compatibility of STL containers
I think the problem precedes your theory: C++ doesn't specify the ABI (application binary interface).
In fact even C doesn't, but since a C library is just a collection of functions (and maybe global variables), the ABI is essentially just the names of the functions themselves. Depending on the platform, names can be mangled somehow, but since every compiler must be able to place system calls, everything ends up using the convention of the operating system vendor (on Windows, _cdecl just results in prepending a _ to the function name).
But C++ has overloading, hence more complex mangling schemes are required.
As of today, no agreement exists between compiler manufacturers about how such mangling must be done.
It is technically impossible to compile a C++ static library and link it to a C++ OBJ coming from another compiler. The same goes for DLLs.
And since compilers already differ even on compiled overloaded member functions, no one is actually tackling the problem of templates.
It CAN technically be addressed by introducing a redirection for every parametric type and introducing dispatch tables, but this makes templated functions no different (in terms of call dispatching) from virtual functions of virtual bases, thus making template performance similar to classic OOP dispatching (although it can limit code bloat ... the trade-off is not always obvious).
Right now, it seems there is no interest among compiler manufacturers in agreeing on a common standard, since it would sacrifice all the performance differences each manufacturer gains from its own optimizations.
C++ templates are compile-time generated code.
This means that if you want to use a templated class, you have to include its header (declaration) so the compiler can generate the right code for the templated class you need.
So, templates can't be pre-compiled to binary files.
What other libraries give you is pre-compiled base-utility classes that aren't templated.
C# generics for example are compiled into IL code in the form of dlls or executables.
But IL code is effectively just another programming language, so the compiler can read the generics information from the included library.
.Net IL code is compiled into actual binary code at runtime, so the compiler at runtime has all the definitions it needs in IL to generate the right code for the generics.
I'm curious to know if anyone is working on compatible interface layers for STL objects across multiple compilers and platforms for C++.
Yes, I am. I am working on a layer of standardized interfaces which you can (among other things) use to pass binary safe "managed" references to instances of STL, Boost or other C++ types across component boundaries. The library (called 'Vex') will provide implementations of these interfaces plus proper factories to wrap and unwrap popular std:: or boost:: types. Additionally the library provides LINQ-like query operators to filter and manipulate contents of what I call Range and RangeSource. The library is not yet ready to "go public" but I intend to publish an early "preview" version as soon as possible...
Example:
com1 passes a reference to std::vector<uint32_t> to com2:
com1:
class Com1 : public ICom1 {
    std::vector<uint32_t> mVector;
    virtual void SendDataTo(ICom2* pI)
    {
        pI->ReceiveData(Vex::Query::From(mVector) | Vex::Query::ToInterface());
    }
};
com2:
class Com2 : public ICom2 {
    virtual void ReceiveData(Vex::Ranges::IRandomAccessRange<uint32_t>* pItf)
    {
        std::deque<uint32_t> tTmp;
        // filter even numbers, reverse order and process data with STL or Boost algorithms
        Vex::Query::From(pItf)
            | Vex::Query::Where([](uint32_t _) { return _ % 2 == 0; })
            | Vex::Query::Reverse
            | Vex::ToStlSequence(std::back_inserter(tTmp));
        // use tTmp...
    }
};
You will recognize the relationship to various familiar concepts: LINQ, Boost.Range, any_iterator and D's Ranges... One of the basic intents of 'Vex' is not to reinvent the wheel - it only adds that interface layer plus some required infrastructure and syntactic sugar for queries.
Cheers,
Paul

Modifying SWIG Interface file to Support C void* and structure return types

I'm using SWIG to generate my JNI layer for a large set of C APIs, and I was wondering what the best practices are for the situations below. They pertain not only to SWIG but to JNI in general.
When C functions return pointers to structures, should the SWIG interface file (JNI logic) be heavily used, or should C wrapper functions be created to return the data in pieces (i.e. a char array that contains the various data elements)?
When C functions return void*, should the C APIs be modified to return the actual data type, whether it be a primitive or structure type?
I'm unsure if I want to add a massive amount of logic and create a middle layer (SWIG interface file/JNI logic). Thoughts?
My approach to this in the past has been to write as little code as possible to make it work. When I have to write code to make it work I write it in this order of preference:
Write as C or C++ in the original library - everyone can use this code, you don't have to write anything Java or SWIG specific (e.g. add more overloads in C++, add more versions of functions in C, use return types that SWIG knows about in them)
Write more of the target language - supply "glue" to bring some bits of the library together. In this case that would be Java.
It doesn't really matter if this is "pure" Java, outside of SWIG altogether, or as part of the SWIG interface file from my perspective. Users of the Java interface shouldn't be able to distinguish the two. You can use SWIG to help avoid repetition in a number of cases though.
Write some JNI through SWIG typemaps. This is ugly, error prone if you're not familiar with writing it, harder to maintain (arguably) and only useful to SWIG+Java. Using SWIG typemaps does at least mean you only write it once for every type you wrap.
The times I'd favour this over 2. are one or more of:
When it comes up a lot (saves repetitious coding)
I don't know the target language at all, in which case using the language's C API probably is easier than writing something in that language
The users will expect this
Or it just isn't possible to use the previous styles.
Basically, these guidelines try to deliver functionality to as many users of the library as possible, whilst minimising the amount of extra, target-language-specific code you have to write and reducing its complexity when you do have to write it.
For a specific case of sockaddr_in*:
Approach 1
The first thing I'd try and do is avoid wrapping anything more than a pointer to it. This is what swig does by default with the SWIGTYPE_p_sockaddr_in thing. You can use this "unknown" type in Java quite happily if all you do is pass it from one thing to another, store in containers/as a member etc., e.g.
public static void main(String[] argv) {
Module.takes_a_sockaddr(Module.returns_a_sockaddr());
}
If that doesn't do the job you could do something like write another function, in C:
const char * sockaddr2host(struct sockaddr_in *in); // Some code to get the host as a string
unsigned short sockaddr2port(struct sockaddr_in *in); // Some code to get the port
This isn't great in this case though - you've got some complexity to handle there with address families that I'd guess you'd rather avoid (that's why you're using sockaddr_in in the first place), but it's not Java specific, it's not obscure syntax and it all happens automatically for you besides that.
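For what it's worth, a minimal sketch of what those two helpers might look like (IPv4 only, ignoring the address-family complexity mentioned above; note inet_ntoa returns a pointer to a static buffer, so this is not thread-safe):

#include <arpa/inet.h>
#include <netinet/in.h>

const char * sockaddr2host(struct sockaddr_in *in)
{
    return inet_ntoa(in->sin_addr);    /* dotted-quad string, static buffer */
}

unsigned short sockaddr2port(struct sockaddr_in *in)
{
    return ntohs(in->sin_port);        /* convert from network byte order */
}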
Approach 2
If that still isn't good enough then I'd start to think about writing a little bit of Java - you could expose a nicer interface by hiding the SWIGTYPE_p_sockaddr_in type as a private member of your own Java type, and wrapping the call to the function that returns it in some Java that constructs your type for you, e.g.
public class MyExtension {
private MyExtension() { }
private SWIGTYPE_p_sockaddr_in detail;
public static MyExtension native_call() {
MyExtension e = new MyExtension();
e.detail = Module.real_native_call();
return e;
}
public void some_call_that_takes_a_sockaddr() {
Module.real_call(detail);
}
}
No extra SWIG to write, no JNI to write. You could do this through SWIG using %pragma(modulecode) to make it all overloads on the actual Module SWIG generates - this probably feels more natural to the Java users (it doesn't look like a special case) and isn't really any more complex. The hard work is still being done by SWIG; this just provides some polish that avoids repetitious coding on the Java side.
Approach 3
This would basically be the second part of my previous answer. It's nice because it looks and feels native to the Java users and the C library doesn't have to be modified either. In essence the typemap provides a clean-ish syntax for encapsulating the JNI calls for converting from what Java users expect to what C works with and neither side knows about the other side's outlook.
The downside, though, is that it is harder to maintain and really hard to debug. My experience has been that SWIG has a steep learning curve for things like this, but once you reach the point where it doesn't take too much effort to write typemaps like that, the power they give you through re-use and encapsulation of the C type->Java type mapping is very useful.
If you're part of a team but the only person who really understands the SWIG interface, then that puts a big "what if you get hit by a bus?" factor on the project as a whole. (Probably quite good for making you unfirable though!)

C/C++ Dynamic loading of functions with unknown prototype

I'm in the process of writing a kind of runtime system/interpreter, and one of the things that I need to be able to do is call C/C++ functions located in external libraries.
On Linux I'm using the dlfcn.h functions to open a library and call a function located within. The problem is that, when using dlsym(), the function pointer returned needs to be cast to an appropriate type before being called, so that the function arguments and return type are known; however, if I'm calling some arbitrary function in a library then obviously I will not know this prototype at compile time.
So what I'm asking is: is there a way to call a dynamically loaded function, pass it arguments, and retrieve its return value without knowing its prototype?
So far I've come to the conclusion there is no easy way to do this, but some workarounds that I've found are:
Ensure all the functions I want to load have the same prototype, and provide some sort of mechanism for these functions to retrieve parameters and return values. This is what I am doing currently.
Use inline asm to push the parameters onto the stack, and to read the return value. I really want to steer clear of doing this if possible!
If anyone has any ideas then it would be much appreciated.
Edit:
I have now found exactly what I was looking for:
http://sourceware.org/libffi/
"A Portable Foreign Function Interface Library"
(Although I’ll admit I could have been clearer in the original question!)
What you are asking is whether C/C++ supports reflection for functions (i.e. getting information about their type at runtime). Sadly, the answer is no.
You will have to make the functions conform to a standard contract (as you said you were doing), or start implementing mechanics for trying to call functions at runtime without knowing their arguments.
Since having no knowledge of a function makes it impossible to call it, I assume your interpreter/"runtime system" at least has some user input or similar it can use to deduce that it's trying to call a function that will look like something taking those arguments and returning something not entirely unexpected. That lookup is hard to implement in itself, even with reflection and a decent runtime type system to work with. Mix in calling conventions, linkage styles, and platforms, and things get nasty real soon.
Stick to your plan, enforce a well-defined contract for the functions you load dynamically, and hopefully make do with that.
Can you add a dispatch function to the external libraries, e.g. one that takes a function name and N (optional) parameters of some sort of variant type and returns a variant? That way the dispatch function prototype is known. The dispatch function then does a lookup (or a switch) on the function name and calls the corresponding function.
Obviously it becomes a maintenance problem if there are a lot of functions.
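A minimal sketch of that idea (the PluginValue variant type and the plugin_dispatch entry point are made-up names for illustration, not an existing API):

#include <cstring>

// Crude tagged-union "variant"; a real system would want strings, errors, etc.
struct PluginValue {
    enum Tag { INT, DOUBLE } tag;
    union { int i; double d; };
};

// The single well-known prototype every library exports.
extern "C" PluginValue plugin_dispatch(const char* name,
                                       const PluginValue* args, int argc)
{
    PluginValue result;
    result.tag = PluginValue::INT;
    result.i = 0;

    if (std::strcmp(name, "add") == 0 && argc == 2) {
        result.i = args[0].i + args[1].i;   // dispatch by function name
    }
    // ... one branch (or table entry) per function the library exposes ...
    return result;
}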
I believe the Ruby FFI library achieves what you are asking. It can call functions in external dynamically linked libraries without specifically linking them in.
http://wiki.github.com/ffi/ffi/
You probably can't use it directly in your scripting language, but perhaps the ideas are portable.
--
Brad Phelan
http://xtargets.heroku.com
I'm in the process of writing a kind of runtime system/interpreter, and one of the things that I need to be able to do is call C/C++ functions located in external libraries.
You can probably check, for example, how Tcl and Python do that. If you are familiar with Perl, you can also check Perl XS.
The general approach is to require an extra gateway library sitting between your interpreter and the target C library. From my experience with Perl XS, the main reasons are memory management/garbage collection and the C data types which are hard or impossible to map directly onto the interpreter's language.
So what I'm asking is: is there a way to call a dynamically loaded function, pass it arguments, and retrieve its return value without knowing its prototype?
None known to me.
Ensure all the functions I want to load have the same prototype, and provide some sort of mechanism for these functions to retrieve parameters and return values. This is what I am doing currently.
This is what another team on my project is doing too. They have standardized the API for external plug-ins on something like this:
typedef std::list< std::string > string_list_t;
string_list_t func1(string_list_t stdin, string_list_t &stderr);
Common tasks for the plug-ins are to perform transformation, mapping or expansion of the input, often using an RDBMS.
Previous versions of the interface grew unmaintainable over time, causing problems to customers, product developers and 3rd-party plug-in developers alike. Frivolous use of std::string is allowed by the fact that the plug-ins are called relatively seldom (and the overhead is still peanuts compared to the SQL used all over the place). The stdin argument is populated with input depending on the plug-in type. A plug-in call is considered failed if any string inside the output parameter stderr starts with 'E:' ('W:' is for warnings; the rest is silently ignored and thus can be used for plug-in development/debugging).
dlsym is used only once, on a function with a predefined name, to fetch from the shared library an array with the function table (public function name, type, pointer, etc.).
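A rough sketch of that single-dlsym registration pattern (the entry-point name plugin_get_table, the library path and the PluginFunc layout are illustrative, not the actual project's API):

#include <dlfcn.h>
#include <cstdio>

struct PluginFunc {
    const char* name;   // public name used by the interpreter
    int         type;   // plug-in specific type tag
    void*       ptr;    // cast to the real signature by agreed convention
};

int main() {
    void* lib = dlopen("./libplugin.so", RTLD_NOW);   // illustrative path
    if (!lib) return 1;

    // One well-known symbol returns the whole table; no per-function dlsym.
    typedef const PluginFunc* (*get_table_fn)(int* count);
    get_table_fn get_table =
        reinterpret_cast<get_table_fn>(dlsym(lib, "plugin_get_table"));
    if (!get_table) { dlclose(lib); return 1; }

    int count = 0;
    const PluginFunc* table = get_table(&count);
    for (int i = 0; i < count; ++i)
        std::printf("exported: %s\n", table[i].name);

    dlclose(lib);
    return 0;
}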
My solution is to define a generic proxy function which converts the dynamic function to a uniform prototype, something like this:
#include <string>
#include <functional>
using result = std::function<std::string(std::string)>;
template <class F>
result proxy(F func) {
// some type-traits technologies based on func type
}
In the user-defined file, you must add a definition to do the conversion:
double foo(double a) { /*...*/ }
auto local_foo = proxy(foo);
In your runtime system/interpreter, you can then use dlsym to look up the proxied foo function. It is the user-defined function foo's responsibility to do the calculation.
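As a sketch of how proxy might be fleshed out for one family of signatures (a single numeric argument and numeric return; purely illustrative, not a complete solution):

#include <functional>
#include <string>

using result = std::function<std::string(std::string)>;

// One overload per supported shape; type traits could generalise this further.
template <class Ret, class Arg>
result proxy(Ret (*func)(Arg)) {
    return [func](std::string input) -> std::string {
        Arg a = static_cast<Arg>(std::stod(input));   // parse the argument
        return std::to_string(func(a));               // stringify the result
    };
}

double foo(double a) { return a * 2.0; }

// The interpreter only ever deals with the uniform string-to-string type:
// auto local_foo = proxy(foo);
// local_foo("21")  yields  "42.000000"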