Expose a vector as a memoryview using SWIG - c++

I have a header file like:
#include <cstdint>
#include <vector>
inline std::vector<uint8_t>& vec() {
    static std::vector<uint8_t> v { 'a', 'b', 'c', 'd' };
    return v;
}
inline const std::vector<uint8_t>& cvec() {
    return vec();
}
I can wrap it in SWIG using std_vector.i and pyabc.i, but that is quite inefficient (there's a jump between C++ and Python code for every access), and given that these are literally just a bunch of bytes I ought to be able to wrap them with Python's memoryview interface.
How can I expose my std::vector<uint8_t> as a Python memoryview?

Exposing it as a memoryview requires creating a Py_buffer first. In Python 3.3+ there is a convenient helper function, PyMemoryView_FromMemory, that does a lot of the work for us. In earlier versions, though, we'll need to take a few extra steps, so our basic out typemap looks like:
%typemap(out) std::vector<uint8_t>&, const std::vector<uint8_t>& {
    Py_buffer *buf = (Py_buffer*)malloc(sizeof *buf);
    const bool ro = info<$1_type>::is_readonly();
    if (PyBuffer_FillInfo(buf, NULL, (void*)&((*$1)[0]), (*$1).size(), ro, PyBUF_ND)) {
        // error, handle
    }
    $result = PyMemoryView_FromBuffer(buf);
}
Here we're basically allocating some memory for the Py_buffer. This just contains the details of the buffer internally for Python. The memory we allocate will be owned by the memoryview object once it's created. Unfortunately, since it's going to be released with a call to free(), we need to allocate it with malloc(), even though it's C++ code.
Besides the Py_buffer and an optional PyObject, PyBuffer_FillInfo takes a void* (the buffer itself), the size of the buffer, a boolean indicating whether it's read-only, and a flags argument. In this case our flag simply indicates that we have provided C-style contiguous memory for the buffer.
To decide whether it is read-only we use SWIG's built-in $1_type variable and a helper (which could be a C++11 type trait if we wanted).
To complete our SWIG interface we need to provide that helper and include the header file, so the whole thing becomes:
%module test
%{
#include "test.hh"
namespace {
    template <typename T>
    struct info {
        static bool is_readonly() {
            return false;
        }
    };
    template <typename T>
    struct info<const T&> {
        static bool is_readonly() {
            return true;
        }
    };
}
%}
%typemap(out) std::vector<uint8_t>&, const std::vector<uint8_t>& {
    Py_buffer *buf = (Py_buffer*)malloc(sizeof *buf);
    const bool ro = info<$1_type>::is_readonly();
    if (PyBuffer_FillInfo(buf, NULL, (void*)&((*$1)[0]), (*$1).size(), ro, PyBUF_ND)) {
        // error, handle
    }
    $result = PyMemoryView_FromBuffer(buf);
}
%include "test.hh"
We can then test it with:
import test
print test.vec()
print len(test.vec())
print test.vec()[0]
print test.vec().readonly
test.vec()[0]='z'
print test.vec()[0]
print "This should fail:"
test.cvec()[0] = 0
This works as expected, tested using Python 2.7.
Compared to just wrapping it with std_vector.i this approach does have some drawbacks, the biggest being that we can't resize the vector or trivially convert it back to a vector later. We could work around that, at least partially, by creating a SWIG proxy for the vector as normal and using the second parameter of PyBuffer_FillInfo to store it internally. (This would also be needed if we had to manage the ownership of the vector, for instance.)

Related

C++ template to read value from member variable or member function

I am writing a code generator and using flatbuffers for generating classes. The rest of the code generator will work with these classes in C++.
I have not been able to figure out how to keep the API consistent for reading data from the two different kinds of classes that flatbuffers may generate. I am using the object API (testRecordT) in the example for whenever an object needs to be written to (and can be read back as well), and the flatbuffers overlay for when the data can only be read.
I have not been able to get any template or free functions to work to give me a consistent API that would work in both cases.
Below is a snippet of what I am trying to get to work.
struct testRecordT {
    int32_t field1;
    std::string field2;
};
struct testRecord {
    int32_t field1() const {
        return 0;
        // flatbuffer generated - return GetField<int32_t>(VT_FIELD1, 0);
    }
    const flatbuffers::String *field2() const {
        return nullptr;
        // flatbuffer generated - return GetPointer<const flatbuffers::String *>(VT_FIELD3);
    }
};
void Test() {
    testRecordT * members;        // assume pointers are valid
    testRecord * memberFunctions;
    // Need to be able to create a read function/template that would work. This would
    // simplify the code generation a lot. I can generate either one below, as long as
    // it is consistent in both cases.
    auto r = read(members->field1);         // or read(members, field1)
    auto v = read(memberFunctions->field1); // or read(memberFunctions, field1)
}
The read functions or template functions should be consistent. Any pointers or thoughts would be helpful. I am using C++17 with gcc 7.3.1.
You can use std::invoke for this. It can both call functions or access members.
auto r = std::invoke(&testRecordT::field1, members);
auto v = std::invoke(&testRecord::field1, memberFunctions);
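A self-contained illustration (the structs are trimmed to the bare minimum from the question and the field values are made up); this compiles with C++17:
#include <cstdint>
#include <functional> // std::invoke
#include <iostream>
struct testRecordT {
    int32_t field1;
};
struct testRecord {
    int32_t field1() const { return 42; }
};
int main() {
    testRecordT objectApi{7};
    testRecord overlay;
    testRecordT *members = &objectApi;
    testRecord *memberFunctions = &overlay;
    // Pointer-to-data-member: std::invoke reads the member through the pointer...
    auto r = std::invoke(&testRecordT::field1, members);        // 7
    // ...pointer-to-member-function: std::invoke calls it through the pointer.
    auto v = std::invoke(&testRecord::field1, memberFunctions); // 42
    std::cout << r << " " << v << "\n";
}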

Dynamic Function Args for Callback / RPC in C++

I need to register functions like the following in a list of functions with arguments.
void func1( int a , char* b ) {}
void func2( vec3f a , std::vector<float> b , double c) {}
...
And call them back with the proper arguments when I receive data over the network. I imagined va_list would solve this, but it doesn't work:
void func1(int a, char* b)
{
    printf("%d %s", a, b);
}
void prepare(...)
{
    va_list argList;
    int args = 2;
    va_start(argList, args);
    ((void (*)(va_list))func1)(argList);
    va_end(argList);
}
int main(int argc, char **argv)
{
    prepare(1, "huhu");
    return 0;
}
What is the most elegant way to solve this?
I know std::bind / std::function has similar abilities, but the internal data is hidden deep in std, I assume. I just need a few basic data types; it doesn't have to work for arbitrary types. If preprocessor tricks with __VA_ARGS__ or using templates would solve it, I am also OK with that. The priority is that it is as simple as possible to use.
Edit 1: I found that assembly can solve this (How do I pass arguments to C++ functions when I call them from inline assembly), but I would prefer a more platform-independent solution.
If your goal is to create your own small, ad-hoc "rpc" solution, two of the major drivers for your decisions should probably be: 1. a minimal amount of code, and 2. keeping it as easy as possible.
Keeping that in mind, it pays off to consider the difference between the following two scenarios:
"Real" RPC: The handlers shall be as you wrote with rpc-method-specific signature.
"Message passing": The handlers receive messages of either "end point-determined type" or simply of a unified message type.
Now, what has to be done to get a solution of type 1?
Incoming byte streams/network packets need to be parsed into some sort of message according to some chosen protocol. Then, using some meta-info (the contract), according to { serviceContract, serviceMethod }, a specific set of data items needs to be confirmed in the packet and, if present, the respective registered handler function needs to be called. Somewhere within that infrastructure you typically have a (likely code-generated) function which does something like this:
void CallHandlerForRpcXYCallFoo( const RpcMessage *message )
{
    uint32_t arg0 = message->getAsUint32(0);
    // ...
    float argN = message->getAsFloat(N);
    Foo( arg0, arg1, ... argN );
}
All that can, of course, also be packed into classes and virtual methods, with the classes being generated from the service contract meta data. Maybe there is also a way, by means of some excessive template voodoo, to avoid generating code and have a more generic meta-implementation. But all that is work, real work. Way too much work to do just for fun. Instead of doing that, it would be easier to use one of the dozens of technologies which do that already.
Worth noting so far is: Somewhere within that piece of art, there is likely a (code generated) function which looks like the one given above.
Now, what has to be done to get a solution of type 2?
Less than for case 1. Why? Because you simply stop your implementation at calling those handler methods, which all take the RpcMessage as their single argument. As such, you can get away without generating the "make-it-look-like-a-function-call" layer above those methods.
Not only is it less work, it is also more robust in scenarios where the contract changes. If one more data item is added to the "rpc solution", the signature of the "rpc function" MUST change: code gets re-generated and application code gets adapted, whether or not the application needs that new data item. With approach 2, on the other hand, there are no breaking changes in the code. Of course, depending on your choices and the kind of changes in the contract, it could still break.
So, the most elegant solution is: Don't do RPC, do message passing. Preferably in a REST-ful way.
Also, if you prefer a "unified" rpc message over a number of rpc-contract specific message types, you remove another reason for code bloat.
In case what I say seems a bit too abstract, here is some mock-up dummy code sketching solution 2:
#include <cstdio>
#include <cstdint>
#include <map>
#include <vector>
#include <deque>
#include <functional>
// "rpc" infrastructure (could be an API for a dll or a lib or so):
// Just one way to do it. Somehow, your various data types need
// to be handled/represented.
class RpcVariant
{
public:
    enum class VariantType
    {
        RVT_EMPTY,
        RVT_UINT,
        RVT_SINT,
        RVT_FLOAT32,
        RVT_BYTES
    };
private:
    VariantType m_type;
    uint64_t m_uintValue;
    int64_t m_intValue;
    float m_floatValue;
    std::vector<uint8_t> m_bytesValue;
    explicit RpcVariant(VariantType type)
        : m_type(type)
    {
    }
public:
    static RpcVariant MakeEmpty()
    {
        RpcVariant result(VariantType::RVT_EMPTY);
        return result;
    }
    static RpcVariant MakeUint(uint64_t value)
    {
        RpcVariant result(VariantType::RVT_UINT);
        result.m_uintValue = value;
        return result;
    }
    // ... More make-functions
    uint64_t AsUint() const
    {
        // TODO: check if correct type...
        return m_uintValue;
    }
    // ... More AsXXX() functions
    // ... Some ToWire()/FromWire() functions...
};
typedef std::map<uint32_t, RpcVariant> RpcMessage_t;
typedef std::function<void(const RpcMessage_t *)> RpcHandler_t;
void RpcInit();
void RpcUninit();
// application writes handlers and registers them with the infrastructure.
// rpc_context_id can be anything opportune - chose uint32_t, here.
// could as well be a string or a pair of values (service,method) or whatever.
void RpcRegisterHandler(uint32_t rpc_context_id, RpcHandler_t handler);
// Then according to taste/style preferences some receive function which uses
// the registered information and dispatches to the handlers...
void RpcReceive();
void RpcBeginReceive();
void RpcEndReceive();
// maybe some sending, too...
void RpcSend(uint32_t rpc_context_id, const RpcMessage_t * message);
int main(int argc, const char * argv[])
{
    RpcInit();
    RpcRegisterHandler(42, [](const RpcMessage_t *message) { puts("message type 42 received."); });
    RpcRegisterHandler(43, [](const RpcMessage_t *message) { puts("message type 43 received."); });
    while (true)
    {
        RpcReceive();
    }
    RpcUninit();
    return 0;
}
And if the RpcMessage is then passed around wrapped in a std::shared_ptr, you can even have multiple handlers, or forward the same message instance to other threads. That is one of the particularly annoying things that would need yet another round of "serializing" in the rpc approach; here, you simply forward the message.
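A small sketch of that idea, with a placeholder message type and a bare global registry rather than the RpcRegisterHandler API from the mock-up above:
#include <cstdint>
#include <cstdio>
#include <functional>
#include <map>
#include <memory>
#include <vector>
struct RpcMessage { /* payload as in the mock-up */ };
using MessagePtr = std::shared_ptr<const RpcMessage>;
using Handler = std::function<void(MessagePtr)>;
std::map<uint32_t, std::vector<Handler>> g_handlers;
void RegisterHandler(uint32_t rpc_context_id, Handler handler) {
    g_handlers[rpc_context_id].push_back(std::move(handler));
}
void Dispatch(uint32_t rpc_context_id, MessagePtr message) {
    // Every handler sees the same instance; any of them may hand the
    // shared_ptr to another thread without copying or re-serializing.
    for (auto &h : g_handlers[rpc_context_id])
        h(message);
}
int main() {
    RegisterHandler(42, [](MessagePtr) { std::puts("logger saw message 42"); });
    RegisterHandler(42, [](MessagePtr) { std::puts("worker saw message 42"); });
    Dispatch(42, std::make_shared<const RpcMessage>());
}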

Using var_arg to pass parameters for function calls

I am writing an adapter to combine two APIs (one in C and another in C++).
If a function is called on the one API I need to pass the callers ID and the function's arguments to an adapter and call the according function with this information passed.
Now apparently they cannot be mapped directly, as one interface requires C++ compilation and the name mangling would break the other; that is why I am using a set of adapters in the first place.
As the number of arguments varies, I looked up variadic functions and found the idea pretty useful; however, I am operating on POD only and have to deal with structs, enums and a lot of different arguments per call, which might need to be put back into a struct before feeding it to the target function.
Every example I stumbled upon was far simpler and involved mostly arithmetic operations like summing things up, finding the largest number or printing, mostly done with for loops over the va_list.
Maybe I got stuck on the idea and it won't work at all, but I am just curious...
Say I wanted to assign the arguments from the list to my target functions parameters (the order of the arguments passed is the correct one), what would be a good way?
BOOL Some_Function(
    /* in */ CallerId *pObjectId,
    /* in */ someDataType argument1 )
{
    BOOL ret = Adapter_Call(pFunction, pObjectId, argument1);
    return ret;
}
and so once I made it to the right adapter I want to do
BOOL Adapter_Call(*pFunction, *pObjectId, argument1, ...)
{
    va_list args;
    va_start(args, argument1);
    /* go over list and do `var_list[i] = pFunctionArgList[i]`, which is
       of whatever type, so I can use it as input for my function */
    va_end(args);
    pObjectId.pFunction(arg1, ..., argn);
}
Can I access the input parameters of a function to perform assignments like this?
Has anyone done something like this before? Is there a conceptual mistake in my thinking?
All I found on the net was this: http://www.drdobbs.com/cpp/extracting-function-parameter-and-return/240000586 but due to the use of templates I am not sure whether it wouldn't create another problem, so in the end implementing an adapter for each and every single function call may be simpler to do.
A SO search only returned this: Dynamic function calls at runtime (va_list)
First, you should heed Kerrek's advice about extern "C". This is C++'s mechanism for giving an identifier C linkage, meaning that the name won't be mangled by the C++ compiler.
Sometimes an adapter still needs to be written for a C++ interface, because it manipulates objects that do not map to a C POD. So, the adapter gives the C interface a POD or opaque pointer type to manipulate, but the implementation of that interface converts that into a C++ object or reference and then calls the C++ interface. For example, suppose you wanted to provide a C interface for a C++ std::map<int, void *>; you would have a common header file in C and C++ that would contain:
#ifdef __cplusplus
extern "C" {
#endif
struct c_map_int_ptr;
// ...
// return -1 on failure, otherwise 0, and *data is populated with result
int c_map_int_ptr_find (struct c_map_int_ptr *, int key, void **data);
#ifdef __cplusplus
}
#endif
Then, the C++ code could implement the function like:
typedef std::map<int, void *> map_int_ptr;
int c_map_int_ptr_find (struct c_map_int_ptr *cmap, int key, void **data) {
    map_int_ptr &map = *reinterpret_cast<map_int_ptr *>(cmap);
    map_int_ptr::iterator i = map.find(key);
    if (i != map.end()) {
        *data = i->second;
        return 0;
    }
    return -1;
}
Thus, there is no need to pass the arguments passed via the C interface through a variable argument adapter. And so, there is no need for the C++ code to tease out the arguments from a variable argument list. The C code calls directly into the C++ code, which knows what to do with the arguments.
I suppose if you are trying to implement some kind of automated C adapter code generator by parsing C++ code, you could think that using variable arguments would provide a regular mechanism to communicate arguments between the generated C code interface and the generated C++ adapter code that would call the original C++ interface. For such a scenario, the code for the above example would look something like this:
// C interface
typedef struct c_map_int_ptr c_map_int_ptr;
typedef struct c_map_int_ptr_iterator c_map_int_ptr_iterator;
//...
c_map_int_ptr_iterator c_map_int_ptr_find (c_map_int_ptr *map, int key) {
    c_map_int_ptr_iterator result;
    cpp_map_int_ptr_adapter(__func__, map, key, &result);
    return result;
}
// C++ code:
struct cpp_adapter {
    virtual ~cpp_adapter () {}
    virtual void execute (va_list) {}
};
void cpp_map_int_ptr_adapter(const char *func, ...) {
    va_list ap;
    va_start(ap, func);
    cpp_map_int_ptr_adapter_method_lookup(func).execute(ap);
    va_end(ap);
}
//...
struct cpp_map_int_ptr_find_adapter : cpp_adapter {
    void execute (va_list ap) {
        map_int_ptr *map = va_arg(ap, map_int_ptr *);
        int key = va_arg(ap, int);
        c_map_int_ptr_iterator *c_iter = va_arg(ap, c_map_int_ptr_iterator *);
        map_int_ptr::iterator i = map->find(key);
        //...transfer result to c_iter
    }
};
Where cpp_map_int_ptr_adapter_method_lookup() returns an appropriate cpp_adapter instance based on a table lookup.
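One possible shape for that lookup, with hypothetical registration helpers (the cpp_adapter struct is repeated so the sketch stands alone):
#include <cstdarg>
#include <map>
#include <string>
struct cpp_adapter {
    virtual ~cpp_adapter() {}
    virtual void execute(va_list) {}
};
static std::map<std::string, cpp_adapter*>& adapter_table() {
    static std::map<std::string, cpp_adapter*> table;
    return table;
}
// Each generated adapter registers itself under the C function's __func__ name.
void register_adapter(const char *func, cpp_adapter *adapter) {
    adapter_table()[func] = adapter;
}
cpp_adapter& cpp_map_int_ptr_adapter_method_lookup(const char *func) {
    static cpp_adapter do_nothing;   // fallback for unknown names
    std::map<std::string, cpp_adapter*>::iterator it = adapter_table().find(func);
    return it != adapter_table().end() ? *it->second : do_nothing;
}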

Variadic Templates before C++11

How did Boost implement Tuple before C++11 and Variadic Templates?
In other words:
Is it possible to implement a Variadic Templates class or function by not using built-in Variadic Templates feature in C++11?
Boost had a limit on the size of the tuple. As in most real-world scenarios you don't need more than 10 elements, you won't mind this limitation. For library maintainers, I guess, the world became much simpler with variadic templates. No more macro hacks...
Here is an insightful discussion about the size limit of Boost tuple and its implementation:
boost tuple: increasing maximum number of elements
To answer your second question: No, it is not possible. At least not for an unlimited number of elements.
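For a flavour of the pre-C++11 technique, here is a minimal sketch (not Boost's actual code): fix a maximum arity and default the trailing parameters to a sentinel type.
struct null_type {};
// Real Boost allowed 10 parameters by default; three are enough to show the idea.
template <typename T0 = null_type,
          typename T1 = null_type,
          typename T2 = null_type>
struct small_tuple {
    T0 head;                   // first element...
    small_tuple<T1, T2> tail;  // ...and a recursively nested remainder
};
template <>
struct small_tuple<null_type, null_type, null_type> {}; // ends the recursion
// usage: small_tuple<int, char> t; t.head = 1; t.tail.head = 'x';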
There are 2 common use cases I've seen, as a library developer, for variadic templates. You can build a work around for both.
Case 1: Function objects
std::function<> and lambdas are very nice, but even C++11 only gives you a fairly basic set of things you can do with them "out of the box". To implement really cool things and utilities on top of them, you need to support variadic templates, because std::function can be used with any normal function signature.
Workaround:
A recursive call using std::bind is your friend. It IS less efficient than real variadic templates (and some tricks like perfect forwarding probably won't work), but it'll work okay for modest numbers of template arguments until you port to C++11.
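A toy illustration of that shape, shown with C++11 std::bind for brevity (pre-C++11 code would use boost::bind or std::tr1::bind): each step binds one more argument and returns a new function object expecting the rest.
#include <functional>
#include <iostream>
int sum3(int a, int b, int c) { return a + b + c; }
int main() {
    using namespace std::placeholders;
    auto step1 = std::bind(sum3, 1, _1, _2); // fix the first argument
    auto step2 = std::bind(step1, 2, _1);    // then the next...
    auto step3 = std::bind(step2, 3);        // ...until none are left
    std::cout << step3() << "\n";            // prints 6
}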
Case 2: Ordinary classes
Sometimes you need an ordinary class to manage generic std::function<>s (see above) or expose an API like "printf". Workarounds here come down to details and what each API of the class is doing.
APIs that merely manipulate variadic template data but don't need to store it can run as recursive calls. You need to write them so that they "consume" one argument at a time, and stop when they run out of arguments.
APIs (including constructors) that need to STORE variadic template data are harder: you're screwed if the types are really unlimited and could be anything. BUT, if they're always going to be primitives that map deterministically to binary, you can do it. Just write a "Serialize" call taking all the types you support, then use it to serialize the entire set into a binary buffer and build a vector of "type info" data you use to fetch and set them. It's actually a better solution than std::tuple in terms of memory and performance in the special cases where it's available.
Here's the "serialize tuple" trick:
#include <cstdlib>   // calloc, malloc, free
#include <cstring>   // memcpy
#include <cstdint>
#include <typeinfo>
#include <vector>
// MemoryBuffer: A basic byte buffer w/ its size
class MemoryBuffer {
private:
    void* buffer;
    int size;
    int currentSeekPt;
protected:
    void ResizeBuffer() {
        int newSz = size << 1;             // Multiply by 2
        void* newBuf = calloc( newSz, 1);  // Make sure it is zeroed
        memcpy( newBuf, buffer, size);
        free( buffer);
        size = newSz;
        buffer = newBuf;
    }
public:
    MemoryBuffer(int initSize)
        : buffer(0), size(initSize), currentSeekPt(0)
    {
        buffer = calloc( size, 1);
    }
    ~MemoryBuffer() {
        if(buffer) {
            free( buffer);
        }
    }
    // Add data to buffer at the current seek point
    bool AddData(const void* data, int dataSz) {
        if(!data || !dataSz) return false;
        while(dataSz + currentSeekPt > size) { // resize to hold data
            ResizeBuffer();
        }
        memcpy( static_cast<char*>(buffer) + currentSeekPt, data, dataSz);
        currentSeekPt += dataSz;
        return true;
    }
    void* GetDataPtr() const { return buffer; }
    int GetSeekOffset() const { return currentSeekPt; }
    int GetTotalSize() const { return size; }
};
struct BinaryTypeInfo {
    const std::type_info* type; // RTTI type_info struct. You can use an "enum"
                                // instead- code will be faster, but harder to maintain.
    uint64_t bufferOffset;      // Lets me "jump" into the buffer to the stored value
};
// Versions of "Serialize" for all 'tuple' data types I support
template<typename BASIC>
bool Serialize(BASIC data, MemoryBuffer* target,
               std::vector<BinaryTypeInfo>& types)
{
    // Handle boneheads
    if(!target) return false;
    // Setup our type info structure
    BinaryTypeInfo info;
    info.type = &typeid(data);
    info.bufferOffset = target->GetSeekOffset();
    int binarySz = sizeof(data);
    void* binaryVersion = malloc( binarySz);
    if(!binaryVersion) return false;
    memcpy( binaryVersion, &data, binarySz); // Data type must support this
    if(!target->AddData( binaryVersion, binarySz)) {
        free( binaryVersion);
        return false;
    }
    free( binaryVersion);
    // Populate type vector
    types.push_back( info);
    return true;
}
This is just a quick & dirty version; you'd hide the real thing better and probably combine the pieces into 1 reusable class. Note that you need a special version of Serialize() if you wish to handle std::string and more complex types.
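For illustration, here is one way the pieces could be used together (a hypothetical usage appended after the definitions above, not from the original post):
#include <cstdio>
int main() {
    MemoryBuffer buffer(64);
    std::vector<BinaryTypeInfo> types;
    Serialize(42, &buffer, types);     // an int at offset 0
    Serialize(3.14f, &buffer, types);  // a float right after it
    // types[1] records where the float lives and what it is:
    float f;
    memcpy(&f, static_cast<char*>(buffer.GetDataPtr()) + types[1].bufferOffset, sizeof f);
    std::printf("%s at offset %d: %f\n",
                types[1].type->name(),
                static_cast<int>(types[1].bufferOffset),
                f);
    return 0;
}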

C++ library API. Using converter classes instead of plain C api

This question is about C++ <-> C++ interoperability.
As is well known, the implementation of standard library classes/functions may differ across vendors. Moreover, the implementation may differ even within the same library vendor when using different compiler flags, configurations (Debug/Release), etc.
For that reason, many library developers shift to a plain old C-style API,
which leads to ugly, error-prone interfaces.
For instance, in order to get a string from some function, interfaces like the Win32 GetCurrentDirectory function are used:
DWORD WINAPI GetCurrentDirectory(
__in DWORD nBufferLength,
__out LPTSTR lpBuffer
);
three parameters plus some boilerplate code on both sides (checking whether the buffer size was enough, etc.) just to get a simple string.
I am thinking of using some auxiliary adapter/proxy class which will do all the conversions automatically and can simply be reused.
Something like:
#include <string>
#include <algorithm>
#include <iostream>
#include <ostream>
class StringConverter
{
    char *str; // TODO: use smart pointer with right deleter
public:
    StringConverter(const std::string &user_string) // Will be defined only at user side
    {
        str = new char[user_string.length()+1];
        (*(std::copy(user_string.begin(), user_string.end(), str))) = 0;
    }
    operator std::string() // Will be defined only at library side
    {
        return std::string(str);
    }
    ~StringConverter()
    {
        delete [] str;
    }
};
StringConverter foo()
{
    return std::string("asd");
}
int main(int argc, char *argv[])
{
    std::cout << std::string(foo()) << std::endl;
    return 0;
}
http://ideone.com/EfcKv
Note that I plan to have the definition of the conversion from the user string to StringConverter only on the user side, and the definition of the conversion from StringConverter to the library string only inside the library.
Also, the right deleter should be used (one that frees from the right heap).
What do you think about such approach?
Are there some major pitfalls?
Are there some superior alternatives?
This technique will work in some cases where standard data types are incompatible, but in others it will fare no better: name mangling differences and class memory layout differences (vptrs and tags) come to mind.
This is why C APIs are preferred.
But you can improve usability by burying the C API where the library caller never needs to see it. Then add a thin, idiomatic C++ overlay that provides the visible library interface. In some cases the thin overlay code can be used on both caller and library side, each compiled in its own environment: flags, link conventions, etc. Only C data types are exchanged. The simpler these data types are, the more compatibility you'll obtain.
The layer also ensures that memory allocation and deallocation occur on the same side of the API on a per-object basis, as your code does. But it can be more flexible; for example, it's possible to arrange for an object allocated in the caller to be deallocated in the library, and vice versa.
// library.h for both caller and library
// "shadow" C API should not be used by caller
extern "C" {
    void *make_message(const char *text);
    char *get_text_of_message(void *msg);
    void send_message(void *msg); // destroys after send.
}
// Thin C++ overlay
class Message {
    void *msg;
public:
    Message(const std::string &text) {
        msg = make_message(text.c_str());
    }
    void send() {
        if (msg) send_message(msg);
        else error("already sent");
        msg = 0;
    }
    std::string getTextString() {
        return std::string(get_text_of_message(msg));
    }
};
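And a sketch of what the library side of that shadow API might look like (assumed here, not part of the original answer); the point is that the message object is allocated and freed on the library side only:
#include <string>
struct MessageImpl {
    std::string text;
};
extern "C" void *make_message(const char *text) {
    return new MessageImpl{ text ? std::string(text) : std::string() };
}
extern "C" char *get_text_of_message(void *msg) {
    // Points into library-owned storage; valid until send_message() destroys it.
    return const_cast<char *>(static_cast<MessageImpl *>(msg)->text.c_str());
}
extern "C" void send_message(void *msg) {
    MessageImpl *m = static_cast<MessageImpl *>(msg);
    // ... put m->text on the wire here ...
    delete m; // destroyed after send, matching the header's comment
}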