How can Python get binary data (char*) from C++ through SWIG? - c++

I am calling C++ functions from Python through SWIG, and I have run into a problem.
When I pass a char * from C++ to Python, the char * is truncated by Python.
For example:
example.h:
char * fun()
{
return "abc\0de";
}
Now in Python, we call
example.fun()
It only prints
"abc"
instead of
"abc\0de"
The data after '\0' is dropped by Python.
I want to get all the chars (it is binary data that can contain '\0') from fun() in C++.
Any advice is appreciated.

First of all, you should not use char * if you are dealing with binary data (SWIG treats it as a normal null-terminated string). Instead you should use void *. SWIG provides a library module named 'cdata.i';
you should include it in the interface definition file.
Once you include it, you get two functions: cdata() and memmove().
Given a void * and the length of the binary data, cdata() converts it into a string type of the target language.
memmove() does the reverse: given a string type, it copies the contents of the string (including embedded null bytes) into the memory that the C void * points to.
Handling binary data becomes much simpler with this module. I hope this is what you need.
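example.i
%module example
%include "cdata.i"
/* %inline both compiles the code and generates a wrapper for it;
   a plain %{ ... %} block is only copied into the wrapper file, so fun() would not be exposed. */
%inline %{
void *fun()
{
    /* The cast drops const so that the literal can be returned as void * in C++. */
    return (void *)"abc\0de";
}
%}
test.py
import example
# 6 is the number of bytes to copy, including the embedded '\0'
print(example.cdata(example.fun(), 6))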

C/C++ strings are null-terminated, which means that the first \0 character marks the end of the string.
When a function returns a pointer to such a string, the caller (SWIG in this case) has no way of knowing whether there is more data after the first \0, which is why you only get the first part.
So the first thing to do is to change your C function to return not just the string but its length as well. Since there can be only one return value, we'll use pointer arguments instead.
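void fun(char** s, int *sz)
{
    /* The cast is needed in C++, where a string literal is an array of const char. */
    *s = (char *)"abc\0de";
    *sz = 6; /* number of bytes, including the embedded '\0' */
}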
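The SWIG docs suggest using the cstring.i library to wrap such functions. In particular, the last macro in that library does exactly what you need:
%cstring_output_allocate_size(parm, szparm, release)
Here parm is the char ** argument, szparm is the int * size argument, and release is the code used to free the returned buffer, if any. Read the docs to learn how to use it.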

See 8.3 C String Handling in the documentation.
Also from the documentation:
The char * datatype is handled as a NULL-terminated ASCII string. SWIG
maps this into a 8-bit character string in the target scripting
language. SWIG converts character strings in the target language to
NULL terminated strings before passing them into C/C++. The default
handling of these strings does not allow them to have embedded NULL
bytes. Therefore, the char * datatype is not generally suitable for
passing binary data. However, it is possible to change this behavior
by defining a SWIG typemap. See the chapter on Typemaps for details
about this.

Related

C++ - evaluating an input string as an internal code variable

Is there a way to take a string as an input argument to a c++ function and evaluate it as an internal argument e.g. the name of a structure or other variable?
For example (written in pseudo code)
int myFunction(string nameStructure){
nameStructure.field = 1234
}
The "take away" point is converting the input string as a variable within the code.
Mark
This type of question is often a symptom of an XY problem, so consider other options first. That being said, there is no such built-in mechanism in C++, but there is a simple workaround I can think of: use a dictionary (std::map / std::unordered_map) to store all your objects:
std::map<std::string, MyAwesomeObject> objects;
...
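int myFunction(std::string nameStructure)
{
    // operator[] creates a default-constructed entry if the name is not present yet
    objects[nameStructure].field = 1234;
    return objects[nameStructure].field;
}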
The names of local variables are just artifacts of the human-readable code and have no meaning in the compiled binary. Your int myIntVar's and char* myCharP's get turned into instructions like "four bytes starting at the location of the base pointer minus eight bytes, interpreted as a four-byte integer". They no longer have names as such.
If you export symbols from your binary, you can look into the export table at runtime, according to your binary format, and find the variable you want. But I bet you want something like access to a local variable, and that is not possible.
If you really need this functionality, take a look at more dynamic interpreted languages such as PHP:
http://php.net/manual/en/language.variables.variable.php

Returning string from function having multiple NULL '\0' in C++

I am compressing a string, and the compressed string sometimes has a NULL character inside it before the terminating NULL. I want to return the string up to the terminating NULL, but the compressor function returns the string only up to the first NULL. I asked a question about this for C before, but now I also need a solution in C++, and next in C#. Please help me. Thanks.
#include <cstdio>
const char* compressor(const char* str)
{
    const char* compressed_string;
    // After some calculation
    compressed_string = "bk\0dk"; // a '\0' in the middle; the literal also ends with an automatic '\0'
    return compressed_string;
}
int main()
{
    const char* str;
    str = compressor("Muhammad Ashikuzzaman");
    printf("Compressed Value = %s", str); // %s stops at the first '\0'
    return 0;
}
The output is: Compressed Value = bk
None of the other characters from the compressor function appear. Is there any way to show the whole string?
The fundamental problem that you have is that compression algorithms operate on binary data rather than text. If you compress something, then expect some of the compressed bytes to be zero. Thus the compressed data cannot be stored in a null-terminated string.
You need to change your mindset to work with binary data.
To compress, do the following:
1. Convert from text to binary using some well-defined encoding, for instance UTF-8. This will yield an array of unsigned char.
2. Compress the array of unsigned char, which will again yield an array of unsigned char, but now compressed.
To decompress, you just reverse these steps.
Since you are writing C++ code, you would be well advised to use standard containers, such as std::string or std::wstring and std::vector<T>.
The exact same principles apply in all languages. When you come to code this in C#, you need to convert from text to binary. Use Encoding.GetBytes() to do that. That yields a byte array, byte[]. Compress that to another byte array. And so on.
But you really must first overcome this desire to attempt to store binary data in text data types.

Trouble creating char * object to send to c++/c shared object library

First off I am new to ctypes and did search for an answer to my question. Definitely will appreciate any insight from here.
I have a byte string supplied to me by another tool. It contains what appears to be hex and other values. I'm creating the c_char_p object as follows:
mybytestring = b'something with a lot of hex \x00\x00\xc7\x87\x9bb and other alphanumeric and non-word characters' # Length of this is very long let's say 480
mycharp = c_char_p(mybytestring)
I also create a c_char_Array as follows:
mybuff = create_string_buffer(mybytestring)
The problem is that when I send either mycharp or mybuff to a function in a C++ .so library, the string gets cut off at the NULL terminator (the first occurrence of '\x00').
I'm loading the C++ library and calling the function as follows:
lib_handle = cdll.LoadLibrary("mylib.so")
lib_handle.myfunction(mycharp)
lib_handle.myfunction(mybuff)
The C++ function expects a char *.
Does anyone know how to send the whole string with the NULL terminators ('\x00') included?
Thanks
Put your original data into a vector<char> vec and pass vec.data().
But the actual problem is this:
The C++ function expects a char *.
You will need to change this (to accept a second argument giving the length of the buffer, or, for example, to accept a vector<char>) if you want it to accept an array of char that includes null bytes.
Alternatively, you can work out what you actually want the C++ function to do and preprocess the char array yourself, adding a null terminator to each new array before sending it to the C++ function.
For example, you may decide that the “input” array is actually a set of C strings: you would then do a simple parse to split it and send the pieces to the C++ function one after another in a loop.
Or you may decide that the input could be a UTF-16 string rather than UTF-8. Then you need to convert it to UTF-8 as well as you can and send that to the C++ function.

What is the use of the c_str() function in C/C++?

Can anybody tell me what the use of the c_str() function in C/C++ is?
In which cases is it necessary to use it?
When you want to use your string with C functions:
string s = "hello";
printf( "your string:%s", s.c_str() );
It is a C++ thing, not a C one.
A common use of c_str (from std::string) is precisely to convert a C++ std::string to a const char* C string, which is required by many low-level C functions (e.g. POSIX system calls like stat).
Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
There is a good example of its use here: http://www.cplusplus.com/reference/string/string/c_str/
I presume you're asking about string::c_str()? It's a method that returns a C string representation of the string object. You might need a C string representation to call an OS API, for example.

C++: How to add raw binary data into source with Visual Studio?

I have a binary file which I want to embed directly into my source code, so that it is compiled into the .exe file instead of being read from a file, and the data is already in memory when I launch the program.
How do I do this?
The only idea I had was to encode my binary data as base64, put it in a string variable and then decode it back to raw binary data, but this is a tricky method which causes pointless memory allocation. Also, I would like the data in the .exe to be as compact as the original data.
Edit: The reason I thought of using base64 was that I wanted to keep the source code files as small as possible too.
The easiest and most portable way would be to write a small
program which converts the data to a C++ source, then compile
that and link it into your program. This generated file might
look something like:
unsigned char rawData[] =
{
0x12, 0x34, // ...
};
There are tools for this, a typical name is "bin2c". The first search result is this page.
You need to make a char array, and preferably also make it static const.
In C:
Some care might be needed since you can't have a char-typed literal, and also because generally the signedness of C's char datatype is up to the implementation.
You might want to use a format such as
static const unsigned char my_data[] = { (unsigned char) 0xfeu, (unsigned char) 0xabu, /* ... */ };
Note that each unsigned int literal is cast to unsigned char, and also the 'u' suffix that makes them unsigned.
Since this question was for C++, where you can have a char-typed literal, you might consider using a format such as this, instead:
static const char my_data[] = { '\xfe', '\xab', /* ... */ };
Since this is just an array of char, you could just as well use ordinary string literal syntax. Embedding zero bytes should be fine, as long as you don't try to treat it as a string:
static const char my_data[] = "\xfe\xab ...";
This is the most compact solution. In fact, you could probably use that for C, too.
You can use resource files (.rc). Sometimes they are bad, but for Windows-based applications that's the usual way.
Why base64? Just store the file as it is in one char*.