Trouble creating char * object to send to c++/c shared object library - c++

First off, I am new to ctypes and did search for an answer to my question. I would definitely appreciate any insight.
I have a byte string supplied to me by another tool. It contains what appear to be hex escapes and other values. I'm creating the c_char_p object as follows:
mybytestring = b'something with a lot of hex \x00\x00\xc7\x87\x9bb and other alphanumeric and non-word characters' # Length of this is very long let's say 480
mycharp = c_char_p(mybytestring)
I also create a c_char_Array as follows:
mybuff = create_string_buffer(mybytestring)
The problem is that when I send either mycharp or mybuff to a function in a C++ .so library, the string gets cut off at the NULL terminator (the first occurrence of '\x00').
I'm loading the c++ library and calling the function as follows:
lib_handle = cdll.LoadLibrary('mylib.so')
lib_handle.myfunction(mycharp)
lib_handle.myfunction(mybuff)
The C++ function expects a char *.
Does anyone know how to send the whole string, with the NULL bytes ('\x00') included?
Thanks

Add your original data to a vector<char> vec and send vec.data().
But the actual problem is this:
The c++ function expects a char *.
You will need to change this (for example, to accept a second argument giving the length of the buffer, or to accept a vector<char>) if you want it to accept an array of char that includes null bytes.
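A minimal sketch of the "second argument" variant, assuming you control the C++ side. The names myfunction_n and visible_length, and the idea of counting NUL bytes, are hypothetical and only there to show that an explicit length lets the callee see past the first '\x00':

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical replacement for myfunction: the caller passes the length
// explicitly, so the callee never relies on a NUL terminator.
extern "C" std::size_t myfunction_n(const char* data, std::size_t len) {
    std::size_t nul_count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (data[i] == '\0') {
            ++nul_count;  // embedded NULs are ordinary bytes here
        }
    }
    return nul_count;
}

// For comparison: what a consumer of a plain char* effectively sees.
extern "C" std::size_t visible_length(const char* data) {
    return std::strlen(data);  // stops at the first '\0'
}
```

On the Python side, the call would then presumably look like lib_handle.myfunction_n(mybytestring, len(mybytestring)), with argtypes set to [c_char_p, c_size_t] and restype set to c_size_t.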
Alternatively, you can figure out what you actually want the C++ function to do and preprocess the char array yourself, adding a null terminator to each new array before sending it to the C++ function.
For example, you may decide that the input array is actually a set of C-strings: you will then need a simple parse to split it and send the pieces to the C++ function in a loop, one after another.
Or you may decide that the input is a string in UTF-16 rather than UTF-8. Then you need to convert it to UTF-8, as well as you can, and send that to the C++ function.
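If the input really is a set of C-strings packed into one buffer, the "simple parse" could look roughly like this. It is only a sketch: split_on_nul is a hypothetical helper, and it assumes the total length of the buffer is known to the caller:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split a raw buffer of known length into the NUL-separated pieces it contains.
// Empty pieces (from consecutive NULs) are kept in this sketch.
std::vector<std::string> split_on_nul(const char* data, std::size_t len) {
    std::vector<std::string> pieces;
    std::size_t start = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (data[i] == '\0') {
            pieces.emplace_back(data + start, i - start);
            start = i + 1;
        }
    }
    if (start < len) {
        // trailing piece that was not followed by a NUL
        pieces.emplace_back(data + start, len - start);
    }
    return pieces;
}
```

Each element's c_str() is a properly terminated C-string, so it can be handed to the existing char * function in a loop.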

Related

Reading / Writing Control Characters in binary file

I'm currently processing a binary file using C++...
At some point I read a byte in and the char * read is "\x3" which seems to be a control character.
But when I go to write it back out using:
const char *control = "\x3";
fout.write(control, sizeof(control));
and then read the binary file back in, the read value is "\x11C".
How does one write the control character array back out to file the correct way?
Your code is writing 4 or 8 bytes to the binary file instead of the 1 you seem to be expecting. control is a pointer, so sizeof(control) is the size of the pointer itself, not of the data it points to, and that is typically 4 or 8 depending on the platform.
The best way to fix this is to declare control as a single character, which is what you seem to intend:
char control = '\x3';
fout.write(&control, sizeof(control));
The other way, if you actually need to write multiple characters, is like this:
const std::string control = "\x3";
fout.write(control.data(), control.size());
Either method will correctly output the number of characters you expect.
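To check the byte counts without touching a file, here is a small sketch that uses std::ostringstream as a stand-in for fout (that substitution and the function names are assumptions for illustration):

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Write one control character and return how many bytes ended up in the stream.
std::size_t bytes_written_single_char() {
    std::ostringstream out;
    char control = '\x3';
    out.write(&control, sizeof(control));  // sizeof(char) is 1
    return out.str().size();
}

// Same payload held in a std::string: size() counts characters, not pointer bytes.
std::size_t bytes_written_string() {
    std::ostringstream out;
    const std::string control = "\x3";
    out.write(control.data(), static_cast<std::streamsize>(control.size()));
    return out.str().size();
}
```

Both functions should report a single byte, whereas the sizeof of a const char * would report the pointer size.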
Another way to write string literals is to declare them as an array:
static char const data[] = "Hello World!\n";
fout.write(data, sizeof(data) - 1U);
The - 1U is there so that the terminating NUL is not written. Remove it if you want the NUL in the file.
Because the data array is declared without an explicit size, the compiler determines its length from the initializer.
sizeof can be used here because the size of a char is 1 by definition.
A nice advantage of this method is that the size is known at compile time. No searching for the length is required.
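The compile-time claim can be checked with a static_assert. In this sketch the array is renamed to message and std::ostringstream again stands in for fout; both are choices made for the example, not part of the original answer:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

static char const message[] = "Hello World!\n";

// The array type carries its length, so this is verified at compile time:
// 13 visible characters plus the terminating NUL.
static_assert(sizeof(message) == 14, "array size includes the terminating NUL");

// Write the literal without its terminator and return what was written.
std::string write_without_terminator() {
    std::ostringstream out;
    out.write(message, static_cast<std::streamsize>(sizeof(message) - 1U));
    return out.str();
}
```

Note that this only works on a real array. Once the array decays to a pointer (for example when passed to a function taking const char *), sizeof gives the pointer size again.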

converting a string to a c string

I'm working on some homework but don't even know where to start on this one. If you could, can you point me in the right direction? This is what I'm supposed to do:
Write your own version of the c_str function that takes a C++ string as an argument (with the parameter set as a constant reference variable) and returns a pointer to the equivalent C-string. Be sure to test it with an appropriate driver.
There are different possibilities to write such a function.
First, take a look at the C++ reference for std::string, which is the starting point for your problem.
In the Iterator section on that page, you might find some methods which can help you to get the string character by character.
It can also help to read the documentation for the method you want to imitate, std::string::c_str. It's important to understand how the system works with normal C-strings (char*):
Because a C-string has no length or size attribute, a trick is used to determine the end of the string: the last character in the string has to be '\0'.
Make sure you understand that a char* string can also be seen as an array of characters (char[]). This may help you understand and solve the problem.
As we know, a C-string is a null-terminated array of char. You can copy char by char from the std::string into an array of char and then close it with '\0'. Remember that a pointer to char (char*) can also represent an array of char. You can use this concept.
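One possible shape for such a function, as a sketch rather than the expected homework answer. The name to_c_string and the choice to return a freshly allocated buffer that the caller must delete[] are assumptions:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a newly allocated, NUL-terminated copy of s.
// The caller owns the result and must release it with delete[].
char* to_c_string(const std::string& s) {
    char* result = new char[s.size() + 1];  // one extra slot for '\0'
    for (std::size_t i = 0; i < s.size(); ++i) {
        result[i] = s[i];  // copy character by character
    }
    result[s.size()] = '\0';  // terminator marks the end of the C-string
    return result;
}
```

A driver would call it, compare the result against the original string, and then delete[] the pointer. Unlike std::string::c_str, this version hands ownership of the memory to the caller.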

How to pass string from C# to C++ specifying encoding

I'm writing a native C++ library to be used from C#. I need a C++ method that receives a string (or char array) and an Encoding. Inside this method I want to convert the string to a byte array according to the Encoding, work with it, and send back a string converted from the byte array according to the same Encoding. Since this method will be called from C#, I can pass System.Text.Encoding to it, but I don't know of any analog for it in C++. What approach would you suggest?
It would be much simpler to pass the byte array to the C++ library if you're only going to operate on the bytes anyway...

How would i convert a String object to a UTF8CHAR pointer?

I'm integrating a new system, and the old system had a char* in a method. Now there is a UTF8CHAR * instead.
I have a string object:
string data("test set");
and wanted to pass it into the function:
my_method(UTF8CHAR* text, ENUM extra, newStruct &item);
My first attempt was:
newStruct param("hi", 0,0);
my_method(data.c_str(), extra::OPEN,param);
I don't get an ERROR, but instead an EXC_BAD_ACCESS.
A string and a char array each contain a sequence of bytes. It depends on the library in question, but common sense indicates that a UTF8CHAR array is a sequence of bytes as well, with the added understanding that certain byte combinations describe certain Unicode code points and certain other byte combinations are illegal. So every UTF-8 char array is a char array, but not necessarily the other way round. As the distinction is not something the compiler can check, except for ensuring proper data type handling, passing a char pointer should work. If it does not, perhaps something else went wrong, which we cannot determine from the code you posted.
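If the function takes a non-const UTF8CHAR *, one cautious option is to hand it a mutable copy rather than the pointer from c_str(). The sketch below assumes UTF8CHAR is a typedef for unsigned char, which the question does not confirm, and to_utf8char_buffer is a hypothetical helper:

```cpp
#include <string>
#include <vector>

// Assumption: the library defines UTF8CHAR roughly like this.
typedef unsigned char UTF8CHAR;

// Copy the bytes of s into a mutable, NUL-terminated UTF8CHAR buffer.
std::vector<UTF8CHAR> to_utf8char_buffer(const std::string& s) {
    std::vector<UTF8CHAR> buffer(s.begin(), s.end());
    buffer.push_back(0);  // terminator, in case the callee expects one
    return buffer;
}
```

The buffer's data() pointer can then be passed as the text argument. Whether this has any bearing on the EXC_BAD_ACCESS depends on what my_method does with the pointer, which the posted code does not show.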

How Python can get binary data(char*) from C++ by SWIG?

I am using C++ functions in Python via SWIG, and I have run into a problem.
When I pass a char * from C++ to Python, the char * is truncated by Python.
For example:
example.h:
char * fun()
{
return "abc\0de";
}
Now in Python, we call
example.fun()
and it only prints
"abc"
instead of
"abc\0de"
The data after '\0' is dropped by Python.
I want to get all the chars (it is binary data that can contain '\0') from fun() in C++,
and any advice is appreciated.
First of all, you should not use char * if you are dealing with binary data (SWIG thinks they are normal strings). Instead you should use void *. SWIG provides a module named 'cdata.i', which you should include in the interface definition file.
Once you include this, it gives two functions - cdata() and memmove().
Given a void * and the length of the binary data, cdata() converts it into a string type of the target language.
memmove() does the reverse: given a string type, it will copy the contents of the string (including embedded null bytes) into the C void * type.
Handling binary data becomes much simpler with this module. I hope this is what you need.
example.i
%module example
%include "cdata.i"
%{
void *fun()
{
return "abc\0de";
}
%}
test.py
import example
print(example.cdata(example.fun(), 6))
C/C++ strings are NULL-terminated which means that the first \0 character denotes the end of the string.
When a function returns a pointer to such a string, the caller (SWIG in this case) has no way of knowing if there is more data after the first \0 so that's why you only get the first part.
So first thing to do is to change your C function to return not just the string but its length as well. Since there can be only one return value we'll use pointer arguments instead.
void fun(char** s, int *sz)
{
*s = "abc\0de";
*sz = 6;
}
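Independent of SWIG, the effect of returning the length can be seen from a plain C++ caller. In this sketch const has been added to the pointer type because string literals are const in C++, and call_fun is a hypothetical wrapper:

```cpp
#include <cstddef>
#include <string>

// Same shape as the function above, with const added so the literal
// is not exposed as writable.
void fun(const char** s, int* sz) {
    *s = "abc\0de";
    *sz = 6;
}

// Build a std::string from pointer + length; embedded NULs are preserved.
std::string call_fun() {
    const char* p = nullptr;
    int n = 0;
    fun(&p, &n);
    return std::string(p, static_cast<std::size_t>(n));
}
```

Constructing the string from the pointer alone would stop at the first '\0', which is the same truncation seen through the default char * wrapping.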
The SWIG docs suggest using the cstring.i library to wrap such functions. In particular, the last macro does exactly what you need.
%cstring_output_allocate_size(parm, szparm, release)
Read the docs to learn how to use it.
See 8.3 C String Handling in the documentation.
Also from the documentation:
The char * datatype is handled as a NULL-terminated ASCII string. SWIG
maps this into a 8-bit character string in the target scripting
language. SWIG converts character strings in the target language to
NULL terminated strings before passing them into C/C++. The default
handling of these strings does not allow them to have embedded NULL
bytes. Therefore, the char * datatype is not generally suitable for
passing binary data. However, it is possible to change this behavior
by defining a SWIG typemap. See the chapter on Typemaps for details
about this.