libxml2 xmlChar* cast to char* - c++

How would you convert / cast an xmlChar* to char* from the libxml2 library? Thanks.

If you take a look at the examples, for instance io2.c, you'll notice that they just blithely cast it to a char *:
printf("%s", (char *) xmlbuff);

Looks like it's just unsigned char. So it should be safe to cast as long as you're not doing arithmetic on it.
But you probably don't need to, as libxml2 implements the key string functionality (xmlStrlen, xmlStrcmp and so on) in terms of that type.
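A minimal sketch of the cast described above, written so it compiles without libxml2. The typedef is a stand-in for the library's own definition (with libxml2 installed you would include <libxml/xmlstring.h> instead), and the helper names as_c_str, as_xml_str and xml_length are hypothetical, not part of libxml2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for libxml2's typedef so this sketch is self-contained.
// Assumption: the real definition comes from <libxml/xmlstring.h>.
typedef unsigned char xmlChar;

// View an xmlChar string as a plain C string. No bytes are copied or changed.
inline const char* as_c_str(const xmlChar* s) {
    assert(s != nullptr);
    return reinterpret_cast<const char*>(s);
}

// The opposite direction, for passing a C string to an xmlChar-based API.
inline const xmlChar* as_xml_str(const char* s) {
    assert(s != nullptr);
    return reinterpret_cast<const xmlChar*>(s);
}

// Length in bytes, computed with the ordinary C library after the cast.
inline std::size_t xml_length(const xmlChar* s) {
    return std::strlen(as_c_str(s));
}
```

Keeping the cast inside one small helper means the rest of the code never spells out reinterpret_cast, and constness is preserved in both directions.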

Related

How to resolve a "cppcoreguidelines-pro-type-cstyle-cast" error in C++?

I'm working on C++ with Visual Studio 2015 64-bit and Clang as my compiler.
I tried to convert a string to unsigned char* with the code below:
string content = "some content from file";
unsigned char* m_Test = (unsigned char*)content.c_str();
However, this resulted in an error when I tried to build the project:
error: do not use C-style cast to convert between unrelated types
[cppcoreguidelines-pro-type-cstyle-cast,-warnings-as-errors]
Any idea how I can work around it? Really appreciate it if you can shed some light.
You have two (or 3) options:
1) Replace the C-style cast with the appropriate C++ cast (static_cast, const_cast, reinterpret_cast or dynamic_cast).
2) (better option) Find a way to write your code where a cast is not needed in the first place.
3) Ignore/suppress the warning (not what I would recommend, though it is an option).
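As one possible reading of option 2, the sketch below avoids the pointer cast by copying the characters into a buffer whose element type is already unsigned char. The function name to_bytes is hypothetical, and the copy costs an allocation, so this only fits when that is acceptable.

```cpp
#include <string>
#include <vector>

// Copy the characters of a std::string into a buffer of unsigned char.
// Each char is converted to unsigned char by value, so no pointer cast
// is involved and the original string is left untouched.
inline std::vector<unsigned char> to_bytes(const std::string& text) {
    return std::vector<unsigned char>(text.begin(), text.end());
}
```

A caller that needs a raw pointer can then use the vector's data() member, which already has the type unsigned char*.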
In modern C++, C-style casts should be avoided because they are error-prone. The c_str() method of string returns a const char*, and converting that to a pointer to unsigned char requires reinterpret_cast; static_cast will not do it. You should also preserve the constness (it could be cast away as well, but there is no need here). So the code would look like this:
string content = "some content from file";
const unsigned char* m_Test = reinterpret_cast<const unsigned char*>(content.c_str());
The reinterpret_cast just tells the compiler "From now on, treat this data as stated."
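You can use the C++ cast way. A static_cast alone cannot convert between char* and unsigned char*, so reinterpret_cast is needed for that step:
EDIT: As pointed out in the comments, you also need a const_cast for your particular case, because c_str() returns a const char*. Do not write through the resulting pointer:
string content = "some content from file";
unsigned char* m_Test = reinterpret_cast<unsigned char*>(const_cast<char*>(content.c_str()));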

Convert between signed char & unsigned char representing UTF8

I am using libxml2 and ICU in the same project. They represent
UTF8 differently. libxml2 uses unsigned char*, and ICU constructors take in plain char* (which on my Pentium 64-bit is equivalent to signed char).
Question: how do I convert between the two? Can I just
use static_cast?
I understand that UTF8 only cares that the underlying data
type be at least 8 bits long. Both signed char and unsigned
char satisfy this. I am just wondering if there is any
gotcha here? Any corner cases?
EDIT: at my compiler's (g++/Gentoo) insistence, only reinterpret_cast can do this conversion (without relying on the C-style cast). Let's say we have two unsigned char strings: 0000 and 1000. The conversion will turn them both into 0. Is this possible under UTF8?
Some libraries use char for storing UTF-8, others use unsigned char.
In this case you may need to cast between char* and unsigned char* using reinterpret_cast, since these types have the same storage unit size and alignment. E.g.:
char const* s = ...;
unsigned char const* p = reinterpret_cast<unsigned char const*>(s);
For object pointers, static_cast can achieve the same result as reinterpret_cast through an intermediate conversion to void*, i.e. char* -> void* -> unsigned char*:
char const* s = ...;
void const* intermediate = s;
unsigned char const* p = static_cast<unsigned char const*>(intermediate);
If unsigned char* is just a pointer to a string it should not cause any problem.
It should not matter. In any case, as soon as you need to extract a character from the char * or unsigned char * stream, you will need a function provided by the library that extracts an int and updates the pointer/iterator in a manner that is opaque to you (the caller).
Thanks all. Mike said it best: the difference that makes no difference, and "a byte is a byte is a byte".
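The bytes themselves are unchanged by the pointer cast, but one place where signedness can still show up is when a single byte of 0x80 or above is read through a signed char and compared against a numeric constant. A small sketch of reading bytes as unsigned values follows; the function name byte_at is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Read the i-th byte of a UTF-8 string as an unsigned value (0..255),
// regardless of whether plain char is signed on this platform.
// The caller is responsible for keeping i within utf8.size().
inline unsigned int byte_at(const std::string& utf8, std::size_t i) {
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(utf8.data());
    return p[i];
}
```

For example, the two-byte UTF-8 encoding of U+00E9 is 0xC3 0xA9, and byte_at returns exactly those values even where plain char is signed.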

Warning generated due to wrong strcmp parameter handling

So I have an
unsigned char * pMyPtr
assigned to something.
Then I want to compare this to an arbitrary string with
strcmp(const char* , const char* )
But when I do that, clang compiler tells me
warning: passing (aka 'unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign
How do I remove this warning?
With all the comments to the question, I feel like I'm missing something.
I know that casts are unfashionable, but isn't the following a simple workaround?
strcmp((const char*) pMyPtr , whatever_is_being_compared)
It isn't even unsigned. Behind it, is a struct.
This means that you cannot use strcmp here. You can use strcmp when the input data are null-terminated strings. That's not the case when the input data are structs. Perhaps you should consider memcmp instead, or perhaps you need to compare the structs as structs.
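A minimal sketch of the memcmp suggestion, assuming the length of the data is known to the caller. The function name same_bytes is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Compare two byte buffers of known length. Unlike strcmp, this does not
// stop at a zero byte and does not require a terminator.
inline bool same_bytes(const unsigned char* a,
                       const unsigned char* b,
                       std::size_t n) {
    return std::memcmp(a, b, n) == 0;
}
```

When the buffers hold structs, keep in mind that memcmp also compares any padding bytes, which may differ between two otherwise equal objects; comparing the members one by one avoids that.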
Clang won't silently convert unsigned char* to const char*.
That's because unsigned char* is a different type than char*.
Adding unsigned makes the value range 0 to 255 instead of the signed range (typically -128 to 127).
On the line where strcmp is called, you can cast the pointer with (const char*). That works, because the data is then treated as const char* instead of unsigned char*.
If you are sure the warning has no side effects, you can suppress it. The pragma below is in MSVC syntax; with Clang, the matching form is #pragma clang diagnostic ignored "-Wpointer-sign".
#pragma warning( disable : 4507 34 )

convert unsigned char* to std::string

I am not very good at typecasting. I have a string in xmlChar* (which is unsigned char*), and I want to convert this unsigned char* to a std::string.
xmlChar* name = "Some data";
I tried my best to typecast , but I couldn't find a way to convert it.
std::string sName(reinterpret_cast<char*>(name));
reinterpret_cast<char*>(name) casts from unsigned char* to char* in an unsafe way but that's the one which should be used here. Then you call the ordinary constructor of std::string.
You could also do it C-style (not recommended):
std::string sName((char*) name);
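The std::string constructor used above must not be given a null pointer, so a small wrapper can be handy when the pointer comes from a library call that may fail. This is a sketch only: the typedef stands in for libxml2's own definition so the code compiles on its own, and the name to_std_string is hypothetical.

```cpp
#include <string>

// Stand-in for libxml2's typedef so this sketch is self-contained.
typedef unsigned char xmlChar;

// Copy a NUL-terminated xmlChar string into a std::string.
// A null pointer yields an empty string instead of undefined behavior.
inline std::string to_std_string(const xmlChar* s) {
    if (s == nullptr) {
        return std::string();
    }
    return std::string(reinterpret_cast<const char*>(s));
}
```

Because the result is a copy, the original xmlChar buffer can be released afterwards without affecting the std::string.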
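I think the accepted solution is a little bit risky and not that good, to be honest. I think the better solution is using std::to_string (note that this formats the numeric value of a single unsigned char as decimal text; it does not copy the characters of a string):
unsigned char char1{192};
auto result = std::to_string(char1);
Now result equals std::string("192").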

C++ append unsigned char to wstring

I want to append an unsigned char to a wstring for debugging reasons.
However, I don't find a function to convert the unsigned char to a wstring, so I can not append it.
Edit:
The solutions posted so far do not really do what I need.
I want to convert 0 to "0".
The solutions so far convert 0 to a 0 character, but not to a "0" string.
Can anybody help?
Thank you.
unsigned char SomeValue;
wstring sDebug;
sDebug.append(SomeValue);
The correct call for appending a char to a string (or in this case, a wchar_t to a wstring) is
sDebug.push_back(SomeValue);
To widen your char to a wchar_t, you can also use std::btowc which will widen according to your current locale.
sDebug.push_back(std::btowc(SomeValue));
Just cast your unsigned char to char:
sDebug.append(1, static_cast<char>(SomeValue));
And if you want to use operator+ try this:
sDebug+= static_cast<char>(SomeValue);
Or even this:
sDebug+=boost::numeric_cast<char>(SomeValue);
There's an overload of append that also takes the number of times to append the given character:
sDebug.append(1, SomeValue);
However, this will result in a conversion between unsigned char and wchar_t. Perhaps you want SomeValue to be a wchar_t.
wstring has a constructor that takes a count and a character, e.g. wstring(1, SomeValue). That creates a wstring from the character, which you can then append.
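For the edited requirement (turning the value 0 into the text "0" rather than a character with code 0), one option is std::to_wstring from C++11. A minimal sketch, with the hypothetical helper name append_number:

```cpp
#include <string>

// Append the decimal text of an unsigned char value to a wide string,
// so 0 becomes L"0" and 255 becomes L"255".
inline void append_number(std::wstring& out, unsigned char value) {
    // The explicit cast selects the unsigned int overload of to_wstring.
    out += std::to_wstring(static_cast<unsigned int>(value));
}
```

With the names from the question, the call would be append_number(sDebug, SomeValue).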