convert unsigned char* to std::string - c++

I am not very good at typecasting. I have a string in an xmlChar* (which is an unsigned char*), and I want to convert it to a std::string.
xmlChar* name = "Some data";
I tried my best to typecast, but I couldn't find a way to convert it.

std::string sName(reinterpret_cast<char*>(name));
reinterpret_cast<char*>(name) casts from unsigned char* to char* in an unsafe way but that's the one which should be used here. Then you call the ordinary constructor of std::string.
You could also do it C-style (not recommended):
std::string sName((char*) name);
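As a minimal sketch of the cast described above, the conversion can be wrapped in a small helper. The name to_std_string is hypothetical, and the typedef below is only a stand-in for the one libxml2 provides, so that the snippet compiles without the library. The null check is an added assumption, since constructing a std::string from a null pointer is undefined.

```cpp
#include <string>

// Stand-in for libxml2's own typedef (assumption: xmlChar is unsigned char).
typedef unsigned char xmlChar;

// Hypothetical helper: copies a NUL-terminated xmlChar buffer into a std::string.
std::string to_std_string(const xmlChar* text) {
    if (text == nullptr) {
        return std::string();  // avoid constructing a string from a null pointer
    }
    // The bytes are copied unchanged; only the pointer type is reinterpreted.
    return std::string(reinterpret_cast<const char*>(text));
}
```

With libxml2 itself, the typedef would be dropped and the call would look like std::string sName = to_std_string(name);.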

I think the accepted solution is a little bit risky and not that good to be honest. I think the better solution is using std::to_string:
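unsigned char char1{192};
auto result = std::to_string(char1);
Now result equals std::string("192"). Note that this converts the numeric value of a single unsigned char to its decimal text; it does not convert a buffer of characters.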

Related

How to convert string to const unsigned char* without using reinterpret_cast (modern approach)

I have variable input type const std::string&:
const std::string& input
Now I need to convert this to const unsigned char* because this is the input of the function.
Until now I have used this code, which converts correctly:
reinterpret_cast<const unsigned char*>(input.c_str())
This works well, but in clang I got a warning:
do not use reinterpret_cast [cppcoreguidelines-pro-type-reinterpret-cast]
What is the correct way to change a string or const char* to const unsigned char*?
The correct way is to use reinterpret_cast.
If you want to avoid reinterpret_cast, then you must avoid the pointer conversion entirely, which is only possible by solving the XY-problem. Some options:
You could use std::basic_string<unsigned char> in the first place.
If you only need an iterator to unsigned char and not necessarily a pointer, then you could use std::ranges::views::transform which uses static cast for each element.
You could change the function that expects unsigned char* to accept char* instead.
If you cannot change the type of input, do need an unsigned char*, and still must avoid reinterpret_cast, then you could create the std::basic_string<unsigned char> from the input using the transform view. But this has potential overhead, so consider whether avoiding reinterpret_cast is worth it.
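The copying option from the list above can be sketched roughly as follows. This is an illustration, not the answerer's code: it swaps std::basic_string<unsigned char> for std::vector<unsigned char>, on the assumption that a vector is acceptable, because unsigned char is not among the character types for which the standard guarantees a std::char_traits specialization. The helper name to_bytes is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies the characters of input into an unsigned char
// buffer. Each element is converted individually, so no pointer cast is needed.
std::vector<unsigned char> to_bytes(const std::string& input) {
    std::vector<unsigned char> bytes(input.begin(), input.end());
    bytes.push_back(0);  // keep a NUL terminator, as input.c_str() would provide
    return bytes;
}
```

The function expecting const unsigned char* can then be given bytes.data(), as long as the vector outlives the call. The cost is one allocation and one copy per conversion.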
Edit
Apparently type punning with a union is UB, so definitely don't do this.
(Keeping the answer for posterity though!)
To strictly answer your question, there's this way:
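#include <iostream>
#include <string>

void foo(const unsigned char* str) {
    std::cout << str << std::endl;
}

int main()
{
    std::string word = "test";
    // foo(word.data()); fails: const char* does not convert to const unsigned char*
    union { const char* ccptr; const unsigned char* cucptr; } uword;
    uword.ccptr = word.data();
    foo(uword.cucptr); // reads the inactive union member, which is UB in C++
}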
Is this any better than a reinterpret_cast? Probably not.

conversion between char* and std::string and const char*

I am using C++ to program a robot with PROS. PROS has a print function that takes a const char*. I'm now using lvgl to create my own screen, and I want to replicate that print function. Like printf(), it should accept variadic parameters so that format specifiers such as %d are replaced with the corresponding values. The problem is the conversion between string types. I want to write a convert function that turns a format string and the variadic parameters into one complete string. The input is a string like "hey", and I'm unsure what its type should be: I need to get its size and search it for %d specifiers, but the function must return a const char* to pass on to lvgl to print on the screen. I am having a hard time converting a string into a const char* for the output of the convert function.
Also, I tried using char* as the input type, and when I pass a string literal like "hello" I get an error: [ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]]. When I use const char* instead, the error disappears. Does anyone know why?
Thanks everyone for your kind help!
char* and const char* are two flavours of the same thing: C-style strings. These are a series of bytes with a NUL terminator (0-byte). To use these you need to use the C library functions like strdup, strlen and so on. These must be used very carefully as missing out on the terminator, which is all too easy to do by accident, can result in huge problems in the form of buffer-overflow bugs.
std::string is how strings are represented in C++. They're a lot more capable: the related std::wstring supports "wide" characters, and a std::string can hold variable-length encodings like UTF-8. Because a std::string tracks its own length rather than relying on you to maintain a NUL terminator, the usual buffer-overflow mistakes are much harder to make, and it is really quite safe to use. Memory allocation is handled by the Standard Library without you having to pay much attention to it.
You can convert back and forth as necessary, but it's usually best to stick to std::string inside of C++ as much as you can.
To convert from C++ to C:
std::string cppstring("test");
const char* c_string = cppstring.c_str();
To convert from C to C++:
const char* c_string = "test";
std::string cppstring(c_string);
Note you can convert from char* (mutable) to const char* (immutable) but not in reverse. Sometimes things are flagged const because you're not allowed to change them, or that changing them would cause huge problems.
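This also explains the -Wwritable-strings error from the question: a string literal is an array of const char, so it can be passed as const char* but not as char*. When the text has to be modified, one option is to copy it into a std::string first. The sketch below assumes nothing beyond the standard library; the helper name to_upper_copy is hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy of a C string.
// The parameter is const char*, so string literals can be passed directly.
std::string to_upper_copy(const char* text) {
    std::string copy(text);  // mutable copy owned by the std::string
    for (char& ch : copy) {
        // std::toupper expects a value representable as unsigned char.
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    return copy;
}
```

Calling to_upper_copy("hello") leaves the literal untouched and modifies only the copy.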
You don't really have to "convert" though, you just use char* as you would const char*.
std::string A = "hello"; //< initialise a string from a string literal (const char*)
const char* const B = A.c_str(); //< call c_str() method to access the C string
std::string C = B; //< assignment works just fine (with allocation though!)
printf("%s", C.c_str()); //< pass to printf via %s & c_str() method
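For the printf-style convert function the question asks about, one possible approach is to let std::vsnprintf do the formatting and return the result as a std::string. This is a sketch under the assumption that standard printf format specifiers are sufficient; the name format_string is hypothetical and is not part of PROS or lvgl.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats like printf but returns a std::string.
std::string format_string(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);  // a va_list can only be traversed once

    // First pass: ask vsnprintf how many characters the result needs.
    const int needed = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    if (needed < 0) {  // formatting error
        va_end(args_copy);
        return std::string();
    }

    // Second pass: format into a buffer with room for the NUL terminator.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), fmt, args_copy);
    va_end(args_copy);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

The caller can keep the returned std::string in a local variable and pass its c_str() to the display function. Returning a const char* directly from the helper would leave a dangling pointer once the string is destroyed.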

How to initialise std::istringstream from const unsigned char* without cast or copy?

I have binary data in a byte sequence described by const unsigned char *p and size_t len. I want to be able to pass this data to a function that expects a std::istream *.
I think I should be able to do this without copying the data, unsafe casts or writing a new stream class. But so far I'm failing. Can anyone help?
Update
Thanks all for the comments. This would seem to be an unanswerable question because std::istream operates with char and conversion would at some point require at least an integer cast from unsigned char.
The pragmatic approach is to do this:
std::string s(reinterpret_cast<const char*>(p), len);
std::istringstream i(s);
and pass &i to the function expecting std::istream *.
Your answer is still copying.
Have you considered something like this?
const unsigned char *p;
size_t len;
std::istringstream str;
str.rdbuf()->pubsetbuf(
reinterpret_cast<char*>(const_cast<unsigned char*>(p)), len);
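One caveat with the snippet above: what pubsetbuf does on a string stream's buffer is implementation-defined, so it may not use the supplied memory on every standard library. If that matters, a small read-only streambuf is one alternative, although it goes against the question's wish to avoid a new class and it still needs the casts. The class name MemoryBuf is hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>

// Hypothetical helper: a read-only stream buffer over existing bytes.
// Nothing is copied; the caller must keep the bytes alive while reading.
class MemoryBuf : public std::streambuf {
public:
    MemoryBuf(const unsigned char* p, std::size_t len) {
        // setg takes non-const char*, but this buffer never writes through it.
        char* begin = reinterpret_cast<char*>(const_cast<unsigned char*>(p));
        setg(begin, begin, begin + len);  // define the readable range
    }
};
```

A std::istream constructed from a MemoryBuf (for example std::istream in(&buf);) can then be passed by address to the function expecting std::istream *. Seeking is not supported in this sketch because seekoff and seekpos are not overridden.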

libxml2 xmlChar* cast to char*

How would you convert / cast an xmlChar* to char* from the libxml2 library? Thanks.
If you take a look at the examples, for instance io2.c, you'll notice that they just blithely cast it to a char *:
printf("%s", (char *) xmlbuff);
Looks like it's just unsigned char. So it should be safe to cast as long as you're not doing arithmetic on it.
But, you probably don't need to as that page has the key string functionality implemented in terms of the type.

const unsigned char * to std::string

sqlite3_column_text returns a const unsigned char*, how do I convert this to a std::string? I've tried std::string(), but I get an error.
Code:
temp_doc.uuid = std::string(sqlite3_column_text(this->stmts.read_documents, 0));
Error:
1>.\storage_manager.cpp(109) : error C2440: '<function-style-cast>' : cannot convert from 'const unsigned char *' to 'std::string'
1> No constructor could take the source type, or constructor overload resolution was ambiguous
You could try:
temp_doc.uuid = std::string(reinterpret_cast<const char*>(
sqlite3_column_text(this->stmts.read_documents, 0)
));
While std::string could have a constructor that takes const unsigned char*, apparently it does not.
Why not, then? You could have a look at this somewhat related question: Why do C++ streams use char instead of unsigned char?
On the off-chance you actually want a string of unsigned characters, you could create your own type:
typedef std::basic_string <unsigned char> ustring;
You should then be able to say things like:
ustring s = sqlite3_column_text(this->stmts.read_documents, 0);
The reason people typically use an (unsigned char *) type is to indicate that the data is binary and not plain ASCII text. I know libxml does this, and from the looks of it, sqlite is doing the same thing.
The data you're getting back from the sqlite call is probably UTF-8 encoded Unicode text. While a reinterpret_cast may appear to work, if someone ever stores text in the field that is not plain ASCII, your program probably won't be well-behaved.
The std::string class isn't designed with Unicode in mind, so if you ask for the length() of a string, you'll get the number of bytes, which, in UTF-8, is not necessarily the same thing as the number of characters.
Short answer: the simple cast may work, if you're certain the data is just ASCII. If it can be any UTF-8 data, then you need to handle encoding/decoding in a smarter way.
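To illustrate the difference between bytes and characters: the cast copies the bytes unchanged, so the caveat is about how they are interpreted afterwards. The following sketch counts UTF-8 code points by skipping continuation bytes. It assumes the input is already valid UTF-8 and does no validation; the name utf8_length is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 code points in a std::string.
// Continuation bytes have the bit pattern 10xxxxxx and are not counted.
// Assumes valid UTF-8; malformed input is not detected.
std::size_t utf8_length(const std::string& text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For a string holding "café" encoded as UTF-8, size() reports 5 bytes while utf8_length reports 4 code points.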
I'm not familiar with sqlite3_column_text, but when you call the std::string constructor, you'll want to cast to (const char*). I believe that it has a constructor for that type.
However, it is odd that this sqlite function returns an unsigned char*. Is it returning a Pascal string (where the first char is the length of the string)? If so, then you'll have to create the std::string with the bytes and the length.
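For what it's worth, sqlite3_column_text returns NUL-terminated text rather than a Pascal string, and SQLite's C API reports the byte length separately through sqlite3_column_bytes. Building the std::string from pointer plus length can be sketched as below. The helper name from_bytes is hypothetical, and the null check is there because a NULL column yields a null pointer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a byte pointer and a length.
// Embedded NUL bytes are preserved because the length is given explicitly.
std::string from_bytes(const unsigned char* data, std::size_t len) {
    if (data == nullptr) {
        return std::string();  // e.g. the column held SQL NULL
    }
    return std::string(reinterpret_cast<const char*>(data), len);
}
```

In the question's code this would be called with the result of sqlite3_column_text as data and the result of sqlite3_column_bytes for the same column as len.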
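If temp_doc.uuid is a std::string, try the following. Note that static_cast cannot convert between const unsigned char* and const char*, so reinterpret_cast is needed:
temp_doc.uuid = reinterpret_cast<const char*>(sqlite3_column_text(this->stmts.read_documents, 0));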
Try:
temp_doc.uuid = std::string(reinterpret_cast<const char*>(sqlite3_column_text(this->stmts.read_documents, 0)));
You can't construct a std::string from const unsigned char* -- you have to cast it to const char* first:
temp_doc.uuid = std::string( reinterpret_cast< const char* >(
sqlite3_column_text(this->stmts.read_documents, 0) ) );
I'm no expert but this example here seems much simpler:
string name = (const char*) (sqlite3_column_text(res, 0));
This is an old but important question if you have to preserve the full information in the unsigned char sequence. In my opinion, reinterpret_cast does not guarantee that. I found an interesting solution under "converting string to vector", which I modified to:
basic_string<unsigned char> temp = sqlite3_column_text(stmt, 0);
string firstItem( temp.begin(), temp.end() );
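A variation on the same idea skips the intermediate basic_string<unsigned char>, which depends on a std::char_traits specialization the standard does not guarantee for unsigned char. The sketch below uses the iterator-range constructor of std::string directly on the pointer, so each element is converted individually and no pointer cast is involved. The name copy_without_cast is hypothetical.

```cpp
#include <string>

// Hypothetical helper: copies a NUL-terminated unsigned char buffer into a
// std::string without casting the pointer.
std::string copy_without_cast(const unsigned char* text) {
    if (text == nullptr) {
        return std::string();
    }
    const unsigned char* end = text;
    while (*end != 0) {
        ++end;  // advance to the NUL terminator
    }
    // The iterator-range constructor converts each unsigned char to char.
    return std::string(text, end);
}
```

The resulting std::string holds the same bytes as the reinterpret_cast version; the difference is only in how the conversion is expressed.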
Since I am programming for gtkmm, you can realize the conversion into a Glib::ustring with
basic_string<unsigned char> temp = sqlite3_column_text(stmt, 0);
Glib::ustring firstItem = string( temp.begin(), temp.end() );