How to convert BSTR to Platform::String? - C++

I am working on a Windows Metro App, developing a C++ component that retrieves the input method's candidate list.
An issue occurs at the last step, when outputting the result list. I can get each member of the candidate list, but its type is BSTR; to use the members in the Metro App, I have to convert them to Platform::String.
How do I convert a BSTR to a Platform::String?
I really appreciate any help you can provide.

The Platform::String constructor overloads make this very easy:
BSTR comstr = SysAllocString(L"Hello world");
auto wrtstr = ref new Platform::String(comstr);
To be perfectly complete, both BSTR and a WinRT string can contain an embedded 0. If you want to handle that corner case then:
BSTR comstr = SysAllocStringLen(L"Hello\0world", 11);
auto wrtstr = ref new Platform::String(comstr, SysStringLen(comstr));

According to MSDN ([1],[2]) we have these definitions:
#if !defined(_NATIVE_WCHAR_T_DEFINED)
typedef unsigned short WCHAR;
#else
typedef wchar_t WCHAR;
#endif
typedef WCHAR OLECHAR;
typedef OLECHAR* BSTR;
So BSTR is of type wchar_t* or unsigned short*, i.e. a null-terminated 16-bit string.
From what I can see in the documentation of Platform::String, the constructor accepts a null-terminated 16-bit string as a const char16*.
While char16 is guaranteed to represent UTF-16, wchar_t is not.
UPDATE: As Cheers and hth. - Alf pointed out, the line above isn't true for Windows/MSVC. I couldn't find any information on whether it is safe to cast between the two types in this case. Since wchar_t and char16_t are distinct integral types, I would still recommend a proper conversion (such as std::codecvt) to avoid problems.
So you should convert from wchar_t* to char16_t* (for example via std::codecvt, or an element-wise copy since both are 16 bits wide on this platform) and pass the result to the constructor of your Platform::String.
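As a rough illustration, here is a minimal sketch of the element-wise route (assuming MSVC, where wchar_t is a 16-bit UTF-16 code unit; the helper name ToUtf16 is mine):
#include <string>

// Sketch: widen each 16-bit wchar_t unit into a char16_t unit.
// Only valid under the assumption that wchar_t already holds UTF-16.
std::u16string ToUtf16(const std::wstring& w)
{
    return std::u16string(w.begin(), w.end());
}
The result's c_str() can then be handed to the Platform::String constructor.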

Related

Conversion (const char*) var goes wrong

I need to convert from CString to double in Embedded Visual C++, which supports only old-style C++. I am using the following code
CString str = "4.5";
double var = atof( (const char*) (LPCTSTR) str );
and the result is var=4.0, so I am losing decimal digits.
I have made another test
LPCTSTR str = "4.5";
const char* var = (const char*) str;
and again the result is var=4.0
Can anyone help me to get a correct result?
The issue here is that you are lying to the compiler, and the compiler trusts you. Since you are using Embedded Visual C++, I'm going to assume that you are targeting Windows CE. Windows CE exposes a Unicode-only API surface, so your project is very likely set to use Unicode (UTF-16 LE encoding).
In that case, CString expands to CStringW, which stores code units as wchar_t. When doing (const char*) (LPCTSTR) str you are then casting from a wchar_t const* to a char const*. Given the input, the first byte has the value 52 (the ASCII encoding for the character 4). The second byte has the value 0. That is interpreted as the terminator of the C-style string. In other words, you are passing the string "4" to your call to atof. Naturally, you'll get the value 4.0 as the result.
To fix the code, use something like the following:
CStringW str = L"4.5";
double var = _wtof( str.GetString() );
_wtof is a Microsoft-specific extension to its CRT.
Note two things in particular:
The code uses a CString variant with explicit character encoding (CStringW). Always be explicit about your string types. This helps readers follow your code and catches bugs before they happen (although all those C-style casts in the original code defeat that entirely).
The code calls the CString::GetString member to retrieve a pointer to the immutable buffer. This, too, makes the code easier to read, by not using what looks to be a C-style cast (but is an operator instead).
Also consider defining the _CSTRING_DISABLE_NARROW_WIDE_CONVERSION macro to prevent inadvertent character set conversions from happening (e.g. CString str = "4.5";). This, too, helps you catch bugs early (unless you defeat that with C-style casts as well).
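For illustration, here is a sketch of how that macro is typically put to use (assuming the ATL string header atlstr.h; a project-wide preprocessor definition works just as well):
// Define before including the ATL/MFC string headers so that implicit
// narrow<->wide CString conversions become compile-time errors.
#define _CSTRING_DISABLE_NARROW_WIDE_CONVERSION
#include <atlstr.h>

CStringW good = L"4.5"; // OK: wide literal into a wide string
// CStringW bad = "4.5"; // would now fail to compile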
CString is not const char*. To convert a TCHAR CString to ASCII, use the CT2A macro; this will also allow you to convert the string to UTF-8 (or any other Windows code page):
// Convert using the local code page
CString str(_T("Hello, world!"));
CT2A ascii(str);
TRACE(_T("ASCII: %S\n"), ascii.m_psz);
// Convert to UTF8
CString str(_T("Some Unicode goodness"));
CT2A ascii(str, CP_UTF8);
TRACE(_T("UTF8: %S\n"), ascii.m_psz);
Found a solution using _stscanf:
CString str = _T("4.5");
double var=0.0;
_stscanf( str, _T("%lf"), &var );
This gives the correct result, var=4.5.
Thanks everyone for comments and help.

Only first character is assigned converting LPCTSTR to char*

I'm completely new to C++. In my program there's a function which has to take an LPCTSTR as a parameter. I want to convert it into a char*. What I tried is as follows:
char* GetChar(LPCTSTR var){
char* id = (char*)var;
.....
}
But while debugging I noticed that only the first letter of var is assigned to id.
What have I done wrong?
(I tried various answers in StackOverflow about converting LPCTSTR to char* before coming to this solution. None of them worked for me.)
UPDATE
What I want is to get the full string pointed to by var, treated as a char*.
It is much more useful to pick one character set (wchar_t or char) and stick to it in your application; trying to support both through TCHAR may cause you some headaches. To be fair, today you can just safely use wchar_t (or WCHAR, since from the types you are using I suspect you are using the Windows headers).
The problem you have is that casting a pointer does not have any impact on its contents. And since wchar_t is typically 2 bytes in size while char is 1 byte, storing a value that fits inside a char in a wchar_t leaves the 2nd byte of the wchar_t set to \0. When you then try to print a null(\0)-terminated string of wchar_ts as a string of chars, the printing function reaches the \0 after reading the first symbol and assumes it is the end of the string. (A \0 character stored as a wchar_t is 2 bytes long.)
For example, the string
LPCWSTR test = L"Hi!";
is stored in memory as:
48 00 69 00 21 00 00 00
If you want to convert between the wchar_t version of the string and the char version, or vice versa, there are functions that can do the conversion. Since you probably are using the Windows headers (judging from the LPCTSTR define), those functions are WideCharToMultiByte / MultiByteToWideChar.
You may now start to think: I am not using wchar_t! I am using TCHAR!
Typically TCHAR is defined in the following way:
#ifdef UNICODE
typedef WCHAR TCHAR;
#else
typedef char TCHAR;
#endif
So you could do similar handling in your conversion code:
template<int N>
bool GetChar(LPCTSTR var, char (&out)[N]) {
#ifdef UNICODE
    return WideCharToMultiByte(CP_ACP, 0, var, -1, out, N, NULL, NULL) != 0;
#else
    return strcpy_s(out, var) == 0;
#endif
}
Note: the return value of the GetChar function is true if the function succeeds, and false otherwise.
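A hypothetical call site for the sketch above could look like this (assuming <tchar.h> and <cstdio> are included):
char narrow[64];
if (GetChar(_T("Hello, world"), narrow))
{
    printf("%s\n", narrow); // narrow now holds the converted string
}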
Your code has told the compiler to convert var (which is a pointer) into a pointer to char and then assign that converted value to id. The only thing it converts is the pointer value. It doesn't make any changes to the thing var points to, copy it, or convert it. So you haven't done anything to the string var points to.
It's not clear what you're trying to do. But your code doesn't really do anything except convert a pointer value, without changing or affecting the thing pointed to in any way.
When you convert an LPCTSTR (a long pointer to a const TCHAR string) to a char*, you get a char* that points to a const TCHAR string. What use is that? What sense does that make?
Most probably LPCTSTR is const wchar_t*, so if you cast it to char* (which is undefined behaviour, as var could point to a literal), the high byte of the first 16-bit character (wchar_t under Visual Studio is 16 bits) is zero, so it is treated as '\0', which indicates the end of the string. So in the end you get only one char.
To convert LPCTSTR to char* you can use wcstombs, for example; see here: Convert const wchar_t* to const char*.
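As a minimal sketch of the wcstombs route (assuming a UNICODE build where LPCTSTR is const wchar_t*, and that the current locale can represent the characters; the helper name ToNarrow is mine):
#include <cstdlib>

// Sketch: narrow a wide string into a caller-provided buffer.
bool ToNarrow(const wchar_t* wide, char* out, size_t outSize)
{
    size_t n = std::wcstombs(out, wide, outSize);
    if (n == (size_t)-1) // hit a character that cannot be converted
        return false;
    if (n == outSize) // buffer completely filled: terminate by truncating
        out[outSize - 1] = '\0';
    return true;
}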
Here's an easy solution I found based on other answers given here.
char* GetChar(LPCTSTR var){
    // Fixed-size buffer: input longer than 29 characters would overflow,
    // and the buffer is static so the returned pointer stays valid.
    static char id[30];
    int i = 0;
    while (var[i] != '\0' && i < 29)
    {
        id[i] = (char)var[i]; // narrowing copy: drops the high byte
        i++;
    }
    id[i] = '\0';
    return id;
}
UPDATE
As mentioned in the comments, this is not a good way to solve this problem. But if someone has the same problem and cannot understand any other solution, this will help a bit. Therefore I won't remove this answer.

How would you convert a std::string to BSTR*?

How would you convert a std::string to BSTR*?
STDMETHODIMP CMyRESTApp::rest(BSTR data, BSTR* restr)
{
RESTClient restclient;
RESTClient::response resp = restclient.get(data);
Log("Response Status code: %s", resp.code);
Log("Response Body: %s", resp.body);
*restr = // here
return S_OK;
}
I need to convert resp.body, and the result then has to be returned via *restr here.
An ATL-based approach is to use ATL::CComBSTR and then Detach() (or CopyTo(...)) the resulting CComBSTR into the BSTR*.
Something like:
CComBSTR temp(stlstr.c_str());
*restr = temp.Detach();
Otherwise, in general, for std::basic_string you can use the Win32 API Sys* family of functions, such as SysAllocStringByteLen and SysAllocString:
// For the `const char*` data type (`LPCSTR`); copies raw bytes, no conversion
*restr = SysAllocStringByteLen(stlstr.c_str(), (UINT)stlstr.size());
// More suitable for OLECHAR
*restr = SysAllocString(stlwstr.c_str());
OLECHAR depends on the target platform, but generally it is wchar_t.
Given your code, the shortest snippet could just be;
*restr = SysAllocStringByteLen(resp.body.c_str(), (UINT)resp.body.size());
Note that these Windows API functions use the "usual" Windows code-page conversions; please see the MSDN documentation on how to control this if required.
std::string is made of chars; a BSTR is usually a Unicode UTF-16 wchar_t-based string, with a length prefix.
Even though one could use a BSTR as a simple way to marshal a byte array (since a BSTR is length-prefixed, it can store embedded NULs), and so a BSTR could potentially store non-UTF-16 text as well, the usual "natural" behavior for a BSTR is to contain a Unicode UTF-16 wchar_t string.
So, the first problem is to clarify what kind of encoding the std::string uses (for example: Unicode UTF-8? Or some other code page?). Then you have to convert that string to Unicode UTF-16, and create a BSTR containing that UTF-16 string.
To convert from UTF-8 (or some other code page) to UTF-16, you can use the MultiByteToWideChar() function. If the source std::string contains a UTF-8 string, you can use the CP_UTF8 code page value with the aforementioned API.
Once you have the UTF-16 converted string, you can create a BSTR using it, and pass that as the output BSTR* parameter.
The main Win32 API to create a BSTR is SysAllocString(). There are also some variants in which you can specify the string length.
Or, as a more convenient alternative, you can use the ATL's CComBSTR class to wrap a BSTR in safe RAII boundaries, and use its Detach() method to pass the BSTR as an output BSTR* parameter.
CComBSTR bstrResult( /* UTF-16 string from std::string */ );
*restr = bstrResult.Detach();
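Putting the pieces together, a minimal sketch (assuming the std::string holds UTF-8; the helper name Utf8ToBstr is mine):
#include <windows.h>
#include <atlbase.h>
#include <string>

// Sketch: UTF-8 std::string -> BSTR via MultiByteToWideChar + CComBSTR.
BSTR Utf8ToBstr(const std::string& s)
{
    if (s.empty())
        return ::SysAllocString(L"");
    // The first call computes the required length in wchar_t units.
    int len = ::MultiByteToWideChar(CP_UTF8, 0, s.data(), (int)s.size(), nullptr, 0);
    std::wstring wide(len, L'\0');
    ::MultiByteToWideChar(CP_UTF8, 0, s.data(), (int)s.size(), &wide[0], len);
    CComBSTR bstr((int)wide.size(), wide.data()); // length-aware constructor
    return bstr.Detach();
}
With that in place, the method body could simply end with *restr = Utf8ToBstr(resp.body);.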
Bonus reading:
Eric's Complete Guide To BSTR Semantics
This is very much possible:
#include <comdef.h> // for _bstr_t
std::string singer("happy new year 2016");
_bstr_t sa_1(singer.c_str()); // std::string to _bstr_t
_bstr_t sa_2("Goodbye 2015");
std::string kapa((const char*)sa_2); // _bstr_t to std::string (explicit cast picks one of _bstr_t's conversion operators)
size_t sztBuffer = (resp.body.length() + 1) * sizeof(wchar_t);
wchar_t* pBuffer = new wchar_t[resp.body.length() + 1];
ZeroMemory(pBuffer, sztBuffer); // zero-fill so the result is NUL-terminated
MultiByteToWideChar(CP_ACP, 0, resp.body.c_str(), (int)resp.body.length(), pBuffer, (int)resp.body.length());
*restr = SysAllocString(pBuffer); // copies the buffer into a new BSTR
delete[] pBuffer;
Do not forget that the returned BSTR must eventually be freed with SysFreeString by whoever owns it.

What is the difference and the relationship of char and CString [duplicate]

This question already has answers here:
What is `CString`?
(3 answers)
Closed 9 years ago.
Can someone explain to me the difference and the relationship between char * and CString? Thanks.
There are a few important differences.
char * is a pointer to char. Generally you can't tell whether it points to a single char or to the beginning of a string, nor what the length is. All those things are dictated by program logic and by conventions, i.e. the standard C functions like to use const char * as inputs. You need to manage the memory allocated for strings manually.
CString is a typedef. Depending on your program's compilation options, it names either the CStringA or the CStringW class. There are differences and similarities.
The difference is that CStringA operates on non-Unicode data (similar to char*), and CStringW is a Unicode string (similar to wchar_t*).
Both classes, however, are equivalent in the aspect of string manipulation and storage management. They are closer to the standard C++ std::string and std::wstring classes.
Apart from that, both CStringA and CStringW provide the capability to convert strings to and from Unicode form.
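For example, a short sketch of that cross-conversion (the cross-type constructors convert using the current code page):
#include <atlstr.h>

CStringA narrow("hello"); // 8-bit string
CStringW wide(narrow); // narrow -> wide conversion
CStringA back(wide); // and wide -> narrow again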
a CString holds an array of char, and a char* is a pointer into such an array, with which you can iterate over the characters of the string.
Actually from MSDN:
CString is based on the TCHAR data type. If the symbol _UNICODE is defined for your program, TCHAR is defined as type wchar_t, a 16-bit character type; otherwise, it is defined as char, the normal 8-bit character type. Under Unicode, then, CString objects are composed of 16-bit characters. Without Unicode, they are composed of 8-bit char type.
CString is a class packed with different functionality (see MSDN).
char * is just a regular C++ data type.
CString is used mostly in MFC applications.
CString is a sequence of TCHARs rather than a char*. The main difference is that if UNICODE is defined, CString will be a sequence of wchar_t. Actually, depending on that macro, CString will be typedef'ed to either CStringA or CStringW. Another major difference is that CString is a class, while char* is simply a pointer to characters.
Depending on the type of TCHAR, CString can be either CStringA or CStringW.
That said, CString is a wrapper over an array of characters that enables you to easily treat that array as a string and operate on it in ways relevant to the string type.
For the relationship between them, here is something that illustrates it easily. You can convert between char * and CString like this:
CString str = "abc"; // const char[3] or char * to CString
and
const char * p = str.GetString(); // CString to const char * (MBCS build; in a Unicode build GetString() returns const wchar_t*)
A CString is a class and provides lots of functionality that a char * doesn't. A char * is just a pointer to a char or a char array.
A CString contains a buffer that is roughly the same as a char *: LPTSTR GetBuffer( int nMinBufLength );
For the difference between LPTSTR and char * go here and here
CString is a wrapper class around a TCHAR* that provides some useful additional functions and hides the memory allocation/deallocation from the user.
There is not much difference in performance terms, so if you are using the MFC classes, you might as well use a CString.

Why BSTR and how to convert it to QString?

I'm working with a Microsoft Kinect SDK where functions return BSTR. I need to get a QString or std::string.
Here's what I've tried:
BSTR bstr = s->NuiUniqueId();
// QString qs((QChar*)bstr, SysStringLen(bstr));
std::wstring ws(bstr);
ui->lblDetails->setText(QString::fromStdWString(ws));
With this solution the program crashes. With the line that is commented out, I get "unresolved external symbol SysStringLen".
Is SysStringLen the way to go (but I need to add some additional libraries; wouldn't the API include it already?), or is there another solution?
Additional question: why does Microsoft do it? I mean:
#if !defined(_NATIVE_WCHAR_T_DEFINED)
typedef unsigned short WCHAR;
#else
typedef wchar_t WCHAR;
#endif
typedef WCHAR OLECHAR;
typedef OLECHAR* BSTR;
typedef BSTR* LPBSTR;
What's the reason behind stuff like this? And even if they find it beneficial to use it internally, couldn't they just use a normal char array or std::(w)string in the API to make others' lives easier?
You can convert the BSTR to a char *, then convert that to QString. Here:
#include <comutil.h> // _com_util::ConvertBSTRToString; link against comsuppw.lib
QString getQStringFromBstr(BSTR bstrVal){
    // ConvertBSTRToString allocates a new char array that we must free.
    char *p = _com_util::ConvertBSTRToString(bstrVal);
    QString result(p);
    delete[] p;
    return result;
}
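Alternatively, since a BSTR is a wchar_t pointer on Windows, the commented-out line from the question is close: QString::fromWCharArray together with SysStringLen does the job, and the "unresolved external symbol" simply means OleAut32.lib must be linked. A sketch:
#include <windows.h>
#include <QString>

// Sketch: BSTR -> QString using the BSTR's length prefix,
// which also preserves any embedded NUL characters.
QString bstrToQString(BSTR bstr)
{
    return QString::fromWCharArray(bstr, (int)SysStringLen(bstr));
}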
COM was designed to be a language-agnostic binary standard, which means I could use a VB function from C++, and a C++ function from, say, C# (with COM interop). This is the reason most of the strings (and a few functions) were changed to language-neutral types, IIRC.