While working with COM in C++ the strings are usually of BSTR data type. Someone can use BSTR wrapper like CComBSTR or MS's CString. But because I can't use ATL or MFC in MinGW compiler, is there standard code snippet to convert BSTR to std::string (or std::wstring) and vice versa?
Are there also some non-MS wrappers for BSTR similar to CComBSTR?
Update
Thanks to everyone who helped me out in any way! Just because no one has addressed the issue on conversion between BSTR and std::string, I would like to provide here some clues on how to do it.
Below are the functions I use to convert BSTR to std::string and std::string to BSTR respectively:
std::string ConvertBSTRToMBS(BSTR bstr)
{
int wslen = ::SysStringLen(bstr);
return ConvertWCSToMBS((wchar_t*)bstr, wslen);
}
std::string ConvertWCSToMBS(const wchar_t* pstr, long wslen)
{
int len = ::WideCharToMultiByte(CP_ACP, 0, pstr, wslen, NULL, 0, NULL, NULL);
std::string dblstr(len, '\0');
len = ::WideCharToMultiByte(CP_ACP, 0 /* no flags */,
pstr, wslen /* not necessary NULL-terminated */,
&dblstr[0], len,
NULL, NULL /* no default char */);
return dblstr;
}
BSTR ConvertMBSToBSTR(const std::string& str)
{
int wslen = ::MultiByteToWideChar(CP_ACP, 0 /* no flags */,
str.data(), str.length(),
NULL, 0);
BSTR wsdata = ::SysAllocStringLen(NULL, wslen);
::MultiByteToWideChar(CP_ACP, 0 /* no flags */,
str.data(), str.length(),
wsdata, wslen);
return wsdata;
}
BSTR to std::wstring:
// given BSTR bs
assert(bs != nullptr);
std::wstring ws(bs, SysStringLen(bs));
std::wstring to BSTR:
// given std::wstring ws
assert(!ws.empty());
BSTR bs = SysAllocStringLen(ws.data(), ws.size());
Doc refs:
std::basic_string<typename CharT>::basic_string(const CharT*, size_type)
std::basic_string<>::empty() const
std::basic_string<>::data() const
std::basic_string<>::size() const
SysStringLen()
SysAllocStringLen()
There is a c++ class called _bstr_t. It has useful methods and a collection of overloaded operators.
For example, you can easily assign from a const wchar_t * or a const char * just doing _bstr_t bstr = L"My string"; Then you can convert it back doing const wchar_t * s = bstr.operator const wchar_t *();. You can even convert it back to a regular char const char * c = bstr.operator char *(); You can then just use the const wchar_t * or the const char * to initialize a new std::wstring oe std::string.
You could also do this
#include <comdef.h>
BSTR bs = SysAllocString("Hello");
std::wstring myString = _bstr_t(bs, false); // will take over ownership, so no need to free
or std::string if you prefer
EDIT: if your original string contains multiple embedded \0 this approach will not work.
Simply pass the BSTR directly to the wstring constructor, it is compatible with a wchar_t*:
BSTR btest = SysAllocString(L"Test");
assert(btest != NULL);
std::wstring wtest(btest);
assert(0 == wcscmp(wtest.c_str(), btest));
Converting BSTR to std::string requires a conversion to char* first. That's lossy since BSTR stores a utf-16 encoded Unicode string. Unless you want to encode in utf-8. You'll find helper methods to do this, as well as manipulate the resulting string, in the ICU library.
Related
How can I convert a std::string to LPCSTR? Also, how can I convert a std::string to LPWSTR?
I am totally confused with these LPCSTR LPSTR LPWSTR and LPCWSTR.
Are LPWSTR and LPCWSTR the same?
Call c_str() to get a const char * (LPCSTR) from a std::string.
It's all in the name:
LPSTR - (long) pointer to string - char *
LPCSTR - (long) pointer to constant string - const char *
LPWSTR - (long) pointer to Unicode (wide) string - wchar_t *
LPCWSTR - (long) pointer to constant Unicode (wide) string - const wchar_t *
LPTSTR - (long) pointer to TCHAR (Unicode if UNICODE is defined, ANSI if not) string - TCHAR *
LPCTSTR - (long) pointer to constant TCHAR string - const TCHAR *
You can ignore the L (long) part of the names -- it's a holdover from 16-bit Windows.
str.c_str() gives you a const char *, which is an LPCSTR (Long Pointer to Constant STRing) -- means that it's a pointer to a 0 terminated string of characters. W means wide string (composed of wchar_t instead of char).
These are Microsoft defined typedefs which correspond to:
LPCSTR: pointer to null terminated const string of char
LPSTR: pointer to null terminated char string of char (often a buffer is passed and used as an 'output' param)
LPCWSTR: pointer to null terminated string of const wchar_t
LPWSTR: pointer to null terminated string of wchar_t (often a buffer is passed and used as an 'output' param)
To "convert" a std::string to a LPCSTR depends on the exact context but usually calling .c_str() is sufficient.
This works.
void TakesString(LPCSTR param);
void f(const std::string& param)
{
TakesString(param.c_str());
}
Note that you shouldn't attempt to do something like this.
LPCSTR GetString()
{
std::string tmp("temporary");
return tmp.c_str();
}
The buffer returned by .c_str() is owned by the std::string instance and will only be valid until the string is next modified or destroyed.
To convert a std::string to a LPWSTR is more complicated. Wanting an LPWSTR implies that you need a modifiable buffer and you also need to be sure that you understand what character encoding the std::string is using. If the std::string contains a string using the system default encoding (assuming windows, here), then you can find the length of the required wide character buffer and perform the transcoding using MultiByteToWideChar (a Win32 API function).
e.g.
void f(const std:string& instr)
{
// Assumes std::string is encoded in the current Windows ANSI codepage
int bufferlen = ::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), NULL, 0);
if (bufferlen == 0)
{
// Something went wrong. Perhaps, check GetLastError() and log.
return;
}
// Allocate new LPWSTR - must deallocate it later
LPWSTR widestr = new WCHAR[bufferlen + 1];
::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), widestr, bufferlen);
// Ensure wide string is null terminated
widestr[bufferlen] = 0;
// Do something with widestr
delete[] widestr;
}
Using LPWSTR you could change contents of string where it points to. Using LPCWSTR you couldn't change contents of string where it points to.
std::string s = SOME_STRING;
// get temporary LPSTR (not really safe)
LPSTR pst = &s[0];
// get temporary LPCSTR (pretty safe)
LPCSTR pcstr = s.c_str();
// convert to std::wstring
std::wstring ws;
ws.assign( s.begin(), s.end() );
// get temporary LPWSTR (not really safe)
LPWSTR pwst = &ws[0];
// get temporary LPCWSTR (pretty safe)
LPCWSTR pcwstr = ws.c_str();
LPWSTR is just a pointer to original string. You shouldn't return it from function using the sample above. To get not temporary LPWSTR you should made a copy of original string on the heap. Check the sample below:
LPWSTR ConvertToLPWSTR( const std::string& s )
{
LPWSTR ws = new wchar_t[s.size()+1]; // +1 for zero at the end
copy( s.begin(), s.end(), ws );
ws[s.size()] = 0; // zero at the end
return ws;
}
void f()
{
std::string s = SOME_STRING;
LPWSTR ws = ConvertToLPWSTR( s );
// some actions
delete[] ws; // caller responsible for deletion
}
The MultiByteToWideChar answer that Charles Bailey gave is the correct one. Because LPCWSTR is just a typedef for const WCHAR*, widestr in the example code there can be used wherever a LPWSTR is expected or where a LPCWSTR is expected.
One minor tweak would be to use std::vector<WCHAR> instead of a manually managed array:
// using vector, buffer is deallocated when function ends
std::vector<WCHAR> widestr(bufferlen + 1);
::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), &widestr[0], bufferlen);
// Ensure wide string is null terminated
widestr[bufferlen] = 0;
// no need to delete; handled by vector
Also, if you need to work with wide strings to start with, you can use std::wstring instead of std::string. If you want to work with the Windows TCHAR type, you can use std::basic_string<TCHAR>. Converting from std::wstring to LPCWSTR or from std::basic_string<TCHAR> to LPCTSTR is just a matter of calling c_str. It's when you're changing between ANSI and UTF-16 characters that MultiByteToWideChar (and its inverse WideCharToMultiByte) comes into the picture.
Converting is simple:
std::string myString;
LPCSTR lpMyString = myString.c_str();
One thing to be careful of here is that c_str does not return a copy of myString, but just a pointer to the character string that std::string wraps. If you want/need a copy you'll need to make one yourself using strcpy.
The conversion is simple:
std::string str;
LPCSTR lpcstr = str.c_str();
The easiest way to convert a std::string to a LPWSTR is in my opinion:
Convert the std::string to a std::vector<wchar_t>
Take the address of the first wchar_t in the vector.
std::vector<wchar_t> has a templated ctor which will take two iterators, such as the std::string.begin() and .end() iterators. This will convert each char to a wchar_t, though. That's only valid if the std::string contains ASCII or Latin-1, due to the way Unicode values resemble Latin-1 values. If it contains CP1252 or characters from any other encoding, it's more complicated. You'll then need to convert the characters.
std::string myString("SomeValue");
LPSTR lpSTR = const_cast<char*>(myString.c_str());
myString is the input string and lpSTR is it's LPSTR equivalent.
How can I convert a std::string to LPCSTR? Also, how can I convert a std::string to LPWSTR?
I am totally confused with these LPCSTR LPSTR LPWSTR and LPCWSTR.
Are LPWSTR and LPCWSTR the same?
Call c_str() to get a const char * (LPCSTR) from a std::string.
It's all in the name:
LPSTR - (long) pointer to string - char *
LPCSTR - (long) pointer to constant string - const char *
LPWSTR - (long) pointer to Unicode (wide) string - wchar_t *
LPCWSTR - (long) pointer to constant Unicode (wide) string - const wchar_t *
LPTSTR - (long) pointer to TCHAR (Unicode if UNICODE is defined, ANSI if not) string - TCHAR *
LPCTSTR - (long) pointer to constant TCHAR string - const TCHAR *
You can ignore the L (long) part of the names -- it's a holdover from 16-bit Windows.
str.c_str() gives you a const char *, which is an LPCSTR (Long Pointer to Constant STRing) -- means that it's a pointer to a 0 terminated string of characters. W means wide string (composed of wchar_t instead of char).
These are Microsoft defined typedefs which correspond to:
LPCSTR: pointer to null terminated const string of char
LPSTR: pointer to null terminated char string of char (often a buffer is passed and used as an 'output' param)
LPCWSTR: pointer to null terminated string of const wchar_t
LPWSTR: pointer to null terminated string of wchar_t (often a buffer is passed and used as an 'output' param)
To "convert" a std::string to a LPCSTR depends on the exact context but usually calling .c_str() is sufficient.
This works.
void TakesString(LPCSTR param);
void f(const std::string& param)
{
TakesString(param.c_str());
}
Note that you shouldn't attempt to do something like this.
LPCSTR GetString()
{
std::string tmp("temporary");
return tmp.c_str();
}
The buffer returned by .c_str() is owned by the std::string instance and will only be valid until the string is next modified or destroyed.
To convert a std::string to a LPWSTR is more complicated. Wanting an LPWSTR implies that you need a modifiable buffer and you also need to be sure that you understand what character encoding the std::string is using. If the std::string contains a string using the system default encoding (assuming windows, here), then you can find the length of the required wide character buffer and perform the transcoding using MultiByteToWideChar (a Win32 API function).
e.g.
void f(const std:string& instr)
{
// Assumes std::string is encoded in the current Windows ANSI codepage
int bufferlen = ::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), NULL, 0);
if (bufferlen == 0)
{
// Something went wrong. Perhaps, check GetLastError() and log.
return;
}
// Allocate new LPWSTR - must deallocate it later
LPWSTR widestr = new WCHAR[bufferlen + 1];
::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), widestr, bufferlen);
// Ensure wide string is null terminated
widestr[bufferlen] = 0;
// Do something with widestr
delete[] widestr;
}
Using LPWSTR you could change contents of string where it points to. Using LPCWSTR you couldn't change contents of string where it points to.
std::string s = SOME_STRING;
// get temporary LPSTR (not really safe)
LPSTR pst = &s[0];
// get temporary LPCSTR (pretty safe)
LPCSTR pcstr = s.c_str();
// convert to std::wstring
std::wstring ws;
ws.assign( s.begin(), s.end() );
// get temporary LPWSTR (not really safe)
LPWSTR pwst = &ws[0];
// get temporary LPCWSTR (pretty safe)
LPCWSTR pcwstr = ws.c_str();
LPWSTR is just a pointer to original string. You shouldn't return it from function using the sample above. To get not temporary LPWSTR you should made a copy of original string on the heap. Check the sample below:
LPWSTR ConvertToLPWSTR( const std::string& s )
{
LPWSTR ws = new wchar_t[s.size()+1]; // +1 for zero at the end
copy( s.begin(), s.end(), ws );
ws[s.size()] = 0; // zero at the end
return ws;
}
void f()
{
std::string s = SOME_STRING;
LPWSTR ws = ConvertToLPWSTR( s );
// some actions
delete[] ws; // caller responsible for deletion
}
The MultiByteToWideChar answer that Charles Bailey gave is the correct one. Because LPCWSTR is just a typedef for const WCHAR*, widestr in the example code there can be used wherever a LPWSTR is expected or where a LPCWSTR is expected.
One minor tweak would be to use std::vector<WCHAR> instead of a manually managed array:
// using vector, buffer is deallocated when function ends
std::vector<WCHAR> widestr(bufferlen + 1);
::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), &widestr[0], bufferlen);
// Ensure wide string is null terminated
widestr[bufferlen] = 0;
// no need to delete; handled by vector
Also, if you need to work with wide strings to start with, you can use std::wstring instead of std::string. If you want to work with the Windows TCHAR type, you can use std::basic_string<TCHAR>. Converting from std::wstring to LPCWSTR or from std::basic_string<TCHAR> to LPCTSTR is just a matter of calling c_str. It's when you're changing between ANSI and UTF-16 characters that MultiByteToWideChar (and its inverse WideCharToMultiByte) comes into the picture.
Converting is simple:
std::string myString;
LPCSTR lpMyString = myString.c_str();
One thing to be careful of here is that c_str does not return a copy of myString, but just a pointer to the character string that std::string wraps. If you want/need a copy you'll need to make one yourself using strcpy.
The conversion is simple:
std::string str;
LPCSTR lpcstr = str.c_str();
The easiest way to convert a std::string to a LPWSTR is in my opinion:
Convert the std::string to a std::vector<wchar_t>
Take the address of the first wchar_t in the vector.
std::vector<wchar_t> has a templated ctor which will take two iterators, such as the std::string.begin() and .end() iterators. This will convert each char to a wchar_t, though. That's only valid if the std::string contains ASCII or Latin-1, due to the way Unicode values resemble Latin-1 values. If it contains CP1252 or characters from any other encoding, it's more complicated. You'll then need to convert the characters.
std::string myString("SomeValue");
LPSTR lpSTR = const_cast<char*>(myString.c_str());
myString is the input string and lpSTR is it's LPSTR equivalent.
How can I convert a std::string to LPCSTR? Also, how can I convert a std::string to LPWSTR?
I am totally confused with these LPCSTR LPSTR LPWSTR and LPCWSTR.
Are LPWSTR and LPCWSTR the same?
Call c_str() to get a const char * (LPCSTR) from a std::string.
It's all in the name:
LPSTR - (long) pointer to string - char *
LPCSTR - (long) pointer to constant string - const char *
LPWSTR - (long) pointer to Unicode (wide) string - wchar_t *
LPCWSTR - (long) pointer to constant Unicode (wide) string - const wchar_t *
LPTSTR - (long) pointer to TCHAR (Unicode if UNICODE is defined, ANSI if not) string - TCHAR *
LPCTSTR - (long) pointer to constant TCHAR string - const TCHAR *
You can ignore the L (long) part of the names -- it's a holdover from 16-bit Windows.
str.c_str() gives you a const char *, which is an LPCSTR (Long Pointer to Constant STRing) -- means that it's a pointer to a 0 terminated string of characters. W means wide string (composed of wchar_t instead of char).
These are Microsoft defined typedefs which correspond to:
LPCSTR: pointer to null terminated const string of char
LPSTR: pointer to null terminated char string of char (often a buffer is passed and used as an 'output' param)
LPCWSTR: pointer to null terminated string of const wchar_t
LPWSTR: pointer to null terminated string of wchar_t (often a buffer is passed and used as an 'output' param)
To "convert" a std::string to a LPCSTR depends on the exact context but usually calling .c_str() is sufficient.
This works.
void TakesString(LPCSTR param);
void f(const std::string& param)
{
TakesString(param.c_str());
}
Note that you shouldn't attempt to do something like this.
LPCSTR GetString()
{
std::string tmp("temporary");
return tmp.c_str();
}
The buffer returned by .c_str() is owned by the std::string instance and will only be valid until the string is next modified or destroyed.
To convert a std::string to a LPWSTR is more complicated. Wanting an LPWSTR implies that you need a modifiable buffer and you also need to be sure that you understand what character encoding the std::string is using. If the std::string contains a string using the system default encoding (assuming windows, here), then you can find the length of the required wide character buffer and perform the transcoding using MultiByteToWideChar (a Win32 API function).
e.g.
void f(const std:string& instr)
{
// Assumes std::string is encoded in the current Windows ANSI codepage
int bufferlen = ::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), NULL, 0);
if (bufferlen == 0)
{
// Something went wrong. Perhaps, check GetLastError() and log.
return;
}
// Allocate new LPWSTR - must deallocate it later
LPWSTR widestr = new WCHAR[bufferlen + 1];
::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), widestr, bufferlen);
// Ensure wide string is null terminated
widestr[bufferlen] = 0;
// Do something with widestr
delete[] widestr;
}
Using LPWSTR you could change contents of string where it points to. Using LPCWSTR you couldn't change contents of string where it points to.
std::string s = SOME_STRING;
// get temporary LPSTR (not really safe)
LPSTR pst = &s[0];
// get temporary LPCSTR (pretty safe)
LPCSTR pcstr = s.c_str();
// convert to std::wstring
std::wstring ws;
ws.assign( s.begin(), s.end() );
// get temporary LPWSTR (not really safe)
LPWSTR pwst = &ws[0];
// get temporary LPCWSTR (pretty safe)
LPCWSTR pcwstr = ws.c_str();
LPWSTR is just a pointer to original string. You shouldn't return it from function using the sample above. To get not temporary LPWSTR you should made a copy of original string on the heap. Check the sample below:
LPWSTR ConvertToLPWSTR( const std::string& s )
{
LPWSTR ws = new wchar_t[s.size()+1]; // +1 for zero at the end
copy( s.begin(), s.end(), ws );
ws[s.size()] = 0; // zero at the end
return ws;
}
void f()
{
std::string s = SOME_STRING;
LPWSTR ws = ConvertToLPWSTR( s );
// some actions
delete[] ws; // caller responsible for deletion
}
The MultiByteToWideChar answer that Charles Bailey gave is the correct one. Because LPCWSTR is just a typedef for const WCHAR*, widestr in the example code there can be used wherever a LPWSTR is expected or where a LPCWSTR is expected.
One minor tweak would be to use std::vector<WCHAR> instead of a manually managed array:
// using vector, buffer is deallocated when function ends
std::vector<WCHAR> widestr(bufferlen + 1);
::MultiByteToWideChar(CP_ACP, 0, instr.c_str(), instr.size(), &widestr[0], bufferlen);
// Ensure wide string is null terminated
widestr[bufferlen] = 0;
// no need to delete; handled by vector
Also, if you need to work with wide strings to start with, you can use std::wstring instead of std::string. If you want to work with the Windows TCHAR type, you can use std::basic_string<TCHAR>. Converting from std::wstring to LPCWSTR or from std::basic_string<TCHAR> to LPCTSTR is just a matter of calling c_str. It's when you're changing between ANSI and UTF-16 characters that MultiByteToWideChar (and its inverse WideCharToMultiByte) comes into the picture.
Converting is simple:
std::string myString;
LPCSTR lpMyString = myString.c_str();
One thing to be careful of here is that c_str does not return a copy of myString, but just a pointer to the character string that std::string wraps. If you want/need a copy you'll need to make one yourself using strcpy.
The conversion is simple:
std::string str;
LPCSTR lpcstr = str.c_str();
The easiest way to convert a std::string to a LPWSTR is in my opinion:
Convert the std::string to a std::vector<wchar_t>
Take the address of the first wchar_t in the vector.
std::vector<wchar_t> has a templated ctor which will take two iterators, such as the std::string.begin() and .end() iterators. This will convert each char to a wchar_t, though. That's only valid if the std::string contains ASCII or Latin-1, due to the way Unicode values resemble Latin-1 values. If it contains CP1252 or characters from any other encoding, it's more complicated. You'll then need to convert the characters.
std::string myString("SomeValue");
LPSTR lpSTR = const_cast<char*>(myString.c_str());
myString is the input string and lpSTR is it's LPSTR equivalent.
Can anyone help in converting string to LPWSTR
string command=obj.getInstallationPath()+"<some string appended>"
Now i wat to pass it as parameter for CreateProcessW(xx,command,x...)
But createProcessW() accepts only LPWSTR so i need to cast string to LPWSTR
Thanks in Advance
If you have an ANSI string, then have you considered calling CreateProcessA instead? If there is a specific reason you need to call CreateProcessW then you will need to convert the string. Try the MultiByteToWideChar function.
Another way:
mbstowcs_s
use with string.c_str(), you can find example here or here
OR
USES_CONVERSION_EX;
std::string text = "text";
LPWSTR lp = A2W_EX(text.c_str(), text.length());
OR
{
std::string str = "String to LPWSTR";
BSTR b = _com_util::ConvertStringToBSTR(str.c_str());
LPWSTR lp = b;
Use lp before SysFreeString...
SysFreeString(b);
}
The easiest way to convert an ansi string to a wide (unicode) string is to use the string conversion macros.
To use these, put USES_CONVERSION at the top of your function, then you can use macros like A2W() to perform the conversion very easily.
eg.
char* sz = "tadaaa";
CreateProcessW(A2W(sz), ...);
The macros allocate space on the stack, perform the conversion and return the converted string.
Also, you might want to consider using TCHAR throughout... If I'm correct, the idea would be something like this:
typedef std::basic_string<TCHAR> tstring
// Make any methods you control return tstring values. Thus, you could write:
tstring command = obj.getInstallationPath();
CreateProcess(x, command.c_str(), ...);
Note that we use CreateProcess instead of CreateProcessW or CreateProcessA. The idea is that if UNICODE is defined, then TCHAR is typedefed to WCHAR and CreateProcess is #defined to be CreateProcessW, which accepts a LPWSTR; but if UNICODE is not defined, then TCHAR becomes char, and CreateProcess becomes CreateProcessA, which accepts a LPSTR. But I might not have the details right here... this stuff seems somewhat needlessly complicated :(.
Here is another option. I've been using this function to do the conversion.
//C++ string to WINDOWS UNICODE string
std::wstring s2ws(const std::string& s)
{
int len;
int slength = (int)s.length() + 1;
len = MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, 0, 0);
wchar_t* buf = new wchar_t[len];
MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, buf, len);
std::wstring r(buf);
delete[] buf;
return r;
}
And wrap your string like this.
s2ws( volume.c_str())
CString is quite handy, while std::string is more compatible with STL container. I am using hash_map. However, hash_map does not support CStrings as keys, so I want to convert the CString into a std::string.
Writing a CString hash function seems to take a lot of time.
CString -----> std::string
How can I do this?
std::string -----> CString:
inline CString toCString(std::string const& str)
{
return CString(str.c_str());
}
Am I right?
EDIT:
Here are more questions:
How can I convert from wstring to CString and vice versa?
// wstring -> CString
std::wstring src;
CString result(src.c_str());
// CString -> wstring
CString src;
std::wstring des(src.GetString());
Is there any problem with this?
Additionally, how can I convert from std::wstring to std::string and vice versa?
According to CodeGuru:
CString to std::string:
CString cs("Hello");
std::string s((LPCTSTR)cs);
BUT: std::string cannot always construct from a LPCTSTR. i.e. the code will fail for UNICODE builds.
As std::string can construct only from LPSTR / LPCSTR, a programmer who uses VC++ 7.x or better can utilize conversion classes such as CT2CA as an intermediary.
CString cs ("Hello");
// Convert a TCHAR string to a LPCSTR
CT2CA pszConvertedAnsiString (cs);
// construct a std::string using the LPCSTR input
std::string strStd (pszConvertedAnsiString);
std::string to CString: (From Visual Studio's CString FAQs...)
std::string s("Hello");
CString cs(s.c_str());
CStringT can construct from both character or wide-character strings. i.e. It can convert from char* (i.e. LPSTR) or from wchar_t* (LPWSTR).
In other words, char-specialization (of CStringT) i.e. CStringA, wchar_t-specilization CStringW, and TCHAR-specialization CString can be constructed from either char or wide-character, null terminated (null-termination is very important here) string sources.
Althoug IInspectable amends the "null-termination" part in the comments:
NUL-termination is not required.
CStringT has conversion constructors that take an explicit length argument. This also means that you can construct CStringT objects from std::string objects with embedded NUL characters.
Solve that by using std::basic_string<TCHAR> instead of std::string and it should work fine regardless of your character setting.
If you want something more C++-like, this is what I use. Although it depends on Boost, that's just for exceptions. You can easily remove those leaving it to depend only on the STL and the WideCharToMultiByte() Win32 API call.
#include <string>
#include <vector>
#include <cassert>
#include <exception>
#include <boost/system/system_error.hpp>
#include <boost/integer_traits.hpp>
/**
* Convert a Windows wide string to a UTF-8 (multi-byte) string.
*/
std::string WideStringToUtf8String(const std::wstring& wide)
{
if (wide.size() > boost::integer_traits<int>::const_max)
throw std::length_error(
"Wide string cannot be more than INT_MAX characters long.");
if (wide.size() == 0)
return "";
// Calculate necessary buffer size
int len = ::WideCharToMultiByte(
CP_UTF8, 0, wide.c_str(), static_cast<int>(wide.size()),
NULL, 0, NULL, NULL);
// Perform actual conversion
if (len > 0)
{
std::vector<char> buffer(len);
len = ::WideCharToMultiByte(
CP_UTF8, 0, wide.c_str(), static_cast<int>(wide.size()),
&buffer[0], static_cast<int>(buffer.size()), NULL, NULL);
if (len > 0)
{
assert(len == static_cast<int>(buffer.size()));
return std::string(&buffer[0], buffer.size());
}
}
throw boost::system::system_error(
::GetLastError(), boost::system::system_category);
}
It is more efficient to convert CString to std::string using the conversion where the length is specified.
CString someStr("Hello how are you");
std::string std(someStr, someStr.GetLength());
In a tight loop, this makes a significant performance improvement.
Is there any problem?
There are several issues:
CString is a template specialization of CStringT. Depending on the BaseType describing the character type, there are two concrete specializations: CStringA (using char) and CStringW (using wchar_t).
While wchar_t on Windows is ubiquitously used to store UTF-16 encoded code units, using char is ambiguous. The latter commonly stores ANSI encoded characters, but can also store ASCII, UTF-8, or even binary data.
We don't know the character encoding (or even character type) of CString (which is controlled through the _UNICODE preprocessor symbol), making the question ambiguous. We also don't know the desired character encoding of std::string.
Converting between Unicode and ANSI is inherently lossy: ANSI encoding can only represent a subset of the Unicode character set.
To address these issues, I'm going to assume that wchar_t will store UTF-16 encoded code units, and char will hold UTF-8 octet sequences. That's the only reasonable choice you can make to ensure that source and destination strings retain the same information, without limiting the solution to a subset of the source or destination domains.
The following implementations convert between CStringA/CStringW and std::wstring/std::string mapping from UTF-8 to UTF-16 and vice versa:
#include <string>
#include <atlconv.h>
std::string to_utf8(CStringW const& src_utf16)
{
return { CW2A(src_utf16.GetString(), CP_UTF8).m_psz };
}
std::wstring to_utf16(CStringA const& src_utf8)
{
return { CA2W(src_utf8.GetString(), CP_UTF8).m_psz };
}
The remaining two functions construct C++ string objects from MFC strings, leaving the encoding unchanged. Note that while the previous functions cannot cope with embedded NUL characters, these functions are immune to that.
#include <string>
#include <atlconv.h>
std::string to_std_string(CStringA const& src)
{
return { src.GetString(), src.GetString() + src.GetLength() };
}
std::wstring to_std_wstring(CStringW const& src)
{
return { src.GetString(), src.GetString() + src.GetLength() };
}
(Since VS2012 ...and at least until VS2017 v15.8.1)
Since it's a MFC project & CString is a MFC class, MS provides a Technical Note TN059: Using MFC MBCS/Unicode Conversion Macros and Generic Conversion Macros:
A2CW (LPCSTR) -> (LPCWSTR)
A2W (LPCSTR) -> (LPWSTR)
W2CA (LPCWSTR) -> (LPCSTR)
W2A (LPCWSTR) -> (LPSTR)
Use:
void Example() // ** UNICODE case **
{
USES_CONVERSION; // (1)
// CString to std::string / std::wstring
CString strMfc{ "Test" }; // strMfc = L"Test"
std::string strStd = W2A(strMfc); // ** Conversion Macro: strStd = "Test" **
std::wstring wstrStd = strMfc.GetString(); // wsrStd = L"Test"
// std::string to CString / std::wstring
strStd = "Test 2";
strMfc = strStd.c_str(); // strMfc = L"Test 2"
wstrStd = A2W(strStd.c_str()); // ** Conversion Macro: wstrStd = L"Test 2" **
// std::wstring to CString / std::string
wstrStd = L"Test 3";
strMfc = wstrStd.c_str(); // strMfc = L"Test 3"
strStd = W2A(wstrStd.c_str()); // ** Conversion Macro: strStd = "Test 3" **
}
--
Footnotes:
(1) In order to for the conversion-macros to have space to store the temporary length, it is necessary to declare a local variable called _convert that does this in each function that uses the conversion macros. This is done by invoking the USES_CONVERSION macro. In VS2017 MFC code (atlconv.h) it looks like this:
#ifndef _DEBUG
#define USES_CONVERSION int _convert; (_convert); UINT _acp = ATL::_AtlGetConversionACP() /*CP_THREAD_ACP*/; (_acp); LPCWSTR _lpw; (_lpw); LPCSTR _lpa; (_lpa)
#else
#define USES_CONVERSION int _convert = 0; (_convert); UINT _acp = ATL::_AtlGetConversionACP() /*CP_THREAD_ACP*/; (_acp); LPCWSTR _lpw = NULL; (_lpw); LPCSTR _lpa = NULL; (_lpa)
#endif
This works fine:
//Convert CString to std::string
inline std::string to_string(const CString& cst)
{
return CT2A(cst.GetString());
}
to convert CString to std::string. You can use this format.
std::string sText(CW2A(CSText.GetString(), CP_UTF8 ));
CString has method, GetString(), that returns an LPCWSTR type if you are using Unicode, or LPCSTR otherwise.
In the Unicode case, you must pass it through a wstring:
CString cs("Hello");
wstring ws = wstring(cs.GetString());
string s = string(ws.begin(), ws.end());
Else you can simply convert the string directly:
CString cs("Hello");
string s = string(cs.GetString());
From this post (Thank you Mark Ransom )
Convert CString to string (VC6)
I have tested this and it works fine.
std::string Utils::CString2String(const CString& cString)
{
std::string strStd;
for (int i = 0; i < cString.GetLength(); ++i)
{
if (cString[i] <= 0x7f)
strStd.append(1, static_cast<char>(cString[i]));
else
strStd.append(1, '?');
}
return strStd;
}
This is a follow up to Sal's answer, where he/she provided the solution:
CString someStr("Hello how are you");
std::string std(somStr, someStr.GetLength());
This is useful also when converting a non-typical C-String to a std::string
A use case for me was having a pre-allocated char array (like C-String), but it's not NUL terminated. (i.e. SHA digest).
The above syntax allows me to specify the length of the SHA digest of the char array so that std::string doesn't have to look for the terminating NUL char, which may or may not be there.
Such as:
unsigned char hashResult[SHA_DIGEST_LENGTH];
auto value = std::string(reinterpret_cast<char*>hashResult, SHA_DIGEST_LENGTH);
You can use CT2CA
CString datasetPath;
CT2CA st(datasetPath);
string dataset(st);
All other answers didn't quite address what I was looking for which was to convert CString on the fly as opposed to store the result in a variable.
The solution is similar to above but we need one more step to instantiate a nameless object. I am illustrating with an example. Here is my function which needs std::string but I have CString.
void CStringsPlayDlg::writeLog(const std::string &text)
{
std::string filename = "c:\\test\\test.txt";
std::ofstream log_file(filename.c_str(), std::ios_base::out | std::ios_base::app);
log_file << text << std::endl;
}
How to call it when you have a CString?
std::string firstName = "First";
CString lastName = _T("Last");
writeLog( firstName + ", " + std::string( CT2A( lastName ) ) );
Note that the last line is not a direct typecast but we are creating a nameless std::string object and supply the CString via its constructor.
One interesting approach is to cast CString to CStringA inside a string constructor. Unlike std::string s((LPCTSTR)cs); this will work even if _UNICODE is defined. However, if that is the case, this will perform conversion from Unicode to ANSI, so it is unsafe for higher Unicode values beyond the ASCII character set. Such conversion is subject to the _CSTRING_DISABLE_NARROW_WIDE_CONVERSION preprocessor definition. https://msdn.microsoft.com/en-us/library/5bzxfsea.aspx
CString s1("SomeString");
string s2((CStringA)s1);
If you're looking to convert easily between other strings types, perhaps the _bstr_t class would be more appropriate? It supports converstion between char, wchar_t and BSTR.
Works for me:
std::wstring CStringToWString(const CString& s)
{
std::string s2;
s2 = std::string((LPCTSTR)s);
return std::wstring(s2.begin(),s2.end());
}
CString WStringToCString(std::wstring s)
{
std::string s2;
s2 = std::string(s.begin(),s.end());
return s2.c_str();
}