C++ using std::string, std::wstring as a buffer - c++

Using WinAPI you can often encounter some methods getting LPWSTR or LPSTR as a parameter. Sometimes this pointer should be a pointer to buffer in fact, for example:
int GetWindowTextW(HWND hWnd, LPWSTR lpString, int nMaxCount);
Is it a good idea to use std::wstring for such buffers, in particular case I strongly need to produce std::wstring as result and cannot replace it with vector<wchar_t> for example?
std::wstring myWrapper(HWND hWnd){
auto desiredBufferSize = GetWindowTextLengthW(hWnd);
std::wstring resultWstr;
resultWstr.resize(desiredBufferSize);
auto ret = GetWindowText(hWnd,
const_cast<wchar_t*>(resultWstr.data()), // const_cast
resultWstr.size());
// handle return code code
return resultWstr;
}
Both data() and c_str() string methods return const pointer, so we must use const_cast to remove constness, which sometimes is a bad sign. Is it a good idea in such case? Can I do better?

Use String as C-String
Auto type conversion from const char* to std::string, but not other way around.
The character ‘\0’ is not special for std::string.
&s[0] for write access
Make sure the string size (not just capacity) is big enough for C style writing.
s.c_str() for read only access
Is valid only until the next call of a non-constant method.
Code sample:
const int MAX_BUFFER_SIZE = 30;         // Including NULL terminator.         
string s(MAX_BUFFER_SIZE, '\0');      // Allocate enough space, NULL terminated
strcpy(&s[0], "This is source string.");    // Write, C++11 only (VS2010 OK)
printf("C str: '%s'\n", s.c_str());     // Read only: Use const whenever possible.

It's tempting to go for nice standard wstring. However it's never good to cast away const...
Here a temporary string wrapper that automatically creates a buffer, passes its pointer to the winapi function, and copies the content of the buffer to your string and disapears cleanly:
auto ret = GetWindowText(hWnd,
tmpstr (resultWstr, desiredBufferSize),
resultWstr.size());
This solution works with any windows API function that writes to a character pointer before it returns (i.e. no assync).
How does it work ?
It's based on C++ standard §12.2 point 3 : "Temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created. (...) The value computations and side effects of destroying a temporary object are associated only with the full-expression, not with any specific subexpression.".
Here it's implementation:
typedef std::basic_string<TCHAR> tstring; // based on microsoft's TCHAR
class tmpstr {
private:
tstring &t; // for later cpy of the result
TCHAR *buff; // temp buffer
public:
tmpstr(tstring& v, int ml) : t(v) { // ctor
buff = new TCHAR[ml]{}; // you could also initialize it if needed
std::cout << "tmp created\n"; // just for tracing, for proof of concept
}
tmpstr(tmpstr&c) = delete; // No copy allowed
tmpstr& operator= (tmpstr&c) = delete; // No assignment allowed
~tmpstr() {
t = tstring(buff); // copy to string passed by ref at construction
delete buff; // clean everyhing
std::cout<< "tmp destroyed"; // just for proof of concept. remove this line
}
operator LPTSTR () {return buff; } // auto conversion to serve as windows function parameter without having to care
};
As you can see, the first line uses a typedef, in order to be compatible with several windows compilation options (e.g. Unicode or not). But of course, you could just replace tstring and TCHAR with wstring and wchar_t if you prefer.
The only drawback is that you have to repeat the buffer size as parameter tmpstr constructor and as parameter of the windows function. But this is why you're writing a wrepper for the function, isn't it ?

For a string buffer why not to use just char array? :)
DWORD username_len = UNLEN + 1;
vector<TCHAR> username(username_len);
GetUserName(&username[0], &username_len);
the accepted solution is nice example of overthinking.

Related

Why should I use GetBuffer member of CString instead of SetAt?

I'm currently studying MFC library and I wonder why should I use GetBuffer member which returns pointer to CString object buffer over other member functions which allow to read and change characters in that object?
For example why should I do (code changes first character of CString object):
CString aString(_T("String")); //new CString object
LPTSTR p = aString.GetBuffer(); //create new pointer to aString buffer
_tcsncpy(p, LPCTSTR(_T("a")), 1); //set first character to 'a'
aString.ReleaseBuffer(); //free allocated memory
Instead of:
CString aStr(_T("String")); //new CString object
aStr.SetAt(0, _T('a')); //set character at 0 position to 'a'
I suppose there is a more appropriate application to use GetBuffer() member, but I can't figure out what it can be... This function requires ReleaseBuffer() to free memory, and I may cause memory leaks when ReleaseBuffer() is not called. Is there any advantage of using it?
Don't use GetBuffer unless you have no alternative. Precisely because of (1) the reason you already know, that it must be followed with ReleaseBuffer which you may forget to do, leading to a resource leak. And (2) you might inadvertently make changes to the underlying data rendering it inconsistent in some way. More often than not the functions GetString, SetString, GetAt and SetAt will do what you need and have no disadvantages. Prefer them.
In above example it is preferable to use the SetAt method.
In some cases you need GetBuffer to directly access the buffer, mainly when used with WinAPI functions. For example, to use ::GetWindowText with WinAPI code you need to allocate a buffer as follows:
int len = ::GetWindowTextLength(m_hWnd) + 1;
char *buf = new char[len];
::GetWindowText(m_hWnd, buf, len);
...
delete[] buf;
The same thing can be done in MFC with CWnd::GetWindowText(CString&). But MFC has to use the same basic WinAPI functions, through GetBuffer. MFC's implementation of CWnd::GetWindowText is roughly as follows:
void CWnd::GetWindowText(CString &str)
{
int nLen = ::GetWindowTextLength(m_hWnd);
::GetWindowText(m_hWnd, str.GetBufferSetLength(nLen), nLen+1);
str.ReleaseBuffer();
}

Loss of data while building a std::string from const char * or LPCSTR

I have a Function which returns a LPSTR/const char * and I need to convert it to a std::string. This is how I am doing it.
std::string szStr(foo(1));
It works just fine in all the cases just when foo returns a 32 characters long string it fails. With this approach I get "". So I thought it had to do something with the length. So I changed it a bit.
std::string szStr(foo(1) , 32);
This gives me "0"
Then I tried another tedious method
const char * cstr_a = foo(1);
const char * cstr_b = foo(2);
size_t ln_a = strlen(cstr_a);
size_t ln_b = strlen(cstr_b);
std::string szStr_a( cstr_a , ln_a );
std::string szStr_b( cstr_b , ln_b );
But strangely enough in this method both the pointers are getting the same value, viz foo(1) should return abc and foo(2) should return xyz. But here cstr_a is first getting abc but the moment cstr_b gets xyz, the value of both cstr_a and cstr_b becomes xyz. I am dazed and confused with this.
And yes, I cannot use std::wstring.
What is foo?
foo is basically reading a value from the registry and returning it as a LPSTR. Now one the value in the registry which I need to read is a MD5 hashed string (32 charecters) That's where it fails.
The Actual Foo function:
LPCSTR CRegistryOperation::GetRegValue(HKEY hHeadKey, LPCSTR szPath, LPCSTR szValue)
{
HKEY hKey;
CHAR szBuff[255] = ("");
DWORD dwBufSize = 255;
::RegOpenKeyEx(hHeadKey, (LPCSTR)szPath, 0, KEY_READ, &hKey);
::RegQueryValueEx(hKey, (LPCSTR)szValue, NULL, 0, (LPBYTE)szBuff, &dwBufSize);
::RegCloseKey(hKey);
LPCSTR cstr(szBuff);
return cstr;
}
The Original cast code:
StrResultMap RegValues;
std::string lid(CRegistryOperation::GetRegValue(HKEY_CURRENT_USER, REG_KEY_HKCU_PATH, "LicenseID"));
std::string mid(CRegistryOperation::GetRegValue(HKEY_CURRENT_USER, REG_KEY_HKCU_PATH, "MachineID"), 32);
std::string vtill(CRegistryOperation::GetRegValue(HKEY_CURRENT_USER, REG_KEY_HKCU_PATH, "ValidTill"));
std::string adate(CRegistryOperation::GetRegValue(HKEY_CURRENT_USER, REG_KEY_HKCU_PATH, "ActivateDT"));
std::string lupdate(CRegistryOperation::GetRegValue(HKEY_CURRENT_USER, REG_KEY_HKCU_PATH, "LastUpdate"));
RegValues["license_id"] = lid;
RegValues["machine_id"] = mid;
RegValues["valid_till"] = vtill;
RegValues["activation_date"] = adate;
RegValues["last_updated"] = lupdate;
Kindly help me get over it.
Thanks.
As a complement to Nordic Mainframe's anwser, there are 3 common ways to return a buffer from a C or C++ function :
use a static buffer - simple and nice until you have re-entrancy problems (multiple threads or recursivity)
pass the buffer as an input parameter, and simply return the number of characters written to it - ok if the size of buffer is really a constant
malloc the buffer in the function (it is in the heap and not in the stack) and document in flashing red that it must be freed by caller
But as you tagged your question as C++, you could create the std::string in the function and return it. C++ functions are allowed to return std::string because the different operators (copy constructor, affectation, ...) take care automatically of the allocation problem.
You can avoid returning a pointer to a buffer which has gone out of scope by returning a std::string directly.
std::string CRegistryOperation::GetRegValue(HKEY hHeadKey, LPCSTR szPath, LPCSTR szValue)
{
HKEY hKey = 0;
CHAR szBuff[255] = { 0 };
DWORD dwBufSize = sizeof(szBuf);
if (::RegOpenKeyEx(hHeadKey, (LPCSTR)szPath, 0, KEY_READ, &hKey) == ERROR_SUCCESS) {
::RegQueryValueEx(hKey, (LPCSTR)szValue, NULL, 0, (LPBYTE)szBuff, &dwBufSize);
::RegCloseKey(hKey);
}
return std::string(szBuf);
}
The GetRegValue function returns a pointer to a buffer in GetRegValue's stack frame. This does not work: after GetRegValue terminates the pointer cstr points to undefined values somewhere in the Stack. Try to make szBuff static and see if that helps.
LPCSTR CRegistryOperation::GetRegValue(HKEY hHeadKey, LPCSTR szPath, LPCSTR szValue)
{
HKEY hKey;
//Here:
static CHAR szBuff[255] = ("");
szBuff[0]=0;
DWORD dwBufSize = 255;
::RegOpenKeyEx(hHeadKey, (LPCSTR)szPath, 0, KEY_READ, &hKey);
::RegQueryValueEx(hKey, (LPCSTR)szValue, NULL, 0, (LPBYTE)szBuff, &dwBufSize);
::RegCloseKey(hKey);
LPCSTR cstr(szBuff);
return cstr;
}
UPDATE: I did not mandate to return std::string, or pass a buffer in, because that would change the interface. Returning a pointer to a static buffer is a common idiom and mostly unproblematic if the lifetime of the returned pointer value is limited to a few scopes (Like for building a std::string from the buffer value).
Multithreading isn't really an issue aynmore, because almost every compiler now has some form of thread local storage support right in the language. __declspec(thread) static CHAR szBuff[255] = (""); for example, should work for Microsoft compilers, __thread for gcc. C++11 even has a new storage class specifier for this (thread_local). You shouldn't call GetRegValue from a signal handler though, but that's OK - you can't do too much there anyway (for example allocate memory from the heap!).
UPDATE: Commenters argue, that I should not suggest this, because the pointer to the static buffer will point to invalid data when GetRegValue is called again. While this is obviously true, I think it is wrong to make an argument from that. Why? Look at these examples:
A pointer returned from strdup() is valid until free()
A pointer to something created with new is valid until deleted.
A const char * returned from string::c_str() is valid as long as the string is not modified.
A std::vector iterator is invalid, if an element from the std::vector is erased.
A std::list iterator is still valid, if an element from the std::vector is erased, unless it points to the erased element.
A pointer returned from GetRegValue is valid until GetRegValue is called again.
a std::ifstream is valid when,..you know, good(), fail() and so on.
There is no point in saying, "look, the thing gets invalid when you are not careful enought", because programming is not about being careless. We are handling objects, which have conditions under which they are valid or not and if an object has well defined conditions under which it is valid or not, then we can write programs with well defined behaviour. Returning a pointer to a static buffer (that is thread-local) has a well defined meaning and a developer can use this to write a well defined program. Unless said developer is negligent or too lazy to read the documentation of the routine of course.

C string to wide C string assignment

I'm a little confused about C strings and wide C strings. For the sake of this question, assume that I using Microsoft Visual Studio 2010 Professional. Please let me know if any of my information is incorrect.
I have a struct with a const wchar_t* member which is used to store a name.
struct A
{
const wchar_t* name;
};
When I assign object 'a' a name as so:
int main()
{
A a;
const wchar_t* w_name = L"Tom";
a.name = w_name;
return 0;
}
That is just copying the memory address that w_name points to into a.name. Now w_name and a.name are both wide character pointers which point to the same address in memory.
If I am correct, then I am wondering what to do about a situation like this. I am reading in a C string from an XML attribute using tinyxml2.
tinyxml2::XMLElement* pElement;
// ...
const char* name = pElement->Attribute("name");
After I have my C string, I am converting it to a wide character string as follows:
size_t newsize = strlen(name) + 1;
wchar_t * wcName = new wchar_t[newsize];
size_t convertedChars = 0;
mbstowcs_s(&convertedChars, wcName, newsize, name, _TRUNCATE);
a.name = wcName;
delete[] wcName;
If I am correct so far, then the line:
a.name = wcName;
is just copying the memory address of the first character of array wcName into a.name. However, I am deleting wcName directly after assigning this pointer which would make it point to garbage.
How can I convert my C string into a wide character C string and then assign it to a.name?
The easiest approach is probably to task you name variable with the management of the memory. This, in turn, is easily done by declaring it as
std::wstring name;
These guys don't have a concept of independent content and object mutation, i.e., you can't really make the individual characters const and making the entire object const would prevent it from being assigned to.
You can do this while using a std::wstring without relying on the additional temporary conversion buffer allocation and destruction. Not tremendously important unless you're overtly concerned about heap fragmentation or on a limited system (aka Windows Phone). It just takes a little setup on the front side. Let the standard library manage the memory for you (with a little nudge).
class A
{
...
std::wstring a;
};
// Convert the string (I'm assuming it is UTF8) to wide char
int wlen = MultiByteToWideChar(CP_UTF8, 0, name, -1, NULL, NULL);
if (wlen > 0)
{
// reserve space. std::wstring gives us the terminator slot
// for free, so don't include that. MB2WC above returns the
// length *including* the terminator.
a.resize(wlen-1);
MultiByteToWideChar(CP_UTF8, 0, name, -1, &a[0], wlen);
}
else
{ // no conversion available/possible.
a.clear();
}
On a complete side-note, you can build TinyXML to use the standard library and std::string rather than char *, which doesn't really help you much here, but may save you a ton of future strlen() calls later on.
As you correctly mentioned a.name is just a pointer which doesn't suppose any allocated string storage. You must manage it manually using new or static/scoped array.
To get rid of these boring things just use one of available string classes: CStringW from ATL (easy to use but MS-specific) or std::wstring from STL (C++ standard, but not so easy to convert from char*):
#include <atlstr.h>
// Conversion ANSI -> Wide is automatic
const CStringW name(pElement->Attribute("name"));
Unfortunately, std::wstring usage with char* is not so easy.
See conversion functon here: How to convert std::string to LPCWSTR in C++ (Unicode)

Conversion of ATL CString to character array

I want to convert a CString into a char[]. Some body tell me how to do this?
My code is like this :
CString strCamIP1 = _T("");
char g_acCameraip[16][17];
strCamIP1 = theApp.GetProfileString(strSection, _T("IP1"), NULL);
g_acCameraip[0] = strCamIP1;
This seems to be along the right lines; http://msdn.microsoft.com/en-us/library/awkwbzyc.aspx
CString aCString = "A string";
char myString[256];
strcpy(myString, (LPCTSTR)aString);
which in your case would be along the lines of
strcpy(g_acCameraip[0], (LPCTSTR)strCamIP1);
From MSDN site:
// Convert to a char* string from CStringA string
// and display the result.
CStringA origa("Hello, World!");
const size_t newsizea = (origa.GetLength() + 1);
char *nstringa = new char[newsizea];
strcpy_s(nstringa, newsizea, origa);
cout << nstringa << " (char *)" << endl;
CString is based on TCHAR so if don't compile with _UNICODE it's CStringA or if you do compile with _UNICODE then it is CStringW.
In case of CStringW conversion looks little bit different (example also from MSDN):
// Convert to a char* string from a wide character
// CStringW string. To be safe, we allocate two bytes for each
// character in the original string, including the terminating
// null.
const size_t newsizew = (origw.GetLength() + 1)*2;
char *nstringw = new char[newsizew];
size_t convertedCharsw = 0;
wcstombs_s(&convertedCharsw, nstringw, newsizew, origw, _TRUNCATE );
cout << nstringw << " (char *)" << endl;
You could use wcstombs_s:
// Convert CString to Char By Quintin Immelman.
//
CString DummyString;
// Size Can be anything, just adjust the 100 to suit.
const size_t StringSize = 100;
// The number of characters in the string can be
// less than String Size. Null terminating character added at end.
size_t CharactersConverted = 0;
char DummyToChar[StringSize];
wcstombs_s(&CharactersConverted, DummyToChar,
DummyString.GetLength()+1, DummyString,
_TRUNCATE);
//Always Enter the length as 1 greater else
//the last character is Truncated
If you are using ATL you could use one of the conversion macros. CString stores data as tchar, so you would use CT2A() (C in macro name stands for const):
CString from("text");
char* pStr = CT2A((LPCTSTR)from);
Those macros are smart, if tchar represents ascii (no _UNICODE defined), they just pass the pointer over and do nothing.
More info below, under ATL String-Conversion Classes section:
http://www.369o.com/data/books/atl/index.html?page=0321159624%2Fch05.html
CStringA/W is cheaply and implicitly convertible to const char/wchar_t *. Whenever you need C-style string, just pass CString object itself (or the result of .GetString() which is the same). The pointer will stay valid as long as string object is alive and unmodified.
strcpy(g_acCameraip[0], strCamIP1);
// OR
strcpy(g_acCameraip[0], strCamIP1.GetString());
If you need writable (non-const) buffer, use .GetBuffer() with optional maximum length argument.
If you have CStringW but you need const char* and vice versa, you can use a temporary CStringA object:
strcpy(g_acCameraip[0], CStringA(strCamIP1).GetString());
But a much better way would be to have array of CStrings. You can use them whereever you need null-terminated string, but they will also manage string's memory for you.
std::vector<CString> g_acCameraip(16);
g_acCameraip[0] = theApp.GetProfileString(strSection, _T("IP1"), NULL);
Use memcpy .
char c [25];
Cstring cstr = "123";
memcpy(c,cstr,cstr.GetLength());
Do you really have to copy the CString objects into fixed char arrays?
enum { COUNT=16 };
CString Cameraip[COUNT];
Cameraip[0] = theApp.GetProfileString(strSection, _T("IP1"), NULL);
// add more entries...
...and then - later - when accessing the entries, for example like this
for (int i=0; i<COUNT; ++i) {
someOp(Cameraip[i]); // the someOp function takes const CString&
}
...you may convert them, if needed.
fopen is the function which needs char* param. so if you have CString as available string, you can just use bellow code.
be happy :)
Here, cFDlg.GetPathName().GetString(); basically returns CString in my code.
char*pp = (char*)cFDlg.GetPathName().GetString();
FILE *fp = ::fopen(pp,"w");
CString str;
//Do something
char* pGTA = (LPTSTR)(LPCTSTR)str;//Now the cast
Just (LPTSTR)(LPCTSTR). Hope this is what you need :)
char strPass[256];
strcpy_s( strPass, CStringA(strCommand).GetString() );
It's simple
ATL CStrings allow very simple usage without having to do a lot of conversions between types. You can most easily do:
CString cs = "Test";
const char* str = static_cast<LPCTSTR>(cs);
or in UNICODE environment:
CString cs = "Test";
const wchar_t* str = static_cast<LPCTSTR>(cs);
How it works
The static_cast (or alternatively C-Style cast) will trigger the CString::operator LPCTSTR, so you don't do any pointer reinterpretation yourself but rely on ATL code!
The documentation of this cast operator says:
This useful casting operator provides an efficient method to access the null-terminated C string contained in a CString object. No characters are copied; only a pointer is returned. Be careful with this operator. If you change a CString object after you have obtained the character pointer, you may cause a reallocation of memory that invalidates the pointer.
Modifiable Pointers
As mentioned in the above statement, the returned pointer by the cast operator is not meant to be modified. However, if you still need to use a modifiable pointer for some outdated C libraries, you can use a const_cast (if you are sure that function wont modify the pointer):
void Func(char* str) // or wchar_t* in Unicode environment
{
// your code here
}
// In your calling code:
CString cs = "Test";
Func(const_cast<LPTSTR>(static_cast<LPCTSTR>(test))); // Call your function with a modifiable pointer
If you wish to modify the pointer, you wont get around doing some kind of memory copying to modifiable memory, as mentioned by other answers.
There is a hardcoded method..
CString a = L"This is CString!";
char *dest = (char *)malloc(a.GetLength() + 1);
// +1 because of NULL char
dest[a.GetLength()] = 0; // setting null char
char *q = (char *)a.m_pszData;
//Here we cannot access the private member..
//The address of "m_pszData" private member is stored in first DWORD of &a...
//Therefore..
int address = *((int *)&a);
char *q = (char *)address;
// Now we can access the private data!, This is the real magic of C
// Size of CString's characters is 16bit...
// in cstring '1' will be stored as 0x31 0x00 (Hex)
// Here we just want even indexed chars..
for(int i = 0;i<(a.GetLength()*2);i += 2)
dest[i/2] = *(q+i);
// Now we can use it..
printf("%s", dest);

How can CString be passed to format string %s?

class MyString
{
public:
MyString(const std::wstring& s2)
{
s = s2;
}
operator LPCWSTR() const
{
return s.c_str();
}
private:
std::wstring s;
};
int _tmain(int argc, _TCHAR* argv[])
{
MyString s = L"MyString";
CStringW cstring = L"CString";
wprintf(L"%s\n", (LPCWSTR)cstring); // Okay. Becase it has an operator LPCWSTR()
wprintf(L"%s\n", cstring); // Okay, fine. But how?
wprintf(L"%s\n", (LPCWSTR)s); // Okay. fine.
wprintf(L"%s\n", s); // Doesn't work. Why? It prints gabage string like "?."
return 0;
}
How can CString be passed to format string %s?
By the way, MSDN says(it's weird)
To use a CString object in a variable argument function
Explicitly cast the CString to an LPCTSTR string, as shown here:
CString kindOfFruit = "bananas";
int howmany = 25;
printf( "You have %d %s\n", howmany, (LPCTSTR)kindOfFruit );
CString is specifically designed such that it only contains a pointer that points to the string data in a buffer class. When passed by value to printf it will be treated as a pointer when seeing the "%s" in the format string.
It originally just happened to work with printf by chance, but this has later been kept as part of the class interface.
This post is based on MS documentation long since retired, so I cannot link to their promise that they will continue to make this work.
However, before adding more downvotes please also read this blog post from someone sharing my old knowledge:
Big Brother helps you
wprintf(L"%s\n", (LPCWSTR)cstring); // Okay. It's been cast to a const wchar_t*.
wprintf(L"%s\n", cstring); // UNDEFINED BEHAVIOUR
wprintf(L"%s\n", (LPCWSTR)s); // Okay, it's a const wchar_t*.
wprintf(L"%s\n", s); // UNDEFINED BEHAVIOUR
The only thing you can pass to this function for %s is a const wchar_t*. Anything else is undefined behaviour. Passing the CString just happens to work.
There's a reason that iostream was developed in C++, and it's because these variable-argument functions are horrifically unsafe, and shoud never be used. Oh, and CString is pretty much a sin too for plenty of reasons, stick to std::wstring and cout/wcout wherever you can.
CString has a pointer as the first member:
class CStringA
{
char* m_pString;
};
Though it is not char* (even for ANSI CString), it is more or less the same thing. When you pass CString object to any of printf-family of functions (including your custom implementation, if any), you are passing CString object (which is on stack). The %s parsing causes it it read as if it was a pointer - which is a valid pointer in this case (the data at very first byte is m_pString).
Generally speaking it's undefined behavior. According to this article Visual C++ just invokes the conversion from CString to a POD type to cover you - that is permissible implementation of undefined behavior.