I'm a little confused about C strings and wide C strings. For the sake of this question, assume that I using Microsoft Visual Studio 2010 Professional. Please let me know if any of my information is incorrect.
I have a struct with a const wchar_t* member which is used to store a name.
struct A
{
const wchar_t* name;
};
When I assign object 'a' a name as so:
int main()
{
A a;
const wchar_t* w_name = L"Tom";
a.name = w_name;
return 0;
}
That is just copying the memory address that w_name points to into a.name. Now w_name and a.name are both wide character pointers which point to the same address in memory.
If I am correct, then I am wondering what to do about a situation like this. I am reading in a C string from an XML attribute using tinyxml2.
tinyxml2::XMLElement* pElement;
// ...
const char* name = pElement->Attribute("name");
After I have my C string, I am converting it to a wide character string as follows:
size_t newsize = strlen(name) + 1;
wchar_t * wcName = new wchar_t[newsize];
size_t convertedChars = 0;
mbstowcs_s(&convertedChars, wcName, newsize, name, _TRUNCATE);
a.name = wcName;
delete[] wcName;
If I am correct so far, then the line:
a.name = wcName;
is just copying the memory address of the first character of array wcName into a.name. However, I am deleting wcName directly after assigning this pointer which would make it point to garbage.
How can I convert my C string into a wide character C string and then assign it to a.name?
The easiest approach is probably to task you name variable with the management of the memory. This, in turn, is easily done by declaring it as
std::wstring name;
These guys don't have a concept of independent content and object mutation, i.e., you can't really make the individual characters const and making the entire object const would prevent it from being assigned to.
You can do this while using a std::wstring without relying on the additional temporary conversion buffer allocation and destruction. Not tremendously important unless you're overtly concerned about heap fragmentation or on a limited system (aka Windows Phone). It just takes a little setup on the front side. Let the standard library manage the memory for you (with a little nudge).
class A
{
...
std::wstring a;
};
// Convert the string (I'm assuming it is UTF8) to wide char
int wlen = MultiByteToWideChar(CP_UTF8, 0, name, -1, NULL, NULL);
if (wlen > 0)
{
// reserve space. std::wstring gives us the terminator slot
// for free, so don't include that. MB2WC above returns the
// length *including* the terminator.
a.resize(wlen-1);
MultiByteToWideChar(CP_UTF8, 0, name, -1, &a[0], wlen);
}
else
{ // no conversion available/possible.
a.clear();
}
On a complete side-note, you can build TinyXML to use the standard library and std::string rather than char *, which doesn't really help you much here, but may save you a ton of future strlen() calls later on.
As you correctly mentioned a.name is just a pointer which doesn't suppose any allocated string storage. You must manage it manually using new or static/scoped array.
To get rid of these boring things just use one of available string classes: CStringW from ATL (easy to use but MS-specific) or std::wstring from STL (C++ standard, but not so easy to convert from char*):
#include <atlstr.h>
// Conversion ANSI -> Wide is automatic
const CStringW name(pElement->Attribute("name"));
Unfortunately, std::wstring usage with char* is not so easy.
See conversion functon here: How to convert std::string to LPCWSTR in C++ (Unicode)
Related
I've tried implementing a function like this, but unfortunately it doesn't work:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t wc[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
My main goal here is to be able to integrate normal char strings in a Unicode application. Any advice you guys can offer is greatly appreciated.
In your example, wc is a local variable which will be deallocated when the function call ends. This puts you into undefined behavior territory.
The simple fix is this:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t* wc = new wchar_t[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
Note that the calling code will then have to deallocate this memory, otherwise you will have a memory leak.
Use a std::wstring instead of a C99 variable length array. The current standard guarantees a contiguous buffer for std::basic_string. E.g.,
std::wstring wc( cSize, L'#' );
mbstowcs( &wc[0], c, cSize );
C++ does not support C99 variable length arrays, and so if you compiled your code as pure C++, it would not even compile.
With that change your function return type should also be std::wstring.
Remember to set relevant locale in main.
E.g., setlocale( LC_ALL, "" ).
const char* text_char = "example of mbstowcs";
size_t length = strlen(text_char );
Example of usage "mbstowcs"
std::wstring text_wchar(length, L'#');
//#pragma warning (disable : 4996)
// Or add to the preprocessor: _CRT_SECURE_NO_WARNINGS
mbstowcs(&text_wchar[0], text_char , length);
Example of usage "mbstowcs_s"
Microsoft suggest to use "mbstowcs_s" instead of "mbstowcs".
Links:
Mbstowcs example
mbstowcs_s, _mbstowcs_s_l
wchar_t text_wchar[30];
mbstowcs_s(&length, text_wchar, text_char, length);
You're returning the address of a local variable allocated on the stack. When your function returns, the storage for all local variables (such as wc) is deallocated and is subject to being immediately overwritten by something else.
To fix this, you can pass the size of the buffer to GetWC, but then you've got pretty much the same interface as mbstowcs itself. Or, you could allocate a new buffer inside GetWC and return a pointer to that, leaving it up to the caller to deallocate the buffer.
Andrew Shepherd 's answer.
Andrew Shepherd 's answer is Good for me, I add up some fix :
1, remove the ending char L'\0', casue sometime it will trouble.
2, use mbstowcs_s
std::wstring wtos(std::string& value){
const size_t cSize = value.size() + 1;
std::wstring wc;
wc.resize(cSize);
size_t cSize1;
mbstowcs_s(&cSize1, (wchar_t*)&wc[0], cSize, value.c_str(), cSize);
wc.pop_back();
return wc;
}
The question has several problems, but so do some of the answers. The idea of returning a pointer to allocated memory "and leaving it up to the caller to de-allocate" is asking for trouble. As a rule the best pattern is always to allocate and de-allocate within the same function. For example, something like:
wchar_t* buffer = new wchar_t[get_wcb_size(str)];
mbstowcs(buffer, str, get_wcb_size(str) + 1);
...
delete[] buffer;
In general, this requires two functions, one the caller calls to find out how much memory to allocate and a second to initialize or fill the allocated memory.
Unfortunately, the basic idea of using a function to return a "new" object is problematic -- not inherently, but because of the C++ inheritance of C memory handling. Using C++ and STL's strings/wstrings/strstreams is a better solution, but I felt the memory allocation thing needed to be better addressed.
Your problem has nothing to do with encodings, it's a simple matter of understanding basic C++. You are returning a pointer to a local variable from your function, which will have gone out of scope by the time anyone can use it, thus creating undefined behaviour (i.e. a programming error).
Follow this Golden Rule: "If you are using naked char pointers, you're Doing It Wrong. (Except for when you aren't.)"
I've previously posted some code to do the conversion and communicating the input and output in C++ std::string and std::wstring objects.
I did something like this. The first 2 zeros are because I don't know what kind of ascii type things this command wants from me. The general feeling I had was to create a temp char array. pass in the wide char array. boom. it works. The +1 ensures that the null terminating character is in the right place.
char tempFilePath[MAX_PATH] = "I want to convert this to wide chars";
int len = strlen(tempFilePath);
// Converts the path to wide characters
int needed = MultiByteToWideChar(0, 0, tempFilePath, len + 1, strDestPath, len + 1);
auto Ascii_To_Wstring = [](int code)->std::wstring
{
if (code>255 || code<0 )
{
throw std::runtime_error("Incorrect ASCII code");
}
std::string s{ char(code) };
std::wstring w{ s.begin(),s.end() };
return w;
};
I've created a class to test some functionality I need to use. Essentially the class will take a deep copy of the passed in string and make it available via a getter. Am using Visual Studio 2012. Unicode is enabled in the project settings.
The problem is that the memcpy operation is yielding a truncated string. Output is like so;
THISISATEST: InstanceDataConstructor: Testing testing 123
Testing te_READY
where the first line is the check of the passed in TCHAR* string & the second line is the output from populating the allocated memory with the memcpy operation. Output expected is; "Testing testing 123".
Can anyone explain what is wrong here?
N.B. Got the #ifndef UNICODE typedefs from here: how-to-convert-tchar-array-to-stdstring
#ifndef INSTANCE_DATA_H//if not defined already
#define INSTANCE_DATA_H//then define it
#include <string>
//TCHAR is just a typedef, that depending on your compilation configuration, either defaults to char or wchar.
//Standard Template Library supports both ASCII (with std::string) and wide character sets (with std::wstring).
//All you need to do is to typedef String as either std::string or std::wstring depending on your compilation configuration.
//To maintain flexibility you can use the following code:
#ifndef UNICODE
typedef std::string String;
#else
typedef std::wstring String;
#endif
//Now you may use String in your code and let the compiler handle the nasty parts. String will now have constructors that lets you convert TCHAR to std::string or std::wstring.
class InstanceData
{
public:
InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor
{
//Check to passed in string
String outMsg(L"THISISATEST: InstanceDataConstructor: ");//L for wide character string literal
outMsg += strMessage;//concatenate message
const wchar_t* finalMsg = outMsg.c_str();//prepare for outputting
OutputDebugStringW(finalMsg);//print the message
//Prepare TCHAR dynamic array. Deep copy.
charArrayPtr = new TCHAR[strMessage.size() +1];
charArrayPtr[strMessage.size()] = 0;//null terminate
std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());//copy characters from array pointed to by the passed in TCHAR*.
OutputDebugStringW(charArrayPtr);//print the copied message to check
}
~InstanceData()//destructor
{
delete[] charArrayPtr;
}
//Getter
TCHAR* getMessage() const
{
return charArrayPtr;
}
private:
TCHAR* charArrayPtr;
String strMessage;//is used to conveniently ascertain the length of the passed in underlying TCHAR array.
};
#endif//header guard
A solution without all of the dynamically allocated memory.
#include <tchar.h>
#include <vector>
//...
class InstanceData
{
public:
InstanceData(TCHAR* strIn) : strMessage(strIn),
{
charArrayPtr.insert(charArrayPtr.begin(), strMessage.begin(), strMessage.end())
charArrayPtr.push_back(0);
}
TCHAR* getMessage()
{ return &charArrayPtr[0]; }
private:
String strMessage;
std::vector<TCHAR> charArrayPtr;
};
This does what your class does, but the major difference being that it does not do any hand-rolled dynamic allocation code. The class is also safely copyable, unlike the code with the dynamic allocation (lacked a user-defined copy constructor and assignment operator).
The std::vector class has superseded having to do new[]/delete[] in almost all circumstances. The reason being that vector stores its data in contiguous memory, no different than calling new[].
Please pay attention to the following lines in your code:
// Prepare TCHAR dynamic array. Deep copy.
charArrayPtr = new TCHAR[strMessage.size() + 1];
charArrayPtr[strMessage.size()] = 0; // null terminate
// Copy characters from array pointed to by the passed in TCHAR*.
std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());
The third argument to pass to memcpy() is the count of bytes to copy.
If the string is a simple ASCII string stored in std::string, then the count of bytes is the same of the count of ASCII characters.
But, if the string is a wchar_t Unicode UTF-16 string, then each wchar_t occupies 2 bytes in Visual C++ (with GCC things are different, but this is a Windows Win32/C++ code compiled with VC++, so let's just focus on VC++).
So, you have to properly scale the size count for memcpy(), considering the proper size of a wchar_t, e.g.:
memcpy(charArrayPtr, strMessage.data(), strMessage.size() * sizeof(TCHAR));
So, if you compile in Unicode (UTF-16) mode, then TCHAR is expanded to wchar_t, and sizeof(wchar_t) is 2, so the content of your original string should be properly deep-copied.
As an alternative, for Unicode UTF-16 strings in VC++ you may use also wmemcpy(), which considers wchar_t as its "unit of copy". So, in this case, you don't have to scale the size factor by sizeof(wchar_t).
As a side note, in your constructor you have:
InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor
Since strIn is an input string parameter, consider passing it by const pointer, i.e.:
InstanceData(const TCHAR* strIn)
I want to convert a CString into a char[]. Some body tell me how to do this?
My code is like this :
CString strCamIP1 = _T("");
char g_acCameraip[16][17];
strCamIP1 = theApp.GetProfileString(strSection, _T("IP1"), NULL);
g_acCameraip[0] = strCamIP1;
This seems to be along the right lines; http://msdn.microsoft.com/en-us/library/awkwbzyc.aspx
CString aCString = "A string";
char myString[256];
strcpy(myString, (LPCTSTR)aString);
which in your case would be along the lines of
strcpy(g_acCameraip[0], (LPCTSTR)strCamIP1);
From MSDN site:
// Convert to a char* string from CStringA string
// and display the result.
CStringA origa("Hello, World!");
const size_t newsizea = (origa.GetLength() + 1);
char *nstringa = new char[newsizea];
strcpy_s(nstringa, newsizea, origa);
cout << nstringa << " (char *)" << endl;
CString is based on TCHAR so if don't compile with _UNICODE it's CStringA or if you do compile with _UNICODE then it is CStringW.
In case of CStringW conversion looks little bit different (example also from MSDN):
// Convert to a char* string from a wide character
// CStringW string. To be safe, we allocate two bytes for each
// character in the original string, including the terminating
// null.
const size_t newsizew = (origw.GetLength() + 1)*2;
char *nstringw = new char[newsizew];
size_t convertedCharsw = 0;
wcstombs_s(&convertedCharsw, nstringw, newsizew, origw, _TRUNCATE );
cout << nstringw << " (char *)" << endl;
You could use wcstombs_s:
// Convert CString to Char By Quintin Immelman.
//
CString DummyString;
// Size Can be anything, just adjust the 100 to suit.
const size_t StringSize = 100;
// The number of characters in the string can be
// less than String Size. Null terminating character added at end.
size_t CharactersConverted = 0;
char DummyToChar[StringSize];
wcstombs_s(&CharactersConverted, DummyToChar,
DummyString.GetLength()+1, DummyString,
_TRUNCATE);
//Always Enter the length as 1 greater else
//the last character is Truncated
If you are using ATL you could use one of the conversion macros. CString stores data as tchar, so you would use CT2A() (C in macro name stands for const):
CString from("text");
char* pStr = CT2A((LPCTSTR)from);
Those macros are smart, if tchar represents ascii (no _UNICODE defined), they just pass the pointer over and do nothing.
More info below, under ATL String-Conversion Classes section:
http://www.369o.com/data/books/atl/index.html?page=0321159624%2Fch05.html
CStringA/W is cheaply and implicitly convertible to const char/wchar_t *. Whenever you need C-style string, just pass CString object itself (or the result of .GetString() which is the same). The pointer will stay valid as long as string object is alive and unmodified.
strcpy(g_acCameraip[0], strCamIP1);
// OR
strcpy(g_acCameraip[0], strCamIP1.GetString());
If you need writable (non-const) buffer, use .GetBuffer() with optional maximum length argument.
If you have CStringW but you need const char* and vice versa, you can use a temporary CStringA object:
strcpy(g_acCameraip[0], CStringA(strCamIP1).GetString());
But a much better way would be to have array of CStrings. You can use them whereever you need null-terminated string, but they will also manage string's memory for you.
std::vector<CString> g_acCameraip(16);
g_acCameraip[0] = theApp.GetProfileString(strSection, _T("IP1"), NULL);
Use memcpy .
char c [25];
Cstring cstr = "123";
memcpy(c,cstr,cstr.GetLength());
Do you really have to copy the CString objects into fixed char arrays?
enum { COUNT=16 };
CString Cameraip[COUNT];
Cameraip[0] = theApp.GetProfileString(strSection, _T("IP1"), NULL);
// add more entries...
...and then - later - when accessing the entries, for example like this
for (int i=0; i<COUNT; ++i) {
someOp(Cameraip[i]); // the someOp function takes const CString&
}
...you may convert them, if needed.
fopen is the function which needs char* param. so if you have CString as available string, you can just use bellow code.
be happy :)
Here, cFDlg.GetPathName().GetString(); basically returns CString in my code.
char*pp = (char*)cFDlg.GetPathName().GetString();
FILE *fp = ::fopen(pp,"w");
CString str;
//Do something
char* pGTA = (LPTSTR)(LPCTSTR)str;//Now the cast
Just (LPTSTR)(LPCTSTR). Hope this is what you need :)
char strPass[256];
strcpy_s( strPass, CStringA(strCommand).GetString() );
It's simple
ATL CStrings allow very simple usage without having to do a lot of conversions between types. You can most easily do:
CString cs = "Test";
const char* str = static_cast<LPCTSTR>(cs);
or in UNICODE environment:
CString cs = "Test";
const wchar_t* str = static_cast<LPCTSTR>(cs);
How it works
The static_cast (or alternatively C-Style cast) will trigger the CString::operator LPCTSTR, so you don't do any pointer reinterpretation yourself but rely on ATL code!
The documentation of this cast operator says:
This useful casting operator provides an efficient method to access the null-terminated C string contained in a CString object. No characters are copied; only a pointer is returned. Be careful with this operator. If you change a CString object after you have obtained the character pointer, you may cause a reallocation of memory that invalidates the pointer.
Modifiable Pointers
As mentioned in the above statement, the returned pointer by the cast operator is not meant to be modified. However, if you still need to use a modifiable pointer for some outdated C libraries, you can use a const_cast (if you are sure that function wont modify the pointer):
void Func(char* str) // or wchar_t* in Unicode environment
{
// your code here
}
// In your calling code:
CString cs = "Test";
Func(const_cast<LPTSTR>(static_cast<LPCTSTR>(test))); // Call your function with a modifiable pointer
If you wish to modify the pointer, you wont get around doing some kind of memory copying to modifiable memory, as mentioned by other answers.
There is a hardcoded method..
CString a = L"This is CString!";
char *dest = (char *)malloc(a.GetLength() + 1);
// +1 because of NULL char
dest[a.GetLength()] = 0; // setting null char
char *q = (char *)a.m_pszData;
//Here we cannot access the private member..
//The address of "m_pszData" private member is stored in first DWORD of &a...
//Therefore..
int address = *((int *)&a);
char *q = (char *)address;
// Now we can access the private data!, This is the real magic of C
// Size of CString's characters is 16bit...
// in cstring '1' will be stored as 0x31 0x00 (Hex)
// Here we just want even indexed chars..
for(int i = 0;i<(a.GetLength()*2);i += 2)
dest[i/2] = *(q+i);
// Now we can use it..
printf("%s", dest);
I've tried implementing a function like this, but unfortunately it doesn't work:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t wc[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
My main goal here is to be able to integrate normal char strings in a Unicode application. Any advice you guys can offer is greatly appreciated.
In your example, wc is a local variable which will be deallocated when the function call ends. This puts you into undefined behavior territory.
The simple fix is this:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t* wc = new wchar_t[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
Note that the calling code will then have to deallocate this memory, otherwise you will have a memory leak.
Use a std::wstring instead of a C99 variable length array. The current standard guarantees a contiguous buffer for std::basic_string. E.g.,
std::wstring wc( cSize, L'#' );
mbstowcs( &wc[0], c, cSize );
C++ does not support C99 variable length arrays, and so if you compiled your code as pure C++, it would not even compile.
With that change your function return type should also be std::wstring.
Remember to set relevant locale in main.
E.g., setlocale( LC_ALL, "" ).
const char* text_char = "example of mbstowcs";
size_t length = strlen(text_char );
Example of usage "mbstowcs"
std::wstring text_wchar(length, L'#');
//#pragma warning (disable : 4996)
// Or add to the preprocessor: _CRT_SECURE_NO_WARNINGS
mbstowcs(&text_wchar[0], text_char , length);
Example of usage "mbstowcs_s"
Microsoft suggest to use "mbstowcs_s" instead of "mbstowcs".
Links:
Mbstowcs example
mbstowcs_s, _mbstowcs_s_l
wchar_t text_wchar[30];
mbstowcs_s(&length, text_wchar, text_char, length);
You're returning the address of a local variable allocated on the stack. When your function returns, the storage for all local variables (such as wc) is deallocated and is subject to being immediately overwritten by something else.
To fix this, you can pass the size of the buffer to GetWC, but then you've got pretty much the same interface as mbstowcs itself. Or, you could allocate a new buffer inside GetWC and return a pointer to that, leaving it up to the caller to deallocate the buffer.
I did something like this. The first 2 zeros are because I don't know what kind of ascii type things this command wants from me. The general feeling I had was to create a temp char array. pass in the wide char array. boom. it works. The +1 ensures that the null terminating character is in the right place.
char tempFilePath[MAX_PATH] = "I want to convert this to wide chars";
int len = strlen(tempFilePath);
// Converts the path to wide characters
int needed = MultiByteToWideChar(0, 0, tempFilePath, len + 1, strDestPath, len + 1);
Andrew Shepherd 's answer.
Andrew Shepherd 's answer is Good for me, I add up some fix :
1, remove the ending char L'\0', casue sometime it will trouble.
2, use mbstowcs_s
std::wstring wtos(std::string& value){
const size_t cSize = value.size() + 1;
std::wstring wc;
wc.resize(cSize);
size_t cSize1;
mbstowcs_s(&cSize1, (wchar_t*)&wc[0], cSize, value.c_str(), cSize);
wc.pop_back();
return wc;
}
The question has several problems, but so do some of the answers. The idea of returning a pointer to allocated memory "and leaving it up to the caller to de-allocate" is asking for trouble. As a rule the best pattern is always to allocate and de-allocate within the same function. For example, something like:
wchar_t* buffer = new wchar_t[get_wcb_size(str)];
mbstowcs(buffer, str, get_wcb_size(str) + 1);
...
delete[] buffer;
In general, this requires two functions, one the caller calls to find out how much memory to allocate and a second to initialize or fill the allocated memory.
Unfortunately, the basic idea of using a function to return a "new" object is problematic -- not inherently, but because of the C++ inheritance of C memory handling. Using C++ and STL's strings/wstrings/strstreams is a better solution, but I felt the memory allocation thing needed to be better addressed.
Your problem has nothing to do with encodings, it's a simple matter of understanding basic C++. You are returning a pointer to a local variable from your function, which will have gone out of scope by the time anyone can use it, thus creating undefined behaviour (i.e. a programming error).
Follow this Golden Rule: "If you are using naked char pointers, you're Doing It Wrong. (Except for when you aren't.)"
I've previously posted some code to do the conversion and communicating the input and output in C++ std::string and std::wstring objects.
auto Ascii_To_Wstring = [](int code)->std::wstring
{
if (code>255 || code<0 )
{
throw std::runtime_error("Incorrect ASCII code");
}
std::string s{ char(code) };
std::wstring w{ s.begin(),s.end() };
return w;
};
In the CString header file (be it Microsoft's or Open Foundation Classes - http://www.koders.com/cpp/fid035C2F57DD64DBF54840B7C00EA7105DFDAA0EBD.aspx#L77 ), there is the following code snippet
struct CStringData
{
long nRefs;
int nDataLength;
int nAllocLength;
TCHAR* data() { return (TCHAR*)(&this[1]); };
...
};
What does the (TCHAR*)(&this[1]) indicate?
The CStringData struct is used in the CString class (http :// www.koders.com/cpp/fid100CC41B9D5E1056ED98FA36228968320362C4C1.aspx).
Any help is appreciated.
CString has lots of internal tricks which make it look like a normal string when passed e.g. to printf functions, despite actually being a class - without having to cast it to LPCTSTR in the argument list, e.g., in the case of varargs (...) in e.g. a printf. Thus trying to understand a single individual trick or function in the CString implementation is bad news. (The data function is an internal function which gets the 'real' buffer associated with the string.)
There's a book, MFC Internals that goes into it, and IIRC the Blaszczak book might touch it.
EDIT: As for what the expression actually translates to in terms of raw C++:-
TCHAR* data() { return (TCHAR*)(&this[1]); };
this says "pretend you're actually the first entry in an array of items allocated together. Now, the second item isnt actually a CString, it's a normal NUL terminated buffer of either Unicode or normal characters - i.e., an LPTSTR".
Another way of expressing the same thing is:
TCHAR* data() { return (TCHAR*)(this + 1); };
When you add 1 to a pointer to T, you actually add 1* sizeof T in terms of a raw memory address. So if one has a CString located at 0x00000010 with sizeof(CString) = 4, data will return a pointer to a NUL terminated array of chars buffer starting at 0x00000014
But just understanding this one thing out of context isnt necessarily a good idea.
Why do you need to know?
It returns the memory area that is immediately after the CStringData structure as an array of TCHAR characters.
You can understand why they are doing this if you look at the CString.cpp file:
static const struct {
CStringData data;
TCHAR ch;
} str_empty = {{-1, 0, 0}, 0};
CStringData* pData = (CStringData*)mem_alloc(sizeof(CStringData) + size*sizeof(TCHAR));
They do this trick, so that CString looks like a normal data buffer, and when you ask for the getdata it skips the CStringData structure and points directly to the real data buffer like char*