I'm currently studying the MFC library and I wonder why I should use the GetBuffer member function, which returns a pointer to the CString object's buffer, over the other member functions that allow reading and changing characters in that object.
For example, why should I do this (the code changes the first character of a CString object):
CString aString(_T("String")); //new CString object
LPTSTR p = aString.GetBuffer(); //create new pointer to aString buffer
_tcsncpy(p, LPCTSTR(_T("a")), 1); //set first character to 'a'
aString.ReleaseBuffer(); //free allocated memory
Instead of:
CString aStr(_T("String")); //new CString object
aStr.SetAt(0, _T('a')); //set character at 0 position to 'a'
I suppose there is a more appropriate application to use GetBuffer() member, but I can't figure out what it can be... This function requires ReleaseBuffer() to free memory, and I may cause memory leaks when ReleaseBuffer() is not called. Is there any advantage of using it?
Don't use GetBuffer unless you have no alternative, for two reasons. (1) It must be followed by ReleaseBuffer, which you may forget to call. The memory itself is still freed when the CString is destroyed, but until ReleaseBuffer runs, the string's recorded length may not match what you wrote into the buffer. (2) You might inadvertently change the underlying data in a way that leaves it inconsistent. More often than not the functions GetString, SetString, GetAt and SetAt will do what you need without these disadvantages. Prefer them.
In the above example it is preferable to use the SetAt method.
In some cases you need GetBuffer to directly access the buffer, mainly when used with WinAPI functions. For example, to use ::GetWindowText with WinAPI code you need to allocate a buffer as follows:
int len = ::GetWindowTextLength(m_hWnd) + 1;
TCHAR *buf = new TCHAR[len]; // TCHAR matches ::GetWindowText in both ANSI and UNICODE builds
::GetWindowText(m_hWnd, buf, len);
...
delete[] buf;
The same thing can be done in MFC with CWnd::GetWindowText(CString&). But MFC has to use the same basic WinAPI functions, through GetBuffer. MFC's implementation of CWnd::GetWindowText is roughly as follows:
void CWnd::GetWindowText(CString &str)
{
int nLen = ::GetWindowTextLength(m_hWnd);
::GetWindowText(m_hWnd, str.GetBufferSetLength(nLen), nLen+1);
str.ReleaseBuffer();
}
I've tried implementing a function like this, but unfortunately it doesn't work:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t wc[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
My main goal here is to be able to integrate normal char strings in a Unicode application. Any advice you guys can offer is greatly appreciated.
In your example, wc is a local variable which will be deallocated when the function call ends. This puts you into undefined behavior territory.
The simple fix is this:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t* wc = new wchar_t[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
Note that the calling code will then have to deallocate this memory, otherwise you will have a memory leak.
Use a std::wstring instead of a C99 variable length array. The current standard guarantees a contiguous buffer for std::basic_string. E.g.,
std::wstring wc( cSize, L'#' );
mbstowcs( &wc[0], c, cSize );
C++ does not support C99 variable length arrays, and so if you compiled your code as pure C++, it would not even compile.
With that change your function return type should also be std::wstring.
Remember to set relevant locale in main.
E.g., setlocale( LC_ALL, "" ).
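Putting the pieces of this answer together, here is a minimal sketch of a complete function that returns a std::wstring. The name to_wide is made up for illustration. It assumes that mbstowcs accepts a null destination and then only counts the characters needed, which is a common extension (POSIX and the Microsoft CRT document it) rather than something the C++ standard promises.

```cpp
#include <cassert>
#include <cstdlib>   // std::mbstowcs
#include <string>

// Hypothetical helper: converts a narrow string to std::wstring using the
// current C locale (call setlocale first if the input is not plain ASCII).
std::wstring to_wide(const char* src)
{
    assert(src != nullptr);

    // Assumption: with a null destination, mbstowcs only reports how many
    // wide characters the conversion would produce (terminator excluded).
    const std::size_t needed = std::mbstowcs(nullptr, src, 0);
    if (needed == static_cast<std::size_t>(-1))
        return std::wstring();                    // invalid multibyte sequence

    std::wstring result(needed, L'\0');
    if (needed > 0)
        std::mbstowcs(&result[0], src, needed);   // fills exactly `needed` characters
    return result;
}
```

Because the string is sized to the converted length, it does not carry an extra L'\0' at the end, and the caller has nothing to deallocate.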
const char* text_char = "example of mbstowcs";
size_t length = strlen(text_char );
Example of usage "mbstowcs"
std::wstring text_wchar(length, L'#');
//#pragma warning (disable : 4996)
// Or add to the preprocessor: _CRT_SECURE_NO_WARNINGS
mbstowcs(&text_wchar[0], text_char , length);
Example of usage "mbstowcs_s"
Microsoft suggests using "mbstowcs_s" instead of "mbstowcs".
Links:
Mbstowcs example
mbstowcs_s, _mbstowcs_s_l
wchar_t text_wchar[30];
mbstowcs_s(&length, text_wchar, text_char, length);
You're returning the address of a local variable allocated on the stack. When your function returns, the storage for all local variables (such as wc) is deallocated and is subject to being immediately overwritten by something else.
To fix this, you can pass the size of the buffer to GetWC, but then you've got pretty much the same interface as mbstowcs itself. Or, you could allocate a new buffer inside GetWC and return a pointer to that, leaving it up to the caller to deallocate the buffer.
Andrew Shepherd's answer works well for me. I added a couple of fixes:
1. Remove the trailing L'\0', because it can sometimes cause trouble.
2. Use mbstowcs_s.
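std::wstring wtos(const std::string& value){
const size_t cSize = value.size() + 1; // room for the terminating L'\0'
std::wstring wc;
wc.resize(cSize);
size_t cSize1 = 0; // receives the converted length, terminator included
mbstowcs_s(&cSize1, &wc[0], cSize, value.c_str(), cSize - 1);
// Drop the terminator and any unused tail; cSize1 is 0 if the conversion failed.
wc.resize(cSize1 > 0 ? cSize1 - 1 : 0);
return wc;
}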
The question has several problems, but so do some of the answers. The idea of returning a pointer to allocated memory "and leaving it up to the caller to de-allocate" is asking for trouble. As a rule the best pattern is always to allocate and de-allocate within the same function. For example, something like:
const size_t count = get_wcb_size(str) + 1; // + 1 for the terminating L'\0'
wchar_t* buffer = new wchar_t[count];
mbstowcs(buffer, str, count);
...
delete[] buffer;
In general, this requires two functions, one the caller calls to find out how much memory to allocate and a second to initialize or fill the allocated memory.
Unfortunately, the basic idea of using a function to return a "new" object is problematic -- not inherently, but because of the C++ inheritance of C memory handling. Using C++ and STL's strings/wstrings/strstreams is a better solution, but I felt the memory allocation thing needed to be better addressed.
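As a rough sketch of that two-function pattern with the allocation kept in one place: the helper names wide_buffer_size, fill_wide and wide_length are invented here, and the size query again assumes that mbstowcs accepts a null destination (a common extension). A std::vector owns the storage, so the buffer is released when the function that created it returns.

```cpp
#include <cassert>
#include <cstdlib>   // std::mbstowcs
#include <cwchar>    // std::wcslen
#include <vector>

// Hypothetical: number of wchar_t elements needed, terminator included
// (0 if the input contains an invalid multibyte sequence).
std::size_t wide_buffer_size(const char* str)
{
    assert(str != nullptr);
    const std::size_t n = std::mbstowcs(nullptr, str, 0);
    return n == static_cast<std::size_t>(-1) ? 0 : n + 1;
}

// Hypothetical: fills a caller-owned buffer; false if the result does not fit.
bool fill_wide(const char* str, wchar_t* out, std::size_t count)
{
    if (out == nullptr || count == 0)
        return false;
    const std::size_t written = std::mbstowcs(out, str, count);
    // written == count means there was no room left for the terminator.
    return written != static_cast<std::size_t>(-1) && written < count;
}

// The buffer is created, used and destroyed inside this one function.
std::size_t wide_length(const char* str)
{
    std::vector<wchar_t> buffer(wide_buffer_size(str));
    if (!fill_wide(str, buffer.data(), buffer.size()))
        return 0;
    return std::wcslen(buffer.data());
}   // buffer is freed here, with no delete[] to forget
```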
Your problem has nothing to do with encodings, it's a simple matter of understanding basic C++. You are returning a pointer to a local variable from your function, which will have gone out of scope by the time anyone can use it, thus creating undefined behaviour (i.e. a programming error).
Follow this Golden Rule: "If you are using naked char pointers, you're Doing It Wrong. (Except for when you aren't.)"
I've previously posted some code to do the conversion and communicating the input and output in C++ std::string and std::wstring objects.
I did something like this. The first 2 zeros are because I don't know what kind of ascii type things this command wants from me. The general feeling I had was to create a temp char array. pass in the wide char array. boom. it works. The +1 ensures that the null terminating character is in the right place.
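char tempFilePath[MAX_PATH] = "I want to convert this to wide chars";
wchar_t strDestPath[MAX_PATH]; // destination buffer for the wide string
int len = (int)strlen(tempFilePath);
// Converts the path to wide characters. The first 0 is CP_ACP (the system
// ANSI code page); the second 0 means no conversion flags.
int needed = MultiByteToWideChar(0, 0, tempFilePath, len + 1, strDestPath, len + 1);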
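auto Ascii_To_Wstring = [](int code)->std::wstring
{
if (code>255 || code<0 )
{
throw std::runtime_error("Code is outside the range 0..255");
}
// Go through unsigned char so that codes 128..255 are not sign-extended
// on platforms where plain char is signed.
return std::wstring(1, static_cast<wchar_t>(static_cast<unsigned char>(code)));
};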
I found a nice example how to play with folder selecting dialog: http://bobmoore.mvps.org/Win32/w32tip70.htm - and all this is working except of this example using CString which I can't have on MinGW because it doesn't have stdafx.h. So I must use either string or char*.
But here the problem is that this example uses CString methods: GetBuffer and ReleaseBuffer which I don't have in string object. Is there any other method of passing folder name to folder selection window ?
When dealing with the Windows API and buffers, you can use std::vector<BYTE> for bytes and std::vector<TCHAR> for strings. (TCHAR is defined as wchar_t if UNICODE is defined and char otherwise. This way the code works for both UNICODE and ANSI). When instantiating the vector, give it a size to allocate memory:
// can hold MAX_PATH TCHARs, including terminating '\0'
std::vector<TCHAR> buffer(MAX_PATH);
Now you can treat it almost exactly like a buffer of TCHARs allocated with new or created on the stack.
BROWSEINFO bi = {0};
bi.pszDisplayName = &buffer[0];
However, buffer.size() will always return the full vector length. If you need to know the length of the string stored within the vector, or want to use string related methods, you can copy it to a std::string:
if( LPITEMIDLIST pidl = SHBrowseForFolder(&bi) ) {
// this way it works for both UNICODE and ANSI:
std::basic_string<TCHAR> folderName(&buffer[0]);
if( SHGetPathFromIDList(pidl,&buffer[0]) ) {
MessageBox(0, &buffer[0], folderName.c_str(), MB_OK);
}
// TODO: free pidl with IMalloc* obtained through SHGetMalloc()
}
Since std::string is just another contiguous container, you could (ab)use that instead of the vector. However, size() will return the number of elements stored in the string, even if they are \0. You would have to resize() the string to the first occurrence of \0 (that is what CString::ReleaseBuffer() does), which happens automatically when you assign the buffer to the string as in the above example. Because a string is not meant to be used as a buffer (even if it is technically possible), I strongly recommend using the vector approach.
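To make the difference concrete without pulling in the Windows headers, here is a small portable sketch. fill_buffer is a made-up stand-in for any C-style API that writes a NUL-terminated string into a caller-supplied buffer, and 260 simply mirrors the usual value of MAX_PATH.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for an API that fills a caller-supplied buffer.
void fill_buffer(char* out, std::size_t size)
{
    assert(out != nullptr && size > 0);
    std::snprintf(out, size, "%s", "C:\\Temp");
}

// Vector approach: the copy into std::string stops at the first '\0'.
std::string read_via_vector()
{
    std::vector<char> buffer(260);
    fill_buffer(buffer.data(), buffer.size());
    return std::string(buffer.data());
}

// String-as-buffer approach: size() stays 260 until you trim it yourself.
std::string read_via_string()
{
    std::string s(260, '\0');
    fill_buffer(&s[0], s.size());
    s.resize(s.find('\0'));   // manual equivalent of the trimming step
    return s;
}
```

Both functions return the same text. The second one only does so because of the explicit resize, which is the step that is easy to forget.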
With std::string, c_str() gives you read-only access to the underlying representation; before C++17 there is no member function that returns a writable pointer.
In your case, I think the only option is to use some old-fashioned memory management, and then copy the result in a std::string.
I am trying to learn a little c++ and I have a silly question. Consider this code:
TCHAR tempPath[255];
GetTempPath(255, tempPath);
Why does windows need the size of the var tempPath? I see that the GetTempPath is declared something like:
GetTempPath(dword size, buf LPTSTR);
How can windows change the buf value without the & operator? Should not the function be like that?
GetTempPath(buf &LPTSTR);
Can somebody provide a simple GetTempPath implementation sample so I can see how size is used?
EDIT:
Thanks for all your answers, they are all correct and I gave you all +1. But what I meant by "Can somebody provide a simple GetTempPath implementation" is that I have tried to code a function similar to the one Windows uses, as follows:
void MyGetTempPath(int size, char* buf)
{
buf = "C:\\test\\";
}
int main(int argc, char *argv[])
{
char* tempPath = new TCHAR[255];
GetTempPathA(255, tempPath);
MessageBoxA(0, tempPath, "test", MB_OK);
return EXIT_SUCCESS;
}
But it does not work. MessageBox displays a "##$" string. How should MyGetTempPath be coded to work properly?
Windows needs the size as a safety precaution. It could crash the application if it copies characters past the end of the buffer. When you supply the length, it can prevent that.
When an array is passed to a function, its name converts to a pointer to the first element. The function therefore already receives the address of your buffer, so there is no need for the & operator.
Not sure what kind of example you are looking for. Like I said, it just needs to verify it doesn't write more characters than there's room for.
An array cannot be passed into functions by-value. Instead, it's converted to a pointer to the first element, and that's passed to the function. Having a (non-const) pointer to data allows modification:
void foo(int* i)
{
if (i) // don't dereference null
*i = 5; // dereference pointer, modify int
}
Likewise, the function now has a pointer to a TCHAR it can write to. It takes the size, then, so it knows exactly how many TCHAR's exist after that initial one. Otherwise it wouldn't know how large the array is.
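This also explains why assigning to the pointer parameter, as in the question's MyGetTempPath, has no visible effect: the function only changes its own copy of the pointer. A small portable sketch of the contrast, where broken_fill and working_fill are names made up for this illustration:

```cpp
#include <cassert>
#include <cstring>

// Reassigning the parameter changes only the function's local copy of the
// pointer; the caller's buffer is never touched.
void broken_fill(char* buf, std::size_t size)
{
    static char path[] = "C:\\test\\";
    buf = path;
    (void)buf;
    (void)size;
}

// Writing through the pointer modifies the caller's buffer. The size
// parameter is what lets the function refuse to overrun it.
bool working_fill(char* buf, std::size_t size)
{
    static const char path[] = "C:\\test\\";
    assert(buf != nullptr);
    if (size < sizeof(path))               // sizeof counts the '\0' too
        return false;
    std::memcpy(buf, path, sizeof(path));
    return true;
}
```

After broken_fill the caller's array still holds whatever it held before. After working_fill it holds the path, provided the array is large enough.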
GetTempPath() outputs into your "tempPath" character array. If you don't tell it how much space there is allocated in the array (255), it has no way of knowing whether or not it will have enough room to write the path string into tempPath.
Character arrays in C/C++ are pretty much just pointers to locations in memory. They don't contain other information about themselves, like instances of C++ or Java classes might. The meat and potatoes of the Windows API was designed before C++ really had much inertia, I think, so you'll often have to use older C style techniques and built-in data types to work with it.
You can try the following wrapper if you want to avoid passing the size:
template<typename CHAR_TYPE, unsigned int SIZE>
void MyGetTempPath (CHAR_TYPE (&array)[SIZE]) // 'return' value can be your choice
{
GetTempPath(SIZE, array);
}
Now you can use it like this:
TCHAR tempPath[255];
MyGetTempPath(tempPath); // No need to pass size, it will count automatically
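If you want to try the size deduction without the Windows headers, a portable sketch of the same idea looks like this. fake_get_temp_path is an invented stand-in for GetTempPath and "/tmp/" is just a placeholder value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for GetTempPath: writes at most `size` characters
// (terminator included) and returns the length of the full path.
std::size_t fake_get_temp_path(std::size_t size, char* buf)
{
    assert(buf != nullptr && size > 0);
    const int n = std::snprintf(buf, size, "%s", "/tmp/");
    return n < 0 ? 0 : static_cast<std::size_t>(n);
}

// N is deduced from the array type, so the caller cannot pass a wrong size.
template <std::size_t N>
std::size_t my_get_temp_path(char (&array)[N])
{
    return fake_get_temp_path(N, array);
}
```

The deduction only works for real arrays. A pointer obtained from new[] has lost its size, so such a call would not compile.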
As for your other question, the reason we do NOT use the following:
GetTempPath(buf &LPTSTR);
is that & in a parameter declaration passes the argument by reference (it is not an address). LPTSTR is already a pointer type (a typedef for TCHAR*), so the function can write through it into the caller's buffer.
Can somebody provide a simple GetTempPath implementation sample so I can see how size is used?
First way (based on MAX_PATH constant):
TCHAR szPath[MAX_PATH];
GetTempPath(MAX_PATH, szPath);
Second way (based on GetTempPath description):
DWORD size;
LPTSTR lpszPath;
size = GetTempPath(0, NULL);
lpszPath = new TCHAR[size];
GetTempPath(size, lpszPath);
/* some code here */
delete[] lpszPath;
How can windows change the buf value without the & operator?
The & operator is not needed because the array name converts to a pointer to the first array element, which is also the address of the whole array. Try the following code to demonstrate this:
TCHAR sz[1];
if ((void*)sz == (void*)&sz) _tprintf(TEXT("sz equals to &sz \n"));
if ((void*)sz == (void*)&(sz[0])) _tprintf(TEXT("sz equals to &(sz[0]) \n"));
As requested, a very simple implementation.
bool MyGetTempPath(size_t size, char* buf)
{
const char* path = "C:\\test\\";
size_t len = strlen(path);
if(buf == NULL)
return false;
if(size < len + 1)
return false;
strncpy(buf, path, size);
return true;
}
An example call to the new function:
char buffer[256];
bool success = MyGetTempPath(256, buffer);
from http://msdn.microsoft.com/en-us/library/aa364992(v=vs.85).aspx
DWORD WINAPI GetTempPath(
__in DWORD nBufferLength,
__out LPTSTR lpBuffer
);
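so GetTempPath is declared something like
DWORD GetTempPath(DWORD nBufferLength, LPTSTR lpBuffer);
Here __in and __out are annotations that only document the direction of each parameter. lpBuffer is a pointer passed by value, and the function writes its result through that pointer into the caller's buffer; no reference is involved.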
In the CString header file (be it Microsoft's or Open Foundation Classes - http://www.koders.com/cpp/fid035C2F57DD64DBF54840B7C00EA7105DFDAA0EBD.aspx#L77 ), there is the following code snippet
struct CStringData
{
long nRefs;
int nDataLength;
int nAllocLength;
TCHAR* data() { return (TCHAR*)(&this[1]); };
...
};
What does the (TCHAR*)(&this[1]) indicate?
The CStringData struct is used in the CString class (http://www.koders.com/cpp/fid100CC41B9D5E1056ED98FA36228968320362C4C1.aspx).
Any help is appreciated.
CString has lots of internal tricks which make it look like a normal string when passed e.g. to printf functions, despite actually being a class - without having to cast it to LPCTSTR in the argument list, e.g., in the case of varargs (...) in e.g. a printf. Thus trying to understand a single individual trick or function in the CString implementation is bad news. (The data function is an internal function which gets the 'real' buffer associated with the string.)
There's a book, MFC Internals that goes into it, and IIRC the Blaszczak book might touch it.
EDIT: As for what the expression actually translates to in terms of raw C++:-
TCHAR* data() { return (TCHAR*)(&this[1]); };
this says "pretend you're actually the first entry in an array of items allocated together". Now, the second item isn't actually a CStringData; it's a normal NUL-terminated buffer of either Unicode or normal characters, i.e., an LPTSTR.
Another way of expressing the same thing is:
TCHAR* data() { return (TCHAR*)(this + 1); };
When you add 1 to a pointer to T, you actually add 1 * sizeof(T) in terms of a raw memory address. So if one has a CStringData located at 0x00000010 with sizeof(CStringData) = 12 (4-byte long and int), data() will return a pointer to a NUL-terminated character buffer starting at 0x0000001C.
But just understanding this one thing out of context isn't necessarily a good idea.
Why do you need to know?
It returns the memory area that is immediately after the CStringData structure as an array of TCHAR characters.
You can understand why they are doing this if you look at the CString.cpp file:
static const struct {
CStringData data;
TCHAR ch;
} str_empty = {{-1, 0, 0}, 0};
CStringData* pData = (CStringData*)mem_alloc(sizeof(CStringData) + size*sizeof(TCHAR));
They use this trick so that a CString looks like a normal data buffer: when you ask for data(), it skips the CStringData structure and points directly at the real character buffer, like a char*.
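Here is a self-contained sketch of the same layout trick in portable C++. StringHeader, make_string and free_string are invented names that only imitate the shape of CStringData; this is not the actual MFC code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for CStringData: a fixed header that is followed
// immediately by the character data in the same allocation.
struct StringHeader
{
    long nRefs;
    int  nDataLength;
    int  nAllocLength;
    // this + 1 is the address just past the header, same as &this[1].
    char* data() { return reinterpret_cast<char*>(this + 1); }
};

// One allocation holds the header plus the text plus its terminator.
StringHeader* make_string(const char* text)
{
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    void* mem = std::malloc(sizeof(StringHeader) + len + 1);
    if (mem == nullptr)
        return nullptr;
    StringHeader* h = new (mem) StringHeader{1, static_cast<int>(len),
                                             static_cast<int>(len)};
    std::memcpy(h->data(), text, len + 1);   // copies the '\0' as well
    return h;
}

void free_string(StringHeader* h) { std::free(h); }
```

The pointer returned by data() is always exactly sizeof(StringHeader) bytes past the start of the block, which is why the header can be found again from the character pointer by stepping back one header.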