Only first character is assigned converting LPCTSTR to char* - c++

I'm completely new to C++. In my program there's a function which has to take a LPCTSTR as a parameter. I want to convert it into a char*. What I tried is as follows,
char* GetChar(LPCTSTR var){
char* id = (char*)var;
.....
}
But while debugging I noticed that only first letter of var is assigned to id.
What have I done wrong?
(I tried various answers in StackOverflow about converting LPCTSTR to char* before coming to this solution. None of them worked for me.)
UPDATE
What i want is to get full string pointed by var to be treated as char*

It is much more useful to just pick a character set (wchar_t, or char), and just stick to it, in your application, since trying to use TCHAR, when trying to support both, may cause you some headaches. To be fair, today, you can just, safely, use wchar_t (or WCHAR, since from the current types you are using, I suspect that you are using Windows headers).
The problem that you have, is because casting a pointer does not have any impact on its contents. And, since, typically wchar_t is 2 bytes in size, while char is 1 byte in size, storing the value, that fits inside a char, in wchar_t, leaves 2nd byte of wchar_t set to \0. And when you try to print null(\0)-terminated string of wchar_ts as a string of chars, the printing function reaches the \0 character after reading the first symbol, and assumes it is the end of the string. \0 character in wchar_t is 2 bytes long.
For example, the string
LPCWSTR test = L"Hi!";
is stored in memory as:
48 00 69 00 21 00 00 00
If you want to convert between the wchar_t version of the string to char version, or vice-versa, there exist some functions, that can do the conversion, and since I noticed that you probably are using Windows headers (from LPCTSTR define), those functions are WideCharToMultiByte/ MultiByteToWideChar.
You may now start to think: I am not using wchar_t! I am using TCHAR!
Typically TCHAR is defined in the following way:
#ifdef UNICODE
typedef WCHAR TCHAR;
#else
typedef char TCHAR;
#endif
So you could do similar handling in your conversion code:
template<int N>
bool GetChar(LPCTSTR var, char (&out)[N])){
#ifdef UNICODE
return WideCharToMultiByte (CP_ACP, 0, var, -1, out, N, NULL, NULL) != 0;
#else
return strcpy_s (out, var) == 0;
#endif
}
Note, the return value of GetChar function is true if the function Succeeds; false - otherwise.

You code has told the compiler to convert var (which is a pointer) into a pointer to a character and then assign that converted value to id. The only thing it converts is the pointer value. It doesn't make any changes to the thing var points to, copy it, or convert it. So you haven't done anything to the string var points to.
It's not clear what you're trying to do. But your code doesn't really do anything but convert a pointer value without changing or affecting the thing pointed to in any way.
When you convert a LPCTSTR (a long pointer to a const tchar string) to a char*, you get a char* that points to a CTSTR (a const tchar string). What use is that? What sense does that make?

Most probaby LPCTSTR is const wchar_t*, so if you cast it to char* (which is Undefined Behaviour - as var could point to literal), the LSB byte (wchar_t under Visual Studio is 16bits) of *var is zero so it is treated as '\0' - which indicates end of string. So in the end you get only one char.
To convert LPCTSTR to char* you can use wsctombs for example, see here: Convert const wchar_t* to const char*.

Here's an easy solution I found based on other answers given here.
char* GetChar(LPCTSTR var){
char id[30];
int i = 0;
while (var[i] != '\0')
{
id[i] = (char)var[i];
i++;
}
id[i] = '\0';
UPDATE
As mentioned in comments this is not a good way to solve this problem. But if someone has the same problem and cannot understand any other solution, this will help a bit.
Therefore I won't remove this answer.

Related

Conversion (const char*) var goes wrong

I need to convert from CString to double in Embedded Visual C++, which supports only old style C++. I am using the following code
CString str = "4.5";
double var = atof( (const char*) (LPCTSTR) str )
and resutlt is var=4.0, so I am loosing decimal digits.
I have made another test
LPCTSTR str = "4.5";
const char* var = (const char*) str
and result again var=4.0
Can anyone help me to get a correct result?
The issue here is, that you are lying to the compiler, and the compiler trusts you. Using Embedded Visual C++ I'm going to assume, that you are targeting Windows CE. Windows CE exposes a Unicode API surface only, so your project is very likely set to use Unicode (UTF-16 LE encoding).
In that case, CString expands to CStringW, which stores code units as wchar_t. When doing (const char*) (LPCTSTR) str you are then casting from a wchar_t const* to a char const*. Given the input, the first byte has the value 52 (the ASCII encoding for the character 4). The second byte has the value 0. That is interpreted as the terminator of the C-style string. In other words, you are passing the string "4" to your call to atof. Naturally, you'll get the value 4.0 as the result.
To fix the code, use something like the following:
CStringW str = L"4.5";
double var = _wtof( str.GetString() );
_wtof is a Microsoft-specific extension to its CRT.
Note two things in particular:
The code uses a CString variant with explicit character encoding (CStringW). Always be explicit about your string types. This helps read your code and catch bugs before they happen (although all those C-style casts in the original code defeats that entirely).
The code calls the CString::GetString member to retrieve a pointer to the immutable buffer. This, too, makes the code easier to read, by not using what looks to be a C-style cast (but is an operator instead).
Also consider defining the _CSTRING_DISABLE_NARROW_WIDE_CONVERSION macro to prevent inadvertent character set conversions from happening (e.g. CString str = "4.5";). This, too, helps you catch bugs early (unless you defeat that with C-style casts as well).
CString is not const char* To convert a TCHAR CString to ASCII, use the CT2A macro - this will also allow you to convert the string to UTF8 (or any other Windows code page):
// Convert using the local code page
CString str(_T("Hello, world!"));
CT2A ascii(str);
TRACE(_T("ASCII: %S\n"), ascii.m_psz);
// Convert to UTF8
CString str(_T("Some Unicode goodness"));
CT2A ascii(str, CP_UTF8);
TRACE(_T("UTF8: %S\n"), ascii.m_psz);
Found a solution using scanf
CString str="4.5"
double var=0.0;
_stscanf( str, _T("%lf"), &var );
This gives a correct result var=4.5
Thanks everyone for comments and help.

Am I converting properly from "const char *" to "TCHAR*"?

I'm trying to pass a process name as a TCHAR to the following void:
void GetBaseAddressByName(DWORD pID, TCHAR *pN)
By doing it like this:
GetBaseAddressByName(aProcs[i], (TCHAR*)"Process.exe");
So my question is: is what I am doing correct? Because I have tried both TEXT("Process.exe") and _T("Process.exe") with my project's Character Set both on Multi-Bite and Unicode and it just tells me that
argument of type "const char*" is incompatible with parameter of type "TCHAR*"
The short answer is no. TCHAR maps to either char or wchar_t depending on your project's Unicode/Multi-byte setting. So, in general, a cast like that is either unnecessary or incorrect. The correct way, as you said, is to use either the TEXT or _T macro. The reason you're getting an error is that you're trying to pass a const character string to a function that expects a mutable character string. The safeset way to get around the error is to copy your constant string into a local TCHAR buffer and then pass that to GetBaseAddressByName.
It is better to have a TCHAR array first, then copy into it.
#include "atlstr.h"
char const * procName = "processName.exe";
TCHAR szName [128];
_tcscpy(szName, A2T(procName));
GetBaseAddressByName(aProcs[i], szName);
As suggested by #Remy Lebeau in the comments, procName can be defined as TCHAR const * procName = TEXT("processName.exe");.
(TCHAR*)"Process.exe" is not a valid type-cast. It will "work" when the project charset is set to ANSI/MBCS, but it will produce garbage if the charset is set to Unicode.
Using TEXT("Process.exe") is the correct way to make a string literal use TCHAR characters.
GetBaseAddressByName(aProcs[i], TEXT("Process.exe"));
However, you need to change your pN parameter to const TCHAR * (or LPCTSTR) instead:
void GetBaseAddressByName(DWORD pID, const TCHAR *pN);
void GetBaseAddressByName(DWORD pID, LPCTSTR pN);
A string literal is const data, and you cannot pass a pointer-to-const-data where a pointer-to-non-const-data is expected (without casting the const away with const_cast). That is why you were still getting errors when trying to use the TEXT()/_T() macros.
You need a L, like L"Process.exe". Unicode strings are specified with L"".
That said, there is no reason to use TCHAR. Use unicode all the time, if doing Windows work only.

Check length char[] before converting to wstring()

I have a api function. I takes a pointer to array char. The calling function is out of my control. Array is dynamic but still need some checking
extern "C" int __stdcall calcW2(LPWSTR foo)
If somebody make a call with
char foo[5000];
LPSTR lpfoo2 = foo;
calcW2(lpfoo2 );
I understand that i need to make some checks. I can test for nulltpr. But if I want to len checking. That the char array has some validity. How is that best done? In the safest way for a string to 0 to 2500 chars. Do need check for something more?
if(foo != nullptr)
{
//Size checking
//size_t newsize = strlen(SerialNumber) + 1 not good?
std::wstring test(foo);
}
You missed one important point. The function signature says LPWSTR not LPSTR. This means that the function expects (or should expect) to receive wchar_t[] not char[]. See https://msdn.microsoft.com/en-us/library/cc230355.aspx.
I mean:
extern "C" int __stdcall calcW2(LPWSTR foo) <--- LP-W-STR
char foo[5000];
LPSTR lpfoo2 = foo; <--- LP-STR
calcW2(lpfoo2 ); <--- LP-STR passed into LP-W-STR ??
that should not compile. Argument types are wrong.
If you change the array to wchar_t[] and it starts to fail to compile, then most probably you have some _UNICODE #defines set wrong. In WINAPI and similar, many functions have dual definitions. When "UNICODE" flag is set, they take LPWSTR, but when the flag is cleared, the headers switch them to taking LPSTR. So if you see that it should be LPWSTR and you want it to be LPWSTR and it insists on being LPSTR, then you either messed up the function names, or UNICODE flag (or the header you have is simply incorrect).
char and wchar_t are different. Simplifying, char is "singlebyte" and wchar_t is "twobyte". Both use '\0' as the end-of-string marker, but in wchar_t that's actually '\0\0' since it's two bytes per character. Also, in wchar_t[] plain ASCII data isn't like a|b|c|d|e|f, it's 0|a|0|b|0|c|0|d|0|e|0|f since it's two bytes per character. That's why the strlen cannot work on 16bit encoded data properly - it picks the first \0 from the first character as end-of-string. Having a wchar_t data forcibly packed into char[] is plainly wrong or at least highly misleading and error-prone.
That's why you should use wstrlen instead, which so happens to take wchar_t* instead of char*.
This is a overall 'rule'. For any function working on char (strlen, strcat, strcmp, ..) you should be able to find relevant w* function (wstrlen, wstrcat, wstrcmp, ..). There may be some underscores in the names sometimes. Search the docs. Don't mix up char types. That't now just byte-array. There is some semantics out there for them, and usually if some types are named differently, there's a reason for that.

Difference between char* and wchar_t*

I am new to MFC. I am trying to do simple mfc application and I'm getting confuse in some places. For example, SetWindowText have two api, SetWindowTextA, SetWindowTextW one api takes char * and another one accepts wchar_t *.
What is the use of char * and wchar_t *?
char is used for so called ANSI family of functions (typically function name ends with A), or more commonly known as using ASCII character set.
wchar_t is used for new so called Unicode (or Wide) family of functions (typically function name ends with W), which use UTF-16 character set. It is very similar to UCS-2, but not quite it. If character requires more than 2 bytes, it will be converted into 2 composite codepoints, and this can be very confusing.
If you want to convert one to another, it is not really simple task. You will need to use something like MultiByteToWideChar, which requires knowing and providing code page for input ANSI string.
On Windows, APIs that take char * use the current code page whereas wchar_t * APIs use UTF-16. As a result, you should always use wchar_t on Windows. A recommended way to do this is to:
// Be sure to define this BEFORE including <windows.h>
#define UNICODE 1
#include <windows.h>
When UNICODE is defined, APIs like SetWindowText will be aliased to SetWindowTextW and can therefore be used safely. Without UNICODE, SetWindowText will be aliased to SetWindowTextA and therefore cannot be used without first converting to the current code page.
However, there's no good reason to use wchar_t when you are not calling Windows APIs, since its portable functionality is not useful, and its useful functionality is not portable (wchar_t is UTF-16 only on Windows, on most other platforms it is UTF-32, what a total mess.)
SetWindowTextA takes char*, which is a pointer to ANSI strings.
SetWindowTextW takes wchar_t*, which is a pointer to "wide" strings (Unicode).
SetWindowText has been defined (#define) to either of these in header Windows.h based on the type of application you are building. If you are building a UNICODE build then your code will automatically use SetWindowTextW.
SetWindowTextA is there primarily to support legacy code, which needs to be built as SBCS (Single byte character set).
char* : It means that this is a pointer to data of type char.
Example
// Regular char
char aChar = 'a';
// Pointer to char
char* aPointer = new char;
*aPointer = 'a';
// Pointer to an array of 10 chars
char* anArray = new char[ 10 ];
*anArray = 'a';
anArray[ 1 ] = 'b';
// Also a pointer to an array of 10
char[] anArray = new char[ 10 ];
*anArray = 'a';
anArray[ 1 ] = 'b';
wchar_t* : wchar_t is defined such that any locale's char encoding can be converted to a wchar_t representation where every wchar_t represents exactly one codepoint.

how to convert or cast CString to LPWSTR?

I tried to use this code:
USES_CONVERSION;
LPWSTR temp = A2W(selectedFileName);
but when I check the temp variable, just get the first character
thanks in advance
If I recall correctly, CString is typedef'd to either CStringA or CStringW, depending on whether you're building Unicode or not.
LPWSTR is a "Long Pointer to a Wide STRing" -- aka: wchar_t*
If you want to pass a CString to a function that takes LPWSTR, you can do:
some_function(LPWSTR str);
// if building in unicode:
some_function(selectedFileName);
// if building in ansi:
some_function(CA2W(selectedFileName));
// The better way, especially if you're building in both string types:
some_function(CT2W(selectedFileName));
HOWEVER LPWSTR is non-const access to a string. Are you using a function that tries to modify the string? If so, you want to use an actual buffer, not a CString.
Also, when you "check" temp -- what do you mean? did you try cout << temp? Because that won't work (it will display just the first character):
char uses one byte per character. wchar_t uses two bytes per character. For plain english, when you convert it to wide strings, it uses the same bytes as the original string, but each character gets padded with a zero. Since the NULL terminator is also a zero, if you use a poor debugger or cout (which is uses ANSI text), you will only see the first character.
If you want to print a wide string to standard out, use wcout.
In short: You cannot. If you need a non-const pointer to the underlying character buffer of a CString object you need to call GetBuffer.
If you need a const pointer you can simply use static_cast<LPCWSTR>(selectedFilename).
I know this is a decently old question, but I had this same question and none of the previous answers worked for me.
This, however, did work for my unicode build:
LPWSTR temp = (LPWSTR)(LPCWSTR)selectedFileName;
LPWSTR is a "Long Pointer to a Wide String". It is like wchar*.
CString strTmp = "temp";
wchar* szTmp;
szTmp = new WCHAR[wcslen(strTmp) + 1];
wcscpy_s(szTmp, wcslen(strTmp) + 1, strTmp);