I am trying to convert a program from multibyte characters to Unicode.
I have gone through the program and preceded the string literals with L so they look like L"string".
This has worked but I am now left with a C style string that won't conform. I have tried the L and putting it in TEXT() but the L gets added to the variable name -- not the string -- if I use TEXT().
I have tried making it a TCHAR but then it complains that it cannot convert a TCHAR to a char *.
What options am I left with?
I know C and C++ are different. It is an old in-house C library that has been used in C++ projects for several years now.
The std::mbstowcs function is what you are looking for:
char text[] = "something";
wchar_t wtext[20];
mbstowcs(wtext, text, strlen(text)+1);//Plus null
LPWSTR ptr = wtext;
for strings,
string text = "something";
wchar_t wtext[20];
mbstowcs(wtext, text.c_str(), text.length() + 1); // length() excludes the null, so add 1
LPWSTR ptr = wtext;
--> ED: The "L" prefix only works on string literals, not variables. <--
The clean way to use mbsrtowcs is to call it twice: once to find the length of the result, and once to do the conversion:
const char * cs = /* your input char* */;
size_t wn = mbsrtowcs(NULL, &cs, 0, NULL);
// error if wn == size_t(-1)
wchar_t * buf = new wchar_t[wn + 1](); // value-initialize to 0 (see below)
wn = mbsrtowcs(buf, &cs, wn + 1, NULL);
// error if wn == size_t(-1)
assert(cs == NULL); // successful conversion
// result now in buf, return e.g. as std::wstring
delete[] buf;
Don't forget to call setlocale(LC_CTYPE, ""); at the beginning of your program!
The advantage over the Windows MultiByteToWideChar is that this is entirely standard C, although on Windows you might prefer the Windows API function anyway.
I usually wrap this method, along with the opposite one, in two conversion functions string->wstring and wstring->string. If you also add trivial overloads string->string and wstring->wstring, you can easily write code that compiles with the Winapi TCHAR typedef in any setting.
[Edit:] I added zero-initialization to buf, in case you plan to use the C array directly. I would usually return the result as std::wstring(buf, wn), though, but do beware if you plan on using C-style null-terminated arrays.[/]
In a multithreaded environment you should pass a thread-local conversion state (a std::mbstate_t) to the function as its final parameter, which is NULL in the snippet above.
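As a rough sketch of the wrapper idea described above: the names to_wide and to_narrow are hypothetical (they come from no library), and the code assumes a suitable locale has already been selected with setlocale. Each call keeps its own local std::mbstate_t, which is one way to avoid sharing conversion state between threads. Inputs are treated as C strings, so anything after an embedded null is dropped.

```cpp
#include <cassert>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: multibyte (current C locale) -> wide.
inline std::wstring to_wide(const std::string& in)
{
    std::mbstate_t state = std::mbstate_t();   // per-call conversion state
    const char* src = in.c_str();
    // First pass: a null destination only counts the wide characters needed.
    std::size_t n = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");

    std::vector<wchar_t> buf(n + 1);           // +1 for the terminator
    state = std::mbstate_t();
    src = in.c_str();
    std::mbsrtowcs(buf.data(), &src, buf.size(), &state);
    return std::wstring(buf.data(), n);
}

// Hypothetical helper: wide -> multibyte (current C locale).
inline std::string to_narrow(const std::wstring& in)
{
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = in.c_str();
    std::size_t n = std::wcsrtombs(nullptr, &src, 0, &state);
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in this locale");

    std::vector<char> buf(n + 1);
    state = std::mbstate_t();
    src = in.c_str();
    std::wcsrtombs(buf.data(), &src, buf.size(), &state);
    return std::string(buf.data(), n);
}

// Trivial overloads so generic (TCHAR-style) code compiles in either setting.
inline std::wstring to_wide(const std::wstring& in)  { return in; }
inline std::string  to_narrow(const std::string& in) { return in; }
```

With these four overloads, calling code can write to_wide(x) or to_narrow(x) without caring which string type x already is.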
Here is a small rant of mine on this topic.
I'm using the following in VC++ and it works like a charm for me.
CA2CT(charText)
This version, using the Windows API function MultiByteToWideChar(), handles the memory allocation for arbitrarily long input strings.
int lenA = lstrlenA(input);
int lenW = ::MultiByteToWideChar(CP_ACP, 0, input, lenA, NULL, 0);
if (lenW>0)
{
output = new wchar_t[lenW + 1]; // lenA excluded the terminator, so leave room for one
::MultiByteToWideChar(CP_ACP, 0, input, lenA, output, lenW);
output[lenW] = L'\0';
}
You may use CString, CStringA, CStringW to do automatic conversions and convert between these types. Further, you may also use CStrBuf, CStrBufA, CStrBufW to get RAII pattern modifiable strings
I do have this function defined in windows playsoundapi.h.
PlaySound(L"D:\\resources\\English\\A.wav", NULL, SND_LOOP);
I want to concatenate a variable to replace "A.wav" in c++.
The variable is of type char*
Can anyone suggest a solution to this please? Much appreciated.
In C++17 or later, use std::filesystem::path, which is handy for this kind of scenario:
using std::filesystem::path;
path file = ...; // L"A.wav" // here can be wide characters things and regular character things - proper conversion is done implicitly
path base{L"D:\\resources\\English"};
PlaySound((base / file).c_str(), NULL, SND_LOOP);
Note that std::filesystem::path::c_str() returns const wchar_t* on Windows and const char * on other platforms.
Return value
The native string representation of the pathname, using native syntax, native character type, and native character encoding. This string is suitable for use with OS APIs.
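A minimal sketch of the same idea as a reusable function follows. The name sound_path is made up for illustration. It relies only on std::filesystem::path accepting both narrow and wide inputs. On Windows a char-based name is interpreted in the native narrow encoding, so this is an assumption to check if the name may contain non-ASCII characters.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper: join a base directory with a narrow file name.
// operator/ inserts the platform's preferred separator where needed.
inline std::filesystem::path sound_path(const std::filesystem::path& base,
                                        const char* file)
{
    return base / std::filesystem::path(file);
}
```

On Windows the result's c_str() is a const wchar_t*, so it could be passed straight to the wide version of PlaySound in place of the hard-coded literal.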
Simple enough
std::wstring var = ...;
PlaySound((L"D:\\resources\\English\\" + var).c_str(), NULL, SND_LOOP);
But if your variable is something other than a std::wstring, then that is a different question. Please add more details if that is the case.
EDIT
It seems the variable is type char*. One possible solution is to make a std::wstring variable from the char* variable
char* var = ...;
std::wstring tmp(var, var + strlen(var));
PlaySound((L"D:\\resources\\English\\" + tmp).c_str(), NULL, SND_LOOP);
This does assume that there are no encoding issues in copying from char to wchar_t but again that's a detail not provided in the question.
Also you should consider why the variable is char* in the first place. You are working with an API that requires wide characters, so why not use wide characters in your code?
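The character-by-character copy shown above is only safe when every byte is plain ASCII. One way to make that assumption explicit is sketched below. The function name widen_ascii is hypothetical, and rejecting non-ASCII input with an exception is just one possible policy.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: widen a C string one byte at a time, refusing
// anything outside 7-bit ASCII, where a byte-wise copy would be wrong.
inline std::wstring widen_ascii(const char* s)
{
    std::wstring out;
    for (; *s != '\0'; ++s) {
        const unsigned char c = static_cast<unsigned char>(*s);
        if (c > 0x7F)
            throw std::invalid_argument("non-ASCII byte; a real conversion is needed");
        out.push_back(static_cast<wchar_t>(c));
    }
    return out;
}
```

For input that may contain non-ASCII characters, a real conversion function such as mbstowcs or MultiByteToWideChar is the safer choice.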
Assign your char* string to a std::string, which you can then concatenate with your base path, and then use the std::string::c_str() method to get a const char* pointer that you can pass to PlaySound(), eg:
std::string fileName = "A.wav";
PlaySoundA(("D:\\resources\\English\\" + fileName).c_str(), NULL, SND_LOOP);
I want to use the Win32 API CreateProcess, which accepts its 2nd parameter as "LPTSTR".
But I have the path to my exe in a char array. My VS2013 project (static library) uses the Unicode character set.
Code snippet below.
In this line
"appPath = (LPTSTR)TestEXEPath;"
of the code snippet below, where the type cast happens, I see that the characters in "appPath" get converted to junk characters, whereas the RHS of the expression, "TestEXEPath", does have valid characters.
However, there is no compilation error here. It is at run time that the characters get corrupted in "appPath".
I know this typecast is creating the problem. But how do I solve it? How do I cast this char array to the LPTSTR type that is needed by the CreateProcess() API?
Or is there a better way of doing this that avoids the char array altogether?
LPTSTR appPath;
char cwd[_MAX_PATH];
getcwd(cwd, _MAX_PATH);
char TestEXEPath[_MAX_PATH];
strcpy(TestEXEPath, cwd);
strcat(TestEXEPath, "\\pwrtest.exe /sleep /c:1");
appPath = (LPTSTR)TestEXEPath; // The characters in this gets converted to some junk characters.
.......................
......................
CreateProcess(NULL, appPath, NULL, NULL, FALSE, 0, NULL, workingDir, &sI, &pI);
You are compiling for Unicode, so LPTSTR expands to wchar_t*. But you have ANSI data, char*. In that case it is simplest to call CreateProcessA and pass the ANSI data.
BOOL retval = CreateProcessA(..., TestEXEPath, ...);
If you want to avoid using ANSI functions then you can stick to wchar_t arrays.
wchar_t exepath[MAX_PATH + 100]; // enough room for cwd and the rest of command line
GetCurrentDirectory(MAX_PATH, exepath);
wcscat(exepath, L"\\pwrtest.exe /sleep /c:1");
BOOL retval = CreateProcess(..., exepath, ...);
Note that I switched from getcwd to GetCurrentDirectory in order to get a wide char version of the working directory.
Note also that your code should check for errors. I neglected to do that here out of laziness. But in your real code, you should not be as lazy as I have been.
The fact that you had to cast should have set off warning signals for you. Well, judging from the question, it probably did. When you write:
appPath = (LPTSTR)TestEXEPath;
That simply tells the compiler to treat TestEXEPath as LPTSTR whether or not it really is. And the fact that the program won't compile without the cast tells you that TestEXEPath is not LPTSTR. The cast does not change that reality, it merely shuts the compiler up. Always a bad move.
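To see why the cast produces junk, here is a small portable sketch. The helper first_wide_unit is hypothetical. It reads the first sizeof(wchar_t) bytes of a char buffer as one wide character, which is in effect how a wide-character API interprets the memory after the cast. It uses memcpy rather than a pointer cast to stay clear of aliasing problems, and the exact value it returns depends on the platform's wchar_t size and byte order.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reinterpret the leading bytes of a narrow buffer
// as a single wchar_t. The buffer must hold at least sizeof(wchar_t) bytes.
inline wchar_t first_wide_unit(const char* bytes)
{
    wchar_t w;
    std::memcpy(&w, bytes, sizeof w);   // several chars get packed into one wchar_t
    return w;
}
```

For a buffer starting with "C:\\", the result is not L'C': the bytes for 'C' and ':' (and possibly more) are fused into a single unrelated code unit, which is the corruption seen in appPath.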
With unicode, LPTSTR points to an array of wchar_t and not an array of char.
Here are some additional explanations.
Try with:
TCHAR TestExePath[_MAX_PATH];
And use wcscat() and wcscpy() and the other wide c-string handling functions in <cwchar>.
Also take a look at the very convenient ATL conversion classes here: http://msdn.microsoft.com/en-us/library/87zae4a3.aspx
CA2T str(TestEXEPath); // keep the converter object itself; assigning the temporary to an LPTSTR would leave a dangling pointer
Or even easier just
CreateProcess(NULL, CA2T(TestEXEPath), NULL, NULL, FALSE, 0, NULL, workingDir, &sI, &pI);
No destruction is needed.
How would I take...
string modelPath = "blah/blah.obj";
and concatenate it with...
L" not found."
While passing it in as LPCWSTR. I tried to do
(LPCWSTR)(modelPath + " was not found.").c_str()
However that did not work. Here is a larger example of what it looks like now.
if(!fin)
{
MessageBox(0, L"Models/WheelFinal.txt not found.", 0, 0); //
return;
}
LPCWSTR is a Long Pointer to a Constant Wide STRing. Wide strings, at least in Win32, use 16-bit characters, whereas (const) char strings (i.e. (C)STR or their pointer counterparts LP(C)STR) use 8-bit characters.
Think of them on Win32 as typedef const char* LPCSTR and typedef const wchar_t* LPCWSTR.
std::string is an 8-bit string (using the underlying type char by default) whereas std::wstring is a wider character string (i.e. 16-bits on win32, using wchar_t by default).
If you can, use std::wstring to concatenate a L"string" as a drop-in replacement.
A note on MessageBox()
Windows has a funny habit of defining macros for API calls that switch out underlying calls given the program's multibyte configuration. For almost every API call that uses strings, there is a FunctionA and FunctionW call that takes an LPCSTR or LPCWSTR respectively.
MessageBox is one of them. In Visual Studio, you can go into project settings and change your Multi-Byte (wide/narrow) setting or you can simply call MessageBoxA/W directly in order to pass in different encodings of strings.
For example:
LPCWSTR wideString = L"Hello, ";
MessageBoxW(NULL, (std::wstring(wideString) + L"world!").c_str(), L"Hello!", MB_OK);
LPCSTR narrowString = "Hello, ";
MessageBoxA(NULL, (std::string(narrowString) + "world!").c_str(), "Hello!", MB_OK);
If you can change modelPath to std::wstring, it becomes easy:
MessageBox(nullptr, (modelPath + L" not found.").c_str(), nullptr, 0);
I changed your 0 pointer values into nullptr as well.
Since std::string represents a narrow string, std::wstring represents a wide string, and the two are wildly different, casting from one representation to the other does not work, while starting with the appropriate one does. On the other hand, one can properly convert between representations using the <codecvt> header introduced in C++11 (note that it has been deprecated since C++17).
I'm a little confused about C strings and wide C strings. For the sake of this question, assume that I am using Microsoft Visual Studio 2010 Professional. Please let me know if any of my information is incorrect.
I have a struct with a const wchar_t* member which is used to store a name.
struct A
{
const wchar_t* name;
};
When I assign object 'a' a name as so:
int main()
{
A a;
const wchar_t* w_name = L"Tom";
a.name = w_name;
return 0;
}
That is just copying the memory address that w_name points to into a.name. Now w_name and a.name are both wide character pointers which point to the same address in memory.
If I am correct, then I am wondering what to do about a situation like this. I am reading in a C string from an XML attribute using tinyxml2.
tinyxml2::XMLElement* pElement;
// ...
const char* name = pElement->Attribute("name");
After I have my C string, I am converting it to a wide character string as follows:
size_t newsize = strlen(name) + 1;
wchar_t * wcName = new wchar_t[newsize];
size_t convertedChars = 0;
mbstowcs_s(&convertedChars, wcName, newsize, name, _TRUNCATE);
a.name = wcName;
delete[] wcName;
If I am correct so far, then the line:
a.name = wcName;
is just copying the memory address of the first character of array wcName into a.name. However, I am deleting wcName directly after assigning this pointer which would make it point to garbage.
How can I convert my C string into a wide character C string and then assign it to a.name?
The easiest approach is probably to task your name member with the management of the memory. This, in turn, is easily done by declaring it as
std::wstring name;
Standard strings don't have a concept of independent content and object mutation, i.e., you can't really make the individual characters const, and making the entire object const would prevent it from being assigned to.
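A minimal sketch of that suggestion, under some assumptions: the factory function make_a is hypothetical, the struct is reduced to the one member, and the conversion uses the standard std::mbstowcs in the current C locale instead of the mbstowcs_s call from the question. Because the std::wstring member owns its characters, nothing is left dangling once the function returns.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

struct A
{
    std::wstring name;   // owns its characters; no manual delete[] needed
};

// Hypothetical helper: build an A from a narrow C string.
inline A make_a(const char* narrow)
{
    A a;
    // A multibyte string never yields more wide characters than it has bytes,
    // so strlen() is a safe upper bound for the buffer size.
    a.name.resize(std::strlen(narrow));
    if (!a.name.empty()) {
        std::size_t n = std::mbstowcs(&a.name[0], narrow, a.name.size());
        // On a conversion error fall back to an empty name; otherwise shrink to fit.
        a.name.resize(n == static_cast<std::size_t>(-1) ? 0 : n);
    }
    return a;
}
```

If a const wchar_t* is still needed for an API call, a.name.c_str() provides one that stays valid as long as the string is neither modified nor destroyed.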
You can do this while using a std::wstring without relying on the additional temporary conversion buffer allocation and destruction. Not tremendously important unless you're overly concerned about heap fragmentation or on a limited system (aka Windows Phone). It just takes a little setup on the front side. Let the standard library manage the memory for you (with a little nudge).
class A
{
...
std::wstring a;
};
// Convert the string (I'm assuming it is UTF8) to wide char
int wlen = MultiByteToWideChar(CP_UTF8, 0, name, -1, NULL, 0);
if (wlen > 0)
{
// reserve space. std::wstring gives us the terminator slot
// for free, so don't include that. MB2WC above returns the
// length *including* the terminator.
a.resize(wlen-1);
MultiByteToWideChar(CP_UTF8, 0, name, -1, &a[0], wlen);
}
else
{ // no conversion available/possible.
a.clear();
}
On a complete side-note, you can build TinyXML to use the standard library and std::string rather than char *, which doesn't really help you much here, but may save you a ton of future strlen() calls later on.
As you correctly mentioned, a.name is just a pointer and does not own any allocated string storage. You must manage the storage manually, using new or a static/scoped array.
To get rid of these boring things just use one of available string classes: CStringW from ATL (easy to use but MS-specific) or std::wstring from STL (C++ standard, but not so easy to convert from char*):
#include <atlstr.h>
// Conversion ANSI -> Wide is automatic
const CStringW name(pElement->Attribute("name"));
Unfortunately, std::wstring usage with char* is not so easy.
See the conversion function here: How to convert std::string to LPCWSTR in C++ (Unicode)
I am using Visual Studio C++ and want to convert a CString to bytes. I have written this code, but it gives me an error on the second line saying that "data" is undefined.
CString data = _T( "OK");
LPBYTE pByte = new BYTE[data.GetLength() + 1];
memcpy(pByte, (VOID*)LPCTSTR(data), data.GetLength());
Furthermore, I need to convert the LPBYTE to a const char* for the strcmp function. I have written the code below, but I can't find the issue with it.
const LPBYTE lpBuffer;
LPBYTE lpData = lpBuffer;
CString rcvValue(LPCSTR(lpBuffer));
const CHAR* cstr = (LPCSTR)rcvValue;
if (strcmp (cstr,("ABC")) == 0)
{
////
}
The CString type is a template specialization of CStringT, depending on the character set it uses (CStringA for ANSI, CStringW for Unicode). While you ensure to use a matching encoding when constructing from a string literal by using the _T macro, you fail to account for the different size requirements when copying the controlled sequence to the buffer.
The following code fixes the first part:
CString data = _T("OK");
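size_t size_in_bytes = (data.GetLength() + 1) * sizeof(CString::XCHAR);
std::vector<BYTE> buffer(size_in_bytes);
// The character type depends on the build, so view the controlled sequence as raw bytes
unsigned char const* first = reinterpret_cast<unsigned char const*>(data.GetString());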
unsigned char const* last = first + size_in_bytes;
std::copy(first, last, buffer.begin());
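The size calculation is the part that matters, and it can be illustrated without MFC. The sketch below is an analogy using standard strings, not the CString API. The function name to_bytes is hypothetical. It copies the characters plus the terminator, scaled by the size of the character type.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy a string's characters, including the
// terminating null, into a byte buffer of the correct size.
template <typename CharT>
std::vector<unsigned char> to_bytes(const std::basic_string<CharT>& s)
{
    // Character count (plus terminator) times bytes per character.
    const std::size_t size_in_bytes = (s.size() + 1) * sizeof(CharT);
    std::vector<unsigned char> buffer(size_in_bytes);
    std::memcpy(buffer.data(), s.c_str(), size_in_bytes);
    return buffer;
}
```

For "OK" this gives 3 bytes with a narrow string and 3 * sizeof(wchar_t) bytes with a wide one, which is the difference the original memcpy call did not account for.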
The second question is really asking to solve a solved problem. The CStringT type already provides a CStringT::Compare member, that can be used:
const LPBYTE lpBuffer;
CString rcvValue(reinterpret_cast<char const*>(lpBuffer)); // LPBYTE is unsigned char*, so static_cast would not compile
if (rcvValue.Compare(_T("ABC")) == 0)
{
////
}
General advice: Always prefer using the concrete CStringT specialization matching your character encoding, i.e. CStringA or CStringW. The code will be much easier to read and reason about, and when you run into problems you need help with, you can post a question at Stack Overflow, without having to explain, what compiler settings you are using.
Make sure you include atlstr.h to provide the definition of CString, as below:
#include "stdafx.h"
#include <Windows.h>
#include <atlstr.h>
int _tmain(int argc, _TCHAR* argv[])
{
CString data = _T( "OK");
LPBYTE pByte = new BYTE[data.GetLength() + 1];
memcpy(pByte, (VOID*)LPCTSTR(data), data.GetLength());
return 0;
}
I'm fairly certain Jay is correct for your first question. You need to include the right header.
For your second question, why would you expect that code to work? Let's walk through what the code you've written actually does.
Create a char pointer (char *) without initializing it. This leaves lpData/lpBuffer pointing to a random location in memory.
Create a CString and initialize it with this random pointer.
Extract the buffer from the CString and compare it to a string literal.
Keeping in mind that the CString contains random garbage, what exactly do you expect this code to do? (Other than crash horribly? =) )
I also want to point out that you need to be more consistent in your approach to strings. Do you plan to support both char and wchar_t based strings as your use of TCHAR in the first sections suggests? Do you want to work with C-Style strings or do you want to use objects like CString? If you want to work with CString's, just use the Compare function that CString provides. Don't bother with strcmp.
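If the C-style route is kept, the comparison itself is simple once the buffer actually points at valid, null-terminated data. A minimal sketch, with a hypothetical helper name buffer_equals and plain unsigned char standing in for BYTE:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: compare a null-terminated byte buffer with a C string.
// The buffer must point at initialized, null-terminated data.
inline bool buffer_equals(const unsigned char* buffer, const char* expected)
{
    if (buffer == nullptr)
        return false;                    // nothing to compare against
    return std::strcmp(reinterpret_cast<const char*>(buffer), expected) == 0;
}
```

The key difference from the code in the question is that the caller supplies a buffer that has been filled in before the comparison runs.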
You probably didn't include the crucial header
#include <afx.h>
int main()
{
CString data = _T( "OK");
LPBYTE pByte = new BYTE[data.GetLength() + 1];
memcpy(pByte, (VOID*)LPCTSTR(data), data.GetLength());
return 0;
}
This code works fine.
You should rather use
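CString ss = _T("123ABC");
BYTE* bp = (BYTE*)ss.GetBuffer(ss.GetLength());
BYTE expected[16] = { 0 };
// Copy only the bytes the string actually holds, capped at the destination size
size_t bytes = (ss.GetLength() + 1) * sizeof(TCHAR);
CopyMemory(expected, bp, bytes < sizeof(expected) ? bytes : sizeof(expected));
ss.ReleaseBuffer();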
Just using '=' won't work.