Conversion from TCHAR to std::string, other way than using WideCharToMultiByte? - c++

I have to convert a TCHAR variable (a path retrieved with OpenBrowseDir) into a std::string.
I currently have this code, which works. It uses WideCharToMultiByte.
TCHAR path[MAX_PATH];
OpenBrowseDir(path);
/*conversion from TCHAR to std::string*/
char ch[MAX_PATH];
char DefChar = ' ';
WideCharToMultiByte(CP_ACP, 0, path, -1, ch, MAX_PATH, &DefChar, NULL);
string spath(ch);
In my project, TCHAR is wchar_t.
In OpenBrowseDir I open a browse dialog so the user can select a directory. I use a BROWSEINFO variable bi, and bi.pszDisplayName is a TCHAR string (LPTSTR).
Are there other, better solutions?

Related

How to show Cyrillic text in MFC Multi-Byte application?

I am new to C++ and MFC. The main problem is that I have an MFC project that needs to be translated into Russian. I see that the best option is to switch the project to Unicode, but I cannot, because it is a huge project and when I switch I get more than 4000 errors. Later we will move all the code to Unicode, but for now I just need to show Cyrillic on the buttons and in a CListBox.
So the main question is: how do I display Cyrillic text in a multi-byte build?
Thanks guys!
P.S.: Sorry, let me be more explicit about what I tried:
Using Russian locales:
setlocale(LC_ALL, "russian_russia.1251");
setlocale(LC_CTYPE, "rus");
But that didn't work. It shows question marks.
I also tried converting with the WideCharToMultiByte function, but it shows characters that seem to be encoded wrongly.
std::string utf8_encode(const std::wstring &wstr)
{
if (wstr.empty()) return std::string();
int size_needed = WideCharToMultiByte(CP_UTF8, 0, &wstr[0], (int)wstr.size(), NULL, 0, NULL, NULL);
std::string strTo(size_needed, 0);
WideCharToMultiByte(CP_UTF8, 0, &wstr[0], (int)wstr.size(), &strTo[0], size_needed, NULL, NULL);
return strTo;
}
const wchar_t* wch = L"Привет";
std::string ch = utf8_encode(wch);
m_wndOutputBuild.AddString(ch.c_str()); //OUTPUT Привет
P.S. 2: Now I call it like this:
setlocale(LC_ALL, "russian_russia.1251");
std::wstring wch = L"Привет";
std::string ch = encode_1251(wch);
m_wndOutputBuild.AddString(ch.c_str()); //OUTPUT Ïðèâåò
and Function:
std::string encode_1251(const std::wstring &wstr)
{
if (wstr.empty()) return std::string();
int size_needed = WideCharToMultiByte(1251, 0, &wstr[0], (int)wstr.size(), NULL, 0, NULL, NULL);
std::string strTo(size_needed, 0);
WideCharToMultiByte(1251, 0, &wstr[0], (int)wstr.size(), &strTo[0], size_needed, NULL, NULL);
return strTo;
}
I found here that Windows-1251 is selected in WideCharToMultiByte by passing 1251 as the code page.
In your utf8_encode function, when converting your Unicode UTF-16 string to a std::string, you passed CP_UTF8 to WideCharToMultiByte. Then you take the returned UTF-8 std::string, and pass it via .c_str() to the CListBox::AddString method.
However, if your application is in MBCS Cyrillic, you should convert from UTF-16 to your Cyrillic code page, instead of UTF-8, and pass the strings encoded in your Cyrillic code page to your MFC class methods, like CListBox::AddString.
In other words, you may want to substitute your utf8_encode function with a cyrillic_encode function, that takes UTF-16 text as input, and converts it to your Cyrillic code page:
// Convert from Unicode UTF-16 to Cyrillic code page
std::string cyrillic_encode(const std::wstring &utf16);
And then pass the returned string to the MFC class methods of interest, e.g.:
// From Unicode UTF-16 to Cyrillic code page
std::string cyrillic_text = cyrillic_encode(wch);
// Show Cyrillic-encoded "MBCS" text
m_wndOutputBuild.AddString(cyrillic_text.c_str());
Moreover, as correctly pointed out by @IInspectable in the comments, consider adding proper error checking to your conversion functions. In general, there can be UTF-16 text that cannot be encoded in a Cyrillic code page, because the characters that code page can represent are a proper subset of Unicode.
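To illustrate the point about error checking, here is a minimal, portable sketch that maps code points to Windows-1251 bytes by hand and reports failure when a character has no equivalent. It is not the WideCharToMultiByte-based function from the answer: the name encode_cp1251_sketch is hypothetical, and the table deliberately covers only ASCII, the basic Russian alphabet (U+0410 to U+044F) and Ё/ё, an assumption made to keep the example short.

```cpp
#include <string>

// Hypothetical helper: converts wide text to Windows-1251 bytes.
// Returns false (and leaves `out` partially filled) if any character
// is outside the small table handled below.
bool encode_cp1251_sketch(const std::wstring& text, std::string& out)
{
    out.clear();
    out.reserve(text.size());
    for (wchar_t wc : text) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));                    // ASCII maps to itself
        } else if (cp >= 0x0410 && cp <= 0x044F) {
            out.push_back(static_cast<char>(0xC0 + (cp - 0x0410)));  // U+0410..U+044F -> 0xC0..0xFF
        } else if (cp == 0x0401) {
            out.push_back(static_cast<char>(0xA8));                  // U+0401 (capital Yo)
        } else if (cp == 0x0451) {
            out.push_back(static_cast<char>(0xB8));                  // U+0451 (small yo)
        } else {
            return false;                                            // not representable in this table
        }
    }
    return true;
}
```

With this table, "Привет" becomes the bytes CF F0 E8 E2 E5 F2. Interpreted in the Western European code page 1252, those same bytes read as Ïðèâåò, which matches the output shown in the question's second update and suggests the bytes were correct but were displayed with a non-Cyrillic code page.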

Issues Converting wstring to TCHAR [duplicate]

This question already has answers here:
How to convert std::wstring to a TCHAR*?
(6 answers)
Closed 10 years ago.
I'm fairly new to programming, and I'm trying to write a program where a user inputs a date, that date is appended to a directory path, and then that directory is searched.
Here is what I'm working with below. I have a number of functions to do this. I've searched online and tried doing the conversion a few different ways, and I'm just not understanding it, so I left off with a static_cast (which I know is incorrect).
Maybe I'm just not doing the conversion right. Basically, this will pass the result to a function that uses the WINAPI handler. Whether I can get that to work is a completely different story. Thanks in advance for any help!
wstring fDate;
wstring fileDin;
const TCHAR* s = _T (fileDin);
std::wstring(fDate);
std::wstring(fileDin) =L"Z:\\software\\A\\AC\\" + fDate;
wcout<< fileDin;
cout <<endl;
//wstring fileDin(&arc[1]);
fileDin = static_cast<TCHAR>(arc[1]);
dir(2, arc);
TCHAR can be either wchar_t (when you build for Unicode) or char (when you build for multi-byte).
On the other hand, std::wstring always contains characters of type wchar_t, so it's better to use wchar_t* directly instead of TCHAR* (if possible).
The wchar_t* to std::wstring conversion can then be done with the std::wstring constructor:
const wchar_t* wcstr = L"my string";
std::wstring wstr(wcstr);
and std::wstring to const wchar_t* by simply calling the c_str() method:
const wchar_t* wcstr = wstr.c_str();
Sometimes you might also need to convert between "wide" strings (std::wstring holding wchar_t characters) and multi-byte strings (std::string holding chars). I usually use the following helpers:
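// multi-byte to wide char:
// NOTE: this direction uses CP_UTF8 while ws2s below uses CP_ACP; use the same
// code page in both if the strings must round-trip.
std::wstring s2ws(const std::string& str)
{
if (str.empty()) return std::wstring();
int size_needed = MultiByteToWideChar(CP_UTF8, 0, str.data(), (int)str.size(), NULL, 0);
std::wstring wstrTo(size_needed, 0);
MultiByteToWideChar(CP_UTF8, 0, str.data(), (int)str.size(), &wstrTo[0], size_needed);
return wstrTo;
}
// wide char to multi-byte:
std::string ws2s(const std::wstring& wstr)
{
if (wstr.empty()) return std::string();
// Pass length() rather than length() + 1 so the terminating null is not stored inside the std::string.
int size_needed = WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), (int)wstr.length(), NULL, 0, NULL, NULL);
std::string strTo(size_needed, 0);
WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), (int)wstr.length(), &strTo[0], size_needed, NULL, NULL);
return strTo;
}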

C++: convert LPTSTR to char array [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Convert lptstr to char*
I need to convert an LPTSTR p to CHAR ch[]. I am new to C++.
#include "stdafx.h"
#define _WIN32_IE 0x500
#include <shlobj.h>
#include <atlstr.h>
#include <iostream>
#include <Strsafe.h>
using namespace std;
int main(){
int a;
string res;
CString path;
char ch[MAX_PATH];
LPTSTR p = path.GetBuffer(MAX_PATH);
HRESULT hr = SHGetFolderPath(NULL,CSIDL_APPDATA, NULL, SHGFP_TYPE_CURRENT, p);
/* some operation with P and CH */
if(SUCCEEDED(hr))
{ /* succeeded */
cout << ch;
} /* succeeded */
else
{ /* failed */
cout << "error";
} /* failed */
cin >> a;
return 0;
}
Thanks in advance.
LPTSTR is a (non-const) TCHAR string. What it expands to depends on whether the build is Unicode: LPTSTR is char* in a non-Unicode build and wchar_t* in a Unicode build.
If you are using non-Unicode strings, LPTSTR is already a char*; otherwise, convert it with wcstombs:
size_t size = wcstombs(NULL, p, 0);
char* CharStr = new char[size + 1];
wcstombs( CharStr, p, size + 1 );
Also, this link can help:
Convert lptstr to char*
LPTSTR means TCHAR* (expanding these Win32 typedef acronyms can make them easier to understand). TCHAR expands to char in ANSI/MBCS builds, and to wchar_t in Unicode builds (which should be the default these days for better internationalization support).
This table summarizes the TCHAR expansions in ANSI/MBCS and Unicode builds:
Type    | ANSI/MBCS      | Unicode
--------+----------------+-----------------
TCHAR   | char           | wchar_t
LPTSTR  | char*          | wchar_t*
LPCTSTR | const char*    | const wchar_t*
So, in ANSI/MBCS builds, LPTSTR expands to char*; in Unicode builds it expands to wchar_t*.
char ch[MAX_PATH] is an array of chars in both ANSI and Unicode builds.
If you want to convert from a TCHAR string (LPTSTR) to an ANSI/MBCS string (char-based), you can use ATL string conversion helpers, e.g.:
LPTSTR psz; // TCHAR* pointing to something valid
CT2A ch(psz); // convert from TCHAR string to char string
(Note also that in your original code you should call CString::ReleaseBuffer(), which is the counterpart of CString::GetBuffer().)
Sample code follows:
// Include ATL headers to use string conversion helpers
#include <atlbase.h>
#include <atlconv.h>
...
LPTSTR psz = path.GetBuffer(MAX_PATH);
HRESULT hr = SHGetFolderPath(NULL,CSIDL_APPDATA, NULL, SHGFP_TYPE_CURRENT, psz);
path.ReleaseBuffer();
if (FAILED(hr))
{
// handle error
...
}
// Convert from TCHAR string (CString path) to char string.
CT2A ch(path);
// Use ch...
cout << static_cast<const char*>(ch) << endl;
Note also that the conversion from Unicode to ANSI can be lossy.
First, you defined char* ch[MAX_PATH] instead of char ch[MAX_PATH].
Regarding your question, LPTSTR (Long Pointer to TCHAR String) is equivalent to LPWSTR (which is wchar_t*) in a Unicode build, or to LPSTR (char*) otherwise. You can use this link for reference about conversion in each case.
EDIT: To cut to the chase, here's some code:
if (sizeof(TCHAR) == sizeof(char)) // String is non-unicode
strcpy(ch, (char*)(p));
else // String is unicode
wcstombs(ch, p, MAX_PATH);
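One caveat with a runtime sizeof check is that both branches are still compiled, so the wcstombs call will not compile when p is a char*. A possible alternative, sketched below under the assumption that a std::string result is acceptable, is to let overload resolution pick the right path. The name to_narrow is hypothetical, and the wide overload uses the current C locale via std::wcstombs (which MSVC may flag as deprecated).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical overloads: the compiler selects one based on the character
// type of the argument, so the path that does not apply is never used.
inline std::string to_narrow(const char* text)
{
    return std::string(text);               // already a char string: just copy it
}

inline std::string to_narrow(const wchar_t* text)
{
    // First call: ask how many bytes are needed (terminator not counted).
    const std::size_t needed = std::wcstombs(nullptr, text, 0);
    if (needed == static_cast<std::size_t>(-1)) {
        return std::string();               // a character is not representable in this locale
    }
    std::string result(needed, '\0');
    if (needed > 0) {
        // Second call: write exactly `needed` bytes into the string's storage.
        std::wcstombs(&result[0], text, needed);
    }
    return result;
}
```

With these overloads, a call such as to_narrow(p) compiles in both ANSI and Unicode builds, because p is either a char* or a wchar_t* and each has a matching overload.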
EDIT 2: On Windows I would recommend using TCHAR instead of char. It will save you some headaches.
EDIT 3: As a side note, if you want to prevent Visual Studio from flooding you with warnings about unsafe functions, you can add something like the following to the very beginning of your code:
#ifdef _MSC_VER
#define _CRT_SECURE_NO_WARNINGS
#endif

call avio_open function with non-english filename is invalid

I have been writing a Unicode-based program with libav, and I want to create a file through libav with the filename "中.mp4".
This filename is not English. When I call the function, it returns a positive integer (no failure),
but the file is created as "ѱ۰.mp4" instead of "中.mp4" (a wrong file name).
What is the matter?
char * szFilenameA = 0;
#ifdef _UNICODE
CSHArray<char> aFilenameBuffer;
aFilenameBuffer.Alloc(lstrlen(szFileName) * 2);
ZeroMemory(aFilenameBuffer, aFilenameBuffer.GetSize());
WideCharToMultiByte(CP_ACP, 0, szFileName, lstrlen(szFileName), aFilenameBuffer, aFilenameBuffer.GetSize(), NULL, NULL);
szFilenameA = aFilenameBuffer;
#else
szFilenameA = (TCHAR *)szFileName;
#endif
ZeroMemory(m_pOutputFormatCtx->filename,1024);
_snprintf(m_pOutputFormatCtx->filename, strlen(szFilenameA), "%s", szFilenameA);
avio_open(&m_pOutputFormatCtx->pb, szFilenameA, AVIO_FLAG_WRITE)
Finally!
It was a character-set problem.
Converting the ANSI filename to UTF-8 before passing it to avio_open makes it work fine.
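// Converts a null-terminated ANSI string to UTF-8.
// cbUTF8code is the size of the UTF8code buffer in bytes (a parameter added in this
// version; the original assumed the caller's buffer was large enough).
// Returns the number of bytes written, including the terminating null, or 0 on failure.
int ANSIToUTF8(const char *pszCode, char *UTF8code, int cbUTF8code)
{
WCHAR Unicode[100] = {0};
// ANSI -> UTF-16. The output size is a count of WCHARs, not bytes.
int nUnicodeSize = MultiByteToWideChar(CP_ACP, 0, pszCode, -1, Unicode, sizeof(Unicode) / sizeof(Unicode[0]));
if (nUnicodeSize == 0) return 0;
// UTF-16 -> UTF-8
return WideCharToMultiByte(CP_UTF8, 0, Unicode, nUnicodeSize, UTF8code, cbUTF8code, NULL, NULL);
}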
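To make the UTF-8 step less opaque, the encoding itself can be sketched by hand. This is only an illustration of what the conversion produces, not the Win32 call used in the answer; the function name append_utf8 is hypothetical, and the sketch assumes its input is a valid Unicode scalar value.

```cpp
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one Unicode code point.
// Assumes cp is a valid scalar value (not a surrogate, at most 0x10FFFF).
void append_utf8(std::string& out, char32_t cp)
{
    if (cp < 0x80) {                         // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {                 // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {               // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {                                 // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}
```

For the character in the question's filename, 中 (U+4E2D), this yields the three bytes E4 B8 AD, which differ from whatever bytes the system ANSI code page would produce for the same character.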

How can I copy a CHAR Variable to WCHAR Variable in C++

I want to convert a CHAR file to a Unicode file.
I read the file character by character into a CHAR variable, then I want to copy this CHAR variable to a WCHAR variable and write the WCHAR variable to a Unicode file.
Here is the code:
#include<Windows.h>
#include<tchar.h>
int _tmain(int argc, LPCTSTR argv[])
{
HANDLE hInfile, hOutfile;
CHAR f1;
WCHAR f2;
DWORD Rd, Wrt;
INT i;
CreateFile(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, NULL,NULL);
CreateFile(argv[2], GENERIC_WRITE, FILE_SHARE_READ, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL,NULL);
while ((ReadFile(hInfile, &f1, sizeof(CHAR), &Rd, NULL) && Rd>0))
{
**_tccpy(f2, f1);**
WriteFile(hOutfile, &f2, Rd, &Wrt, NULL);
}
CloseHandle(hInfile);
CloseHandle(hOutfile);
}
The line marked in bold is the problem: how can I copy a CHAR variable to a WCHAR variable?
Neither _tccpy nor strcpy can do this, because each of them takes either char or wchar_t arguments, not a mix of the two.
Microsoft Specific
Use wmain instead of main if you want to write portable code that adheres to the Unicode programming model.
wmain( int argc, wchar_t *argv[ ], wchar_t *envp[ ] )
wmain on MSDN: http://msdn.microsoft.com/en-us/library/bky3b5dh(VS.80).aspx
I have always found these string basics and conversions very useful when dealing with Unicode in C++.
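For the character-by-character copy asked about in the question, a minimal sketch follows. It assumes the input bytes are ASCII or Latin-1, where each byte value equals its Unicode code point; any other encoding would need a real conversion such as MultiByteToWideChar. The name widen_bytes is hypothetical.

```cpp
#include <string>

// Hypothetical helper: widens each input byte to one wchar_t.
// Assumes ASCII or Latin-1 input, where byte value == code point.
std::wstring widen_bytes(const std::string& bytes)
{
    std::wstring wide;
    wide.reserve(bytes.size());
    for (char c : bytes) {
        // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return wide;
}
```

For a single character the same double cast applies, e.g. f2 = static_cast<WCHAR>(static_cast<unsigned char>(f1)). Note that when writing the widened value, the byte count passed to WriteFile would presumably need to be sizeof(WCHAR) rather than the number of bytes read.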