I haven't used wide chars before. Here's the code from someone else:
char moduleFileName[512];
int size = ::GetModuleFileName(NULL,moduleFileName,sizeof(moduleFileName));
char c_drive[256];
char c_dir[256];
_splitpath_s(moduleFileName,c_drive,sizeof(c_drive),c_dir,sizeof(c_dir),NULL,0,NULL,0);
root = c_drive;
root.append(c_dir);
wchar_t moduleFileNameW[512];
int sizeW = ::GetModuleFileNameW(NULL,moduleFileNameW,sizeof(moduleFileNameW));
wchar_t w_drive[256];
wchar_t w_dir[256];
_wsplitpath_s(moduleFileNameW,w_drive,sizeof(w_drive),w_dir,sizeof(w_dir),NULL,0,NULL,0);
wroot = w_drive;
wroot.append(w_dir);
SEVEN_ZIP_EXE = wroot;
SEVEN_ZIP_EXE += L"\\7z.exe";
The point is to set a variable to where the 7z.exe file is. When I debug it on my Windows 7 Prof. 64 bit system I end up what look to me like invalid chars for wroot, e.g.
쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌쳌
Blockquote
Change your sizeof's to _countof's and that will fix your strange data problem.
You don't need both sets of functions. Use whichever set is appropriate for the application.
If you need SEVEN_ZIP_EXE to be a wstring, then you can eliminate the char stuff.
The problem is sizeof, and if you really want to use this sizeof(w_dir)/sizeof(wchar_t)
I gather what you are trying to do is get the directory containing your executable.
Rather than using the clumsy splitpath, I suggest the following:-
TCHAR szBuff[FILENAME_MAX];
TCHAR *ShortName;
GetFullPathName(moduleFileName, FILENAME_MAX, szBuff, &ShortName);
*(ShortName-1) = '\0'; // remove exe from path
szBuff will contain the directory, and ShortName points to the name.
The above code uses TCHAR and #define UNICODE, but you could change the functions names to the wchar names.
Related
I am trying to convert a std::string to a TCHAR* for use in CreateFile(). The code i have compiles, and works, but Visual Studio 2013 comes up with a compiler warning:
warning C4996: 'std::_Copy_impl': Function call with parameters that may be unsafe - this call relies on the caller to check that the passed values are correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See documentation on how to use Visual C++ 'Checked Iterators'
I understand why i get the warning, as in my code i use std::copy, but I don't want to define D_SCL_SECURE_NO_WARNINGS if at all possible, as they have a point: std::copy is unsafe/unsecure. As a result, I'd like to find a way that doesn't throw this warning.
The code that produces the warning:
std::string filename = fileList->getFullPath(index);
TCHAR *t_filename = new TCHAR[filename.size() + 1];
t_filename[filename.size()] = 0;
std::copy(filename.begin(), filename.end(), t_filename);
audioreader.setFile(t_filename);
audioreader.setfile() calls CreateFile() internally, which is why i need to convert the string.
fileList and audioreader are instances of classes i wrote myself, but I'd rather not change the core implementation of either if at all possible, as it would mean I'd need to change a lot of implementation in other areas of my program, where this conversion only happens in that piece of code. The method I used to convert there was found in a solution i found at http://www.cplusplus.com/forum/general/12245/#msg58523
I've seen something similar in another question (Converting string to tchar in VC++) but i can't quite fathom how to adapt the answer to work with mine as the size of the string isn't constant. All other ways I've seen involve a straight (TCHAR *) cast (or something equally unsafe), which as far as i know about the way TCHAR and other windows string types are defined, is relatively risky as TCHAR could be single byte or multibyte characters depending on UNICODE definition.
Does anyone know a safe, reliable way to convert a std::string to a TCHAR* for use in functions like CreateFile()?
EDIT to address questions in the comments and answers:
Regarding UNICODE being defined or not: The project in VS2013 is a win32 console application, with #undef UNICODE at the top of the .cpp file containing main() - what is the difference between UNICODE and _UNICODE? as i assume the underscore in what Amadeus was asking is significant.
Not directly related to the question but may add perspective: This program is not going to be used outside the UK, so ANSI vs UNICODE does not matter for this. This is part of a personal project to create an audio server and client. As a result you may see some bits referencing network communication. The aim of this program is to get me using Xaudio and winsock. The conversion issue purely deals with the loading of the file on the server-side so it can open it and start reading chunks to transmit. I'm testing with .wav files found in c:/windows/media
Filename encoding: I read the filenames in at runtime by using FindFirstFileA() and FindNextFileA(). The names are retrieved by looking at cFilename in a WIN32_FIND_DATAA structure. They are stored in a vector<string> (wrapped in a unique_ptr if that matters) but that could be changed. I assume this is what Dan Korn means.
More info about the my classes and functions:
The following are spread between AudioReader.h, Audioreader.cpp, FileList.h, FileList.cpp and ClientSession.h. The fragment above is in ClientSession.cpp. Note that in most of my files i declare using namespace std;
shared_ptr<FileList> fileList; //ClientSession.h
AudioReader audioreader; //ClientSession.h
string _storedpath; //FileList.h
unique_ptr<vector<string>> _filenames; //FileList.h
//FileList.cpp
string FileList::getFullPath(int i)
{
string ret = "";
unique_lock<mutex> listLock(listmtx);
if (static_cast<size_t>(i) < _count)
{
ret = _storedpath + _filenames->at(i);
}
else
{
//rather than go out of bounds, return the last element, as returning an error over the network is difficult at present
ret = _storedpath + _filenames->at(_count - 1);
}
return ret;
}
unique_ptr<AudioReader_Impl> audioReaderImpl; //AudioReader.h
//AudioReader.cpp
HRESULT AudioReader::setFile(TCHAR * fileName)
{
return audioReaderImpl->setFile(fileName);
}
HANDLE AudioReader_Impl::fileHandle; //AudioReader.cpp
//AudioReader.cpp
HRESULT AudioReader_Impl::setFile(TCHAR * fileName)
{
fileHandle = CreateFile(fileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
if (fileHandle == INVALID_HANDLE_VALUE)
{
return HRESULT_FROM_WIN32(GetLastError());
}
if (SetFilePointer(fileHandle, 0, NULL, FILE_BEGIN) == INVALID_SET_FILE_POINTER)
{
return HRESULT_FROM_WIN32(GetLastError());
}
return S_OK;
}
If you do not need to support the string containing UTF-8 (or another multi-byte encoding) then simply use the ANSI version of Windows API:
handle = CreateFileA( filename.c_str(), .......)
You might need to rejig your code for this as you have the CreateFile buried in a function that expects TCHAR. That's not advised these days; it's a pain to splatter T versions of everything all over your code and it has flow-on effects (such as std::tstring that someone suggested - ugh!)
There hasn't been any need to support dual compilation from the same source code since about 1998. Windows API has to support both versions for backward compatibility but your own code does not have to.
If you do want to support the string containing UTF-8 (and this is a better idea than using UTF-16 everywhere) then you will need to convert it to a UTF-16 string in order to call the Windows API.
The usual way to do this is via the Windows API function MultiByteToWideChar which is a bit awkward to use correctly, but you could wrap it up in a function:
std::wstring make_wstring( std::string const &s );
that invokes MultiByteToWideChar to return a UTF-16 string that you can then pass to WinAPI functions by using its .c_str() function.
See this codereview thread for a possible implementation of such a function (although note discussion in the answers)
The root of your problem is that you are mixing TCHARs and non-TCHARs. Even if you get it to work on your machine, unless you do it precisely right, it will fail when non-ASCII characters are used in the path.
If you can use std::tstring instead of regular string, then you won't have to worry about format conversions or codepage versus Unicode issues.
If not, you can use conversion functions like MultiByteToWideChar but make sure you understand the encoding used in the source string or it will just make things worse.
Try this instead:
std::string filename = fileList->getFullPath(index);
#ifndef UNICODE
audioreader.setFile(filename.c_str());
#else
std::wstring w_filename;
int len = MultiByteToWideChar(CP_ACP, 0, filename.c_str(), filename.length(), NULL, 0);
if (len > 0)
{
w_filename.resize(len);
MultiByteToWideChar(CP_ACP, 0, filename.c_str(), filename.length(), &w_filename[0], len);
}
audioreader.setFile(w_filename.c_str());
#endif
Alternatively:
std::string filename = fileList->getFullPath(index);
#ifndef UNICODE
audioreader.setFile(filename.c_str());
#else
std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> conv;
std::wstring w_filename = conv.from_bytes(filename);
audioreader.setFile(w_filename.c_str());
#endif
Short version I am using unicode. I am attempting to use a std::string, to a function that requires a const WCHAR string; DrawString(const WCHAR, ...
I compile with GCC. Everything is unicode, I have specified.
I have been trying to convert a string, into a wchar_t*. The purpose is so that I can output using a GDI+ function, its parameters require it so.
Here is how I have outputted string literals, no problems, debugs fine, works fine.
http://msdn.microsoft.com/en-us/library/ms535991%28v=vs.85%29.aspx for reference why:
// works fine
wchar_t* wcBuff;
wcBuff = (wchar_t*)L"Some text here.\0";
AddString(wcBuff, wcslen(wcBuff), &gFontFamilyInfo, FontStyleBold, 20, ptOrg_Controls, &strFormat_Info);
Now this is what I have been trying, all day, and a side note: my conversion function works fine, it is not an issue, nor creating one.
// problems
string s = "Level " + convert::intToString(6) + "\0";
// try 1 - Segfault
wchar_t* wcBuff = new wchar_t[s.length() + 1];
copy(s.begin(), s.end(), wcBuff);
// random tries, compiles, but access violations (my conversion function here has worked other places, do not know for sure here.
wchar_t* wcBuff;
wstring wstr = convert::stringToWideChar(s);
wstring strvalue = convert::stringToWideChar(s);
wcBuff = (wchar_t*)strvalue.c_str();
wcBuff = (wchar_t*)wstr.c_str();
wstring foo;
foo.assign(s.begin(), s.end());
wcBuff = (wchar_t*)foo.c_str();
Everything compiles, but then presents problems. Some runtime errors as soon as it reaches that point. Others access violations and segfaults. Some compiles and debugs no problem, but the strings output constantly changes with random characters.
Any ideas?
(this is not really an answer, but it's too big for a comment)
Try 1: you didn't null-terminate the string
Try 2: can't comment without seeing the conversion function. Remove the casts.
Try 3: Remove the casts, should be OK.
In all cases use wchar_t const *wcBuff. If "Try 3" fails then it means you have a bug somewhere else in your code, that is showing up here. Try to produce a MCVE. You should be able to get it down to about 10-20 lines.
Even if you manage to write the correct code for what you're intending, this is a fairly naive conversion as it doesn't handle characters outside the 0-127 range properly. You need to think about whether that is what you want, or whether you want to do a UTF-8 conversion, etc.
In Windows you can use MultiByteToWideChar.
#include <string>
int main() {
// Can use convenient wstring
std::wstring wstr = L"My wide string";
// When you need a whar_t* just do this
const wchar_t* s = wstr.c_str();
// unicode form of strcpy
wchar_t buf[100] = {0};
wcscpy (buf,s);
// And if you want to convert from string to wstring
std::string thin = "I only take up one byte per character!";
std::wstring wide(thin.begin(), thin.end());
return 0;
}
I first get my data into a wstring. Like this:
(Converting from string):
std::string sString = "This is my string text";
std::wstring str1(sString.begin(), sString.end());
(Converting from int):
wstring str1 = std::to_wstring(BirthDate);
Then, I use it in GDI+ Command like this:
graphics.DrawString(str1.c_str(), -1,
&font, PointF(10, 5), &st);
First thing first. GDI+ is a C++ library. It uses Microsoft C++ ABI. Microsoft C++ ABI is wildly incompatible with gcc so you might just forget about using it. You can try to use WinAPI or any other library that uses C calling conventions.
Now for the wstring question. wchar_t is 32 bits wide in gcc, but Windows APIs require it to be 16 bits wide. You cannot use any native Windows call that requires wchar_t.
You can use -fshort-wchar command line option in gcc, that would make wchar_t 16 bits wide and you will regain compatibility with Windows APIs, but lose compatibility with libc, so no library functions that act on wchar_t for you. std::wstring will probably work as it's header-only, but wprintf or wscpy or all other compiled stuff won't.
None of this is detected at compile time, as the only things gcc sees are header files. It cannot tell whether corresponding libraries are compiled with 16-bit wchar_t or 32-bit wchar_t.
You can use uint16_t when you need to pass an array of wchar_t to a Windows function. If you can use C++11, it has char16_t that you can use too. Here's an example that should work with Basic Multilingual Plane characters:
std::wstring myLittleNiceWstring;
...
std::vector<uint16_t> myUglyCompatibilityString;
std::copy(myLittleNiceWstring.begin(),
myLittleNiceWstring.end(),
std::back_inserter(myUglyCompatibilityString));
myUglyCompatibilityString.push_back(0);
UglyWindowsAPI(static_cast<WCHAR*>(myUglyCompatibilityString.data());
If you have non-BMP characters, you need to convert UTF32 to UTF16 rather than just copy characters with std::copy. You can use libiconv for that or write a conversion routine yourself (it's rather simple) or just boorow some code from the internet.
It is my opinion that Windows-centric development with GCC is rather difficult because of this and other issues. You can use gcc as long as you stick to POSIX-ish APIs.
I hope my question is not to localized... But I think some other will stumble about this or a similar problem.
I want to create a vector which contains all available ports at my system (this works in a console app). After that I want to copy the vectorelements in a wxString-Array to get them in a wxComboBox.
So, in my particular case I get two errors:
the variablename of the vector is not known in wxWidgets
by copying, the wxString will cast my string into a wchar_t (I know, wchar_t and wxString are similar...)
I will add some of my Code, so you will have a better sight about the problem:
first problem
std::vector<std::string> v_ports;
v_ports.push_back = "Com1";
v_ports.push_back = "Com4";
--> error: 'v_ports' does not name a type
(hint: that is a example, in the hole programm I will use a function to get the strings)
second problem
wxString sect_opt[v_ports.size()];
for(int i = 0; i < v_ports.size(); i++)
sect_opt[i] = _T(v_ports[i]);
--> error: 'Lv_ports' was not declared in this scope
I'm using:
IDE: CodeLite 5.1; wxW 2.9.4; #Win8.1
First problem
Instead of using:
v_ports.push_back = "Com1";
v_ports.push_back = "Com4";
you should use:
v_ports.push_back("Com1");
v_ports.push_back("Com4");
because std::vector<T>::push_back is a function.
Second problem
The _T macro is supposed to be used on literals:
Use the _T macro to conditionally code literal strings to be portable to Unicode.
It cannot be used in expressions like _T(v_ports[i]).
To convert a string to unicode please see:
Converting unicode strings and vice-versa (Philipp's answer)
How well is Unicode supported in C++11?
I am currently working on an application in C++, that ties into Lua, that ties into Flash (in that order). My goal at the moment is getting wchar_ts from C++ into Flash, via Lua. I would love any insights as to how I can accomplish this!
If any other information is required, please ask and I'll do my best to provide it
What I have tried
It's my understanding that Lua is not a fan of Unicode, but it should still be able to receive the string of bytes from my C++ application. I imagine there must be a way to then pass those bytes over to Flash to then render out my intended Unicode. So what I've done so far:
C++:
//an example wchar_t*
const wchar_t *text = L"Test!";
//this function pushes a char* to my Lua code
lua.PushString((char*)text); //directly casting text to a char*... D:
Lua:
theString = FunctionThatGetsWCharFromCpp();
flash.ShowString(theString);
Flash:
function ShowString(theString:String)
{
myTextField.text = theString;
}
Now the outcome here is that myTextField only shows "T". This made sense to me. The cast from wchar_t to char would end up padding out the chars with some zeros, especially since "T" doesn't really utilize both bytes of a wchar_t. A quick look at the documentation yields:
lua_pushstring
The string cannot contain embedded zeros; it is assumed to end at the first zero.
So I ran a little test:
C++:
//prefixing with a Japanese character
//which will use both bytes of the wchar_t
const wchar_t *text = L"たTest!";
The Flash textbox now reads: "_0T", 3 characters. Makes total sense, the 2 bytes of the Japanese character + T, then termination.
I understand what is going on, but I am still completely unsure of how to tackle this problem. And I'm really unsure of what to search for. Is there a specific Lua function I can use to pass a wad of bytes over to Lua from C++ (I've read somewhere that lua_pushlstring is often used for this, but that also terminates at first zero)? Is there a Flash datatype that will accept these bytes, then I'll need to do some sort of conversion to get them into a readable, multibyte string? or is this just really not possible?
Note:
I'm not too familiar with Unicode and code pages and whatnot, so I'm not too sure if there'll also be a step where I'll need to specify the correct encoding in Flash so that I can get the correct output - but I'm happy to cross that bridge when I get there, but if anyone has any insight here too, that would be great!
I don't know if this will work, but I'd recommend trying to use UTF-8. A string encoded in UTF-8 doesn't have any embedded zeros in it, so Lua should be able handle it, and Flash ought to also be able to handle it, depending on how exactly the languages interface.
Here's one way to convert a wide-character string to UTF-8 using setlocale(3) wcstombs(3):
// Error checking omitted for expository purposes
// Call this once at program startup. If you'd rather not change the locale,
// you can instead write your own conversion routine (but beware of UTF-16
// surrogate pairs if you do)
setlocale(LC_ALL, "en_US.UTF-8");
// Do this for each string you want to convert
const wchar_t *wideString = L"たTest!";
size_t len = wcslen(wideString);
size_t maxUtf8len = 4 * len + 1; // Each wchar_t encodes to a max of 4 bytes
char *utf8String = new char[maxUtf8len];
wcstombs(utf8String, wideString, maxUtf8len);
...
// Do stuff with utf8string
...
delete [] utf8String;
If you're on Windows, you can instead use the WideCharToMultiByte function with the CP_UTF8 code page to do the conversion, since I don't believe that the Visual Studio C runtime supports UTF-8 locales:
// Error checking omitted for expository purposes
const wchar_t *wideString = L"たTest!";
size_t len = wcslen(wideString);
size_t maxUtf8len = 4 * len + 1; // Each wchar_t encodes to a max of 4 bytes
char *utf8String = new char[maxUtf8len];
WideCharToMultiByte(CP_UTF8, 0, wideString, len + 1, utf8String, maxUtf8len, NULL, NULL);
...
// Do stuff with utf8string
...
delete [] utf8String;
So the GetWindowText is declared on MSDN as follows:
int GetWindowText(
HWND hWnd,
LPTSTR lpString,
int nMaxCount
);
However for the code to work we have to declare the second parameter as
TCHAR[255] WTitle;
and then call the function GetWindowText(hWnd,Wtitle,255);
The LPTSTR is a pointer to an array of tchar, so declaring LPTSTR is similar to declaring TCHAR[]? It doesn't work this way though.
When using TCHAR[] the program returns valid GetWindowText result (it is an integer equal to the number of symbols in the title). The question is : how can I get the exact title out of TCHAR[] ? Code like
TCHAR[255] WTitle;
cout<< WTitle;
or
cout<< *Wtitle;
returns numbers. How can I compare this with a given string?
TCHAR[4] Test= __T("TEST")
if (WTitle == Test) do smth
doesn't work also.
Wow, let's see where to start from.
First off, the declaration of WTitle needs to look like this:
TCHAR WTitle[255];
Next, if cout is not working write, it's because you are in Unicode mode so you need to do this:
wcout << WTitle;
Or to fit better with the whole tchar framework, you can add this (actually, I'm surprised that this is not already part of tchar.h):
#ifdef _UNICODE
#define tcout wcout
#else
#define tcout cout
#endif
and then use:
tcout << WTitle;
OK, a few definitions first.
The 'T' types are definitions that will evaluate to either CHAR (single byte) or WCHAR (double-byte), depending upon whether you've got the _UNICODE symbol defined in your build settings. The intent is to let you target both ANSI and UNICODE with a single set of source code.
The definitions:
TCHAR title[100];
TCHAR * pszTitle;
...are not equivalent. The first defines a buffer of 100 TCHARs. The second defines a pointer to one or more TCHARs, but doesn't point it at a buffer. Further,
sizeof(title) == 100 (or 200, if _UNICODE symbol is defined)
sizeof(pszTitle) == 4 (size of a pointer in Win32)
If you have a function like this:
void foo(LPCTSTR str);
...you can pass either of the above two variables in:
foo(title); // passes in the address of title[0]
foo(pszTitle); // passes in a copy of the pointer value
OK, so the reason you're getting numbers is probably because you do have UNICODE defined (so characters are wide), and you're using cout, which is specific to single-byte characters. Use wcout instead:
wcout << title;
Finally, these won't work:
TCHAR[4] Test == __T("TEST") ("==" is equality comparison, not assignment)
if (WTitle == Test) do smth (you're comparing pointers, use wcscmp or similar)
Short answer: Unless you're coding for Win98, use wchar_t instead of TCHAR and wcout instead of cout
Long version:
The TCHAR type exists to allow for code to be compiled in multiple string modes. For example supporting ASCII and Unicode. The TCHAR type will conditionally compile to the appropriate character type based no the setting.
All new Win systems are Unicode based. When ASCII strings are passed to OS functions, they are converted to unicode and the call the real function. So it's best to just use Unicode throughout your application.
Use _tcscmp or a variant (which takes in the number of characters to compare). http://msdn.microsoft.com/en-us/library/e0z9k731.aspx
Like:
if (_tcscmp(WTitle, Test) == 0) {
// They are equal! Do something.
}
In C, wchar_t is a typedef for some integer type (usually short int). In C++, it's required to be a separate type of its own -- but Microsoft's compilers default to using a typedef for it anyway. To make it a separate type of its own, you need to use the /Zc:wchar_t compiler switch. Offhand, I don't know if that will entirely fix the problem though -- I'm not sure if the library has real overloads for wchar_t as a native type to print those out as characters instead of short ints.
Generally speaking, however, I'd advise against messing with Microsoft's "T" variants anyway -- getting them right is a pain, and they were intended primarily to provide compatibility with 16-bit Windows anyway. Given that it's now been about 10 years since the last release in that line, it's probably safe to ignore it in new code unless you're really sure at least a few of your customers really use it.