How to pass std::string to CreateDirectory with Unicode set

I am having difficulty with the CreateDirectory function. In the following code I am getting a
"cannot convert argument 1 from 'const char *' to 'LPCWSTR'" compile error for the CreateDirectory call.
// make path to folder in program data
char szPath[MAX_PATH];
if ( ! SUCCEEDED( SHGetFolderPathA( NULL, CSIDL_COMMON_APPDATA, NULL, 0, szPath ) ) )
{
std::cout << "ERROR: Could not open server log - no common data folder " << std::endl;
exit(1);
}
std::string fname = szPath;
fname +="/Point";
CreateDirectory( fname.c_str(), NULL);
I am using Visual Studio 2015 and have "Character Set = Use Unicode Character Set".
In fileapi.h the following is defined:
#ifdef UNICODE
#define CreateDirectory CreateDirectoryW
#else
#define CreateDirectory CreateDirectoryA
#endif // !UNICODE
So I think the CreateDirectoryW function is being used.
What do I need to do to get this to compile properly?

You need to use std::wstring instead of std::string in order to use wide character strings.
int main()
{
// make path to folder in program data
wchar_t szPath[MAX_PATH];
if (!SUCCEEDED(SHGetFolderPathW(NULL, CSIDL_COMMON_APPDATA, NULL, 0, szPath)))
{
std::cout << "ERROR: Could not open server log - no common data folder " << std::endl;
exit(1);
}
std::wstring fname = szPath;
fname += L"/Point";
CreateDirectory(fname.c_str(), NULL);
}
If you do not want to use wide character strings, you need to explicitly call the narrow character versions of the Windows API functions, such as CreateDirectoryA instead of CreateDirectory.
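If the path already lives in a std::string, one further option on a newer toolchain is to let std::filesystem::path do the narrow-to-wide conversion. The sketch below assumes C++17, so it would not build as-is with the Visual Studio 2015 from the question, and the helper name ensure_directory is made up for illustration. On Windows the narrow string is interpreted in the native narrow encoding (normally the ANSI code page), so this is only safe for characters that code page can represent.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: creates the directory named by a narrow string and
// reports whether that directory exists afterwards.
bool ensure_directory(const std::string& narrow_path)
{
    // fs::path converts the narrow string to the platform's native form
    // (wide characters on Windows), so no A/W function has to be chosen.
    const fs::path dir(narrow_path);

    std::error_code ec;
    fs::create_directories(dir, ec);   // not an error if it already exists
    if (ec)
        return false;
    return fs::is_directory(dir, ec);
}
```

With the variable from the question this would be called as ensure_directory(fname).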

Use CreateDirectoryA.
That said, you'd be better off switching your application to Unicode, i.e. wide text.
The original code has some problems:
// make path to folder in program data
↑ This comment is misleading: the code is about finding the path, not creating it.
char szPath[MAX_PATH];
↑ This buffer is unnecessary. For this code you should instead declare the later variable std::string fname here, with a specified buffer size.
if ( ! SUCCEEDED( SHGetFolderPathA( NULL, CSIDL_COMMON_APPDATA, NULL, 0, szPath ) ) )
↑ !SUCCEEDED is a misleading rewrite of the idiomatic FAILED. And SHGetFolderPath is deprecated. Instead you should be using SHGetKnownFolderPath.
{
std::cout << "ERROR: Could not open server log - no common data folder " << std::endl;
exit(1);
}
↑ The console output makes this failure handling of little value in a GUI program. Anyway, instead of cout you should be using cerr or clog (they both map to the standard error stream by default). In the exit call you should either be using a standard value such as EXIT_FAILURE, or supply the HRESULT that you got (this is Windows convention, in particular for crashing programs), or for some function, the value that you get from GetLastError. Anyway exit is far too drastic. You should be either throwing an exception or returning an optional.
std::string fname = szPath;
fname +="/Point";
↑ Forward slashes are generally supported but still the Windows convention is backslash as item separator.
CreateDirectory( fname.c_str(), NULL);
↑ The only problem is that it doesn't compile with UNICODE defined before including windows.h. Use CreateDirectoryA. Or better, switch to Unicode.

Related

ShellExecute ends up in error C2065 don't know how to fix

I have a timer, after which a local html file should be executed, but I hit some kind of error:
int delay = 120;
delay *= CLOCKS_PER_SEC;
clock_t now = clock();
while (clock() - now < delay);
string strWebPage = "file:///D:/project/site/scam.html";
strWebPage = "file:///" + strWebPage;
ShellExecute(NULL, NULL, NULL, strWebPage, NULL, SW_SHOWNORMAL);
return 0;
E0413 no suitable conversion function from "std::string" to "LPCWSTR" exists
I'm new to C++, so it might be an obvious solution.
Could anyone point me to how I can fix it?

You have two problems.
But first, you should always take the time to read the documentation. For Win32 functions, you can get to a known function by typing something like “msdn ShellExecute” into your favorite search engine and clicking the “Lucky” button.
Problem One
ShellExecute() is a C function. It does not take std::string as argument. It needs a pointer to characters. Hence:
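std::string filename = "birds.html";
// ShellExecute returns an HINSTANCE; cast it to INT_PTR to test the result
INT_PTR ok = (INT_PTR)ShellExecute(
NULL, // no window
NULL, // use default operation
filename.c_str(), // file to open
NULL, // no args to executable files
NULL, // no start directory
SW_SHOWNORMAL );
if (ok <= 32) // values of 32 or less indicate an error
fooey();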
Notice that we pass a const char * to the function as the file to <default verb>.
Problem Two
From your image it would appear that you have your application declared as a Unicode application. In other words, somewhere there is a #define UNICODE.
This makes ShellExecute() expect a wide character string (const wchar_t *) as argument, not a narrow string (const char *).
You can still use a narrow string by simply specifying that you want the narrow version:
INT_PTR ok = (INT_PTR)ShellExecuteA(
...
I recommend you look at how you set up your project to figure out how you got things to think you were using wide strings instead of narrow strings.

C++ Windows can't add program to startup

I'm using this function to add my program to startup. But it doesn't work and I don't know why weird ascii characters and words are showing up in startup applications. What am I doing wrong?
Instead, this is what gets added to startup: U㫅萹㐀蠀渐晁Ɉ U㫆萺㝈耀 U㫆萺㝈耀 and C, which has no file location and no details.
HKEY NewVal;
char loggeduser[UNLEN + 1];
std::ostringstream fileinhome;
GetUserNameA(loggeduser, &len);
fileinhome << "C:\\Users\\" << loggeduser << "\\AppData\\Roaming\\snetwork\\snetwork.exe";
std::string fp = fileinhome.str();
const char* file = fp.c_str();
if (RegOpenKey(HKEY_CURRENT_USER, _T("Software\\Microsoft\\Windows\\CurrentVersion\\Run"), &NewVal) != ERROR_SUCCESS)
{
return;
}
if (RegSetValueEx(NewVal, _T("CLI-Social-Network"), 0, REG_SZ, (LPBYTE)file, sizeof(file)) != ERROR_SUCCESS)
{
return;
}
else {
// std::cout << "Program added to Startup.\n";
// Do nothing, Program was added to Startup
}
RegCloseKey(NewVal);

A possibility: you have UNICODE and/or _UNICODE defined, so RegSetValueEx is actually RegSetValueExW. That function reads the REG_SZ data from the buffer file as UTF-16. But file points to narrow (ANSI) characters, so pairs of narrow bytes are misread as single wide characters, leading to the strange output.
To fix, use std::wstring and W functions explicitly.
Unicode considerations aside, sizeof(file) is the size of a const char * pointer (4 or 8 bytes), not the length of the string. The last argument must be the size of the data in bytes, including the terminating null character.
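A minimal sketch of the wide-string approach follows. The helper names reg_sz_byte_count and add_to_startup are hypothetical, and the value name and command are passed in by the caller rather than taken from the question. Besides switching to the W functions, it passes the data size in bytes including the terminating null, since sizeof applied to a pointer only yields the pointer's own size. The Windows-only part is guarded so the size calculation can be tried anywhere.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: number of bytes to pass as cbData for a REG_SZ value,
// i.e. every character plus the terminating null, measured in bytes.
std::size_t reg_sz_byte_count(const std::wstring& value)
{
    return (value.size() + 1) * sizeof(wchar_t);
}

#ifdef _WIN32
// Hypothetical helper: stores `command` under the current user's Run key.
bool add_to_startup(const std::wstring& value_name, const std::wstring& command)
{
    HKEY key = NULL;
    if (RegOpenKeyExW(HKEY_CURRENT_USER,
                      L"Software\\Microsoft\\Windows\\CurrentVersion\\Run",
                      0, KEY_SET_VALUE, &key) != ERROR_SUCCESS)
        return false;

    const LONG rc = RegSetValueExW(
        key, value_name.c_str(), 0, REG_SZ,
        reinterpret_cast<const BYTE*>(command.c_str()),
        static_cast<DWORD>(reg_sz_byte_count(command)));

    RegCloseKey(key);   // the key is closed on success and on failure
    return rc == ERROR_SUCCESS;
}
#endif
```

Note that the code in the question returns early when RegSetValueEx fails and so never closes the key; the sketch closes it on every path.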

Renaming a file with an en dash in the name in C++

In the project I'm working on, I work with files and I check if they exist before proceeding. Renaming or even working with files featuring that 'en dash' in the file path seems impossible.
std::string _old = "D:\\Folder\\This – by ABC.txt";
std::rename(_old.c_str(), "New.txt");
Here the _old variable is interpreted as D:\Folder\This û by ABC.txt
I tried
setlocale(LC_ALL, "");
//and
setlocale(LC_ALL, "C");
//or
setlocale(LC_ALL, "en_US.UTF-8");
but none of them worked. What should be done?

It depends on the operating system. In Linux file names are simple byte arrays: forget about encoding and just rename the file.
But it seems you are using Windows, where a file name is actually a null-terminated string of 16-bit characters. In this case the best way is to use wstring instead of messing with encodings.
Don't try to write platform-independent code to solve platform-specific problems. Windows uses Unicode for file names so you have to write platform-specific code instead of using standard function rename.
Just write L"D:\\Folder\\This \u2013 by ABC.txt" and call _wrename.

The Windows ANSI Western encoding has the Unicode n-dash, U+2013, “–”, as code point 150 (decimal). When you output that to a console with active code page 437, the original IBM PC character set, or compatible, then it's interpreted as an “û”. So you have the right codepage 1252 character in your string literal, either because
you're using Visual C++, which defaults to the Windows ANSI codepage for encoding narrow string literals, or
you're using an old version of g++ that doesn't do the standard-mandated conversions and checking but just passes narrow character bytes directly through its machinery, and your source code is encoded as Windows ANSI Western (or compatible), or
something I didn't think of.
For either of the first two possibilities the rename call will work.
I tested that it does indeed work with Visual C++. I do not have an old version of g++ around, but I tested that it works with version 5.1. That is, I tested that the file is really renamed to New.txt.
// Source encoding: UTF-8
// Execution character set: Windows ANSI Western a.k.a. codepage 1252.
#include <stdio.h> // rename
#include <stdlib.h> // EXIT_SUCCESS, EXIT_FAILURE
#include <string> // std::string
using namespace std;
auto main()
-> int
{
string const a = ".\\This – by ABC.txt"; // Literal encoded as CP 1252.
return rename( a.c_str(), "New.txt" ) == 0? EXIT_SUCCESS : EXIT_FAILURE;
}
Example:
[C:\my\forums\so\265]
> dir /b *.txt
File Not Found
[C:\my\forums\so\265]
> g++ r.cpp -fexec-charset=cp1252
[C:\my\forums\so\265]
> type nul >"This – by ABC.txt"
[C:\my\forums\so\265]
> run a
Exit code 0
[C:\my\forums\so\265]
> dir /b *.txt
New.txt
[C:\my\forums\so\265]
> _
… where run is just a batch file that reports the exit code.
If your Windows ANSI codepage is not codepage 1252, then you need to use your particular Windows ANSI codepage.
You can check the Windows ANSI codepage via the GetACP API function, or e.g. via this command:
[C:\my\forums\so\265]
> wmic os get codeset /value | find "="
CodeSet=1252
[C:\my\forums\so\265]
> _
The code will work if that codepage supports the n-dash character.
This model of coding is based on having one version of the executable for each relevant main locale (including character encoding).
An alternative is to do everything in Unicode. This can be done portably via Boost file system, which will be adopted into the standard library in C++17. Or you can use the Windows API, or de facto standard extensions to the standard library in Windows, i.e. _wrename.
Example of using the experimental file system module with Visual C++ 2015:
// Source encoding: UTF-8
// Execution character set: irrelevant (everything's done in Unicode).
#include <stdlib.h> // EXIT_SUCCESS, EXIT_FAILURE
#include <filesystem> // In C++17 and later, or Visual C++ 2015 and later.
using namespace std::tr2::sys;
auto main()
-> int
{
path const old_path = L".\\This – by ABC.txt"; // Literal encoded as wide string.
path const new_path = L"New.txt";
try
{
rename( old_path, new_path );
return EXIT_SUCCESS;
}
catch( ... )
{}
return EXIT_FAILURE;
}
To do this properly for portable code you can use Boost, or you can create a wrapper header that uses whatever implementation is available.

It's really platform dependent; Unicode is a headache. It depends on which compiler you use. For older ones from MS (VS2010 or older), you would need to use the API described in MSDN. This test example creates a file with the name you have a problem with, then renames it:
// #define _UNICODE // might be defined in project
#include <string>
#include <tchar.h>
#include <windows.h>
using namespace std;
// Convert a wide Unicode string to an UTF8 string
std::string utf8_encode(const std::wstring &wstr)
{
if( wstr.empty() ) return std::string();
int size_needed = WideCharToMultiByte(CP_UTF8, 0, &wstr[0], (int)wstr.size(), NULL, 0, NULL, NULL);
std::string strTo( size_needed, 0 );
WideCharToMultiByte (CP_UTF8, 0, &wstr[0], (int)wstr.size(), &strTo[0], size_needed, NULL, NULL);
return strTo;
}
// Convert an UTF8 string to a wide Unicode String
std::wstring utf8_decode(const std::string &str)
{
if( str.empty() ) return std::wstring();
int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);
std::wstring wstrTo( size_needed, 0 );
MultiByteToWideChar (CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
return wstrTo;
}
int _tmain(int argc, _TCHAR* argv[] ) {
std::string pFileName = "C:\\This \xe2\x80\x93 by ABC.txt";
std::wstring pwsFileName = utf8_decode(pFileName);
// can use CreateFile id instead
HANDLE hf = CreateFileW( pwsFileName.c_str() ,
GENERIC_READ | GENERIC_WRITE,
0,
0,
CREATE_NEW,
FILE_ATTRIBUTE_NORMAL,
0);
CloseHandle(hf);
MoveFileW(utf8_decode("C:\\This \xe2\x80\x93 by ABC.txt").c_str(), utf8_decode("C:\\This \xe2\x80\x93 by ABC 2.txt").c_str());
}
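Those helpers do not process the terminating null. If you want the conversion to run up to and including the null terminator, pass -1 as the source length, and still query the required size first:
std::string utf8_encode(const std::wstring &wstr)
{
// -1 as the source length: the returned size includes the null terminator
int size_needed = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), -1, NULL, 0, NULL, NULL);
if( size_needed <= 0 ) return std::string();
std::string strTo( size_needed, 0 );
WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), -1, &strTo[0], size_needed, NULL, NULL);
strTo.resize( size_needed - 1 ); // drop the terminator that was written into the buffer
return strTo;
}
// Convert an UTF8 string to a wide Unicode String
std::wstring utf8_decode(const std::string &str)
{
// -1 as the source length: the returned size includes the null terminator
int size_needed = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, NULL, 0);
if( size_needed <= 0 ) return std::wstring();
std::wstring wstrTo( size_needed, 0 );
MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, &wstrTo[0], size_needed);
wstrTo.resize( size_needed - 1 ); // drop the terminator that was written into the buffer
return wstrTo;
}
The first call to WideCharToMultiByte, with 0 as the size of the target buffer, performs no conversion; it returns the number of bytes the target buffer needs (MultiByteToWideChar likewise returns the number of wide characters). A buffer sized from the source length is not enough, because one UTF-16 character can take up to three bytes in UTF-8. All this juggling explains why frameworks like Qt ended up with such convoluted code to support Unicode-based file systems. Actually, the most cost-effective way to avoid these bugs is to use such a framework.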
For VS2015:
std::string _old = u8"D:\\Folder\\This \xe2\x80\x93 by ABC.txt"s;
according to their docs. I can't check that one.
For MinGW:
std::string _old = u8"D:\\Folder\\This \xe2\x80\x93 by ABC.txt";
std::cout << _old.data();
The output contains the proper file name, but for the file API you still need to do the proper conversion.

Is Win32 error code 122 somewhat benign in this case?

The following line results in GetLastError() returning error code 122 (=ERROR_INSUFFICIENT_BUFFER)
CString str = CString("'") + _T("%s") + CString("'");
But this happens only under VS2005 and doesn't happen in VS2015. Still, I see no memory corruption or anything in VS2005, and the str variable does contain the correct value. Is that still an issue to be concerned about given the error code?
The reason this seems to happen is the concatenation of wide character and simple character strings, and the fix is to simply surround both remaining strings with _T("") so the code line would look like:
CString str = CString(_T("'")) + _T("%s") + CString(_T("'"));
But what does the error code 122 really mean in the original line, when only one string was Unicode? What has really gone wrong, or is it more like a warning in this case?

GetLastError() is only meaningful after some system call has returned an error. Since your code does not have any system call, GetLastError() can return anything.
Maybe the value you see is the last error from the last system call that failed. Or maybe it is some error that happened from inside the CString class, but it is handled there.
TL;DR; No error here.

In VS 2015 you can reproduce the error with CString("a") (if Unicode is set) or just CStringW("a")
#include <iostream>
#include <atlstr.h>
int main()
{
CStringW("a");
DWORD err = GetLastError();
std::cout << err << "\n"; //<= error 122, ERROR_INSUFFICIENT_BUFFER
return 0;
}
This happens because CString uses WinAPI MultiByteToWideChar to convert ANSI "a" to Unicode L"a". Debugging through the source code in "atlmfc\include\cstringt.h", we see that at some point it calls the following function:
static int __cdecl GetBaseTypeLength(_In_z_ LPCSTR pszSrc) throw()
{
// Returns required buffer size in wchar_ts
return ::MultiByteToWideChar( _AtlGetConversionACP(), 0, pszSrc, -1, NULL, 0 )-1;
}
For some reason there is a -1 at the end. I don't know why that's there, it might be necessary for other CString functions but in this case it ends up causing ERROR_INSUFFICIENT_BUFFER error in the next call to MultiByteToWideChar. The conversion can be roughly simplified to following:
int main()
{
int nDestLength = MultiByteToWideChar(CP_ACP, 0, "a", -1, NULL, 0) - 1;
wchar_t *pszDest = new wchar_t[32];
//ERROR_INSUFFICIENT_BUFFER occurs here because nDestLength is short by 1:
MultiByteToWideChar(CP_ACP, 0, "a", -1, pszDest, nDestLength);
DWORD err = GetLastError();
std::cout << err << "\n";
return 0;
}
nDestLength is too small because it doesn't account for null terminator. CString sorts this out later but the error remains. That's a good reason not to pay attention to GetLastError unless the function fails.
As you noted, this error can be avoided by using the _T macro, because CString would no longer need MultiByteToWideChar. Or better yet, use the L prefix, or CString::Format:
CString str = CString(L"'") + L"%s" + CString(L"'");

Write into a file names of folders containing in C:\Program Files

I have this task:
1. In current directory create file subMape.dat
2. Write into it all names of folders, that stored in C:\Program Files folder
3. Display on the screen data, that was written in subMape.dat
#include <iostream>
#include <windows.h>
using namespace std;
int main() {
WIN32_FIND_DATA findFileData;
DWORD bytesWritten = 0;
HANDLE f;
HANDLE c = CreateFileW(L"subMape.txt", GENERIC_READ | GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
//TCHAR lpBuffer[32];
DWORD nNumberOfBytesToRead = 32;
//DWORD lpNumberOfBytesRead;
DWORD lengthSum = 0;
if (c) {
cout << "CreateFile() succeeded!\n";
if(f = FindFirstFile(L"C:\\Program Files\\*", &findFileData)){
if(f != INVALID_HANDLE_VALUE) {
while (FindNextFile(f, &findFileData)){
lengthSum += bytesWritten;
WriteFile(c, findFileData.cFileName, (DWORD)wcslen(findFileData.cFileName), &bytesWritten, NULL);
}
}
FindClose(f);
}
else {
cout << "FindFirstFile() failed :(\n";
}
}
else {
cout << "CreateFile() failed :(\n";
}
cout << lengthSum << endl;
//SetFilePointer(c, lengthSum, NULL, FILE_BEGIN);
//ReadFile(c, lpBuffer, lengthSum, &lpNumberOfBytesRead, NULL);
//wprintf(lpBuffer);
CloseHandle(c);
return 0;
}
I'm using UNICODE. When it writes findFileData.cFileName, it writes a string whose characters are separated by spaces. For example, the folder name "New Folder" (strlen = 10) is written into the file as "N e w T o" (strlen = 10). What should I do?

Your text file viewer or editor just isn't smart enough to figure out that you've written a UTF-16 encoded text file. Most text editors need help, so write the BOM to the file:
cout << "CreateFile() succeeded!\n";
wchar_t bom = L'\xfeff';
WriteFile(c, &bom, sizeof(bom), &bytesWritten, NULL);
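To make the byte layout concrete, here is a small portable sketch of what ends up on disk when UTF-16 text is written little-endian with a BOM in front. The helper name utf16le_with_bom is made up, and it uses std::u16string instead of wchar_t so that it behaves the same on platforms where wchar_t is not 16 bits wide; neither choice comes from the answer above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: serialises UTF-16 code units as little-endian bytes,
// preceded by the byte-order mark U+FEFF (bytes FF FE in the file).
std::vector<std::uint8_t> utf16le_with_bom(const std::u16string& text)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve((text.size() + 1) * 2);

    auto put = [&bytes](char16_t unit) {
        bytes.push_back(static_cast<std::uint8_t>(unit & 0xFF));         // low byte first
        bytes.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF));  // then high byte
    };

    put(0xFEFF);                  // byte-order mark
    for (char16_t unit : text)
        put(unit);
    return bytes;
}
```

For the text "Ne" this yields FF FE 4E 00 65 00. The zero after each ASCII letter is what a viewer that assumes one byte per character displays as a gap.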

You need to use something like WideCharToMultiByte() to convert the UNICODE string to ANSI (or UTF8).
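As an illustration of what such a conversion does, here is a hand-rolled, portable UTF-16 to UTF-8 encoder. It is a sketch and not WideCharToMultiByte itself: on Windows you would more likely call that API with CP_UTF8. The function name utf16_to_utf8 and the choice to replace unpaired surrogates with U+FFFD are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encodes UTF-16 as UTF-8. Unpaired surrogates are
// replaced with U+FFFD (the replacement character).
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {                    // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;                                           // consumed the low surrogate
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {             // stray low surrogate
            cp = 0xFFFD;
        }

        if (cp < 0x80) {                                       // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {                               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {                             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

ASCII names such as "New Folder" come out byte-for-byte unchanged, which is why a UTF-8 file shows no gaps in a simple viewer.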

The reason you see "space" is that the program you are using to list the file treats it as one byte per character. When using Unicode in windows you will get two, and the second byte is a '\0'.
You need to choose how you want to encode the data in the file.
The easiest thing you can do is to use UTF-16LE, since this is the native encoding on Windows. Then you only need to prepend a byte order marker to the beginning of the file. This encoding has an advantage over UTF-8 since it is easy to distinguish from extended ASCII encodings due to the observed zero-bytes. Its drawback is that you need the BOM and it occupies more disk space uncompressed.
UTF-8 has the advantage of being more compact. It is also fully compatible with pure ASCII and favoured by the programming community.
If you do not need to use extended ASCII in any context, you should encode your data in UTF-8. If you do, use UTF-16LE.
Those who argue that a text that passes a UTF-8 validation is encoded in UTF-8 are right if the whole text is available, but wrong if it is not:
Consider an alphabetical list of Swedish names. If I only check the first part of the list and it is Latin-1 (ISO/IEC 8859-1), it will also pass the UTF-8 test.
Then at the end comes "Örjansson", which breaks down into mojibake. In fact 'Ö' will be an invalid UTF-8 bit sequence. On the other hand, since all letters used actually fit in one byte when using UTF-16LE, I can be fully confident that it is not UTF-8, and not Latin-1 either.
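The point about partial validation can be tried with a small structural UTF-8 check. This is a sketch under simplifying assumptions: the name looks_like_utf8 is made up, and the check only verifies lead bytes and continuation bytes, without rejecting overlong forms or encoded surrogates as a full validator would.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if every lead byte is followed by the
// number of continuation bytes (10xxxxxx) that its bit pattern announces.
bool looks_like_utf8(const std::string& bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;
        if (lead < 0x80)                extra = 0;   // ASCII
        else if ((lead & 0xE0) == 0xC0) extra = 1;   // 110xxxxx
        else if ((lead & 0xF0) == 0xE0) extra = 2;   // 1110xxxx
        else if ((lead & 0xF8) == 0xF0) extra = 3;   // 11110xxx
        else return false;                           // stray continuation or invalid lead

        if (bytes.size() - i - 1 < extra)
            return false;                            // sequence cut off at the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;                        // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

A Latin-1 list that starts with "Andersson" passes as long as only ASCII names have been read. Once the Latin-1 bytes for "Örjansson" arrive, the byte 0xD6 announces a two-byte sequence, the following 'r' is not a continuation byte, and the check fails.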

You should know that on Windows the "native" Unicode format is UTF-16, which is used by the W-style functions (such as CreateFileW). With that in mind, writing the file should give you valid UTF-16 text, but an editor may not recognize that. To make sure your program works, use a text editor where you can specify the encoding by hand (you know what it needs to be) in case it doesn't recognize it; Notepad++ is a good choice for this.
As others already mentioned, writing the BOM is very helpful for text editors and ensures that your file will be read correctly.
You can use WideCharToMultiByte to convert the UTF-16 into UTF-8 for even more compatibility.
And why did you call CreateFileW directly but not FindFirstFileW? Do you have UNICODE defined in your project? If you do, the compiler would resolve CreateFile into CreateFileW for you.
Also, here
WriteFile(c, findFileData.cFileName, (DWORD)wcslen(findFileData.cFileName), &bytesWritten, NULL);
wcslen gives the number of characters, which is not the same as the data size in bytes for non-ANSI text. It should be something like
wcslen(findFileData.cFileName)*sizeof(wchar_t)

When dealing with UTF-16 files, it is important to write a byte-order mark and to write the data with lengths in bytes not characters. wcslen returns the string length in characters but a character is two bytes when using wide strings. Here's a fixed version. It explicitly calls the wide version of the Win32 APIs so will work whether UNICODE/_UNICODE are defined or not.
#include <iostream>
#include <windows.h>
using namespace std;
int main()
{
WIN32_FIND_DATAW findFileData; // Use the wide version explicitly
DWORD bytesWritten = 0;
HANDLE f;
HANDLE c = CreateFileW(L"subMape.txt", GENERIC_READ | GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
DWORD lengthSum = 0;
if(c != INVALID_HANDLE_VALUE) {
cout << "CreateFile() succeeded!\n";
// Write A byte-order mark...make sure length is bytes not characters.
WriteFile(c, L"\uFEFF", sizeof(wchar_t), &bytesWritten, NULL);
lengthSum += bytesWritten;
f = FindFirstFileW(L"C:\\Program Files\\*", &findFileData);
if(f != INVALID_HANDLE_VALUE) {
while(FindNextFileW(f, &findFileData)) {
// Write filename...length in bytes
WriteFile(c, findFileData.cFileName, (DWORD)wcslen(findFileData.cFileName) * sizeof(wchar_t), &bytesWritten, NULL);
// Add the length *after* writing...
lengthSum += bytesWritten;
// Add a carriage return/line feed to make Notepad happy.
WriteFile(c, L"\r\n", sizeof(wchar_t) * 2, &bytesWritten, NULL);
lengthSum += bytesWritten;
}
FindClose(f); // This should be inside findFirstFile succeeded block.
}
else {
cout << "FindFirstFile() failed :(\n";
}
// these should be inside CreateFile succeeded block.
CloseHandle(c);
cout << lengthSum << endl;
}
else {
cout << "CreateFile() failed :(\n";
}
return 0;
}
