wstring::find() doesn't work with non-Latin symbols? - c++

I have a wide-character string (std::wstring) in my code, and I need to search for a wide character in it.
I use the find() function for that:
wcin >> str;
wcout << ((str.find(L'ф') != wstring::npos)? L"EXIST":L"NONE");
L'ф' is a Cyrillic letter.
But this call to find() always returns npos. With Latin letters, find() works correctly.
Is this a problem with the function, or am I doing something wrong?
UPD
I use MinGW and save the source in UTF-8.
I also set the locale with setlocale(LC_ALL, "");.
Code like wcout << L'ф'; works correctly.
But this:
wchar_t w;
wcin >> w;
wcout << w;
works incorrectly.
That's strange. Earlier I had no problems with the encoding when using setlocale().

The encoding of your source file and the execution environment's encoding may be wildly different. C++ makes no guarantees about any of this. You can check for a mismatch by outputting the numeric value of your character literal:
std::wcout << std::hex << static_cast<int>(L'ф');
Before C++11, you could use non-ASCII characters in source code by using their hex values:
"\x05" "five"
C++11 adds the ability to specify the character's Unicode code point; for ф (CYRILLIC SMALL LETTER EF, U+0444) that would be
L"\u0444"
If you're going full C++11 (and your environment ensures these are encoded in UTF-8/UTF-16/UTF-32), you can use any of char, char16_t, or char32_t, and do:
const char* ef_utf8 = "\u0444";
const char16_t* ef_utf16 = u"\u0444";
const char32_t* ef_utf32 = U"\u0444";
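Putting that escape back into the original snippet gives a version whose behavior no longer depends on how the editor saved the file. A minimal sketch; whether wcin itself decodes the console input correctly is a separate issue, addressed in the answers below:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    wstring str;
    wcin >> str;
    // U+0444 is CYRILLIC SMALL LETTER EF; spelling it as a universal
    // character name removes any dependence on the source file's encoding.
    wcout << ((str.find(L'\u0444') != wstring::npos) ? L"EXIST" : L"NONE");
}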

You must set the encoding of the console.
This works:
#include <iostream>
#include <string>
#include <io.h>
#include <fcntl.h>
#include <stdio.h>
#include <cstdlib>

using namespace std;

int main()
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    _setmode(_fileno(stdin), _O_U16TEXT);

    wstring str;
    wcin >> str;
    wcout << ((str.find(L'ф') != wstring::npos) ? L"EXIST" : L"NONE");

    system("pause");
    return 0;
}
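One caveat with this approach: once a stream has been switched to _O_U16TEXT, only wide output is allowed on it; mixing in narrow output (printf, cout) on the same stream trips an assertion in the debug CRT. Stick to wcin/wcout throughout such a program.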

std::wstring::find() works fine. But you have to read the input string correctly.
The following code runs fine on Windows console (the input Unicode string is read using ReadConsoleW() Win32 API):
#include <exception>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <windows.h>

using namespace std;

class Win32Error : public runtime_error
{
public:
    Win32Error(const char* message, DWORD error)
        : runtime_error(message)
        , m_error(error)
    {}

    DWORD Error() const
    {
        return m_error;
    }

private:
    DWORD m_error;
};

void ThrowLastWin32(const char* message)
{
    const DWORD error = GetLastError();
    throw Win32Error(message, error);
}

void Test()
{
    const HANDLE hStdIn = GetStdHandle(STD_INPUT_HANDLE);
    if (hStdIn == INVALID_HANDLE_VALUE)
        ThrowLastWin32("GetStdHandle failed.");

    static const int kBufferLen = 200;
    wchar_t buffer[kBufferLen];
    DWORD numRead = 0;

    if (! ReadConsoleW(hStdIn, buffer, kBufferLen, &numRead, nullptr))
        ThrowLastWin32("ReadConsoleW failed.");

    // Drop the trailing CR+LF pair that ReadConsoleW includes in the count.
    const wstring str(buffer, numRead - 2);

    static const wchar_t kEf = 0x0444;
    wcout << ((str.find(kEf) != wstring::npos) ? L"EXIST" : L"NONE");
}

int main()
{
    static const int kExitOk = 0;
    static const int kExitError = 1;

    try
    {
        Test();
        return kExitOk;
    }
    catch (const Win32Error& e)
    {
        cerr << "\n*** ERROR: " << e.what() << '\n';
        cerr << "    (GetLastError returned " << e.Error() << ")\n";
        return kExitError;
    }
    catch (const exception& e)
    {
        cerr << "\n*** ERROR: " << e.what() << '\n';
        return kExitError;
    }
}
Output:
C:\TEMP>test.exe
abc
NONE
C:\TEMP>test.exe
abcфabc
EXIST

That's probably an encoding issue. wcin works with an encoding different from your compiler's/source code's. Try entering the ф in the console/wcin -- it will work. Try printing the ф via wcout -- it will show a different character or no character at all.
There is no platform-independent way to circumvent this, but if you are on Windows, you can manually change the console encoding, either with the chcp command-line command or programmatically with SetConsoleCP() (input) and SetConsoleOutputCP() (output).
You could also change your source file's/compiler's encoding. How this is done depends on your editor/compiler. If you are using MSVC, this answer might help you: https://stackoverflow.com/a/1660901/2128694
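For completeness, a minimal sketch of the programmatic variant (assuming the source file itself is saved and compiled as UTF-8, so the narrow literal holds UTF-8 bytes):

#include <windows.h>
#include <iostream>

int main()
{
    // Programmatic equivalent of running "chcp 65001" before the program:
    // switch both console directions to UTF-8.
    SetConsoleCP(CP_UTF8);
    SetConsoleOutputCP(CP_UTF8);
    std::cout << "ф\n"; // whether the console font can render it is a separate issue
}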

Related

Why does std::iswalpha return false for some French characters in C++?

I am using std::iswalpha to check if a non-ASCII character (a French character) is alphabetic. However, I have found that it returns false for the é character. I have set my locale to fr_FR.UTF-8 in my code. Can you help me understand why this is happening, and how I can correctly determine whether a French character is alphabetic using C++?
#include <clocale>
#include <cwctype>
#include <iostream>

int main() {
    std::setlocale(LC_ALL, "fr_FR.UTF-8");
    wchar_t ch = L'é';
    bool is_alpha = std::iswalpha(ch);
    std::cout << is_alpha << std::endl; // prints 0
    return 0;
}
It's because installing the locale fails, but you don't check that.
Running this in Compiler Explorer prints "locale: No such file or directory", while running it on my local computer, where I have the fr_FR.UTF-8 locale installed, succeeds and prints 1:
#include <clocale>
#include <cstdio>
#include <cwctype>
#include <iostream>

int main() {
    if (std::setlocale(LC_ALL, "fr_FR.UTF-8") == nullptr) {
        std::perror("locale");
    } else {
        wint_t ch = L'é';
        bool is_alpha = std::iswalpha(ch);
        std::cout << is_alpha << std::endl; // prints 1
    }
}
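If you want a failure you cannot overlook, the std::locale constructor throws instead of returning an error code. A minimal sketch of the same check using the <locale> overload of std::isalpha (again assuming fr_FR.UTF-8 is installed):

#include <iostream>
#include <locale>

int main() {
    try {
        std::locale fr("fr_FR.UTF-8"); // throws std::runtime_error if unavailable
        std::cout << std::isalpha(L'é', fr) << std::endl; // prints 1
    } catch (const std::runtime_error& e) {
        std::cerr << "locale error: " << e.what() << std::endl;
    }
}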

Create string from UTF-8 byte array?

Consider the emoji 😙. It's U+1F619 (decimal 128537). I believe its UTF-8 byte array is 240, 159, 152, 153.
Given the UTF-8 byte array, how can I display it? Do I create a std::string from the byte array? Are there 3rd party libraries which help?
Given a different emoji, how can I get its UTF-8 byte array?
Target platform: Windows. Compiler: Visual C++ 2019. Just pasting 😙 into the Windows CMD prompt does not work. I tried chcp 65001 and Lucida as the font, but no luck.
I can do this on macOS or Linux if necessary, but I prefer Windows.
To clarify ... given a list of 400 bytes, how can I display the corresponding code points assuming UTF-8?
C++ has a simple solution to that.
#include <iostream>
#include <string>

int main(void) {
    std::string s = u8"😙"; /* use std::u8string in C++20 */
    std::cout << s << std::endl;
    return 0;
}
This will allow you to store and print any UTF-8 string.
Note that the Windows command prompt is weird with this kind of stuff. It's better to use an alternative such as MSYS2.
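As for the follow-up question (given a different emoji, how can I get its UTF-8 byte array?): the encoding is mechanical, so you can compute the bytes yourself. A minimal sketch, with no validation of surrogate code points or values above U+10FFFF:

#include <string>

// Encode a single Unicode code point as UTF-8.
std::string to_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                        // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

For example, to_utf8(U'\U0001F619') yields the four bytes 0xF0 0x9F 0x98 0x99 (240, 159, 152, 153).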
Here is sample code for experimenting with Unicode: it converts a Unicode character/string and prints it to the console. It works just fine for a lot of Unicode characters, assuming you set the correct locale and console code page and perform an adequate string conversion where needed (e.g. char32_t, char16_t and char8_t need conversion).
For the character you want to display, though, it's not that easy: running the test takes a huge amount of time. This can be improved by modifying the code below, or by knowing the details in advance, such as the code page (likely not supported by Windows), so feel free to experiment as long as it doesn't become boring ;)
Hint: it would be best to add code that writes to a file, let it run, and check the results in the file after an hour or so. For this to work you'll need to put a BOM into the file, but not before the file is opened as UTF-encoded; you do this with wofstream::imbue() with a specific locale. Which BOM depends on endianness; on Windows it's the UTF-X LE encoding scheme, where X is 8, 16, or 32. Writing to the file must be done with wchar_t output to succeed.
See the code comments for more info, and try commenting out/uncommenting parts of the code to see different and quicker results.
BTW, the point of this code is to try out all possible locales/code pages supported by the system, until you see your smiley in the console or ultimately fail.
#include <climits>
#include <locale>
#include <iostream>
#include <sstream>
#include <Windows.h>
#include <string_view>
#include <cassert>
#include <cuchar>
#include <cwchar>
#include <limits>
#include <vector>
#include <string>

#pragma warning (push, 4)

#if !defined UNICODE && !defined _UNICODE
#error "Compile as unicode"
#endif

#define LINE __LINE__

// NOTE: change desired default code page here (unused)
#define CODE_PAGE CP_UTF8

// Error handling helper method
void StringCastError()
{
    std::wstring error = L"Unknown error";
    switch (GetLastError())
    {
    case ERROR_INSUFFICIENT_BUFFER:
        error = L"A supplied buffer size was not large enough, or it was incorrectly set to NULL";
        break;
    case ERROR_INVALID_FLAGS:
        error = L"The values supplied for flags were not valid";
        break;
    case ERROR_INVALID_PARAMETER:
        error = L"Any of the parameter values was invalid.";
        break;
    case ERROR_NO_UNICODE_TRANSLATION:
        error = L"Invalid Unicode was found in a string.";
        break;
    default:
        break;
    };
    std::wcerr << error << std::endl;
}

// Convert multibyte to wide string
static std::wstring StringCast(const std::string& param, int code_page)
{
    if (param.empty())
    {
        std::wcerr << L"ERROR: param string is empty" << std::endl;
        return std::wstring();
    }

    DWORD flags = MB_ERR_INVALID_CHARS;
    //flags |= MB_USEGLYPHCHARS;
    //flags |= MB_PRECOMPOSED;
    switch (code_page)
    {
    case 50220:
    case 50221:
    case 50222:
    case 50225:
    case 50227:
    case 50229:
    case 65000:
    case 42:
        flags = 0;
        break;
    case 54936:
    case CP_UTF8:
        flags = MB_ERR_INVALID_CHARS; // or 0
        break;
    default:
        if ((code_page >= 57002) && (code_page <= 57011))
            flags = 0;
        break;
    }

    const int source_char_size = static_cast<int>(param.size());
    int chars = MultiByteToWideChar(code_page, flags, param.c_str(), source_char_size, nullptr, 0);
    if (chars == 0)
    {
        StringCastError();
        return std::wstring();
    }

    std::wstring return_string(static_cast<unsigned int>(chars), 0);
    chars = MultiByteToWideChar(code_page, flags, param.c_str(), source_char_size, &return_string[0], chars);
    if (chars == 0)
    {
        StringCastError();
        return std::wstring();
    }

    return return_string;
}

// Convert wide to multibyte string
std::string StringCast(const std::wstring& param, int code_page)
{
    if (param.empty())
    {
        std::wcerr << L"ERROR: param string is empty" << std::endl;
        return std::string();
    }

    DWORD flags = WC_ERR_INVALID_CHARS;
    //flags |= WC_COMPOSITECHECK;
    flags |= WC_NO_BEST_FIT_CHARS;
    switch (code_page)
    {
    case 50220:
    case 50221:
    case 50222:
    case 50225:
    case 50227:
    case 50229:
    case 65000:
    case 42:
        flags = 0;
        break;
    case 54936:
    case CP_UTF8:
        flags = WC_ERR_INVALID_CHARS; // or 0
        break;
    default:
        if ((code_page >= 57002) && (code_page <= 57011))
            flags = 0;
        break;
    }

    const int source_wchar_size = static_cast<int>(param.size());
    int chars = WideCharToMultiByte(code_page, flags, param.c_str(), source_wchar_size, nullptr, 0, nullptr, nullptr);
    if (chars == 0)
    {
        StringCastError();
        return std::string();
    }

    std::string return_string(static_cast<unsigned int>(chars), 0);
    chars = WideCharToMultiByte(code_page, flags, param.c_str(), source_wchar_size, &return_string[0], chars, nullptr, nullptr);
    if (chars == 0)
    {
        StringCastError();
        return std::string();
    }

    return return_string;
}

// Console code page helper to adjust console
bool SetConsole(UINT code_page)
{
    if (IsValidCodePage(code_page) == 0)
    {
        std::wcerr << L"Code page is not valid: " << LINE << std::endl;
    }
    else if (SetConsoleCP(code_page) == 0)
    {
        std::wcerr << L"Failed to set console input code page line: " << LINE << std::endl;
    }
    else if (SetConsoleOutputCP(code_page) == 0)
    {
        std::wcerr << L"Failed to set console output code page: " << LINE << std::endl;
    }
    else
    {
        return true;
    }
    return false;
}

std::vector<std::string> locales;

// System locale enumerator to get all locales installed on system
BOOL CALLBACK LocaleEnumprocex(LPWSTR locale_name, [[maybe_unused]] DWORD locale_info, LPARAM code_page)
{
    locales.push_back(StringCast(locale_name, static_cast<int>(code_page)));
    return TRUE; // continue drilling
}

// System code page enumerator to try out every possible supported/installed code page on system
BOOL CALLBACK EnumCodePagesProc(LPTSTR page_str)
{
    wchar_t* end;
    UINT code_page = std::wcstol(page_str, &end, 10);

    char char_buff[MB_LEN_MAX]{};
    char32_t target_char = U'😙';
    std::mbstate_t state{};
    std::stringstream string_buff{};
    std::wstring wstr = L"";

    // convert UTF-32 to multibyte
    std::size_t ret = std::c32rtomb(char_buff, target_char, &state);
    if (ret == static_cast<std::size_t>(-1))
    {
        std::wcout << L"Conversion from char32_t failed: " << LINE << std::endl;
        return FALSE;
    }
    else
    {
        string_buff << std::string_view{ char_buff, ret };
        string_buff << '\0';
        if (string_buff.fail())
        {
            string_buff.clear();
            std::wcout << L"string_buff failed or bad line: " << LINE << std::endl;
            return FALSE;
        }

        // NOTE: CP_UTF8 gives good results; ex. CP_SYMBOL or the code_page variable does not.
        // To make stuff work, provide a good code page.
        wstr = StringCast(string_buff.str(), CP_UTF8 /* code_page */ /* CP_SYMBOL */);
    }

    // Try out every possible locale; this will take an insane amount of time!
    // Make sure to comment this range-for loop out if you know the locale.
    for (auto loc : locales)
    {
        // locale used (comment out for testing)
        std::locale::global(std::locale(loc));

        if (SetConsole(code_page))
        {
            // HACK: put a breakpoint here, and you'll see the string
            // is correctly encoded inside wstr (ex. mouse over wstr).
            // However it's not printed because the console code page is likely wrong.
            assert(std::wcout.good() && string_buff.good());
            std::wcout << wstr << std::endl;

            // NOTE: commented out to avoid spamming the console; basically
            // hard to find the correct code page, if not impossible, for CMD
            if (std::wcout.bad())
            {
                std::wcout.clear();
                //std::wcout << L"std::wcout Read/write error on i/o operation line: " << LINE << std::endl;
            }
            else if (std::wcout.fail())
            {
                std::wcout.clear();
                //std::wcout << L"std::wcout Logical error on i/o operation line: " << LINE << std::endl;
            }
        }
    }

    return TRUE; // continue drilling
}

int main()
{
    // NOTE: can also be LOCALE_ALL; anything other than CP_UTF8 doesn't make sense here
    EnumSystemLocalesEx(LocaleEnumprocex, LOCALE_WINDOWS, static_cast<LPARAM>(CP_UTF8), 0);

    // NOTE: can also be CP_INSTALLED
    EnumSystemCodePagesW(EnumCodePagesProc, CP_SUPPORTED);

    // NOTE: the following is just test code to demonstrate these algorithms indeed work;
    // comment out the 2 functions above to test!
    std::mbstate_t state{};
    std::stringstream string_buff{};
    char char_buff[MB_LEN_MAX]{};

    // Test case for working char:
    std::locale::global(std::locale("ru_RU.utf8"));
    string_buff.clear();
    string_buff.str(std::string());

    // Russian (KOI8-R); Cyrillic (KOI8-R)
    if (SetConsole(20866))
    {
        char32_t char32_str[] = U"Познер обнародовал";
        for (char32_t c32 : char32_str)
        {
            std::size_t ret2 = std::c32rtomb(char_buff, c32, &state);
            if (ret2 == static_cast<std::size_t>(-1))
            {
                std::wcout << L"Conversion from char32_t failed line: " << LINE << std::endl;
            }
            else
            {
                string_buff << std::string_view{ char_buff, ret2 };
            }
        }
        string_buff << '\0';
        if (string_buff.fail())
        {
            string_buff.clear();
            std::wcout << L"string_buff failed or bad line: " << LINE << std::endl;
        }

        std::wstring wstr = StringCast(string_buff.str(), CP_UTF8);
        std::wcout << wstr << std::endl;
        if (std::wcout.fail())
        {
            std::wcout.clear();
            std::wcout << L"std::wcout failed or bad line: " << LINE << std::endl;
        }
    }
}
#pragma warning (pop)
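If the goal is simply to get the smiley onto the console, there is a much shorter route that skips code pages entirely: hand the UTF-16 data straight to the console, the same WriteConsoleW technique used in other answers on this page. A minimal sketch; whether the console font can actually render the glyph is a separate issue (classic conhost usually cannot, Windows Terminal can):

#include <windows.h>
#include <cwchar>

int main()
{
    // The compiler encodes U+1F619 as a UTF-16 surrogate pair.
    const wchar_t smiley[] = L"\U0001F619\n";
    DWORD written = 0;
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE),
                  smiley, static_cast<DWORD>(std::wcslen(smiley)),
                  &written, nullptr);
}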

How to output unicode box drawing in C++?

Sorry for what may sound simple, but I am trying to draw just a simple box in Visual Studio 2017 using the Unicode characters from https://en.wikipedia.org/wiki/Box-drawing_character, with the code below:
#include <iostream>
using namespace std;

int main()
{
    cout << "┏━━━━━━━━━━━━━━━━━┓" << endl;
    cout << "┃" << endl;
and so on...
However, whenever I run it, all of the above code simply outputs a ? wherever there should be a line.
So is it possible to output characters like this directly to the console, or do I have to write the numeric value for each character?
Windows console supports UTF-16LE UNICODE.
You can use some box-drawing library, like PDCurses for example.
Otherwise you can use the following approach
#include <windows.h>
#include <cwchar>

class output_swap {
    output_swap(const output_swap&) = delete;
    output_swap operator=(output_swap&) = delete;
public:
    output_swap() noexcept
        : prevCP_(::GetConsoleCP())
    {
        ::SetConsoleCP(CP_WINUNICODE);
        ::SetConsoleOutputCP(CP_WINUNICODE);
    }
    ~output_swap() noexcept {
        ::SetConsoleCP(prevCP_);
        ::SetConsoleOutputCP(prevCP_);
    }
private:
    ::DWORD prevCP_;
};

void draw_text(const wchar_t* text)
{
    static ::HANDLE _out = ::GetStdHandle(STD_OUTPUT_HANDLE);
    ::DWORD written;
    ::WriteConsoleW(_out, text, std::wcslen(text), &written, nullptr);
}

int main(int argc, const char** argv) {
    output_swap swap;
    draw_text(L"┏━━━━━━━━━━━━━━━━━┓\n");
    draw_text(L"┃ OK ┃\n");
    draw_text(L"┗━━━━━━━━━━━━━━━━━┛\n");
    return 0;
}
Also check your console font in the console settings. You probably need a raster font, but this also works with Consolas, for example.
If you need console I/O streams that can work with Unicode as well as box drawing, you can use my library.
Windows console apps can output wide strings (L"...") directly to the terminal if the mode is set correctly. Note the use of wcout as well. Save the following source in UTF-8 encoding:
#include <iostream>
#include <io.h>
#include <fcntl.h>
using namespace std;

int main()
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    wcout << L"┏━━━━━━━━━━━━━━━━━┓" << endl;
    wcout << L"┃" << endl;
}
Compile with "cl /EHsc /utf-8 test.cpp". Output is:
┏━━━━━━━━━━━━━━━━━┓
┃
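(The /utf-8 switch makes the compiler treat both the source character set and the execution character set as UTF-8, so the wide literals in the UTF-8 source file are decoded correctly at compile time.)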

How to handle multiple locales for ifstream, cout, etc, in c++

I am trying to read and process multiple files that are in different encodings. I am supposed to only use the STL for this.
Suppose that we have iso-8859-15 and UTF-8 files.
In this SO answer it states:
In a nutshell the more interesting part for you:
std::stream (stringstream, fstream, cin, cout) has an inner
locale-object, which matches the value of the global C++ locale at
the moment of the creation of the stream object. As std::cin is
created long before your code in main is called, it has most
probably the classical C locale, no matter what you do afterwards.
You can make sure, that a std::stream object has the desirable
locale by invoking
std::stream::imbue(std::locale(your_favorite_locale)).
The problem is that, of the two types, only the files that match the locale created first are processed correctly. For example, if locale_DE_ISO885915 precedes locale_DE_UTF8, then files that are in UTF-8 are not appended correctly to string s, and when I cout them I only see a couple of lines from the file.
void processFiles() {
    // setup locales for file decoding
    std::locale locale_DE_ISO885915("de_DE.iso885915#euro");
    std::locale locale_DE_UTF8("de_DE.UTF-8");

    //std::locale::global(locale_DE_ISO885915);
    //std::cout.imbue(std::locale());
    const std::ctype<wchar_t>& facet_DE_ISO885915 = std::use_facet<std::ctype<wchar_t>>(locale_DE_ISO885915);

    //std::locale::global(locale_DE_UTF8);
    //std::cout.imbue(std::locale());
    const std::ctype<wchar_t>& facet_DE_UTF8 = std::use_facet<std::ctype<wchar_t>>(locale_DE_UTF8);

    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    std::string currFile, fileStr;
    std::wifstream inFile;
    std::wstring s;

    for (std::vector<std::string>::const_iterator fci = files.begin(); fci != files.end(); ++fci) {
        currFile = *fci;

        // check file and set locale
        if (currFile.find("-8.txt") != std::string::npos) {
            std::locale::global(locale_DE_ISO885915);
            std::cout.imbue(locale_DE_ISO885915);
        }
        else {
            std::locale::global(locale_DE_UTF8);
            std::cout.imbue(locale_DE_UTF8);
        }

        inFile.open(path + currFile, std::ios_base::binary);
        if (!inFile) {
            // TODO specific file report
            std::cerr << "Failed to open file " << *fci << std::endl;
            exit(1);
        }

        s.clear();

        // read file content
        std::wstring line;
        while ((inFile.good()) && std::getline(inFile, line)) {
            s.append(line + L"\n");
        }
        inFile.close();

        // remove punctuation, numbers, tolower...
        for (unsigned int i = 0; i < s.length(); ++i) {
            if (iswpunct(s[i]) || iswdigit(s[i]))
                s[i] = L' ';
        }

        if (currFile.find("-8.txt") != std::string::npos) {
            facet_DE_ISO885915.tolower(&s[0], &s[0] + s.size());
        }
        else {
            facet_DE_UTF8.tolower(&s[0], &s[0] + s.size());
        }

        fileStr = converter.to_bytes(s);
        std::cout << fileStr << std::endl;
        std::cout << currFile << std::endl;
        std::cout << fileStr.size() << std::endl;
        std::cout << std::setlocale(LC_ALL, NULL) << std::endl;
        std::cout << "========================================================================================" << std::endl;
        // Process...
    }
    return;
}
As you can see in the code, I have tried both the global locale and local locale variables, but to no avail.
In addition, the SO answer How can I use std::imbue to set the locale for std::wcout? states:
So it really looks like there was an underlying C library mechanism
that should first be enabled with setlocale to allow imbue conversion
to work correctly.
Is this "obscure" mechanism the problem here?
Is it possible to alternate between the two locales while processing the files? What should I imbue (cout, ifstream, getline ?) and how?
Any suggestions?
PS: Why is everything related to locales so chaotic? :|
This works for me as expected on my Linux machine, but not on my Windows machine under Cygwin (the set of available locales is apparently the same on both machines, but the std::locale constructor just fails with every imaginable locale string).
#include <iostream>
#include <fstream>
#include <locale>
#include <string>

void printFile(const char* name, const char* loc)
{
    try {
        std::wifstream inFile;
        inFile.imbue(std::locale(loc));
        inFile.open(name);
        std::wstring line;
        while (getline(inFile, line))
            std::wcout << line << '\n';
    } catch (std::exception& e) {
        std::cerr << e.what() << std::endl;
    }
}

int main()
{
    std::locale::global(std::locale("en_US.utf8"));
    printFile("gtext-u8.txt", "de_DE.utf8");      // utf-8 text: grüßen
    printFile("gtext-legacy.txt", "de_DE#euro");  // iso8859-15 text: grüßen
}
Output:
grüßen
grüßen
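The key point is that each stream carries its own locale: imbue the wifstream with the encoding of the file it is about to read, before opening it, and there is no need to swap the global locale back and forth between files.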

How to pass a variable to the windows.h TEXT() macro?

First off, I googled the hell out of this, then searched the forums. In my ignorance of how the TEXT() macro operates, I cannot find an efficient way to search for an answer.
I'm writing a piece of code to search for a file, which relies on inputting the directory you want to search. However, when I pass anything but a literal value to the function, like displayContent(_TEXT("c:\\")), the software does not execute properly: it does not search for anything. Inserting breakpoints doesn't tell me much, as the software closes anyway.
I would like to pass a variable to the displayContent function by placing TEXT(variable) inside its argument, like displayContent(_TEXT(*ptrDir)), but that does not compile. Furthermore, when I simply place ptrDir in the argument of displayContent, the software compiles but does not execute properly: it asks for the directory to search but does not actually search it.
What's happening here? There has to be a way to pass displayContent a string that's received from the user.
#include "stdafx.h"
#include <iostream>
#include <windows.h>
#include <tchar.h>
#include "Strsafe.h"
#include <string>
using namespace std;
typedef wchar_t WCHAR;
#define CONST const
typedef CONST WCHAR* LPCWSTR;
int displayContent(LPCWSTR lpszPath, int level = 0) {
wcout << lpszPath << endl;
getchar();
getchar();
WIN32_FIND_DATA ptrFileData;
HANDLE hFile = NULL;
BOOL bGetNext = TRUE;
wchar_t lpszNewPath[MAX_PATH];
if (lstrlen(lpszPath) > MAX_PATH)
return -1;
StringCchCopy(lpszNewPath, MAX_PATH, lpszPath);
StringCchCat(lpszNewPath, MAX_PATH, _TEXT("*.*"));
hFile = FindFirstFile(lpszNewPath, &ptrFileData);
while (bGetNext)
{
for (int i = 0; i < level; i++)
wcout << "-";
if (ptrFileData.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY
&& lstrlen(ptrFileData.cFileName) > 2)
{
wchar_t lpszFirstTimePath[MAX_PATH];
StringCchCopy(lpszFirstTimePath, MAX_PATH, lpszPath);
StringCchCat(lpszFirstTimePath, MAX_PATH, ptrFileData.cFileName);
StringCchCat(lpszFirstTimePath, MAX_PATH, _TEXT("\\"));
wcout << ">" << ptrFileData.cFileName << endl;
displayContent(lpszFirstTimePath, level + 2);
}
else
{
wcout << ">" << ptrFileData.cFileName << endl;
}
bGetNext = FindNextFile(hFile, &ptrFileData);
}
FindClose(hFile);
return 0;
}
int main(int argc, char* argv[])
{
WCHAR directory;
LPCWSTR ptrDir;
ptrDir = &directory;
cout << "Enter directory you wish to search: " << endl;
//cin >> directory;
directory = 'c:\\' ;
ptrDir = &directory;
displayContent(_TEXT(*ptrDir));
getchar();
getchar();
return 0;
}
The _TEXT (and equivalently, _T) macro is strictly for literals (string literals or character literals). It expands to L for a Unicode build, and to nothing for a narrow-character build. So, for a string like (say) "hello", you'll get L"hello" for a Unicode build and "hello" for a narrow-character build. This gives you a wide literal in a Unicode build and a narrow literal otherwise.
If you have a string in a variable, you can convert between wide and narrow characters with the MultiByteToWideChar and WideCharToMultiByte functions.
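For illustration, a minimal sketch of such a conversion, assuming the narrow input is UTF-8:

#include <string>
#include <windows.h>

// Convert a narrow UTF-8 string to UTF-16 (two-call idiom: first query the
// required length, then do the actual conversion).
std::wstring widen(const std::string& in) {
    if (in.empty()) return std::wstring();
    int len = MultiByteToWideChar(CP_UTF8, 0, in.c_str(),
                                  static_cast<int>(in.size()), nullptr, 0);
    std::wstring out(len, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, in.c_str(),
                        static_cast<int>(in.size()), &out[0], len);
    return out;
}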
In this case, doing a conversion on the contents of a variable isn't really needed though. After eliminating some unnecessary complexity, and using a few standard library types where they make sense, I end up with code something like this:
#include <iostream>
#include <tchar.h>
#include <string>

#define UNICODE
#include <windows.h>

int displayContent(std::wstring const &path, int level = 0) {
    WIN32_FIND_DATA FileData;

    if (path.length() > MAX_PATH)
        return -1;

    std::wstring new_path = path + L"\\*.*";
    HANDLE hFile = FindFirstFile(new_path.c_str(), &FileData);
    if (hFile == INVALID_HANDLE_VALUE) // nothing matched; FileData would be uninitialized
        return -1;

    do {
        if ((FileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) && (FileData.cFileName[0] == L'.'))
            continue;

        std::wcout << std::wstring(level, L'-') << L">" << FileData.cFileName << L"\n";

        if (FileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
            displayContent(path + L"\\" + FileData.cFileName, level + 2);
    } while (FindNextFile(hFile, &FileData));

    FindClose(hFile);
    return 0;
}

int main(int argc, char* argv[]) {
    wchar_t current_dir[MAX_PATH];

    // GetCurrentDirectory takes the buffer size in characters, not bytes.
    GetCurrentDirectory(MAX_PATH, current_dir);
    displayContent(current_dir);
    return 0;
}
[Note: I've also changed it to start from the current directory instead of always starting at the root of the C drive, but if you want to change it back, that's pretty trivial; in fact, it simplifies the code a bit more.]