Does anyone here have an idea how to work with Japanese characters in Visual C++?
I'm trying to display a Japanese name in the console with Visual C++.
#include "stdafx.h"
#include <string>
#include <iostream>
using namespace std;
int main()
{
    cout << "北島 美奈" << endl;
    return 0;
}
Output in the console:
?? ??
Press any key to continue ...
Hope someone can help. Thank you.
I've tested both UTF-8 and EUC-KR (Korean) with my own code in a console window (cmd.exe).
Here is my source code.
#include <string>
#include <iostream>
#include <cstring>
#include <windows.h>

using namespace std;

int main()
{
    int codepage = CP_ACP;       // CP_ACP, CP_OEMCP
    int conv_codepage = CP_UTF8; // CP_UTF8

    char str[256] = {};
    char str1[256] = {};
    wchar_t tstr[256] = {};

    strcpy(str, " 北島 美奈");

    int nLen = MultiByteToWideChar(codepage, 0, str, -1, 0, 0);
    MultiByteToWideChar(codepage, 0, str, -1, tstr, nLen);

    int len = WideCharToMultiByte(conv_codepage, 0, tstr, -1, NULL, 0, 0, 0);
    WideCharToMultiByte(conv_codepage, 0, tstr, -1, str1, len, 0, 0);

    cout << "2... " << str1 << endl;
    return 0;
}
Case 1, UTF-8: the result in the console.
The output is reasonable, because the str1 variable holds a UTF-8 string.
I get correct UTF-8 output in a UTF-8 console window.
Case 2, EUC-KR: the result in the console.
I think this case is also acceptable: a UTF-8 string displayed in a UTF-8 console.
Then I changed the output line from
cout << "2... " << str << endl;
to
cout << "2... " << str1 << endl;
Case 1, UTF-8: the result in the console.
I think this is OK: a Unicode string in a UTF-8 console.
Case 2, EUC-KR: the result in the console.
It is still a correct Unicode string in the EUC-KR code page.
Related
I want to rename some files.
Some of the file names are Russian, Chinese, or German.
My program can only rename files whose names are English.
What is the problem? Please guide me.
#include <filesystem>
#include <iostream>
#include <string>
#include <cstdio>   // _wrename
#include <cstdlib>  // system
#include <windows.h>

std::wstring ToUtf16(std::string str)
{
    std::wstring ret;
    int len = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), NULL, 0);
    if (len > 0)
    {
        ret.resize(len);
        MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), &ret[0], len);
    }
    return ret;
}

int main()
{
    const std::filesystem::directory_options options = (
        std::filesystem::directory_options::follow_directory_symlink |
        std::filesystem::directory_options::skip_permission_denied
    );
    try
    {
        for (const auto& dirEntry :
             std::filesystem::recursive_directory_iterator("C:\\folder",
                 std::filesystem::directory_options(options)))
        {
            std::filesystem::path myfile(dirEntry.path().u8string());
            std::string uft8path1 = dirEntry.path().u8string();
            std::string uft8path3 = myfile.parent_path().u8string() + "/" + myfile.filename().u8string();
            _wrename(ToUtf16(uft8path1).c_str(), ToUtf16(uft8path3).c_str());
            std::cout << dirEntry.path().u8string() << std::endl;
        }
    }
    catch (std::filesystem::filesystem_error& fse)
    {
        std::cout << fse.what() << std::endl;
    }
    system("pause");
}
filesystem::path myfile(dirEntry.path().u8string());
Windows APIs support UTF-16 and ANSI; there is no UTF-8 support (not standard, anyway). When you supply a UTF-8 string, it is treated as ANSI input. Use wstring() to get UTF-16:
filesystem::path myfile(dirEntry.path().wstring());
or just put:
filesystem::path myfile(dirEntry);
Likewise, use wstring() for other objects.
wstring path1 = dirEntry.path();
wstring path3 = myfile.parent_path().wstring() + L"/" + myfile.filename().wstring();
_wrename(path1.c_str(), path3.c_str());
Renaming the files will work fine once you have UTF-16 input. But there is another problem: the console's Unicode support is limited, and it cannot print some Asian characters without font changes. Use the debugger or MessageBoxW to view Asian characters.
Use _setmode and wcout to print UTF16.
Also note that std::filesystem supports the / operator for joining paths. Example:
#include <io.h> //for _setmode
#include <fcntl.h>
...
int main()
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    const std::filesystem::directory_options options = (
        std::filesystem::directory_options::follow_directory_symlink |
        std::filesystem::directory_options::skip_permission_denied
    );
    try
    {
        for (const auto& dirEntry :
             std::filesystem::recursive_directory_iterator(L"C:\\folder",
                 std::filesystem::directory_options(options)))
        {
            std::filesystem::path myfile(dirEntry);
            auto path1 = dirEntry.path();
            auto path3 = myfile.parent_path() / myfile.filename();
            std::wcout << path1 << ", " << path3 << std::endl;
            //std::filesystem::rename(path1, path3);
        }
    }
    ...
}
I am writing an output file using ofstream in C++. The file path is c:\my_folder\フォルダ\text_file.txt,
but it reports "Path not found".
Here is my code (I tried this in Visual Studio 2017 Community Edition):
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
    ofstream outfile;
    outfile.open("c:\\my_folder\\フォルダ\\text_file.txt", ios::out);
    if (!outfile)
    {
        cout << "Path not found\n";
    }
    outfile << "Hello world\n";
    outfile.close();
    return 0;
}
I also tried the following, based on "Displaying japanese characters in visual c++":
#include <string>
#include <iostream>
#include <fstream>
#include <cstring>
#include <windows.h>

using namespace std;

int main()
{
    int codepage = CP_ACP;       // CP_ACP, CP_OEMCP
    int conv_codepage = CP_UTF8; // CP_UTF8

    char str[256] = {};
    char str1[256] = {};
    wchar_t tstr[256] = {};

    strcpy(str, "c:\\my_folder\\フォルダ\\text_file.txt");

    int nLen = MultiByteToWideChar(codepage, 0, str, -1, 0, 0);
    MultiByteToWideChar(codepage, 0, str, -1, tstr, nLen);

    int len = WideCharToMultiByte(conv_codepage, 0, tstr, -1, NULL, 0, 0, 0);
    WideCharToMultiByte(conv_codepage, 0, tstr, -1, str1, len, 0, 0);

    cout << "2... " << str1 << endl;

    ofstream outfile;
    outfile.open(str1, ios::out);
    if (!outfile)
    {
        cout << "Path not found\n";
    }
    outfile << "Hello world\n";
    outfile.close();
    return 0;
}
I tried using wchar_t as well, but none of these worked:
std::wcout.imbue(std::locale("ja_jp.utf-8"));
const wchar_t* c_name = L"c:\\my_folder\\フォルダ\\text_file.txt";
wofstream outfile;
outfile.open(c_name, ios::out);
if (!outfile)
{
    cout << "Path not found\n";
}
outfile << "Hello world\n";
outfile.close();
Please help me.
In C++, I am trying to display the codepoint of a wchar_t retrieved from std::wcin in MessageBoxW().
My source file is encoded in UTF-8.
If I declare my wchar_t in the source of my program, and give it an initial value, I get the display of the Unicode character and its codepoint in MessageBoxW().
However, if I retrieve the wchar_t from std::wcin, the Unicode character entered is not interpreted correctly.
Can you tell me what my error is?
I compile my code with MinGW GCC version 6.3 32-bit.
Do I need to use a particular C++ option, or C++ version?
Here is the code that works:
#include <Windows.h>
#include <stdio.h>

int main()
{
    wchar_t c = L'−';
    wchar_t buff[1024];
    swprintf(buff, 1024, L"The code point of %c is %d.", c, (int)c);
    MessageBoxW(NULL, buff, L"", MB_OK);
}
Here is the code that reads an erroneous character, although it compiles without errors:
#include <Windows.h>
#include <iostream>
#include <stdio.h>

int main()
{
    wchar_t c;
    std::wcout << "Enter a wchar";
    std::wcin >> c;
    wchar_t buff[1024];
    swprintf(buff, 1024, L"The code point of %c is %d.", c, (int)c);
    MessageBoxW(NULL, buff, L"", MB_OK);
}
Finally, I got it!
Many thanks to @RemyLebeau and @n.m. for their help.
I only needed the last part of the code given by @RemyLebeau.
Here is the code, which works very well here with any typed character.
P.S. It is still missing a check on the length of the string entered by the user, who must enter only one character.
Any idea for a correction or improvement would be much appreciated.
#include <stdio.h>
#include <Windows.h>
#include <iostream>
#include <string>

int main()
{
    std::wcout << "Enter a wchar";
    std::wstring s;
    wchar_t buffer[4] = {};
    DWORD numRead = 0;
    if (ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE), buffer, 4, &numRead, NULL))
    {
        s.append(buffer, numRead);
    }
    wchar_t buff[1024];
    const wchar_t* c = s.c_str();
    swprintf(buff, 1024, L"The codepoint of %s is %u.", c, (unsigned)*c);
    MessageBoxW(NULL, buff, L"", MB_OK);
}
Best regards.
I made this DLL, which tries to check whether a file exists, but even if I manually create the file, my DLL still can't find it.
My DLL retrieves the process ID of the running program and looks for a file that is named after the PID.
Can anyone please tell me what I'm missing? :(
Code:
#include <Windows.h>
#include <winbase.h>
#include <stdio.h>
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
int clientpid = GetCurrentProcessId();
ifstream clientfile;
string clientpids, clientfilepath;

VOID LoadDLL() {
    AllocConsole();
    freopen("CONOUT$", "w", stdout);
    std::cout << "Debug Start" << std::endl;

    std::ostringstream ostr;
    ostr << clientpid;
    clientpids = ostr.str();
    ostr.str("");

    TCHAR tempcvar[MAX_PATH];
    GetSystemDirectory(tempcvar, MAX_PATH);
    ostr << tempcvar << "\\" << clientpids << ".nfo" << std::endl;
    clientfilepath = ostr.str();
    //clientfile.c_str()
    ostr.str("");

    std::cout << "Start search for: " << clientfilepath << std::endl;
    FOREVER {
        clientfile.open(clientfilepath, ios::in);
        if (clientfile.good()) {
            std::cout << "Exists!" << std::endl;
        }
        Sleep(10);
    }
}
Supposing you are working with UNICODE, I think the problem is in the following line:
ostr << tempcvar << "\\" << clientpids << ".nfo" << std::endl;
tempcvar is a TCHAR array, and if you are building with Unicode that means tempcvar holds wide characters.
The result of inserting tempcvar into ostr is not what you expect (you are mixing multibyte and wide-character strings). A solution is to convert tempcvar into a multibyte string (const char* or char*).
Look at this example based on your code (note the conversion from TCHAR to multibyte):
VOID LoadDLL() {
    AllocConsole();
    freopen("CONOUT$", "w", stdout);
    std::cout << "Debug Start" << std::endl;

    std::ostringstream ostr;
    ostr << clientpid;
    clientpids = ostr.str();
    ostr.str("");

    TCHAR tempcvar[MAX_PATH];
    GetSystemDirectory(tempcvar, MAX_PATH);

    // Conversion between TCHAR in Unicode (wide char) and multibyte
    wchar_t* tempcvar_widechar = (wchar_t*)tempcvar;
    char* to_convert;
    int bytes_to_store = WideCharToMultiByte(CP_ACP, 0, tempcvar_widechar,
                                             -1, NULL, 0, NULL, NULL);
    to_convert = new char[bytes_to_store];
    WideCharToMultiByte(CP_ACP, 0, tempcvar_widechar,
                        -1, to_convert, bytes_to_store, NULL, NULL);

    // Use to_convert, which is tempcvar converted to multibyte
    // (no std::endl here: a newline must not end up in the file name)
    ostr << to_convert << "\\" << clientpids << ".nfo";
    delete[] to_convert;
    clientfilepath = ostr.str();
    ostr.str("");

    std::cout << "Start search for: " << clientfilepath << std::endl;
    FOREVER {
        clientfile.open(clientfilepath, ios::in);
        if (clientfile.good()) {
            std::cout << "Exists!" << std::endl;
        }
        Sleep(10);
    }
}
You can search for more on wide-string to multibyte-string conversion if this example does not work for you.
Check whether you are building with Unicode; if you are, that is probably your problem.
If you are not building with Unicode, the problem may instead be in how you open the file.
Hope it helps!
I have a wide-character string (std::wstring) in my code, and I need to search for a wide character in it.
I use the find() function:
wcin >> str;
wcout << ((str.find(L'ф') != wstring::npos) ? L"EXIST" : L"NONE");
L'ф' is a Cyrillic letter.
But find() in this call always returns npos. With Latin letters, find() works correctly.
Is it a problem with this function, or am I doing something wrong?
UPD
I use MinGW and save the source in UTF-8.
I also set the locale with setlocale(LC_ALL, "");.
Code such as wcout << L'ф'; works correctly.
But this
wchar_t w;
wcin >> w;
wcout << w;
works incorrectly.
It is strange; earlier I had no problems with the encoding when using setlocale().
The encoding of your source file and the execution environment's encoding may be wildly different. C++ makes no guarantees about any of this. You can check this by outputting the hexadecimal value of your character:
std::wcout << std::hex << static_cast<unsigned>(L'ф');
Before C++11, you could use non-ASCII characters in source code by using their hex values:
"\x05" "five"
C++11 adds the ability to specify their Unicode value, which in your case would be
L"\u0444"
If you're going full C++11 (and your environment ensures these are encoded in UTF-*), you can use any of char, char16_t, or char32_t, and do:
const char* ef_utf8 = "\u0444";
const char16_t* ef_utf16 = u"\u0444";
const char32_t* ef_utf32 = U"\u0444";
You must set the encoding of the console.
This works:
#include <iostream>
#include <string>
#include <io.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

using namespace std;

int main()
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    _setmode(_fileno(stdin), _O_U16TEXT);

    wstring str;
    wcin >> str;
    wcout << ((str.find(L'ф') != wstring::npos) ? L"EXIST" : L"NONE");
    system("pause");
    return 0;
}
std::wstring::find() works fine, but you have to read the input string correctly.
The following code runs fine in the Windows console (the input Unicode string is read using the ReadConsoleW() Win32 API):
#include <exception>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <windows.h>
using namespace std;
class Win32Error : public runtime_error
{
public:
    Win32Error(const char* message, DWORD error)
        : runtime_error(message)
        , m_error(error)
    {}

    DWORD Error() const
    {
        return m_error;
    }

private:
    DWORD m_error;
};

void ThrowLastWin32(const char* message)
{
    const DWORD error = GetLastError();
    throw Win32Error(message, error);
}

void Test()
{
    const HANDLE hStdIn = GetStdHandle(STD_INPUT_HANDLE);
    if (hStdIn == INVALID_HANDLE_VALUE)
        ThrowLastWin32("GetStdHandle failed.");

    static const int kBufferLen = 200;
    wchar_t buffer[kBufferLen];
    DWORD numRead = 0;

    if (! ReadConsoleW(hStdIn, buffer, kBufferLen, &numRead, nullptr))
        ThrowLastWin32("ReadConsoleW failed.");

    // Drop the trailing CR+LF pair left by ReadConsoleW.
    const wstring str(buffer, numRead - 2);

    static const wchar_t kEf = 0x0444;
    wcout << ((str.find(kEf) != wstring::npos) ? L"EXIST" : L"NONE");
}

int main()
{
    static const int kExitOk = 0;
    static const int kExitError = 1;

    try
    {
        Test();
        return kExitOk;
    }
    catch (const Win32Error& e)
    {
        cerr << "\n*** ERROR: " << e.what() << '\n';
        cerr << "    (GetLastError returned " << e.Error() << ")\n";
        return kExitError;
    }
    catch (const exception& e)
    {
        cerr << "\n*** ERROR: " << e.what() << '\n';
        return kExitError;
    }
}
Output:
C:\TEMP>test.exe
abc
NONE
C:\TEMP>test.exe
abcфabc
EXIST
That's probably an encoding issue: wcin works with an encoding different from your compiler's/source code's. Try entering the ф in the console/wcin -- it will work. Try printing the ф via wcout -- it will show a different character, or no character at all.
There is no platform-independent way to circumvent this, but if you are on Windows, you can manually change the console encoding, either with the chcp command-line command or programmatically with SetConsoleCP() (input) and SetConsoleOutputCP() (output).
You could also change your source file's/compiler's encoding. How this is done depends on your editor/compiler. If you are using MSVC, this answer might help you: https://stackoverflow.com/a/1660901/2128694