It seems that Latin1 (ISO-8859-1) cannot even store these special characters, so the format of the database must be Latin7 (ISO-8859-7). I could not find a ready-made function to do this conversion; do I really have to write one myself?
UPDATE: I made some progress, as described in this question: Special characters in Visual Studio 2019 C++ project AND executing CMD commands with them
But the problem seems to appear with default project settings, without any MySQL libraries or anything else, in correctly encoded (UTF-8) files, even when the compile flags are added and even when "Fix File Encoding" is installed.
#include <iostream>
#include <string>
int main() {
    std::string output = "āāāčččēēēē";
    std::cout << output << std::endl;
}
Intro rant: this is my third post about MySQL Connector, because I could not find basic information about it on Google at all (MySQL and MariaDB libraries in C++ using cmake, mingw). First there was no explanation that GCC cannot compile it for Windows systems; then I had no luck finding out how to use datetime and int objects in the output from the database until I posted the issue here (How to return time, date data fields in c++ mysql oracle vs17?).
My issue now is that strings returned from the database contain special characters (āàčīēļš etc.).
Column:test2col
Collation:Latin7_general_ci
Here is code that might work but does not, perhaps because the table is wrong; any expertise would help:
#include <cstdint>
#include <iostream>
#include <string>
#include <string_view>
std::string_view itou[256] {
{"\x00",1} , {"\x01",1} , {"\x02",1} , {"\x03",1} ,
{"\x04",1} , {"\x05",1} , {"\x06",1} , {"\x07",1} ,
{"\x08",1} , {"\x09",1} , {"\x0a",1} , {"\x0b",1} ,
{"\x0c",1} , {"\x0d",1} , {"\x0e",1} , {"\x0f",1} ,
{"\x10",1} , {"\x11",1} , {"\x12",1} , {"\x13",1} ,
{"\x14",1} , {"\x15",1} , {"\x16",1} , {"\x17",1} ,
{"\x18",1} , {"\x19",1} , {"\x1a",1} , {"\x1b",1} ,
{"\x1c",1} , {"\x1d",1} , {"\x1e",1} , {"\x1f",1} ,
{"\x20",1} , {"\x21",1} , {"\x22",1} , {"\x23",1} ,
{"\x24",1} , {"\x25",1} , {"\x26",1} , {"\x27",1} ,
{"\x28",1} , {"\x29",1} , {"\x2a",1} , {"\x2b",1} ,
{"\x2c",1} , {"\x2d",1} , {"\x2e",1} , {"\x2f",1} ,
{"\x30",1} , {"\x31",1} , {"\x32",1} , {"\x33",1} ,
{"\x34",1} , {"\x35",1} , {"\x36",1} , {"\x37",1} ,
{"\x38",1} , {"\x39",1} , {"\x3a",1} , {"\x3b",1} ,
{"\x3c",1} , {"\x3d",1} , {"\x3e",1} , {"\x3f",1} ,
{"\x40",1} , {"\x41",1} , {"\x42",1} , {"\x43",1} ,
{"\x44",1} , {"\x45",1} , {"\x46",1} , {"\x47",1} ,
{"\x48",1} , {"\x49",1} , {"\x4a",1} , {"\x4b",1} ,
{"\x4c",1} , {"\x4d",1} , {"\x4e",1} , {"\x4f",1} ,
{"\x50",1} , {"\x51",1} , {"\x52",1} , {"\x53",1} ,
{"\x54",1} , {"\x55",1} , {"\x56",1} , {"\x57",1} ,
{"\x58",1} , {"\x59",1} , {"\x5a",1} , {"\x5b",1} ,
{"\x5c",1} , {"\x5d",1} , {"\x5e",1} , {"\x5f",1} ,
{"\x60",1} , {"\x61",1} , {"\x62",1} , {"\x63",1} ,
{"\x64",1} , {"\x65",1} , {"\x66",1} , {"\x67",1} ,
{"\x68",1} , {"\x69",1} , {"\x6a",1} , {"\x6b",1} ,
{"\x6c",1} , {"\x6d",1} , {"\x6e",1} , {"\x6f",1} ,
{"\x70",1} , {"\x71",1} , {"\x72",1} , {"\x73",1} ,
{"\x74",1} , {"\x75",1} , {"\x76",1} , {"\x77",1} ,
{"\x78",1} , {"\x79",1} , {"\x7a",1} , {"\x7b",1} ,
{"\x7c",1} , {"\x7d",1} , {"\x7e",1} , {"\x7f",1} ,
{"\xc2""\x80",2} , {"\xc2""\x81",2} , {"\xc2""\x82",2} , {"\xc2""\x83",2} ,
{"\xc2""\x84",2} , {"\xc2""\x85",2} , {"\xc2""\x86",2} , {"\xc2""\x87",2} ,
{"\xc2""\x88",2} , {"\xc2""\x89",2} , {"\xc2""\x8a",2} , {"\xc2""\x8b",2} ,
{"\xc2""\x8c",2} , {"\xc2""\x8d",2} , {"\xc2""\x8e",2} , {"\xc2""\x8f",2} ,
{"\xc2""\x90",2} , {"\xc2""\x91",2} , {"\xc2""\x92",2} , {"\xc2""\x93",2} ,
{"\xc2""\x94",2} , {"\xc2""\x95",2} , {"\xc2""\x96",2} , {"\xc2""\x97",2} ,
{"\xc2""\x98",2} , {"\xc2""\x99",2} , {"\xc2""\x9a",2} , {"\xc2""\x9b",2} ,
{"\xc2""\x9c",2} , {"\xc2""\x9d",2} , {"\xc2""\x9e",2} , {"\xc2""\x9f",2} ,
{"\xc2""\xa0",2} , {"\xe2""\x80""\x98",3}, {"\xe2""\x80""\x99",3}, {"\xc2""\xa3",2} ,
{"\xe2""\x82""\xac",3}, {"\xe2""\x82""\xaf",3}, {"\xc2""\xa6",2} , {"\xc2""\xa7",2} ,
{"\xc2""\xa8",2} , {"\xc2""\xa9",2} , {"\xcd""\xba",2} , {"\xc2""\xab",2} ,
{"\xc2""\xac",2} , {"\xc2""\xad",2} , {"\x3f",1} , {"\xe2""\x80""\x95",3},
{"\xc2""\xb0",2} , {"\xc2""\xb1",2} , {"\xc2""\xb2",2} , {"\xc2""\xb3",2} ,
{"\xce""\x84",2} , {"\xce""\x85",2} , {"\xce""\x86",2} , {"\xc2""\xb7",2} ,
{"\xce""\x88",2} , {"\xce""\x89",2} , {"\xce""\x8a",2} , {"\xc2""\xbb",2} ,
{"\xce""\x8c",2} , {"\xc2""\xbd",2} , {"\xce""\x8e",2} , {"\xce""\x8f",2} ,
{"\xce""\x90",2} , {"\xce""\x91",2} , {"\xce""\x92",2} , {"\xce""\x93",2} ,
{"\xce""\x94",2} , {"\xce""\x95",2} , {"\xce""\x96",2} , {"\xce""\x97",2} ,
{"\xce""\x98",2} , {"\xce""\x99",2} , {"\xce""\x9a",2} , {"\xce""\x9b",2} ,
{"\xce""\x9c",2} , {"\xce""\x9d",2} , {"\xce""\x9e",2} , {"\xce""\x9f",2} ,
{"\xce""\xa0",2} , {"\xce""\xa1",2} , {"\x3f",1} , {"\xce""\xa3",2} ,
{"\xce""\xa4",2} , {"\xce""\xa5",2} , {"\xce""\xa6",2} , {"\xce""\xa7",2} ,
{"\xce""\xa8",2} , {"\xce""\xa9",2} , {"\xce""\xaa",2} , {"\xce""\xab",2} ,
{"\xce""\xac",2} , {"\xce""\xad",2} , {"\xce""\xae",2} , {"\xce""\xaf",2} ,
{"\xce""\xb0",2} , {"\xce""\xb1",2} , {"\xce""\xb2",2} , {"\xce""\xb3",2} ,
{"\xce""\xb4",2} , {"\xce""\xb5",2} , {"\xce""\xb6",2} , {"\xce""\xb7",2} ,
{"\xce""\xb8",2} , {"\xce""\xb9",2} , {"\xce""\xba",2} , {"\xce""\xbb",2} ,
{"\xce""\xbc",2} , {"\xce""\xbd",2} , {"\xce""\xbe",2} , {"\xce""\xbf",2} ,
{"\xcf""\x80",2} , {"\xcf""\x81",2} , {"\xcf""\x82",2} , {"\xcf""\x83",2} ,
{"\xcf""\x84",2} , {"\xcf""\x85",2} , {"\xcf""\x86",2} , {"\xcf""\x87",2} ,
{"\xcf""\x88",2} , {"\xcf""\x89",2} , {"\xcf""\x8a",2} , {"\xcf""\x8b",2} ,
{"\xcf""\x8c",2} , {"\xcf""\x8d",2} , {"\xcf""\x8e",2} , {"\x3f",1}
};
int main() {
std::string input{"āāāčččēēēē"};
std::string output;
for (auto c : input) {
output.append(itou[static_cast<uint8_t>(c)]);
}
std::cout << output << std::endl;
}
string FirstName = res->getString("test2col");
Documentation for MySQL Connector: https://dev.mysql.com/doc/dev/connector-cpp/8.0/
It does not seem to say much about this, so thanks for any help!
Here is a code example, based on the solutions in the comments, that produces another error:
157
#include <iostream>
#include <cppconn/driver.h>
#include <cppconn/exception.h>
#include <cppconn/resultset.h>
#include <cppconn/statement.h>
#include <cppconn/prepared_statement.h>
#include <string>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <cstring>
#include <filesystem>
#include <codecvt>
#include <cstdint>
#include <locale>
Error C4996: 'std::wstring_convert<std::codecvt_utf8<wchar_t,1114111,(std::codecvt_mode)0>,wchar_t,std::allocator<wchar_t>,std::allocator<char>>::to_bytes': warning STL4017: std::wbuffer_convert, std::wstring_convert, and the <codecvt> header (containing std::codecvt_mode, std::codecvt_utf8, std::codecvt_utf16, and std::codecvt_utf8_utf16) are deprecated in C++17. (The std::codecvt class template is NOT deprecated.) The C++ Standard doesn't provide equivalent non-deprecated functionality; consider using MultiByteToWideChar() and WideCharToMultiByte() from <Windows.h> instead. You can define _SILENCE_CXX17_CODECVT_HEADER_DEPRECATION_WARNING or _SILENCE_ALL_CXX17_DEPRECATION_WARNINGS to acknowledge that you have received this warning.
```cpp
try
{
std::unique_ptr<sql::Connection> connection{ nullptr };
try {
sql::Driver* driver = ::get_driver_instance();
//sql::Connection* con;
//sql::Statement *stmt;
//sql::ResultSet* res;
//sql::Statement* pstmt;
sql::ConnectOptionsMap connection_options{};
connection_options["hostName"] = "tcp://127.0.0.1:3306"; // Replace with your log-in
connection_options["userName"] = "root"; // ...
connection_options["password"] = "parole123!"; // ...
connection_options["schema"] = "test"; // ...
connection_options["characterSetResults"] = "latin7_general_ci";
connection_options["OPT_CHARSET_NAME"] = "latin7_general_ci";
connection_options["OPT_SET_CHARSET_NAME"] = "latin7_general_ci";
connection.reset(driver->connect(connection_options));
driver = get_driver_instance();
/* Create a connection */
//con = driver->connect("tcp://127.0.0.1:3306", "root", "parole123!");
//con->setClientOption("characterSetResults", "UTF8");
/* Connect to the MySQL test database */
//con->setSchema("test");
//pstmt = con->createStatement();
std::string const some_query = "SELECT * FROM test2";
std::unique_ptr<sql::Statement> statement{ connection->createStatement() };
//res = pstmt->executeQuery("SELECT * FROM test2");
std::unique_ptr<sql::ResultSet> res{ statement->executeQuery(some_query) };
//pstmt->setInt(1, 1);
//pstmt->setString(1, str2);
//res = pstmt->executeQuery();
/* Fetch in reverse = descending order! */
/// loop that will use the MySQL data array
//res->afterLast();
while (res->next()) {
std::string const FILE_NAME = res->getString("test2col");
string locations2 = ("C:\\Users\\Janis\\Desktop\\TEST2\\");
string txtt = (".txt");
string copy2 = ("copy /-y ");
string space = " ";
string PACIENTI2 = "C:\\PACIENTI\\";
string element = copy2 + locations2 + FILE_NAME + txtt;
//string StartTime = res->getString("StartTime");
//string VisitID = res->getString("VisitID");
//string LastModified = res->getString("LastModified");
//string Id = res->getString("Id");
//string PatientId = res->getString("PatientId");
for (auto& p2 : fs::directory_iterator("C:\\Users\\Janis\\Desktop\\TEST2\\")) {
if (FILE_NAME != p2.path().string()) {
string cmd = element + space + PACIENTI2 + FILE_NAME + txtt;
FILE* pipe = _popen(cmd.c_str(), "r");
cout << cmd << endl;
/*if (pipe == NULL)
{
return;
}
char buffer[128];
std::string result = "";
while (!feof(pipe))
{
if (fgets(buffer, 128, pipe) != NULL)
{
result += buffer;
}
}*/
//std::cout << "Results: " << std::endl << result << std::endl ;
//_pclose(pipe);
}
}
}
//delete res;
//delete pstmt;
//delete con;
}
catch (sql::SQLException& ex) {
std::cerr << "Error occurred when connecting to SQL database: " << ex.what() << "(" << ex.getErrorCode() << ").";
}
}
catch (sql::SQLException& e)
{
/// more info is not implemented
//cout << "# ERR: SQLException in " << __FILE__;
//cout << "(" << __FUNCTION__ << ") on line " << __LINE__ << endl;
/* what() (derived from std::runtime_error) fetches error message */
//cout << "# ERR: " << e.what();
//cout << " (MySQL error code: " << e.getErrorCode();
cout << "# ERR: SQLException in " << endl;
}
```
copy /-y C:\Users\username\Desktop\TEST2\J─ünis.txt C:\PACIENTI\J─ünis.txt
copy /-y C:\Users\username\Desktop\TEST2\Ann─ü.txt C:\PACIENTI\Ann─ü.txt
Instead, it should be:
copy /-y C:\Users\Janis\Desktop\TEST2\Jānis.txt C:\PACIENTI\Jānis.txt
copy /-y C:\Users\Janis\Desktop\TEST2\Annā.txt C:\PACIENTI\Annā.txt
I think the problem in your case is not related to std::wstring. The 8-bit std::string is sufficient for UTF-8 (creating a simple std::string with the special characters "āàčīēļš" works fine), whereas the element size of std::wstring depends on the operating system: 2 bytes on Windows, 4 bytes on Linux (more information here and here). Also, if you look at the getString function, you will see that it takes and returns an sql::SQLString, and the sql::SQLString class is just a simple wrapper around a std::string.
I think you have to specify UTF-8 as the default character set for MySQL. To do so, pass the following connection options when connecting to the database:
std::unique_ptr<sql::Connection> connection {nullptr};
try {
sql::Driver* driver = ::get_driver_instance();
sql::ConnectOptionsMap connection_options {};
connection_options["hostName"] = url; // Replace with your log-in
connection_options["userName"] = username; // ...
connection_options["password"] = password; // ...
connection_options["schema"] = schema; // ...
connection_options["characterSetResults"] = "utf8";
connection_options["OPT_CHARSET_NAME"] = "utf8";
connection_options["OPT_SET_CHARSET_NAME"] = "utf8";
connection.reset(driver->connect(connection_options));
} catch (sql::SQLException& ex) {
std::cerr << "Error occurred when connecting to SQL database: " << ex.what() << "(" << ex.getErrorCode() << ").";
}
Then you should be able to continue querying your database as follows:
std::string const some_query = "SELECT * FROM some_table_name;";
std::unique_ptr<sql::Statement> statement {connection->createStatement()};
std::unique_ptr<sql::ResultSet> result {statement->executeQuery(some_query)};
while (result->next()) {
std::string const some_field = result->getString("some_field_name");
// Process: e.g. display with std::cout << some_field << std::endl;
}
The problem that now emerges when you create file names from the string or print it to the console is Windows itself (I had only tested the code on Linux before, so I did not run into this issue). By default the console uses a legacy code page, not UTF-8. If you output something like āàčīēļš, it will not be displayed correctly, whether you use std::cout or std::wcout in combination with std::wstring. Instead you get ─ü├á─ì─½─ô─╝┼í.
If you dump the bytes with
#include <iomanip>
#include <iostream>
#include <string>
// Print each byte of str as two uppercase hex digits.
void dump_bytes(std::string const& str) {
    std::cout << std::hex << std::uppercase << std::setfill('0');
    for (unsigned char c : str) {
        std::cout << std::setw(2) << static_cast<int>(c) << ' ';
    }
    std::cout << std::dec << std::endl;
}
it will output C4 81 C3 A0 C4 8D C4 AB C4 93 C4 BC C5 A1. Plugging that back into a byte-to-UTF-8 converter such as this one does in fact give you āàčīēļš. So the string was read correctly; Windows is just not displaying it correctly. The following, in combination with the last section (specifying UTF-8 as the default character set in MySQL), should fix all your issues:
A call to SetConsoleOutputCP(CP_UTF8); from windows.h at the start of the program will fix the console output:
#include <cstdlib>
#include <iostream>
#include <string>
#include <windows.h>
int main() {
// Forces console output to UTF8
SetConsoleOutputCP(CP_UTF8);
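// Note: this compiles as C++17; in C++20 a u8 literal has type char8_t and cannot initialize std::string
std::string const name = u8"āàčīēļš";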
std::cout << name << std::endl; // Actually outputs āàčīēļš
return EXIT_SUCCESS;
}
Similarly, you will have to adapt the routine that creates the files, because file names are not treated as UTF-8 by default either (the content of the files is not an issue, but the file name itself is). Use std::ofstream from <fstream> in combination with std::filesystem::u8path from the C++17 <filesystem> library to resolve this:
#include <cstdlib>
#include <filesystem>
#include <fstream>
#include <string>
int main() {
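// Note: this compiles as C++17; in C++20 a u8 literal has type char8_t and cannot initialize std::string
std::string const name = u8"āàčīēļš";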
std::ofstream f(std::filesystem::u8path(name + ".txt")); // Creates a file āàčīēļš.txt
f << name << std::endl; // Writes āàčīēļš to it
return EXIT_SUCCESS;
}
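The same idea can replace the copy command that the question runs through _popen. The following is only a sketch: the helper names write_named_file and copy_named_file are made up for illustration, and it assumes the name read from the database is already UTF-8. It writes and copies a file with std::filesystem, so the name never passes through cmd.exe and its code page.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: create <dir>/<utf8_name>.txt and write contents to it.
// utf8_name is assumed to hold UTF-8 bytes.
inline bool write_named_file(const fs::path& dir, const std::string& utf8_name,
                             const std::string& contents) {
    std::error_code ec;
    fs::create_directories(dir, ec);
    if (ec) return false;
    std::ofstream f(dir / fs::u8path(utf8_name + ".txt"));
    f << contents;
    return static_cast<bool>(f);
}

// Hypothetical helper: copy <src_dir>/<utf8_name>.txt to <dst_dir>/<utf8_name>.txt.
// Returns false instead of throwing when the copy fails (e.g. missing source).
inline bool copy_named_file(const fs::path& src_dir, const fs::path& dst_dir,
                            const std::string& utf8_name) {
    const fs::path file = fs::u8path(utf8_name + ".txt");
    std::error_code ec;
    fs::create_directories(dst_dir, ec);
    if (ec) return false;
    fs::copy_file(src_dir / file, dst_dir / file,
                  fs::copy_options::overwrite_existing, ec);
    return !ec;
}
```

Note that overwrite_existing silently replaces an existing target, unlike copy /-y, which asks first; pick another std::filesystem::copy_options value if that matters. std::filesystem::u8path is deprecated in C++20, where a path can be constructed from a char8_t string instead.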
For that you need to convert your string into std::wstring.
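#include <codecvt>
#include <locale>
#include <string>
std::string FirstName = res->getString("FirstName");
// std::wstring_convert and <codecvt> are deprecated since C++17
std::wstring firstNameWstring = std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes(FirstName);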
C4 81 C3 A0 C4 8D C4 AB C4 93 C4 BC C5 A1, when interpreted as various encodings:
utf8: āàčīēļš
latin1: Äà Äīēļš
latin7: ÄĆÄ Ä Ä«ÄļŔ
euckr: 훮횪훾카휆캬큄
Presumably, you wanted utf8 (or utf8mb4).
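To see where a row like the latin1 one comes from, here is a minimal sketch (the helper name reinterpret_latin1_as_utf8 is made up for illustration). It treats every input byte as one Latin-1 character and re-encodes it as UTF-8, which is roughly what happens when UTF-8 bytes are declared to be latin1.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: treat each byte as one ISO-8859-1 character and
// re-encode it as UTF-8. In Latin-1 the byte value equals the code point.
inline std::string reinterpret_latin1_as_utf8(std::string_view bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));                 // ASCII stays one byte
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));   // lead byte: C2 or C3
            out.push_back(static_cast<char>(0x80 | (b & 0x3F))); // continuation byte
        }
    }
    return out;
}
```

For example, the two UTF-8 bytes C4 81 (ā) come out as C3 84 C2 81, which displays as Ä followed by an invisible control character.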
The connection parameters specify the encoding of the client. If that hex string is coming from the client, then specify utf8 (or utf8mb4).
The encoding in the database table can be the same or something different. That is specified in the schema; for example:
CREATE TABLE ...
stuff VARCHAR(99) CHARACTER SET utf8
...
When INSERTing and SELECTing, MySQL will convert (if necessary) between the client's encoding and the column's encoding.
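As a rough illustration of what such a conversion amounts to for a latin7 column, here is a sketch with a deliberately partial table. One thing worth knowing: MySQL documents its latin7 character set as ISO 8859-13 (Baltic), not ISO-8859-7 (Greek), so a Greek lookup table will not produce Latvian letters. The helper names and the choice to map only the Latvian letters are assumptions for illustration; in practice you would let the server do this by declaring the client encoding.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>

// Append the UTF-8 encoding of a code point below U+0800
// (enough for every entry in the table below).
inline void append_utf8(std::string& out, std::uint16_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Hypothetical helper: convert ISO 8859-13 bytes to UTF-8.
// PARTIAL table (assumption): only the Latvian letters are mapped;
// any other byte >= 0x80 becomes '?'.
inline std::string latin7_to_utf8(std::string_view in) {
    static const std::unordered_map<unsigned char, std::uint16_t> table{
        {0xC2, 0x0100}, {0xE2, 0x0101},  // Ā ā
        {0xC8, 0x010C}, {0xE8, 0x010D},  // Č č
        {0xC7, 0x0112}, {0xE7, 0x0113},  // Ē ē
        {0xCC, 0x0122}, {0xEC, 0x0123},  // Ģ ģ
        {0xCE, 0x012A}, {0xEE, 0x012B},  // Ī ī
        {0xCD, 0x0136}, {0xED, 0x0137},  // Ķ ķ
        {0xCF, 0x013B}, {0xEF, 0x013C},  // Ļ ļ
        {0xD2, 0x0145}, {0xF2, 0x0146},  // Ņ ņ
        {0xD0, 0x0160}, {0xF0, 0x0161},  // Š š
        {0xDB, 0x016A}, {0xFB, 0x016B},  // Ū ū
        {0xDE, 0x017D}, {0xFE, 0x017E},  // Ž ž
    };
    std::string out;
    for (unsigned char b : in) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));  // ASCII is identical
        } else if (auto it = table.find(b); it != table.end()) {
            append_utf8(out, it->second);
        } else {
            out.push_back('?');                   // not covered by this sketch
        }
    }
    return out;
}
```

With this table the latin7 bytes 4A E2 6E 69 73 become 4A C4 81 6E 69 73, that is "Jānis" in UTF-8.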
Discovery:
To see the client settings:
SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin7 | <--
| character_set_connection | latin7 | <--
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | latin7 | <--
| character_set_server | utf8mb4 |
| character_set_system | utf8mb3 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
To see the column's encoding: SHOW CREATE TABLE tablename;.
To see the actual bytes in a column:
SELECT col, HEX(col) ...
Virtually every other technique is error-prone.
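If you want to check such HEX(col) output from C++, a small sketch like the following may help. The function names bytes_from_hex and is_valid_utf8 are made up for illustration. It turns the hex text back into bytes and tests whether they form well-formed UTF-8; a latin7-encoded value will normally fail that test.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: parse hex digits (spaces allowed) into raw bytes.
// Returns std::nullopt on a non-hex character or an odd number of digits.
inline std::optional<std::string> bytes_from_hex(std::string_view hex) {
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    };
    std::string out;
    int high = -1;  // pending high nibble, or -1 if none
    for (char c : hex) {
        if (c == ' ') continue;
        const int v = nibble(c);
        if (v < 0) return std::nullopt;
        if (high < 0) {
            high = v;
        } else {
            out.push_back(static_cast<char>((high << 4) | v));
            high = -1;
        }
    }
    if (high >= 0) return std::nullopt;
    return out;
}

// Hypothetical helper: true if s is well-formed UTF-8 (rejects truncated
// sequences, overlong forms, surrogates and values above U+10FFFF).
inline bool is_valid_utf8(std::string_view s) {
    static constexpr std::uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if (b0 < 0x80) { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte
        if (i + len > s.size()) return false;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }
        if (len > 1 && cp < min_cp[len]) return false;  // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate
        if (cp > 0x10FFFF) return false;
        i += len;
    }
    return true;
}
```

For instance, the bytes parsed from "C4 81 C3 A0" pass the check, while "Jānis" stored as latin7 (4A E2 6E 69 73) does not, because E2 would have to be followed by two continuation bytes.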
Note that the accented characters you mentioned all have UTF-8 hex like
C3 xx
C4 xx
C5 xx
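Those lead bytes follow directly from how UTF-8 packs a code point. A minimal encoder sketch (the name encode_utf8 is made up here, and it assumes the input is a valid Unicode scalar value) makes that visible: for two-byte sequences the lead byte is 0xC0 | (cp >> 6), so U+00C0..U+00FF start with C3, U+0100..U+013F with C4, and U+0140..U+017F with C5.

```cpp
#include <string>

// Hypothetical helper: encode one code point as UTF-8.
// Assumes cp is a valid scalar value (no surrogate, at most U+10FFFF).
inline std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

So à (U+00E0) encodes to C3 A0, ā (U+0101) to C4 81, and š (U+0161) to C5 A1, matching the byte dump in the other answer.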
Terminology:
"Character set" is the encoding. Examples: latin1, latin7, utf8, utf8mb4.
"Collation" refers to how to sort and compare values. Example: latin7_general_ci ("ci" is short for "case insensitive").
(Your connection parameters are confusing the character set and collation.)
"utf8" or "utf8mb3" is MySQL's 3-byte encoding. Going away, but still quite valid for all European languages, plus much of the rest of the world.
"utf8mb4" is MySQL's 4-byte encoding. Equivalent to UTF-8.
"UTF-8" is the rest of the world's name for it.
Common mistakes:
Failure to declare the client's encoding.
Attempts to convert the encoding in the client.
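On the utf8mb3 versus utf8mb4 distinction from the terminology list: characters above U+FFFF take four bytes in UTF-8, and those are the ones a 3-byte column cannot store. A quick way to check whether a given string needs utf8mb4 is sketched below; the function name is made up, and the input is assumed to be valid UTF-8 already.

```cpp
#include <string_view>

// Hypothetical helper: true if the (assumed valid) UTF-8 text contains a
// 4-byte sequence, i.e. a character that does not fit into utf8mb3.
inline bool needs_utf8mb4(std::string_view utf8) {
    for (unsigned char b : utf8) {
        if ((b & 0xF8) == 0xF0) return true;  // lead byte of a 4-byte sequence
    }
    return false;
}
```

The Latvian letters in the question all encode to two bytes, so either character set can hold them; an emoji such as U+1F600 (F0 9F 98 80) would need utf8mb4.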
The problem is that different parts of your code use different encodings for text data.
Since MySQL uses UTF-8, you can simply change your program to use UTF-8 everywhere.
This can be achieved by build flags:
cl /source-charset:utf-8 /execution-charset:utf-8 /EHsc YourSources.cpp
/source-charset:utf-8 says that your source file uses UTF-8 encoding. Since your source could use a different encoding, make sure you pass the correct parameter; it is highly recommended, though, to encode source code in a universal standard (so UTF-8), so developers from different countries can work on the code without problems.
/execution-charset:utf-8 says that string literals stored in the executable should be encoded as UTF-8.
Now the only problem is the console (cmd). By default the Windows console uses an encoding specific to the language setting of your system (inherited for compatibility with old DOS applications). As a result, once you force your executable to use UTF-8, the console will by default print those strings incorrectly.
Changing the code page of the console to UTF-8 fixes the issue.
Here is a test program I wrote to demonstrate how to handle encoding in C++:
#include <iostream>
#include <locale>
#include <exception>
#include <string>
void setupLocale(int argc, const char *argv[])
{
std::locale def{""};
std::locale::global(argc > 1 ? std::locale{argv[1]} : def);
auto streamLocale = argc > 2 ? std::locale{argv[2]} : def;
std::cout.imbue(streamLocale);
std::cin.imbue(streamLocale);
}
void printSeparator()
{
std::cout << "---------\n";
}
void printTestStuff()
{
std::cout << "Wester Europe: āāāčččēēēēßÞÖöñÅÃ\n";
std::cout << "Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă\n";
std::cout << "China: 字集碼是把字符集中的字符编码为指定集合中某一对象\n";
std::cout << "Korean: 줄여서 인코딩은 사용자가 입력한\n";
}
int main(int argc, const char *argv[]) {
try{
setupLocale(argc, argv);
printSeparator();
printTestStuff();
printSeparator();
}
catch(const std::exception& e)
{
std::cerr << e.what() << '\n';
}
}
When you copy and paste that program, remember to save the source as UTF-8 (since I used a wide range of characters for testing, most other encodings will simply fail and the characters will not be displayed correctly when the file is opened).
Now this is what I see on my terminal (copy paste):
C:\Users\User\Downloads>cl /source-charset:utf-8 /execution-charset:utf-8 /EHsc encodings.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29336 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
encodings.cpp
Microsoft (R) Incremental Linker Version 14.28.29336.0
Copyright (C) Microsoft Corporation. All rights reserved.
/out:encodings.exe
encodings.obj
C:\Users\User\Downloads>chcp
Active code page: 437
C:\Users\User\Downloads>encodings.exe
---------
Wester Europe: Ä?Ä?Ä?Ä?Ä?Ä?Ä"Ä"Ä"Ä"AYAzA-AA±A.Aƒ
Central Europe: Ä.Ä,A"A3Å?Å,Ä~ÄTżÄ╪źŰűA?A½Ä,ă
China: å--é>+碼æ~_æSSå--ç¬▌é>+ä,-çs,å--ç¬▌ç¼-ç ?ä,ºæO╪årsé>+å?^ä,-æY?ä,?å_1象
Korean: ì,ì-¬ì,o ì?,ì½"ë"cì?? ì,¬ìscìz?ê°? ìz.ë ¥ío
---------
C:\Users\User\Downloads>encodings.exe .65001
---------
Wester Europe: aaaccceeeeß_ÖöñÅA
Central Europe: aAOóLlEezczUuYyAa
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>encodings.exe .65001 .437
---------
Wester Europe: aaaccceeeeß_ÖöñÅA
Central Europe: aAOóLlEezczUuYyAa
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>encodings.exe .65001 .1250
---------
Wester Europe: aaaccceeeeß_ÖöñÅA
Central Europe: aAOóLlEezczUuYyAa
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>chcp 1250
Active code page: 1250
C:\Users\User\Downloads>encodings.exe .65001 .1250
---------
Wester Europe: aaačččeeeeß?ÖönAA
Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>chcp 65001
Active code page: 65001
C:\Users\User\Downloads>encodings.exe
---------
Wester Europe: ÄÄÄÄÄÄēēēēßÞÖöñÅÃ
Central Europe: ąĄÓóÅłĘężćźŰűÃýĂă
China: å—集碼是把å—符集ä¸çš„å—符编ç 为指定集åˆä¸æŸä¸€å¯¹è±¡
Korean: 줄여서 ì¸ì½”ë”©ì€ ì‚¬ìš©ìžê°€ ìž…ë ¥í•œ
---------
C:\Users\User\Downloads>encodings.exe .65001
---------
Wester Europe: āāāčččēēēēßÞÖöñÅÃ
Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă
China: 字集碼是把字符集中的字符编码为指定集合中某一对象
Korean: 줄여서 인코딩은 사용자가 입력한
---------
C:\Users\User\Downloads>encodings.exe .65001 .65001
---------
Wester Europe: āāāčččēēēēßÞÖöñÅÃ
Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă
China: 字集碼是把字符集中的字符编码为指定集合中某一对象
Korean: 줄여서 인코딩은 사용자가 입력한
---------
C:\Users\User\Downloads>
As you can see, if the code page and encodings are set up properly, everything just works (without using the Windows API).
On my machine cmd.exe is unable to display Asian characters properly, but this is just a font issue: when I copy and paste the cmd.exe content into another program, everything is displayed correctly (provided the encodings are set up correctly in my program).
Note also that if a conversion is not possible, question marks are displayed or a fallback to some other character is performed (for example, Å was converted to A when the Windows-1250 encoding and code page were used).
Play a bit with this program; I'm pretty sure it should be enough to give you the full picture.
Most important:
std::locale::global defines which encoding the main program uses. It means that streams assume std::string values are encoded this way.
std::iostream::imbue defines the encoding of the stream, so what will be written to a file.
Together, these two settings define both sides of the conversion! You do not have to do the conversion manually!
Visual Studio lets you change the file encoding.
So if you type the characters into a variable, the file should have that encoding.
Set the option in Visual Studio or programmatically:
Open the project Property Pages dialog box. ...
Select the Configuration Properties > C/C++ > Command Line property page.
In Additional Options, add the /utf-8 option to specify your preferred encoding.
Choose OK to save your changes.
link
https://learn.microsoft.com/en-us/cpp/build/reference/utf-8-set-source-and-executable-character-sets-to-utf-8?view=msvc-160
Now for the database, make sure it has the right character set.
https://www.a2hosting.com/kb/developer-corner/mysql/convert-mysql-database-utf-8#:~:text=mysql-,To%20change%20the%20character%20set%20encoding%20to%20UTF%2D8%20for,q%20at%20the%20mysql%3E%20prompt.
Once you have the right encoding for the DB and Visual Studio, you can save, edit, and read those characters from C++ code.