CMD cannot read input characters correctly - c++

I tried to use the system() function to run a cmd command, but I can't get the command output. My Windows is Italian, and when I put / in the string passed to system(), cmd actually receives - instead of /. I tried the same command directly in cmd and it also receives - instead of /. I tried chcp 437 to switch cmd to the English code page, but it didn't work.
example :
system("net user xxx xxxx /add");
the command is getting :
net user xxx xxxx -add
I don't want something that only works on Italian Windows; it should work for other languages too. How can I solve this problem?

You should never use system(). You are programming in C++. There is no need to use system() since you have access to everything in the, well, system :D. system() was written in C after all.
And there is the security risk: someone could replace system() or the command you are trying to run using system() in your machine and make not-nice-things in your system.
You can change the code page in your code before calling system(), using SetConsoleOutputCP(), which lives in windows.h.
1252 is the Latin code page and should do OK for Italian. 65001 is the UTF-8 code page and should also work well.
Running your program in the "new" Windows Terminal is also an option, since it is Unicode.
Pass a string to system() and not a literal. This way you can check that it holds what you want before the call.
It is good practice to save the code page in use before changing it and to restore it on exit.
A C++ Example
This program
takes an array of commands
const char* command[] =
{
"DIR .\\*.* /O:D",
"NET USER /Add /?"
};
and runs them on the console. The commands use slashes and backslashes and output text, so you can test a bit more. You can also just edit the array and add new commands to test.
You can try alternative code pages. Here I used 65001, the one for UTF-8:
int originalOCP = GetConsoleOutputCP();
std::cout << "Original CodePage: " << originalOCP << "\n";
SetConsoleOutputCP(65001);
std::cout << "CodePage now is " << GetConsoleOutputCP() << "\n";
The command is written on the console before being passed to system()
std::cout <<
"\n\n\t==> command " <<
i << " is '" <<
command[i] << "'\n\n";
system(command[i]);
The output in Portuguese Windows
Original CodePage: 850
CodePage now is 65001
==> command 0 is 'DIR .\*.* /O:D'
O volume na unidade C não tem nome.
O Número de Série do Volume é 7E52-1BF2
Pasta de C:\Users\toninho\source\repos\ConsoleApplication8\ConsoleApplication8
29/10/2020 10:21 168 ConsoleApplication8.vcxproj.user
29/10/2020 10:38 974 ConsoleApplication8.vcxproj.filters
29/10/2020 10:38 7.199 ConsoleApplication8.vcxproj
29/10/2020 10:59 676 a.cpp
29/10/2020 10:59 <DIR> ..
29/10/2020 10:59 <DIR> .
29/10/2020 10:59 <DIR> Debug
4 arquivo(s) 9.017 bytes
3 pasta(s) 128.838.795.264 bytes disponíveis
==> command 1 is 'NET USER /Add /?'
A sintaxe deste comando é:
NET USER
[nome de usuário [senha | *] [opções]] [/DOMAIN]
nome de usuário {senha | *} /ADD [opções] [/DOMAIN]
nome de usuário [/DELETE] [/DOMAIN]
nome de usuário [/TIMES:{horários | ALL}]
nome de usuário [/ACTIVE: {YES | NO}]
CodePage now is 850
The code
#include <cstdlib>
#include <iostream>
#include <windows.h>
int main(int argc, char** argv)
{
const char* command[] =
{
"DIR .\\*.* /O:D",
"NET USER /Add /?"
};
int originalOCP = GetConsoleOutputCP();
std::cout << "Original CodePage: " << originalOCP << "\n";
SetConsoleOutputCP(65001);
std::cout << "CodePage now is " << GetConsoleOutputCP() << "\n";
for (int i = 0; i < sizeof(command) / sizeof(char*); i += 1)
{
std::cout <<
"\n\n\t==> command " <<
i << " is '" <<
command[i] << "'\n\n";
system(command[i]);
};
SetConsoleOutputCP(originalOCP);
std::cout << "CodePage now is " << GetConsoleOutputCP() << "\n";
return 0;
}

Related

How to use UTF8 characters in DEFAULT c++ project OR when using mysql connector for c++ in visual studio 2019 (Latin7_general_ci to UTF-8)?

It seems that Latin1 (ISO-8859-1) can't even store these special characters, so the format of the database must be Latin7 (ISO-8859-7). I could not really find an easy function to do the conversion; do I really have to write one myself?
Update: I made some progress, as described in this question: Special characters in Visual Studio 2019 C++ project AND executing CMD commands with them
But the problem also appears with default project settings, without any MySQL libraries or anything else, in files that are all correctly encoded (UTF-8), even when the compile flags are added and even when "Fix File Encoding" is installed.
#include <iostream>
#include <string>
int main() {
    std::string output = "āāāčččēēēē";
    std::cout << output << std::endl;
}
Intro rant: this is my 3rd post about MySQL Connector, because I just could not find basic information about it on Google at all (MySQL and MariaDB libraries in C++ using cmake, mingw). First there was no explanation that GCC would not be able to compile it for Windows systems, then I had no luck finding out how to use datetime and int objects in the output from the database, until I posted the issue here (How to return time, date data fields in c++ mysql oracle vs17?).
My issue now is that strings returned from database have special characters - āàčīēļš etc.
Column:test2col
Collation:Latin7_general_ci
So here is code that might work but does not, perhaps because of the table or something else being wrong; any expertise might help:
#include <cstdint>
#include <iostream>
#include <string>
#include <string_view>
std::string_view itou[256] {
{"\x00",1} , {"\x01",1} , {"\x02",1} , {"\x03",1} ,
{"\x04",1} , {"\x05",1} , {"\x06",1} , {"\x07",1} ,
{"\x08",1} , {"\x09",1} , {"\x0a",1} , {"\x0b",1} ,
{"\x0c",1} , {"\x0d",1} , {"\x0e",1} , {"\x0f",1} ,
{"\x10",1} , {"\x11",1} , {"\x12",1} , {"\x13",1} ,
{"\x14",1} , {"\x15",1} , {"\x16",1} , {"\x17",1} ,
{"\x18",1} , {"\x19",1} , {"\x1a",1} , {"\x1b",1} ,
{"\x1c",1} , {"\x1d",1} , {"\x1e",1} , {"\x1f",1} ,
{"\x20",1} , {"\x21",1} , {"\x22",1} , {"\x23",1} ,
{"\x24",1} , {"\x25",1} , {"\x26",1} , {"\x27",1} ,
{"\x28",1} , {"\x29",1} , {"\x2a",1} , {"\x2b",1} ,
{"\x2c",1} , {"\x2d",1} , {"\x2e",1} , {"\x2f",1} ,
{"\x30",1} , {"\x31",1} , {"\x32",1} , {"\x33",1} ,
{"\x34",1} , {"\x35",1} , {"\x36",1} , {"\x37",1} ,
{"\x38",1} , {"\x39",1} , {"\x3a",1} , {"\x3b",1} ,
{"\x3c",1} , {"\x3d",1} , {"\x3e",1} , {"\x3f",1} ,
{"\x40",1} , {"\x41",1} , {"\x42",1} , {"\x43",1} ,
{"\x44",1} , {"\x45",1} , {"\x46",1} , {"\x47",1} ,
{"\x48",1} , {"\x49",1} , {"\x4a",1} , {"\x4b",1} ,
{"\x4c",1} , {"\x4d",1} , {"\x4e",1} , {"\x4f",1} ,
{"\x50",1} , {"\x51",1} , {"\x52",1} , {"\x53",1} ,
{"\x54",1} , {"\x55",1} , {"\x56",1} , {"\x57",1} ,
{"\x58",1} , {"\x59",1} , {"\x5a",1} , {"\x5b",1} ,
{"\x5c",1} , {"\x5d",1} , {"\x5e",1} , {"\x5f",1} ,
{"\x60",1} , {"\x61",1} , {"\x62",1} , {"\x63",1} ,
{"\x64",1} , {"\x65",1} , {"\x66",1} , {"\x67",1} ,
{"\x68",1} , {"\x69",1} , {"\x6a",1} , {"\x6b",1} ,
{"\x6c",1} , {"\x6d",1} , {"\x6e",1} , {"\x6f",1} ,
{"\x70",1} , {"\x71",1} , {"\x72",1} , {"\x73",1} ,
{"\x74",1} , {"\x75",1} , {"\x76",1} , {"\x77",1} ,
{"\x78",1} , {"\x79",1} , {"\x7a",1} , {"\x7b",1} ,
{"\x7c",1} , {"\x7d",1} , {"\x7e",1} , {"\x7f",1} ,
{"\xc2""\x80",2} , {"\xc2""\x81",2} , {"\xc2""\x82",2} , {"\xc2""\x83",2} ,
{"\xc2""\x84",2} , {"\xc2""\x85",2} , {"\xc2""\x86",2} , {"\xc2""\x87",2} ,
{"\xc2""\x88",2} , {"\xc2""\x89",2} , {"\xc2""\x8a",2} , {"\xc2""\x8b",2} ,
{"\xc2""\x8c",2} , {"\xc2""\x8d",2} , {"\xc2""\x8e",2} , {"\xc2""\x8f",2} ,
{"\xc2""\x90",2} , {"\xc2""\x91",2} , {"\xc2""\x92",2} , {"\xc2""\x93",2} ,
{"\xc2""\x94",2} , {"\xc2""\x95",2} , {"\xc2""\x96",2} , {"\xc2""\x97",2} ,
{"\xc2""\x98",2} , {"\xc2""\x99",2} , {"\xc2""\x9a",2} , {"\xc2""\x9b",2} ,
{"\xc2""\x9c",2} , {"\xc2""\x9d",2} , {"\xc2""\x9e",2} , {"\xc2""\x9f",2} ,
{"\xc2""\xa0",2} , {"\xe2""\x80""\x98",3}, {"\xe2""\x80""\x99",3}, {"\xc2""\xa3",2} ,
{"\xe2""\x82""\xac",3}, {"\xe2""\x82""\xaf",3}, {"\xc2""\xa6",2} , {"\xc2""\xa7",2} ,
{"\xc2""\xa8",2} , {"\xc2""\xa9",2} , {"\xcd""\xba",2} , {"\xc2""\xab",2} ,
{"\xc2""\xac",2} , {"\xc2""\xad",2} , {"\x3f",1} , {"\xe2""\x80""\x95",3},
{"\xc2""\xb0",2} , {"\xc2""\xb1",2} , {"\xc2""\xb2",2} , {"\xc2""\xb3",2} ,
{"\xce""\x84",2} , {"\xce""\x85",2} , {"\xce""\x86",2} , {"\xc2""\xb7",2} ,
{"\xce""\x88",2} , {"\xce""\x89",2} , {"\xce""\x8a",2} , {"\xc2""\xbb",2} ,
{"\xce""\x8c",2} , {"\xc2""\xbd",2} , {"\xce""\x8e",2} , {"\xce""\x8f",2} ,
{"\xce""\x90",2} , {"\xce""\x91",2} , {"\xce""\x92",2} , {"\xce""\x93",2} ,
{"\xce""\x94",2} , {"\xce""\x95",2} , {"\xce""\x96",2} , {"\xce""\x97",2} ,
{"\xce""\x98",2} , {"\xce""\x99",2} , {"\xce""\x9a",2} , {"\xce""\x9b",2} ,
{"\xce""\x9c",2} , {"\xce""\x9d",2} , {"\xce""\x9e",2} , {"\xce""\x9f",2} ,
{"\xce""\xa0",2} , {"\xce""\xa1",2} , {"\x3f",1} , {"\xce""\xa3",2} ,
{"\xce""\xa4",2} , {"\xce""\xa5",2} , {"\xce""\xa6",2} , {"\xce""\xa7",2} ,
{"\xce""\xa8",2} , {"\xce""\xa9",2} , {"\xce""\xaa",2} , {"\xce""\xab",2} ,
{"\xce""\xac",2} , {"\xce""\xad",2} , {"\xce""\xae",2} , {"\xce""\xaf",2} ,
{"\xce""\xb0",2} , {"\xce""\xb1",2} , {"\xce""\xb2",2} , {"\xce""\xb3",2} ,
{"\xce""\xb4",2} , {"\xce""\xb5",2} , {"\xce""\xb6",2} , {"\xce""\xb7",2} ,
{"\xce""\xb8",2} , {"\xce""\xb9",2} , {"\xce""\xba",2} , {"\xce""\xbb",2} ,
{"\xce""\xbc",2} , {"\xce""\xbd",2} , {"\xce""\xbe",2} , {"\xce""\xbf",2} ,
{"\xcf""\x80",2} , {"\xcf""\x81",2} , {"\xcf""\x82",2} , {"\xcf""\x83",2} ,
{"\xcf""\x84",2} , {"\xcf""\x85",2} , {"\xcf""\x86",2} , {"\xcf""\x87",2} ,
{"\xcf""\x88",2} , {"\xcf""\x89",2} , {"\xcf""\x8a",2} , {"\xcf""\x8b",2} ,
{"\xcf""\x8c",2} , {"\xcf""\x8d",2} , {"\xcf""\x8e",2} , {"\x3f",1}
};
int main() {
std::string input{"āāāčččēēēē"};
std::string output;
for (auto c : input) {
output.append(itou[static_cast<uint8_t>(c)]);
}
std::cout << output << std::endl;
}
string FirstName = res->getString("test2col");
Documentation for MySQL Connector: https://dev.mysql.com/doc/dev/connector-cpp/8.0/
It does not seem to say much about this, so thanks for any help!
Here is a code example, based on the solutions in the comments, that runs into another error:
#include <iostream>
#include <cppconn/driver.h>
#include <cppconn/exception.h>
#include <cppconn/resultset.h>
#include <cppconn/statement.h>
#include <cppconn/prepared_statement.h>
#include <string>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <cstring>
#include <filesystem>
#include <codecvt>
#include <cstdint>
#include <locale>
Severity Code Description Project File Line Suppression State Error C4996 'std::wstring_convert<std::codecvt_utf8<wchar_t,1114111,(std::codecvt_mode)0>,wchar_t,std::allocator<wchar_t>,std::allocator<char>>::to_bytes': warning STL4017: std::wbuffer_convert, std::wstring_convert, and the <codecvt> header (containing std::codecvt_mode, std::codecvt_utf8, std::codecvt_utf16, and std::codecvt_utf8_utf16) are deprecated in C++17. (The std::codecvt class template is NOT deprecated.) The C++ Standard doesn't provide equivalent non-deprecated functionality; consider using MultiByteToWideChar() and WideCharToMultiByte() from <Windows.h> instead. You can define _SILENCE_CXX17_CODECVT_HEADER_DEPRECATION_WARNING or _SILENCE_ALL_CXX17_DEPRECATION_WARNINGS to acknowledge that you have received this warning.
try
{
std::unique_ptr<sql::Connection> connection{ nullptr };
try {
sql::Driver* driver = ::get_driver_instance();
//sql::Connection* con;
//sql::Statement *stmt;
//sql::ResultSet* res;
//sql::Statement* pstmt;
sql::ConnectOptionsMap connection_options{};
connection_options["hostName"] = "tcp://127.0.0.1:3306"; // Replace with your log-in
connection_options["userName"] = "root"; // ...
connection_options["password"] = "parole123!"; // ...
connection_options["schema"] = "test"; // ...
connection_options["characterSetResults"] = "latin7_general_ci";
connection_options["OPT_CHARSET_NAME"] = "latin7_general_ci";
connection_options["OPT_SET_CHARSET_NAME"] = "latin7_general_ci";
connection.reset(driver->connect(connection_options));
driver = get_driver_instance();
/* Create a connection */
//con = driver->connect("tcp://127.0.0.1:3306", "root", "parole123!");
//con->setClientOption("characterSetResults", "UTF8");
/* Connect to the MySQL test database */
//con->setSchema("test");
//pstmt = con->createStatement();
std::string const some_query = "SELECT * FROM test2";
std::unique_ptr<sql::Statement> statement{ connection->createStatement() };
//res = pstmt->executeQuery("SELECT * FROM test2");
std::unique_ptr<sql::ResultSet> res{ statement->executeQuery(some_query) };
//pstmt->setInt(1, 1);
//pstmt->setString(1, str2);
//res = pstmt->executeQuery();
/* Fetch in reverse = descending order! */
///cikls kur izmantos mysql datu masvu
//res->afterLast();
while (res->next()) {
std::string const FILE_NAME = res->getString("test2col");
string locations2 = ("C:\\Users\\Janis\\Desktop\\TEST2\\");
string txtt = (".txt");
string copy2 = ("copy /-y ");
string space = " ";
string PACIENTI2 = "C:\\PACIENTI\\";
string element = copy2 + locations2 + FILE_NAME + txtt;
//string StartTime = res->getString("StartTime");
//string VisitID = res->getString("VisitID");
//string LastModified = res->getString("LastModified");
//string Id = res->getString("Id");
//string PatientId = res->getString("PatientId");
for (auto& p2 : fs::directory_iterator("C:\\Users\\Janis\\Desktop\\TEST2\\")) {
if (FILE_NAME != p2.path().string()) {
string cmd = element + space + PACIENTI2 + FILE_NAME + txtt;
FILE* pipe = _popen(cmd.c_str(), "r");
cout << cmd << endl;
/*if (pipe == NULL)
{
return;
}
char buffer[128];
std::string result = "";
while (!feof(pipe))
{
if (fgets(buffer, 128, pipe) != NULL)
{
result += buffer;
}
}*/
//std::cout << "Results: " << std::endl << result << std::endl ;
//_pclose(pipe);
}
}
}
//delete res;
//delete pstmt;
//delete con;
}
catch (sql::SQLException& ex) {
std::cerr << "Error occured when connecting to SQL data base: " << ex.what() << "(" << ex.getErrorCode() << ").";
}
}
catch (sql::SQLException& e)
{
///nav implementēts vairāk info
//cout << "# ERR: SQLException in " << __FILE__;
//cout << "(" << __FUNCTION__ << ") on line " << __LINE__ << endl;
/* what() (derived from std::runtime_error) fetches error message */
//cout << "# ERR: " << e.what();
//cout << " (MySQL error code: " << e.getErrorCode();
cout << "# ERR: SQLException in " << endl;
}
copy /-y C:\Users\username\Desktop\TEST2\J─ünis.txt C:\PACIENTI\J─ünis.txt
copy /-y C:\Users\username\Desktop\TEST2\Ann─ü.txt C:\PACIENTI\Ann─ü.txt
instead it should be
copy /-y C:\Users\Janis\Desktop\TEST2\Jānis.txt C:\PACIENTI\Jānis.txt
copy /-y C:\Users\Janis\Desktop\TEST2\Annā.txt C:\PACIENTI\Annā.txt
I think the problem in your case is not related to std::wstring: the 8-bit std::string should be sufficient for UTF-8 (creating a simple std::string with the special characters "āàčīēļš" works fine), whereas the element size of std::wstring depends on the operating system: 2 bytes on Windows, 4 bytes on Linux. In any case, if you have a look at the getString function you will see that it takes and returns an sql::SQLString, and the sql::SQLString class is just a simple wrapper around a std::string.
I think you have to specify utf-8 as the default character set for MySQL. To do that, specify the following connection options when connecting to the database:
std::unique_ptr<sql::Connection> connection {nullptr};
try {
sql::Driver* driver = ::get_driver_instance();
sql::ConnectOptionsMap connection_options {};
connection_options["hostName"] = url; // Replace with your log-in
connection_options["userName"] = username; // ...
connection_options["password"] = password; // ...
connection_options["schema"] = schema; // ...
connection_options["characterSetResults"] = "utf8";
connection_options["OPT_CHARSET_NAME"] = "utf8";
connection_options["OPT_SET_CHARSET_NAME"] = "utf8";
connection.reset(driver->connect(connection_options));
} catch (sql::SQLException& ex) {
std::cerr << "Error occurred when connecting to SQL data base: " << ex.what() << "(" << ex.getErrorCode() << ").";
}
Then you should be able to continue to query your data base as follows
std::string const some_query = "SELECT * FROM some_table_name;";
std::unique_ptr<sql::Statement> statement {connection->createStatement()};
std::unique_ptr<sql::ResultSet> result {statement->executeQuery(some_query)};
while (result->next()) {
std::string const some_field = result->getString("some_field_name");
// Process: e.g. display with std::cout << some_field << std::endl;
}
The problem that now emerges when you want to create file names with it or output it to console is Windows itself (I had tested the code before with Linux only and therefore did not run into this issue before!): By default it uses ANSI and not UTF-8. Even if you output something like āàčīēļš it will not output it correctly no matter if you are using a std::cout or std::wcout in combination with std::wstring. Instead it will output ─ü├á─ì─½─ô─╝┼í.
If you extract the bytes
#include <iomanip>
#include <iostream>
#include <string>
void dump_bytes(std::string const& str) {
std::cout << std::hex << std::uppercase << std::setfill('0');
for (unsigned char c : str) {
std::cout << std::setw(2) << static_cast<int>(c) << ' ';
}
std::cout << std::dec << std::endl;
return;
}
it will output C4 81 C3 A0 C4 8D C4 AB C4 93 C4 BC C5 A1, which, when plugged back into a byte-to-UTF-8 converter, does in fact give you āàčīēļš. So the string was read correctly, but Windows is just not displaying it correctly. The following, in combination with the previous section (specifying utf-8 as the default character set in MySQL), should fix all your issues:
A call to SetConsoleOutputCP(CP_UTF8); from windows.h at the start of the program will fix the console output:
#include <cstdlib>
#include <iostream>
#include <string>
#include <windows.h>
int main() {
// Forces console output to UTF8
SetConsoleOutputCP(CP_UTF8);
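// Note: from C++20 on, u8"" literals have type char8_t[] and no longer initialize std::string directly.
std::string const name = u8"āàčīēļš";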
std::cout << name << std::endl; // Actually outputs āàčīēļš
return EXIT_SUCCESS;
}
Similarly, you will have to adapt the routine that creates the files, because by default file names won't be UTF-8 either (the content of the files is not an issue, but the file name itself is!). Use std::ofstream from fstream in combination with std::filesystem::u8path from the C++17 filesystem library to resolve this (note that u8path is deprecated as of C++20):
#include <cstdlib>
#include <filesystem>
#include <fstream>
#include <string>
int main() {
std::string const name = u8"āàčīēļš";
std::ofstream f(std::filesystem::u8path(name + ".txt")); // Creates a file āàčīēļš.txt
f << name << std::endl; // Writes āàčīēļš to it
return EXIT_SUCCESS;
}
For that you need to convert your string into std::wstring:
#include <codecvt>
#include <locale>
#include <string>
string FirstName = res->getString("FirstName");
std::wstring firstNameWstring = std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes(FirstName);
C4 81 C3 A0 C4 8D C4 AB C4 93 C4 BC C5 A1, when interpreted as various encodings:
utf8: āàčīēļš
latin1: Äà Äīēļš
latin7: ÄĆÄ Ä Ä«ÄļŔ
euckr: 훮횪훾카휆캬큄
Presumably, you wanted utf8 (or utf8mb4).
The connection parameters specify the encoding of the client. If that hex string is coming from the client, then specify utf8 (or utf8mb4).
The encoding in the database table can be the same or something different. That is specified in the schema; for example:
CREATE TABLE ...
stuff VARCHAR(99) CHARACTER SET utf8
...
When INSERTing and SELECTing, MySQL will convert (if necessary) between the client's encoding and the column's encoding.
Discovery:
To see the client settings:
SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin7 | <--
| character_set_connection | latin7 | <--
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | latin7 | <--
| character_set_server | utf8mb4 |
| character_set_system | utf8mb3 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
To see the column's encoding: SHOW CREATE TABLE tablename;.
To see the actual bytes in a column:
SELECT col, HEX(col) ...
Virtually every other technique is error-prone.
Note that the accented characters you mentioned all have UTF-8 hex like
C3 xx
C4 xx
C5 xx
Terminology:
"Character set" is the encoding. Examples: latin1, latin7, utf8, utf8mb4.
"Collation" refers to how to sort and compare values. Example: latin7_general_ci ("ci" is short for "case insensitive and accent insensitive")
(Your connection parameters are confusing the character set and collation.)
"utf8" or "utf8mb3" is MySQL's 3-byte encoding. Going away, but still quite valid for all European languages, plus much of the rest of the world.
"utf8mb4" is MySQL's 4-byte encoding. Equivalent to UTF-8.
"UTF-8" is the rest of the world's name for it.
Common mistakes:
Failure to declare the client's encoding.
Attempts to convert the encoding in the client.
The problem is that different parts of your code use different encodings for text data.
Since MySQL uses UTF-8, you can simply change your program to use UTF-8 everywhere.
This can be achieved with build flags:
cl /source-charset:utf-8 /execution-charset:utf-8 /EHsc YourSources.cpp
/source-charset:utf-8 - says that your source file uses UTF-8 encoding. Since your source could be in a different encoding, make sure the parameter matches it; it is highly recommended, though, to encode source code in a universal standard (so UTF-8), so developers from different countries can work on the code without problems.
/execution-charset:utf-8 - says that string literals stored in the executable should be encoded as UTF-8.
Now the only remaining problem is the console (cmd). By default the Windows console uses an encoding specific to the language setting of your system (a legacy of compatibility with old DOS applications). As a result, once you have forced your executable to use UTF-8, the console will by default print those strings incorrectly.
Changing the code page of the console to UTF-8 fixes the issue.
Here is a test program I wrote to demonstrate how to handle encodings in C++:
#include <iostream>
#include <locale>
#include <exception>
#include <string>
void setupLocale(int argc, const char *argv[])
{
std::locale def{""};
std::locale::global(argc > 1 ? std::locale{argv[1]} : def);
auto streamLocale = argc > 2 ? std::locale{argv[2]} : def;
std::cout.imbue(streamLocale);
std::cin.imbue(streamLocale);
}
void printSeparator()
{
std::cout << "---------\n";
}
void printTestStuff()
{
std::cout << "Wester Europe: āāāčččēēēēßÞÖöñÅÃ\n";
std::cout << "Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă\n";
std::cout << "China: 字集碼是把字符集中的字符编码为指定集合中某一对象\n";
std::cout << "Korean: 줄여서 인코딩은 사용자가 입력한\n";
}
int main(int argc, const char *argv[]) {
try{
setupLocale(argc, argv);
printSeparator();
printTestStuff();
printSeparator();
}
catch(const std::exception& e)
{
std::cerr << e.what() << '\n';
}
}
When you copy and paste that program, remember to save the source as UTF-8 (since I used a wide range of characters for testing, most other encodings will simply fail: the characters will not be displayed correctly when the file is opened).
Now this is what I see on my terminal (copy paste):
C:\Users\User\Downloads>cl /source-charset:utf-8 /execution-charset:utf-8 /EHsc encodings.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29336 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
encodings.cpp
Microsoft (R) Incremental Linker Version 14.28.29336.0
Copyright (C) Microsoft Corporation. All rights reserved.
/out:encodings.exe
encodings.obj
C:\Users\User\Downloads>chcp
Active code page: 437
C:\Users\User\Downloads>encodings.exe
---------
Wester Europe: Ä?Ä?Ä?Ä?Ä?Ä?Ä"Ä"Ä"Ä"AYAzA-AA±A.Aƒ
Central Europe: Ä.Ä,A"A3Å?Å,Ä~ÄTżÄ╪źŰűA?A½Ä,ă
China: å--é>+碼æ~_æSSå--ç¬▌é>+ä,-çs,å--ç¬▌ç¼-ç ?ä,ºæO╪årsé>+å?^ä,-æY?ä,?å_1象
Korean: ì,ì-¬ì,o ì?,ì½"ë"cì?? ì,¬ìscìz?ê°? ìz.ë ¥ío
---------
C:\Users\User\Downloads>encodings.exe .65001
---------
Wester Europe: aaaccceeeeß_ÖöñÅA
Central Europe: aAOóLlEezczUuYyAa
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>encodings.exe .65001 .437
---------
Wester Europe: aaaccceeeeß_ÖöñÅA
Central Europe: aAOóLlEezczUuYyAa
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>encodings.exe .65001 .1250
---------
Wester Europe: aaaccceeeeß_ÖöñÅA
Central Europe: aAOóLlEezczUuYyAa
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>chcp 1250
Active code page: 1250
C:\Users\User\Downloads>encodings.exe .65001 .1250
---------
Wester Europe: aaačččeeeeß?ÖönAA
Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă
China: ????????????????????????
Korean: ??? ???? ???? ???
---------
C:\Users\User\Downloads>chcp 65001
Active code page: 65001
C:\Users\User\Downloads>encodings.exe
---------
Wester Europe: ÄÄÄÄÄÄēēēēßÞÖöñÅÃ
Central Europe: ąĄÓóÅłĘężćźŰűÃýĂă
China: 字集碼是把字符集中的字符编ç ä¸ºæŒ‡å®šé›†åˆä¸­æŸä¸€å¯¹è±¡
Korean: 줄여서 ì¸ì½”ë”©ì€ ì‚¬ìš©ìžê°€ 입력한
---------
C:\Users\User\Downloads>encodings.exe .65001
---------
Wester Europe: āāāčččēēēēßÞÖöñÅÃ
Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă
China: 字集碼是把字符集中的字符编码为指定集合中某一对象
Korean: 줄여서 인코딩은 사용자가 입력한
---------
C:\Users\User\Downloads>encodings.exe .65001 .65001
---------
Wester Europe: āāāčččēēēēßÞÖöñÅÃ
Central Europe: ąĄÓóŁłĘężćźŰűÝýĂă
China: 字集碼是把字符集中的字符编码为指定集合中某一对象
Korean: 줄여서 인코딩은 사용자가 입력한
---------
C:\Users\User\Downloads>
As you can see, if the code page and encodings are set up properly, everything just works (without using the Windows API).
On my machine cmd.exe is unable to display Asian characters properly, but this is just a font issue: when I copy and paste the cmd.exe content into another program, everything is displayed correctly (provided the encodings are set up correctly in my program).
Note also that if a conversion is not possible, question marks are displayed or a fallback to some other character is performed (for example, Å was converted to A when the Windows-1250 encoding and code page were used).
Play a bit with this program; I'm pretty sure it should be enough to give you the full picture.
Most important:
std::locale::global defines which encoding the main program uses. It means that streams assume std::string values are encoded this way.
std::iostream::imbue defines the encoding of the stream, so what will be written to a file.
Together these two settings define both sides of the conversion! You do not have to do the conversion manually!
Visual Studio lets you change the file encoding.
So if you type the characters into a variable in your source, the file should have that encoding.
Set the option in Visual Studio or programmatically:
Open the project Property Pages dialog box. ...
Select the Configuration Properties > C/C++ > Command Line property page.
In Additional Options, add the /utf-8 option to specify your preferred encoding.
Choose OK to save your changes.
link
https://learn.microsoft.com/en-us/cpp/build/reference/utf-8-set-source-and-executable-character-sets-to-utf-8?view=msvc-160
For the database, make sure it has the right character set.
https://www.a2hosting.com/kb/developer-corner/mysql/convert-mysql-database-utf-8#:~:text=mysql-,To%20change%20the%20character%20set%20encoding%20to%20UTF%2D8%20for,q%20at%20the%20mysql%3E%20prompt.
Once you have the right encoding for both the DB and Visual Studio, you can save, edit and read those characters from C++ code.

How can I hand a wchar_t* over to espeak without getting German characters spoken incorrectly?

I'm using espeak-ng to turn German-language traffic messages into speech. See this example text:
B6 Weserstraße B71 Seeborg vorübergehende Begrenzung der Breite.
B213 Wildeshauser Landstraße Delmenhorst-Deichhorst wegen Baustelle gesperrt.
The espeak method call looks like this:
unsigned int spoken_message_uuid = 0;
espeak_ERROR Speak (wstring text)
{
espeak_ERROR error = EE_OK;
unsigned int *uuid = &spoken_message_uuid;
const wchar_t *input = text.c_str ();
wcout << L"Speaking text:" << endl << input << endl;
error = espeak_Synth (input, text.length (), 0, POS_CHARACTER, 0, espeakCHARS_WCHAR | espeakENDPAUSE | espeakSSML, uuid, NULL);
return error;
}
My issue is now the following: All the German special characters (ä, ö, ü, ß) are not being spoken correctly! Instead, something like A Tilde ein Viertel appears in the spoken text, as if UTF-8 text had been treated erroneously as ASCII.
Here are the respective versions of espeak-ng and g++:
pi#autoradio:/import/valen/autoradio $ espeak-ng --version
eSpeak NG text-to-speech: 1.50 Data at: /usr/lib/arm-linux-gnueabihf/espeak-ng-data
pi#autoradio:/import/valen/autoradio $ g++ --version
g++ (Raspbian 6.5.0-1+rpi1+b1) 6.5.0 20181026
pi#autoradio:/import/valen/autoradio $ apt-cache policy espeak-ng
espeak-ng:
Installiert: 1.50+dfsg-7~bpo10+1
Installationskandidat: 1.50+dfsg-7~bpo10+1
Versionstabelle:
*** 1.50+dfsg-7~bpo10+1 100
100 http://deb.debian.org/debian buster-backports/main armhf Packages
100 /var/lib/dpkg/status
1.49.2+dfsg-8 500
500 http://raspbian.raspberrypi.org/raspbian buster/main armhf Packages
espeak has been installed from Debian's buster-backports repo to replace version 1.49, which didn't work either. The voice I'm using is mb-de5.
OK, this is not exactly a solution, more of a workaround, but at least it works: I hand over a string instead of a wstring. The original text turned out to be UTF-8-encoded, so all the special characters fit into a string (or char*) variable. Here is the adapted code:
unsigned int spoken_message_uuid = 0;
espeak_ERROR Speak (string text)
{
espeak_ERROR error = EE_OK;
unsigned int *uuid = &spoken_message_uuid;
const char *input = text.c_str ();
cout << "Speaking text:" << endl << input << endl;
error = espeak_Synth (input, text.length (), 0, POS_CHARACTER, 0, espeakCHARS_UTF8 | espeakENDPAUSE | espeakSSML, uuid, NULL);
return error;
}

Creating tar.gz-archive from c++ program does not work

I use the following code snippet for creating a tar.gz-archive in an extensive measurement software. After collecting some data in several files I want to archive and compress them for later use.
Everything works fine when I start the program from the shell, all the data is collected and archived correctly.
However the program should start automatically after system start of an embedded Linux system. When it's started via a script in /etc/init.d, no data files are archived/compressed, even though I get the return value 0. Furthermore, the tar.gz-file is created, but it's empty.
Everything else is working fine.
Can anyone please explain, what I have to do in this special case of an automatic start?
int returnValue = -1;
std::string jobString = RESULT_PATH;
jobString += "/";
jobString += lastJobString;
std::string jobFiles = lastJobString + "*.*";
std::string cmd = "tar cvf - ";
cmd += jobFiles;
cmd += " | gzip > ";
cmd += jobString;
cmd += ".tar.gz";
std::cout << "archiving and compressing " << jobFiles << ": " << cmd << std::flush << std::endl;
returnValue = system(cmd.c_str());
std::cout << "archiving and compressing finished. Code: " << returnValue << std::flush << std::endl;
I know that there are several libraries, like libarchive, libtar, etc., and that using them is less lazy than firing a system command, but I would like to know why this does not work in my case.
Furthermore, the version of tar in my busy box does not support option z.
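I finally found a solution for my problem, and maybe for all cases where a system command is called by a daemon:
The trick is to start a new shell with the command sh and change the current directory before tar is called. (Presumably the relative file pattern matched nothing because a program started from /etc/init.d does not run in the directory that holds the data files.)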
std::string cmd = "sh -c \"cd ";
cmd += SOURCE_DIR;
cmd += " && tar cvf - ";
cmd += jobFiles;
cmd += " | gzip > ";
cmd += jobString;
cmd += ".tar.gz\""; // close the quote opened after sh -c
returnValue = system(cmd.c_str());
Maybe this will help other users running into the same problem.

C++ Encoding (special character Ø,é...)

I work with
C++/Windows/MinGW
I read a string with special characters from an .xml file.
The correct text in the XML file is "Quimby_éé_ØØ R90 GP_NomPoints.txt"
The result I get is different, with strange characters.
My file.xml looks OK:
<?xml version="1.0" encoding="UTF-8"?>
Test:
When I read a string with special characters back from the .txt file, it doesn't work.
When I write the string to a .txt file, it works fine.
So there might be some problem with the IDE console.
My code:
void parser_fichier_xml(string fich, string &ActPoints, string &NomPoints)
{
    TiXmlDocument doc(fich.c_str());
    if (doc.LoadFile(TIXML_ENCODING_UTF8))
    {
        TiXmlHandle hdl(&doc);
        // Walk the DOM tree down to the first <Element> child of <GeometryData>
        TiXmlElement *elem = hdl.FirstChildElement("GeometryData").FirstChildElement("Element").Element();
        if (!elem)
        {
            cout << "The requested node does not exist" << endl;
            return; // elem is null, so it must not be dereferenced below
        }
        /* ********* Read the NomPoints and ActPoints paths from the XML element *********** */
        ActPoints = elem->Attribute("ActPoints");
        NomPoints = elem->Attribute("NomPoints");
        /* ***** Test: write the value to an output file ***** */
        string const nomFichier("Z:/Production/Methodes/InfoTec/Developpement/Zeiss_PCM/toCALYPSO/test.txt");
        ofstream fichier(nomFichier.c_str());
        if (fichier)
        {
            fichier << NomPoints << endl;
            fichier.close();
        }
        else
            cerr << "Unable to open the file test.txt!" << endl;
        /* ************************************************** */
        debug_string("Chemin ActPoints: ", ActPoints, "Chemin NomPoints: ", NomPoints); // print to the console
    }
    else
    {
        cerr << "Error opening the .XML file" << endl;
    }
}
As an answer, I would prefer not a function that replaces individual special characters, but something that converts the whole string.
If someone can help me,
thanks a lot.
The Windows console barely supports Unicode, definitely not UTF-8, and MinGW's std::cout doesn't help you. Sorry.
Thanks for the answers and
for the comments. I will minimise my code next time.
I know what UTF-8 is.
ASCII isn't enough (it lacks Ø, é, ...).
Unicode is theoretical.
In practice, I have two choices, ISO 8859 or UTF-8, and
UTF-8 is more complete (Japanese, etc.).
Test:
I tried the special character " à " and the result is " á ".
The display (cout or printf) is not important, because I use the value (the result) as a path.
I'm going to try with ISO 8859, but even with a text file that contains "éé_ØØ ", the console doesn't print the same "éé_ØØ ".
Sorry, I can't post a picture of the result (this website requires 10 reputation).
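If the consumer of the string really expects a single-byte encoding, one option is to convert the whole UTF-8 string in one pass instead of replacing individual characters. A minimal sketch, assuming ISO 8859-1 (Latin-1) is the target and that code points outside it may be replaced by '?'; utf8ToLatin1 is a hypothetical helper, not part of TinyXML:

```cpp
#include <cassert>
#include <string>

// Convert UTF-8 to ISO 8859-1 (Latin-1). Code points above U+00FF and
// malformed sequences become '?'. Illustrative helper only.
std::string utf8ToLatin1(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {                        // plain ASCII byte
            out += static_cast<char>(c);
            ++i;
        } else if ((c & 0xE0) == 0xC0 && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            // two-byte sequence: 110xxxxx 10xxxxxx
            unsigned cp = ((c & 0x1Fu) << 6) |
                          (static_cast<unsigned char>(in[i + 1]) & 0x3Fu);
            out += (cp >= 0x80 && cp <= 0xFF) ? static_cast<char>(cp) : '?';
            i += 2;
        } else {
            // lead byte of a longer sequence, or a stray byte: emit one '?'
            out += '?';
            ++i;
            while (i < in.size() &&
                   (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
                ++i;                           // skip its continuation bytes
            }
        }
    }
    return out;
}
```

With this, the UTF-8 bytes C3 A9 (é) become the single byte E9, and C3 98 (Ø) becomes D8. The Windows console typically uses a different (OEM) code page, so output there may still look wrong even when the value itself is correct.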

What do /proc/fd file descriptors show?

I'm learning about the /proc/ directory today. In particular, I'm interested in the security implications of having all the information about a process semi-publicly available, so I wrote a simple program that lets me explore some properties of the /proc/ directory:
#include <iostream>
#include <unistd.h>
#include <fcntl.h>
using namespace std;
extern char** environ;
void is_linux() {
#ifdef __linux__
cout << "this is running on linux" << endl;
#endif
}
int main(int argc, char* argv[]) {
is_linux();
cout << "hello world" << endl;
int fd = open("afile.txt", O_RDONLY | O_CREAT, 0600);
cout << "afile.txt open on: " << fd << endl;
cout << "current pid: " << getpid() << endl;
cout << "launch arguments: " << endl;
for (int index = 0; index != argc; ++index) {
cout << argv[index] << endl;
}
cout << "program environment: " << endl;
for (char** entry = environ; *entry; ++entry) {
cout << *entry << endl;
}
pause();
}
Interestingly though (to me anyway), when I check the file-descriptors folder (/proc/<PID#>/fd), I see this:
root@excalibur-VirtualBox:/proc/1546/fd# ls -l
total 0
lrwx------ 1 root root 64 Nov 7 09:12 0 -> /dev/null
lrwx------ 1 root root 64 Nov 7 09:12 1 -> /dev/null
lrwx------ 1 root root 64 Nov 7 09:12 2 -> /dev/null
lrwx------ 1 root root 64 Nov 7 09:12 3 -> socket:[11050]
Why do the file descriptors point to /dev/null? Is that to prevent users from being able to inject content into a file without actually being the process itself, or am I off base on that? And even more curious, why does the file descriptor for an open file point to a socket? That seems really odd. If anyone can shed some light on this for me, I would really appreciate it. Thanks!
You are definitely looking at the wrong /proc directory (for another PID or on another computer). The contents of /proc/<pid>/fd for your program should look like this:
lrwx------ 1 user group 64 Nov 7 22:15 0 -> /dev/pts/4
lrwx------ 1 user group 64 Nov 7 22:15 1 -> /dev/pts/4
lrwx------ 1 user group 64 Nov 7 22:15 2 -> /dev/pts/4
lr-x------ 1 user group 64 Nov 7 22:15 3 -> /tmp/afile.txt
Here we can see that file descriptors 0, 1, and 2 are shown as symbolic links to the pseudo-terminal in which the program is running. They could be /dev/null if you started your program with input, output, and error redirection. File descriptor 3 points to the file afile.txt, which is currently open.
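To check what a descriptor refers to from inside the process itself, the symbolic link can be read with readlink(). A small sketch for Linux; fdTarget is a hypothetical helper name:

```cpp
#include <cassert>
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>
#include <string>

// Return the target of the symbolic link /proc/self/fd/<fd>,
// or an empty string if the link cannot be read (e.g. fd is not open).
std::string fdTarget(int fd) {
    const std::string link = "/proc/self/fd/" + std::to_string(fd);
    char buf[PATH_MAX];
    // readlink() does not null-terminate, so the returned length is used.
    const ssize_t n = readlink(link.c_str(), buf, sizeof(buf));
    return n < 0 ? std::string()
                 : std::string(buf, static_cast<std::size_t>(n));
}
```

Calling fdTarget(fd) right after the open() call in the program above should return the absolute path of afile.txt, and fdTarget(0) should name the terminal, or /dev/null when input is redirected from it.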