This is my sample code:
#pragma execution_character_set("utf-8")
#include <boost/locale.hpp>
#include <boost/algorithm/string/case_conv.hpp>
#include <iostream>
int main()
{
std::locale loc = boost::locale::generator().generate("");
std::locale::global(loc);
#ifdef MSVC
std::cout << boost::locale::conv::from_utf("grüßen vs ", "ISO8859-15");
std::cout << boost::locale::conv::from_utf(boost::locale::to_upper("grüßen"), "ISO8859-15") << std::endl;
std::cout << boost::locale::conv::from_utf(boost::locale::fold_case("grüßen"), "ISO8859-15") << std::endl;
std::cout << boost::locale::conv::from_utf(boost::locale::normalize("grüßen", boost::locale::norm_nfd), "ISO8859-15") << std::endl;
#else
std::cout << "grüßen vs ";
std::cout << boost::locale::to_upper("grüßen") << std::endl;
std::cout << boost::locale::fold_case("grüßen") << std::endl;
std::cout << boost::locale::normalize("grüßen", boost::locale::norm_nfd) << std::endl;
#endif
return 0;
}
Output on Windows 7 is:
grüßen vs GRÜßEN
grüßen
grußen
Output on Linux (openSuSE 12.3) is:
grüßen vs GRÜSSEN
grüssen
grüßen
On Linux the German letter 'ß' is converted to 'SS' as expected, while this character remains unchanged on Windows.
Question: why is this so? How can I correct the conversion?
Some notes: the Windows console code page is set to 1252. In both cases the locale is set to de_DE. I tried replacing the default locale setting in the listing above with "de_DE.UTF-8", without any effect.
On Windows this code is compiled with Visual Studio 2013, on Linux with GCC 4.7, c++11 enabled.
Any suggestions are appreciated - thanks in advance for your support!
Windows doesn't do this conversion because "it would be too confusing" for developers if the string length changed all of a sudden. Boost presumably just delegates all the Unicode conversions to the underlying Windows APIs.
I guess the robust way to handle it would be to use a third-party Unicode library such as ICU.
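To make the "string length changes" point concrete, here is a minimal sketch, not Boost's or ICU's actual implementation. The function name to_upper_sketch is made up for illustration, and it only covers ASCII, the Latin-1 lowercase letters U+00E0..U+00FE and 'ß'. It shows that a full upper-case mapping turns the 6 code points of "grüßen" into 7, which a one-to-one, character-by-character API cannot express.

```cpp
#include <string>

// Hypothetical helper for illustration only; a real program should use a
// Unicode library such as ICU instead of a hand-written table like this.
std::u32string to_upper_sketch(const std::u32string& in) {
    std::u32string out;
    for (char32_t c : in) {
        if (c == U'\u00DF') {
            // U+00DF (ß) has no single-character upper-case form in this
            // mapping, so it expands to two characters.
            out += U"SS";
        } else if (c >= U'a' && c <= U'z') {
            // ASCII lower-case letters
            out += static_cast<char32_t>(c - 0x20);
        } else if (c >= 0xE0 && c <= 0xFE && c != 0xF7) {
            // Latin-1 lower-case letters (à..þ), skipping U+00F7 (÷)
            out += static_cast<char32_t>(c - 0x20);
        } else {
            out += c;
        }
    }
    return out;
}
```

Calling to_upper_sketch(U"gr\u00FC\u00DFen") yields a string one element longer than its input, which is exactly the case a length-preserving conversion has to leave as 'ß'.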
I am using wcslen to determine the length of a null-terminated wide string (wchar_t*), but I have some problems with this function with the MSVC compiler.
Code example:
#include <iostream>
#include <cstring>
#include <cwchar>
int main()
{
auto sc = "The good and bad";
auto wsc = L"Уставший лесник";
auto ws = std::wstring(wsc);
std::cout << "sc len:" << std::strlen(sc) << std::endl;
std::cout << "wsc len:" << std::wcslen(wsc) << std::endl;
std::cout << "ws len:" << ws.length() << std::endl;
}
MSVC (amd64 16.8.2 x64) output:
sc len:16
wsc len:29
ws len:29
Clang (10.0.0 (GNU CLI) for MSVC 16.8.30717.126) output:
sc len:16
wsc len:15
ws len:15
Is it a problem with the MSVC compiler, some undefined behavior, or a nuance of the MSVC implementation?
You need to save your file as either UTF-16 or UTF-8 with BOM. MSVC doesn't seem to be able to handle a UTF-8 file without a BOM (which is understandable as the character encoding of such a file is a matter of interpretation).
Some editors (I am using Notepad2) call this 'UTF-8 with signature'.
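The number 29 is not random: it matches the UTF-8 byte count of the literal (14 two-byte Cyrillic letters plus one space), which is consistent with MSVC reading the BOM-less file in a single-byte code page and turning every byte into one wchar_t. That reading is my assumption about what happened, not something I verified in the compiler. A small sketch that compares bytes with code points (count_code_points and sample are names I made up for the example):

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes, i.e. bytes of the form 10xxxxxx. Assumes the input is valid UTF-8.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// "Уставший лесник" written out as UTF-8 bytes, so the example does not
// depend on how the compiler decodes this source file.
const std::string sample =
    "\xD0\xA3\xD1\x81\xD1\x82\xD0\xB0\xD0\xB2\xD1\x88\xD0\xB8\xD0\xB9"
    " "
    "\xD0\xBB\xD0\xB5\xD1\x81\xD0\xBD\xD0\xB8\xD0\xBA";
```

Here sample.size() is the byte count (the value MSVC reported) and count_code_points(sample) is the character count (the value Clang reported).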
The program is killed as soon as I invoke GetFamilyCount(), so "Exit" is never printed.
Invoking GetLastStatus() prints the value 18, which equals GdiplusNotInitialized.
I'm not sure why it's not initialized. You can clearly see that I initialized it here:
Gdiplus::PrivateFontCollection privateFontCollection;
I'm using MinGW to build the program.
MinGW-w64 9.0.0
GCC Version 11.1
OS is Windows 10
MinGW taken from http://winlibs.com/
#include <iostream>
#include <Windows.h>
#include <Gdiplus.h>
int main() {
std::cout << "Start" << "\n";
Gdiplus::PrivateFontCollection privateFontCollection;
std::cout << privateFontCollection.GetLastStatus() << "\n";
std::cout << privateFontCollection.GetFamilyCount() << "\n";
std::cout << "Exit" << "\n";
}
I have a simple problem in a Windows console program. I wrote the following single-file (UTF-8 encoded) program in C++:
// utf8 source file encoding
#include <iostream>
#include <iomanip>
int main() {
// compile and run on windows 10 and change code page to utf
// execute from cmd.exe
// WHY first letter 'П' - first time not printed? But after space printed?
std::system("chcp 65001>nul");
const char* utf8 = u8"Привет Мир"; // this will skip first letter with MinGW compiler
const char* utf8s = u8" Привет Мир"; // this will print nice!!! Just add space
std::cout << "cout\n";
std::cout << utf8 << std::endl;
std::cout << utf8s << std::endl;
std::cout << utf8 << std::endl;
std::cout << std::flush;
std::wcout << L"wcout\n";
std::wcout << L"Привет Мир" << std::endl;
std::wcout << L" Привет Мир" << std::endl;
std::wcout << L"Привет Мир" << std::endl;
std::wcout << std::flush;
std::printf("printf\n");
std::printf("%s\n", utf8);
std::printf("%s\n", utf8s);
std::printf("wprintf\n");
std::wprintf(L"Привет Мир\n");
std::wprintf(L" Привет Мир\n");
std::wprintf(L"Привет Мир\n");
std::fflush(stdout);
std::system("pause");
return 0;
}
[Figure: output in the Windows console when run from CMD]
[Figure: output in the terminal when run from bash]
The program's output is correct when run by git-bash.exe in Windows 10, but not when run by CMD in the console with Lucida Console font. It's not printing the first letter of Russian "Hello World" ("Привет Мир"), even with chcp 65001. I tried to compile this code with MinGW 7.1 and with MSVC 2017, but nothing changed. I know that rustc.exe (Rust Lang compiler) produces binary files that work in both the console and git-bash.exe with UTF-8 text printing. How can the same be done in a C++ program in Windows? Thanks in advance.
I have inherited a piece of C++ code which has many #ifdef branches to adjust the behaviour depending on the platform (#ifdef __WIN32, #ifdef __APPLE__, etc.). The code is unreadable in its current form because these preprocessor directives are nested, occur in the middle of functions and even in the middle of multi-line statements.
I'm looking for a way of somehow specifying some preprocessor tags and getting out a copy of the code as if the code had been pre-processed with those flags. I'd like the #include directives to be left untouched, though.
Example:
#include <iostream>
#ifdef __APPLE__
std::cout << "This is Apple!" << std::endl;
#elif __WIN32
std::cout << "This is Windows" << std::endl;
#endif
would turn into:
#include <iostream>
std::cout << "This is Apple!" << std::endl;
after being processed by: tool_i_want example.cpp __APPLE__.
I've hacked a quick script that does something similar, but I'd like to know of better tested and more thorough tools. I am running a Linux distribution.
I have decided against just running the C-preprocessor because if I'm not mistaken it will expand the header files, which would make everything more unreadable.
Use unifdef. It is designed for that purpose.
Complementing Basile Starynkevitch's answer, I want to mention coan. Its major advantage is that, when used with -m, it does not require the user to explicitly undefine every symbol they want treated as undefined.
This code:
#include <iostream>
#ifdef __ANDROID__
std::cout << "In Android" << std::endl;
#endif
#ifndef __WIN32
std::cout << "Not a Windows platform" << std::endl;
#endif
#ifdef __APPLE__
std::cout << "In an Apple platform" << std::endl;
#elif __linux__
std::cout << "In a Linux platform" << std::endl;
#endif
would result in this code if simply run as: unifdef -D__APPLE__ example.cpp:
#include <iostream>
#ifdef __ANDROID__
std::cout << "In Android" << std::endl;
#endif
#ifndef __WIN32
std::cout << "Not a Windows platform" << std::endl;
#endif
std::cout << "In an Apple platform" << std::endl;
Using unifdef one would need to use
unifdef -D__APPLE__ -U__ANDROID__ -U__WIN32 -U__linux__ example.cpp:
#include <iostream>
std::cout << "Not a Windows platform" << std::endl;
std::cout << "In an Apple platform" << std::endl;
This can get exhausting quickly when dealing with code considering several different platforms. With coan it's a matter of:
coan source -D__APPLE__ -m example.cpp.
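For anyone curious what "treat every symbol I did not mention as undefined" means mechanically, here is a toy sketch. It is not how unifdef or coan are implemented, and resolve_conditionals is a name I invented. It only understands #ifdef, #ifndef, #elif NAME, #else and #endif, treats "#elif NAME" as "is NAME defined" (a simplification of real preprocessor semantics), and copies every other line, including #include, unchanged.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy conditional resolver: symbols not listed in `defined` count as undefined.
std::string resolve_conditionals(const std::string& source,
                                 const std::set<std::string>& defined) {
    struct Frame {
        bool parent_active;  // was the enclosing region being emitted?
        bool taken;          // has any branch of this #if chain been chosen?
        bool active;         // is the current branch being emitted?
    };
    std::vector<Frame> stack;
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;

    auto active = [&stack] { return stack.empty() || stack.back().active; };

    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string directive, name;
        words >> directive >> name;

        if (directive == "#ifdef" || directive == "#ifndef") {
            const bool parent = active();
            const bool is_defined = defined.count(name) != 0;
            const bool cond = (directive == "#ifdef") ? is_defined : !is_defined;
            stack.push_back({parent, parent && cond, parent && cond});
        } else if (directive == "#elif" && !stack.empty()) {
            Frame& f = stack.back();
            f.active = f.parent_active && !f.taken && defined.count(name) != 0;
            f.taken = f.taken || f.active;
        } else if (directive == "#else" && !stack.empty()) {
            Frame& f = stack.back();
            f.active = f.parent_active && !f.taken;
            f.taken = true;
        } else if (directive == "#endif" && !stack.empty()) {
            stack.pop_back();
        } else if (active()) {
            out << line << '\n';  // ordinary line (or #include): keep as-is
        }
    }
    return out.str();
}
```

Feeding it the example above with only __APPLE__ in the set keeps the #include line, the "Not a Windows platform" line and the "In an Apple platform" line. For real code bases I would still reach for unifdef or coan, since they handle #if expressions, comments and line continuations that this sketch ignores.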
I have written a DLL that uses an abstract interface to give access to the C++ class inside. When I load the library dynamically at run time with LoadLibrary() from a simple console program built in Eclipse with g++ and call a function from the DLL, I get the correct values back. However, when I write the same console program in Qt Creator (Qt 5) with the g++ compiler, I get completely different, incorrect results.
All of this was written under Windows 7 64-bit, but the programs themselves are built as x86.
The code that calls the dll from eclipse looks as follows:
HMODULE hMod = ::LoadLibrary("libTestDll.dll");
if (hMod) {
cout << "Library Loaded" << endl;
CreateITestType create = (CreateITestType) GetProcAddress(hMod,
"GetNewITest");
ITest* test = create();
std::cout << test->Sub(20, 5) << std::endl;
std::cout << test->Sub(20, 3) << std::endl;
std::cout << test->Add(20, 5) << std::endl;
std::cout << test->Add(13, 4) << std::endl;
DeleteITestType del = (DeleteITestType) GetProcAddress(hMod,
"DeleteITest");
del(test);
test = NULL;
FreeLibrary(hMod);
}
This returns:
Library Loaded
15
17
25
17
The code that calls the dll from qt looks as follows:
HMODULE hMod = LoadLibrary(TEXT("libTestDll.dll"));
if(hMod)
{
CreateITestType create = (CreateITestType)GetProcAddress(hMod, "GetNewITest");
DeleteITestType destroy = (DeleteITestType)GetProcAddress(hMod, "DeleteITest");
ITest* test = create();
std::cout << test->Sub(20, 5) << std::endl;
std::cout << test->Sub(20, 3) << std::endl;
std::cout << test->Add(20, 5) << std::endl;
std::cout << test->Add(13, 4) << std::endl;
destroy(test);
test = NULL;
FreeLibrary(hMod);
}
And this returns:
1
-17
25
24
Both programs have the imports:
#include <windows.h>
#include <iostream>
#include "TestDll.h"
And finally the functions are implemented as follows:
int Test::Add(int a, int b)
{
return (a+b);
}
int Test::Sub(int a, int b)
{
return (a-b);
}
My question is: where does the difference come from, seeing as the two programs are identical in both code and compiler, and how can it be fixed?
Did you also rebuild the DLL with qt-creator qt5 with the g++ compiler? If not, then what you've discovered is that if you don't use the exact same compiler, compiler options and settings, defines, and pretty much every other aspect of the build system, C++ interfaces are not typically ABI compatible.
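If rebuilding everything with one toolchain is not an option, a common way to reduce the exposure is to keep C++ classes off the DLL boundary and export plain C functions that take an opaque handle. The sketch below is only an illustration under that assumption: TestHandle, test_create, test_destroy, test_add and test_sub are hypothetical names, not part of the TestDll.h from the question, and I can't promise it fixes this particular case.

```cpp
// Implementation detail that stays inside the DLL.
class Test {
public:
    int Add(int a, int b) { return a + b; }
    int Sub(int a, int b) { return a - b; }
};

// C-style boundary: no virtual tables or C++ name mangling cross it.
// In a real Windows DLL these functions would also need to be exported,
// e.g. with __declspec(dllexport) (omitted here to keep the sketch portable).
extern "C" {

struct TestHandle;  // opaque to callers; never defined

TestHandle* test_create() {
    return reinterpret_cast<TestHandle*>(new Test());
}

void test_destroy(TestHandle* handle) {
    delete reinterpret_cast<Test*>(handle);
}

int test_add(TestHandle* handle, int a, int b) {
    return reinterpret_cast<Test*>(handle)->Add(a, b);
}

int test_sub(TestHandle* handle, int a, int b) {
    return reinterpret_cast<Test*>(handle)->Sub(a, b);
}

}  // extern "C"
```

The caller would then fetch test_create, test_add and so on with GetProcAddress and never touch an ITest vtable, so differences in class layout between builds matter much less. Calling conventions and the C runtime used for allocation still have to agree, which is why the object is both created and destroyed inside the DLL.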