Is this simple C++ program using <locale> correct?

This code seemed to work fine with the (Ubuntu Trusty) versions of gcc and clang, and on Windows 7 in a VM via MinGW... Recently I upgraded to Wily, and builds made with clang now crash consistently here.
#include <iostream>
#include <locale>
#include <string>
int main() {
    std::cout << "The locale is '" << std::locale("").name() << "'" << std::endl;
}
Sometimes it's a gibberish string followed by Aborted (core dumped), and sometimes it's an invalid free.
$ ./a.out
The locale is 'en_US.UTF-8QX�у�X�у����0�����P�����\�(��\�(��\�(��h��t�������������y���������ț�ԛ�������en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_UP����`�������������������������p�����������#��������������`�������������p��������������������#��#��#��`��������p������������0��P��p���qp��!en_US.UTF-8QЈ[�����\�(��\�(��\�(�����������#�� �����P�����0�����P�����\�(��\�(��\�(��Ȣ�Ԣ����������������(��4��#��L��en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8!�v[��������������#�� �����P�����0�����P�����\�(��\�(���(��h��t��������������������Ȥ�Ԥ�������en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8!��[�� ����[�������7����7��.,!!x�[��!��[��!�[��#�����������#�� �����P�����0�����P�����\�(��\�(��\�(��(��4��#��L��X��d��p��|������������n_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8ѻAborted (core dumped)
$ ./a.out
The locale is 'en_US.UTF-8QX\%�QX\%�Q�G�0H��H�PI��I�\:|�Q\D|�Q\>|�QhK�tK��K��K��K��K��Q�K��K��K��K��K��K�en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8ѻ
*** Error in `./a.out': free(): invalid pointer: 0x0000000000b04a98 ***
Aborted (core dumped)
(Both program outputs above were greatly abbreviated; otherwise they would not fit in this question.)
I also got an invalid free with it on Coliru.
But this is very similar to example code on cppreference:
#include <iostream>
#include <locale>
#include <string>
int main()
{
    std::wcout << "User-preferred locale setting is " << std::locale("").name().c_str() << '\n';
    // on startup, the global locale is the "C" locale
    std::wcout << 1000.01 << '\n';
    // replace the C++ global locale as well as the C locale with the user-preferred locale
    std::locale::global(std::locale(""));
    // use the new global locale for future wide character output
    std::wcout.imbue(std::locale());
    // output the same number again
    std::wcout << 1000.01 << '\n';
}
Actually that code crashes Coliru also... :facepalm:
More crashes of similar code from Coliru.
Is this a bug in the C++ library used by clang, or is this code defective?
Note also: these crashes seem to be restricted to the C++ API; if you use <clocale> instead, things seem to work okay, so it may just be some trivial problem in the C++ bindings over this?
Variations using setlocale: 1 2 3
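For comparison, here is a minimal sketch of the C-API variant alluded to above, using setlocale from <clocale> (the NULL check and the printed message are just illustrative):
#include <clocale>
#include <cstdio>
int main() {
    // "" requests the user's environment default, mirroring std::locale("");
    // setlocale returns the name of the selected locale, or NULL on failure.
    const char* name = std::setlocale(LC_ALL, "");
    std::printf("The locale is '%s'\n", name ? name : "(setlocale failed)");
}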

Looks like this is caused by libstdc++'s ABI change in its basic_string, which was needed for C++11 conformance. To manage this transition, GCC added the abi_tag attribute, which changes the mangled name of functions so that functions for the new and old ABI can be distinguished, even if the change wouldn't otherwise affect the mangled name (e.g. the return type of a function).
This code
#include <locale>
#include <string>
int main() {
    std::locale().name();
}
on GCC emits a call to _ZNKSt6locale4nameB5cxx11Ev, which demangles to std::locale::name[abi:cxx11]() const, and returns an SSO string using the new ABI.
Clang, on the other hand, doesn't support the abi_tag attribute and emits a call to _ZNKSt6locale4nameEv, which demangles to simply std::locale::name() const, i.e. the version returning a COW string (the old ABI).
The net result is that the program ends up trying to use a COW string as an SSO string when compiled with Clang. Havoc ensues.
The obvious workaround is to force the old ABI via -D_GLIBCXX_USE_CXX11_ABI=0.
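For example, a minimal sketch of the original program with that workaround applied (defining the macro in the source before any standard header is equivalent to passing it on the command line):
// Force the old (COW string) libstdc++ ABI so Clang's mangling matches the
// library's symbols; same effect as compiling with -D_GLIBCXX_USE_CXX11_ABI=0.
#define _GLIBCXX_USE_CXX11_ABI 0
#include <iostream>
#include <locale>
#include <string>
int main() {
    std::cout << "The locale is '" << std::locale("").name() << "'" << std::endl;
}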

I think the "" parameter might be corrupting something. I don't think it's a legal argument?
To verify it's nothing else, try running this:
#include <iostream>
#include <locale>
int main() {
    std::locale("").name();
}

It compiles and runs just fine with GCC:
g++ -Wall -pedantic locale.cpp
<= No errors, no warnings
./a.out
The locale is 'en_US.UTF-8'
<= Expected output
ADDENDUM:
Exactly the same with MSVS 2013: no errors or warnings when compiling; no errors when running:
locale.cpp =>
#include <iostream>
#include <locale>
#include <string>
int main() {
    std::cout << "The locale is '" << std::locale("").name() << "'" << std::endl;
}
Output =>
locale
The locale is 'English_United States.1252'

Related

Right aligning money_put results

I'm trying to get the C++ library to generate properly formatted USD output ($ sign, commas as thousands separators, etc.).
I'm close, but I cannot get the right alignment to work:
#include <iostream>
#include <iomanip>
#include <locale>
using namespace std;
int main() {
    double fiftyMil = 50000000.0; // 50 million bucks
    locale myloc;
    const money_put<char>& mpUS = use_facet<money_put<char> >(myloc);
    cout.imbue(myloc);
    cout << showbase << fixed;
    cout << "A";
    cout.width(30);
    cout.setf(std::ios::right);
    mpUS.put(cout, false, cout, ' ', fiftyMil * 100); // convert to cents
    cout << "B" << endl;
    return 0;
}
I'm getting:
A$50,000,000.00 B
I want to get:
A $50,000,000.00B
Any ideas why this isn't working?
I'm using the latest Solaris compiler (12.4).
Update:
It seems like the issue is with the C++ libraries included with the Solaris compiler. This is the workaround I used:
#include <iostream>
#include <iomanip>
#include <locale>
#include <sstream>
using namespace std;
string getFormattedCcy(double amt) {
    ostringstream os;
    static locale myloc;
    static const money_put<char>& mpUS = use_facet<money_put<char> >(myloc);
    os.imbue(myloc);
    os << showbase << fixed;
    mpUS.put(os, false, os, ' ', amt * 100);
    return os.str();
}

int main() {
    double fiftyMil = 50000000.0; // 50 million bucks
    cout << "A";
    cout.setf(std::ios::right);
    cout.width(30);
    cout << getFormattedCcy(fiftyMil);
    cout << "B" << endl;
    return 0;
}
You have a couple of problems: one with your code, and another that looks like it's in your implementation.
The problem in your code is pretty trivial. Since you're using a default-constructed locale, it should be using the "C" locale, which shouldn't write out the $ or thousands separators.
That part is easy to fix. Change: locale myloc; to: locale myloc(""); to get a localized locale (so to speak).
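For illustration, a minimal sketch with that change in place (the alignment code from the question is omitted here):
#include <iostream>
#include <locale>
using namespace std;
int main() {
    // locale("") picks up the user's environment locale, so money_put knows
    // the currency symbol and grouping; a default-constructed locale is "C".
    locale myloc("");
    const money_put<char>& mpUS = use_facet<money_put<char> >(myloc);
    cout.imbue(myloc);
    cout << showbase << fixed;
    mpUS.put(cout, false, cout, ' ', 50000000.0 * 100); // value in cents
    cout << endl;
}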
I doubt that'll fix the justification problem you're seeing though. That looks to me like it's a problem with the standard library you're using. When I run your code (with the correction above) I get what I'd expect:
A $50,000,000.00B
That's with Visual C++ though (and despite a compiler that conforms fairly poorly, its standard library is about as good as they come).
Also note that right justification is the default, so the line:
cout.setf(std::ios::right);
...should have no effect (but I suspect you knew that, and added it in the hope of getting it to work when it didn't otherwise).
As far as how to get things to work with the Sun Oracle compiler, the most obvious suggestion is to switch to a standard library that works better. That leads to another question: whether to try to get a different standard library working with the compiler you're using, or to switch to a different compiler such as Clang or gcc. From what I understand, 12.4 was a serious improvement in terms of C++ conformance, but I don't think either the compiler or (apparently) its standard library is really competitive with gcc or Clang yet. OTOH, you may not have a choice, in which case essentially your only route is to build a different standard library with your existing compiler and hope for the best. If you can't even do that, you could try setting the locale correctly, writing the number with just std::cout << fiftyMil;, hoping it at least gives you commas as it should, and then adding the currency sign separately.
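A minimal sketch of that last-resort approach, assuming the imbued locale's numpunct supplies the grouping:
#include <iostream>
#include <locale>
using namespace std;
int main() {
    double fiftyMil = 50000000.0;
    cout.imbue(locale(""));          // user's locale, for the thousands separators
    cout << fixed;
    cout.precision(2);
    cout << "$" << fiftyMil << "\n"; // write the currency sign by hand
}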
As an aside, if you do get an updated (C++11 or newer) library, you can use put_money to simplify the code quite a bit:
#include <iostream>
#include <iomanip>
#include <locale>
using namespace std;
int main() {
    double fiftyMil = 50000000.0; // 50 million bucks
    std::locale myloc("");
    cout.imbue(myloc);
    cout << "A" << showbase << setw(30) << put_money(fiftyMil * 100) << "B";
}

Getline Function Messing with Code

Picking up C++ and having a go at it on OS X 10.9, using Xcode 5.0.2 and Alex Allain as a reference.
The following code compiles just fine and outputs correctly
#include <iostream>
#include <string>
using namespace std;
int main()
{
    std::string user_first_name = "test";
    std::cout << user_first_name << "\n";
    return 0;
}
When I add a getline call, the code appears to compile, but there is no output.
#include <iostream>
#include <string>
using namespace std;
int main()
{
    std::string user_first_name = "test";
    std::getline( std::cin, user_first_name, '\n' );
    std::cout << user_first_name << "\n";
    return 0;
}
In fact, the debug navigator shows memory filling up with bars (although actual memory use is fixed at 276 KB). Why am I getting stumped on such a simple thing?
I did a bit of digging around, and it's quite likely this is related to a text encoding issue. I'm using the defaults, which is Unicode (UTF-8). Encoding is not something I'm familiar with; it was never something I had to deal with when learning on Windows. How do I get past this?
I can't comment regarding the use of Xcode or OS X, but it was my understanding that std::cin always gives you a narrow (single-byte) character stream. On Windows (at least with Visual Studio), I think it works whether you compile for UTF-8 (single byte for all ASCII characters) or UTF-16 (2 bytes for all ASCII characters). The runtime library presumably does the conversion for you as necessary.
I'm not sure what "filling up with bars" means, but maybe it's just that you're looking at uninitialized memory. If you think that it is an encoding issue, perhaps try using wstring/wcin instead of string/cin and see if that helps.
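A quick sketch of that suggestion, switching the original example over to the wide-character types, to see whether the behaviour changes:
#include <iostream>
#include <string>
using namespace std;
int main()
{
    std::wstring user_first_name = L"test";
    std::getline( std::wcin, user_first_name, L'\n' );
    std::wcout << user_first_name << L"\n";
    return 0;
}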

wcout and wcerr: How to get it to work with unicode?

#include <iostream>
#include <clocale>
#include <string>
int main() {
    std::setlocale(LC_ALL, "en_US.utf8");
    std::wstring str(L"Τὴ γλῶσσα μοῦ ἔδωσαν");
    std::wcout << str << std::endl;
    std::wcerr << str << std::endl;
}
This produces no output on the terminal.
How can I get it to produce UTF-8 output? I figure that ought to be something supported by C++.
I am aware of the utfcpp library, which I am using, but the question is specifically whether there is a stdlib way to print out UTF-8.
I got it by following this.
std::setlocale(LC_ALL, ""); is the magic incantation.
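For reference, a sketch of the original example with just that one change applied:
#include <iostream>
#include <clocale>
#include <string>
int main() {
    // "" selects the user's environment locale rather than naming one
    // explicitly; this is the change that made the wide output appear.
    std::setlocale(LC_ALL, "");
    std::wstring str(L"Τὴ γλῶσσα μοῦ ἔδωσαν");
    std::wcout << str << std::endl;
    std::wcerr << str << std::endl;
}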
This worked on my trivial test program. However, my full program still completely fails at printing out wstrings.
That is fine (well, no, it is still somewhat alarming); I will avoid relying on the stdlib and do it myself with the help of the utfcpp library, which is excellent. That way I can use cout and cerr for everything, and all will be well.
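For completeness, if one did want a stdlib-only conversion, std::wstring_convert (added in C++11, deprecated in C++17) can produce the UTF-8 bytes for a narrow stream; this is a sketch, not the utfcpp route used above:
#include <codecvt>
#include <iostream>
#include <locale>
#include <string>
int main() {
    std::wstring str(L"Τὴ γλῶσσα μοῦ ἔδωσαν");
    // Convert the wide string to UTF-8 and write it through the narrow stream.
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
    std::cout << conv.to_bytes(str) << std::endl;
}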

Using regex_search from the C++ regex library

The regex_search function isn't quite behaving as expected.
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
    string str = "Hello world";
    const regex rx("Hello");
    cout << regex_search(str.begin(), str.end(), rx) << endl;
    return 0;
}
The output is
0
What's going on?
As pointed out in comments to the question, older implementations of the C++ standard libraries did not yet support all of the features in C++11. libc++ is, of course, an exception, because it was originally built specifically for C++11.
According to this bug report, support for <regex> in libstdc++ was only implemented in GCC 4.9. You can check the current status on the libstdc++ status page.
One can confirm that your example works with GCC 4.9 while still failing with GCC 4.8.
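As a rough guard, one might check the compiler version at build time (a sketch; it assumes GCC's __GNUC__ macros stand in for the bundled libstdc++ version):
// Fail the build early on a libstdc++ that predates working <regex>.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9))
#error "<regex> in libstdc++ is only functional from GCC 4.9 onward"
#endif
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
    string str = "Hello world";
    const regex rx("Hello");
    cout << regex_search(str.begin(), str.end(), rx) << endl; // prints 1 on GCC 4.9+
    return 0;
}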

Stringstream not working with doubles when _GLIBCXX_DEBUG enabled

I'm using _GLIBCXX_DEBUG mode to help find errors in my code, but I'm having a problem that I think is a bug in the library; hopefully someone can tell me I'm just doing something wrong. Here is a short example which reproduces the problem:
#define _GLIBCXX_DEBUG
#include <iostream>
#include <sstream>
int main (int argc, const char * argv[]) {
    std::ostringstream ostr;
    ostr << 1.2;
    std::cout << "Result: " << ostr.str() << std::endl;
    return 0;
}
If I comment out the #define then the output is (as expected):
Result: 1.2
With the _GLIBCXX_DEBUG define in place however the output is simply:
Result:
I've tracked this down to the _M_num_put field of the stream being left NULL, which causes an exception to be thrown (and caught) in the stream and results in no output for the number. _M_num_put is supposed to be a std::num_put facet from the locale (I don't claim to understand how that's supposed to work; it's just what I've learned in my searching so far).
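For context, this is roughly how that facet is normally used, as far as I can tell (a sketch of ordinary num_put usage, not the failing code):
#include <iostream>
#include <iterator>
#include <locale>
int main() {
    // The stream normally looks up num_put in its imbued locale and calls
    // put() to format the number; _M_num_put caches that facet.
    const std::num_put<char>& np =
        std::use_facet<std::num_put<char> >(std::cout.getloc());
    np.put(std::ostreambuf_iterator<char>(std::cout), std::cout, ' ', 1.2);
    std::cout << std::endl;
}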
I'm running this on a Mac with Xcode and have tried both "LLVM GCC 4.2" and "Apple LLVM Compiler 3.0" as the compiler, with the same results.
I'd appreciate any help in solving this. I want to continue to run with _GLIBCXX_DEBUG mode on my code but this is interfering with that.
Someone else has seen this over at cplusplus.com,
and here on Stack Overflow, too.
Consensus is that it is a known bug in gcc 4.2 for Mac OS, and since that compiler is no longer being updated, it is unlikely to ever be fixed.
Seems to me that you can either (1) use LLVM, or (2) build your own GCC and use it.