Why does the following code throw a std::bad_cast exception?
#include <iostream>
#include <regex>
#include <string>
int main()
{
std::basic_string<char32_t> reg = U"^\\w";
try
{
std::basic_regex<char32_t> tagRegex(reg);
}
catch(std::exception &e)
{
std::cout << e.what() << std::endl;
}
return 0;
}
This sample on Ideone for convenience: https://ideone.com/Saea88
Using char or wchar_t instead of char32_t runs without throwing, though (proof: https://ideone.com/OBlXed).
The explanation is at http://en.cppreference.com/w/cpp/regex/regex_traits:
To use std::basic_regex with other character types (for example, char32_t), a user-provided trait class must be used.
So you would have to implement std::regex_traits<char32_t> yourself.
For the reason the standard library does not define it, see: Why is there no definition for std::regex_traits<char32_t> (and thus no std::basic_regex<char32_t>) provided?
On GCC or Clang, the code compiles fine even with custom regex traits, but fails at runtime with std::bad_cast. If that is what brought you here: the exception comes from std::use_facet<std::ctype<char32_t>>, which throws because the current locale does not contain that facet. You have to specialize std::ctype<char32_t> and call std::locale::global with a new locale constructed from the old one plus the specialized facet.
This code snippet (https://gcc.godbolt.org/z/hKDMxm):
#include <iostream>
#include <sstream>
using namespace std;
int main() {
auto s = (ostringstream{} << "string").str();
cout << s;
return 0;
}
compiles and runs as expected with MSVC, but fails to compile with clang 9.0.0 and gcc 9.2, giving this error message: no member named 'str' in 'std::basic_ostream<char>'. Looking at https://en.cppreference.com/w/cpp/io/basic_ostringstream/str, there is clearly a str() member of ostringstream. Why do clang and gcc fail to compile this code?
there is clearly str() member of ostringstream
Yes, but according to cppreference this overload of << should return a reference to basic_ostream<...> rather than ostringstream.
libstdc++ (GCC's standard library) does exactly this, while libc++ (Clang's standard library) and MSVC's standard library behave incorrectly here, technically.
However, it seems there is an open defect report suggesting that the overload of << that works with rvalue streams should return the exact stream type that was passed to it. If it gets accepted, your code will be valid.
operator<< is declared for std::ostream and returns std::ostream&, as described here.
MSVC evidently provides its own overload of this operator for std::ostringstream, which is not in the standard.
I ran into the same problem, and solved it with a wrapper class like this:
#include <sstream>
#include <string>
struct StreamHelper
{
std::ostringstream stream;
template< typename T >
StreamHelper& operator<<( const T& value )
{
stream << value;
return *this;
}
std::string str() const
{
return stream.str();
}
operator std::string() const
{
return stream.str();
}
};
Then you can do your one-liner:
auto s = (StreamHelper() << "string1," << "string2").str();
If the compiler can figure out that the target type is a string, you don't need the .str() at the end and can rely on the implicit conversion operator.
std::string s = (StreamHelper() << "string1," << "string2");
This code seemed to work OK on Ubuntu Trusty with both GCC and Clang, and on Windows 7 in a VM via MinGW. After I upgraded to Wily, builds made with Clang crash consistently here.
#include <iostream>
#include <locale>
#include <string>
int main() {
std::cout << "The locale is '" << std::locale("").name() << "'" << std::endl;
}
Sometimes the output is a gibberish string followed by "Aborted (core dumped)", and sometimes it is an invalid free.
$ ./a.out
The locale is 'en_US.UTF-8QX�у�X�у����0�����P�����\�(��\�(��\�(��h��t�������������y���������ț�ԛ�������en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_UP����`�������������������������p�����������#��������������`�������������p��������������������#��#��#��`��������p������������0��P��p���qp��!en_US.UTF-8QЈ[�����\�(��\�(��\�(�����������#�� �����P�����0�����P�����\�(��\�(��\�(��Ȣ�Ԣ����������������(��4��#��L��en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8!�v[��������������#�� �����P�����0�����P�����\�(��\�(���(��h��t��������������������Ȥ�Ԥ�������en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8!��[�� ����[�������7����7��.,!!x�[��!��[��!�[��#�����������#�� �����P�����0�����P�����\�(��\�(��\�(��(��4��#��L��X��d��p��|������������n_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8ѻAborted (core dumped)
$ ./a.out
The locale is 'en_US.UTF-8QX\%�QX\%�Q�G�0H��H�PI��I�\:|�Q\D|�Q\>|�QhK�tK��K��K��K��K��Q�K��K��K��K��K��K�en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8en_US.UTF-8ѻ
*** Error in `./a.out': free(): invalid pointer: 0x0000000000b04a98 ***
Aborted (core dumped)
(Both program outputs above were abbreviated greatly or they would not fit in this question.)
I also got an invalid free on Coliru with it as well.
But this is very similar to example code on cppreference:
#include <iostream>
#include <locale>
#include <string>
int main()
{
std::wcout << "User-preferred locale setting is " << std::locale("").name().c_str() << '\n';
// on startup, the global locale is the "C" locale
std::wcout << 1000.01 << '\n';
// replace the C++ global locale as well as the C locale with the user-preferred locale
std::locale::global(std::locale(""));
// use the new global locale for future wide character output
std::wcout.imbue(std::locale());
// output the same number again
std::wcout << 1000.01 << '\n';
}
Actually that code crashes Coliru also... :facepalm:
More crashes of similar code from Coliru.
Is this a bug in the C++ library used by Clang, or is this code defective?
Note also: these crashes seem to be restricted to the C++ API. If you use <clocale> instead, things seem to work OK, so it may just be some trivial problem in the C++ bindings over it?
Variations using setlocale: 1 2 3
Looks like this is caused by libstdc++'s ABI change in its basic_string, which was needed for C++11 conformance. To manage this transition, GCC added the abi_tag attribute, which changes the mangled name of functions so that functions for the new and old ABI can be distinguished, even if the change wouldn't otherwise affect the mangled name (e.g. the return type of a function).
This code
#include <locale>
#include <string>
int main() {
std::locale().name();
}
on GCC emits a call to _ZNKSt6locale4nameB5cxx11Ev, which demangles to std::locale::name[abi:cxx11]() const, and returns an SSO string with the new ABI.
Clang, on the other hand, doesn't support the abi_tag attribute, and emits a call to _ZNKSt6locale4nameEv, which demangles to simply std::locale::name() const, which is the version returning a COW string (the old ABI).
The net result is that the program ends up trying to use a COW string as an SSO string when compiled with Clang. Havoc ensues.
The obvious workaround is to force the old ABI via -D_GLIBCXX_USE_CXX11_ABI=0.
I think the "" parameter might be corrupting something. I don't think it's a legal argument?
To verify it's nothing else, try running this:
#include <iostream>
#include <locale>
int main() {
std::locale("").name();
}
It compiles and runs just fine with GCC:
g++ -Wall -pedantic locale.cpp
<= No errors, no warnings
./a.out
The locale is 'en_US.UTF-8'
<= Expected output
ADDENDUM:
Exactly the same with MSVS 2013 - no errors or warnings compiling; no errors running:
locale.cpp =>
#include <iostream>
#include <locale>
#include <string>
int main() {
std::cout << "The locale is '" << std::locale("").name() << "'" << std::endl;
}
Output =>
locale
The locale is 'English_United States.1252'
I see two ways of accessing a boost::optional variable:
The dereference operator on the variable
The variable itself
If I have this code snippet:
#include <boost/optional.hpp>
#include <iostream>
int main()
{
boost::optional<int> oi;
std::cout << oi << "\n";
}
(where oi is uninitialized) and compile it using "g++-4.9 /tmp/optional.cc" followed by ./a.out, I get 0,
but with this:
#include <boost/optional.hpp>
#include <iostream>
int main()
{
boost::optional<int> oi;
std::cout << *oi << "\n";
}
I get:
a.out: /usr/include/boost/optional/optional.hpp:631: boost::optional<T>::reference_type boost::optional<T>::get() [with T = int; boost::optional<T>::reference_type = int&]: Assertion `this->is_initialized()' failed.
Aborted (core dumped)
which is the expected behavior.
You must have been using an older version of Boost. Your first case triggered a conversion to bool; since the optional does not contain a value, the result of the conversion is false, which is printed as 0.
Newer versions (1.56-1.57) added an operator<< function template declaration to <boost/optional.hpp>
template<class CharType, class CharTrait, class T>
std::basic_ostream<CharType, CharTrait>&
operator<<(std::basic_ostream<CharType, CharTrait>& out, optional<T> const& v);
to catch this kind of mistake and cause a linker error instead.
Note that including <boost/optional/optional_io.hpp> allows you to actually use the stream operators with optional, in which case optionals that do not contain a value are printed as --.
boost::optional<T> ostream helpers are actually available since boost 1.34. See http://www.boost.org/doc/libs/1_34_0/boost/optional/optional_io.hpp
Note that one needs to EXPLICITLY include <boost/optional/optional_io.hpp> to enable these helpers. It is NOT included by <boost/optional.hpp>.
I am using Visual Studio 2013 for development, which uses v12 of Microsoft's C++ compiler tools.
The following code executes fine, printing "foo" to the console:
#include <regex>
#include <iostream>
#include <string>
std::string get() {
return std::string("foo bar");
}
int main() {
std::smatch matches;
std::string s = get();
std::regex_match(s, matches, std::regex("^(foo).*"));
std::cout << matches[1] << std::endl;
}
// Works as expected.
The same code, with the call to get() substituted for the string s, throws a "string iterators incompatible" error at runtime:
#include <regex>
#include <iostream>
#include <string>
std::string get() {
return std::string("foo bar");
}
int main() {
std::smatch matches;
std::regex_match(get(), matches, std::regex("^(foo).*"));
std::cout << matches[1] << std::endl;
}
// Debug Assertion Failed!
// Expression: string iterators incompatible
This makes no sense to me. Can anyone explain why this happens?
The reason is that get() returns a temporary string, so the match results contain iterators into an object that no longer exists, and trying to use them is undefined behaviour. The debugging assertions in the Visual Studio C++ library notice this problem and abort your program.
Originally C++11 did allow what you're trying to do, but because it is so dangerous it was prevented by adding a deleted overload of std::regex_match, which gets selected when trying to get match results from a temporary string; see LWG DR 2329. That means your program should not compile in C++14 (or in compilers that implement the DR in C++11 mode too). GCC does not implement the change yet; I'll fix that.
Is there a standard way to check whether construction of a new std::codecvt_byname succeeded?
I was experimenting with the following program:
// cl /nologo /Fetest_codecvt_byname.exe /EHsc test_codecvt_byname.cpp && test_codecvt_byname
// g++ -o test_codecvt_byname test_codecvt_byname.cpp && test_codecvt_byname
#include <cstdlib>
#include <iostream>
#include <locale>
#include <new>
#include <stdexcept>
int main()
{
try {
new std::codecvt_byname<wchar_t, char, mbstate_t>(".nonsense");
} catch (const std::exception& ex) {
std::cerr << "Error: " << ex.what() << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
libstdc++ on Windows apparently throws a std::runtime_error object if the named locale is unsupported. Microsoft Visual C++'s STL implementation, however, does not throw an exception.
Not knowing which C++ compiler will compile the code, how do I check whether construction of the new std::codecvt_byname succeeded? Alternatively, is there a way to check whether construction will be successful assuming no out-of-memory scenario?
Section [22.3.1.1.2], Class locale::facet, of the C++11 FDIS states:
For some standard facets a standard "..._byname" class, derived from it, implements the virtual function semantics equivalent to that facet of the locale constructed by locale(const char*) with the same name.
Unfortunately, the Standard does not require the std::codecvt_byname constructor to throw an exception if the named locale is invalid, whereas the explicit std::locale constructor locale(const char*) is required to throw std::runtime_error in that case. A workaround is therefore to construct the std::locale and obtain the codecvt facet from it with std::use_facet, instead of using std::codecvt_byname directly.