inconsistent behavior of boost spirit grammar [duplicate] - c++

I have a little grammar that I want to use for a work project. A minimum executable example is:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-local-typedefs"
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#pragma GCC diagnostic ignored "-Wunused-variable"
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/qi_grammar.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#pragma GCC diagnostic pop // pops
#include <iostream>
#include <string>
#include <vector>
int main()
{
typedef unsigned long long ull;
std::string curline = "1;2;;3,4;5";
std::cout << "parsing: " << curline << "\n";
namespace qi = boost::spirit::qi;
auto ids = -qi::ulong_long % ','; // '-' allows for empty vecs.
auto match_type_res = ids % ';' ;
std::vector<std::vector<ull> > r;
qi::parse(curline.begin(), curline.end(), match_type_res, r);
std::cout << "got: ";
for (auto v: r){
for (auto i : v)
std::cout << i << ",";
std::cout << ";";
}
std::cout <<"\n";
}
On my personal machine this produces the correct output:
parsing: 1;2;;3,4;5
got: 1,;2,;;3,4,;5,;
But at work it produces:
parsing: 1;2;;3,4;5
got: 1,;2,;;3,
In other words, it fails to parse the vector of long integers as soon as there's more than one element in it.
Now, I have identified that the work system is using boost 1.56, while my private computer is using 1.57. Is this the cause?
Knowing we have some real Spirit experts here on Stack Overflow, I was hoping someone might know where this issue is coming from, or could at least narrow down the number of things I need to check.
If the boost version is the problem, I can probably convince the company to upgrade, but a workaround would be welcome in any case.

You're invoking Undefined Behaviour in your code.
Specifically where you use auto to store a parser expression. The Expression Template contains references to temporaries that become dangling at the end of the containing full-expression¹.
UB implies that anything can happen. Both compilers are right! And the best part is, you will probably see varying behaviour depending on the compiler flags used.
Fix it in one of these ways:
use qi::copy (or boost::proto::deep_copy before v.1.55 IIRC)
use BOOST_SPIRIT_AUTO instead of BOOST_AUTO (mostly helpful iff you also support C++03)
use qi::rule<> and qi::grammar<> (the non-terminals) to type-erase the expressions. This has a performance impact too, but also gives more features like
recursive rules
locals and inherited attributes
declared skippers (handy because rules can be implicitly lexeme[], see here)
better code organization.
Note also that Spirit X3 promises to drop these restrictions on the use with auto. It's basically a whole lot more lightweight due to the use of C++14 features. Keep in mind that it's not stable yet.
Showing that GCC with -O2 shows undefined results; Live On Coliru
The fixed version:
Live On Coliru
//#pragma GCC diagnostic push
//#pragma GCC diagnostic ignored "-Wunused-local-typedefs"
//#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
//#pragma GCC diagnostic ignored "-Wunused-variable"
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/qi.hpp>
//#pragma GCC diagnostic pop // pops
#include <iostream>
#include <string>
#include <vector>
int main() {
typedef unsigned long long ull;
std::string const curline = "1;2;;3,4;5";
std::cout << "parsing: '" << curline << "'\n";
namespace qi = boost::spirit::qi;
#if 0 // THIS IS UNDEFINED BEHAVIOUR:
auto ids = -qi::ulong_long % ','; // '-' allows for empty vecs.
auto grammar = ids % ';';
#else // THIS IS CORRECT:
auto ids = qi::copy(-qi::ulong_long % ','); // '-' allows for empty vecs.
auto grammar = qi::copy(ids % ';');
#endif
std::vector<std::vector<ull> > r;
qi::parse(curline.begin(), curline.end(), grammar, r);
std::cout << "got: ";
for (auto v: r){
for (auto i : v)
std::cout << i << ",";
std::cout << ";";
}
std::cout <<"\n";
}
Printing (also with GCC -O2!):
parsing: '1;2;;3,4;5'
got: 1,;2,;;3,4,;5,;
¹ (that's basically "at the next semicolon" here; but in standardese)

Related

std::ranges::remove still not supported / broken on clang trunk when using libstdc++?

Works fine on gcc trunk, but not on clang trunk, both with libstdc++.
Or am I missing something exceedingly obvious?
Godbolt
#include <algorithm>
#include <iostream>
#include <ostream>
#include <vector>
std::ostream& operator<<(std::ostream& os, const std::vector<int>& v) {
for (auto&& e: v) os << e << " ";
return os;
}
int main() {
auto ints = std::vector<int>{1,2,3,4,5};
std::cout << ints << "\n";
auto [first, last] = std::ranges::remove(ints, 3);
ints.erase(first, last);
std::cout << ints << "\n";
}
gcc is clean. clang gives a WALL OF ERRORS, complaining about missing "__begin".
UPDATE: If I use -stdlib=libc++ then clang says "never heard of it", so I guess they are just not there yet.
new Godbolt
This seems to be a Clang bug affecting ranges when using libstdc++. See this issue for the underlying cause, which is still open; other issues are linked to it as duplicates, with examples of how it affects ranges with libstdc++. There seems to have been some work on it about two weeks ago.
In libc++, std::ranges::remove does not seem to be implemented yet, as you noticed and as stated on its status page for the ranges implementation.

Is it undefined behavior to use an invalid iterator?

Consider this code:
#include <iostream>
#include <string>
#include <map>
int main()
{
std::map<std::string, std::string> map = {
{ "ghasem", "another" }
};
std::cout << map.find("another")->second << std::endl;
std::cout << map.size() << std::endl;
}
It compiles and runs successfully (the process return value is 0), but the output of map.size() never appears. Neither -fsanitize=address nor -fsanitize=undefined reports any problem. I compiled with GCC 11.2.1 and Clang 13.0.0, and both behave the same. Stepping through the code with GDB 11.1-5 does not help either: every step runs successfully.
But if I reorder the last two lines:
#include <iostream>
#include <string>
#include <map>
int main()
{
std::map<std::string, std::string> map = {
{ "ghasem", "another" }
};
std::cout << map.size() << std::endl;
std::cout << map.find("another")->second << std::endl;
}
I get a segmentation fault, and now ASan reports the error.
My question is: does the code cause some sort of undefined behavior? How can I detect such errors?
Environment:
OS: Fedora 35
Compiler(s):
GCC 11.2.1
Clang 13.0.0
Additional Compiler Flags:
Debugger: GDB 11.1-5
There is no key comparing equal to "another" in the map. Therefore map.find("another") will return the .end() iterator of the map. Dereferencing this iterator in ->second is then undefined behavior since end iterators may not be dereferenced.
Your code should check that the iterator returned from find is not the end iterator, i.e. that an element has been found.
In terms of debugging, if you are using libstdc++ as the standard library implementation (which is the case with GCC and potentially with Clang), you can pass -D_GLIBCXX_DEBUG on the compiler invocation to enable debug assertions in the standard library implementation, which will detect this issue.

'reinterpret_cast<volatile uint8_t*>(37)' is not a constant expression

gcc fails to compile the code below, while clang compiles it fine. I have no control over the macro PORTB, as it is in a 3rd party library (avr).
Is it a gcc bug? How can I work around it in gcc? As a workaround, is it somehow possible to create a preprocessor macro which extracts the numerical value from PORTB?
Note this question is similar, but not identical to my previous question.
It is also different from this question, where the developer has the flexibility to change the rhs of the assignment, thus avoiding the reinterpret_cast.
#include <iostream>
#include <cstdint>
#define PORTB (*(volatile uint8_t *)((0x05) + 0x20))
struct PortB {
static const uintptr_t port = reinterpret_cast<uintptr_t>(&PORTB);
};
int main() {
std::cout << PortB::port << "\n";
return 0;
}
It seems reinterpret_cast is not allowed in a constant expression, so the newer version of the compiler is simply more conforming to the language. reinterpret_cast will be rejected wherever a constant expression is required, such as the in-class initializer of a static const member.
But maybe these workarounds may help (compiles with g++ 9.2):
#include <iostream>
#include <cstdint>
#define PORTB (*(volatile uint8_t *)((0x05) + 0x20))
struct PortB {
static uintptr_t port;
};
uintptr_t PortB::port = reinterpret_cast<uintptr_t>(&PORTB);
const uintptr_t freePort = reinterpret_cast<uintptr_t>(&PORTB);
#define macroPort reinterpret_cast<uintptr_t>(&PORTB)
int main() {
std::cout << PortB::port << "\n";
std::cout << freePort << "\n";
std::cout << macroPort << "\n";
return 0;
}

Avoiding compiler issues with abs()

When using the double variant of the std::abs() function without the std:: qualifier with g++ 4.6.1, no warning or error is given.
#include <algorithm>
#include <cmath>
double foobar(double a)
{
return abs(a);
}
This version of g++ seems to be pulling in the double variant of abs() into the global namespace through one of the includes from algorithm. This looks like it is now allowed by the standard (see this question), but not required.
If I compile the above code using a compiler that does not pull the double variant of abs() into the global namespace (such as g++ 4.2), then the following warning is reported:
warning: passing 'double' for argument 1 to 'int abs(int)'
How can I force g++ 4.6.1, and other compilers that pull functions into the global namespace, to give a warning so that I can prevent errors when used with other compilers?
The function you are using is actually the integer version of abs, and GCC does an implicit conversion to integer.
This can be verified by a simple test program:
#include <iostream>
#include <cmath>
int main()
{
double a = -5.4321;
double b = std::abs(a);
double c = abs(a);
std::cout << "a = " << a << ", b = " << b << ", c = " << c << '\n';
}
Output is:
a = -5.4321, b = 5.4321, c = 5
To get a warning about this, use the -Wconversion flag to g++. Actually, the GCC documentation for that option explicitly mentions calling abs when the argument is a double. All warning options can be found here.
Be warned: you don't need to explicitly #include <cmath>; <iostream> does the damage as well (and maybe some other headers). Also, note that -Wall doesn't give you any warnings about it.
#include <iostream>
int main() {
std::cout << abs(.5) << std::endl;
std::cout << typeid(decltype(abs)).name() << std::endl;
}
Gives output
0
FiiE
On
gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04)

C++11 / g++ : std:: qualifier required in lambda, although "using namespace std" is given

I was trying to discover some of the goodies of the new C++11 standard (using g++ 4.6.2). Playing around with lambdas in an "all_of" algorithm function, I encountered a strange problem with the std:: qualifier.
I am "using" the std namespace as shown at the beginning of the code snippet. This makes the declaration of the pair variable in the for loop well-defined.
However, I tried the same in the lambda argument used in the "all_of" algorithm. I came across several hard-to-understand error messages before I realized that a fully qualified std::pair would work there, but a bare pair would not.
Am I missing an important point? The declaration of the lambda happens in this file, so the namespace should still be active here, right? Or does the required std:: qualifier depend on some STL code in a different file? Or is it likely to be a bug in g++?
Best regards,
Peter
PS: the code compiles without warnings as pasted here, but removing the std:: in the all_of lambda, I get an error message.
#include <iostream>
#include <memory>
#include <map>
#include <string>
#include <algorithm>
#include <utility>
using namespace std;
void duckburg() {
const int threshold = 100;
map <string, int> money;
money["donald"] = 200;
money["daisy"] = 400;
money["scrooge"] = 2000000;
// obviously, an "auto" type would work here nicely,
// but this way my problem is illustrated more clearly:
for (const pair <string, int> &pair : money) {
cout << pair.first << "\t" << pair.second << endl;
}
if (all_of(money.begin(), money.end(),
[&](std::pair<string, int> p) {
return bool(p.second > threshold);
}))
{
cout << "yes, everyone is rich!";
} else {
cout << "no, some are poor!";
};
}
Edit: Just noticed I received a downvote for this old question. No problem with that, but please elaborate on the reasons. It will help me improve future questions, and in the end the entire community will profit. Thanks!
Rename the variable pair in your for loop.
Its scope should only extend to the end of the for loop and therefore not interfere with your lambda, but g++ has some code for ancient for-scoping rules where that was not the case, so it can emit better error messages for ancient C++ code.
It looks as if there is a bug in that compatibility code.