How to use u8_to_u32_iterator in Boost Spirit X3?

I am using Boost Spirit X3 to create a programming language, but when I try to support Unicode, I get an error!
Here is a simplified version of that program.
#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
struct sample : x3::symbols<unsigned> {
sample()
{
add("48", 10);
}
};
int main()
{
const std::string s("🌸");
boost::u8_to_u32_iterator<std::string::const_iterator> first{cbegin(s)},
last{cend(s)};
x3::parse(first, last, sample{});
}
Live on wandbox
What should I do?

As you noticed, internally char_encoding::unicode employs char32_t.
So, first change the symbols parser accordingly:
template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;
struct sample : symbols<unsigned> {
sample() { add(U"48", 10); }
};
Now the code fails calling into case_compare:
/home/sehe/custom/boost_1_78_0/boost/spirit/home/x3/string/detail/tst.hpp|74 col 33| error: no match for call to ‘(boost::spirit::x3::case_compare<boost::spirit::char_encoding::unicode>) (reference, char32_t&)’
As you can see, it expects a char32_t reference, but u8_to_u32_iterator yields unsigned ints (std::uint32_t) by default.
Just for comparison / sanity check: https://godbolt.org/z/1zozxq96W
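The mismatch comes down to char32_t and std::uint32_t being distinct types, even though they have the same size. As a minimal standard-library-only sketch (the comparator below is a hypothetical stand-in, not Spirit's actual case_compare), a callable that needs both arguments to share one character type rejects the mixed pair:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical comparator (NOT Spirit's case_compare): both arguments must
// deduce to the same character type.
struct same_char_compare {
    template <typename Char>
    bool operator()(Char lhs, Char rhs) const { return lhs == rhs; }
};

// Same width, yet two unrelated types as far as the type system is concerned.
constexpr bool same_size = sizeof(char32_t) == sizeof(std::uint32_t);
constexpr bool same_type = std::is_same_v<char32_t, std::uint32_t>;

// Matching character types are accepted...
constexpr bool accepts_matching =
    std::is_invocable_v<same_char_compare, char32_t, char32_t>;
// ...but mixing std::uint32_t with char32_t makes template deduction fail.
constexpr bool accepts_mixed =
    std::is_invocable_v<same_char_compare, std::uint32_t, char32_t>;
```

This is only meant to illustrate why an iterator that yields std::uint32_t does not slot into code written against char32_t.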
Luckily, you can instruct u8_to_u32_iterator to use a different output (co-domain) type through its second template argument:
Live On Compiler Explorer
#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
namespace x3 = boost::spirit::x3;
template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;
struct sample : symbols<unsigned> {
sample() { add(U"48", 10)(U"🌸", 11); }
};
int main() {
auto test = [](auto const& s) {
boost::u8_to_u32_iterator<decltype(cbegin(s)), char32_t> first{
cbegin(s)},
last{cend(s)};
unsigned parsed_value;
if (x3::parse(first, last, sample{}, parsed_value)) {
std::cout << s << " -> " << parsed_value << "\n";
} else {
std::cout << s << " FAIL\n";
}
};
for (std::string s : {"🌸", "48", "🤷"})
test(s);
}
Prints
🌸 -> 11
48 -> 10
🤷 FAIL
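To see why "🌸" reaches the symbol table as a single character, it can help to look at what a UTF-8 to UTF-32 adapter does conceptually. The following is a rough standard-library-only sketch, not Boost's implementation; decode_utf8 is a hypothetical helper and it assumes well-formed input:

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 bytes into UTF-32 code points.
// Assumption: the input is well-formed UTF-8; malformed input is not detected.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i++]);
        // The lead byte tells how many continuation bytes follow.
        const int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        // Keep only the payload bits of the lead byte.
        char32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        // Each continuation byte contributes its low six bits.
        for (int k = 0; k < extra && i < bytes.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i++]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

The four bytes F0 9F 8C B8 that encode "🌸" collapse into the single code point U+1F338, which is what the parser compares against the U"🌸" entry.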

Related

Vector of string shared memory map

How do I append a string to a vector contained inside a map? The structure is map(float, vector(string)), and the map lives in shared memory. If a key equals the desired key, how do I append a string to that key's vector of strings?

Do you mean something like this?
#include <map>
#include <vector>
#include <string>
#include <iostream>
int main()
{
std::map<float, std::vector<std::string>> m;
m[.5f].emplace_back("First");
m[.5f].emplace_back("Second");
m[.0f].emplace_back("Hello");
m[.0f].emplace_back("World");
for(const auto& [key, value] : m)
{
std::cout << "Key: " << key << '\n';
for(const auto& str : value)
std::cout << '\t' << str << '\n';
}
std::cout.flush();
return 0;
}
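Note that operator[] inserts a missing key. If the string should only be appended when the key is already present, as the question asks, a lookup with find() avoids the insertion. A small sketch (append_if_present and the StringLists alias are names made up for this example, and no shared memory is involved):

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using StringLists = std::map<float, std::vector<std::string>>;

// Appends value to the vector stored under key, but only if key already
// exists. Returns whether the append happened.
bool append_if_present(StringLists& m, float key, std::string value) {
    const auto it = m.find(key);  // find() never inserts, unlike operator[]
    if (it == m.end()) {
        return false;
    }
    it->second.push_back(std::move(value));
    return true;
}
```

Keep in mind that exact comparison of float keys only matches when the values are bit-for-bit equal, so keys computed in different ways may not find each other.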

Doing this in shared memory is pretty hard, actually.
If you get all the allocators right, and add the locking, you'd usually get very clunky code that is hard to read due to all the allocator passing around.
You can, however, employ Boost's scoped allocator adaptor which will do a lot (lot) of magic that makes life better.
I think the following code sample just about nails the sweet spot.
Warning: This builds on years of experience trying to beat this into submission. If you fall just outside the boundary of the "magic" (mostly the in-place construction support due to uses_allocator<> and scoped_allocator_adaptor), you will find it breaks down and you'll be writing a lot of manual constructor/conversion calls to make it work.
Live On Coliru
#define DEMO
#include <iostream>
#include <iomanip>
#include <mutex>
#include <boost/interprocess/containers/map.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/managed_mapped_file.hpp> // For Coliru (doesn't support shared memory)
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/container/scoped_allocator.hpp>
namespace bip = boost::interprocess;
namespace bc = boost::container;
namespace Shared {
using Segment = bip::managed_mapped_file; // Coliru doesn't support bip::managed_shared_memory
template <typename T> using Alloc = bc::scoped_allocator_adaptor<bip::allocator<T, Segment::segment_manager> >;
template <typename V>
using Vector = bip::vector<V, Alloc<V> >;
template <typename K, typename V, typename Cmp = std::less<K> >
using Map = bip::map<K, V, Cmp, Alloc<std::pair<K const, V> > >;
using String = bip::basic_string<char, std::char_traits<char>, Alloc<char> >;
using Mutex = bip::interprocess_mutex;
}
namespace Lib {
using namespace Shared;
struct Data {
using Map = Shared::Map<float, Shared::Vector<Shared::String> >;
mutable Mutex _mx;
Map _map;
template <typename Alloc> Data(Alloc alloc = {}) : _map(alloc) {}
bool append(float f, std::string s) {
std::lock_guard<Mutex> lk(_mx); // lock
auto it = _map.find(f);
bool const exists = it != _map.end();
#ifndef DEMO
if (exists) {
it->second.emplace_back(s);
}
#else
// you didn't specify this, but let's insert new keys here, if
// only for the demo
_map[f].emplace_back(s);
#endif
return exists;
}
size_t size() const {
std::lock_guard<Mutex> lk(_mx); // lock
return _map.size();
}
friend std::ostream& operator<<(std::ostream& os, Data const& data) {
std::lock_guard<Mutex> lk(data._mx); // lock
for (auto& [f,v] : data._map) {
os << f << " ->";
for (auto& ss : v) {
os << " " << std::quoted(std::string(ss));
}
os << "\n";
}
return os;
}
};
}
struct Program {
Shared::Segment msm { bip::open_or_create, "data.bin", 10*1024 };
Lib::Data& _data = *msm.find_or_construct<Lib::Data>("data")(msm.get_segment_manager());
void report() const {
std::cout << "Map contains " << _data.size() << " entries\n" << _data;
}
};
struct Client : Program {
void run(float f) {
_data.append(f, "one");
_data.append(f, "two");
}
};
int main() {
{
Program server;
server.report();
Client().run(.5f);
Client().run(.6f);
}
// report again
Program().report();
}
First run would print:
Map contains 0 entries
Map contains 2 entries
0.5 -> "one" "two"
0.6 -> "one" "two"
A second run:
Map contains 2 entries
0.5 -> "one" "two"
0.6 -> "one" "two"
Map contains 2 entries
0.5 -> "one" "two" "one" "two"
0.6 -> "one" "two" "one" "two"
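The "magic" of the scoped allocator adaptor can be observed in isolation, without any shared memory. The sketch below uses std::scoped_allocator_adaptor rather than Boost's, plus a made-up TaggedAlloc that merely carries an arena id (both are assumptions for illustration). It shows that a nested string picks up the outer container's allocator only when the adaptor is in play:

```cpp
#include <cstddef>
#include <memory>
#include <scoped_allocator>
#include <string>
#include <vector>

// Hypothetical allocator that carries an arena id, so we can observe which
// allocator instance a container was constructed with. Memory still comes
// from the regular heap.
template <typename T>
struct TaggedAlloc {
    using value_type = T;
    int arena = 0;

    TaggedAlloc() = default;
    explicit TaggedAlloc(int id) : arena(id) {}
    template <typename U>
    TaggedAlloc(const TaggedAlloc<U>& other) : arena(other.arena) {}

    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};

template <typename T, typename U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.arena == b.arena; }
template <typename T, typename U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return !(a == b); }

using String = std::basic_string<char, std::char_traits<char>, TaggedAlloc<char>>;

// Builds a vector whose allocator lives in arena 7, emplaces one string, and
// reports the arena id that the *string* ended up with.
template <typename VecAlloc>
int element_arena() {
    VecAlloc alloc{TaggedAlloc<String>(7)};
    std::vector<String, VecAlloc> v(alloc);
    v.emplace_back("hello");
    return v.front().get_allocator().arena;
}
```

With a plain TaggedAlloc<String> as the vector's allocator, the element is built with a default-constructed allocator (arena 0). With std::scoped_allocator_adaptor<TaggedAlloc<String>>, the vector's allocator is passed down to the string (arena 7). In shared memory that propagation is what keeps nested data inside the segment.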

Segmentation fault for nested boost::variant

The following program has been reduced from the original. I get a segmentation fault when it runs. If I remove line 24 with ArithmeticUnaryExpression then the program no longer crashes. How do I get rid of the segmentation fault?
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
#include <boost/spirit/include/qi_expect.hpp>
#include <boost/spirit/home/x3/directive/expect.hpp>
#include <iostream>
#include <string>
namespace wctl_parser {
namespace x3 = boost::spirit::x3;
namespace ascii = x3::ascii;
namespace qi = boost::spirit::qi;
using x3::ulong_;
using x3::lexeme;
//--- Ast structures
struct ArithmeticUnaryExpression;
using AtomicProp = std::string;
using ArithmeticExpression = x3::variant<
x3::forward_ast<ArithmeticUnaryExpression>,
unsigned long
>;
struct ArithmeticUnaryExpression {
std::string op;
ArithmeticExpression operand;
};
using Expression = x3::variant<
ArithmeticExpression
>;
template <typename T> auto rule = [](const char* name = typeid(T).name()) {
struct _{};
return x3::rule<_, T> {name};
};
template <typename T> auto as = [](auto p) { return rule<T>() = p; };
//--- Rules
x3::rule<struct aTrivRule, ArithmeticExpression> aTriv("aTriv");
x3::rule<struct exprRule, Expression> expr("expression");
auto const aTriv_def = rule<ArithmeticExpression>("aTriv")
= ulong_
// | '(' > expr > ')'
;
auto const primitive = rule<Expression>("primitive")
= aTriv
;
auto const expr_def
= primitive
;
BOOST_SPIRIT_DEFINE(aTriv)
BOOST_SPIRIT_DEFINE(expr)
auto const entry = x3::skip(ascii::space) [expr];
} //End namespace
int main() {
std::string str("prop");
namespace x3 = boost::spirit::x3;
wctl_parser::Expression root;
auto iter = str.begin();
auto end = str.end();
bool r = false;
r = parse(iter, end, wctl_parser::entry, root);
if (r) {
std::cout << "Parses OK:" << std::endl << str << std::endl;
if (iter != end) std::cout << "Partial match" << std::endl;
std::cout << std::endl << "----------------------------\n";
}
else {
std::cout << "!! Parsing failed:" << std::endl << str << std::endl << std::endl << "----------------------------\n";
}
return 0;
}

Your variant
using ArithmeticExpression = x3::variant<
x3::forward_ast<ArithmeticUnaryExpression>,
unsigned long
>;
will default-construct its first element type. That type, ArithmeticUnaryExpression, contains an ArithmeticExpression member, which is in turn default-constructed, and so on. Can you see the problem already?
Just make sure the default constructed state doesn't lead to infinite recursion:
using ArithmeticExpression = x3::variant<
unsigned long,
x3::forward_ast<ArithmeticUnaryExpression>
>;
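The effect of the ordering can be sketched with std::variant and a hypothetical boxed<T> wrapper that, like the answer's reasoning implies for x3::forward_ast, eagerly allocates a default-constructed T. Neither name comes from Spirit; this is only an illustration of the safe ordering:

```cpp
#include <string>
#include <variant>

// Hypothetical stand-in for a recursive wrapper: owns a heap-allocated T and
// default-constructs that T eagerly.
template <typename T>
class boxed {
public:
    boxed() : p_(new T()) {}
    boxed(const boxed& other) : p_(new T(*other.p_)) {}
    boxed& operator=(const boxed& other) { *p_ = *other.p_; return *this; }
    ~boxed() { delete p_; }
    T& get() { return *p_; }
private:
    T* p_;
};

struct Unary;  // still incomplete at this point

// Safe ordering: the first alternative is trivial to default-construct, so
// default construction stops right here.
using Expr = std::variant<unsigned long, boxed<Unary>>;

struct Unary {
    std::string op;
    Expr operand;  // default-constructs to unsigned long, not to another Unary
};
```

Had boxed<Unary> been listed first, default-constructing an Expr would allocate a Unary, whose operand would allocate another Unary, and so on until the stack runs out.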

Parsing list of variants with boost spirit X3

I am trying to parse a simple list of floats or ints into a vector of variants. I'm using Boost 1.64 on Windows (MinGW 64-bit).
Here is a minimal example:
#include <boost/spirit/home/x3/support/ast/variant.hpp>
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <vector>
namespace x3 = boost::spirit::x3;
struct var : x3::variant<int, float> {
using base_type::base_type;
using base_type::operator=;
};
struct block {
bool dummy; // needed to make newer boost version compile
std::vector<var> vars;
};
BOOST_FUSION_ADAPT_STRUCT(block,
(bool, dummy),
(std::vector<var>, vars)
);
x3::rule<class var, var> const r_var = "var";
x3::rule<class block, block> const r_block = "block";
auto const r_var_def = x3::float_ | x3::int_;
auto const r_block_def = x3::attr(true) >> *x3::lit(";") >> *(r_var >> -x3::lit(","));
BOOST_SPIRIT_DEFINE(r_var, r_block);
bool parse(std::string const &txt, block &ast)
{
using boost::spirit::x3::phrase_parse;
using boost::spirit::x3::space;
auto iter = txt.begin();
auto end = txt.end();
const bool parsed = phrase_parse(iter, end, r_block, space, ast);
return parsed && iter == end;
}
int main() {
std::vector<std::string> list = {
"1, 3, 5.5",
";1.0, 2.0, 3.0, 4.0"
};
for (const auto&i : list) {
block ast;
if (parse(i, ast)) {
std::cout << "OK: " << i << std::endl;
} else {
std::cout << "FAIL: " << i << std::endl;
}
}
}
GCC 7.1 gives the following error:
..\parser\parser.cpp:41:68: required from here
..\..\win64\include/boost/spirit/home/x3/nonterminal/detail/rule.hpp:313:24: error: use of deleted function 'var::var(const var&)'
value_type made_attr = make_attribute::call(attr);
^~~~~~~~~
Any idea why GCC doesn't compile it? It works with Clang, though.
Live on Coliru (switch to clang++ to see it work).

It seems that there is a problem using the inherited special members. Two workarounds:
using var = x3::variant<int, float>;
Alternatively:
struct var : x3::variant<int, float> {
var ( ) = default;
var (var const&) = default;
var& operator= (var const&) = default;
using base_type::base_type;
using base_type::operator=;
};
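The shape of the second workaround can be tried out with std::variant standing in for x3::variant (an assumption made so the sketch needs no Boost; it does not reproduce the GCC 7.1 problem itself). Type traits give a quick compile-time check that the derived type is copyable:

```cpp
#include <type_traits>
#include <variant>

struct var : std::variant<int, float> {
    using base_type = std::variant<int, float>;

    // Spell out the special members instead of relying on inherited ones.
    var() = default;
    var(const var&) = default;
    var& operator=(const var&) = default;

    // Still inherit the converting constructors and assignment operators.
    using base_type::base_type;
    using base_type::operator=;
};

// True only if var can be both copy-constructed and copy-assigned.
constexpr bool var_is_copyable =
    std::is_copy_constructible_v<var> && std::is_copy_assignable_v<var>;
```

A static_assert on such a trait next to the type definition would surface a deleted copy constructor right at the declaration, rather than deep inside the rule machinery.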

Is it possible to ASSERT_DOES_NOT_COMPILE with GTest?

Assume a template class where we assert at compile time that the integer template argument must be greater than zero:
template<int N>
class A
{
public:
A() {
static_assert(N > 0, "N needs to be greater 0.");
}
};
Is it possible to create a googletest unit test that compiles, but reports the error at runtime? For example:
TEST(TestA, ConstructionNotAllowedWithZero)
{
ASSERT_DOES_NOT_COMPILE(
{
A< 0 > a;
}
);
}

There is a way, but sadly it's probably not the way you want.
My first thought was to try to get SFINAE to discount an overload by expanding an invalid lambda in an unevaluated context. Unhappily (in your case), this is specifically disallowed...
#define CODE { \
utter garbage \
}
struct test
{
template<class T>
static std::false_type try_compile(...) { return{}; }
template<class T>
static auto try_compile(int)
-> decltype([]() CODE, void(), std::true_type());
{ return {}; }
};
struct tag {};
using does_compile = decltype(test::try_compile<tag>(0));
output:
./maybe_compile.cpp:88:17: error: lambda expression in an unevaluated operand
-> decltype([]() CODE, void(), std::true_type());
So it was back to the drawing board and a good old system call to call out to the compiler...
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
struct temp_file {
temp_file()
: filename(std::tmpnam(nullptr))
{}
~temp_file() {
std::remove(filename.c_str());
}
std::string filename;
};
bool compiles(const std::string code, std::ostream& reasons)
{
using namespace std::string_literals;
temp_file capture_file;
temp_file cpp_file;
std::ofstream of(cpp_file.filename);
std::copy(std::begin(code), std::end(code), std::ostream_iterator<char>(of));
of.flush();
of.close();
const auto cmd_line = "c++ -x c++ -o /dev/null "s + cpp_file.filename + " 2> " + capture_file.filename;
auto val = system(cmd_line.c_str());
std::ifstream ifs(capture_file.filename);
reasons << ifs.rdbuf();
ifs.close();
return val == 0;
}
auto main() -> int
{
std::stringstream reasons1;
const auto code1 =
R"code(
#include <iostream>
int main() {
return 0;
}
)code";
std::cout << "compiles: " << compiles(code1, reasons1) << std::endl;
std::stringstream reasons2;
const auto code2 =
R"code(
#include <iostream>
int main() {
FOO!!!!XC#£$%^&*()VBNMYGHH
return 0;
}
)code";
std::cout << "compiles: " << compiles(code2, reasons2) << std::endl;
std::cout << "\nAnd here's why...\n";
std::cout << reasons2.str() << std::endl;
return 0;
}
which in my case gives the following example output:
compiles: 1
compiles: 0
And here's why...
/var/tmp/tmp.3.2dADZ7:4:9: error: use of undeclared identifier 'FOO'
FOO!!!!XC#£$%^&*()VBNMYGHH
^
/var/tmp/tmp.3.2dADZ7:4:19: error: non-ASCII characters are not allowed outside of literals and identifiers
FOO!!!!XC#£$%^&*()VBNMYGHH
^
2 errors generated.
Of course, you can wrap the call to compiles() in whatever macros you need to GTest-ify it. You will also have to pass command-line options to the compiler invocation to set the correct include paths and defines.
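If changing the class is an option, there is a lighter-weight alternative worth sketching: express the constraint through SFINAE instead of a static_assert, so that ordinary code can ask whether A<N> is a valid type. The names A_impl and a_compiles below are hypothetical, and this does not work for a static_assert inside a constructor body, which is always a hard error:

```cpp
#include <type_traits>

template <int N>
class A_impl {};  // the real class body would go here

// A<N> only names a type when N > 0; otherwise substitution fails softly.
template <int N>
using A = std::enable_if_t<(N > 0), A_impl<N>>;

// Detection idiom: the partial specialization is chosen only if A<N> is valid.
template <int N, typename = void>
struct a_compiles : std::false_type {};

template <int N>
struct a_compiles<N, std::void_t<A<N>>> : std::true_type {};
```

A test could then check a_compiles<0>::value at runtime, for example with GTest's EXPECT_FALSE, while the test file itself still compiles.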

Hash an arbitrary precision value (boost::multiprecision::cpp_int)

I need to get the hash of a value with arbitrary precision (from Boost.Multiprecision); I use the cpp_int backend. I came up with the following code:
boost::multiprecision::cpp_int x0 = 1;
const auto seed = std::hash<std::string>{}(x0.str());
I don't need the code to be as fast as possible, but I find it very clumsy to hash the string representation.
So my question is twofold:
Keeping the arbitrary precision, can I hash the value more efficiently?
Maybe I should not insist on keeping the arbitrary precision and should instead convert to a double, which I could hash easily (I would still, however, make the comparison needed for the hash table using the arbitrary-precision value)?

You can (ab)use the serialization support:
Support for serialization comes in two forms:
Classes number, debug_adaptor, logged_adaptor and rational_adaptor have "pass through" serialization support which requires the underlying backend to be serializable.
Backends cpp_int, cpp_bin_float, cpp_dec_float and float128 have full support for Boost.Serialization.
So, let me cobble something together that works with boost and std unordered containers:
template <typename Map>
void test(Map const& map) {
std::cout << "\n" << __PRETTY_FUNCTION__ << "\n";
for(auto& p : map)
std::cout << p.second << "\t" << p.first << "\n";
}
int main() {
using boost::multiprecision::cpp_int;
test(std::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
test(boost::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
}
Let's forward the relevant hash<> implementations to our own hash_impl specialization that uses Multiprecision and Serialization:
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
}
namespace boost {
template <typename backend>
struct hash<multiprecision::number<backend> >
: mp_hashing::hash_impl<multiprecision::number<backend> >
{};
}
Now, of course, this raises the question: how is hash_impl implemented?
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
This looks pretty simple. That's because Boost is awesome, and writing a hash_sink device for use with Boost Iostreams is just the following straightforward exercise:
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
Full Demo:
Live On Coliru
#include <iostream>
#include <iomanip>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/cpp_int/serialize.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/functional/hash.hpp>
namespace mp_hashing {
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
}
#include <unordered_map>
#include <boost/unordered_map.hpp>
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
}
namespace boost {
template <typename backend>
struct hash<multiprecision::number<backend> >
: mp_hashing::hash_impl<multiprecision::number<backend> >
{};
}
template <typename Map>
void test(Map const& map) {
std::cout << "\n" << __PRETTY_FUNCTION__ << "\n";
for(auto& p : map)
std::cout << p.second << "\t" << p.first << "\n";
}
int main() {
using boost::multiprecision::cpp_int;
test(std::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
test(boost::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
}
Prints
void test(const Map&) [with Map = std::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >]
one 2596148429267413814265248164610048
three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776
two 13479973333575319897333507543509815336818572211270286240551805124608
void test(const Map&) [with Map = boost::unordered::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >]
three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776
two 13479973333575319897333507543509815336818572211270286240551805124608
one 2596148429267413814265248164610048
As you can see, the difference in implementation between Boost's and the standard library's unordered_map shows up as different orderings for identical hashes.

Just to say that I've just added native hashing support (for Boost.Hash and std::hash) to git develop. It works for all the number types including those from GMP etc. Unfortunately that code won't be released until Boost-1.62 now.
The answer above that (ab)uses serialization support is actually extremely cool and really rather clever ;) However, it wouldn't work if you wanted to use a vector-based hasher like CityHash. I added an example of using that, by accessing the limbs directly, to the docs: https://htmlpreview.github.io/?https://github.com/boostorg/multiprecision/blob/develop/doc/html/boost_multiprecision/tut/hash.html Either direct limb access or the serialization tip will work with all previous releases, of course.
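The limb-based idea can be sketched without Boost. In the example below the big integer is modeled as a plain vector of 64-bit limbs, and combine and hash_limbs are hypothetical helpers. With a real cpp_int the limbs would presumably come from x.backend().limbs() and x.backend().size(), as the linked tutorial describes, and the sign would need to be mixed in separately:

```cpp
#include <cstdint>
#include <vector>

// Folds one 64-bit value into the running seed. The mixing step follows the
// widely used hash_combine pattern; the constant is the 64-bit golden-ratio
// value commonly paired with it.
inline void combine(std::uint64_t& seed, std::uint64_t value) {
    seed ^= value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2);
}

// Hashes a big integer given as a little-endian sequence of 64-bit limbs.
// Assumption: the limbs are normalized (no redundant leading zero limbs), so
// equal values always have equal limb sequences.
std::uint64_t hash_limbs(const std::vector<std::uint64_t>& limbs) {
    std::uint64_t seed = 0;
    for (const std::uint64_t limb : limbs) {
        combine(seed, limb);
    }
    return seed;
}
```

Compared with hashing the decimal string, this touches each limb once and avoids the base conversion entirely.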
