Boost property tree to parse custom configuration format - c++

Following this link provided by @sehe in the post "Boost_option to parse a configuration file", I need to parse configuration files that may contain comments.
https://www.boost.org/doc/libs/1_76_0/doc/html/property_tree/parsers.html#property_tree.parsers.info_parser
Since the files contain comments (with a leading #), should a Spirit grammar be used in addition to read_info() to strip the comments? I am referring to info_grammar_spirit.cpp in the /property_tree/examples folder.

You would do well to avoid depending on implementation details, so instead I'd suggest pre-processing your config file just to strip the comments.
A simple replace of "//" with "; " may be enough.
Building on the previous answer:
std::string tmp;
{
std::ifstream ifs(file_name.c_str());
tmp.assign(std::istreambuf_iterator<char>(ifs), {});
} // closes file
boost::algorithm::replace_all(tmp, "//", ";");
std::istringstream preprocessed(tmp);
read_info(preprocessed, pt);
Now if you change the input to include comments:
Resnet50 {
Layer CONV1 {
Type: CONV // this is a comment
Stride { X: 2, Y: 2 } ; this too
Dimensions { K: 64, C: 3, R: 7, S: 7, Y:224, X:224 }
}
// don't forget the CONV2_1_1 layer
Layer CONV2_1_1 {
Type: CONV
Stride { X: 1, Y: 1 }
Dimensions { K: 64, C: 64, R: 1, S: 1, Y: 56, X: 56 }
}
}
It still parses as expected, if we also extend the debug output to verify:
ptree const& resnet50 = pt.get_child("Resnet50");
for (auto& entry : resnet50) {
std::cout << entry.first << " " << entry.second.get_value("") << "\n";
std::cout << " --- Echoing the complete subtree:\n";
write_info(std::cout, entry.second);
}
Prints
Layer CONV1
--- Echoing the complete subtree:
Type: CONV
Stride
{
X: 2,
Y: 2
}
Dimensions
{
K: 64,
C: 3,
R: 7,
S: 7,
Y:224, X:224
}
Layer CONV2_1_1
--- Echoing the complete subtree:
Type: CONV
Stride
{
X: 1,
Y: 1
}
Dimensions
{
K: 64,
C: 64,
R: 1,
S: 1,
Y: 56,
X: 56
}
See it Live On Coliru
Yes, But...?
What if '//' occurs in a string literal? Won't it also get replaced? Yes.
This is not a library-quality solution. You should not expect one, because you didn't have to put in any effort to parse your bespoke configuration file format.
You are the only party who can judge whether the shortcomings of this approach are a problem for you.
However, short of just copying and modifying Boost's parser or implementing your own from scratch, there's not a lot one can do.
For The Masochists
If you don't want to reimplement the entire parser, but still want the "smarts" to skip string literals, here's a pre_process function that does all that. This time, it's truly employing Boost Spirit:
#include <boost/spirit/home/x3.hpp>
std::string pre_process(std::string const& input) {
std::string result;
using namespace boost::spirit::x3;
auto static string_literal
= raw[ '"' >> *('\\'>> char_ | ~char_('"')) >> '"' ];
auto static comment
= char_(';') >> *~char_("\r\n")
| "//" >> attr(';') >> *~char_("\r\n")
| omit["/*" >> *(char_ - "*/") >> "*/"];
auto static other
= +(~char_(";\"") - "//" - "/*");
auto static content
= *(string_literal | comment | other) >> eoi;
if (!parse(begin(input), end(input), content, result)) {
throw std::invalid_argument("pre_process");
}
return result;
}
As you can see, it recognizes string literals (with escapes), and it treats "//" and ';' style linewise comments as equivalent. To "show off" I threw in /* block comments */, which cannot be represented in proper INFO syntax, so we just omit[] them.
Now let's test with a funky example (extended from the "Complicated example demonstrating all INFO features" from the documentation):
#include <boost/property_tree/info_parser.hpp>
#include <iostream>
using boost::property_tree::ptree;
int main() {
boost::property_tree::ptree pt;
std::istringstream iss(
pre_process(R"~~( ; A comment
key1 value1 // Another comment
key2 "value with /* no problem */ special // characters in it {};#\n\t\"\0"
{
subkey "value split "\
"over three"\
"lines"
{
a_key_without_value ""
"a key with special characters in it {};#\n\t\"\0" ""
"" value /* Empty key with a value */
"" /*also empty value: */ "" ; Empty key with empty value!
}
})~~"));
read_info(iss, pt);
std::cout << " --- Echoing the parsed tree:\n";
write_info(std::cout, pt);
}
Prints (Live On Coliru)
--- Echoing the parsed tree:
key1 value1
key2 "value with /* no problem */ special // characters in it {};#\n \"\0"
{
subkey "value split over threelines"
{
a_key_without_value ""
"a key with special characters in it {};#\n \"\0" ""
"" value
"" ""
}
}
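If pulling in Spirit feels like too much, the same string-literal-aware pass can be approximated with a small hand-written scanner. This is only a minimal sketch, assuming that literals are double-quoted with backslash escapes and that only "//" needs rewriting; slash_to_semicolon is a hypothetical name, not part of Boost, and block comments are not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Boost): turn "//" comments into INFO-style
// ";" comments, leaving double-quoted string literals untouched.
std::string slash_to_semicolon(std::string const& in) {
    std::string out;
    out.reserve(in.size());
    bool in_string = false;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char const c = in[i];
        bool const has_next = i + 1 < in.size();
        if (in_string) {
            out += c;
            if (c == '\\' && has_next)
                out += in[++i];          // keep the escaped character as-is
            else if (c == '"')
                in_string = false;       // closing quote
        } else if (c == '"') {
            in_string = true;            // opening quote
            out += c;
        } else if (c == ';' || (c == '/' && has_next && in[i + 1] == '/')) {
            if (c == '/') ++i;           // skip the second slash
            out += ';';
            // copy the rest of the comment line verbatim
            while (i + 1 < in.size() && in[i + 1] != '\n')
                out += in[++i];
        } else {
            out += c;
        }
    }
    return out;
}
```

The result can be fed to read_info through a std::istringstream, just like the replace_all version earlier in this answer.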

Related

Boost_option to parse a configuration file

I am trying to parse a neural network configuration file similar to the following lines. Actual file will have many more lines but similar format.
Resnet50 {
Layer CONV1 {
Type: CONV
Stride { X: 2, Y: 2 }
Dimensions { K: 64, C: 3, R: 7, S: 7, Y:224, X:224 }
}
Layer CONV2_1_1 {
Type: CONV
Stride { X: 1, Y: 1 }
Dimensions { K: 64, C: 64, R: 1, S: 1, Y: 56, X: 56 }
}
I use this Boost argument parsing code:
void to_cout(const std::vector<std::string> &v)
{
std::copy(v.begin(), v.end(), std::ostream_iterator<std::string>{std::cout, "\n"});
}
int main(int argc, char* argv[]) {
namespace po = boost::program_options;
po::options_description conf("Config file options");
conf.add_options()("confg_file", po::value<std::string>(&file_name), "HW configuration file");
po::options_description all_options;
all_options.add(conf);
po::variables_map vm;
po::store(po::parse_command_line(argc, argv, all_options), vm);
po::notify(vm);
return 0;
}
This seems like a regular parsing routine, but the configuration file wasn't parsed correctly: there was no output from to_cout for vm. How does parse_command_line get into the hierarchy of the example configuration file?
That's not what Program Options is about. You can use it to read ini files, but not with the code shown. You are literally invoking parse_command_line (not parse_config_file).
The code you show allows you to parse the name of a config file from the command line. This is also why the value is std::string file_name.
Maybe we're missing (quite a lot of) code, because there's also nothing invoking to_cout in your code, never mind that it wouldn't work with vm because the argument type doesn't directly match. I know you can loop over matched names in the variable map, and this is likely what you did, but that's all not very relevant.
Even if you did call parse_config_file, it would not know how to read that file format, as the documented format is an INI-file flavour.
The Good News
The good news is that your config file does have a format that closely resembles INFO files as supported by Boost Property Tree. Which gives me the first opportunity in 10 years¹ to actually suggest using that library: It seems to be more or less precisely what you are after:
Live On Coliru
#include <boost/property_tree/info_parser.hpp>
#include <iostream>
extern std::string config;
int main() {
boost::property_tree::ptree pt;
std::istringstream iss(config);
read_info(iss, pt);
write_info(std::cout, pt);
}
std::string config = R"(
Resnet50 {
Layer CONV1 {
Type: CONV
Stride { X: 2, Y: 2 }
Dimensions { K: 64, C: 3, R: 7, S: 7, Y:224, X:224 }
}
Layer CONV2_1_1 {
Type: CONV
Stride { X: 1, Y: 1 }
Dimensions { K: 64, C: 64, R: 1, S: 1, Y: 56, X: 56 }
}
}
)";
Prints
Resnet50
{
Layer CONV1
{
Type: CONV
Stride
{
X: 2,
Y: 2
}
Dimensions
{
K: 64,
C: 3,
R: 7,
S: 7,
Y:224, X:224
}
}
Layer CONV2_1_1
{
Type: CONV
Stride
{
X: 1,
Y: 1
}
Dimensions
{
K: 64,
C: 64,
R: 1,
S: 1,
Y: 56,
X: 56
}
}
}
Tying It Together
You may tie it together with a CLI argument for the filename:
Live On Coliru
#include <boost/property_tree/info_parser.hpp>
#include <boost/program_options.hpp>
#include <fstream>
#include <iostream>
using boost::property_tree::ptree;
int main(int argc, char* argv[]) {
std::string file_name;
{
namespace po = boost::program_options;
po::options_description cliopts("options");
cliopts.add_options() //
("config_file", po::value<std::string>(&file_name),
"HW configuration file");
po::variables_map vm;
po::store(po::parse_command_line(argc, argv, cliopts), vm);
if (!vm.contains("config_file")) {
std::cerr << cliopts << '\n';
return 255;
}
po::notify(vm); // sets file_name
}
boost::property_tree::ptree pt;
{
std::ifstream ifs(file_name);
read_info(ifs, pt);
} // closes file
for (auto const& [key, sub] : pt.get_child("Resnet50")) {
std::cout << key << " " << sub.get_value("") << "\n";
}
}
Then for running ./test.exe --config_file config.cfg it may print e.g.
Layer CONV1
Layer CONV2_1_1
¹ 10 years (and more) of admonishing people not to abuse Property Tree as an XML, INI, or JSON parser, because it is none of these things. It's ... a property tree library.

How to capture the value parsed by a boost::spirit::x3 parser to be used within the body of a semantic action?

I have a parser for string literals, and I'd like to attach a semantic action to the parser that will manipulate the parsed value. It seems that boost::spirit::x3::_val() returns a reference to the parsed value when given the context, but for some reason the parsed string always enters the body of the semantic action as just an empty string, which obviously makes it difficult to read from it. It is the right string though, I've made sure by checking the addresses. Anyone know how I could have a reference to the parsed value within the semantic action attached to the parser? This here is the parser I currently use:
x3::lexeme[quote > *("\\\"" >> x3::attr('\"') | ~x3::char_(quote)) > quote]
And I'd like to add the semantic action to the end of it. Thank you in advance!
EDIT: it seems that whenever I attach any semantic action in general to the parser, the value is nullified. I suppose the question now is how could I access the value before that happens? I just need to be able to manipulate the parsed string before it is given to the AST.
In X3, semantic actions are much simpler. They're unary callables that take just the context.
Then you use free functions to extract information from the context:
x3::_val(ctx) is like qi::_val
x3::_attr(ctx) is like qi::_0 (or qi::_1 for simple parsers)
x3::_pass(ctx) is like qi::_pass
So, to get your semantic action, you could do:
auto qstring
= x3::rule<struct rule_type, std::string> {"qstring"}
= x3::lexeme[quote > *("\\" >> x3::char_(quote) | ~x3::char_(quote)) > quote]
;
Now to make a very odd string rule that reverses the text (after de-escaping) and requires the number of characters to be even:
auto odd_reverse = [](auto& ctx) {
auto& attr = x3::_attr(ctx);
auto& val = x3::_val(ctx);
x3::traits::move_to(attr, val);
std::reverse(val.begin(), val.end());
x3::_pass(ctx) = val.size() % 2 == 0;
};
auto odd_string
= x3::rule<struct odd_type, std::string> {"odd_string"}
= qstring [ odd_reverse ]
;
DEMO
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <iomanip>
int main() {
namespace x3 = boost::spirit::x3;
auto constexpr quote = '"';
auto qstring
= x3::rule<struct rule_type, std::string> {"qstring"}
= x3::lexeme[quote > *("\\" >> x3::char_(quote) | ~x3::char_(quote)) > quote]
;
auto odd_reverse = [](auto& ctx) {
auto& attr = x3::_attr(ctx);
auto& val = x3::_val(ctx);
x3::traits::move_to(attr, val);
std::reverse(val.begin(), val.end());
x3::_pass(ctx) = val.size() % 2 == 0;
};
auto odd_string
= x3::rule<struct odd_type, std::string> {"odd_string"}
= qstring [ odd_reverse ]
;
for (std::string const input : {
R"("test \"hello\" world")",
R"("test \"hello\" world!")",
}) {
std::string output;
auto f = begin(input), l = end(input);
if (x3::phrase_parse(f, l, odd_string, x3::blank, output)) {
std::cout << "[" << output << "]\n";
} else {
std::cout << "Failed\n";
}
if (f != l) {
std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << "\n";
}
}
}
Printing
[dlrow "olleh" tset]
Failed
Remaining unparsed: "\"test \\\"hello\\\" world!\""
UPDATE
To the added question:
EDIT: it seems that whenever I attach any semantic action in general
to the parser, the value is nullified. I suppose the question now is
how could I access the value before that happens? I just need to be
able to manipulate the parsed string before it is given to the AST.
Yes, if you attach an action, automatic attribute propagation is inhibited. This is the same in Qi, where you could assign rules with %= instead of = to force automatic attribute propagation.
To get the same effect in X3, use the third template argument to x3::rule: x3::rule<X, T, true> to indicate you want automatic propagation.
Really, try not to fight the system. In practice, the automatic transformation system is way more sophisticated than I am willing to re-discover on my own, so I usually post-process the whole AST or at most apply some minor tweaks in an action. See also Boost Spirit: "Semantic actions are evil"?

Regular expression to validate syntax of fields in any order, with acceptable values

Consider the following situation:
We want to use a regular expression to validate the syntax of a command with X number of fields - one mandatory, two optional. The three fields can be shown in any order, with any number of spaces separating them, and have limited dictionaries of acceptable values
Mandatory Field: "-foo"
Optional Field 1: Can be either of "-handle" "-bar" or "-mustache"
Optional Field 2: Can be either of "-meow" "-mix" or "-want"
Examples of valid inputs:
-foo
-foo -bar
-foo-want
-foo -meow-bar
-foo-mix-mustache
-handle -foo-meow
-mustache-foo
-mustache -mix -foo
-want-foo
-want-meow-foo
-want-foo-meow
Examples of invalid inputs:
woof
-handle-meow
-ha-foondle
meow
-foobar
stackoverflow
- handle -foo -mix
-handle -mix
-foo -handle -bar
-foo -handle -mix -sodium
I guess you can say, there are three capture groups, with the first being mandatory and the last two being optional:
(\-foo){1}
(\-handle|\-bar|\-mustache)?
(\-meow|\-mix|\-want)?
But I'm not sure on how to write it so that these can be in any order, possibly separated by any amount of spaces, and with nothing else.
What I have so far is three forward-looking capture groups: (% signs indicating stuff to be completed)
^(?=.*?(foo))(?=.*?(\-handle|\-bar|\-mustache))(?=.*?(\-meow|\-mix|\-want))%Verify that group 1 is present once, optional groups 2 and 3 zero or one times, in any order, with any spaces%$
Adding a new capture group is simple enough, as is expanding the acceptable inputs for an existing group, but I'm definitely stumped on the backreferencing, and not quite sure how expanding the checks to accommodate a 4th group would affect the backreferencing.
Or would it make more sense to just use something like boost::split or boost::tokenize on the "-" character, then iterate through them, counting the tokens that fit into group 1, 2, 3, and "none of the above," and verifying the counts?
It seems like it should be a simple extension or application of a boost library.
You mention boost. Have you looked at program_options? http://www.boost.org/doc/libs/1_55_0/doc/html/program_options/tutorial.html
Indeed, a context-free grammar would be fine. Let's parse your command into a structure like:
struct Command {
std::string one, two, three;
};
Now, when we adapt that as a fusion sequence, we can write a Spirit Qi grammar for it and enjoy automagic attribute propagation:
CommandParser() : CommandParser::base_type(start) {
using namespace qi;
command = field(Ref(&f1)) ^ field(Ref(&f2)) ^ field(Ref(&f3));
field = '-' >> raw[lazy(*_r1)];
f1 += "foo";
f2 += "handle", "bar", "mustache";
f3 += "meow", "mix", "want";
start = skip(blank) [ command >> eoi ] >> eps(is_valid(_val));
}
Here, everything is straight-forward: the permutation parser (operator^) allows all three fields in any order.
f1, f2, f3 are the accepted symbols (Options, below) for the respective fields.
The start rule, finally, adds the skipping of blanks, and checks at the end (have we reached eoi? is the mandatory field present?).
Live Demo
Live On Coliru
#include <boost/fusion/adapted/struct.hpp>
struct Command {
std::string one, two, three;
};
BOOST_FUSION_ADAPT_STRUCT(Command, one, two, three)
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;
template <typename It>
struct CommandParser : qi::grammar<It, Command()> {
CommandParser() : CommandParser::base_type(start) {
using namespace qi;
command = field(Ref(&f1)) ^ field(Ref(&f2)) ^ field(Ref(&f3));
field = '-' >> raw[lazy(*_r1)];
f1 += "foo";
f2 += "handle", "bar", "mustache";
f3 += "meow", "mix", "want";
start = skip(blank) [ command >> eoi ] >> eps(is_valid(_val));
}
private:
// mandatory field check
struct is_valid_f {
bool operator()(Command const& cmd) const { return cmd.one.size(); }
};
boost::phoenix::function<is_valid_f> is_valid;
// rules and skippers
using Options = qi::symbols<char>;
using Ref = Options const*;
using Skipper = qi::blank_type;
qi::rule<It, Command()> start;
qi::rule<It, Command(), Skipper> command;
qi::rule<It, std::string(Ref)> field;
// option values
Options f1, f2, f3;
};
boost::optional<Command> parse(std::string const& input) {
using It = std::string::const_iterator;
Command cmd;
bool ok = parse(input.begin(), input.end(), CommandParser<It>{}, cmd);
return boost::make_optional(ok, cmd);
}
#include <iomanip>
#include <iostream>
void run_test(std::string const& input, bool expect_valid) {
auto result = parse(input);
std::cout << (expect_valid == !!result?"PASS":"FAIL") << "\t" << std::quoted(input) << "\n";
if (result) {
using boost::fusion::operator<<;
std::cout << " --> Parsed: " << *result << "\n";
}
}
int main() {
char const* valid[] = {
"-foo",
"-foo -bar",
"-foo-want",
"-foo -meow-bar",
"-foo-mix-mustache",
"-handle -foo-meow",
"-mustache-foo",
"-mustache -mix -foo",
"-want-foo",
"-want-meow-foo",
"-want-foo-meow",
};
char const* invalid[] = {
"woof",
"-handle-meow",
"-ha-foondle",
"meow",
"-foobar",
"stackoverflow",
"- handle -foo -mix",
"-handle -mix",
"-foo -handle -bar",
"-foo -handle -mix -sodium",
};
std::cout << " === Positive test cases:\n";
for (auto test : valid) run_test(test, true);
std::cout << " === Negative test cases:\n";
for (auto test : invalid) run_test(test, false);
}
Prints
=== Positive test cases:
PASS "-foo"
--> Parsed: (foo )
PASS "-foo -bar"
--> Parsed: (foo bar )
PASS "-foo-want"
--> Parsed: (foo want)
PASS "-foo -meow-bar"
--> Parsed: (foo bar meow)
PASS "-foo-mix-mustache"
--> Parsed: (foo mustache mix)
PASS "-handle -foo-meow"
--> Parsed: (foo handle meow)
PASS "-mustache-foo"
--> Parsed: (foo mustache )
PASS "-mustache -mix -foo"
--> Parsed: (foo mustache mix)
PASS "-want-foo"
--> Parsed: (foo want)
FAIL "-want-meow-foo"
FAIL "-want-foo-meow"
=== Negative test cases:
PASS "woof"
PASS "-handle-meow"
PASS "-ha-foondle"
PASS "meow"
PASS "-foobar"
PASS "stackoverflow"
PASS "- handle -foo -mix"
PASS "-handle -mix"
PASS "-foo -handle -bar"
PASS "-foo -handle -mix -sodium"
This is a brute force solution which should work for fairly simple cases.
The idea is to build up a regular expression out of all the permutations of the order in which these capture groups can appear.
In the test data there are only 6 permutations. Obviously this method could get unwieldy pretty easily.
// Build all the permutations into a regex.
std::regex const e{[]{
std::string e;
char const* grps[] =
{
"\\s*(-foo)",
"\\s*(-handle|-bar|-mustache)?",
"\\s*(-meow|-mix|-want)?",
};
// initial permutation
std::sort(std::begin(grps), std::end(grps));
auto sep = "";
do
{
e = e + sep + "(?:";
for(auto const* g: grps)
e += g;
e += ")";
sep = "|"; // separate each permutation with |
}
while(std::next_permutation(std::begin(grps), std::end(grps)));
return e;
}(), std::regex_constants::optimize};
// Do some tests
std::vector<std::string> const tests =
{
"-foo",
"-foo -bar",
"-foo-want",
"-foo -meow-bar",
"-foo-mix-mustache",
"-handle -foo-meow",
"-mustache-foo",
"-mustache -mix -foo",
"-want-foo",
"-want-meow-foo",
"-want-foo-meow",
"woof",
"-handle-meow",
"-ha-foondle",
"meow",
"-foobar",
"stackoverflow",
"- handle -foo -mix",
"-handle -mix",
"-foo -handle -bar",
"-foo -handle -mix -sodium",
};
std::smatch m;
for(auto const& test: tests)
{
if(!std::regex_match(test, m, e))
{
std::cout << "Invalid: " << test << '\n';
continue;
}
std::cout << "Valid: " << test << '\n';
}
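The token-counting alternative raised in the question can also be sketched with nothing but the standard library. The following is a minimal sketch under the rules as stated (exactly one -foo, at most one value from each optional group, blanks only between fields); group_of and is_valid_command are hypothetical names. Note that under these rules inputs such as "-want-meow-foo" are rejected, because -want and -meow come from the same group, which matches the FAIL lines in the Spirit output above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: group index (0 = mandatory, 1 and 2 = optional), or -1.
int group_of(std::string const& tok) {
    if (tok == "foo") return 0;
    if (tok == "handle" || tok == "bar" || tok == "mustache") return 1;
    if (tok == "meow" || tok == "mix" || tok == "want") return 2;
    return -1;
}

// Hypothetical validator: tokenize on '-' and count hits per group.
bool is_valid_command(std::string const& s) {
    int count[3] = {0, 0, 0};
    std::size_t i = 0;
    for (;;) {
        while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i; // skip blanks
        if (i == s.size()) break;
        if (s[i] != '-') return false;       // every field must start with '-'
        std::size_t const start = ++i;
        while (i < s.size() && s[i] != '-' && s[i] != ' ' && s[i] != '\t') ++i;
        int const g = group_of(s.substr(start, i - start));
        if (g < 0 || ++count[g] > 1) return false; // unknown or repeated group
    }
    return count[0] == 1;                    // the mandatory field is present
}
```

Adding a fourth group only means extending group_of and the count array, which is the main attraction of this approach compared to the permutation regex.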

JSON parsing using nlohmann json

I am trying to parse a JSON structure using nlohmann's json.hpp, but I am not able to create the JSON structure from the string. I have tried everything I could think of, but it still fails.
My requirement is to:
1) Create the JSON structure from the string.
2) Find the value of "statusCode" from it.
After trying for so long, I am starting to doubt whether nlohmann's json parser supports nested JSON at all.
#include "json.hpp"
using namespace std;
int main(){
// giving error 1
nlohmann::json strjson = nlohmann::json::parse({"statusResp":{"statusCode":"S001","message":"Registration Success","snStatus":"Active","warrantyStart":"00000000","warrantyEnd":"00000000","companyBPID":"0002210887","siteBPID":"0002210888","contractStart":"00000000","contractEnd":"00000000"}});
// Giving error 2:
auto j= "{
"statusResp": {
"statusCode": "S001",
"message": "Registration Success",
"snStatus": "Active",
"warrantyStart": "20170601",
"warrantyEnd": "20270601",
"companyBPID": "0002210887",
"siteBPID": "0002210888",
"contractStart": "00000000",
"contractEnd": "00000000"
}
}"_json;
// I actually want to get the value of "statusCode" code from the JSOn structure. But no idea how to parse the nested value.
return 1;
}
Below are the errors for both initialisations:
//ERROR 1:
test.cpp: In function 'int main()':
test.cpp:17:65: error: expected '}' before ':' token
nlohmann::json strjson = nlohmann::json::parse({"statusResp":{"statusCode":"S001","message":"Registration Success","snStatus":"Active","warrantyStart":"00000000","warrantyEnd":"00000000","companyBPID":"0002210887","siteBPID":"0002210888","contractStart":"00000000","contractEnd":"00000000"}});
// ERROR 2:
hemanty#sLinux:/u/hemanty/workspaces/avac/cb-product/mgmt/framework/src/lib/libcurl_cpp$g++ test.cpp -std=gnu++11
test.cpp: In function 'int main()':
test.cpp:27:17: error: expected '}' before ':' token
"statusResp": {
Since " is the character that begins and ends a string literal, you cannot have a " character inside a string without putting a \ before it.
std::string str = " "statusCode":"5001" "; //This does not work
std::string str = " \"statusCode\":\"5001\" "; //This will work
An easier alternative when you want to make strings with a lot of " in them is to use a raw string literal, R"(...)". Then you can write it like so:
std::string str = R"("statusCode":"5001")";
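A quick way to convince yourself that the two spellings produce the same string is a small check along these lines. This is only an illustrative sketch; the function names escaped_literal and raw_literal are made up for the example.

```cpp
#include <string>

// Both functions return the same text; only the spelling of the literal differs.
std::string escaped_literal() { return "{\"statusCode\":\"S001\"}"; }
std::string raw_literal()     { return R"({"statusCode":"S001"})"; }
```

Comparing escaped_literal() == raw_literal() yields true: the backslashes in the first version are consumed by the compiler and never end up in the string.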
If we now transfer this to your JSON example, the correct way to parse the strings would be one of the following.
auto j3 = json::parse("{ \"happy\": true, \"pi\": 3.141 }");
// and below the equivalent with raw string literal
auto j3 = json::parse(R"({"happy": true, "pi": 3.141 })");
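// Here we use the `_json` suffix; an ordinary (non-raw) string literal cannot span
// several lines, so the escaped version is written on one line.
auto j2 = "{ \"happy\": true, \"pi\": 3.141 }"_json;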
// Here we combine the R"" with _json suffix to do the same thing.
auto j2 = R"(
{
"happy": true,
"pi": 3.141
}
)"_json;
Examples taken from the readme
If this is what you need:
std::string ss= R"(
{
"test-data":
[
{
"name": "tom",
"age": 11
},
{
"name": "jane",
"age": 12
}
]
}
)";
json myjson = json::parse(ss);
auto &students = myjson["test-data"];
for(auto &student : students) {
cout << "name=" << student["name"].get<std::string>() << endl;
}
Or:
json myjson = { {"name", "tom"}, {"age", 11} };
cout << "name=" << myjson["name"].get<std::string>() << endl;

Is it a good idea to use boost::program_options to parse a text file?

I have to deal with a lot of files with a well-defined syntax and semantics, for example:
the first line is a header with special info
the other lines start with a key that tells you how to parse and handle the content of that line
if there is a comment, it starts with a given token
etc.
Now boost::program_options, as far as I can tell, does pretty much the same job, but I only care about importing the content of those text files, without any extra work in between: just parse it and store it in my data structure.
The key step for me is that I would like to be able to do this parsing with:
regular expressions, since I need to detect different semantics and I can't really imagine another way to do this
error checking (corrupted file, unmatched keys even after parsing the entire file, etc.)
So, can I use this library for this job? Is there a more functional approach?
Okay, a starting point for a Spirit grammar
_Name = "newmtl" >> lexeme [ +graph ];
_Ns = "Ns" >> double_;
_Ka = "Ka" >> double_ >> double_ >> double_;
_Kd = "Kd" >> double_ >> double_ >> double_;
_Ks = "Ks" >> double_ >> double_ >> double_;
_d = "d" >> double_;
_illum %= "illum" >> qi::int_ [ _pass = (_1>=0) && (_1<=10) ];
comment = '#' >> *(char_ - eol);
statement=
comment
| _Ns [ bind(&material::_Ns, _r1) = _1 ]
| _Ka [ bind(&material::_Ka, _r1) = _1 ]
| _Kd [ bind(&material::_Kd, _r1) = _1 ]
| _Ks [ bind(&material::_Ks, _r1) = _1 ]
| _d [ bind(&material::_d, _r1) = _1 ]
| _illum [ bind(&material::_illum, _r1) = _1 ]
;
_material = -comment % eol
>> _Name [ bind(&material::_Name, _val) = _1 ] >> eol
>> -statement(_val) % eol;
start = _material % -eol;
I only implemented the MTL file subset grammar from your sample files.
Note: This is rather a simplistic grammar. But, you know, first things first. In reality I'd probably consider using the keyword list parser from the spirit repository. It has facilities to 'require' certain number of occurrences for the different 'field types'.
Note: Spirit Karma (and some ~50 other lines of code) are only here for demonstrational purposes.
With the following contents of untitled.mtl
# Blender MTL File: 'None'
# Material Count: 2
newmtl None
Ns 0
Ka 0.000000 0.000000 0.000000
Kd 0.8 0.8 0.8
Ks 0.8 0.8 0.8
d 1
illum 2
# Added just for testing:
newmtl Demo
Ns 1
Ks 0.9 0.9 0.9
d 42
illum 7
The output reads
phrase_parse -> true
remaining input: ''
void dump(const T&) [with T = std::vector<blender::mtl::material>]
-----
material {
Ns:0
Ka:{r:0,g:0,b:0}
Kd:{r:0.8,g:0.8,b:0.8}
Ks:{r:0.8,g:0.8,b:0.8}
d:1
illum:2(Highlight on)
}
material {
Ns:1
Ka:(unspecified)
Kd:(unspecified)
Ks:{r:0.9,g:0.9,b:0.9}
d:42
illum:7(Transparency: Refraction on/Reflection: Fresnel on and Ray trace on)
}
-----
Here's the listing
#define BOOST_SPIRIT_USE_PHOENIX_V3
#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp> // for debug output/streaming
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;
namespace wavefront { namespace obj
{
} }
namespace blender { namespace mtl // material?
{
struct Ns { int exponent; }; // specular exponent
struct Reflectivity { double r, g, b; };
using Name = std::string;
using Ka = Reflectivity;
using Kd = Reflectivity;
using Ks = Reflectivity;
using dissolve_factor = double;
enum class illumination_model {
color, // 0 Color on and Ambient off
color_ambient, // 1 Color on and Ambient on
highlight, // 2 Highlight on
reflection_ray, // 3 Reflection on and Ray trace on
glass_ray, // 4 Transparency: Glass on
// Reflection: Ray trace on
fresnel_ray, // 5 Reflection: Fresnel on and Ray trace on
refract_ray, // 6 Transparency: Refraction on
// Reflection: Fresnel off and Ray trace on
refract_ray_fresnel,// 7 Transparency: Refraction on
// Reflection: Fresnel on and Ray trace on
reflection, // 8 Reflection on and Ray trace off
glass, // 9 Transparency: Glass on
// Reflection: Ray trace off
shadow_invis, // 10 Casts shadows onto invisible surfaces
};
struct material
{
Name _Name;
boost::optional<Ns> _Ns;
boost::optional<Reflectivity> _Ka;
boost::optional<Reflectivity> _Kd;
boost::optional<Reflectivity> _Ks;
boost::optional<dissolve_factor> _d;
boost::optional<illumination_model> _illum;
};
using mtl_file = std::vector<material>;
///////////////////////////////////////////////////////////////////////
// Debug output helpers
std::ostream& operator<<(std::ostream& os, blender::mtl::illumination_model o)
{
using blender::mtl::illumination_model;
switch(o)
{
case illumination_model::color: return os << "0(Color on and Ambient off)";
case illumination_model::color_ambient: return os << "1(Color on and Ambient on)";
case illumination_model::highlight: return os << "2(Highlight on)";
case illumination_model::reflection_ray: return os << "3(Reflection on and Ray trace on)";
case illumination_model::glass_ray: return os << "4(Transparency: Glass on/Reflection: Ray trace on)";
case illumination_model::fresnel_ray: return os << "5(Reflection: Fresnel on and Ray trace on)";
case illumination_model::refract_ray: return os << "6(Transparency: Refraction on/Reflection: Fresnel off and Ray trace on)";
case illumination_model::refract_ray_fresnel: return os << "7(Transparency: Refraction on/Reflection: Fresnel on and Ray trace on)";
case illumination_model::reflection: return os << "8(Reflection on and Ray trace off)";
case illumination_model::glass: return os << "9(Transparency: Glass on/Reflection: Ray trace off)";
case illumination_model::shadow_invis: return os << "10(Casts shadows onto invisible surfaces)";
default: return os << "ILLEGAL VALUE";
}
}
std::ostream& operator<<(std::ostream& os, blender::mtl::Reflectivity const& o)
{
return os << "{r:" << o.r << ",g:" << o.g << ",b:" << o.b << "}";
}
std::ostream& operator<<(std::ostream& os, blender::mtl::material const& o)
{
using namespace boost::spirit::karma;
return os << format("material {"
"\n\tNs:" << (auto_ | "(unspecified)")
<< "\n\tKa:" << (stream | "(unspecified)")
<< "\n\tKd:" << (stream | "(unspecified)")
<< "\n\tKs:" << (stream | "(unspecified)")
<< "\n\td:" << (stream | "(unspecified)")
<< "\n\tillum:" << (stream | "(unspecified)")
<< "\n}", o);
}
} }
BOOST_FUSION_ADAPT_STRUCT(blender::mtl::Reflectivity,(double, r)(double, g)(double, b))
BOOST_FUSION_ADAPT_STRUCT(blender::mtl::Ns, (int, exponent))
BOOST_FUSION_ADAPT_STRUCT(blender::mtl::material,
(boost::optional<blender::mtl::Ns>, _Ns)
(boost::optional<blender::mtl::Ka>, _Ka)
(boost::optional<blender::mtl::Kd>, _Kd)
(boost::optional<blender::mtl::Ks>, _Ks)
(boost::optional<blender::mtl::dissolve_factor>, _d)
(boost::optional<blender::mtl::illumination_model>, _illum))
namespace blender { namespace mtl { namespace parsing
{
template <typename It>
struct grammar : qi::grammar<It, qi::blank_type, mtl_file()>
{
template <typename T=qi::unused_type> using rule = qi::rule<It, qi::blank_type, T>;
rule<Name()> _Name;
rule<Ns()> _Ns;
rule<Reflectivity()> _Ka;
rule<Reflectivity()> _Kd;
rule<Reflectivity()> _Ks;
rule<dissolve_factor()> _d;
rule<illumination_model()> _illum;
rule<mtl_file()> start;
rule<material()> _material;
rule<void(material&)> statement;
rule<> comment;
grammar() : grammar::base_type(start)
{
using namespace qi;
using phx::bind;
using blender::mtl::material;
_Name = "newmtl" >> lexeme [ +graph ];
_Ns = "Ns" >> double_;
_Ka = "Ka" >> double_ >> double_ >> double_;
_Kd = "Kd" >> double_ >> double_ >> double_;
_Ks = "Ks" >> double_ >> double_ >> double_;
_d = "d" >> double_;
_illum %= "illum" >> qi::int_ [ _pass = (_1>=0) && (_1<=10) ];
comment = '#' >> *(char_ - eol);
statement=
comment
| _Ns [ bind(&material::_Ns, _r1) = _1 ]
| _Ka [ bind(&material::_Ka, _r1) = _1 ]
| _Kd [ bind(&material::_Kd, _r1) = _1 ]
| _Ks [ bind(&material::_Ks, _r1) = _1 ]
| _d [ bind(&material::_d, _r1) = _1 ]
| _illum [ bind(&material::_illum, _r1) = _1 ]
;
_material = -comment % eol
>> _Name [ bind(&material::_Name, _val) = _1 ] >> eol
>> -statement(_val) % eol;
start = _material % -eol;
BOOST_SPIRIT_DEBUG_NODES(
(start)
(statement)
(_material)
(_Name) (_Ns) (_Ka) (_Kd) (_Ks) (_d) (_illum)
(comment))
}
};
} } }
#include <fstream>
template <typename T>
void dump(T const& data)
{
using namespace boost::spirit::karma;
std::cout << __PRETTY_FUNCTION__
<< "\n-----\n"
<< format(stream % eol, data)
<< "\n-----\n";
}
void testMtl(const char* const fname)
{
std::ifstream mtl(fname, std::ios::binary);
mtl.unsetf(std::ios::skipws);
boost::spirit::istream_iterator f(mtl), l;
using namespace blender::mtl::parsing;
static const grammar<decltype(f)> p;
blender::mtl::mtl_file data;
bool ok = qi::phrase_parse(f, l, p, qi::blank, data);
std::cout << "phrase_parse -> " << std::boolalpha << ok << "\n";
std::cout << "remaining input: '" << std::string(f,l) << "'\n";
dump(data);
}
int main()
{
testMtl("untitled.mtl");
}
Yes, at least if your config file is as simple as a map of key-value pairs (something like a simple .ini).
From documentation:
The program_options library allows program developers to obtain
program options, that is (name, value) pairs from the user, via
conventional methods such as command line and config file.
...
Options can be read from anywhere. Sooner or later the command line
will be not enough for your users, and you'll want config files or
maybe even environment variables. These can be added without
significant effort on your part.
See "multiple sources" sample for details.
But if you need (or may need in the future) more sophisticated config files (XML, JSON or binary, for example), it is worth using a standalone library.
It's most likely possible, but not necessarily convenient. If you want to parse anything, you want to use a parser; whether you use an existing one or write one yourself depends on what you are parsing.
If there is no way to parse your format with any existing tool, then just write your own parser. You can use lex/flex/flex++ with yacc/bison/bison++ or boost::spirit.
I think in the long run learning to maintain your own parser will be more useful than forcefully adjusting a boost::program_options config, but not as convenient as using an existing parser that already matches your needs.
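To give a feel for the "write your own parser" route, here is a minimal hand-rolled sketch for a line-oriented format where the first token of each line selects how the rest is read. It assumes '#' starts a comment line and that every field is a number; the key table is an illustrative subset, and Statement, parse_lines and is_well_formed are hypothetical names.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record: the key at the start of a line plus its numeric fields.
struct Statement {
    std::string key;
    std::vector<double> args;
};

// Parses the text line by line; throws std::runtime_error (with the line
// number) on unknown keys, non-numeric fields, or a wrong field count.
std::vector<Statement> parse_lines(std::string const& text) {
    // Expected number of fields per key (illustrative subset, an assumption).
    static std::map<std::string, std::size_t> const arity = {
        {"Ns", 1}, {"Ka", 3}, {"Kd", 3}, {"Ks", 3}, {"d", 1}, {"illum", 1}};

    std::vector<Statement> out;
    std::istringstream in(text);
    std::string line;
    for (std::size_t n = 1; std::getline(in, line); ++n) {
        std::istringstream fields(line);
        Statement st;
        if (!(fields >> st.key) || st.key[0] == '#')
            continue;                                  // blank line or comment
        std::string const where = "line " + std::to_string(n) + ": ";
        auto const it = arity.find(st.key);
        if (it == arity.end())
            throw std::runtime_error(where + "unknown key " + st.key);
        for (std::string tok; fields >> tok;) {
            std::size_t used = 0;
            double value = 0;
            try {
                value = std::stod(tok, &used);
            } catch (std::exception const&) {
                used = 0;                              // not a number at all
            }
            if (used == 0 || used != tok.size())
                throw std::runtime_error(where + "bad number " + tok);
            st.args.push_back(value);
        }
        if (st.args.size() != it->second)
            throw std::runtime_error(where + "wrong field count for " + st.key);
        out.push_back(st);
    }
    return out;
}

// Convenience wrapper: true if the whole text parses without errors.
bool is_well_formed(std::string const& text) {
    try {
        parse_lines(text);
        return true;
    } catch (std::runtime_error const&) {
        return false;
    }
}
```

This covers the error-checking part of the question (unknown keys, corrupted fields) in a few dozen lines, but it does not scale as gracefully as a grammar once lines have richer structure.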