Argument specified multiple times in boost program options [duplicate] - c++

I am wondering whether it is possible to use zero-parameter options multiple times with boost::program_options.
I have something in mind like this:
mytool --load myfile --print_status --do-something 23 --print_status
It is easy to get this working with one "print_status" parameter, but it is not obvious to me how one could use this option two times (in my case, boost throws an exception if a zero-parameter option is specified more than once).
So, the question is:
Is there any (simple) way to achieve this with out-of-the box functionality from program_options?
Right now, it seems this is a drawback of the current program_options implementation.
P.S.:
There have already been similar questions in the past (both over four years old), where no solution was found:
http://lists.boost.org/boost-users/2006/08/21631.php
http://benjaminwolsey.de/de/node/103
This thread contains a solution, but it is not obvious whether it is a working one, and it seems rather complex for such a simple feature:
Specifying levels (e.g. --verbose) using Boost program_options

If you don't need to count the number of times the option has been specified, it's fairly easy (if a little odd); just declare the variable as vector<bool> and set the following parameters:
std::vector<bool> example;
// ...
desc.add_options()
("example,e",
po::value(&example)
->default_value(std::vector<bool>(), "false")
->implicit_value(std::vector<bool>(1), "true")
->zero_tokens()
)
// ...
Specifying a vector suppresses multiple argument checking; default_value says that the vector should by default be empty, implicit_value says to set it to a 1-element vector if -e/--example is specified, and zero_tokens says not to consume any following tokens.
If -e or --example is specified at least once, example.size() will be exactly 1; otherwise it will be 0.
Example.
If you do want to count how many times the option occurs, it's easy enough to write a custom type and validator:
struct counter { int count = 0; };
void validate(boost::any& v, std::vector<std::string> const& xs, counter*, long)
{
if (v.empty()) v = counter{1};
else ++boost::any_cast<counter&>(v).count;
}
Example.
Note that unlike in the linked question this doesn't allow additionally specifying a value (e.g. --verbose 6) - if you want to do something that complex you would need to write a custom value_semantic subclass, as it's not supported by Boost's existing semantics.

Related

Maxima: creating a function that acts on parts of a string

Context: I'm using Maxima on a platform that also uses KaTeX. For various reasons related to content management, this means that we are regularly using Maxima functions to generate the necessary KaTeX commands.
I'm currently trying to develop a group of functions that will facilitate generating different sets of strings corresponding to KaTeX commands for various symbols related to vectors.
Problem
I have written the following function makeKatexVector(x), which takes a string, list or list-of-lists and returns the same type of object, with each string wrapped in \vec{} (i.e. makeKatexVector(string) returns \vec{string} and makeKatexVector(["a","b"]) returns ["\vec{a}", "\vec{b}"] etc).
/* Flexible Make KaTeX Vector Version of List Items */
makeKatexVector(x):= block([ placeHolderList : x ],
if stringp(x) /* Special Handling if x is Just a String */
then placeHolderList : concat("\vec{", x, "}")
else if listp(x[1]) /* check to see if it is a list of lists */
then for j:1 thru length(x)
do placeHolderList[j] : makelist(concat("\vec{", k ,"}"), k, x[j] )
else if listp(x) /* check to see if it is just a list */
then placeHolderList : makelist(concat("\vec{", k, "}"), k, x)
else placeHolderList : "makeKatexVector error: not a list-of-lists, a list or a string",
return(placeHolderList));
Although I have my doubts about the efficiency or elegance of the above code, it seems to return the desired expressions; however, I would like to modify this function so that it can distinguish between single- and multi-character strings.
In particular, I'd like multi-character strings like x_1 to be returned as \vec{x}_1 and not \vec{x_1}.
In fact, I'd simply like to modify the above code so that \vec{} is wrapped around the first character of the string, regardless of how many characters there may be.
My Attempt
I was ready to tackle this with brute force (e.g. transcribing each character of a string into a list and then reassembling); however, the real programmer on the project suggested I look into "Regular Expressions". After exploring that endless rabbit hole, I found the command regex_subst; however, I can't find any Maxima documentation for it, and am struggling to reproduce the examples in the related documentation here.
Once I can work out the appropriate regex to use, I intend to implement this in the above code using an if statement, such as:
if slength(x) >1
then {regex command}
else {regular treatment}
If anyone knows of helpful resources on any of these fronts, I'd greatly appreciate any pointers at all.
Looks like you got the regex approach working, that's great. My advice about handling subscripted expressions in TeX, however, is to avoid working with names which contain underscores in Maxima, and instead work with Maxima expressions with indices, e.g. foo[k] instead of foo_k. While writing foo_k is a minor convenience in Maxima, you'll run into problems pretty quickly, and in order to straighten it out you might end up piling one complication on another.
E.g. Maxima doesn't know there's any relation between foo, foo_1, and foo_k -- those have no more in common than foo, abc, and xyz. What if there are 2 indices? foo_j_k will become something like foo_{j_k} by the preceding approach -- what if you want foo_{j, k} instead? (Incidentally the two are foo[j[k]] and foo[j, k] when represented by subscripts.) Another problematic expression is something like foo_bar_baz. Does that mean foo_bar[baz], foo[bar_baz] or foo_bar_baz?
The code for tex(x_y) yielding x_y in TeX is pretty old, so it's unlikely to go away, but over the years I've come to increasing feel like it should be avoided. However, the last time it came up and I proposed disabling that, there were enough people who supported it that we ended up keeping it.
Something that might be helpful, there is a function texput which allows you to specify how a symbol should appear in TeX output. For example:
(%i1) texput (v, "\\vec{v}");
(%o1) "\vec{v}"
(%i2) tex ([v, v[1], v[k], v[j[k]], v[j, k]]);
$$\left[ \vec{v} , \vec{v}_{1} , \vec{v}_{k} , \vec{v}_{j_{k}} ,
\vec{v}_{j,k} \right] $$
(%o2) false
texput can modify various aspects of TeX output; you can take a look at the documentation (see ? texput).
While I didn't expect that I'd work this out on my own, after several hours, I made some progress, so figured I'd share here, in case anyone else may benefit from the time I put in.
to load the regex in wxMaxima, at least on the MacOS version, simply type load("sregex");. I didn't have this loaded, and was trying to work through our custom platform, which cost me several hours.
take note that many of the arguments in the linked documentation by Dorai Sitaram occur in the reverse, or a different order than they do in their corresponding Maxima versions.
not all the "pregexp" functions exist in Maxima;
In addition to this, escaping special characters varied in important ways between wxMaxima, the inline Maxima compiler (running within Ace editor) and the actual rendered version on our platform; in particular, the inline compiler often returned false for expressions that compiled properly in wxMaxima and on the platform. Because I didn't have sregex loaded on wxMaxima from the beginning, I lost a lot of time to this.
Finally, the regex expression that achieved the desired substitution, in my case, was:
regex_subst("\vec{\\1}", "([[:alpha:]])", "v_1");
which returns vec{v}_1 in wxMaxima (N.B. none of my attempts to get wxMaxima to return \vec{v}_1 were successful; escaping the backslash just does not seem to work; fortunately, the usual escaped version \\vec{\\1} does return the desired form).
I have yet to adjust the code for the rest of the function, but I doubt that will be of use to anyone else, and wanted to be sure to post an update here, before anyone else took time to assist me.
Always interested in better methods / practices or any other pointers / feedback.

Replace all occurrences based on map

This question might have already been answered but I haven't found it yet.
Let's say I have a std::map<string,string> which contains string pairs of <replace_all_this, to_this>.
I checked Boost's format library, which is close but not perfect:
std::map<string,string> m;
m['$search1'] = 'replace1';
m['$search2'] = 'replace2';
format fmter1("let's try to %1% and %2%.");
fmter % 36; fmter % 77;
for(auto r : m) {
fmter % r.second;
}
// would print "let's try to replace1 and replace2
This would work, but I lose control of what.
Actually I'd like to have this as result:
format fmter1("let's try to $search2 and $search1 even if their order is different in the map.");
...
//print: "let's try to replace2 and replace1 even if their order is different in the map".
Please note: map can contain more items, and items can occur multiple times in the formatter.
What is the way to go for this in 2020, I'd like it to be effective and fast, so I'd avoid iterating over the map multiple times.
There may be new libraries but there is no new algorithm to do that faster than what we have so far.
Assuming your format implies a $<name> for your variables, you can search for the first '$', read the <name> search for that in the map, then do the replace. This allows you to either skip the replacement or process it too (i.e. make it recursive where a variable can reference another).
I don't think that doing it the other way around would be any faster: i.e. go through the map and search for the names in the string means you'd be parsing the strings many times and if you have many variables, it will be a huge waste if most are not likely part of your string. Also if you want to prevent some level of recursivity, it's very complicated.
Where you can eventually optimize is in calculating the size of the resulting string and allocate that buffer once instead of using += which is not unlikely going to be slower.
I have such an implementation in my snaplogger. The variable has to be between brackets and it can include multiple parameters to further tweak the data. There is documentation here about what is supported by that function and as written, you can easily extend the class to add more features. It's probably not a one to one match to what you're looking for, but it shows you that there is not 20 ways of implementing that function.

Is there an alternative to the pipe syntax for ranges-v3?

The pipe | syntax in ranges-v3 is great but it required knowing up front all of the view's I'd like to append... Is there an alternative syntax that lets me optionally connect views depending on some condition?
Rangesv3 uses the type system to store information about what the operations are. This makes things very efficient at runtime, as the compiler knows what happens to the data as it passes from one step to another.
To do what you want, you need to erase the type information and forget it.
To this end, they have various any_views. An "any_input_view<int>" can store a terminal of a pipe that will output ints.
If you then have a transformation double_values that, well, doubles values, you can do:
any_input_view<int> double_the_view( any_input_view<int> in ) {
return std::move(in) | double_values;
}
note, however, that each such stage has a performance hit compared to the non-type erased version.

C/C++ switch case with string [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
C/C++: switch for non-integers
Hi,
I need to use a string in switch case. My solution so far was to calculate the hash of the string with my hash function. Problem is I have to manually pre-calculate all my hash values for strings. Is there a better approach?
h=_myhash (mystring);
switch (h)
{
case 66452:
.......
case 1342537:
........
}
Just use a if() { } else if () { } chain. Using a hash value is going to be a maintenance nightmare. switch is intended to be a low-level statement which would not be appropriate for string comparisons.
You could map the strings to function pointer using a standard collection; executing the function when a match is found.
EDIT: Using the example in the article I gave the link to in my comment, you can declare a function pointer type:
typedef void (*funcPointer)(int);
and create multiple functions to match the signature:
void String1Action(int arg);
void String2Action(int arg);
The map would be std::string to funcPointer:
std::map<std::string, funcPointer> stringFunctionMap;
Then add the strings and function pointers:
stringFunctionMap.add("string1", &String1Action);
I've not tested any of the code I have just posted, it's off the top of my head :)
Typically, you would use a hash table and function object, both available in Boost, TR1 and C++0x.
void func1() {
}
std::unordered_map<std::string, std::function<void()>> hash_map;
hash_map["Value1"] = &func1;
// .... etc
hash_map[mystring]();
This is a little more overhead at runtime but a bajillion times more maintainable. Hash tables offer O(1) insertion, lookup, and etc, which makes them the same complexity as the assembly-style jump-table.
The best way is to use source generation, so that you could use
if (hash(str) == HASH("some string") ..
in your main source, and an pre-build step would convert the HASH(const char*) expression to an integer value.
You could create a hashtable. The keys can be the string and the value can be and integer. Setup your integers for the values as constants and then you can check for them with the switch.
Ruslik's suggestion to use source generation seems like a good thing to me. However, I wouldn't go with the concept of "main" and "generated" source files. I'd rather have one file with code almost identical to yours:
h=_myhash (mystring);
switch (h)
{
case 66452: // = hash("Vasia")
.......
case 1342537: // = hash("Petya")
........
}
The next thing I'd do, I'd write a simple script. Perl is good for such kind of things, but nothing stops you even from writing a simple program in C/C++ if you don't want to use any other languages. This script, or program, would take the source file, read it line-by-line, find all those case NUMBERS: // = hash("SOMESTRING") lines (use regular expressions here), replace NUMBERS with the actual hash value and write the modified source into a temporary file. Finally, it would back up the source file and replace it with the temporary file. If you don't want your source file to have a new time stamp each time, the program could check if something was actually changed and if not, skip the file replacement.
The last thing to do is to integrate this script into the build system used, so you won't accidentally forget to launch it before building the project.
You could use the string to index into a hash table of function pointers.
Edit: glib has a hash table implementation that supports strings as keys and arbitrary pointers as values: http://library.gnome.org/devel/glib/stable/glib-Hash-Tables.html
You can use enumeration and a map, so your string will become the key and enum value is value for that key.
If you are after performance and don't want to go through all the if clauses each time if there are many or the need to hash the values, you could send some extra information to the function with the help of enum or just add an enum type to your structure.
There is no good solution to your problem, so here is an okey solution ;-)
It keeps your efficiency when assertions are disabled and when assertions are enabled it will raise an assertion error when the hash value is wrong.
I suspect that the D programming language could compute the hash value during compile time, thus removing the need to explicitly write down the hash value.
template <std::size_t h>
struct prehash
{
const your_string_type str;
static const std::size_t hash_value = h;
pre_hash(const your_string_type& s) : str(s)
{
assert(_myhash(s) == hash_value);
}
};
/* ... */
std::size_t h = _myhash(mystring);
static prehash<66452> first_label = "label1";
switch (h) {
case first_label.hash_value:
// ...
;
}
By the way, consider removing the initial underscore from the declaration of _ myhash() (sorry but stackoverflow forces me to insert a space between _ and myhash). A C++ implementation is free to implement macros with names starting with underscore and an uppercase letter (Item 36 of "Exceptional C++ Style" by Herb Sutter), so if you get into the habit of giving things names that start underscore, then a beautiful day could come when you give a symbol a name that starts with underscore and an uppercase letter, where the implementation has defined a macro with the same name.

Named parameter string formatting in C++

I'm wondering if there is a library like Boost Format, but which supports named parameters rather than positional ones. This is a common idiom in e.g. Python, where you have a context to format strings with that may or may not use all available arguments, e.g.
mouse_state = {}
mouse_state['button'] = 0
mouse_state['x'] = 50
mouse_state['y'] = 30
#...
"You clicked %(button)s at %(x)d,%(y)d." % mouse_state
"Targeting %(x)d, %(y)d." % mouse_state
Are there any libraries that offer the functionality of those last two lines? I would expect it to offer a API something like:
PrintFMap(string format, map<string, string> args);
In Googling I have found many libraries offering variations of positional parameters, but none that support named ones. Ideally the library has few dependencies so I can drop it easily into my code. C++ won't be quite as idiomatic for collecting named arguments, but probably someone out there has thought more about it than me.
Performance is important, in particular I'd like to keep memory allocations down (always tricky in C++), since this may be run on devices without virtual memory. But having even a slow one to start from will probably be faster than writing it from scratch myself.
The fmt library supports named arguments:
print("You clicked {button} at {x},{y}.",
arg("button", "b1"), arg("x", 50), arg("y", 30));
And as a syntactic sugar you can even (ab)use user-defined literals to pass arguments:
print("You clicked {button} at {x},{y}.",
"button"_a="b1", "x"_a=50, "y"_a=30);
For brevity the namespace fmt is omitted in the above examples.
Disclaimer: I'm the author of this library.
I've always been critic with C++ I/O (especially formatting) because in my opinion is a step backward in respect to C. Formats needs to be dynamic, and makes perfect sense for example to load them from an external resource as a file or a parameter.
I've never tried before however to actually implement an alternative and your question made me making an attempt investing some weekend hours on this idea.
Sure the problem was more complex than I thought (for example just the integer formatting routine is 200+ lines), but I think that this approach (dynamic format strings) is more usable.
You can download my experiment from this link (it's just a .h file) and a test program from this link (test is probably not the correct term, I used it just to see if I was able to compile).
The following is an example
#include "format.h"
#include <iostream>
using format::FormatString;
using format::FormatDict;
int main()
{
std::cout << FormatString("The answer is %{x}") % FormatDict()("x", 42);
return 0;
}
It is different from boost.format approach because uses named parameters and because
the format string and format dictionary are meant to be built separately (and for
example passed around). Also I think that formatting options should be part of the
string (like printf) and not in the code.
FormatDict uses a trick for keeping the syntax reasonable:
FormatDict fd;
fd("x", 12)
("y", 3.141592654)
("z", "A string");
FormatString is instead just parsed from a const std::string& (I decided to preparse format strings but a slower but probably acceptable approach would be just passing the string and reparsing it each time).
The formatting can be extended for user defined types by specializing a conversion function template; for example
struct P2d
{
int x, y;
P2d(int x, int y)
: x(x), y(y)
{
}
};
namespace format {
template<>
std::string toString<P2d>(const P2d& p, const std::string& parms)
{
return FormatString("P2d(%{x}; %{y})") % FormatDict()
("x", p.x)
("y", p.y);
}
}
after that a P2d instance can be simply placed in a formatting dictionary.
Also it's possible to pass parameters to a formatting function by placing them between % and {.
For now I only implemented an integer formatting specialization that supports
Fixed size with left/right/center alignment
Custom filling char
Generic base (2-36), lower or uppercase
Digit separator (with both custom char and count)
Overflow char
Sign display
I've also added some shortcuts for common cases, for example
"%08x{hexdata}"
is an hex number with 8 digits padded with '0's.
"%026/2,8:{bindata}"
is a 24-bit binary number (as required by "/2") with digit separator ":" every 8 bits (as required by ",8:").
Note that the code is just an idea, and for example for now I just prevented copies when probably it's reasonable to allow storing both format strings and dictionaries (for dictionaries it's however important to give the ability to avoid copying an object just because it needs to be added to a FormatDict, and while IMO this is possible it's also something that raises non-trivial problems about lifetimes).
UPDATE
I've made a few changes to the initial approach:
Format strings can now be copied
Formatting for custom types is done using template classes instead of functions (this allows partial specialization)
I've added a formatter for sequences (two iterators). Syntax is still crude.
I've created a github project for it, with boost licensing.
The answer appears to be, no, there is not a C++ library that does this, and C++ programmers apparently do not even see the need for one, based on the comments I have received. I will have to write my own yet again.
Well I'll add my own answer as well, not that I know (or have coded) such a library, but to answer to the "keep the memory allocation down" bit.
As always I can envision some kind of speed / memory trade-off.
On the one hand, you can parse "Just In Time":
class Formater:
def __init__(self, format): self._string = format
def compute(self):
for k,v in context:
while self.__contains(k):
left, variable, right = self.__extract(k)
self._string = left + self.__replace(variable, v) + right
This way you don't keep a "parsed" structure at hand, and hopefully most of the time you'll just insert the new data in place (unlike Python, C++ strings are not immutable).
However it's far from being efficient...
On the other hand, you can build a fully constructed tree representing the parsed format. You will have several classes like: Constant, String, Integer, Real, etc... and probably some subclasses / decorators as well for the formatting itself.
I think however than the most efficient approach would be to have some kind of a mix of the two.
explode the format string into a list of Constant, Variable
index the variables in another structure (a hash table with open-addressing would do nicely, or something akin to Loki::AssocVector).
There you are: you're done with only 2 dynamically allocated arrays (basically). If you want to allow a same key to be repeated multiple times, simply use a std::vector<size_t> as a value of the index: good implementations should not allocate any memory dynamically for small sized vectors (VC++ 2010 doesn't for less than 16 bytes worth of data).
When evaluating the context itself, look up the instances. You then parse the formatter "just in time", check it agaisnt the current type of the value with which to replace it, and process the format.
Pros and cons:
- Just In Time: you scan the string again and again
- One Parse: requires a lot of dedicated classes, possibly many allocations, but the format is validated on input. Like Boost it may be reused.
- Mix: more efficient, especially if you don't replace some values (allow some kind of "null" value), but delaying the parsing of the format delays the reporting of errors.
Personally I would go for the One Parse scheme, trying to keep the allocations down using boost::variant and the Strategy Pattern as much I could.
Given that Python it's self is written in C and that formatting is such a commonly used feature, you might be able (ignoring copy write issues) to rip the relevant code from the python interpreter and port it to use STL maps rather than Pythons native dicts.
I've writen a library for this puporse, check it out on GitHub.
Contributions are wellcome.