Mutable lists in Constrained Language Mode - list

Mutable .NET lists, such as the List class, are often preferred over the native immutable PowerShell array @() for reasons such as:
PowerShell scripting performance considerations (by avoiding the increase assignment operator (+=) to create a collection)
sophisticated (recursive) functions where you want to reference (rather than copy) a list in another scope as in Powershell filling array with function calling itself to loop through
Unfortunately, these types are not available in Constrained Language mode
$ExecutionContext.SessionState.LanguageMode = 'ConstrainedLanguage'
$List = [Collections.Generic.List[object]]::new()
InvalidOperation: Cannot create type. Only core types are supported in this language mode.
Is there a way to work around this?

Constrained Language mode can be quite a burden if you want to write sophisticated (recursive) PowerShell scripts.
Since Constrained Language is so limited, you will find that many of the approved scripts that you use for advanced systems management no longer work. The solution to this is simple: add these scripts (or more effectively: your code signing authority that signed them) to your Device Guard policy. This will allow your approved scripts to run in Full Language mode. See: PowerShell Constrained Language Mode
If you are an administrator, you might consider (temporarily) disabling Constrained Language mode completely, see: how to change PowerShell mode to full language mode from constrained mode?
Anyways, as a workaround, you might consider using the native (mutable) PowerShell HashTable collection (or an [ordered] type) instead:
# $ExecutionContext.SessionState.LanguageMode = "ConstrainedLanguage"
$List = @{}
function AddItem {
    $List.Add($List.Count, (New-Guid)) # or just: $List[$List.Count] = New-Guid
}
AddItem
AddItem
Although this creates a key-value pair for every entry (where the key is redundant), you might simply obtain just the values with the .Values property:
$List.Values
Guid
----
b22f9cdd-9dba-4868-978e-ccdee3723685
2ccd98a0-a729-4b07-9bd9-8f1306be28d3
Ordered
Hashtables are unordered by nature, which means the first added item doesn't have to be the first item when you list the values ($List.Values). To overcome that, you might sort the list at the moment it is required:
$List.Keys |Sort-Object |ForEach-Object { $List[$_] }
Or, as the Sort-Object cmdlet is quite expensive and the keys are predefined:
0..($List.Count - 1) |ForEach-Object { $List[$_] }
Or using a for loop:
for ($i = 0; $i -lt $List.Count; $i++) { $List[$i] }
Or you might consider using an [ordered] collection type to keep the items (values) in order from the start. For this, be aware that:
apparently the constrained language mode does not support the Add() method of the ordered collection type either:
Cannot invoke method. Method invocation is supported only on core types in this language mode.
any integer key refers to the actual index in the collection rather than an associated key (to resolve this, you might cast the index to a string)
In other words the following modifications are required to use an ordered dictionary instead of a hashtable:
$List = [ordered]@{}
function AddItem {
    $List[$List.Count.ToString()] = New-Guid
}
Notes
There is an outstanding feature request, #5643 "PowerShell should support creating an List similar to how it supports arrays", which likely implies that the suggested syntactic sugar would be compatible with Constrained Language mode.

Related

Is there an alternative to the pipe syntax for ranges-v3?

The pipe | syntax in ranges-v3 is great, but it requires knowing up front all of the views I'd like to append... Is there an alternative syntax that lets me optionally connect views depending on some condition?
ranges-v3 uses the type system to store information about what the operations are. This makes things very efficient at runtime, as the compiler knows what happens to the data as it passes from one step to another.
To do what you want, you need to erase the type information and forget it.
To this end, they have various any_views. An "any_input_view<int>" can store a terminal of a pipe that will output ints.
If you then have a transformation double_values that, well, doubles values, you can do:
any_input_view<int> double_the_view( any_input_view<int> in ) {
    return std::move(in) | double_values;
}
Note, however, that each such stage has a performance hit compared to the non-type-erased version.
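To make the type-erasure idea concrete without depending on ranges-v3, here is a minimal standard-library sketch. It is not the ranges-v3 API: IntStream, from_vector, keep_even, build and collect are hypothetical names invented for this illustration, and double_values is reimplemented here as a plain function. Because every stage has the same erased type, a stage can be appended or skipped based on a run-time condition, at the cost of an indirect call per element.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Type-erased "view": each call yields the next value, or nullopt at the end.
// Every stage has this same type, so stages can be chained at run time.
using IntStream = std::function<std::optional<int>()>;

// Source stage: walks over its own copy of the data.
IntStream from_vector(std::vector<int> data) {
    return [data = std::move(data), i = std::size_t{0}]() mutable -> std::optional<int> {
        if (i < data.size()) return data[i++];
        return std::nullopt;
    };
}

// Stage that doubles every value pulled from upstream.
IntStream double_values(IntStream in) {
    return [in = std::move(in)]() mutable -> std::optional<int> {
        if (auto v = in()) return *v * 2;
        return std::nullopt;
    };
}

// Stage that keeps only even values.
IntStream keep_even(IntStream in) {
    return [in = std::move(in)]() mutable -> std::optional<int> {
        while (auto v = in()) {
            if (*v % 2 == 0) return v;
        }
        return std::nullopt;
    };
}

// Builds the pipeline; which stages are attached is decided by run-time flags.
IntStream build(std::vector<int> data, bool only_even, bool doubled) {
    IntStream s = from_vector(std::move(data));
    if (only_even) s = keep_even(std::move(s));
    if (doubled) s = double_values(std::move(s));
    return s;
}

// Drains a stream into a vector.
std::vector<int> collect(IntStream s) {
    std::vector<int> out;
    while (auto v = s()) out.push_back(*v);
    return out;
}
```

With this sketch, collect(build({1, 2, 3, 4}, true, true)) yields 4 and 8, while passing false for both flags returns the input unchanged.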

Dynamically switch parser while parsing

I'm parsing spice netlists, for which I already have a parser. Since I actually use spectre (cadence, integrated electronics), I want to support both simulator languages (they differ, unfortunately). I could use a switch (e.g. commandline) and use the correct parser from start. However, spectre allows simulator lang=spectre statements, which I would also want to support (and vice versa, of course). How can this be done with boost::spirit?
My grammar looks roughly like this:
line = component_parser |
command_parser |
comment_parser |
subcircuit_parser |
subcircuit_instance_parser;
main = -line % qi::eol >> qi::eoi;
This top-level structure is fine for both languages, so I need to change the subparsers. A first idea would be to have the top-level parser hold instances of (or references to) the respective parsers and to switch between them on finding the simulator lang statement (with a semantic action). Is this a good approach? If not, how else would one do this?
You can use qi::lazy (https://www.boost.org/doc/libs/1_68_0/libs/spirit/doc/html/spirit/qi/reference/auxiliary/lazy.html).
There's an idiomatic pattern related to that, known as The Nabialek Trick.
I have several answers up on this site that show these various techniques.
https://stackoverflow.com/search?q=user%3A85371+qi%3A%3Alazy
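The underlying idea, choosing the active sub-parser from a lookup table while parsing, can be sketched without Spirit. The following is only an illustration in plain C++ and not qi::lazy itself: LineParser, parse_spice_line, parse_spectre_line and parse_netlist are hypothetical names, the "parsers" merely tag each line, and defaulting to spice is an assumption.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-language line parsers; real ones would build an AST.
using LineParser = std::function<std::string(const std::string&)>;

std::string parse_spice_line(const std::string& line) { return "spice:" + line; }
std::string parse_spectre_line(const std::string& line) { return "spectre:" + line; }

// Parses a netlist line by line and switches the active line parser
// whenever a "simulator lang=<name>" directive is seen.
std::vector<std::string> parse_netlist(const std::string& text) {
    const std::map<std::string, LineParser> parsers = {
        {"spice", parse_spice_line},
        {"spectre", parse_spectre_line},
    };
    const std::string directive = "simulator lang=";
    LineParser current = parsers.at("spice");  // assumed default language

    std::vector<std::string> out;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // blank lines are allowed
        if (line.compare(0, directive.size(), directive) == 0) {
            // Look up the requested language; unknown names keep the current parser.
            auto it = parsers.find(line.substr(directive.size()));
            if (it != parsers.end()) current = it->second;
            continue;
        }
        out.push_back(current(line));
    }
    return out;
}
```

In Spirit terms, the lookup table roughly plays the role of a symbols table that maps a keyword to a rule, and the call through current corresponds to the deferred invocation that qi::lazy provides.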

Handling case-insensitivity without taking locale into account

I am investigating the handling of case-insensitivity in my application. So far I realized that there are two different cases:
data is visualized to the user
data is internally handled
For case 1 you should use the user's locale, always. This means that e.g. when sorting items in a list and you want this to happen case-insensitively, then you should use locale-aware case-insensitive string compare functions.
For case 2 it seems logical that you don't want to use the user's locale, since this can have undesirable effects if you have users using a different locale, but still using the same data set (e.g. if you are managing library software, you could use the book's name as key for your book instance in the database, and want to handle this case-insensitive (this is a simplification, I know)).
When using STL containers (like std::map) I noticed that it's much more efficient to put the key in uppercase and then perform the lookup on the uppercase'd search-value. This is more efficient than performing a case-insensitive compare while looping over the map. For std::unordered_map it's probably required to do such a trick.
However, I realized that this may have strange effects as well, and I am wondering how Windows (also using case-insensitive file names) handles these situations.
E.g. The German character ß (ringel-S) is written as SS when put in uppercase. This seems to imply that ß.txt and SS.txt should denote the same file, and so ßs.txt and sß.txt and sss.txt and SSS.txt should also denote the same file. But in my experiments this doesn't seem to be the case in Windows.
So my questions:
Which C++ and/or Windows functions should be used to perform locale-independent case-insensitive string compares?
Which C++ and/or Windows functions should be used to make a string case-less (e.g. put it in uppercase) so compares are then more efficient when performing a lookup in an std::map (or even making hashing possible when using an std::unordered_map)?
Any other experience (or links to documents) regarding case-insensitive string handling for internal (i.e. non-visualization-related) data?
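For reference, the "normalize the key once, then look it up" approach described above might look like the following minimal sketch. It assumes that folding only the ASCII letters a-z is acceptable for internal keys, which sidesteps the locale entirely but also leaves characters such as ß untouched; fold_ascii and CaseInsensitiveMap are hypothetical names, not library functions.

```cpp
#include <map>
#include <string>

// Upper-cases only the ASCII letters a-z. Every other byte, including the
// bytes of UTF-8 multi-byte sequences, is left unchanged, so the result
// does not depend on the user's locale.
std::string fold_ascii(std::string s) {
    for (char& c : s) {
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
    }
    return s;
}

// Map whose keys are folded once on insertion and once per lookup,
// so the container itself uses ordinary (fast) string comparison.
class CaseInsensitiveMap {
public:
    void insert(const std::string& key, int value) { items_[fold_ascii(key)] = value; }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        auto it = items_.find(fold_ascii(key));
        return it == items_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, int> items_;
};
```

The same folding function could be applied before hashing, which is what makes the approach usable with std::unordered_map as well.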

Argument specified multiple times in boost program options [duplicate]

I am wondering whether it is possible to use zero-parameter options multiple times with boost::program_options.
I have something in mind like this:
mytool --load myfile --print_status --do-something 23 --print_status
It is easy to get this working with one "print_status" parameter, but it is not obvious to me how one could use this option two times (in my case, boost throws an exception if a zero-parameter option is specified more than once).
So, the question is:
Is there any (simple) way to achieve this with out-of-the box functionality from program_options?
Right now, it seems this is a drawback of the current program_options implementation.
P.S.:
There have already been similar questions in the past (both over four years old), where no solution was found:
http://lists.boost.org/boost-users/2006/08/21631.php
http://benjaminwolsey.de/de/node/103
This thread contains a solution, but it is not obvious whether it is a working one, and it seems rather complex for such a simple feature:
Specifying levels (e.g. --verbose) using Boost program_options
If you don't need to count the number of times the option has been specified, it's fairly easy (if a little odd); just declare the variable as vector<bool> and set the following parameters:
std::vector<bool> example;
// ...
desc.add_options()
    ("example,e",
     po::value(&example)
         ->default_value(std::vector<bool>(), "false")
         ->implicit_value(std::vector<bool>(1), "true")
         ->zero_tokens()
    )
// ...
Specifying a vector suppresses multiple argument checking; default_value says that the vector should by default be empty, implicit_value says to set it to a 1-element vector if -e/--example is specified, and zero_tokens says not to consume any following tokens.
If -e or --example is specified at least once, example.size() will be exactly 1; otherwise it will be 0.
Example.
If you do want to count how many times the option occurs, it's easy enough to write a custom type and validator:
struct counter { int count = 0; };
void validate(boost::any& v, std::vector<std::string> const& xs, counter*, long)
{
    if (v.empty()) v = counter{1};
    else ++boost::any_cast<counter&>(v).count;
}
Example.
Note that unlike in the linked question this doesn't allow additionally specifying a value (e.g. --verbose 6) - if you want to do something that complex you would need to write a custom value_semantic subclass, as it's not supported by Boost's existing semantics.

Named parameter string formatting in C++

I'm wondering if there is a library like Boost Format, but which supports named parameters rather than positional ones. This is a common idiom in e.g. Python, where you have a context to format strings with that may or may not use all available arguments, e.g.
mouse_state = {}
mouse_state['button'] = 0
mouse_state['x'] = 50
mouse_state['y'] = 30
#...
"You clicked %(button)s at %(x)d,%(y)d." % mouse_state
"Targeting %(x)d, %(y)d." % mouse_state
Are there any libraries that offer the functionality of those last two lines? I would expect it to offer an API something like:
PrintFMap(string format, map<string, string> args);
In Googling I have found many libraries offering variations of positional parameters, but none that support named ones. Ideally the library has few dependencies so I can drop it easily into my code. C++ won't be quite as idiomatic for collecting named arguments, but probably someone out there has thought more about it than me.
Performance is important, in particular I'd like to keep memory allocations down (always tricky in C++), since this may be run on devices without virtual memory. But having even a slow one to start from will probably be faster than writing it from scratch myself.
The fmt library supports named arguments:
print("You clicked {button} at {x},{y}.",
arg("button", "b1"), arg("x", 50), arg("y", 30));
And as a syntactic sugar you can even (ab)use user-defined literals to pass arguments:
print("You clicked {button} at {x},{y}.",
"button"_a="b1", "x"_a=50, "y"_a=30);
For brevity the namespace fmt is omitted in the above examples.
Disclaimer: I'm the author of this library.
I've always been critical of C++ I/O (especially formatting) because in my opinion it is a step backward with respect to C. Formats need to be dynamic, and it makes perfect sense, for example, to load them from an external resource such as a file or a parameter.
However, I had never tried to actually implement an alternative before, and your question prompted me to make an attempt, investing some weekend hours in this idea.
Sure, the problem was more complex than I thought (for example, just the integer formatting routine is 200+ lines), but I think that this approach (dynamic format strings) is more usable.
You can download my experiment from this link (it's just a .h file) and a test program from this link (test is probably not the correct term, I used it just to see if I was able to compile).
The following is an example:
#include "format.h"
#include <iostream>
using format::FormatString;
using format::FormatDict;
int main()
{
    std::cout << FormatString("The answer is %{x}") % FormatDict()("x", 42);
    return 0;
}
It differs from the boost.format approach because it uses named parameters and because the format string and format dictionary are meant to be built separately (and, for example, passed around). Also, I think that formatting options should be part of the string (like printf) and not in the code.
FormatDict uses a trick for keeping the syntax reasonable:
FormatDict fd;
fd("x", 12)
("y", 3.141592654)
("z", "A string");
FormatString is instead just parsed from a const std::string& (I decided to preparse format strings but a slower but probably acceptable approach would be just passing the string and reparsing it each time).
The formatting can be extended for user defined types by specializing a conversion function template; for example
struct P2d
{
    int x, y;
    P2d(int x, int y)
        : x(x), y(y)
    {
    }
};
namespace format {
    template<>
    std::string toString<P2d>(const P2d& p, const std::string& parms)
    {
        return FormatString("P2d(%{x}; %{y})") % FormatDict()
            ("x", p.x)
            ("y", p.y);
    }
}
after that a P2d instance can be simply placed in a formatting dictionary.
Also it's possible to pass parameters to a formatting function by placing them between % and {.
For now I only implemented an integer formatting specialization that supports
Fixed size with left/right/center alignment
Custom filling char
Generic base (2-36), lower or uppercase
Digit separator (with both custom char and count)
Overflow char
Sign display
I've also added some shortcuts for common cases, for example
"%08x{hexdata}"
is a hex number with 8 digits padded with '0's.
"%026/2,8:{bindata}"
is a 24-bit binary number (as required by "/2") with digit separator ":" every 8 bits (as required by ",8:").
Note that the code is just an idea, and for example for now I just prevented copies when probably it's reasonable to allow storing both format strings and dictionaries (for dictionaries it's however important to give the ability to avoid copying an object just because it needs to be added to a FormatDict, and while IMO this is possible it's also something that raises non-trivial problems about lifetimes).
UPDATE
I've made a few changes to the initial approach:
Format strings can now be copied
Formatting for custom types is done using template classes instead of functions (this allows partial specialization)
I've added a formatter for sequences (two iterators). Syntax is still crude.
I've created a github project for it, with boost licensing.
The answer appears to be, no, there is not a C++ library that does this, and C++ programmers apparently do not even see the need for one, based on the comments I have received. I will have to write my own yet again.
Well, I'll add my own answer as well, not because I know of (or have coded) such a library, but to address the "keep the memory allocation down" bit.
As always I can envision some kind of speed / memory trade-off.
On the one hand, you can parse "Just In Time":
class Formatter:
    def __init__(self, format):
        self._string = format

    def compute(self, context):
        # Pseudocode: _contains, _extract and _replace are left unimplemented.
        for k, v in context.items():
            while self._contains(k):
                left, variable, right = self._extract(k)
                self._string = left + self._replace(variable, v) + right
This way you don't keep a "parsed" structure at hand, and hopefully most of the time you'll just insert the new data in place (unlike Python, C++ strings are not immutable).
However it's far from being efficient...
On the other hand, you can build a fully constructed tree representing the parsed format. You will have several classes like: Constant, String, Integer, Real, etc... and probably some subclasses / decorators as well for the formatting itself.
I think, however, that the most efficient approach would be some kind of mix of the two:
explode the format string into a list of Constant, Variable
index the variables in another structure (a hash table with open-addressing would do nicely, or something akin to Loki::AssocVector).
There you are: you're done with only 2 dynamically allocated arrays (basically). If you want to allow the same key to be repeated multiple times, simply use a std::vector<size_t> as the value of the index: good implementations should not allocate any memory dynamically for small-sized vectors (VC++ 2010 doesn't for less than 16 bytes worth of data).
When evaluating the context itself, look up the instances. You then parse the formatter "just in time", check it against the current type of the value with which to replace it, and process the format.
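As a rough illustration of this mixed scheme, the following sketch explodes a format string into constant and variable segments and keeps a separate index from variable names to segment positions. It makes several assumptions that are not in the answer: placeholders are written {name}, all values are already strings, and a std::map stands in for the open-addressing hash table. Segment, ParsedFormat, parse_format and render are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One piece of a pre-parsed format: literal text or a named variable.
struct Segment {
    bool is_variable;
    std::string text;  // the literal text, or the variable name
};

struct ParsedFormat {
    std::vector<Segment> segments;
    // variable name -> positions in `segments` where it occurs
    std::map<std::string, std::vector<std::size_t>> index;
};

// Splits "a {x} b" into Constant("a "), Variable("x"), Constant(" b").
// An unmatched '{' is kept as literal text.
ParsedFormat parse_format(const std::string& fmt) {
    ParsedFormat out;
    std::size_t pos = 0;
    while (pos < fmt.size()) {
        std::size_t open = fmt.find('{', pos);
        std::size_t close =
            (open == std::string::npos) ? std::string::npos : fmt.find('}', open);
        if (close == std::string::npos) {
            out.segments.push_back({false, fmt.substr(pos)});
            break;
        }
        if (open > pos) out.segments.push_back({false, fmt.substr(pos, open - pos)});
        std::string name = fmt.substr(open + 1, close - open - 1);
        out.index[name].push_back(out.segments.size());
        out.segments.push_back({true, name});
        pos = close + 1;
    }
    return out;
}

// Fills in values; variables missing from `args` are left as "{name}",
// and arguments that the format does not use are ignored.
std::string render(const ParsedFormat& pf,
                   const std::map<std::string, std::string>& args) {
    std::vector<const std::string*> values(pf.segments.size(), nullptr);
    for (const auto& kv : args) {
        auto it = pf.index.find(kv.first);
        if (it == pf.index.end()) continue;
        for (std::size_t i : it->second) values[i] = &kv.second;
    }
    std::string result;
    for (std::size_t i = 0; i < pf.segments.size(); ++i) {
        const Segment& s = pf.segments[i];
        if (!s.is_variable) result += s.text;
        else if (values[i]) result += *values[i];
        else result += "{" + s.text + "}";
    }
    return result;
}
```

A ParsedFormat can be built once and rendered many times with different argument maps; type checking and per-type format options, which the answer defers to evaluation time, are deliberately left out of this sketch.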
Pros and cons:
- Just In Time: you scan the string again and again
- One Parse: requires a lot of dedicated classes, possibly many allocations, but the format is validated on input. Like Boost it may be reused.
- Mix: more efficient, especially if you don't replace some values (allow some kind of "null" value), but delaying the parsing of the format delays the reporting of errors.
Personally I would go for the One Parse scheme, trying to keep the allocations down using boost::variant and the Strategy Pattern as much as I could.
Given that Python itself is written in C and that formatting is such a commonly used feature, you might be able (ignoring copyright issues) to rip the relevant code from the Python interpreter and port it to use STL maps rather than Python's native dicts.
I've written a library for this purpose, check it out on GitHub.
Contributions are welcome.