Validate function parameters based on given constraints - c++

Before saying anything, please let me make it clear that this question is NOT language-specific but part of my work on an interpreter.
Let's say we have an enum of Value types. So a value can be:
SV, // stringValue
IV, // integerValue
AV, // arrayValue
etc, etc
then let's say we have a function F which takes one of the following combinations of arguments:
[
[SV],
[SV,IV],
[AV]
]
Now, the function is called, we calculate the values passed, and get their types. Let's say we get [XV,YV].
The question is:
What is the most efficient way to check if the passed values are allowed?
(The original interpreter is written in Nim, so one could say we could look up the value array in the array of accepted value arrays like: accepted.contains(passed) - but this is not efficient)
P.S. ^ That's how I'm currently doing it, although I've explored the option of using bitmasks too. However, I cannot see how it would help, since order plays an important part too.
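One way to keep the order while still comparing plain integers is to pack each accepted signature into a single key, one small bit field per argument position, and then search the sorted keys. The following is only a sketch in C++ under stated assumptions: the names ValueType, packSignature and SignatureSet are hypothetical, there are at most 15 value types (4 bits each, with 0 reserved for "no argument"), and a call has at most 16 arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical value-type tags. 0 is reserved for "no argument", so
// signatures of different lengths can never produce the same key.
enum ValueType : std::uint8_t { SV = 1, IV = 2, AV = 3 };

// Pack an ordered list of types into one integer, 4 bits per position.
// Position matters: [SV, IV] and [IV, SV] yield different keys.
std::uint64_t packSignature(std::initializer_list<ValueType> types) {
    assert(types.size() <= 16);  // 16 positions * 4 bits = 64 bits
    std::uint64_t key = 0;
    int shift = 0;
    for (ValueType t : types) {
        key |= static_cast<std::uint64_t>(t) << shift;
        shift += 4;
    }
    return key;
}

// Accepted signatures of one function, stored as sorted packed keys.
class SignatureSet {
public:
    SignatureSet(std::initializer_list<std::uint64_t> keys) : keys_(keys) {
        std::sort(keys_.begin(), keys_.end());
    }

    // Binary search over the sorted keys: O(log n) integer comparisons.
    bool accepts(std::uint64_t passed) const {
        return std::binary_search(keys_.begin(), keys_.end(), passed);
    }

private:
    std::vector<std::uint64_t> keys_;
};
```

The accepted keys would be built once per function, e.g. SignatureSet f{packSignature({SV}), packSignature({SV, IV}), packSignature({AV})}, and each call then costs one packing pass plus a search. For functions with only a handful of signatures a linear scan over the keys is probably just as fast.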


How to construct an XlaOp?

There are a number of functions for creating XlaOps from native C++ values. I'm trying to figure out how to use each to construct a graph. I've gone through xla_builder.h and picked out some candidates, omitting overloads and convenience wrappers. The two most likely candidates seem to be
// Enqueues a "retrieve parameter value" instruction for a parameter that was
// passed to the computation.
XlaOp Parameter(XlaBuilder* builder, int64 parameter_number, const Shape& shape,
const string& name);
// Enqueues a constant with the value of the given literal onto the
// computation.
XlaOp ConstantLiteral(XlaBuilder* builder, const LiteralSlice& literal);
Am I right in thinking Parameter is for "symbols", while ConstantLiteral is for constant values? For example, in f(x) = x + 1, we'd encode 1 as a ConstantLiteral, and then for x we could either
write f(x) as a C++ function, and at application site use another ConstantLiteral for our value of x, or
encode x using Parameter and build an XlaComputation from the corresponding XlaBuilder. That said, I'm not clear on how to actually call the XlaComputation with a Literal, other than with LocalClient, which doesn't work with multiple XlaComputations afaict.
What's the difference between these two approaches? Is one better than the other? I notice the former doesn't appear possible for higher-order functions: those which accept XlaComputations.
Next there's
Infeed, which I'd guess is a streaming version of Parameter.
Recv which looks like a way to pass data between computations, but doesn't actually create a completely new XlaOp itself.
ReplicaId, Iota, and XlaOp CreateToken(XlaBuilder* builder); appear largely irrelevant for this discussion.
Have I got this right? Are there any other important functions I've missed?

Calculate member offset of unknown type

I want to get the offset of a struct's member. I know this has been asked multiple times and the answer is always the mighty offsetof. Well, my case is a little different: I need the offset of an unknown type. That is for example:
void fill_struct(void* unknown)
{
...
}
The only thing I will know from unknown is the order in which types are set. i.e.
int
int
float
...
string
And the main problem here is align/padding, since I don't know a way to calculate it nor if there is a way at all.
This kind of question is often replied with: why would you want to do that?
For those people: I'm implementing a JSON parser in C++, and faced a problem (representing multiple type arrays), and my solution is to map the array's values into a custom struct.
I welcome feedback on that solution, but I'm mainly interested in getting my question answered.
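For reference, the usual layout rule can be sketched as follows: each member is placed at the next offset that is a multiple of its alignment, and the total size is rounded up to the largest member alignment. This is only a sketch, assuming a standard-layout struct laid out the way common ABIs do it (the C++ standard itself does not guarantee minimal padding); FieldInfo, fieldOf, alignUp and memberOffsets are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size and alignment of one member, listed in declaration order.
struct FieldInfo {
    std::size_t size;
    std::size_t align;
};

// Convenience helper to describe a member of type T.
template <typename T>
FieldInfo fieldOf() { return {sizeof(T), alignof(T)}; }

// Round `offset` up to the next multiple of `align` (must be a power of two).
std::size_t alignUp(std::size_t offset, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

// Compute the offset of every member. If `totalSize` is non-null it
// receives the struct size including trailing padding.
std::vector<std::size_t> memberOffsets(const std::vector<FieldInfo>& fields,
                                       std::size_t* totalSize = nullptr) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (const FieldInfo& f : fields) {
        offset = alignUp(offset, f.align);  // insert padding before the member
        offsets.push_back(offset);
        offset += f.size;
        if (f.align > maxAlign) maxAlign = f.align;
    }
    if (totalSize) *totalSize = alignUp(offset, maxAlign);
    return offsets;
}
```

Calling memberOffsets({fieldOf<int>(), fieldOf<int>(), fieldOf<float>()}) gives the offsets for the int, int, float prefix from the question. The result is only meaningful if the struct being filled was compiled with the same ABI and without packing pragmas.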

SML Basis Library: what's the rationale for `ArraySlice.copyVec`?

Prior understanding
According to The ArraySlice structure as reported by standardml.org (I don't have an SML 97 manual to check, only an SML 90 PDF manual), ArraySlice.copyVec takes an Array.array parameter as the destination, and not an ArraySlice.slice as one (or at least I) would intuitively expect. Of course, one can use ArraySlice.base to get an array and an index for the dst and di parameters of copyVec, respectively. Surprisingly, copyVec from ArraySlice does not have even a single parameter of type ArraySlice.slice. Fortunately, its src parameter is of type VectorSlice.slice, as intuitively expected.
The question
What's the rationale for ArraySlice.copyVec? Why doesn't it get an ArraySlice.slice as dst?
Presumably because that would require the additional constraint that both slices have the same length (or at least that dst is no shorter than src). The current API gets by without such an extra side condition. Also, it would probably be somewhat more cumbersome to use.

Return Set for a Command Parser

I need to write a parser to parse commands. 5 such commands are:
"a=10"
"b=foo"
"c=10,10"
"clear d"
"c push_back 2"
In the case of the first example, set is the command, a is the object and 10 is the value.
What do you think the parser should return for each line above?
Here is my idea:
"a=10" -> SET (COMMAND_ENUM), INT (VALUE_TYPE), "a", ("10")
"b=foo" -> SET (COMMAND_ENUM), STRING (VALUE_TYPE), "b", ("foo")
Is this a good approach? What is the standard approach for this problem? Should I dispatch instead?
I have a function which checks the type associated with an object. For example, a above is of type INT and must be assigned an INT value, otherwise the parser should return or throw an error of some sort. I also have a convert function for converting values from strings to the desired type. These throw if the conversion is not possible. If the parser tries to convert the values from strings to the required type, then it is probably a good idea to return them via a boost::variant.
You need to come up with at least a semi-formal grammar for the command language you want to recognize, since you've left a whole lot of things really vaguely specified. For example, in b=foo you want b to be a variable name but foo to be a string literal. How do you distinguish them? Does a sequence of characters represent an identifier if it's on the left side of an assignment, but a literal if it's on the right side? Or does a single character represent an identifier, but multiple characters represent a literal? In c=10,10, does 10,10 represent a list or a vector? Writing a grammar will at least force you to think about such things, and it will also serve at least as a guide to how to write your parser (at most it will be something that can be automatically translated into your parser).
You're on the right track by thinking of how statements should be represented as Abstract Syntax Trees (ASTs), but you need to take a step backwards and look at what you want in terms of concrete syntax.
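To make the idea of a small AST node concrete, here is a minimal sketch for the assignment form only. It uses std::variant (C++17) in place of the boost::variant mentioned in the question, and it settles the ambiguity above with an assumed rule: the left side is always an identifier, and a right side that is entirely a decimal integer is an int, anything else a string. The names CommandKind, Value, Command and parseAssignment are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <variant>

enum class CommandKind { Set };

// The value carried by a command: either an int or a string.
using Value = std::variant<int, std::string>;

// A tiny AST node for one parsed command.
struct Command {
    CommandKind kind;
    std::string object;  // name on the left of '='
    Value value;         // typed value on the right of '='
};

// Parse "name=value". Assumed rule: if the whole right-hand side is a
// decimal integer it becomes an int, otherwise it is kept as a string.
Command parseAssignment(const std::string& line) {
    const std::size_t eq = line.find('=');
    if (eq == std::string::npos || eq == 0 || eq + 1 == line.size())
        throw std::invalid_argument("expected name=value: " + line);

    const std::string name = line.substr(0, eq);
    const std::string text = line.substr(eq + 1);
    try {
        std::size_t used = 0;
        const int n = std::stoi(text, &used);
        if (used == text.size()) return {CommandKind::Set, name, Value{n}};
    } catch (const std::exception&) {
        // Not an integer (or out of range): fall through and keep the text.
    }
    return {CommandKind::Set, name, Value{text}};
}
```

With this rule "a=10" yields an int value and "b=foo" a string value, while "c=10,10" falls back to a string until the grammar says what a comma-separated value means. Checking the value against the declared type of the object would then be a separate step after parsing.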

Passing lists from Mathematica to c++ (Mathlink)

I simply want to pass a list of integers to a function written in C++. I've set up the template (.tm) file and all, and I can successfully call a test function with scalar arguments. Calling the function with the list argument behaves as though the function was not defined at all. I suspect that the argument types don't match.
In the documentation for templates (http://reference.wolfram.com/mathematica/ref/file/file.tm.html) the datatype for lists is something like "Int32List". When I use that, my C++ function must contain an extra long parameter for the list length. The only example code which uses a list is "sumalist.tm". This example uses IntegerList (a type which doesn't appear in the documentation).
When I use Int32List, the mprep result requires a function with an extra integer argument (not long as written in the documentation). When I use the undocumented IntegerList type, the extra argument is of type long.
During my experiments with scalar types, I had a similar problem: a C++ function was called properly when using "Integer" in the tm-file, and not recognized with "Integer32".
The "sumalist.tm" example also uses a strange Pattern (list:{___Integer}) about which I didn't find any documentation. I'd also like to understand what the Evaluate line means (I suspect that it's used to make the function callable without the curly braces around the list).
So who knows which datatypes are really appropriate for calling a C++ function with a list, maybe also with reals?
The mapping of MathLink data types (e.g., Integer32, Integer32List, ...) to C/C++ types is described on the MathLink template file documentation page.
The page no longer documents the old interface types Integer, Real, IntegerList and RealList. These should no longer be used, because the mapping of these types depends on C types whose bit length is platform and compiler dependent (e.g., long). Use the corresponding new type with explicit bit length instead (i.e., Integer32 or Integer64 instead of Integer). The old interface types are still documented in the somewhat dated MathLink reference guide.
The following talk slides contain a simple MathLink example that shows how to implement a MathLink function that adds a scalar value to a vector of reals. This may serve as a starting point.
I don't know much about MathLink, but I can explain the pattern, list:{___Integer}.
The colon is just the general form for a named pattern, that is, symbol:pattern just says that the object referred to by symbol has to match pattern. Indeed, patterns like a_Integer or b__List are really just short forms for a:_Integer and b:__List.
So what we are left with interpreting is {___Integer}. This is a pattern matching a list of arbitrarily many (including zero) integers. It works as follows:
{Pattern} is the Pattern for a list whose contents matches Pattern
___Integer is the Pattern for a sequence of zero or more Integers.