When are InfoSphere Streams consistent regions merged into one? - ibm-infosphere

I have this test question, which I am not able to figure out:
--
Which of the following is NOT TRUE regarding consistent regions? (pick one)
a) The #consistent annotation is applied to a primitive operator
b) The annotated primitive operator is the start operator of the consistent region
c) The consistent region is defined by the reachability graph of the annotated operator
d) Intersecting reachability graphs of different annotated primitive operators are treated as independent consistent regions
--
Problem is - all of these seem to be true to me, and if I had to pick one which I am least sure of, that would be the D option, only because of the word intersecting.
So I am reading the official documentation of InfoSphere Streams, and it says the following:
When the reachability graphs of different annotated primitive operators share a common operator, they form a single consistent region.
My questions:
1) If I understand the word intersecting correctly, it means exactly that. Or could it mean that the two regions "cross paths" instead of sharing a common operator? In which case option D would be the right answer...
2) If not, then perhaps I am being fooled by the question and the real answer is, e.g., b, as in if the annotated operator is in a consistent region already, then it's not a start operator? So option b is not necessarily always true?
3) Is there a mistake in the official documentation, where it says:
The following figure shows an example with the #consistent annotation placed on two primitive operators (op1 and op7). The reachability graph of each operator does not form a single region because they do not share a common operator. As a result, two independent consistent regions are formed, which is shown by different patterns on operators in each region.
What about op4?

The correct answer to the question would be d. Two intersecting consistent regions merge into a single region, assuming the start operators meet all the criteria.
The diagram in the official documentation appears to be wrong.

Related

Generating branch instructions from an AST for a conditional expression

I am trying to write a compiler for a domain-specific language, targeting a stack-machine based VM that is NOT a JVM.
I have already generated a parser for my language, and can readily produce an AST which I can easily walk. I also have had no problem converting many of the statements of my language into the appropriate instructions for this VM, but am facing an obstacle when it comes to the matter of handling the generation of appropriate branching instructions when complex conditionals are encountered, especially when they are combined with (possibly nested) 'and'-like or 'or' like operations which should use short-circuiting branching as applicable.
I am not asking anyone to write this for me. I know that I have not begun to describe my problem in sufficient detail for that. What I am asking for is pointers to useful material that can get me past this hurdle I am facing. As I said, I am already past the point of converting about 90% of the statements in my language into applicable instructions, but it is the handling of conditionals and generating the appropriate flow control instructions that has got me stumped. Much of the info that I have been able to find so far on generating code from an AST only seems to deal with the generation of code corresponding to simple imperative-like statements, but the handing of conditionals and flow control appears to be much more scarce.
Other than the short-circuiting/lazy-evaluation mechanism for 'and' and 'or' like constructs that I have described, I am not concerned with handling any other optimizations.
Every conditional control flow can be modelled as a flow chart (or flow graph) in which the two branches of the conditional have different targets. Given that boolean operators short-circuit, they are control flow elements rather than simple expressions, and they need to be modelled as such.
One way to think about this is to rephrase boolean operators as instances of the ternary conditional operator. So, for example, A and B becomes A ? B : false and A or B becomes A ? true : B [Note 1]. Note that every control flow diagram has precisely two output points.
To combine boolean expressions, just substitute into the diagram. For example, here's A AND (B OR C)
You implement NOT by simply swapping the meaning of the two out-flows.
If the eventual use of the boolean expression is some kind of conditional, such as an if statement or a conditional loop, you can use the control flow as is. If the boolean expression is to be saved into a variable or otherwise used as a value, you need to fill in the two outflows with code to create the relevant constant, usually a true or false boolean constant, or (in C-like languages) a 1 or 0.
Notes:
Another way to write this equivalence is A and B ⇒ A ? B : A; A or B ⇒ A ? A : B, but that is less useful for a control flow view, and also clouds the fact that the intent is to only evaluate each expression once. This form (modified to reuse the initial computation of A) is commonly used in languages with multiple "falsey" values (like Python).

What is the name of the data structure for and-or-lists (or and-or-trees) and where can I read about it?

I recently needed to make a data structure which was a nested list of and/or questions. Since most every interesting thing has been discovered by someone else previously, I’m looking for the name of this data structure. it looks something like this.
‘((a b c) (b d e) (c (a b) (f a)))
The interpretation is I want to find abc or bde or caf or caa or cbf or cba and the list encapsulates that. At the top level each item is or’ed together and sub-lists of the top level are and’ed together and sub-lists of sub-lists are or’ed again sub-lists of those are and’ed and sub-lists of those or’ed ad infinitum. Note that in my example, all the lists are the same length, in my real application the lists vary in length.
The code to walk such a “tree” is relatively simple, but I’m assuming that there is a name for that type of tree and there is stuff I can read about it.
These lists are equivalent to fixed length regular expressions (which I've seen referred to as "network expressions", but I am particularly interested in this data structure and representation thereof.
In general (in the very high level of abstraction) it is:
Context free grammar -Wiki
If you allow it to be infinitely nested, then it is not a regular expression because of presence of parentheses (left and right should match).
If you consider, that expressions inside parentheses are ordered. I mean that a and b and c is equivalent to (a and b) and c. You get then Binary expression tree -Wiki
But for your particular case, it is probably: Disjunctive normal form -Wiki
I am not sure, but my intuition says that it is regular expression again because you have only 2 levels of nesting (1st - for 'or-ed' and 2nd - for 'and-ed' parts)
The trees are also a subset of DAWGS - directed acyclic word graphs and one could construct them the same way.
In my case, I have a very small set that I have built by hand and I don't worry about getting the minimal set, but instead just want something that I can easily write down but deals with the types of simple variations I see. Basically, I have different ways of finding where I keep my .el files based upon the different directory structures of various OSes I use. (E.g. when I was working at Google, the /usr/local/emacs/site-lisp directory was actually more like /usr/local/Google/emacs/site-lisp.)
I don't need a full regex, but there are about a dozen variations, some having quite long lists of nested sub-directories (c:\users\cfclark\appData\roaming\emacs.emacs.d or some other awful thing) that I wanted to write down (and then have emacs make an automated search to find the one that is appropriate to this particular installation). And every time I go to a new job, I can simply add to the list a description of where they are in that setup.
Anyway, as that code has evolved, I found that I had I was doing (nested or's and and's and realized that the structure generalized to the alternating or/and/or/and/... case). So, my assumption is that someone must have discovered this before. I had hints of it myself several years ago, but didn't set down to implement it. The Disjunctive Normal Form link mpasko256 gave is also particularly relevant. I don't normalize to that level, I still keep nested and's and or's rather than flattening to 2, but I do have a distinct structure, or's at the top, then and's, then or's....

What are some alternative forms of if > then relationships?

Traditional if > then relationship in pseudo code:
if (x>y) {
then print "x is greater than y."
}
There are also relational databases.
Or just visual if>then tables. A visual table representation.
There are also tree or hierarchical structure if>then programming aids.
I'm looking for any and all alternatives and flavors of if>then constructs, but preferably practical ones. Since most humans are better at using and remembering visual constructs (tables vs raw code) than symbolic constructs, I'm looking for the most intuitive way to theoretically construct an if>then rule engine, graphically.
Note: I'm not trying to implement this, I'm just trying to get an idea of what could theoretically be done.
I hope I've interpreted the question correctly.
Everything eventually boils down to comparisons, its just a matter of breaking up these comparisons in manageable chunks for humans. There are many techniques to reduce if-thens, or at least transform them into something easier to understand.
One example would be polymorphism. This frees the programmer from one instance of if/then (basically a switch statement). Another example is maps. The implementation of a map uses if/thens, but one might pre-populate the map with all the data and use one logical piece of code instead of using if/then to differentiate. This moves to a data-driven approach. Another example is SQL; it is just a language, a higher level construct, that enables us to express conditions and constraints differently. How you choose to express these conditions is dependent on the problem domain. Some problems work well with traditional procedural programming, some with logic programming, declarative programming etc. If there are many levels of nested if-thens, a state machine approach might work well. Aspect-oriented programming tries to solve the problem of duplicated code in modules that doesn't belong specifically to any one module; a concern that "cross-cuts".
I would do some reading on Programming Paradigms. Do lots of research and if you run into a recurring problem, see if another approach allows you to reduce the amount of if-thens. Most times someone else has run into the same problem and come up with a solution.
Your question is a bit broad and we could ramble from logical gates to mathematical functions. I'm going to focus on this particular bit:
"I'm looking for the most intuitive way to theoretically construct an if>then rule engine, graphically".
First, two caveats:
The best representation depends on the number of possible rules. What works for 3-4 rules probably won't work for 30-40.
I'm going to pretend that else conditions don't exist.
If "X then Y" boils down to: one condition and one instruction whose execution depends on the condition. Let's pretend X -> Y means that "If X is true then Y is executed". Let's create two sets: one is C that contains all the possible conditions. The other one is I which contains all the possible instructions.
With this is mind, X ∈ C and Y ∈ I. In your specific case, can Y ∈ C (can Y be a condition)? If so, you have nested ifs.
Nested ifs can be represented as chains of conditions joined by and operators:
if (x > 3) {
if (y > 5) {
# do something
}
}
Can be written as:
if (x > 3 and y > 5) {
# do something
}
If you're only thinking about code then the latter can become problematic when you have many nested conditions, but when you go graphical, nesting (probably using tree-like structures) can look cluttered while chaining usually looks like a sequence of instructions (which I think is better).
If you don't consider nesting (chaining) in your rules, then connecting elements (boxes, circles, etc) from X -> Y is trivial way to work. The representation of this depends on how graphical you want to get (see the links below for some examples).
If you're considering nesting then three random ideas come to my mind:
Venn Diagrams: Visually attractive, useless for more than 3-4 conditions. They have a good fit with database representations. See: http://share.mheroin.com/image/3i3l1y0S2F39
Flowcharts: Highly functional and easy to read, not too cumbersome to create. Can get out of hand with 10+ elements. See: http://share.mheroin.com/image/2g071j3U1u29
Tables: As you mentioned, tables are a decent way to represent conditionals as long as you can restrain the set of applicable rules. This is an example taken from iTunes: http://share.mheroin.com/image/390y2G18123q. The "Match [all/any] of the following rules" works as a replacement for if/else.

How does boost::uBLAS handle nested products of matrices?

I read an article about the optimisation of nested product of matrices, using dynamic programming, and I wanted to see how it is implemented in boost::uBLAS.
I'm not sure I understood the documentation (they talk about it at the very end of the page), but it seems they handle this problem. When you write R = prod(A, prod(B,C));, the library computes A x (B x C) or (A x B) x C depending on the sizes of A, B and C.
How can it be achieved ? How can the library "move the brackets" ? When writing such a code line, I thought the arguments of prod would be evaluated, and then the function would be run.
The FAQ mentions the notion of expression templates. Is it linked ?
Yes, expression templates (which you may find under the name "expression trees") are involved.
Basically, prod doesn't return the result, but a wrapper object holding the operation (matrix multiply) and pointers to the two inputs. If prod is called with such a wrapper as input, it can apply reordering optimizations.
However, from my reading of that page, no such optimization is applied (it says it honors bracketing structure).

Equivalence of boolean expressions

I have a problem that consist in comparing boolean expressions ( OR is +, AND is * ). To be more precise here is an example:
I have the following expression: "A+B+C" and I want to compare it with "B+A+C". Comparing it like string is not a solution - it will tell me that the expressions don't match which is of course false. Any ideas on how to compare those expressions?
Any ideas about how can I tackle this problem? I accept any kind of suggestions but (as a note) the final code in my application will be written in C++ (C accepted of course).
An normal expression could contain also parenthesis:
(A * B * C) + D or A+B*(C+D)+X*Y
Thanks in advance,
Iulian
I think the competing approach to exhaustive (and possibly exhausting) creation of truth tables would be to reduce all your expressions to a canonical form and compare those. For example, rewrite everything into conjunctive normal form with some rule about the ordering of symbols (eg alphabetical order within terms) and terms (eg alphabetical by first symbol in term). This of course, requires that symbol A in one expression is the same as symbol A in another.
How easy it is to write (or grab from the net) a C or C++ function for rewriting your expressions into CNF I don't know. However, there's been a lot of AI work done in C and C++ so you'll probably find something when you Google.
I'm also a little unsure about the comparative computational complexity of this approach and the truth-table approach. I strongly suspect that it's the same.
Whether you use truth tables or a canonical representation you can of course keep down the work to be done by splitting your input forms into groups based on the number of different symbols that they contain.
EDIT: On reading the other answers, in particular the suggestion to generate all truth tables and compare them, I think that #Iulian has severely underestimated the number of possible truth tables.
Suppose that we settle on RPN to write the expressions, this will avoid having to deal with brackets, and that there are 10 symbols, which means 9 (binary) operators. There will be 10! different orderings of the symbols, and 2^9 different orderings of the operators. There will therefore be 10! x 2^9 == 1,857,945,600 rows in the truth table for this expression. This does include some duplicates, any expression containing only 'and' and 'or' for instance will be the same regardless of the order of symbols. But I'm not sure I can figure this any further ...
Or am I making a big mistake ?
You can calculate the truth table for each expression over all possible inputs then compare the truth tables.