Algorithm for Deductions

Algorithm for Deductions - c++

Here is the problem.
If we have two statements
p=>q and q=>r, it also implies that p=>r.
Given a set of statements I need to find whether a given statement is true or false or cannot be concluded from the given statements.
Example:
Given statements p=>q, p=>r, q=>s
if the input is p=>s I should get the output true
if the input is p=>t I should get the output Cannot be concluded
if the input is p=> ~p I should get the output false
Here my question is what is the best data structure to implement this and what is the algorithm to use.
Thanks.

So, I'm still not _entirely clear on what you're trying to do. At the risk of being down-voted, I'm going to kick this out there and see what people think.
I might start by building a graph. Each entity (p, q, etc.) has its own node. "Implies" mean you draw a line between two nodes. Any input, then, is just a matter of seeing if you can find a way to traverse the graph--so in your example, a => b, b => c, the graph has three nodes, a connected to b, b connected to c. The fact that a path exists between a and c means that a implies c.
I haven't vetted this idea any further, but it seems like an interesting prospect. In particular because graph theory is cool, and lots of people are interested in it (i.e., Facebook execs). AND there are good modules in Python for analyzing graphs. (I assume the same is also true for C++. And you can always spec it out by hand using Gephi: https://gephi.org/)

A lot of people have studied this problem many years. What you need is a SAT Solver. Lookup Chaff or zChaff or any other commonly used SAT Solver. You want to take your clauses like (p->q && q->r) -> (p -> r) and negate them and determine if that is satisfiable. If the negation is not satisfiable, then you have a theorem, something that is always true. If the original clauses are satisfiable and the negation of the clauses are satisfiable, the you should return "cannot be concluded". And if the original clauses are not satisfiable then you have something that is false.
This is actually a well studied problem. There are good algorithms, but there is a hard limit on how many propositional variables you can handle. SAT is at the heart of NP hard problems. A class of problems for which efficient algorithms are not known.

I think that given the simplicity of your problem, you could get away with using a simple map. The main advantage over a vector being in the much faster look-up.
// For "p": { name: "p", positive: "true" }
// For "~q": { name: "q", positive: "false" }
struct Predicate {
std::string _name;
bool _positive;
};
using PredicateSetType = std::unordered_set<Predicate>;
using PredicateMapType = std::unordered_map<Predicate, PredicateSetType>;
You use the map in the following manner: when given p => q, you insert { "q", true } in the set of predicates associated to { "p", true }.
Note that this actually encodes a directed graph, so the typical methods of exploring a graph apply when it comes to proving a statement.

Related

Maxima: creating a function that acts on parts of a string

Context: I'm using Maxima on a platform that also uses KaTeX. For various reasons related to content management, this means that we are regularly using Maxima functions to generate the necessary KaTeX commands.
I'm currently trying to develop a group of functions that will facilitate generating different sets of strings corresponding to KaTeX commands for various symbols related to vectors.
Problem
I have written the following function makeKatexVector(x), which takes a string, list or list-of-lists and returns the same type of object, with each string wrapped in \vec{} (i.e. makeKatexVector(string) returns \vec{string} and makeKatexVector(["a","b"]) returns ["\vec{a}", "\vec{b}"] etc).
/* Flexible Make KaTeX Vector Version of List Items */
makeKatexVector(x):= block([ placeHolderList : x ],
if stringp(x) /* Special Handling if x is Just a String */
then placeHolderList : concat("\vec{", x, "}")
else if listp(x[1]) /* check to see if it is a list of lists */
then for j:1 thru length(x)
do placeHolderList[j] : makelist(concat("\vec{", k ,"}"), k, x[j] )
else if listp(x) /* check to see if it is just a list */
then placeHolderList : makelist(concat("\vec{", k, "}"), k, x)
else placeHolderList : "makeKatexVector error: not a list-of-lists, a list or a string",
return(placeHolderList));
Although I have my doubts about the efficiency or elegance of the above code, it seems to return the desired expressions; however, I would like to modify this function so that it can distinguish between single- and multi-character strings.
In particular, I'd like multi-character strings like x_1 to be returned as \vec{x}_1 and not \vec{x_1}.
In fact, I'd simply like to modify the above code so that \vec{} is wrapped around the first character of the string, regardless of how many characters there may be.
My Attempt
I was ready to tackle this with brute force (e.g. transcribing each character of a string into a list and then reassembling); however, the real programmer on the project suggested I look into "Regular Expressions". After exploring that endless rabbit hole, I found the command regex_subst; however, I can't find any Maxima documentation for it, and am struggling to reproduce the examples in the related documentation here.
Once I can work out the appropriate regex to use, I intend to implement this in the above code using an if statement, such as:
if slength(x) >1
then {regex command}
else {regular treatment}
If anyone knows of helpful resources on any of these fronts, I'd greatly appreciate any pointers at all.

Looks like you got the regex approach working, that's great. My advice about handling subscripted expressions in TeX, however, is to avoid working with names which contain underscores in Maxima, and instead work with Maxima expressions with indices, e.g. foo[k] instead of foo_k. While writing foo_k is a minor convenience in Maxima, you'll run into problems pretty quickly, and in order to straighten it out you might end up piling one complication on another.
E.g. Maxima doesn't know there's any relation between foo, foo_1, and foo_k -- those have no more in common than foo, abc, and xyz. What if there are 2 indices? foo_j_k will become something like foo_{j_k} by the preceding approach -- what if you want foo_{j, k} instead? (Incidentally the two are foo[j[k]] and foo[j, k] when represented by subscripts.) Another problematic expression is something like foo_bar_baz. Does that mean foo_bar[baz], foo[bar_baz] or foo_bar_baz?
The code for tex(x_y) yielding x_y in TeX is pretty old, so it's unlikely to go away, but over the years I've come to increasing feel like it should be avoided. However, the last time it came up and I proposed disabling that, there were enough people who supported it that we ended up keeping it.
Something that might be helpful, there is a function texput which allows you to specify how a symbol should appear in TeX output. For example:
(%i1) texput (v, "\\vec{v}");
(%o1) "\vec{v}"
(%i2) tex ([v, v[1], v[k], v[j[k]], v[j, k]]);
$$\left[ \vec{v} , \vec{v}_{1} , \vec{v}_{k} , \vec{v}_{j_{k}} ,
\vec{v}_{j,k} \right] $$
(%o2) false
texput can modify various aspects of TeX output; you can take a look at the documentation (see ? texput).

While I didn't expect that I'd work this out on my own, after several hours, I made some progress, so figured I'd share here, in case anyone else may benefit from the time I put in.
to load the regex in wxMaxima, at least on the MacOS version, simply type load("sregex");. I didn't have this loaded, and was trying to work through our custom platform, which cost me several hours.
take note that many of the arguments in the linked documentation by Dorai Sitaram occur in the reverse, or a different order than they do in their corresponding Maxima versions.
not all the "pregexp" functions exist in Maxima;
In addition to this, escaping special characters varied in important ways between wxMaxima, the inline Maxima compiler (running within Ace editor) and the actual rendered version on our platform; in particular, the inline compiler often returned false for expressions that compiled properly in wxMaxima and on the platform. Because I didn't have sregex loaded on wxMaxima from the beginning, I lost a lot of time to this.
Finally, the regex expression that achieved the desired substitution, in my case, was:
regex_subst("\vec{\\1}", "([[:alpha:]])", "v_1");
which returns vec{v}_1 in wxMaxima (N.B. none of my attempts to get wxMaxima to return \vec{v}_1 were successful; escaping the backslash just does not seem to work; fortunately, the usual escaped version \\vec{\\1} does return the desired form).
I have yet to adjust the code for the rest of the function, but I doubt that will be of use to anyone else, and wanted to be sure to post an update here, before anyone else took time to assist me.
Always interested in better methods / practices or any other pointers / feedback.

What is the name of the data structure for and-or-lists (or and-or-trees) and where can I read about it?

I recently needed to make a data structure which was a nested list of and/or questions. Since most every interesting thing has been discovered by someone else previously, I’m looking for the name of this data structure. it looks something like this.
‘((a b c) (b d e) (c (a b) (f a)))
The interpretation is I want to find abc or bde or caf or caa or cbf or cba and the list encapsulates that. At the top level each item is or’ed together and sub-lists of the top level are and’ed together and sub-lists of sub-lists are or’ed again sub-lists of those are and’ed and sub-lists of those or’ed ad infinitum. Note that in my example, all the lists are the same length, in my real application the lists vary in length.
The code to walk such a “tree” is relatively simple, but I’m assuming that there is a name for that type of tree and there is stuff I can read about it.
These lists are equivalent to fixed length regular expressions (which I've seen referred to as "network expressions", but I am particularly interested in this data structure and representation thereof.

In general (in the very high level of abstraction) it is:
Context free grammar -Wiki
If you allow it to be infinitely nested, then it is not a regular expression because of presence of parentheses (left and right should match).
If you consider, that expressions inside parentheses are ordered. I mean that a and b and c is equivalent to (a and b) and c. You get then Binary expression tree -Wiki
But for your particular case, it is probably: Disjunctive normal form -Wiki
I am not sure, but my intuition says that it is regular expression again because you have only 2 levels of nesting (1st - for 'or-ed' and 2nd - for 'and-ed' parts)

The trees are also a subset of DAWGS - directed acyclic word graphs and one could construct them the same way.
In my case, I have a very small set that I have built by hand and I don't worry about getting the minimal set, but instead just want something that I can easily write down but deals with the types of simple variations I see. Basically, I have different ways of finding where I keep my .el files based upon the different directory structures of various OSes I use. (E.g. when I was working at Google, the /usr/local/emacs/site-lisp directory was actually more like /usr/local/Google/emacs/site-lisp.)
I don't need a full regex, but there are about a dozen variations, some having quite long lists of nested sub-directories (c:\users\cfclark\appData\roaming\emacs.emacs.d or some other awful thing) that I wanted to write down (and then have emacs make an automated search to find the one that is appropriate to this particular installation). And every time I go to a new job, I can simply add to the list a description of where they are in that setup.
Anyway, as that code has evolved, I found that I had I was doing (nested or's and and's and realized that the structure generalized to the alternating or/and/or/and/... case). So, my assumption is that someone must have discovered this before. I had hints of it myself several years ago, but didn't set down to implement it. The Disjunctive Normal Form link mpasko256 gave is also particularly relevant. I don't normalize to that level, I still keep nested and's and or's rather than flattening to 2, but I do have a distinct structure, or's at the top, then and's, then or's....

How to detect list changes without comparing the complete list

I have a function which will fail if there has being any change on the term/list it is using since the generation of this term/list. I would like to avoid to check that each parameter still the same. So I had thought about each time I generate the term/list to perform a CRC or something similar. Before making use of it I would generate again the CRC so I can be 99,9999% sure the term/list still the same.
Going to a specfic answer, I am programming in Erlang, I am thinking on using a function of the following type:
-spec(list_crc32(List :: [term()]) -> CRC32 :: integer()).
I use term, because it is a list of terms, (erlang has already a default fast CRC libraries but for binary values). I have consider to use "erlang:crc32(term_to_binary(Term))", but not sure if there could be a better approach.
What do you think?
Regards, Borja.

Without more context it is a little bit difficult to understand why you would have this problem, particularly since Erlang terms are immutable -- once assigned no other operation can change the value of a variable, not even in the same function.
So if your question is "How do I quickly assert that true = A == A?" then consider this code:
A = generate_list()
% other things in this function happen
A = A.
The above snippet will always assert that A is still A, because it is not possible to change A like you might do in, say, Python.
If your question is "How do I assert that the value of a new list generated exactly the same value as a different known list?" then using either matching or an actual assertion is the fastest way:
start() ->
A = generate_list(),
assert_loop(A).
assert_loop(A) ->
ok = do_stuff(),
A = generate_list(),
assert_loop(A).
The assert_loop/1 function above is forcing an assertion that the output of generate_list/0 is still exactly A. There is no telling what other things in the system might be happening which may have affected the result of that function, but the line A = generate_list() will crash if the list returned is not exactly the same value as A.
In fact, there is no way to change the A in this example, no matter how many times we execute assert_loop/1 above.
Now consider a different style:
compare_loop(A) ->
ok = do_stuff(),
case A =:= generate_list() of
true -> compare_loop(A);
false -> terminate_gracefully()
end.
Here we have given ourselves the option to do something other than crash, but the effect is ultimately the same, as the =:= is not merely a test of equality, it is a match test meaning that the two do not evaluate to the same values, but that they actually match.
Consider:
1> 1 == 1.0.
true
2> 1 =:= 1.0.
false
The fastest way to compare two terms will depend partly on the sizes of the lists involved but especially on whether or not you expect the assertion to pass or fail more often.
If the check is expected to fail more often then the fastest check is to use an assertion with =, an equivalence test with == or a match test with =:= instead of using erlang:phash2/1. Why? Because these tests can return false as soon as a non-matching element is encountered -- and if this non-match occurs near the beginning of the list then a full traverse of both lists is avoided entirely.
If the check is expected to pass more often then something like erlang:phash2/1 will be faster, but only if the lists are long, because only one list will be fully traversed each iteration (the hash of the original list is already stored). It is possible, though, on a short list that a simple comparison will still be faster than computing a hash, storing it, computing another hash, and then comparing the hashes (obviously). So, as always, benchmark.
A phash2 version could look like:
start() ->
A = generate_list(),
Hash = erlang:phash2(A),
assert_loop(Hash).
assert_loop(Hash) ->
ok = do_stuff(),
Hash = erlang:phash2(generate_list()),
loop(Hash).
Again, this is an assertive loop that will crash instead of exit cleanly, so it would need to be adapted to your needs.
The basic mystery still remains, though: in a language with immutable variables why is it that you don't know whether something will have changed? This is almost certainly a symptom of an underlying architectural problem elsewhere in the program -- either that or simply a misunderstanding of immutability in Erlang.

Proper flow control in Prolog without using the non-declarative if-then-else syntax

I would like to check for an arbitrary fact and do something if it is in the knowledge base and something else if it not, but without the ( I -> T ; E)syntax.
I have some facts in my knowledge base:
unexplored(1,1).
unexplored(2,1).
safe(1,1).
given an incomplete rule
foo:- safe(A,B),
% do something if unexplored(A,B) is in the knowledge base
% do something else if unexplored(A,B) is not in the knowledge base
What is the correct way to handle this, without doing it like this?
foo:-
safe(A,B),
( unexplored(A,B) -> something ; something_else ).

Not an answer but too long for a comment.
"Flow control" is by definition not declarative. Changing the predicate database (the defined rules and facts) at run time is also not declarative: it introduces state to your program.
You should really consider very carefully if your "data" belongs to the database, or if you can keep it a data structure. But your question doesn't provide enough detail to be able to suggest anything.
You can however see this example of finding paths through a maze. In this solution, the database contains information about the problem that does not change. The search itself uses the simplest data structure, a list. The "flow control" if you want to call it this is implicit: it is just a side effect of Prolog looking for a proof. More importantly, you can argue about the program and what it does without taking into consideration the exact control flow (but you do take into consideration Prolog's resolution strategy).

The fundamental problem with this requirement is that it is non-monotonic:
Things that hold without this fact may suddenly fail to hold after adding such a fact.
This inherently runs counter to the important and desirable declarative property of monotonicity.
Declaratively, from adding facts, we expect to obtain at most an increase, never a decrease of the things that hold.
For this reason, your requirement is inherently linked to non-monotonic constructs like if-then-else, !/0 and setof/3.
A declarative way to reason about this is to entirely avoid checking properties of the knowledge base. Instead, focus on a clear description of the things that hold, using Prolog clauses to encode the knowledge.
In your case, it looks like you need to reason about states of some search problem. A declarative way to solve such tasks is to represent the state as a Prolog term, and write pure monotonic rules involving the state.
For example, let us say that a state S0 is related to state S if we explore a certain position Pos that was previously not explored:
state0_state(S0, S) :-
select(Pos-unexplored, S0, S1),
S = [Pos-explored|S1].
or shorter:
state0_state(S0, [Pos-explored|S1) :-
select(Pos-unexplored, S0, S1).
I leave figuring out the state representation I am using here as an easy exercise. Notice the convenient naming convention of using S0, S1, ..., S to chain the different states.
This way, you encode explicit relations about Prolog terms that represent the state. Pure, monotonic, and works in all directions.

Using Brent algorithm to find the root of a function f with an initial guess, but without intervals [a,b] s.t. f(a)f(b)<0

I would like to know how to use Brent algorithm if no opposite signs can be provided.
For example, in the C++ library of Brent algorithm, the root-finding procedure that implements Brent’s method has to be used, following the header file, in the form of
double zero ( double a, double b, double t, func_base& f );
where a, b satisfies the condition of opposite signs: f(a).f(b) < 0
In my problem setting, I need to find the root(s) of a black-box function f. An initial guess is provided but no endpoints a,b, such that f(a) f(b)<0 are provided It seems that in Matlab there is a function fmin that only needs an initial guess. I would like to know how to do this using C++, in particular, using the implementation of Brent such as the one linked above?
Thanks for your ideas.

Without doing exhaustive search (and in the case of real valued function, you cannot, since the value of x is uncountable), there is no way to really guarantee finding the root if such exist.
One heuristic approach to address the problem is using gradient descent, in order to minimze (/maximize) the value of the function, until you find a local minimum (/maximum) or until you find a root.
The problem with this approach is you can get stuck in a local minimum (/maximum) before finding the root, and "think" there is no root, even if one does exist.

Under the assumptions that
f is a black-box, i.e. it can be evaluated but no information on its shape is known whatsoever.
You have to use a method that requires a priori knowledge of an interval [a,b] which brackets a root of f (assuming f is continuous).
I think your only option is to make a preliminary search for two valid points a and b.
This can be done in a number of ways. The most simple-minded could be to run a linear search (with some prescribed step) starting from your initial guess, which can be repeated with a finer step if it turns out unsuccessful. If f is not too "weird" a simple method should do.
Clearly, some basic clue on the properties of f is always necessary, for example that it actually has a root and that it is continuos, differentiable, etc.. All root finding methods (gradient descent, Newton-Raphson, bisection, etc.) assume some basic properties of the function.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js