I'm trying to figure out how to approach the app_ne problem in SF. My thinking is to induct over the first regular expression, as it will allow us to satisfy the first disjunct, whereas all the other regular expression forms will allow one to prove the existential right disjunct.
(i) Is this a correct approach to the problem?
(ii) If so, how does one deal with the empty set case? This got me right away.
(iii) Is there any way to admit a single part of a proof and then come back to it later (since this easy case is throwing me off and I would like to work through some of the other cases)?
Lemma app_ne : forall (a : ascii) s re0 re1,
    a :: s =~ (App re0 re1) <->
    ([ ] =~ re0 /\ a :: s =~ re1) \/
    exists s0 s1, s = s0 ++ s1 /\ a :: s0 =~ re0 /\ s1 =~ re1.
Proof.
  intros.
  split.
  - intros. induction re0.
    * right. inversion H.
      (* + apply re_not_empty_correct. *)
      (* + apply MEmpty. *)
Abort.
My thinking is to induct over the first regular expression, as it will allow us to satisfy the first disjunct,
I don't understand that reasoning (maybe it's not what you actually meant). If you could just prove the first disjunct, then there would be no point in having a disjunction in the first place.
whereas all the other regular expression forms will allow one to prove the existential right disjunct.
"other" than what?
(iii) Is there any way to admit a single part of a proof and then come back to it later (since this easy case is throwing me off and I would like to work through some of the other cases)?
There is the admit. tactic to skip the current goal and the Admitted. command to skip the whole proof.
Hint for this problem: what does the first assumption mean: a :: s =~ (App re0 re1) (i.e., look at what the definition of =~ says about App)?
What is the best intuition for why the first definition would be rejected, while the second one would be accepted?
let rec a = b (* This kind of expression is not allowed as right-hand side of `let rec' *)
and b x = a x
let rec a x = b x (* oki doki *)
and b x = a x
Is it linked to the two reduction approaches: one rule for every function substitution (and a Rec delimiter) vs. one rule per function definition (and lambda lifting)?
Verifying that recursive definitions are valid is a very hard thing to do.
Basically, you want to avoid patterns in this kind of form:
let rec x = x
When every definition is a function declaration, you know it is going to be fine. At worst, you are creating an infinite loop, but at least you are creating a value. The x = x case, however, does not produce anything and has no meaning at all.
Now, in your specific case, you are indeed creating functions (that loop indefinitely), but checking that you are is actually harder. To avoid writing code that would attempt exhaustive checking, the OCaml developers decided to go with a much simpler algorithm.
You can find an overview of the rules in the OCaml manual. Here is an excerpt (emphasis mine):
It will be accepted if each one of expr1 … exprn is statically constructive with respect to name1 … namen, is not immediately linked to any of name1 … namen, and is not an array constructor whose arguments have abstract type.
As you can see, direct recursive variable binding is not permitted.
This is not a final rule though, as there are improvements to that part of the compiler pending release. I haven't tested if your example passes with it, but some day your code might be accepted.
I'm writing a Prolog program to check if a variable is an integer.
The way I'm "returning" the result is strange, but I don't think it's important for answering my question.
The Tests
I've written passing unit tests for this behaviour; here they are...
foo_test.pl
:- begin_tests('foo').
:- consult('foo').

test('that_1_is_recognised_as_int') :-
    count_ints(1, 1).

test('that_atom_is_not_recognised_as_int') :-
    count_ints(arbitrary, 0).

:- end_tests('foo').
:- run_tests.
The Code
And here's the code that passes those tests...
foo.pl
count_ints(X, Answer) :-
    integer(X),
    Answer is 1.

count_ints(X, Answer) :-
    \+ integer(X),
    Answer is 0.
The Output
The tests are passing, which is good, but I'm receiving a warning when I run them. Here is the output when running the tests...
?- ['foo_test'].
% foo compiled into plunit_foo 0.00 sec, 3 clauses
% PL-Unit: foo
Warning: /home/brandon/projects/sillybin/prolog/foo_test.pl:11:
/home/brandon/projects/sillybin/prolog/foo_test.pl:4:
PL-Unit: Test that_1_is_recognised_as_int: Test succeeded with choicepoint
. done
% All 2 tests passed
% foo_test compiled 0.03 sec, 1,848 clauses
true.
I'm using SWI-Prolog (Multi-threaded, 64 bits, Version 6.6.6)
I have tried combining the two count_ints predicates into one, using ;, but it still produces the same warning.
I'm on Debian 8 (I doubt it makes a difference).
The Question(s)
What does this warning mean? And...
How do I prevent it?
First, let us forget the whole testing framework and simply consider the query on the toplevel:
?- count_ints(1, 1).
true ;
false.
This interaction tells you that after the first solution, a choice point is left. This means that alternatives are left to be tried, and they are tried on backtracking. In this case, there are no further solutions, but the system was not able to tell this before actually trying them.
Using the all/1 option for test cases
There are several ways to fix the warning. A straightforward one is to state the test case like this:
test('that_1_is_recognised_as_int', all(Count = [1])) :-
count_ints(1, Count).
This implicitly collects all solutions, and then makes a statement about all of them at once.
Using if-then-else
A somewhat more intelligent solution is to make count_ints/2 itself deterministic!
One way to do this is using if-then-else, like this:
count_ints(X, Answer) :-
    (   integer(X) -> Answer = 1
    ;   Answer = 0
    ).
We now have:
?- count_ints(1, 1).
true.
i.e., the query now succeeds deterministically.
Pure solution: Clean data structures
However, the most elegant solution is to use a clean representation, so that you and the Prolog engine can distinguish all cases by pattern matching.
For example, we could represent integers as i(N), and everything else as other(T).
In this case, I am using the wrappers i/1 and other/1 to distinguish the cases.
Now we have:
count_ints(i(_), 1).
count_ints(other(_), 0).
And the test cases could look like:
test('that_1_is_recognised_as_int') :-
count_ints(i(1), 1).
test('that_atom_is_not_recognised_as_int') :-
count_ints(other(arbitrary), 0).
This also runs without warnings, and has the significant advantage that the code can actually be used for generating answers:
?- count_ints(Term, Count).
Term = i(_1900),
Count = 1 ;
Term = other(_1900),
Count = 0.
In comparison, with the other versions we get:
?- count_ints(Term, Count).
Count = 0.
Which, unfortunately, can at best be considered covering only 50% of the possible cases...
Tighter constraints
As Boris correctly points out in the comments, we can make the code even stricter by constraining the argument of i/1 terms to integers. For example, we can write:
count_ints(i(I), 1) :- I in inf..sup.
count_ints(other(_), 0).
Now, the argument must be an integer, which becomes clear by queries like:
?- count_ints(X, 1).
X = i(_1820),
_1820 in inf..sup.
?- count_ints(i(any), 1).
ERROR: Type error: `integer' expected, found `any' (an atom)
Note that the example Boris mentioned fails also without such stricter constraints:
?- count_ints(X, 1), X = anything.
false.
Still, it is often useful to add further constraints on arguments, and if you need to reason over integers, CLP(FD) constraints are often a good and general solution to explicitly state type constraints that are otherwise only implicit in your program.
Note that integer/1 did not get the memo:
?- X in inf..sup, integer(X).
false.
This shows that, although X is without a shadow of a doubt constrained to integers in this example, integer(X) still does not succeed. Thus, you cannot use predicates like integer/1 etc. as a reliable detector of types. It is much better to rely on pattern matching and using constraints to increase the generality of your program.
First things first: the documentation of the SWI-Prolog Prolog Unit Tests package is quite good. The different modes are explained in Section 2.2. Writing the test body. The relevant sentence in 2.2.1 is:
Deterministic predicates are predicates that must succeed exactly once and, for well behaved predicates, leave no choicepoints. [emphasis mine]
What is a choice point?
In procedural programming, when you call a function, it can return a value, or a set of values; it can modify state (local or global); whatever it does, it will do it exactly once.
In Prolog, when you evaluate a predicate, a proof tree is searched for solutions. It is possible that there is more than one solution! Say you use between/3 like this:
For x = 1, is x in [0, 1, 2]?
?- between(0, 2, 1).
true.
But you can also ask:
Enumerate all x such that x is in [0, 1, 2].
?- between(0, 2, X).
X = 0 ;
X = 1 ;
X = 2.
After you get the first solution, X = 0, Prolog stops and waits; this means:
The query between(0, 2, X) has at least one solution, X = 0. It might have further solutions; press ; and Prolog will search the proof tree for the next solution.
The choice point is the mark that Prolog puts in the search tree after finding a solution. It will resume the search for the next solution from that mark.
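For readers coming from procedural languages, a generator is a rough analogy (it is not Prolog's actual execution model). The Python sketch below uses a hypothetical between generator as a stand-in for between/3: after the first value is taken, the suspended generator plays the role of the choice point, and asking for more resumes the search from that mark.

```python
def between(low, high):
    """Yield low, low + 1, ..., high one solution at a time.

    Hypothetical stand-in for Prolog's between/3 with an unbound third argument.
    """
    for x in range(low, high + 1):
        yield x


solutions = between(0, 2)   # nothing has been searched yet
first = next(solutions)     # first solution, like X = 0
# 'solutions' is now suspended mid-search: the analogue of a choice point.
rest = list(solutions)      # like pressing ';' until the search is exhausted
print(first, rest)
```

If the generator were simply dropped after the first next call, the remaining alternatives would never be explored, which is roughly the effect of committing to the first solution.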
The warning "Test succeeded with choicepoint" means:
The solution Prolog found was the solution the test expected; however, it leaves behind a choice point, so it is not "well-behaved".
Are choice points a problem?
Choice points you didn't put there on purpose could be a problem. Without going into detail, they can prevent certain optimizations and create inefficiencies. That's kind of OK, but sometimes only the first solution is the solution you (the programmer) intended, and a next solution can be misleading or wrong. Or, famously, after giving you one useful answer, Prolog can go into an infinite loop.
Again, this is fine if you know it: you just never ask for more than one solution when you evaluate this predicate. You can wrap it in once/1, like this:
?- once( between(0, 2, X) ).
or
?- once( count_ints(X, Answer) ).
If someone else uses your code, though, all bets are off. Succeeding with a choice point can mean anything from "there are other useful solutions" to "no more solutions, this will now fail" to "other solutions, but not the kind you wanted" to "going into an infinite loop now!"
Getting rid of choice points
To the particular example: You have a built-in, integer/1, which will succeed or fail without leaving choice points. So, these two clauses from your original definition of count_ints/2 are mutually exclusive for any value of X:
count_ints(X, Answer) :-
integer(X), ...
count_ints(X, Answer) :-
\+ integer(X), ...
However, Prolog doesn't know that. It only looks at the clause heads and those two are identical:
count_ints(X, Answer) :- ...
count_ints(X, Answer) :- ...
Because the two heads are identical and Prolog doesn't look any further than the clause head to decide whether the other clause is worth trying, it tries the second clause even if the first argument is indeed an integer (this is the "choice point" in the warning you get), and that attempt invariably fails.
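As a loose illustration only (Python generators are an analogy, not how a Prolog engine works), the two-clause definition can be modelled as a hypothetical generator that tries each clause body in turn. After the first clause has produced its answer, the second one is still waiting to be tried, and only trying it shows that it fails.

```python
def count_ints(x):
    """Try both clause bodies in order, since the heads cannot tell them apart."""
    # Clause 1: count_ints(X, Answer) :- integer(X), Answer is 1.
    if isinstance(x, int):
        yield 1
    # Clause 2: count_ints(X, Answer) :- \+ integer(X), Answer is 0.
    if not isinstance(x, int):
        yield 0


answers = count_ints(1)
print(next(answers))                       # clause 1 succeeds with 1
# The generator has not finished yet: clause 2 is the pending alternative.
print(next(answers, "no more solutions"))  # clause 2 is tried and fails
```

The second print corresponds to the false that the toplevel shows after you press ; on the original query.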
Since you know that the two clauses are mutually exclusive, it is safe to tell Prolog to forget about the other clause. You can use once/1, as shown above. You can also cut the remainder of the proof tree when the first argument is indeed an integer:
count_ints(X, 1) :- integer(X), !.
count_ints(_, 0).
This has exactly the same operational semantics, but is maybe easier for the Prolog compiler to optimize:
count_ints(X, Answer) :-
    (   integer(X)
    ->  Answer = 1
    ;   Answer = 0
    ).
... as in the answer by mat. As for using pattern matching, it's all good, but if the X comes from somewhere else, and not from the code you have written yourself, you will still have to make this check at some point. You end up with something like:
variable_tagged(X, T) :-
    (   integer(X) -> T = i(X)
    ;   float(X) -> T = f(X)
    ;   atom(X) -> T = a(X)
    ;   var(X) -> T = v(X)
    % and so on
    ;   T = other(X)
    ).
At that point you can write your count_ints/2 as suggested by mat, and Prolog will know by looking at the clause heads that your two clauses are mutually exclusive.
I once asked a question that boils down to the same Prolog behaviour and how to deal with it. The answer by mat recommends the same approach. The comment by mat to my comment below the answer is just as important as the answer itself (if you are writing real programs at least).
Whenever I consider learning a new language -- Haskell in this case -- I try to hack together a primitive grep clone to see how good the language implementation and/or its libraries are at text processing, because that's a major use case for me.
Inspired by code on the Haskell wiki, I came up with the following naive attempt:
{-# LANGUAGE FlexibleContexts, ExistentialQuantification #-}

import Text.Regex.PCRE
import System.Environment

io :: ([String] -> [String]) -> IO ()
io f = interact (unlines . f . lines)

regexBool :: forall r l .
  (RegexMaker Regex CompOption ExecOption r,
   RegexLike Regex l) =>
  r -> l -> Bool
regexBool r l = l =~ r :: Bool

grep :: forall r l .
  (RegexMaker Regex CompOption ExecOption r, RegexLike Regex l) =>
  r -> [l] -> [l]
grep r = filter (regexBool r)

main :: IO ()
main = do
  argv <- getArgs
  io $ grep $ argv !! 0
This appears to be doing what I want it to, but unfortunately, it's really slow -- about 10 times slower than a Python script doing the same thing. I assume it's not the regex library that's at fault here, because it's calling into PCRE which should be plenty fast (switching to Text.Regex.Posix slows things down quite a bit further). So it must be the String implementation, which is instructive from a theoretical point of view but inefficient according to what I've read.
Is there an alternative to Strings in Haskell that's both efficient and convenient (i.e. there's little or no friction when switching to using that instead of Strings) and that fully and correctly handles UTF-8-encoded Unicode, as well as other encodings without too much hassle if possible? Something that everybody uses when doing text processing in Haskell but that I just don't know about because I'm a complete beginner?
It's possible that the slow speed is caused by String itself, which is just the standard library's list type applied to Char. I've often run into performance problems with it in the past.
It would be a good idea to profile your executable, to see where it spends its time: Tools for analyzing performance of a Haskell program. Profiling Haskell programs is really easy (compile with a switch and execute your program with an added argument, and the report is written to a text file in the current working directory).
As a side note, I use exactly the same approach as you when learning a new language: create something that works. My experience doing this with Haskell is that I can easily gain an order of magnitude or two in performance by profiling and making relatively simple changes (usually a couple of lines).
I'm very new to Erlang. I tried to find out whether a list index is out of bounds (before using it), so I wanted to write an if clause with something like
if lists:flatlength(A) < DestinationIndex ....
I discovered that such function results cannot be used in if guards, so I used case instead. This results in a nested case statement:
case Destination < 1 of
    true -> {ok,NumberOfJumps+1};
    false ->
        case lists:flatlength(A) < Destination of
            true ->
                doSomething;
            false ->
                case lists:member(Destination,VisitedIndices) of
                    true -> doSomething;
                    false ->
                        doSomethingElse
                end
        end
end.
I found this bad in terms of readability and code style. Is this how you do things like that in Erlang, or is there a more elegant way to do this?
Thanks in advance
Before you take the following as some magical gospel, please note that the way this function is entered is almost certainly unidiomatic. You should seek to limit cases way before you get to this point -- the need for nested cases is itself usually a code smell. Sometimes it is genuinely unavoidable, but I strongly suspect that some aspects of this can be simplified way earlier in the code (especially with some more thought given to the data structures that are being passed around, and what they mean).
Without seeing where the variable A is coming from, I'm making one up as a parameter here. Also, without seeing how this function is entered, I'm making up a function head, because without the rest of the function to go by it's pretty hard to say anything for sure.
With all that said, let's refactor this a bit:
First up, we want to get rid of the one thing we know can go into a guard, and that is your first case that checks whether Destination < 1. Instead of using a case, let's consider that we really want to call two different clauses of a common function:
foo(Destination, NumberOfJumps, _, _) when Destination < 1 ->
    {ok, NumberOfJumps + 1};
foo(Destination, _, VisitedIndices, A) ->
    case lists:flatlength(A) < Destination of
        true -> doSomething;
        false ->
            case lists:member(Destination,VisitedIndices) of
                true -> doSomething;
                false -> doSomethingElse
            end
    end.
Not too weird. But those nested cases that remain... something is annoying about them. This is where I suspect something can be done elsewhere to alleviate the choice of paths being taken here much earlier in the code. But let's pretend that you have no control over that stuff. In this situation, assigning booleans and using an if can be a readability enhancer:
foo(Destination, NumberOfJumps, _, _) when Destination < 1 ->
    {ok, NumberOfJumps + 1};
foo(Destination, _, VisitedIndices, A) ->
    ALength = lists:flatlength(A) < Destination,
    AMember = lists:member(Destination, VisitedIndices),
    NextOp =
        if
            ALength -> fun doSomething/0;
            AMember -> fun doSomething/0;
            not AMember -> fun doSomethingElse/0
        end,
    NextOp().
Here I have just cut to the chase and made sure we only execute each potentially expensive operation once by assigning the result to a variable -- but this makes me very uncomfortable because I shouldn't be in this situation to begin with.
In any case, something like this should test the same as the previous code, and in the interim may be more readable. But you should be looking for other places to simplify. In particular, this VisitedIndices business feels fishy (why don't we already know if Destination is a member?), the variable A needing to be flattened after we've arrived in this function is odd (why is it not already flattened? why is there so much of it?), and NumberOfJumps feels something like an accumulator, but its presence is mysterious.
What makes me feel weird about these variables, you might ask? The only one that is consistently used is Destination -- the others are only used either in one clause of foo/4 or the other, but not both. That makes me think this should be different paths of execution somewhere further up the chain of execution, instead of all winding up down here in a super-decision-o-matic type function.
EDIT
With a fuller description of the problem in hand (reference the discussion in comments below), consider how this works out:
-module(jump_calc).
-export([start/1]).

start(A) ->
    Value = jump_calc(A, length(A), 1, 0, []),
    io:format("Jumps: ~p~n", [Value]).

jump_calc(_, Length, Index, Count, _) when Index < 1; Index > Length ->
    Count;
jump_calc(Path, Length, Index, Count, Visited) ->
    NewIndex = Index + lists:nth(Index, Path),
    NewVisited = [Index | Visited],
    NewCount = Count + 1,
    case lists:member(NewIndex, NewVisited) of
        true -> NewCount;
        false -> jump_calc(Path, Length, NewIndex, NewCount, NewVisited)
    end.
Always try to front-load as much processing as possible instead of performing the same calculation over and over. Consider how readily we can barrier each iteration behind guards, and how much conditional stuff we don't even have to write because of this. Function matching is a powerful tool -- once you get the hang of it you will really start to enjoy Erlang.
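To make the two exit conditions easier to trace by hand, here is a rough Python transliteration of jump_calc/5. It is only an illustration: the Python names are hypothetical, indices stay 1-based to match lists:nth/2, and a set replaces the Erlang list of visited indices.

```python
def jump_calc(path):
    """Count jumps until the index leaves the list or revisits a position.

    Hypothetical Python port of jump_calc/5; indices are 1-based as in Erlang.
    """
    length = len(path)            # computed once, up front
    index, count = 1, 0
    visited = set()               # a set instead of the Erlang list of indices
    while 1 <= index <= length:   # negation of the guard on the first clause
        visited.add(index)
        index += path[index - 1]  # lists:nth(Index, Path) is 1-based
        count += 1
        if index in visited:      # cycle detected: stop counting
            return count
    return count


print(jump_calc([1, 1, 1]))   # walks off the end of the list after 3 jumps
print(jump_calc([1, -1]))     # bounces back to index 1 after 2 jumps
```

The while condition plays the role of the guarded first clause, and the membership test plays the role of the remaining case expression.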