finding sublist in lists with map/select in mathematica - list

I have, in Wolfram Mathematica 8.0, a nested list like
nList = {{a,b},{f,g},{n,o}}
and a normal list like
lList = {a,b,c,k,m,n,o,z}
and I want to check whether all the sublists in nList are in lList (in the example a,b and n,o are there but not f,g)
I've done it using For[,,,] and indices... Can someone enlighten me on using functions like Map/Thread/Select to do it in one pass?
Edit: If nList contains a,b, lList must contain a,b and not a,c,b or b,a or b,c,a

Assuming that you don't care about element ordering, here is one way:
In[20]:= Complement[Flatten[nList],lList] ==={}
Out[20]= False
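For readers less familiar with Complement, the same order-insensitive test can be sketched in Python as a set difference. This is only an illustration of the idea; the function name all_elements_present is made up here, and the elements are assumed to be hashable.

```python
from itertools import chain

def all_elements_present(n_list, l_list):
    """True if every element of every sublist occurs somewhere in l_list (order ignored)."""
    # Flatten the nested list, then remove everything that l_list contains.
    missing = set(chain.from_iterable(n_list)) - set(l_list)
    return not missing

n_list = [["a", "b"], ["f", "g"], ["n", "o"]]
l_list = ["a", "b", "c", "k", "m", "n", "o", "z"]
print(all_elements_present(n_list, l_list))  # False, because f and g are missing
```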
EDIT
If the order matters, then here is one way:
In[29]:= And@@(MatchQ[lList,{___,PatternSequence[##],___}]&@@@nList)
Out[29]= False
For large number of sub-lists, this may be faster:
In[34]:=
Union[ReplaceList[lList,
{___,x:Alternatives@@(PatternSequence@@@nList),___}:>{x}]]===Union[nList]
Out[34]= False
This works as follows: ReplaceList is a very useful but often overlooked command which returns a list of all expressions that can be obtained by having the pattern-matcher apply the rules to an expression in all possible ways. This contrasts with the way the pattern-matcher usually works, where it stops at the first successful match. PatternSequence is a relatively new addition to the Mathematica pattern language, which lets us treat a given sequence of expressions as a single pattern. This allows us to construct the alternatives pattern, so the resulting pattern says: wherever the sequence of any sublist occurs in the main list, collect it and wrap it back in list braces, re-forming the sublist. We get as many re-formed sublists as there are occurrences of the original sublists' sequences in the larger list. If all sublists are present, then Union of the resulting list should be the same as Union of the original list of sublists.
Here are the benchmarks (I took a list of integers, and overlapping sublists generated by Partition):
In[39]:= tst = Range[1000];
In[41]:= sub = Partition[tst, 2, 1];
In[43]:=
And @@ (MatchQ[tst, {___, PatternSequence[##], ___}] & @@@ sub) // Timing
Out[43]= {3.094, True}
In[45]:=
Union[ReplaceList[tst, {___,x : Alternatives @@ (PatternSequence @@@ sub), ___}
:> {x}]] === Union[sub] // Timing
Out[45]= {0.11, True}
Conceptually, the reason why the second method is faster is that it does its work in a single pass through the list (performed internally by ReplaceList), while the first solution explicitly iterates through the big list once for each sub-list.
EDIT 2 - Performance
If performance is really an issue, then the following code is much faster still:
And @@ (With[{part = Partition[lList, Length[#[[1]]], 1]},
Complement[#, part] === {}] & /@ SplitBy[SortBy[nList, Length], Length])
For example, on our benchmarks:
In[54]:= And@@(With[{part = Partition[tst,Length[#[[1]]],1]},
Complement[#,part]==={}]&/@SplitBy[SortBy[sub,Length],Length])//Timing
Out[54]= {0.,True}
EDIT 3
At the suggestion of @Mr.Wizard, the following performance improvement can be made:
Scan[
If[With[{part = Partition[lList, Length[#[[1]]], 1]},
Complement[#, part] =!= {}], Return[False]] &,
SplitBy[SortBy[nList, Length], Length]
] === Null
Here, as soon as we get a negative answer for the sub-lists of a given length, sub-lists of other lengths are not checked, since we already know that the answer is negative (False). If Scan completes without hitting Return, it returns Null, which means that lList contains all of the sub-lists in nList.
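The partition-based check with an early exit can also be sketched outside Mathematica. The Python version below is an illustration, not a translation of the exact code above: the name contains_all_sublists is made up, and elements are assumed to be hashable so that windows can be stored in a set.

```python
from itertools import groupby

def contains_all_sublists(n_list, l_list):
    """True if every sublist occurs in l_list as a contiguous, ordered run."""
    # Group sublists by length, like SplitBy[SortBy[nList, Length], Length].
    for length, group in groupby(sorted(n_list, key=len), key=len):
        # All contiguous windows of this length, like Partition[lList, length, 1].
        windows = {tuple(l_list[i:i + length])
                   for i in range(len(l_list) - length + 1)}
        # Stop at the first length group that has a missing sublist.
        if any(tuple(sub) not in windows for sub in group):
            return False
    return True

n_list = [["a", "b"], ["f", "g"], ["n", "o"]]
l_list = ["a", "b", "c", "k", "m", "n", "o", "z"]
print(contains_all_sublists(n_list, l_list))  # False, because f,g is absent
```

Building the window set once per distinct length keeps the work proportional to the length of the big list times the number of distinct sublist lengths, rather than to the number of sublists.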

You could use pattern matching to do this job:
In[69]:= nList = {{a, b}, {f, g}, {n, o}};
lList = {a, b, c, k, m, n, o, z};
The @@@ is an alias for Apply at level {1}. The level 1 of nList contains your pairs, and applying replaces the head List in them with the function to the left of @@@.
In[71]:= MatchQ[lList, {___, ##, ___}] & @@@ nList
Out[71]= {True, False, True}
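As a rough analogue in Python, the per-sublist result can be computed by sliding a window over the big list. This is a naive sketch for illustration; occurs_contiguously is a made-up helper name, and both arguments are assumed to be Python lists.

```python
def occurs_contiguously(sub, seq):
    """True if sub appears in seq as an unbroken run, in the same order."""
    n = len(sub)
    return any(seq[i:i + n] == sub for i in range(len(seq) - n + 1))

n_list = [["a", "b"], ["f", "g"], ["n", "o"]]
l_list = ["a", "b", "c", "k", "m", "n", "o", "z"]

# One boolean per sublist, mirroring the list returned by the MatchQ expression.
results = [occurs_contiguously(sub, l_list) for sub in n_list]
print(results)  # [True, False, True]
```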

Related

How to count the number of consecutive occurrences in a list of any element type in OCaml?

In OCaml, suppose I have a string list as follows:
let ls : string list = ["A"; "A"; "B"; "B"; "A"; "Y"; "Y"; "Y"] ;;
I'm having trouble writing a function that calculates how many times an element occurs consecutively and also pairs up that element with its frequency. For instance, given the above list as input, the function should return [("A", 2); ("B", 2); ("A", 1); ("Y", 3)].
I've tried looking for some hints elsewhere but almost all other similar operations are done using int lists, where it is easy to simply add numbers up. But here, we cannot add strings.
My intuition was to use something like fold_left in some similar fashion as below:
let lis : int list = [1;2;3;4;5]
let count (lis : int list) = List.fold_left (fun x y -> x + y) (0) (lis)
which is essentially summing all the elements cumulatively from left to right. But, in my case, I don't want to cumulatively sum all the elements, I just need to count how many times an element occurs consecutively. Some advice would be appreciated!
This is obviously a homework assignment, so I will just give a couple of hints.
When you get your code working, it won't be adding strings (or any other type) together. It will be adding ints together. So you might want to look back at those examples on the net again :-)
You can definitely use fold_left to get an answer. First, note that the result is a list of pairs. The first element of each pair can be any type, depending on the type of the original list. The second element in each pair is an int. So you have a basic type that you're working with: ('a * int) list.
Imagine that you have a way to "increment" such a list:
let increment (list: ('a * int) list) value =
(* This is one way to think of your problem *)
This function looks for the pair in the list whose first element is equal to value. If it finds it, it returns a new list where the associated int is one larger than before. If it doesn't find it, it returns a new list with an extra element (value, 1).
This is the basic operation you want to fold over your list, rather than the + operation of your example code.
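To make the shape of that fold concrete without giving away the OCaml, here is a sketch in Python using functools.reduce. It rests on one assumption that differs from the description above: because the question asks for consecutive occurrences, this increment only compares the value with the most recently added pair instead of searching the whole list (searching the whole list would merge the two separate runs of "A"). The names increment and count_runs are illustrative.

```python
from functools import reduce

def increment(pairs, value):
    """Extend the current run if value matches the latest pair, else start a new run."""
    if pairs and pairs[-1][0] == value:
        last_value, count = pairs[-1]
        return pairs[:-1] + [(last_value, count + 1)]
    return pairs + [(value, 1)]

def count_runs(items):
    # Fold `increment` over the input, starting from an empty list of pairs.
    return reduce(increment, items, [])

print(count_runs(["A", "A", "B", "B", "A", "Y", "Y", "Y"]))
# [('A', 2), ('B', 2), ('A', 1), ('Y', 3)]
```

Note that the only arithmetic is on the int counts, never on the elements themselves, so the same code works for any element type that supports equality.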

Using a single clause compute whether the sum of any three members of a list is equal to given value

We are not supposed to use any of the functions other than the ones listed below:
A single clause must be defined (no more).
+
,
;
.
!
:-
is
Lists
Head and tail syntax for list types
Variables
For example sumlists([1,2,3,5,7],11) then the program execution should print TRUE. Because 1+3+7 (any three)=11 (given N value).
Ideally, we either get an element or don't, as we go along the input list; and we stop either on having reached the needed sum, or having surpassed it, or when the list has been exhausted.
But we can only have one clause one predicate here, and only use certain primitives, so instead we sneakily use + both symbolically, to gather the information for summation, and as an arithmetic operation itself:
sumlists(L, N) :-
N = X+A+B+C, X is A+B+C, !
; L = [H|T], sumlists(T, N+H)
; L = [H|T], sumlists(T, N).
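The take-it-or-skip-it search that this clause performs can be sketched more conventionally in Python. This is an illustration of the same idea, not a rendering of the Prolog trick: the name sum_of_three and the chosen accumulator are made up, and the accumulator plays the role that the growing N+H term plays above.

```python
def sum_of_three(items, target, chosen=()):
    """True if some three members of items add up to target."""
    if len(chosen) == 3:
        # Three members picked: compare their sum with the target.
        return sum(chosen) == target
    if not items:
        # List exhausted before three members were picked.
        return False
    head, tail = items[0], items[1:]
    # Either take the head, or skip it.
    return (sum_of_three(tail, target, chosen + (head,))
            or sum_of_three(tail, target, chosen))

print(sum_of_three([1, 2, 3, 5, 7], 11))  # True, since 1 + 3 + 7 = 11
```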

Understanding Prolog's empty lists

I am reading Bratko's Prolog: Programming for Artificial Intelligence. The easiest way for me to understand lists is visualising them as binary trees, which goes well. However, I am confused about the empty list []. It seems to me that it has two meanings.
When part of a list or enumeration, it is seen as an actual (empty) list element (because somewhere in the tree it is part of some Head), e.g. [a, []]
When it is the only item inside a Tail, it isn’t an element it literally is nothing, e.g. [a|[]]
My issue is that I do not see the logic behind 2. Why is it required for lists to have this possible ‘nothingness’ as a final tail? Simply because the trees have to be binary? Or is there another reason? (In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?) Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?
In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?
Those are two different things. Lists in Prolog are (degenerate) binary trees, but also very much like a singly linked list in a language that has pointers, say C.
In C, you would have a struct with two members: the value, and a pointer to the next list element. Importantly, when the pointer to next points to a sentinel, this is the end of the list.
In Prolog, you have a functor with arity 2: ./2 that holds the value in the first argument, and the rest of the list in the second:
.(a, Rest)
The sentinel for a list in Prolog is the special []. This is not a list, it is the empty list! Traditionally, it is an atom, or a functor with arity 0, if you wish.
In your question:
[a, []] is actually .(a, .([], []))
[a|[]] is actually .(a, [])
which is why:
?- length([a,[]], N).
N = 2.
This is now a list with two elements, the first element is a, the second element is the empty list [].
?- [a|[]] = [a].
true.
This is a list with a single element, a. The [] at the tail just closes the list.
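The difference between the two terms can be mimicked with explicit cons cells. The Python sketch below is only a model of the idea: EMPTY, cons and length are made-up names, with a 2-tuple standing in for the ./2 functor and an empty tuple standing in for [].

```python
EMPTY = ()  # hypothetical sentinel playing the role of Prolog's []

def cons(head, tail):
    """Build one list cell, like .(Head, Tail)."""
    return (head, tail)

def length(lst):
    """Count cells by following the tail until the sentinel is reached."""
    n = 0
    while lst != EMPTY:
        _, lst = lst  # ignore the head; only the tail decides where the list ends
        n += 1
    return n

two = cons("a", cons(EMPTY, EMPTY))  # models [a, []]: the first EMPTY is an element
one = cons("a", EMPTY)               # models [a|[]], which is just [a]
print(length(two), length(one))  # 2 1
```

The same sentinel value appears in both terms; whether it counts as an element depends only on whether it sits in the head position or the tail position of a cell.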
Question: what kind of list is .([], [])?
Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?
Yes, you can leave a free variable there; then, you have a "hole" at the end of the list that you can fill later. Like this:
?- A = [a, a|Tail], % partial list with two 'a's and the Tail
B = [b,b], % proper list
Tail = B. % the tail of A is now B
A = [a, a, b, b], % we appended A and B without traversing A
Tail = B, B = [b, b].
You can also make circular lists, for example, a list with infinitely many x in it would be:
?- Xs = [x|Xs].
Xs = [x|Xs].
Is this useful? I don't know for sure. You could for example get a list that repeats a, b, c with a length of 7 like this:
?- ABCs = [a,b,c|ABCs], % a list that repeats "a, b, c" forever
length(L, 7), % a proper list of length 7
append(L, _, ABCs). % L is the first 7 elements of ABCs
ABCs = [a, b, c|ABCs],
L = [a, b, c, a, b, c, a].
In R at least many functions "recycle" shorter vectors, so this might be a valid use case.
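For comparison, the same "repeat a pattern and take a prefix" idea can be sketched in Python with a lazy iterator instead of a cyclic term. The helper name take_cyclic is made up for this illustration.

```python
from itertools import cycle, islice

def take_cyclic(pattern, n):
    """Return the first n items of the endless repetition of pattern."""
    return list(islice(cycle(pattern), n))

print(take_cyclic(["a", "b", "c"], 7))
# ['a', 'b', 'c', 'a', 'b', 'c', 'a']
```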
See this answer for a discussion on difference lists, which is what the pair A and Tail from the partial-list example above is usually called.
See this answer for implementation of a queue using difference lists.
Your confusion comes from the fact that lists are printed (and read) according to a special human-friendly format. Thus:
[a, b, c, d]
... is syntactic sugar for .(a, .(b, .(c, .(d, [])))).
The . functor holds two values: the item stored in the list, and the rest of the list (a sublist). When [] appears in the item position, it is printed like any other item.
In other words, this:
[[], []]
... is syntactic sugar for .([], .([], [])).
The last [] is not printed because in that context it does not need to. It is only used to mark the end of current list. Other [] are lists stored in the main list.
I understand that but I don't quite get why there is such a need for that final empty list.
The final empty list is a convention. It could be written empty or nil (like Lisp), but in Prolog this is denoted by the [] atom.
Note that in prolog, you can leave the sublist part uninstantiated, like in:
[a | T]
which is the same as:
.(a, T)
Such partial lists are the building blocks of what are known as difference lists.
Your understanding of 1. and 2. is correct -- where by "nothing" you mean, element-wise. Yes, an empty list has nothing (i.e. no elements) inside it.
The logic behind having a special sentinel value SENTINEL = [] to mark the end of a cons-cells chain, as in [1,2,3] = [1,2|[3]] = [1,2,3|SENTINEL] = .(1,.(2,.(3,SENTINEL))), as opposed to some ad-hoc encoding, like .(1,.(2,3)) = [1,2|3], is types consistency. We want the first field of a cons cell (or, in Prolog, the first argument of a . functored term) to always be treated as "a list's element", and the second -- as "a list". That's why [] in [1, []] counts as a list's element (as it appears as a 1st argument of a .-functored compound term), while the [] in [1 | []] does not (as it appears as a 2nd argument of such term).
Yes, the trees have to be binary -- i.e. the functor . as used to encode lists is binary -- and so what should we put there in the final node's tail field, that would signal to us that it is in fact the final node of the chain? It must be something, consistent and easily testable. And it must also represent the empty list, []. So it's only logical to use the representation of an empty list to represent the empty tail of a list.
And yes, having a non-[] final "tail" is perfectly valid, like in [1,2|3], which is a perfectly valid Prolog term -- it just isn't a representation of a list {1 2 3}, as understood by the rest of Prolog's built-ins.

Erlang for each list of lists

I want to make a new list that only contains the elements of the "list of lists" which have a length of 1.
The code that i provide gives a exception error: no function clause matching.
lists:foreach(fun(X) if length(X) =:= 1 -> [X] end, ListOfLists).
I am new to erlang, and I am having trouble finding an alternative way for writing this piece of code.
Can someone give me some advice on how to do so?
You can match in a list comprehension to get this quite naturally:
[L || L = [_] <- ListOfLists]
For example:
1> LoL = [[a], [b,c], d, [e], [f,g]].
[[a],[b,c],d,[e],[f,g]]
2> [L || L = [_] <- LoL].
[[a],[e]]
If you want the elements themselves (as in result [a, e] instead of [[a], [e]]) you can match on the element within the shape:
3> [L || [L] <- LoL].
[a,e]
Depending on the size of the lists contained within LoL, matching will be significantly faster than calling length/1 on every member. Calling length/1 and then testing the result requires traversing the entire list, returning a value, and then testing it. This is arbitrarily more overhead than checking if the second element of the list is a termination (in other words, if the "shape" of the data matches).
Regarding your attempt above...
As a newcomer to Erlang it might be helpful to become familiar with the basic functional list operations. They pop up over and over in functional (and logic) programming, and generally have the same names. "maps", "folds", "filters", "cons", "car" ("head" or "hd" or [X|_]), "cdr" ("tail" or "tl" or [_|X]), and so on.
Your original attempt:
lists:foreach(fun(X) if length(X) =:= 1 -> [X] end, ListOfLists).
This can't work because foreach/2 only returns ok, never any other value. It is used only when you want to iterate over a list for side effects, not when you want a return value. For example, if I have a chat system where each chat room has a list of current members, and broadcasting a message really means sending that message to each member in the list, I might do:
-spec broadcast(list(), unicode:chardata()) -> ok.
broadcast(Users, Message) ->
Forward = fun(User) -> send(User, Message) end,
lists:foreach(Forward, Users).
I don't care about the return value, really, and we aren't changing anything in the list Users or the Message. (Note that here we are using the anonymous function to capture the relevant state that it requires -- essentially currying out the Message value so we can present a function of arity 1 to the list operation foreach/2. This is where lambdas become most useful in Erlang vs named functions.)
When you want to take a list as an input and return a single, aggregate value (use some operation to roll all the values in the list into one) you can use a fold (you almost always want to use foldl/3, specifically):
4> lists:foldl(fun(X, A) when length(X) =:= 1 -> [X|A]; (_, A) -> A end, [], LoL).
[[e],[a]]
Broken down that reads as:
Single =
fun
(X, A) when length(X) =:= 1 -> [X|A];
(_, A) -> A
end,
ListOfSingles = lists:foldl(Single, [], LoL).
This is an anonymous function that has two clauses.
Written another way with a case we could do:
Single =
fun(X, A) ->
case length(X) of
1 -> [X|A];
_ -> A
end
end,
This is a matter of preference, as is the choice to inline that as an anonymous function within the call to foldl/3.
What you are really trying to do, though, is filter the list, and there is a universal list function called just that. You supply a testing function that returns a boolean -- if the test is true then the element will turn up in the output, otherwise it will not:
5> lists:filter(fun([X]) -> true; (_) -> false end, LoL).
[[a],[e]]
Breaking the lambda out as before:
6> Single =
6> fun([X]) -> true;
6> (_) -> false
6> end.
#Fun<erl_eval.6.54118792>
7> lists:filter(Single, LoL).
[[a],[e]]
Here we matched on the shape of the element in the anonymous function head. This filter is almost exactly equivalent to the list comprehension above (the only difference, really, is in the underlying implementation of list comprehensions -- semantically they are identical).
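The filter and the two comprehensions have direct counterparts in most languages. As a point of comparison, here is a Python sketch; is_singleton is a made-up name, and the strings stand in for the Erlang atoms. Python has no direct equivalent of matching on the [_] shape in a comprehension, so the test is written out explicitly.

```python
def is_singleton(x):
    """True only for lists of exactly one element (the bare "d" is not a list)."""
    return isinstance(x, list) and len(x) == 1

lol = [["a"], ["b", "c"], "d", ["e"], ["f", "g"]]

singles = [x for x in lol if is_singleton(x)]      # keeps the one-element lists
elements = [x[0] for x in lol if is_singleton(x)]  # unwraps them
print(singles)   # [['a'], ['e']]
print(elements)  # ['a', 'e']
```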

Prolog: square numbers in a list

How do I square numbers in a list in Prolog?
The list can contain numbers, atoms and lists.
for example: [a,b,2,3,4,[3],[c,d,9]] and the answer should be [a,b,4,9,16,[3],[c,d,9]].
As we see in the answer it should be a shallow squaring of the values in the list.
2->4
3->9
4->16
What I have tried so far,
square([],X).
square([A|B],X):-number(A), A is A*A, square(B,X).
X will contain squared values. Base case is when empty list is received. I check if head (A) is a number then I go ahead square the number and change A to A * A. Then go ahead and call the square function for remaining part B.
Please suggest where I am doing wrong.
EDIT: Correct answer as follows. By aBathologist. Please read his comment for detailed explanation.
squared_members([], []).
squared_members([L|Ls], [SqrdL|SqrdLs]) :-
number(L),
SqrdL is L * L,
squared_members(Ls, SqrdLs).
squared_members([L|Ls], [L|SqrdLs]) :-
\+number(L),
squared_members(Ls, SqrdLs).
And
squared_members([], []).
squared_members([L|Ls], [M|Ms]) :-
( number(L)
-> M is L * L, squared_members(Ls, Ms)
; M = L, squared_members(Ls, Ms)
).
We're defining a predicate which describes the relationship between one list, A, and another list, B: B should have all the same elements as A, except that any number in A should be squared in B.
Where you've gone wrong:
Your ground condition, square([],X), says that when A is empty, then B is anything (so, for instance, even something like square([], 15) is true). But this doesn't capture the meaning we're after, since the second argument should be a list with the same number of members as the first. That is, when the first list is empty then the second list should be empty.
The same problem occurs with your recursive rule, since, at each iteration, an undetermined variable is passed along, and there is never anything said about the relationship between the first list and the second.
This rule will only succeed if the first element of a list is a number. In the case where the first element is, e.g., a (like in your example), number(a) will be false. Since there are no additional rules for the predicate, it will simply be false unless every member of the first list is a number.
Variables in Prolog must always have the same, consistent value throughout the context in which they appear. They function like variables in arithmetic formulas. The formula a + b - b = a is true for any values of a and b, but only if a and b are each assigned one consistent value throughout the equation. The same is true in Prolog statements of the form <variable> is <expression>. What you've written says a = a * a, which can only hold when a is 0 or 1.
What your definition says is, roughly, this: The list B is a squared version of the list A if A is an empty list and B is anything OR if the first element of A is a number, and that number is equal to itself squared, and B is a squared version of the rest of A.
Here's one possible solution:
squared_members([], []).
squared_members([L|Ls], [SqrdL|SqrdLs]) :-
number(L),
SqrdL is L * L,
squared_members(Ls, SqrdLs).
squared_members([L|Ls], [L|SqrdLs]) :-
\+number(L),
squared_members(Ls, SqrdLs).
Notice that this definition is able to establish a meaningful relationship between the two lists by having them either share variables, or contain elements related by a chain of relations between variables (i.e., SqrdL is related to L by virtue of being L * L). This definition has one more clause than yours, which enables it to take account of the members of a list which are not numbers: those are added to the second list unaltered.
An alternative definition, using If-Then-Else notation for cleaner expression, would be the following:
squared_members([], []).
squared_members([L|Ls], [M|Ms]) :-
( number(L)
-> M is L * L, squared_members(Ls, Ms)
; M = L, squared_members(Ls, Ms)
).
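For readers coming from other languages, the relation described here corresponds to a shallow map over the list. The Python sketch below is a one-directional illustration only (it computes the second list from the first, whereas the Prolog predicate relates the two); the function name mirrors the predicate, and excluding booleans from the numeric test is an assumption made for this example.

```python
from numbers import Number

def squared_members(items):
    """Square top-level numbers; leave atoms and nested lists untouched."""
    return [
        x * x if isinstance(x, Number) and not isinstance(x, bool) else x
        for x in items
    ]

print(squared_members(["a", "b", 2, 3, 4, [3], ["c", "d", 9]]))
# ['a', 'b', 4, 9, 16, [3], ['c', 'd', 9]]
```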