Finding similar sets with clojure's core.logic / minikanren - clojure

This is my first question on Stack Overflow.
I'm new to logic programming and am trying to evaluate whether it can be used to solve some matching problems I'm working on.
Problem:
Let's say we have a set A that looks like this.
A = {1, 2, 3, 4}
And then we have some other sets that look like this.
B = {1, 2}
C = {3, 5, "banana"}
D = {2, 3, 4}
The problem I’m trying to solve is,
"Find me the set that share the most members with set A, compared to the other sets we know about."
The answer in this case should be set D, because it shares three members with set A, whereas the other sets share only two (B) and one (C).
Question 1:
Can logic programming solve this type of problem?
Question 2:
If it can, how would you do it in, for example, Clojure's core.logic?

Qeshi
The following shows getting the best fit result using clojure.set:
(ns sample.sandbox
  (:require [clojure.set :as set]))
(def A #{1 2 3 4})
(def B #{1 2})
(def C #{3 5 "banana"})
(def D #{2 3 4})
(defn best-fit-set
  [control & sets]
  (apply max-key count (map #(set/intersection control %) sets)))
(best-fit-set A B C D) ;; => #{4 3 2}
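Note that the function above returns the shared members rather than the matching set itself. As a rough sketch of the same max-by-intersection-size idea that also reports which set won, here is a Python version. The function name `best_fit_set` and the dictionary of named candidates are assumptions made for illustration and are not part of the original answer.

```python
def best_fit_set(control, candidates):
    """Return (name, shared) for the candidate sharing the most members with control.

    `candidates` is a hypothetical mapping from a set's name to the set itself.
    """
    # Pick the name whose set has the largest intersection with the control set.
    name = max(candidates, key=lambda k: len(control & candidates[k]))
    return name, control & candidates[name]


A = {1, 2, 3, 4}
candidates = {"B": {1, 2}, "C": {3, 5, "banana"}, "D": {2, 3, 4}}

name, shared = best_fit_set(A, candidates)
print(name, shared)
```

On a tie, `max` keeps the first candidate it encounters, which may or may not be the behavior you want.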

Related

Merging 3D list in 2D list: Python

What would be a good pythonic way to merge my 3D list into a 2D one.
a= [[[1,2],[3,4]],[[2,3],[21,18]]]
I want an output of:
a= [[1,2,3,4],[2,3,21,18]]
I tried with
new =list(itertools.chain.from_iterable(a))
This does not give the desired result. It gives
a= [[1,2],[3,4],[2,3],[21,18]]
Your (Sb92) approach is almost correct, though instead of performing from_iterable on the outer list, it needs to be applied to the inner lists.
The following would work:
[list(itertools.chain.from_iterable(b)) for b in a]
from itertools import chain
a= [[[1,2],[3,4]],[[2,3],[21,18]]]
[list(chain(*i)) for i in a]
[[1, 2, 3, 4], [2, 3, 21, 18]]
Give this a try (no set() here, since it would drop repeated values and does not guarantee order):
a= [[[1,2],[3,4]],[[2,3],[21,18]]]
print([sum(x, []) for x in a])

How do I get the longest continuous sequence in a matrix which fulfills a criterion?

I have a matrix in which every vector consists of hashmaps. Here's a toy example:
[
[{:label x, ...}, {:label y, ...}, ...]
[{:label y, ...}, {:label z, ...}, ...]
[{:label p, ...}, {:label x, ...}, ...]
...
[{:label x, ...}, {:label x, ...}, ...]
]
Because only the label is relevant to my problem, I have removed the other things.
Now, what I want to do is for each row, calculate the longest sequence of continuous labels. That is, if the labels of a row are A B B B A A C A, then the longest sequence is B B B. What I then want to return is a tuple of 1) which row k has the longest such sequence (any of the longest is fine in case of a tie), and also 2) what the index i of the first item in the sequence is, as well as 3) what the index j of the last item in the sequence is.
So, for this simplified matrix, that would be k = 1, i = 2, j = 5.
[
[A B B A A C]
[C B A A A A]
[B A C A B A]
]
I'm new to functional programming and I really like it so far, but I can't quite figure out how to do this without resorting to e.g. the foreach loops of my native php. I'm not looking for somebody to do everything for me, but a hint in the right direction would be very much appreciated. Thank you.
(def m "ABBAAAC")
(->> m
     (map-indexed vector)
     (partition-by second)
     (sort-by count >)
     first)
Gives:
([3 \A] [4 \A] [5 \A])
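The snippet above handles a single row. As a hedged sketch of how the same partition-into-runs idea might extend to the whole matrix and produce the (k, i, j) tuple the question asks for, here is a Python version using `itertools.groupby`. The function name `longest_run` and its `key` parameter are assumptions for illustration; for rows of maps, `key` would extract the label.

```python
from itertools import groupby


def longest_run(matrix, key=None):
    """Return (k, i, j): row index plus first and last index of the longest
    run of consecutive equal labels, or None for an empty matrix.

    On a tie, the first run encountered is kept.
    """
    best = None  # (length, k, i, j)
    for k, row in enumerate(matrix):
        i = 0  # start index of the current run within the row
        for _, group in groupby(row, key=key):
            length = len(list(group))
            if best is None or length > best[0]:
                best = (length, k, i, i + length - 1)
            i += length
    return None if best is None else best[1:]


matrix = [list("ABBAAC"), list("CBAAAA"), list("BACABA")]
print(longest_run(matrix))  # (1, 2, 5)
```

With rows of dictionaries, something like `longest_run(rows, key=lambda d: d["label"])` would compare only the labels.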

How to change a value of sub maps of a map?

I am new to Clojure and functional programming and now I am stuck with a problem. I get such a data structure:
{
:service1 \a
:service2 \b
:service3 \c
:default \d
:alert-a {
:duration "00:00-23:59"
:if-alert true
:continuous-times 2
:time-interval [2 6 9 15 30 60]
:times -1
}
:alert-b {
:duration "09:00-23:00"
:if-alert true
:continuous-times 2
:time-interval [2 6 9 15 30 60]
:times -1
}
:alert-c {
:duration "00:00-23:59"
:if-alert true
:continuous-times 5
:time-interval [5]
:times 1
}
:alert-d {
:duration "00:00-23:59"
:if-alert true
:continuous-times 5
:time-interval [5 15 30 60]
:times -1
}
}
This is read from a config file. I want to change all the :duration values to DateTime objects using clj-time, so that I get something like:
{
:service1 \a
:service2 \b
:service3 \c
:default \d
:alert-a {
:duration DateTime Object
:if-alert true
:continuous-times 2
:time-interval [2 6 9 15 30 60]
:times -1
}
:alert-b {
:duration DateTime Object
:if-alert true
:continuous-times 2
:time-interval [2 6 9 15 30 60]
:times -1
}
:alert-c {
:duration DateTime Object
:if-alert true
:continuous-times 5
:time-interval [5]
:times 1
}
:alert-d {
:duration DateTime Object
:if-alert true
:continuous-times 5
:time-interval [5 15 30 60]
:times -1
}
}
But the data structure is immutable. This would be an easy problem in other languages, but after a whole afternoon I still don't know how to do it here.
So can anyone give me some suggestions? Am I using a bad data structure? Or can this problem somehow be solved in a functional way?
Although you are working with immutable datastructures, you can easily and efficiently return new datastructures that are based on the originals.
In this case, the simplest (if repetitive) solution would be:
(-> m
(update-in [:alert-a :duration] parse-duration)
(update-in [:alert-b :duration] parse-duration)
(update-in [:alert-c :duration] parse-duration)
(update-in [:alert-d :duration] parse-duration))
The important thing to realize here is that update-in does not mutate the datastructure it's working on. Instead it returns a new datastructure with the modifications applied.
The threading macro -> allows the new datastructure to be threaded through the update-in operations, so that the final returned value is the original datastructure with all of the updates applied.
The parse-duration function would probably look a bit like this:
(defn parse-duration
  "Convert duration in HH:MM-HH:MM format"
  [s]
  (let [[t1 t2] (clojure.string/split s #"-")]
    (Period. (clj-time.coerce/to-date-time t1)
             (clj-time.coerce/to-date-time t2))))
In functional programming you don't modify a collection; instead you create a new collection with the needed values replaced. Fortunately, Clojure comes with a bunch of useful functions for this. For your case update-in should work well. It takes a collection (e.g. a map), a sequence of nested keys, and a function to apply to the innermost value identified by that key sequence. For example:
> (def m {:a 1 :b 2 :c {:c1 1 :c2 2}})
#'sandbox5448/m
> m
{:a 1, :c {:c1 1, :c2 2}, :b 2}
> (update-in m [:c :c1] str)
{:a 1, :c {:c1 "1", :c2 2}, :b 2}
Note how value 1 from key sequence [:c :c1] was converted to "1".
So, converting :duration field of :alert-a to DateTime is as easy as writing:
> (update-in your-map [:alert-a :duration] string-to-date)
where string-to-date is your converter function.

pair lists to create tuples in order

I'd like to combine two lists. If I have the following two lists: {a,b,c,d} and {1,2,3,4} what do I need to do so that I get {{a,1}, {b,2}, {c,3}, {d,4}}?
Here is one way:
Transpose[{{a, b, c, d}, {1, 2, 3, 4}}]
An esoteric method is Flatten, which (from the Help Section on Flatten) also allows Transpose of a 'ragged' array.
Flatten[ {{a, b, c, d}, {1, 2, 3, 4, 5}}, {{2}, {1}}]
Out[6]= {{a, 1}, {b, 2}, {c, 3}, {d, 4}, {5}}
One possible solution is
MapThread[List,{{a,b,c,d},{1,2,3,4}}]
If you have lists with the columns of a matrix:
l = Table[Subscript[g, Sequence[j, i]], {i, 5}, {j, 5}]
Transpose will give you the rows:
Transpose@l // MatrixForm
listA={a,b,c,d};
listB={1,2,3,4};
table=Transpose@{# & @@@ listA, # & @@@ listB}
This is a great question. I had become stuck thinking there was a default way to do this with Table, but not so. The answers below are fairly intuitive, and can be easily generalized to other similar situations.
l1 = {a,b,c,d};
l2 = {1,2,3,4};
pairs = Table[{l1[[i]], l2[[i]]}, {i, 1, Length[l1]}]
MapThread does this sort of thing also. This is less elegant than Howard's MapThread solution, but also more readable in some sense. Look at MapThread docs. The function is defined inline (pure function):
pairs = MapThread[{#1, #2} &, {l1, l2}]
In case a, b, c, d are themselves lists, use the following:
MapThread[Flatten[{#1[[All]],#2}]&,{l1,l2}]//TableForm

Show duplicates in Mathematica

In Mathematica I have a list:
x = {1,2,3,3,4,5,5,6}
How will I make a list with the duplicates? Like:
{3,5}
I have been looking at Lists as Sets, if there is something like Except[] for lists, so I could do:
unique = Union[x]
duplicates = MyExcept[x,unique]
(Of course, if the x would have more than two duplicates - say, {1,2,2,2,3,4,4}, there the output would be {2,2,4}, but additional Union[] would solve this.)
But there wasn't anything like that (if I did understand all the functions there well).
So, how to do that?
Lots of ways to do list extraction like this; here's the first thing that came to my mind:
Part[Select[Tally@x, Part[#, 2] > 1 &], All, 1]
Or, more readably in pieces:
Tally@x
Select[%, Part[#, 2] > 1 &]
Part[%, All, 1]
which gives, respectively,
{{1, 1}, {2, 1}, {3, 2}, {4, 1}, {5, 2}, {6, 1}}
{{3, 2}, {5, 2}}
{3, 5}
Perhaps you can think of a more efficient (in time or code space) way :)
By the way, Tally counts elements regardless of their position, so the list does not need to be sorted first; the duplicates come out in order of first appearance.
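For readers less familiar with Tally, the count-then-filter approach above can be sketched in Python with `collections.Counter`. This is only an illustration of the technique, not a Mathematica solution, and the function name `duplicates` is made up for the example.

```python
from collections import Counter


def duplicates(items):
    """Return each element that occurs more than once, in order of first appearance."""
    counts = Counter(items)  # element -> number of occurrences, like Tally
    return [item for item, n in counts.items() if n > 1]


print(duplicates([1, 2, 3, 3, 4, 5, 5, 6]))  # [3, 5]
```

Because `Counter` keeps insertion order, the input does not have to be sorted.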
Here's a way to do it in a single pass through the list:
collectDups[l_] := Block[{i}, i[n_]:= (i[n] = n; Unevaluated@Sequence[]); i /@ l]
For example:
collectDups[{1, 1, 6, 1, 3, 4, 4, 5, 4, 4, 2, 2}] --> {1, 1, 4, 4, 4, 2}
If you want the list of unique duplicates -- {1, 4, 2} -- then wrap the above in DeleteDuplicates, which is another single pass through the list (Union is less efficient as it also sorts the result).
collectDups[l_] :=
DeleteDuplicates@Block[{i}, i[n_]:= (i[n] = n; Unevaluated@Sequence[]); i /@ l]
Will Robertson's solution is probably better just because it's more straightforward, but I think if you wanted to eke out more speed, this should win. But if you cared about that, you wouldn't be programming in Mathematica! :)
Here are several faster variations of the Tally method.
f4 uses "tricks" given by Carl Woll and Oliver Ruebenkoenig on MathGroup.
f2 = Tally@# /. {{_, 1} :> Sequence[], {a_, _} :> a} &;
f3 = Pick[#, Unitize[#2 - 1], 1] & @@ Transpose@Tally@# &;
f4 = # ~Extract~ SparseArray[Unitize[#2 - 1]]["NonzeroPositions"] & @@ Transpose@Tally@# &;
Speed comparison (f1 included for reference)
a = RandomInteger[100000, 25000];
f1 = Part[Select[Tally@#, Part[#, 2] > 1 &], All, 1] &;
First@Timing@Do[#@a, {50}] & /@ {f1, f2, f3, f4, Tally}
SameQ @@ (#@a &) /@ {f1, f2, f3, f4}
Out[]= {3.188, 1.296, 0.719, 0.375, 0.36}
Out[]= True
It is amazing to me that f4 has almost no overhead relative to a pure Tally!
Using a solution like dreeves, but only returning a single instance of each duplicated element, is a bit on the tricky side. One way of doing it is as follows:
collectDups1[l_] :=
Module[{i, j},
i[n_] := (i[n] := j[n]; Unevaluated@Sequence[]);
j[n_] := (j[n] = Unevaluated@Sequence[]; n);
i /@ l];
This doesn't precisely match the output produced by Will Robertson's (IMO superior) solution, because elements will appear in the returned list in the order that it can be determined that they're duplicates. I'm not sure if it really can be done in a single pass, all the ways I can think of involve, in effect, at least two passes, although one might only be over the duplicated elements.
Here is a version of Robertson's answer that uses 100% "postfix notation" for function calls.
identifyDuplicates[list_List, test_:SameQ] :=
list //
Tally[#, test] & //
Select[#, #[[2]] > 1 &] & //
Map[#[[1]] &, #] &
Mathematica's // is similar to the dot for method calls in other languages. For instance, if this were written in C# / LINQ style, it would resemble
list.Tally(test).Where(x => x[2] > 1).Select(x => x[1])
Note that C#'s Where is like MMA's Select, and C#'s Select is like MMA's Map.
EDIT: added optional test function argument, defaulting to SameQ.
EDIT: here is a version that addresses my comment below & reports all the equivalents in a group given a projector function that produces a value such that elements of the list are considered equivalent if the value is equal. This essentially finds equivalence classes longer than a given size:
reportDuplicateClusters[list_List, projector_: (# &),
minimumClusterSize_: 2] :=
GatherBy[list, projector] //
Select[#, Length@# >= minimumClusterSize &] &
Here is a sample that checks pairs of integers on their first elements, considering two pairs equivalent if their first elements are equal
reportDuplicateClusters[RandomInteger[10, {10, 2}], #[[1]] &]
This thread seems old, but I've had to solve this myself.
This is kind of crude, but does this do it?
Union[Select[Table[If[tt[[n]] == tt[[n + 1]], tt[[n]], ""], {n, Length[tt] - 1}], IntegerQ]]
Given a list A,
get the non-duplicate values in B
B = DeleteDuplicates[A]
get the duplicate values in C
C = Complement[A,B]
get the non-duplicate values from the duplicate list in D
D = DeleteDuplicates[C]
So for your example:
A = 1, 2, 2, 2, 3, 4, 4
B = 1, 2, 3, 4
C = 2, 2, 4
D = 2, 4
so your answer would be DeleteDuplicates[Complement[x,DeleteDuplicates[x]]] where x is your list. I don't know mathematica, so the syntax may or may not be perfect here. Just going by the docs on the page you linked to.
Another short possibility is
Last /@ Select[Gather[x], Length[#] > 1 &]