Can you recognize an infinite list in a Haskell program? [duplicate]

Possible Duplicate:
How to tell if a list is infinite?
In Haskell, you can define an infinite list, for example [1..]. Is there a built-in function in Haskell to recognize whether a list has finite length? I don't imagine it is possible to write a user-supplied function to do this, but the internal representation of lists by Haskell may be able to support it. If not in standard Haskell, is there an extension providing such a feature?

No, this is not possible. It would be impossible to write such a function, because you can have lists whose finiteness might be unknown: consider a recursive loop generating a list of all the twin primes it can find. Or, to follow up on what Daniel Pratt mentioned in the comments, you could have a list of all the steps a universal Turing machine takes during its execution, ending the list when the machine halts. Then, you could simply check whether such a list is infinite, and solve the Halting problem!
The only question an implementation could answer is whether a list is cyclic: if one of its tail pointers points back to a previous cell of the list. However, this is implementation-specific (Haskell doesn't specify anything about how implementations must represent values), impure (different ways of writing the same list would give different answers), and even dependent on things like whether the list you pass in to such a function has been evaluated yet. Even then, it still wouldn't be able to distinguish finite lists from infinite lists in the general case!
(I mention this because, in many languages (such as members of the Lisp family), cyclic lists are the only kind of infinite lists; there's no way to express something like "a list of all integers". So, in those languages, you can check whether a list is finite or not.)
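To make the "finiteness might be unknown" point concrete, here is a minimal sketch in Python, with a lazy generator standing in for a lazy Haskell list. The names is_prime and twin_primes are made up for this illustration. Whether the generator keeps producing pairs forever is exactly the open twin prime question, so no general finiteness test could answer it.

```python
from itertools import islice

def is_prime(n):
    """Trial division; slow, but enough for a small illustration."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def twin_primes():
    """Lazily yield twin prime pairs (p, p + 2).

    Whether this yields infinitely many pairs is an open problem.
    """
    p = 2
    while True:
        if is_prime(p) and is_prime(p + 2):
            yield (p, p + 2)
        p += 1

# We can always take a finite prefix, but that says nothing about the rest.
first_five = list(islice(twin_primes(), 5))
print(first_five)
```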

In any implementation I'm aware of, there is no way to test a list for finiteness other than iterating over it in search of the final []. And in general, it is impossible to tell whether a list is finite or infinite without actually going to look for the end (which of course means that whenever you do get an answer, the answer is "finite").
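As a rough sketch of what "going to look for the end" amounts to, the following Python function (finite_within is a hypothetical name, and Python iterables stand in for Haskell lists) searches for the end but gives up after a caller-chosen limit. It can only ever confirm finiteness; giving up means "unknown", not "infinite".

```python
from itertools import count, islice

def finite_within(iterable, limit):
    """Return the length if the iterable ends within `limit` items, else None.

    None means "no end found yet", which is not the same as "infinite".
    """
    n = 0
    # Look at one item more than the limit so we can tell "exactly limit
    # items" apart from "more than limit items".
    for _ in islice(iterable, limit + 1):
        n += 1
    return n if n <= limit else None

print(finite_within([1, 2, 3], 10))   # end found
print(finite_within(count(1), 1000))  # gave up, answer unknown
```

Without the limit, the same search on an infinite input would simply never return.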

You could write a wrapper type around lists which keeps track of whether the list is infinite, and limit yourself to "decidable" operations only (somewhat similar to NonEmpty, which avoids empty lists):
import Control.Applicative
data List a = List (Maybe Int) [a]
infiniteList (List Nothing _) = True
infiniteList _ = False
emptyList = List (Just 0) []
singletonList x = List (Just 1) [x]
cycleList xs = List Nothing (cycle xs)
numbersFromList n = List Nothing [n..]
appendList (List sx xs) (List sy ys) = List ((+) <$> sx <*> sy) (xs ++ ys)
tailList (List s xs) = List (fmap pred s) (tail xs)
...

As ehird wrote, your only hope is in finding out whether a list is cyclic. A way of doing so is to use an extension to Haskell called "observable sharing". See for instance: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.4053

When talking about the "internal representation of lists": from the standpoint of a Haskell implementation, there are no infinite lists. The "list" you ask about is actually a description of a computational process, not a data object. No data object is infinite inside a computer. Such a thing simply does not exist.
As others have told you, the internal list data might be cyclical, and an implementation would usually be able to detect this, since it has a concept of pointer equality. But Haskell itself has no such concept.
Here's a Common Lisp function to detect the cyclicity of a list. cdr advances along a list by one notch, and cddr - by two. eq is a pointer equality predicate.
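(defun is-cyclical (p)
  ;; Tortoise and hare: p advances one cell per step, q advances two.
  ;; The helper is named walk because GO is a standard CL special operator
  ;; and must not be rebound as a local function.
  (labels ((walk (p q)
             (if (not (null q))
                 (if (eq p q) t
                     (walk (cdr p) (cddr q))))))
    (walk p (cdr p))))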

Related

How can I make equivalence function in Common Lisp?

I know that it is possible to check whether two lists have the same elements in the same order by using the EQUAL function in Common Lisp.
(equal '(a b c) '(a b c)) => T
(equal '(a b c) '(b c a)) => NIL
(equal '(a b c) '(d e f)) => NIL
But, as you can see, EQUAL cannot tell that two lists represent the same set when their elements are arranged in different orders.
I guess it may be possible to write a predicate that tests whether two sets contain the same elements, even if they are arranged in different orders, by using the 'remove' function and recursion. But I can't turn my idea into a concrete function.
How can I realize the idea?
A solution that does exactly this can be found here. The OP of that question has a working solution, and the accepted answer is a better solution.
What I will do is try to explain the logic.
Let's step through this problem first. You're given 2 lists, list1 and list2.
1. If list1 is null, then return true if list2 is also null. (You don't need to check the opposite of this; that gets taken care of in step 2.)
2. Else (list1 is not null/empty): we have to somehow, as you suggested, remove an element that is in both sets. This can be achieved by letting list3 be list2 with the first item of list1 removed.
i. If nothing is removed, i.e., list2 and list3 are equal (you can use normal equal here), then the function returns false, because it has found an element in list1 that is not in list2.
ii. If something is removed, then call our function again on the rest of list1 and list3.
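The steps above can be sketched as follows. This is an illustration in Python rather than Common Lisp, and the names remove_first and same_elements are made up for it. It assumes that "removed" means removing only the first matching occurrence, so duplicates count; CL's remove would instead drop every occurrence.

```python
def remove_first(item, items):
    """Return a copy of items without the first occurrence of item."""
    result = list(items)
    if item in result:
        result.remove(item)
    return result

def same_elements(list1, list2):
    # Step 1: an empty list1 matches only an empty list2.
    if not list1:
        return not list2
    # Step 2: try to remove the first item of list1 from list2.
    list3 = remove_first(list1[0], list2)
    if len(list3) == len(list2):
        # Step 2.i: nothing was removed, so list1 has an element list2 lacks.
        return False
    # Step 2.ii: recurse on the rest of list1 and the shortened list2.
    return same_elements(list1[1:], list3)

print(same_elements(["a", "b", "c"], ["b", "c", "a"]))
print(same_elements(["a", "b", "c"], ["d", "e", "f"]))
```

Because only one occurrence is removed per step, this sketch compares the lists as multisets: ["a", "a", "b"] and ["a", "b", "b"] are reported as different.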
If you want to use built-in CL facilities, you can use set-exclusive-or like this:
(defun sets-equivalent (set-a set-b)
(not (set-exclusive-or set-a set-b)))
Per CLHS:
set-exclusive-or returns a list of elements that appear in exactly one of list-1 and list-2.
So if the returned list is empty, the sets are equivalent.

Remove a specific item in a list?

I want to preface this by saying that yes, this is a homework problem I'm working on and I don't want the actual answer, just maybe a nudge in the right direction. Anyhoo, I'm taking a class on programming languages' structures, and one of our projects is to write a variety of small programs in lisp. This one requires the user to input a list and an atom, then remove all instances of the atom from the list. I've scoured the internet and haven't found all that many good lisp resources, so I'm turning to you all.
Anyways, our professor has given us very little by way of stuff to work off of, and by very little I mean practically nothing.
This is what I have so far, and it doesn't work.
(defun removeIt (a lis)
  (if (null lis) 0
      (if (= a (car lis))
          (delete (car lis))
          (removeIt (cdr lis)))))
And when I type
(removeIt 'u '(u e u e))
as the input, it gives me an error stating it got 1 argument when it wanted 2. What errors am I making?
First, a few cosmetic changes:
(defun remove-it (it list)
  (if (null list) 0
      (if (= it (car list))
          (delete (car list))
          (remove-it (cdr list)))))
Descriptive and natural sounding identifier names are preferred in the CL community. Don't be shy to use names like list – CL has multiple namespaces, so you don't have to worry about clashes too much. Use hyphens instead of camel case or underscores. Also, read a short style guide.
You said you didn't want the answer but helpful tips, so here we go:
Check your base case – your result will be a list, so why do you return a number?
Use the appropriate comparison function – = is for numbers only.
You are building a new result list, so no need to delete anything – just don't add to it what you don't want.
But remember to add what you want – build your result list by consing what you want to keep to the result of applying your function to the rest of the list.
If you don't want to keep an element, just go on applying your function to the rest of the list.
You defined your function to take two arguments, but you're calling it with (cdr list) only. Provide the missing argument.
"I've scoured the internet and haven't found all that many good lisp resources,"
Oh, come on.
Anyhow, I recommend Touretzky.
By the way, the function you're trying to implement is built-in, but your professor probably won't accept it as a solution, and doing it yourself is a good exercise. (For extra credit, try solving it for nested lists.)
This is a good case for a recursive function. Suppose there exists already a function called my-remove which takes an atom and a list as arguments and returns the list without the given atom. So (my-remove 'Y '(X Y Z)) => '(X Z)
Now, how would you use this function when instead of the list '(X Y Z) you have another list which is (A X Y Z), i.e. with an element A in front?
You would compare A to your atom and then, depending on whether the element A matches your atom, you would add this element A or not to the result of applying remove to the rest of the list.
With this recursion the function my-remove will be called successively with shorter lists. Now you only have to think about the base case, i.e. what does the function my-remove have to return when the list is empty.
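The shape of that recursion can be sketched as follows. This is written in Python rather than Lisp, so it illustrates the idea and is not a drop-in solution to the exercise; my_remove mirrors the hypothetical function name used above.

```python
def my_remove(atom, items):
    # Base case: removing anything from an empty list gives an empty list.
    if not items:
        return []
    head, rest = items[0], items[1:]
    if head == atom:
        # Skip the matching element and keep processing the rest.
        return my_remove(atom, rest)
    # Keep the element by putting it in front of the processed rest.
    return [head] + my_remove(atom, rest)

print(my_remove("Y", ["X", "Y", "Z"]))
```

Each recursive call receives a shorter list, so the base case is eventually reached.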
This is an answer for other people looking specifically for elisp. A built-in function called delq exists for this purpose.
Example
(setq my-list '(0 40 80 40 90)) ;; test list
(delq 40 my-list) ;; (0 80 90)
If you installed Emacs from source, you can see how it is implemented by running M-x find-function RET delq.

Pack consecutive duplicates of list elements into sublists in Ocaml

I found this problem in the website 99 problems in ocaml. After some thinking I solved it by breaking the problem into a few smaller subproblems. Here is my code:
let rec frequency x l =
  match l with
  | [] -> 0
  | h :: t -> if x = [h] then 1 + (frequency x t)
              else frequency x t
;;
let rec expand x n =
  match n with
  | 0 -> []
  | 1 -> x
  | _ -> (expand x (n-1)) @ x
;;
let rec deduct a b =
  match b with
  | [] -> []
  | h :: t -> if a = [h] then (deduct a t)
              else [h] @ (deduct a t)
;;
let rec pack l =
  match l with
  | [] -> []
  | h :: t -> [(expand [h] (frequency [h] l))] @ (pack (deduct [h] t))
;;
It is rather clear that this implementation is overkill, as I have to count the frequency of every element in the list, expand this and remove the identical elements from the list, then repeat the procedure. The algorithm complexity is about O(N*(N+N+N))=O(N^2) and would not work with large lists, even though it achieved the required purpose. I tried to read the official solution on the website, which says:
# let pack list =
    let rec aux current acc = function
      | [] -> []    (* Can only be reached if original list is empty *)
      | [x] -> (x :: current) :: acc
      | a :: (b :: _ as t) ->
         if a = b then aux (a :: current) acc t
         else aux [] ((a :: current) :: acc) t in
    List.rev (aux [] [] list);;
val pack : 'a list -> 'a list list = <fun>
This code should be better, as it is more concise and does the same thing. But I am confused by the use of "aux current acc" inside it. It seems to me that the author has created a new function inside the "pack" function and, after some elaborate procedure, was able to get the desired result using List.rev, which reverses the list. What I do not understand is:
1) What is the point of using this, which makes the code very hard to read at first sight?
2) What is the benefit of using an accumulator and an auxiliary function, which takes 3 inputs, inside another function? Did the author implicitly use tail recursion or something?
3) Is there any way to modify the program so that it can pack all duplicates like my program?
These are questions mostly of opinion rather than fact.
1) Your code is far harder to understand, in my opinion.
2a) It's very common to use auxiliary functions in OCaml and other functional languages. You should think of it more like nested curly braces in a C-like language rather than as something strange.
2b) Yes, the code is using tail recursion, which yours doesn't. You might try giving your code a list of (say) 200,000 distinct elements. Then try the same with the official solution. You might try determining the longest list of distinct values your code can handle, then try timing the two different implementations for that length.
2c) In order to write a tail-recursive function, it's sometimes necessary to reverse the result at the end. This just adds a linear cost, which is often not enough to notice.
3) I suspect your code doesn't solve the problem as given. If you're only supposed to compress adjacent elements, your code doesn't do this. If you wanted to do what your code does with the official solution you could sort the list beforehand. Or you could use a map or hashtable to keep counts.
Generally speaking, the official solution is far better than yours in many ways. Again, you're asking for an opinion and this is mine.
Update
The official solution uses an auxiliary function named aux that takes three parameters: the currently accumulated sublist (some number of repetitions of the same value), the currently accumulated result (in reverse order), and the remaining input to be processed.
The invariant is that all the values in the first parameter (named current) are the same as the head value of the unprocessed list. Initially this is true because current is empty.
The function looks at the first two elements of the unprocessed list. If they're the same, it adds the first of them to the beginning of current and continues with the tail of the list (all but the first). If they're different, it wants to start accumulating a different value in current. It does this by adding current (with the one extra value added to the front) to the accumulated result, then continuing to process the tail with an empty value for current. Note that both of these maintain the invariant.
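The same bookkeeping can be sketched as a loop. This is an illustrative Python version, not the official solution: it keeps the names current and acc from the description above, but it appends to the end of acc instead of prepending, so no final reversal is needed.

```python
def pack(items):
    """Group consecutive equal elements into sublists."""
    current = []  # run of equal values being accumulated
    acc = []      # finished runs, in order
    for i, a in enumerate(items):
        current.append(a)
        is_last = i == len(items) - 1
        # Close the run when the next element differs or the input ends.
        if is_last or items[i + 1] != a:
            acc.append(current)
            current = []
    return acc

print(pack(list("aaaabccaadeeee")))
# Sorting first groups all duplicates, not just adjacent ones.
print(pack(sorted(["a", "b", "a"])))
```

Like the tail-recursive OCaml version, this runs in linear time and does not grow the call stack.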

foldl vs foldr: which should I prefer?

I remember that when I showed some code that I wrote to my professor he remarked, offhand, that
It rarely matters, but it's worth noting that fold* is a little bit more efficient than fold*' in SML/NJ, so you should prefer it over fold* when possible.
I forget whether fold* was foldr or foldl. I know that this is one of those micro-optimization things that probably doesn't make a big difference in practice, but I'd like to be in the habit of using the more efficient one when I have the choice.
Which is which? My guess is that this is SML/NJ specific and that MLton will be smart enough to optimize both down to the same machine code, but answers for other compilers are good to know.
foldl is tail-recursive, while foldr is not, although you can do foldr in a tail-recursive way by reversing the list (which is itself tail-recursive) and then doing foldl.
This is only going to matter if you are folding over huge lists.
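The "reverse, then fold from the left" trick can be sketched as follows. This is an illustration in Python, with functools.reduce standing in for a tail-recursive left fold. The helpers foldl and foldr are defined here to mimic SML's argument order, where the folded function receives (element, accumulator); they are not library functions.

```python
from functools import reduce

def foldl(f, init, xs):
    """Left fold, SML style: f is called as f(element, accumulator)."""
    return reduce(lambda acc, x: f(x, acc), xs, init)

def foldr(f, init, xs):
    """Right fold expressed as a left fold over the reversed list."""
    return foldl(f, init, list(reversed(xs)))

def cons(x, xs):
    """Put x in front of xs, like SML's x :: xs."""
    return [x] + xs

print(foldl(cons, [], [1, 2, 3]))
print(foldr(cons, [], [1, 2, 3]))
```

Both folds visit every element once; the right fold pays for one extra pass to reverse the input.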
Prefer the one that converts the given input into the intended output.
If both produce the same output such as with a sum, and if dealing with a list, folding from the left will be more efficient because the fold can begin with head element, while folding from the right will first require walking the list to find the last element before calculating the first intermediate result.
With arrays and similar random access data structures, there's probably not going to be much difference.
A compiler optimization that always chose the better of left and right would require the compiler to determine that left and right were equivalent over all possible inputs. Since foldl and foldr take functions as arguments, this is a bit of a tall order.
I'm going to keep the accepted answer here, but I had the chance to speak to my professor, and his reply was actually the opposite, because I forgot a part of my question. The code in question was building up a list, and he said:
Prefer foldr over foldl when possible, because it saves you a reverse at the end in cases where you're building up a list by appending elements during the fold.
As in, for a trivial example:
- val ls = [1, 2, 3];
val ls = [1,2,3] : int list
- val acc = (fn (x, xs) => x::xs);
val acc = fn : 'a * 'a list -> 'a list
- foldl acc [] ls;
val it = [3,2,1] : int list
- foldr acc [] ls;
val it = [1,2,3] : int list
Saving the O(n) reverse is probably more important than the other differences between foldl and foldr mentioned in answers to this question.

Scheme and Clojure don't have the atom type predicate - is this by design?

Common LISP and Emacs LISP have the atom type predicate. Scheme and Clojure don't have it. http://hyperpolyglot.wikidot.com/lisp
Is there a design reason for this - or is it just not an essential function to include in the API?
In Clojure, the atom predicate isn't so important because Clojure emphasizes various other types of (immutable) data structures rather than focusing on cons cells / lists.
It could also cause confusion. How would you expect this function to behave when given a hashmap, a set or a vector for example? Or a Java object that represents some complex mutable data structure?
Also the name "atom" is used for something completely different - it's one of Clojure's core concurrency mechanisms to manage shared, synchronous, independent state.
Clojure has the coll? (collection?) function, which is (sort of) the inverse of atom?.
In the book The Little Schemer, atom? is defined as follows:
(define (atom? x)
(and (not (pair? x))
(not (null? x))))
Note that under this definition null is not considered an atom, unlike in the definitions other answers have suggested. In the mentioned book atom? is used heavily, in particular when writing procedures that deal with lists of lists.
In the entire IronScheme standard libraries which implement R6RS, I never needed such a function.
In summary:
It is useless
It is easy enough to write if you need it
Which pretty much follows Scheme's minimalistic approach.
In Scheme anything that is not a pair is an atom. As Scheme already defines the predicate pair?, the atom? predicate is not needed, as it is so trivial to define:
(define (atom? s)
(not (pair? s)))
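The two definitions seen so far differ only on the empty list. A small sketch makes that visible. It is written in Python, with a made-up Pair class and None standing in for Scheme's pairs and empty list, so it models the idea and is not Scheme itself.

```python
class Pair:
    """Stand-in for a Scheme cons cell (hypothetical, for illustration)."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

NULL = None  # stands in for Scheme's empty list

def atom_not_pair(x):
    """'Anything that is not a pair is an atom.'"""
    return not isinstance(x, Pair)

def atom_little_schemer(x):
    """The Little Schemer's version: neither a pair nor the empty list."""
    return not isinstance(x, Pair) and x is not NULL

for value in ["a", 42, Pair(1, NULL), NULL]:
    print(atom_not_pair(value), atom_little_schemer(value))
```

Only the last row disagrees, which is why each answer has to say whether it counts the empty list as an atom.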
It's a trivial function:
(defun atom (x)
(not (consp x)))
It is used in list processing, when the Lisp dialect uses conses to build lists. There are some 'Lisps' for which this is not the case or not central.
An atom is either a symbol, a character, a number, or null.
(define (atom? a)
(or (symbol? a)
(char? a)
(number? a)
(null? a)))
I think those are all the atoms that exist; if you find more, add them to the conditional expression. For example, if you think a string is an atom, add (string? a). :-) The absence of a definition for atom allows you to define it the way you want. After all, Scheme does not know what an atom is.
In Lisp nil is an atom, so I've made null an atom. nil is also a list by simplification, nil = (nil . nil), the same way the integers are rational numbers by simplification: 2 = 2/1, where 2 is an integer and 2/1 is a rational number; since both are equal once the rational is simplified, one says the integer 2 is also a rational number. But the list predicate is already defined in Scheme, so there is nothing to worry about.
About the question: as far as I can tell, Scheme has predicates only for class types. atom is not a class type; it is an abstraction that incorporates several class types. Maybe that is the reason. Then again, pair is arguably not a class type either, although it does not incorporate several class types, and some may consider pair a class type.
Atom means that a certain thing is not a compound thing. One reason not to include such a predicate is that, if the language allows you to define new atomic types, the plethora of atoms can grow wider and wider, and such a predicate would make no sense. I don't know if Scheme allows for this. I can only say that the built-in Scheme predicates are all specific. You can ask "is this an apple?" or "is this an orange?", but you cannot ask "is this a fruit?" :-). Well, you can, if you do it yourself. Despite what I said, Scheme has a general predicate number? as well as the number-specific predicates integer?, rational?, and real?; nevertheless, number can be thought of as a class type (the other predicates refer to sub-types of number), whereas atom is not (at least in Scheme).
Note:
class types: types that belong to a certain class of things. Example:
number, integer, real, rational, character, procedure, list, vector, string, etc.