Haskell function sort input list then do stuff with sorted list - list

Short version:
I want to sort a list, and then perform operations on that sorted list that filter/extract data to form a new list, all in one function.
Long version:
I am teaching myself Haskell using these lessons. I'm currently on Homework 2 Exercise 5.
I am required to write a function whatWentWrong that takes an unsorted list of LogMessages and returns a list of Strings. The strings are the String portion of LogMessages that were constructed with Error in which the Error code is > 50. They are supposed to be sorted by the TimeStamp portion of LogMessage.
I have a function written for whatWentWrong that works, but it's really really slow (you'll see why).
whatWentWrong :: [LogMessage] -> [String]
whatWentWrong [] = []
whatWentWrong ys@((LogMessage (Error code) _ msg):xs)
    | ys /= inOrder (build ys)
        = whatWentWrong (inOrder (build ys))
    | code > 50
        = [msg] ++ whatWentWrong xs
    | otherwise
        = whatWentWrong xs
whatWentWrong (_:xs) = [] ++ whatWentWrong xs
The functions inOrder (build x) will return a sorted version of x (where x is a list of LogMessages). Obviously I have to either sort the list before I begin processing it with whatWentWrong, or I have to filter out all non relevant messages (Messages which are not Errors or which don't have Error codes above 50), sort, and then grab the strings from each one.
If I wasn't following this example, I would just define another function or something, or just send whatWentWrong an already sorted list. But I imagine there's some reason to do it this way (which I can't figure out).
Anyways, what I've done, and why the program is so slow is this:
The line ys /= inOrder (build ys) is checking that the LogMessage list is sorted every single time it encounters a LogMessage that matches the Error pattern, even though, after the first time that check fails, the list is sorted for good.
That's the only way I could think to do it. Really, what I want to do is sort it once, but I have no idea how to make the function sort the list using my sorting functions and then never do that step again. I'm obviously not thinking about this correctly, and any help is appreciated. Thanks.

You really just need a one-line list comprehension:
whatWentWrong xs = [ msg | (LogMessage (Error code) _ msg) <- inOrder (build xs), code > 50]
If you are sorting the list to see if the list is sorted, you may as well just work directly on the sorted list. Once you've done that, the list comprehension will automatically filter out the elements that don't pattern match, and the code > 50 filters the rest.
If you want to fix your current code as an exercise, you just need to define a helper function that assumes its input is sorted.
whatWentWrong :: [LogMessage] -> [String]
whatWentWrong ys = www (inOrder (build ys))
    where www [] = []
          www ((LogMessage (Error code) _ msg):xs) | code > 50 = msg : www xs
                                                   | otherwise = www xs
          www (_:xs) = www xs
However, you should recognize that www is the combination of a map and a filter.
whatWentWrong ys = map f $ filter p (inOrder (build ys))
    where p (LogMessage (Error code) _ _) = code > 50
          p _ = False
          f (LogMessage _ _ msg) = msg
or, in point-free style
whatWentWrong = map f . filter p . inOrder . build
    where p (LogMessage (Error code) _ _) = code > 50
          p _ = False
          f (LogMessage _ _ msg) = msg
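For anyone who wants to try the comprehension without the course files, here is a minimal self-contained sketch. The LogMessage, MessageType and TimeStamp definitions are assumptions reconstructed from the patterns used above, and sortByTime is a hypothetical stand-in for inOrder . build (it simply drops messages without a timestamp and sorts the rest with sortOn), so the course's own versions may well differ.

```haskell
import Data.List (sortOn)

-- Assumed types, reconstructed from the patterns used in this thread.
data MessageType = Info | Warning | Error Int
  deriving (Show, Eq)

type TimeStamp = Int

data LogMessage
  = LogMessage MessageType TimeStamp String
  | Unknown String
  deriving (Show, Eq)

-- Hypothetical stand-in for `inOrder . build`:
-- keep only timestamped messages and sort them by timestamp.
sortByTime :: [LogMessage] -> [LogMessage]
sortByTime msgs = map snd (sortOn fst [(t, m) | m@(LogMessage _ t _) <- msgs])

-- Sort once, then let the comprehension filter and extract in a single pass.
whatWentWrong :: [LogMessage] -> [String]
whatWentWrong xs =
  [msg | LogMessage (Error code) _ msg <- sortByTime xs, code > 50]
```

The sort happens exactly once because it sits in the generator of the comprehension rather than inside a recursive call.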

Related

F# a function as an argument in a match function

I have made function that takes a list and a list of lists and returns a new list of lists.
let rec calculator list SS =
    match (List.item(0) SS) with
    |[] -> []
    |_ -> match (validate list (List.item(0) SS)) with
          |(validate theCode list) -> List.append [(List.item(0) SS)] (calculator list (SS.[1..]))
          |_ -> (calculator list (SS.[1..]))
validate is a function that returns a tuple of two ints, for example (1,1).
list is a list of four ints.
SS is a list of lists of four ints.
theCode is a list of four ints.
I get the error "The pattern discriminator 'validate' is not defined."
Perhaps this is a silly question, but nonetheless I don't know the answer to it.
Is it not allowed to use a function call as a pattern in a match expression, or is something entirely different going on here?
To the best of my knowledge, both validate calls return a tuple of two ints, so it should be possible to match on that.
If your question is how to get this to compile then you only need a small change – a function call is not itself a pattern, so you need to bind to a value and use a when guard:
let rec calculator list SS =
    match (List.item(0) SS) with
    | [] -> []
    | _ ->
        match (validate list (List.item(0) SS)) with
        //  vvvvvvvvvv
        | x when x = (validate theCode list) ->
            List.append [(List.item(0) SS)] (calculator list (SS.[1..]))
        | _ -> (calculator list (SS.[1..]))
However, if your question is indeed "what is the preferred method", then while that's too subjective for this site (IMO), I'll submit this as an option that I consider ideally readable for this logic:
let rec calculator list (h::t) =
    if List.isEmpty h then []
    elif validate list h = validate theCode list then h::(calculator list t)
    else calculator list t
(This assumes that SS is an F# list and not a System.Collections.Generic.List.)
This is not actually an answer to the question of how to implement the when guard, since @ildjarn answered that for you.
I think you'd actually be better served by a library function. What you're trying to do appears to be to filter out elements which don't pass validation, but also to stop on the first empty element. If you can guarantee that you definitely want to loop through every element of SS, you could simply do
let calculator list = List.filter (fun s -> validate list s = validate theCode list)
If you must stop at the first empty element, you could define a function that cuts the list at the first empty element, something like
let upToElement element list =
    let rec loop acc = function
        | [] -> List.rev acc
        | h :: t when h = element -> List.rev acc
        | h :: t -> loop (h :: acc) t
    loop [] list
then you can do
let calculator list =
    upToElement [] >> List.filter (fun s -> validate list s = validate theCode list)

How can I find the index where one list appears as a sublist of another?

I have been working with Haskell for a little over a week now so I am practicing some functions that might be useful for something. I want to compare two lists recursively. When the first list appears in the second list, I simply want to return the index at where the list starts to match. The index would begin at 0. Here is an example of what I want to execute for clarification:
subList [1,2,3] [4,4,1,2,3,5,6]
the result should be 2
I have attempted to code it:
subList :: [a] -> [a] -> a
subList [] = []
subList (x:xs) = x + 1 (subList xs)
subList xs = [ y:zs | (y,ys) <- select xs, zs <- subList ys]
    where select [] = []
          select (x:xs) = x
I am receiving an "error on input" and I cannot figure out why my syntax is not working. Any suggestions?
Let's first look at the function signature. You want to take in two lists whose contents can be compared for equality and return an index like so
subList :: Eq a => [a] -> [a] -> Int
So now we go through pattern matching on the arguments. First off, when the second list is empty then there is nothing we can do, so we'll return -1 as an error condition
subList _ [] = -1
Then we look at the recursive step
subList as xxs@(x:xs)
    | all (uncurry (==)) $ zip as xxs = 0
    | otherwise = 1 + subList as xs
You should be familiar with the guard syntax I've used, although you may not be familiar with the @ syntax (an as-pattern). It binds xxs to the whole list while (x:xs) still matches its head and tail, so xxs is simply a name for (x:xs).
You may not be familiar with all, uncurry, and possibly zip, so let me elaborate on those. zip has the function signature zip :: [a] -> [b] -> [(a,b)], so it takes two lists and pairs up their elements (and if one list is longer than the other, it just chops off the excess). uncurry is weird, so let's just look at (uncurry (==)): its signature is (uncurry (==)) :: Eq a => (a, a) -> Bool, and it checks whether the first and second elements of a pair are equal. Finally, all walks over the list of pairs and returns True if every pair passes that check.
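One thing to watch for with the zip-based check: because zip stops at the shorter list, a pattern that only partially fits at the very end of the list still compares as equal, and when nothing matches, the -1 sentinel has 1 added to it at every step on the way back. A sketch that sidesteps both issues uses isPrefixOf, tails and findIndex from Data.List and returns a Maybe Int instead of a sentinel. The name subListIndex is made up for this example.

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- Index of the first position where `needle` occurs in `haystack`,
-- or Nothing if it never occurs.
subListIndex :: Eq a => [a] -> [a] -> Maybe Int
subListIndex needle haystack =
  -- `tails` yields every suffix of the haystack, in order of start index;
  -- the first suffix that begins with the needle gives the answer.
  findIndex (needle `isPrefixOf`) (tails haystack)
```

With this version, subListIndex [1,2,3] [4,4,1,2,3,5,6] is Just 2, while subListIndex [1,2,3] [1,2] is Nothing.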

Ocaml list of ints to list of int lists (Opposite of flattening)

With a list of integers such as:
[1;2;3;4;5;6;7;8;9]
How can I create a list of list of ints from the above, with all new lists the same specified length?
For example, I need to go from:
[1;2;3;4;5;6;7;8;9] to [[1;2;3];[4;5;6];[7;8;9]]
with the number to split being 3?
Thanks for your time.
So what you actually want is a function of type
val split : int list -> int -> int list list
that takes a list of integers and a sub-list-size. How about one that is even more general?
val split : 'a list -> int -> 'a list list
Here comes the implementation:
let split xs size =
  let (_, r, rs) =
    (* fold over the list, keeping track of how many elements are still
       missing in the current list (csize), the current list (ys) and
       the result list (zss) *)
    List.fold_left (fun (csize, ys, zss) elt ->
      (* if target size is 0, add the current list to the target list and
         start a new empty current list of target-size size *)
      if csize = 0 then (size - 1, [elt], zss @ [ys])
      (* otherwise decrement the target size and append the current element
         elt to the current list ys *)
      else (csize - 1, ys @ [elt], zss))
      (* start the accumulator with target-size=size, an empty current list and
         an empty target-list *)
      (size, [], []) xs
  in
  (* add the "left-overs" to the back of the target-list *)
  rs @ [r]
Please let me know if you get extra points for this! ;)
The code you give is a way to remove a given number of elements from the front of a list. One way to proceed might be to leave this function as it is (maybe clean it up a little) and use an outer function to process the whole list. For this to work easily, your function might also want to return the remainder of the list (so the outer function can easily tell what still needs to be segmented).
It seems, though, that you want to solve the problem with a single function. If so, the main thing I see that's missing is an accumulator for the pieces you've already snipped off. And you also can't quit when you reach your count, you have to remember the piece you just snipped off, and then process the rest of the list the same way.
If I were solving this myself, I'd try to generalize the problem so that the recursive call could help out in all cases. Something that might work is to allow the first piece to be shorter than the rest. That way you can write it as a single function, with no accumulators (just recursive calls).
I would probably do it this way:
let split lst n =
  let rec parti n acc xs =
    match xs with
    | [] -> (List.rev acc, [])
    | _::_ when n = 0 -> (List.rev acc, xs)
    | x::xs -> parti (pred n) (x::acc) xs
  in let rec concat acc = function
    | [] -> List.rev acc
    | xs -> let (part, rest) = parti n [] xs in concat (part::acc) rest
  in concat [] lst
Note that we are being lenient if n doesn't divide List.length lst evenly.
Example:
split [1;2;3;4;5] 2 gives [[1;2];[3;4];[5]]
Final note: the code is very verbose because the OCaml standard lib is very bare bones :/ With a different lib I'm sure this could be made much more concise.
let rec split n xs =
  let rec take k xs ys = match k, xs with
    | 0, _ -> List.rev ys :: split n xs
    (* ys is accumulated in reverse, so the leftover piece must be reversed too *)
    | _, [] -> if ys = [] then [] else [List.rev ys]
    | _, x::xs' -> take (k - 1) xs' (x::ys)
  in take n xs []
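The "snip off a piece, then process the rest the same way" recursion described above becomes very short when the standard library has a function that returns both the piece and the remainder. As a rough sketch of that idea in Haskell rather than OCaml, using splitAt from the Prelude (splitEvery is a made-up name, and rejecting non-positive sizes is my own choice, not something the question specifies):

```haskell
-- Split a list into consecutive pieces of length n; the last piece may be shorter.
splitEvery :: Int -> [a] -> [[a]]
splitEvery n xs
  | n <= 0    = error "splitEvery: size must be positive"
  | null xs   = []
  | otherwise = piece : splitEvery n rest
  where
    -- splitAt returns the first n elements together with the remainder
    (piece, rest) = splitAt n xs
```

Here splitEvery 3 [1..9] gives [[1,2,3],[4,5,6],[7,8,9]], and splitEvery 2 [1..5] gives [[1,2],[3,4],[5]].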

Comparing list length with arrows

Inspired by Comparing list length
If I want to find the longest list in a list of lists, the simplest way is probably:
longestList :: [[a]] -> [a]
longestList = maximumBy (comparing length)
A more efficient way would be to precompute the lengths:
longest :: [[a]] -> [a]
longest xss = snd $ maximumBy (comparing fst) [(length xs, xs) | xs <- xss]
Now, I want to take it one step further. It may not be more efficient for normal cases, but can you solve this using arrows? My idea is basically, step through all of the lists simultaneously, and keep stepping until you've overstepped the length of every list except the longest.
longest [[1],[1],[1..2^1000],[1],[1]]
In the foregoing (very contrived) example, you would only have to take two steps through each list in order to determine that the list [1..2^1000] is the longest, without ever needing to determine the entire length of said list. Am I right that this can be done with arrows? If so, then how? If not, then why not, and how could this approach be implemented?
OK, as I was writing the question, a simple way to implement this dawned on me (without arrows, boo!)
longest [] = error "it's ambiguous"
longest [xs] = xs
longest xss = longest . filter (not . null) . map (drop 1) $ xss
Except this has a problem...it drops the first part of the list and doesn't recover it!
> take 3 $ longest [[1],[1],[1..2^1000],[1]]
[2,3,4]
Needs more bookkeeping :P
longest xs = longest' $ map (\x -> (x,x)) xs
longest' [] = error "it's ambiguous"
longest' [xs] = fst xs
longest' xss = longest' . filter (not . null . snd) . map (sndMap (drop 1)) $ xss
sndMap f (x,y) = (x, f y)
Now it works.
> take 3 $ longest [[1],[1],[1..2^1000],[1]]
[1,2,3]
But no arrows. :( If it can be done with arrows, then hopefully this answer can give you someplace to start.
Thinking about this some more, there is a far simpler solution which gives the same performance characteristics. We can just use maximumBy with a lazy length comparison function:
compareLength [] [] = EQ
compareLength _ [] = GT
compareLength [] _ = LT
compareLength (_:xs) (_:ys) = compareLength xs ys
longest = maximumBy compareLength
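Put together as a complete module, the lazy comparison might look like the sketch below. The type signatures and the import are my additions; the definitions are the ones given above. Note that an empty outer list still makes maximumBy fail, just as in the original one-liner.

```haskell
import Data.List (maximumBy)

-- Compare two lists by length, walking both in lockstep and stopping
-- as soon as the shorter one runs out.
compareLength :: [a] -> [b] -> Ordering
compareLength [] [] = EQ
compareLength _ [] = GT
compareLength [] _ = LT
compareLength (_:xs) (_:ys) = compareLength xs ys

-- The longest list; only as much of each list is forced as the
-- comparisons need.
longest :: [[a]] -> [a]
longest = maximumBy compareLength
```

Because each comparison stops at the end of the shorter list, take 3 (longest [[1],[1],[1..],[1]]) returns [1,2,3] even though one of the lists is infinite.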
Here's the most straightforward implementation I could think of. No arrows involved, though.
I keep a list of pairs where the first element is the original list, and the second is the remaining tail. If we only have one list left, we're done. Otherwise we try taking the tail of all the remaining lists, filtering out those that are empty. If some still remain, keep going. Otherwise, they are all the same length and we arbitrarily pick the first one.
longest [] = error "longest: empty list"
longest xss = go [(xs, xs) | xs <- xss]
  where go [(xs, _)] = xs
        go xss | null xss' = fst . head $ xss
               | otherwise = go xss'
          where xss' = [(xs, ys) | (xs, (_:ys)) <- xss]

Combine Lists with Same Heads in a 2D List (OCaml)

I'm working with a list of lists in OCaml, and I'm trying to write a function that combines all of the lists that share the same head. This is what I have so far, and I make use of the List.hd built-in function, but not surprisingly, I'm getting the failure "hd" error:
let rec combineSameHead list nlist = match list with
  | [] -> []@nlist
  | h::t -> if List.hd h = List.hd (List.hd t)
            then combineSameHead t nlist@uniq(h@(List.hd t))
            else combineSameHead t nlist@h;;
So for example, if I have this list:
[[Sentence; Quiet]; [Sentence; Grunt]; [Sentence; Shout]]
I want to combine it into:
[[Sentence; Quiet; Grunt; Shout]]
The function uniq I wrote just removes all duplicates within a list. Please let me know how I would go about completing this. Thanks in advance!
For one thing, I generally avoid functions like List.hd, as pattern matching is usually clearer and less error-prone. In this case, your if can be replaced with guarded patterns (a when clause after the pattern). I think what is happening to cause your error is that your code fails when t is []; guarded patterns help avoid this by making the cases more explicit. So, you can do (x::xs)::(y::ys)::t when x = y as a clause in your match expression to check that the heads of the first two elements of the list are the same. It's not uncommon in OCaml to have several successive patterns which are identical except for guards.
Further things: you don't need []@nlist - it's the same as just writing nlist.
Also, it looks like your nlist@h and similar expressions are trying to concatenate lists before passing them to the recursive call; in OCaml, however, function application binds more tightly than any operator, so it actually appends the result of the recursive call to h.
I don't, off-hand, have a correct version of the function. But I would start by writing it with guarded patterns, and then see how far that gets you in working it out.
Your intended operation has a simple recursive description: recursively process the tail of your list, then perform an "insert" operation with the head, which looks for a list that begins with the same head and, if found, inserts all elements but the head, and otherwise appends it at the end. You can then reverse the result to get your intended list of lists.
In OCaml, this algorithm would look like this:
let process list =
  let rec insert (head,tail) = function
    | [] -> [head :: tail]
    | h :: t ->
      match h with
      | hh :: tt when hh = head -> (hh :: (tail @ tt)) :: t
      | _ -> h :: insert (head,tail) t
  in
  let rec aux = function
    | [] -> []
    | [] :: t -> aux t
    | (head :: tail) :: t -> insert (head,tail) (aux t)
  in
  List.rev (aux list)
Consider using a Map or a hash table to keep track of the heads and the elements found for each head. The nlist auxiliary list isn't very helpful if lists with the same heads aren't adjacent, as in this example:
# combineSameHead [["A"; "a0"; "a1"]; ["B"; "b0"]; ["A"; "a2"]]
- : list (list string) = [["A"; "a0"; "a1"; "a2"]; ["B"; "b0"]]
I probably would have done something along the lines of what antonakos suggested. It avoids the O(n) cost of searching in a list. You may also find that a StringSet.t StringMap.t is easier to work with in further processing. Of course, readability is paramount, and I think this still holds up under that criterion.
module OrderedString =
  struct
    type t = string
    let compare = Pervasives.compare
  end
module StringMap = Map.Make (OrderedString)
module StringSet = Set.Make (OrderedString)
let merge_same_heads lsts =
  let add_single map = function
    | hd::tl when StringMap.mem hd map ->
      let set = StringMap.find hd map in
      let set = List.fold_right StringSet.add tl set in
      StringMap.add hd set map
    | hd::tl ->
      let set = List.fold_right StringSet.add tl StringSet.empty in
      StringMap.add hd set map
    | [] ->
      map
  in
  let map = List.fold_left add_single StringMap.empty lsts in
  StringMap.fold (fun k v acc-> (k::(StringSet.elements v))::acc) map []
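The map-keyed-by-head idea is compact enough to sketch in a few lines. The version below is in Haskell rather than OCaml and uses Data.Map.Strict from the containers package; combineHeads is a name chosen for this example. Unlike the set-based code above, it keeps the remaining elements in their original order and does not remove duplicates, which is an assumption about what is wanted.

```haskell
import qualified Data.Map.Strict as Map

-- Merge all lists that share the same head. Empty lists are skipped,
-- and the result is ordered by head because it comes out of a Map.
combineHeads :: Ord a => [[a]] -> [[a]]
combineHeads xss = [h : ts | (h, ts) <- Map.toList grouped]
  where
    -- fromListWith calls its function as (f new old); flipping (++)
    -- appends later tails after earlier ones.
    grouped = Map.fromListWith (flip (++)) [(h, t) | (h : t) <- xss]
```

For example, combineHeads [[1,2,3],[1,4,5],[2,3,4],[1],[1,5,7],[2,5],[3,4,6]] gives [[1,2,3,4,5,5,7],[2,3,4,5],[3,4,6]].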
You can do a lot just using the standard library:
(* compares the head of a list to a supplied value. Used to partition a list of lists *)
let partPred x = function h::_ -> h = x
  | _ -> false
let rec combineHeads = function [] -> []
  | []::t -> combineHeads t (* skip empty lists *)
  | (hh::_ as h)::t ->
    (* split into lists with the same head as the first, and lists with different heads *)
    let r, l = List.partition (partPred hh) t in
    (* combine all the lists with the same head, then recurse on the remaining lists *)
    (List.fold_left (fun x y -> x @ (List.tl y)) h r)::(combineHeads l)
combineHeads [[1;2;3];[1;4;5;];[2;3;4];[1];[1;5;7];[2;5];[3;4;6]];;
- : int list list = [[1; 2; 3; 4; 5; 5; 7]; [2; 3; 4; 5]; [3; 4; 6]]
This won't be fast, however: partition, fold_left and the list appends are all O(n).