I'm working on a personal project and I got stuck at a certain point.
I have a list of lists that contains 2 elements each in char format. I want to sort each list by number, and then if there are 2 equal numbers in different lists, sort them by alphabet.
A list is an instance of Ord: lists are compared on the first item, then, in case of a tie, on the second item, and so on.
We can work with sortOn :: Ord b => (a -> b) -> [a] -> [a] to sort the elements based on the result of a function call on these items.
We thus can sort with:
import Data.List(sortOn)
sortOn (\(x : y : _) -> (read y :: Double, x)) mylist
for the given sample data, this will produce:
Prelude Data.List> sortOn (\(x : y : _) -> (read y :: Double, x)) mylist
[["Elise","0.9"],["Name","1.2"],["Rex","1.2"],["Diana","2.1"],["Mark","2.1"]]
That being said, the above is not very safe, since it is not said that each list has at least two items, nor that the second item can be converted to a Double.
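A more defensive variant could look like the sketch below. It assumes that rows which are too short, or whose second item does not parse as a Double, should simply be dropped; parseRow and sortRows are names introduced here for illustration only.

```haskell
import Data.List (sortOn)
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Hypothetical helper: build the sort key for one [name, score] row.
-- Returns Nothing when the row has fewer than two items or the score
-- is not a valid Double.
parseRow :: [String] -> Maybe (Double, String)
parseRow (name : score : _) = do
  n <- readMaybe score
  pure (n, name)
parseRow _ = Nothing

-- Sort by score, then by name. Rows that fail to parse are dropped
-- (an assumption; reporting them as errors is another option).
sortRows :: [[String]] -> [[String]]
sortRows = map snd . sortOn fst . mapMaybe keyed
  where
    keyed row = fmap (\key -> (key, row)) (parseRow row)
```

Because readMaybe returns a Maybe instead of throwing, a malformed row no longer crashes the sort.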
Here is the expected input/output:
repeated "Mississippi" == "ips"
repeated [1,2,3,4,2,5,6,7,1] == [1,2]
repeated " " == " "
And here is my code so far:
repeated :: String -> String
repeated "" = ""
repeated x = group $ sort x
I know that the last part of the code doesn't work. I was thinking to sort the list then group it, then I wanted to make a filter on the list of list which are greater than 1, or something like that.
Your code already does half of the job
> group $ sort "Mississippi"
["M","iiii","pp","ssss"]
You said you want to filter out the non-duplicates. Let's define a predicate which identifies the lists having at least two elements:
atLeastTwo :: [a] -> Bool
atLeastTwo (_:_:_) = True
atLeastTwo _ = False
Using this:
> filter atLeastTwo . group $ sort "Mississippi"
["iiii","pp","ssss"]
Good. Now, we need to take only the first element from such lists. Since the lists are non-empty, we can use head safely:
> map head . filter atLeastTwo . group $ sort "Mississippi"
"ips"
Alternatively, we could replace the filter with filter (\xs -> length xs >= 2) but this would be less efficient.
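One way to see the difference: length has to walk the entire list, while the pattern-based predicate inspects at most the first two cons cells. The sketch below shows this with an infinite list; lazyCheck is a name made up for this illustration.

```haskell
atLeastTwo :: [a] -> Bool
atLeastTwo (_ : _ : _) = True
atLeastTwo _ = False

-- This terminates, because only the first two cells are forced.
-- The length-based predicate would never return on this input.
lazyCheck :: Bool
lazyCheck = atLeastTwo [1 :: Int ..]
```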
Yet another option is to use a list comprehension
> [ x | (x:_y:_) <- group $ sort "Mississippi" ]
"ips"
This pattern matches on the lists starting with x and having at least another element _y, combining the filter with taking the head.
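Put together as a complete function, and generalised from String to any Ord type, the list comprehension version might read as follows (a sketch based on the steps above):

```haskell
import Data.List (group, sort)

-- Elements that occur at least twice, each listed once, in sorted order.
repeated :: Ord a => [a] -> [a]
repeated xs = [x | (x : _ : _) <- group (sort xs)]
```

No special case for the empty list is needed: sort and group both return [] for it, so the comprehension yields [] as well.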
Okay, good start. One immediate problem is that the specification requires the function to work on lists of numbers, but you define it for strings. The list must be sorted, so its element type must be an instance of the Ord type class. Therefore, let’s fix the type signature:
repeated :: Ord a => [a] -> [a]
After calling sort and group, you will have a list of lists, [[a]]. Let’s take your idea of using filter. That works. Your predicate should, as you said, check the length of each list in the list, then compare that length to 1.
Filtering a list of lists gives you a subset, which is another list of lists, of type [[a]]. You need to flatten this list. What you want to do is map each entry in the list of lists to one of its elements. For example, the first. There’s a function in the Prelude to do that.
So, you might fill in the following skeleton:
module Repeated (repeated) where
import Data.List (group, sort)
repeated :: Ord a => [a] -> [a]
repeated = map _
         . filter (\x -> _)
         . group
         . sort
I’ve written this in point-free style with the filtering predicate as a lambda expression, but many other ways to write this are equally good. Find one that you like! (For example, you could also write the filter predicate in point-free style, as a composition of two functions: a comparison on the result of length.)
When you try to compile this, the compiler will tell you that there are two typed holes, the _ entries to the right of the equal signs. It will also tell you the type of the holes. The first hole needs a function that takes a list and gives you back a single element. The second hole needs a Boolean expression using x. Fill these in correctly, and your program will work.
Here are some other approaches, to evaluate @chepner's comment on the solution using group $ sort. (Those solutions look simpler, because some of the complexity is hidden in the library routines.)
While it's true that sorting is O(n lg n), ...
It's not just the sorting but especially the group: that uses span, and both of them build and destroy temporary lists. I.e. they do this:
a linear traversal of an unsorted list will require some other data structure to keep track of all possible duplicates, and lookups in each will add to the space complexity at the very least. While carefully chosen data structures could be used to maintain an overall O(n) running time, the constant would probably make the algorithm slower in practice than the O(n lg n) solution, ...
group/span adds considerably to that complexity, so O(n lg n) is not a correct measure.
while greatly complicating the implementation.
The following all traverse the input list just once. Yes they build auxiliary lists. (Probably a Set would give better performance/quicker lookup.) They maybe look more complex, but to compare apples with apples look also at the code for group/span.
repeated2, repeated3, repeated4 :: Ord a => [a] -> [a]
repeated2/inserter2 builds an auxiliary list of pairs [(a, Bool)], in which the Bool is True if the a appears more than once, False if only once so far.
repeated2 xs = sort $ map fst $ filter snd $ foldr inserter2 [] xs
inserter2 :: Ord a => a -> [(a, Bool)] -> [(a, Bool)]
inserter2 x [] = [(x, False)]
inserter2 x (xb@(x', _): xs)
  | x == x' = (x', True): xs
  | otherwise = xb: inserter2 x xs
repeated3/inserter3 builds an auxiliary list of pairs [(a, Int)], in which the Int counts how many of the a appear. The aux list is sorted anyway, just for the heck of it.
repeated3 xs = map fst $ filter ((> 1).snd) $ foldr inserter3 [] xs
inserter3 :: Ord a => a -> [(a, Int)] -> [(a, Int)]
inserter3 x [] = [(x, 1)]
inserter3 x xss@(xc@(x', c): xs) = case x `compare` x' of
  { LT -> ((x, 1): xss)
  ; EQ -> ((x', c+1): xs)
  ; GT -> (xc: inserter3 x xs)
  }
repeated4/go4 builds an output list of elements known to repeat. It maintains an intermediate list of elements met once (so far) as it traverses the input list. If it meets a repeat: it adds that element to the output list; deletes it from the intermediate list; filters that element out of the tail of the input list.
repeated4 xs = sort $ go4 [] [] xs
go4 :: Ord a => [a] -> [a] -> [a] -> [a]
go4 repeats _ [] = repeats
go4 repeats onces (x: xs) = case findUpd x onces of
  { (True, oncesU) -> go4 (x: repeats) oncesU (filter (/= x) xs)
  ; (False, oncesU) -> go4 repeats oncesU xs
  }
findUpd :: Ord a => a -> [a] -> (Bool, [a])
findUpd x [] = (False, [x])
findUpd x (x': os) | x == x' = (True, os) -- i.e. x' removed
                   | otherwise =
                       let (b, os') = findUpd x os in (b, x': os')
(That last bit of list-fiddling in findUpd is very similar to span.)
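For comparison, here is a sketch of the same idea using a Map from the containers package instead of hand-rolled auxiliary lists. This is an assumption about what "a Set would give better performance" might look like in practice, not a measured result; repeatedMap is a name chosen for this example.

```haskell
import qualified Data.Map.Strict as Map

-- Count every element in a single pass over the input, then keep the
-- keys whose count exceeds one. Map.keys returns keys in ascending
-- order, so no separate sort of the result is needed.
repeatedMap :: Ord a => [a] -> [a]
repeatedMap xs = Map.keys (Map.filter (> 1) counts)
  where
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]
```

Each insertion costs O(log n), so the whole function is O(n log n), with lookups in a balanced tree rather than linear scans of an auxiliary list.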
I want to do a Haskell function where the input (a list of Strings) is ordered (always. input is valid only if is ordered) and I want to get the number of occurrences of each different string.
Example:
contaOcs ["a", "a", "b", "c", "c", "c", "d"]
Should return:
[(2,"a"), (1,"b"), (3,"c"), (1,"d")]
Here is what I'm trying to do:
module Main where
contaOcs :: [String] -> [(Int, String)]
contaOcs [] = [_,_]
contaOcs [x] = [1,x]
contaOcs (i, x1:x2:xs)
  | x1 == x2 = (i+1,(x2:xs))
  | otherwise = (0, (x2:xs))
But this code has some errors and I'm not sure what I should do to accomplish this.
I'm new to functional programming and Haskell. Can anyone help me with some information?
Thanks for any help.
There are some syntactical problems as well as problems with the types. The first line looks like:
contaOcs [] = [_,_]
But an underscore (_) in the result does not make any sense: you can only construct lists with values in them. When we count the occurrences in an empty list, the result will be an empty list, so contaOcs [] = [].
As for the second:
contaOcs [x] = [1,x]
Here you aim to return a list with two elements: a 1 and an x (which is a String). In Haskell the elements of a list all have the same type. What you can do is return a list of 2-tuples with the first item an Int, and the second a String, like the signature suggests, but then you need to wrap the values in a 2-tuple, like contaOcs [x] = [(1,x)].
In your last clause, you write:
contaOcs (i, x1:x2:xs) = ...
which does not make much sense: the input type is a list (here of Strings), not a 2-tuple with an Int, and a list of strings.
So the input will look like:
contaOcs (x1:x2:xs) = ...
The output, like (i+1,(x2:xs)) also is not in "harmony" with the proposed output type in the signature, this looks like a 2-tuple with an Int, and a list of Strings, so (Int, [String]), not [(Int, String)].
Based on the above comments, we have derived something like:
contaOcs :: [String] -> [(Int, String)]
contaOcs [] = []
contaOcs [x] = [(1,x)]
contaOcs (x1:x2:xs)
  | x1 == x2 = -- ...
  | otherwise = -- ...
So now there are two parts to fill in. In case x1 and x2 are not equal, that means that we can first yield a tuple (1, x1) in the list, followed by the result of contaOcs on the rest of the list (x2 included), so:
(1, x1) : contaOcs (x2:xs)
In the other case (x1 == x2), we first make a recursive call to contaOcs with (x2:xs), and then increment the counter of the first item of that result. We are sure such an element exists: the recursive call receives a list containing at least one element, and by induction its result contains at least one element as well, since the single-element base case returns one element, and the recursive case either prepends elements to the result or updates them.
So we can use a pattern guard, and manipulate the result, like:
contaOcs :: [String] -> [(Int, String)]
contaOcs [] = []
contaOcs [x] = [(1,x)]
contaOcs (x1:x2:xs)
  | x1 == x2, ((yi, yv):ys) <- contaOcs (x2:xs) = (yi+1, yv) : ys
  | otherwise = (1, x1) : contaOcs (x2:xs)
We can also use an "as-pattern": we only need a reference to the tail of the list starting with x2, not xs:
contaOcs :: [String] -> [(Int, String)]
contaOcs [] = []
contaOcs [x] = [(1,x)]
contaOcs (x1:xs@(x2:_))
  | x1 == x2, ((yi, yv):ys) <- contaOcs xs = (yi+1, yv) : ys
  | otherwise = (1, x1) : contaOcs xs
The above is however not very elegant. It might be better to use an accumulator here; I leave this as an exercise.
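One possible shape for such an accumulator-based version is sketched below. It is only one of several reasonable solutions; contaOcsAcc and the helper go are names chosen here for illustration.

```haskell
contaOcsAcc :: [String] -> [(Int, String)]
contaOcsAcc [] = []
contaOcsAcc (x : xs) = go 1 x xs
  where
    -- go n cur rest: n copies of cur have been seen so far.
    go n cur [] = [(n, cur)]
    go n cur (y : ys)
      | y == cur = go (n + 1) cur ys       -- same string: bump the count
      | otherwise = (n, cur) : go 1 y ys   -- new string: emit and restart
```

The counter travels forward with the traversal, so there is no need to pattern match on the result of the recursive call.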
Let's look at some of the errors mentioned by ghc. Always pay close attention to when GHC talks about Expected and Actual types, as these messages are always illuminating. Expected indicates what GHC thinks you should write. Actual indicates what you wrote. You either need to change what you wrote (read: change your code), or change what GHC thinks you should write (read: change your type annotations). In this case it's mostly the former.
hw.hs:2:16: error:
• Found hole: _ :: (Int, String)
• In the expression: _
In the expression: [_, _]
In an equation for ‘contaOcs’: contaOcs [] = [_, _]
• Relevant bindings include
contaOcs :: [String] -> [(Int, String)] (bound at hw.hs:2:1)
|
2 | contaOcs [] = [_,_]
| ^
hw.hs:2:18: error:
• Found hole: _ :: (Int, String)
• In the expression: _
In the expression: [_, _]
In an equation for ‘contaOcs’: contaOcs [] = [_, _]
• Relevant bindings include
contaOcs :: [String] -> [(Int, String)] (bound at hw.hs:2:1)
|
2 | contaOcs [] = [_,_]
| ^
The underscore is used as a placeholder (or "hole"), to be filled in later. GHC is telling you that you should figure out something to put in these holes.
hw.hs:3:19: error:
• Couldn't match type ‘[Char]’ with ‘(Int, String)’
Expected type: (Int, String)
Actual type: String
• In the expression: x
In the expression: [1, x]
In an equation for ‘contaOcs’: contaOcs [x] = [1, x]
|
3 | contaOcs [x] = [1,x]
|
You have declared that the return type of the function is [(Int, String)], in other words, a List, where each element of the list is a Tuple of Int and String.
Therefore, each element in the list should be a Tuple. The syntax [1,x] means a list with two elements: 1 and x. GHC has noticed that x, however, is known to be a String, which is not a Tuple. (GHC failed to notice that 1 is not a tuple, for... reasons. Numbers in Haskell are a little weird and GHC is not so helpful with those.)
Perhaps you meant to write (1, x), which is a tuple of 1 (an Int) and x (a String). However, don't forget to also put that tuple into a list somehow, since your return type is a list of tuples.
hw.hs:4:10: error:
• Couldn't match expected type ‘[String]’
with actual type ‘(Integer, [a0])’
• In the pattern: (i, x1 : x2 : xs)
In an equation for ‘contaOcs’:
contaOcs (i, x1 : x2 : xs)
| x1 == x2 = (i + 1, (x2 : xs))
| otherwise = (0, (x2 : xs))
|
4 | contaOcs (i, x1:x2:xs)
| ^^^^^^^^^^^^^
GHC is telling you that it expects the argument to be a list of Strings (as declared in the signature), but your pattern (i, x1:x2:xs) matches a tuple instead.
The remaining errors are mostly of the same kind.
contaOcs :: [String] -> [(Int, String)]
contaOcs consumes a list of strings: xss, for each unique string: xs in xss, we produce a pair: p, whose first element represents the number of occurrences of xs in xss, and the second element of p is that xs itself.
We know we need to group equal strings and count the size of each group. You can follow this idea and implement the rest yourself. contaOcs takes a list and produces a new list, so a list comprehension should give you what you want. Since you're transforming one list into another, mapping a function over the grouped list should also work. You can also use plain recursion or an accumulator. Here is one way to write contaOcs (it needs liftA2 from Control.Applicative and group from Data.List):
contaOcs = (return . liftA2 (,) length head =<<) . group
Write down the signature, purpose statement, some sample data and test cases first, then it's just a matter of finding the solutions that best fit your need.
This is a good example of when a pair of mutually recursive functions is helpful.
contaOcs :: [String] -> [(Int, String)]
We'll define contaOcs as the outer function that takes the list of strings and returns the tuples. First let's look at the trivial cases:
contaOcs [] = []
contaOcs [x] = [(1,x)]
Pass an empty list, and you should get back an empty list. Pass a single element list, and you should get back a list with one element: (1, x). Now we can guarantee that any other list is 2+ elements long.
contaOcs (x:xs) = go x xs
go? What is go you might ask? Well let's define it in the where clause:
  where
    go cur xs = let (this, rest) = span (==x) xs
                in (succ . length $ this, cur) : contaOcs rest
That's kind of a lot, so let's unpack. go is an idiomatic name for a helper function (it could as easily be named f or frobnicator; it doesn't matter). It takes the string we're counting, which has been split off from the rest of its list, and calls it cur. It runs span (==x) against the rest of the list, which splits it into a tuple (longestPrefixThatMatches, rest). We return the length of that longest prefix (plus one, since we've stripped off the front element) paired with the string itself in a tuple, then cons that onto the recursive case, handing the rest of the list back to the outer function to handle.
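Assembled into one definition, a compact variant of this approach might look like the sketch below. It folds the helper into the main equation and drops the single-element case, which the general case already covers; it is an illustrative rewrite under the name contaOcsSpan, not the exact code from above.

```haskell
contaOcsSpan :: [String] -> [(Int, String)]
contaOcsSpan [] = []
contaOcsSpan (x : xs) = (1 + length same, x) : contaOcsSpan rest
  where
    -- same: the run of further copies of x; rest: everything after it.
    (same, rest) = span (== x) xs
```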
What you want can be done by a one-liner
Prelude> import Data.List
Prelude Data.List> ls = ["a", "a", "b", "c", "c", "c", "d"]
Prelude Data.List> [(length x, head x) | x <- group ls]
[(2,"a"),(1,"b"),(3,"c"),(1,"d")]
This combines a list comprehension with the group function, both basic concepts worth getting familiar with.
contaOcs :: [String] -> [(Int, String)]
contaOcs xs = foldr foldContaOcs [] xs
  where foldContaOcs s [] = (1, s):[]
        foldContaOcs s ((n, ch):xs) = if ch == s then (n + 1, s) : xs
                                      else (1, s): (n, ch): xs
The next lines show how it has to work:
[14,2,344,41,5,666] becomes [(14,2),(2,1),(344,3),(41,2),(5,1),(666,3)]
["Zoo","School","Net"] becomes [("Zoo",3),("School",6),("Net",3)]
That's my code so far:
zipWithLength :: [a] -> [(a, Int)]
zipWithLength (x:xs) = zipWith (\acc x -> (x, length x):acc) [] xs
I want to figure out what the problem in the second line is.
If you transform the numbers into strings (using show), you can apply length on them:
Prelude> let zipWithLength = map (\x -> (x, length (show x)))
Prelude> zipWithLength [14,2,344,41,5,666]
[(14,2),(2,1),(344,3),(41,2),(5,1),(666,3)]
However, you cannot use the same function on a list of strings:
Prelude> zipWithLength ["Zoo","School","Net"]
[("Zoo",5),("School",8),("Net",5)]
The numbers are not the lengths of the strings, but of their representations:
Prelude> show "Zoo"
"\"Zoo\""
Prelude> length (show "Zoo")
5
As noted in the comments, similar problems may happen with other types of elements:
Prelude> zipWithLength [(1.0,3),(2.5,3)]
[((1.0,3),7),((2.5,3),7)]
Prelude> show (1.0,3)
"(1.0,3)"
Prelude> length (show (1.0,3))
7
If you want to apply a function on every element of a list, that is a map :: (a -> b) -> [a] -> [b]. The map thus takes a function f and a list xs, and generates a list ys, such that the i-th element of ys, is f applied to the i-th element of xs.
So now the only question is what mapping function we want. We want to take an element x, and return a 2-tuple (x, length x), we can express this with a lambda expression:
mapwithlength = map (\x -> (x, length x))
Or we can use ap :: Monad m => m (a -> b) -> m a -> m b for that:
import Control.Monad(ap)
mapwithlength = map (ap (,) length)
A problem is that this does not work for Ints, since these have no length. We can use show here, but there is an extra problem with that: if we perform show on a String, we get a string literal (this means that we get a string that has quotation marks, and where some characters are escaped). Based on the question, we do not want that.
We can define a parameterized function for that like:
mapwithlength f = map (ap (,) (length . f))
We can basically leave it to the user. In case they want to work with integers, they have to call it with:
forintegers = mapwithlength show
and for Strings:
forstrings = mapwithlength id
After installing the number-length package, you can do:
module Test where
import Data.NumberLength
-- use e.g for list of String
withLength :: [[a]] -> [([a], Int)]
withLength = map (\x -> (x, length x))
-- use e.g for list of Int
withLength' :: NumberLength a => [a] -> [(a, Int)]
withLength' = map (\x -> (x, numberLength x))
Examples:
>>> withLength ["Zoo", "bear"]
[("Zoo",3),("bear",4)]
>>> withLength' [14, 344]
[(14,2),(344,3)]
As bli points out, calculating the length of a number using length (show n) does not transfer to calculating the length of a string, since show "foo" becomes "\"foo\"". Since it is not obvious what the length of something is, you could parameterise the zip function with a length function:
zipWithLength :: (a -> Int) -> [a] -> [(a, Int)]
zipWithLength len = map (\x -> (x, len x))
Examples of use:
> zipWithLength (length . show) [7,13,666]
[(7,1),(13,2),(666,3)]
> zipWithLength length ["Zoo", "School", "Bear"]
[("Zoo",3),("School",6),("Bear",4)]
> zipWithLength (length . concat) [[[1,2],[3],[4,5,6,7]], [[],[],[6],[6,6]]]
[([[1,2],[3],[4,5,6,7]],7),([[],[],[6],[6,6]],3)]
I'm currently working in Haskell on a function that takes a String as a parameter and returns a list of (Char, Int). The function occur works with multiple types and is used in the function called word.
occur::Eq a=>a->[a]->Int
occur n [] = 0
occur n (x:xs) = if n == x
                   then 1 + occur n xs
                   else occur n xs
word::String->[(String,Int)]
word xs = [(x,y) | x<-head xs, y<-(occur x xs)]
This gives me the following error:
ERROR "file.hs":31 - Type error in generator
*** Term : head xs
*** Type : Char
*** Does not match : [a]
What am I doing wrong ? How can I make this code run properly , type-wise ?
The problem is you say that xs has type String, so head xs has type Char, and then you try to iterate over a single Char, which can't be done. The a <- b syntax only works when b is a list. You have the same problem in that y <- occur x xs is trying to iterate over a single Int, not a list of Int. You also had a problem in your type signature, the first type in the tuple should be Char, not String. You can fix it with:
word :: String -> [(Char, Int)]
word xs = [(x, occur x xs) | x <- xs]
Here we loop over the entire string xs, and for each character x in xs we compute occur x xs.
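Note that this produces one entry per character position, so a letter that occurs three times shows up three times in the result. If one entry per distinct character is wanted, a small variation is to iterate over nub xs instead. The sketch below assumes that is the desired behaviour; wordCounts is a name introduced for this example, and occur is rewritten with filter for brevity.

```haskell
import Data.List (nub)

-- Same behaviour as the recursive occur: count the elements equal to n.
occur :: Eq a => a -> [a] -> Int
occur n = length . filter (== n)

-- One (character, count) pair per distinct character, in order of
-- first appearance. Still O(n^2), like the original approach.
wordCounts :: String -> [(Char, Int)]
wordCounts xs = [(x, occur x xs) | x <- nub xs]
```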
I would actually recommend using a slightly stronger constraint than just Eq. If you generalize word (that I've renamed to occurrences) and constrain it with Ord, you can use group and sort, which allow you to keep from iterating over the list repeatedly for each character and avoid the O(n^2) complexity. You can also simplify the definition pretty significantly:
import Control.Arrow
import Data.List
occurrences :: Ord a => [a] -> [(a, Int)]
occurrences = map (head &&& length) . group . sort
What this does is first sort your list, then group by identical elements. So "Hello, world" turns into
> sort "Hello, world"
" ,Hdellloorw"
> group $ sort "Hello, world"
[" ", ",", "H", "d", "e", "lll", "oo", "r", "w"]
Then we use the arrow operator &&&, which takes two functions, applies a single input to both, then returns the results as a tuple. So head &&& length is the same as saying
\x -> (head x, length x)
and we map this over our sorted, grouped list:
> map (head &&& length) $ group $ sort "Hello, world"
[(' ',1),(',',1),('H',1),('d',1),('e',1),('l',3),('o',2),('r',1),('w',1)]
This eliminates repeats, you aren't having to scan the list over and over counting the number of elements, and it can be defined in a single line in the pointfree style, which is nice. However, it does not preserve order. If you need to preserve order, I would then use sortBy and the handy function comparing from Data.Ord (but we lose a nice point free form):
import Control.Arrow
import Data.List
import Data.Ord (comparing)
occurrences :: Ord a => [a] -> [(a, Int)]
occurrences = map (head &&& length) . group . sort
occurrences' :: Ord a => [a] -> [(a, Int)]
occurrences' xs = sortBy (comparing ((`elemIndex` xs) . fst)) $ occurrences xs
You can almost read this as plain English. This sorts by comparing the index in xs of the first element of the tuples in occurrences xs. Even though elemIndex returns a value of type Maybe Int, we can still compare those directly (Nothing is "less than" any Just value). It simply looks up the first index of each letter in the original string and sorts by that index. That way
> occurrences' "Hello, world"
returns
[('H',1),('e',1),('l',3),('o',2),(',',1),(' ',1),('w',1),('r',1),('d',1)]
with all the letters in the original order, up to repetition.