Function to find the most frequent element - list

I am trying to code a function that returns the element that appears the most in a list. So far I have the following
task :: Eq a => [a] -> a
task xs = (map ((\l@(x:xs) -> (x,length l)) (occur (sort xs))))
occur is a function that takes a list and returns a list of pairs with the elements of the inputted list along with the amount of times they appear. So for example for a list [1,1,2,3,3] the output would be [(1,2),(2,1),(3,2)].
However, I am getting some errors related to the arguments of map. Can anyone tell me what I'm doing wrong?

A map maps every item to another item, so here \l is a 2-tuple, like (1,2), (2, 1) or (3, 2). It thus does not make much sense to work with length l, since length :: Foldable f => f a -> Int will always return one for a 2-tuple: this is because only the second part of the 2-tuple is used in the foldable. But we do not need length in the first place.
What you need is a function that can retrieve the maximum based on the second item of the 2-tuple. We can make use of maximumOn :: Ord b => (a -> b) -> [a] -> a from the extra package, or we can implement our own function to calculate the maximum on a list of items.
Such a function should look like:
maximumSnd :: Ord b => [(a, b)] -> (a, b)
maximumSnd [] = error "Empty list"
maximumSnd (x:xs) = go xs x
  where go [] m = m
        go (x@(xa, xb):xs) (ya, yb)
          | xb > yb = go … … -- (1)
          | otherwise = go … … -- (2)
Here (1) should be implemented such that we make a recursive call but work with x as the new maximum we found thus far. (2) should make a recursive call with the same thus far maximum.
Once we have implemented the maximumSnd function, we can use it as a helper function for:
task :: Eq a => [a] -> (a, Int)
task xs = maximumSnd (occur xs)
or we can use fst :: (a, b) -> a to retrieve the first item of the 2-tuple:
task :: Eq a => [a] -> a
task xs = (fst . maximumSnd) (occur xs)
If several elements share the maximum number of occurrences, maximumSnd will return the first one in the list of occurrences.
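As a rough end-to-end sketch of the idea above, the following uses a left fold instead of the explicit go helper. The names occurrences and mostFrequent are hypothetical, and counting via group and sort is an assumption that requires an Ord constraint rather than the Eq constraint used in the question.

```haskell
import Data.List (group, sort)

-- Hypothetical helper: count each run of equal elements in the sorted input.
occurrences :: Ord a => [a] -> [(a, Int)]
occurrences xs = [ (y, length ys) | ys@(y:_) <- group (sort xs) ]

-- Keep the first pair whose second component is strictly largest.
maximumSnd :: Ord b => [(a, b)] -> (a, b)
maximumSnd []     = error "Empty list"
maximumSnd (p:ps) = foldl keep p ps
  where
    keep best@(_, bestN) cand@(_, n)
      | n > bestN = cand   -- strictly larger: new maximum
      | otherwise = best   -- ties keep the earlier pair

-- Hypothetical name for the asker's task function.
mostFrequent :: Ord a => [a] -> a
mostFrequent = fst . maximumSnd . occurrences
```

Because the comparison is strict, ties are resolved in favour of the element that comes first in the (sorted) list of occurrences.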

Related

Function to find number of occurrences in list

So I already have a function that finds the number of occurrences in a list using maps.
occur :: Ord a => [a] -> Map a Int
occur xs = fromListWith (+) [(x, 1) | x <- xs]
For example if a list [1,1,2,3,3] is inputted, the code will output [(1,2),(2,1),(3,2)], and for a list [1,2,1,1] the output would be [(1,3),(2,1)].
I was wondering if there's any way I can change this function to use foldr instead to eliminate the use of maps.
You can make use of foldr where the accumulator is a list of key-value pairs. At each step we check whether the list already contains a 2-tuple for the given element. If that is the case, we increment the corresponding value. If the item x does not yet exist, we add (x, 1) to that list.
Our function thus will look like:
occur :: Eq a => [a] -> [(a, Int)]
occur = foldr incMap []
where incMap thus takes an item x and a list of 2-tuples. We can make use of recursion here to update the "map" with:
incMap :: Eq a => a -> [(a, Int)] -> [(a, Int)]
incMap x = go
  where go [] = [(x, 1)]
        go (y2@(y, ny): ys)
          | x == y = … : ys
          | otherwise = y2 : …
where I leave implementing the … parts as an exercise.
This algorithm is not very efficient, since each increment takes O(n) time, with n the number of 2-tuples in the list. You can also implement incrementing a Map for the given item by using insertWith :: Ord k => (a -> a -> a) -> k -> a -> Map k a -> Map k a, which is more efficient: it runs in O(log n) per insertion.
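A minimal sketch of the insertWith variant mentioned above, still driven by foldr. The names occurMap and occurList are hypothetical, and the Ord constraint is an assumption needed by Data.Map.

```haskell
import qualified Data.Map.Strict as Map

-- Fold over the input, bumping the counter for each element.
-- insertWith (+) x 1 inserts 1 for a new key, or adds 1 to the existing count.
occurMap :: Ord a => [a] -> Map.Map a Int
occurMap = foldr (\x acc -> Map.insertWith (+) x 1 acc) Map.empty

-- Convert back to an association list, ordered by key.
occurList :: Ord a => [a] -> [(a, Int)]
occurList = Map.toList . occurMap
```

For the list [1,1,2,3,3] occurList gives [(1,2),(2,1),(3,2)], matching the output described in the question.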

Haskell function to keep the repeating elements of a list

Here is the expected input/output:
repeated "Mississippi" == "ips"
repeated [1,2,3,4,2,5,6,7,1] == [1,2]
repeated " " == " "
And here is my code so far:
repeated :: String -> String
repeated "" = ""
repeated x = group $ sort x
I know that the last part of the code doesn't work. I was thinking to sort the list then group it, then I wanted to make a filter on the list of list which are greater than 1, or something like that.
Your code already does half of the job:
> group $ sort "Mississippi"
["M","iiii","pp","ssss"]
You said you want to filter out the non-duplicates. Let's define a predicate which identifies the lists having at least two elements:
atLeastTwo :: [a] -> Bool
atLeastTwo (_:_:_) = True
atLeastTwo _ = False
Using this:
> filter atLeastTwo . group $ sort "Mississippi"
["iiii","pp","ssss"]
Good. Now, we need to take only the first element from such lists. Since the lists are non-empty, we can use head safely:
> map head . filter atLeastTwo . group $ sort "Mississippi"
"ips"
Alternatively, we could replace the filter with filter (\xs -> length xs >= 2) but this would be less efficient.
Yet another option is to use a list comprehension
> [ x | (x:_y:_) <- group $ sort "Mississippi" ]
"ips"
This pattern matches on the lists starting with x and having at least another element _y, combining the filter with taking the head.
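Putting the pieces together, one possible complete definition based on the list comprehension looks like this. The generalised Ord a signature is an assumption, chosen so that the numeric example from the question works as well.

```haskell
import Data.List (group, sort)

-- Sort, group equal neighbours, and keep the head of every group
-- that has at least two elements.
repeated :: Ord a => [a] -> [a]
repeated xs = [ x | (x:_:_) <- group (sort xs) ]
```

With this definition, repeated "Mississippi" evaluates to "ips" and repeated [1,2,3,4,2,5,6,7,1] to [1,2].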
Okay, good start. One immediate problem is that the specification requires the function to work on lists of numbers, but you define it for strings. The list must be sorted, so its element type must be an instance of the Ord typeclass. Therefore, let’s fix the type signature:
repeated :: Ord a => [a] -> [a]
After calling sort and group, you will have a list of lists, [[a]]. Let’s take your idea of using filter. That works. Your predicate should, as you said, check the length of each list in the list, then compare that length to 1.
Filtering a list of lists gives you a subset, which is another list of lists, of type [[a]]. You need to flatten this list. What you want to do is map each entry in the list of lists to one of its elements. For example, the first. There’s a function in the Prelude to do that.
So, you might fill in the following skeleton:
module Repeated (repeated) where
import Data.List (group, sort)
repeated :: Ord a => [a] -> [a]
repeated = map _
         . filter (\x -> _)
         . group
         . sort
I’ve written this in point-free style with the filtering predicate as a lambda expression, but many other ways to write this are equally good. Find one that you like! (For example, you could also write the filter predicate in point-free style, as a composition of two functions: a comparison on the result of length.)
When you try to compile this, the compiler will tell you that there are two typed holes, the _ entries to the right of the equal signs. It will also tell you the type of the holes. The first hole needs a function that takes a list and gives you back a single element. The second hole needs a Boolean expression using x. Fill these in correctly, and your program will work.
Here are some other approaches, to evaluate @chepner's comment on the solution using group $ sort. (Those solutions look simpler, because some of the complexity is hidden in the library routines.)
"While it's true that sorting is O(n lg n), ..."
It's not just the sorting but especially the group: that uses span, and both of them build and destroy temporary lists. I.e. they do this:
"a linear traversal of an unsorted list will require some other data structure to keep track of all possible duplicates, and lookups in each will add to the space complexity at the very least. While carefully chosen data structures could be used to maintain an overall O(n) running time, the constant would probably make the algorithm slower in practice than the O(n lg n) solution, ..."
group/span adds considerably to that complexity, so O(n lg n) is not a correct measure.
"while greatly complicating the implementation."
The following all traverse the input list just once. Yes they build auxiliary lists. (Probably a Set would give better performance/quicker lookup.) They maybe look more complex, but to compare apples with apples look also at the code for group/span.
repeated2, repeated3, repeated4 :: Ord a => [a] -> [a]
repeated2/inserter2 builds an auxiliary list of pairs [(a, Bool)], in which the Bool is True if the a appears more than once, False if only once so far.
repeated2 xs = sort $ map fst $ filter snd $ foldr inserter2 [] xs
inserter2 :: Ord a => a -> [(a, Bool)] -> [(a, Bool)]
inserter2 x [] = [(x, False)]
inserter2 x (xb@(x', _): xs)
  | x == x' = (x', True): xs
  | otherwise = xb: inserter2 x xs
repeated3/inserter3 builds an auxiliary list of pairs [(a, Int)], in which the Int counts how many of the a appear. The aux list is sorted anyway, just for the heck of it.
repeated3 xs = map fst $ filter ((> 1).snd) $ foldr inserter3 [] xs
inserter3 :: Ord a => a -> [(a, Int)] -> [(a, Int)]
inserter3 x [] = [(x, 1)]
inserter3 x xss@(xc@(x', c): xs) = case x `compare` x' of
  { LT -> ((x, 1): xss)
  ; EQ -> ((x', c+1): xs)
  ; GT -> (xc: inserter3 x xs)
  }
repeated4/go4 builds an output list of elements known to repeat. It maintains an intermediate list of elements met once (so far) as it traverses the input list. If it meets a repeat: it adds that element to the output list; deletes it from the intermediate list; filters that element out of the tail of the input list.
repeated4 xs = sort $ go4 [] [] xs
go4 :: Ord a => [a] -> [a] -> [a] -> [a]
go4 repeats _ [] = repeats
go4 repeats onces (x: xs) = case findUpd x onces of
  { (True, oncesU) -> go4 (x: repeats) oncesU (filter (/= x) xs)
  ; (False, oncesU) -> go4 repeats oncesU xs
  }
findUpd :: Ord a => a -> [a] -> (Bool, [a])
findUpd x [] = (False, [x])
findUpd x (x': os) | x == x' = (True, os) -- i.e. x' removed
                   | otherwise =
                     let (b, os') = findUpd x os in (b, x': os')
(That last bit of list-fiddling in findUpd is very similar to span.)
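The remark that a Set would probably give quicker lookup can be sketched as follows. This is only an illustration under that assumption, not a measured improvement; the name repeatedSet is hypothetical.

```haskell
import Data.List (foldl')
import qualified Data.Set as Set

-- Single traversal keeping two sets: elements seen so far, and
-- elements seen at least twice. Each step costs O(log n).
repeatedSet :: Ord a => [a] -> [a]
repeatedSet = Set.toAscList . snd . foldl' step (Set.empty, Set.empty)
  where
    step (seen, dups) x
      | x `Set.member` seen = (seen, Set.insert x dups)  -- a repeat
      | otherwise           = (Set.insert x seen, dups)  -- first sighting
```

Since Set.toAscList returns the elements in ascending order, no separate sort is needed at the end.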

How to compare elements in a [[]]?

I am working on a small Haskell program. The answer is probably really simple, but I have tried and got nowhere.
One part of my program is the list:
first = [(3,3),(4,6),(7,7),(5,43),(9,9),(32,1),(43,43) ..]
and from that list I want to build a new one containing the elements whose two components are equal:
result = [3,7,9,43, ..]
Even though you appear to have not made the most minimal amount of effort to solve this question by yourself, I will give you the answer because it is so trivial and because Haskell is a great language.
Create a function with this signature:
findIdentical :: [(Int, Int)] -> [Int]
It takes a list of tuples and returns a list of ints.
Implement it like this:
findIdentical [] = []
findIdentical ((a,b) : xs)
  | a == b = a : (findIdentical xs)
  | otherwise = findIdentical xs
As you can see, findIdentical is a recursive function that compares a tuple for equality between both items, and then adds it to the result list if there is found equality.
You can do this, for instance, with a list comprehension. We iterate over every tuple (f,s) in first, so we write (f,s) <- first on the right side of the list comprehension, and we need to filter on the fact that f and s are equal, so f == s. In that case we add f (or s) to the result. So:
result = [ f | (f,s) <- first, f == s ]
We can turn this into a function that takes as input a list of 2-tuples [(a,a)], and compares these two elements, and returns a list [a]:
f :: Eq a => [(a,a)] -> [a]
f dat = [f | (f,s) <- dat, f == s ]
An easy way to do this is to use the Prelude's filter function, which has the type definition:
filter :: (a -> Bool) -> [a] -> [a]
All you need to do is supply a predicate that decides which elements to keep, and the list to filter. For example:
filterList :: (Eq a) => [(a, a)] -> [a]
filterList xs = [x | (x, y) <- filter (\(a, b) -> a == b) xs]
Which behaves as expected:
*Main> filterList [(3,3),(4,6),(7,7),(5,43),(9,9),(32,1),(43,43)]
[3,7,9,43]
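The same filter can also be written without a lambda or a comprehension. This is just a stylistic sketch; the name sameComponents is hypothetical.

```haskell
-- uncurry (==) turns the two-argument (==) into a predicate on pairs;
-- map fst then keeps one component of each surviving pair.
sameComponents :: Eq a => [(a, a)] -> [a]
sameComponents = map fst . filter (uncurry (==))
```

Applied to the example list above, it should produce the same [3,7,9,43].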

nub not compiling when checking a list for duplicates

I am working on a sudoku-inspired assignment and I need to implement a function that checks if a Block Cell has no repeated elements in it (to check if it's a valid solution to the puzzle).
okBlock :: Block Cell -> Bool
okBlock b = okList $ filter (/= Nothing) b
  where
    okList :: [a]-> Bool
    okList list
      | (length list) == (length (nub list)) = True
      | otherwise = False
Block a = [a]
Cell = [Maybe Int]
Haskell complains saying No instance for (Eq a) arising from a use of "==" Possible fix: add (Eq a) to the context of the type signature for okList...
Adding Eq a to the type signature does not help. I have tried the function in the terminal and it works fine for lists, and for lists of lists (i.e. the type I am feeding it in the function).
What am I missing here?
Well you can only filter out duplicates, if there is a way to check whether two values are duplicates. If we look at the type signature for nub, we see:
nub :: Eq a => [a] -> [a]
So that means that in order to filter out duplicates in a list of as, we need a to be an instance of the Eq class. We can thus simply forward the type constraint further in the signatures of the functions:
okBlock :: Block Cell -> Bool
okBlock b = okList $ filter (/= Nothing) b
  where
    okList :: Eq a => [a] -> Bool
    okList list
      | (length list) == (length (nub list)) = True
      | otherwise = False
We do not need to specify that Cell is an instance of Eq because:
Int is an instance of Eq;
if a is an instance of Eq, so is Maybe a, so Maybe Int is an instance of Eq; and
if a is an instance of Eq, so is [a], so [Maybe Int] is an instance of Eq.
That being said we can do some syntactical improvements of the code:
there is no need to work with guards if you simply return True and False as the result of the guard;
you can use an eta reduction and omit the b in okBlock; and
you do not need parentheses around function application (unless you feed the result straight to another, non-infix function).
This gives us:
okBlock :: Block Cell -> Bool
okBlock = okList . filter (/= Nothing)
  where
    okList :: Eq a => [a] -> Bool
    okList list = length list == length (nub list)
A final note is that usually you do not have to specify a type signature. In that case Haskell will aim to derive the most generic type signature. So you can write:
okBlock = okList . filter (/= Nothing)
  where
    okList list = length list == length (nub list)
Now okBlock will have type:
Prelude Data.List> :t okBlock
okBlock :: Eq a => [Maybe a] -> Bool
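For reference, a self-contained sketch of the check is shown below. The type synonyms are assumptions (here a block is taken to be a plain list of Maybe Int cells, which may differ from the assignment's actual definitions), and catMaybes replaces the filter (/= Nothing) step.

```haskell
import Data.List (nub)
import Data.Maybe (catMaybes)

-- Assumed types for illustration only.
type Cell  = Maybe Int
type Block = [Cell]

-- A block is fine when its filled-in cells contain no duplicates.
okBlock :: Block -> Bool
okBlock block = length filled == length (nub filled)
  where
    filled = catMaybes block  -- drop the empty cells, unwrap the rest
```

Using catMaybes also removes the Just wrappers, so nub compares plain Int values.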
Three points that are too big to make in a comment.
nub is horribly slow
nub takes O(n^2) time to process a list of length n. Unless you know the list is very short, this is the wrong function to use to remove duplicates from a list. Adding a bit more information about what sort of thing you're working with allows more efficient nubbing. The simplest, and probably most general, approach that isn't absolutely wretched is to use an Ord constraint:
import qualified Data.Set as S
nubOrd :: Ord a => [a] -> [a]
nubOrd = go S.empty where
  go _seen [] = []
  go seen (a : as)
    | a `S.member` seen = go seen as
    | otherwise = a : go (S.insert a seen) as
length is wasteful
Suppose I write
sameLength :: [a] -> [b] -> Bool
sameLength xs ys = length xs == length ys
(which uses the approach you did). Now imagine I calculate
sameLength [1..16] [1..2^100]
How long will that take? Calculating length [1..16] will take nanoseconds. Calculating length [1..2^100] will probably take billions of years using current hardware. Whoops. What's the right way? Pattern match!
sameLength [] [] = True
sameLength (_ : xs) (_ : ys) = sameLength xs ys
sameLength _ _ = False
Nubbing isn't the right solution to this problem
Suppose I ask noDuplicates (1 : [1,2..]). Obviously, there's a duplicate, right at the beginning. But if I use sameLength and nub to check, I will never get an answer. It will keep building the nubbed list and comparing it to the original list until the seen becomes so large it exhausts your computer's memory. How can you fix that? By directly calculating what you need:
noDuplicates = go S.empty where
  go _seen [] = True
  go seen (x : xs)
    | x `S.member` seen = False
    | otherwise = go (S.insert x seen) xs
Now the program will conclude that there's a duplicate the moment it sees the second 1.
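A small variation on the same traversal, shown only as a sketch: instead of a Bool it reports which element was repeated first. The name firstDuplicate is hypothetical, and noDuplicates is redefined here in terms of it.

```haskell
import Data.Maybe (isNothing)
import qualified Data.Set as S

-- Stop at the first element that was already seen and return it.
firstDuplicate :: Ord a => [a] -> Maybe a
firstDuplicate = go S.empty
  where
    go _seen [] = Nothing
    go seen (x : xs)
      | x `S.member` seen = Just x
      | otherwise         = go (S.insert x seen) xs

-- No duplicate found means the list has no repeated elements.
noDuplicates :: Ord a => [a] -> Bool
noDuplicates = isNothing . firstDuplicate
```

Because the traversal stops at the first repeat, noDuplicates (1 : [1,2..]) still terminates on the infinite list and returns False.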

Haskell function that outputs all combinations within the input list that add to the input number

I want to write a function in haskell that takes a list of integers and an integer value as input and outputs a list of all the lists that contain combinations of elements that add up to the input integer.
For example:
myFunc [3,7,5,9,13,17] 30 = [[13,17],[3,5,9,13]]
Attempt:
myFunc :: [Integer] -> Integer -> [[Integer]]
myFunc list sm = case list of
  [] -> []
  [x]
    | x == sm -> [x]
    | otherwise -> []
  (x : xs)
    | x + myFunc xs == sm -> [x] ++ myFunc[xs]
    | otherwise -> myFunc xs
My code produces just one combination and that combination must be consecutive, which is not what I want to achieve
Write a function to create all subsets
f [] = [[]]
f (x:xs) = f xs ++ map (x:) (f xs)
then use the filter
filter ((==30) . sum) $ f [3,7,5,9,13,17]
[[13,17],[3,5,9,13]]
As suggested by @Ingo, you can prune the list while it is generated, for example:
f :: (Num a, Ord a) => [a] -> [[a]]
f [] = [[]]
f (x:xs) = f xs ++ (filter ((<=30) . sum) $ map (x:) $ f xs)
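should work faster than generating all 2^N subsets. Note that this pruning is only valid when all elements are non-negative, and the result still has to be filtered on the exact sum.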
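The pruning idea can be sketched with the target passed as a parameter instead of the hard-coded 30. The names subsetsUpTo and sumsTo are hypothetical, and the sketch assumes that all input numbers are non-negative, since otherwise an over-budget partial subset could still be completed to the target.

```haskell
-- All subsets whose sum does not exceed the limit.
-- Assumes non-negative elements, so an over-budget subset can be dropped early.
subsetsUpTo :: (Num a, Ord a) => a -> [a] -> [[a]]
subsetsUpTo _ [] = [[]]
subsetsUpTo limit (x:xs) = rest ++ filter ((<= limit) . sum) (map (x:) rest)
  where
    rest = subsetsUpTo limit xs  -- computed once and shared

-- Keep only the subsets that hit the target exactly.
sumsTo :: (Num a, Ord a) => a -> [a] -> [[a]]
sumsTo target = filter ((== target) . sum) . subsetsUpTo target
```

Binding rest in a where clause also avoids computing the recursive call twice, which the original f xs ++ ... f xs form does.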
You can use subsequences from Data.List to give you every possible combination of values, then filter based on your requirement that they add to 30.
myFunc :: [Integer] -> Integer -> [[Integer]]
myFunc list sm =
  filter (\x -> sum x == sm) $ subsequences list
An alternative would be to use a right fold:
fun :: (Foldable t, Num a, Eq a) => t a -> a -> [[a]]
fun = foldr go $ \a -> if a == 0 then [[]] else []
  where go x f a = f a ++ ((x:) <$> f (a - x))
then,
\> fun [3,7,5,9,13,17] 30
[[13,17],[3,5,9,13]]
\> fun [3,7,5,9,13,17] 12
[[7,5],[3,9]]
An advantage of this approach is that it does not create any lists unless it adds up to the desired value.
Whereas, an approach based on filtering, will create all the possible sub-sequence lists only to drop most of them during filtering step.
Here is an alternate solution idea: Generate a list of lists that sum up to the target number, i.e.:
[30]
[29,1]
[28,2]
[28,1,1]
...
and only then filter the ones that could be built from your given list.
Pro: could be much faster, especially if your input list is long and your target number comparatively small, such that the list of lists of summands is much smaller than the list of subsets of your input list.
Con: only works when 0 is not in the game.
Finally, you can do it both ways and write a function that decides which algorithm will be faster given some input list and the target number.