How to compute frequency via list comprehension? - list

count :: Eq a => a -> [a] -> Int
count n [] = 0
count n (x:xs) | n == x    = 1 + count n xs
               | otherwise = count n xs
rmdups :: Eq a => [a] -> [a]
rmdups [ ] = [ ]
rmdups (x:xs) = x : rmdups (filter(/= x) xs)
Using these two functions, I need to write a third, called frequency.
It should count how many times each distinct value in a list occurs in that list. For example, frequency "ababca" should return [(3,'a'),(2,'b'),(1,'c')].
The type signature for frequency is:
frequency :: Eq a => [a] -> [(Int, a)]
P.S. rmdups removes duplicates from a list, so rmdups "aaabc" = "abc"
and count 2 [1,2,2,2,3] = 3.
So far I have:
frequency :: Eq a => [a] -> [(Int, a)]
frequency [] = []
frequency (x:xs) = (count x:xs, x) : frequency (rmdups xs)
This is only partly there (it is wrong). Thanks.

frequency xs = map (\c -> (count c xs,c)) (rmdups xs)
or, with a list comprehension,
frequency xs = [(count c xs, c) | c <- rmdups xs]
is the shortest way to define it using your count and rmdups. If you need it sorted according to frequency (descending) as in your example,
frequency xs = sortBy (flip $ comparing fst) $ map (\c -> (count c xs,c)) (rmdups xs)
using sortBy from Data.List and comparing from Data.Ord.
If all you have is an Eq constraint, you cannot gain much efficiency, but if you only need it for types in Ord, you can get a much more efficient implementation using e.g. Data.Set or Data.Map.
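To illustrate the Ord-based route, here is a minimal sketch using Data.Map.Strict from the containers package. The names frequencyOrd and frequencyDesc are made up for this example, and unlike the rmdups version the pairs come out in ascending key order rather than in order of first appearance.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (sortBy)
import Data.Ord (comparing, Down(..))

-- Count occurrences with a Map; this needs Ord a rather than just Eq a.
-- (frequencyOrd is a hypothetical name, not part of the question.)
frequencyOrd :: Ord a => [a] -> [(Int, a)]
frequencyOrd xs =
  [ (n, x) | (x, n) <- Map.toList (Map.fromListWith (+) [ (x, 1) | x <- xs ]) ]

-- Same counts, most frequent first. sortBy is stable, so elements with
-- equal counts stay in ascending key order.
frequencyDesc :: Ord a => [a] -> [(Int, a)]
frequencyDesc = sortBy (comparing (Down . fst)) . frequencyOrd
```

With these definitions, frequencyOrd "ababca" gives [(3,'a'),(2,'b'),(1,'c')]. Building the map takes O(n log n) comparisons instead of the O(n^2) of count plus rmdups.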

Here is my own 'lazy' answer, which does not call rmdups:
frequency [] = []
frequency (y:ys) = [(count y (y:ys), y)] ++ frequency (filter (/= y) ys)

import qualified Data.Set as Set
frequency xs = map (\x -> (length $ filter (== x) xs, x)) (Set.toList $ Set.fromList xs)

Related

Standard ML: How to cycle through a list?

I am trying to write a program that cycles through a list n times.
Suppose that L = [a_1, a_2, ..., a_n].
What I am trying to achieve is [a_{i+1}, a_{i+2}, ..., a_n, a_1, a_2, ..., a_i].
I referred to a previous post about this exact problem. However, I am not sure how to obtain the output [a_{i+1}, a_{i+2}, ..., a_n, a_1, a_2, ..., a_i].
For the output, I tried:
- cycle([1,2,3,4], 5);
However, the error I get is that the operator and operand don't match.
This is the code I found from the previous post:
fun cycle n i =
  if i = 0 then n
  else cycle (tl n) (i-1) @ [hd(n)];
A way to do this using if-then-else:
fun cycle xs n =
  if n = 0
  then []
  else xs @ cycle xs (n - 1)
You might instead like to use pattern matching:
fun cycle xs 0 = []
  | cycle xs n = xs @ cycle xs (n - 1)
But the most elegant solution, I think, is using higher-order functions:
fun cycle xs n =
List.concat (List.tabulate (n, fn _ => xs))
A slightly harder task is how to write a cycle for lazy lists that cycles infinitely...
datatype 'a lazylist = Cons of 'a * (unit -> 'a lazylist) | Nil
fun fromList [] = Nil
| fromList (x::xs) = Cons (x, fn () => fromList xs)
fun take 0 _ = []
| take _ Nil = []
| take n (Cons (x, tail)) = x :: take (n - 1) (tail ())
local
fun append' (Nil, ys) = ys
| append' (Cons (x, xtail), ys) =
Cons (x, fn () => append' (xtail (), ys))
in
fun append (xs, Nil) = xs
| append (xs, ys) = append' (xs, ys)
end
fun cycle xs = ...
where take 5 (cycle (fromList [1,2])) = [1,2,1,2,1].

Remove duplicate but keep the order

rmdup :: [Int] -> [Int]
rmdup [] = []
rmdup (x:xs) | x `elem` xs = rmdup xs
             | otherwise   = x : rmdup xs
The code above removes duplicates from a list of integers, but it removes the first occurrence and keeps the second one. For instance:
rmdup [1,2,3,1,4]
will result:
[2,3,1,4]
How can I change it to keep the order and yield this: [1,2,3,4]? Note, I don't want to use built-in functions.
How about the following? It avoids the very inefficient acc ++ [x] and also avoids reversing the given list twice:
rmdup :: Eq a => [a] -> [a]
rmdup xs = rmdup' [] xs
  where
    rmdup' acc [] = []
    rmdup' acc (x:xs)
      | x `elem` acc = rmdup' acc xs
      | otherwise    = x : rmdup' (x:acc) xs
One way to achieve what you want is to pass the input list in reverse order and then reverse the result once the computation is finished. This solution is not efficient, though.
rmdup :: [Int] -> [Int]
rmdup xs = reverse $ rmdup' (reverse xs)
  where
    rmdup' [] = []
    rmdup' (x:xs) | x `elem` xs = rmdup' xs
                  | otherwise   = x : rmdup' xs
Demo:
ghci> rmdup [1,2,3,1,4]
[1,2,3,4]
If you want to ignore later occurrences of an element you have already seen, you need to record what you have seen; foldl or foldl' looks like what you are looking for.
Here is a possible implementation:
import Data.List (foldl')
rmdup :: (Eq a) => [a] -> [a]
rmdup = foldl' step []
  where
    step acc x
      | x `elem` acc = acc
      | otherwise    = acc ++ [x]
Since elem is O(n), the solutions based on using it to check each element are O(n^2).
The "standard" efficient solution to the duplicates problem is to sort the list before checking for duplicates. Here, since we need to preserve elements, we have to be a bit more careful.
import Data.List
import Data.Ord
rmdupSorted :: Eq b => [(a,b)] -> [(a,b)]
rmdupSorted (x@(_,xb):xs@((_,yb):_)) | xb == yb  = rmdupSorted xs
                                     | otherwise = x : rmdupSorted xs
rmdupSorted xs = xs -- 0 or 1 elements
rmdup :: Ord a => [a] -> [a]
rmdup = map snd . sort . rmdupSorted . sortBy (comparing snd) . zip [0..]
main = print $ rmdup [1,2,3,4,5,4,6,1,7]
Assuming that the sortBy function is a stable sort, the rmdup function will remove all the duplicate occurrences of any element except the one occurring last. If sortBy is not stable, then rmdup will remove all the occurrences except an unspecified one (e.g., rmdup [1,2,1] could return [1,2] instead of [2,1]).
Complexity is now O(n log n).
We now need to rewrite the above without library functions, as the OP requested. I will leave this as an exercise to the reader. :-P
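For comparison, here is a minimal sketch of a different O(n log n) approach that keeps the first occurrence of each element. It tracks already-seen values in a Data.Set from the containers package, so it does not satisfy the "no library functions" request and is not the sort-based method above. The name rmdupOrd is made up for this example.

```haskell
import qualified Data.Set as Set

-- Keep the first occurrence of each element, preserving the input order.
-- The Set records which values were already emitted; each membership
-- test and insert costs O(log n). (rmdupOrd is a hypothetical name.)
rmdupOrd :: Ord a => [a] -> [a]
rmdupOrd = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

With this definition, rmdupOrd [1,2,3,1,4] gives [1,2,3,4]. Since the result is produced lazily as the input is consumed, it also works on a prefix of an infinite list.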

Creating tuples variations from a list - Haskell

I am a relative Haskell newbie and am trying to create a list of tuples from a single list, using a function I named splits, like this:
splits [1..4] --> [ ([1],[2,3,4]), ([1,2],[3,4]), ([1,2,3],[4]) ]
or
splits "xyz" --> [ ("x","yz"), ("xy","z") ]
Creating a list of tuples that take 1, then 2, then 3 elements, etc. I figured out I should probably use the take/drop functions, but this is what I have so far and I'm running into a lot of type declaration errors... Any ideas?
splits :: (Num a) => [a] -> [([a], [a])]
splits [] = error "shortList"
splits [x]
| length [x] <= 1 = error "shortList"
| otherwise = splits' [x] 1
where splits' [x] n = [(take n [x], drop n [x])] + splits' [x] (n+1)
The Haskell-y approach is to use the inits and tails functions from Data.List:
inits [1,2,3,4] = [ [], [1], [1,2], [1,2,3], [1,2,3,4] ]
tails [1,2,3,4] = [ [1,2,3,4], [2,3,4], [3,4], [4], [] ]
We then just zip these two lists together and drop the first pair:
splits xs = tail $ zip (inits xs) (tails xs)
or equivalently, drop the first element of each of the constituent lists first:
splits xs = zip (tail (inits xs)) (tail (tails xs))
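Note that zipping inits with tails also produces a final pair whose second component is empty, e.g. ([1,2,3,4],[]), which the examples in the question leave out. A minimal sketch that keeps only the pairs where both halves are non-empty follows; splitsNonEmpty is a name made up for this example.

```haskell
import Data.List (inits, tails)

-- Pair every prefix with the matching suffix, then keep only the pairs
-- in which both the prefix and the suffix are non-empty.
-- (splitsNonEmpty is a hypothetical name.)
splitsNonEmpty :: [a] -> [([a], [a])]
splitsNonEmpty xs =
  [ (ys, zs)
  | (ys, zs) <- zip (inits xs) (tails xs)
  , not (null ys)
  , not (null zs)
  ]
```

With this definition, splitsNonEmpty [1,2,3,4] gives [([1],[2,3,4]),([1,2],[3,4]),([1,2,3],[4])], and lists with fewer than two elements give [] instead of an error.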
splits [] = []
splits [_] = []
splits (x:xs) = ([x], xs) : map (\(ys, zs) -> (x:ys, zs)) (splits xs)
You have several mistakes:
You don't need the Num a constraint on a.
Use [] or [x] only as patterns for the empty and the one-element list; to bind a whole list to a variable, use xs instead.
Use ++ instead of + to concatenate lists.
In this case, use (:) to prepend a value to a list instead of ++.
Add a stopping condition for the recursion, such as an additional argument maxn to splits'.
splits :: [a] -> [([a], [a])]
splits [] = error "shortList"
splits xs
  | lxs <= 1 = error "shortList"
  | otherwise = splits' xs 1 lxs
  where
    lxs = length xs
    splits' xs n maxn
      | n >= maxn = [] -- stop before the split whose second half is empty
      | otherwise = (take n xs, drop n xs) : splits' xs (n+1) maxn
There is a built in function that kind of does a part of what you want:
splitAt :: Int -> [a] -> ([a], [a])
which does what it looks like it would do:
> splitAt 2 [1..4]
([1,2],[3,4])
Using this function, you can just define splits like this:
splits xs = map (flip splitAt xs) [1..length xs - 1]

How do I take the last n elements of a list

To obtain the last n elements of a list xs, I can use reverse (take n (reverse xs)), but that is not very good code (it keeps the complete list in memory before returning anything, and the result is not shared with the original list).
How do I implement this lastR function in Haskell?
This has the property of traversing the list only once: n steps for drop n, and the remaining length xs - n steps for zipLeftover.
zipLeftover :: [a] -> [a] -> [a]
zipLeftover [] [] = []
zipLeftover xs [] = xs
zipLeftover [] ys = ys
zipLeftover (x:xs) (y:ys) = zipLeftover xs ys
lastN :: Int -> [a] -> [a]
lastN n xs = zipLeftover (drop n xs) xs
Here is an alternative that is shorter and perhaps better since, as Satvik pointed out, it is often better to use recursion operators than explicit recursion.
import Data.Foldable
takeLeftover :: [a] -> t -> [a]
takeLeftover [] _ = []
takeLeftover (x:xss) _ = xss
lastN' :: Int -> [a] -> [a]
lastN' n xs = foldl' takeLeftover xs (drop n xs)
Also note Will Ness's comment below that takeLeftover is just:
takeLeftover == const . drop 1
Which makes things rather tidy:
lastN' :: Int -> [a] -> [a]
lastN' n xs = foldl' (const . drop 1) xs (drop n xs)
-- or
-- lastN' n = foldl' (const . drop 1) <*> drop n
From what I can tell you can use something like
lastN :: Int -> [a] -> [a]
lastN n xs = drop (length xs - n) xs
But with any implementation on the built-in list you cannot do better than O(length of list - n).
It looks like you are trying to use a list for something it was not designed to do efficiently.
Use Data.Sequence or some other list-like structure that allows operations at the end to be performed efficiently.
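As a rough sketch of the Data.Sequence suggestion (from the containers package), assuming the data can be kept in a Seq from the start rather than converted from a list each time. The name lastNSeq is made up for this example.

```haskell
import qualified Data.Sequence as Seq
import Data.Sequence (Seq)

-- Last n elements of a Seq. Seq.length is O(1), and Seq.drop splits the
-- underlying finger tree in logarithmic time instead of walking it.
-- (lastNSeq is a hypothetical name.)
lastNSeq :: Int -> Seq a -> Seq a
lastNSeq n s = Seq.drop (Seq.length s - n) s
```

With this definition, lastNSeq 2 (Seq.fromList [1,2,3,4,5]) gives fromList [4,5]. When n exceeds the length, the argument to Seq.drop is negative and the whole sequence is returned. Note that Seq.fromList itself is O(n), so this only pays off if the sequence is reused.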
Edit:
Davorak's implementation looks to be the most efficient one you can get with the built-in list. But remember that there are considerations other than the running time of a single function, such as whether it fuses well with other functions.
Daniel's solution uses built-in functions, has the same complexity as Davorak's, and I think has a better chance of fusing with other functions.
Not sure whether it's terribly fast, but it's easy:
import Data.List (tails)
lastR n xs = snd $ head $ dropWhile (not . null . fst) $ zip (tails $ drop n xs) (tails xs)
Be aware that whatever you do, you are going to need to iterate through the entire list. That said, you can do a bit better than reverse (take n (reverse xs)) by computing the length of the list first, and dropping the appropriate number of elements:
lastN :: Int -> [a] -> [a]
lastN n xs = let m = length xs in drop (m-n) xs
Here's a simplification of Davorak's first solution:
-- dropLength bs = drop (length bs)
dropLength :: [b] -> [a] -> [a]
dropLength [] as = as
dropLength _ [] = []
dropLength (_ : bs) (_ : as) = dropLength bs as
lastR :: Int -> [a] -> [a]
lastR n as = dropLength (drop n as) as
When n <= length as, length (drop n as) = length as - n, so dropLength (drop n as) as = drop (length (drop n as)) as = drop (length as - n) as, which is the last n elements of as. When n > length as, dropLength (drop n as) as = dropLength [] as = as, which is the only sensible answer.
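The argument above can be spot-checked on small inputs. This is only a sketch: it repeats the definitions of dropLength and lastR so that it stands alone, and lastRef and agreeOnSmallInputs are names made up for this example.

```haskell
-- Single-pass definitions, repeated from the text.
dropLength :: [b] -> [a] -> [a]
dropLength [] as = as
dropLength _ [] = []
dropLength (_ : bs) (_ : as) = dropLength bs as

lastR :: Int -> [a] -> [a]
lastR n as = dropLength (drop n as) as

-- Reference definition used in the argument: drop (length as - n) as.
-- (lastRef is a hypothetical name.)
lastRef :: Int -> [a] -> [a]
lastRef n as = drop (length as - n) as

-- Compare both definitions for n in 0..10 on the lists [1..len], len in 0..8.
-- This covers both the n <= length and the n > length case.
agreeOnSmallInputs :: Bool
agreeOnSmallInputs =
  and [ lastR n xs == lastRef n xs
      | len <- [0 .. 8 :: Int]
      , let xs = [1 .. len]
      , n <- [0 .. 10]
      ]
```

Evaluating agreeOnSmallInputs in GHCi gives True. This is a finite check for non-negative n, not a proof.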
If you want to use a fold, you can write
dropLength :: [b] -> [a] -> [a]
dropLength = foldr go id
  where
    go _b _r [] = []
    go _b r (_a : as) = r as
That won't make any difference for lastR, but in other applications it could win you some list fusion.
The simple solution is not that bad. The algorithm is O(n) anyway.
takeLastN n = reverse . take n . reverse
Time comparison:
> length $ lastN 3000000 (replicate 10000000 "H") -- Davorak's solution #1
3000000
(0.88 secs, 560,065,232 bytes)
> length $ lastN' 3000000 (replicate 10000000 "H") -- Davorak's solution #2
3000000
(1.82 secs, 840,065,096 bytes)
> length $ lastN'' 3000000 (replicate 10000000 "H") -- Chris Taylor's solution
3000000
(0.50 secs, 560,067,680 bytes)
> length $ takeLastN 3000000 (replicate 10000000 "H") -- Simple solution
3000000
(0.81 secs, 1,040,064,928 bytes)
As Joachim Breitner pointed out in the question and in the comments, there is still a memory issue. While not much slower than the others, this solution requires almost twice as much memory, as the benchmarks show.
takeLast :: Int -> [a] -> [a]
takeLast n xs
  | n < 1 = []
  | otherwise = let s = splitAt n xs in bla (fst s) (snd s)
  where
    bla xs [] = xs
    bla (x:xs) (y:ys) = bla (xs ++ [y]) ys

Sum of elements in a list at the same position

How can I sum the elements at the same position across several lists?
For example:
[[2,3,4],[5,6,7],[8,9,10]]=[15,18,21]
Thanks
Try:
sumIn :: Num a => [[a]] -> [a]
sumIn = foldl (zipWith (+)) (repeat 0)
Note that if the argument is an empty list, the result is an infinite list of zeros. So you may want to treat this case separately, for example
sumIn :: Num a => [[a]] -> [a]
sumIn [] = []
sumIn xs = foldl (zipWith (+)) (repeat 0) xs
Here's an example in GHCi:
λ> let xs = [[2,3,4],[5,6,7],[8,9,10]]
λ> foldr1 (zipWith (+)) xs
[15,18,21]
You could transpose the list, and sum each list in the result:
ghci> import Data.List (transpose)
ghci> map sum $ transpose [[2,3,4],[5,6,7],[8,9,10]]
[15,18,21]
Unlike the other solutions, this works for lists of non-uniform length.
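To make the difference concrete, here is a small sketch comparing the two approaches on rows of unequal length. The names sumColumns and sumColumnsZip are made up for this example.

```haskell
import Data.List (transpose)

-- Column sums via transpose: a short row contributes nothing to the
-- columns past its end. (sumColumns is a hypothetical name.)
sumColumns :: Num a => [[a]] -> [a]
sumColumns = map sum . transpose

-- Column sums via zipWith: the result is cut to the shortest row.
-- The empty case is handled separately because foldr1 fails on [].
sumColumnsZip :: Num a => [[a]] -> [a]
sumColumnsZip [] = []
sumColumnsZip xs = foldr1 (zipWith (+)) xs
```

With these definitions, sumColumns [[1,2,3],[4,5],[6]] gives [11,7,3], whereas sumColumnsZip [[1,2,3],[4,5],[6]] gives [11]. On the rectangular input from the question both give [15,18,21].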