Haskell list length alternative

Hi, I've got a list in Haskell with close to 10^15 Ints in it and I'm trying to print the length of the list.
let list1 = [1..1000000000000000] -- this is just a dummy list; I don't know the actual number of elements
print (length list1)
Printing this takes a very long time. Is there another way to get the number of elements in the list and print that number?

I've occasionally gotten some value out of lists that carry their length. The poor man's version goes like this:
import Data.Monoid
type ListLength a = (Sum Integer, [a])
singletonLL :: a -> ListLength a
singletonLL x = (1, [x])
lengthLL :: ListLength a -> Integer
lengthLL (Sum len, _) = len
The Monoid instance that comes for free gives you empty lists, concatenation, and a fromList-alike. Other standard Prelude functions that operate on lists like map, take, drop aren't too hard to mimic, though you'll need to skip the ones like cycle and repeat that produce infinite lists, and filter and the like are a bit expensive. For your question, you would also want analogs of the Enum methods; e.g. perhaps something like:
enumFromToLL :: Integral a => a -> a -> ListLength a
enumFromToLL lo hi = (fromIntegral hi-fromIntegral lo+1, [lo..hi])
Then, in ghci, your example is instant:
> lengthLL (enumFromToLL 1 1000000000000000)
1000000000000000
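As a rough sketch of how the other list functions mentioned above could be mimicked, the length is simply carried along and adjusted next to the list. The names fromListLL, mapLL and takeLL are made up for illustration and do not come from any library.

```haskell
import Data.List (genericTake)
import Data.Monoid (Sum (..))

-- A list paired with its length, as in the answer above.
type ListLength a = (Sum Integer, [a])

-- Hypothetical helper: build a length-carrying list from an ordinary one.
-- This walks the whole list once, so it only suits lists of modest size.
fromListLL :: [a] -> ListLength a
fromListLL = foldMap (\x -> (Sum 1, [x]))

lengthLL :: ListLength a -> Integer
lengthLL (Sum len, _) = len

-- Mapping never changes the length, so the count is reused as-is.
mapLL :: (a -> b) -> ListLength a -> ListLength b
mapLL f (len, xs) = (len, map f xs)

-- Taking n elements yields min n len elements (and none for negative n).
takeLL :: Integer -> ListLength a -> ListLength a
takeLL n (Sum len, xs) = (Sum (max 0 (min n len)), genericTake n xs)
```

Concatenation needs no extra code: fromListLL "abc" <> fromListLL "de" uses the Monoid instance for pairs, which adds the lengths and appends the lists.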

Related

Calculating the difference between two strings

I have two lists of strings
a :: [String]
a = ["A1","A2","B3","C3"]
and
b :: [String]
b = ["A1","B2","B3","D5"]
And I want to calculate the difference between the two lists by comparing them element by element, where each element combines a first and a second character.
If the two elements at the same position are identical, that counts as 1.
The function I declared is
calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys) = (a,b)
where
a = 0 in
???
b = ????
I know that I need some kind of counter for the matching elements, but where should I put it? For now I have no idea how to do that. Can anyone give me a hint?
The desired result would be
(2,["B2","D5"])
How should I do that?
I assume that the lists have the same size.
The differences between the two lists
Let's focus on the main part of the problem:
Prelude> a=["A1","A2","B3","C3"]
Prelude> b=["A1","B2","B3","D5"]
First, notice that the zip function zips two lists. If you use it on a and b, you get:
Prelude> zip a b
[("A1","A1"),("A2","B2"),("B3","B3"),("C3","D5")]
Ok. It's now time to compare the terms one to one. There are many ways to do it.
Filter
Prelude> filter(\(x,y)->x/=y)(zip a b)
[("A2","B2"),("C3","D5")]
The lambda function returns True if the elements of the pair are different (/= operator). Thus, the filter keeps only the pairs that don't match.
That works, but you have to do a little more work to keep only the second element of each pair.
Prelude> map(snd)(filter(\(x,y)->x/=y)(zip a b))
["B2","D5"]
map(snd) applies snd, which keeps only the second element of a pair, to every discordant pair.
Fold
A fold is more generic, and may be used to implement a filter. Let's see how:
Prelude> foldl(\l(x,y)->if x==y then l else l++[y])[](zip a b)
["B2","D5"]
The lambda function takes every pair (x,y) and compares the two elements. If they have the same value, the accumulator list remains unchanged, but if the values are different, the second element is appended to the accumulator list.
List comprehension
This is more compact, and should seem obvious to every Python programmer:
Prelude> [y|(x,y)<-zip a b, x/=y] -- in Python [y for (x,y) in zip(a,b) if x!= y]
["B2","D5"]
The number of elements
You want a pair with the number of elements and the elements themselves.
Fold
With a fold, it's easy but cumbersome: you will use a slightly more complicated accumulator, that stores simultaneously the differences (l) and the number of those differences (n).
Prelude> foldl(\(n,l)(x,y)->if x==y then (n,l) else (n+1,l++[y]))(0,[])$zip a b
(2,["B2","D5"])
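One caveat with the fold above is that l++[y] copies the accumulator on every mismatch, which can become quadratic on long lists. A minimal sketch of a common workaround, assuming a strict left fold is acceptable, is to prepend to the accumulator and reverse once at the end. The name countDiffs is made up for this example.

```haskell
import Data.List (foldl')

-- Count the positions where the two lists differ and collect the
-- differing elements of the second list, in their original order.
countDiffs :: Eq a => [a] -> [a] -> (Int, [a])
countDiffs as bs = finish (foldl' step (0, []) (zip as bs))
  where
    step (n, acc) (x, y)
      | x == y = (n, acc)
      | otherwise = (n + 1, y : acc) -- prepend: constant time per step
    finish (n, acc) = (n, reverse acc) -- restore the original order once
```

With the lists from the question, countDiffs a b evaluates to (2,["B2","D5"]), the same result as the fold above.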
Lambda
But you can use the fact that your output is redundant: you want a list preceded by the length of that list. Why not apply a lambda that does the job?
Prelude> (\x->(length x,x))[1,2,3]
(3,[1,2,3])
With a list comprehension, it gives:
Prelude> (\x->(length x,x))[y|(x,y)<-zip a b, x/=y]
(2,["B2","D5"])
Bind operator
Finally, and for the fun, you don't need to build the lambda this way. You could do:
Prelude> ((,)=<<length)[y|(x,y)<-zip a b,x/=y]
(2,["B2","D5"])
What happens here? (,) is the pair constructor, a function that makes a pair from two elements:
Prelude> (,) 1 2
(1,2)
and ((,)=<<length) works as follows: 1. it takes a list (technically a Foldable) and passes it to the length function; 2. =<< (the "bind" operator, used here in the function monad) then passes both the length and the original list to (,), hence the expected result.
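To make the bind trick less mysterious, here is a small sketch that puts three equivalent spellings side by side. For functions, f =<< g means \x -> f (g x) x, which is all that is going on. The function names are invented for this example.

```haskell
-- The version from the answer: (=<<) in the function monad.
withLength :: [a] -> (Int, [a])
withLength = (,) =<< length

-- The same thing written out by hand.
withLengthExplicit :: [a] -> (Int, [a])
withLengthExplicit xs = (length xs, xs)

-- An applicative spelling that some readers find easier to follow.
withLengthApplicative :: [a] -> (Int, [a])
withLengthApplicative = (,) <$> length <*> id
```

All three return (3,[1,2,3]) for the input [1,2,3]; which one to use is mostly a matter of taste and of who will read the code.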
Partial conclusion
"There is more than one way to do it" (but it's not Perl!)
Haskell offers a lot of built-in functions and operators to handle this kind of basic manipulation.
What about doing it recursively? If two elements are the same, the first element of the resulting tuple is incremented; otherwise, the mismatched element is prepended to the second element of the resulting tuple:
calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys)
  | x == y    = increment (calcP xs ys)
  | otherwise = append y (calcP xs ys)
  where
    increment (count, results) = (count + 1, results)
    append y (count, results) = (count, y:results)
calcP [] x = (0, x)
calcP x [] = (0, [])
a = ["A1","A2","B3","C3"]
b = ["A1","B2","B3","D5"]
main = print $ calcP a b
The printed result is (2,["B2","D5"])
Note that
calcP [] x = (0, x)
calcP x [] = (0, [])
are needed to provide exhaustiveness for the pattern matching. In other words, you need to provide the case when one of the passed elements is an empty list. This also provides the following logic:
If the first list is longer than the second one by n elements, these last n elements are ignored.
If the second list is longer than the first one by n elements, these last n elements are appended to the second element of the resulting tuple.
I'd like to propose a very different method than the other folks: namely, compute a "summary statistic" for each pairing of elements between the two lists, and then combine the summaries into your desired result.
First some imports.
import Data.Monoid
import Data.Foldable
For us, the summary statistic is how many matches there are, together with the list of mismatches from the second argument:
type Statistic = (Sum Int, [String])
I've used Sum Int instead of Int to specify how statistics should be combined. (Other options here include Product Int, which would multiply together the values instead of adding them.) We can compute the summary of a single pairing quite simply:
summary :: String -> String -> Statistic
summary a b | a == b    = (1, [ ])
            | otherwise = (0, [b])
Combining the summaries for all the elements is just a fold:
calcP :: [String] -> [String] -> Statistic
calcP as bs = fold (zipWith summary as bs)
In ghci:
> calcP ["A1", "A2", "B3", "C3"] ["A1", "B2", "B3", "D5"]
(Sum {getSum = 2},["B2","D5"])
This general pattern (of processing elements one at a time into a Monoidal type) is frequently useful, and spotting where it's applicable can greatly simplify your code.
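If the Sum wrapper in the result is unwanted, it can be stripped at the very end. The following is only a sketch of that idea; calcPlain is a name chosen here for illustration, and mconcat is used in place of fold purely to avoid the extra import.

```haskell
import Data.Monoid (Sum (..))

type Statistic = (Sum Int, [String])

-- One match contributes a count of 1; one mismatch contributes the
-- element taken from the second list.
summary :: String -> String -> Statistic
summary a b
  | a == b = (Sum 1, [])
  | otherwise = (Sum 0, [b])

-- Same combination as above, but the Sum wrapper is removed before returning.
calcPlain :: [String] -> [String] -> (Int, [String])
calcPlain as bs = (getSum matches, mismatches)
  where
    (matches, mismatches) = mconcat (zipWith summary as bs)
```

For the lists in the question this yields (2,["B2","D5"]), which is the exact shape the asker wanted.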

Haskell get a filtered List of integers

Scenario:
Given a list of integers, I want to get back a list of integers whose total does not exceed 10.
I am a beginner in Haskell and tried the code below. If anyone could correct me, it would be greatly appreciated.
numbers :: [Int]
numbers = [1,2,3,4,5,6,7,8,9,10, 11, 12]
getUpTo :: [Int] -> Int -> [Int]
getUpTo (x:xs) max =
if max <= 10
then
max = max + x
getUpTo xs max
else
x
Input
getUpTo numbers 0
Output Expected
[1,2,3,4]
BEWARE: This is not a solution to the knapsack problem :)
A very fast solution I came up with is the following one. Of course solving the full knapsack problem would be harder, but if you only need a quick solution this should work:
import Data.List (sort)
getUpTo :: Int -> [Int] -> [Int]
getUpTo max xs = go (sort xs) 0 []
  where
    -- acc collects the chosen items in reverse, so reverse it on the way out
    go [] sum acc = reverse acc
    go (x:xs) sum acc
      | x + sum <= max = go xs (x + sum) (x:acc)
      | otherwise      = reverse acc
By sorting the list first, I can take items from the front one after another until the maximum would be exceeded; the list built up to that point is then returned (reversed, because the accumulator collects the items back to front).
edit: as a side note, I swapped the order of the first two arguments because this way should be more useful for partial applications.
For educational purposes (and since I felt like explaining something :-), here's a different version, which uses more standard functions. As written it is slower, because it computes a number of sums, and doesn't keep a running total. On the other hand, I think it expresses quite well how to break the problem down.
import qualified Data.List
getUpTo :: [Int] -> [Int]
getUpTo = last . filter (\xs -> sum xs <= 10) . Data.List.inits
I've written the solution as a 'pipeline' of functions; if you apply getUpTo to a list of numbers, Data.List.inits gets applied to the list first, then filter (\xs -> sum xs <= 10) gets applied to the result, and finally last gets applied to the result of that.
So, let's see what each of those three functions do. First off, Data.List.inits returns the initial segments of a list, in increasing order of length. For example, Data.List.inits [2,3,4,5,6] returns [[],[2],[2,3],[2,3,4],[2,3,4,5],[2,3,4,5,6]]. As you can see, this is a list of lists of integers.
Next up, filter (\xs -> sum xs <= 10) goes through these lists of integers in order, keeping them if their sum is at most 10, and discarding them otherwise. The first argument of filter is a predicate which, given a list xs, returns True if the sum of xs is at most 10. This may be a bit confusing at first, so an example with a simpler predicate is in order, I think. filter even [1,2,3,4,5,6,7] returns [2,4,6] because those are the even values in the original list. In the earlier example, the lists [], [2], [2,3], and [2,3,4] all have a sum of at most 10, but [2,3,4,5] and [2,3,4,5,6] don't, so the result of filter (\xs -> sum xs <= 10) . Data.List.inits applied to [2,3,4,5,6] is [[],[2],[2,3],[2,3,4]], again a list of lists of integers.
The last step is the easiest: we just return the last element of the list of lists of integers. This is in principle unsafe, because what should the last element of an empty list be? In our case, we are good to go, since inits always returns the empty list [] first, which has sum 0, which is at most ten - so there's always at least one element in the list of lists we're taking the last element of. We apply last to a list which contains the initial segments of the original list which sum to at most 10, ordered by length. In other words: we return the longest initial segment which sums to at most 10 - which is what you wanted!
If there are negative numbers in your numbers list, this way of doing things can return something you don't expect: getUpTo [10,4,-5,20] returns [10,4,-5] because that is the longest initial segment of [10,4,-5,20] which sums to at most 10, even though [10,4] is above 10. If this is not the behaviour you want, and you expect [10], then you must replace filter by takeWhile, which essentially stops the filtering as soon as the first element for which the predicate returns False is encountered. E.g. takeWhile even [2,4,1,3,6,8,5,7] evaluates to [2,4]. So in our case, using takeWhile stops the moment the sum goes over 10, not trying longer segments.
By writing getUpTo as a composition of functions, it becomes easy to change parts of your algorithm: if you want the longest initial segment that sums exactly to 10, you can use last . filter (\xs -> sum xs == 10) . Data.List.inits. Or if you want to look at the tail segments instead, use head . filter (\xs -> sum xs <= 10) . Data.List.tails; or to take all the possible sublists into account (i.e. an inefficient knapsack solution!): last . filter (\xs -> sum xs <= 10) . Data.List.sortBy (\xs ys -> length xs `compare` length ys) . Control.Monad.filterM (const [False,True]) - but I'm not going to explain that here, I've been rambling long enough!
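For reference, the takeWhile variant described above might look like the sketch below, together with a second version that keeps a running total instead of re-summing every prefix. Both function names and the explicit limit parameter are assumptions made for this example; the original answer hard-codes 10.

```haskell
import Data.List (inits)

-- takeWhile variant: stop at the first prefix whose sum exceeds the limit.
-- Assumes limit >= 0; otherwise even the empty prefix fails and last errors.
getUpToPrefix :: Int -> [Int] -> [Int]
getUpToPrefix limit = last . takeWhile (\xs -> sum xs <= limit) . inits

-- Same result for limit >= 0, but each element is paired with the running
-- total up to and including it (scanl1), so no prefix is summed twice.
getUpToRunning :: Int -> [Int] -> [Int]
getUpToRunning limit xs =
  map fst (takeWhile ((<= limit) . snd) (zip xs (scanl1 (+) xs)))
```

With a limit of 10, both return [1,2,3,4] for [1..12] and [10] for [10,4,-5,20].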
There is an answer with a fast version; however, I thought it might also be instructive to see the minimal change necessary to your code to make it work the way you expect.
numbers :: [Int]
numbers = [1,2,3,4,5,6,7,8,9,10, 11, 12]
getUpTo :: [Int] -> Int -> [Int]
getUpTo [] _ = [] -- added base case: nothing left to take
getUpTo (x:xs) max =
  if max < 10 -- (<), not (<=)
    then
      -- return a list that still contains x;
      -- can't reassign to max, but can send a
      -- different value on to the next
      -- iteration of getUpTo
      x : getUpTo xs (max + x)
    else
      [] -- don't want to return any more values here
I am fairly new to Haskell. I just started with it a few hours ago, and as such I see in every question a challenge that helps me get out of the imperative way of thinking and an opportunity to practice my recursive thinking :)
I gave some thought to the question and I came up with this, perhaps, naive solution:
upToBound :: (Integral a) => [a] -> a -> [a]
upToBound (x:xs) bound =
  let
    summation _ [] = []
    summation n (m:ms)
      | n + m <= bound = m:summation (n + m) ms
      | otherwise = []
  in
    summation 0 (x:xs)
I know there is already a better answer, I just did it for the fun of it.
Note that I changed the signature of the original function: I thought it was pointless to pass an initial zero to the outer function, since the running total can only start at zero. So my implementation hides the seed from the caller and takes the maximum bound instead, which is more likely to change.
upToBound [1,2,3,4,5,6,7,8,9,0] 10
Which outputs: [1,2,3,4]

Efficiently find indices of maxima of a list

Edit: I must not have worded it clearly enough, but I'm looking for a function like the one below, but not exactly it.
Given a list, I wanted to be able to find the index of the largest element in the list
(So, list !! (indexOfMaximum list) == maximum list)
I wrote some code that seems pretty efficient, although I feel I'm reinventing the wheel somewhere.
indexOfMaximum :: (Ord n, Num n) => [n] -> Int
indexOfMaximum list =
  let indexOfMaximum' :: (Ord n, Num n) => [n] -> Int -> n -> Int -> Int
      indexOfMaximum' list' currIndex highestVal highestIndex
        | null list' = highestIndex
        | (head list') > highestVal =
            indexOfMaximum' (tail list') (1 + currIndex) (head list') currIndex
        | otherwise =
            indexOfMaximum' (tail list') (1 + currIndex) highestVal highestIndex
  in indexOfMaximum' list 0 0 0
Now I want to return a list of the indices of the largest n numbers in the list.
My only solution is to store the top n elements in a list and replace (head list') > highestVal with a comparison across the n-largest-so-far list.
It feels like there has to be a more efficient way than to do this, and I also feel I'm making insufficient use of Prelude and Data.List. Any suggestions?
This solution associates each element with its index, sorts the list, so the smallest element is first, reverses it so the largest element is first, takes the first n elements, and then extracts the index.
maxn n xs = map snd . take n . reverse . sort $ zip xs [0..]
The shortest way finds the last index of a maximum element,
maxIndex list = snd . maximum $ zip list [0 .. ]
If you want the first index,
maxIndex list = snd . maximumBy cmp $ zip list [0 .. ]
  where
    cmp (v,i) (w,j) = case compare v w of
      EQ -> compare j i
      ne -> ne
The downside is that maximum and maximumBy are too lazy, so these may build large thunks. To avoid that, either use manual recursion (like you did, though some strictness annotations may be necessary) or use a strict left fold with a strict accumulator type. Tuples are not good for that, because foldl' only evaluates to weak head normal form, that is, to the outermost constructor here, and thus you build thunks in the tuple components.
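A minimal sketch of the "strict left fold with a strict accumulator type" suggestion could look like this. The Acc type and the Maybe result are choices made for this example, not something the answer prescribes; the strict fields (!) are what keep thunks from piling up inside the accumulator.

```haskell
import Data.List (foldl')

-- Strict accumulator: next index, index of the best value so far, best value.
data Acc a = Acc !Int !Int !a

-- Index of the first occurrence of the maximum, or Nothing for an empty list.
indexOfMaximum :: Ord a => [a] -> Maybe Int
indexOfMaximum [] = Nothing
indexOfMaximum (x : xs) = Just best
  where
    Acc _ best _ = foldl' step (Acc 1 0 x) xs
    step (Acc i bestIx bestVal) y
      | y > bestVal = Acc (i + 1) i y -- strictly greater: keeps the first maximum
      | otherwise = Acc (i + 1) bestIx bestVal
```

Seeding the fold with the first element also avoids the initial value of 0 used in the question, which would give a wrong index for a list of negative numbers.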
Well, a simple way would be to use maximum to find the largest element and then use findIndices to find each occurrence of it. Something like:
largestIndices :: Ord a => [a] -> [Int]
largestIndices ls = findIndices (== maximum ls) ls
However, this is not perfect because maximum is a partial function and will barf horribly if given an empty list. You can easily avoid this by adding a [] case:
largestIndices :: Ord a => [a] -> [Int]
largestIndices [] = []
largestIndices ls = findIndices (== maximum ls) ls
The real trick to this answer is how I figured it out. I didn't even know about findIndices before now! However, GHCi has a neat command called :browse.
Prelude> :browse Data.List
This lists every single function exported by Data.List. Using this, I just searched first for maximum and then for index to see what the options were. And, right by findIndex, there was findIndices, which was perfect.
Finally, I would not worry about efficiency unless you actually see that code is running slowly. GHC can--and does--perform some very aggressive optimizations because the language is pure and it can get away with it. So the only time you need to worry about performance is when--after compiling with -O2--you see that it's a problem.
EDIT: If you want to find the n top elements' indices, here's an easy idea: sort the list in descending order, grab the first n unique elements, get their indices with elemIndices and take the first n indices from that. I hope this is relatively clear.
Here's a quick version of my idea:
nLargestIndices n ls = take n $ concatMap (`elemIndices` ls) nums
  where nums = take n . reverse . nub $ sort ls
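Another way to get the indices of the n largest elements, sketched here under the assumption that sorting the whole list is acceptable, is to pair each element with its index and sort in descending order of value using Down. The name topIndices is made up for this example.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- Indices of the n largest elements, largest first. sortBy is stable, so
-- equal values keep their original left-to-right order.
topIndices :: Ord a => Int -> [a] -> [Int]
topIndices n xs =
  map snd (take n (sortBy (comparing (Down . fst)) (zip xs [0 ..])))
```

For example, topIndices 3 [3,1,4,1,5,9,2,6] gives [5,7,4], the positions of 9, 6 and 5.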

Filter list of lists

I'm very new to Haskell and am only starting to learn it.
I'm using the "Learn You a Haskell for Great Good!" tutorial to get started, and saw this example of solving the "3n+1" problem:
chain :: (Integral a) => a -> [a]
chain 1 = [1]
chain n
  | even n = n:chain (n `div` 2)
  | odd n = n:chain (n*3 + 1)
numLongChains :: Int
numLongChains = length (filter isLong (map chain [1..100]))
  where isLong xs = length xs > 15
So numLongChains counts all chains that are longer than 15 steps, for all numbers from 1 to 100.
Now I want to write my own:
numLongChains' :: [Int]
numLongChains' = filter isLong (map chain [1..100])
  where isLong xs = length xs > 15
So now I don't want to count these chains, but to return the filtered list containing them.
But now I get an error when compiling:
Couldn't match expected type `Int' with actual type `[a0]'
Expected type: Int -> Bool
Actual type: [a0] -> Bool
In the first argument of `filter', namely `isLong'
In the expression: filter isLong (map chain [1 .. 100])
What can be the problem?
The type signature of numLongChains' is probably not correct. Depending on what you want to do, one of the following is needed:
If you simply want to count those chains, your function obviously should return a number: change the body to length $ filter isLong (map chain [1..100]) and the type to Int.
If you want to return a list of the lengths of the long chains, the type signature is fine, but you need to return lengths. I'd suggest you calculate the lengths before filtering and then filter on them. The function's body becomes filter (>15) (map (length . chain) [1..100]).
If you want to return all chains that are longer than 15 elements, just change the signature to [[Int]] (a list of chains, i.e. lists, of Ints) and you're fine.
FUZxxl is right. You are going to want to change the type signature of your function to [[Int]]. As you are filtering a list of lists and only selecting the ones that are sufficiently long, you will get back a list of lists.
One note about reading Haskell compile-time errors. This error may seem strange. It says you had [a0] -> Bool but it was expecting Int -> Bool. This is because the type checker assumes, from the signature of your numLongChains' function, that you need a filter predicate that checks Ints, since the result is a list of the acceptable ones. The only way to filter a list and get [Int] back is to use a function that takes Ints and returns Bools (Int -> Bool). Instead, it sees a function that checks length. length takes a list, so it infers that you wrote a function that checks lists ([a0] -> Bool). Sometimes the checker is not as friendly as you would like it to be, but if you look hard enough, you will see that 9 times out of 10 a hard-to-decipher error is the result of such assumptions.
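To see how the type signature follows from what is returned, here is a sketch of the two list-returning options side by side. The names longChains and longChainLengths are invented for this example, and odd n is replaced by otherwise so that the guards are visibly exhaustive.

```haskell
chain :: Integral a => a -> [a]
chain 1 = [1]
chain n
  | even n = n : chain (n `div` 2)
  | otherwise = n : chain (n * 3 + 1)

-- Returns the chains themselves, so the result is a list of lists.
longChains :: [[Int]]
longChains = filter isLong (map chain [1 .. 100])
  where
    isLong xs = length xs > 15

-- Returns only the lengths, so a plain [Int] is the right type here.
longChainLengths :: [Int]
longChainLengths = filter (> 15) (map (length . chain) [1 .. 100 :: Int])
```

In both cases the predicate passed to filter takes whatever the list elements are: a chain ([Int]) in the first function, a length (Int) in the second.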

Convert list of Integers into one Int (like concat) in haskell

Pretty much what the title says. I have a list of Integers like so: [1,2,3]. I want to change this in to the Integer 123. My first thought was concat but that doesn't work because it's of the wrong type, I've tried various things but usually I just end up returning the same list. Any help greatly appreciated.
Also I have found a way to print the right thing (putStr) except I want the type to be Integer and putStr doesn't do that.
You can use foldl to combine all the elements of a list:
fromDigits = foldl addDigit 0
  where addDigit num d = 10*num + d
The addDigit function is called by foldl to add the digits, one after another, starting from the leftmost one.
*Main> fromDigits [1,2,3]
123
Edit:
foldl walks through the list from left to right, adding the elements to accumulate some value.
The second argument of foldl, 0 in this case, is the starting value of the process. In the first step, that starting value is combined with 1, the first element of the list, by calling addDigit 0 1. This results in 10*0+1 = 1. In the next step this 1 is combined with the second element of the list, by addDigit 1 2, giving 10*1+2 = 12. Then this is combined with the third element of the list, by addDigit 12 3, resulting in 10*12+3 = 123.
So pointlessly multiplying by zero is just the first step, in the following steps the multiplication is actually needed to add the new digits "to the end" of the number getting accumulated.
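The step-by-step trace above can be reproduced with scanl, which works like foldl but keeps every intermediate accumulator. This is just a sketch for exploring the fold; fromDigitsSteps is a name made up here.

```haskell
-- The fold from the answer, with an explicit type for clarity.
fromDigits :: [Integer] -> Integer
fromDigits = foldl addDigit 0
  where
    addDigit num d = 10 * num + d

-- Same step function, but scanl returns all intermediate results,
-- starting with the initial value 0.
fromDigitsSteps :: [Integer] -> [Integer]
fromDigitsSteps = scanl (\num d -> 10 * num + d) 0
```

fromDigitsSteps [1,2,3] evaluates to [0,1,12,123], matching the values 1, 12 and 123 computed in the explanation, and its last element is the result of fromDigits.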
You could concat the string representations of the numbers, and then read them back, like so:
joiner :: [Integer] -> Integer
joiner = read . concatMap show
This worked pretty well for me.
read (concat (map show (x:xs))) :: Int
How the expression reads:
Step 1 - convert each int in the list to a string
(map show (x:xs))
Step 2 - combine each of those strings together
(concat (step 1))
Step 3 - read the string as a value of type Int
read (step 2) :: Int
Use read and also intToDigit (from Data.Char):
import Data.Char (intToDigit)
joinInt :: [Int] -> Int
joinInt l = read $ map intToDigit l
Has the advantage (or disadvantage) of puking on multi-digit numbers.
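If failing at runtime on bad input is not wanted, one option is to validate the digits first and return a Maybe. This is a sketch of that idea rather than a replacement for the answer above; joinDigits is a hypothetical name, and treating the empty list as 0 is an assumption.

```haskell
-- Join single decimal digits into one number, rejecting anything
-- outside 0..9 instead of crashing.
joinDigits :: [Int] -> Maybe Int
joinDigits ds
  | all isDigitValue ds = Just (foldl (\acc d -> acc * 10 + d) 0 ds)
  | otherwise = Nothing
  where
    isDigitValue d = d >= 0 && d <= 9
```

joinDigits [1,2,3] gives Just 123, while joinDigits [1,12] gives Nothing.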
Another idea would be to say: the last digit counts for 1, the next-to last counts for 10, the digit before that counts for 100, etcetera. So to convert a list of digits to a number, you need to reverse it (in order to start at the back), multiply the digits together with the corresponding powers of ten, and add the result together.
To reverse a list, use reverse, to get the powers of ten you can use iterate (*10) 1 (try it in GHCi or Hugs!), to multiply corresponding digits of two lists use zipWith (*) and to add everything together, use sum - it really helps to know a few library functions! Putting the bits together, you get
fromDigits xs = sum (zipWith (*) (reverse xs) (iterate (*10) 1))
Example of evaluation:
fromDigits [1,2,3,4]
==> sum (zipWith (*) (reverse [1,2,3,4]) [1,10,100,1000, ....])
==> sum (zipWith (*) [4,3,2,1] [1,10,100,1000, ....])
==> sum [4 * 1, 3 * 10, 2 * 100, 1 * 1000]
==> 4 + 30 + 200 + 1000
==> 1234
However, this solution is slower than the ones with foldl, due to the call to reverse and since you're building up those powers of ten only to use them directly again. On the plus side, this way of building numbers is closer to the way people usually think (at least I do!), while the foldl-solutions in essence use Horner's rule.
join :: Integral a => [a] -> a
join [x] = x
join (x:xs) = (x * (10 ^ long)) + join xs
  where long = length xs -- number of digits that follow x
We can define a function called join that, given a list of Integral numbers, returns another Integral number. We use recursion to separate the head of the given list from the rest of the list, and we use pattern matching to define an edge condition so that the recursion can end.
As for how to print the number, instead of
putStr n
just try
putStr (show n)
The reasoning is that putStr can only print strings. So you need to convert the number to a string before passing it in.
You may also want to try the print function from Prelude. This one can print anything that is "showable" (any instance of class Show), not only Strings. But be aware that print n corresponds (roughly) to putStrLn (show n), not putStr (show n).
I'm no expert in Haskell, but this is the easiest way I can think of for a solution to this problem that doesn't involve using any other external functions.
concatDigits :: [Int] -> Int
concatDigits [] = 0
concatDigits xs = concatReversed (reverseDigits xs) 1
reverseDigits :: [Int] -> [Int]
reverseDigits [] = []
reverseDigits (x:xs) = (reverseDigits xs) ++ [x]
concatReversed :: [Int] -> Int -> Int
concatReversed [] d = 0
concatReversed (x:xs) d = (x*d) + concatReversed xs (d*10)
As you can see, I've assumed you're trying to concat a list of digits. If by any chance this is not your case, I'm pretty sure this won't work. :(
In my solution, first of all I've defined a function called reverseDigits, which reverses the original list. For example [1,2,3] to [3,2,1]
After that, I use a concatReversed function, which takes a list of digits and a number d: the power of ten that corresponds to the position of the first digit in the list. If the list is empty it returns 0; otherwise it returns the first digit of the list times d, plus the result of calling concatReversed on the rest of the list with d times 10.
Hope the code speaks for itself, because I think my poor English explanation wasn't very helpful.
Edit
After a long time, I see my solution is very messy, as it requires reversing the list in order to be able to multiply each digit by 10 to the power of the digit's index in the list, counted from right to left. Now that I know tuples, I see that a much better approach is to have a function that receives both the accumulated converted part and the remainder of the list, so in each invocation it multiplies the accumulated part by 10 and then adds the current digit.
concatDigits :: [Int] -> Int
concatDigits xs = aggregate (xs, 0)
  where aggregate :: ([Int], Int) -> Int
        aggregate ([], acc) = acc
        aggregate (x:xs, acc) = aggregate (xs, (acc * 10 + x))