Erlang lists:index_of/2 function?

I'm looking for an Erlang library function that will return the index of a particular element in a list.
So, if
X = [10,30,50,70]
lists:index_of(30, X)
would return 1, etc., just like java.util.List's indexOf() method.
Does such a method exist in the Erlang standard lib? I tried looking in the lists module but no luck. Or should I write it myself?

You'll have to define it yourself, like this:
index_of(Item, List) -> index_of(Item, List, 1).
index_of(_, [], _) -> not_found;
index_of(Item, [Item|_], Index) -> Index;
index_of(Item, [_|Tl], Index) -> index_of(Item, Tl, Index+1).
Note, however, that accessing the Nth element of a list is O(N), so an algorithm that often accesses a list by index will be less efficient than one that iterates through it sequentially.

As others noted, there are more efficient ways to solve this. But if you're looking for something quick, this worked for me (it returns the 1-based position, or 0 if the element is not in the list):
string:str(List, [Element]).

Other solutions (note that these are 1-based):
index_of(Value, List) ->
    Map = lists:zip(List, lists:seq(1, length(List))),
    case lists:keyfind(Value, 1, Map) of
        {Value, Index} -> Index;
        false -> notfound
    end.
index_of(Value, List) ->
    Map = lists:zip(List, lists:seq(1, length(List))),
    case dict:find(Value, dict:from_list(Map)) of
        {ok, Index} -> Index;
        error -> notfound
    end.
At some point, when the lists you pass to these functions get long enough, the overhead of constructing the additional list or dict becomes too expensive. If you can avoid doing the construction every time you want to search the list by keeping the list in that format outside of these functions, you eliminate most of the overhead.
Using a dictionary will hash the values in the list and help reduce the index lookup time to O(log N), so it's better to use that for large, singly-keyed lists.
In general, it's up to you, the programmer, to organize your data into structures that suit how you're going to use them. My guess is that the absence of a built-in index_of is to encourage such consideration. If you're doing single-key lookups -- that's really what index_of() is -- use a dictionary. If you're doing multi-key lookups, use a list of tuples with lists:keyfind() et al. If your lists are inordinately large, a less simplistic solution is probably best.
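The advice about building the lookup structure once, rather than on every call, can be sketched as follows. This is only an illustration in Python, not Erlang: `build_index` and `index_of` are hypothetical names, and a Python dict stands in for Erlang's dict module.

```python
def build_index(items):
    """Map each value to the 1-based position of its first occurrence."""
    index = {}
    for position, value in enumerate(items, start=1):
        # setdefault keeps the first position when a value repeats
        index.setdefault(value, position)
    return index


def index_of(value, index):
    """Look up a value in a prebuilt index; 'notfound' mirrors the atom used above."""
    return index.get(value, "notfound")


# Build the index once, then reuse it for every lookup.
positions = build_index([10, 30, 50, 70])
print(index_of(30, positions))   # 2
print(index_of(99, positions))   # notfound
```

The construction cost is paid a single time, so repeated lookups no longer rescan or rebuild anything.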

This function is very uncommon in Erlang, which may be the reason it is not in the standard library. Experienced Erlang programmers do not need it, and algorithms that rely on it are discouraged. Anyone who needs it can write it for their own purposes, but such rare occasions are not a reason to include it in stdlib. Design your data structures properly instead of asking for this function. In most cases, needing this function indicates a design error.

I think the writer makes a valid case. Here is my use case from a logging application. The objective is to check the severity of an error against the actions to be performed against various levels of error response.
get_index(A,L) ->
    get_index(A,L,1).
get_index(A,[A|_],N) ->
    N;
get_index(A,[_|T],N) ->
    get_index(A,T,N+1).
get_severity(A) ->
    Severity=[debug,info,warn,error],
    get_index(A,Severity).

The following function returns a list of the indices of a given element in a list. The result can be used to get the index of the first or last occurrence of a duplicate element in a list.
indices_of(Element, L) ->
    Indices = lists:zip(lists:seq(1,length(L)), L),
    [ I || {I, E} <- Indices, E == Element ].

Related

Why does the shuffle' function require an Int parameter?

In System.Random.Shuffle,
shuffle' :: RandomGen gen => [a] -> Int -> gen -> [a]
The hackage page mentions this Int argument as
..., its length,...
However, it seems that a simple wrapper function like
shuffle'' x = shuffle' x (length x)
should've sufficed.
shuffle operates by building a tree form of its input list, including the tree size. The buildTree function performs this task using Data.Function.fix in a manner I haven't quite wrapped my head around. Somehow (I think due to the recursion of inner, not the fix magic), it produces a balanced tree, which then has logarithmic lookup. Then it consumes this tree, rebuilding it for every extracted item. The advantage of the data structure would be that it only holds remaining items in an immutable form; lazy updates work for it. But the size of the tree is required data during the indexing, so there's no need to pass it separately to generate the indices used to build the permutation. System.Random.Shuffle.shuffle indeed has no random element - it is only a permutation function. shuffle' exists to feed it a random sequence, using its internal helper rseq. So the reason shuffle' takes a length argument appears to be because they didn't want it to touch the list argument at all; it's only passed into shuffle.
The task doesn't seem terribly suitable for singly linked lists in the first place. I'd probably consider using VectorShuffling instead. And I'm baffled as to why rseq isn't among the exported functions, being the one that uses a random number generator to build a permutation... which in turn might have been better handled using Data.Permute. Probably the reasons have to do with history, such as Data.Permute being written later and System.Random.Shuffle being based on a paper on immutable random access queues.
Data.Random.Extras seems to have a more straightforward Seq-based shuffle function.
It may be that the length of the given list is already known and doesn't need to be calculated again. In that case, the argument can be considered an optimisation.
Besides, in general, the resulting list doesn't need to have the same size as the original one. Thus, this argument could be used for setting this length.
This is true for the original idea of Oleg (source - http://okmij.org/ftp/Haskell/perfect-shuffle.txt):
-- examples
t1 = shuffle1 ['a','b','c','d','e'] [0,0,0,0]
-- "abcde"
-- Note, that rseq of all zeros leaves the sequence unperturbed.
t2 = shuffle1 ['a','b','c','d','e'] [4,3,2,1]
-- "edcba"
-- The rseq of (n-i | i<-[1..n-1]) reverses the original sequence of elements
However, it's not the same for the 'random-shuffle' package implementation:
> shuffle [0..10] [0,0,0,0]
[0,1,2,3random-shuffle.hs: [shuffle] called with lists of different lengths
I think it is worth following up with the package maintainers in order to understand the contract of this function.
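For reference, the rseq convention shown in Oleg's t1 and t2 examples above can be mimicked with a naive sketch. This is Python, not Haskell, and it is not the tree-based algorithm from the paper or the package: `shuffle_with_rseq` is a hypothetical name, and it assumes each rseq entry is an index into the elements that remain, with exactly n-1 entries for n elements.

```python
def shuffle_with_rseq(items, rseq):
    """Permute items by repeatedly extracting the r-th remaining element.

    Naive O(n^2) illustration of the input/output contract only.
    """
    remaining = list(items)
    if len(rseq) != max(len(remaining) - 1, 0):
        raise ValueError("rseq must have exactly len(items) - 1 entries")
    result = []
    for r in rseq:
        # r indexes into what is left, so valid values shrink at each step
        result.append(remaining.pop(r))
    result.extend(remaining)  # the last element needs no index
    return result


print("".join(shuffle_with_rseq("abcde", [0, 0, 0, 0])))  # abcde
print("".join(shuffle_with_rseq("abcde", [4, 3, 2, 1])))  # edcba
```

Under this reading, an rseq whose length does not match the list is simply an invalid input, which is consistent with the error reported by the package.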

Is there an exposed function for removing an item from a list?

I do not see any operation for removing an item from a list.
I'm sure I can implement this functionality on my own. However, I kind of expected this operation to be supported in FSharp.Core.
Am I missing something?
If you mean creating a new list with some items removed based on their value, then you could do this:
[1; 2; 3; 1] |> List.filter ((<>) 1)
// Returns [2; 3]
This uses the <> (not equal) operator in prefix form by wrapping it in parentheses, and then partially applies it by providing only the first argument.
Note that all instances of this value are excluded.
Short answer - the library designers didn't think it warranted inclusion.
Designing a library of any sort, but in particular a core library like collection modules in F#, is always about finding the right balance between complexity and usefulness. You have to carefully consider if your new feature brings enough to the table to offset the cost of having a larger library.
For removing all instances of an item, you can use List.filter with a negated predicate. The designers could have included a List.remove function that does the negation internally. That is not unthinkable; in fact, Lisps tend to have both filter and remove. In Haskell and OCaml you only have filter, though, and the F# designers probably followed suit here.
If you want to remove only a single instance of an item, you have to write something yourself. This is a somewhat non-standard use case for a list: lists are "about" accumulating elements in sequence, and removing particular elements from the middle of the list (as opposed to removing the head or removing all undesirable elements) is seldom useful. If your focus is on adding or removing elements without a need to preserve order, sets or maps (used as multisets) are a better fit for the job.
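As a rough illustration of the "write something yourself" case, removing only the first matching element while leaving the input untouched might look like the sketch below. It is written in Python rather than F#, and `remove_first` is a hypothetical helper, not something FSharp.Core provides.

```python
def remove_first(items, target):
    """Return a new list without the first element equal to target."""
    for i, item in enumerate(items):
        if item == target:
            # copy everything before and after the first match
            return list(items[:i]) + list(items[i + 1:])
    return list(items)  # no match: return an unchanged copy


print(remove_first([1, 2, 3, 1], 1))  # [2, 3, 1]
print(remove_first([1, 2, 3, 1], 9))  # [1, 2, 3, 1]
```

Unlike the filter version, later duplicates of the value are kept.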
I'm not sure why you would expect this. For example the same functionality (AFAIK) is not available for Arrays in C#.
However if you want you can use the generic List:
open System.Collections.Generic
let xs = [1..3]
let xs' = List(xs)
xs'.Remove(2)
xs'
//val it : List<int> = seq [1; 3]
The Generic List has .Remove, .RemoveAt and .RemoveAll methods.

Why is List.length linear in complexity?

I understand that lists are implemented as singly linked so they don't really have a constant structure that you can pin a length on, but each node should know how many nodes till the last element right? There isn't a way to add a node to some existing list and for that node not to be able to determine the length of the list it represents in constant time provided that the existing nodes already have that info.
I can understand why that wouldn't work in Haskell, for example, due to laziness, but as far as I know F# lists aren't lazy. So, is the problem just the extra memory overhead?
Seems to me like a typical memory vs. time trade-off.
If the standard F# list had the implementation you suggest, it would need much more memory (consider a list of one million bools), and everyone using such a list would have to pay for it. There would be no simple way to opt out other than writing a completely new list implementation.
On the other hand, it seems fairly simple to create a new type, based on the F# list, that stores the length of the remaining list with each element. You can implement it on your own if you need it. Those who don't need it will use the standard implementation.
I don't often find myself needing to know the length of the list, it's not like you need it to exit a for loop like you would with arrays in imperative languages.
For those rare cases when you really need to know the length asap, you can go with Carsten König's suggestion from a comment and make your 'a list into a ('a * int) list, where each node keeps the length of the tail as a tuple element.
Then you can have something like this:
let push lst e =
    match lst with
    | (_, c) :: _ -> (e, c + 1) :: lst
    | [] -> [e, 0]
and length and pop functions to match.
For all the other cases I'd call it a premature optimization.
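The push function above, together with the length and pop functions it mentions, can be sketched as follows. This is a Python illustration rather than F#: the node layout (value, length of tail, rest) and the function names are assumptions chosen to mirror the tuple scheme described above, with None as the empty list.

```python
def push(lst, e):
    """Prepend e, storing the length of the tail next to it (as in the F# push)."""
    if lst is None:
        return (e, 0, None)
    _, tail_len, _ = lst
    return (e, tail_len + 1, lst)


def length(lst):
    """O(1): the head node already knows how many nodes follow it."""
    return 0 if lst is None else lst[1] + 1


def pop(lst):
    """Return (head value, remaining list); existing nodes are shared, not copied."""
    if lst is None:
        raise IndexError("pop from empty list")
    value, _, rest = lst
    return value, rest


s = push(push(None, "a"), "b")
print(length(s))          # 2
value, rest = pop(s)
print(value, length(rest))  # b 1
```

Every node pays for one extra integer, which is the memory cost discussed in the earlier answer.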

Inserting into a list at a specific location using lenses

I'm trying to perform a manipulation of a nested data structure containing lists of elements. After mucking around with various approaches I finally settled on lenses as the best way to go about doing this. They work perfectly for finding and modifying specific elements of the structure, but so far I'm stumped on how to add new elements.
From what I've read, I can't technically use a Traversal as it violates the Traversal laws to insert a new element into a list, and that's assuming I could even figure out how to do that using a Traversal in the first place (I'm still fairly weak with Haskell, and the type signatures for most things in the lens package make my head spin).
Specifically what I'm trying to accomplish is, find some element in the list of elements that matches a specific selector, and then insert a new element either before, or after the matched element (different argument to the function for either before or after the match). Does Control.Lens already have something that can accomplish what I'm trying to do and my understanding of the type signatures is just too weak to see it? Is there a better way to accomplish what I'm trying to do?
It would be fairly trivial if I was just trying to add a new element either to the beginning or the end of a list, but inserting it somewhere specific in the middle is the difficult part. In some of the pre-lens code I wrote I used a fold to accomplish what I wanted, but it was starting to get gnarly on the more deeply nested parts of the structure (E.G. a fold inside of a fold inside of a fold) so I turned to Control.Lens to try to untangle some of that mess.
Using the lens package
If we start with knowing the function id can be used like a lens:
import Control.Lens
> [1,2,3,4] ^. id
[1,2,3,4]
Then we can move on to how the list can be modified:
> [1,2,3,4] & id %~ (99:)
[99,1,2,3,4]
The above allows for insertion at the start of the list. To focus on the latter parts of the list we can use _tail from the Control.Lens.Cons module.
> [1,2,3,4] ^. _tail
[2,3,4]
> [1,2,3,4] & _tail %~ (99:)
[1,99,2,3,4]
Now to generalize this for the nth position
> :{
let
  _drop 0 = id
  _drop n = _tail . _drop (n - 1)
:}
> [1,2,3,4] ^. _drop 1
[2,3,4]
> [1,2,3,4] & _drop 0 %~ (99:)
[99,1,2,3,4]
> [1,2,3,4] & _drop 1 %~ (99:)
[1,99,2,3,4]
One last step to generalize this over all types with a Cons instance we can use cons or <|.
> [1,2,3,4] & _drop 1 %~ (99<|)
[1,99,2,3,4]
> import Data.Text
> :set -XOverloadedStrings
> ("h there"::Text) & _drop 1 %~ ('i'<|)
"hi there"
I think a simple approach would be to break the problem down into two parts:
A function of type [a] -> SomeAdditionalData -> [a], which is responsible for transforming the list into another list using some specific data. This is where you add or remove elements from the list and get a new list.
Use a lens to extract the list from the nested data structure, pass that list to the function defined above, and set the returned list back in the nested data structure using the lens.
Your last paragraph is an indication of what happens when you try to do too much with a generic abstraction like Lens. These generic abstractions are good for some generic purposes; everything else is specific to your problem and should be designed around plain old functions (at least initially; later on in your project you may find some general pattern across your code base that can be abstracted using type classes etc.).
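The two-part split described above can be sketched like this. It is a Python illustration, not lens code: `insert_relative` and the shape of `doc` are hypothetical, and plain dict copying stands in for the lens that digs out and replaces the list.

```python
def insert_relative(items, matches, new_item, before=True):
    """Return a new list with new_item placed next to the first matching element."""
    result = []
    inserted = False
    for item in items:
        if not inserted and matches(item):
            # place the new element before or after the first match only
            result.extend([new_item, item] if before else [item, new_item])
            inserted = True
        else:
            result.append(item)
    return result


# Part 2: get the list out of the nested structure, transform it, set it back.
doc = {"name": "demo", "body": {"items": [1, 2, 4]}}
new_items = insert_relative(doc["body"]["items"], lambda x: x == 4, 3)
new_doc = {**doc, "body": {**doc["body"], "items": new_items}}

print(new_doc["body"]["items"])  # [1, 2, 3, 4]
print(doc["body"]["items"])      # [1, 2, 4]
```

If nothing matches, the list comes back unchanged; whether that should instead be an error is a design choice left open here.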
Some comments on your problem:
Answer the Question:
There may be a way to do what you want to do. The Lens library is amazingly generic. What there is not is a simple or obvious way to make it happen. I think it will involve the partsOf combinator, but I'm not sure.
Comments on Lenses:
The lens library is really cool and can apply to an amazing number of problems. My initial temptation as I was learning the library was to try to fit everything into a Lens access or mutation. What I discovered was that it was better to use the lens library to dig into my complex data structures, but once I had a simple element it was better to use the more traditional functional techniques I already knew rather than stretching the Lens library past its useful limit.
Advice you didn't ask for:
Inserting an element into the middle of a list is a bad idea. Not that it cannot be done, but it can end up being an O(n^2) operation. (See also this StackOverflow answer.) Zip lists or some other functional data structure may be a better idea. As a side benefit, some of these structures could be made instances of the At class, allowing for insertion and deletion using the partial lens combinators.

What is the easiest way to add an element to the end of the list?

As :: : 'a -> 'a list -> 'a list is used to add an element to the beginning of a list, could anyone tell me if there is a function to add an element to the end of a list? If not, I guess List.rev (element::(List.rev list)) is the most straightforward way to do it?
Thank you!
The reason there's not a standard function to do this is that appending at the end of a list is an anti-pattern (aka a "snoc list" or a Schlemiel the Painter algorithm). Adding an element at the end of a list requires a full copy of the list. Adding an element at the front of the list requires allocating a single cell—the tail of the new list can just point to the old list.
That said, the most straightforward way to do it is
let append_item lst a = lst @ [a]
list @ [element] should work. @ joins lists.
Given that this operation is linear, you should not use it in the "hot" part of your code, where performance matters. In a cold part, use list @ [element] as suggested by Adi. In a hot part, rewrite your algorithm so that you don't need to do that.
The typical way to do it is to accumulate results in the reverse order during processing, and then reverse the whole accumulated list before returning the result. If you have N processing steps (each adding an element to the list), you therefore amortize the linear cost of reverse over N elements, so you keep a linear algorithm instead of a quadratic one.
In some cases, another technique that works is to process your elements in reverse order, so that the accumulated results turn out to be in the right order without an explicit reversal step.
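The accumulate-then-reverse pattern described above can be sketched with explicit cons cells. This is Python rather than OCaml: the cell encoding (head, tail) with None as the empty list and all function names are assumptions made for illustration.

```python
def cons(head, tail):
    """O(1) prepend: the new cell just points at the existing list."""
    return (head, tail)


def reverse(lst):
    """Single O(n) pass that rebuilds the list in the opposite order."""
    out = None
    while lst is not None:
        head, lst = lst
        out = cons(head, out)
    return out


def to_pylist(lst):
    """Convert a cons list to a Python list for display."""
    items = []
    while lst is not None:
        head, lst = lst
        items.append(head)
    return items


def squares_up_to(n):
    acc = None
    for i in range(1, n + 1):
        acc = cons(i * i, acc)  # accumulate in reverse order, O(1) per step
    return reverse(acc)         # one linear reversal at the very end


print(to_pylist(squares_up_to(4)))  # [1, 4, 9, 16]
```

Each of the n steps is constant time and the final reversal is linear, so the whole computation stays linear instead of copying the list on every append.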