Iterating through and removing items from a list - python-2.7

b = [1,2,3,4,5,6,7]
for n in b:
    if n > 3:
        b.remove(n)
I print b, and get the following list:
[1,2,3,5,7]
Why are 5 and 7 still present? I can write a function that does pretty much the same thing and removes all numbers above 3 from the list, so why can't I do the same in Terminal?

This is a well-known issue. Iterators are not reliable when you modify the underlying collection while using the iterator.
As for the behavior you experience:
With CPython, the list iterator is represented by an index into the array. If you remove an item from the list at the iterator position or before it while still iterating over it, the iterator jumps forward. The iterator position index is still the same, but all items "under" the iterator have just moved to the left by one position. This makes the iterator skip one element. Hence you only remove every second item.
l = [1, 2, 3, 4]
        ^it(pos=1)
l.remove(2)
l = [1, 3, 4]
        ^it(pos=1)
it.next() # automatically at the end of each for loop
l = [1, 3, 4] # we just skipped over an item
           ^it(pos=2)
Here's a nice little treatise on the topic from @mgiuca.
Interestingly enough, removing items after the iterator position is safe with the current implementation.
In short: don't modify collections while iterating over them. Alternatives for lists: Remove items from a list while iterating in Python
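A minimal sketch of two common alternatives, assuming the goal is to drop every value greater than 3. The names kept and alias are only illustrative. The first form builds a new list; the second assigns to the slice b[:] so that any other name bound to the same list object sees the change as well.

```python
b = [1, 2, 3, 4, 5, 6, 7]

# Build a new list that keeps only the wanted items.
kept = [n for n in b if n <= 3]

# Or replace the contents in place, so other names bound to the
# same list object see the change too.
alias = b
b[:] = [n for n in b if n <= 3]

print(kept)   # [1, 2, 3]
print(alias)  # [1, 2, 3]
```

Neither form removes items from the list that is currently being iterated, so no element is skipped.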

That's because you are iterating over the same list that you are modifying. Try this:
b = [1,2,3,4,5,6,7]
c = b[:]
for n in c:
    if n > 3:
        b.remove(n)
As the image below shows, there are now two different lists: the loop iterates over the copy c while the items are removed from b.

When you remove an element, the array is modified (elements shifted to the left), so the next iteration takes you to the next element bypassing the shifted element. That is to say, the array is modified and the loop advances to the next index. That is why you notice a jump every time an element is removed.
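The jump can be made visible by mimicking the loop with an explicit index. This is only a sketch of the behaviour described above, not CPython's actual implementation; remove_while_iterating is a hypothetical helper name.

```python
def remove_while_iterating(items, threshold):
    """Mimic `for n in items: if n > threshold: items.remove(n)` with an index."""
    items = list(items)      # work on a copy of the input
    visited = []             # which values the loop actually looked at
    pos = 0                  # plays the role of the iterator's index
    while pos < len(items):
        n = items[pos]
        visited.append(n)
        if n > threshold:
            items.remove(n)  # everything after n shifts left by one
        pos += 1             # the index still advances, so one item is skipped
    return items, visited


result, visited = remove_while_iterating([1, 2, 3, 4, 5, 6, 7], 3)
print(result)   # [1, 2, 3, 5, 7]
print(visited)  # [1, 2, 3, 4, 6]
```

The values 5 and 7 never appear in visited, which is why they survive.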

Related

Member of a list, sum previous members list prolog

I want to check whether some member of a list is the sum of the numbers that precede it.
Example: [0,1,3,4,18,19]. This is TRUE because 0+1+3 = 4
sum_([],0).
sum_([X|XS],R):- sum_(XS,R1), R is X + R1.
existsSum(L,[X|C]):-append(A,[X|B],L),
    append(A,B,C),
    sum_(C,X).
I am stuck here. Any idea? Thanks.
Why append(A,[X|B],L),append(A,B,C),sum_(C,X)? In this way you want the sum of all elements except X to be equal to X.
It is not clear what the arguments of existsSum should be. Supposing existsSum(InputList, SubList, Element):
existsSum(L,A,X) :- append(A,[X|_B],L), sum_(A,X).
With your example produces these results:
?- existsSum([0,1,3,4,18,19], Sublist, Element).
Sublist = [],
Element = 0 ;
Sublist = [0, 1, 3],
Element = 4 ;
false.
Note: also [] and 0 is a solution because of how you defined the sum_ predicate, i.e. the sum of [] is 0.
If you change the sum_ predicate in this way:
sum_([X],X).
sum_([X|XS],R):- sum_(XS,R1),R is X + R1.
it is defined only for non-empty lists, and in this case you get only one result from your example:
?- existsSum([0,1,3,4,18,19], Sublist, Element).
Sublist = [0, 1, 3],
Element = 4 ;
false.
I think your problem is ill-stated (or your example should not start with zero) because I think you basically have two ways you can process the list: either you process the entire list every time (and your example fails because 0+1+3+4+18 != 19) or you stop as soon as your expected value matches the head of the list, in which case [0] is already successful.
In the end, there aren't that many ways to process a list. You have to make a decision when you have an element, and you have to make a decision when you are out of elements. Suppose we want to succeed as soon as we have reached a value that matches the sum-so-far. We can model that fairly simply like this:
exists_sum(List) :- exists_sum(0, List).
exists_sum(RunningTotal, [RunningTotal|_]).
exists_sum(RunningTotal, [H|T]) :-
    NewRunningTotal is RunningTotal + H,
    exists_sum(NewRunningTotal, T).
Note that with this formulation, [0|_] is already successful. Also note that I have no empty list case: if I make it to the end of a list without having succeeded already, there is no solution there, so there's nothing to say about it.
The other formulation would be to require that the entire list is processed, which would basically be to replace the first exists_sum/2 clause with this:
exists_sum(Total, [Total]).
This will fail to unify exists_sum(4, [4|_]) which is the case you outline in the question where [0,1,3,4...] succeeds.
There may be other formulations that are more complex than these, but I don't see them. I really think there are only a couple ways to go with this that make sense.
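For readers who find an imperative version easier to follow, here is a rough Python sketch of the running-total idea. It is not a translation of any particular Prolog answer; the function names are hypothetical. The first function succeeds as soon as an element equals the sum so far (so a leading 0 already matches), and the second additionally requires at least one preceding element, which mirrors the sum_ variant defined only for non-empty lists.

```python
def exists_sum(numbers):
    """True if some element equals the sum of all elements before it."""
    running_total = 0
    for value in numbers:
        if value == running_total:
            return True
        running_total += value
    return False


def exists_sum_nonempty_prefix(numbers):
    """Same check, but the matching element must have at least one predecessor."""
    running_total = 0
    for index, value in enumerate(numbers):
        if index > 0 and value == running_total:
            return True
        running_total += value
    return False


print(exists_sum([0, 1, 3, 4, 18, 19]))                  # True (0 matches the empty sum)
print(exists_sum_nonempty_prefix([0, 1, 3, 4, 18, 19]))  # True (0 + 1 + 3 == 4)
print(exists_sum_nonempty_prefix([1, 2, 4]))             # False
```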

Iterate through list from any given starting point and continue from beginning?

Iterating through a list is trivial. In this case, a TCollection property of a component I'm working on. I have no problem with iterating from 0 index to the maximum index - I've done it plenty of times.
However, I'm working on something now which needs to iterate a bit differently. I need to iterate through a list of collection items from any given starting point - and yet complete a full loop of all items. After the last list item, it shall automatically continue iteration from the beginning of the list.
To clarify: traditional iteration works like:
for X := 0 to SomeList.Count-1 do ...
But I may start it at some other point, such as:
for X := StartingPoint to EndingPoint do ...
And it's that "EndingPoint" which I cannot figure out. Iteration only increases. But in my case, I need to reset this current iteration position to the beginning - right in the middle of the iteration. EndingPoint might be less than the StartingPoint, but it still needs to do a complete loop, where once it reaches the end, it picks up from the beginning.
So, in a list of 5 items, rather than just going...
0, 1, 2, 3, 4
I may start at 2, and want to do...
2, 3, 4, 0, 1
How do I accomplish such a loop?
for foo := 0 to Pred(SomeList.Count) do begin
  i := (foo + StartingPoint) mod SomeList.Count;
  ...
end;
Use the index i inside the loop; ignore the foo variable.
From the middle to the end of the range, i will equal foo + StartingPoint. After that, the mod operator will effectively make i "wrap around" to the start again.
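The same modulo trick, sketched in Python so the resulting visiting order can be inspected directly. The function name wrapped_indices is an assumption made for illustration.

```python
def wrapped_indices(count, starting_point):
    """Indices of a full loop over `count` items, beginning at `starting_point`."""
    return [(offset + starting_point) % count for offset in range(count)]


print(wrapped_indices(5, 2))  # [2, 3, 4, 0, 1]
print(wrapped_indices(5, 0))  # [0, 1, 2, 3, 4]
```

Every index is produced exactly once, whatever the starting point is.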

Removing multiple elements from stl list while iterating

This is not similar to Can you remove elements from a std::list while iterating through it?. Mine is a different scenario.
Let's say I have a list like this.
1 2 3 1 2 2 1 3
I want to iterate over this STL list in such a way that, when I first encounter an element X, I do some activity and then remove all the elements equal to X from the list and continue iterating. What's an efficient way of doing this in C++?
I am worried that when I do a remove or an erase I will be invalidating the iterators. If it was only one element then I could potentially increment the iterator and then erase. But in my scenario I would need to delete/erase all the occurrences.
Was thinking something like this
while (!list.empty()) {
    int num = list.front();
    // Do some activity and if successful
    list.remove(num);
}
I don't know if this is the best way.
Save a set of seen numbers and if you encounter a number in the set ignore it. You can do as follows:
list<int> old_list = {1, 2, 3, 1, 2, 2, 1, 3};
list<int> new_list;
set<int> seen_elements;
for(int el : old_list) {
    if (seen_elements.find(el) == seen_elements.end()) {
        seen_elements.insert(el);
        new_list.push_back(el);
    }
}
return new_list;
This will process each value only once and the new_list will only contain the first copy of each element in the old_list. This runs in O(n*log(n)) because each iteration performs a set lookup (you can make this O(n) by using a hashset). This is significantly better than the O(n^2) that your approach runs in.
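As an illustration of the hash-set variant mentioned above, here is the same idea sketched in Python, whose built-in set is hash-based, so each membership test takes constant time on average. The function name first_occurrences is hypothetical.

```python
def first_occurrences(values):
    """Keep only the first copy of each value, preserving the original order."""
    seen = set()      # values that have already been processed
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)        # the "activity" for a new value would go here
            result.append(value)
    return result


print(first_occurrences([1, 2, 3, 1, 2, 2, 1, 3]))  # [1, 2, 3]
```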

Moving circular array elements algorithm?

I mean in reverse sorry, essentially, find the first true element, then circularly move backwards until you find the last valid element, once the last element that is true is found by circularly reverse traversing the array, circularly go forward and push until a false is found.
I am given an array of pair of bool,int.
The array always has 4 elements. Elements that are true are circularly linked together ex:
TFFT
TTFT
FFTT
FTTT
TFFF
FTTF
These are all valid arrays that I could have.
The number they contain is not important for this (the pair second value).
What I need to do is:
only keep the true ones. But I need them to stay in the correct circular order such that the last valid true element will come first.
So for example:
If my array was:
T 1
F 2
F 3
T 4
The new array needs to be:
T 4
T 1
Another example:
If my array was:
F 1
T 2
T 3
F 4
The new array needs to be:
T 2
T 3
This is just an abstract example of the problem. The actual code is complex and hard to read. But if I know how to do this I'll be alright.
Essentially I need to walk clockwise from the first discontinuous element to the last contiguous element.
Thanks
Edit:
By circularly linked together I mean that if the 4th and first element are true, they are not disconnected meaning they are not discontiguous, 3,4,1 is considered contiguous.
Thus if you had TFTT then I need them to be in the order of 3,4,1.
You can think of your array as containing three segments:
1. 0 or more T elements at the beginning
2. 1 or more F elements in the middle
3. 0 or more T elements at the end
(If your array might not have any F elements at all, then you can handle that as a special case.)
What you want is a new array containing segment 3 followed by segment 1, with segment 2 erased.
Here's an outline of an algorithm to do that:
1. Find the index of the first F in the array. Call it first_F.
2. Find the index of the last F in the array. Call it last_F.
3. Now you know your segments occupy the indices [0, first_F), [first_F, last_F], and [last_F + 1, size_of_array), respectively.
4. Iterate over the segment [last_F + 1, size_of_array) and add the elements to your result array.
5. Iterate over the segment [0, first_F) and add those elements to your result array.
Suppose you store your elements like this
l = [(T, 1),
     (F, 2),
     (F, 3),
     (T, 4)]
Then you need to double the list, like this
l = [(T, 1),
     (F, 2),
     (F, 3),
     (T, 4),
     (T, 1),
     (F, 2),
     (F, 3),
     (T, 4)]
Now what you essentially need to do is find the longest run of consecutive T elements in the doubled list.
A special corner case is when the original list is all T, which has to be handled separately.
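A minimal Python sketch of the doubling idea, assuming the pairs are stored as (bool, int) tuples and that the true elements form a single circular run, as the question states. The function name keep_true_circular is hypothetical.

```python
def keep_true_circular(pairs):
    """Return the circular run of true pairs, starting at its first element."""
    if all(flag for flag, _ in pairs):
        return list(pairs)                   # corner case: every element is T
    doubled = list(pairs) + list(pairs)      # lets a run wrap past the end
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for index, (flag, _) in enumerate(doubled):
        if flag:
            if run_len == 0:
                run_start = index            # a new run of T begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0                      # an F ends the current run
    return doubled[best_start:best_start + best_len]


print(keep_true_circular([(True, 1), (False, 2), (False, 3), (True, 4)]))
# [(True, 4), (True, 1)]
print(keep_true_circular([(False, 1), (True, 2), (True, 3), (False, 4)]))
# [(True, 2), (True, 3)]
```

Because the input contains at least one F whenever the loop runs, no run in the doubled list can be longer than the original array.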

O(log n) algorithm to find the element having rank i in union of pre-sorted lists

Given two sorted lists, each containing n real numbers, is there an O(log n) time algorithm to compute the element of rank i (where i corresponds to the index in increasing order) in the union of the two lists, assuming the elements of the two lists are distinct?
EDIT:
@BEN: This is what I have been doing, but I am still not getting it.
I have an example:
List A : 1, 3, 5, 7
List B : 2, 4, 6, 8
Find rank(i) = 4.
First Step : i/2 = 2;
List A now contains is A: 1, 3
List B now contains is B: 2, 4
compare A[i] to B[i] i.e
A[i] is less;
So the lists now become :
A: 3
B: 2,4
Second Step:
i/2 = 1
List A now contains A:3
List B now contains B:2
Now I have lost the value 4, which is actually the result...
I know I am missing something, but even after close to a day of thinking I just can't figure this one out...
Yes:
You know the element lies within either index [0,i] of the first list or [0,i] of the second list. Take element i/2 from each list and compare. Proceed by bisection.
I'm not including any code because this problem sounds a lot like homework.
EDIT: Bisection is the method behind binary search. It works like this:
Assume i = 10; (zero-based indexing, we're looking for the 11th element overall).
On the first step, you know the answer is either in list1(0...10) or list2(0...10). Take a = list1(5) and b = list2(5).
If a > b, then there are 5 elements in list1 which come before a, and at least 6 elements in list2 which come before a. So a is an upper bound on the result. Likewise there are 5 elements in list2 which come before b and fewer than 6 elements in list1 which come before b. So b is a lower bound on the result. Now we know that the result is either in list1(0..5) or list2(5..10). If a < b, then the result is either in list1(5..10) or list2(0..5). And if a == b we have our answer (but the problem said the elements were distinct, therefore a != b).
We just repeat this process, cutting the size of the search space in half at each step. Bisection refers to the fact that we choose the middle element (bisector) out of the range we know includes the result.
So the only difference between this and binary search is that in binary search we compare to a value we're looking for, but here we compare to a value from the other list.
NOTE: this is actually O(log i), which is better than (or at least no worse than) O(log n). Furthermore, for small i (perhaps i < 100), it would actually be fewer operations to merge the first i elements (linear search instead of bisection) because that is so much simpler. When you add in cache behavior and data locality, the linear search may well be faster for i up to several thousand.
Also, if i > n then rely on the fact that the result has to be toward the end of either list, your initial candidate range in each list is from ((i-n)..n)
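The simpler linear alternative mentioned in the note can be sketched as follows. This is an illustration only, assuming zero-based rank i with 0 <= i < len(list_a) + len(list_b) and distinct elements; the name kth_by_merge is hypothetical.

```python
def kth_by_merge(list_a, list_b, i):
    """Element of zero-based rank i in the union, found by stepping past i heads."""
    a = b = 0
    for _ in range(i):
        # Step past the smaller of the two current heads.
        if b >= len(list_b) or (a < len(list_a) and list_a[a] < list_b[b]):
            a += 1
        else:
            b += 1
    # After i steps, the smaller remaining head has exactly i elements before it.
    if b >= len(list_b) or (a < len(list_a) and list_a[a] < list_b[b]):
        return list_a[a]
    return list_b[b]


print(kth_by_merge([1, 3, 5, 7], [2, 4, 6, 8], 3))  # 4
```

This takes O(i) steps rather than O(log i), but each step is a single comparison.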
Here is how you do it.
Let the first list be ListX and the second list be ListY. We need to find the right combination of ListX[x] and ListY[y] where x + y = i. Since x, y, i are natural numbers we can immediately constrain our problem domain to x*y. And by using the equations max(x) = len(ListX) and max(y) = len(ListY) we now have a subset of x*y elements in the form [x, y] that we need to search.
What you will do is order those elements like so [i - max(y), max(y)], [i - max(y) + 1, max(y) - 1], ... , [max(x), i - max(x)]. You will then bisect this list by choosing the middle [x, y] combination. Since the lists are ordered and distinct you can test ListX[x] < ListY[y]. If true then we bisect the upper half of our [x, y] combinations or if false then we bisect the lower half. You will keep bisecting until you find the right combination.
There are a lot of details I left out, but that is the general gist of it. It is indeed O(log(n))!
Edit: As Ben pointed out, this is actually O(log(i)). If we let n = len(ListX) + len(ListY) then we know that i <= n.
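One way to fill in those details is sketched below. It is a variant under stated assumptions, not necessarily the exact comparison the answer had in mind: i is a zero-based rank with 0 <= i < len(list_x) + len(list_y), elements are distinct, and the test used is list_x[x] < list_y[y - 1] rather than ListX[x] < ListY[y]. Here x is the number of elements taken from list_x among the i smallest, and y = i - x is the number taken from list_y. The name kth_by_bisection is hypothetical.

```python
def kth_by_bisection(list_x, list_y, i):
    """Element of zero-based rank i in the union of two sorted, distinct lists."""
    lo = max(0, i - len(list_y))   # fewest elements list_x can contribute
    hi = min(i, len(list_x))       # most elements list_x can contribute
    while lo < hi:
        x = (lo + hi) // 2
        y = i - x                  # here 1 <= y <= len(list_y)
        if list_x[x] < list_y[y - 1]:
            lo = x + 1             # list_x[x] belongs to the i smallest
        else:
            hi = x                 # taking x elements from list_x is enough
    x, y = lo, i - lo
    # The first x items of list_x and first y items of list_y are the i
    # smallest, so the answer is the smaller of the two next items.
    candidates = []
    if x < len(list_x):
        candidates.append(list_x[x])
    if y < len(list_y):
        candidates.append(list_y[y])
    return min(candidates)


print(kth_by_bisection([1, 3, 5, 7], [2, 4, 6, 8], 3))  # 4
```

The search range has at most min(i, len(list_x)) + 1 candidates and is halved on every iteration, which matches the O(log(i)) bound.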
When merging two lists, you're going to have to touch every element in both lists. If you don't touch every element, some elements will be left behind. Thus your theoretical lower bound is O(n). So you can't do it that way.
You don't have to sort, since you have two lists that are already sorted, and you can maintain that ordering as part of the merge.
edit: oops, I misread the question. I thought given value, you want to find rank, not the other way around. If you want to find rank given value, then this is how to do it in O(log N):
Yes, you can do this in O(log N), if the list allows O(1) random access (i.e. it's an array and not a linked list).
Binary search on L1
Binary search on L2
Sum the indices
You'd have to work out the math, +1, -1, what to do if element isn't found, etc, but that's the idea.
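A small sketch of this rank-from-value idea using Python's standard bisect module. It assumes both lists are sorted arrays with O(1) random access and defines the rank as the zero-based count of elements smaller than the value; rank_of is a hypothetical name.

```python
from bisect import bisect_left


def rank_of(value, l1, l2):
    """Zero-based rank of `value` in the union: how many elements are smaller."""
    # bisect_left returns the number of elements strictly less than value.
    return bisect_left(l1, value) + bisect_left(l2, value)


print(rank_of(4, [1, 3, 5, 7], [2, 4, 6, 8]))  # 3
```

Using bisect_left in both lists means the value itself is never counted, so the result is well defined even when the value appears in neither list.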