Removing duplicates from two 2-dimensional lists

I searched for a way to remove duplicates from two 2D lists in Python but couldn't find one, so here is my question:
I have two lists, for example:
[[1,2],[3,5],[4,4],[5,7]]
[[1,3],[4,4],[3,5],[3,5],[5,6]]
Expected result:
[[1,2],[1,3],[5,7],[5,6]]
I want to remove every inner list that EXACTLY matches an inner list in the other list.
My script:
def filter2dim(firstarray, secondarray):
    unique = []
    for i in range(len(firstarray)):
        temp=firstarray[i]
        for j in range(len(secondarray)):
            if(temp == secondarray[j]):
                break
            elif(j==(len(secondarray)-1)):
                unique.append(temp)
    for i in range(len(secondarray)):
        temp=secondarray[i]
        for j in range(len(firstarray)):
            if(temp == firstarray[j]):
                break
            elif(j==(len(firstarray)-1)):
                unique.append(secondarray[i])
    return
Please fix it and explain what you did; I would be grateful.
Thank you, Best Regards

Replace your 2-item lists with tuples and you can use set operations (set items must be hashable; tuples of hashable values are hashable, while lists are mutable and are not):
a = {(1,2),(3,5),(4,4),(5,7)}
b = {(1,3),(4,4),(3,5),(3,5),(5,6)}
print(a.symmetric_difference(b)) # {(1, 2), (5, 7), (5, 6), (1, 3)}
Note this also removes duplicates within each list because they are sets, and order is ignored.
If you need to programmatically convert your lists into tuples, a set comprehension works just fine:
list_a = [[1,2],[3,5],[4,4],[5,7]]
set_a = {(i, j) for i, j in list_a}
print(set_a) # {(1, 2), (4, 4), (5, 7), (3, 5)}
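If the result has to come back as a list of lists, the set approach can be wrapped in a small helper. This is only a sketch: the name symmetric_difference_2d is made up for illustration, and the sorting is there just to get a reproducible order, since sets are unordered.

```python
def symmetric_difference_2d(first, second):
    """Return the inner lists that occur in exactly one of the two inputs."""
    # Lists are not hashable, so convert every inner list to a tuple first.
    set_first = set(map(tuple, first))
    set_second = set(map(tuple, second))
    # ^ is the operator form of set.symmetric_difference.
    only_one_side = set_first ^ set_second
    # Sets have no order; sort for a stable result and convert back to lists.
    return [list(pair) for pair in sorted(only_one_side)]


list_a = [[1, 2], [3, 5], [4, 4], [5, 7]]
list_b = [[1, 3], [4, 4], [3, 5], [3, 5], [5, 6]]
print(symmetric_difference_2d(list_a, list_b))
# [[1, 2], [1, 3], [5, 6], [5, 7]]
```

The inner lists are the same as in the expected result from the question, only in sorted order.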

Your script works fine for me, just add: return unique

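Turn the first list into a dict (this assumes every inner list has exactly two items and that no first item repeats within a). Note that this keeps the elements of b that also occur in a; negate the condition to drop them instead:
a = [[1, 2], [3, 5], [4, 4], [5, 7]]
b = [[1, 3], [4, 4], [3, 5], [3, 5], [5, 6]]
filt = dict(a)
result = [el for el in b if el[0] in filt and el[1] == filt[el[0]]]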
Alternatively, turn the first list into a set of tuples, and just check for membership:
filt = set(map(tuple, a))
result = [el for el in b if tuple(el) in filt]
Both of these solutions avoid scanning the first list once per element of the second, because dict and set lookups are O(1) on average.
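To get the result the question actually asks for (drop whatever appears in both lists, keep the rest in its original order), the same set lookup can be applied in both directions. A minimal sketch, where drop_common is a hypothetical helper name; note that a repeated inner list that never occurs in the other input would be kept as many times as it appears.

```python
def drop_common(first, second):
    """Keep, in original order, the inner lists that the other list lacks."""
    seen_first = set(map(tuple, first))
    seen_second = set(map(tuple, second))
    # Each membership test is a hash lookup instead of a scan of the other list.
    kept = [el for el in first if tuple(el) not in seen_second]
    kept += [el for el in second if tuple(el) not in seen_first]
    return kept


a = [[1, 2], [3, 5], [4, 4], [5, 7]]
b = [[1, 3], [4, 4], [3, 5], [3, 5], [5, 6]]
print(drop_common(a, b))
# [[1, 2], [5, 7], [1, 3], [5, 6]]
```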

Related

How can I sort a list by specific criteria

I have this list and I want to sort it. This is just a smaller example of what I want to do, but I get the same error. I don't understand why I can't make this work.
I have tried using google to solve the problem but without luck.
lst = [3, 4, 5, 6]
if lst < 4:
    lst.pop()
print(lst)
How can I do this? It shows:
TypeError: '<' not supported between instances of 'list' and 'int'
I think your goal is to remove all elements of the list that are less than 4. A simple list comprehension does that:
lst = [3, 4, 5, 6]
lst = [elem for elem in lst if elem >= 4]
print(lst)
Output:
[4, 5, 6]
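The comprehension builds a new list and rebinds the name lst. If other names refer to the same list and should see the change too (which is what pop() would have done), slice assignment is one option. A small sketch; alias is just an illustrative second name:

```python
lst = [3, 4, 5, 6]
alias = lst  # a second name for the same list object

# Slice assignment replaces the contents of the existing list object,
# so every name that refers to it sees the filtered result.
lst[:] = [elem for elem in lst if elem >= 4]

print(lst)    # [4, 5, 6]
print(alias)  # [4, 5, 6]
```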

Python list append different lists in the same scope for the same variable

I wrote an algorithm that shows all the permutations of a list of integers, but I have a problem appending each permuted list to my result list.
The code is Heap's algorithm. A permutation is finished when size == 1, so I append the permuted list V to my final list res. Here's the code:
The function that permutes the list:
def permutations(V, size):
    global res
    if size == 1:
        print(V)
        res.append(V)
    for i in range(0, size):
        permutations(V, size-1)
        if size % 2 == 1:
            V[size-1], V[0] = V[0], V[size-1]
        else:
            V[i], V[size-1] = V[size-1], V[i]
A = [1,2,3]
res = []
permutations(A, len(A))
print(res)
And this is the output:
[1, 2, 3]
[2, 1, 3]
[3, 1, 2]
[1, 3, 2]
[2, 3, 1]
[3, 2, 1]
res: [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]
The printed permutations of V are all correct, but the lists appended to my global res do not reflect them. Each one is appended right after the print, yet the appended list differs from the printed one.
If you change the lines like this:
res.append(V)
|
|
v
D = [V[i] for i in range(len(V))]
res.append(D)
With this change the final result is correct. Can anyone explain how a printed list can differ from an appended list when both use the same variable?
Replacing res.append(V) with res.append(list(V)) fixes your issue.
Every V you appended to res is a reference to the same list object, which the later swaps keep mutating in place. You can observe this by printing the id of each element in the list:
for i in res:
    print(id(i))
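The difference between appending the list itself and appending a copy can be seen in isolation. A minimal sketch, independent of the permutation code; the names aliases and copies are made up for the illustration:

```python
V = [1, 2, 3]

aliases = []
copies = []
for _ in range(2):
    aliases.append(V)          # stores a reference to the one list object
    copies.append(list(V))     # stores a snapshot of the current contents
    V[0], V[-1] = V[-1], V[0]  # mutate V in place, as the swaps do

print(aliases)  # [[1, 2, 3], [1, 2, 3]]
print(copies)   # [[1, 2, 3], [3, 2, 1]]
print(aliases[0] is aliases[1])  # True
print(copies[0] is copies[1])    # False
```

print(V) shows the contents at that moment, while every entry in aliases shows whatever V holds when the list is finally printed.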

Math operations between values in list and list of lists (python3)

I'm stuck on what seems to be an easy problem :
I've got 2 lists of lists, let's say:
a = [[1], [2]]
b = [[1, 2, 3], [4, 5, 6]]
And I want this result :
result = [[2, 3, 4], [6, 7, 8]]
by adding (or, why not, subtracting) the a[0] value to each value of b[0], then a[1] to b[1], etc.
I've tried using zip without result as expected:
result = [x for x in zip(a, b)]
Can someone help me to progress ?
You have a list of one-element lists, and you want to add that element to every element of the corresponding list in the other list. Since the expected result is a list of lists, you need a nested list comprehension, like this:
a = [[1], [2]]
b = [[1, 2, 3], [4, 5, 6]]
result = [[x+v for x in l] for [v],l in zip(a,b)]
print(result)
result:
[[2, 3, 4], [6, 7, 8]]
for [v],l is a neat way of unpacking the single element inside each inner list, so it avoids x+v[0] in the loop and it's more performant (and pythonic). Plus: if an inner list suddenly contains more than 1 element, you'll get an unpacking error instead of an unexpected result (from silently ignoring the extra elements).
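The difference in failure behaviour can be shown with a deliberately malformed input. This is just an illustrative sketch; the extra 99 and the names indexed and unpacked are made up for the demonstration.

```python
a = [[1], [2, 99]]  # the second inner list has an unexpected extra element
b = [[1, 2, 3], [4, 5, 6]]

# Indexing silently ignores the extra element.
indexed = [[x + v[0] for x in l] for v, l in zip(a, b)]
print(indexed)  # [[2, 3, 4], [6, 7, 8]]

# Unpacking with [v] insists on exactly one element per inner list.
try:
    unpacked = [[x + v for x in l] for [v], l in zip(a, b)]
except ValueError as exc:
    unpacked = None
    print("unpacking failed:", exc)
```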
You can do this using numpy, which inherently supports array operations such as this:
>>> import numpy as np
>>> i = np.array([[1], [2]])
>>> j = np.array([[1, 2, 3], [4, 5, 6]])
>>> i+j
array([[2, 3, 4],
       [6, 7, 8]])
If your lists are large, this may have a speed advantage over list comprehensions due to the fact that numpy uses fast low-level routines for this sort of stuff.
If not, and you don't already have numpy installed, then the overhead of installing another library is probably not worth it.

Python 3.2: append two elements at a time to a list (lists in a list)

If I have an input like this (1, 2, 3, 4, 5, 6)
The output has to be ... [[1, 2], [3, 4], [5, 6]].
I know how to do it with one element per inner list, but not with two.
x=[]
for number in numbers:
    x.append([number])
I'd appreciate any help!
Something like this would work:
out = []
lst = (1,2,3,4,5,6,7,8,9,10)
for x in range(len(lst)):
    if x % 2 == 0:
        out.append([lst[x], lst[x+1]])
    else:
        continue
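To use this, just set lst equal to whatever sequence of numbers you want. The final product is stored in out. Note that this assumes an even number of items; with an odd length, lst[x+1] raises an IndexError on the last item.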
There is a shorter way of doing what you want:
result = []
L = (1,2,3,4,5,6,7,8,9,10)
result = [[L[i], L[i + 1]] for i in range(0, len(L) - 1, 2)]
print(result)
You can use something like this. This solution also works for lists of odd length:
def func(lst):
    res = []
    # Go through every 2nd index | 0, 2, 4, ...
    for i in range(0, len(lst), 2):
        # Append a slice of the list, + 2 to include the next value
        res.append(lst[i : i + 2])
    return res
# Output
>>> lst = [1, 2, 3, 4, 5, 6]
>>> func(lst)
[[1, 2], [3, 4], [5, 6]]
>>> lst2 = [1, 2, 3, 4, 5, 6, 7]
>>> func(lst2)
[[1, 2], [3, 4], [5, 6], [7]]
List comprehension solution
def func(lst):
    return [lst[i:i+2] for i in range(0, len(lst), 2)]
Slicing is better in this case because you don't have to guard against IndexError, which also lets it work for odd lengths.
If you want, you can also add another parameter to specify the desired number of inner elements.
def func(lst, size = 2): # default of 2 if none specified
    return [lst[i:i+size] for i in range(0, len(lst), size)]
There are a few hurdles in this problem. You want to iterate through the list without going past its end, and you need to deal with the case where the list has an odd length. Here's one solution that works (for an odd-length list, the unpaired last item is dropped):
def foo(lst):
    result = [[x,y] for [x,y] in zip(lst[0::2], lst[1::2])]
    return result
In case this seems convoluted, let's break the code down.
Index slicing:
lst[0::2] takes every second element of lst, starting at the 0th element. Similarly, lst[1::2] starts at the 1st element (colloquially the second element) and also proceeds in increments of 2.
Example:
>>> lst = (1,2,3,4,5,6,7)
>>> print(lst[0::2])
(1, 3, 5, 7)
>>> print(lst[1::2])
(2, 4, 6)
zip: zip() takes two lists (or any iterables, for that matter) and pairs up their items as tuples. In Python 3 it returns a lazy iterator, so wrap it in list() to see the pairs. Example:
>>> lst1 = (10,20,30, 40)
>>> lst2 = (15,25,35)
>>> print(list(zip(lst1, lst2)))
[(10, 15), (20, 25), (30, 35)]
Notice that zip(lst1, lst2) has the nice property that if one of its arguments is longer than the other, zip() stops zipping as soon as the shortest iterable runs out of items.
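That truncation is also why the unpaired last item of an odd-length input disappears. If it should be kept, itertools.zip_longest pads the shorter input instead. A small sketch; using None as the fill value is an arbitrary choice for the illustration:

```python
from itertools import zip_longest

lst = (1, 2, 3, 4, 5, 6, 7)

# Plain zip stops at the shorter slice, so the trailing 7 is lost.
pairs = [[x, y] for x, y in zip(lst[0::2], lst[1::2])]
print(pairs)  # [[1, 2], [3, 4], [5, 6]]

# zip_longest pads the shorter slice with fillvalue instead.
padded = [[x, y] for x, y in zip_longest(lst[0::2], lst[1::2], fillvalue=None)]
print(padded)  # [[1, 2], [3, 4], [5, 6], [7, None]]
```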
List comprehension: python allows iteration quite generally. Consider the statement:
>>> [[x,y] for [x,y] in zip(lst1,lst2)]
The interior bit "for [x,y] in zip(lst1,lst2)" says "iterate through all pairs of values in zip, and give their values to x and y". The rest of the statement, "[[x,y] for [x,y] ...]", says "for each pair of values that x and y take on, make a list [x,y] to be stored in a larger list". Once this statement executes, you have a list of lists, where the interior lists are the pairs produced by zip(lst1,lst2).
A very clear solution (it assumes an even number of items; with an odd count, next(l) raises StopIteration):
l = (1, 2, 3, 4, 5, 6)
l = iter(l)
w = []
for i in l:
    sub = []
    sub.append(i)
    sub.append(next(l))
    w.append(sub)
print(w)
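The same "take two items from one iterator" idea can be written more compactly by passing the iterator to zip twice. A minimal sketch; with an odd number of items the unpaired last one is silently dropped rather than raising an error:

```python
l = (1, 2, 3, 4, 5, 6)

it = iter(l)
# Both arguments are the same iterator, so each tuple that zip builds
# consumes two consecutive items.
w = [[first, second] for first, second in zip(it, it)]
print(w)  # [[1, 2], [3, 4], [5, 6]]
```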

Python 2-D array: a function like np.unique or union1d

I have 2-D lists/arrays as follows:
list1 = [[1,2],[3,4]]
list2 = [[3,4],[5,6]]
How can I use a function like union1d(x,y) to combine list1 and list2 into one list:
list3 = [[1,2],[3,4],[5,6]]
union1d just does:
unique(np.concatenate((ar1, ar2)))
so if you have a method of finding unique rows, you have the solution.
As described in the suggested link, and elsewhere, you can do this by converting the array to a 1d structured array. A simple version follows.
If arr is:
arr=np.array([[1,2],[3,4],[3,4],[5,6]])
the structured equivalent (a view, same data; this relies on arr holding 32-bit integers, as the '<i4' in the output shows):
In [4]: arr.view('i,i')
Out[4]:
array([[(1, 2)],
       [(3, 4)],
       [(3, 4)],
       [(5, 6)]],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
In [5]: np.unique(arr.view('i,i'))
Out[5]:
array([(1, 2), (3, 4), (5, 6)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
and back to 2d int:
In [7]: np.unique(arr.view('i,i')).view('2int')
Out[7]:
array([[1, 2],
       [3, 4],
       [5, 6]])
This solution does require a certain familiarity with compound dtypes.
Using return_index avoids having to convert the view back. We can index arr directly with that index:
In [54]: idx=np.unique(arr.view('i,i'),return_index=True)[1]
In [55]: arr[idx,:]
Out[55]:
array([[1, 2],
       [3, 4],
       [5, 6]])
For what it's worth, unique does a sort and then uses a mask approach to remove adjacent duplicates.
It's the sort that requires a 1d array; the rest works in 2d.
Here arr is already sorted. A row starts a new group when it differs from the previous row in any column:
In [42]: flag=np.concatenate([[True],(arr[1:,:]!=arr[:-1,:]).any(axis=1)])
In [43]: flag
Out[43]: array([ True, True, False, True], dtype=bool)
In [44]: arr[flag,:]
Out[44]:
array([[1, 2],
       [3, 4],
       [5, 6]])
https://stackoverflow.com/a/16971324/901925 shows this working with lexsort.
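On NumPy versions where np.unique accepts an axis argument (1.13 and later), the structured-array detour may not be needed at all. A short sketch under that version assumption, using the lists from the question:

```python
import numpy as np

list1 = [[1, 2], [3, 4]]
list2 = [[3, 4], [5, 6]]

# Stack both inputs into one 2d array, then keep the unique rows.
# The axis argument needs a NumPy version that supports it (1.13 or later).
stacked = np.concatenate((list1, list2))
list3 = np.unique(stacked, axis=0)
print(list3.tolist())  # [[1, 2], [3, 4], [5, 6]]
```

As with the 1d case, the rows come back sorted rather than in their original order.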
================
The mention of np.union1d set me and Divakar to focus on numpy methods. But if you are starting with lists (of lists), it is likely to be faster to use Python set methods.
For example, using list and set comprehensions:
In [99]: [list(x) for x in {tuple(x) for x in list1+list2}]
Out[99]: [[1, 2], [3, 4], [5, 6]]
You could also take the set for each list, and do a set union.
The tuple conversion is needed because a list isn't hashable.
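The per-list variant could look like the following sketch. The sorted() call is an addition, used only to make the order reproducible, since sets are unordered:

```python
list1 = [[1, 2], [3, 4]]
list2 = [[3, 4], [5, 6]]

# Convert each inner list to a tuple so it can be stored in a set.
set1 = {tuple(row) for row in list1}
set2 = {tuple(row) for row in list2}

# | is the set union operator; sorted() gives a reproducible order.
list3 = [list(row) for row in sorted(set1 | set2)]
print(list3)  # [[1, 2], [3, 4], [5, 6]]
```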
One approach would be to stack those two input arrays vertically with np.vstack and then find the unique rows in the result. It would be memory intensive, as we would discard rows from it afterwards.
Another approach would be to find the rows in the first array that are exclusive to it, i.e. not present in the second array, and then stack just those exclusive rows along with the second array. Of course, this assumes that the rows within each input array are already unique.
The crux of such a memory-saving implementation is getting those exclusive rows from the first array. To do so, we convert each row into a linear index equivalent, treating each row as an indexing tuple on an n-dimensional grid, with n being the number of columns in the input arrays (so the values must be non-negative integers). Thus, assuming the input arrays are arr1 and arr2, we would have an implementation like so -
# Get dim of ndim-grid on which linear index equivalents are to be mapped
dims = np.maximum(arr1.max(0),arr2.max(0)) + 1
# Get linear index equivalents for arr1, arr2
idx1 = np.ravel_multi_index(arr1.T,dims)
idx2 = np.ravel_multi_index(arr2.T,dims)
# Finally get the exclusive rows and stack with arr2 for desired o/p
out = np.vstack((arr1[~np.in1d(idx1,idx2)],arr2))
Sample run -
In [93]: arr1
Out[93]:
array([[1, 2],
       [3, 4],
       [5, 3]])
In [94]: arr2
Out[94]:
array([[3, 4],
       [5, 6]])
In [95]: out
Out[95]:
array([[1, 2],
       [5, 3],
       [3, 4],
       [5, 6]])
For more info on setting up those linear index equivalents, please refer to this post.
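To make the linear index mapping concrete, here is a small sketch using the sample arrays from the run above. It only illustrates the intermediate values; np.isin is used here in place of np.in1d, which gives the same result for 1d inputs.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4], [5, 3]])
arr2 = np.array([[3, 4], [5, 6]])

# Grid shape large enough to hold every (row, col) pair from both arrays.
dims = np.maximum(arr1.max(0), arr2.max(0)) + 1
print(dims.tolist())  # [6, 7]

# Each row (r, c) maps to the single integer r * dims[1] + c.
idx1 = np.ravel_multi_index(arr1.T, dims)
idx2 = np.ravel_multi_index(arr2.T, dims)
print(idx1.tolist())  # [9, 25, 38]
print(idx2.tolist())  # [25, 41]

# Rows of arr1 whose index also appears in idx2 are the shared rows.
shared = np.isin(idx1, idx2)
print(shared.tolist())  # [False, True, False]

# Keep the rows exclusive to arr1 and stack them on top of arr2.
out = np.vstack((arr1[~shared], arr2))
print(out.tolist())  # [[1, 2], [5, 3], [3, 4], [5, 6]]
```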