How to produce permutations with replacement in Python - python-2.7

I am trying to write some code (as part of a larger script) to develop numpy arrays of length n, which I can use to change the sign of an input list of length n, in all possible ways.
I am trying to produce all possible permutations of 1 and -1 of length n.
If I use itertools.permutations, it does not work for a length greater than 2, because it does not allow repetitions and there are only two values to choose from. If I use itertools.combinations_with_replacement, then not all of the permutations are produced. I need "permutations_with_replacement".
I tried to use itertools.product, but I cannot get it to work.
Here is my code so far (n is an unknown number, depending on the length of the input list).
import numpy as np
import itertools
ones = [-1, 1]
multiplier = np.array([x for x in itertools.combinations_with_replacement(ones, n)])

Perhaps this is what you want?
>>> import itertools
>>> choices = [-1, 1]
>>> n = 3
>>> l = [choices] * n
>>> l
[[-1, 1], [-1, 1], [-1, 1]]
>>> list(itertools.product(*l))
[(-1, -1, -1), (-1, -1, 1), (-1, 1, -1), (-1, 1, 1), (1, -1, -1), (1, -1, 1), (1, 1, -1), (1, 1, 1)]
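The same result can be had with the repeat argument of itertools.product, which avoids building the list of lists by hand. A minimal sketch, assuming the goal is a NumPy array with one sign pattern per row; the helper name sign_patterns and the sample values are made up for illustration:

```python
import itertools

import numpy as np


def sign_patterns(n):
    """Return an array of shape (2**n, n) with every -1/+1 pattern of length n."""
    return np.array(list(itertools.product([-1, 1], repeat=n)))


values = np.array([3, 5, 7])             # hypothetical input list
multiplier = sign_patterns(len(values))  # one sign pattern per row
flipped = multiplier * values            # broadcasting applies each pattern to values
```

Each row of flipped is the input with one particular combination of signs applied.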

Related

Python List - weird behaviour

I was trying to solve the 3Sum problem on LeetCode, but I observed a Python list behaving differently after the loop had finished.
def threeSum(nums):
    n=len(nums)
    sum = {}
    result = []
    for i in range(n):
        for j in range(i+1,n):
            if i != j:
                key = nums[i]+nums[j]
                if key not in sum:
                    sum[key] = [nums[i],nums[j]]
    for i in range(n):
        if -nums[i] in sum:
            temp = sum[-nums[i]]
            temp.append(nums[i])
            if(len(temp)<=3):
                result.append(temp)
                print(result)
    print("at the end of loop")
    print(result)
    return "result printed"
nums = [-1,0,1,2,-1,-4]
print(threeSum(nums))
For the above function I got the output as
[[-1, 2, -1]]
[[-1, 2, -1], [-1, 1, 0]]
[[-1, 2, -1], [-1, 1, 0], [-1, 0, 1]]
[[-1, 2, -1], [-1, 1, 0], [-1, 0, 1], [-1, -1, 2]]
at the end of loop
[[-1, 2, -1, -1], [-1, 1, 0], [-1, 0, 1], [-1, -1, 2]]
result printed
From the output you can see that during the last iteration of the loop the result List variable contains the value [[-1, 2, -1], [-1, 1, 0], [-1, 0, 1], [-1, -1, 2]] but when I print the same result at the end of the loop it is printed as [[-1, 2, -1, -1], [-1, 1, 0], [-1, 0, 1], [-1, -1, 2]] , the first element in List is changed.
How do you explain this? Am I missing something in the understanding of Python Lists?
P.S : Please ignore the solution of 3Sum problem, I already found another way to solve it, my question is regarding the Python List only
In Python, a list is a reference value. Your code looks up sum[1] twice, and both lookups return the same list instance. That is why, after the second lookup, that list instance has one more number appended to it.
This behavior is caused by two issues:
if(len(temp)<=3): skips both the append and the print once temp has more than three elements, so the final change is never printed inside the loop
Python lists are mutable, and a list can be modified from any place that references the same object
In your case, in the iteration with i = 4, result[0] and temp reference the same object. This is why result changes even though it was not apparently touched: it changed because temp was appended to. You can check this with additional prints that show the current iteration, result, and the object ids.
for i in range(n):
    print(i)
    print(result)
    if -nums[i] in sum:
        temp = sum[-nums[i]]
        temp.append(nums[i])
        if(len(temp)<=3):
            result.append(temp)
            print(result)
        # equal ids mean result[0] and temp are the same list object
        print(id(result[0]))
        print(id(temp))
print("at the end of loop")
print(result)
return "result printed"
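The aliasing can also be reproduced without the 3Sum logic. A minimal sketch, assuming a one-entry dictionary as a stand-in for the question's sum dict; the names pair_sums, fresh_sums and safe_result are made up for illustration:

```python
pair_sums = {1: [-1, 2]}   # hypothetical stand-in for the question's `sum` dict
result = []

# First lookup: temp is the very list stored in the dict, not a copy.
temp = pair_sums[1]
temp.append(-1)
result.append(temp)
snapshot_before = [list(row) for row in result]   # copy of what was printed in the loop

# Second lookup of the same key returns the same object again,
# so appending to it also changes result[0].
temp = pair_sums[1]
temp.append(-1)
same_object = result[0] is pair_sums[1]

# One way to avoid the aliasing: build a new list instead of appending in place.
fresh_sums = {1: [-1, 2]}
safe_result = []
for extra in (-1, -1):
    triple = fresh_sums[1] + [extra]   # list concatenation creates a new list
    safe_result.append(triple)
```

Here result[0] grows to four elements after the second append, while the lists in safe_result stay at three because each one is a separate object.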

Accessing specific pairwise distances in a distance matrix (scipy / numpy)

I am using scipy and its cdist function to compute a distance matrix from an array of vectors.
import numpy as np
from scipy.spatial import distance
vectorList = [(0, 10), (4, 8), (9.0, 11.0), (14, 14), (16, 19), (25.5, 17.5), (35, 16)]
#Convert to numpy array
arr = np.array(vectorList)
#Compute the distance matrix and set self-comparisons to NaN
d = distance.cdist(arr, arr)
np.fill_diagonal(d, np.nan)
Let's say I want to return all the distances that are below a specific threshold (6 for example)
#Find pairs of vectors whose separation distance is < 6
id1, id2 = np.nonzero(d<6)
#id1 --> array([0, 1, 1, 2, 2, 3, 3, 4])
#id2 --> array([1, 0, 2, 1, 3, 2, 4, 3])
I now have 2 arrays of indices.
Question: how can I return the distances between these pairs of vectors as an array / list ?
4.47213595499958 #d[0][1]
4.47213595499958 #d[1][0]
5.830951894845301 #d[1][2]
5.830951894845301 #d[2][1]
5.830951894845301 #d[2][3]
5.830951894845301 #d[3][2]
5.385164807134504 #d[3][4]
5.385164807134504 #d[4][3]
d[id1][id2] returns a matrix, not a list, and the only way I found so far is to iterate over the distance matrix again which doesn't make sense.
np.array([d[i1][i2] for i1, i2 in zip(id1, id2)])
Use
d[id1, id2]
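This is the form that the numpy.nonzero example shows (i.e. a[np.nonzero(a > 3)]), which is different from the d[id1][id2] you are using: d[id1] first selects whole rows, and [id2] then selects rows again from that intermediate result.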
See arrays.indexing for more details on numpy indexing.
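To see the difference between the two indexing forms, here is a minimal sketch using the points from the question. To keep it self-contained it builds the distance matrix with NumPy broadcasting rather than scipy's cdist, which is my substitution and not part of the original code; the variable names are also mine:

```python
import numpy as np

points = np.array([(0, 10), (4, 8), (9.0, 11.0), (14, 14),
                   (16, 19), (25.5, 17.5), (35, 16)])

# Pairwise Euclidean distances via broadcasting (stand-in for distance.cdist).
diff = points[:, None, :] - points[None, :, :]
d = np.sqrt((diff ** 2).sum(axis=-1))
np.fill_diagonal(d, np.nan)

# Comparisons with NaN are False; silence the warning some NumPy versions emit.
with np.errstate(invalid="ignore"):
    close = d < 6

id1, id2 = np.nonzero(close)

pair_distances = d[id1, id2]   # one value per (id1[k], id2[k]) pair, shape (8,)
chained = d[id1][id2]          # selects rows twice, shape (8, 7)
by_mask = d[close]             # boolean mask gives the same values in the same order
```

If the index arrays are not needed for anything else, indexing with the boolean mask directly skips the np.nonzero step.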

Strange behavior modifying a list while looping over it (Python)

I understand that modifying a list while iterating over it can spell disaster. I was curious so I tried it anyway. In the first few examples below, things go as expected; but then something unusual happens in the second to last example.
>>> A = [0, 0, 0, 0]
>>> for k in A:
...     if k == 0:
...         A.remove(k)
>>> A
[0, 0]
>>> A = [0, 0, 0, 0, 1]
>>> for k in A:
...     if k == 0:
...         A.remove(k)
>>> A
[0, 0, 1]
>>> A = [0, 0, 0, 0, 1, 1]
>>> for k in A:
...     if k == 0:
...         A.remove(k)
>>> A
[0, 0, 1, 1]
>>> A = [0, 0, 0, 0, 1, 1, 0] # Why does the presence of a fifth zero (the one at the end), cause an earlier zero to be removed?
>>> for k in A:
...     if k == 0:
...         A.remove(k)
>>> A
[0, 1, 1, 0]
>>> A = [0, 0, 0, 0, 1, 1, 2]
>>> for k in A:
...     if k == 0:
...         A.remove(k)
>>> A
[0, 0, 1, 1, 2]
I am not a Python expert, but this is how I imagine the for loop over a list is implemented:
i = 0
while i < len(array):   # the length is checked again on every iteration
    loop_body(array[i])
    i += 1
Then for the second example:
len = 5, array=[0, 0, 0, 0, 1]
First iteration: i=0, array=[0, 0, 0, 0, 1], zero in array[0] is removed.
Second iteration: i=1, array=[0, 0, 0, 1], zero in array[1] is removed.
Third iteration: i=2, array=[0, 0, 1], array[2] == 1, nothing happens. The loop then ends, because i=3 is past the end of the shortened list.
And this is your result. The same applies to the last example.
The remove method removes the first occurrence of x (I knew that, but forgot about it completely!) So when I execute this code:
>>> A = [0, 0, 0, 0, 1, 1, 0]
>>> for k in A:
...     if k == 0:
...         A.remove(k)
the presence of the zero at the end causes the zero in front of it (the one that was skipped over while iterating) to be removed. This produces:
>>> A
[0, 1, 1, 0]
I expected:
>>> A
[0, 0, 1, 1, 0]
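Both explanations can be checked by replaying the loop with an explicit index. A minimal sketch, assuming the index-based model of list iteration described above; the function name remove_zeros_while_iterating is made up for illustration:

```python
def remove_zeros_while_iterating(values):
    """Replay `for k in A: if k == 0: A.remove(k)` with an explicit index.

    Returns the final list and a trace of (index, list after that step).
    """
    items = list(values)
    trace = []
    i = 0
    while i < len(items):        # the length shrinks as items are removed
        k = items[i]
        if k == 0:
            items.remove(k)      # removes the FIRST zero, not the one at index i
        trace.append((i, list(items)))
        i += 1
    return items, trace


final, trace = remove_zeros_while_iterating([0, 0, 0, 0, 1, 1, 0])

# A safe alternative: build a new list instead of mutating during iteration.
without_zeros = [k for k in [0, 0, 0, 0, 1, 1, 0] if k != 0]
```

The trace shows the index reaching the trailing zero at i=4, at which point remove deletes the first remaining zero rather than the trailing one.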

Transforming a mapping matrix

I have a 2-dimensional array of ones and zeros called M where the g rows represent groups and the a columns represent articles. M maps groups and articles. If a given article "art" belongs to group "gr" then we have M[gr,art]=1; if not we have M[gr,art]=0.
Now, I would like to convert M into a square a x a matrix of ones and zeros (call it N) where if an article "art1" is in the same group as article "art2", we have N(art1,art2)=1 and N(art1,art2)=0 otherwise. N is clearly symmetric with 1's in the diagonal.
How do I construct N based on M?
Many thanks for your suggestions - and sorry if this is trivial (still new to python...)!
So you have a boolean matrix M like the following:
>>> M
array([[1, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 1, 0, 0, 0],
       [1, 0, 1, 0, 0, 0]])
>>> ngroups, narticles = M.shape
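and what you want is a matrix of shape (narticles, narticles) that represents co-occurrence. Since the articles are the columns of M, that is the product of M's transpose with M:
>>> np.dot(M.T, M)
array([[2, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [1, 0, 2, 0, 0, 0],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 1]])
... except that you don't want counts, so set entries > 0 to 1. (np.dot(M, M.T) would instead give the (ngroups, ngroups) matrix of groups that share an article.)
>>> N = np.dot(M.T, M)
>>> N[N > 0] = 1
>>> N
array([[1, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 1]])
Articles 1 and 4 belong to no group in this example, so their rows and columns, including the diagonal entries, stay 0.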
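As a self-contained sketch of the article-by-article matrix the question asks for: the rows of M are groups and the columns are articles, so taking the product with the transpose on the left gives a result of shape (narticles, narticles). The function name same_group_matrix is made up for illustration:

```python
import numpy as np

M = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0]])


def same_group_matrix(membership):
    """Return N with N[a1, a2] == 1 when articles a1 and a2 share a group.

    `membership` has one row per group and one column per article.
    """
    shared_counts = np.dot(membership.T, membership)   # shape (narticles, narticles)
    return (shared_counts > 0).astype(int)             # counts -> 0/1 flags


N = same_group_matrix(M)
```

A diagonal entry of N is 1 only if that article belongs to at least one group, so the diagonal is all ones only when every article is assigned somewhere.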

Creating lists of lists in a pythonic way

I'm using a list of lists to store a matrix in python. I tried to initialise a 2x3 Zero matrix as follows.
mat=[[0]*2]*3
However, when I change the value of one of the items in the matrix, it changes the value of that entry in every row, since the id of each row in mat is the same. For example, after assigning
mat[0][0]=1
mat is [[1, 0], [1, 0], [1, 0]].
I know I can create the Zero matrix using a loop as follows,
mat=[[0]*2]
for i in range(1,3):
    mat.append([0]*2)
but can anyone show me a more pythonic way?
Use a list comprehension:
>>> mat = [[0]*2 for x in xrange(3)]
>>> mat[0][0] = 1
>>> mat
[[1, 0], [0, 0], [0, 0]]
Or, as a function:
def matrix(rows, cols):
    return [[0]*cols for x in xrange(rows)]
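To make the difference between the two constructions visible, here is a minimal sketch that compares row identity in both. It is written with range so that it also runs on Python 3 (the answers above use Python 2's xrange); the names zero_matrix, aliased and independent are made up for illustration:

```python
def zero_matrix(rows, cols):
    """Build a rows x cols matrix where every row is its own list object."""
    return [[0] * cols for _ in range(rows)]


aliased = [[0] * 2] * 3          # three references to ONE inner list
independent = zero_matrix(3, 2)  # three separate inner lists

aliased[0][0] = 1
independent[0][0] = 1

rows_shared = aliased[0] is aliased[1]             # True: same object
rows_separate = independent[0] is not independent[1]
```

The inner [0] * cols is safe because integers are immutable; only the outer multiplication copies references to a mutable list.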
Try this:
>>> cols = 6
>>> rows = 3
>>> a = [[0]*cols for _ in [0]*rows]
>>> a
[[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
>>> a[0][3] = 2
>>> a
[[0, 0, 0, 2, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
This is also discussed in this answer:
>>> lst_2d = [[0] * 3 for i in xrange(3)]
>>> lst_2d
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> lst_2d[0][0] = 5
>>> lst_2d
[[5, 0, 0], [0, 0, 0], [0, 0, 0]]
This one is faster than the accepted answer!
Using xrange(rows) instead of [0]*rows makes no difference.
>>> from itertools import repeat
>>> rows,cols = 3,6
>>> a=[x[:] for x in repeat([0]*cols,rows)]
A variation that doesn't use itertools and runs around the same speed
>>> a=[x[:] for x in [[0]*cols]*rows]
From ipython:
In [1]: from itertools import repeat
In [2]: rows=cols=10
In [3]: timeit a = [[0]*cols for _ in [0]*rows]
10000 loops, best of 3: 17.8 us per loop
In [4]: timeit a=[x[:] for x in repeat([0]*cols,rows)]
100000 loops, best of 3: 12.7 us per loop
In [5]: rows=cols=100
In [6]: timeit a = [[0]*cols for _ in [0]*rows]
1000 loops, best of 3: 368 us per loop
In [7]: timeit a=[x[:] for x in repeat([0]*cols,rows)]
1000 loops, best of 3: 311 us per loop
I use
mat = [[0 for col in range(3)] for row in range(2)]
although depending on what you do with the matrix after you create it, you might take a look at using a NumPy array.
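For reference, a minimal sketch of the NumPy route, assuming integer entries are wanted (np.zeros produces floats unless a dtype is given):

```python
import numpy as np

mat = np.zeros((3, 2), dtype=int)   # 3 rows, 2 columns, all zeros
mat[0, 0] = 1                       # changes a single entry only
```

The array owns one block of memory, so there is no row aliasing to worry about.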
This will work
cols = 2
rows = 3
mat = [[0] * cols for _ in xrange(rows)]
What about:
>>> m, n = 2, 3
>>> A = [[0]*m for _ in range(n)]
>>> A
[[0, 0], [0, 0], [0, 0]]
>>> A[0][0] = 1
>>> A
[[1, 0], [0, 0], [0, 0]]
Aka List comprehension; from the docs:
List comprehensions provide a concise way to create lists without resorting to use of map(), filter() and/or lambda. The resulting list definition tends often to be clearer than lists built using those constructs.
If the sizes involved are really only 2 and 3,
mat = [[0, 0], [0, 0], [0, 0]]
is easily best and hasn't been mentioned yet.
Is there anything itertools can't do? :)
>>> from itertools import repeat,izip
>>> rows=3
>>> cols=6
>>> A=map(list,izip(*[repeat(0,rows*cols)]*cols))
>>> A
[[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
>>> A[0][3] = 2
>>> A
[[0, 0, 0, 2, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
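The trick in this last answer is that the same iterator object is listed cols times, so each tuple produced by zip consumes cols consecutive items from it. A minimal Python 3 sketch of the same idea, where the built-in zip replaces izip and a list comprehension replaces map(list, ...); the name flat_zeros is made up for illustration:

```python
from itertools import repeat

rows, cols = 3, 6

flat_zeros = repeat(0, rows * cols)   # a single iterator yielding 18 zeros
# [flat_zeros] * cols holds the SAME iterator cols times, so every tuple
# that zip builds pulls cols consecutive items from it.
A = [list(row) for row in zip(*[flat_zeros] * cols)]

A[0][3] = 2   # rows are separate lists, so only the first row changes
```

This is more of a curiosity than a recommendation; the plain list comprehension is easier to read.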