Numpy - split a matrix considering offsets - python-2.7

Given an m x n matrix I want to split it into square a x a (a = 3 or a = 4) matrices of arbitrary offset (minimal offset = 1, max offset = block size), like Mathematica's Partition function does:
For example, given a 4 x 4 matrix A like
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
If I give 3 x 3 blocks and offset = 1, I want to get the 4 matrices:
1 2 3
5 6 7
9 10 11
2 3 4
6 7 8
10 11 12
5 6 7
9 10 11
13 14 15
6 7 8
10 11 12
14 15 16
If matrix A is A = np.arange(1, 37).reshape((6,6)) and I use 3 x 3 blocks with offset = 3, I want as output the blocks:
1 2 3
7 8 9
13 14 15
4 5 6
10 11 12
16 17 18
19 20 21
25 26 27
31 32 33
22 23 24
28 29 30
34 35 36
I'm OK with matrix A being a list of lists, and I don't think I need NumPy's functionality. I was surprised that neither numpy.array_split nor numpy.split provides this offset option out of the box. Is it more straightforward to code this in pure Python with slicing, or should I look into NumPy's strides? I want the code to be highly legible.

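As you hint, there is a way of doing this with strides (the stride values below assume A has a 4-byte integer dtype, so one row of the 4 x 4 matrix spans 16 bytes):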
In [900]: M = np.lib.stride_tricks.as_strided(A, shape=(2,2,3,3), strides=(16,4,16,4))
In [901]: M
Out[901]:
array([[[[ 1,  2,  3],
         [ 5,  6,  7],
         [ 9, 10, 11]],
        [[ 2,  3,  4],
         [ 6,  7,  8],
         [10, 11, 12]]],
       [[[ 5,  6,  7],
         [ 9, 10, 11],
         [13, 14, 15]],
        [[ 6,  7,  8],
         [10, 11, 12],
         [14, 15, 16]]]])
In [902]: M.reshape(4,3,3) # to get it in form you list
Out[902]:
array([[[ 1,  2,  3],
        [ 5,  6,  7],
        [ 9, 10, 11]],
       [[ 2,  3,  4],
        [ 6,  7,  8],
        [10, 11, 12]],
       [[ 5,  6,  7],
        [ 9, 10, 11],
        [13, 14, 15]],
       [[ 6,  7,  8],
        [10, 11, 12],
        [14, 15, 16]]])
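The hard-coded shape and strides above only fit a 4 x 4 array of 4-byte integers. A minimal sketch of the same idea for any 2-D array, block size and offset could derive them from A.strides instead. The function name strided_blocks and its parameters are made up for illustration, and windows that would run past the edge are assumed to be dropped:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def strided_blocks(A, block, offset):
    """Return a read-only (rows, cols, block, block) view of the blocks of A."""
    A = np.asarray(A)
    s0, s1 = A.strides                       # bytes per row step, bytes per column step
    rows = (A.shape[0] - block) // offset + 1
    cols = (A.shape[1] - block) // offset + 1
    return as_strided(
        A,
        shape=(rows, cols, block, block),
        strides=(offset * s0, offset * s1, s0, s1),
        writeable=False,                     # guard against surprising in-place edits
    )

A = np.arange(1, 17).reshape(4, 4)
M = strided_blocks(A, block=3, offset=1)
print(M.shape)    # (2, 2, 3, 3)
print(M[1, 1])    # the bottom-right 3 x 3 block
```

Passing writeable=False is a deliberate choice here: the blocks overlap, so writing through the view is rarely what you want.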
A problem with strides is that it is advanced, and hard to explain to someone without much numpy experience. (I figured out the form without much trial and error, but I've been hanging around here too long. :) )
But this iterative solution is easier to explain:
In [909]: alist=[]
In [910]: for i in range(2):
     ...:     for j in range(2):
     ...:         alist.append(A[np.ix_(range(i,i+3),range(j,j+3))])
     ...:
In [911]: alist
Out[911]:
[array([[ 1,  2,  3],
        [ 5,  6,  7],
        [ 9, 10, 11]]),
 array([[ 2,  3,  4],
        [ 6,  7,  8],
        [10, 11, 12]]),
 array([[ 5,  6,  7],
        [ 9, 10, 11],
        [13, 14, 15]]),
 array([[ 6,  7,  8],
        [10, 11, 12],
        [14, 15, 16]])]
Which can be turned into an array with np.array(alist). There's nothing wrong with using this if it is clearer.
One thing to keep in mind about the as_strided approach is that M is a view: changes to M may change A, and because the blocks overlap, a change in one place in M may modify several places in M. Reshaping M, however, may turn it into a copy. So overall it's safer to only read values from M and use them in calculations like sum and mean. In-place changes can be unpredictable.
The iterative solution produces copies all around.
The iterative solution with np.ogrid instead of np.ix_ (otherwise the same idea):
np.array([A[np.ogrid[i:i+3, j:j+3]] for i in range(2) for j in range(2)])
Both ix_ and ogrid are just easy ways of constructing the pair of vectors for indexing a block:
In [970]: np.ogrid[0:3, 0:3]
Out[970]:
[array([[0],
[1],
[2]]), array([[0, 1, 2]])]
The same thing but with slice objects:
np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(2) for j in range(2)])
The list version of this would have similar view behavior as the as_strided solution (the elements of the list are views).
For the 6x6 with non-overlapping blocks, try:
In [1016]: np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(0,6,3) for j in range(0,6,3)])
Out[1016]:
array([[[ 1,  2,  3],
        [ 7,  8,  9],
        [13, 14, 15]],
       [[ 4,  5,  6],
        [10, 11, 12],
        [16, 17, 18]],
       [[19, 20, 21],
        [25, 26, 27],
        [31, 32, 33]],
       [[22, 23, 24],
        [28, 29, 30],
        [34, 35, 36]]])
Assuming you want contiguous blocks, the inner slices/ranges don't change, just the stepping for the outer i and j:
In [1017]: np.arange(0,6,3)
Out[1017]: array([0, 3])
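Since the question allows A to be a list of lists, the same slicing idea can be sketched in pure Python with block size and offset as parameters. The name partition is only borrowed from the Mathematica function mentioned in the question, and dropping windows that would run past the edge is an assumption on my part:

```python
def partition(matrix, block, offset):
    """Split a list-of-lists matrix into block x block sub-matrices.

    A window starts every `offset` rows and columns; windows that would
    run past the edge of the matrix are dropped.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        [row[j:j + block] for row in matrix[i:i + block]]
        for i in range(0, n_rows - block + 1, offset)
        for j in range(0, n_cols - block + 1, offset)
    ]

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
for blk in partition(A, block=3, offset=1):
    print(blk)
```

Each block is built from list slices, so the blocks are independent copies of the rows' contents and modifying one does not touch A.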

Sort an integer array by converting an element to its sum of numbers

The question I am given is
We are given an array.
In one operation we can replace any element of the array with any two elements that sum to that element.
For example: array = {4, 11, 7}. In one operation you can replace array[1] with 5 and 6 which sums to 11. So the array becomes array = {4, 5, 6, 7}
Return the minimum number of steps in which the whole array can be sorted in non-decreasing order. Along with array in sorted order.
For example: array = {3,9,3}
I think the answer is that 9 will be converted to 3, 3, 3.
But I cannot think of a general formula for doing it.
My thoughts on the solution are
Suppose we want to convert number 6 and 9
We use if and else
IF
we divide a number by 2 and take the ceiling, but the result is greater than the number on its right side (as in the last example in the question), then we keep subtracting that right-hand number (3) until we reach 0.
That is, 9 - 3 - 3 - 3 = 0, where 3 is the number to the right of 9 in the last example.
ELSE
simply take ceiling(num / 2) as the first number and num - ceiling(num / 2) as the second. 7 becomes 4 and 3.
Please can someone think of a general formula for doing it?
Edy's way (as I interpret it) in Python:
def solve(xs):
    limit = 10**100
    out = []
    for x in reversed(xs):
        parts = (x - 1) // limit + 1
        limit, extra = divmod(x, parts)
        out += extra * [limit+1] + (parts - extra) * [limit]
    print(len(out) - len(xs), out[::-1])

solve([4, 11, 7])
solve([3, 9, 3])
solve([9, 4, 15, 15, 28, 23, 13])
Output showing steps and result array for the three test cases:
1 [4, 5, 6, 7]
2 [3, 3, 3, 3, 3]
8 [3, 3, 3, 4, 5, 5, 5, 7, 8, 9, 9, 10, 11, 12, 13]
An output illustrating the progress:
[4, 11, 7] = (input)
[4, 11, [7]]
[4, [5, 6], [7]]
[[4], [5, 6], [7]]
[3, 9, 3] = (input)
[3, 9, [3]]
[3, [3, 3, 3], [3]]
[[3], [3, 3, 3], [3]]
[9, 4, 15, 15, 28, 23, 13] = (input)
[9, 4, 15, 15, 28, 23, [13]]
[9, 4, 15, 15, 28, [11, 12], [13]]
[9, 4, 15, 15, [9, 9, 10], [11, 12], [13]]
[9, 4, 15, [7, 8], [9, 9, 10], [11, 12], [13]]
[9, 4, [5, 5, 5], [7, 8], [9, 9, 10], [11, 12], [13]]
[9, [4], [5, 5, 5], [7, 8], [9, 9, 10], [11, 12], [13]]
[[3, 3, 3], [4], [5, 5, 5], [7, 8], [9, 9, 10], [11, 12], [13]]
Code for that:
def solve(xs):
    print(xs, '= (input)')
    limit = 10**100
    for i, x in enumerate(reversed(xs)):
        parts = (x - 1) // limit + 1
        limit, extra = divmod(x, parts)
        xs[~i] = (parts - extra) * [limit] + extra * [limit+1]
        print(xs)
    print()
You would want to scan from the right to the left. For convenient explanation, let's mark the right-most element x_0, and the left-most x_{n-1} (n can increase as you split a number into two).
If x_{i} > x_{i-1}, you would want to divide x_{i} into ((x_{i} - 1) / x_{i-1}) + 1 parts, where / is integer division, as evenly as possible.
So for example:
If x_{i} = 15, x_{i-1} = 5, divide x_{i} into (15-1)/5 + 1 = 3 parts: (5, 5, 5).
If x_{i} = 19, x_{i-1} = 5, divide x_{i} into (19-1)/5 + 1 = 4 parts: (4, 5, 5, 5).
(To divide a number equally into a non-decreasing sequence would require a bit of calculation, which shouldn't be too difficult.)
Once you know the sequence, it would be straightforward to repeatedly split a number into 2 to produce that sequence.
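The "bit of calculation" for the even split can be sketched as follows. The helper name even_parts and its limit parameter (the value of the element to the right) are hypothetical names, and positive integers are assumed:

```python
def even_parts(x, limit):
    """Split x into the fewest parts, each <= limit, as evenly as possible.

    The parts are returned in non-decreasing order.
    """
    parts = (x - 1) // limit + 1          # ceil(x / limit) for positive integers
    base, extra = divmod(x, parts)
    # `extra` of the parts must be one larger; they go last to keep the order.
    return (parts - extra) * [base] + extra * [base + 1]

print(even_parts(15, 5))   # [5, 5, 5]
print(even_parts(19, 5))   # [4, 5, 5, 5]
print(even_parts(11, 7))   # [5, 6]
```

Producing k parts from one number takes k - 1 operations, and the smallest part (the first one returned) becomes the limit for the next element to the left.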

How to remove many elements from the list by checking its index in Maxima CAS?

I use Maxima CAS to create the list:
a:makelist(i,i,1,20);
result:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
I want to slim the list and leave only every third element. To find it I check index i of the list a :
mod(i,3)>0
to find elements.
My code :
l:length(a);
for i:1 thru l step 1 do if (mod(i,3)>0) then a:delete(a[i],a);
Of course it does not work because length of a is changing.
I can do it using second list:
b:[];
for i:1 thru l step 1 do if (mod(i,3)=0) then b:cons(a[i],b);
Is it the best method ?
There are different ways to solve this, as you know already. My advice is to construct a list of the indices you want to keep, and then construct the list of elements from that. E.g.:
(%i1) a:makelist(i,i,1,20);
(%o1) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
(%i2) ii : sublist (a, lambda ([a1], mod(a1, 3) = 0));
(%o2) [3, 6, 9, 12, 15, 18]
(%i3) makelist (a[i], i, ii);
(%o3) [3, 6, 9, 12, 15, 18]
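The key part is the last step, makelist(a[i], i, ii), where ii is the list of indices you want to select. Note that (%i2) filters the values of a, which coincide with the indices only because a[i] = i in this example. ii might be constructed in various ways. Here is a different way to construct the list of indices: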
(%i4) ii : makelist (3*i, i, 1, 6);
(%o4) [3, 6, 9, 12, 15, 18]
One simple way (I do not know which one is best or faster) with compact code: makelist(a[3*i],i,1,length(a)/3)
Test example:
l1:makelist(i,i,1,12)$
l2:makelist(i,i,1,14)$
l3:[2,3,5,7,11,13,17,19,23,29]$
for a in [l1,l2,l3] do (
    b:makelist(a[3*i],i,1,length(a)/3),
    print(a,"=>",b)
)$
Result:
[1,2,3,4,5,6,7,8,9,10,11,12] => [3,6,9,12]
[1,2,3,4,5,6,7,8,9,10,11,12,13,14] => [3,6,9,12]
[2,3,5,7,11,13,17,19,23,29] => [5,13,23]

Writing to a sub-ndarray of an ndarray in the most Pythonic way. Python 2

I have a ndarray like this one:
number_of_rows = 3
number_of_columns = 3
a = np.arange(number_of_rows*number_of_columns).reshape(number_of_rows,number_of_columns)
a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
But I want something like this:
array([[0, 100, 101],
[3, 102, 103],
[6, 7, 8]])
I want to avoid doing this one element at a time; I would rather work with whole arrays or matrices, because I want to extend the code later.
Note that I have changed a submatrix of the initial matrix (a submatrix in mathematical terms, an ndarray in terms of this example). In the example, the columns considered are [1,2] and the rows [0,1].
columns_to_keep = [1,2]
rows_to_keep = [0,1]
My first try was to do:
a[rows_to_keep,:][:,columns_to_keep] = np.asarray([[100,101],[102,103]])
However, this doesn't modify the initial a, and I don't get any error, so a is still:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
So I have implemented a piece of code that does the job:
b = [[100, 101],[102, 103]]
for i in range(len(rows_to_keep)):
    a[i,columns_to_keep] = b[i]
Although the previous lines do the job, I am wondering how to do it with slicing and in a faster fashion. It should also work in such a way that with:
columns_to_keep = [0,2]
rows_to_keep = [0,2]
the desired output is
array([[100, 1, 101],
[3, 4, 5],
[102, 7, 103]]).
Many thanks!
Indexing with lists like [1,2] is called advanced indexing. By itself it produces a copy, not a view. To assign or change values, you have to use one indexing expression, not two. That is, a[[1,2],:] is a copy, and a[[1,2],:][:,[1,2]] += 100 modifies that copy, not the original a.
In [68]: arr = np.arange(12).reshape(3,4)
Indexing with slices; this is basic indexing:
In [69]: arr[1:,2:]
Out[69]:
array([[ 6, 7],
[10, 11]])
In [70]: arr[1:,2:] += 100
In [71]: arr
Out[71]:
array([[ 0, 1, 2, 3],
[ 4, 5, 106, 107],
[ 8, 9, 110, 111]])
Doing the same indexing with lists requires arrays that 'broadcast' against each other. ix_ is a handy way of generating these:
In [73]: arr[np.ix_([1,2],[2,3])]
Out[73]:
array([[106, 107],
[110, 111]])
In [74]: arr[np.ix_([1,2],[2,3])] -= 100
In [75]: arr
Out[75]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Here's what ix_ produces - a tuple of arrays, one is (2,1) in shape, the other (1,2). Together they index a (2,2) block:
In [76]: np.ix_([1,2],[2,3])
Out[76]:
(array([[1],
[2]]), array([[2, 3]]))
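Applied to the question's second case (rows [0,2] and columns [0,2]), a minimal sketch using the question's own variable names looks like this:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows_to_keep = [0, 2]
columns_to_keep = [0, 2]
b = np.array([[100, 101], [102, 103]])

# A single advanced-indexing expression on the left-hand side assigns
# into `a` itself, not into a temporary copy.
a[np.ix_(rows_to_keep, columns_to_keep)] = b
print(a)
```

This should give the desired [[100, 1, 101], [3, 4, 5], [102, 7, 103]], and the same line works for the contiguous case rows [0,1], columns [1,2].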
For the case of contiguous rows and columns, you can use basic slicing like this:
In [634]: a
Out[634]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [635]: b = np.asarray([[100, 101],[102, 103]])
In [636]: a[:rows_to_keep[1]+1, columns_to_keep[0]:] = b
In [637]: a
Out[637]:
array([[ 0, 100, 101],
[ 3, 102, 103],
[ 6, 7, 8]])

Dictionary Keys-Repeat (List<int>) in Python

This is an assignment; I have worked on it and got stuck somewhere.
This is the input from text file:
min: 1,2,3,5,6
max: 1,2,3,5,6
avg: 1,2,3,5,6
p90: 1,2,3,4,5,6,7,8,9,10
sum: 1,2,3,5,6
min: 1,5,6,14,24
max: 2,3,9
p70: 1,2,3
This is the required output to the text file:
The min of [1, 2, 3, 5, 6] is 1
The max of [1, 2, 3, 5, 6] is 6
The avg of [1, 2, 3, 5, 6] is 3.4
The 90th percentile of [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] is 9
The sum of [1, 2, 3, 5, 6] is 17
The min of [1, 5, 6, 14, 24] is 1
The max of [2, 3, 9] is 9
The 70th percentile of [1, 2, 3] is 2
This is my actual output to the text file:
The min of [1, 5, 6, 14, 24] is 1
The max of [2, 3, 9] is 9
The avg of [1, 2, 3, 5, 6] is 3.4
The p90 of [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] is 9.0
The sum of [1, 2, 3, 5, 6] is 17
The p70 of [1, 2, 3] is 2.1
Logic
I wrote a function to read from a file and insert the key: value pairs into a dictionary.
Below is the dictionary
OrderedDict([('min', [1, 5, 6, 14, 24]), ('max', [2, 3, 9]), ('avg', [1, 2, 3, 5, 6]), ('p90', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), ('sum', [1, 2, 3, 5, 6]), ('p70', [1, 2, 3])])
From here I compute the required and write the results to the file
My question: how can I keep the duplicate min and max keys in the dictionary? As you can see, they have been overwritten.
The problem is, that the keys in a dictionary are unique. That means, a dictionary can only have one entry with the key 'min'. That's why your first entry with the key 'min' gets overwritten by the second.
To solve this, I would recommend changing the data structure from a dictionary to something else (like a nested list). The variable is named rows here so that it does not shadow the built-in list:
rows = []
rows.append(['min', [1, 2, 3, 5, 6]])
You will get a list of rows, each containing the function name (like 'min') and the list of numbers.
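As a rough sketch of how the file's lines could be read into such a structure, assuming every non-blank line has the form shown in the question (name, colon, comma-separated integers); parse_lines is a made-up helper name:

```python
def parse_lines(lines):
    """Parse 'name: 1,2,3' lines into a list of (name, numbers) pairs."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        name, _, values = line.partition(":")
        numbers = [int(v) for v in values.split(",")]
        rows.append((name.strip(), numbers))
    return rows

sample = ["min: 1,2,3,5,6", "max: 1,2,3,5,6", "min: 1,5,6,14,24"]
for name, numbers in parse_lines(sample):
    print(name, numbers)
```

Because each line becomes its own entry, the second min no longer overwrites the first, and the input order is kept for writing the results.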

Select the value in the matrix/ array/ list

I am a beginner in Python programming. What is the difference between:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
with
a = [0 1 2 3 4 5 6 7 8 9]
I have
a = [0 1 2 3 4 5 6 7 8 9]
I want to form a matrix / array / list with values <= 6, in order to obtain:
a1 = [0 1 2 3 4 5 6]
How do I get the a1?
Sorry if my question has been asked before.
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
is a valid list,
a = [0 1 2 3 4 5 6 7 8 9]
is not a valid list
Assuming you want to turn:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
into
a = [0, 1, 2, 3, 4, 5, 6]
you could use list comprehension:
a1 = [x for x in a if x <= 6]
or a for loop:
a1 = []
for x in a:
    if x <= 6:
        a1.append(x)
The list comprehension solution is more pythonic though.
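One guess about the form without commas: that is how NumPy prints an array. If a is in fact a NumPy array (an assumption, the question does not say so), a boolean mask does the same selection:

```python
import numpy as np

a = np.arange(10)        # prints as [0 1 2 3 4 5 6 7 8 9]
a1 = a[a <= 6]           # boolean mask keeps the elements where the test is True
print(a1)                # [0 1 2 3 4 5 6]
```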