I need to sort a std::vector by index. Let me explain it with an example:
Imagine I have a std::vector of 12 positions (but can be 18 for example) filled with some values (it doesn't have to be sorted):
Vector Index: 0 1 2 3 4 5 6 7 8 9 10 11
Vector Values: 3 0 2 3 2 0 1 2 2 4 5 3
I want to reorder it in blocks of 3 indices. This means: the first 3 [0-2] stay, then come [6-8], and then the others. So it will end up like this (new index 3 has the value of previous index 6):
Vector Index: 0 1 2 3 4 5 6 7 8 9 10 11
Vector Values: 3 0 2 1 2 2 3 2 0 4 5 3
I'm trying to do it in one line using std::sort + a lambda but I can't get it to work. I also discovered the std::partition() function and tried to use it, but the result was really bad hehe
I also found this similar question, which orders by odd and even index, but I can't figure out how to apply it to my case, or even whether that is possible: Sort vector by even and odd index
Thank you so much!
Note 0: No, my vector is not always sorted. It was just an example. I've changed the values.
Note 1: I know it sounds strange... think of the vector positions like this: yes yes yes no no no yes yes yes no no no yes yes yes... so the 'yes' positions keep their relative order but come before the 'no' positions.
Note 2: If there isn't a way to do it with a lambda, I thought of doing it with a loop and auxiliary variables, but I think that's uglier.
Note 3: Another example:
Vector Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Vector Values: 3 0 2 3 2 0 1 2 2 4 5 3 2 3 0 0 2 1
Sorted Values: 3 0 2 1 2 2 2 3 0 3 2 0 4 5 3 0 2 1
The final Vector Values is sorted (in term of old index): 0 1 2 6 7 8 12 13 14 3 4 5 9 10 11 15 16 17
You can imagine those indices laid out in 2 columns; I want the left column first and then the right one:
0  1  2       3  4  5
6  7  8       9  10 11
12 13 14      15 16 17
You don't want std::sort, you want std::rotate.
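// needs <algorithm>, <iostream>, <iterator>, <vector>
std::vector<int> v = {20, 21, 22, 23, 24, 25,
                      26, 27, 28, 29, 30, 31};
auto b = std::next(std::begin(v), 3); // skip first three elements
auto const re = std::end(v);          // keep track of the actual end
// loop while a full block of 6 lies strictly before the end; testing the
// remaining distance first means no iterator is advanced past end()
while (std::distance(b, re) > 6) {
    auto mid = std::next(b, 3); // boundary between the two groups of 3
    auto e = std::next(b, 6);   // the end of our current block
    std::rotate(b, mid, e);
    b = e;
}
// print the results
std::copy(std::begin(v), std::end(v), std::ostream_iterator<int>(std::cout, " "));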
This code assumes you always do two groups of 3 for each rotation, but you could obviously work with whichever arbitrary ranges you wanted.
The output looks like what you'd want:
20 21 22 26 27 28 23 24 25 29 30 31
Update: @Blastfurnace pointed out that std::swap_ranges would work as well, because the two groups being exchanged have the same length. The rotate call can be replaced with the following line:
std::swap_ranges(b, mid, mid); // passing mid twice on purpose
With the range-v3 library, you can write this quite conveniently, and it's very readable. Assuming your original vector is called input:
namespace rs = ranges;
namespace rv = ranges::views;
// input [3, 0, 2, 3, 2, 0, 1, 2, 2, 4, 5, 3, 2, 3, 0, 0, 2, 1]
auto by_3s = input | rv::chunk(3); // [[3, 0, 2], [3, 2, 0], [1, 2, 2], [4, 5, 3], [2, 3, 0], [0, 2, 1]]
auto result = rv::concat(by_3s | rv::stride(2), // [[3, 0, 2], [1, 2, 2], [2, 3, 0]]
by_3s | rv::drop(1) | rv::stride(2)) // [[3, 2, 0], [4, 5, 3], [0, 2, 1]]
| rv::join
| rs::to<std::vector<int>>; // [3, 0, 2, 1, 2, 2, 2, 3, 0, 3, 2, 0, 4, 5, 3, 0, 2, 1]
Here's a demo.
I want to add the values of two dataframes that have the same format.
For example:
>>> my_dataframe1
class1 score
subject 1 2 3
student
0 1 2 5
1 2 3 9
2 8 7 2
3 3 4 7
4 6 7 7
>>> my_dataframe2
class2 score
subject 1 2 3
student
0 4 2 2
1 4 4 14
2 8 7 7
3 1 2 NaN
4 NaN 2 3
As you can see, the two dataframes have multi-level columns: the top level is the class score ('class1 score' / 'class2 score') and the sub-level is 'subject'.
What I want is the summed dataframe, which would look like this:
score
subject 1 2 3
student
0 5 4 7
1 6 7 23
2 16 14 9
3 4 6 7
4 6 9 10
Actually, I can get this dataframe with:
for i in my_dataframe1['class1 score'].index:
    my_dataframe1['class1 score'].loc[i,:] = my_dataframe1['class1 score'].loc[i,:].add(my_dataframe2['class2 score'].loc[i,:], fill_value = 0)
but as the dimensions grow it takes a tremendous amount of time to get the result, and I don't think it is a good way to solve the problem.
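If you add the raw values of the second dataframe (a NumPy array) instead of the dataframe itself, pandas skips the index/column alignment:
# `astype(int)` is optional; it only casts the float result back to int.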
my_dataframe1.add(my_dataframe2.values, fill_value=0).astype(int)
class1 score
subject 1 2 3
student
0 5 4 7
1 6 7 23
2 16 14 9
3 4 6 7
4 6 9 10
Setup
my_dataframe1 = pd.DataFrame([
[1, 2, 5],
[2, 3, 9],
[8, 7, 2],
[3, 4, 7],
[6, 7, 7]
], pd.RangeIndex(5, name='student'), pd.MultiIndex.from_product([['class1 score'], [1, 2, 3]], names=[None, 'subject']))
my_dataframe2 = pd.DataFrame([
[4, 2, 2],
[4, 4, 14],
[8, 7, 7],
[1, 2, np.nan],
[np.nan, 2, 3]
], pd.RangeIndex(5, name='student'), pd.MultiIndex.from_product([['class2 score'], [1, 2, 3]], names=[None, 'subject']))
IIUC:
df_out = df['class1 score'].add(df2['class2 score'],fill_value=0).add_prefix('scores_')
df_out.columns = df_out.columns.str.split('_',expand=True)
df_out
Output:
scores
1 2 3
student
0 5.0 4 7.0
1 6.0 7 23.0
2 16.0 14 9.0
3 4.0 6 7.0
4 6.0 9 10.0
The way I would approach this is keep the data in the same dataframe. You could concatenate the two you have already:
big_df = pd.concat([my_dataframe1, my_dataframe2], axis=1)
Then sum over the larger dataframe, specifying level:
big_df.sum(axis=1, level='subject')
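A note for newer pandas versions: as far as I know, the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in 2.0, so the call above may fail there. A minimal sketch of an equivalent, assuming the same setup data as in the answer above (the names cols1, cols2, idx and summed are only for illustration), is to transpose, group on the 'subject' level, sum, and transpose back:

```python
import numpy as np
import pandas as pd

# Rebuild the two example frames from the question (same data as the setup above).
cols1 = pd.MultiIndex.from_product([['class1 score'], [1, 2, 3]], names=[None, 'subject'])
cols2 = pd.MultiIndex.from_product([['class2 score'], [1, 2, 3]], names=[None, 'subject'])
idx = pd.RangeIndex(5, name='student')

my_dataframe1 = pd.DataFrame(
    [[1, 2, 5], [2, 3, 9], [8, 7, 2], [3, 4, 7], [6, 7, 7]],
    index=idx, columns=cols1)
my_dataframe2 = pd.DataFrame(
    [[4, 2, 2], [4, 4, 14], [8, 7, 7], [1, 2, np.nan], [np.nan, 2, 3]],
    index=idx, columns=cols2)

big_df = pd.concat([my_dataframe1, my_dataframe2], axis=1)

# Transpose so 'subject' becomes a row level, group on it, sum (NaN is
# skipped by default), then transpose back to students x subjects.
summed = big_df.T.groupby(level='subject').sum().T
print(summed)
```

The result has one column per subject and is of float dtype because of the NaN values in the input.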
I am a beginner in Python programming. What is the difference between:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
and
a = [0 1 2 3 4 5 6 7 8 9]
I have
a = [0 1 2 3 4 5 6 7 8 9]
I want to form a matrix / array / list with values <= 6, in order to obtain:
a1 = [0 1 2 3 4 5 6]
How do I get a1?
Sorry if my question has been asked before.
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
is a valid list,
a = [0 1 2 3 4 5 6 7 8 9]
is not a valid list
Assuming you want to turn:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
into
a = [0, 1, 2, 3, 4, 5, 6]
you could use list comprehension:
a1 = [x for x in a if x <= 6]
or a for loop:
a1 = []
for x in a:
    if x <= 6:
        a1.append(x)
The list comprehension solution is more pythonic though.
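One more possibility, in case it applies: the form without commas, [0 1 2 3 4 5 6 7 8 9], is how NumPy prints a one-dimensional array. This is only a guess about where the asker's a came from. Under that assumption, a boolean mask does the filtering; a minimal sketch:

```python
import numpy as np

a = np.arange(10)   # a NumPy array, not a list
print(a)            # printed without commas: [0 1 2 3 4 5 6 7 8 9]

# a <= 6 builds a boolean array; indexing with it keeps the True positions.
a1 = a[a <= 6]
print(a1)           # [0 1 2 3 4 5 6]
```

If a really is a plain list, the list comprehension above is the way to go.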
Given an m x n matrix I want to split it into square a x a (a = 3 or a = 4) matrices of arbitrary offset (minimal offset = 1, max offset = block size), like Mathematica's Partition function does:
For example, given a 4 x 4 matrix A like
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
If I give 3 x 3 blocks and offset = 1, I want to get the 4 matrices:
1 2 3
5 6 7
9 10 11
2 3 4
6 7 8
10 11 12
5 6 7
9 10 11
13 14 15
6 7 8
10 11 12
14 15 16
If matrix A is A = np.arange(1, 37).reshape((6,6)) and I use 3 x 3 blocks with offset = 3, I want as output the blocks:
1 2 3
7 8 9
13 14 15
4 5 6
10 11 12
16 17 18
19 20 21
25 26 27
31 32 33
22 23 24
28 29 30
34 35 36
I'm OK with matrix A being a list of lists, and I don't think I need NumPy's functionality. I was surprised that neither array_split nor numpy.split provides this offset option out of the box. Is it more straightforward to code this in pure Python with slicing, or should I look into NumPy's strides? I want the code to be highly legible.
As you hint, there is a way of doing this with strides
In [900]: M = np.lib.stride_tricks.as_strided(A, shape=(2,2,3,3), strides=(16,4,16,4))
In [901]: M
Out[901]:
array([[[[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]],
[[ 2, 3, 4],
[ 6, 7, 8],
[10, 11, 12]]],
[[[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]],
[[ 6, 7, 8],
[10, 11, 12],
[14, 15, 16]]]])
In [902]: M.reshape(4,3,3) # to get it in form you list
Out[902]:
array([[[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]],
[[ 2, 3, 4],
[ 6, 7, 8],
[10, 11, 12]],
[[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]],
[[ 6, 7, 8],
[10, 11, 12],
[14, 15, 16]]])
A problem with strides is that it is an advanced technique, and hard to explain to someone without much NumPy experience. (I figured out the form without much trial and error, but I've been hanging around here too long. :) )
But this iterative solution is easier to explain:
In [909]: alist=[]
In [910]: for i in range(2):
     ...:     for j in range(2):
     ...:         alist.append(A[np.ix_(range(i,i+3),range(j,j+3))])
     ...:
In [911]: alist
Out[911]:
[array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]]),
array([[ 2, 3, 4],
[ 6, 7, 8],
[10, 11, 12]]),
array([[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]]),
array([[ 6, 7, 8],
[10, 11, 12],
[14, 15, 16]])]
This list can be turned into an array with np.array(alist). There's nothing wrong with using this if it is clearer.
One thing to keep in mind about the as_strided approach is that M is a view: changes to M may change A, and because the blocks overlap, a change in one place in M may show up in several places in M. Reshaping M, on the other hand, may turn it into a copy. So overall it's safer to only read values from M and use them for calculations like sum and mean. In-place changes can be unpredictable.
The iterative solution produces copies all around.
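One caveat about the as_strided call above: strides=(16,4,16,4) is given in bytes and assumes 4-byte integers, so it would need to be (32,8,32,8) where A is int64. A hedged alternative that avoids hand-written strides is numpy.lib.stride_tricks.sliding_window_view, which I believe requires NumPy 1.20 or later. The helper name blocks is made up for this sketch:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def blocks(A, size, offset):
    """Return all size x size blocks of the 2-D array A, stepping by offset."""
    # All windows at offset 1: shape (rows-size+1, cols-size+1, size, size).
    windows = sliding_window_view(A, (size, size))
    # Keep every offset-th window along both axes.
    picked = windows[::offset, ::offset]
    # Flatten the two block-position axes into a single list of blocks.
    return picked.reshape(-1, size, size)

A = np.arange(1, 17).reshape(4, 4)
B = blocks(A, 3, 1)                              # 4 overlapping blocks
C = blocks(np.arange(1, 37).reshape(6, 6), 3, 3) # 4 non-overlapping blocks
print(B.shape, C.shape)
```

sliding_window_view returns a read-only view, which guards against the unpredictable in-place changes mentioned above.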
The iterative solution with np.ogrid instead of np.ix_ (otherwise the same idea):
np.array([A[np.ogrid[i:i+3, j:j+3]] for i in range(2) for j in range(2)])
Both ix_ and ogrid are just easy ways of constructing the pair of index vectors for a block:
In [970]: np.ogrid[0:3, 0:3]
Out[970]:
[array([[0],
[1],
[2]]), array([[0, 1, 2]])]
The same thing but with slice objects:
np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(2) for j in range(2)])
With slice objects, the elements of the plain list (before np.array is applied) are views of A, so the list version has view behavior similar to the as_strided solution.
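The view-versus-copy difference can be checked directly with np.shares_memory. A small sketch, using the 4 x 4 example matrix (the variable names are just for illustration):

```python
import numpy as np

A = np.arange(1, 17).reshape(4, 4)

view_block = A[0:3, 0:3]                          # basic slicing: a view of A
copy_block = A[np.ix_(range(0, 3), range(0, 3))]  # index arrays: a copy

shares_view = np.shares_memory(A, view_block)
shares_copy = np.shares_memory(A, copy_block)
print(shares_view, shares_copy)

view_block[0, 0] = 100   # writes through to A
copy_block[0, 1] = -1    # leaves A untouched
print(A[0, 0], A[0, 1])
```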
For the 6x6 with non-overlapping blocks, try:
In [1016]: np.array([A[slice(i,i+3), slice(j,j+3)] for i in range(0,6,3) for j in range(0,6,3)])
Out[1016]:
array([[[ 1, 2, 3],
[ 7, 8, 9],
[13, 14, 15]],
[[ 4, 5, 6],
[10, 11, 12],
[16, 17, 18]],
[[19, 20, 21],
[25, 26, 27],
[31, 32, 33]],
[[22, 23, 24],
[28, 29, 30],
[34, 35, 36]]])
Assuming you want contiguous blocks, the inner slices/ranges don't change, only the stepping of the outer i and j:
In [1017]: np.arange(0,6,3)
Out[1017]: array([0, 3])
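Since the question allows A to be a list of lists, the same slicing idea also works without NumPy. A minimal pure-Python sketch; the function name partition and its parameters are made up for this example, and blocks that would run past the edge of the matrix are simply dropped:

```python
def partition(matrix, size, offset):
    """Split a list-of-lists matrix into size x size blocks, stepping by offset."""
    rows, cols = len(matrix), len(matrix[0])
    return [
        [row[j:j + size] for row in matrix[i:i + size]]
        for i in range(0, rows - size + 1, offset)   # top row of each block
        for j in range(0, cols - size + 1, offset)   # left column of each block
    ]

A4 = [[1, 2, 3, 4],
      [5, 6, 7, 8],
      [9, 10, 11, 12],
      [13, 14, 15, 16]]
overlapping = partition(A4, 3, 1)   # 4 blocks, offset 1

A6 = [list(range(r * 6 + 1, r * 6 + 7)) for r in range(6)]
tiles = partition(A6, 3, 3)         # 4 non-overlapping blocks
```

Here every block is a new list, so modifying a block never changes the original matrix.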
I want to apply two 'for' loops (slightly different from each other) to a list. The first 'for' loop will run from the minimum value towards the left side and the second from the minimum value towards the right side. This is the list:
a = [3,4,6,7,8,4,3,1,6,7,8,9,4]
# to get min index
b = a.index(min(a))
c=a[0:b+1]
d=a[b:len(a)]
for i in reversed(c):
    print i
and
for i in d:
    print i
So, for example, the first 'for' loop will run from index value 8 to 1 and the second 'for' loop will run from 8 to 13. I am not sure how to run loops in opposite directions starting from the minimum value. Any suggestions would be helpful.
>>> a = [3,4,6,7,8,4,3,1,6,7,8,9,4]
>>> b = a.index(min(a))
>>> b
7
A loop that runs from index 7 down to 0 (not 8 to 1):
>>> for i in range(b, -1, -1):
...     print i, a[i]
...
7 1
6 3
5 4
4 8
3 7
2 6
1 4
0 3
A loop that runs from 8 to 12:
>>> for i in range(b+1, len(a)):
...     print i, a[i]
...
8 6
9 7
10 8
11 9
12 4
>>> a[b:None:-1]
[1, 3, 4, 8, 7, 6, 4, 3]
>>> a[b+1:]
[6, 7, 8, 9, 4]
UPDATE
The following are more Pythonic ways of getting the index of the minimum value:
>>> min(xrange(len(a)), key=a.__getitem__)
7
>>> min(enumerate(a), key=lambda L: L[1])[0]
7
>>> import operator
>>> min(enumerate(a), key=operator.itemgetter(1))[0]
7
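The session above is Python 2 (print statement, xrange). For Python 3, a rough equivalent might look like the sketch below; it collects (index, value) pairs into lists instead of printing inside the loops, and the names left and right are just for illustration:

```python
a = [3, 4, 6, 7, 8, 4, 3, 1, 6, 7, 8, 9, 4]

# Index of the minimum value; range replaces Python 2's xrange.
b = min(range(len(a)), key=a.__getitem__)

# Walk left from the minimum (index b down to 0).
left = [(i, a[i]) for i in range(b, -1, -1)]

# Walk right from the element after the minimum to the end.
right = [(i, a[i]) for i in range(b + 1, len(a))]

for i, value in left:
    print(i, value)
for i, value in right:
    print(i, value)
```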