I have a list of values (product codes, like '1123','4356'...), call it LIST, and I want to select from a matrix M only the corresponding rows. I.e., the first column of the matrix M contains the codes, the other columns contain the data, and I have an additional vector LIST that contains the codes to select.
Ex.
LIST     MATRIX          I WANT
[123;    [000 1 2 3 ;    [123 3 5 6 ;
 456]     123 3 5 6 ;     456 1 4 6 ]
          000 5 6 7 ;
          456 1 4 6 ]
Is there an efficient way to do it?
list = [123; 456];
mat = [000 1 2 3; 123 3 5 6; 000 5 6 7; 456 1 4 6];
iwant = [123 3 5 6 ; 456 1 4 6];
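% Search the first column only, so b holds row numbers rather than linear indices into mat
[a,b] = ismember(list, mat(:,1));
iwant2 = mat(b(a),:); % b(a) skips codes that do not occur in mat
isequal(iwant, iwant2)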
Consider a sheet like:
rowNr | Another Col | Filled | Cumul. Size
0     | 2           | -1000  | -1000
1     | 3           | 1000   | 0
2     | 1           | -5000  | -5000
3     | 4           | 5000   | 0
4     | 5           | -10000 | -10000
5     | 2           | -10000 | -20000
6     | 1           | -20000 | -40000
6     | 4           | 40000  | 0
The 'Cumul. Size' column displays the cumulative sum of the 'Filled' column.
Each time 'Cumul. Size' = 0, I need to calculate the sum of 'Another Col' over the current row and all previous rows, back to (but not including) the previous row where 'Cumul. Size' = 0. For rows where 'Cumul. Size' != 0, display '' (blank).
So something like this:
rowNr | Another Col | Filled | Cumul. Size | calculated
0     | 2           | -1000  | -1000       |
1     | 3           | 1000   | 0           | 5
2     | 1           | -5000  | -5000       |
3     | 4           | 5000   | 0           | 5
4     | 5           | -10000 | -10000      |
5     | 2           | -10000 | -20000      |
6     | 1           | -20000 | -40000      |
6     | 4           | 40000  | 0           | 12
I'm sure I can create something working as long as I can find a function with a signature similar to: findPreviousRowIndex(curRowIndex, whereCondition)
Any pointers much appreciated
EDIT
Link To example Google Sheet
Paste into cell D2 and drag down:
=ARRAYFORMULA(IF(LEN(A2), IF(C2=0, SUM(INDIRECT(ADDRESS(IFERROR(MAX(IF(
INDIRECT("C1:C"&ROW()-1)=0, ROW(A:A), ))+1, 2), 1, 4)&":A"&ROW())), ), ))
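The logic being asked for can be easier to check outside the spreadsheet. The following is only a plain-Python sketch of the rule from the question, not a translation of the formula above; the function name block_sums and the two input lists are hypothetical, with the values copied from the example table.

```python
def block_sums(another_col, filled):
    """One entry per row: the block sum where the running total is 0, else ''."""
    calculated = []
    running_total = 0  # plays the role of the 'Cumul. Size' column
    block_sum = 0      # sum of 'Another Col' since the previous zero row
    for value, fill in zip(another_col, filled):
        running_total += fill
        block_sum += value
        if running_total == 0:
            calculated.append(block_sum)
            block_sum = 0  # the next block starts on the following row
        else:
            calculated.append("")
    return calculated


# Values copied from the example table in the question.
another_col = [2, 3, 1, 4, 5, 2, 1, 4]
filled = [-1000, 1000, -5000, 5000, -10000, -10000, -20000, 40000]
print(block_sums(another_col, filled))  # ['', 5, '', 5, '', '', '', 12]
```

A single pass with two running totals is enough here, so no lookup of the "previous row index" is needed.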
I have a dataframe with two columns A and B.
A B
1 0
2 0
3 1
4 2
5 0
6 3
What I want to do is add column A to column B, but only for the rows where column B is non-zero, and put the result in column B.
A B
1 0
2 0
3 4
4 6
5 0
6 9
Thank you in advance for your help and suggestions.
Use .loc with a boolean mask:
In [49]:
df.loc[df['B'] != 0, 'B'] = df['A'] + df['B']
df
Out[49]:
A B
0 1 0
1 2 0
2 3 4
3 4 6
4 5 0
5 6 9
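For completeness, here is a self-contained sketch of the same assignment, including the construction of the example frame (the variable name mask is just for illustration). It relies on the fact that the right-hand side is computed for every row and that .loc then aligns it on the index, so only the masked rows are written.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [0, 0, 1, 2, 0, 3]})

# True where B is non-zero; these are the only rows that get updated.
mask = df["B"] != 0

# df["A"] + df["B"] has a value for every row, but the assignment is
# aligned on the index and restricted to the rows selected by the mask.
df.loc[mask, "B"] = df["A"] + df["B"]

print(df["B"].tolist())  # [0, 0, 4, 6, 0, 9]
```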
I have a dataframe with multiindexed columns. I want to select on the first level based on the column name, and then return all columns but the last one, and assign a new value to all these elements.
Here's a sample dataframe:
In [1]: mydf = pd.DataFrame(np.random.random_integers(low=1,high=5,size=(4,9)),
columns = pd.MultiIndex.from_product([['A', 'B', 'C'], ['a', 'b', 'c']]))
Out[1]:
A B C
a b c a b c a b c
0 4 1 2 1 4 2 1 1 3
1 4 4 1 2 3 4 2 2 3
2 2 3 4 1 2 1 3 2 3
3 1 3 4 2 3 4 1 5 1
I want to be able to assign to these elements, for example:
In [2]: mydf.loc[:,('A')].iloc[:,:-1]
Out[2]:
A
a b
0 4 1
1 4 4
2 2 3
3 1 3
If I wanted to modify one column only, I know how to select it properly with a tuple so that the assigning works:
In [3]: mydf.loc[:,('A','a')] = 0
In [4]: mydf.loc[:,('A','a')]
Out[4]:
0 0
1 0
2 0
3 0
Name: (A, a), dtype: int32
So that worked well.
Now the following doesn't work...
In [5]: mydf.loc[:,('A')].ix[:,:-1] = 6 - mydf.loc[:,('A')].ix[:,:-1]
In [6]: mydf.loc[:,('A')].iloc[:,:-1] = 6 - mydf.loc[:,('A')].iloc[:,:-1]
Sometimes I will, and sometimes I won't, get the warning that a value is trying to be set on a copy of a slice from a DataFrame. But in both cases it doesn't actually assign.
I've tried pretty much everything I could think of, and I still can't figure out how to mix label and integer indexing in order to set the value correctly.
Any idea please?
Versions:
Python 2.7.9
Pandas 0.16.1
This is not directly supported, as .loc MUST have labels and NOT positions. In theory .ix could support this with multi-index slicers, but it runs into the usual complication of figuring out what the user 'meant' (e.g. is it a label or a position).
In [63]: df = pd.DataFrame(np.random.random_integers(low=1,high=5,size=(4,9)),
columns = pd.MultiIndex.from_product([['A', 'B', 'C'], ['a', 'b', 'c']]))
In [64]: df
Out[64]:
A B C
a b c a b c a b c
0 4 4 4 4 3 2 5 1 4
1 1 2 1 3 2 1 1 4 5
2 3 2 4 4 2 2 3 1 4
3 5 1 1 3 1 1 5 5 5
So we compute the indexer for the 'A' block; np.r_ turns this slice into an array of integer positions; then we select the element we want (e.g. 0 in this case). This feeds into .iloc.
In [65]: df.iloc[:,np.r_[df.columns.get_loc('A')][0]] = 0
In [66]: df
Out[66]:
A B C
a b c a b c a b c
0 0 4 4 4 3 2 5 1 4
1 0 2 1 3 2 1 1 4 5
2 0 2 4 4 2 2 3 1 4
3 0 1 1 3 1 1 5 5 5
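The same idea can be stretched to the assignment from the question (every column of the 'A' block except the last, set to 6 minus its current value). This is only a sketch under assumptions: it assumes the columns are sorted so that get_loc('A') returns a slice, it uses a seeded NumPy generator instead of random_integers, and the names block, target and before are made up for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
columns = pd.MultiIndex.from_product([["A", "B", "C"], ["a", "b", "c"]])
df = pd.DataFrame(rng.integers(1, 6, size=(4, 9)), columns=columns)
before = df.copy()

# Positions of the 'A' block; for sorted columns get_loc returns a slice,
# which np.r_ expands into an integer array (here [0, 1, 2]).
block = np.r_[df.columns.get_loc("A")]
target = block[:-1]  # every column of the block except the last

# Purely positional assignment; to_numpy() avoids any label alignment.
df.iloc[:, target] = 6 - df.iloc[:, target].to_numpy()
print(df)
```

Because both sides go through .iloc on the original frame, the write lands on df itself rather than on an intermediate copy, which is what the chained .loc[...].iloc[...] attempts were missing.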
For each row of data in a DataFrame I would like to compute the number of unique values in columns A and B for that particular row and a reference row within the group identified by another column ID. Here is a toy dataset:
d = {'ID' : pd.Series([1,1,1,2,2,2,2,3,3])
,'A' : pd.Series([1,2,3,4,5,6,7,8,9])
,'B' : pd.Series([1,2,3,4,11,12,13,14,15])
,'REFERENCE' : pd.Series([1,0,0,0,0,1,0,1,0])}
data = pd.DataFrame(d)
The data looks like this:
In [3]: data
Out[3]:
A B ID REFERENCE
0 1 1 1 1
1 2 2 1 0
2 3 3 1 0
3 4 4 2 0
4 5 11 2 0
5 6 12 2 1
6 7 13 2 0
7 8 14 3 1
8 9 15 3 0
Now, within each group defined using ID I want to compare each record with the reference record and I want to compute the number of unique A and B values for the combination. For instance, I can compute the value for data record 3 by taking len(set([4,4,6,12])) which gives 3. The result should look like this:
A B ID REFERENCE CARDINALITY
0 1 1 1 1 1
1 2 2 1 0 2
2 3 3 1 0 2
3 4 4 2 0 3
4 5 11 2 0 4
5 6 12 2 1 2
6 7 13 2 0 4
7 8 14 3 1 2
8 9 15 3 0 4
The only way I can think of implementing this is using for loops that loop over each grouped object and then each record within the grouped object and computes it against the reference record. This is non-pythonic and very slow. Can anyone please suggest a vectorized approach to achieve the same?
I would create a new column where I combine A and B into a tuple, and then I would do a groupby. Then use groups = dict(list(groupby)) and get the length of each frame using len().
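A different, loop-free route is to attach each group's reference values to every row with a merge and then count distinct values across the four columns. This is a sketch rather than the approach described above, and it assumes exactly one reference row per ID; the names ref, merged, A_REF and B_REF are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "B": [1, 2, 3, 4, 11, 12, 13, 14, 15],
    "REFERENCE": [1, 0, 0, 0, 0, 1, 0, 1, 0],
})

# One row per ID holding the reference record's A and B values.
ref = (data.loc[data["REFERENCE"] == 1, ["ID", "A", "B"]]
           .rename(columns={"A": "A_REF", "B": "B_REF"}))

# A left merge keeps the original row order and adds A_REF / B_REF to each row.
merged = data.merge(ref, on="ID", how="left")

# Row-wise count of distinct values, i.e. len(set([A, B, A_REF, B_REF])).
data["CARDINALITY"] = (
    merged[["A", "B", "A_REF", "B_REF"]].nunique(axis=1).to_numpy()
)
print(data)
```

Applying the len(set(...)) rule from the question, the last row evaluates to 4, since 9, 15, 8 and 14 are all distinct.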
I am trying to map the subdivision of a matrix to an array.
By subdivision of a matrix I mean a box like the 3x3 boxes in a 9x9 sudoku matrix.
To do so I use :
grid[x][y] = box[x/3 + (y/3)*3];
But it does not work. Any suggestions for a solution, and an explanation of why it does not work?
EDIT:
I know how to map a vector to a matrix.
I want to map a vector to a portion of a square matrix, just like in the sudoku game.
EDIT2:
Basically, what I want is to be able to map a box number to a tuple,
for example with 3x3 boxes and a 9x9 matrix
(0,0) => 1
(0,1) => 1
(8,8) => 9
Updated Answer to Edit2:
If you want a mapping like:
1 2 3
4 5 6
7 8 9
then your original code is almost what you want (just add 1):
for (int y = 0; y < 9; ++y)
{
    for (int x = 0; x < 9; ++x)
    {
        /* integer division groups columns and rows into blocks of 3 */
        int index = x/3 + (y/3) * 3 + 1;
        printf("%d ", index);
    }
    printf("\n");
}
Which outputs:
1 1 1 2 2 2 3 3 3
1 1 1 2 2 2 3 3 3
1 1 1 2 2 2 3 3 3
4 4 4 5 5 5 6 6 6
4 4 4 5 5 5 6 6 6
4 4 4 5 5 5 6 6 6
7 7 7 8 8 8 9 9 9
7 7 7 8 8 8 9 9 9
7 7 7 8 8 8 9 9 9
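Since Edit2 also talks about going from a box number back to coordinates, here is a small Python sketch of both directions. The function names box_number and cells_in_box are made up for illustration, x is taken as the column and y as the row as in the loop above, and Python's // stands in for C's integer division.

```python
BOX = 3  # side length of one box; the grid is 9x9


def box_number(x, y):
    """1-based box number of cell (x, y), with x the column and y the row."""
    return x // BOX + (y // BOX) * BOX + 1


def cells_in_box(number):
    """All (x, y) cells that belong to the given 1-based box number."""
    index = number - 1
    x0 = (index % BOX) * BOX   # leftmost column of the box
    y0 = (index // BOX) * BOX  # top row of the box
    return [(x, y) for y in range(y0, y0 + BOX) for x in range(x0, x0 + BOX)]


print(box_number(0, 0), box_number(0, 1), box_number(8, 8))  # 1 1 9
print(cells_in_box(9)[0], cells_in_box(9)[-1])               # (6, 6) (8, 8)
```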