check each variable of a column pandas - python-2.7

I have a pivot table:
Type Result
l 213
l 435
h 54
l 34
h 54
l 645
h 345
h 34
h 345
l 345
l 666
I want to compute a value from the Result column depending on whether the Type column is 'l' or 'h':
for h, f(result);
for l, g(result);
Finally, I want to append the calculated values as a new column in this pivot table.
Can anyone help me with this?
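One possible approach, sketched under assumptions: the question does not say what f and g are, so the two functions below are hypothetical placeholders, and the column name Calc is made up for illustration. The idea is to evaluate both functions on the whole column and let numpy.where pick one per row based on Type.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the asker's f and g (not given in the question).
def f(x):
    return x * 2

def g(x):
    return x + 100

df = pd.DataFrame({
    "Type": list("llhlhlhhhll"),
    "Result": [213, 435, 54, 34, 54, 645, 345, 34, 345, 345, 666],
})

# Both functions are evaluated on the full column; np.where then selects
# f's value where Type is 'h' and g's value everywhere else.
df["Calc"] = np.where(df["Type"] == "h", f(df["Result"]), g(df["Result"]))
```

This assumes f and g work on a whole column at once. If they only accept single values, df.apply with axis=1 and an if/else on row['Type'] would be a slower but more general alternative.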

Related

Counting sequence by group

Let's say I have a dataset as follows:
ID Cat
101 G
101 G
101 F
101 G
102 F
102 F
102 G
102 F
102 F
I want to create a sequence variable by the group variables ID and Cat (notsorted).
A count can be produced this way:
data X1; set have; by ID cat notsorted;
if first.cat then count=1; else count+1; run;
ID Cat count
101 G 1
101 G 2
101 F 1
101 G 1
102 F 1
102 F 2
102 G 1
102 F 1
102 F 2
However, what I am looking for is:
ID Cat Seq
101 G 1
101 G 1
101 F 2
101 G 3
102 F 1
102 F 1
102 G 2
102 F 3
102 F 3
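You can just use a sum statement:
seq+first.cat;
FIRST.CAT is 1 on the first row of each run of CAT values and 0 otherwise, so SEQ increments by one every time a new CAT value starts.
To reset for each ID, add this after the sum statement:
if first.id then seq=1;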
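For readers more familiar with Python, the same run-numbering logic can be sketched with itertools.groupby, which groups consecutive equal values much like BY ... NOTSORTED. This is an illustration of the idea only, not a translation of the SAS step; the function name add_seq is made up.

```python
from itertools import groupby

rows = [(101, "G"), (101, "G"), (101, "F"), (101, "G"),
        (102, "F"), (102, "F"), (102, "G"), (102, "F"), (102, "F")]

def add_seq(rows):
    """Number each consecutive run of Cat within an ID, restarting at 1 per ID."""
    out = []
    # Outer groupby: consecutive rows sharing the same ID.
    for _id, id_rows in groupby(rows, key=lambda r: r[0]):
        # Inner groupby: consecutive rows sharing the same Cat within that ID.
        runs = groupby(id_rows, key=lambda r: r[1])
        for seq, (_cat, run) in enumerate(runs, start=1):
            for r in run:
                out.append((r[0], r[1], seq))
    return out

result = add_seq(rows)
```

Here result carries the Seq values 1, 1, 2, 3 for ID 101 and 1, 1, 2, 3, 3 for ID 102, matching the desired table in the question.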

How to rename a column name to a new value in a dataframe if the column names are dynamic

I have a CSV file whose column names change with the month and year but always contain a keyword such as 'sales' or 'product'. Is there a way to rename each column to a fixed value with the pandas rename method by searching for the keyword?
Sample column names would be 2019 May sales Tv, 2018 April sales Fridge.
eg
nil
df_nw = df.rename(df.filter(like='Sales').columns.values
Current data:
column1 column2 2019AprilSalesTV 2018ActualSalesTV
X BBBB 7766 60
Y CCCC 10 20
Z LLLLL 60 65
K TTTTT 10 67
New Data:
column1 column2 Sales ActualSales
X BBBB 7766 60
Y CCCC 10 20
Z LLLLL 60 65
K TTTTT 10 67
You can do:
> import re
> clean_colname = lambda x: re.sub(r'(^\w+(?<!Actual))(Sales)', r'\2',
                                   re.sub(r'^\d+|TV$', r'', x))
> df_nw.rename(clean_colname, axis=1)
column2 Sales ActualSales
column1
X BBBB 7766 60
Y CCCC 10 20
Z LLLLL 60 65
K TTTTT 10 67
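If the regex feels hard to maintain, a plain keyword lookup is another option. The sketch below swaps in a different technique from the answer above: it checks each column name against a list of keywords, most specific first, and returns the first keyword found. The names KEYWORDS and rename_by_keyword are made up for illustration.

```python
# Keywords to look for, most specific first, so that "ActualSales"
# is tested before the shorter "Sales" it contains.
KEYWORDS = ["ActualSales", "Sales"]

def rename_by_keyword(name, keywords=KEYWORDS):
    """Return the first keyword contained in name (case-insensitive), else name unchanged."""
    lowered = name.lower()
    for kw in keywords:
        if kw.lower() in lowered:
            return kw
    return name

columns = ["column1", "column2", "2019AprilSalesTV", "2018ActualSalesTV"]
renamed = [rename_by_keyword(c) for c in columns]
```

The function can be passed to pandas as df.rename(columns=rename_by_keyword). Note that two columns containing the same keyword would end up with the same name, so this only suits files with one column per keyword.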

TOPN in PowerBI DAX not arranging values in proper order

I have been running into some issues with the TOPN function in DAX in PowerBI.
Below is the original dataset:
regions sales
--------------
a 1191
b 807
c 1774
d 376
e 899
f 1812
g 1648
h 6
i 1006
j 1780
k 243
l 777
m 747
n 61
o 1637
p 170
q 1319
r 1437
s 493
t 1181
u 118
v 1787
w 1396
x 102
y 104
z 656
So now, I want to get the Top 5 sales in descending order.
I used the following code:
Table = TOPN(5, SUMMARIZE(Sheet1, Sheet1[regions], Sheet1[sales]), Sheet1[sales], DESC)
The resulting table is as follows:
regions sales
--------------
g 1648
j 1780
c 1774
v 1787
f 1812
Any idea why this is happening?
According to the Microsoft documentation, this is working as intended.
https://msdn.microsoft.com/en-us/query-bi/dax/topn-function-dax
Remarks
TOPN does not guarantee any sort order for the results.
So TOPN returns the correct five rows, just not in a defined order. What you can do is create a rank column with RANKX and sort by it.

Multiply rows in dataframe, then sum them together Python

I have a function to apply to the table below:
F(x) = 1.5*x1 + 2*x2 - 1.5*x3
where xi, i = 1,2,3, is the value in column Xi.
The table is:
X1 | X2 | X3
------|------|------
20 |15 |12
30 |17 |24
40 |23 |36
The desired output is shown below: the function is applied to each row, using that row's column values, and the result is appended to the dataframe as a new column.
X1 | X2 | X3 |F(X)
------|------|------|------
20 |15 |12 |42
30 |17 |24 |43
40 |23 |36 |52
Is there a way to do this in Python 2.7?
Something like this?
df['F(x)']=df.mul([1.5,2,-1.5]).sum(1)
df
Out[1076]:
X1 X2 X3 F(x)
0 20 15 12 42.0
1 30 17 24 43.0
2 40 23 36 52.0
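To make explicit what mul(...).sum(1) computes, here is a plain-Python sketch of the same arithmetic with no pandas involved: each row's values are multiplied element by element with the weights and then summed. The names weights, rows and fx are made up for illustration.

```python
weights = [1.5, 2, -1.5]               # coefficients of X1, X2, X3 in F(x)
rows = [(20, 15, 12), (30, 17, 24), (40, 23, 36)]

# For each row, pair every value with its weight, multiply, and add up.
fx = [sum(w * x for w, x in zip(weights, row)) for row in rows]
```

This gives 42.0, 43.0 and 52.0, the same values as the F(x) column above.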
OK, I found some sample code that solves my problem.
var1 = 1.5
var2 = 2
var3 = -1.5
def calculate_fx(row):
    # weighted sum of the row's X1, X2 and X3 values
    return (var1 * row['X1']) + (var2 * row['X2']) + (var3 * row['X3'])
# function_df is the predefined dataframe
function_df['F(X)'] = function_df.apply(calculate_fx, axis=1)
function_df

In the following SAS statement, what do the parameters "noobs" and "label" stand for?

In the following SAS statement, what do the parameters "noobs" and "label" stand for?
proc print data=sasuser.schedule noobs label;
Per the SAS 9.2 documentation on PROC PRINT:
"NOOBS - Suppress the column in the output that identifies each observation by number"
"LABEL - Use variables' labels as column headings"
NOOBS hides the observation-number column (1, 2, 3, 4, 5, ...).
results without noobs
my first title
Obs name sex group height weight
1 mike m a 21 150
2 henry m b 30 140
3 norian f b 18 130
4 nadine f b 32 135
5 dianne f a 23 135
results with noobs
my first title
name sex group height weight
mike m a 21 150
henry m b 30 140
norian f b 18 130
nadine f b 32 135
dianne f a 23 135