Iteratively divide column by column plus suffix (pandas) - python-2.7

In my for loop I would like to divide a column by another. The second column's name is dependent upon the first. Example data frame columns look like this:
column1, column1_mean, column2, column2_mean
I would like to iteratively divide each column by its corresponding mean column ( column1 / column1_mean; column2/column2_mean ).
Thanks for your help.

For a given list of column names cols = ['column1', 'column2', ... ], you can use .div() to divide one set of columns by another, and use string formatting to build the second list of column names from the first. Because .div() aligns on column labels, pass the mean columns as a plain array (.values) so the division is done by position:
df[cols].div(df[['{0}_mean'.format(col) for col in cols]].values)
But if the _mean columns only hold each column's overall mean, a simpler way would be to dispense with making the mean columns altogether:
df[cols].div(df[cols].mean())
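As a minimal sketch of both approaches, assuming a small made-up data frame whose column names follow the pattern in the question (the values are hypothetical, and each _mean column is assumed to hold the overall mean of its column):

```python
import pandas as pd

# Hypothetical sample data; only the column naming pattern comes from the question.
df = pd.DataFrame({
    "column1": [2.0, 4.0, 6.0],
    "column1_mean": [4.0, 4.0, 4.0],
    "column2": [10.0, 20.0, 30.0],
    "column2_mean": [20.0, 20.0, 20.0],
})

cols = ["column1", "column2"]
mean_cols = ["{0}_mean".format(col) for col in cols]

# .div() aligns on column labels, so pass the raw values of the mean
# columns to divide position by position.
ratios = df[cols].div(df[mean_cols].values)

# Same result here, because each *_mean column holds the overall column mean.
ratios_from_mean = df[cols].div(df[cols].mean())
```

Both results keep the original column names (column1, column2), so they can be assigned back or joined to df as needed.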

Related

DAX syntax to compare a list to another list

I have a string column, which contains a list of string values separated by commas.
I converted the column into a List by using M syntax
Now I would like to write a DAX query that would compare the Names column with a criteria table and return a row if there are intersections. Something like this:
FILTER(Users, INTERSECTSWITH(Users[Names], {{"Joe", "Meg"}}))
which should return rows 1 and 3.
INTERSECTSWITH does not exist in DAX, but is there an equivalent technique to compare one list to another?
Thanks!

Get total count of each distinct value

Say I have a column of countries in which values may repeat, for example: Spain, Spain, Italy, Spain
I want to take the number of times each country appears in the column and divide it by the total number of rows. I have tried:
CountRows = DIVIDE(DISTINCTCOUNT('Report (7)'[Country]); COUNT('Report (7)'[Country]) )
Any suggestions? Do I need a new column for that?
The easiest way to do this kind of calculation is to add a calculated column that divides the number of rows containing the current row's country by the total number of rows in the table.
You need the EARLIER function to refer to the current row's value inside the FILTER.
Assuming a table named Table1 with a column Country, something like:
DIVIDE(COUNTROWS(FILTER(Table1, Table1[Country] = EARLIER(Table1[Country]))), COUNTROWS(Table1))
Don't forget to format the new column as a percentage, or show some decimal places, to see the correct values. If your locale uses ; as the argument separator, as in the question, replace the commas accordingly.
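To sanity-check the numbers the calculated column should produce, here is the same arithmetic as a small Python sketch using the sample values from the question. It is only an illustration of the count-divided-by-total logic, not DAX:

```python
from collections import Counter

# Sample values from the question.
countries = ["Spain", "Spain", "Italy", "Spain"]

counts = Counter(countries)  # occurrences per country
total = len(countries)       # total number of rows

# Share of rows per country: occurrences divided by the total row count.
shares = {country: count / float(total) for country, count in counts.items()}
```

With this sample, Spain accounts for 3 of 4 rows (0.75) and Italy for 1 of 4 (0.25).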

Auto-Populating to create 1 list in 1 column based on data from two different columns (Google sheets)

Having some trouble with a very simple problem. I want to create 1 list in a column based on data from two different columns, and I only want each item to appear once in the list column.
So to create the list based on data from one column, I used this formula but I don't know how to do it from 2 (and the number of occurrences/count isn't needed).
=ArrayFormula(QUERY(J2:J&{"",""},"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Number of occurrences'",-1))
https://docs.google.com/spreadsheets/d/1loPw3eUALLKx3NzXrxhDszqD2_B7G3cyH1EhnYg4tFg/edit?ts=5cf63f94#gid=0
Maybe something like:
=sort(UNIQUE({A1:A10;B1:B12}))
This assumes your locale uses , as the argument separator, that the lists are in A1:A10 and B1:B12, and that you want the result sorted.
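The formula does three things: it stacks the two ranges, removes duplicates, and sorts. The same steps in a short Python sketch, with made-up lists standing in for the two sheet ranges:

```python
# Hypothetical stand-ins for the two sheet ranges.
column_a = ["pear", "apple", "fig"]
column_b = ["apple", "kiwi", "fig", "plum"]

# Stack the two lists, drop duplicates with a set, then sort the result.
combined = sorted(set(column_a + column_b))
```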

Power BI remove duplicates based on max value

I have 2 columns: ID CODE and value.
Remove duplicates function will remove the examples with the higher value and leave the lower one. Is there any way to remove the lower ones? The result I expected was like this.
I've tried Buffer Table function before but it doesn't work. Seems like Buffer Table just works with date-related data (newest-latest).
You could use SUMMARIZE, which works much like a SQL GROUP BY query: it aggregates one column (here with MIN) for each distinct value of another column.
In the example below, MIN([value]) is computed for each IDCode and returned in a new column named "MinValue".
NewCalculatedTable =
SUMMARIZE(yourTablename, yourTablename[IDCode], "MinValue", MIN(yourTablename[value]) )
Since the question asks to keep the higher value for each IDCode, just replace the MIN function with MAX.
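As a rough illustration of what the grouped table should contain when the higher value is kept for each IDCode, here is the same grouping in plain Python. The rows are invented for the example:

```python
# Hypothetical rows of (IDCode, value).
rows = [("A", 10), ("A", 25), ("B", 7), ("B", 3), ("C", 12)]

max_by_id = {}
for id_code, value in rows:
    # Keep the largest value seen so far for this IDCode.
    if id_code not in max_by_id or value > max_by_id[id_code]:
        max_by_id[id_code] = value
```

Each IDCode ends up with exactly one row, carrying its largest value.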

Sum values in one column of all rows which match one or more of multiple criteria

I have some table data in which I'd like to sum all the values in a specific column of all rows where column A contains string A and/or column B contains string B. How can I achieve this?
This works for one criterion:
=SUM(FILTER(G:G,REGEXMATCH(F:F,"stringA")))
I tried this, but it didn't work:
=SUM(FILTER(G:G,OR(ISTEXT(REGEXMATCH(F:F,"stringA")),ISTEXT(REGEXMATCH(C:C,"stringB")))))
Please try:
=SUM(FILTER(G:G,REGEXMATCH(F:F,"stringA")+REGEXMATCH(C:C,"stringB")))
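+ works as OR logic: REGEXMATCH returns TRUE or FALSE for each row, and the sum is non-zero whenever at least one condition holds. ISTEXT is not needed because REGEXMATCH already gives TRUE or FALSE.
OR does not work here because it collapses the whole range into a single TRUE or FALSE instead of one result per row; use + inside FILTER and other array formulas.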
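Alternatively, concatenate the two columns and use a single regular expression:
=SUM(FILTER(G:G,REGEXMATCH(F:F&C:C,"stringA|stringB")))
In a regular expression, OR is denoted by |. Note that this version matches either string in either column, because F and C are joined before matching.
EDIT: Added &C:C to cover both columns.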
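For comparison, here is the row-by-row OR condition written out as a Python sketch. The rows are hypothetical; the point is that the test is evaluated per row and a row matching both criteria is only counted once:

```python
import re

# Hypothetical rows: (column C text, column F text, column G value).
rows = [
    ("has stringB", "nothing", 1),
    ("nothing", "has stringA", 2),
    ("has stringB", "has stringA", 4),
    ("nothing", "nothing", 8),
]

# Sum G where F matches stringA OR C matches stringB, checked per row.
total = sum(
    g for c, f, g in rows
    if re.search("stringA", f) or re.search("stringB", c)
)
```

Only the last row fails both tests, so the total is 1 + 2 + 4.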