Data frames pandas python

Data frames pandas python - python-2.7

I have a data frame that looks like this:
id age sallary
1 16 500
2 21 1000
3 25 3000
4 30 6000
5 40 25000
and a list of ids that I would like to ignore: [1,3,5].
How can I get a data frame that contains only the remaining rows (ids 2 and 4)?
Many thanks to everyone.

Call isin and negate the result using ~:
In [42]:
ignore_ids=[1,3,5]
df[~df.id.isin(ignore_ids)]
Out[42]:
id age sallary
1 2 21 1000
3 4 30 6000
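For reference, here is a self-contained sketch of the same approach that rebuilds the sample frame from the question. The variable names mask and remaining are only illustrative, and df["id"] is used instead of df.id, which should behave the same here.

```python
import pandas as pd

# Sample data copied from the question (column name "sallary" kept as posted)
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "age": [16, 21, 25, 30, 40],
    "sallary": [500, 1000, 3000, 6000, 25000],
})

ignore_ids = [1, 3, 5]

# isin() returns a boolean Series that is True where id is in ignore_ids
mask = df["id"].isin(ignore_ids)

# ~ flips the mask, so only rows whose id is NOT in ignore_ids are kept
remaining = df[~mask]
print(remaining)
```

The original index labels (1 and 3) are preserved; calling remaining.reset_index(drop=True) would renumber them if that is preferred.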

Related

Power BI : line grouping

I have just started using Power BI, and I don't know how to group rows.
I have this kind of data:
api user 01/07/21 02/07/21 03/07/21 ...
a 25 null 3 4
b 25 1 null 2
c 25 1 4 5
a 30 4 3 5
b 30 3 2 2
c 30 1 1 3
I would like to have the sum of the values per user, rather than per api and user:
user 01/07/21 02/07/21 03/07/21 ...
25 2 7 11
30 8 6 10
Do you know how to do this?

I created a table with your sample data (make sure your values are treated as numbers).
Then create a Matrix visual, with "user" in Rows and your desired columns in the Values section.

Power BI - get the graph out of the data set

It seems very simple, but I cannot get the graph to show the data I want.
I have a lot of IDs with start and end dates (LENGTH) and open items (OPEN). Each day has an availability (AVAIL), and nothing is used (USED) on day 1.
ID LENGTH OPEN USED AVAIL
1A 6 100 0 2400
I need to create a NEW_DAY column that counts from 1 up to LENGTH. In this case the result would be:
ID LENGTH NEW_DAY OPEN USED AVAIL
1A 6 1 100 0 2400
1A 6 2 100 0 2400
1A 6 3 100 0 2400
1A 6 4 100 0 2400
1A 6 5 100 0 2400
1A 6 6 100 0 2400
Note: I have hundreds of IDs, so I cannot hard-code 1A; the solution needs to be dynamic.

I am not sure, but this might help you.
If you add a blank query and enter this expression:
= List.Repeat({1, 2}, 3)
you will get the first argument {1, 2} repeated three times, i.e. {1, 2, 1, 2, 1, 2}.
If you separate your ID into a new column and pass that column to the expression above (and likewise supply the second argument from a column), it might work.
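To make the desired transformation concrete, here is a minimal sketch of the row expansion in plain Python rather than Power Query. The function name expand_by_length is made up for illustration, and this only shows the logic (one output row per day from 1 to LENGTH); it is not a Power BI solution.

```python
def expand_by_length(rows):
    """Repeat each row LENGTH times, adding a NEW_DAY counter starting at 1."""
    expanded = []
    for row in rows:
        for day in range(1, row["LENGTH"] + 1):
            # Copy the original columns and attach the day number
            expanded.append({**row, "NEW_DAY": day})
    return expanded


# Sample row taken from the question
rows = [{"ID": "1A", "LENGTH": 6, "OPEN": 100, "USED": 0, "AVAIL": 2400}]

expanded = expand_by_length(rows)
for r in expanded:
    print(r["ID"], r["LENGTH"], r["NEW_DAY"], r["OPEN"], r["USED"], r["AVAIL"])
```

Because the loop reads LENGTH from each row, nothing is hard-coded per ID.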

replace multiple column values at the same time

I would like to replace multiple column values at the same time in a data frame: in the cluster column, I want to change 2 to 1 and 1 to 2.
data=data.frame(store=c(122,323,254,435,654,342,234,344),
                cluster=c(2,2,2,1,1,3,3,3))
The problem with my code is that after it changes the 2s to 1s, it changes those 1s back to 2s.
Can I do this in dplyr or something similar? Thank you.
The desired data set is below:
store cluster
122 1
323 1
254 1
435 2
654 2
342 3
234 3
344 3
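The clobbering described above comes from applying the two replacements one after the other. One way around it is to look every value up in a single mapping, so each element is translated exactly once. The sketch below shows the idea in Python rather than R; the name swap is illustrative only.

```python
# Values copied from the question's cluster column
cluster = [2, 2, 2, 1, 1, 3, 3, 3]

# Single lookup table: each original value maps to its replacement
swap = {2: 1, 1: 2}

# Every element is looked up once against its ORIGINAL value,
# so a 2 that becomes 1 is never turned back into 2.
# Values missing from the table (here 3) are left unchanged.
new_cluster = [swap.get(c, c) for c in cluster]
print(new_cluster)
```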

Adding column based on ID in another data

data1 is data from 1990 and it looks like
Panelkey Region income
1 9 30
2 1 20
4 2 40
data2 is data from 2000 and it looks like
Panelkey Region income
3 2 40
2 1 30
1 1 20
I want to add a column showing where each person lived in 1990:
Panelkey Region income Region1990
3 2 40 .
2 1 30 1
1 1 20 9
How can I do this in Stata?

The following code handles panel members who live in multiple regions in the same year by choosing the region with the larger income. This would make sense if income were proportional to the fraction of the year spent in a region. Ties in income are broken by keeping the highest Region value. Other types of aggregation might make sense (take a look at the -collapse- command).
Note that I tweaked your data by inserting a second row for the last observation in each year:
clear
input Panelkey Region income
1 9 30
2 1 20
4 2 40
4 10 80
end
rename (Region income) =1990
bysort Panelkey (income Region): keep if _n==_N
isid Panelkey
save "data1990.dta", replace
clear
input Panelkey Region income
3 2 40
2 1 30
1 1 20
1 9 20
end
bysort Panelkey (income Region): keep if _n==_N
isid Panelkey
merge 1:1 Panelkey using "data1990.dta", keep(match master) nogen
list, clean noobs
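For readers who do not use Stata, the same two steps (keep one row per Panelkey, then attach the 1990 region to the 2000 rows) can be sketched in plain Python. This is only an illustration of the logic under the same tie-breaking rule; the helper name pick_region is made up, and unlike the Stata merge it carries over only the region, not the 1990 income.

```python
def pick_region(rows):
    """Keep one row per Panelkey: the one with the largest (income, Region)."""
    best = {}
    for r in rows:
        key = r["Panelkey"]
        rank = (r["income"], r["Region"])
        if key not in best or rank > (best[key]["income"], best[key]["Region"]):
            best[key] = r
    return best


# Tweaked sample data from the answer above
data1990 = [
    {"Panelkey": 1, "Region": 9, "income": 30},
    {"Panelkey": 2, "Region": 1, "income": 20},
    {"Panelkey": 4, "Region": 2, "income": 40},
    {"Panelkey": 4, "Region": 10, "income": 80},
]
data2000 = [
    {"Panelkey": 3, "Region": 2, "income": 40},
    {"Panelkey": 2, "Region": 1, "income": 30},
    {"Panelkey": 1, "Region": 1, "income": 20},
    {"Panelkey": 1, "Region": 9, "income": 20},
]

best1990 = pick_region(data1990)
best2000 = pick_region(data2000)

# Left join: every 2000 row is kept; Region1990 is None when there is no match
merged = [
    {**row, "Region1990": best1990.get(key, {}).get("Region")}
    for key, row in best2000.items()
]
for row in merged:
    print(row)
```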

Django query aggregation

Imagine a number guessing game where one person thinks of a number and another person has to guess it. The game is over if the correct number was guessed.
The models might look like this
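from django.db import models

class SecretNumber(models.Model):
    number = models.IntegerField()

class Guess(models.Model):
    # on_delete is required from Django 2.0 onward
    secretnumber = models.ForeignKey(SecretNumber, on_delete=models.CASCADE)
    guess = models.IntegerField()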
After having played four times, the database might look like this:
id number
==========
1 10
2 54
3 68
4 25
id secretnumber_id guess
=============================
1 1 50
2 1 30
3 1 10
4 2 99
5 2 60
6 2 54
7 3 1
8 3 68
9 4 73
10 4 34
11 4 86
12 4 51
13 4 25
As you can see, the guesser was very lucky: it took him 3, 3, 2 and 5 guesses. But that's just to keep this example short.
Now I need to come up with a query which will allow me to display the following data:
Nb. guesses Count
=====================
2 1
3 2
5 1
A manual SQL statement would look something like this:
SELECT inner_count AS 'Nb. guesses', count(inner_count) AS 'Count' FROM (
SELECT secretnumber_id, count(id) AS inner_count FROM guess GROUP BY secretnumber_id
) GROUP BY inner_count
I thought about annotating an annotation, but this seems not to be possible.
Any ideas?

If you're using Django models, you want to use the QuerySet aggregation functions,
e.g.
from django.db.models import Count
guesses = Guess.objects.values('secretnumber').annotate(Count('secretnumber'))
This will give you a queryset of dictionaries, each with a secretnumber key and a secretnumber__count key holding the number of guesses for that secret number.
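The query above covers the inner GROUP BY only. One possible way to get the outer grouping is to count the per-game totals once more in Python. The sketch below is not a Django ORM solution: it uses collections.Counter on the secretnumber_id column from the guess table above so that it runs without a database, and the variable names are illustrative.

```python
from collections import Counter

# secretnumber_id column of the guess table above, one entry per guess
secretnumber_ids = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4]

# First level: number of guesses per secret number
# (the same figures the values().annotate(Count(...)) query returns)
guesses_per_game = Counter(secretnumber_ids)

# Second level: how many games needed each number of guesses
games_per_guess_count = Counter(guesses_per_game.values())

for nb_guesses, count in sorted(games_per_guess_count.items()):
    print(nb_guesses, count)
```

With a real queryset, the second Counter could be fed the secretnumber__count value of each dictionary instead.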
