Replace multiple column values at the same time

I would like to replace multiple column values at the same time in a data frame: change 2 to 1 and 1 to 2.
data=data.frame(store=c(122,323,254,435,654,342,234,344)
,cluster=c(2,2,2,1,1,3,3,3))
The problem with my code is that after it changes 2 to 1, it changes those 1s back to 2.
Can I do this in dplyr or something similar? Thank you.
Desired data set below
store cluster
122 1
323 1
254 1
435 2
654 2
342 3
234 3
344 3
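The problem described above comes from applying the two replacements one after the other. A minimal sketch of a simultaneous replacement, written in pandas rather than R (so an analogue of what is asked, not a dplyr answer), looks up every value once in a single mapping. The name `swap` is hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    "store": [122, 323, 254, 435, 654, 342, 234, 344],
    "cluster": [2, 2, 2, 1, 1, 3, 3, 3],
})

# Hypothetical lookup table: each old value maps to its new value.
swap = {2: 1, 1: 2}

# Every element is looked up once against its original value, so a 2 that
# becomes 1 is never turned back into 2. Values missing from the mapping
# (here 3) are kept unchanged.
data["cluster"] = data["cluster"].map(lambda v: swap.get(v, v))

print(data)
```

The same idea should carry over to any language: build the old-to-new mapping first, then apply it in one pass instead of chaining two assignments.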

How to sum by group in Power Query Editor?

My table looks like this:
Serial WO# Value Indicator
A 333 10 333-1
A 333 4 333-2
B 456 5 456-1
A 334 1 334-1
A 334 5 334-2
I want to create a new column that sums up the Values based on WO#. It should look like this:
Serial WO# Value Indicator SumValue
A 333 10 333-1 14
A 333 4 333-2 14
B 456 5 456-1 5
A 334 1 334-1 6
A 334 5 334-2 6
Eventually I will remove duplicates on the WO# and remove the Value and Indicator Columns from the data. I can't seem to find a function in M that allows for sum by group. Thanks in advance!
If you load the data with Power Query, there is a Group By command on the ribbon that will do just that.
Make sure to use the Advanced option and add all columns you want to retain to the grouping section.
[Screenshots from Excel and from Power BI]
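For readers who want to check the expected numbers outside Power Query, here is a minimal sketch of the same "sum by group, keep every row" logic in pandas. It is an analogue, not M code, and the variable names `df` and `summary` are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Serial": ["A", "A", "B", "A", "A"],
    "WO#": [333, 333, 456, 334, 334],
    "Value": [10, 4, 5, 1, 5],
    "Indicator": ["333-1", "333-2", "456-1", "334-1", "334-2"],
})

# transform("sum") returns one value per original row, so each group's
# total is repeated on every row of that group.
df["SumValue"] = df.groupby("WO#")["Value"].transform("sum")

# Later step from the question: one row per WO#, without Value and Indicator.
summary = df.drop(columns=["Value", "Indicator"]).drop_duplicates(subset="WO#")

print(df)
print(summary)
```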

SAS Proc Print - No Output

I am so frustrated. I can't even get a proc print to work. I've tried so many things. I don't see the table in results viewer. My log says the file has been read and that I should see results. I've tried turning ods off and on and saving to work folder or saving to my own folder. I've tried switching to a list output. Right now, I just want this code to run which I got from: https://support.sas.com/resources/papers/proceedings11/270-2011.pdf .
data energy;
length state $2;
input region division state $ type expenditures @@;
datalines;
1 1 ME 1 708 1 1 ME 2 379 1 1 NH 1 597 1 1 NH 2 301
1 1 VT 1 353 1 1 VT 2 188 1 1 MA 1 3264 1 1 MA 2 2498
1 1 RI 1 531 1 1 RI 2 358 1 1 CT 1 2024 1 1 CT 2 1405
1 2 NY 1 8786 1 2 NY 2 7825 1 2 NJ 1 4115 1 2 NJ 2 3558
1 2 PA 1 6478 1 2 PA 2 3695 4 3 MT 1 322 4 3 MT 2 232
4 3 ID 1 392 4 3 ID 2 298 4 3 WY 1 194 4 3 WY 2 184
4 3 CO 1 1215 4 3 CO 2 1173 4 3 NM 1 545 4 3 NM 2 578
4 3 AZ 1 1694 4 3 AZ 2 1448 4 3 UT 1 621 4 3 UT 2 438
4 3 NV 1 493 4 3 NV 2 378 4 4 WA 1 1680 4 4 WA 2 1122
4 4 OR 1 1014 4 4 OR 2 756 4 4 CA 1 10643 4 4 CA 2 10114
4 4 AK 1 349 4 4 AK 2 329 4 4 HI 1 273 4 4 HI 2 298
;
proc sort data=energy out=energy_report;
by region division type;
run;
proc format;
value regfmt 1='Northeast'
2='South'
3='Midwest'
4='West';
value divfmt 1='New England'
2='Middle Atlantic'
3='Mountain'
4='Pacific';
value usetype 1='Residential Customers'
2='Business Customers';
run;
ods html file='my_report.html';
proc print data=energy_report;
run;
ods html close;
My log shows no errors:
NOTE: Writing HTML Body file: my_report.html
1582 proc print data=energy_report;
1583 run;
NOTE: There were 44 observations read from the data set WORK.ENERGY_REPORT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
When I go into my temporary files, I can open the "energy" and "energy_report" data set and I can view all the data. Why can't I see a print output? I'm not sure what I'm missing. I checked the output window, the results viewer window, and all the generated html files. They're all blank.
Thank you
It depends a lot on your setup, but I would enable both HTML and listing output and then check the output.
ods listing;
ods html;
proc print data=sashelp.class;
run;
If you're using Enterprise Guide (EG), the results should be in the process flow. In SAS Studio, look in the Results tab. In Base SAS, click on Results and open the output if necessary.
There is an option called 'Show Results as Generated' and it's possible it's been set to off in your installation for some reason. I often set mine up this way because I often generate a lot of files at once (HTML/XLSX) and don't want them to open up automatically.
When you write to my_report.html without a path, the file will probably try to go to C:\my_report.html. Put in a full file path instead, and check that location when you're done.
Change
ods html file='my_report.html';
proc print data=energy_report;
run;
ods html close;
to
ods html file="&path./my_report4.html";
proc print data=energy_report;
run;
ods html close;
where &path contains the path where the file will be created.
Important: use double quotes (") instead of single quotes ('). Macro variables such as &path are not resolved inside single quotes.

Duplicate each row as many times as is given in a variable

I have a set of individuals with characteristics. Each individual belongs to one or more groups. I need to merge individuals with group characteristics, by first duplicating each row of the individual data set as many times as is given by n_groups.
The data looks like
id age n_groups
1 50 2
2 46 1
3 51 3
4 44 2
I need to have
id age n_groups group_index
1 50 2 1
1 50 2 2
2 46 1 1
3 51 3 1
3 51 3 2
3 51 3 3
4 44 2 1
4 44 2 2
It seems like a very easy task, and I need some variation of expand with variable number of duplicates. Any ideas if there is a simple command for this?
Thanks!
It appears the solution is very standard. The expand command does allow expanding based on a variable: expand n_groups solved the problem.
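Note that expand only duplicates the rows; the group_index column from the desired output still has to be generated within each id. A minimal sketch of both steps in pandas (an analogue, not the Stata command itself; the name `expanded` is an assumption):

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "age": [50, 46, 51, 44],
    "n_groups": [2, 1, 3, 2],
})

# Repeat each row label n_groups times, then select rows by those labels.
expanded = df.loc[df.index.repeat(df["n_groups"])].reset_index(drop=True)

# Number the copies within each id: 1, 2, ..., n_groups.
expanded["group_index"] = expanded.groupby("id").cumcount() + 1

print(expanded)
```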

Data frames pandas python

I have a data frame that looks like this:
id age sallary
1 16 500
2 21 1000
3 25 3000
4 30 6000
5 40 25000
and a list of ids that I would like to ignore: [1,3,5].
How can I get a data frame that contains all the remaining rows (ids 2 and 4)?
Big thanks to everyone.
Call isin and negate the result using ~:
In [42]:
ignore_ids=[1,3,5]
df[~df.id.isin(ignore_ids)]
Out[42]:
id age sallary
1 2 21 1000
3 4 30 6000

PROC RANK by score: minimum count of the target variable in each bin

I have used SAS PROC RANK to rank a population based on score and create groups of equal size. I would like to create groups such that each bin contains a minimum number of each target outcome (Goods and Bads). Is there a way to do that using PROC RANK? I understand that the bins would then differ in size.
For example, in the table below I have created 10 groups based on a certain score. As you can see, the Non cures in the higher-numbered deciles are sparse. I would like to create groups such that there are at least 10 Non cures in each group.
Cures and Non cures are based on same variable: Cure = 1 and Cure = 0.
Decile cures non cures
0 262 94
1 314 44
2 340 19
3 340 13
4 353 10
5 373 5
6 308 3
7 342 3
8 440 4
9 305 3
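To make the requirement concrete, here is a minimal sketch that merges adjacent deciles until every group holds at least 10 Non cures. It is not PROC RANK and not a SAS solution; it is a greedy post-processing step on the counts in the table above, written in Python, and the function name `merge_bins` is hypothetical.

```python
def merge_bins(counts, minimum):
    """Greedily merge adjacent bins until each group holds at least `minimum`."""
    groups, current, total = [], [], 0
    for bin_id, count in enumerate(counts):
        current.append(bin_id)
        total += count
        if total >= minimum:
            groups.append(current)
            current, total = [], 0
    # Leftover bins that never reached the minimum join the last full group.
    if current:
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups


# Non cures per decile, copied from the table above (deciles 0 to 9).
non_cures = [94, 44, 19, 13, 10, 5, 3, 3, 4, 3]
groups = merge_bins(non_cures, minimum=10)
print(groups)
```

With these counts, deciles 0 to 4 stay on their own and deciles 5 to 9 collapse into a single group of 18 Non cures. Applying the same rule to the Cures column as well would need both counts checked before a group is closed.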