Rollup function in SAS

I would like to add a summary record after each group of records that belong to a specific shop. So, I have this:
Shop_id Trans_id Count
1 1 10
1 2 23
1 3 12
2 1 8
2 2 15
And want to have this:
Shop_id Trans_id Count
1 1 10
1 2 23
1 3 12
. . 45
2 1 8
2 2 15
. . 23
I have done this using PROC SQL but I would like to do this using PROC REPORT as I have read that PROC REPORT should handle such cases.

Try this:
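/* Sample data from the question */
data have;
input shop_id Trans_id Count;
cards;
1 1 10
1 2 23
1 3 12
2 1 8
2 2 15
;
/* OUT= saves the report rows, including the summary rows, to WANT */
proc report data=have out=want(drop=_:);
define shop_id/group;
define trans_id/order;
define count/sum;
/* Add a subtotal row after each shop */
break after shop_id/summarize;
compute after shop_id;
/* shop_id is numeric, so clear it on the summary row with a numeric missing value */
if _break_='shop_id' then shop_id=.;
endcomp;
run;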
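For readers who want to check the expected rows independently of SAS, the rollup that BREAK AFTER ... / SUMMARIZE performs can be sketched in Python. This is only an illustration of the logic, not what PROC REPORT does internally; the function name with_subtotals is made up for this example, and the rows are assumed to be sorted by shop already.

```python
from itertools import groupby

# Rows from the question: (shop_id, trans_id, count), already sorted by shop_id.
rows = [(1, 1, 10), (1, 2, 23), (1, 3, 12), (2, 1, 8), (2, 2, 15)]


def with_subtotals(rows):
    """Return the rows with a (None, None, subtotal) row after each shop.

    Hypothetical helper for illustration; None plays the role of a SAS missing value.
    """
    out = []
    for _shop_id, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        out.extend(group)
        out.append((None, None, sum(r[2] for r in group)))
    return out


result = with_subtotals(rows)
for row in result:
    print(row)
```

The subtotal rows come out as 45 for shop 1 and 23 for shop 2, matching the table the question asks for.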

Subtract Value at Aggregate by Quarter

Values are for two groups by quarter.
In DAX, I need to summarize all the data but also subtract 3 from each quarter in 2021 for Group 1, without allowing the value to go below 0.
This only impacts:
Group 1 Only
2021 Only
However, I also need to retain the data details without the adjustment. So I can't do this in Power Query. My data detail is actually in months but I'm only listing one date per quarter for brevity.
Data:
Group Date Value
1 01/01/2020 10
1 04/01/2020 8
1 07/01/2020 18
1 10/01/2020 2
1 01/01/2021 12
1 04/01/2021 3
1 07/01/2021 7
1 10/01/2021 2
2 01/01/2020 10
2 04/01/2020 8
2 07/01/2020 18
2 10/01/2020 2
2 01/01/2021 12
2 04/01/2021 3
2 07/01/2021 7
2 10/01/2021 2
Result:
Group Qtr/Year Value
1 Q1-2020 10
1 Q2-2020 8
1 Q3-2020 18
1 Q4-2020 2
1 2020 38
1 Q1-2021 9
1 Q2-2021 0
1 Q3-2021 4
1 Q4-2021 0
1 2021 13
2 Q1-2020 10
2 Q2-2020 8
2 Q3-2020 18
2 Q4-2020 2
2 2020 38
2 Q1-2021 12
2 Q2-2021 3
2 Q3-2021 7
2 Q4-2021 2
2 2021 24
Your issue can be solved by adding a new column that adjusts the value and then building a Matrix table on it.
First, add a new column using the following formula. The adjustment is limited to Group 1 in 2021, as the question requires:
Revised value =
var newValue = IF(YEAR(Sheet1[Date])=2021 && Sheet1[Group]=1,Sheet1[Value]-3,Sheet1[Value])
return
IF(newValue <0,0,newValue)
Second, create the matrix table on Revised value to get the desired outcome.
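As a quick check of the rule stated in the question (subtract 3 only for Group 1 in 2021, never going below 0), here is a minimal Python sketch applied to the sample rows. It is not DAX and not part of the answer; the function name revised_value and the totals dictionary are made up for illustration, and the dates are read as month/day/year.

```python
from datetime import date

# Sample rows from the question: (group, date, value), one date per quarter.
quarter_values = [
    (date(2020, 1, 1), 10), (date(2020, 4, 1), 8),
    (date(2020, 7, 1), 18), (date(2020, 10, 1), 2),
    (date(2021, 1, 1), 12), (date(2021, 4, 1), 3),
    (date(2021, 7, 1), 7), (date(2021, 10, 1), 2),
]
rows = [(group, d, v) for group in (1, 2) for d, v in quarter_values]


def revised_value(group, d, value):
    """Subtract 3 for Group 1 in 2021 only, flooring the result at 0."""
    if group == 1 and d.year == 2021:
        return max(value - 3, 0)
    return value


# Yearly totals per group, computed on the revised values.
totals = {}
for group, d, value in rows:
    key = (group, d.year)
    totals[key] = totals.get(key, 0) + revised_value(group, d, value)

print(totals)
```

With the sample data, Group 1 in 2021 sums to 13 (9 + 0 + 4 + 0), while every other group and year keeps its unadjusted total.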

SAS Proc Print - No Output

I am so frustrated. I can't even get a proc print to work. I've tried so many things. I don't see the table in results viewer. My log says the file has been read and that I should see results. I've tried turning ods off and on and saving to work folder or saving to my own folder. I've tried switching to a list output. Right now, I just want this code to run which I got from: https://support.sas.com/resources/papers/proceedings11/270-2011.pdf .
data energy;
length state $2;
input region division state $ type expenditures @@;
datalines;
1 1 ME 1 708 1 1 ME 2 379 1 1 NH 1 597 1 1 NH 2 301
1 1 VT 1 353 1 1 VT 2 188 1 1 MA 1 3264 1 1 MA 2 2498
1 1 RI 1 531 1 1 RI 2 358 1 1 CT 1 2024 1 1 CT 2 1405
1 2 NY 1 8786 1 2 NY 2 7825 1 2 NJ 1 4115 1 2 NJ 2 3558
1 2 PA 1 6478 1 2 PA 2 3695 4 3 MT 1 322 4 3 MT 2 232
4 3 ID 1 392 4 3 ID 2 298 4 3 WY 1 194 4 3 WY 2 184
4 3 CO 1 1215 4 3 CO 2 1173 4 3 NM 1 545 4 3 NM 2 578
4 3 AZ 1 1694 4 3 AZ 2 1448 4 3 UT 1 621 4 3 UT 2 438
4 3 NV 1 493 4 3 NV 2 378 4 4 WA 1 1680 4 4 WA 2 1122
4 4 OR 1 1014 4 4 OR 2 756 4 4 CA 1 10643 4 4 CA 2 10114
4 4 AK 1 349 4 4 AK 2 329 4 4 HI 1 273 4 4 HI 2 298
;
proc sort data=energy out=energy_report;
by region division type;
run;
proc format;
value regfmt 1='Northeast'
2='South'
3='Midwest'
4='West';
value divfmt 1='New England'
2='Middle Atlantic'
3='Mountain'
4='Pacific';
value usetype 1='Residential Customers'
2='Business Customers';
run;
ods html file='my_report.html';
proc print data=energy_report;
run;
ods html close;
My log shows no errors:
NOTE: Writing HTML Body file: my_report.html
1582 proc print data=energy_report;
1583 run;
NOTE: There were 44 observations read from the data set WORK.ENERGY_REPORT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
When I go into my temporary files, I can open the "energy" and "energy_report" data set and I can view all the data. Why can't I see a print output? I'm not sure what I'm missing. I checked the output window, the results viewer window, and all the generated html files. They're all blank.
Thank you
It depends a lot on your set up, but I would enable HTML & Listing output and then check the output.
ods listing;
ods html;
proc print data=sashelp.class;
run;
If you're using Enterprise Guide, the results should be in the process flow. In SAS Studio, look in the Results tab. In Base SAS, click on Results and open the output if necessary.
There is an option called 'Show Results as Generated', and it's possible it has been turned off in your installation for some reason. I often set mine up this way because I generate a lot of files at once (HTML/XLSX) and don't want them to open automatically.
Because you print to my_report.html without a path, the file will probably be trying to go to C:\my_report.html. Put in a full file path instead, and check that file when you're done.
change
ods html file='my_report.html';
proc print data=energy_report;
run;
ods html close;
to
ods html file="&path./my_report4.html";
proc print data=energy_report;
run;
ods html close;
where &path is a macro variable that contains the path where the file will be created.
Important: use double quotes (") instead of single quotes ('). Macro variables such as &path are not resolved inside single quotes.

Count the total of unique numbers that occur in a range of cells

Hello, this is my data sample:
coustmer_NO id
1 5
1 13
2 4
2 4
2 4
3 4
3 10
4 8
4 8
Using SQL, I would like to count, for each customer, how many different ids they have.
The expected output is:
coustmer_NO total_id
1 2
2 1
3 2
4 1
I guess there is a typo in your data.
The result should be:
coustmer_NO total_id
1 2
2 1
3 2
4 1
You can do the following:
SELECT coustmer_NO, count(distinct id) AS total_id FROM <table_name> GROUP BY coustmer_NO;
Try this query in MySQL:
select coustmer_NO, count(distinct id) as 'total_id' from table_name group by coustmer_NO;
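To see the query run end to end, here is a small sketch that loads the sample data into an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the table name visits is an assumption made for this example; only the column names come from the question.

```python
import sqlite3

# In-memory database; "visits" is a hypothetical table name for the sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (coustmer_NO INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, 5), (1, 13), (2, 4), (2, 4), (2, 4), (3, 4), (3, 10), (4, 8), (4, 8)],
)

# COUNT(DISTINCT id) counts each id once per customer.
counts = conn.execute(
    "SELECT coustmer_NO, COUNT(DISTINCT id) AS total_id "
    "FROM visits GROUP BY coustmer_NO ORDER BY coustmer_NO"
).fetchall()
conn.close()

print(counts)
```

The ORDER BY is added only so the rows come back in a predictable order.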

SAS - How to keep the earliest date considering a missing

I need to create a new variable that repeats the earliest date for an ID's visits. If the date is missing, the new variable should be missing too, and after a missing value it should hold the earliest date since that missing value (as in the example). I've tried the LAG function and it didn't work; I also tried keeping the value, but that just repeats 25NOV2015 for all records. The final result ("what I need") is in the last column.
Thanks
Example
You need to use the RETAIN statement. RETAIN means the variable is not reset to missing at the start of each observation, so in the next iteration of the data step it still remembers its value.
Sample data
data a;
input date;
format date ddmmyy10.;
datalines;
.
5
6
7
.
1
2
.
9
;
run;
Solution
data b;
set a;
retain new_date;
format new_date ddmmyy10.;
if date = . then
new_date = .;
if new_date = . then
new_date = date;
run;
Since you didn't post any data I will make up some. Also since the fact that your variable is a date doesn't really impact the answer I will just use some integers as they are easier to type.
data have ;
input id value @@ ;
cards;
1 . 1 2 1 3 1 . 1 5 1 6 1 . 1 8
2 1 2 2 2 3 2 . 2 5 2 6
;;;;
Basically your algorithm says that you want to store the value when either the current value is missing or stored value is missing. With multiple BY groups you would also want to set it when you start a new group.
data want ;
set have ;
by id ;
retain new_value ;
if first.id or missing(new_value) or missing(value)
then new_value=value;
run;
Results:
Obs id value new_value
1 1 . .
2 1 2 2
3 1 3 2
4 1 . .
5 1 5 5
6 1 6 5
7 1 . .
8 1 8 8
9 2 1 1
10 2 2 1
11 2 3 1
12 2 . .
13 2 5 5
14 2 6 5
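The same carry-forward rule can be traced outside SAS. The Python sketch below mirrors the condition in the data step (start of a new id, held value missing, or current value missing); the function name carry_first_value is made up for illustration, and None stands in for a SAS missing value.

```python
def carry_first_value(records):
    """Mimic the RETAIN logic: reset the held value at a new id,
    when nothing is held yet, or when the current value is missing."""
    out = []
    prev_id = object()  # sentinel that never equals a real id
    held = None
    for rec_id, value in records:
        if rec_id != prev_id or held is None or value is None:
            held = value
        out.append((rec_id, value, held))
        prev_id = rec_id
    return out


# Same made-up data as the HAVE data set above.
have = [
    (1, None), (1, 2), (1, 3), (1, None), (1, 5), (1, 6), (1, None), (1, 8),
    (2, 1), (2, 2), (2, 3), (2, None), (2, 5), (2, 6),
]

want = carry_first_value(have)
new_values = [held for _id, _value, held in want]
print(new_values)
```

The held values agree with the new_value column in the results above.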

SAS: No valid observations are found Error - Simple Regression

I have a big panel time series data set. I wish to run this basic SAS regression code:
proc sort data=dataset;
by time_id;
run;
ods output parameterestimates=pe;
proc reg data=dataset;
by time_id;
model y=x1 x2 x3....x15;
quit;
run;
I get this error when I run the code:
ERROR: No valid observations are found.
NOTE: The above message was for the following BY group:
time_id=1
ERROR: No valid observations are found.
NOTE: The above message was for the following BY group:
time_id=2....
Why? My time_id variable exists... is it because I have too many time_id values? If I select firm_id it works but I want time_id.
Here's a sample of my data (panel time series):
y x firm_id time_id
3.4 100 1 1
2.3 200 1 2
6.5 653 1 3
3 50 2 1
4.34 23 2 2
4.8 55 2 3
1.311 400 3 1
1.23 200 3 2
5.63 50 3 3
You'll get this error message if all values of a particular x variable are missing for a given time_id. More generally, PROC REG drops every observation that has a missing value in any model variable, so the error appears whenever a BY group has no complete observations left. Take a look at the example below, where all values of x2 are missing for time_id 1. When you run the code, the Results Output window details the problem (the number of observations with missing values is the same as the number of observations read).
It presumably works for firm_id because each firm_id group spans several time periods, so not all values of a particular x variable are missing within any firm_id.
data have;
input y x1 x2 firm_id time_id;
cards;
3.4 100 . 1 1
2.3 200 200 1 2
6.5 653 653 1 3
3 50 . 2 1
4.34 23 23 2 2
4.8 55 55 2 3
1.311 400 . 3 1
1.23 200 200 3 2
5.63 50 50 3 3
;
run;
proc sort data=have;
by time_id;
run;
ods output parameterestimates=pe;
proc reg data=have;
by time_id;
model y=x1-x2;
run;
quit;
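One way to find the offending BY groups before running the regression is to count the complete observations in each group. The Python sketch below does this for the example data; it only illustrates the listwise-deletion check and is not a SAS feature. The function name complete_cases_by_group is made up for this example, and None stands in for a SAS missing value.

```python
# Example data from the answer: (y, x1, x2, firm_id, time_id).
have = [
    (3.4, 100, None, 1, 1),
    (2.3, 200, 200, 1, 2),
    (6.5, 653, 653, 1, 3),
    (3.0, 50, None, 2, 1),
    (4.34, 23, 23, 2, 2),
    (4.8, 55, 55, 2, 3),
    (1.311, 400, None, 3, 1),
    (1.23, 200, 200, 3, 2),
    (5.63, 50, 50, 3, 3),
]


def complete_cases_by_group(rows):
    """Count rows per time_id in which y, x1 and x2 are all non-missing."""
    counts = {}
    for y, x1, x2, _firm_id, time_id in rows:
        complete = all(v is not None for v in (y, x1, x2))
        counts[time_id] = counts.get(time_id, 0) + (1 if complete else 0)
    return counts


counts = complete_cases_by_group(have)
empty_groups = [t for t, n in counts.items() if n == 0]
print(counts, empty_groups)
```

Here time_id 1 has no complete observations, which is the group the regression would reject.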