SAS SGPlot scatterplot options--Grouping by attribute - sas

I have two questions concerning SGPlot in SAS. I have a large dataset, and am trying to create a scatterplot which highlights differences between brands of similar prooduct. I am able to get different colors for each brand, but for some reason the symbols will not show up. I had to close ods listing and html because I was getting an error ERROR: Cannot write image to filename.png. Please ensure that proper disk permissions are set. I'm not sure this can be fixed. Is there another way to get the symbols?
Also, I have a 95% prediction ellipse, but am wondering if there is a way to have an ellipse for each brand. Thanks.
data example;
length product brand $20;
input product $ brand $ item price trans;
cards;
sconces AllenRoth 1 1.5 300
sconces AllenRoth 2 2.75 350
sconces AllenRoth 3 1.75 300
sconces AllenRoth 4 0.75 400
sconces GardenTreasures 1 3 200
sconces GardenTreasures 2 3.25 175
sconces GardenTreasures 3 2.75 100
sconces GardenTreasures 4 3.5 100
sconces GardenTreasures 5 4 150
sconces OtherBrand 1 0.5 850
sconces OtherBrand 2 0.45 875
sconces OtherBrand 3 0.75 900
sconces OtherBrand 4 1 650
sconces OtherBrand 5 0.75 700
sconces BrandX 1 1 200
sconces BrandX 2 1.25 500
sconces BrandX 3 1.2 400
sconces BrandX 4 0.95 375
sconces BrandX 5 1 300
sconces BrandX 6 1 200
sconces BrandX 7 1.35 400
sconces BrandX 8 1.5 350
curtains AllenRoth 1 10 200
curtains AllenRoth 2 12 250
curtains AllenRoth 3 11.5 200
curtains AllenRoth 4 10 400
curtains AllenRoth 5 17 500
curtains AllenRoth 6 15 100
curtains AllenRoth 7 29 50
curtains AllenRoth 8 50 12
curtains GardenTreasures 1 80 150
curtains GardenTreasures 2 60 75
curtains GardenTreasures 3 100 50
curtains BrandX 1 9 300
curtains BrandX 2 12 350
curtains BrandX 3 10 275
curtains BrandX 4 7.5 400
curtains BrandX 5 12 200
curtains BrandX 6 8.5 500
;
run;
proc format;
value legfmt
1 = 'legend value 1'
2 = 'legend value 2'
3 = 'legend value 3'
4 = 'legend value 4';
run;
proc sort data= example;
by product brand;
run;
ods listing close;
ods html close;
proc sgplot data= example ;
title1 "plotting trans by price";
footnote1 "final";
by product;
scatter x= trans y= price / datalabel= item group= brand name= "scp";
ellipse x=trans y=price;
xaxis label= "Number of Transactions";
yaxis label= "Average Selling Price";
keylegend "scp" / noborder across= 1 down= 4 location= outside position= topright
title= "Legend";
run;
ods graphics off;
ods listing close;

Likely your gpath is not set properly. This works for me, for example:
ods html path='c:\temp' file='test.html' gpath='c:\temp\' style=htmlblue;
proc sgplot data= example ;
title1 "plotting trans by price";
footnote1 "final";
by product;
scatter x= trans y= price / datalabel= item group= brand name= "scp";
ellipse x=trans y=price;
xaxis label= "Number of Transactions";
yaxis label= "Average Selling Price";
keylegend "scp" / noborder across= 1 down= 4 location= outside position= topright
title= "Legend";
run;
ods html close;
If you don't set gpath to a path you can write to, it may be set to something that you don't have write access to, especially if you have a server installation. PATH and GPATH can be set to the same or different paths.
I don't believe you can have an ellipse for each brand, largely because it would look terrible. Having four prediction ellipses, even with different colors, would be very difficult to visually distinguish. A different chart type may be appropriate if you are trying to show that (perhaps a bar graph for example with brand as the bar type and selling price buckets as a group variable).

Related

PROC TRANSPOSE value column while retaining dates and Hour End

I have data structured like this:
Meter_ID Date HourEnd Value
100 12/01/2007 1 986
100 12/01/2007 2 992
100 12/01/2007 3 1002
200 12/01/2007 1 47
200 12/01/2007 2 45
200 12/01/2007 3 50
300 12/01/2007 1 32
300 12/01/2007 2 37
300 12/01/2007 3 40
And would like to transpose the information so that I end up with this:
Date HourEnd Meter100 Meter200 Meter300
12/01/2007 1 986 47 32
12/01/2007 2 992 45 37
12/01/2007 3 1002 50 40
I have tried numerous PROC TRANSPOSE options and variations and am confusing myself. Any help would be greatly appreciated!
You need to SORT.
data have;
infile cards firstobs=2;
input Meter_ID Date:mmddyy. HourEnd Value;
format date mmddyy10.;
cards;
Meter_ID Date HourEnd Value
100 12/01/2007 1 986
100 12/01/2007 2 992
100 12/01/2007 3 1002
200 12/01/2007 1 47
200 12/01/2007 2 45
200 12/01/2007 3 50
300 12/01/2007 1 32
300 12/01/2007 2 37
300 12/01/2007 3 40
;;;;
run;
proc print;
proc sort data=have;
by date hourend meter_id;
run;
proc print;
run;
proc transpose prefix="Meter"n;
by date hourend;
id meter_id;
var value;
run;
proc print;
run;

Populate df row value based on column header

Appreciate any help. Basically, I have a poor data set and am trying to make it more useful.
Below is a representation
df = pd.DataFrame({'State': ("Texas","California","Florida"),
'Q1 Computer Sales': (100,200,300),
'Q1 Phone Sales': (400,500,600),
'Q1 Backpack Sales': (700,800,900),
'Q2 Computer Sales': (200,200,300),
'Q2 Phone Sales': (500,500,600),
'Q2 Backpack Sales': (800,800,900)})
I would like to have a df that creates separate columns for the Quarters and Sales for the respective state.
I think perhaps regex, str.contains, and loops perhaps?
snapshot below
IIUC, you can use:
df_a = df.set_index('State')
df_a.columns = pd.MultiIndex.from_arrays(zip(*df_a.columns.str.split(' ', n=1)))
df_a.stack(0).reset_index()
Output:
State level_1 Backpack Sales Computer Sales Phone Sales
0 Texas Q1 700 100 400
1 Texas Q2 800 200 500
2 California Q1 800 200 500
3 California Q2 800 200 500
4 Florida Q1 900 300 600
5 Florida Q2 900 300 600
Or we can go further:
df_a = df.set_index('State')
df_a.columns = pd.MultiIndex.from_arrays(zip(*df_a.columns.str.split(' ', n=1)), names=['Quarters','Items'])
df_a = df_a.stack(0).reset_index()
df_a['Quarters'] = df_a['Quarters'].str.extract('(\d+)')
print(df_a)
Output:
Items State Quarters Backpack Sales Computer Sales Phone Sales
0 Texas 1 700 100 400
1 Texas 2 800 200 500
2 California 1 800 200 500
3 California 2 800 200 500
4 Florida 1 900 300 600
5 Florida 2 900 300 600

Django ORM QUERY Adjacent row sum with sqlite

In my database I'm storing data as below:
id amt
-- -------
1 100
2 -50
3 100
4 -100
5 200
I want to get output like below
id amt balance
-- ----- -------
1 100 100
2 -50 50
3 100 150
4 -100 50
5 200 250
How to do with in django orm

SAS Proc Print - No Output

I am so frustrated. I can't even get a proc print to work. I've tried so many things. I don't see the table in results viewer. My log says the file has been read and that I should see results. I've tried turning ods off and on and saving to work folder or saving to my own folder. I've tried switching to a list output. Right now, I just want this code to run which I got from: https://support.sas.com/resources/papers/proceedings11/270-2011.pdf .
data energy;
length state $2;
input region division state $ type expenditures ##;
datalines;
1 1 ME 1 708 1 1 ME 2 379 1 1 NH 1 597 1 1 NH 2 301
1 1 VT 1 353 1 1 VT 2 188 1 1 MA 1 3264 1 1 MA 2 2498
1 1 RI 1 531 1 1 RI 2 358 1 1 CT 1 2024 1 1 CT 2 1405
1 2 NY 1 8786 1 2 NY 2 7825 1 2 NJ 1 4115 1 2 NJ 2 3558
1 2 PA 1 6478 1 2 PA 2 3695 4 3 MT 1 322 4 3 MT 2 232
4 3 ID 1 392 4 3 ID 2 298 4 3 WY 1 194 4 3 WY 2 184
4 3 CO 1 1215 4 3 CO 2 1173 4 3 NM 1 545 4 3 NM 2 578
4 3 AZ 1 1694 4 3 AZ 2 1448 4 3 UT 1 621 4 3 UT 2 438
4 3 NV 1 493 4 3 NV 2 378 4 4 WA 1 1680 4 4 WA 2 1122
4 4 OR 1 1014 4 4 OR 2 756 4 4 CA 1 10643 4 4 CA 2 10114
4 4 AK 1 349 4 4 AK 2 329 4 4 HI 1 273 4 4 HI 2 298
;
proc sort data=energy out=energy_report;
by region division type;
run;
proc format;
value regfmt 1='Northeast'
2='South'
3='Midwest'
4='West';
value divfmt 1='New England'
2='Middle Atlantic'
3='Mountain'
4='Pacific';
value usetype 1='Residential Customers'
2='Business Customers';
run;
ods html file='my_report.html';
proc print data=energy_report;
run;
ods html close;
My log shows no errors:
NOTE: Writing HTML Body file: my_report.html
1582 proc print data=energy_report;
1583 run;
NOTE: There were 44 observations read from the data set WORK.ENERGY_REPORT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
When I go into my temporary files, I can open the "energy" and "energy_report" data set and I can view all the data. Why can't I see a print output? I'm not sure what I'm missing. I checked the output window, the results viewer window, and all the generated html files. They're all blank.
Thank you
It depends a lot on your set up, but I would enable HTML & Listing output and then check the output.
ods listing;
ods html;
proc print data=sashelp.class;
run;
If you're using EG the results should be in the process flow. If Studio, in the Results tab, if SAS Base, click on Results and open if necessary.
There is an option called 'Show Results as Generated' and it's possible it's been set to off in your installation for some reason. I often set mine up this way because I often generate a lot of files at once (HTML/XLSX) and don't want them to open up automatically.
Where you print to my_report.html, the file will probably be trying to go to C:\my_report.html - put in a full file path instead, and check that when you're done.
change
ods html file='my_report.html';
proc print data=energy_report;
run;
ods html close;
to
ods html file="&path./my_report4.html";
proc print data=energy_report;
run;
ods html close;
where &path contains the path where the file will be created.
And important : Use " instead of '. Double quote in the place of a quote.

SAS-Create Dynamic Tables Based On ID With Only Top 3 Showing

Hi I would really like to create dynamic tables based on the following sample data, create 4 new data sets based upon PAYEE_ID: 522,622,743,and 888. I want all all of the fields to be in the new 4 data sets, but only have the top 3 AMT_BILLED in the 4 tables for each type of PAYEE_ID
PAYEE_ID PAYEENAME MSG_CODE MSG_DESCRIPTION AMT_BILLED percentbilled claimscounts PercentLines TotalAmount TotNumofClaims
522 MakeBelieve Center 1 AA text field 1 10000 4% 50 16% 275000 305
522 MakeBelieve Center 1 BB text field 2 20000 7% 40 13% 275000 305
522 MakeBelieve Center 1 6N text field 3 30000 11% 30 10% 275000 305
522 MakeBelieve Center 1 5U text field 4 25000 9% 20 7% 275000 305
522 MakeBelieve Center 1 1F text field 5 90000 33% 100 33% 275000 305
522 MakeBelieve Center 1 2E text field 6 100000 36% 65 21% 275000 305
622 Invisible Center 2 A4 text field 1 600 2% 9 7% 34300 134
622 Invisible Center 2 D2 text field 2 700 2% 31 23% 34300 134
622 Invisible Center 2 D4 text field 3 8000 23% 11 8% 34300 134
622 Invisible Center 2 DS text field 4 10000 29% 62 46% 34300 134
622 Invisible Center 2 F8 text field 5 15000 44% 21 16% 34300 134
743 Pretend Center 1 1K text field 1 440 1% 2 1% 41040 246
743 Pretend Center 1 1N text field 2 3000 7% 7 3% 41040 246
743 Pretend Center 1 1V text field 3 400 1% 4 2% 41040 246
743 Pretend Center 1 2W text field 4 15000 37% 63 26% 41040 246
743 Pretend Center 1 3B text field 5 500 1% 2 1% 41040 246
743 Pretend Center 1 3H text field 6 7700 19% 41 17% 41040 246
743 Pretend Center 1 3Z text field 7 14000 34% 127 52% 41040 246
888 It's A MakeBelieve One B7 text field 1 68000 38% 257 29% 178449 886
888 It's A MakeBelieve One B8 text field 2 5000 3% 47 5% 178449 886
888 It's A MakeBelieve One B9 text field 3 200 0% 138 16% 178449 886
888 It's A MakeBelieve One BB text field 4 1562 1% 18 2% 178449 886
888 It's A MakeBelieve One BO text field 5 39999 22% 3 0% 178449 886
888 It's A MakeBelieve One BZ text field 6 40000 22% 2 0% 178449 886
888 It's A MakeBelieve One C2 text field 7 500 0% 5 1% 178449 886
888 It's A MakeBelieve One C5 text field 8 7865 4% 395 45% 178449 886
888 It's A MakeBelieve One C7 text field 9 8649 5% 14 2% 178449 886
888 It's A MakeBelieve One CR text field 10 5674 3% 1 0% 178449 886
888 It's A MakeBelieve One CX text field 11 1000 1% 6 1% 178449 886
to
I'm new to SAS, and this would really help me out. Thank you so much!
proc sort data=sampleData out=sampleData_s;
by payee_id amt_billed;
run;
You can use descending if by 'top' you mean largest e.g. by payee_id descending amt_billed;
Once the data are sorted you are able to read into a data step and use first and last e.g.
data partial_solution(drop=count);
retain count 0;
set sampleData_s;
by payee_id descending amt_billed;
if first.payee_id then count=0;
count+1;
if count le 3 then output;
run;
To output to different dataset names:
proc sort data=sampleData(keep=payee_id) out=all_payee_ids nodupkey;
by payee_id;
run;
data _null_;
length id_list $10000; * needs to be long enough to contain all ids;
* if you do not state this, sas will default;
* length to first value;
retain id_list;
set all_payee_ids end=eof;
id_list = catx('|', id_list, payee_id);
if eof then call symputx('macroVarIdList', id_list);
run;
You've now got a pipe separated list of all your id's. You can loop through these using them to create names for you datasets. You need to do this as SAS needs to know the names of the datasets you want to output to up front e.g.
data ds1 ds2 ds3 ds4;
set some_guff;
if blah then output ds1;
else if blahblah then output ds2;
else output d3;
output d4;
run;
So with the macro var loop:
%let nrVars=%sysfunc(countw(&macroVarIdList));
data
%do i = 1 %to &nrVars;
dataset_%scan(&macroVarIdList,&i,|)
%end;
;
set partial_solution;
count+1;
%do j = 1 %to &nrVars;
%let thisPayeeId=%scan(&macroVarIdList,&j,|);
if payee_id = "&thisPayeeId" then output dataset_&thisPayeeId.;
%end;
run;