SAS GCHART HBAR bars overlap with each other

OK. Finally, I get the chance to address this problem properly. I came across this problem on SAS EG.
First, I have the following dataset:
data test;
infile datalines;
input var1 var2;
datalines;
0.01 200
0.02 200
0.03 200
0.04 200
0.05 200
0.06 200
0.07 200
0.08 200
0.09 200
0.10 200
0.11 200
0.12 200
0.13 200
0.14 200
0.15 200
11111111111111111111111111 200
;
run;
When I try to plot var1(x-axis) against var2(y-axis) in a gchart hbar, it works fine:
PROC GCHART DATA=test;
HBAR var1 /
SUMVAR=var2 missing discrete clipref frame;
run; quit;
The chart is
But when I specify goptions reset=all device=gif; the chart becomes:
Clearly, there is an extreme value, and all the other bars overlap each other. Note that even though I put the discrete option in my hbar statement, it seems to have no effect once the goptions statement is added.
The goal is simply to space the var1 values evenly along the axis rather than positioning them according to their numeric values, so the first chart is what I want. But I need the goptions statement in order to output the chart to a GIF file.
Has anyone had a similar experience, and what would be the solution? Many thanks.

The easiest solution is to change the type of the chart variable (var1 in the test data) from numeric to character. SAS does not try to space character values according to their magnitude, as it does with numeric values.
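A minimal sketch of that conversion, assuming the test dataset from the question. The names test_char and var1_c are hypothetical and introduced here only for illustration, and the result has not been checked against every device driver:

```sas
/* Copy the data and add a character version of var1.        */
/* test_char and var1_c are illustrative names, not from the */
/* original post.                                            */
data test_char;
    set test;
    length var1_c $12;
    /* BEST12. writes the number as text; STRIP removes the  */
    /* leading blanks so the values sort sensibly.           */
    var1_c = strip(put(var1, best12.));
run;

goptions reset=all device=gif;

/* A character chart variable is always treated as discrete, */
/* so each distinct value gets its own evenly spaced bar.    */
proc gchart data=test_char;
    hbar var1_c / sumvar=var2 missing clipref frame;
run;
quit;
```

The bars then appear in alphabetical order of the character values, which matches numeric order for the small values in this example.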

Related

How to get weighted percentile of each observation in SAS

I have dataset like this:
data providers;
input prv_id mbr_cnt value;
datalines;
1100 25860 3.9025
4700 71855 8.8566
5500 72147 6.9918
6400 25144 4.5200
7000 58114 9.3391
7900 67222 7.5189
8300 54039 8.9301
8800 2204 3.2221
9400 71600 9.9682
10000 68807 7.6581
10200 16322 8.6505
10700 115118 12.4198
11100 148235 18.2053
11700 56441 8.6987
12100 58556 7.6724
12500 81865 10.1048
12900 18106 3.7881
13400 98701 12.9679
13900 10347 3.7001
14400 45516 6.3924
;
run;
I need to calculate the percentile of each observation, weighted by mbr_cnt. Is there a way to do it in SAS? I tried to use proc rank data=providers groups=100 out=providers_percentile; but that just gives me the unweighted percentile.
PROC FREQ has a WEIGHT statement and can calculate weighted cumulative percentages.
proc freq data=providers;
ods output list=freqout;
weight mbr_cnt;
tables value * prv_id / list missing;
run;
Not sure if this is exactly what you need.
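If a dataset with one weighted percentile per provider is preferred, a DATA step with a running total is another option. This is a sketch of a different technique from the PROC FREQ approach, and it assumes "weighted percentile" means the cumulative share of mbr_cnt at or below each value. The names providers_sorted, providers_pct, total_wt, cum_wt and wtd_pctile are hypothetical:

```sas
/* Total weight across all providers, stored in a macro variable. */
proc sql noprint;
    select sum(mbr_cnt) into :total_wt trimmed
    from providers;
quit;

/* Order the observations by the value being ranked. */
proc sort data=providers out=providers_sorted;
    by value;
run;

/* Running sum of the weights, expressed as a percent of the total. */
data providers_pct;
    set providers_sorted;
    cum_wt + mbr_cnt;                        /* sum statement: retained across rows */
    wtd_pctile = 100 * cum_wt / &total_wt;
run;
```

Tied values would receive different percentiles under this sketch. The sample data has no ties, so that case is not handled here.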

Cluster Bar Chart

I have data that look like the following:
data have;
format date date9.;
input date:mmddyy10. Intervention _24hrPtVolumeESI_1_5;
datalines;
9/17/2018 0 204
9/24/2018 0 139
10/17/2018 0 527
10/23/2018 1 430
11/01/2018 1 231
;
run;
I would like to create a bar chart where the x axis contains ranges of median wait time (e.g. 100-125, 126-150, etc.) and the bars compare those times by intervention (0 or 1). Thus, each range would have two bars: one for pre-intervention (0) and one for post-intervention (1). The y axis would simply show the count of median scores that fall within each x-axis range.
I've tried toying around with some sgplot code, but it produces sloppy results.
proc sgplot data=WORK.FelaCombo;
vbar _24hrPtVolumeESI_1_5 / response=_24hrPtVolumeESI_1_5 stat=sum
group=intervention nostatlabel
groupdisplay=cluster;
xaxis display=(nolabel);
yaxis grid;
run;
Try using a histogram instead. vbar is more for discrete categories, whereas histogram will automatically create bins.
proc sgplot data=WORK.have;
histogram _24hrPtVolumeESI_1_5 /
scale=count
binstart=100
binwidth=25
group=intervention
transparency=0.5
showbins
;
xaxis display=(nolabel);
yaxis grid;
run;
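If side-by-side bars are wanted rather than overlaid histograms, one possibility is to bin the values manually and keep vbar. The sketch below assumes 25-unit bins starting at 100 (100-124, 125-149, and so on), which differs slightly from the ranges in the question. The names have_binned, bin_low and vol_range are hypothetical:

```sas
/* Assign each observation to a 25-unit bin and build a text label. */
data have_binned;
    set have;
    length vol_range $12;
    bin_low   = 100 + 25 * floor((_24hrPtVolumeESI_1_5 - 100) / 25);
    vol_range = catx('-', bin_low, bin_low + 24);
run;

/* With no RESPONSE= option, VBAR plots the frequency of each      */
/* category; GROUPDISPLAY=CLUSTER puts the groups side by side.    */
proc sgplot data=have_binned;
    vbar vol_range / group=intervention groupdisplay=cluster;
    xaxis display=(nolabel);
    yaxis grid;
run;
```

The labels are character values, so the axis order is alphabetical. That matches numeric order only while every bin has the same number of digits.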

SAS: robust regression and output coefficients, t values and adj R squares

I am running a robust regression by group in SAS.
My data looks like this:
id stock date stock_liq market_liq
1 VOD 1/5/2016 0.03 0.02
1 VOD 2/5/2016 0.04 0.025
... ... ... ... ...
2 SAB 1/5/2016 0.31 0.02
2 SAB 1/5/2016 0.31 0.02
... ... ... ... ...
It's panel data, and each stock has a unique ID. I want to run a robust regression by ID and output the coefficients, t values and adjusted R-squares.
My code is:
proc robustreg data=have outest= want noprint;
model stock_liq=market_liq ;
by id;
run;
However, I don't think the code runs properly. SAS just stops running and the log gives me
"Error: Too many parameters in the model".
Can anyone advise? Thank you!
The syntax is a bit off. The requested outputs can also be added:
proc robustreg data=have outest=want noprint;
by id;
model stock_liq=market_liq;
/* p= and r= name the new predicted and residual variables; */
/* use names distinct from the input variables */
output out=output_sas
p=stock_liq_pred
r=stock_liq_resid;
run;
See the documentation for more on the output options.
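For the coefficients and fit statistics per ID, ODS OUTPUT may be a more direct route than the OUTPUT statement. The sketch below assumes the ODS table names ParameterEstimates and GoodFit, which is how I recall them from the ROBUSTREG documentation; running ods trace on; first will confirm the names in your release. The dataset names pe_by_id and fit_by_id are hypothetical:

```sas
/* Assumes HAVE is already sorted by id. */

/* Capture the printed tables as datasets, one set of rows per BY group. */
ods output ParameterEstimates=pe_by_id
           GoodFit=fit_by_id;

proc robustreg data=have;
    by id;
    model stock_liq = market_liq;
run;
```

As far as I know, ROBUSTREG reports chi-square tests for the parameters rather than t values, and a robust R-square rather than an adjusted one, so the captured tables may not match the requested statistics exactly.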

Proc Transpose 2 columns together in SAS [duplicate]

In SAS, I have a data set similar to the one below.
ID TRACT meanFA sdFA medianFA
1 t01 0.56 0.14 0.56
1 t02 0.53 0.07 0.52
1 t03 0.71 0.08 0.71
2 t01 0.72 0.09 0.72
2 t02 0.83 0.10 0.86
2 t03 0.59 0.10 0.62
I am not sure if transpose is the right concept here... but I would want the data to look like the one below.
ID t01_meanFA t01_sdFA t01_medianFA t02_meanFA t02_sdFA t02_medianFA t03_meanFA t03_sdFA t03_medianFA
1 0.56 0.14 0.56 0.53 0.07 0.52 0.71 0.08 0.71
2 0.72 0.09 0.72 0.83 0.10 0.86 0.59 0.10 0.62
proc transpose data=TRACT out=newTRACT;
var meanFA sdFA medianFA;
by id;
id tract meanFA sdFA medianFA;
run;
I have been playing around with the SAS code above, but with no success. Any ideas or suggestions would be great!
Double transpose is how you get to that. Get it to a dataset that has one row per desired variable per ID, so
ID=1 variable=t01_meanFA value=0.56
ID=1 variable=t01_sdFA value=0.14
...
ID=2 variable=t01_meanFA value=0.72
...
Then transpose using ID=variable and var=value (or whatever you choose to name those columns). You create the intermediate dataset by creating an array of your values (array vars[3] meanFA sdFA medianFA;) and then iterating over that array, setting variable name to catx('_',tract,vname(vars[n])); (vname gets the variable name of the array element).
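The steps described above can be sketched as follows, assuming the input dataset is named tract and is sorted by id. The names tract_long, tract_wide, variable, value and n are illustrative:

```sas
/* Step 1: one row per ID per desired output variable. */
data tract_long;
    set tract;
    array vars[3] meanFA sdFA medianFA;
    length variable $32;
    do n = 1 to dim(vars);
        /* e.g. t01_meanFA: tract value + name of the array element */
        variable = catx('_', tract, vname(vars[n]));
        value    = vars[n];
        output;
    end;
    keep id variable value;
run;

/* Step 2: transpose back to one row per ID, using the built */
/* names as the new column names.                            */
proc transpose data=tract_long out=tract_wide(drop=_name_);
    by id;
    id variable;
    var value;
run;
```

Because the rows are written in tract order, and within each tract in array order, the resulting columns come out as t01_meanFA, t01_sdFA, t01_medianFA, t02_meanFA and so on.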
You need 2 transposes. Transpose, use a data step to update the _NAME_ variable, and then transpose again:
proc transpose data=tract out=tract2;
by id tract;
run;
data tract2;
format _name_ $32.;
set tract2;
_name_ = strip(tract) || "_" || strip(_name_);
run;
proc transpose data=tract2 out=tract3(drop=_name_);
by id;
/*With no ID statement, the _NAME_ variable is used*/
var col1;
run;
Using example data from this duplicate question.
You can also just do this with a data step.
First, put the maximum sequence number into a macro variable.
proc sql;
select
max(sequence_no) into : maxseq
from
have
;
quit;
Create arrays for your new variables, setting the dimensions with the macro variable. Then loop over each visit, putting the events and notes into their respective variables. Output 1 line per visit.
data want(drop=sequence_no--notes);
do until (last.visit_no);
set have;
by id visit_no;
array event_ (&maxseq);
array notes_ (&maxseq) $;
event_(sequence_no)=event_code;
notes_(sequence_no)=notes;
end;
output;
run;
