Disclaimer: I am an amateur programmer and a student.
I've been working on a Van Westendorp price model, which requires analysis of the critical points where the kernel density lines overlaid on 4 histograms intersect. My current model has 3 histograms, so 3 lines and 2 intersections. How can I label these intersections?
You can see the output here.
http://i.imgur.com/6TGWv4l.jpg
Here is the code:
Proc Import Out = work.SASdata datafile =
"C:\Users\jdelizza\Desktop\SimpleSAS.xls"
DBMS = xls;
Sheet = "Asthma";
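Getnames = yes;
run;
* Labels cannot be assigned inside PROC IMPORT, so attach them in a DATA step;
data work.SASdata;
set work.SASdata;
Label I_Price_A = 'Too Expensive'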
I_Price_B = 'Inexpensive'
I_Price_C = 'Slightly Expensive'
F_Price_Low = 'Full Price Inexpensive'
F_Price_Expensive = 'Full price Expensive'
F_Price_Too_Expensive = 'Full Price Too Expensive'
P_Price_Low = 'Personalized Price Inexpensive'
P_Price_Expensive = 'Personalized price Expensive'
P_Price_Too_Expensive= 'Personalized Price Too Expensive';
run;
proc sgplot;
Title "Those with Asthma Indoor Pricing";
histogram I_Price_A / fillattrs=graphdata1 transparency = .5 binstart =1
binwidth=50;
density I_Price_A /type = kernel lineattrs=graphdata1;
histogram I_Price_B / fillattrs=graphdata2 transparency=.5 binstart=1
binwidth=50;
density I_Price_B /type = kernel lineattrs=graphdata2;
histogram I_Price_C / fillattrs=graphdata3 transparency=.5 binstart=1
binwidth=50;
density I_Price_C / type = kernel lineattrs=graphdata3;
keylegend / location=inside position=topright noborder across=2;
yaxis grid;
xaxis display=(nolabel) values=(0 to 1000 by 100);
run;
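The labelling itself is a SAS question, but finding where two kernel density curves cross can be sketched independently of the plotting. The following is a minimal sketch in Python, not SAS: it builds two Gaussian kernel density estimates, scans a grid for sign changes in their difference, and interpolates the crossing. The price lists, the bandwidth of 50 and the function names are made up for illustration, and SAS chooses its own bandwidth, so the numbers will not match the plot above.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function x -> Gaussian kernel density estimate of `sample`."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return density

def crossings(f, g, lo, hi, steps=1000):
    """Find x in [lo, hi] where f and g cross: look for sign changes of f - g
    on an even grid, then interpolate linearly between the bracketing points."""
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    diff = [f(x) - g(x) for x in xs]
    found = []
    for k in range(steps):
        d0, d1 = diff[k], diff[k + 1]
        if d0 == 0.0:
            found.append(xs[k])
        elif d0 * d1 < 0.0:
            found.append(xs[k] - d0 * (xs[k + 1] - xs[k]) / (d1 - d0))
    return found

# Hypothetical survey responses, invented for this example only.
too_expensive = [400, 450, 500, 550, 600]
inexpensive = [100, 150, 200, 250, 300]

f = gaussian_kde(too_expensive, bandwidth=50.0)
g = gaussian_kde(inexpensive, bandwidth=50.0)
points = crossings(f, g, 0.0, 1000.0)
print(points)
```

Once the crossing x-values are known, they could be written to a data set and drawn on the plot, for example as labelled reference lines.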
I am running an ordered probit with four levels (A lot, Somewhat, Little, Not at all) on a female variable and some controls:
* Baseline only
eststo, title ("OProbit1"): /*quietly*/ oprobit retincome_worry i.female $control_socio, vce(robust)
estimate store OProbit1
* Baseline + Health Controls
eststo, title ("OProbit3"): oprobit retincome_worry i.female $control_socio $control_health, vce(robust)
estimate store OProbit3
I am doing this for marginal effects of the female variable:
* TABLE BASELINE
estimate restore OProbit1
margins, dydx(i.female) predict (outcome(1)) atmeans post
outreg using results\Reg_margins\Reg2.tex, noautosumm replace rtitle(A lot) ctitle(Social Controls) title(Worry about Retirement Income)
estimate restore OProbit1
margins, dydx(i.female) predict (outcome(2)) atmeans post
outreg using results\Reg_margins\Reg2.tex, noautosumm append rtitle(Somewhat)
estimate restore OProbit1
margins, dydx(i.female) predict (outcome(3)) atmeans post
outreg using results\Reg_margins\Reg2.tex, noautosumm append rtitle(Little)
estimate restore OProbit1
margins, dydx(i.female) predict (outcome(4)) atmeans post
outreg using results\Reg_margins\Reg2.tex, noautosumm append rtitle(Not at all) tex
* TABLE BASELINE + HEALTH
estimate restore OProbit3
margins, dydx(i.female) predict (outcome(1)) atmeans post
outreg using results\Reg_margins\Reg3.tex, noautosumm replace rtitle(A lot) ctitle(Baseline and Health) title(Worry about Retirement Income)
estimate restore OProbit3
margins, dydx(i.female) predict (outcome(2)) atmeans post
outreg using results\Reg_margins\Reg3.tex, append noautosumm rtitle(Somewhat)
estimate restore OProbit3
margins, dydx(i.female) predict (outcome(3)) atmeans post
outreg using results\Reg_margins\Reg3.tex, append noautosumm rtitle(Little)
estimate restore OProbit3
margins, dydx(i.female) predict (outcome(4)) atmeans post
outreg using results\Reg_margins\Reg3.tex, append noautosumm rtitle(Not at all) tex
I currently have four tables (see examples for two of them), each with a single column named after the controls included in the model and four rows, one per level.
How can I have all of this in a single table, keeping the four rows and adding more columns?
You can get the desired output using the community-contributed command esttab.
First, define the program appendmodels (obtained from here):
capt prog drop appendmodels
*! version 1.0.0 14aug2007 Ben Jann
program appendmodels, eclass
// using first equation of model
version 8
syntax namelist
tempname b V tmp
foreach name of local namelist {
qui est restore `name'
mat `tmp' = e(b)
local eq1: coleq `tmp'
gettoken eq1 : eq1
mat `tmp' = `tmp'[1,"`eq1':"]
local cons = colnumb(`tmp',"_cons")
if `cons'<. & `cons'>1 {
mat `tmp' = `tmp'[1,1..`cons'-1]
}
mat `b' = nullmat(`b') , `tmp'
mat `tmp' = e(V)
mat `tmp' = `tmp'["`eq1':","`eq1':"]
if `cons'<. & `cons'>1 {
mat `tmp' = `tmp'[1..`cons'-1,1..`cons'-1]
}
capt confirm matrix `V'
if _rc {
mat `V' = `tmp'
}
else {
mat `V' = ///
( `V' , J(rowsof(`V'),colsof(`tmp'),0) ) \ ///
( J(rowsof(`tmp'),colsof(`V'),0) , `tmp' )
}
}
local names: colfullnames `b'
mat coln `V' = `names'
mat rown `V' = `names'
eret post `b' `V'
eret local cmd "whatever"
end
Next, run the following (here I use Stata's fullauto toy dataset for illustration):
webuse fullauto, clear
estimates clear
forvalues i = 1 / 4 {
oprobit rep77 i.foreign
margins, dydx(foreign) predict (outcome(`i')) atmeans post
estimate store OProbit1`i'
}
appendmodels OProbit11 OProbit12 OProbit13 OProbit14
estimates store result1
forvalues i = 1 / 4 {
oprobit rep77 i.foreign length mpg
margins, dydx(foreign) predict (outcome(`i')) atmeans post
estimate store OProbit2`i'
}
appendmodels OProbit21 OProbit22 OProbit23 OProbit24
estimates store result2
forvalues i = 1 / 4 {
oprobit rep77 i.foreign trunk weight
margins, dydx(foreign) predict (outcome(`i')) atmeans post
estimate store OProbit3`i'
}
appendmodels OProbit31 OProbit32 OProbit33 OProbit34
estimates store result3
forvalues i = 1 / 4 {
oprobit rep77 i.foreign price displ
margins, dydx(foreign) predict (outcome(`i')) atmeans post
estimate store OProbit4`i'
}
appendmodels OProbit41 OProbit42 OProbit43 OProbit44
estimates store result4
Finally, see the results:
esttab result1 result2 result3 result4, keep(1.foreign) varlab(1.foreign " ") ///
labcol2("A lot" "Somewhat" "A little" "Not at all") gaps noobs nomtitles
-------------------------------------------------------------------------------------
(1) (2) (3) (4)
-------------------------------------------------------------------------------------
A lot -0.0572 -0.0677 -0.0728 -0.0690
(-1.83) (-1.67) (-1.81) (-1.67)
Somewhat -0.144** -0.247*** -0.188** -0.175*
(-2.73) (-3.54) (-2.86) (-2.47)
A little -0.124 -0.290** -0.290** -0.163
(-1.86) (-3.07) (-3.07) (-1.74)
Not at all 0.198** 0.351*** 0.252** 0.237*
(2.64) (3.82) (2.95) (2.55)
-------------------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
You can install esttab by typing the following in Stata's command prompt:
ssc install estout
I have some data in the following format:
COMPNAME DATE CAP RETURN
I have found some code that will construct and calculate the value-weighted return based on the data.
This works great and is below:
PROC SUMMARY NWAY DATA = Data1 ; CLASS DATE ;
VAR RETURN / WEIGHT = CAP ;
OUTPUT
OUT = MKTRET
MEAN (RETURN) = MONTHLYRETURN ;
RUN;
The extension I would like to make is a little more complicated.
I want to base the weights on the market capitalization in June.
So this will be a buy-and-hold portfolio. The actual data has hundreds of companies, but here is a representative example with two companies, purely to explain how the weights will evolve.
Say for example I have two companies, A and B.
The CAP of A is £100m and B is £100m.
In July of one year, I would invest 50% in A and 50% in B.
The returns in July are 10% and -10%.
Therefore, going into August I would hold 55% in A and 45% in B.
It will go on like this until next June, when I will rebalance again based on the market capitalisation.
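The way the weights drift in this example can be sketched in a few lines. This is only an illustration of the arithmetic in Python, not SAS code for the real data; the function names are made up, and the two companies and their returns are the ones from the example above.

```python
def portfolio_return(weights, returns):
    """Value-weighted return for one month."""
    return sum(w * r for w, r in zip(weights, returns))

def drift_weights(weights, returns):
    """Buy-and-hold: each position grows by its own return, and the new
    weights are the grown positions rescaled so that they sum to one."""
    grown = [w * (1.0 + r) for w, r in zip(weights, returns)]
    total = sum(grown)
    return [g / total for g in grown]

caps = [100.0, 100.0]                      # June market caps of A and B
start = [c / sum(caps) for c in caps]      # July weights: 50% / 50%
july = [0.10, -0.10]                       # July returns of A and B

r_july = portfolio_return(start, july)
august_weights = drift_weights(start, july)
print(r_july, august_weights)
```

Repeating `drift_weights` month by month gives the weights until the next June, at which point they would be reset from the new market caps.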
10% monthly return is pretty speculative!
When the two companies differ by more than 200 you will need to also sell and buy to equalize the companies.
Presume the rates per month are simulated and stored in a data set. You can generate a simulated ledger as follows:
add returns
compare balances
equalize by splitting the 200 investment if the balances are close enough
otherwise, equalize by investing all 200 in one company and selling and buying
Of course, a portfolio with more than 2 companies becomes a more complicated balancing act to achieve mathematical balance.
data simurate(label="Future expectation is not an indicator of past performance :)");
do month = 1 to 60;
do company = 1 to 2;
return = round (sin(company+month/4) / 12, 0.001); %* random return rate for month;
output;
end;
end;
run;
data want;
if 0 then set simurate;
declare hash lookup (dataset:'simurate');
lookup.defineKey ('company', 'month');
lookup.defineData('return');
lookup.defineDone();
month = 0;
bal1 = 0; bal2 = 0;
output;
do month = 1 to 60;
lookup.find(key:1, key:month); rate1 = return;
ret1 = round(bal1 * rate1, 0.0001);
lookup.find(key:2, key:month); rate2 = return;
ret2 = round(bal2 * rate2, 0.0001);
bal1 + ret1;
bal2 + ret2;
goal = mean(bal1,bal2) + 100;
sel1 = 0; buy1 = 0;
sel2 = 0; buy2 = 0;
if abs(bal1-bal2) <= 200 then do;
* difference between balances after returns is <= 200;
* balances can be equalized by a simple investment split;
inv1 = goal - bal1;
inv2 = goal - bal2;
end;
else if bal1 < bal2 then do;
* sell bal2 as needed to equalize;
inv1 = 200;
inv2 = 0;
buy1 = goal - 200 - bal1;
sel2 = bal2 - goal;
end;
else do;
inv2 = 200;
inv1 = 0;
buy2 = goal - 200 - bal2;
sel1 = bal1 - goal;
end;
bal1 + (buy1 - sel1 + inv1);
bal2 + (buy2 - sel2 + inv2);
output;
end;
stop;
drop company return ;
format bal: 10.4 rate: 5.3;
run;
I'm wondering if there's a way to add an inset graph in another graph in SAS. I see quite a bit of stuff about adding text insets into a figure, but nothing about a graph. I am making scatterplots with, for example, leaf mass on the x-axis and leaf area on the y-axis. The slope of the line tells me the ratio of these two quantities, which is specific leaf area. This is also important, so I want to be able to show that as well. But since it's not new information (relative to the scatterplots), I don't want to make another figure, since it would basically show the same thing in a different way. If I had it as an inset, that would be okay.
I use SAS 9.4 at school and SAS University Edition at home. Here's the code I have for the scatterplot:
proc template; define statgraph Graph;
dynamic _MASS_PER_LEAF _SA_PER_LEAF _GROUP _LT_MASS_PER_LEAF _LT_SA_PER_LEAF _GROUP2;
begingraph / designwidth=500 designheight=819 attrpriority=none DataSymbols=(circle x circle x circle x) DataContrastColors=(CX0000FF CX0000FF CXCC0033 CXCC0033 CX639A21 CX639A21)
DataLinePatterns=(1 2 1 2 1 2);
legendItem type=marker name='Beech' / label = 'Beech' markerattrs=(color=CX0000FF symbol=circlefilled) ;
legendItem type=marker name='Red Maple' / label = 'Red Maple' markerattrs=(color=CXCC0033 symbol=circlefilled) ;
legendItem type=marker name='Sugar Maple' / label = 'Sugar Maple' markerattrs=(color=CX639A21 symbol=circlefilled) ;
legendItem type=marker name='Calcium' / label = 'Calcium' markerattrs=(color=black symbol=circle) ;
legendItem type=marker name='Control' / label = 'Control' markerattrs=(color=black symbol=x) ;
layout lattice / rowdatarange=data columndatarange=union rows=2 rowgutter=10 columngutter=10 rowweights=(1.0 1.0);
layout overlay / yaxisopts=( label=('Average surface area per leaf (cm^2)'));
scatterplot x=_MASS_PER_LEAF y=_SA_PER_LEAF / group=_GROUP name='scatter' markerattrs=(size=11 weight=bold );
regressionplot x = mass_per_leaf y = sa_per_leaf / group = group;
entry halign=left 'Green leaves' / valign=top;
endlayout;
layout overlay / yaxisopts=( label=('Average surface area per leaf (cm^2)'));
scatterplot x=_LT_MASS_PER_LEAF y=_LT_SA_PER_LEAF / group=_GROUP2
name='scatter2' markerattrs=(size=11 weight=bold );
regressionplot x=_LT_MASS_PER_LEAF y=_LT_SA_PER_LEAF / group=_GROUP2;
entry halign=left 'Senesced leaves' / valign=top;
discretelegend 'Beech' 'Red Maple' 'Sugar Maple' 'Calcium' 'Control' /
opaque=false
border=true
halign=right
valign=bottom
displayclipped=true
down=3
order=columnmajor
location=inside;
endlayout;
columnaxes;
columnaxis / label=('Average mass per leaf (g)');
endcolumnaxes;
endlayout;
endgraph;
end;
run;
proc sgrender data=WORK.CALCIUM template=Graph;
dynamic _MASS_PER_LEAF="'MASS_PER_LEAF'n" _SA_PER_LEAF="'SA_PER_LEAF'n" _GROUP="GROUP" _LT_MASS_PER_LEAF="'LT_MASS_PER_LEAF'n" _LT_SA_PER_LEAF="'LT_SA_PER_LEAF'n" _GROUP2="GROUP";
run;
Thank you for any help!
There are four variables in my dataset: Company, Date, Return and Weight. Company is the company's name. Return is the return of Company on day Date. Weight is the weight of this company in the market.
I want to keep all variables in the original file and create an additional variable, the market return excluding the company itself. The market return corresponding to stock a is the weighted sum of the returns of all other stocks in the market on the same Date. For example, if there are 3 stocks in the market, a, b and c, the market return for stock a is Return(b) * [Weight(b) / (Weight(b) + Weight(c))] + Return(c) * [Weight(c) / (Weight(b) + Weight(c))]. Similarly, the market return for stock b is Return(a) * [Weight(a) / (Weight(a) + Weight(c))] + Return(c) * [Weight(c) / (Weight(a) + Weight(c))].
I tried to use proc summary, but it cannot exclude stock a when calculating the market return for stock a.
PROC SUMMARY NWAY DATA ;
CLASS Date ;
VAR Return / WEIGHT = weight;
OUTPUT
OUT = output
MEAN (Return) = MarketReturn;
RUN;
Could anyone teach me how to solve this, please? I am relatively new to this software, so I don't know whether I should use a loop or whether there is a better alternative.
This can be done with a bit of fancy algebra. It's not something that's built-in, though.
Basically:
Construct a "total" market return
Construct a stock by stock return (so just return of A)
Subtract out the portion that A contributes to total.
Thanks to the simple math that generates these lists, it's quite easy to do this.
Total mean = ((mean of A * sum of A wgts) + (mean of rest * sum of rest wgts)) / (sum of A wgts + sum of rest wgts)
So, solve that for the mean of the rest.
Exclusive mean = ((mean of all * sum of all wgts) - (mean of A * sum of A wgts)) / (sum of all wgts - sum of A wgts)
Something like this.
data returns;
input stock $ return weight;
datalines;
A .50 1
B .75 2
C .33 1
;;;;
run;
proc means data=returns;
class stock;
types () stock; *this is the default;
weight weight;
output out=means_out mean= sumwgt= /autoname;
run;
data returns_excl;
if _n_=1 then set means_out(where=(_type_=0) rename=(return_mean=tot_return return_sumwgt=tot_wgts));
set means_out(where=(_type_=1));
return_excl = (tot_return*tot_wgts-return_mean*return_sumwgt)/(tot_wgts-return_sumwgt);
run;
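As a quick numerical check of the algebra, here is a small sketch in Python (not SAS) that uses the same three stocks as the datalines above. It computes the exclusive mean once with the subtract-out formula and once by brute force, leaving each stock out in turn. The function names are made up for this illustration; for real data the same calculation would be done within each Date.

```python
def exclusive_means(returns, weights):
    """Leave-one-out weighted mean via the subtract-out formula:
    (total weighted sum - own weighted return) / (total weight - own weight)."""
    total_w = sum(weights)
    total_wr = sum(r * w for r, w in zip(returns, weights))
    return [(total_wr - r * w) / (total_w - w) for r, w in zip(returns, weights)]

def brute_force(returns, weights, skip):
    """Weighted mean of every stock except the one at index `skip`."""
    pairs = [(r, w) for i, (r, w) in enumerate(zip(returns, weights)) if i != skip]
    return sum(r * w for r, w in pairs) / sum(w for _, w in pairs)

stocks = ["A", "B", "C"]
returns = [0.50, 0.75, 0.33]
weights = [1.0, 2.0, 1.0]

excl = exclusive_means(returns, weights)
check = [brute_force(returns, weights, i) for i in range(len(stocks))]
for name, fast, slow in zip(stocks, excl, check):
    print(name, round(fast, 4), round(slow, 4))
```

Both columns should agree: about 0.61 for A, 0.415 for B and 0.6667 for C.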
I want to plot confidence intervals for some estimates after running a regression model.
As I'm working with a very big dataset, I need an efficient solution: in particular, a solution that does not require me to sort or save the dataset. In the following example, I plot estimates for b1 to b6:
reg y b1 b2 b3 b4 b5 b6
foreach i of numlist 1/6 {
local mean `mean' `=_b[b`i']' `i'
local ci `ci' ///
(scatteri ///
`=_b[b`i'] +1.96*_se[b`i']' `i' ///
`=_b[b`i'] -1.96 * _se[b`i']' `i' ///
,lpattern(shortdash) lcolor(navy))
}
twoway `ci' (scatteri `mean', mcolor(navy)), legend(off) yline(0)
While scatteri efficiently plots the estimates, I can't get boundaries for the confidence interval similar to rcap.
Is there a better way to do this?
Here's token code for what you seem to want. The example is ridiculous. It's my personal view that refining this would be pointless given the very accomplished previous work behind coefplot. The multiplier of 1.96 only applies in very large samples.
sysuse auto, clear
set scheme s1color
reg mpg weight length displ
gen coeff = .
gen upper = .
gen lower = .
gen which = .
local i = 0
quietly foreach v in weight length displ {
local ++i
replace coeff = _b[`v'] in `i'
replace upper = _b[`v'] + 1.96 * _se[`v'] in `i'
replace lower = _b[`v'] - 1.96 * _se[`v'] in `i'
replace which = `i' in `i'
label def which `i' "`v'", modify
}
label val which which
twoway scatter coeff which, mcolor(navy) xsc(r(0.5, `i'.5)) xla(1/`i', val) ///
|| rcap upper lower which, lcolor(navy) xtitle("") legend(off)
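On the 1.96 multiplier: it is the 97.5% quantile of the standard normal distribution, which is why it is only appropriate in large samples. A minimal sketch in Python (not Stata) shows where the number comes from and how the bounds are formed. The coefficient and standard error are hypothetical values chosen for illustration, and `normal_ci` is a made-up helper name. With few residual degrees of freedom the t quantile would be the usual replacement, which in Stata is `invttail(e(df_r), 0.025)` after `regress`.

```python
from statistics import NormalDist

def normal_ci(coef, se, level=0.95):
    """Normal-approximation confidence interval for one coefficient.
    Returns (lower, upper, z), where z is the two-sided critical value."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return coef - z * se, coef + z * se, z

# Hypothetical estimate and standard error, for illustration only.
coef, se = -0.004, 0.0016
lower, upper, z = normal_ci(coef, se)
print(round(z, 4), lower, upper)
```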