How do I interpret log exponential non-linear regression? - sas

I am using an exponential non-linear regression in order to fit my data.
Here is my model:
O-acetyl content = A*exp(-B*Time)+C
with:
- A: amplitude of decrease (difference between the intercept T0 and the plateau);
- B: speed of decrease (curvature);
- C: low plateau;
- O-acetyl content: expressed in µmol/dose;
- Time: expressed in hours.
I have to log-transform my dependent variable (O-acetyl content) with ln. I have obtained my coefficients, but how do I interpret the results? I am using PROC NLIN with the Newton method of coefficient estimation.
I have for instance these results for the parameter estimates:
A = 0.062
B = 0.0573
C = 0.0309
Part of my SAS script:
proc nlin data=donn method=newton;
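* the model below uses A1..C2, so those are the parameters declared here;
* the same starting values are reused for both groups (an assumption, adjust as needed);
parms A1=0.8815 B1=0.0124 C1=4.4067
A2=0.8815 B2=0.0124 C2=4.4067;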
if (CCR="before_change") then do;
beta_1 = A1;
beta_2 = B1;
beta_0 = C1;
end;
else do;
beta_1 = A2;
beta_2 = B2;
beta_0 = C2;
end;
model l_AgHA=beta_0 + beta_1*exp(-beta_2*hours); * dependent variable is on the ln scale;
output out=nlinexp_full predicted=pred l95m=l95mean u95m=u95mean
l95=l95ind u95=u95ind;
ods output ParameterEstimates=pest_full;
ods output Anova=aovred_full(rename=(ss=ssred ms=msred df=dfred));
run;
Thanks a lot.
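One hedged way to read such estimates: if the fitted model really is ln(y) = C + A*exp(-B*t), as the model statement suggests, then C is the plateau on the log scale, A + C is the log-scale value at time 0, and ln(2)/B is the time (in hours) for the remaining log-scale gap to the plateau to halve. The Python sketch below only does that arithmetic with the estimates quoted above; the function name and the back-transformation through exp() are illustrative assumptions, not something PROC NLIN reports.

```python
import math

def interpret_log_exponential(a, b, c):
    """Summarise a fit of ln(y) = c + a * exp(-b * t).

    Assumption: the response was ln-transformed before fitting, so
    exp() maps fitted values back to the original units.
    """
    return {
        "start_log": a + c,                 # fitted ln(y) at t = 0
        "plateau_log": c,                   # fitted ln(y) as t grows large
        "start_original": math.exp(a + c),  # back-transformed value at t = 0
        "plateau_original": math.exp(c),    # back-transformed plateau
        "half_life": math.log(2) / b,       # time for the log-scale gap to halve
    }

summary = interpret_log_exponential(a=0.062, b=0.0573, c=0.0309)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

On this reading, A and C are differences and levels of ln(content), so exp(A) is a ratio between the starting value and the plateau rather than a difference in µmol/dose.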

Related

0 DF in regression in SAS enterprise guide

I created dummy variables in SAS (part of the code is below) and ran a regression, leaving out M23. It worked fine. Then I tried grouping the dummies into age bands, since we don't have enough members in each single-year group. I ran it the same way and left out one age group (M20to24, since this group has the highest membership). Now some of my variables have 0 DF. Does anyone know what went wrong?
I got the message - Note: Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased. The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.
data Table;
set Table;
M0=(AgeGender = '0M');
M1=(AgeGender = '1M');
M2=(AgeGender = '2M');
M3=(AgeGender = '3M');
M4=(AgeGender = '4M');
M5to9=(AgeGender = ' 5to9M');
M10to14=(AgeGender = '10to14M');
M15to19=(AgeGender = '15to19M');
M20to24=(AgeGender = '20to24M');
M25to29=(AgeGender = '25to29M');
M30to34=(AgeGender = '30to34M');
M35to39=(AgeGender = '35to39M');
M40to44=(AgeGender = '40to44M');
M45to49=(AgeGender = '45to49M');
M50to54=(AgeGender = '50to54M');
M55to59=(AgeGender = '55to59M');
M60to64=(AgeGender = '60to64M');
M65Plus=(AgeGender = '65+M');
F0=(AgeGender = '0F');
F1=(AgeGender = '1F');
F2=(AgeGender = '2F');
F3=(AgeGender = '3F');
F4=(AgeGender = '4F');
F5to9=(AgeGender = ' 5to9F');
F10to14=(AgeGender = '10to14F');
F15to19=(AgeGender = '15to19F');
F20to24=(AgeGender = '20to24F');
F25to29=(AgeGender = '25to29F');
F30to34=(AgeGender = '30to34F');
F35to39=(AgeGender = '35to39F');
F40to44=(AgeGender = '40to44F');
F45to49=(AgeGender = '45to49F');
F50to54=(AgeGender = '50to54F');
F55to59=(AgeGender = '55to59F');
F60to64=(AgeGender = '60to64F');
F65Plus=(AgeGender = '65+F');
Dep = (Relationship = 'Dep');
Mandatory = (Mand_Vo = 'Mandatory');
run;
ods output ParameterEstimates=Parameter_Estimates;
proc reg data= Table;
model logPMPM =
M0
M1
M2
M3
M4
M5to9
M10to14
M15to19
M25to29
M30to34
M35to39
M40to44
M45to49
M50to54
M55to59
M60to64
M65Plus
F0
F1
F2
F3
F4
F5to9
F10to14
F15to19
F20to24
F25to29
F30to34
F35to39
F40to44
F45to49
F50to54
F55to59
F60to64
F65Plus;
weight Membership;
run;
ods output close;
It doesn't look like your dummy variables overlap or are exactly complementary by construction, so the linear dependence is probably arising by chance in your data, which is harder to find. You can likely track it down by cross-tabulating the variables you suspect are related, or by drawing pairwise scatter plots (PROC SGSCATTER) and seeing which two overlap almost identically.
You're correct that you wouldn't get this behaviour with continuous values, because they are much less likely to overlap exactly. In general, it's considered best practice NOT to categorize/bin variables when you can keep them continuous. The boundaries are artificial: does a 34-year-old really differ from a 36-year-old? What if everyone in the 30 to 34 group is 34 and everyone in the 35 to 39 group is 36? You may not find a difference, but if everyone in one group were 39 and everyone in the other were 31, you may find more of a difference. Keeping the data continuous avoids these manufactured issues.
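The "crossing variables" check can be sketched in a few lines. The Python example below is a minimal illustration on made-up data, not a diagnosis of the real table: it flags dummy columns that are all zero (for instance because a comparison string never matches any value) and pairs of dummy columns that are identical. The function name, the toy rows, and the idea that a leading blank in a label could cause a mismatch are all assumptions for illustration.

```python
def find_degenerate_dummies(columns):
    """Return (all_zero_names, duplicate_pairs) for 0/1 dummy columns.

    columns: dict mapping a dummy name to a list of 0/1 values,
    all lists having the same length.
    """
    all_zero = [name for name, vals in columns.items() if not any(vals)]
    names = list(columns)
    duplicates = [
        (names[i], names[j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
        if any(columns[names[i]]) and columns[names[i]] == columns[names[j]]
    ]
    return all_zero, duplicates

# Hypothetical toy data: five members.
age_gender = ["0M", "5to9M", "5to9M", "20to24M", "0M"]
dep = [1, 0, 0, 0, 1]

dummies = {
    "M0": [int(v == "0M") for v in age_gender],
    # A label with a leading blank never matches these toy values,
    # so this dummy is identically zero.
    "M5to9": [int(v == " 5to9M") for v in age_gender],
    "M20to24": [int(v == "20to24M") for v in age_gender],
    "Dep": dep,
}

all_zero, duplicates = find_degenerate_dummies(dummies)
print("all-zero dummies:", all_zero)
print("identical dummies:", duplicates)
```

Either situation makes the design matrix rank deficient, which is what the "Model is not full rank" note reports. Dependencies involving three or more columns would need a proper rank check, which this sketch does not attempt.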

Calculating value weighted returns in SAS

I have some data in the following format:
COMPNAME DATE CAP RETURN
I have found some code that will construct and calculate the value-weighted return based on the data.
This works great and is below:
PROC SUMMARY NWAY DATA = Data1;
CLASS DATE;
VAR RETURN / WEIGHT = CAP;
OUTPUT
OUT = MKTRET
MEAN (RETURN) = MONTHLYRETURN;
RUN;
The extension that I would like to make is a little complicated.
I want to base the weights on the market capitalization in June.
So this will be a buy-and-hold portfolio. The actual data has hundreds of companies, but here is a representative example with two companies, purely to explain how the weights will evolve.
Say for example I have two companies, A and B.
The CAP of A is £100m and B is £100m.
In July of one year, I would invest 50% in A and 50% in B.
The returns in July are 10% and -10%.
Therefore, going into August I would hold 55% in A and 45% in B.
It will go on like this until next June when I will re-balance again based on the market capitalisation...
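The weight drift in this two-company example can be checked with a short calculation. The Python sketch below is only an illustration of the arithmetic, with hypothetical function names; it assumes weights are set from June capitalization and then left to drift with each month's returns until the next rebalance.

```python
def drift_weights(weights, returns):
    """Weights after one period of buy-and-hold, given start weights and returns."""
    grown = [w * (1.0 + r) for w, r in zip(weights, returns)]
    total = sum(grown)
    return [g / total for g in grown]

def portfolio_return(weights, returns):
    """Value-weighted return for the period, using start-of-period weights."""
    return sum(w * r for w, r in zip(weights, returns))

june_caps = [100.0, 100.0]                             # A and B, in millions
start = [cap / sum(june_caps) for cap in june_caps]    # 50% / 50%
july_returns = [0.10, -0.10]

print("July portfolio return:", portfolio_return(start, july_returns))
print("Weights going into August:", drift_weights(start, july_returns))
```

Applying drift_weights month after month, and resetting to capitalization weights each June, gives the weight path described above.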
10% monthly return is pretty speculative!
When the two balances differ by more than 200 you will also need to sell and buy to equalize the companies.
Presume the monthly rates of return are simulated and stored in a data set, and that 200 is invested in total each month. You can then generate a simulated ledger as follows:
- add returns
- compare balances
- equalize by splitting the 200 investment if the balances are close enough
- otherwise equalize by investing all 200 in one company and selling and buying
Of course, a portfolio with more than 2 companies becomes a more complicated balancing act to achieve mathematical balance.
data simurate(label="Future expectation is not an indicator of past performance :)");
do month = 1 to 60;
do company = 1 to 2;
return = round(sin(company + month/4) / 12, 0.001); * simulated return rate for the month;
output;
end;
end;
run;
data want;
if 0 then set simurate;
declare hash lookup (dataset:'simurate');
lookup.defineKey ('company', 'month');
lookup.defineData('return');
lookup.defineDone();
month = 0;
bal1 = 0; bal2 = 0;
output;
do month = 1 to 60;
lookup.find(key:1, key:month); rate1 = return;
ret1 = round(bal1 * rate1, 0.0001);
lookup.find(key:2, key:month); rate2 = return;
ret2 = round(bal2 * rate2, 0.0001);
bal1 + ret1;
bal2 + ret2;
goal = mean(bal1,bal2) + 100;
sel1 = 0; buy1 = 0;
sel2 = 0; buy2 = 0;
if abs(bal1-bal2) <= 200 then do;
* difference between balances after returns is <= 200;
* balances can be equalized with a simple investment split;
inv1 = goal - bal1;
inv2 = goal - bal2;
end;
else if bal1 < bal2 then do;
* sell bal2 as needed to equalize;
inv1 = 200;
inv2 = 0;
buy1 = goal - 200 - bal1;
sel2 = bal2 - goal;
end;
else do;
inv2 = 200;
inv1 = 0;
buy2 = goal - 200 - bal2;
sel1 = bal1 - goal;
end;
bal1 + (buy1 - sel1 + inv1);
bal2 + (buy2 - sel2 + inv2);
output;
end;
stop;
drop company return ;
format bal: 10.4 rate: 5.3;
run;

how to calculate weighted average but exclude the object itself using SAS

There are four variables in my dataset. Company shows the company's name. Return is the return of Company at day Date. Weight is the weight of this company in the market.
I want to keep all variables in the original file and create an additional variable: the market return excluding Company itself. The market return corresponding to stock a is the weighted sum of the returns of all other stocks in the market on the same Date. For example, if there are three stocks in the market, a, b and c, the market return for stock a is Return(b) * [Weight(b) / (Weight(b) + Weight(c))] + Return(c) * [Weight(c) / (Weight(b) + Weight(c))]. Similarly, the market return for stock b is Return(a) * [Weight(a) / (Weight(a) + Weight(c))] + Return(c) * [Weight(c) / (Weight(a) + Weight(c))].
I tried to use PROC SUMMARY, but this procedure cannot exclude stock a when calculating the market return for stock a.
PROC SUMMARY NWAY DATA ;
CLASS Date ;
VAR Return / WEIGHT = weight;
OUTPUT
OUT = output
MEAN (Return) = MarketReturn;
RUN;
Could anyone show me how to solve this, please? I am relatively new to this software, so I don't know whether I should use a loop or whether there is a better alternative.
This can be done with a bit of fancy algebra. It's not something that's built-in, though.
Basically:
- Construct a "total" market return
- Construct a stock-by-stock return (so just the return of A)
- Subtract out the portion that A contributes to the total.
Thanks to the simple math behind these weighted means, it's quite easy to do this.
Total mean = ((mean of A * sum of A wgts) + (mean of rest * sum of rest wgts)) / (sum of A wgts + sum of rest wgts)
So, solve that for the mean of the rest.
Exclusive mean = ((mean of all * sum of all wgts) - (mean of A * sum of A wgts)) / (sum of all wgts - sum of A wgts)
Something like this.
data returns;
input stock $ return weight;
datalines;
A .50 1
B .75 2
C .33 1
;;;;
run;
proc means data=returns;
class stock;
types () stock; *this is the default;
weight weight;
output out=means_out mean= sumwgt= /autoname;
run;
data returns_excl;
if _n_=1 then set means_out(where=(_type_=0) rename=(return_mean=tot_return return_sumwgt=tot_wgts));
set means_out(where=(_type_=1));
return_excl = (tot_return*tot_wgts-return_mean*return_sumwgt)/(tot_wgts-return_sumwgt);
run;
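As a cross-check of the algebra, here is the same "subtract your own contribution" idea as a small Python sketch on the three example rows. It assumes one row per stock for a single date; the function name is made up for illustration.

```python
def leave_one_out_weighted_mean(returns, weights):
    """Weighted mean of all other items, computed for each item in turn."""
    total_w = sum(weights)
    total_wr = sum(r * w for r, w in zip(returns, weights))
    # Remove each item's own weighted contribution from the totals.
    return [(total_wr - r * w) / (total_w - w) for r, w in zip(returns, weights)]

stocks = ["A", "B", "C"]
returns = [0.50, 0.75, 0.33]
weights = [1, 2, 1]

excl = leave_one_out_weighted_mean(returns, weights)
for stock, value in zip(stocks, excl):
    print(stock, round(value, 4))
```

For stock A this gives (0.75*2 + 0.33*1) / 3 = 0.61, which matches the direct definition in the question. With several dates, the same calculation would be applied within each date.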

Plot confidence interval efficiently

I want to plot confidence intervals for some estimates after running a regression model.
As I'm working with a very big dataset, I need an efficient solution: in particular, a solution that does not require me to sort or save the dataset. In the following example, I plot estimates for b1 to b6:
reg y b1 b2 b3 b4 b5 b6
foreach i of numlist 1/6 {
local mean `mean' `=_b[b`i']' `i'
local ci `ci' ///
(scatteri ///
`=_b[b`i'] +1.96*_se[b`i']' `i' ///
`=_b[b`i'] -1.96 * _se[b`i']' `i' ///
,lpattern(shortdash) lcolor(navy))
}
twoway `ci' (scatteri `mean', mcolor(navy)), legend(off) yline(0)
While scatteri efficiently plots the estimates, I can't get boundaries for the confidence interval similar to rcap.
Is there a better way to do this?
Here's token code for what you seem to want. The example is ridiculous. It's my personal view that refining this would be pointless given the very accomplished previous work behind coefplot. The multiplier of 1.96 only applies in very large samples.
sysuse auto, clear
set scheme s1color
reg mpg weight length displ
gen coeff = .
gen upper = .
gen lower = .
gen which = .
local i = 0
quietly foreach v in weight length displ {
local ++i
replace coeff = _b[`v'] in `i'
replace upper = _b[`v'] + 1.96 * _se[`v'] in `i'
replace lower = _b[`v'] - 1.96 * _se[`v'] in `i'
replace which = `i' in `i'
label def which `i' "`v'", modify
}
label val which which
twoway scatter coeff which, mcolor(navy) xsc(r(0.5, `i'.5)) xla(1/`i', val) ///
|| rcap upper lower which, lcolor(navy) xtitle("") legend(off)
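On the 1.96 multiplier: it is the 97.5% quantile of the standard normal distribution, which is why it is only appropriate in large samples. The Python sketch below derives it with the standard library and builds normal-approximation bounds from an estimate and its standard error; the function name and the example numbers are hypothetical. In small samples a t quantile with the residual degrees of freedom would be used instead, which is not shown here.

```python
from statistics import NormalDist

def normal_ci(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval (large-sample assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error

z95 = NormalDist().inv_cdf(0.975)
print("multiplier for 95%:", round(z95, 4))

# Hypothetical coefficient and standard error, for illustration only.
lower, upper = normal_ci(estimate=-0.005, std_error=0.002)
print("interval:", lower, upper)
```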

SAS function for using 'power' / exponential

I may be missing something obvious, but how do you calculate 'powers' in SAS?
E.g. X squared, or Y cubed?
What I need is variable1 ^ variable2, but I cannot find the syntax. (I am using SAS 9.1.3.)
Got it! There is no function.
You need to use the ** operator:
variable1 ** variable2;
data t;
num = 5;
pow = 2;
res = num**pow;
run;
proc print data = t;
run;
Use the POWER function and, if necessary, the CONSTANT function.
nbr_squared = power(nbr, 2);
nbr_cubed = power(nbr, 3);
E_to_the_power_2 = power(constant('E'),2);