Calculating value-weighted returns in SAS

I have some data in the following format:
COMPNAME DATE CAP RETURN
I have found some code that will construct and calculate the value-weighted return based on the data.
This works great and is below:
PROC SUMMARY NWAY DATA = Data1 ;
CLASS DATE ;
VAR RETURN / WEIGHT = CAP ;
OUTPUT
OUT = MKTRET
MEAN (RETURN) = MONTHLYRETURN ;
RUN;
The extension I would like to make is a little more complicated.
I want the weights to be based on the market capitalization in June.
So this will be a buy-and-hold portfolio. The actual data has hundreds of companies, but here is a representative two-company example of how the weights evolve.
Say I have two companies, A and B.
The CAP of A is £100m and the CAP of B is £100m.
In July of a given year, I would invest 50% in A and 50% in B.
The returns in July are 10% and -10%.
Therefore, going into August, the weights would be 55% and 45%.
This continues until the following June, when I rebalance again based on market capitalisation.
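To make the weight evolution concrete, here is a minimal Python sketch of the buy-and-hold logic described above. It is an illustration only, not the SAS solution: the function name `drift_weights` and its input layout are hypothetical. It assumes the weights are set once from the June caps and then drift with each month's returns until the next rebalance.

```python
def drift_weights(june_caps, monthly_returns):
    """Return (weights at the start of each month, portfolio return per month).

    june_caps: dict company -> market cap in June (sets the initial weights).
    monthly_returns: list of dicts, one per month from July on, company -> return.
    """
    total = sum(june_caps.values())
    weights = {c: cap / total for c, cap in june_caps.items()}
    history = [dict(weights)]
    portfolio_returns = []
    for rets in monthly_returns:
        # The month's portfolio return uses the start-of-month weights.
        portfolio_returns.append(sum(weights[c] * rets[c] for c in weights))
        # Buy and hold: each position grows by its own return, then renormalise.
        grown = {c: weights[c] * (1 + rets[c]) for c in weights}
        scale = sum(grown.values())
        weights = {c: v / scale for c, v in grown.items()}
        history.append(dict(weights))
    return history, portfolio_returns


# The two-company example from the question: equal caps, +10% and -10% in July.
history, port = drift_weights({"A": 100.0, "B": 100.0}, [{"A": 0.10, "B": -0.10}])
print(history[1])  # weights going into August
```

In SAS the same idea would amount to carrying a running weight per company across months and resetting it each June; the sketch only shows the arithmetic.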

10% monthly return is pretty speculative!
This ledger assumes a fixed 200 is invested each month, with the aim of keeping the two balances equal. When the balances differ by more than 200 after returns, the new investment alone cannot close the gap, so you also need to sell one company and buy the other.
Presume the monthly rates are simulated and stored in a data set. You can generate a simulated ledger as follows:
add returns
compare balances
if the balances are within 200 of each other, equalize by splitting the 200 investment
otherwise, equalize by investing all 200 in the lagging company and selling and buying
Of course, a portfolio with more than 2 companies becomes a more complicated balancing act.
data simurate(label="Future expectation is not an indicator of past performance :)");
do month = 1 to 60;
do company = 1 to 2;
return = round (sin(company+month/4) / 12, 0.001); %* simulated (deterministic) return rate for month;
output;
end;
end;
run;
data want;
if 0 then set simurate;
declare hash lookup (dataset:'simurate');
lookup.defineKey ('company', 'month');
lookup.defineData('return');
lookup.defineDone();
month = 0;
bal1 = 0; bal2 = 0;
output;
do month = 1 to 60;
lookup.find(key:1, key:month); rate1 = return;
ret1 = round(bal1 * rate1, 0.0001);
lookup.find(key:2, key:month); rate2 = return;
ret2 = round(bal2 * rate2, 0.0001);
bal1 + ret1;
bal2 + ret2;
goal = mean(bal1,bal2) + 100;
sel1 = 0; buy1 = 0;
sel2 = 0; buy2 = 0;
if abs(bal1-bal2) <= 200 then do;
* difference between balances after returns is <= 200;
* balances can be equalized by a simple investment split;
inv1 = goal - bal1;
inv2 = goal - bal2;
end;
else if bal1 < bal2 then do;
* sell bal2 as needed to equalize;
inv1 = 200;
inv2 = 0;
buy1 = goal - 200 - bal1;
sel2 = bal2 - goal;
end;
else do;
inv2 = 200;
inv1 = 0;
buy2 = goal - 200 - bal2;
sel1 = bal1 - goal;
end;
bal1 + (buy1 - sel1 + inv1);
bal2 + (buy2 - sel2 + inv2);
output;
end;
stop;
drop company return ;
format bal: 10.4 rate: 5.3;
run;

Related

How to set proper back test range

I do not know how to code and I am trying to learn Pine Script, but it makes little sense to me. I googled how to set a backtest range and used code someone else wrote, but it doesn't seem to test the range I want; it tests the entire chart. I'd like to test from 1/1/2018 to the present. I'm trying to do this for multiple strategies so I can better tailor them to the current market. Here is what I have for one of them. If you are willing to help with the others I would very much appreciate it; feel free to DM me.
//@version=5
strategy("Bollinger Bands BACKTEST", overlay=true)
source = close
length = input.int(20, minval=1)
mult = input.float(2.0, minval=0.001, maxval=50)
basis = ta.sma(source, length)
dev = mult * ta.stdev(source, length)
upper = basis + dev
lower = basis - dev
buyEntry = ta.crossover(source, lower)
sellEntry = ta.crossunder(source, upper)
if (ta.crossover(source, lower))
    strategy.entry("BBandLE", strategy.long, stop=lower, oca_name="BollingerBands", oca_type=strategy.oca.cancel, comment="BBandLE")
else
    strategy.cancel(id="BBandLE")
if (ta.crossunder(source, upper))
    strategy.entry("BBandSE", strategy.short, stop=upper, oca_name="BollingerBands", oca_type=strategy.oca.cancel, comment="BBandSE")
else
    strategy.cancel(id="BBandSE")
//plot(strategy.equity, title="equity", color=color.red, linewidth=2, style=plot.style_areabr)
// === INPUT BACKTEST RANGE ===
fromMonth = input.int(defval = 1, title = "From Month", minval = 1, maxval = 12)
fromDay = input.int(defval = 1, title = "From Day", minval = 1, maxval = 31)
fromYear = input.int(defval = 2018, title = "From Year", minval = 1970)
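In the code as posted, the from-date inputs are declared but nothing shown uses them, which would be consistent with the strategy trading over the whole chart. The usual approach is to turn the inputs into a start timestamp and only allow entries on bars at or after it; in Pine this is commonly done by comparing the built-in `time` with `timestamp(fromYear, fromMonth, fromDay, 0, 0)` and adding that condition to each entry. The Python sketch below only illustrates that gating logic. The names `backtest_start` and `gate_signals` and the sample bars are hypothetical.

```python
from datetime import datetime


def backtest_start(from_year=2018, from_month=1, from_day=1):
    """Build the start of the backtest window from the three inputs."""
    return datetime(from_year, from_month, from_day)


def gate_signals(bars, start):
    """Keep only signals whose bar time is on or after the window start.

    bars: list of (bar_time, signal) pairs, where signal is "long", "short" or None.
    """
    return [(t, s) for t, s in bars if s is not None and t >= start]


# Hypothetical bars: one signal before the window, two inside it.
bars = [
    (datetime(2017, 12, 29), "long"),
    (datetime(2018, 1, 2), "long"),
    (datetime(2018, 3, 5), "short"),
]
allowed = gate_signals(bars, backtest_start())
print(allowed)
```

The same condition has to be applied to every entry in every strategy, so it is convenient to compute it once and reuse it.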

Adding seconds to time variable

data final;
set final;
duration = redate-ondate;
dudays = floor(duration/86400);
duhrs = floor((duration-(dudays*86400))/3600);
dumins = floor((duration-(dudays*86400+duhrs*3600))/60);
****************Set up new variable duration**************;
attrib dur length=$11.;
if ae_term ne 'None' and dudays ne . then
dur = left(put(dudays,z2.))||':'||left(put(duhrs,z2.))||':'||left(put(dumins,z2.));
else dur = '';
run;
I have this code but need to calculate seconds and concatenate to dur as I have an adverse event that is less than a minute so won't display. What's the most efficient way to do this?
You can calculate the remaining seconds and then append to your time string like this:
dusec = duration-(dudays*86400+duhrs*3600+dumins*60);
if ae_term ne 'None' and dudays ne . then
dur = left(put(dudays,z2.))||':'||left(put(duhrs,z2.))||':'||left(put(dumins,z2.))||':'||left(put(dusec,z2.));
One note - using put(dudays,z2.) assumes your duration is never more than 99 days.
Ok, this should simplify things somewhat:
dudays = FLOOR(duration/86400);
duhrs = FLOOR(MOD(duration,86400)/3600);
dumins = FLOOR(MOD(duration,3600)/60);
dusec = MOD(duration,60);
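The decomposition above is plain integer division with remainders, which can be checked quickly outside SAS. Here is a minimal Python sketch, assuming the duration is a non-negative whole number of seconds; the function names are hypothetical and the DD:HH:MM:SS layout mirrors the `dur` string from the question.

```python
def split_duration(seconds):
    """Split a duration in seconds into (days, hours, minutes, seconds)."""
    days, rem = divmod(int(seconds), 86400)   # 86400 seconds per day
    hours, rem = divmod(rem, 3600)            # 3600 seconds per hour
    minutes, secs = divmod(rem, 60)
    return days, hours, minutes, secs


def format_duration(seconds):
    """Format as DD:HH:MM:SS with two-digit, zero-padded fields."""
    return "%02d:%02d:%02d:%02d" % split_duration(seconds)


print(format_duration(45))     # an event shorter than one minute
print(format_duration(90061))  # 1 day, 1 hour, 1 minute, 1 second
```

As with the z2. format, the two-digit day field only stays aligned for durations under 100 days.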
The difference between two datetime values is a number of seconds (so it can itself be treated as a datetime value). You can use the DATEPART() and TIMEPART() functions to split it into the number of days and the seconds since midnight. The TOD11.2 format will display the time part in HH:MM:SS.mm style.
length dur $20;
if n(redate,ondate)=2 and ae_term not in (' ','None') then do;
duration = redate-ondate;
dur = catx(':',datepart(duration),put(timepart(duration),tod11.2));
end;

how to calculate weighted average but exclude the object itself using SAS

There are four variables in my dataset. Company shows the company's name. Return is the return of Company at day Date. Weight is the weight of this company in the market.
I want to keep all variables in the original file and create an additional variable, which is the market return excluding the company itself. The market return for stock a is the weighted sum of the returns of all other stocks on the same Date, with the weights rescaled to exclude stock a. For example, if there are 3 stocks in the market, a, b and c, the market return for stock a is Return(b) * [Weight(b)/(Weight(b)+Weight(c))] + Return(c) * [Weight(c)/(Weight(b)+Weight(c))]. Similarly, the market return for stock b is Return(a) * [Weight(a)/(Weight(a)+Weight(c))] + Return(c) * [Weight(c)/(Weight(a)+Weight(c))].
I tried to use proc summary, but it cannot exclude stock a when calculating the market return for stock a.
PROC SUMMARY NWAY DATA ;
CLASS Date ;
VAR Return / WEIGHT = weight;
OUTPUT
OUT = output
MEAN (Return) = MarketReturn;
RUN;
Could anyone show me how to solve this, please? I am relatively new to this software, so I don't know whether I should use a loop or whether there is a better alternative.
This can be done with a bit of fancy algebra. It's not something that's built-in, though.
Basically:
Construct a "total" market return
Construct a stock by stock return (so just return of A)
Subtract out the portion that A contributes to total.
Because a weighted mean is just a weighted sum divided by a sum of weights, this is quite easy to do.
Total mean = ((mean of A * sum of A wgts) + (mean of rest * sum of rest wgts)) / (sum of A wgts + sum of rest wgts)
So, solve that for the mean of the rest.
Exclusive mean: ((mean of all * sum of all wgts) - (mean of A * sum of A wgts)) / (sum of all wgts - sum of A wgts)
Something like this.
data returns;
input stock $ return weight;
datalines;
A .50 1
B .75 2
C .33 1
;;;;
run;
proc means data=returns;
class stock;
types () stock; *this is the default;
weight weight;
output out=means_out mean= sumwgt= /autoname;
run;
data returns_excl;
if _n_=1 then set means_out(where=(_type_=0) rename=(return_mean=tot_return return_sumwgt=tot_wgts));
set means_out(where=(_type_=1));
return_excl = (tot_return*tot_wgts-return_mean*return_sumwgt)/(tot_wgts-return_sumwgt);
run;
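The subtraction trick can be checked by hand on the three-stock sample. Here is a minimal Python sketch of the same formula, assuming one row per stock for a single date and at least two stocks; the function name `leave_one_out_weighted_mean` is hypothetical. For several dates the same calculation would be repeated within each date.

```python
def leave_one_out_weighted_mean(rows):
    """rows: list of (stock, return, weight) tuples for one date.

    Returns a dict stock -> weighted mean return of all *other* stocks.
    """
    total_weight = sum(w for _, _, w in rows)
    total_weighted_return = sum(r * w for _, r, w in rows)
    # Remove each stock's own contribution from the numerator and denominator.
    return {
        stock: (total_weighted_return - r * w) / (total_weight - w)
        for stock, r, w in rows
    }


# Same sample data as the datalines above.
rows = [("A", 0.50, 1), ("B", 0.75, 2), ("C", 0.33, 1)]
excl = leave_one_out_weighted_mean(rows)
print(excl)
```

For stock A this gives (0.75*2 + 0.33*1) / (2 + 1) = 0.61, which matches the direct definition in the question.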

SAS - Labeling Kernel Density Line Intersections in SGplot with multiple histograms

Disclaimer: I am an amateur programmer. I am a student.
So I've been working on a Van Westendorp price model, which requires analysis of the critical points where the kernel density lines overlaid on 4 histograms intersect. My current model has 3 histograms, so 3 lines and 2 intersections. How can I label these intersections?
You can see the output here.
http://i.imgur.com/6TGWv4l.jpg
Here is the code:
Proc Import Out = work.SASdata datafile =
"C:\Users\jdelizza\Desktop\SimpleSAS.xls"
DBMS = xls;
Sheet = "Asthma";
Getnames = yes;
Label I_Price_A = 'Too Expensive'
I_Price_B = 'Inexpensive'
I_Price_C = 'Slightly Expensive'
F_Price_Low = 'Full Price Inexpensive'
F_Price_Expensive = 'Full price Expensive'
F_Price_Too_Expensive = 'Full Price Too Expensive'
P_Price_Low = 'Personalized Price Inexpensive'
P_Price_Expensive = 'Personalized price Expensive'
P_Price_Too_Expensive= 'Personalized Price Too Expensive';
run;
proc sgplot;
Title "Those with Asthma Indoor Pricing";
histogram I_Price_A / fillattrs=graphdata1 transparency = .5 binstart =1
binwidth=50;
density I_Price_A /type = kernel lineattrs=graphdata1;
histogram I_Price_B / fillattrs=graphdata2 transparency=.5 binstart=1
binwidth=50;
density I_Price_B /type = kernel lineattrs=graphdata2;
histogram I_Price_C / fillattrs=graphdata3 transparency=.5 binstart=1
binwidth=50;
density I_Price_C / type = kernel lineattrs=graphdata3;
keylegend / location=inside position=topright noborder across=2;
yaxis grid;
xaxis display=(nolabel) values=(0 to 1000 by 100);
run;
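Labeling an intersection first requires its coordinates, which the plot does not provide by itself. One possible approach, sketched below in Python, is to evaluate both density curves on a common grid, look for a sign change in their difference, and interpolate linearly. This assumes a Gaussian kernel with a hand-picked bandwidth, which will not necessarily match the bandwidth the DENSITY statement chooses. The function names and sample prices are hypothetical.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a function x -> Gaussian kernel density estimate of the sample."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

    return density


def intersections(f, g, lo, hi, steps=1000):
    """Find (x, y) points where f and g cross on [lo, hi].

    Detects sign changes of f - g between neighbouring grid points and
    interpolates linearly; exact ties on a grid point are not handled.
    """
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    diffs = [f(x) - g(x) for x in xs]
    points = []
    for i in range(steps):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 * d1 < 0:
            t = d0 / (d0 - d1)
            x = xs[i] + t * (xs[i + 1] - xs[i])
            points.append((x, f(x)))
    return points


# Hypothetical price answers; the two samples mirror each other around 300.
too_expensive = gaussian_kde([100, 200, 300], bandwidth=50)
inexpensive = gaussian_kde([300, 400, 500], bandwidth=50)
crossings = intersections(too_expensive, inexpensive, 0, 601)
print(crossings)
```

Once the points are in a data set, they could be overlaid on the plot as an extra scatter layer with data labels.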

Pseudocode to Calculate average using MapReduce

Hi, I want to write a MapReduce algorithm in pseudocode to solve the following problem:
Given input records in the following format:
address, zip, city, house_value
please calculate the average house value for each zip code.
I would really appreciate it if you could help me with this.
The easiest would be to use Apache Pig, here is an example of finding an average:
inpt = load 'data.txt' as (address:chararray, zip:chararray, city:chararray, house_value:long);
grp = group inpt by zip;
average = foreach grp generate FLATTEN(group) as (zip), AVG(inpt.house_value) as average_price;
dump average;
For pseudo MapReduce code you would need a MAPPER, a COMBINER and a REDUCER:
MAPPER(record):
    zip_code_key = record['zip'];
    value = {1, record['house_value']};
    emit(zip_code_key, value);
COMBINER(zip_code_key, value_list):
    record_num = 0;
    value_sum = 0;
    foreach (value : value_list) {
        record_num += value[0];
        value_sum += value[1];
    }
    value_out = {record_num, value_sum};
    emit(zip_code_key, value_out);
REDUCER(zip_code_key, value_list):
    record_num = 0;
    value_sum = 0;
    foreach (value : value_list) {
        record_num += value[0];
        value_sum += value[1];
    }
    avg = value_sum / record_num;
    emit(zip_code_key, avg);
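The pseudocode above can be exercised with a small in-memory simulation. The Python sketch below is not a Hadoop job: `run_job`, the split size and the sample records are hypothetical, and the "shuffle" is just a dictionary. It shows why the mapper emits (count, sum) pairs rather than values: partial averages cannot be combined, but partial counts and sums can.

```python
from collections import defaultdict


def mapper(record):
    """Emit (zip, (count, sum)) for one input record."""
    yield record["zip"], (1, record["house_value"])


def combine(values):
    """Merge (count, sum) pairs; used by both the combiner and the reducer."""
    return (sum(v[0] for v in values), sum(v[1] for v in values))


def reducer(zip_code, values):
    """Turn the merged (count, sum) into an average for one zip code."""
    count, total = combine(values)
    return zip_code, total / count


def run_job(records, split_size=2):
    """Simulate map -> combine per input split -> shuffle -> reduce."""
    shuffled = defaultdict(list)
    for start in range(0, len(records), split_size):
        local = defaultdict(list)
        for record in records[start:start + split_size]:
            for key, value in mapper(record):
                local[key].append(value)
        # The combiner pre-aggregates each split before the shuffle.
        for key, values in local.items():
            shuffled[key].append(combine(values))
    return dict(reducer(key, values) for key, values in shuffled.items())


records = [
    {"address": "1 Main St", "zip": "10001", "city": "New York", "house_value": 100},
    {"address": "2 Main St", "zip": "10001", "city": "New York", "house_value": 300},
    {"address": "3 Main St", "zip": "10001", "city": "New York", "house_value": 200},
    {"address": "9 Bay St", "zip": "94105", "city": "San Francisco", "house_value": 500},
]
averages = run_job(records)
print(averages)
```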