SAS_STATE SPACE MODELS

SAS_STATE SPACE MODELS - sas

I hope all of you are doing well. I have a question regarding the State Space Models. As you probably know, by using these models we can compute the error variances of irregular component and level component by using one observable variable. For example, assume that we have one variable; it is price of the asser. State Space Models help us to find the level and irregular components of this variable. I use the following codes in order to compute that:
proc ucm data = work;
model price (price is my observable variable);
irregular plot = smooth;
level checkbreak plot = smooth;
estimate plot = residual;
forecast plot = forecasts lead = 10 alpha = 0.5;
run;
My problem is, I need to find the error variances of irregular and level components for each group. Above mentioned code help me to find these variances by using data of all groups. For simplicity, I explain it by using a simple data. I have the following datasheet:
group price
A 0.5
A 0.4
A 0.8
A 0.1
B 0.3
B 0.2
B 0.5
I want to get the following datasheet:
group price Error variances of irregular components (irr.c) Error variances of level components
A 0.5 0.1 (assume that er.variance of irr.c for A is 0.1) 0.3 (assume that er.variance of lev.c for A is 0.3)
A 0.4 0.1 (assume that er.variance of irr.c for A is 0.1) 0.3 (assume that er.variance of lev.c for A is 0.3)
A 0.8 0.1 (assume that er.variance of irr.c for A is 0.1) 0.3 (assume that er.variance of lev.c for A is 0.3)
A 0.1 0.1 (assume that er.variance of irr.c for A is 0.1) 0.3 (assume that er.variance of lev.c for A is 0.3)
B 0.3 0.2 (assume that er.variance of irr.c for B is 0.2) 0.1 (assume that er.variance of lev.c for B is 0.1)
B 0.2 0.2 (assume that er.variance of irr.c for B is 0.2) 0.1 (assume that er.variance of lev.c for B is 0.1)
B 0.5 0.2 (assume that er.variance of irr.c for B is 0.2) 0.1 (assume that er.variance of lev.c for B is 0.1)
I hope I can explain my issue. Sorry for any misunderstanding.

data work;
input group $ price;
datalines;
A 0.5
A 0.4
A 0.8
A 0.1
B 0.3
B 0.2
B 0.5
;
run;
proc print data=work;
run;
ods trace on;
ods select ParameterEstimates;
ods output ParameterEstimates=myEstimates;
proc ucm data=work;
model price;
by group;
irregular plot=smooth;
level checkbreak plot=smooth;
estimate plot=residual;
forecast plot=forecasts lead=10 alpha=0.5;
run;
proc print data=myEstimates;
run;
proc transpose data=myEstimates(keep=group component estimate)
out=transposedEstimates;
by group;
id component;
run;
proc print data=transposedEstimates;
run;
proc sql;
create table myResults as
select a.*,
b.irregular as IrregularComponent,
b.level as LevelComponent
from work as a,
transposedEstimates as b
where a.group=b.group;
quit;
proc print data=myResults;
run;

Related

How to blend images in python (without blend method)

this is an assignment on my class, so i need to blend to images together with python using interpolation but i am missing something, perhaps you can help me understand what.
Heres my code so far:
from PIL import Image
import numpy as np
image_one=Image.open('capone.pgm')
image_two=Image.open('escobar.pgm')
out=Image.new(image_one.mode, image_two.size)
(l,h)=image_one.size
for j in range(0, h):
for i in range(0, l):
out.getpixel((i,j)),(image_one.getpixel((i,j)) * (1.0 - 0.3) + image_two.getpixel((i,j)) * 0.3 )
out.save("testaando.jpg","JPEG")
out.show()
0.3 is the alpha i want for the blending
the two original images are sime size and mode

getpixel method of PIL.Image returns value of a pixel, but to modify it you need to use putpixel method. So instead of
out.getpixel((i,j)),(image_one.getpixel((i,j)) * (1.0 - 0.3) + image_two.getpixel((i,j)) * 0.3 )
use
out.putpixel((i,j), (image_one.getpixel((i,j)) * (1.0 - 0.3) + image_two.getpixel((i,j)) * 0.3 ))

This is just a guess as there currently is not much information.
The Line:
out.getpixel((i,j)),(image_one.getpixel((i,j)) * (1.0 - 0.3) + image_two.getpixel((i,j)) * 0.3 )
Should be:
out[i, j] = (image_one.getpixel((i,j)) * (1.0 - 0.3) + Image_two.getpixel((i,j)) * 0.3 )

Python: Float value addition and substraction give wrong value

I just check following thing in python 2.7
print 0.1 + 0.2
output :- 0.3
print 0.1 + 0.2 - 0.3
output :- 5.55111512313e-17
But I expect the 0.0
So, how to achive this thing ?

The problem here is that the float type doesn't have enough precision to display the result you want. If you try to print the partial sum 0.1 + 0.2 you'll see that the float result you get is 0.30000000000000004.
So, 5.55111512313e-17 is the closest approximation possible with float type variables to that result. If you try to cast the result to int, so:
int(0.2 + 0.1 - 0.3)
You'll see 0, and that's the right integer approximation.
You can get 0.0 with floating point variables by using the decimal class.
Try this:
from decimal import Decimal
Decimal("0.2") + Decimal("0.1") - Decimal("0.3")
And you'll see that the result is Decimal("0.0")

Efficiently fitting cubic splines in SAS to specific grid of objects

I have a dataset mydat with the following variables:
MNES IV
0.84 0.40
0.89 0.34
0.91 0.31
0.93 0.29
0.95 0.26
0.98 0.23
0.99 0.22
1.00 0.22
1.02 0.20
1.04 0.18
1.07 0.18
And I need to fit cubic splines to these elements, where MNES is the object (X) and IV is the image (Y).
I have successfully accomplished what I need through PROC IML but I am afraid this is not the most efficient solution.
Specifically, my intended output dataset is:
mnes iv
0.333 0.40
0.332 0.40 <- for mnes out of sample MNES range, copy first IV;
0.336 0.40
... ...
0.834 0.40
0.837 0.40
0.840 0.40
0.842 INTERPOLATION
0.845 INTERPOLATION
0.848 INTERPOLATION
...
1.066 INTERPOLATION
1.069 INTERPOLATION
1.072 INTERPOLATION
1.074 0.18
1.077 0.18 <- for mnes out of sample MNES range, copy last IV;
1.080 0.18
... ...
3.000 0.18
The necessary specifics are the following:
I always have 1001 points for MNES, ranging from 0.(3) to 3 (thus, each step is (3-1/3)/1000).
The interpolation for IV should only be used for the points between the minimum and maximum MNES.
For the points where MNES is greater than the maximum MNES in the sample, IV should be equal to the IV of the maximum MNES and likewise for the minimum MNES (it is always sorted by MNES).
My worry for efficiency is due to the fact that I have to solve this problem roughly 2 million times and right now it (the code below, using PROC IML) takes roughly 5 hours for 100k different input datasets.
My question is: What alternatives do I have if I wish to fit cubic splines given an input data set such as the one above and output it to a specific grid of objects?
And what solution would be the most efficient?
With PROC IML I can do exactly this with the splinev function, but I am concerned that using PROC IML is not the most efficient way;
With PROC EXPAND, given that this is not a time series, it does not seem adequate. Additionally, I do not know how to specify the grid of objects which I need through PROC EXPAND;
With PROC TRANSREG, I do not understand how to input a dataset into the knots and I do not understand whether it will output a dataset with the corresponding interpolation;
With the MSPLINT function, it seems doable but I do not know how to input a data set to its arguments.
I have attached the code I am using below for this purpose and an explanation of what I am doing. Reading what is below is not necessary for answering the question but it could be useful for someone solving this sort of problem with PROC IML or wanting a better understanding of what I am saying.
I am replicating a methodology (Buss and Vilkov (2012)) which, among other things, applies cubic splines to these elements, where MNES is the object (X) and IVis the image (Y).
The following code is heavily based on the Model Free Implied Volatility (MFIV) MATLAB code by Vilkov for Buss and Vilkov (2012), available on his website.
The interpolation is a means to calculate a figure for stock return volatility under the risk-neutral measure, by computing OTM put and call prices. I am using this for the purpose of my master thesis. Additionally, since my version of PROC IML does not have functions for Black-Scholes option pricing, I defined my own.
proc iml;
* Define BlackScholes call and put function;
* Built-in not available in SAS/IML 9.3;
* Reference http://www.lexjansen.com/wuss/1999/WUSS99039.pdf ;
start blackcall(x,t,s,r,v,d);
d1 = (log(s/x) + ((r-d) + 0.5#(v##2)) # t) / (v # sqrt(t));
d2 = d1 - v # sqrt(t);
bcall = s # exp(-d*t) # probnorm(d1) - x # exp(-r*t) # probnorm(d2);
return (bcall);
finish blackcall;
start blackput(x,t,s,r,v,d);
d1 = (log(s/x) + ((r-d) + 0.5#(v##2)) # t) / (v # sqrt(t));
d2 = d1 - v # sqrt(t);
bput = -s # exp(-d*t) # probnorm(-d1) + x # exp(-r*t) # probnorm(-d2);
return (bput);
finish blackput;
store module=(blackcall blackput);
quit;
proc iml;
* Specify necessary input parameters;
currdate = "&currdate"d;
currpermno = &currpermno;
currsecid = &currsecid;
rate = &currrate / 100;
mat = &currdays / 365;
* Use inputed dataset and convert to matrix;
use optday;
read all var{mnes impl_volatility};
mydata = mnes || impl_volatility;
* Load BlackScholes call and Put function;
load module=(blackcall blackput);
* Define parameters;
k = 2;
m = 500;
* Define auxiliary variables according to Buss and Vilkov;
u = (1+k)##(1/m);
a = 2 * (u-1);
* Define moneyness (ki) and implied volatility (vi) grids;
mi = (-m:m);
mi = mi`;
ki = u##mi;
* Preallocation of vi with 2*m+1 ones (1001 in the base case);
vi = J(2*m+1,1,1);
* Define IV below minimum MNESS equal to the IV of the minimum MNESS;
h = loc(ki<=mydata[1,1]);
vi[h,1] = mydata[1,2];
* Define IV above maximum MNESS equal to the IV of the maximum MNESS;
h = loc(ki>=mydata[nrow(mydata),1]);
vi[h,1] = mydata[nrow(mydata),2];
* Define MNES grid where there are IV from data;
* (equal to where ki still has ones resulting from the preallocation);
grid = ki[loc(vi=1),];
* Call splinec to interpolate based on available data and obtain coefficients;
* Use coefficients to create spline on grid and save on smoothFit;
* Save smoothFit in correct vi elements;
call splinec(fitted,coeff,endSlopes,mydata);
smoothFit = splinev(coeff,grid);
vi[loc(vi=1),1] = smoothFit[,2];
* Define elements of mi corresponding to OTM calls (MNES >=1) and OTM puts (MNES <1);
ic = mi[loc(ki>=1)];
ip = mi[loc(ki<1)];
* Calculate call and put prices based on call and put module;
calls = blackcall(ki[loc(ki>=1),1],mat,1,rate,vi[loc(ki>=1),1],0);
puts = blackput(ki[loc(ki<1),1],mat,1,rate,vi[loc(ki<1),1],0);
* Complete volatility calculation based on Buss and Vilkov;
b1 = sum((1-(log(1+k)/m)#ic)#calls/u##ic);
b2 = sum((1-(log(1+k)/m)#ip)#puts/u##ip);
stddev = sqrt(a*(b1+b2)/mat);
* Append to voldata dataset;
edit voldata;
append var{currdate currsecid currpermno mat stddev};
close voldata;
quit;

Ok. I'm going to do this for 2 data sets to help you with the fact you have a bunch. You will have to modify for your inputs, but this should give you better performance.
Create some inputs
Get the first and last values from each input data set.
Create a list of all MNES values.
Merge each input to the MNES list and set the upper and lower values.
Append the Inputs together
Run PROC EXPAND with a BY statement to single pass all the input values and create the splines.
The trick is to "trick" EXPAND into thinking MNES is a Daily timeseries. I do this by making it an integer -- date values are integers behind the scenes in SAS. With no gaps, ETS Procedures will assume a "daily" frequency.
After this is done, run a Data Step to call the Black-Scholes (BLKSHPTPRC, BLKSHCLPRC) functions and complete your analysis.
/*Sample Data*/
data input1;
input MNES IV;
/*Make MNES and integer*/
MNES = MNES * 1000;
datalines;
0.84 0.40
0.89 0.34
0.91 0.31
0.93 0.29
0.95 0.26
0.98 0.23
0.99 0.22
1.00 0.22
1.02 0.20
1.04 0.18
1.07 0.18
;
run;
data input2;
input MNES IV;
MNES = MNES * 1000;
datalines;
0.80 0.40
0.9 0.34
0.91 0.31
0.93 0.29
0.95 0.26
0.98 0.23
1.02 0.19
1.04 0.18
1.07 0.16
;
run;
/*Get the first and last values from the input data*/
data _null_;
set input1 end=last;
if _n_ = 1 then do;
call symput("first1",mnes);
call symput("first1_v",iv);
end;
if last then do;
call symput("last1",mnes);
call symput("last1_v",iv);
end;
run;
data _null_;
set input2 end=last;
if _n_ = 1 then do;
call symput("first2",mnes);
call symput("first2_v",iv);
end;
if last then do;
call symput("last2",mnes);
call symput("last2_v",iv);
end;
run;
/*A list of the MNES values*/
data points;
do mnes=333 to 3000;
output;
end;
run;
/*Join Inputs to the values and set the lower and upper values*/
data input1;
merge points input1;
by mnes;
if mnes < &first1 then
iv = &first1_v;
if mnes > &last1 then
iv = &last1_v;
run;
data input2;
merge points input2;
by mnes;
if mnes < &first2 then
iv = &first2_v;
if mnes > &last2 then
iv = &last2_v;
run;
/*Append the data sets together, keep a value
so you can tell them apart*/
data toSpline;
set input1(in=ds1)
input2(in=ds2);
if ds1 then
Set=1;
else if ds2 then
Set=2;
run;
/*PROC Expand for the spline. The integer values
for MNES makes it think these are "daily" data*/
proc expand data=toSpline out=outSpline method=spline;
by set;
id mnes;
run;

Here is the solution I came up with. Sadly, I cannot yet conclude whether this is more efficient than the PROC IML solution - just for one dataset they both take the pretty much the same running time.
MSPLINT:
real time: 1.42 seconds
cpu time 0.23 seconds
PROC IML:
real time: 1.02 seconds
cpu time: 0.26 seconds
The biggest disadvantage of this solution when compared to the one above by #DomPazz is that I cannot process the data by 'By groups', which would certainly make it a lot faster... I am still thinking whether I can solve this without resorting to a macro loop but I am all out of ideas.
I keep the solution of defining a macro variable with the first and last values, as proposed by #DomPazz, but I then use a datastep, which copies the first and last values or applies the interpolation depending on what value of MNES it is stepping through. It applies the interpolation through the MSPLINT function. Its syntax is as follows:
MSPLINT(X, n, X1 <, X2, ..., Xn>, Y1 <,Y2, ..., Yn> <, D1, Dn>)
Where X is the object at which you wish to evaluate the spline, n is the number of knots supplied to the function (i.e. the number of observations in the input data), X1,...,Xn are the objects in the input data (i.e. MNES) and Y1,...,Yn are the images in the input data (i.e. IV). D1 and Dn (optional) are the derivatives you wish to maintain for interpolation objects X < X1 and X>Xn.
An interesting note is that by specifying D1 and Dn. as 0 you can have the points beyond the grid equal to the last observation inside the interpolated area. However, this forces the spline images to converge to a derivative of zero, potentially generating a non-natural pattern in the data. I opted not to define these as zero and defining the points outside the interpolation area separately.
So, I use PROC SQL to define the lists of elements of MNES and IV in macro variables, divided by commas, so that I can input them in the MSPLINT function. I also define the number of observations through PROC SQL.
MNES, as I commented in the answer above, was not well defined in my explanation. It needs to be defined as the variable u to the power of elements from -500 to 500. This is just a detail but it will allow you to understand where MNES comes from in the example below.
So, here is the solution, including example data.
* Set model inputs;
%let m = 500;
%let k = 2;
%let u = (1+&k) ** (1/&m);
/*Sample Data*/
data input1;
input MNES 13.10 IV 8.6;
cards;
0.8444984010 0.400535
0.8901469633 0.347988
0.9129712444 0.318596
0.9357955255 0.291456
0.9586198066 0.264852
0.9814440877 0.236231
0.9928562283 0.224858
1.0042683688 0.220035
1.0270926499 0.201118
1.0499169310 0.189373
1.0727412121 0.185628
;
run;
data _null_;
set input1 end=last;
if _n_ = 1 then do;
call symput("first1",MNES);
call symput("first1_v",IV);
end;
if last then do;
call symput("last1",MNES);
call symput("last1_v",IV);
end;
run;
proc sql noprint;
select MNES into:mneslist
separated by ','
from input1;
select IV into:IVlist
separated by ','
from input1;
select count(*) into:countlist
from input1;
quit;
data splined;
do grid=-500 to 500;
mnes = (&u) ** grid;
if mnes < &first1 then IV = &first1_v;
if mnes > &last1 then IV = &last1_v;
if mnes >= &first1 and mnes <= &last1 then IV = msplint(mnes, &countlist, &mneslist, &IVlist);
end;
run;

SAS do loop by increment

Why is the following not yielding the desired result?
data data1;
do a = 0.0 to 1.0 by 0.1;
do b = 0.0 to 1.0 by 0.1;
do c = 0.0 to 1.0 by 0.1;
do d = 0.0 to 1.0 by 0.1;
if (a+b+c+d)=1 then output;
end;
end;
end;
end;
format a b c d 4.1;
run;

I'm not really familiar with SAS, but in general when you have numbers like 0.1, it's represented in binary. Since .1 can't be represented exactly in binary then math equations don't always add up exactly. For instance 0.1 times 10 does not equal exactly 1.0 in floating point arithmetic. In general, don't use equality when working with floating point.
See http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

Numbers in SAS are stored in binary (as they are in most computing applications), and are often not precisely representing the decimal equivalent. Just like 0.33333333 != (1/3) exactly, neither do many 1/10ths represent their precise decimal value either.
data data1;
do a = 0.0 to 1.0 by 0.1;
do b = 0.0 to 1.0 by 0.1;
do c = 0.0 to 1.0 by 0.1;
do d = 0.0 to 1.0 by 0.1;
if round(a+b+c+d,.1)=1 then output;
end;
end;
end;
end;
format a b c d 4.1;
run;
Rounding fixes the issue.
You can read this SAS technical paper for more information.

Percentage calculation around 0.5 (0.4 = -20% and 0.6 = +20%)

I'm in a strange situation where I have a value of 0.5 and I want to convert the values from 0.5 to 1 to be a percentage and from 0.5 to 0 to be a negative percentage.
As it says in the title 0.4 should be -20%, 0.3 should be -40% and 0.1 should be -80%.
I'm sure this is a simple problem, but my mind is just refusing to figure it out :)
Can anyone help? :)

What we want to do is to scale the range (0; 1) to (-100; 100):
percentage = (value - 0.5) * 200;
The subtraction transforms the value so that it's in the range (-0.5; 0.5), and the multiplication scales it to the range of (-100; 100).

percent = ((value - 0.5) / 0.5) * 100
This will generate from -100 to 100. You want to subtract your zero value (0.5) from the given value, and divide by the range that should give 100% (also 0.5 in your example). Then multiply by 100 to convert to percentage.

Normalize it, and you're done:
// Assuming x is in the range (0,1)
x *= 2.0; // x is in the range (0,2)
x -= 1.0; // (-1,1)
x *= 100; // (-100,100)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

SAS_STATE SPACE MODELS - sas

Related

How to blend images in python (without blend method)

Python: Float value addition and substraction give wrong value

Efficiently fitting cubic splines in SAS to specific grid of objects

SAS do loop by increment

Percentage calculation around 0.5 (0.4 = -20% and 0.6 = +20%)

Categories

Resources