Transpose Multiple Variables to Columns Suffixed with Dates

I have data that looks like this:
data have;
format date monyy.;
input date:date9. group$ value1 value2;
datalines;
01JAN2020 A 100 10
01FEB2020 A 200 20
01JAN2020 B 300 30
01FEB2020 B 400 40
;
run;
But I need it to look like this, with value1 and value2 each suffixed with the date. The result will be exported to Excel.
data want;
input group$ value1_JAN20 value1_FEB20 value2_JAN20 value2_FEB20;
datalines;
A 100 200 10 20
B 300 400 30 40
;
run;
I have tried PROC TRANSPOSE, but the values of value1 and value2 end up in columns named after the dates, while the names value1 and value2 are stored in the column _NAME_.
Is there a way to do this with PROC TRANSPOSE, or does it need a data step or macro solution?
proc transpose data = have
out = try
delim = '_'n
name = date;
by group;
id date;
var value1 value2;
run;

Transpose twice. First to get a truly vertical structure. Then to generate the structure you want.
data have;
input date :date9. group $ value1 value2;
format date monyy.;
datalines;
01JAN2020 A 100 10
01FEB2020 A 200 20
01JAN2020 B 300 30
01FEB2020 B 400 40
;
proc transpose data=have out=tall;
by group date ;
var value1-value2;
run;
proc transpose data=tall out=want(drop=_name_) delim=_;
by group;
id _name_ date ;
var col1 ;
run;
Results:
Obs  group  value1_JAN20  value2_JAN20  value1_FEB20  value2_FEB20
1    A      100           10            200           20
2    B      300           30            400           40
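
The columns above come out grouped by date, whereas the layout in the question groups them by variable (value1_JAN20 value1_FEB20 value2_JAN20 value2_FEB20). PROC TRANSPOSE creates the ID columns in the order it first meets them, so one possible tweak, sketched here, is to sort the tall dataset by variable name before the second transpose. The names tall_sorted and want2 are made up for this illustration.

```sas
/* Sketch: reorder the tall data so each variable's dates are contiguous. */
/* Assumes the dataset TALL produced by the first PROC TRANSPOSE above.   */
proc sort data=tall out=tall_sorted;
  by group _name_ date;
run;

/* Same second transpose as before, now reading the re-sorted rows. */
proc transpose data=tall_sorted out=want2(drop=_name_) delim=_;
  by group;
  id _name_ date;
  var col1;
run;
```

With the sample data this should give the column order value1_JAN20, value1_FEB20, value2_JAN20, value2_FEB20.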

Related

matching two datasets with one month lag

I am trying to match the maximum daily value within each month to monthly data.
data daily;
input permno $ date ret;
datalines;
1000 19860101 88
1000 19860102 90
1000 19860201 70
1000 19860202 55
1001 19860201 97
1001 19860202 74
1001 19860203 79
1002 19860301 55
1002 19860302 100
1002 19860301 10
;
run;
data monthly;
input permno $ date ret;
datalines;
1000 19860131 1
1000 19860228 2
1000 19860331 5
1001 19860331 3
1002 19860430 4
;
run;
The result I want is the following (each month's daily maximum matched to the monthly record of the following month):
1000 19860102 90 1000 19860228 2
1000 19860201 70 1000 19860331 5
1001 19860201 97 1001 19860331 3
1002 19860302 100 1002 19860430 4
Below is what I have tried so far.
I want the maximum ret value within each month, so I created yrmon to give all daily rows in the same month the same yyyymm value:
data a1; set daily;
yrmon=year(date)*100 + month(date);
run;
To pick the maximum ret within each yrmon group for each permno, I used the code below:
proc means data=a1 noprint;
class permno yrmon ;
var ret;
output out= a2 max=maxret;
run;
However, this only gives me permno, yrmon and the maximum ret; the original date is lost. I then tried to shift yrmon forward by one month:
data a3;
set a2;
new=intnx('month',yrmon,1);
format date new yymmn6.;
run;
But this does not work because yrmon is a plain number, not a SAS date.
To summarise: I am trying to match two datasets by permno (the same company) with a one-month lag (e.g. yrmon=198601 in the daily dataset against yrmon=198602 in the monthly dataset).
I cannot simply add 1 to yrmon, because 198612 + 1 is not 198701, and I am not sure how to handle this.
Can anyone help? Thank you in advance.

1) informat date1/date2 yymmn6. reads the yyyymm values as SAS dates
2) format date1/date2 yymmn6. displays those dates in yyyymm form
3) intnx("months",b.date2,-1) shifts date2 back one month, so the join pairs each date1 with the following month in data2
data data1;
input date1 value1;
informat date1 yymmn6.;
format date1 yymmn6.;
cards;
200101 200
200212 300
200211 400
;
run;
data data2;
input date2 value2;
informat date2 yymmn6.;
format date2 yymmn6.;
cards;
200101 3000000
200102 4000000
200301 2000000
200212 2000000
;
run;
proc sql;
create table result as
select a.*,b.date2,b.value2 from
data1 a
left join
data2 b
on a.date1 = intnx("months",b.date2,-1);
quit;
My Output:
date1 |value1 |date2 |value2
200101 |200 |200102 |4000000
200211 |400 |200212 |2000000
200212 |300 |200301 |2000000
Let me know in case of any queries.
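
The same idea can be applied to the daily and monthly datasets from the question. The following is only a sketch under a few assumptions: date was read as a plain number such as 19860102, so it is first converted to a real SAS date, and the names daily_d, monthly_d, matched, sasdate and mon are invented for this example.

```sas
/* Convert the numeric yyyymmdd values into SAS dates. */
data daily_d;
  set daily;
  sasdate = input(put(date, 8.), yymmdd8.);
  mon = intnx('month', sasdate, 0);      /* first day of the row's month */
  format sasdate mon yymmddn8.;
run;

data monthly_d;
  set monthly;
  sasdate = input(put(date, 8.), yymmdd8.);
  mon = intnx('month', sasdate, 0);
  format sasdate mon yymmddn8.;
run;

proc sql;
  create table matched as
  select d.permno,
         d.sasdate as daily_date format=yymmddn8.,
         d.ret     as maxret,
         m.sasdate as monthly_date format=yymmddn8.,
         m.ret     as monthly_ret
  /* Keep, per permno and month, the daily row(s) holding the maximum ret. */
  from (select permno, sasdate, mon, ret
        from daily_d
        group by permno, mon
        having ret = max(ret)) as d
  inner join monthly_d as m
    on d.permno = m.permno
   /* One-month lag: daily month + 1 equals the monthly record's month. */
   and intnx('month', d.mon, 1) = m.mon
  order by d.permno, d.sasdate;
quit;
```

Because the shift is done with INTNX on real dates, December rolls over to January of the next year without special handling. Ties for the monthly maximum would produce more than one row per month in this sketch.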

Transpose multiple columns to rows in SAS

I am new to SAS and I want to transpose the following table in SAS
From
ID  Var1  Var2  Jul-09  Aug-09  Sep-09
1   10    15    200     300
2   5     17            -150    200
to
ID  Var1  Var2  Date    Transpose
1   10    15    Jul-09  200
1   10    15    Aug-09  300
2   5     17    Aug-09  -150
2   5     17    Sep-09  200
Can anyone help please?

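You can use proc transpose to transform the data.
/* validvarname=any allows column names such as "Jul-09"n */
options validvarname=any;
data a;
infile datalines missover;
input ID Var1 Var2 "Jul-09"n "Aug-09"n "Sep-09"n;
/* A period marks a missing cell so later values stay in the right column */
datalines;
1 10 15 200 300 .
2 5 17 . -150 200
;
run;
/* One output row per ID and month column; missing cells give rows with a missing Transpose */
proc transpose data=a out=b(rename=(_NAME_=Date COL1=Transpose));
var "Jul-09"n--"Sep-09"n;
by ID Var1-Var2;
run;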
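
The transposed dataset keeps one row for every month column, including the empty cells, and Date is a character value such as Jul-09. A possible follow-up step, sketched below, drops the empty rows and derives a real SAS date. The names b_clean and Month are made up for this example.

```sas
/* Sketch: assumes dataset B from the PROC TRANSPOSE step above,    */
/* with the character column Date and the numeric column Transpose. */
data b_clean;
  set b;
  where Transpose is not missing;                 /* drop the empty cells  */
  Month = input(compress(Date, '-'), monyy5.);    /* "Jul-09" -> 01JUL2009 */
  format Month monyy5.;
run;
```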

data a;
input ID Var1 Var2 Jul_09 Aug_09;
CARDS;
1 10 15 200 300
2 5 17 -150 200
;
DATA b(drop=i jul_09 aug_09);
array dates_{*} jul_09 aug_09;
set a;
do i=1 to dim(dates_);
this_value=dates_{i};
this_date=input(compress(vname(dates_{i}),'_'),MONYY5.);
output;
end;
format this_date monyy5.;
run;

Getting next observation per group

I am working on a dataset in SAS where, within each group (ID), the column Next_Row_score of the current observation should hold the score of the next observation. If there is no next observation in the group, Next_Row_score should be null. To illustrate, here is a sample dataset:
ID Score
10 1000
10 1500
10 2000
20 3000
20 4000
30 2500
The resulting output should look like this:
ID Salary Next_Row_Salary
10 1000 1500
10 1500 2000
10 2000 .
20 3000 4000
20 4000 .
30 2500 2500
Thank you in advance for your help.

data want(drop=_: flag);
merge have have(firstobs=2 rename=(ID=_ID Score=_Score));
if ID=_ID then do;
Next_Row_Salary=_Score;
flag+1;
end;
else if ID^=_ID and flag>=1 then do;
Next_Row_Salary=.;
flag=.;
end;
else Next_Row_Salary=score;
run;

Try this :
data have;
input ID Score;
datalines;
10 1000
10 1500
10 2000
20 3000
20 4000
30 2500
;
run;
proc sql noprint;
select count(*) into :obsHave
from have;
quit;
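data want2(rename=(id1=ID Score1=Salary) drop=ID id2 Score);
do i=1 to &obsHave;
/* read the current row */
set have point=i;
id1=ID;
Score1=Score;
j=i+1;
/* read the following row only if it exists; POINT= past the last row is an error */
if j <= &obsHave then do;
set have point=j;
id2=ID;
end;
else id2=.;
/* same ID: take the next score; otherwise set a numeric missing value */
if id1=id2 then Next_Row_Salary = Score;
else Next_Row_Salary=.;
output;
end;
stop;
run;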

There is a simpler (in my mind, at least) proc sql approach that doesn't involve loops:
data have;
input ID Score;
datalines;
10 1000
10 1500
10 2000
20 3000
20 4000
30 2500
;
run;
/*count each observation's place in its ID group*/
data have2;
set have;
count + 1;
by id;
if first.id then count = 1;
run;
/*if a group has only one observation, keep its own score, else take the next observation's score*/
proc sql;
create table want as select distinct
a.id, a.score,
case when max(a.count) = 1 then a.score else b.score end as score2
from have2 as a
left join have2 (where = (count > 1)) as b
on a.id = b.id and a.count = b.count - 1
group by a.id;
quit;

proc tabulate table with where clause

I have a dataset with three variables: country and two binary variables.
I wrote this proc tabulate:
proc tabulate data=mydata;
class country var1 var2;
table Country, var1 var2;
run;
        Var1      Var2
        0    1    0    1
USA     40   50   40   50
AUS     50   20   50   20
IRE     60   40   60   40
DUB     70   50   70   50
Here I get the table with the counts of both 0s and 1s for var1 and var2.
However, I only want the counts of 1s in this cross table. How can I do that?
If I use a where clause as below, it keeps only the rows where both are 1:
proc tabulate data=mydata;
class country var1 var2;
table Country, var1 var2;
where var1=1 and var2=1;
run;
The above returns only the observations where both variables are 1 at the same time, which is not what I am looking for.
The table I want is shown below.
        Var1   Var2
        1      1
USA     50     50
AUS     20     20
IRE     40     40
DUB     50     50
Is there any other way of doing this?

Change and to or.
Truth table for the conditions Var1=1, Var2=1 (is the row included?):
Var1  Var2  AND  OR
0     0     N    N
0     1     N    Y
1     0     N    Y
1     1     Y    Y

Since your variables are coded 0,1 you can ask for the SUM statistic to get the "count" of the number of ones.
proc tabulate data=mydata;
class country;
var var1 var2;
table Country, var1*sum var2*sum;
run;

PROC SUMMARY using multiple fields as a key

Is there a way to replicate the behaviour below using PROC SUMMARY without having to create the dummy variable UniqueKey?
DATA table1;
input record_desc $ record_id $ response;
informat record_desc $4. record_id $1. response best32.;
format record_desc $4. record_id $1. response best32.;
DATALINES;
A001 1 300
A001 1 150
A002 1 200
A003 1 150
A003 1 99
A003 2 250
A003 2 450
A003 2 250
A004 1 900
A005 1 100
;
RUN;
DATA table2;
SET table1;
UniqueKey = record_desc || record_id;
RUN;
PROC SUMMARY data = table2 NWAY MISSING;
class UniqueKey record_desc record_id;
var response;
output out=table3(drop = _FREQ_ _TYPE_) sum=;
RUN;

You can summarise by record_desc and record_id (see class statement below) without creating a concatenation of the two columns. What made you think you couldn't?
PROC SUMMARY data = table1 NWAY MISSING;
class record_desc record_id;
var response;
output out=table4(drop = _FREQ_ _TYPE_) sum=;
RUN;
proc compare
base=table3
compare=table4;
run;
