outputting multiple data sets into excel workbook - sas

Another question. I have multiple data sets that generate output. How can I output these into one Excel worksheet and apply my own formatting? For example, I have data set 1, data set 2, data set 3;
each data set has two columns, for example
Col 1 Col 2
1 2
3 4
5 6
I want all the data sets to be in one worksheet, separated by a blank column, so in Excel it should look like
Col 1 Col 2 Blank Col Col 1 Col 2 Blank Col
Someone told me I need to look at DDE for this. Is this true?
Regards,

You can definitely do it using DDE. DDE simply simulates a user's clicks on Excel's menus, buttons, cells, etc. Here's an example of how to do that with a macro loop for 3 datasets named have1, have2 and have3. If you need a more general solution (an unknown number of datasets, varying numbers of variables, arbitrary dataset names, etc.), the code should be updated, but its 'DDE part' will be essentially the same.
One more assumption: your Excel workbook should be open during code execution, although this can also be automated, since Excel can be started and the file opened using DDE itself.
You can find a very nice introduction to DDE here, where all these tricks are discussed in detail.
data have1;
input Col1 Col2;
datalines;
1 2
3 4
5 6
;
run;
data have2;
input Col1 Col2;
datalines;
1 2
3 4
5 6
7 8
;
run;
data have3;
input Col1 Col2;
datalines;
1 2
3 4
7 8
5 6
9 10
;
run;
%macro xlsout;
/*iterating through your datasets*/
%do i=1 %to 3;
/*determine number of records in the current dataset*/
proc sql noprint;
select count(*) into :noobs
from have&i;
quit;
/*assign a range on the workbook spreadsheet matching to data in the current dataset*/
filename range dde "excel|[myworkbook.xls]sas!r1c%eval((&i-1)*3+1):r%left(&noobs)c%eval((&i-1)*3+2)" notab;
/*put data into selected range*/
data _null_;
set have&i;
file range;
put Col1 '09'x Col2;
run;
%end;
%mend xlsout;
%xlsout
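As a rough sketch of the remark above that Excel can be started and the file opened through DDE itself, something like the following could run before the macro. It is Windows-only, and the Excel install path, the workbook path and the 5-second wait are all assumptions that need adjusting for your machine.

```sas
/* Sketch only: the paths below are hypothetical placeholders. */
options noxwait noxsync;   /* do not block SAS while Excel is running */

/* Start Excel (assumed install location) */
x '"C:\Program Files\Microsoft Office\Office\excel.exe"';

/* Give Excel a few seconds to finish loading */
data _null_;
    rc = sleep(5);
run;

/* The DDE 'system' topic accepts Excel commands in square brackets */
filename cmds dde 'excel|system';

data _null_;
    file cmds;
    put '[open("C:\temp\myworkbook.xls")]';   /* assumed workbook location */
run;
```

Once the workbook is open, the range-based filename statement in the macro can address its sheet as before.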

You cannot do exactly this with SAS (DDE is probably possible). I would suggest looking at SaviCells Pro.
http://www.sascommunity.org/wiki/SaviCells
http://www.savian.net/utilities.html

You could likely accomplish what you're asking through ODS TAGSETS.EXCELXP or the new ODS EXCEL (9.4 TS1M1). However, you would need to arrange the datasets ahead of time (i.e., merge them together, transpose them, or otherwise build one dataset with the right columns), or else use PROC REPORT or some other procedure to get them into the right layout.
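A minimal sketch of that idea, assuming the three sample datasets have1, have2 and have3 from the DDE answer and SAS 9.4 with ODS EXCEL available. The renamed variables (A_Col1 and so on), the blank spacer variables and the output file name are illustrative choices, not anything from the question.

```sas
/* Put the three datasets side by side with a one-to-one merge (no BY statement). */
/* Columns are renamed so they do not overwrite each other.                       */
data combined;
    merge have1(rename=(Col1=A_Col1 Col2=A_Col2))
          have2(rename=(Col1=B_Col1 Col2=B_Col2))
          have3(rename=(Col1=C_Col1 Col2=C_Col2));
    length Blank1 Blank2 $1;        /* empty spacer columns */
    call missing(Blank1, Blank2);
run;

/* Hypothetical output file; the VAR statement fixes the column order */
ods excel file="myworkbook.xlsx" options(sheet_name="sas");

proc print data=combined noobs label;
    var A_Col1 A_Col2 Blank1 B_Col1 B_Col2 Blank2 C_Col1 C_Col2;
    label A_Col1='Col 1' A_Col2='Col 2'
          B_Col1='Col 1' B_Col2='Col 2'
          C_Col1='Col 1' C_Col2='Col 2';
run;

ods excel close;
```

Because the datasets have different numbers of rows, the shorter ones are padded with missing values in the merged table.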

Related

How to create 2x2 table in sas for fisher exact test

I just performed the fisher test in R and in Excel on a 2x2 table with the contents 1 6 and 7 2. I can't manage to do this in sas.
data my_table;
input var1 var2 @@;
datalines;
1 6 7 2
;
proc freq data=my_table;
tables var1*var2 / fisher;
run;
The test somehow ignores that the table consists of the 4 variables but when I print the table it looks fine. In the test the contents of the table are 0, 1, 1, 0. I guess I need to change something when creating the data but what?
You do NOT have two variables that each have two categories.
Read the data in this way instead.
data have ;
do var1=1,2 ; do var2=1,2;
input count @@;
output;
end; end;
datalines;
1 6 7 2
;
Now VAR1 and VAR2 both have two possible values and COUNT has the number of cases for the particular combination. Use the WEIGHT statement to tell PROC FREQ to use COUNT as the number of cases.
proc freq data=have ;
weight count ;
tables var1*var2 / fisher ;
run;

SAS: How to create a table that lists the variable names of another table

I am starting out with SAS and PROC SQL, and I want to do a very simple task as described below but am having trouble.
data test;
input x_ray y_chromosome z_sword;
cards;
1 2 3
4 5 6
4 7 8
7 8 9
7 9 10
7 10 11
;
In the table test we have three variables, x_ray, y_chromosome, and z_sword.
My goal is to make another table, let's say, result, that looks like this.
data result;
input name $;
cards;
x_ray
y_chromosome
z_sword
;
I looked up ways to do this online, but all the methods are very well beyond my understanding and I simply do not believe that such a simple process has to be so complicated.
May I have some help, please?
Use PROC CONTENTS.
proc contents data=test noprint out=result;
run;
If you only want the variable names and not the other information you can use dataset option to limit the variables kept in the RESULT dataset.
proc contents data=test noprint out=result(keep=name) ;
run;
I cannot find any shorter or simpler answer than @Tom's; here are just two other ways for reference.
Use dictionary table:
proc sql;
create table result as select name from sashelp.vcolumn where libname='WORK' and memname='TEST';
quit;
or query them by I/O functions:
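data result;
length name $32; /* SAS variable names are at most 32 characters */
dsid = open('work.test'); /* dataset identifier; 0 if the open fails */
if dsid then do;
do i = 1 to attrn(dsid,'nvars');
name = varname(dsid,i);
output;
end;
rc = close(dsid);
end;
keep name; /* drop the helper variables dsid, i and rc */
run;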

label variables in one data set with values from another?

How do you take the labels contained in the metadata data set and apply them to the variables in the set1 data set?
The desired result is that 'set1' still contains variables a-h and the appropriate variables now have labels. For example 'set1' will continue to have a variable 'a' with no label, however variable 'b' will now have the label 'Label1' etc.
The code I have below works, but it's very inefficient because it runs the macro for each variable. So for every one label it has to read 'set1', apply a label and save 'set1'. When doing this with large 'set1' and 'metadata' data sets, it's quite slow.
/**********************************************************
Reads metadata - in the real case it comes from a large
csv file
***********************************************************/
data metadata;
input var $ labels $;
datalines;
b Label1
d Label2
f Label3
;
run;
/**********************************************************
Reads 'set1' in the real case it comes from many
even larger csv files.
***********************************************************/
data set1;
input a b c d e f g h;
datalines;
1 1 0 5 6 4 0 4
2 3 4 5 3 5 0 1
3 2 1 9 6 5 8 1
;
run;
/**********************************************************
Macro to relabel one by one
***********************************************************/
%Macro relabel(var,label);
DATA set1;
set set1;
label %quote(&var) = %quote(&label);
RUN;
%Mend relabel;
/**********************************************************
Steps through 'metadata' and individually calls the macro
for each obs
***********************************************************/
data _null_;
set metadata;
call execute('%relabel('||var||','||labels||')');
run;
proc print;
run;
/**********************************************************
Shows labels applied correctly.
***********************************************************/
proc contents;
run;
If your metadata is small enough then you can use a macro variable to hold it. There is a 65K limit on the size of a macro variable.
proc sql noprint;
select catx('=',var,quote(trim(labels)))
into :labels separated by ' '
from metadata
;
quit;
proc datasets nolist lib=work ;
modify set1;
label &labels;
run;
quit;
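If the metadata is too large for a single macro variable, one possible alternative (a sketch, not part of the answer above) is to generate the same single PROC DATASETS step with CALL EXECUTE, so that set1 is still modified only once. It assumes the metadata and set1 datasets from the question.

```sas
/* Build one PROC DATASETS step piece by piece; nothing runs until this step ends */
data _null_;
    set metadata end=last;
    /* Open the procedure and the LABEL statement once */
    if _n_ = 1 then
        call execute('proc datasets nolist lib=work; modify set1; label');
    /* Add one var="label" pair per metadata row */
    call execute(catx('=', var, quote(trim(labels))));
    /* Close the LABEL statement and the procedure after the last row */
    if last then call execute('; run; quit;');
run;
```

PROC DATASETS changes only the header of set1, so the data itself is not reread or rewritten.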

SAS retrieving data from monthly datasets

I have 2 variables and 3 records in a sas data set, and based on the date field in that data set, I need to read different monthly data sets.
For example,
I have
item no. Date
1 30Jun2015
2 31Jul2015
3 31Aug2015
When I read the first record, then based on the date field (30jun2015) here, it should merge another dataset suffixed with 30jun2015 with this current dataset.
How can I achieve that?
I'll hazard a guess at what you're looking for; I've left a gap where you'll have to specify the criteria for your own merge.
1) Read in base data
data MAIN_DATA;
infile cards;
input ITEM_NO DATE:date9.;
format DATE date9.;
cards;
1 30JUN2015
2 31JUL2015
3 31AUG2015
;
run;
2) Store all dates in macro variables date1, date2, and so on, with the number of dates in daten. This assumes ddmmyy6. is a good format for your table names
Data _null_;
Set Main_data;
Call symputx('date'||strip(_n_),put(DATE,ddmmyy6.));
Call symputx('daten', _n_);
Run;
3) Read in the variables and read the associated table - you haven't specified how to do the merge so I'll leave that up to you
%macro readin;
%do i = 1 %to &daten;
data NEW_TABLE_&&date&i..;
set TEST_&&date&i..; /*in this step you can merge on the original table however you intend to*/
run;
%end;
%mend readin;
%readin;

replicating a sql function in sas datastep

Hi, another quick question.
In PROC SQL we have ON, which is used for a conditional join. Is there something similar for the SAS data step?
for example
proc sql;
....
data1 left join data2
on first<value<last
quit;
can we replicate this in sas datastep
like
data work.combined;
set data1(in=a) data2(in=b);
if a then output;
run;
You can also reproduce an SQL join in one DATA step using hash objects. It can be really fast, but it depends on the amount of RAM in your machine, since this method loads one table into memory: the more RAM, the larger the dataset you can load into the hash. This method is particularly effective for look-ups in a relatively small reference table.
data have1;
input first last;
datalines;
1 3
4 7
6 9
;
run;
data have2;
input value;
datalines;
2
5
6
7
;
run;
data want;
if _N_=1 then do;
if 0 then set have2;
declare hash h(dataset:'have2');
h.defineKey('value');
h.defineData('value');
h.defineDone();
declare hiter hi('h');
end;
set have1;
rc=hi.first();
do while(rc=0);
if first<value<last then output;
rc=hi.next();
end;
drop rc;
run;
The result:
value first last
2 1 3
5 4 7
6 4 7
7 6 9
Yes, there is a simple (but subtle) way in just 7 lines of code.
What you intend to achieve is intrinsically a conditional Cartesian join, which can be done with a do-looped SET statement. The following code uses the test datasets from Dmitry and a modified version of the code in the appendix of SUGI Paper 249-30.
data data1;
input first last;
datalines;
1 3
4 7
6 9
;
run;
data data2;
input value;
datalines;
2
5
6
7
;
run;
/***** by data step looped SET *****/
DATA CART_data;
SET data1;
DO i=1 TO NN; /*NN can be referenced before set*/
SET data2 point=i nobs=NN; /*point=i - random access*/
if first<value<last then OUTPUT; /*conditional output*/
END;
RUN;
/***** by SQL *****/
proc sql;
create table cart_SQL as
select * from data1
left join data2
on first<value<last;
quit;
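One can easily see that the results coincide for this sample data. In general the looped SET behaves like an inner join, since rows of data1 with no matching value are not output, whereas the SQL left join keeps them.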
Also note that from SAS 9.2 documentation: "At compilation time, SAS reads the descriptor portion of each data set and assigns the value of the NOBS= variable automatically. Thus, you CAN refer to the NOBS= variable BEFORE the SET statement. The variable is available in the DATA step but is not added to any output data set."
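The looped SET above outputs only matching pairs, while a left join also keeps rows of data1 that have no match. A minimal sketch of one way to close that gap, assuming the same data1 and data2; the flag variable matched and the output name cart_left are illustrative names only.

```sas
data cart_left;
    set data1;
    matched = 0;                          /* reset for every row of data1 */
    do i = 1 to nn;
        set data2 point=i nobs=nn;        /* random access to each row of data2 */
        if first < value < last then do;
            matched = 1;
            output;
        end;
    end;
    /* No match found: keep the data1 row with a missing value, as a left join would */
    if not matched then do;
        call missing(value);
        output;
    end;
    drop matched;
run;
```

The CALL MISSING is needed because value still holds the last row read from data2 when the loop ends.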
There isn't a direct way to do this with a MERGE. This is one example where the SQL method is clearly superior to any SAS data step methods, as anything you do will take much more code and possibly more time.
However, depending on the data, it's possible a few approaches may make sense. In particular, the format merge.
If data1 is fairly small (even, say, millions of records), you can make a format out of it. Like so:
data fmt_set;
set data1;
format label $8.;
start=first; *set up the names correctly;
end=last;
label='MATCH';
fmtname='DATA1F';
output;
if _n_=1 then do; *put out a hlo='o' line which is for unmatched lines;
start=.; *both unnecessary but nice for clarity;
end=.;
label='NOMATCH';
hlo='o';
output;
end;
run;
proc format cntlin=fmt_set; *import the dataset;
quit;
data want;
set data2;
if put(value,DATA1F.)="MATCH";
run;
This is very fast to run, unless data1 is extremely large (hundreds of millions of rows, on my system) - faster than a data step merge, if you include sort time, since this doesn't require a sort. One major limitation is that this will only give you one row per data2 row; if that is what is desired, then this will work. If you want repeats of data2 then you can't do it this way.
If data1 may have overlapping rows (i.e., two rows whose start/end ranges overlap each other), you will also need to address this, since start/end ranges normally aren't allowed to overlap. You can set hlo="m" for every row and "om" for the non-match row, or you can resolve the overlaps.
I'd still do the sql join, however, since it's much shorter to code and much easier to read, unless you have performance issues, or it doesn't work the way you want it to.
Here's another solution, using a temporary array to hold the lookup dataset. Performance is probably similar to Dmitry's hash-based solution, but this should also work for people still using versions of SAS prior to 9.1 (the release in which hash objects were first introduced).
I've reused Dmitry's sample datasets:
data have1;
input first last;
datalines;
1 3
4 7
6 9
;
run;
data have2;
input value;
datalines;
2
5
6
7
;
run;
/*We need a macro var with the number of obs in the lookup dataset*/
/*This is so we can specify the dimension for the array to hold it*/
data _null_;
if 0 then set have2 nobs = nobs;
call symput('have2_nobs',put(nobs,8.));
stop;
run;
data want_temparray;
array v{&have2_nobs} _temporary_;
do _n_ = 1 to &have2_nobs;
set have2 (rename=(value=value_array));
v{_n_}=value_array;
end;
do _n_ = 1 by 1 until (eof_have1);
set have1 end = eof_have1;
value=.;
do i=1 to &have2_nobs;
if first < v{i} < last then do;
value=v{i};
output;
end;
end;
if missing(value) then output;
end;
drop i value_array;
run;
Output:
value first last
2 1 3
5 4 7
6 4 7
7 6 9
This matches the output from the equivalent SQL:
proc sql;
create table want_sql as
select * from
have1 left join have2
on first<value<last
;
quit;