I am using SAS on OS390.
I have an INFILE1, some treatment, then another INFILE2, other treatment.
I want to use variables from INFILE1 to compare with INFILE2.
Example:
IF INFILE1.DATE1 > INFILE2.DATE2 THEN OUTPUT;
My issue is that DATE1 is always empty no matter what.
I've tried....
%LET DATETEMP = INFILE1.DATE1;
...but DATETEMP is empty as well.
Is there any way in SAS to make a variable carry its value from one INFILE to another, so to speak?
You cannot directly compare variables across datasets in one DATA step. You have to combine the datasets first and then make the comparison.
You can use either a DATA step merge or an SQL join.
Merge Example: (both datasets must already be sorted by ID)
DATA want ;
MERGE INFILE1 INFILE2;
BY ID ;
RUN;
SQL Join Example: (you can do either inner or left join based on what you want)
proc sql;
create table work.want as
select t1.date1 , t2.date2, t1.id
from INFILE1 as t1
left join INFILE2 as t2 on t1.id=t2.id;
/*inner join INFILE2 as t2 on t1.id=t2.id;*/
quit;
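Neither example above actually performs the date comparison from the question, so here is a minimal sketch of the full flow. The sample rows are made up for illustration, and I am assuming both datasets share a key called ID as in the examples above.

```sas
/* Hypothetical sample data; ID, DATE1 and DATE2 follow the names used above */
data infile1;
  input id date1 :date9.;
  format date1 date9.;
  datalines;
1 15JAN2013
2 01MAR2013
;
run;

data infile2;
  input id date2 :date9.;
  format date2 date9.;
  datalines;
1 01JAN2013
2 20MAR2013
;
run;

/* MERGE with a BY statement needs both inputs sorted by the key */
proc sort data=infile1; by id; run;
proc sort data=infile2; by id; run;

data want;
  merge infile1(in=in1) infile2(in=in2);
  by id;
  /* keep only matched rows where DATE1 is later than DATE2 */
  if in1 and in2 and date1 > date2 then output;
run;
```

Once the two dates sit on the same observation, the comparison is an ordinary IF statement.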
I have a do loop in which I do a calculation on a new variable, and the results are stored as an additional column. These columns (one per iteration) should be attached to the output table defined by the macro.
Something similar has been asked here on SO, but the answers are not acceptable. The last answer is very close but is not compatible with SAS; it gives an incomplete script like the following:
proc sql;
update &outlib..&out.
set &var._iqr = b.&var._iqr
from &outlib..&out. as a
left join cal_resul as b
on a.id_client=b.id_client
and a.reference_date=b.reference_date;
quit;
Here is my attempt, which works but is very slow:
proc sql; create table &outlib..&out. as select * from &inlib..&in.; quit; /* the input is as a basis for output table */
proc sql; alter table &outlib..&out. add &var._iqr numeric; quit; /* create empty column to be filled at each iteration */
proc sql;
update &outlib..&out. as a
set &var._iqr=(select b.&var._iqr from cal_resul as b
where a.id_client=b.id_client
and a.reference_date=b.reference_date
and a.data_source=b.data_source);
quit;
Attempt 2:
This is somewhat faster:
proc sort data=cal_resul; by id_client reference_date data_source; run;
data &outlib..&out.;
update &outlib..&out. cal_resul;
by id_client reference_date data_source;
run;
A simple left join (adding a new column to an existing table) is way faster, but with a left join I could not figure out how to update &outlib..&out. in place (always retaining the same dataset) at each iteration. Many thanks for any help.
If you want to ADD a variable to a dataset you will have to make a new dataset. (Your ALTER TABLE statement will create a new dataset and copy over all of the observations.)
Looks like your data has three key variables. So use those in merging the new data to the old.
For example to make a new variable in HAVE named EXAMPLE_IQR using the variable EXAMPLE in the dataset NEW you could use code like this. I have used macro variables to show how you might use those macro variables as the parameters to a macro. It sounds like you don't want the process to add new observations to the existing dataset so I have added a check for that using the IN= dataset option.
%let base=work.have;
%let indata=work.new;
%let var=example;
data &base ;
merge &base(in=inbase)
&indata(keep=id_client reference_date data_source &var
rename=(&var=&var._iqr)
)
;
by id_client reference_date data_source;
if inbase;
run;
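To show how that step might look as an actual macro, here is a rough sketch. The macro name add_iqr and its parameter names are my own invention, not something from your code, and I have added the sorts because MERGE with BY needs both inputs ordered by the keys.

```sas
/* Hypothetical macro name and parameters; adjust to your own naming */
%macro add_iqr(base=, indata=, var=);
  /* both inputs must be sorted by the three key variables */
  proc sort data=&base;   by id_client reference_date data_source; run;
  proc sort data=&indata; by id_client reference_date data_source; run;

  data &base;
    merge &base(in=inbase)
          &indata(keep=id_client reference_date data_source &var
                  rename=(&var=&var._iqr));
    by id_client reference_date data_source;
    if inbase;   /* keep only observations already in the base table */
  run;
%mend add_iqr;

/* Example call using the dataset names from the %LET statements above */
%add_iqr(base=work.have, indata=work.new, var=example);
```

Calling it once per iteration of your loop would add one new column each time while keeping the same output dataset name.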
I have two pieces of code: one PROC SQL step, and another with a PROC SORT and a DATA step. The datasets they use are interlinked.
Below are the PROC SQL lines.
create table new as
select a.id, a.alid, b.pdate
from tb a
inner join tb1 act on a.aid = act.aid
left join tb2 as b on (b.alid = a.alid)
where a.did in (15,45);
quit;
Below are the PROC and DATA steps that use the dataset new created above.
proc sort data = new uodupkey;
by alid;
data new1;
set new;
format ddate date9.;
dat1=datepart(today);
datno=input(number,20.);
key=_n_;
rename alid=blid;
run;
proc sort data=new1 nodupkey;
by datno dat1;
run;
I need to put everything into single proc sql step.
You mention two data steps but I only see one.
Anyway, your data step and proc sort can indeed be written in one sql query (which you can then insert in your proc sql):
proc sql;
create table new1 as
select id
,alid as blid
,pdate
,datepart(today) as dat1
,input(number,20.) as datno
,monotonic() as key
from new
group by datno, dat1
having key=min(key)
;
quit;
One remark though. Your data step expects variables called ddate, today and number in your input dataset new. If that dataset is supposed to be the result of your first sql query, then those variables don't exist and their values along with those of dat1 and datno in new1 will always be missing.
Also I assume you misspelled nodupkey on your proc sort.
EDIT: or, to have it all in the same query (if that's what you meant with "the same proc sql"):
proc sql;
create table new1 as
select id
,alid as blid
,pdate
,datepart(today) as dat1
,input(number,20.) as datno
,monotonic() as key
from (
select a.id,a.alid,b.pdate
from tb a
inner join tb1 act
on a.aid =act.aid
left join tb2 as b
on (b.alid=a.alid)
where a.did in (15,45)
)
group by datno, dat1
having key=min(key)
;
quit;
I am trying to read a list of values into a macro, so that the macro variable would contain the table name and create a column that would contain the table name.
My attempt, which is wrong, was trying to use the code below, and erroring out because of the line " '&tbl' as Table_Dt ". The code below is inefficient, so feel free to enhance it. Thanks for your help.
%macro flat(tbl);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
'&tbl' as Table_Dt
FROM &tbl..flat_file;
QUIT;
%mend flat;
%flat(flat0113);
%flat(flat0213);
...
%flat(flat1213);
As you are basically processing a list, this could also be done using call execute. No need to write all the information to macro variables. All tables/libraries are already stored in the sashelp tables and therefore are ready for list processing.
data _null_;
set sashelp.vslib (where=(substr(libname,1,4) = 'FLAT')) end =eof;
if _n_ = 1 then call execute ('proc sql exec feedback stimer noprint outobs=5;');
call execute ('
CREATE TABLE '|| libname ||' AS
SELECT ID,
DOB,
"'||compress(libname)||'" as Table_Dt
FROM '||compress(libname)||'.flat_file
;
');
if eof then call execute ('QUIT;');
run;
Macro variables inside quotation marks resolve only within double quotes, not single quotes. If you want a more efficient approach, you can use the following modified code. I am assuming that you are reading from libraries named flat0113, flat0213, etc.
Step 1: Get a list of all the libnames with the word "flat" in it
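proc sql noprint;
/* space-separated list of the matching libnames */
select distinct libname
into :tbl_list separated by ' '
from sashelp.vmember
where libname LIKE 'FLAT%'
;
/* number of distinct libnames in that list */
select count(distinct libname)
into :total_tbls
from sashelp.vmember
where libname LIKE 'FLAT%'
;
quit;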
This will create two macro variables: &tbl_list, and &total_tbls.
&tbl_list holds the values flat0113 flat0213 flat ... flat1213.
&total_tbls holds the total number of values in &tbl_list.
Step 2: Loop through the newly created list
%macro readTables;
%do i = 1 %to &total_tbls;
%let tbl = %scan(&tbl_list, &i);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
"&tbl" as Table_Dt
FROM &tbl..flat_file;
quit;
%end;
%mend;
%readTables;
This will read each individual value from &tbl_list one by one until the very end of the list.
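To see the quoting rule from the start of this answer in isolation, here is a tiny sketch. The value assigned to tbl is just an example taken from your macro calls.

```sas
%let tbl = flat0113;   /* example value, as in %flat(flat0113) */

data _null_;
  single = '&tbl';     /* single quotes: the macro reference is left as text */
  double = "&tbl";     /* double quotes: the macro variable is resolved */
  put single= double=;
run;
```

The log should show single=&tbl and double=flat0113, which is why the original '&tbl' as Table_Dt did not give the table name.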
I know that in Teradata or other SQL platforms you can find the distinct count of a combination of variables by doing:
select count(distinct x1||x2)
from db.table
And this will give the number of unique combinations of x1,x2 pairs.
This syntax, however, does not work in proc sql.
Is there any way to perform such a count in proc sql?
Thanks.
That syntax works perfectly fine in PROC SQL.
proc sql;
select count(distinct name||sex)
from sashelp.class;
quit;
If the fields are numeric, you must convert them to character (using put) or use cat or one of its siblings, which happily take either numeric or character arguments.
proc sql;
select count(distinct cats(age,sex))
from sashelp.class;
quit;
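One caveat with plain concatenation is that two different pairs can collapse into the same string once blanks are stripped. A sketch of two ways around that, again on sashelp.class; the '|' delimiter is my own choice and assumes that character never appears in the data.

```sas
proc sql;
  /* A delimiter keeps ('1','12') and ('11','2') from both becoming '112' */
  select count(distinct catx('|', age, sex)) as n_pairs
    from sashelp.class;

  /* Alternative without concatenation: count the rows of a DISTINCT subquery */
  select count(*) as n_pairs
    from (select distinct age, sex from sashelp.class);
quit;
```

The subquery form avoids any dependence on a delimiter and works the same way for numeric and character columns.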
This may be redundant, but when you mentioned "combination", it instantly triggered 'permutation' in my mind. So here is one solution to differentiate these two:
DATA TEST;
INPUT (X1 X2) (:$8.);
CARDS;
A B
B A
C D
C D
;
PROC SQL;
SELECT COUNT(*) AS TOTAL, COUNT(DISTINCT CATS(X1,X2)) AS PERMUTATION,
COUNT(DISTINCT CATS(IFC(X1<=X2,X1,X2),IFC(X1>X2,X1,X2))) AS COMBINATION
FROM TEST;
QUIT;
I merge two data sets as follows:
data ds3;
merge ds1(in=in1) ds2(in=in2);
by mrgvar;
if in1;
if in2 then flag=1;
run;
If I were to do this with a PROC SQL step instead, how can I set the flag variable as above?
proc sql;
create table ds3 as
select a.*
,b.*
,???
from ds1 as a
left join
ds2 as b
on a.mrgvar=b.mrgvar;
quit;
A common way is to use the table alias with the join variable.
proc sql;
create table ds3 as
select a.*
,b.*
,case when b.mrgvar is null then 0 else 1 end as flag
from ds1 as a
left join
ds2 as b
on a.mrgvar=b.mrgvar;
quit;
Something to that effect - if b.mrgvar is null/missing then it's only coming from table a. (Yes, you can separately reference the two even though they're basically the same and get combined in the result table.)
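One small difference from the data step: there, flag is left missing (not 0) when there is no match in ds2. If you need that exact behaviour, a sketch like the following should do it. The sample rows and the columns x and y are made up; only mrgvar comes from your code.

```sas
/* Hypothetical sample data; only MRGVAR comes from the question */
data ds1;
  input mrgvar x;
  datalines;
1 10
2 20
3 30
;
run;

data ds2;
  input mrgvar y;
  datalines;
1 100
3 300
;
run;

proc sql;
  create table ds3 as
  select a.*
        ,b.y
        /* 1 when ds2 contributed a row, missing otherwise */
        ,case when b.mrgvar is not null then 1 else . end as flag
  from ds1 as a
  left join ds2 as b
    on a.mrgvar = b.mrgvar;
quit;
```

Selecting b.y instead of b.* also avoids the log warning about mrgvar already existing in the output table.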