Merging multiple datasets together - sas

Hello, I have a listing I'm struggling with, as I don't think the code I am using does the job correctly. Here is the spec.
Dataset: Firstly, merge QS with SUPPQS by USUBJID, IDVARVAL=QSSEQ, keep only records where QSCAT='SOFA'. Then merge with ADSOFA by USUBJID and QSSEQ. Only keep records where MITTFL='Y'
And here is the code I'm using:
proc sql;
create table qs (where=(qscat="SOFA" )) as
select a.*,b.qnam as SOFASCS,qval as avalc_qs from trans.qs as a
left join
trans.suppqs (where=(qnam='SOFASCS')) as b
on a.usubjid = b.usubjid and a.qsseq = input(b.idvarval,best.);
quit;
proc sort data=qs;
by usubjid qsseq;
run;
data adsofa;
set adb.adsofa;
run;
proc sort data=adsofa;
by usubjid qsseq;
run;
data qs01;
merge qs(in=a drop=studyid)
adsofa(in=b where=(mittfl = "Y"));
by usubjid qsseq;
if a or b;
run;
I keep getting rows I don't want. Is there a cleaner way of doing this?

I tried to convert your logic into classic SQL.
proc sql;
create table qs as
select a.*
,b.qnam as SOFASCS
,qval as avalc_qs
from trans.qs as a
left join trans.suppqs as b
on a.usubjid = b.usubjid and a.qsseq = input(b.idvarval,best.) and qnam='SOFASCS'
where qscat="SOFA" ;
quit;
proc sql;
create table qs01 as
select qs.*, a.*
from qs
full /* left? */ join adb.adsofa as a
on a.usubjid = qs.usubjid and a.qsseq = qs.qsseq and mittfl = "Y"
;
quit;
I assume that you did not really want a full join in the last step, but a simple left join.
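If the spec is read literally ("only keep records where MITTFL='Y'"), the filter probably belongs in a WHERE clause rather than in the ON condition: a condition in ON only decides which rows match, it does not drop unmatched rows from an outer join. A minimal sketch, assuming the qs table built in the first query above, and assuming that only records present in both tables are wanted (that second point is my reading, the spec does not say so explicitly):

```sas
proc sql;
    create table qs01 as
    select qs.*
          ,a.mittfl                     /* assumes MITTFL is not already in QS */
    from qs
    inner join adb.adsofa as a
        on a.usubjid = qs.usubjid and a.qsseq = qs.qsseq
    where a.mittfl = "Y";               /* filter applied after the join */
quit;
```

If QS records without an ADSOFA match should be kept after all, this would need to become a left join, and the WHERE condition would then have to allow for a missing mittfl.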

Related

How do I merge the results from proc sql count into one table?

I used proc sql to count the number of observations in 4 different tables. Now how do I merge these 4 results so that I get one nice table? Thanks.
SQL DICTIONARY.TABLES might be what you want.
Example:
proc sql;
create table want as
select libname, memname, nobs
from dictionary.tables
where libname = 'SASHELP'
and upcase(memname) in ('CARS', 'CLASS', 'AIR', 'BASEBALL')
;
quit;
There are many ways to merge tables in SAS, and you can even count the observations and combine the results in a single step. Assuming you have four datasets with one row each (one count per dataset), you can do a simple merge step. For example, with two of them:
data count1;
n_obs1 = 100;
run;
data count2;
n_obs2 = 200;
run;
data want;
merge count1 count2;
run;
You could also do everything in a single SQL step.
proc sql;
create table want as
select nobs_1, nobs_2
from (select count(*) as nobs_1 from sashelp.cars)
, (select count(*) as nobs_2 from sashelp.class)
;
quit;
Proc sql;
Create table counts as
Select count(*) as count, "t1" as source from t1
Union select count(*) as count, "t2" as source from t2
Union select count(*) as count, "t3" as source from t3
Union select count(*) as count, "t4" as source from t4;
Quit;

Put everything into sas sql

I have two pieces of code: one PROC SQL step, and one PROC SORT plus data step. The second works on the dataset created by the first.
Below are the PROC SQL lines.
proc sql;
create table new as select a.id,a.alid,b.pdate from tb a
inner join tb1 act on a.aid=act.aid
left join tb2 as b on (b.alid=a.alid)
where a.did in (15,45);
quit;
Below are the PROC SORT and data step that use the dataset new created above.
proc sort data = new uodupkey;
by alid;
data new1;
set new;
format ddate date9.;
dat1=datepart(today);
datno=input(number,20.);
key=_n_;
rename alid blid;
run;
proc sort data=new1 nodupkey;
by datno dat1;
run;
I need to put everything into single proc sql step.
You mention two data steps but I only see one.
Anyway, your data step and proc sort can indeed be written in one sql query (which you can then insert in your proc sql):
proc sql;
create table new1 as
select id
,alid as blid
,pdate
,datepart(today) as dat1
,input(number,20.) as datno
,monotonic() as key
from new
group by datno, dat1
having key=min(key)
;
quit;
One remark though. Your data step expects variables called ddate, today and number in your input dataset new. If that dataset is supposed to be the result of your first SQL query, then those variables don't exist, and their values, along with those of dat1 and datno in new1, will always be missing.
Also I assume you misspelled nodupkey on your proc sort.
EDIT: or, to have it all in the same query (if that's what you meant with "the same proc sql"):
proc sql;
create table new1 as
select id
,alid as blid
,pdate
,datepart(today) as dat1
,input(number,20.) as datno
,monotonic() as key
from (
select a.id,a.alid,b.pdate
from tb a
inner join tb1 act
on a.aid =act.aid
left join tb2 as b
on (b.alid=a.alid)
where a.did in (15,45)
)
group by datno, dat1
having key=min(key)
;
quit;

SAS insert value with proc sql

So I have a rather interesting problem. I am trying to insert the current date in specific formats and styles, but for some reason it fails. I know it's not a formatting issue, but I don't know how to fix it. A data step solution is welcome as well. Here's what works.
proc sql;
create table work.test
(test_Id char(50), test_Name char(50), cur_Mo char(1), cur_Qtr char(1), entered_Date char(8));
insert into work.test
values('201703','2017 Mar','0','0','24APR17')
values('201704','2017 Apr','0','0','24APR17')
values('201706','2017 Jun','1','0','23JUN17');
quit;
Here's what doesn't:
proc sql;
insert into work.test
values(catx('',put(year(today()),4.),case when month(today())< 10 then catx('','0',put(month(today()),2.)) else put(month(today()),2.)end) ,catx(' ',[put(year(today()),4.),put(today(),monname3.))],'1','0',put(today(),date7.));
quit;
You can use the %SYSFUNC() macro function to call most other SAS functions in macro code. So to generate today's date in DATE7 format you could use:
insert into work.test (entered_Date)
values("%sysfunc(date(),date7.)")
;
The way I'd probably do it is to use a data step to make a dataset that you would insert, and then insert that dataset.
You can use insert into (...) select (...) from (...) syntax in SAS, and the data step is much more flexible as to allowing you to define columns.
For example:
proc sql;
create table class like sashelp.class;
quit;
proc sql;
insert into class
select * from sashelp.class;
quit;
Or you can specify only certain variables:
proc sql;
insert into class (name, age)
select name, age from sashelp.class;
quit;
data to_insert;
name= 'Wilma';
sex = 'F';
age = 29;
height = 61.2;
weight = 95.3;
run;
proc sql;
insert into class
select * from to_insert;
quit;
Just make sure you either explicitly list the variables to insert/select, or you have the order exactly right (it matches up by position if you use * like I do above).
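Applied to the table from the question, the data step approach could look like the sketch below. As far as I know, the VALUES clause only accepts constants, which is why the function calls in the failing statement are rejected; a data step has no such restriction. The column names and lengths are taken from the create table statement in the question, while the use of the YYMMN6. format for test_Id is my own choice, not something the question specifies.

```sas
/* Build one row from today's date, then insert it into work.test. */
data to_insert;
    length test_Id test_Name $50 cur_Mo cur_Qtr $1 entered_Date $8;
    test_Id      = put(today(), yymmn6.);      /* year and month as yyyymm   */
    test_Name    = catx(' ', year(today()), put(today(), monname3.)); /* year, 3-letter month */
    cur_Mo       = '1';
    cur_Qtr      = '0';
    entered_Date = put(today(), date7.);       /* ddMONyy                    */
run;

proc sql;
    /* Columns are listed explicitly so nothing depends on variable order. */
    insert into work.test (test_Id, test_Name, cur_Mo, cur_Qtr, entered_Date)
    select test_Id, test_Name, cur_Mo, cur_Qtr, entered_Date
    from to_insert;
quit;
```

YYMMN6. pads single-digit months with a zero, so the case expression for months below 10 is not needed.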

SAS : How to iterate a dataset elements within the proc sql WHERE statement?

I need to create multiple tables using proc sql
proc sql;
/* first city */
create table London as
select * from connection to myDatabase
(select * from mainTable
where city = 'London');
/* second city */
create table Beijing as
select * from connection to myDatabase
(select * from mainTable
where city = 'Beijing');
/* . . the same thing for other cities */
quit;
The names of those cities are in the SAS table myCities.
How can I embed a data step into proc sql in order to iterate through all cities?
proc sql noprint;
select quote(city_varname) into :cities separated by ',' from myCities;
quit;
*This step above creates a list as a macro variable to be used with the in() operator below. EDIT: Per Joe's comment, added quote() function so that each city will go into the macro-var list within quotes, for proper referencing by in() operator below.
create table all_cities as
select * from connection to myDatabase
(select * from mainTable
where city in (&cities));
*this step is just the step you provided in your question, slightly modified to use in() with the macro-variable list defined above.
One relatively simple solution to this is to do this entirely in a data step. Assuming you can connect via libname (which if you can connect via connect to you probably can), let's say the libname is mydb. Using a similar construction to Max Power's for the first portion:
proc sql noprint;
select city_varname
into :citylist separated by ' '
from myCities;
select cats('%when(var=',city_varname,')')
into :whenlist separated by ' '
from myCities;
quit;
%macro when(var=);
when ("&var.") output &var.;
%mend when;
data &citylist.;
set mydb.mainTable;
select(city);
&whenlist.;
otherwise;
end;
run;
If you're using most of the data in mainTable, this probably wouldn't be much slower than doing it database-side, as you're moving all of the data anyway - and likely it would be faster since you only hit the database once.
Even better would be to pull this to one table (like Max shows), but this is a reasonable method if you do need to create multiple tables.
You need to put your proc sql code into a SAS macro.
Create a macro parameter for the city (in my example I called it "city").
Execute the macro from a data step. Since the data step runs once for each observation, there is no need to create complex logic to iterate.
data mycities;
infile datalines dsd;
input macrocity $ 32.;
datalines;
London
Beijing
Buenos_Aires
;
run;
%macro createtablecity(city=);
proc sql;
/* all cities */
create table &city. as
select * from connection to myDatabase
(select * from mainTable
where city = "&city.");
quit;
%mend;
data _null_;
set mycities;
city = macrocity;
call execute('%createtablecity(city='||strip(city)||')');
run;
Similar to the other solutions here really, maybe a bit simpler. Pull out a distinct list of cities, place them into macro variables, and run the SQL query within a %do loop.
Proc sql noprint;
Select distinct city
Into :n1-:n999
From connection to mydb
(Select *
From mainTable)
;
%let c=&sqlobs; /* number of distinct cities returned */
Quit;
%macro createTables;
%do a=1 %to &c;
Proc sql;
Create table &&n&a as
Select *
From connection to myDb
(Select *
From mainTable
Where city="&&n&a")
;
Quit;
%end;
%mend createTables;
%createTables;

Data step merge PROC SQL equivalent flagging which table record was found in

I merge two data sets as follows:
data ds3;
merge ds1(in=in1) ds2(in=in2);
by mrgvar;
if in1;
if in2 then flag=1;
run;
If I were to do this with a PROC SQL step instead, how can I set the flag variable as above?
proc sql;
create table ds3 as
select a.*
,b.*
,???
from ds1 as a
left join
ds2 as b
on a.mrgvar=b.mrgvar;
quit;
A common way is to use the table alias with the join variable.
proc sql;
create table ds3 as
select a.*
,b.*
,case when b.mrgvar is null then 0 else 1 end as flag
from ds1 as a
left join
ds2 as b
on a.mrgvar=b.mrgvar;
quit;
Something to that effect - if b.mrgvar is null/missing then it's only coming from table a. (Yes, you can separately reference the two even though they're basically the same and get combined in the result table.)
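To see the flag in action, here is a small self-contained sketch. The datasets and the variables x and y are made up for illustration; only mrgvar and flag come from the question.

```sas
/* Hypothetical sample data: mrgvar 2 exists only in ds1. */
data ds1;
    input mrgvar x;
    datalines;
1 10
2 20
3 30
;
run;

data ds2;
    input mrgvar y;
    datalines;
1 100
3 300
;
run;

proc sql;
    create table ds3 as
    select a.*
          ,b.y
          ,case when b.mrgvar is null then 0 else 1 end as flag
    from ds1 as a
    left join ds2 as b
        on a.mrgvar = b.mrgvar;
quit;
```

With this data, the rows for mrgvar 1 and 3 get flag=1 and the row for mrgvar 2 gets flag=0. Note one small difference from the data step version, where flag is left missing rather than set to 0 for unmatched rows; drop the else branch of the case expression if you want the same behaviour.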