I have two pieces of code: one is a proc sql step, and the other is a proc sort plus a data step. They are linked, because the second reads the dataset created by the first.
Below is the proc sql code.
proc sql;
create table new as
select a.id, a.alid, b.pdate
from tb a
inner join tb1 act on a.aid = act.aid
left join tb2 as b on (b.alid = a.alid)
where a.did in (15,45);
quit;
Below are the proc sort and data steps that run on the dataset new created above.
proc sort data = new uodupkey;
by alid;
data new1;
set new;
format ddate date9.;
dat1=datepart(today);
datno=input(number,20.);
key=_n_;
rename alid=blid;
run;
proc sort data=new1 nodupkey;
by datno dat1;
run;
I need to put everything into a single proc sql step.
You mention two data steps but I only see one.
Anyway, your data step and proc sort can indeed be written in one sql query (which you can then insert in your proc sql):
proc sql;
create table new1 as
select id
,alid as blid
,pdate
,datepart(today) as dat1
,input(number,20.) as datno
,monotonic() as key
from new
group by datno, dat1
having key=min(key)
;
quit;
One remark though. Your data step expects variables called ddate, today and number in your input dataset new. If that dataset is the result of your first sql query, those variables don't exist, so their values, along with those of dat1 and datno in new1, will always be missing.
Also I assume you misspelled nodupkey on your proc sort.
EDIT: or, to have it all in the same query (if that's what you meant with "the same proc sql"):
proc sql;
create table new1 as
select id
,alid as blid
,pdate
,datepart(today) as dat1
,input(number,20.) as datno
,monotonic() as key
from (
select a.id,a.alid,b.pdate
from tb a
inner join tb1 act
on a.aid =act.aid
left join tb2 as b
on (b.alid=a.alid)
where a.did in (15,45)
)
group by datno, dat1
having key=min(key)
;
quit;
Related
The sample data below is from an Oracle database.
promo flag
vijay a
vijay b
vijay c
sam b
sam g
sam c
I have one proc sql step connected to Oracle (I have left out the Oracle connection below):
proc sql;
create table a as select * from new;
quit;
Then there are two proc sort steps based on the above dataset a.
proc sort data = a;
by promo descending flag;
run;
proc sort data =a nodupkey out =new1;
by promo;
run;
Now I want to do these two proc sort steps inside the proc sql step itself. Any idea how to do this?
proc sql;
create table want as
select distinct promo,flag from new group by promo having flag=max(flag);
quit;
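To check the query above against the sample rows from the question, here is a minimal self-contained sketch. The data step is an assumption: it stands in for the Oracle table and simply recreates the six rows as a local dataset called new.

```sas
/* Stand-in for the Oracle table: the six sample rows from the question */
data new;
  input promo $ flag $;
  datalines;
vijay a
vijay b
vijay c
sam b
sam g
sam c
;
run;

proc sql;
  create table want as
  select distinct promo, flag
  from new
  group by promo
  /* keep only the row(s) carrying the highest flag within each promo */
  having flag = max(flag);
quit;
```

With this data the result should hold one row per promo: the highest flag for vijay (c) and for sam (g), which is what the descending sort followed by nodupkey keeps. The distinct matters only when the highest flag is repeated within a promo; nodupkey would keep a single row in that case, and distinct does the same here because only promo and flag are selected.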
I am a newbie to SAS Base, and I am struggling to create a simple program that extracts data from a table on my database, runs e.g. PROC MEANS, and writes the data back to the table.
I know how to use PROC SQL (read and update tables) and PROC MEANS, but I can't figure out how to combine the steps.
PROC SQL;
SELECT make,model,type,invoice,horsepower
FROM
SASHELP.CARS
;
QUIT;
PROC Means;
RUN;
What I want to accomplish is to create an additional column in the dataset with, e.g., the mean of the horsepower, and in the end write that computed column to the table on the database.
Edit
What I was looking for is this:
PROC SQL;
create table want as
select make,model,type,invoice,horsepower
, mean(horsepower) as mean_horsepower
from sashelp.cars
;
QUIT;
PROC MEANS DATA=want;
RUN;
SAS makes this very easy to do with SQL since it will automatically remerge summary statistics back to detailed records.
proc sql;
create table want as
select make,model,type,invoice,horsepower
, mean(horsepower) as mean_horsepower
from sashelp.cars
;
quit;
Or using normal SAS code.
proc means data=sashelp.cars nway noprint ;
var horsepower ;
output out=mean_horsepower mean=mean_horsepower ;
run;
data want ;
set sashelp.cars ;
if _n_=1 then set mean_horsepower (keep=mean_horsepower);
run;
I need to create multiple tables using proc sql:
proc sql;
/* first city */
create table London as
select * from connection to myDatabase
(select * from mainTable
where city = 'London');
/* second city */
create table Beijing as
select * from connection to myDatabase
(select * from mainTable
where city = 'Beijing');
/* . . the same thing for other cities */
quit;
The names of those cities are in the SAS table myCities.
How can I embed a data step into proc sql in order to iterate through all the cities?
proc sql noprint;
select quote(city_varname) into :cities separated by ',' from myCities;
quit;
The step above builds a comma-separated list in the macro variable cities, for use with the in() operator below. EDIT: Per Joe's comment, added the quote() function so that each city goes into the list within quotes, as in() requires.
proc sql;
create table all_cities as
select * from connection to myDatabase
(select * from mainTable
where city in (&cities));
quit;
This is the query from your question, modified to use in() with the macro-variable list defined above, so all the cities land in one table.
One relatively simple solution is to do this entirely in a data step. Assuming you can connect via libname (which you probably can if you can connect via connect to), let's say the libname is mydb. Using a similar construction to Max Power's for the first portion:
proc sql noprint;
select city_varname
into :citylist separated by ' '
from myCities;
select cats('%when(var=',city_varname,')')
into :whenlist separated by ' '
from myCities;
quit;
%macro when(var=);
when "&var." output &var.;
%mend when;
data &citylist.;
set mydb.mainTable;
select(city);
&whenlist.;
otherwise;
end;
run;
If you're using most of the data in mainTable, this probably wouldn't be much slower than doing it database-side, as you're moving all of the data anyway - and likely it would be faster since you only hit the database once.
Even better would be to pull this to one table (like Max shows), but this is a reasonable method if you do need to create multiple tables.
You need to put your proc sql code into a SAS macro.
Create a macro variable for the city (in my example I called the macro variable "city").
Execute the macro from a data step. Since a data step processes one observation per iteration, there is no need to write extra looping logic.
data mycities;
infile datalines dsd;
input macrocity :$32.;
datalines;
London
Beijing
Buenos_Aires
;
run;
%macro createtablecity(city=);
proc sql;
/* all cities */
create table &city. as
select * from connection to myDatabase
(select * from mainTable
where city = "&city.");
quit;
%mend;
data _null_;
set mycities;
/* the macro takes a keyword parameter, so pass city= explicitly */
call execute('%createtablecity(city='||strip(macrocity)||')');
run;
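The call execute pattern can be tried without a database. The sketch below is an illustration only: show_city is a hypothetical stand-in macro that just writes to the log, and the %nrstr() wrapper is an optional addition (not part of the answer above) that delays the macro call until after the data step has finished.

```sas
/* Hypothetical stand-in for the real macro: it only writes to the log,
   so the sketch runs without a database connection */
%macro show_city(city=);
  %put NOTE: would create table &city.;
%mend show_city;

data mycities;
  input macrocity :$32.;
  datalines;
London
Beijing
Buenos_Aires
;
run;

data _null_;
  set mycities;
  /* one macro call is queued per observation; cats() strips trailing blanks,
     and %nrstr() defers macro execution until the data step has ended */
  call execute(cats('%nrstr(%show_city)(city=', macrocity, ')'));
run;
```

Deferring the call matters mostly when the macro itself sets and then reads macro variables. For a macro that only generates a proc sql step, calling it with or without %nrstr() should behave the same.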
Similar to the other solutions here really, maybe a bit simpler... Pull out a distinct list of cities, place them into macro variables, then run the SQL query within a %do loop.
Proc sql noprint;
Select distinct city
Into :n1-:n999
From connection to mydb
(Select *
From mainTable)
;
Quit;
/* sqlobs holds the number of rows the select returned, i.e. the number of distinct cities */
%let c=&sqlobs;
%macro createTables;
%do a=1 %to &c;
Proc sql;
Create table &&n&a as
Select *
From connection to myDb
(Select *
From mainTable
Where city="&&n&a")
;
Quit;
%end;
%mend createTables;
%createTables;
I merge two data sets as follows:
data ds3;
merge ds1(in=in1) ds2(in=in2);
by mrgvar;
if in1;
if in2 then flag=1;
run;
If I were to do this with a PROC SQL step instead, how can I set the flag variable as above?
proc sql;
create table ds3 as
select a.*
,b.*
,???
from ds1 as a
left join
ds2 as b
on a.mrgvar=b.mrgvar;
quit;
A common way is to use the table alias with the join variable.
proc sql;
create table ds3 as
select a.*
,b.*
,case when b.mrgvar is null then 0 else 1 end as flag
from ds1 as a
left join
ds2 as b
on a.mrgvar=b.mrgvar;
quit;
Something to that effect - if b.mrgvar is null/missing then it's only coming from table a. (Yes, you can separately reference the two even though they're basically the same and get combined in the result table.)
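As a self-contained illustration of the join above, the sketch below uses two small made-up datasets (the values and the variables x and y are assumptions for the example). It sets flag to 1 or missing rather than 1 or 0, to mirror the original data step more closely.

```sas
/* Made-up sample data: mrgvar 1 exists only in ds1 */
data ds1;
  input mrgvar x;
  datalines;
1 10
2 20
3 30
;
run;

data ds2;
  input mrgvar y;
  datalines;
2 200
3 300
;
run;

proc sql;
  create table ds3 as
  select a.*
        ,b.y
        /* b.mrgvar is missing when no ds2 row matched, like in2=0 in the merge */
        ,case when b.mrgvar is not null then 1 else . end as flag
  from ds1 as a
  left join ds2 as b
    on a.mrgvar = b.mrgvar;
quit;
```

Here the row with mrgvar 1 should get a missing flag and the other two should get 1. Selecting b.y instead of b.* avoids the warning about mrgvar already existing in the output table. The comparison with the data step merge holds when mrgvar is unique in ds2 and never missing in ds1; with repeated keys on both sides the join and the merge produce different row counts.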
I have a dataset which has multiple obs per person. I want each record to show the sum of a variable per person ID. However, I do not want to collapse the data to one row per person ID. I hope the example below explains my question.
I want to create the SUM column. How can I do this in SAS EG (or SAS if necessary)?
ID   Var1   SUM
X    10     30
X    20     30
Y    20     80
Y    20     80
Y    40     80
Z    30     30
You can do this using either proc sql or proc means
proc sql:
proc sql noprint;
create table new_table as
/* no distinct here: identical rows within an id must all be kept */
select id, var1, sum(var_to_sum) as summed_var_name
from old_table
group by id
;
quit;
After rereading your question: with proc means you will need to merge var1 back in, so you are better off using the proc sql above.
proc means:
proc means data = old_table sum;
by id; /* old_table must already be sorted by id */
var var_to_sum;
output out = new_table sum = summed_var_name;
run;
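For the sample data in the question, a minimal sketch of the proc sql route could look like this. The dataset name have and the column name sum_var1 are assumptions for the example; it relies on proc sql remerging the group sum back onto every detail row.

```sas
/* The six sample rows from the question */
data have;
  input id $ var1;
  datalines;
X 10
X 20
Y 20
Y 20
Y 40
Z 30
;
run;

proc sql;
  create table want as
  /* var1 is not in the group by, so SAS remerges the per-id sum onto each row */
  select id, var1, sum(var1) as sum_var1
  from have
  group by id;
quit;
```

All six rows should be kept, with sum_var1 equal to 30 for X, 80 for Y and 30 for Z, and the log should carry a note that the query required remerging summary statistics. The order of rows within an id is not guaranteed, so add an order by clause if it matters.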