I have three pieces of code. How can I combine them into one so that it looks elegant? data1: pull data with some conditions; data2: data1 left joined to new data; data3: set data2 and create a new variable.
proc sql; create table data1 as select
a.ID,
b.decison_CD,
c.type
from
dataA a,
dataB b,
dataC c
where a.ID=b.ID
and a.ID=c.ID
and c.type not in ('Unknown')
and b.decison_CD in ('Y','N')
; quit;
proc sql;
create table data2 as select
a.*
,b.payId
from data1 a
left join datanew b
on a.ID=b.ID;
quit;
data data3;
set data2;
if payID= . then booked =0;
else if payID=1 then booked=1;
run;
You can do this in a single query by treating datanew as a fourth dataset. Keep the inner joins (with their filters) for dataB and dataC, and use a left join only for datanew:
proc sql;
create table data3 as select
a.ID, b.decison_CD, c.type, d.payId,
case when missing(d.payId) then 0
when d.payId = 1 then 1 end as booked
from dataA as a
inner join dataB (where = (decison_CD in ('Y','N'))) as b on a.ID = b.ID
inner join dataC (where = (type not in ('Unknown'))) as c on a.ID = c.ID
left join datanew as d on a.ID = d.ID;
quit;
I have a dataset with two columns of IDs, ID_A and ID_B. Each row contains two IDs that I believe belong to the same person. Because of this, each combination shows up twice. For example:
ID_A ID_B
A B
C D
B A
D C
What I want is to remove the repetition, i.e., if I have the row A, B, I don't need the row B, A:
ID_A ID_B
A B
C D
Any idea how to do this in SAS?
How about this...
data have;
input (ID_A ID_B)($);
cards;
A B
C D
B A
D C
;;;;
run;
data haveV / view=haveV;
set have;
call sortc(of id:);
run;
proc sort data=haveV nodupkey out=want;
by id:;
run;
proc print;
run;
The answer by data null is perfect and robust. You can also try proc sql, as shown below:
proc sql;
create table want as
select distinct
case when ID_A le ID_B then ID_A else ID_B end as ID_A,
case when ID_A ge ID_B then ID_A else ID_B end as ID_B
from have;
quit;
I have a table1 that contains four different kinds of IDs:
Data table1;
Input id1 $ id2 $ id3 $ final_id $;
Datalines;
1 a a1 p
2 b b2 q
- c c2 r
3 d - s
4 - d4 t
;
run;
A second table, table2, contains any of the IDs from id1, id2 or id3 of table1:
Data table2;
Input id $ col1 $ col2 $;
Datalines;
1 gsh ywu
b hsjs kall
c2 jsjs ywe
3 sja weei
d4 ase uwh
;
run;
I want to left join table1 onto table2 so that I get a new column in table2 giving me final_id from table1.
How do I go about this?
Please help.
Thank you.
You can do it using SQL:
proc SQL noprint;
create table merged as
select b.final_id, a.*
from table2 as a left join table1 as b
on (a.id eq b.id1 or a.id eq b.id2 or a.id eq b.id3)
;
quit;
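One caveat with the OR condition: if a value of table2.id matches more than one row of table1 (say it appears as id1 in one row and as id2 in another), merged will contain one row per match. Here is a quick check, assuming id is meant to be unique in table2 and that the merged table from the query above exists (n_matches is just a name I made up):

```sas
proc sql;
    /* List table2 ids that picked up more than one final_id.   */
    /* Assumption: each id should appear only once in table2.   */
    select id, count(*) as n_matches
    from merged
    group by id
    having count(*) > 1;
quit;
```

If this prints no rows, every id matched at most one row of table1.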
In T-SQL I used to be able to do the following:
delete t1
from table1 t1
join table2 t2 on t1.rowid = t2.rowid and t1.value <> t2.value
I'd like to be able to do the same in SAS.
Taking the code above and wrapping it in proc sql; and quit; throws a syntax error.
Is the code below my only option?
proc sql;
delete from table1 t1
where t1.value <> (select t2.value from table2 t2 where t1.rowid = t2.rowid)
and t1.rowid in (select t2.rowid from table2 t2);
quit;
Thank you.
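As you have probably figured out, delete is not very efficient.
If you have the disk space, I would recommend creating a new table from the inner join (the records you want to keep), dropping table1, and renaming the result to table1. Note that an inner join also drops table1 rows that have no match in table2, which the T-SQL delete would keep; use a left join if that matters for your data.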
%let n=1000000;
data table1;
do rowid=1 to &n;
value = rowid**2;
output;
end;
run;
data table2;
do rowid=1 to &n;
value = (mod(rowid,2)=1)*rowid**2;
output;
end;
run;
proc sql noprint;
create table table1_new as
select a.*
from table1 as a
inner join
table2 as b
on a.rowid=b.rowid
and
a.value = b.value;
drop table table1;
quit;
proc datasets lib=work nolist;
change table1_new = table1;
run;
quit;
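If you would rather delete in place and stay close to the T-SQL version, a correlated EXISTS subquery is one way to write it. This is only a sketch against the same table1 and table2 as above; on a million rows it may well be slow unless table2 is indexed on rowid:

```sas
proc sql;
    /* Delete table1 rows that have a matching rowid in table2  */
    /* but a different value. Rows with no match are kept.      */
    delete from table1 as t1
    where exists (
        select 1
        from table2 as t2
        where t2.rowid = t1.rowid
          and t2.value ne t1.value
    );
quit;
```

Unlike the inner join rebuild, this keeps rows of table1 whose rowid does not appear in table2 at all.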
I have two datasets. Both have a common column, ID. I would like to check whether each ID from df1 occurs in df2 and extract all such rows from df1. I'm doing this in SAS.
It is easily done in one sql query.
proc sql;
create table extract_from_df1 as
select
*
from
df1
where
id in (select id from df2)
;
quit;
There are lots of ways to do this. For example:
proc sql;
create table compare as select distinct
a.id as id1, b.id as id2
from table1 as a
left join table2 as b
on a.id = b.id;
quit;
and then keep matches. Or you can try:
proc sql;
delete from table2 where id in (select distinct id from table1);
quit;
data df1;
input id name $;
cards;
1 abc
2 cde
3 fgh
4 ijk
;
run;
data df2;
input id address $;
cards;
1 abc
2 cde
5 ggh
6 ihh
7 jjj
;
run;
data c;
merge df1(in=x) df2(in=y);
by id; /* required for a match-merge; both datasets must be sorted by id */
if x and y;
keep id name;
run;
proc print data=c;
run;
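If the datasets are not sorted by id, a hash lookup in a data step is another option that avoids sorting. This is a minimal sketch using the df1 and df2 datasets above; the hash object name h and the output name extract_from_df1 are arbitrary choices of mine:

```sas
data extract_from_df1;
    /* Load the ids from df2 into a hash table once, on the first iteration */
    if _n_ = 1 then do;
        declare hash h(dataset: 'df2(keep=id)');
        h.defineKey('id');
        h.defineDone();
    end;
    set df1;
    /* check() returns 0 when the current id exists in the hash */
    if h.check() = 0;
run;
```

It assumes the ids from df2 fit in memory, which is usually fine for a single key column.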
I have one column of data and the column is named (Daily_Mileage). I have 15 different types of daily mileages and 250 rows. I want a separate count for each of the 15 daily mileages. I am using PROC SQL in SAS and it does not like the Cross join command. I am not really sure what I should do but this is what I started:
PROC SQL;
select A, B
From (select count(Daily_Mileage) as A from Work.full where Daily_Mileage = 'Farm Utility Vehicle (Class 7)') a
cross join (select count(Daily_Mileage) as B from Work.full where Daily_Mileage = 'Farm Truck Light (Class 35)') b);
QUIT;
Use case statements to define your counts as below.
proc sql;
create table submit as
select sum(case when Daily_Mileage = 'Farm Utility Vehicle (Class 7)'
then 1 else 0 end) as A,
sum(case when Daily_Mileage = 'Farm Truck Light (Class 35)'
then 1 else 0 end) as B
from Work.full
;
quit ;
Can't you just use a proc freq?
data example ;
input @1 Daily_Mileages $5. ;
datalines ;
TYPE1
TYPE1
TYPE2
TYPE3
TYPE3
TYPE3
TYPE3
;
run ;
proc freq data = example ;
table Daily_Mileages ;
run ;
/* Create an output dataset */
proc freq data = example ;
table Daily_Mileages /out=f_example ;
run ;
You can sum a constant 1 for each row (which is the same as COUNT(*)) and GROUP BY Daily_Mileage. Let me know if I'm misunderstanding your question.
PROC SQL;
CREATE TABLE tab1 AS
SELECT Daily_Mileage, SUM(1) AS Count
FROM Work.full /* replace with whatever table your data is in */
GROUP BY Daily_Mileage;
QUIT;
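If you really need the counts side by side in one row, as in the cross join attempt in the question, you could count per group and then transpose. This is a rough sketch assuming the data is in Work.full; counts, counts_wide and n are names I made up. The ID statement turns each Daily_Mileage value into a variable name, so under the default naming rules spaces and parentheses become underscores and long values are truncated to 32 characters:

```sas
proc sql;
    /* One row per mileage type with its row count */
    create table counts as
    select Daily_Mileage, count(*) as n
    from Work.full
    group by Daily_Mileage;
quit;

/* Turn the rows into columns: one variable per mileage type */
proc transpose data=counts out=counts_wide(drop=_name_);
    id Daily_Mileage;
    idlabel Daily_Mileage;   /* keeps the original text as the variable label */
    var n;
run;
```

If two mileage types collapse to the same variable name after conversion, proc transpose will complain, so check the log.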