SAS: Creating disjoint data sets using SET

Is there a way in SAS to create disjoint data sets using the SET statement?
I have tried:
DATA OnlyFirst OnlySecond InBoth;
SET firstds(IN=A)
seconds(IN=B);
IF A AND NOT B THEN OUTPUT OnlyFirst;
IF B AND NOT A THEN OUTPUT OnlySecond;
IF A AND B THEN OUTPUT InBoth;
Run;
But this does not create disjoint sets.

That's not how the SET statement works. SET reads the data sets one after another, so each observation comes from exactly one of them and A and B are never both true. You should be able to use MERGE instead, provided firstds and seconds are both sorted by a key variable (or variables) that they share. You then need to name that shared variable in a BY statement.
DATA OnlyFirst OnlySecond InBoth;
merge firstds(IN=A)
seconds(IN=B);
by <some shared variable>;
IF A AND NOT B THEN OUTPUT OnlyFirst;
IF B AND NOT A THEN OUTPUT OnlySecond;
IF A AND B THEN OUTPUT InBoth;
Run;
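For completeness, here is a rough sketch of the whole sequence including the sorts. The key variable name id is purely an assumption for illustration; substitute whatever variable(s) the two data sets actually share.

```sas
/* "id" is a hypothetical shared key; replace it with your own. */
proc sort data=firstds; by id; run;
proc sort data=seconds; by id; run;

data OnlyFirst OnlySecond InBoth;
  merge firstds(in=A) seconds(in=B);
  by id;
  /* Within a BY group, A and B flag which inputs contributed. */
  if A and not B then output OnlyFirst;
  else if B and not A then output OnlySecond;
  else output InBoth;
run;
```

Because every merged observation has at least one of A or B set, the three branches are mutually exclusive and each row lands in exactly one output data set.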

Related

How to use a variable in Data step without including it as a column in SAS

When I use the Data Step, I don't want to include temporary variables as columns. For example, in the following, while I want to include y as a column, I don't want to include a and b as columns. How can I tell SAS to not include a and b as columns?
Data Table1;
Set Table2;
a=scan(column_x,1,'_');
b=scan(column_x,2,'_');
y=cats(a, ':', b);
Run;
Use the DROP statement:
DROP A B;
or the DROP= data set option:
DATA Table1(Drop=A B);
Alternatively, use the KEEP statement or the KEEP= data set option.
With KEEP you explicitly list the variables you want to retain.
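Put together with the step from the question, the DROP= version might look like the sketch below. It only reuses names from the question (Table1, Table2, column_x, a, b, y).

```sas
/* a and b are available while the step runs;            */
/* DROP= on the output data set keeps them out of Table1. */
data Table1(drop=a b);
  set Table2;
  a = scan(column_x, 1, '_');
  b = scan(column_x, 2, '_');
  y = cats(a, ':', b);
run;
```

A KEEP= version would need to list y along with every column of Table2 that should survive, so DROP= is usually shorter when only a couple of helper variables are involved.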

How to create two new data sets in a single data step in SAS

I created two data sets in two separate steps, but now I am wondering how I could create the two new data sets in a single step in SAS.
data purchase_price_jjohns2;
merge hw06.inventory hw06.purchase;
by Model;
if Quantity NE '';
TotalCost = Quantity*Price;
format TotalCost dollar7.2;
run;
data not_purchase_jjohns2;
merge hw06.inventory hw06.purchase;
by Model;
if Quantity='';
run;
Those are my two steps on their own; now I want to know how to do this in one data step.
A single data step can create as many output data sets as you want. It is important to remember that all output data sets are created, and that the columns of each output are fixed at compile time, before any data is read. You can specify which variables of the PDV (program data vector) should be in each output data set by using the data set options (keep=...) or (drop=...). In a single data step you cannot create an output data set whose name is based on some variable's value; you can preprocess the data if you need such a split. There are some tricks that involve dynamic hashes, but that is an advanced topic.
The IF statements you are currently using are subsetting IFs. Rows that fail the condition are discarded, and for the remaining rows the output (an implicit OUTPUT) only occurs when the data step flow reaches the bottom of the step. You will want an explicit OUTPUT statement to ensure the current row goes into the desired output data set.
Thus you could have a data step similar to
data want1 want2(drop=total);
merge one two;
by key;
if missing(quantity) then
OUTPUT want2;
else do;
total = quantity * price;
OUTPUT want1;
end;
run;
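Applied to the data sets from the question, the same pattern might look roughly like this. It is a sketch, assuming hw06.inventory and hw06.purchase are already sorted by Model and that a missing Quantity marks a model that was not purchased.

```sas
data purchase_price_jjohns2
     not_purchase_jjohns2(drop=TotalCost);
  merge hw06.inventory hw06.purchase;
  by Model;
  format TotalCost dollar7.2;
  /* MISSING() works for both numeric and character variables. */
  if missing(Quantity) then output not_purchase_jjohns2;
  else do;
    TotalCost = Quantity*Price;
    output purchase_price_jjohns2;
  end;
run;
```

The drop=TotalCost option is there because the original second step never created TotalCost; without it the column would appear in not_purchase_jjohns2 with missing values.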

What is the difference between concatenating, appending and merging in SAS?

I am trying to run code in SAS for concatenating, appending and merging, and I am unable to understand the difference between them. I am looking for someone to help me understand them with examples.
Concatenate and Append are similar, but not used the same way. In SAS, Append is used most commonly to mean concatenation in place. In other words, adding rows to a dataset without reading in the original dataset. This is very efficient, as you skip reading one of the datasets, but it has limitations (largely, you can't interleave or do other data step type things while appending). Append is most often done by PROC APPEND.
Concatenate, on the other hand, while it can mean appending, is usually used when combining the rows from two datasets into a new dataset with all of the rows from each source dataset as separate rows, but not in-place. This would be done with a set statement in a data step, most commonly. This reads both datasets in and writes a new dataset (that could replace one of the original input datasets, or have a new name). Concatenate also is often used to mean combine two string values into a variable; that's actually the most common usage I've heard it in.
Merge is not the same at all, though; it is side-by-side in some fashion, placing the data from one dataset in new variables on the same rows as the data from the other dataset. New rows can be created as part of merge, when one dataset has different key identifier values from the other, but that's usually not the point of the merge (usually!). Merge is done most often in the data step, either with the merge or the update statement.
Concatenate and Merge can also be done in other ways, of course, including SQL.
In a nutshell:
Concatenate: add a dataset on top (or to the bottom) of another one. Look into the SET statement of the DATA Step or the UNION clause of PROC SQL.
Append: Just another word for concatenate. Look into PROC DATASETS / APPEND, but it accomplishes the same task with different means.
Merge: add a dataset to the side (right, generally) of another one. Look into the MERGE statement of the DATA Step and/or the various JOINs allowed by PROC SQL.
SAS Documentation will show you plenty of examples!
Concatenate: appends the observations from one data set to another. You list the data set names in the SET statement; to concatenate two data sets, SAS must process all observations from both data sets to create a new one.
Append: skips reading the observations in the original (base) data set and adds the new observations directly to the end of it. When the data sets have different variables, use the FORCE option with PROC APPEND. It functions the same way as the APPEND statement in PROC DATASETS.
You can only append one data set at a time, whereas when concatenating you can list as many data sets as you like in the SET statement.
Merge: you should have a common variable, or several variables which taken together uniquely identify each observation. SAS sequentially checks the observations for each BY value (you have to sort the data sets before you can merge them), then writes the combined observation to the new data set.
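To make the three operations concrete, here is a small sketch. The data sets one, two and base and the variables id, x and y are invented for illustration only.

```sas
/* Hypothetical inputs sharing the key variable id. */
data one;
  input id x;
  datalines;
1 10
2 20
;
run;

data two;
  input id y;
  datalines;
2 200
3 300
;
run;

/* Concatenate: both inputs are read and a new data set is written. */
data stacked;
  set one two;
run;

/* Append: rows of two are added to base without re-reading base.  */
/* FORCE is needed here because two has a variable (y) not in base. */
data base;
  set one;
run;
proc append base=base data=two force;
run;

/* Merge: rows with the same id are combined side by side. */
data merged;
  merge one two;
  by id;
run;
```

Here stacked has one row per input row, while merged has one row per distinct id, with x or y missing where only one input supplied that id. In base, the appended rows have no y column at all, since FORCE drops variables that the base data set lacks.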

PROBT in SAS for a column of values

I'm looking to run PROBT on a column of values in SAS, not just one value, and to get two-tailed p-values.
I have the following code that I'd like to amend:
data all_ssr;
x=.551447;
df=25;
p=(1-probt(abs(x),df))*2;
put p=;
run;
However, I would like x to be a column of values from another file. I have tried work.ttest, which is just a file of t-test values.
Many thanks
You need to use a set statement to access data from another SAS dataset.
data all_ssr;
set work.ttest; /*Dataset containing column of values*/
df=25;
p=(1-probt(abs(x),df))*2;
run;
Removing the put statement avoids clogging up the log.

Column copy to another dataset in sas

I want to copy only 2 out of 7 columns from dataset 'A' into dataset 'B'.
Dataset A has (p,q,r,s,t,u,v).
I want to copy p,q,t into a new dataset B.
This is a more efficient way to do it:
data B;
set A (keep=p q t);
run;
This is because the KEEP= option on the SET statement means that only these columns are read in the first place. Using KEEP outside the SET statement still reads in all the columns and only drops the unwanted ones afterwards.
We can use the KEEP statement.
data B;
set A;
keep p q t;
run;
What do you mean by copying two columns? Does dataset B already exist? If that's the case, you simply need to merge the two files and use KEEP when reading them. If you need to create a new data set, it's even simpler:
data B;
set A;
keep p q t;
run;
Hope it helps. If you need the merge, please post and I will explain further.