How do I set Primary Key | SAS Studio - sas

I'm trying to set Primary Key on SAS and I keep getting the error mentioned below.
Any help would be great!
The first snippet is code and the next is the error.
/*Primary Key*/ /*Defines the unique key*/
Proc datasets lib=work;
modify WORK.FinAdvMaster;
ic create primary key(FinAdvID);
PROC PRINT DATA=WORK.FinAdvMaster; RUN;
The error I get -
96 /*Primary Key*/ /*Defines the unique key*/
97
98 Proc datasets lib=work;
99 modify WORK.FinAdvMaster;
_________________
22
201
ERROR 22-322: Expecting a name.
ERROR 201-322: The option is not recognized and will be ignored.
100 ic create primary key(FinAdvID);
NOTE: Enter RUN; to continue or QUIT; to end the procedure.

Remove work. from your MODIFY statement. MODIFY expects a one-level member name; the library is taken from the LIB= option on the PROC DATASETS statement. It's a quirk of PROC DATASETS.
proc datasets lib=work;
modify FinAdvMaster;
ic create primary key (FinAdvID);
quit;
Note that this key will be destroyed if you recreate the dataset.
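Since the key does not survive a rebuild of the table, one option is to re-apply it after the table is recreated and then confirm it is there. A minimal sketch, assuming the table is rebuilt from a hypothetical source dataset work.FinAdvSource; the constraint name pk_FinAdvID is only an illustrative choice, and IC CREATE will fail if FinAdvID contains duplicate or missing values:

```sas
/* Rebuilding the table replaces it, so earlier integrity constraints are gone */
data work.FinAdvMaster;
  set work.FinAdvSource;   /* hypothetical source dataset */
run;

/* Re-create the primary key, then list the table attributes to check it */
proc datasets lib=work nolist;
  modify FinAdvMaster;
  ic create pk_FinAdvID = primary key (FinAdvID);
  run;
  contents data=FinAdvMaster;
  run;
quit;
```

The CONTENTS output should include an integrity constraints section listing the key if it was created.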

You can also use PROC SQL to add a PRIMARY KEY constraint with ALTER TABLE.
Example:
proc sql;
create table work.class as select * from sashelp.class;
alter table work.class add constraint pk_name primary key(name);
quit;
In your case:
proc sql;
alter table FinAdvMaster
add constraint
pk_FinAdvID primary key(FinAdvID)
;
quit;
pk_<column-name> is a common convention for naming primary keys.
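To see what the constraint does in practice, here is a small sketch that reuses the work.class example and then tries to insert a duplicate key. The INSERT is my addition, not part of the original answer; it relies on 'Alfred' already being a name in sashelp.class, and I would expect SAS to reject the row with an integrity constraint error in the log:

```sas
proc sql;
  /* Copy the sample table and make NAME its primary key */
  create table work.class as select * from sashelp.class;
  alter table work.class add constraint pk_name primary key(name);

  /* 'Alfred' is already present, so this insert should be rejected */
  insert into work.class (name) values ('Alfred');
quit;
```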

Related

Outputting specific results from data check using PROC SQL

I have this code to check for correct baseline lab values in SDTM.
/* Checking for no baseline results*/
proc sql;
create table dca as
select a.usubjid, a.lbtestcd,a.lbblfl,a.lbdy,a.lbstresn,b.good_value
from lb as a
left join
(select usubjid,lbtestcd,lbdy as good_value label='Baseline record' from lb
where lbdy < 0 group by usubjid, lbtestcd having lbdy= max(lbdy)) as b
on a.usubjid = b.usubjid and a.lbtestcd = b.lbtestcd;
create table dcb as
select unique 'LB' as domain,
compbl("USUBJID/lbblfl Subset") as key_grp_vr length = 50
, case when lbblfl='Y' and ^(lbdy=good_value) then
'FAIL: At least one Baseline is Not correct'
else 'PASS: All Baselines are correct' end as dc_rslt label="Data Check Results for:" length =75
from dca
;
quit;
proc sort data=dca;
by usubjid lbtestcd lbdy;
run;
proc print data=dcb; run;
DCA creates GOOD_VALUE using a LEFT JOIN to a subquery: a WHERE condition filters the records, and GROUP BY with a HAVING condition filters by a summary function. DCB has a CASE block with a condition that checks LBBLFL against GOOD_VALUE.
The output of DCB shows both PASS and FAIL, which means there is at least one baseline flag which is incorrect. While this is useful, I was wondering if there was some way to create another table to output the exact records where this is the case. Would anyone have any suggestions on how to do this?
Sort DCB and keep the key groups that contain more than one row, as below. Note that DCB doesn't seem to have any unique identifier, though, so this will not point you to individual records.
proc sort data=dcb;
by domain key_grp_vr;
run;
data mismatched;
set dcb;
by domain key_grp_vr;
IF not(first.key_grp_vr and last.key_grp_vr);
run;
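Since DCA still carries the subject, test code and study day, another option is to pull the failing rows straight from it with the same condition the CASE block in DCB uses. This is only a sketch: the table name dc_fail is made up for illustration, and it assumes DCA was built exactly as in the question.

```sas
proc sql;
  /* Rows flagged as baseline whose study day is not the expected baseline day */
  create table dc_fail as
  select usubjid, lbtestcd, lbblfl, lbdy, lbstresn, good_value
  from dca
  where lbblfl = 'Y' and lbdy ne good_value
  order by usubjid, lbtestcd, lbdy;
quit;
```

Rows where GOOD_VALUE is missing (no record with lbdy < 0 for that subject and test) are also returned, which matches how the CASE condition treats them.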

select only a few columns from a large table in SAS

I have to join 2 tables on a key (say XYZ). I have to update one single column in table A using a coalesce function. Coalesce(a.status_cd, b.status_cd).
TABLE A:
contains some 100 columns. KEY Columns ABC.
TABLE B:
Contains just 2 columns. KEY Column ABC and status_cd
TABLE A, which I use in this left join query, has more than 100 columns. Is there a way to use a.* followed by this coalesce function in my PROC SQL without creating a new column from the PROC SQL; CREATE TABLE AS ... step?
Thanks in advance.
You can take advantage of dataset options to make it so you can use wildcards in the select statement. Note that the order of the columns could change doing this.
proc sql ;
create table want as
select a.*
, coalesce(a.old_status,b.status_cd) as status_cd
from tableA(rename=(status_cd=old_status)) a
left join tableB b
on a.abc = b.abc
;
quit;
I eventually found a fairly simple way of doing this in proc sql after working through several more complex approaches:
proc sql noprint;
update master a
set status_cd= coalesce(status_cd,
(select status_cd
from transaction b
where a.ABC = b.ABC))
where exists (select 1
from transaction b
where a.ABC = b.ABC);
quit;
This will update just the one column you're interested in and will only update it for rows with key values that match in the transaction dataset.
Earlier attempts:
The most obvious bit of more general SQL syntax would seem to be the update...set...from...where pattern as used in the top few answers to this question. However, this syntax is not currently supported - the documentation for the SQL update statement only allows for a where clause, not a from clause.
If you are running a pass-through query to another database that does support this syntax, it might still be a viable option.
Alternatively, there is a way to do this within SAS via a data step, provided that the master dataset is indexed on your key variable:
/*Create indexed master dataset with some missing values*/
data master(index = (name));
set sashelp.class;
if _n_ <= 5 then call missing(weight);
run;
/*Create transaction dataset with some missing values*/
data transaction;
set sashelp.class(obs = 10 keep = name weight);
if _n_ > 5 then call missing(weight);
run;
data master;
set transaction;
t_weight = weight;
modify master key = name;
if _IORC_ = 0 then do;
weight = coalesce(weight, t_weight);
replace;
end;
/*Suppress log messages if there are key values in transaction but not master*/
else _ERROR_ = 0;
run;
A standard warning relating to the modify statement: if this data step is interrupted then the master dataset may be irreparably damaged, so make sure you have a backup first.
In this case I've assumed that the key variable is unique - a slightly more complex data step is needed if it isn't.
Another way to work around the lack of a from clause in the proc sql update statement would be to set up a format merge, e.g.
data v_format_def /view = v_format_def;
set transaction(rename = (name = start weight = label));
retain fmtname 'key' type 'i';
end = start;
run;
proc format cntlin = v_format_def; run;
proc sql noprint;
update master
set weight = coalesce(weight,input(name,key.))
where master.name in (select name from transaction);
quit;
In this scenario I've used type = 'i' in the format definition to create a numeric informat, which proc sql uses to convert the character variable name to the numeric variable weight. Depending on whether your key and status_cd columns are character or numeric you may need to do this slightly differently.
This approach effectively loads the entire transaction dataset into memory when using the format, which might be a problem if you have a very large transaction dataset. The data step approach should hardly use any memory as it only has to load 1 row at a time.

Proc sql VALIDATE with CREATE TABLE

I am working in SAS EG and I have code like this:
proc sql;
CREATE TABLE new as
SELECT f1,f2
FROM work.orig
WHERE f1<>'x'
;
This works.
However, when I add a VALIDATE option, like below, I get an error:
ERROR 22-322: Syntax error, expecting one of the following: (, SELECT.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
proc sql;
CREATE TABLE new as
VALIDATE
SELECT f1,f2
FROM work.orig
WHERE f1<>'x'
;
How do I use the validate option in proc sql?
My understanding is that you cannot use VALIDATE inside a CREATE TABLE statement.
Validate the underlying SELECT instead:
proc sql;
VALIDATE
SELECT f1,f2
FROM work.orig
WHERE f1<>'x'
;
quit;
VALIDATE checks the query without executing it, so if that is successful, the SELECT inside your CREATE TABLE statement should work as well.
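If the goal is to check the whole CREATE TABLE statement rather than just the query, the NOEXEC option on the PROC SQL statement may be closer to what you want. It is not part of the answer above; it tells PROC SQL to check statements without running them. A minimal sketch using the table and columns from the question:

```sas
/* NOEXEC: statements are checked but not executed, so NEW is not created */
proc sql noexec;
  create table new as
  select f1, f2
  from work.orig
  where f1 <> 'x';
quit;
```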

SAS getting table primary key

I'm completely new in SAS 4GL...
Is it possible to find out which columns of a table are the primary key, or part of a compound primary key? I need their values to be merged into one column of an output dataset.
The problem is that the input can be different tables, and I don't know their definitions.
If an index is defined, then you can find out what variable(s) is/are used in that index. See for example:
data blah(index=(name));
set sashelp.class;
run;
proc contents data=blah out=blahconts;
run;
blahconts has columns that indicate that name is in a simple index, and that it has 1 index total.
Also, you can have foreign key constraints, such as the following from this SAS documentation example:
proc sql;
create table work.mystates
(state char(15),
population num,
continent char(15),
/* constraint specifications */
constraint prim_key primary key(state),
constraint population check(population gt 0),
constraint continent check(continent in ('North America', 'Oceania')));
create table work.uspostal
(name char(15),
code char(2) not null, /* constraint specified as */
/* a column attribute */
constraint for_key foreign key(name) /* links NAME to the */
references work.mystates /* primary key in MYSTATES */
on delete restrict /* forbids deletions to STATE */
/* unless there is no */
/* matching NAME value */
on update set null); /* allows updates to STATE, */
/* changes matching NAME */
/* values to missing */
quit;
proc contents data=uspostal out=postalconts;
run;
proc sql;
describe table constraints uspostal;
quit;
DESCRIBE TABLE CONSTRAINTS writes the constraint information to the output window. From the PROC CONTENTS output dataset you can see that the variable is in a simple index. You can wrap either of these (the PROC CONTENTS or the DESCRIBE TABLE CONSTRAINTS) in ODS OUTPUT to get the information into a dataset:
ods output IntegrityConstraints=postalICs;
proc contents data=uspostal out=postalconts;
run;
ods output close;
or
ods output IntegrityConstraints=postalICs;
proc sql;
describe table constraints uspostal;
quit;
ods output close;
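Another route, not mentioned above, is to query the PROC SQL DICTIONARY tables, which would let you list primary key columns without knowing the table definition in advance. Treat the following as a sketch under assumptions: the column names and the stored value of constraint_type are written as I recall them, so run describe table dictionary.table_constraints; and describe table dictionary.constraint_column_usage; first to confirm them. The output table name pk_columns is hypothetical.

```sas
proc sql;
  /* One row per column that takes part in a primary key on WORK.MYSTATES */
  create table pk_columns as
  select u.table_catalog as libname,
         u.table_name,
         u.constraint_name,
         u.column_name
  from dictionary.table_constraints as c
       inner join dictionary.constraint_column_usage as u
         on  c.table_catalog   = u.table_catalog
         and c.table_name      = u.table_name
         and c.constraint_name = u.constraint_name
  where upcase(c.table_catalog) = 'WORK'
    and upcase(c.table_name)    = 'MYSTATES'
    /* assumed: the type text starts with PRIMARY */
    and upcase(c.constraint_type) like 'PRIMARY%';
quit;
```

For a compound key this should give several rows, one per key column, which you can then use to build the merged key value.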

Lookup values in one table and add to dataset according to IF condition (MERGE/SQL)?

I need to lookup data from one table and add it to a master data table based on an if condition: whether the data is flagged as missing. Say the lookup table contains countries and ports. There are missing port names in the master file that need to be filled. It fills these using the lookup only if flag = 1 (it's missing).
This command doesn't work (won't fill it in & won't keep the obs with Flag =0):
proc sql;
create table data.varswprice1 as
select *
from data.varswprice a left join data.LPortsFill b
on a.LoadCountry = b.LoadCountry and a.LoadArea = b.LoadArea
where LPortMiss = 1;
quit;
Here's an example with a bit of the data...
LOOKUP table (3 vars):
LoadPort LoadCountry LoadArea
ARZEW ALGERIA NAF
MASTER (many vars):
OBS LoadPort LoadCountry LoadArea LPortMiss
1 ALGERIA NAF 1
2 ADELAIDE AUSTRALIA SEOZ 0
So, it needs to fill in the first obs in MASTER with the first obs in LOOKUP (ARZEW) based on the fact that LPortMiss = 1 and LoadCountry and LoadArea are equal. There are many more obs in LOOKUP and MASTER but I hope this illustrates the problem better.
I think this is what you're looking for:
proc sql;
select coalesce(a.loadport,b.loadport) as loadport, a.loadcountry, a.loadarea
from master a left join lookup b
on a.loadcountry=b.loadcountry and a.loadarea=b.loadarea;
quit;
The coalesce function returns the first non-missing argument, so if loadport is missing from table master then it takes it from table lookup.
By the way, this isn't specific to SAS. For questions like this you could use the SQL tag.
You can also use the UPDATE statement in PROC SQL, which saves having to create a new dataset. You would probably want to reset the lportmiss flag as well.
proc sql;
update master as a
set loadport=(select loadport from lookup as b
where a.LoadCountry=b.LoadCountry and a.LoadArea=b.LoadArea)
where lportmiss=1;
quit;
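A possible variant of the update above that also resets the flag, sketched under two assumptions of mine: the lookup table has at most one row per LoadCountry/LoadArea pair (otherwise the subquery returns more than one value and the update fails), and rows with no match in lookup should be left untouched, which is what the EXISTS condition is for.

```sas
proc sql;
  update master as a
    /* Fill the port from the lookup and clear the missing flag in one pass */
    set loadport = (select loadport from lookup as b
                    where a.LoadCountry = b.LoadCountry
                      and a.LoadArea = b.LoadArea),
        lportmiss = 0
    where lportmiss = 1
      /* Only touch rows that actually have a match in the lookup table */
      and exists (select 1 from lookup as b
                  where a.LoadCountry = b.LoadCountry
                    and a.LoadArea = b.LoadArea);
quit;
```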