I am using SAS Enterprise Guide and want to compare two date variables.
My code looks as follows:
proc sql;
CREATE TABLE observations_last_month AS
SELECT del_flag_1,
gross_exposure_fx,
reporting_date format=date7.,
max(reporting_date) AS max_date format=date7.
FROM &dataIn.
WHERE reporting_date = max_date;
quit;
If I run my code without the WHERE statement I get the following data:
However, when I run the above code I get the following error messages:
ERROR: Expression using (=) has components that are of different data types.
ERROR: The following columns were not found in the contributing tables: max_date.
What am I doing wrong here? Thanks in advance for the help.
If you want to subset based on an aggregate function then you need to use HAVING instead of WHERE. If you want to refer to a variable that you have derived in your query then you need to use the CALCULATED keyword (or just re-calculate it).
proc sql;
CREATE TABLE observations_last_month AS
SELECT del_flag_1
, gross_exposure_fx
, reporting_date format=date7.
, max(reporting_date) AS max_date format=date7.
FROM &dataIn.
HAVING reporting_date = CALCULATED max_date
;
quit;
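The "just re-calculate it" route mentioned above could also be written with a scalar subquery in the WHERE clause, which avoids the remerge that HAVING triggers here. This is a minimal sketch, assuming &dataIn. resolves to the same table as in the question; the max_date column is dropped because it is only needed for the filter.

```sas
proc sql;
  /* Keep only the rows whose reporting_date equals the latest date in the table. */
  create table observations_last_month as
  select del_flag_1
       , gross_exposure_fx
       , reporting_date format=date7.
  from &dataIn.
  where reporting_date = (select max(reporting_date) from &dataIn.)
  ;
quit;
```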
I have two datasets. The first, big_dataset, has around 3000 columns, most of which are never used. The second, column_list, contains a single column called column_name with around 100 values. Each value is the name of a column I want to keep.
I want to filter big_dataset so that only columns in column_list are kept, and the rest are discarded.
If I were using Pandas dataframes in Python, this would be a trivial task:
cols = column_list['column_name'].tolist()
smaller_dataset = big_dataset[cols]
However, I can't figure out the SAS equivalent. Proc Transpose doesn't let me turn the rows into headers. I can't figure out a statement in the data step that would let this work, and as far as I'm aware this isn't something that Proc SQL could handle. I've read through the docs on Proc Datasets and that doesn't seem to have what I need either.
To obtain a list of columns from column_list to use against big_dataset, you can query the column_list table and put the result into a macro variable. This can be achieved with PROC SQL and the SEPARATED BY clause:
proc sql noprint;
select column_name
into :cols separated by ','
from column_list;
create table SMALLER_DATASET AS
select &cols.
from WORK.BIG_DATASET;
quit;
Alternatively, you may use SEPARATED BY ' ' and then use the resulting list in a KEEP statement or a KEEP= dataset option:
proc sql noprint;
select column_name
into :cols separated by ' '
from column_list;
quit;
data small_dataset;
set big_dataset (keep=&cols.);
/* or keep=&cols.; */
run;
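If some names in column_list might not exist in big_dataset, the KEEP= list will raise an error. One possible guard, sketched here under the assumption that big_dataset lives in the WORK library, is to join the list against DICTIONARY.COLUMNS first so that only existing columns reach the macro variable.

```sas
proc sql noprint;
  /* LIBNAME and MEMNAME are stored in upper case in DICTIONARY.COLUMNS. */
  select c.column_name
    into :cols separated by ' '
  from column_list as c
       inner join dictionary.columns as d
       on upcase(c.column_name) = upcase(d.name)
  where d.libname = 'WORK'
    and d.memname = 'BIG_DATASET';
quit;

/* SQLOBS holds the number of rows the last SELECT returned. */
%put NOTE: &sqlobs column(s) matched: &cols;
```

If no names match, the SELECT returns no rows and &cols is not assigned, so it is worth checking &sqlobs before using the list.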
I have 18 separate datasets that contain similar information: patient ID, number of 30-day equivalents, and total day supply of those 30-day equivalents. I've output these from a dataset that contains those 3 variables plus the medication class (VA_CLASS) and the quarter it was captured in (a total of 6 quarters).
Here's how I've created the 18 separate datasets from the snip of the dataset shown above:
%macro rx(class,num);
proc sql;
create table dm_sum&class._qtr&num as select PatID,
sum(equiv_30) as equiv_30_&class._&num
from dm_qtrs
where va_class = "HS&class" and dm_qtr = &num
group by 1;
quit;
%mend;
%rx(500,1);
%rx(500,2);
%rx(500,3);
%rx(500,4);
%rx(500,5);
%rx(500,6);
%rx(501,1);
and so on...
I then need to merge all 18 datasets back together by PatID. What I'd like to do is iteratively add each newly created dataset to the previous ones, e.g. add dataset dm_sum_500_qtr3 to a file that already contains the results of dm_sum_500_qtr1 and dm_sum_500_qtr2.
Thanks for looking, Brian
Inside the macro, append each newly created data set to an accumulator data set. Be sure to delete the accumulator before starting so that each run begins with a fresh accumulation. If the process is run at different times (weekly or monthly, say), you may want to add a unique index to prevent the same rows from being appended twice. If you are stacking all these sums, the CREATE TABLE should also select va_class and dm_qtr.
%macro rx(class, num, stack=perm.allClassNumSums);
proc sql; create table dm_sum&class._qtr&num as … ; quit;
/* stack the new sums onto the accumulator data set */
proc append force base=&stack data=dm_sum&class._qtr&num;
run;
%mend;
/* start from an empty accumulator */
proc sql;
drop table perm.allClassNumSums;
quit;
%rx(500,1)
%rx(500,2)
%rx(500,3)
%rx(500,4)
%rx(500,5)
…
A better approach might be a single query with a larger WHERE clause that leaves class and qtr as categorical variables. Your current approach moves data (class and qtr) into metadata (column names). Such a transformation makes additional downstream processing more difficult.
Proc TABULATE or REPORT can use a CLASS statement to help create output with category-based columns. These procedures might even be able to work directly with the original data set and not require a preparatory SQL query.
proc sql;
create table want as
select
PatID, va_class, dm_qtr,
sum(equiv_30) as equiv_30_sum
from dm_qtrs
where catx(':', va_class, dm_qtr) in
(
'HS500:1'
'HS500:2'
'HS500:3'
…
'HS501:1'
)
group by PatID, va_class, dm_qtr;
quit;
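The PROC TABULATE suggestion above might look something like the following. It is only a sketch, assuming dm_qtrs holds one row per prescription with the PatID, va_class, dm_qtr and equiv_30 variables described in the question; the table layout is an illustrative choice.

```sas
proc tabulate data=dm_qtrs;
  /* Categories stay as data; TABULATE builds the class-by-quarter columns. */
  class PatID va_class dm_qtr;
  var equiv_30;
  /* One row per patient, one column per class and quarter combination. */
  table PatID, va_class*dm_qtr*equiv_30*sum;
run;
```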
How do I write this in SAS:
proc sql;
create table THIS as
select *
from MAIN(keep=id col1 -- col34)
where (AT LEAST ONE OF THE COLUMNS contains 1) ;
;
I am having a problem figuring out how to write that last line, because I want to keep all columns: I am not checking just one column, I want to check all of them.
You will have more flexibility if you use a DATA step instead of PROC SQL since you cannot use variable lists in PROC SQL code.
Assuming all of the variables in your list are numeric you could do something like this.
data this;
set main ;
keep id col1 -- col34;
if whichn(1,of col1 -- col34);
run;
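Tom is right, the best approach is with a data step. If you are certain you want to do it with SQL, though, you could do something like this (it assumes the columns are 0/1 flags, so that a nonzero sum means at least one column contains 1):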
proc sql noprint;
create table THIS as
select *
from MAIN(keep=id col1 -- col34)
where sum(col1,col2,col3, ... ,col34)
;
quit;
I have the following query, which runs in SAS using proc sql. It uses an automated macro variable that contains the month-end date, but it results in the following error:
ERROR: Prepare error: ICommandPrepare::Prepare failed. : ERROR: Attribute '2017-02-28' not found
Query:
proc sql;
connect to oledb (datasource='10.1.0.105' provider=nzoledb
user=&user_id password=&pwd properties=('initial catalog'=ODS));
create table &user..Pers_test as select * from connection to oledb
(SELECT a.ID from DBO.Table1
where a.SOURCE_SYSTEM_CREATED_DTM <= "&monthend."
Group by a.SWID order by a.SWID
);
%let _sql_xrc=&sqlxrc;
disconnect from oledb;
quit;
However the query runs when the timestamp is hardcoded.
proc sql;
connect to oledb (datasource='10.1.0.105' provider=nzoledb
user=&user_id password=&pwd properties=('initial catalog'=ODS));
create table &user..Pers_test as select * from connection to oledb
(SELECT a.ID from DBO.Table1
where a.SOURCE_SYSTEM_CREATED_DTM <= '2017-02-28 00:00:00'
Group by a.SWID order by a.SWID
);
%let _sql_xrc=&sqlxrc;
disconnect from oledb;
quit;
I have tried casting, substring but it all results in the same error. Any help is appreciated to work around with the automated variable.
Macro variables are not resolved inside single quotes, which is why double quotes were being used. But the database treats a double-quoted string as an identifier (a column name) rather than a value, so the column could not be matched against it and the error was thrown. The variable therefore has to be resolved and then wrapped in single quotes.
The code to resolve the variable inside single quotes is as follows:
cast(%unquote(%str(%')&monthend.%str(%')) as datetime)
I modified Karan Pappala's answer to make it work for me:
%unquote(%str(%')&execution_method.%str(%'))
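To check what the quoting expression resolves to before sending it to the database, it can be assigned to a second macro variable and written to the log. This is a small sketch; monthend_q is a hypothetical name and the date value is the one from the question.

```sas
%let monthend = 2017-02-28 00:00:00;

/* %str(%') produces a masked single quote; %unquote removes the masking */
/* so the final text is an ordinary single-quoted literal.               */
%let monthend_q = %unquote(%str(%')&monthend.%str(%'));

/* Write the resolved value to the log for inspection. */
%put &=monthend_q;

/* Inside the pass-through query it could then be used as: */
/*   where a.SOURCE_SYSTEM_CREATED_DTM <= &monthend_q      */
```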
I am working in SAS EG and I have code like this:
proc sql;
CREATE TABLE new as
SELECT f1,f2
FROM work.orig
WHERE f1<>'x'
;
This works.
However, when I add a VALIDATE option, like below, I get an error:
ERROR 22-322: Syntax error, expecting one of the following: (, SELECT.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
proc sql;
CREATE TABLE new as
VALIDATE
SELECT f1,f2
FROM work.orig
WHERE f1<>'x'
;
How do I use the validate option in proc sql?
My understanding is that you cannot use VALIDATE with a CREATE TABLE statement.
Validate the underlying SELECT instead:
proc sql;
VALIDATE
SELECT f1,f2
FROM work.orig
WHERE f1<>'x'
;
quit;
If that is successful, the query inside your CREATE statement has valid syntax, so the CREATE statement should work as well.
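Another option that may help is the NOEXEC option on the PROC SQL statement, which checks the syntax of every statement in the step without executing it, including CREATE TABLE. A minimal sketch using the table and column names from the question:

```sas
proc sql noexec;
  /* Syntax is checked, but the table is not created. */
  create table new as
  select f1, f2
  from work.orig
  where f1 <> 'x'
  ;
quit;
```

Removing NOEXEC afterwards runs the same step for real.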