I created the SAS code below to pull the data for a particular date.
%let date =2016-12-31;
proc sql;
connect to teradata as tera ( user=testuser password=testpass );
create table new as select * from connection to tera (select acct,org
from dw.act
where date= &date.);
disconnect from tera;
quit;
There are situations where that particular date may be missing from the dataset because of a holiday.
I am wondering how to query the previous non-holiday date if the date given in the %let statement is a holiday.
Before running your query you have to do a lookup or data check on the date you are using. You have two options:
Use a date dimension table to identify (look up) holidays.
Count how many records you have for that date; if you get 0 observations, step back and use date-1 in your query instead.
I recommend using the date dimension table option.
Teradata has the Sys_Calendar.Calendar view. You can use it in your query; it has all the information about weekdays and other calendar attributes.
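As a rough sketch of the calendar-view idea, the pass-through below asks Teradata for the latest non-weekend date on or before the requested one and stores it, already single-quoted, in a macro variable. The macro variable name date_q is made up for illustration, and the connection options are copied from the question. Note that the view knows weekdays, not company holidays, so a real holiday check would still need your own date dimension table.

```sas
%let date = '2016-12-31';   /* single quotes included, as Teradata expects */

proc sql noprint;
  connect to teradata as tera (user=testuser password=testpass);
  /* Latest calendar date on or before &date that is not Sunday (1) or Saturday (7). */
  select cats("'", put(prev_date, yymmdd10.), "'") into :date_q trimmed
  from connection to tera (
    select max(calendar_date) as prev_date
    from Sys_Calendar.Calendar
    where calendar_date <= date &date
      and day_of_week not in (1, 7)
  );
  disconnect from tera;
quit;

%put NOTE: adjusted date is &date_q;
```

The resulting &date_q can then replace &date in the original pass-through query.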
If you want to do it the SAS way, use the WEEKDAY function and CALL SYMPUTX as shown below. Teradata needs single quotes around the date, so it is better to include them when creating the macro variable.
data _null_;
/* this is the initial date */
date_int = input('2016-12-31', yymmdd10.);
/* create a new date variable depending on the weekday;
   WEEKDAY() returns 1 for Sunday through 7 for Saturday */
if weekday(date_int) = 1 then date = date_int - 2; /* Sunday: -2 days to get Friday */
else if weekday(date_int) = 7 then date = date_int - 1; /* Saturday: -1 day to get Friday */
else date = date_int;
format date date_int yymmdd10.;
/* store the date wrapped in single quotes, as Teradata expects */
call symputx('date', "'" || put(date, yymmdd10.) || "'");
run;
%put modified date is &date;
modified date is '2016-12-30'
Now you can use this macro variable in your pass through.
I have metrics sas table like below
work.met_table
Metrics_Id Metrics_desc
1 Count_Column
2 Sum_Column
3 Eliminate_column
I want to do something like a WHILE loop in T-SQL:
select count(*) :cnt_tbl from work.met_table
%let init_cnt = 1
while (&init_cnt = &cnt_tbl)
begin
select Metrics_desc into :met_nm
from work.met_table
where metrics_id = 1
Insert into some_sas_table
Select * from another table where Metrics_desc =&met_nm
/* Here I wanna loop all values in metrics table one by one */
end
%put &init_cnt = &int_cnt+1;
How can this be done in PROC SQL? Thanks in advance.
If you want to dynamically generate code then use the SAS macro language.
But for your example there is no need to dynamically generate code.
proc sql ;
insert into some_sas_table
select *
from another_table
where Metrics_desc in (select Metrics_desc from work.met_table)
;
quit;
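If you really do need one pass per metric, a macro %DO loop is the usual substitute for a T-SQL WHILE loop. The sketch below is only an illustration: the macro name load_metrics is made up, and it assumes Metrics_Id runs consecutively from 1 to the row count, as in the sample table.

```sas
%macro load_metrics;
  %local i n met_nm;

  /* Number of metrics to loop over. */
  proc sql noprint;
    select count(*) into :n trimmed from work.met_table;
  quit;

  %do i = 1 %to &n;
    proc sql noprint;
      /* Fetch the description for the current metric id. */
      select Metrics_desc into :met_nm trimmed
      from work.met_table
      where Metrics_Id = &i;

      /* Append the matching rows. */
      insert into some_sas_table
      select *
      from another_table
      where Metrics_desc = "&met_nm";
    quit;
  %end;
%mend load_metrics;

%load_metrics
```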
You can also use explicit pass-through: send your native T-SQL code to run on the database server through SAS, rather than bringing the data to the SAS application server to query it.
The example below is explained in detail here.
PROC SQL;
CONNECT TO ODBC(DATASRC=SQLdb USER=&SYSUSERID) ;
/* Explicit PASSTHRU with SELECT */
SELECT *
FROM CONNECTION TO ODBC (
SELECT b.idnum, o.[SSdatecol] AS mydate
FROM dbo.big_SS_table1 b
LEFT JOIN dbo.other_SStable o
ON b.idnum = o.memberid
WHERE o.otherdatecol >= '2014-10-06'
--This is a T-SQL comment that works inside SQL Server
) ;
DISCONNECT FROM ODBC ;
QUIT;
I have a question about the following two pieces of SAS PROC SQL code.
Code 1: (Standard Book version)
CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV AS a
WHERE
a.SITEID = '0001'
AND a.CLAIMID IN (SELECT CLAIMID FROM WORK.INPUT)
Code 2: (The much faster way in practice)
CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV AS a
WHERE
a.SITEID = '0001'
AND a.CLAIMID IN ('10001', '10002', '10003', ... '15000')
When I try to do it more elegantly by using the subquery in Code 1, the run time blows up to 50+ minutes. But the same input returns within 3 minutes using Code 2. Why is that? Note that it's just as slow using INNER JOIN too (after reading this). The input is 5000+ CLAIMID values, which I manually paste into the IN('...') block every day.
PS: The CLAIMID are made up, in real life they are random.
The CLAIMID are indexed in DW.CLAIMS. I am using SAS PROC SQL to access an Oracle database. What is going on, and is there a better way? Thanks!
I don't know that I can tell you why SAS is so slow at the first select; something's not optimized in that scenario clearly.
If I had to guess, I'd guess that SAS is deciding in the first case that it can't use pass-through SQL and so it's downloading the whole big table and then running this SAS-side, while in the second case it's passing the query up to the SQL database and only transporting the resulting rows back.
But there are several ways to work around this, anyway. Here's one: use a macro variable to do precisely the pasting you're doing!
proc sql;
select quote(strip(claimid)) into :claimlist separated by ','
from work.input
;
CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV AS a
WHERE
a.SITEID = '0001'
AND a.CLAIMID IN (&claimlist.)
;
quit;
Tada, you don't have to touch this anymore, and it's identical to the copy/paste that you did.
A few extra notes given some comments:
If CLAIMID is ever shorter than 15 characters, you may have space padding, so I added strip to remove it. Padding doesn't matter for string comparisons, except that it uses up macro variable space, and I worry that some DBMSs may actually care about the padding. You can leave out strip if CLAIMID always has a constant length of 15.
Macro variables can hold up to 64K characters. With a 15-character value plus two quote characters plus one comma, each entry takes 18 characters, so you have room for a bit over 3500 values. That's under 5000, unfortunately.
In this case, you can either split the list into two macro variables (easy enough, hopefully, using obs and firstobs) or use one of these other solutions:
Transfer the work.input dataset into the DW libname, then do the join in SQL there.
Put the contents of the claimID into a file instead of into a macro variable, and then %include that file.
Use call execute to execute the whole proc SQL.
Here's one example of CALL EXECUTE.
data _null_;
set work.input end=eof;
if _n_=1 then do;
call execute('PROC SQL; CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV AS a
WHERE
a.SITEID = "0001"
AND a.CLAIMID IN ('); *the part of the SQL query before the list of IDs;
end;
call execute(quote(claimID) || ' ');
if EOF then do;
call execute('); QUIT;'); *the part of the SQL query after the list of IDs;
end;
run;
This would be nearly identical to the %INCLUDE solution really, except there you put that stuff to a text file instead of CALL EXECUTEing it, and then you %INCLUDE that text file.
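A minimal sketch of the %INCLUDE variant follows, assuming CLAIMID is a character variable short enough to fit the 40-character holder. The fileref sqlcode and the variable _id_q are names made up for this illustration. The DATA step writes the whole PROC SQL step to a temporary file, one quoted ID per line, and %INCLUDE then runs it.

```sas
filename sqlcode temp;

data _null_;
  set work.input end=eof;
  file sqlcode;
  length _id_q $40;
  if _n_ = 1 then do;
    /* the part of the SQL query before the list of IDs */
    put 'proc sql;';
    put 'create table work.output as';
    put 'select "CLAIM" as source, a.claimid, a.dxcode';
    put 'from dw.claims_bav as a';
    put "where a.siteid = '0001'";
    put '  and a.claimid in (';
  end;
  _id_q = quote(strip(claimid));
  /* last ID closes the list and the step; every other ID gets a comma */
  if eof then put _id_q ');' / 'quit;';
  else put _id_q ',';
run;

%include sqlcode;
```

Because the list lives in a file rather than a macro variable, the 64K macro variable limit does not apply.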
I think you're working with both local data and data on your server. When SAS works with data from different sources (databases), it brings it all into SAS for processing, which can be really, really slow.
Instead, you can build a macro variable and use it within your query. With 5000 IDs it should fit into one macro variable, provided each entry (the ID plus its quotes and separator) stays under about 13 characters. The macro variable size limit is 64K characters, so it depends on the length of the variable. If it does not fit, you could create a macro instead.
proc sql noprint;
select quote(claimID, "'") into : claim_list separated by ", " from input;
quit;
proc sql;
CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM DW.CLAIMS_BAV AS a
WHERE
a.SITEID = '0001'
AND a.CLAIMID IN (&claim_list.);
quit;
Please be sure to use
option sastrace=',,,ds' sastraceloc=saslog nostsuffix;
to see in the log how the SAS/ACCESS engine translates your code into database statements.
To give SAS a hint to dynamically build an IN (1,2,3, ...) clause from your IN (SELECT ...) query:
add MULTI_DATASRC_OPT=IN_CLAUSE to your libname DW ... statement, and
add the DBMASTER= data set option to the "master" table,
as in one of the following queries:
CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV (dbmaster=yes) AS a
WHERE
a.SITEID = '0001'
AND a.CLAIMID IN (SELECT CLAIMID FROM WORK.INPUT)
or
CREATE TABLE WORK.OUTPUT AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV (dbmaster=yes) AS a
inner join WORK.INPUT AS b
on a.CLAIMID = b.CLAIMID
WHERE
a.SITEID = '0001'
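For reference, the session setup for the two queries above might look roughly like this. The connection values (path, schema, user, password) are placeholders invented for the sketch; only the tracing options and MULTI_DATASRC_OPT=IN_CLAUSE come from this answer.

```sas
/* Write the SQL that SAS/ACCESS sends to the database into the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Hypothetical Oracle connection values; adjust to your site. */
libname dw oracle path=orapath schema=dw user=myuser password=mypass
        multi_datasrc_opt=in_clause;
```

With tracing on, the log should show whether the generated query contains an IN list built from WORK.INPUT or whether the whole table is being read.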
Using IN() without a subquery is definitely faster, but another performance consideration to keep in mind is the network and compute server load at the time of running, assuming you are running in a client/server configuration.
If you plan to use the SQL select-into-macro-variable solution, keep in mind the count of distinct values and the length of the string you are saving in the macro variable, as there is a size limit.
You can also save the IN() values in a table and just do a join.
PROC SQL;
/*CLAIM ID Table*/
CREATE TABLE WORK.OUTPUT1 AS
SELECT
"CLAIM" AS SOURCE,
a.CLAIMID,
a.DXCODE
FROM
DW.CLAIMS_BAV AS a
WHERE
a.SITEID = '0001';
/*ID Lookup Table*/
CREATE TABLE WORK.OUTPUT2 AS
SELECT
DISTINCT b.CLAIMID FROM WORK.INPUT AS b
;
/*Inner Join Table / AKA lookup join*/
CREATE TABLE WORK.Final AS
SELECT
a.SOURCE, a.CLAIMID, a.DXCODE
FROM WORK.OUTPUT1 AS a INNER JOIN WORK.OUTPUT2 AS b
ON a.CLAIMID = b.CLAIMID
;
QUIT;
I'm still new to SAS and DB2. I have a DB2 Table with a column that stores values encoded as timestamps. I'm trying to load data onto this column from a SAS data set in my Work directory. Some of these timestamps, however, correspond to dates before 01-01-1582 and can not be stored as datetime values in SAS. They are instead stored as strings.
This means that if I want to load these values onto the DB2 table I must first convert them to timestamp with the TIMESTAMP() DB2 function, which, as far as I know, requires passthrough SQL with an execute statement (as opposed to the SAS ACCESS libname method). For instance, in order to write a single value I do the following:
PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (insert into xxxx.xxxx (var) values (TIMESTAMP('0001-01-01-00.00.00.000000'))) by db2;
disconnect from db2;
quit;
How can I achieve this for all values in the source data set? A select ... from statement inside the execute command doesn't work because as far as I know I can't reference the SAS Work directory from within the DB2 connection.
Ultimately I could write a macro that executes the PROC SQL block above and call it from within a data step for every observation but I was wondering if there's an easier way to do this. Changing the types of the variables is not an option.
Thanks in advance.
A convoluted way of working around that would be to use call execute:
data _null_;
set sas_table;
call execute("PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (
insert into xxxx.xxxx (var)
values (TIMESTAMP('"||strip(dt_string)||"'))
) by db2;
disconnect from db2;
quit;");
run;
Where sas_table is your SAS dataset containing the datetime values stored as strings and in a variable called dt_string.
What happens here is that, for each observation in a dataset, SAS will execute the argument of the execute call routine, each time with the current value of dt_string.
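A possible variation, sketched under the same assumptions (sas_table, dt_string and the xxxx placeholders are taken from the code above): open the DB2 connection once and generate only one EXECUTE statement per observation, instead of a full PROC SQL step with its own connect and disconnect for every row.

```sas
data _null_;
  set sas_table end=eof;
  /* Open PROC SQL and the DB2 connection once, before the first row. */
  if _n_ = 1 then
    call execute('proc sql; connect to db2 (user = xxxx database = xxxx password = xxxx);');
  /* One insert per observation, using the current value of dt_string. */
  call execute(cats("execute (insert into xxxx.xxxx (var) values (TIMESTAMP('",
                    dt_string, "'))) by db2;"));
  /* Close the connection after the last row. */
  if eof then call execute('disconnect from db2; quit;');
run;
```

The generated statements run after the DATA step finishes, as a single PROC SQL step.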
Another method using macros instead of call execute to do essentially the same thing:
%macro insert_timestamp;
%let refid = %sysfunc(open(sas_table));
%let refrc = %sysfunc(fetch(&refid.));
%do %while(not &refrc.);
%let var = %sysfunc(getvarc(&refid.,%sysfunc(varnum(&refid.,dt_string))));
PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (insert into xxxx.xxxx (var) values (TIMESTAMP(%unquote(%str(%')&var.%str(%'))))) by db2;
disconnect from db2;
quit;
%let refrc = %sysfunc(fetch(&refid.));
%end;
%let refid = %sysfunc(close(&refid.));
%mend;
%insert_timestamp;
EDIT: I guess you could also load the table as-is into DB2 using SAS/ACCESS and then convert the strings to timestamps with SQL pass-through. Something like:
libname lib db2 database=xxxx schema=xxxx user=xxxx password=xxxx;
data lib.temp;
set sas_table;
run;
PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (create table xxxx.xxxx (var TIMESTAMP)) by db2;
execute (insert into xxxx.xxxx select TIMESTAMP(dt_string) from xxxx.temp) by db2;
execute (drop table xxxx.temp) by db2;
disconnect from db2;
quit;
I have the following query, which runs in SAS using PROC SQL. It uses an automated macro variable containing the month-end date, but it results in the following error:
ERROR: Prepare error: ICommandPrepare::Prepare failed. : ERROR: Attribute '2017-02-28' not found
Query:
proc sql;
connect to oledb (datasource='10.1.0.105' provider=nzoledb
user=&user_id password=&pwd properties=('initial catalog'=ODS));
create table &user..Pers_test as select * from connection to oledb
(SELECT a.ID from DBO.Table1
where a.SOURCE_SYSTEM_CREATED_DTM <= "&monthend."
Group by a.SWID order by a.SWID
);
%let _sql_xrc=&sqlxrc;
disconnect from oledb;
quit;
However the query runs when the timestamp is hardcoded.
proc sql;
connect to oledb (datasource='10.1.0.105' provider=nzoledb
user=&user_id password=&pwd properties=('initial catalog'=ODS));
create table &user..Pers_test as select * from connection to oledb
(SELECT a.ID from DBO.Table1
where a.SOURCE_SYSTEM_CREATED_DTM <= '2017-02-28 00:00:00'
Group by a.SWID order by a.SWID
);
%let _sql_xrc=&sqlxrc;
disconnect from oledb;
quit;
I have tried casting and substring, but they all result in the same error. Any help getting this to work with the automated variable is appreciated.
Macro variables are not resolved inside single quotes, which is why double quotes were used. But the database treats a double-quoted string as an identifier (a column name) rather than a literal value, so it reported that the attribute was not found. The variable therefore has to be resolved and then wrapped in single quotes.
The code to resolve the variable inside single quotes is as follows:
cast(%unquote(%str(%')&monthend.%str(%')) as datetime)
I modified Karan Pappala's answer to make it work for me:
%unquote(%str(%')&execution_method.%str(%'))
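If the quoted value is needed in several places, one option is to build it once and reuse it. This is only a sketch: the sample value and the name monthend_q are assumptions made for the illustration.

```sas
%let monthend = 2017-02-28 00:00:00;

/* Resolve the macro variable first, then wrap the result in single quotes. */
%let monthend_q = %unquote(%str(%')&monthend.%str(%'));

/* Check the resolved value in the log before using it in the query. */
%put NOTE: &=monthend_q;
```

Inside the pass-through query the condition would then read: where a.SOURCE_SYSTEM_CREATED_DTM <= &monthend_q.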