Inserting into DB2 fom SAS dataset with passthrough SQL - sas

I'm still new to SAS and DB2. I have a DB2 Table with a column that stores values encoded as timestamps. I'm trying to load data onto this column from a SAS data set in my Work directory. Some of these timestamps, however, correspond to dates before 01-01-1582 and can not be stored as datetime values in SAS. They are instead stored as strings.
This means that if I want to load these values onto the DB2 table I must first convert them to timestamp with the TIMESTAMP() DB2 function, which, as far as I know, requires passthrough SQL with an execute statement (as opposed to the SAS ACCESS libname method). For instance, in order to write a single value I do the following:
PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (insert into xxxx.xxxx (var) values (TIMESTAMP('0001-01-01-00.00.00.000000'))) by db2;
disconnect from db2;
quit;
How can I achieve this for all values in the source data set? A select ... from statement inside the execute command doesn't work because as far as I know I can't reference the SAS Work directory from within the DB2 connection.
Ultimately I could write a macro that executes the PROC SQL block above and call it from within a data step for every observation but I was wondering if there's an easier way to do this. Changing the types of the variables is not an option.
Thanks in advance.

A convoluted way of working around that would be to use call execute:
data _null_;
set sas_table;
call execute("PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (
insert into xxxx.xxxx (var)
values (TIMESTAMP('"||strip(dt_string)||"'))
) by db2;
disconnect from db2;
quit;");
run;
Where sas_table is your SAS dataset containing the datetime values stored as strings and in a variable called dt_string.
What happens here is that, for each observation in a dataset, SAS will execute the argument of the execute call routine, each time with the current value of dt_string.
Another method using macros instead of call execute to do essentially the same thing:
%macro insert_timestamp;
%let refid = %sysfunc(open(sas_table));
%let refrc = %sysfunc(fetch(&refid.));
%do %while(not &refrc.);
%let var = %sysfunc(getvarc(&refid.,%sysfunc(varnum(&refid.,dt_string))));
PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (insert into xxxx.xxxx (var) values (TIMESTAMP(%str(%')&var.%str(%')))) by db2;
disconnect from db2;
quit;
%let refrc = %sysfunc(fetch(&refid.));
%end;
%let refid = %sysfunc(close(&refid.));
%mend;
%insert_timestamp;
EDIT: I guess you could also load the table as-is in DB2 using SAS/ACCESS and then convert the strings to timestamp with sql pass-through. Something like
libname lib db2 database=xxxx schema=xxxx user=xxxx password=xxxx;
data lib.temp;
set sas_table;
run;
PROC SQL;
connect to db2 (user = xxxx database = xxxx password = xxxx);
execute (create table xxxx.xxxx (var TIMESTAMP)) by db2;
execute (insert into xxxx.xxxx select TIMESTAMP(dt_string) from xxxx.temp) by db2;
execute (drop table xxxx.temp) by db2;
disconnect from db2;
quit;

Related

SAS Selecting the most recent dataset in a lib automatically

i am working with a library that is updated every month or so, and i need a way to select the most recent dataset each month, i tried two methods that would show me the latest table, one that makes a table ordering from the modified date
proc sql;
create table tables as
select memname, modate
from dictionary.tables
where libname = 'SASHELP'
order by modate desc;
quit;
and one that gives me just the latest modified one
proc sql;
select memname into> latest_dataset
from dictionary.tables
where libname='WORK'
having crdate=max(crate);
%put &=latest_dataset;
and i would like to put these latest datasets in a table, but i don't know how, or if there is another easier way to do this, i am still very much new to SAS programming so i'm lost, any help is appreciated.
Use Proc APPEND to put the latest data set into a table. You are essentially accumulating rows.
Use SQL :INTO to obtain (place into a macro variable) the libname.memname of the data set that should be appended.
Example:
The task of determining the newest data set and appending it to a base table is also in a macro so the the code can be easily rerun in the example.
%macro append_newest;
%local newest_table;
proc sql noprint;
select catx('.', libname, memname) into :newest_table
from dictionary.tables
where libname = 'WORK'
and memtype = 'DATA'
having crdate = max(crdate);
%put NOTE: &=newest_table;
create view newest_view as
select "&newest_table" as row_source length=41, *
from &newest_table
;
proc append base=work.master data=newest_view;
run;
%mend;
* create an empty for accumulating new observations;
data work.master;
length row_source $41;
set one (obs=0);
run;
data work.one;
set sashelp.class;
where name between 'A' and 'E';
%append_newest;
data work.two;
set sashelp.class;
where name between 'Q' and 'ZZ';
%append_newest;
data work.three;
set sashelp.class;
where name between 'E' and 'Q';
%append_newest;
Will produce this master table that accumulates the little pieces that come in day by day.
You would want additional constraints such as a unique key in order to prevent appending the same data more than once.

SAS - Change date format returned from database

I'm pulling data from many Teradata tables that have dates stored in MM/DD/YYYY format (ex: 8/21/2003, 10/7/2013). SAS returns them as DDMMMYYYY, or DATE9 format (ex: 21AUG2003, 07OCT2013). Is there a way to force SAS to return date variables as MM/DD/YYYY, or MMDDYY10 format? I know I can manually specify this for specific columns, but I have a macro set up to execute the same query for 65 different tables:
%macro query(x);
proc sql;
connect using dbase;
create table &x. as select * from connection to dbase
(select *
from table.&x.);
disconnect from dbase;
quit;
%mend(query);
%query(bankaccount);
%query(budgetcat);
%query(timeattendance);
Some of these tables will have date variables and some won't. So I'd like the value to be returned as MMDDYY10 format by default. Thanks for your help!
Per the comments to my question, I was able to figure this out using the FMTINFO function. I pretty much used this same code:
proc contents data=mylib._all_ noprint out=contents;
run;
data _null_;
set contents;
where fmtinfo(format,'cat')='date';
by libname memname ;
if first.libname then call execute(catx(' ','proc datasets nolist lib=',libname,';')) ;
if first.memname then call execute(catx(' ','modify',memname,';format',name)) ;
else call execute(' '||trim(name)) ;
if last.memname then call execute(' DDMMYYS10.; run;') ;
if last.libname then call execute('quit;') ;
run;
Found here:
https://communities.sas.com/t5/SAS-Procedures/Change-DATE-formats-to-DDMMYYS10-for-ALL-unknown-number-date/td-p/366637

Connecting to Oracle from SAS

This how I configured my LibName
LIBNAME OrcaleSAS ORACLE USER=UserName PASSWORD=pwd*** PATH = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host.unix.####.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=prod.tk.com)))'
And to fetch data below is the code I am using
PROC SQL;
connect using OracleSAS AS OracDB;
select * from connection to oracle
(select * from ENTITY_DATES_WK13);
disconnect from OracDB;
quit;
I am getting the error OracleSAS is not a SAS name & Error in the LIBNAME statement , I am fairly new to SAS..!!
connecting to oracle or any dbms can be done libname or by explicit pass through. Libname method is used to access oracle table(or any dbms) in SAS(tables are usually moved to SAS). Explicit method ( using connect statement) where in query is directly sent to Oracle(or dbms mentioned). This methods are not interchangeable for oracle(or anydbms) table and hence you got error.
below is libname method
LIBNAME OrcaleSAS ORACLE USER=UserName PASSWORD=pwd*** PATH =
'(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host.unix.####.com)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=prod.tk.com)))'
libname sastab "/whatever path/";
proc sql;
create table sastab.tablename as
select *
oratable.tablename
quit;
below is explicit pass through method
proc sql;
connect to oracle as myconn (user=smith password=secret
path='myoracleserver');
create table sastab.newtable as
select *
from connection to myconn
(select *
from oracleschematable);
disconnect from myconn;
quit;
The name you use for your library, called the libref, can only by 8 characters long.
OracleSAS is 9 characters.
Use something shorter.
When you set up a libname you don't need the Connect portion of the PROC SQL. You're mixing two methods of connecting to a SQL database into one.
Assuming your libname worked correctly you can query it the same as you would any other SAS table at this point:
PROC SQL;
create table want as
select * from ENTITY_DATES_WK13;
quit;
proc print data=want(obs=5);
run;
The other method is 'pass through' which means the query is passed fully to the server to run on that side. This means the inside query needs to be Oracle compliant and you can't use SAS functions or data.
At last I ended doing something like this..,
%let usr ="pradeep"
%let pwd ="******"
%let pth = "ORACLEPATH"
%put path: &path;
proc sql;
connect to oracle(user = &usr
password =&pwd
path =&pth buffsize=5000);
/* my code */

select only a few columns from a large table in SAS

I have to join 2 tables on a key (say XYZ). I have to update one single column in table A using a coalesce function. Coalesce(a.status_cd, b.status_cd).
TABLE A:
contains some 100 columns. KEY Columns ABC.
TABLE B:
Contains just 2 columns. KEY Column ABC and status_cd
TABLE A, which I use in this left join query is having more than 100 columns. Is there a way to use a.* followed by this coalesce function in my PROC SQL without creating a new column from the PROC SQL; CREATE TABLE AS ... step?
Thanks in advance.
You can take advantage of dataset options to make it so you can use wildcards in the select statement. Note that the order of the columns could change doing this.
proc sql ;
create table want as
select a.*
, coalesce(a.old_status,b.status_cd) as status_cd
from tableA(rename=(status_cd=old_status)) a
left join tableB b
on a.abc = b.abc
;
quit;
I eventually found a fairly simple way of doing this in proc sql after working through several more complex approaches:
proc sql noprint;
update master a
set status_cd= coalesce(status_cd,
(select status_cd
from transaction b
where a.key= b.key))
where exists (select 1
from transaction b
where a.ABC = b.ABC);
quit;
This will update just the one column you're interested in and will only update it for rows with key values that match in the transaction dataset.
Earlier attempts:
The most obvious bit of more general SQL syntax would seem to be the update...set...from...where pattern as used in the top few answers to this question. However, this syntax is not currently supported - the documentation for the SQL update statement only allows for a where clause, not a from clause.
If you are running a pass-through query to another database that does support this syntax, it might still be a viable option.
Alternatively, there is a way to do this within SAS via a data step, provided that the master dataset is indexed on your key variable:
/*Create indexed master dataset with some missing values*/
data master(index = (name));
set sashelp.class;
if _n_ <= 5 then call missing(weight);
run;
/*Create transaction dataset with some missing values*/
data transaction;
set sashelp.class(obs = 10 keep = name weight);
if _n_ > 5 then call missing(weight);
run;
data master;
set transaction;
t_weight = weight;
modify master key = name;
if _IORC_ = 0 then do;
weight = coalesce(weight, t_weight);
replace;
end;
/*Suppress log messages if there are key values in transaction but not master*/
else _ERROR_ = 0;
run;
A standard warning relating to the the modify statement: if this data step is interrupted then the master dataset may be irreparably damaged, so make sure you have a backup first.
In this case I've assumed that the key variable is unique - a slightly more complex data step is needed if it isn't.
Another way to work around the lack of a from clause in the proc sql update statement would be to set up a format merge, e.g.
data v_format_def /view = v_format_def;
set transaction(rename = (name = start weight = label));
retain fmtname 'key' type 'i';
end = start;
run;
proc format cntlin = v_format_def; run;
proc sql noprint;
update master
set weight = coalesce(weight,input(name,key.))
where master.name in (select name from transaction);
run;
In this scenario I've used type = 'i' in the format definition to create a numeric informat, which proc sql uses convert the character variable name to the numeric variable weight. Depending on whether your key and status_cd columns are character or numeric you may need to do this slightly differently.
This approach effectively loads the entire transaction dataset into memory when using the format, which might be a problem if you have a very large transaction dataset. The data step approach should hardly use any memory as it only has to load 1 row at a time.

SAS -> Shell DB2 Passthrough and macro resolving

I'm trying to automate a job that involves a lot of data being passed across the network and between the actual db2 server and our SAS server. What I'd like to do is take a traditional pass through...
proc sql;
connect to db2(...);
create table temp as
select * from connection to db2(
select
date
.......
where
date between &start. and &stop.
); disconnect from db2;
quit;
into something like this:
x "db2 'insert into temp select date ...... where date between &start. and &stop.'";
I'm running into a few issues the first of which is db2 date format of 'ddMONyyyy'd which causes the shell command to terminate early. If I can get around that I think it should work.
I can pass a macro variable through to the AIX (SAS) server without the extra set of ' ' needed to execute the db2 command.
Any thoughts?
You might get around the single-quote around the date issue by setting off the WHERE clause with a parenthesis. I'm not sure that this will work, but it might be worth trying.
As far as X command, try something like the following:
%let start = '01jan2011'd;
%let stop = '31dec2011'd;
%let command_text = db2 %nrbquote(')insert into temp select date ... where (date between &start. and &stop.)%nrbquote(');
%put command_text = &command_text;
x "&command_text";