Is there an equivalent in Teradata to SAS's PROC FORMAT with the CNTLIN= option? I have a reference value table (code_value) that is used a lot, and rather than doing many outer joins to it, I'd like a lookup function similar to the SAS solution below. Any help is greatly appreciated.
data CodeValueFormat;
set grp.code_value (keep=code_value_id description);
fmtname = 'fmtCodeValue';
start = code_value_id;
label = description;
run;
proc format cntlin=work.codevalueformat;
run;
proc sql;
select foo_code_id format=fmtCodeValue.
from bar;
quit;
There is no way to emulate SAS's PROC FORMAT CNTLIN= in Teradata, or in any other database, other than by using lookup tables. One way to avoid doing the same joins again and again is to define a join index. Please look at the link below to see whether this is what you want: https://info.teradata.com/HTMLPubs/DB_TTU_16_00/index.html#page/Database_Management%2FB035-1094-160K%2Fqiq1472240587768.html%23wwID0EFK1R
Another way is to maintain a denormalized table: join your incremental/daily records to the reference table in your staging area, then append those records to your final table.
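For comparison, here is a minimal sketch of the lookup-table approach written as a plain join in PROC SQL. The table and column names come from the question; the output column name foo_code_desc is made up for illustration, and the Teradata syntax for the same join may differ slightly.

```sas
/* Sketch only: decode foo_code_id through the reference table with a left join. */
/* foo_code_desc is a hypothetical column name chosen for this example.          */
proc sql;
  select b.foo_code_id,
         cv.description as foo_code_desc
  from bar as b
       left join grp.code_value as cv
         on b.foo_code_id = cv.code_value_id;
quit;
```

The left join keeps rows from bar whose code has no match in the reference table; those rows simply get a missing description.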
I am new to SAS and I am trying to run the code below to get all the information for a particular library. However, it fails partway through because of the data in one particular dataset. Is there any way to read dataset names from another dataset and loop through them, creating a separate output dataset for each name in the list?
Proc contents data=testlib._ALL_ out=x;
Run;
Instead I want something like this
Proc contents data in (work. Tbnames) out = x;
Run;
And read data from below data set.
Data tbnames(keep=tablename);
Set WORK.tablenames;
Run;
Please help
Proc contents data = work.Tbnames out = x;
Run;
Use Proc COPY to copy data sets from one library to another.
libname testlib '<os-path-to-folder>';
proc copy in=testlib out=work memtype=DATA;
run;
Read the data from dictionary.tables instead.
This assumes that you have the list of tables in a data set called tableNames, with a variable called tName that holds the table name. Note that the comparison is case sensitive and member names are stored in upper case, so UPCASE() is used to convert the names before comparing.
proc sql;
create table summary as
select *
from dictionary.tables
where memname in (select upcase(tName) from tableNames);
quit;
Or look at PROC DATASETS which operates on a library, not a single data set.
proc datasets lib=myLib;
run;quit;
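For the looping part of the question, one possible approach (not mentioned in the answers above) is to generate one PROC CONTENTS step per table name with CALL EXECUTE. This is a minimal sketch that assumes WORK.TABLENAMES has a character variable tablename, that the tables live in the TESTLIB library, and that each name is short enough for the c_ prefix to fit within the 32-character data set name limit.

```sas
/* Sketch only: one PROC CONTENTS step per row of WORK.TABLENAMES.        */
/* Assumptions: variable TABLENAME is character, tables are in TESTLIB,   */
/* and the output name WORK.C_<tablename> is a hypothetical convention.   */
data _null_;
  set work.tablenames;
  length code $200;
  code = 'proc contents data=testlib.' || strip(tablename) ||
         ' out=work.c_' || strip(tablename) || ' noprint; run;';
  call execute(code);   /* queued and run after this data step finishes */
run;
```

Because each table gets its own step and its own output data set, the results for the tables that work are still available even if one table causes a problem.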
I am working in SAS Enterprise guide and have a one column SAS table that contains unique identifiers (id_list).
I want to filter another SAS table to contain only observations that can be found in id_list.
My code so far is:
proc sql noprint;
CREATE TABLE test AS
SELECT *
FROM data_sample
WHERE id IN id_list
quit;
This code gives me the following errors:
Error 22-322: Syntax error, expecting one of the following: (, SELECT.
What am I doing wrong?
Thanks in advance for the help.
You can't just give it the table name. You need to write a subquery that says which variable to read from ID_LIST.
proc sql;
CREATE TABLE test AS
SELECT *
FROM data_sample
WHERE id IN (select id from id_list)
;
quit;
You could use a join in proc sql, but it is probably simpler to use a merge in a data step with the in= data set option.
data want;
merge oneColData(in = A) otherData(in = B);
by id;
if A and B;
run;
You merge the two datasets together, and then with if A and B you keep only the observations of the other dataset whose ID also appears in the single-column dataset. For this to work, the key variable (called id here) must exist in both datasets, and both datasets must be sorted by it.
The problem with using a data step instead of PROC SQL is that the data sets must be sorted on the variable used for the merge. If that is not already the case, the complete data set must be sorted first.
If I have a very large SAS data set that is not sorted on the merge variable, I have to sort it first, which can take quite some time. If I use the subquery in PROC SQL, I can read the data set selectively, so no sort is needed.
My bet is that PROC SQL is much faster for large data sets from which you want only a small subset.
I have the following problem. We have several streams in Enterprise Miner and we would like to be able to tell how long each run took. I have tried to create a macro that saves the start and end time/date, but the problem is that global macro variables defined in one node are no longer visible in a subsequent node (so they are global only within a node, not between nodes). How do people usually solve this problem? Any ideas or suggestions?
Thanks, Umberto
Just write timestamps out to the log (EM should produce a global log in the same way that EG and DI do).
Either use:
data _null_;
datetime = datetime();
put datetime= datetime20.;
run;
or macro language:
%put EM node started at %sysfunc(time(),timeampm.) on %sysfunc(date(),worddate.).;
With a highly customised message like this, you can then read the log back into SAS and search for those strings using a regex.
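Reading the log back in could look roughly like this. It is a sketch under assumptions: the log file path is hypothetical, and the pattern simply matches the fixed start of the %put message above.

```sas
/* Sketch only: pull the timestamp lines out of a saved log file.        */
/* The path below is hypothetical; point it at wherever the log is kept. */
data node_times;
  infile '/path/to/em_run.log' truncover;
  input line $char256.;
  /* Keep only the lines written by the custom %put message */
  if prxmatch('/EM node started at/', line);
run;
```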
Solution 2:
Another option is to create a table in a library that is visible from both EM and EG, for example, and run SQL inserts at the beginning and end of your process.
proc sql;
create table EM_logger
(jobcode char(100),
timestamp num informat=datetime20. format=datetime20.);
quit;
proc sql;
insert into EM_logger values('Beginning Linear Reg',%sysfunc(datetime()));
quit;
data w;
do i=1 to 10000000;
output;
end;
run;
proc sql;
insert into EM_logger values('End Linear Reg',%sysfunc(datetime()));
quit;
Table layout can be as complex as you want and as long as you can access it you can get your statistics.
Hope it helps
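Once both rows are in the table, the run time can be read off with a query along these lines. This is a sketch that assumes the EM_logger table from above holds exactly one start row and one end row for the job; the column name elapsed is made up.

```sas
/* Sketch only: elapsed time between the first and last logged row for one job. */
/* index() is used so the filter matches both the start and the end label.      */
proc sql;
  select max(timestamp) - min(timestamp) as elapsed format=time12.2
  from EM_logger
  where index(jobcode, 'Linear Reg') > 0;
quit;
```

SAS datetime values are counted in seconds, so the difference is a number of seconds and the time format displays it as hours, minutes and seconds.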
I need to run a procedure on a small subset (e.g. 100 rows) of a very big table, just to test the syntax and output. I have been running the following code for a while and it is still running, so I wonder whether it is doing something else. What is the right way to do this?
Proc sql inobs = 100;
select
Var1,
sum(Var2) as VarSum
from BigTable
Group by
Var1;
Quit;
What you're doing is fine (limiting the maximum number of records taken from any table to 100), but there are a few alternatives. To avoid any execution at all, use the noexec option:
proc sql noexec;
select * from sashelp.class;
quit;
To restrict the obs from a specific dataset, you can use the data set obs option, e.g.
proc sql;
select * from sashelp.class(obs = 5);
quit;
To get a better idea of what SAS is doing behind the scenes in terms of index usage and query planning, use the _method and _tree options (and optionally combine with inobs as above):
proc sql _method _tree inobs = 5;
create table test as select * from sashelp.class
group by sex
having age = max(age);
quit;
These produce quite verbose output which is beyond the scope of this answer to explain fully, but you can easily search for more details if you want.
For further details on debugging SQL in SAS, refer to
http://support.sas.com/documentation/cdl/en/sqlproc/62086/HTML/default/viewer.htm#a001360938.htm
I have a SAS dataset with numeric variables to, from, and weight. Some of the observations have value 0 for weight. I need all the weight values to be positive, so I wish to simply add 1 to all weight values.
How can I do that using Proc SQL?
I have tried the following, but it doesn't work:
proc sql;
update mylib.mydata
set weight=weight+1;
quit;
The error is:
ERROR: A CURRENT-OF-CURSOR operation cannot be initiated because
the column "weight" cannot be used to uniquely identify a row
because of its data type.
Also, mylib refers to a Greenplum appliance. This might be the problem...
If you have the database permissions to update that table, you might want to use the SAS/Access pass-through facility. You will need to know the correct syntax for this to work. Here is a non-working example:
proc sql;
connect to greenplm as dbcon
(server=greenplum04 db=sample port=5432 user=gpusr1 password=gppwd1);
execute (
/* Native code goes here */
update sample.mydata
set weight=weight+1
) by dbcon;
quit;
The connection options would be the same as those used on the LIBNAME statement that defined your "mylib" libref.
However, if you are really trying to create a SAS dataset (not update the real table), you can do that with a simple data step:
data mydata;
set mylib.mydata;
weight = weight + 1;
run;
That will create a copy of the table that can be used with other SAS procedures.
Check out this note at progress.com. You probably need to add UPDATE_MULT_ROWS=YES to your library definition.
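On the LIBNAME statement that might look something like the following. This is a sketch only: the connection values are copied from the pass-through example above and are placeholders for your own, and whether the option resolves the error depends on your SAS/ACCESS engine and driver.

```sas
/* Sketch only: connection values are placeholders taken from the earlier example. */
libname mylib greenplm server=greenplum04 db=sample port=5432
        user=gpusr1 password=gppwd1
        update_mult_rows=yes;   /* allow an UPDATE to affect more than one row */
```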