I am trying to write a PROC SQL query in SAS to determine the maximum of many columns whose names start with particular letters (say RF*). The existing PROC MEANS step that I have looks like this:
proc means data = input_table nway noprint missing;
var age x y z RF: ST: ;
class a b c;
output out = output_table (drop = _type_ _freq_) max=;
run;
Here RF: refers to all columns starting with RF, and likewise for ST:. I was wondering if there is something similar in PROC SQL that I can use?
Thanks!
Dynamic SQL is indeed the way to go with this, if you must use SQL. The good news is that you can do it all in one proc sql call using only one macro variable, e.g.:
proc sql noprint;
select catx(' ','max(',name,') as',name) into :MAX_LIST separated by ','
from dictionary.columns
where libname = 'SASHELP'
and memname = 'CLASS'
and type = 'num'
/*eq: is not available in proc sql in my version of SAS, but we can use substr to match partial variable names*/
and upcase(substr(name,1,1)) in ('A','W') /*Match all numeric vars that have names starting with A or W*/
;
create table want as select SEX, &MAX_LIST
from sashelp.class
group by SEX;
quit;
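Applied to the table in the question, the same pattern might look like the sketch below. It assumes the data lives in WORK.INPUT_TABLE with class variables a, b and c, as in the PROC MEANS step; the LIKE patterns stand in for the RF: and ST: prefix lists.

```sas
proc sql noprint;
  /* Build "max(col) as col" for every numeric column of interest.   */
  /* Assumption: the table is WORK.INPUT_TABLE; dictionary.columns   */
  /* stores libname and memname in uppercase.                        */
  select catx(' ', cats('max(', name, ')'), 'as', name)
    into :max_list separated by ', '
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'INPUT_TABLE'
    and type = 'num'
    and (upcase(name) like 'RF%'
         or upcase(name) like 'ST%'
         or upcase(name) in ('AGE', 'X', 'Y', 'Z'));

  /* One row per combination of a, b, c, like NWAY in PROC MEANS. */
  create table output_table as
  select a, b, c, &max_list
  from input_table
  group by a, b, c;
quit;
```

If no column matches the WHERE clause, &max_list is never created and the second statement fails, so it is worth checking the log after the first SELECT.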
I'm kinda new to SAS.
I have 2 datasets: set1 and set2.
I'd like to get a list of the variables that are in set2 but not in set1.
I know I can easily see them by running proc compare with listvar,
however, I want to copy and paste the whole list of differing variables instead of copying them one by one from the generated report.
I want either a macro variable containing a list of all differing variables separated by spaces, or a plain-text printout of the variables that I can easily copy.
proc contents data=set1 out=cols1;
proc contents data=set2 out=cols2;
data common;
merge cols1 (in=a) cols2 (in=b);
by name;
if not a and b;
keep name;
run;
proc sql;
select name into :commoncols separated by ','
from work.common;
quit;
Get the list of variable names and then compare the lists.
Conceptually the simplest way to see what is in a dataset is to use proc contents.
proc contents data=set1 noprint out=content1 ; run;
proc contents data=set2 noprint out=content2 ; run;
Now you just need to find the names that are in one and not the other.
An easy way is with PROC SQL set operations.
proc sql ;
create table in1_not_in2 as
select name from content1
where upcase(name) not in (select upcase(name) from content2)
;
create table in2_not_in1 as
select name from content2
where upcase(name) not in (select upcase(name) from content1)
;
quit;
You could also push the lists into macro variables instead of datasets.
proc sql noprint ;
select name
into :in1_not_in2 separated by ' '
from content1
where upcase(name) not in (select upcase(name) from content2)
;
select name
into :in2_not_in1 separated by ' '
from content2
where upcase(name) not in (select upcase(name) from content1)
;
quit;
Then you could use the macro variables to generate other code.
data both;
set set1(drop=&in1_not_in2) set2(drop=&in2_not_in1) ;
run;
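The comparisons above are written as NOT IN subqueries. As a sketch of the same idea with an explicit set operator, EXCEPT can be used in an inline view. This assumes the content1 and content2 datasets created by the proc contents steps above; the names come back in uppercase because the comparison is done on upcase(name).

```sas
proc sql noprint;
  /* Names present in set2 but absent from set1, compared case-insensitively. */
  select name
    into :in2_not_in1 separated by ' '
  from (select upcase(name) as name from content2
        except
        select upcase(name) as name from content1);
quit;

/* Write the list to the log so it can be copied in one go. */
%put in2_not_in1=&in2_not_in1;
```

If the two datasets have identical variable names, the query returns no rows and the macro variable is not created, so the %put would issue a warning about an unresolved reference.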
Issue:
I'm looking to reverse the order of all columns in a sas dataset. Should I achieve this by first transposing and then using a loop to reverse the order of the columns? This is my logic...
Step One:
data pre_transpose;
set sashelp.class;
*set &&dataset&i.. ;
_row_ + 1; * Unique identifier ;
length _charvar_ $20; * Create 1 character variable ;
run;
Step Two: Do I Reverse Columns Here?
proc transpose data = pre_transpose out = middle (where = (lowcase(_name_) ne '_row_'));
by _row_;
var _all_;
run;
EDIT:
I have attempted this:
/* use proc sql to create a macro variable for column names */
proc sql noprint;
select varnum, nliteral(name)
into :varlist, :varlist separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'all_character'
order by varnum desc;
quit;
/* Use retain to maintain format */
data reverse_columns;
retain &varlist.;
set all_character;
run;
But I did not achieve the results I was looking for - the column order is not reversed.
You just need to get the list of variable names. One way is to use the metadata. So if your dataset is member HAVE in libref WORK, then you could use this to get the list of variable names into a single macro variable.
proc sql noprint;
select varnum , nliteral(name)
into :varlist, :varlist separated by ' '
from dictionary.columns
where libname='WORK' and memname='HAVE'
order by varnum desc
;
quit;
You could then use the macro variable in a data step like this.
data want ;
retain &varlist ;
set have ;
run;
Note that the value of libname and memname in DICTIONARY.COLUMNS is in uppercase only.
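Applied to the attempt in the question, a minimal sketch might look like this. It assumes the dataset really is WORK.ALL_CHARACTER; the only change from the original attempt is that the member name is uppercased before the comparison.

```sas
proc sql noprint;
  /* varnum is selected only so that it can drive the ordering;      */
  /* the second INTO target overwrites the first with the name list. */
  select varnum, nliteral(name)
    into :varlist, :varlist separated by ' '
  from dictionary.columns
  where libname = 'WORK'
    and memname = upcase('all_character')
  order by varnum desc;
quit;

data reverse_columns;
  /* RETAIN before SET fixes the variable order in the output dataset. */
  retain &varlist.;
  set all_character;
run;
```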
So I have a rather interesting problem. I am trying to insert the current date in specific formats and styles, but for some reason it fails. I know it's not a formatting issue, but I don't know how to fix it. A data step solution is welcome as well. Here's what works:
proc sql;
create table work.test
(test_Id char(50), test_Name char(50), cur_Mo char(1), cur_Qtr char(1), entered_Date char(8));
insert into work.test
values('201703','2017 Mar','0','0','24APR17')
values('201704','2017 Apr','0','0','24APR17')
values('201706','2017 Jun','1','0','23JUN17');
quit;
Here's what doesn't:
proc sql;
insert into work.test
values(catx('',put(year(today()),4.),case when month(today())< 10 then catx('','0',put(month(today()),2.)) else put(month(today()),2.)end) ,catx(' ',[put(year(today()),4.),put(today(),monname3.))],'1','0',put(today(),date7.));
quit;
You can use the %SYSFUNC() macro function to call most other SAS functions in macro code. So to generate today's date in DATE7. format you could use:
insert into work.test (entered_Date)
values("%sysfunc(date(),date7.)")
;
The way I'd probably do it is to use a data step to make a dataset that you would insert, and then insert that dataset.
You can use insert into (...) select (...) from (...) syntax in SAS, and the data step is much more flexible as to allowing you to define columns.
For example:
proc sql;
create table class like sashelp.class;
quit;
proc sql;
insert into class
select * from sashelp.class;
quit;
Or you can specify only certain variables:
proc sql;
insert into class (name, age)
select name, age from sashelp.class;
quit;
Or build the row in a data step and insert that dataset:
data to_insert;
name= 'Wilma';
sex = 'F';
age = 29;
height = 61.2;
weight = 95.3;
run;
proc sql;
insert into class
select * from to_insert;
quit;
Just make sure you either explicitly list the variables to insert/select, or you have the order exactly right (it matches up by position if you use * like I do above).
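For the table in the question, the data step approach could be sketched as follows. The dataset name new_row is made up for illustration, and the column lengths are copied from the CREATE TABLE statement in the question so that the columns line up by position.

```sas
data new_row;
  /* Same column order and lengths as work.test. */
  length test_Id test_Name $50 cur_Mo cur_Qtr $1 entered_Date $8;
  test_Id      = put(today(), yymmn6.);                          /* yyyymm, zero-padded month */
  test_Name    = catx(' ', year(today()), put(today(), monname3.)); /* year, then month name   */
  cur_Mo       = '1';
  cur_Qtr      = '0';
  entered_Date = put(today(), date7.);                           /* ddMMMyy                   */
run;

proc sql;
  insert into work.test
  select * from new_row;
quit;
```

The YYMMN6. format pads the month with a leading zero, which removes the need for the CASE expression in the original attempt.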
I need to run a macro that does a transpose for many variables (and creates a table for each one), orders the columns names, which are numeric, but also adds as a prefix the variable's name (which is a string).
I have a macro in SAS to perform a transpose with different variables as var in the transpose. The code is:
%macro transponer(var);
proc transpose data=labo2.A_svm_200711_200806
out=labo2.D_tr_&var.0;
var &var;
id mes;
by cid;
run;
/*......more code.....*/
select cats(name, '=', &var, name)
into :prefijolista
separated by ' '
from dictionary.columns
where libname='LABO2' and memname= cats('D_TR_',upcase(&var))
and name like '_20%';
quit;
%put &prefijolista;
%mend;
Since mes is numeric I wanted to order the variable, that's why I didn't introduce the "prefix &var" in the proc transpose but instead I did it after the retain (that was useful to order the columns).
The problem starts when I try to introduce the prefix (after the ordering).
Since one of the variables' name is for example "monto", I get the following error (because it is the var variable in the transpose and it's not a column name in the transposed table):
The following columns were not found in the contributing tables:
monto.
My next step would be:
proc datasets library=labo2;
modify D_tr_&var.0;
rename &prefijolista;
quit;
But I can't do it until I get the previous step working.
So I don't know how to order the columns after the transpose and also add the prefix.
How can I solve this?
Thanks!
You need to rename the columns using something like PROC DATASETS.
proc datasets lib=work nolist;
modify myDataSet;
rename old_col_name = new_col_name;
run;
quit;
A documentation example is available in the Base SAS guide under the doc for PROC DATASETS. It is available online at: http://support.sas.com/documentation/cdl/en/proc/67327/HTML/default/viewer.htm#n0mfav25learpan1lerk79jsp30n.htm
The problem was that &var inside the cats function, when used inside a macro, has to be enclosed in double quotes: "&var".
Also you could use
%sysfunc(cats(D_TR_, &a))
So finally the code will remain like:
%let a = %upcase(&var);
%put &a;
%let b=%sysfunc(cats(D_TR_,&a));
%put &b;
proc sql;
select cats(name, '=', "&var" , name)
into :prefijolista
separated by ' '
from dictionary.columns
where libname='LABO2' and memname= "&b"
and name like '_20%';
quit;
%put &prefijolista;
%put "&b";
PROC datasets library=LABO2;
modify &b;
rename &prefijolista;
quit;
%put "ult" &b;
Not very straightforward, but worked. :)
I have a table with postings by category (a number) that I transposed. I got a table with each column name as _number for example _16, _881, _853 etc. (they aren't in order).
I need to sum all of them in proc sql, but I don't want to create the variable in a data step, and I don't want to write out all of the column names either. I tried this, but it doesn't work:
proc sql;
select sum(_815-_16) as nnl
from craw.xxxx;
quit;
I tried going from the first number to the last, and also from the number in the first position to the one in the last position. Either way it gives me a number that is not correct.
Any ideas?
Thanks!
You can't use variable lists in SQL, so _: and var1-var6 and var1--var8 don't work.
The easiest way to do this is a data step view.
proc sort data=sashelp.class out=class;
by sex;
run;
*Make transposed dataset with similar looking names (dropping the character _NAME_ variable so that _: matches only numeric columns);
proc transpose data=class out=transposed(drop=_name_);
by sex;
id height;
var height;
run;
*Make view;
data transpose_forsql/view=transpose_forsql;
set transposed;
sumvar = sum(of _:); *I confirmed this does not include _N_ for some reason - not sure why!;
run;
proc sql;
select sum(sumvar) from transpose_Forsql;
quit;
I have no documentation to support this, but from my experience I believe SAS will assume that any sum() call in SQL is the SQL aggregate function unless it has reason to believe otherwise.
The only way I can see for SAS to differentiate between the two is by the way arguments are passed in. In the example below, the inner sum() has 3 arguments, so SAS treats it as the SAS sum() function (the SQL aggregate function only allows a single argument). The result of the SAS function is then passed as the single argument to the SQL aggregate sum function:
proc sql noprint;
create table test as
select sex,
sum(sum(height,weight,0)) as sum_height_and_weight
from sashelp.class
group by 1
;
quit;
Result:
proc print data=test;
run;
             sum_height_
Obs    Sex    and_weight
  1     F        1356.3
  2     M        1728.6
Also note a trick I've used in the code: passing 0 to the SAS function is an easy way to add an extra argument without changing the intended result. Depending on your data, you may want to swap the 0 for a missing value (i.e. .).
EDIT: To address the issue of unknown column names, you can create a macro variable that contains the list of column names you want to sum together:
proc sql noprint;
select name into :varlist separated by ','
from sashelp.vcolumn
where libname='SASHELP'
and memname='CLASS'
and upcase(name) like '%T' /* MATCHES HEIGHT AND WEIGHT */
;
quit;
%put &varlist;
Result:
Height,Weight
Note that you would need to change the above wildcard to match your scenario - ie. matching fields that begin with an underscore, instead of fields that end with the letter T. So your final SQL statement will look something like this:
proc sql noprint;
create table test as
select sex,
sum(sum(&varlist,0)) as sum_of_fields_ending_with_t
from sashelp.class
group by 1
;
quit;
This provides an alternate approach to Joe's answer - though I believe using the view as he suggests is a cleaner way to go.
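For the table in the question, where the transposed columns start with an underscore, the two steps might be combined as in the sketch below. It assumes the table is CRAW.XXXX as written in the question, and it uses substr rather than LIKE because an underscore is itself a single-character wildcard in a LIKE pattern.

```sas
proc sql noprint;
  /* Collect every numeric column whose name begins with an underscore. */
  select name
    into :varlist separated by ','
  from dictionary.columns
  where libname = 'CRAW'
    and memname = 'XXXX'
    and type = 'num'
    and substr(name, 1, 1) = '_';

  /* Inner sum(): SAS function across columns within a row. */
  /* Outer sum(): SQL aggregate across rows.                */
  select sum(sum(&varlist, 0)) as nnl
  from craw.xxxx;
quit;
```

The type = 'num' filter keeps character columns such as _NAME_ from PROC TRANSPOSE out of the list.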