SAS Dates - SEMIYEAR in Proc SQL - sas

The following works fine for me, as I want to select the YYYYQ value for a column to show the year and quarter:
proc sql;
select YYQ(year(datepart(betdate))
, QTR(datepart(betdate))) FORMAT=YYQN. as yearquarter
, QTR(datepart(betdate)) as semiyear
from &dsn;
quit;
How can I calculate the 'SEMIYEAR' instead of QTR? I can find refernces to it in the SAS documentation, but can't seem to get it to work. I want to show YYYYS, as it the year and the year 'half'.
Thanks.

There's not exactly a format or function for that, unfortunately. Do you need the date part of the value to persist, or just the value "20131"? (In your YYQ, for example, the underlying value is an actual date, which corresponds to the first date in the period of 2013Q1, so Jan 1; it's just displayed as 20131).
If you just want to display the value, you can do something pretty simple, like this:
proc sql;
select YYQ(year(datepart(betdate))
, QTR(datepart(betdate))) FORMAT=YYQN. as yearquarter
, floor(QTR(datepart(betdate))/2)+1 as semiyear
from test;
quit;
And append on the year if you want. However that does not maintain the actual first-day-of-the-period value. If you want to do that, you should use INTNX:
proc sql;
select YYQ(year(datepart(betdate))
, QTR(datepart(betdate))) FORMAT=YYQN. as yearquarter
, intnx('SEMIYEAR',datepart(betdate),0,'b') FORMAT=DATE9. as semiyear
from test;
quit;
That doesn't format it neatly, of course, so you would have to write your own format, unless I'm missing one that exists already; that's pretty easy though.
proc format;
value SEMIYEAR
'01JAN2013'd-'30JUN2013'd = '20131'
'01JUL2013'd-'31DEC2013'd = '20132'
;
quit;
Sadly you can't use picture formats to do this as far as I know - the documentation at least doesn't offer an option to display semiannual period. You can either do like I did above and just explicitly specify the time periods in the range, or you can write a function format; see http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#n1eyyzaux0ze5ln1k03gl338avbj.htm for examples of how to do that.
Edit: Here's an example that mostly works.
proc fcmp outlib=work.functions.smd;
function sfmt(date) $;
length snum $5;
snum = put(year(date)*10+floor(QTR(date)/2)+1,5.);
return(snum);
endsub;
run;
options cmplib=(work.functions);
proc format;
value semiyear
other=[sfmt()];
quit;
data test2;
set test;
x=put(datepart(betdate),semiyear.);
put x=;
run;
proc sql;
select YYQ(year(datepart(betdate))
, QTR(datepart(betdate))) FORMAT=YYQN. as yearquarter
, intnx('SEMIYEAR',datepart(betdate),0,'b') FORMAT=SEMIYEAR5. as semiyear
from test;
quit;
However, for some odd reason in my session at least the PROC SQL returns goofy characters instead of 20131. The data step returns the correct answer in the log. Not sure if this is a bug or if i'm doing something very slightly wrong.

Related

SAS: print the name related to the most little value

I'm a beginner in SAS and i have difficulties with this exercice:
I have a very simple table with 2 columns and three lines
I try to find the request that will return me the name of the most little people (so it must return titi)
All what I found is to return the most little size (157) but i don't want this, I want the name related to the most little value!
Could you help me please?
Larapa
A SQL having clause is a good one for this. SAS will automatically summarize the data and merge it back to the original table, giving you one a one-line table with the name of the smallest value of taille.
proc sql noprint;
create table want as
select nom
from have
having taille = min(taille)
;
quit;
Here are some other ways you can do it:
Using PROC MEANS:
proc means data=have noprint;
id nom;
output out=want
min(taille) = min_taille;
run;
Using sort and a data step to keep only the first observation:
proc sort data=have;
by taille;
run;
data want;
set have;
if(_N_ = 1);
run;

Convert SAS DATE and use in a PROC SQL

I'm having problems to DATEs in SAS Enterprise Guide 7.1 M4.
it's very very simple in SQL Server or VBA but in SAS is driving me crazy.
Problem:
For some strange reason I'm unable to make a simple select. I tried many different forms of formating and convertions but any seems to work
My Simple select returns no observations.
Description of T1.DT_DATE in proc contents
Type: Num
Len: 8
Format: DDMMYY10.
Informat: DATETIME20.
%let DATE_EXAMPLE='01JAN2019'd;
data _null_;
call symput ('CONVERTED_DATE',put(&DATE_EXAMPLE, ddmmyy10.));
run;
%put &CONVERTED_DATE;
PROC SQL;
CREATE TABLE TEST_SELECT AS
SELECT *
FROM MY_SAMPLE_DATA as T1
WHERE T1.DT_DATE = &CONVERTED_DATE
;QUIT;
Intially you are setting up the date properly but you are changing it to a different value that is not understood in where clause. See the resolutions of macrovariable for both macrovariables you have created
%put value of my earlier date value is &DATE_EXAMPLE;
value of my earlier date value is '01JAN2019'd
%put value of my current date value is &CONVERTED_DATE;
value of my current date value is 01/01/2019
change your code to use date literal that is '01JAN2019'd then your code will work. 01/01/2019 value will not make sense in where clause.
PROC SQL;
CREATE TABLE TEST_SELECT AS
SELECT *
FROM MY_SAMPLE_DATA as T1
WHERE T1.DT_DATE = &CONVERTED_DATE
;QUIT;

SAS equivalent to R’s is.element()

It’s the first time that I’ve opened sas today and I’m looking at some code a colleague wrote.
So let’s say I have some data (import) where duplicates occur but I want only those which have a unique number named VTNR.
First she looks for unique numbers:
data M.import;
set M.import;
by VTNR;
if first.VTNR=1 then unique=1;
run;
Then she creates a table with the duplicated numbers:
data M.import_dup1;
set M.import;
where unique^=1;
run;
And finally a table with all duplicates.
But here she is really hardcoding the numbers, so for example:
data M.import_dup2;
set M.import;
where VTNR in (130001292951,130100975613,130107546425,130108026864,130131307133,130134696722,130136267001,130137413257,130137839451,130138291041);
run;
I’m sure there must be a better way.
Since I’m only familiar with R I would write something like:
import_dup2 <- subset(import, is.element(import$VTNR, import_dup1$VTNR))
I guess there must be something like the $ also for sas?
To me it looks like the most direct translation of the R code
import_dup2 <- subset(import, is.element(import$VTNR, import_dup1$VTNR))
Would be to use SQL code
proc sql;
create table import_dup2 as
select * from import
where VTNR in (select VTNR from import_dup1)
;
quit;
But if your intent is to find the observations in IMPORT that have more than one observation per VTNR value there is no need to first create some other table.
data import_dup2 ;
set import;
by VTNR ;
if not (first.VTNR and last.VTNR);
run;
I would use the options in PROC SORT.
Make sure to specify an OUT= dataset otherwise you'll overwrite your original data.
/*Generate fake data with dups*/
data class;
set sashelp.class sashelp.class(obs=5);
run;
/*Create unique and dup dataset*/
proc sort data=class nouniquekey uniqueout=uniquerecs out=dups;
by name;
run;
/*Display results - for demo*/
proc print data=uniquerecs;
title 'Unique Records';
run;
proc print data=dups;
title 'Duplicate Records';
run;
Above solution can give you duplicates but not unique values. There are many possible ways to do both in SAS. Very easy to understand would be a SQL solution.
proc sql;
create table no_duplicates as
select *
from import
group by VTNR
having count(*) = 1
;
create table all_duplicates as
select *
from import
group by VTNR
having count(*) > 1
;
quit;
I would use Reeza's or Tom's solution, but for completeness, the solution most similar to R (and your preexisting code) would be three steps. Again, I wouldn't use this here, it's excess work for something you can do more easily, but the concept is helpful in other situations.
First, get the dataset of duplicates - either her method, or proc sort.
proc sort nodupkey data=have out=nodups dupout=dups;
by byvar;
run;
Then pull those into a macro list:
proc sql;
select byvar
into :duplist separated by ','
from dups;
quit;
Then you have them in &duplist. and can use them like so:
data want;
set have;
if not (byvar in &duplist.);
run;
data want;
set import;
where VTNR in import_dup1;
run;

SAS Proc SQL Trim not working?

I have a problem that seems pretty simple (probably is...) but I can't get it to work.
The variable 'name' in the dataset 'list' has a length of 20. I wish to conditionally select values into a macro variable, but often the desired value is less than the assigned length. This leaves trailing blanks at the end, which I cannot have as they disrupt future calls of the macro variable.
I've tried trim, compress, btrim, left(trim, and other solutions but nothing seems to give me what I want (which is 'Joe' with no blanks). This seems like it should be easier than it is..... Help.
data list;
length id 8 name $20;
input id name $;
cards;
1 reallylongname
2 Joe
;
run;
proc sql;
select trim(name) into :nameselected
from list
where id=2;
run;
%put ....&nameselected....;
Actually, there is an option, TRIMMED, to do what you want.
proc sql noprint;
select name into :nameselected TRIMMED
from list
where id=2;
quit;
Also, end PROC SQL with QUIT;, not RUN;.
It works if you specify a separator:
proc sql;
select trim(name) into :nameselected separated by ''
from list
where id=2;
run;

SAS sum variables using name after a proc transpose

I have a table with postings by category (a number) that I transposed. I got a table with each column name as _number for example _16, _881, _853 etc. (they aren't in order).
I need to do the sum of all of them in a proc sql, but I don't want to create the variable in a data step, and I don't want to write all of the columns names either . I tried this but doesn't work:
proc sql;
select sum(_815-_16) as nnl
from craw.xxxx;
quit;
I tried going to the first number to the last and also from the number corresponding to the first place to the one corresponding to the last place. Gives me a number that it's not correct.
Any ideas?
Thanks!
You can't use variable lists in SQL, so _: and var1-var6 and var1--var8 don't work.
The easiest way to do this is a data step view.
proc sort data=sashelp.class out=class;
by sex;
run;
*Make transposed dataset with similar looking names;
proc transpose data=class out=transposed;
by sex;
id height;
var height;
run;
*Make view;
data transpose_forsql/view=transpose_forsql;
set transposed;
sumvar = sum(of _:); *I confirmed this does not include _N_ for some reason - not sure why!;
run;
proc sql;
select sum(sumvar) from transpose_Forsql;
quit;
I have no documentation to support this but from my experience, I believe SAS will assume that any sum() statement in SQL is the sql-aggregate statement, unless it has reason to believe otherwise.
The only way I can see for SAS to differentiate between the two is by the way arguments are passed into it. In the below example you can see that the internal sum() function has 3 arguments being passed in so SAS will treat this as the SAS sum() function (as the sql-aggregate statement only allows for a single argument). The result of the SAS function is then passed in as the single parameter to the sql-aggregate sum function:
proc sql noprint;
create table test as
select sex,
sum(sum(height,weight,0)) as sum_height_and_weight
from sashelp.class
group by 1
;
quit;
Result:
proc print data=test;
run;
sum_height_
Obs Sex and_weight
1 F 1356.3
2 M 1728.6
Also note a trick I've used in the code by passing in 0 to the SAS function - this is an easy way to add an additional parameter without changing the intended result. Depending on your data, you may want to swap out the 0 for a null value (ie. .).
EDIT: To address the issue of unknown column names, you can create a macro variable that contains the list of column names you want to sum together:
proc sql noprint;
select name into :varlist separated by ','
from sashelp.vcolumn
where libname='SASHELP'
and memname='CLASS'
and upcase(name) like '%T' /* MATCHES HEIGHT AND WEIGHT */
;
quit;
%put &varlist;
Result:
Height,Weight
Note that you would need to change the above wildcard to match your scenario - ie. matching fields that begin with an underscore, instead of fields that end with the letter T. So your final SQL statement will look something like this:
proc sql noprint;
create table test as
select sex,
sum(sum(&varlist,0)) as sum_of_fields_ending_with_t
from sashelp.class
group by 1
;
quit;
This provides an alternate approach to Joe's answer - though I believe using the view as he suggests is a cleaner way to go.