Partitions in Proc SQL - sas

Can we use PARTITION BY inside PROC SQL? If not, could you please help me with equivalent logic?
Proc sql;
select user_id,
content_title,
calendar_date,
sum(watch_second)/3600 as hours_watched,
sum(hours_watched) over (partition by user_id, content_title order by calendar_date) AS cumulative_hours,
available_hours,
cumulative_hours / available_hours as pct_completed
from watch_history as a
inner join dim_user s
ON s.USER_ID = a.USER_ID
left join dim_content_meta pb
ON pb.metrics_video_id = a.metrics_video_id
inner join sascidm.coop_top50 v
on pb.hummus_show_id = v.hummus_show_id
where pb.hummus_playback_type in ('VOD')
AND pb.hummus_show_type = 'series'
order by 1,2,3;
quit;
Error:
sum(watch_second)/3600 as hours_watched,
29 sum(hours_watched) over (partition by user_id, content_title order by calendar_date) AS cumulative_hours,
____
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, ',', -, /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.
ERROR 76-322: Syntax error, statement will be ignored.

No. PROC SQL supports only a fairly limited dialect, close to the ANSI SQL of a few decades ago, and it does not support window functions such as OVER (PARTITION BY ...), which were standardized much later.
For the most part, a SAS programmer would use SAS procs rather than SQL to compute things like this; I suspect PROC TABULATE could probably do it.

It is simple to generate a cumulative sum in a DATA step: just use a SUM statement. To reset the total for each group, use the FIRST. flags to detect when a new group is starting.
Proc sql;
create table raw as
select
user_id
,content_title
,calendar_date
,sum(watch_second)/3600 as hours_watched
/*
,sum(hours_watched) over (partition by user_id, content_title order by calendar_date) AS cumulative_hours
*/
,available_hours
/*
,cumulative_hours / available_hours as pct_completed
*/
from watch_history as a
inner join dim_user s
ON s.USER_ID = a.USER_ID
left join dim_content_meta pb
ON pb.metrics_video_id = a.metrics_video_id
inner join sascidm.coop_top50 v
on pb.hummus_show_id = v.hummus_show_id
where pb.hummus_playback_type in ('VOD')
AND pb.hummus_show_type = 'series'
order by 1,2,3
;
quit;
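data want;
set raw ;
by user_id content_title calendar_date;
/* reset the running total at the start of each user_id/content_title group */
if first.content_title then cumulative_hours=0;
/* SUM statement: cumulative_hours is retained across rows; missing hours are ignored */
cumulative_hours + hours_watched;
pct_completed = cumulative_hours / available_hours;
run;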
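For readers who want to stay inside PROC SQL, a running total can also be sketched with a self-join instead of a window function. The example below is only an illustration on made-up data: the table daily, its rows, and the output name want_sql are hypothetical, and it assumes one row per user_id, content_title and calendar_date.

```sas
/* Hypothetical daily totals: one row per user, title and date */
data daily;
  input user_id content_title $ calendar_date :yymmdd10. hours_watched available_hours;
  format calendar_date yymmdd10.;
  datalines;
1 ShowA 2020-03-01 1 10
1 ShowA 2020-03-02 2 10
1 ShowA 2020-03-03 3 10
2 ShowA 2020-03-01 4 10
;
run;

proc sql;
  create table want_sql as
  select a.user_id,
         a.content_title,
         a.calendar_date,
         a.hours_watched,
         a.available_hours,
         /* sum every row of the same group dated on or before the current row */
         sum(b.hours_watched) as cumulative_hours,
         sum(b.hours_watched) / a.available_hours as pct_completed
  from daily as a
       inner join daily as b
       on  a.user_id       = b.user_id
       and a.content_title = b.content_title
       and b.calendar_date <= a.calendar_date
  group by a.user_id, a.content_title, a.calendar_date,
           a.hours_watched, a.available_hours
  order by 1, 2, 3;
quit;
```

On this sample, user 1 gets cumulative_hours of 1, 3 and 6, and user 2 gets 4. The self-join compares every row with all earlier rows of its group, so on large tables the DATA step approach is likely to be faster.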

Related

Using the result of a query as a parameter for another query

I am trying to use a query result as a parameter for another query.
As below:
PROC SQL;
SELECT mydate INTO : varmydate FROM work.MyTable WHERE codigo = 1234;
QUIT;
PROC SQL;
SELECT * FROM work.MyOtherTable
WHERE codata = &varmydate;
QUIT;
But, unfortunately, this didn't work.
In this example, the varmydate variable receives a value of type date from the first query.
96 PROC SQL;
97 SELECT codata FORMAT date9., valor FROM work.MyOtherTable WHERE codata = &varmydate;
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
NOTE: Line generated by the macro variable "VARMYDATE".
97 01MAR2020
_______
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, >=, AND, EQ, EQT, GE, GET,
GROUP, GT, GTT, HAVING, LE, LET, LT, LTT, NE, NET, OR, ORDER, ^=, |, ||, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
98 QUIT;
After executing the first query, the command below:
%PUT &varmydate;
will have the following result:
01MAR2020
But the second query returns an error.
If I use the syntax below:
PROC SQL;
SELECT * FROM work.MyOtherTable
WHERE codata = '01MAR2020'd;
QUIT;
then it will work.
You did not indicate what the log reported when the query didn't work.
The macro variable holds the text 01MAR2020, so column mydate in table MyTable is either character, storing the date9. representation of a date, or numeric with a date format that was applied when the value was stored. Had the raw numeric value been stored instead, the macro variable varmydate would have a value such as 21975.
Thus, you need to resolve the date9. representation in a SAS date literal context:
WHERE codata = "&varmydate"d;
Double quotes are needed because macro resolution will not occur within single quoted literals.
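A minimal end-to-end sketch of both options follows. The two small tables are invented stand-ins for work.MyTable and work.MyOtherTable, and the second macro variable name, varmynum, is hypothetical. The first query pair uses the date literal; the second stores the unformatted number so it can be compared directly.

```sas
/* Invented sample data standing in for the asker's tables */
data work.MyTable;
  codigo = 1234;
  mydate = '01MAR2020'd;
  format mydate date9.;
run;

data work.MyOtherTable;
  format codata date9.;
  codata = '01MAR2020'd; valor = 10; output;
  codata = '02MAR2020'd; valor = 20; output;
run;

/* Option 1: keep the formatted text and wrap it in a date literal */
proc sql noprint;
  select mydate into :varmydate
  from work.MyTable
  where codigo = 1234;
quit;

proc sql;
  select codata, valor
  from work.MyOtherTable
  where codata = "&varmydate"d;
quit;

/* Option 2: store the raw number by overriding the column format */
proc sql noprint;
  select mydate format=best12. into :varmynum
  from work.MyTable
  where codigo = 1234;
quit;

proc sql;
  select codata, valor
  from work.MyOtherTable
  where codata = &varmynum;
quit;
```

Both final queries should return only the row with valor = 10.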

Syntax error using CATX in SAS PROC SQL

I am generating a syntax error in SAS 9.4 when trying to use CATX("|", of a1-a5) in PROC SQL.
Why do the first two outputs work, but the third fails?
data test;
input a1 $ a2 $ a3 $ a4 $ a5 $;
cards;
a b c d e
f g h i j
k l m n o
p q r s t
u v w x y
;
run;
proc sql;
select CATX('|',a1,a2,a3,a4,a5) as catx from test;
quit;
data test2;
set test;
catx=CATX('|',OF a1-a5);
run;
proc print data=test2; run;
proc sql;
select CATX('|',OF a1-a5) as catx from test;
quit;
The first proc sql and the data step produce the expected "a|b|c|d|e", etc. But the third proc sql produces a syntax error pointed at the "a1":
32 proc sql;
33 select CATX('|',OF a1-a5) as catx from test;
--
22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.
Thanks
You have hit one of those walls in Proc SQL where some base SAS functions are not fully supported. As someone mentioned earlier you're better off creating a macro variable containing the columns you need for your concatenation.
Here is a quick example:
proc sql noprint;
select name into :cols separated by ','
from dictionary.columns
where libname = "WORK" and
memname = "TEST";
quit;
%put &cols;
proc sql;
select CATX('|',&cols) as catx from test;
quit;
Obviously your where clause will be more complex as your original data set may contain columns not required in the CATX expression.
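As a sketch of such a narrower WHERE clause, the example below assumes the wanted columns all start with the letter a and that anything else should be left out. The data set test_extra and its extra column b1 are hypothetical, added only to show the filter at work.

```sas
/* Hypothetical data set with one column (b1) that should be left out */
data test_extra;
  input a1 $ a2 $ a3 $ a4 $ a5 $ b1 $;
  cards;
a b c d e z
f g h i j z
;
run;

proc sql noprint;
  /* build the column list in data set order, keeping only names starting with A */
  select name into :cols separated by ','
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'TEST_EXTRA'
    and upcase(name) like 'A%'
  order by varnum;
quit;
%put &cols;

proc sql;
  select catx('|', &cols) as catx from test_extra;
quit;
```

Here &cols should resolve to a1,a2,a3,a4,a5, so b1 never reaches CATX. Note that libname and memname values in dictionary.columns are stored in upper case.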
The "OF a1-a5" variable-list syntax is valid in DATA step code only.
PROC SQL accepts only plain SQL expressions and can't be mixed with DATA step syntax, so the columns have to be listed individually.

Check if a SAS DIS parameter is null

In SAS DIS I have set date parameters on a job. I tried setting the default values using the drop down menu provided, but each time I get the error
Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, ^=, |, ||, ~=.
I therefore decided to try to check if the parameter is null before proceeding, but none of my various attempts succeeded. Is there a way to do this with user-written code? Something like
if(&date_param = .) then do;
date = today();
else do;
date = &date_param;
end;
I tried this within a macro and it did not work.
Much Gratitude.
Assuming this is similar to a standard SAS macro variable, a couple of things.
First off, a null parameter would be literally blank, not a period (that's for numeric dataset variables). In a data step you could check it like so:
if "&date_param." = " " then do;
Second, depending on context you may need to do this in macro syntax. If you're setting another parameter, you may need to do:
%if &date_param. eq %then %do;
%let date=%sysfunc(today());
%end;
%else %do;
%let date = &date_param.;
%end;
%sysfunc lets you execute a data step function in macro code.
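Put together, the check might look like the sketch below. It assumes the DIS parameter really does behave like an ordinary macro variable; the macro name resolve_date is hypothetical, and %length is used here as an alternative way to test for an empty value.

```sas
/* Hypothetical helper: falls back to today's date when the parameter is blank */
%macro resolve_date(date_param=);
  %global date;
  %if %length(&date_param) = 0 %then %do;
    /* parameter is empty: use today's date as a SAS date number */
    %let date = %sysfunc(today());
  %end;
  %else %do;
    %let date = &date_param;
  %end;
  %put NOTE: date resolved to &date;
%mend resolve_date;

%resolve_date(date_param=)       /* blank: date becomes today's date */
%resolve_date(date_param=21975)  /* supplied: date becomes 21975     */
```

Wrapping the logic in a macro keeps it usable on older releases, where %if is not allowed in open code.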

How to count distinct of the concatenation/cross of two variables in SAS Proc Sql?

I know that in Teradata or other SQL platforms you can find the distinct count of a combination of variables by doing:
select count(distinct x1||x2)
from db.table
And this gives the number of unique x1,x2 pairs.
This syntax, however, does not work in PROC SQL.
Is there any way to perform such a count in PROC SQL?
Thanks.
That syntax works perfectly fine in PROC SQL.
proc sql;
select count(distinct name||sex)
from sashelp.class;
quit;
If the fields are numeric, you must either convert them to character (using put) or use cat or one of its siblings, which happily take either numeric or character arguments.
proc sql;
select count(distinct cats(age,sex))
from sashelp.class;
quit;
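One caveat worth illustrating: concatenating without a delimiter can merge pairs that are actually different. The sketch below uses a made-up data set, pairs, in which 'ab','c' and 'a','bc' both concatenate to 'abc'. It compares cats with catx using a delimiter, and with counting the rows of a SELECT DISTINCT subquery. The delimiter '|' is assumed not to occur in the data.

```sas
/* Made-up data where plain concatenation collides */
data pairs;
  input x1 $ x2 $;
  datalines;
ab c
a bc
;
run;

proc sql;
  /* cats: both rows become 'abc'; catx keeps them apart as 'ab|c' and 'a|bc' */
  select count(distinct cats(x1, x2))      as no_delim,
         count(distinct catx('|', x1, x2)) as with_delim
  from pairs;

  /* alternative that needs no concatenation at all */
  select count(*) as distinct_pairs
  from (select distinct x1, x2 from pairs);
quit;
```

On this sample, no_delim is 1 while with_delim and distinct_pairs are both 2.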
This may be redundant, but your mention of "combination" instantly made me think of "permutation". So here is one solution that distinguishes the two. For the sample data it returns TOTAL=4, PERMUTATION=3 and COMBINATION=2:
DATA TEST;
INPUT (X1 X2) (:$8.);
CARDS;
A B
B A
C D
C D
;
PROC SQL;
SELECT COUNT(*) AS TOTAL, COUNT(DISTINCT CATS(X1,X2)) AS PERMUTATION,
COUNT(DISTINCT CATS(IFC(X1<=X2,X1,X2),IFC(X1>X2,X1,X2))) AS COMBINATION
FROM TEST;
QUIT;

issue with nested select statements in proc sql

I was given this code to run, but I keep getting errors even after making sure the left and right brackets are balanced. This is the original code, since the brackets I added seemed to be in the wrong place.
proc sql;
create table all as
select distinct a.id, a.count, b.date
from (select distinct id, count (*) as count from (select distinct id, p_id, date2 from table1) group by id) a
(select distinct id, min(date2) as date format datetime. from table1) b
where a.id=b.id;
quit;
(select distinct id, min(date2) as date format datetime. from
-------- -
22 22
202 76
3520! table1) group by id) b
ERROR 22-322: Syntax error, expecting one of the following: ), ','.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
ERROR 76-322: Syntax error, statement will be ignored.
Edit: after adding a comma I then get this error:
256 , (select id, min(date2) as date format datetime. from
256! table1) group by id ) b
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, (, *, **, +, ',', -,
'.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, EXCEPT, GE, GET,
GT, GTT, HAVING, IN, INTERSECT, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN,
OR, ORDER, OUTER, UNION, ^, ^=, |, ||, ~, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
257 Where a.id=b.id;
258 quit;
The problem is not the brackets but a missing comma (,): there is no comma between the two inline views, at the start of the 5th line.
, (select distinct id, min(date2) as date format datetime. from table1) b
EDIT: I indented the original code and added my comma fix. I don't know why you are getting this new error. I copied your original code with the comma added, tested it with dummy data, and it works fine. I guess some hidden junk character is causing the error.
data table1;
input id p_id date2 :yymmdd10.;
datalines;
1 1 2012-01-15
1 1 2012-01-15
2 1 2012-01-15
2 2 2012-01-15
4 1 2012-01-15
;;;;
run;
proc sql;
create table all as
select distinct a.id, a.count, b.date
from (select distinct id, count (*) as count
from (select distinct id, p_id, date2 from table1)
group by id
) a
, (select distinct id, min(date2) as date format datetime.
from table1
) b
where a.id=b.id;
quit;
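The same query can also be written with an explicit INNER JOIN, which makes a missing separator between the two inline views harder to overlook. This is only a sketch under assumptions. It takes the per-id minimum date to be the intent, so it adds GROUP BY id to the second view, as the log excerpts in the question suggest. It uses the yymmdd10. format because date2 in the dummy data is a date rather than a datetime. The output name all_joined is hypothetical.

```sas
/* Same dummy data as above, repeated so the example runs on its own */
data table1;
  input id p_id date2 :yymmdd10.;
  datalines;
1 1 2012-01-15
1 1 2012-01-15
2 1 2012-01-15
2 2 2012-01-15
4 1 2012-01-15
;
run;

proc sql;
  create table all_joined as
  select a.id, a.count, b.date
  from (select id, count(*) as count
          from (select distinct id, p_id, date2 from table1)
          group by id
       ) as a
       inner join
       (select id, min(date2) as date format=yymmdd10.
          from table1
          group by id
       ) as b
       on a.id = b.id;
quit;
```

With the dummy data this should give one row per id (1, 2 and 4) with counts of 1, 2 and 1.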