Truncation when using CASE in SQL statement in SAS (Enterprise Guide) - sas

I am trying to manipulate some text files in SAS Enterprise Guide and load them line by line in a character variable "text" which gets the length 1677 characters.
I can use the Tranwrd() function to create a new variable text21 on this variable and get the desired result as shown below.
But if I try to put some conditions on the execution of exactly the same Tranwrd() to form the variable text2 (as shown below) it goes wrong as the text in the variable is now truncated to around 200 characters, even though the text2 variable has the length 1800 characters:
PROC SQL;
CREATE TABLE WORK.Area_Z_Added AS
SELECT t1.Area,
t1.pedArea,
t1.Text,
/* text21 */
( tranwrd(t1.Text,'zOffset="0"',compress('zOffset="'||put(t2.Z,8.2)||'"'))) LENGTH=1800 AS text21,
/* text2 */
(case when t1.type='Area' then
tranwrd(t1.Text,'zOffset="0"',compress('zOffset="'||put(t2.Z,8.2)||'"'))
else
t1.Text
end) LENGTH=1800 AS text2,
t1.Type,
t1.id,
t1.x,
t1.y,
t2.Z
FROM WORK.VISSIM_IND t1
LEFT JOIN WORK.AREA_Z t2 ON (t1.Type = t2.Type) AND (t1.Area = t2.Area)
ORDER BY t1.id;
QUIT;
Anybody got a clue?

This is a known problem with using character functions inside a CASE statement. See this thread on SAS Communities https://communities.sas.com/t5/SAS-Programming/Truncation-when-using-CASE-in-SQL-statement/m-p/852137#M336855
Instead, reuse the result already calculated for the other variable by referencing it with the CALCULATED keyword.
PROC SQL;
CREATE TABLE WORK.Area_Z_Added AS
SELECT
t1.Area
,t1.pedArea
,t1.Text
,(tranwrd(t1.Text,'zOffset="0"',cats('zOffset="',put(t2.Z,8.2),'"')))
AS text21 length=1800
,(case when t1.type='Area'
then calculated text21
else t1.Text
end) AS text2 LENGTH=1800
,t1.Type
,t1.id
,t1.x
,t1.y
,t2.Z
FROM WORK.VISSIM_IND t1
LEFT JOIN WORK.AREA_Z t2
ON (t1.Type = t2.Type)
AND (t1.Area = t2.Area)
ORDER BY t1.id
;
QUIT;
If you don't need the extra TEXT21 variable then use the DROP= dataset option to remove it.
CREATE TABLE WORK.Area_Z_Added(drop=text21) AS ....

Related

Delete all observations starting with a list of values from database (SAS)

I am trying to find the best way to do this:
I want to delete all observations where a character variable STARTS with one of several possible strings, such as:
"Subtotal" "Including:"
So if the value starts with any of these strings (or many others that I didn't list here), the observation should be deleted from the dataset.
The best solution would be a macro variable containing all the values, but I don't know how to handle that (with %let list = Subtotal Including: the values are treated as variable names rather than as string values).
I did this:
data a ; set b ;
if findw(product,"Subtotal") then delete ;
if findw(product,"Including:") then delete;
...
...
I would appreciate any suggestions. Thanks!
First figure out what SAS code you want. Then you can begin to worry about how to use macro logic or macro variables.
Do you just want to exclude the strings that start with the values?
data want ;
set have ;
where product not in: ("Subtotal" "Including");
run;
Or do you want to subset based on the first "word" in the string variable?
where scan(product,1) not in ("Subtotal" "Including");
Or perhaps case insensitive?
where lowcase(scan(product,1)) not in ("subtotal" "including");
Now if the list of values is small enough (less than 64K bytes) then you could put the list into a macro variable.
%let list="Subtotal" "Including";
And then later use the macro variable to generate the WHERE statement.
where product not in: (&list);
You could even generate the macro variable from a dataset of prefix values.
proc sql noprint;
select quote(trim(prefix)) into :list separated by ' '
from prefixes
;
quit;
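Putting the pieces above together, a minimal sketch could look like the following. It assumes a hypothetical lookup dataset PREFIXES with a character variable PREFIX, and an input dataset HAVE with the variable PRODUCT; the dataset names and the prefix values are only illustrative.

```sas
/* Hypothetical lookup table: one prefix per observation */
data prefixes;
  length prefix $20;
  input prefix $;
  datalines;
Subtotal
Including:
;

/* Build a space-separated list of quoted prefixes in macro variable LIST */
proc sql noprint;
  select quote(trim(prefix)) into :list separated by ' '
  from prefixes
  ;
quit;

/* Keep only observations whose PRODUCT does not start with a listed prefix. */
/* HAVE is assumed to exist and to contain the character variable PRODUCT.   */
data want;
  set have;
  where product not in: (&list);
run;
```

Maintaining the prefixes in a dataset keeps the list in one place, and the WHERE statement does not need to change when values are added.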

Can you reference an aggregate function on a temporary row in an insert statement within a stored procedure in postgresql?

I am writing a postgres stored procedure that loops through the rows returned from a select statement. For each row it loops through, it inserts values from that select statement into a new table. One of the values I need to insert into the second table is the average of a column. However, when I call the stored procedure, I get an error that the temporary row has no attribute for the actual column that I am averaging. See stored procedure and error below.
Stored Procedure:
create or replace procedure sendToDataset(sentence int)
as $$
declare temprow peoplereviews%rowtype;
BEGIN
FOR temprow IN
select rulereviewid, avg(rulereview)
from peoplereviews
where sentenceid = sentence
group by rulereviewid
loop
insert into TrainingDataSet(sentenceId, sentence, ruleCorrectId, ruleCorrect, dateAdded)
values(sentence, getSentenceFromID(sentence), tempRow.rulereviewid, tempRow.avg(rulereview), current_timestamp);
END LOOP;
END
$$
LANGUAGE plpgsql;
Error:
ERROR: column "rulereview" does not exist
LINE 2: ...omID(sentence), tempRow.rulereviewid, tempRow.avg(rulereview...
^
QUERY: insert into TrainingDataSet(sentenceId, sentence, ruleCorrectId, ruleCorrect, dateAdded)
values(sentence, getSentenceFromID(sentence), tempRow.rulereviewid, tempRow.avg(rulereview), current_timestamp)
CONTEXT: PL/pgSQL function sendtodataset(integer) line 11 at SQL statement
SQL state: 42703
Basically, I am wondering if it's possible to use that aggregate function in the insert statement or not and if not, if there is another way around it.
You don't need to use a slow and inefficient loop for this:
insert into TrainingDataSet(sentenceId, sentence, ruleCorrectId, ruleCorrect, dateAdded)
select sentence, getSentenceFromID(sentence), rulereviewid, avg(rulereview), current_timestamp
from peoplereviews
where sentenceid = sentence
group by rulereviewid;
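To answer the original question: you need to provide a proper alias for the aggregate and reference that alias. For the alias to be visible, temprow also has to be declared as record rather than peoplereviews%rowtype, because a %rowtype variable only has the table's own columns as fields: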
FOR temprow IN
select rulereviewid, avg(rulereview) as avg_views
from peoplereviews
where sentenceid = sentence
group by rulereviewid
loop
insert into TrainingDataSet(sentenceId, sentence, ruleCorrectId, ruleCorrect, dateAdded)
values(sentence, getSentenceFromID(sentence), tempRow.rulereviewid,
tempRow.avg_views, current_timestamp);
END LOOP;

Remove single quotes in list of values in macro variable

I have a project with multiple programs. Each program has a proc SQL statement which will use the same list of values for a condition in the WHERE clause; however, the column type of one database table needed is a character type while the column type of the other is numeric.
So I have a list of "Client ID" values I'd like to put into a macro variable as these IDs can change, and I would like to change them once in the variable instead of in multiple programs.
For example, I have this macro variable set up like so and it works in the proc SQL which queries the character column:
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');
Proc SQL part:
...IN &CLNT_ID_STR.
I would like to create another macro variable, say CLNT_ID_NUM, which takes the first variable (CLNT_ID_STR) but removes the quotes.
Desired output: (179966, 200829, 201104, 211828, 264138)
Proc SQL part: ...IN &CLNT_ID_NUM.
I've tried using the sysfunc, dequote and translate functions but have not figured it out.
TRANSLATE doesn't seem to want to allow a null string as the replacement.
Below uses TRANSTRN, which has no problem translating single quote into null:
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');
%let want=%sysfunc(transtrn(&clnt_id_str,%str(%'),%str())) ;
%put &want ;
(179966, 200829, 201104, 211828, 264138)
It uses the macro quoting function %str() to mask the meaning of a single quote.
Three other ways to remove single quotes are COMPRESS, TRANSLATE and PRXCHANGE:
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');
%let id_list_1 = %sysfunc(compress (&CLNT_ID_STR, %str(%')));
%let id_list_2 = %sysfunc(translate(&CLNT_ID_STR, %str( ), %str(%')));
%let id_list_3 = %sysfunc(prxchange(%str(s/%'//), -1, &CLNT_ID_STR));
%put &=id_list_1;
%put &=id_list_2;
%put &=id_list_3;
----- LOG -----
ID_LIST_1=(179966, 200829, 201104, 211828, 264138)
ID_LIST_2=( 179966 , 200829 , 201104 , 211828 , 264138 )
ID_LIST_3=(179966, 200829, 201104, 211828, 264138)
It doesn't matter that TRANSLATE replaces each ' with a single blank, because the list is interpreted in a numeric context where the extra blanks are ignored.

SAS format char

First I created this table:
data rmlib.tableXML;
input XMLCol1 $ 1-10 XMLCol2 $ 11-20 XMLCol3 $ 21-30 XMLCol4 $ 31-40 XMLCol5 $ 41-50 XMLCol6 $ 51-60;
datalines;
| AAAAA A||AABAAAAA|| BAAAAA|| AAAAAA||AAAAAAA ||AAAA |
;
run;
I want to clean, concatenate and export the values. I have written the following code:
data rmlib.tableXML_LARGO;
file CleanXML lrecl=90000;
set rmlib.tableXML;
array XMLCol{6} ;
array bits{6};
array sqlvars{6};
do i = 1 to 6;
*bits{i}=%largo(XMLCol{i})-2;
%let bits =input(%largo(XMLCol{i})-2,comma16.5);
sqlvars{i} = substr(XMLCol{i},2,&bits.);
put sqlvars{i} &char10.. #;
end;
run;
The macro largo counts how many characters I have:
%macro largo(num);
length(put(&num.,32500.))
%mend;
Instead of the fixed char10, I would like the number (10) to be the length of each string, so that I have something like:
put sqlvars{i} &char&bits.. #;
I don't know if it is possible, but I can't get it to work.
I would like to see something like:
AAAAA AAABAAAAA BAAAAA AAAAAAAAAAAAA AAAA
It is important to me to keep the spaces (this is only an example taken from an XML extract). In addition, I will replace (for example) "B" with "XPM", so the size will change after cleaning the text; that is why I need the width in char to be flexible.
Thank you for your time
Julen
I'm still not quite sure what you want to achieve, but if you want to combine the text from multiple variables into one variable, then you could do something along these lines:
proc sql;
select name into :names separated by '||'
from dictionary.columns
where 1=1
and upcase(libname)='YOURLIBNAME'
and upcase(memname)='YOURTABLENAME';
quit;
data work.testing;
length resultvar $ 32000;
set YOURLIBNAME.YOURTABLENAME;
resultvar = &names;
resultvar2 = compress(resultvar,'|');
run;
I wasn't able to test this, but it should work if you replace YOURLIBNAME and YOURTABLENAME with your own library and table names. I'm not 100% sure that compress will preserve the spaces in the text, but I think it should.
The format $VARYING. <length-variable> is a good candidate for solving this output problem.
The example below presumes a number of variables whose values are bounded by vertical bars, and writes the concatenation of the values to a file without the bounding bars.
data have;
file "c:\temp\want.txt" lrecl=9000;
length xmlcol1-xmlcol6 $100;
array values xmlcol1-xmlcol6 ;
xmlcol1 = '| A |';
xmlcol2 = '|A BB|';
xmlcol3 = '|A BB|';
xmlcol4 = '|A BBXC|';
xmlcol5 = '|DD |';
xmlcol6 = '| ZZZ |';
do index = 1 to dim(values);
value = substr(values[index], 2); * ignore presumed opening vertical bar;
value_length = length(value)-1; * length with still presumed closing vertical bar excluded;
put value $varying. value_length @; * send to file the value excluding the presumed closing vertical bar, holding the line with a trailing at-sign;
end;
run;
You have some coding errors that make it difficult to understand what you want to do.
Your %largo() macro doesn't make any sense. There is no 32500. format. The only reason it would run in your code is that you are applying the format to a character variable instead of a number, so SAS will automatically convert it to use $32500. instead.
The %LET statement that you have hidden in the middle of your data step will execute BEFORE the data step runs. So it would be less confusing to move it before the data step.
So once the call to %largo() is expanded, your macro variable BITS will contain this text.
%let bits =input(length(put(XMLCol{i},32500.))-2,comma16.5);
Which you then use inside a line of code. So that line will end up being this SAS code.
sqlvars{i} = substr(XMLCol{i},2,input(length(put(XMLCol{i},$32500.))-2,comma16.5));
Which seems to me to be a really roundabout way to do this:
sqlvars{i} = substr(XMLCol{i},2,length(XMLCol{i})-2);
Since SAS stores character variables as fixed length, it will pad the stored value with blanks. So you need to remember the actual length so that you can use it later when you write out the value. One way is to create another array of numeric variables in which to store the lengths.
sqllen{i} = length(XMLCol{i})-2;
sqlvars{i} = substr(XMLCol{i},2,sqllen{i});
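As a rough sketch of how the stored lengths could then drive the output, the step below combines the sqllen array with the $VARYING. format from the other answer. It assumes the fileref CleanXML has already been defined with a FILENAME statement, that every value is bounded by an opening and a closing vertical bar, and that the cleaned values fit in 10 characters; the scalar variable len is introduced here only for illustration.

```sas
data _null_;
  set rmlib.tableXML;
  file CleanXML lrecl=90000;      /* CleanXML: fileref assumed to be defined earlier */
  array XMLCol{6};                /* existing character variables XMLCol1-XMLCol6 */
  array sqlvars{6} $ 10;          /* cleaned values; width 10 is an assumption */
  array sqllen{6};                /* length of each cleaned value */
  do i = 1 to 6;
    sqllen{i}  = length(XMLCol{i}) - 2;            /* drop the two bounding bars */
    sqlvars{i} = substr(XMLCol{i}, 2, sqllen{i});
    len = sqllen{i};                               /* scalar copy for $VARYING. */
    put sqlvars{i} $varying. len @;                /* trailing @ keeps writing on the same line */
  end;
run;
```

If the text is modified after cleaning (for example replacing "B" with "XPM"), the stored length would need to be recomputed before the PUT statement, and the width of sqlvars increased to fit the longer values.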

SAS &SYSERRORTEXT variable removing quote to use in SQL

I'm trying to use SAS system variables to track batch program execution (SYSERRORTEXT, SYSERR, SYSCC, ...)
I want to insert those in a dataset, using PROC SQL, like this :
Table def :
PROC SQL noprint;
CREATE TABLE WORK.ERROR_&todaydt. (
TRT_NM VARCHAR(50)
,DEB_EXE INTEGER FORMAT=DATETIME.
,FIN_EXE INTEGER FORMAT=DATETIME.
,COD_ERR VARCHAR(10)
,LIB_ERR VARCHAR(255)
,MSG_ERR VARCHAR(1000)
)
;
RUN;
QUIT;
Begin execution :
PROC SQL noprint;
INSERT INTO WORK.ERROR_&todaydt. VALUES("&PGM_NAME", %sysfunc(datetime()), ., '', '', '');
RUN;
QUIT;
End execution :
PROC SQL noprint;
UPDATE WORK.ERROR_&todaydt.
SET
FIN_EXE = %sysfunc(datetime())
,COD_ERR = "&syserr"
,LIB_ERR = ''
,MSG_ERR = "&syserrortext"
WHERE TRT_NM = "&PGM_NAME"
;
RUN;
QUIT;
The problem occurs with the system variable &syserrortext., which may contain special characters, especially a single quote ('), like this.
Code example for the problem:
DATA _NULL_;
set doesnotexist;
RUN;
and so &syserrortext gives us: ERROR: Le fichier WORK.DOESNOTEXIST.DATA n'existe pas.
My update command fails in this test, so how can I remove special characters from my variable &syserrortext?
One approach, which avoids the need to remove special characters, is to simply use symget(), e.g. as follows:
,MSG_ERR = symget('syserrortext')
Make sure to quote the value. First use macro quoting in case the value contains characters that might cause trouble. Then add quotes so that the value becomes a string literal. Use the quote() function to add the quotes in case the value contains quote characters already. Use the optional second parameter so that it uses single quotes in case the value contains & or % characters.
,MSG_ERR = %sysfunc(quote(%superq(syserrortext),%str(%')))