SAS Pulling string variables from a csv file - sas

SOLVED (per Neil Neyman's comment):
&var1 is not the same as var1.
DATA local.trow;
INFILE csvfile FIRSTOBS=&i OBS=&i;
INPUT var1 $ var2 $ var3 $ var4 $;
call symput('var1',var1); *Added line;
call symput('var2',var2); *Added line;
call symput('var3',var3); *Added line;
call symput('var4',var4); *Added line;
RUN;
Adding the lines marked with "*Added line;" solved the issue.
QUESTION
Disclaimer: I am very new to SAS and have been struggling with issues in this code for a while.
In a loop, I am trying to import string variables from a CSV file, one of which I then pass to a remote server (var1), but I'm running into an issue. If I include %let var1 = 'XXE'; at the top of the code and exclude the portion where I'm pulling the variables from my csv file, remote execution works fine and I get the output I would expect.
However, if I run the code as is, it does not appear to treat the string variables as expected. For instance, the PROC PRINT statement produces the expected output (i.e. it shows the 4 variables), but the title does not show up properly; var1 appears to be skipped altogether, while i (with a value of 1) and m (with a value of 2007) are displayed. The title shows up as "Title - 1 2007". The log displays the following warning near the title line:
WARNING: Apparent symbolic reference VAR1 not resolved.
The remote submit does not work either, but instead produces the following error while highlighting &VAR1:
ERROR: Syntax error while parsing WHERE clause.
ERROR 22-322: Syntax error, expecting one of the following: a quoted string,
a numeric constant, a datetime constant, a missing value.
I'm really confused by this error because the PROC PRINT statement is able to print the variables (which do in fact visually appear to be strings). Is a "quoted string" a different type of variable?
If I explicitly declare var1 at the top of the code or manually enter 'XXE' into the WHERE clause, the remote query executes.
Could it be that I am handling the text file incorrectly? It looks like this:
XXE XXA XXB XXC
XXM XXN XXI XXP
...
My code:
LIBNAME local 'C:\...\Pulled Data\New\';
FILENAME csvfile 'C:\...\Pulled Data\New\indexes.txt';
%macro getthedata(nrows,ystart,yend); *nrows is the number of rows in the text file;
%GLOBAL var1 var2 var3 var4;
%do i=1 %to &nrows;
%do m=&ystart %to &yend;
DATA local.trow;
INFILE csvfile FIRSTOBS=&i OBS=&i;
INPUT var1 $ var2 $ var3 $ var4 $;
RUN;
PROC PRINT DATA = local.trow;
TITLE "Title - &i. &var1. &m";
var var1 var2 var3 var4;
RUN;
proc export data=local.trow
outfile="C:\...\Pulled Data\New\Indices_&i._&m..csv"
dbms=csv replace;
run;
signon username=_prompt_;
%syslput VAR1 = &var1;
rsubmit;
libname abc'server/sasdata';
data all2009;
set abc.file_2007:;
by index date time;
where index in (&VAR1) and time between '8:30:00't and '12:00:00't;
run;
endrsubmit;
%end;
%end;
%mend getthedata;
Options MPRINT;
%getthedata(1,2007,2007)

Short Answer:
&var1 is not the same as var1. Add the call symput() lines shown below to copy the data step variable values into the macro variables.
DATA local.trow;
INFILE csvfile FIRSTOBS=&i OBS=&i;
INPUT var1 $ var2 $ var3 $ var4 $;
call symput('var1',var1);
call symput('var2',var2);
call symput('var3',var3);
call symput('var4',var4);
RUN;
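A side note on the lines above: CALL SYMPUT stores the value together with any trailing blanks, and the WHERE clause in the question expects a quoted string. If the values in the text file are not already quoted, one option is to build the quotes into the macro variable. This is a minimal sketch that assumes a hard-coded value in place of the file read; the macro variable names var1_trim and var1_quoted are made up for illustration.

```sas
data _null_;
  length var1 $8;
  var1 = 'XXE';                                       /* stand-in for the value read by INPUT */
  call symputx('var1_trim', var1);                    /* SYMPUTX strips leading and trailing blanks */
  call symputx('var1_quoted', cats("'", var1, "'"));  /* wrap the value in single quotes */
run;

/* The brackets make any stray blanks visible in the log */
%put trimmed=[&var1_trim] quoted=[&var1_quoted];
```

With the quotes stored in the value, a reference such as index in (&var1_quoted) would resolve to the same text as the hard-coded 'XXE' that the question reports as working.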
Other Notes
This seems a strange way to go about it, but you said you are new to SAS, so maybe I can give you some pointers.
Create the entire dataset at once, outside the macro:
data local.trows;
length var1 var2 var3 var4 $3; *assuming vars really are only 3 chars;
infile csvfile; *this is not really a csv file, it looks space-delimited.;
*confusing to name it as such;
input var1 var2 var3 var4;
run;
I'm not getting why there's a separate output csv file for each row? Is that really what you need?
Once you have your dataset your macro can do something like:
%macro getthedata(mdataset); /* mdataset is added as a macro parameter */
data _null_;
set &mdataset end=last;
/* automatically assigning nrows based on dataset */
if last then call symputx('nrows',_n_);
run;
%do i=1 %to &nrows;
data _null_;
set &mdataset;
if &i=_n_ then do;
call symput('var1',var1);
call symput('var2',var2);
/*
etc... Doesn't seem like these really should be
globals since they change every iteration, and
don't seem needed outside of the macro?
*/
end;
run;
/** now you have your vars set for the current iteration
and proceed with your connect code **/
%end;
%mend getthedata;
It seems you are just overwriting this dataset with every iteration. Is that what you want to do? Or is there some other code/macro variables you left out for this question?
libname abc'server/sasdata';
data all2009;
set abc.file_2007:;
/* there seems to be a random colon after file_2007 on the line above, by the way */

Related

How to use filters when importing on sas

I have a very large data table in "dsv" format and I'm trying to import it into SAS. However, I don't have enough space to import the full table and then filter it (which is what I've done for smaller tables).
Is there any way to filter the data while importing it, since in the end I will only use part of the table? For example, I might want to import only the rows where Var2 has the value 103.
PS: I'm using "proc import" rather than "data - infile..." because I don't know the exact number of columns.
Var1 Var2 Var3
A10 103 Test
A02 102 Hiis
... ... ....
Thank you
You can add dataset options to the dataset listed in the OUT= option of PROC IMPORT.
Example:
filename dsv temp;
data _null_;
input (var1-var3) (:$20.);
file dsv dsd dlm='|';
put var1-var3;
cards;
Var1 Var2 Var3
A10 103 Test
A02 102 Hiis
;
proc import file=dsv dbms=csv out=want(where=(var2=102)) replace ;
delimiter='|';
run;
The result is a dataset with just one observation.
NOTE: The data set WORK.WANT has 1 observations and 3 variables.
If you don't know the name of the second variable you could always just read the header row first and put the name into a macro variable.
data _null_;
infile dsv dsd dlm='|' truncover obs=1;
input (2*name) (:$32.);
call symputx('var2',nliteral(name));
run;
proc import file=dsv dbms=csv out=want(where=(&var2=102)) replace ;
delimiter='|';
run;
You can add a WHERE= dataset option to the dataset named in the OUT= option. For example:
proc import
file = 'myfile.txt'
out = want(where=(var2=103))
...;
run;

Mixed Delimiters in Proc Export

Is there a method to make the first delimiter in an observation different to the rest? In Microsoft SQL Server Integration Services (SSIS), there is an option to set the delimiter per column. I wonder if there is a similar way to achieve this in SAS with an amendment to the below code, whereby the first delimiter would be tab instead and the rest pipe:
proc export
dbms=csv
data=mydata.dataset1
outfile="E:\OutPutFile_%sysfunc(putn("&sysdate9"d,yymmdd10.)).txt"
replace
label;
delimiter='|';
run;
For example
From:
var1|var2|var3|var4
to
var1 var2|var3|var4
...Where the large space between var1 and var2 is a tab.
Many thanks in advance.
Sounds like you just want to make a new variable that combines the first two variables with a tab between them, and then write that out using the pipe delimiter.
data fix ;
length new1 $50 ;
set have ;
new1=catx('09'x,var1,var2);
drop var1 var2 ;
run;
proc export data=fix ... delimiter='|' ...
Note that you can reference a variable in the DLM= option on the FILE statement in a data step.
data _null_;
dlm='09'x ;
file 'outfile.txt' dsd dlm=dlm ;
set have ;
put var1 @ ;
dlm='|' ;
put var2-var4 ;
run;
Or you could use the catx() trick in a data _null_ step. You also might want to use the vvalue() function to ensure formats are applied.
data _null_;
length newvar $200;
file 'outfile.txt' dsd dlm='|' ;
set have ;
newvar = catx('09'x,vvalue(var1),vvalue(var2));
put newvar var3-var4 ;
run;
Update: fixed the order of the delimiters to match the question.
Final code based on the marked answer by Tom:
data _null_;
dlm='09'x ;
file "E:\outputfile_%sysfunc(putn("&sysdate9"d,yymmdd10.)).txt" dsd dlm=dlm ;
set work.have;
put
var1 @ ;
dlm='|';
put var2 var3 var4;
run;

SAS - If-then macros in DDE

I'm outputting three datasets to Excel via DDE (set1, set2, set3). The datasets have the same variables, except that set3 has two additional variables. I've wrapped the DDE section in a macro that I call for each dataset and use "put" to write out the variables I want. I'm trying to figure out how to add the two additional variables from set3 if the macro is being called on set3. Here is my code so far:
filename out dde
'excel|sheet1!r2c2:r1000c5';
%macro write(set);
data _null_;
set &set.;
file out dlm='09'x;
put
var1
var2
var3
%if &set. = set3 %then var4 var5;
%else ;
run;
%mend write;
%write(set1);
%write(set2);
%write(set3);
The code works fine if I remove the macro %if-%then statement. Any ideas how to go about this? Thanks!
There isn't an ending semi-colon for the PUT statement, just for the %if and %else statements.
I find that it helps make the code clearer if I indent the macro code independently from the SAS code. Also, when a SAS statement takes more than one line, make sure to put the terminating semicolon on a separate line.
You can even add in some redundant macro %do; and %end; to help make it clearer which statements are macro statements and which are SAS statements. Or in this case parts of a SAS statement.
%macro write(set);
data _null_;
set &set.;
file out dlm='09'x;
put var1 var2 var3
%if &set. = set3 %then %do;
var4 var5
%end;
;
run;
%mend write;
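One way to check what the macro generates is to turn on MPRINT and write to the log instead of the DDE fileref. This is a minimal sketch, assuming a one-row toy dataset named set3 with var1 through var5; the macro name write_log is made up for illustration and is not part of the original code.

```sas
/* Toy stand-in for the real set3 */
data set3;
  var1 = 1; var2 = 2; var3 = 3; var4 = 4; var5 = 5;
run;

%macro write_log(set);
  data _null_;
    set &set.;
    /* No FILE statement, so PUT writes to the log */
    put var1 var2 var3
    %if &set. = set3 %then %do;
      var4 var5
    %end;
    ;
  run;
%mend write_log;

options mprint;   /* show the generated SAS statements in the log */
%write_log(set3)
```

The MPRINT lines in the log should show a single PUT statement that lists all five variables and ends with one semicolon, which is the shape the corrected macro is meant to produce.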

SAS Export data to create standard and comma-delimited raw data files

I'm new to SAS and am studying different ways to do the task in the subject line.
Here are the two ways I know at the moment:
Method1: file statement in data step
*DATA _NULL_ / FILE / PUT ;
data _null_;
set engappeal;
file 'C:\Users\1502911\Desktop\exportdata.txt' dlm=',';
put id $ name $ semester scoreEng;
run;
Method2: Proc Export
proc export
data = engappeal
outfile = 'C:\Users\1502911\Desktop\exportdata2.txt'
dbms = dlm;
delimiter = ',';
run;
Question:
1. Is there any alternative way to export raw data files?
2. Is it possible to also export the header using the data step in Method 1?
You can also make use of ODS
ods listing file="C:\Users\1502911\Desktop\exportdata3.txt";
proc print data=engappeal noobs;
run;
ods listing close;
You need to use the DSD option on the FILE statement to make sure that delimiters are properly quoted and missing values are not represented by spaces. Make sure you set your record length long enough, including delimiters and inserted quotes. Don't worry about setting it too long as the lines are variable length.
You can use CALL VNEXT to find and output the names. The LINK statement places the loop later in the data step, which keeps __NAME__ out of the (_ALL_) variable list.
data _null_;
set sashelp.class ;
file 'class.csv' dsd dlm=',' lrecl=1000000 ;
if _n_ eq 1 then link names;
put (_all_) (:);
return;
names:
length __name__ $32;
do while(1);
call vnext(__name__);
if upcase(__name__) eq '__NAME__' then leave;
put __name__ @;
end;
put;
return;
run;
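When the variable names are known in advance, as in the question, a simpler variant is to write the header as a literal on the first iteration. This is a sketch under that assumption; the output file name is hypothetical, and the header is typed by hand, so it has to be kept in sync with the PUT list.

```sas
data _null_;
  set engappeal;
  file 'exportdata_with_header.txt' dsd dlm=',';
  /* Write the header once, before the first data row */
  if _n_ = 1 then put 'id,name,semester,scoreEng';
  put id name semester scoreEng;
run;
```

Note that if engappeal has no observations, the step stops before the PUT statements run, so no header is written either.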

SAS replace variable with the variable's value before sending to remote server

I'm new to SAS and I'm trying to retrieve data from a remote server.
Below is the code that I am trying to execute on the remote server:
rsubmit;
libname abc'server/sasdata'; *This is where the datasets are located;
data all2009;
set abc.file_2007:;
by index date time;
where index in (var1) and time between '8:30:00't and '12:00:00't;
run;
endrsubmit;
Currently, I am trying to pass along the variable var1 which contains the string index that the "query" needs. var1 is evaluated on my local machine and contains the value 'abc'.
But, since var1 is evaluated locally, trying to refer to the variable on the remote server produces an error since var1 does not exist there.
If I run the code as follows (with the explicit value of 'abc') it runs fine:
...
set abc.file_2007:;
by index date time;
where index in ('abc') and time between '8:30:00't and '12:00:00't;
...
I need to run this "query" dynamically. Is there a way to force var1 to be replaced by its actual value (ie. 'abc') before trying to execute the code enclosed in rsubmit and endrsubmit?
UPDATED:
Entire Code (I left out the remote server specific ones, but I'm able to connect no problem):
LIBNAME local 'C:\...\Pulled Data\New\';
FILENAME csvfile 'C:\...\Pulled Data\New\indexes.txt';
%macro getthedata(nrows,ystart,yend); *nrows is the number of rows in the text file;
%GLOBAL var1 var2 var3 var4;
%do i=1 %to &nrows;
%do m=&ystart %to &yend;
DATA local.trow;
INFILE csvfile FIRSTOBS=&i OBS=&i;
INPUT var1 $ var2 $ var3 $ var4 $;
RUN;
PROC PRINT DATA = local.trow;
TITLE "Title - &i. &var1. &m";
var var1 var2 var3 var4;
RUN;
proc export data=local.trow
outfile="C:\...\Pulled Data\New\Indices_&i._&m..csv"
dbms=csv replace;
run;
signon username=_prompt_;
%syslput VAR1 = &var1;
rsubmit;
libname abc'server/sasdata';
data all2009;
set abc.file_2007:;
by index date time;
where index in (&VAR1) and time between '8:30:00't and '12:00:00't;
run;
endrsubmit;
%end;
%end;
%mend getthedata;
Options MPRINT;
%getthedata(1,2007,2007)
With this code, the PROC PRINT statement correctly prints a table showing the var variables and their values. The TITLE statement properly evaluates the i and m variables, but leaves the var1 off altogether.
The proc export statement creates the correct CSV file containing the expected values for each var variable.
As suggested, I tried to declare the var variables as GLOBAL, but that didn't seem to have any effect. The code still appears unable to properly pass the var1 variable to the remote server.
Again, the code works perfectly well if I replace &VAR1 with the actual string value of the variable.
The error I get is:
ERROR: Syntax error while parsing WHERE clause.
ERROR 22-322: Syntax error, expecting one of the following: a quoted string,
a numeric constant, a datetime constant, a missing value.
Could it be that the WHERE clause in this case cannot accept variables?
If you put the VAR1 value in a macro variable in your local session, you can pass that macro variable to the remote server using the %SYSLPUT statement.
Place it between the SIGNON and RSUBMIT statements.
Signon;
%SYSLPUT VAR1=&var1;
RSubmit;
Then &VAR1 is available in your remote session. Depending on what you are passing in &VAR1, you may need some macro quoting. Post the value if you have problem with this method.
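A quick way to confirm that the value arrived is to echo it from the remote side. This is a sketch in open code (not inside a macro), assuming a configured SAS/CONNECT session and the example value 'abc' from the question; the SIGNON options are left out.

```sas
%let var1 = 'abc';      /* local macro variable; the value includes the quotes */

signon;                 /* assumes the SAS/CONNECT session is already configured */
%syslput VAR1=&var1;    /* copy the local value to the remote session */

rsubmit;
  /* Runs on the remote server and echoes the value it received */
  %put NOTE: remote VAR1 resolves to &VAR1;
endrsubmit;
```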