SAS DDE not formatting output correctly in Excel - sas

I'm just looking to export a SAS dataset into a pre-made Excel template.
First 6 variables of my dataset (which is a .wpd file) look like:
StartDate EndDate product_code Description Leaflet Media
04-Jul-13 07-Jul-13 256554 BUTCHER BEEF 1PK (1 KGM) 54x10 3
I currently have:
options noxwait noxsync;
x '"c:\Template.xls"'; /* <--excel template to use*/
filename template dde 'excel|Leaflets!r6c1:r183c67'; /*put data in rows 6 to 183 in leaflets sheet*/
data LEAF.results; set LEAF.results;
file template ;
put StartDate EndDate product_code Description Leaflet Media
/*and the remaining 61 variables*/ ;
run;
The DDE procedure works and opens the excel sheet, but the data is not formatted correctly in excel and looks like this:
StartDate EndDate Product code Description Leaflet Media
04 July 2013 07 July 2013 256554 BUTCHER BEEF 1PK
As you can see it seems to have treated spaces as delimiters but I'm not sure of the syntax to change this
- might also be worth noting that I have 67 variables in my actual dataset so didn't want to have to informat and format them all individually.
Also, is there a way to output this dataset into my excel template and then save the template as a different filename elsewhere on my c drive?
Thanks!

After trying every DDE option under the sun I finally stumbled across LRECL.
So,
options noxwait noxsync;
x '"c:\Template.xls"'; /* <--excel template to use*/
filename template dde 'excel|Leaflets!r6c1:r183c67' notab lrecl=3000; /*put data in rows 6 to 183 in leaflets sheet*/
data LEAF.results; set LEAF.results;
file template ;
put StartDate EndDate product_code Description Leaflet Media
/*and the remaining 61 variables*/ ;
run;
I'm guessing the default record length (LRECL applies to each output line, not to each cell) was too short for a full row of 67 variables, so increasing it stops the row from being split up?
source:
http://support.sas.com/resources/papers/proceedings11/003-2011.pdf

Try changing file template; to use a tab delimiter, i.e. file template dlm='09'x;
Also, in the filename, add 'notab':
filename template dde 'excel|Leaflets!r6c1:r183c67' notab;
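Putting the two suggestions together, and touching on the question about saving under a different name, a minimal sketch might look like the following. It assumes Excel accepts its Excel 4 macro commands (here SAVE.AS) over the DDE system channel; the target path c:\Reports\Leaflets_filled.xls is a hypothetical name, not something from the original post. The sketch writes from DATA _NULL_ so that LEAF.results is not rewritten while it is being exported.

```sas
options noxwait noxsync;
x '"c:\Template.xls"';   /* open the template in Excel */

/* notab: SAS no longer turns blanks into tabs (cell breaks) */
/* lrecl: room for a full row of 67 variables */
filename template dde 'excel|Leaflets!r6c1:r183c67' notab lrecl=3000;

data _null_;
  set LEAF.results;
  file template dlm='09'x;   /* an explicit tab now separates cells */
  put StartDate EndDate product_code Description Leaflet Media;
  /* list the remaining 61 variables in the PUT statement as well */
run;

/* System channel: sends commands to Excel rather than data to cells */
filename cmds dde 'excel|system';

data _null_;
  file cmds;
  /* hypothetical target path; the template file itself is left as it was */
  put '[SAVE.AS("c:\Reports\Leaflets_filled.xls")]';
run;
```

Because blanks inside a value are no longer treated as cell breaks, a description such as BUTCHER BEEF 1PK (1 KGM) should stay in one cell.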


Changing the first row name conditionally on character interval in SAS

Consider the following data:
data GDP;
input Year $ Agriculture Industry;
datalines;
2016 195 1634
2017 220 1986
;
When exporting as a .dat file:
proc export
data = GDP
outfile = '....\GDP.dat'
dbms = TAB
replace;
run;
Then I get the following file:
However, I want the following file:
Where:
Mydata is a text I manually add.
The numbers after each name (for instance Year: 1-4) give the character interval that the values occupy. For instance, the values in the Year column run from character 1 to 4. The values in the Agriculture column run from 9 to 11, and so on.
So SAS should count the interval for the values and add it to the first row name. How to do it in SAS?
You can fudge this with labels to your variables and then add the LABEL option to PROC EXPORT.
data GDP;
input Year $ Agriculture Industry;
label Year = "Mydata, Year:1-4" Agriculture = "Agriculture:9-11";
datalines;
2016 195 1634
2017 220 1986
;
run;
proc export
data = GDP
outfile = '....\GDP.dat'
dbms = TAB
LABEL
replace;
run;
FYI - it looks like you're trying to create a fixed width file and put the specifications in the header. I'd advise against this and would either put the specifications in a separate file or include them at the top of the file instead.
Putting it in the header makes it harder for any other system to process correctly.
If you really need this for some reason, you may also want to consider using a data step to create your export instead of using PROC EXPORT.
AFAIK there is no easy way to define the specifications automatically though you could push the PROC CONTENTS output to a separate data set.
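As a rough sketch of the data step route, the step below writes the header line once and then places each value at a fixed column with @n pointers. The output location (a file in the WORK directory), the header wording, and the Industry position 17-20 are assumptions made for illustration; only Year:1-4 and Agriculture:9-11 come from the question.

```sas
data GDP;
  input Year $ Agriculture Industry;
  datalines;
2016 195 1634
2017 220 1986
;
run;

/* Hypothetical output location inside the WORK directory */
filename gdpout "%sysfunc(pathname(work))/GDP.dat";

data _null_;
  set GDP;
  file gdpout;
  /* Write the header once, before the first data row */
  if _n_ = 1 then
    put 'Mydata, Year:1-4 Agriculture:9-11 Industry:17-20';
  /* @n moves to column n, so every value starts at a fixed position */
  put @1 Year $4. @9 Agriculture 3. @17 Industry 4.;
run;
```

The positions are hard-coded here, so they have to be kept in step with the header text by hand.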

How to select a single value from a table, to use for comparison(greater than/less than)?

I am handing over some code to a colleague, which is to be run daily to generate reports.
Once every month a new cycle starts, and we have to update the code for cycle_start_date
data mtd_table;
set ytd_table;
where entry_date> '10Mar2021'd; /*different every month*/
run;
Since he'll be running them from now on, along with other reports from other teams, I don't want to bother him every month to tweak the code. So I devised this:
I run (once a month):
data shared1.cycle_start_date;
cycle_start_date='10Mar2021'd;
run;
He runs (every day):
data mtd_table;
set ytd_table;
where entry_date>/*(select cycle_start_date from shared1.cycle_start_date)*/;
run;
I'm not sure how to correctly implement this (select cycle_start_date from shared1.cycle_start_date) part, since it is from proc sql. Would appreciate help.
When you store program parameters in a data set (called control data) one use case is having later code extract the values into macro variables, at which point other code can resolve the macro variable for replacement at (automatic) step compile and run time. Two ways to extract values into macro variables are:
Proc SQL, SELECT ... INTO :<macro-variable>, and
DATA _NULL_, CALL SYMPUT(<macro-variable>, <data step expression>);
Don't forget, macro resolution replaces the macro variable with source code text. Dates in macro variables can be either the SAS date value (the text representation of a SAS date integer) or part of a date literal (the text <dd-mon-yyyy>), which is written in source as the date literal "&<macro-variable>"D when it is to be used as a date value. The date literal part is also useful when you want to show the date in human-readable form in output; for example: TITLE "cycle start: &cycle_start_date";
Control data (you)
Rebuild or edit values in data set (name it parameters to be more useful)
data shared1.parameters;
cycle_start_date = '10Mar2021'd; * stored as a SAS date value (integer);
run;
Note: Some control data layouts use a name/value organization and have one row per parameter.
Other
Extract date value as SAS date value text, and as date literal text portion and use.
proc sql noprint;
select
cycle_start_date
, cycle_start_date format=date11.
into
:cycle_start_date_value trimmed
, :cycle_start_date_literal trimmed
from
shared1.parameters
;
quit;
%put &=cycle_start_date_value;
%put &=cycle_start_date_literal;
/*
* will log the macro variable value as follows:
* CYCLE_START_DATE_VALUE=22349 and
* CYCLE_START_DATE_LITERAL=10-MAR-2021
*/
data ...
set ...;
where date >= &cycle_start_date_value; *resolve parameter as text representation of a SAS date value (integer);
...
title "Cycle starts: &cycle_start_date_literal";
proc print data=...; * title in output shows human readable part of date;
run;
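The DATA _NULL_ route mentioned above can be sketched in the same way. This version uses CALL SYMPUTX (a variant of CALL SYMPUT that strips leading and trailing blanks) and, as an assumption made only to keep the sketch self-contained, builds the control data in WORK rather than in shared1.

```sas
/* Control data; WORK is used here only so the sketch runs on its own */
data work.parameters;
  cycle_start_date = '10Mar2021'd;
run;

data _null_;
  set work.parameters;
  /* SAS date value as text, e.g. for WHERE date >= &cycle_start_date_value */
  call symputx('cycle_start_date_value', cycle_start_date);
  /* Human-readable text, e.g. for titles or "&cycle_start_date_literal"d */
  call symputx('cycle_start_date_literal', put(cycle_start_date, date11.));
run;

%put &=cycle_start_date_value;
%put &=cycle_start_date_literal;
```

Both macro variables hold the same values as in the PROC SQL version, so the rest of the daily program does not need to change.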
Another approach is to use a common source code file that is %included by others. You would edit or recreate the parameters file by whatever process you want.
parameters.sas
%let cycle_start_date = 10-Mar-2021;
use
%include 'parameters.sas';
data ...
set ...;
where date >= "&cycle_start_date"D; *resolve parameter as part of date literal;
...
title "Cycle starts: &cycle_start_date";
proc print data=...; * title in output shows human readable part of date literal;
run;
One possible solution would be to put the date from the cycle_start_date table that is in the shared library shared1 into a macro-variable date that will be used in your data step to filter the ytd_table table based on the entry_date variable.
proc sql noprint;
select cycle_start_date into :date
from shared1.cycle_start_date;
quit;
data mtd_table;
set ytd_table;
where entry_date > &date.;
run;

enter column in a dataset to an array

I have 33 different datasets with one column and all share the same column name/variable name;
net_worth
I want to load the values into arrays and use them in a datastep. But the array that I use should depend on the by groups in the datastep (country by city). There are a total of 33 datasets and 33 groups (country by city). Each dataset corresponds to exactly one by group.
here is an example what the by groups look like in the dataset: customers
UK 105 (other fields)
UK 102 (other fields)
US 291 (other fields)
US 292 (other fields)
Could I get some advice on how to go about and enter the columns in arrays and then use them in a datastep. or do you suggest to do it in another way?
%let var1 = uk105;
%let var2 = uk102;
.....
%let var33 = jk12;
data want;
set customers;
by country city;
if _n_ = 1 then do;
*set datasets and create and populate arrays*;
* use array values in calculations with fields from dataset customers, depending on which by group. if the by group is uk and city is 105 then i need to use the created array corresponding to that by group;
It is a little hard to understand what you want.
It sounds like you have one dataset named CUSTOMERS that has all of the main variables and a bunch of single-variable datasets that hold the values of NET_WORTH for a lot of different things (Countries?).
Assuming that the observations in all of the datasets are in the same order then I think you are asking for how to generate a data step like this:
data want;
set customers;
set uk105 (rename=(net_worth=uk105));
set uk102 (rename=(net_worth=uk102));
....
run;
It might be easiest to generate those SET statements with a data step.
filename code temp;
data _null_;
input name $32. ;
file code ;
put ' set ' name '(rename=(net_worth=' name '));' ;
cards;
uk105
uk102
;;;;
data want;
set customers;
%include code / source2;
run;

Unable to import .txt file in SAS using proc IMPORT

My program makes a web-service call and receives a response in XML format which I store as output.txt. When opened in notepad, the file looks like this
<OwnerInquiryResponse xmlns="http://www.fedex.com/esotservice/schema"><ResponseHeader><TimeStamp time="2018-02-01T16:09:19.319Z"/></ResponseHeader><Owner><Employee firstName="Gerald" lastName="Harris" emplnbr="108181"/><SalesAttribute type="Sales"/><Territory NodeGlobalRegion="US" SegDesc="Worldwide Sales" SegNbr="1" TTY="2-2-1-2-1-1-10"/></Owner><Delegates/><AlignmentDetail><SalesAttribute type="Sales"/><Alignments/></AlignmentDetail></OwnerInquiryResponse>
I am unable to read this file into SAS using proc IMPORT. My SAS code is below
proc import datafile="/mktg/prc203/abhee/output.txt" out=work.test2 dbms=dlm replace;
delimiter='<>"=';
getnames=yes;
run;
My log is
1 %_eg_hidenotesandsource;
5 %_eg_hidenotesandsource;
28
29 proc import datafile="/mktg/prc203/abhee/output.txt" out=work.test2 dbms=dlm replace;
30 delimiter='<>"=';
31 getnames=yes;
32 run;
NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to
WORK.PARMS.PARMS.SLIST.
Unable to sample external file, no data in first 5 records.
ERROR: Import unsuccessful. See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.09 seconds
cpu time 0.09 seconds
33
34 %_eg_hidenotesandsource;
46
47
48 %_eg_hidenotesandsource;
51
My ultimate goal is to mine Employee first name (Gerald), last name (Harris) and Employee Number (108181) from the above file and store it in the dataset (and then do this over and over again with a loop and upend the same dataset). If you can help regarding importing the entire file or just the information that I need directly, then that would help.
If you only need these three fields then a single input statement is perfectly viable, and arguably preferable to parsing xml with regex:
data want;
infile xmlfile dsd dlm = ' /';
input @"Employee" @"firstName=" firstName :$32. @"lastName=" lastName :$32. @"emplnbr=" emplnbr :8.;
run;
This uses the input file constructed in Richard's answer. The initial @"Employee" is optional but reduces the risk of picking up any fields with the same names as the desired ones that are subfields of a different top-level field.
Bonus: the same approach can also be used to import json files if you're in a similar situation.
Since you are unable to use the preferred methods of reading xml data, and you are processing a single record result from a service query the git'er done approach seems warranted.
One idea that did not pan out was to use named input.
input @'Employee' lastname= firstname= emplnbr=;
The results could not be made to strip the quotes with $QUOTE. informat nor honor infile dlm=' /'
An approach that did work was to read the single line and parse the value out using a regular expression with capture groups. PRXPARSE is used to compile a pattern, PRXMATCH to test for a match and PRXPOSN to retrieve the capture group.
* create a file to read from (represents the file from the service call capture);
options ls=max;
filename xmlfile "%sysfunc(pathname(WORK))\1-service-call-record.xml";
data have;
input;
file xmlfile;
put _infile_;
datalines;
<OwnerInquiryResponse xmlns="http://www.fedex.com/esotservice/schema"><ResponseHeader><TimeStamp time="2018-02-01T16:09:19.319Z"/></ResponseHeader><Owner><Employee firstName="Gerald" lastName="Harris" emplnbr="108181"/><SalesAttribute type="Sales"/><Territory NodeGlobalRegion="US" SegDesc="Worldwide Sales" SegNbr="1" TTY="2-2-1-2-1-1-10"/></Owner><Delegates/><AlignmentDetail><SalesAttribute type="Sales"/><Alignments/></AlignmentDetail></OwnerInquiryResponse>
run;
* read the entire line from the file and parse out the values using Perl regular expression;
data want;
infile xmlfile;
input;
rx_employee = prxparse('/employee\s+firstname="([^"]+)"\s+lastname="([^"]+)"\s+emplnbr="([^"]+)"/i');
if prxmatch(rx_employee,_infile_) then do;
firstname = prxposn(rx_employee, 1, _infile_);
lastname = prxposn(rx_employee, 2, _infile_);
emplnbr = prxposn(rx_employee, 3, _infile_);
end;
keep firstname lastname emplnbr;
run;

SAS importing multiple datasets to save name of dataset as variable

I need to access a directory with some sas datasets named all_ci, all_pd, all_vs, etc. ci would be 'care info', pd would be 'patient data' and vs would be 'vital stats.' I am reading them in as such:
data ci_all;
set DIRECTORY.all:; run;
I get a table that looks like this:
No.
16
25
20
This works in that it sets only the datasets whose names begin with all. The issue is that I need an output that looks like this:
Category No.
Patient Data 16
Vital Statistics 25
Care Info 20
Since the original all_ datasets do not have the category label, I have to manually count in which order the all_ dataset was read, and then label it. I was wondering if there was a way which saves the name of the dataset that's being read in so I can easier label them in the rows.
Use the INDSNAME= option on the SET statement. You need to copy the value to a new variable since the variable named in the INDSNAME= option is automatically dropped.
libname DIRECT 'mydirectory' ;
data ci_all;
length dsname indsname $41 ;
set DIRECT.all: indsname=indsname;
dsname=indsname;
run;
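To get from the dataset name to the category labels shown in the question, one option is to map the member name in the same step. The sketch below is self-contained under a few assumptions: the three datasets are recreated in WORK with the counts from the question, and the variable is called no because "No." is not a valid SAS name under the default naming rules.

```sas
/* Small stand-ins for the real datasets (counts taken from the question) */
data work.all_ci; no = 20; run;
data work.all_pd; no = 16; run;
data work.all_vs; no = 25; run;

data ci_all;
  length dsname $41 category $20;
  set work.all_: indsname=indsname;
  dsname = indsname;                 /* libref.member, e.g. WORK.ALL_PD */
  select (scan(dsname, 2, '.'));     /* member name after the libref */
    when ('ALL_CI') category = 'Care Info';
    when ('ALL_PD') category = 'Patient Data';
    when ('ALL_VS') category = 'Vital Statistics';
    otherwise       category = 'Unknown';
  end;
  keep category no;
run;
```

With more than a handful of datasets, a user-defined format or a small lookup table would probably be easier to maintain than the SELECT block.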