In table b, the ID column has an INTEGER format.
I use PROC APPEND, but when I check the table database.aw_1234, ID is stored as a double or float. How can I fix this?
data a (KEEP = ID ACC_NO PERIOD_DTE);
infile "/root/dirs/files." dlm=";";
ID=_n_;
format ID 8.;
input ACC_NO_VAR PERIOD_DTE $10.;
leading_zeros = 16 - length(ACC_NO_VAR);
cat = repeat('0', leading_zeros);
ACC_NO = catt(cat, ACC_NO_VAR);
run;
DATA b(KEEP = ID ACC_NO PERIOD_DTE);
RETAIN ID ACC_NO PERIOD_DTE;
SET a;
RUN;
proc delete data = database.aw_1234;
run;
proc append BASE=database.aw_1234 DATA=b FORCE;
run;
SAS has only two data types: character strings and double-precision numbers. A format is just an instruction telling SAS how to display the variable to the user, so your number was always a double.
If you are creating a table in an RDBMS, you will probably see a note in the log along the lines of "SAS formats are not translated". This means the RDBMS does not know what a SAS format is, so SAS simply writes your double as a double.
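To see that a format changes only the display and not the stored type, here is a small sketch (the data set name demo is hypothetical and not part of the question):

```sas
data demo;
    id = 42;
    format id 8.;   /* display instruction only; the value is still a double */
run;

/* PROC CONTENTS lists id as Type Num with Len 8 (bytes) and Format 8. */
proc contents data=demo;
run;
```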
To fix this, create the table in the RDBMS with the column type INTEGER. Then use SAS to delete the records from that table and append to it, rather than deleting and recreating the table.
Change your code to something like this:
proc sql noprint;
delete from database.aw_1234;
quit;
proc append base=database.aw_1234 data=b force;
run;
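If creating the table by hand in the database is not convenient, SAS/ACCESS engines generally offer the DBTYPE= data set option, which asks the DBMS for a specific column type when SAS creates the table. A minimal sketch, assuming the libref database uses an engine that supports DBTYPE= and that the DBMS accepts the type name integer (both are assumptions; type names are DBMS-specific):

```sas
/* One-time creation of an empty table with an integer ID column. */
/* 'integer' is an assumed type name; adjust it for your DBMS.    */
data database.aw_1234 (dbtype=(ID='integer'));
    set b (obs=0);   /* copy the structure only, no rows */
run;
```

After this one-time step, the delete-and-append code above can be run on every load without changing the column type.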
I'm a beginner in SAS and I am having difficulty with this exercise:
I have a very simple table with two columns and three rows.
I am trying to write a query that returns the name of the shortest person (so it should return titi).
All I have managed so far is to return the smallest height (157), but that is not what I want. I want the name associated with the smallest value!
Could you help me, please?
Larapa
A SQL HAVING clause is a good fit for this. SAS will automatically summarize the data and remerge the result with the original table, giving you a one-row table with the name that has the smallest value of taille.
proc sql noprint;
create table want as
select nom
from have
having taille = min(taille)
;
quit;
Here are some other ways you can do it:
Using PROC MEANS:
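proc means data=have noprint;
var taille;
/* MINID returns the value of nom on the row where taille is smallest. */
/* (An ID statement would instead return the maximum value of nom.) */
output out=want
min(taille) = min_taille
minid(taille(nom)) = nom;
run;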
Using sort and a data step to keep only the first observation:
proc sort data=have;
by taille;
run;
data want;
set have;
if(_N_ = 1);
run;
I have 8 tables, all with the same number and order of columns. One specific column, named ATTRIBUTE, contains data of varying length, from 4 to 25 characters. When I combine the tables with PROC SQL and UNION ALL, the length of the ATTRIBUTE column shrinks to the shortest one (4 characters).
How do I solve that, i.e. keep the full length of the data?
Example, per #Lee
data have1;
attrib name length=$10 format=$10.;
name = "Anton Short";
run;
data have2;
attrib name length=$50 format=$50.;
name = "Pippy Longstocking of Stoyville";
run;
* column attributes such as format, informat and label of the selected columns
* in the result set are 'inherited' on a first found first kept order, dependent on
* the SQL join plan (i.e. the order of the tables as coded for the query);
proc sql;
create table want as
select name from have1 union
select name from have2
;
proc contents data=want varnum;
run;
When the format is shorter than the length, any output display of longer values will appear to have been truncated at the data level.
* attributes of columns can be reset,
* (cleared so as to be dealt with in default manners),
* without rewriting the entire data set;
proc datasets nolist lib=work;
modify want;
attrib name format=; * format= removes the format of a variable;
run;
proc contents data=want varnum;
run;
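Another option is to set the column attributes explicitly in the query itself, so the result does not depend on which table comes first. A minimal sketch, assuming the widest length among the inputs is known in advance (50 here, matching have2); want2 is a hypothetical output name:

```sas
proc sql;
    create table want2 as
    /* LENGTH= and FORMAT= are PROC SQL column modifiers */
    select name length=50 format=$50. from have1
    union
    select name from have2
    ;
quit;

proc contents data=want2 varnum;
run;
```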
I am running PROC FREQ on a large amount of user-entered data. I would like to know whether I can combine the result rows based on the contents of the first column.
You appear to want to perform a frequency of the first word (or 1st scanned part of a column). Such a case will require data manipulation to reduce the longer value to the desired shortened value, in a different variable, to be frequency binned.
data have;
input;
user_entered_data = _infile_;
datalines;
Nyfaria - January
Nyfaria - Febuary
Michelangelo - January
Michelangelo - Feburary
run;
data have_for_freq;
set have;
item = scan (user_entered_data,1,' ');
run;
options nocenter;
ods noproctitle;
proc freq data=have_for_freq;
title "Freq of raw data";
table user_entered_data;
run;
proc freq data=have_for_freq;
title "Freq of raw data formatted as $4.";
table user_entered_data;
format user_entered_data $4.;
run;
proc freq data=have_for_freq;
title "Freq of raw data - item scanned out";
table item;
run;
Note: In some cases you can use a format to control the mapping of a raw value to a reported value. However, there is no format that returns the first 'word' of a value (as scan does).
My question is about the append of two different tables that are supposed to have the same name/format/type/length variables.
I am trying to create a step in my SAS program where I don't allow my program to be executed if the format/type/length of variables with the same name is not the same.
For example, when in one table I have a date in type string "dd-mm-yyyy" and in the other table I have the "yyyy-mm-dd" or "dd-mm-yyyy hh:mm:ss". After the append, our daily executions based on these input tables didn't work as expected. Sometimes the values come up as missing or out of order, since the formats are different.
I tried using PROC COMPARE, which allowed me to check which variables have differing attributes (type, length, format, informat and label).
proc compare base = SAS-data-set
compare = SAS-data-set;
run;
However, I only got a printed listing of the common variables with differing attributes, and I could not do anything with it programmatically.
I would like to know whether it is possible to get this information as a structured output table, so that I can use it as a control check.
Creating an automatic task to do it would save me a lot of time.
You can use Proc CONTENTS to get information about a data set's variables. Do that for both data sets, and then use Proc COMPARE to create a data set that reports the differences in the variable attributes.
data cars1;
set sashelp.cars (obs=10);
date = today ();
format date date9.;
cars1_only = 1;
x = 1.458; label x = "x-factor";
run;
data cars2;
length type $50;
set sashelp.cars (obs=10);
format date yymmdd10.;
cars2_only = 1;
X = 1.548; label x = "X factor to apply";
run;
proc contents noprint data=cars1 out=cars1_contents;
proc contents noprint data=cars2 out=cars2_contents;
run;
data cars1_contents;
set cars1_contents;
upName = upcase(Name);
run;
data cars2_contents;
set cars2_contents;
upName = upcase(Name);
run;
proc sort data=cars1_contents; by upName;
proc sort data=cars2_contents; by upName;
run;
proc compare noprint
base=cars1_contents
compare=cars2_contents
outall
out=cars_contents_compare (where=(_TYPE_ ne 'PERCENT'))
;
by upName;
run;
There is also an ODS table you can capture directly without having to run Proc CONTENTS, but the capture is not 'data-rific'
ods output CompareVariables=work.cars_vars;
proc compare base=cars1 compare=cars2;
run;
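To turn the comparison into a control check that stops the program, one possible sketch counts the mismatches in the two Proc CONTENTS output data sets built above and cancels the submitted code if any are found. The macro name stop_on_mismatch and the macro variable n_mismatch are hypothetical, and which attributes count as a mismatch (type, length, format name and format width here) is an assumption to adapt:

```sas
proc sql noprint;
    /* Count common variables whose attributes differ */
    select count(*) into :n_mismatch trimmed
    from cars1_contents as a
         inner join cars2_contents as b
         on a.upName = b.upName
    where a.type    ne b.type
       or a.length  ne b.length
       or a.format  ne b.format
       or a.formatl ne b.formatl;
quit;

%macro stop_on_mismatch;
    %if &n_mismatch > 0 %then %do;
        %put ERROR: &n_mismatch common variable(s) have differing attributes.;
        %abort cancel;   /* cancel the rest of the submitted code */
    %end;
%mend stop_on_mismatch;

%stop_on_mismatch
```

Placing the macro call just before the append step means the append only runs when the structures agree.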
I need to export a data set from SAS to Excel 2013 as a .csv file. However, I need the file name to be dynamic. In this instance, I need it to appear as:
in_C000000_013117_65201.csv
where the string, "in_C000000_" will remain constant, the string "013117_" will be the current day's date, and the string "65201" will be the row count of the data set itself.
Any help that you can provide would be much appreciated!
Thanks!
Here's a modified macro I wrote in the past that does almost exactly what you're asking for. If you want to replace sysdate with a date in your desired format, that's easy to do as well:
%let path = [[desired destination]];
%macro exporter(dataset);
proc sql noprint;
select count(*) into: obs
from &dataset.;
quit;
data temp;
format date mmddyy6.;
date = today();
run;
proc sql noprint;
select date format=mmddyy6. into :date_formatted
from temp;
quit;
proc export data = &dataset.
outfile = "&path.in_C000000_&date_formatted._%sysfunc(compress(&obs.)).csv"
dbms = csv replace;
run;
%mend exporter;
%exporter(your_dataset_here);
This produces files with names in the format: in_C000000_020117_50000.csv
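The same file name can also be built without the temporary data set, using macro functions only. This is a sketch under assumptions, not a replacement for the macro above: the macro name exporter2 is hypothetical, it reuses the &path macro variable defined earlier, and it relies on the NLOBS attribute, which may be unavailable (-1) for views or some external database tables:

```sas
%macro exporter2(dataset);
    %local dsid nobs rc stamp;

    /* Read the row count from the data set header */
    %let dsid = %sysfunc(open(&dataset.));
    %if &dsid. = 0 %then %do;
        %put ERROR: Could not open &dataset..;
        %return;
    %end;
    %let nobs = %sysfunc(attrn(&dsid., nlobs));
    %let rc   = %sysfunc(close(&dsid.));

    /* Today's date as mmddyy with no separators, e.g. 013117 */
    %let stamp = %sysfunc(today(), mmddyyn6.);

    proc export data = &dataset.
        outfile = "&path.in_C000000_&stamp._&nobs..csv"
        dbms = csv replace;
    run;
%mend exporter2;

%exporter2(your_dataset_here);
```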