SAS: Newcomer having trouble setting observations to missing

The code executes correctly: the log doesn't show any errors, etc.
I've tried removing the observations with multiple methods. I can manually delete observations. I am able to add to the dataset, but can't remove with code.
data genes3;
    set genes;
    if A6_A8 = 28.0507 then A6_A8 = .;
    if ND3_A8 = 0.11936 then ND3_A8 = .;
    if ND5_A8 = 0.39961 then ND5_A8 = .;
    if ND3_ND5 = 20.0195 then ND3_ND5 = .;
run;
The results show no difference in the dataset before and after running the code.
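One possible cause, offered as an assumption since the stored values are not shown: the numbers displayed in the table may be rounded by a format, so the stored value (say 28.05068...) never exactly equals the literal 28.0507 and the IF conditions are never true. A minimal sketch that compares after rounding with the ROUND function; the rounding units are guesses based on the number of decimals in the question:

```sas
data genes3;
    set genes;
    /* Round to the displayed precision before comparing (units are assumed) */
    if round(A6_A8, 0.0001)   = 28.0507 then A6_A8   = .;
    if round(ND3_A8, 0.00001) = 0.11936 then ND3_A8  = .;
    if round(ND5_A8, 0.00001) = 0.39961 then ND5_A8  = .;
    if round(ND3_ND5, 0.0001) = 20.0195 then ND3_ND5 = .;
run;
```

If the variables turn out to be character rather than numeric, this would not apply and the comparison would need quoted strings instead.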

Related

How do I get SAS to ignore missing variable names?

If a SAS DATA step references a nonexistent variable in a DROP, KEEP, or RENAME statement, it returns an error saying so and stops the DATA step.
How do I get SAS to keep going with the step when it references a non-existent variable? I assume there's an OPTION for this (?) but I can't figure out what it's called if this is the case.
(I'm dealing with yearly datasets for which variables occasionally get added or deleted from year to year.)
Try using:
options dkricond=nowarn dkrocond=nowarn;
First one is for input datasets, second one is for output datasets.
You might want to set these back to warn or error after you are done with the specific data steps where you know this will be an issue.
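A minimal sketch of that pattern, switching the checks off around one step and then restoring them. The dataset and variable names (have, want, maybe_missing_var) are made up for illustration, and the restored values are what I understand the usual defaults to be, so check your own site settings:

```sas
/* Hypothetical input that lacks the variable we try to drop */
data have;
    x = 1;
    y = 2;
run;

/* Suppress errors/warnings for missing variables in DROP/KEEP/RENAME */
options dkricond=nowarn dkrocond=nowarn;

data want;
    set have (drop=maybe_missing_var);  /* not present in HAVE, silently ignored */
run;

/* Restore stricter checking afterwards (assumed defaults) */
options dkricond=error dkrocond=warn;
```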

Prevent SAS EG from outputting every dataset in datastep

I'm new to SAS EG, I usually use BASE SAS when I actually need the program, but my company is moving heavily toward EG. I'm helping some areas with some code to get data they need on an ad-hoc basis (the code won't change though).
However, during processing, we create many temporary files that are just iterations across months. I.e., if the user wants data from 2002 - 2016, we have to pull all those libraries and then concatenate them with our results. This is due to high transactional volume; the final dataset is limited to a small number of observations. Whenever I run this program, though, SAS outputs all 183 of the datasets created by the data steps in the macro, making it very ugly, and sometimes the "Output Data" that appears isn't even output from the last data step, but from an intermediary step, making it annoying to search through for the 'final output dataset'.
Is there a way to limit the datasets written to "Output Data" so that it only shows the final dataset - so that our end user doesn't need to worry about being confused?
Above is an example - There's a ton of output data sets that I don't care to see. I just want the final, which is located (somewhere) in that list...
Version is SAS E.G. 7.1
EG will always automatically show every dataset that was created after the program ends. If you don't want it to show any intermediate tables, delete them at the very last step in your process.
In your case, it looks as if your temporary tables all share the name TRN. You can clean it up as such:
/* Start of process flow */
<program statements>;
/* End of process flow*/
proc datasets lib=work nolist nowarn nodetails;
delete TRN:;
quit;
Be careful if you do this. Make sure that all of your temporary tables follow the same prefix naming scheme, otherwise you may accidentally delete tables that you need.
Another solution is to limit the number of datasets generated, and have a user-created link to the final dataset. There's an article about it here.
The alternate solution here is to add the output dataset explicitly as an entry on your process flow, and disregard the OUTPUT window unless you need to investigate something from the intermediary datasets.
This has the advantage that it lets you look at the intermediary datasets if something goes wrong, but also lets you not have to look through all of them to see the final dataset.
You should be able to add the final output dataset to the process flow easily once it has been created for the first time; after that it will stay there for you to select and look at.

SAS view permanent inputs/outputs of a project or program

Is there any way to view all permanently stored datasets used by or created by a SAS project (or program, if this is not possible)? I have been tasked with creating a matrix of data inputs and outputs for 40 different SAS projects, each of which contains at least 50 programs. Needless to say there are THOUSANDS of temporary datasets created, but all I am interested in are the permanent ones. After manually checking one project, I noticed that the project process flow does not contain many permanently stored inputs (i.e. from libraries other than WORK) and it is very time consuming to check the properties of each dataset to see if it is temporary or not.
Three other things of note-
1. None of the code is documented.
2. I did not write any of it.
3. I am using SAS Enterprise Guide.
It is not exactly clear what you are asking for. You may want to check out the view sashelp.vcolumn, which lists every variable of every data set in the currently assigned libraries (including WORK). If your project stored all data sets in one library you could try:
proc sql;
    select * from sashelp.vcolumn
    /* LIBNAME values are stored in uppercase */
    where libname = "YOURPROJECTSLIBRARY";
quit;
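If only a table-level list is needed, a variation on the same idea is to query dictionary.tables, which has one row per data set rather than one per variable. This is a sketch, not a full input/output matrix: it only sees libraries that are assigned in the current session, and excluding SASHELP is my own assumption to cut down noise.

```sas
proc sql;
    /* One row per permanent data set in the currently assigned libraries */
    select libname, memname, nobs, crdate, modate
    from dictionary.tables
    where memtype = 'DATA'
      and libname not in ('WORK', 'SASHELP')
    order by libname, memname;
quit;
```

Running this before and after a program and comparing the creation and modification dates gives a rough idea of which permanent tables the program wrote.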

Keep SAS processing after finding errors

Is there a way to force SAS to continue processing, despite finding errors?
I'm appending a large quantity of datasets at the moment; however, some of the datasets in my list of names don't exist. This results in a bunch of errors and causes SAS to exit with the message "The SAS System stopped processing this step because of errors.".
You could evaluate the existence of a dataset using the EXIST() function and make the execution of the append conditional on the outcome.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000210903.htm
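A minimal sketch of that idea as a macro. The macro name append_if_exists and the dataset names are hypothetical; EXIST is called through %SYSFUNC so the check happens before PROC APPEND is generated:

```sas
/* Hypothetical sample data so the example runs on its own */
data work.jan2014;
    x = 1;
run;

%macro append_if_exists(base=, data=);
    /* EXIST returns 1 if the data set exists, 0 otherwise */
    %if %sysfunc(exist(&data)) %then %do;
        proc append base=&base data=&data;
        run;
    %end;
    %else %put NOTE: &data does not exist, skipping.;
%mend append_if_exists;

%append_if_exists(base=work.all_months, data=work.jan2014);  /* appended */
%append_if_exists(base=work.all_months, data=work.feb2014);  /* skipped  */
```

Missing datasets are then reported in the log as a note instead of stopping the step with an error.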

blocking the values after a specific date

I've got the following question.
I'm trying to run a partial least squares forecast on a data model I have. The issue is that I need to block certain lines in order to have the forecast for a specific time.
What I want is the following. For June, every line before May 2014 will be blocked (see the screenshot below).
For May, every line before April 2014 will be blocked (see the screenshot below).
I was thinking of using a delete through a proc sql to do so, but this solution seems very brutal and I wish to keep my table intact.
Question: Is there a way to block the lines for a specific date without needing a deletion?
Many thanks for any insight you can give me as I've never done that before and don't know if there is a way to do that (I did not find anything on the net).
Edit : The aim of the blocking will be to use the missing values and to run the forecast on this missing month namely here in June 2014 and in May 2014 for the second example
I'm not sure what proc you are planning to use, but you should be able to do something like the below.
It builds a control data set, based on a distinct set of dates, including a filter value and building a text data set name. This data set is then read in a data _null_ step.
Call execute is a ridiculously powerful routine for this sort of looping behaviour: you build strings, and it then passes them on as if they were code. Note that column names in the control set are "outside" the string and concatenated with it using ||. The alternative is probably using quite a lot of macro.
proc sql;
    create table control_dates as
    select distinct
        nuov_date,
        /* e.g. Jun_results; monname3. gives the three-letter month name */
        put(nuov_date, monname3.)||'_results' as out_name
    from [csv_import];
quit;
data _null_;
    set control_dates;
    /* build one filtering data step per control row */
    call execute(
        'data '||strip(out_name)||';
        set control_dates
        (where=(nuov_date<'||strip(put(nuov_date, best12.))||'));
        run;');
    call execute('proc [analysis proc] data='||strip(out_name)||';run;');
run;