Reverse engineer SAS code to create a mapping document - sas

I have inherited a large base of SAS code. I need to reverse engineer it to create a mapping document, so that given a field in the final output dataset, we can easily trace it all the way back to one of the inputs.
I can create it by hand, but can SAS automatically generate something like this?

No, I don't think there is any ready-made automated way of doing this.
Bear in mind that it is possible to create variables and pass them through a whole series of procs and data steps without mentioning them by name anywhere in the source code. Some sort of run-time analysis is therefore unavoidable.
Reeza's suggestion of using proc scaproc will yield some useful information for code executed within a single self-contained job running in a single SAS session, and the ATTR option in the record statement might be of some help to you when tracing the lineage of variables, but I'm afraid that however you approach this, it's going to take quite a lot of work.

Related

How to approach reading a large codebase

I've moved to a larger organization where there's lots of existing code (mainly SAS, the WPS version).
This is the first time I'm using the language, and I'm having trouble understanding the code. I can't figure out how to approach understanding such a large codebase.
P.S.: Existing questions were not SAS-specific, so I posted this so that people with SAS experience could help.
I have converted thousands of lines of code from SAS to Teradata SQL, and these are my learnings. If you have basic SAS knowledge, you should be fine.
It can get complex, too: I once had an issue with some very complex regex code that was difficult for me at the time. The following two steps helped me out.
Read the code step by step, and if something is not clear, run the code step by step in a development area, making sure you are not overwriting permanent tables. This will help you understand what is happening in each step. Write comments for yourself at each step, so that you can understand it better.
If you are assigned a rewrite, run the original code and the rewritten code step by step and compare the results (do not overwrite permanent datasets). Compare the final result sets too.

SAS Code to examine another SAS program

I have a bit of a problem trying to code up what I want to do in SAS and I was hoping to get some advice. I was wondering if it is possible to write code that will examine another piece of existing SAS code and bring up a list of the required input datasets and variables. I want to invoke other SAS code in an automated process using the %include statement and a prompt for the user to define the exact name and location of the code, as this will be different every time. But before this, I want to somehow check this code, rename an existing dataset to be the input dataset, and check that I have all the required variables before running the %include.
I was hoping someone might be able to tell me if this is at all possible and if so what function I would use. I am using EG 5.1 if that makes any difference.
Thanks for your help.
Steph.
P.S. Thanks for your help guys. Sorry if this question is outside the scope of this site, I thought there might be a simple function to achieve this, similar to %include. Also, I have never posted on this site before so apologies if I did stuff wrong.

I have a data set and I need to create an excel sheet in exact below format...Is there any way to do so?

Assume here is the data set.....
Aspect                                Evaluation  Quarter  Percentage
HOST/HOSTESS DIVERSIONS /687          Excellent   Q1       40%
ROCKIN' BAR D / WAVEBANDS/ EVOLUTION  Excellent   Q1       50%
KNOWLEDGE OF SERVER TEAM – ROTATION   Excellent   Q1       60%
I am trying to generate the Excel sheet below with the same colors and structure; assume the percentages above will populate the “% Within” column.
Is there any way to get the Excel output in this required format? I appreciate any help.
Thanks,
Sam
If you're going to do color and such, you have a few options. PROC EXPORT won't do it, of course. So instead, you need to do either Excel Tagsets, DDE, or create an unformatted sheet and use a macro from a template to copy the colors in.
Benefits/Drawbacks:
Excel Tagsets:
Benefits: Make the exact format entirely in SAS code. Have a great deal of control with a fairly simple interface. Uses the powerful PROC TEMPLATE to define styles, which allows highly portable and reusable code.
Drawbacks: Makes an .xml file that is readable by Excel, not an actual .xls/.xlsx file. Does have some limitations in what it can do. Can be buggy. Probably the slowest to code of the three options, unless you are very familiar with it.
DDE:
Benefits: Once you make the template (once) in Excel, can make exactly what you want fully in SAS. Can do 100% of what Excel does.
Drawbacks: Uses somewhat outdated method, so fewer SAS programmers are familiar with it. Requires Excel to be installed on the machine, and open (you can open it as part of the DDE program). Somewhat slower to copy data in, and requires more careful checking to verify data went where it should go. Requires knowing DDE commands.
Template/copy:
Benefits: Likely the fastest method in terms of setup time. Can do everything exactly as Excel does. Easy for other programmers to understand, as long as they know Excel/VBA and SAS.
Drawbacks: requires outside-of-SAS step to run copy macro (could be called from SAS via DDE or batch file, but more commonly would be done by hand). Does require some knowledge of VBA as well as SAS.
In general, I recommend trying Excel Tagsets first; if they don't work for your needs, try either of the other two options. Some good papers on Excel Tagsets for the beginner:
http://support.sas.com/resources/papers/proceedings11/170-2011.pdf
http://support.sas.com/resources/papers/proceedings12/207-2012.pdf
http://www2.sas.com/proceedings/forum2008/036-2008.pdf
I think you could create the above pretty easily using Excel Tagsets and PROC REPORT; follow the first paper in particular, as it seems to be the most similar to what you're doing. If you run into any issues, post them as separate questions and we should be able to help you out.

Reorder list of numbered items using regular expressions

I have this list of items (it's in a SQL script) and I would like to reorder it by number,
from this:
,user_1
,user_2
,user_3
,name_1
,name_2
,name_3
to this:
,user_1
,name_1
,user_2
,name_2
,user_3
,name_3
I use SQL Server Management Studio 2008, so I have the ability to replace using regex, but I don't know if that kind of manipulation is even possible with regular expressions.
Just copy and paste them into Excel, sort, and then copy and paste them back into SSMS.
It's that simple :)
I think you need to add a bit more description for this to really make sense.
Perhaps post the SQL script?
Is this data stored in a single varchar field and this is the reason you are looking for a regex solution?
You can easily parse the comma-separated values using a regex, but you would need some other function to sort that result, and doing this in SQL can get messy fairly quickly.
In general I would say this problem is better handled outside of the SQL statement, e.g. process this in your favorite programming/scripting language after getting the result back from the SQL.
Also, this problem indicates a design problem with the database layout; if at all possible, the preferred way to solve this would probably be to restructure it.
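As a rough illustration of handling this outside the SQL statement, here is a minimal C++ sketch that reorders the lines by their trailing number. It assumes each item ends in its number (as in the user_1 / name_1 example) and that items within one number should keep their original relative order, which a stable sort gives you. The function names numericSuffix and reorderByNumber are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Returns the trailing run of digits as a number, or -1 when there is none.
long numericSuffix(const std::string& s) {
    std::size_t begin = s.size();
    while (begin > 0 && std::isdigit(static_cast<unsigned char>(s[begin - 1]))) {
        --begin;
    }
    if (begin == s.size()) return -1;  // no digits at the end
    return std::stol(s.substr(begin));
}

// Sorts lines by numeric suffix; lines with equal suffixes keep their input order.
std::vector<std::string> reorderByNumber(std::vector<std::string> lines) {
    auto byNumber = [](const std::string& a, const std::string& b) {
        return numericSuffix(a) < numericSuffix(b);
    };
    std::stable_sort(lines.begin(), lines.end(), byNumber);
    assert(std::is_sorted(lines.begin(), lines.end(), byNumber));  // sanity check
    return lines;
}
```

Because the sort is stable, ,user_1 stays ahead of ,name_1 simply because it came first in the input; if you need a specific order within each number, you would have to extend the comparison.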

parser: parsing formulas in template files

I will first describe the problem and then what I currently look at, in terms of libraries.
In my application, we have a set of variables that are always available, for example TOTAL_ITEMS, PRICE, CONTRACTS, etc. (we have around 15 of them). Clients of the application would like to have certain calculations performed and displayed using those variables. Up until now, I have been constantly adding those calculations to the app. It's a pain in the butt, and I would like to make it more generic by creating a template in which the user can specify a set of formulas that the application will parse and calculate.
Here is one case:
total_cost = CONTRACTS*PRICE*TOTAL_ITEMS
So, I want the user to be able to define something like this in the template file:
total_cost = CONTRACTS*PRICE*TOTAL_ITEMS plus some metadata, such as the screen to display it on. Hence they will be specifying the formula together with a screen, and the file will contain many formulas of this nature.
Right now, I am looking at two libraries: Spirit and matheval.
Would anyone recommend which is better for this task, and point me to references, examples, or links?
Please let me know if the question is unclear, and I will try to clarify it further.
Thanks,
Sasha
If you have a fixed number of variables it may be a bit overkill to invoke a parser. Though Spirit is cool and I've been wanting to use it in a project.
I would probably just tokenize the string, make a map of your variables keyed by name (assuming all your variables are ints):
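#include <map>
#include <string>
// Key by std::string: a const char* key would compare pointer addresses, not the names.
std::map<std::string, int*> vars;
vars["CONTRACTS"] = &contracts;
...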
Then use a simple postfix calculator function to do the actual math.
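To make the tokenize / variable map / postfix calculator idea concrete, here is a minimal sketch. It is only one way to do it and rests on several assumptions of mine: the formulas are infix with +, -, *, / and parentheses (no unary minus, no functions), the conversion to postfix uses the shunting-yard algorithm, and the map stores double values directly rather than pointers to ints. All helper names (Variables, tokenize, toPostfix, evalPostfix, evaluate) are invented for the example.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <stack>
#include <stdexcept>
#include <string>
#include <vector>

using Variables = std::map<std::string, double>;  // variable name -> current value

bool isOperator(const std::string& t) {
    return t == "+" || t == "-" || t == "*" || t == "/";
}

int precedence(const std::string& op) {
    return (op == "*" || op == "/") ? 2 : 1;
}

// Splits a formula into identifiers/numbers, operators and parentheses.
std::vector<std::string> tokenize(const std::string& expr) {
    auto isWordChar = [](char ch) {
        unsigned char c = static_cast<unsigned char>(ch);
        return std::isalnum(c) || ch == '_' || ch == '.';
    };
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < expr.size()) {
        if (std::isspace(static_cast<unsigned char>(expr[i]))) { ++i; continue; }
        if (isWordChar(expr[i])) {
            std::size_t start = i;
            while (i < expr.size() && isWordChar(expr[i])) ++i;
            tokens.push_back(expr.substr(start, i - start));
        } else if (std::string("+-*/()").find(expr[i]) != std::string::npos) {
            tokens.push_back(std::string(1, expr[i++]));
        } else {
            throw std::runtime_error("unexpected character in formula");
        }
    }
    return tokens;
}

// Shunting-yard: infix tokens -> postfix tokens (left-associative binary operators).
std::vector<std::string> toPostfix(const std::vector<std::string>& tokens) {
    std::vector<std::string> out;
    std::stack<std::string> ops;
    for (const auto& t : tokens) {
        if (isOperator(t)) {
            while (!ops.empty() && isOperator(ops.top()) &&
                   precedence(ops.top()) >= precedence(t)) {
                out.push_back(ops.top());
                ops.pop();
            }
            ops.push(t);
        } else if (t == "(") {
            ops.push(t);
        } else if (t == ")") {
            while (!ops.empty() && ops.top() != "(") { out.push_back(ops.top()); ops.pop(); }
            if (ops.empty()) throw std::runtime_error("unbalanced parentheses");
            ops.pop();  // discard the "("
        } else {
            out.push_back(t);  // operand: number or variable name
        }
    }
    while (!ops.empty()) {
        if (ops.top() == "(") throw std::runtime_error("unbalanced parentheses");
        out.push_back(ops.top());
        ops.pop();
    }
    return out;
}

// The postfix calculator: operands are pushed, operators pop two values.
double evalPostfix(const std::vector<std::string>& postfix, const Variables& vars) {
    std::stack<double> st;
    for (const auto& t : postfix) {
        assert(!t.empty());  // tokenize never produces empty tokens
        if (isOperator(t)) {
            if (st.size() < 2) throw std::runtime_error("malformed formula");
            double rhs = st.top(); st.pop();
            double lhs = st.top(); st.pop();
            if (t == "+") st.push(lhs + rhs);
            else if (t == "-") st.push(lhs - rhs);
            else if (t == "*") st.push(lhs * rhs);
            else st.push(lhs / rhs);
        } else if (std::isdigit(static_cast<unsigned char>(t[0])) || t[0] == '.') {
            std::size_t used = 0;
            double value = std::stod(t, &used);
            if (used != t.size()) throw std::runtime_error("bad number: " + t);
            st.push(value);
        } else {
            auto it = vars.find(t);
            if (it == vars.end()) throw std::runtime_error("unknown variable: " + t);
            st.push(it->second);
        }
    }
    if (st.size() != 1) throw std::runtime_error("malformed formula");
    return st.top();
}

double evaluate(const std::string& expr, const Variables& vars) {
    return evalPostfix(toPostfix(tokenize(expr)), vars);
}
```

With something like this, the right-hand side of a template line such as CONTRACTS*PRICE*TOTAL_ITEMS can be passed to evaluate together with a map holding the current values. Splitting off the "total_cost =" part and the screen metadata is left out here.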
Edit:
Looking at MathEval, it seems to do exactly what you want; set variables and evaluate mathematical functions using those variables. I'm not sure why you would want to create a solution at the level of a syntax parser. Do you have any requirements that MathEval does not fulfill?
Looks like it shouldn't be too hard to generate a simple parser using yacc or bison and integrate it into your code.
I don't know about matheval, but boost::spirit can do that for you pretty efficiently : see there.
If you're into template metaprogramming, you may want to have a look into Boost::Proto, but it will take some time to get started using it.