Which procs are easy to learn and essential for SAS programming? I have learned several, such as PROC PRINT, SORT, FREQ, FORMAT, UNIVARIATE, ANOVA, GLM, IMPORT, and TRANSPOSE. Which ones should I learn next?
Welcome to Stack Overflow (and SAS). The procedures that AlanC mentions are all important.
Probably your best bet is to pick up a copy of The Little SAS Book and learn the data processing as well as the analysis procedures. I have used many editions of it over the years, and students like it. SAS changes at a glacial pace, so if money is tight, pick up an older edition.
You have already hit many of the main procedures. Focus on data processing with the DATA step and PROC SQL. SQL is its own language and is extremely useful with or without SAS. Also, do not neglect ODS. SAS can make very beautiful output, and the aesthetics matter when you are showing your portfolio.
If you want to be a professional SAS programmer, you will need to learn the macro language to automate tasks, and also the intermediate-to-advanced magic that Ron Cody writes about. Get comfortable with the language, then work on converting your code into macros. Along the way, be sure to check out Cody's data cleaning book.
I am simulating PGA tournaments using Stata. My simulation results table consists of:
column 1: the names of the 30 players in the tournament
columns 2 - 30,001: the 4-round results of my 30,000 Monte Carlo simulations.
What I am trying to do is create a 30 x 30 matrix with the golfers' names both down column 1 and across the column headers, where each cell represents the percentage of the 30,000 simulations in which Golfer A beat Golfer B outright. Is this possible to do in Stata? Thanks
I tend to say that everything is possible in every programming language, but some things are much more difficult to do in some languages than in others. I do not think that Stata is a great tool for what you intend to do.
You need to provide some code examples for us to be able to help you with your task, but here is one thing I can say. Stata has two programming languages. One is often called Stata (but is called ado on StataCorp's website) and the other is Mata. If for some reason you need to use the software Stata, you should do this in Mata, which has more matrix operators than ado. Also, in ado you can't store text in a matrix, so if you want to store the names of the golfers in the matrix you need to use Mata. Alternatively, you can use row and column indexes to keep track of the golfers.
With that said, Stata is primarily a tool for operating on and analyzing a single dataset loaded into memory (support for multiple datasets has been added recently). So to answer your question: yes, this can be done in Stata, but you are probably much better off doing it in a language with more support for multidimensional arrays/vectors, for example R or Python.
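As a rough illustration of the Python route, here is a minimal sketch using plain lists so that it has no dependencies. It assumes each simulation column holds one total score per player, that a lower score is better, and that "beat outright" means a strictly lower score (ties count for neither golfer). The function name `pairwise_win_pct` and the demo data are hypothetical, not taken from the question.

```python
def pairwise_win_pct(scores):
    """Return pct[a][b]: percentage of simulations in which player a
    scored strictly lower than player b.

    scores[p][s] is the total score of player p in simulation s
    (assumption: lower is better, as in stroke play).
    """
    n_players = len(scores)
    n_sims = len(scores[0])
    pct = [[0.0] * n_players for _ in range(n_players)]
    for a in range(n_players):
        for b in range(n_players):
            if a == b:
                continue  # a player never beats themselves; leave 0.0
            # Count simulations where a's score is strictly below b's.
            wins = sum(1 for sa, sb in zip(scores[a], scores[b]) if sa < sb)
            pct[a][b] = 100.0 * wins / n_sims
    return pct


# Tiny made-up example: 2 golfers, 3 simulations.
names = ["Golfer A", "Golfer B"]
demo_scores = [
    [270, 268, 275],  # Golfer A
    [272, 268, 271],  # Golfer B
]
demo_pct = pairwise_win_pct(demo_scores)
```

Row and column `i` of the result both correspond to `names[i]`, which is the index-based bookkeeping mentioned above. With 30 players and 30,000 simulations the loops do roughly 26 million comparisons, so a vectorized array library would be faster, but the logic stays the same.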
I'm trying to convert SAS code to Stata and am encountering some difficulty. Is there an add-in that could do this for me? I'm new to Stata, and I don't even have SAS and am unfamiliar with its rules.
Here is the first snippet of SAS code that is a problem:
Libname library 'C:\COFUL\LIB\';
Proc format lib=library;
Value $RCOMT
"D43"="NONE" /*NONE*/
"Z20"="LIT" /*LIT*/
;
Translating from SAS to something else is hard: there is no getting around that. I have done SAS to C# and it is challenging. You need to know both languages, as Nick stated. You won't easily find a copy of SAS to use. Check with SAS for a University or Learning edition. That will be limited in the number of observations (records). SAS programs are composed of 2 main things: data steps and procs. Each one is a step, delimited by step boundaries. The data step is a very powerful DO/WHILE loop. Procs are a separate beast.
Why would you want to convert to Stata? You would have better luck converting to Python. Read Randy Betancourt's book on Python for SAS users. That would be a start. If you have to use Stata, I am not aware of anyone who has done that conversion.
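To give a feel for what such a conversion looks like, here is a minimal Python sketch of the $RCOMT format from the question. It only covers the two value pairs that are visible in the snippet, and the names `RCOMT` and `apply_format` are hypothetical. It also assumes that codes without a mapping should pass through unchanged, which is how a SAS VALUE format without an OTHER clause generally behaves.

```python
# Hypothetical Python counterpart of the $RCOMT character format.
# Only the two pairs visible in the SAS snippet are included.
RCOMT = {
    "D43": "NONE",
    "Z20": "LIT",
}


def apply_format(code, fmt=RCOMT):
    """Return the label for a code; unmapped codes are returned as-is."""
    return fmt.get(code, code)


codes = ["D43", "Z20", "A01"]
labels = [apply_format(c) for c in codes]
```

The LIBNAME statement and the lib= option have no direct equivalent here: they only tell SAS where to store the format catalog, whereas the dictionary simply lives in the script or module that defines it.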
When using R Markdown for making statistical reports, I have the ability to echo the R code in my output document. I'm learning SAS and I was wondering if it is possible to highlight or echo the SAS code in my final ODS report? I'm using a dirty hack right now to display the code in my document, which is using "ods text = ", but it seems quite redundant. Plus, it doesn't add syntax highlighting.
That feature does not exist in the SAS language right now, but it has been mentioned in several talks by Amy Peters, principal product manager of the SAS programming environments, as a planned feature for a near-future SAS release (with no specific date yet, but hopefully in the next 2 years). It would likely be implemented in a similar fashion to Jupyter Notebooks, in that you write your code and get your output inline.
That said, SAS does support Jupyter Notebooks, which is the best current (third party) solution. Contact your SAS administrator for more information.
I have an idea here. I'm the type of guy who doesn't take no for an answer and finds a way to fiddle and get it done, but I think this is a bit far-fetched. You can still try it. I think it will work with almost everything, but quoting may be hard to handle when the code contains multiple semicolons.
Check:
I started by creating a dummy dataset:
data tata;
x=1;
run;
then we do the following:
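/* Store the statement text in a macro variable */
%let code = select * from tata;
proc sql;
  create table report as
  &code.;
quit;
/* Print the result and echo the code text in the footnote */
proc print data=report;
  footnote "&code.";
run;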
The rationale:
I think that if you put your code in macro variables and then execute those macro variables, you can show the code afterwards by printing the macro variable after your output.
See the sample above.
Is it possible to show the mathematical formula / concept behind an analysis done with SAS Enterprise?
Suppose SAS calculates a correlation between a list of numbers. Is it possible to see exactly what SAS did from a mathematical perspective?
It is not possible to ask SAS for the mathematical formula, no. You can check the documentation; for example, this page gives many of the 'elementary statistics' formulas (like variance, UCLM, etc.)
If you need the formula behind something more complex that you can't find online, contact your SAS Support rep, and they may be able to put you in contact with the developer of that particular proc, for example if you need to know some particular detail of how PROC GLM does something.
If you executed a task, you can in most cases ask SAS for the SAS code that it ran (usually by clicking on the task node), but that would be something like proc freq; tables a*b; run;, not a mathematical formula per se.
In particular, what, if any, are the substantial changes or extensions in the programming language that give it functionality beyond PROC TABULATE?
Or is it the case that the programming languages in PROC TABULATE and TPL Tables (from QQQ Software) are pretty close to the same?
I was really surprised to hear about TPL Tables and its predecessor, the Table Producing Language from the US Department of Labor in the 1970s. After all these years, I had never heard of it. It turns out that two commercial descendants of the Table Producing Language are SAS PROC TABULATE and TPL Tables.
Has anyone worked with both? Why is TPL Tables so unknown?
Robert
You are correct, both TABULATE and QQQ TPL Tables are descendants of the US Bureau of Labor Statistics TPL. According to this thread, the developers of TPL/PCL at the Bureau of Labor Statistics eventually left BLS and started QQQ.
This SAS article is a good read regarding TABULATE. According to the article, TABULATE, which was introduced in the 80s, originally borrowed much of its syntax and features from BLS TPL while addressing some of its shortcomings, though the specific shortcomings addressed are not mentioned.
What, if any, are the substantial changes or extensions in the programming language that give it functionality beyond PROC TABULATE?
The features of QQQ TPL Tables have evolved over time, as have the features of TABULATE. I've found no information to suggest that ongoing TABULATE development kept abreast of QQQ TPL features, so the two systems are now likely too different to compare effectively. As a SAS product, TABULATE is intended to integrate with other SAS technologies, such as ODS. TPL probably integrates with other QQQ technologies.
That said, judging only from the documentation, one thing that TPL (v7+) can do and TABULATE (as of v9.4) cannot is perform statistical hypothesis tests, e.g. t-tests, chi-squared tests, and ANOVA. But in SAS you have other, likely more flexible, options to get these.
If you're looking to integrate one or the other into your development cycle, I recommend choosing the one that best fits your current system. If you're already using SAS, stick with TABULATE.
Why is TPL Tables so unknown?
Who knows. It's still in use by the BLS and a few others, apparently. But SAS is such a giant in the field that it tends to overshadow its competition.