Best Practices for Writing SAS Documentation? - sas

I'm looking for a style guide to writing documentation for SAS software.
Looking at docs for, say, procedures, there seems to be a common method of writing about them. For example, in this PDF book on procedures for Base 9.2, http://support.sas.com/documentation/cdl/en/proc/61895/PDF/default/proc.pdf , all the procs seem to be explained in an (overview, syntax, concepts, results, examples) format. I can work this out by example, but if I do, I'll likely miss key things like spelling conventions. Ideally I'd like to see a best practices guide for writing this sort of documentation, if one exists.
My goal is to write a SAS equivalent to this article, which covers best practices for R documentation: http://www.stats-et-al.com/2019/04/writing-r-documentation-simplified.html
Can anyone point me in the right direction, or suggest some key examples?

Related

Differences between prismatics schema and clojure.spec

I have just recently started to learn some Clojure, and for something like types (more like contracts) for validation etc., the go-to solution is a library named schema.
Recently I learned, however, that Clojure 1.9 will have something similar named clojure.spec.
Can anyone please tell me the differences between them?
When should I use one or the other, pros and cons, etc?
Eric Normand did this comparison, but as already pointed out, you should definitely check the rationale; there is also the guide and a podcast where Rich Hickey talks about clojure.spec.
The spec rationale is quite in-depth, and I would suggest reading it (https://clojure.org/about/spec). After that, feel free to compare it with any other library you may be considering.

C++ libraries for web ranking and search engines

Can anybody recommend some libraries that contain web ranking algorithms such as PageRank and HITS?
Thank you
I guess you are referring to the canonical PageRank algorithm as published in the original PageRank paper. People nowadays use "PageRank" to refer to the actual current Google algorithm for search.
If that is really the case, a PageRank implementation is not that difficult to find and use. Searching Google turns up a good number of implementations, one in Python, for example.
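To give an idea of how small the canonical algorithm is, here is a minimal power-iteration sketch in Python. It is not taken from any of the implementations mentioned above; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank along every link.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, used only to exercise the function.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

The scores sum to 1, and in this toy graph page "c" comes out on top because three pages link to it. A real library would use sparse matrices and a convergence test rather than a fixed number of iterations.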
For the HITS algorithm there's pseudocode on Wikipedia. There's also a Perl implementation.
I'd also suggest CLucene as something to start experimenting with.
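For comparison, HITS can be sketched in a similar amount of code. This is my own rough Python version of the usual hub/authority iteration, not the Perl implementation or the Wikipedia pseudocode verbatim; the function name `hits`, the iteration count, and the toy graph are assumptions.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) scores for a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub update: sum the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both vectors so the scores do not grow without bound.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in pages:
                scores[p] /= norm
    return hub, auth


# Hypothetical graph: "a" and "b" both point at "c", which points at "d".
toy_graph = {"a": ["c"], "b": ["c"], "c": ["d"], "d": []}
hub_scores, auth_scores = hits(toy_graph)
```

Here "c" ends up as the strongest authority, and "a" and "b" as the strongest hubs.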
Unless you work for Google, there aren't many good ways of finding out the specifics of their page ranking algorithm...which changes from time to time. Wikipedia outlines some of the basics:
http://en.wikipedia.org/wiki/PageRank
Other people write lengthy articles:
http://www.smashingmagazine.com/2007/06/05/google-pagerank-what-do-we-really-know-about-it/
If you are interested in the kinds of techniques that are involved in writing a search engine, there are several topics. For instance, there is "web crawling" and how to write programs that visit web sites and grab their contents...and determining when to visit the sites again to see if they've changed:
http://en.wikipedia.org/wiki/Web_crawler
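As a rough illustration of the crawling part, here is a minimal breadth-first crawler sketch in Python. To keep it self-contained it reads pages from an in-memory dictionary instead of the network; the `crawl` function, the `fetch` callback, and the `example.test` URLs are all made up for the example, and it does not attempt the "when to visit again" scheduling mentioned above.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; fetch(url) returns HTML text or None."""
    seen = {start_url}
    frontier = deque([start_url])
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue  # page missing or unreachable
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages


# Hypothetical three-page site standing in for real HTTP requests.
FAKE_WEB = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.test/a.html": '<a href="/">home</a>',
    "http://example.test/b.html": '<a href="/missing.html">gone</a>',
}
crawled = crawl("http://example.test/", FAKE_WEB.get)
```

A real crawler would swap `FAKE_WEB.get` for an HTTP client and add politeness rules such as robots.txt handling and rate limiting.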
Once you have a bunch of data on your machine(s) to analyze and search, the subject area to study is called "Information Retrieval" (or "IR"):
http://en.wikipedia.org/wiki/Information_retrieval
It's a fairly new science, but a lot of work is done on it. Wikipedia has a list of "free search engine software":
http://en.wikipedia.org/wiki/Category:Free_search_engine_software
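If you want a feel for what such software does internally, the central data structure in Lucene-style engines is an inverted index, which maps each term to the documents that contain it. Below is a toy Python sketch; the function names, the tokenizer, and the sample documents are assumptions for illustration, and real engines add ranking, stemming, and on-disk storage on top.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, used only to exercise the index.
docs = {
    1: "PageRank ranks web pages",
    2: "A web crawler visits pages",
    3: "Lucene indexes text",
}
index = build_index(docs)
```

With this index, `search(index, "web pages")` returns documents 1 and 2, while `search(index, "web lucene")` returns nothing because no document contains both terms.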
I'd suggest that if you're new to this then it might be best to start with figuring out how to use something like Lucene to provide a search box on a website you have. Then dig in and see how it works. It has been ported to C++ if that is important to you:
http://clucene.sourceforge.net/

How compatible is WPS with SAS?

How compatible is the WPS SAS-clone with the corresponding products from SAS Institute?
Has anyone tried it - if so: have you run into any compatibility issues?
WPS has a good comparison document available on their website. It lists areas where WPS and SAS are compatible and where they are not. Start at http://www.teamwpc.co.uk/products/wps/language
I have tried it, although briefly and three years ago. I have a former co-worker who uses it exclusively as an alternative to SAS. He manages to function mostly without problems, but does run into some syntax-related oddities in some of the more obscure procedures or functions.
WPS does offer a free trial, so you've nothing to lose with that. Give it a shot?
I don't really have an answer, but have you tried SAS-L? I'm sure you'll find people there with strong opinions about this. Also there are a couple of threads available already, which might get you started.
My 2 cents:
I've tried WPS for some time and it does indeed make for a working SAS substitute, as long as you keep yourself from using some of the advanced features of the base language.
A few examples from my own experience:
I wasn't able to use the CALL MODULE* family for calling win32 API functions.
When reading one row at a time through FETCH() you cannot use CALL SET() for automatically initializing all the variables found in the input dataset.
Some random errors using macros. Sorry I don't remember those in detail.
In a few words: if you have a working SAS installation, ask for a WPS trial and test whether it fits your use. If it does, it will certainly be a tremendous saving in licensing costs.

Why does XSLT seem to irritate so many people? [closed]

What is it about XSLT that people find irritating? Is it the syntax (which is pretty unusual) or just the way XSLT works in general? Are there features that are lacking?
I did a little bit of XSLT (around 800 lines) a while ago and found it not that bad. So why the general animosity against it?
I think people find it difficult to get their heads around XSLT (and bitch about it) because it is functional and declarative in nature, unlike C# or Java programming. Navigating around documents can end up being complicated when XPath statements get clever, though this is a feature of XPath rather than XSLT. XPath typically gets complex when you don't know the exact structure of a document at design time, so you start querying siblings, descendants and ancestors. This is when people inheriting a complex XSLT start considering career changes!
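To make the "siblings, descendants and ancestors" point concrete, here is a small Python sketch using the standard library's ElementTree, which supports only a subset of XPath. The sample document is invented for illustration. Descendant and parent steps are available directly, but there is no sibling axis in this subset, so the sketch goes up to the parent and back down again.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are made up.
doc = ET.fromstring("""
<library>
  <shelf id="s1">
    <book><title>XSLT Basics</title></book>
    <book><title>XPath Axes</title></book>
  </shelf>
  <shelf id="s2">
    <book><title>Cocoon</title></book>
  </shelf>
</library>
""")

# Descendants: every <title> anywhere below the root.
titles = [t.text for t in doc.findall(".//title")]

# Ancestor (one level up): the shelf holding a particular book.
shelf = doc.findall(".//book[title='XPath Axes']/..")[0]

# Siblings: walk back down from the parent and skip the book itself.
siblings = [
    book.find("title").text
    for book in shelf.findall("book")
    if book.find("title").text != "XPath Axes"
]
```

Even in this tiny case the ancestor and sibling queries are noticeably harder to read than the descendant one, which is the effect described above.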
With XSLT it is very much 'right tool for the right job'. It is designed to transform an XML document into another XML document extremely quickly and efficiently. XSLT is almost certainly the best tool to use for this purpose because of its extensibility, the fact that it has been written for this purpose, widespread support for it in XML processors across the board, and, in case I didn't mention it already, performance. Common use cases:
converting an XML document purely containing data into a document exposing a user interface, such as an XHTML document
converting an XML document into a different structure to suit someone else's schema, e.g. Biz2Biz communications
A great implementation of the XSLT technology is the Apache Cocoon project, which transforms XML documents into multiple output formats including HTML, Excel, chart images and PDFs, with an extensible plugin architecture. We use it a lot for our reporting platform and it works very well. When developers start with it, they run into the same familiar issues. Once they get over them, they typically end up writing what I am writing here.
I once worked with a guy who didn't want to work with (and learn) XSLT and ended up presenting a demo to the client which took over 20 seconds to render a page. When I finally persuaded him to use an XSLT transform instead of his dumb DOM code, it took under a second.
I like xslt, and use it quite a bit. As long as you think in terms of functional programming (i.e. set-once variables, similar to F# etc), then it is hugely versatile. I use it regularly for data transformation, presentation (in particular [x]html), and versatile code generation.
Definitely highly programming related; nobody except a programmer would grok it - but a very powerful tool.
I have a few xslt stylesheets (split over a few xsl:import/xsl:include files) that are substantially longer than the 800 lines you mention in the post... it really can (when used correctly) be a fully featured environment.
Notes:
best used at the server; client-side support is hit'n'miss
a few key things are missing in 1.0: regex, case-insensitivity, etc.
can be tricky if whitespace is important
One particularly useful feature of xslt (as a separate file) is that it makes it possible to change the transform without rebuilding any code. The code-gen example is from an open source project I run; I know of several users who have dipped in and tweaked the code-gen for their local standards. One user even went as far as writing the transform for an entire second language, all without touching the binaries.
I personally dislike XSLT because it seems to combine several things that are generally disliked in the developer community:
it uses magic strings (XPath) that look like noise, like Perl regexes.
XML tags, which can make statements verbose (XML as a programming language).
I've worked with XSLT before and I didn't much care for it because I found it extremely verbose for the simple task I wanted to perform.
Just out of curiosity, what did your 800 lines of XSLT do?
XSLT is a really powerful tool in the developer arsenal. I use it all the time for code generation. Performance counters, data access layer, REST interfaces, you name it. Anything repetitive.
As a language it sure has its quirks, but as a tool is invaluable.
Many programmers don't have any experience with Functional Programming. XSLT, in many ways, resembles Functional Programming, so for them it is a new and foreign paradigm to learn.
Learning an unfamiliar programming paradigm can be challenging, let alone learning an unfamiliar programming paradigm expressed in XML.
Code written in a Functional Programming language is typically minimalistic. XML is rarely minimalistic. So folks who know Functional Programming and appreciate its minimalism have to give up that minimalism.
I personally think it is very suitable for certain types of programming problems. To me, in certain situations, it is much easier to maintain a form using XSLT than to rewrite, recompile and redeploy code changes. While XSLT is not the only way to accomplish that, I haven't found any other solutions for those cases that are much cleaner and easier.
It has its place. Like everything else, when misused, it becomes a garbled mess of code, just as any language would. When used correctly, it can be a good supplement or solution to a programming problem.
XSLT is very powerful, so long as what you want to do with it matches what it's good for. However, maintaining someone else's XSLT can be a bit daunting. It's a programming language but it's also an XML file, so it can be hard to understand, even when laid out cleanly and adequately commented.
Our Library CMS largely consists of html stylesheets to do almost everything. Our data is XML natively, of course. Some of our programmers don't get the functional programming paradigm. Your first experiences might lead to complex templates misusing the iterative features of XSLT. The first thing you have to tell a programmer is not to use the for-each statement or travel the XPath axes.
If they learn to refrain, they may learn to understand the concept of templates.
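For programmers coming from imperative languages, the template idea may be easier to see in a language they already know. The following is only a loose Python analogy of rule-based processing, not XSLT itself: each rule handles one element name and hands its children back to the dispatcher, much as a template calls apply-templates. The names `apply_templates`, `default_rule` and `RULES`, and the sample document, are all invented for this sketch.

```python
import xml.etree.ElementTree as ET
from html import escape


def apply_templates(node, rules):
    """Pick the rule registered for this element, or the fallback rule."""
    rule = rules.get(node.tag, default_rule)
    return rule(node, rules)


def default_rule(node, rules):
    """Fallback: emit the text and let each child find its own rule."""
    parts = [escape(node.text or "")]
    for child in node:
        parts.append(apply_templates(child, rules))
        parts.append(escape(child.tail or ""))
    return "".join(parts)


# One small rule per element name; no rule loops over the whole document.
RULES = {
    "article": lambda n, r: "<div>" + default_rule(n, r) + "</div>",
    "title": lambda n, r: "<h1>" + default_rule(n, r) + "</h1>",
    "para": lambda n, r: "<p>" + default_rule(n, r) + "</p>",
    "em": lambda n, r: "<i>" + default_rule(n, r) + "</i>",
}

source = ET.fromstring(
    "<article><title>Hi</title><para>Some <em>mixed</em> text</para></article>"
)
html = apply_templates(source, RULES)
```

Supporting a new element means adding one rule rather than editing a central loop, which is the habit that template-based stylesheets reward.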
I find that the people that complain about XSLT are the ones that misuse it. For example, I think using it as an HTML templating language for a CMS is a terrible idea, unless your data is in XML already. Those people might complain that XSLT is ugly, or verbose, or whatever, but that's because they are using it for the wrong reasons.
XSLT is both functional and imperative at the same time. This trips up a lot of people: it has template matching as well as for-each loops with variables.
It is easy to write bad code in it. But if you follow good patterns you can do some really neat things very easily.
Check out http://www.worldofwarcraft.com/index.xml and http://www.wowarmory.com/index.xml if you have an XSLT-capable browser (FF 3 is good). They are totally written in client side XSLT with underlying XML. It makes scraping those sites REALLY easy and nice and they are forced to keep the data and presentation separate. A great example is their character pages http://www.wowarmory.com/character-achievements.xml?r=Mal%27Ganis&cn=Vosk&gn=Juggernaut
It's an example of turning XML into a programming language. Yuck. I wish people wouldn't do that. We have perfectly good programming languages already, and they are far better at it than XML.
because MS doesn't implement exslt2

What is the most useful way to document assessment of technological choices for a business problem?

I would like to know if there are any templates for doing this in a clear and concise way that gives the gist of the application, its inner workings, and how it meets the business needs. I do not want to write a mythological story, so I am looking for any new ways of doing this.
Mostly this is about documenting what you actually need from the system. You can't make a good choice if you don't know what you need.
Here is a doc-style approach.
This is a decision matrix approach outline. The formatting is rough, but this is a good approach. This one has better formatting, but is not about software (it doesn't really matter).
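The arithmetic behind a decision matrix is simple enough to sketch. This is a generic weighted-scoring example in Python, not taken from the linked outlines; the criteria, weights, option names and ratings are all hypothetical.

```python
def score_options(weights, ratings):
    """Return the weighted average rating for each option."""
    total_weight = sum(weights.values())
    scores = {}
    for option, by_criterion in ratings.items():
        weighted = sum(weights[c] * by_criterion[c] for c in weights)
        scores[option] = weighted / total_weight
    return scores


# Hypothetical criteria weights (higher means more important).
weights = {"cost": 3, "fit": 5, "support": 2}

# Hypothetical ratings on a 1-5 scale for two candidate technologies.
ratings = {
    "Option A": {"cost": 4, "fit": 2, "support": 5},
    "Option B": {"cost": 2, "fit": 5, "support": 3},
}

scores = score_options(weights, ratings)
best = max(scores, key=scores.get)
```

Writing down the weights before rating the options is the useful part, because it forces agreement on what the business actually needs before anyone argues for a favourite tool.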
I'm not exactly sure if this is what you are asking for, but check out this paper. It's a sample implementation of the CMMI's "Decision and Analysis Resolution" process area. It basically documents a method for comparing alternatives, reaching a decision, and documenting that decision.
The SEI's site has the original definition of DAR (see page 181), as well as a pretty good presentation about it. You have to realize that their whole goal is to help companies define their processes, not to push a particular process. So the documents you find there tend to be pretty high level, discussing the goals that your process should achieve and the specific practices that should be covered.
Consult Eric Evans' "Domain Driven Design". At the end of the day, you're going to have to use your experience and judgment - and that of your team - to make the design decisions big and small, but Evans recommends formulating a one-page manifesto, written in business terms, to share with biz types that explains the value of your view of the domain to the business.