Cumulative log in SAS EG - sas

Long story short - Familiar with BASE 9, now using EG (7.1) due to a new role with another company. The transition is painful, but there is one thing that bothers me the most and that is the log.
As I am sure most know, it will rewrite/refresh for every piece of code you execute.
Surely there must be an option to maintain a "running log" within the SAS code you are running/building (not necessarily for the whole project, but just for the program node within the project).
Can this be done?
Any assistance is greatly appreciated. I searched for references but found none that address this subject specifically.

Yes - from SAS's support pages:
You’ll notice that a separate log node is generated for each code node. By turning on Project Logging, you can easily tell Enterprise Guide that you’d like a single SAS log to be generated for all of the tasks and code nodes in your Project. This single Project Log will be created in addition to the individual logs created for each task or code node.
Helpful Hint: If Project Logging is turned on, the log represents a running log of the entire project. To turn on the Project Logging, select Project Log in the Context Menu of the Process Flow, and then select Turn On.

Process flow gets stuck on table creations

I'm trying to understand the Enterprise Guide process flow. As I understand it, the process flow is supposed to make it easy to run related steps in the order they need to be run to make a dependent action able to run and be up to date somewhere later in the flow.
Given that understanding, I'm getting stuck trying to make the process flow work in cases where the temporary data is purged. I'm warned when closing Enterprise Guide that the project has references to temporary data which must be the tables I created. That should be fine, the data is on the SAS server and I wrote code to import that data into SAS.
I would expect that the data can be regenerated when I try to run an analysis that depends on that data again later, but instead I'm getting an error indicating that the input data does not exist. If I then run the code to import the data and/or join tables in each necessary place, the process flow seems to work as expected.
See the flow that I'm working with below:
I'm sure I must be missing something. Imagine I want to rerun the rightmost linear regression. Is there a way to make the process flow import the data without doing so manually for each individual table creation the first time round?
The general answer to your question is probably that you can't really do what you're wanting directly, but you can do it indirectly.
A process flow (of which you can have many per project, don't forget) is a single set of programs/tasks/etc. that you intend to run as a group. Typically, you will run whole process flows at once, rather than just individual pieces. If you have a point that you want to pause, look at things, then continue, then you have a few choices.
One is to have a process flow that goes to that point, then a second process flow that starts from that point. You can even take your 'import data' steps out of the process flow entirely, make an 'import data' process flow, always run that first, then run the other process flows individually as you need them. In fact, if you use the AUTOEXEC process flow, you could have the import data steps run whenever you open the project, and imported data ready and waiting for you.
A second is to use the UI: control+click, or drag a box on the process flow, to select a group of programs to run. Select the first five, say, and run them, then use the 'run branch from program...' option to run from that point on. You could also make separate 'branches' and run just one branch at a time, making each branch dependent on the input streams.
A third option would be to have different starting points for different analysis tasks, and have the import data bit be after that starting point. It could be common to the starting points, and use macro variables and conditional execution to go different directions. For example, you could have a macro variable set in the first program that says which analysis program you're running, then the conditional from the last import step (which are in sequence, not in parallel like you have them) send you off to whatever analysis task the macro variable says. You could also have macro variables that indicate whether an import has been run once already in the current session that then would tell you not to rerun it via conditional steps.
Unfortunately, though, there's no direct way to run something and say 'run this and all of its dependencies'.

Change stored macro SAS

In SAS using SASMSTORE option I can specify a place where the SASMACR catalog will exist. In this catalog will reside some macro.
At some point I may need to change a macro, and that may happen while the macro, and therefore the catalog, is in use by another user. The catalog will then be locked and cannot be modified.
How can I avoid such a situation?
If you're using a SAS Macro catalog as a public catalog that is shared among colleagues, a few options exist.
First, use SVN or a similar source control tool so that you and your colleagues each have a local copy of the macro catalog. This is my preferred option. I'd do this, and I'd also probably not use stored compiled macros (SCMs) at all; I'd set the macros up as autocall macros instead, because having a separate file for each macro makes it easy to resolve conflicts. With SCMs you won't be able to resolve conflicts, so you'll have to make sure everyone always downloads the newest copy before making any changes and discusses changes in advance, so you don't end up with two competing changes made at about the same time. If SCMs are important for your particular use case, you could version control the macro source files that create the SCMs and rebuild the SCM yourself every time you refresh your local copy of the sources.
Second, you could and should separate development from production here. Even if you have a shared library located on a shared network folder, you should have a development copy as well that is explicitly not locked by anyone except when developing a new macro for it (or updating a currently used macro). Then make your changes there, and on a consistent schedule push them out once they've been tested and verified (preferably in a test environment, so you have the classic three: dev, test, and prod environments). Something like this:
Changes in Dev are pushed to Test on Wednesdays. Anyone who's got something ready to go by Wednesday 3pm puts it in a folder (the macro source code, that is), and it's compiled into the Test SCM automatically.
Test is then verified Thursday and Friday. Anything that is verified in Test by 3pm Friday is pushed to the Production source code folder at that time, paying attention to any potential conflicts with other new code in Test (nothing is pushed to Production if something currently in Test but not yet verified could conflict with it).
The Production build is then run at 3pm Friday. Everyone has to be out of the SCM by then.
I suggest not using Friday for prod if you have something that runs over the weekend, of course, as it risks you having to fix something over the weekend.
Create two folders, e.g. maclib1 and maclib2, and a dataset which stores the current library number.
When you want to rebuild your library, query the current number, increment (or reset to 1 if it's already 2), assign your macro library path to the corresponding folder, compile your macros, and then update the dataset with the new library number.
When it comes to assigning your library, query the current library number from the dataset, and assign the library path accordingly.

How can I stop tables appearing in Enterprise guide?

I'm writing a procedure for other users to run in Enterprise Guide based on SAS 9.3. It logs various bits of information to a table. Is there any way to stop this table appearing in the process flow?
NB This is almost all done using "User written code" steps. Unfortunately the setting in the menu (see vasja's answer below) does not seem to affect UWC steps.
(I've seen this: Tell SAS not to add newly generated tables on the Process Flow but I'm using 9.3 so it doesn't work!)
A colleague (twitter.com/binarytrain) figured out a solution.
Tables are always added to EG projects in 9.3 if, at the end of the code step, the library in which it exists is still assigned(1). So, in the question above, the trick is to clear the libname at the end of the code step.
This can further be used to "discourage" - not stop - users from meddling with temporary tables.
Create a folder in &sasworklocation called _work
Register it as a library
Save any temporary tables in this new library
Clear this library at the end of the code step
At this point the temporary table is inaccessible without running a libname statement
Re-register the library when the table is required again.
(1) Even if it's assigned using a different name, so this won't work for pre-assigned libraries.
In EG 5.1:
go to Tools - Options, select Result General:
deselect Automatically add output to the project tree.

Wix: How to add files to the RemoveFiles table from c++

I've been following the advice in this question.
How to add a WiX custom action that happens only on uninstall (via MSI)?
I have an executable running as a custom action after InstallFinalize, which I intend to use to purge all my files and folders. I was just going to write some standard deletion logic, but I'm stuck on the point Rob Mensching made: that the Windows Installer should handle this in case someone bails midway through an uninstallation.
"create a CustomAction that adds temporary rows to the RemoveFiles table"
I'm looking for some more information on this. I'm not really sure how to achieve this in c++ and my searching hasn't turned up a whole lot.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa371201(v=vs.85).aspx
Thanks
Neil
EDIT: I've marked the answer as accepted because the question is specifically about how to add files to the RemoveFiles table in C++. However, I'm inclined to agree that the better solution is to use the RemoveFolderEx functionality in WiX, even though it is currently in beta (3.6, I think).
Roughly, you will have to use the following functions in this order:
MsiDatabaseOpenView - the (input) handle is the database handle; inside a custom action function you get it by passing the install handle you were given to MsiGetActiveDatabase
MsiCreateRecord - to create a record that carries the values for the SQL statement
MsiRecord* - set of functions to prepare the record
MsiViewExecute - to insert the new record into whatever table you please ...
MsiCloseHandle - for the view handle from the very first step, the record handle (from MsiCreateRecord), and the database handle
Everything is explained in detail over at MSDN. However, pay special attention to the section "Functions Not for Use in Custom Actions".
The documentation of MsiViewExecute also explains how the SQL queries should look. To get a feel for them you may want to use one of the .vbs scripts that are part of the Windows Installer SDK.
If you use WiX to create your installation package, consider using the RemoveFolderEx element. It does what you want and you don't have to write the code yourself.
Read Tactical directory nukes for an example of how to use it.
If you still want to implement it yourself, you can get your inspiration from this blog post, there's the code for doing this in VBScript.

Scan for changed files

I'm looking for a good, efficient method for scanning a directory structure for changed files in Windows XP+. Something like how git does it is exactly what I'm looking for: running git status displays all modified files, all new (untracked) files and all deleted files very quickly.
I have a basic model up and running which performs an initial scan and stores all filenames, size, dates and attributes.
On a subsequent scan it checks if the size, attributes or date have changed and marks as a changed file.
My issue now comes in detecting moved and deleted files. Is there a tried and tested method for this sort of thing? I'm struggling to come up with a good method.
I should mention that it will eventually use ReadDirectoryChangesW to monitor files and alert the user when something changes so a full scan is really a last resort after the initial scan.
Thanks,
J
EDIT: I think I may have described the problem badly. The issue I'm facing is not so much detecting the changes (I have ReadDirectoryChangesW() using IOCP on multiple threads to detect when a change happens); the issue is more what to do with the information. For example, a moved file is reported as a delete followed by a create, and a rename comes in two parts: the old name, followed by the new name. So what I'm asking is how to differentiate between a delete that is part of a move and an actual delete. I'm guessing that buffering the changes and processing them in batches would be an option, but it feels messy.
In native code FileSystemWatcher is replaced by ReadDirectoryChangesW. Using this properly is not simple, there is a good baseline to build off here.
I have used this code in a previous job and it worked pretty well. The Win32 API itself (and FileSystemWatcher) is prone to problems that are described in the docs and also discussed in various places online, but the impact of those will depend on your use case.
EDIT: the exact change is indicated in the FILE_NOTIFY_INFORMATION structure that you get back - adds, removals, rename data including old and new name.
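The buffering idea from the question's edit could look roughly like the sketch below. It is portable C++ and does not call the Win32 API: RawEvent, Change and coalesce are hypothetical names made up for illustration, and the rule "a remove followed by an add of the same file name within a short window is a move" is an assumed heuristic, not something the API guarantees.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical event model; this is NOT the FILE_NOTIFY_INFORMATION layout.
enum class Kind { Removed, Added, Moved };

struct RawEvent {
    Kind kind;           // Removed or Added, as reported by the watcher
    std::string path;    // full path of the affected file
    std::string name;    // file name without its directory
    std::uint64_t tick;  // arrival time in milliseconds
};

struct Change {
    Kind kind;
    std::string from;    // old path (empty for Added)
    std::string to;      // new path (empty for Removed)
};

// Pairs each Removed event with a later Added event for the same file name
// that arrives within windowMs and reports the pair as one Moved change.
// Unpaired events pass through unchanged.
std::vector<Change> coalesce(const std::vector<RawEvent>& events,
                             std::uint64_t windowMs) {
    // Precondition (assumption of this sketch): events are ordered by arrival time.
    assert(std::is_sorted(events.begin(), events.end(),
        [](const RawEvent& a, const RawEvent& b) { return a.tick < b.tick; }));

    std::vector<Change> out;
    std::vector<bool> used(events.size(), false);
    for (std::size_t i = 0; i < events.size(); ++i) {
        if (used[i]) continue;  // already consumed as the second half of a move
        const RawEvent& e = events[i];
        if (e.kind != Kind::Removed) {
            out.push_back(Change{Kind::Added, "", e.path});
            continue;
        }
        bool paired = false;
        for (std::size_t j = i + 1; j < events.size(); ++j) {
            const RawEvent& c = events[j];
            if (c.tick - e.tick > windowMs) break;  // outside the buffering window
            if (!used[j] && c.kind == Kind::Added && c.name == e.name) {
                used[j] = true;
                out.push_back(Change{Kind::Moved, e.path, c.path});
                paired = true;
                break;
            }
        }
        if (!paired) out.push_back(Change{Kind::Removed, e.path, ""});
    }
    return out;
}
```

The window length is a trade-off: a longer window catches slower moves but delays reporting genuine deletes, and matching on file name alone can mis-pair unrelated files that happen to share a name.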
I voted Liviu M. up. However, another option if you don't want to use the .NET framework for some reason, would be to use the basic Win32 API call FindFirstChangeNotification.
You can use USN journaling if you are up to it, that is pretty low level (NTFS level) stuff.
Here you can find detailed information and source code included. It is written in C# but most of it is PInvoking C/C++ functions.
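For the full-scan fallback described in the question (store names, sizes and dates, then compare on the next scan), one possible way to classify moved and deleted files is sketched below. It is a minimal illustration in portable C++: FileInfo, Snapshot, Diff, diffSnapshots and diffDemo are hypothetical names, and treating a vanished path and a new path with identical size and timestamp as a move is an assumed heuristic that can misfire on files with identical metadata.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct FileInfo {
    std::uint64_t size;
    std::int64_t mtime;  // last-write time in any consistent unit
};

typedef std::map<std::string, FileInfo> Snapshot;  // path -> metadata

struct Diff {
    std::vector<std::string> added, removed, modified;
    std::vector<std::pair<std::string, std::string> > moved;  // old path, new path
};

Diff diffSnapshots(const Snapshot& before, const Snapshot& after) {
    typedef std::pair<std::uint64_t, std::int64_t> Key;  // (size, mtime)
    Diff d;
    std::multimap<Key, std::string> gone;  // paths that vanished, by metadata

    // Pass 1: paths from the old scan are unchanged, modified, or gone.
    for (Snapshot::const_iterator it = before.begin(); it != before.end(); ++it) {
        Snapshot::const_iterator now = after.find(it->first);
        if (now == after.end()) {
            gone.insert(std::make_pair(Key(it->second.size, it->second.mtime), it->first));
        } else if (now->second.size != it->second.size ||
                   now->second.mtime != it->second.mtime) {
            d.modified.push_back(it->first);
        }
    }
    // Pass 2: a new path whose metadata matches a vanished path counts as a move.
    for (Snapshot::const_iterator it = after.begin(); it != after.end(); ++it) {
        if (before.count(it->first)) continue;
        std::multimap<Key, std::string>::iterator hit =
            gone.find(Key(it->second.size, it->second.mtime));
        if (hit != gone.end()) {
            d.moved.push_back(std::make_pair(hit->second, it->first));
            gone.erase(hit);
        } else {
            d.added.push_back(it->first);
        }
    }
    // Whatever is still unmatched was really deleted.
    for (std::multimap<Key, std::string>::iterator it = gone.begin(); it != gone.end(); ++it)
        d.removed.push_back(it->second);
    return d;
}

// Usage example with made-up paths and metadata.
void diffDemo() {
    Snapshot before, after;
    before["a.txt"] = FileInfo{10, 1};
    before["c.txt"] = FileInfo{30, 3};
    after["a.txt"] = FileInfo{10, 1};
    after["sub/c.txt"] = FileInfo{30, 3};  // same size and time, new location
    Diff d = diffSnapshots(before, after);
    assert(d.moved.size() == 1 && d.moved[0].first == "c.txt");
    assert(d.added.empty() && d.removed.empty() && d.modified.empty());
}
```

If false matches matter, a content hash could be added to the key at the cost of reading each file during the scan.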