Macro lost after reading in new file - stata

Using Stata, I define a local macro (macro_name) as a variable (macro_variable) in one data file.
After reading in a new file (in the same do file), I'm no longer able to reference that macro.
Instead, I receive the error:
. di `macro_name'
macro_variable not found
I am learning how to use macros, so please bear with me. However, shouldn't I be able to still display or call on that macro in a single do file even if I load in a new data set?
For example:
use "newdata.dta", clear
This problem occurs regardless of whether I define the macro as a global or local. Additionally, I attempted to solve the problem by creating a separate locals.do file that I include in the preamble of my master do file as:
include locals.do
But, I still receive the error listed above.
Do macros (local or global) disappear immediately upon reading in a new file? That doesn't seem right based on what I've read.
Thanks in advance for any clarification.

Consider the following, which points to the source of your problem, and in the last command, reproduces precisely the error message you received.
. do "/var/folders/xr/lm5ccr996k7dspxs35yqzyt80000gp/T//SD08491.000000"
. local macro_name macro_variable
. macro list _macro_name
_macro_name: macro_variable
. display "`macro_name'"
macro_variable
. display `macro_name'
macro_variable not found
r(111);
end of do-file
Added in edit: The above was run from the do-file editor window. When I instead launch Stata and paste the four commands into the command window, running them a line at a time, the following is what results.
. local macro_name macro_variable
. macro list _macro_name
_macro_name: macro_variable
. display "`macro_name'"
macro_variable
. display `macro_name'
macro_variable not found
r(111);
.
At the risk of over-explaining, the point of my original answer is that the error message displayed in the original post, and in the final command in both of my examples, was due to the failure to include quotation marks in the display command. Without them, display took "macro_variable", the value assigned to the local macro "macro_name", to be a variable name or scalar rather than a character string constant, and it was unable to locate a variable or scalar by that name.
Let me add as a bonus explanation that local macros are local to the do-file in which they are defined, and vanish at the termination of that do-file, so a locals.do executed with do or run has no hope of working. The include command is the exception: it executes the lines of the file as though they were part of the including do-file, so local macros defined there remain available afterwards. In particular, if you submit a local command by selecting a subset of the lines in the do-file editor window, those lines are copied into a temporary do-file and the values of the local macros vanish at the termination of the temporary do-file.

Generalizing what I wrote in my comment above to Nick:
A macro only maintains the connection between the macro name and the variable/varlist assigned to it. Therefore, the variable/varlist to which the macro refers must be in memory (i.e. the dataset that contains the variable/varlist has to be in memory) in order to access it via the macro.
Assigning a variable/varlist to a macro does not persist the actual value(s)/element(s) in memory, but rather maintains the connection between the variable/varlist and the macro name assigned to it/them.
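To make the distinction concrete, here is a minimal sketch using the auto and census datasets that ship with Stata (chosen for illustration; they are not the poster's files). It assumes the lines are run together as one do-file, so the local macro stays in scope throughout.

```stata
sysuse auto, clear
local macro_name price            // the macro stores the text "price", not the data

display "`macro_name'"            // quoted: displays the text price
summarize `macro_name'            // expands to: summarize price

sysuse census, clear              // replace the dataset in memory
display "`macro_name'"            // the macro is still defined and still holds price

// census.dta has no variable named price, so the expanded command fails with r(111);
// capture noisily shows the error without stopping the do-file
capture noisily summarize `macro_name'
```

The macro survives loading the new dataset; what no longer exists is the variable that its text refers to.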

Related

Stata : generate/replace alternatives?

I have used Stata for several years now, along with other languages like R.
Stata is great, but there is one thing that annoys me: the generate/replace behaviour, and especially the "... already defined" error.
It means that if we want to run a piece of code twice, and this piece of code contains the definition of a variable, this definition needs 2 lines:
capture drop foo
generate foo = ...
In other languages such as R, it takes just one line.
So is there another way to define variables that combines "generate" and "replace" in one command?
I am unaware of any way to do this directly. Further, as @Roberto's comment implies, there are reasons why simply issuing a generate command will not overwrite (see: replace) the contents of a variable.
To be able to do this while maintaining data integrity, you would need to issue two separate commands as your question points out (explicitly dropping the existing variable before generating the new one). I see this as a way in which Stata forces the user to be clear about his/her intentions.
It might be noted that Stata is not alone in this regard. SQL Server, for example, requires the user to drop an existing table before creating a table with the same name (in the same database), does not allow multiple columns with the same name in a table, etc., and all for good reason.
However, if you are really set on being able to issue a one-liner in Stata to do what you desire, you could write a very simple program. The following should get you started:
program mkvar
    version 13
    syntax anything=exp [if] [in]
    capture confirm variable `anything'
    if !_rc {
        drop `anything'
    }
    generate `anything' `exp' `if' `in'
end
You would then save the program to mkvar.ado in a directory that Stata can find (e.g., C:\ado\personal\ on Windows; if you are unsure, type sysdir), and call it using:
mkvar newvar=expression [if] [in]
Now, I haven't tested the above code much, so you may have to do a bit of debugging, but it has worked fine in the examples I've tried.
On a closing note, I'd advise you to exercise caution when doing this - certainly you will want to be vigilant with regard to altering your data, retain a copy of your raw data while a do file manipulates the data in memory, etc.

What happens to my data after hitting the break key in Stata?

Suppose I had the following structure for a script called mycode.do in Stata
-some code to modify original data-
save new_data, replace
-some other code to perform calculations on new_data-
Now suppose I press the break button to stop Stata after it has saved new_data in the script. My understanding is that Stata will undo the changes made to the data if it is interrupted with the break button before it has finished. Following such interruption, will Stata erase new_data.dta from memory if it didn't exist initially (or revert it back to its original form if it already existed before mycode.do was executed)?
Stata documentation says "After you click on Break, the state of the system is the same as if you had never issued the original command." However, it sounds as if you expect that it treats an entire do-file as a "command". I do not believe this is the case. I believe once the save is completed, then the file new_data has been replaced, and Stata is not able to revert the file to the version before the save.
The Stata User's Guide also says, in the documentation for Stata release 13, [U] 16.1.4 Error handling in do-files, "If you press Break while executing a do-file, Stata responds as though an error has occurred, stopping the do-file." Example 4 discusses this further and seems to support my interpretation.
This seems to me to have interesting implications for Stata "commands" that are implemented as ado files.

foreach loop running but not giving results

I am having trouble running a foreach loop. The loop runs without error but gives no output. Can someone tell me what they think might be going on? Many thanks in advance!
Here is the code:
cd "O:\RESEARCH\ikhilko\Subway Big Data project"
local datafiles : dir . files "*.txt"
foreach file in `datafiles' {
insheet using `file',
clear
insheet using `file',
drop v9-v43
save date1, replace
}
UPDATE:
Interestingly, the code runs when I just type it into the command line, rather than doing it from the .do file, any idea what might be going on there?
It is important to note that local macros are precisely that, i.e. defined and visible only locally.
Locally means within the same interactive session, or the same program, or the same do file (or do file editor contents), or the same part of the do file (or do file editor contents) executed by selection.
Locality is, it seems, biting you here. A local macro defined in one place is not visible in another. A local macro reference will evaluate to missing, i.e. an empty string, if the macro is not visible.
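A quick way to see this behaviour is the following sketch, where not_defined_here is a made-up macro name that has never been defined:

```stata
display "[`not_defined_here']"     // an undefined local expands to nothing
foreach file in `not_defined_here' {
    display "processing `file'"    // never executed: the list is empty
}
```

The loop body runs zero times and no error is raised, which matches the symptom of a loop that "runs" but produces no output.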
Some code for debugging: display the contents of your local datafiles to see what's going into the loop:
local datafiles : dir . files "*.txt"
display `"`datafiles'"'
local wordx : word 1 of `datafiles'
display `"`wordx'"'
foreach file in `datafiles' {
display "`file'"
}
(The code does not format well in the comments section.)

How to carriage return a long local list and how to define list only once

My first question is simple, but cannot find any answer anywhere and it's driving me crazy:
When defining a local list in Stata how do I do a carriage return if the list is really long?
The usual /// doesn't work when inside double quotation marks.
For example, this doesn't work:
local reglist "lcostcrp lacres lrain ltmax ///
ltmin lrainsq lpkgmaiz lwage2 hyb gend leducavg ///
lageavg ldextn lfertskm ldtmroad"
It does work when I remove the quotation marks, but I am warned that I should include the quotations.
My second question is a more serious problem:
Having defined the local reglist, how can I get Stata to remember it for multiple subsequent uses (that is, not just one)?
For example:
local reglist lcostcrp lacres lrain ltmax ///
ltmin lrainsq ///
lpkgmaiz lwage2 ///
hyb gend leducavg lageavg ldextn lfertskm ldtmroad
reg lrevcrp `reglist' if lrevcrp~=.,r
mat brev=e(b)
mat lis brev
/*Here I have to define the local list again. How do I get Stata to remember
it from the first time ??? */
local reglist lcostcrp lacres lrain ltmax ///
ltmin lrainsq ///
lpkgmaiz lwage2 ///
hyb gend leducavg lageavg ldextn lfertskm ldtmroad
quietly tabstat `reglist' if lrevcrp~=., save
mat Xrev=r(StatTotal),1
mat lis Xrev
Here, I define the local reglist, then run a regression using this list and do some other stuff.
Then, when I want to get the means of all the variables in the local reglist, Stata doesn't remember it anymore and I have to define it again. This defeats the whole purpose of defining a list.
I would appreciate it if someone could show me how to define a list just once and be able to call it as many times as one likes.
The best answer to your first question is that if you are typing a long local definition in a command, then (1) you don't need to type a carriage return, you just keep on typing and Stata will wrap around and/or (2) there is a better way to approach local definition. I wouldn't usually type long local definitions interactively because that is too tedious and error-prone.
The quotation marks are not essential for examples like yours, only essential for indicating strings with opening or closing spaces.
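For instance, in a do-file either of the following forms works without quotation marks. This is a sketch using the variable names from the question; the second form, building the list in stages under the illustrative name reglist2, is just one possible alternative. Note that /// continuation is for do-files and does not work in the command window.

```stata
* Option 1: continue the line with /// and leave out the quotation marks
local reglist lcostcrp lacres lrain ltmax ///
    ltmin lrainsq lpkgmaiz lwage2 hyb gend leducavg ///
    lageavg ldextn lfertskm ldtmroad

* Option 2: build the same list up in stages
local reglist2 lcostcrp lacres lrain ltmax
local reglist2 `reglist2' ltmin lrainsq lpkgmaiz lwage2 hyb gend leducavg
local reglist2 `reglist2' lageavg ldextn lfertskm ldtmroad

display "`reglist'"
display "`reglist2'"
```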
Your second question is mysterious. Stata won't forget definitions of local macros in the same program (wide sense) unless you explicitly blank out that macro, i.e. redefine it to an empty string. Here program (wide sense) means program (narrow sense), do-file, do-file editor contents, or main interactive session. You haven't explained why you think this happens. I suspect that you are doing something else, such as writing some of your code in the do-file editor and running that in combination with writing commands interactively via the command window. That runs into the difficulty alluded to: local macros are local to the program they are defined in, so (in the same example) macros defined in the do-file editor are local to that environment but invisible to the main interactive session, and vice versa.
I suggest that you try to provide an example of Stata forgetting a local macro definition that we can test for ourselves, but I am confident that you won't be able to do it.
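A minimal sketch of such a test, using the auto dataset shipped with Stata (an assumption for illustration, since the poster's data are not available). Run as a single do-file, or as a single selection in the do-file editor, the macro is defined once and used twice:

```stata
sysuse auto, clear
local reglist mpg weight length       // defined once

regress price `reglist', robust       // first use
matrix brev = e(b)
matrix list brev

quietly tabstat `reglist', save       // second use, no redefinition needed
matrix Xrev = r(StatTotal), 1
matrix list Xrev
```

If the two halves are instead run as separate selections, the second selection executes in its own temporary do-file and the macro is empty there.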

SAS: Set current folder to the folder containing the running program

I've just started learning SAS because I'm required to use it for a statistics course. For this course, the university provides SAS 9.2 through their virtual-machine setup: I make a reservation in their system, they generate a VM on one of their servers, and I connect to the VM using Microsoft's Remote Desktop client. The virtual machines are generated and erased per session; settings are reset every time, and files must be stored on my client computer (which is accessible in the VM by a UNC path).
Within this setup, when I open a program file stored on my laptop, I've only been able to access the accompanying data files (each stored in the same folder as the program) either by hardcoding the full path or by updating the "current folder" setting at the beginning of each session. The first is problematic because it means the program won't run anywhere else - in particular, when I email it to the professor. The second is inconvenient, because browsing to this particular UNC path is time consuming, and I already have to browse to the same path to open the program file.
I want to make this easier by programmatically setting the current folder to the folder containing the program. Then I could just open the file and get to work. I've found some examples of getting the filename of the program file, of getting the path to a fileref, and of (link limit exceeded) setting the current folder, but I haven't been able to combine them in the right way. Please connect the dots for me.
To programmatically change the Windows current directory from SAS, you can use the X command, which is what really happens when you use the "Change current folder" dialog box:
x 'cd "\\computername\share name\folder"';
You can also do this using the SYSTEM data step function, a method I prefer because you get a return code (but more typing of course):
data _null_;
rc = system( 'cd "\\computername\share name\folder"' );
if rc = 0
then putlog 'Command successful';
else putlog 'Command failed';
run;
Note the UNC path is surrounded with double-quotes, which is necessary if the path contains blanks.
Of course, this still requires you to manually type in the command, but it might be something you could add to the program source code. If your VM environment allowed you to maintain some permanent presence on the server, you could save this command into a start-up file.
I would ask your professor for advice; if you are working with data given to you as part of your class, you may only need to send the source code. On the other hand, if you are creating output data as part of your assignment, your professor might want you to deliver source code and SAS data sets. Surely he or she will have some procedure.
Complete Answer:
SAS's obtuse notation requires some strange delimiter fiddling to combine my partial solution (finding the path) with @Bob Duell's partial solution (setting the current folder). There seem to be two key rules involved:
&var is expanded in double-quoted strings ("&var"), but not single-quoted strings ('&var')
Quotes in &var are not treated as delimiters after expansion
So the solution is to compute a string of the quoted path (where the quotes are part of the string), and expand that within a double-quoted parameter to X or SYSTEM:
%let qsrc=%str(%")&src%str(%");
X "cd &qsrc";
It's not required to store the string, both &src and &qsrc can be expanded in-place, which yields a single statement solution:
X "cd %str(%")%substr(%sysget(SAS_EXECFILEPATH),1,%eval(%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILENAME))))%str(%")";
This executes correctly, but breaks the syntax coloring in the GUI. Within a string, %str(%") and "" both expand to ", so replacing %str(%") with "" both executes correctly and is colored correctly in the GUI:
X "cd ""%substr(%sysget(SAS_EXECFILEPATH),1,%eval(%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILENAME))))""";
This inherits the limitation that it only works when SAS_EXECFILEPATH and SAS_EXECFILENAME are defined, which is the case when running from within the Windows GUI editor. It's also subject to any limitations of the "cd" command, which SAS intercepts rather than invoking the Windows command line. I expect it will fail on paths containing quotes.
A partial answer: One way to get the containing folder from the filename of the program file
Spread out & logging steps:
/* Find PathName of folder containing program */
%let FullName=%sysget(SAS_EXECFILEPATH);
%put FullName: &FullName.;
%let FullLen=%length(&FullName);
%put FullLen: &FullLen.;
%let BaseName=%sysget(SAS_EXECFILENAME);
%put BaseName: &BaseName.;
%let BaseLen=%length(&BaseName);
%put BaseLen: &BaseLen.;
%let PathLen=%eval(&FullLen.-&BaseLen.);
%put PathLen: &PathLen.;
%let PathName=%substr(&FullName,1,&PathLen);
%put PathName: &PathName.;
Consolidated & silent:
/* Find src folder */
%let src=%substr(%sysget(SAS_EXECFILEPATH),1,%eval(%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILENAME))));
This only works when SAS_EXECFILEPATH and SAS_EXECFILENAME are defined, and it's not clear when that is. It does work when using the Windows GUI editor.