Splitting long file path in Stata

Assume that I have a long file path (80+ characters) from my current working folder:
use .\random_folders_name\project1\secret_data\survey_data\big_constructed_file.dta
I am looking for a way to split it into two lines to comply with an 80-character line standard.
I've tried
use .\random_folders_name\project1\secret_data\survey_data///
\big_constructed_file.dta
and
use ".\random_folders_name\project1\secret_data\survey_data"///
+ "\big_constructed_file.dta"
without success.
I would prefer not to change the working directory, as that would make it necessary to change it back.

+ can be used for string concatenation but only within an expression to be evaluated.
This works
clear
set obs 1
gen whatever = "a" + "b"
and this works
local whatever = "a" + "b"
di "`whatever'"
Putting one or more parts of the string in a local macro is one way to do what you want, and it is what I would recommend for keeping each line within 80 characters.
local dir ".\random_folders_name\project1\secret_data\survey_data\"
use "`dir'big_constructed_file.dta"
You could do this:
local name = ".\random_folders_name\project1\secret_data\survey_data" + ///
"\big_constructed_file.dta"
use "`name'"
That's the closest I could get to taking your approach and making it work.
On backslashes, watch out: http://www.stata-journal.com/sjpdf.html?articlenum=pr0042

Related

Starting position for replace function in db2

I'm converting some Access VBA functionality to DB2 and found a vital difference. VBA lets you specify the starting point in the character string you're working on. DB2 doesn't have that option: it starts from position 1 and replaces every occurrence in the whole string. How can I make DB2 start the replace at a specified place in the string? For example, my string is "Incongruent Plastics Incorporated" and I want to replace the second word beginning with "Inc", the "Incorporated" at position 22, with "Inc". I'm doing this in a WHILE loop, going through long strings, replacing parts of them until their length is below a specified maximum (15 or 30 depending on the field).
I looked at the Locate function, but I'm not sure that's right.
Replace(a.PAYEE_STD_NAME, B.FullWord, B.abbreviation, B.mLastWord)
Where a.PAYEE_STD_NAME is the string I'm looking at, B.FullWord is what I want to replace, B.abbreviation is what I want to replace it with, and B.mLastWord is the position where I want to start replacing. Something like Replace("Incongruent Plastics Incorporated","Incorporated","Inc",22)
I expect the characters to be replaced starting in the position I need, towards the back of the string, not in the beginning.
Thanks!
I'm not that good at DB2, but that limitation can generally be worked around by using SUBSTR.
The equivalent of Replace(a.PAYEE_STD_NAME, B.FullWord, B.abbreviation, B.mLastWord) would be:
CONCAT(SUBSTR(a.PAYEE_STD_NAME, 1, B.mLastWord - 1), Replace(SUBSTR(a.PAYEE_STD_NAME, b.mLastWord), B.FullWord, B.abbreviation))
This assumes b.mLastWord is greater than 1; if it's 1, you can use a normal REPLACE.
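To make the head/tail idea concrete, here is a minimal sketch of the same logic in Python. It is only an illustration of the technique, not DB2 syntax, and the function name replace_from is hypothetical. The head of the string before the start position is kept untouched, and the replacement is applied only to the tail.

```python
def replace_from(text: str, old: str, new: str, start: int) -> str:
    """Replace `old` with `new` only at or after the 1-based position `start`.

    Hypothetical helper mirroring CONCAT(SUBSTR(s, 1, start - 1),
    REPLACE(SUBSTR(s, start), old, new)).
    """
    if start < 1:
        raise ValueError("start is 1-based and must be >= 1")
    head = text[: start - 1]  # plays the role of SUBSTR(text, 1, start - 1)
    tail = text[start - 1:]   # plays the role of SUBSTR(text, start)
    return head + tail.replace(old, new)


name = "Incongruent Plastics Incorporated"
# Only the "Inc" at position 22 is touched; "Incongruent" is left alone.
print(replace_from(name, "Inc", "X", 22))
print(replace_from(name, "Incorporated", "Inc", 22))
```

In this sketch a start position of 1 needs no special case, because the head is simply empty.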
Maybe consider using REGEXP_REPLACE https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0061496.html
and possibly consider recursive SQL rather than looping logic.

Using regex (or similar) in PowerShell to rearrange extracted version number

I am using the below PowerShell command to extract the File Version parameter from an executable and write to a variable, which is working great (found extraction command here: Get file version in PowerShell)
$ver = [System.Diagnostics.FileVersionInfo]::GetVersionInfo("somefilepath").FileVersion
Using the above, I will usually have a File Version that looks like:
11.2.1617.1
What I would like to achieve is to rearrange and slightly amend that output so that it reads:
11.2.1.617
Note that the "1" has been removed from "1617" and that ".617" and ".1" have then swapped places.
I would like the corrected version number to be stored in a new variable (e.g. $vernew) so that I have both the original value and new value in two different variables. The File Version will change over time, but the format will always be the same (e.g. XX.X.XXX.X).
Can anyone please suggest the most appropriate way of achieving this? Any help would be greatly appreciated.
For swapping alone, you can use a simple -replace operation:
$ver = $ver -replace '^(\d+)\.(\d+)\.(\d+)\.(\d+)','$1.$2.$4.$3'
This captures the four groups of digits and swaps the last two.
But since you need to change one of the captured values, I'd suggest doing it like this instead:
# grab the individual parts
$major,$minor,$build,$revision = $ver -split '\.'
# remove first character from the 3rd block
$build = $build.Substring(1)
# concatenate them in the new order
$newver = $major,$minor,$revision,$build -join '.'
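The same split, trim, and rejoin steps can be sketched in Python to check the logic against the example value. This is only an illustration of the approach above, not PowerShell, and the function name rearrange_version is hypothetical. It assumes the version always has exactly four dot-separated parts.

```python
def rearrange_version(ver: str) -> str:
    """Drop the first character of the third part, then swap the last two parts.

    Hypothetical helper; assumes exactly four dot-separated parts.
    """
    major, minor, build, revision = ver.split(".")
    build = build[1:]  # remove the first character of the third block
    return ".".join([major, minor, revision, build])


ver = "11.2.1617.1"
vernew = rearrange_version(ver)  # the original value stays in `ver`
print(vernew)  # prints 11.2.1.617
```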

Advanced Lua Pattern Matching

I would like to know if either/both of these two scenarios are possible in Lua:
I have a string that looks like such: some_value=averylongintegervalue
Say I know there are exactly 21 characters after the = sign in the string. Is there a short way to replace the string averylongintegervalue with my own? (i.e. a simpler way than typing out: string.gsub("some_value=averylongintegervalue", "some_value=.....................", "some_value=anewintegervalue"))
Say we edit the original string to look like such: some_value=averylongintegervalue&
Assuming we do not know how many characters are after the = sign, is there a way to replace the string in between the some_value= and the &?
I know this is an oddly specific question but I often find myself needing to perform similar tasks using regex and would like to know how it would be done in Lua using pattern-matching.
Yes, you can use something like the following (%1 refers to the first capture in the pattern, which in this case captures some_value=):
local str = ("some_value=averylongintegervalue"):gsub("(some_value=)[^&]+", "%1replaced")
This should assign some_value=replaced to str.
Do you know if it is also possible to replace every character between the = and & with a single character repeated (such as a * symbol repeated 21 times instead of a constant string like replaced)?
Yes, but you need to use a function:
local str = ("some_value=averylongintegervalue")
:gsub("(some_value=)([^&]+)", function(a,b) return a..("#"):rep(#b) end)
This will assign some_value=#####################. If you need to limit this to just one replacement, then add ,1 as the last parameter to gsub (as Wiktor suggested in the comment).
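For comparison, here is a minimal sketch of the same function-as-replacement idea using Python's re module. It is not Lua, and the helper name mask_value is hypothetical. The first group keeps the key, and the second group is replaced by one # per captured character.

```python
import re


def mask_value(s: str, key: str = "some_value") -> str:
    """Mask everything between `key=` and the next & with # characters.

    Hypothetical helper; only the first match is replaced (count=1).
    """
    pattern = "(" + re.escape(key) + "=)([^&]+)"
    return re.sub(
        pattern,
        lambda m: m.group(1) + "#" * len(m.group(2)),  # same length as the value
        s,
        count=1,
    )


print(mask_value("some_value=averylongintegervalue&other=1"))
```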

how do I loop through file names in stata

1) Is it possible to create a vector of strings in Stata? 2) If yes, is it then possible to loop through the elements in this vector, performing commands on each element?
To create a single string in Stata I know you do this:
local x = "a string"
But I have about 200 data files I need to loop through, and they are not conveniently named with consecutive suffixes like "_2000" "_2001" "_2002" etc. In fact there is no rhyme or reason to the file names, but I do have a list of them which I could easily cut and paste into a string vector, and then call the elements of this vector one by one, as one might do in MATLAB.
Is there a way to do this in Stata?
On top of Keith's answer: you can also get the list of files in a directory with
local myfilelist : dir . files "*.dta"
or more generally
local theirfilelist : dir <directory name> files <file mask>
See help extended_fcn.
Sure -- You just create a list using a typical local call. If you don't put quotes around the whole thing your lists can be really long.
local mylist aaa bbb "cc c" dd ee ff
Then you just use foreach.
foreach filename of local mylist {
use `"`filename'"'
}
The compound double quotes (`" "') are used because one of the filenames contains a space and so is itself enclosed in double quotes. Using foreach filename of local mylist is a touch faster than putting foreach filename in `mylist' { on the first line.
If you want to manipulate your list, see help macrolists.
Related questions have been asked more than once on Stack Overflow:
In Stata how do you assign a long list of variable names to a local macro?
Equivalent function of R's "%in%" for Stata
Many people might want a combination of the two, as I did. Here it is:
* Create a local containing the list of files.
local myfilelist : dir "." files "*.dta"
* Or manually create the list by typing in the filenames.
* Compound double quotes keep Stata from stripping the outer pair of plain quotes.
local myfilelist `""file1.dta" "file2.dta" "file3.dta""'
* Then loop through them as you need.
foreach filename of local myfilelist {
use "`filename'"
}
I hope that helps. Note that the length of a local macro is limited (67,784 characters in some Stata editions; see help limits), so watch out for this when you have a really long list of files or really long filenames.

transfer values from one variable to another in Stata

I have a problem at work: I have merged two datasets, and there are a number of variables which have the same content, but where an observation that has a value in the variable from dataset 1 has a missing value in the variable from dataset 2. So I need to transfer the values from one variable into the other.
This is my best shot so far:
replace V23=1 if V232==1
replace V23=2 if V232==2
replace V23=3 if V232==3
replace V23=4 if V232==4
replace V23=8 if V232==8
replace V23=.u if V232==10 | V232==9
However, it is a tedious task to do that for 40+ variables, and since some of them are numerical variables, it becomes a Sisyphean task.
Here's a start:
foreach v of varlist V23 {
* the partner variable has the same name with a 2 appended
local w `v'2
* fill in missing values from the partner variable
replace `v' = `w' if missing(`v')
replace `v' = .u if `w' == 10 | `w' == 9
}
Notice how this solution relies on a lexical relationship among the variable names: it assumes the old variable "V23" is associated with the new variable "V232". Variable names in Stata are case-sensitive, so the names in the loop must match the dataset exactly. You can make a list of such associations and use it, but this is inconvenient. It's probably easier to rename the variables, if necessary, to conform to such a convention, then run the replacement script, and then restore the desired names.
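The pairing-by-name logic can also be sketched outside Stata. The following Python illustration is an assumption-laden model rather than Stata code: None stands in for a missing value, the string ".u" stands in for Stata's extended missing value .u, and the names merge_pair and MISSING_U are hypothetical.

```python
MISSING_U = ".u"  # stand-in for Stata's extended missing value .u


def merge_pair(old, new):
    """Fill gaps in `old` from `new`; codes 9 and 10 in `new` become MISSING_U."""
    result = []
    for o, n in zip(old, new):
        if n in (9, 10):
            result.append(MISSING_U)  # recode, whatever `old` held
        elif o is None:
            result.append(n)          # transfer the value from the partner
        else:
            result.append(o)          # keep the existing value
    return result


# Made-up example data; each variable is paired with the same name plus "2".
data = {"V23": [1, None, None, 4], "V232": [None, 2, 9, 4]}
for name in ["V23"]:
    data[name] = merge_pair(data[name], data[name + "2"])
print(data["V23"])  # prints [1, 2, '.u', 4]
```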
If you're unfamiliar with this kind of automation, read the help pages for macro and foreach.