Use of maximum likelihood in ado file in Stata - stata

I am trying to understand the use of maximum likelihood in Stata (I am currently working through the third edition of the book by Gould et al.). In particular, I am focusing on the user-written command craggit. The details of the command can be found in the Stata article. Running viewsource craggit.ado shows all the code in the ado file. In the ado file [details below], I see ml being called with the lf method, but nowhere in the file do I see the maximum likelihood commands (probit and truncreg, as specified in the article). Please let me know whether I am missing something.
program craggit
    version 9.2
    if replay() {
        if ("`e(cmd)'" != "craggit") error 301
        Replay `0'
    }
    else {
        //Checking data structure
        syntax varlist [fweight pweight] [if] [in], SECond(varlist) [ ///
            Level(cilevel) CLuster(varname) HETero(varlist) * ///
        ]
        gettoken lhs1 rhs1 : varlist
        gettoken lhs2 rhs2 : second
        marksample touse
        quietly sum `lhs1' if `touse'
        local minval1 = r(min)
        quietly sum `lhs2' if `touse'
        local minval2 = r(min)
        if `minval1'<0 | `minval2'<0 {
            di "{error:A dependant variable is not truncated at 0: {help craggit} is not appropriate}"
        }
        else Estimate `0'
    }
end
program Estimate, eclass sortpreserve
    di ""
    di "{text:Estimating Cragg's tobit alternative}"
    di "{text:Assumes conditional independence}"
    syntax varlist [fweight pweight] [if] [in], SECond(varlist) [ ///
        Level(cilevel) CLuster(varname) HETero(varlist) * ///
    ]
    mlopts mlopts, `options'
    gettoken lhs1 rhs1 : varlist
    gettoken lhs2 rhs2 : second
    if "`cluster'" != "" {
        local clopt cluster(`cluster')
    }
    //mark the estimation subsample
    marksample touse
    //perform estimation using ml
    ml model lf craggit_ll ///
        (Tier1: `lhs1' = `rhs1') ///
        (Tier2: `lhs2' = `rhs2') ///
        (sigma: `hetero') ///
        [`weight'`exp'] if `touse', `clopt' `mlopts' ///
        maximize
    ereturn local cmd craggit
    Replay, `level'
end
program Replay
    syntax [, Level(cilevel) *]
    ml display, level(`level')
end

The log likelihood function is computed in the file craggit_ll.ado, so to see it you need to type viewsource craggit_ll.ado.
The logic behind storing the log-likelihood evaluator in a separate file is that all programs defined in craggit.ado, except the very first one, are local to that file, so ml would not be able to see an evaluator defined there. Stored in its own file, craggit_ll becomes a global command, and ml will be able to use it.
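To make the division of labour concrete, here is a minimal sketch of what an lf evaluator looks like. It is not the contents of craggit_ll.ado. It is the textbook normal linear regression evaluator in the style of Gould et al., and the names mynormal_lf, mu and lnsigma are assumptions chosen for illustration. The evaluator only fills in the observation-level log likelihood, so a command built this way does not need to call probit or truncreg. Presumably craggit_ll codes the likelihood from the article in the same manner.

```stata
* mynormal_lf.ado (hypothetical file name; saved on the adopath so ml can find it)
program mynormal_lf
    version 9.2
    // ml passes the name of the log-likelihood variable and one variable per equation
    args lnf mu lnsigma
    // observation-level log likelihood of a normal model; $ML_y1 is the dependent variable
    quietly replace `lnf' = ln(normalden($ML_y1, `mu', exp(`lnsigma')))
end
```

With that file in place, the model could be fitted along these lines (the auto dataset and the regressors are only an example):

```stata
sysuse auto, clear
ml model lf mynormal_lf (mu: mpg = weight displacement) /lnsigma
ml maximize
```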

Using split command to split a string

I am looking to split a string and put it into a set. The string to be split is an element of a tuple.
This string element (pitblockSet) takes values such as:
{"P499,P376,P490,P366,P129,"}
{"P388,P491,P367,"}
{"P500,P377,P479,P355,"}
and so on. Each set belongs to a path ID (the string name of the path).
The tuple was defined as:
tuple Path {
  string id;
  string source;
  string dest;
  {string} pitblockSet;
  {string} roadPoints;
  {string} dumpblockSet;
  {string} others;
  float dist;
};
The sets to be split are the element {string} pitblockSet.
I now have to split pitblockSet. I am using the following:
{Path} Pbd={};
// Not putting the code to populate Pbd as it is irrelevant here
// there are several lines here for the purpose of creating set Pbd...
{string} splitPitBlocksPath[Pathid];
{string} splitDumpBlocksPath[Pathid];
execute {
  for(var p in Pbd) {
    var splitPitBlocksPath[p.id] = p.pitblockSet.split(",") ;
    var splitDumpBlocksPath[p.id] = p.dumpblockSet.split(",") ;
  }
}
The problem is that when I execute it, I get the following error on the two assignment lines above, reported 4 times:
Scripting parser error: missing ';' or newline between statements.
I am not able to understand where I am going wrong.
===============Added after Alex's Answer =============================
Thank you for the answer again. It worked perfectly with some minor changes.
I may not have explained the issue properly above, hence the following additions. My actual code is much bigger and these are only extracts.
In my case Pbd is a set of tuples of type Path ({Path}), with Path as described above. Pbd reads about 20,000 records from Excel, and each record has tuple fields such as id, source, dest, pitblockSet, dumpblockSet, etc. These are all read from Excel and populated into Pbd; this part is working fine. The 3 lines mentioned above were just examples of pitblockSet for 3 of the 20,000 records.
p.pitblockSet is a set, but it contains only one string. The requirement is to break this string into a set. For example, if p.pitblockSet has the value {"P499,P376,P490,P366,P129,"} for p.id = "PT129", the expected result for this p.id is {"P499" "P376" "P490" "P366" "P129"}. Likewise, if for p.id = "PT1" the p.pitblockSet is {"P4,"}, the expected result is a set with only one element, {"P4"}. As mentioned earlier, there are many such records p, and the two above are just examples.
I have therefore modified the suggested code to some extent to fit the problem. However, I am still getting an issue with the split command.
{string} result[Pbd];
int MaxS=10;
execute {
  for(var p in Pbd) {
    var stringSet = Opl.item(p.pitblockSet,0);
    var split= new Array(MaxS);
    split=stringSet.split(",") ;
    for(var i=0;i<=MaxS;i++) if ((split[i]!='null') && (split[i]!='')) result[p].add(split[i]);
    writeln("result:", p.id, result[p]);
  }
}
The results look like this:
result:PT1 {"P4"}
result:PT2 {"P5"}
result:PT3 {"P6"}
result:PT4 {"P7"}
result:PT5 {"P8"}
result:PT6 {"P8" "P330" "P455" "P341"}
result:PT7 {"P326"}
result:PT8 {"P327"}
result:PT9 {"P328"}
.
.
and so on
.
.
result:PT28097 {"P500" "P377" "P479" "P355"}
result:PT28098 {"P501" "P378" "P139"}
result:PT28099 {"P501" "P388" "P491" "P367"}
result:PT28100 {"P501" "P378" "P480"}
result:PT28101 {"P501" "P378" "P139"}
result:PT28102 {"P502"}
result:PT28103 {"P503"}
Unfortunately, I'm afraid you have run into a product limitation.
See: https://www.ibm.com/support/knowledgecenter/SSSA5P_12.10.0/ilog.odms.ide.help/refjsopl/html/intro.html?view=kc#1037020
Regards,
Chris.
int MaxS=10;
{string} splitDumpBlocksPath={"P499,P376,P490,P366,P129,"} union
{"P388,P491,P367,"} union
{"P500,P377,P479,P355,"};
range Pbd=0..card(splitDumpBlocksPath)-1;
{string} result[Pbd];
execute {
  for(var p in Pbd) {
    var stringSet=Opl.item(splitDumpBlocksPath,p);
    writeln(stringSet);
    var split= new Array(MaxS);
    split=stringSet.split(",") ;
    for(var i=0;i<=MaxS;i++) if ((split[i]!='null') && (split[i]!='')) result[p].add(split[i]);
  }
  writeln(result);
}
works fine and gives
P499,P376,P490,P366,P129,
P388,P491,P367,
P500,P377,P479,P355,
[{"P499" "P376" "P490" "P366" "P129"} {"P388" "P491" "P367"} {"P500" "P377" "P479" "P355"}]

"{ required" in nested loop in stata

I hate to annoy you guys with this question, but I am getting the error "{ required" even though all my loops appear to be open (and closed) properly and unfortunately Stata doesn't tell you where the error is, so I can't figure out why this is happening. By the way if I take out the append_replace section with the if statements, I am still getting the same error, so I don't think it is from that section. Here is my code:
local vars = "any_rate resp_rate circ_rate weight_rate diabetes_rate gallstones_rate mental_rate cancer_rate std_rate died_rate"
local dates = "1947 1974"
foreach var of local `vars' {
    foreach i of local `dates' {
        forvalues j = 500(100)2500 {
            local append_replace = "append"
            if "`var'"=="any_rate" {
                if "`i'" == "1947" {
                    if `j' == 500 {
                        local append_replace = "replace"
                    }
                }
            }
            reg `var' post`i' dobdistfrom`i'change dobdistfrom`i'changesq post`i'_dist`i' post`i'_dist`i'sq if dobdistfrom`i'change < `j' & dobdistfrom`i'change > -`j', cluster(dobdistfrom`i'change)
            outreg2 using Prelim_RD_Estimates.xls, `append_replace' excel dec(3)
        }
    }
}
Thanks so much for your help!
I believe the problem is the way the local is referenced, which prevents the { from being read: foreach ... of local expects the name of the macro, not its expanded contents.
Original problematic version:
local dates = "1947 1974"
foreach i of local `dates' {
di `i'
}
Corrected version:
local dates = "1947 1974"
foreach i in `dates' {
di `i'
}
You could also just omit the macro quotes in your original construction, i.e. write "foreach i of local dates". The same applies to the outer loop over vars.
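To spell out the difference, here is a short sketch using the dates macro from the question. With the macro quotes, the original line expands to "foreach i of local 1947 1974 {", so Stata takes 1947 as the macro name and then finds 1974 where it expects the brace, hence "{ required". Both loops below display 1947 and then 1974.

```stata
local dates "1947 1974"

* "of local" takes the NAME of the macro, without macro quotes
foreach i of local dates {
    display "`i'"
}

* "in" takes a list, so here the macro is expanded with quotes
foreach i in `dates' {
    display "`i'"
}
```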

Doctrine 2 query builder vs entity persist performance

Summary: which is quicker: updating / flushing a list of entities, or running a query builder update on each?
We have the following situation in Doctrine ORM (version 2.3).
We have a table that looks like this
cow
wolf
elephant
koala
and we would like to use this table to sort a report of a fictional farm. The problem is that the user wishes to have a custom ordering of the animals (e.g. Koala, Elephant, Wolf, Cow). There are ways to add a weight in DQL using CONCAT or CASE (for example 0002wolf, 0001elephant). In my experience this is tricky to build, and when I got it working the result set was an array and not a collection.
So, to solve this we added a "weight" field to each record and, before running the select, we assign each one with a weight:
$animals = $em->getRepository('AcmeDemoBundle:Animal')->findAll();
foreach ($animals as $animal) {
    if ($animal->getName() == 'koala') {
        $animal->setWeight(1);
    } else if ($animal->getName() == 'elephant') {
        $animal->setWeight(2);
    }
    // etc
    $em->persist($animal);
}
$em->flush();
$query = $em->createQuery(
'SELECT c FROM AcmeDemoBundle:Animal c ORDER BY c.weight'
);
This works perfectly. To avoid race conditions we added this inside a transaction block:
$em->getConnection()->beginTransaction();
// code from above
$em->getConnection()->rollback();
This is a lot more robust as it handles multiple users generating the same report. Alternatively the entities can be weighted like this:
$em->getConnection()->beginTransaction();
$qb = $em->createQueryBuilder();
$q = $qb->update('AcmeDemoBundle:Animal', 'c')
    ->set('c.weight', $qb->expr()->literal(1))
    ->where('c.name = ?1')
    ->setParameter(1, 'koala')
    ->getQuery();
$p = $q->execute();
$qb = $em->createQueryBuilder();
$q = $qb->update('AcmeDemoBundle:Animal', 'c')
    ->set('c.weight', $qb->expr()->literal(2))
    ->where('c.name = ?1')
    ->setParameter(1, 'elephant')
    ->getQuery();
$p = $q->execute();
// etc
$query = $em->createQuery(
'SELECT c FROM AcmeDemoBundle:Animal c ORDER BY c.weight'
);
$em->getConnection()->rollback();
Questions:
1) which of the two examples would have better performance?
2) Is there a third or better way to do this bearing in mind we need a collection as a result?
Please remember that this is just an example - sorting the result set in memory is not an option, it must be done on the database level - the real statement is a 10 table join with 5 orderbys.
As a first step you could use Doctrine's SQL logging with a small profiler class in the logging namespace (\Doctrine\DBAL\Logging\Profiler). I know this is not a complete answer, but it at least lets you measure the execution time of each of your examples.
namespace Doctrine\DBAL\Logging;
class Profiler implements SQLLogger
{
    public $start = null;
    public function __construct()
    {
    }
    /**
     * {@inheritdoc}
     */
    public function startQuery($sql, array $params = null, array $types = null)
    {
        // remember when the query started
        $this->start = microtime(true);
    }
    /**
     * {@inheritdoc}
     */
    public function stopQuery()
    {
        // parenthesised so the subtraction happens before the concatenation (required before PHP 8)
        echo "execution time: " . (microtime(true) - $this->start);
    }
}
In your main Doctrine configuration you can enable it with:
$logger = new \Doctrine\DBAL\Logging\Profiler;
$config->setSQLLogger($logger);

How can I give labels to variables in Stata from macro

I would like to create variables with a plugin, which imports a database table.
I am using the following code to do this:
SF_macro_save("_vars", "var1 var2...");
SF_macro_save("_types", "type1 type2...");
SF_macro_save("_formats", "format1 format2...");
SF_macro_save("_obs", "obs1 obs2...");
This creates the variables well, but I don't know how to give labels to variables, or to values.
Which C++ function do I need to use to create labels? Or how can I call Stata commands from C++? (I am using Visual Studio 10, if it matters.)
I would like to call these Stata commands from the plugin:
label variable var1 label1
and
label define var1_label 1 "label1" 2 "label2"
label values var1 var1_label
Thanks
This is possible but it is not easy. Basically, you create a .do file in your code (C# example below) then execute the .do file. Here is an example that runs the .do file then puts the results in a SQL Server database using ODBC. You can do something similar with Stat/Transfer to load the data and variable labels into a database.
string m_stcmd_valuelabels = Server.MapPath("~/Data/Cmd/Stata/") + m_filename_noex + "_valuelables.do";
using (StreamWriter m_sw_stcmd_valuelabels = new StreamWriter(m_stcmd_valuelabels, false))
{
m_sw_stcmd_valuelabels.WriteLine("clear");
m_sw_stcmd_valuelabels.WriteLine("set mem 500m");
m_sw_stcmd_valuelabels.WriteLine("set more off");
m_sw_stcmd_valuelabels.WriteLine("use \"" + m_fullpath.Replace(".zip", ".dta") + "\"");
m_sw_stcmd_valuelabels.WriteLine("valtovar _all, dis");
m_sw_stcmd_valuelabels.WriteLine("uselabel");
m_sw_stcmd_valuelabels.WriteLine("ren lname varname");
m_sw_stcmd_valuelabels.WriteLine("drop trunc");
m_sw_stcmd_valuelabels.WriteLine("odbc insert, dsn(\"MyData\") table(\"" + m_filename_noex + "_valuelabels\") create " + m_statadsn_conn);
m_sw_stcmd_valuelabels.WriteLine("exit");
m_sw_stcmd_valuelabels.WriteLine();
}
string str_PathValueLabels = Server.MapPath("~/Data/Stata12/StataMP-64.exe");
ProcessStartInfo processInfoValueLabels = new ProcessStartInfo("\"" + str_PathValueLabels + "\"");
processInfoValueLabels.Arguments = " /e do \"" + m_stcmd_valuelabels + "\"";
processInfoValueLabels.UseShellExecute = false;
processInfoValueLabels.ErrorDialog = false;
Process batchProcessValueLabels = new Process();
batchProcessValueLabels.StartInfo = processInfoValueLabels;
batchProcessValueLabels.Start();
You can't do that from the plugin. You can't create variables, labels, etc. from your DLL; the dataset must be defined before you call the plugin, as you probably already know. You can store data values back into the variables, but there's no adding "columns", if you will. You can store the desired names in macros, but it will fall to your .do file to assign them to the variables in Stata, sorry.
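A minimal sketch of the do-file side, assuming the plugin has left the variable names in the local macro vars (as in the question) and, hypothetically, one single-word label per variable in a second local macro labels. The small stand-in dataset is there only so the sketch runs on its own.

```stata
* Stand-in dataset; in practice the variables already exist before the plugin is called
clear
set obs 3
generate var1 = _n
generate var2 = _n

* Assumption: the plugin has left these two local macros behind
* (the labels macro and its contents are hypothetical)
local vars   "var1 var2"
local labels "label1 label2"

* Walk both lists in parallel and attach each label to its variable
local n : word count `vars'
forvalues k = 1/`n' {
    local v   : word `k' of `vars'
    local lab : word `k' of `labels'
    label variable `v' "`lab'"
}

* Value labels, as written in the question
label define var1_label 1 "label1" 2 "label2"
label values var1 var1_label
```

Labels containing spaces would need a different way of passing them across than a space-separated list, so treat the single-word restriction as part of the sketch.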

Lua: pass context into loadstring?

I'm trying to pass context into a dynamic expression that I evaluate on every iteration of a for loop. I understand that loadstring only evaluates within a global context, meaning local variables are inaccessible. In my case I work around this limitation by converting a local into a global for the purpose of the string evaluation. Here's what I have:
require 'cosmo'
model = { { player = "Cliff", age = 35, gender = "male" }, { player = "Ally", age = 36, gender = "female" }, { player = "Jasmine", age = 13, gender = "female" }, { player = "Lauren", age = 6.5, gender = "female" } }
values = { eval = function(args)
output = ''
condition = assert(loadstring('return ' .. args.condition))
for _, it in ipairs(model) do
each = it
if condition() then
output = output .. each.player .. ' age: ' .. each.age .. ' ' .. '\n'
end
end
return output
end }
template = "$eval{ condition = 'each.age < 30' }"
result = cosmo.fill(template, values)
print (result)
My ultimate goal (other than mastering Lua) is to build out an XSLT-like templating engine where I could do something like:
apply_templates{ match = each.age > 30}[[<parent-player>$each.player</parent-player>]]
apply_templates{ match = each.age > 30}[[<child-player>$each.player</child-player>]]
...and generate different outputs. Currently I'm stuck on my hackish means above of sharing a local context through a global. Does anyone here have better insight into how I'd go about doing what I'm attempting to do?
It's worth noting that setfenv was removed from Lua 5.2 and loadstring is deprecated. 5.2 is pretty new so you won't have to worry about it for a while, but it is possible to write a load routine that works for both versions:
local function load_code(code, environment)
    if setfenv and loadstring then
        local f = assert(loadstring(code))
        setfenv(f,environment)
        return f
    else
        return assert(load(code, nil,"t",environment))
    end
end
local context = {}
context.string = string
context.table = table
-- etc. add libraries/functions that are safe for your application.
-- see: http://lua-users.org/wiki/SandBoxes
local condition = load_code("return " .. args.condition, context)
Version 5.2's load handles both the old loadstring behavior and sets the environment (context, in your example). Version 5.2 also changes the concept of environments, so loadstring may be the least of your worries. Still, it's something to consider to possibly save yourself some work down the road.
You can change the context of a function with setfenv(). This allows you to basically sandbox the loaded function into its own private environment. Something like the following should work:
local context = {}
local condition = assert(loadstring('return ' .. args.condition))
setfenv(condition, context)
for _, it in ipairs(model) do
    context['each'] = it
    if condition() then
        -- ...
    end
end
This will also prevent the condition value from being able to access any data you don't want it to, or more crucially, modifying any data you don't want it to. Note, however, that you'll need to expose any top-level bindings into the context table that you want condition to be able to access (e.g. if you want it to have access to the math package then you'll need to stick that into context). Alternatively, if you don't have any problem with condition having global access and you simply want to deal with not making your local a global, you can use a metatable on context to have it pass unknown indexes through to _G:
setmetatable(context, { __index = _G })