correct use of ``progress`` label - concurrency

According to the man pages,
Progress labels are used to define correctness claims. A progress label states the requirement that the labeled global state must be visited infinitely often in any infinite system execution. Any violation of this requirement can be reported by the verifier as a non-progress cycle.
and
Spin has a special mode to prove absence of non-progress cycles. It does so with the predefined LTL formula:
(<>[] np_)
which formalizes non-progress as a standard Buchi acceptance property.
But let's take a look at the very primitive promela specification
bool initialised = 0;
init{
progress:
initialised++;
assert(initialised == 1);
}
In my understanding, the assert should hold but verification should fail, because initialised++ is executed exactly once whereas the progress label claims it should be possible to execute it arbitrarily often.
However, even with the above LTL formula, this verifies just fine in ispin (see below).
How do I correctly test whether a statement can be executed arbitrarily often (e.g. for a locking scheme)?
(Spin Version 6.4.7 -- 19 August 2017)
+ Partial Order Reduction
Full statespace search for:
never claim + (:np_:)
assertion violations + (if within scope of claim)
non-progress cycles + (fairness disabled)
invalid end states - (disabled by never claim)
State-vector 28 byte, depth reached 7, errors: 0
6 states, stored (8 visited)
3 states, matched
11 transitions (= visited+matched)
0 atomic steps
hash conflicts: 0 (resolved)
Stats on memory usage (in Megabytes):
0.000 equivalent memory usage for states (stored*(State-vector + overhead))
0.293 actual memory usage for states
64.000 memory used for hash table (-w24)
0.343 memory used for DFS stack (-m10000)
64.539 total actual memory usage
unreached in init
(0 of 3 states)
pan: elapsed time 0.001 seconds
No errors found -- did you verify all claims?
UPDATE
Still not sure how to use this ...
bool initialised = 0;
init{
initialised = 1;
}
active [2] proctype reader()
{
assert(_pid >= 1);
(initialised == 1)
do
:: else ->
progress_reader:
assert(true);
od
}
active [2] proctype writer()
{
assert(_pid >= 1);
(initialised == 1)
do
:: else ->
(initialised == 0)
progress_writer:
assert(true);
od
}
And let's select testing for non-progress cycles. Then ispin runs this as
spin -a test.pml
gcc -DMEMLIM=1024 -O2 -DXUSAFE -DNP -DNOCLAIM -w -o pan pan.c
./pan -m10000 -l
Which verifies without error.
So let's instead try this with ltl properties ...
/*pid: 0 = init, 1-2 = reader, 3-4 = writer*/
ltl progress_reader1{ []<> reader[1]#progress_reader }
ltl progress_reader2{ []<> reader[2]#progress_reader }
ltl progress_writer1{ []<> writer[3]#progress_writer }
ltl progress_writer2{ []<> writer[4]#progress_writer }
bool initialised = 0;
init{
initialised = 1;
}
active [2] proctype reader()
{
assert(_pid >= 1);
(initialised == 1)
do
:: else ->
progress_reader:
assert(true);
od
}
active [2] proctype writer()
{
assert(_pid >= 1);
(initialised == 1)
do
:: else ->
(initialised == 0)
progress_writer:
assert(true);
od
}
Now, first of all,
the model contains 4 never claims: progress_writer2, progress_writer1, progress_reader2, progress_reader1
only one claim is used in a verification run
choose which one with ./pan -a -N name (defaults to -N progress_reader1)
or use e.g.: spin -search -ltl progress_reader1 test.pml
Fine, I don't care, I just want this to finally run, so let's just keep progress_writer1 and worry about how to stitch it all together later:
/*pid: 0 = init, 1-2 = reader, 3-4 = writer*/
/*ltl progress_reader1{ []<> reader[1]#progress_reader }*/
/*ltl progress_reader2{ []<> reader[2]#progress_reader }*/
ltl progress_writer1{ []<> writer[3]#progress_writer }
/*ltl progress_writer2{ []<> writer[4]#progress_writer }*/
bool initialised = 0;
init{
initialised = 1;
}
active [2] proctype reader()
{
assert(_pid >= 1);
(initialised == 1)
do
:: else ->
progress_reader:
assert(true);
od
}
active [2] proctype writer()
{
assert(_pid >= 1);
(initialised == 1)
do
:: else ->
(initialised == 0)
progress_writer:
assert(true);
od
}
ispin runs this with
spin -a test.pml
ltl progress_writer1: [] (<> ((writer[3]#progress_writer)))
gcc -DMEMLIM=1024 -O2 -DXUSAFE -DSAFETY -DNOCLAIM -w -o pan pan.c
./pan -m10000
Which does not yield an error, but instead reports
unreached in claim progress_writer1
_spin_nvr.tmp:3, state 5, "(!((writer[3]._p==progress_writer)))"
_spin_nvr.tmp:3, state 5, "(1)"
_spin_nvr.tmp:8, state 10, "(!((writer[3]._p==progress_writer)))"
_spin_nvr.tmp:10, state 13, "-end-"
(3 of 13 states)
Yeah? Splendid! I have absolutely no idea what to do about this.
How do I get this to run?

The problem with your code example is that it does not have any infinite system execution.
Progress labels are used to define correctness claims. A progress
label states the requirement that the labeled global state must be
visited infinitely often in any infinite system execution. Any
violation of this requirement can be reported by the verifier as a
non-progress cycle.
Try this example instead:
short val = 0;
init {
do
:: val == 0 ->
val = 1;
// ...
val = 0;
:: else ->
progress:
// super-important progress state
printf("progress-state\n");
assert(val != 0);
od;
};
A normal check does not find any error:
~$ spin -search test.pml
(Spin Version 6.4.3 -- 16 December 2014)
+ Partial Order Reduction
Full statespace search for:
never claim - (none specified)
assertion violations +
cycle checks - (disabled by -DSAFETY)
invalid end states +
State-vector 12 byte, depth reached 2, errors: 0
3 states, stored
1 states, matched
4 transitions (= stored+matched)
0 atomic steps
hash conflicts: 0 (resolved)
Stats on memory usage (in Megabytes):
0.000 equivalent memory usage for states (stored*(State-vector + overhead))
0.292 actual memory usage for states
128.000 memory used for hash table (-w24)
0.534 memory used for DFS stack (-m10000)
128.730 total actual memory usage
unreached in init
test.pml:12, state 5, "printf('progress-state\n')"
test.pml:13, state 6, "assert((val!=0))"
test.pml:15, state 10, "-end-"
(3 of 10 states)
pan: elapsed time 0 seconds
whereas, checking for progress yields the error:
~$ spin -search -l test.pml
pan:1: non-progress cycle (at depth 2)
pan: wrote test.pml.trail
(Spin Version 6.4.3 -- 16 December 2014)
Warning: Search not completed
+ Partial Order Reduction
Full statespace search for:
never claim + (:np_:)
assertion violations + (if within scope of claim)
non-progress cycles + (fairness disabled)
invalid end states - (disabled by never claim)
State-vector 20 byte, depth reached 7, errors: 1
4 states, stored
0 states, matched
4 transitions (= stored+matched)
0 atomic steps
hash conflicts: 0 (resolved)
Stats on memory usage (in Megabytes):
0.000 equivalent memory usage for states (stored*(State-vector + overhead))
0.292 actual memory usage for states
128.000 memory used for hash table (-w24)
0.534 memory used for DFS stack (-m10000)
128.730 total actual memory usage
pan: elapsed time 0 seconds
WARNING: make sure to write -l after the -search option; otherwise it is not passed on to the verifier.
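For intuition only, the notion of a non-progress cycle can be sketched on a hand-written state graph. The Python below is a toy illustration, not how Spin searches internally; the state names and the graphs are assumptions made up to mirror the two models discussed here (the finite init process from the question and the looping init process above).

```python
def has_non_progress_cycle(graph, init, progress):
    """Return True if a cycle made only of non-progress states is reachable.

    graph    -- dict mapping a state name to a list of successor names
    init     -- name of the initial state
    progress -- set of state names that carry a progress label
    """
    # Step 1: collect every state reachable from the initial state.
    reachable, stack = set(), [init]
    while stack:
        state = stack.pop()
        if state in reachable:
            continue
        reachable.add(state)
        stack.extend(graph.get(state, ()))

    # Step 2: look for a cycle in the sub-graph without progress states.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def dfs(state):
        colour[state] = GREY
        for nxt in graph.get(state, ()):
            if nxt in progress:
                continue  # passing a progress state breaks the cycle
            seen = colour.get(nxt, WHITE)
            if seen == GREY:
                return True  # back edge: cycle without progress
            if seen == WHITE and dfs(nxt):
                return True
        colour[state] = BLACK
        return False

    return any(
        colour.get(s, WHITE) == WHITE and dfs(s)
        for s in reachable
        if s not in progress
    )


# Hypothetical abstraction of the question's model: it simply terminates.
finite_model = {"start": ["incremented"], "incremented": ["asserted"], "asserted": []}

# Hypothetical abstraction of the looping model: the labelled state exists
# but is never entered, so the loop spins without progress.
looping_model = {"head": ["val_is_1"], "val_is_1": ["head"], "progress": ["head"]}

# A loop that does pass through its progress state on every round.
healthy_model = {"head": ["progress"], "progress": ["head"]}
```

With these graphs, only looping_model is reported as having a non-progress cycle: the finite model has no cycle at all, which is why a progress label in a terminating process cannot produce an error.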
You ask:
How do I correctly test whether a statement can be executed arbitrarily often (e.g. for a locking scheme)?
Simply write a liveness property:
ltl prop { [] <> proc[0]#label };
This checks that the process with name proc and pid 0 reaches the statement marked with label infinitely often.

Since your edit substantially changes the question, I am writing a new answer to avoid confusion. This answer addresses only the new content. Next time, consider asking a new, separate question instead.
This is one of those cases in which paying attention to the unreached in ... warning message is really important, because it affects the outcome of the verification process.
The warning message:
unreached in claim progress_writer1
_spin_nvr.tmp:3, state 5, "(!((writer[3]._p==progress_writer)))"
_spin_nvr.tmp:3, state 5, "(1)"
_spin_nvr.tmp:8, state 10, "(!((writer[3]._p==progress_writer)))"
_spin_nvr.tmp:10, state 13, "-end-"
(3 of 13 states)
relates to the content of file _spin_nvr.tmp that is created during the compilation process:
...
never progress_writer1 { /* !([] (<> ((writer[3]#progress_writer)))) */
T0_init:
do
:: (! (((writer[3]#progress_writer)))) -> goto accept_S4 // state 5
:: (1) -> goto T0_init
od;
accept_S4:
do
:: (! (((writer[3]#progress_writer)))) -> goto accept_S4 // state 10
od;
} // state 13 '-end-'
...
Roughly speaking, you can view this as the specification of a Buchi Automaton which accepts executions of your writer process with _pid equal to 3 in which it does not reach the statement with label progress_writer infinitely often, i.e. it does so only a finite number of times.
To understand this you should know that, to verify an ltl property φ, spin builds an automaton containing all paths in the original Promela model that do not satisfy φ. This is done by computing the synchronous product of the automaton modeling the original system with the automaton representing the negation of the property φ you want to verify. In your example, the negation of φ is given by the excerpt of code above taken from _spin_nvr.tmp and labeled with never progress_writer1. Then, Spin checks if there is any possible execution of this automaton:
if there is, then property φ is violated and that execution trace is a witness (i.e. a counterexample) to the violation
otherwise, property φ is verified.
The warning tells you that in the resulting synchronous product none of those states is ever reachable. Why is this the case?
Consider this:
active [2] proctype writer()
{
1: assert(_pid >= 1);
2: (initialised == 1)
3: do
4: :: else ->
5: (initialised == 0);
6: progress_writer:
7: assert(true);
8: od
}
At line 2:, you check that initialised == 1. This statement forces writer to block at line 2: until initialised is set to 1. Luckily, this is done by the init process.
At line 5:, you check that initialised == 0. This statement forces writer to block at line 5: until initialised is set to 0. However, no process ever sets initialised to 0 anywhere in the code. Therefore, the line of code labeled with progress_writer: is effectively unreachable.
See the documentation:
(1) /* always executable */
(0) /* never executable */
skip /* always executable, same as (1) */
true /* always executable, same as skip */
false /* always blocks, same as (0) */
a == b /* executable only when a equals b */
A condition statement can only be executed (passed) if it holds. [...]


Is the value of a Fortran DO loop counter variable guaranteed to be persistent after the loop ends? [duplicate]

How do DO loops work exactly?
Let's say you have the following loop:
do i=1,10
...code...
end do
write(*,*)I
Why is the printed I 11 and not 10?
But when the loop stops due to an
if(something) exit
the I is as expected (for example i=7, exiting because some other value reached its limit).
The value of i goes to 11 before the do loop determines that it must terminate. The value of 11 is the first value of i which causes the end condition of 1..10 to fail. So when the loop is done, the value of i is 11.
Put into pseudo-code form:
1) i <- 1
2) if i > 10 goto 6
3) ...code...
4) i <- i + 1
5) goto 2
6) print i
When it gets to step 6, the value of i is 11. When you put in your if statement, it becomes:
1) i <- 1
2) if i > 10 goto 7
3) ...code...
4) if i = 7 goto 7
5) i <- i + 1
6) goto 2
7) print i
So clearly i will be 7 in this case.
I want to emphasize that it is an iteration count that controls the number of times the range of the loop is executed. Please refer to Page 98-99 "Fortran 90 ISO/IEC 1539 : 1991 (E)" for more details.
The following steps are performed in sequence:
1. Loop initiation:
1.1 If loop-control is
[ , ] do-variable = scalar-numeric-expr1 , scalar-numeric-expr2 [ , scalar-numeric-expr3 ]
1.1.1 The initial parameter m1, the terminal parameter m2, and the incrementation parameter m3 are established by evaluating scalar-numeric-expr1, scalar-numeric-expr2, and scalar-numeric-expr3, respectively.
1.1.2 The do-variable becomes defined with the value of the initial parameter m1.
1.1.3 The iteration count is established and is the value of the expression
MAX(INT((m2 - m1 + m3) / m3), 0)
1.2 If loop-control is omitted, no iteration count is calculated.
1.3 At the completion of the execution of the DO statement, the execution cycle begins.
2. The execution cycle. The execution cycle of a DO construct consists of the following steps performed in sequence repeatedly until termination:
2.1 The iteration count, if any, is tested. If the iteration count is zero, the loop terminates.
2.2 If the iteration count is nonzero, the range of the loop is executed.
2.3 The iteration count, if any, is decremented by one. The DO variable, if any, is incremented by the value of the incrementation parameter m3.
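The steps quoted above can be mimicked in a few lines of Python to see where the final value comes from. This is only a sketch of the quoted rules, not Fortran itself; the function name and the exit_at parameter (standing in for an "if (something) exit" inside the loop body) are made up for illustration.

```python
def do_loop_final_value(m1, m2, m3=1, exit_at=None):
    """Return the do-variable's value after 'do i = m1, m2, m3' finishes."""
    i = m1                                       # 1.1.2: do-variable starts at m1
    count = max(int((m2 - m1 + m3) / m3), 0)     # 1.1.3: iteration count
    while count != 0:                            # 2.1: test the iteration count
        # 2.2: the loop body would run here
        if exit_at is not None and i == exit_at:
            break                                # EXIT skips step 2.3
        count -= 1                               # 2.3: decrement the count ...
        i += m3                                  # ... and increment the variable
    return i


print(do_loop_final_value(1, 10))               # 11
print(do_loop_final_value(1, 10, exit_at=7))    # 7
print(do_loop_final_value(5, 1))                # 5 (zero iterations)
```

The increment in step 2.3 always runs after the last pass through the body, which is why a loop that ends normally leaves the variable one step past the bound, while an early exit leaves it at the value it had inside the body.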

Trying to understanding a for loop that iterates through 40 bits

I recently ordered a DHT22 temperature and humidity sensor to play around with, as well as some Arduino Nanos that I am still waiting on. While reading tutorials on how to use the DHT22 (which is pretty simple) and its datasheet, I became interested in how the library iterates through the 40 bits of data, as I have never worked with bytes in code before. The library I looked at is here: https://github.com/markruys/arduino-DHT.
Datasheet for DHT22 is here https://cdn-shop.adafruit.com/datasheets/Digital+humidity+and+temperature+sensor+AM2302.pdf
This is the main block of code that loops through the bits.
This is what I think is happening; you have an 8 bit int of i that starts at -3 because it uses 3 bits to start communicating with the sensor. i < 2 * 40 keeps i below 2 but iterates through 40 times (this is a stab in the dark, i haven't seen it before).
Next is the bit I'm not quite understanding at all, the while loop, where if the pin is high - 1 and is == (i(i being 0) & 1) then the while loop will be LOW, or if i is 1 then the loop will be high. Which then flows into the if statement where if ( i >= 0 && (i & 1)), but won't i eventually always be 1? If not what is modifying i? From what I have looked at you don't want to move the bits when the signal is LOW?
I can see what the rest of the code is doing, I'm just not fully understanding it: the first if statement shifts the bits in data left on every pass through the loop, and if the signal is high for > 30 microseconds then the bit is 1 and a 1 is added to data.
// We're going to read 83 edges:
// - First a FALLING, RISING, and FALLING edge for the start bit
// - Then 40 bits: RISING and then a FALLING edge per bit
// To keep our code simple, we accept any HIGH or LOW reading if it's max 85 usecs long
uint16_t rawHumidity = 0;
uint16_t rawTemperature = 0;
uint16_t data = 0;
for ( int8_t i = -3 ; i < 2 * 40; i++ ) {
byte age;
startTime = micros();
do {
age = (unsigned long)(micros() - startTime);
if ( age > 90 ) {
error = ERROR_TIMEOUT;
return;
}
} while ( digitalRead(pin) == (i & 1) ? HIGH : LOW );
if ( i >= 0 && (i & 1) ) {
// Now we are being fed our 40 bits
data <<= 1;
// A zero max 30 usecs, a one at least 68 usecs.
if ( age > 30 ) {
data |= 1; // we got a one
}
}
switch ( i ) {
case 31:
rawHumidity = data;
break;
case 63:
rawTemperature = data;
data = 0;
break;
}
}
// Verify checksum
if ( (byte)(((byte)rawHumidity) + (rawHumidity >> 8) + ((byte)rawTemperature) + (rawTemperature >> 8)) != data ) {
error = ERROR_CHECKSUM;
return;
}
This is what I think is happening; you have an 8 bit int of i that starts at -3 because
it uses 3 bits to start communicating with the sensor. i < 2 * 40 keeps i below 2 but
iterates through 40 times (this is a stab in the dark, i haven't seen it before)
https://en.cppreference.com/w/cpp/language/operator_precedence
* (as the multiplication operator) has higher precedence than < (as less-than), so the terms are grouped such that * is resolved first.
So (i < 2 * 40) gets resolved (i < (2 * 40)). It's equivalent to (i < 80).
Next is the bit I'm not quite understanding at all, the while loop, where if the pin
is high - 1 and is == (i(i being 0) & 1) then the while loop will be LOW, or if i is
1 then the loop will be high.
do {
...
}
while ( digitalRead(pin) == (i & 1) ? HIGH : LOW );
Here, == has the higher precedence, so (digitalRead(pin) == (i & 1)) is resolved first, i.e. it is true when either digitalRead(pin) is 0 and i is even, or digitalRead(pin) is 1 and i is odd [since (i & 1) effectively tests the lowest bit].
Then the ternary subexpression is resolved, returning HIGH if true and LOW if false.
Have to run, hopefully that gets you there.
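To see the two pieces working together, here is a small Python re-enactment of the loop. It is a sketch, not the library's code: it relies on the fact that the Arduino core defines HIGH as 1 and LOW as 0, and it replaces the real timing loop with a list of already-measured HIGH durations. The helper names (loop_continues, decode, pulses_for), the 26 and 70 microsecond pulse lengths, and the sample frame values are all assumptions chosen for illustration.

```python
HIGH, LOW = 1, 0  # values used by the Arduino core


def loop_continues(pin_level, i):
    """Mirror of: (digitalRead(pin) == (i & 1)) ? HIGH : LOW."""
    return HIGH if pin_level == (i & 1) else LOW


def decode(high_times_us):
    """Rebuild humidity, temperature and checksum from 40 HIGH durations."""
    bits = iter(high_times_us)
    data = raw_humidity = raw_temperature = 0
    for i in range(-3, 2 * 40):          # i < 2 * 40 means i < 80
        if i >= 0 and (i & 1):           # odd i: a HIGH pulse just ended
            age = next(bits)
            data = (data << 1) & 0xFFFF  # data is a uint16_t in the original
            if age > 30:
                data |= 1                # long HIGH pulse means a one
        if i == 31:                      # first 16 bits collected
            raw_humidity = data
        elif i == 63:                    # next 16 bits collected
            raw_temperature = data
            data = 0                     # remaining 8 bits are the checksum
    return raw_humidity, raw_temperature, data


def pulses_for(value, width):
    """Hypothetical pulse lengths (in microseconds) for value, MSB first."""
    return [70 if (value >> k) & 1 else 26 for k in reversed(range(width))]


# Made-up frame: humidity 0x028C, temperature 0x015F, checksum 0xEE.
frame = pulses_for(0x028C, 16) + pulses_for(0x015F, 16) + pulses_for(0xEE, 8)
print(decode(frame))  # (652, 351, 238)
```

For odd i the loop keeps spinning while the pin reads HIGH, so it exits on a falling edge; for even i it spins while the pin reads LOW and exits on a rising edge. Since i changes parity on every pass, the code alternates between the two, and only the odd passes from i = 1 onwards (the HIGH pulses of the 40 data bits) are used to build data.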
// We're going to read 83 edges:
// - First a FALLING, RISING, and FALLING edge for the start bit
// - Then 40 bits: RISING and then a FALLING edge per bit
The data bits shift left when the 'while' loop breaks: that happens when
the conditional's ternary operator result (HIGH or LOW) evaluates false. It's somewhat unclear exactly when that should occur since we lack definitions for HIGH and LOW.
However, since:
all-caps identifiers generally indicate that the identifier represents a macro,
HIGH and LOW having strictly constant truth value would make having the ternary expression in there at all totally pointless (if true then true else false??),
something in all this supposedly distinguishes rising-edge values from falling edges,
there's pretty much no other sensible place for that to happen (unless the pin read function does it internally and the comments discussion is just watercooler stuff)
...we should probably assume they each expand to an expression of some sort, and the result of THAT determines whether the loop should stop.
So, most likely, data <<= 1; occurs when:
digitalRead(pin) is high and *~something~*
digitalRead(pin) is low and *~something else~*
From what I can see, it would make the most sense if ~something~ and ~something else~ depend on the value of age.

Lua game development, table seems to delete itself after each iteration

What I am trying to do is write a little addon that tells me what percentage of my time in combat I have spent casting:
function()
local spell, _, _, _, _, endTime = UnitCastingInfo("player")
-- getting information from the game itself whether im "Casting"
local inCombat = UnitAffectingCombat("player")
-- getting information form the game if combat is true (1) or not (nil)
local casting = {}
local sum = 0
if inCombat == 1 then
if spell then
table.insert(casting, 1)
else
table.insert(casting, 0)
end
else
for k in pairs (casting) do
casting [k] = nil
end
end
for i=1, #casting, 1 do
sum = sum + casting[i]
end
return( sum / #casting ) end
-- creating a list which adds 1 every frame I am casting and I am in combat,
-- or adds 0 every frame I'm not casting and I'm not in combat.
-- Then I sum all the numbers and divide it by the number of inputs to figure
-- out how much % I have spent "casting".
-- In case the combat condition is false, delete the list
For some reason these numbers don't add up at all, I only see "1" when both conditions are satisfied, or 0 if the combat condition is satisfied.
There might be some better approach I'm sure, but I am kind of new to lua and programming in general.
You say you're new to Lua, so I will attempt to explain in detail what is wrong and how it can be improved, so brace yourself for a long read.
I assume that your function will be called every frame/step/tick/whateveryouwanttocallit of your game. Since you set sum = 0 and casting = {} at the beginning of the function, this will be done every time the function is called. That's why you always get 0 or 1 in the end.
Upvalues to the rescue!
Lua has this nice thing called lexical scoping. I won't go into much detail, but the basic idea is: If a variable is accessible (in scope) when a function is defined, that function remembers that variable, no matter where it is called. For example:
local foo
do
local var = 10
foo = function() return var end
end
print(var) -- nil (var is out of scope here)
print(foo()) -- 10
You can also assign a new value to the variable and the next time you call the function, it will still have that new value. For example, here's a simple counter function:
local counter
do
local count = 0
counter = function() count = count + 1; return count; end
end
print(counter()) -- 1
print(counter()) -- 2
-- etc.
Applying that to your situation, what are the values that need to persist from one call to the next?
Number of ticks spent in combat
Number of ticks spent casting
Define those two values outside of your function, and increment / read / reset them as needed; they will persist between repeated calls to your function.
Things to keep in mind:
You need to reset those counters when the player is no longer casting and/or in combat, or they will just continue where they left off.
casting doesn't need to be a table. Tables are slow compared to integers, even if you reuse them. If you need to count stuff, a number is more than enough. Just make casting = 0 when not in combat and increase it by 1 when in combat.
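The two-counter idea can be sketched as follows. The sketch is in Python rather than Lua and does not use any game API; the names make_cast_tracker and update, and the two Boolean arguments, are hypothetical stand-ins for whatever the addon reads each frame.

```python
def make_cast_tracker():
    """Create an update function that remembers its counters between calls."""
    combat_ticks = 0    # frames spent in combat so far
    casting_ticks = 0   # frames spent casting while in combat

    def update(in_combat, is_casting):
        nonlocal combat_ticks, casting_ticks
        if not in_combat:
            # Leaving combat resets both counters.
            combat_ticks = 0
            casting_ticks = 0
            return 0.0
        combat_ticks += 1
        if is_casting:
            casting_ticks += 1
        return 100.0 * casting_ticks / combat_ticks

    return update


tracker = make_cast_tracker()
print(tracker(True, True))    # 100.0
print(tracker(True, False))   # 50.0
print(tracker(False, False))  # 0.0 (counters reset)
```

The counters live in the enclosing scope, so they survive from one call to the next in the same way the Lua upvalues above do, and no table is needed.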
Thank you for feedback everyone, in the end after your suggestions and some research my code looks like this and works great:
function()
local spell, _, _, _, startTime, endTime, _, _, _ = UnitCastingInfo("player")
local inCombat = UnitAffectingCombat("player")
local inLockdown = InCombatLockdown()
local _, duration, _, _ = GetSpellCooldown("Fireball")
casting = casting or {}
local sum = 0
if inCombat == 1 or inLockdown == 1 then
if spell or duration ~= 0 then
casting[#casting+1] = 1
elseif spell == nil or duration == 0 then
casting[#casting+1] = 0
end
else
local next = next
local k = next(casting)
while k ~= nil do
casting[k] = nil
k = next(casting, k)
end
end
for i=1, #casting, 1 do
sum = sum + casting[i]
end
return(("%.1f"):format( (sum / #casting)*100 ).. "%%") end
What I noticed is that there was a problem with resetting the table in the original code:
for k in pairs (casting) do
casting [k] = nil
It seemed like either some zeros stayed there, or the table size didn't "shrink"; I don't know.
Maybe an integer would be faster than a table, but honestly I don't see any performance issues even when the table gets ridiculously big (5 min at 60 fps, that's 18k entries). Also, for the sake of learning a new language, it's better to do it the harder way, in my opinion.
Regards

Nearest Neighbor Matching in Stata

I need to program a nearest neighbor algorithm in Stata from scratch because my dataset does not allow me to use any of the available solutions (as far as I can tell).
To be precise: I have a dataset with a structure similar to the following (the original has around 14k observations).
input id value treatment match
1 0.14 0 .
2 0.32 0 .
3 0.465 1 2
4 0.878 1 2
5 0.912 1 2
6 0.001 1 1
end
I want to generate a variable called match (already included in the example above). For each observation with treatment == 1 the variable match should store the id of another observation from within treatment == 0 whose value is closest to value of the considered observation (treatment == 1).
I am new to Stata programming, so I am not yet familiar with the syntax. My first shot is the following; however, it does not produce any changes to the match variable. I am sure this is a novice question, but I am hoping for some advice on how to get the code running.
EDIT: I have changed the code slightly and now it seems to work. Do you see any problems that may arise if I run it on a bigger dataset?
set more off
clear all
input id pscore treatment
1 0.14 0
2 0.32 0
3 0.465 1
4 0.878 1
5 0.912 1
6 0.001 1
end
gen match = .
forval i = 1/`= _N' {
if treatment[`i'] == 1 {
local dist 1
forvalues j = 1/`= _N' {
if (treatment[`j'] == 0) {
local current_dist (pscore[`i'] - pscore[`j'])^2
if `dist' > `current_dist' {
local dist `current_dist' // update smallest distance
replace match = id[`j'] in `i' // write match
}
}
}
}
}
Consider some simulated data: 1,000 observations, 200 of them untreated (treat == 0) and the rest treated (treat == 1). Then the code included below will be much more efficient than the originally posted. (Ties, like in your code, are not explicitly handled.)
clear
set more off
*----- example data -----
set obs 1000
set seed 32956
gen id = _n
gen pscore = runiform()
gen treat = cond(_n <= 200, 0, 1)
*----- new method -----
timer clear
timer on 1
// get id of last non-treated and first treated
// (data is sorted by treat and ids are consecutive)
bysort treat (id): gen firsttreat = id[1]
local firstt = firsttreat[_N]
local lastnt = `firstt' - 1
// start loop
gen match = .
gen dif = .
quietly forvalues i = `firstt'/`=_N' {
// compute distances
replace dif = (pscore[`i'] - pscore)^2
summarize dif in 1/`lastnt', meanonly
// identify id of minimum-distance observation
replace match = . in 1/`lastnt'
replace match = id in 1/`lastnt' if dif == r(min)
summarize match in 1/`lastnt', meanonly
// save the minimum-distance id
replace match = r(max) in `i'
}
// clean variable and drop
replace match = . in 1/`lastnt'
drop dif firsttreat
timer off 1
tempfile first
save `first'
*----- your method -----
drop match
timer on 2
gen match = .
quietly forval i = 1/`= _N' {
if treat[`i'] == 1 {
local dist 1
forvalues j = 1/`= _N' {
if (treat[`j'] == 0) {
local current_dist (pscore[`i'] - pscore[`j'])^2
if `dist' > `current_dist' {
local dist `current_dist' // update smallest distance
replace match = id[`j'] in `i' // write match
}
}
}
}
}
timer off 2
tempfile second
save `second'
// check for equality of results
cf _all using `first'
// check times
timer list
The results in seconds to finish execution:
. timer list
1: 0.19 / 1 = 0.1930
2: 10.79 / 1 = 10.7900
The difference is huge, especially considering this data set has only 1,000 observations.
An interesting thing to notice is that as the number of non-treated cases increases relative to the number of treated, the original method improves, but never reaches the efficiency of the new method. As an example, invert the number of cases, so there are now 800 untreated and 200 treated (change the data setup to gen treat = cond(_n <= 800, 0, 1)). The result is
. timer list
1: 0.07 / 1 = 0.0720
2: 4.45 / 1 = 4.4470
You can see that the new method also improves and is still much faster. In fact, the relative difference is roughly the same (about 56 times faster before, about 62 times faster here).
Another way to do this is using joinby or cross. The problem is they temporarily expand (a lot) the size of your data base. In many cases, they are not feasible due to the hard limit Stata has on the number of possible observations (see help limits). You can find an example of joinby here: https://stackoverflow.com/a/19784222/2077064.
Edit
If there's a large number of treated relative to untreated, your code suffers because you go through the whole first loop many more times (due to the first if). Furthermore, going through that whole loop once implies going through another loop, which itself has two if conditions, _N more times.
The opposite case, in which there are few treated observations, means that you go through the whole first loop only on a small number of occasions, speeding up your code substantially.
The reason my code can maintain its efficiency is the use of in. This always offers speed gains over if. Stata will go directly to those observations with no logical checking needed. Your problem provides an opportunity for that replacement and it's wise to seize it.
If my code used if where in is in place, the results would be different. Your code would be faster for the case in which there's a large number of untreated relative to treated, and again, that is because in your code there would not be the need to go through the complete loop, requiring very little work; the first loop is short-circuited with the first if. For the opposite case, my code would still dominate.
The key is to "separate" treated from untreated and work on each group using in.
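As a language-neutral illustration of the matching rule itself, here is a short Python sketch. It is not a translation of either Stata program: it swaps in a different technique (sort the untreated values once, then binary-search for each treated value), and the function name and the tuple layout are assumptions. It assumes at least one untreated observation, and on ties it picks the untreated observation with the smaller value.

```python
from bisect import bisect_left


def nearest_untreated(rows):
    """rows: iterable of (id, value, treatment). Returns {treated id: match id}."""
    rows = list(rows)
    # Separate the untreated group once and sort it by value.
    untreated = sorted((value, id_) for id_, value, treat in rows if treat == 0)
    values = [value for value, _ in untreated]

    matches = {}
    for id_, value, treat in rows:
        if treat != 1:
            continue
        pos = bisect_left(values, value)
        # Only the neighbours just below and just above can be the closest.
        candidates = untreated[max(pos - 1, 0):pos + 1]
        best = min(candidates, key=lambda c: abs(c[0] - value))
        matches[id_] = best[1]
    return matches


# The example data from the question.
data = [
    (1, 0.14, 0),
    (2, 0.32, 0),
    (3, 0.465, 1),
    (4, 0.878, 1),
    (5, 0.912, 1),
    (6, 0.001, 1),
]
print(nearest_untreated(data))  # {3: 2, 4: 2, 5: 2, 6: 1}
```

The output reproduces the match column shown in the question's example data.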

difference between 'when' and 'if' in OpenModelica?

I'm new to OpenModelica and I've a few questions regarding the code of 'BouncingBall.mo' which is distributed with the software as example code.
1) What's the difference between 'when' and 'if'?
2) What's the purpose of the variable 'foo' in the code?
3) In line 15, "when {h <= 0.0 and v <= 0.0,impact}", shouldn't the expression for 'when' simply be "{h <= 0.0 and v <= 0.0}", since this becomes TRUE when impact occurs? What's the purpose of impact (to me it's redundant here), and what does the comma (,) before impact mean?
model BouncingBall
parameter Real e = 0.7 "coefficient of restitution";
parameter Real g = 9.81 "gravity acceleration";
Real h(start = 1) "height of ball";
Real v "velocity of ball";
Boolean flying(start = true) "true, if ball is flying";
Boolean impact;
Real v_new;
Integer foo;
equation
impact = h <= 0.0;
foo = if impact then 1 else 2;
der(v) = if flying then -g else 0;
der(h) = v;
when {h <= 0.0 and v <= 0.0,impact} then
v_new = if edge(impact) then -e * pre(v) else 0;
flying = v_new > 0;
reinit(v, v_new);
end when;
end BouncingBall;
OK, that's quite a few questions. Let me attempt to answer them:
What is the difference between when and if.
The equations inside a when clause are only "active" at the instant that the conditional expression used in the when clause becomes true. In contrast, equations inside an if statement are active as long as the conditional expression stays true.
What's the purpose of foo?
Probably for visualization. It has no clear impact on the model that I can see.
Why is impact listed in the when clause.
One of the problems you have with so-called Zeno systems like this one is that the ball will continue to bounce indefinitely, with smaller and smaller intervals between bounces. I suspect the impact flag here is meant to indicate when the system has stopped bouncing. This is normally done by checking that the conditional expression h<=0.0 actually becomes false at some point. Because event detection includes numerical tolerancing, at some point the height of the bounces never gets outside of the tolerance range, and you need to detect this or the ball never bounces again and just continues to fall. (It's hard to explain without actually running the simulation and seeing the effect.)
What does the , do in the when clause.
Consider the following: when {a, b} then. The thing is, if you want to have a when clause trigger when either a or b becomes true, you might think you'll write it as when a or b then. But that's not correct, because that will only trigger when the first one becomes true. To see this better, consider this code:
a = time>1.0;
b = time>2.0;
when {a, b} then
// Equation set 1
end when;
when a or b then
// Equation set 2
end when;
So equation set 1 will get executed twice here because it will get executed when a becomes true and then again when b becomes true. But equation set 2 will only get executed once when a becomes true. That's because the whole expression a or b only becomes true at one instant.
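The difference can be checked with a rough Python sketch that samples both conditions on a coarse time grid and records rising edges. This is only an approximation of the event semantics described above, not a Modelica simulation; the 0.5 second step and the helper name rising_edges are assumptions made for the illustration.

```python
def rising_edges(samples):
    """Indices at which a Boolean signal switches from False to True."""
    return [k for k in range(1, len(samples)) if samples[k] and not samples[k - 1]]


times = [0.5 * k for k in range(7)]   # 0.0, 0.5, ..., 3.0
a = [t > 1.0 for t in times]
b = [t > 2.0 for t in times]

# when {a, b}: fires whenever any element of the vector becomes true.
vector_when = sorted(set(rising_edges(a)) | set(rising_edges(b)))

# when a or b: fires only when the combined expression becomes true.
scalar_when = rising_edges([x or y for x, y in zip(a, b)])

set1_times = [times[k] for k in vector_when]
set2_times = [times[k] for k in scalar_when]
print(set1_times)  # [1.5, 2.5]
print(set2_times)  # [1.5]
```

On this grid the vector form fires at the first sample after each threshold is crossed, while the combined expression is already true when b switches and therefore produces no second edge.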
These are common points of confusion about when. Hopefully these explanations help.