JMM: show that the happens-before relation is stronger than RA causality

New memory order modes were added in JDK 9, so I'm digging into Release/Acquire mode. It introduces a causality constraint:
If access A precedes interthread Release mode (or stronger) write W in source program order, then A precedes W in local precedence order for thread T
If interthread Acquire mode (or stronger) read R precedes access A in source program order, then R precedes A in local precedence order for Thread T.
As far as I can see, the best way to show the distinction between R/A and Volatile modes is IRIW (independent reads of independent writes):
int x = 0
int y = 0
|-----------|-----------|--------------|--------------|
| Thread 1 | Thread 2 | Thread 3 | Thread 4 |
|-----------|-----------|--------------|--------------|
| setM(x,1) | setM(y,1) | r1 = getM(x) | r3 = getM(y) |
| | | r2 = getM(y) | r4 = getM(x) |
|-----------|-----------|--------------|--------------|
If getM/setM is getAcquire/setRelease, the result (r1, r2, r3, r4) = (1, 0, 1, 0) is allowed.
If getM/setM is getVolatile/setVolatile, the result (r1, r2, r3, r4) = (1, 0, 1, 0) is forbidden.
My main goal is to show that causality (from RA) != happens-before (in terms of the JLS).
To show this, I want to prove that the result (1, 0, 1, 0) is forbidden in the volatile IRIW case using only happens-before (without using sequential consistency):
volatile int x = 0;
volatile int y = 0;
|----------|----------|----------|----------|
| Thread 1 | Thread 2 | Thread 3 | Thread 4 |
|----------|----------|----------|----------|
|a: x = 1 |b: y = 1 |c: r1 = x |e: r3 = y |
| | |d: r2 = y |f: r4 = x |
|----------|----------|----------|----------|
Result of execution: (r1, r2, r3, r4) = (1, 0, 1, 0)
Happens-before orderings:
№1: T1.W(x,1) -hb→ T3.R(x):1 -hb→ T3.R(y):0
№2: T2.W(y,1) -hb→ T4.R(y):1 -hb→ T4.R(x):0
№3: T1.W(x,1) -not hb→ T4.R(x):0
№4: T2.W(y,1) -not hb→ T3.R(y):0
Synchronization Order:
We don't know it yet; it is just some permutation of the tuple (a, b, c, d, e, f). Let's picture it as an array of 6 elements (the SO array).
Synchronizes-with is defined only between actions that are ordered by SO, so:
if A -sw→ B then A appears before B in the SO array
SO is consistent with Program Order, so:
if A -po→ B then A appears before B in the SO array
Happens-before is the transitive closure of SW and PO, so:
if A -hb→ B then A appears before B in the SO array
Due to happens-before orderings №1 and №2 we have:
a appears before c in the SO array
c appears before d in the SO array
b appears before e in the SO array
e appears before f in the SO array
This means that the last element in the SO array is f or d.
If the last element is f, then a appears before f in SO. This means a -sw→ f, which gives a -hb→ f: a contradiction with happens-before ordering №3.
(For the case when the last element is d, the reasoning is similar.)
This means that the result (1, 0, 1, 0) is not possible here.
Is this proof correct?
Is this a correct way to show that the happens-before relation from the JLS is not equivalent to the RA causality constraint, and that you can't use the term happens-before for RA chains?

Yes, your proof is correct (at least as far as I can see).
Yes, IRIW is a clear and simple example that demonstrates the difference between Release/Acquire and Volatile modes.
You can also use this in your proof: a volatile read always returns the latest volatile write to that variable in synchronization order.
Here is the quote from the JLS:
17.4.7. Well-Formed Executions
We only consider well-formed executions. An execution E = < P, A, po, so, W, V, sw, hb > is well formed if the following are true:
...
The execution obeys synchronization-order consistency.
For all volatile reads r in A, it is not the case that either so(r, W(r)) or that there exists a write w in A such that w.v = r.v and so(W(r), w) and so(w, r).
The rest is similar to your proof:
so is a total order consistent with po => either d or f must be the last action in so
in either case there will be a volatile write of 1 to the same variable before the volatile read in so => the volatile read must return 1
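The argument above can also be checked by brute force. The following is a minimal Python sketch, not part of the JLS or of any JDK tooling: it enumerates every total order of the six actions that is consistent with program order and lets each read return the latest write to its variable in that order. The names `ACTIONS`, `PROGRAM_ORDER` and `outcomes` are made up for this illustration, and treating the initial value 0 as a write that precedes everything is a modelling assumption.

```python
from itertools import permutations

# Synchronization actions of the volatile IRIW test, as (kind, variable) pairs.
ACTIONS = {
    "a": ("W", "x"),  # Thread 1: x = 1
    "b": ("W", "y"),  # Thread 2: y = 1
    "c": ("R", "x"),  # Thread 3: r1 = x
    "d": ("R", "y"),  # Thread 3: r2 = y
    "e": ("R", "y"),  # Thread 4: r3 = y
    "f": ("R", "x"),  # Thread 4: r4 = x
}
# Program order only relates the two reads inside Thread 3 and Thread 4.
PROGRAM_ORDER = [("c", "d"), ("e", "f")]


def outcomes():
    """Collect (r1, r2, r3, r4) for every total order consistent with po."""
    results = set()
    for so in permutations(ACTIONS):
        position = {name: i for i, name in enumerate(so)}
        if any(position[p] > position[q] for p, q in PROGRAM_ORDER):
            continue  # this permutation contradicts program order
        memory = {"x": 0, "y": 0}  # assumption: initial writes come first
        reads = {}
        for name in so:
            kind, var = ACTIONS[name]
            if kind == "W":
                memory[var] = 1
            else:
                # A volatile read sees the latest write in synchronization order.
                reads[name] = memory[var]
        results.add((reads["c"], reads["d"], reads["e"], reads["f"]))
    return results


RESULTS = outcomes()
print((1, 0, 1, 0) in RESULTS)
```

No admissible order produces (1, 0, 1, 0), because that result would need a before c, d before b, b before e and f before a, which together with c before d and e before f forms a cycle.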
It's also important to mention that even though the article you quoted is written by the author of the JDK 9 Memory Order Modes, it is not official documentation.
This means that the implementation of the Memory Order Modes might at any moment diverge from what the article describes.
The only real guarantees are those given in the official documentation, and AFAIK today that is just this javadoc.

How can I use an if condition in AMPL?

I am wondering whether I can use an if operator in AMPL. I have a set of variables x_{1},...,x_{n} and some constraints. Some of these constraints are only valid under certain circumstances. For example, if x_{1}+...+x_{n}=kn+1 where k is an integer, then constraint A is valid.
Is there any way to write this in AMPL?
In other words, I want to search the feasible region layer by layer. A layer is given by the dot product between a point x=(x1,...,xn) and the vector 1=(1,1,1,...,1).
So:
if <x,1> = 1 then x has to satisfy the constraint A<1,
if <x,1> = 2 then x has to satisfy the constraint B<2,
...
This is what I found on the AMPL website, but it does not work (n is the dimension of x and k is an arbitrary integer):
subject to Time {if < x,1 > =kn+1}:
s.t. S1: A<1;
I'm not clear whether your example means "constraint A requires that x_[1]+...+x_[n]=4m+1 where m is an integer", or "if x_[1]+...+x_[n]=4m+1 where m is an integer, then constraint A requires some other condition to be met".
The former is trivial to code:
var m integer;
s.t. c1: sum{i in 1..n} x_[i] = 4*m + 1;
It does require a solver with MIP capability. From your tags I assume you're using CPLEX, which should be fine.
For the latter: AMPL does have some support for logical constraints, documented here. Depending on your problem, it's also sometimes possible to code logical constraints as linear integer constraints.
For example, if the x[i] variables in your example are also integers, you can set things up like so:
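var m integer;
var r1 integer in 0..1;
var r2 integer in 0..2;
s.t. c1: r2 <= 2*r1; # i.e. r2 can only be non-zero if r1 = 1
s.t. c2: sum{i in 1..n} x_[i] = 4*m + r1 + r2; # the remainder mod 4 is r1 + r2
var remainder_is_1 binary;
s.t. c3: remainder_is_1 >= r1-r2; # forces 1 when (r1, r2) = (1, 0)
s.t. c4: remainder_is_1 <= 1-r2/2; # forces 0 when r2 > 0
s.t. c5: remainder_is_1 <= r1; # forces 0 when the remainder is 0 (r1 = r2 = 0)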
Taken together, these constraints ensure that remainder_is_1 equals 1 if and only if sum{i in 1..n} x_[i] = 4m+1 for some integer m. You can then use this variable in other constraints. This sort of trick can be handy if you only have a few logical constraints to deal with, but if you have many, it'll be more efficient to use the logical constraint options if they're available to you.
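A quick way to convince yourself that such an indicator encoding is tight is to enumerate it. The Python sketch below is only an illustration (the function name `feasible_indicator_values` is made up here): for a given integer value of the sum it lists every value of remainder_is_1 that the constraints admit. Note that it includes the upper bound remainder_is_1 <= r1; without it, nothing forces the indicator to 0 when the remainder is 0, since r1 = r2 = 0 leaves both other bounds slack.

```python
from itertools import product


def feasible_indicator_values(total):
    """Values of remainder_is_1 allowed when sum{i in 1..n} x_[i] == total."""
    feasible = set()
    for r1, r2, flag in product(range(2), range(3), range(2)):
        if r2 > 2 * r1:                 # r2 can only be non-zero if r1 = 1
            continue
        if (total - r1 - r2) % 4 != 0:  # total = 4*m + r1 + r2 for some integer m
            continue
        if flag < r1 - r2:              # remainder_is_1 >= r1 - r2
            continue
        if flag > 1 - r2 / 2:           # remainder_is_1 <= 1 - r2/2
            continue
        if flag > r1:                   # remainder_is_1 <= r1 (rules out remainder 0)
            continue
        feasible.add(flag)
    return feasible


for total in range(-4, 9):
    print(total, feasible_indicator_values(total))
```

For every integer total the set contains exactly one value, and that value is 1 precisely when total leaves remainder 1 modulo 4.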

Writing power function in Standard ML with a predefined compound function

I'm having trouble writing a power function in Standard ML. I'm trying to write a function called exp of type int -> int -> int.
The application exp b e, for non-negative e, should return b^e.
For example, exp 3 2 should return 9. exp must be implemented with the function compound provided below. exp should not directly call itself. Here is the compound function; it takes a value n, a function, and a value x. All it does is apply the function to the value x n times.
fun compound 0 f x = x
| compound n f x = compound (n-1) f (f x);
I'm having trouble figuring out how to write this function without recursion, and with the constraint of having to use a function that can only take a function of one parameter. Does anyone have any ideas on where to start?
This is what I have:
fun exp b 0 = 1
| exp b e = (compound e (fn x => x*x) b)
I know that this doesn't work, since if I put in 2^5 it will do:
2*2, 4*4, 16*16 etc.
You are extremely close. Your definition of exp compounds fn x => x*x which (as you noticed) is not what you want, because it is repeatedly squaring the input. Instead, you want to do repeated multiplication by the base. That is, fn x => b*x.
Next, you can actually remove the special case of e = 0 by relying upon the fact that compound "does the right thing" when asked to apply a function 0 times.
fun exp b e = compound e (fn x => b*x) 1
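To see the difference between the two step functions side by side, here is a small Python transliteration. It is a sketch for illustration only; `exp_squaring_attempt` is a made-up name for the questioner's version, and `compound` is written with a loop rather than recursion, which does not change what it computes.

```python
def compound(n, f, x):
    """Apply f to x exactly n times (mirrors the SML compound)."""
    for _ in range(n):
        x = f(x)
    return x


def exp(b, e):
    """b**e for non-negative e: start from 1 and multiply by b, e times."""
    return compound(e, lambda acc: b * acc, 1)


def exp_squaring_attempt(b, e):
    """The original attempt: squares repeatedly, giving b**(2**e) for e > 0."""
    return 1 if e == 0 else compound(e, lambda x: x * x, b)


print(exp(3, 2))                   # 9
print(exp(2, 5))                   # 32
print(exp_squaring_attempt(2, 5))  # 4294967296, i.e. 2**32
```

Starting the accumulator at 1 is what makes the e = 0 case fall out for free: applying the function zero times returns 1 unchanged.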
I believe you could also just do this instead:
fun exp 0 0 = 1
| exp b 0 = 1
| exp b e = (compound (e - 1) (fn x => b * x ) b);
This may not be 100% proper code. I only just read a bit of the Standard ML documentation and reworked some code for your example, but the general idea is the same in most programming languages.
fun foo (num, power) =
let
val counter = ref power (* remaining multiplications *)
val total = ref 1 (* must be a ref so it can be updated *)
in
while !counter > 0 do (
total := !total * num;
counter := !counter - 1
);
!total (* the result of the function *)
end;
To be more clear with some pseudo-code:
input x, pow
total = 1
loop from 1 to pow
total = total * x
end loop
return total
This doesn't handle negative exponents but it should get you started.
It basically is a simple algorithm of what exponents truly are: repeated multiplication.
2^4 = 1*2*2*2*2 //The 1 is implicit
2^0 = 1

Fast inner product of ternary vectors

Consider two vectors, A and B, of size n, 7 <= n <= 23. Both A and B consist of -1s, 0s and 1s only.
I need a fast algorithm which computes the inner product of A and B.
So far I've thought of storing the signs and values in separate uint32_ts using the following encoding:
sign 0, value 0 → 0
sign 0, value 1 → 1
sign 1, value 1 → -1.
The C++ implementation I've thought of looks like the following:
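#include <cstdint>

struct ternary_vector {
    // bit i of value: element i is nonzero; bit i of sign: element i is negative
    uint32_t sign, value;
};

int inner_product(const ternary_vector & a, const ternary_vector & b) {
    uint32_t psign = a.sign ^ b.sign;    // positions where the signs differ
    uint32_t pvalue = a.value & b.value; // positions where both elements are nonzero
    psign &= pvalue;                     // positions whose product is -1
    pvalue ^= psign;                     // positions whose product is +1
    return __builtin_popcount(pvalue) - __builtin_popcount(psign);
}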
This works reasonably well, but I'm not sure whether it is possible to do it better. Any comment on the matter is highly appreciated.
I like having the two uint32_ts, but I think your actual calculation is a bit wasteful.
Just a few minor points:
I'm not sure about the reference (getting a and b by const &) - this adds a level of indirection compared to putting them on the stack. When the code is this small (a couple of clocks maybe) this is significant. Try passing by value and see what you get
__builtin_popcount can be, unfortunately, very inefficient. I've used it myself, but found that even a very basic implementation I wrote was far faster than this. However - this is dependent on the platform.
Basically, if the platform has a hardware popcount implementation, __builtin_popcount uses it. If not - it uses a very inefficient replacement.
The one serious problem here is the reuse of the psign and pvalue variables for the positive and negative vectors. You are doing neither your compiler nor yourself any favors by obfuscating your code in this way.
Would it be possible for you to encode your ternary state in a std::bitset<2> and define the product in terms of and? For example, if your ternary types are:
1 = P = (1, 1)
0 = Z = (0, 0)
-1 = M = (1, 0) or (0, 1)
I believe you could define their product as:
1 * 1 = 1 => P * P = P => (1, 1) & (1, 1) = (1, 1) = P
1 * 0 = 0 => P * Z = Z => (1, 1) & (0, 0) = (0, 0) = Z
1 * -1 = -1 => P * M = M => (1, 1) & (1, 0) = (1, 0) = M
Then the inner product could start by taking the and of the bits of the elements and... I am working on how to add them together.
Edit:
My foolish suggestion did not consider that (-1)(-1) = 1, which cannot be handled by the representation I proposed. Thanks to #user92382 for bringing this up.
Depending on your architecture, you may want to optimize away the temporary bit vectors -- e.g. if your code is going to be compiled to FPGA, or laid out to an ASIC, then a sequence of logical operations will be better in terms of speed/energy/area than storing and reading/writing to two big buffers.
In this case, you can do:
int inner_product(const ternary_vector & a, const ternary_vector & b) {
return __builtin_popcount( a.value & b.value & ~(a.sign ^ b.sign))
- __builtin_popcount( a.value & b.value & (a.sign ^ b.sign));
}
This will lay out very well -- the (a.value & b.value & ... ) can enable/disable an XOR gate, whose output splits into two signed accumulators, with the first pathway NOTed before accumulation.
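The bit formula can be checked against a plain dot product. The following Python sketch is for verification only, not a performance suggestion; `encode` is a hypothetical helper that packs a list of -1/0/1 values into the (sign, value) encoding from the question, and Python integers stand in for uint32_t.

```python
from itertools import product


def encode(vec):
    """Pack a list of -1/0/1 entries into (sign, value) bit masks."""
    sign = value = 0
    for i, t in enumerate(vec):
        if t != 0:
            value |= 1 << i  # element i is nonzero
        if t < 0:
            sign |= 1 << i   # element i is negative
    return sign, value


def inner_product(a, b):
    """popcount(+1 products) - popcount(-1 products), as in the C++ version."""
    a_sign, a_value = a
    b_sign, b_value = b
    nonzero = a_value & b_value
    differ = a_sign ^ b_sign
    positive = nonzero & ~differ  # both nonzero, same sign
    negative = nonzero & differ   # both nonzero, opposite signs
    return bin(positive).count("1") - bin(negative).count("1")


# Exhaustive comparison with the naive dot product for all vectors of length 4.
vectors = list(product((-1, 0, 1), repeat=4))
ok = all(
    inner_product(encode(u), encode(v)) == sum(p * q for p, q in zip(u, v))
    for u in vectors
    for v in vectors
)
print(ok)
```

Masking with `nonzero` is what makes the unused high bits and the sign bits of zero elements irrelevant.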

How to assign binary variable in AMPL in respect to another variable

I have a problem with AMPL modelling. Can you help me define a binary variable u that is supposed to equal 0 when another variable x equals 0, and 1 when x is different from 0?
I tried to use logical expressions, but the solvers I am working with (CPLEX and MINOS) don't allow them.
My idea was:
subject to:
u || x != u && x
Take M to be a 'big' constant such that x <= M always holds, and assume x is a non-negative integer (or, if x is continuous, that x >= 1 whenever it is nonzero). You can use the two constraints:
u <= x (if x=0, then u=0)
x <= M*u (if x>0, then u=1)
with u a binary variable.
If x is continuous and can take values strictly between 0 and 1, you will have to adapt the constraints above (for example, the first constraint would be violated with x=0.3 and u=1).
The general idea is that you can (in many cases) replace those logical constraints with inequalities, using the fact that if a and b are boolean variables, then the statement "a implies b" can be written as b>=a (if a=1, then b=1).
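The effect of the two big-M inequalities can be seen by listing, for each x, the values of u that satisfy both. This is a small Python sketch under the stated assumptions (x non-negative and at most M); the bound `M = 10` and the name `feasible_u` are arbitrary choices for the illustration.

```python
M = 10  # assumed upper bound on x, chosen only for this example


def feasible_u(x, big_m=M):
    """Binary values of u satisfying u <= x and x <= big_m * u."""
    return {u for u in (0, 1) if u <= x and x <= big_m * u}


for x in range(0, M + 1):
    print(x, feasible_u(x))

# A fractional x between 0 and 1 admits no u at all under these constraints.
print(feasible_u(0.3))
```

For integer x the set always has exactly one element, so u is fully determined by x. The empty set for x = 0.3 shows why the continuous case needs a different lower-bound constraint.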

Is this a regular language? If so, what is its regular expression?

B = {1^k y | k >= 1, y in {0, 1}* and y contains at least k 1's }
Is this language regular? If so, how do you prove it, and how would you represent it with a regular expression in Python?
This is for class work, so if you could explain the reasons and processes behind your answer, it'd be much appreciated.
The language you have is equivalent to this language:
B' = {1 y | y in {0, 1}* and y contains at least one 1}
You can prove that B' is a subset of B, since the condition in B' is the same as in B, but with k set to 1.
Proving that B is a subset of B' means showing that every word in B (for any k >= 1) also belongs to B'. This is easy: take away the first 1 of the word and let y' be the rest of the string, i.e. 1^(k-1) followed by the original y. Since the original y contains at least k >= 1 ones, y' always contains at least one 1.
Therefore, we can conclude that B = B'.
So our job is simplified to ensuring that the first character is 1 and that there is at least one 1 in the rest of the string.
The regular expression (the CS notation) will be:
10*1(0 + 1)*
In the notation used by common regex engines:
10*1[01]*
The DFA:
Here q2 is a final state.
"At least" is the key to solving this question. If the word becomes "equal", then the story will be different.
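Since the question asks about Python, the claim that B is described by 10*1[01]* can be sanity-checked by brute force. The sketch below is an illustration rather than a proof: `in_B` tests membership directly from the definition of B by trying every possible k, and `mismatches` compares it with the regular expression on all binary strings up to a chosen length (both names are made up here).

```python
import re
from itertools import product

PATTERN = re.compile(r"10*1[01]*")


def in_B(word):
    """True if word = 1^k y for some k >= 1 where y contains at least k ones."""
    k = 1
    while k <= len(word) and word[:k] == "1" * k:
        if word[k:].count("1") >= k:
            return True
        k += 1
    return False


def mismatches(max_len=10):
    """Binary strings up to max_len on which in_B and the regex disagree."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            word = "".join(bits)
            if in_B(word) != bool(PATTERN.fullmatch(word)):
                bad.append(word)
    return bad


print(mismatches(10))  # an empty list means the two definitions agree
```

Using `fullmatch` matters here: `match` or `search` would accept strings that merely contain a matching prefix or substring.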