Relational algebra - foreign-keys

Suppose R1(A*,B) and R2(C*,D) are two relation schemas. Let r1 and r2 be the corresponding relation instances. B is a foreign key that refers to C in R2. If the data in r1 and r2 satisfy the referential integrity constraints, then will it be correct to say ∏ B (r1) - ∏ C (r2) = ∅ (even when one of the tuples in r1 contains null for the B attribute)?

will it be correct to say ∏ B (r1) - ∏ C (r2) = ∅
In relational algebra (as typically defined), yes. But a justification depends on what definition of foreign key you are using.
even when one of the tuples in r1 contains null
There are no nulls in relational algebra (as typically defined). They are from SQL. (Primary keys are also irrelevant to relational algebra; candidate keys matter.)
The notion of an SQL foreign key is different from the algebraic notion. First, it involves nulls. Second, an SQL PRIMARY KEY or UNIQUE NOT NULL declaration declares a superkey, not necessarily a candidate key, and an SQL FOREIGN KEY declaration references a superkey, not necessarily a candidate key.
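As a rough illustration of the algebraic claim, here is a minimal Python sketch that models relation instances as sets of tuples. The sample data and the helper name project are made up for this example, and representing an SQL NULL as Python's None is only an assumption for demonstration; it does not reproduce SQL's three-valued logic.

```python
def project(relation, index):
    """Projection onto one attribute position; duplicates vanish because the result is a set."""
    return {row[index] for row in relation}

# Hypothetical instances of R1(A*, B) and R2(C*, D).
r1 = {(1, 10), (2, 20), (3, 10)}
r2 = {(10, "x"), (20, "y"), (30, "z")}

# Every B value in r1 appears as a C value in r2, so the difference is empty.
dangling = project(r1, 1) - project(r2, 0)
print(dangling)  # set()

# Assumption: a null is modelled as the ordinary Python value None.
r1_with_null = r1 | {(4, None)}
dangling_with_null = project(r1_with_null, 1) - project(r2, 0)
print(dangling_with_null)  # {None}
```

Once a null is treated as just another value, the difference is no longer empty, which is one way to see why the question only has a clean answer in the null-free algebra.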


What is the best way to check/ensure that two arrays have the same domain and distribution?

A nice feature in Chapel is that it distinguishes between the domain of an array and its distribution. What is the best way to check that two arrays have the same domain and distribution (which one often wants)?
The best I can see is to check D1==D2 and D1.dist==D2.dist, if D1 and D2 are both domains.
In particular, consider the following code :
const Dom = {1..5, 1..5};
const BDom = newBlockDom(Dom);
var x : [Dom] int;
var y : [BDom] int;
test(x,y);
proc test(a : [?Dom1] int, b : [Dom1] int) {
}
This compiles and runs just fine, which makes sense if the query syntax in the function declaration just tests for domain equality, but not for distribution equality (even though Dom1 also knows about how a is distributed). Is the only way to check for distribution equality in this case to do a.domain.dist==b.domain.dist?
To check whether two domains describe the same distributed index set in Chapel, you're correct that you'd use D1 == D2 and D1.dist == D2.dist. Domain equality in Chapel checks whether two domains describe the same index set, so is independent of the domain maps / distributions. Similarly, an equality check between two domain maps / distributions checks whether they distribute indices identically.
Note that in Chapel, both domains and distributions have a notion of identity, so if you created two distributed domains as follows:
var BDom1 = newBlockDom(Dom),
BDom2 = newBlockDom(Dom);
they would pass the above equality checks, yet be distinct domain values. In some cases, it might be reasonable to wonder whether two domain expressions refer to the identical domain instance, but I believe there is no official user-facing way to do this in Chapel today. If this is of interest, it would be worth filing a feature request on our GitHub issues page.
With respect to your code example:
const Dom = {1..5, 1..5};
const BDom = newBlockDom(Dom);
var x : [Dom] int;
var y : [BDom] int;
test(x,y);
proc test(a : [?Dom1] int, b : [Dom1] int) {
}
there is a subtlety going on here that requires some explanation. First, note that if you reverse the arguments to your test() routine, it will not compile, acting perhaps more similar to what you were expecting (TIO):
test(y,x);
The reason for this is that domains which don't have an explicit domain map are treated specially in formal array arguments. Specifically, in defining Chapel, we didn't want to have a formal argument that was declared like X here:
proc foo(X: [1..n] real) { ... }
require that the actual array argument be non-distributed / have the default domain map. In other words, we wanted the user to be able to pass in a Block- or Cyclic-distributed array indexed from 1..n so that the formal was constraining the array's index set but not its distribution. Conversely, if a formal argument's domain is defined in terms of an explicit domain map, like:
proc bar(X: [BDom] int) { ... }
(using your Block-distributed definition of BDom above), it requires the actual array argument to match that domain.
An effect of this is that in your example, since Dom1 was matched to a domain with a default domain map, b is similarly loosely constrained to have the same index set yet with any distribution. Whereas when the first actual argument is block-distributed (as in my call), Dom1 encodes that distribution and applies the constraint to b.
If your reaction to this is that it feels confusing / asymmetric, I'm inclined to agree. I believe we have discussed treating declared/named domains differently from anonymous ones in this regard (since it was the anonymity of the domain in X: [1..n] that we were focused on when adopting this rule, and its application to queried domains like Dom1 in cases like this is something of a side effect of the current implementation). Again, a GitHub issue would be completely fair game for questioning / challenging this behavior.

Enforcing inequality of lists?

For a given CSP I used a variety of viewpoints, one of which is a somewhat more exotic boolean model which uses a variable array of size NxNxN. Then I enforce inequality of various subarrays with this snippet:
(foreach(X, List1),
foreach(Y, List2),
foreach((X #\= Y), Constraints)
do true),
1 #=< sum(Constraints).
The performance of the model is bad, so I was curious to know more about what happens behind the scenes. Is this a proper way to ensure that two given lists are different? Do I understand it correctly that every constraint (X #\= Y) in the Constraints list needs to be instantiated before the sum is calculated, meaning that all the corresponding variables need to be instantiated too?
The constraint library library(ic_global) is indeed missing a constraint here; it should provide lex_ne/2, analogous to lex_lt/2. This would have the same logical and operational behaviour as the code you have written, i.e. propagate when there is only a single variable left in its argument lists:
?- B#::0..1, lex_ne([1,0,1], [1,B,1]).
B = 1
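The propagation behaviour described above can be sketched in Python. This is only an illustrative model, not how ECLiPSe implements anything: the function name propagate_lists_differ, the status strings, and the representation of each list entry as a set of possible values (a singleton set meaning a fixed value) are all assumptions made for this example.

```python
def propagate_lists_differ(xs, ys):
    """One propagation step for 'xs and ys must differ in at least one position'.

    Each entry is a set of still-possible values. Domains may be pruned in place.
    Returns 'entailed', 'failed', 'pruned' or 'suspended'.
    """
    if len(xs) != len(ys):
        return "entailed"  # lists of different length are always different
    open_positions = []
    for i, (dx, dy) in enumerate(zip(xs, ys)):
        if dx.isdisjoint(dy):
            return "entailed"  # this pair can never be equal
        if len(dx) > 1 or len(dy) > 1:
            open_positions.append(i)
    if not open_positions:
        return "failed"  # everything is fixed and pairwise equal
    if len(open_positions) == 1:
        dx, dy = xs[open_positions[0]], ys[open_positions[0]]
        # All other pairs are fixed and equal, so this pair has to differ.
        if len(dx) == 1:
            dy -= dx
            return "pruned"
        if len(dy) == 1:
            dx -= dy
            return "pruned"
    return "suspended"  # not enough information yet


# Mirrors the query above: B has domain 0..1.
xs = [{1}, {0}, {1}]
ys = [{1}, {0, 1}, {1}]
print(propagate_lists_differ(xs, ys), ys[1])  # pruned {1}
```

With two or more open positions the sketch simply waits, which matches the point that propagation only happens once a single variable is left.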
For comparison, you can try the sound difference operator ~=/2 (called dif/2 in some Prologs). This is efficiently implemented, but it doesn't know about domains and will therefore not propagate; it simply waits until both sides are instantiated and then fails or succeeds:
?- B#::0..1, [1,0,1] ~= [1,B,1].
B = B{[0, 1]}
There is 1 delayed goal.
?- B#::0..1, [1,0,1] ~= [1,B,1], B = 0.
No (0.00s cpu)
Whether this is overall faster will depend on your application.

is there a hashing function that satisfies the following

is there a hashing algorithm that satisfies the following?
let "hash_funct" be a hashing function that takes two args, and returns a hash value. so all the following will be true
Hash1 = hash_funct(arg1, arg2) <=> hash_funct(Hash1, arg1) = hash_funct(Hash1, arg2) = Hash1;
Can anyone point me to this Algorithm? or if it doesn't exist, can anyone collaborate with me to invent it?
more explanation:
imagine a set S={A,B,C,D}, and the Hashing function above.
if we can make: Hash1 = hash_funct(A,B,C,D), then we can check if an element X is in the set by checking the hash result of hash_funct(Hash1,X) == Hash1 ? belongs to the set : doesn't belong
with this property we make checking the existence of an element in a set O(1) instead of O(NlogN)
I suppose the highest common factor (HCF) fits here. Let a and b be two numbers with x as their highest common factor.
hcf(a,b) = x.
This means a = x*m and b = x*n. This clearly means that:
hcf(x,x*m) = hcf(x,x*n) = hcf(x*n,x*m) = x
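A quick way to see the identity in action is a short Python sketch using math.gcd from the standard library. The numbers are arbitrary sample values chosen for this illustration.

```python
from math import gcd

a, b = 12, 18
x = gcd(a, b)

# hcf(x, a) and hcf(x, b) both give back x, as the identity above says.
print(x, gcd(x, a), gcd(x, b))  # 6 6 6

# 24 was never one of the inputs, yet it passes the same check.
print(gcd(x, 24) == x)  # True
```

The second check suggests a limitation if this is used as a membership test: every multiple of x passes, not only the original arguments.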
What you are looking for is a cryptographic accumulator. Accumulators are currently very popular with digital coins.
From Wikipedia;
A cryptographic accumulator is a one-way membership function. It answers a query as to whether a potential candidate is a member of a set without revealing the individual members of the set.
For example, this paper:
We show how to use the RSA one-way accumulator to realize an efficient and dynamic authenticated dictionary, where untrusted directories provide cryptographically verifiable answers to membership queries on a set maintained by a trusted source
A straightforward accumulator-based scheme provides:
Query: asking for a proof of membership.
Verification: checking the validity of the answer.
Updates: insertions and deletions.
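To make the query and verification steps concrete, here is a toy Python sketch in the style of an RSA accumulator. It is a teaching model under loud assumptions: the modulus 3233 (61 * 53) and base 2 are tiny illustrative values with no security whatsoever, the function names accumulate, witness and verify are invented for this example, and set elements are assumed to be distinct primes.

```python
from math import prod

N = 3233  # toy modulus 61 * 53; real schemes use a large RSA modulus
G = 2     # toy base

def accumulate(elements):
    """Fold all elements into one value: G^(product of elements) mod N."""
    return pow(G, prod(elements), N)

def witness(elements, x):
    """Proof of membership for x: the accumulator of every other element."""
    return pow(G, prod(e for e in elements if e != x), N)

def verify(acc, x, w):
    """Check a claimed membership proof against the accumulator value."""
    return pow(w, x, N) == acc

members = [3, 5, 7, 11]  # assumption: elements are distinct primes
acc = accumulate(members)

print(verify(acc, 5, witness(members, 5)))   # True
print(verify(acc, 13, witness(members, 5)))  # False
```

Note the difference from the hash function asked for in the question: verification needs a witness in addition to the accumulator value and the candidate element.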

What's the difference between records and tuples in OCaml

Is there any difference between records and tuples that is not just a syntactic difference ?
Is there a performance difference ?
Is the implementation the same for tuples and records ?
Do you have examples of things that can be done using tuples but not with records (and conversely) ?
Modulo syntax they are almost the same. The main semantic difference is that tuples are structural types, while records are nominal types. That implies e.g. that records can be recursive while tuples cannot (at least not without the -rectypes option):
type t = {a : int; b : unit -> t} (* fine *)
type u = int * (unit -> u) (* error *)
Moreover, records can have mutable fields, tuples can't.
FWIW, in OCaml's sister language SML, tuples are records. That is, in SML (a,b,c) is just syntactic sugar for {1=a,2=b,3=c}, and records are structural types as well.
Float fields in float-only records or arrays are stored unboxed, while no such optimization applies to tuples. If you are storing a lot of floats and only floats, it is important to use records -- and you can gain by splitting a mixed float/other data structure to have an internal float-only record.
The other differences are at the type level, and were already described by Andreas -- records are generative while tuples pre-exist and have a structural semantics. If you want structural records with polymorphic accessors, you can use object types.

Regular expression equivalences

Is the following regular expression equivalence true? Why or why not?
(ab)* u (aba)* = (ab u aba)*
*=Kleene star
u=Union (Set Theory)
No, they aren't equivalent. The language on the RHS contains "abaab", but the language on the LHS doesn't. Is there any relationship among these? Yes; but I won't just give you the answer. Hint: are there any strings in the LHS that aren't in the RHS?
EDIT:
Just to expound a little for the interested reader. Languages are sets of strings. Therefore, relationships among sets are also relationships among languages. The most common set relationships are equality, subset, and superset. Given two sets A and B over universe sets U1 and U2, set A is a subset of B under universe set U1 u U2 (u stands for union) iff every element of A is also an element of B. Similarly, given two sets A and B over universe sets U1 and U2, set A is a superset of B under universe set U1 u U2 (u stands for union) iff every element of B is also an element of A (equivalently, iff B is a subset of A). Sets A and B are equal iff A is a subset of B and B is a subset of A (equivalently, iff A is a superset of B and B is a superset of A). Note that two sets A and B need not be in any of these three relationships; this happens when A contains an element not in B and, at the same time, B contains an element not in A.
To find which of the four possible relationships exist between sets A and B - equality, subset, superset, none - you usually check whether A is a subset of B and whether B is a subset of A. Call the first check B1 and the second check B2, where B1 and B2 are Boolean variables and are true iff the checks pass (i.e., A is a subset of B in the case of B1, and B is a subset of A in the case of B2). Then (B1 && B2) corresponds to equality, (B1 && !B2) corresponds to subset, (!B1 && B2) corresponds to superset and (!B1 && !B2) corresponds to no relationship.
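The four-way case split can be written down directly. Below is a minimal Python sketch for finite sets; the function name relationship and the sample sets are assumptions made for illustration, and b1 and b2 play the roles of B1 and B2 above.

```python
def relationship(a, b):
    """Classify how finite sets a and b relate, using the two subset checks."""
    b1 = a <= b  # is a a subset of b?
    b2 = b <= a  # is b a subset of a?
    if b1 and b2:
        return "equal"
    if b1:
        return "subset"
    if b2:
        return "superset"
    return "none"

print(relationship({"ab"}, {"ab"}))          # equal
print(relationship({"ab"}, {"ab", "aba"}))   # subset
print(relationship({"ab", "aba"}, {"ab"}))   # superset
print(relationship({"ab"}, {"aba"}))         # none
```

For infinite languages the two checks cannot be done by enumeration, so this only mirrors the logic of the case split.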
In the example above, I demonstrate that the two languages are not equal by demonstrating that the RHS contains an element not in the LHS. Note that this also rules out the "A is a superset of B" relationship. Two remain: B is a superset of A, or there is no relationship between them. To decide this, you must determine whether A is a subset of B; whether the elements in the LHS language are all in the RHS language. If so, the LHS is a subset; otherwise, there is no relationship.
To show that there is an element in one set and not another, the easiest and most convincing approach is a proof by counterexample: name a string in one but not the other. This is the approach I adopted. You can also make an argument that the language must contain such an element without explicitly naming it; this kind of proof can be significantly harder to get right.
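A counterexample like this is easy to double-check mechanically. The following Python sketch uses the standard re module; translating the two expressions into Python regex syntax, and the names LHS, RHS and candidate, are choices made for this example.

```python
import re

# (ab)* u (aba)*  and  (ab u aba)*  in Python regex syntax.
LHS = re.compile(r"(?:ab)*|(?:aba)*")
RHS = re.compile(r"(?:ab|aba)*")

candidate = "abaab"  # aba followed by ab

# fullmatch succeeds only if the whole string is in the language.
in_lhs = LHS.fullmatch(candidate) is not None
in_rhs = RHS.fullmatch(candidate) is not None
print(in_lhs, in_rhs)  # False True
```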
To show that every element of set A is also in set B, you will need a more generic proof technique. Proof by induction and proof by contradiction are common examples. To do an inductive proof, you assign - explicitly or implicitly - a natural number to each string in the language, and demonstrate with a simple argument that your claim (in this case, that the element is also an element of the other set) is true for the first element (the base case). Then, you assume it is true for the first n elements in your set (according to the numbering you give) and show that this implies the next element must also satisfy your claim; it then follows that all the elements do. To do a proof by contradiction, you assume the opposite of what you want to prove, and derive a contradiction. If you succeed and your only assumption was that your claim is false, then your claim must have been true all along.
No, they are not equivalent.
Similarities:
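both accept the empty string
both accept "ab" and "aba" (a single repetition of either alternative)
Differences:
In regex1 a string is made of repetitions of ab only or of aba only, whereas in regex2 the two can be mixed within one string. For example, "ababa" (ab followed by aba) is accepted by regex2 but not by regex1.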
Since we got a difference, we can say that they are not equivalent.
BUT
Since a regular expression is a representation (not a description) of a regular language, you cannot tell whether regex1 is equivalent to regex2 just by looking at the expressions. To prove it mathematically, you can convert each regular expression into an NFA (nondeterministic finite automaton) or a DFA (deterministic finite automaton) and compare the differences.
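Short of building the automata, a bounded brute-force comparison can at least surface differences. Note that this swaps in a different technique from the NFA/DFA construction: the Python sketch below enumerates every string over {a, b} up to an arbitrarily chosen length and tests it against both expressions, so it gives evidence only and proves nothing about longer strings. The names REGEX1, REGEX2, MAX_LEN and strings_up_to are assumptions made for this example.

```python
import re
from itertools import product

REGEX1 = re.compile(r"(?:ab)*|(?:aba)*")  # (ab)* u (aba)*
REGEX2 = re.compile(r"(?:ab|aba)*")       # (ab u aba)*
MAX_LEN = 8                               # arbitrary bound for the search

def strings_up_to(n, alphabet="ab"):
    """Yield every string over the alphabet of length 0..n, shortest first."""
    for length in range(n + 1):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)

only_in_1 = [s for s in strings_up_to(MAX_LEN)
             if REGEX1.fullmatch(s) and not REGEX2.fullmatch(s)]
only_in_2 = [s for s in strings_up_to(MAX_LEN)
             if REGEX2.fullmatch(s) and not REGEX1.fullmatch(s)]

print(only_in_1)      # []
print(only_in_2[:2])  # ['abaab', 'ababa']
```

Within the bound, nothing is accepted by regex1 alone, while the shortest strings accepted by regex2 alone have length 5.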