What does the Chapel compiler error "use of [symbol] before encountering its definition, type unknown" mean? - chapel

When compiling my Chapel program, I see an error message like:
myProgram.chpl:42: error: use of 'symbol' before encountering its definition, type unknown
but I am unsure what this means or what I should do to resolve it. Can someone help me decipher it?

This error message typically occurs when the compiler encounters a circularity while resolving the module-scope variables of a Chapel program.
As a very simple example, consider the following program which defines two modules, M and N, each of which defines a variable (a and b, respectively):
module M {
use N;
var a = 42;
proc main() {
writeln("In main(), a is ", a, " and b is ", b);
}
}
module N {
use M;
var b = a;
}
The Chapel compiler will resolve this program as follows:
It starts by trying to resolve M because it is considered the program's main module due to the fact that it contains the main() procedure
It then sees that M depends on N due to its use N statement, so will defer resolving M until N is resolved
It then sees that N also depends on M due to its use M statement, but will note that it's already started to resolve M, so breaks the cycle by ignoring it and continuing to resolve N (similar to a depth-first search on a graph with cycles)
It then tries to resolve the type of module-scope variable b
It sees that b is initialized using a, so b's type will depend on a's
However, in looking up a's type, it finds that it doesn't know it yet since the resolution of M was deferred until N was resolved
This causes it to print out the error:
testit.chpl:11: error: use of 'a' before encountering its definition, type unknown
Note that while we humans can look at this code and see "Well, a is clearly an integer, so b should be an integer as well, what's the problem?", the Chapel compiler's resolution machinery takes a much more constrained approach at present. And arguably this is reasonable because Chapel's definition says that N will be initialized before M since M is the main module and it depends on N. However, that would mean that the program would try to initialize b before initializing a which seems counter to the author's intent.
That said, the error message could definitely be improved for cases like this to explain more about how the compiler got to the statement in question and help users detangle their inter-module dependencies and orderings.
Please note that having a circular use chain between modules M and N is not the inherent source of the problem here, and is an important pattern that's used frequently in Chapel. Such circularities only become problematic when the variable initializations themselves rely on values or expressions that have not yet been resolved using the module resolution/initialization order.
Some potential ways to address errors like this include:
moving the interdependent variables into a single module (either one of the existing ones, or a new one designed to break the cycle)
breaking the circular dependence between the modules themselves (though again, this is not strictly required as long as the variable definitions can be ordered properly)


static analysis checks fails to find trivial C++ issue

I encountered a surprising False Negative in our C++ Static Analysis tool.
We use Klocwork (Currently 2021.1),
and several colleagues reported finding issues KW should have found.
I got the example down to something as simple as:
int theIndex = 40;
int main()
{
int arr[10] = {0,1,2,3,4,5,6,7,8,9};
return arr[theIndex];
}
Any amateur can see I am definitely accessing out of bound array member [40] of the array [0..9].
But KW does not report that clear defect!
TBH, I used CppCheck and SonarQube too, and those failed too!
Testing a more direct flow like:
int main()
{
int theIndex = 40;
int arr[10] = {0,1,2,3,4,5,6,7,8,9};
return arr[theIndex];
}
does find the obvious issue.
My guess was that KW does not see main() as the entry point and therefore assumes theIndex might be changed before it's called.
I also tried a version that 'might work' (if there were another task that synchronizes perfectly):
int theIndex;
int foo() {
const int arr[10] = {0,1,2,3,4,5,6,7,8,9};
return arr[theIndex];
}
int main()
{
theIndex = 40;
return foo();
}
CppCheck reported this version as "bug free".
My Question is:
Am I mis-configuring the tools?
what should I do?
Should KW catch this issue or is it a limitation of SA tools?
Is there a good tool that is capable of catching such issues ?
Edit:
As @RichardCritten pointed out, SA tools assume that other compilation units can change the value of theIndex, and therefore do not flag the problem.
This holds true: declaring static int theIndex = 40 does make the tools report the issue.
Now I wonder:
KW is fed with the full build-spec,
so theoretically, the tool could trace all branching of the software and track possible values of theIndex (might be a computational limitation).
Is there a way to instruct the tool to do so?
somewhat as a 'link' stage?
"My guess was that KW does not see main() as the entrypoint, therefore assume theIndex might be changed before it's called."
theIndex can in fact be changed before main is entered. Every initializer of a global variable anywhere in the program can execute arbitrary code and access all global variables. So the tool would potentially produce a lot of false positives if it assumed that all initial values of global variables remain unchanged until main is entered.
Of course this doesn't mean that the tool couldn't decide to warn anyway, risking false positives. I don't know whether the mentioned tools are configurable to do so.
If this is intended to be a constant, mark it as constexpr. I would then expect tools to recognize the issue.
If it is not supposed to be a constant, try to get rid of it. Global variables that aren't constants cause many issues. Because they are potentially modified by any call to a function whose body isn't known (and before entry to main or a thread), they are difficult to keep track of for humans, static analyzers and optimizers alike.
Giving the variable internal linkage may simplify the analysis, because the tool may be able to prove that nothing in the given translation unit could be accessed from another translation unit to set the value of the variable. If there was anything like that, then a global initializer in another unit may still modify it before main is entered. If that is not the case and there is also no global initializer in the variable's translation unit that modifies it, then the tool can be sure that the value remains unchanged before main.
With external linkage that doesn't work, because any translation unit can gain access to the variable simply by declaring it.
Technically I suppose a sufficiently sophisticated tool could do whole-program analysis to verify whether or not the global variable is modified before main. However, this is already problematic in theory if dynamic libraries are involved and I don't think that is a typical approach taken by static analyzers. (I could be wrong on this.)

Unable to recursively multiply BigInt beyond a certain number of iterations at compile-time in D

I need to get the product of an arbitrary number of variables. The actual number of variables and their values will be known at compile-time, however I cannot hardcode these because they come from reflection done on types at compile-time, using templates.
I can get the product of these into a BigInt at runtime just fine, however if I try to do so at compile-time using templates and immutable variables, I can only get the product for a small number of variables before I get a compiler error.
Here is a condensed example that doesn't use type-traits, but suffers from the same issue:
import std.bigint; // BigInt
import std.stdio; // writeln
template Product(ulong value) {
immutable BigInt Product = value;
}
template Product(ulong value, values...) {
immutable BigInt Product = Product!value * Product!values;
}
immutable BigInt NO_PROBLEM = cast(BigInt)ulong.max * ulong.max * ulong.max;
immutable BigInt ERROR = Product!(ulong.max, ulong.max, ulong.max);
void main() {
writeln(NO_PROBLEM, " ", ERROR);
}
Trying to compile this with dmd compiler gives the error message:
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/druntime/import/core/cpuid.d(121): Error: static variable `_dataCaches` cannot be read at compile time
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/phobos/std/internal/math/biguintcore.d(200): called from here: `dataCaches()`
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/phobos/std/internal/math/biguintcore.d(1547): called from here: `getCacheLimit()`
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/phobos/std/internal/math/biguintcore.d(758): called from here: `mulInternal(result, cast(const(uint)[])y.data, cast(const(uint)[])x.data)`
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/phobos/std/bigint.d(380): called from here: `mul(this.data, y.data)`
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/phobos/std/bigint.d(380): called from here: `this.data.opAssign(mul(this.data, y.data))`
/opt/compiler-explorer/dmd2-nightly/dmd2/linux/bin64/../../src/phobos/std/bigint.d(430): called from here: `r.opOpAssign(y)`
<source>(9): called from here: `Product.opBinary(Product)`
<source>(13): Error: template instance `example.Product!(18446744073709551615LU, 18446744073709551615LU, 18446744073709551615LU)` error instantiating
ASM generation compiler returned: 1
I'm quite puzzled by this. At initial glance, it would appear that too much memory is being requested at compile-time (I would understand if there were less heap available during compile-time execution than at runtime), however I'm not sure this is actually the problem, as I can generate the result at compile-time, just not through the recursive template.
Could it be a bug in the Phobos runtime, or an undocumented limitation?
std.bigint appears to be designed to be able to produce huge values at compile-time, with lines such as this compiling and executing fine (and bloating the size of the executable!):
immutable BigInt VERY_BIG = BigInt(2) ^^ 10000000;
The error happens on the last line of this function:
https://github.com/dlang/phobos/blob/e0af01c8adf75b164b43832dd7544e297347cf6f/std/internal/math/biguintcore.d#L1824-L1844
It looks like std.bigint is currently not written to work in CTFE in this circumstance. Perhaps simply making the GC.free call conditional on __ctfe will fix the problem.
As to why it happens with 10 iterations but not 11, the function has a branch which allows performing the calculation for small numbers without dynamic memory allocation.

Structure not in memory

I created a structure like that:
struct Options {
double bindableKeys = 567;
double graphicLocation = 150;
double textures = 300;
};
Options options;
Right after this declaration, in another process, I open the process which contains the structure and search for a byte array with the struct's doubles but nothing gets found.
To obtain a result, I need to add something like std::cout << options.bindableKeys; after the declaration. Then I get a result from my pattern search.
Why is this behaving like that? Is there any fix?
Minimal reproducible example:
#include <iostream>
struct Options {
    double bindableKeys = 567;
    double graphicLocation = 150;
    double textures = 300;
};
Options options;
int main() {
    while(true) {
        double val = options.bindableKeys;
        if(val > 10)
            std::cout << "test" << std::endl;
    }
}
You can search the array with CheatEngine or another pattern finder
Contrary to popular belief, C++ source code is not a sequence of instructions provided to the executing computer. It is not a list of things that the executable will contain.
It is merely a description of a program.
Your compiler is responsible for creating an executable program, that follows the same semantics and logical narrative as you've described in your source code.
Creating an Options instance is all well and good, but if creating it does not do anything (has no side effects) and you never use any of its data, then it may as well not exist, and therefore is not a part of the logical narrative of your program.
Consequently, there is no reason for the compiler to put it into the executable program. So, it doesn't.
Some people call this "optimisation". That the instance is "optimised away". I prefer to call it common sense: the instance was never truly a part of your program.
And even if you do use the data in the instance, it may be possible for an executable program to be created that more directly uses that data. In your case, nothing changes the default values of Options' members, so there is no reason to include them in the program: the if statement can just have 567 baked into it. Then, since it's baked in, the whole condition becomes the constant expression 567 > 10, which must always be true; you'll likely find that the resulting executable program consequently contains no branching logic at all. It just starts up, then outputs "test" over and over again until you force-terminate it.
That all being said, because we live in a world governed by physical laws, and because compilers are imperfect, there is always going to be some slight leakage of this abstraction. For this reason, you can trick the compiler into thinking that the instance is "used" in a way that requires its presence to be represented more formally in the executable, even if this isn't necessary to implement the described program. This is common in benchmarking code.

Is it possible to check existence of variables?

Can we check whether a variable has been mentioned, created, or exists?
I mean something like that:
//..Some codes
int main(){
int var1;
float var2;
char var3;
cout << isExist("var1") << endl;//Or isExist(/*Something related with var1*/)
cout << isExist("var2") << endl;
cout << isExist("var3") << endl;
cout << isExist("var456") << endl;//There is no variable named with var456
return 0;
}
Output:
true
true
true
false
No. C and C++ do not support reflection.
Not in C/C++. But you could have a look at boost reflect library. http://bytemaster.bitshares.org/boost_reflect/
In C/C++, accessing a variable not defined will generate a compiler error. So, in a sense, that is inherent to how it works. You cannot do that at runtime, at least not as you are trying to do, and should not have need to - because you can't name new variables at runtime in the first place, so you should already know the variables there.
The only way to do this would be indirectly with macros. Macros can't check if a variable itself is defined, but a define could be paired with a variable definition and you could check for the define token.
#define A_VARIABLE 1
int a_variable = 60;
And later:
#ifdef A_VARIABLE
...
#endif
Like most macros, it is probably best to avoid this sort of behavior - however, I have seen it used to deal with platform-dependence of certain variables.
Dynamic memory is a different matter. Since you did not mention it, I will not go into it, but suffice to say it is a more complicated problem which proves the bane of many programmers and the source of many runtime errors.
The 'programming language C' is a human readable form of providing instructions to a computer. All names in the program have only meaning within the program text.
Upon compilation, the names are replaced with a symbolic reference to a storage location or function (execution starting point). Any symbol not found in the current compilation unit (object module) is marked for future resolution.
The object modules are combined (linked) into an executable, where all references to symbols not in an object module are resolved with locations in other object modules; otherwise the creation of the executable fails.
Since now any names have been replaced with references to storage locations and execution starting points, the executable doesn't know anymore about the names used in the program text to refer to its storage locations and functions.
Any ability to do so (the 'reflection' as user @Bill-Lynch calls it) would be 'bolted on' to the language/environment as a separate layer, for example provided by the debugging/development environment.

Internal Compiler Error on Array Value-Initialization in VC++14 (VS2015)

I'm getting an ICE on Visual Studio 2015 CTP 6. Unfortunately, this is happening in a large project, and I can't post the whole code here, and I have been unable to reproduce the problem on a minimal sample. What I'm hoping to get is help in constructing such a sample (to submit to Microsoft) or possibly illumination regarding what's happening and/or what I'm doing wrong.
This is a mock-up of what I'm doing. (Note that the code I'm presenting here does NOT generate an ICE; I'm merely using this simple example to explain the situation.)
I have a class A which is not copyable (it has a couple of "reference" members) and doesn't have a default constructor. Another class, B holds an array of As (plain C array of A values, no references/pointers) and I'm initializing this array in the constructor of B using uniform initialization syntax. See the sample code below.
struct B;
struct A
{
int & x;
B * b;
A (B * b_, int & x_) : x (x_), b (b_) {}
A (A const &) = delete;
A & operator = (A const &) = delete;
};
struct B
{
A a [3];
int foo;
B ()
: a {{this,foo},{this,foo},{nullptr,foo}} // <-- THE CULPRIT!
, foo (2)
{ // <-- This is where the compiler says the error occurs
}
};
int main ()
{
B b;
return 0;
}
I can't use std::array because I need to construct the elements in their final place (can't copy.) I can't use std::vector because I need B to contain the As.
Note that if I don't use an array and use individual variables (e.g. A a0, a1, a2;, which I can do because the array is small and fixed in size) the ICE goes away. But this is not what I want since I'll lose ability to get to them by index, which I need. I can use a union of the loose variables over the array to solve my ICE problem and get indexing (construct using the variables, access using the array,) but I think that would result in "undefined behavior" and seems convoluted.
The obvious differences between the above sample and my actual code (aside from the scale) are that A and B are classes instead of structs, each is declared/defined in its own source/header file pair, and none of the constructors is inline. (I duplicated these and still couldn't reproduce the ICE.)
For my actual project, I've tried cleaning the built files and rebuild, to no avail. Any suggestions, etc.?
P.S. I'm not sure if my title is suitable. Any suggestions on that?!?!
UPDATE 1: This is the compiler file referenced in the C1001 fatal error message: (compiler file 'f:\dd\vctools\compiler\utc\src\p2\main.c', line 230).
UPDATE 2: Since I had forgotten to mention, the codebase compiles cleanly (and correctly) under GCC 4.9.2 in C++14 mode.
Also, I'm compiling with all optimizations disabled.
UPDATE 3: I have found out that if I rearrange the member data in B and put the array at the very end, the code compiles. I've tried several other permutations and it sometimes does compile and sometimes doesn't. I can't see any patterns regarding what other members coming before the array make the compiler go full ICE! (being UDTs or primitives, having constructors or not, POD or not, reference or pointer or value type, ...)
This means that I have a workaround of sorts. Although my internal class layout is important to me and to this application, I can tolerate the performance hit (due to cache misses resulting from putting some hot data apart from the rest) in order to get past this thing.
However, I would still really like a minimal repro of the ICE to be able to submit to Microsoft. I don't want to be stuck with this for the next two years (at least!)
UPDATE 4: I have tried VS2015 RC and the ICE is still there (although the error message refers to a different internal line of code, line 247 in the same "main.c" file.)
And I have opened a bug report on Microsoft Connect.
I did report this to Microsoft, and after sharing some of my project code with them, it seems that the problem has been tracked down and fixed. They said that the fix will be included in the final VC14 release.
Thanks for the comments and pointers.