Defining extern variable in main() vs. globally - c++

Given the following header file, if 'a' is defined inside the main body, I get a warning "unused variable 'a'" and a linker error "undefined reference to 'a'".
header.h:
#ifndef HEADER_H
#define HEADER_H
#include <iostream>
extern int* a;
void f()
{
std::cout<<*a <<std::endl;
return;
}
#endif
main.cpp:
#include "header.h"
int main()
{
int* a = new int(10);
f();
}
However, if 'a' is defined outside of main(), the program links with no errors and f() works as expected (prints 10). Why is this?
Example:
int* a = new int(10);
int main()
{
f();
}

int* a = new int(10);
If this line is inside the main function, it defines a local variable.
The extern int* a; in the header only declares a variable; it does not define one. Nothing defines the global a, so you get a link error on that symbol.

You need to learn about name binding, which determines how two
declarations of the same name are related. When you define a variable
within a function, its name has no linkage; i.e. the entity it refers to
is distinct from any other entity in the program.
More generally: a declaration (and in this sense, a definition is also a
declaration) associates a symbol with an entity—an object (in
which case, the declaration declares a variable), a function, a
reference, a type or anything else you can declare in C++. Whether
different declarations of the same name associate with the same entity
or not is defined by their linkage. C++ recognizes three
different types of linkage:
external linkage, in which the entity can be referred to by declarations in other translation units,
internal linkage, in which the entity can be referred to by other declarations in the same translation unit,
and no linkage, in which the entity cannot be referred to by any other declaration.
Variables declared at block scope (i.e. local variables) have no
linkage, unless they are explicitly declared extern (and a local
variable declared extern cannot be a definition). So the int* a in
main declares (and defines) an entity which is independent of any
other a in the program. Variables declared in namespace scope have
external linkage, unless they are declared static, in which case they
have internal linkage; when you define int* a at namespace scope, it
has external linkage, and so refers to the same entity you declared with
extern int* a in the header.

When you define the variable inside main, it only has scope inside the main function.
The global extern cannot resolve to that. In other words the linker cannot match the extern declared globally to the variable definition inside the main function.

If you define a inside of main, then its scope (visibility) is restricted to main -- an extern declaration will not make it visible anywhere else.
You have to define it at namespace scope (i.e., outside any function) for it to be visible in other translation units.

It's rather annoying to read 4 answers explaining what's wrong, but none explaining the correct way to fix it. It's probably a safe guess that if the OP doesn't know about scoping, he probably also doesn't know about passing variables to a function.
The Problem
You're trying to get at the value of a variable, but the variable is in another function. How can I get at it? Well, the simple answer is, you don't WANT to get at it. You heard me right. The entire reason to use a function is reusability, if you tie your newly created function to another function then you can't use it everywhere. Remember, functions help you be lazy. And a good programmer is a lazy programmer. If you can write a function once and use it in a million places, you're doing it right. ;)
But I still really want to get at the value of that variable
Then you want to use a function parameter to pass the variable to the function.
Functions are named because you can think of them in terms of math. Put variables in, get out useful data after the function has run and done interesting things with those variables. So let's say you have a math function y = f(x), the equivalent of this would be int f(int x) { /*stuff here*/ } then you call it in your main function using int y = f(a) where a is some variable or number.
You want to avoid global variables because they don't always do what you expect (especially if you have a lot of code, it's very easy to accidentally use the same name.)
In your case you want the function to print out the contents of a specific variable, so I think perhaps you're seeking a way to use that function with any specific variable. So here's how you do that.
void f(); //hi, I'm a function prototype
void f(int a); //hi, I'm a function prototype that takes a parameter
void f(int a, int b); //hi, I'm a function prototype that takes two parameters (both ints)
void f(int a, ...); //hi, I'm a function prototype that takes an int then any number of extra parameters (this is how printf works.)
So what you really want to do is change your code to something like:
header.h:
#ifndef HEADER_H
#define HEADER_H
#include <iostream>
// extern int* a; // We don't need this
void f(int* a)
{
if (a != NULL) //always, always check to make sure a pointer isn't null (segfaults aren't fun)
std::cout<<*a <<std::endl;
//return; //Don't really need this for a function declared void.
}
#endif
main.cpp:
#include "header.h"
int main()
{
int* a = new int(10);
f(a);
return 0; //main is declared as returning an int, so you should.
}
Functions by value, pointer and reference
So, in the prototypes I gave I used int, whereas your example uses int*. The difference between the two is that the first passes the parameter by value, the other by pointer. When you pass a variable to a function, a copy of it is ALWAYS made. If you pass it an int, it makes a copy of the int; if you pass it a 4 MB structure, it makes a copy of the 4 MB structure; if you pass it a pointer to a 4 MB structure, it makes a copy of the pointer (not the entire structure). This is important for two reasons:
1. Performance: Making a copy of a 4 MB structure takes some time.
2. Ability to change contents: If you make a copy of the pointer, the original data is still in the same place and still accessible through the pointer.
What if you want 1 and not 2? Well then you can declare the pointed-to data const (a pointer to const). The prototype looks like this: int f(int const* a);
What if you want 2 and not 1? Tough cookies (there's no good reason anyway.)
Finally, you can also declare a function to take a reference and not a pointer, the big difference between a reference and a pointer is a reference will not be NULL (and you can't use pointer arithmetic on a reference.) You will want to use either pass by reference or pass by value normally. Needing to pass by pointer is something that I almost never need to do, in my experience it's more of a special case sort of thing.
Pass by reference: int f(int& a);
Pass by const reference: int f(int const& a);
So to sum up:
if you have a function that needs parameters:
  then:
    if you do not need to modify the contents:
      then:
        if the size of the variable is small:
          pass by value: int f(int a);
        else if the size of the variable is large:
          then:
            if the value of the address can be NULL:
              pass by pointer to const: int f(int const* a);
            else:
              pass by const reference: int f(int const& a);
    else if you do need to modify the contents:
      then:
        if the value of the address can be NULL:
          pass by pointer: int f(int* a);
        else:
          pass by reference: int f(int& a);
There's some more cases, but these are the main ones, see this website for more details.

When you define the variable in the function main, it is valid only in the scope of main. You can define a variable named a in every function, but these would be different variables, because each has its own scope.
Technically the variable is allocated on the stack while the function is called, therefore each instance has its own storage.

Related

redeclaring array in c++

I'm trying to port some C code over to C++ and an array declaration (or rather, a series of array declarations) is giving me problems. The code is organized like this: first, a global array is declared, like so:
static const Foo foos[100];
Then, a bunch of other arrays are declared and initialized, all of which reference certain elements of the foos array, like so:
static const Bar bar1[3] = { .... &foos[3]; .... }
Finally, the original array is re-declared and initialized. The elements of the array make references to the bunch of arrays we just declared (in other words, the structures are mutually recursive):
static const Foo foos[100] = { .... &bar1[1]; .... }
In C, this works fine. The first declaration just serves to say "hey, I'm going to need an array of 100 Foo's later on", and then the second declaration actually tells the compiler what data we'd like to populate the array with. Because the structures I'm dealing with are mutually recursive, this all works out really nicely.
However, C++ is giving me real problems with the re-declaration. I'm not really a C++ programmer, but I believe this all has something to do with C++'s rules on default initialization.
So here's my question: how can I capture the above model in C++? How can I pre-declare the type and size of an array without actually initializing the contents?
(Don't bother telling me this is bad design -- I'm actually working on a compiler that targets C, so it's irrelevant whether the design of the computer-generated C code is good or bad. I'd just like to know how to pre-declare arrays of structures in C++.)
It works in C because C has tentative definitions, which weren't carried over to C++. For example, this is perfectly legal C code, but illegal C++ code:
int a;
int a;
int a;
If you want to declare an array (or any other variable) without defining it, use extern:
extern const Foo foos[100]; // declaration
const Foo foos[100] = ...; // definition
You cannot combine extern with static, but you can put stuff into an anonymous namespace, which more or less has the same effect. Note that top-level variables always have static storage duration; the static modifier on a global variable instead means "limit the visibility of this variable to the current translation unit".
Instead of making it static, change the first declaration of foos to extern.
extern const Foo foos[];
static const Bar bar[] = { ..., &foos[13], ... };
const Foo foos[] = { ..., &bar[1], ... };

How does extern work?

extern is a storage class in C. How exactly does it work? The output of the code given below is 20. How is this the output?
#include <stdio.h>
int main()
{
extern int a;
printf("%d", a);
return 0;
}
int a=20;
It means three things:
The variable has external linkage, and is accessible from anywhere in the program;
It has static storage duration, so its lifetime is that of the program (more or less); and
The declaration is just a declaration, not a definition. The variable must also be defined somewhere (either without the extern, or with an initialiser, or in your case, both).
Specifically, your extern int a; declares that the variable exists, but doesn't define it at that point. At this point, you can use it, and the linker will make sure your use refers to the definition. Then you have the required definition, int a=20; at the end, so all is well.
extern in this case indicates that the symbol a is defined in a different location, such as a different module. So the linker looks for a symbol with the same name in all of the modules that are linked, and if one exists, it resolves the a declared in main() to the address of the externally defined variable. Since you have another a defined outside of your main() function, the a inside your main() function is (basically) the same variable as the one outside.
Since the global a is initialized before the main function executes, the value is 20 by the time you access it.
extern means you are declaring a variable without defining it, just as you implement a function in a source file and declare its prototype in a header to allow other source files to use it.
If you put a global variable in a source file, and use a header to declare it with the extern keyword, each source file including the header will see the variable.
The linker will do the job of tying everything together, just as it does with functions.
extern as a storage class specifier tells the compiler that the object being declared is not a new object, but has storage elsewhere, i.e., is defined elsewhere. You can try this experiment with your code to see how it works. Leave out the keyword extern in your declaration of int a in main(). Then your printf() will print some garbage value, as it would be a new definition of an int with the same identifier, which would hide the global a declared elsewhere.
You use extern to tell the compiler that the variable is defined elsewhere. Without extern, the compiler would define another variable a in your main() function (in addition to the one at global scope), and it would be printed uninitialized.

"extern" inside a function?

Well, reading "a bit old" book ("The C programming language", second edition, by Dennis Ritchie), I came across the following:
An external variable must be defined, exactly once, outside of any function; this sets aside storage for it. The variable must also be declared in each function that wants to access it
and I was like - what?!
"The variable must also be declared in each function that wants to access it". Then, I was shocked one more time:
int max;
/* ... */
int main()
{
extern int max;
/* ... */
}
And one more - what?!
As far as I know (obviously, it's not much and far from enough), extern makes sense only when you define a global variable somewhere and you want to access it through another file (not to define it again).
So:
What's the point of this extern int max inside the main or any other function?
Does the standard really say that this is a must (that I need to declare, for this example, this max in each function that will use it)?
Is this the same for C++ (that's why I placed the C++ tag)? This is the first time I see something like this.
Note: this is not the same as What is the use of declaring a static variable as extern inside a function?
Your post surprised me. I had no recollection of that and I've read K&R long ago. I only have the first edition here and it is there too. However, that is not all it says. From the first edition:
The variable must also be declared in each function that wants to
access it; this may be done either by an explicit extern declaration
or implicitly by context.
Note the "implicitly by context." Later in the text:
...if the external definition of a variable occurs in the source file
before its use in a particular function, then there is no need for an
extern declaration in the function. The extern declarations in main,
... are thus redundant. In fact, common practice is to place
definitions of all external variables at the beginning of the source
file, and then omit all extern declarations.
So this is saying that making the extern variable visible can be done inside the function for just that function, or can be done outside any function for all functions following it in the source file. I believe that this is the only place in the book where it is done inside the function, later it uses the familiar once for the file approach.
extern int max inside main or function is saying to the compiler "I am not a local variable inside the main or function, I am the global variable defined elsewhere".
If the global is defined earlier in the same file, this is not useful. If it is defined in a different file, yes, but you don't need it in each function; just declare it once in the header file included by the sources that use this global variable. This is the same in C++.
The extern is linkage. It means this name, max, is linked to other occurrences of the name, possibly in other files. (That is, when the object modules are linked together to make an executable, all the linked references to this name will be made to refer to the same object.)
The scope of this declaration is the remainder of the function body it is in. That means other functions in this file do not see the name declared by this declaration (unless they declare it themselves).
Scope and linkage are different things.
Another possible reason for extern inside a function is to make sure that no local variable is shadowing the global variable
Without extern:
int max = 33;
int main()
{
int max;
printf("%d", max); // prints indeterminate value (garbage)
}
With extern
int main()
{
extern int max;
int max;
printf("%d", max);
}
Output:
error: redeclaration of ‘max’ with no linkage

using compile time constant throws error

In the program below I have used static const int init, but it gives this error:
/tmp/ccEkWmkT.o(.text+0x15d): In function `check::operation()':
: undefined reference to `check::init'
This error only occurs when init is used with the vector. Can someone please help? What is the exact behaviour?
#include<vector>
#include<iostream>
using namespace std;
class check{
static const int init=1;
public:
check(){}
void operation();
};
void check::operation(){
vector<int> dummy;
dummy.push_back(init);
}
int main(){
check ck;
ck.operation();
}
"what is the exact behaviour?"
The problem is that push_back takes a reference parameter. You can use the value of a static const int member variable without providing a separate definition of the object, but you can't use a reference to the object itself (since it doesn't exist). The meaning of "using" the member itself is defined in the section of the standard on the One Definition Rule, 3.2/2.
One fix is to provide a definition in exactly one translation unit:
const int check::init;
If you do this, you can also choose to move the = 1 initialization from the declaration (inside the class) to the definition (outside the class).
Another fix is to create a temporary from the member variable (this only uses the value, it doesn't care where the object is located and hence doesn't care whether it exists), then pass a reference to the temporary:
dummy.push_back(int(init));
Of course there's a potential maintenance issue there, that if the types of init and dummy both change to, say, long long[*], and the value changes from 1 to something bigger than INT_MAX, then you're in trouble. For that reason you could use +init, since the unary + operator also creates a temporary for its result. Readers and future maintainers might be a bit puzzled by it, though.
[*] Supposing your implementation has long long.
You have to provide the definition of the static member outside the class (in a .cpp file) as:
//check.h (same as before)
class check
{
static const int init=1; //declaration and in-class initialization
public:
check(){}
void operation();
};
Then in check.cpp file, do this:
//check.cpp
#include "check.h"
const int check::init; //definition
If you pass it by reference it is "used" and you may have to define in the .cpp file so that it gets an address.
If you just use the value of the constant, you might get away with not defining it.

Query on Static member variables of a class in C++

Sorry if this question seems trivial to many here.
In a C++ code there is something as below:
class Foo
{
public:
static int bands;
...
...
private:
...
...
}; //class definition ends
int Foo::bands; //Note: here it's not initialized to any value!
Why is the above statement needed again when 'bands' is once declared inside the class as static?
Also can a static variable be declared as a private member variable in any class?
C++ notes a distinction between declaring and defining. bands is declared within the class, but not defined.
A non-static data member would be defined when you define an object of that type, but since a static member is not a part of any one specific object, it needs its own definition.
a) It's needed because that's the way the language is designed.
b) Static variables are initialized by their default constructor, or to zero for built-in types.
c) Yes, they can be (and usually are) private.
Take a look at this question.
It has to do with obj files, how they are used, and how memory addresses for globally scoped variables are ultimately discovered through the linking process. Object files contain the addresses of all global data and functions defined in the corresponding cpp. They lay out some memory in a relative fashion to tell the linker where in that file these global vars/funcs can be found. So for example
function doFoo can be found 0 bytes from beginning of this file
int foo::bands can be found 12 bytes from beginning of this file
etc
It's almost easier to think about if you've done straight C before. In a pure C world you would do things in a more traditional modular programming sense. Your module would be defined with a header and a cpp. The header would declare a "public" variable like below, using the extern keyword, then instantiate it in the cpp.
foo.h
extern int bands;
foo.cpp
#include "foo.h"
int bands;
foo.obj:
int bands can be found 0 bytes from the beginning of this file
The "extern" keyword states that this name is valid and its address will get resolved at link time. Everyone that included "foo.h" and wanted to use the "bands" global variable could now use it. At link time, the linker would figure out that bands existed in foo.obj. If you forgot to put "int bands" in foo.cpp, you'd get a linker error, and have to go resolve it.
In C++, using static in a class declaration is similar. You are telling the users that there exists this thing called "foo::bands" and where it will live will get resolved at link time. Later down the line, the linker sees that in foo.obj, foo::bands exists, and all references to foo::bands can be resolved.
My understanding is that you would only need to declare Foo::bands if you planned on using it prior to ever creating an instance of your class. Basically, when you declare a static in a C++ class then only one copy of that variable exists for all instances of that class. However, you can't normally access Foo::bands until an instance of the class is declared.
For example:
Pointers to Members
#include <iostream>
using namespace std;
class X {
public:
int a;
void f(int b) {
cout << "The value of b is "<< b << endl;
}
};
int main() {
// declare pointer to data member
int X::*ptiptr = &X::a;
// declare a pointer to member function
void (X::* ptfptr) (int) = &X::f;
// create an object of class type X
X xobject;
// initialize data member
xobject.*ptiptr = 10;
cout << "The value of a is " << xobject.*ptiptr << endl;
// call member function
(xobject.*ptfptr) (20);
}