What's the point of creating a header file with extern declarations? - c++

So I've been looking into cleaning up my code, and found suggestions about defining global constants (char const* strings for example) in a separate cpp file, with a header declaring them extern. Then you can include the header wherever you need it, and have access to the variables from a single location.
For example
strings.hpp
extern char const* strA;
extern char const* strB;
strings.cpp
#include "strings.hpp"
char const* strA = "strA";
char const* strB = "strB";
This made sense, I thought, but after a bit it occurred to me that this is going to cause needless recompilation of large chunks of the project. If I keep all my strings together, for example, whenever I add a new string I have to modify the header. That means that every cpp file that includes that header is going to be recompiled. As a project grows in complexity, this can add up, though I'm not sure if it would be a lot of time.
It seems to me that a solution to this problem would be to keep the idea, but instead of including the header, declare and define the constants in one cpp file:
strings.cpp
extern char const* strA;
extern char const* strB;
char const* strA = "strA";
char const* strB = "strB";
Then in any file that needs the strings, I declare the required variables
A.cpp
extern char const* strA;
B.cpp
extern char const* strB;
Essentially, manually doing what the include would have done for me.
With the difference that if I later need to add a char const* strC, used in C.cpp, I add the extern declaration and definition in strings.cpp, declare it again in C.cpp, and A.cpp and B.cpp don't need to be recompiled.
The downside to this approach is code duplication. Instead of declaring the variable once and including the declaration, I have to declare every variable anywhere I want to use it. The problem with that would be that if I decide to rename the variable, I have to rename it in several places. But then again, if I rename a variable I'd have to modify all the files that use it anyway, even if the declaration was included. And I can just do a global rename anyway.
It seems like the cost in time of recompiling several source files when making changes that don't affect them can grow much larger than the cost of renaming variables you may or may not modify. And yet, I've never seen this downside to the included declaration header mentioned.
Is there anything I am missing?

If you change a string you only need to recompile that one file. It's expensive to add a string but very cheap to modify a string. The other benefit is that all your strings are in the one place. If you decide you would like to change one you don't have to go searching through many files for it or if you want to translate your program it's easier.
If you have many strings spread throughout your project perhaps group them rather than have them all in the same header. Also remember that if you don't use any of them, don't include that header.

Including the same header in multiple files ensures that the same declaration is seen by every source file that uses the identifier. This reduces errors.
If there are various subsets of the declarations that are used and not used by various source files, then the declarations can be partitioned into multiple headers, A.h, B.h, C.h, and each header will be included only in the source files that use its declarations. Then, when a declaration is changed in B.h, source files that include A.h or C.h but not B.h do not need to be recompiled.
To some extent, yes, there is some needless recompilation because sometimes a declaration changes in a header, resulting in compilation of a source file that uses the header but not the declaration that changed. This is generally regarded as a small cost for the benefit of avoiding the errors and complications caused by duplicated code.
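As a rough sketch of that partitioning, here is the strings example split into two headers. The header names ui_strings.hpp and error_strings.hpp and the function a_matches are made up for illustration, and everything is shown in one listing with comments marking which file each part would live in:

```cpp
#include <cassert>
#include <cstring>

// ---- ui_strings.hpp (hypothetical): included only by UI code ----
extern char const* strA;

// ---- error_strings.hpp (hypothetical): included only by error-handling code ----
extern char const* strB;

// ---- strings.cpp: includes both headers and defines each string once ----
char const* strA = "strA";
char const* strB = "strB";

// ---- A.cpp: includes only ui_strings.hpp, so adding a string to
//      error_strings.hpp does not force A.cpp to be rebuilt ----
bool a_matches(char const* text) {
    assert(text != 0);                       // guard against a null argument
    return std::strcmp(text, strA) == 0;
}
```

Because strings.cpp includes both headers, the compiler still checks every definition against its declaration, which is the safety the hand-copied extern lines give up.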


Understanding C linker error: multiple definition

I have a project with multiple header files and .cpp files.
All of the header files have include guards.
There is a file called Constants.h where I define some constants. Some of these with defines, some as constant variables.
There are more header-.cpp-file pairs with code in them. One of these does contain a class, the others don't.
When I include my files into my main file (an Arduino sketch), I get a lot of linker errors, claiming there are multiple definitions of some variables.
I read that this mainly occurs when you include .c or .cpp files, which I don't do. All the .cpp files only include their appropriate header files.
I did manage to find multiple solution proposals:
1) inline:
With functions, inline can be used to get rid of this problem. However, this is not possible with variables (at least before C++17, which added inline variables).
2) anonymous namespace:
This is one of the solutions I used. I put anonymous namespaces around all the problematic definitions I had. It did work, however I do not understand why this works. Could anyone help me understand it?
3) moving definitions into .cpp files:
This is another approach I used sometimes, but it wasn't always possible since I needed some of my definitions in other code, not belonging to this header file or its code (which I do admit is bad design).
Could anyone explain to me where exactly the problem lies and why these approaches work?
Some of these with defines, some as constant variables.
In C const does not imply the same thing as it does in C++. If you have this:
const int foo = 3;
In a header, then any C++ translation unit that includes the header will have a static variable named foo (the const at namespace scope implies internal linkage). Moreover, foo can even be considered a constant expression by many C++ constructs.
Such is not the case in C. There foo is an object at file scope with external linkage. So you will have multiple definitions from C translation units.
A quick fix would be to alter the definitions into something like this:
static const int foo = 3;
This is redundant in C++ but required in C.
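A small sketch of what that looks like from the C++ side. The names buffer, buffer_length and buffer_at are only for illustration, and the array line relies on foo being a constant expression, so that part is C++ only:

```cpp
#include <cassert>

// constants.h (sketch): can be included from several translation units
static const int foo = 3;     // 'static' is redundant in C++, required in C

// In C++, foo is a constant expression, so it can size an array.
static int buffer[foo];

int buffer_length() {
    return static_cast<int>(sizeof(buffer) / sizeof(buffer[0]));
}

int buffer_at(int index) {
    assert(index >= 0 && index < foo);   // foo doubles as the bound
    return buffer[index];
}
```

Every translation unit that includes such a header gets its own private foo, so the linker never sees two competing definitions.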
In addition to Story Teller's excellent explanation, to define global variables, use the following:
// module.h
#include "glo.h"
// glo.h
#ifndef EXTERN
# define EXTERN extern
#endif
EXTERN int myvar;
// main.c
#define EXTERN
#include "glo.h"
In main.c all variables will be defined (i.e. space is allocated for them); in all other .c files that include glo.h, the variables will only be declared, so they are known without being defined again.
You shouldn't define any object in header files; definitions should be moved to .c/.cpp files.
In a header you may:
declare types such as classes, structs, typedefs etc.
put forward declarations of (non-member) functions
put inline functions (or functions defined inside a class), including their bodies
add extern declarations
put your macros
A static definition in a header creates a separate copy of the object in every file that includes it, so it is not recommended.

Includes causing problems in c++. How to learn to do c++ right

So I have made a complex project and now I have too many include files causing me headaches. How can I best manage these classes? Some classes need to use other classes. I also have a .h file containing a bunch of arrays of int. These stay the same through the application but I get the problem when the compiler complains that I am redefining the array.
Should I make a class library? Namespace? DLL? What is the best practice and where can I find out how to do the right one?
Use include guards in all your headers.
file.h
#ifndef FILE_H_INCLUDED
#define FILE_H_INCLUDED
void foo();
#endif
Avoid global variables when possible. If you must use them, declare global variables using extern and place the definition in a .cpp file instead.
file.h
extern int var[20];
file.cpp
int var[20];
When possible, use forward declarations. You can use forward declarations whenever you use only a reference or a pointer to a class and don't dereference that pointer.
useful.h
class Useful {};
other.h
// Forward-declare instead of #include
class Useful;
class Other
{
Useful* helper;
};
I don't think there really is a best practice, it depends on the situation. Something I might recommend is to group like objects into a namespace then put all of the definitions in a single .h file. If the implementations are short, put them all in a single cpp file. Here at my work we have a database access layer like this. There are roughly a couple dozen objects that are populated by stored procs. The code is still a major pain in the ass but it's better than having two dozen .h and cpp files that are all less than 500 lines. If you do this comments to compartmentalize object definitions become really important. You can easily get files longer than 10,000 lines so you need something to break them up.
Of course use include guards, they'll likely solve the redefining error.
You need to know the difference between a definition and a declaration, and which uses of an entity require the entity to be declared. Then you also need to learn the 'one definition rule' (ODR), which tells you when you're not allowed to have more than one definition in the program (and therefore the definition can't go in a header) and what things can be defined more than once as long as the definitions are identical (and therefore the definition can go in a header).
For example, those arrays you're declaring; since these are globally visible arrays the program can only contain one definition, and therefore the definition can't go in a header. Every part of the program that needs to access them simply needs to know their declaration. So instead of putting a definition in a header file and violating the ODR, you should have a C++ file that contains their definition and a header that contains declarations for them.
Code like this:
int foo[100];
both declares and defines the array foo. Put code like this in a C++ file. To only declare this array you do this:
extern int foo[100];
Put code like this in a header.
Class definitions, inline functions, and templates are all things that can be defined multiple times as long as the definitions are identical. You can put these definitions into headers, whereas regular functions, and global variables may only be defined once, so you declare them in headers and then define them in implementation files.
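To make the two groups concrete, here is a minimal sketch with both parts in one listing. The file names shapes.h and shapes.cpp and all identifiers in it are invented for the example:

```cpp
#include <cassert>

// ---- shapes.h (hypothetical header) ----
#ifndef SHAPES_H_INCLUDED
#define SHAPES_H_INCLUDED

struct Point { int x; int y; };            // class definition: allowed in a header

inline int manhattan(Point p) {            // inline function: allowed in a header
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}

template <typename T>
T twice(T value) { return value + value; } // template: allowed in a header

extern int lookup[4];                      // global variable: declaration only
int lookup_at(int index);                  // regular function: declaration only

#endif // SHAPES_H_INCLUDED

// ---- shapes.cpp (hypothetical implementation file) ----
int lookup[4] = {1, 2, 3, 4};              // the one definition in the program

int lookup_at(int index) {                 // the one definition in the program
    assert(index >= 0 && index < 4);
    return lookup[index];
}
```

Everything above the #endif can be seen by any number of source files; the two definitions below it must appear in exactly one.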

Is it appropriate to set a value to a "const char *" in the header file

I have seen people using 2 methods to declare and define char *.
Method 1: The header file has the below
extern const char* COUNTRY_NAME_USA = "USA";
Method 2:
The header file has the below declaration:
extern const char* COUNTRY_NAME_USA;
The cpp file has the below definition:
extern const char* COUNTRY_NAME_USA = "USA";
Is method 1 wrong in some way?
What is the difference between the two?
I understand the difference between "const char * const var" and "const char * var". If a "const char * const var" is declared and defined in the header as in method 1, will that make sense?
The first method is indeed wrong, since it makes a definition of an object COUNTRY_NAME_USA with external linkage in the header file. Once that header file gets included into more than one translation unit, the One Definition Rule (ODR) gets violated. The code will fail to compile (more precisely, it will fail to link).
The second method is the correct one. The keyword extern is optional in the definition though, i.e. in the cpp file you can just do
const char* COUNTRY_NAME_USA = "USA";
assuming the declaration from the header file precedes this definition in this translation unit.
Also, I'd guess that since the object name is capitalized, it is probably intended to be a constant. If so, then it should be declared/defined as const char* const COUNTRY_NAME_USA (note the extra const).
Finally, taking that last detail into account, you can just define your constant as
const char* const COUNTRY_NAME_USA = "USA"; // no `extern`!
in the header file. Since it is a constant now, it has internal linkage by default, meaning that there is no ODR violation even if the header file is included into several translation units. In this case you get a separate COUNTRY_NAME_USA lvalue in each translation unit (while in the extern method you get one for the entire program). Only you know what you need in your case.
What's the point?
If you want to lookup strings (that could be localized), this would be best:
namespace CountryNames {
const char* const US = "USA";
};
Since the pointer is const, it now has internal linkage and won't cause multiple definitions. Most linkers will also combine redundant constants, so you won't waste space in the executable.
If you want to compare strings by pointer equality though, the above isn't portable because the pointers will only be equal if the linker performs the constant-folding optimization. In that case declaring an extern pointer in the header file is the way to go (and it again should be const if you don't intend to retarget it).
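If you go with the header-defined constant, a sketch of the portable way to compare is to look at the characters rather than the addresses. The helper is_us is a made-up name for illustration:

```cpp
#include <cassert>
#include <cstring>

namespace CountryNames {
const char* const US = "USA";   // const pointer: internal linkage, one copy per translation unit
}

// Portable check: compares the characters, not the addresses.
bool is_us(const char* name) {
    assert(name != 0);
    return std::strcmp(name, CountryNames::US) == 0;
}
```

This gives the same answer whether or not the linker merges the identical string literals from different translation units.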
If you must have global variables, normal practice is to declare them in a .h file and define them in one (and only one) .cpp file.
In a .h file;
extern int x;
In a .cpp file;
int x=3;
I have used int (the most fundamental basic type perhaps?) rather than const char * as in your example because the essence of your problem doesn't depend on the type of variable.
The basic idea is that you can declare a variable multiple times, so each .cpp file that includes the .h file declares the variable, and that is fine. But you only define it once. The definition is the statement where you assign the variable's initial value (with an =). You don't want definitions in .h files, because then if the .h file is included by multiple .cpp files, you'll get multiple definitions. If you have multiple definitions of one variable, there is a problem at link time because the linker wants to assign the address of the variable and cannot reasonably do that if there are multiple copies of it.
Additional information added later to try and ease Sud's confusion;
Try to reduce your problem to its minimal parts to understand it better;
Imagine you have a program that comprises three .cpp files. To build the program each .cpp is compiled separately to create three object files, then the three object files are linked together. If the three .cpp files are as follows (example A, good organization);
file1.cpp
extern int x;
file2.cpp
extern int x;
file3.cpp
extern int x;
Then the files will compile and link together without problem (at least as far as the variable x is concerned). There is no problem because each file is only declaring variable x. A declaration is simply stating that there is a variable out there somewhere that I may (or may not) use.
A better way of achieving the same thing is the following (example A, better organization);
header.h
extern int x;
file1.cpp
#include "header.h"
file2.cpp
#include "header.h"
file3.cpp
#include "header.h"
This is effectively exactly the same, for each of the three compilations the compiler sees the same text as earlier as it processes the .cpp file (or translation unit as the experts call it), because the #include directive simply pulls text from another file. Nevertheless this is an improvement on the earlier example simply because we only have our declaration in one file, not in multiple files.
Now consider another working example (example B, good organization);
file1.cpp
extern int x;
file2.cpp
extern int x;
file3.cpp
extern int x;
int x=3;
This will work fine as well. All three .cpp files declare x and one actually defines it. We could go ahead and add more code within functions in any of the three files that manipulates variable x and we wouldn't get any errors. Again we should use a header file so that the declaration only goes into one physical file (example B, better organization).
header.h
extern int x;
file1.cpp
#include "header.h"
file2.cpp
#include "header.h"
file3.cpp
#include "header.h"
int x=3;
Finally consider an example that just wouldn't work (example C, doesn't work);
file1.cpp
int x=3;
file2.cpp
int x=3;
file3.cpp
int x=3;
Each file would compile without problems. The problem occurs at link time because now we have defined three separate int x variables. They have the same name and are all globally visible. The linker's job is to pull all the objects required for a single program into one executable. Globally visible objects must have a unique name, so that the linker can put a single copy of the object at one defined address (place) in the executable and allow all the other objects to access it at that address. The linker cannot do its job with global variable x in this case and so will choke out an error instead.
As an aside, giving the different definitions different initial values doesn't address the problem. Preceding each definition with the keyword static does address the problem because now the variables are not globally visible, but rather visible only within the .cpp file that they are defined in.
If you put a global variable definition into a header file, nothing essential has changed (example C, header organization not helpful in this case);
header.h
int x=3; // Don't put this in a .h file, causes multiple definition link error
file1.cpp
#include "header.h"
file2.cpp
#include "header.h"
file3.cpp
#include "header.h"
Phew, I hope someone reads this and gets some benefit from it. Sometimes the questioner is crying out for a simple explanation in terms of basic concepts not an advanced computer scientist's explanation.

Header Guards and LNK4006

I have a character array defined in a header
//header.h
const char* temp[] = {"JeffSter"};
The header is #define guarded and has a #pragma once at the top. If this header is included in multiple places, I get an LNK4006 - char const * * temp already defined in blahblah.obj. So, I have a couple of questions about this
Why does this happen if I have the guards in place? I thought that they prevented the header from being read in after the first access.
Why do the numerous enums in this header not also give the LNK4006 warnings?
If I add static before the signature, I don't get the warning. What are the implications of doing it this way.
Is there a better way to do this that avoids the error, but lets me declare the array in the header. I would really hate to have a cpp file just for an array definition.
Why does this happen if I have the guards in place? I thought that they prevented the header from being read in after the first access.
Include guards make sure that a header is included only once in one file (translation unit). For multiple files including the header, you want the header to be included in each file.
By defining, as opposed to declaring, variables with external linkage (global variables) in your header file, you can only include the header in one source file. If you include the header in multiple source files, there will be multiple definitions of a variable, which is not allowed in C++.
So, as you have found out, it is a bad idea to define variables in a header file for precisely the reason above.
Why do the numerous enums in this header not also give the LNK4006 warnings?
Because, they don't define "global variables", they're only declarations about types, etc. They don't reserve any storage.
If I add static before the signature, I don't get the warning. What are the implications of doing it this way.
When you make a variable static at file scope, its name is private to that file: the object is not visible outside of the translation unit (file) in which it is defined. So, in simple terms, if you have:
static int i;
in your header, each source file in which you include the header will get a separate int variable i, which is invisible outside of the source file. This is known as internal linkage.
Is there a better way to do this that avoids the error, but lets me declare the array in the header. I would really hate to have a cpp file just for an array definition.
If you want the array to be one object visible from all your C++ files, you should do:
extern int array[SIZE];
in your header file, and then include the header file in all the C++ source files that need the variable array. In one of the source (.cpp) files, you need to define array:
int array[SIZE];
You should include the header in the above source file as well, to allow for catching mistakes due to a difference in the header and the source file.
Basically, extern tells the compiler that "array is defined somewhere, and has the type int, and size SIZE". Then, you actually define array only once. At link stage, everything resolves nicely.
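Here is a minimal sketch of that layout in one listing. It assumes SIZE is a const integer defined in the header (it could equally be a macro), and the file names and the helpers sum_array and element are invented for the example:

```cpp
#include <cassert>
#include <cstddef>

// ---- array.h (sketch) ----
const std::size_t SIZE = 4;            // const at namespace scope: internal linkage, fine in a header
extern int array[SIZE];                // declaration only

// ---- array.cpp (sketch): includes array.h, then defines the object ----
int array[SIZE] = {10, 20, 30, 40};    // checked against the declaration above
// Writing 'long array[SIZE]' or 'int array[SIZE + 1]' here would not compile,
// because the compiler sees both the declaration and the definition.

// ---- user.cpp (sketch): includes array.h and uses the shared object ----
int sum_array() {
    int total = 0;
    for (std::size_t i = 0; i < SIZE; ++i) total += array[i];
    return total;
}

int element(std::size_t index) {
    assert(index < SIZE);
    return array[index];
}
```

The comment in the array.cpp part is the reason for including the header in the defining file: a mismatch is caught at compile time instead of silently surviving until run time.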
Include guards protect you from including the same header into the same file repeatedly - but not from including it in distinct files.
What happens is that the linker sees temp in more than one object file - you can solve that by making temp static or putting it into an unnamed namespace:
static const char* temp1[] = {"JeffSter"};
// or
namespace {
const char* temp2[] = {"JeffSter"};
}
Alternatively you can use one source file which defines temp and just declare it as extern in the header:
// temp.cpp:
const char* temp[] = {"JeffSter"};
// header.h:
extern const char* temp[];
Header guards have absolutely nothing to do with preventing multiple definitions in your entire program. The purpose of header guards is to prevent multiple inclusion of the same header file into the same translation unit (.cpp file). In other words, they exist to prevent multiple definitions in the same source file. And they do work as intended in your case.
The rule that governs multiple-definition issues in C++ is called the One Definition Rule (ODR). ODR is defined differently for different kinds of entities. For example, types are allowed to have multiple identical definitions in the program. They can (and almost always have to) be defined in every translation unit where they are used. This is why your enum definition does not result in an error.
Objects with external linkage are a completely different story. They have to be defined in one and only one translation unit. This is why your definition of temp causes an error when you include the header file into multiple translation units. Include guards can't prevent this error. Just don't define objects with external linkage in header files.
By adding static you give your object internal linkage. This will make the error disappear, since now it is perfectly OK from ODR point of view. But this will define an independent temp object in each translation unit into which your header file is included. To achieve the same effect you could also do
const char* const temp[] = { "JeffSter" };
since const objects in C++ have internal linkage by default.
This depends on whether you need an object with external linkage (i.e. one for the entire program) or an object with internal linkage (unique to each translation unit). If you need the latter, use static and/or extra const (if that works for you) as shown above.
If you need the former (external linkage), you should put a non-defining declaration into the header file
extern const char* temp[];
and move the definition into one and only one .cpp file
const char* temp[] = { "JeffSter" };
The above declaration in the header file will work for most purposes. However, it declares temp as an array of unknown size - an incomplete type. If you wish to declare it as an array of known size, you have to specify the size manually
extern const char* temp[1];
and remember to keep it in sync between the declaration and definition.
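One possible way around keeping the size in step by hand, sketched below, is to leave the array bound unspecified in the header and export the element count as a separate constant. The name temp_count and the helper temp_at are hypothetical, not something from the question:

```cpp
#include <cassert>
#include <cstddef>

// ---- header.h (sketch) ----
extern const char* temp[];             // unknown bound: sizeof(temp) would not compile here
extern const std::size_t temp_count;   // hypothetical companion carrying the element count

// ---- temp.cpp (sketch): the only place that sees the full array type ----
const char* temp[] = {"JeffSter"};
const std::size_t temp_count = sizeof(temp) / sizeof(temp[0]);

// ---- user.cpp (sketch): works from the header declarations alone ----
const char* temp_at(std::size_t index) {
    assert(index < temp_count);
    return temp[index];
}
```

The count is computed next to the initializer, so adding an entry to the array updates it automatically. The trade-off is that temp_count is a run-time value for other files, not a compile-time constant.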
I respectfully disagree with the advice against defining variables in headers, because I think "never" is too broad. Nevertheless, the episode that brought me to this thread offers a cautionary tale for those who dare to do so.
I landed on this page as the result of an inquiry into the cause of warning LNK4006, calling out a long established array that I just moved from the translation unit that defines my DLLMain routine into the private header that is included in most of the translation units that comprise this library. I have compiled this library hundreds of times over the last 11 years, and I had never before seen this warning.
Shortly after I read this page, I discovered the cause of the error, which was that the definition was outside the guard block that protects everything else that is defined in the module that also defines DLLMain, which is where I usually gather all the memory blocks that need external linkage. As expected, moving the table inside the guard block eliminated the warnings, leaving me with only two, related to a brand new externally linked table, to be resolved.
Takeaway: You can define variables in headers, and it's a great place to put common blocks, but mind your guards.
Hang on... you are mixing up your declarations...you did say 'char const * * temp' yet in your header file you have 'const char* temp[] = {"JeffSter"};'.
See section 6.1 of the C FAQ, under 'Section 6. Arrays and Pointers', to quote:
6.1: I had the definition char a[6] in one source file, and in
another I declared extern char *a. Why didn't it work?
A: In one source file you defined an array of characters and in the
other you declared a pointer to characters. The declaration
extern char *a simply does not match the actual definition.
The type pointer-to-type-T is not the same as array-of-type-T.
Use extern char a[].
References: ISO Sec. 6.5.4.2; CT&P Sec. 3.3 pp. 33-4, Sec. 4.5
pp. 64-5.
That is the source of the problem. Match up your declaration and definitions. Sorry if this sounds blunt, but I could not help noticing what the linker was telling you...

Anyone knows how to fix compile error: LNK2005? (Source Code inside)

I have the below code in stdafx.h.
using namespace std;
typedef struct {
DWORD address;
DWORD size;
char file[64];
DWORD line;
} ALLOC_INFO;
typedef list<ALLOC_INFO*> AllocList;
//AllocList *allocList;
Without the commented code (last line), it compiles just fine. But when I add the commented code, I'm getting the following error.
error LNK2005: "class std::list<struct ALLOC_INFO *,class std::allocator<struct ALLOC_INFO *> >
* allocList" (?allocList@@3PAV?$list@PAUALLOC_INFO@@V?$allocator@PAUALLOC_INFO@@@std@@@std@@A)
already defined in test.obj
I'm using Visual Studio .NET 2003. Anyone has any idea what that is and how to solve it?
Don't put definitions in header files, just declarations. Declarations specify that something exists while definitions actually define them (by allocating space). For example typedef, extern and function prototypes are all declarations, while things like struct, int and function bodies are definitions.
What's happening is that you're most likely including stdafx.h in multiple compilation units (C++ source files) and each of the resulting object files is getting its own copy of allocList.
Then when you link the objects together, there's two (or more) things called allocList, hence the link error.
You would be better off declaring the variable:
extern AllocList *allocList;
in your header file and defining it somewhere in a C++ source file (such as a main.cpp):
AllocList *allocList;
That way, every compilation unit that includes stdafx.h will know about the external variable, but it's only defined in one compilation unit.
Based on your further information:
I was trying to follow http://www.flipcode.com/archives/How_To_Find_Memory_Leaks.shtml, I assume that all those code are meant to be placed in the stdafx.h. Any other alternatives, pax?
My response is as follows.
I wouldn't put them in stdafx.h myself since I think that uses some MS magic for pre-compiled headers.
Make a separate header file mymemory.h and put your function prototypes in it, for example (note that this has no body):
void * __cdecl operator new(
unsigned int size,
const char *file,
int line);
Also in that header, put the other prototypes for AddTrack(), DumpUnfreed(), etc., and the #define, typedef and extern statements:
extern AllocList *allocList;
Then, in a new file mymemory.cpp (which also contains #include "mymemory.h"), put the actual definition of allocList along with all the real functions (not just the prototypes) and add that file to your project.
Then, #include "mymemory.h" in every source file in which you need to track memory (probably all of them). Because there are no definitions in the header file, you won't get duplicates during the link and because the declarations are there, you won't get undefined references either.
Keep in mind that this won't track memory leaks in code that you don't compile (e.g., third-party libraries) but it should let you know about your own problems.
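A heavily simplified sketch of that header/source split, shown as one listing. AllocRecord is a stand-in for ALLOC_INFO, the signatures of AddTrack and RemoveTrack are guesses for illustration and not the article's, and UnfreedBytes is a made-up helper:

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// ---- mymemory.h (sketch): declarations only ----
struct AllocRecord {              // simplified stand-in for ALLOC_INFO
    const void* address;
    std::size_t size;
    const char* file;
    int line;
};
typedef std::list<AllocRecord> AllocList;

extern AllocList* allocList;      // declared here, defined once below
void AddTrack(const void* address, std::size_t size, const char* file, int line);
void RemoveTrack(const void* address);
std::size_t UnfreedBytes();       // hypothetical helper

// ---- mymemory.cpp (sketch): the single definition of everything ----
AllocList* allocList = 0;

void AddTrack(const void* address, std::size_t size, const char* file, int line) {
    assert(address != 0);
    if (!allocList) allocList = new AllocList;
    AllocRecord record = {address, size, file, line};
    allocList->push_back(record);
}

void RemoveTrack(const void* address) {
    if (!allocList) return;
    for (AllocList::iterator it = allocList->begin(); it != allocList->end(); ++it) {
        if (it->address == address) {
            allocList->erase(it);
            return;
        }
    }
}

std::size_t UnfreedBytes() {
    std::size_t total = 0;
    if (allocList) {
        for (AllocList::const_iterator it = allocList->begin(); it != allocList->end(); ++it)
            total += it->size;
    }
    return total;
}
```

Any number of source files can include the header part; only the one .cpp file contributes the allocList object and the function bodies to the link.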