static initialization confusion - c++

I am getting very confused in some concepts in c++. For ex: I have following two files
//file1.cpp
class test
{
static int s;
public:
test(){s++;}
};
static test t;
int test::s=5;
//file2.cpp
#include<iostream>
using namespace std;
class test
{
static int s;
public:
test(){s++;}
static int get()
{
return s;
}
};
static test t;
int main()
{
cout<<test::get()<<endl;
}
Now My question is :
1. How two files link successfully even if they have different class definitions?
2. Are the static member s of two classes related because I get output as 7.
Please explain this concept of statics.

They link because the linker knows almost nothing about the C++ language. However, if you do this, you have broken the One Definition Rule, and the behaviour of your program will be undefined. writing invalid code is not a good way of learning C++. Also, you seem to be having a lot of questions about static variables - the concept is really not all that complex - which C++ textbook are you using that does not explain it well?

The classes are (as far as the linker is concerned) identical. get() is just an inline function the linker never sees.
static in a class doesn't limit the member to file scope, but makes it global for all class instances.
[edit:] You'd get an linker error if you would put int test::s=5; into the second file as well.

Static is a strange beast in C++, that has different meanings depending on the context.
In File1.cpp, "static int s" in the class definition indicates that s is a static member, that is, is common to all instances of test.
The "static test t", though, has a different meaning: a static global variable only exists in the compilation unit, and will not be visible by other units. This is to avoid linker confusion if you were using the same name for two different things. In modern C++, one would use anonymous namespaces for this:
namespace
{
test t;
}
This means that t inside File1.cpp and t inside File2.cpp are separate objects.
In File2.cpp, you also defined a static method get: a static method is a method that belongs to the class instead of the instance, and that can only access static members of the class.
A last use of static, that you did not use in your example, is the local static variable. A local static variable is initialised the first time it gets executed, and keeps its value throughout the execution, a bit like a global variable with a local scope:
int add()
{
static value = 0;
value++;
return value;
}
Calling add() repetitively will return 1, then 2, then 3... This local static construct can be useful for local caches for example.

How two files link successfully even if they have different class definitions?
You violated the One Definition Rule (ODR) by defining the same class differently in different translation units. That invokes the dreaded Undefined Behavior. Which means the code is allows to do anything, including, but not limited to, doing what you wanted, formatting your hard disk, or causing an un-accounted for eclipse of the sun.
Note that compilers/linker are not requested to detect ODR violations.
Are the static member s of two classes related because I get output as 7.
Maybe they are, maybe they are not. Really, undefined behavior might do just anything. What you see is just an artifact of however your compiler and linker are implemented.
A likely implementation will happily link all references to test::s to the same instance, but let each translation unit have - and link to - their own t object. (Since the class has only inline functions, the linker most likely never even sees anything of test except for test::s and those t instances.)

Unnice side-effect of how C++ actually works. Use namespaces to make the class names different to outside.
In past I had used the quite burdensome in practice rule: every library and file has to have a private namespace for to avoid such link conflicts. Or put the utility classes right into the main class definition. Anyway: not to pollute global namespace and most importantly ensure uniqueness of names across the whole project. (Writing that now I notice that I used pretty much verbatim the concept from Java.)
My search for equivalent of C's static in C++ was unfruitful...

Related

single static member function has static variable, will have multiple instatnce

I have seen in stackoverflow that
//define in .h
inline void fun()
{
static int i; // one instance include in multiple cpp
}
static inline void fun1()
{
static int i; // multiple instance included in multiple cpp
}
I often write signleton in such a pattern
//define in .h
class Singleton
{
static Singleton& Instance()
{
static Singleton s;
return s;
}
private:
int i;
}
a lot of coders write in this way, Can someone explain is it correct, how C++ ensure one instance?
actually in the inline and static inline version, there is no clear cpp declaration to ensure single instance.
The C++ standard does not "ensure" anything.
The C++ standard specifies the observed results, what happens as the result of this declaration
It is up to each C++ compiler to implement this C++ specification in the manner that produces the specified results.
The C++ standard does not specify exactly how the compiler and the linker go about doing this. The low level details that produces these results are not specified or mandated by the C++ standard, in any way. This is entirely up to the compiler and the linker.
Having said that: the common approach is for the compiler to produce an instance of the static object in every compiled .cpp file. It is also tagged in some manner that uniquely identifies it; and indicates that, when linked with other translation units, only one instance of the uniquely-identified object (in all the compiled object files) is produced in the final linked executable.
This is a problem that was inherited from C and you have to go back quite a bit in history. So a long time a ago in a language not quite C++ ...
The inline keyword was just a hint to the compiler that you would like that function to get inlined. Now some compilers just ignored it because nothing said you had to inline functions marked inline. Some compilers addded a fudge factor to their automatic inline heuristic to make it more likely to get inlined. Others always inlined.
This then caused a problems for functions defined in headers because when the function was not actually inlined you ended up with the same function in multiple compilation units and then the linker complained.
Some compilers marked inline functions so that the linker would ignore duplicate definitions. But to be portable you had to mark your functions static inline. That way, even if the function is present in each compilation unit, the function isn't exported and doesn't collide with definitions of the same function in other compilation units. Drawback is that you get multiple copies of the same function, which is a problem when you have local static variables or use the address of the function.
Warp forward to the near present...
Compilers have gotten a lot smarter about inlining automatically and the feature to mark inline function to allow multiple defintions has kind of taken hold and that is the meaning of the inline specifier in C++. Since C++17 this also holds for inline variables, not just functions.
So inline causes the compiler to mark the function (or variable) in the object file for the linker to
Allow multiple definitions without error or warning.
Pick any one of the definitions and replace all the other instances with that one.
All duplicates of the inline function will end up with the same address and any local static variable will end up with a single instance too. All copies become the same object.
PS: using a static local variable for a singleton is imho the best way as it avoids the Static Initialization Order Fiasco as much as can be and is thread safe.

How to use public static variables in C++

There doesn't seem to be a clear, concise example of real word use of a public static variable in C++ from multiple files on StackOverflow.
There are many examples showing how to use static variables in a single C++ translation unit, and many questions about the precise nature of the various uses of the static keyword, but as a programmer more experienced with C# I found it hard to scrape together what I needed to simply "have a static variable on a class and use it elsewhere" like you would in C# or Java.
Here I will try to demonstrate what I discovered in the most straightforward way. I hope that others will then improve on the answer and give more technical details for those that are interested.
SomeClass.h
In the header file for the class where we want to have a static variable, we just declare it static like we would in C# or Java. Don't try to initialise the variable here; C++ doesn't like that.
class SomeClass
{
public:
static bool some_flag;
};
SomeClass.cpp
In the cpp file for the class, we create the storage for the static variable, just like we would provide the implementation for a member function. Here we can initialise the variable. So far, this is going swimmingly!
#include "SomeClass.h"
bool SomeClass::some_flag = true;
SomeOtherClass.cpp
Here is where I ran into problems. If we try to just access SomeClass::some_flag from elsewhere (which, let's face it, is probably the reason you wanted a public static variable in the first place), the linker will complain that it doesn't know where it lives. We've told the compiler that the static variable exists, so the compiler is happy. The problem is that in this other translation unit we've never specified where that static variable is stored. You may be tempted to try redeclaring the storage, and hoping the linker resolves them both as "the some_flag I declared in SomeClass.h", but it won't. It will complain that you've given the variable two homes, which of course is not what you meant.
What we need to do is to tell the linker that the storage for some_flag lives elsewhere, and that it will be found once we try to put all the translation units together at link time. We use the extern keyword for this.
#include "SomeOtherClass.h"
#include "SomeClass.h"
extern bool SomeClass::some_flag;
void SomeOtherClass::SomeOtherFunction()
{
SomeClass::some_flag = true;
};
Voila! The compiler is happy, the linker is happy, and hopefully the programmer is also happy.
I now leave this open for a discussion on how I should not have used a public variable, could have just passed an instance of SomeClass through the 87 layers of code that's between it and the usage, should have used some feature coming in C++23, should have used boost::obscurething etc. etc.
I do however welcome any alternative approaches to this fundamental problem that are true to the usage I've demonstrated here.
Here is where I ran into problems. If we try to just access SomeClass::some_flag from elsewhere [...], the linker will complain that it doesn't know where it lives.
The linker won't complain, as long as you link with the translation unit that defines the variable (SomeClass.cpp).
extern bool SomeClass::some_flag;
This declaration is neither allowed, nor is it necessary. The class definition that contains the variable declaration is sufficient for the compiler, and the variable definition in SomeClass.cpp is sufficient for the linker. Demo
Don't try to initialise the variable here; C++ doesn't like that.
I do however welcome any alternative approaches to this fundamental problem
You could simply use an inline variable (since C++17):
class SomeClass
{
public:
inline static bool some_flag = true;
The answer, in summary form, for people who just want the code so they can get back to work:
SomeClass.h
#pragma once
class SomeClass
{
public:
static bool some_flag;
};
SomeClass.cpp
#include "SomeClass.h"
bool SomeClass::some_flag = true;
SomeOtherClass.cpp
#include "SomeOtherClass.h"
#include "SomeClass.h"
extern bool SomeClass::some_flag;
void SomeOtherClass::SomeOtherFunction()
{
SomeClass::some_flag = true;
}

Aren't non-const variable considered external by default?

Learning from this: By default, non-const variables declared outside of a block are assumed to be external. However, const variables declared outside of a block are assumed to be internal.
But if I write this inside MyTools.h:
#ifndef _TOOLSIPLUG_
#define _TOOLSIPLUG_
typedef struct {
double LN20;
} S;
S tool;
#endif // !_TOOLSIPLUG_
and I include MyTools.h 3 times (from 3 different .cpp) it says that tool is already defined (at linker phase). Only if I change in:
extern S tool;
works. Isn't external default?
The declaration S tool; at namespace scope, does declare an extern linkage variable. It also defines it. And that's the problem, since you do that in three different translation units: you're saying to the linker that you have accidentally named three different global variables, of the same type, the same.
One way to achieve the effect you seem to desire, a single global shared variable declared in the header, is to do this:
inline auto tool_instance()
-> S&
{
static S the_tool; // One single instance shared in all units.
return the_tool;
}
static S& tool = tool_instance();
The function-accessing-a-local-static is called a Meyers' singleton, after Scott Meyers.
In C++17 and later you can also just declare an inline variable.
Note that global variables are considered Evil™. Quoting Wikipedia on that issue:
” The use of global variables makes software harder to read and understand. Since any code anywhere in the program can change the value of the variable at any time, understanding the use of the variable may entail understanding a large portion of the program. Global variables make separating code into reusable libraries more difficult. They can lead to problems of naming because a global variable defined in one file may conflict with the same name used for a global variable in another file (thus causing linking to fail). A local variable of the same name can shield the global variable from access, again leading to harder-to-understand code. The setting of a global variable can create side effects that are hard to locate and predict. The use of global variables makes it more difficult to isolate units of code for purposes of unit testing; thus they can directly contribute to lowering the quality of the code.
Disclaimer: code not touched by compiler's hands.
There's a difference between a declaration and a definition. extern int i; is a declaration: it says that there's a variable named i whose type is int, and extern implies that it will be defined somewhere else. int i;, on the other hand, is a definition of the variable i. When you write a definition, it tells the compiler to create that variable. If you have definitions of the same variable in more than one source file, you've got multiple definitions, and the compiler (well, in practice, the linker) should complain. That's why putting the definition S tool; in the header creates problems: each source file that #include's that header ends up defining tool, and the compiler rightly complains.
The difference between a const and a not-const definition is, as you say, that const int i = 3; defines a variable named i that is local to the file being compiled. There's no problem having the same definition in more than one source file, because those guys all have internal linkage, that is, they aren't visible outside the source file. When you don't have the const, for example, with int i = 3;, that also defines a variable named i, but it has external linkage, that it, it's visible outside the source file, and having the same definition in multiple files gives you that error. (technically, it's definitions of the same name that lead to problems; int i = 3; and double i = 3.0; in two different source files still are duplicate definitions).
By default, non-const variables declared outside of a block are assumed to be external. However, const variables declared outside of a block are assumed to be internal.
That statement is still correct in your case.
In your MyTools.h, tool is external.
S tool; // define tool, external scope
In your cpp file, however, the extern keyword merely means that S is defined else where.
extern S tool; // declare tool

C++ static variables in static class methods defined in headers

// SomeOtherClass.hpp
#pragma once
int someOtherCallMe();
class SomeOtherClass {
public:
static int callMe() {
static int _instance = 7;
++_instance;
return _instance;
}
};
// SomeOtherClass.cpp
#include "SomeOtherClass.hpp"
int
someOtherCallMe() {
return SomeOtherClass::callMe();
}
// main.cpp
#include "SomeOtherClass.hpp"
#include <iostream>
int
main() {
std::cout << SomeOtherClass::callMe();
std::cout << someOtherCallMe();
return 0;
}
I have three files: SomeOtherClass.hpp/cpp, main.cpp. Those files result in two binaries: shared library (of SomeOtherClass.cpp) and executable(of main.cpp, linked with shared library).
Does C++ guaranties that static <any-type> _instance will be a single variable during the execution of a program (does not matter in how many binaries it was defined)?
Note
To clarify the situation. The confusion I see in this situation is that, on one hand, the SomeOtherClass::callMe is defined in program twice, that is expected (because class static member function are actually a regular function with internal linkage, if they are defined in place, like in this case), and that is what you can see from disassembly. Since we have two functions with static local variables in machine code. How does the language/standard qualify their behaviour?
Yes. A static will be a single value. A lot of other things are not well defined or are new to the standard. (When are they initialized if they are global? Is the code for static initialization within a function thread-safe?) But, yes, you can count on there being only one.
The only clarification here is outside the standard but of practical importance if you are creating a shared library (.so or .dll): You can't statically (privately) link your C++ class library into the shared library. Otherwise, that would make two copies if you do this within two different shared libraries. (This comment applies to everything about a library, not just static variables. If you do this, then there is duplication of everything.)
Edit: On many platforms (such as Linux and Windows), this can be used to purposefully "hide" your static variable. If you don't make your function/class accessible outside the dll/so (using declspec or a visibility attribute), then you can ensure your dll/so has its own copy of the entire class. This technique can help reduce unwanted interactions between libraries. However, in your case, it sounds like you really want only one, which will be the case if your class has proper visibility throughout all of your libraries (only visible in one, and other libraries link to that library).
Edit again to refer to the standard
If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. An inline function with external linkage shall have the same address in all translation units. A static local variable in an extern inline function always refers to the same object.
7.1.2.4, C++14
Does C++ guaranties that static _instance will be a single variable during the execution of a program (does not matter in how many binaries it was defined)?
I don't think the language has anything to say about that. It does not talk about static libraries or dynamic libraries.
It's the responsibility of an implementation to provide the mechanism to make sure that it is possible to have one definition. It's up to a user to make sure that they use the mechanism provided by the implementation to have one definition of the static variable in the function.

C++: What is the proper way to define non-class static const values?

First I'll try to describe the current situation:
I am adapting an existing code base for our uses and in some instances a .h/.cpp file contains multiple class definitions. We cannot change the existing public interface of the API without major modifications to other portions of the code that we would rather avoid at this time.
I have found the need for constant values that are used by more than one class (in the same .cpp file and nowhere else) and therefore cannot be defined as a class-specific constant. The only way I know to do this is to define the constants externally from any class (but still in the .cpp file) and reference them as needed.
I have done this and the code compiles and links, but when I go to run a test program on the code it fails with an error that appears to be related to the constant value I have defined. I get the impression that when the code executes, the constant has not actually been defined and the code blows up because of that.
I don't have a lot of experience writing C++ code and wonder if I'm doing this the wrong way. I will include code snippets below to try to illustrate what I'm doing.
In DateTime.cpp (currently nothing defined in DateTime.h for DATE_FORMAT_REGEX):
...
#include <boost/regex.hpp>
static const boost::regex DATE_FORMAT_REGEX("[0-9]{4}-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])(Z|([+|-]([01][0-9]|2[0-4]):[0-5][0-9]))?");
// Other class method implementations ...
bool our_ns::Date::parse_string(const std::string& s)
{
// Validate string against regex.
bool valid = boost::regex_match(s, DATE_FORMAT_REGEX);
if (valid) {
...
}
}
...
The call to regex_match is what fails. All of the classes, by the way are defined within a namespace in the header file.
It would appear the constant is not being initialized. What do I need to do to initialize these values? Based upon what I've described, is there a better way to do this?
[Update: 6/9/15 12:52 EDT]
Actually, I am relaying information witnessed by another developer in the organization. He verified in a debugger that the regex item was null when he reached the line where it blows up. He also mentioned that when he did some experimentation and moved the definitions from the .cpp to the .h file, the error did not recur. Beyond that, I don't know if things were executing properly.
The correct way is to not defined these as static const but rather as const. These are constants, they do not need static linkage. If you're doing it to avoid the global namespace, const variables have implicit compilation unit scoping anyway.
If Date::parse_string is called from a Date constructor and the Date object is static and initialized in a different compilation unit (C++ file) then it could be an issue related to static variable initialization order. In this case the order of initialization of variables is not defined so the Date object might be initialized before DATE_FORMAT_REGEX and as a result parse_string is called before DATE_FORMAT_REGEX is initialized. You can fix it by moving DATE_FORMAT_REGEX to the Date class definition.
This is likely a result of initialization order, a well-known issue that one could run into.
It has been described in books, such as C++ FAQs in detail for a very long time. Here is a quick reference:
https://isocpp.org/wiki/faq/ctors#static-init-order
Basically, there is no guaranteed order in which static variables must initialize. This creates situations where you could run into an error, when the "second" object makes use of the "first", but, "first" hasn't been initialized. Please note that "first" and "second" are interchangeable, therein lies the crux of the problem.
In fact, the more dangerous situation is a latent bug, where things continue to work until one day when they don't - usually, induced by a new compiler version or some such change.
It is better to get out of this dependency altogether. You don't seem to need to static here. If the goal is to limit visibility of the variable, then use anonymous namespace and place it before any use.