C++ Core Guidelines for static member variables - c++

I have a private static vector in my class that keeps a pointer to all objects created from it. It's necessary as each object needs access to information from all the other objects to perform some calculations:
// Header file:
class Example {
public:
Example();
private:
static std::vector<const Example*> examples_;
};
// Cpp file:
std::vector<const Example *> Example::examples_ = {};
Example::Example() {
// intialization
examples_.emplace_back(this);
}
void Example::DoCalc() {
for (auto example : examples_) {
// do stuff
}
}
clang-tidy points out that I'm violating C++ Core Guidelines, namely : "Variable 'examples_' is non-const and globally accessible, consider making it const (cppcoreguidelines-avoid-non-const-global-variables)".
Personally, I don't see the resemblance between my code and the sample code in the core guidelines, especially since the variable is inside a class and private. What would be the 'correct' way of implementing this functionality? I don't want to disable this check from clang-tidy if it can be avoided.

What you've done is fine. This is literally the purpose of class-static. Some people would recommend alternatives, for unrelated reasons, which may be worth considering… but not because of anything clang-tidy is telling you here.
You've run into clang-tidy bug #48040. You can see this because it's wrong in its messaging: the vector is not "globally accessible", at least not in the sense of access rules, since it's marked private (though it's globally present across translation units, which is fine).
Your code doesn't relate to the cited core guideline.

A possible solution is to force each client that accesses Example::examples_ to go through a function. Then put examples as a static variable into that function. That way the object will be created the first time the function is called - independent of any global object construction order.
// Header file:
class Example {
public:
Example();
private:
std::vector<const Example*>& examples();
};
// Cpp file:
std::vector<Example *>& Example::examples()
{
static std::vector<Example *> examples_;
return examples_;
};
Example::Example() {
// intialization
examples().emplace_back(this);
}
void Example::DoCalc() {
for (auto example : examples()) {
// do stuff
}
}
Of course if you are sure that you have no problem with global objects and are sure that no other global object is accessing Examples::examples_ during its construction, you can ignore the warning. It is just a guideline, you don't need to follow it strictly.
As Asteroids With Wings noted the guideline I.2 does not apply to your code. But please note that the CoreGuidelines intend to ban static members as well, see To-do: Unclassified proto-rules:
avoid static class members variables (race conditions, almost-global variables)

Personally, I don't see the resemblance between my code and the sample code in the core guidelines
You have a single variable that is accessible to every thread, hidden from users of Example. The only difference to an ordinary global variable is that it is private, i.e. you can't use the name Example::examples_ to refer to it outside of Example.
Note
The rule is "avoid", not "don't use."
The "correct" way of implementing this functionality might be how you have it, but I strongly suggest you rework "each object needs access to information from all the other objects to perform some calculations" so that you pass a std::vector<const Example*> to where it is needed, having kept track of all the relevant (and especially alive) Examples where they are used.
Alternative: [...] Another solution is to define the data as the state of some object and the operations as member functions.
Warning: Beware of data races: If one thread can access non-local data (or data passed by reference) while another thread executes the callee, we can have a data race. Every pointer or reference to mutable data is a potential data race.

Related

A c++ class include a static member with the same type of itself. Why this pattern?

I inherited a project from a former colleague, and I found these code snippets (and some similar ones in SO questions: can a c++ class include itself as an member and static member object of a class in the same class)
// Service.h
class Service
{
// ...
public:
static Service sInstance;
void Somememberfunc();
//...
};
// Service.cpp
#include "Service.h"
Service Service::sInstance;
void Service::Somememberfunc()
{
//...
}
// Main.cpp
#include "Service.h"
void Fun()
{
Service &instance = Service::sInstance;
//...
instance.Somememberfunc();
//...
}
However, I did not find any explanation on when to use this pattern. And what are the advantages and disadvantages?
Notice that the member is a static, so it's part of the class, not of instantiated objects. This is important, because otherwise you would be trying to make a recursive member (since the member is part of the object, it also contains the same member and so on...), but this is not the case here.
The best pattern to describe this is: global variable. The static member is initialized before main() and can be accessed from any part of the program by including the header file. This is very convenient while implementing but becomes harder to handle the more complex the program gets and the longer you have to maintain it, so the general idea is to avoid this. Also, because there is no way to control the order of initialization, dependencies between different global variables can cause problems during startup.
Static member is roughly a global variable in the scope of the class.
Static members have also the advantage of visibility access (public/protected/private) to restreint its usage (file scope might be an alternative).
That member might be of type of the class.
Global are "easy" to (mis)use, as they don't require to think about architecture.
BUT (mutable) global are mostly discouraged as harder to reason about.
Acceptable usages IMO are for constants:
as for a matrix class, the null matrix, the diagonal one matrix.
for Temperature class, some specific value (absolute 0 (O Kelvin), temperature of water transformation(0 Celsius, 100 Celsius), ...)
in general NullObject, Default, ...
Singleton pattern. For example, you can use it to store your app configurations. And then you can easily access configurations anywhere(globally) within your app.
This is often used in the singleton design pattern
Visit https://en.wikipedia.org/wiki/Singleton_pattern

Best practice for handling const class data

Say you have a certain class in which each instance of it needs to access the exact same set of data. It is more efficient to declare the data outside the class rather than have each instance make its own one, but doesn't this breach the 'no globals' rule?
Example code:
Foo.h
class Foo{
Foo();
void someFunction();
};
Foo.cpp
#include "Foo.h"
//surely ok since it's only global to the class's .cpp?
const int g_data[length] = {
//contains some data
};
Foo::someFunction(){
//do something involving g_data ..
}
..rather than making 'g_data' a member variable. Or is there some other way which avoids creating a global?
Use the modifier static which modifies the declaration of a class member so that is shared among all class instances. An example:
class A {
int length = 10;
static int g_data[length]; //Inside the class
}
And then you can access it like this:
int main(void) {
std::cout << "For example: " << A::g_data[2] << std::endl;
}
You can find more on this matter here
That's what a static member is for.
Thus you will have in your declaration :
class Foo{
Foo();
void someFunction();
static int const sharedData[length];
};
And somewhere in your cpp file :
int const Foo::sharedData[length] = { /* Your data */ };
Summarily, yes - your "global" is probably ok (though it'd be better in an anonymous namespace, and there are a few considerations).
Details: lots of answers recommending a static class member, and that is often a good way to go, but do keep in mind that:
a static class member must be listed in the class definition, so in your case it'll be in Foo.h, and any client code that includes that header is likely to want to recompile if you edit the static member in any way even if it's private and of no possible direct relevance to them (this is most important for classes in low-level libraries with diverse client code - enterprise libraries and those shared across the internet and used by large client apps)
with a static class member, code in the header has the option of using it (in which case a recompile will be necessary and appropriate if the static member changes)
if you put the data in the .cpp file, it's best in an anonymous namespace so it doesn't have external linkage (no other objects see it or can link to it), though you have no way to encapsulate it to protect it from access by other functions later in the .cpp's translation unit (whereas a non-public static class member has such protection)
What you really want is data belonging to the type, not to an instance of that type. Fortunately, in C++ there is an instrument doing exactly that — static class members.
If you want to avoid the global and have a more object-oriented solution, take a look at the Flyweight pattern. The Boost Flyweight library has some helper infrastructure and provides a decent explanation of the concept.
Since you are talking about efficiency, it may or not be a good idea to use such an approach, depending on your actual goal. Flyweights are more about saving memory and introduce an additional layer of indirection which may impact runtime performance. The externally stored state may impact compiler optimizations, especially inlining and reduce data locality which prevents efficient caching. On the other hand, some operations which like assignment or copying can be considerably faster because there is only the pointer to the shared state that needs to be copied (plus the non-shared state, but this has to be copied anyway). As always when it comes to efficiency / performance, make sure you measure your code to compare what suits your requirements.

Immutable "functional" data structure in C++11

I was trying to write down some implementations for a couple of data structures that I'm interested in for a multithreaded / concurrent scenario.
A lot of functional languages, pretty much all that I know of, design their own data structures in such a way that they are immutable, so this means that if you are going to add value to an instance t1 of T, you really get a new instance of T that packs t1 + value.
container t;
container s = t; //t and s refer to the same container.
t.add(value); //this makes a copy of t, and t is the copy
I can't find the appropriate keywords to do this in C++11; there are keywords, semantics and functions from the standard library that are clearly oriented to the functional approach, in particular I found that:
mutable it's not for runtime, it's more likely to be an hint for the compiler, but this keyword doesn't really help you in designing a new data structure or use a data structure in an immutable way
swap doesn't works on temporaries, and this is a big downside in my case
I also don't know how much the other keywords / functions can help with such design, swap was one of them really close to something good, so I could at least start to write something, but apparently it's limited to lvalues .
So I'm asking: it's possible to design immutable data structure in C++11 with a functional approach ?
You simply declare a class with private member variables and you don't provide any methods to change the value of these private members. That's it. You initialize the members only from the constructors of the class. Noone will be able to change the data of the class this way. The tool of C++ to create immutable objects is the private visibility of the members.
mutable: This is one of the biggest hacks in C++. I've seen at most 2 places in my whole life where its usage was reasonable and this keyword is pretty much the opposite of what you are searching for. If you would search for a keyword in C++ that helps you at compile time to mark data members then you are searching for the const keyword. If you mark a class member as const then you can initialize it only from the INITIALIZER LIST of constructors and you can no longer modify them throughout the lifetime of the instance. And this is not C++11, it is pure C++. There are no magic language features to provide immutability, you can do that only by programming smartly.
In c++ "immutability" is granted by the const keyword. Sure - you still can change a const variable, but you have to do it on purpose (like here). In normal cases, the compiler won't let you do that. Since your biggest concern seems to be doing it in a functional style, and you want a structure, you can define it yourself like this:
class Immutable{
Immutable& operator=(const Immutable& b){} // This is private, so it can't be called from outside
const int myHiddenValue;
public:
operator const int(){return myHiddenValue;}
Immutable(int valueGivenUponCreation): myHiddenValue(valueGivenUponCreation){}
};
If you define a class like that, even if you try to change myHiddenValue with const_cast, it won't actually do anything, since the value will be copied during the call to operator const int.
Note: there's no real reason to do this, but hey - it's your wish.
Also note: since pointers exist in C++, you still can change the value with some kind of pointer magic (get the address of the object, calc the offset, etc), but you can't really help that. You wouldn't be able to prevent that even when using an functional language, if it had pointers.
And on a side note - why are you trying to force yourself in using C++ in a functional manner? I can understand it's simpler for you, and you're used to it, but functional programming isn't often used because of its downfalls. Note that whenever you create a new object, you have to allocate space. It's slower for the end-user.
Bartoz Milewski has implemented Okasaki's functional data structures in C++. He gives a very thorough treatise on why functional data structures are important for concurrency. In that treatise, he explains the need in concurrency to construct an object and then afterwards make it immutable:
Here’s what needs to happen: A thread has to somehow construct the
data that it destined to be immutable. Depending on the structure of
that data, this could be a very simple or a very complex process. Then
the state of that data has to be frozen — no more changes are
allowed.
As others have said, when you want to expose data in C++ and have it not be available for changing, you make your function signature look like this:
class MutableButExposesImmutably
{
private:
std::string member;
public:
void complicatedProcess() { member = "something else"; } // mutates
const std::string & immutableAccessToMember() const {
return member;
}
};
This is an example of a data structure that is mutable, but you can't mutate it directly.
I think what you are looking for is something like java's final keyword: This keyword allows you to construct an object, but thereafter the object remains immutable.
You can do this in C++. The following code sample compiles. Note that in the class Immutable, the object member is literally immutable, (unlike what it was in the previous example): You can construct it, but once constructed, it is immutable.
#include <iostream>
#include <string>
using namespace std;
class Immutable
{
private:
const std::string member;
public:
Immutable(std::string a) : member(a) {}
const std::string & immutable_member_view() const { return member; }
};
int main() {
Immutable foo("bar");
// your code goes here
return 0;
}
Re. your code example with s and t. You can do this in C++, but "immutability" has nothing to do with that question, if I understand your requirements correctly!
I have used containers in vendor libraries that do operate the way you describe; i.e. when they are copied they share their internal data, and they don't make a copy of the internal data until it's time to change one of them.
Note that in your code example, there is a requirement that if s changes then t must not change. So s has to contain some sort of flag or reference count to indicate that t is currently sharing its data, so when s has its data changed, it needs to split off a copy instead of just updating its data.
So, as a very broad outline of what your container will look like: it will consist of a handle (e.g. a pointer) to some data, plus a reference count; and your functions that update the data all need to check the refcount to decide whether to reallocate the data or not; and your copy-constructor and copy-assignment operator need to increment the refcount.

What is the best strategy for sharing variables between source files in c/c++?

I frequently have to write c/c++ programs with 10+ source files where a handful of variables need to be shared between functions in all the files. I have read before that it is generally good practice to avoid using global variables with extern. However, if it is completely necessary to use global variables, this link provides a good strategy. Lately, I have been toying with the strategy of wrapping up all my variables in a struct or a class and passing this struct around to different functions. I was wondering which way people consider to be cleaner and if there are any better alternatives.
EDIT: I realize strategies may be different in the two languages. I am interested in strategies that apply to only one language or both.
Pass around a class/struct of "context" data instead of global variables. You will be suprised how often a global variable becomes no longer global, with different modules wanting to use different values for it at the same time.
The better alternative to globals is to not use globals.
Don't try to sweep them under the rug using a struct or a namespace or a singleton or some other silly thing whose only purpose is to hide the fact that you're using globals.
Just don't ever create one. Ever.
It will force you to think of ownership and lifetime and dependencies and responsibility. You know, grown-up things.
And then, when you're comfortable writing global-free code, you can start violating all those rules.
Because that's what rules are for: to be followed, and to be broken.
Does it REALLY need to be global?
This is the first question you should always ask, is this variable used GLOBALLY e.g. in all contexts. The answer is almost certainly... no it's not.
Consider Context
Is the variable global state, or is it context? Global state is usually rare, context on the other hand is quite common. If it's global state consider wrapping in a singleton so you can manage the how of interaction with your globals. Using Atomic<> is probably not a bad idea, you should at least consider synchronization.
If it is context then it should be passed explicitly in a structure or class, as the data is explicitly relevant to that context an no-other. Passing context explicitly may seem like a burden but it makes it very clear where the context is coming from rather than just referencing random variables out of the ether.
What is the Scope?
It may seem odd to say that globals are scoped, but any global declared in a single file may be declared static and thus unlinkable from any other file. This means you can restrict who has access to the global state in a given scope. This allows you to prevent people from randomly tweaking variables.
My take in C++
I found it a good practice in C++ anyway to limit the scope of my global variables with a namespace. That way you can eliminate any ambiguity between your 10+ source files.
For example:
namespace ObjectGlobalVars {
//Put all of your global variables here
int myvariable = 0;
}
//And then later on you can reference them like
ObjectGlobalVars::myvariable++;
In c++
Having global variables lying around here and there, is an example of bad code.
If you want to share things on a global scale, then group them up and follow the singleton pattern.
Example:
class Singleton
{
private:
int mData;
public:
static Singleton& getInstance()
{
static Singleton instance;
return instance;
}
int GetData()
{
return mData;
}
private:
Singleton() {};
Singleton(Singleton const&);
void operator=(Singleton const&);
};
Advantages:
Only 1 global variable. The instance of our singleton.
You can include mutex / semaphore mechanisms inside the singleton, for thread-safe access of it's members.
Restricts the access of it's members helping you avoid logical and synchronization flaws.
Disadvantages:
Harder to implement. - If it's your first time -
In c
You should avoid declaring global variables, pass them in structs instead.
For instance:
struct MyData
{
int a;
int b;
};
void bar(struct MyData* data)
{
data->b = 2;
}
void foo()
{
struct MyData mdata;
mdata.a = 1;
bar( &mdata );
}
To sum things up
Having global variables lying around should be avoided as much as possible, in both languages.

What to use instead of static variables

in a C++ program I need some helper constant objects that would be instantiated once, preferably when the program starts. Those objects would mostly be used within the same translation unit, so the simplest way to do this would be to make them static:
static const Helper h(params);
But then there is this static initialization order problem, so if Helper refers to some other statics (via params), this might lead to UB.
Another point is that I might eventually need to share this object between several units. If I just leave it static and put in a .h file, that would lead to multiple objects. I could avoid that by bothering with extern etc, but this can finally provoke the same initialization order issues (and not to say it looks very C-ish).
I thought about singletons, but that would be overkill due to the boilerplate code and inconvenient syntax (e.g. MySingleton::GetInstance().MyVar) - those objects are helpers, so they are supposed to simplify things, not to complicate them...
The same C++ FAQ mentions this option:
Fred& x()
{
static Fred* ans = new Fred();
return *ans;
}
Is this really used and considered a good thing? Should I do it this way, or would you suggest other alternatives? Thanks.
EDIT: I should have clarified why I actually need that helpers: they are very like normal constants, and could have been pre-calculated, but it is more convenient to do that at runtime. I would prefer to instantiate them before main, as it automatically resolves multi-threading issues (which local statics are not protected against in C++03). Also, as I said, they would often be limited to a translation unit, so it does not make sense to export them and initialize in main(). You can think of them as just constants but only known at runtime.
There are several possibilities for global state (whether mutable or not).
If you fear that you'll have an initialization issue, then you should use the local static approach to create your instance.
Note that the clunky singleton design you present is not mandatory design:
class Singleton
{
public:
static void DoSomething(int i)
{
Singleton& s = Instance();
// do something with i
}
private:
Singleton() {}
~Singleton() {}
static Singleton& Instance()
{
static Singleton S; // no dynamic allocation, it's unnecessary
return S;
}
};
// Invocation
Singleton::DoSomething(i);
Another design is somewhat similar, though I much prefer it because it makes transition to a non-global design much easier.
class Monoid
{
public:
Monoid()
{
static State S;
state = &s;
}
void doSomething(int i)
{
state->count += i;
}
private:
struct State
{
int count;
};
State* state;
};
// Use
Monoid m;
m.doSomething(1);
The net advantage here is that the "global-ness" of the state is hidden, it's an implementation details that clients need not worry about. Very useful for caches.
Let us, will you, question the design:
do you actually need to enforce the singularity ?
do you actually need the object be built before main starts ?
Singularity is generally over-emphasized. C++0x will help here, but even then, technically enforcing singularity rather than relying on programmers to behave themselves can be very annoying... for example when writing tests: do you really want to unload/reload your program between each unit test just to change the configuration between each one ? Ugh. Much more simple to instantiate it once and have faith in your fellow programmers... or the functional tests ;)
The second question is more technical, than functional. If you do need the configuration before the entry point of your program, then you can simply read it when it starts.
It may sound naive, but there is actually one issue with computing during the library load: how do you handle errors ? If you throw, the library is not loaded. If you do not throw and go on, you are in an invalid state. Not so funny, is it ? Things are much simpler once the real work has begun, because you can use the regular control-flow logic.
And if you think about testing whether the state is valid or not... why not simply building everything at the point where you'd test ?
Finally, the very issue with global is the hidden dependencies that are introduced. It's much better when dependencies are implicit to reason about the flow of execution, or the impacts of a refactoring.
EDIT:
Regarding initialization order issues: objects within a single translation unit are guaranteed to be initialized in the order they are defined.
Therefore, the following code is valid according to the standard:
static int foo() { return std::numeric_limits<int>::max() / 2; }
static int bar(int c) { return c*2; }
static int const x = foo();
static int const y = bar(x);
The initialization order is only an issue when referencing constants / variables defined in another translation unit. As such, static objects can naturally be expressed without issues as long as they only refer to static objects within the same translation unit.
Regarding the space issue: the as-if rule can do wonders here. Informally the as-if rule means that you specify a behavior and leave it up to the compiler/linker/runtime to provide it, without a care in the world for how it is provided. This is what actually enables optimizations.
Therefore, if the compiler chain can infer that the address of a constant is never taken, it may elide the constant altogether. If it can infer that several constants will always be equal, and once again that their address are never inspected, it may merge them together.
Yes, you can use Construct On First Use Idiom if it simplifies your problem. It's always better than global objects whose initialization depend on other global objects.
The other alternative is Singleton Pattern. Both can solve similar problem. But you've to decide which suits the situation better and fulfill your requirement.
To the best of my knowledge, there is nothing "better" than these two appproaches.
Singletons and global objects are often considered evil. The simplest and most flexible way is to instantiate the object in your main function and pass this object to other functions:
void doSomething(const Helper& h);
int main() {
const Parameters params(...);
const Helper h(params);
doSomething(h);
}
Another way is to make the helper functions non-members. Maybe they don't need any state at all, and if they do, you can pass a stateful object when you call them.
I think nothing speaks against the local static idiom mentioned in the FAQ. It is simple and should be thread-safe, and if the object isn't mutable, it should also be easily mockable and introduce no action at a distance.
Does Helper need to exist before main runs? If not, make a (set of?) global pointer variables initialized to 0. Then use main to populate them with the constant state in a definitive order. If you like you can even make helper functions that do the dereference for you.