For a project I'm working on, I have a bunch of "library classes". These are essentially collections of related functions and values. Some of these libraries need to be "initialized" at run-time. So far, I've been using the design below as a solution:
// Filename: Foo.h
namespace my_project
{
namespace library
{
class Foo
{
public:
static int some_value; // members used externally and internally
Foo()
{
// Lots of stuff goes on in here
// Therefore it's not a simple member initialization
// But for this example, this should suffice
some_value = 10;
Foo::bar();
}
static void bar() { ++some_value; } // some library function
// no destructor needed because we didn't allocate anything
private:
// restrict copy/assignment
Foo(const Foo&);
void operator=(const Foo&);
};
int Foo::some_value = 0; // since some_value is static, we need this
} // library namespace
static library::Foo Foo;
} // my_project namespace
Using Foo would be similar to this, as an example:
#include "Foo.h"
using namespace my_project;
int main()
{
int i = Foo.some_value;
Foo.bar();
int j = Foo.some_value;
return 0;
}
Of course, this example is very simplified, but it gets the point across. This method has four advantages to me:
User of the library doesn't need to worry about initialization. They wouldn't need to call something like Foo::init(); inside their main(), because library::Foo was initialized when my_project::Foo was constructed. This is the main design constraint here. User should not be responsible for initializing the library.
I can create various private functions inside the library to control its use.
The user can create other instances of this library if they choose, for whatever reason. But no copying would be allowed. One instance would be provided for them by default. This is a requirement.
I can use the . syntax instead of ::. But that's a personal style thing.
Now, the question is, are there any disadvantages to this solution? I feel like I'm doing something that C++ wasn't meant to do because Visual Studio's IntelliSense keeps freaking out on me and thinks my_project::Foo isn't declared. Could it be because both the object and the class are called Foo even though they're in different namespaces?
The solution compiles fine. I'm just worried that once my project grows larger in scale, I might start having name ambiguities. Furthermore, am I wasting extra memory by creating an object of this library?
Should I simply stick to the singleton design pattern as an alternative solution? Are there any alternative solutions?
UPDATE:
After reviewing the solutions provided, and jumping around google for various solutions, I stumbled upon extern. I have to say I'm a bit fuzzy on what this keyword really does; I've been fuzzy about it ever since I learned C++. But after tweaking my code, I changed it to this:
// Foo.h
namespace my_project
{
namespace library
{
class Foo_lib
{
public:
int some_value;
Foo_lib() { /* initialize library */ }
void bar() { /* do stuff */ }
private:
// restrict copy/assignment
Foo_lib(const Foo_lib&);
void operator=(const Foo_lib&);
};
} // library namespace
extern library::Foo_lib Foo;
} // my_project namespace
// Foo.cpp
#include "Foo.h"
namespace my_project
{
namespace library
{
// Foo_lib definitions
} // library namespace
library::Foo_lib Foo;
} // my_project namespace
// main.cpp
#include "Foo.h"
using namespace my_project;
int main()
{
int i = Foo.some_value;
Foo.bar();
int j = Foo.some_value;
return 0;
}
This seems to have the exact same effect as before. But as I said, since I'm still fuzzy on extern usage, would this also have the exact same bad side-effects?
This line is particularly bad:
static library::Foo Foo;
It emits a static copy of Foo in every translation unit that includes Foo.h. Don't use it :) The constructor would run once for each translation unit Foo.h was visible to, resetting the shared Foo::some_value every time, and it's not thread safe (which will frustrate your users).
This line will result in multiple definitions when linking, as soon as Foo.h is included in more than one translation unit:
int Foo::some_value = 0;
Singletons are also bad. Searching here on SO will produce a lot of reasons to avoid them.
Just create normal objects, and document to your users why they should share objects when using your library, and in which scenarios.
User of the library doesn't need to worry about initialization. They wouldn't need to call something like Foo::init(); inside their main(), because library::Foo was initialized when my_project::Foo was constructed. This is the main design constraint here. User should not be responsible for initializing the library.
Objects should be able to construct themselves as needed without introducing unstrippable binary baggage.
I can create various private functions inside the library to control its use.
That's not unique to your approach.
The user can create other instances of this library if they choose, for whatever reason. But no copying would be allowed. One instance would be provided for them by default. This is a requirement.
Then you can force your users to pass Foo as a necessary argument to create the types they depend upon (where Foo is needed).
I can use the . syntax instead of ::. But that's a personal style thing.
Not good. Not threadsafe, and the user can then seriously mess up your library's state. Private data is best.
There are two things going on here:
What if the user would dearly like to parallelize her code ?
What if the user would like to start using your library during the static initialization phase ?
So, one at a time.
1. What if the user would dearly like to parallelize her code ?
In the age of multi-core processors libraries should strive for re-entrancy. Global State is bad, and unsynchronized Global State is even worse.
I would simply recommend for you to make Foo contain regular attributes instead of static ones, it is then up to the user to decide how many instances in parallel should be used, and perhaps settle on one.
If passing a Foo to all your methods would prove awkward, have a look at the Facade pattern. The idea here would be to create a Facade class that is initialized with a Foo and provides entry points to your library.
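A minimal sketch of that idea, assuming a Foo that keeps only per-instance state. The names FooFacade, bump_and_read and facade_demo are made up for illustration and are not part of the original library:

```cpp
#include <cassert>

// Hypothetical library state: plain per-instance members, no statics.
class Foo {
public:
    Foo() : some_value_(10) {}
    void bar() { ++some_value_; }
    int some_value() const { return some_value_; }
private:
    int some_value_;
};

// Hypothetical facade: built from a Foo, it exposes the library's entry
// points so callers do not have to pass the Foo to every function.
class FooFacade {
public:
    explicit FooFacade(Foo& foo) : foo_(foo) {}
    int bump_and_read() { foo_.bar(); return foo_.some_value(); }
private:
    Foo& foo_;
};

// Two independent instances never interfere with each other.
void facade_demo() {
    Foo a, b;
    FooFacade fa(a), fb(b);
    assert(fa.bump_and_read() == 11);
    assert(fa.bump_and_read() == 12);
    assert(fb.bump_and_read() == 11);
}
```

The user decides how many Foo objects exist and which threads own them; the facade only saves typing.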
2. What if the user would like to start using your library during the static initialization phase ?
The static initialization order fiasco is just horrid, and the static destruction order fiasco (its sibling) is no better, and even harder to track down (because the memory is not 0-initialized there, so it's hard to see what's going on).
Since it is once again hard (perhaps impossible) for you to predict how your library will be used, and since a singleton that you create is nigh impossible to use safely during static initialization or destruction, the simpler thing to do is to delegate at least initialization to the user.
If the user is unlikely to be willing to use this library at start-up and shut-down, then you may provide a simple safeguard to automatically initialize the library on first use if she didn't already.
This can be accomplished easily, and in a thread-safe manner (*), using local static variables:
class Foo {
public:
static Foo& Init() { static Foo foo; return foo; }
static int GetValue() { return Init()._value; }
private:
Foo(): _value(1) {}
Foo(Foo const&) = delete;
Foo& operator=(Foo const&) = delete;
int _value;
}; // class Foo
Note that all this glue is completely useless if you simply decide not to use a Singleton and go for the first solution: a regular object, with per-instance state only.
(*) Thread safety is guaranteed in C++11. In C++03 (the version used primarily in the industry) the best compilers guarantee it as well, check the documentation if required.
Now, the question is, are there any disadvantages to this solution?
Yes. See for instance, this entry in the c++ faq on the static initialization order fiasco. http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.14 tldr? Essentially, you have no control over what order static objects (such as Foo above) get initialized in, any assumptions about the order (eg. initializing one static object with values from another) will result in Undefined Behaviour.
Consider this code in my app.
#include "my_project/library/Foo.h"
static int whoKnowsWhatValueThisWillHave = my_project::library::Foo::some_value;
int main()
{
return whoKnowsWhatValueThisWillHave;
}
There are no guarantees on what I am returning from main() here.
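For comparison, here is a rough sketch of the usual workaround, assuming the value can live behind a function instead of being a namespace-scope object. compute_initial_value, valueAtStartup and first_use_demo are hypothetical names, not from the question:

```cpp
#include <cassert>

namespace my_project {
namespace library {

// Hypothetical stand-in for whatever run-time setup the library needs.
inline int compute_initial_value() { return 10; }

// The value is a function-local static, so it is initialized the first
// time some_value() is called, no matter which translation unit calls it.
inline int& some_value() {
    static int value = compute_initial_value();
    return value;
}

inline void bar() { ++some_value(); }

} // namespace library
} // namespace my_project

// A namespace-scope object can now read the value during static
// initialization: the call itself triggers the initialization.
static const int valueAtStartup = my_project::library::some_value();

void first_use_demo() {
    assert(valueAtStartup == 10);
}
```

This only addresses initialization order; destruction order and thread safety of later modifications are separate concerns.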
The user can create other instances of this library if they choose, for whatever reason. But no copying would be allowed. One instance would be provided for them by default. This is a requirement.
Not really, no... Since all of your data is static, any new instances are essentially empty shells pointing to the same data. Basically, you have a copy.
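A small illustration of that point, using a cut-down stand-in for the class in the question (shared_state_demo is a made-up helper):

```cpp
#include <cassert>

// Simplified stand-in for the class in the question: all state is static.
class Foo {
public:
    static int some_value;
    Foo() { some_value = 10; bar(); }
    static void bar() { ++some_value; }
};
int Foo::some_value = 0;

int shared_state_demo() {
    Foo first;                 // sets some_value to 10, then bar() makes it 11
    first.bar();               // 12
    Foo second;                // the "new" instance resets the shared value to 11
    assert(&first.some_value == &second.some_value);  // same object
    return first.some_value;   // 11, not 12
}
```

Creating the second object silently changed what the first one sees, which is probably not what a user creating "another instance" expects.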
I feel like I'm doing something that C++ wasn't meant to do because Visual Studio's IntelliSense keeps freaking out on me and thinks my_project::Foo isn't declared. Could it be because both the object and the class are called Foo even though they're in different namespaces?
You are! Suppose I add this line to my code:
using namespace ::my_project::library;
what does 'Foo' resolve to now? Maybe this is defined in the standard, but at the very least, it is confusing.
I can use the . syntax instead of ::. But that's a personal style thing.
Don't fight the language. If you want to code in Python or Java syntax, use Python or Java (or Ruby or whatever)...
Should I simply stick to the singleton design pattern as an alternative solution? Are there any alternative solutions?
Yes, the Singleton is a good one, but you should also consider whether you actually need a singleton here. Since your example is only syntactic, it is hard to say, but maybe it would be better to use dependency injection or something similar to minimize/eliminate tight couplings between classes.
Hopefully I haven't hurt your feelings :) It's good to ask questions, but obviously you already know that!
Related
I have a private static vector in my class that keeps a pointer to all objects created from it. It's necessary as each object needs access to information from all the other objects to perform some calculations:
// Header file:
#include <vector>
class Example {
public:
Example();
void DoCalc();
private:
static std::vector<const Example*> examples_;
};
// Cpp file:
std::vector<const Example *> Example::examples_ = {};
Example::Example() {
// initialization
examples_.emplace_back(this);
}
void Example::DoCalc() {
for (auto example : examples_) {
// do stuff
}
}
clang-tidy points out that I'm violating C++ Core Guidelines, namely : "Variable 'examples_' is non-const and globally accessible, consider making it const (cppcoreguidelines-avoid-non-const-global-variables)".
Personally, I don't see the resemblance between my code and the sample code in the core guidelines, especially since the variable is inside a class and private. What would be the 'correct' way of implementing this functionality? I don't want to disable this check from clang-tidy if it can be avoided.
What you've done is fine. This is literally the purpose of class-static. Some people would recommend alternatives, for unrelated reasons, which may be worth considering… but not because of anything clang-tidy is telling you here.
You've run into clang-tidy bug #48040. You can see this because it's wrong in its messaging: the vector is not "globally accessible", at least not in the sense of access rules, since it's marked private (though it's globally present across translation units, which is fine).
Your code doesn't relate to the cited core guideline.
A possible solution is to force each client that accesses Example::examples_ to go through a function. Then put examples as a static variable into that function. That way the object will be created the first time the function is called - independent of any global object construction order.
// Header file:
#include <vector>
class Example {
public:
Example();
void DoCalc();
private:
// static, so it can be called without an object; same element type as the definition
static std::vector<const Example*>& examples();
};
// Cpp file:
std::vector<const Example*>& Example::examples()
{
// created the first time examples() is called
static std::vector<const Example*> examples_;
return examples_;
}
Example::Example() {
// initialization
examples().emplace_back(this);
}
void Example::DoCalc() {
for (auto example : examples()) {
// do stuff
}
}
Of course, if you are sure that you have no problem with global objects and that no other global object accesses Example::examples_ during its construction, you can ignore the warning. It is just a guideline; you don't need to follow it strictly.
As Asteroids With Wings noted the guideline I.2 does not apply to your code. But please note that the CoreGuidelines intend to ban static members as well, see To-do: Unclassified proto-rules:
avoid static class members variables (race conditions, almost-global variables)
Personally, I don't see the resemblance between my code and the sample code in the core guidelines
You have a single variable that is accessible to every thread, hidden from users of Example. The only difference to an ordinary global variable is that it is private, i.e. you can't use the name Example::examples_ to refer to it outside of Example.
Note
The rule is "avoid", not "don't use."
The "correct" way of implementing this functionality might be how you have it, but I strongly suggest you rework "each object needs access to information from all the other objects to perform some calculations" so that you pass a std::vector<const Example*> to where it is needed, having kept track of all the relevant (and especially alive) Examples where they are used.
Alternative: [...] Another solution is to define the data as the state of some object and the operations as member functions.
Warning: Beware of data races: If one thread can access non-local data (or data passed by reference) while another thread executes the callee, we can have a data race. Every pointer or reference to mutable data is a potential data race.
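A rough sketch of the "pass the vector to where it is needed" suggestion. The int payload and the summing calculation are placeholders, since the real calculation is not shown in the question:

```cpp
#include <cassert>
#include <vector>

// Hypothetical version of Example with no static registry.
class Example {
public:
    explicit Example(int value) : value_(value) {}

    // The caller decides which (live) objects take part in the calculation.
    int DoCalc(const std::vector<const Example*>& peers) const {
        int sum = 0;
        for (const Example* peer : peers) {
            if (peer != this) sum += peer->value_;  // placeholder calculation
        }
        return sum;
    }
private:
    int value_;
};

void explicit_peers_demo() {
    Example a(1), b(2), c(3);
    std::vector<const Example*> all = {&a, &b, &c};
    assert(a.DoCalc(all) == 5);   // 2 + 3
    assert(c.DoCalc(all) == 3);   // 1 + 2
}
```

Whoever owns the Example objects also owns the vector, so destroyed objects can be dropped from it in one place.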
Say you have a certain class in which each instance of it needs to access the exact same set of data. It is more efficient to declare the data outside the class rather than have each instance make its own one, but doesn't this breach the 'no globals' rule?
Example code:
Foo.h
class Foo{
Foo();
void someFunction();
};
Foo.cpp
#include "Foo.h"
//surely ok since it's only global to the class's .cpp?
const int g_data[length] = {
//contains some data
};
void Foo::someFunction(){
//do something involving g_data ..
}
..rather than making 'g_data' a member variable. Or is there some other way which avoids creating a global?
Use the modifier static, which modifies the declaration of a class member so that it is shared among all class instances. An example:
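class A {
public:
static const int length = 10; // must be a compile-time constant to size the array
static int g_data[length]; // declared inside the class; still needs one definition in a .cpp file
};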
And then you can access it like this:
int main(void) {
std::cout << "For example: " << A::g_data[2] << std::endl;
}
You can find more on this matter here
That's what a static member is for.
Thus you will have in your declaration :
class Foo{
Foo();
void someFunction();
static int const sharedData[length];
};
And somewhere in your cpp file :
int const Foo::sharedData[length] = { /* Your data */ };
Summarily, yes - your "global" is probably ok (though it'd be better in an anonymous namespace, and there are a few considerations).
Details: lots of answers recommending a static class member, and that is often a good way to go, but do keep in mind that:
a static class member must be listed in the class definition, so in your case it'll be in Foo.h, and any client code that includes that header is likely to want to recompile if you edit the static member in any way even if it's private and of no possible direct relevance to them (this is most important for classes in low-level libraries with diverse client code - enterprise libraries and those shared across the internet and used by large client apps)
with a static class member, code in the header has the option of using it (in which case a recompile will be necessary and appropriate if the static member changes)
if you put the data in the .cpp file, it's best in an anonymous namespace so it doesn't have external linkage (no other objects see it or can link to it), though you have no way to encapsulate it to protect it from access by other functions later in the .cpp's translation unit (whereas a non-public static class member has such protection)
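As a sketch of the anonymous-namespace variant, shown as a single listing for brevity. The table contents, kData, kLength and the index parameter are invented for the example:

```cpp
#include <cassert>
#include <cstddef>

// Foo.h (hypothetical): nothing about the shared table appears in the header.
class Foo {
public:
    int someFunction(std::size_t index) const;
};

// Foo.cpp (hypothetical): the table has internal linkage, so no other
// translation unit can see it or link against it.
namespace {
const int kData[] = {2, 3, 5, 7, 11};  // placeholder values
const std::size_t kLength = sizeof(kData) / sizeof(kData[0]);
}

int Foo::someFunction(std::size_t index) const {
    return index < kLength ? kData[index] : -1;
}

void anonymous_namespace_demo() {
    Foo f;
    assert(f.someFunction(0) == 2);
    assert(f.someFunction(99) == -1);
}
```

Editing the table now recompiles only Foo.cpp, at the cost that any later function in that file can also touch it.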
What you really want is data belonging to the type, not to an instance of that type. Fortunately, in C++ there is an instrument doing exactly that — static class members.
If you want to avoid the global and have a more object-oriented solution, take a look at the Flyweight pattern. The Boost Flyweight library has some helper infrastructure and provides a decent explanation of the concept.
Since you are talking about efficiency, it may or may not be a good idea to use such an approach, depending on your actual goal. Flyweights are more about saving memory, and they introduce an additional layer of indirection which may impact runtime performance. The externally stored state may hinder compiler optimizations, especially inlining, and reduce data locality, which prevents efficient caching. On the other hand, some operations like assignment or copying can be considerably faster because only the pointer to the shared state needs to be copied (plus the non-shared state, but this has to be copied anyway). As always when it comes to efficiency / performance, make sure you measure your code to compare what suits your requirements.
I frequently have to write c/c++ programs with 10+ source files where a handful of variables need to be shared between functions in all the files. I have read before that it is generally good practice to avoid using global variables with extern. However, if it is completely necessary to use global variables, this link provides a good strategy. Lately, I have been toying with the strategy of wrapping up all my variables in a struct or a class and passing this struct around to different functions. I was wondering which way people consider to be cleaner and if there are any better alternatives.
EDIT: I realize strategies may be different in the two languages. I am interested in strategies that apply to only one language or both.
Pass around a class/struct of "context" data instead of global variables. You will be surprised how often a global variable becomes no longer global, with different modules wanting to use different values for it at the same time.
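Something along these lines, as a sketch only. The Context fields and the function names are invented for the example:

```cpp
#include <cassert>

// Hypothetical context: the handful of values that used to be globals.
struct Context {
    int verbosity;
    int items_processed;
};

inline void process_item(Context& ctx, int /*item*/) {
    ++ctx.items_processed;
}

inline int process_all(Context& ctx, int count) {
    for (int i = 0; i < count; ++i) process_item(ctx, i);
    return ctx.items_processed;
}

// Two modules can use different contexts at the same time.
void context_demo() {
    Context quiet = {0, 0};
    Context loud = {2, 0};
    assert(process_all(quiet, 3) == 3);
    assert(process_all(loud, 5) == 5);
    assert(quiet.items_processed == 3);  // untouched by the second run
}
```

The same shape works in C with a pointer parameter instead of a reference.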
The better alternative to globals is to not use globals.
Don't try to sweep them under the rug using a struct or a namespace or a singleton or some other silly thing whose only purpose is to hide the fact that you're using globals.
Just don't ever create one. Ever.
It will force you to think of ownership and lifetime and dependencies and responsibility. You know, grown-up things.
And then, when you're comfortable writing global-free code, you can start violating all those rules.
Because that's what rules are for: to be followed, and to be broken.
Does it REALLY need to be global?
This is the first question you should always ask: is this variable used GLOBALLY, i.e. in all contexts? The answer is almost certainly... no, it's not.
Consider Context
Is the variable global state, or is it context? Global state is usually rare; context, on the other hand, is quite common. If it's global state, consider wrapping it in a singleton so you can manage how code interacts with your globals. Using std::atomic<> is probably not a bad idea; you should at least consider synchronization.
If it is context, then it should be passed explicitly in a structure or class, as the data is relevant to that context and no other. Passing context explicitly may seem like a burden, but it makes it very clear where the context is coming from rather than just referencing random variables out of the ether.
What is the Scope?
It may seem odd to say that globals are scoped, but any global declared in a single file may be declared static and thus unlinkable from any other file. This means you can restrict who has access to the global state in a given scope. This allows you to prevent people from randomly tweaking variables.
My take in C++
I found it a good practice in C++ anyway to limit the scope of my global variables with a namespace. That way you can eliminate any ambiguity between your 10+ source files.
For example:
namespace ObjectGlobalVars {
//Put all of your global variables here
int myvariable = 0;
}
//And then later on you can reference them like
ObjectGlobalVars::myvariable++;
In c++
Having global variables lying around here and there, is an example of bad code.
If you want to share things on a global scale, then group them up and follow the singleton pattern.
Example:
class Singleton
{
private:
int mData;
public:
static Singleton& getInstance()
{
static Singleton instance;
return instance;
}
int GetData()
{
return mData;
}
private:
Singleton() : mData(0) {} // initialize mData so GetData() never reads garbage
Singleton(Singleton const&);
void operator=(Singleton const&);
};
Advantages:
Only 1 global variable. The instance of our singleton.
You can include mutex / semaphore mechanisms inside the singleton, for thread-safe access to its members.
Restricts access to its members, helping you avoid logical and synchronization flaws.
Disadvantages:
Harder to implement. - If it's your first time -
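To make the mutex point concrete, here is a hedged sketch that assumes C++11. LockedSingleton and its members are illustrative names, not part of the example above:

```cpp
#include <cassert>
#include <mutex>

class LockedSingleton {
public:
    static LockedSingleton& getInstance() {
        static LockedSingleton instance;  // initialized once, thread-safe since C++11
        return instance;
    }

    // Every access to data_ goes through the mutex.
    int increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ++data_;
    }
    int get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_;
    }

    LockedSingleton(const LockedSingleton&) = delete;
    LockedSingleton& operator=(const LockedSingleton&) = delete;

private:
    LockedSingleton() : data_(0) {}
    mutable std::mutex mutex_;  // mutable so get() can stay const
    int data_;
};

void locked_singleton_demo() {
    LockedSingleton& s = LockedSingleton::getInstance();
    const int before = s.get();
    assert(s.increment() == before + 1);
    assert(&s == &LockedSingleton::getInstance());
}
```

The lock protects individual calls only; a sequence such as get() followed by increment() is still not atomic as a whole.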
In c
You should avoid declaring global variables, pass them in structs instead.
For instance:
struct MyData
{
int a;
int b;
};
void bar(struct MyData* data)
{
data->b = 2;
}
void foo()
{
struct MyData mdata;
mdata.a = 1;
bar( &mdata );
}
To sum things up
Having global variables lying around should be avoided as much as possible, in both languages.
Say I have (pretty large) C++ module in a namespace foo which has a lot (well, at least one) of static data, namespace-global data and Singletons and so forth, spread across a myriad of files and directories. Is there any way to "sandbox" that entire thing in order to run independent versions at the same time (in the same process, that is). How many versions are to be run will be decided at runtime.
I thought about wrapping everything in several namespaces (e.g. bar1::foo, bar2::foo, ...), but that is a) not possible, since I don't want to touch all files and b) it would not enable me to have an arbitrary number at runtime.
Update: I was thinking perhaps I could create separate thread for each version, but I'm not that well versed with threads.
Consider putting your foo code inside a shared object. During runtime you can load and unload that shared object as often as you desire.
For an initial reference on dynamic loading of shared object, take a peek at http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html
Basically you have created a namespace with state, and that is bad. You want to use a class for this use case; you should be able to change the code reasonably easily so that it is a class.
So where you have had
namespace foo{
int state;
int func();
}
foo::func();
you need
class foo{
public:
int func(); // public, so foo1.func() below compiles
private:
int state;
};
foo foo1;
foo1.func();
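Once the state lives in a class, the "arbitrary number at runtime" part becomes a container of objects. A minimal sketch, assuming the class is default-constructible; make_instances and the body of func are made up:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class foo {
public:
    foo() : state(0) {}
    int func() { return ++state; }  // placeholder for the real work
private:
    int state;
};

// Create as many independent "versions" as needed, decided at runtime.
inline std::vector<foo> make_instances(std::size_t count) {
    return std::vector<foo>(count);
}

void instances_demo() {
    std::vector<foo> versions = make_instances(3);
    assert(versions[0].func() == 1);
    assert(versions[0].func() == 2);
    assert(versions[1].func() == 1);  // unaffected by versions[0]
}
```

This does require touching the code that currently uses namespace-level state, which the question hoped to avoid.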
In a C++ program I need some helper constant objects that would be instantiated once, preferably when the program starts. Those objects would mostly be used within the same translation unit, so the simplest way to do this would be to make them static:
static const Helper h(params);
But then there is this static initialization order problem, so if Helper refers to some other statics (via params), this might lead to UB.
Another point is that I might eventually need to share this object between several units. If I just leave it static and put in a .h file, that would lead to multiple objects. I could avoid that by bothering with extern etc, but this can finally provoke the same initialization order issues (and not to say it looks very C-ish).
I thought about singletons, but that would be overkill due to the boilerplate code and inconvenient syntax (e.g. MySingleton::GetInstance().MyVar) - those objects are helpers, so they are supposed to simplify things, not to complicate them...
The same C++ FAQ mentions this option:
Fred& x()
{
static Fred* ans = new Fred();
return *ans;
}
Is this really used and considered a good thing? Should I do it this way, or would you suggest other alternatives? Thanks.
EDIT: I should have clarified why I actually need those helpers: they are very much like normal constants, and could have been pre-calculated, but it is more convenient to do that at runtime. I would prefer to instantiate them before main, as it automatically resolves multi-threading issues (which local statics are not protected against in C++03). Also, as I said, they would often be limited to a translation unit, so it does not make sense to export them and initialize them in main(). You can think of them as just constants that are only known at runtime.
There are several possibilities for global state (whether mutable or not).
If you fear that you'll have an initialization issue, then you should use the local static approach to create your instance.
Note that the clunky singleton design you present is not mandatory:
class Singleton
{
public:
static void DoSomething(int i)
{
Singleton& s = Instance();
// do something with i
}
private:
Singleton() {}
~Singleton() {}
static Singleton& Instance()
{
static Singleton S; // no dynamic allocation, it's unnecessary
return S;
}
};
// Invocation
Singleton::DoSomething(i);
Another design is somewhat similar, though I much prefer it because it makes transition to a non-global design much easier.
class Monoid
{
public:
Monoid()
{
static State S;
state = &S;
}
void doSomething(int i)
{
state->count += i;
}
private:
struct State
{
int count;
};
State* state;
};
// Use
Monoid m;
m.doSomething(1);
The net advantage here is that the "global-ness" of the state is hidden; it's an implementation detail that clients need not worry about. Very useful for caches.
Let us, will you, question the design:
do you actually need to enforce the singularity ?
do you actually need the object be built before main starts ?
Singularity is generally over-emphasized. C++0x will help here, but even then, technically enforcing singularity rather than relying on programmers to behave themselves can be very annoying... for example when writing tests: do you really want to unload/reload your program between each unit test just to change the configuration between each one? Ugh. It is much simpler to instantiate it once and have faith in your fellow programmers... or the functional tests ;)
The second question is more technical, than functional. If you do need the configuration before the entry point of your program, then you can simply read it when it starts.
It may sound naive, but there is actually one issue with computing during the library load: how do you handle errors ? If you throw, the library is not loaded. If you do not throw and go on, you are in an invalid state. Not so funny, is it ? Things are much simpler once the real work has begun, because you can use the regular control-flow logic.
And if you think about testing whether the state is valid or not... why not simply build everything at the point where you'd test?
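To illustrate the error-handling argument, here is a small sketch of building the object once the program is running. Config, its validation rule and try_configure are all hypothetical:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical configuration object, built explicitly after main() has started.
class Config {
public:
    explicit Config(const std::string& text) : size_(0) {
        if (text.empty()) throw std::invalid_argument("empty configuration");
        size_ = static_cast<int>(text.size());
    }
    int size() const { return size_; }
private:
    int size_;
};

// Regular control flow: the caller can catch the failure and decide what to do,
// which is not possible when construction happens during library load.
inline bool try_configure(const std::string& text, int& size_out) {
    try {
        Config config(text);
        size_out = config.size();
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}

void configure_demo() {
    int size = 0;
    assert(try_configure("abc", size) && size == 3);
    assert(!try_configure("", size));
}
```

Had the same constructor run for a namespace-scope object, the exception would have escaped before main and terminated the program.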
Finally, the real issue with globals is the hidden dependencies that they introduce. It's much easier to reason about the flow of execution, or the impact of a refactoring, when dependencies are explicit.
EDIT:
Regarding initialization order issues: objects within a single translation unit are guaranteed to be initialized in the order they are defined.
Therefore, the following code is valid according to the standard:
#include <limits>
static int foo() { return std::numeric_limits<int>::max() / 2; }
static int bar(int c) { return c*2; }
static int const x = foo();
static int const y = bar(x);
The initialization order is only an issue when referencing constants / variables defined in another translation unit. As such, static objects can naturally be expressed without issues as long as they only refer to static objects within the same translation unit.
Regarding the space issue: the as-if rule can do wonders here. Informally the as-if rule means that you specify a behavior and leave it up to the compiler/linker/runtime to provide it, without a care in the world for how it is provided. This is what actually enables optimizations.
Therefore, if the compiler chain can infer that the address of a constant is never taken, it may elide the constant altogether. If it can infer that several constants will always be equal, and once again that their addresses are never inspected, it may merge them together.
Yes, you can use Construct On First Use Idiom if it simplifies your problem. It's always better than global objects whose initialization depend on other global objects.
The other alternative is the Singleton Pattern. Both can solve similar problems, but you have to decide which one suits the situation better and fulfills your requirements.
To the best of my knowledge, there is nothing "better" than these two approaches.
Singletons and global objects are often considered evil. The simplest and most flexible way is to instantiate the object in your main function and pass this object to other functions:
void doSomething(const Helper& h);
int main() {
const Parameters params(...);
const Helper h(params);
doSomething(h);
}
Another way is to make the helper functions non-members. Maybe they don't need any state at all, and if they do, you can pass a stateful object when you call them.
I think nothing speaks against the local static idiom mentioned in the FAQ. It is simple and should be thread-safe, and if the object isn't mutable, it should also be easily mockable and introduce no action at a distance.
Does Helper need to exist before main runs? If not, make a (set of?) global pointer variables initialized to 0. Then use main to populate them with the constant state in a definitive order. If you like you can even make helper functions that do the dereference for you.
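One way this could look, as a sketch under the assumption that a single translation unit owns the pointer. Helper's members, g_helper, helper() and init_helpers() are illustrative names:

```cpp
#include <cassert>

// Hypothetical helper whose value is only known at run time.
class Helper {
public:
    explicit Helper(int scale) : scale_(scale) {}
    int apply(int x) const { return x * scale_; }
private:
    int scale_;
};

// Zero-initialized before any dynamic initialization runs, so it is
// reliably null until init_helpers() has been called.
static const Helper* g_helper = 0;

// Helper function that does the dereference for the caller.
const Helper& helper() {
    assert(g_helper != 0 && "helper() used before init_helpers()");
    return *g_helper;
}

// Called once from main(), in whatever order main() chooses.
void init_helpers(int scale) {
    static const Helper instance(scale);
    g_helper = &instance;
}
```

After main() calls init_helpers(3), helper().apply(4) yields 12. The assert catches use before initialization in debug builds only.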