RNGs and global variable avoidance


I'm wondering about this. I've heard that global variables are bad, that they hurt the maintainability, usability, reusability, etc. of the code. But in this case, what can I do otherwise? Namely, I have a "pseudo-random number generator" (PRNG) and as one may know, they involve an internal state that changes every time new random numbers are generated. But this seems like the kind of thing that needs to be a global! Or a "static" member of an RNG class, but that's essentially a global! And globals are bad!
So, what can I do? The obvious thing is to have something like this (really stripped down):
class RNG {
private:
    StateType state; // this is the thing one may be tempted
                     // to make "static", which ruins the
                     // whole idea
public:
    RNG();  // constructor seeds the RNG
    ~RNG();
    int generateRandomInt();
};
But we need to seed it well if we're going to create an instance of this every time we need a random number in some function or class. Using the clock may not work: what if two instances of type "RNG" are created too close together? Then they get the same seed and produce the same random sequence. Bad.
We could also create one master RNG object and pass it around with pointers (instead of making it global, which would put us back on square 1), so that classes that need random numbers get a pointer to the RNG object in them. But then I run into a problem involving save/load of these objects to/from disk -- we can't just "save the RNG" for each instance, since we have only one RNG object. We'd have to instead pass an RNG into the load routines, which might give those routines different argument lists than for other objects that don't use the RNG. This would be a problem if, e.g. we wanted to use a common "Saveable" base class for everything that we can load/save. So, what to do? Eliminate the common "Saveable" base and just adopt a convention for how the load/save routines are to be made (but isn't that bad in and of itself? Oy!)?
What is the best solution to this that avoids the hostile-to-maintainability problems of globals yet also does not run into these new problems?
Or is it in fact okay to use a global here, as after all, that's how the "rand()" builtin works anyway? But then I hear that little thing in the back of my mind saying "but... but but but, globals are bad! Bad!" And from what I've read, there seem to be fairly good reasons to think them bad. But it seems like avoiding them creates new kinds of difficulties, like this one. It certainly seems harder to avoid globals than avoid "goto"s, for example.

There's some merit to your reluctance. You may want to substitute another generator for testing for example, that spits out a deterministic sequence that contains some edge cases.
Dependency Injection is one popular method for avoiding globals.
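As a rough sketch of what injecting the generator could look like (the names IntSource, MersenneSource, ScriptedSource and rollTwoDice are made up for illustration, they are not from the question): the consumer takes its generator as a parameter, so a test can hand it a scripted sequence instead of a real PRNG.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical interface: anything that can hand out integers.
class IntSource {
public:
    virtual ~IntSource() = default;
    virtual int next() = 0;
};

// Production implementation backed by std::mt19937, producing values in 1..6.
class MersenneSource : public IntSource {
public:
    explicit MersenneSource(unsigned seed) : engine_(seed), dist_(1, 6) {}
    int next() override { return dist_(engine_); }
private:
    std::mt19937 engine_;
    std::uniform_int_distribution<int> dist_;
};

// Test double: replays a fixed script, wrapping around at the end.
// Assumes the script is non-empty.
class ScriptedSource : public IntSource {
public:
    explicit ScriptedSource(std::vector<int> values) : values_(std::move(values)) {}
    int next() override {
        int v = values_[index_];
        index_ = (index_ + 1) % values_.size();
        return v;
    }
private:
    std::vector<int> values_;
    std::size_t index_ = 0;
};

// The consumer receives its generator instead of reaching for a global.
int rollTwoDice(IntSource& rng) {
    return rng.next() + rng.next();
}
```

In production code you would construct one MersenneSource near main and pass it down; in a test you pass a ScriptedSource containing the edge cases you care about.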

A random number generator is one of those things that is more OK to be global. You can think of it as:
- A RandomNumberFactory, a design pattern that uses a global factory, and it builds random numbers for you. The factory is of course constant semantically (in C++ you might use the keyword "mutable" internally, if that means anything to you.)
- A global constant (global constants are ok, right?) that gives you read-only access to the global constant randomNumber. randomNumber just happens to be non-deterministic, but it's constant in that of course no application is going to write to it.
- And besides, what's the worst that could happen? In a multithreaded application your RNG will yield non-deterministic behavior? Gasp. As @Mark Ransom pointed out above, yes, that is actually a drawback, as it makes testing harder. If this is a concern to you, you might consider a design that avoids the global.

It sometimes makes sense to use globals. No one can categorically say your code should not have any global variables, period.
What you do need is to be aware of the issues that can arise from global variables. If your application's logic needs global variables and it is possible for it to be called from a multi-threaded environment, then there is nothing inherently wrong with that, but you must take care to use locking, mutexes, and/or atomic reads/writes to ensure correct behavior.
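To make the locking point concrete, here is a minimal sketch of two common ways to keep a global generator safe under threads. The function names sharedRandomInt and perThreadRandomInt are hypothetical, and seeding from std::random_device is an assumption of this sketch, not something the question prescribes.

```cpp
#include <mutex>
#include <random>

// Option 1: one process-wide engine, guarded by a mutex on every draw.
int sharedRandomInt(int lo, int hi) {
    static std::mt19937 engine{std::random_device{}()};
    static std::mutex guard;
    std::lock_guard<std::mutex> lock(guard);
    std::uniform_int_distribution<int> dist(lo, hi);
    return dist(engine);
}

// Option 2: one engine per thread, so no locking is needed at all.
int perThreadRandomInt(int lo, int hi) {
    thread_local std::mt19937 engine{std::random_device{}()};
    std::uniform_int_distribution<int> dist(lo, hi);
    return dist(engine);
}
```

The first keeps a single sequence for the whole program at the cost of contention; the second avoids contention, but each thread then has its own independent sequence.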

Related

What is the need of global objects in c++?

I know that C++ also allows to create global objects of class and struct.
#include <iostream>
using std::cout;
class Test
{
public:
    void fun()
    {
        cout<<"a= "<<a;
    }
private:
    int a=9;
};
Test t; // global object
int main()
{
    t.fun();
    return 0;
}
When and where should I use global objects? Is there any specific use for global objects? Please help me.

The short answer is there is no need.
The long answer is that it can sometimes be convenient. Unfortunately convenience is subjective, and what one finds convenient another might find too lenient.
Global objects have issues (among which muddying data flows in the code and access synchronization), but sometimes one might find it convenient nonetheless to use one because it makes something easier. It can indeed remain easier, or it can prove a maintenance burden, but that only the future can tell...
... If you can avoid global objects, in general you should be better off. In practice, though, you will rarely encounter issues before programming at scale; for toy examples and students' projects there is rarely an issue, which is why beginners fail to understand why more seasoned developers avoid them like the plague.

In a complex project, you shouldn't. You should always be within a namespace if you plan to use your assembly in other projects.
An example of something that might be that high in scope is an operator overload or an interface that you define which will be used many times.
Good practice: organize your project to use descriptive and intuitive namespaces.
Global is really only useful in simple projects that simply don't use namespaces.

The question is ill-posed: is there a need for if, while and do, since we can do everything with just for?
The technical answer is "no, there is no need: the fact that we can do without proves it". But a little more pragmatism shows us that reducing every control flow to a single keyword makes code harder to track and follow. So it is technically possible but not always convenient.
Now: can we do without global objects?
A first non-answer is "yes, with singletons". But a singleton is a static object obtained through a function. The two are not that conceptually different, apart from a design flaw (known as the "static initialization order fiasco") that arises because C++ does not specify the initialization order of global objects across translation units, essentially to allow multiple translation units to coexist in the same linked executable. Singletons allow you to circumvent that problem, but they are essentially a workaround to allow global "things" to exist.
The existence of that flaw (and the singleton technique) is what leads many "teachers" to draw some "moral" suasion like "never use global objects, or the flame of hell will burn your ass-hair". But they soon write std::cout << "hello world" and immediately lose ALL their credibility.
The point is that without "globals" everything is "scope local" and there is no "dynamic scope" in C++: if funcA calls funcB, funcB cannot see funcA's locals, hence the only way to access things across scopes is passing references or pointers as parameters.
In contexts that are mostly "functional", the lack of "dynamic scopes" is compensated for by "lambda captures", and everything else goes in as a parameter.
In contexts that are mostly "procedural", the need for a "state" that survives scopes and can be modified while going in and out is better suited to global objects. And that's the reason cout is one. It represents a resource that pre-exists and post-exists the program, whose state evolves across various calls. If there were no global way to access cout, it would have to be initialized in main and passed as a reference parameter to every function call: even those that have no output to give, since they may themselves call something else that has output to give.
In fact you can think of global objects as "implicit parameters passed to every function".
You can wrap them in global functions, but if functions themselves can be objects and objects themselves can be functional, why say global objects are bad and global functions are fine?
The only actual reason, in fact, turns out to be the static initialization order fiasco.

I would say that global variables are needed for backwards compatibility with c. That is for sure, and this is the only hard reason I see. Since classes and structs are essentially the same, it probably didn't make much sense in forbidding only one.
I don't know of any patterns or use cases universally accepted as good practice that make use of global objects. Usually global objects, sooner or later, lead to a mess if not handled properly. They hide data flow and relationships. It is extremely easy to trip over them.
Still, I have seen some libraries exposing some objects as globals. Usually things that contain some static data. But there are other cases too, notable example being standard library's std::cout and family.
I won't judge if it's good or bad approach. This is too subjective IMO. You could probably use singletons, but one might argue, you can work on a global object under a contract etc. etc.

Is it bad practice to declare static variables in functions/member functions?

Recently a coworker showed me some code like this:
void SomeClass::function()
{
    static bool init = false;
    if (!init)
    {
        // hundreds of lines of ugly code
    }
    init = true;
}
He wants to check whether SomeClass is initialized in order to execute some piece of code once per SomeClass instance, but the fact is that only one instance of SomeClass will exist during the whole lifetime of the program.
His question was about the init static variable, about when it's initialized. I answered that the initialization occurs once, so the value will be false at the first call and true for the rest of its lifetime. After answering I added that such use of static variables is bad practice, but I wasn't able to explain why.
The reasons that I've been thinking so far are the following:
The behaviour of static bool init into SomeClass::function could be achieved with a non-static member variable.
Other functions in SomeClass couldn't check the static bool init value because its visibility is limited to the void SomeClass::function() scope.
The static variables aren't OOPish because they define a global state instead of an object state.
These reasons look poor, unconvincing and not very concrete to me, so I'm asking for more reasons why the use of static variables in functions and member functions is bad practice.
Thanks!

This is certainly a rare occurrence, at least, in good quality code, because of the narrow case for which it's appropriate. What this basically does is a just-in-time initialization of a global state (to deliver some global functionality). A typical example of this is having a random number generator function that seeds the generator at the first call to it. Another typical use of this is a function that returns the instance of a singleton, initialized on the first call. But other use-case examples are few and far between.
In general terms, global state is not desirable, and having objects that contain self-sufficient states is preferred (for modularity, etc.). But if you need global state (and sometimes you do), you have to implement it somehow. If you need any kind of non-trivial global state, then you should probably go with a singleton class, and one of the preferred ways to deliver that application-wide single instance is through a function that delivers a reference to a local static instance initialized on the first call. If the global state needed is a bit more trivial, then doing the scheme with the local static bool flag is certainly an acceptable way to do it. In other words, I see no fundamental problem with employing that method, but I would naturally question its motivations (requiring a global state) if presented with such code.
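A minimal sketch of the "reference to a local static instance" scheme mentioned above, applied to the random number generator example. The names globalEngine and randomIntBelow are hypothetical, and seeding from std::random_device is just one possible choice.

```cpp
#include <random>

// Hypothetical accessor: the engine is constructed and seeded the first time
// control passes through this function, and the same object is returned on
// every later call.
std::mt19937& globalEngine() {
    static std::mt19937 engine{std::random_device{}()};
    return engine;
}

// Convenience wrapper returning a value in [0, n). Assumes n > 0.
int randomIntBelow(int n) {
    std::uniform_int_distribution<int> dist(0, n - 1);
    return dist(globalEngine());
}
```

Since C++11 the initialization of a local static is guaranteed to happen exactly once even if several threads reach it at the same time. That guarantee covers only the construction; drawing numbers from the shared engine concurrently still needs synchronization.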
As is always the case for global data, multi-threading will cause some problems with a simplistic implementation like this one. Naive introductions of global state are never going to be inherently thread-safe, and this case is no exception, you'd have to take measures to address that specific problem. And that is part of the reasons why global states are not desirable.
The behaviour of static bool init into SomeClass::function could be achieved with a non-static member variable.
If there is an alternative to achieve the same behavior, then the two alternatives have to be judged on the technical issues (like thread-safety). But in this case, the required behavior is the questionable thing, more so than the implementation details, and the existence of alternative implementations doesn't change that.
Second, I don't see how you can replace a just-in-time initialization of a global state by anything that is based on a non-static data member (a static data member, maybe). And even if you can, it would be wasteful (require per-object storage for a one-time-per-program-execution thing), and on that ground alone, wouldn't make it a better alternative.
Other functions in SomeClass couldn't check the static bool init value because its visibility is limited to the void SomeClass::function() scope.
I would generally put that in the "Pro" column (as in Pro/Con). This is a good thing. This is information hiding or encapsulation. If you can hide away things that shouldn't be a concern to others, then great! But if there are other functions that would need to know that the global state has already been initialized or not, then you probably need something more along the lines of a singleton class.
The static variables aren't OOPish because they define a global state instead of an object state.
OOPish or not, who cares? But yes, the global state is the concern here. Not so much the use of a local static variable to implement its initialization. Global states, especially mutable global states, are bad in general and should never be abused. They hinder modularity (modules are less self-sufficient if they rely on global states), they introduce multi-threading concerns since they are inherently shared data, they make any function that use them non-reentrant (non-pure), they make debugging difficult, etc... the list goes on. But most of these issues are not tied to how you implement it. On the other hand, using a local static variable is a good way to solve the static-initialization-order-fiasco, so, they are good for that reason, one less problem to worry about when introducing a (well-justified) global state into your code.

Think multi-threading. This type of code is problematic when function() can be called concurrently by multiple threads. Without locking, you're open to race conditions; with locking, concurrency can suffer for no real gain.

Global state is probably the worst problem here. Other functions don't have to be concerned with it, so it's not an issue. The fact that it can be achieved without static variable essentially means you made some form of a singleton. Which of course introduces all problems that singleton has, like being totally unsuitable for multithreaded environment, for one.

Adding to what others said, you can't have multiple objects of this class at the same time, or at least they would not behave as expected. The first instance would set the static variable and do the initialization. The ones created later, though, would not have their own version of init but would share it with all other instances. Since the first instance set it to true, none of the following ones will do any initialization, which is most probably not what you want.
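A small sketch of that difference, using two made-up classes (WithStaticFlag and WithMemberFlag are illustrative names, and the counter only exists to make the effect observable):

```cpp
// Function-local static flag: a single flag shared by every instance.
class WithStaticFlag {
public:
    void function() {
        static bool init = false;
        if (!init) {
            ++initRuns_;   // stands in for the one-time initialization code
            init = true;
        }
    }
    int initRuns() const { return initRuns_; }
private:
    int initRuns_ = 0;
};

// Non-static member flag: each instance tracks its own initialization.
class WithMemberFlag {
public:
    void function() {
        if (!init_) {
            ++initRuns_;
            init_ = true;
        }
    }
    int initRuns() const { return initRuns_; }
private:
    bool init_ = false;
    int initRuns_ = 0;
};
```

With two WithStaticFlag objects, only the first one to call function() ever runs the initialization; with WithMemberFlag, each object runs it once.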

Long delegation chains in C++

This is definitely subjective, but I'd like to try to avoid it
becoming argumentative. I think it could be an interesting question if
people treat it appropriately.
In my several recent projects I used to implement architectures where long delegation chains are a common thing.
Dual delegation chains can be encountered very often:
bool Exists = Env->FileSystem->FileExists( "foo.txt" );
And triple delegation is not rare at all:
Env->Renderer->GetCanvas()->TextStr( ... );
Delegation chains of higher order exist but are really scarce.
In above mentioned examples no NULL run-time checks are performed since the objects used are always there and are vital to the functioning of the program and
explicitly constructed when execution starts. Basically I used to split a delegation chain in these cases:
1) I reuse the object obtained through a delegation chain:
{ // make C invisible to the parent scope
    clCanvas* C = Env->Renderer->GetCanvas();
    C->TextStr( ... );
    C->TextStr( ... );
    C->TextStr( ... );
}
2) An intermediate object somewhere in the middle of the delegation chain should be checked for NULL before usage. Eg.
clCanvas* C = Env->Renderer->GetCanvas();
if ( C ) C->TextStr( ... );
I used to fight the case (2) by providing proxy objects so that a method can be invoked on non-NULL object leading to an empty result.
My questions are:
Is either of cases (1) or (2) a pattern or an antipattern?
Is there a better way to deal with long delegation chains in C++?
Here are some pros and cons I considered while making my choice:
Pros:
it is very descriptive: it is clear from one line of code where the object came from
long delegation chains look nice
Cons:
interactive debugging is labored since it is hard to inspect more than one temporary object in the delegation chain
I would like to know other pros and cons of the long delegation chains. Please, present your reasoning and vote based on how well-argued opinion is and not how well you agree with it.

I wouldn't go so far as to call either an anti-pattern. However, without the extra braces the first has the disadvantage that your variable C stays visible after it has stopped being logically relevant (scoping that is too generous).
You can get around this by using this syntax:
if (clCanvas* C = Env->Renderer->GetCanvas()) {
    C->TextStr( ... );
    /* some more things with C */
}
This is allowed in C++ (while it's not in C) and allows you to keep proper scope (C is scoped as if it were inside the conditional's block) and check for NULL.
Asserting that something is not NULL is by all means better than getting killed by a SegFault. So I wouldn't recommend simply skipping these checks, unless you're a 100% sure that that pointer can never ever be NULL.
Additionally, you could encapsulate your checks in an extra free function, if you feel particularly dandy:
#include <cassert>

template <typename T>
T notNULL(T value) {
    assert(value);  // aborts in debug builds if value is NULL
    return value;
}
// e.g.
notNULL(notNULL(Env)->Renderer->GetCanvas())->TextStr();

In my experience, chains like that often contain getters that are less than trivial, leading to inefficiencies. I think that (1) is a reasonable approach. Using proxy objects seems like overkill. I would rather see a crash on a NULL pointer than use a proxy object.

Such long chains of delegation should not happen if you follow the Law of Demeter. I've often argued with some of its proponents that they were holding themselves to it too conscientiously, but if you have come to the point of wondering how best to handle long delegation chains, you should probably be a little more compliant with its recommendations.

Interesting question, I think this is open to interpretation, but:
My Two Cents
Design patterns are just reusable solutions to common problems which are generic enough to be widely applied in object oriented (usually) programming. Many common patterns will start you out with interfaces, inheritance chains, and/or containment relationships that will result in you using chaining to call things to some extent. The patterns are not trying to solve a programming issue like this though - chaining is just a side effect of them solving the functional problems at hand. So, I wouldn't really consider it a pattern.
Equally, anti-patterns are approaches that (in my mind) counter-act the purpose of design patterns. For example, design patterns are all about structure and the adaptability of your code. People consider a singleton an anti-pattern because it (often, not always) results in spider-web like code due to the fact that it inherently creates a global, and when you have many, your design deteriorates fast.
So, again, your chaining problem doesn't necessarily indicate good or bad design - it's not related to the functional objectives of patterns or the drawbacks of anti-patterns. Some designs just have a lot of nested objects even when designed well.
What to do about it:
Long delegation chains can definitely be a pain in the butt after a while, and as long as your design dictates that the pointers in those chains won't be reassigned, I think saving a temporary pointer to the point in the chain you're interested in is completely fine (function scope or less preferably).
Personally though, I'm against saving a permanent pointer to a part of the chain as a class member as I've seen that end up in people having 30 pointers to sub objects permanently stored, and you lose all conception of how the objects are laid out in the pattern or architecture you're working with.
One other thought - I'm not sure if I like this or not, but I've seen some people create a private (for your sanity) function that navigates the chain so you can recall that and not deal with issues about whether or not your pointer changes under the covers, or whether or not you have nulls. It can be nice to wrap all that logic up once, put a nice comment at the top of the function stating which part of the chain it gets the pointer from, and then just use the function result directly in your code instead of using your delegation chain each time.
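Here is roughly what that private navigation function could look like. The types clCanvas, clRenderer and clEnvironment below are simplified stand-ins invented for this sketch (the question does not show their definitions), and Hud is a hypothetical consumer.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's types.
struct clCanvas {
    std::vector<std::string> lines;
    void TextStr(const std::string& s) { lines.push_back(s); }
};
struct clRenderer {
    clCanvas* canvas = nullptr;
    clCanvas* GetCanvas() { return canvas; }
};
struct clEnvironment {
    clRenderer* Renderer = nullptr;
};

class Hud {
public:
    explicit Hud(clEnvironment* env) : Env(env) {}

    // Returns false if any link of the chain was missing.
    bool DrawStatus(const std::string& text) {
        clCanvas* C = Canvas();
        if (!C) return false;
        C->TextStr(text);
        return true;
    }

private:
    // Walks Env -> Renderer -> canvas in one documented place.
    // Returns nullptr if any link is missing.
    clCanvas* Canvas() const {
        if (!Env || !Env->Renderer) return nullptr;
        return Env->Renderer->GetCanvas();
    }

    clEnvironment* Env;
};
```

All the NULL handling lives in Canvas(), so callers inside the class only deal with one pointer and one check.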
Performance
My last note would be that this wrap-in-function approach as well as your delegation chain approach both suffer from performance drawbacks. Saving a temporary pointer lets you avoid the extra two dereferences potentially many times if you're using these objects in a loop. Equally, storing the pointer from the function call will avoid the over head of an extra function call every loop cycle.

For bool Exists = Env->FileSystem->FileExists( "foo.txt" ); I'd rather go for an even more detailed breakdown of your chain, so in my ideal world, there are the following lines of code:
Environment* env = GetEnv();
FileSystem* fs = env->FileSystem;
bool exists = fs->FileExists( "foo.txt" );
and why? Some reasons:
readability: my attention gets lost till I have to read to the end of the line in case of bool Exists = Env->FileSystem->FileExists( "foo.txt" ); It's just too long for me.
validity: regardless of your saying that the objects are always there, if your company hires a new programmer tomorrow and he starts writing code, the day after tomorrow the objects might not be there. These long lines are pretty unfriendly; new people might get scared of them and do something interesting such as optimising them... which will take a more experienced programmer extra time to fix.
debugging: if by any chance (and after you have hired the new programmer) the application throws a segmentation fault somewhere in the long chain, it is pretty difficult to find out which object was the guilty one. The more detailed the breakdown, the easier it is to find the location of the bug.
speed: if you need to do lots of calls for getting the same chain elements, it might be faster to "pull out" a local variable from the chain instead of calling a "proper" getter function for it. I don't know if your code is production or not, but it seems to miss the "proper" getter function, instead it seems to use only the attribute.

Long delegation chains are a bit of a design smell to me.
What a delegation chain tells me is that one piece of code has deep access to an unrelated piece of code, which makes me think of high coupling, which goes against the SOLID design principles.
The main problem I have with this is maintainability. If you're reaching two levels deep, that is two independent pieces of code that could evolve on their own and break under you. This quickly compounds when you have functions inside the chain, because they can contain chains of their own - for example, Renderer->GetCanvas() could be choosing the canvas based on information from another hierarchy of objects and it is difficult to enforce a code path that does not end up reaching deep into objects over the life time of the code base.
The better way would be to create an architecture that obeyed the SOLID principles and used techniques like Dependency Injection and Inversion Of Control to guarantee your objects always have access to what they need to perform their duties. Such an approach also lends itself well to automated and unit testing.
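As a small illustration of the dependency-injection idea (Canvas and ScoreLabel are hypothetical names, not types from the question): the object that needs a canvas is given one at construction time instead of digging it out of Env->Renderer.

```cpp
#include <string>
#include <vector>

// Hypothetical drawing surface that just records what was drawn.
struct Canvas {
    std::vector<std::string> drawn;
    void TextStr(const std::string& s) { drawn.push_back(s); }
};

// The label depends only on the canvas it was handed. It knows nothing about
// an environment or a renderer, so it cannot break when those change.
class ScoreLabel {
public:
    explicit ScoreLabel(Canvas& canvas) : canvas_(canvas) {}
    void Draw(int score) { canvas_.TextStr("Score: " + std::to_string(score)); }
private:
    Canvas& canvas_;
};
```

Whoever assembles the application resolves the chain once and passes the result in; a unit test can pass a plain Canvas and inspect what ended up in drawn.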
Just my 2 cents.

If possible, I would use references instead of pointers, so that delegates are guaranteed either to return valid objects or to throw an exception.
clCanvas & C = Env.Renderer().GetCanvas();
For objects which may not exist, I would provide additional methods such as has, is, etc.
if ( Env.HasRenderer() ) { clCanvas & C = Env.Renderer().GetCanvas(); /* use C */ }

If you can guarantee that all the objects exist, I don't really see a problem in what you're doing. As others have mentioned, even if you think that NULL will never happen, it may just happen anyway.
This being said, I see that you use bare pointers everywhere. What I would suggest is that you start using smart pointers instead. Be aware that the standard smart pointers do not check for NULL in the -> operator: dereferencing an empty std::shared_ptr or std::unique_ptr is undefined behavior, just as with a bare pointer, although some implementations (and Boost's smart pointers) assert on it in debug builds. What you do gain is ownership: if you use shared pointers, you can keep copies and the objects don't just disappear under your feet. Every copy has to be reset or destroyed before the object goes away.
This being said, it doesn't remove the need to check for NULL wherever a pointer may legitimately be empty.
Otherwise I would rather use the approach proposed by AProgrammer. If object A needs a pointer to object C pointed to by object B, then the work that object A is doing is probably something that object B should actually be doing. So A can guarantee that it has a pointer to B at all times (because it holds a shared pointer to B, which therefore cannot go NULL) and thus it can always call a function on B to do action Z on object C. In function Z, B knows whether it always has a pointer to C or not. That's part of B's implementation.
Note that since C++11 you have std::shared_ptr<> and std::unique_ptr<> in <memory>, so use them!
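A minimal sketch of the idea using std::shared_ptr from <memory>. Canvas, Renderer and drawIfPresent are hypothetical names for illustration; the question's real classes are not shown.

```cpp
#include <memory>

struct Canvas {
    int calls = 0;
    void TextStr() { ++calls; }
};

struct Renderer {
    std::shared_ptr<Canvas> canvas;  // may be empty
    std::shared_ptr<Canvas> GetCanvas() const { return canvas; }
};

// Takes its own copy of the pointer, checks it, and only then uses it.
// While 'c' is alive the canvas cannot be destroyed by anyone else.
bool drawIfPresent(const Renderer& r) {
    if (std::shared_ptr<Canvas> c = r.GetCanvas()) {
        c->TextStr();
        return true;
    }
    return false;
}
```

The explicit check is still there, because an empty shared_ptr is not safe to dereference. The benefit is that a copy held by the caller keeps the object alive even if the renderer resets its own pointer in the meantime.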

external side effect in constructor

Look at this code:
#include <framework_i_hate.h>
int main() {
    XFile file("./my_file.xxxx", "create");
    XObject object("my_object");
    // modify the object
    object.Write();
}
Try to guess where object will be saved... yes, you guessed it. I think this is too magic; I'd like to write something like object.Save(file), but it's not necessary. Obviously there is a global variable inside framework_i_hate.h that is modified by the file constructor. What do you think about this side effect inside a constructor?
How can this behavior be hidden?
Bonus points to whoever guesses the framework.

A very muddled, hard-to-understand post, and the question at the end is very rhetorical.
Global variables are evil, what else to add?

What can be said about this that isn't already obvious enough: It's a fairly nasty side effect, because:
the resulting behaviour is unexpected and not predictable at all, unless you know that framework well.
global program state usually isn't a good idea in object-oriented programming. (In fact, global state is probably never a good idea, so best avoid it if you can.)
it's fairly likely that this framework isn't thread-safe, either. (Think about what happens when two concurrent threads both create an XFile object each, and one of these threads then writes an XObject... where will it end up being saved?)
While this thread is tagged C++ and not about .NET, I have seen this "anti-pattern" before in a less severe and much more sane form, namely with DB transaction scopes.

I'd prefer to see the relationship between XFile and XObject be explicit, I agree that this "magic" is too well hidden. I also question why the object is given a name, unless there are other parts of the API where the name is significant.
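For comparison, a sketch of what an explicit relationship could look like. File and Object here are invented stand-ins, not the framework's real XFile/XObject API, and the "file" only records writes in memory so the example has no real disk I/O.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the framework's file type.
class File {
public:
    explicit File(std::string path) : path_(std::move(path)) {}
    void Append(const std::string& record) { records_.push_back(record); }
    const std::string& Path() const { return path_; }
    const std::vector<std::string>& Records() const { return records_; }
private:
    std::string path_;
    std::vector<std::string> records_;
};

// Hypothetical stand-in for the framework's object type.
class Object {
public:
    explicit Object(std::string name) : name_(std::move(name)) {}
    // The destination is an explicit argument, so constructing a File
    // has no hidden effect on where objects end up.
    void Save(File& file) const { file.Append(name_); }
private:
    std::string name_;
};
```

With two files open, the call site itself says which one receives the object, and no global "current file" has to exist.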
Global variables are despised for many reasons, this is just one example.

Initializing a program using a Singleton

I have read multiple articles about why singletons are bad.
I know it has a few uses like logging, but what about initializing and deinitializing?
Are there any problems doing that?
I have a scripting engine that I need to bind on startup to a library.
Libraries don't have main() so what should I use?
Regular functions or a Singleton.
Can this object be copied somehow:
class
{
public:
    static void initialize();
    static void deinitialize();
} bootstrap;
If not why do people hide the copy ctor, assignment operator and the ctor?

Libraries in C++ have a much simpler way to perform initialization and cleanup. It's the exact same way you'd do it for anything else. RAII.
Wrap everything that needs to be initialized in a class, and perform its initialization in the constructor. Voila, problems solved.
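A minimal sketch of such a wrapper. The engine namespace below is a made-up stand-in for whatever initialize/deinitialize calls the scripting engine binding actually needs, and the live counter exists only so the effect can be observed.

```cpp
// Hypothetical C-style entry points standing in for the real binding calls.
namespace engine {
    inline int& liveCount() { static int n = 0; return n; }
    inline void initialize()   { ++liveCount(); }
    inline void deinitialize() { --liveCount(); }
}

// RAII wrapper: construction initializes, destruction cleans up.
class EngineSession {
public:
    EngineSession()  { engine::initialize(); }
    ~EngineSession() { engine::deinitialize(); }

    // Copying would lead to deinitialize() running twice, so forbid it.
    EngineSession(const EngineSession&) = delete;
    EngineSession& operator=(const EngineSession&) = delete;
};
```

The user of the library creates an EngineSession wherever the engine is needed (typically near the top of main), and cleanup happens automatically when it goes out of scope, including when an exception propagates. A unit test can create and destroy its own session to start from a clean state.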
All the usual problems with singletons still apply:
You are going to need more than one instance, even if you hadn't planned for it. If nothing else, you'll want it when unit-testing. Each test should initialize the library from scratch so that it runs in a clean environment. That's hard to do with a singleton approach.
You're screwed as soon as these singletons start referencing each other. Because the actual initialization order isn't visible, you quickly end up with a bunch of circular references resulting in accessing uninitialized singletons or stack overflows or deadlocks or other fun errors which could have been caught at compile-time if you hadn't been obsessed with making everything global.
Multithreading. It's usually a bad idea to force all threads to share the same instance of a class, because it forces that class to lock and synchronize everything, which costs a lot of performance, and may lead to deadlocks.
Spaghetti code. You're hiding your code's dependencies every time you use a singleton or a global. It is no longer clear which objects a function depends on, because not all of them are visible as parameters. And because you don't need to add them as parameters, you easily end up adding far more dependencies than necessary. Which is why singletons are almost impossible to remove once you have them.

A singleton's purpose is to have only ONE instance of a certain class in your system.
The C'tor, D'tor and CC'tor are hidden, in order to have a single access point for receiving the only existing instance.
Usually the instance is static (could be allocated on the heap too) and private, and there's a static method (usually called GetInstance) which returns a reference to this instance.
The question you should ask yourself when deciding whether to have a singleton is : Do I really need to enforce having one object of this class?
There's also the inheritance problem - it can make things complicated if you are planning to inherit from a singleton.
Another problem is How to kill a singleton (the web is filled with articles about this issue)
In some cases it's better to have your private data held statically rather than having a singleton, all depends on the domain.
Note though, that if you're multi-threaded, static variables can give you a pain in the XXX...
So you should analyse your problem carefully before deciding on the design pattern you're going to use...
In your case, I don't think you need a singleton because you want the libraries to be initialized at the beginning, but it has nothing to do with enforcing having only one instance of your class. You could just hold a static flag (static bool Initialized) if all you want is to ensure initializing it only once.
Calling a method once is not reason enough to have a singleton.

It's a good practice to provide an interface for your libraries so that multiple modules (or threads) can use them simultaneously. If you really need to run some code when modules are loaded then use singletons to init parts that must be init once.

count the number of singletons in your design, and call this number 's'
count the number of threads in your design, and call this number 't'
now, raise t to the s-th power; this is roughly the number of hairs you are likely to lose while debugging the resulting code.
(I personally have run afoul of code that has over 50 singletons with 10 different threads all racing to get to .getInstance() first)
