I have started a migration of a high energy physics algorithm written in FORTRAN to an object oriented approach in C++. The FORTRAN code uses a lot of global variables all across a lot of functions.
I have simplified the global variables into a set of input variables, and a set of invariants (variables calculated once at the beginning of the algorithm and then used by all the functions).
Also, I have divided the full algorithm into three logical steps, represented by three different classes. So, in a very simple way, I have something like this:
double calculateFactor(double x, double y, double z)
{
    InvariantsTypeA invA;
    InvariantsTypeB invB;
    // they need x, y and z
    invA.CalculateValues();
    invB.CalculateValues();
    Step1 s1;
    Step2 s2;
    Step3 s3;
    // they need x, y, z, invA and invB
    return s1.Eval() + s2.Eval() + s3.Eval();
}
My problem is:
for doing the calculations all the InvariantsTypeX and StepX objects need the input parameters (and these are not just three).
the three objects s1, s2 and s3 need the data of the invA and invB objects.
all the classes use several other classes through composition to do their job, and all those classes also need the input and the invariants (for example, s1 has a member object theta of class ThetaMatrix that needs x, z and invB to get constructed).
I cannot rewrite the algorithm to reduce the global values, because it follows several high energy physics formulas, and those formulas are just like that.
Is there a good pattern to share the input parameters and the invariants to all the objects used to calculate the result?
Should I use singletons? (but the calculateFactor function is evaluated around a million times)
Or should I pass all the required data as arguments to the objects when they are created? (but if I do that, the data will be passed everywhere, to every member object of every class, creating a mess)
Thanks.
Well, in C++ the most suitable solution, given your constraints, is to use pointers. Many developers suggest boost::shared_ptr. It is not necessary here, although it does give you safer lifetime management than a raw pointer.
You do not have to depend on Boost. Its smart pointers are header-only, and shared_ptr has since been adopted into the standard library (std::shared_ptr in C++11), but if you do not want an external library you can do without it.
So let's go and try to solve your problem using just C++ and what it provides actually.
You probably have a main function where, as you said, you initialize all the invariants, so you basically have constants, and they can be of any type. You do not have to declare them const if you do not want to. In main you instantiate the invariants and hand a pointer to them to every component that needs them. First, in a separate file called "common_components.hpp", consider the following (I assume that you need some types for your invariant variables):
typedef struct {
Type1 invariant_var1;
Type2 invariant_var2;
...
TypeN invariant_varN;
} InvariantType; // Contains the variables I need, it is a type, instantiating it will generate a set of global variables.
typedef InvariantType* InvariantPtr; // Will point to a set of invariants
In your "main.cpp" file you'll have:
#include "common_components.hpp"
// Functions declaration
int main(int, char**);
MyType1 CalculateValues1(InvariantPtr); /* Your functions take the pointer to the globals as an input parameter */
MyType2 CalculateValues2(InvariantPtr); /* Your functions take the pointer to the globals as an input parameter */
...
MyTypeN CalculateValuesN(InvariantPtr); /* Your functions take the pointer to the globals as an input parameter */
// Main implementation
int main(int argc, char** argv) {
InvariantType invariants = {
value1,
value2,
...
valueN
}; // Instantiating all invariants I need.
InvariantPtr global = &invariants;
// Now I have my variable global being a pointer to global.
// Here I have to call the functions
CalculateValues1(global);
CalculateValues2(global);
...
CalculateValuesN(global);
}
If you have functions that return or use the global variables, change their interfaces to take the pointer to the struct. That way, every change to the struct is visible to everything that uses those variables.
Why not pass the invariants as a function parameter, or to the constructor of the class that has the calculateFactor method?
Also try to gather parameters together if you have too many params for a single function (for instance, instead of (x, y, z) pass a 3D point, you have then only 1 parameter instead of 3).
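A minimal sketch of that idea, applied to the nested-composition worry in the question: inputs and invariants are gathered into one object, and that object is passed by const reference down through the member objects. The names Inputs, Invariants and Context, and every formula in it, are placeholders invented for illustration; only calculateFactor, Step1, ThetaMatrix and Eval come from the question.

```cpp
// Hypothetical grouping of the input parameters (instead of loose x, y, z).
struct Inputs {
    double x, y, z;
};

// Hypothetical invariants, computed once from the inputs.
struct Invariants {
    double a;
    double b;
};

// One object that carries everything the steps need.
struct Context {
    Inputs in;
    Invariants inv;
};

// A member object that needs x, z and one invariant to get constructed.
class ThetaMatrix {
public:
    explicit ThetaMatrix(const Context& ctx)
        : value_(ctx.in.x * ctx.in.z + ctx.inv.b) {}  // placeholder formula
    double value() const { return value_; }
private:
    double value_;
};

// A step forwards the same context to the objects it is composed of.
class Step1 {
public:
    explicit Step1(const Context& ctx) : theta_(ctx), y_(ctx.in.y) {}
    double Eval() const { return theta_.value() + y_; }  // placeholder formula
private:
    ThetaMatrix theta_;
    double y_;
};

double calculateFactor(double x, double y, double z) {
    // Placeholder invariants: a = x + y, b = y * z.
    Context ctx{{x, y, z}, {x + y, y * z}};
    return Step1(ctx).Eval();
}
```

Every constructor takes exactly one argument no matter how many inputs and invariants exist, so adding a new invariant changes Context and the code that uses it, not every signature in between.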
three logical steps, represented by three different classes
This may not have been the best approach.
A single class can have a large number of "global" variables, shared by all methods of the class.
What I've done when converting old codes (C or Fortran) to new OO structures is to try to create a single class which represents a more complete "thing".
In some cases, well-structured FORTRAN would use "Named COMMON Blocks" to cluster things into meaningful groups. This is a hint as to what the "thing" really was.
Also, FORTRAN will have lots of parallel arrays which aren't really separate things, they're separate attributes of a common thing.
DOUBLE PRECISION X(200)
DOUBLE PRECISION Y(200)
is really a small class with two attributes that you would put into a collection.
Finally, you can easily create large classes with nothing but data, separate from the class that contains the functions that do the work. This is kind of creepy, but it allows you to finesse the common issue by translating a COMMON block into a class and simply passing an instance of that class to every function that uses the COMMON.
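As a rough sketch of both translations described here (parallel arrays into a small class, and a COMMON block into a data-only class that gets passed around), something like the following could work. Point, KinematicsBlock, beamEnergy and sumOfProducts are made-up names, and the arithmetic is only a stand-in for a real routine.

```cpp
#include <vector>

// Replaces the parallel arrays X(200) and Y(200): one element, two attributes.
struct Point {
    double x;
    double y;
};

// Hypothetical stand-in for one named COMMON block: data only, no behaviour.
struct KinematicsBlock {
    double beamEnergy;
    std::vector<Point> points;
};

// A routine that used to read the COMMON block now receives it explicitly.
double sumOfProducts(const KinematicsBlock& common) {
    double total = 0.0;
    for (const Point& p : common.points) {
        total += p.x * p.y;
    }
    return total * common.beamEnergy;
}
```

Passing the block by const reference also documents that the routine only reads it, which the original COMMON declaration could not express.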
There is a very simple template class to share data between objects in C++ and it is called shared_ptr. It is in the new STL and in boost.
If two objects both have a shared_ptr to the same object they get shared access to whatever data it holds.
In your particular case you probably don't want this but want a simple class that holds the data.
class FactorCalculator
{
InvariantsType invA;
InvariantsType invB;
public:
FactorCalculator() // calculate the invariants once per calculator
{
invA.CalculateValues();
invB.CalculateValues();
}
// call multiple times with different values of x, y, z
double calculateFactor( double x, double y, double z ) /*const*/
{
// calculate using pre-calculated values in invA and invB
}
};
Instead of passing each parameter individually, create another class to store them all and pass an instance of that class:
// Before
void f1(int a, int b, int c) {
cout << a << b << c << endl;
}
// After
void f2(const HighEnergyParams& x) {
cout << x.a << x.b << x.c << endl;
}
First point: globals aren't nearly as bad (in themselves) as many (most?) programmers claim. In fact, in themselves, they aren't really bad at all. They're primarily a symptom of other problems, primarily 1) logically separate pieces of code that have been unnecessarily intermixed, and 2) code that has unnecessary data dependencies.
In your case, it sounds like you've already eliminated (or at least minimized) the real problems (being invariants, not really variables, eliminates one major source of problems all by itself). You've already stated that you can't eliminate the data dependencies, and you've apparently un-mingled the code to the point that you have at least two distinct sets of invariants. Without seeing the code, that may be coarser granularity than really needed, and maybe upon closer inspection, some of those dependencies can be eliminated completely.
If you can reduce or eliminate the dependencies, that's a worthwhile pursuit -- but eliminating the globals, in itself, is rarely worthwhile or useful. In fact, I'd say within the last decade or so, I've seen fewer problems caused by globals, than by people who didn't really understand their problems attempting to eliminate what were (or should have been) perfectly fine as globals.
Given that they are intended to be invariant, what you probably should do is enforce that explicitly. For example, have a factory class (or function) that creates an invariant class. The invariant class makes the factory its friend, but that's the only way members of the invariant class can change. The factory class, in turn, has (for example) a static bool, and executes an assert if you attempt to run it more than once. This gives (a reasonable level of) assurance that the invariants really are invariant (yes, a reinterpret_cast will let you modify the data anyway, but not by accident).
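One way the factory arrangement described above might look is sketched below. Invariants, InvariantsFactory, alpha, beta and the two formulas are invented for illustration, and the assert disappears in builds that define NDEBUG, so treat it as a development-time guard only.

```cpp
#include <cassert>

class Invariants {
public:
    // Read-only access for everyone.
    double alpha() const { return alpha_; }
    double beta() const { return beta_; }

private:
    // Only the factory can construct an instance with chosen values.
    friend class InvariantsFactory;
    Invariants(double a, double b) : alpha_(a), beta_(b) {}

    double alpha_;
    double beta_;
};

class InvariantsFactory {
public:
    static Invariants create(double x, double y) {
        // Static flag: trips the assert if the factory is run a second time.
        static bool alreadyRun = false;
        assert(!alreadyRun && "invariants may only be created once");
        alreadyRun = true;
        return Invariants(x + y, x * y);  // placeholder formulas
    }
};
```

Client code can copy and read an Invariants object, but the only route to one with new values goes through the factory, so an accidental write does not compile.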
The one real question I'd have is whether there's a real point in separating your invariants into two "chunks" if all the calculations really depend on both. If there's a clear, logical separation between the two, that's great (even if they do get used together). If you have what's logically a single block of data, however, trying to break it into pieces may be counterproductive.
Bottom line: globals are (at worst) a symptom, not a disease. Insisting that you're going to get the patient's temperature down to 98.6 degrees may be counterproductive -- especially if the patient is an animal whose normal body temperature is actually 102 degrees.
Uhm. C++ is not necessarily object oriented. It is the GTA of programming! You are free to be an object-obsessed freak, a relaxed C programmer, a functional programmer, whatever; a mixed martial artist.
My point: if global variables worked in your Fortran build, just carry them over to C++. No need to avoid global variables. It follows the principle of "don't touch legacy code".
Let's understand why global variables may cause problems. As you know, variables are the program's state, and state is the soul of the program. Bad or invalid state causes runtime and logic errors. The problem with global state is that any part of the code has access to it; so when the state is invalid, there are many culprits to consider, meaning functions and operators. However, this only applies if many functions really do use your global variables. If you are the only one working on the program, that is manageable. Global variables are mostly a real problem in a team project, where many people have access to them and write different functions that may or may not touch those variables.
This is a followup question on my previous question:
Initialize const members using complex function in C++ class
In short, I have a program that has a class Grid that contains the properties of a 3D grid. I would like the properties of this grid to be read-only after creation, such that complex functions within the class cannot accidentally mess the grid up (such as if(bla = 10), instead of if(bla == 10)) etc. Now, this question has been answered well in the previous discussion: calling an initializer list via a create function.
Here comes my new problem. My Grid has many properties that just plainly describe the grid (number of grid points, coordinates at grid points etc.) for which it just does not make sense to redistribute them among different objects. Still, basic textbooks in C++ always link functions with a large number of parameters to bad design, but I need them in order to be able to have const member variables.
Are there any standard techniques to deal with such problems?
The answer depends on what you're trying to protect.
1. If you're trying to assure that users of the class can't inadvertently alter the critical parameters, then the way to do that is to declare these members as private or protected and only provide const getters if they're needed at all outside the class implementation.
2. If you're trying to assure that the implementer of the Grid class doesn't alter these values, then there are a few ways to do so. One simple way is to create a subclass that contains just those parameters, and then the answer looks just like 1. Another way is to declare them const, in which case they must be initialized when a Grid instance is constructed.
If the answer is 2, then there are also some other things that one can do to prevent inadvertently altering critical values. During the time that you're writing and testing the class implementation, you could temporarily use fixed dummy const values for the critical parameters, assuring that the other functions you write cannot alter those values.
One more trick to avoid specifically the if (i=7) ... error when you meant to write if (i == 7) ... is to always put the constant first. That is, write if (7 == i) .... Also, any decent compiler should be able to flag a warning for this kind of error -- make sure you're taking advantage of that feature by turning on all of the warning and error reporting your compiler provides.
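To illustrate the const-member option without a long constructor parameter list, here is a small sketch that bundles the construction parameters into one struct. GridParams, its fields and makeCoords are assumptions made up for this example, not part of the original Grid class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bundle of everything needed to build a grid.
struct GridParams {
    std::size_t nx;
    std::size_t ny;
    std::size_t nz;
    double spacing;
};

class Grid {
public:
    // A single argument, yet every member can still be const.
    explicit Grid(const GridParams& p)
        : nx_(p.nx), ny_(p.ny), nz_(p.nz), spacing_(p.spacing),
          xCoords_(makeCoords(p.nx, p.spacing)) {}

    std::size_t pointCount() const { return nx_ * ny_ * nz_; }
    double spacing() const { return spacing_; }
    double x(std::size_t i) const { return xCoords_[i]; }

private:
    // A static helper does the "complex" initialization for a const member.
    static std::vector<double> makeCoords(std::size_t n, double spacing) {
        std::vector<double> coords(n);
        for (std::size_t i = 0; i < n; ++i) {
            coords[i] = i * spacing;
        }
        return coords;
    }

    const std::size_t nx_, ny_, nz_;
    const double spacing_;
    const std::vector<double> xCoords_;
};
```

With the members const, a slip such as if (nx_ = 10) inside a member function is rejected by the compiler instead of silently corrupting the grid.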
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way. Since the algorithm is complex, I break it down into several functions.
Now, I actually do not see how this might be a class from an idiomatic way; I mean, I am just used to have algorithms as functions. The usage would simply be:
Calculation calc(/* several parameters */);
calc.calculate();
// get the heterogeneous results via getters
On the other hand, putting this into a class has the following advantages:
I do not have to pass all the variables to the other functions/methods
arrays initialized at the beginning of the algorithm are accessible throughout the class in each function
my code is shorter and (imo) clearer
A hybrid way would be to put the algorithm class into a source file and access it via a function that uses it. The user of the algorithm would not see the class.
Does anyone have valuable thoughts that might help me out?
Thank you very much in advance!
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way.[...]
Now, I actually do not see how this might be a class from an idiomatic way
It is not, but many people do the same thing you do (so did I a few times).
Instead of creating a class for your algorithm, consider transforming your inputs and outputs into classes/structures.
That is, instead of:
Calculation calc(a, b, c, d, e, f, g);
calc.calculate();
// use getters on calc from here on
you could write:
CalcInputs inputs(a, b, c, d, e, f, g);
CalcResult output = calculate(inputs); // calculate is now free function
// use getters on output from here on
This doesn't create any problems and performs the same (actually better) grouping of data.
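Filled in with placeholder content, that shape might look roughly like this. The fields of CalcInputs and CalcResult and the arithmetic are invented; plain public members are used here instead of the getters mentioned above, purely to keep the sketch short.

```cpp
#include <vector>

// Hypothetical inputs, grouped into one value.
struct CalcInputs {
    double scale;
    std::vector<double> samples;
};

// Hypothetical heterogeneous results, grouped into one value.
struct CalcResult {
    double sum;
    double scaledMax;
};

// The algorithm itself stays a free function: inputs in, results out.
CalcResult calculate(const CalcInputs& in) {
    CalcResult out{0.0, 0.0};
    for (double s : in.samples) {
        out.sum += s;
        if (s * in.scale > out.scaledMax) {
            out.scaledMax = s * in.scale;
        }
    }
    return out;
}
```

The caller never holds a half-finished calculation object: it either has the inputs or the complete result.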
I'd say it is very idiomatic to represent an algorithm (or perhaps better, a computation) as a class. One of the definitions of object class from OOP is "data and functions to operate on that data." A complex algorithm with its inputs, outputs and intermediary data matches this definition perfectly.
I've done this myself several times, and it simplifies (human) code flow analysis significantly, making the whole thing easier to reason about, to debug and to test.
If the abstraction for the client code is an algorithm, you
probably want to keep a pure functional interface, and not
introduce additional types there. It's quite common, on the
other hand, for such a function to be implemented in a source
file which defines a common data structure or class for its
internal use, so you might have:
double calculation( /* input parameters */ )
{
SupportClass calc( /* input parameters */ );
calc.part1();
calc.part2();
// etc...
return calc.results();
}
Depending on how your code is organized, SupportClass will be
in an unnamed namespace in the source file (probably the most
common case), or in a "private" header, included only by the
sources involved in the algorithm.
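A concrete, if simplified, version of the unnamed-namespace layout could look like the sketch below. The body of SupportClass (computing a mean in two steps) is only a stand-in for a real algorithm, and the choice of a std::vector<double> as input is an assumption.

```cpp
#include <vector>

namespace {  // unnamed namespace: SupportClass is invisible outside this file

class SupportClass {
public:
    explicit SupportClass(const std::vector<double>& input)
        : input_(input), sum_(0.0), result_(0.0) {}

    // First stage: accumulate shared intermediate state.
    void part1() {
        for (double v : input_) {
            sum_ += v;
        }
    }

    // Second stage: uses the state left behind by part1().
    void part2() {
        result_ = input_.empty() ? 0.0 : sum_ / input_.size();
    }

    double results() const { return result_; }

private:
    const std::vector<double>& input_;  // valid for the duration of calculation()
    double sum_;
    double result_;
};

}  // namespace

// The only symbol client code sees: a plain function.
double calculation(const std::vector<double>& input) {
    SupportClass calc(input);
    calc.part1();
    calc.part2();
    return calc.results();
}
```

Only the declaration of calculation would go into the public header, so the helper class can change freely without touching callers.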
It really depends of what kind of algorithm you want to encapsulate. Generally I agree with John Carmack : "Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."
It really boils down to: does the algorithm need access to the private area of the class that is not supposed to be public? If the answer is yes (unless you are willing to refactor your class interface, depending on the specific cases) you should go with a member function, if not, then a free function is good enough.
Take for example the standard library. Most of the algorithms are provided as free functions because they only access the public interface of the class (with iterators for standard containers, for example).
Do you need to call the exact same functions in the exact same order each time? Then you shouldn't be requiring calling code to do this. Splitting your algorithm into multiple functions is fine, but I'd still have one call the next and then the next and so on, with a struct of results/parameters being passed along the way. A class doesn't feel right for a one-off invocation of some procedure.
The only way I'd do this with a class is if the class encapsulates all the input data itself, and you then call myClass.nameOfMyAlgorithm() on it, among other potential operations. Then you have data+manipulators. But just manipulators? Yeah, I'm not so sure.
In modern C++ the distinction has been eroded quite a bit. Even from the operator overloading of the pre-ANSI language, you could create a class whose instances are syntactically like functions:
struct Multiplier
{
    int factor_;
    Multiplier(int f) : factor_(f) { }
    int operator()(int v) const
    {
        return v * factor_;
    }
};
Multiplier doubler(2);
std::cout << doubler(3) << std::endl; // prints 6
Such a class/struct is called a functor, and can capture "contextual" values in its constructor. This allows you to effectively pass the parameters to a function in two stages: some in the constructor call, some later each time you call it for real. This is called partial function application.
To relate this to your example, your calculate member function could be turned into operator(), and then the Calculation instance would be a function! (or near enough.)
To unify these ideas, you can try thinking of a plain function as a functor of which there is only one instance (and hence no need for a constructor - although this is no guarantee that the function only depends on its formal parameters: it might depend on global variables...)
Rather than asking "Should I put this algorithm in a function or a class?" instead ask yourself "Would it be useful to be able to pass the parameters to this algorithm in two or more stages?" In your example, all the parameters go into the constructor, and none in the later call to calculate, so it makes little sense to ask users of your class to make two calls.
In C++11 the distinction breaks down further (and things get a lot more convenient), in recognition of the fluidity of these ideas:
auto doubler = [] (int val) { return val * 2; };
std::cout << doubler(3) << std::endl; // prints 6
Here, doubler is a lambda, which is essentially a nifty way to declare an instance of a compiler-generated class that implements the () operator.
Reproducing the original example more exactly, we would want a function-like thing called multiplier that accepts a factor, and returns another function-like thing that accepts a value v and returns v * factor.
auto multiplier = [] (int factor)
{
return [=] (int v) { return v * factor; };
};
auto doubler = multiplier(2);
std::cout << doubler(3) << std::endl; // prints 6
Note the pattern: ultimately we're multiplying two numbers, but we specify the numbers in two steps. The functor we get back from calling multiplier acts like a "package" containing the first number.
Although lambdas are relatively new, they are likely to become a very common part of C++ style (as they have in every other language they've been added to).
But sadly at this point we've reached the "cutting edge" as the above example works in GCC but not in MSVC 12 (I haven't tried it in MSVC 13). It does pass the intellisense checking of MSVC 12 though (they use two completely different compilers)! And you can fix it by wrapping the inner lambda with std::function<int(int)>( ... ).
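For reference, the std::function workaround mentioned here would look something like this. Whether a particular MSVC version actually needs it is not checked here; the sketch only shows the shape of the wrapped version, which is standard C++11.

```cpp
#include <functional>

// Same two-stage multiplier as before, but the inner lambda is wrapped in
// std::function<int(int)>, giving the outer lambda a named return type.
auto multiplier = [](int factor) {
    return std::function<int(int)>([=](int v) { return v * factor; });
};
```

The wrapper adds type erasure, so calls through it are typically indirect; for a hot inner loop the hand-written functor or the bare lambda may be the better fit.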
Even so, you can use these ideas in old-school C++ when writing functors by hand.
Looking further ahead, resumable functions may make it into some future version of the language (Microsoft is pushing hard for them as they are practically identical to async/await in C#) and that is yet another blurring of the distinction between functions and classes (a resumable function acts like a constructor for a state machine class).
I have a method FOO() in class A that takes as its arguments input from data members of class B among other things (let's say they are two floats and one int). The way I understand this, it is generally better to implement this with something like:
A->FOO1(B, other_data_x)
rather than
A->FOO2(B.member1, B.member2, B.member3, other_data_x).
I gather one, but not the only, advantage of this is that it leaves the detail of which members of B to access up to FOO1() and so helps hide implementation details.
But what I wonder about is whether this actually introduces additional coupling between classes A and B. Class A in the former case has to know class B exists (via something like include class_B_header.h), and if the members of B change or are moved to a different class or class B is eliminated altogether, you have to modify A and FOO1() accordingly. By contrast, in the latter approach, FOO2() doesn't care whether class B exists, in fact all it cares about is that it is supplied with two floats (which in this case consist of B.member1 and B.member2) and an int (B.member3). There is, to be sure, coupling in the latter example as well, but this coupling is handled by wherever FOO2() gets called or whatever class happens to be calling FOO2(), rather than in the definition of A or B.
I guess a second part of this question is, is there a good way to decouple A and B further when we want to implement a solution like FOO1()?
But what I wonder about is whether this actually introduces additional
coupling between class A and B.
Yes, it does. A and B are now tightly coupled.
You seem to be under the impression that it is generally accepted that one should pass objects rather than members of those objects. I'm not sure how you got this impression, but this is not the case. Whether you should send an object or members of that object depends entirely on what you're trying to do.
In some cases it is necessary and desirable to have a tightly coupled dependency between two entities, and in other cases it is not. If there is a general rule of thumb that applies here, I would say if anything it is the opposite of what you have suggested:
Eliminate dependencies wherever possible, but nowhere else.
There's not really one that's universally right, and another that's universally wrong. It's basically a question of which reflects your real intent better (which is impossible to guess with metasyntactic variables).
For example, if I've written a "read_person_data" function, it should probably take some sort of "person" object as the target for the data it's going to read.
If, on the other hand, I have a "read two strings and an int" function, it should probably take two strings and an int, even if (for example) I happen to be using it to read a first and last name, and employee number (i.e., at least part of a person's data).
Two basic points though: first, a function like the former that does something meaningful and logical is usually preferable to one like the latter that's little more than an arbitrary collection of actions that happen to occur together.
Second, if you have a situation like that, it becomes open to question whether the function in question shouldn't be (at least logically) part of your A instead of being something separate that operates on an A. That's not certain by any means, but you may simply be looking at poor segmentation between the classes in question.
Welcome to the world of engineering, where everything is a compromise and the right solution depends on how your classes and function are supposed to be used (and what their meaning is supposed to be).
If foo() is conceptually something whose result depends on a float, an int, and a string, then it is right for it to accept a float, an int, and a string. Whether these values come from members of a class (could be B, but could also be C or D) it does not matter, because semantically foo() is not defined in terms of B.
On the other hand, if foo() is conceptually something whose result depends on the state of B - for instance, because it realizes an abstraction over that state - then make it accept an object of type B.
Now it is also true that a good programming practice is to let functions accept a small number of arguments if possible, I'd say up to three without exaggeration, so if a function logically works with several values, you may want to group those values in a data structure and pass instances of that data structure to foo().
Now if you call that data structure B, we're back to the original problem - but with a world of semantic difference!
Hence, whether or not foo() should accept three values or an instance of B mostly depends on what foo() and B concretely mean in your program, and how they are going to be used - whether their competences and responsibilities are logically coupled or not.
Given this question, you are probably thinking about classes in the wrong way.
When using a class, you should only be interested in its public interface (usually consisting solely of methods), not in its data members. Data members should normally be private to the class anyways, so you wouldn't even have access to them.
Think of classes as physical objects, say, a ball. You can look at the ball and see that it is red, but you can't simply set the color of the ball to be blue instead. You would have to perform an action on the ball to make it blue, for example by painting it.
Back to your classes: in order to allow A to perform some actions on B, A will have to know something about B (e.g., that the ball needs to be painted to change its color).
If you want to have A work with objects other than those from class B, you can use inheritance to extract the interface needed by A into class I, and then let class B and some other class C inherit from I. A can now work equally well with both classes B and C.
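Sticking with the ball analogy, the interface extraction could be sketched as follows. Paintable plays the role of I, Ball and Fence the roles of B and C, and Painter the role of A; all of these names are made up for the example.

```cpp
#include <string>

// The interface "I": the only thing the painter needs to know about a target.
class Paintable {
public:
    virtual ~Paintable() {}
    virtual void paint(const std::string& colour) = 0;
    virtual std::string colour() const = 0;
};

// "B": one class that can be painted.
class Ball : public Paintable {
public:
    Ball() : colour_("red") {}
    void paint(const std::string& colour) override { colour_ = colour; }
    std::string colour() const override { return colour_; }
private:
    std::string colour_;
};

// "C": an unrelated class that offers the same interface.
class Fence : public Paintable {
public:
    Fence() : colour_("white") {}
    void paint(const std::string& colour) override { colour_ = colour; }
    std::string colour() const override { return colour_; }
private:
    std::string colour_;
};

// "A": depends on Paintable only, never on Ball or Fence directly.
class Painter {
public:
    void makeBlue(Paintable& target) const { target.paint("blue"); }
};
```

Painter has to include only the header that declares Paintable, so Ball and Fence can change or disappear without Painter being recompiled.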
There are several reasons why you want to pass around a class instead of individual members.
It depends on what the called function must do with the arguments. If the arguments are isolated enough I would pass the member.
In some cases you might need to pass a lot of variables. In this case it is more efficient to pack them into a single object and pass that around.
Encapsulation. If you need to keep the values somehow connected to each other, you may have a class dealing with this association instead of handling it in your code wherever you happen to need one member of it.
If you are worried about dependencies, you can implement interfaces (like in Java, or the same in C++ using abstract classes). This way you can reduce the dependency on a particular object while ensuring that it can handle the required API.
I think this definitely depends on exactly what you want to achieve. There's nothing wrong with passing a few members from a class to a function. It really depends on what "FOO1" means - does doing FOO1 on B with other_data_x make it clearer what you want to do.
If we take an example - instead of having arbitrary names A, B, FOO and so on, we make "real" names that we can understand the meaning of:
enum ShapeType
{
    Circle, Square
};
enum ShapeColour
{
    Red, Green, Blue
};
class Shape
{
public:
    Shape(ShapeType type, int x, int y, ShapeColour c) :
        type(type), x(x), y(y), colour(c) {}
    ShapeType type;
    int x, y;
    ShapeColour colour;
    ...
};
class Renderer
{
    ...
    void DrawObject1(const Shape &s, float magnification);
    void DrawObject2(ShapeType s, int x, int y, ShapeColour c, float magnification);
};
int main()
{
    Renderer r(...);
    Shape c(Circle, 10, 10, Red);
    Shape s(Square, 20, 20, Green);
    r.DrawObject1(c, 1.0);
    r.DrawObject1(s, 2.0);
    // ---- or ---
    r.DrawObject2(c.type, c.x, c.y, c.colour, 1.0);
    r.DrawObject2(s.type, s.x, s.y, s.colour, 1.0);
}
[Yes, it's a pretty stupid example still, but it makes a little more sense to discuss the subject when objects have real names]
DrawObject1 needs to know everything about a Shape, and if we start doing some reorganising of the data structure (to store x and y in one member variable called point), it has to change. But it's probably just a few small changes.
On the other hand, in DrawObject2, we can reorganize all we like with the shape class - even remove it all together and have x, y, shape and colour in separate vectors [not a particularly good idea, but if we think that's solving some problem somewhere, then we can do so].
It largely comes down to what makes most sense. In my example, it probably makes sense to pass a Shape object to DrawObject1. But it's not certain to be the case, and there are certainly a lot of cases where you wouldn't want that.
Why do you pass objects instead of primitives?
This is a type of question I would expect on programmers#se; nevertheless..
We pass objects instead of primitives because we seek to establish a clear context, distinguish a certain point of view, and express a clear intent.
Choosing to pass primitives instead of full-fledged, rich, contextual objects lowers the degree of abstraction and widens the scope of the function. Sometimes those are goals, but normally they are things to avoid. Low cohesion is a liability, not an asset.
Instead of operating on related things through a full-fledged, rich object, with primitives we now operate on nothing more than arbitrary values, which may or may not be strictly related to each other. With no defined context, we do not know if the combination of primitives is valid together at any time, let alone right now in the current state of the system. Any protections a rich context object would have provided are lost (or duplicated in the wrong place) when we chose primitives at the function definition when we could have chosen a rich object. Further, any events or signals we normally would raise, observe, and act on by changes to any values in the rich context object are harder to raise and trace at the right time they are relevant when working with simple primitives.
Using objects over primitives fosters cohesion. Cohesion is a goal. Things that belong together stay together. Respecting the natural dependency of the function on that grouping, ordering, and interaction of the parameters through a clear context is a good thing.
Using objects over primitives does not necessarily increase the kind of coupling we worry most about. The kind of coupling we should worry most about is the kind that occurs when external callers dictate the shape and sequence of messaging, and we further sell out to their prescribed rules as the only way to play.
Instead of going all in, we should note a clear 'tell' when we see it. Middleware vendors and service providers definitely want us to go all in, and tightly integrate their system to ours. They want a firm dependency, tightly interwoven with our own code, so we keep coming back. However, we are smarter than that. Short of being smart, we may at least be experienced enough, having been down that road to recognize what is coming. We do not up the bid by allowing elements of the vendors code to invade every nook and cranny, knowing that we cannot buy the hand since they are sitting on a pile of chips, and to be honest, our hand just is not that good. Instead, we say this is what I am going to do with your middleware, and we lay down a limited, adaptive interface that allows play to continue, but does not bet the farm on that single hand. We do this because during the next hand we may face off against a different middleware vendor or service provider.
Ad hoc poker metaphor aside, the idea of running away from coupling whenever it presents itself is going to cost you. Running from the most interwoven and costly coupling is probably a smart thing to do if you intend to stay in the game for the long haul, and have an inclination that you will play with other vendors or providers or devices over which you can exert little control.
There is a great deal more I could say on the relationship of context objects versus use of primitives, for example in providing meaningful, flexible tests. Instead, I would point to writing on coding style by authors like Uncle Bob Martin, though I will omit specific titles for now.
Agreeing with the first responder here, and on your question about "is there another way...":
Since you're passing in an instance of a class, B, to the method FOO1 of A, it is reasonable to assume that the functionality B provides is not entirely unique to it, i.e. there could be other ways to implement whatever B provides to A. (If B is a POD with no logic of its own, then it wouldn't even make sense to try to decouple them, since A needs to know a lot about B anyway.)
Hence you could decouple A from B's dirty secrets by elevating whatever B does for A to an interface and then A includes "BsInterface.h" instead. That assumes there might be a C, D... and other variants of doing what B does for A.
If not then you have to ask yourself why B is a class outside of A in the first place...
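A minimal sketch of that idea, assuming B offers A a single operation. The member function `scale` is invented for illustration, and only `A`, `B`, `FOO1` and `BsInterface` are names taken from the discussion:

```cpp
// Hypothetical contents of "BsInterface.h": the only thing A needs to see.
class BsInterface {
public:
    virtual ~BsInterface() {}
    virtual double scale(double x) const = 0;  // assumed operation
};

// One concrete implementation. A never includes this definition.
class B : public BsInterface {
public:
    explicit B(double factor) : factor_(factor) {}
    virtual double scale(double x) const { return factor_ * x; }
private:
    double factor_;
};

// A depends on the interface only, so a C or D could be swapped in later.
class A {
public:
    double FOO1(const BsInterface& b, double x) const {
        return b.scale(x) + 1.0;
    }
};
```

The calling code still constructs a concrete B, but A's header no longer changes when B's internals do.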
At the end of the day it all comes down to one thing: it has to make sense...
I always turn the problem on its head when I run into philosophical debates like this (with myself, mostly): how would I be using A->FOO1? Does it make sense, in the calling code, to deal with B, and do I always end up including A and B together anyway, because of the way A and B are used together?
That is, if you want to be picky and write clean code (+1 to that), then take a step back and apply the "keep it simple" rule, but always allow usability to overrule any temptation to over-design your code.
Well, that's my view anyway.
I'm currently doing large numerical computations, and speed is of the utmost importance when using variables (of type double). I want to know if there is a more readable way to do the following, or a better way using structs or Boost libraries.
UPDATE: after some thought, my initial aim, given the many variables, is to organise the variables indirectly into some sort of container, preferably while keeping the variables as objects and not references/pointers.
1) I will be doing large and lengthy computations on the variables, they're declared in the order they are used and changing throughout the program
2) Variables can be added to the program at any time when I decide to edit the code (quite frequent)
3) Organizing variables (into a container of pointers or whatever) is important for ease of working with these objects collectively - it will be much more streamlined and efficient code when I e.g. write to file all these objects after some time
I was thinking instead of making a class that creates a type (all the variable objects are of type double) and automatically adds each instance to a vector of pointers. As a side question, would this be overkill?
I have many variables doing all sorts of computations like so (which happen to take time):
double varName1 = someValue;
double varName2 = someValue;
double varName3 = someValue;
...
double varNameN = someValue;
...
SOME_COMPUTATION HERE
This is, I believe, the most obvious way to keep each variable readable. To store the collection for possible output in the future, I put everything into a container, and made a reference variable to each element like so
std::vector<double> store;
...
ADD VALUES TO VECTOR
...
double& varName1 = store[0];
double& varName2 = store[1];
...
When I use the above method, however, computation with reference (&) variables is more costly (over time). I then decided to do the opposite: store a vector of pointers to the variables instead. If I need to write all the variables to a file, for example, I'll use this vector, and I perform computations on the variables as normal (not through references). To do this I came up with the following (ugly) way
std::vector<double*> store;
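// Registers the address of d in store, sets d to init, and returns init.
double create_v(double init, double& d)
{
store.push_back(&d);
d = init;
return init; // without this, flowing off the end of a non-void function is undefined behaviour
}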
double varName1 = create_v(0.05,varName1);
I was wondering: 1) is there a better implementation of this using templates/Boost for readability that does the same thing, OR 2) is there another way a C++ beginner like me should know? 3) Are there optimizations I'm not considering that minimize some of the overhead mentioned above? (I test with -O2 and -O3 and I use g++ 4.7.2)
The 4.7.2 version of g++ supports C++11's initializer lists. This means that you can write the following code to put all your variables in a vector<double>:
vector<double> vec = {varName1, varName2, varName3, ..., varNameN};
This is reasonably clean, and should provide you with a simple way of organizing your variables into a vector for the output purposes. Here is a small demo on ideone.
P.S. Your third example does not work, because you are pushing back a reference, rather than a pointer. This should be a compile-time error, though.
Based on your description, it might be acceptable to use a C-style struct. You can take advantage of certain behavior of structs that meet certain restrictions (Plain Old Data restrictions, or POD) that further simplify their behavior.
POD structs can't have a user-defined default constructor, destructor, or copy constructor (only the defaults provided by the compiler), and can't have any virtual methods or reference members. Pointer members are technically allowed, but avoid them here: a pointer written to a file is meaningless when read back. You should be able to accomplish this without much difficulty.
Once you've done that, you'll be able to declare such a struct like this:
struct DataSet
{
double foo;
double bar[48]; // arrays are legal in POD types
// snip however many more declarations
};
You will be able to save these structures to a file with something like:
DataSet ds;
// populate and process dataset
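// Open in binary mode so the bytes are written untranslated.
ofstream outputfile("somefile.dat", ios_base::out | ios_base::binary);
outputfile.write(reinterpret_cast<const char *>(&ds), sizeof(DataSet));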
Reading them would work exactly as you expect: the opposite of writing. Just create the object and slurp the contents of the file into it. Working with an array of DataSet should be equally intuitive.
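For completeness, here is a rough sketch of the round trip. The helper names `saveDataSet` and `loadDataSet` are my own. Both streams are opened in binary mode, which matters on platforms that translate line endings. Bear in mind that a file written this way is only guaranteed to be readable by a program built with the same compiler, settings and architecture, since padding and byte order are not portable.

```cpp
#include <fstream>
#include <string>

struct DataSet {
    double foo;
    double bar[48];  // arrays are legal in POD types
};

// Hypothetical helper: dump the raw bytes of ds. Returns true on success.
inline bool saveDataSet(const std::string& path, const DataSet& ds) {
    std::ofstream out(path.c_str(), std::ios_base::out | std::ios_base::binary);
    out.write(reinterpret_cast<const char*>(&ds), sizeof(DataSet));
    out.flush();  // push buffered bytes out before checking the stream state
    return out.good();
}

// Hypothetical helper: read the bytes back. Returns true only if a full
// DataSet was read.
inline bool loadDataSet(const std::string& path, DataSet& ds) {
    std::ifstream in(path.c_str(), std::ios_base::in | std::ios_base::binary);
    in.read(reinterpret_cast<char*>(&ds), sizeof(DataSet));
    return in.gcount() == static_cast<std::streamsize>(sizeof(DataSet));
}
```

An array of DataSet can be handled the same way by writing `count * sizeof(DataSet)` bytes starting at the first element.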
I have about 15~20 member variables that need to be accessed. I was wondering
if it would be good just to let them be public instead of giving every one of them
get/set functions.
The code would be something like
class A { // a singleton class
public:
static A* get();
B x, y, z;
// ... a lot of other object that should only have one copy
// and doesn't change often
private:
A();
virtual ~A();
static A* a;
};
I have also thought about putting the variables into an array, but I don't
know the best way to do a lookup table, would it be better to put them in an array?
EDIT:
Is there a better way than a Singleton class to put them in a collection?
The C++ world isn't quite as hung up on "everything must be hidden behind accessors/mutators/whatever-they-decide-to-call-them-today" as some OO-supporting languages.
With that said, it's a bit hard to say what the best approach is, given your limited description.
If your class is simply a 'bag of data' for some other process, then using a struct instead of a class (the only difference is that members and base classes default to public) can be appropriate.
If the class actually does something, however, you might find it more appropriate to group your get/set routines together by function/aspect or interface.
As I mentioned, it's a bit hard to tell without more information.
EDIT: Singleton classes are not smelly code in and of themselves, but you do need to be a bit careful with them. If a singleton is taking care of preference data or something similar, it only makes sense to make individual accessors for each data element.
If, on the other hand, you're storing generic input data in a singleton, it might be time to rethink the design.
You could place them in a POD structure and provide access to an object of that type:
struct VariablesHolder
{
int a;
float b;
char c[20];
};
class A
{
public:
A() : vh()
{
}
VariablesHolder& Access()
{
return vh;
}
const VariablesHolder& Get() const
{
return vh;
}
private:
VariablesHolder vh;
};
No, that wouldn't be good. Imagine you want to change the way they are accessed in the future, for example by removing one member variable and letting the get/set functions compute its value.
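A small sketch of that scenario, using a made-up `Rect` class that is not from the question. Suppose `area` started out as a stored member. Because callers only ever went through `area()`, the member can later be dropped and the value computed, without touching any calling code:

```cpp
// Hypothetical example: area was once a stored member, now it is computed.
class Rect {
public:
    Rect(double w, double h) : width_(w), height_(h) {}

    double width() const { return width_; }
    double height() const { return height_; }
    void setWidth(double w) { width_ = w; }
    void setHeight(double h) { height_ = h; }

    // Previously this might have returned a stored area_ member.
    // Callers cannot tell the difference.
    double area() const { return width_ * height_; }

private:
    double width_;
    double height_;
};
```

Had `area` been a public data member, every use site would have needed editing, and nothing would have kept it in sync with the width and height.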
It really depends on why you want to give access to them, how likely they are to change, how much code uses them, how problematic having to rewrite or recompile that code is, how fast access needs to be, whether you need/want virtual access, what's more convenient and intuitive in the using code, etc. Wanting to give access to so many things may be a sign of poor design, or it may be 100% appropriate. Using get/set functions has much more potential benefit for volatile (unstable / possibly subject to frequent tweaks) low-level code that could be used by a large number of client apps.
Given your edit, an array makes sense if your client is likely to want to access the values in a loop, or a numeric index is inherently meaningful. For example, if they're chronologically ordered data samples, an index sounds good. Summarily, arrays make it easier to provide algorithms to work with any or all of the indices - you have to consider whether that's useful to your clients; if not, try to avoid it as it may make it easier to mistakenly access the wrong values, particularly if say two people branch some code, add an extra value at the end, then try to merge their changes. Sometimes it makes sense to provide arrays and named access, or an enum with meaningful names for indices.
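One way to get both array and named access is an enum whose enumerators double as indices. The sketch below is only an illustration: the field names `Mass`, `Charge` and `Spin`, and the class `Values`, are invented.

```cpp
// Hypothetical field names. FieldCount must stay last so that it equals
// the number of real fields.
enum Field { Mass, Charge, Spin, FieldCount };

class Values {
public:
    Values() {
        for (int i = 0; i < FieldCount; ++i) data_[i] = 0.0;
    }

    // Named access: v[Mass] instead of v[0].
    double& operator[](Field f) { return data_[f]; }
    double operator[](Field f) const { return data_[f]; }

    // Loop-style access over all fields is still easy.
    double sum() const {
        double total = 0.0;
        for (int i = 0; i < FieldCount; ++i) total += data_[i];
        return total;
    }

private:
    double data_[FieldCount];
};
```

Adding a field means adding one enumerator before `FieldCount`. Because existing code refers to fields by name rather than by number, two branches that each add a value are less likely to end up reading the wrong slot after a merge.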
This is a horrible design choice, as it allows any component to modify any of these variables. Furthermore, since access to these variables is done directly, you have no way to impose any invariant on the values, and if suddenly you decide to multithread your program, you won't have a single set of functions that need to be mutex-protected, but rather you will have to go off and find every single use of every single data member and individually lock those usages. In general, one should:
Not use singletons or global variables; they introduce subtle, implicit dependencies between components that allow seemingly independent components to interfere with each other.
Make variables const wherever possible and provide setters only where absolutely required.
Never make variables public (unless you are creating a POD struct, and even then, it is best to create POD structs only as an internal implementation detail and not expose them in the API).
Also, you mentioned that you need to use an array. You can use vector<B> or vector<B*> to create a dynamically-sized array of objects of type B or type B*. Rather than using A::get() to access your singleton instance, it would be better to have functions that need type A take a parameter of type const A&. This will make the dependency explicit, and it will also limit which functions can modify the members of that class (pass A* or A& to functions that need to mutate it).
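Roughly, that advice looks like the sketch below. `Config`, `apply` and `doubleGain` are placeholder names standing in for the question's class A and its users. The point is only that the signature tells you which functions read the shared data and which may change it.

```cpp
// Hypothetical stand-in for the question's class A.
class Config {
public:
    Config(double gain, double offset) : gain_(gain), offset_(offset) {}
    double gain() const { return gain_; }
    double offset() const { return offset_; }
    void setGain(double g) { gain_ = g; }
private:
    double gain_;
    double offset_;
};

// Read-only user: takes const Config&, so it cannot modify the object.
inline double apply(const Config& c, double x) {
    return c.gain() * x + c.offset();
}

// Mutating user: the non-const reference advertises the side effect.
inline void doubleGain(Config& c) {
    c.setGain(c.gain() * 2.0);
}
```

Nothing here reaches for a global instance, so a test (or a second thread) can build its own `Config` and pass it in.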
As a convention, if you want a data structure to hold several public fields (plain old data), I would suggest using a struct (used in tandem with other classes -- builder, flyweight, memento, and other design patterns).
Classes generally mean that you're defining an encapsulated data type, so the OOP rule is to hide data members.
In terms of efficiency, modern compilers typically inline simple accessors/mutators, so the impact on performance is usually negligible.
In terms of extensibility, methods are definitely a win because derived classes would be able to override these (if virtual). Another benefit is that logic to check/observe/notify data can be added if data is accessed via member functions.
Public members in a base class are generally difficult to keep track of.