When I wrap up some procedural code in a class (in my case C++, but that is probably not of interest here), I'm often confused about the best way to do it. By procedural code I mean something that you could easily put in a procedure, where you use the surrounding object mainly for clarity and ease of use (error handling, logging, transaction handling...).
For example, I want to write some code that reads data from the database, does some calculations on it and makes some changes to the database. To do this, it needs data from the caller.
What is the best way to get this data into the object? Let's assume that it needs 7 values and a list of integers.
My ideas are:
List of Parameters of the constructor
Set Functions
List of Parameters of the central function
The advantage of the first solution is that the caller has to deliver exactly what the class needs to do the job, and the data is guaranteed to be available as soon as the object has been created. The object could then be stored somewhere and the central function could be triggered by the caller at any time without any further interaction with the object.
It's almost the same in the second solution, but now the central function has to check whether all necessary data has been delivered by the caller. There is also the question of whether to have a separate set function for every piece of data or a single one for everything.
The last solution has only one advantage: the data does not have to be stored before execution. But then it looks like a normal function call, and the benefits of the class approach disappear.
How do you do something like that? Are my considerations correct? Am I missing some advantages or disadvantages?
This stuff is so simple but I couldn't find any resources on it.
Edit: I'm not talking about the database connection. I mean all the data needed for the procedure to complete, for example all the information of a bookkeeping transaction.
Let's do a poll. Which do you like more:
class WriteAdress {
public:
    WriteAdress(string name, string street, string city);
    void Execute();
};
or
class WriteAdress {
public:
    void Execute(string name, string street, string city);
};
or
class WriteAdress {
public:
    void SetName(string Name);
    void SetStreet(string Street);
    void SetCity(string City);
    void Execute();
};
or
class WriteAdress {
public:
    void SetData(string name, string street, string city);
    void Execute();
};
Values should be data members if they need to be used by more than one member function. So a database handle is a prime example: you open the connection to the database and get the handle, then you pass it in to several functions to operate on the database, and finally close it. Depending on your circumstances you may open it directly in the constructor and close it in the destructor, or just accept it as a value in the constructor and store it for later use by the member functions.
On the other hand, values that are only used by one member function and may vary every call should remain function parameters rather than constructor parameters. If they are always the same for every invocation of the function then make them constructor parameters, or just initialize them in the constructor.
Do not do two-stage construction. Requiring that you call a bunch of setXYZ functions on a class after the constructor before you can call a member function is a bad plan. Either make the necessary values initialized in the constructor (whether directly, or from constructor parameters), or take them as function parameters. Whether or not you provide setters which can change the values after construction is a different decision, but an object should always be usable immediately after construction.
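As a rough sketch of "usable immediately after construction", here is the address example with everything passed to the constructor. The class name AddressWriter and the string it returns are made up for illustration; a real version would talk to the database instead.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical example: everything Execute() needs arrives through the
// constructor, so the object is usable as soon as it exists.
class AddressWriter {
public:
    AddressWriter(std::string name, std::string street, std::string city)
        : name_(std::move(name)),
          street_(std::move(street)),
          city_(std::move(city)) {}

    // No "has everything been set?" check is needed here.
    // Returns the formatted record instead of writing to a database.
    std::string Execute() const {
        return name_ + ", " + street_ + ", " + city_;
    }

private:
    std::string name_;
    std::string street_;
    std::string city_;
};
```

With the setter-based variant, Execute() would first have to verify that all three setters had been called; here the compiler enforces that for you.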
Interface design is very important, but in your case what you need to learn is that worse is better.
First, choose the simplest solution you have and write it now.
Then you'll see what the flaws are, so fix them.
Repeat until it's not important to fix them.
The idea is that you need experience to understand how to get directly to the "best" (or rather, the "least bad") solution to a given type of problem (that's what we call a "design pattern"). To get that experience you'll have to hit problems fast, solve them and try to deeply understand why something was wrong.
That's what you'll have to do each time you try something "new". Errors are not a problem if you fix them and learn from them.
You should use constructor parameters for all values that are necessary in every case (consider that many programming languages also support constructor overloading).
This leads to the second point: setters should be used to introduce optional parameters, or to update values.
You can also combine these methods: expect the necessary parameters in the constructor and then pass them to their setter functions. This way the validity checks are written only once (in the setters).
Central functions should take temporary parameters only (timestamps, ...).
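A small sketch of the "constructor calls the setter" idea. The Booking class, its fields and the validity rule are invented for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical class: the amount is required, the memo is optional.
class Booking {
public:
    // The required value goes through the setter, so the validity
    // check lives in exactly one place.
    explicit Booking(int amountCents) { SetAmount(amountCents); }

    void SetAmount(int amountCents) {
        if (amountCents <= 0) {
            throw std::invalid_argument("amount must be positive");
        }
        amountCents_ = amountCents;
    }

    // Optional value: setter only, defaulting to an empty string.
    void SetMemo(const std::string& memo) { memo_ = memo; }

    int Amount() const { return amountCents_; }
    const std::string& Memo() const { return memo_; }

private:
    int amountCents_ = 0;
    std::string memo_;
};
```

An invalid amount is rejected both at construction and on later updates, without duplicating the check.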
First off, it sounds like you are trying to do too much at once. Reading, calculating and updating are all separate operations, which can probably be split down further themselves.
A technique I use when I'm thinking about the design of a method or class is to think: 'what do I want the highest-level method to ideally look like?' i.e. think about the separate components of the method and split them down. That's top-down design.
In your case, I envisaged this in my head (C#):
public static void Dostuff(...)
{
Data d = ReadDatabase(...);
d.DoCalculations(...);
UpdateDatabase(d);
}
Then do the same thing for each of those methods.
When you come to passing in parameters to your method, you need to consider whether the data you're passing in is stored or not - i.e. if your class is static (it cannot be instantiated, and is instead just a collection of methods etc) or if you make objects of the class. In other words: each object of the class has a state.
If the parameters can indeed be considered to be attributes of the class, they define its state, and should be stored as private variables with getters and setters for each, where necessary. If the class instead has no state, it should be static and the parameters passed directly to the method.
Either way, it is common, and not considered bad practice, to have both a constructor and a few get/set functions where necessary. It is also common to have to check the state of the object at the beginning of a method, so I wouldn't worry about that.
As you can see, it largely depends on what else you are doing in this class.
The reason you can't find many resources on this is that the 'right' answer is hugely domain-specific; it depends heavily on the specific project. The best way to find out is usually by experiment.
(For example: You're right about the advantages of the first two methods. An obvious disadvantage is the use of memory to store the data the whole time the object exists. This disadvantage doesn't matter in the least if your project needs two of these data objects; it's potentially a huge problem if you need a very large number. If it's a big live dataset, you're probably better querying for data as you need it, as implied by your third solution... but not definitely, as there are times when it's better to cache the data.)
When in doubt, do a quick test implementation with a simplest-possible interface; just writing it will frequently make it clearer what the pros and cons are for your project.
Specifically addressing your example, it seems as though you are still thinking too procedurally.
You should make an object that initialises the connection to the database, doing all relevant error checking. Then have a method on the object that writes the values in whatever convenient way you prefer. When the object is destroyed it should release the handle to the database. That would be the object-oriented way to approach the problem.
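Roughly, that could look like the sketch below. FakeDatabase is a made-up stand-in for a real database API (it only records what happens), and AddressStore is a hypothetical name.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a real database API; it only records state.
struct FakeDatabase {
    bool open = false;
    std::vector<std::string> rows;
};

// Acquires the "connection" in the constructor, with error checking,
// and releases it in the destructor.
class AddressStore {
public:
    explicit AddressStore(FakeDatabase& db) : db_(db) {
        if (db_.open) {
            throw std::runtime_error("database already in use");
        }
        db_.open = true;
    }
    ~AddressStore() { db_.open = false; }

    // Copying would let two objects release the same handle, so forbid it.
    AddressStore(const AddressStore&) = delete;
    AddressStore& operator=(const AddressStore&) = delete;

    // The per-call data stays a function parameter.
    void Write(const std::string& name, const std::string& street,
               const std::string& city) {
        db_.rows.push_back(name + ";" + street + ";" + city);
    }

private:
    FakeDatabase& db_;
};
```

The handle is released when the object goes out of scope, even if Write() throws.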
I assume the only responsibility of your WriteAddress class is to write an address to a database or an output stream. If so, then you should not worry about getters and setters for the address details; instead, define an interface AddressDataProvider that is to be implemented by all classes with which your WriteAddress class will collaborate.
One of the methods on that interface would be GetAddressParts(), which would return an array of strings as required by WriteAddress. Any class that implements that method will need to respect this array structure.
Then, in WriteAddress, define a setter SetDataProvider(AddressDataProvider). This method will be called by the code that instantiates your WriteAddress object(s).
Finally, in your Execute() method, obtain the data that are required by calling GetAddressParts() on the "data provider" that you set and write out your address.
Notice that this design shields WriteAddress from subsidiary activities that are not strictly part of its responsibilities. So, WriteAddress does not care how the address details are retrieved; it does not even care about knowing and holding the address details. It just knows from where to get them and how to write them out.
This is obvious even in the description of this design: only two names WriteAddress and AddressDataProvider come up; there is no mention of database or how to pass the address details. This is usually an indication of high cohesion and low coupling.
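One possible reading of this design in C++ is sketched below. The names WriteAddress, AddressDataProvider, GetAddressParts, SetDataProvider and Execute come from the description above; the exact signatures, the FixedAddress provider and writing to a std::ostream are assumptions made for the example.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Interface named in the answer; the signature is an assumption.
class AddressDataProvider {
public:
    virtual ~AddressDataProvider() = default;
    // Returns name, street, city, in that order (assumed convention).
    virtual std::vector<std::string> GetAddressParts() const = 0;
};

// Hypothetical provider with fixed data.
class FixedAddress : public AddressDataProvider {
public:
    std::vector<std::string> GetAddressParts() const override {
        return {"Ann", "1 Main St", "Springfield"};
    }
};

class WriteAddress {
public:
    explicit WriteAddress(std::ostream& out) : out_(out) {}

    // The provider must outlive this object.
    void SetDataProvider(const AddressDataProvider& provider) {
        provider_ = &provider;
    }

    // Pulls the data from the provider and writes one part per line.
    // Does nothing if no provider has been set.
    void Execute() const {
        if (provider_ == nullptr) return;
        for (const std::string& part : provider_->GetAddressParts()) {
            out_ << part << '\n';
        }
    }

private:
    std::ostream& out_;
    const AddressDataProvider* provider_ = nullptr;
};
```

Note that Execute() still has to cope with a missing provider, which is the same check the setter-based variants in the question need.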
I hope this helps.
You can implement each approach; they don't exclude each other. Then you'll see which are most useful.
Related
I have a simple question. I have a class that does not have any variables, it is just a class that has a lot of void functions (that display things, etc.). When I create an object of that class, would it be better/more efficient to pass that one object through all my functions as the program progresses, or to just recreate it every time the program goes into a new function? Keeping in mind, that the object has no variables that need to be kept. Thanks in advance for any help.
It makes much more sense that the class only has static functions and no instance is necessary at all. You have no state anyway...
As for performance, there is almost no difference. Passing an object as an argument will cost you a (very tiny) bit at runtime. Recreating the object will not (assuming compiler optimizations).
However, if you ever have plans to introduce some state (fields), or have two implementations for those void methods, you should pass an object, as it greatly reduces refactoring cost.
To summarize: if your class is something like Math, where the methods are stateless by nature, stick with @Amit's answer and make it static. Otherwise, if your class is something like Canvas or Windows and you are thinking of implementing it another way later, it is better to pass it by reference so that you can replace it with an abstract interface and supply the actual implementation.
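The two cases might look like this in C++. All names here (MathUtil, Display, RecordingDisplay, Greet) are hypothetical and only illustrate the shape of each option.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stateless by nature: a plain static function, no instance needed.
struct MathUtil {
    static int Square(int x) { return x * x; }
};

// Might get other implementations later: pass it by reference
// through an abstract interface.
class Display {
public:
    virtual ~Display() = default;
    virtual void Show(const std::string& text) = 0;
};

// One possible implementation; this one just records what was shown.
class RecordingDisplay : public Display {
public:
    void Show(const std::string& text) override { shown.push_back(text); }
    std::vector<std::string> shown;
};

// Callers depend only on the interface, not on a concrete class.
void Greet(Display& display, const std::string& name) {
    display.Show("Hello, " + name);
}
```

Swapping in another Display implementation later does not touch Greet() or any other caller.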
if the functions in the otherwise empty class never change... consider making them static. or put them in a namespace instead of a class.
on the other hand... if the functions are set once at runtime, like say you pick which display functions to use based on os, then store them in a global. or singleton.
on the gripping hand... if the functions are different for different parts of the greater code... then yes you'll have to somehow deliver it to whatever functions need it. whether you should create once and pass many times - or pass never and create as needed, really depends on the specifics of your application. sorry, there's no universal answer here.
Although I wrote this example in C++, this code refactoring question also applies to any language that endorses OO, such as Java.
Basically I have a class A
class A
{
public:
void f1();
void f2();
//..
private:
SomeType* m_a; // type omitted in the original; SomeType is a placeholder
};
void A::f1()
{
assert(m_a);
m_a->h1()->h2()->GetData();
//..
}
void A::f2()
{
assert(m_a);
m_a->h1()->h2()->GetData();
//..
}
Would you create a new private data member m_f holding the pointer m_a->h1()->h2()? The benefit I can see is that it eliminates the multi-level function calls, which simplifies the code a lot.
But from another point of view, it creates an "unnecessary" data member which can be deduced from another existing data member m_a, which is kinda redundant?
I've run into a dilemma here. So far, I cannot convince myself to use one over the other.
Which do you prefer, and why?
The fancy word for this technique is caching: you calculate a two-away reference once, and cache it in the object. In general, caching lets you "pay" with computer memory for speed-up of your computations.
If a profiler tells you that your code is spending a significant amount of time in the repeated call of m_a->h1()->h2(), this may be a legitimate optimization, provided that the return values of h1 and h2 never change. However, doing an optimization like that without profiling first is nearly always a sign of premature optimization.
If performance is not the issue, a good rule is to stay away from storing members that can be calculated from other members stored in your object. If you would like to improve clarity, you can introduce a nicely named method (a member function) to calculate the two-away reference without storing it. Storing makes sense only in the rare cases when it is critical for the performance.
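A sketch of the "nicely named method" option. The types Root, Middle and Leaf and the helper name Source() are placeholders, since the question does not say what h1() and h2() actually return.

```cpp
#include <cassert>

// Hypothetical stand-ins for the types behind m_a->h1()->h2().
struct Leaf {
    int value;
    int GetData() const { return value; }
};
struct Middle {
    Leaf leaf;
    Leaf* h2() { return &leaf; }
};
struct Root {
    Middle middle;
    Middle* h1() { return &middle; }
};

class A {
public:
    explicit A(Root* a) : m_a(a) {}
    int f1() { return Source()->GetData() + 1; }
    int f2() { return Source()->GetData() * 2; }

private:
    // Named helper: recomputed on each call, so nothing can go stale
    // and no redundant data member is stored.
    Leaf* Source() {
        assert(m_a);
        return m_a->h1()->h2();
    }

    Root* m_a;
};
```

The chain is written once, the null check lives in one place, and A still has a single data member.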
I would not. I agree it would simplify things in your contrived example, but that's because m_a->h1()->h2() has no inherent meaning. In a well-designed application, the method names used should tell you something qualitative about the calls being made, and that should be a part of self-documenting code. I would argue that in properly designed code, m_a->h1()->h2() should be simpler to read and understand than redirecting to a private method which calls it for you.
Now, if m_a->h1()->h2() is an expensive call which takes a significant time to compute the result, then you might have an argument for caching as @dasblinkenlight suggests. But throwing away the descriptiveness of the method call for the sake of a few keypresses is bad.
Whenever I have something like this I usually store m_a->h1() in a variable with a meaningful name at function scope, since it's likely to be used again later in the function's body.
The question is as in the title.
For example:
QPropertyAnimation *animation;
animation = new QPropertyAnimation(this, "windowOpacity", this);
or
QPropertyAnimation animation;
animation.setTargetObject(this);
animation.setPropertyName("windowOpacity");
animation.setParent(this);
Which is more efficient?
Edit: though there is no significant difference unless this is done repeatedly, I would still like to know. I would rather have answers than opinions, as Stack Overflow's guidelines suggest.
First, why new in the first example? I'll assume that you will create both variables on the same storage (heap / stack).
Second, this isn't a matter of Qt, it applies to C++ in general.
Without any prior knowledge about the class you are creating, you can be sure of one thing: The constructor with arguments version is at least as efficient as the setter version.
This is because, in the worst case, the constructor might look like this:
QPropertyAnimation(QObject* target, const QByteArray & prop_name, QObject* parent = 0)
{
// members are default-initialized, now explicitly set
this->setTargetObject(target);
this->setPropertyName(prop_name);
this->setParent(parent);
}
However, any person that has at least worked through a good book will write the constructor like this:
QPropertyAnimation(QObject* target, const QByteArray & prop_name, QObject* parent = 0)
: m_target(target)
, m_prop_name(prop_name)
, m_parent(parent)
{
// members explicitly initialized
}
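To make the difference visible without Qt, here is a small self-contained sketch. Counted is a made-up member type that counts how often it is constructed or assigned; ViaSetter and ViaInitList are hypothetical classes mirroring the two constructor styles above.

```cpp
#include <cassert>
#include <string>

// Hypothetical member type that counts how often it is written.
struct Counted {
    static int writes;
    std::string value;
    Counted() { ++writes; }
    explicit Counted(const std::string& v) : value(v) { ++writes; }
    Counted& operator=(const std::string& v) {
        value = v;
        ++writes;
        return *this;
    }
};
int Counted::writes = 0;

// Default-initializes the member, then overwrites it in the body.
struct ViaSetter {
    explicit ViaSetter(const std::string& name) { SetName(name); }
    void SetName(const std::string& name) { name_ = name; }
    Counted name_;
};

// Initializes the member once, directly with its final value.
struct ViaInitList {
    explicit ViaInitList(const std::string& name) : name_(name) {}
    Counted name_;
};
```

Constructing a ViaSetter touches the member twice (default construction plus assignment), while ViaInitList touches it once. Whether that matters in practice depends on how expensive the member is to construct and assign.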
As to whether the one call or three (OK, 2.5, since the first call is implicit) is "better" (ignoring the heap issue), it's worthwhile thinking about the conceptual flow of the program, and your intellectual control over it. And it's also worth considering practical issues related to coding.
On the caller side, if all the appropriate parameters are already at hand where the object is being created, then the single call makes it more obvious that, indeed, all the parameters "belong" to that object, and it's being created "in one piece". On the other hand, if using a single call means that the calling code must gather up parameters over time and then spit out a single "pent up" call, then it may be a better choice to create the object and then set the corresponding properties one at a time, as their values are developed.
And, on the callee side, there may be practical considerations. For instance, it may be that there are a dozen properties, with different uses of the object likely to use different combinations. Rather than provide dozens of different constructors, providing a single constructor (or a small number of them) combined with multiple property setters is both more efficient of programmer time and less apt to be confusing to the user of the object. But if the same combination of a relatively small number of parameters is (almost) always used then the single call is probably a better use of programmer resources.
(Of some importance here is the fact that C++ does not implement true keyword parameters, so when parameter lists get beyond 4-5 items one loses intellectual control over which parameter is which, especially if there are several forms of the constructor. In such a case using separate property setters gives the (rough) effect of keyword parameters and reduces the chance of confusion.)
Efficiency isn't always about CPU cycles. Efficient use of programmer time (including reduced time spent debugging) is, in many ways, far more important.
All else being equal one function call is better than 3.
You're comparing apples and oranges. In the first case you're constructing an object on the heap, while in the second case you're constructing an object "in place", in another object or in automatic storage, so there's no heap overhead. This has nothing to do with whether you use a single constructor call or an (implicit) constructor plus setters.
I have a class which defines a historical extraction on a database:
class ExtractionConfiguration
{
string ExtractionName;
time ExtractionStartTime;
time ExtractionEndTime;
// Should these functions be static/non-static?
// The load/save path is a function of ExtractionName
void SaveConfiguration();
void LoadConfiguration();
};
These ExtractionConfigurations need to be saved to/loaded from disk. What is the best way of organising the save/load functions in terms of static/non-static? To me, it is clear that SaveConfiguration() should be a member function. However with LoadConfiguration(), does it make more sense to call
ExtractionConfiguration newExtraction;
newExtraction.LoadConfiguration();
and have a temporary empty instance or make the load function static
static ExtractionConfiguration LoadConfiguration(string filename);
and just call
ExtractionConfiguration newExtraction = ExtractionConfiguration::LoadConfiguration(filename);
which feels neater to me, but breaks the 'symmetry' of the load/save mechanism (is this even a meaningful/worthwhile consideration?).
I suppose asking for the 'best' answer is somewhat naive. I am really trying to get a better understanding of the issues involved here.
P.S. This is my first question on SO, so if I have not presented it correctly, please let me know and I will try and make the problem clearer.
You should consider using a Boost.Serialization-style serialization function, which avoids having separate functions for saving and loading (even if you don't use the library itself).
In this approach you can pass the function any type of object that has operator&, to perform an operation on all the member variables. One such object might save the data to a file, another might load from a file, and a third might print the data to the console (for debugging, etc.).
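A minimal sketch of that pattern without Boost itself. SaveArchive and LoadArchive are hypothetical archive classes, the Serialize signature is simplified compared with the real library, and the time fields are reduced to integers to keep the example short.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical archive that writes each field on its own line.
class SaveArchive {
public:
    explicit SaveArchive(std::ostream& out) : out_(out) {}
    template <typename T>
    SaveArchive& operator&(const T& field) {
        out_ << field << '\n';
        return *this;
    }
private:
    std::ostream& out_;
};

// Hypothetical archive that reads the fields back in the same order.
class LoadArchive {
public:
    explicit LoadArchive(std::istream& in) : in_(in) {}
    template <typename T>
    LoadArchive& operator&(T& field) {
        in_ >> field;
        return *this;
    }
private:
    std::istream& in_;
};

struct ExtractionConfiguration {
    std::string ExtractionName;    // assumed to contain no whitespace
    long ExtractionStartTime = 0;  // times simplified to integers here
    long ExtractionEndTime = 0;

    // One function lists the fields; the archive decides what to do.
    template <typename Archive>
    void Serialize(Archive& ar) {
        ar & ExtractionName & ExtractionStartTime & ExtractionEndTime;
    }
};
```

Saving and loading cannot drift apart, because both go through the same field list.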
If you wish to keep separate functions, having them as non-static members might be a better option. For the saving function this is obvious, but loading is a different matter because there you need to construct the object. However, quite commonly loading is done by default-constructing and then calling the load non-static member function, for symmetry reasons, I guess.
Having the loading as a function that returns a new object seems better in some ways, but then you need to decide how it returns the object. Is it allocated by new, or simply returned by value? Returning by value requires the object to be copyable and returning a pointer mandates the resource management scheme (cannot just store the object on stack).
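For comparison, a sketch of the static variant that returns by value. It reads from a stream rather than a file name to stay self-contained; the member names, the accessors and the text format are assumptions. Since C++11 a movable type is enough for returning by value.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

class ExtractionConfiguration {
public:
    ExtractionConfiguration(std::string name, long start, long end)
        : name_(std::move(name)), start_(start), end_(end) {}

    // Static factory: no empty temporary instance is needed, and a
    // malformed input never produces a half-loaded object.
    static ExtractionConfiguration Load(std::istream& in) {
        std::string name;
        long start = 0;
        long end = 0;
        if (!(in >> name >> start >> end)) {
            throw std::runtime_error("malformed configuration");
        }
        return ExtractionConfiguration(std::move(name), start, end);
    }

    // Saving stays an ordinary member function.
    void Save(std::ostream& out) const {
        out << name_ << ' ' << start_ << ' ' << end_ << '\n';
    }

    const std::string& Name() const { return name_; }
    long Start() const { return start_; }
    long End() const { return end_; }

private:
    std::string name_;  // assumed to contain no whitespace
    long start_;        // times simplified to integers here
    long end_;
};
```

The asymmetry between a static Load and a non-static Save is the price for never having an object in an unloaded state.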
I have heard that in C++, using an accessor ( get...() ) inside a member function of the same class that defines the accessor is good programming practice. Is that true, and should it be done?
For example, is this preferred:
void display() {
cout << getData();
}
over something like this:
void display() {
cout << data;
}
data is a data member of the same class where the accessor was defined... same with the display() method.
I'm thinking of the overhead for doing that especially if you need to invoke the accessor lots of times inside the same class rather than just using the data member directly.
The reason for this is that if you change the implementation of getData(), you won't have to change the rest of the code that directly accesses data.
And also, a smart compiler will inline it anyways (it would always know the implementation inside the class), so there is no performance penalty.
It depends. Using an accessor function provides a layer of abstraction, which could make future changes to 'data' less painful. For example, if you wanted to lazily compute the value of 'data', you could hide that computation in the accessor function.
As for the overhead - If you are referring to performance overhead, it will likely be insignificant - your accessors will almost certainly be inlined. If you are referring to coding overhead, then yes, it is a tradeoff, and you'll have to decide whether it is worth the extra effort to provide accessors.
Personally, I don't think the accessors are worth it in most cases.
Yes, I think it should be done more or less unconditionally. If the state variable is in some base class it should more or less always be private. If you allow it to be protected or public, all derived classes will use it directly. These classes in turn might be classes your coworkers have written in some other project. If you suddenly decide to muck about in the base class and refactor, e.g., the variable name to something more suitable, all users of that state must be rewritten.
This is probably not an issue if you are the only programmer or are developing some code that no one will ever use. But as soon as the number of subclasses starts to grow, it might get really hairy. Gotta love transparency!
However, I'm not God's best child on this planet. Sometimes I cheat ;) When you're in the owner class, I think it's OK to access private data directly. It might even be beneficial, since you automatically know that you are modifying the actual class you're in, provided that you have some kind of naming convention that actually tells you so, e.g. a variable name with an underscore at the end: "someVariable_".
Cheers !
Well, Mr. Khunt, the overhead is really insignificant for accessors in most cases. The question is whether the accessor logic needs to be invoked, or whether you need direct access to the field. This is a question for each individual implementation, but in many cases it won't make much of a difference.
The real reason for accessors is to provide encapsulation of your fields to other classes - and less about the containing class.
Personally, I prefer not to have dozens of extra functions (get and set per every member variable). I would just use data, and would change to getData() only when required to do something differently. Since we are talking about changing the code only in one class, it shouldn't be too difficult.
It depends on what you might ultimately do with your data member I suppose.
By wrapping it up in the accessor you can then do things like lazily retrieving the data if this was an expensive process and not something you want to do unless someone asks for it. On the other hand you might know that it will always be a dumb built-in type and so I can't see any advantage of going through an accessor there. As I say, it depends on the member.
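A sketch of the lazy case, where going through the accessor inside the class does pay off. The Report class and its "expensive result" string are invented; loadCount() exists only to show that the work happens once.

```cpp
#include <cassert>
#include <string>

// Hypothetical class whose data is expensive to produce.
class Report {
public:
    // The accessor hides the lazy computation; display() does not care
    // whether the value was cached or has just been computed.
    const std::string& getData() const {
        if (!loaded_) {
            data_ = "expensive result";  // stands in for costly work
            ++loadCount_;
            loaded_ = true;
        }
        return data_;
    }

    // Uses the accessor, so it never sees the unloaded member.
    std::string display() const { return "Data: " + getData(); }

    int loadCount() const { return loadCount_; }

private:
    mutable std::string data_;
    mutable bool loaded_ = false;
    mutable int loadCount_ = 0;
};
```

If display() read data_ directly, it would print an empty value whenever nothing had triggered the load yet. For a plain built-in member with no such logic, direct access is just as safe.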
To my mind, the most important aspect of this question is: does it make the code more readable and therefore more maintainable? Personally I don't think it does, so I wouldn't do this.
Certainly you should never add a private accessor just to do this; that would be nuts.