I recently came across classes that use a configuration object instead of the usual setter methods for configuration. A small example:
struct AConfiguration { int a, b; };

class A {
    int a, b;
public:
    A(const AConfiguration& conf) { a = conf.a; b = conf.b; }
};
The upsides:
You can extend your object and easily guarantee reasonable default values for new options without your users ever needing to know about it.
You can check a configuration for consistency (e.g. your class only allows some combinations of values; see the sketch after this list).
You save a lot of code by omitting the setters.
You get a default constructor by giving your configuration struct a default constructor and declaring A(const AConfiguration& conf = AConfiguration()).
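A small sketch of such a consistency check (the a <= b rule here is made up purely for illustration):
#include <stdexcept>

struct AConfiguration { int a, b; };

class A {
    int a, b;
public:
    A(const AConfiguration& conf) : a(conf.a), b(conf.b) {
        if (conf.a > conf.b)   // assumed invariant, purely for illustration
            throw std::invalid_argument("AConfiguration: a must not exceed b");
    }
};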
The downside(s):
You need to know the configuration at construction time and can't change it later on.
Are there more downsides to this that I'm missing? If there aren't: Why isn't this used more frequently?
Whether you pass the data individually or per struct is a question of style and needs to be decided on a case-by-case basis.
The important question is this: is the object ready and usable after construction, and does the compiler enforce that you pass all necessary data to the constructor? Or do you have to remember to call a bunch of setters after construction, whose number might increase at any time without the compiler giving you any hint that you need to adapt your code? So whether this is
A(const AConfiguration& conf) : a(conf.a), b(conf.b) {}
or
A(int a_, int b_) : a(a_), b(b_) {}
doesn't matter all that much. (There's a number of parameters where everyone would prefer the former, but which number this is - and whether such a class is well designed - is debatable.) However, whether I can use the object like this
A a1(AConfiguration{42, 42});
A a2 = AConfiguration{4711, 4711};
A a3(7, 7);
or have to do this
A urgh;
urgh.setA(13);
urgh.setB(13);
before I can use the object, does make a huge difference. Especially so, when someone comes along and adds another data field to A.
Using this method makes binary compatibility easier.
When the library version changes, and if the configuration struct contains it, the constructor can distinguish whether an "old" or a "new" configuration is passed and avoid an access violation/segfault when accessing non-existent fields.
Moreover, the mangled name of the constructor is retained; it would have changed if the constructor's signature had changed. This also lets us retain binary compatibility.
Example:
//version 1
struct AConfiguration { int version; int a; AConfiguration(): version(1) {} };
//version 2
struct AConfiguration { int version; int a, b; AConfiguration(): version(2) {} };
class A {
    int a, b;
public:
    A(const AConfiguration& conf) {
        switch (conf.version) {
        case 1: a = conf.a; b = 0;       // No access violation for old callers!
            break;
        case 2: a = conf.a; b = conf.b;  // New callers do have the b member
            break;
        }
    }
};
The main upside is that the A object can be immutable. I don't know if having the AConfiguration struct actually gives any benefit over just an a and a b parameter to the constructor.
Using this method makes binary compatibility harder.
If the struct is changed (one new optional field is added), all code using the class might need a recompile. If one new non-virtual setter function is added, no such recompilation is necessary.
I would second the point about decreased binary compatibility here.
The problem I see comes from direct access to the struct's fields.
struct AConfig1 { int a; int b; };
struct AConfig2 { int a; std::map<int,int> b; };
Since I modified the representation of b, I am screwed, whereas with:
class AConfig1 { public: int getA() const; int getB() const; /* */ };
class AConfig2 { public: int getA() const; int getB(int key = 0) const; /* */ };
The physical layout of the object might have changed, but my getters have not, and the offsets of the functions have not either.
Of course, for binary compatibility, one should check out the PIMPL idiom.
namespace details { class AConfigurationImpl; }

class AConfiguration {
public:
    AConfiguration();
    AConfiguration(const AConfiguration& other);
    AConfiguration& operator=(const AConfiguration& other);
    ~AConfiguration();

    int getA() const;
    int getB() const;
private:
    details::AConfigurationImpl* m_impl;
};
While you do end up writing more code, you have the guarantee here of backward compatibility of your object as long as you add supplementary methods AFTER the existing ones.
The representation of an instance in memory does not depend on the number of methods, it only depends on:
the presence or absence of virtual methods
the base classes
the attributes
Which is what is VISIBLE (not what is accessible).
And here we guarantee that we won't have any change in the attributes. The definition of AConfigurationImpl might change without any problem and the implementation of the methods might change too.
The more code means: constructor, copy constructor, assignment operator and destructor, which is a fair amount, plus of course the getters and setters. Also note that these methods can no longer be inlined, since their implementations are defined in a source file.
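For illustration, the out-of-line part might look roughly like this (a minimal sketch; the contents of AConfigurationImpl are assumed):
// AConfiguration.cpp
namespace details { class AConfigurationImpl { public: int a; int b; }; }

AConfiguration::AConfiguration() : m_impl(new details::AConfigurationImpl()) {}
AConfiguration::AConfiguration(const AConfiguration& other)
    : m_impl(new details::AConfigurationImpl(*other.m_impl)) {}
AConfiguration& AConfiguration::operator=(const AConfiguration& other) {
    *m_impl = *other.m_impl;   // copy the implementation, keep our own pointer
    return *this;
}
AConfiguration::~AConfiguration() { delete m_impl; }

int AConfiguration::getA() const { return m_impl->a; }
int AConfiguration::getB() const { return m_impl->b; }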
Whether or not it suits you, you're on your own to decide.
Related
I am working with an existing C library (that I can't modify) where some structures have opaque fields that must be accessed through specific setters and getters, like in the following crude example (imagining x is private, even though it's written in C).
struct CObject {
int x;
};
void setCObjectX(CObject* o, int x) {
o->x = x;
}
int getCObjectX(CObject* o) {
return o->x;
}
I am writing classes that privately own these types of structures, kind of like wrappers, albeit more complex. I want to expose the relevant fields in a convenient way. At first, I was simply writing setters and getters wherever necessary. However, I thought of something else, and I wanted to know if there are any downsides to the method. It uses function pointers (std::function) to store the C setter-getter pairs and present them as if directly accessing a field instead of functions.
Here is the generic class I wrote to help define such "fake" fields:
template<typename T>
struct IndirectField {
void operator=(const T& value) {
setter(value);
}
auto operator()() const -> T {
return *this;
}
operator T() const {
return getter();
}
std::function<void(const T&)> setter;
std::function<T()> getter;
};
It is used by defining an instance in the C++ class and setting up setter and getter with the corresponding C functions:
IndirectField<int> x;
// ...
x.setter = [=](int x) {
setCObjectX(innerObject.get(), x);
};
x.getter = [=]() {
return getCObjectX(innerObject.get());
};
Here is a complete, working code for testing.
Are there any disadvantages to using this method? Could it lead to eventual dangerous behaviors or something?
The biggest problem I see with your solution is that the std::function objects take space inside each instance of IndirectField in your CPPObject, even when the CObject type is the same.
You can fix this problem by making function pointers into template parameters:
template<typename T,typename R,void setter(R*,T),T getter(R*)>
struct IndirectField {
IndirectField(R *obj) : obj(obj) {
}
void operator=(const T& value) {
setter(obj, value);
}
auto operator()() const -> T {
return *this;
}
operator T() const {
return getter(obj);
}
private:
R *obj;
};
Here is how to use this implementation:
class CPPObject {
std::unique_ptr<CObject,decltype(&freeCObject)> obj;
public:
CPPObject()
: obj(createCObject(), freeCObject)
, x(obj.get())
, y(obj.get()) {
}
IndirectField<int,CObject,setCObjectX,getCObjectX> x;
IndirectField<double,CObject,setCObjectY,getCObjectY> y;
};
This approach trades two std::function objects for one CObject* pointer per IndirectField. Unfortunately, storing this pointer is required, because you cannot get it from the context inside the template.
Your modified demo.
Are there any disadvantages to using this method?
There are a few things to highlight in your code:
Your getters & setters, not being part of the class, break encapsulation. (Do you really want to tie yourself permanently to this library?)
Your example shows a massive amount of copying being done, which will be slower than it needs to be (auto operator()() and operator T(), to name but two).
It takes up more memory than you need to and adds more complexity than just passing around a CObject. If you don't want things to know that it's a CObject, then create an abstract class and pass that abstract class around (see below for an example).
Could it lead to eventual dangerous behaviors or something?
The breaking of encapsulation will result in x being changed through any number of routes, and it forces other things to know how it's stored in the object. Which is bad.
The creation of IndirectField means that every object will have to have getters and setters in this way, which is going to be a maintenance nightmare.
Really I think what you're looking for is something like:
struct xProvider {
    virtual int getX() const = 0;
    virtual void setX(int x) = 0;
};

struct MyCObject : xProvider {
private:
    CObject obj;
public:
    int getX() const override { return obj.x; }
    void setX(int x) override { obj.x = x; }
    CObject& getRawObj() { return obj; }
    // etc ...
};
And then you just pass a reference / pointer to an xProvider around.
This will remove the dependence on this external C library, allowing you to replace it with your own test struct or a whole new library if you see fit, without having to rewrite all the code that uses it.
In a struct, by default (as in your post), all the fields are public, so they are accessible to client software. If you want to make them accessible only to derived classes (you don't need to reimplement anything if you know the field's contract and want to access it in a well-defined way), make them protected. And if you want them to be accessible to nobody, mark them private.
If the author of such software doesn't want the fields to be touched by you, he will mark them private, and then there is nothing you can do but adapt to that behaviour. Failing to do so will have bad consequences.
Suppose a field is modified through a set_myField() method that calls a list of listeners any time you make a change. If you bypass the method and access the field directly, all the listeners (many of them of unknown origin) are bypassed and won't be notified of the change. This is quite common in object-oriented programming, so you must obey the rules the authors impose on you.
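A minimal sketch of that kind of setter (all names invented for illustration):
#include <functional>
#include <utility>
#include <vector>

class Model {
public:
    void set_myField(int value) {
        myField = value;
        for (const auto& listener : listeners)   // notify every registered observer
            listener(value);
    }
    void addListener(std::function<void(int)> l) { listeners.push_back(std::move(l)); }
private:
    int myField = 0;
    std::vector<std::function<void(int)>> listeners;
};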
I am starting to code bigger objects, having other objects inside them.
Sometimes, I need to be able to call methods of a sub-object from outside the class of the object containing it, from the main() function for example.
So far I was using getters and setters as I learned.
This would give something like the following code:
class SubObject {
public:
    bool SetMode(int mode);
    int GetMode();
private:
    int m_mode = 0;
};

class Object {
public:
    bool SetSubMode(int mode);
    int GetSubMode();
private:
    SubObject subObject;
};

bool Object::SetSubMode(int mode) { return subObject.SetMode(mode); }
int Object::GetSubMode() { return subObject.GetMode(); }

bool SubObject::SetMode(int mode) { m_mode = mode; return true; }
int SubObject::GetMode() { return m_mode; }
This feels very sub-optimal and forces me to write (ugly) code for every method that needs to be accessible from outside. I would like to be able to do something as simple as Object->SubObject->Method(param);
I thought of a simple solution: putting the sub-object as public in my object.
This way I should be able to simply access its methods from outside.
The problem is that when I learned object oriented programming, I was told that putting anything in public besides methods was blasphemy and I do not want to start taking bad coding habits.
Another solution I came across during my research before posting here would be to add a public pointer to the sub-object.
How can I access a sub-object's methods in a neat way?
Is it allowed / a good practice to put an object inside a class as public to access its methods? How to do without that otherwise?
Thank you very much for your help on this.
The problem with both a pointer and public member object is you've just removed the information hiding. Your code is now more brittle because it all "knows" that you've implemented object Car with 4 object Wheel members. Instead of calling a Car function that hides the details like this:
Car.SetRPM(200); // hiding
You want to directly start spinning the Wheels like this:
Car.wheel_1.SetRPM(200); // not hiding! and brittle!
Car.wheel_2.SetRPM(200);
And what if you change the internals of the class? The above might now be broken and need to be changed to:
Car.wheel[0].SetRPM(200); // not hiding!
Car.wheel[1].SetRPM(200);
Also, for your Car you can say SetRPM() and the class figures out whether it is front wheel drive, rear wheel drive, or all wheel drive. If you talk to the wheel members directly that implementation detail is no longer hidden.
Sometimes you do need direct access to a class's members, but one goal in creating the class was to encapsulate and hide implementation details from the caller.
Note that you can have Set and Get operations that update more than one bit of member data in the class, but ideally those operations make sense for the Car itself and not specific member objects.
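A rough sketch of how that hiding could look (the drive-type logic here is invented for illustration, not taken from any real code):
class Wheel {
public:
    void SetRPM(int rpm) { m_rpm = rpm; }
private:
    int m_rpm = 0;
};

class Car {
public:
    enum class Drive { Front, Rear, All };

    explicit Car(Drive drive) : m_drive(drive) {}

    void SetRPM(int rpm) {
        // The caller never learns how many wheels exist or which ones are driven.
        switch (m_drive) {
        case Drive::Front: m_wheels[0].SetRPM(rpm); m_wheels[1].SetRPM(rpm); break;
        case Drive::Rear:  m_wheels[2].SetRPM(rpm); m_wheels[3].SetRPM(rpm); break;
        case Drive::All:   for (Wheel& w : m_wheels) w.SetRPM(rpm);          break;
        }
    }

private:
    Drive m_drive;
    Wheel m_wheels[4];
};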
I was told that putting anything in public besides methods was blasphemy
Blanket statements like this are dangerous; there are pros and cons to each style that you must take into consideration, but an outright ban on public members is a bad idea IMO.
The main problem with having public members is that it exposes implementation details that might be better hidden. For example, let's say you are writing some library:
struct A {
struct B {
void foo() {...}
};
B b;
};
A a;
a.b.foo();
Now a few years down the line you decide that you want to change the behavior of A depending on the context; maybe you want to make it run differently in a test environment, maybe you want to load from a different data source, etc. Heck, maybe you just decide the name of the member b is not descriptive enough. But because b is public, you can't change the behavior of A without breaking client code.
struct A {
struct B {
void foo() {...}
};
struct C {
void foo() {...}
};
B b;
C c;
};
A a;
a.c.foo(); // Uh oh, everywhere that uses b needs to change!
Now if you were to let A wrap the implementation:
class A {
public:
    void foo() {
        if (TESTING) {
            b.foo();
        } else {
            c.foo();
        }
    }
private:
    struct B {
        void foo() {...}
    };
    struct C {
        void foo() {...}
    };
    B b;
    C c;
};
A a;
a.foo(); // I don't care how foo is implemented, it just works
(This is not a perfect example, but you get the idea.)
Of course, the disadvantage here is that it requires a lot of extra boilerplate, like you have already noticed. So basically, the question is "do you expect the implementation details to change in the future, and if so, will it cost more to add boilerplate now, or to refactor every call later?" And if you are writing a library used by external users, then "refactor every call" turns into "break all client code and force them to refactor", which will make a lot of people very upset.
Of course instead of writing forwarding functions for each function in SubObject, you could just add a getter for subObject:
SubObject& getSubObject() { return subObject; }
// ...
object.getSubObject().setMode(0);
This suffers from some of the same problems as above, although it is a bit easier to work around, because the SubObject interface is not necessarily tied to the implementation.
All that said, I think there are certainly times where public members are the correct choice. For example, simple structs whose primary purpose is to act as the input for another function, or who just get a bundle of data from point A to point B. Sometimes all that boilerplate is really overkill.
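For example, something along these lines is usually fine with public members (a made-up illustration):
#include <iostream>

// A plain bundle of data whose only job is to carry values from A to B.
struct RenderOptions {
    int width = 800;
    int height = 600;
    bool antialias = true;
};

void render(const RenderOptions& options) {   // hypothetical consumer
    std::cout << options.width << "x" << options.height << "\n";
}

int main() {
    RenderOptions opts;
    opts.antialias = false;   // direct access is the whole point here
    render(opts);
}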
This is my first post. I believe I am aware of the best practices on Stack Overflow, but probably not 100%. I believe there is no existing post that addresses my question; I also hope it's not too vague.
I am trying to figure out good practices for writing C++ constructors
that do medium-to-heavy-duty work.
Pushing (all?) init work into initialization lists seems a good idea
for two reasons that cross my mind, namely:
Resource Acquisition Is Initialization
As far as I know, the simplest way of guaranteeing that members
are initialized correctly at resource acquisition is to make sure that
what's inside the parentheses of the initialization list is correct
when it is evaluated.
class A
{
public:
    A(const B & b, const C & c)
        : _c(c)
    {
        /* _c was allocated and defined at the same time */
        /* _b was default-constructed; it doesn't hold the intended value yet */
        _b = b;
    }
private:
    B _b;
    C _c;
};
const class members
Using initialization lists is the only correct way of using
const members which can hold actual content.
class A
{
public:
    A(int m, int n, int p)
        : _m(m) /* correct, _m will be initialized to m */
    {
        _n = n; /* error: _n is const and can only be set in the initialization list */
        *(const_cast<int*>(&_p)) = p; /* compiles, but modifying an object declared const is undefined behaviour, and not particularly RAII-friendly */
    }
private:
    const int _m, _n, _p;
};
However some problems seem to affect over usage of initialization lists:
Order
Member variables are always initialized in the order they are declared in the class definition, so write them in that order in the constructor initialization list. Writing them in a different order just makes the code confusing because it won't run in the order you see, and that can make it hard to see order-dependent bugs.
http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#S-discussion
This is important if you initialize a value using a value
initialized previously in the list. For example:
A(int in) : _n(in), _m(_n) {}
If _m is defined before _n, its value at initialization is undefined.
I am ready to apply this rule in my own code, but when working with other people it causes redundancy (the order lives both in the class definition and in the constructor) and forces you to read two files at once. That is hard to accept and somewhat error-prone.
Solution — initialize using only data from ctor arguments.
Solution's problem — keeping work in the init list without
inner dependency means repeating operations. For example:
int in_to_n(int in)
{
/* ... */
return n;
}
int modify(int n)
{
/* ... */
return modified;
}
A::A(int in)
: _n(in_to_n(in))
, _n_modified(modify(in_to_n(in)))
{}
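For contrast, the version with an inner dependency avoids the repeated call, but it only works because _n is declared before _n_modified in the class:
A::A(int in)
    : _n(in_to_n(in))
    , _n_modified(modify(_n))   // fine only because _n is initialized first
{}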
For tiny bits of repeated work I believe compilers can reuse the already-computed result, but I don't think one should rely on that for anything significant (and I doubt it happens at all when calling non-inlined, separately compiled code).
How much work can you put in the list?
In the previous example, I called functions to compute what the
attributes are to be initialized to. These can be plain/lambda
functions or static/nonstatic methods,
of the current class or of another.
(I don't suggest using nonstatic methods of the current class,
it might even be undefined usage according to the standard, not sure.)
I guess this is not in itself a big problem, but one needs to make
special efforts in clarity to keep the intent of the code clear if
writing big classes that do big work that way.
Also, when trying to apply the solution to the previous problem,
there is only so much independent work you can do when initializing
your instance... This usually gets big if you have a long sequence
of attributes to initialize with inner dependencies.
It's starting to look like just the program, translated into an
initialization list; I guess this is not what C++ is supposed to be
transitioning into?
Multiple inits
One often computes two variables at once. Setting two variables
at once in an init list means either:
using an ugly intermediate attribute
struct InnerAData
{
    B b;
    C c;
};
/* must be exported with the class definition (ugly) */

class A
{
public:
    A(const D & input)
        : _inner(work(input))
        , _b(_inner.b)
        , _c(_inner.c) {}
private:
    InnerAData _inner; /* must be declared first so it is initialized before _b and _c */
    B _b;
    C _c;
};
This is awful and forces extra useless copies.
or some ugly hack
class A
{
public:
    A(const D & input) : _b(work(input)) {}
private:
    B _b;
    C _c;
    B work(const D & input)
    {
        /* ... work ... */
        _c = ...;
        return ...;
    }
};
This is even more awful, and it doesn't even work for const attributes or attributes of non-builtin types.
keeping stuff const
Sometimes it can take most of the ctor to figure out the value to give to an attribute, so making it const, and therefore moving the work to the initialization list, can feel constraining. I won't give a full example, but think
something like computing data from a default filename, then
computing the full filename from that data, then checking if
the corresponding file exists to set a const boolean, etc.
I guess it's not a fundamental problem, but all that seems
intuitively more legible in the body of the ctor, and moving
it to the init list just to do a correct initialization of
a const field seems overkill. Maybe I'm just imagining things.
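A rough sketch of the kind of thing I mean (all names made up):
#include <fstream>
#include <string>

static std::string defaultFileName() { return "settings.cfg"; }
static std::string fullPath(const std::string& name) { return "/etc/myapp/" + name; }
static bool fileExists(const std::string& path) { return std::ifstream(path).good(); }

class Settings
{
public:
    Settings()
        : _path(fullPath(defaultFileName()))
        , _exists(fileExists(_path))   // relies on _path being declared, and thus initialized, first
    {}
private:
    const std::string _path;
    const bool _exists;
};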
So here's the hard part: asking a specific question!
Have you faced similar problems, did you find a better solution,
if not what's the lesson to learn — or is there something I'm
missing here?
I guess my problem is I'm pretty much trying to move all the work
to the init list when I could search for a compromise of what state
is initiated and leave some work for later. I just feel like init list
could play a bigger role in making modern C++ code than it does but
I haven't seen them pushed further than basic usage yet.
Additionally, I'm really not convinced as to why the values are
initialized in that order, and not in the order of the list.
I've been orally told it's because attributes are in order on the stack and
the compiler must guarantee that stack data is never above the SP.
I'm not sure how that's a final answer... pretty sure one could
implement safe arbitrarily reordered initialization lists,
correct me if I'm wrong.
In your code:
class A
{
public:
    A(const B & b, const C & c)
        : _c(c)
    {
        /* _c was allocated and defined at the same time */
        /* _b was default-constructed; it doesn't hold the intended value yet */
        _b = b;
    }
private:
    B _b;
    C _c;
};
the constructor calls B::B() and then B::operator=, which may be a problem if either doesn't exist, is expensive, or is not implemented correctly according to RAII and the rule of three. The rule of thumb is to always prefer the initializer list if it is possible.
Since C++11, an alternative is to use delegating constructors:
struct InnerAData;
InnerAData work(const D&);

class A
{
public:
    A(const D & input) : A(work(input)) {}
private:
    A(const InnerAData&);
private:
    const B _b;
    const C _c;
};
And in a source file (this could be inlined, but then it is visible in the header):
struct InnerAData
{
    B b;
    C c;
};

A::A(const InnerAData& inner) : _b(inner.b), _c(inner.c) {}
Regarding the following, are there any reasons to do one over the other or are they roughly equivalent?
class Something
{
int m_a = 0;
};
vs
class Something
{
int m_a;
Something(int p_a);
};
Something::Something(int p_a):m_a(p_a){ ... };
The two code snippets you posted are not quite equal.
class Something
{
int m_a = 0;
};
Here you specify the value with which to initialise, i.e. 0, at compile time.
class Something
{
int m_a;
Something(int p_a);
};
Something::Something(int p_a):m_a(p_a){ ... };
And here you do it at run time, with the value p_a not known until the constructor is called.
The following piece of code comes closer to your first example:
class Something
{
int m_a;
Something();
};
Something::Something() : m_a(0) { /* ... */ };
What you have to consider here is that in the first case, the value appears directly in the class definition. This may create an unnecessary dependency. What happens if you need to change your 0 to 1 later on? Exposing the value directly in the class definition (and thus, usually, in a header file) may cause recompilation of a lot of code, whereas the other form of initialisation avoids that, because the Something::Something() : m_a(0) part is neatly encapsulated in a source file and does not appear in a header file:
// Something.h - stable header file, never changed
class Something
{
int m_a;
Something();
};
// Something.cpp - can change easily
Something::Something() : m_a(0) { /* ... */ };
Of course, the benefits of in-class initialisation may vastly outweigh this drawback. It depends. You just have to keep it in mind.
The first form is more convenient if you have more than one constructor (and want them all to initialise the member in the same way), or if you don't otherwise need to write a constructor.
The second is required if the initialiser depends on constructor arguments, or is otherwise too complicated for in-class initialisation; and might be better if the constructor is complicated, to keep all the initialisation in one place. (And it's also needed if you have to support pre-C++11 compilers.)
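For instance, with several constructors the in-class initialiser covers them all (a small sketch):
class Something
{
    int m_a = 0;   // every constructor that doesn't mention m_a gets this value
    int m_b = 0;
public:
    Something() = default;                  // m_a == 0, m_b == 0
    explicit Something(int b) : m_b(b) {}   // m_a is still 0, m_b == b
};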
The first form is new to C++11 and so at this point isn't terribly well supported, especially if you need to support a variety of older compilers.
Otherwise they should be roughly equivalent when a C++11 compiler is available.
Elaborating on Christian Hackl's answer.
The first form allows you to initialize m_a and still have a default c'tor at the same time. Or you can even be explicit in your code and define a constructor with the default keyword:
class Something
{
int m_a = 0;
// explicitly tell the compiler to generate a default c'tor
Something() = default;
};
With the second form, an auto-generated default c'tor would leave m_a uninitialized, so if you want to initialize to a hard-coded value, you have to write your own default c'tor:
class Something
{
int m_a;
// implement your own default c'tor
Something() : m_a(0) {}
};
class Something
{
int m_a = 0;
};
behaves as if m_a were initialized with 0 in the initialization list of every constructor. (Note that int m_a(0); is not legal as an in-class initializer; only the = and {} forms are allowed there.)
So, doing
class Something
{
int m_a;// (0) is moved to the constructor
public:
Something(): m_a(0){}
};
yields a uniform syntax for initialization that requires or does not require run-time input.
Personally I don't like the first form because it looks like a "declaration followed by an assignment", which is a complete misconception.
If you change the code as Christian did so that both snippets do the same thing, it becomes a tradeoff between compile-time optimization (which Christian explained completely) and run-time optimization.
class Foo {
public:
    std::vector<double> vec1;
    // ...
    std::vector<double> vecN;
};
Imagine you have to initialize each vector with some predefined doubles. If the program must instantiate many, many objects of this class, it is better to initialize the vectors in the header file to make the user happy by reducing the run-time!
Even though it's supported, this type of initialization can create bugs that are pretty hard to track down. It's the type of "aesthetic optimization" that you will regret a couple of months down the road.
See the example below:
class_x_1.h:
class X
{
private:
int x = 10;
public:
int GetX();
};
class_x_2.h:
class X
{
private:
int x = 20;
public:
int GetX();
};
class_x.cpp:
#include "class_x_1.h" // implementation uses the version that initializes x with 10
int X::GetX()
{
return x;
}
main.cpp:
#include "class_x_2.h" // main includes definition that initializes x with 20
#include <iostream>
int main()
{
X x;
std::cout << x.GetX() << std::endl;
return 0;
}
Output:
20
As expected, it will return 20 because this is the version that initializes x with 20. However, if you go to the implementation of X, you might expect it to be 10. (Having two different definitions of class X in the same program violates the One Definition Rule, so formally the behaviour is undefined.)
I'm developing a class for my AVR CPU.
The class is going to handle all port-related operations like Set, Read, etc.
Constructor looks like:
(...)
public:
Port(volatile uint8_t * DDR, volatile uint8_t * PORT);
(...)
And it is constructed at the beginning of the main():
int main()
{
Port PortA(&DDRA, &PORTA);
(...)
}
Now I want to make sure that nowhere else in the program an object with the same parameters can be constructed. Creating an array or a map at run time to detect duplicates and throw an exception is not an option; it must be done at compile time. So basically I want to force avr-g++ to check whether any other Port with the same first parameter OR the same second parameter exists in the current project.
All functions use pointers/references to Port objects.
How about the following?
The addresses the pointers point to have to be known at compile time.
template<int *p>
class Tester
{
public:
static Tester& GetInstance()
{
static Tester t;
return t;
}
private:
Tester() {}
};
int i;
int main()
{
Tester<&i>& t = Tester<&i>::GetInstance();
}
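Mapped onto the Port case, the same idea could look roughly like the sketch below. Note the placeholder registers: real AVR register names such as DDRA typically expand to macros over fixed addresses, which cannot be used as template arguments, so this only works if the registers can be named as objects with linkage.
#include <stdint.h>

// Placeholder "registers" standing in for the real memory-mapped ones.
volatile uint8_t fakeDDRA;
volatile uint8_t fakePORTA;

template<volatile uint8_t* DDR, volatile uint8_t* PORT>
class UniquePort
{
public:
    static UniquePort& GetInstance()
    {
        static UniquePort instance;   // exactly one instance per (DDR, PORT) pair
        return instance;
    }
    void Set(uint8_t value) { *PORT = value; }   // example operation using the template arguments
private:
    UniquePort() { *DDR = 0xFF; }                // configure as output, for illustration
};

int main()
{
    UniquePort<&fakeDDRA, &fakePORTA>& portA = UniquePort<&fakeDDRA, &fakePORTA>::GetInstance();
    portA.Set(0x01);
}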
If you only want one instance of a class, then why use classes? The whole purpose of classes is that you can have multiple instances. Just stick to global variables and free functions.
This is not pretty, but it does get the job done. 10 years experience says embedded work tends to be like this at times. :/
I tested this with a simple class under MSVC, you'll have to adapt for your needs, but it does demonstrate the principle.
Declare your class like this in the header:
class t
{
public:
#ifdef ALLOW_CTOR
t(int i);
~t();
#endif
int get();
private:
int m_i;
};
Then in exactly one file add the line:
#define ALLOW_CTOR 1
include the header file, declare your port instances as file scope variables, and then implement all the class methods of your class, ctor, dtor, accessors, etc.
If you do not put the ALLOW_CTOR line in other source files, they can still use the accessor methods of your class, but they can't use the constructor. In the absence of the correct constructor, MSVC complains because it tries to use the default copy constructor, but since the parameter lists don't match, it fails. I suspect the compiler you're using will fail in some way as well; I'm just not sure exactly what the failure will be.
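The privileged source file might then look something like this (a sketch; the header name t.h and the port values are made up):
// ports.cpp - the only translation unit that defines ALLOW_CTOR
#define ALLOW_CTOR 1
#include "t.h"

// File-scope instances: the only places where the constructor is visible.
static t port_a(1);
static t port_b(2);

t::t(int i) : m_i(i) {}
t::~t() {}
int t::get() { return m_i; }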
@Neil Kirk:
If failure to provide a declaration of a ctor at all for a compilation unit leads to undefined behavior, then there are ways to solve that.
class t
{
#ifdef ALLOW_CTOR
public:
#else
private:
#endif
t(int i);
public:
~t();
int get();
private:
int m_i;
};
marks the ctor as public in one, but private in all other compilation units, leading to a well defined error: the inability to access a private member function.