C++ API: Modifying internal objects - c++

I've got two related questions.
At the moment I am designing/writing a C++ API, in which I need to be able to modify an object that is held by another object.
It is comparable to this example:
class Bar
{
public:
Bar(int x) : num(x){}
void setNum(int x)
{
num = x;
}
int getNum()
{
return num;
}
private:
int num;
};
class Foo
{
public:
Foo() = default;
void setBar(std::unique_ptr<Bar> newBar)
{
bar = std::move(newBar);
}
Bar* getBar()
{
return bar.get();
}
private:
std::unique_ptr<Bar> bar;
};
The class Foo takes ownership of Bar, however, Bar must be able to be modified.
Here, Foo is the main class the user would interact with.
Whilst Bar could be considered more as a data type, which changes the output of Foo.
Is the solution of returning a raw pointer to Bar the preferred option?
I have the feeling that this breaks encapsulation, which is a no-go for API design.
My googling efforts haven't given me a concrete answer to exactly this problem yet.
But I might just be looking with the wrong search terms.
The second part of this question is how this example would change if Bar were stored in a container in Foo.
Would I return a pointer to the whole container, an iterator for the container ...?

If you worry about getBar breaking the encapsulation, then you should also see void setBar(std::unique_ptr<Bar> newBar) as such a breakage.
Allowing bar to be set from outside means that knowledge of bar is possibly not exclusive to Foo. Whoever passes the Bar to Foo might still be able to modify it (for example through a raw pointer obtained before the move), so Foo cannot make any assumptions about the state of Bar, as it could change at any time.
On the other hand, if you only want read access to Bar through Foo, then a const Bar* getBar() or const Bar& getBar() would not break the encapsulation that much, because getBar would not allow changing Bar.
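A minimal sketch of that read-only accessor, reusing the Foo and Bar classes from the question. Two things here are assumptions rather than part of the original: getNum is marked const (the question's version is not, and it has to be for this to compile), and readThroughFoo is a hypothetical helper added only to show the caller's side.

```cpp
#include <memory>
#include <utility>

class Bar {
public:
    explicit Bar(int x) : num(x) {}
    void setNum(int x) { num = x; }
    int getNum() const { return num; }  // const, so it is callable through a const Bar*
private:
    int num;
};

class Foo {
public:
    void setBar(std::unique_ptr<Bar> newBar) { bar = std::move(newBar); }

    // Read-only view: callers can inspect the Bar but not modify it.
    // Returns nullptr if no Bar has been set yet.
    const Bar* getBar() const { return bar.get(); }
private:
    std::unique_ptr<Bar> bar;
};

// Hypothetical helper showing what a caller can and cannot do.
int readThroughFoo(const Foo& foo) {
    const Bar* b = foo.getBar();
    // b->setNum(1);  // would not compile: setNum is a non-const member
    return b ? b->getNum() : -1;
}
```

The pointer can still be null here, which is why a reference would only be an option if Foo always held a Bar.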

Is the solution of returning a raw pointer to Bar the preferred option?
It's a solution and not necessarily a bad one. Returning a reference would be preferable in cases where the object always exists (not in this case since Foo's default constructor doesn't create a Bar).
Some programmers prefer to use a wrapper for bare pointers (such as observer_ptr, which has been proposed to the standard) to distinguish it from a pointer whose purpose is to iterate an array, or from owning bare pointers (the latter of which should be avoided).
I have the feeling that this breaks the encapsulation
Your whole premise is to break the encapsulation since you want to "modify internal objects". If you want to avoid breakage of encapsulation, then you may need to change the design further up so that you don't need to modify internal objects (externally).
A solution that doesn't break encapsulation is to provide a specific interface to Foo for modification, such as:
void Foo::transmogrify_bar(int gadgets) {
    bar->transmogrify(gadgets);
}
Whether this encapsulation is useful for your API is another matter. In some cases it is essential, in others it doesn't matter much.
if Bar would be stored in a container in Foo. Would I return a pointer to the whole container, an iterator for the container ...?
Would you want the client to be able to modify the container (add, remove elements)?
This would break the encapsulation further. Instead, you could provide begin and end iterators that don't allow modification of the container itself, which gives you encapsulation equivalent to returning a pointer in the single-object case.
You could provide only const iterators, and add an iterator as an argument to transmogrify to keep modification of Bar encapsulated.
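The const-iterator idea could look roughly like the sketch below. It assumes a std::vector<Bar> as the container and uses hypothetical names (addBar, setNumAt, sumOfBars); setNumAt plays the role of the transmogrify call that takes an iterator.

```cpp
#include <cstddef>
#include <vector>

class Bar {
public:
    explicit Bar(int x) : num(x) {}
    void setNum(int x) { num = x; }
    int getNum() const { return num; }
private:
    int num;
};

class Foo {
public:
    using const_iterator = std::vector<Bar>::const_iterator;

    void addBar(int x) { bars.emplace_back(x); }

    // Read-only traversal: elements cannot be changed or removed through these.
    const_iterator begin() const { return bars.begin(); }
    const_iterator end() const { return bars.end(); }

    // Modification goes through Foo. pos must come from this Foo's
    // begin()/end() range and must not be end().
    void setNumAt(const_iterator pos, int x) {
        bars[static_cast<std::size_t>(pos - bars.cbegin())].setNum(x);
    }
private:
    std::vector<Bar> bars;  // the container type stays a private detail
};

// Hypothetical read-only client: works with a const Foo.
int sumOfBars(const Foo& foo) {
    int sum = 0;
    for (const Bar& b : foo) sum += b.getNum();
    return sum;
}
```

Clients can iterate and read, but every change to a Bar is routed through a member of Foo.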
Finally, for full encapsulation, you would need to use the PIMPL pattern to hide Bar completely.

Related

Restrict access to class

I've got some class - let's call it MyMutableClass, which implements MutableInterface.
class MutableInterface {
public:
void setMyPreciousData(int value);
int getMyPreciousData() const;
.... //and so on
};
However, there is a large part of the code which should not change the state of this class instance, but which needs read access.
How to do this in the most polite manner? Should I create an additional ImmutableInterface, with getters only, and have MutableInterface inherit from it? Then I can choose which one will be passed to other parts of the code.
A second option would be to create another class whose objects encapsulate the MutableInterface implementation and provide access only to a subset of its methods. Is that better?
Is there some well-known pattern which I'm not aware of?
This won't be what you want to hear, but I think it's important to be said in this case.
Inheritance describes an 'is a kind of' relationship. The derived thing 'is a kind of' the base thing.
A const thing is not 'a kind of' mutable thing. It's an immutable thing.
A mutable thing is not 'a kind of' immutable thing. It's a thing which happens to be mutable.
Mutability is a property of the thing, not a specialisation.
Therefore, inheritance is the wrong model, and this is why in C++ constness is a property, not an interface.
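To make "constness is a property" concrete, here is a small sketch. MyMutableClass is given a trivial assumed implementation, and inspect and update are hypothetical functions standing in for the read-only and read-write parts of the code base.

```cpp
class MyMutableClass {
public:
    void setMyPreciousData(int value) { data_ = value; }
    int getMyPreciousData() const { return data_; }
private:
    int data_ = 0;
};

// Read-only code takes a const reference: only const members are callable.
int inspect(const MyMutableClass& obj) {
    // obj.setMyPreciousData(1);  // would not compile
    return obj.getMyPreciousData();
}

// Code that is allowed to change state takes a non-const reference.
void update(MyMutableClass& obj, int value) {
    obj.setMyPreciousData(value);
}
```

No second interface is needed: the same object is passed either as const or as non-const, and the compiler enforces the difference.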
If you really must hide the fact that sometimes a thing is mutable (one wonders why), then as mentioned in the comments, you probably want some kind of proxy view class, such as:
// this is the actual thing
struct the_thing
{
void change_me();
int see_me() const;
};
// and this is the proxy
struct read_only_thing_view
{
    // forwards only the read-only operation to the referenced object
    int see_me() const { return _referent.see_me(); }
    the_thing& _referent;
};

Modifying private object properties through method which returns reference

I'm curious whether this is a proper way of doing assignment
class Foo {
int x_;
public:
int & x() {
return x_;
}
};
My teacher makes assignments like this: obj.x() = 5;
But IMO that's not the proper way of doing it; it's not obvious, and it would be better to use a setter here. Is that a violation of clear and clean code? If we follow the rule that code should read like a book, then this code is bad, am I right? Can anyone tell me if I am right? :)
IMO, this code is not good practice in terms of evolution. If you later need to add checking or formatting of changes, you have to refactor your class API, which can become a problem over time.
Having set_x() would be way cleaner. Moreover, it would allow you to have checking mechanics in your setter.
A proper getter get_x() or x() could also apply some logic (formatting, anything...) before returning. In your case, you should return int instead of int&, since the setter should be used for modification (no direct modification allowed).
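A sketch of what such a getter/setter pair might look like. The non-negative rule and the try_set_x helper are invented purely for illustration; the point is only that a check can be added later without touching callers.

```cpp
#include <stdexcept>

class Foo {
    int x_ = 0;
public:
    int x() const { return x_; }  // returns a copy, not a reference

    // Hypothetical rule for illustration: x must not be negative.
    void set_x(int value) {
        if (value < 0) {
            throw std::invalid_argument("x must be non-negative");
        }
        x_ = value;
    }
};

// Hypothetical helper: reports whether the setter accepted the value.
bool try_set_x(Foo& obj, int value) {
    try {
        obj.set_x(value);
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

With int& x(), the same check would have no place to live, because the caller writes to the member directly.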
And truly speaking, this code doesn't really make sense: it returns a reference to a member, making it fully modifiable. Why not simply make the member public, then, and avoid creating an additional method?
Do you want control over your data or not? If so, then you probably want a proper getter and setter. If not, you probably don't need a method; just make it public.
To conclude, I would say you are right, because the approach you describe holds up better over time: it allows non-breaking changes and is easier to read.
As the UNIX philosophy mentions : "Rule of Clarity: Clarity is better than cleverness."
Assuming that x() happens to be a public (or protected) member, the function effectively exposes an implementation detail: there is an int held somewhere. Whether that is good or bad depends on context, and as it stands there is very little context.
For example, if x() were actually spelled operator[](Key key) and part of a container class with subscript operator like std::vector<T> (in which case Key would really be std::size_t) or std::map<Key, Value> the use of returning a [non-const] reference is quite reasonable.
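For instance, a minimal container along those lines might look like this. IntArray is a made-up class for illustration; it mirrors how std::vector pairs a non-const and a const overload of operator[].

```cpp
#include <cstddef>
#include <vector>

class IntArray {
public:
    explicit IntArray(std::size_t n) : data_(n, 0) {}

    // Non-const access: returns a reference so that a[i] = value works.
    int& operator[](std::size_t i) { return data_[i]; }

    // Const access: elements can be read but not assigned.
    const int& operator[](std::size_t i) const { return data_[i]; }

    std::size_t size() const { return data_.size(); }
private:
    std::vector<int> data_;
};
```

Here returning a reference is the whole purpose of the function, so a[1] = 5 reads naturally.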
On the other hand, if the advice is to have such functions for essentially all members in a class, it is a rather bad idea, as this essentially allows uncontrolled access to the class's state. Having access functions for all members is generally also an indication that there is no abstraction: having setters/getters for members tends to indicate that the class is actually just an aggregate of values, and a struct with all public members would likely serve the purpose as well, if not better. Actual abstractions where access to the data matters tend to expose an interface which is independent of its actual representation.
In this example, the effect of returning a (non-const) reference is the same as if you made the variable public. Any encapsulation is broken. However, that is not a bad thing by default. A case where this can help a lot is when the variable is part of a complicated structure and you want to provide an easy interface to that variable. For example
class Foo {
std::vector<std::list<std::pair<int,int>>> values;
public:
int& getFirstAt(int i){
return values[i].front().first; // std::list has no operator[], so use front()
}
};
Now you have easy access to the first member of the first pair in the list at position i and don't need to write the full expression every time.
Or your class might use some container internally, but which container it is should be a private detail. Then, instead of exposing the full container, you could expose references to the elements:
class Bar {
std::vector<int> values; // vector is private!!
public:
int& at(int i){ // accessing elements is public
return values.at(i);
}
};
In general, such code confuses readers.
obj.x() = 5;
However it is not rare to meet for example the following code
std::vector<int> v = { 1, 0 };
v.back() = 2;
It is a drawback of the C++ language.
In C# this drawback was avoided by introducing properties.
As for this particular example it would be better to use a getter and a setter.
For example
class Foo {
int x_;
public:
int get_value() const { return x_; }
void set_value( int value ) { x_ = value; }
};
In this case the interface can be kept while the implementation changes.

Enforce constness for pointed data in C++?

Let there be a Foo class with some const and non-const methods
struct Foo
{
Foo ();
~Foo();
void noSideEffect() const;
void withSideEffect();
};
I also have a Bar class that needs to refer to Foo in some way. To be more precise, maybe too precise for this question, Bar implements operators || and && for unions and intersections, so two Bar instances need to know they are working on the same instance of Foo.
The simplest solution I found was to use a pointer to a Foo object:
struct Bar
{
    Foo * p_foo;
    Bar (Foo& foo)
    : p_foo(&foo) {}
};
Now two Bar instances can play together and see if they are both handling the same Foo. I'm almost happy.
But now I would like to sometimes use Bar with const Foo instances. Well, it might be easy, I just have to create a const Bar instance, right? There we go:
const Bar createBarFromConstFoo(const Foo& foo)
{
Foo* newfoo = const_cast<Foo*>(&foo);
const Bar newbar (*newfoo);
return newbar;
}
And now the nightmare begins (see Why doesn't C++ enforce const on pointer data?). I think I understand the why (the standard says so), my main problem is how to best cope with it.
Except for this little standard thing, createBarFromConstFoo does almost what I want, since it returns a const Bar.
Is there a way to prevent a const Bar from doing nasty things with my (initially) const Foo (i.e. to only let it call const methods of Foo) while allowing a non-const Bar to do everything?
Maybe there is no way to do that and it's an object design issue, but I do not see a simple alternative.
Edit: to downvoters, can you please explain why, I may be able to progress from your remarks...
Edit 2: Maybe obfuscating the real classes behind Foo and Bar was a bad idea, I just wanted to simplify things.
So Foo is in fact a Molecule (and in fact a Protein), which contains Atoms (many for a protein). Being able to select some atoms is the reason to create Bar, which is a SelectionOfAtoms.
It is sometimes convenient to select, for example, all hydrogen and oxygen atoms, so Bar implements unions and intersections. I want to be able to extract those atoms, so SelectionOfAtoms implements a createNewMolecule() method that builds a molecule from the selected atoms. It therefore needs a way to refer to the original molecule (maybe some kind of copy would do here, but maybe not with the other requirements below).
But I recently felt the need to modify the atoms of a selection while keeping the other atoms unmodified. Doing it through SelectionOfAtoms (Bar) was convenient: it already knows where to find the Atoms (using the pointer) and the indices of these atoms (an internal implementation detail), so everything needed to change atoms is almost already there. The catch is that I can either use Selection only on a non-const Molecule, or work on a const Molecule and forget about modifying atoms, or descend into const_cast horror.
I'm sure it's a pretty bad design, but it is what is already there; it can surely be improved a lot.
Using the STL as a guide, consider your molecule as a container, and your selection as something like an iterator or iterator range.
Now, in this scheme you'd have separate types for the const and non-const selections/iterators, which makes sense since they have different semantics. Making the constness a template parameter is probably a false economy unless there's a lot more code in the selection than you've suggested.
Now, you start off with either a const or a non-const molecule, and you know statically that you're getting either a const_selection or a (non-const) selection.
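A rough sketch of the two-type scheme, in the spirit of iterator and const_iterator. Everything here (Atom with a charge field, Molecule as a plain vector of atoms, the member names) is assumed for illustration and is not the asker's real code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Atom { int charge = 0; };
struct Molecule { std::vector<Atom> atoms; };

// Read-only selection: can be built from a const Molecule.
class ConstSelection {
public:
    ConstSelection(const Molecule& m, std::vector<std::size_t> idx)
        : mol_(&m), idx_(std::move(idx)) {}

    bool sameMolecule(const ConstSelection& other) const { return mol_ == other.mol_; }

    int totalCharge() const {
        int sum = 0;
        for (std::size_t i : idx_) sum += mol_->atoms[i].charge;
        return sum;
    }
private:
    const Molecule* mol_;
    std::vector<std::size_t> idx_;
};

// Mutable selection: requires a non-const Molecule.
class Selection {
public:
    Selection(Molecule& m, std::vector<std::size_t> idx)
        : mol_(&m), idx_(std::move(idx)) {}

    void setCharge(int c) {
        for (std::size_t i : idx_) mol_->atoms[i].charge = c;
    }

    // A mutable selection converts to a read-only one, never the reverse.
    operator ConstSelection() const { return ConstSelection(*mol_, idx_); }
private:
    Molecule* mol_;
    std::vector<std::size_t> idx_;
};
```

With this split there is no way to obtain a Selection from a const Molecule, so no const_cast is needed anywhere.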
If Bar is not overly complex, you can make it a class template.
template <typename FooType>
struct Bar
{
    FooType * p_foo;
    Bar (FooType& foo)
    : p_foo(&foo) {}
};
template <typename FooType>
Bar<FooType> makeBar(FooType& foo)
{
return Bar<FooType>(foo);
}
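As a usage sketch of the template above: makeBar deduces FooType as const Foo when handed a const object, so the pointer inside Bar is a const Foo* and only const members are reachable. The Foo below is a simplified stand-in with an assumed calls counter, and useBars is a hypothetical driver function.

```cpp
#include <type_traits>

// Simplified stand-in for the Foo in the question.
struct Foo {
    int calls = 0;
    void noSideEffect() const {}
    void withSideEffect() { ++calls; }
};

template <typename FooType>
struct Bar {
    FooType* p_foo;
    explicit Bar(FooType& foo) : p_foo(&foo) {}
};

template <typename FooType>
Bar<FooType> makeBar(FooType& foo) {
    return Bar<FooType>(foo);
}

// Hypothetical driver: returns how often withSideEffect() ran.
int useBars() {
    Foo mutableFoo;
    const Foo constFoo{};

    auto b1 = makeBar(mutableFoo);  // deduced as Bar<Foo>
    auto b2 = makeBar(constFoo);    // deduced as Bar<const Foo>
    static_assert(std::is_same<decltype(b1), Bar<Foo>>::value, "mutable Bar expected");
    static_assert(std::is_same<decltype(b2), Bar<const Foo>>::value, "const Bar expected");

    b1.p_foo->withSideEffect();
    b2.p_foo->noSideEffect();
    // b2.p_foo->withSideEffect();  // would not compile: p_foo is a const Foo*
    return mutableFoo.calls;
}
```

Note that Bar<Foo> and Bar<const Foo> are unrelated types, so comparing or combining the two would need extra conversion code.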

encapsulation difficulty in nested c++ classes

We are all familiar with the concepts of encapsulation and abstraction, but sometimes they can become an obstacle. I'm curious about the tricks or methods, or whatever you call them, to solve the problem.
Here we have a nested C++ class:
#include <iostream>
using namespace std;
class Foo {
public:
int get_foo_var()
{
return foo_var;
}
void set_foo_var(int a)
{
foo_var = a;
}
private:
int foo_var;
};
class Bar {
public:
Foo get_foo()
{
return foo;
}
private:
Foo foo;
};
int main()
{
Bar bar;
bar.get_foo().set_foo_var(2);
cout << bar.get_foo().get_foo_var() << endl;
}
As you see here, get_foo() returns a copy of foo (its value), which means it is not a reference to the original one, so changing the copy does nothing to the original.
One solution might be changing get_foo() so that it returns a reference rather than a value, but this of course conflicts with the concept of encapsulation.
What are the solutions to this problem that do not break software design principles?
UPDATE
Someone pointed out setting foo_var by a function in the Bar class:
class Bar {
public:
void set_foo_var(int a) {
foo.set_foo_var(a);
}
private:
Foo foo;
};
But I think this violates encapsulation and abstraction! The whole concept of abstraction is that if "foo" is related to "Foo" and "bar" is related to "Bar", then most foo manipulations should be done in the Foo class, and only some manipulations can be applied in other classes. What about the first situation (the situation in which the foo manipulation has nothing to do with Bar, so manipulating foo in Bar is stupid)?
Whether you want to return a copy of or a reference to something is a high-level design decision. Both ways can be required, depending on the context.
In this particular example, you could add a corresponding method in Bar to modify the Foo behind it:
class Bar {
public:
void set_foo_var(int a) {
foo.set_foo_var(a);
}
private:
Foo foo;
};
Is this good or bad? The answer is: we cannot tell you. Generally, it's hard to seriously talk about good class design with names like "Foo" and "Bar". What's good and bad depends on the actual, real usage scenario! :)
Let's look at this from a purely conceptual level for a minute. This is what your design says:
There exists one conceptual Foo entity for every Bar instance (because I can get a Foo from a Bar, and its state depends on which Bar I get it from).
Each Foo instance belongs to the Bar instance it came from (because operations on a Foo change the Bar it came from - the next time I ask for a Foo from a Bar, the previous Foo's changes are reflected).
A Foo has the same lifetime as its Bar (because I can ask for it at any time in the Bar's lifetime, I can use it as long as Bar exists, and the caller of get_foo() does not manage the lifetime of the returned Foo object).
Another way of looking at it is that Foo is already designed as part of Bar's internal state, a "conceptual member variable", regardless of whether it is actually implemented that way.
Given what your public interface is already telling you, how does returning a non-const reference to a private member really break encapsulation? Could you change the implementation so that Foo isn't a private member variable, yet still use the same public interface? Yes, you could. The only implementation changes that would force you to change the public interface ALSO force you to change the conceptual interface described above.
Implementation rules of thumb can be over-applied. Move past mechanics and look at conceptual design instead. Assuming you're OK with what your design is implying, in this case I say that returning a reference to a private member variable does NOT break encapsulation. At least that's my take on it.
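Under that reading, the reference-returning version might be sketched as follows. The const overload and the default value for foo_var are additions of mine, not part of the question's code.

```cpp
class Foo {
public:
    int get_foo_var() const { return foo_var; }
    void set_foo_var(int a) { foo_var = a; }
private:
    int foo_var = 0;
};

class Bar {
public:
    // The returned Foo belongs to this Bar and lives exactly as long as it does.
    Foo& get_foo() { return foo; }

    // A const Bar only hands out a read-only Foo.
    const Foo& get_foo() const { return foo; }
private:
    Foo foo;
};
```

With this version, bar.get_foo().set_foo_var(2) changes the Foo held by bar, which is what the main() in the question expected.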
An alternative is to have Foo and Bar less tightly coupled.
class Bar {
public:
Foo get_foo()
{
return foo;
}
void set_foo(Foo new_foo)
{
// Update foo with new_foo's values
foo = new_foo;
}
private:
Foo foo;
};
In this case, Foo reflects some part of Bar's internal state at the time it was requested, but isn't tied to the Bar it came from. You have to explicitly call set_foo() to update Bar. Without that requirement, Foo really is conceptually a member variable regardless of how you implement it.

Private members vs temporary variables in C++

Suppose you have the following code:
int main(int argc, char** argv) {
Foo f;
while (true) {
f.doSomething();
}
}
Which of the following two implementations of Foo are preferred?
Solution 1:
class Foo {
private:
void doIt(Bar& data);
public:
void doSomething() {
Bar _data;
doIt(_data);
}
};
Solution 2:
class Foo {
private:
Bar _data;
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
In plain English: if I have a class with a method that gets called very often, and this method defines a considerable amount of temporary data (either one object of a complex class, or a large number of simple objects), should I declare this data as private members of the class?
On the one hand, this would save the time spent on constructing, initializing and destructing the data on each call, improving performance. On the other hand, it tramples on the "private member = state of the object" principle, and may make the code harder to understand.
Does the answer depend on the size/complexity of class Bar? What about the number of objects declared? At what point would the benefits outweigh the drawbacks?
From a design point of view, using temporaries is cleaner if that data is not part of the object state, and should be preferred.
Never make design choices on performance grounds before actually profiling the application. You might just discover that you end up with a worse design that is actually not any better than the original design performance wise.
To all the answers that recommend to reuse objects if construction/destruction cost is high, it is important to remark that if you must reuse the object from one invocation to another, in many cases the object must be reset to a valid state between method invocations and that also has a cost. In many such cases, the cost of resetting can be comparable to construction/destruction.
If you do not reset the object state between invocations, the two solutions could yield different results: only on the first call would the argument be in its freshly initialized state, and on later calls it would carry whatever state the previous invocation left behind.
Thread safety also has a great impact on this decision. Automatic variables inside a function are created on the stack of each thread, and as such are inherently thread safe. Any optimization that hoists those local variables so that they can be reused between different invocations will complicate thread safety and could even end up with a performance penalty, because contention can worsen overall performance.
Finally, if you want to keep the object between method invocations, I would still not make it a private member of the class (it is not part of the class) but rather an implementation detail: a static function variable, a global in an unnamed namespace in the compilation unit where doOperation is implemented, or a member of a PIMPL (the first two share the data among all objects, while the latter shares it only among invocations on the same object). Users of your class do not care about how you solve things, as long as you do it safely and document that the class is not thread safe.
// foo.h
class Foo {
public:
void doOperation();
private:
void doIt( Bar& data );
};
// foo.cpp
void Foo::doOperation()
{
static Bar reusable_data;
doIt( reusable_data );
}
// else foo.cpp
namespace {
Bar reusable_global_data;
}
void Foo::doOperation()
{
doIt( reusable_global_data );
}
// pimpl foo.h
class Foo {
public:
void doOperation();
private:
class impl_t;
boost::scoped_ptr<impl_t> impl;
};
// foo.cpp
class Foo::impl_t {
private:
Bar reusable;
public:
void doIt(); // uses this->reusable instead of argument
};
void Foo::doOperation() {
impl->doIt();
}
First of all, it depends on the problem being solved. If you need to persist the values of temporary objects between calls, you need a member variable. If you need to reinitialize them on each invocation, use local temporary variables. It is a question of the task at hand, not of being right or wrong.
Constructing and destroying temporary variables will take some extra time (compared to just persisting a member variable), depending on how complex the classes of the temporary variables are and what their constructors and destructors have to do. Deciding whether the cost is significant should only be done after profiling; don't try to optimize it "just in case".
I'd declare _data as a temporary variable in most cases. The only drawback is performance, but you'll get way more benefits. You may want to try the Prototype pattern if constructing and destructing are really performance killers.
If it is semantically correct to preserve a value of Bar inside Foo, then there is nothing wrong with making it a member - it then means that every Foo has-a Bar.
There are multiple scenarios where it might not be correct, e.g.
if you have multiple threads performing doSomething, would they need all separate Bar instances, or could they accept a single one?
would it be bad if state from one computation carried over to the next computation?
Most of the time, issue 2 is the reason to create local variables: you want to be sure to start from a clean state.
Like a lot of coding answers: it depends.
Solution 1 is a lot more thread-safe. So if doSomething were being called by many threads I'd go for Solution 1.
If you're working in a single threaded environment and the cost of creating the Bar object is high, then I'd go for Solution 2.
In a single threaded env, and if the cost of creating Bar is low, then I think I'd go for Solution 1.
You have already considered "private member=state of the object" principle, so there is no point in repeating that, however, look at it in another way.
A bunch of methods, say a, b, and c take the data "d" and work on it again and again. No other methods of the class care about this data. In this case, are you sure a, b and c are in the right class?
Would it be better to create another smaller class and delegate, where d can be a member variable? Such abstractions are difficult to think of, but often lead to great code.
Just my 2 cents.
Is that an extremely simplified example? If not, what's wrong with doing it this
void doSomething(Bar data);
int main() {
while (true) {
doSomething();
}
}
way? If doSomething() is a pure algorithm that needs some data (Bar) to work with, why would you need to wrap it in a class? A class is for wrapping a state (data) and the ways (member functions) to change it.
If you just need a piece of data then use just that: a piece of data. If you just need an algorithm, then use a function. Only if you need to keep a state (data values) between invocations of several algorithms (functions) working on them, a class might be the right choice.
I admit that the borderlines between these are blurred, but IME they make a good rule of thumb.
If it's really that temporary that costs you the time, then I would say there is nothing wrong with including it in your class as a member. But note that this will possibly make your function thread-unsafe if used without proper synchronization - once again, this depends on the use of _data.
I would, however, mark such a variable as mutable. If you read a class definition with a member being mutable, you can immediately assume that it doesn't account for the value of its parent object.
class Foo {
private:
mutable Bar _data;
private:
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
This will also make it possible to use _data as a mutable entity inside a const function - just like you could use it as a mutable entity if it was a local variable inside such a function.
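Here is a small sketch of that point. The scratch buffer and the sumOfSquares computation are invented for illustration: the method is const, yet it reuses a mutable member as working storage and resets it on every call.

```cpp
#include <vector>

class Foo {
public:
    // Logically const: the observable value of Foo does not change.
    int sumOfSquares(int n) const {
        scratch_.clear();  // reset before reuse; capacity is kept
        for (int i = 1; i <= n; ++i) scratch_.push_back(i * i);
        int sum = 0;
        for (int v : scratch_) sum += v;
        return sum;
    }
private:
    // Reusable working storage, not part of Foo's value.
    mutable std::vector<int> scratch_;
};
```

As noted above, this is not thread safe without synchronization, because concurrent calls on the same Foo would share scratch_.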
If you want Bar to be initialised only once (due to cost in this case), then I'd move it to a singleton pattern.