When designing a generic class, I often face a dilemma between the following design choices:
template<class T>
class ClassWithSetter {
public:
T x() const; // getter/accessor for x
void set_x(const T& x);
...
};
// vs
template<class T>
class ClassWithProxy {
struct Proxy {
Proxy(ClassWithProxy& c /*, (more args) */);
Proxy& operator=(const T& x); // allow conversion from T
operator T() const; // allow conversion to T
// we disallow taking the address of the reference/proxy (see reasons below)
T* operator&() = delete;
T* operator&() const = delete;
// more operators to delegate to T?
private:
ClassWithProxy& c_;
};
public:
T x() const; // getter
Proxy x(); // this is a generalization of: T& x();
// no setter, since x() returns a reference through which x can be changed
...
};
Notes:
I return T instead of const T& from x() and operator T() because a reference to x might not be available from within the class if x is stored only implicitly (e.g. suppose T = std::set<int> but x_, conceptually of type T, is stored as a std::vector<int>)
suppose caching of Proxy objects and/or x is not allowed
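To make the implicit-storage note concrete, here is a minimal sketch of the proxy variant under that assumption. The class name SortedInts and the helper stored_count() are hypothetical and only serve the illustration; the value is exposed as std::set<int> but kept in a std::vector<int>, so both x() const and operator T() have to return by value.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical example: x is exposed as std::set<int>,
// but stored internally as a std::vector<int>.
class SortedInts {
    class Proxy {
    public:
        explicit Proxy(SortedInts& c) : c_(c) {}
        // Assigning a set rewrites the underlying vector.
        Proxy& operator=(const std::set<int>& x) {
            c_.v_.assign(x.begin(), x.end());
            return *this;
        }
        // Reading builds a temporary set, so it is returned by value:
        // no std::set<int> object exists that a reference could bind to.
        operator std::set<int>() const {
            return std::set<int>(c_.v_.begin(), c_.v_.end());
        }
    private:
        SortedInts& c_;
    };
public:
    std::set<int> x() const { return std::set<int>(v_.begin(), v_.end()); }  // getter
    Proxy x() { return Proxy(*this); }  // stands in for T& x()
    std::size_t stored_count() const { return v_.size(); }
private:
    std::vector<int> v_;
};
```

With this sketch, `s.x() = std::set<int>{3, 1, 2};` writes through the proxy, and `std::set<int> got = s.x();` reads back a copy.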
I am wondering what would be some scenarios in which one would prefer one approach versus the other, esp. in terms of:
extensibility / generality
efficiency
developer's effort
user's effort
?
You can assume that the compiler is smart enough to apply NRVO and fully inline all the methods.
Current personal observations:
(This part is not relevant for answering the question; it just serves as a motivation and illustrates that sometimes one approach is better than the other.)
One particular scenario in which the setter approach is problematic is as follows. Suppose you're implementing a container class with the following semantics:
MyContainer<T>& (mutable, read-write) - allows modifying both the container and its data
MyContainer<const T>& (mutable, read-only) - allows modifying the container but not its data
const MyContainer<T>& (immutable, read-write) - allows modifying the data but not the container
const MyContainer<const T>& (immutable, read-only) - allows modifying neither the container nor its data
where by "container modifications" I mean operations like adding/removing elements. If I implement this naively with a setter approach:
template<class T>
class MyContainer {
public:
void set(const T& value, size_t index) const { // allow on const MyContainer&
v_[index] = value; // ooops,
// what if the container is read-only (i.e., MyContainer<const T>)?
}
void add(const T& value); // disallow on const MyContainer&
...
private:
mutable std::vector<T> v_;
};
The problem could be mitigated by introducing a lot of boilerplate code that relies on SFINAE (e.g. by deriving from a specialized template helper which implements both versions of set()). However, a bigger problem is that this breaks the common interface, as we need to either:
ensure that calling set() on a read-only container is a compile error
provide a different semantics for the set() method for read-only containers
The Proxy-based approach, on the other hand, works neatly:
template<class T>
class MyContainer {
typedef T& Proxy;
public:
Proxy get(size_t index) const { // allow on const MyContainer&
return v_[index]; // here we don't even need a const_cast, thanks to overloading
}
...
};
and the common interface and semantics are not broken.
One difficulty I see with the proxy approach is supporting Proxy::operator&(), because there might be no stored object of type T, and hence no reference to one available (see notes above). For example, consider:
T* ptr = &x();
which cannot be supported unless x_ is actually stored somewhere (either in the class itself or accessible through a (chain of) methods called on member variables), e.g.:
template<class T>
T* ClassWithProxy<T>::Proxy::operator&() {
return &c_.get_ref_to_x();
}
Does that mean that the proxy object references are actually superior when T& is available (i.e. x_ is explicitly stored) as it allows for:
batching/delaying updates (e.g. imagine the changes are propagated from the proxy class destructor)
better control over caching
?
(In that case, the dilemma is between void set_x(const T& value) and T& x().)
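The batching idea mentioned above could look roughly like the following sketch. The names Counter, BatchProxy and commits() are hypothetical; the proxy keeps a pending value and writes it back once, from its destructor, if anything was assigned.

```cpp
#include <cassert>

// Hypothetical class that counts how often the stored value is really written.
class Counter {
public:
    class BatchProxy {
    public:
        explicit BatchProxy(Counter& c) : c_(c), pending_(c.value_) {}
        // Moving transfers the pending write; the source is disarmed
        // so that only one proxy commits.
        BatchProxy(BatchProxy&& o)
            : c_(o.c_), pending_(o.pending_), dirty_(o.dirty_) { o.dirty_ = false; }
        BatchProxy& operator=(int v) { pending_ = v; dirty_ = true; return *this; }
        operator int() const { return pending_; }
        // The batched update is propagated here, at most once per proxy.
        ~BatchProxy() {
            if (dirty_) { c_.value_ = pending_; ++c_.commits_; }
        }
    private:
        Counter& c_;
        int pending_;
        bool dirty_ = false;
    };
    BatchProxy x() { return BatchProxy(*this); }
    int x() const { return value_; }
    int commits() const { return commits_; }
private:
    int value_ = 0;
    int commits_ = 0;
};
```

Several assignments to a proxy obtained with `auto p = c.x();` then result in a single write when p goes out of scope. The flip side is that the stored value is stale while the proxy is alive, which is one of the costs a plain setter does not have.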
Edit: I fixed the typos in the constness of the setters/accessors
Like most design dilemmas, I think this depends on the situation. Overall, I would prefer the getters and setters pattern, as it is simpler to code (No need for a proxy class for every field), simpler to understand by another person (looking at your code), and more explicit in certain circumstances. However, there are situations where proxy classes can simplify user experience and hide implementation details. A few examples:
If your container is some sort of associative array, you might overload operator[] for getting and setting the value for a particular key. However, if a key hasn't been defined, you might need a special operation for adding it. Here a proxy class would probably be the most convenient solution, as it can handle = assignment in different ways as necessary. However, this can mislead users: if this particular data structure has different costs for adding vs setting, using a proxy makes this difficult to see, while separate set and put methods make clear how much time each operation takes.
What if the container does some sort of compression on T and stores the compressed form? While you could use a proxy which did the compression/decompression whenever necessary, it would hide the cost associated with de/re compression from the user, and they might use it as if it were a simple assignment without heavy computation. By creating getter/setter methods with appropriate names, it can be made more apparent that they take significant computational effort.
Getters and setters also seem more extensible. Making a getter and setter for a new field is easy, while making a proxy which forwards the operations for every property would be an error-prone annoyance. What if you later need to extend your container class? With getters and setters, just make them virtual and override them in the subclass. For proxies, you might have to make a new proxy struct in each subclass. To avoid breaking encapsulation you probably should make your proxy struct use the superclass's proxy struct to do some of the work, which could get quite confusing. With getters/setters, just call the super getter/setter.
Overall, getters and setters are easier to program, understand and change, and they can make visible the costs associated with an operation. So, in most situations, I would prefer them.
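As an illustration of the associative-array example, here is a minimal sketch of a proxy returned from operator[] that handles = differently for new and existing keys. Registry, Slot and the inserts()/updates() counters are hypothetical names used only to make the two paths observable.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

class Registry {
public:
    class Slot {
    public:
        Slot(Registry& r, std::string key) : r_(r), key_(std::move(key)) {}
        // Assignment either inserts a new key or updates an existing one.
        // The caller writes the same expression in both cases, so any
        // difference in cost between the two paths is invisible at the call site.
        Slot& operator=(int value) {
            auto it = r_.data_.find(key_);
            if (it == r_.data_.end()) {
                r_.data_.emplace(key_, value);
                ++r_.inserts_;
            } else {
                it->second = value;
                ++r_.updates_;
            }
            return *this;
        }
        // Reading a missing key throws std::out_of_range (via std::map::at).
        operator int() const { return r_.data_.at(key_); }
    private:
        Registry& r_;
        std::string key_;
    };
    Slot operator[](const std::string& key) { return Slot(*this, key); }
    int inserts() const { return inserts_; }
    int updates() const { return updates_; }
private:
    std::map<std::string, int> data_;
    int inserts_ = 0;
    int updates_ = 0;
};
```

Here `r["a"] = 1;` followed by `r["a"] = 2;` takes the insert path once and the update path once, which is convenient but is exactly the hidden-cost concern described above.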
I think your ClassWithProxy interface is mixing wrappers/proxies and containers. For containers it is common to use accessors like
T& x();
const T& x() const;
just like the standard containers do, e.g. std::vector::at(). But normally access to members by reference breaks encapsulation. For containers it's a convenience and part of the design.
But you noted that a reference to T is not always available, so this will reduce the options to your ClassWithSetter interface, which should be a wrapper for T dealing with the way you store your type (while containers are dealing with the way you store objects). I would change the naming, to make clear, it might not be as efficient as a plain get/set.
T load() const;
void save(const T&);
or something more in context. Now it should be obvious that modifying T through a proxy again breaks encapsulation.
By the way, there is no reason not to use the wrapper inside of a container.
I think that possibly part of the problem with your set implementation is that your idea of how a const MyContainer<T>& would behave is inconsistent with how standard containers behave and therefore would likely confuse future code maintainers. The normal container type for "constant container, mutable elements" is const MyContainer<T*>& where you add a level of indirection to clearly indicate your intention to users.
This is how the standard containers work, and if you utilize that mechanism you don't need the underlying container to be mutable nor the set function to be const.
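A short sketch of that mechanism with std::vector<int*>; the function name double_all is hypothetical. Through a const reference the container cannot be resized, but the pointed-to elements can still be modified.

```cpp
#include <cassert>
#include <vector>

// Doubles every pointed-to element; the container itself is only read.
inline void double_all(const std::vector<int*>& v) {
    for (int* p : v) {
        *p *= 2;  // allowed: the elements are int*, the pointees are non-const
    }
    // v.push_back(nullptr);  // would not compile: v is const
}
```

No mutable member and no const setter are needed for this behavior.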
All that said I slightly prefer the set/get approach because if a particular attribute only needs a get you don't have to write a set at all.
However I prefer not writing any direct access to members (like get/set or proxy) but instead providing a meaningfully named interface through which clients can access the class functionality. In a trivial example to show my meaning, instead of set_foo(1); set_bar(2); generate_report(); prefer a direct interface like generate_report(1, 2); and avoid directly manipulating class attributes.
Related
I have a custom container class that is templated:
template<typename T>
class MyContainer {
T Get();
void Put(T data);
};
I would like to pass a pointer to this container to a function that will access the container's data as generic data - i.e. char* or void*. Think serialization. This function is somewhat complicated so it would be nice to not specify it in the header due to the templates.
// Errors of course, no template argument
void DoSomething(MyContainer *container);
I'm ok with requiring users to provide a lambda or subclass or something that performs the conversion. But I can't seem to come up with a clean way of doing this.
I considered avoiding templates altogether by making MyContainer hold a container of some abstract MyData class that has a virtual void Serialize(void *dest) = 0; function. Users would subclass MyData to provide their types and serialization but that seems like it's getting pretty complicated. Also inefficient since it requires storing pointers to MyData to avoid object slicing and MyData is typically pretty small and the container will hold large amounts (a lot of pointer storage and dereferencing).
You don't need any char* or void* or inheritance.
Consider this simplified implementation:
template <class T>
void Serialize (std::ostream& os, const MyContainer<T>& ct) {
os << ct.Get();
}
Suddenly this works for any T that has a suitable operator<< overload.
What about user types that don't have a suitable operator<< overload? Just tell the users to provide one.
Of course you can use any overloaded function. It doesn't have to be named operator<<. You just need to communicate its name and signature to the users and ask them to overload it.
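A minimal sketch of how this could fit together, assuming a simplified single-value MyContainer whose Get() is const, and a hypothetical user type Point that supplies its own operator<<:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Simplified stand-in for the container from the question (assumption:
// it holds a single value and Get() is const).
template <typename T>
class MyContainer {
public:
    T Get() const { return data_; }
    void Put(T data) { data_ = std::move(data); }
private:
    T data_{};
};

// Hypothetical user type together with the overload the user is asked to provide.
struct Point { int x; int y; };

std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << p.x << ',' << p.y;
}

// Works for any T that has a suitable operator<< overload.
template <class T>
void Serialize(std::ostream& os, const MyContainer<T>& ct) {
    os << ct.Get();
}
```

Serializing a MyContainer<Point> into a std::ostringstream then produces the text written by the user's overload, and built-in types such as int work without extra code.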
You can introduce a non-template base class for the container with a pure virtual function that returns a pointer to raw data and implement it in your container:
class IDataHolder
{
public:
virtual ~IDataHolder(); // or you can make the destructor protected to forbid deleting through a pointer to the base class
virtual const unsigned char* GetData() const = 0;
};
template<typename T>
class MyContainer : public IDataHolder
{
public:
T Get();
void Put(T data);
const unsigned char* GetData() const override { /* cast here internal data to pointer to byte */}
};
void Serialize(IDataHolder& container)
{
const auto* data = container.GetData();
// do the serialization
}
I would like to pass a pointer to this container to a function that will access the container's data as generic data - i.e. char* or void*. Think serialization.
Can't be done in general, because you don't know anything about T. In general, types cannot be handled (e.g. copied, accessed, etc.) as raw blobs through a char * or similar.
Therefore, you would need to restrict what T can be, ideally enforcing it, otherwise never using it for Ts that would trigger undefined behavior. For instance, you may want to assert that std::is_trivially_copyable_v<T> holds. Still, you will have to consider other possible issues when handling data like that, like endianness and packing.
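One way to enforce such a restriction is sketched below. The helper names to_bytes and from_bytes are hypothetical, and the sketch deliberately ignores endianness and padding, so the bytes are only meaningful to the same program on the same platform.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Copies the object representation of value into a byte buffer.
template <typename T>
std::vector<unsigned char> to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be handled as raw bytes");
    std::vector<unsigned char> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Rebuilds a T from bytes previously produced by to_bytes<T>.
// Assumes T is default-constructible in addition to trivially copyable.
template <typename T>
T from_bytes(const std::vector<unsigned char>& bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be handled as raw bytes");
    assert(bytes.size() == sizeof(T));
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

Calling to_bytes with, say, a std::string argument fails at compile time instead of silently producing undefined behavior.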
This function is somewhat complicated so it would be nice to not specify it in the header due to the templates.
Not sure what you mean by this. Compilers can handle headers very easily, including huge amounts of template code. As long as you don't reach the levels of e.g. some Boost libraries, your compile times won't explode.
I considered avoiding templates altogether by making MyContainer hold a container of some abstract MyData class that has a virtual void Serialize(void *dest) = 0; function. Users would subclass MyData to provide their types and serialization but that seems like it's getting pretty complicated. Also inefficient since it requires storing pointers to MyData to avoid object slicing and MyData is typically pretty small and the container will hold large amounts (a lot of pointer storage and dereferencing).
In general, if you want a template, write a template. Using dynamic dispatch for this will probably kill performance, especially if you have to go through dispatches for even simple types.
As a final point, I would suggest taking a look at some available serialization libraries to see how they achieved it, not just in terms of performance, but also in terms of ease of use, integration with existing code, etc. For instance, Boost Serialization and Google Protocol Buffers.
I'm curious whether this is a proper way of doing assignment:
class Foo {
int x_;
public:
int & x() {
return x_;
}
};
My teacher does assignment like this: obj.x() = 5;
But IMO that's not the proper way of doing it: it's not obvious, and it would be better to use a setter here. Is this a violation of clear and clean code? If we follow the rule that code should read like a book, then this code is bad. Can anyone tell me if I am right? :)
IMO, this code is not good practice in terms of evolution. If you later need to add change checking or formatting, you have to refactor your class API, which can become a problem over time.
Having set_x() would be much cleaner. Moreover, it would allow you to put checking logic in your setter.
A proper getter, get_x() or x(), could also apply some logic (formatting, anything...) before returning. In your case, you should return int instead of int&, since the setter should be used for modification (no direct modification allowed).
And truly speaking, this code doesn't really make sense... it returns a reference to a member, making it fully modifiable. Why not simply have a public member then, and avoid creating an additional method?
Do you want control over your data or not? If so, then you probably want a proper getter and setter. If not, you probably don't need a method; just make the member public.
To conclude, I would say you are right: your approach holds up better over time, allows non-breaking changes, and is easier to read.
As the UNIX philosophy mentions : "Rule of Clarity: Clarity is better than cleverness."
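A small sketch of what the setter buys you. The validation rule used here (x must be non-negative) is purely an assumption for illustration; the point is that such a check can be added later without changing the callers.

```cpp
#include <cassert>
#include <stdexcept>

class Foo {
public:
    // Returns by value, so callers cannot modify x_ behind the class's back.
    int x() const { return x_; }

    // All modifications pass through here, so checks can be added later.
    void set_x(int value) {
        if (value < 0) {  // hypothetical invariant, for illustration only
            throw std::invalid_argument("x must be non-negative");
        }
        x_ = value;
    }
private:
    int x_ = 0;
};
```

With `int& x()` there is no place to put that check, because the assignment happens outside the class.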
Assuming that x() happens to be a public (or protected) member, the function effectively exposes an implementation detail: there is an int held somewhere. Whether that is good or bad depends on context, and as it stands there is very little context.
For example, if x() were actually spelled operator[](Key key) and part of a container class with subscript operator like std::vector<T> (in which case Key would really be std::size_t) or std::map<Key, Value> the use of returning a [non-const] reference is quite reasonable.
On the other hand, if the advice is to have such functions for essentially all members in a class, it is a rather bad idea as this access essentially allows uncontrolled access to the class's state. Having access functions for all members is generally an indication that there is no abstraction, too: having setters/getters for members tends to be an indication that the class is actually just an aggregate of values and a struct with all public members would likely serve the purpose as well, if not better. Actual abstractions where access to the data matters tend to expose an interface which is independent of its actual representation.
In this example, the effect of returning a (non-const) reference is the same as if you made the variable public. Any encapsulation is broken. However, that is not a bad thing by default. A case where this can help a lot is when the variable is part of a complicated structure and you want to provide an easy interface to that variable. For example
class Foo {
std::vector<std::list<std::pair<int,int>>> values;
public:
int& getFirstAt(int i){
return values[i].front().first; // std::list has no operator[], so use front()
}
};
Now you have easy access to the first member of the first pair in the list at position i and don't need to write the full expression every time.
Or your class might use some container internally, but what container it is should be a private detail, then instead of exposing the full container, you could expose references to the elements:
class Bar {
std::vector<int> values; // vector is private!!
public:
int& at(int i){ // accessing elements is public
return values.at(i);
}
};
In general, such code confuses readers.
obj.x() = 5;
However it is not rare to meet for example the following code
std::vector<int> v = { 1, 0 };
v.back() = 2;
It is a drawback of the C++ language.
In C# this drawback was avoided by introducing properties.
As for this particular example it would be better to use a getter and a setter.
For example
class Foo {
int x_;
public:
int get_value() const { return x_; }
void set_value( int value ) { x_ = value; }
};
In this case the interface can be kept while the implementation can be changed.
I have a member variable, enabled_m, whose value is dependent on a number of variables. Since these invariants should be maintained by the class, I want it to be private:
class foo_t
{
public:
void set_this(...); // may affect enabled_m
void set_that(...); // may affect enabled_m
void set_the_other_thing(...); // may affect enabled_m
bool is_enabled() const { return enabled_m; }
private:
bool enabled_m;
};
Which works, but really my intent is to require a user of foo_t to go through the class to modify enabled_m. If the user wants to just read enabled_m, that should be an allowable operation:
bool my_enabled = foo.enabled_m; // OK
foo.enabled_m = my_enabled; // Error: enabled_m is private
Is there a way to make enabled_m public for const operations and private for non-const operations, all without having to require a user go through accessor routines?
Most engineers will prefer that you use accessor methods, but if you really want a hack-around, you could do something like this:
class AccessControl
{
private:
int dontModifyMeBro;
public:
const int& rDontModifyMeBro;
AccessControl(int theInt): dontModifyMeBro(theInt), rDontModifyMeBro(dontModifyMeBro)
{}
// The default copy constructor would give a reference to the wrong variable.
// Either delete it, or provide a correct version.
AccessControl(AccessControl const & other):
dontModifyMeBro(other.rDontModifyMeBro),
rDontModifyMeBro(dontModifyMeBro)
{}
// The reference member deletes the default assignment operator.
// Either leave it deleted, or provide a correct version.
AccessControl & operator=(AccessControl const & other) {
dontModifyMeBro = other.dontModifyMeBro;
return *this; // a non-void function must return; flowing off the end is undefined behavior
}
};
No, there's no way to restrict modification only to members. private restricts all access to the name; const prevents modification everywhere.
There are some grotesque alternatives (like a const reference, or use of const_cast), but the accessor function is the simplest and most idiomatic way to do this. If it's inline, as in your example, then its use should be as efficient as direct access.
A great deal here depends upon the intent behind exposing the enabled state, but my general advice would be to avoid exposing it at all.
The usual use of your is_enabled would be something like:
if (f.is_enabled())
f.set_this(whatever);
In my opinion, it's nearly always better to just call the set_this, and (if the client cares) have it return a value to indicate whether that succeeded, so the client code becomes something like:
if (!f.set_this(whatever))
// deal with error
Although this may seem like a trivial difference, when you start to do multi-threaded programming (for one major example) the difference becomes absolutely critical. In particular, the first version, which tests the enabled state and then attempts to set the value, is subject to a race condition: the enabled state may change between the call to is_enabled and the call to set_this.
To make a long story short, this is usually a poor design. Just don't do it.
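A minimal sketch of the suggested shape, where set_this checks the enabled state and performs the update as one step and reports whether it succeeded. The members set_enabled, get_this and this_m, as well as the use of a std::mutex, are assumptions added to make the example self-contained.

```cpp
#include <cassert>
#include <mutex>

class foo_t {
public:
    // Hypothetical way to change the enabled state, for illustration only.
    void set_enabled(bool enabled) {
        std::lock_guard<std::mutex> lock(mutex_m);
        enabled_m = enabled;
    }

    // Checks and updates under one lock, so no other thread can change
    // enabled_m between the test and the assignment.
    bool set_this(int value) {
        std::lock_guard<std::mutex> lock(mutex_m);
        if (!enabled_m) {
            return false;
        }
        this_m = value;
        return true;
    }

    int get_this() const {
        std::lock_guard<std::mutex> lock(mutex_m);
        return this_m;
    }
private:
    mutable std::mutex mutex_m;
    bool enabled_m = false;
    int this_m = 0;
};
```

The client then writes `if (!f.set_this(whatever))` and handles the failure, without ever reading the enabled state separately.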
I dug up an old Grid class, which is just a simple 2-D container templated with a type. To make one you would do this:
Grid<SomeType> myGrid (QSize (width, height));
I tried to make it "Qt-ish"...for instance it does size operations in terms of QSize, and you index into it with myGrid[QPoint (x, y)]. It can take boolean masks and do operations on elements whose mask bit was set. There's also a specialization where if your elements are QColor it can generate a QImage for you.
But one major Qt idiom I adopted was that it did implicit sharing under the hood. This turned out to be very useful in the QColor-based grids for the Thinker-Qt-based program I had.
However :-/ I also happened to have some cases where I'd written the likes of:
Grid< auto_ptr<SomeType> > myAutoPtrGrid (QSize (width, height));
When I moved up from auto_ptr to C++11's unique_ptr, the compiler rightfully complained. Implicit sharing requires the ability to make an identical copy if needed...and auto_ptr had swept this bug under the rug by conflating copying with transfer-of-ownership. Non-copyable types and implicit sharing simply do not mix, and unique_ptr is kind enough to tell us.
(Note: It so happened that I hadn't noticed the problem in practice, because the use cases for the auto_ptr were passing grids by reference...never by value. Still, this was bad code...and the proactive nature of C++11 is pointing out the potential problem before it happens.)
Ok, so...how might I design a generic container that can flip implicit sharing on and off? I really did want many of the Grid features when I was using the auto_ptr and it's great if copying is disabled for non-copyable types...that catches errors! But having the implicit sharing work is nice as a default, when the type happens to be copyable.
Some ideas:
I could make separate types (NonCopyableGrid, CopyableGrid)...or (UniqueGrid, Grid) depending on your tastes...
I could pass a flag into the Grid constructor
I could use static factory methods (Grid::newNonCopyable, Grid::newCopyable) but which would call the relevant constructor under the hood...maybe more descriptive
If possible, I might "detect" copyability on the contained type, and then either leverage a QSharedDataPointer in the implementation or not, depending?
Any good reasons to pick one of these methods over the others, or have people adopted something altogether better for this kind of situation?
If you were going to do it in a single container, I think the easiest way would be to use std::is_copy_constructible to choose whether your data struct inherits from QSharedData, and to replace QSharedDataPointer with std::unique_ptr when T is not copyable (QScopedPointer doesn't support move semantics).
This is only a rough example of what I'm thinking as I don't have Qt and C++11 available together:
template<class T>
class Grid
{
struct EmptyStruct
{
};
typedef typename std::conditional<
std::is_copy_constructible<T>::value,
QSharedData,
EmptyStruct
>::type GridDataBase;
struct GridData : public GridDataBase
{
// data goes here
};
typedef typename std::conditional<
std::is_copy_constructible<T>::value,
QSharedDataPointer<GridData>,
std::unique_ptr<GridData>
>::type GridDataPointer;
public:
Grid() : data_(new GridData) {}
private:
GridDataPointer data_;
};
Disclaimer
I don't really understand your Grid template or your use cases. However I do understand containers in general. So maybe this answer applies to your Grid<T> and maybe it doesn't.
Since you've already stated the intent that Grid< unique_ptr<T> > would indicate unique ownership and a non-copyable T, what about doing something similar with copy on write?
What about explicitly stating when you want to use copy on write with:
Grid< cow_ptr<T> >
A cow_ptr<T> would offer reference counting copies, but on a "non-const dereference" would do a copy of T if the refcount is not 1. So Grid need not worry about memory management to such an extent. It would need only to handle its data buffer, and perhaps move or copy its members around in Grid's copy and/or move members.
A cow_ptr<T> is fairly easily cobbled together by wrapping std::shared_ptr<T>. Here is a partial implementation I put together about a month ago when dealing with a similar issue:
template <class T>
class cow_ptr
{
std::shared_ptr<T> ptr_;
public:
template <class ...Args,
class = typename std::enable_if
<
std::is_constructible<std::shared_ptr<T>, Args...>::value
>::type
>
explicit cow_ptr(Args&& ...args)
: ptr_(std::forward<Args>(args)...)
{}
explicit operator bool() const noexcept {return ptr_ != nullptr;}
T const* read() const noexcept {return ptr_.get();}
T * write()
{
if (ptr_.use_count() > 1)
ptr_.reset(ptr_->clone());
return ptr_.get();
}
T const& operator*() const noexcept {return *read();}
T const* operator->() const noexcept {return read();}
void reset() {ptr_.reset();}
template <class Y>
void
reset(Y* p)
{
ptr_.reset(p);
}
};
I chose to make the "write" syntax very explicit, since COW tends to be more effective when there are very few writes, but many reads/copies. To gain const access, you use it just like any other pointer:
p->inspect(); // compile time error if inspect() isn't const
But to do some modifying operation you have to call it out with the write member function:
p.write()->modify();
shared_ptr has a bunch of really handy constructors and I didn't want to have to replicate all of them in cow_ptr. So the one cow_ptr constructor you see is a poor man's implementation of inheriting constructors that also works for data members.
You may need to fill this out with other smart pointer functionality such as relational operators. You may also want to change how cow_ptr copies a T. I'm currently assuming a virtual clone() function but you could easily substitute into write the use of T's copy constructor instead.
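For reference, here is a reduced sketch of the copy-constructor variant mentioned above. It is not the full class shown earlier: the forwarding constructor is replaced by one that takes a std::shared_ptr<T> directly, and write() copies with std::make_shared instead of calling clone().

```cpp
#include <cassert>
#include <memory>
#include <utility>

template <class T>
class cow_ptr {
    std::shared_ptr<T> ptr_;
public:
    explicit cow_ptr(std::shared_ptr<T> p) : ptr_(std::move(p)) {}

    T const* read() const noexcept { return ptr_.get(); }

    // Detaches from other owners by copy-constructing T when shared.
    T* write() {
        if (ptr_ && ptr_.use_count() > 1) {
            ptr_ = std::make_shared<T>(*ptr_);  // T's copy constructor, not clone()
        }
        return ptr_.get();
    }

    T const& operator*() const noexcept { return *read(); }
    T const* operator->() const noexcept { return read(); }
};
```

Copying a cow_ptr shares the object; the first write() through one of the copies gives that copy its own T, while a sole owner writes in place. Note that copy construction slices if T is a polymorphic base, which is the case where clone() is the better fit.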
If an explicit Grid< cow_ptr<T> > doesn't really fit your needs, that's all good. I figured I'd share just in case it did.
Which approach is the better one and why?
template<typename T>
struct assistant {
T sum(const T& x, const T& y) const { ... }
};
template<typename T>
T operator+ (const T& x, const T& y) {
assistant<T> t;
return t.sum(x, y);
}
Or
template<typename T>
struct assistant {
static T sum(const T& x, const T& y) { ... }
};
template<typename T>
T operator+ (const T& x, const T& y) {
return assistant<T>::sum(x, y);
}
To explain the things a bit more: assistant has no state it only provides several utility functions and later I can define template specialization of it to achieve a different behavior for certain types T.
I think that at higher optimization levels these two approaches don't lead to different generated code, because the assistant will be optimized away anyway...
Thanks!
It is usually not a question of run-time performance, but one of readability. The former version communicates to a potential maintainer that some form of object initialization is performed. The latter makes the intent much clearer and should be (in my opinion) preferred.
By the way, what you've created is basically a traits class. Take a look at how traits are done in the standard library (they use static member functions).
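A minimal sketch of the static, traits-style variant with one specialization. The std::string specialization and its behavior are hypothetical, and a named function add is used here instead of an unconstrained operator+ template, because such an operator would also be considered for the x + y inside sum.

```cpp
#include <cassert>
#include <string>

// Primary template: the default behavior for any T with operator+.
template <typename T>
struct assistant {
    static T sum(const T& x, const T& y) { return x + y; }
};

// Hypothetical specialization: "summing" strings inserts a space.
template <>
struct assistant<std::string> {
    static std::string sum(const std::string& x, const std::string& y) {
        return x + " " + y;
    }
};

// No assistant object is created; the type alone selects the behavior.
template <typename T>
T add(const T& x, const T& y) {
    return assistant<T>::sum(x, y);
}
```

Callers never see assistant at all, and new behavior for a type is added by writing another specialization.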
Since assistant is essentially a collection of free functions, I would go with the static approach (maybe even make the constructor private). This makes clear that assistant is not intended to be instantiated. Also, and this is only a wild guess, this may result in slightly less memory consumption, since no implicit this-pointer (and no instance of the class) is needed.
I'd use the object approach - it seems a bit more standard and similar to the way you pass functors to STL algorithms - it's also easier to extend by allowing parameters passed to the constructor of the assistant to influence the results of the operations etc. There's no difference but the object approach will probably be more flexible long term and more in sync with similar solutions you'll find elsewhere.
Why is an object more flexible? One example is that you can easily implement more complex operations (like average in this example) that require you to store the temporary result "somewhere" and require analyzing results from a couple of invocations while still keeping the same "paradigm" of usage. A second might be that you want to do some optimization: say you need a temporary array to do something in those functions. Why allocate it each time, or keep it static in your class where it hangs around and wastes memory, when you can allocate it when it's really needed, re-use it for a number of elements, and release it when all operations are done and the destructor of the object is called?
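To illustrate the stateful case, a sketch of an assistant object that accumulates values across calls and computes an average. The class name averaging_assistant and its members are hypothetical, and the sketch assumes T supports +=, division, and conversion from std::size_t.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class averaging_assistant {
public:
    // Each call updates state that lives in the object between invocations.
    void add(const T& x) {
        total_ += x;
        ++count_;
    }
    // Returns T{} when nothing has been added, to avoid dividing by zero.
    T average() const {
        return count_ ? total_ / static_cast<T>(count_) : T{};
    }
private:
    T total_{};
    std::size_t count_ = 0;
};
```

A purely static assistant would need the caller to carry the running total and count around, or would have to keep them in static storage shared by all users.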
There's no advantage to using static functions - and as seen above there are at least a few advantages to using objects so the choice is rather simple imho.
Also the calling semantics can be practically identical - Assistant().sum( A, B ) instead of Assistant::sum( A, B ) - there's really little reason to NOT use an object approach :)
In the first method an assistant has to be created while the second method consists of just the function call, thus the second method is faster.
The 2nd method is preferred: no variable of the struct assistant is created, and only the required member function is called. I think it is a little bit faster in execution than the 1st method.