I'm porting some Fortran90 code to C++ (because I'm stupid, to save the "Why?!").
Fortran allows specification of ranges on arrays, in particular, starting at negative values, eg
double precision :: NameOfArray(FirstSize, -3:3)
I can write this in C++ as something like
std::array<std::array<double, 7>, FirstSize> NameOfArray;
but now I have to index like NameOfArray[0:FirstSize-1][0:6]. If I want to index using the Fortran style index, I can write perhaps
template <typename T, size_t N, int start>
class customArray
{
public:
T& operator[](const int idx) { return data_[idx+start]; }
private:
std::array<T,N> data_;
}
and then
customArray<double, 7, -3> NameOfArray;
NameOfArray[-3] = 5.2;
NameOfArray[3] = 2.5;
NameOfArray[4] = 3.14; // This is out of bounds,
// despite being a std::array of 7 elements
So - the general idea is "Don't inherit from std::'container class here'".
My understanding is that this is because, for example, std::vector does not have a virtual destructor, and so should not (can not?) be used polymorphically.
Is there some other way I can use a std::array, std::vector, etc, and get their functions 'for free', whilst overriding specific functions?
template<typename T, size_t N>
T& std::array<T,N>::operator[](const int idx) { ... };
might allow me to override the operator, but it won't give me access to knowledge about a custom start point - making it completely pointless. Additionally, if I were to optimistically think all my customArray objects would have the same offset, I could hardcode that value - but then my std::array is broken (I think).
How can I get around this? (Ignoring the simple answer - don't - just write myArray[idx-3] as needed)
There's no problem with inheriting standard containers. This is only generally discouraged because this imposes several limitations and such an inheritance is not the way how inheritance was originally predicted in C++ to be used. If you are careful and aware of these limitations, you can safely use inheritance here.
You just need to remember that this is not subclassing and what this really means. In particular, you shouldn't use pointers or references to the object of this class. The problem might be if you pass a value of MyVector<x>* where vector<x>* was expected. You should also never create such objects as dynamic (using new), and therefore also delete these objects through the pointer to the base class - simply because destructor call will not forward to your class's destructor, as it's not virtual.
There's no possibility to prevent casting of the "derived pointer" to the "base pointer", but you can prevent taking a pointer from an object by overloading the & operator. You can also prevent creating objects of this class dynamically by declaring an in-class operator new in private section (or = delete should work as well).
Don't also think about private inheritance. This is merely like containing this thing as a field in private section, except the accessor name.
A range converter class could be the solution although you would need to make it yourself, but it would allow you to get the range size to initialize the vector and to do the conversion.
Untested code:
struct RangeConv // [start,end[
{
int start, end;
RangeConv(int s, int e) : start(s), end(e) { }
int size() const { return end - start; }
int operator()(int i) { return i - start; } // possibly check whether in range
}
RangeConv r(-3, 3);
std::vector<int> v(r.size());
v[r(-3)] = 5;
so should not (can not?) be used polymorphically.
Don't give up too soon. There are basically two issues to consider with inheritance in C++.
Lifetime
Such objects, derived classes with non-virtual destructors in the base, can be used safely in a polymorphic fashion, if you basically follow one simple rule: don't use delete anywhere. This naturally means that you cannot use new. You generally should be avoiding new and raw pointers in modern C++ anyway. shared_ptr will do the right thing, i.e. safely call the correct destructor, as long as you use make_shared:
std:: shared_ptr<Base> bp = std:: make_shared<Derived>( /* constructor args */ );
The type parameter to make_shared, in this case Derived, not only controls which type is created. It also controls which destructor is called. (Because the underlying shared-pointer object will store an appropriate deleter.)
It's tempting to use unique_ptr, but unfortunately (by default) it will lead to the wrong deleter being used (i.e. it will naively use delete directly on the base pointer). It's unfortunate that, alongside the default unique_ptr, there isn't a much-safer-but-less-efficient unique_ptr_with_nice_deleter built into the standard.
Polymorphism
Even if std::array did have a virtual destructor, this current design would still be very weird. Because operator[] is not virtual, then casting from customArray* to std:: array* would lead to the wrong operator[]. This isn't really a C++-specific issue, it's basically the issue that you shouldn't pretend that customArray isa std:: array.
Instead, just decide that customArray is a separate type. This means you couldn't pass an customArray* to a function expecting std::array* - but are you sure you even want that anyway?
Is there some other way I can use a std::array, std::vector, etc, and get their functions 'for free', whilst overloading specific functions?
This is a good question. You do not want your new type to satisfy isa std::array. You just want it to behave very similar to it. As if you magically copied-and-pasted all the code from std::array to create a new type. And then you want to adjust some things.
Use private inheritance, and using clauses to bring in the code you want:
template <typename T, size_t N, int start>
struct customArray : private std::array<T,N>
{
// first, some functions to 'copy-and-paste' as-is
using std::array<T,N> :: front;
using std::array<T,N> :: begin;
// finally, the functions you wish to modify
T& operator[](const int idx) { return data_[idx+start]; }
}
The private inheritance will block conversions from customArray * to std::array *, and that's what we want.
PS: I have very little experience with private inheritance like this. So many it's not the best solution - any feedback appreciated.
General thought
The recommendation not to inherit from standard vector, is because this kind of construct is often misunderstood, and some people are tempted to make all kind of objects inherit from a vector, just for minor convenience.
But this rule should'nt become a dogma. Especially if your goal is to make a vector class, and if you know what you're doing.
Danger 1: inconsistency
If you have a very important codebase working with vectors in the range 1..size instead of 0..size-1, you could opt for keeping it according to this logic, in order not to add thousands of -1 to indexes, +1 to index displayed, and +1 for sizes.
A valid approach could be to use something like:
template <class T>
class vectorone : public vector<T> {
public:
T& operator[] (typename vector<T>::size_type n) { return vector<T>::operator[] (n-1); }
const T& operator[] (typename vector<T>::size_type n) const { return vector<T>::operator[] (n-1); }
};
But you have to remain consitent accross all the vector interface :
First, there's also a const T& operator[](). If youd don't overload it, you'll end up having wrong behaviour if you have vectors in constant objects.
Then, and it's missing above, theres's also an at() which shall be consitent with []
Then you have to take extreme care with the constructors, as there are many of them, to be sure that your arguments will not be misinterpreted.
So you have free functionality, but there's more work ahead than initially thougt. The option of creating your own object with a more limited interface, and a private vector could in the end be a safer approach.
Danger 2:more inconsistency
The vector indexes are vector<T>::size_type. Unfortunately this type is unsigned. The impact of inherit from vector, but redefine operator[] with signed integer indexes has to be carefully analysed. This can lead to subtle bugs according to the way the indexes are defined.
Conclusions:
There's perhap's more work that you think to offer a consistent std::vector interface. So in the end, having your own class using a private vector could be the safer approach.
You should also consider that your code will be maintained one day by people without fortran background, and they might have wrong assumptions about the [] in your code. Is going native c++ really out of question ?
It doesn't seem that bad to just stick with composition, and write wrappers for the member functions you need. There aren't that many. I'd even be tempted to make the array data member public so you can access it directly when needed, although some people would consider that a bigger no-no than inheriting from a base class without a virtual destructor.
template <typename T, size_t N, int start>
class customArray
{
public:
std::array<T,N> data;
T& operator[](int idx) { return data[idx+start]; }
auto begin() { return data.begin(); }
auto begin() const { return data.begin(); }
auto end() { return data.end(); }
auto end() const { return data.end(); }
auto size() const { return data.size(); }
};
int main() {
customArray<int, 7, -3> a;
a.data.fill(5); // can go through the `data` member...
for (int& i : a) // ...or the wrapper functions (begin/end).
cout << i << endl;
}
Related
Background
For an embedded project, I want a class that takes a list of structs. This list is known at compile-time, so I shouldn't have to resort to dynamic memory allocation for this.
However, how do I make a struct/class that encapsulates this array without having to use its size as a template parameter?
Templates
My first idea was to do exactly that:
struct Point {
const uint16_t a;
const double b;
};
template<size_t n>
struct Profile {
Array<Point, n> points;
Profile(const Array<Point, n> &points) : points(points) {}
};
Here, Profile is the class that stores/encapsulates the array of points (the 2-member structs). n, the size of the array, is a template parameter.
I'm using this implementation of Array, similar to std::array, btw, because I don't have access to the STL on this embedded platform.
However, no I have another class that uses this Profile that now also has to be templated because Profile is templated with the size of the array:
template<size_t n>
class Runner {
private:
const Profile<n> profile;
public:
Runner(const Profile<n> &profile) : profile(profile) {};
void foo() {
for(auto point : profile.points) {
// do something
}
}
};
As can be seen, this Runner class operates on a Profile and iterates over it. Having to template Runner is not that much of an issue by itself, but this Runner in turn is used by another class in my project, because this other class calls Runner::foo(). Now I have to template that class as well! And classes that use that class, etc.
That's getting out of hand! What started with just one template parameter to specify the size, now propagates through my entire application. Therefore, I don't think this is a good solution.
Question
Is there a way to 'hide' the size of the array in Profile or Runner? Runner only needs to iterate over it, so the size should in principle only affect its implementation, not its public interface. How would I do that, though?
Also, can I avoid having to manually specify n at all, and just pass an array to Profile's constructor and let the compiler figure out how big it is? At compile-time, of course. I feel like this should be possible (given this array is known at compile-time), but I don't know how exactly.
Other approaches
Macros
I could write a macro like
#define n 12
and include that in both the Profile.h and the place where I instantiate a Profile. This feels dirty though, I and would like to avoid macros.
Vector
I could avoid this fuss by just using a std::vector (or equivalent) instead, but that is allocated at run-time on the heap, which I would like to avoid here since it shouldn't be necessary.
Is there a way to 'hide' the size of the array in Profile or Runner?
Yes. The solution is indirection. Instead of storing the object directly, you can point to it. You don't need to know the size of what you're pointing at.
A convenient solution is to point into dynamic storage (for example std::vector) because it allows you to "bind" the lifetime of the dynamically sized object to a member. That's not necessary in general, and you can use automatic storage instead. However, in that case you cannot bind the lifetime of the pointed object, and you must be very careful to not let the pointed object be destroyed before you stop using it.
The indirection can be done at whatever level you prefer. If you do it at the lowest level, you simply store the array outside of Profile. In fact, if all that profile does is contain an array, then you don't need a class for it. Use a generic span:
struct Runner {
span<const Point> profile;
void foo() {
for(auto point : profile) {
// do something
}
}
};
Point points[] {
// ... number of points doesn't matter
};
Runner runner {
.profile = points,
};
By span, I mean something like std::span. If you cannot use the standard library, then use another implementation. It's basically just a pointer and size, with convenient template constructor.
To clarify, you can pick any two, but you cannot have all three of these:
Lifetime of the array bound to the class (safe)
No compiletime constant size
No dynamic storage
1,2 (no 3) = std::vector, RAII
1,3 (no 2) = std::array, templates, no indirection
2,3 (no 1) = std::span, be careful with lifetimes
I'll expand on this comment:
The idea is that Runner takes Profiles no matter their size. Runner needs to iterate over it, but apart from that, its behaviour is always the same. The class using Runner and calling Runner::foo() doesn't need to know the size. The problem with templating Runner is that the class using Runner also needs to be templated, and the classes using that, etc.
This is only a problem when the class is using the templated Runner directly. It has more dependencies than it actually needs. If it doesn't need to know about the size of the array, then it should not know about the size of the array. If runtime polymorphism is an option you can add a base class that allows accessing the array elements, but doesn't need to know anything about the arrays size. The following is only a sketch:
#include <iostream>
struct RunnerInterface {
virtual int* begin() = 0;
virtual int* end() = 0;
virtual ~RunnerInterface(){}
};
template <unsigned size>
struct Runner : RunnerInterface {
int data[size];
int* begin() override { return data; }
int* end() override { return data+size; } // pointer one past the end if fine (it won't get dereferenced)
};
void foo(RunnerInterface& ri) {
for (auto it = ri.begin(); it != ri.end(); ++it){
*it = 42;
}
}
void bar(RunnerInterface& ri){
for (auto it = ri.begin(); it != ri.end(); ++it){
std::cout << *it;
}
}
int main() {
Runner<42> r;
foo(r);
bar(r);
}
Now if a class needs a Runner member, they store a std::unique_ptr<RunnerInterface> and only on construction you need to decide for the size of the array (though you still need to decide for the size somewhere).
I want to create a container that provides all of the same functionality as a std::vector but with the caveat that you cannot add anymore elements once the vector reaches a specified size.
My first thought was to inherit from std::vector and do something like
template <typename T, unsigned int max>
class MyVector : public std::vector<T> {
public:
// override push_back, insert, fill constructor, etc...
// For example:
virtual void push_back(const T& var) override {
if (this.size() < max) vec.push_back(var);
}
};
But then I learned that inheriting STL containers is bad.
Second thought was to create a wrapper class, i.e.
template <typename T, unsigned int max>
class MyVector {
private:
std::vector<T> vec;
public:
// Wrappers for every single std::vector function...
iterator begin() { return vec.begin(); }
const_iterator begin() const { return vec.begin(); }
//and on and on and on...etc.
};
But this smells bad to me. Seems very difficult to maintain.
Am I missing something obvious, should I try an alternative approach?
How can I create a vector with a maximum length?
You cannot do that with std::vector. You can however refrain from inserting any elements after you reach some limit. For example:
if (v.size() < 10) {
// OK, we can insert
}
I want to create a container that provides all of the same functionality as a std::vector but with the caveat that you cannot add anymore elements once the vector reaches a specified size.
That, you can do.
virtual void push_back(const T& var) override {
std::vector::push_back isn't virtual, so you may not override it.
Also, I recommend considering whether the push back should be silently ignored in the case max is reached.
But then I learned that inheriting STL containers is bad.
To be more specific, publicly inheriting them is a bit precarious.
Using private inheritance, you can use using to pick the members that you wish to delegate directly, and implement the differing members manually.
Seems very difficult to maintain.
The standard library doesn't change very often, and the vector class hasn't changed much so far, and changes have been quite simple, so "very difficult" is debatable - although subjective of course.
You could use meta-programming to make it maintainable. But that may be bit of an over-engineered approach.
I understand the mechanics of static polymorphism using the Curiously Recurring Template Pattern. I just do not understand what is it good for.
The declared motivation is:
We sacrifice some flexibility of dynamic polymorphism for speed.
But why bother with something so complicated like:
template <class Derived>
class Base
{
public:
void interface()
{
// ...
static_cast<Derived*>(this)->implementation();
// ...
}
};
class Derived : Base<Derived>
{
private:
void implementation();
};
When you can just do:
class Base
{
public:
void interface();
}
class Derived : public Base
{
public:
void interface();
}
My best guess is that there is no semantic difference in the code and that it is just a matter of good C++ style.
Herb Sutter wrote in Exceptional C++ style: Chapter 18 that:
Prefer to make virtual functions private.
Accompanied of course with a thorough explanation why this is good style.
In the context of this guideline the first example is good, because:
The void implementation() function in the example can pretend to be virtual, since it is here to perform customization of the class. It therefore should be private.
And the second example is bad, since:
We should not meddle with the public interface to perform customization.
My question is:
What am I missing about static polymorphism? Is it all about good C++ style?
When should it be used? What are some guidelines?
What am I missing about static polymorphism? Is it all about good C++ style?
Static polymorphism and runtime polymorphism are different things and accomplish different goals. They are both technically polymorphism, in that they decide which piece of code to execute based on the type of something. Runtime polymorphism defers binding the type of something (and thus the code that runs) until runtime, while static polymorphism is completely resolved at compile time.
This results in pros and cons for each. For instance, static polymorphism can check assumptions at compile time, or select among options which would not compile otherwise. It also provides tons of information to the compiler and optimizer, which can inline knowing fully the target of calls and other information. But static polymorphism requires that implementations be available for the compiler to inspect in each translation unit, can result in binary code size bloat (templates are fancy pants copy paste), and don't allow these determinations to occur at runtime.
For instance, consider something like std::advance:
template<typename Iterator>
void advance(Iterator& it, ptrdiff_t offset)
{
// If it is a random access iterator:
// it += offset;
// If it is a bidirectional iterator:
// for (; offset < 0; ++offset) --it;
// for (; offset > 0; --offset) ++it;
// Otherwise:
// for (; offset > 0; --offset) ++it;
}
There's no way to get this to compile using runtime polymorphism. You have to make the decision at compile time. (Typically you would do this with tag dispatch e.g.)
template<typename Iterator>
void advance_impl(Iterator& it, ptrdiff_t offset, random_access_iterator_tag)
{
// Won't compile for bidirectional iterators!
it += offset;
}
template<typename Iterator>
void advance_impl(Iterator& it, ptrdiff_t offset, bidirectional_iterator_tag)
{
// Works for random access, but slow
for (; offset < 0; ++offset) --it; // Won't compile for forward iterators
for (; offset > 0; --offset) ++it;
}
template<typename Iterator>
void advance_impl(Iterator& it, ptrdiff_t offset, forward_iterator_tag)
{
// Doesn't allow negative indices! But works for forward iterators...
for (; offset > 0; --offset) ++it;
}
template<typename Iterator>
void advance(Iterator& it, ptrdiff_t offset)
{
// Use overloading to select the right one!
advance_impl(it, offset, typename iterator_traits<Iterator>::iterator_category());
}
Similarly, there are cases where you really don't know the type at compile time. Consider:
void DoAndLog(std::ostream& out, int parameter)
{
out << "Logging!";
}
Here, DoAndLog doesn't know anything about the actual ostream implementation it gets -- and it may be impossible to statically determine what type will be passed in. Sure, this can be turned into a template:
template<typename StreamT>
void DoAndLog(StreamT& out, int parameter)
{
out << "Logging!";
}
But this forces DoAndLog to be implemented in a header file, which may be impractical. It also requires that all possible implementations of StreamT are visible at compile time, which may not be true -- runtime polymorphism can work (although this is not recommended) across DLL or SO boundaries.
When should it be used? What are some guidelines?
This is like someone coming to you and saying "when I'm writing a sentence, should I use compound sentences or simple sentences"? Or perhaps a painter saying "should I always use red paint or blue paint?" There is no right answer, and there is no set of rules that can be blindly followed here. You have to look at the pros and cons of each approach, and decide which best maps to your particular problem domain.
As for the CRTP, most use cases for that are to allow the base class to provide something in terms of the derived class; e.g. Boost's iterator_facade. The base class needs to have things like DerivedClass operator++() { /* Increment and return *this */ } inside -- specified in terms of derived in the member function signatures.
It can be used for polymorphic purposes, but I haven't seen too many of those.
The link you provide mentions boost iterators as an example of static polymorphism. STL iterators also exhibit this pattern. Lets take a look at an example and consider why the authors of those types decided this pattern was appropriate:
#include <vector>
#include <iostream>
using namespace std;
void print_ints( vector<int> const& some_ints )
{
for( vector<int>::const_iterator i = some_ints.begin(), end = some_ints.end(); i != end; ++i )
{
cout << *i;
}
}
Now, how would we implement int vector<int>::const_iterator::operator*() const; Can we use polymprhism for this? Well, no. What would the signature of our virtual function be? void const* operator*() const? That's useless! The type has been erased (degraded from int to void*). Instead, the curiously recurring template pattern steps in to help us generate the iterator type. Here is a rough approximation of the iterator class we would need to implement the above:
template<typename T>
class const_iterator_base
{
public:
const_iterator_base():{}
T::contained_type const& operator*() const { return Ptr(); }
T::contained_type const& operator->() const { return Ptr(); }
// increment, decrement, etc, can be implemented and forwarded to T
// ....
private:
T::contained_type const* Ptr() const { return static_cast<T>(this)->Ptr(); }
};
Traditional dynamic polymorphism could not provide the above implementation!
A related and important term is parametric polymorphism. This allows you to implement similar APIs in, say, python that you can using the curiously recurring template pattern in C++. Hope this is helpful!
I think it's worth taking a stab at the source of all this complexity, and why languages like Java and C# mostly try to avoid it: type erasure! In c++ there is no useful all containing Object type with useful information. Instead we have void* and once you have void* you truely have nothing! If you have an interface that decays to void* the only way to recover is by making dangerous assumptions or keeping extra type information around.
While there may be cases where static polymorphism is useful (the other answers have listed a few), I would generally see it as a bad thing. Why? Because you cannot actually use a pointer to the base class anymore, you always have to provide a template argument providing the exact derived type. And in that case, you could just as well use the derived type directly. And, to put it bluntly, static polymorphism is not what object orientation is about.
The runtime difference between static and dynamic polymorphism is exactly two pointer dereferenciations (iff the compiler really inlines the dispatch method in the base class, if it doesn't for some reason, static polymorphism is slower). That's not really expensive, especially since the second lookup should virtually always hit the cache. All in all, those lookups are usually cheaper than the function call itself, and are certainly worth it to get the real flexibility provided by dynamic polymorphism.
I dug up an old Grid class, which is just a simple 2-D container templated with a type. To make one you would do this:
Grid<SomeType> myGrid (QSize (width, height));
I tried to make it "Qt-ish"...for instance it does size operations in terms of QSize, and you index into it with myGrid[QPoint (x, y)]. It can take boolean masks and do operations on elements whose mask bit was set. There's also a specialization where if your elements are QColor it can generate a QImage for you.
But one major Qt idiom I adopted was that it did implicit sharing under the hood. This turned out to be very useful in the QColor-based grids for the Thinker-Qt-based program I had.
However :-/ I also happened to have some cases where I'd written the likes of:
Grid< auto_ptr<SomeType> > myAutoPtrGrid (QSize (width, height));
When I moved up from auto_ptr to C++11's unique_ptr, the compiler rightfully complained. Implicit sharing requires the ability to make an identical copy if needed...and auto_ptr had swept this bug under the rug by conflating copying with transfer-of-ownership. Non-copyable types and implicit sharing simply do not mix, and unique_ptr is kind enough to tell us.
(Note: It so happened that I hadn't noticed the problem in practice, because the use cases for the auto_ptr were passing grids by reference...never by value. Still, this was bad code...and the proactive nature of C++11 is pointing out the potential problem before it happens.)
Ok, so...how might I design a generic container that can flip implicit sharing on and off? I really did want many of the Grid features when I was using the auto_ptr and it's great if copying is disabled for non-copyable types...that catches errors! But having the implicit sharing work is nice as a default, when the type happens to be copyable.
Some ideas:
I could make separate types (NonCopyableGrid, CopyableGrid)...or (UniqueGrid, Grid) depending on your tastes...
I could pass a flag into the Grid constructor
I could use static factory methods (Grid::newNonCopyable, Grid::newCopyable) but which would call the relevant constructor under the hood...maybe more descriptive
If possible, I might "detect" copyability on the contained type, and then either leverage a QSharedDataPointer in the implementation or not, depending?
Any good reasons to pick one of these methods over the others, or have people adopted something altogether better for this kind of situation?
If you were going to do it in a single container, I think the easiest way would be to use std::is_copy_constructable to choose whether your data struct inherited from QSharedData, and to replace QSharedDataPointer with std::unique_ptr (QScopedPointer doesn't support move semantics)
This is only a rough example of what I'm thinking as I don't have Qt and C++11 available together:
template<class T>
class Grid
{
struct EmptyStruct
{
};
typedef typename std::conditional<
std::is_copy_constructible<T>::value,
QSharedData,
EmptyStruct
>::type GridDataBase;
struct GridData : public GridDataBase
{
// data goes here
};
typedef typename std::conditional<
std::is_copy_constructible<T>::value,
QSharedDataPointer<GridData>,
std::unique_ptr<GridData>
>::type GridDataPointer;
public:
Grid() : data_(new GridData) {}
private:
GridDataPointer data_;
};
Disclaimer
I don't really understand your Grid template or your use cases. However I do understand containers in general. So maybe this answer applies to your Grid<T> and maybe it doesn't.
Since you've already stated the intent that Grid< unique_ptr<T> > would indicate unique ownership and a non-copyable T, what about doing something similar with copy on write?
What about explicitly stating when you want to use copy on write with:
Grid< cow_ptr<T> >
A cow_ptr<T> would offer reference counting copies, but on a "non-const dereference" would do a copy of T if the refcount is not 1. So Grid need not worry about memory management to such an extent. It would need only to handle its data buffer, and perhaps move or copy its members around in Grid's copy and/or move members.
A cow_ptr<T> is fairly easily cobbled together by wrapping std::shared_ptr<T>. Here is a partial implementation I put together about a month ago when dealing with a similar issue:
template <class T>
class cow_ptr
{
std::shared_ptr<T> ptr_;
public:
template <class ...Args,
class = typename std::enable_if
<
std::is_constructible<std::shared_ptr<T>, Args...>::value
>::type
>
explicit cow_ptr(Args&& ...args)
: ptr_(std::forward<Args>(args)...)
{}
explicit operator bool() const noexcept {return ptr_ != nullptr;}
T const* read() const noexcept {return ptr_.get();}
T * write()
{
if (ptr_.use_count() > 1)
ptr_.reset(ptr_->clone());
return ptr_.get();
}
T const& operator*() const noexcept {return *read();}
T const* operator->() const noexcept {return read();}
void reset() {ptr_.reset();}
template <class Y>
void
reset(Y* p)
{
ptr_.reset(p);
}
};
I chose to make the "write" syntax very explicit, since COW tends to be more effective when there are very few writes, but many reads/copies. To gain const access, you use it just like any other pointer:
p->inspect(); // compile time error if inspect() isn't const
But to do some modifying operation you have to call it out with the write member function:
p.write()->modify();
shared_ptr has a bunch of really handy constructors and I didn't want to have to replicate all of them in cow_ptr. So the one cow_ptr constructor you see is a poor man's implementation of inheriting constructors that also works for data members.
You may need to fill this out with other smart pointer functionality such as relational operators. You may also want to change how cow_ptr copies a T. I'm currently assuming a virtual clone() function but you could easily substitute into write the use of T's copy constructor instead.
If an explicit Grid< cow_ptr<T> > doesn't really fit your needs, that's all good. I figured I'd share just in case it did.
Say I need a new type in my application, that consists of a std::vector<int> extended by a single function. The straightforward way would be composition (due to limitations in inheritance of STL containers):
class A {
public:
A(std::vector<int> & vec) : vec_(vec) {}
int hash();
private:
std::vector<int> vec_
}
This requires the user to first construct a vector<int> and a copy in the constructor, which is bad when we are going to handle a sizeable number of large vectors. One could, of course, write a pass-through to push_back(), but this introduces mutable state, which I would like to avoid.
So it seems to me, that we can either avoid copies or keep A immutable, is this correct?
If so, the simplest (and efficiency-wise equivalent) way would be to use a typedef and free functions at namespace scope:
namespace N {
typedef std::vector<int> A;
int a_hash(const A & a);
}
This just feels wrong somehow, since extensions in the future will "pollute" the namespace. Also, calling a_hash(...) on any vector<int> is possible, which might lead to unexpected results (assuming that we impose constraints on A the user has to follow or that would otherwise be enforced in the first example)
My two questions are:
how can one not sacrifice both immutability and efficiency when using the above class code?
when does it make sense to use free functions as opposed to encapsulation in classes/structs?
Thank you!
Hashing is an algorithm not a type, and probably shouldn't be restricted to data in any particular container type either. If you want to provide hashing, it probably makes the most sense to create a functor that computes a hash one element (int, as you've written things above) at a time, then use std::accumulate or std::for_each to apply that to a collection:
namespace whatever {
struct hasher {
int current_hash;
public:
hasher() : current_hash(0x1234) {}
// incredibly simplistic hash: just XOR the values together.
operator()(int new_val) { current_hash ^= new_val; }
operator int() { return current_hash; }
};
}
int hash = std::for_each(coll.begin(), coll.end(), whatever::hasher());
Note that this allows coll to be a vector, or a deque or you can use a pair of istream_iterators to hash data in a file...
Ad immutable: You could use the range constructor of vector and create an input iterator to provide the content for the vector. The range constructor is just:
template <typename I>
A::A(I const &begin, I const &end) : vec_(begin, end) {}
The generator is a bit more tricky. If you now have a loop that constructs a vector using push_back, it takes quite a bit of rewriting to convert to object that returns one item at a time from a method. Than you need to wrap a reference to it in a valid input iterator.
Ad free functions: Due to overloading, polluting the namespace is usually not a problem, because the symbol will only be considered for a call with the specific argument type.
Also free functions use the argument-dependent lookup. That means the function should be placed in the namespace the class is in. Like:
#include <vector>
namespace std {
int hash(vector<int> const &vec) { /*...*/ }
}
//...
std::vector<int> v;
//...
hash(v);
Now you can still call hash unqualified, but don't see it for any other purpose unless you do using namespace std (I personally almost never do that and either just use the std:: prefix or do using std::vector to get just the symbol I want). Unfortunately I am not sure how the namespace-dependent lookup works with typedef in another namespace.
In many template algorithms, free functions—and with fairly generic names—are often used instead of methods, because they can be added to existing classes, can be defined for primitive types or both.
One simple solution is to declare the private member variable as reference & initialize in constructor. This approach introduces some limitation, but it's a good alternative in most cases.
class A {
public:
A(std::vector<int> & vec) : vec_(vec) {}
int hash();
private:
std::vector<int> &vec_; // 'vec_' now a reference, so will be same scoped as 'vec'
};