Is There a Reason Standard Algorithms Take Lambdas by Value? [duplicate] - c++

This question already has answers here:
Why the sequence-operation algorithms predicates are passed by copy?
(3 answers)
Closed 6 years ago.
So I asked a question here: Lambda Works on Latest Visual Studio, but Doesn't Work Elsewhere, and the response was that my code was implementation-defined, since the standard's 25.1 [algorithms.general] 10 says:
Unless otherwise specified, algorithms that take function objects as arguments are permitted to copy
those function objects freely. Programmers for whom object identity is important should consider using a
wrapper class that points to a noncopied implementation object such as reference_wrapper<T>
I'd just like to know the reason for this. We're told our whole lives to take objects by reference, so why does the standard take function objects by value and, even worse in my linked question, make copies of those objects? Is there some advantage to doing it this way that I don't understand?

The standard library assumes that function objects and iterators are cheap to copy.
std::ref provides a method to turn a function object into a pseudo-reference with a compatible operator() that uses reference instead of value semantics. So nothing of large value is lost.
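As a rough illustration of the std::ref point, here is a minimal sketch. CountingLess and sort_counting_demo are names invented for this example; the only library facilities assumed are std::sort and std::ref.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical comparator that counts how often it is invoked.
struct CountingLess {
    std::size_t calls = 0;
    bool operator()(int a, int b) {
        ++calls;
        return a < b;
    }
};

inline void sort_counting_demo() {
    std::vector<int> a{3, 1, 2};
    std::vector<int> b = a;

    CountingLess by_value;
    std::sort(a.begin(), a.end(), by_value);          // std::sort works on its own copies
    assert(by_value.calls == 0);                      // the caller's object is untouched

    CountingLess shared;
    std::sort(b.begin(), b.end(), std::ref(shared));  // every copy of the wrapper refers to `shared`
    assert(shared.calls > 0);                         // the comparisons were counted here
}
```

The wrapper itself is still copied freely by the algorithm; only the object it points to keeps its identity.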
If you have been taught all your life to take objects by reference, reconsider. Unless there is a good reason otherwise, take objects by value. Reasoning about values is far easier; references are pointers into any state anywhere in your program.
The conventional use of a reference, as a pointer to a local object that no other active reference refers to in the context where it is used, is not something that either a reader of your code or the compiler can presume. If you reason about references this way, they don't add a ridiculous amount of complexity to your code.
But if you reason about them that way, you are going to have bugs when your assumption is violated, and they will be subtle, gross, unexpected, and horrible.
A classic example is the number of operator= that break when this and the argument refer to the same object. But any function that takes two references or pointers of the same type has the same issue.
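To make the aliasing problem concrete, here is a small sketch. The helpers add_twice_ref and add_twice_val are made up for illustration and are not from any library.

```cpp
#include <cassert>

// Hypothetical helper: adds `b` to `a` twice, reading `b` through a reference.
inline void add_twice_ref(int& a, const int& b) {
    a += b;  // if a and b are the same object, this also changes b
    a += b;
}

// Same operation, but `b` is a private copy that nothing else can modify.
inline void add_twice_val(int& a, int b) {
    a += b;
    a += b;
}

inline void aliasing_demo() {
    int x = 1;
    add_twice_ref(x, x);  // 1 + 1 = 2, then 2 + 2 = 4
    assert(x == 4);

    int y = 1;
    add_twice_val(y, y);  // 1 + 1 = 2, then 2 + 1 = 3
    assert(y == 3);
}
```

Neither function is wrong on its own; the by-reference version simply has a second meaning that only shows up when the arguments alias.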
But even one reference can break your code. Let's look at sort. In pseudo-code:
void sort( Iterator start, Iterator end, Ordering order )
Now, let's make Ordering a reference:
void sort( Iterator start, Iterator end, Ordering const& order )
How about this one?
std::function< bool(int, int) > alice;
std::function< bool(int, int) > bob;
alice = [&]( int x, int y ) { std::swap(alice, bob); return x<y; };
bob = [&]( int x, int y ) { std::swap(alice, bob); return x>y; };
Now, call sort( begin(vector), end(vector), alice ).
Every time order is called, the referred-to order object swaps meaning. Now this is pretty ridiculous, but when you took Ordering by const&, the optimizer had to take that possibility into account and rule it out on every invocation of your ordering code!
You wouldn't do the above (and in fact this particular implementation is UB, as it would violate any reasonable requirements on std::sort); but the compiler has to prove you didn't do something "like that" (change the code in ordering) every time it follows order or invokes it! Which means constantly reloading the state of order, or inlining and proving you did no such insanity.
Doing this when taking by-value is an order of magnitude harder (and basically requires something like std::ref). The optimizer has a function object, it is local, and its state is local. Anything stored within it is local, and the compiler and optimizer know who exactly can modify it legally.
Every function you write taking a const& that ever leaves its "local scope" (say, by calling a C library function) cannot assume the state behind the const& remained the same when control returns. It must reload the data from wherever the pointer points.
Now, I did say pass by value unless there is a good reason. And there are many good reasons: your type is very expensive to move or copy, for example. You are writing data to it. You actually want it to change as you read it each time. Etc.
But the default behavior should be pass-by-value. Only move to references if you have a good reason, because the costs are distributed and hard to pin down.

I'm not sure I have an answer for you, but if I have got my object lifetimes correct I think this is portable, safe and adds zero overhead or complexity:
#include <algorithm>
#include <functional>
#include <iostream>
#include <iterator>
#include <utility>
#include <vector>
// @pre f must be an r-value reference - i.e. a temporary
template<class F>
auto resist_copies(F &&f) {
    // Copies of the wrapper all refer to the one temporary function object.
    return std::reference_wrapper<F>(f);
}
void removeIntervals(std::vector<double> &values, const std::vector<std::pair<int, int>> &intervals) {
    values.resize(distance(
        begin(values),
        std::remove_if(begin(values), end(values),
            resist_copies([i = 0, it = cbegin(intervals), end = cend(intervals)](const auto&) mutable
            {
                return it != end && ++i > it->first && (i <= it->second || (++it, true));
            }))));
}
int main() {
    // Intervals of indices I have to remove from values
    std::vector<std::pair<int, int>> intervals = {{1, 3},
                                                  {7, 9},
                                                  {13, 13}};
    // Vector of arbitrary values.
    std::vector<double> values = {4.2, 6.4, 2.3, 3.4, 9.1, 2.3, 0.6, 1.2, 0.3, 0.4, 6.4, 3.6, 1.4, 2.5, 7.5};
    removeIntervals(values, intervals);
    // values should now contain 4.2,9.1,2.3,0.6,6.4,3.6,1.4,7.5
    std::copy(values.begin(), values.end(), std::ostream_iterator<double>(std::cout, ", "));
    std::cout << '\n';
}

How to check whether elements of a range should be moved?

There's a similar question: check if elements of a range can be moved?
I don't think the answer in it is a nice solution. Actually, it requires partial specialization for all containers.
I made an attempt, but I'm not sure whether checking operator*() is enough.
// RangeType
using IteratorType = std::ranges::iterator_t<RangeType>;
using Type = decltype(*(std::declval<IteratorType>()));
constexpr bool canMove = std::is_rvalue_reference_v<Type>;
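For what it's worth, here is a minimal C++17 sketch of what such a check reports for a few iterator types. The trait name yields_rvalue_ref is invented for this example; it only looks at the result of operator*, which is exactly the limitation being asked about.

```cpp
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true when dereferencing It yields an rvalue reference.
template <class It>
inline constexpr bool yields_rvalue_ref =
    std::is_rvalue_reference_v<decltype(*std::declval<It&>())>;

using StrIt = std::vector<std::string>::iterator;

// A plain container iterator yields std::string&, an lvalue.
static_assert(!yields_rvalue_ref<StrIt>, "plain iterator yields an lvalue");

// std::move_iterator turns that into std::string&&.
static_assert(yields_rvalue_ref<std::move_iterator<StrIt>>,
              "move_iterator yields an rvalue reference");

// std::vector<bool>::iterator yields a proxy object by value (a prvalue),
// which an is_rvalue_reference check does not classify as movable.
static_assert(!yields_rvalue_ref<std::vector<bool>::iterator>,
              "a prvalue proxy is not an rvalue reference");
```

So the check does recognise move iterators, but it says nothing about whether the range owns its elements, and it misses iterators that return prvalues.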
Update
The question could be split into 2 parts:
Could algorithms in STL like std::copy/std::uninitialized_copy actually avoid unnecessary deep copy when receiving elements of r-value?
When receiving a range of r-value, how to check if it's a range adapter like std::ranges::subrange, or a container which holds the ownership of its elements like std::vector?
template <typename InRange, typename OutRange>
void func(InRange&& inRange, OutRange&& outRange) {
using std::begin;
using std::end;
std::copy(begin(inRange), end(inRange), begin(outRange));
// Q1: if `*begin(inRange)` returns a r-value,
// would move-assignment of element be called instead of a deep copy?
}
std::vector<int> vi;
std::list<int> li;
/* ... */
func(std::move(vi), li);
// Q2: Would the elements be shallow-copied (moved) from vi?
// And if not, how could I implement just a limited number of overloads, without an overload for every container?
// (define a concept (C++20) to describe those that take ownership of their elements)
Q1 is not a problem, as @Nicol Bolas, @eerorika and @Davis Herring pointed out, and it's not what I was puzzled about.
(But I do think the API is confusing; std::assign/std::uninitialized_construct might be better names.)
@alfC has given a great answer to my question (Q2), with a fresh perspective (a move idiom for ranges that own their elements).
To sum up, for most current containers (especially those from the STL), and also every range adapter, a partial specialization or an overloaded function for each of them is the only solution, e.g.:
template <typename Range>
void func(Range&& range) { /*...*/ }
template <typename T>
void func(std::vector<T>&& movableRange) {
auto movedRange = std::ranges::subrange{
std::make_move_iterator(movableRange.begin()),
std::make_move_iterator(movableRange.end())
};
func(movedRange);
}
// and also for `std::list`, `std::array`, etc...
I understand your point.
I do think that this is a real problem.
My answer is that the community has to agree on exactly what it means to move nested objects (such as containers).
In any case this needs the cooperation of the container implementors.
And, in the case of standard containers, good specifications.
I am pessimistic that standard containers can be changed to "generalize" the meaning of "move", but that can't prevent new user defined containers from taking advantage of move-idioms.
The problem is that nobody has studied this in depth as far as I know.
As it is now, std::move seems to imply a "shallow" move (one level of moving of the top "value type").
In the sense that you can move the whole thing but not necessarily individual parts.
This, in turn, makes it useless to try to "std::move" non-owning ranges or ranges that offer pointer/iterator stability.
Some libraries, e.g. those related to std::ranges, simply reject r-values of reference ranges, which I think is only kicking the can down the road.
Suppose you have a container Bag.
What should std::move(bag)[0] and std::move(bag).begin() return? It is really up to the implementation of the container to decide what to return.
It is hard to generalize over all data structures, but if the data structure is simple (e.g. a dynamic array), then for consistency with structs (std::move(s).field), std::move(bag)[0] should be the same as std::move(bag[0]). However, the standard already strongly disagrees with me here: https://en.cppreference.com/w/cpp/container/vector/operator_at
And it is possible that it is too late to change.
The same goes for std::move(bag).begin(), which, by my logic, should return a move_iterator (or something like that).
To make things worse, std::array<T, N> works how I would expect (std::move(arr[0]) is equivalent to std::move(arr)[0]).
However, std::move(arr).begin() is a simple pointer, so it loses the "forwarding/move" information! It is a mess.
So, yes, to answer your question, you can check whether using Type = decltype(*std::forward<Bag>(bag).begin()); is an r-value, but more often than not it will not be implemented as an r-value.
That is, you have to hope for the best and trust that .begin and * are implemented in a very specific way.
You are in better shape inspecting (somehow) the category of the range itself.
That is, currently you are left to your own devices: if you know that bag is bound to an r-value and the type is conceptually an "owning" value, you currently have to do the dance of using std::make_move_iterator.
I am currently experimenting a lot with custom containers that I have. https://gitlab.com/correaa/boost-multi
However, by trying to allow for this, I break behavior expected for standard containers regarding move.
Also, once you are in the realm of non-owning ranges, you have to make iterators movable "by hand".
I have found it empirically useful to distinguish a top-level move (std::move) from an element-wise move (e.g. bag.mbegin() or bag.moved().begin()).
Otherwise I find myself overloading std::move, which should be a last resort, if anything at all.
In other words, in
template<class MyRange>
void f(MyRange&& r) {
std::copy(std::forward<MyRange>(r).begin(), ..., ...);
}
the fact that r is bound to an r-value doesn't necessarily mean that the elements can be moved, because MyRange can simply be a non-owning view of a larger container that was "just" generated.
Therefore in general you need an external mechanism to detect if MyRange owns the values or not, and not just detecting the "value category" of *std::forward<MyRange>(r).begin() as you propose.
I guess with ranges one can hope in the future to indicate deep moves with some kind of adaptor-like thing "std::ranges::moved_range" or use the 3-argument std::move.
If the question is whether to use std::move or std::copy (or the ranges:: equivalents), the answer is simple: always use copy. If the range given to you has rvalue elements (i.e., its ranges::range_reference_t is either kind(!) of rvalue), you will move from them anyway (so long as the destination supports move assignment).
move is a convenience for when you own the range and decide to move from its elements.
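A small sketch of that claim, assuming nothing beyond the standard library: std::unique_ptr is move-only, so the std::copy call below compiles only because the move iterators make each dereference an rvalue. The function name drain is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <memory>
#include <vector>

// Hypothetical helper: moves every element of `src` into a new std::list
// using plain std::copy over a pair of move iterators.
inline std::list<std::unique_ptr<int>> drain(std::vector<std::unique_ptr<int>>& src) {
    std::list<std::unique_ptr<int>> dst;
    std::copy(std::make_move_iterator(src.begin()),
              std::make_move_iterator(src.end()),
              std::back_inserter(dst));   // push_back(T&&) is selected for each element
    assert(dst.size() == src.size());     // src keeps its size; its elements are now null
    return dst;
}
```

Here the caller decides to move by wrapping the iterators; std::copy itself is unchanged.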
The answer to the question is: IMPOSSIBLE. At least for the current STL containers.
What if we could add some constraints to the Container requirements?
Adding a static constant isContainer and a RangeTraits might work well, but it is not the elegant solution I want.
Inspired by @alfC, I'm considering what the proper behaviour of an r-value container itself should be, which may help in defining a concept (C++20).
There actually is an approach to distinguishing a container from a range adapter, though it cannot be applied today because of a defect in current implementations rather than in the language design.
First of all, the lifetime of the elements cannot exceed that of their container, whereas it is unrelated to the lifetime of a range adapter.
That means retrieving an element's address (by iterator or reference) from an r-value container is wrong behaviour.
One thing that is often neglected in the post-C++11 era is the ref-qualifier.
Lots of existing member functions, like std::vector::swap, should be marked as l-value qualified:
auto getVec() -> std::vector<int>;
//
std::vector<int> vi1;
//getVec().swap(vi1); // pre-C++11 idiom, should be deprecated now
vi1 = getVec(); // move-assignment since C++11
For compatibility reasons, however, this hasn't been adopted. (It's even more confusing that ref-qualifiers haven't been widely applied to newer containers like std::array and std::forward_list.)
For example, it's easy to implement the subscript operator the way we expect:
template <typename T>
class MyArray {
T* _items;
size_t _size;
/* ... */
public:
T& operator [](size_t index) & {
return _items[index];
}
const T& operator [](size_t index) const& {
return _items[index];
}
T operator [](size_t index) && {
// not return by `T&&` !!!
return std::move(_items[index]);
}
// or use `deducing this` since C++23
};
OK, then std::move(container)[index] would return the same result as std::move(container[index]) (not exactly; it may add the overhead of one extra move), which is convenient when we try to forward a container.
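A compilable sketch of the same idea, assuming C++17. Bag is a made-up class, and it delegates storage to std::vector purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical owning container with ref-qualified subscript operators.
template <class T>
class Bag {
    std::vector<T> items_;
public:
    explicit Bag(std::vector<T> items) : items_(std::move(items)) {}

    T& operator[](std::size_t i) & {
        assert(i < items_.size());
        return items_[i];
    }
    const T& operator[](std::size_t i) const& {
        assert(i < items_.size());
        return items_[i];
    }
    // On an r-value Bag, hand the element out by value (moved), not as T&&.
    T operator[](std::size_t i) && {
        assert(i < items_.size());
        return std::move(items_[i]);
    }
};

using StrBag = Bag<std::string>;
static_assert(std::is_same_v<decltype(std::declval<StrBag&>()[0]), std::string&>,
              "lvalue Bag yields a reference");
static_assert(std::is_same_v<decltype(std::declval<const StrBag&>()[0]), const std::string&>,
              "const Bag yields a const reference");
static_assert(std::is_same_v<decltype(std::declval<StrBag>()[0]), std::string>,
              "r-value Bag yields a value");
```

Returning by value avoids a dangling T&& if the Bag was a temporary, at the cost of the extra move mentioned above.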
However, how about begin and end?
template <typename T>
class MyArray {
T* _items;
size_t _size;
/* ... */
class iterator;
class const_iterator;
using move_iterator = std::move_iterator<iterator>;
public:
iterator begin() & { /*...*/ }
const_iterator begin() const& { /*...*/ }
// may work well with x-values, but what about pr-values?
move_iterator begin() && {
return std::make_move_iterator(begin());
}
// or more directly, using ADL
};
As simple as that?
No! An iterator is invalidated once its container is destroyed. So dereferencing an iterator obtained from a temporary (pr-value) is undefined behaviour!!
auto getVec() -> std::vector<int>;
///
auto it = getVec().begin(); // Noooo
auto item = *it; // undefined behaviour
Since there's no way (for the programmer) to tell whether an object is a pr-value or an x-value (both will be deduced as T), retrieving an iterator from an r-value container should be forbidden.
If we could regulate the behaviour of Container and explicitly delete the functions that obtain an iterator from an r-value container, then it would be possible to detect this.
A simple demo is here:
https://godbolt.org/z/4zeMG745f
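The linked demo is not reproduced here. As a rough sketch of the idea, assuming a hypothetical container OwningInts whose r-value begin() overloads are deleted, a C++17 detection trait (here called rvalue_begin_ok, also an invented name) could look like this:

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical owning container that refuses to hand out iterators
// when it is itself an r-value.
class OwningInts {
    std::vector<int> data_;
public:
    int*       begin() &       { return data_.data(); }
    const int* begin() const&  { return data_.data(); }
    void       begin() &&      = delete;
    void       begin() const&& = delete;
};

// Detection idiom: is `std::declval<T>().begin()` a valid expression?
template <class T, class = void>
struct rvalue_begin_ok : std::false_type {};

template <class T>
struct rvalue_begin_ok<T, std::void_t<decltype(std::declval<T>().begin())>>
    : std::true_type {};

// An r-value OwningInts selects the deleted overload, so detection fails.
static_assert(!rvalue_begin_ok<OwningInts>::value, "r-value container is rejected");
// An l-value OwningInts is fine.
static_assert(rvalue_begin_ok<OwningInts&>::value, "l-value container is accepted");
// std::vector has no ref-qualified begin(), so it cannot be told apart this way today.
static_assert(rvalue_begin_ok<std::vector<int>>::value, "std::vector accepts r-values");
```

The last assertion is the point of the argument: the distinction only becomes detectable once containers opt in by deleting those overloads.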
From my perspective, banning such obviously wrong behaviour would not be so destructive as to make well-implemented old projects fail to compile.
Actually, it just requires a few lines of modification for each container, plus proper constraints or overloads for range access utilities like std::begin/std::ranges::begin.

Elegant way to create a vector of reference

I have a vector of Foo
vector<Foo> inputs;
Foo is a struct with some score inside
struct Foo {
    ...
    float score;
    bool winner;
};
Now I want to sort inputs by score and only assign winner to the top 3. But I don't want to change the original inputs vector. So I guess I need to create a vector of reference then sort that? Is it legal to create a vector of reference? Is there an elegant way to do so?
Here are two different ways of creating a vector<Foo*>:
vector<Foo*> foor;
for (auto& x:inputs)
foor.push_back(&x);
vector<Foo*> foob(inputs.size(),nullptr);
transform(inputs.begin(), inputs.end(), foob.begin(), [](auto&x) {return &x;});
You can then use standard algorithms to sort your vectors of pointers without changing the original vector (if this is a requirement):
// decreasing order according to score
sort(foob.begin(), foob.end(), [](Foo*a, Foo*b)->bool {return a->score>b->score;});
You may finally change the top n elements, either using for_each_n() algorithm (if C++17) or simply with an ordinary loop.
The only example code given was for pointers, and the IMO far more fitting std::reference_wrapper was only mentioned, with no indication of how it might be used in a situation like this. I want to fix that!
Non-owning pointers have at least 3 drawbacks:
the visual, from having to pepper &, *, and -> in code using them;
the practical: if all you want is a reference to one object, now you have a thing that can be subtracted from other pointers (which may not be related), be inc/decremented (if not const), do stuff in overload resolution or conversion, etc. – none of which you want. I'm sure everyone is laughing at this and saying 'I'd never make such silly mistakes', but you know in your gut that, on a long enough timeline, it will happen.
and the lack of self-documentation, as they have no innate semantics of ownership or lack thereof.
I typically prefer std::reference_wrapper, which
clearly self-documents its purely observational semantics,
can only yield a reference to an object, thus not having any pointer-like pitfalls, and
sidesteps many syntactical problems by implicitly converting to the real referred type, thus minimising operator noise where you can invoke conversion (pass to a function, initialise a reference, range-for, etc.)... albeit interfering with the modern preference for auto – at least until we get the proposed operator. or operator auto – and requiring the more verbose .get() in other cases or if you just want to avoid such inconsistencies. Still, I argue that these wrinkles are neither worse than those of pointers, nor likely to be permanent given various active proposals to prettify use of wrapper/proxy types.
I'd recommend that or another vocabulary class, especially for publicly exposed data. There are experimental proposal(s) for observer_ptrs and whatnot, but again, if you don't really want pointer-like behaviour, then you should be using a wrapper that models a reference... and we already have one of those.
So... the code in the accepted answer can be rewritten like so (now with #includes and my preferences for formatting):
#include <algorithm>
#include <functional>
#include <vector>
// ...
void
modify_top_n(std::vector<Foo>& v, int const n)
{
std::vector< std::reference_wrapper<Foo> > tmp{ v.begin(), v.end() };
std::nth_element( tmp.begin(), tmp.begin() + n, tmp.end(),
[](Foo const& f1, Foo const& f2){ return f1.score > f2.score; } );
std::for_each( tmp.begin(), tmp.begin() + n,
[](Foo& f){ f.winner = true; } );
}
This makes use of the range constructor to construct a range of reference_wrappers from the range of real Foos, and the implicit conversion to Foo& in the lambda argument lists to avoid having to do reference_wrapper.get() (and then we have the far less messy direct member access by . instead of ->).
Of course, this can be generalised: the main candidate for factoring out to a reusable helper function is the construction of a vector< reference_wrapper<Foo> > for arbitrary Foo, given only a pair of iterators-to-Foo. But we always have to leave something as an exercise to the reader. :P
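For readers who would rather not do the exercise, one possible shape for such a helper is sketched below. The name make_ref_vector is invented here, and the sketch assumes the iterators dereference to real lvalues (no proxy iterators).

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper: builds a vector of reference_wrappers to the elements
// in [first, last). A const_iterator yields reference_wrapper<const T>.
template <class Iter>
auto make_ref_vector(Iter first, Iter last) {
    using T = std::remove_reference_t<typename std::iterator_traits<Iter>::reference>;
    return std::vector<std::reference_wrapper<T>>(first, last);
}

inline void make_ref_vector_demo() {
    std::vector<int> v{3, 1, 2};
    auto refs = make_ref_vector(v.begin(), v.end());
    refs[1].get() = 10;   // writes through to the original element
    assert(v[1] == 10);
}
```

The wrappers stay valid only as long as the original container is neither destroyed nor reallocated.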
If you really don't want to modify the original vector, then you'll have to sort a vector of pointers or indices into the original vector instead. To answer part of your question, no there's no way to make a vector of references and you shouldn't do so.
To find the top three (or n) elements, you don't even have to sort the whole vector. The STL's got you covered with std::nth_element (or std::partial_sort if you care about the order of the top elements), you would do something like this:
void modify_top_n(std::vector<Foo> &v, int n) {
std::vector<Foo*> tmp(v.size());
std::transform(v.begin(), v.end(), tmp.begin(), [](Foo &f) { return &f; });
std::nth_element(tmp.begin(), tmp.begin() + n, tmp.end(),
[](const Foo* f1, const Foo *f2) { return f1->score > f2->score; });
std::for_each(tmp.begin(), tmp.begin() + n, [](Foo *f) {
f->winner = true;
});
}
Assuming the vector has at least n entries. I used for_each just because it's easier when you have an iterator range, you can use a for loop as well (or for_each_n as Christophe mentioned, if you have C++17).
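The index-based variant mentioned above might look like the following sketch, here using std::partial_sort so the winners also come out in order. The function name mark_top_n is made up, and Foo is a stand-in with only the two members used in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Foo {        // stand-in for the question's struct
    float score;
    bool winner;
};

// Marks the n highest-scoring entries as winners without reordering v.
// Returns their indices, best first.
inline std::vector<std::size_t> mark_top_n(std::vector<Foo>& v, std::size_t n) {
    assert(n <= v.size());
    std::vector<std::size_t> idx(v.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});   // 0, 1, 2, ...
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(n),
                      idx.end(),
                      [&v](std::size_t a, std::size_t b) { return v[a].score > v[b].score; });
    idx.resize(n);
    for (std::size_t i : idx) v[i].winner = true;
    return idx;
}
```

Indices survive reallocation of the original vector, which pointers and reference wrappers do not.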
Answering the question at face value:
Vectors of references (as well as built-in arrays of them) are not legal in C++. Here is normative standard wording for arrays:
There shall be no references to references, no arrays of references,
and no pointers to references.
And for vectors it is forbidden by the fact that vector elements must be assignable (while references are not).
To have an array or vector of indirect objects, one can either use a non-owning pointer (std::vector<int*>), or, if a non-pointer access syntax is desired, a wrapper - std::reference_wrapper.
So I guess I need to create a vector of reference then sort that? Is it legal to create a vector of reference?
No, it is not possible to have a vector of references. There is std::reference_wrapper for such purpose, or you can use a bare pointer.
Besides the two ways shown by Christophe, one more way is a transform iterator adaptor, which can be used to sort the top 3 pointers / reference wrappers into an array using std::partial_sort_copy.
A transform iterator simply adapts an output iterator by calling a function to transform input upon assignment. There are no iterator adaptors in the standard library though, so you need to implement one yourself, or use a library.

Taking predicates by value [duplicate]

I'm wondering why functors are passed by copy to the algorithm functions:
template <typename T> struct summatory
{
    summatory() : result(T()) {}
    void operator()(const T& value)
    { result += value; std::cout << value << "; "; }
    T result;
};
std::array<int, 10> a {{ 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 }};
summatory<int> sum;
std::cout << "\nThe summation of: ";
std::for_each(a.begin(), a.end(), sum);
std::cout << "is: " << sum.result;
I was expecting the following output:
The summation of: 1; 1; 2; 3; 5; 8; 13; 21; 34; 55; is: 143
But sum.result contains 0, that is the default value assigned in the ctor. The only way to achieve the desired behaviour is capturing the return value of the for_each:
sum = std::for_each(a.begin(), a.end(), sum);
std::cout << "is: " << sum.result;
This is happening because the functor is passed by copy to the for_each instead of by reference:
template< class InputIt, class UnaryFunction >
UnaryFunction for_each( InputIt first, InputIt last, UnaryFunction f );
So the outer functor remains untouched, while the inner one (which is a copy of the outer) is updated and is returned after the algorithm finishes, so the result is copied (or moved) again after all the operations are done.
There must be a good reason to do it this way, but I don't really understand the rationale for this design, so my questions are:
Why are the predicates of the sequence-operation algorithms passed by copy instead of by reference?
What advantages does the pass-by-copy approach offer over the pass-by-reference one?
It's mostly for historic reasons. In '98, when the whole algo stuff made it into the standard, references had all kinds of problems. Those eventually got resolved through core and library DRs by C++03 and beyond. Also, sensible ref-wrappers and an actually working bind only arrived in TR1.
Those who tried to use algos in early C++98 with functions using ref params or returns can recall all kinds of trouble. Self-written algos were also prone to hit the dreaded 'reference to reference' problem.
Passing by value at least worked fine, and hardly created any problems -- and boost had ref and cref early on to help out where you needed to tweak.
This is purely a guess, but...
...let's for a moment assume it takes by reference to const. This would mean that all of your members must be mutable and the operator must be const. That just doesn't feel "right".
...let's for a moment assume it takes by reference to non-const. It would call a non-const operator, and members can be worked on just fine. But what if you want to pass an ad-hoc object? Like the result of a bind operation (even C++98 had -- ugly and simple -- bind tools)? Or the type itself just does everything you need, you don't need the object after that, and you just want to call it like for_each(b,e,my_functor());? That won't work, since temporaries cannot bind to non-const references.
So maybe not the best, but the least bad option here is to take by value, copy it around in the process as much as needed (hopefully not too often) and then, when done with it, return it from for_each. This works fine with the rather low complexity of your summatory object, doesn't need added mutable stuff like the reference-to-const approach, and works with temporaries too.
But YMMV, and so likely did the committee members', and I would guess it was in the end a vote on what they thought was most likely to fit the most use cases.
Maybe this could be a workaround. Capture the functor by reference and call it in a lambda:
std::for_each(a.begin(), a.end(), [&sum] (const int& value)
{
    sum(value);
});
std::cout << "is: " << sum.result;
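Another option along the same lines is std::ref, the standard descendant of the boost ref mentioned above. A minimal sketch, with the question's functor trimmed down (printing omitted) and a made-up function name summatory_demo:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>

// Trimmed-down version of the question's functor.
template <typename T>
struct summatory {
    T result{};
    void operator()(const T& value) { result += value; }
};

inline void summatory_demo() {
    const std::array<int, 10> a{{1, 1, 2, 3, 5, 8, 13, 21, 34, 55}};
    summatory<int> sum;
    // for_each copies the reference_wrapper, not `sum` itself.
    std::for_each(a.begin(), a.end(), std::ref(sum));
    assert(sum.result == 143);
}
```

This keeps the call site close to the original and leaves the functor's operator() non-const.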

Can I use const in vectors to allow adding elements, but not modifications to the already added?

My comments on this answer got me thinking about the issues of constness and sorting. I played around a bit and reduced my issues to the fact that this code:
#include <vector>
int main() {
std::vector <const int> v;
}
will not compile - you can't create a vector of const ints. Obviously, I should have known this (and intellectually I did), but I've never needed to create such a thing before. However, it seems like a useful construct to me, and I wonder if there is any way round this problem - I want to add things to a vector (or whatever), but they should not be changed once added.
There's probably some embarrassingly simple solution to this, but it's something I'd never considered before.
I probably should not have mentioned sorting (I may ask another question about that, see this for the difficulties of asking questions). My real base use case is something like this:
vector <const int> v; // ok (i.e. I want it to be OK)
v.push_back( 42 ); // ok
int n = v[0]; // ok
v[0] = 1; // not allowed
Well, in C++0x you can...
In C++03, there is a paragraph 23.1[lib.containers.requirements]/3, which says
The type of objects stored in these components must meet the requirements of CopyConstructible types (20.1.3), and the additional requirements of Assignable types.
This is what's currently preventing you from using const int as a type argument to std::vector.
However, in C++0x, this paragraph is missing, instead, T is required to be Destructible and additional requirements on T are specified per-expression, e.g. v = u on std::vector is only valid if T is MoveConstructible and MoveAssignable.
If I interpret those requirements correctly, it should be possible to instantiate std::vector<const int>, you'll just be missing some of its functionality (which I guess is what you wanted). You can fill it by passing a pair of iterators to the constructor. I think emplace_back() should work as well, though I failed to find explicit requirements on T for it.
You still won't be able to sort the vector in-place though.
Types that you put in a standard container have to be copyable and assignable. The reason that auto_ptr causes so much trouble is precisely because it doesn't follow normal copy and assignment semantics. Naturally, anything that's const is not going to be assignable. So, you can't stick const anything in a standard container. And if the element isn't const, then you are going to be able to change it.
The closest solution that I believe is possible would be to use an indirection of some kind. So, you could have a pointer to const or you could have an object which holds the value that you want but the value can't be changed within the object (like you'd get with Integer in Java).
Having the element at a particular index be unchangeable goes against how the standard containers work. You might be able to construct your own which work that way, but the standard ones don't. And none which are based on arrays will work regardless unless you can manage to fit their initialization into the {a, b, c} initialization syntax since once an array of const has been created, you can't change it. So, a vector class isn't likely to work with const elements no matter what you do.
Having const in a container without some sort of indirection just doesn't work very well. You're basically asking to make the entire container const - which you could do if you copy to it from an already initialized container, but you can't really have a container - certainly not a standard container - which contains constants without some sort of indirection.
EDIT: If what you're looking to do is to mostly leave a container unchanged but still be able to change it in certain places in the code, then using a const ref in most places and then giving the code that needs to be able to change the container direct access or a non-const ref would make that possible.
So, use const vector<int>& in most places, and then either vector<int>& where you need to change the container, or give that portion of the code direct access to the container. That way, it's mostly unchangeable, but you can change it when you want to.
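As a rough illustration of that split, here is a sketch with two hypothetical functions, sum and replace_at (neither name comes from the question). Code that only reads takes a const reference; the one place that has to modify takes a non-const reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only access: most code sees the container through a const reference.
inline int sum(const std::vector<int>& v)
{
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];   // writing v[i] = 0 here would not compile
    return total;
}

// Write access: only code that must modify takes a non-const reference.
inline void replace_at(std::vector<int>& v, std::size_t i, int value)
{
    assert(i < v.size());
    v[i] = value;
}
```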
On the other hand, if you want to be able to pretty much always be able to change what's in the container but not change specific elements, then I'd suggest putting a wrapper class around the container. In the case of vector, wrap it and make the subscript operator return a const ref instead of a non-const ref - either that or a copy. So, assuming that you created a templatized version, your subscript operator would look something like this:
const T& operator[](size_t i) const
{
    return _container[i];
}
That way, you can update the container itself, but you can't change its individual elements. And as long as you declare all of the functions inline, it shouldn't be much of a performance hit (if any at all) to have the wrapper.
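A fuller sketch of such a wrapper might look like the following. The class name ReadOnlyElements and the choice of which members to expose (push_back, size, sort) are assumptions made for illustration, not something the question prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: the container can grow and be reordered,
// but individual elements are only handed out as const references.
template <typename T>
class ReadOnlyElements
{
public:
    void push_back(const T& value) { _container.push_back(value); }
    std::size_t size() const { return _container.size(); }

    // The only subscript operator returns a const reference.
    const T& operator[](std::size_t i) const
    {
        assert(i < _container.size());
        return _container[i];
    }

    // Sorting happens internally, where the elements are still mutable.
    void sort() { std::sort(_container.begin(), _container.end()); }

private:
    std::vector<T> _container;
};
```

Given ReadOnlyElements<int> w, calls like w.push_back(42) and w.sort() are allowed, while w[0] = 1 does not compile because operator[] yields a const int&.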
You can't create a vector of const ints, and it'd be pretty useless even if you could. If I remove the second int, then everything from there on is shifted down one -- read: modified -- making it impossible to guarantee that v[5] has the same value on two different occasions.
Add to that, a const can't be assigned to after it's declared, short of casting away the constness. And if you wanna do that, why are you using const in the first place?
You're going to need to write your own class. You could certainly use std::vector as your internal implementation. Then just implement the const interface and those few non-const functions you need.
Although this doesn't meet all of your requirements (being able to sort), try a constant vector:
int values[] = {1, 3, 5, 2, 4, 6};
const std::vector<int> IDs(values, values + sizeof(values) / sizeof(values[0]));
Although, you may want to use a std::list. With the list, the values don't need to change, only the links to them. Sorting is accomplished by changing the order of the links.
You may have to expend some brain power and write your own. :-(
I would have all my const objects in a standard array.
Then use a vector of pointers into the array.
Add a small utility class so that you don't have to dereference the objects yourself, and hey presto:
#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>

class XPointer
{
public:
    XPointer(int const& data)
        : m_data(&data)
    {}
    operator int const&() const
    {
        return *m_data;
    }
private:
    int const* m_data;
};

int const data[] = { 15, 17, 22, 100, 3, 4};
std::vector<XPointer> sorted(data,data+6);

int main()
{
    std::sort(sorted.begin(), sorted.end());
    std::copy(sorted.begin(), sorted.end(), std::ostream_iterator<int>(std::cout, ", "));
    int x = sorted[1];
}
I'm with Noah: wrap the vector with a class that exposes only what you want to allow.
If you don't need to dynamically add objects to the vector, consider std::tr1::array.
If constness is important to you in this instance I think you probably want to work with immutable types all the way up. Conceptually you'll have a fixed size, const array of const ints. Any time you need to change it (e.g. to add or remove elements, or to sort) you'll need to make a copy of the array with the operation performed and use that instead.
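A minimal sketch of that copy-on-change style, using std::vector<int> behind const references as a stand-in for the fixed-size array. The helper names sorted_copy, with_appended and with_removed are hypothetical. Each operation returns a new collection and leaves its input alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a sorted copy; the input is left untouched.
inline std::vector<int> sorted_copy(const std::vector<int>& input)
{
    std::vector<int> result(input);
    std::sort(result.begin(), result.end());
    return result;
}

// Returns a copy with one more element appended.
inline std::vector<int> with_appended(const std::vector<int>& input, int value)
{
    std::vector<int> result(input);
    result.push_back(value);
    return result;
}

// Returns a copy with the element at index i removed.
inline std::vector<int> with_removed(const std::vector<int>& input, std::size_t i)
{
    assert(i < input.size());
    std::vector<int> result(input);
    result.erase(result.begin() + i);
    return result;
}
```

The caller can store every result in a const std::vector<int>, so nothing is ever modified after construction; the price is one full copy per operation.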
While this is very natural in a functional language, it doesn't seem quite "right" in C++. Getting efficient implementations of sort, for example, could be tricky, but you don't say what your performance requirements are.
Whether or not you consider this route worth it from a performance or custom-code perspective, I believe it is the correct approach.
After that, holding the values by non-const pointer or smart pointer is probably the best (but has its own overhead, of course).
I've been thinking a bit on this issue and it seems that your requirement is off.
You don't want to add immutable values to your vector:
std::vector<const int> vec = /**/;
std::vector<const int>::const_iterator first = vec.begin();
std::sort(vec.begin(), vec.end());
assert(*vec.begin() == *first); // false, even though `const int`
What you really want is your vector to hold a constant collection of values, in a modifiable order, which cannot be expressed by the std::vector<const int> syntax even if it worked.
I am afraid that it's an extremely specialized task that would require a dedicated class.
It is true that Assignable is one of the standard requirements for vector element type and const int is not assignable. However, I would expect that in a well-thought-through implementation the compilation should fail only if the code explicitly relies on assignment. For std::vector that would be insert and erase, for example.
In reality, in many implementations the compilation fails even if you are not using these methods. For example, Comeau fails to compile the plain std::vector<const int> a; because the corresponding specialization of std::allocator fails to compile. It reports no immediate problems with std::vector itself.
I believe it is a valid problem. The library-provided implementation of std::allocator is supposed to fail if the type parameter is const-qualified. (I wonder whether it is possible to write a custom allocator that forces the whole thing to compile. It would also be interesting to know how VS manages to compile it.) Again, with Comeau, std::vector<const int> fails to compile for the very same reason std::allocator<const int> fails to compile and, according to the specification of std::allocator, it must fail to compile.
Of course, in any case any implementation has the right to fail to compile std::vector<const int> since it is allowed to fail by the language specification.
Using just an unspecialized vector, this can't be done. Sorting is done by using assignment. So the same code that makes this possible:
sort(v.begin(), v.end());
...also makes this possible:
v[1] = 123;
You could derive a class const_vector from std::vector that overloads any method that returns a reference, and make it return a const reference instead. To do your sort, downcast back to std::vector.
A std::vector of constant objects will probably fail to compile because of the Assignable requirement, since a constant object cannot be assigned to. The same is true of move assignment. I frequently run into this problem when working with a vector-based map such as boost flat_map or Loki AssocVector, since its internal implementation is std::vector<std::pair<const Key,Value> >.
Thus it is almost impossible to follow the const-key requirement of map, which is easy to satisfy in any node-based map.
However, one can ask whether std::vector<const T> means that the vector should store objects of type const T, or merely that it should expose a non-mutable interface on access.
In the latter case, an implementation of std::vector<const T> is possible that satisfies the Assignable/MoveAssignable requirement, because it stores objects of type T rather than const T. The standard typedefs and the allocator type would need small modifications to meet the standard's requirements. To support this for a vector_map or flat_map, though, one would probably need considerable changes to the std::pair interface, since it exposes the member variables first and second directly.
Compilation fails because push_back() (for instance) is basically
underlying_array[size()] = passed_value;
where both operands are T&. If T is const X, that can't work.
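The underlying property can be checked directly with type traits. This is only a sketch and assumes a C++11 compiler (for <type_traits> and static_assert); it does not model push_back itself, only whether assignment to a const int is well-formed.

```cpp
#include <type_traits>

// Requires C++11 for <type_traits> and static_assert.

// A const int can be created as a copy of another int...
static_assert(std::is_copy_constructible<const int>::value,
              "const int should be copy-constructible");

// ...but an existing const int cannot be assigned to.
static_assert(!std::is_copy_assignable<const int>::value,
              "const int should not be copy-assignable");

// Move assignment is ruled out for the same reason.
static_assert(!std::is_move_assignable<const int>::value,
              "const int should not be move-assignable");
```

All three assertions hold at compile time, so any container operation that is written in terms of assigning to an existing element cannot be instantiated for const int.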
Having const elements seems right in principle, but in practice it's unnatural, and the specifications don't say it should be supported, so it's not there. At least not in the stdlib (because then it would be in vector).