I'm learning D and I have been playing with more and more functions and tools defined in phobos. I came across two functions that don't work when the parameters are const or immutable.
BigInt i = "42", j = "42";
writeln(i + j);
// This works, except when I add a const/immutable qualifier to i and j
// When const: main.d(23): Error: incompatible types for ((i) + (j)): 'const(BigInt)' and 'const(BigInt)'
// When immutable: main.d(23): Error: incompatible types for ((i) + (j)): 'immutable(BigInt)' and 'immutable(BigInt)'
The same happens with the std.array.join function.
int[] arr1 = [1, 2, 3, 4];
int[] arr2 = [5, 6];
writeln(join([arr1, arr2]));
// Again, the const and immutable errors are almost identical
// main.d(28): Error: template std.array.join(RoR, R)(RoR ror, R sep) if (isInputRange!RoR && isInputRange!(ElementType!RoR) && isInputRange!R && is(Unqual!(ElementType!(ElementType!RoR)) == Unqual!(ElementType!R))) cannot deduce template function from argument types !()(const(int[])[])
This is quite surprising to me. I have a C++ background so I usually write const everywhere, but it seems I can't do it in D.
As a D "user", I see that as a bug. Can someone explain me why this is not a bug and how I should call these functions with const/immutable data? Thanks.
First off, I should say that D's const is very different from C++'s const. Like C++, ideally you'd mark as much with it as possible, but unlike with C++, there are serious consequences to marking something const in D.
In D, const is transitive, so it affects the entire type, not just the top level, and unlike in C++, you can't mutate it by casting it away or by using mutable (it's undefined behavior and will cause serious bugs if you try to cast away const from an object and then mutate it). The result of those two things is that there are many places where you just can't use const in D without making it impossible to do certain things.
D's const provides real, solid guarantees that you can't mutate the object through that reference in any way, shape or form, whereas C++'s const just makes it so that you can't mutate anything which is const by accident, but you can easily cast away const and mutate the object (with defined behavior), or pieces of the object could be changed internally by const functions thanks to mutable. It's also trivial in C++ to return a mutable reference to the internals of a class from a const function even without casting or mutable (e.g. returning vector<int*> from a const function - the vector can't be mutated but everything it refers to can be). None of those are possible in D, as D guarantees full transitive const, and providing those guarantees makes it so that any circumstance where you need to get at something mutable from something const isn't going to work unless you create an entirely new copy of it.
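To make the contrast concrete, here is a small C++ sketch (with a made-up Cache type) of the escape hatches described above, none of which exist in D:
#include <vector>
// Cache is a hypothetical type, for illustration only.
struct Cache {
    mutable int hits = 0;       // mutable: writable even through a const reference
    std::vector<int*> ptrs;
    int lookup() const {
        ++hits;                 // legal C++: a const member function mutates a mutable member
        return hits;
    }
    std::vector<int*> internals() const {
        return ptrs;            // legal C++: the copy is new, but the pointed-to ints stay mutable
    }
};
void mutate(const Cache& c) {
    // Legal C++ (defined behavior) as long as the referenced object was not
    // originally declared const. In D, the equivalent cast plus mutation is
    // undefined behavior.
    const_cast<Cache&>(c).hits = 0;
}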
You should probably read over the answers to these questions:
Logical const in D
What is the difference between const and immutable in D?
So, if you're slapping const on everything in D, you'll find that some things just won't work. Using const as much as you can is great for the same reasons that it is in C++, but the cost is much higher, so you have to be more restrictive about what you mark with const.
Now, as to your specific issue here. BigInt is supposed to work with const and immutable but does not currently. There are a few open bugs on the issue. I believe that a lot of the problem stems from the fact that BigInt uses COW internally, and that doesn't play nicely with const or immutable. Fortunately, there's a pull request on github at the moment which fixes at least some of the problems, so I expect that BigInt will work with const and immutable in the near future, but for the moment, you can't.
As for join, your example compiles just fine, so you copied your code wrong. There is no const in your example. Perhaps you meant
const int[] arr1 = [1, 2, 3, 4];
const int[] arr2 = [5, 6];
writeln(join([arr1, arr2]));
And that doesn't compile. And that's because you aren't passing a valid range of ranges to join. The type that you'd be passing to join in that case would be const(int[])[]. The outer array is mutable, so it's fine, but the inner ones - the ranges that you're trying to join together - are const, and nothing which is const can be a valid range, and that's because popFront won't work. For something to be a valid input range, this code must compile for it (and this is taken from inside of std.range.isInputRange).
R r = void; // can define a range object
if (r.empty) {} // can test for empty
r.popFront(); // can invoke popFront()
auto h = r.front; // can get the front of the range
const(int[]) won't work with popFront as isInputRange requires. e.g.
const int[] arr = [1, 2, 3];
arr.popFront();
won't compile, so isInputRange is false, and join won't compile with it.
Now, fortunately, arrays are a bit special in that the compiler understands them, so the compiler knows that it's perfectly legit to turn const(int[]) into const(int)[] when you slice it. That is, it knows that giving you a tail-const slice won't be able to affect the original array (because the result is a new array, and while the elements are shared between the arrays, they're all const, so they still can't be mutated). So, the type of arr[] would be const(int)[] instead of const(int[]), and the type of [arr1[], arr2[]] is const(int)[][], which will work with join. So, you can do
const int[] arr1 = [1, 2, 3, 4];
const int[] arr2 = [5, 6];
writeln(join([arr1[], arr2[]]));
and your code will work just fine. However, that's just because you're using arrays. If you were dealing with user-defined ranges, the moment you made one of them const, you'd be stuck. This code won't compile
const arr1 = filter!"true"([1, 2, 3, 4]);
const arr2 = filter!"true"([5, 6]);
writeln(join([arr1[], arr2[]]));
And that's because the compiler doesn't know that it can safely get a tail-const slice from a user-defined type. It needs to know that it can convert const MyRange!E to MyRange!(const E) and have the proper semantics. And it can't know that, because those are two different template instantiations, and they could have completely different internals. The programmer writing MyRange would have to be able to write opSlice such that it returns MyRange!(const E) when the type is const MyRange!E or const MyRange!(const E), and that's actually hard to do (if nothing else, it very easily results in recursive template instantiations). Some clever use of static if and alias this should make it possible, but it's hard enough to do that pretty much no one does it right now. It's an open question as to how we're going to make it sane for user-defined types to make opSlice return a tail-const range. And until that question is solved, const and ranges just don't mix, because as soon as you get a const range, there's no way to get a tail-const slice of it which could have popFront called on it. So, once your range is const, it's const.
Arrays are special, since they're built-in, so you can get away with using const with them as long as you slice them at the appropriate times (and the fact that templates are instantiated with their slice type rather than their original type helps), but in general, if you're using a range, just assume that you can't make it const. Hopefully, that changes someday, but for now, that's the way that it is.
Related
It seems that auto was a fairly significant feature added in C++11, following the lead of many newer languages. In a language like Python, I have not seen any explicit variable declaration (I am not sure whether it is even possible in Python).
Is there a drawback to using auto to declare variables instead of explicitly declaring them?
The question is about drawbacks of auto, so this answer highlights some of those. A drawback of using a programming language feature (in this case, a facility associated with a language keyword) does not mean that feature is unacceptable, nor does it mean that feature should be avoided entirely. It means there are disadvantages along with advantages, so a decision to use auto type deduction over alternatives must consider engineering trade-offs.
When used well, auto has several advantages as well - which is not the subject of the question. The drawbacks result from ease of abuse, and from increased potential for code to behave in unintended or unexpected ways.
The main drawback is that, by using auto, you don't necessarily know the type of object being created. There are also occasions where the programmer might expect the compiler to deduce one type, but the compiler adamantly deduces another.
Given a declaration like
auto result = CallSomeFunction(x,y,z);
you don't necessarily have knowledge of what type result is. It might be an int. It might be a pointer. It might be something else. All of those support different operations. You can also dramatically change the code by a minor change like
auto result = CallSomeFunction(a,y,z);
because, depending on what overloads exist for CallSomeFunction(), the type of result might be completely different - and subsequent code may therefore behave completely differently than intended. You might suddenly trigger error messages in later code (e.g. subsequently trying to dereference an int, or trying to change something which is now const). The more sinister change is where your change sails past the compiler, but subsequent code behaves in different and unknown - possibly buggy - ways. For example (as noted by sashoalm in comments), if the deduced type of a variable changes from an integral type to a floating-point type, subsequent code may be unexpectedly and silently affected by loss of precision.
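For illustration, a self-contained sketch with hypothetical overloads (the names and signatures are assumed, not taken from any real code):
#include <iostream>
// Hypothetical overloads, invented for illustration.
int CallSomeFunction(int, int, int) { return 42; }
double CallSomeFunction(long, int, int) { return 42.5; }
int main() {
    int x = 0, y = 0, z = 0;
    long a = 0;
    auto result = CallSomeFunction(x, y, z);  // deduced as int
    auto result2 = CallSomeFunction(a, y, z); // deduced as double: one changed argument,
                                              // and later arithmetic silently becomes floating-point
    std::cout << result << ' ' << result2 << '\n';
}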
Not having explicit knowledge of the type of some variables therefore makes it harder to rigorously justify a claim that the code works as intended. This means more effort to justify claims of "fit for purpose" in high-criticality (e.g. safety-critical or mission-critical) domains.
The other, more common drawback, is the temptation for a programmer to use auto as a blunt instrument to force code to compile, rather than thinking about what the code is doing, and working to get it right.
This isn't a drawback of auto in a principled way exactly, but in practical terms it seems to be an issue for some. Basically, some people either: a) treat auto as a savior for types and shut their brain off when using it, or b) forget that auto always deduces to value types. This causes people to do things like this:
auto x = my_obj.method_that_returns_reference();
Oops, we just deep copied some object. It's often either a bug or a performance fail. Then, you can swing the other way too:
const auto& stuff = *func_that_returns_unique_ptr();
Now you get a dangling reference. These problems aren't caused by auto at all, so I don't consider them legitimate arguments against it. But it does seem like auto makes these issues more common (from my personal experience), for the reasons I listed at the beginning.
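Here are both snippets above made self-contained; Widget and the bodies of the two functions are stand-ins for illustration:
#include <memory>
#include <vector>
struct Widget {
    std::vector<int> v{1, 2, 3};
    std::vector<int>& method_that_returns_reference() { return v; }
};
std::unique_ptr<Widget> func_that_returns_unique_ptr() {
    return std::make_unique<Widget>();
}
int main() {
    Widget my_obj;
    auto x = my_obj.method_that_returns_reference();  // deep copy: auto drops the reference
    auto& y = my_obj.method_that_returns_reference(); // a real reference, as probably intended
    const auto& stuff = *func_that_returns_unique_ptr();
    // dangling: the temporary unique_ptr (and the Widget it owns)
    // is destroyed at the end of this full-expression
    (void)x; (void)y; (void)stuff;
}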
I think given time people will adjust, and understand the division of labor: auto deduces the underlying type, but you still want to think about reference-ness and const-ness. But it's taking a bit of time.
Other answers are mentioning drawbacks like "you don't really know what the type of a variable is." I'd say that this is largely related to sloppy naming convention in code. If your interfaces are clearly-named, you shouldn't need to care what the exact type is. Sure, auto result = callSomeFunction(a, b); doesn't tell you much. But auto valid = isValid(xmlFile, schema); tells you enough to use valid without having to care what its exact type is. After all, with just if (callSomeFunction(a, b)), you wouldn't know the type either. The same with any other subexpression temporary objects. So I don't consider this a real drawback of auto.
I'd say its primary drawback is that sometimes, the exact return type is not what you want to work with. In effect, sometimes the actual return type differs from the "logical" return type as an implementation/optimisation detail. Expression templates are a prime example. Let's say we have this:
SomeType operator* (const Matrix &lhs, const Vector &rhs);
Logically, we would expect SomeType to be Vector, and we definitely want to treat it as such in our code. However, it is possible that for optimisation purposes, the algebra library we're using implements expression templates, and the actual return type is this:
MultExpression<Matrix, Vector> operator* (const Matrix &lhs, const Vector &rhs);
Now, the problem is that MultExpression<Matrix, Vector> will in all likelihood store a const Matrix& and const Vector& internally; it expects that it will convert to a Vector before the end of its full-expression. If we have this code, all is well:
extern Matrix a, b, c;
extern Vector v;
void compute()
{
Vector res = a * (b * (c * v));
// do something with res
}
However, if we had used auto here, we could get in trouble:
void compute()
{
auto res = a * (b * (c * v));
// Oops! Now `res` is referring to temporaries (such as (c * v)) which no longer exist
}
It makes your code a little harder, or more tedious, to read.
Imagine something like that:
auto output = doSomethingWithData(variables);
Now, to figure out the type of output, you'd have to track down the signature of the doSomethingWithData function.
One of the drawbacks is that sometimes you can't declare a const_iterator with auto. You will get an ordinary (non-const) iterator in this example of code, taken from this question:
map<string,int> usa;
//...init usa
auto city_it = usa.find("New York");
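If you are on C++17, one workaround I know of is std::as_const from <utility>, which lets you keep auto and still get a const_iterator by searching through a const view of the map:
#include <map>
#include <string>
#include <utility> // std::as_const, C++17
int main() {
    std::map<std::string, int> usa;
    //...init usa
    auto city_it = usa.find("New York");                 // map<...>::iterator
    auto city_cit = std::as_const(usa).find("New York"); // map<...>::const_iterator
}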
Like this developer, I hate auto. Or rather, I hate how people misuse auto.
I'm of the (strong) opinion that auto is for helping you write generic code, not for reducing typing.
C++ is a language whose goal is to let you write robust code, not to minimize development time.
This is fairly obvious from many features of C++, but unfortunately a few of the newer ones like auto that reduce typing mislead people into thinking they should start being lazy with typing.
In pre-auto days, people used typedefs, which was great because typedef allowed the designer of the library to help you figure out what the return type should be, so that their library works as expected. When you use auto, you take away that control from the class's designer and instead ask the compiler to figure out what the type should be, which removes one of the most powerful C++ tools from the toolbox and risks breaking their code.
Generally, if you use auto, it should be because your code works for any reasonable type, not because you're just too lazy to write down the type that it should work with.
If you use auto as a tool to help laziness, then what happens is that you eventually start introducing subtle bugs in your program, usually caused by implicit conversions that did not happen because you used auto.
Unfortunately, these bugs are difficult to illustrate in a short example here because their brevity makes them less convincing than the actual examples that come up in a user project -- however, they occur easily in template-heavy code that expects certain implicit conversions to take place.
If you want an example, there is one here. A little note, though: before being tempted to jump and criticize the code: keep in mind that many well-known and mature libraries have been developed around such implicit conversions, and they are there because they solve problems that can be difficult if not impossible to solve otherwise. Try to figure out a better solution before criticizing them.
auto does not have drawbacks per se, and I advocate (hand-wavily) using it everywhere in new code. It allows your code to consistently type-check, and consistently avoids silent slicing. (If B derives from A and a function returning A suddenly returns B, then auto behaves as expected and stores its return value.)
Although, pre-C++11 legacy code may rely on implicit conversions induced by the use of explicitly-typed variables. Changing an explicitly-typed variable to auto might change code behaviour, so you'd better be cautious.
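As a sketch of the slicing point above (A, B and make are invented for illustration):
#include <iostream>
struct A {
    virtual const char* name() const { return "A"; }
    virtual ~A() = default;
};
struct B : A {
    const char* name() const override { return "B"; }
};
B make() { return B{}; } // imagine a function that used to return A and now returns B
int main() {
    A sliced = make();   // explicit type: the B part is silently sliced away
    auto whole = make(); // auto: deduced as B, nothing lost
    std::cout << sliced.name() << ' ' << whole.name() << '\n'; // prints: A B
}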
The auto keyword simply deduces the type from the initializer. It is therefore not equivalent to a Python variable, e.g.
# Python
a
a = 10 # OK
a = "10" # OK
a = ClassA() # OK
// C++
auto a; // Unable to deduce variable a
auto a = 10; // OK
a = "10"; // Value of const char* can't be assigned to int
a = ClassA{}; // Value of ClassA can't be assigned to int
a = 10.0; // OK, implicit casting warning
Since auto is deduced during compilation, it won't have any drawback at runtime whatsoever.
What no one has mentioned here so far, but which in itself is worth an answer if you ask me.
Since code written in C can easily be designed to provide a base for C++ code (even if everyone should be aware that C != C++), and can therefore without too much effort be designed to be C++ compatible, C++ compatibility can be a design requirement.
I know there are rules under which some well-defined constructs from C are invalid in C++ and vice versa. But violating those would simply result in broken executables, and the well-known UB clause applies, which most of the time shows up as strange behavior and crashes (or may even stay undetected, but that doesn't matter here).
But auto is the first time1 this changes!
Imagine you used auto as a storage-class specifier in C and then ported the code. It would not even necessarily (depending on the way it was used) "break"; it actually could silently change the behaviour of the program.
That's something one should keep in mind.
1At least the first time I'm aware of.
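As a concrete illustration of that silent change (relying on C89's implicit int, which C99 later removed):
// C++11: x is deduced as double and holds 10.0.
// The same declaration under C89 rules - auto as a storage-class
// specifier plus implicit int - declares an int initialized to 10,
// a silent truncation.
auto x = 10.0;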
As I described in this answer, auto can sometimes result in funky situations you didn't intend.
You have to explicitly say auto& to get a reference type, while plain auto can produce a pointer type. Omitting the specifier altogether can cause confusion, resulting in a copy of the referenced object instead of an actual reference.
One reason that I can think of is that you lose the opportunity to coerce the type that is returned. If your function or method returns a 64-bit long, and you only want a 32-bit unsigned int, then you lose the opportunity to control that.
I think auto is good when used in a localized context, where the reader can easily and obviously deduce its type, or when it is well documented with a comment stating its type or a name that infers the actual type. Those who don't understand how it works might take it the wrong way, like using it instead of a template or similar. Here are some good and bad use cases in my opinion.
void test (const int & a)
{
// b is not const
// b is not a reference
auto b = a;
// b's type is decided by the compiler based on the type of a
// a is int
}
Good Uses
Iterators
std::vector<boost::tuple<ClassWithLongName1, std::vector<ClassWithLongName2>, int>> v;
..
std::vector<boost::tuple<ClassWithLongName1, std::vector<ClassWithLongName2>, int>>::iterator it = v.begin();
// VS
auto vi = v.begin();
Function Pointers
int test (ClassWithLongName1 a, ClassWithLongName2 b, int c)
{
..
}
..
int (*fp)(ClassWithLongName1, ClassWithLongName2, int) = test;
// VS
auto *f = test;
Bad Uses
Data Flow
auto input = "";
..
auto output = test(input);
Function Signature
auto test (auto a, auto b, auto c)
{
..
}
Trivial Cases
for(auto i = 0; i < 100; i++)
{
..
}
Another irritating example:
for (auto i = 0; i < s.size(); ++i)
generates a warning (comparison between signed and unsigned integer expressions [-Wsign-compare]), because i is a signed int. To avoid this you need to write e.g.
for (auto i = 0U; i < s.size(); ++i)
or perhaps better:
for (auto i = 0ULL; i < s.size(); ++i)
I'm surprised nobody has mentioned this, but suppose you are calculating the factorial of something:
#include <iostream>
using namespace std;
int main() {
auto n = 40;
auto factorial = 1;
for(int i = 1; i <=n; ++i)
{
factorial *= i;
}
cout << "Factorial of " << n << " = " << factorial <<endl;
cout << "Size of factorial: " << sizeof(factorial) << endl;
return 0;
}
This code will output this:
Factorial of 40 = 0
Size of factorial: 4
That was definitely not the expected result. It happened because auto deduced the type of the variable factorial as int, since it was initialized with 1.
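A minimal fix that keeps auto is to choose the accumulator's type via the literal's suffix; note that even unsigned long long only holds factorials up to 20!, so n is reduced here:
#include <iostream>
using namespace std;
int main() {
    auto n = 20;           // 20! still fits in unsigned long long; 40! does not
    auto factorial = 1ULL; // the ULL suffix makes auto deduce unsigned long long
    for (int i = 1; i <= n; ++i)
    {
        factorial *= i;
    }
    cout << "Factorial of " << n << " = " << factorial << endl;
    cout << "Size of factorial: " << sizeof(factorial) << endl;
    return 0;
}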
This is a follow-up question to:
Innocent range based for loop not working
In this question, the syntax that is used is:
int a{},b{},c{},d{};
for (auto& i : {a, b, c, d}) {
i = 1;
}
That is not legal C++. (In order to make it work, a new container type would have to be invented that stores pointers or references internally.)
Is this just a sort of side-effect of two different concepts, or why was it not allowed?
At first this looks like a miss in regard to driving the language forward, but knowing that much work goes into these things, I'm guessing it was problematic, or perhaps just not considered very much.
I'm speculating the answers might be something like:
It was problematic in some way, given how initializer_list/arrays/brace-init lists/etc. were designed (and 'fixing' that would either be too hard for compiler writers or would perform less optimally in the general case).
It would require a special rule or it could potentially be allowed with some fundamental language change (eg. to initializer_list).
It would be ambiguous / unclear to read.
I think the most important part about why this is invalid is the reason given at cppreference (since C++14, emphasis mine):
The underlying array is a temporary array of type const T[N], in which
each element is copy-initialized (except that narrowing conversions
are invalid) from the corresponding element of the original
initializer list. The lifetime of the underlying array is the same as
any other temporary object, except that initializing an
initializer_list object from the array extends the lifetime of the
array exactly like binding a reference to a temporary (with the same
exceptions, such as for initializing a non-static class member). The
underlying array may be allocated in read-only memory.
So giving a non-const reference in the for loop is invalid. Also, as noted in the comments, an array of references is not a valid C++ construct.
Relevant cpp standard read: http://eel.is/c++draft/support.initlist.access
It would require a special rule or it could potentially be allowed with some fundamental language change (eg. to initializer_list).
The change would indeed need to be fundamental. The problem is not with how initializer_list is implemented though.
According to [dcl.ref]:
There shall be no references to references, no arrays of references, and no pointers to references.
This makes sense since a reference is not an object in the strict sense. A reference is not required to have storage.
Since arrays of references are not legal, these are a few workarounds:
Use an array of pointers
for (auto* i : {&a, &b, &c, &d}) {
*i = 1;
}
Flavours using std::reference_wrapper. Note that I've used int& instead of auto& to not have to use i.get() = 1:
for (int& i : {std::ref(a), std::ref(b), std::ref(c), std::ref(d)}) {
i = 1;
}
for (int& i : std::initializer_list<std::reference_wrapper<int>>{a,b,c,d}) {
i = 1;
}
If you use it a lot, make a helper:
template<typename T>
using refwrap = std::initializer_list<std::reference_wrapper<T>>;
for (int& i : refwrap<int>{a,b,c,d}) {
i = 1;
}
It does not work because it isn't supposed to work. That's the design of the feature. That feature being list initialization, which as the name suggests is about initializing something.
When C++11 introduced initializer_list, it was done for precisely one purpose: to allow the system to generate an array of values from a braced-init-list and pass them to a properly-tagged constructor (possibly indirectly) so that the object could initialize itself with that sequence of values. The "proper tag" in this case being that the constructor took the newly-minted std::initializer_list type as its first/only parameter. That's the purpose of initializer_list as a type.
Initialization, broadly speaking, should not modify the values it is given. The fact that the array backing the list is a temporary also doubles-down on the idea that the input values should logically be non-modifiable. If you have:
std::vector<int> v = {1, 2, 3, 4, 5};
We want to give the compiler the freedom to make that array of 5 elements a static array in the compiled binary, rather than a stack array that bloats the stack size of this function. More to the point, we logically want to think of the braced-init-list like a literal.
And we don't allow literals to be modified.
Your attempt to make {a, b, c, d} into a range of modifiable references is essentially trying to take a construct that already exists for one purpose and turn it into a construct for a different purpose. You're not initializing anything; you're just using a seemingly convenient syntax that happens to make iterable lists of values. But that syntax is called a "braced-init-list", and it generates an initializer list; the terminology is not an accident.
If you take a tool intended for X and try to hijack it to do Y, you're likely to encounter rough edges somewhere down the road. So the reason why it doesn't work is that it's not meant to; this is not what braced-init-lists and initializer_lists are for.
You might next say "if that's the case, why does for(auto i: {1, 2, 3, 4, 5}) work at all, if braced-init-lists are only for initialization?"
Once upon a time, it didn't; in C++11, that would be ill-formed. It was only in C++14 that auto l = {1, 2, 3, 4}; became legal syntax; an auto variable was allowed to automatically deduce a braced-init-list as an initializer_list.
Range-based for uses auto deduction for the range type, so it inherited this ability. This naturally led people to believe that braced-init-lists are about building ranges of values, not initializing things.
In short, someone's convenience feature led you to believe that a construct meant to initialize objects is just a quick way to create an array. It never was.
Having established that braced-init-lists aren't supposed to do the thing you want them to do, what would it take to make them do what you want?
Well, there are basically two ways to do it: small scale and large scale. The large-scale version would be to change how auto i = {a, b, c, d}; works, so that it could (based on something) create a modifiable range of references to expressions. Range-based for would just use its existing auto-deduction machinery to pick up on it.
This is of course a non-starter. That definition already has a well-defined meaning: it creates a non-modifiable list of copies of those expressions, not references to their results. Changing it would be a breaking change.
A small-scale change would be to hack range-based for to do some fancy deduction based on whether the range expression is a braced-init-list or not. Now, because such ranges and their iterators are buried in the compiler-generated code for range-based for, you won't have as many backwards compatibility problems. So you could make a rule where, if your range-statement defines a non-const reference variable, and the range-expression is a braced-init-list, then a different deduction mechanism kicks in.
The biggest problem here is that it's a complete and total hack. If it's useful to do for(auto &i : {a, b, c, d}), then it's probably useful to be able to get the same kind of range outside of a range-based for loop. As it currently stands, the rules about auto-deduction of braced-init-lists are consistent everywhere. Range-based for gains its capabilities simply because it uses auto deduction.
The last thing C++ needs is more inconsistency.
This is doubly important in light of C++20 adding an init-statement to range-for. These two things ought to be equivalent:
for(auto &i : {a, b, c, d})
for(auto &&rng = {a, b, c, d}; auto &i: rng)
But if you change the rules only based on the range-expression and range-statement, they wouldn't be. rng would be deduced according to existing auto rules, thus making the auto &i non-functional. And having a super-special-case rule that changes how the init-statement of a range-for behaves, different from the same statement in other locations, would be even more inconsistent.
Besides, it's not too difficult to write a generic reference_range function that takes non-const variadic reference parameters (of the same type) and returns some kind of random-access range over them. That will work everywhere equally and without compatibility problems.
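For what it's worth, here is a sketch of such a helper; the name reference_range and its exact shape are assumptions, not an existing facility:
#include <array>
#include <functional>
// Build a fixed-size, random-access range of reference_wrappers
// to a variadic pack of same-typed lvalues.
template <typename T, typename... Ts>
auto reference_range(T& first, Ts&... rest) {
    return std::array<std::reference_wrapper<T>, 1 + sizeof...(Ts)>{first, rest...};
}
int main() {
    int a{}, b{}, c{}, d{};
    for (int& i : reference_range(a, b, c, d)) {
        i = 1; // actually modifies a, b, c and d
    }
}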
So let's just avoid trying to make a syntactic construct intended for initializing objects into a generic "list of stuff" tool.
So I asked a question here: Lambda Works on Latest Visual Studio, but Doesn't Work Elsewhere, to which I got the response that my code was implementation-defined, since the standard's 25.1 [algorithms.general]/10 says:
Unless otherwise specified, algorithms that take function objects as arguments are permitted to copy
those function objects freely. Programmers for whom object identity is important should consider using a
wrapper class that points to a noncopied implementation object such as reference_wrapper<T>
I'd just like a reason why this is happening. We're told our whole lives to take objects by reference; why then is the standard taking function objects by value and, even worse in my linked question, making copies of those objects? Is there some advantage that I don't understand to doing it this way?
std assumes function objects and iterators are free to copy.
std::ref provides a method to turn a function object into a pseudo-reference with a compatible operator() that uses reference instead of value semantics. So nothing of large value is lost.
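A short sketch of that std::ref technique (Counter is made up for illustration):
#include <algorithm>
#include <functional>
#include <vector>
struct Counter {
    int count = 0;
    void operator()(int) { ++count; }
};
int main() {
    std::vector<int> v{1, 2, 3};
    Counter c;
    // for_each may copy its function-object argument freely, but here it
    // only copies the reference_wrapper; every call still updates c.
    std::for_each(v.begin(), v.end(), std::ref(c));
    // c.count == 3 at this point
}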
If you have been taught all your life to take objects by reference, reconsider. Unless there is a good reason otherwise, take objects by value. Reasoning about values is far easier; references are pointers into any state anywhere in your program.
The conventional use of references, as a pointer to a local object which is not referred to by any other active reference in the context where it is used, is not something someone reading your code nor the compiler can presume. If you reason about references this way, they don't add a ridiculous amount of complexity to your code.
But if you reason about them that way, you are going to have bugs when your assumption is violated, and they will be subtle, gross, unexpected, and horrible.
A classic example is the number of operator= that break when this and the argument refer to the same object. But any function that takes two references or pointers of the same type has the same issue.
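The operator= case, spelled out as a minimal (deliberately buggy) sketch; Buffer is invented for illustration:
#include <algorithm>
#include <cstddef>
struct Buffer {
    int* data = nullptr;
    std::size_t n = 0;
    Buffer& operator=(const Buffer& rhs) {
        delete[] data;                           // if (&rhs == this), rhs.data is freed here...
        n = rhs.n;
        data = new int[n];
        std::copy(rhs.data, rhs.data + n, data); // ...and this reads freed memory
        return *this;
    }
};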
But even one reference can break your code. Let's look at sort. In pseudo-code:
void sort( Iterator start, Iterator end, Ordering order )
Now, let's make Ordering a reference:
void sort( Iterator start, Iterator end, Ordering const& order )
How about this one?
std::function< void(int, int) > alice;
std::function< void(int, int) > bob;
alice = [&]( int x, int y ) { std::swap(alice, bob); return x<y; };
bob = [&]( int x, int y ) { std::swap(alice, bob); return x>y; };
Now, call sort( begin(vector), end(vector), alice ).
Every time < is called, the referred-to order object swaps meaning. Now this is pretty ridiculous, but when you took Ordering by const&, the optimizer had to take into account that possibility and rule it out on every invocation of your ordering code!
You wouldn't do the above (and in fact this particular implementation is UB, as it would violate any reasonable requirements of std::sort); but the compiler has to prove you didn't do something "like that" (change the code in ordering) every time it follows order or invokes it! Which means constantly reloading the state of order, or inlining and proving you did no such insanity.
Doing this when taking by-value is an order of magnitude harder (and basically requires something like std::ref). The optimizer has a function object, it is local, and its state is local. Anything stored within it is local, and the compiler and optimizer know who exactly can modify it legally.
Every function you write taking a const& that ever leaves its "local scope" (say, by calling a C library function) cannot assume the state of the const& remained the same after it got back. It must reload the data from wherever the pointer points to.
Now, I did say pass by value unless there is a good reason. And there are many good reasons; your type is very expensive to move or copy, for example, is a great reason. You are writing data to it. You actually want it to change as you read it each time. Etc.
But the default behavior should be pass-by-value. Only move to references if you have a good reason, because the costs are distributed and hard to pin down.
I'm not sure I have an answer for you, but if I have got my object lifetimes correct I think this is portable, safe and adds zero overhead or complexity:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
// #pre f must be an r-value reference - i.e. a temporary
template<class F>
auto resist_copies(F &&f) {
return std::reference_wrapper<F>(f);
};
void removeIntervals(std::vector<double> &values, const std::vector<std::pair<int, int>> &intervals) {
values.resize(distance(
begin(values),
std::remove_if(begin(values), end(values),
resist_copies([i = 0U, it = cbegin(intervals), end = cend(intervals)](const auto&) mutable
{
return it != end && ++i > it->first && (i <= it->second || (++it, true));
}))));
}
int main(int argc, char **args) {
// Intervals of indices I have to remove from values
std::vector<std::pair<int, int>> intervals = {{1, 3},
{7, 9},
{13, 13}};
// Vector of arbitrary values.
std::vector<double> values = {4.2, 6.4, 2.3, 3.4, 9.1, 2.3, 0.6, 1.2, 0.3, 0.4, 6.4, 3.6, 1.4, 2.5, 7.5};
removeIntervals(values, intervals);
// values should now contain 4.2, 9.1, 2.3, 0.6, 6.4, 3.6, 1.4, 7.5
std::copy(values.begin(), values.end(), std::ostream_iterator<double>(std::cout, ", "));
std::cout << '\n';
}
My comments on this answer got me thinking about the issues of constness and sorting. I played around a bit and reduced my issues to the fact that this code:
#include <vector>
int main() {
std::vector <const int> v;
}
will not compile - you can't create a vector of const ints. Obviously, I should have known this (and intellectually I did), but I've never needed to create such a thing before. However, it seems like a useful construct to me, and I wonder if there is any way round this problem - I want to add things to a vector (or whatever), but they should not be changed once added.
There's probably some embarrassingly simple solution to this, but it's something I'd never considered before.
I probably should not have mentioned sorting (I may ask another question about that, see this for the difficulties of asking questions). My real base use case is something like this:
vector <const int> v; // ok (i.e. I want it to be OK)
v.push_back( 42 ); // ok
int n = v[0]; // ok
v[0] = 1; // not allowed
Well, in C++0x you can...
In C++03, there is a paragraph 23.1[lib.containers.requirements]/3, which says
The type of objects stored in these components must meet the requirements of CopyConstructible types (20.1.3), and the additional requirements of Assignable types.
This is what's currently preventing you from using const int as a type argument to std::vector.
However, in C++0x, this paragraph is missing, instead, T is required to be Destructible and additional requirements on T are specified per-expression, e.g. v = u on std::vector is only valid if T is MoveConstructible and MoveAssignable.
If I interpret those requirements correctly, it should be possible to instantiate std::vector<const int>, you'll just be missing some of its functionality (which I guess is what you wanted). You can fill it by passing a pair of iterators to the constructor. I think emplace_back() should work as well, though I failed to find explicit requirements on T for it.
You still won't be able to sort the vector in-place though.
Types that you put in a standard container have to be copyable and assignable. The reason that auto_ptr causes so much trouble is precisely because it doesn't follow normal copy and assignment semantics. Naturally, anything that's const is not going to be assignable. So, you can't stick const anything in a standard container. And if the element isn't const, then you are going to be able to change it.
The closest solution that I believe is possible would be to use an indirection of some kind. So, you could have a pointer to const or you could have an object which holds the value that you want but the value can't be changed within the object (like you'd get with Integer in Java).
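A minimal sketch of that indirection, assuming shared ownership is acceptable:
#include <memory>
#include <vector>
int main() {
    // The container itself stays mutable (it can grow and be reordered),
    // but the values it points to cannot be changed.
    std::vector<std::shared_ptr<const int>> v;
    v.push_back(std::make_shared<const int>(42));
    int n = *v[0]; // ok: reading
    // *v[0] = 1;  // error: assignment through a pointer-to-const
    (void)n;
}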
Having the element at a particular index be unchangeable goes against how the standard containers work. You might be able to construct your own which work that way, but the standard ones don't. And none which are based on arrays will work regardless unless you can manage to fit their initialization into the {a, b, c} initialization syntax since once an array of const has been created, you can't change it. So, a vector class isn't likely to work with const elements no matter what you do.
Having const in a container without some sort of indirection just doesn't work very well. You're basically asking to make the entire container const - which you could do if you copy to it from an already initialized container, but you can't really have a container - certainly not a standard container - which contains constants without some sort of indirection.
EDIT: If what you're looking to do is to mostly leave a container unchanged but still be able to change it in certain places in the code, then using a const ref in most places and then giving the code that needs to be able to change the container direct access or a non-const ref would make that possible.
So, use const vector<int>& in most places, and then either vector<int>& where you need to change the container, or give that portion of the code direct access to the container. That way, it's mostly unchangeable, but you can change it when you want to.
On the other hand, if you want to be able to pretty much always be able to change what's in the container but not change specific elements, then I'd suggest putting a wrapper class around the container. In the case of vector, wrap it and make the subscript operator return a const ref instead of a non-const ref - either that or a copy. So, assuming that you created a templatized version, your subscript operator would look something like this:
const T& operator[](size_t i) const
{
return _container[i];
}
That way, you can update the container itself, but you can't change its individual elements. And as long as you declare all of the functions inline, it shouldn't be much of a performance hit (if any at all) to have the wrapper.
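Filled out, such a wrapper might look like this (push_only_vector is a made-up name, and this is only a sketch):
#include <cstddef>
#include <vector>
template <typename T>
class push_only_vector {
public:
    void push_back(const T& value) { _container.push_back(value); }
    const T& operator[](std::size_t i) const { return _container[i]; }
    std::size_t size() const { return _container.size(); }
private:
    std::vector<T> _container;
};
// Usage mirroring the question's wish list:
//   push_only_vector<int> v;
//   v.push_back(42); // ok
//   int n = v[0];    // ok
//   v[0] = 1;        // does not compile: operator[] returns a const reference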
You can't create a vector of const ints, and it'd be pretty useless even if you could. If I remove the second int, then everything from there on is shifted down one -- read: modified -- making it impossible to guarantee that v[5] has the same value on two different occasions.
Add to that, a const can't be assigned to after it's declared, short of casting away the constness. And if you wanna do that, why are you using const in the first place?
You're going to need to write your own class. You could certainly use std::vector as your internal implementation. Then just implement the const interface and those few non-const functions you need.
Although this doesn't meet all of your requirements (being able to sort), try a constant vector:
int values[] = {1, 3, 5, 2, 4, 6};
const std::vector<int> IDs(values, values + sizeof(values) / sizeof(values[0]));
Although, you may want to use a std::list. With the list, the values don't need to change, only the links to them. Sorting is accomplished by changing the order of the links.
You may have to expend some brain power and write your own. :-(
I would have all my const objects in a standard array.
Then use a vector of pointers into the array.
A small utility class just to help you not have to de-reference the objects, and hey presto.
#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>
class XPointer
{
public:
XPointer(int const& data)
: m_data(&data)
{}
operator int const&() const
{
return *m_data;
}
private:
int const* m_data;
};
int const data[] = { 15, 17, 22, 100, 3, 4};
std::vector<XPointer> sorted(data,data+6);
int main()
{
std::sort(sorted.begin(), sorted.end());
std::copy(sorted.begin(), sorted.end(), std::ostream_iterator<int>(std::cout, ", "));
int x = sorted[1];
}
I'm with Noah: wrap the vector with a class that exposes only what you want to allow.
If you don't need to dynamically add objects to the vector, consider std::tr1::array.
If constness is important to you in this instance I think you probably want to work with immutable types all the way up. Conceptually you'll have a fixed size, const array of const ints. Any time you need to change it (e.g. to add or remove elements, or to sort) you'll need to make a copy of the array with the operation performed and use that instead.
While this is very natural in a functional language, it doesn't seem quite "right" in C++. Getting efficient implementations of sort, for example, could be tricky - but you don't say what your performance requirements are.
Whether or not you consider this route worth it from a performance/custom-code perspective, I believe it is the correct approach.
After that, holding the values by non-const pointer/smart pointer is probably best (but has its own overhead, of course).
I've been thinking a bit on this issue, and it seems that your requirement is off.
You don't want to add immutable values to your vector:
std::vector<const int> vec = /**/;
std::vector<const int>::const_iterator first = vec.begin();
std::sort(vec.begin(), vec.end());
assert(*vec.begin() == *first); // false, even though `const int`
What you really want is your vector to hold a constant collection of values, in a modifiable order, which cannot be expressed by the std::vector<const int> syntax even if it worked.
I am afraid that it's an extremely specific task that would require a dedicated class.
It is true that Assignable is one of the standard requirements for the vector element type, and const int is not assignable. However, I would expect that in a well-thought-through implementation the compilation should fail only if the code explicitly relies on assignment. For std::vector that would be insert and erase, for example.
In reality, in many implementations the compilation fails even if you are not using these methods. For example, Comeau fails to compile the plain std::vector<const int> a; because the corresponding specialization of std::allocator fails to compile. It reports no immediate problems with std::vector itself.
I believe it is a valid problem. The library-provided implementation of std::allocator is supposed to fail if the type parameter is const-qualified. (I wonder if it is possible to make a custom implementation of std::allocator to force the whole thing to compile.) (It would also be interesting to know how VS manages to compile it.) Again, with Comeau, std::vector<const int> fails to compile for the very same reasons std::allocator<const int> fails to compile, and according to the specification of std::allocator it must fail to compile.
Of course, in any case any implementation has the right to fail to compile std::vector<const int> since it is allowed to fail by the language specification.
Using just an unspecialized vector, this can't be done. Sorting is done by using assignment. So the same code that makes this possible:
sort(v.begin(), v.end());
...also makes this possible:
v[1] = 123;
You could derive a class const_vector from std::vector that overloads any method that returns a reference, and make it return a const reference instead. To do your sort, downcast back to std::vector.
std::vector of constant objects will probably fail to compile due to the Assignable requirement, as a constant object cannot be assigned. The same is true for move assignment. This is also a problem I frequently face when working with a vector-based map such as boost's flat_map or Loki's AssocVector, as their internal implementation is std::vector<std::pair<const Key, Value>>.
Thus it is almost impossible to follow the const key requirement of map, which can easily be implemented for any node-based map.
However, one can ask whether std::vector<const T> means the vector should store objects of type const T, or whether it merely needs to return a non-mutable interface when its elements are accessed.
In that case, an implementation of std::vector<const T> is possible which follows the Assignable/MoveAssignable requirements, as it stores objects of type T rather than const T. The standard typedefs and the allocator type need to be modified a little to support the standard requirements. Though to support such a thing for a vector_map or flat_map, one probably needs considerable changes in the std::pair interface, as it exposes the member variables first & second directly.
Compilation fails because push_back() (for instance) is basically
underlying_array[size()] = passed_value;
where both operands are T&. If T is const X, that can't work.
Having const elements seems right in principle, but in practice it's unnatural, and the specifications don't say it should be supported, so it's not there. At least not in the standard library (because then it would be in vector).