changing pointer members of a subroutine argument with intent(in)

changing pointer members of a subroutine argument with intent(in) - fortran

I'm writing a sparse matrix library in Fortran for fun but came into a little snag. I have a subroutine for matrix multiplication with the interface
subroutine matvec(A,x,y)
class(sparse_matrix), intent(in) :: A
real(double_precision), intent(in) :: x(:)
real(double_precision), intent(inout) :: y(:)
{etc.}
This uses a sparse matrix type that I've defined myself, the implementation of which is unimportant. Now, I can make things nicer and have a lot less code if A contains an object called iterator:
type :: sparse_matrix
type(matrix_iterator) :: iterator
{etc.}
which stores a few variables that keep track of things during matvec. But, if I change the state of iterator and in turn the state of A during matrix multiplication, the compiler will throw a fit because A has intent(in) for that subroutine.
Suppose I change things around and instead define
type :: sparse_matrix
type(matrix_iterator), pointer :: iterator
{etc.}
It's no problem if I change the state of iterator during a procedure in which a matrix has intent(in), because the value of the pointer to iterator doesn't change; only the memory stored at that address is affected. This is confirmed by making a reduced test case, which compiles and runs just fine using GCC.
Am I correct in thinking this is an appropriate fix? Or should I change the subroutine so that A has intent(inout)? The fact that it compiled with GCC doesn't necessarily mean it's standard-compliant, nor does it mean it's good programming practice.
To make an analogy with C, suppose I had a function foo(int* const p). If I wrote
*p = 42;
this would be ok, as the value of the pointer doesn't change, only the data stored at the address pointed to. On the other hand, I couldn't write
p = &my_var;
because it's a constant pointer.

Yes, it is OK. Actually this practice is well known and it is used, for example, when doing reference counting memory management, because the right hand side of defined assignment is an intent(in) expression, but you must be able to decrease the reference count in it.

Related

Why is C++ auto risky [duplicate]

It seems that auto was a fairly significant feature to be added in C++11 that seems to follow a lot of the newer languages. As with a language like Python, I have not seen any explicit variable declaration (I am not sure if it is possible using Python standards).
Is there a drawback to using auto to declare variables instead of explicitly declaring them?

The question is about drawbacks of auto, so this answer highlights some of those. A drawback of using a programming language feature (in this case, a facility associated with a language keyword) does not mean that feature is unacceptable, nor does it mean that feature should be avoided entirely. It means there are disadvantages along with advantages, so a decision to use auto type deduction over alternatives must consider engineering trade-offs.
When used well, auto has several advantages as well - which is not the subject of the question. The drawbacks result from ease of abuse, and from increased potential for code to behave in unintended or unexpected ways.
The main drawback is that, by using auto, you don't necessarily know the type of object being created. There are also occasions where the programmer might expect the compiler to deduce one type, but the compiler adamantly deduces another.
Given a declaration like
auto result = CallSomeFunction(x,y,z);
you don't necessarily have knowledge of what type result is. It might be an int. It might be a pointer. It might be something else. All of those support different operations. You can also dramatically change the code by a minor change like
auto result = CallSomeFunction(a,y,z);
because, depending on what overloads exist for CallSomeFunction() the type of result might be completely different - and subsequent code may therefore behave completely differently than intended. You might suddenly trigger error messages in later code(e.g. subsequently trying to dereference an int, trying to change something which is now const). The more sinister change is where your change sails past the compiler, but subsequent code behaves in different and unknown - possibly buggy - ways. For example (as noted by sashoalm in comments) if the deduced type of a variable changes an integral type to a floating point type - and subsequent code is unexpectedly and silently affected by loss of precision.
Not having explicit knowledge of the type of some variables therefore makes it harder to rigorously justify a claim that the code works as intended. This means more effort to justify claims of "fit for purpose" in high-criticality (e.g. safety-critical or mission-critical) domains.
The other, more common drawback, is the temptation for a programmer to use auto as a blunt instrument to force code to compile, rather than thinking about what the code is doing, and working to get it right.

This isn't a drawback of auto in a principled way exactly, but in practical terms it seems to be an issue for some. Basically, some people either: a) treat auto as a savior for types and shut their brain off when using it, or b) forget that auto always deduces to value types. This causes people to do things like this:
auto x = my_obj.method_that_returns_reference();
Oops, we just deep copied some object. It's often either a bug or a performance fail. Then, you can swing the other way too:
const auto& stuff = *func_that_returns_unique_ptr();
Now you get a dangling reference. These problems aren't caused by auto at all, so I don't consider them legitimate arguments against it. But it does seem like auto makes these issue more common (from my personal experience), for the reasons I listed at the beginning.
I think given time people will adjust, and understand the division of labor: auto deduces the underlying type, but you still want to think about reference-ness and const-ness. But it's taking a bit of time.

Other answers are mentioning drawbacks like "you don't really know what the type of a variable is." I'd say that this is largely related to sloppy naming convention in code. If your interfaces are clearly-named, you shouldn't need to care what the exact type is. Sure, auto result = callSomeFunction(a, b); doesn't tell you much. But auto valid = isValid(xmlFile, schema); tells you enough to use valid without having to care what its exact type is. After all, with just if (callSomeFunction(a, b)), you wouldn't know the type either. The same with any other subexpression temporary objects. So I don't consider this a real drawback of auto.
I'd say its primary drawback is that sometimes, the exact return type is not what you want to work with. In effect, sometimes the actual return type differs from the "logical" return type as an implementation/optimisation detail. Expression templates are a prime example. Let's say we have this:
SomeType operator* (const Matrix &lhs, const Vector &rhs);
Logically, we would expect SomeType to be Vector, and we definitely want to treat it as such in our code. However, it is possible that for optimisation purposes, the algebra library we're using implements expression templates, and the actual return type is this:
MultExpression<Matrix, Vector> operator* (const Matrix &lhs, const Vector &rhs);
Now, the problem is that MultExpression<Matrix, Vector> will in all likelihood store a const Matrix& and const Vector& internally; it expects that it will convert to a Vector before the end of its full-expression. If we have this code, all is well:
extern Matrix a, b, c;
extern Vector v;
void compute()
{
Vector res = a * (b * (c * v));
// do something with res
}
However, if we had used auto here, we could get in trouble:
void compute()
{
auto res = a * (b * (c * v));
// Oops! Now `res` is referring to temporaries (such as (c * v)) which no longer exist
}

It makes your code a little harder, or tedious, to read.
Imagine something like that:
auto output = doSomethingWithData(variables);
Now, to figure out the type of output, you'd have to track down signature of doSomethingWithData function.

One of the drawbacks is that sometimes you can't declare const_iterator with auto. You will get ordinary (non const) iterator in this example of code taken from this question:
map<string,int> usa;
//...init usa
auto city_it = usa.find("New York");

Like this developer, I hate auto. Or rather, I hate how people misuse auto.
I'm of the (strong) opinion that auto is for helping you write generic code, not for reducing typing.
C++ is a language whose goal is to let you write robust code, not to minimize development time.
This is fairly obvious from many features of C++, but unfortunately a few of the newer ones like auto that reduce typing mislead people into thinking they should start being lazy with typing.
In pre-auto days, people used typedefs, which was great because typedef allowed the designer of the library to help you figure out what the return type should be, so that their library works as expected. When you use auto, you take away that control from the class's designer and instead ask the compiler to figure out what the type should be, which removes one of the most powerful C++ tools from the toolbox and risks breaking their code.
Generally, if you use auto, it should be because your code works for any reasonable type, not because you're just too lazy to write down the type that it should work with.
If you use auto as a tool to help laziness, then what happens is that you eventually start introducing subtle bugs in your program, usually caused by implicit conversions that did not happen because you used auto.
Unfortunately, these bugs are difficult to illustrate in a short example here because their brevity makes them less convincing than the actual examples that come up in a user project -- however, they occur easily in template-heavy code that expect certain implicit conversions to take place.
If you want an example, there is one here. A little note, though: before being tempted to jump and criticize the code: keep in mind that many well-known and mature libraries have been developed around such implicit conversions, and they are there because they solve problems that can be difficult if not impossible to solve otherwise. Try to figure out a better solution before criticizing them.

auto does not have drawbacks per se, and I advocate to (hand-wavily) use it everywhere in new code. It allows your code to consistently type-check, and consistently avoid silent slicing. (If B derives from A and a function returning A suddenly returns B, then auto behaves as expected to store its return value)
Although, pre-C++11 legacy code may rely on implicit conversions induced by the use of explicitly-typed variables. Changing an explicitly-typed variable to auto might change code behaviour, so you'd better be cautious.

Keyword auto simply deduce the type from the return value. Therefore, it is not equivalent with a Python object, e.g.
# Python
a
a = 10 # OK
a = "10" # OK
a = ClassA() # OK
// C++
auto a; // Unable to deduce variable a
auto a = 10; // OK
a = "10"; // Value of const char* can't be assigned to int
a = ClassA{} // Value of ClassA can't be assigned to int
a = 10.0; // OK, implicit casting warning
Since auto is deduced during compilation, it won't have any drawback at runtime whatsoever.

What no one mentioned here so far, but for itself is worth an answer if you asked me.
Since (even if everyone should be aware that C != C++) code written in C can easily be designed to provide a base for C++ code and therefore be designed without too much effort to be C++ compatible, this could be a requirement for design.
I know about some rules where some well defined constructs from C are invalid for C++ and vice versa. But this would simply result in broken executables and the known UB-clause applies which most times is noticed by strange loopings resulting in crashes or whatever (or even may stay undetected, but that doesn't matter here).
But auto is the first time1 this changes!
Imagine you used auto as storage-class specifier before and transfer the code. It would not even necessarily (depending on the way it was used) "break"; it actually could silently change the behaviour of the program.
That's something one should keep in mind.
1At least the first time I'm aware of.

As I described in this answer auto can sometimes result in funky situations you didn't intend.
You have to explictly say auto& to have a reference type while doing just auto can create a pointer type. This can result in confusion by omitting the specifier all together, resulting in a copy of the reference instead of an actual reference.

One reason that I can think of is that you lose the opportunity to coerce the class that is returned. If your function or method returned a long 64 bit, and you only wanted a 32 unsigned int, then you lose the opportunity to control that.

I think auto is good when used in a localized context, where the reader easily & obviously can deduct its type, or well documented with a comment of its type or a name that infer the actual type. Those who don't understand how it works might take it in the wrong ways, like using it instead of template or similar. Here are some good and bad use cases in my opinion.
void test (const int & a)
{
// b is not const
// b is not a reference
auto b = a;
// b type is decided by the compiler based on value of a
// a is int
}
Good Uses
Iterators
std::vector<boost::tuple<ClassWithLongName1,std::vector<ClassWithLongName2>,int> v();
..
std::vector<boost::tuple<ClassWithLongName1,std::vector<ClassWithLongName2>,int>::iterator it = v.begin();
// VS
auto vi = v.begin();
Function Pointers
int test (ClassWithLongName1 a, ClassWithLongName2 b, int c)
{
..
}
..
int (*fp)(ClassWithLongName1, ClassWithLongName2, int) = test;
// VS
auto *f = test;
Bad Uses
Data Flow
auto input = "";
..
auto output = test(input);
Function Signature
auto test (auto a, auto b, auto c)
{
..
}
Trivial Cases
for(auto i = 0; i < 100; i++)
{
..
}

Another irritating example:
for (auto i = 0; i < s.size(); ++i)
generates a warning (comparison between signed and unsigned integer expressions [-Wsign-compare]), because i is a signed int. To avoid this you need to write e.g.
for (auto i = 0U; i < s.size(); ++i)
or perhaps better:
for (auto i = 0ULL; i < s.size(); ++i)

I'm surprised nobody has mentioned this, but suppose you are calculating the factorial of something:
#include <iostream>
using namespace std;
int main() {
auto n = 40;
auto factorial = 1;
for(int i = 1; i <=n; ++i)
{
factorial *= i;
}
cout << "Factorial of " << n << " = " << factorial <<endl;
cout << "Size of factorial: " << sizeof(factorial) << endl;
return 0;
}
This code will output this:
Factorial of 40 = 0
Size of factorial: 4
That was definetly not the expected result. That happened because auto deduced the type of the variable factorial as int because it was assigned to 1.

range based for loop with an auto reference to a pointer

With regard to the following code I would like to have some clarification. We have an array of pointers to a class. Next we loop over the array using a range based loop. For this range based loop auto& is used. But next when we use the element a we can use the arrow operator to call a function.
This code is compiled using C++ 11.
// Definition of an array of pointers to some class.
some_class* array[10];
// The array of pointers is set.
// Loop over the array.
for(auto& a : array)
{
// Call some function using the arrow operator.
a->some_func();
}
Is my understanding correct that auto& a is a reference to a pointer? Is this not a bit over kill. Would using auto a not create a copy of the pointer and take up the same amount of memory?

Your code compiles fine.
Nevertheless, there is not really a point using a reference here, if you don't like to change it.
Best practise here is
Use const auto &T if the content shall not be changed. The reference is important, if the type T of auto is large. Otherwise you will copy the object.
Use auto & T if you like to change the content of the container you are iterating.

Is my understanding correct that auto& a is a reference to a pointer?
Yes that's correct
Would using auto a not create a copy of the pointer and take up the same amount of memory
Think of references as an alias for the variable, that is, think of it as a different name.
as for this -> not create a copy of the pointer`
A pointer is very light weight and copying a pointer is relatively cheap (that's how views are implemented, pointers to sequences). If the object underline the container you are iterating is a fundamental type or a pointer to some type, auto is enough. In cases where the underline object of a container is a heavy weight object, then auto& is a better alternative (and ofc you can add const qualifier if you don't want to modify it).

Is there a downside to declaring variables with auto in C++?

It seems that auto was a fairly significant feature to be added in C++11 that seems to follow a lot of the newer languages. As with a language like Python, I have not seen any explicit variable declaration (I am not sure if it is possible using Python standards).
Is there a drawback to using auto to declare variables instead of explicitly declaring them?

This isn't a drawback of auto in a principled way exactly, but in practical terms it seems to be an issue for some. Basically, some people either: a) treat auto as a savior for types and shut their brain off when using it, or b) forget that auto always deduces to value types. This causes people to do things like this:
auto x = my_obj.method_that_returns_reference();
Oops, we just deep copied some object. It's often either a bug or a performance fail. Then, you can swing the other way too:
const auto& stuff = *func_that_returns_unique_ptr();
Now you get a dangling reference. These problems aren't caused by auto at all, so I don't consider them legitimate arguments against it. But it does seem like auto makes these issue more common (from my personal experience), for the reasons I listed at the beginning.
I think given time people will adjust, and understand the division of labor: auto deduces the underlying type, but you still want to think about reference-ness and const-ness. But it's taking a bit of time.

Other answers are mentioning drawbacks like "you don't really know what the type of a variable is." I'd say that this is largely related to sloppy naming convention in code. If your interfaces are clearly-named, you shouldn't need to care what the exact type is. Sure, auto result = callSomeFunction(a, b); doesn't tell you much. But auto valid = isValid(xmlFile, schema); tells you enough to use valid without having to care what its exact type is. After all, with just if (callSomeFunction(a, b)), you wouldn't know the type either. The same with any other subexpression temporary objects. So I don't consider this a real drawback of auto.
I'd say its primary drawback is that sometimes, the exact return type is not what you want to work with. In effect, sometimes the actual return type differs from the "logical" return type as an implementation/optimisation detail. Expression templates are a prime example. Let's say we have this:
SomeType operator* (const Matrix &lhs, const Vector &rhs);
Logically, we would expect SomeType to be Vector, and we definitely want to treat it as such in our code. However, it is possible that for optimisation purposes, the algebra library we're using implements expression templates, and the actual return type is this:
MultExpression<Matrix, Vector> operator* (const Matrix &lhs, const Vector &rhs);
Now, the problem is that MultExpression<Matrix, Vector> will in all likelihood store a const Matrix& and const Vector& internally; it expects that it will convert to a Vector before the end of its full-expression. If we have this code, all is well:
extern Matrix a, b, c;
extern Vector v;
void compute()
{
Vector res = a * (b * (c * v));
// do something with res
}
However, if we had used auto here, we could get in trouble:
void compute()
{
auto res = a * (b * (c * v));
// Oops! Now `res` is referring to temporaries (such as (c * v)) which no longer exist
}

It makes your code a little harder, or tedious, to read.
Imagine something like that:
auto output = doSomethingWithData(variables);
Now, to figure out the type of output, you'd have to track down signature of doSomethingWithData function.

One of the drawbacks is that sometimes you can't declare const_iterator with auto. You will get ordinary (non const) iterator in this example of code taken from this question:
map<string,int> usa;
//...init usa
auto city_it = usa.find("New York");

Like this developer, I hate auto. Or rather, I hate how people misuse auto.
I'm of the (strong) opinion that auto is for helping you write generic code, not for reducing typing.
C++ is a language whose goal is to let you write robust code, not to minimize development time.
This is fairly obvious from many features of C++, but unfortunately a few of the newer ones like auto that reduce typing mislead people into thinking they should start being lazy with typing.
In pre-auto days, people used typedefs, which was great because typedef allowed the designer of the library to help you figure out what the return type should be, so that their library works as expected. When you use auto, you take away that control from the class's designer and instead ask the compiler to figure out what the type should be, which removes one of the most powerful C++ tools from the toolbox and risks breaking their code.
Generally, if you use auto, it should be because your code works for any reasonable type, not because you're just too lazy to write down the type that it should work with.
If you use auto as a tool to help laziness, then what happens is that you eventually start introducing subtle bugs in your program, usually caused by implicit conversions that did not happen because you used auto.
Unfortunately, these bugs are difficult to illustrate in a short example here because their brevity makes them less convincing than the actual examples that come up in a user project -- however, they occur easily in template-heavy code that expect certain implicit conversions to take place.
If you want an example, there is one here. A little note, though: before being tempted to jump and criticize the code: keep in mind that many well-known and mature libraries have been developed around such implicit conversions, and they are there because they solve problems that can be difficult if not impossible to solve otherwise. Try to figure out a better solution before criticizing them.

auto does not have drawbacks per se, and I advocate to (hand-wavily) use it everywhere in new code. It allows your code to consistently type-check, and consistently avoid silent slicing. (If B derives from A and a function returning A suddenly returns B, then auto behaves as expected to store its return value)
Although, pre-C++11 legacy code may rely on implicit conversions induced by the use of explicitly-typed variables. Changing an explicitly-typed variable to auto might change code behaviour, so you'd better be cautious.

Keyword auto simply deduce the type from the return value. Therefore, it is not equivalent with a Python object, e.g.
# Python
a
a = 10 # OK
a = "10" # OK
a = ClassA() # OK
// C++
auto a; // Unable to deduce variable a
auto a = 10; // OK
a = "10"; // Value of const char* can't be assigned to int
a = ClassA{} // Value of ClassA can't be assigned to int
a = 10.0; // OK, implicit casting warning
Since auto is deduced during compilation, it won't have any drawback at runtime whatsoever.

What no one mentioned here so far, but for itself is worth an answer if you asked me.
Since (even if everyone should be aware that C != C++) code written in C can easily be designed to provide a base for C++ code and therefore be designed without too much effort to be C++ compatible, this could be a requirement for design.
I know about some rules where some well defined constructs from C are invalid for C++ and vice versa. But this would simply result in broken executables and the known UB-clause applies which most times is noticed by strange loopings resulting in crashes or whatever (or even may stay undetected, but that doesn't matter here).
But auto is the first time1 this changes!
Imagine you used auto as storage-class specifier before and transfer the code. It would not even necessarily (depending on the way it was used) "break"; it actually could silently change the behaviour of the program.
That's something one should keep in mind.
1At least the first time I'm aware of.

As I described in this answer auto can sometimes result in funky situations you didn't intend.
You have to explictly say auto& to have a reference type while doing just auto can create a pointer type. This can result in confusion by omitting the specifier all together, resulting in a copy of the reference instead of an actual reference.

One reason that I can think of is that you lose the opportunity to coerce the class that is returned. If your function or method returned a long 64 bit, and you only wanted a 32 unsigned int, then you lose the opportunity to control that.

I think auto is good when used in a localized context, where the reader easily & obviously can deduct its type, or well documented with a comment of its type or a name that infer the actual type. Those who don't understand how it works might take it in the wrong ways, like using it instead of template or similar. Here are some good and bad use cases in my opinion.
void test (const int & a)
{
// b is not const
// b is not a reference
auto b = a;
// b type is decided by the compiler based on value of a
// a is int
}
Good Uses
Iterators
std::vector<boost::tuple<ClassWithLongName1,std::vector<ClassWithLongName2>,int> v();
..
std::vector<boost::tuple<ClassWithLongName1,std::vector<ClassWithLongName2>,int>::iterator it = v.begin();
// VS
auto vi = v.begin();
Function Pointers
int test (ClassWithLongName1 a, ClassWithLongName2 b, int c)
{
..
}
..
int (*fp)(ClassWithLongName1, ClassWithLongName2, int) = test;
// VS
auto *f = test;
Bad Uses
Data Flow
auto input = "";
..
auto output = test(input);
Function Signature
auto test (auto a, auto b, auto c)
{
..
}
Trivial Cases
for(auto i = 0; i < 100; i++)
{
..
}

Another irritating example:
for (auto i = 0; i < s.size(); ++i)
generates a warning (comparison between signed and unsigned integer expressions [-Wsign-compare]), because i is a signed int. To avoid this you need to write e.g.
for (auto i = 0U; i < s.size(); ++i)
or perhaps better:
for (auto i = 0ULL; i < s.size(); ++i)

I'm surprised nobody has mentioned this, but suppose you are calculating the factorial of something:
#include <iostream>
using namespace std;
int main() {
auto n = 40;
auto factorial = 1;
for(int i = 1; i <=n; ++i)
{
factorial *= i;
}
cout << "Factorial of " << n << " = " << factorial <<endl;
cout << "Size of factorial: " << sizeof(factorial) << endl;
return 0;
}
This code will output this:
Factorial of 40 = 0
Size of factorial: 4
That was definetly not the expected result. That happened because auto deduced the type of the variable factorial as int because it was assigned to 1.

Fortran performance when passing array slices as arguments

I like fortran's array-slicing notation (array(1:n)), but I wonder whether I take a performance hit if I use them when it's not necessary.
Consider, for example, this simple quicksort code (It works, but obviously it's not taking care to pick a good pivot):
recursive subroutine quicksort(array, size)
real, dimension(:), intent(inout) :: array
integer, intent(in) :: size
integer :: p
if (size > 1) then
p = partition(array, size, 1)
call quicksort(array(1:p-1), p-1)
call quicksort(array(p+1:size), size-p)
end if
end subroutine quicksort
function partition(array, size, pivotdex) result(p)
real, dimension(:), intent(inout) :: array
integer, intent(in) :: size, pivotdex
real :: pivot
integer :: i, p
pivot = array(pivotdex)
call swap(array(pivotdex), array(size))
p=1
do i=1,size-1
if (array(i) < pivot) then
call swap(array(i), array(p))
p=p+1
end if
end do
call swap(array(p), array(size))
end function partition
subroutine swap(a, b)
real, intent(inout) :: a, b
real :: temp
temp = a
a = b
b = temp
end subroutine swap
I could just as easily pass the whole array along with the indices of where the recursive parts should be working, but I like the code this way. When I call quicksort(array(1:p-1), p-1), however, does it make a temporary array to operate on, or does it just make a shallow reference structure or something like that? Is this a sufficiently efficient solution?
This question is related, but it seems like it makes temporary arrays because of the strided slice and explicit-sized dummy variables, so I'm safe from that, right?

Regarding your question of efficiency: Yes, for most cases, using assumed-shape arrays and array slices is indeed a sufficiently efficient solution.
There is some overhead involved. Assumed-shape arrays require an array descriptor (sometimes also called "dope vector"). This array descriptor contains information about dimensions and strides, and setting it up requires some work.
The code in the called procedure with an assume-shape dummy argument has to take both unity stride (the usual case) and non-unity stride into account. Somebody, somewhere, might want to call your sorting routine with an actual argument of somearray(1:100:3) because he only wants to sort every third element of the array. Unusual, but legal. Code that cannot depend on unity stride may have some performance penalty.
Having said that, compilers, especially those using link-time optimization, are quite good nowadays in inlining and/or stripping away all the extra work, and also tend to clone procedures for special-casing unity strides.
So, as a rule, clarity (and assumed-shape arrays) should win. Just keep in mind that the old-fashioned way of passing array arguments may, in some circumstances, gain some extra efficiency.

Your subarray
array(1:p-1)
is contiguous, provided array is contiguous.
Also, you use an assumed shape array dummy argument
real, dimension(:), intent(inout) :: array
There is no need for a temporary. Just the descriptor of an assumed shape array is passed. And as your subarray is contiguous, even an assumed size, or explicit size, or assumed size dummy argument with the contiguous attribute would be OK.

const immutable BigInt and range.join in D

I'm learning D and I have been playing with more and more functions and tools defined in phobos. I came across two functions that don't work when the parameters are const or immutable.
BigInt i = "42", j = "42";
writeln(i + j);
// This works, except when I add a const/immutable qualifier to i and j
// When const: main.d(23): Error: incompatible types for ((i) + (j)): 'const(BigInt)' and 'const(BigInt)'
// When immutable: main.d(23): Error: incompatible types for ((i) + (j)): 'immutable(BigInt)' and 'immutable(BigInt)'
The same happens with the std.array.join function.
int[] arr1 = [1, 2, 3, 4];
int[] arr2 = [5, 6];
writeln(join([arr1, arr2]));
// Again, the const and immutable errors are almost identical
// main.d(28): Error: template std.array.join(RoR, R)(RoR ror, R sep) if (isInputRange!RoR && isInputRange!(ElementType!RoR) && isInputRange!R && is(Unqual!(ElementType!(ElementType!RoR)) == Unqual!(ElementType!R))) cannot deduce template function from argument types !()(const(int[])[])
This is quite surprising to me. I have a C++ background so I usually write const everywhere, but it seems I can't do it in D.
As a D "user", I see that as a bug. Can someone explain me why this is not a bug and how I should call these functions with const/immutable data? Thanks.

First off, I should say that D's const is very different from C++'s const. Like C++, ideally you'd mark as much with it as possible, but unlike with C++, there are serious consequences to marking something const in D.
In D, const is transitive, so it affects the entire type, not just the top level, and unlike in C++, you can't mutate it by casting it away or by using mutable (it's undefined behavior and will cause serious bugs if you try to cast away const from an object and then mutate it). The result of those two things is that there are many places where you just can't use const in D without making it impossible to do certain things.
D's const provides real, solid guarantees that you can't mutate the object through that reference in any way, shape or form, whereas C++'s const just makes it so that you can't mutate anything which is const by accident, but you can easily cast away const and mutate the object (with defined behavior), or pieces of the object could be changed internally by const functions thanks to mutable. It's also trivial in C++ to return a mutable reference to the internals of a class from a const function even without casting or mutable (e.g. returning vector<int*> from a const function - the vector can't be mutated but everything it refers to can be). None of those are possible in D, as D guarantees full transitive const, and providing those guarantees makes it so that any circumstance where you need to get at something mutable from something const isn't going to work unless you create an entirely new copy of it.
You should probably read over the answers to these questions:
Logical const in D
What is the difference between const and immutable in D?
So, if you're slapping const on everything in D, you'll find that some things just won't work. Using const as much as you can is great for the same reasons that it is in C++, but the cost is much higher, so you have to be more restrictive about what you mark with const.
Now, as to your specific issue here. BigInt is supposed to work with const and immutable but does not currently. There are a few open bugs on the issue. I believe that a lot of the problem stems from the fact that BigInt uses COW internally, and that doesn't play nicely with const or immutable. Fortunately, there's a pull request on github at the moment which fixes at least some of the problems, so I expect that BigInt will work with const and immutable in the near future, but for the moment, you can't.
As for join, your example compiles just fine, so you copied your code wrong. There is no const in your example. Perhaps you meant
const int[] arr1 = [1, 2, 3, 4];
const int[] arr2 = [5, 6];
writeln(join([arr1, arr2]));
And that doesn't compile. And that's because you aren't passing a valid range of ranges to join. The type that you'd be passing to join in that case would be const(int[])[]. The outer array is mutable, so it's fine, but the inner ones - the ranges that you're trying to join together - are const, and nothing which is const can be a valid range, and that's because popFront won't work. For something to be a valid input range, this code must compile for it (and this is taken from inside of std.range.isInputRange).
R r = void; // can define a range object
if (r.empty) {} // can test for empty
r.popFront(); // can invoke popFront()
auto h = r.front; // can get the front of the range
const(int[]) won't work with popFront as isInputRange requires. e.g.
const int[] arr = [1, 2, 3];
arr.popFront();
won't compile, so isInputRange is false, and join won't compile with it.
Now, fortunately, arrays are a bit special in that the compiler understands them, so the compiler knows that it's perfectly legit to turn const(int[]) into const(int)[] when you slice it. That is, it knows that giving you a tail-const slice won't be able to affect the original array (because the result is a new array, and while the elements are shared between the arrays, they're all const, so they still can't be mutated). So, the type of arr[] would be const(int)[] instead of const(int[]), and the type of [arr1[], arr2[]] is const(int)[][], which will work with join. So, you can do
const int[] arr1 = [1, 2, 3, 4];
const int[] arr2 = [5, 6];
writeln(join([arr1[], arr2[]]));
and your code will work just fine. However, that's just because you're using arrays. If you were dealing with user-defined ranges, the moment you made one of them const, you'd be stuck. This code won't compile
const arr1 = filter!"true"([1, 2, 3, 4]);
const arr2 = filter!"true"([5, 6]);
writeln(join([arr1[], arr2[]]));
And that's because the compiler doesn't know that it can safely get a tail-const slice from a ser-defined type. It needs to know that it can convert const MyRange!E to MyRange!(const E) and have the proper semantics. And it can't know that, because those are two different template instantiations, and they could have completely different internals. The programmer writing MyRange would have to be able to write opSlice such that it returns MyRange(const E) when the type is const MyRange!E or const MyRange!(const E), and that's actually hard to do (if nothing else, it very easily results in recursive template instantiations). Some clever use of static if and alias this should make it possible, but it's hard enough to do that pretty much no one does it right now. It's an open question as to how we're going to make it sane for user-defined types to make opSlice return a tail-const range. And until that question is solved, const and ranges just don't mix, because as soon as you get a const range, there's no way to get a tail-const slice of it which could have popFront called on it. So, once your range is const, it's const.
Arrays are special, since they're built-in, so you can get away with using const with them as long as you slice them at the appropriate times (and the fact that templates are instantiated with their slice type rather than their original type helps), but in general, if you're using a range, just assume that you can't make it const. Hopefully, that changes someday, but for now, that's the way that it is.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js