C#7 safeguard for deconstruction? - tuples

Is there a way to protect oneself from deconstruction side effects when rearranging the order of elements in a tuple value?
For instance, the second console's writeline would output the wrong results.
public void Test()
{
var result = GetResult(10, 5);
Console.WriteLine($"sum: {result.sum}, substraction : {result.substraction}");
(int substraction, int sum) = result;
Console.WriteLine($"sum: {sum}, substraction : {substraction}");
}
//Old version
//private (int substraction, int sum) GetResult(int a, int b) => (a - b, a + b);
private (int sum, int substraction) GetResult(int a, int b) => (a + b, a - b);

"Element names are semantically insignificant except when used directly."
(From "C# Tuples. More about element names", by Vladimir Sadov who is a member of the C# language team).
Except in a few cases (as Vladimir explains in his post), the element names are ignored by the compiler unless explicitly accessed. This isn't just limited to deconstruction. The following code compiles just fine:
void Example()
{
(int a, int b) GetTuple() => (1, 2);
void PrintTuple((int b, int a) t) => Console.WriteLine($"a:{t.a}, b:{t.b}");
var t = GetTuple();
PrintTuple(t);
}
And the output is a:2, b:1.
So the compiler itself offers you no protection against mismatching names in tuples and deconstructions. That's not to say no protection is possible; it just needs someone (e.g. you) to write a Roslyn analyzer that detects name swaps like this and reports them as warnings.

This feature (warning for element names mismatches that are clearly accidental) did not make it into C# 7.0. You can voice your interest and follow the progress at https://github.com/dotnet/roslyn/issues/14217
It is also possible that some analyzers will fill this gap.
Note that this sort of accident is not unique to tuples or deconstruction: you can just as easily pass arguments in the wrong order in an invocation (for in or out parameters).


Is it possible / advisable to return a range?

I'm using the ranges library to help filter data in my classes, like this:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(v) {}
std::vector<int> getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
return std::vector<int>(evens.begin(), evens.end());
}
private:
std::vector<int> vec;
};
In this case, a new vector is constructed in the getEvens() function. To save on this overhead, I'm wondering if it is possible / advisable to return the range directly from the function?
class MyClass
{
public:
using RangeReturnType = ???;
MyClass(std::vector<int> v) : vec(v) {}
RangeReturnType getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
// ...
return evens;
}
private:
std::vector<int> vec;
};
If it is possible, are there any lifetime considerations that I need to take into account?
I am also interested to know if it is possible / advisable to pass a range in as an argument, or to store it as a member variable. Or is the ranges library more intended for use within the scope of a single function?
This was asked in the OP's comment section, but I think I will respond to it in the answer section:
The Ranges library seems promising, but I'm a little apprehensive about this returning auto.
Remember that even with the addition of auto, C++ is a strongly typed language. In your case, since you are returning evens, then the return type will be the same type of evens. (technically it will be the value type of evens, but evens was a value type anyways)
In fact, you probably really don't want to type out the return type manually: std::ranges::filter_view<std::ranges::ref_view<const std::vector<int>>, MyClass::getEvens() const::<decltype([](int i) {return ! (i % 2);})>> (141 characters)
As mentioned by @Caleth in the comments, this in fact wouldn't work either, because the type of evens involves a lambda defined inside the function, and two different lambdas have different types even if they are otherwise identical, so there's literally no way of spelling out the full return type here.
While there might be debate about whether to use auto in different cases, I believe most people would just use auto here. Plus, your evens was declared with auto too, so typing the type out would just make it less readable.
So what are my options if I want to access a subset (for instance even numbers)? Are there any other approaches I should be considering, with or without the Ranges library?
Depending on how you would access the returned data and on the type of the data, you might consider returning std::vector<T*>.
views are really supposed to be viewed from start to end. While you could use views::drop and views::take to limit to a single element, it doesn't provide a subscript operator (yet).
There will also be computational differences. A vector needs to be computed beforehand, whereas views are computed while iterating. So when you do:
for(auto i : myObject.getEvens())
{
std::cout << i;
}
Under the hood, it is basically doing:
for(auto i : myObject.vec)
{
if(!(i % 2)) std::cout << i;
}
Depending on the amount of data and the complexity of the computations, views might be a lot faster than, or about the same as, the vector method. Plus, you can easily apply multiple filters to the same range without iterating through the data multiple times.
In the end, you can always store the view in a vector:
std::vector<int> vec2(evens.begin(), evens.end());
So my suggestion is: if you have the ranges library, then you should use it.
If not, then vector<T>, vector<T*>, vector<index> depending on the size and copiability of T.
There are no restrictions on the usage of components of the STL in the standard. Of course, there are best practices (e.g., string_view instead of string const &).
In this case, I can foresee no problems with handling the view return type directly. That said, the best practices are yet to be decided on since the standard is so new and no compiler has a complete implementation yet.
You're fine to go with the following, in my opinion:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(std::move(v)) {}
auto getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
private:
std::vector<int> vec;
};
As you can see here, a range is just something on which you can call begin and end. Nothing more than that.
For instance, you can use the result of begin(range), which is an iterator, to traverse the range, using the ++ operator to advance it.
In general, looking back at the concept I linked above, you can use a range whenever the surrounding code only needs to be able to call begin and end on it.
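To make the "begin and end, nothing more" point concrete, here is a minimal sketch in plain standard C++ (no ranges library needed). The function name sum_of and the usage function are made up for illustration; the only requirement placed on the argument is that std::begin and std::end can be called on it.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Accepts anything on which std::begin and std::end can be called.
template <typename Range>
long sum_of(const Range& range) {
    long total = 0;
    for (auto it = std::begin(range); it != std::end(range); ++it) {
        total += *it;  // read the current element; ++it then advances
    }
    return total;
}

// Usage sketch: a vector, a list and a plain array all qualify.
void sum_of_usage() {
    const std::vector<int> v{1, 2, 3};
    const std::list<int> l{10, 20};
    const int raw[] = {4, 5};
    assert(sum_of(v) == 6);
    assert(sum_of(l) == 30);
    assert(sum_of(raw) == 9);
}
```

A filter view satisfies the same requirement, which is why code written this way should not care whether it receives a container or a view.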
Whether this is advisable or enough depends on what you need to do with it. Clearly, if your intention is to pass evens to a function which expects a std::vector (for instance it's a function you cannot change, and it calls .push_back on the entity we are talking about), you clearly have to make a std::vector out of filter's output, which I'd do via
auto evens = vec | ranges::views::filter(whatever) | ranges::to_vector;
but if all the function which you pass evens to does is to loop on it, then
return vec | ranges::views::filter(whatever);
is just fine.
As regards lifetime considerations, a view is to a range of values what a pointer is to the pointed-to entity: if the latter is destroyed, the former will be dangling, and making improper use of it will be undefined behavior. This is an erroneous program:
#include <iostream>
#include <range/v3/view/filter.hpp>
#include <string>
#include <vector>
using namespace ranges;
using namespace ranges::views;
auto f() {
// a local vector here
std::vector<std::string> vec{"zero","one","two","three","four","five"};
// return a view on the local vector
return vec | filter([](auto){ return true; });
} // vec is gone ---> the view returned is dangling
int main()
{
// the following throws std::bad_alloc for me
for (auto i : f()) {
std::cout << i << std::endl;
}
}
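For contrast, here is a sketch of the safe arrangement, where the owner of the data outlives the view. To keep it compilable without range-v3, EvenView is a hypothetical hand-written stand-in for a filter view (it is not part of any library); it only illustrates the pointer-like lifetime rule described above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a filter view: it stores a pointer to the
// source vector, never a copy, so it is only valid while the source lives.
class EvenView {
public:
    explicit EvenView(const std::vector<int>& source) : source_(&source) {}

    // Calls f on each even element; the filtering happens during traversal.
    template <typename F>
    void for_each(F f) const {
        for (int value : *source_) {
            if (value % 2 == 0) f(value);
        }
    }

private:
    const std::vector<int>* source_;
};

class MyClass {
public:
    explicit MyClass(std::vector<int> v) : vec(std::move(v)) {}
    // The result must not be used after this MyClass object is destroyed.
    EvenView getEvens() const { return EvenView(vec); }

private:
    std::vector<int> vec;
};

// Usage sketch: `owner` outlives the view, so nothing dangles.
void even_view_usage() {
    const MyClass owner(std::vector<int>{1, 2, 3, 4});
    std::vector<int> seen;
    owner.getEvens().for_each([&seen](int value) { seen.push_back(value); });
    assert(seen.size() == 2 && seen[0] == 2 && seen[1] == 4);
}
```

Storing the result of getEvens() beyond the lifetime of the object it came from would reproduce the dangling situation of f() above.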
You can use ranges::any_view as a type erasure mechanism for any range or combination of ranges.
ranges::any_view<int> getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
I cannot see any equivalent of this in the STL ranges library; please edit the answer if you can.
EDIT: The problem with ranges::any_view is that it is very slow and inefficient. See https://github.com/ericniebler/range-v3/issues/714.
It is desirable to declare a function returning a range in a header and define it in a cpp file:
for compilation firewalls (compilation speed)
to stop the language server from going crazy
for better factoring of the code
However, there are complications that make it not advisable:
How to get type of a view?
If defining it in a header is fine, use auto
If performance is not an issue, I would recommend ranges::any_view
Otherwise I'd say it is not advisable.

Is there any way to assign multiple variables at once in Dlang?

In Ruby, we can do this.
s = "split by space"
A,B,C = s.split(" ").map(&:to_i)
In D, it's a compile error.
string s = "split by space";
int A,B,C = s.split(" ").map!(x => x.to!int);
Jonathan is mostly right, but there is in fact a way to split a tuple into
its constituent parts, albeit more verbose than in Ruby, and without any handy
type inference:
import std.meta : AliasSeq;
import std.typecons : tuple;
auto foo() { return tuple(42, 29, "hello"); }
unittest {
int a, b;
string c;
AliasSeq!(a, b, c) = foo(); // Look ma, magic!
assert(a == 42);
assert(b == 29);
assert(c == "hello");
}
While there's no built-in way to do this with ranges like your example, it's
possible to implement in a library:
import std.meta : AliasSeq, Repeat;
import std.typecons : Tuple, tuple;
import std.algorithm : map;
import std.conv : to;
import std.string : split;
import std.range : isInputRange, ElementType;
unittest {
string s = "1 2 3";
int A,B,C;
AliasSeq!(A,B,C) = s.split(" ").map!(x => x.to!int).tuplify!3;
assert(A == 1);
assert(B == 2);
assert(C == 3);
}
auto tuplify(size_t n, R)(R r) if (isInputRange!R) {
Tuple!(Repeat!(n, ElementType!R)) result;
static foreach (i; 0..n) {
result[i] = r.front;
r.popFront();
}
assert(r.empty);
return result;
}
No, there is no way to do that. There has been talk off-and-on about possibly adding tuple support to the language such that you could do something like
int a;
int b;
string c;
(a, b, c) = foo();
and maybe that will happen someday, but it's not possible right now. The closest would be using something like std.typecons.Tuple/tuple so that you can do something like
Tuple!(int, int, string) foo() { return tuple(42, 29, "hello"); }
Tuple!(int, int, string) result = foo();
or more likely
auto foo() { return tuple(42, 29, "hello"); }
auto result = foo();
but Tuple is ultimately just a struct, and you can't magically split it out at the other end. You have to access its members via indices such as result[0] or result[1], or if you declare Tuple with names - e.g. Tuple!(int, "x", int, "y", string, "str") - then you can access the members by name - e.g. result.x. So, Tuple/tuple allows you to return multiple values without explicitly declaring a struct type just for that, but it's still creating a struct type just for that, and while it allows you to easily pack values to return, it does not allow you to automatically unpack them on the other end. That would require compiler support of some kind that we don't have.
However, even if we had better tuple support in the language so that something like
(a, b, c) = foo();
worked, I doubt that what you're trying to do would work, since map specifically returns a range. So, it's an object with member functions, not a tuple of any kind to be split up. It just so happens to represent a list of values that can be extracted with the right set of function calls. And the number of values that it has is not known at compile time, so even if you assume that the compiler understands the range primitives well enough to get a list out of them for you, it can't guarantee at compile time that there are enough values to put into the variables you're trying to assign to, let alone that there are exactly that number of values. So, while it wouldn't be impossible to make something like that work (e.g. if it threw an Error at run time if there weren't enough values in the range), I'd be surprised if that were implemented. D is a statically typed language and that would effectively be making a piece of it dynamic, so it would be pretty out-of-character for it to be in the language. Ruby is a dynamic language, so it's a very different beast.
Regardless, any improvements with tuples would be improvements to the language and would have to go through the DIP process and get approved, and nothing like that has happened yet.

qsort comparison compilation error

My medianfilter.cpp class invokes qsort as seen below.
vector<float> medianfilter::computeMedian(vector<float> v) {
float arr[100];
std::copy(v.begin(), v.end(), arr);
unsigned int i;
qsort(arr, v.size(), sizeof(float), compare);
for (i = 0; i < v.size(); i++) {
printf("%f ", arr[i]);
}
printf("median=%d ", arr[v.size() / 2]);
return v;
}
The implementation of my comparison is:
int medianfilter::compare(const void * a, const void * b) {
float fa = *(const float*) a;
float fb = *(const float*) b;
return (fa > fb) - (fa < fb);
}
while the declaration in mediafilter.hpp is private and looks like this:
int compare (const void*, const void*);
A compilation error occurs: cannot convert ‘mediafilter::compare’ from type ‘int (mediafilter::)(const void*, const void*)’ to type ‘__compar_fn_t {aka int (*)(const void*, const void*)}’
I don't understand this error completely. How do I correctly declare and implement this comparison method?
Thanks!
Compare is a non-static member function whereas qsort expects a non-member function (or a static member function). As your compare function doesn't seem to use any non-static members of the class, you could just declare it static. In fact I'm not sure what your median filter class does at all. Perhaps you just need a namespace.
Why not sort the vector directly instead of copying it into a second array? Furthermore your code will break if the vector has more than 100 elements.
The default behavior of sort does just what you need, but for completeness I show how to use a compare function.
I also changed the return type of your function because I don't understand why a function called computeMedian wouldn't return the median.
namespace medianfilter
{
bool compare(float fa, float fb)
{
return fa < fb;
}
float computeMedian(vector<float> v)
{
std::sort(v.begin(), v.end(), compare);
// or simply: std::sort(v.begin(), v.end());
for (size_t i = 0; i < v.size(); i++) {
printf("%f ", v[i]);
}
if (v.empty())
{
// what do you want to happen here?
}
else
{
float median = v[v.size() / 2]; // what should happen if size is even?
printf("median=%f ", median); // it was %d before
return median;
}
}
}
You can't call compare as it is because it is a member function and requires a this pointer (i.e. it needs to be called on an object). However, as your compare function doesn't need a this pointer, simply make it a static function and your code will compile.
Declare it like this in your class:
static int compare(const void * a, const void * b);
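As a self-contained sketch of that fix (the class name MedianFilter, the sorted helper, and the usage function are made up for illustration and are not the asker's actual code), a private static comparison can be handed to qsort directly:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

class MedianFilter {
public:
    // Returns a sorted copy of the input, sorted with std::qsort.
    static std::vector<float> sorted(std::vector<float> v) {
        if (!v.empty()) {
            std::qsort(v.data(), v.size(), sizeof(float), &MedianFilter::compare);
        }
        return v;
    }

private:
    // static: there is no hidden `this` parameter, so the address is an
    // ordinary int (*)(const void*, const void*) that qsort accepts.
    static int compare(const void* a, const void* b) {
        const float fa = *static_cast<const float*>(a);
        const float fb = *static_cast<const float*>(b);
        return (fa > fb) - (fa < fb);
    }
};

void median_filter_usage() {
    const std::vector<float> s = MedianFilter::sorted({3.0f, 1.0f, 2.0f});
    assert(s.size() == 3 && s[0] == 1.0f && s[1] == 2.0f && s[2] == 3.0f);
}
```

Sorting the vector's own storage through data() also avoids the fixed-size intermediate array from the question.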
Not directly related to your question (for which you already have the answer) but some observations:
Your calculation of median is wrong. If the number of elements is even you should return the average of the two center values not the value of lower one.
The copy to the array with a set size screams buffer overflow. Copy to another vector and std::sort it or (as suggested by @NeilKirk) just sort the original one unless you have cause not to modify it.
There is no guard against empty input. Median is undefined in this case, but your implementation would just return whatever happens to be in arr[0].
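Putting those three observations together, a minimal sketch could look like the following. Throwing std::invalid_argument for empty input is just one possible policy chosen here for illustration; the function and usage names are likewise assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Takes the vector by value so the caller's data is left untouched.
float median(std::vector<float> v) {
    if (v.empty()) {
        throw std::invalid_argument("median of an empty sequence is undefined");
    }
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    if (v.size() % 2 == 1) {
        return v[mid];                    // odd count: the single middle value
    }
    return (v[mid - 1] + v[mid]) / 2.0f;  // even count: mean of the two middle values
}

void median_usage() {
    assert(median({3.0f, 1.0f, 2.0f}) == 2.0f);
    assert(median({4.0f, 1.0f, 3.0f, 2.0f}) == 2.5f);
}
```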
OK, this is more of an appendix to Eli Algranti's (excellent) answer than an answer to the original question.
Here is generic code to compute the quantile quant of a vector of double called x (which the code below preserves).
First things first: there are many definitions of quantiles (R alone lists 9). The code below corresponds to definition #5 (which is also the default quantile function in MATLAB and generally the one statisticians think of when they think quantile).
The key idea here is that when the quantile does not fall on a precise observation (e.g. when you want the 15% quantile of an array of length 10), the implementation below performs the (correct) interpolation between adjacent quantiles (in this case between the 10% and 20% ones). This is important so that when you increase the number of observations (I'm hinting at the name medianfilter here) the value of the quantile does not jump about abruptly but converges smoothly instead (which is one reason why this is the statistician's preferred definition).
The code assumes that x has at least one element (the code below is part of a longer one and I feel this point has been made already).
Unfortunately it's written using many functions from the (excellent!) C++ Eigen library, and it is too late at night for me to translate the Eigen functions (or sanitize the variable names), but the key ideas should be readable.
#include <Eigen/Dense>
#include <Eigen/QR>
using namespace std;
using namespace Eigen;
using Eigen::MatrixXd;
using Eigen::VectorXd;
using Eigen::VectorXi;
double quantiles(const Ref<const VectorXd>& x,const double quant){
//computes the quantile 'quant' of x.
const int n=x.size();
double lq,uq,fq;
const double q1=n*(double)quant+0.5;
const int index1=floor(q1);
const int index2=ceil(q1);
const double index3=(double)index2-q1;
VectorXd x1=x;
std::nth_element(x1.data(),x1.data()+index1-1,x1.data()+x1.size());
lq=x1(index1-1);
if(index1==index2){
fq=lq;
} else {
uq=x1.segment(index1,x1.size()-index1).minCoeff(); // min over every element after position index1-1
fq=lq*index3+uq*(1.0-index3);
}
return(fq);
}
So the code uses one call to nth_element, which has average complexity O(n) [sorry for sloppily using big O for the average case] and (when n is even) one extra call to min() [which in Eigen dialect is written .minCoeff()] on at most n/2 elements of the vector, which is O(n/2).
This is much better than using partial sort (which would cost O(nlog(n/2)), worst case) or sort (which would cost O(nlogn)).
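For readers without Eigen, here is a rough plain-STL sketch of the same idea. It is not a line-by-line translation of the code above. It assumes the k-th smallest value sits at probability (k - 0.5)/n, which is how I read definition #5, and it clamps the position to [1, n] in the tails, which is my own assumption. The function and usage names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Quantile p (0 <= p <= 1) of x with linear interpolation between order
// statistics. x is taken by value, so the caller's vector is preserved.
double quantile(std::vector<double> x, double p) {
    assert(!x.empty() && p >= 0.0 && p <= 1.0);
    const double n = static_cast<double>(x.size());
    // 1-based fractional position, clamped to [1, n] (assumption for the tails).
    const double h = std::min(std::max(n * p + 0.5, 1.0), n);
    const std::size_t k = static_cast<std::size_t>(std::floor(h));
    const double frac = h - std::floor(h);

    // Put the k-th smallest value at index k - 1; everything after it is >= it.
    std::nth_element(x.begin(), x.begin() + (k - 1), x.end());
    const double lower = x[k - 1];
    if (frac == 0.0) return lower;  // falls exactly on an observation

    // The (k+1)-th smallest value is the minimum of the unsorted tail.
    const double upper = *std::min_element(x.begin() + k, x.end());
    return lower + frac * (upper - lower);
}

void quantile_usage() {
    assert(quantile({4.0, 1.0, 3.0, 2.0}, 0.5) == 2.5);  // halfway between 2 and 3
    assert(quantile({3.0, 1.0, 2.0}, 0.5) == 2.0);       // exactly the middle value
}
```

As in the Eigen version, the work is one nth_element call plus, when interpolation is needed, one linear scan of the tail.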

Lambda closure vs simple argument?

For lambda expressions, I don't quite get the usefulness of closures in C++11.
auto f = [] (int n, int m) { return n + m; };
std::cout << f(2,2);
versus.
int n = 2;
auto f = [n] (int m) { return n + m; };
std::cout << f(2);
This is a very basic and primitive example. I'm guessing that closures play an important part in other kinds of statements, but my C++ book doesn't clarify this (so far).
Why not include the closure as a parameter?
OK, a simple example, remove all the x's from a string
char x = 'x';
std::string s = "Text to remove all 'x's from";
s.erase(std::remove_if(s.begin(), s.end(), [x](char c) {return x == c;}), s.end());
Borrowed and modified from http://en.cppreference.com/w/cpp/algorithm/remove
In this example, the predicate passed to remove_if() only takes a single parameter, but I need two values for the comparison.
Closures are not always called immediately. They are objects which can be stored and called later when the data necessary to successfully execute the lambda function may no longer be in scope or easily accessible from the call site.
It's possible to store any necessary data along with the closure, but it's so much simpler for the closure to grab anything it needs when it's created and use it when it's eventually called. It provides a form of encapsulation.
This also decreases code coupling because if you were to store the data along with the code then the caller could only work with the specific objects you decided to store. Since a closure carries its own data along with it, it can work with any data it needs.
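A small sketch of "stored now, called later" (make_greater_than and the usage function are hypothetical names chosen for illustration): each closure copies the value it needs when it is created, so the caller only has to supply the one argument it actually knows about.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical factory: the returned closure copies `limit` into itself.
std::function<bool(int)> make_greater_than(int limit) {
    return [limit](int value) { return value > limit; };
}

void closure_usage() {
    std::vector<std::function<bool(int)>> stored;
    stored.push_back(make_greater_than(2));   // `limit` is out of scope after this call
    stored.push_back(make_greater_than(10));

    // Called later, from code that knows nothing about the limits.
    assert(stored[0](3));
    assert(!stored[0](2));
    assert(!stored[1](3));
}
```

With a plain two-parameter function instead, every call site would have to carry the right limit around itself.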
Here's a greatly oversimplified real-life example. I built a database server which needed to support fields with multiple values. The problem was that when results were displayed, it was important to highlight which values actually caused a record to match the search criteria. So, the query parser would spit out a predicate in the form of a closure which would indicate whether or not it was a matching value.
It looked something like this:
std::function< bool(int value) > parser::match_int(int search_val) {
return [=](int value) { return value == search_val; };
}
That closure got stored in a collection. When it was time to render the record, I could easily determine which values needed to be highlighted. Keep in mind that the parser and any associated data are now gone:
void render_values(std::function< bool(int value) > pred, std::vector<int> values) {
for (int value : values) {
if (pred(value))
render_highlight(value);
else
render_normal(value);
}
}

Dynamic creation of a pointer function in c++

I was working on my advanced calculus homework today and we're doing some iteration methods along the lines of Newton's method to find solutions to things like x^2=2. It got me thinking that I could write a function that would take two function pointers, one to the function itself and one to its derivative, and automate the process. This wouldn't be too challenging. Then I started thinking: could I have the user input a function and parse that input (yes, I can do that)? But can I then dynamically create a pointer to a one-variable function in C++? For instance, given x^2+x, can I make a function double function(double x){ return x*x+x;} at run time? Is this remotely feasible, or is it along the lines of self-modifying code?
Edit:
So I suppose this could be done by storing the information in an array and having a function that evaluates the information stored in that array for a given input. Then you could create a class, initialize the array inside that class, and use the function from there. Is there a better way?
As others have said, you cannot create new C++ functions at runtime in any portable way. You can however create an expression evaluator that can evaluate things like:
(1 + 2) * 3
contained in a string, at run time. It's not difficult to expand such an evaluator to have variables and functions.
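As a rough idea of how small such an evaluator can be, here is a recursive-descent sketch. It handles + - * /, parentheses, unary minus, numbers and a single variable x, which is my own choice of grammar; the names Evaluator and evaluate are made up, and error handling is deliberately minimal.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

class Evaluator {
public:
    Evaluator(const std::string& text, double x) : s_(text), pos_(0), x_(x) {}

    double run() {
        const double v = expr();
        skip();
        if (pos_ != s_.size()) throw std::runtime_error("unexpected trailing input");
        return v;
    }

private:
    void skip() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    // Consumes c if it is the next non-space character.
    bool eat(char c) {
        skip();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    // expr := term (('+' | '-') term)*
    double expr() {
        double v = term();
        for (;;) {
            if (eat('+')) v += term();
            else if (eat('-')) v -= term();
            else return v;
        }
    }
    // term := factor (('*' | '/') factor)*
    double term() {
        double v = factor();
        for (;;) {
            if (eat('*')) v *= factor();
            else if (eat('/')) v /= factor();
            else return v;
        }
    }
    // factor := '-' factor | '(' expr ')' | 'x' | number
    double factor() {
        if (eat('-')) return -factor();
        if (eat('(')) {
            const double v = expr();
            if (!eat(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        if (eat('x')) return x_;
        skip();
        const char* start = s_.c_str() + pos_;
        char* end = 0;
        const double v = std::strtod(start, &end);  // parses the longest numeric prefix
        if (end == start) throw std::runtime_error("expected a number");
        pos_ += static_cast<std::size_t>(end - start);
        return v;
    }

    std::string s_;
    std::size_t pos_;
    double x_;
};

double evaluate(const std::string& text, double x) { return Evaluator(text, x).run(); }

void evaluate_usage() {
    assert(evaluate("(1 + 2) * 3", 0.0) == 9.0);
    assert(evaluate("x*x + x", 2.0) == 6.0);
}
```

This re-parses the string on every call; building a tree once and evaluating it repeatedly is the natural next step if the same expression is evaluated many times.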
You can't dynamically create a function in the sense that you can generate raw machine code for it, but you can quite easily create mathematical expressions using polymorphism:
struct Expr
{
virtual double eval(double x) = 0;
};
struct Sum : Expr
{
Sum(Expr* a, Expr* b):a(a), b(b) {}
virtual double eval(double x) {return a->eval(x) + b->eval(x);}
private:
Expr *a, *b;
};
struct Product : Expr
{
Product(Expr* a, Expr* b):a(a), b(b) {}
virtual double eval(double x) {return a->eval(x) * b->eval(x);}
private:
Expr *a, *b;
};
struct VarX : Expr
{
virtual double eval(double x) {return x;}
};
struct Constant : Expr
{
Constant(double c):c(c) {}
virtual double eval(double x) {return c;}
private:
double c;
};
You can then parse your expression into an Expr object at runtime. For example, x^2+x would be Expr* e = new Sum(new Product(new VarX(), new VarX()), new VarX()). You can then evaluate that for a given value of x by using e->eval(x).
Note: in the above code, I have ignored const-correctness for clarity -- you should not :)
It is along the lines of self-modifying code, and it is possible—just not in "pure" C++. You would need to know some assembly and a few implementation details. Without going down this road, you could abstractly represent operations (e.g. with functors) and build an expression tree to be evaluated.
However, for the simple situation of just one variable that you've given, you'd only need to store coefficients, and you can evaluate those for a given value easily.
// store coefficients as vector in "reverse" order, e.g. 1x^2 - 2x + 3
// is stored as [3, -2, 1]
typedef double Num;
typedef vector<double> Coeffs;
Num eval(Coeffs c, Num x) {
assert(c.size()); // must not be empty
Num result = 0;
Num factor = 1;
for (Coeffs::const_iterator i = c.begin(); i != c.end(); ++i) {
result += *i * factor;
factor *= x;
}
return result;
}
int main() {
Coeffs c; // x^2 + x + 0
c.push_back(0);
c.push_back(1);
c.push_back(1);
cout << eval(c, 0) << '\n';
cout << eval(c, 1) << '\n';
cout << eval(c, 2) << '\n';
}
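Since the question started from Newton's method, here is a sketch of how the coefficient representation could feed it: the derivative of a polynomial is again a coefficient vector, so no function pointers are needed. The names eval_poly, derivative and newton, the default tolerance, and the iteration cap are my own assumptions, and there is no safeguard against divergence.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Coeffs;  // c[k] is the coefficient of x^k

// Horner's rule: evaluates from the highest power down.
double eval_poly(const Coeffs& c, double x) {
    double result = 0.0;
    for (std::size_t k = c.size(); k-- > 0;) result = result * x + c[k];
    return result;
}

// d/dx of sum c[k] x^k is sum k*c[k] x^(k-1).
Coeffs derivative(const Coeffs& c) {
    Coeffs d;
    for (std::size_t k = 1; k < c.size(); ++k) d.push_back(static_cast<double>(k) * c[k]);
    return d;
}

// Newton iteration x <- x - p(x)/p'(x), starting from x0.
double newton(const Coeffs& c, double x0, int max_iter = 50, double tol = 1e-12) {
    assert(!c.empty());
    const Coeffs d = derivative(c);
    double x = x0;
    for (int i = 0; i < max_iter; ++i) {
        const double slope = eval_poly(d, x);
        if (slope == 0.0) break;  // flat tangent: give up with the current estimate
        const double next = x - eval_poly(c, x) / slope;
        if (std::fabs(next - x) < tol) return next;
        x = next;
    }
    return x;
}

void newton_usage() {
    const Coeffs c{-2.0, 0.0, 1.0};  // x^2 - 2
    const double root = newton(c, 1.0);
    assert(std::fabs(root - std::sqrt(2.0)) < 1e-9);
}
```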
You don't really need self-modifying code for that. But you will be writing what comes down to an expression parser and interpreter. You write the code to parse your function into suitable data structures (e.g. trees). For a given input you now traverse the tree and calculate the result of the function. Calculation can be done through a visitor.
You don't need to know assembly. Write C++ code for the possible expressions, and then write a compiler which examines the expression and chooses the appropriate code snippets. That could be done at runtime like an interpreter usually does, or it could be a compile phase which creates code to execute by copying the instructions from each expression evaluation into allocated memory and then sets it up as a function. The latter is harder to understand and code, but will perform better. But for the development time plus execution time to be less than an interpreted implementation, the compiled code would have to be used lots (billions) of times.
As others have mentioned, writing self-modifying code isn't necessary at all, and it is painful in a compiled language if you want it to be portable.
The hardest part of your work is parsing the input. I recommend muParser to evaluate your expressions. It should take away a lot of pain and you would be able to focus on the important part of your project.