What is the difference between a var and val definition in Scala and why does the language need both? Why would you choose a val over a var and vice versa?
As so many others have said, the object assigned to a val cannot be replaced, while the object assigned to a var can. However, the object itself can still have its internal state modified. For example:
class A(n: Int) {
  var value = n
}

class B(n: Int) {
  val value = new A(n)
}

object Test {
  def main(args: Array[String]): Unit = {
    val x = new B(5)
    x = new B(6)        // Doesn't compile: x is a val, so I can't replace the object created on the line above with this new one.
    x.value = new A(6)  // Doesn't compile: B.value is a val, so I can't replace the object assigned to it with a new one.
    x.value.value = 6   // Works: A.value is a var, so it can receive a new value.
  }
}
So, even though we can't change the object assigned to x, we can still change the state of that object. At the root of it, however, there was a var.
Now, immutability is a good thing for many reasons. First, if an object doesn't change internal state, you don't have to worry if some other part of your code is changing it. For example:
x = new B(0)
f(x)
if (x.value.value == 0)
  println("f didn't do anything to x")
else
  println("f did something to x")
This becomes particularly important with multithreaded systems. In a multithreaded system, the following can happen:
x = new B(1)
f(x)
if (x.value.value == 1) {
  print(x.value.value) // Can be different than 1!
}
If you use val exclusively, and only use immutable data structures (that is, avoid arrays, everything in scala.collection.mutable, etc.), you can rest assured this won't happen. That is, unless there's some code, perhaps even a framework, doing reflection tricks -- reflection can change "immutable" values, unfortunately.
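As a sketch of that discipline (A2 and B2 are made-up names mirroring the A and B classes above, but using vals only, and f is the same hypothetical function as before):
class A2(val value: Int)   // immutable counterpart of A: value is now a val
class B2(val value: A2)    // immutable counterpart of B

val x = new B2(new A2(1))
// x.value.value = 6       // does not compile: value is a val, not a var
f(x)
if (x.value.value == 1)
  print(x.value.value)     // reliably prints 1: no other thread can have changed x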
That's one reason, but there is another. When you use var, you can be tempted to reuse the same var for multiple purposes. This has some problems:
It becomes harder for people reading the code to know what value a variable holds at a given point in the code.
You may forget to re-initialize the variable on some code path, and end up passing wrong values downstream.
Simply put, using val is safer and leads to more readable code.
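Here is a made-up sketch of that reuse pitfall: one var, total, doing double duty for two unrelated sums.
var total = 0
for (price <- List(10, 20, 30)) total += price
val sumOfPrices = total        // 60

total = 0                      // easy to forget on some code path...
for (weight <- List(1, 2)) total += weight
val sumOfWeights = total       // 3 here, but would silently be 63 without the reset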
We can, then, go in the other direction. If val is so much better, why have var at all? Well, some languages did take that route, but there are situations in which mutability improves performance, a lot.
For example, take an immutable Queue. When you enqueue or dequeue something, you get back a new Queue object. How, then, would you go about processing all of its items?
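To make that concrete, here is a tiny sketch of how immutable.Queue behaves; every operation hands back a new queue and leaves the old one untouched:
val q0 = scala.collection.immutable.Queue(1, 2)
val q1 = q0.enqueue(3)        // q0 is still Queue(1, 2); q1 is Queue(1, 2, 3)
val (head, q2) = q1.dequeue   // head == 1, q2 == Queue(2, 3); q1 is unchanged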
I'll go through that with an example. Let's say you have a queue of digits, and you want to compose a number out of them. For example, if I have a queue with 2, 1, 3, in that order, I want to get back the number 213. Let's first solve it with a mutable.Queue:
def toNum(q: scala.collection.mutable.Queue[Int]) = {
  var num = 0
  while (!q.isEmpty) {
    num *= 10
    num += q.dequeue()
  }
  num
}
This code is fast and easy to understand. Its main drawback is that the queue that is passed in is modified by toNum, so you have to make a copy of it beforehand. That's the kind of object management that immutability frees you from.
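For instance, a hypothetical caller that still needs its queue afterwards has to clone it first:
val q = scala.collection.mutable.Queue(2, 1, 3)
val n = toNum(q.clone())   // 213
// q still contains 2, 1, 3 here; passing q directly would have emptied it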
Now, let's convert it to an immutable.Queue:
def toNum(q: scala.collection.immutable.Queue[Int]) = {
  def recurse(qr: scala.collection.immutable.Queue[Int], num: Int): Int = {
    if (qr.isEmpty)
      num
    else {
      val (digit, newQ) = qr.dequeue
      recurse(newQ, num * 10 + digit)
    }
  }
  recurse(q, 0)
}
Because I can't reuse some variable to keep track of my num, like in the previous example, I need to resort to recursion. In this case, it is a tail-recursion, which has pretty good performance. But that is not always the case: sometimes there is just no good (readable, simple) tail recursion solution.
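As a side note, when you depend on a call being in tail position, scala.annotation.tailrec lets the compiler verify it; here is a minimal sketch of the same function with that annotation added:
import scala.annotation.tailrec

def toNum(q: scala.collection.immutable.Queue[Int]): Int = {
  @tailrec
  def recurse(qr: scala.collection.immutable.Queue[Int], num: Int): Int =
    if (qr.isEmpty) num
    else {
      val (digit, newQ) = qr.dequeue
      recurse(newQ, num * 10 + digit)   // a non-tail call here would be a compile error
    }
  recurse(q, 0)
}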
Note, however, that I can rewrite that code to use an immutable.Queue and a var at the same time! For example:
def toNum(q: scala.collection.immutable.Queue[Int]) = {
  var qr = q
  var num = 0
  while (!qr.isEmpty) {
    val (digit, newQ) = qr.dequeue
    num *= 10
    num += digit
    qr = newQ
  }
  num
}
This code is still efficient, does not require recursion, and you don't need to worry whether you have to make a copy of your queue or not before calling toNum. Naturally, I avoided reusing variables for other purposes, and no code outside this function sees them, so I don't need to worry about their values changing from one line to the next -- except when I explicitly do so.
Scala opted to let the programmer do that, if the programmer deemed it to be the best solution. Other languages have chosen to make such code difficult. The price Scala (and any language with widespread mutability) pays is that the compiler doesn't have as much leeway in optimizing the code as it could otherwise. Java's answer to that is optimizing the code based on the run-time profile. We could go on and on about pros and cons to each side.
Personally, I think Scala strikes the right balance, for now. It is not perfect, by far. I think both Clojure and Haskell have very interesting notions not adopted by Scala, but Scala has its own strengths as well. We'll see what comes up in the future.
val is final, that is, it cannot be reassigned. Think final in Java.
In simple terms:
var = variable
val = variable + final
val means immutable and var means mutable.
The difference is that a var can be reassigned whereas a val cannot. The mutability, or otherwise, of whatever is actually assigned is a side issue:
import collection.immutable
import collection.mutable
var m = immutable.Set("London", "Paris")
m = immutable.Set("New York") //Reassignment - I have change the "value" at m.
Whereas:
val n = immutable.Set("London", "Paris")
n = immutable.Set("New York") //Will not compile as n is a val.
And hence:
val n = mutable.Set("London", "Paris")
n = mutable.Set("New York") //Will not compile, even though the type of n is mutable.
If you are building a data structure and all of its fields are vals (each referring, in turn, to an immutable value), then that data structure is immutable, as its state cannot change.
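For example, here is a small sketch of that idea (the Point class is made up for illustration): every field is a val holding an immutable value, so "modifying" an instance really means building a new one.
class Point(val x: Int, val y: Int) {
  def moved(dx: Int, dy: Int): Point = new Point(x + dx, y + dy)
}

val p = new Point(1, 2)
val q = p.moved(3, 0)   // p is untouched; q is a new Point(4, 2)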
Thinking in terms of C++,
val x: T
is analogous to constant pointer to non-constant data
T* const x;
while
var x: T
is analogous to non-constant pointer to non-constant data
T* x;
Favoring val over var increases immutability of the codebase which can facilitate its correctness, concurrency and understandability.
To understand the meaning of having a constant pointer to non-constant data consider the following Scala snippet:
val m = scala.collection.mutable.Map(1 -> "picard")
m // res0: scala.collection.mutable.Map[Int,String] = HashMap(1 -> picard)
Here the "pointer" val m is constant so we cannot re-assign it to point to something else like so
m = n // error: reassignment to val
however we can indeed change the non-constant data itself that m points to like so
m.put(2, "worf")
m // res1: scala.collection.mutable.Map[Int,String] = HashMap(1 -> picard, 2 -> worf)
"val means immutable and var means mutable."
To paraphrase, "val means value and var means variable".
That distinction happens to be extremely important in computing, because those two concepts define the very essence of what programming is all about, and it is one that OO has managed to blur almost completely, since in OO the only axiom is that "everything is an object". As a consequence, many programmers these days tend not to understand, appreciate, or recognize it, because they have been brainwashed into "thinking the OO way" exclusively. This often leads to variable/mutable objects being used just about everywhere, when value/immutable objects would often have been better.
val means immutable and var means mutable
You can think of val as the final keyword in Java or the const keyword in C++.
val means it is final and cannot be reassigned, whereas var can be reassigned later.
It's as simple as the names suggest:
var means it can vary
val means invariable
val - values are typed storage constants. Once created, their value can't be reassigned. A new value is defined with the keyword val.
e.g. val x: Int = 5
Here the type is optional, as Scala can infer it from the assigned value.
var - variables are typed storage units which can be assigned new values as long as the memory space is reserved.
e.g. var x: Int = 5
Data stored in both kinds of storage units is automatically de-allocated by the JVM once it is no longer needed.
In Scala, values are preferred over variables due to the stability they bring to the code, particularly in concurrent and multithreaded code.
Though many have already answered the difference between val and var, one point to note is that val is not exactly like the final keyword.
A val can take a new value through recursion (each recursive call rebinds it), but we can never change the value of a final. In that sense, final is more constant than val.
def factorial(num: Int): Int = {
  if (num == 0) 1
  else factorial(num - 1) * num
}
Method parameters are vals by default, yet on every recursive call each parameter is bound to a new value.
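A tiny sketch of that (countdown is just a made-up example): inside the body a parameter behaves like a val, so reassigning it does not compile, yet each recursive call binds it to a fresh value.
def countdown(n: Int): Unit = {
  // n = n - 1                 // does not compile: n is effectively a val
  if (n > 0) countdown(n - 1)  // but each call receives its own n
}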
In terms of JavaScript, it is the same as:
val -> const
var -> var
I have a struct that receives an array of type Float from a C++ library.
import 'dart:ffi';

class MyStruct extends Struct {
  @Array.multi([12])
  external Array<Float> states;
}
I am able to receive data and parse it in Dart.
Now I want to do the reverse. I have a List<double> which I want to assign to this struct and pass to C++.
The following cast fails at run time.
myStructObject.states = listObject as Array<Float>;
Neither the Array class nor the List class has any related methods. Any idea on this?
There's no way to get around copying elements into FFI arrays.
for (var i = 0; i < listObject.length; i++) {
  my_struct.states[i] = listObject[i];
}
This may seem inefficient, but consider that depending on the specialization of listObject, the underlying memory layout of the data may differ significantly from the contiguous FFI layout, and so a type conversion sugar provided by Dart would likely also need to perform conversions on individual elements anyways (as opposed to just performing a single memcpy under the hood).
One possibility for closing the convenience gap would be to define an extension method. For example:
// Assumes: import 'dart:ffi' as ffi;
extension FloatArrayFill<T> on ffi.Array<ffi.Float> {
  void fillFromList(List<T> list) {
    for (var i = 0; i < list.length; i++) {
      this[i] = list[i] as double;
    }
  }
}
Usage:
my_struct.states.fillFromList(list);
Note that a separate extension method would need to be defined for each ffi.Array<T> specialization you want to do this for (Array<Uint32>, Array<Double>, Array<Bool>, etc.).
This is due to the [] operator being implemented through a separate extension method for each of these type specializations internally.
I am looping over the elements of an Arrow Array and trying to apply a compute function to each scalar that will tell me the year, month, day, etc. of each element. The code looks something like this:
arrow::NumericArray<arrow::Date32Type> array = {...};
for (int64_t i = 0; i < array.length(); i++) {
  arrow::Result<std::shared_ptr<arrow::Scalar>> result = array.GetScalar(i);
  if (!result.ok()) {
    // TODO: handle error
  }
  arrow::Result<arrow::Datum> year = arrow::compute::Year(*result);
}
However, I am not really clear as to how to extract the actual int64_t value from the arrow::compute::Year call. I have tried to do things like
const std::shared_ptr<int64_t> val = year.ValueOrDie();
>>> 'arrow::Datum' to non-scalar type 'const std::shared_ptr<long int>' requested
I've tried similarly to assign to just an int64_t which also fails with error: cannot convert 'arrow::Datum' to 'int64_t'
I didn't see any method of the Datum class that would otherwise return a scalar value in the primitive type that I think arrow::compute::Year should be returning. Any idea what I might be misunderstanding with the Datum / Scalar / Compute APIs?
Arrow's compute functions are really meant to be applied on arrays and not scalars, otherwise the overhead renders the operation rather inefficient. The arrow::compute::Year function takes in a Datum. This is a convenience item that could be a Scalar, an Array, ArrayData, RecordBatch, or Table. Not all functions accept all possible values of Datum (in particular, many do not accept RecordBatch or Table).
Once you have a result, there are a few ways you can get the data, and grabbing individual scalars is probably going to be the least efficient, especially if you know the type of the data ahead of time (in this case we know the type will be int64_t). This is because a scalar is meant to be a type-erased wrapper (e.g. like an "object" in python or java) around some value and it carries some overhead.
So my suggestion would be:
// If you are going to be passing your array through the compute
// infrastructure you'll need to have it in a shared_ptr.
// Also, NumericArray is a base class, so you don't often need
// to refer to it directly. You'll typically be getting one of the
// concrete subclasses like Date32Array.
std::shared_ptr<arrow::Date32Array> array = {...};

// A Datum can be implicitly constructed from a shared_ptr to an
// array. You could also explicitly construct it if that is more
// comfortable to you. Here `array` is being implicitly cast to a Datum.
ARROW_ASSIGN_OR_RAISE(arrow::Datum year_datum, arrow::compute::Year(array));

// Now we have a datum, but the docs tell us the return value from the
// `Year` function is always an array, so let's just unwrap it. This is
// something that could probably be improved in Arrow (might as well
// return an array).
std::shared_ptr<arrow::Array> years_arr = year_datum.make_array();

// Also, we know that the data type is Int64, so let's go ahead and
// cast further.
std::shared_ptr<arrow::Int64Array> years = std::dynamic_pointer_cast<arrow::Int64Array>(years_arr);

// The concrete classes can be iterated in a variety of ways. GetScalar
// is the least efficient (but doesn't require knowing the type up front).
// Since we know the type (we've cast to Int64Array) we can use Value
// to get a single int64_t, raw_values() to get a const int64_t* (e.g. a
// C-style array), or begin() and end() to get STL-compliant iterators.
for (int64_t i = 0; i < years->length(); i++) {
  std::cout << "Year: " << years->Value(i) << std::endl;
}
If you really want to work with scalars:
std::shared_ptr<arrow::Array> array = {...};
for (int64_t i = 0; i < array->length(); i++) {
  arrow::Result<std::shared_ptr<arrow::Scalar>> result = array->GetScalar(i);
  if (!result.ok()) {
    // TODO: handle error
  }
  ARROW_ASSIGN_OR_RAISE(arrow::Datum year_datum, arrow::compute::Year(*result));
  std::shared_ptr<arrow::Scalar> year_scalar = year_datum.scalar();
  std::shared_ptr<arrow::Int64Scalar> year_scalar_int = std::dynamic_pointer_cast<arrow::Int64Scalar>(year_scalar);
  int64_t year = year_scalar_int->value;
}
Is there an implementation in the protocol buffers library that allows sorting the array that's specified as a repeated field? For example, say the array consists of items of a type that itself contains an index field based on which the array items need to be sorted.
I couldn't find it, so guess I'll have to write one myself. Just wanted to confirm.
Thanks.
Protobufs provide a RepeatedPtrField interface, accessible via the mutable_* methods, which can be sorted with the std::sort() template.
Unless the underlying type of the repeated field is a simple one, you'll likely want to use an overloaded operator<, a comparator, or a lambda to do so. A toy example using a lambda would be:
message StaffMember {
  optional string name = 1;
  optional double hourly_rate = 2;
}

message StoreData {
  repeated StaffMember staff = 1;
}
StoreData store;

// Reorder the list of staff by pay scale
std::sort(store.mutable_staff()->begin(),
          store.mutable_staff()->end(),
          [](const StaffMember& a, const StaffMember& b) {
            return a.hourly_rate() < b.hourly_rate();
          });
Suppose that I have an array. I want to remove all the elements within the array that have a given value. Does anyone know how to do this? The value I am trying to remove may occur more than once and the array is not necessarily sorted. I would prefer to filter the array in-place instead of creating a new array. For example, removing the value 2 from the array [1, 2, 3, 2, 4] should produce the result [1, 3, 4].
This is the best thing I could come up with:
T[] without(T)(T[] stuff, T thingToExclude) {
    auto length = stuff.length;
    T[] result;
    foreach (thing; stuff) {
        if (thing != thingToExclude) {
            result ~= thing;
        }
    }
    return result;
}
stuff = stuff.without(thingToExclude);
writeln(stuff);
This seems unnecessarily complex and inefficient. Is there a simpler way? I looked at the std.algorithm module in the standard library hoping to find something helpful but everything that looked like it would do what I wanted was problematic. Here are some examples of things I tried that didn't work:
import std.stdio, std.algorithm, std.conv;
auto stuff = [1, 2, 3, 2, 4];
auto thingToExclude = 2;
/* Works fine with a hard-coded constant but compiler throws an error when
given a value unknowable by the compiler:
variable thingToExclude cannot be read at compile time */
stuff = filter!("a != " ~ to!string(thingToExclude))(stuff);
writeln(stuff);
/* Works fine if I pass the result directly to writeln but compiler throws
an error if I try assigning it to a variable such as stuff:
cannot implicitly convert expression (filter(stuff)) of type FilterResult!(__lambda2,int[]) to int[] */
stuff = filter!((a) { return a != thingToExclude; })(stuff);
writeln(stuff);
/* Mysterious error from compiler:
template to(A...) if (!isRawStaticArray!(A)) cannot be sliced with [] */
stuff = to!int[](filter!((a) { return a != thingToExclude; })(stuff));
writeln(stuff);
So, how can I remove all occurrences of a value from an array without knowing the indexes where they appear?
std.algorithm.filter is pretty close to what you want: your second try is good.
You'll want to either assign it to a new variable or use the array() function on it.
auto stuffWithoutThing = filter!((a) { return a != thingToExclude; })(stuff);
// use stuffWithoutThing
or
stuff = array(filter!((a) { return a != thingToExclude; })(stuff));
The first one does NOT create a new array. It just provides lazy iteration over the original array with the given value filtered out.
The second one will allocate memory for a new array to hold the content. You must import the std.array module for it to work.
Look up function remove in http://dlang.org/phobos/std_algorithm.html. There are two strategies - stable and unstable depending on whether you want the remaining elements to keep their relative positions. Both strategies operate in place and have O(n) complexity. The unstable version does fewer writes.
If you want to remove the values in place you can use remove:
auto stuffWithoutThing = remove!((a) { return a == thingToExclude; })(stuff);
This will not allocate a new array but works in place; note that the stuff range needs to be mutable.