I want to get the first row from a Spark 2 dataset. The dataset is as follows:
|arrayValue |
+-------------------------------------------------------------+
|[1.47527718E12, 134535353E12] |
+-------------------------------------------------------------+
I used the code below to access the two values:
double training_point = (double) ratios.collectAsList().get(0).getDouble(0);
double validation_point = (double) ratios.collectAsList().get(0).getDouble(1);
but it gives me the exception below:
java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.lang.Double
Does anyone know how to fix the error?
I think you are trying to get two arrays when you only have one.
I have two tables in BigQuery that I am trying to merge. For the purpose of explanation, let us call the two tables A and B, with B being merged into A. I have a primary key called id on which I perform the merge. Both tables have a column (call it X) of type ARRAY. My intention is to replace the array data in A with that of B whenever the arrays in the two tables are not equal. How can I do that? I did find posts on SO and other sites, but none of them work for my use case.
A B
---------- ----------
id | x id | x
---------- ----------
1 | [1,2] 1 | [1,2]
---------- ----------
2 | [3] 2 | [4, 5]
The result of the merge should be
A
----------
id | x
----------
1 | [1,2]
----------
2 | [4,5]
How can I achieve the above result? Any leads will be very helpful. Also, if there are other posts that address this scenario directly, please point me to them.
Edits:
I tried the following:
merge A as main_table
using B as updated_table
on main_table.id = updated_table.id
when matched and main_table.x != updated_table.x then update set main_table.x = updated_table.x
when not matched then
insert(id, x) values (updated_table.id, updated_table.x)
;
Hope this helps.
I cannot directly use a comparison operator on arrays, right? My use case is to update the values only when they are not equal, so I cannot use something like != directly. This is the main problem.
You can use the to_json_string function to compare two arrays "directly":
to_json_string(main_table.x) != to_json_string(updated_table.x)
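The same idea can be sketched outside BigQuery. The Python snippet below is only an illustration, assuming both tables fit in memory as dicts keyed by id. `merge_arrays` is a hypothetical helper, and `json.dumps` stands in for to_json_string. Note that comparing serialized arrays is order-sensitive, so [1, 2] and [2, 1] count as different.

```python
import json


def merge_arrays(main_table, updated_table):
    """Emulate the MERGE: update differing arrays, insert missing ids."""
    merged = dict(main_table)
    for row_id, new_x in updated_table.items():
        if row_id not in merged:
            # WHEN NOT MATCHED THEN INSERT
            merged[row_id] = list(new_x)
        elif json.dumps(merged[row_id]) != json.dumps(new_x):
            # WHEN MATCHED AND the serialized arrays differ THEN UPDATE
            merged[row_id] = list(new_x)
    return merged


# Sample data taken from the tables in the question.
a = {1: [1, 2], 2: [3]}
b = {1: [1, 2], 2: [4, 5]}
result = merge_arrays(a, b)
print(result)  # {1: [1, 2], 2: [4, 5]}
```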
Maybe the solution is to use filter() and then loop, but let's see if you can tell me a way to do it with get().
I have this query with get() because I need to be sure I get only one result:
result = OtherModel.objects.get(months_from_the_avail__lte=self.obj.months_from_avail)
months_from_avail is an integer value.
Example
months_from_the_avail = 22
In the other model there are 3 rows:
A) months_from_the_avail = 0
B) months_from_the_avail = 7
C) months_from_the_avail = 13
So when I query, it returns all of them, since all are less than or equal to 22, but I need to get only the 13, as that is the range the value falls in.
range 1 = 0-6
range 2 = 7-12
range 3 = 13 ++
Is there a way to do this that I haven't thought of? Or should I change it to filter() and then loop over the results?
You can order the query by months_from_the_avail and take the first() result.
Remember that Django querysets are lazy: nothing is executed until the queryset is evaluated, so you can still use filter() and chain further calls:
result = OtherModel.objects.filter(months_from_the_avail__lte=self.obj.months_from_avail).order_by('-months_from_the_avail').first()
# order by descending and take the first object, which is the largest; returns None if the queryset is empty
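Another suggestion, from Abdul, which I think is faster and better, is to use latest() on the filtered queryset:
result = OtherModel.objects.filter(months_from_the_avail__lte=self.obj.months_from_avail).latest('months_from_the_avail')
# latest() returns the object with the largest value of the given field; it raises DoesNotExist if the queryset is empty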
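The selection logic itself ("the largest range start that is less than or equal to the value") can be illustrated without a database. This is a plain-Python sketch, not Django code. `range_start` is a hypothetical helper, and the thresholds are the three example rows from the question.

```python
from bisect import bisect_right


def range_start(thresholds, value):
    """Return the largest threshold <= value, or None if there is none."""
    ordered = sorted(thresholds)
    # bisect_right gives the count of thresholds that are <= value.
    index = bisect_right(ordered, value)
    return ordered[index - 1] if index > 0 else None


starts = [0, 7, 13]  # rows A, B and C from the question
print(range_start(starts, 22))  # 13
print(range_start(starts, 7))   # 7
print(range_start(starts, -1))  # None
```

This mirrors what the ordered, filtered query does: discard everything above the value, then keep the largest remaining row.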
I'm new to Power BI/Power Query and I was trying to write a function that calculates the correlation coefficient of two given lists.
I used the formula in the following image:
[![enter image description here][1]][1]
```
let
    Function = (l1 as list , l2 as list) =>
    let
        CorCoefNumerator = List.Sum((l1 - List.Average(l1)) * (l2 - List.Average(l2))),
        Denominator1 = List.Sum(Number.Power(l1 - List.Average(l1), 2)),
        Denominator2 = List.Sum(Number.Power(l2 - List.Average(l2), 2)),
        CorCoefDenominator = Number.Sqrt(Denominator1 - Denominator2),
        CorCoef = Value.Divide(CorCoefNumerator, CorCoefDenominator)
    in
        CorCoef
in
    Function(Table.ToList([Sales]), Table.ToList([Profit]))
```
The error message I'm getting is:
An error occurred in the ‘’ query. Expression.Error: There is an unknown identifier. Did you use the [field] shorthand for a _[field] outside of an 'each' expression?
One more question: is there a way to use DAX functions while writing Power Query queries? When I first tried to compute this correlation coefficient I needed it to work on columns, but since I couldn't use the DAX functions I had to treat my columns as lists.
[1]: https://i.stack.imgur.com/5z3uJ.png
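For reference, here is a minimal Python sketch of the calculation the function is aiming for. The formula image is not reproduced here, so this assumes it shows the standard Pearson correlation coefficient, in which the two sums of squares are multiplied under the square root. `pearson` is a hypothetical name and the sample numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Element-wise products of the deviations from each mean.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    denominator = math.sqrt(sum_sq_x * sum_sq_y)
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant list")
    return numerator / denominator


sales = [10, 20, 30, 40]   # made-up sample data
profit = [1, 3, 5, 7]
r = pearson(sales, profit)
print(r)  # 1.0, since profit is an exact linear function of sales
```

Every step works on individual elements paired up with `zip`, rather than subtracting a number from a whole list at once.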
Note: I'm using OpenOffice, but I can't use the "OpenOffice" tag. =|
I have this Sheet2:
and I'm planning to type one of the values from B4:B12 into another sheet.
For example, if I type the value 4 in A1, it should fill B with D4 and C with E4 (positions in the source sheet).
Sheet1 should get the value of D or E from the row where Sheet2.B is equal to Sheet1.A:
--A--B--C
1|4-D4--E4
2|
3|7-D7--E7
4|1-D1--E1
I tried this:
LOOKUP(A1;Sheet2.B1:Sheet2.B12;Sheet2.D4:Sheet2.D12);
but it's not getting the value; sometimes it just returns #NAME.
I believe your ranges are written incorrectly.
First, Sheet2.B1:Sheet2.B12 should be Sheet2.B1:B12
Second, for the LOOKUP function, the search range and the result range must be the same size (take a look at the online documentation for details).
Try this instead:
LOOKUP(A1;Sheet2.B1:B12;Sheet2.D1:D12);
Please try this in B1, copied across to C1, and then both copied down to suit:
=IF(ISERROR(LOOKUP($A1;Sheet2.$B$4:$B$12;Sheet2.D$4:D$12));"";LOOKUP($A1;Sheet2.$B$4:$B$12;Sheet2.D$4:D$12))
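The intent of that formula (return the value from the matching row, or an empty cell when nothing matches) can be pictured with a short Python sketch. This is only an analogy, not how Calc evaluates LOOKUP: it does exact matching only, `lookup_row` is a hypothetical helper, and the sheet contents are made up because the original Sheet2 is not shown.

```python
def lookup_row(key, search_column, result_column, default=""):
    """Return the result-column value on the row whose search value equals key."""
    if len(search_column) != len(result_column):
        raise ValueError("search and result columns must be the same size")
    for candidate, value in zip(search_column, result_column):
        if candidate == key:
            return value
    # Mirrors the IF(ISERROR(...); ""; ...) wrapper: blank when not found.
    return default


# Made-up stand-ins for Sheet2 columns B, D and E.
sheet2_b = [1, 4, 7]
sheet2_d = ["D1", "D4", "D7"]
sheet2_e = ["E1", "E4", "E7"]

print(lookup_row(4, sheet2_b, sheet2_d))  # D4
print(lookup_row(4, sheet2_b, sheet2_e))  # E4
print(repr(lookup_row(5, sheet2_b, sheet2_d)))  # ''
```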
I have a dataframe with one column and 20 rows. I want to use
dataframe[column].apply(lambda x : some_func(x))
to get a second column. The function returns a list. Pandas is not giving me what I want: it fills the second column with NaN instead of the list items that some_func() returns.
Is there a clever or simple way to fix this?
It seems that the error was caused by my forgetting to include:
axis = 1
My full line of code should have been:
dataframe[column].apply(lambda x : some_func(x), axis = 1)
You can just assign it like a dictionary:
dataframe['column2'] = dataframe['column1'].apply(lambda x : some_func(x))
Simple as that.
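A small self-contained sketch of that assignment, assuming pandas is installed. `some_func` here is a hypothetical stand-in that returns a two-item list, and the column names and values are made up.

```python
import pandas as pd


def some_func(value):
    # Hypothetical stand-in for the real function: returns a list per row.
    return [value, value * 2]


dataframe = pd.DataFrame({"column1": [1, 2, 3]})

# Series.apply calls some_func once per element; assigning the resulting
# Series to a new column name stores one list in each row.
dataframe["column2"] = dataframe["column1"].apply(some_func)

print(dataframe["column2"].tolist())  # [[1, 2], [2, 4], [3, 6]]
```

Because the result is assigned by column name, the new Series is aligned with the frame on the index, so each list ends up on the row it was computed from.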