Defining two variables on one line, as opposed to two lines - python-2.7

I am new-ish to Python and was interested in whether putting code on one line (as opposed to many) is always the way to go.
For example, the two code snippets below do exactly the same thing, but the first one cuts out a line of code. Is this considered 'un-pythonic'?
mean1, var1 = np.mean(value), np.var(value)
Or..
mean1 = np.mean(value)
var1 = np.var(value)

That construct:
a,b = c
is particularly useful to unpack c, when c is known to be a collection/iterable of 2 elements.
The usefulness of that:
mean1, var1 = np.mean(value), np.var(value)
is dubious: you create a tuple on the right side just to be able to unpack it on the left side. If the goal is a one-liner, you could just as well write:
mean1 = np.mean(value); var1 = np.var(value)
so you don't create an extra temporary object.
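As a quick sketch of all three forms side by side (the value list here is a made-up example), each computes the same result:

```python
import numpy as np

value = [1.0, 2.0, 3.0, 4.0]  # hypothetical sample data

# one line, via tuple packing/unpacking
mean1, var1 = np.mean(value), np.var(value)

# two statements on one line (no temporary tuple)
mean2 = np.mean(value); var2 = np.var(value)

# two lines -- the most common style
mean3 = np.mean(value)
var3 = np.var(value)

print(mean1, var1)  # -> 2.5 1.25, identical for all three forms
```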

Related

May Leetcode Speedrun Question: Single element in a Sorted Array

So I was watching Errichto complete these challenges and I was amazed at how fast he solved the "Single element in a Sorted Array". From a beginner's perspective, it does look impressive - maybe for senior devs the speed is quite normal.
You are given a sorted array where all elements are integers, and all elements appear exactly twice in the array, except for one element, which appears exactly once. (i.e., all elements are duplicated, except one.) You need to find the element appearing exactly once.
I am just here to understand how said code works:
class Solution {
public:
    int singleNonDuplicate(vector<int>& nums) {
        long long a = 0;
        for (int x : nums) {
            a ^= x;
        }
        return a;
    }
};
Here's what I've got so far:
for every single integer "x" in the vector/array "nums", a is set to a ^ x (if what I said is correct).
And here are my questions:
Wouldn't a^x be equal to 0 because a is 0 since the beginning?
int singleNonDuplicate(vector<int> nums) {
//...
}
and
int singleNonDuplicate(vector<int>& nums) {
//...
}
I've understood this: vector<int> nums is pass by value (you're working with a "copy" of nums inside the function) and vector<int>& nums is pass by reference (you're working with nums itself inside the function).
Does the "&" matter if you were to solve the problem just like Errichto?
ps:
sorry for possible mistakes from a programming perspective; I might've accidentally said some wrong things.
yes, I will learn C++ sooner or later. 2020 is the first year in my life where I actually have an actual "programming" class in my schedule; these videos are entertaining, and I'm curious to see why said code works and to try to understand it.
Casual proof:
(If you're interested in areas of study that help you to come up with solutions like this and understand them, I'd suggest Discrete Mathematics and Group Theory / Abstract Algebra.)
I think I know the question you were referencing. It goes something like,
You are given an unsorted array where all elements are integers, and all elements appear exactly twice in the array, except for one element, which appears exactly once. (i.e., all elements are duplicated, except one.)
You're on the right track for the first part, why the algorithm works. It takes advantage of a few properties of XOR:
X^0=X
X^X=0
The XOR operation is commutative and associative.
# proof
# since XOR is commutative, we can take advantage
# of the fact that all elements except our target
# occur in pairs of two:
P1, P1 = Two integers of the same value in a pair.
T = Our target.
# sample unsorted order, array size = 7 = (3*2)+1
[ P3, P1, T, P1, P2, P3, P2 ]
# since XOR is commutative, we can re-arrange the array
# to our liking, and still get the same value as
# the XOR algorithm.
# here, we move our target to the front, and then group
# each pair together. I've arranged them in ascending
# order, but that's not important:
[ T, P1, P1, P2, P2, P3, P3 ]
# write out the equation our algorithm is solving:
solution = 0 ^ T ^ P1 ^ P1 ^ P2 ^ P2 ^ P3 ^ P3
# because XOR is associative, we can use parens
# to indicate elements of the form X^X=0:
solution = T ^ (P1 ^ P1) ^ (P2 ^ P2) ^ (P3 ^ P3) ^ 0
# substitute X^X=0
solution = T ^ 0 ^ 0 ^ 0 ^ 0
# use X^0=X repeatedly
solution = T
So we know that running that algorithm will give us our target, T.
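To see the proof in action, here is the same algorithm as a short Python sketch (the sample array below is made up for illustration):

```python
from functools import reduce
from operator import xor

def single_non_duplicate(nums):
    # XOR of all elements: paired values cancel to 0,
    # leaving only the element that appears once
    return reduce(xor, nums, 0)

print(single_non_duplicate([3, 1, 4, 1, 2, 3, 2]))  # -> 4
```

Note that this version doesn't even need the array to be sorted, since XOR is commutative.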
On using & to pass-by-reference instead of pass-by-value:
Your understanding is correct. Here, it doesn't make a real difference.
Pass-by-reference lets you modify the original value in place, which he doesn't do.
Pass-by-value copies the vector, which wouldn't meaningfully impact performance here.
So he gets style points for using pass-by-reference, and if you're using leetcode to demonstrate your diligence as a software developer it's good to see, but it's not pertinent to his solution.
^ is the XOR operation in the world of coding, not the power operation (which I guess you are assuming).
I don't know which problem you are talking about, but if it's finding the only unique element in an array (given every other element occurs twice),
then the logic behind solving it is:
a XOR a equals 0
a XOR 0 equals a
So if we XOR all the elements present in the array, the elements occurring twice cancel out to 0.
The only remaining element is XORed with 0, and hence we get that element.
The answer to your second query is that we pass the array by reference whenever we want to modify it.
PS: I am also new to programming. I hope I answered your queries.

Lua: Storing a logical operator in a variable?

I can't find anything about this through Google so I have to ask here. I want to do something like this (very pseudo code):
y = first_value
x={op_1 = >, op_2 = <, c = some_value}
if first_value x.op_1 x.c then
...
end
What that code says to me is: if first_value is greater than x's c value, then do something. Now, I know I could set op_1 and op_2 to some value to differentiate between them and then compare values using separate if statements, but I would like to minimize the number of if statements used.
I was just wondering if something like this is possible, maybe even in a different form. Thanks in advance!
Not this way; an operator is a specific symbol that is part of the syntax. However, you can represent an operation using a function:
y = first_value
x = {op_1 = function(a,b) return a > b end, op_2 = function(a,b) return a < b end, c = some_value}
if x.op_1(first_value, x.c) then
...
end
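For comparison, the same function-in-a-table idea works in Python, where the built-in operator module already provides the comparison functions (the values below are hypothetical):

```python
import operator

first_value = 10
some_value = 5

# store comparison *functions* in a dict, since operators
# themselves are syntax and cannot be stored in a variable
x = {"op_1": operator.gt, "op_2": operator.lt, "c": some_value}

if x["op_1"](first_value, x["c"]):
    print("first_value is greater than x's c value")
```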

Cython replacing list of tuples by C equivalent

I am trying to speed up an already very optimized function in Cython whose inputs and outputs are lists of two-element tuples of doubles. To do that I need to have it all in pure C first.
In Python (partially Cythonized), the tuple syntax resembles this:
def f(inputList1, inputList2):
    cdef double p, i10, i11, i20, i21, a0, a1
    a0, a1 = inputList1[-1]
    for i10, i11 in inputList1:
        outputList = []
        # some more computations involving a0 and a1
        for i20, i21 in inputList2:
            p = f(i10, i11, a0, a1, i21, i20)  # f returns a double
            if p == 0:
                outputList.append((i10, i21))
            elif p > 0:
                outputList.append(g(i10, i21, i20, i21))  # g returns two outputs that are packed into a two-element tuple automatically
        if len(outputList) < 1:
            return []
        # some more computations
        a0, a1 = i10, i11
    return outputList
The details are not relevant; what is important is the speed and the syntax: it uses tuple unpacking in the for loops to break the tuples apart, appends whole tuples using append(), fetches the last element of a list, and can return an empty list. It also converts the two outputs of g to a tuple.
I am trying to change all the Python to pure C, to either increase speed or at least not hurt it; in my mind it should be doable (maybe I am wrong?). Since g has to become pure C, g has to return one object (I guess multiple outputs are out of the question?).
My first idea was to use std::vector and std::pair: lists of tuples become vector[pair[double,double]], and I modified g to return a pair[double,double] instead of two doubles:
cdef vector[pair[double,double]] f(vector[pair[double,double]] inputList1, vector[pair[double,double]] inputList2):
    cdef double p, i10, i11, i20, i21, a0, a1
    # I have to add the pairs as I cannot use the "for a, b in" syntax
    cdef pair[double,double] i1, i2, e
    cdef vector[pair[double,double]] outputList, emptyList
    a0, a1 = inputList1.back()
    for i1 in inputList1:
        i10 = i1.first
        i11 = i1.second
        outputList = emptyList
        # some more computations involving a0 and a1
        for i2 in inputList2:
            i20 = i2.first
            i21 = i2.second
            p = f(i10, i11, a0, a1, i21, i20)  # f returns a double
            if p == 0.:
                outputList.push_back(i1)  # I am now using push_back and not append
            elif p > 0.:
                outputList.push_back(g(i10, i21, i20, i21))  # g now returns a pair
        if outputList.size() < 1:
            return outputList
        # some more computations
        a0, a1 = i10, i11
    return outputList
Everything is pure C, but it is 3 times slower!
I also tried std::list as list[list[double]], and I lose speed by a factor of 3 there as well! I use i1.back() and i1.front() instead of first and second; I guess that costs speed too. What is the reason for this? Is there a better C object to use? Is it the syntax I am using? Is explicitly writing i20 = i2.first and so on what makes it so slow?
Especially the syntax of g now seems really silly; maybe the bottleneck comes from there:
cdef pair[double,double] g(double a, double b, double c, double d):
    # looks ugly that I have to define res
    cdef pair[double,double] res
    cdef double res_1, res_2
    res_1 = computations1(a, b, c, d)
    res_2 = computations2(a, b, c, d)
    # looks ugly
    res.first = res_1
    res.second = res_2
    return res
instead of simply returning res_1, res_2 as before.
EDIT: I redid everything to benchmark the different solutions:
-it turns out list[list[double]] is not an option for me, because later in my code I need to access specific elements of the list through indexing
-vector[np.ndarray[DTYPE, ndim=1]] does not work; I guess you cannot form a vector of Python objects
-vector[pair[double,double]] is actually indeed faster than the Python list version!
For 100000 iterations, the totals are:
1.10413002968 s for the Python version
0.781275987625 s for the vector[pair[double,double]] C version.
It still looks very ugly, and I still want to hear whether this is the right approach.

Replace x if cond in Mata

I want to overwrite some elements of a vector x with a scalar a based on a 0/1 vector cond. In pseudocode: x[cond]=a.
x[cond] is not the right way of subsetting in Mata. I should use select(x,cond). Unfortunately, this latter object cannot be assigned to.
x[selectindex(cond)] = a fails because such an assignment requires the same dimensions on both sides of the =.
I could modify the latter approach to
x[selectindex(cond)] = J(sum(cond),1,a)
Is that the idiom in Mata? I was expecting something more straightforward because Stata has nice replace x = a if cond syntax.
In the general case, I think that's about as good as you're going to get. sum(cond) is safe if cond contains only 0s and 1s, but a more general alternative is:
select = selectindex(cond)
x[select] = J(length(select), 1, a)
I agree that this is not the simplest syntax. An additional assignment colon operator := would be nice here.
If x and cond are views, st_store() is another option:
st_store(., st_viewvars(x), st_viewvars(cond), J(sum(cond), 1, a))
If you already know the variable names/indices and don't have to call st_viewvars(), all the better.
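For comparison, the direct analogue of Stata's replace x = a if cond does exist in NumPy, where a boolean mask can be assigned to; a minimal sketch with made-up values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
cond = np.array([0, 1, 0, 1])  # 0/1 indicator vector
a = 9.0

# assigning through a boolean mask broadcasts the scalar
# into the selected slots, no J(...) equivalent needed
x[cond == 1] = a
print(x)  # -> [1. 9. 3. 9.]
```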

libsvm : C++ vs. MATLAB : What's With The Different Accuracies?

I have two multi-class data sets with 5 labels, one for training, and the other for cross validation. These data sets are stored as .csv files, so they act as a control in this experiment.
I have a C++ wrapper for libsvm, and the MATLAB functions for libsvm.
For both C++ and MATLAB:
Using a C-type SVM with an RBF kernel, I iterate over 2 lists of C and Gamma values. For each parameter combination, I train on the training data set and then predict the cross validation data set. I store the accuracy of the prediction in a 2D map which correlates to the C and Gamma value which yielded the accuracy.
I've recreated different training and cross validation data sets many, many times. Each time, the C++ and MATLAB accuracies are different; sometimes by a lot! Mostly MATLAB produces higher accuracies, but sometimes the C++ implementation is better.
What could be accounting for these differences? The C/Gamma values I'm trying are the same, as are the remaining SVM parameters (default).
There should be no significant differences, as both the C and MATLAB code use the same svm.c file. So what can be the reason?
-an implementation error in your code(s); this is unfortunately the most probable one
-the wrapper you use has some bug and/or uses a different version of libsvm than your MATLAB code (libsvm is written in pure C and comes with Python, MATLAB and Java wrappers, so your C++ wrapper is "not official"), or your wrapper assumes some additional default values which are not defaults in the C/MATLAB/Python/Java implementations
-you perform cross validation in a somewhat randomized form (shuffling the data and then folding, which is completely correct and reasonable, but will lead to different results in two different runs)
-there is some rounding/conversion performed while loading data from .csv in one (or both) of your codes, which leads to inconsistencies (really not likely to happen, yet still possible)
I trained an SVC using scikit-learn (sklearn.svm.SVC) within a Python Jupyter notebook. I wanted to use the trained classifier in MATLAB v2022a and C++. I needed to verify that all three versions' predictions matched for each implementation of the kernel, decision, and prediction functions. I found some useful guidance from bcorso's implementation of the original libsvm C++ code.
Exporting the structure that represents the trained model is explained in bcorso's post and is required to call his prediction function implementation:
predict(params, sv, nv, a, b, cs, X)
for it to match sklearn's version for a trained classifier instance, clf:
clf.predict(X)
Once I established this match, I created MATLAB versions of bcorso's kernel,
function [k] = kernel_svm(params, sv, X)
    k = zeros(1, length(sv));
    if strcmp(params.kernel, 'linear')
        for i = 1:length(sv)
            k(i) = dot(sv(i,:), X);
        end
    elseif strcmp(params.kernel, 'rbf')
        for i = 1:length(sv)
            k(i) = exp(-params.gamma * dot(sv(i,:) - X, sv(i,:) - X));
        end
    else
        uiwait(msgbox('kernel not defined', 'Error', 'modal'));
    end
    k = k';
end
decision,
function [d] = decision_svm(params, sv, nv, a, b, X)
    %% calculate the kernels
    kvalue = kernel_svm(params, sv, X);
    %% define the start and end index for support vectors for each class
    nr_class = length(nv);
    start = zeros(1, nr_class);
    start(1) = 1;
    %% First Class Loop
    for i = 1:(nr_class-1)
        start(i+1) = start(i) + nv(i) - 1;
    end
    %% Other Classes Nested Loops
    for i = 1:nr_class
        for j = i+1:nr_class
            sum = 0;
            si = start(i); % first class start
            sj = start(j); % first class end
            ci = nv(i) + 1; % next class start
            cj = ci + nv(j) - 1; % next class end
            for k = si:sj
                sum = sum + a(k) * kvalue(k);
            end
            sum1 = sum;
            sum = 0;
            for k = ci:cj
                sum = sum + a(k) * kvalue(k);
            end
            sum2 = sum;
        end
    end
    %% Add class sums and the intercept
    sumd = sum1 + sum2;
    d = -(sumd + b);
end
and predict functions.
function [class, classIndex] = predict_svm(params, sv, nv, a, b, cs, X)
    dec_value = decision_svm(params, sv, nv, a, b, X);
    if dec_value <= 0
        class = cs(1);
        classIndex = 1;
    else
        class = cs(2);
        classIndex = 0;
    end
end
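A minimal Python/NumPy sketch of the same binary-case RBF decision computation, mirroring the structure of the MATLAB functions above; the support vectors, dual coefficients, intercept, and gamma below are hypothetical toy values, not an exported model:

```python
import numpy as np

def rbf_kernel_row(sv, X, gamma):
    # k[i] = exp(-gamma * ||sv_i - X||^2), one value per support vector
    diffs = sv - X
    return np.exp(-gamma * np.sum(diffs * diffs, axis=1))

def decision_binary(sv, a, b, X, gamma):
    # binary-case decision value: sum_i a_i * k_i + b
    k = rbf_kernel_row(sv, X, gamma)
    return float(np.dot(a, k) + b)

# toy model: two support vectors with dual coefficients (made-up values)
sv = np.array([[0.0, 0.0], [1.0, 1.0]])
a = np.array([-1.0, 1.0])
b = 0.0
X = np.array([0.9, 0.9])

d = decision_binary(sv, a, b, X, gamma=0.5)
label = 1 if d > 0 else 0  # X is close to the second support vector
```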
Translating the Python comprehension syntax to a MATLAB/C++ equivalent of the summations required nested for loops in the decision function.
It is also necessary to account for MATLAB indexing (base 1) vs. Python/C++ indexing (base 0).
The trained classifier model is conveyed by params, sv, nv, a, b, cs, which can be gathered into a structure after having exported the sv and a matrices as .csv files from the Python notebook. I simply created a wrapper MATLAB function svcInfo that builds the structure:
svcStruct = svcInfo();
params = svcStruct.params;
sv = svcStruct.sv;
nv = svcStruct.nv;
a = svcStruct.a;
b = svcStruct.b;
cs = svcStruct.cs;
Or one can save the structure contents as a MATLAB workspace in a .mat file.
The new case for prediction is provided as a vector X,
%Classifier input feature vector
X=[x1 x2...xn];
A simplified C++ implementation that follows bcorso's Python version is fairly similar to this MATLAB implementation, in that it uses the nested for loops within the decision function, but with zero-based indexing.
Once tested, I may expand this post with the C++ version of the MATLAB code shared above.