determining whether two convex hulls overlap - linear-programming

I'm trying to find an efficient algorithm for determining whether two convex hulls intersect or not. The hulls consist of data points in N-dimensional space, where N is 3 up to 10 or so. One elegant algorithm was suggested here using linprog from scipy, but you have to loop over all points in one hull, and it turns out the algorithm is very slow for low dimensions (I tried it and so did one of the respondents). It seems to me the algorithm could be generalized to answer the question I am posting here, and I found what I think is a solution here. The authors say that the general linear programming problem takes the form Ax + tp >= 1, where the A matrix contains the points of both hulls, t is some constant >= 0, and p = [1,1,1,1...1] (it's equivalent to finding a solution to Ax > 0 for some x). As I am new to linprog() it isn't clear to me whether it can handle problems of this form. If A_ub is defined as on page 1 of the paper, then what is b_ub?

There is a nice explanation of how to do this problem, with an algorithm in R, on this website. My original post referred to the scipy.optimize.linprog library, but this proved to be insufficiently robust. I found that the SCS algorithm in the cvxpy library worked very nicely, and based on this I came up with the following python code:
import numpy as np
import cvxpy as cvxpy
# Determine feasibility of Ax <= b
# cloud1 and cloud2 should be numpy.ndarrays
def clouds_overlap(cloud1, cloud2):
# build the A matrix
cloud12 = np.vstack((-cloud1, cloud2))
vec_ones = np.r_[np.ones((len(cloud1),1)), -np.ones((len(cloud2),1))]
A = np.r_['1', cloud12, vec_ones]
# make b vector
ntot = len(cloud1) + len(cloud2)
b = -np.ones(ntot)
# define the x variable and the equation to be solved
x = cvxpy.Variable(A.shape[1])
constraints = [A*x <= b]
# since we're only determining feasibility there is no minimization
# so just set the objective function to a constant
obj = cvxpy.Minimize(0)
# SCS was the most accurate/robust of the non-commercial solvers
# for my application
problem = cvxpy.Problem(obj, constraints)
problem.solve(solver=cvxpy.SCS)
# Any 'inaccurate' status indicates ambiguity, so you can
# return True or False as you please
if problem.status == 'infeasible' or problem.status.endswith('inaccurate'):
return True
else:
return False
cube = np.array([[1,1,1],[1,1,-1],[1,-1,1],[1,-1,-1],[-1,1,1],[-1,1,-1],[-1,-1,1],[-1,-1,-1]])
inside = np.array([[0.49,0.0,0.0]])
outside = np.array([[1.01,0,0]])
print("Clouds overlap?", clouds_overlap(cube, inside))
print("Clouds overlap?", clouds_overlap(cube, outside))
# Clouds overlap? True
# Clouds overlap? False
The area of numerical instability is when the two clouds just touch, or are arbitrarily close to touching such that it isn't possible to definitively say whether they overlap or not. That is one of the cases where you will see this algorithm report an 'inaccurate' status. In my code I chose to consider such cases overlapping, but since it is ambiguous you can decide for yourself what to do.

Related

Sparse index use for optimization with Pyomo model

I have a Pyomo model connected to a Django-created website.
My decision variable has 4 indices and I have a huge amount of constraints running on it.
Since Pyomo takes a ton of time to read in the constraints with so many variables, I want to sparse out the index set to only contain variables that actually could be 1 (i have some conditions on that)
I saw this post
Create a variable with sparse index in pyomo
and tried a for loop for all my conditions. I created a set "AllowedVariables" to later put this inside my constraints.
But Django's server takes so long to create this set while performing the system check, it never comes out.
Currently i have this model:
model = AbstractModel()
model.x = Var(model.K, model.L, model.F, model.Z, domain=Boolean)
def ObjRule(model):
# some rule, sense maximize
model.Obj = pyomo.environ.Objective(rule=ObjRule, sense=maximize)
def ARule(model,l):
maxA = sum(model.x[k,l,f,z] * for k in model.K for f in model.F
for z in model.Z and (k,l,f,z) in model.AllowedVariables)
return maxA <= 1
model.maxA = Constraint(model.L, rule=ARule)
The constraint is exemplary, I have 15 more similar ones. I currently create "AllowedVariables" this way:
AllowedVariables = []
for k in model.K:
for l in model.L:
..... check all sorts of conditions, break if not valid
AllowedVaraibles.append((k,l,f,z))
model.AllowedVariables = Set(initialize=AllowedVariables)
Using this, the Django server starts checking....and never stops
performing system checks...
Sadly, I somehow need some restriction on the variables or else the reading for the solver will take way to long since the constraints contain so many unnecessary variables that have to be 0 anyways.
Any ideas on how I can sparse my variable set?

How to Use MCMC with a Custom Log-Probability and Solve for a Matrix

The code is in PyMC3, but this is a general problem. I want to find which matrix (combination of variables) gives me the highest probability. Taking the mean of the trace of each element is meaningless because they depend on each other.
Here is a simple case; the code uses a vector rather than a matrix for simplicity. The goal is to find a vector of length 2, where the each value is between 0 and 1, so that the sum is 1.
import numpy as np
import theano
import theano.tensor as tt
import pymc3 as mc
# define a theano Op for our likelihood function
class LogLike_Matrix(tt.Op):
itypes = [tt.dvector] # expects a vector of parameter values when called
otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)
def __init__(self, loglike):
self.likelihood = loglike # the log-p function
def perform(self, node, inputs, outputs):
# the method that is used when calling the Op
theta, = inputs # this will contain my variables
# call the log-likelihood function
logl = self.likelihood(theta)
outputs[0][0] = np.array(logl) # output the log-likelihood
def logLikelihood_Matrix(data):
"""
We want sum(data) = 1
"""
p = 1-np.abs(np.sum(data)-1)
return np.log(p)
logl_matrix = LogLike_Matrix(logLikelihood_Matrix)
# use PyMC3 to sampler from log-likelihood
with mc.Model():
"""
Data will be sampled randomly with uniform distribution
because the log-p doesn't work on it
"""
data_matrix = mc.Uniform('data_matrix', shape=(2), lower=0.0, upper=1.0)
# convert m and c to a tensor vector
theta = tt.as_tensor_variable(data_matrix)
# use a DensityDist (use a lamdba function to "call" the Op)
mc.DensityDist('likelihood_matrix', lambda v: logl_matrix(v), observed={'v': theta})
trace_matrix = mc.sample(5000, tune=100, discard_tuned_samples=True)
If you only want the highest likelihood parameter values, then you want the Maximum A Posteriori (MAP) estimate, which can be obtained using pymc3.find_MAP() (see starting.py for method details). If you expect a multimodal posterior, then you will likely need to run this repeatedly with different initializations and select the one that obtains the largest logp value, but that still only increases the chances of finding the global optimum, though cannot guarantee it.
It should be noted that at high parameter dimensions, the MAP estimate is usually not part of the typical set, i.e., it is not representative of typical parameter values that would lead to the observed data. Michael Betancourt discusses this in A Conceptual Introduction to Hamiltonian Monte Carlo. The fully Bayesian approach is to use posterior predictive distributions, which effectively averages over all the high-likelihood parameter configurations rather than using a single point estimate for parameters.

keras custom loss function

I'm new to Keras framework and I want to implement the following loss function of
Root Mean Squared Logarithmic Error
Here is my code for Keras with tensorflow backend
def loss_function(y_true, y_pred):
ones = K.ones(shape=K.shape(y_pred).shape)
y_pred = tf.add(y_pred,ones)
y_true = tf.add(y_true,ones)
val = K.sqrt(K.mean(K.sum(K.log(y_pred)-K.log(y_true))))
return val
But I end up getting the following error:
ValueError: Error when checking input: expected dense_1_input to have shape (None, 16) but got array with shape (1312779, 11)
with the val returned to be 0.
The order of your operations is inverted.
Since "log(true) - log(pred)" can be either negative or positive (the result may be a little higher or a little lower than the expected), the square is the first thing that must happen. (The square is responsible for eliminating the negative signs).
And the mean is the last one (the most external), because you want first to compute the error for each element, and only after that you get the mean of the error. (The mean function already carries the sum function in it).
So:
def loss_function(y_true, y_pred):
y_pred = y_pred + 1
y_true = y_true + 1
return K.mean(K.square(K.log(y_pred)-K.log(y_true)))
Please note that this does not carry the "root" part. If you want to add it, I'd say that the root should go before the mean (different from the formula in the picture)
I'd use this instead:
return K.mean(K.sqrt(K.square(K.log(y_pred)-K.log(y_true))))
Make sure that your model ends with an activation that outputs numbers greater or equal to zero:
Relu is ok
Sigmoid is ok
Softmax is ok
Other activations may have negative values and will bring errors with log:
linear is not ok
tanh is not ok

Difficulty Understanding TensorFlow Computations

I'm new to TensorFlow and have difficulty understanding how the computations works. I could not find the answer to my question on the web.
For the following piece of code, the last time I print "d" in the for loop of the "train_neural_net()" function, I'm expecting the values to be identical to when I print "test_distance.eval". But they are way different. Can anyone tell me why this is happening? Isn't TensorFlow supposed to cache the Variable results learned in the for loop and use them when I run "test_distance.eval"?
def neural_network_model1(data):
nn1_hidden_1_layer = {'weights': tf.Variable(tf.random_normal([5, n_nodes_hl1])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl1]))}
nn1_hidden_2_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl2]))}
nn1_output_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl2, vector_size])), 'biasses': tf.Variable(tf.random_normal([vector_size]))}
nn1_l1 = tf.add(tf.matmul(data, nn1_hidden_1_layer["weights"]), nn1_hidden_1_layer["biasses"])
nn1_l1 = tf.sigmoid(nn1_l1)
nn1_l2 = tf.add(tf.matmul(nn1_l1, nn1_hidden_2_layer["weights"]), nn1_hidden_2_layer["biasses"])
nn1_l2 = tf.sigmoid(nn1_l2)
nn1_output = tf.add(tf.matmul(nn1_l2, nn1_output_layer["weights"]), nn1_output_layer["biasses"])
return nn1_output
def neural_network_model2(data):
nn2_hidden_1_layer = {'weights': tf.Variable(tf.random_normal([5, n_nodes_hl1])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl1]))}
nn2_hidden_2_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])), 'biasses': tf.Variable(tf.random_normal([n_nodes_hl2]))}
nn2_output_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl2, vector_size])), 'biasses': tf.Variable(tf.random_normal([vector_size]))}
nn2_l1 = tf.add(tf.matmul(data, nn2_hidden_1_layer["weights"]), nn2_hidden_1_layer["biasses"])
nn2_l1 = tf.sigmoid(nn2_l1)
nn2_l2 = tf.add(tf.matmul(nn2_l1, nn2_hidden_2_layer["weights"]), nn2_hidden_2_layer["biasses"])
nn2_l2 = tf.sigmoid(nn2_l2)
nn2_output = tf.add(tf.matmul(nn2_l2, nn2_output_layer["weights"]), nn2_output_layer["biasses"])
return nn2_output
def train_neural_net():
prediction1 = neural_network_model1(x1)
prediction2 = neural_network_model2(x2)
distance = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(prediction1, prediction2)), reduction_indices=1))
cost = tf.reduce_mean(tf.multiply(y, distance))
optimizer = tf.train.AdamOptimizer().minimize(cost)
hm_epochs = 500
test_result1 = neural_network_model1(x3)
test_result2 = neural_network_model2(x4)
test_distance = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(test_result1, test_result2)), reduction_indices=1))
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(hm_epochs):
_, d = sess.run([optimizer, distance], feed_dict = {x1: train_x1, x2: train_x2, y: train_y})
print("Epoch", epoch, "distance", d)
print("test distance", test_distance.eval({x3: train_x1, x4: train_x2}))
train_neural_net()
Each time you call the functions neural_network_model1() or neural_network_model2(), you create a new set of variables, so there are four sets of variables in total.
The call to sess.run(tf.global_variables_initializer()) initializes all four sets of variables.
When you train in the for loop, you only update the first two sets of variables, created with these lines:
prediction1 = neural_network_model1(x1)
prediction2 = neural_network_model2(x2)
When you evaluate with test_distance.eval(), the tensor test_distance depends only on the variables that were created in the last two sets of variables, which were created with these lines:
test_result1 = neural_network_model1(x3)
test_result2 = neural_network_model2(x4)
These variables were never updated in the training loop, so the evaluation results will be based on the random initial values.
TensorFlow does include some code for sharing weights between multiple calls to the same function, using with tf.variable_scope(...): blocks. For more information on how to use these, see the tutorial on variables and sharing on the TensorFlow website.
You don't need to define two function for generating models, you can use tf.name_scope, and pass a model name to the function to use it as a prefix for variable declaration. On the other hand, you defined two variables for distance, first is distance and second is test_distance . But your model will learn from train data to minimize cost which is only related to first distance variable. Therefore, test_distance is never used and the model which is related to it, will never learn anything! Again there is no need for two distance functions. You only need one. When you want to calculate train distance, you should feed it with train data and when you want to calculate test distance you should feed it with test data.
Anyway, if you want second distance to work, you should declare another optimizer for it and also you have to learn it as you have done for first one. Also you should consider the fact that models are learning base on their initial values and training data. Even if you feed both models with exactly same training batches, you can't expect to have exactly similar characteristics models since initial values for weights are different and this could cause falling into different local minimum of error surface. At the end notice that whenever you call neural_network_model1 or neural_network_model2 you will generate new weights and biases, because tf.Variable is generating new variables for you.

Getting significance level, alpha, from KS test results?

I am trying to find the significance level/alpha level (to eventually get the confidence level) of my Kolmogorov-Smirnov test results and I feel like I'm going crazy because this doesn't seem explained well enough anywhere (in a way that I understand.)
I have sample data that I want to see if it comes from one of four probability distribution functions: Cauchy, Gaussian, Students t, and Laplace. (I am not doing a two-sample test.)
Here is sample code for Cauchy:
### Cauchy Distribution Function
data = [-1.058, 1.326, -4.045, 1.466, -3.069, 0.1747, 0.6305, 5.194, 0.1024, 1.376, -5.989, 1.024, 2.252, -1.451, -5.041, 1.542, -3.224, 1.389, -2.339, 4.073, -1.336, 1.081, -2.573, 3.788, 2.26, -0.6905, 0.9064, -0.7214, -0.3471, -1.152, 1.904, 2.082, -2.471, 0.6434, -1.709, -1.125, -1.607, -1.059, -1.238, 6.042, 0.08664, 2.69, 1.013, -0.7654, 2.552, 0.7851, 0.5365, 4.351, 0.9444, -2.056, 0.9638, -2.64, 1.165, -1.103, -1.624, -1.082, 3.615, 1.709, 2.945, -5.029, -3.57, 0.6126, -2.88, 0.4868, 0.4222, -0.2062, -1.337, -0.326, -2.784, 6.724, -0.1316, 4.681, 6.839, -1.987, -5.372, 1.522, -2.347, 0.4531, -1.154, -3.631, 0.426, -4.271, 1.687, -1.612, -1.438, 0.8777, 0.06759, 0.6114, -1.296, 0.07865, -1.104, -1.454, -1.62, -1.755, 0.7868, -3.312, 1.054, -2.183, -7.066, -0.04661, 1.612, 1.441, -1.768, -0.2443, -0.7033, -1.16, 0.2529, 0.2441, -1.962, 0.568, 1.568, 8.385, 0.7192, -1.084, 0.9035, 3.376, -0.7172, -0.1221, 3.267, 0.4064, -0.4894, -2.001, 1.63, -2.891, 0.6244, 2.381, -1.037, -1.705, -0.5223, -0.2912, 1.77, -3.792, 0.1716, 4.121, -0.9119, -0.1166, 5.694, -5.904, 0.5485, -2.788, 2.582, -1.553, 1.95, 3.886, 1.066, -0.475, 0.5701, -0.9367, -2.728, 4.588, -5.544, 1.373, 1.807, 2.919, 0.8946, 0.6329, -1.34, -0.6154, 4.005, 0.204, -1.201, -4.912, -4.766, 0.0554, 3.484, -2.819, -5.131, 2.108, -1.037, 1.603, 2.027, 0.3066, -0.3446, -1.833, -2.54, 2.828, 4.763, 0.9926, 2.504, -1.258, 0.4298, 2.536, -1.214, -3.932, 1.536, 0.03379, -3.839, 4.788, 0.04021, -0.2701, -2.139, 0.1339, 1.795, -2.12, 5.558, 0.8838, 1.895, 0.1073, 2.011, -1.267, -1.08, -1.12, -1.916, 1.524, -1.883, 5.348, 0.115, -1.059, -0.4772, 1.02, -0.4057, 1.822, 4.011, -3.246, -7.868, 2.445, 2.271, 0.5377, 0.2612, 0.7397, -1.059, 1.177, 2.706, -4.805, -0.7552, -4.43, -0.4607, 1.536, -4.653, -0.5952, 0.8115, -0.4434, 1.042, 1.179, -0.1524, 0.2753, -1.986, -2.377, -1.21, 2.543, -2.632, -2.037, 4.011, 1.98, -2.589, -4.9, 1.671, -0.2153, -6.109, 2.497]
def C(data):
stuff = []
# vary gamma
for scale in xrange(1, 101, 1):
ks_statistic, pvalue = ss.kstest(data, "cauchy", args=(scale,))
stuff.append((ks_statistic, pvalue, scale))
bestks = min(c[0] for c in stuff)
bestrow = [row for row in stuff if row[0] == bestks]
return bestrow
I am trying to fit this function to my data, and to return the scale parameter (gamma) that corresponds to the highest probability of being fit with a Cauchy Distribution. The corresponding ks-statistic and p-value also get returned. I thought that this would be done by finding the minimum ks-statistic, which would be the curve that yields the smallest distance between any given data point and distribution-curve point. I realize that I need, though, to find "alpha" so that I can find my probability that the sample data is from a Cauchy Distribution, with the specified scale/gamma value I found.
I have referenced many sources trying to explain how to find "alpha", but I have no clue how to do this in my code.
Thank you for any help and insight!
I think this question is actually outside the range of SO because it involves statistics. You would probably be better to answer on, say, Cross Validation. However, let me offer one or two remarks.
The K-S is used for testing whether a given set of data has arisen from a given, fully specified distribution function. (Even for this purpose it might not be optimal.) It's not intended, as far as I know, as a measure of fit amongst alternatives.
To make inferences about probabilities one must have a viable probability model for the data in the first place. In this case, what is the space of alternatives and how are probabilities assigned to them under the null and alternative hypotheses?
Now, to get that unhelpful comment that I offered. Thanks for being so tactful about it! This is what I was trying to express.
You try scales from 1 to 100 in unit steps. I wanted to point out that scales less than one produce curious results. Now I see some close fits, which is especially true when p-values are considered; there's nothing to tell them apart from that for scale=2. Here's a plot.
Each triple gives (scale, K-S, p).
The main thing might be, what do you want from your data?