I have code using the lpsolve package that calculates, in a fairly standard way, the minimal price for a given set of constraints. That works fine.
My next task is to see how the price changes if I exclude or modify one constraint. A slightly higher price could be acceptable if, in exchange, one constraint can be dropped, for example. Is there any way to calculate this sensitivity with respect to the constraints? I understand that the brute-force way is to solve the problem again with one constraint excluded, but that is too computationally expensive. Is there an algorithm or something similar?
I have seen that one can calculate sensitivity with respect to the coefficients in a constraint, but maybe there is a simple, elegant way to do this. Thanks!
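For concreteness, the kind of per-constraint sensitivity asked about here is usually read off the dual values (shadow prices) that an LP solver reports: the dual of a constraint approximates how much the objective changes when that constraint's right-hand side is relaxed by one unit, and a zero dual indicates the constraint is not binding at the optimum. A rough illustration with SciPy's HiGHS backend (the problem data is made up; this is not the asker's lpsolve model):

```python
import numpy as np
from scipy.optimize import linprog

# toy problem: minimize price subject to two demand constraints (illustrative data)
c = np.array([2.0, 3.0])                   # objective coefficients
A_ub = np.array([[-1.0, -1.0],             #  x1 + x2 >= 10, written as <=
                 [-2.0, -1.0]])            # 2*x1 + x2 >= 15, written as <=
b_ub = np.array([-10.0, -15.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
print(res.fun)                  # minimal price
print(res.ineqlin.marginals)    # dual values: per-constraint sensitivity of the price
```

A constraint whose dual value is zero could be dropped without changing the optimal price, which is the kind of cheap check being asked about.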
While searching around the query I am posting here, I found many links that propose a solution but do not mention exactly how it is to be done. I have explored, for example, the following links:
Link 1
Link 2
Link 3
Link 4
etc.
Therefore, I am presenting my understanding of how the Naive Bayes formula with tf-idf can be used here, and it is as follows:
Naive Bayes formula:
P(word|class) = (word_count_in_class + 1) / (total_words_in_class + total_unique_words_in_all_classes), where total_unique_words_in_all_classes is basically the vocabulary of the entire training set.
tf-idf weighting can be employed in the above formula as:
word_count_in_class : sum of(tf-idf_weights of the word for all the documents belonging to that class) //basically replacing the counts with the tfidf weights of the same word calculated for every document within that class.
total_words_in_class : sum of (tf-idf weights of all the words belonging to that class)
total_unique_words_in_all_classes : as is.
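A minimal sketch of this scheme in plain Python (all identifiers are illustrative, not from any library; tfidf_by_class is assumed to be precomputed from the training documents):

```python
import math

def train_tfidf_nb(tfidf_by_class, vocabulary):
    # tfidf_by_class: {class: {word: summed tf-idf weight over that class's documents}}
    log_cond = {}
    vocab_size = len(vocabulary)                     # total_unique_words_in_all_classes
    for cls, weights in tfidf_by_class.items():
        total = sum(weights.values())                # total_words_in_class (as tf-idf mass)
        log_cond[cls] = {w: math.log((weights.get(w, 0.0) + 1.0) / (total + vocab_size))
                         for w in vocabulary}
    return log_cond

def classify(doc_words, log_cond, log_prior):
    # standard Naive Bayes decision rule in log space
    scores = {cls: log_prior[cls] + sum(cond.get(w, 0.0) for w in doc_words)
              for cls, cond in log_cond.items()}
    return max(scores, key=scores.get)
```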
This question has been posted multiple times on Stack Overflow, but nothing substantial has been answered so far. I want to know whether the way I am thinking about the problem, i.e. the implementation shown above, is correct. I need to know this because I am implementing Naive Bayes myself, without the help of any Python library that comes with built-in functions for both Naive Bayes and tf-idf. What I actually want is to improve the accuracy (currently 30%) of the model, which uses a Naive Bayes trained classifier. So, if there are better ways to achieve good accuracy, suggestions are welcome.
Please advise; I am new to this domain.
It would be better if you actually gave us the exact features and class you would like to use, or at least gave an example. Since none of those have been given concretely, I'll just assume the following is your problem:
You have a number of documents, each of which has a number of words.
You would like to classify documents into categories.
Your feature vector consists of all possible words across all documents, and its values are the word counts in each document.
Your Solution
The tf-idf scheme you gave is the following:
word_count_in_class : sum of(tf-idf_weights of the word for all the documents belonging to that class) //basically replacing the counts with the tfidf weights of the same word calculated for every document within that class.
total_words_in_class : sum of (tf-idf weights of all the words belonging to that class)
Your approach sounds reasonable. The probabilities still sum to 1 regardless of the tf-idf weighting, and the features simply reflect tf-idf values instead of raw counts. I would say this looks like a solid way to incorporate tf-idf into NB.
Another potential Solution
It took me a while to wrap my head around this problem, mainly because of having to worry about maintaining probability normalization. Using Gaussian Naive Bayes sidesteps this issue entirely.
If you wanted to use this method:
Compute the mean and variance of the tf-idf values for each class.
Compute the class-conditional likelihood using a Gaussian distribution with the above mean and variance.
Proceed as normal (multiply by the prior) and predict values.
Hand-coding this shouldn't be too hard, since the Gaussian density is trivial to write with numpy. I just prefer this type of generic solution for these types of problems.
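A rough numpy sketch of that Gaussian variant (X is assumed to be an (n_docs, n_features) tf-idf matrix and y an array of class labels; the function names are illustrative):

```python
import numpy as np

def fit_gaussian_nb(X, y):
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0),                 # per-feature mean of tf-idf values
                     Xc.var(axis=0) + 1e-9,           # per-feature variance (smoothed)
                     np.log(len(Xc) / len(X)))        # log prior of the class
    return params

def predict_gaussian_nb(x, params):
    def log_likelihood(mu, var):
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    return max(params, key=lambda c: params[c][2] + log_likelihood(params[c][0], params[c][1]))
```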
Additional methods to increase accuracy
Apart from the above, you could also use the following techniques to increase accuracy:
Preprocessing:
Feature reduction (usually NMF, PCA, or LDA)
Additional features
Algorithm:
Naive Bayes is fast, but it inherently performs worse than other algorithms. It may be better to perform feature reduction and then switch to a discriminative model such as an SVM or logistic regression (see the sketch after this list).
Misc.
Bootstrapping, boosting, etc. Be careful not to overfit though...
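If using a library becomes an option, a minimal scikit-learn baseline along these lines is a common point of comparison (the toy documents and labels below are made up, not from the question):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# toy stand-in data; in practice these would be the real documents and class labels
train_docs   = ["cheap flights to rome", "goal scored in the final minute", "new phone released today"]
train_labels = ["travel", "sport", "tech"]

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(train_docs, train_labels)
print(pipeline.predict(["flights delayed at the airport"]))   # -> predicted class
```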
Hopefully this was helpful. Leave a comment if anything was unclear
P(word|class) = (word_count_in_class + 1) / (total_words_in_class + total_unique_words_in_all_classes), where total_unique_words_in_all_classes is basically the vocabulary of words in the entire training set.
How would this sum up to 1? Using the above conditional probabilities, I assume the sum
P(word1|class) + P(word2|class) + ... + P(wordn|class)
= (total_words_in_class + total_unique_words_in_class) / (total_words_in_class + total_unique_words_in_all_classes)
To correct this, I think P(word|class) should be
(word_count_in_class + 1) / (total_words_in_class + total_unique_words_in_class), i.e. using the vocabulary of words in that class.
Please correct me if I am wrong.
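For reference, whether these estimates normalize depends on which set the sum runs over: summed only over the words that actually occur in the class they come to less than 1, but summed over the full training vocabulary V (which is what the smoothing term counts, with N_c = total_words_in_class) they do sum to 1:

$$\sum_{w \in V} P(w \mid c) = \frac{\sum_{w \in V} \big(\text{count}(w,c) + 1\big)}{N_c + |V|} = \frac{N_c + |V|}{N_c + |V|} = 1.$$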
I think there are two ways to do it:
Round the tf-idf values down to integers, then use the multinomial distribution for the conditional probabilities. See this paper: https://www.cs.waikato.ac.nz/ml/publications/2004/kibriya_et_al_cr.pdf.
Use the Dirichlet distribution, which is a continuous version of the multinomial distribution, for the conditional probabilities.
I am not sure if Gaussian mixture will be better.
I'm using a function (NEQNF manual page here) which I call using
call neqnf(SYSTEM_OF_EQUATIONS, x, xguess=x_GUESS, itmax = 10000)
where SYSTEM_OF_EQUATIONS is the subroutine that contains equations
f(1)=...x(2)...x(1)...
f(2)=...x(1)...x(4)...
f(3)=...x(3)...x(4)...
f(4)=...x(1)...x(5)...
f(5)=...x(1)...x(5)...
from the IMSL Fortran libraries, which lets me solve a non-linear system of five equations in five unknowns. Because there exists more than one solution (more than one set of five numbers, real or complex, that solves my system), how can I choose which set to "use" as the solution?
I link to an online solver with a piece of my system already entered (only two unknowns in two equations; the other variables are constant in this example), which easily shows that there exists more than one solution.
example
To conclude my issue: I have to choose the set of values that lets the other variables be positive, so an easy check is the way to choose among the solutions.
I don't think the question has anything to do with programming, but I will show how I understand the problem.
You supply an initial guess. Then the method just converges to some solution by a modification of a Newton method.
You can choose the root by the placement of the initial guess. However, the convergence pattern can be very unpredictable (even fractal - https://en.wikipedia.org/wiki/Newton_fractal ) and it may be very difficult to choose the particular root using the initial guess.
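As an illustration (in Python with SciPy's Newton-type solver rather than IMSL, and with a made-up two-equation system), different initial guesses land on different roots:

```python
import numpy as np
from scipy.optimize import fsolve

def system(x):
    # toy stand-in system with more than one real solution
    return [x[0]**2 + x[1]**2 - 4.0,   # a circle
            x[0] - x[1]**2]            # a parabola

print(fsolve(system, [1.0,  1.0]))     # converges to the root with positive x[1]
print(fsolve(system, [1.0, -1.0]))     # converges to the root with negative x[1]
```

In practice the root you obtain is selected by where you start the solver, and a post-hoc check (such as the positivity check mentioned in the question) picks among the roots found from several different starting points.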
So I have an iterative closest point (ICP) algorithm that has been written and will fit a model to a point cloud. As a quick tutorial for those not in the know, ICP is a simple algorithm that fits points to a model, ultimately providing a homogeneous transform matrix between the model and the points.
Here is a quick picture tutorial.
Step 1: Find the closest point in the model set to each point in your data set:
Step 2: Using a bunch of fun maths (sometimes based on gradient descent or SVD), pull the clouds closer together and repeat until a pose is formed:
Now that bit is simple and working. What I would like help with is:
How do I tell if the pose that I have is a good one?
So currently I have two ideas, but they are kind of hacky:
The number of points used by the ICP algorithm. I.e., if I am fitting to almost no points, I assume that the pose will be bad:
But what if the pose is actually good? It could be, even with few points. I don't want to reject good poses:
So what we see here is that a few points can actually give a very good pose if they are in the right place.
So the other metric I investigated was the ratio of supplied points to used points. Here's an example:
Now we exclude points that are too far away because they will be outliers. This means we need a good starting position for the ICP to work, but I am OK with that. In the above example the check will say NO, this is a bad pose, and it would be right, because the ratio of points used to points supplied is:
2/11 < SOME_THRESHOLD
So that's good, but it will fail in the case shown above where the triangle is upside down: it will say that the upside-down triangle is good because all of the points are used by ICP.
You don't need to be an expert on ICP to answer this question; I am looking for good ideas. Using knowledge of the points, how can we classify whether a pose solution is good or not?
Using both of these checks together in tandem is a reasonable suggestion, but it feels like a weak solution to me; simply thresholding them is pretty crude.
What are some good ideas for how to do this?
PS. If you want to add some code, please go for it. I am working in C++.
PPS. Someone help me with tagging this question; I am not sure where it should fall.
One possible approach might be comparing poses by their shapes and their orientation.
Shape comparison can be done with the Hausdorff distance up to isometry, that is, poses have the same shape if
d(I(actual_pose), calculated_pose) < d_threshold
where d_threshold should be found from experiments. As the isometric modifications I(·) I would consider rotations by different angles - that seems to be sufficient in this case.
If poses have the same shape, we should compare their orientation. To compare orientation we could use a somewhat simplified Freksa model. For each pose we should calculate the values
{x_y min, x_y max, x_z min, x_z max, y_z min, y_z max}
and then make sure that each difference between corresponding values for the two poses does not exceed another_threshold, derived from experiments as well.
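A rough Python/SciPy sketch of the shape test (the 2-D pose arrays, the rotation set and the threshold are illustrative assumptions, not part of any ICP library):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(a, b):
    # symmetric Hausdorff distance between two point sets
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

def same_shape(actual_pose, calculated_pose, d_threshold,
               angles=np.linspace(0.0, 2.0 * np.pi, 36)):
    # try rotations of the actual pose as the isometric modifications I(.)
    for ang in angles:
        c, s = np.cos(ang), np.sin(ang)
        R = np.array([[c, -s], [s, c]])              # 2-D rotation for simplicity
        if hausdorff(actual_pose @ R.T, calculated_pose) < d_threshold:
            return True
    return False
```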
Hopefully this makes some sense, or at least you can draw something useful for your purpose from this.
ICP attempts to minimize the distance between your point-cloud and a model, yes? Wouldn't it make the most sense to evaluate it based on what that distance actually is after execution?
I'm assuming it tries to minimize the sum of squared distances between each point you try to fit and the closest model point. So if you want a quality metric, why not just normalize that sum by dividing by the number of points being fitted? Yes, outliers will disrupt it somewhat, but they're also going to disrupt your fit somewhat.
It seems like any calculation you can come up with that provides more insight than whatever ICP is minimizing would be more useful incorporated into the algorithm itself, so it can minimize that too. =)
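A small sketch of that metric (Python with SciPy; the rotation R and translation t are whatever the ICP run produced, and the array shapes are assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree

def fit_error(model_pts, data_pts, R, t):
    # mean squared distance from each transformed data point to its nearest model point
    transformed = data_pts @ R.T + t                  # apply the pose ICP returned
    nearest_dist, _ = cKDTree(model_pts).query(transformed)
    return np.mean(nearest_dist ** 2)                 # lower is better; normalized by point count
```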
Update
I think I didn't quite understand the algorithm. It seems that it iteratively selects a subset of points, transforms them to minimize error, and then repeats those two steps? In that case your ideal solution selects as many points as possible while keeping error as small as possible.
You said combining the two terms seemed like a weak solution, but it sounds to me like an exact description of what you want, and it captures the two major features of the algorithm (yes?). Evaluating using something like error + B * (selected / total) seems spiritually similar to how regularization is used to address the overfitting problem with gradient descent (and similar) ML algorithms. Selecting a good value for B would take some experimentation.
Looking at your examples, it seems that one of the things that determines whether the match is good or not, is the quality of the points. Could you use/calculate a weighting factor in calculating your metric?
For example, you could weight down points which are co-linear / co-planar, or spatially close, as they probably define the same feature. That would perhaps allow your upside-down triangle to be rejected (as the points are in a line, and that's not a great indicator of the overall pose), but the corner case would be fine, as the points roughly define the hull.
Alternatively, maybe the weighting should be on how distributed the points are around the pose, again trying to ensure you have good coverage, rather than matching small indistinct features.
I have an integer linear optimisation problem and I'm interested in feasible, good solutions. As far as I know, the GNU Linear Programming Kit, for example, only returns the optimal solution (given that it exists).
This takes endless time and is not exactly what I'm looking for: I would be happy with any good solution, not only the optimal one.
So an LP solver that, for example, stops after some time and returns the best solution it has found so far would do the job.
Is there any such software? It would be great if that software was open source or at least free as in beer.
Alternatively: Is there any other way that usually speeds up Integer LP problems?
Is this the right place to ask?
Many solvers provide a time limit parameter; if you set it, they will stop once the time limit is reached. If an integer feasible solution has been found by then, the solver returns the best feasible solution found up to that point.
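A sketch with Gurobi's Python interface, gurobipy (the model file name is just a placeholder):

```python
import gurobipy as gp

m = gp.read("model.mps")            # placeholder: load your MIP from a file
m.setParam("TimeLimit", 60)         # stop after 60 seconds of wall-clock time
m.optimize()

if m.SolCount > 0:                  # at least one feasible solution was found in time
    print("best objective so far:", m.ObjVal)
```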
As you may know, integer programming is NP-hard, and there is a real art to finding optimal solutions as well as good feasible solutions quickly. To compare the different solvers, see Prof. Hans Mittelmann's Benchmarks for Optimization Software. The MILP benchmarks - particularly MIPLIB2010 and the Feasibility Benchmark should be most relevant.
In addition to selecting a good solver, there are many things that can be done to improve solve times including tuning the parameters of the solver and model reformulation. Many people in research and industry - including myself - spend our careers working on improving the solve times of MIP models, both in general and for specific models.
If you are an academic user, note that the top commercial systems like CPLEX and Gurobi are free for academic use. See the respective websites for details.
Finally, you may want to look at OR-Exchange, a sister site to Stack Overflow that focuses on the field of operations research.
(Disclaimer: I currently work for Gurobi Optimization and formerly worked for ILOG, which provided CPLEX).
If you would like to get a feasible integer solution fast and you don't need the optimal solution, you can try the following (a short example appears after this list):
Increase the relative or absolute gap. Solvers usually default to small gaps, say 0.0001% for the relative gap, meaning the solver keeps searching for MIP solutions until the incumbent is no farther than 0.0001% from the optimal solution. Increase this gap to e.g. 1%; you still get a good solution, but the solver will not spend a long time proving optimality.
Try different values for the solver parameters concerning MIP heuristics.
CPLEX and Gurobi have parameters to control MIP emphasis. This means the solver will put more emphasis on looking for feasible solutions or on proving optimality. Set the emphasis to feasible MIP solutions.
Most solvers like CPLEX, Gurobi, MOPS or GLPK have settings for the gap and the heuristics. MIP emphasis can be set - as far as I know - only in CPLEX and Gurobi.
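A sketch of points 1 and 3 with gurobipy (MIPGap and MIPFocus are real Gurobi parameters; CPLEX has equivalent settings, and the model file name is a placeholder):

```python
import gurobipy as gp

m = gp.read("my_model.lp")          # placeholder model file
m.setParam("MIPGap", 0.01)          # accept any solution within 1% of the best bound
m.setParam("MIPFocus", 1)           # emphasize finding feasible solutions over proving optimality
m.optimize()
```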
A usual approach for solving an ILP is branch-and-bound. This utilizes the solutions of many sub-LPs (the problem without the integrality constraints). The final optimal result is the best of all the sub-LPs, so as soon as at least one integer solution has been found you could stop at any time and would have a best-so-far solution.
One package that can do this is the free lpsolve. Look there at set_timeout for setting a time limit; for an ILP, the solve function can then return the best-so-far value with the status SUBOPTIMAL.
As far as I know CPLEX can. It can return the solution pool, which contains the primal feasible solutions found during the search, and if you set the search emphasis to feasibility rather than optimality, more feasible solutions can be generated. At the end you can just export the pool. You can also use the pool to do a warm start, so it's pretty much up to you. CPLEX is free now, at least in my country, as you can sign up as a researcher.
Could you take into account Microsoft Solver Foundation? The only restriction is the technology stack: you would have to use Microsoft technologies such as C#, VB.NET, etc. Here is an example of how to use it with Excel: http://channel9.msdn.com/posts/Modeling-with-Solver-Foundation-30 .
Regarding your question, it is possible to get solutions that are not fully optimized if you set the efficiency (for example to 85% or 0.85). In the outcome you can see all possible solutions under such a restriction.
I am new to Gecode and to constraint programming in general.
So far I haven't had much trouble picking up Gecode; it's great. But I was wondering what the best way is to implement a "nested" cost function. Specifically, I am looking to minimize X, but within the space of solutions for which X is equal, prefer solutions which minimize Y. I could probably hack it by defining a cost function that looks like X*large_number + Y, but I'd prefer to do this properly if there's a good solution.
If anyone can point me to explain how to implement this in Gecode, that would be really helpful. Thanks!
You can define any kind of optimization criterion using the constrain member of a space in Gecode. See Section 2.5 in Modeling and Programming with Gecode for an example. In your case, the straightforward way would be to add a constrain member that posts a lexicographic-order constraint between the previous best solution and the current space.
That being said, optimizing based on a lexicographic order can in general be wasteful (too much searching). It may often be better to first run a search optimizing the first component (X in your case). After that, re-run the search with the first component's value fixed (X set to the best possible value) and optimize the second value (Y in your case). Iterate as needed for all elements in the cost.
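Not Gecode code, but a tiny Python illustration of that two-pass idea over an enumerated set of feasible (X, Y) solutions (the data is made up):

```python
def lexicographic_best(solutions):
    # solutions: list of (X, Y) pairs, one per feasible solution
    best_x = min(x for x, _ in solutions)                  # pass 1: optimize X
    return min(s for s in solutions if s[0] == best_x)     # pass 2: fix X, optimize Y

print(lexicographic_best([(3, 7), (2, 9), (2, 4), (5, 1)]))   # -> (2, 4)
```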