is there a built in method to detect outliers? - c++

When I used MATLAB, I relied on the filloutliers function. I was wondering if there is something similar in C++.
In other words, I want to know if there is a built-in method in some library that detects outliers in a data set and replaces them.

No, there's no built-in standard library facility which does that. Numerical analysis is not a focus or a strong point of C++, though of course there are numerical analysis libraries available out there (available via a Google search). Note that Matlab's method is a very particular one: there's no precise and universal definition of an "outlier" (some would say there's no such thing as an outlier). So expect to have to come up with your own opinion of how to classify a point as an outlier.
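Since the classification rule is yours to choose, here is a minimal sketch of one common choice: flag points that lie more than three scaled median absolute deviations (MAD) from the median, and replace them with the median. This is roughly the rule MATLAB documents as its "median" detection method, but the function names `median` and `fill_outliers_with_median`, the default threshold and the choice of fill value are assumptions made for illustration, not a port of filloutliers.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a non-empty data set. Takes a copy, so the caller's data is not reordered.
double median(std::vector<double> v) {
    const std::size_t n = v.size();
    std::sort(v.begin(), v.end());
    return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Replaces every point farther than `threshold` scaled MADs from the median
// with the median itself. Returns the number of replaced points.
// The factor 1.4826 makes the MAD comparable to a standard deviation
// for normally distributed data.
std::size_t fill_outliers_with_median(std::vector<double>& data,
                                      double threshold = 3.0) {
    if (data.empty()) return 0;

    const double med = median(data);

    std::vector<double> deviations(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
        deviations[i] = std::fabs(data[i] - med);
    const double scaled_mad = 1.4826 * median(deviations);

    std::size_t replaced = 0;
    for (double& x : data) {
        if (std::fabs(x - med) > threshold * scaled_mad) {
            x = med;
            ++replaced;
        }
    }
    return replaced;
}
```

Calling `fill_outliers_with_median` on `{1, 2, 3, 4, 100}` replaces only the 100. Note that if more than half of the points are identical, the MAD is zero and every differing point is flagged, so you may want a different rule for such data.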

Related

Performing Multivariate Linear Regression in C++

I am looking for a way to perform a (medium-scale*) multivariate linear regression (ordinary least squares, OLS) in C++. Say C++11 using the standard library, and Boost if helpful; further libraries are fine too if they are easy to install. Using e.g. std::vector for inputs and outputs would be most convenient.
Performance is not top priority, though of course appreciated.
What does a reasonable algorithm look like? Are there any good tools readily available to help with it?
Various implementations of univariate regression algorithms can readily be found, e.g. here, but I have not yet found a multivariate version. Also, this question essentially asks for such a regression, though it has been closed due to lack of focus, even if its title would be the ideal one.
*Medium-scale: in my case, say, a few tens of thousands of observations and up to a hundred explanatory variables/attributes (e.g. time dummies).
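One possible baseline at this scale, using only the standard library, is to form the normal equations (X'X) b = X'y and solve them by Gaussian elimination with partial pivoting. The sketch below assumes one observation per row of X and that you add a column of ones yourself if you want an intercept; the name `ols_fit` and the singularity tolerance are made up for illustration. For badly conditioned data, a QR-based solver from a linear algebra library is generally the safer choice.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Ordinary least squares via the normal equations (X'X) b = X'y.
// X holds one observation per row; y holds the matching responses.
std::vector<double> ols_fit(const std::vector<std::vector<double>>& X,
                            const std::vector<double>& y) {
    const std::size_t n = X.size();
    if (n == 0 || y.size() != n) throw std::invalid_argument("size mismatch");
    const std::size_t p = X[0].size();

    // Augmented system [X'X | X'y]: p rows, p + 1 columns.
    std::vector<std::vector<double>> A(p, std::vector<double>(p + 1, 0.0));
    for (std::size_t r = 0; r < n; ++r) {
        if (X[r].size() != p) throw std::invalid_argument("ragged design matrix");
        for (std::size_t i = 0; i < p; ++i) {
            for (std::size_t j = 0; j < p; ++j) A[i][j] += X[r][i] * X[r][j];
            A[i][p] += X[r][i] * y[r];
        }
    }

    // Forward elimination with partial pivoting.
    for (std::size_t col = 0; col < p; ++col) {
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < p; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[pivot][col])) pivot = r;
        if (std::fabs(A[pivot][col]) < 1e-12)
            throw std::runtime_error("X'X is singular (collinear columns?)");
        std::swap(A[col], A[pivot]);
        for (std::size_t r = col + 1; r < p; ++r) {
            const double f = A[r][col] / A[col][col];
            for (std::size_t c = col; c <= p; ++c) A[r][c] -= f * A[col][c];
        }
    }

    // Back substitution.
    std::vector<double> beta(p);
    for (std::size_t i = p; i-- > 0;) {
        double s = A[i][p];
        for (std::size_t j = i + 1; j < p; ++j) s -= A[i][j] * beta[j];
        beta[i] = s / A[i][i];
    }
    return beta;
}
```

With a hundred explanatory variables the system to solve is only 100 x 100, so most of the time goes into accumulating X'X, which is a single pass over the observations.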

Does tensorflow c++ API support automatic differentiation for backpropagation?

Does the TensorFlow C++ API support automatic differentiation to back-propagate the gradient?
If I write a graph in C++ and would like to run it in C++ code (not in Python!), will automatic differentiation work?
Let's suppose every op in the graph has a gradient implementation.
I think the documentation regarding what the TensorFlow C++ API can and can't do is very poor.
Thank you very much for the help
Technically it can, but AFAIK the automatic differentiation is only "configured" in Python. What I mean by this is that, at a lower level, a TensorFlow operation does not itself declare what its gradient is (that is, the corresponding operation that computes its gradient). That is instead declared at the Python level. For example, you can take a look at math_ops.py. You will see that, among other things, there are several functions decorated with @ops.RegisterGradient(...). What this decorator does is add that function to a global registry (in Python) of operations and their gradients. So, for example, optimizer classes are largely implemented in Python, since they make use of this registry to build the backpropagation computation (as opposed to making use of native TensorFlow primitives to that end, which do not exist).
So the point is that you can do the same computations using the same ops (which are then implemented with the same kernels), but I don't think that C++ has (or will ever have) such a gradient registry (and optimizer classes), so you would need to work out or copy that backpropagation construction by yourself. In general, the C++ API is not well suited to building the computation graph.
Now a different question (and maybe this was what you were asking about in the first place) is whether you can run an already existing graph that does backpropagation in C++. By this I mean building a computation graph in Python, creating an optimizer (which in turn creates the necessary operations in the graph to compute the gradient and update the variables) and exporting the graph, then loading that graph in C++ and running it. That is entirely possible and no different to running any other kind of thing in TensorFlow C++.

What are the main rates and values we should figure to evaluate both feature detection, description and matching?

I work on palmprint recognition using the features2d module of the OpenCV library, and I use algorithms such as SIFT, SURF, ORB... to detect features and extract/match descriptors. My tests include (1 vs 1) palmprint matching and also (1 vs database) palmprint matching.
Once I get the result, I need to evaluate the algorithm, and for this I know that there are some rates or scores (like EER, rank-1 identification, recall and accuracy) which give an estimate of how successful the method was. Now I need to know if any of those rates are implemented in OpenCV, and how to use them. If they aren't, what are the formulas used in the literature?
As far as I know there is little implemented in OpenCV. A common way is to store the results (e.g. in JSON) and process those with other programs such as Matlab or Python. This also allows you to change the evaluation without the need to recompute the algorithms.
There is no overall best method to show the results. It always depends on what you want to show. In my opinion ROC is the best way to express your output. It is also very widely used in research.
If you insist on doing it in C++, then you could use:
Roceasy or
DLIB
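If pulling in a library is not an option, the equal error rate (EER) mentioned in the question can also be computed by hand from two lists of match scores. The sketch below assumes similarity scores where higher means a better match (for example, the number of good descriptor matches), with "genuine" scores coming from same-palm comparisons and "impostor" scores from different-palm comparisons. At a threshold t, the false acceptance rate is the fraction of impostor scores >= t and the false rejection rate is the fraction of genuine scores < t; the EER is read off where the two are closest. The names `EerResult` and `equal_error_rate` are made up for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct EerResult {
    double eer;        // average of FAR and FRR at the chosen threshold
    double threshold;  // score threshold where FAR and FRR are closest
};

// Sweeps every observed score as a candidate threshold and keeps the one
// where the false acceptance and false rejection rates are closest.
EerResult equal_error_rate(const std::vector<double>& genuine,
                           const std::vector<double>& impostor) {
    if (genuine.empty() || impostor.empty())
        throw std::invalid_argument("both score lists must be non-empty");

    std::vector<double> thresholds(genuine);
    thresholds.insert(thresholds.end(), impostor.begin(), impostor.end());
    std::sort(thresholds.begin(), thresholds.end());

    EerResult best = {1.0, thresholds.front()};
    double best_gap = 2.0;  // larger than any possible |FAR - FRR|
    for (double t : thresholds) {
        const double accepted_impostors = static_cast<double>(std::count_if(
            impostor.begin(), impostor.end(), [t](double s) { return s >= t; }));
        const double rejected_genuines = static_cast<double>(std::count_if(
            genuine.begin(), genuine.end(), [t](double s) { return s < t; }));
        const double far = accepted_impostors / impostor.size();
        const double frr = rejected_genuines / genuine.size();
        const double gap = std::fabs(far - frr);
        if (gap < best_gap) {
            best_gap = gap;
            best.eer = 0.5 * (far + frr);
            best.threshold = t;
        }
    }
    return best;
}
```

The (FAR, 1 - FRR) pairs visited in the loop are also the points of the ROC curve, so the same sweep can be used to dump data for plotting.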

Student's T-distribution in C++

I need to use the equivalent of Excel's TINV function in a C++ code with no statistics library linked to it.
The problem is I don't know the maths behind Student's law.
Do you think it will be reasonable to reimplement this function from scratch without using a statistics library?
I don't have access to C++11; if I did, I would use std::student_t_distribution.
If yes, please provide me references to code it.
If no, do you know a lightweight library that provides it?
Thank you.
Boost has a math library with statistics functions. Here is an example on how to use it for the student's t-test
http://www.boost.org/doc/libs/1_43_0/libs/math/doc/sf_and_dist/html/math_toolkit/dist/stat_tut/weg/st_eg/two_sample_students_t.html
Given the lack of this tool, you may have to write one. I'm already assuming that you looked and could not find one. The math isn't that bad though. It's just testing to see if two observed distributions have the same mean. www.r-tutor.com has a good tutorial on this distribution. Math World shows the deeper context. Happy hunting.
It's too much work without using a statistics library.
What I am gonna do is generate numeric values in Excel for the range of values I need and copy them into an array in my code.
Hardcore style.
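A minimal sketch of that table approach, written to compile without C++11. It hard-codes the standard two-tailed critical values for a probability of 0.05 and 1 to 10 degrees of freedom, rounded to three decimals, which is what TINV(0.05, df) returns. The names `kTinv05`, `kMaxDf` and `tinv_05` are made up for illustration, and you would need one such table per probability you care about.

```cpp
#include <cstddef>
#include <stdexcept>

// Two-tailed critical values of Student's t for probability 0.05,
// degrees of freedom 1..10, rounded to three decimals.
static const double kTinv05[] = {
    12.706, 4.303, 3.182, 2.776, 2.571,
    2.447,  2.365, 2.306, 2.262, 2.228
};
static const std::size_t kMaxDf = sizeof(kTinv05) / sizeof(kTinv05[0]);

// Looks up the tabulated value for `df` degrees of freedom.
// Throws if `df` falls outside the tabulated range.
double tinv_05(std::size_t df) {
    if (df < 1 || df > kMaxDf)
        throw std::out_of_range("df outside tabulated range");
    return kTinv05[df - 1];
}
```

For example, `tinv_05(4)` returns 2.776. Extending the range only means pasting more values into the array.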

symbolic computation in C++

I need to do analytical integration in C++. For example, I need to integrate expressions like exp[I(x-y)], where I is the imaginary unit.
How can I do this in C++?
I tried GiNaC but it can just integrate polynomials. I also tried SymbolicC++. It can integrate functions like sine, cosine or exp(x) and ln(x), but it is not very powerful. For example, it can not integrate x*ln(x) which can be easily obtained by use of Mathematica or by integration by parts.
Are there any other tools or libraries which are able to do symbolic computation like analytical integration in C++?
If you need to do symbolic integration, then you're probably not going to get anything faster than running it in Mathematica or Maxima; they're already highly optimised. So unless your equations have a very specific form that you can exploit in a way that Mathematica or Maxima cannot, you're probably out of luck -- and at the very least you're not going to get that kind of custom manipulation from an off-the-shelf library.
You may be justified in writing your own code to get a speed boost if you need numerical solutions (I know that I did when generating numerical solutions to PDEs).
The other C++ libraries I am aware of that do symbolic computation are
SymEngine (https://github.com/symengine/symengine)
Piranha (https://github.com/bluescarni/piranha)
If I am not mistaken, SymEngine does not yet support integration; however, Piranha does. The documentation for Piranha is somewhat limited at the moment and is under development, but you can see the integration function here. Note that the second link uses the syntax for the Python wrapper Piranha. However, Piranha "is a computer-algebra library for the symbolic manipulation of sparse multivariate polynomials and other closely-related symbolic objects (such as Poisson series)", so I do not think it can integrate the particular functions in which you may be interested.
Though it is not C++, you may also be interested in SymPy for Python, which can perform some of the more complicated symbolic integration you may be interested in. The documentation for SymPy's integrate is here.
A couple of days ago, I was searching for a symbolic math library like SymPy for C++, because I was bedazzled by C++'s speed compared to Python and most other programming languages.
I found the Vienna Math Library, an awesome library with very modern syntax and, to the best of my knowledge, SymPy's features. This library also has an integral function that can be used for your problem.
It was good enough for solving the IK (inverse kinematics) of a 3-degree-of-freedom articulated manipulator.