I need to perform some polygon computations on a 2D plane, typically the isInside operation.
I found the boost::Polygon API, but my points are stored in a single big array.
That's what I call indexed geometry.
See http://www.opengl-tutorial.org/intermediate-tutorials/tutorial-9-vbo-indexing/
So my best approach would be to use boost::Polygon and give it my array plus the indices of the points to use.
The objective is to avoid copying my millions of points (because each is shared by at least two polygons).
I don't know if the API allows it (or whether I need to write my own class :-( ).
Maybe someone knows another API (inside Boost or elsewhere).
Thanks
Documentation
within demo: https://www.boost.org/doc/libs/1_68_0/libs/geometry/doc/html/geometry/reference/algorithms/within/within_2.html
Boost Geometry allows for adapted user-defined data types.
Specifically, C arrays are adapted here: https://www.boost.org/doc/libs/1_68_0/boost/geometry/geometries/adapted/c_array.hpp
I have another answer up where I show how to use Boost Geometry algorithms on a direct C array of structs (in that case I type-punned using tuple as the point type): How to calculate the convex hull with boost from arrays instead of setting each point separately? (The other answers there show alternatives that may be easier if you can afford to copy some data.)
The relevant algorithms would be:
https://www.boost.org/doc/libs/1_68_0/libs/geometry/doc/html/geometry/reference/algorithms/within.html
https://www.boost.org/doc/libs/1_68_0/libs/geometry/doc/html/geometry/reference/algorithms/disjoint.html
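For illustration, here is a minimal sketch of that adaptation (the polygon contents are placeholders): after registering the C-array point type, a plain double[2] taken from your own storage can be passed directly to within(), with no Boost point object constructed. Adapting a whole indexed ring over your big array would follow the same concept-registration route (e.g. BOOST_GEOMETRY_REGISTER_RING on a custom view type), but that depends on your exact layout.

#include <iostream>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/adapted/c_array.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>

namespace bg = boost::geometry;

// Register raw C arrays as Cartesian points for Boost.Geometry
BOOST_GEOMETRY_REGISTER_C_ARRAY_CS(cs::cartesian)

int main() {
    bg::model::polygon<bg::model::d2::point_xy<double>> poly;
    bg::read_wkt("POLYGON((0 0,0 4,4 4,4 0,0 0))", poly);

    double p[2] = {2.0, 2.0};  // a point living in your own array storage
    std::cout << std::boolalpha << bg::within(p, poly) << '\n';  // true
}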
Related
I'm new to C++, and I think a good way for me to jump in is to build some basic models that I've built in other languages. I want to start with just linear regression solved using first-order methods. So here's how I want things to be organized (in pseudocode):
class LinearRegression:
    tol = <a supplied tolerance, defaulted to 1e-5>
    max_iter = <a supplied max iterations, defaulted to 1000>

    fit(X, y):
        // model learns weights specific to this data set

    _gradient(X, y):
        // compute the gradient

    score(X, y):
        // model uses weights learned from fit to compute accuracy of
        // y_predicted against actual y
My question is: when I use the fit, score, and gradient methods, I don't actually need to pass around the arrays (X and y) or even store them anywhere, so I want to use a reference or a pointer to those structures. My problem is that if a method accepts a pointer to a 2D array, I need to supply the second dimension's size ahead of time or use templates. If I use templates, I now have something like this for every method that accepts a 2D array:
template <std::size_t rows, std::size_t cols>
void fit(double (&X)[rows][cols], double (&y)[rows]) { ... }
It seems there is likely a better way. I want my regression class to work with input of any size. How is this done in industry? I know that in some situations the array is just flattened into row- or column-major format, where just a pointer to the first element is passed, but I don't have enough experience to know what people actually use in C++.
You raised quite a few points in your question, so here are some points addressing them:
Contemporary C++ discourages working directly with heap-allocated data that you need to manually allocate or deallocate. You can use, e.g., std::vector<double> to represent vectors, and std::vector<std::vector<double>> to represent matrices. Even better would be to use a matrix class, preferably one that is already in mainstream use.
Once you use such a class, you can easily get the dimension at runtime. With std::vector, for example, you can use the size() method. Other classes have other methods. Check the documentation for the one you choose.
You probably really don't want to use templates for the dimensions.
a. If you do so, you will need to recompile whenever the dimensions change, and your code will be duplicated (by the compiler) for every distinct set of dimensions you use simultaneously. Lots of bad stuff, with little gain (in this case). There's no real drawback to getting the dimensions at runtime from the class.
b. Templates (in your setting) are fitting for the element type of the matrix (e.g., is it a matrix of doubles or floats), or possibly the number of dimensions (e.g., for specifying tensors).
Your regressor doesn't need to store the matrix and/or vector: pass them by const reference. Your interface looks like that of sklearn; if you like, check the source code there. Calling fit just causes the object to store the parameters corresponding to the prediction vector β. It doesn't copy or store the input matrix and/or vector.
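To make that concrete, here is a minimal sketch of such an interface (the names are illustrative and the method bodies are elided, not a full implementation): matrices as std::vector<std::vector<double>>, inputs passed by const reference, dimensions read at runtime, and only the learned weights stored on the object.

#include <cstddef>
#include <vector>

class LinearRegression {
public:
    explicit LinearRegression(double tol = 1e-5, std::size_t max_iter = 1000)
        : tol_(tol), max_iter_(max_iter) {}

    // Learn weights for this data set; X and y are borrowed, never stored.
    void fit(const std::vector<std::vector<double>>& X,
             const std::vector<double>& y) {
        const std::size_t n_features = X.empty() ? 0 : X[0].size();
        weights_.assign(n_features, 0.0);
        // ... gradient-descent loop here, using tol_ and max_iter_ ...
    }

    // Score predictions made with the learned weights against the actual y.
    double score(const std::vector<std::vector<double>>& X,
                 const std::vector<double>& y) const {
        // ... compute a metric such as R^2 from predictions vs. y ...
        return 0.0;
    }

private:
    double tol_;
    std::size_t max_iter_;
    std::vector<double> weights_;  // the only state kept after fit()
};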
I have a dataset of custom abstract objects and a custom distance function. Are there any good SVM libraries that allow me to train on my custom objects (not 2D points) with my custom distance function?
I searched the answers to this similar Stack Overflow question, but none of them allows custom objects and distance functions.
First things first.
SVM does not work on distance functions; it only accepts dot products. So your distance function (actually a similarity; usually 1 - distance is a similarity) has to:
be symmetric: s(a,b) = s(b,a)
be positive definite: s(a,a) >= 0, and s(a,a) = 0 <=> a = 0
be linear in the first argument: s(ka,b) = k s(a,b) and s(a+b,c) = s(a,c) + s(b,c)
This can be tricky to check, as you are actually asking "is there a function phi from my objects to some vector space such that s(phi(x), phi(y)) is a dot product?", which leads to the definition of a so-called kernel, K(x,y) = s(phi(x), phi(y)). If your objects are themselves elements of a vector space, then sometimes it is enough to put phi(x) = x, so K = s, but this is not true in general.
Once you have this kind of similarity, nearly any SVM library (for example libSVM) works when provided with the Gram matrix, which is simply defined as
G_ij = K(x_i, x_j)
and thus requires O(N^2) memory and time. Consequently, it does not matter what your objects are, as SVM only works on pairwise dot products, nothing more.
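As a small illustration (a sketch only; the object type T and the kernel K are placeholders for your own), computing the Gram matrix for arbitrary objects looks like this. You then hand G to the library in whatever precomputed-kernel format it expects (libSVM, for instance, has kernel_type = PRECOMPUTED).

#include <cstddef>
#include <functional>
#include <vector>

// Build the Gram matrix G_ij = K(x_i, x_j) for arbitrary objects of type T
// and a user-supplied kernel K: O(N^2) entries, exploiting symmetry.
template <typename T>
std::vector<std::vector<double>>
gram_matrix(const std::vector<T>& xs,
            const std::function<double(const T&, const T&)>& K) {
    const std::size_t n = xs.size();
    std::vector<std::vector<double>> G(n, std::vector<double>(n));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i; j < n; ++j)
            G[i][j] = G[j][i] = K(xs[i], xs[j]);  // kernels are symmetric
    return G;
}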
If you lack the appropriate mathematical tools to show these properties, what can be done is to look into kernel learning from similarity. These methods are able to create a valid kernel which behaves similarly to your similarity function.
Check out the following:
MLPack: a lightweight library that provides lots of functionality.
DLib: a very popular toolkit that is used both in industry and academia.
Apart from these, you can also use Python packages, calling them from C++.
Can anyone help me understand the differences among ITK 4.8 data objects? What are the differences between a Vector Image, a CovariantVector Image, and Spatial Objects?
The itk::VectorImage class is merely an image where each image pixel has a list of values associated with it rather than a single intensity value.
I am not aware of any itk::CovariantVectorImage class or a similar class.
The itk::Vector class represents a mathematical vector with magnitude and direction, with operators and methods for vector addition, scalar multiplication, the inner product of two vectors, getting a vector's norm, and so forth. You can also apply linear transformations to them using methods of itk::AffineTransform, mainly the TransformVector() method. This is not related to C++'s std::vector container, which is really a dynamic array data structure.
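A small sketch of those operations, assuming an ITK 4.x installation (the values are arbitrary):

#include <iostream>
#include "itkVector.h"

int main() {
    using VectorType = itk::Vector<double, 3>;

    VectorType u, v;
    u[0] = 1.0; u[1] = 2.0; u[2] = 3.0;
    v[0] = 4.0; v[1] = 5.0; v[2] = 6.0;

    VectorType sum = u + v;       // vector addition
    VectorType scaled = u * 2.0;  // scalar multiplication
    double dot = u * v;           // inner product of two vectors
    double norm = u.GetNorm();    // Euclidean norm

    std::cout << sum << " " << scaled << " " << dot << " " << norm << "\n";
    return 0;
}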
The itk::CovariantVector class is similar to itk::Vector, except that it represents a covector rather than a vector. Covectors represent (n-1)-dimensional hyperplanes (2D planes in the case of 3D space), and so their components transform in the opposite way that a vector's components do. itk::AffineTransform's TransformCovariantVector() method will transform an itk::CovariantVector object according to covariant transformation laws.
The itk::SpatialObject class allows you to create objects that exist in physical n-dimensional space, such as boxes, ellipses, tubes, planes, and cylinders, and to relate these objects through parent-child relationships. You can read Chapter 5 of the ITK Software Guide for more information on this topic.
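For instance, a minimal sketch using the ITK 4.x interface (the radius and query point are arbitrary):

#include <iostream>
#include "itkEllipseSpatialObject.h"

int main() {
    // A sphere of radius 5 centered at the origin of 3D physical space.
    using EllipseType = itk::EllipseSpatialObject<3>;
    EllipseType::Pointer ellipse = EllipseType::New();
    ellipse->SetRadius(5.0);

    EllipseType::PointType point;
    point[0] = 1.0; point[1] = 2.0; point[2] = 3.0;

    std::cout << std::boolalpha << ellipse->IsInside(point) << "\n";  // true
    return 0;
}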
I want to use the DE-9IM to speed up calls to point-within-polygon, where the polygon may be queried many times. I know that the DE-9IM has this functionality, but I can't seem to figure out how the class in Boost even works (geometry/strategies/intersection_result.hpp). Does anyone know if this class is actually functional, and if so, can they provide a simple example of a query for a polygon containing a point?
EDIT: I'm comparing the Boost geometry library to JTS, which has a prepared-geometry class. At this point I'm not 100% sure that use of the DE-9IM is what allows the precomputation, but I am still wondering if Boost has this functionality in it.
I'm not entirely sure what the problem is exactly.
DE-9IM is a model used to describe the spatial relationships of geometrical objects. See http://en.wikipedia.org/wiki/DE-9IM for more info.
I assume you're looking for a way to represent points and polygons and to check whether one is within the other. If this is the case then yes, Boost.Geometry supports that and much more. For instance, to check if a point is within a polygon you may use:
boost::geometry::model::point<> to represent a Point
boost::geometry::model::polygon<> to represent a Polygon
boost::geometry::within() function to check the spatial relationship
You can find more info in the docs: http://www.boost.org/libs/geometry
E.g., at the bottom of this page:
http://www.boost.org/doc/libs/1_55_0/libs/geometry/doc/html/geometry/reference/algorithms/within/within_2.html
you can find an example showing how to create a point, load a polygon from a WKT string, and check whether one is within the other.
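That documentation example boils down to something like this minimal sketch (the coordinates are arbitrary; Boost.Geometry polygons default to clockwise, closed rings):

#include <iostream>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>

namespace bg = boost::geometry;

int main() {
    using point = bg::model::d2::point_xy<double>;

    bg::model::polygon<point> poly;
    bg::read_wkt("POLYGON((0 0,0 10,10 10,10 0,0 0))", poly);

    point p(5.0, 5.0);
    std::cout << std::boolalpha << bg::within(p, poly) << '\n';  // true
}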
I've created my own Matrix class where, inside the class, the matrix data is stored in an STL vector. I've noticed that, looking around the web, some people work with a vector of vectors to represent the matrix data. My best guess tells me that as long as the matrix is small or skinny (row_num >> column_num) the difference should be small, but what about if the matrix is square or fat (row_num << column_num)? If I were to create a very large matrix, would I see a difference at run time? Are there other factors that need to be considered?
Thanks
Have you considered using an off-the-shelf matrix representation, such as Boost's, instead of reinventing the wheel?
If you have a lot of empty rows, for example, using the nested representation could save a lot of space. Unless you have specific information from actual use cases showing that one way isn't meeting your requirements, code whichever way is easiest to implement and maintain properly.
There are too many variables to answer your question.
Create an abstraction so that your code does not care how the matrix is represented. Then write your code using any implementation. Then profile it.
If your matrix is dense, the "vector of vectors" representation is very unlikely to be faster than a single big memory block and could be slower (chasing two pointers for random access, plus worse locality).
If your matrices are large and sparse, the right answer to your question is probably "neither".
So create an abstract interface, code something up, and profile it. (And as #Mark says, there are lots of third-party libraries you should probably consider.)
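For concreteness, here is a minimal sketch of the "single big memory block" option behind such an abstraction (the class name and interface are illustrative): a dense matrix stored row-major in one std::vector, with 2D indexing done by arithmetic.

#include <cstddef>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Row-major indexing: element (r, c) lives at offset r * cols_ + c.
    double& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;  // one contiguous block: cache-friendly traversal
};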
If you store everything in a single vector, an iterator will traverse the entire matrix. If you use a vector of vectors, an iterator will only traverse a single dimension.