Magnetic moments relative to crystal axes? - pymatgen

I am using pymatgen to write .mcif files. My structures always have collinear magnetic moments of magnitude 4 along z, but before writing them to the file I noticed that pymatgen transforms them with the function Magmom.get_moment_relative_to_crystal_axes(). My questions are:
What exactly does this function do?
Why do my magnetic moments stay (0, 0, 4) for some lattices, while for others I get three non-zero components whose magnitudes are no longer equal to 4?

I'm actually responsible for that code so I hope I can answer your question :)
In brief, there are a few things to bear in mind:
• Assuming that by "collinear magnetic moments always along z" you mean these are scalar collinear moments, e.g. from a DFT calculation, it's worth bearing in mind that the current .mcif standard isn't really designed for representing scalar moments, which is why we arbitrarily choose the z axis. This is by convention, however, and has no physical meaning. I am told a future version of the .mcif standard will support scalar moments.
• We typically present magnetic moments in terms of the crystallographic lattice vectors, because this is usually more scientifically meaningful. This means we need a conversion from the Cartesian x, y, z basis into the lattice's a, b, c basis. This is what the Magmom.get_moment_relative_to_crystal_axes() method does.
• The magnitude of the resulting moment should be the same; if it is not, this is a bug (please share an example if you have one!). However, note that the lattice basis might not be orthogonal, which can make the math a bit trickier; see the sketch after this list.
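For illustration, here is a minimal NumPy sketch of the basis change (this is not pymatgen's actual implementation, and the lattice vectors are made up): it expresses a Cartesian (0, 0, 4) moment in a non-orthogonal a, b, c basis and shows that the magnitude is preserved when you transform back.

```python
import numpy as np

# Made-up lattice: rows are the a, b, c lattice vectors; c is tilted
# away from the Cartesian z axis, so the basis is non-orthogonal.
lattice = np.array([
    [4.0, 0.0, 0.0],   # a
    [0.0, 4.0, 0.0],   # b
    [1.0, 1.0, 4.0],   # c
])
moment_cartesian = np.array([0.0, 0.0, 4.0])

# Find coefficients (m_a, m_b, m_c) such that
# m_a * a + m_b * b + m_c * c == moment_cartesian.
coeffs = np.linalg.solve(lattice.T, moment_cartesian)
print(coeffs)  # three non-zero components: [-0.25 -0.25  1.  ]

# Reconstructing the Cartesian vector recovers the magnitude of 4.
print(np.linalg.norm(coeffs @ lattice))  # 4.0
```

This is why the individual components can look unfamiliar for some lattices: the coefficients live in a non-orthogonal basis, and only the reconstructed Cartesian vector has to have magnitude 4.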
Regarding questions about pymatgen generally: note that we also have a Google group, the Materials Project has a forum too, and in general we try to be responsive (I'm not sure anyone is currently monitoring Stack Overflow, however). You're also more than welcome to email me directly. If you do find a bug, please report it on the pymatgen GitHub Issues page, and we'll try to fix it as soon as possible.


Best way to feature select using PCA (discussion)

Terminology:
Component: a principal component (PC)
loading_score[i, j]: the loading of feature j in PC[i]
Question:
I know that questions about feature selection have been asked several times here on Stack Overflow (SO) and on other technical pages, with a variety of answers and discussion proposed. That is why I want to open a discussion about the different solutions rather than posting a general question, since that has already been done.
Different methods have been proposed for feature selection using PCA. For instance, one can use the dot product between the original features and the components (here) to get their correlation; a discussion at SO here suggests that you can only talk about important features as loading scores within a component (and not use that importance in the input space); and another discussion at SO (which I cannot find at the moment) suggests that the importance of feature[j] would be sum(abs(loading_score[:, j])), i.e. the sum of the absolute values of loading_score[i, j] over all components i.
I personally think that one way to get the importance of a feature would be an absolute sum in which each loading_score[i, j] is weighted by the explained variance of component i, i.e.
imp_feature[j] = sum_i(abs(loading_score[i, j]) * explained_variance[i])
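For concreteness, here is a short sketch of how the proposed score could be computed with scikit-learn's PCA (the data here is random, just to show the shapes involved):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # placeholder data

pca = PCA().fit(X)
loading_scores = pca.components_               # shape (n_components, n_features)
explained_variance = pca.explained_variance_ratio_

# imp_feature[j] = sum_i abs(loading_score[i, j]) * explained_variance[i]
imp_feature = np.abs(loading_scores).T @ explained_variance
print(imp_feature)  # one importance value per original feature
```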
Well, there is no universal way to select features; it depends entirely on the dataset and whatever insights are available about it. I will provide some examples which might be helpful.
Since you asked about PCA: it decomposes the whole dataset into components ordered by how much of the variance each one explains. ICA (Independent Component Analysis), on the other hand, is able to extract multiple independent features simultaneously. Look at this example:
In this example, we mix three independent signals and try to separate them using ICA and PCA. In this case, ICA does a better job than PCA. In general, if you search for Blind Source Separation (BSS) you will find more information about this. Besides, in this example we know the independent components, so separation is easy; in general we do not know the number of components, so you may have to guess based on some prior information about the dataset. You may also use LDA (Linear Discriminant Analysis) to reduce the number of features. A rough sketch of the mixing experiment follows below.
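Here is a sketch of that experiment with scikit-learn (the signals and mixing matrix are made up for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# Three independent source signals.
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t),             # sinusoid
          np.sign(np.sin(3 * t)),    # square wave
          2 * (t % 1) - 1]           # sawtooth

# Mix them with an arbitrary mixing matrix.
A = np.array([[1.0, 1.0, 1.0],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])
X = S @ A.T                          # observed mixtures

# Try to recover the sources with ICA and with PCA.
sources_ica = FastICA(n_components=3, random_state=0).fit_transform(X)
sources_pca = PCA(n_components=3).fit_transform(X)
# ICA typically recovers the original waveforms (up to sign/order);
# PCA only finds orthogonal directions of maximal variance.
```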
Once you have extracted components using any of these techniques, you can visualize them in the following way, treating the extracted components as random variables, i.e., x, y, z.
For more information you may refer to the original source from which I took the two figures.
Coming back to your proposition,
imp_feature[j] = sum_i(abs(loading_score[i, j]) * explained_variance[i])
I would not recommend this approach, for the following reasons:
When we take absolute values with abs(loading_score[i, j]), we lose the information about whether the considered features are positively or negatively correlated. explained_variance[i] may be used to compare components, but multiplying it by the absolute loadings does not make much sense.
Edit:
In PCA, each component has its explained variance. The explained variance ratio is the ratio between an individual component's variance and the total variance (the sum of all individual components' variances). Feature significance can then be weighed by the magnitude of the explained variance.
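As a small sketch of that definition, using scikit-learn's PCA attributes:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 4))  # placeholder data
pca = PCA().fit(X)

# Explained variance ratio: each component's variance over the total.
ratio = pca.explained_variance_ / pca.explained_variance_.sum()
assert np.allclose(ratio, pca.explained_variance_ratio_)
```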
All in all, what I want to say is that feature selection depends entirely on the dataset and the significance of its features; PCA is just one technique. First understand the properties of the features and the dataset; then try to extract features. Hope this helps. If you can provide us with an exact example, we may be able to offer more insight.

Set-to-Subset point cloud matching

I have two point clouds in 3D coordinates. One is a subset of the other, containing far fewer points. They are at the same scale.
What I need to do is find the translation and rotation between the two. I have looked at the Point Cloud Library, "Iterative Closest Point", and Coherent Point Drift, but these matching approaches both seem to expect the two point sets to contain mostly the same points, rather than one being a small subset of the other.
Can I use either of these, with adjustments? Or is there another algorithm to match a subset point cloud to a set?
Thank you.
Without having access to sample data, it is kind of hard to recommend a specific registration algorithm.
However, I'm pretty excited these days about all the new "data-driven" registration approaches.
From my personal experience, I'm getting awesome registration results using the approach of this recent paper:
https://arxiv.org/abs/1603.08182
which has source code available here:
https://github.com/andyzeng/3dmatch-toolbox
As reported in the paper, it outperforms PCL-descriptor-based registration approaches, and I think it may be suitable for your needs.

Is it possible to specify number of nodes in hidden layer of SVM in OpenCV 3.1.0?

I was just curious to have better control over the outcome of the SVM.
I tried to search the documentation, but couldn't find a function that seems to do this.
One could say that SVM does not have hidden nodes, but this is only partially true.
SVMs were originally called Support Vector Networks (this is what Vapnik himself called them), and they were seen as a kind of neural network with a single hidden layer. Due to the popularity of neural networks at the time, many people to this day use the sigmoid "kernel" even though it is rarely a valid Mercer kernel (only because the NN community was so used to it that they kept using it, despite it having no mathematical justification).
So is an SVM a neural net or not? Yes, it can be seen as a neural network; in fact, many classifiers can be seen through such a prism. However, what makes SVMs really different is the way they are trained and parametrized. In particular, SVMs work with "activation functions" which are valid Mercer kernels (they denote a dot product in some space). Furthermore, the weights of the hidden nodes are equal to the training samples, so you get as many hidden units as you have training examples. During training, the SVM, on its own, reduces the number of hidden units by solving an optimization problem which "prefers" sparse solutions (removal of hidden units), thus ending up with a hidden layer consisting of a subset of the training samples; we call these the support vectors. To underline: this is not the classical view of SVMs, but it is a valid perspective, which might be easier to understand for someone from the NN community.
So can you control this number? Yes and no. No, because the SVM needs all these hidden units to pose a valid optimization problem, and it will remove all redundant ones on its own. Yes, because there is an alternative optimization problem, called nu-SVM, whose nu hyperparameter is a lower bound on the fraction of support vectors, and thus a lower bound on the number of hidden units. You cannot, unfortunately, directly specify the upper bound.
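As a quick sketch of that with scikit-learn's NuSVC (the question is about OpenCV, whose ml::SVM has an analogous NU_SVC type, but the idea is the same):

```python
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC

X, y = make_classification(n_samples=300, random_state=0)  # toy data

# nu lower-bounds the fraction of support vectors ("hidden units"):
# each fit keeps at least nu * n_samples support vectors.
for nu in (0.1, 0.3, 0.5):
    clf = NuSVC(nu=nu).fit(X, y)
    print(nu, len(clf.support_))  # number of support vectors
```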
But I really need to! If this is the case, you can go with approximate solutions which respect your restriction. You can use an H-dimensional sampler which approximates the kernel space explicitly (http://scikit-learn.org/stable/modules/kernel_approximation.html). One such method is the Nystroem method. In short, if you want "H hidden units", you simply fit a Nystroem model to produce an H-dimensional output, transform your input data through it, and fit a linear SVM on top. From a mathematical perspective, this approximates the true non-linear SVM with the given kernel, although the approximation converges rather slowly.
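A sketch of that pipeline with scikit-learn (H = 50 is an arbitrary choice here):

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

H = 50  # the desired number of "hidden units"
model = make_pipeline(
    # Approximate the RBF kernel space with an explicit H-dimensional map.
    Nystroem(kernel="rbf", gamma=0.1, n_components=H, random_state=0),
    # A linear SVM on top of the approximate feature map.
    LinearSVC(),
)
model.fit(X, y)
print(model.score(X, y))
```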

Assurance of ICP, internal Metrics

So I have an iterative closest point (ICP) algorithm that has been written and will fit a model to a point cloud. As a quick tutorial for those not in the know: ICP is a simple algorithm that fits points to a model, ultimately providing a homogeneous transformation matrix between the model and the points.
Here is a quick picture tutorial.
Step 1. Find the closest point in the model set to your data set:
Step 2: Using a bunch of fun maths (sometimes based on gradient descent or SVD), pull the clouds closer together and repeat until a pose is formed:
[Figure 2]
Now that that bit is simple and working, what I would like help with is:
How do I tell if the pose that I have is a good one?
So currently I have two ideas, but they are kind of hacky:
The number of points used by the ICP algorithm. I.e., if I am fitting to almost no points, I assume that the pose will be bad:
But what if the pose is actually good? It could be, even with few points. I don't want to reject good poses:
So what we see here is that few points can actually give a very good position if they are in the right place.
The other metric I investigated was the ratio of supplied points to used points. Here's an example:
Now we exclude points that are too far away, because they will be outliers. This means we need a good starting position for the ICP to work, but I am OK with that. In the above example the assurance will say NO, this is a bad pose, and it would be right, because the ratio of points included to points supplied is:
2/11 < SOME_THRESHOLD
So that's good, but it will fail in the case shown above where the triangle is upside down. It will say that the upside-down triangle is good because all of the points are used by the ICP.
You don't need to be an expert on ICP to answer this question; I am looking for good ideas. Using knowledge of the points, how can we classify whether a pose solution is good or not?
Using both of these solutions together in tandem is a good suggestion, but it's a pretty lame solution if you ask me; just thresholding feels very dumb.
What are some good ideas for how to do this?
PS. If you want to add some code, please go for it. I am working in C++.
PPS. Could someone help me with tagging this question? I am not sure where it should fall.
One possible approach might be comparing poses by their shapes and their orientation.
Shape comparison can be done with the Hausdorff distance up to isometry, that is, poses have the same shape if
d(I(actual_pose), calculated_pose) < d_threshold
where d_threshold should be found from experiments. As the isometric modifications I(.) I would consider rotations by different angles; that seems to be sufficient in this case. A rough sketch of this check follows below.
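As an illustration, here is a sketch of that check (in Python rather than C++, using SciPy's directed_hausdorff; restricting I(.) to rotations about z is an assumption you may need to relax):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def rot_z(theta):
    # Rotation matrix about the z axis.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def same_shape(actual_pose, calculated_pose, d_threshold, n_angles=36):
    # Both poses are (N, 3) point arrays; sample rotations as I(.).
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        rotated = actual_pose @ rot_z(theta).T
        # Symmetric Hausdorff distance between the two point sets.
        d = max(directed_hausdorff(rotated, calculated_pose)[0],
                directed_hausdorff(calculated_pose, rotated)[0])
        if d < d_threshold:
            return True
    return False
```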
If the poses have the same shape, we should compare their orientation. To compare orientation we could use a somewhat simplified Freksa model. For each pose we should calculate the values
{x_y min, x_y max, x_z min, x_z max, y_z min, y_z max}
and then make sure that each difference between corresponding values for the poses does not exceed another_threshold, derived from experiments as well.
Hopefully this makes some sense, or at least you can draw something useful for your purpose from this.
ICP attempts to minimize the distance between your point-cloud and a model, yes? Wouldn't it make the most sense to evaluate it based on what that distance actually is after execution?
I'm assuming it tries to minimize the sum of squared distances between each point you try to fit and the closest model point. So if you want a metric for quality, why not just normalize that sum by dividing by the number of points being fitted? Yes, outliers will disrupt it somewhat, but they're also going to disrupt your fit somewhat.
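A minimal sketch of that metric (Python with SciPy for brevity, although you're working in C++; a k-d tree finds the nearest model point quickly):

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_fit_quality(model_points, fitted_points):
    # Mean squared distance from each fitted point to its nearest
    # model point; lower is better. Both inputs are (N, 3) arrays.
    tree = cKDTree(model_points)
    dists, _ = tree.query(fitted_points)
    return np.mean(dists ** 2)
```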
It seems like any calculation you can come up with that provides more insight than whatever ICP is minimizing would be more useful incorporated into the algorithm itself, so it can minimize that too. =)
Update
I think I didn't quite understand the algorithm. It seems that it iteratively selects a subset of points, transforms them to minimize error, and then repeats those two steps? In that case your ideal solution selects as many points as possible while keeping error as small as possible.
You said combining the two terms seemed like a weak solution, but it sounds to me like an exact description of what you want, and it captures the two major features of the algorithm (yes?). Evaluating using something like error + B * (1 - selected / total), so that leaving points unused is penalized, seems spiritually similar to how regularization is used to address the overfitting problem with gradient descent (and similar) ML algorithms. Selecting a good value for B would take some experimentation.
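In code, the combined metric could look something like this (hypothetical names; B must be tuned experimentally):

```python
def pose_score(mean_sq_error, n_selected, n_total, B=1.0):
    # Lower is better: penalizes both the residual error and the
    # fraction of points that ICP left unused.
    return mean_sq_error + B * (1.0 - n_selected / n_total)
```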
Looking at your examples, it seems that one of the things that determines whether the match is good or not is the quality of the points. Could you use/calculate a weighting factor in calculating your metric?
For example, you could down-weight points which are co-linear / co-planar, or spatially close, as they probably define the same feature. That would perhaps allow your upside-down triangle to be rejected (the points are in a line, and that's not a great indicator of the overall pose) while the corner case would be OK, as those points roughly define the hull.
Alternatively, maybe the weighting should be based on how distributed the points are around the pose, again trying to ensure good coverage rather than matching small, indistinct features.

Trajectory interpolation and derivative

I'm working on the analysis of a particle's trajectory in a 2D plane. This trajectory typically consists of 5 to 50 (in rare cases more) points with discrete integer coordinates. I have already matched the points of my dataset to form a trajectory (so I have time resolution).
I'd like to perform some analysis of the curvature of this trajectory; unfortunately, the analysis framework I'm using has no support for fitting a trajectory. From what I've heard, one can use splines/Bezier curves to get this done, but I'd like your opinion and/or suggestions on what to use.
As this is only an optional part of my work, I cannot invest a vast amount of time in implementing a solution on my own or understanding a complex framework. The solution has to be as simple as possible.
Let me specify the features I need from a possible library:
- create trajectory from varying number of points
- as the points are discrete, it should interpolate their positions; no need for exact matches of all points as long as the resulting distance between trajectory and point is below a threshold
- it is essential that the library can yield the derivative of the trajectory for any given point
- it would be beneficial if the library could report a quality level (like chiSquare for fits) of the interpolation
EDIT: After reading the comments I'd like to add some more:
It is not necessary that the trajectory exactly match the points. The points are created from the values of a pixel matrix and thus form a discrete grid of coordinates, with a spatial resolution limited by the number of pixels per given distance. Therefore the points (which are placed at the center of the firing pixel) do not (exactly) match the actual trajectory of the particle. Either interpolation or fitting is fine for me, as long as the solution can cope with a trajectory which may, and most probably will, be neither bijective nor injective.
Thus most traditional fitting approaches (like fitting polynomials or exponential functions using a least squares fit) cannot fulfil my criteria.
Additionally, all traditional fitting approaches I have tried yield a function which seems to describe the trajectory quite well, but when looking at its first derivative (or at higher resolution) one finds numerous "micro-oscillations" which (from my interpretation) are a result of fitting non-straight functions to (nearly) straight parts of the trajectory.
Edit 2: There has been some discussion in the comments about what those trajectories may look like. Essentially they may have any shape, length and "curliness", although I try to exclude trajectories which overlap or cross in the previous steps. I have included two examples below; ignore the colored boxes, they're just a representation of the values of the raw pixel matrix. The black, circular dots are the points which I'd like to match to a trajectory; as you can see, they are always centered on the pixels and therefore may only have discrete (integer) values.
Thanks in advance for any help & contribution!
This MIGHT be the way to go
http://alglib.codeplex.com/
From your description I would say that a parametric spline interpolation may suit your requirements. I have not used the above library myself, but it does have support for spline interpolation. Using an interpolant means you will not have to worry about goodness of fit - the curve will pass through every point that you give it.
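For what it's worth, SciPy offers the same facility, and a sketch of parametric spline interpolation with a derivative query might look like this (the sample points are made up):

```python
import numpy as np
from scipy import interpolate

# Toy trajectory points (integer pixel coordinates, as in the question).
x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([0, 1, 0, 1, 0, 1], dtype=float)

# Parametric spline through the points; s=0 interpolates exactly,
# while s > 0 would allow smoothing (and damp micro-oscillations).
tck, u = interpolate.splprep([x, y], s=0)

u_fine = np.linspace(0, 1, 200)
xs, ys = interpolate.splev(u_fine, tck)          # positions on the curve
dx, dy = interpolate.splev(u_fine, tck, der=1)   # derivative w.r.t. u
# The trajectory's slope dy/dx at any parameter value is dy / dx.
```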
If you don't mind using matrix libraries, linear least squares is the easiest solution (look at the end of the General Problem section for the equation to use). You can also use linear/polynomial regression to solve something like this.
Linear least squares will always give the optimal (least-squares) solution, but it doesn't scale well, because the matrix operations involved become expensive. Regression is an iterative heuristic method, so you can just run it until you have a "sufficiently good" answer. I've seen guidelines putting the cutoff at about 1000-10000 dimensions in the data. So, with your dataset, I'd recommend linear least squares, unless you decide to make it highly dimensional for some reason.
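A minimal sketch of the linear least squares route (fitting x(t) and y(t) as cubics so a non-injective path is still handled; the sample data is made up):

```python
import numpy as np

t = np.arange(6, dtype=float)                  # time index of each point
x = np.array([0, 1, 2, 2, 1, 0], dtype=float)  # toy coordinates
y = np.array([0, 0, 1, 2, 3, 3], dtype=float)

A = np.vander(t, 4)                            # columns: t^3, t^2, t, 1
coef_x, *_ = np.linalg.lstsq(A, x, rcond=None)
coef_y, *_ = np.linalg.lstsq(A, y, rcond=None)

# First derivatives of the fitted polynomials at the sample times.
dx = np.polyval(np.polyder(coef_x), t)
dy = np.polyval(np.polyder(coef_y), t)
```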