Principal component analysis on PCL - C++

I am using PCL with C++ and want to perform PCA on an already clustered point cloud (that is, on every individual cluster). The idea is to eliminate all clusters that are too large or too small by measuring their size along their eigenvectors. So the intended algorithm is: get the eigenvectors of every cluster, project the cluster's points onto the corresponding eigenvectors in order to measure the maximum extent of the points along each of these dimensions, and from there eliminate all clusters that have "bad" dimensions or ratios of dimensions. I am having difficulties with the implementation; the more help you can spare, the merrier. This is how my data is organized and what I have started (to be clear, these are snippets; the cluster indices have already been extracted properly, which has been verified):
std::vector<pcl::PointIndices> cluster_indices;
pcl::PCA<pcl::PointXYZ> cpca;
cpca.setInputCloud(input);
Eigen::Matrix3f pca_vectors; // getEigenVectors() returns a 3x3 matrix
// and now iterating with a for loop over the clusters, but having issues using setIndices already
for (std::vector<pcl::PointIndices>::const_iterator it = cluster_indices.begin(); it != cluster_indices.end(); ++it, n++)
{
    cpca.setIndices(it->indices); // not sure what to put in brackets, everything I thought of returns an error!
    pca_vectors = cpca.getEigenVectors();
}

It seems that there is a general issue with this. So I found this solution:
Instead of iterating with iterators, use a plain index into cluster_indices, and then use this formulation:
pcl::PointIndicesPtr pi_ptr(new pcl::PointIndices);
pi_ptr->indices = cluster_indices[i].indices;
// now pi_ptr can be passed to setIndices()
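To tie this together, here is a minimal sketch (not from the original thread) of the per-cluster loop the question describes: it wraps each cluster's indices as shown above, runs pcl::PCA on that cluster, projects the cluster's points into the eigenvector basis with pca.project(), and measures the extent along each principal axis. The thresholds min_extent and max_extent are made-up placeholders.

#include <limits>
#include <vector>
#include <pcl/common/pca.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Sketch: measure each cluster's extent along its principal axes and flag
// clusters whose dimensions fall outside placeholder bounds.
void filterClustersByPcaExtent(const pcl::PointCloud<pcl::PointXYZ>::Ptr& input,
                               const std::vector<pcl::PointIndices>& cluster_indices)
{
    for (std::size_t i = 0; i < cluster_indices.size(); ++i)
    {
        // setIndices() expects a shared pointer, so wrap the indices first.
        pcl::PointIndicesPtr pi_ptr(new pcl::PointIndices);
        pi_ptr->indices = cluster_indices[i].indices;

        pcl::PCA<pcl::PointXYZ> pca;
        pca.setInputCloud(input);
        pca.setIndices(pi_ptr);

        // Project every cluster point into the eigenvector basis and track
        // the min/max coordinate along each principal axis.
        Eigen::Vector3f min_pt = Eigen::Vector3f::Constant(std::numeric_limits<float>::max());
        Eigen::Vector3f max_pt = Eigen::Vector3f::Constant(std::numeric_limits<float>::lowest());
        for (std::size_t j = 0; j < pi_ptr->indices.size(); ++j)
        {
            pcl::PointXYZ projected;
            pca.project(input->points[pi_ptr->indices[j]], projected);
            const Eigen::Vector3f p = projected.getVector3fMap();
            min_pt = min_pt.cwiseMin(p);
            max_pt = max_pt.cwiseMax(p);
        }
        const Eigen::Vector3f extent = max_pt - min_pt;

        // Placeholder size criteria -- tune for your data.
        const float min_extent = 0.05f, max_extent = 2.0f;
        const bool keep = extent.maxCoeff() < max_extent && extent.minCoeff() > min_extent;
        // ... keep or discard cluster_indices[i] based on `keep` and/or
        //     on ratios such as extent(0) / extent(2) ...
    }
}

Since pcl::PCA returns the eigenvectors sorted by decreasing eigenvalue (largest variance first), extent(0) should correspond to the cluster's longest dimension, which makes the ratio checks straightforward.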

Related

Memory error when using combinations and intersection in Python

I am coding for large graph sampling and have run into some memory problems.
possible_edges = set(itertools.combinations(list(sampled_nodes), 2))
sampled_graph = list(possible_edges.intersection(ori_edges))
The code is supposed to find all combinations of nodes in sampled_nodes, which gives all possible edges formed by these nodes, and then take the intersection with ori_edges to find which edges actually exist.
The problem is that when the graph is enormous, the itertools.combinations call causes a memory error.
I've thought about writing a for loop to compute the intersection iteratively, but that takes too much time.
Any help from you guys would be appreciated. Thank you!
Found one solution: instead of taking the intersection of the two collections, I choose not to create all the combinations for possible_edges at all.
I go through the edges in ori_edges and check whether both of an edge's nodes exist in sampled_nodes.
sampled_graph = list(set([(e[0], e[1]) for e in ori_edges if int(e[0]) in result_nodes and int(e[1]) in result_nodes]))
This code won't create the huge list, and the speed is acceptable (if result_nodes is not already a set, converting it to one first keeps the membership tests fast).

Abaqus Python 'getClosest' command

I'm using the getClosest command to find a vertex.
ForceVertex1 = hatInstance.vertices.getClosest(coordinates=((x,y,z),))
This returns a dictionary object with key 0 and a two-item tuple as its value (hatInstance.vertices[1] and the coordinates of the vertex). The specific output:
{0: (mdb.models['EXP-100'].rootAssembly.instances['hatInstance-100'].vertices[1], (62.5242172081597, 101.192447407436, 325.0))}
Whenever I try to create a set, the vertex isn't accepted:
mainAssembly.Set(vertices=ForceVertex1[0][0], name='LoadSet1')
I also tried a different way:
tolerance = 1.0e-3
found_vertices = []
for vertex in hatInstance.vertices:
    x = vertex.pointOn[0][0]
    y = vertex.pointOn[0][1]
    z = vertex.pointOn[0][2]
    if abs(x - xTarget) < tolerance and abs(y - yTarget) < tolerance and abs(z - zTarget) < tolerance:
        found_vertices.append(hatInstance.vertices[vertex.index:vertex.index+1])
with xTarget etc. being my target coordinates. Despite this, I still don't get a vertex object.
For those struggling with this, I solved it.
Don't use the getClosest command, as it returns a dictionary object (despite the manual recommending it). I couldn't convert this dictionary object, specifically a key and a value within it, into a standalone vertex object.
Instead use Instance.vertices.getByBoundingSphere(center=,radius=)
The center is basically a tuple of the coordinates and the radius is the tolerance. This returns an array of vertices.
If you want the geometrical object, you just have to access the dictionary. One way to do it is:
ForceVertex1 = hatInstance.vertices.getClosest(coordinates=((x,y,z),))[0][0]
This will return the vertex object only, which you can assign to a set or whatever.
Edit: Found a solution to actually address the original question:
part=mdb.models[modelName].parts[partName]
v=part.vertices.getClosest(coordinates=(((x,y,z)),))
Note the formatting requirement for coordinates, ((( )),): three sets of parentheses with a comma. This will find the vertex closest to the specified point.
In order to use this to create a set, I found you need to massage the Abaqus Python interface into returning the vertex in a format that uses its "getSequenceFromMask" method. To create a set, the edges, faces, and/or vertices need to be of type "Sequence", which is internal to Abaqus. To do this, I then use the following code:
v2=part.vertices.findAt((((v[0][1])),))
part.Set(name='setName', vertices=v2)
Note, v[0][1] will give you the point at which the vertex lies. Note again the format of the point passed to the findAt method, (((point)),), with three sets of parentheses and a comma. This will return a vertex that uses the getSequenceFromMask method in Abaqus (you can check by typing v2 and pressing enter in the Python box at the bottom of CAE; this works with Abaqus 2020). It is of type "Sequence" (you can check by typing type(v2)), and this can be used to create a set.
If you do not format the point in findAt correctly (e.g., findAt(v[0][1]), without the parentheses and comma), it will return a vertex identical to the one you get by accessing the dictionary returned by getClosest (e.g., v[0][0]). This is of type 'Vertex' and cannot be used to create a set, even though the Set command asks for a vertex.
If you know the exact point where the vertex is, then you do not need the first step; you can simply use the findAt method with the correct formatting. However, the tolerance for findAt is very small (1e-6), and it will return an empty sequence if nothing is found within the tolerance. If you only have a ballpark idea of where the vertex is located, then you need to use the getClosest method first. This indeed gets the closest vertex to the specified point, which may or may not be the one you are interested in.
Original post:
None of these answers work for a similar problem I am having while trying to create a set of faces within some range near a point. If I use getClosest as follows
f=mdb.models['Model-1'].parts['Part-1'].faces.getClosest(coordinates=((0,0,0),), searchTolerance=1)
mdb.models['Model-1'].parts['Part-1'].Set(faces=f, name='faceSet')
I get an error "TypeError: Keyword error on faces".
If I access the dictionary via face=f[0], I get the error "Feature Creation Failed". If I access the tuple within the dictionary via f[0][0], I get the error "TypeError: keyword error on faces" again.
The option to use .getByBoundingSphere doesn't work either, because the faces in my model are massive, and the faces have to be completely contained within the sphere for Abaqus to "get" them, basically requiring me to create a sphere that encompasses the entire model.
My solution was to create my own script as follows:
import numpy as np

model = mdb.models['Model-1']
part = model.parts['Part-1']
faceSave = []
faceSave2 = []
x = np.arange(-1, 1, 0.1)
y = np.arange(-1, 1, 0.1)
z = np.arange(-1, 1, 0.1)
for x1 in x:
    for y1 in y:
        for z1 in z:
            f = part.faces.findAt(((x1, y1, z1),))
            if len(f) > 0 and f[0] not in faceSave2:
                faceSave.append(f)
                faceSave2.append(f[0])
part.Set(faces=faceSave, name='faceSet')
This works, but it's extraordinarily slow, partly because findAt throws a warning to the console whenever it doesn't find a face, and with this approach it usually doesn't find one. The code above basically looks within a small cube for any faces and puts them in the list faceSave; faceSave2 is set up to ensure that duplicate faces aren't added to the list. Accessing the tuple (e.g., f[0] in the code above) gives the unique information about the face, whereas f is just a pointer to the findAt command. Strangely, you can use the pointer f to create a Set, but you cannot use the actual face object f[0] to create one.
The problem with this approach for general use is that the tolerance for findAt is super small, so you either have to be confident about where things are located in your model, or have the step size be 1e-6 in np.arange() to ensure you don't miss a face that's inside the cube. With a tiny step size, expect the code to take forever.
At any rate, I can use a tuple (or a list of tuples) obtained via findAt to create a Set in Abaqus. However, I cannot use the tuple obtained via getClosest to make a set, even though I see no difference between the two objects. It's unfortunate, because getClosest gives me exactly the info I need, effectively immediately, without my jumbled mess of for-loops.
#anarchoNobody:
Thank you so much for your edited answer!
This workaround works great, also with faces. I spent many hours trying to figure out why .getClosest does not provide a result that works for creating a set, but with the workaround and the right number of brackets it works.
If applied to several faces, the code has to be slightly modified:
faces=((mdb.models['Model-1'].rootAssembly.instances['TT-1'].faces.getClosest(
    coordinates=(((10.0, 10.0, 10.0)),), searchTolerance=2)),
    (mdb.models['Model-1'].rootAssembly.instances['TT-1'].faces.getClosest(
    coordinates=((-10.0, 10.0, 10.0),), searchTolerance=2)),)
faces1=(mdb.models['Model-1'].rootAssembly.instances['Tube-1'].faces.findAt((((
    faces[0][0][1])),)),
    mdb.models['Model-1'].rootAssembly.instances['Tube-1'].faces.findAt((((
    faces[1][0][1])),)),)
mdb.models['Model-1'].rootAssembly.Surface(name='TT-inner-surf', side1Faces=faces1)

How to detect and delete noise in RapidMiner?

I am new to RapidMiner 5 and just want to know how to find the noise in my data, show it in a chart, and delete it.
A complex problem, because it depends on what you mean by noise.
If you mean finding individual attribute values that are plain wrong, then you could plot a histogram view and work out some sort of limits on what constitutes a valid value. You could then impose that rule by using Filter Examples to remove the offending examples.
If you mean finding attributes that have some sort of random jitter applied to them, it would be difficult to detect this. Only by knowing beforehand what the expected shape of the distribution is could you compare it with the observation and do something about it. However, the action to take is by no means obvious.
If you mean finding examples within an example set that are obviously different from other examples, then you could consider using the various outlier functions. The simplest one to get started with is Detect Outlier (Distances). This finds a set number of outliers (default 10) based on a distance calculation that uses all the attributes of the examples. It creates a new attribute called outlier that is set to true or false. You could then use the Filter Examples operator to remove the examples where outlier is true.
Hope that helps, at least as a start.

Get outliers from PCL sample consensus

I am using the functionality for SampleConsensus provided by PCL as per the example here: http://pointclouds.org/documentation/tutorials/random_sample_consensus.php#random-sample-consensus
The question is this: the implementation allows retrieval of the RANSAC inliers through getInliers, and these can then easily be transferred into a point cloud using the common function copyPointCloud(in, inliers, out). I am interested in looking at the outliers, but there doesn't seem to be functionality to return a list of them. If the inlier list is sorted, then I could loop through the point list and check each point against the current inlier, like so:
for i in point cloud
    if i == currentInlier
        currentInlier++
    else
        add point cloud (i) to new outlier cloud
But I am not sure that the inlier list is guaranteed to be sorted, even though it seems like it would be created that way.
Also, surely there is a native way to do this in PCL?
You almost certainly want pcl::ExtractIndices. It is documented here:
http://docs.pointclouds.org/1.7.0/classpcl_1_1_extract_indices.html
You can see how it is used here:
http://pointclouds.org/documentation/tutorials/extract_indices.php#extract-indices
See in particular:
pcl::ExtractIndices<pcl::PointXYZ> extract;
...
extract.setInputCloud (cloud_filtered);
extract.setIndices (inliers);
...
extract.setNegative (true);
extract.filter (*cloud_f);
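For completeness, here is a self-contained sketch (not from the original answer) of how the two pieces might fit together, assuming a plane model as in the linked RANSAC tutorial; the distance threshold is a placeholder:

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/sample_consensus/ransac.h>
#include <pcl/sample_consensus/sac_model_plane.h>
#include <pcl/filters/extract_indices.h>

// Sketch: fit a plane with RANSAC, then use ExtractIndices with
// setNegative(true) to obtain the outlier cloud.
void splitInliersOutliers(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
    pcl::SampleConsensusModelPlane<pcl::PointXYZ>::Ptr model(
        new pcl::SampleConsensusModelPlane<pcl::PointXYZ>(cloud));
    pcl::RandomSampleConsensus<pcl::PointXYZ> ransac(model);
    ransac.setDistanceThreshold(0.01); // placeholder threshold
    ransac.computeModel();

    // getInliers fills a plain index vector; wrap it for ExtractIndices.
    pcl::PointIndices::Ptr inliers(new pcl::PointIndices);
    ransac.getInliers(inliers->indices);

    pcl::ExtractIndices<pcl::PointXYZ> extract;
    extract.setInputCloud(cloud);
    extract.setIndices(inliers);

    pcl::PointCloud<pcl::PointXYZ> inlier_cloud, outlier_cloud;
    extract.setNegative(false);  // keep the indexed points
    extract.filter(inlier_cloud);
    extract.setNegative(true);   // keep everything *except* the indices
    extract.filter(outlier_cloud);
}

Calling filter() twice with setNegative toggled splits the cloud into inliers and outliers without any assumption about the ordering of the inlier indices.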

C++, determine the part that has the highest number of zero crossings

I'm not a specialist in signal processing. I'm doing simple processing on a 1D signal using C++. I really want to know how I can determine the part that has the highest zero-crossing rate (i.e., the highest frequency). Is there a simple way or method to find the beginning and the end of this part?
An image in the original post illustrates the form of my signal, and a second one shows what I need to do (the two indices of the beginning and the end).
Edited:
Actually I have no prior idea about the width of the beginning and the end; it's quite variable.
I can calculate the number of zero crossings, but I have no idea how to define the region's range:
int calculateZC(const vector<double>& signals){
    int ZC_counter = 0;
    int size = signals.size();
    for (int i = 0; i < size - 1; i++){
        // count sign changes between consecutive samples
        if ((signals[i] >= 0 && signals[i+1] < 0) || (signals[i] < 0 && signals[i+1] >= 0)){
            ZC_counter++;
        }
    }
    return ZC_counter;
}
Here is a fairly simple strategy which might give you a starting point. The outline of the algorithm is as follows.
Input: a vector of your data points {y0, y1, ...}
Parameters:
- a window size sigma
- a threshold 0 < p < 1 defining when to start looking for a region
Output: the start and end point {t0, t1} of the region with the most zero crossings
I won't give any C++ code, but the method should be easy to implement. As an example, let us use the test signal plotted in the original answer.
What we desire is the region between about 480 and 600, where the zero density is higher than at the front. The first step of the algorithm is to calculate the positions of the zeros. You can do this with what you already have, but instead of counting, you store the values of i at which you met a zero.
This will give you a list of zero positions.
From this list (you can do this directly in the above for-loop!) you create a list of the same size as your input data which looks like {0,0,0,...,1,0,...,1,0,...}: every zero-crossing position in your input data is marked with a 1.
The next step is to smooth this list with a smoothing filter of size sigma. Here you can use whatever you like; in the simplest case, a moving average or a Gaussian filter. The larger you choose sigma, the bigger the look-around window becomes which measures how many zero crossings there are around a certain point. I used a Gaussian filter of size 10 here; the original answer shows the output of this filter plotted together with the original zero positions.
In the next step, you go through the filtered data and find the maximum value; in this case it is about 0.15. Now you choose your second parameter, which is some percentage of this maximum. Let's say p = 0.6.
The final step is to go through the filtered data again: whenever the value is greater than p times the maximum, you start to remember a new region, and as soon as the value drops below that threshold, you end this region and remember its start and end point. Once you have finished walking through the data, you are left with a list of regions, each defined by a start and an end point. Now you choose the region with the biggest extent and you are done.
(Optionally, you can pad each end of the final region by the filter size.)
For the above example, I get 11 regions as follows
{{164,173},{196,205},{220,230},{241,252},{259,271},{278,290},
{297,309},{318,327},{341,350},{458,468},{476,590}}
where the one with the biggest extent is the last one, {476,590}. The final result, with half a filter width of padding on each end, is shown in the original answer.
Conclusion
Please don't be discouraged by the length of my answer. I tried to explain everything in detail. The implementation is really just some loops:
one loop to create the zero-crossings list {0,0,...,1,0,...}
one nested loop for the moving-average filter (or you use some library's Gaussian filter); here you can extract the maximum value at the same time
one loop to extract all the regions
one loop to extract the largest region, if you haven't already done so in the step above
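For reference, here is a compact sketch of those loops, using a moving average rather than a Gaussian filter (sigma = 10 and p = 0.6 mirror the values above; treat this as a starting point, not a polished solution):

#include <algorithm>
#include <utility>
#include <vector>

// Sketch: find the region with the highest zero-crossing density.
// Returns {t0, t1}, or {-1, -1} if no region was found.
std::pair<int, int> densestZeroCrossingRegion(const std::vector<double>& y,
                                              int sigma = 10, double p = 0.6)
{
    const int n = static_cast<int>(y.size());
    if (n == 0) return std::make_pair(-1, -1);

    // 1) Mark every zero crossing with a 1.
    std::vector<double> marks(n, 0.0);
    for (int i = 0; i + 1 < n; ++i)
        if ((y[i] >= 0 && y[i+1] < 0) || (y[i] < 0 && y[i+1] >= 0))
            marks[i] = 1.0;

    // 2) Smooth the marks with a moving average of half-width sigma.
    std::vector<double> density(n, 0.0);
    for (int i = 0; i < n; ++i)
    {
        const int lo = std::max(0, i - sigma);
        const int hi = std::min(n - 1, i + sigma);
        double sum = 0.0;
        for (int j = lo; j <= hi; ++j) sum += marks[j];
        density[i] = sum / (hi - lo + 1);
    }

    // 3) Threshold at p times the maximum density.
    const double threshold = p * (*std::max_element(density.begin(), density.end()));

    // 4) Walk through the data, collect contiguous above-threshold regions,
    //    and keep the one with the biggest extent.
    int best_start = -1, best_end = -1, start = -1;
    for (int i = 0; i <= n; ++i)
    {
        const bool above = (i < n) && (density[i] > threshold);
        if (above && start < 0)
            start = i;                // a new region begins
        if (!above && start >= 0)     // the current region ends
        {
            if (best_start < 0 || (i - 1 - start) > (best_end - best_start))
            {
                best_start = start;
                best_end = i - 1;
            }
            start = -1;
        }
    }
    return std::make_pair(best_start, best_end);
}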