I have an image which was shown to groups of people with different domain knowledge of its content. I than recorded gaze fixation data of them watching the image.
I now kind of want to compare the results of the two groups - so what I need to know is, if there is a correlation of the positions of the sampling data between the two groups or not.
I have the original image as well as the fixation coords. Do you have any good idea how to start analyzing the data?
It's more about the idea or the plan so you don't have to be too technical on that one.
Thanks
Simple idea: render all the coordinates on the original image in a 'heat map' like way, one image for each group. You can then visually compare the images for correlation, and you have some nice graphics for in your paper.
There is something like the two-dimensional correlation coefficient. With software like R or Matlab you can do the number crunching for the correlation.
Matlab has a function for this:
Two Dimensional Correlation Function: corr2
Computes two dimensional correlation coefficient between two matrices
and the matrices must be of the same size. r = corr2 (A,B)
In gaze tracking, the most interesting data lies in two areas.
In where all people look, for that you can use the heat map Daan suggests. Make a heat map for all people, and heat maps for separate groups of people.
In when people look there. For that I would recommend you start by making heat maps as above, but for short time intervals starting from the time the picture was first shown. Again, for all people, and for the separate groups you have.
The resulting set of heat-maps, perhaps animated for the ones from the second point, should give you some pointers for further analysis.
Related
I want to locate a service robot via infrared landmarks. The idea is to detect two landmarks, get the distance to the landmarks and calculate the robots position from these informations (the position of the landmarks are known).
For this I have built an artificial 2x3 matrix of IR LEDs, which are visible in the robots infrared camera image (shown in the image below).
As the first step, I want to detect a single landmark in a picture and get it's x-y coordinates. I can use these coordinates in the future to get the distance from the depth-image provided.
My first approach was to convert the image to a black and white image. Then I tried to filter out different cluster of points (which i dilated and contoured in the first place). I couldn't succeed with this method.
Now I wonder if there are any pattern recognition/computer vision methods which can help me to quite "easily" detect the pattern.
I've added a picture of the infrared image with the landmark in it and a converted black/white image.
a) Which method can help me to solve this problem?
b) Should I use a 3x3 Matrix or any other geometric form instead of the 2x3 Matrix ?
IR-Image
Black-White Image
A direct answer:
1) find all small circles in the image; 2) look among these small circles for ones that are the same size and close together, and, say, form parallel lines.
The reason for this approach is that you have coded the robot with a specific pattern of small objects. Therefore, look for the objects and then look for the pattern. (If the orientation and size wouldn't change, then you could just look for a sub-image within the larger image, but because it can, you need to look for elements of the pattern that remain consistent with motion in the 3D space, that is, the parallel lines.)
This will work in the example images, but to know whether this will work more generally, we need to know more than you told us: It depends on whether the variation in the images of the matrix and the variations in the background will let this be enough to distinguish between them. If not, maybe you need a more clever algorithm or maybe a different pattern of lights. In the extreme case, it's obvious that if you had another 2x3 matric around, it's not enough. It all depends on the variation of the object to be identified and the variations within the background scene, and because you don't tell us either of these things, it's hard to say the best way, what's good enough, what's a better way, etc.
If you have the choice, and here it sound like you do, good data is better than clever analysis. For this problem, I'd call good data to be anything that clearly distinguishes the object from the background. You need to think of it this way, and look at what the background is, and all the different perspectives on the lights that are possible, and make sure these can never be confused.
For example, if you have a lot of control over this, and enough time, temporal variations are often the easiest. Turning the lights (or a subset of the lights) on and off, etc, and then looking for the expected temporal variation is often the surest way to distinguish signal from noise — but really, this again is just making an assumption about the background and foreground (ie, that the background won't vary with some particular time pattern).
I'm currently working on a project where I have to compare similar histograms of image intensity. These histograms are obtained from photos taken under different illumination conditions.
I know that OpenCV offers the compareHist function. However this function returns a metric of similarity and I'm looking for a method that matches corresponding peaks/valleys between similar histograms.
For instance, if we have two photos of the same subject, one underexposed and one with the "ideal" exposure, their histograms of intensity might look something like the image in the following URL:
http://i.stack.imgur.com/tLIGR.png
As shown by the arrows, the peaks in one histogram also exist in the other. Anyone has a suggestion on how to match corresponding peaks?
Thank you!
You can use an implementation of DTW (https://en.wikipedia.org/wiki/Dynamic_time_warping) to compare the histograms.
Using dynamic programming, you can create a matrix that calculates DTW. Then, you can trace back through the matrix to find the relations between different parts of the histograms.
After that, it's simply a matter of extracting only the peaks.
What would b the best way to implement a simple shape-matching algorithm to match a plot interpolated from just 8 points (x, y) against a database of similar plots (> 12 000 entries), each plot having >100 nodes. The database has 6 categories of plots (signals measured under 6 different conditions), and the main aim is to find the right category (so for every category there's around 2000 plots to compare against).
The 8-node plot would represent actual data from measurement, but for now I am simulating this by selecting a random plot from the database, then 8 points from it, then smearing it using gaussian random number generator.
What would be the best way to implement non-linear least-squares to compare the shape of the 8-node plot against each plot from the database? Are there any c++ libraries you know of that could help with this?
Is it necessary to find the actual formula (f(x)) of the 8-node plot to use it with least squares, or will it be sufficient to use interpolation in requested points, such as interpolation from the gsl library?
You can certainly use least squares without knowing the actual formula. If all of your plots are measured at the same x value, then this is easy -- you simply compute the sum in the normal way:
where y_i is a point in your 8-node plot, sigma_i is the error on the point and Y(x_i) is the value of the plot from the database at the same x position as y_i. You can see why this is trivial if all your plots are measured at the same x value.
If they're not, you can get Y(x_i) either by fitting the plot from the database with some function (if you know it) or by interpolating between the points (if you don't know it). The simplest interpolation is just to connect the points with straight lines and find the value of the straight lines at the x_i that you want. Other interpolations might do better.
In my field, we use ROOT for these kind of things. However, scipy has a great collections of functions, and it might be easier to get started with -- if you don't mind using Python.
One major problem you could have would be that the two plots are not independent. Wikipedia suggests McNemar's test in this case.
Another problem you could have is that you don't have much information in your test plot, so your results will be affected greatly by statistical fluctuations. In other words, if you only have 8 test points and two plots match, how will you know if the underlying functions are really the same, or if the 8 points simply jumped around (inside their error bars) in such a way that it looks like the plot from the database -- purely by chance! ... I'm afraid you won't really know. So the plots that test well will include false positives (low purity), and some of the plots that don't happen to test well were probably actually good matches (low efficiency).
To solve that, you would need to either use a test plot with more points or else bring in other information. If you can throw away plots from the database that you know can't match for other reasons, that will help a lot.
I am trying to do image detection in C++. I have two images:
Image Scene: 1024x786
Person: 36x49
And I need to identify this particular person from the scene. I've tried to use Correlation but the image is too noisy and therefore doesn't give correct/accurate results.
I've been thinking/researching methods that would best solve this task and these seem the most logical:
Gaussian filters
Convolution
FFT
Basically, I would like to move the noise around the images, so then I can use Correlation to find the person more effectively.
I understand that an FFT will be hard to implement and/or may be slow especially with the size of the image I'm using.
Could anyone offer any pointers to solving this? What would the best technique/algorithm be?
In Andrew Ng's Machine Learning class we did this exact problem using neural networks and a sliding window:
train a neural network to recognize the particular feature you're looking for using data with tags for what the images are, using a 36x49 window (or whatever other size you want).
for recognizing a new image, take the 36x49 rectangle and slide it across the image, testing at each location. When you move to a new location, move the window right by a certain number of pixels, call it the jump_size (say 5 pixels). When you reach the right-hand side of the image, go back to 0 and increment the y of your window by jump_size.
Neural networks are good for this because the noise isn't a huge issue: you don't need to remove it. It's also good because it can recognize images similar to ones it has seen before, but are slightly different (the face is at a different angle, the lighting is slightly different, etc.).
Of course, the downside is that you need the training data to do it. If you don't have a set of pre-tagged images then you might be out of luck - although if you have a Facebook account you can probably write a script to pull all of yours and your friends' tagged photos and use that.
A FFT does only make sense when you already have sort the image with kd-tree or a hierarchical tree. I would suggest to map the image 2d rgb values to a 1d curve and reducing some complexity before a frequency analysis.
I do not have an exact algorithm to propose because I have found that target detection method depend greatly on the specific situation. Instead, I have some tips and advices. Here is what I would suggest: find a specific characteristic of your target and design your code around it.
For example, if you have access to the color image, use the fact that Wally doesn't have much green and blue color. Subtract the average of blue and green from the red image, you'll have a much better starting point. (Apply the same operation on both the image and the target.) This will not work, though, if the noise is color-dependent (ie: is different on each color).
You could then use correlation on the transformed images with better result. The negative point of correlation is that it will work only with an exact cut-out of the first image... Not very useful if you need to find the target to help you find the target! Instead, I suppose that an averaged version of your target (a combination of many Wally pictures) would work up to some point.
My final advice: In my personal experience of working with noisy images, spectral analysis is usually a good thing because the noise tend to contaminate only one particular scale (which would hopefully be a different scale than Wally's!) In addition, correlation is mathematically equivalent to comparing the spectral characteristic of your image and the target.
I am currently working on a data visualization project.My aim is to produce contour lines ,in other words iso-lines, from gridded data.Data can be temperature, weather data or any kind of other environmental parameters but only condition is it must be regularly spaced.
I searched in internet , however i could not find a good algorithm, pseudo-code or source code for producing contour lines from grids.
Does anybody knows a library, source code or an algorithm for producing contour lines from gridded data?
it will be good if your suggestion has a good run time performance, i don't want to wait my users so much :)
Edit: thanks for response but isolines have some constrains like they should not intersects
so just generating bezier curves does not accomplish my goal.
See this question: How to approximate a vector contour from an elevation raster?
It's a near duplicate, but uses quite different terminology. You'll find that cartography and computer graphics solve many of the same problems, but use different terminology for them.
there's some reasonably good contouring available in GNUplot - if you're able to use GPL code that may help.
If your data is placed at regular intervals, this can be done fairly easily (assuming I understand your problem correctly). First you need to determine at what interval you want your contours. Next create the grid you are going to use to store the contour information (i'm assuming just a simple on/off or elevation at this contour level type of data), which should be one interval smaller than the source data.
Now the trick here is to offset the 2 grids by 1/2 an interval (won't actually show up in code like this, but its the concept I'm dealing with here) and compare the 4 coordinates surrounding the current point in the contour data grid you are calculating. If any of the 4 points are in a different interval range, then that 'pixel' in the contour grid should be set to true (or the value of the contour range being crossed).
With this method, there will be a problem when the interval is too fine which will cause several contours to overlap onto each other.
As the link from Paul Tomblin suggests, Bezier curves (which are a subset of B-splines) are a ripe solution for your problem. If runtime performance is an issue, Bezier curves have the added benefit of being constructable via the very fast de Casteljau algorithm, instead of drawing them according to the parametric equations. On the off chance you're working with DirectX, it has a library function for the de Casteljau, but it should not be challenging to brew one yourself using the 1001 web pages that describe it.