Optical Flow - Motion histograms - c++

I'm currently working on optical flow with OpenCV in C++. I'm using calcOpticalFlowPyrLK with a grid of points (one interest point for each 5x5 pixel square).
What is the best way to:
1) Compute the histogram of the computed values (orientation and distance) for each frame
2) Compute a histogram of the values (orientation and distance) that a given pixel took over several frames (for instance 100)
Are the OpenCV functions suited to this work? How can I use them in a simple way in combination with calcOpticalFlowPyrLK?

I was searching for the same OpenCV tools a couple of months ago. Unfortunately, OpenCV does not include any motion histogram implementation. Instead, run calcOpticalFlowPyrLK for each frame and calculate the orientation and length of each displacement. Then create and fill the histograms yourself. Not as hard as it sounds, believe me :)
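As a minimal sketch of the "fill the histograms yourself" step, assuming the displacements have already been computed as (next point minus previous point) for every tracked grid point: the Displacement struct, the orientationHistogram name and the minLength noise threshold are all hypothetical, and weighting each vote by the displacement length is one possible choice rather than the only one.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for one tracked point's displacement between two frames.
struct Displacement {
    float dx;
    float dy;
};

// Builds an orientation histogram with `bins` bins covering [0, 2*pi).
// Each displacement votes with its length; displacements shorter than
// `minLength` pixels are skipped as noise.
std::vector<float> orientationHistogram(const std::vector<Displacement>& flow,
                                        int bins, float minLength)
{
    const float twoPi = 6.28318530718f;
    std::vector<float> hist(static_cast<std::size_t>(bins), 0.0f);
    for (const Displacement& d : flow) {
        const float length = std::sqrt(d.dx * d.dx + d.dy * d.dy);
        if (length < minLength) continue;
        float angle = std::atan2(d.dy, d.dx);    // in [-pi, pi]
        if (angle < 0.0f) angle += twoPi;        // shift to [0, 2*pi)
        int bin = static_cast<int>(angle / twoPi * bins);
        if (bin >= bins) bin = bins - 1;         // guard against rounding at 2*pi
        hist[static_cast<std::size_t>(bin)] += length;
    }
    return hist;
}
```

For the per-pixel histogram over 100 frames, the same function can be fed the 100 displacements recorded for a single grid point instead of all grid points of one frame.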

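The OpenCV implementation for the first part of HOOF (Histograms of Oriented Optical Flow) can look like this:
// Assumes flow1 is a dense CV_32FC2 flow field, and that magnitudeImage and
// orientationImage are CV_32FC1 Mats of the same size.
const int rows = flow1.rows;
const int cols = flow1.cols;
for (int y = 0; y < rows; ++y)
    for (int x = 0; x < cols; ++x)
    {
        const Vec2f flow1_at_point = flow1.at<Vec2f>(y, x);
        const float u1 = flow1_at_point[0]; // horizontal displacement
        const float v1 = flow1_at_point[1]; // vertical displacement
        magnitudeImage.at<float>(y, x) = std::sqrt(u1 * u1 + v1 * v1);
        orientationImage.at<float>(y, x) = std::atan2(v1, u1); // atan2 takes (y, x)
    }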

Related

Make a mosaic image (bitmap format)

I want to make a mosaic photo with a window size chosen by the user. This is just a first draft of the code, but I have problems getting the pixels and calculating the averages, then writing the average value into each pixel and continuing to the end. I also get errors when converting between different types. (The other part of the code produces a gray-scale image.)
P.S. Sorry, I am at the very first steps of learning image processing.
''' void CImageProcessingDoc::OnProcessMosaic()
{
if (m_pImage) {
DlgMosaicOption dlg;
if (dlg.DoModal() == IDOK) {
DWORD dwWindowSize = dlg.m_dwWindowSize;
DWORD width = m_pImage->GetWidth();
DWORD height = m_pImage->GetHeight();
RGBQUAD color;
RGBQUAD newcolor;
float X_step = width / dwWindowSize;
float Y_step = height / dwWindowSize;
int avg, pixel;
for (DWORD y = 0; y < dwWindowSize; y++) {
for (DWORD x = 0; x < dwWindowSize; x++) {
color = m_pImage->GetPixelColor(x, y);
(RGBQUAD) pixel = m_pImage->GetPixelColor(x, y);
avR += (int)(color.red(pixel);
avG += (int)(color.green(pixel);
avB += (int)(color.blue(pixel);
newcolor.rgbBlue = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
newcolor.rgbGreen = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
newcolor.rgbRed = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
m_pImage->SetPixelColor(x, y, newcolor);
}
}
}
}
} '''
Could anyone please help me to understand the problem?
I think you are mixing up spatial, spectral and temporal averages here.
Spatial average
This is the operation of computing the average of the pixels over an area.
You have to compute eR = 1/N * (P0.R + P1.R + P2.R + P3.R + ...), eG = 1/N * (P0.G + P1.G + ...), eB = 1/N * (P0.B + P1.B + ...)
You'll get a pixel that can take as many colors as the input picture had, but with limited spatial frequency: a picture computed like this will appear blurred, with no details.
Spectral average
This is the operation of computing the average of the components (spectrum) of each pixel.
You have to compute e = 1/3 * (P0.R + P0.G + P0.B)
You'll get a monochrome picture with exactly the same spatial frequency as the initial picture.
Temporal average
You haven't talked about it, so this is for reference only. The idea is to compute the average of each pixel and each component over N pictures in a temporal sequence.
This gives a kind of motion-blurred picture.
Answer
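If I understand your question correctly, you want the spectral average, converting an RGB pixel to its average grey value, with grey = (R+G+B)/3.
Thus, your pixel loop should look like this:
// Iterate over the whole image (width and height as read in the question's code).
for (DWORD y = 0; y < height; y++) {
    for (DWORD x = 0; x < width; x++) {
        RGBQUAD color = m_pImage->GetPixelColor(x, y);
        // Average of the three channels, rounded to the nearest integer.
        BYTE avg = (BYTE)((color.rgbRed + color.rgbGreen + color.rgbBlue) / 3.f + 0.5f);
        // RGBQUAD is a plain struct without a constructor, so initialise its fields directly.
        RGBQUAD grey = { avg, avg, avg, 0 };
        m_pImage->SetPixelColor(x, y, grey);
    }
}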
Please note that averaging the channels of non-linear RGB (usually sRGB) is a poor formula for RGB to grayscale conversion. You should read about RGB to L*a*b* conversion (you are interested in the L* part only) or at least RGB to YUV (you are interested in the Y part only).
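As a small illustration of the "at least RGB to YUV" route, the Y (luma) component is a weighted sum of the gamma-encoded channels. The sketch below uses the Rec. 601 weights (0.299, 0.587, 0.114); the function name lumaRec601 is hypothetical, and other standards use slightly different weights.

```cpp
#include <cstdint>

// Rec. 601 luma (the Y of YUV) from 8-bit gamma-encoded RGB,
// rounded to the nearest integer.
std::uint8_t lumaRec601(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    const float y = 0.299f * r + 0.587f * g + 0.114f * b;
    return static_cast<std::uint8_t>(y + 0.5f);
}
```

Compared with the plain average, green contributes most and blue least, which is closer to how bright each primary looks.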
If your question is about resizing the input picture, then you are not using the appropriate algorithm; what you want is called resampling.
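If the goal is the mosaic effect itself, that is the spatial average described above applied per window. Here is a minimal sketch on a plain interleaved RGB buffer rather than the image class used in the question; the RgbImage struct and the mosaic function are hypothetical names, and it assumes that the user's window size is the side length of each block in pixels.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit interleaved RGB, row-major.
struct RgbImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;  // size = width * height * 3
};

// Replaces every blockSize x blockSize block with its per-channel mean.
// Blocks on the right and bottom edges may be smaller than blockSize.
void mosaic(RgbImage& img, std::size_t blockSize)
{
    if (blockSize == 0) return;
    for (std::size_t by = 0; by < img.height; by += blockSize) {
        for (std::size_t bx = 0; bx < img.width; bx += blockSize) {
            const std::size_t yEnd = std::min(by + blockSize, img.height);
            const std::size_t xEnd = std::min(bx + blockSize, img.width);
            const std::size_t count = (yEnd - by) * (xEnd - bx);
            std::size_t sum[3] = {0, 0, 0};
            // First pass: accumulate each channel over the block.
            for (std::size_t y = by; y < yEnd; ++y)
                for (std::size_t x = bx; x < xEnd; ++x)
                    for (std::size_t c = 0; c < 3; ++c)
                        sum[c] += img.data[(y * img.width + x) * 3 + c];
            // Second pass: write the rounded mean back to every pixel of the block.
            for (std::size_t y = by; y < yEnd; ++y)
                for (std::size_t x = bx; x < xEnd; ++x)
                    for (std::size_t c = 0; c < 3; ++c)
                        img.data[(y * img.width + x) * 3 + c] =
                            static_cast<std::uint8_t>((sum[c] + count / 2) / count);
        }
    }
}
```

The sums are kept in a wide integer type so that adding many 8-bit values cannot overflow, which is one of the type problems the question's draft runs into.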

Point cloud conversion to 2D range

I am trying to convert a point cloud (x, y, z) data acquired from a Kinect V2 using libfreenect2, into a virtual 2D laser scan (e.g., a horizontal angle/distance vector).
I am currently assigning per pixel column, the PCL distance value, as shown below:
std::vector<float> scan(512, 0);
for (unsigned int row = 0; row < 424; ++row) {
for (unsigned int col = 0; col < 512; ++col) {
float x, y, z;
registration->getPointXYZ(depth, row, col, x, y, z);
if (std::isnan(x) || std::isnan(y) || std::isnan(z)) {
continue;
}
Eigen::Vector3f values = rotate_translate((-1 * x), y - 1.186, z);
if (scan[col] == 0) {
scan[col] = values[1];
}
if (values[1] < scan[col]) {
scan[col] = values[1];
}
}
}
You may ignore the rotate_translate method; it simply converts local to global coordinates using the sensor pose.
The problem is best shown using the pictures below:
Whereas the LIDAR range sensor produces the following pointsmap:
the Kinect 2D range scan is curved and, of course, narrower, since the horizontal FOV is 70.6 degrees compared to the 270-degree range of the LIDAR.
It is this curvature that I am trying to fix; the SLAM/ICP library I'm using is mrpt and the actual data scan is inserted into an mrpt::obs::CObservation2DRangeScan observation:
auto obs = mrpt::obs::CObservation2DRangeScan();
obs.loadFromVectors(scan.size(), scan.data(), (char*)scan.data());
obs.aperture = mrpt::utils::DEG2RAD(70.6f);
obs.maxRange = 6.0;
obs.rightToLeft = true;
obs.timestamp = mrpt::system::now();
obs.setSensorPose(sensor);
I've searched around Google and SO, and the only answers which seem to address this question are this one and that one. So whereas I understand that the curvature is the result of me assigning each pixel column the PCL value, I am uncertain as to how I can use that to remove the curvature.
Each reply seems to take a different approach, and from what I understand the task is a linear interpolation of the angle per pixel ratio, and the current pixel coordinates?
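One possible reading, offered only as a sketch and not confirmed by anything in the thread: a 2D range scan expects one (angle, range) pair per beam, so storing a single Cartesian coordinate per image column makes a flat wall look curved. Under that assumption, each point can be converted to a bearing with atan2 and a planar range with hypot, and binned by bearing instead of by pixel column. The Point3 struct, the toRangeScan name and the axis convention (x to the right, z forward) are hypothetical, and filtering points by height is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type in the sensor frame: x to the right, z forward (metres).
struct Point3 {
    float x;
    float y;
    float z;
};

// Builds a planar scan of `beams` ranges spanning `fovRad` radians, centred on
// the forward axis. Each beam keeps the closest range seen; 0 means "no return".
std::vector<float> toRangeScan(const std::vector<Point3>& cloud,
                               std::size_t beams, float fovRad)
{
    std::vector<float> scan(beams, 0.0f);
    for (const Point3& p : cloud) {
        if (std::isnan(p.x) || std::isnan(p.z) || p.z <= 0.0f) continue;
        const float angle = std::atan2(p.x, p.z);           // 0 along the forward axis
        const float t = (angle + 0.5f * fovRad) / fovRad;   // 0..1 across the field of view
        if (t < 0.0f || t >= 1.0f) continue;
        const std::size_t beam = static_cast<std::size_t>(t * beams);
        if (beam >= beams) continue;                        // guard against rounding
        const float range = std::hypot(p.x, p.z);           // distance in the scan plane
        if (scan[beam] == 0.0f || range < scan[beam]) scan[beam] = range;
    }
    return scan;
}
```

With this layout a flat wall at distance d in front of the sensor yields ranges of d / cos(angle), which is what a real LIDAR would report for the same wall.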

Irregular sampling of an image using OpenGL

I'm looking for some pointers on how to sample an image using OpenGL at a list of specified locations. Are there any links to tutorials or examples similar to the problem below?
At the moment we have code that calculates the 'output intensity' at a list of specified locations x1,y1, x2,y2, ..., xn,yn by applying a Lanczos2 filter to an input image. The number of locations at the moment is 20 (which is actually the list of phosphene locations in a visual prosthesis), but it will eventually increase up to 256, and GPU processing will certainly accelerate things. The list of locations can't be hardcoded.
So far I have seen how to implement a median filter and the like, but in my case there is no need to compute the convolution with the filter kernel at every image pixel, just at the locations specified.
Handle those values (intensity) in a second texture that has a sample-or-not bit.
If you use OpenGL, you'll be able to define the ROI (region of interest), the portion of an image to which you want to apply edits or processing, as you describe.
If you go that route, this is how you can approximate the median in a pixel neighborhood of a radius of your choosing. The kernel below is written in the Core Image Kernel Language, a dialect of the OpenGL ES shading language:
kernel vec4 medianUnsharpKernel(sampler u) {
vec4 pixel = unpremultiply(sample(u, samplerCoord(u)));
vec2 xy = destCoord();
int radius = 3;
int bounds = (radius - 1) / 2;
vec4 sum = vec4(0.0);
for (int i = (0 - bounds); i <= bounds; i++)
{
for (int j = (0 - bounds); j <= bounds; j++ )
{
sum += unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
}
}
vec4 mean = vec4(sum / vec4(pow(float(radius), 2.0)));
float mean_avg = float(mean);
float comp_avg = 0.0;
vec4 comp = vec4(0.0);
vec4 median = mean;
for (int i = (0 - bounds); i <= bounds; i++)
{
for (int j = (0 - bounds); j <= bounds; j++ )
{
comp = unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
comp_avg = float(comp);
median = (comp_avg < mean_avg) ? max(median, comp) : median;
}
}
return premultiply(vec4(vec3(abs(pixel.rgb - median.rgb)), 1.0));
}
A brief description of the steps:
1. Calculate the mean of the values of the pixels surrounding the source pixel in a 3x3 neighborhood.
2. Find the maximum pixel value among all pixels in the same neighborhood that are less than the mean. This value stands in for the median.
3. [OPTIONAL] Subtract the median pixel value from the source pixel value for edge detection.
If you're using the median value for edge detection, there are a couple of ways to modify the above code for better results, namely hybrid median filtering and truncated median filtering (a substitute for, and an improvement on, 'mode' filtering). If you're interested, please ask.
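For reference, the per-location computation described in the question can be sketched on the CPU as follows. It evaluates a Lanczos kernel with a = 2 over a 4x4 neighborhood at each requested location only, not at every pixel. The GrayImage struct and the function names are hypothetical, and the sketch assumes a unit-spaced kernel and border clamping, which may differ from the asker's actual filter.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical greyscale image: row-major floats, clamped at the borders.
struct GrayImage {
    int width;
    int height;
    std::vector<float> data;
    float at(int x, int y) const {
        x = x < 0 ? 0 : (x >= width ? width - 1 : x);
        y = y < 0 ? 0 : (y >= height ? height - 1 : y);
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Lanczos kernel with a = 2: sinc(t) * sinc(t / 2) for |t| < 2, else 0.
float lanczos2(float t)
{
    const float pi = 3.14159265358979f;
    t = std::fabs(t);
    if (t < 1e-6f) return 1.0f;
    if (t >= 2.0f) return 0.0f;
    const float a = pi * t;
    return 2.0f * std::sin(a) * std::sin(0.5f * a) / (a * a);
}

// Filtered intensity at one sub-pixel location, normalised by the weight sum.
float sampleLanczos2(const GrayImage& img, float x, float y)
{
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    float sum = 0.0f;
    float weightSum = 0.0f;
    for (int j = y0 - 1; j <= y0 + 2; ++j) {
        for (int i = x0 - 1; i <= x0 + 2; ++i) {
            const float w = lanczos2(x - i) * lanczos2(y - j);
            sum += w * img.at(i, j);
            weightSum += w;
        }
    }
    return sum / weightSum;
}

// Evaluates the filter only at the listed (x, y) locations.
std::vector<float> sampleAt(const GrayImage& img,
                            const std::vector<std::pair<float, float>>& locations)
{
    std::vector<float> out;
    out.reserve(locations.size());
    for (const auto& loc : locations)
        out.push_back(sampleLanczos2(img, loc.first, loc.second));
    return out;
}
```

Because each location is independent, the same loop body maps naturally onto one GPU thread or fragment per location, with the location list passed in as data instead of being hardcoded.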

How to map optical flow field (float) to pixel data (char) for image warping?

I've been playing with the optical flow functions in OpenCV and am stuck. I've successfully generated X and Y optical flow fields/maps using the Farneback method, but I don't know how to apply this to the input image coordinates to warp the images. The resulting X and Y fields are of 32bit float type (0-1.0), but how does this translate to the coordinates of the input and output images? For example, 1.0 of what? The width of the image? The difference between the two?
Plus, I'm not sure what my loop would look like to apply the transform/warp. I've done plenty of loops to change color, but the pixels always remain in the same location. Moving pixels around is new territory for me!
Update: I got this to work, but the resulting image is messy:
//make a float copy of 8 bit grayscale source image
IplImage *src_img = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);
cvConvertScale(input_img,src_img,1/255.0); //convert 8 bit to float
//create destination image
IplImage *dst_img = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);
for(y = 0; y < flow->height; y++){
//grab flow maps for X and Y
float* vx = (float*)(velx->imageData + velx->widthStep*y);
float* vy = (float*)(vely->imageData + vely->widthStep*y);
//coords for source and dest image
const float *srcpx = (const float*)(src_img->imageData+(src_img->widthStep*y));
float *dstpx = (float*)(dst_img->imageData+(dst_img->widthStep*y));
for(x=0; x < flow->width; x++)
{
int newx = x+(vx[x]);
int newy = (int)(vy[x])*flow->width;
dstpx[newx+newy] = srcpx[x];
}
}
I could not get this to work. The output was just garbled noise:
cvRemap(src_img,dst_img,velx,vely,CV_INTER_CUBIC,cvScalarAll(0));
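The flow vectors are velocity values, expressed in pixels. If the pixel in image 1 at position (x, y) has the flow vector (vx, vy), it is estimated to be at position (x+vx, y+vy) in image 2 (so the values aren't really in the [0, 1] range: they can be bigger, and they can be negative too). The easiest way to do the warping is to create floating point images with those values (x+vx for the x direction, and similarly y+vy for the y direction), and then use cv::remap. Keep in mind that cv::remap reads, for every destination pixel, the source pixel at the mapped coordinates, so the maps must hold absolute coordinates, not raw flow values.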
Using OpenCV
https://github.com/opencv/opencv/blob/master/samples/python/opt_flow.py
import numpy as np
import cv2

def warp_flow(img, flow):
    h, w = flow.shape[:2]
    flow = -flow
    flow[:,:,0] += np.arange(w)
    flow[:,:,1] += np.arange(h)[:,np.newaxis]
    res = cv2.remap(img, flow, None, cv2.INTER_LINEAR)
    return res
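To make the pixel loop explicit, here is a minimal C++ sketch of the same backward warp without OpenCV: every destination pixel reads the source at (x - vx, y - vy) with bilinear interpolation, mirroring the negated flow in warp_flow. The FloatImage struct and the function names are hypothetical, and the flow is assumed to be stored as two row-major arrays, one value per pixel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel float image, row-major.
struct FloatImage {
    int width;
    int height;
    std::vector<float> data;
};

// Bilinear lookup; returns `border` when (x, y) falls outside the image.
float bilinear(const FloatImage& img, float x, float y, float border)
{
    if (x < 0.0f || y < 0.0f || x > img.width - 1 || y > img.height - 1) return border;
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const int x1 = std::min(x0 + 1, img.width - 1);
    const int y1 = std::min(y0 + 1, img.height - 1);
    const float fx = x - x0;
    const float fy = y - y0;
    const float p00 = img.data[static_cast<std::size_t>(y0) * img.width + x0];
    const float p10 = img.data[static_cast<std::size_t>(y0) * img.width + x1];
    const float p01 = img.data[static_cast<std::size_t>(y1) * img.width + x0];
    const float p11 = img.data[static_cast<std::size_t>(y1) * img.width + x1];
    const float top = p00 + fx * (p10 - p00);
    const float bottom = p01 + fx * (p11 - p01);
    return top + fy * (bottom - top);
}

// Backward warp: dst(x, y) = src(x - flowX, y - flowY); pixels that map
// outside the source are set to 0.
FloatImage warpFlow(const FloatImage& src,
                    const std::vector<float>& flowX,
                    const std::vector<float>& flowY)
{
    FloatImage dst{src.width, src.height, std::vector<float>(src.data.size(), 0.0f)};
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * src.width + x;
            dst.data[idx] = bilinear(src, x - flowX[idx], y - flowY[idx], 0.0f);
        }
    }
    return dst;
}
```

Looping over destination pixels and reading from the source is what avoids the holes and overwrites that appear when source pixels are pushed forward to rounded target positions, as in the loop from the question.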

How to smooth a histogram?

I want to smooth a histogram.
Therefore I tried to smooth the internal matrix of cvHistogram.
typedef struct CvHistogram
{
int type;
CvArr* bins;
float thresh[CV_MAX_DIM][2]; /* for uniform histograms */
float** thresh2; /* for non-uniform histograms */
CvMatND mat; /* embedded matrix header for array histograms */
}
I tried to smooth the matrix like this:
cvCalcHist( planes, hist, 0, 0 ); // Compute histogram
(...)
// smooth histogram with Gaussian Filter
cvSmooth( hist->mat, hist_img, CV_GAUSSIAN, 3, 3, 0, 0 );
Unfortunately, this is not working because cvSmooth needs a CvMat as input instead of a CvMatND. I couldn't transform CvMatND into CvMat (CvMatND is 2-dim in my case).
Is there anybody who can help me? Thanks.
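You can use the same basic algorithm as a mean filter: replace each bin with the average of itself and its two neighbours.
// smoothed is a separate array of NBins elements, initialised as a copy of hist,
// so that each average reads only original, unsmoothed values.
for(int i = 1; i < NBins - 1; ++i)
{
    smoothed[i] = (hist[i - 1] + hist[i] + hist[i + 1]) / 3.0f;
}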
Optionally, you can use a slightly more flexible algorithm that lets you easily change the window size.
// As above, smoothed is a separate copy of hist used for the output.
int winSize = 5; // must be odd
int winMidSize = winSize / 2;
for(int i = winMidSize; i < NBins - winMidSize; ++i)
{
    float mean = 0;
    for(int j = i - winMidSize; j <= (i + winMidSize); ++j)
    {
        mean += hist[j];
    }
    smoothed[i] = mean / winSize;
}
But bear in mind that this is just one simple technique.
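Since the question asks for a Gaussian filter, another simple option is a binomial [1 2 1] / 4 kernel, which approximates a small Gaussian; applying it several times gives a wider one. This is a sketch on a plain std::vector of bin counts, with the hypothetical name smoothBinomial, and it handles the end bins by repeating the edge value.

```cpp
#include <cstddef>
#include <vector>

// One pass of the [1 2 1] / 4 binomial kernel over a 1-D histogram.
// The first and last bins reuse their own value for the missing neighbour.
// The result is written to a new vector, so every output bin is computed
// from the original values only.
std::vector<float> smoothBinomial(const std::vector<float>& hist)
{
    const std::size_t n = hist.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const float left = hist[i == 0 ? 0 : i - 1];
        const float right = hist[i + 1 == n ? i : i + 1];
        out[i] = 0.25f * left + 0.5f * hist[i] + 0.25f * right;
    }
    return out;
}
```

For a 2-D histogram like the one in the question, the same pass can be run along each row and then along each column, because the kernel is separable.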
If you really want to do it using OpenCV tools, I recommend asking on the OpenCV forum: http://tech.groups.yahoo.com/group/OpenCV/join
You can dramatically change the "smoothness" of a histogram by changing the number of bins you use. A good rule of thumb is to have sqrt(n) bins if you have n data points. You might try applying this heuristic to your histogram and see if you get a better result.
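The rule of thumb is easy to apply in code. A minimal sketch, with the hypothetical name sqrtBinCount, rounding up and never returning fewer than one bin:

```cpp
#include <cmath>
#include <cstddef>

// Square-root rule of thumb: about sqrt(n) bins for n data points,
// rounded up, with a minimum of one bin.
std::size_t sqrtBinCount(std::size_t sampleCount)
{
    if (sampleCount == 0) return 1;
    return static_cast<std::size_t>(std::ceil(std::sqrt(static_cast<double>(sampleCount))));
}
```

For the optical flow grid in the first question, for example, a 640x480 frame sampled every 5 pixels gives 12288 points, so this rule suggests a little over a hundred bins as a starting point.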