I am trying to find the difference between 2 images.
Scenario: Suppose I have 2 images, one of a background and the other of a person in front of that background. I want to subtract the two images in such a way that I get the position of the person, i.e. the program can detect where the person was standing and give the subtracted image as the output.
The code that I have managed to come up with takes two images from the camera, converts both to grayscale and re-sizes them. I wanted to know what to do after this. I checked the subtract function provided by OpenCV, but it takes arrays as inputs, so I don't know how to proceed.
The code that I have written is:
cap>>frame; //gets the first image
cv::cvtColor(frame,frame,CV_RGB2GRAY); //converts it to gray scale
cv::resize(frame,frame,Size(30,30)); //re-sizes it
cap>>frame2;//gets the second image
cv::cvtColor(frame2,frame2,CV_RGB2GRAY); //converts it to gray scale
cv::resize(frame2,frame2,Size(30,30)); //re-sizes it
Now do I simply use the subtract function like:
cv::subtract(frame_gray,frame,frame);
or do I apply some filters first and then use the subtract function?
As others have noticed, it's a tricky problem: easy to come up with a hack that will work sometimes, hard to come up with a solution that will work most of the time with minimal human intervention. It is also much easier if you can tightly control the material and illumination of the background. The professional applications are variously known as "chroma keying" (especially in the TV industry), "bluescreening", "matting" or "traveling matte" (in cinematography), and "background removal" in computer vision.
The groundbreaking work for matting quasi-uniform backdrops was done by Petro Vlahos many years ago. The patents on his basic algorithms have already expired, so you can go to town with them (and find open-source implementations of varying quality). Needless to say, IANAL, so do your homework on the patent subject.
Matting out more complex backgrounds is still an active research area, especially for the case when no 3D information is available. You may want to look into a few research papers that have come out of MS Research in the semi-recent past (A. Criminisi did some work in that area).
Using subtract would not be appropriate because it might result in some values becoming negative, and it only tells you whether there is a difference or not (a boolean true/false).
If you need to get the pixels where the images differ, you should do a pixel-by-pixel comparison, something like:
int rows = frame.rows;
int cols = frame.cols;
cv::Mat diffImage = cv::Mat::zeros(rows, cols, CV_8UC1);

for(int i = 0; i < rows; ++i)
{
    for(int j = 0; j < cols; ++j)
    {
        if(frame.at<uchar>(i,j) != frame2.at<uchar>(i,j))
            diffImage.at<uchar>(i, j) = 255;   // mark differing pixels as white
    }
}
Now you can either show or save diffImage. All pixels that differ will be white, while the similar ones will be black.
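As a shorter alternative (just a sketch using standard OpenCV calls; the threshold value of 30 is an assumed starting point, not something from this answer), cv::absdiff followed by cv::threshold gives a similar mask and also tolerates small intensity changes:
// frame and frame2 are the two grayscale images from the question
cv::Mat diff, mask;
cv::absdiff(frame, frame2, diff);                       // |frame - frame2|, never negative
cv::threshold(diff, mask, 30, 255, cv::THRESH_BINARY);  // 30 is an assumed noise threshold
// mask is white where the images differ noticeably, black elsewhere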
Matlab offers the ability to set colour limits for the current axis using CAXIS. OpenCV has applyColorMap, which can be used to highlight differences in pixel intensity in a greyscale image and which, I believe, maps pixels from 0 - 255.
I am new to Matlab/image processing and have been asked to port a simple program from Matlab which uses the CAXIS function to change the "brightness" of a colour map. I have no experience in Matlab, but it appears that they use this function to "lower" the intensity required for pixels to be mapped to a more intense colour on the map
i.e. Colour map using "JET"
When brightness = 1, red = 255
When brightness = 10, red >= 25
The Matlab program allows 16-bit images to be read in and displayed, which obviously gives higher pixel values, whereas everything I've read and done indicates OpenCV only supports 8-bit images (for colour maps).
Therefore my question is, is it possible to provide similar functionality in OpenCV? How do you set the axis limit for a colourmap/how do you scale the colour map lookup table so that "less" intense pixels are scaled to the more intense regions?
A similar question was asked with a reply stating the array needs to be "normalised", but unfortunately I don't quite know how to achieve this and can't reply to the answer as I don't have enough rep!
I have gone ahead and used cv::normalize to set the max value in the array to be maxPixelValue/brightness, but that doesn't work at all.
I have also experimented with converting my 16-bit image into CV_8UC1 with a scale factor, to no avail. Any help would be greatly appreciated!
In my opinion you can use cv::normalize to "crop" values in the source picture to the corresponding ones in the colour map you are interested in. Say you want your image to be mapped to the blue-ish region of the Jet colormap; then you should do something like:
int minVal = 0, maxVal = 80;
cv::normalize(src,dst, minVal, maxVal, cv::NORM_MINMAX);
If you plan to apply some kind of custom map, it's fairly easy for a 1- or 3-channel 8-bit image: you only need to create a LUT with 256 values (with the proper number of channels) and apply it using cv::LUT. More about it in this blog, and also see the docs about LUT.
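A minimal sketch of that idea (the doubling formula is only an assumed example mapping, not something prescribed here): build a 256-entry table and let cv::LUT remap every pixel.
// Sketch: custom colour mapping via a 256-entry lookup table (8-bit, single channel)
cv::Mat applyCustomMap(const cv::Mat& src8u)
{
    cv::Mat lut(1, 256, CV_8UC1);
    for (int v = 0; v < 256; ++v)
        lut.at<uchar>(v) = cv::saturate_cast<uchar>(2 * v);   // assumed example mapping
    cv::Mat dst;
    cv::LUT(src8u, lut, dst);   // every pixel value v becomes lut(v)
    return dst;
}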
If the image you are working with is of a different depth, 16-bit or even floating-point data, I guess all you need to do is write a function like:
template<class T>
T customColorMapper(T input_pixel)
{
T output_pixel = 0;
// do something with output_pixel based on input_pixel
return output_pixel;
}
and apply it to each source image pixel like:
cv::Mat dst_image = src_image.clone(); //copy data
dst_image.forEach<TYPE>([](TYPE& input_pixel, const int* pos_row_col) -> void {
input_pixel = customColorMapper<TYPE>(input_pixel);
});
Of course TYPE needs to be a valid type. Maybe a specialized version of this function taking cv::Scalar or cv::Vec3-something would be nice if you need to work with multiple channels.
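For the multi-channel case, a hedged sketch (assuming an 8-bit BGR image; the per-channel mapping is purely illustrative):
// Sketch: per-pixel mapping on a 3-channel 8-bit image using cv::Mat::forEach
cv::Mat dst_image = src_image.clone();
dst_image.forEach<cv::Vec3b>([](cv::Vec3b& px, const int* /*pos*/) -> void {
    px[0] = 255 - px[0];   // illustrative: invert blue
    px[1] = px[1];         // keep green
    px[2] = px[2] / 2;     // halve red
});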
Hope this helps!
I managed to replicate the MATLAB behaviour, but had to resort to manually iterating over each pixel and setting the value to the maximum value for the image depth, or scaling the value where needed.
My code looked something like this:
double minVal, maxVal;
cv::minMaxLoc(dst, &minVal, &maxVal);
double axisThreshold = floor(maxVal / contrastLevel);

for (int i = 0; i < dst.rows; i++)
{
    for (int j = 0; j < dst.cols; j++)
    {
        double pixel = dst.at<ushort>(i, j);   // dst is a 16-bit unsigned (CV_16UC1) image
        if (pixel >= axisThreshold)
        {
            pixel = USHRT_MAX;                      // clip to the maximum intensity
        }
        else
        {
            pixel *= (USHRT_MAX / axisThreshold);   // stretch the remaining range
        }
        dst.at<ushort>(i, j) = cv::saturate_cast<ushort>(pixel);
    }
}
In my example I had a slider which adjusted the contrast/brightness (we called it contrast, the original implementation called it brightness).
When the contrast/brightness was changed, the program would retrieve the maximum pixel value and then compute the axis limit by doing
calculatedThreshold = maxPixelValue / contrast
Each pixel above the threshold gets set to the maximum pixel value; each pixel below the threshold gets multiplied by a scale factor calculated as
scale = maxPixelValue / calculatedThreshold
TBH I can't say I fully understand the maths behind it; I just used trial and error until it worked, so any help in that department would be appreciated. HOWEVER, it seems to do what I want!
My understanding of the initial Matlab implementation and the terminology "brightness" is that it is in fact an attempt to scale the colourmap so that the "brighter" the image, the less intense each pixel has to be in order to map to a particular colour in the colourmap.
Since applyColorMap only works on 8-bit images, when the brightness increases and the colourmap axis values decrease, we need to ensure the pixel values scale accordingly so that they match up with the "higher" intensity values in the map.
I have seen numerous OpenCV tutorials which use this approach to changing the contrast/brightness, but they often promote the use of the optimised convertTo (especially if you're trying to use the GPU). However, as far as I can see, convertTo applies the alpha/beta values uniformly and not on a pixel-by-pixel basis, so I can't use that approach.
I will update this question if I find more suitable OpenCV functions to achieve what I want.
I'm doing some tutorials for openFrameworks (I'm kind of a noob when it comes to coding, but have a bit of experience so far with tutorials and learning what's going on over the past few years) and a major part of the code involves grabbing the sound spectrum of an audio sample and throwing the values into an array to control a float value. But I can't seem to wrap my head around what's going on here.
This is the relevant code (it's a VJ shaper that rotates and changes the size of shapes according to input from the sound spectrum):
header:
float * fftSmooth;
int bands;
cpp setup:
fftSmooth = new float[8192];
for (int i = 0; i < 8192; i++) {
fftSmooth[i] = 0;
}
bands = 64;
cpp update:
float * value = ofSoundGetSpectrum(bands);
for (int i = 0; i < bands; i++) {
fftSmooth[i] *= release; //"release" is a float
if (fftSmooth[i] < value[i]) {
fftSmooth[i] = value[i];
}
}
If anyone could walk me through the steps of what's going on, that would be great. I understand (sort of) that in the setup an array called "fftSmooth" is created with 8192 floats in it and then filled with zeros in the for loop, after which the int "bands" is assigned a value of 64. Then in the update, another array called "value" is created with 64 floats in it by looking at "bands", which is also the number of bands passed to ofSoundGetSpectrum, which grabs the frequency levels from a sound file as it plays. I've looked at the openFrameworks reference page for the sound spectrum function and didn't really get any more clues as to what it's doing in this context, and I have no idea what the for loops and if statements in the update section are doing either.
Not knowing what's going on really isn't going to affect whether I can actually use the code or not, but I feel like if I want to build on this code (grabbing different frequency ranges etc.) I need to know what the for loops and if statements in the update are doing.
ofSoundGetSpectrum(...)
Gets a frequency spectrum sample, taking all current sound players into account.
Each band will be represented as a float between 0 and 1.
This appears to be taking an instantaneous FFT, and returning the "strength" of each of the frequency bands.
I assume the second half of the code is run in a loop. The first time through, it just copies the current band strengths into fftSmooth. In subsequent passes, the multiplication by release is designed to reduce the value in fftSmooth by some percentage; then any new band strength greater than the filtered one overwrites the old value. For example, with release = 0.9, a band that suddenly goes quiet decays by 10% per frame rather than dropping to zero instantly.
If you animate plots of fftSmooth, you should get an image like this (minus the color):
I am building a vision system which can count boxes moving on a variable speed conveyor belt.
Using open_cv and c++, I could separate the blobs and extract the respective centroids.
Now I have to increment the count variable, if the centroid crosses the cutoff boundary line.
This is where I am stuck. I tried 2 alternatives.
Fixing a rectangular strip where a centroid would stay for only a single frame.
But since the conveyor is multi-speed, I could not fix a constant boundary value.
I tried something like
centroid_prev = centroid_now;
centroid_now = posX;
if (centroid_now >= xLimit && centroid_prev < xLimit)
{
count++;
}
This works fine if just a single box is present on the conveyor.
But for 2 or more blobs in the same frame, I do not know how to handle this using arrays of contours.
Can you please suggest a simple counting algorithm which can compare blob properties between the previous frame and the current frame, even if multiple blobs are present per frame?
PS: The conveyor speed is around 50 boxes/second, so a lightweight algorithm would be very much appreciated, otherwise we may end up with a lower frame rate.
Assuming the images you pasted are representative, you can easily solve this by doing some kind of tracking.
The simplest way that comes to mind is to use goodFeaturesToTrack and calcOpticalFlowPyrLK to track the motion of the conveyor.
You'll probably need to do some filtering on the result, but I don't think that would be difficult, as the motion and images are very low in noise.
Once you have that motion, you can calculate for each centroid when it moved beyond a certain X threshold and count it.
With a low number of corners (<100) such as in this image, it should be fast.
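A rough sketch of that idea (variable names such as prevGray, currGray, centroids, xLimit and count, as well as the maxCorners value, are assumptions for illustration, not code from this answer):
// Sketch: estimate the belt motion between two consecutive grayscale frames with
// sparse optical flow, then decide per centroid whether it has crossed xLimit.
std::vector<cv::Point2f> prevPts, nextPts;
std::vector<uchar> status;
std::vector<float> err;
cv::goodFeaturesToTrack(prevGray, prevPts, 100, 0.01, 10);   // up to 100 corners (assumed)
cv::calcOpticalFlowPyrLK(prevGray, currGray, prevPts, nextPts, status, err);

// Median horizontal displacement as a simple, noise-tolerant motion estimate (needs <algorithm>).
std::vector<float> dx;
for (size_t k = 0; k < prevPts.size(); ++k)
    if (status[k]) dx.push_back(nextPts[k].x - prevPts[k].x);
float beltMotion = 0.f;
if (!dx.empty()) {
    std::nth_element(dx.begin(), dx.begin() + dx.size() / 2, dx.end());
    beltMotion = dx[dx.size() / 2];
}

// A centroid is counted when its estimated previous position was still before
// the line but its current position is past it.
for (const cv::Point2f& c : centroids)
    if (c.x >= xLimit && c.x - beltMotion < xLimit)
        count++;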
Have you tried matching the centroid coordinates from the previous frame with the centroids from the new frame? You can use OpenCV's descriptor matchers for that. (The code samples all match feature vectors, but there's no reason why you shouldn't use them for coordinate matching.)
If you're worried about performance: matching 5-10 coordinate centers should be orders of magnitudes faster than finding blobs in an image.
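A hedged sketch of what that could look like (the brute-force matcher and the Mat layout are my own way of applying the idea; prevCentroids/currCentroids are assumed std::vector<cv::Point2f>, and xLimit/count come from the question):
// Sketch: match centroids across frames by treating each (x, y) coordinate as a
// 2-element float "descriptor" and pairing every new centroid with the nearest old one.
cv::Mat prevDesc((int)prevCentroids.size(), 2, CV_32F);
cv::Mat currDesc((int)currCentroids.size(), 2, CV_32F);
for (int k = 0; k < prevDesc.rows; ++k) {
    prevDesc.at<float>(k, 0) = prevCentroids[k].x;
    prevDesc.at<float>(k, 1) = prevCentroids[k].y;
}
for (int k = 0; k < currDesc.rows; ++k) {
    currDesc.at<float>(k, 0) = currCentroids[k].x;
    currDesc.at<float>(k, 1) = currCentroids[k].y;
}

cv::BFMatcher matcher(cv::NORM_L2);
std::vector<cv::DMatch> matches;
matcher.match(currDesc, prevDesc, matches);   // each current centroid -> closest previous one

for (const cv::DMatch& m : matches) {
    float xNow  = currCentroids[m.queryIdx].x;
    float xPrev = prevCentroids[m.trainIdx].x;
    if (xNow >= xLimit && xPrev < xLimit)     // same crossing test as in the question
        count++;
}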
This is the algorithm for arrays. It's just an extension of what you are doing; you can adjust the specifics:
for (i = 0; i < centroid.length; i++)
    centroid_prev[i] = centroid[i].posX;

for (frame j = 0 to ...) {
    // ... recompute the centroids for this frame
    for (i = 0; i < centroid.length; i++) {
        centroid_now = centroid[i].posX;
        if (centroid_now >= xLimit && centroid_prev[i] < xLimit)
        {
            count++;
        }
    }
    for (i = 0; i < centroid.length; i++)
        centroid_prev[i] = centroid[i].posX;
} // end of frame loop j
--
If the objects can move about (and they look about the same) you need to add additional info such as color to locate the same objects.
EDIT: honest recommendation
If you want to stream from a PMD in real time, use C#. Any UI is simple to create, and there is quite a mighty library, MetriCam by Metrilus AG, which supports streaming for a variety of 3D cameras. I am able to get a stable 45 fps with that.
ORIGINAL:
I've been trying to get depth information from a PMD CamBoard nano and visualize it in a GUI. The information is delivered as a 165x120 float array.
As I also want to use the data for analysis purposes (image quality, white noise etc.), I need to grab the frames at a specific framerate. The problem is that the SDK which PMD delivers with its camera (for MATLAB & C) only provides the possibility to grab single frames by calling
pmdUpdate(hnd);
so the framerate is dependent on how often you poll the image data.
I initially tried to do the analysis in MATLAB, but I couldn't get more than 30 fps out of the camera and adding some further code to the loop made it impossible to work with (I need at least reliable 25 fps).
I then switched to C, where I got rates of up to 70 fps, but could not visualize the data.
Then I tried it with Qt, which is based on C/C++ (it should therefore be fast at polling the image data) and where I could easily include the libraries of the PMDSDK. As I am new to Qt, though, I do not know much about the UI elements.
So my question:
Is there any performant way to visualize a 2D float array in a Qt GUI? If not, is there anything useful in Visual Studio with C++?
(I know that drawing every pixel one by one on a QGraphicsView is dumb, but I tried it, and I get a whopping framerate of .4 fps...)
Thanks for any helpful suggestions!
Jannik
The QImage class actually has a constructor that accepts a uchar pointer/array. You only need to map the float values to RGB values in uchar format.
pmdGetDistances(hnd, dist, dd.img.numColumns*dd.img.numRows*sizeof(float));
uchar *imagemap = new uchar[dd.img.numColumns*dd.img.numRows*3];
int i, j;
for (i = 0; i < 165; i++){
    for (j = 0; j < 120; j++){
        // scale the raw distance into the 0..255 range (factor 40 as before) and
        // clamp out-of-range values before the cast, not after it
        float scaled = 40.0f * dist[j*165+i];
        uchar value = (scaled < 0.0f || scaled > 255.0f) ? 0 : (uchar)std::floor(scaled);
        // colour scaling integrated
        imagemap[3*(j*165+i)]   = floor((255-value)*(255-value)/255.0);
        imagemap[3*(j*165+i)+1] = abs(floor((value-127)/1.5));
        imagemap[3*(j*165+i)+2] = floor(value*value/255.0);
    }
}
The QImage can then be converted to a QPixmap and displayed in the QGraphicsView. This worked for me, but the framerate does not seem really stable.
QImage image(imagemap, 165, 120, 165*3, QImage::Format_RGB888);
QPixmap pmap(QPixmap::fromImage(image));
scene->addPixmap(pmap.scaled(165,120));
ui->viewCamera->update();
It could be worth a try to send the thread to sleep until the desired time has elapsed: QThread::msleep(msec);
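For example, a minimal sketch of pacing the polling loop to a fixed rate (the 40 ms period, i.e. 25 fps, and the capturing flag are assumed values/names matching the requirement in the question, not part of this answer):
// Sketch: poll the camera at a roughly fixed rate from a worker thread by
// sleeping away the remainder of each frame period (needs <QThread> and <QElapsedTimer>).
const qint64 framePeriodMs = 40;   // 40 ms ~ 25 fps (assumed target)
QElapsedTimer timer;
while (capturing) {
    timer.start();
    pmdUpdate(hnd);                // grab the next frame from the PMD SDK
    // ... convert to QImage / QPixmap and hand it to the GUI thread here ...
    qint64 elapsed = timer.elapsed();
    if (elapsed < framePeriodMs)
        QThread::msleep(framePeriodMs - elapsed);
}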
I want to plot the waveform of a .wav file for a specific plotting width.
Which method should I use to display a correct waveform plot?
Any suggestions, tutorials or links are welcome.
Basic algorithm:
Find the number of samples that fit into the draw window
Determine how many samples should be represented by each pixel
Calculate the RMS (or peak) value for each pixel from its block of samples. Plain averaging does not work for audio signals.
Draw the values.
Let's assume that n (number of samples) = 44100 and w (width) = 100 pixels:
then each pixel should represent 44100/100 == 441 samples (blocksize)
for (x = 0; x < w; x++)
draw_pixel(x_offset + x,
y_baseline - rms(&mono_samples[x * blocksize], blocksize));
Stuff to try for a different visual appearance:
RMS vs. the max value from each block
overlapping blocks (block size x, but advance by x/2 for each pixel, etc.)
Downsampling would probably not work, as you would lose peak information.
Either use RMS (the block size depends on how far you are zoomed in):
float RMS = 0;
for (int a = 0; a < BlockSize; a++)
{
RMS += Samples[a]*Samples[a];
}
RMS = sqrt(RMS/BlockSize);
or min/max (this is what Cool Edit/Audition uses):
float Max = -10000000;
float Min = 1000000;
for (int a = 0; a < BlockSize; a++)
{
if (Samples[a] > Max) Max = Samples[a];
if (Samples[a] < Min) Min = Samples[a];
}
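To turn those per-block values into an actual plot, one common approach (a sketch; drawVerticalLine is a hypothetical placeholder for whatever line primitive your drawing backend provides, and the [-1, 1] sample range is an assumption):
// Sketch: one pixel column per block, drawn from the block's min to its max.
void drawWaveform(const float* samples, int numSamples, int width, int height)
{
    int blockSize = numSamples / width;
    for (int x = 0; x < width; x++) {
        float mn = 1.0f, mx = -1.0f;
        for (int a = 0; a < blockSize; a++) {
            float s = samples[x * blockSize + a];
            if (s > mx) mx = s;
            if (s < mn) mn = s;
        }
        int y0 = (int)((1.0f - mx) * 0.5f * height);   // map [-1, 1] to [height, 0]
        int y1 = (int)((1.0f - mn) * 0.5f * height);
        drawVerticalLine(x, y0, y1);                   // hypothetical drawing call
    }
}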
Almost any kind of plotting is platform specific. That said, .wav files are most commonly used on Windows, so it's probably a fair guess that you're interested primarily (or exclusively) in code for Windows as well. In this case, it mostly depends on your speed requirements. If you want a fairly static display, you can just draw with MoveTo and (mostly) LineTo. If that's not fast enough, you can gain a little speed by using something like PolyLine.
If you want it substantially faster, chances are that your best bet is to use something like OpenGL or DirectX graphics. Either of these does the majority of real work on the graphics card. Given that you're talking about drawing a graph of sound waves, even a low-end graphics card with little or no work on optimizing the drawing will probably keep up quite easily with almost anything you're likely to throw at it.
Edit: As far as reading the .wav file itself goes, the format is pretty simple. Most .wav files are uncompressed PCM samples, so drawing them is a simple matter of reading the headers to figure out the sample size and number of channels, then scaling the data to fit in your window.
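For reference, a rough sketch of the canonical PCM WAV header layout (this assumes the simple case of a single "fmt " chunk followed directly by the "data" chunk; real files can contain extra chunks, so treat it as illustrative only):
// Sketch: canonical 44-byte PCM WAV header (all fields little-endian).
#include <cstdint>
#pragma pack(push, 1)
struct WavHeader {
    char     riff[4];        // "RIFF"
    uint32_t riffSize;       // file size - 8
    char     wave[4];        // "WAVE"
    char     fmt[4];         // "fmt "
    uint32_t fmtSize;        // 16 for plain PCM
    uint16_t audioFormat;    // 1 = uncompressed PCM
    uint16_t numChannels;    // 1 = mono, 2 = stereo
    uint32_t sampleRate;     // e.g. 44100
    uint32_t byteRate;       // sampleRate * numChannels * bitsPerSample / 8
    uint16_t blockAlign;     // numChannels * bitsPerSample / 8
    uint16_t bitsPerSample;  // e.g. 16
    char     data[4];        // "data"
    uint32_t dataSize;       // number of bytes of sample data that follow
};
#pragma pack(pop)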
Edit2: You have a couple of choices for handling left and right channels. One is to draw them in two separate plots, typically one above the other. Another is to draw them superimposed, but in different colors. Which is more suitable depends on what you're trying to accomplish -- if it's mostly to look cool, a superimposed, multi-color plot will probably work nicely. If you want to allow the user to really examine what's there in detail, you'll probably want two separate plots.
What exactly do you mean by a waveform? Are you trying to plot the level of the frequency components in the signal, a.k.a. the spectrum, most commonly seen in music visualizers, car stereos and boomboxes? If so, you should use the Fast Fourier Transform. The FFT is a standard technique to split a time-domain signal into its individual frequencies. There are tons of good FFT library routines available.
In C++, you can use the openFrameworks library to set up a music player for wav, extract the FFT and draw it.
You can also use Processing with the Minim library to do the same. I have tried it and it is pretty straightforward.
Processing even has support for OpenGL and it is a snap to use.
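For instance, a minimal openFrameworks sketch along those lines (the band count, bar scaling and sound file name are assumptions; player is assumed to be an ofSoundPlayer member declared in ofApp.h):
// Sketch: play a wav file and draw its spectrum as vertical bars (ofApp.cpp).
void ofApp::setup(){
    player.load("sound.wav");     // assumed file in bin/data
    player.setLoop(true);
    player.play();
}

void ofApp::draw(){
    ofBackground(0);
    ofSetColor(255);
    const int bands = 64;                          // assumed band count
    float* spectrum = ofSoundGetSpectrum(bands);   // strength of each band, 0..1
    float barWidth = ofGetWidth() / (float)bands;
    for (int i = 0; i < bands; i++){
        float barHeight = spectrum[i] * ofGetHeight();
        ofDrawRectangle(i * barWidth, ofGetHeight() - barHeight, barWidth - 1, barHeight);
    }
}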