OpenCV multithreading with Mats - c++

Hey! I am working on an assignment about multithreading with OpenCV, and my question is this: how can I get all my threads to work on the same image (stored in a Mat)? I know making copies would be slow, which would defeat the purpose of multithreading. I would also like to control the number of threads I use; I have seen the lambdas C++11 introduced, but I do not know how to use them in a way that lets me control the thread count.
I currently have a function that calculates every pixel to be put in the image, so my code running on serial looks something like this:
for (int i = 0; i < MyMat.rows; i++) {
    for (int j = 0; j < MyMat.cols; j++) {
        uchar value = (uchar) MyFunction(i, j);
        MyMat.ptr<uchar>(i)[j] = value; // ptr<uchar>(i) is a pointer to row i
    }
}
English is not my mother tongue, if I did not explain myself properly please ask for clarifications. Any help is good help!

If you split the image into horizontal bands, each thread can work on its own band independently. If each thread does not change any image data beyond its band, it should work.
In fact, OpenCV supports this already.
Take a look at parallel_for_ and how it is used.
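If you want full control over the thread count without parallel_for_, the band-splitting idea can also be sketched with plain std::thread. The sketch below uses a raw byte buffer to stand in for the Mat's pixel data so it stays self-contained (with OpenCV you would index the Mat the same way via ptr<uchar>(row)); MyFunction and renderBands are hypothetical names, not part of any library:

```cpp
#include <thread>
#include <vector>

// Stand-in for the per-pixel MyFunction from the question.
unsigned char MyFunction(int i, int j) {
    return static_cast<unsigned char>((i + j) % 256);
}

// Fill a rows x cols byte buffer using numThreads threads, each owning a
// horizontal band of rows. No thread writes outside its own band, so the
// shared buffer needs no locking.
std::vector<unsigned char> renderBands(int rows, int cols, int numThreads) {
    std::vector<unsigned char> data(rows * cols);
    std::vector<std::thread> pool;
    for (int t = 0; t < numThreads; ++t) {
        int begin = rows * t / numThreads;       // first row of this band
        int end   = rows * (t + 1) / numThreads; // one past the last row
        pool.emplace_back([&data, cols, begin, end] {
            for (int i = begin; i < end; ++i)
                for (int j = 0; j < cols; ++j)
                    data[i * cols + j] = MyFunction(i, j);
        });
    }
    for (std::thread &th : pool) th.join();
    return data;
}
```

Because each band is a disjoint set of rows, the threads never touch the same memory, which is exactly the property that makes sharing one image safe.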

Related

how to access pixel in OpenCV Cuda (GpuMat)

cv::Mat image = cv::Mat::zeros(1920, 1080, CV_8UC4); // an example; my image has 4 channels (rows, cols, type)
cv::Vec4b& pixel = image.at<cv::Vec4b>(i, j); // i is the row index, j is the column index
I want to use CUDA (GpuMat), but there is no ".at()".
How can I change my code to access the pixels?
The cv::cuda::GpuMat class stores its data on the GPU/device, and that memory cannot be accessed directly by CPU/host code. This is why there is no equivalent of cv::Mat::at(). Transferring data between the host and device is slow, so a per-pixel operation on a cv::cuda::GpuMat done that way would be far slower than on a cv::Mat.
It is, however, possible to write CUDA kernels that perform per-pixel operations. I'm afraid I can't give good advice on this myself, but it is doable, and there are answers to similar problems, such as this one, that might be able to help you.
Beyond that, depending on exactly what you need to do, there may be a built-in function that does something similar.

Filling blobs in OpenCV (OpenFrameworks) with images or video

So in openFrameworks (which I am new to), using ofxOpenCV, I am trying to track blobs (done) and use that information as a mask in order to fill each blob with a different image or video (not done). Looking through the documentation for ofxCvContourFinder, I don't see any methods related to making a mask or filling the blob area. Does anyone have advice on how to continue, or on whether I am looking at this the wrong way? (http://www.openframeworks.cc/documentation/ofxOpenCv/ofxCvContourFinder.html#show_blobs)
for (int i = 0; i < contourFinder.nBlobs; i++) {
    contourFinder.blobs[i].draw(360, 100);
    // some sort of blobs[i].fill()?
}
Thanks!
Getting this together was definitely at the limits of what I am capable of, but I did get it working. The first thing I did was switch to Kyle McDonald's ofxCv addon.
With it, I used the (much simpler) contour tracking to get the shapes, created shader FBOs for each shape, and assigned videos to those as an alpha mask. I apologize for the lack of detail; it has been a while and I haven't touched ofxCv since. If you need help with it, though, get hold of me and I'll share what I have.

Person counting for a store using image processing techniques in OpenCV

I am new to image processing and am writing a small application in which I need to count the number of people entering a store. The entry point is fixed, and there are 4 camera feeds combined in the same video for the counting. What can I use to do this?
So far I have used running average and background subtraction, which gives me only the parts of the image that contain a person. How do I use this for counting? I am using OpenCV with C++.
Thanks!
If you have multiple video streams at your disposal, you can calibrate your system to create a passive stereo framework.
I've already seen a lot of work on this topic, like this one:
http://www.dis.uniroma1.it/~iocchi/publications/iocchi-ie05.pdf
You can also take a look at this question:
People counting using OpenCV

how to store and keep skeleton data in kinect (access,read write,threads...)?

I have a question and I am sure somebody has faced this before. I will appreciate your suggestions.
I am working with the Kinect's skeleton data (joint position information). The Kinect gives us information on 20 joints in each frame, but I need to keep a variable number of frames in memory (say, 2000 frames) and then manipulate them: read and write the data and apply different algorithms to it.
In your opinion, what is the best way to keep this frame information (considering simultaneous reads/writes, threads, etc.)? What I have found so far is the Concurrency namespace, and I want to add each frame to a concurrent vector or queue, like:
Concurrency::concurrent_vector<NUI_SKELETON_DATA> skeletonFrameQueue;
skeletonFrameQueue.push_back(frameData);
That is what I came up with, but sometimes I get runtime errors and the program crashes with heap-related problems.
What do you think, and what do you suggest for storing this data? Are concurrent vectors OK? Are they thread-safe? Or what are the other options?
thanks in advance for your suggestions.
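One common cause of heap crashes in this situation is a reader iterating a container while another thread grows it. A simple alternative to a concurrent container is a mutex-guarded deque capped at the frame count you want, where readers take a snapshot under the lock before running their algorithms. A minimal sketch, using a hypothetical SkeletonFrame struct in place of NUI_SKELETON_DATA:

```cpp
#include <array>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical stand-in for NUI_SKELETON_DATA: 20 joints, 3 coordinates each.
struct SkeletonFrame {
    std::array<float, 20 * 3> joints{};
};

class FrameBuffer {
public:
    explicit FrameBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Writer side: append a frame, dropping the oldest once the cap is hit.
    void push(const SkeletonFrame &f) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.size() == capacity_) frames_.pop_front();
        frames_.push_back(f);
    }

    // Reader side: copy a consistent snapshot to run algorithms on, so
    // iteration never races with a concurrent push.
    std::vector<SkeletonFrame> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::vector<SkeletonFrame>(frames_.begin(), frames_.end());
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return frames_.size();
    }

private:
    mutable std::mutex mutex_;
    std::size_t capacity_;
    std::deque<SkeletonFrame> frames_;
};
```

The snapshot copy costs a little memory, but it makes every algorithm run on a stable view of the last N frames regardless of what the capture thread is doing.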

How do I do high quality scaling of a image?

I'm writing some code to scale a 32 bit RGBA image in C/C++. I have written a few attempts that have been somewhat successful, but they're slow and most importantly the quality of the sized image is not acceptable.
I compared the same image scaled by OpenGL (i.e. my video card) and my routine and it's miles apart in quality. I've Google Code Searched, scoured source trees of anything I thought would shed some light (SDL, Allegro, wxWidgets, CxImage, GD, ImageMagick, etc.) but usually their code is either convoluted and scattered all over the place or riddled with assembler and little or no comments. I've also read multiple articles on Wikipedia and elsewhere, and I'm just not finding a clear explanation of what I need. I understand the basic concepts of interpolation and sampling, but I'm struggling to get the algorithm right. I do NOT want to rely on an external library for one routine and have to convert to their image format and back. Besides, I'd like to know how to do it myself anyway. :)
I have seen a similar question asked on stack overflow before, but it wasn't really answered in this way, but I'm hoping there's someone out there who can help nudge me in the right direction. Maybe point me to some articles or pseudo code... anything to help me learn and do.
Here's what I'm looking for:
No assembler (I'm writing very portable code for multiple processor types).
No dependencies on external libraries.
I am primarily concerned with scaling DOWN, but will also need to write a scale up routine later.
Quality of the result and clarity of the algorithm is most important (I can optimize it later).
My routine essentially takes the following form:
void DrawScaled(uint32 *src, uint32 *dst,
                int src_x, int src_y, int src_w, int src_h,
                int dst_x, int dst_y, int dst_w, int dst_h);
Thanks!
UPDATE: To clarify, for downscaling I need something more advanced than a box resample, which blurs the image too much. I suspect what I want is some kind of bicubic (or other) filter that is somewhat the reverse of a bicubic upscaling algorithm (i.e. each destination pixel is computed from all contributing source pixels, combined with a weighting algorithm that keeps things sharp).
Example
Here's an example of what I'm getting from the wxWidgets BoxResample algorithm vs. what I want on a 256x256 bitmap scaled to 55x55.
www.free_image_hosting.net/uploads/1a25434e0b.png
And finally:
www.free_image_hosting.net/uploads/eec3065e2f.png
the original 256x256 image
I've found the wxWidgets implementation fairly straightforward to modify as required. It is all C++ so no problems with portability there. The only difference is that their implementation works with unsigned char arrays (which I find to be the easiest way to deal with images anyhow) with a byte order of RGB and the alpha component in a separate array.
If you refer to the "src/common/image.cpp" file in the wxWidgets source tree there is a down-sampler function which uses a box sampling method "wxImage::ResampleBox" and an up-scaler function called "wxImage::ResampleBicubic".
A fairly simple and decent algorithm to resample images is bicubic interpolation; Wikipedia alone has all the info you need to get it implemented.
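For reference, the cubic convolution kernel at the heart of bicubic interpolation is only a few lines. This sketch uses the common a = -0.5 (Catmull-Rom) variant; in a resampler, each destination pixel is a weighted sum over the 4x4 neighborhood of source pixels, with a function like this supplying the weights (cubicWeight is a hypothetical name):

```cpp
#include <cmath>

// Cubic convolution kernel (a = -0.5, the Catmull-Rom variant).
// x is the distance from the sample point to a source pixel centre.
double cubicWeight(double x) {
    const double a = -0.5;
    x = std::fabs(x);
    if (x < 1.0)
        return (a + 2.0) * x * x * x - (a + 3.0) * x * x + 1.0;
    if (x < 2.0)
        return a * x * x * x - 5.0 * a * x * x + 8.0 * a * x - 4.0 * a;
    return 0.0; // outside the 4-pixel support
}
```

A handy sanity check: for any fractional offset t, the four weights at distances 1+t, t, 1-t, and 2-t sum to exactly 1, so no brightness is gained or lost.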
Is it possible that OpenGL is doing the scaling in the vector domain? If so, there is no way that any pixel-based scaling is going to be near it in quality. This is the big advantage of vector based images.
The bicubic algorithm can be tuned for sharpness vs. artifacts - I'm trying to find a link, I'll edit it in when I do.
Edit: It was the Mitchell-Netravali work that I was thinking of, which is referenced at the bottom of this link:
http://www.cg.tuwien.ac.at/~theussl/DA/node11.html
You might also look into Lanczos resampling as an alternative to bicubic.
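The Lanczos kernel mentioned above is also short to write down. A minimal sketch (lanczosWeight is a hypothetical helper, not from any particular library); in a resampler each output pixel is a weighted sum over the 2a nearest source pixels in each dimension:

```cpp
#include <cmath>

// Lanczos kernel with support a (a = 3 is the usual choice):
// sinc(x) * sinc(x / a) for |x| < a, else 0.
double lanczosWeight(double x, int a = 3) {
    x = std::fabs(x);
    if (x < 1e-12) return 1.0;       // limit of sinc at 0
    if (x >= a) return 0.0;          // outside the support
    const double pi = 3.14159265358979323846;
    return a * std::sin(pi * x) * std::sin(pi * x / a) / (pi * pi * x * x);
}
```

Unlike the cubic convolution kernels, Lanczos weights do not sum exactly to 1 at every offset, so a practical resampler divides each output pixel by the sum of the weights used.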
Now that I see your original image, I think that OpenGL is using a nearest neighbor algorithm. Not only is it the simplest possible way to resize, but it's also the quickest. The only downside is that it looks very rough if there's any detail in your original image.
The idea is to take evenly spaced samples from your original image; in your case, 55 out of 256, or one out of every 4.6545. Just round the number to get the pixel to choose.
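That sampling scheme takes only a few lines of code. A sketch for a grayscale image stored as a plain row-major byte buffer (nearestResize is a hypothetical name; integer arithmetic does the rounding-down for us):

```cpp
#include <vector>

// Nearest-neighbour resize of a grayscale image stored row-major.
std::vector<unsigned char> nearestResize(const std::vector<unsigned char> &src,
                                         int srcW, int srcH, int dstW, int dstH) {
    std::vector<unsigned char> dst(dstW * dstH);
    for (int y = 0; y < dstH; ++y) {
        // Map each destination row back to its evenly spaced source row.
        int sy = y * srcH / dstH;
        for (int x = 0; x < dstW; ++x) {
            int sx = x * srcW / dstW;
            dst[y * dstW + x] = src[sy * srcW + sx];
        }
    }
    return dst;
}
```

For 256 down to 55, `x * 256 / 55` picks every 4.65th pixel on average, which is exactly the "round the number" rule described above, just folded into integer division.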
Try using the Adobe Generic Image Library ( http://opensource.adobe.com/wiki/display/gil/Downloads ) if you want something ready and not only an algorithm.
Extract from: http://www.catenary.com/howto/enlarge.html#c
Enlarge or Reduce - the C Source Code
Requires Victor Image Processing Library for 32-bit Windows v 5.3 or higher.
int enlarge_or_reduce(imgdes *image1)
{
    imgdes timage;
    int dx, dy, rcode, pct = 83; // 83 percent of original size
    // Allocate space for the new image
    dx = (int)(((long)(image1->endx - image1->stx + 1)) * pct / 100);
    dy = (int)(((long)(image1->endy - image1->sty + 1)) * pct / 100);
    if((rcode = allocimage(&timage, dx, dy,
            image1->bmh->biBitCount)) == NO_ERROR) {
        // Resize image into timage
        if((rcode = resizeex(image1, &timage, 1)) == NO_ERROR) {
            // Success, free source image
            freeimage(image1);
            // Assign timage to image1
            copyimgdes(&timage, image1);
        }
        else // Error in resizing image, release timage memory
            freeimage(&timage);
    }
    return(rcode);
}
This example resizes an image area and replaces the original image with the new image.
Intel has the IPP libraries, which provide high-speed interpolation algorithms optimized for Intel processors. They are very good, but not free. Take a look at the following link:
Intel IPP
A generic article from our beloved host: Better Image Resizing, discussing the relative qualities of various algorithms (and it links to another CodeProject article).
It sounds like what you're really having difficulty understanding is the discrete -> continuous -> discrete flow involved in properly resampling an image. A good tech report that might help give you the insight into this that you need is Alvy Ray Smith's A Pixel Is Not A Little Square.
Take a look at ImageMagick, which does all kinds of rescaling filters.
As a follow-up: Jeremy Rudd posted this article above. It implements filtered two-pass resizing. The source is C#, but it looks clear enough that I can port it and give it a try. I found very similar C code yesterday that was much harder to understand (very bad variable names). I got it to sort of work, but it was very slow and did not produce good results, which led me to believe there was an error in my adaptation. I may have better luck writing it from scratch with this as a reference, which I'll try.
But considering how the two pass algorithm works I wonder if there isn't a faster way of doing it, perhaps even in one pass?
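The reason two passes are attractive is separability: a horizontal pass of M taps plus a vertical pass of N taps costs roughly M + N operations per pixel, while doing the same filter in one 2-D pass costs M * N. A sketch of the structure for the simplest case, integer-factor box downscaling (boxDownscale is a hypothetical helper, not from the article, and the integer averages truncate rather than round):

```cpp
#include <cstdint>
#include <vector>

// Two-pass (separable) box downscale by integer factors fx, fy.
// Pass 1 averages horizontally, pass 2 averages that result vertically,
// costing fx + fy adds per output pixel instead of fx * fy.
std::vector<uint8_t> boxDownscale(const std::vector<uint8_t> &src,
                                  int w, int h, int fx, int fy) {
    int dw = w / fx, dh = h / fy;
    // Horizontal pass: w x h  ->  dw x h (wider type to avoid overflow)
    std::vector<uint32_t> tmp(dw * h, 0);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < dw; ++x) {
            uint32_t sum = 0;
            for (int k = 0; k < fx; ++k) sum += src[y * w + x * fx + k];
            tmp[y * dw + x] = sum / fx;
        }
    // Vertical pass: dw x h  ->  dw x dh
    std::vector<uint8_t> dst(dw * dh);
    for (int y = 0; y < dh; ++y)
        for (int x = 0; x < dw; ++x) {
            uint32_t sum = 0;
            for (int k = 0; k < fy; ++k) sum += tmp[(y * fy + k) * dw + x];
            dst[y * dw + x] = static_cast<uint8_t>(sum / fy);
        }
    return dst;
}
```

So a fused single pass is possible, but for any filter wider than a couple of taps the separable two-pass version usually wins despite the intermediate buffer.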