I am currently working on a project using Microsoft's Azure Kinect SDK. I take the video feed from a depth camera as input and render the depth video. My application uses body tracking, and I am trying to highlight objects that the tracked body touches in a particular color. I do this by constantly calculating the distance between the body and every pixel, and considering a pixel as having been touched once that distance falls below a certain threshold. It works, but the rendering is quite slow, and I believe the distance calculation is the bottleneck. I'd like to do the distance calculation with a coroutine, so that rendering can continue while the calculation runs. I have done similar things in Unity with C#, but I've looked through the C++ coroutine documentation and can't quite figure it out.
How do I take a function that I've written and use a coroutine to make it run in parallel with the main function?
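For reference, here's a minimal sketch of the kind of thing I'm after, using std::async (which, as I understand it, actually runs the work on another thread, whereas C++20 coroutines by themselves don't create any parallelism). The names Point3 and computeTouchedMask are placeholders for my real Kinect data and function:

    #include <chrono>
    #include <cstddef>
    #include <future>
    #include <vector>

    // Placeholder type standing in for the real Kinect depth/body data.
    struct Point3 { float x, y, z; };

    // The expensive pass: mark every pixel within `threshold` of any tracked joint.
    std::vector<bool> computeTouchedMask(std::vector<Point3> pixels,
                                         std::vector<Point3> joints,
                                         float threshold)
    {
        std::vector<bool> touched(pixels.size(), false);
        for (std::size_t i = 0; i < pixels.size(); ++i)
            for (const Point3& j : joints) {
                float dx = pixels[i].x - j.x, dy = pixels[i].y - j.y, dz = pixels[i].z - j.z;
                if (dx * dx + dy * dy + dz * dz < threshold * threshold) {
                    touched[i] = true;
                    break;
                }
            }
        return touched;
    }

    // In the render loop: start the calculation and keep rendering while it runs.
    //   std::future<std::vector<bool>> pending =
    //       std::async(std::launch::async, computeTouchedMask, pixels, joints, 0.1f);
    //   ...render frames...
    //   if (pending.wait_for(std::chrono::seconds(0)) == std::future_status::ready)
    //       applyHighlights(pending.get());  // recolor touched pixels (my existing code)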
I need to render vector graphics very fast for use with OpenCV (in Node.js).
The fastest way to render simple shapes like ovals is to use OpenCV's drawing functions.
In my multithreaded test program I can produce ~625 single-channel 512×512 Mats per second, each with one randomly placed filled oval.
With 'librsvg', the fastest SVG-to-PNG renderer available in Node.js, I get only ~277 of the same Mats per second. That's not fast enough for my purposes.
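For reference, each frame of the OpenCV test essentially does the following (shown in C++; the Node.js OpenCV bindings expose an equivalent call, and the center and axes are randomized in the real test):

    #include <opencv2/core.hpp>
    #include <opencv2/imgproc.hpp>

    // One frame of the benchmark: a single filled, anti-aliased oval
    // drawn into a 1-channel 512x512 Mat.
    cv::Mat drawOvalFrame()
    {
        cv::Mat frame = cv::Mat::zeros(512, 512, CV_8UC1);
        cv::ellipse(frame,
                    cv::Point(256, 256),   // center
                    cv::Size(120, 60),     // half-axes
                    30.0,                  // rotation angle in degrees
                    0.0, 360.0,            // draw the full arc
                    cv::Scalar(255),       // intensity
                    cv::FILLED,            // filled rather than outlined
                    cv::LINE_AA);          // anti-aliased edges
        return frame;
    }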
I found another SVG renderer library based on OpenGL, SVGL, but I haven't tested its performance; there are no bindings for Node, it is C++ only.
I will need to render much more complicated vector graphics than just one ellipse.
So I expect a lot of work if I try to implement all the drawing functions I need with OpenCV, and I am not sure whether OpenCV's performance will still be acceptable for complicated vector images.
"Complicated" I mean some hundreds of semi-transparent arcs, beziers or some kind of rounded polygons, not filled or filled with solid semi-transparent color or, possibly, with gradients. And I want to render it to pretty large Mat, may be 1024*768 or so.
SVG already has everything I need, but I don't know C++, so it will (probably) also take a lot of time to implement bindings for SVGL, and I still don't know its performance.
Maybe there are some alternative open-source approaches?
I want to extract the background from a video, but I don't want to use the cv::bgsegm::BackgroundSubtractorMOG or cv::BackgroundSubtractorMOG2 methods, because they use frame means. Instead, I plan to use a frame-comparison method: take the first frame as the background model, compare the pixel values of each subsequent frame with the first frame's pixel values, and if there is no change, or the change is less than a threshold, treat the pixel as background. How can I implement this using OpenCV and C++?
Your question is too vague, I think. I can only give you some hints.
First, your approach is very simplistic. That's not bad, but from my experience it won't give great results, even if you have a lot of control over your scene. Nevertheless, I do not want to hold you back if you want to gain your own experience.
You probably want to take a look at:
Operations on Arrays in OpenCV
Basic Threshold Operations in OpenCV
Everything you need should be there. In particular, the absdiff operation and the threshold function (with a binary threshold type) should be of interest.
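A minimal sketch of the idea, assuming a video file as input (the threshold value 25 is arbitrary and will need tuning for your scene):

    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::VideoCapture cap("input.mp4");            // hypothetical input video
        cv::Mat background, frame, gray, diff, foregroundMask;

        cap >> background;                            // first frame = background model
        if (background.empty()) return 1;
        cv::cvtColor(background, background, cv::COLOR_BGR2GRAY);

        while (cap.read(frame)) {
            cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
            cv::absdiff(gray, background, diff);              // per-pixel |frame - model|
            cv::threshold(diff, foregroundMask, 25, 255,      // 25 = arbitrary threshold
                          cv::THRESH_BINARY);                 // above it -> foreground (255)
            cv::imshow("foreground mask", foregroundMask);
            if (cv::waitKey(30) == 27) break;                 // Esc to quit
        }
        return 0;
    }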
I'm writing an OpenGL 2D library in Python. Everything is going great, and the codebase is steadily growing.
Now I want to write unit tests so I don't accidentally introduce new bugs while fixing others or adding new features. But I have no idea how unit tests would work with a graphics library.
Some things I thought of:
make reference screenshots and compare them with autogenerated screenshots in the tests
replace opengl calls with logging statements and compare logs
But both seem like bad ideas. What is the common way to test graphics libraries?
The approach I have used in the past for component-level testing is:
Use a uniformly colored background (test with a few different colors).
Use uniformly colored rectangles as the graphical objects in tests (again with a few different colors).
Place rectangles in known places where you can calculate their projected position in the image by yourself.
Calculate expected intensity of each channel of each pixel (background, foreground or mixture).
If you have a test scenario that results in non-integer positions, use an inexact comparison (e.g. correlation).
Use calculations to create expected result images.
Compare output images to expected result images (see the sketch after this list).
If you have a blur effect, compare sum of intensity instead of discrete intensities.
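For the comparison steps, here's a sketch in C++ with OpenCV, assuming you can dump the rendered frame to an image (imagesMatch is a made-up name; the same operations exist in any image library):

    #include <opencv2/opencv.hpp>

    // Sketch: exact compare for integer-aligned scenes, correlation for the rest.
    bool imagesMatch(const cv::Mat& actual, const cv::Mat& expected,
                     bool exact, double minCorrelation = 0.99)
    {
        if (actual.size() != expected.size() || actual.type() != expected.type())
            return false;

        if (exact) {
            cv::Mat diff;
            cv::absdiff(actual, expected, diff);
            return cv::countNonZero(diff.reshape(1)) == 0;  // every channel identical
        }

        // Inexact compare: normalized cross-correlation of the two whole images
        // (equal-sized image and template yield a single 1x1 score).
        cv::Mat a, e, score;
        actual.convertTo(a, CV_32F);
        expected.convertTo(e, CV_32F);
        cv::matchTemplate(a, e, score, cv::TM_CCORR_NORMED);
        return score.at<float>(0, 0) >= minCorrelation;
    }

For the blur case, comparing cv::sum(actual) against cv::sum(expected) within a tolerance covers the sum-of-intensity check.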
As graham stated, internal units may be unit-testable free from graphics calls.
Break it down even further.
The calls that make the graphics will rely on algorithms - test the algorithms.
I am rewriting an OpenGL-based GIS/mapping program. Among other things, the program lets you load raster images of nautical charts, fix them to lon/lat coordinates, and zoom and pan around on them.
The previous version of the program uses a custom tiling system, where in essence it manually creates mipmaps of the original image, in the form of 256x256-pixel tiles at various power-of-two zoom levels. A tile for zoom level n - 1 is constructed from four tiles from zoom level n, using a simple average-of-four-points algorithm. So it turns off OpenGL mipmapping, and when it comes time to draw some part of the chart at some zoom level, it uses the tiles from the nearest-matching zoom level (the tiles exist only at power-of-two zoom levels, but the program allows arbitrary zoom levels) and then scales them to the actual zoom level. And of course it has to manage a cache of all these tiles at the various levels.
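(For concreteness, the average-of-four-points step is essentially this, sketched for an 8-bit single-channel tile:)

    // Level n -> level n-1 downsample: each output pixel is the average of the
    // four corresponding input pixels (a 2x2 box filter).
    void downsampleBox(const unsigned char* src, int srcW, int srcH,
                       unsigned char* dst)              // dst is (srcW/2) x (srcH/2)
    {
        int dstW = srcW / 2, dstH = srcH / 2;
        for (int y = 0; y < dstH; ++y)
            for (int x = 0; x < dstW; ++x) {
                int sum = src[(2*y)     * srcW + 2*x] + src[(2*y)     * srcW + 2*x + 1]
                        + src[(2*y + 1) * srcW + 2*x] + src[(2*y + 1) * srcW + 2*x + 1];
                dst[y * dstW + x] = static_cast<unsigned char>((sum + 2) / 4); // rounded average
            }
    }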
This tiling system seemed overly complex to me; it seemed like I should be able to let the graphics hardware do all of the mipmapping work. So in the new program, when I read in an image, I chop it into textures of 1024x1024 pixels each. Then I fix each texture to its lon/lat coordinates and let OpenGL handle the rest as I zoom and pan around.
It works, but the problem is that my results are a bit blurrier than the original program's, which matters for this application because you want to be able to read text on the charts as early as possible, zoom-wise. So it seems the simple average-of-four-points algorithm the original program uses gives sharper results than OpenGL plus my GPU.
I know there are several glTexParameter settings to control some aspects of how mipmaps work. I've tried various combinations of GL_TEXTURE_MAX_LEVEL (anywhere from 0 to 10) with various settings for GL_TEXTURE_MIN_FILTER. When I set GL_TEXTURE_MAX_LEVEL to 0 (no mipmaps), I certainly get "sharp" results, but they are too sharp, in the sense that pixels just get dropped here and there, so the numbers are unreadable at intermediate zooms. When I set GL_TEXTURE_MAX_LEVEL to a higher value, the image looks quite good when zoomed far out (e.g., when the whole chart fits on the screen), but at intermediate zooms you notice the blurriness, especially when looking at text on the charts. (If it weren't for the text, you might think "wow, OpenGL is doing a nice job of smoothly scaling my image," but with the text you think "why is this chart out of focus?")
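(The combinations I've been trying look like this; chartTexture stands in for one of my chart textures:)

    #include <GL/gl.h>

    // One of the combinations tried: a full mipmap chain with trilinear minification.
    void setChartFiltering(GLuint chartTexture)
    {
        glBindTexture(GL_TEXTURE_2D, chartTexture);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 10);  // 0 here disables mipmaps
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    }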
My understanding is that you basically tell OpenGL to generate mipmaps, and as you zoom it picks the appropriate mipmaps to use, with some limited options for interpolating between the two closest mipmap levels and for using the nearest pixel or averaging nearby pixels. However, as I say, none of these combinations seems to give results as clear, at the same zoom level on the chart (i.e., a zoom level where text is small but not minuscule, like the equivalent of "7-point" or "8-point" size), as the previous tile-based version.
My conclusion is that the mipmaps OpenGL creates are simply blurrier than the ones the previous program created with the average-of-four-points algorithm, and no amount of choosing the right mipmap level or GL_LINEAR vs. GL_NEAREST is going to get the sharpness I need.
Specific questions:
(1) Does it seem right that OpenGL is in fact making blurrier mipmaps than the average-of-four-points algorithm from the original program?
(2) Is there something I might have overlooked in my use of glTexParameter that could give sharper results using the mipmaps OpenGL is making?
(3) Is there some way I can get OpenGL to make sharper mipmaps in the first place, such as by using a "cubic" filter or otherwise controlling the mipmap creation process? Or, for that matter, it seems like I could use the same average-of-four-points code to manually generate the mipmaps and hand them off to OpenGL. But I don't know how to do that...
(1) It seems unlikely; I'd expect it to use a box filter, which is average-of-four-points in effect. Possibly it's just switching from one mipmap level to a higher-resolution one at a different moment; e.g., it "chooses the mipmap that most closely matches the size of the pixel being textured," so a 256x256 map may be used to texture a 383x383 area, whereas the manual system it replaces may always have scaled down from 512x512 until the target size was 256x256 or less.
(2) Not that I'm aware of in base GL, but if you switch to GLSL and the programmable pipeline, you can use the 'bias' parameter to texture2D if the problem is that a lower-resolution level is being used when you don't want it to be. Similarly, the GL_EXT_texture_lod_bias extension can do the same in the fixed pipeline. It's an NVIDIA extension from a decade ago and is something all programmable cards could do, so it's reasonably likely you'll have it.
(EDIT: reading the extension more thoroughly, texture LOD bias migrated into the core spec in OpenGL 1.4; clearly my man pages are very out of date. Checking the 1.4 spec, page 279, you can supply a GL_TEXTURE_LOD_BIAS.)
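In code that's a one-liner; negative values bias sampling toward the sharper, higher-resolution levels (-0.5 is just an example value):

    #include <GL/gl.h>

    // Fixed-pipeline LOD bias (core since OpenGL 1.4).
    void biasTowardSharperLevels()
    {
        glTexEnvf(GL_TEXTURE_FILTER_CONTROL, GL_TEXTURE_LOD_BIAS, -0.5f);
    }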
(3) Yes: if you disable GL_GENERATE_MIPMAP, you can use glTexImage2D to supply whatever image you like at every level of scale; that is what the 'level' parameter dictates. So you can supply completely unrelated mipmaps if you want.
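A sketch of that, assuming you've already built each level's pixels with your own average-of-four-points code (uploadMipChain and its arguments are made-up names):

    #include <GL/gl.h>
    #include <vector>

    // Upload a hand-built mipmap chain. 'levels[i]' is assumed to hold the
    // RGBA pixels of the (w >> i) x (h >> i) reduction of the base image.
    void uploadMipChain(GLuint tex, int w, int h,
                        const std::vector<const unsigned char*>& levels)
    {
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_FALSE); // we supply every level
        for (int level = 0; (w >> level) >= 1 && (h >> level) >= 1; ++level)
            glTexImage2D(GL_TEXTURE_2D, level, GL_RGBA,
                         w >> level, h >> level, 0,
                         GL_RGBA, GL_UNSIGNED_BYTE, levels[level]);
    }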
To answer your specific points: the four-point filtering you mention is equivalent to box filtering. This is less blurry than higher-order filters, but can produce aliasing patterns. One of the best filters for this purpose is the Lanczos filter. I suggest you calculate all of your mipmap levels from the base texture using a Lanczos filter, and crank up the anisotropic filtering settings on your graphics card.
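As a sketch of that pipeline, here's one way to build the chain with OpenCV's Lanczos resampling (cv::INTER_LANCZOS4), resampling each level from the full-resolution base so errors don't compound; the anisotropy calls require the EXT_texture_filter_anisotropic extension:

    #include <opencv2/imgproc.hpp>
    #include <vector>

    // Build a Lanczos-filtered mip chain, each level resampled from the base image.
    std::vector<cv::Mat> buildLanczosMips(const cv::Mat& base)
    {
        std::vector<cv::Mat> levels{base};
        for (int w = base.cols / 2, h = base.rows / 2; w >= 1 && h >= 1; w /= 2, h /= 2) {
            cv::Mat level;
            cv::resize(base, level, cv::Size(w, h), 0, 0, cv::INTER_LANCZOS4);
            levels.push_back(level);
        }
        return levels;
    }

    // Then upload each level with glTexImage2D as above, and enable anisotropy:
    //   GLfloat maxAniso = 0.0f;
    //   glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &maxAniso);
    //   glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAX_ANISOTROPY_EXT, maxAniso);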
I assume that the original code managed textures itself because it was designed to view data sets that are too large to fit into graphics memory. This was probably a bigger problem in the past, but is still a concern.