I need to acquire depth pixels from Kinect at VGA resolution (640x480) and then resize the depth map to QVGA (320x240).
Does OpenNI provide a method for this?
I tried the following:
_depthGenerator.GetMetaData(*_depthMetaData); // THIS WORKS, I CAN DISPLAY THE DEPTH MAP AT VGA RESOLUTION
_depthMetaData->ReAdjust(320, 240); //IF I DO THIS, THE DEPTH MAP BECOMES FILLED WITH ZEROS
So I am kinda lost.
Any help?
Notes: The point is not to receive QVGA from the sensor; it is to receive at VGA and then resize.
The point is not to use OpenCV to resize the depth map, because the interpolation corrupts the depth data.
I don't understand exactly what you are trying to achieve here, but I'll give you a possible solution.
You can copy the data into another structure, for example an OpenCV Mat, then drop every even column and every even row; that gives you a downscale without interpolation, as sketched below. There is no other way to downscale without either interpolating or losing data.
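Here is a minimal sketch of that idea (not from the original post), assuming the 640x480 depth buffer is 16-bit, e.g. the pointer you get from the depth metadata:

    #include <opencv2/core/core.hpp>

    cv::Mat downscaleDepthNoInterpolation(const unsigned short* vgaDepth)
    {
        // Wrap the existing 640x480 16-bit buffer without copying it.
        cv::Mat vga(480, 640, CV_16UC1, const_cast<unsigned short*>(vgaDepth));

        // Drop even rows/columns: no interpolation, raw depth values preserved.
        cv::Mat qvga(240, 320, CV_16UC1);
        for (int y = 0; y < 240; ++y)
            for (int x = 0; x < 320; ++x)
                qvga.at<unsigned short>(y, x) = vga.at<unsigned short>(2 * y, 2 * x);

        return qvga;
    }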
I hope this helps; if not, comment on the answer and I'll try to improve it.
I have had issues finding any information on how to use a depth map/image to gain the distance to an obstacle.
TLDR: I have a depth map as an OpenCV Mat; I know it's CV_16UC1, but I don't know how to get the distances from it.
I have an Intel Realsense D415 camera, I have installed the SDK, the ROS wrapper, I have a topic with a depth map published. (/camera/depth/image_rect_raw)
Next, I've written a little program in C++ that converts the image using cv_bridge to an OpenCV image and I display it in a window, so I know it's working. (image: https://i.imgur.com/QyKWp2J.png )
Now I need to get the distance from it, which I have no idea how to do and have been unsuccessful in finding help for.
I imagine I'm only going to want to use the top half or 2/3 of the image because of the way the camera will be mounted, so the bottom third/half will only contain floor/ground.
I feel like I am missing something big and simple in order to make this work, but I literally don't even know what to do now.
If you have a depth map and you want to generate 3D points from it, you can use reprojectImageTo3D() function of OpenCV.
Before that, you need to have the disparity-to-depth mapping matrix, which can be obtained using stereoRectify() function.
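A rough sketch of that suggestion (variable names are illustrative; 'disparity' is a single-channel disparity image and 'Q' is the 4x4 disparity-to-depth matrix filled in by cv::stereoRectify):

    #include <opencv2/calib3d/calib3d.hpp>
    #include <opencv2/core/core.hpp>

    cv::Mat toPointCloud(const cv::Mat& disparity, const cv::Mat& Q)
    {
        cv::Mat points3d;  // CV_32FC3: one (X, Y, Z) triple per pixel
        cv::reprojectImageTo3D(disparity, points3d, Q, true /* handleMissingValues */);
        return points3d;   // points3d.at<cv::Vec3f>(y, x)[2] is the depth Z
    }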
I want to extract the background from a video, but I don't want to use cv::bgsegm::BackgroundSubtractorMOG or cv::BackgroundSubtractorMOG2, because those methods use frame means. Instead, I plan to use a frame-comparison method: I take the first frame as the background model and compare the pixel values of subsequent frames with the first frame's pixel values; if there is no change, or the change is less than a threshold, the pixel is a background pixel. How can I implement this using OpenCV and C++?
Your question is too vague, I think. I can only give you some hints.
First, your approach is very simplistic. That's not bad. But from my experience, it won't give great results, even if you have a lot of control over your scene. Nevertheless, I do not want to hold you back if you want to make your own experiences.
You probably want to take a look at:
Operations on Arrays in OpenCV
Basic Threshold Operations in OpenCV
Everything you need should be there. In particular, the absdiff operation and the threshold function (with binary threshold type) should be of interest.
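For example, a minimal sketch of that approach (function and variable names are just illustrative):

    #include <opencv2/core/core.hpp>
    #include <opencv2/imgproc/imgproc.hpp>

    // 'background' is the first frame, 'frame' the current one; both grayscale
    // and the same size. 'thresh' is the allowed per-pixel change.
    cv::Mat foregroundMask(const cv::Mat& background, const cv::Mat& frame, double thresh)
    {
        cv::Mat diff, mask;
        cv::absdiff(frame, background, diff);                        // |frame - background|
        cv::threshold(diff, mask, thresh, 255, cv::THRESH_BINARY);   // 255 where change > thresh
        return mask;                                                 // zero = background pixel
    }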
I'm using OpenCV 2.4.6 with C++ (and sometimes Python, but that is irrelevant). I would like to know if there is a simple way to get all the available frame sizes from a capture device.
For example, my webcam can provide 640x480, 320x240 and 160x120. Suppose that I don't know about these frame sizes a priori... Is it possible to get a vector or an iterator, or something like this that could give me these values?
In other words, I don't want to get the current frame size (which is easy to obtain) but the sizes I could set the device to.
Thanks!
When you retrieve a frame from a camera, it is the maximum size that that camera can give. If you want a smaller image, you have to specify it when you get the image, and OpenCV will resize it for you.
A normal camera has one sensor of one size, and it sends one kind of image to the computer. What OpenCV does with it thereafter is up to you to specify.
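Concretely, the kind of thing this answer describes might look like this with the 2.4 API (the device index and target size are assumptions):

    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>

    int main()
    {
        cv::VideoCapture cap(0);                        // device index 0 is assumed
        if (!cap.isOpened()) return 1;

        cap.set(CV_CAP_PROP_FRAME_WIDTH, 320);          // ask the driver for a smaller size up front...
        cap.set(CV_CAP_PROP_FRAME_HEIGHT, 240);

        cv::Mat frame, small;
        cap >> frame;                                   // check frame.cols/frame.rows for what you got
        cv::resize(frame, small, cv::Size(320, 240));   // ...or resize the delivered frame yourself
        return 0;
    }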
I'm working on an OpenGL-powered 2d engine.
I'm using stb_image to load image data so I can create OpenGL textures. I know that the UV origin for OpenGL is bottom-left and I also intend to work in that space for my screen-space 2d vertices i.e. I'm using glm::ortho( 0, width, 0, height, -1, 1 ), not inverting 0 and height.
You probably guessed it, my texturing is vertically flipped but I'm 100% sure that my UV are specified correctly.
So: is this caused by stbi_load's storage of pixel data? I'm currently loading PNG files only so I don't know if it would cause this problem if I was using another file format. Would it? (I can't test right now, I'm not at home).
I really want to keep the screen coords in the "standard" OpenGL space... I know I could just invert the orthogonal projection to fix it but I would really rather not.
I can see two sane options:
1- If this is caused by stbi_load's storage of pixel data, I could invert it at loading time. I'm a little worried about that for performance reasons, and because I'm using texture arrays (glTexImage3D) for sprite animations I would need to invert texture tiles individually, which seems painful and not a general solution.
2- I could use a texture coordinate transformation to vertically flip the UVs on the GPU (in my GLSL shaders).
A possible 3rd option would be to use glPixelStorei to describe the layout of the input data... but I can't find a way to tell it that the incoming pixels are vertically flipped.
What are your recommendations for handling my problem? I figured I can't be the only one using stbi_load + OpenGL and having that problem.
Finally, my target platforms are PC, Android and iOS :)
EDIT: I answered my own question... see below.
I know this question's pretty old, but it's one of the first results on google when trying to solve this problem, so I thought I'd offer an updated solution.
Sometime after this question was originally asked, stb_image.h added a function called "stbi_set_flip_vertically_on_load"; simply passing true to it will cause stb_image to output images the way OpenGL expects, removing the need for manual flipping or texture-coordinate flipping.
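A minimal usage sketch ("sprite.png" is just a placeholder file name):

    #define STB_IMAGE_IMPLEMENTATION
    #include "stb_image.h"

    int w, h, comp;
    stbi_set_flip_vertically_on_load(1);      // flip rows so OpenGL gets the bottom-left pixel first
    unsigned char* pixels = stbi_load("sprite.png", &w, &h, &comp, 4);
    // ... upload with glTexImage2D(..., pixels) ...
    stbi_image_free(pixels);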
Also, for those who don't know where to get the latest version, for whatever reason, you can find it at github being actively worked on:
https://github.com/nothings/stb
It's also worth noting that in stb_image's current implementation they flip the image pixel-by-pixel, which isn't exactly performant. This may change at a later date as they've already flagged it for optimisation. Edit: It appears that they've swapped to memcpy, which should be a good bit faster.
Ok, I will answer my own question... I went through the documentation for both libs (stb_image and OpenGL).
Here are the appropriate bits with reference:
glTexImage2D says the following about the data pointer parameter: "The first element corresponds to the lower left corner of the texture image. Subsequent elements progress left-to-right through the remaining texels in the lowest row of the texture image, and then in successively higher rows of the texture image. The final element corresponds to the upper right corner of the texture image." From http://www.opengl.org/sdk/docs/man/xhtml/glTexImage2D.xml
The stb_image lib says this about the loaded image pixel: "The return value from an image loader is an 'unsigned char *' which points to the pixel data. The pixel data consists of *y scanlines of *x pixels, with each pixel consisting of N interleaved 8-bit components; the first pixel pointed to is top-left-most in the image." From http://nothings.org/stb_image.c
So, the issue is related to the difference in pixel storage between the image loading lib and OpenGL. It wouldn't matter if I loaded other file formats than PNG, because stb_image lays out the pixel data the same way for every format it loads.
So I decided I'll just swap in place the pixel data returned by stb_image in my OglTextureFactory. This way, I keep my approach platform-independent. If load time becomes an issue down the road, I'll remove the flipping at load time and do something on the GPU instead.
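An in-place flip along those lines could look roughly like this (a sketch, not the actual OglTextureFactory code):

    #include <cstring>
    #include <vector>

    // 'pixels' is the buffer returned by stbi_load, 'comp' the number of
    // 8-bit components per pixel (e.g. 4 for RGBA).
    void flipVertically(unsigned char* pixels, int width, int height, int comp)
    {
        const int stride = width * comp;
        std::vector<unsigned char> row(stride);
        for (int y = 0; y < height / 2; ++y)
        {
            unsigned char* top    = pixels + y * stride;
            unsigned char* bottom = pixels + (height - 1 - y) * stride;
            std::memcpy(row.data(), top, stride);      // swap scanline y with its mirror
            std::memcpy(top, bottom, stride);
            std::memcpy(bottom, row.data(), stride);
        }
    }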
Hope this helps someone else in the future.
Yes, you should. This can be easily accomplished by simply calling this STBI function before loading the image:
stbi_set_flip_vertically_on_load(true);
Since this is a matter of opposite assumptions between image libraries in general and OpenGL, I'd say the best way is to manipulate the vertical UV coordinate. This takes minimal effort and is always relevant when loading images with any image library and passing them to OpenGL.
Either feed the tex-coords with 1.0f - uv.y when populating the vertices, OR reverse in the shader:
fcol = texture2D( tex, vec2(uv.x,1.-uv.y) );
I am trying to detect whether a particular pixel is filled or not in OpenGL, in order to implement the flood fill algorithm. I searched and found the glReadPixels function, but I don't understand how to use it or whether it can solve my problem.
The proper way is probably not to read back pixels. Instead, you should do all manipulations in a bitmap that you manage on your own. Then you request OpenGL to show this bitmap.
OpenGL is not an image manipulation library. It's a drawing API and it should not be used for tasks like this. Reading back image data is very expensive in OpenGL and should be avoided.
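As a rough illustration of that advice (the canvas size and texture handling are assumptions, not a complete program):

    #include <cstdint>
    #include <vector>

    const int W = 256, H = 256;                          // assumed canvas size
    std::vector<std::uint8_t> pixels(W * H * 4, 0);      // RGBA bitmap that you manage yourself

    // Checking or setting a pixel in your own buffer is cheap; no GPU read-back needed.
    bool isFilled(int x, int y)
    {
        return pixels[(y * W + x) * 4 + 3] != 0;
    }

    void setPixel(int x, int y, std::uint8_t r, std::uint8_t g, std::uint8_t b)
    {
        std::uint8_t* p = &pixels[(y * W + x) * 4];
        p[0] = r; p[1] = g; p[2] = b; p[3] = 255;
    }

    // After the flood fill is done, upload the buffer to an existing texture and draw it:
    // glBindTexture(GL_TEXTURE_2D, tex);
    // glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());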