Should I provide a full or partial image to glTexSubImage2D? - c++

I have a piece of GLvoid* data that contains an entire image, which is periodically updated throughout my program. I use glTexImage2D to initialize a texture on the GPU with this data.
I would like to use glTexSubImage2D to update parts of the texture as necessary. The documentation for glTexSubImage2D describes the GLvoid* pixels argument as such:
Specifies a pointer to the image data in memory.
What "image data" is this expecting? Can I provide the entire GLvoid* data, or is it expecting a buffer that only contains the data being copied?
If it expects the partial data, is there an alternative way to provide the whole buffer instead?

You can provide the entire image data with the call
glTexSubImage2D( target, level, 0, 0, W, H, format, type, pixels);
where W and H are the texture width and height. The whole texture is copied, and pixels is a W*H array. Or you can provide only the modified data with the call
glTexSubImage2D( target, level, offset_x, offset_y, w, h, format, type, pixels);
where w and h are the width and height of the modified region, so offset_x + w <= W and offset_y + h <= H. Here pixels is a w*h array.
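Edit: If you want to pass the full image and upload only a sub-rectangle of it, set the unpack pixel-store parameters first:
glPixelStorei( GL_UNPACK_ROW_LENGTH, W);
glPixelStorei( GL_UNPACK_SKIP_PIXELS, offset_x);
glPixelStorei( GL_UNPACK_SKIP_ROWS, offset_y);
glTexSubImage2D( target, level, offset_x, offset_y, w, h, format, type, pixels);
where pixels points to the full W*H image.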
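To make the pixel-store variant more concrete, here is a minimal CPU-side sketch of which source bytes those settings select. It assumes tightly interleaved pixels and that each row is padded up to GL_UNPACK_ALIGNMENT; unpackRowStride and unpackFirstByte are hypothetical helper names, not OpenGL functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of OpenGL): bytes between the starts of two
// consecutive source rows. rowLength == 0 is the GL_UNPACK_ROW_LENGTH default
// and means "rows are as long as the width passed to glTexSubImage2D".
std::size_t unpackRowStride(std::size_t rowLength, std::size_t width,
                            std::size_t bytesPerPixel, std::size_t alignment) {
    assert(alignment == 1 || alignment == 2 || alignment == 4 || alignment == 8);
    const std::size_t pixelsPerRow = (rowLength != 0) ? rowLength : width;
    const std::size_t rawBytes = pixelsPerRow * bytesPerPixel;
    // Round the row size up to the next multiple of the alignment.
    return (rawBytes + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: byte offset, relative to `pixels`, of the first pixel
// read when GL_UNPACK_SKIP_PIXELS and GL_UNPACK_SKIP_ROWS are set.
std::size_t unpackFirstByte(std::size_t rowStride, std::size_t bytesPerPixel,
                            std::size_t skipPixels, std::size_t skipRows) {
    return skipRows * rowStride + skipPixels * bytesPerPixel;
}
```

For a 640-pixel-wide RGBA image and a sub-rectangle starting at (10, 2), this gives a stride of 2560 bytes and a first byte at offset 5160 from the start of the full image.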

It's only the data to be copied. Otherwise you would have to waste a lot of memory to upload only a part of the image (sometimes the only part you currently have).
P.S. It's rather painful that strided data isn't supported everywhere: OpenGL ES 2.0 has no GL_UNPACK_ROW_LENGTH, so there, if you have a full image, you can't upload the left or right half of it without copying it to a smaller buffer.
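When strided uploads are not available, the copy to a smaller buffer could look roughly like this. It is only a sketch, assuming a tightly packed source image; extractSubImage is a hypothetical helper name, and the returned buffer is what you would pass as pixels together with the same w and h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy a w x h rectangle at (offsetX, offsetY) out of a
// tightly packed imageWidth x imageHeight image into its own tight buffer.
std::vector<std::uint8_t> extractSubImage(const std::uint8_t* image,
                                          std::size_t imageWidth, std::size_t imageHeight,
                                          std::size_t bytesPerPixel,
                                          std::size_t offsetX, std::size_t offsetY,
                                          std::size_t w, std::size_t h) {
    assert(offsetX + w <= imageWidth && offsetY + h <= imageHeight);
    std::vector<std::uint8_t> out(w * h * bytesPerPixel);
    if (out.empty()) {
        return out;  // nothing to copy
    }
    const std::size_t srcStride = imageWidth * bytesPerPixel;
    const std::size_t dstStride = w * bytesPerPixel;
    for (std::size_t row = 0; row < h; ++row) {
        // Start of this row of the rectangle inside the full image.
        const std::uint8_t* src =
            image + (offsetY + row) * srcStride + offsetX * bytesPerPixel;
        std::memcpy(out.data() + row * dstStride, src, dstStride);
    }
    return out;
}
```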

Related

Direct write to D3D texture from kernel

I am playing around with the NVDEC H.264 decoder from the NVIDIA CUDA samples. One thing I've found is that once a frame is decoded, it is converted from NV12 to a BGRA buffer allocated on the CUDA side, and then this buffer is copied to a D3D BGRA texture.
I find this inefficient in terms of memory usage, and want to convert the NV12 frame directly into the D3D texture with this kernel:
void Nv12ToBgra32(uint8_t *dpNv12, int nNv12Pitch, uint8_t *dpBgra, int nBgraPitch, int nWidth, int nHeight, int iMatrix)
So I create a D3D texture (BGRA, D3D11_USAGE_DEFAULT, D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS, D3D11_CPU_ACCESS_WRITE, 1 mipmap),
then register it and write to it on the CUDA side:
//Register
ck(cuGraphicsD3D11RegisterResource(&cuTexResource, textureResource, CU_GRAPHICS_REGISTER_FLAGS_NONE));
...
//Write output:
CUarray retArray;
ck(cuGraphicsMapResources(1, &cuTexResource, 0));
ck(cuGraphicsSubResourceGetMappedArray(&retArray, cuTexResource, 0, 0));
/*
yuvFramePtr (NV12) is uint8_t* from decoded frame,
it's stored within CUDA memory I believe
*/
Nv12ToBgra32(yuvFramePtr, w, (uint8_t*)retArray, 4 * w, w, h);
ck(cuGraphicsUnmapResources(1, &cuTexResource, 0));
Once the kernel is called, I get a crash, maybe because I am misusing the CUarray. Can anybody please clarify how to use the output of cuGraphicsSubResourceGetMappedArray to write texture memory from a CUDA kernel? (Since I only need to write raw memory, there is no need to handle clamping, filtering or value scaling.)
OK, for anyone who is struggling with the question "How to write to a D3D11 texture from a CUDA kernel", here is how:
Create the D3D texture with D3D11_BIND_UNORDERED_ACCESS.
Then, register the resource:
//ID3D11Texture2D *textureResource from D3D texture
CUgraphicsResource cuTexResource;
ck(cuGraphicsD3D11RegisterResource(&cuTexResource, textureResource, CU_GRAPHICS_REGISTER_FLAGS_NONE));
//You can also add write-discard if texture will be fully written by kernel
ck(cuGraphicsResourceSetMapFlags(cuTexResource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_WRITE_DISCARD));
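Once the texture is created and registered, we can use it as a write surface. A CUarray is an opaque handle, not a device pointer, so the kernel has to write through a surface object instead of a raw uint8_t*.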
ck(cuGraphicsMapResources(1, &cuTexResource, 0));
//Get array for first mip-map
CUarray retArray;
ck(cuGraphicsSubResourceGetMappedArray(&retArray, cuTexResource, 0, 0));
//Create surface from texture
CUsurfObject surf;
CUDA_RESOURCE_DESC surfDesc{};
surfDesc.res.array.hArray = retArray;
surfDesc.resType = CU_RESOURCE_TYPE_ARRAY;
ck(cuSurfObjectCreate(&surf, &surfDesc));
/*
Kernel declaration is:
void Nv12ToBgra32Surf(uint8_t* dpNv12, int nNv12Pitch, cudaSurfaceObject_t surf, int nBgraPitch, int nWidth, int nHeight, int iMatrix)
Surface write:
surf2Dwrite<uint>(VALUE, surf, x * sizeof(uint), y);
For BGRA surface we are writing uint, X offset is in bytes,
so multiply it with byte-size of type.
Run kernel:
*/
Nv12ToBgra32Surf(yuvFramePtr, w, /*out*/surf, 4 * w, w, h);
ck(cuGraphicsUnmapResources(1, &cuTexResource, 0));
ck(cuSurfObjectDestroy(surf));
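The two arguments of the surface write that are easy to get wrong are the packed value and the x offset in bytes. Here is a small host-side sketch of both, assuming a little-endian layout and 8-bit BGRA texels; packBgra and surfaceByteX are hypothetical helper names, not CUDA API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack one pixel so that, stored little-endian, its bytes in memory are
// B, G, R, A (the byte order of an 8-bit BGRA texel).
inline std::uint32_t packBgra(std::uint8_t b, std::uint8_t g,
                              std::uint8_t r, std::uint8_t a) {
    return static_cast<std::uint32_t>(b)
         | (static_cast<std::uint32_t>(g) << 8)
         | (static_cast<std::uint32_t>(r) << 16)
         | (static_cast<std::uint32_t>(a) << 24);
}

// Byte offset to pass as the x argument of a 2D surface write for pixel
// column x: the x coordinate is in bytes, the y coordinate is in rows.
inline std::size_t surfaceByteX(std::size_t x, std::size_t width) {
    assert(x < width);
    return x * sizeof(std::uint32_t);
}
```

The same two expressions would sit inside the kernel, with the result of packBgra as VALUE and the result of surfaceByteX as the x argument.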

draw part of image with openGL glDrawPixels

I have a function that draws an image in an OpenGL context (in this case it is used to render to a texture). It works for the whole image, but it should also be able to render only a rectangular part. Rendering a part works if the part has the same width as the image. For parts that are narrower than the image data it fails.
Here is the function (reduced to only the part for small widths, no cleanup, etc.):
void drawImage(uint32 imageWidth, uint32 imageHeight, uint8* pData,
               uint32 offX, uint32 partWidth) // (offX+partWidth<=imageWidth)
{
    uint8* p(pData);
    if (partWidth != imageWidth)
    {
        glPixelStorei(GL_PACK_ROW_LENGTH, imageWidth);
        p = calcFrom(offX, pData); // point at pixel in row
    }
    glDrawPixels(partWidth, imageHeight, GL_BGRA, GL_UNSIGNED_BYTE, p);
}
As said: if (partWidth==imageWidth) the rendering works fine. For some combinations of partWidth and imageWidth it also works, but that seems to be a very special case, mainly with very small images and some special partWidths.
I found no examples for this, but from the docs I think this should be possible somehow like that. Did I misunderstand the whole thing, or have I just overlooked a small pitfall?
Thanks,
Moritz
P.S: it's running on windows
[Edited:] P.P.S: By now I have tried to do this with a texture. If I replace glDrawPixels with glTexImage2D I have the same problem... (I could upload the whole image and render only a part, but for small parts of big pictures that might not be the best way...)
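AAArrrghh!!
GL_UNPACK_ROW_LENGTH not GL_PACK_ROW_LENGTH!!!!
(The PACK parameters only affect how GL writes into client memory, e.g. glReadPixels. glDrawPixels and glTexImage2D read from client memory, so they use the UNPACK parameters.)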
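To see why the wrong parameter only works when the widths match, here is a tiny sketch of which source pixel ends up at (col, row) of the drawn part. It ignores row alignment, which does not matter for 4-byte BGRA pixels, and sourcePixelIndex is a hypothetical helper name, not an OpenGL function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: index (in pixels, relative to the pointer handed to
// glDrawPixels) of the source pixel used for (col, row) of the drawn part.
// unpackRowLength == 0 is the default and means "rows are partWidth long".
std::size_t sourcePixelIndex(std::size_t unpackRowLength, std::size_t partWidth,
                             std::size_t col, std::size_t row) {
    assert(col < partWidth);
    const std::size_t rowLength =
        (unpackRowLength != 0) ? unpackRowLength : partWidth;
    return row * rowLength + col;
}
```

With a 4-pixel-wide image and a 2-pixel-wide part, the second row starts at index 2 while the unpack row length is still at its default (that pixel is still in the first image row, so the picture is skewed), and at index 4 once GL_UNPACK_ROW_LENGTH is set to the image width.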

Save OpenGL output to image using cross-platform libraries

I want to take a screenshot of an OpenGL window and save it to an image file of any type. The DevIL method described here gives a correct PNG. Replace ilSaveImage with ilSave and you can save the image in different formats. The SOIL method here gives a vertically flipped image. Replacing the code below
vector< unsigned char > buf( w * h * 3 );
glPixelStorei( GL_PACK_ALIGNMENT, 1 );
glReadPixels( 0, 0, w, h, GL_RGB, GL_UNSIGNED_BYTE, &buf[0] );
int err = SOIL_save_image ("img.bmp", SOIL_SAVE_TYPE_BMP, w, h, 3, &buf[0]);
with only one line creates the correct image.
int err = SOIL_save_screenshot("img.bmp",SOIL_SAVE_TYPE_BMP, 0, 0, w, h);
Q1: Are there any more convenient alternatives using other libraries?
Q2: Which one is the best way? A comparison would be appreciated, e.g. performance/compatibility.
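On the flipped image: glReadPixels returns rows starting with the bottom one, so a writer that treats the first row as the top of the picture would produce exactly this vertical flip. If you want to keep the glReadPixels path, reversing the row order before saving is one option. This is a minimal sketch for a tightly packed buffer (GL_PACK_ALIGNMENT set to 1, as in the code above); flipRows is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reverse the row order of a tightly packed image in place.
void flipRows(std::vector<unsigned char>& buf, std::size_t width,
              std::size_t height, std::size_t channels) {
    const std::size_t rowBytes = width * channels;
    assert(buf.size() == rowBytes * height);
    for (std::size_t y = 0; y < height / 2; ++y) {
        unsigned char* top = buf.data() + y * rowBytes;
        unsigned char* bottom = buf.data() + (height - 1 - y) * rowBytes;
        // Swap row y with its mirror row counted from the bottom.
        std::swap_ranges(top, top + rowBytes, bottom);
    }
}
```

It would be called as flipRows(buf, w, h, 3) between the glReadPixels and SOIL_save_image calls.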

opengl glReadPixels

I want to get the color of a pixel. The pixel is at the mouse position. I use glReadPixels, but it doesn't work:
POINT pt;
GetCursorPos(&pt);
unsigned char pixel[3];
glReadPixels(pt.x, pt.y, 1, 1, GL_RGB, GL_UNSIGNED_BYTE, pixel);
After this code, the value of pixel is: 'Ì'
Any idea?
204 is CC in hexadecimal. This value is often used to fill uninitialized memory. If you initialize pixel with zero (for example)
unsigned char pixel[3] = {0};
then 99.(9)% of the time you'll see zero after the call to glReadPixels. According to the glReadPixels documentation:
If an error is generated, no change is made to the contents of data.
That is, the data in pixel was not changed because of an error. Follow @OlegTitov's fourth piece of advice (look at what glGetError(); tells you).
Upd: If you want to get a pixel value from the main screen using only glReadPixels, and you didn't create any GLFrameBuffer, I'm not sure, but I think you'll fail. I'll repeat: I'm not sure, but I think that glReadPixels can read pixel values only from framebuffers that were previously created by GL functions.
When outputting a char to the console, the stream prints it as a character, not as a number. This is so that C-style strings print the normal way. To print the numeric value, first cast the variable to an integer type.
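A minimal sketch of that cast, using a string stream so the result can be inspected; channelToString is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>

// Format one colour channel as a decimal number rather than as a character.
std::string channelToString(unsigned char value) {
    std::ostringstream out;
    // Without the cast, operator<< would emit the raw character (e.g. 'Ì').
    out << static_cast<int>(value);
    return out.str();
}
```

The same cast works directly with std::cout, e.g. std::cout << static_cast<int>(pixel[0]).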
When reading from the screen: note that OpenGL's coordinate origin is the bottom-left corner, while window systems use the top-left corner, so you need to convert from one coordinate system to the other (pt must also be in client-area coordinates; GetCursorPos returns screen coordinates, so call ScreenToClient first):
glReadPixels(pt.x, window_height - 1 - pt.y, 1, 1, GL_RGB, GL_UNSIGNED_BYTE, pixel);
If you experience further problems, make sure that the correct framebuffer is bound as the read buffer. For window output:
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
If you still have problems, start checking your code with glGetError();
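The coordinate conversion on its own can be sketched like this. Note that the last row has index height - 1, so the flipped coordinate is height - 1 - y; toGlY is a hypothetical helper name, and the input is assumed to be already relative to the client area of the window.

```cpp
#include <cassert>

// Convert a y coordinate measured from the top of the client area (window
// system convention) into one measured from the bottom (OpenGL convention).
int toGlY(int windowHeight, int yFromTop) {
    assert(windowHeight > 0 && yFromTop >= 0 && yFromTop < windowHeight);
    return windowHeight - 1 - yFromTop;
}
```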

Stitching multiple textures/frames together in OpenGL using a Kinect

I came across the following situation:
I have a Kinect camera and I keep taking frames (but they are stored only when the user presses a key).
I am using the freenect library to retrieve the depth and the color of the frame (I am not interested in skeleton tracking or anything like that).
For a single frame I am using the glpclview example that comes with the freenect library.
After retrieving the spatial data from the Kinect sensor, the glpclview example draws the current frame like this:
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_SHORT, 0, xyz);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(3, GL_SHORT, 0, xyz);
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, gl_rgb_tex);
glTexImage2D(GL_TEXTURE_2D, 0, 3, 640, 480, 0, GL_RGB, GL_UNSIGNED_BYTE, rgb);
glPointSize(2.0f);
glDrawElements(GL_POINTS, 640*480, GL_UNSIGNED_INT, indices);
where
static unsigned int indices[480][640];
static short xyz[480][640][3];
char *rgb = 0;
short *depth = 0;
where:
rgb is the color information for the current frame
depth is the depth information for the current frame
xyz is constructed as:
xyz[i][j][0] = j
xyz[i][j][1] = i
xyz[i][j][2] = depth[i*640+j]
indices is (I guess) just an array that keeps track of the rgb/depth data, and is constructed as:
indices[i][j] = i*640+j
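Written out as code, the construction described above would look roughly like this. It is a sketch, not the glpclview source: it uses flat std::vector storage instead of the static arrays (the memory layout is the same), and Frame and buildFrame are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kWidth = 640;
constexpr int kHeight = 480;

// One frame's vertex data, laid out like short[480][640][3] and
// unsigned int[480][640].
struct Frame {
    std::vector<short> xyz;
    std::vector<unsigned int> indices;
};

Frame buildFrame(const std::vector<short>& depth) {
    assert(depth.size() == static_cast<std::size_t>(kWidth) * kHeight);
    Frame f;
    f.xyz.resize(depth.size() * 3);
    f.indices.resize(depth.size());
    for (int i = 0; i < kHeight; ++i) {
        for (int j = 0; j < kWidth; ++j) {
            const std::size_t p = static_cast<std::size_t>(i) * kWidth + j;
            f.xyz[3 * p + 0] = static_cast<short>(j);  // column
            f.xyz[3 * p + 1] = static_cast<short>(i);  // row
            f.xyz[3 * p + 2] = depth[p];               // raw depth value
            f.indices[p] = static_cast<unsigned int>(p);
        }
    }
    return f;
}
```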
So far, so good, but now I need to render more than just one frame (some of them rotated and translated by a certain angle/offset). How can I do this?
I've tried to increase the size of the arrays and keep reallocating memory for each new frame, but how can I render them?
Should I change this line to something else?
glTexImage2D(GL_TEXTURE_2D, 0, 3, 640, 480, 0, GL_RGB, GL_UNSIGNED_BYTE, rgb)
If so, to what values should I change 640 and 480, since xyz and rgb are now contiguous buffers of 640x480x(number of frames)?
To give a better idea, I am trying to get something similar to this in the end (except the robot :D ).
If someone has a better idea, or any hint on how I should approach this problem, please let me know.
It isn't as simple as allocating a bigger array.
If you want to stitch together multiple point clouds to make a bigger map, you should look into SLAM algorithms (that is what is running in the video you linked to). You can find many implementations at http://openslam.org. You might also look into an ICP (Iterative Closest Point) algorithm and KinectFusion from Microsoft (and the open source KinFu implementation from PCL).
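What those algorithms estimate is the pose (rotation and translation) of each frame relative to the map. Once a pose is known, merging a frame is just transforming its points and appending them. Here is a minimal sketch of that last step only, assuming the pose is already given; Point, Pose, rotationAboutZ and appendTransformed are hypothetical names, and this is not an implementation of SLAM or ICP.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y, z; };

struct Pose {
    double r[3][3];  // rotation matrix, row-major
    Point t;         // translation applied after the rotation
};

// Example pose: rotation by `angle` radians about the z axis, then a shift by t.
Pose rotationAboutZ(double angle, const Point& t) {
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    Pose pose = {{{c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0}}, t};
    return pose;
}

// Apply p' = R * p + t.
Point transformPoint(const Pose& pose, const Point& p) {
    return Point{
        pose.r[0][0] * p.x + pose.r[0][1] * p.y + pose.r[0][2] * p.z + pose.t.x,
        pose.r[1][0] * p.x + pose.r[1][1] * p.y + pose.r[1][2] * p.z + pose.t.y,
        pose.r[2][0] * p.x + pose.r[2][1] * p.y + pose.r[2][2] * p.z + pose.t.z};
}

// Move every point of a frame into map coordinates and append it to the map.
void appendTransformed(std::vector<Point>& map, const std::vector<Point>& frame,
                       const Pose& pose) {
    assert(&map != &frame);  // appending a vector to itself would invalidate the loop
    map.reserve(map.size() + frame.size());
    for (const Point& p : frame) {
        map.push_back(transformPoint(pose, p));
    }
}
```

The merged map can then be drawn as one large point array, with the per-point colors stored alongside instead of in a single 640x480 texture.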