I've run into this problem a couple of times. Let's say I want to display a splash screen or something similar in an OpenGL context (or DirectX for that matter; it's more of a conceptual question). I could just load a 2048x2048 texture and hope that the graphics card copes with it (most will nowadays, I suppose), but having grown up with old-school graphics cards, I have a bad conscience telling me I shouldn't use textures that large.
What is the preferred way nowadays? Is it to just cram the whole thing into video memory, to tile it, or to let the CPU do the work with glDrawPixels? Or something more elaborate?
If this is a single-frame splash image with no animation, there's no harm in using glDrawPixels. If performance is crucial and you are required to use a texture, the correct approach is to determine the maximum supported texture size at runtime, using a proxy texture.
GLint width = 0;
while ( 0 == width ) { /* use a better condition to prevent possible endless loop */
    glTexImage2D(GL_PROXY_TEXTURE_2D,
                 0, /* mip map level */
                 GL_RGBA, /* internal format */
                 desiredWidth, /* width of image */
                 desiredHeight, /* height of image */
                 0, /* texture border */
                 GL_RGBA, /* pixel data format */
                 GL_UNSIGNED_BYTE, /* pixel data type */
                 NULL /* null pointer because this is a proxy texture */
                 );
    /* the queried width will NOT be 0, if the texture format is supported */
    glGetTexLevelParameteriv(GL_PROXY_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &width);
    /* the size was rejected: halve it and try again */
    if ( 0 == width ) { desiredWidth /= 2; desiredHeight /= 2; }
}
Once you know the maximum texture size supported by the system's OpenGL driver, you have at least two options if your image doesn't fit:
Image tiling: split your image into smaller, supported chunks and draw them with multiple quads. Image tiling for something like a splash screen should not be too tricky. You can use glPixelStorei's GL_UNPACK_ROW_LENGTH parameter (together with GL_UNPACK_SKIP_PIXELS and GL_UNPACK_SKIP_ROWS) to load sections of a larger image into a texture.
Image resizing: resize your image to fit the maximum supported texture size. There's even a GLU helper function to do this for you, gluScaleImage.
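As a rough illustration of the tiling option, the sketch below computes the tile rectangles for an image that exceeds the supported texture size. The names `Tile` and `splitIntoTiles` are made up for this example, and `maxSize` is assumed to be the limit found with the proxy-texture query.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One tile of the source image, in source pixel coordinates (hypothetical type).
struct Tile {
    int x, y;          // top-left corner inside the full image
    int width, height; // tile size; tiles on the right/bottom edge may be smaller
};

// Split an imageWidth x imageHeight image into tiles no larger than
// maxSize x maxSize, row by row.
std::vector<Tile> splitIntoTiles(int imageWidth, int imageHeight, int maxSize) {
    assert(imageWidth > 0 && imageHeight > 0 && maxSize > 0);
    std::vector<Tile> tiles;
    for (int y = 0; y < imageHeight; y += maxSize) {
        for (int x = 0; x < imageWidth; x += maxSize) {
            tiles.push_back({x, y,
                             std::min(maxSize, imageWidth - x),
                             std::min(maxSize, imageHeight - y)});
        }
    }
    return tiles;
}
```

Each tile could then be uploaded as its own texture by setting GL_UNPACK_ROW_LENGTH to the full image width and GL_UNPACK_SKIP_PIXELS / GL_UNPACK_SKIP_ROWS to the tile's x / y before the upload call, and drawn as one quad at the matching screen position.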
I don't think there is a built-in OpenGL function for this, but you might find a library (or write the function yourself) to break the image down into smaller chunks and then draw those chunks to the screen (128x128 tiles or similar).
I had a problem with this using Slick2D for a Java game. You can check out their "BigImage" class for ideas.
The following code stretches a bitmap, blends it with an existing background, maintains transparent area of primary graphic and then displays the blend within a window (imgScreen). This works fine when the level of stretch is not large or when it is actually shrinking the initial bitmap. However when stretching the graphic it is very slow.
I have limited experience with C++ and this kind of graphics so perhaps there is another more efficient way to do this. The primary bitmap to be sized is always square. Any ideas are much appreciated..!
I was going to try not drawing the clipped area, but from tests it seems the initial stretch is causing the slowdown. I'm also having trouble seeing how to calculate the non-clipped area. Drawing to controls seems a waste, but it seems to be the only way to use built-in functions like StretchDraw and the alpha draw option.
std::auto_ptr<Graphics::TBitmap> bmap(new Graphics::TBitmap);
std::auto_ptr<Graphics::TBitmap> bmap1(new Graphics::TBitmap);
int s = newsize;
TRect sR = Rect(X,Y,X+s,Y+s);
TRect tR = Rect(0,0,s,s);
bmap->Canvas->StretchDraw(Rect(0, 0, s, s), Form1->Image4->Picture->Bitmap); // scale
bmap1->Canvas->CopyRect(tR, Form1->imgScreen->Canvas, sR); //background
bmap1->Canvas->Draw(0,0,bmap.get()); // combine
Form1->imgScreen->Canvas->Draw(X,Y, Form1->imgTemp->Picture->Bitmap,
It displays correctly, but as the graphic gets larger the draw rate slows down quickly.
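On calculating the non-clipped area mentioned above: one possible approach is to intersect the sprite's destination rectangle with the window rectangle. The sketch below uses a plain struct rather than the VCL TRect so that it stands alone; `RectI` and `intersect` are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>

// Integer rectangle; right and bottom are exclusive (hypothetical helper type).
struct RectI {
    int left, top, right, bottom;
};

// Store the overlap of a and b in out.
// Returns false when the rectangles do not overlap (out is then empty or inverted).
bool intersect(const RectI& a, const RectI& b, RectI& out) {
    assert(a.left <= a.right && a.top <= a.bottom);
    assert(b.left <= b.right && b.top <= b.bottom);
    out.left   = std::max(a.left, b.left);
    out.top    = std::max(a.top, b.top);
    out.right  = std::min(a.right, b.right);
    out.bottom = std::min(a.bottom, b.bottom);
    return out.left < out.right && out.top < out.bottom;
}
```

With the visible rectangle known, only the matching part of the source bitmap would need to be stretched and blended. Whether that removes the slowdown is untested here.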
I am preparing to build an Ambilight clone for my PC.
For this purpose I need a way to calculate the average color of several areas of the screen.
The fastest way I have found so far is the following:
pd3dDevice->CreateOffscreenPlainSurface(ddm.Width, ddm.Height, D3DFMT_A8R8G8B8, D3DPOOL_SCRATCH/*D3DPOOL_SYSTEMMEM*/, &pSurface, nullptr);
pd3dDevice->GetFrontBufferData(0, pSurface);
D3DLOCKED_RECT lockedRect;
pSurface->LockRect(&lockedRect, nullptr, D3DLOCK_READONLY); // map the surface so that pBits is valid
memcpy(pBits, (unsigned char*) lockedRect.pBits, dataLength);
pSurface->UnlockRect();
//calculate average over pBits
However, it involves copying the whole front buffer back to system memory, which takes 33 ms on average. Obviously 33 ms is nowhere near the speed I need for a decent update rate, so I am looking for a way to calculate the average over a region of the front buffer directly on the GPU, without copying the front buffer back to system memory.
edit: the bottleneck in the code snippet is pd3dDevice->GetFrontBufferData(0, pSurface);.
The memcpy has no visible effect on performance.
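For reference, the CPU-side averaging step ("calculate average over pBits") could look roughly like the sketch below. It assumes the D3DFMT_A8R8G8B8 layout (bytes stored as B, G, R, A in memory) and takes the row pitch as a parameter, since a locked surface's rows may be padded. `AverageColor` and `averageRegion` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Average color of a region (hypothetical helper type).
struct AverageColor {
    std::uint8_t r, g, b;
};

// bits:  first byte of the pixel data (e.g. lockedRect.pBits)
// pitch: bytes per row (e.g. lockedRect.Pitch), may exceed width * 4
// The region is [left, right) x [top, bottom) in pixels.
AverageColor averageRegion(const std::uint8_t* bits, int pitch,
                           int left, int top, int right, int bottom) {
    assert(bits != nullptr && pitch > 0);
    assert(left >= 0 && top >= 0 && right > left && bottom > top);
    std::uint64_t sumR = 0, sumG = 0, sumB = 0;
    for (int y = top; y < bottom; ++y) {
        const std::uint8_t* row = bits + static_cast<std::ptrdiff_t>(y) * pitch;
        for (int x = left; x < right; ++x) {
            const std::uint8_t* px = row + static_cast<std::ptrdiff_t>(x) * 4;
            sumB += px[0]; // A8R8G8B8 is little-endian: blue comes first
            sumG += px[1];
            sumR += px[2];
        }
    }
    const std::uint64_t count = static_cast<std::uint64_t>(right - left) *
                                static_cast<std::uint64_t>(bottom - top);
    return AverageColor{static_cast<std::uint8_t>(sumR / count),
                        static_cast<std::uint8_t>(sumG / count),
                        static_cast<std::uint8_t>(sumB / count)};
}
```

This does not address the GetFrontBufferData bottleneck. It only shows that the averaging can read the locked bits in place, one region at a time, without the extra memcpy.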
Based on user3125280's answer I cooked up a piece of code that should take the top left corner of the screen and average it. However, the result is always 0. What am I missing?
Also notice that pSurface is now in video memory and thus GetFrontBufferData is just a memcpy in video ram which is super fast.
pd3dDevice->CreateOffscreenPlainSurface(1, 1, D3DFMT_A8R8G8B8, D3DPOOL_SCRATCH, &pAvgSurface, nullptr);
pd3dDevice->CreateOffscreenPlainSurface(ddm.Width, ddm.Height, D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &pSurface, nullptr);
pd3dDevice->GetFrontBufferData(0, pSurface);
RECT r;
r.right = 100;
r.bottom = 100;
r.left = 0;
r.top = 0;
pd3dDevice->StretchRect(pSurface, &r, pAvgSurface, nullptr, D3DTEXF_LINEAR);
D3DLOCKED_RECT lockedRect;
pAvgSurface->LockRect(&lockedRect, nullptr, D3DLOCK_READONLY);
unsigned int color = -1;
memcpy((unsigned char*) &color, (unsigned char*) lockedRect.pBits, 4); //FIXME there has to be a better way than memcpy
pAvgSurface->UnlockRect();
Apparently GetFrontBufferData requires the target to reside in system memory. So I am back to square one.
According to this the following should be possible in DX11.1:
Create a Direct3D 11.1 device. (Maybe earlier works too -- I haven't tried. I'm not sure there's a reason to use a D3D10/10.1/11 device anyway.)
Find the IDXGIOutput you want to duplicate, and call DuplicateOutput() to get an IDXGIOutputDuplication interface.
Call AcquireNextFrame() to wait for a new frame to arrive.
Process the received texture.
Call ReleaseFrame().
However, due to my nonexistent knowledge of DirectX, I am having a hard time implementing it.
DuplicateOutput is not supported in operating systems older than Windows 8 :(
I did some experiments with the classical GetPixel API thinking that it may be fast enough for random sampling. Sadly it is not. GetPixel takes the same amount of time that GetFrontBufferData takes. I guess it internally calls GetFrontBufferData.
So for now I see two solutions:
* Disable Aero and use GetFrontBufferData
* Switch to windows 8
Neither of them is really good :(
This problem is actually common (apparently) in game code and the like. One interesting solution is described in "Efficient pixel shader sum of all pixels". It is particularly relevant to your situation, since you can read a larger mipmap level to get the averages of smaller segments of the display.
Get the screen into a texture
IDirect3DTexture9* texture; // needs to be created, of course
IDirect3DSurface9* dest = NULL; // to be our level0 surface of the texture
texture->GetSurfaceLevel(0, &dest);
pD3DDevice->StretchRect(pSurface, NULL, dest, NULL, D3DTEXF_LINEAR);
And then create a mipmap chain as here
// This code example assumes that m_d3dDevice is a
// valid pointer to a IDirect3DDevice9 interface
IDirect3DTexture9 * pMipMap;
m_pD3DDevice->CreateTexture(256, 256, 5, 0, D3DFMT_R8G8B8,
Of course you don't have to access the bottom mipmap (which is the average). You could access a few levels higher to get averages of sections. Also this is quick because texture mipmapping is important in games and graphics in general. Other filtering options may be available too.
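To illustrate why the smallest mip level holds the average, here is a CPU-side sketch of a mipmap chain built with a 2x2 box filter on a single-channel, power-of-two square image. It is only a model of the idea: the filter a GPU or driver actually uses for mipmap generation is not guaranteed to be a plain box filter, and `downsample` and `averageViaMipChain` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build the next mip level: each output texel is the mean of a 2x2 block.
std::vector<float> downsample(const std::vector<float>& src, int size) {
    assert(size >= 2 && size % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(size) * size);
    const int half = size / 2;
    std::vector<float> dst(static_cast<std::size_t>(half) * half);
    for (int y = 0; y < half; ++y) {
        for (int x = 0; x < half; ++x) {
            const int sx = 2 * x, sy = 2 * y;
            dst[y * half + x] = 0.25f * (src[sy * size + sx] +
                                         src[sy * size + sx + 1] +
                                         src[(sy + 1) * size + sx] +
                                         src[(sy + 1) * size + sx + 1]);
        }
    }
    return dst;
}

// Walk down the chain to the 1x1 level, which is the mean of the whole image.
float averageViaMipChain(std::vector<float> level, int size) {
    assert(size >= 1 && (size & (size - 1)) == 0); // power of two
    while (size > 1) {
        level = downsample(level, size);
        size /= 2;
    }
    return level[0];
}
```

Stopping one or two levels before the 1x1 level gives per-region averages (for example a 2x2 or 4x4 grid), which is what an Ambilight-style setup would sample.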
For the second edit, try here: textures in GPU memory are read differently and can't be locked, so you'll need to use GetRenderTargetData or something similar. This can be used to copy your StretchRect destination surface to a surface created CPU-side in the system memory pool (D3DPOOL_SYSTEMMEM). As far as I know, GPU-side textures/surfaces can't be memcpy'ed directly.
I would like to grab an OpenGL image and feed it to OpenCV for analysis (as a simulator for the OpenCV algorithms) but I am not finding much information about it, all I can find is the other way around (placing an OpenCV image inside OpenGL). Could someone explain how to do so?
I will be simulating a camera on top of a robot, so I will render a 3D environment in real time and display it in a Qt GUI for the user. I will give the user the option to use a real webcam feed or a simulated 3D scene (that changes as the robot moves), and the OpenCV algorithm will be the same for both inputs, so the user can test their code without having to use a real robot all the time.
You are probably looking for the function glReadPixels. It will download whatever is currently displayed by OpenGL to a buffer.
unsigned char* buffer = new unsigned char[width*height*3];
glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, buffer);
cv::Mat image(height, width, CV_8UC3, buffer);
cv::imshow("Show Image", image);
For OpenCV you will probably also need to flip and convert to BGR as well.
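The flip and channel swap can be sketched in plain C++ as below, assuming the buffer is tightly packed RGB with rows stored bottom-up (what glReadPixels returns when GL_PACK_ALIGNMENT is 1; the default of 4 pads each row to a multiple of 4 bytes). `toTopDownBgr` is a name invented for this example. With OpenCV, cv::flip and cv::cvtColor do the same job.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a tightly packed, bottom-up RGB image into a top-down BGR image.
std::vector<unsigned char> toTopDownBgr(const std::vector<unsigned char>& rgb,
                                        int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t rowBytes = static_cast<std::size_t>(width) * 3;
    assert(rgb.size() == rowBytes * static_cast<std::size_t>(height));
    std::vector<unsigned char> bgr(rgb.size());
    for (int y = 0; y < height; ++y) {
        // Output row y comes from input row (height - 1 - y): vertical flip.
        const unsigned char* src = rgb.data() + static_cast<std::size_t>(height - 1 - y) * rowBytes;
        unsigned char* dst = bgr.data() + static_cast<std::size_t>(y) * rowBytes;
        for (int x = 0; x < width; ++x) {
            dst[3 * x + 0] = src[3 * x + 2]; // B <- R position
            dst[3 * x + 1] = src[3 * x + 1]; // G stays
            dst[3 * x + 2] = src[3 * x + 0]; // R <- B position
        }
    }
    return bgr;
}
```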
Edit: Since just using glReadPixels is not a very efficient way to do it, here is some sample code using Framebuffers and Pixel Buffer Objects to efficiently transfer:
How to render offscreen on OpenGL?
I did this in a previous research project. There aren't many difficulties here.
What you have to do is basically:
make a texture read from OpenGL to some pre-allocated memory buffer;
apply some geometric transform (flip X and/or Y coordinate) to account for the possibly different coordinate frames between OpenGL and OpenCV. It's a detail but it helps in visualization (hint: use a texture with an F letter inside to find quickly what coordinate you need to flip!);
you can build an OpenCV cv::Mat object directly around your pre-allocated memory buffer, and then process it directly or copy it to some other matrix object and process that.
As indicated in another answer, reading the OpenGL image is a simple matter of calling the glReadPixels() function.
What you get is usually 3 or 4 channels with 8 bits per channel (RGB / RGBA), though it may depend on your actual OpenGL context.
If color is important to you, you may need (but it is not required) to convert the RGB image data to the BGR format (Blue - Green - Red). For historical reasons, this is the default color channel ordering in OpenCV.
You do this with a call to cv::cvtColor(source, dest, cv::COLOR_RGB2BGR) for example.
I needed this for my research.
I took littleimp's advice, but fixing the colors and flipping the image took valuable time to figure out.
Here is what I ended up with.
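#include <opencv2/opencv.hpp>
#include <GL/glut.h>
using namespace cv;
typedef Mat Image;
typedef struct {
    int width;
    int height;
    char* title;
    float field_of_view_angle;
    float z_near;
    float z_far;
} glutWindow;
glutWindow win;
Image glutTakeCVImage() {
    // take a picture within glut and return formatted
    // for use in openCV
    int width = win.width;
    int height = win.height;
    unsigned char* buffer = new unsigned char[width*height*3];
    glPixelStorei(GL_PACK_ALIGNMENT, 1); // rows are tightly packed in buffer
    glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, buffer);
    Image img(height, width, CV_8UC3, buffer); // wraps buffer, no copy
    Image flipped_img;
    flip(img, flipped_img, 0); // OpenGL rows run bottom-up, OpenCV rows top-down
    Image BGR_img;
    cvtColor(flipped_img,BGR_img, COLOR_RGB2BGR);
    delete[] buffer; // safe: flipped_img and BGR_img own their own data
    return BGR_img;
}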
I hope someone finds this useful.
In my OpenGL program, I'm loading a 24 BPP image with a width of 501. The GL_UNPACK_ALIGNMENT parameter is set to 4. From what I have read, this shouldn't work, because the size of each uploaded row (501*3 = 1503) is not divisible by 4. However, I see a normal texture without artifacts when displaying it.
So my code works. I want to understand why, so that I fully understand this and can keep the whole project from getting buggy.
Maybe (?) it works because I'm not just calling glTexImage2D. Instead, I first create a proper blank texture (with dimensions that are powers of two) and then upload the pixels with glTexSubImage2D.
But do you think it makes sense to write some code like this?
// w - the width of the image
// depth - the depth of the image
bool change_alignment = false;
if (depth != 4 && !is_divisible(w*depth)) { // *
    change_alignment = true;
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
}
// ... now use glTexImage2D
if (change_alignment) glPixelStorei(GL_UNPACK_ALIGNMENT, 4); // set to default
// * - of course we don't even need such a function
// but I wanted to make the code as clear as possible
Would this prevent the application from crashing or malfunctioning?
It depends on where your image data is coming from.
The Windows BMP format, for example, enforces a 4-byte row alignment. Indeed, formats like this are exactly why OpenGL has a row-alignment field: because some image formats enforce a row alignment.
So how correct it is to use a 4-byte row alignment on your data depends entirely on how your data is aligned in memory. Some image loaders will automatically align to 4 bytes. And some will not.
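To make the row-alignment rule concrete, the sketch below computes how many bytes OpenGL expects per row for byte-sized components under a given GL_UNPACK_ALIGNMENT. `alignedRowSize` is a name invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per row that OpenGL reads for `width` pixels of `bytesPerPixel`
// byte-sized components when the unpack alignment is `alignment`.
std::size_t alignedRowSize(std::size_t width, std::size_t bytesPerPixel,
                           std::size_t alignment) {
    // OpenGL only accepts 1, 2, 4 or 8 for GL_UNPACK_ALIGNMENT.
    assert(alignment == 1 || alignment == 2 || alignment == 4 || alignment == 8);
    const std::size_t raw = width * bytesPerPixel;
    // Round up to the next multiple of the alignment.
    return (raw + alignment - 1) / alignment * alignment;
}
```

For the image in the question, alignedRowSize(501, 3, 4) is 1504 while the raw row is 1503 bytes. An alignment of 4 is therefore only correct if the loader stores one padding byte after each row. If the rows are tightly packed, the alignment has to be 1.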
I've been working on some sound processing code and now I'm doing some visualizations. I finished making a spectrogram, but the way I am drawing it is too slow.
I'm using OpenGL to do 2D drawing, which has made searching for help more difficult. Also I am very new to OpenGL, so I don't know the standard way things are done.
I am storing the r,g,b values for each pixel in a large matrix.
Each time I get a small sound segment, I process it and convert it to a column of pixels. Everything is shifted to the left by 1 pixel, and the new column is put at the end.
Each time I redraw, I am looping through setting the color and drawing each pixel individually, which seems like a horribly inefficient way to do this.
Is there a better way to do this? Is there some method for simply shifting a bunch of pixels over?
There are many ways to improve your drawing speed.
The simplest would be to allocate an RGB texture and draw it with a screen-aligned textured quad.
Each time you want to draw a new line, you can use glTexSubImage2D to load a new subset of the texture, and then redraw the quad.
Are you perhaps passing a lot more data to the graphics card than you have pixels? This could happen if your FFT size is much larger than the height of the drawing area, or if the number of spectral lines is much larger than its width. If so, it's possible that the bottleneck is passing too much data across the bus. Try reducing the number of spectral lines by either averaging them or peak picking (taking the maximum in each bin over a set of consecutive lines).
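The peak-picking reduction could be sketched as follows, assuming the values to reduce are held in a std::vector<float> and the target count is the number of pixels available along that axis. `reduceByMax` is a name invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce `values` to `binCount` entries by taking the maximum of each
// group of consecutive entries (groups differ in size by at most one).
std::vector<float> reduceByMax(const std::vector<float>& values, std::size_t binCount) {
    assert(binCount > 0 && binCount <= values.size());
    std::vector<float> bins(binCount);
    for (std::size_t b = 0; b < binCount; ++b) {
        const std::size_t begin = b * values.size() / binCount;
        const std::size_t end = (b + 1) * values.size() / binCount; // exclusive
        bins[b] = *std::max_element(
            values.begin() + static_cast<std::ptrdiff_t>(begin),
            values.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return bins;
}
```

Averaging instead of peak picking only changes the body of the loop (sum the group and divide by its size). Peak picking keeps narrow spectral peaks visible, while averaging gives a smoother picture.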
I know this is an old question, but . . .
Use a circular buffer to store the pixels, and then simply call glDrawPixels twice with the appropriate offsets. Something like this untested C:
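#define SIZE_X 800
#define SIZE_Y 600
unsigned char pixels[SIZE_Y][SIZE_X][3]; /* ring buffer of columns */
int start = 0; /* index of the oldest column */
void add_line(const unsigned char line[SIZE_Y][1][3]) {
    int i,j;
    /* overwrite the oldest column; it becomes the newest one */
    for (i=0;i<SIZE_Y;++i) for (j=0;j<3;++j) pixels[i][start][j] = line[i][0][j];
    start = (start+1) % SIZE_X;
}
void draw(void) {
    /* assumes a projection in which raster coordinates are window pixels */
    int w = SIZE_X-start; /* columns from start to the end of the buffer */
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
    glPixelStorei(GL_UNPACK_ROW_LENGTH, SIZE_X); /* rows are SIZE_X pixels apart in memory */
    /* older part: columns start..SIZE_X-1, drawn at the left edge */
    glRasterPos2i(0, 0);
    glDrawPixels(w,SIZE_Y,GL_RGB,GL_UNSIGNED_BYTE,&pixels[0][start][0]);
    /* newer part: columns 0..start-1, drawn to the right of the older part */
    if (start!=0) {
        glRasterPos2i(w, 0);
        glDrawPixels(start,SIZE_Y,GL_RGB,GL_UNSIGNED_BYTE,&pixels[0][0][0]);
    }
    glPixelStorei(GL_UNPACK_ROW_LENGTH, 0); /* restore the default */
}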