Confused about the sense of GDI at all [closed] - c++

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I started learning Windows programming a few years ago, and I have always used only the "native" environment for my needs. I mean, I write code using only WinAPI, not DirectX/DirectDraw/Direct2D/etc. for graphics, and no external libraries for music or anything else, just WinAPI. Lately I have been working on the next evolutionary step of my graphics renderer. The previous algorithms were fine: they redraw only the parts of the window that actually need to be redrawn, and that works well for non-fullscreen windows. But in full-screen cases I got a very low fps.
So, a few months ago I realized I could build a brand new algorithm: instead of drawing in WM_PAINT and recreating DCs and bitmaps every time, I can run a parallel thread that redraws in an endless loop, with the DCs and bitmaps created only once. I can even avoid GDI or GDI+ functions such as Rectangle or Graphics::FillRect and write my own, faster functions instead. So I did it. And this is what I got:
62 fps at 1920x1080 with no graphical load at all.
Why?
It's just this code:
void render()
{
    COLORREF *matrix;
    matrix = re->GetMatrix();
    while (1)
    {
        Sleep(1000 / 120);
        re->Render();
        // below: fps counter; the count is read from another thread
        while (!mu.try_lock())   // busy-wait until the mutex is free
        {
        }
        frames++;
        mu.unlock();
    }
}
The re->Render function:
inline void Card::Render()
{
    //SetDIBits(hdcc, bm, 0, bi.bmiHeader.biHeight, matrix, &bi, DIB_RGB_COLORS);
    //StretchBlt(hdc, 0, 0, width, height, hdcc, 0, 0, width, height, SRCCOPY);
    // the method above, with StretchBlt or plain BitBlt, is awful
    SetDIBitsToDevice(hdc, 0, 0, width, height, 0, 0, 0, bi.bmiHeader.biHeight, matrix, &bi, DIB_RGB_COLORS); // hdc is the window (surface) DC, not a memory DC
}
So, if I understand correctly, that is the maximum that can be squeezed out of GDI. If I'm right, the question is: what is the point of GDI? Computer games were developed on Direct2D and later DirectX/OpenGL; user interfaces before NT 8 were not windowless and/or used DirectDraw. I'm confused: is it realistic to write a good software renderer without using any library, entirely by yourself?

Part of the problem is here:
Sleep(1000 / 120);
comes out to 8 ms (after integer division). But Sleep is not a very precise timing mechanism. It will sleep for at least the amount of time specified. And, with the default clock tick rate, it will sleep for at least 15.6 ms on most configurations. A frame duration of 15.6 ms is very close to 62 frames per second, so that's probably the root problem.
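If the frame rate really is being capped by the timer granularity, the multimedia timer API is one option worth knowing about. Here is a rough sketch, reusing the re->Render() call from the question; the running stop flag is hypothetical, and the 1 ms period is a system-wide setting that should be restored when you are done:
#include <windows.h>
#include <mmsystem.h>              // timeBeginPeriod / timeEndPeriod
#pragma comment(lib, "winmm.lib")

void render()
{
    // Ask the OS for 1 ms timer granularity so Sleep(8) is not rounded
    // up to the next ~15.6 ms scheduler tick.
    timeBeginPeriod(1);
    while (running)                // hypothetical stop flag instead of while (1)
    {
        Sleep(1000 / 120);         // now roughly 8 ms per iteration
        re->Render();
    }
    timeEndPeriod(1);
}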
Beyond that, you will have problems with GDI because the graphics operations are largely performed in system memory which then has to be transferred to graphics memory. At higher resolutions, it can be difficult to do this at a high frame rate, depending on the hardware in use.

I’m not sure I understand your question but here’s my best guess.
I can run a parallel thread
Not a good idea for GDI. The WinAPI can be used from multiple threads, but it's tricky to do correctly; see this article: https://msdn.microsoft.com/en-us/library/ms810439.aspx
62 fps at 1920x1080 with no graphical load at all. Why?
GDI wasn’t designed for the high-FPS use case you want from that. It was designed to redraw stuff when something changed. It was designed before modern GPUs.
On modern hardware, the way to get high FPS and/or low latency rendering is by using GPU-centric technologies. In C++, that’s Direct3D and Direct2D.
is it realistic to write a good software renderer without using any library
Sure it’s possible. Just not the way you’re trying to do it with SetDIBitsToDevice.
On modern hardware and OS, your API calls (regardless of what the API is) become commands sent to the 3D GPU. That's why newer GPU-centric APIs like D3D and D2D often deliver better performance.
If you want to implement a software renderer, that's fine; just keep in mind that you have to upload the result to a GPU texture. On modern Windows, the WinAPI does just that under the hood.
Legacy Windows (i.e. anything before Vista) didn't rely on the GPU for that. But games didn't use GDI either; they used DirectX 9, DirectDraw, etc.
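To make the "upload the result to a GPU texture" step concrete, here is a minimal sketch of what it can look like with Direct3D 11. It assumes a device context and a texture created elsewhere with D3D11_USAGE_DYNAMIC, D3D11_CPU_ACCESS_WRITE and DXGI_FORMAT_B8G8R8A8_UNORM; the function name and parameters are only illustrative:
#include <d3d11.h>
#include <cstdint>
#include <cstring>

// Copy a CPU-side 32-bit framebuffer into a dynamic D3D11 texture once per frame.
void UploadFrame(ID3D11DeviceContext* ctx, ID3D11Texture2D* tex,
                 const uint32_t* pixels, UINT width, UINT height)
{
    D3D11_MAPPED_SUBRESOURCE mapped = {};
    if (FAILED(ctx->Map(tex, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped)))
        return;

    // The GPU row pitch may be wider than width * 4, so copy row by row.
    auto* dst = static_cast<uint8_t*>(mapped.pData);
    for (UINT y = 0; y < height; ++y)
        std::memcpy(dst + y * mapped.RowPitch, pixels + y * width, width * 4u);

    ctx->Unmap(tex, 0);
}
Drawing that texture as a full-screen quad and presenting through the swap chain (IDXGISwapChain::Present) is then what gives you v-sync and GPU-side scaling essentially for free.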

Related

Fastest way to copy my own system memory RGB array into a Win32 window

While it seems like such a basic question, I am opening this thread after an extensive Stack Overflow and Google search, which helped, but not in a definitive way.
My C++ code draws into an array used as an RGB framebuffer. Hardware acceleration is not possible for this original graphics algorithm, and I want my software to generate the images directly (no hardware acceleration), also for portability reasons.
However, I would like to have hardware acceleration to copy my C++ array into the window.
In the past I have written the images generated by my code out to BMP files; now I want to display them as efficiently as possible, in real time, in the window on screen.
I wrote a Win32 program that has a window freely resizeable by the user, so far so good.
What is the best / fastest / most efficient way to (possibly v-synced) blit/copy my system-memory RGB array (I can adapt my graphics generator to any RGB format, to avoid conversions) into a window that can change size at any moment? I am handling resizing via WM messages, and I'm not asking how to resize/rescale the image array; I will do all of that myself. I mention the resizable window only to say that its size cannot be fixed after creation: more precisely, I will simply reallocate the RGB array and generate a new image when the user changes the window's size.
NOTE: the image array should be allocated by my own program, but if having a system-provided pointer (e.g. directly into video RAM) makes things much more efficient, then that's OK.
Windows 10 is the target OS, but backward compatibility down to at least Windows 7 would be better. The hardware is PCs, even low-end ones, released in roughly the last 5-10 years, so I guess they will all have GDI hardware acceleration or something like it, such as Direct2D (or whatever DirectDraw is called now).
NOTE: please do NOT propose the use of any library; I will have to deal directly with GDI calls, Direct2D, or whatever is most efficient for the task, but without using third-party libraries.
Thank you very much. I don't know much about Windows GUI coding: I've done console-mode programs only until now, plus some windows (so I am familiar with WM messages) and basic device-context graphics output. From my research (but I wanted to ask here because I'm really in doubt, and lots of the posts I've read date back to 2010), SetDIBitsToDevice should be the best solution to my problem, but if so I think I would still need some way to synchronize with v-sync to avoid, if possible, the annoying tearing/flickering.
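For reference, a minimal sketch of the SetDIBitsToDevice/StretchDIBits route mentioned in the question might look like the following (the names and the top-down 32-bit layout are assumptions; note that plain GDI gives no v-sync, so some tearing may remain):
#include <windows.h>
#include <cstdint>

// Blit a top-down 32-bit system-memory buffer into a window's client area.
// `pixels`, `width` and `height` are assumed to come from the caller's renderer.
void BlitFrame(HWND hwnd, const uint32_t* pixels, int width, int height)
{
    BITMAPINFO bmi = {};
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;   // negative height = top-down rows
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    HDC hdc = GetDC(hwnd);
    // StretchDIBits also covers the case where the window has been resized.
    StretchDIBits(hdc, 0, 0, width, height,   // destination rectangle
                  0, 0, width, height,        // source rectangle
                  pixels, &bmi, DIB_RGB_COLORS, SRCCOPY);
    ReleaseDC(hwnd, hdc);
}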

Drawing pixel by pixel in C++ [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Some years ago, I used to program for MS-DOS in assembly language. One of the things available was to tell the BIOS an x coordinate, a y coordinate and a color (expressed as an integer), then call a function, and the BIOS would do it immediately.
Of course, this is very hard work and very time-consuming, but the trade-off was that you got exactly what you wanted exactly when you wanted it.
I tried for many years to write to the macOS API, but found it either difficult or impossible, as nothing is documented at all. (What the hell is an NSNumber? Why do all the controls return a useless object?)
I don't really have any specific project in mind right now, but I would like to be able to write C++ that can draw pixels much in the same way. Maybe I'm crazy but I want that kind of control.
Until I can overcome this, I'm limited to writing programs that run in the console by printing text and scrolling up as the screen gets full.
You could try using Windows GDI:
#include <windows.h>

int main()
{
    // Draw directly onto the console window's client area.
    HWND console = GetConsoleWindow();
    HDC hdc = GetDC(console);
    for (int x = 0; x < 256; ++x)
        for (int y = 0; y < 256; ++y)
            SetPixel(hdc, x, y, RGB(127, x, y));
    ReleaseDC(console, hdc);
}
It is pretty easy to get something drawn (if this is what you are asking), as you can see from the example above.
Modern x86 operating systems no longer run in real mode.
You have several options:
Run a VM and install a real mode OS (e.g. MS-DOS).
Use a layer that emulates the real mode (e.g. DOSBox).
Use a GUI library (e.g. Qt, GTK, wxWidgets, Win32, X11) and use a canvas or a similar control where you can draw.
Use a 2D API (e.g. the 2D components of SDL, SFML, Allegro).
Use a 3D API (e.g. OpenGL, Direct3D, Vulkan, Metal; possibly exposed by SDL, SFML or Allegro if you want it portable) to stream a texture that you have filled pixel by pixel with the CPU each frame.
Write fragment shaders (either using a 3D API or, much easier, in a web app using WebGL).
If you want to learn how graphics are really done nowadays, you should go with the last 2 options.
Note that, if you liked drawing "pixel by pixel", you will probably love writing fragment shaders directly on the GPU and all the amazing effects you can achieve with them. See ShaderToy for some examples!

Is it possible to control pixels on the screen just from plain C or plain C++ without any opengl / directx hassle?

Well, I want to know... maybe others do too.
Is it possible to control each pixel separately on a screen by programming, especially C or C++?
Do you need special control over the drivers for the current screen? Are there operating systems which allow you to change pixels (for example draw a message/overlay on top of everything)?
Or does Windows maybe support this in its WinAPI?
Edit:
I am asking this question because I want to make my computer warn me when I'm gaming and my processor gets too hot. I mainly use Windows, but I have a dual-boot Ubuntu distro.
The lower you go, the more hassle you'll run into.
If you want raw pixel manipulation you might check out http://www.libsdl.org/ which helps you mitigate the hassle of creating surfaces/windows and that kind of stuff.
Linux has a few ways to get you even lower if you want (i.e. without "windows" or "xwindows" or anything of the sort, just the raw screen); look into the Linux framebuffer if you're interested in that.
Delving even lower (such as doing things with your own OS), the BIOS will let you go into certain video modes, this is what OS installers tend to use (at least they used to, some of the fancier ones don't anymore). This isn't the fastest way of doing things, but can get you into the realm of showing pixels in a few assembly instructions.
And of course if you wanted to do your own OS and take advantage of the video card (bypass the BIOS), you're then talking about writing video drivers and such, which is obviously a substantial amount of work :)
Regarding overlaying messages on top of the screen and that sort of thing: Windows does support it, so I'm sure you can do it with the WinAPI, although there are likely libraries that would make it easier. I do know you don't need to delve too deep to do it, though.
Let's look at it one bit at a time:
Is it possible to control each pixel separately on a screen by programming, especially C or C++?
Possibly. It really depends on the graphics architecture, and in many modern systems, the actual screen surface (that is "the bunch of pixels appearing on the screen") is not directly under software control - at least not from "usermode" (that is, from an application that you or I can write - you need to write driver code, and you need to co-operate sufficiently with the existing graphics driver).
It is generally accepted that drawing the data into an off-screen buffer and using a BitBlt [bit block transfer] function to copy the content onto the screen is the preferred way to do this sort of thing.
So, in reality, you probably can't manipulate each pixel ON the screen - but you may be able to appear like you do.
Do you need special control over the drivers for the current screen?
Assuming you could get direct access to the screen memory, your code will certainly have to cooperate with the driver - otherwise, who's to say that what you want to appear on the screen doesn't get overwritten by something else [e.g. you want full-screen access, and the clock updater draws the time on screen once a minute on top of what you draw, etc.].
You may be able to set the driver into a mode where you have a "hole" that allows you to access the screen memory as a big "framebuffer". I don't think there's an easy way to do this in Windows. I don't remember one from back in 2003-2005 when I wrote graphics drivers for a living.
Are there operating systems which allow you to change pixels (for example draw a message/overlay on top of everything)?
It is absolutely possible to create an overlay layer in the hardware of modern graphics cards. That's generally how video playback works - the video is played into a piece of framebuffer memory that is overlaid on top of the other graphics. You need help from the driver, and this is definitely available in the Windows API, via DirectX as far as I remember.
Or does Windows maybe support this in its WinAPI?
Probably, but to answer precisely, we need to understand better what you are looking to do.
Edit: In your particular use-case, I would have thought that making sounds or ejecting the CD/DVD drive may be a more suitable option. It can be hard to overlay something on top of the graphics drawn by a game, because games often try to use as much as possible of the available graphics resources, and you will probably have a hard time finding a way that works even for the simplest use-cases - never mind something that works across multiple categories of games using different drawing/engine/graphics libraries. I'm also not entirely sure it's anything to worry about too much, since modern CPUs are pretty tolerant of overheating: the CPU will just slow down, possibly grind to a halt, but it will not break - even if you take the heatsink off, it won't go wrong [no, I don't suggest you try this!].
Every platform supports an efficient raw pixel block transfer (a.k.a. BitBlt()), so if you really want to go down to framebuffer level you can allocate a bitmap, use pointers to set its contents directly, and then, with one line of code, efficiently flip that memory chunk into the video RAM buffer. Of course it is not as efficient as working with PCI framebuffers directly, but on the other hand this approach (BitBlt) was fast enough even in the Win95 days to port Wolfenstein 3D to a Pentium CPU WITHOUT the use of WinG.
HOWEVER, care must be taken when creating this bitmap to match its format (i.e. 16-bit RGB, 32-bit, etc.) with the actual mode the device is in; otherwise the graphics subsystem will do a lengthy recoding/dithering pass which will completely kill your speed.
So it depends on your goals. If you want a 3D game, your performance will suck with this approach. If you just want to render some shapes and don't need more than 10-15 fps, this will work without diving down to any device-driver level.
Here are a few tips for overlaying in Windows:
hdc = GetDC(0); // returns an HDC for the whole screen and is VERY fast
You can take the HDC for the screen and do a BitBlt(hdc, ..., SRCCOPY) to flip blocks of raster efficiently. There are also predefined Windows handles for the desktop, but I don't recall the exact mechanics; if you are on multiple monitors you can get an HDC for each desktop. Look at GetDesktopWindow, GetDC and the like...
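As a small illustration of that idea (anything the desktop repaints later will simply draw over it, so this is only a quick-and-dirty overlay; the text and rectangle are arbitrary):
#include <windows.h>

// Draw a warning banner directly onto the screen DC.
void FlashWarning(const wchar_t* text)
{
    HDC screen = GetDC(nullptr);      // DC for the whole screen, like GetDC(0) above
    RECT box = { 50, 50, 600, 120 };  // arbitrary position and size
    FillRect(screen, &box, (HBRUSH)GetStockObject(WHITE_BRUSH));
    SetBkMode(screen, TRANSPARENT);
    SetTextColor(screen, RGB(200, 0, 0));
    DrawTextW(screen, text, -1, &box, DT_CENTER | DT_VCENTER | DT_SINGLELINE);
    ReleaseDC(nullptr, screen);
}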

GlTexSubImage2D slow and uses 4% of CPU [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am using glTexSubImage2D to update a window that uses OpenGL.
I see that this function takes a lot of time to return, and it also uses 4% of the CPU.
Here is the code that I use:
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, (*i)->getTextureID());
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, (*i)->getWidth(), (*i)->getHeightView(),
GL_BGRA, GL_UNSIGNED_BYTE,(*i)->getBuffer());
Does anybody know of a better implementation? Something with better performance that will take less CPU?
Right now this is making my program sluggish.
There are some things you can do, though how much you can benefit from them depends on the circumstances.
First, make sure that your pixel upload format is correct for the driver's needs. You seem to have that taken care of with GL_BGRA, GL_UNSIGNED_BYTE, which is likely the driver's preferred format for GL_RGBA8 image formats.
However, if you happen to have access to OpenGL 4.3 or a driver that implements ARB_internalformat_query2, you can actually detect at runtime what the preferred upload format will be. Like this:
GLint pixelFormat, pixelType;
glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8, GL_TEXTURE_IMAGE_FORMAT, 1, &pixelFormat);
glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8, GL_TEXTURE_IMAGE_TYPE, 1, &pixelType);
Of course, this means that you will need to be able to modify your data generation method to generate data in the above format/type pair.
Once you've taken steps to appease the driver, your next possibilities are using buffer objects to store your pixel transfer data. This probably won't help overall performance, but it can reduce the CPU burden.
However, in order to take the best advantage of this, you need to be able to generate your pixel data "directly" into the buffer object's memory by mapping it. If you are able to do this, then you can probably get back some of the CPU cost of the upload. Otherwise, it may not be worthwhile.
If you do this, you should use proper buffer object streaming techniques.
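As an illustration only, a streaming upload through a single pixel buffer object can look roughly like this (the loader header, and the assumption that the target texture already exists and is bound, are mine):
#include <GL/glew.h>   // or whatever extension loader the project already uses
#include <cstring>

// Stream CPU pixel data into the currently bound GL_TEXTURE_2D via a PBO.
// `pixels` is assumed to hold width * height BGRA bytes.
void UploadViaPBO(GLuint pbo, const void* pixels, int width, int height)
{
    const GLsizeiptr size = GLsizeiptr(width) * height * 4;

    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // Orphan the previous storage so the driver need not wait for the GPU.
    glBufferData(GL_PIXEL_UNPACK_BUFFER, size, nullptr, GL_STREAM_DRAW);

    if (void* dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY))
    {
        std::memcpy(dst, pixels, size);
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
        // With a PBO bound, the data pointer argument is an offset into the buffer.
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GL_BGRA, GL_UNSIGNED_BYTE, nullptr);
    }
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
The double-buffered variant discussed below simply alternates between two such PBOs, so the copy into one buffer overlaps the GPU's read from the other.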
Double-buffering your texture may also help. That is, while you're rendering from one texture object, you're uploading to another one. This will prevent GPU stalls that wait for the prior rendering to complete. How much this helps really depends on how you're rendering.
Without knowing more about the specific circumstances of your application, there's not much more that can be said.
If your texture really is changing every frame, then you will want to use a double buffer to transport your data to the GPU. (If it's not changing every frame, then the obvious optimization is to only upload it once!)
Each frame, you upload data to one buffer and draw data from the other buffer, and you switch which buffer you use each frame. This will speed everything up because the GPU will not have to wait for the memory transfer to finish.
A tutorial on PBOs is somewhat beyond my ability to condense into an answer, but "OpenGL Pixel Buffer Objects" is a decent reference, and I would look at the "OGL Samples" repository to see how PBOs work.
However, if you can't compute a texture frame in advance, then there is no real advantage to using PBOs. Just use glTexSubImage2D.
That said, 4% of CPU might not be a problem.
You should not be changing the data of a texture every frame in order to update your screen. Textures are meant to be loaded once and rarely (if ever) changed. If you are trying to write to individual pixels on your screen, I would recommend not using OpenGL, and use something more suited to the task, like SDL.
Edit: Okay, this isn't necessarily true. See discussion below.
As I understand from this answer's comment thread, you're rendering a website on the CPU side (or the rendered image goes through the CPU), but applying OpenGL shaders to it. If so, you need a GPU-side renderer, one that renders the webpage and applies the shaders on the GPU side. That way, you'll no longer upload each frame to the GPU through the CPU, and the CPU will be free of rendering work, as it's intended to be.

Draw on DeviceContext from COLORREF[]

I have a pointer to a COLORREF buffer, something like: COLORREF* buf = new COLORREF[x*y];
A subroutine fills this buffer with color-information. Each COLORREF represents one pixel.
Now I want to draw this buffer to a device context. My current approach works, but is pretty slow (roughly 200 ms per draw, depending on the size of the image):
for (size_t i = 0; i < pixelpos; ++i)
{
    // Get X and Y coordinates from the 1-dimensional buffer.
    size_t y = i / wnd_size.cx;
    size_t x = i % wnd_size.cx;
    ::SetPixelV(hDC, x, y, buf[i]);
}
Is there a way to do this faster; all at once, not one pixel after another?
I am not really familiar with GDI. I have heard about a lot of APIs like CreateDIBitmap(), BitBlt(), HBITMAP, CImage and all that stuff, but I have no idea how to apply them. It all seems pretty complicated...
MFC is also welcome.
Any ideas?
Thanks in advance.
(Background: the subroutine I mentioned above is an OpenCL kernel - the GPU calculates a Mandelbrot image and saves it in the COLORREF buffer.)
EDIT:
Thank you all for your suggestions. The answers (and links) gave me some insight into Windows graphics programming. The performance is now acceptable (semi-real-time scrolling into the Mandelbrot set works :)).
I ended up with the following solution (MFC):
...
CDC dcMemory;
dcMemory.CreateCompatibleDC(pDC);
CBitmap mandelbrotBmp;
mandelbrotBmp.CreateBitmap(clientRect.Width(), clientRect.Height(), 1, 32, buf);
CBitmap* oldBmp = dcMemory.SelectObject(&mandelbrotBmp);
pDC->BitBlt(0, 0, clientRect.Width(), clientRect.Height(), &dcMemory, 0, 0, SRCCOPY);
dcMemory.SelectObject(oldBmp);
mandelbrotBmp.DeleteObject();
So basically CBitmap::CreateBitmap() saved me from using the raw API (which I still do not fully understand). The example in the documentation of CDC::CreateCompatibleDC was also helpful.
My Mandelbrot is now blue - using SetPixelV() it was red. But I guess that has something to do with CBitmap::CreateBitmap() interpreting my buffer, not really important.
I might try the OpenGL suggestion because it would have been the much more logical choice and I wanted to try OpenCL under Linux anyway.
Under the circumstances, I'd probably use a DIB section (which you create with CreateDIBSection). A DIB section is a bitmap that allows you to access the contents directly as an array, but still use it with all the usual GDI functions.
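To illustrate the shape of that, here is a rough sketch (error handling omitted; the commented-out fill step stands in for whatever code, e.g. the OpenCL output copy, produces the pixels):
#include <windows.h>

// Create a 32-bit DIB section whose pixels are directly writable as an
// array, then blit it to the target DC with plain GDI.
void DrawFrame(HDC targetDC, int width, int height)
{
    BITMAPINFO bmi = {};
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;        // top-down rows
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    void* bits = nullptr;
    HBITMAP dib = CreateDIBSection(targetDC, &bmi, DIB_RGB_COLORS, &bits, nullptr, 0);

    // `bits` now points at width * height 32-bit pixels that can be
    // written directly, or copied into from an existing buffer.
    // fillPixels(static_cast<unsigned*>(bits), width, height);  // hypothetical

    HDC memDC = CreateCompatibleDC(targetDC);
    HGDIOBJ old = SelectObject(memDC, dib);
    BitBlt(targetDC, 0, 0, width, height, memDC, 0, 0, SRCCOPY);
    SelectObject(memDC, old);
    DeleteDC(memDC);
    DeleteObject(dib);
}
In practice you would create the DIB section once and reuse it across frames rather than recreating it every time.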
I think that'll give you the best performance of anything based on GDI. If you need better, then @Kornel is basically correct -- you'll need to switch to something that has more direct support for hardware acceleration (DirectX or OpenGL -- though IMO, OpenGL is a much better choice for the job than DirectX).
Given that you're currently doing the calculation in OpenCL and depositing the output in a color buffer, OpenGL would be the really obvious choice. In particular, you can have OpenCL deposit the output in an OpenGL texture, then you have OpenGL draw a quad using that texture. Alternatively, since you're just putting the output on screen anyway, you could just do the calculation in an OpenGL fragment shader (or, of course, a DirectX pixel shader), so you wouldn't put the output into memory off-screen just so you can copy the result onto the screen. If memory serves, the Orange book has a Mandelbrot shader as one of its examples.
Yes, sure, that's slow. You are making a round trip through the kernel and the video device driver for each individual pixel. You make it fast by drawing to memory first and then updating the screen in one fell swoop. That takes, say, CreateDIBitmap(), CreateCompatibleDC() and BitBlt().
This isn't a good time and place for an extensive tutorial on graphics programming. It is well covered by any introductory text on GDI and/or Windows API programming. Everything you'll need to know you can find in Petzold's seminal Programming Windows.
Since you already have an array of pixels, you can directly use BitBlt to transfer it to the window's DC. See this link for a partial example:
http://msdn.microsoft.com/en-us/library/aa928058.aspx