Poor performance with DrawText on Win7 x64 - c++

I noticed in an MFC application I'm developing that while dragging the scroll bar to smoothly scroll down the document, the framerate drops to choppy levels when a block containing about a paragraph of text is on screen, but silky smooth when it's offscreen. Investigating the performance, I found the single CDC::DrawText call for the paragraph of text responsible. This is in an optimised release build.
I used QueryPerformanceCounter to get a high-resolution measurement of just the DrawText call, like this:
QueryPerformanceCounter(...);
pDC->DrawText(some_cstring, some_crect, DT_WORDBREAK);
QueryPerformanceCounter(...);
The text is unicode, lorem-ipsum style filler, 865 characters long and wraps over 7-and-a-bit lines given the rectangle and font (Segoe UI, lfHeight = -12, a standard body text size). From my measurements, that call alone takes on average 7.5 ms, with the odd peak at 21ms. (Note to keep up with a 60Hz monitor you get about 16ms to render each update.)
I tried making some changes to improve the performance:
Removing the DT_WORDBREAK improves performance to about 1ms (about 7 times faster), but given only one line of text is making it to the screen, and there were just over 7 lines with word breaking, this seems to suggest to me the bottleneck is elsewhere.
I was drawing text in transparent mode (SetBkMode(TRANSPARENT)). So I tried opaque mode with a solid background fill. No improvement.
I thought ClearType rendering might be to blame. I changed the font lfQuality from CLEARTYPE_QUALITY to NONANTIALIASED_QUALITY. It looked like crap with sharp edges and all, and no improvement.
As per a comment suggestion, I was using a CMemDC, but I got rid of it and did direct drawing. It flickered like mad, and no improvement.
This is running on a Windows 7 64-bit laptop with an Intel Core 2 Duo P8400 @ 2.26 GHz and 4 GB RAM - I don't think it counts as a slow system.
I'm calling DrawText() every time it draws and this obviously hammers the performance with such a slow function, especially if several of those text-blocks are visible at once. It's enough to make the experience feel sluggish. However, Firefox can render a page like this one in ClearType with much more text, and seems to cope just fine. What am I doing wrong? How can I get around the poor performance of an actual DrawText call?

Drawing the text at every refresh is wasteful. Use double buffering, that is, draw in an offscreen bitmap and just blit it to the screen. Then, for scrolling, just copy most of the bitmap up or down or sideways as necessary, then draw only the invalidated area (before blitting the result to the screen).
If even that turns out to be too slow, keep also the drawn text in an off-screen bitmap, and blit instead of draw.
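A minimal sketch of the copy-then-redraw idea, without any GDI calls. `BackBuffer`, `scrollDown` and `drawRow` are hypothetical names for illustration; in a real GDI app the row move would be a blit within the offscreen bitmap and `drawRow` would be your actual painting code.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an offscreen bitmap: one string per pixel row.
struct BackBuffer {
    std::vector<std::string> rows;  // rows[0] is the top of the viewport
    int firstDocRow = 0;            // document row currently shown at rows[0]
};

// Scrolls the view down by `delta` rows (0 <= delta <= buffer height).
// Rows that stay visible are moved inside the buffer; only the newly
// exposed rows at the bottom are redrawn through `drawRow`.
// Returns the number of rows that had to be redrawn.
int scrollDown(BackBuffer& buf, int delta,
               const std::function<std::string(int)>& drawRow) {
    const int height = static_cast<int>(buf.rows.size());
    assert(delta >= 0 && delta <= height);
    // Move the surviving rows up (the in-memory analogue of a blit).
    std::rotate(buf.rows.begin(), buf.rows.begin() + delta, buf.rows.end());
    buf.firstDocRow += delta;
    // Redraw only the exposed strip at the bottom.
    for (int i = height - delta; i < height; ++i)
        buf.rows[i] = drawRow(buf.firstDocRow + i);
    return delta;
}
```

Scrolling a 5-row buffer by 2 rows calls `drawRow` twice instead of five times; the expensive text drawing only runs for the strip that actually became visible.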
Cheers & hth.,

According to this German blog post, the issue has to do with support for Asian-language fonts. If you enable those in XP you get the same perf hit. In Vista/7, they are enabled by default and you can't turn them off.
EDIT: Just maybe, using a different font might help (one that does not contain Asian characters).

Users can't read text at 7 lines in 7 milliseconds, so the call itself is fast enough.
The 60 Hz refresh rate of the monitor is entirely irrelevant. You don't need to re-render the same text for every frame. The videocard will happily send the same pixels to the screen again.
So, I think you have another problem. Are you perhaps wondering about scrolling text? Please ask about the problem you really have, instead of assuming DrawText is the culprit.

In order to break the text on word breaks, DrawText needs to repeatedly try to get the width of a block of text to see if it will fit, then take the remainder and do it over. It will need to do this at every call. If your text is unchanging, this is an unnecessary overhead. As a workaround, you could measure the text yourself and insert temporary line breaks and remove the DT_WORDBREAK flag.
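A rough sketch of that workaround: wrap the text once, keep the resulting lines, and draw them without DT_WORDBREAK afterwards. `wrapOnce` and `measure` are made-up names; `measure` stands in for whatever returns the pixel width of a string with the current font (in GDI that could wrap GetTextExtentPoint32). It uses std::string and collapses whitespace for brevity, so a Unicode build would need the wide-string equivalent.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback: pixel width of a string in the selected font.
using Measure = std::function<int(const std::string&)>;

// Greedy word wrap. Returns the lines that fit within maxWidth.
// A single word wider than maxWidth is kept on a line of its own.
std::vector<std::string> wrapOnce(const std::string& text, int maxWidth,
                                  const Measure& measure) {
    std::vector<std::string> lines;
    std::istringstream words(text);
    std::string word, line;
    while (words >> word) {
        const std::string candidate = line.empty() ? word : line + " " + word;
        if (line.empty() || measure(candidate) <= maxWidth) {
            line = candidate;          // the word still fits on this line
        } else {
            lines.push_back(line);     // close the current line
            line = word;               // start the next one with this word
        }
    }
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

The result only needs recomputing when the text, the font or the rectangle width changes, so the repeated measuring no longer happens on every paint.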

Have you considered Direct2D/DirectWrite?
Anyway it should work better if you just draw the text once to its own mem dc and blit that over to whatever dc you want it painted on with each iteration.
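Something like the following could manage the "draw once, blit many times" part. It is only a sketch: `TextCache` and `RenderedText` are invented names, and `RenderedText` is a placeholder for a memory DC plus bitmap. The point is just that the expensive render runs only when the text or the width changes.

```cpp
#include <string>

// Hypothetical stand-in for a memory DC holding the rendered paragraph.
struct RenderedText {
    std::string pixels;  // placeholder for the bitmap contents
};

class TextCache {
public:
    // Returns the cached rendering, calling `render` again only if the
    // text or the wrap width differs from the previous call.
    template <typename RenderFn>
    const RenderedText& get(const std::string& text, int width, RenderFn render) {
        if (!valid_ || text != text_ || width != width_) {
            cached_ = render(text, width);  // the expensive DrawText-like step
            text_ = text;
            width_ = width;
            valid_ = true;
        }
        return cached_;
    }

private:
    RenderedText cached_;
    std::string text_;
    int width_ = 0;
    bool valid_ = false;
};
```

In a paint handler you would call `get(...)` every time and blit the result; during scrolling the inputs stay the same, so no re-rendering happens.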

Related

Is it possible to create coded animated wallpapers?

The idea is to ultimately have a program that runs in the background. This program paints pixels to the desktop wallpaper at a rate of at least 24 pixels per second.
I've already tried using certain desktop handles, but that does not give the desired result: it paints over the cursor and icons as well.
Suggestions?
This rust crate can change the wallpaper of many OS:
https://docs.rs/wallpaper/2.0.1/wallpaper/
It may not yield the performance you were talking about; in my experience it can change the background a few times per second.

Choppy scrolling of QPixmap using Qt Animation Framework

I created a QPropertyAnimation and connected it to my SonogramWidget, which scrolls a long picture vertically on animation events. The 'long picture' is composed of 100 pre-calculated QPixmap objects, each 1024x128, placed one after another vertically. They are displayed in SonogramWidget::paintEvent() with QPainter. The drawing procedure does not paint all the QPixmaps at once, only the visible ones, based on the widget height and the current vertical offset. The CPU is almost idle, because QPixmap is the fastest way to display a picture. There are no heavy calculations during scrolling, because all 100 QPixmaps are pre-calculated and stored in memory.
I see a strange effect, a pulsating movement: twice a second the entire image speeds up slightly and moves up 1..2 pixels faster than the usual motion. I get the same effect when I replace the Qt Animation Framework with a single 60 fps QTimer and scroll the image in its slot.
Video: http://www.youtube.com/watch?v=KRk_LNd7EBg#t=8 (watch from 00:08; My firefox adds more chopping to video playing itself, google chrome plays the video much better).
I see the same effect for my Linux and Windows build.
SOLUTION
I figured out the issue: the "chopping" was not a bug, it was a feature! It is a feature of integer arithmetic: sometimes the animation has to use a different step value, like: 16,16,16,16,16,16,17,16,16,16,16,16,17,....
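To illustrate how integer arithmetic produces such a pattern, here is a small sketch. The speed of 970 px/s at 60 fps is a made-up example (about 16.17 px per frame), not the actual values from the widget, and `offsetAt` / `frameDeltas` are hypothetical helpers.

```cpp
#include <cstdint>
#include <vector>

// Integer scroll offset after `frame` frames, for a speed of `pxPerSecond`
// at `fps` frames per second. The integer division truncates.
std::int64_t offsetAt(std::int64_t frame, std::int64_t pxPerSecond, std::int64_t fps) {
    return frame * pxPerSecond / fps;
}

// Per-frame step sizes (difference between consecutive integer offsets).
std::vector<int> frameDeltas(int frames, int pxPerSecond, int fps) {
    std::vector<int> deltas;
    for (int f = 1; f <= frames; ++f) {
        const std::int64_t step =
            offsetAt(f, pxPerSecond, fps) - offsetAt(f - 1, pxPerSecond, fps);
        deltas.push_back(static_cast<int>(step));
    }
    return deltas;
}
```

`frameDeltas(6, 970, 60)` gives 16, 16, 16, 16, 16, 17: the fractional part accumulates until it adds up to a whole extra pixel, which shows up as the periodic "jump".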
In the paintEvent add the following assert:
Q_ASSERT(m_animation->currentValue() == m_animatedPropertyValue);
If it triggers, then you know you must use currentValue() instead of the property value. This might be the case. Let me know.

SFML Drawing OpenGL to multiple windows extremely slow

Here is the situation:
I have 4 SFML windows, which are inside a container which I have built. The container calls independent redraw methods for each window, starting with the first and ending with the last.
If each window's drawing code contains the lines drawMyCube() OR glClear(...), then the frame rate becomes slow.
drawMyCube() just draws a cube which rotates depending on the value of an sf::Clock object.
If one window calls (either of) these functions, the frame rate is ~60fps.
If two windows call (either of) these functions, the frame rate is ~30fps.
If three windows call (either of) these functions, the frame rate is ~20fps.
Finally, if all four call (either of) these functions, the frame rate is ~15fps.
This looks like a pattern emerging, so I tried removing the functions from 3 of the windows, and calling them 10 times from one window. I was expecting the frame rate to be ~6fps, but it remained at 60.
Does anyone know why this is happening? There doesn't seem to be any effect if I remove any other functions from the window drawing methods, for example, gluLookAt() doesn't seem to slow it down.
EDIT: Frame rate limit is set to zero and vsync is false.
This sounds exactly like vertical sync. Each of your windows is waiting for vertical refresh, which is why your rate keeps getting cut in half.
I know you said that vsync is off, but it's possible that your video driver is forcing it. Check your driver settings.
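As a back-of-the-envelope model (an assumption about what the driver does, not something measured): if each window's buffer swap blocks until the next vertical refresh, one pass over n windows takes n refresh periods. `loopRate` is a made-up helper for the arithmetic.

```cpp
#include <cassert>

// Passes per second through a loop that swaps `windowsWaiting` windows in
// sequence, when every swap blocks until the next vertical refresh.
double loopRate(double refreshHz, int windowsWaiting) {
    assert(refreshHz > 0.0 && windowsWaiting >= 1);
    return refreshHz / windowsWaiting;
}
```

With a 60 Hz display that gives 60, 30, 20 and 15 for one to four windows, which matches the numbers in the question. It would also explain why ten draw calls in a single window stay at 60: there is still only one swap per pass.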

How to sync page-flips with vertical retrace in a windowed SDL application?

I'm currently writing a game of immense sophistication and cunning, that will fill you with awe and won- oh, OK, it's the 15 puzzle, and I'm just familiarising myself with SDL.
I'm running in windowed mode, and using SDL_Flip as the general-case page update, since it maps automatically to an SDL_UpdateRect of the full window in windowed mode. Not the optimum approach, but given that this is just the 15 puzzle...
Anyway, the tile moves are happening at ludicrous speed. IOW, SDL_Flip in windowed mode doesn't include any synchronisation with vertical retraces. I'm working in Windows XP ATM, but I assume this is correct behaviour for SDL and will occur on other platforms too.
Switching to using SDL_UpdateRect obviously won't change anything. Presumably, I need to implement the delay logic in my own code. But a simple clock-based timer could result in updates occurring when the window is half-drawn, causing visible distortions (I forget the technical name).
EDIT This problem is known as "tearing".
So - in a windowed mode game in SDL, how do I synchronise my page-flips with the vertical retrace?
EDIT I have seen several claims, while searching for a solution, that it is impossible to synchronise page-flips to the vertical retrace in a windowed application. On Windows, at least, this is simply false - I have written games (by which I mean things on a similar level to the 15-puzzle) that do this. I once wasted some time playing with Dark Basic and the Dark GDK - both DirectX-based and both synchronising page-flips to the vertical retrace in windowed mode.
Major Edit
It turns out I should have spent more time looking before asking. From the SDL FAQ...
http://sdl.beuc.net/sdl.wiki/FAQ_Double_Buffering_is_Tearing
That seems to imply quite strongly that synchronising with the vertical retrace isn't supported in SDL windowed-mode apps.
But...
The basic technique is possible on Windows, and I'm beginning to think SDL does it, in a sense. Just not quite certain yet.
On Windows, I said before, synchronising page-flips to vertical syncs in Windowed mode has been possible all the way back to the 16-bit days using WinG. It turns out that that's not exactly wrong, but misleading. I dug out some old source code using WinG, and there was a timer triggering the page-blits. WinG will run at ludicrous speed, just as I was surprised by SDL doing - the blit-to-screen page-flip operations don't wait for a vertical retrace.
On further investigation - when you do a blit to the screen in WinG, the blit is queued for later and the call exits. The blit is executed at the next vertical retrace, so hopefully no tearing. If you do further blits to the screen (dirty rectangles) before that retrace, they are combined. If you do loads of full-screen blits before the vertical retrace, you are rendering frames that are never displayed.
This blit-to-screen in WinG is obviously similar to the SDL_UpdateRect. SDL_UpdateRects is just an optimised way to manually combine some dirty rectangles (and be sure, perhaps, they are applied to the same frame). So maybe (on platforms where vertical retrace stuff is possible) it is being done in SDL, similarly to in WinG - no waiting, but no tearing either.
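The queue-and-combine behaviour described above can be modelled roughly like this. It is a sketch only: `PendingUpdate` is a made-up class, and merging into a single bounding rectangle is an assumption; neither WinG nor SDL is claimed to combine rectangles exactly this way.

```cpp
#include <algorithm>

struct Rect { int left, top, right, bottom; };

// Collects dirty rectangles between retraces and hands back one combined
// region when the (simulated) retrace arrives.
class PendingUpdate {
public:
    // Queue a dirty rectangle; it is merged into the pending bounding box.
    void add(const Rect& r) {
        if (!hasPending_) { pending_ = r; hasPending_ = true; return; }
        pending_.left   = std::min(pending_.left, r.left);
        pending_.top    = std::min(pending_.top, r.top);
        pending_.right  = std::max(pending_.right, r.right);
        pending_.bottom = std::max(pending_.bottom, r.bottom);
    }

    // Call once per retrace. Returns true and fills `out` if anything was
    // queued, then clears the queue.
    bool flush(Rect& out) {
        if (!hasPending_) return false;
        out = pending_;
        hasPending_ = false;
        return true;
    }

private:
    Rect pending_{0, 0, 0, 0};
    bool hasPending_ = false;
};
```

However many times `add` is called before the retrace, `flush` produces at most one blit, which is why extra full-screen blits in between would be wasted work.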
Well, I tested using a timer to trigger the frame updates, and the result (on Windows XP) is uncertain. I could get very slight and occasional tearing on my ancient laptop, but that may be no fault of SDL's - it could be that the "raster" is outrunning the blit. This is probably my fault for using SDL_Flip instead of a direct call to SDL_UpdateRect with a minimal dirty rectangle - though I was trying to get tearing in this case, to see if I could.
So I'm still uncertain, but it may be that windowed-mode SDL is as immune to tearing as it can be on those platforms that allow it. Results don't seem as bad as I imagined, even on my ancient laptop.
But - can anyone offer a definitive answer?
You can use the framerate control of SDL_gfx.
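Looking at the library's docs, the flow of your application would be like this:
#include "SDL_framerate.h"
// initialization code
FPSmanager fpsManager;            // the manager object itself, not an uninitialised pointer
SDL_initFramerate(&fpsManager);
SDL_setFramerate(&fpsManager, 60 /* desired FPS */);
// in the render loop
SDL_framerateDelay(&fpsManager);  // sleeps as needed to hold the requested rate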
Also, you may look at the source code to create your own framerate control.
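If you do roll your own, a minimal version could look like the sketch below. It uses only std::chrono and is not based on the SDL_gfx source; `FrameLimiter` is a made-up name. Note that this only caps the frame rate. It does not synchronise with the vertical retrace, so on its own it does not rule out tearing.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

class FrameLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameLimiter(int fps)
        : period_(std::chrono::duration_cast<Clock::duration>(
              std::chrono::nanoseconds(1000000000LL / fps))),
          next_(Clock::now() + period_) {
        assert(fps > 0);
    }

    // Sleeps until the current frame deadline, then schedules the next one.
    void wait() {
        std::this_thread::sleep_until(next_);
        next_ += period_;
        // If a slow frame put us behind schedule, restart the schedule from
        // now instead of racing through frames to catch up.
        const Clock::time_point now = Clock::now();
        if (next_ < now) next_ = now + period_;
    }

private:
    Clock::duration period_;    // time budget per frame
    Clock::time_point next_;    // deadline of the current frame
};
```

Create it once before the render loop and call `wait()` at the end of each iteration.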

how do I do print preview in win32 c++?

I have a drawing function that just takes an HDC.
But I need to show an EXACT scaled version of what will print.
So currently, I use
CreateCompatibleDC() with a printer HDC and
CreateCompatibleBitmap() with the printer's HDC.
I figure this way the DC will have the printer's exact width and height.
And when I select fonts into this HDC, the text will be scaled exactly as the printer would.
Unfortunately, I can't do a StretchBlt() to copy this HDC's pixels to the control's HDC since they're of different HDC types I guess.
If I create the "memory canvas" from a window HDC with same w,h as the printer's page,
the fonts come out WAY teeny since they're scaled for the screen, not page...
Should I CreateCompatibleDC() from the window's DC and
CreateCompatibleBitmap() from the printer's DC or something??
If somebody could explain the RIGHT way to do this.
(And still have something that looks EXACTLY as it would on printer)...
Well, I'd appreciate it !!
...Steve
Depending on how accurate you want to be, this can get difficult.
There are many approaches. It sounds like you're trying to draw to a printer-sized bitmap and then shrink it down. The steps to do that are:
Create a DC (or better yet, an IC--Information Context) for the printer.
Query the printer DC to find out the resolution, page size, physical offsets, etc.
Create a DC for the window/screen.
Create a compatible DC (the memory DC).
Create a compatible bitmap for the window/screen, but the size should be the pixel size of the printer page. (The problem with this approach is that this is a HUGE bitmap and it can fail.)
Select the compatible bitmap into the memory DC.
Draw to the memory DC, using the same coordinates you would use if drawing to the actual printer. (When you select fonts, make sure you scale them to the printer's logical inch, not the screen's logical inch.)
StretchBlt the memory DC to the window, which will scale down the entire image. You might want to experiment with the stretch mode to see what works best for the kind of image you're going to display.
Release all the resources.
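For the font-scaling remark in the steps above, the usual LOGFONT height formula is -MulDiv(pointSize, LOGPIXELSY, 72), evaluated with the printer's dpi rather than the screen's. A portable sketch of that arithmetic follows; `mulDivRound` and `fontHeightForDevice` are illustrative names, and `mulDivRound` only mimics MulDiv's round-to-nearest for non-negative inputs.

```cpp
#include <cassert>

// a * b / c rounded to nearest, for non-negative a and b.
int mulDivRound(int a, int b, int c) {
    assert(a >= 0 && b >= 0 && c > 0);
    return static_cast<int>((static_cast<long long>(a) * b + c / 2) / c);
}

// LOGFONT-style height for a point size on a device with the given
// vertical dpi. Negative means "character height", as in lfHeight.
int fontHeightForDevice(int pointSize, int deviceDpiY) {
    return -mulDivRound(pointSize, deviceDpiY, 72);
}
```

A 12 pt font comes out as -100 on a 600 dpi printer but -16 on a 96 dpi screen, which is why fonts selected at screen scale look tiny on a printer-sized bitmap.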
But before you head in that direction, consider the alternatives. This approach involves allocating a HUGE off-screen bitmap. This can fail on resource-poor computers. Even if it doesn't, you might be starving other apps.
The metafile approach given in another answer is a good choice for many applications. I'd start with this.
Another approach is to figure out all the sizes in some fictional high-resolution unit. For example, assume everything is in 1000ths of an inch. Then your drawing routines would scale this imaginary unit to the actual dpi used by the target device.
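A sketch of that conversion, assuming non-negative lengths and rounding to the nearest pixel (`toDevicePixels` is a hypothetical helper):

```cpp
#include <cassert>

// Converts a length in thousandths of an inch to device pixels at `dpi`,
// rounded to the nearest pixel.
int toDevicePixels(int thousandths, int dpi) {
    assert(thousandths >= 0 && dpi > 0);
    return static_cast<int>((static_cast<long long>(thousandths) * dpi + 500) / 1000);
}
```

An 8.5 inch page width (8500 units) maps to 5100 pixels at 600 dpi and 816 pixels at 96 dpi, so the same drawing routine can target either device by passing a different dpi.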
The problem with this last approach (and possibly the metafile one) is that GDI fonts don't scale perfectly linearly. The widths of individual characters are tweaked depending on the target resolution. On a high-resolution device (like a 300+ dpi laser printer), this tweaking is minimal. But on a 96-dpi screen, the tweaks can add up to a significant error over the length of a line. So text in your preview window might appear out-of-proportion (typically wider) than it does on the printed page.
Thus the hardcore approach is to measure text in the printer context, and measure again in the screen context, and adjust for the discrepancy. For example (using made-up numbers), you might measure the width of some text in the printer context, and it comes out to 900 printer pixels. Suppose the ratio of printer pixels to screen pixels is 3:1. You'd expect the same text on the screen to be 300 screen pixels wide. But you measure in the screen context and you get a value like 325 screen pixels. When you draw to the screen, you'll have to somehow make the text 25 pixels narrower. You can ram the characters closer together, or choose a slightly smaller font and then stretch them out.
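One way to "ram the characters closer together" is to rescale the per-character advances so they add up to the expected width. The sketch below uses the made-up numbers from above; `expectedScreenWidth` and `fitAdvances` are hypothetical helpers, and feeding the result to something like the lpDx spacing array of ExtTextOut is a suggestion, not tested code.

```cpp
#include <cassert>
#include <vector>

// Width the text should occupy on screen, given its printer width and the
// printer-to-screen pixel ratio (rounded to nearest).
int expectedScreenWidth(int printerWidth, int printerPerScreenPixel) {
    assert(printerPerScreenPixel > 0);
    return (printerWidth + printerPerScreenPixel / 2) / printerPerScreenPixel;
}

// Rescales the measured screen advances so that they sum exactly to
// targetWidth. Rounding error is carried forward rather than accumulated.
std::vector<int> fitAdvances(const std::vector<int>& screenAdvances, int targetWidth) {
    long long total = 0;
    for (int a : screenAdvances) total += a;
    assert(total > 0);
    std::vector<int> fitted;
    long long running = 0;  // unscaled width consumed so far
    int placed = 0;         // scaled width emitted so far
    for (int a : screenAdvances) {
        running += a;
        const int edge = static_cast<int>((running * targetWidth + total / 2) / total);
        fitted.push_back(edge - placed);
        placed = edge;
    }
    return fitted;
}
```

For 900 printer pixels at 3:1 the target is 300; thirteen characters measured at 25 screen pixels each (325 in total) get advances of 23 or 24 pixels that sum to exactly 300.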
The hardcore approach involves more complexity. You might, for example, try to detect font substitutions made by the printer driver and match them as closely as you can with the available screen fonts.
I've had good luck with a hybrid of the big-bitmap and the hardcore approaches. Instead of making a giant bitmap for the whole page, I make one large enough for a line of text. Then I draw at printer size to the offscreen bitmap and StretchBlt it down to screen size. This eliminates dealing with the size discrepancy at a slight degradation of font quality. It's suitable for actual print preview, but you wouldn't want to build a WYSIWYG editor like that. The one-line bitmap is small enough to make this practical.
The good news is only text is hard. All other drawing is a simple scaling of coordinates and sizes.
I've not used GDI+ much, but I think it did away with non-linear font scaling. So if you're using GDI+, you should just have to scale your coordinates. The drawback is that I don't think the font quality on GDI+ is as good.
And finally, if you're a native app on Vista or later, make sure you've marked your process as "DPI-aware". Otherwise, if the user is on a high-DPI screen, Windows will lie to you and claim that the resolution is only 96 dpi and then do a fuzzy up-scaling of whatever you draw. This degrades the visual quality and can make debugging your print preview even more complicated. Since so many programs don't adapt well to higher DPI screens, Microsoft added "high DPI scaling" by default starting in Vista.
Edited to Add
Another caveat: If you select an HFONT into the memory DC with the printer-sized bitmap, it's possible that you get a different font than what you would get when selecting that same HFONT into the actual printer DC. That's because some printer drivers will substitute common fonts with in-memory ones. For example, some PostScript printers will substitute an internal PostScript font for certain common TrueType fonts.
You can first select the HFONT into the printer IC, then use GDI functions like GetTextFace, GetTextMetrics, and maybe GetOutlineTextMetrics to find out about the actual font selected. Then you can create a new LOGFONT to try to more closely match what the printer would use, turn that into an HFONT, and select that into your memory DC. This is the mark of a really good implementation.
Another Edit
I've recently written new code that uses enhanced meta files, and that works really well, at least for TrueType and OpenType fonts when there's no font substitution. This eliminates all the work I described above trying to create a screen font that is a scaled match for the printer font. You can just run through your normal printing code and print to an enhanced meta file DC as though it's the printer DC.
One thing that might be worth trying is to create an enhanced metafile DC, draw to it as normal and then scale this metafile using printer metrics. This is the approach used by the WTL BmpView sample - I don't know how accurate this will be but it might be worth looking at (it should be easy to port the relevant classes to Win32 but WTL is a great replacement for Win32 programming so might be worth utilizing.)
Well it won't look the same because you have a higher resolution in the printer DC, so you'll have to write a conversion function of sorts. I'd go with the method that you got to work but the text was too small and just multiply every position/font size by the printer window width and divide by the source window width.