I noticed a large amount of page faults in my Qt application. I reproduced it by resizing a docking widget (with a widget tree of dozens of widgets underneath) for 2 seconds and traced that operation using AQTime. I get 2000 page faults for this operation. Why is that?
Using Qt 4.5.3 on Windows XP 32 bit
UPDATE: They're soft page faults
UPDATE2: I created a ui in Qt Designer with 1 combobox with 2 items in it. If I preview this, I get 200 page faults each time I click the combobox to select one of these items.
Parents
Code Type Routine Name Faults Faults with Children Hit Count
x86 qt_memfill_template<unsigned int,unsigned int> 2416 2416 5160
x86 qt_memfill<unsigned int> 2416 2416 5160
x86 qt_rectfill<unsigned int> 0 2416 5160
x86 qt_rectfill_template<unsigned int> 0 2416 63
x86 qt_rectfill_quint32 3 2419 63
x86 fillRect_normalized 1 2420 63
x86 QRasterPaintEngine::fillRect 3 2423 63
x86 QRasterPaintEngine::fillRect 1 2424 63
x86 QPainter::fillRect 1 2427 63
x86 fillRegion 0 2427 15
x86 QWidgetPrivate::paintBackground 2 2430 12
x86 QWidgetPrivate::drawWidget 0 2430 12
x86 QWidgetBackingStore::sync 2 2596 12
x86 QWidgetPrivate::syncBackingStore 4 2610 12
x86 QETWidget::translateConfigEvent 0 2479 6
x86 QtWndProc 0 2495 12
Most likely, Qt allocated a new bitmap to hold the widget's appearance, and the system satisfied this request by allocating new pages to the process. Upon the first write to these pages, a soft page fault occurs, and the actual pages are mapped into the process address space. This could potentially be avoided by caching the bitmap between repaint calls; however, when resizing, the size of the bitmap needed will change, and so this optimization no longer applies; the bitmap must be reallocated (causing soft page faults) each time the dimensions change.
Is this actually having a performance impact, though?
Related
A WRITEBLK command fails when the item reaches 2GB in size (item is truncated to 2147483647 bytes).
Using cat I was able to create an item larger than 2GB in the same directory, but opening it in UV gave a corrupt (negative) value for STATUS<4> (Number of bytes available to read).
uv 11.1.4
64bit Linux on a VM
64BIT_FILES = 1
You can make the universe files 32 or 64 bit (regardless of the OS). So you can do a FILEINFO call to see if the file is actually 64bit (even if the account is 64bit).
My guess is that there is an File system limitation on the file size. in the Rocket UniVerse documentation (page 927) it says:
If the device runs out of disk
space, WRITEBLK takes the ELSE clause and returns –4 to the STATUS
function.
Generally only 32 bit systems would be the hard limit on 2 GB, but maybe there is some kind of 32 bit process running in our 64 bit virtual machine that is producing the same effect. See here for a few leads: https://unix.stackexchange.com/questions/274380/file-size-limit
I have set up simple tests with Parallel Boost Graph Library (PBGL), which I have never used before, and observed entirely unexpected behaviour I would like to explain.
My steps were as follows:
Dump test data in METIS format (a kind of social graph with 50 mln vertices and 100 mln edges);
Build modified PBGL example from graph_parallel\example\dijkstra_shortest_paths.cpp
Example was slightly extended to proceed with Eager, Crauser and delta-stepping algorithms.
Note: building of the example required some obscure workaround about the MUTABLE_QUEUE define in crauser_et_al_shortest_paths.hpp (example code is in fact incompatible with the new mutable_queue)
int lookahead = 1;
delta_stepping_shortest_paths(g, start, dummy_property_map(), get(vertex_distance, g), get(edge_weight, g), lookahead);
dijkstra_shortest_paths(g, start, distance_map(get(vertex_distance, g)).lookahead(lookahead));
dijkstra_shortest_paths(g, start, distance_map(get(vertex_distance, g)));
Run
mpiexec -n 1 mytest.exe mydata.me
mpiexec -n 2 mytest.exe mydata.me
mpiexec -n 4 mytest.exe mydata.me
mpiexec -n 8 mytest.exe mydata.me
The observed behaviour:
-n 1:
mem usage: 35 GB in 1 running process, which utilizes exactly 1 device thread (processor load 12.5%)
delta stepping time: about 1 min 20 s
eager time: about 2 min
crauser time: about 3 min 20 s.
-n 2:
crash in the stage of data load.
-n 4:
mem usage: 40+ Gb in roughly equal parts in 4 running processes, each of which utilizes exactly 1 device thread
calculation times are unchanged in the margins of observation error.
-n 8:
mem usage: 44+ Gb in roughly equal parts in 8 running processes, each of which utilizes exactly 1 device thread
calculation times are unchanged in the margins of observation error.
So, except the unapropriate memory usage and very low total performance the only changes I observe when more MPI processes are running are slightly increased total memory consumption and linear rise of processor load.
The fact that initial graph is somehow partitioned between processes (probably by the vertices number ranges) is nevertheless evident.
What is wrong with this test (and, probably, my idea of MPI usage in whole)?
My enviromnent:
- one Win 10 PC with 64 Gb and 8 kernels;
- MS MPI 10.0.12498.5;
- MSVC 2017, toolset 141;
- boost 1.71
N.B. See original example code here.
why GlobalMemoryStatusEx() gives huge total virtual memory.Does it take into account all the page files that can be created?
System details:
Windows 8.1, 64 bit Process, x64 Processor
int main()
{
MEMORYSTATUSEX mex;
mex.dwLength = sizeof (mex);
GlobalMemoryStatusEx(&mex);
std::cout<<mex.ullTotalVirtual<<" "<<mex.ullAvailVirtual;
}
140737488224256 140737478111232
EDIT:
I got same result on Windows 10.I am interested in knowing how this 127 TB figure comes up.Why does the system not take into account that i don't have 127 tb space on my disk?
A 32 bit process on (x64 system) shows only 2gb which is the accessible address limit of a 32 bit process for user mode.Why does it not take into account page files in case of 32 bit process?
Yes. From MSDN:
You can use the GlobalMemoryStatusEx() to determine how much memory your application can allocate without severely impacting other applications.
How I can get physical ram installed to my computer using c++ in Windows?
I mean not only capacity parametrs which can GlobalMemoryStatusEx(), but also number of used memory slots, type of memory (like DDR1/DDR2/DDR3), type of slot (DIMM/SO-DIMM) and clock rate of memory bus.
Am I need to use SMBIOS? Or have been any another way to get this info?
On my machine, most of the information you request is available through WMI. Take a look at the Win32_PhysicalMemory and related classes.
For example, the output of wmic memorychip on my machine is:
C:\>wmic memorychip
Attributes BankLabel Capacity Caption ConfiguredClockSpeed ConfiguredVoltage CreationClassName DataWidth Description DeviceLocator FormFactor HotSwappable InstallDate InterleaveDataDepth InterleavePosition Manufacturer MaxVoltage MemoryType MinVoltage Model Name OtherIdentifyingInfo PartNumber PositionInRow PoweredOn Removable Replaceable SerialNumber SKU SMBIOSMemoryType Speed Status Tag TotalWidth TypeDetail Version
2 BANK 0 17179869184 Physical Memory 2133 1200 Win32_PhysicalMemory 64 Physical Memory ChannelA-DIMM0 12 Samsung 0 0 0 Physical Memory M471A2K43BB1-CPB 15741117 26 2133 Physical Memory 0 64 128
2 BANK 2 17179869184 Physical Memory 2133 1200 Win32_PhysicalMemory 64 Physical Memory ChannelB-DIMM0 12 Samsung 0 0 0 Physical Memory M471A2K43BB1-CPB 21251413 26 2133 Physical Memory 2 64 128
As noted in the link above, FormFactor 12 is SODIMM.
Notably missing are the voltages (which you didn't ask for, but are usually of interest) and the MemoryType, the documentation of which is outdated on MSDN, while the recent SMBIOS docs from DMTF include values in the enum for DDR4. etc.
Therefore, you would probably have to resort to looking at the SMBIOS tables more or less by hand. See: How to get memory information (RAM type, e.g. DDR,DDR2,DDR3?) with WMI/C++
I recently faced flash overflow problem. After doing some optimization in code, I saved some flash memory and executed software successfully. I want to how much flash memory is saved through my changes. Please let me know how can I check for used flash / available flash memory. Also I want to how much flash is utilized by particular function/file.
Below mentioned are some info about my developing environment.
- Avr microcontroller with 64 k ram and 512 K flash.
- Using freeRtos.
- Using GNU C++ compiler.
- Using AVRATJTAGEICE for programming and Debugging.
Please let me know the solution.
Regards,
Jagadeep.
GCC's size program is what you're looking for.
size can be passed the full compiled .elf file. It will, by default, output something like this:
$ size linked-file.elf
text data bss dec hex filename
11228 112 1488 12828 321c linked-file.elf
This is saying:
There are 11228 bytes in the .text "section" of this file. This is generally for functions.
There are 112 bytes of initialized data: global variables in the program with initial values.
There are 1488 bytes of uninitialized data: global variables without initial values.
dec is simply the sum of the previous 3 values: 11228 + 112 + 1488 = 12828.
hex is simply the hexadecimal representation of the dec value: 0x321c == 12828.
For embedded systems, generally dec needs to be smaller than the flash size of your target device (or the available space on the device).
It is generally sufficient to simply watch the dec or text outputs of GCC's size command to monitor the size of your compiled code over time. A large jump in size often indicates a poorly implemented new feature or constexpr that are not getting compiled away. (Don't forget function-sections and data-sections).
Note: For AVR's, you'll want to use avr-size for checking the linked size of AVR .elf files. avr-size takes an extra argument of the target chip and will automatically calculate the percentage of used flash for your chosen chip.
GCC's size also works directly on intermediate object files.
This is particularly useful if you want to check the compiled size of functions.
You should see something like this excerpt:
$ size -A main.cpp.o
main.cpp.o :
section size addr
.group 8 0
.group 8 0
.text 0 0
.data 0 0
.bss 0 0
.text._Z8sendByteh 8 0
.text._ZN3XMC5IOpin7setModeENS0_4ModeE 64 0
.text._ZN7NamSpac6OptionIN5Clock4TimeEEmmEi 76 0
.text.Default_Handler 24 0
.text.HardFault_Handler 16 0
.text.SVC_Handler 16 0
.text.PendSV_Handler 16 0
.text.SysTick_Handler 28 0
.text._Z5errorPKc 8 0
.text._ZN7NamSpac5Motor2goEi 368 0
.text._ZN7NamSpac5Motor3getEv 12 0
.rodata.cst1 1 0
.text.startup.main 632 0
.text._ZN7NamSpac7Program3runEv 380 0
.text._ZN7NamSpac8Position4tickEv 24 0
.text.startup._GLOBAL__sub_I__ZN7NamSpac7displayE 292 0
.init_array 4 0
.bss._ZN5Debug9formatterE 4 0
.rodata._ZL10dispDigits 8 0
.bss.position 4 0
.bss.motorState 4 0
.bss.count 4 0
.rodata._ZL9diameters 20 0
.bss._ZN7NamSpac8diameterE 16 0
.bss._ZN5Debug3pinE 12 0
.bss._ZN7NamSpac7displayE 24 0
.rodata.str1.4 153 0
.rodata._ZL12dispSegments 32 0
.bss._ZL16diametersDisplay 10 0
.bss.loadAggregate 4 0
.bss.startCount 4 0
.bss._ZL15runtimesDisplay 10 0
.bss._ZN7NamSpac7runtimeE 16 0
.bss.startTime 4 0
.rodata._ZL8runtimes 20 0
.comment 111 0
.ARM.attributes 49 0
Total 2494
Please let me know the solution.
Sorry, there's no the solution! You've gotta getting through what's linked to your final ELF, and decide if it was linked by intend, or unwanted default.
Please let me know how can I check for used flash / available flash memory.
That primarily depends on your actual target hardware platform, so you have to manage to get your .text section fitting in there.
Also I want to how much flash is utilized by particular function/file.
The nm tool of the GCC binutils provides detailed information about any (global) symbol found in an ELF file and the space it occupies in it's associated section. You'll just need to grep the results for particular functions/classes/namespaces (best demangled!) to accumulate section type and symbol filtered outputs for analysis.
That's the approach, I've been using for a little tool called nmalyzr. Sorry to say, as it stands on the GIT repo, its not really working as intended (I've got working versions, that aren't pushed back).
In general, it's a good strategy to chase for code that has #include <iostream> statements (no matter if std::cout or alike are used or not, static instances are provided!), or unwanted newlib/libstdc++ bindings as for e.g. default exception handling.
Use size command from binutils on the generated elf file. As you seem to use an AVR chip, use avr-size.
To get the size of functions, use nm command from binutils (avr-nm on AVR chips).