Slow or delayed loading of my application - c++

Question:
My question is: what will be the impact on my application's memory footprint or performance if I replace functions like foo1 below (which I have in my code) with foo2? This function is called frequently in the application.
#include <memory>
#define SIZE 5000
void foo1()
{
    double data[SIZE];   // SIZE doubles on the stack for every call
    // ....
}
void foo2()
{
    std::unique_ptr<double[]> data(new double[SIZE]);   // same array, allocated on the heap
    // ....
}
Context:
My MFC application loads very slowly on an embedded device running Windows 7 since new features/modules were implemented. The same application loads fast on a PC. At least one difference, and the one I suspect is the cause, is that the embedded unit has very little RAM, just 768 MB.
I debugged it to find out where the delay occurs and recorded time stamps within the application during loading. What I discovered was interesting. When I double-click the exe, it takes about a minute to record the first time stamp, and after that it runs fast, so all the delay is right there.
My theory is that Windows takes all this time to set up the environment for the exe, and once that is done, it runs fast. The reason I suspect this is that there are a lot of big structures declared on the stack in the application, to the point that, with the new features, I had to move some of them to the heap to get rid of stack overflow errors even on the PC.
What do you think is the cause of the slow, or more accurately delayed, loading of the executable on a low-RAM machine? Do you think it will be fixed if I move all of the big structures from the stack to the heap?
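For the per-call part of the question, one rough way to compare the two variants is simply to time them. The following is only a sketch under assumptions: time_calls is a hypothetical helper, and the loops that fill the array are made up so the compiler has something it cannot discard. The numbers depend heavily on compiler, allocator and optimization level, and the sketch measures call cost only; it says nothing about load time.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

constexpr std::size_t SIZE = 5000;

// Stack version: the whole array lives in the function's stack frame,
// so no allocator call is made.
double foo1()
{
    double data[SIZE];
    for (std::size_t i = 0; i < SIZE; ++i) data[i] = static_cast<double>(i);
    return data[SIZE - 1];
}

// Heap version: one new[]/delete[] pair per call, only a pointer on the stack.
double foo2()
{
    std::unique_ptr<double[]> data(new double[SIZE]);
    for (std::size_t i = 0; i < SIZE; ++i) data[i] = static_cast<double>(i);
    return data[SIZE - 1];
}

// Hypothetical helper: average wall-clock microseconds per call of f.
// The results are accumulated into sink so the calls are not optimized away.
template <typename F>
double time_calls(F f, int calls, double& sink)
{
    assert(calls > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < calls; ++i) sink += f();
    const std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / calls;
}
```

Calling time_calls(foo1, 100000, sink) and time_calls(foo2, 100000, sink) from main and printing both results gives a first impression of the difference on the target device.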

There are not a lot of things that take a minute in modern day computing. Not on a machine with an embedded version of Windows either. Not the processor, not the RAM, not the disk.
Except one: networking, which is still based on assumptions that were last valid in the 1980s. TCP/IP has taken over as the only protocol in common use, but it has a flaw: there is no reasonable way to discover how long a connection attempt might take. So connection timeouts are based on absolute worst-case conditions, like trying to hook up to a machine half-way around the world, connected with a modem, that needs to spin up the drum to load the program.
The minimum timeout on Windows is hard-baked at 45 seconds. And a connection attempt that runs into that timeout is certainly not unlikely on an embedded machine. You might have hooked it up to a network to get it initialized, but it isn't connected anymore, or the machine you copied from might no longer be powered up.
Chase it down by first looking for a disconnected disk drive, which is very common. Next, use SysInternals utilities like TcpView to look for network activity, such as an attempt to connect to a CRL server. Use Process Explorer to find out where the program is stuck. Mark Russinovich's blog is an excellent demonstration of his troubleshooting strategies using these tools. Good luck with it.

Related

Retrieving disk read/write max speed (programmatically)

I am in the process of creating a C++ application that measures disk usage. I've been able to retrieve current disk usage (read and write speeds) by reading /proc/diskstats at regular intervals.
I would now like to be able to display this usage as a percentage (I find it is more user-friendly than raw numbers, which can be hard to interpret). Therefore, does anyone know of a method for retrieving maximum (or nominal) disk I/O speed programmatically on Linux (API call, reading a file, etc)?
I am aware of various answers about measuring disk speeds (e.g. https://askubuntu.com/questions/87035/how-to-check-hard-disk-performance), but all of them work by testing. I would like to avoid such methods, as they take some time to run and entail heavy disk I/O while running (thus potentially degrading the performance of other running applications).
In the early days of the IBM PC era there was a great DOS utility, I forgot its name, that measured the speed of the computer (maybe Speedtest? whatever). There was a bar about two thirds of the way down the screen that represented the speed of the CPU. If you had a 4.0 MHz (not GHz!) CPU, the bar occupied 10% of the screen.
2-3 years later, '386 computers arrived, and the speed indicator bar outgrew not just its line but the screen, and it looked crappy.
So there is no such thing as 100% disk speed, CPU speed, etc.
The best you can do: if your program runs for a while, you can remember the highest value seen so far and treat it as 100%. You could also save that value to a tmp file.
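A minimal sketch of that idea, assuming the samples are throughput values in whatever unit the program already computes from /proc/diskstats. PeakScaler is a hypothetical name, and saving the peak to a file is left out:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: scales each sample against the highest value seen so far.
class PeakScaler {
public:
    // initial_peak can be a value restored from a previous run (0 if there is none).
    explicit PeakScaler(double initial_peak = 0.0) : peak_(initial_peak)
    {
        assert(initial_peak >= 0.0);
    }

    // Returns the sample as a percentage (0..100) of the running peak.
    // A sample above the old peak becomes the new 100%.
    double percent(double sample)
    {
        peak_ = std::max(peak_, sample);
        return peak_ > 0.0 ? 100.0 * sample / peak_ : 0.0;
    }

    double peak() const { return peak_; }

private:
    double peak_;
};
```

The percentage is only meaningful once the disk has been busy at least once. Until then, early samples will look closer to 100% than they really are.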

Windows pauses my process after a short time if memory consumption is high

I am using a 3d-lattice to update two fields in time, using OpenCL kernels for the update rule, and a C++ host program, and run my program under Windows 64bit with 8GB RAM. The application is built using VS2017.
My problem is: no matter whether I use my graphics card or the CPU for the computation, the application is paused by Windows after a brief time (about 15 min), and I have to press a key in the open console to wake it up, after which it continues running but stops outputting status information to the console (which it should do).
This happens only when I use a lot of memory, i.e. compute on a big lattice with at least 3 GB of allocated memory. With less memory consumption, the program runs just fine for as long as I need it to.
Of course, I would like to be able to run my simulations without having to watch my PC all the time. I already tried increasing the priority of the process, which did not help.
Is there a way to tell Windows to leave my processes running?

C++, OpenCV, & Kinect: Processing speed goes down

I use C++ (Visual Studio 2015) and OpenCV (ver 3.2.0) to process data sent from a Kinect v1. My C++ program has no problem when I start debugging it for the first time. After I stop debugging and start debugging again, however, it gets very slow.
I suspect that the program closes without releasing some memory (i.e., a memory leak). I am aware that I would need to use delete to release memory if I used new. But I didn't use new in the C++ program (nor did I use malloc(), which is the equivalent of new in C programs).
For OpenCV, I use the destroyAllWindows function at the end of the program. For Kinect v1, I also use the NuiShutdown(), Release(), and CloseHandle() functions at the end of the program.
Is there anything else I need to do to release the memory (e.g., releasing memory associated with Mat in OpenCV)? Or is something else causing the decrease in processing speed?
I'd appreciate your help. Thanks.
After the first run, disconnect the Kinect, then reconnect it and try a second run.
If all goes well now, the problem is most likely a stuck thread. Device access is usually handled by separate threads, and especially with USB they can get stuck (in case of an error or a sync problem between what the host accesses and what the device expects) until you disconnect the device (I am not sure which Kinect driver you are using, but the JUNGO version, which NuiShutdown() suggests, has this problem). You can also check Task Manager before disconnecting to see whether any stuck processes are left over from the first run.
To remedy this you need to find out what you are doing wrong during access. It could be:
wrong USB port: use the back-side ports, not the front slots.
invalid USB transfer request: the device is always waiting for a specific set of commands or a stream and keeps waiting until it receives it, so it blocks everything else. Using unsupported commands, or reading at the wrong times or with the wrong packet sizes, can cause this.
USB communication out of sync: the PC host can time out if you do not have enough CPU power while a critical operation is being processed (or have too many apps open in the background).
This can also be caused by a wrong gfx driver, as I suspect you are using rendering. Intel HD graphics can generate such problems with ease, especially on notebooks. Try to disable any rendering in your app, or at least limit rendering to OpenGL 1.0, to see whether the speed stays the same between runs. If the gfx driver is the cause, the whole desktop usually flickers or parts of apps are not repainted, and animations are sometimes sluggish.
Another problem might be the debugger. If all is well without it, then the debugger is the problem and you cannot solve it. Debugging while accessing I/O can cause sync and timeout problems, especially with USB.
To check for memory leaks you can simply see how much free memory you have before the 1st run and compare it with the values after the 1st, 2nd, 3rd... runs. If the value keeps dropping, something is stuck somewhere. After the app closes, all the memory belonging to it is freed by the OS, so even a forgotten delete does not matter unless some thread is still running.
Some libUSB-based USB drivers I have encountered also had a problem with handle leaks. But that behaves differently: everything runs fine until there are no free handles left. After that the OS is non-functional; you cannot open any window, app, anything, until some app is closed.
[Edit1] Front USB slots
Front slots are usually connected to the motherboard with a relatively long cable (usually flat and not very well shielded), so they are more susceptible to noise. The cable is also usually located near the HDD and above high-frequency parts of the motherboard, which induce noise into the USB feed. All this degrades the quality of the USB signal, causing a much higher rejection rate and hence lowering the sync capability and the overall usable bandwidth.
By comparison, the back-side USB ports have no cables; they are connected directly on the PCB with short, well-shielded paths, so the connection quality is much better.
So if you use a device that demands high bandwidth or tight synchronization, the front ports are a bad choice.

Windows based C++ application consumes more CPU over time

We have a C++-based multi-threaded application on Windows that captures network packets in real time using the WinPCAP library and then processes these packets to monitor the network. This application is intended to run 24x7. Our application easily consumes 7-8 GB of RAM.
Issue that we are observing:
Let's say the application is monitoring 100 Mbps of network traffic and consumes 60% CPU. We have observed that when the application keeps running for a longer duration, like a day or two, its CPU consumption increases to around 70-80%, even though it is still processing 100 Mbps of traffic (doing the same amount of work).
We tried to debug this issue down to the thread level using Process Explorer and noticed that the packet-capturing threads start consuming more CPU over time. The issue is not resolved even by restarting the application. Only a machine restart solves the problem.
We have observed that this issue is easily reproducible on Windows Server 2012 R2 during overnight runs. On Windows 7 the issue also happens, but only over a few days.
Any idea what might be causing this?
Thanks in Advance
What about memory allocation? Because you are using a lot of memory, it could be a memory fragmentation problem: if you frequently allocate and reallocate buffers, the allocator has to spend more and more processor time finding free space.
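If fragmentation does turn out to be involved, a common mitigation is to recycle fixed-size buffers instead of freeing and reallocating one per packet. A minimal sketch, assuming all packet buffers can share one size: BufferPool is a hypothetical class, not part of WinPcap, and it is not thread-safe as written, so a multi-threaded capture application would need a mutex around acquire and release.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical fixed-size buffer pool: buffers are handed back and reused
// rather than being freed and reallocated for every packet.
class BufferPool {
public:
    explicit BufferPool(std::size_t buffer_size) : buffer_size_(buffer_size)
    {
        assert(buffer_size > 0);
    }

    // Hands out a recycled buffer if one is available, otherwise allocates a new one.
    std::vector<unsigned char> acquire()
    {
        if (free_.empty()) {
            ++allocations_;
            return std::vector<unsigned char>(buffer_size_);
        }
        std::vector<unsigned char> buf = std::move(free_.back());
        free_.pop_back();
        return buf;
    }

    // Returns a buffer to the pool so its storage can be reused later.
    void release(std::vector<unsigned char> buf)
    {
        buf.resize(buffer_size_);
        free_.push_back(std::move(buf));
    }

    // Number of buffers that had to be freshly allocated so far.
    std::size_t allocations() const { return allocations_; }

private:
    std::size_t buffer_size_;
    std::size_t allocations_ = 0;
    std::vector<std::vector<unsigned char>> free_;
};
```

Watching allocations() over a long run shows whether the pool reaches a steady state or keeps growing.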
I finally found the reason for the above behavior: it was the winpcap code that was causing it. After replacing that code, we no longer observed this behavior.

Running background services on a PocketPC

I've recently bought myself a new cellphone, running Windows Mobile 6.1 Professional. And of course I am currently looking into doing some coding for it, on a hobby basis. My plan is to have a service running as a DLL, loaded by Services.exe. This needs to gather some data and do some processing at regular intervals (every 5-10 minutes).
Since I need to run this at regular intervals, it is a bit of a problem for me that the system typically goes to sleep (suspend) after a short period of inactivity by the user.
I have been reading all the documentation I could find on MSDN, and MSDN blogs about this subject, and it seems to me, that there are three possible solutions to this problem:
1. Keep the system in an "Always On" state by calling SystemIdleTimerReset periodically. This seems a bit excessive, and is therefore out of the question.
2. Have the system periodically woken up with CeRunAppAtTime, and enter the unattended state to do my processing.
3. Use the unattended state instead of going into a full suspend. This would be transparent to the user, but the system would never go to sleep.
The second approach seems to be preferred. However, it would require an executable to be called by the system on wake-up, with the only task of notifying my service that it should commence processing. This seems a bit unnecessary, and I would like to avoid this extra executable. I could of course move all my processing into this extra executable, but I would like to use some of the facilities provided when running as a service, and also not have a program pop up (even if it's in the background) whenever processing starts.
At first glance, the third approach seems to have the same basic problem as the first. However, I have read on some of the MSDN blogs that it might actually be possible to reduce battery consumption with this approach, compared with going in and out of suspend mode often. (The argument for this was that the WM platform by nature has very low battery consumption when the system is idle, and that going in and out of suspend requires quite a bit of processing.)
So I guess my questions are as follows:
Which approach would you recommend in my situation? With respect to keeping a minimum battery consumption, and a nice clean implementation.
In the case of approach number two, is it possible to eliminate the need for a notifying executable? Either through alternative API functions, or existing generic applications on the platform?
In the case of approach number three, do you know of any information/statistics relevant to the claim, that it is possible to extend the battery lifetime when using unattended mode over going into suspend. E.g. how often do you need to pull the system out of suspend, before unattended mode is to be preferred.
Implementation specific (bonus) question: Is it necessary to regularly call SystemIdleTimerReset to stay in unattended mode?
And finally, if you think I have prematurely eliminated approach number one, please tell me why.
Please include in your response whether you base your response on knowledge, or are merely guessing (the latter is also very welcome!).
Please leave a comment, if you think I need to clarify any parts of this question.
CeRunAppAtTime is a much-misunderstood API (largely because of the terrible name). It doesn't have to run an app. It can simply set a named system event (see the description of the pwszAppName parameter in the MSDN docs). If you care to know when it has fired (to let your app put the device to sleep again when it's done processing), simply have a worker thread that is doing a WaitForSingleObject on that same named event.
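The shape of that worker thread can be sketched in portable C++. This is a stand-in, not the Windows CE API: the Event class below only mimics an auto-reset event with a mutex and condition variable, and run_worker and the done event are hypothetical names. On the device, wake.wait() would be the WaitForSingleObject call on the named event that CeRunAppAtTime sets.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Portable stand-in for an auto-reset event (NOT the Windows CE API).
class Event {
public:
    // Signals the event and wakes one waiter.
    void set()
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;
        }
        cv_.notify_one();
    }

    // Blocks until set() has been called, then resets the event.
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return signaled_; });
        signaled_ = false;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Hypothetical worker loop: waits for `rounds` wake-ups and returns how many
// processing passes it ran.
int run_worker(Event& wake, Event& done, int rounds)
{
    assert(rounds >= 0);
    int processed = 0;
    for (int i = 0; i < rounds; ++i) {
        wake.wait();   // on the device: WaitForSingleObject on the named event
        ++processed;   // gather data and do the periodic processing here
        done.set();    // report that this pass has finished
    }
    return processed;
}
```

In a service, run_worker would run on its own thread (for example a std::thread), and whatever waits on done could then let the device go back to sleep.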
Unattended state is often used for devices that need to keep an app running continuously (like an MP3 player) but conserve power by shutting down the backlight (probably the single most power consuming subsystem).
Obviously unattended mode uses significantly more power than suspend, because in suspend the only power draw is for RAM self-refresh. In unattended mode the processor is still powered and running (and several peripherals may be too, depending on how the OEM defined their unattended mode).
SystemIdleTimerReset simply prevents the power manager from putting the device into low-power mode due to inactivity. This mode, whether suspended, unattended, flight or other, is defined by the OEM. Use it sparingly, because when you do, it impacts the power consumption of the device. Doing it in unattended mode is especially problematic from a user perspective, because they might think the device is off (it looks that way) but now their battery life has gone south.
I had a whole long post detailing how you shouldn't expect to be able to get acceptable battery life because WM is not designed to support what you're trying to do, but -- you could signal your service on wakeup, do your processing, then use the methods in this post to put the device back to sleep immediately. You should be able to keep the ratio of on-time-to-sleep-time very low this way -- but as you say, I'm only guessing.
See also:
Power-Efficient Apps (MSDN)
Power To The People (Developers 1, Developers 2, Devices)
Power-Efficient WM Apps (blog post)