I have a process that is blocked on a socket. When input becomes available on the socket, the process decodes it and most of the time does nothing but update an in-memory structure. Periodically the input is such that more complex analysis is triggered, ultimately resulting in an outgoing message on another connection. I would like to minimize the latency in this latter case, i.e. minimize the time between receiving and sending. What I have noticed is that the latency numbers are 2x worse when the time between interesting events increases. What could this be attributed to, and how could I improve on it? I have tried to reserve a CPU for my process, but I haven't seen much of an improvement.
You should try to "nice" the process to a negative value. I don't know the Linux scheduler in detail, but the usual policy is to shrink a process's time slice (sometimes called a quantum) when it fails to use the slice up, and to grow it when it does. This is called a multilevel feedback policy. In your case, getting a bunch of quickly handled events probably leaves the process with a very short time slice. When a "significant" event occurs, it has to work its way back up to a longer slice through several context switches. Making the nice value negative enough (i.e., the priority high enough) is likely to give it whatever time slice it needs.
Unfortunately "negative niceness" requires superuser privilege in most systems.
In our project, we have scheduled jobs which send shipment requests for orders every 60 seconds. There must be exactly one request per order. Some jobs are delayed (they take around 70 seconds instead), which results in a request being sent twice for the same order, simply because the previous job was delayed and a new one has already started. How can I ensure that only one request is sent per order, no matter what the delay is?
My assumptions so far:
Add a flag to the database and look it up before processing a request for an order (we use DynamoDB)
Temporarily store the result in a cache (even something like 10 minutes would do, since delayed jobs usually don't take longer than 1.5 minutes, so it'd be a safe assumption)
Temporarily store it in a message broker (similar to caching). We already use SQS and SNS in our project. Would it be appropriate to store messages about orders that were already processed there? Are message brokers ever used for scheduled jobs to ensure they don't duplicate each other?
Increase the interval between jobs to 2 minutes. Even though delays are no longer than 1.5 minutes in total now, that would not guarantee protection against longer delays in the future. However, this solution would be simple enough
What do you think? What would be a good solution in this case, in terms of simple implementation, fast performance and preventing duplicates?
So, if you want to make your operation idempotent by using de-duplication logic, then you should ask the following questions to narrow down the possible options:
In the worst case, how many times would you receive the exact same request?
In the worst case, how much time would pass between the first and last duplicate requests?
In the worst case, how many requests would need to be evaluated at nearly the same time during peak hours?
Which storage system allows me to use a point query instead of a scan?
Which storage system has the lowest write overhead for capturing the "I have seen this" flag?
...
Depending on your answers, you can determine whether a given storage system is suitable for your needs or not.
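To make that concrete, here is a minimal sketch of the check-and-claim pattern behind the "I have seen this" flag. The KeyValueStore interface and sendShipmentRequest are made-up names for illustration; with DynamoDB, putIfAbsent would map to a conditional PutItem using attribute_not_exists on the key.

#include <string>

void sendShipmentRequest(const std::string& orderId);   // assumed to exist elsewhere

struct KeyValueStore {
    // Must be atomic: returns true only for the first caller that writes this key.
    virtual bool putIfAbsent(const std::string& key, const std::string& value) = 0;
    virtual ~KeyValueStore() = default;
};

void processOrder(KeyValueStore& dedupStore, const std::string& orderId) {
    // Claim the order first. A separate read-then-write check would race when
    // two delayed jobs overlap; an atomic conditional write does not.
    if (!dedupStore.putIfAbsent("shipment-request:" + orderId, "sent")) {
        return;   // another job already sent (or is sending) this request
    }
    sendShipmentRequest(orderId);
}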
I have a while loop that executes a program, with a sleep every so often. The while loop is meant to simulate a real-time program that executes at a certain frequency. The current logic calculates a number of cycles to execute per sleep to achieve a desired frequency. This has proven to be inaccurate. I think a timer would be a better implementation, but due to the complexity of the refactor I am trying to stick with a while-loop solution. I am looking for advice on a scheme that may more tightly achieve a desired frequency of execution in a while loop. Pseudo-code below:
MaxCounts = DELAY_TIME_SEC / DESIRED_FREQUENCY;   // cycles to execute per sleep
counts = 0;
while (running)
{
    DoProgram();
    counts++;
    if (counts > MaxCounts)
    {
        Sleep(DELAY_TIME_SEC);
        counts = 0;
    }
}
You cannot reliably schedule an operation to occur at specific times on a non-realtime OS.
As C++ runs on non-realtime OS's, it cannot provide what cannot be provided.
The amount of error you are willing to accept, in both typical and extreme cases, will matter. If you want something running every minute or so, and you don't want drift on the level of days, you can just set up a starting time, then do math to determine when the nth event should happen.
Then do a wait for the nth time.
This fixes "cumulative drift" issues, so over 24 hours you get 1440+/-1 events with 1 minute between them. The time between the events will vary and not be 60 seconds exactly, but over the day it will work out.
If your issue is timing at the millisecond level, and you are OK with a loaded system sometimes screwing up, you can sleep and aim for a time half a second (or whatever margin makes it reliable enough for you) before the next event. Then busy-wait until the target time arrives. You may also have to tweak process/thread priority; be careful, as this can easily break things badly if you set the priority too high.
Combining the two can work as well.
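A rough sketch of the combined approach (assuming C++11 <chrono> and <thread>; the 10 ms period and 1 ms spin margin are placeholders):

#include <atomic>
#include <chrono>
#include <thread>

void DoProgram();   // the work step from the question's pseudo-code

void runAtFixedRate(std::atomic<bool>& running) {
    using clock = std::chrono::steady_clock;
    const auto period = std::chrono::milliseconds(10);   // desired period (placeholder)
    const auto margin = std::chrono::milliseconds(1);    // wake early, then spin

    const auto start = clock::now();
    long long n = 0;

    while (running) {
        DoProgram();
        ++n;
        // Schedule the nth iteration against the original start time so that
        // per-iteration error does not accumulate into long-term drift.
        const auto target = start + n * period;
        std::this_thread::sleep_until(target - margin);  // coarse OS sleep
        while (clock::now() < target) {
            // busy-wait for the last stretch
        }
    }
}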
When code is waiting for some condition whose delay time is not deterministic, it looks like many people choose to use exponential backoff, i.e. wait N seconds, check whether the condition is satisfied; if not, wait 2N seconds, check the condition, and so on. What is the benefit of this over checking at a constant or linearly increasing interval?
Exponential back-off is useful in cases where simultaneous attempts to do something will interfere with each other such that none succeed. In such cases, having devices randomly attempt an operation in a window which is too small will result in most attempts failing and having to be retried. Only once the window has grown large enough will attempts have any significant likelihood of success.
If one knew in advance that 16 devices would be wanting to communicate, one could select the size of window that would be optimal for that level of loading. In practice, though, the number of competing devices is generally unknown. The advantage of an exponential back-off where the window size doubles on each retry is that regardless of the number of competing entities:
The window size where most operations succeed will generally be within a factor of two of the smallest window size where most operations would succeed,
Most of the operations which fail at that window size will succeed on the next attempt (since most of the earlier operations will have succeeded, that will leave less than half of them competing for a window which is twice as big), and
The total time required for all attempts will end up only being about twice what was required for the last one.
If, instead of doubling each time, the window were simply increased by a constant amount, then the time spent retrying an operation until the window reached a usable size would be proportional to the square of whatever window size was required. While the final window size might be smaller than would have been used with exponential back-off, the total cost of all the attempts would be much greater.
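As an illustration, here is a minimal sketch of a doubling window with a random wait inside it; tryOperation, the initial 50 ms window, the 30 s cap, and the 10-attempt limit are all stand-ins:

#include <algorithm>
#include <chrono>
#include <random>
#include <thread>

bool tryOperation();   // assumed: returns true on success, false on collision/failure

bool retryWithBackoff() {
    std::mt19937 rng{std::random_device{}()};
    std::chrono::milliseconds window(50);                 // initial window (placeholder)
    const std::chrono::milliseconds maxWindow(30000);     // cap on the doubling

    for (int attempt = 0; attempt < 10; ++attempt) {      // retry limit (placeholder)
        if (tryOperation())
            return true;
        // Retry at a random point inside the current window, so competing
        // clients spread out instead of colliding again at the same instant.
        std::uniform_int_distribution<long long> pick(0, window.count());
        std::this_thread::sleep_for(std::chrono::milliseconds(pick(rng)));
        window = std::min<std::chrono::milliseconds>(window * 2, maxWindow);
    }
    return false;
}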
This is the behavior of TCP congestion control. If the network is extremely congested, effectively no traffic gets through. If every node waits for a constant time before checking, the traffic just for checking will continue to clog the network, and the congestion never resolves. Similarly for a linear increasing time between checks, it may take a long time before the congestion resolves.
Assuming you are referring to testing a condition before performing an action:
Exponential backoff is beneficial when the cost of testing the condition is comparable to the cost of performing the action (such as in network congestion).
If the cost of testing the condition is much smaller (or negligible), then a linear or constant wait can work better, provided the time it takes for the condition to change is negligible as well.
For example, if your condition is a complex (slow) query against a database, and the action is an update of the same database, then every check of the condition will negatively impact database performance, and at some point, without exponential backoff, checking the condition from multiple actors could be enough to use up all the database's resources.
But if the condition is just a lightweight memory check (e.g. a critical section), and the action is still a database update (at best tens of thousands of times slower than the check), and if the condition is flipped in negligible time at the very start of the action (by entering the critical section), then a constant or linear backoff would be fine. Actually, under this particular scenario an exponential backoff would be detrimental, as it would introduce delays under low load and is more likely to result in time-outs under high load (even when the processing bandwidth is sufficient).
So to summarize, exponential backoff is a hammer: it works great for nails, not so much for screws :)
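For contrast, here is a minimal sketch of the cheap-check case from the previous answer, using a std::mutex try_lock as the lightweight condition and a constant wait (doDatabaseUpdate and the 1 ms pause are placeholders):

#include <chrono>
#include <mutex>
#include <thread>

void doDatabaseUpdate();   // the expensive action (assumed)

void updateWhenFree(std::mutex& sectionGuard) {
    // Constant back-off: the check costs almost nothing compared to the action,
    // and the lock is normally released quickly, so there is no need to grow the wait.
    while (!sectionGuard.try_lock()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    std::lock_guard<std::mutex> release(sectionGuard, std::adopt_lock);
    doDatabaseUpdate();
}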
I'm trying to get a good definition of real-time, near real-time, and batch. I am not talking about sync and async, although to me those are different dimensions. Here is what I'm thinking:
Real-time is sync web services or async web services.
Near real-time could be JMS or messaging systems, or most event-driven systems.
Batch to me is more of a timed system that processes work when it wakes up.
Give examples of each and feel free to fix my assumptions.
https://stackoverflow.com/tags/real-time/info
Real-Time
Real-time means that the time of an activity's completion is part of its functional correctness. For example, the sqrt() function's correctness is something like
The sqrt() function is implemented correctly if, for all x >= 0, sqrt(x) = y implies y^2 == x.
In this setting, the time it takes to execute the sqrt() procedure is not part of its functional correctness. A faster algorithm may be better in some qualitative sense, but no more or less correct.
Suppose we have a mythical function called sqrtrt(), a real-time version of square root. Imagine, for instance, we need to compute the square root of velocity in order to properly execute the next brake application in an anti-lock braking system. In this setting, we might say instead:
The sqrtrt() function is implemented correctly if, for all x >= 0, sqrtrt(x) = y implies y^2 == x, and sqrtrt() returns a result in <= 275 microseconds.
In this case, the time constraint is not merely a performance parameter. If sqrtrt() fails to complete in 275 microseconds, you may be late applying the brakes, triggering either a skid or reduced braking efficiency, possibly resulting in an accident. The time constraint is part of the functional correctness of the routine. Lift this up a few layers, and you get a real-time system as one (at least partially) composed of activities that have timeliness as part of their functional correctness conditions.
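To make "the deadline is part of correctness" concrete, here is a toy sketch in which a late result is reported as a failure rather than just a slow success. The 275-microsecond budget comes from the example above; everything else is illustrative, and a real hard-real-time system would rely on worst-case execution-time analysis rather than a runtime check like this:

#include <chrono>
#include <cmath>
#include <optional>

// Returns no value if the deadline is missed: a late answer counts as incorrect.
std::optional<double> sqrtrt(double x) {
    const auto deadline = std::chrono::microseconds(275);
    const auto begin = std::chrono::steady_clock::now();

    const double y = std::sqrt(x);                 // the actual computation

    if (std::chrono::steady_clock::now() - begin > deadline) {
        return std::nullopt;                       // missed the time constraint
    }
    return y;
}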
Near Real-Time
A near real-time system is one in which activities' completion times, responsiveness, or perceived latency measured against wall-clock time are important aspects of system quality. The canonical example is a stock ticker system -- you want to get quotes reasonably quickly after the price changes. For most of us non-high-speed traders, what this means is that the perceived delay between data being available and our seeing it is negligible.
The difference between "real-time" and "near real-time" is both a difference in precision and magnitude. Real-time systems have time constraints that range from microseconds to hours, but those time constraints tend to be fairly precise. Near-real-time usually implies a narrower range of magnitudes -- within human perception tolerances -- but typically aren't articulated precisely.
I would claim that near-real-time systems could be called real-time systems, but that their time constraints are merely probabilistic:
The stock price will be displayed to the user within 500 ms of its change at the exchange, with probability p > 0.75.
Batch
Batch operations are those which are perceived to be large blocks of computing tasks with only macroscopic, human- or process-induced deadlines. The specific context of computation is typically not important, and a batch computation is usually a self-contained computational task. Real-time and near-real-time tasks are often strongly coupled to the physical world, and their time constraints emerge from demands from physical/real-world interactions. Batch operations, by contrast, could be computed at any time and at any place; their outputs are solely defined by the inputs provided when the batch is defined.
Original Post
I would say that real-time means that the time (rather than merely the correct output) to complete an operation is part of its correctness.
Near real-time is weasel words for wanting the same thing as real-time but not wanting to go to the discipline/effort/cost to guarantee it.
Batch is "near real-time" where you are even more tolerant of long response times.
Often these terms are used (badly, IMHO) to distinguish among human perceptions of latency/performance. People think real-time is real-fast, e.g., milliseconds or something. Near real-time is often seconds or milliseconds. Batch is a latency of seconds, minutes, hours, or even days. But I think those aren't particularly useful distinctions. If you care about timeliness, there are disciplines to help you get that.
I'm curious for feedback myself on this. Real-time and batch are well defined and covered by others (though be warned that they are terms-of-art with very specific technical meanings in some contexts). However, "near real-time" seems a lot fuzzier to me.
I favor (and have been using) "near real-time" to describe a signal-processing system which can 'keep up' on average, but lags sometimes. Think of a system processing events which only happen sporadically... Assuming it has sufficient buffering capacity and the time it takes to process an event is less than the average time between events, it can keep up.
In a signal processing context:
- Real-time seems to imply a system where processing is guaranteed to complete with a specified (short) delay after the signal has been received. A minimal buffer is needed.
- Near real-time (as I have been using it) means a system where the delay between receiving and completion of processing may get relatively large on occasion, but the system will not (except under pathological conditions) fall behind so far that the buffer gets filled up.
- Batch implies post-processing to me. The incoming signal is just saved (maybe with a bit of real-time pre-processing) and then analyzed later.
This gives the nice framework of real-time and near real-time being systems where they can (in theory) run forever while new data is being acquired... processing happens in parallel with acquisition. Batch processing happens after all the data has been collected.
Anyway, I could be conflicting with some technical definitions I'm unaware of... and I assume someone here will gleefully correct me if needed.
There are issues with all of these answers in that the definitions are flawed. For instance, "batch" simply means that transactions are grouped and sent together. Real-time implies transactional, but may also have other implications. So when you combine batch in the same attribute as real-time and near real-time, the purpose of that attribute loses clarity. The definition becomes less cohesive and less clear, which would make any application created with the data more fragile. I would guess that practitioners would be better off with a clearly modeled taxonomy such as:
Attribute1: Batched (grouped) or individual transactions.
Attribute2: Scheduled (time-driven) or event-driven.
Attribute3: Speed per transaction. For batch that would be the average speed/transaction.
Attribute4: Protocol/Technology: SOAP, REST, combination, FTP, SFTP, etc. for data movement.
Attributex: Whatever.
Attribute4 is more related to something I am doing right now, so you could drop it or expand the list for what you are trying to achieve. For each of these attribute values, there would likely be additional, specific attributes. But to bring the information together, we need to think about what is needed to make the collective data useful: for instance, what do we need to know about batched and transactional flows to make them useful together? You might consider attributes for each that let you understand total throughput for a given time period. It seems funny that we may create conceptual, logical, and physical data models (hopefully) for our business clients, yet we don't always apply that kind of thought to how we define terminology in our discussions.
Any system in which the time at which output is produced is significant. This is usually because the input corresponds to some movement in the physical world, and the output has to relate to that same movement. The lag from input time to output time must be sufficiently small for acceptable timeliness.
I was just wondering if there is an elegant way to set the maximum CPU load for a particular thread doing intensive calculations.
Right now I have located the most time-consuming loop in the thread (it does only compression) and use GetTickCount() and Sleep() with hardcoded values. This makes sure that the loop runs for a certain period and then sleeps for a certain minimum time. It more or less does the job, i.e. it guarantees that the thread will not use more than 50% of the CPU. However, the behavior depends on the number of CPU cores (a huge disadvantage) and is simply ugly (a smaller disadvantage :)). Any ideas?
I am not aware of any API for getting the OS's scheduler to do what you want (even if your thread is idle-priority, if there are no higher-priority ready threads, yours will run). However, I think you can improvise a fairly elegant throttling function based on what you are already doing. Essentially (I don't have a Windows dev machine handy):
Pick a default amount of time the thread will sleep each iteration. Then, on each iteration (or on every nth iteration, such that the throttling function doesn't itself become a significant CPU load),
Compute the amount of CPU time your thread used since the last time your throttling function was called (I'll call this dCPU). You can use the GetThreadTimes() API to get the amount of time your thread has been executing.
Compute the amount of real time elapsed since the last time your throttling function was called (I'll call this dClock).
dCPU / dClock is the CPU usage as a fraction of one CPU. If it is higher than you want, increase your sleep time; if it is lower, decrease it.
Have your thread sleep for the computed time.
Depending on how your watchdog computes CPU usage, you might want to use GetProcessAffinityMask() to find out how many CPUs the system has. dCPU / (dClock * CPUs) is then the fraction of total CPU time being used.
You will still have to pick some magic numbers for the initial sleep time and the increment/decrement amount, but I think this algorithm could be tuned to keep a thread running at fairly close to a determined percent of CPU.
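A rough Win32 sketch of that algorithm (untested; DoCompressionChunk, the 50% target, the sleep bounds, and the 1 ms adjustment step are all the kind of magic numbers mentioned above):

#include <windows.h>

void DoCompressionChunk();   // one iteration of the existing compression loop (assumed)

static ULONGLONG fileTimeToMs(const FILETIME& ft) {
    ULARGE_INTEGER v;
    v.LowPart = ft.dwLowDateTime;
    v.HighPart = ft.dwHighDateTime;
    return v.QuadPart / 10000;               // FILETIME counts 100 ns units
}

void throttledLoop(volatile bool& running) {
    const double targetLoad = 0.50;           // aim for ~50% of one CPU
    DWORD sleepMs = 10;                       // initial sleep guess
    ULONGLONG lastCpuMs = 0;
    ULONGLONG lastClockMs = GetTickCount64();

    while (running) {
        DoCompressionChunk();

        FILETIME creation, exitTime, kernel, user;
        GetThreadTimes(GetCurrentThread(), &creation, &exitTime, &kernel, &user);
        const ULONGLONG cpuMs = fileTimeToMs(kernel) + fileTimeToMs(user);
        const ULONGLONG clockMs = GetTickCount64();

        const double dCPU = static_cast<double>(cpuMs - lastCpuMs);
        const double dClock = static_cast<double>(clockMs - lastClockMs);
        lastCpuMs = cpuMs;
        lastClockMs = clockMs;

        if (dClock > 0.0) {
            const double load = dCPU / dClock;            // fraction of one CPU
            if (load > targetLoad && sleepMs < 200) ++sleepMs;
            else if (load < targetLoad && sleepMs > 0) --sleepMs;
        }
        Sleep(sleepMs);
    }
}

If you want the target to mean a percentage of total CPU rather than of one core, divide the measured load by the CPU count obtained from GetProcessAffinityMask() (or GetSystemInfo()).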
On Linux, you can change the scheduling priority of a thread with nice().
I can't think of a cross-platform way of doing what you want (or any guaranteed way, full stop), but as you are using GetTickCount perhaps you aren't interested in cross-platform :)
I'd use interprocess communication and set the intensive process's nice level to get what you require, but I'm not sure that's appropriate for your situation.
EDIT:
I agree with Bernard, which is why I think a process rather than a thread might be more appropriate, but it might just not suit your purposes.
The problem is that it's not normal to want to leave the CPU idle while you have work to do. Normally you set a background task to IDLE priority and let the OS schedule it into all the CPU time that isn't used by interactive tasks.
It sounds to me like the problem is the watchdog process.
If your background task is CPU-bound then you want it to take all the unused CPU time for its task.
Maybe you should look at fixing the watchdog program?
You may be able to change the priority of a thread, but changing the maximum utilization would require either polling and hacks to limit how many things are occurring, or using OS tools that can set the maximum utilization of a process.
However, I don't see any circumstance where you would want to do this.