Subtract two times in Crystal lang - crystal-lang

I am writing a small benchmark for Crystal, and want to compare the time taken without any other library.
In a programming language like Ruby I would do this:
t = Time.new
sleep 3
puts "Time: #{Time.new - t}s"
Is there a way to get the current monotonic clock time and subtract it from another reading, with accuracy of at least 3 decimal places?

The typical time representation provided by the operating system is
based on a "wall clock" which is subject to changes for clock
synchronization. This can result in discontinuous jumps in the
time-line making it not suitable for accurately measuring elapsed
time.
...
As an alternative, the operating system also provides a monotonic
clock. Its time-line has no specified starting point but is strictly
linearly increasing.
This monotonic clock should always be used for measuring elapsed time.
A reading from this clock can be taken using .monotonic:
t1 = Time.monotonic
# operation that takes 1 minute
t2 = Time.monotonic
t2 - t1 # => 1.minute (approximately)
The execution time of a block can be measured using .measure:
elapsed_time = Time.measure do
# operation that takes 20 milliseconds
end
elapsed_time # => 20.milliseconds (approximately)
-- Time - github.com/crystal-lang/crystal
* Emphasis mine.
So your example would be basically the same, but with Time.monotonic instead of Time.new.

Related

std::clock() and CLOCKS_PER_SEC

I want to use a timer function in my program. Following the example at How to use clock() in C++, my code is:
#include <ctime>
#include <iostream>

int main()
{
    std::clock_t start = std::clock();
    while (true)
    {
        double time = (std::clock() - start) / (double)CLOCKS_PER_SEC;
        std::cout << time << std::endl;
    }
    return 0;
}
On running this, it begins to print out numbers. However, it takes about 15 seconds for that number to reach 1. Why does it not take 1 second for the printed number to reach 1?
Actually it is a combination of what has been posted. Basically, as your program is running in a tight loop, CPU time should increase just as fast as wall-clock time.
But since your program is writing to stdout and the terminal has limited buffer space, your program will block whenever that buffer is full, until the terminal has had enough time to print more of the generated output.
This is of course much more expensive, CPU-wise, than generating the strings from the clock values, so most of the CPU time will be spent in the terminal and graphics driver. It seems your system needs about 14 times as much CPU power to output those timestamps as it takes to generate the strings in the first place.
std::clock returns CPU time, not wall time. That means the number of CPU-seconds used, not the elapsed time. If your program uses only 20% of the CPU, then the CPU-seconds will increase at only 20% of the speed of wall-clock seconds.
std::clock
Returns the approximate processor time used by the process since the beginning of an implementation-defined era related to the program's execution. To convert result value to seconds divide it by CLOCKS_PER_SEC.
So it will not return a second until the program has used an actual second of CPU time.
If you want to deal with actual elapsed time, I suggest you use the clocks provided by <chrono>, like std::steady_clock or std::high_resolution_clock.
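For example, a steady_clock version of your loop tracks elapsed wall-clock time rather than CPU time. This is only a minimal sketch; the rest of your program is assumed unchanged:
#include <chrono>
#include <iostream>

int main()
{
    // steady_clock measures elapsed wall-clock time and never jumps backwards.
    const auto start = std::chrono::steady_clock::now();
    while (true)
    {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        // Prints roughly 1.0 after one real second, regardless of CPU usage.
        std::cout << elapsed.count() << std::endl;
    }
}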

How to remove the time taken by the clock from the total measured execution time?

I am trying to measure the time taken by a set of statements. The following is pseudocode. The code is implemented in C++ on a Xilinx chipset with a custom RTOS, so the traditional C++ clock functions do not work here.
I do not need help with the actual time measurement, but rather with the math of how to calculate the actual execution time.
one = clock.getTime();
/* statements 1 to 10 */
two = clock.getTime();
fTime = two - one;
Now I know the time taken by the statements. This time also includes the time taken by getTime(), right?
one = clock.getTime();
clock.getTime();
two = clock.getTime();
cTime = two - one; // Just measuring; the minimum value I get is 300 microseconds.
Now this block gives me the time taken by getTime().
Finally, my question is:
What is the actual time taken by the statements?
fTime - cTime
fTime - (2 * cTime)
Some other equation?
Your time measurement shifts the measured time.
If the platform is stable enough, the shift is the same for every reading, so
one = t1 + dt
two = t2 + dt
After subtraction the shift cancels itself out:
two - one = (t2 + dt) - (t1 + dt) = t2 - t1
So there is no need to correct for the measurement shift in this case.
Problems start on superscalar/vector architectures where code execution time is variable
due to different cache misses,
different prefetch invalidation,
and so on; then you have to play with cache invalidation. Also, if your getTime() waits for an interrupt or HW event, that can add a few extra error terms.
In that case, measure many times and take the average or the smallest valid result, something like:
Negative clock cycle measurements with back-to-back rdtsc?
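A rough sketch of that idea, using std::chrono::steady_clock as a stand-in for your RTOS clock (the same pattern applies to clock.getTime()): take many back-to-back readings and keep the smallest difference as the estimate of the clock-call overhead.
#include <algorithm>
#include <chrono>
#include <iostream>

int main()
{
    using clk = std::chrono::steady_clock;

    // The smallest difference between back-to-back readings approximates
    // the overhead of the clock call itself.
    auto overhead = clk::duration::max();
    for (int i = 0; i < 1000; ++i)
    {
        const auto a = clk::now();
        const auto b = clk::now();
        overhead = std::min(overhead, b - a);
    }

    std::cout << "clock-call overhead: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(overhead).count()
              << " ns\n";
}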
Typically when benchmarking we perform the measured tasks many many times in between the timer calls, and then divide by the number of task executions.
This is to:
smooth over irrelevant variations, and
avoid the measured duration falling below the actual clock resolution, and
leave the timing overhead as a totally negligible fraction of the time.
In that way, you don't have to worry about it any more.
That being said, this can introduce problems of its own, particularly with respect to things like branch predictors and caches, as they may be "warmed up" by your first few executions, impacting the accuracy of your results in a way that wouldn't happen on a single run.
Benchmarking properly can be quite tricky and there is a wealth of material available to explain how to do it properly.
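As a rough illustration of that batching approach (doWork() is a hypothetical stand-in for whatever you actually want to time):
#include <chrono>
#include <cmath>
#include <iostream>

// Hypothetical stand-in for the statements being timed.
static double doWork()
{
    double x = 0.0;
    for (int i = 1; i <= 100; ++i)
        x += std::sqrt(static_cast<double>(i));
    return x;
}

int main()
{
    constexpr int kRuns = 10000;
    double sink = 0.0;  // accumulate results so the work is not optimized away

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < kRuns; ++i)
        sink += doWork();
    const auto end = std::chrono::steady_clock::now();

    // Dividing by the run count makes the two timer calls a negligible
    // fraction of the reported per-run figure.
    const std::chrono::duration<double, std::nano> total = end - start;
    std::cout << "average per run: " << total.count() / kRuns
              << " ns (sink = " << sink << ")\n";
}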

Why clock() does not work on the cluster machine

I want to get the running time of part of my code.
My C++ code is like this:
...
clock_t t1 = clock();
/*
Here is my core code.
*/
clock_t t2 = clock();
cout << "Running time: " << (1000.0 * (t2 - t1)) / CLOCKS_PER_SEC << "ms" << endl;
...
This code works well on my laptop (openSUSE, g++ and clang++, Core i5).
But it does not work well on the cluster in the department
(Ubuntu, g++, AMD Opteron and Intel Xeon).
I always get a coarse integer running time,
like 0ms, 10ms or 20ms.
What causes that? Why? Thanks!
Clocks are not guaranteed to be exact down to ~10^-44 seconds (Planck time); they often have a minimal resolution. The Linux man page implies this with:
The clock() function returns an approximation of processor time used by the program.
and so does the ISO standard C11 7.27.2.1 The clock function /3:
The clock function returns the implementation’s best approximation of ...
and in 7.27.1 Components of time /4:
The range and precision of times representable in clock_t and time_t are implementation-defined.
From your (admittedly limited) sample data, it looks like the minimum resolution of your cluster machines is on the order of 10ms.
In any case, you have several possibilities if you need a finer resolution.
First, find a (probably implementation-specific) means of timing things more accurately.
Second, don't do it once. Do it a thousand times in a tight loop and then just divide the time taken by 1000. That should roughly increase your resolution a thousand-fold.
Third, think about the implication that your code only takes 50ms at the outside. Unless you have a pressing need to execute it more than twenty times a second (assuming you have no other code to run), it may not be an issue.
On that last point, think of things like "What's the longest a user will have to wait before they get annoyed?". The answer to that would vary but half a second might be fine in most situations.
Since 50ms code could run ten times over during that time, you may want to ignore it. You'd be better off concentrating on code that has a clearly larger impact.
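On the first of those suggestions: one probably-implementation-specific option on Linux/POSIX is getrusage(), which reports the process's user and system CPU time with microsecond fields (how fine the effective resolution really is still depends on the kernel). A minimal sketch, assuming a POSIX system:
#include <sys/resource.h>
#include <iostream>

// CPU time (user + system) consumed by this process so far, in seconds.
static double cpuSeconds()
{
    rusage usage{};
    getrusage(RUSAGE_SELF, &usage);
    return usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6
         + usage.ru_stime.tv_sec + usage.ru_stime.tv_usec / 1e6;
}

int main()
{
    const double before = cpuSeconds();
    /* core code goes here */
    const double after = cpuSeconds();
    std::cout << "Running time: " << (after - before) * 1000.0 << "ms" << std::endl;
}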

How to calculate and print clock_t time roughly

I am timing how long it takes to do three different types of searches, sequential, recursive binary, and iterative binary. I have those in place, and it does iterate through and finish the search. My problem is that when I time them all, I get 0 for all of them every time, even if I make an array of 100,000, and I have it search for something not in the array. If I set a break point in the search it obviously makes the time longer, and it gives me a reasonable time that I can work with. But otherwise it is always 0. Here is my code, it is similar for all three search timers.
clock_t recStart = clock();
mySearch.recursiveSearch(SEARCH_INT);
clock_t recEnd = clock();
clock_t recDiff = recEnd - recStart;
double recClockTime = (double)recDiff/(double)CLOCKS_PER_SEC;
cout << recClockTime << endl;
cout << CLOCKS_PER_SEC << endl;
cout << recClockTime << endl;
For the last two I get 1000 and 0.
Am I doing something wrong here? Or is it in my search Object?
clock() is not an accurate timer, and it just doesn't work well for timing short intervals.
C says clock returns the implementation’s best approximation to the processor time used by the program since the beginning of an implementation-defined era related only to the program invocation.
If, between two successive clock calls, your program takes less time than one unit of the clock function, you could get 0. POSIX defines the unit via CLOCKS_PER_SEC as 1000000 (the unit is then 1 microsecond).
(http://pubs.opengroup.org/onlinepubs/009604499/functions/clock.html)
To measure clock cycles on x86/x64 you can read the CPU's Time Stamp Counter via the rdtsc instruction (which can be done with inline assembly). Note that it returns a time stamp, not the number of seconds elapsed, so you need to retrieve the CPU frequency as well.
However, the best way to get accurate time in seconds depends on your platform.
To sum up, it's virtually impossible to calculate and print clock_t time in seconds accurately. You might want to see this on Stack Overflow to find a better approach (if accuracy is the top priority).
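If you do want to experiment with rdtsc, most compilers expose it as an intrinsic, so hand-written assembly is not required. A minimal sketch for GCC/Clang on x86/x64 (MSVC offers the same intrinsic via <intrin.h>); as noted above, the result is a tick count, not seconds:
#include <x86intrin.h>   // __rdtsc() on GCC/Clang for x86/x64
#include <cstdint>
#include <iostream>

int main()
{
    const std::uint64_t c0 = __rdtsc();
    // ... code to measure ...
    const std::uint64_t c1 = __rdtsc();

    // Raw Time Stamp Counter ticks; converting to time needs the TSC frequency.
    std::cout << "elapsed TSC ticks: " << (c1 - c0) << '\n';
}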
clock() just doesn't have enough resolution - here is one good discussion/blog on that topic
http://www.guyrutenberg.com/2007/09/10/resolution-problems-in-clock/
I think there are two options: either use clock_gettime, or, even better, have you considered using OProfile or CodeAnalyst?
I personally prefer to use tools - OProfile is good. I have not used CodeAnalyst before - and then there is Valgrind and gprof.
If you insist on using clock_gettime - please check this out
http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
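For reference, the basic clock_gettime() pattern with CLOCK_MONOTONIC looks roughly like this (Linux/POSIX; older glibc versions need -lrt at link time):
#include <time.h>    // clock_gettime, CLOCK_MONOTONIC (POSIX)
#include <iostream>

int main()
{
    timespec t0{}, t1{};
    clock_gettime(CLOCK_MONOTONIC, &t0);
    // ... code to measure ...
    clock_gettime(CLOCK_MONOTONIC, &t1);

    // Combine the seconds and nanoseconds fields into one double value.
    const double seconds = (t1.tv_sec - t0.tv_sec)
                         + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    std::cout << "elapsed: " << seconds << " s\n";
}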

Measuring the runtime of a C++ code?

I want to measure the runtime of my C++ code. Executing my code takes about 12 hours and I want to write this time at the end of execution of my code. How can I do it in my code?
Operating system: Linux
If you are using C++11 you can use system_clock::now():
auto start = std::chrono::system_clock::now();
/* do some work */
auto end = std::chrono::system_clock::now();
auto elapsed = end - start;
std::cout << elapsed.count() << '\n';
You can also specify the granularity to use for representing a duration:
// this constructs a duration object using milliseconds
auto elapsed =
std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
// this constructs a duration object using seconds
auto elapsed =
std::chrono::duration_cast<std::chrono::seconds>(end - start);
If you cannot use C++11, then have a look at chrono from Boost.
The best thing about using such standard libraries is that their portability is really high (e.g., they both work on Linux and Windows), so you do not need to worry too much if you decide to port your application afterwards.
These libraries follow a modern C++ design too, as opposed to C-like approaches.
EDIT: The example above can be used to measure wall-clock time. That is not, however, the only way to measure the execution time of a program. First, we can distinguish between user and system time:
User time: The time spent by the program running in user space.
System time: The time spent by the program running in system (or kernel) space. A program enters kernel space for instance when executing a system call.
Depending on the objectives it may be necessary or not to consider system time as part of the execution time of a program. For instance, if the aim is to just measure a compiler optimization on the user code then it is probably better to leave out system time. On the other hand, if the user wants to determine whether system calls are a significant overhead, then it is necessary to measure system time as well.
Moreover, since most modern systems are time-shared, different programs may compete for several computing resources (e.g., CPU). In such a case, another distinction can be made:
Wall-clock time: By using wall-clock time the execution of the program is measured in the same way as if we were using an external (wall) clock. This approach does not consider the interaction between programs.
CPU time: In this case we only count the time that a program is actually running on the CPU. If a program (P1) is co-scheduled with another one (P2), and we want to get the CPU time for P1, this approach does not include the time while P2 is running and P1 is waiting for the CPU (as opposed to the wall-clock time approach).
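To make that distinction concrete, here is a small sketch that measures the same (mostly sleeping) workload with both std::chrono::steady_clock and std::clock; sleeping consumes wall-clock time but almost no CPU time:
#include <chrono>
#include <ctime>
#include <iostream>
#include <thread>

int main()
{
    const auto wallStart = std::chrono::steady_clock::now();
    const std::clock_t cpuStart = std::clock();

    std::this_thread::sleep_for(std::chrono::seconds(1));  // mostly idle

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    const std::chrono::duration<double> wall = wallEnd - wallStart;
    const double cpu = static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;

    // Expect roughly: wall ~ 1.0 s, cpu ~ 0.0 s.
    std::cout << "wall: " << wall.count() << " s, cpu: " << cpu << " s\n";
}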
For measuring CPU time, Boost includes a set of extra clocks:
process_real_cpu_clock, captures wall clock CPU time spent by the current process.
process_user_cpu_clock, captures user-CPU time spent by the current process.
process_system_cpu_clock, captures system-CPU time spent by the current process.
A tuple-like class process_cpu_clock, that captures real, user-CPU, and system-CPU process times together.
A thread_clock thread steady clock giving the time spent by the current thread (when supported by a platform).
Unfortunately, C++11 does not have such clocks. But Boost is a widely used library and, probably, these extra clocks will be incorporated into C++1x at some point. So, if you use Boost, you will be ready when the new C++ standard adds them.
Finally, if you want to measure the time a program takes to execute from the command line (as opposed to adding code to your program), you may have a look at the time command, just as @BЈовић suggests. This approach, however, would not let you measure individual parts of your program (e.g., the time it takes to execute a function).
Use std::chrono::steady_clock and not std::chrono::system_clock for measuring run time in C++11. The reason is (quoting system_clock's documentation):
on most systems, the system time can be adjusted at any moment
while steady_clock is monotonic and is better suited for measuring intervals:
Class std::chrono::steady_clock represents a monotonic clock. The time
points of this clock cannot decrease as physical time moves forward.
This clock is not related to wall clock time, and is best suitable for
measuring intervals.
Here's an example:
auto start = std::chrono::steady_clock::now();
// do something
auto finish = std::chrono::steady_clock::now();
double elapsed_seconds = std::chrono::duration_cast<
std::chrono::duration<double> >(finish - start).count();
A small practical tip: if you are measuring run time and want to report seconds, std::chrono::duration_cast<std::chrono::seconds> is rarely what you need, because it gives you a whole number of seconds. To get the time in seconds as a double, use the example above.
You can use time to start your program. When it ends, it prints nice time statistics about the program run. It is easy to configure what to print. By default, it prints the user and CPU times taken to execute the program.
EDIT: Note that any measurement from within the code is not exact, because your application will get blocked by other programs, hence giving you skewed values*.
* By skewed values I mean that it is easy to get the time it took to execute the program, but that time varies depending on the CPU load during the program's execution. To get a relatively stable time measurement that doesn't depend on the CPU load, one can execute the application using time and use the CPU time as the measurement result.
I used something like this in one of my projects:
#include <sys/time.h>
struct timeval start, end;
gettimeofday(&start, NULL);
//Compute
gettimeofday(&end, NULL);
double elapsed = ((end.tv_sec - start.tv_sec) * 1000)
+ (end.tv_usec / 1000 - start.tv_usec / 1000);
This is for milliseconds and it works both for C and C++.
This is the code I use:
const auto start = std::chrono::steady_clock::now();
// Your code here.
const auto end = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed = end - start;
std::cout << "Time in seconds: " << elapsed.count() << '\n';
You don't want to use std::chrono::system_clock because it is not monotonic! If the user changes the time in the middle of your code your result will be wrong - it might even be negative. std::chrono::high_resolution_clock might be implemented using std::chrono::system_clock so I wouldn't recommend that either.
This code also avoids ugly casts.
If you wish to print the measured time with printf(), you can use this:
auto start = std::chrono::system_clock::now();
/* measured work */
auto end = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
printf("Time = %lld ms\n", static_cast<long long int>(elapsed.count()));
You could also try some timer classes that start and stop automatically, and gather statistics on the average, maximum and minimum time spent in any block of code, as well as the number of calls. These cxx-rtimer classes are available on GitHub, and offer support for using std::chrono, clock_gettime(), or boost::posix_time as a back-end clock source.
With these timers, you can do something like:
void timeCriticalFunction() {
    static rtimers::cxx11::DefaultTimer timer("expensive");
    auto scopedStartStop = timer.scopedStart();
    // Do something costly...
}
with timing stats written to std::cerr on program completion.