The GNU C++ compiler on Windows 10 returns CLOCKS_PER_SEC = 1000, but I need to measure the running time of an algorithm that finishes in well under a millisecond (it's a school project). Is there a way to redefine CLOCKS_PER_SEC to, say, one million (as on UNIX-based OSes)? On a side note, #define CLOCKS_PER_SEC ((clock_t)(1000000)) doesn't seem to work, either.
Short answer: no.
Long answer: no, but you can use the QueryPerformanceCounter function. Here's an example, adapted from MSDN:
#include <windows.h>

int main()
{
    LARGE_INTEGER StartingTime, EndingTime, ElapsedMicroseconds;
    LARGE_INTEGER Frequency;

    QueryPerformanceFrequency(&Frequency);
    QueryPerformanceCounter(&StartingTime);

    // Activity to be timed

    QueryPerformanceCounter(&EndingTime);
    ElapsedMicroseconds.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;

    //
    // We now have the elapsed number of ticks, along with the
    // number of ticks-per-second. We use these values
    // to convert to the number of elapsed microseconds.
    // To guard against loss-of-precision, we convert
    // to microseconds *before* dividing by ticks-per-second.
    //
    ElapsedMicroseconds.QuadPart *= 1000000;
    ElapsedMicroseconds.QuadPart /= Frequency.QuadPart;

    return 0;
}
That way, you can even measure nanoseconds, but beware: at that precision level even the tick count can drift and jitter, so you might never get a perfectly accurate result. If you want perfect precision, I guess you will be forced to use an RTOS on appropriate, specialized hardware which is shielded against soft errors, for example.
Well, this assignment absolutely requires the use of time.h and time.h only.
In this case, measuring short times is hard, but making short times longer is easy... Just repeat your algorithm until you reach, say, 1 second, and then divide the measured time by the number of iterations you did. You may get a skewed picture for cache-related and branch predictor-related times (as repeated iterations will "warm up" the caches and teach the branch predictor), but for the rest it should be decently accurate.
Incidentally, notice that using clock() is a bit problematic: by the standard it measures user CPU time of the current process (so kernel time and IO wait are excluded), although on Windows it measures wall-clock time. That's essentially the same as long as your algorithm is CPU-bound and manages to run pretty much continuously, but you may be in for big differences if it is IO-bound or if it is running on a busy system.
If you are interested in wall clock time and you are restricted to time.h, your best option is plain old time(); in that case I'd sync up precisely to the change of second with a busy wait, and then measure the number of iterations in a few seconds as said before.
time_t start = time(nullptr);
while(start == time(nullptr));      // busy-wait until the second changes
start = time(nullptr);              // now aligned to a second boundary
int i = 0;
while(time(nullptr) - start < 5) {
    // your algorithm
    ++i;
}
int elapsed = time(nullptr) - start;
double time_per_iteration = double(elapsed) / i;
I am programming a game using OpenGL GLUT, and I am applying a game development technique that consists of measuring the time consumed on each iteration of the game's main loop, so you can use it to update the game scene proportionally to the time since it was last updated. To achieve this, I have this at the start of the loop:
void logicLoop () {
float finalTime = (float) clock() / CLOCKS_PER_SEC;
float deltaTime = finalTime - initialTime;
initialTime = finalTime;
...
// Here I move things using deltaTime value
...
}
The problem came when I added a bullet to the game. If the bullet does not hit any target in two seconds, it must be destroyed. Then, what I did was to keep a reference to the moment the bullet was created like this:
class Bullet: public GameObject {
    float birthday;
public:
    Bullet () {
        ...
        // Some initialization stuff
        ...
        birthday = (float) clock() / CLOCKS_PER_SEC;
    }
    float getBirthday () { return birthday; }
};
And then I added this to the logic, just after the finalTime and deltaTime measurement:
if (bullet != NULL) {
if (finalTime - bullet->getBirthday() > 2) {
world.remove(bullet);
bullet = NULL;
}
}
It looked nice, but when I ran the code, the bullet stayed alive for far too long. Looking for the problem, I printed the value of (finalTime - bullet->getBirthday()), and I saw that it increases really, really slowly, as if it were not a time measured in seconds.
Where is the problem? I thought the result would be in seconds, so the bullet would be removed after two seconds.
This is a common mistake. clock() does not measure the passage of actual time; it measures how much time has elapsed while the CPU was running this particular process.
Other processes also take CPU time, so the two clocks are not the same. Whenever your operating system is executing some other process's code, including while this one is "sleeping", that time does not count toward clock(). And if your program is multithreaded on a system with more than one CPU, clock() may "double count" time!
Humans have no knowledge or perception of OS time slices: we just perceive the actual passage of actual time (known as "wall time"). Ultimately, then, you will see clock()'s timebase being different from wall time.
Do not use clock() to measure wall time!
You want something like gettimeofday() or clock_gettime() instead. In order to allay the effects of people changing the system time, on Linux I personally recommend clock_gettime() with the system's "monotonic clock", a clock that steps in sync with wall time but has an arbitrary epoch unaffected by people playing around with the computer's time settings. (Obviously switch to a portable alternative if needs be.)
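For instance, here is a minimal sketch of that approach on POSIX with the monotonic clock; the monotonicSeconds helper and the static previousSeconds variable are just illustrative names, not part of any API:
#include <time.h>   // POSIX clock_gettime

static double previousSeconds = 0.0;

static double monotonicSeconds() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);   // immune to changes of the system time
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

void logicLoop() {
    double now = monotonicSeconds();
    float deltaTime = (float)(now - previousSeconds);   // wall-clock seconds since last frame
    previousSeconds = now;
    // ... move things using deltaTime ...
}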
This is actually discussed on the cppreference.com page for clock():
std::clock time may advance faster or slower than the wall clock, depending on the execution resources given to the program by the operating system. For example, if the CPU is shared by other processes, std::clock time may advance slower than wall clock. On the other hand, if the current process is multithreaded and more than one execution core is available, std::clock time may advance faster than wall clock.
Please get into the habit of reading documentation for all the functions you use, when you are not sure what is going on.
Edit: Turns out GLUT itself has a function you can use for this, which is mighty convenient. glutGet(GLUT_ELAPSED_TIME) gives you the number of wall-clock milliseconds elapsed since your call to glutInit(). So I guess that's what you need here. It may be slightly more performant, particularly if GLUT (or some other part of OpenGL) is already requesting wall time periodically, and if this function merely queries that already-obtained time… thus saving you from an unnecessary second system call (which costs).
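A rough sketch of the loop using glutGet; initialTimeMs here is an assumed int variable holding the previous reading in milliseconds, replacing the float initialTime of the original code:
#include <GL/glut.h>

static int initialTimeMs = 0;    // previous reading, in milliseconds

void logicLoop () {
    int nowMs = glutGet(GLUT_ELAPSED_TIME);              // wall-clock ms since glutInit()
    float deltaTime = (nowMs - initialTimeMs) / 1000.0f; // seconds since the last frame
    initialTimeMs = nowMs;
    // Here I move things using deltaTime value
}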
If you are on Windows you can use QueryPerformanceFrequency / QueryPerformanceCounter, which give pretty accurate time measurements.
Here's an example.
#include <Windows.h>
#include <iostream>

using namespace std;
int main()
{
LARGE_INTEGER freq = {0, 0};
QueryPerformanceFrequency(&freq);
LARGE_INTEGER startTime = {0, 0};
QueryPerformanceCounter(&startTime);
// STUFF.
for(size_t i = 0; i < 100; ++i) {
cout << i << endl;
}
LARGE_INTEGER stopTime = {0, 0};
QueryPerformanceCounter(&stopTime);
const double elapsed = ((double)stopTime.QuadPart - (double)startTime.QuadPart) / freq.QuadPart;
cout << "Elapsed (seconds): " << elapsed << endl;
return 0;
}
I was given the following homework assignment:
Write a program to test on your computer how long it takes to do
n log n, n^2, n^5, 2^n, and n! additions for n = 5, 10, 15, 20.
I have written a piece of code, but I always get an execution time of 0. Can anyone help me out with it? Thanks
#include <iostream>
#include <cmath>
#include <ctime>
using namespace std;
int main()
{
float n=20;
time_t start, end, diff;
start = time (NULL);
cout<<(n*log(n))*(n*n)*(pow(n,5))*(pow(2,n))<<endl;
end= time(NULL);
diff = difftime (end,start);
cout <<diff<<endl;
return 0;
}
Better than time(), with its one-second precision, is to use millisecond precision.
A portable way is, e.g.:
#include <ctime>
#include <iostream>

int main() {
    clock_t start, end;
    double msecs;

    start = clock();
    /* any stuff here ... */
    end = clock();

    msecs = ((double) (end - start)) * 1000 / CLOCKS_PER_SEC;
    std::cout << msecs << " ms\n";
    return 0;
}
Execute each calculation thousands of times, in a loop, so that you can overcome the low resolution of time and obtain meaningful results. Remember to divide by the number of iterations when reporting results.
This is not particularly accurate but that probably does not matter for this assignment.
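For instance, a sketch of the repeat-and-divide idea; do_additions here is just a hypothetical stand-in for whichever growth function is being tested:
#include <ctime>
#include <iostream>

// Hypothetical stand-in for the computation being measured.
volatile long sink = 0;
void do_additions(int n) { for (int k = 0; k < n * n; ++k) sink += k; }

int main() {
    const long iterations = 100000;          // enough repetitions to get a measurable time
    clock_t start = clock();
    for (long i = 0; i < iterations; ++i)
        do_additions(20);
    clock_t end = clock();

    double total_ms = (double)(end - start) * 1000 / CLOCKS_PER_SEC;
    std::cout << total_ms / iterations << " ms per call\n";
    return 0;
}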
At least on Unix-like systems, time() only gives you 1-second granularity, so it's not useful for timing things that take a very short amount of time (unless you execute them many times in a loop). Take a look at the gettimeofday() function, which gives you the current time with microsecond resolution. Or consider using clock(), which measure CPU time rather than wall-clock time.
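A minimal sketch of the gettimeofday() route (POSIX only; the work being timed is left as a placeholder):
#include <sys/time.h>
#include <stdio.h>

int main(void) {
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    /* work to be timed ... */
    gettimeofday(&t1, NULL);

    long us = (t1.tv_sec - t0.tv_sec) * 1000000L + (t1.tv_usec - t0.tv_usec);
    printf("%ld microseconds\n", us);
    return 0;
}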
Your code executes too fast to be detected by the time() function, which returns the number of seconds elapsed since 00:00 hours, Jan 1, 1970 UTC.
Try to use this piece of code:
inline long getCurrentTime() {
timeb timebstr;
ftime( &timebstr );
return (long)(timebstr.time)*1000 + timebstr.millitm;
}
To use it you have to include sys/timeb.h.
Actually the better practice is to repeat your calculations in the loop to get more precise results.
You will probably have to find a more precise platform-specific timer such as the Windows High Performance Timer. You may also (very likely) find that your compiler optimizes or removes almost all of your code.
I am printing microseconds continuously using gettimeofday(). As you can see in the program output below, the time is not updated at microsecond intervals; rather, it repeats for a number of samples and then increments, not by microseconds but by milliseconds.
while(1)
{
gettimeofday(&capture_time, NULL);
printf(".%ld\n", capture_time.tv_usec);
}
Program output:
.414719
.414719
.414719
.414719
.430344
.430344
.430344
.430344
etc.
I want the output to increment sequentially like,
.414719
.414720
.414721
.414722
.414723
or
.414723, .414723+x, .414723+2x, .414723 +3x + ...+ .414723+nx
It seems that the microsecond value is not refreshed when I acquire it from capture_time.tv_usec.
=================================
//Full Program
#include <iostream>
#include <windows.h>
#include <conio.h>
#include <time.h>
#include <stdio.h>
#if defined(_MSC_VER) || defined(_MSC_EXTENSIONS)
#define DELTA_EPOCH_IN_MICROSECS 11644473600000000Ui64
#else
#define DELTA_EPOCH_IN_MICROSECS 11644473600000000ULL
#endif
struct timezone
{
int tz_minuteswest; /* minutes W of Greenwich */
int tz_dsttime; /* type of dst correction */
};
timeval capture_time; // structure
int gettimeofday(struct timeval *tv, struct timezone *tz)
{
FILETIME ft;
unsigned __int64 tmpres = 0;
static int tzflag;
if (NULL != tv)
{
GetSystemTimeAsFileTime(&ft);
tmpres |= ft.dwHighDateTime;
tmpres <<= 32;
tmpres |= ft.dwLowDateTime;
tmpres /= 10; /* convert 100-ns intervals into microseconds */
/* converting file time to unix epoch */
tmpres -= DELTA_EPOCH_IN_MICROSECS;
tv->tv_sec = (long)(tmpres / 1000000UL);
tv->tv_usec = (long)(tmpres % 1000000UL);
}
if (NULL != tz)
{
if (!tzflag)
{
_tzset();
tzflag++;
}
tz->tz_minuteswest = _timezone / 60;
tz->tz_dsttime = _daylight;
}
return 0;
}
int main()
{
while(1)
{
gettimeofday(&capture_time, NULL);
printf(".%ld\n", capture_time.tv_usec);// JUST PRINTING MICROSECONDS
}
}
The change in time you observe is from 0.414719 s to 0.430344 s. The difference is 15.625 ms. The fact that the number is represented in microseconds does not mean that it is incremented by 1 microsecond. In fact, 15.625 ms (1/64 of a second) is exactly the system time increment on standard hardware. I've given a closer look here and here.
This is called granularity of the system time.
Windows:
However, there is a way to improve this, a way to reduce the granularity: the Multimedia Timers. In particular, Obtaining and Setting Timer Resolution describes a way to increase the system's interrupt frequency.
The code:
#define TARGET_PERIOD 1 // 1-millisecond target interrupt period

TIMECAPS tc;
UINT wTimerRes;

if (timeGetDevCaps(&tc, sizeof(TIMECAPS)) != TIMERR_NOERROR)
// this call queries the system's timer hardware capabilities
// it returns wPeriodMin and wPeriodMax in the TIMECAPS structure
{
    // Error; application can't continue.
}

// finding the minimum possible interrupt period:
wTimerRes = min(max(tc.wPeriodMin, TARGET_PERIOD), tc.wPeriodMax);
// and setting the minimum period (link against winmm.lib):
timeBeginPeriod(wTimerRes);
// call timeEndPeriod(wTimerRes) when the higher resolution is no longer needed
This will force the system to run at its maximum interrupt frequency. As a consequence, the update of the system time will also happen more often, and the granularity of the system time increment will be close to 1 millisecond on most systems.
When you need resolution/granularity beyond this, you'd have to look into QueryPerformanceCounter. But this is to be used with care when using it over longer periods of time. The frequency of this counter can be obtained by a call to QueryPerformanceFrequency. The OS considers this frequency a constant and will give the same value all the time. However, some hardware produces this frequency, and the true frequency differs from the given value. It has an offset and it shows thermal drift. Thus the error should be assumed to be in the range of several to many microseconds per second. More details about this can be found in the second "here" link above.
Linux:
The situation looks somewhat different for Linux. See this to get an idea. Linux mixes information from the CMOS clock using the function getnstimeofday (for seconds since the epoch) and information from a high-frequency counter (for the microseconds) using the function timekeeping_get_ns. This is not trivial and is questionable in terms of accuracy, since both sources are backed by different hardware. The two sources are not phase-locked, thus it is possible to get more/less than one million microseconds per second.
The Windows system clock only ticks every few milliseconds -- in your case 64 times per second, so when it does tick it increases the system time by 15.625 ms.
The solution is to use a higher-resolution timer than the system time (QueryPerformanceCounter).
You still won't see .414723, .414723+x, .414723+2x, .414723+3x, ..., .414723+nx, though, because your code will not run exactly once every x microseconds. It will run as fast as it can, but there's no particular reason that should always be a constant speed, or that if it is, it's an integer number of microseconds.
I recommend you to look at the C++11 <chrono> header.
high_resolution_clock (C++11): the clock with the shortest tick period available
The tick period referred to here is the interval at which the clock is updated. If we look in more detail:
template<
class Rep,
class Period = std::ratio<1>
> class duration;
Class template std::chrono::duration represents a time interval.
It consists of a count of ticks of type Rep and a tick period, where the tick period is a compile-time rational constant representing the number of seconds from one tick to the next.
Previously, functions like gettimeofday would give you a time expressed in microseconds; however, they would utterly fail to tell you the interval at which this time expression was refreshed.
In the C++11 Standard, this information is now in the clear, to make it obvious that there is no relation between the unit in which the time is expressed and the tick period. And that, therefore, you definitely need to take both into account.
The tick period is extremely important when you want to measure durations that are close to it. If the duration you wish to measure is inferior to the tick period, then you will measure it "discretely" like you observed: 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, ... I advise caution at this point.
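To make the two notions concrete, here is a minimal sketch using <chrono>; steady_clock is used here purely for illustration, and the work being timed is a placeholder:
#include <chrono>
#include <iostream>

int main() {
    auto t0 = std::chrono::steady_clock::now();
    // work to be timed ...
    auto t1 = std::chrono::steady_clock::now();

    auto us = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0);
    std::cout << us.count() << " us elapsed\n";

    // The clock's tick period is a compile-time ratio, in seconds:
    std::cout << "tick period: "
              << std::chrono::steady_clock::period::num << "/"
              << std::chrono::steady_clock::period::den << " s\n";
    return 0;
}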
This is because the process running your code isn't always scheduled to execute.
Whilst it does, it will bang round the loop quickly, printing multiple values for each microsecond - which is a comparatively long period of time on modern CPUs.
There are then periods where it is not scheduled to execute by the system, and therefore cannot print values.
If what you want to do is execute every microsecond, this may be possible with some real-time operating systems running on high performance hardware.
I have a problem with using time.
I want to get microseconds on Windows using C++.
I can't find a way to do it.
The "canonical" answer was given by unwind :
One popular way is using the QueryPerformanceCounter() call.
There are, however, a few problems with this method:
it's intended for measurement of time intervals, not time. This means you have to write code to establish "epoch time" from which you will measure precise intervals. This is called calibration.
As you calibrate your clock, you also need to periodically adjust it so it's never too much out of sync (this is called drift) with your system clock.
QueryPerformanceCounter is not implemented in user space; this means a context switch is needed to call the kernel side of the implementation, and that is relatively expensive (around 0.7 microseconds). This seems to be required to support legacy hardware.
Not all is lost, though. Points 1 and 2 are something you can do with a bit of coding; 3 can be replaced with a direct call to RDTSC (available in newer versions of Visual C++ via the __rdtsc() intrinsic), as long as you know the accurate CPU clock frequency. Although on older CPUs such a call would be susceptible to changes in the CPU's internal clock speed, on all newer Intel and AMD CPUs it is guaranteed to give fairly accurate results and won't be affected by changes in CPU clock speed (e.g. power-saving features).
Let's get started with 1. Here is the data structure to hold the calibration data:
struct init
{
long long stamp; // last adjustment time
long long epoch; // last sync time as FILETIME
long long start; // counter ticks to match epoch
long long freq; // counter frequency (ticks per 10ms)
void sync(int sleep);
};
init data_[2] = {};
const init* volatile init_ = &data_[0];
Here is the code for initial calibration; it has to be given the time (in milliseconds) to wait for the clock to move; I've found that 500 milliseconds gives pretty good results (the shorter the time, the less accurate the calibration). For the purposes of calibration we are going to use QueryPerformanceCounter() etc. You only need to call it for data_[0], since data_[1] will be updated by the periodic clock adjustment (below).
void init::sync(int sleep)
{
LARGE_INTEGER t1, t2, p1, p2, r1, r2, f;
int cpu[4] = {};
// prepare for rdtsc calibration - affinity and priority
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
SetThreadAffinityMask(GetCurrentThread(), 2);
Sleep(10);
// frequency for time measurement during calibration
QueryPerformanceFrequency(&f);
// for explanation why RDTSC is safe on modern CPUs, look for "Constant TSC" and "Invariant TSC" in
// Intel(R) 64 and IA-32 Architectures Software Developer’s Manual (document 253668.pdf)
__cpuid(cpu, 0); // flush CPU pipeline
r1.QuadPart = __rdtsc();
__cpuid(cpu, 0);
QueryPerformanceCounter(&p1);
// sleep some time, doesn't matter it's not accurate.
Sleep(sleep);
// wait for the system clock to move, so we have exact epoch
GetSystemTimeAsFileTime((FILETIME*) (&t1.u));
do
{
Sleep(0);
GetSystemTimeAsFileTime((FILETIME*) (&t2.u));
__cpuid(cpu, 0); // flush CPU pipeline
r2.QuadPart = __rdtsc();
} while(t2.QuadPart == t1.QuadPart);
// measure how much time has passed exactly, using more expensive QPC
__cpuid(cpu, 0);
QueryPerformanceCounter(&p2);
stamp = t2.QuadPart;
epoch = t2.QuadPart;
start = r2.QuadPart;
// calculate counter ticks per 10ms
freq = f.QuadPart * (r2.QuadPart-r1.QuadPart) / 100 / (p2.QuadPart-p1.QuadPart);
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_NORMAL);
SetThreadAffinityMask(GetCurrentThread(), 0xFF);
}
With good calibration data you can calculate exact time from cheap RDTSC (I measured the call and calculation to be ~25 nanoseconds on my machine). There are three things to note:
the return type is binary compatible with the FILETIME structure and is precise to 100 ns, unlike GetSystemTimeAsFileTime (which increments in 10-30 ms or so intervals, or 1 millisecond at best).
in order to avoid expensive integer-to-double-to-integer conversions, the whole calculation is performed in 64-bit integers. Even though these can hold huge numbers, there is a real risk of integer overflow, and so start must be brought forward periodically to avoid it. This is done in the clock adjustment.
we are making a copy of the calibration data, because it might have been updated during our call by the clock adjustment in another thread.
Here is the code to read current time with high precision. Return value is binary compatible with FILETIME, i.e. number of 100-nanosecond intervals since Jan 1, 1601.
long long now()
{
// must make a copy
const init* it = init_;
// __cpuid(cpu, 0) - no need to flush CPU pipeline here
const long long p = __rdtsc();
// time passed from epoch in counter ticks
long long d = (p - it->start);
if (d > 0x80000000000ll)
{
// closing to integer overflow, must adjust now
adjust();
}
// convert 10ms to 100ns periods
d *= 100000ll;
d /= it->freq;
// and add to epoch, so we have proper FILETIME
d += it->epoch;
return d;
}
For clock adjustment, we need to capture the exact time (as provided by the system clock) and compare it against our clock; this will give us the drift value. Next we use a simple formula to calculate the "adjusted" CPU frequency, to make our clock meet the system clock at the time of the next adjustment. Thus it is important that adjustments are called at regular intervals; I've found that it works well when called at 15-minute intervals. I use CreateTimerQueueTimer, called once at program startup to schedule adjustment calls (not demonstrated here).
The slight problem with capturing an accurate system time (for the purpose of calculating drift) is that we need to wait for the system clock to move, and that can take up to 30 milliseconds or so (which is a long time). If the adjustment is not performed, we would risk integer overflow inside the function now(), not to mention uncorrected drift from the system clock. There is built-in protection against overflow in now(), but we really don't want to trigger it synchronously in a thread which happened to call now() at the wrong moment.
Here is the code for periodic clock adjustment, clock drift is in r->epoch - r->stamp:
void adjust()
{
// must make a copy
const init* it = init_;
init* r = (init_ == &data_[0] ? &data_[1] : &data_[0]);
LARGE_INTEGER t1, t2;
// wait for the system clock to move, so we have exact time to compare against
GetSystemTimeAsFileTime((FILETIME*) (&t1.u));
long long p = 0;
int cpu[4] = {};
do
{
Sleep(0);
GetSystemTimeAsFileTime((FILETIME*) (&t2.u));
__cpuid(cpu, 0); // flush CPU pipeline
p = __rdtsc();
} while (t2.QuadPart == t1.QuadPart);
long long d = (p - it->start);
// convert 10ms to 100ns periods
d *= 100000ll;
d /= it->freq;
r->start = p;
r->epoch = d + it->epoch;
r->stamp = t2.QuadPart;
const long long dt1 = t2.QuadPart - it->epoch;
const long long dt2 = t2.QuadPart - it->stamp;
const double s1 = (double) d / dt1;
const double s2 = (double) d / dt2;
r->freq = (long long) (it->freq * (s1 + s2 - 1) + 0.5);
InterlockedExchangePointer((volatile PVOID*) &init_, r);
// if you have log output, here is good point to log calibration results
}
Lastly, two utility functions. One converts a FILETIME (including output from now()) to a SYSTEMTIME while preserving the microseconds in a separate int. The other returns the frequency, so your program can use __rdtsc() directly for accurate measurements of time intervals (with nanosecond precision).
void convert(SYSTEMTIME& s, int &us, long long f)
{
LARGE_INTEGER i;
i.QuadPart = f;
FileTimeToSystemTime((FILETIME*) (&i.u), &s);
s.wMilliseconds = 0;
LARGE_INTEGER t;
SystemTimeToFileTime(&s, (FILETIME*) (&t.u));
us = (int) (i.QuadPart - t.QuadPart)/10;
}
long long frequency()
{
// must make a copy
const init* it = init_;
return it->freq * 100;
}
Well, of course none of the above is more accurate than your system clock, which is unlikely to be more accurate than a few hundred milliseconds. The purpose of a precise clock (as opposed to an accurate one), as implemented above, is to provide a single measure which can be used for both:
cheap and very accurate measurement of time intervals (not wall time),
a much less accurate, but monotonic and consistent with the above, measure of wall time
I think it does this pretty well. Example uses are logs, where one can use timestamps not only to find the time of events, but also to reason about internal program timings, latency (in microseconds), etc.
I leave the plumbing (call to initial calibration, scheduling adjustment) as an exercise for gentle readers.
You can use the Boost date_time library.
You can use boost::posix_time::hours, boost::posix_time::minutes,
boost::posix_time::seconds, boost::posix_time::millisec, boost::posix_time::microsec, boost::posix_time::nanosec
http://www.boost.org/doc/libs/1_39_0/doc/html/date_time.html
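For instance, a rough sketch with Boost's microsecond-resolution clock (assuming Boost.Date_Time is available; the work being timed is a placeholder):
#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main() {
    using namespace boost::posix_time;

    ptime start = microsec_clock::local_time();
    // work to be timed ...
    ptime stop = microsec_clock::local_time();

    time_duration elapsed = stop - start;
    std::cout << elapsed.total_microseconds() << " us\n";
    return 0;
}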
One popular way is using the QueryPerformanceCounter() call. This is useful if you need high-precision timing, such as for measuring durations that only take on the order of microseconds. I believe this is implemented using the RDTSC machine instruction.
There might be issues though, such as the counter frequency varying with power-saving, and synchronization between multiple cores. See the Wikipedia link above for details on these issues.
Take a look at the Windows APIs GetSystemTime() / GetLocalTime() or GetSystemTimeAsFileTime().
GetSystemTimeAsFileTime() expresses time in 100-nanosecond intervals, that is, 1/10 of a microsecond. All of these functions provide the current time with millisecond accuracy.
EDIT:
Keep in mind that on most Windows systems the system time is only updated about every 1 millisecond. So even though the time is represented with microsecond precision, that does not mean you can acquire it with such precision.
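A minimal sketch of the GetSystemTimeAsFileTime() route, printing the raw value in microseconds (subject to the ~1 ms update granularity mentioned above):
#include <windows.h>
#include <iostream>

int main() {
    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);        // 100-ns intervals since January 1, 1601 (UTC)

    ULARGE_INTEGER t;
    t.LowPart  = ft.dwLowDateTime;
    t.HighPart = ft.dwHighDateTime;

    std::cout << "microseconds since 1601: " << t.QuadPart / 10 << "\n";
    return 0;
}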
Take a look at this: http://www.decompile.com/cpp/faq/windows_timer_api.htm
Maybe this can help:
NTSTATUS WINAPI NtQuerySystemTime(__out PLARGE_INTEGER SystemTime);
SystemTime [out] - a pointer to a LARGE_INTEGER structure that receives the system time. This is a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC).
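Since NtQuerySystemTime is an internal NT API, one cautious way to try it is to resolve it from ntdll.dll at run time; a sketch, with error handling kept minimal:
#include <windows.h>
#include <iostream>

typedef LONG (WINAPI *NtQuerySystemTimePtr)(PLARGE_INTEGER);

int main() {
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");     // ntdll is always loaded
    NtQuerySystemTimePtr ntQuerySystemTime =
        (NtQuerySystemTimePtr)GetProcAddress(ntdll, "NtQuerySystemTime");
    if (ntQuerySystemTime == NULL) return 1;

    LARGE_INTEGER t;
    ntQuerySystemTime(&t);   // 100-ns intervals since January 1, 1601 (UTC)
    std::cout << "microseconds since 1601: " << t.QuadPart / 10 << "\n";
    return 0;
}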