The source linked here says that it is supposed to work on the iPhone. I have worked with it, but I get two errors saying that msleep() is undeclared. I have tried including unistd.h, time.h, and numerous others. How can I get this to work? Thanks.
msleep() is a non-standard artifact of the early BSDs, from before clock_nanosleep() and nanosleep() made it into POSIX.
It is unportable. On some systems it is available by default; on others you have to compile the code with the _BSD_SOURCE define.
The iPhone is a distant relative of Mac OS X, which is a distant relative of NeXT, which is a very distant relative of BSD 4.x. So the function might still be lurking in some header/library somewhere, but you shouldn't use it anyway. If memory serves me right, check NSThread's sleepForTimeInterval: class method.
There is nothing in that linked thread stating that msleep is available. The original author, bagusflyer, actually implemented their own msleep, stating:
Sorry. Maybe I missed something in my code. Here is my msleep:
#include <sys/time.h>

void msleep (unsigned int ms) {
    int microsecs;
    struct timeval tv;

    microsecs = ms * 1000;
    tv.tv_sec = microsecs / 1000000;
    tv.tv_usec = microsecs % 1000000;
    select (0, NULL, NULL, NULL, &tv);
}
However, you should be careful about using that code, since I think, from memory, that select() is interruptible.
Maybe you can use usleep(). It is also in unistd.h.
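For example, a minimal replacement built on usleep() could look something like this (a sketch; the name msleep_usleep is made up, and note that usleep() takes microseconds and may reject intervals of a second or more on some systems):

#include <unistd.h>

// hypothetical helper: sleep for "ms" milliseconds using usleep()
void msleep_usleep(unsigned int ms)
{
    usleep(ms * 1000);   // usleep() expects microseconds
}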
Let's say I have a function like:
#include <algorithm>

template<typename It, typename Cmp>
void mysort( It begin, It end, Cmp cmp )
{
    std::sort( begin, end, cmp );
}
When I compile this using -finstrument-functions-after-inlining with clang++ (clang++ --version):
clang version 11.0.0 (...)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: ...
The instrumented code explodes the execution time, because my entry and exit functions are called for every call of
void std::__introsort_loop<...>(...)
void std::__move_median_to_first<...>(...)
I'm sorting a really big array, so my program doesn't finish: without instrumentation it takes around 10 seconds, with instrumentation I've cancelled it at 10 minutes.
I've tried adding __attribute__((no_instrument_function)) to mysort (and the function that calls mysort), but this doesn't seem to have an effect as far as these standard library calls are concerned.
Does anyone know if it is possible to ignore function instrumentation for the internals of a standard library function like std::sort? Ideally, I would only have mysort instrumented, so a single entry and a single exit!
I see that clang++ sadly does not yet support anything like -finstrument-functions-exclude-function-list or -finstrument-functions-exclude-file-list, but g++ does not yet support -finstrument-functions-after-inlining, which I would ideally have, so I'm stuck!
EDIT: After playing around more, it appears the effect on execution time is actually smaller than described above, so this isn't the end of the world. The problem still remains, however, because most people who do function instrumentation in clang will only care about the application code, and not functions linked in from (for example) the standard library.
EDIT2: To further highlight the problem now that I've got it running in a reasonable time frame: the resulting trace that I produce from the instrumented code with those two standard library functions is 15GB. When I hard code my tracing to ignore the two function addresses, the resulting trace is 3.7MB!
I've run into the same problem. It looks like support for these flags was once proposed, but never merged into the main branch.
https://reviews.llvm.org/D37622
This is not a direct answer, since the tool doesn't support what you want to do, but I think I have a decent work-around. What I wound up doing was creating a "skip list" of sorts inside the instrumentation hooks (__cyg_profile_func_enter and __cyg_profile_func_exit). I would guess the part contributing most to your execution time is the printing, so if you can short-circuit the profile functions for the functions you don't care about, that should help, even if it's not the most ideal. At the very least it will limit the size of the output file.
Something like
#include <stddef.h>   // size_t
#include <stdint.h>   // uintptr_t

uintptr_t skipAddrs[] = {
    // assuming 64-bit addresses
    0x123456789abcdef, 0x2468ace2468ace24
};
size_t arrSize = 0;

int main(void)
{
    ...
    arrSize = sizeof(skipAddrs)/sizeof(skipAddrs[0]);
    // https://stackoverflow.com/a/37539/12940429
    ...
}

void __cyg_profile_func_enter (void *this_fn, void *call_site) {
    for (size_t idx = 0; idx < arrSize; idx++) {
        if ((uintptr_t) this_fn == skipAddrs[idx]) {
            return;   // short-circuit: skip tracing for addresses on the skip list
        }
    }
    // ... normal tracing/printing for every other function goes here ...
}
I use something like objdump -t binaryFile to examine the symbol table and find what the addresses are for each function.
If you specifically want to ignore library calls, something that might work is examining the symbol table of your object file(s) before linking against libraries, then ignoring all the ones that appear new in the final binary.
All this should be possible with things like grep, awk, or python.
You have to add the __attribute__((no_instrument_function)) attribute to the functions that should not be instrumented. Unfortunately it is not easy to make this work with C/C++ standard library functions, because it would require editing all of the library's functions.
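For your own functions it is just a matter of placement; for example, the mysort wrapper from the question could be marked like this (a sketch; as the question already notes, the internals of std::sort are unaffected):

#include <algorithm>

template<typename It, typename Cmp>
__attribute__((no_instrument_function))
void mysort( It begin, It end, Cmp cmp )
{
    // mysort itself is no longer instrumented, but the calls made
    // inside std::sort (e.g. std::__introsort_loop) still are
    std::sort( begin, end, cmp );
}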
There are some hacks you can do, like redefining existing macros from libc++'s include/__config to add this attribute as well, e.g.:
-D_LIBCPP_INLINE_VISIBILITY=__attribute__((no_instrument_function,internal_linkage))
Make sure to append no_instrument_function to the existing macro definition (rather than replace it) to avoid unexpected errors.
Identifying the Problem
I was busy editing a library for lua bindings to rtmidi. I wanted to fix compilation compatibility for MinGW-GCC and LLVM/Clang. When I was done making the edits and compiling the bindings, I noticed a weird timing issue caused by std::this_thread::sleep_for() when compared to MSVC.
I understand that there are bound to be some scheduling differences between different compilers, but in the following examples you can hear large timing issues:
MIDI playback using MSVC compiled bindings
MIDI playback using GCC compiled bindings
I have narrowed it down to this piece of code:
lua_pushliteral(L, "sleep");
lua_pushcfunction(L, [] (lua_State *L) {
    auto s = std::chrono::duration<lua_Number>(luaL_checknumber(L, 1));
    std::this_thread::sleep_for(s);
    return 0;
});
lua_rawset(L, -3);
Obviously it's about these two lines:
auto s = std::chrono::duration<lua_Number>(luaL_checknumber(L, 1));
std::this_thread::sleep_for(s);
The average waiting time that is passed to sleep_for() is around 0.01s, with some calls here and there between 0.002s - 0.005s.
Troubleshooting
First off, I checked whether the problem was specific to my current version of GCC (9.2.0) by trying a different version and even LLVM/Clang.
Both GCC 8.1.0 and LLVM/Clang 9.0.0 yield the same results.
At this point I can conclude there is some weird scheduling going on in the winpthreads runtime, since both MinGW-GCC and LLVM/Clang depend on it and MSVC does not.
After that I tried to switch out the code with the Windows Sleep() call. I had to multiply by 1000 to adjust for the correct timing.
Sleep(luaL_checknumber(L, 1) * 1000);
As I expected, the timing issue is not present here, which tells me that winpthreads is indeed the culprit.
Obviously I do not want to call the Windows Sleep() directly; I would rather keep using sleep_for() for the sake of (cross-platform) portability.
The Questions
So based on what I gathered I have the following questions:
Is winpthreads indeed the culprit? Am I perhaps missing some compiler defines that would solve the problem?
If winpthreads is indeed the culprit, why are the timing differences so big?
If there is no compiler-define 'fix', what would you recommend to tackle the problem?
To partially answer the third question (should it come to that), I was thinking of doing something like:
#if defined(_WIN32) && defined(__MINGW32__)
#include <windows.h>
#endif
...
#if defined(_WIN32) && defined(__MINGW32__)
    Sleep(luaL_checknumber(L, 1) * 1000);
#elif defined(_WIN32) && defined(_MSC_VER)
    auto s = std::chrono::duration<lua_Number>(luaL_checknumber(L, 1));
    std::this_thread::sleep_for(s);
#endif
Of course the problem arises that Windows' Sleep() call is less precise (or so I've read).
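If it does come to that, one way to keep the conditional in a single place is to wrap it in a small helper that the Lua binding calls, roughly like this (a sketch; the name do_sleep is made up, and __MINGW32__ is assumed to be the right macro to test for a MinGW build):

#include <chrono>
#include <thread>
#if defined(_WIN32) && defined(__MINGW32__)
#include <windows.h>
#endif

// sleep for the given number of seconds (fractions allowed)
static void do_sleep(double seconds)
{
#if defined(_WIN32) && defined(__MINGW32__)
    Sleep(static_cast<DWORD>(seconds * 1000));   // Sleep() takes milliseconds
#else
    std::this_thread::sleep_for(std::chrono::duration<double>(seconds));
#endif
}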
Here is some C++ code illustrating my problem with a minimal example:
// uncomment the next line, to make it hang up:
//#define BOOST_DATE_TIME_POSIX_TIME_STD_CONFIG //needed for nanosecond support of boost
#include <boost/thread.hpp>
#include <iostream>   // for std::cout / std::cerr

void foo()
{
    while(true);
}

int main(int noParameters, char **parameterArray)
{
    boost::thread MyThread(&foo);

    if ( MyThread.timed_join( boost::posix_time::seconds(1) ) )
    {
        std::cout<<"\nDone!\n";
    }
    else
    {
        std::cerr<<"\nTimed out!\n";
    }
}
As long as I don't turn on the nanosecond support, everything works as expected, but as soon as I uncomment the #define needed for nanosecond support in boost::posix_time, the program doesn't get past the if-statement any more, just as if I had called join() instead of timed_join().
Now I've already figured out that this happens because BOOST_DATE_TIME_POSIX_TIME_STD_CONFIG changes the actual data representation of the timestamps from a single 64-bit integer to 64+32 bits. A lot of boost stuff is implemented completely inside the headers, but the thread methods are not, and because of that they cannot adapt to the new data format without being recompiled with the appropriate options. Since the code is meant to run on an external server, compiling my own version of boost is not an option, and neither is turning off the nanosecond support.
Therefore my question is as follows: Is there a way to pass a value (on the order of seconds) to timed_join() without using the incompatible 96-bit posix_time methods and without modifying the standard boost packages?
I'm running on Ubuntu 12.04 with boost 1.46.1.
Unfortunately I don't think your problem can be cleanly solved as written. Since the library you're linking against was compiled without nanosecond support, by definition you violate the one-definition rule if you happen to enable nanosecond support for any piece that's already compiled into the library binary. In this case, you're enabling it across the function calls to timed_join.
The obvious solution is to decide which is less painful to give up: Building your own boost, or removing nanosecond times.
The less obvious "hack" that may or may not totally work is to write your own timed_join wrapper that takes a thread object and an int representing seconds or ms or whatever. This function is then implemented in a source file of its own, which does not enable nanosecond times, for the specific purpose of calling into the compiled boost binary. Again I want to stress that if at any point you fail to completely segregate such usages you'll violate the one-definition rule and run into undefined behavior.
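A rough sketch of that idea (the file and function names here are made up): put the wrapper in its own translation unit, compiled without BOOST_DATE_TIME_POSIX_TIME_STD_CONFIG, and only expose a declaration that takes plain types:

// timed_join_wrapper.cpp -- compiled WITHOUT BOOST_DATE_TIME_POSIX_TIME_STD_CONFIG,
// so it sees the same posix_time layout as the prebuilt boost binary
#include <boost/thread.hpp>

bool timed_join_seconds(boost::thread &t, int seconds)
{
    return t.timed_join(boost::posix_time::seconds(seconds));
}

The rest of the program, which does enable nanosecond support, only ever sees the declaration bool timed_join_seconds(boost::thread&, int); and never passes a posix_time object across that boundary.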
#include <fcntl.h>       // for O_CREAT
#include <semaphore.h>
#include <iostream>

sem_t* x;

int main ()
{
    x = sem_open("x", O_CREAT, 0, 0);
    sem_wait(x); sem_wait(x); sem_wait(x);
    std::cout << "\ndone\n";
}
This code shouldn't even pass the first sem_wait() but on my system it reaches the end of main(). Everything I have read, such as here and here, say that, although Mac OS X does not support sem_init(), it does support sem_open(). However, using sem_open() as above hasn't fixed the problem. I'm running OS X 10.5.7.
Try putting sem_unlink("x"); before sem_open(); I'm sure this isn't your first attempt at creating the semaphore. Also, a mode of 0 won't let you do much with it unless you remove and recreate it. Finally, do check your calls for errors; that may not resolve the problem by itself, but it will at least sharpen your question.
Permissions of 0 to sem_open mean that nobody can access the semaphore. You really should add proper error checking; it will tell you which function is failing and why.
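For comparison, a version with usable permissions and basic error checking might look roughly like this (a sketch; the mode 0644 and the leading slash in the name are just example choices, and sem_unlink() is called first so that a semaphore left over from an earlier run doesn't mask the new flags):

#include <fcntl.h>        // O_CREAT
#include <semaphore.h>
#include <cstdio>         // perror

int main()
{
    sem_unlink("/x");                              // discard any stale semaphore from a previous run
    sem_t *x = sem_open("/x", O_CREAT, 0644, 0);   // owner can read/write, initial value 0
    if (x == SEM_FAILED) {
        perror("sem_open");
        return 1;
    }
    if (sem_wait(x) == -1) {                       // with an initial value of 0 this blocks, as intended
        perror("sem_wait");
        return 1;
    }
    return 0;
}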
I recently came across the need to sleep the current thread for an exact period of time. I know of two methods of doing so on a POSIX platform: using nanosleep() or using boost::this_thread::sleep().
Out of curiosity more than anything else, I was wondering what the differences are between the two approaches. Is there any difference in precision, and is there any reason not to use the Boost approach?
nanosleep() approach:
#include <time.h>
...
struct timespec sleepTime;
struct timespec returnTime;
sleepTime.tv_sec = 0;
sleepTime.tv_nsec = 1000;
nanosleep(&sleepTime, &returnTime);
Boost approach:
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/thread/thread.hpp>
...
boost::this_thread::sleep(boost::posix_time::nanoseconds(1000));
The few reasons to use boost that I can think of:
- boost::this_thread::sleep() is an interruption point in boost.thread
- boost::this_thread::sleep() can be drop-in replaced by C++0x's std::this_thread::sleep_until() in the future
As for why not: if you're not using threads at all, or if everything else in your project uses POSIX calls, then nanosleep() makes more sense.
As for precision, on my system both boost and nanosleep() call the same system call, hrtimer_nanosleep(). I imagine boost authors try to get the highest precision possible on each system and for me it happens to be the same thing as what nanosleep() provides.
How about because your nanosleep example is wrong?
#include <errno.h>
#include <time.h>
...
struct timespec sleepTime;
struct timespec time_left_to_sleep;
sleepTime.tv_sec = 0;
sleepTime.tv_nsec = 1000;
/* if a signal interrupts the sleep, go back to sleep for whatever time is left */
while( nanosleep(&sleepTime, &time_left_to_sleep) == -1 && errno == EINTR )
{
    sleepTime.tv_sec = time_left_to_sleep.tv_sec;
    sleepTime.tv_nsec = time_left_to_sleep.tv_nsec;
}
Admittedly if you're only sleeping for 1 microsecond waking up too early shouldn't be an issue, but in the general case this is the only way to get it done.
And just to ice the cake in boost's favor, boost::this_thread::sleep() is implemented using nanosleep(). They just took care of all the insane corner cases for you.
is there any reason not to use the Boost approach
I suppose this is kind of obvious, but the only reason I can think of is that you'd require boost to compile your project.
For me the main reason for using the boost variant is platform independence. If you are required to compile your application for both POSIX and Windows platforms, for example, a platform-specific sleep is not sufficient.