It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
Is it possible to multithread subroutines? I would like to use OpenMP to run the ZGEEV subroutine from LAPACK on multiple cores to speed things up. Is that even possible?
Yes, in this case. LAPACK uses BLAS to get performance, and there are a number of efficient, multithreaded versions of BLAS (e.g. MKL, ACML, ATLAS). So you can use the threads at that level to get improved performance, though I must say that in my experience the speed-up is limited for diagonalisers.
More generally, though, you will have to parallelise the code yourself. In this case you get lucky because threaded versions of the important layer already exist.
Closed 10 years ago.
My platform is VS2010 on Windows Server 2003, and I have an application that works well. An integer in it is protected by a critical section; when I replace the critical section with boost::detail::spinlock, the application deadlocks.
It's boost::detail::spinlock. That means it's intended for internal use only. If you want a portable replacement for critical sections, use boost::mutex from Boost.Thread.
It's boost::detail::spinlock. Spinlocks usually busy-wait, which makes them faster, but usable only under tightly controlled conditions.
Boost 1.53 (the latest release) finally got Boost.Atomic, which is a portable (and C++11 compatible) replacement for interlocked operations.
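As a rough sketch of what the atomic approach could look like for a shared integer like the one in the question, the following uses std::atomic from C++11, whose interface Boost.Atomic mirrors. The function name and its parameters are made up for illustration. On a compiler without <atomic> and <thread> (VS2010, for instance), the assumption is that boost::atomic<int> from <boost/atomic.hpp> and boost::thread can be substituted.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper (not from the question): several threads increment one
// shared counter without a critical section; the final value is returned.
int count_with_atomic(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    std::atomic<int> counter(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write: no lock is held, so a thread can
                // never block while another one owns the counter.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // joining makes all increments visible to this thread
    }
    return counter.load();
}
```

Calling count_with_atomic(4, 10000) should return 40000 on every run, since no increment can be lost.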
Closed 10 years ago.
Here is my source code for testing multithreaded performance in C++. Please tell me why the time for ONE running thread (waited on with WaitForMultipleObjects()) is about 5x smaller than for the first, sequential run. I expected almost the same result for the sequential run and for a run with only one thread. Thanks.
http://pastebin.com/EeJ5qW03
The OS decides when your thread starts running, and possibly whether it needs to be dispatched at all. In addition, it may have to create a separate stack for your thread.
Read about the overhead of thread creation. All in all, the overhead is system-specific.
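One way to get a feel for that overhead on a given system is to time the creation and joining of threads that do no work. The sketch below uses std::thread rather than the Win32 calls from the question, and the function name is hypothetical; the numbers it produces depend entirely on the OS and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: average cost, in microseconds, of creating and
// joining a thread whose body is empty.
double average_thread_start_cost_us(int repetitions) {
    assert(repetitions > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repetitions; ++i) {
        std::thread t([] {});  // the thread does nothing
        t.join();              // wait until it has finished
    }
    const std::chrono::duration<double, std::micro> elapsed =
        clock::now() - start;
    return elapsed.count() / repetitions;
}
```

Comparing this figure with the run time of the work being measured shows whether thread start-up is a significant part of the result.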
Closed 11 years ago.
I was going through Scott Meyers' podcast "CPU Caches and Why You Care". It seems this will make code run faster. Is there any open-source project where such coding is done, for reference?
Or does anybody have an example of data structures or algorithms designed to be CPU-cache-aware?
Sure, the entire Linux kernel is implemented to be cache-aware.
For more details, the paper What Every Programmer Should Know About Memory is highly recommended.
Linear algebra is sensitive to cache problems. The BLAS subroutines allow one to abstract away from these concerns.
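A small illustration of the kind of cache effect discussed here is the traversal order of a matrix stored in one contiguous array. This is only a sketch with made-up function names, not code from any of the projects mentioned: both functions compute the same sum, but the first walks memory contiguously while the second jumps by a full row on every step, which tends to cause far more cache misses on large matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a rows x cols matrix stored row-major, reading memory contiguously.
long long sum_row_major(const std::vector<int>& m,
                        std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r * cols + c];  // consecutive addresses
        }
    }
    return total;
}

// Same sum, but the inner loop strides by `cols` elements, so each access
// typically lands on a different cache line.
long long sum_column_major(const std::vector<int>& m,
                           std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            total += m[r * cols + c];  // stride of `cols` elements
        }
    }
    return total;
}
```

Timing the two functions on a matrix much larger than the last-level cache is a simple way to observe the difference on a particular machine.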
Closed 11 years ago.
I need to create a simple application, but speed is very important here.
It will generate all available character combinations and save them to a text file. The user will enter the length to generate, so the application will use a recursive function with a loop inside.
Will C be faster than C++ for this, or does it not matter?
Speed is very important because my application needs to generate and save 10 million+ words to a file.
It doesn't really matter; chances are your application will be I/O bound rather than CPU bound unless you have enough RAM to hold all that in memory.
It's much more important that you choose the best algorithm, and the best data structures to back that algorithm up.
Then implement that in the language you're most familiar with. C++ has the advantage of having easy to use containers in its standard libraries, but that's about it. You can write slow code in both, and fast code in both.
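For reference, a minimal C++ sketch of the recursive function with a loop inside that the question describes might look as follows. The function names and the choice to write one word per line are assumptions made for illustration. Writing through a std::ostream keeps the output buffered, which matters more here than the choice between C and C++.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical recursive step: fills position `pos` of `word` with every
// character of `alphabet`, then recurses into the next position.
void generate_from(const std::string& alphabet, std::string& word,
                   std::size_t pos, std::ostream& out, std::size_t& count) {
    if (pos == word.size()) {
        out << word << '\n';  // one complete word per line
        ++count;
        return;
    }
    for (char ch : alphabet) {
        word[pos] = ch;
        generate_from(alphabet, word, pos + 1, out, count);
    }
}

// Writes every word of `length` characters over `alphabet` to `out` and
// returns how many words were written.
std::size_t generate_words(const std::string& alphabet, std::size_t length,
                           std::ostream& out) {
    assert(!alphabet.empty());
    std::string word(length, alphabet[0]);  // reused buffer, no reallocation
    std::size_t count = 0;
    generate_from(alphabet, word, 0, out, count);
    return count;
}
```

Passing a std::ofstream as `out` writes the words to a file. The number of words grows as the alphabet size raised to the chosen length, so the file size is usually the limiting factor.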
Closed 11 years ago.
We are facing a possible problem on a machine with an ARM processor.
We have implemented smart pointers, which use atomic operations to keep track of the references.
We are getting crashes with this implementation.
Is there a possible problem with atomic operations on ARM processors?
It's possible, but it's way more likely that there is a bug in your code.
Perhaps you should post some code.
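Since no code was posted, the following is only a generic sketch of an atomic reference count, not the asker's implementation; the class and method names are hypothetical. It uses std::atomic from C++11. One assumption worth checking in code like this is memory ordering: ARM has a weaker memory model than x86, so a missing barrier around the final decrement can go unnoticed on x86 and still crash on ARM.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical intrusive reference count for illustration only.
class RefCounted {
public:
    RefCounted() : refs_(1) {}  // the creator holds the first reference

    void add_ref() {
        // Taking another reference needs atomicity but no ordering.
        refs_.fetch_add(1, std::memory_order_relaxed);
    }

    // Returns true when the caller dropped the last reference and is now
    // responsible for destroying the object.
    bool release() {
        const int previous = refs_.fetch_sub(1, std::memory_order_release);
        assert(previous >= 1);  // releasing more often than acquiring is a bug
        if (previous == 1) {
            // Make writes from other threads visible before destruction.
            std::atomic_thread_fence(std::memory_order_acquire);
            return true;
        }
        return false;
    }

    int use_count() const { return refs_.load(std::memory_order_relaxed); }

private:
    std::atomic<int> refs_;
};
```

A smart pointer built on this would call add_ref() when copied and delete the object when release() returns true.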