Let's say I have an array of 1,000,000 elements, about 90% of them 0s and 10% of them 1s.
To count the 1s, I can do:
int sum = 0;
for (int i = 0; i < size; i++) {
    sum += x[i];
}
But I thought a comparison might be cheaper than an addition, so this would be better:
int sum = 0;
for (int i = 0; i < size; i++) {
    if (x[i] == 1)
        sum++;
}
But I am not sure. Which one is faster?
It is hard to say which one is going to be faster without measuring, but even a slightly slower sequence of instructions without a branch will usually win, thanks to pipelining: the CPU never has to throw away work for a mispredicted branch.
In your case the 1s show up unpredictably, so the branch in the second version will be mispredicted roughly every time a 1 appears (about 10% of the iterations), and each misprediction costs many cycles, reducing the speed quite a bit.
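For what it's worth, here is a sketch of the branch-free approach (the 90/10 fill pattern below is made up just to have some data): since every element is 0 or 1, summing the array counts the 1s, and the compiler is free to vectorize either call.

#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> x(1000000);
    for (std::size_t i = 0; i < x.size(); ++i)
        x[i] = (i % 10 == 0) ? 1 : 0;                    // roughly 10% ones, for illustration only

    int sum   = std::accumulate(x.begin(), x.end(), 0);  // branch-free: just add every element
    auto ones = std::count(x.begin(), x.end(), 1);       // counts the 1s directly; compilers typically vectorize this too
    return (sum == ones) ? 0 : 1;
}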
Is it better to modify a file using positioning (seekg/seekp) while it sits on the hard drive, without loading it into RAM (into an object), or to read it as a whole into an object and then work on the object (delete, modify, add, ...)?
Which is better, mostly in terms of speed?
The answer depends on your use case. For one thing, there are cases where you cannot fit the whole file in RAM (if it is huge). Also, if you only need to perform a small change, loading the whole file is a huge overhead.
On the other hand, if you need to read or modify a large portion of the file multiple times and it is reasonably big, loading it into RAM makes sense and will improve performance.
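To make the trade-off concrete, here is a sketch of the in-place style, assuming a hypothetical existing file data.bin and a made-up offset: only the patched bytes are written, whereas reading the whole file into an object would move every byte through RAM first.

#include <fstream>

int main() {
    std::fstream f("data.bin", std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return 1;

    f.seekp(128);                       // jump straight to the spot we want to change
    const char patch[4] = {1, 2, 3, 4};
    f.write(patch, sizeof(patch));      // write 4 bytes in place; the rest of the file is untouched
    return f.good() ? 0 : 1;
}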
I was just looking at some algorithms for prime numbers and came across this:
for (int i = 2; i * i <= n; i++)
{ /* assume no operations here */ }
I was just wondering whether the above loop is faster than the following one:
int x = sqrt(n);
for (int i = 2; i <= x; i++)
{ /* nop */ }
It depends on the value of n, of course. In any case, sqrt() is not guaranteed to give you the exact result: because of rounding, you might end up with a value of x that is one less than expected, which would ruin the algorithm. Rather than going for a micro-optimisation, I would stick with correctness here and use the original version, which is guaranteed to give correct results.
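If you do want to hoist the square root out of the loop, here is a sketch of one way to do it safely (the name has_small_factor is just for illustration, and n is assumed to be non-negative); the integer correction steps undo any rounding error in sqrt():

#include <cmath>

bool has_small_factor(int n) {
    int limit = static_cast<int>(std::sqrt(static_cast<double>(n)));
    while ((long long)(limit + 1) * (limit + 1) <= n) ++limit;  // fix an underestimate
    while ((long long)limit * limit > n) --limit;               // fix an overestimate

    for (int i = 2; i <= limit; ++i)    // same trial division, but sqrt is computed only once
        if (n % i == 0) return true;
    return false;
}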
I've used Clojure for about two years now (having used Scheme/Lisp before that).
I'm getting to the point where I feel like I'm no longer learning Clojure "by osmosis", and I'm considering making a conscious effort to memorize the function names in clojure.core.
Question:
Has anyone else done this? If so, has it been a significant productivity boost?
I'm afraid that just having a function memorized is not enough to spot the need for it when it arises. It is better to learn in context, for example by solving the 4Clojure problems and then looking at the solutions of high-scoring users. Once you have seen a function in context, then you can memorize it.
What is an example program that realizes a performance gain by calling _mm_stream_si64x()?
The MSDN article on _mm_stream_si64x: http://msdn.microsoft.com/en-us/library/35b8kssy.aspx
Here's an example, assuming source and destination point to buffers of at least 100 MB:
const char *source;       // assumed to point to a 100 MB source buffer
char *destination;        // assumed to point to a 100 MB destination buffer

for (size_t offset = 0; offset < 100 * 1024 * 1024; offset += sizeof(__int64))
{
    // plain 64-bit stores: every store pulls the destination line into the cache
    *(__int64 *)(destination + offset) = *(__int64 *)(source + offset);
}
If you do this with plain stores instead of _mm_stream_si64x, you effectively flush the cache: every line of the 100 MB destination is pulled into the cache on its way out to memory, evicting data you may still need.
As the reference says, the _mm_stream_si64x intrinsic writes to the memory location pointed to by Dest directly, without writing Dest to the cache. So if you want to copy data to the Dest pointer but do not plan on reading it back until much later, this intrinsic 'realizes a performance gain' over the ordinary stores in the loop above.
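For completeness, here is a sketch of the streaming version (the name stream_copy and its parameters are purely illustrative, and the buffers are assumed to be 8-byte aligned):

#include <intrin.h>    // MSVC-specific; _mm_stream_si64x is x64-only
#include <stddef.h>

void stream_copy(char *destination, const char *source, size_t bytes)
{
    for (size_t offset = 0; offset < bytes; offset += sizeof(__int64))
    {
        __int64 value = *(const __int64 *)(source + offset);
        _mm_stream_si64x((__int64 *)(destination + offset), value);   // non-temporal store, bypasses the cache
    }
    _mm_sfence();   // make the streaming stores visible before any later reads
}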
What resources are available where I can learn how to represent a B-tree using a two-dimensional array? Searching on Google did not provide any fruitful results.
Ignoring the reasons why you might want to do this (no one would recommend it, which explains why Google doesn't have much on the subject), the trick is to use indexes into the array in place of pointers.
Then one dimension of the array represents the nodes of the tree, and the other dimension represents each node's child slots.
It's related to the problem you would solve if you had to write a B-tree out to disk, where the disk is essentially a one-dimensional array.
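Here is a sketch of that layout (MAX_NODES, ORDER and the values below are made up): child "pointers" become indexes into the rows of the same array, with -1 playing the role of a null pointer.

#include <cstring>

const int MAX_NODES = 16;   // capacity of the node pool
const int ORDER     = 4;    // maximum children per node
const int NO_CHILD  = -1;   // stands in for a null pointer

int children[MAX_NODES][ORDER];   // children[node][slot] = index of the child node
int keys[MAX_NODES][ORDER - 1];   // keys stored alongside each node

int main() {
    std::memset(children, NO_CHILD, sizeof(children));  // -1 is all 0xFF bytes, so memset works here

    // Node 0 is the root; give it one key and one child, node 1.
    keys[0][0] = 42;
    children[0][0] = 1;

    // Following a "pointer" is just another index into the same arrays:
    int child = children[0][0];
    return children[child][0] == NO_CHILD ? 0 : 1;      // node 1 is a leaf, so this returns 0
}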