Is += faster than -=? - c++

Full disclosure - I was inspired by Is x += a quicker than x = x + a?
That aside, I decided to test += vs -=. Simple tests reveal they're about the same. Then I tried something similar to:
std::vector<int> x;
for (int i = 0; i < 10000; i++)
    x.push_back(rand() % 10);
and called += and -= in proportions controlled by a given number k:
long long sum = 0;
for (size_t j = 0; j < x.size(); j++)
    if (x[j] < k)
        sum += x[j];
    else
        sum -= x[j];
So if k is, say, small, -= gets called more often (duuuh). I tried k = 2, which gives a higher proportion of -= calls, and k = 5, which should yield about the same number of -= and += calls.
The punchline: calling -= is about twice as fast as calling +=. Why would it be more efficient in this case?

I'm gonna jump in before Mysticial gets a hold of this and guess: branch prediction.
So, it's not the -= vs +=.
The condition x[j] < k can be better predicted when it's almost always true or almost always false than when the two outcomes are about equally likely.
For k = 2, only two values in ten satisfy x[j] < 2, so the condition is almost always false and easy to predict.
For k = 5, the outcomes are about equally likely and randomly distributed, so the branch is much harder to predict.
EDIT: See http://ideone.com/1PYMl - the extra stuff (the couts) is there to prevent the unused code from being optimized away.
tl;dr: Results for varying k:
k: 1 Time: 280
k: 2 Time: 360
k: 3 Time: 440
k: 4 Time: 520
k: 5 Time: 550
k: 6 Time: 510
k: 7 Time: 450
k: 8 Time: 360
k: 9 Time: 260
As you can see, the closer k is to making the condition vary chaotically, the longer the program takes; towards either end, it takes about half as long.
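One way to test the branch-prediction theory is to remove the branch entirely. Here is a minimal sketch (branchlessSum is my name, not from the original benchmark): the work per element is identical for every k, so if the timings flatten out across k with this version, the branch was indeed the culprit.

#include <cstddef>
#include <vector>

long long branchlessSum(const std::vector<int>& x, int k) {
    long long sum = 0;
    for (std::size_t j = 0; j < x.size(); j++) {
        // (x[j] < k) is 0 or 1, so sign is -1 or +1 -- no conditional jump
        long long sign = 2LL * (x[j] < k) - 1;
        sum += sign * x[j];
    }
    return sum;
}

(A modern optimizer may already emit a conditional move for the original loop, so measure before drawing conclusions.)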

Big O Notation Confusion (C++)

int f(const std::vector<int>& v) {
    int result = 0;
    for (int i = 0; i < v.size(); ++i) {         // O(N)
        for (int j = v.size(); j >= 0; j -= 2) { // O(N/2)
            result += v.at(i) * j;
        }
    }
    return result;
}
The inner for loop is said to be O(N/2); however, I am wondering why, because:
For example, if v.size() is 10, then
10 >= 0 ✓
8 >= 0 ✓
6 >= 0 ✓
4 >= 0 ✓
2 >= 0 ✓
0 >= 0 ✓
-2 Fails
The inner for loop is executed 6 times for an input size of 10.
What am I missing?
EDIT: I understand that only the highest-order term is taken into consideration. This question was more about how the original O(N/2 + 1) is derived.
Complexity gives you a way to assess the order of magnitude of the time an input of a certain size takes to complete, not the exact running time.
Therefore, when dealing with complexity, you should only consider the highest-order term, without constant multipliers:
O(N/2 + 1) = O(N/2) = O(N)
In a comment, you said:
I understand this, but I am just curious as to how O(N/2) is obtained
Take a look at the following table:
Size of vector    Number of times the inner loop is executed
0                 1
1                 1
2                 2
3                 2
...
100               51
101               51
...
2x                x + 1
2x + 1            x + 1
If you take the constant 1 out of that equation, the inner loop is O(N/2).
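If you want to check the N/2 + 1 count empirically, a tiny counting sketch will do (innerIterations is an illustrative helper, not from the original post):

#include <initializer_list>
#include <iostream>

// Counts how many times the body of "for (int j = n; j >= 0; j -= 2)" runs.
long innerIterations(int n) {
    long count = 0;
    for (int j = n; j >= 0; j -= 2)
        ++count;
    return count;
}

int main() {
    for (int n : {0, 1, 2, 3, 10, 100, 101})
        std::cout << n << " -> " << innerIterations(n) << "\n";
    // Prints 1, 1, 2, 2, 6, 51, 51 -- that is n/2 + 1 with integer division,
    // which is O(N/2) = O(N).
}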

Mutex gone wrong?

I am trying Pthreads and it's a pretty basic program: I have two shared variables (declared global) among all threads:
long Sum = 0;
long Sum1 = 0;
pthread_mutex_t mutexLock = PTHREAD_MUTEX_INITIALIZER;
In thread function:
for (int i = start; i < end; i++) // start and end are being passed to thread and they are being passed correctly
{
    pthread_mutex_lock(&mutexLock);
    Sum1 += i;
    Sum += Sum1;
    pthread_mutex_unlock(&mutexLock);
}
main(), in case one needs it for reference:
int main()
{
    pthread_t threadID[10];
    for (int i = 0; i < 10; i++)
    {
        int a = (i * 500) + 1;
        int b = (i + 1) * 500;
        ThreadStruct* obj = new ThreadStruct(a, b);
        pthread_create(&threadID[i], NULL, ThreadFunc, obj);
    }
    for (int i = 0; i < 10; i++)
    {
        pthread_join(threadID[i], NULL);
    }
    cout << "Sum: " << Sum << endl;
    cout << "Sum1: " << Sum1 << endl;
    return 0;
}
OUTPUT
Sum: 40220835000
Sum1: 12502500
Run again
Sum: 38720835000
Sum1: 12502500
Run again
Sum: 39720835000
Sum1: 12502500
PROBLEM
Why am I getting a different value for Sum on each run?
The rest of the code works fine, and the output of Sum1 is correct no matter how many times I run the code (the only issue is with Sum). Am I doing something wrong in my use of the mutex here?
UPDATE
If I use local variables as #molbdnilo specified in his well-detailed answer, this problem is solved. At first I thought the mutex was irrelevant here, but I tested a number of times and observed cases where not using a mutex makes the problem recur. So the solution to this problem (courtesy: answer by #molbdnilo) is to use local variables WITH a mutex, and I have tested it to work perfectly!
It's not a threading problem – the problem is that even though the order of additions to Sum1 doesn't matter, the order of additions to Sum does.
Consider the much shorter sum 1 + 2 + 3 and the following interleavings:
1:
Sum1 = 1 + 2 = 3
Sum = 0 + 3 = 3
Sum1 = 3 + 3 = 6
Sum = 3 + 6 = 9
2:
Sum1 = 1 + 3 = 4
Sum = 0 + 4 = 4
Sum1 = 4 + 2 = 6
Sum = 4 + 6 = 10
3:
Sum1 = 2 + 3 = 5
Sum = 0 + 5 = 5
Sum1 = 5 + 1 = 6
Sum = 5 + 6 = 11
You could solve this by having the threads compute their own sum-of-sums independently and adding them afterwards.
(Notice that there's no concurrent mutation here, so locking anything can't make any difference.)
For a more concrete example, let's limit your program to two threads and the sum from 1 to 6.
You then have one thread computing 1 + 2 + 3 and one doing 4 + 5 + 6.
At a glance, thread one should also compute 1 + (1 + 2) + (1 + 2 + 3) and thread 2, 4 + (4 + 5) + (4 + 5 + 6).
Except they don't – every time they use it, Sum may have been modified by the other thread.
So thread one may compute 1 + ((1 + 4) + 2) + ((1 + 4) + 2 + 3), or something else.
When you use local variables, you keep each thread's result independent of the others.
(I think this problem is a pretty good illustration of how shared mutable state can complicate things in unexpected ways, by the way.)
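To make that concrete, here is a minimal sketch of the local-variable approach (the field names are mine; the ranges match the question's main()). Each thread accumulates into its own struct, and main() combines the results after joining, so the output is deterministic:

#include <pthread.h>
#include <iostream>

// Each thread gets its own sums; nothing is shared while the threads run.
struct ThreadStruct {
    int start, end;   // inclusive range [start, end]; 1..5000 overall, matching Sum1 = 12502500
    long sum1 = 0;    // this thread's running total
    long sum = 0;     // this thread's sum of running totals
};

void* ThreadFunc(void* arg) {
    ThreadStruct* t = static_cast<ThreadStruct*>(arg);
    for (int i = t->start; i <= t->end; i++) {
        t->sum1 += i;     // purely local, so no lock is needed
        t->sum += t->sum1;
    }
    return nullptr;
}

int main() {
    pthread_t threadID[10];
    ThreadStruct obj[10];
    for (int i = 0; i < 10; i++) {
        obj[i].start = i * 500 + 1;
        obj[i].end = (i + 1) * 500;
        pthread_create(&threadID[i], nullptr, ThreadFunc, &obj[i]);
    }
    long Sum = 0, Sum1 = 0;
    for (int i = 0; i < 10; i++) {
        pthread_join(threadID[i], nullptr);
        Sum1 += obj[i].sum1;  // order no longer matters: addition is commutative
        Sum += obj[i].sum;
    }
    std::cout << "Sum: " << Sum << "\nSum1: " << Sum1 << std::endl;
    return 0;
}

Note that Sum is now the sum of each thread's running totals over its own range, a well-defined quantity, whereas the shared Sum in the original depends on thread ordering and has no single expected value.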

shortest path with two variables

So I'm trying to use a modified Bellman Ford algorithm to find the shortest path from the starting vertex to the ending vertex but I cannot go over a certain distance. So given a graph with edges:
0 1 100 30
0 4 125 50
1 2 50 250
1 2 150 50
4 2 100 40
2 3 90 60
4 3 125 150
where each line represents an edge: the first value is the starting vertex, the second the ending vertex, the third the cost, and the fourth the distance.
With the code I have now, when I try to find the cheapest path from 0 to 3 without going over 140, it yields 0 (the default when no path is found) instead of 340 (the cost of the cheapest path). Any suggestions on how to alter my code?
Just gonna copy the code down below because this site is not letting me do anything else.
static void BellmanFord(struct Graph *graph, int source, int ending, int max) {
    int edges = graph->edgeCount;
    int vertices = graph->verticesCount;
    int* money = (int*)malloc(sizeof(int) * vertices);
    int* distance = (int*)malloc(sizeof(int) * vertices);
    for (int I = 0; I < vertices; I++) {
        distance[I] = INT_MAX;
        money[I] = INT_MAX;
    }
    distance[source] = 0;
    money[source] = 0;
    for (int I = 1; I <= vertices - 1; ++I) {
        for (int j = 0; j < edges; ++j) {
            int u = graph->edge[j].Source;
            int v = graph->edge[j].Destination;
            int Cost = graph->edge[j].cost;
            int Duration = graph->edge[j].duration;
            if ((money[u] != INT_MAX) && (money[u] + Cost < money[v])) {
                if (distance[u] + Duration <= max) {
                    money[v] = money[u] + Cost;
                    distance[v] = distance[u] + Duration;
                }
            }
        }
    }
    if (money[ending] == INT_MAX) cout << "0" << endl;
    else cout << money[ending] << endl;
}
Please help! This is probably not that hard, but finals are stressing me out.
This problem, known as the "constrained shortest path" problem, is much harder than it looks. The algorithm you provided does not solve it; it might only catch the solution by luck, depending on the graph's structure.
When this algorithm is applied to the graph you provided with max-distance = 140, it fails to find the solution, which is 0-->1-->2-->3 (using the edge 1 2 150 50) with a total cost of 340 and a distance of 140.
We can observe the reason for the failure by logging the updates to the nodes as they happen; here is the result:
fromNode  toNode  newCost  newDistance
0         1       100      30
0         4       125      50
1         2       250      80
4         2       225      90
Here the algorithm is stuck and cannot go further, since any progress from this point would lead to paths that exceed the max distance of 140. As you can see, node 2 has found the path 0-->4-->2, which is the lowest-cost path from node 0 that respects the max-distance constraint. But now, any progress from 2 to 3 will exceed the distance of 140 (it would be 150, because 2-->3 has a distance of 60).
Running again with max-distance = 150 finds the path 0-->4-->2-->3 with cost 315 and distance 150.
fromNode  toNode  newCost  newDistance
0         1       100      30
0         4       125      50
1         2       250      80
4         2       225      90
2         3       315      150
Note, however, that this second run succeeds only by luck: for max-distance = 150, the path 0-->4-->2-->3 (cost 315, distance 150) happens to be the cheapest path that respects the constraint, so the greedy updates stumble onto the constrained optimum. The first example, where the optimal path (cost 340) exists but is never found, already demonstrates that the algorithm cannot be relied on.
In conclusion, this is not a programming mistake or a bug in the code; it is simply that the algorithm is not adequate for the stated problem.
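For reference, the standard exact approach is dynamic programming over distance budgets: let dp[v][d] be the minimum cost to reach v with total distance exactly d. Here is a minimal sketch, assuming positive integer durations and a modest max; the names are illustrative, not from the original code:

#include <algorithm>
#include <climits>
#include <iostream>
#include <vector>

struct Edge { int src, dst, cost, dur; };

// dp[v][d] = minimum cost to reach v with total duration exactly d.
// Durations are positive, so filling layers d = 0..maxDur in ascending
// order guarantees every predecessor layer is final before it is read.
int constrainedCheapestPath(int vertices, const std::vector<Edge>& edges,
                            int source, int ending, int maxDur) {
    std::vector<std::vector<int>> dp(vertices, std::vector<int>(maxDur + 1, INT_MAX));
    dp[source][0] = 0;
    for (int d = 0; d <= maxDur; ++d)
        for (const Edge& e : edges)
            if (dp[e.src][d] != INT_MAX && d + e.dur <= maxDur)
                dp[e.dst][d + e.dur] = std::min(dp[e.dst][d + e.dur],
                                                dp[e.src][d] + e.cost);
    int best = INT_MAX;
    for (int d = 0; d <= maxDur; ++d)   // cheapest over all feasible durations
        best = std::min(best, dp[ending][d]);
    return best == INT_MAX ? 0 : best;  // 0 = "no path found", as in the question
}

int main() {
    std::vector<Edge> edges = {
        {0, 1, 100, 30}, {0, 4, 125, 50}, {1, 2, 50, 250}, {1, 2, 150, 50},
        {4, 2, 100, 40}, {2, 3, 90, 60}, {4, 3, 125, 150}
    };
    std::cout << constrainedCheapestPath(5, edges, 0, 3, 140) << "\n"; // prints 340
}

The table costs O(V * max) space and O(E * max) time, so this pays off only when the distance bound is small; on the example graph with max = 140 it finds the 0-->1-->2-->3 path for 340.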
Okay, so right before the line
if (money[ending] == INT_MAX) cout << "0" << endl;
I added some code that made it work, but I'm wondering: will this work for every case, or does it need to be altered a little?
if (money[ending] == INT_MAX) {
    for (int j = 0; j < edges; ++j) {
        int u = graph->edge[j].Source;
        int v = graph->edge[j].Destination;
        int Cost = graph->edge[j].cost;
        int Duration = graph->edge[j].duration;
        if ((distance[u] != INT_MAX) && (distance[u] + Duration < distance[v])) {
            if (distance[u] + Duration <= max) {
                money[v] = money[u] + Cost;
                distance[v] = distance[u] + Duration;
            }
        }
    }
}

Downscale array for decimal factor

Is there an efficient way to downscale the number of elements in an array by a non-integer ("decimal") factor?
I want to downsize the elements of one array by a certain factor.
Example:
If I have 10 elements and need to scale down by a factor of 2:
1 2 3 4 5 6 7 8 9 10
scaled to
1.5 3.5 5.5 7.5 9.5
This groups the elements 2 by 2 and takes their arithmetic mean.
My problem is: what if I need to downsize an array of 10 elements to 6 elements? In theory I should group every 10/6 ≈ 1.67 elements and find their arithmetic mean, but how do I do that?
Before suggesting a solution, let's define "downsize" in a more formal way. I would suggest this definition:
Downsizing starts with an array a[N] and produces an array b[M] such that the following is true:
M <= N - otherwise it would be upsizing, not downsizing
SUM(b) = (M/N) * SUM(a) - The sum is reduced proportionally to the number of elements
Elements of a participate in computation of b in the order of their occurrence in a
Let's consider your example of downsizing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to six elements. The total for your array is 55, so the total for the new array would be (6/10)*55 = 33. We can achieve this total in two steps:
Walk the array a, totaling its elements, until we've taken the integer part of the N/M fraction (it must be an improper fraction by rule 1 above).
Let's say that a[i] was the last element of a that we could take as a whole in the current iteration. Take the fraction of a[i+1] equal to the fractional part of N/M.
Continue to the next element of b, starting with the remaining fraction of a[i+1].
Once you are done, your array b will contain M numbers totaling SUM(a). Walk b once more and divide each element by N/M.
Here is how it works with your example:
b[0] = a[0] + (2/3)*a[1] = 2.33333
b[1] = (1/3)*a[1] + a[2] + (1/3)*a[3] = 5
b[2] = (2/3)*a[3] + a[4] = 7.66666
b[3] = a[5] + (2/3)*a[6] = 10.6666
b[4] = (1/3)*a[6] + a[7] + (1/3)*a[8] = 13.3333
b[5] = (2/3)*a[8] + a[9] = 16
--------
Total = 55
Scaling down by 6/10 produces the final result:
1.4 3 4.6 6.4 8 9.6 (Total = 33)
Here is a simple implementation in C++:
// Assumes a is the input vector and b has already been sized to M elements
// and zero-initialized, e.g. std::vector<double> b(M, 0.0);
double need = ((double)a.size()) / b.size();
double have = 0;
size_t pos = 0;
for (size_t i = 0; i != a.size(); i++) {
    if (need >= have + 1) {
        b[pos] += a[i];            // a[i] fits entirely into the current element of b
        have++;
    } else {
        double frac = need - have; // frac is less than 1 because of the "if" condition
        b[pos++] += frac * a[i];   // frac of a[i] goes to the current element of b
        have = 1 - frac;
        b[pos] += have * a[i];     // (1-frac) of a[i] goes to the next element of b
    }
}
for (size_t i = 0; i != b.size(); i++) {
    b[i] /= need;                  // scale the sums down to averages
}
Demo.
You will need to resort to some form of interpolation, as the number of elements to average isn't an integer.
You can consider computing the prefix sum of the array, i.e.
index:  0 1 2  3  4  5  6  7  8  9
value:  1 2 3  4  5  6  7  8  9 10
yields by summation
index:  0 1 2  3  4  5  6  7  8  9
prefix: 1 3 6 10 15 21 28 36 45 55
Then perform linear interpolation to get the intermediate values that you are lacking, at positions 0*, 10/6, 20/6, 30/6*, 40/6, 50/6, 60/6*. (Those with an asterisk are readily available.)
index:  0 1 10/6 2  3 20/6  4  5  6  40/6  7  8  50/6  9
prefix: 1 3 15/3 6 10 35/3 15 21 28 100/3 36 45 145/3 55
Now you get fractional sums by subtracting values in pairs. The first average is
(15/3-1)/(10/6) = 12/5
I can't think of anything in the C++ library that will crank out something like this, all fully cooked and ready to go.
So you'll have to, pretty much, roll up your sleeves and go to work. At this point, the question of what's the "efficient" way of doing it boils down to its very basics. Which means:
1) Calculate how big the output array should be. Based on the description of the issue, you should be able to make that calculation even before looking at the values in the input array. You know the input array's size(), you can calculate the size() of the destination array.
2) So, you resize() the destination array up front. Now, you no longer need to worry about the time wasted in growing the size of the dynamic output array, incrementally, as you go through the input array, making your calculations.
3) So what's left is the actual work: iterating over the input array, and calculating the downsized values.
auto b = input_array.begin();
auto e = input_array.end();
auto p = output_array.begin();
Don't see many other options here, besides brute force iteration and calculations. Iterate from b to e, getting your samples, calculating each downsized value, and saving the resulting value into *p++.
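To flesh that outline out, here is one possible sketch (downsize and its parameter names are mine): it walks the input once through b and e, splits each sample between adjacent output slots exactly as in the fractional-grouping answer above, and writes the results through p.

#include <cstddef>
#include <vector>

std::vector<double> downsize(const std::vector<double>& input_array, std::size_t m) {
    std::vector<double> output_array(m, 0.0);     // steps 1+2: size known up front
    double need = (double)input_array.size() / m; // input samples per output slot
    double have = 0;
    auto e = input_array.end();
    auto p = output_array.begin();
    for (auto b = input_array.begin(); b != e; ++b) {
        if (need >= have + 1) {          // the whole sample fits into *p
            *p += *b;
            have++;
        } else {                         // split the sample between *p and the next slot
            double frac = need - have;
            *p++ += frac * *b;
            have = 1 - frac;
            *p += have * *b;
        }
    }
    for (double& v : output_array)
        v /= need;                       // turn the weighted sums into means
    return output_array;
}

Calling downsize({1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, 6) reproduces the 1.4 3 4.6 6.4 8 9.6 result from the first answer.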

How to decrement the value of a variable in Python?

Can anyone help me with the shortest way of decrementing the value of a variable?
Below is my desired output:
start = 5000
range = 5
qout = start/range
Distributed Remaining
1000 4000 # start - 1000
1000 3000 # 4000 - 1000
1000 2000 # 3000 - 1000
1000 1000 # 2000 - 1000
1000 0 # 1000 - 1000
What I have done so far is this:
start = 5000
range = 5
qout = start/range
i = 0
while i < range:
    temp = {
        'distr' : qout,
        'remain' : start - remain, # This is what i can do only, unless it is being saved in the database so that i can move to next item.
    }
    i += 1
return temp
RE-UPDATED: I guess you are right; I don't know how I should ask. But let me show my original code.
temp = {}
i = 0
seq = 0
start = 11529.60
range = 6
qout = start / range
remaining = start - qout
while i < range:
    while remaining >= 0:
        temp = {
            'sequence' : i+1,
            'distributed' : qout,
            'remaining' : remaining,
        }
        remaining -= qout
        print(temp)
    i += 1
My expected output would look like this (this is the output that I wanted to show):
Sequence Distributed Remaining
1 1921.60 9608.00
2 1921.60 7686.40
3 1921.60 5764.80
4 1921.60 3843.20
5 1921.60 1921.60
6 1921.60 0.00
However, this is what I get:
Sequence Distributed Remaining
1 1921.60 9608.00
1 1921.60 7686.40
1 1921.60 5764.80
1 1921.60 3843.20
1 1921.60 1921.60
Thanks for any help
This is my 3rd edit. I honestly believe that the biggest problem here is that you cannot define the question.
How to decrement the value of a variable in Python?
The answer to this is i -= 1, but that's not what you're asking.
Then you have a desired output with no explanation of what it is.
That's how I guess you want it to work...
start - an initial value;
range - how many times start will be deducted from;
quot - the amount of each deduction, which is equal to start/range;
remaining - this is my variable, which reflects the result of deducting from start. From your comment below, I assume remaining can go negative.
Still no question here, but let's put it together ...
start = 11529.60
range = 6   # note: this name shadows Python's built-in range(), kept to match the question
quot = start/range
sequence = 0
remaining = start

while range > 0:
    range -= 1
    sequence += 1
    remaining -= quot
    print(sequence, quot, remaining)