How to make my program work faster?

I tried to run this code, but it exceeds the time limit in a few cases. How can I shorten the running time?
I need to understand which parts of my program take the most time (particular functions, etc.). I understand that improving the iteration and the complexity can reduce execution time, but what I tried is not helping much. Please help.
The program is simple: I take two points a and b and count all the palindromic numbers between them.
#include <stdio.h>
int ifpalin(int g)
{
    int rev = 0;
    int tmp = g;
    while (tmp > 0)
    {
        rev = rev * 10 + (tmp % 10);
        tmp = tmp / 10;
    }
    if (rev == g)
        return 1;
    else
        return 0;
}
int findpalin(int a1, int b1)
{
    int sm = 0;
    for (int i = a1; i <= b1; i++)
    {
        if (ifpalin(i) == 1)
            sm++;
    }
    printf("%d", sm);
    printf("\n");
    return 0;
}
int main()
{
    int a, b, n;
    scanf("%d", &n);
    for (int i = 0; i < n; i++)
    {
        scanf("%d", &a);
        scanf("%d", &b);
        findpalin(a, b);
    }
    return 0;
}

Your code is already pretty efficient as an implementation of your algorithm; it is the algorithm itself that can be improved. These challenges want you to find a "non-obvious" but more efficient algorithm. In this particular case, you should not check every number between a and b.
There is another solution: you can "know" the number of palindromes directly. Think about it like this:
With one digit, there are 10 palindromes [0, ..., 9].
With two digits, there are 9 palindromes [11, ..., 99].
With three digits, there are 9 possibilities for the first and last digit, which must be equal [1, ..., 9]. For a valid palindrome, the middle has to be a palindrome as well. Since the middle has one digit, there are 10 possibilities for it, and thus there are 9 * 10 = 90 palindromes with 3 digits.
With four digits, we get 9 * 10 (the middle is a two-digit palindrome, with 00 now also allowed), and with five digits 9 * 100 (the middle is a three-digit palindrome, with a leading 0 allowed).
Thus you can derive a formula for n-digit numbers: for n >= 2 there are 9 * 10^(ceil(n/2) - 1) palindromes.
Then you can directly compute the count for the long stretches between a and b, and only have to worry about which digit counts are relevant and how many palindromes are lost at the beginning and end because a and b are not exactly 10^(n-1) and 10^n - 1.
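The counting idea above can be sketched roughly as follows. This is only one possible way to do it, not the code from the question: the names count_palindromes_upto and count_palindromes are made up for illustration, and 0 is counted as a palindrome because ifpalin in the question accepts it.

```cpp
#include <cstdint>
#include <string>

// Number of palindromes in [0, n]. Hypothetical helper, not from the question.
std::int64_t count_palindromes_upto(std::int64_t n) {
    if (n < 0) return 0;
    if (n < 10) return n + 1;                 // 0..n are all palindromes
    const std::string s = std::to_string(n);
    const int len = static_cast<int>(s.size());

    std::int64_t total = 10;                  // the one-digit palindromes 0..9
    for (int k = 2; k < len; ++k) {           // all palindromes with fewer digits than n
        std::int64_t c = 9;                   // leading digit is 1..9
        for (int i = 1; i < (k + 1) / 2; ++i) c *= 10;
        total += c;
    }

    // Palindromes with exactly len digits are determined by their first half.
    const int half_len = (len + 1) / 2;
    const std::string half = s.substr(0, half_len);
    std::int64_t smallest_half = 1;
    for (int i = 1; i < half_len; ++i) smallest_half *= 10;
    total += std::stoll(half) - smallest_half; // halves strictly below n's half

    // The palindrome built from n's own half counts only if it does not exceed n.
    std::string pal = half;
    for (int i = len / 2 - 1; i >= 0; --i) pal += half[i];
    if (pal <= s) ++total;                    // same length, so string order == numeric order
    return total;
}

// Number of palindromes in [a, b], assuming 0 <= a <= b.
std::int64_t count_palindromes(std::int64_t a, std::int64_t b) {
    return count_palindromes_upto(b) - count_palindromes_upto(a - 1);
}
```

Each query then costs time proportional to the number of digits of b instead of b - a.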

Your int ifpalin(int g) function could be run in parallel for different values of g, because the result for one input has no effect on the result for any other.
In int findpalin(int a1,int b1) there is a for loop whose complexity is O(N); this is where you can run your threads (each thread runs ifpalin). Of course, a good parallelism plan is needed.
You can run this function on logical batches of the range and aggregate the results.
On the other hand, any benchmark should be performed in release mode.
I hope it helps.
Excuse me if my English is bad, and please correct me.

In ifpalin
Convert the number to string
Reverse the string
Compare with the original
If equal then it's a palindrome
See How to reverse an std::string?
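The steps above could look roughly like this in C++. The name is_palindrome is made up for illustration, and note that this has the same per-number cost as the arithmetic version in the question, so on its own it is unlikely to fix a time limit problem.

```cpp
#include <algorithm>
#include <string>

// String-based palindrome test: convert, reverse, compare.
bool is_palindrome(int g) {
    const std::string digits = std::to_string(g);   // convert the number to a string
    std::string reversed = digits;
    std::reverse(reversed.begin(), reversed.end()); // reverse the string
    return digits == reversed;                      // equal means palindrome
}
```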

Related

Can Anyone reduce the Complexity of My Code. Problem E of Codeforces Round113 Div.2

Link to The Problem: https://codeforces.com/problemset/problem/166/E
Problem Statement:
*You are given a tetrahedron. Let's mark its vertices with letters A, B, C, and D correspondingly.
An ant is standing in the vertex D of the tetrahedron. The ant is quite active and he wouldn't stay idle. At each moment of time, he makes a step from one vertex to another one along some edge of the tetrahedron. The ant just can't stand on one place.
You do not have to do much to solve the problem: your task is to count the number of ways in which the ant can go from the initial vertex D to itself in exactly n steps. In other words, you are asked to find out the number of different cyclic paths with the length of n from vertex D to itself. As the number can be quite large, you should print it modulo 1000000007 (10^9 + 7).*
Input:
The first line contains a single integer n (1 ≤ n ≤ 10^7), the required length of the cyclic path.
Output:
Print a single integer, the required number of ways modulo 1000000007 (10^9 + 7).
Example: Input n=2 , Output: 3
Input n=4, Output: 21
My Approach to Problem:
I have written a recursive code that takes two input n and present index, then I am traveling and exploring all possible combinations.
#include<iostream>
using namespace std;
#define mod 10000000
#define ll long long
ll count_moves=0;
ll count(ll n, int present)
{
if(n==0 and present==0) count_moves+=1, count_moves%=mod; //base_condition
else if(n>1){ //Generating All possible Combinations
count(n-1,(present+1)%4);
count(n-1,(present+2)%4);
count(n-1,(present+3)%4);
}
else if(n==1 and present) count(n-1,0);
}
int main()
{
ll n; cin>>n;
if(n==1) {
cout<<"0"; return;
}
count(n,0);
cout<<count_moves%mod;
}
But the problem is that I am getting a Time Limit Exceeded error, since the time complexity of my code is very high. Can anyone suggest how I can optimize or memoize my code to reduce its complexity?
Edit 1: Some people are commenting about the macros and the division; those are not the issue. The range of n is 10^7 and the complexity of my code is exponential, so my actual question is how to reduce it to linear time, i.e. O(n).

Any time you build a solution on recursion and exceed the time limit, you should suspect that the recursion is the problem.
The best solution is to not use recursion.
Look at the results you get for n = 2, 3, 4, and so on:
3
6
21
60
183
546
1641
4920
...
It might be hard to spot a pattern in the first couple of terms, but it gets easier later on.
Each term is roughly 3 times the previous term, or more precisely, a(n) = 3 * a(n-1) + 3 * (-1)^n, with a(1) = 0.
Now you could just write a for loop for it, starting from count_moves = 0:
for(int i = 0; i < n-1; i++)
{
count_moves = count_moves * 3 + std::pow(-1, i) * 3;
}
or to get rid of pow():
for(int i = 0; i < n-1; i++)
{
count_moves = count_moves * 3 + (i % 2 * 2 - 1) * -3;
}
Furthermore, you could even turn that into a closed-form formula to get rid of the for loop: a(n) = (3^n + 3 * (-1)^n) / 4,
or in code:
count_moves = (pow(3, n) + (n % 2 * 2 - 1) * -3) / 4;
However, you can't get rid of the pow() this time, or you will have to write a loop for that then.
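The snippets above leave out the modulus the problem asks for, and pow() on doubles loses precision long before n reaches 10^7. A minimal sketch of the loop version with the reduction applied at every step might look like this; the names kMod and count_cyclic_paths are made up for illustration, and the recurrence used is a(n) = 3 * a(n-1) + 3 * (-1)^n with a(1) = 0, which reproduces the values listed above.

```cpp
#include <cstdint>

constexpr std::int64_t kMod = 1000000007;  // 10^9 + 7, as required by the problem

// Linear-time evaluation of a(n) = 3*a(n-1) + 3*(-1)^n (mod kMod), a(1) = 0.
std::int64_t count_cyclic_paths(std::int64_t n) {
    std::int64_t a = 0;  // a(1): the ant cannot return to D in one step
    for (std::int64_t k = 2; k <= n; ++k) {
        // +3 for even k, -3 for odd k; kMod - 3 is -3 modulo kMod.
        const std::int64_t correction = (k % 2 == 0) ? 3 : kMod - 3;
        a = (a * 3 + correction) % kMod;
    }
    return a;
}
```

For n = 10^7 this is ten million cheap iterations, which normally fits comfortably in a typical time limit.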

I believe one of your issues is that you are recalculating things.
Take n=4 for example: count(3,x) is called 3 times, for x in [0,3].
However, if you kept a std::map keyed on the (n, present) pair, you could save each value and calculate it only once.
This will take more space: the map will hold 4*(n-1) entries when you are done. That is probably still too large for n up to 10^7?
Another thing you can do is multithread. Each call to count can start its own thread. You then need to be careful to stay thread-safe when changing the global count and the state of the std::map, if you decide to use it.
Edit:
Calculate count(k,x) once for each k in [1,n-1] and x in [0,3]; then count(n,0) = a*count(n-1,1) + b*count(n-1,2) + c*count(n-1,3).
If you can figure out the pattern for what a, b, c are given n, or maybe even the a, b, c for the n-1 case, then you may be able to solve this problem easily.
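One way to act on this hint without a map is to notice that, by symmetry, the three vertices other than D always have the same count, so only two running values are needed. The sketch below is an assumption-laden illustration, not the answerer's code: kModulus and count_paths_dp are made-up names, and the coefficients a = b = c = 1 follow from every other vertex having exactly one edge to D.

```cpp
#include <cstdint>

constexpr std::int64_t kModulus = 1000000007;

// Bottom-up DP over the number of steps, keeping only the previous layer.
std::int64_t count_paths_dp(std::int64_t n) {
    std::int64_t at_d = 1;      // ways to stand on D after 0 steps
    std::int64_t at_other = 0;  // ways to stand on one particular vertex among A, B, C
    for (std::int64_t step = 1; step <= n; ++step) {
        // D is reached from any of the three other vertices.
        const std::int64_t next_at_d = (3 * at_other) % kModulus;
        // A given other vertex is reached from D or from the two remaining vertices.
        const std::int64_t next_at_other = (at_d + 2 * at_other) % kModulus;
        at_d = next_at_d;
        at_other = next_at_other;
    }
    return at_d;
}
```

This uses O(1) memory and O(n) time, so neither the map nor extra threads are needed.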

How to calculate the sum of the bitwise xor values of all the distinct combination of the given numbers efficiently?

Given n(n<=1000000) positive integer numbers (each number is smaller than 1000000). The task is to calculate the sum of the bitwise xor ( ^ in c/c++) value of all the distinct combination of the given numbers.
Time limit is 1 second.
For example, if 3 integers are given as 7, 3 and 5, answer should be 7^3 + 7^5 + 3^5 = 12.
My approach is:
#include <bits/stdc++.h>
using namespace std;
int num[1000001];
int main()
{
int n, i, sum, j;
scanf("%d", &n);
sum=0;
for(i=0;i<n;i++)
scanf("%d", &num[i]);
for(i=0;i<n-1;i++)
{
for(j=i+1;j<n;j++)
{
sum+=(num[i]^num[j]);
}
}
printf("%d\n", sum);
return 0;
}
But my code failed to run in 1 second. How can I write my code in a faster way, which can run in 1 second ?
Edit: Actually this is an Online Judge problem and I am getting Cpu Limit Exceeded with my above code.

You need to compute around 1e12 xors in order to brute force this. Modern processors can do around 1e10 such operations per second. So brute force cannot work; therefore they are looking for you to figure out a better algorithm.
So you need to find a way to determine the answer without computing all those xors.
Hint: can you think of a way to do it if all the input numbers were either zero or one (one bit)? And then extend it to numbers of two bits, three bits, and so on?

When optimising your code you can go 3 different routes:
Optimising the algorithm.
Optimising the calls to language and library functions.
Optimising for the particular architecture.
There may very well be a quicker mathematical way of xoring every pair combination and then summing them up, but I know it not. In any case, on the contemporary processors you'll be shaving off microseconds at best; that is because you are doing basic operations (xor and sum).
Optimising for the architecture also makes little sense. It normally becomes important in repetitive branching, you have nothing like that here.
The biggest problem in your algorithm is reading from the standard input. Despite the fact that "scanf" takes only 5 characters in your source code, in machine language this is the bulk of your program. Unfortunately, if the data actually changes each time you run your code, there is no way around reading from stdin, and it makes no difference whether you use scanf, std::cin >>, or even attempt to implement your own method to read characters from input and convert them into ints.
All this assumes that you don't expect a human being to enter thousands of numbers in less than one second. I guess you can be running your code via: myprogram < data.

This function grows quadratically (thanks @rici). At around 25,000 positive integers with each being 999,999 (worst case) the for loop calculation alone can finish in approximately a second. Trying to make this work with input as you have specified and for 1 million positive integers just doesn't seem possible.

With the hint in Alan Stokes's answer, you may have a linear complexity instead of quadratic with the following:
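#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

std::size_t xor_sum(const std::vector<std::uint32_t>& v)
{
    std::size_t res = 0;
    for (std::size_t b = 0; b != 32; ++b) {
        // Number of inputs that have bit b set.
        const std::size_t count_1 =
            std::count_if(v.begin(), v.end(),
                          [b](std::uint32_t n) { return (n >> b) & 0x01; });
        const std::size_t count_0 = v.size() - count_1;
        // Every (0, 1) pair contributes 2^b to the total.
        res += (count_0 * count_1) << b;
    }
    return res;
}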
Explanation:
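x^y = Sum_b((x&b)^(y&b)), where b is a single-bit mask (from 1<<0 to 1<<31).
For a given bit, with count_0 and count_1 being the number of inputs with that bit set to 0 and 1 respectively, there are count_0 * (count_0 - 1) / 2 pairs of the form 0^0, count_0 * count_1 pairs of the form 0^1, and count_1 * (count_1 - 1) / 2 pairs of the form 1^1. Since 0^0 and 1^1 are both 0, only the count_0 * count_1 mixed pairs contribute, each adding the value of that bit. For the inputs 7, 3 and 5: bit 0 gives 0 * 3 = 0 mixed pairs, bit 1 gives 1 * 2 mixed pairs worth 2 each, and bit 2 gives 1 * 2 mixed pairs worth 4 each, for a total of 0 + 4 + 8 = 12.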

Output wrong Project Euler 50

So I am attempting Problem 50 of project euler. (So close to level 2 :D) It goes like this:
The prime 41, can be written as the sum of six consecutive primes:
41 = 2 + 3 + 5 + 7 + 11 + 13
This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
Here is my code:
#include <iostream>
#include <vector>
using namespace std;
int main(){
vector<int> primes(1000000,true);
primes[0]=false;
primes[1]=false;
for (int n=4;n<1000000;n+=2)
primes[n]=false;
for (int n=3;n<1000000;n+=2){
if (primes[n]==true){
for (int b=n*2;b<100000;b+=n)
primes[b]=false;
}
}
int basicmax,basiccount=1,currentcount,biggermax,biggercount=1,sum=0,basicstart,basicend,biggerstart,biggerend;
int limit=1000000;
for (int start=2;start<limit;start++){
//cout<<start;
sum=0;
currentcount=0;
for (int basic=start;start<limit&&sum+basic<limit;basic++){
if (primes[basic]==true){
//cout<<basic<<endl;
sum+=basic;currentcount++;}
if (primes[sum]&&currentcount>basiccount&&sum<limit)
{basicmax=sum;basiccount=currentcount;basicstart=start;basicend=basic;}
}
if (basiccount>biggercount)
{biggercount=basiccount;biggermax=basicmax;biggerend=basicend;biggerstart=basicstart;}
}
cout<<biggercount<<endl<<biggermax<<endl;
return 0;
}
Basically it just creates a vector of all primes up to 1000000 and then loops through them finding the right answer. The answer is 997651 and the count is supposed to be 543 but my program outputs 997661 and 546 respectively. What might be wrong?

It looks like you're building your primes vector wrong
for (int b=n*2;b<100000;b+=n)
primes[b]=false;
I think that should be 1,000,000, not 100,000. It might be better to factor that number out into a constant so that it is consistent throughout.
The rest of it looks basically fine, although without testing it ourselves I'm not sure what else we can add. There's plenty of room for efficiency improvements: you do a lot of repeated scanning of ranges. For example, there's no point starting a sum when primes[start] is false, and you could build a second vector of just the primes for the summing. (Does Project Euler have runtime and memory limits? I can't remember.)
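The suggestions above (one limit used everywhere, a separate vector of primes, no rescanning of non-primes) could be combined along these lines. This is a sketch rather than a drop-in fix for the question's code: the function name longest_consecutive_prime_sum and the use of prefix sums are my own choices, and it assumes limit >= 2.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns {prime, number_of_terms} for the longest run of consecutive primes
// whose sum is a prime below limit. Assumes limit >= 2.
std::pair<int, int> longest_consecutive_prime_sum(int limit) {
    // Sieve of Eratosthenes; the same limit bounds every loop.
    std::vector<bool> is_prime(limit, true);
    is_prime[0] = is_prime[1] = false;
    for (std::int64_t n = 2; n * n < limit; ++n)
        if (is_prime[n])
            for (std::int64_t m = n * n; m < limit; m += n) is_prime[m] = false;

    // prefix[i] holds the sum of the first i primes.
    std::vector<int> primes;
    std::vector<std::int64_t> prefix{0};
    for (int n = 2; n < limit; ++n)
        if (is_prime[n]) {
            primes.push_back(n);
            prefix.push_back(prefix.back() + n);
        }

    int best_prime = 0, best_len = 0;
    for (std::size_t start = 0; start < primes.size(); ++start) {
        // Only runs longer than the current best are worth checking.
        for (std::size_t end = start + best_len + 1; end < prefix.size(); ++end) {
            const std::int64_t sum = prefix[end] - prefix[start];
            if (sum >= limit) break;  // sums only grow as end increases
            if (is_prime[sum]) {
                best_prime = static_cast<int>(sum);
                best_len = static_cast<int>(end - start);
            }
        }
    }
    return {best_prime, best_len};
}
```

Calling it with 100 and 1000 should reproduce the two examples from the problem statement (41 with 6 terms, 953 with 21 terms).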

You are thinking about this the wrong way.
Generate the maximal sequence of primes such that their sum is less than 1,000,000. This is 2, 3, 5, ..., p. For some p.
Sum this sequence and test it for primality.
If it is prime terminate and return the sum.
A shorter sequence must be the correct one. There are exactly two ways of shortening the sequence and preserving the consecutive prime property - removing the first element or removing the last. Recurse from 2 with both of these sequences.

Fastest way to find the sum of decimal digits

What is the fastest way to find the sum of decimal digits?
The following code is what I wrote but it is very very slow for range 1 to 1000000000000000000
long long sum_of_digits(long long input) {
long long total = 0;
while (input != 0) {
total += input % 10;
input /= 10;
}
return total;
}
int main ( int argc, char** argv) {
for ( long long i = 1L; i <= 1000000000000000000L; i++) {
sum_of_digits(i);
}
return 0;
}

I'm assuming what you are trying to do is along the lines of
#include <iostream>
const long long limit = 1000000000000000000LL;
int main () {
long long grand_total = 0;
for (long long ii = 1; ii <= limit; ++ii) {
grand_total += sum_of_digits(ii);
}
std::cout << "Grand total = " << grand_total << "\n";
return 0;
}
This won't work for two reasons:
It will take a long long time.
It will overflow.
To deal with the overflow problem, you will either have to put a bound on your upper limit or use some bignum package. I'll leave solving that problem up to you.
To deal with the computational burden you need to get creative. If you know the upper limit is limited to powers of 10 this is fairly easy. If the upper limit can be some arbitrary number you will have to get a bit more creative.
First look at the problem of computing the sum of digits of all integers from 0 to 10^n - 1 (e.g., 0 to 9 (n=1), 0 to 99 (n=2), etc.). Denote the sum of digits of all integers from 0 to 10^n - 1 as S_n. For n=1 (0 to 9), this is just 0+1+2+3+4+5+6+7+8+9 = 45 (9*10/2). Thus S_1 = 45.
For n=2 (0 to 99), you are summing 0-9 ten times and you are summing 0-9 ten times again. For n=3 (0 to 999), you are summing 0-99 ten times and you are summing 0-9 100 times. For n=4 (0 to 9999), you are summing 0-999 ten times and you are summing 0-9 1000 times. In general, S_n = 10*S_(n-1) + 10^(n-1)*S_1 as a recursive expression. This simplifies to S_n = (9*n*10^n)/2.
If the upper limit is of the form 10^n, the solution is the above S_n plus one more for the number 1000...000. If the upper limit is an arbitrary number you will need to get creative once again. Think along the lines that went into developing the formula for S_n.
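For an arbitrary upper limit, one common way to "get creative" is to count, for each decimal place separately, how often each digit appears there. The sketch below is one such approach and not necessarily what the answerer had in mind; the name digit_sum_upto is made up, and because the result is held in a 64-bit integer it is only meant for limits up to about 10^17 (the overflow issue for 10^18 mentioned above still applies).

```cpp
#include <cstdint>

// Sum of the decimal digits of every integer in [1, n], place by place.
// Assumes 0 <= n <= 10^17 so that the result fits in std::int64_t.
std::int64_t digit_sum_upto(std::int64_t n) {
    std::int64_t total = 0;
    for (std::int64_t place = 1; place <= n; place *= 10) {
        const std::int64_t higher = n / (place * 10);   // digits left of this place
        const std::int64_t current = (n / place) % 10;  // digit at this place
        const std::int64_t lower = n % place;           // digits right of this place

        total += higher * 45 * place;                   // complete 0..9 cycles at this place
        total += current * (current - 1) / 2 * place;   // digits 0..current-1 in the last partial cycle
        total += current * (lower + 1);                 // the digit 'current' itself
        if (place > n / 10) break;                      // avoid overflowing place *= 10
    }
    return total;
}
```

The loop runs once per decimal digit of n, so the cost is logarithmic in the upper limit.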

You can break this down recursively. The sum of the digits of an 18-digit number is the sum of the first 9 digits plus the sum of the last 9 digits. Likewise, the sum of the digits of a 9-digit number is the sum of the first 4 or 5 digits plus the sum of the last 5 or 4 digits. Naturally you can special-case when the value is 0.

Reading your edit: computing that function in a loop for i between 1 and 1000000000000000000 takes a long time. This is a no-brainer.
1000000000000000000 is one billion billion. Your processor can do at best a few billion operations per second. Even with a nonexistent 4-5 GHz processor, and assuming that in the best case it compiles down to an add, a mod, a div, and a compare-and-jump, you could only do about 1 billion iterations per second, meaning it will take on the order of 1 billion seconds.

You probably don't want to do it in a bruteforce way. This seems to be more of a logical thinking question.
Note - 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = N(N+1)/2 = 45.
---- Changing the answer to make it clearer after David's comment
See David's answer - I had it wrong

Quite late to the party, but anyways, here is my solution. Sorry it's in Python and not C++, but it should be relatively easy to translate. And because this is primarily an algorithm problem, I hope that's ok.
As for the overflow problem, the only thing that comes to mind is to use arrays of digits instead of actual numbers. Given this algorithm I hope it won't affect performance too much.
https://gist.github.com/frnhr/7608873
It uses these three recursions I found by looking and poking at the problem. Rather than trying to come up with some general and arcane equations, here are three examples. A general case should be easily visible from those.
relation 1
Reduces function calls with arbitrary argument to several recursive calls with more predictable arguments for use in relations 2 and 3.
foo(3456) == foo(3000)
+ foo(400) + 400 * (3)
+ foo(50) + 50 * (3 + 4)
+ foo(6) + 6 * (3 + 4 + 5)
relation 2
Reduce calls with an argument in the form L*10^M (e.g: 30, 7000, 900000) to recursive call usable for relation 3. These triangular numbers popped in quite uninvited (but welcome) :)
triangular_numbers = [0, 1, 3, 6, 10, 15, 21, 28, 36] # 0 not used
foo(3000) == 3 * foo(1000) + triangular_numbers[3 - 1] * 1000
Only useful if L > 1. It holds true for L = 1 but is trivial. In that case, go directly to relation 3.
relation 3
Recursively reduce calls with argument in format 1*10^M to a call with argument that's divided by 10.
foo(1000) == foo(100) * 10 + 44 * 100 + 100 - 9 # 44 and 9 are constants
Ultimately you only have to really calculate the sum of digits for numbers 0 to 10, and it turns out that only up to 3 of these calculations are needed. Everything else is taken care of by this recursion. I'm pretty sure it runs in O(logN) time. That's FAAST!!!!!11one
On my laptop it calculates the sum of digit sums for a given number with over 1300 digits in under 7 seconds! Your test (1000000000000000000) gets calculated in 0.000112057 seconds!

I think you cannot do better than O(N) where N is the number of digits in the given number(which is not computationally expensive)
However if I understood your question correctly (the range) you want to output the sum of digits for a range of numbers. In that case, you can increment by one when you go from number0 to number9 and then decrease by 8.

You will need to cheat - look for mathematical patterns that let you short-cut your computations.
For example, do you really need to test that input != 0 every time? Does it matter if you add 0/10 several times? Since it won't matter, consider unrolling the loop.
Can you do the calculation in a larger base, eg, base 10^2, 10^3, etcetera, that might allow you to reduce the number of digits, which you'll then have to convert back to base 10? If this works, you'll be able to implement a cache more easily.
Consider looking at compiler intrinsics that let you give hints to the compiler for branch prediction.
Given that this is C++, consider implementing this using template metaprogramming.
Given that sum_of_digits is purely functional, consider caching the results.
Now, most of those suggestions will backfire - but the point I'm making is that if you have hit the limits of what your computer can do for a given algorithm, you do need to find a different solution.
This is probably an excellent starting point if you want to investigate this in detail: http://mathworld.wolfram.com/DigitSum.html

Possibility 1:
You could make it faster by feeding the result of one iteration of the loop into the next iteration.
For example, if i == 365, the result is 14. In the next loop, i == 366 -- 1 more than the previous result. The sum is also 1 more: 3 + 6 + 6 = 15.
Problems arise when there is a carry digit. If i == 99 (ie. result = 18), the next loop's result isn't 19, it's 1. You'll need extra code to detect this case.
Possibility 2:
While thinking through the above, it occurred to me that the sequence of results from sum_of_digits, when graphed, would resemble a sawtooth. With some analysis of the resulting graph (which I leave as an exercise for the reader), it may be possible to identify a method to allow direct calculation of the sum result.
However, as some others have pointed out: Even with the fastest possible implementation of sum_of_digits and the most optimised loop code, you can't possibly calculate 1000000000000000000 results in any useful timeframe, and certainly not in less than one second.
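Possibility 1 can be sketched as below. The carry case comes down to counting trailing nines: every trailing 9 of i turns into a 0 when 1 is added, which removes 9 from the digit sum. The name next_digit_sum is made up for illustration, and the sketch assumes i >= 0.

```cpp
// Given i and its digit sum, returns the digit sum of i + 1 without
// re-scanning all digits. Assumes i >= 0.
long long next_digit_sum(long long i, long long digit_sum_of_i) {
    long long trailing_nines = 0;
    while (i % 10 == 9) {  // each trailing 9 becomes 0 after the increment
        ++trailing_nines;
        i /= 10;
    }
    // One digit goes up by 1; each trailing 9 drops by 9.
    return digit_sum_of_i + 1 - 9 * trailing_nines;
}
```

This makes each step cheaper on average, but it still visits every number in the range, so it does not change the conclusion about 1000000000000000000 iterations.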

Edit: It seems you want the sum of the actual digits, such that 12345 gives 1+2+3+4+5, not the count of digits, nor the sum of all numbers 1 to 12345 (inclusive).
As such the fastest you can get is:
long long sum_of_digits(long long input) {
long long total = input % 10;
while ((input /= 10) != 0)
total += input % 10;
return total;
}
Which is still going to be slow when you're running enough iterations. Your requirement of 1,000,000,000,000,000,000L iterations is one million million million. Given that 100 million iterations take around 10,000 ms on my computer, one can expect about 100 ms per 1 million iterations, and you want to do that another million million times. There are only 86,400 seconds in a day, so at best we can compute around 864,000 million iterations per day. It would take one computer
Let's suppose your method could be performed in a single floating-point operation (somehow), and suppose you are using the K computer, which is currently the fastest (Rmax) supercomputer at over 10 petaflops. If you do the math, that is 10,000 million million floating-point operations per second. This means that your 1 million million million loop will take the world's fastest non-distributed supercomputer 100 seconds to compute the sums (if it took 1 floating-point operation per sum, which it can't), so you will need to wait quite some time for computers to become 100 times more powerful before your solution is runnable in under one second.
Whatever you're trying to do, you're either trying to solve an unsolvable problem in near real-time (e.g. something related to graphics calculation), or you misunderstand the question or task that was given to you, or you are expected to perform something faster than any (non-distributed) computer system can do.
If your task is actually to sum all the digits of a range as you show and then output them, the answer is not to improve the for loop. For example:
1 = 1
10 = 46
100 = 901
1000 = 13501
10000 = 180001
100000 = 2250001
1000000 = 27000001
10000000 = 315000001
100000000 = 3600000001
From this you could work out a formula to actually compute the total sum of all digits for all numbers from 1 to N. But it's not clear what you really want, beyond a much faster computer.

Not the best, but simple:
#include <string>

int DigitSumRange(int a, int b) {
    int s = 0;
    for (; a <= b; a++)
        for (char c : std::to_string(a))
            s += c - '0';
    return s;
}

A Python function is given below, which converts the number to a string and then to a list of digits and then finds the sum of these digits.
def SumDigits(n):
    ns = list(str(n))
    z = [int(d) for d in ns]
    return sum(z)

In C++, one of the fastest ways can be to use strings.
First, read the input from the user into a string. Then add up each character of the string after converting it to an int, which can be done with (str[i] - '0').
#include<iostream>
#include<string>
using namespace std;
int main()
{ string str;
cin>>str;
long long int sum=0;
for(long long int i=0;i<str.length();i++){
sum = sum + (str[i]-'0');
}
cout<<sum;
}

The formula for finding the sum of the numbers from 1 to N (the sum of the integers themselves, not of their digits) is:
(1 + N)*(N/2)
[http://mathforum.org/library/drmath/view/57919.html][1]
There is a class written in C# which supports a number with more than the supported max-limit of long.
You can find it here. [Oyster.Math][2]
Using this class, I have written a block of code in C#; maybe it is of some help to you.
using Oyster.Math;
class Program
{
private static DateTime startDate;
static void Main(string[] args)
{
startDate = DateTime.Now;
Console.WriteLine("Finding Sum of digits from {0} to {1}", 1L, 1000000000000000000L);
sum_of_digits(1000000000000000000L);
Console.WriteLine("Time Taken for the process: {0},", DateTime.Now - startDate);
Console.ReadLine();
}
private static void sum_of_digits(long input)
{
var answer = IntX.Multiply(IntX.Parse(Convert.ToString(1 + input)), IntX.Parse(Convert.ToString(input / 2)), MultiplyMode.Classic);
Console.WriteLine("Sum: {0}", answer);
}
}
Please ignore this comment if it is not relevant for your context.
[1]: https://web.archive.org/web/20171225182632/http://mathforum.org/library/drmath/view/57919.html
[2]: https://web.archive.org/web/20171223050751/http://intx.codeplex.com/

If you want to find the sum for the range, say 1 to N, then simply do the following:
long sum = N * (N + 1) / 2;
It is the fastest way.

Writing a C++ version of the algebra game 24

I am trying to write a C++ program that works like the game 24. For those who don't know how it is played, basically you try to find any way that 4 numbers can total 24 using the four arithmetic operators +, -, /, * and parentheses.
As an example, say someone inputs 2,3,1,5
((2+3)*5) - 1 = 24
It was relatively simple to code the function to determine if three numbers can make 24 because of the limited number of positions for parentheses, but I cannot figure out how to code it efficiently when four numbers are entered.
I have some permutations working now, but I still cannot enumerate all cases because I don't know how to code for the cases where the operations are the same.
Also, what is the easiest way to calculate the RPN? I came across many pages such as this one:
http://www.dreamincode.net/forums/index.php?showtopic=15406
but as a beginner, I am not sure how to implement it.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
bool MakeSum(int num1, int num2, int num3, int num4)
{
vector<int> vi;
vi.push_back(num1);
vi.push_back(num2);
vi.push_back(num3);
vi.push_back(num4);
sort(vi.begin(),vi.end());
char a1 = '+';
char a2 = '-';
char a3 = '*';
char a4 = '/';
vector<char> va;
va.push_back(a1);
va.push_back(a2);
va.push_back(a3);
va.push_back(a4);
sort(va.begin(),va.end());
while(next_permutation(vi.begin(),vi.end()))
{
while(next_permutation(va.begin(),va.end()))
{
cout<<vi[0]<<vi[1]<<vi[2]<<vi[3]<< va[0]<<va[1]<<va[2]<<endl;
cout<<vi[0]<<vi[1]<<vi[2]<<va[0]<< vi[3]<<va[1]<<va[2]<<endl;
cout<<vi[0]<<vi[1]<<vi[2]<<va[0]<< va[1]<<vi[3]<<va[2]<<endl;
cout<<vi[0]<<vi[1]<<va[0]<<vi[2]<< vi[3]<<va[1]<<va[2]<<endl;
cout<<vi[0]<<vi[1]<<va[0]<<vi[2]<< va[1]<<vi[3]<<va[2]<<endl;
}
}
return 0;
}
int main()
{
MakeSum(5,7,2,1);
return 0;
}

So, the simple way is to permute through all possible combinations. This is slightly tricky, the order of the numbers can be important, and certainly the order of operations is.
One observation is that you are trying to generate all possible expression trees with certain properties. One property is that the tree will always have exactly 4 leaves. This means the tree will also always have exactly 3 internal nodes. There are only 3 possible shapes for such a tree:
  A
 / \
N   A
   / \        (and the mirror image)
  N   A
     / \
    N   N

  A
 / \
N   A
   / \
  A   N       (and the mirror image)
 / \
N   N

     A
   /   \
  A     A
 / \   / \
N   N N   N
In each spot for A you can have any one of the 4 operations. In each spot for N you can have any one of the numbers. But each number can only appear for one N.
Coding this as a brute force search shouldn't be too hard, and I think that after you have things done this way it will become easier to think about optimizations.
For example, + and * are commutative. This means that mirrors that flip the left and right children of those operations will have no effect. It might be possible to cut down searching through all such flips.
Someone else mentioned RPN notation. The trees directly map to this. Here is a list of all possible trees in RPN:
N N N N A A A
N N N A N A A
N N N A A N A
N N A N N A A
N N A N A N A
That's 4*3*2 = 24 possibilities for numbers, 4*4*4 = 64 possibilities for operations, 24 * 64 * 5 = 7680 total possibilities for a given set of 4 numbers. Easily countable and can be evaluated in a tiny fraction of a second on a modern system. Heck, even in basic on my old Atari 8 bit I bet this problem would only take minutes for a given group of 4 numbers.
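A brute-force search over the five RPN shapes listed above might look like the following sketch. The names eval_rpn and can_make_24 are made up for illustration, doubles with a small tolerance stand in for exact fractions, and a do/while loop is used so that the first (sorted) permutation is also tried, which the while(next_permutation(...)) loop in the question skips.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Evaluates one RPN shape (N = next number, A = next operator).
// Returns no value if a division by zero occurs.
std::optional<double> eval_rpn(const std::string& shape,
                               const std::array<double, 4>& nums,
                               const std::array<char, 3>& ops) {
    std::vector<double> stack;
    std::size_t ni = 0, oi = 0;
    for (char token : shape) {
        if (token == 'N') { stack.push_back(nums[ni++]); continue; }
        const double rhs = stack.back(); stack.pop_back();
        const double lhs = stack.back(); stack.pop_back();
        switch (ops[oi++]) {
            case '+': stack.push_back(lhs + rhs); break;
            case '-': stack.push_back(lhs - rhs); break;
            case '*': stack.push_back(lhs * rhs); break;
            default:
                if (std::fabs(rhs) < 1e-9) return std::nullopt;
                stack.push_back(lhs / rhs);
        }
    }
    return stack.back();
}

// Tries every number order, operator triple and tree shape.
bool can_make_24(std::array<double, 4> nums) {
    static const std::array<std::string, 5> shapes = {
        "NNNNAAA", "NNNANAA", "NNNAANA", "NNANNAA", "NNANANA"};
    static const std::string symbols = "+-*/";
    std::sort(nums.begin(), nums.end());
    do {
        for (const std::string& shape : shapes)
            for (char a : symbols)
                for (char b : symbols)
                    for (char c : symbols) {
                        const std::array<char, 3> ops{a, b, c};
                        const std::optional<double> value = eval_rpn(shape, nums, ops);
                        if (value && std::fabs(*value - 24.0) < 1e-9) return true;
                    }
    } while (std::next_permutation(nums.begin(), nums.end()));
    return false;
}
```

For instance, can_make_24({2, 3, 1, 5}) finds the (2+3)*5 - 1 solution from the question. Since the operators are chosen independently per slot, repeated operators need no special handling.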

You can just use Reverse Polish Notation to generate the possible expressions, which should remove the need for parentheses.
An absolutely naive way to do this would be to generate all possible strings of 4 digits and 3 operators (paying no heed to validity as an RPN), assume it is in RPN and try to evaluate it. You will hit some error cases (as in invalid RPN strings). The total number of possibilities (if I calculated correctly) is ~50,000.
A more clever way should get it down to ~7500 I believe (64*24*5 to be exact): Generate a permutation of the digits (24 ways), generate a triplet of 3 operators (4^3 = 64 ways) and now place the operators among the digits to make it valid RPN(there are 5 ways, see Omnifarious' answer).
You should be able to find permutation generators and RPN calculators easily on the web.
Hope that helps!
PS: Just FYI: RPN is nothing but the postorder traversal of the corresponding expression tree, and for d digits, the number is d! * 4^(d-1) * Choose(2(d-1), (d-1))/d. (The last term is a catalan number).
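As a quick check of that count, the formula d! * 4^(d-1) * Catalan(d-1) can be evaluated directly. This is just an illustrative sketch: the name count_rpn_expressions is made up, it assumes d >= 1, and it only stays within 64 bits for small d.

```cpp
#include <cstdint>

// d! * 4^(d-1) * Catalan(d-1): orderings of the numbers, choices of
// operators, and valid RPN shapes. Assumes d >= 1 and small d.
std::uint64_t count_rpn_expressions(unsigned d) {
    std::uint64_t factorial = 1;
    for (unsigned i = 2; i <= d; ++i) factorial *= i;

    std::uint64_t operator_choices = 1;
    for (unsigned i = 1; i < d; ++i) operator_choices *= 4;

    // Catalan numbers via C(k+1) = C(k) * 2 * (2k + 1) / (k + 2); the division is exact.
    std::uint64_t catalan = 1;
    for (unsigned k = 0; k + 1 < d; ++k) catalan = catalan * 2 * (2 * k + 1) / (k + 2);

    return factorial * operator_choices * catalan;
}
```

For d = 4 this gives the same 24 * 64 * 5 product as above.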

Edited: The solution below is wrong. We also need to consider the numbers makeable with just x_2 and x_4, and with just x_1 and x_4. This approach can still work, but it's going to be rather more complex (and even less efficient). Sorry...
Suppose we have four numbers x_1, x_2, x_3, x_4. Write
S = { all numbers we can make just using x_3, x_4 },
Then we can rewrite the set we're interested in, which I'll call
T = { all numbers we can make using x_1, x_2, x_3, x_4 }
as
T = { all numbers we can make using x_1, x_2 and some s from S }.
So an algorithm is to generate all possible numbers in S, then use each number s in S in turn to generate part of T. (This will generalise fairly easily to n numbers instead of just 4).
Here's a rough, untested code example:
#include <set> // we can use std::set to store integers without duplication
#include <vector> // we might want duplication in the inputs

// the 2-number special case
std::set<int> all_combinations_from_pair(int a, int b)
{
    std::set<int> results;
    // here we just use brute force
    results.insert(a+b); // = b+a
    results.insert(a-b);
    results.insert(b-a);
    results.insert(a*b); // = b*a
    // need to make sure it divides exactly (and never divide by zero)
    if (b != 0 && a%b == 0) results.insert(a/b);
    if (a != 0 && b%a == 0) results.insert(b/a);
    return results;
}

// the general case (assumes at least two inputs)
std::set<int> all_combinations_from(const std::vector<int>& inputs)
{
    if (inputs.size() == 2)
    {
        return all_combinations_from_pair(inputs[0], inputs[1]);
    }
    else
    {
        std::set<int> S = all_combinations_from_pair(inputs[0], inputs[1]);
        std::set<int> T;
        // the inputs without the first two
        std::vector<int> rest(inputs.begin() + 2, inputs.end());
        for (std::set<int>::iterator i = S.begin(); i != S.end(); ++i)
        {
            std::vector<int> new_inputs = rest;
            new_inputs.push_back(*i);
            std::set<int> new_outputs = all_combinations_from(new_inputs);
            T.insert(new_outputs.begin(), new_outputs.end()); // set union
        }
        return T;
    }
}

If you are allowed to use the same operator twice, you probably don't want to mix the operators into the numbers. Instead, perhaps use three 0's as a placeholder for where operations will occur (none of the 4 numbers are 0, right?) and use another structure to determine which operations will be used.
The second structure could be a vector<int> initialized with three 1's followed by three 0's. The 0's correspond to the 0's in the number vector. If a 0 is preceded by zero 1's, the corresponding operation is +, if preceded by one 1, it's -, etc. For example:
6807900 <= equation of form ( 6 # 8 ) # ( 7 # 9 )
100110 <= replace #'s with (-,-,/)
possibility is (6-8)-(7/9)
Advance through the operation possibilities using next_permutation in an inner loop.
By the way, you can also return early if the number-permutation is an invalid postfix expression. All permutations of the above example less than 6708090 are invalid, and all greater are valid, so you could start with 9876000 and work your way down with prev_permutation.

Look up the Knapsack problem (here's a link to get you started: http://en.wikipedia.org/wiki/Knapsack_problem), this problem is pretty close to that, just a little harder (and the Knapsack problem is NP-complete!)

One thing that might make this faster than normal is parallelisation. Check out OpenMP. Using this, more than one check is carried out at once (your "alg" function) thus if you have a dual/quad core cpu, your program should be faster.
That said, if as suggested above the problem is NP-complete, it'll be faster, not necessarily fast.

I wrote something like this before. You need a recursive evaluator. Call evaluate; when you hit "(", call evaluate again; otherwise run along with digits and operators until you hit ")", then return the result of the -+*/ operations to the evaluate instance above you.
