How to implement binary search on function in C++? - c++

I have a situation in C++ with a costly subproblem: a single evaluation takes nearly a minute. Worse, this subproblem, bool f(x), takes longer as x grows over the range [1,n].
On the range [1,n] there is a point j with 1<j<n and f(j) = true, such that f(k) is false for every k < j and f(m) is true for every m > j.
I need to find that point.
The way I need to do it is with binary search starting at x=1 (so that it never even touches the time consuming region, where x is close to n).
Right now, I am manually slogging through a binary search implementation where I start at a minimum value for a function (so, f(1), which is false). Then I double the input value until I reach a state where f(x) is happy (true). This part is no big deal. I can do it manually fine.
Next, I want to binary search on the range [x/2,x] for the first value where f(x) = true (noting that f(x/2) must equal false, since f(1) = false and the doubling stops at the first true value), and I need to do it without making any mistakes. This is where things are getting a little hairy.
So I have a creeping suspicion that C++ has this already implemented, but I am new to C++, and have limited experience with the libraries. I am currently looking at binary search, but I don't see a way to bin search for the actual point at which all values of a function change from false to true.
Rather, it seems to be greedy: it would just grab the first true it finds, rather than the minimum true.
How would I do this in C++?
The inputs of my search function would be as follows:
std::vector<int> range(max);
for (int j = 0; j < max; j++){
range[j] = j;
}
std::some_search_function(range.begin(),range.end(),f(int value))
where f(...) is the boolean function from before...
And it needs to have the properties of what I am describing: start from the first value in the item it is searching, and return the earliest item where f(...) is satisfied...

You can implement a custom iterator that "points" to true or false depending on the index it represents and use std::lower_bound function to find the first true value.
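As a hand-rolled alternative to the iterator approach, here is a minimal sketch of the doubling-then-bisecting procedure the question describes. The name first_true is made up for illustration (it is not a standard-library function), and the sketch assumes that pred is false up to some point and true from there on, and that pred(n) is true.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical helper (not part of the standard library): returns the
// smallest x in [1, n] with pred(x) == true. Assumes pred is monotone
// (false, ..., false, true, ..., true) and that pred(n) is true.
long long first_true(long long n, const std::function<bool(long long)>& pred) {
    assert(n >= 1);
    long long lo = 0;  // every value <= lo is known (or defined) to be false
    long long hi = 1;  // current upper end of the bracket

    // Phase 1: double hi until pred(hi) holds or hi reaches n (assumed true).
    while (hi < n && !pred(hi)) {
        lo = hi;
        hi = std::min(2 * hi, n);
    }

    // Phase 2: bisect the bracket (lo, hi]; pred(lo) is false, pred(hi) is true.
    while (hi - lo > 1) {
        const long long mid = lo + (hi - lo) / 2;
        if (pred(mid)) {
            hi = mid;
        } else {
            lo = mid;
        }
    }
    return hi;
}
```

Because the bracket only ever grows by doubling, the largest argument passed to pred is less than twice the answer, so the expensive region near n is never touched unless the answer really lies there.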

Related

Complexity of searching in a set of sets (C++)

I have a set of sets of positive integers, std::set<std::set<int> > X. Now I am given a set std::set<int> V and I want to know if it occurs in X. Obviously, this can be done by invoking the function find, so X.find(V) != X.end() should return true if V is in X.
My question is about the complexity of this operation, i.e. if X contains n sets of positive integers, what is time complexity of X.find(V)?
Searching in a set takes O(log n) comparisons in the number of elements, regardless of what the elements are composed of, even other sets. If the element is another set, all you need is an ordering predicate (std::set already provides a lexicographical operator<, so the default works). However, searching for an integer nested in the set of sets is going to be O(m log n) in general.
Suppose there are e sets in X such that the summation of sizes of all e sets is n, i.e., |S1| + |S2| + ... + |Se| = n then in the worst case X.find(V) will take O(m*log(e)) where m is the size of V, i.e., |V| = m. As you can see it is independent of n.
Why? A set in the STL is typically implemented as a self-balancing binary search tree, so the height of the tree is always O(log(e)), where e is the number of elements currently in the tree. Now notice that in our case the nodes of the tree are sets. By default, set uses the less-than operator < to compare with another set of the same type, which takes O(min(|S1|, |S2|)) time.
Therefore in the worst case, if the set V we want to find is one of the leaves of X and all the nodes on the branch from the root to V have size >= |V| then every node comparison will take O(|V|) time and since there are O(log(e)) nodes on this branch, it'll take us O(m*log(e)) time.
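For reference, a minimal sketch of the lookup being analysed. The helper name contains_set is made up for illustration; the only library call it relies on is std::set::find, which uses the default lexicographical operator< on the inner sets.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: true if the set V is itself an element of X.
// The tree descent performs O(log e) comparisons, where e == X.size(),
// and each comparison is a lexicographical compare of two std::set<int>.
bool contains_set(const std::set<std::set<int>>& X, const std::set<int>& V) {
    return X.find(V) != X.end();
}
```

Note that this answers "is V one of the sets in X?", not "does some set in X contain a given integer?"; the latter needs a separate search inside each inner set.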

Most concise way of checking if a number falls between two other numbers?

Suppose I have three numbers. Two of them form a range between them, and I want to check whether the third falls within that range. There is one caveat: the numbers that define the range's start and end may be in either order. This is for a physics algorithm whose performance I'm working to improve, so I also want to avoid using conditional statements.
double inRange(double point, double rangeStart, double rangeEnd){
// returns true if the 'point' lies within the range
// the 'range' is every number between 'rangeStart' and 'rangeEnd'
// rangeStart can be greater than or less than rangeEnd
// conditional branches should be avoided
return ?; // return values [0.0 - 1.0] are considered 'in range'
}
Is there a mathematical equation to accomplish this, without using condition logic?
edit:
The reason it returns a double instead of a bool, is because I need to know the ratio too; 0.0 is closest to one edge while 1.0 is closest to the other.
The original algorithm I have is this:
double inRange(double point, double rangeStart, double rangeEnd){
if(rangeStart > rangeEnd){
double temp = rangeStart;
rangeStart = rangeEnd;
rangeEnd = temp;
}
return (point - rangeStart) / (rangeEnd - rangeStart);
}
My profiler shows about 16% of the time the program is running, is spent in this function, with optimizations enabled. It's called pretty frequently. Not sure if the condition statement is entirely to blame, but I would like to try a function that doesn't have one and see.
To meet your specification ("it should return zero when close to the start and 1 when close to the end", no conditionals, and start and end might be swapped):
return (point-std::min(rangeStart, rangeEnd))/std::abs(rangeStart - rangeEnd);
Note that although I don't know about the particular STL implementation, min does not necessarily require conditionals to be implemented. For instance, min(a,b) = (a+b-abs(b-a))/2.
If the start is larger than the end, then swap those.
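Putting the std::min / std::abs formula from the answer above into the original signature gives roughly the following. This is only a sketch: whether the compiler actually emits branch-free code for std::min and std::abs depends on the target and optimization level, and the assert for a zero-width range is an addition that the original function did not have.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Returns the position of 'point' within the range as a ratio measured from
// the smaller bound; values in [0.0, 1.0] mean 'point' is inside the range.
// rangeStart may be greater than or less than rangeEnd.
double inRange(double point, double rangeStart, double rangeEnd) {
    // Added assumption: a zero-width range is not allowed (would divide by 0).
    assert(rangeStart != rangeEnd);
    return (point - std::min(rangeStart, rangeEnd))
         / std::abs(rangeStart - rangeEnd);
}
```

For example, inRange(2.5, 10.0, 0.0) and inRange(2.5, 0.0, 10.0) both give 0.25, matching the behaviour of the original swap-based version.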

My recursive function does not return the correct value

I wrote a recursive function that computes the sum of an array of double. For some reason, the value returned by my recursive function is not correct: my recursive sum does not match my iterative sum. I know I made a little mistake somewhere, but I can't see where. Your help will be very appreciated. I only pasted the recursive function. I am using C++ on Visual Studio. Thanks!
double recursive_sum(double array_nbr[], int size_ar)
{ double rec_sum=0.0;
if( size_ar== 0)
return -1;
else if( size_ar> 0)
rec_sum=array_nbr[size_ar-1]+recursive_sum(array_nbr,size_ar-1);
return rec_sum;
}
//#### Output######
The random(s) number generated in the array =
0.697653 | 0.733848 | 0.221564 |
Recursive sum: 0.653066
Iterative sum: 1.65307
Press any key to continue . . .
Well, because sum of no elements is zero, not minus one.
if (size_ar == 0)
return 0.0;
Think about it this way: sum(1,2,3) is the same as sum(1,2) + sum(3), just as it is the same as sum(1,2,3) + sum(). In all three cases you add 1, 2, and 3 together, just in slightly different ways. That's also why the product of no elements is one.
Try changing "if( size_ar== 0) return -1;" to return 0.
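With that change applied, the function from the question would look roughly like this. The const qualifier and the assert on a negative size are small additions for the sketch, not part of the original code.

```cpp
#include <cassert>

// Recursive sum of the first size_ar elements of array_nbr.
double recursive_sum(const double array_nbr[], int size_ar) {
    assert(size_ar >= 0);  // added guard: a negative size makes no sense here
    if (size_ar == 0)
        return 0.0;        // the sum of no elements is zero, not -1
    return array_nbr[size_ar - 1] + recursive_sum(array_nbr, size_ar - 1);
}
```

The -1 in the original base case is exactly the difference between the two sums in the posted output (0.653066 vs. 1.65307).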
While this does not account for the large discrepancy in your output, another thing to keep in mind once you have fixed the -1 vs. 0 issue is the order of operations. IEEE floating-point addition is commutative but not associative, so make sure that your recursive and iterative methods add up the numbers in exactly the same order; otherwise your output may still differ by some epsilon value.
For instance, your recursive method computes array_nbr[n-1] + (array_nbr[n-2] + (... + array_nbr[0])). If your iterative loop groups the additions differently, the non-associativity of floating-point math may give you a slightly different value (small epsilon). This probably won't show on a simple cout, where the floating-point values are rounded to a fixed number of digits, but should you attempt to use the == operation on the two different summations without incorporating some epsilon value, the result may still test false.
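A small sketch of the effect, assuming ordinary IEEE 754 double arithmetic. The function names are made up for illustration, and nearly_equal uses a plain absolute tolerance, which is enough for values of this size but is not a general-purpose comparison.

```cpp
#include <cassert>
#include <cmath>

// Adds the elements from first to last.
double sum_forward(const double a[], int n) {
    double s = 0.0;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}

// Adds the same elements from last to first.
double sum_backward(const double a[], int n) {
    double s = 0.0;
    for (int i = n - 1; i >= 0; --i) s += a[i];
    return s;
}

// Hypothetical helper: compares with an absolute tolerance instead of ==.
bool nearly_equal(double x, double y, double eps = 1e-9) {
    assert(eps >= 0.0);
    return std::fabs(x - y) <= eps;
}
```

With the array {0.1, 0.2, 0.3}, the two sums differ in the last bit on typical hardware, so == reports them as different while nearly_equal accepts them.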

Recursion: Understanding (subset-sum) inclusion/exclusion pattern

I need to understand how this recursion works. I understand simple recursion examples, but more advanced ones are hard. Even though there are just two lines of code I have a problem with... the return statement itself. I just draw a blank on how this works, especially the and/or operator. Any insight is very welcome.
bool subsetSumExists(Set<int> & set, int target) {
if (set.isEmpty()) {
return target == 0;
} else {
int element = set.first();
Set<int> rest = set - element;
return subsetSumExists(rest, target)
|| subsetSumExists(rest, target - element);
}
}
Recursive code is normally coupled with the concept of reduction. In general, reduction is a means to reduce an unknown problem to a known one via some transformation.
Let's take a look at your code. You need to find whether a given target sum can be constructed from elements of the input data set.
If the data set is empty, there is nothing to do besides comparing the target sum to 0.
Otherwise, let's apply the reduction. If we choose a number from the set, there can actually be 2 possibilities - the chosen number participates in the sum you're seeking or it doesn't. No other possibilities here (it's very important to cover the full spectrum of possibilities!). In fact, it doesn't really matter which data element is chosen as long as you can cover all the possibilities for the remaining data.
First case: the number doesn't participate in the sum. We can reduce the problem to a smaller one, with data set without the inspected element and the same target sum.
Second case: the number participates in the sum. We can reduce the problem to a smaller one, with data set without the inspected element and the requested sum decreased by the value of the number.
Note, you don't know at this point whether any of these cases is true. You just continue reducing them until you get to the trivial empty case where you can know for sure the answer.
The answer to the original question would be true if it's true for any of these 2 cases. That's exactly what operator || does - it will yield true if any of its operands (the outcome of the 2 cases) are true.
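The two cases can be written out in standard C++ as well. The question's Set<int> appears to come from a course library, so this sketch swaps in std::vector<int> plus a start index: "rest" is simply everything from first + 1 onwards. That substitution is an assumption made for the example, not the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True if some subset of values[first..] adds up to target.
bool subsetSumExists(const std::vector<int>& values, std::size_t first, int target) {
    assert(first <= values.size());
    if (first == values.size())
        return target == 0;               // trivial case: nothing left to pick
    const int element = values[first];
    return subsetSumExists(values, first + 1, target)             // case 1: element not used
        || subsetSumExists(values, first + 1, target - element);  // case 2: element used
}
```

Called as subsetSumExists(values, 0, target); for {3, 5} it reports true for 8, 5, 3 and 0, and false for 4.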
|| is logical OR. It's evaluated left-to-right and short-circuited.
This means that in an expression A || B, A is evaluated first. If it's true, the entire expression is true and no further evaluation is done. If A is false, B is evaluated and the expression gets the value of B.
In your example, A is "try getting the same sum without using the 1st element from the set". B is "use the 1st element from the set, which decreases the total left to sum, and try to get that with the rest of the element."
Let's first look at the algorithm.
The base case (i.e. the case in which recursion terminates) is when the set is empty.
Otherwise the program takes the first element and removes it from the set.
Now it will call subsetSumExists(rest, target) and check if it's true;
if it is, it will return true, otherwise it will call
subsetSumExists(rest, target - element) and return whatever it
returns.
In simple terms, it will call subsetSumExists(rest, target - element) only if the first call, subsetSumExists(rest, target), returns false.
Now let's dry-run this code with a small sample set of {3,5} and a sum of 8. I'll call the function sSE from now on.
sSE({3,5}, 8) => "sSE({5}, 8) || sSE({5},(8-3))"
sSE({5}, 8) => sSE({}, 8) || sSE({}, (8-5))
sSE({}, 8) => false.. now will call sSE({}, (8-5))
sSE({}, 3) => false.. now will call sSE({5}, (8-3))
sSE({5}, 5) => sSE({}, 5) || sSE({}, (5-5))
sSE({}, 5) => false.. now will call sSE({}, (5-5))
sSE({}, 0) => true.. ends here and return true
To understand recursion, you need to understand recursion.
To do that, you need to think recursively.
In this particular case:
For any subsetSum(set, target):
If set is empty AND target is 0, then subsetSum exists
Otherwise, remove the first element of the set and check if subsetSum(set, target) exists OR subsetSum(set, target - removed_element) exists (applying these same rules to the smaller set)
The set subtraction looks like strange syntax, but I will assume it means removing that element from the set (a pop()).
It "works" through finding every possible combination although it is exponential.
In the || statement, the LHS is the sum excluding the current element and the RHS is the sum including it. So you will get, down the exponential tree, every combination of each element either switched on or off.
Exponential, by the way, means that if you have 30 elements it will produce 2 to the power of 30, i.e. 0x40000000 or just over a billion combinations.
Of course you may well run out of memory.
If it finds the solution it might not run through all 2^N cases. If there is no solution it will always visit them all.
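To see the short-circuiting and the exponential worst case in numbers, here is a sketch that counts calls. It reuses the same recursion on a std::vector<int> with a start index (an assumption for the example, since the question's Set<int> is not a standard type); countedSubsetSum and the calls counter are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same recursion as in the question, but every invocation bumps 'calls'.
bool countedSubsetSum(const std::vector<int>& values, std::size_t first,
                      int target, long long& calls) {
    assert(first <= values.size());
    ++calls;
    if (first == values.size())
        return target == 0;
    // The right-hand call only runs if the left-hand one returned false.
    return countedSubsetSum(values, first + 1, target, calls)
        || countedSubsetSum(values, first + 1, target - values[first], calls);
}
```

For {1, 2, 4} with an unreachable target of 100, all 2^(3+1) - 1 = 15 calls are made. For {3, 5} with target 0 the very first path succeeds after 3 calls, and with target 8 the solution is on the last path, so all 7 calls are needed.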
Speaking for myself, the difficulty in understanding this code stems from the || operator. Let's look at the final return statement of the same code written another way:
if (subsetSumExists(rest, target))
return true;
if (subsetSumExists(rest, target - element))
return true;
return false;

Recursive Backtracking Sudoku Solver Problems, c++

It's my first time dealing with recursion as an assignment in a low level course. I've looked around the internet and I can't seem to find anybody using a method similar to the one I've come up with (which probably says something about why this isn't working). The error is a segmentation fault in std::__copy_move... which I'm assuming is something in the c++ STL.
Anywho, my code is as follows:
bool sudoku::valid(int x, int y, int value)
{
if (x < 0) {cerr << "No valid values exist./n";}
if (binary_search(row(x).begin(), row(x).end(), value))
{return false;} //if found in row x, exit, otherwise:
else if (binary_search(col(y).begin(), col(y).end(), value))
{return false;} //if found in col y, exit, otherwise:
else if (binary_search(box((x/3), (y/3)).begin(), box((x/3), (y/3)).end(), value))
{return false;} //if found in box x,y, exit, otherwise:
else
{return true;} //the value is valid at this index
}
int sudoku::setval(int x, int y, int val)
{
if (y < 0 && x > 0) {x--; y = 9;} //if y gets decremented past 0 go to previous row.
if (y > 8) {y %= 9; x++;} //if y get incremented past 8 go to next row.
if (x == 9) {return 0;} //base case, puzzle done.
else {
if (valid(x,y,val)){ //if the input is valid
matrix[x][y] = val; //set the element equal to val
setval(x,y++,val); //go to next element
}
else {
setval(x,y,val++); //otherwise increment val
if(val > 9) {val = value(x,y--); setval(x,y--,val++); }
} //if val gets above 9, set val to prev element,
} //and increment the last element until valid and start over
}
I've been trying to wrap my head around this thing for a while and I can't seem to figure out what's going wrong. Any suggestions are highly appreciated! :)
sudoku::setval is supposed to return an int but there are at least two paths where it returns nothing at all. You should figure out what it needs to return in those other paths because otherwise you'll be getting random undefined behavior.
Without more information, it's impossible to tell. Things like the data
structures involved, and what row and col return, for example.
Still, there are a number of obvious problems:
In sudoku::valid, you check for what is apparently an error
condition (x < 0), but you don't return; you still continue your
tests, using the negative value of x.
Also in sudoku::valid: do row and col really return references to
sorted values? If the values aren't sorted, then binary_search will
have undefined behavior (and if they are, the names are somewhat
misleading). And if they return values (copies of something), rather
than a reference to the same object, then the begin() and end()
functions will refer to different objects—again, undefined
behavior.
Finally, I don't see any backtracking in your algorithm, and I don't
see how it progresses to a solution.
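To illustrate the point about binary_search and temporaries: if the accessor returns a copy in board order, one safe pattern is to keep that copy in a single local object and use a linear search, which does not need sorted data. The names row_copy and row_contains are hypothetical stand-ins, since the question does not show what row actually returns.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for sudoku::row(x): returns a COPY of row x,
// in board order (so generally not sorted).
std::vector<int> row_copy(const std::vector<std::vector<int>>& matrix, int x) {
    assert(x >= 0 && x < static_cast<int>(matrix.size()));
    return matrix[x];
}

bool row_contains(const std::vector<std::vector<int>>& matrix, int x, int value) {
    // One named object, so begin() and end() refer to the same container.
    const std::vector<int> r = row_copy(matrix, x);
    // std::find is a linear search and has no sortedness requirement.
    return std::find(r.begin(), r.end(), value) != r.end();
}
```

Writing row_copy(m, x).begin() and row_copy(m, x).end() in one expression would instead create two separate temporaries, which is the mismatched-iterator problem described above.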
FWIW: when I wrote something similar, I used a simple array of 81
elements for the board, then created static arrays which mapped the
index (0–80) to the appropriate row, column and box. And for each of
the nine rows, columns and boxes, I kept a set of used values (a
bitmap); this made checking for legality trivial, and it meant that
I could increment to the next square to test just by incrementing the
index. The resulting code was extremely simple.
Independently of the data representation used, you'll need: some
"global" (probably a member of sudoku) means of knowing whether you've
found the solution or not; a loop somewhere trying each of the nine
possible values for a square (stopping when the solution has been
found), and the recursion. If you're not using a simple array for the
board, as I did, I'd suggest a class or a struct for the index, with a
function which takes care of the incrementation once and for all.
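A rough sketch of that layout, under a few assumptions: the struct and member names are invented for illustration, the index-to-row/column/box mappings are computed on the fly rather than stored in static arrays, and "solution found" is signalled through the return value of solve rather than a separate member flag.

```cpp
#include <array>
#include <cassert>

// Hypothetical sketch: 81 cells in one array (0 = empty) plus a bitmap of
// used values for each of the nine rows, columns and boxes.
struct Board {
    std::array<int, 81> cell{};
    std::array<unsigned, 9> rowUsed{}, colUsed{}, boxUsed{};

    static int rowOf(int i) { return i / 9; }
    static int colOf(int i) { return i % 9; }
    static int boxOf(int i) { return (i / 27) * 3 + (i % 9) / 3; }

    // Legal if value v is not yet used in the cell's row, column or box.
    bool canPlace(int i, int v) const {
        const unsigned bit = 1u << v;
        return ((rowUsed[rowOf(i)] | colUsed[colOf(i)] | boxUsed[boxOf(i)]) & bit) == 0;
    }

    void place(int i, int v) {
        assert(cell[i] == 0 && v >= 1 && v <= 9 && canPlace(i, v));
        const unsigned bit = 1u << v;
        cell[i] = v;
        rowUsed[rowOf(i)] |= bit;
        colUsed[colOf(i)] |= bit;
        boxUsed[boxOf(i)] |= bit;
    }

    // Undo a placement: this is the backtracking step.
    void unplace(int i) {
        const unsigned bit = 1u << cell[i];
        rowUsed[rowOf(i)] &= ~bit;
        colUsed[colOf(i)] &= ~bit;
        boxUsed[boxOf(i)] &= ~bit;
        cell[i] = 0;
    }

    // Fills empty cells from index i onwards; returns true once solved.
    bool solve(int i = 0) {
        while (i < 81 && cell[i] != 0) ++i;  // skip cells that are already set
        if (i == 81) return true;            // base case: board is full
        for (int v = 1; v <= 9; ++v) {       // loop over the nine candidates
            if (!canPlace(i, v)) continue;
            place(i, v);
            if (solve(i + 1)) return true;   // recursion
            unplace(i);                      // dead end: try the next value
        }
        return false;                        // caller must backtrack
    }
};
```

Given values would be entered with place before calling solve(); every path through solve returns a value, and moving to the next square is just an increment of the index.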
All of the following is for Unix not Windows.
std::__copy_move... is part of the STL, all right. But the STL doesn't do anything by itself; some function call from your code must have invoked it with wrong arguments or in a wrong state. You need to figure that out.
If you have a core dump from the seg-fault then just do a pstack <core file name>, and you will see the full call stack of the crash. Then just see which part of your code was involved in it and start debugging (add traces/couts/...) from there.
Usually you'll get this core file with nice readable names, but in case you don't, you can use nm or c++filt etc. to demangle the names.
Finally, pstack is just a small cmd line utility, you can always load the binary (that produced the core) and the core file into a debugger like gdb, Sun Studio or debugger built into your IDE and see the same thing along with lots of other info and options.
HTH
It seems like your algorithm is a bit "brute forcy". This is generally not a good tactic with Constraint Satisfaction Problems (CSPs). I wrote a sudoku solver a while back (wish I still had the source code, it was before I discovered github) and the fastest algorithm that I could find was Simulated Annealing:
http://en.wikipedia.org/wiki/Simulated_annealing
It's probabilistic, but it was generally orders of magnitude faster than other methods for this problem IIRC.
HTH!
A segmentation fault may (and will) happen if you enter a function recursively too many times.
I noted one scenario which leads to it, but I'm pretty sure there are more.
Tip: write in your words the purpose of any function - if it is too complicated to write - the function should probably be split...