I'm trying to implement a function that looks at each element of an array and determines whether that element is larger than one int and less than another int. For example:
Return true if Arr[5] is > i && < u
I have a basic algorithm that works, but I want to write more efficient code using the 'divide and conquer' approach. However, I'm having trouble getting the recursion right, and all the examples I've seen deal with only one point of comparison, not two. Can anyone shed some light on this?
(http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm)
My original code (linear):
bool SimpleSort(int length)
{
    int A[] = {25,5,20,10,50};
    int i = 25; // upper limit ("less than")
    int u = 2;  // lower limit ("greater than")
    for(int Count = 0; Count < length; Count++) //Counter
    {
        if (A[Count] <= i && A[Count] >= u) //Checker
            return true;
    }
    return false;
}
Example code from what I've picked up so far (no luck after many hours of trying various things and different example code):
int A[] = {5,10,25,30,50,100,200,500,1000,2000};
int i = 10; //Less Than
int u = 5; //Greater Than
int min = 1;
int max = length;
int mid = (min+max)/2;
if (i < A[mid] && u > A[mid])
{
min = mid + 1;
}
else
{
max = mid - 1;
}
Until i <= A1[mid] && u >= A1[mid])
If this question is not clear I'm sorry, do ask if you need me to elaborate on anything.
Assuming your input vector is always sorted, I think something like this may work for you. This is the simplest form I could come up with, and the performance is O(log n):
bool inRange(int lval, int uval, int ar[], size_t n)
{
if (0 == n)
return false;
size_t mid = n/2;
if (ar[mid] >= std::min(lval,uval))
{
if (ar[mid] <= std::max(lval,uval))
return true;
return inRange(lval, uval, ar, mid);
}
return inRange(lval, uval, ar+mid+1, n-mid-1);
}
This uses implied range differencing; i.e. it always uses the lower of the two values as the lower bound and the higher of the two as the upper bound. If your usage mandates that the input values for lval and uval are to be treated as gospel, and therefore any call where lval > uval should return false (since no value can satisfy it), you can remove the std::min() and std::max() calls. In either case, you can further improve performance by adding an outer wrapper that checks the order of lval and uval once, and either (a) returns false immediately if strict ordering is required and lval > uval, or (b) puts lval and uval in the proper order if range differencing is the requirement. Examples of both outer wrappers are shown below:
// search for any ar[i] such that (lval <= ar[i] <= uval)
// assumes ar[] is sorted, and (lval <= uval).
bool inRange_(int lval, int uval, int ar[], size_t n)
{
if (0 == n)
return false;
size_t mid = n/2;
if (ar[mid] >= lval)
{
if (ar[mid] <= uval)
return true;
return inRange_(lval, uval, ar, mid);
}
return inRange_(lval, uval, ar+mid+1, n-mid-1);
}
// use lval and uval as a hard range of [lval,uval].
// i.e. short-circuit the impossible case of lower-bound
// being greater than upper-bound.
bool inRangeAbs(int lval, int uval, int ar[], size_t n)
{
if (lval > uval)
return false;
return inRange_(lval, uval, ar, n);
}
// use lval and uval as un-ordered limits. i.e always use either
// [lval,uval] or [uval,lval], depending on their values.
bool inRange(int lval, int uval, int ar[], size_t n)
{
return inRange_(std::min(lval,uval), std::max(lval,uval), ar, n);
}
I have left the one I think you want as inRange. The unit tests performed to hopefully cover main and edge cases are below along with the resulting output.
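#include <iostream>
#include <algorithm>
#include <vector>
#include <iomanip>
#include <iterator>
#include <cstdlib> // srand, rand
#include <ctime> // time
using namespace std; // the tests below use unqualified cout, endl and generate_n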
int main(int argc, char *argv[])
{
int A[] = {5,10,25,30,50,100,200,500,1000,2000};
size_t ALen = sizeof(A)/sizeof(A[0]);
srand((unsigned int)time(NULL));
// inner boundary tests (should all answer true)
cout << inRange(5, 25, A, ALen) << endl;
cout << inRange(1800, 2000, A, ALen) << endl;
// limit tests (should all answer true)
cout << inRange(0, 5, A, ALen) << endl;
cout << inRange(2000, 3000, A, ALen) << endl;
// midrange tests. (should all answer true)
cout << inRange(26, 31, A, ALen) << endl;
cout << inRange(99, 201, A, ALen) << endl;
cout << inRange(6, 10, A, ALen) << endl;
cout << inRange(501, 1500, A, ALen) << endl;
// identity tests. (should all answer true)
cout << inRange(5, 5, A, ALen) << endl;
cout << inRange(25, 25, A, ALen) << endl;
cout << inRange(100, 100, A, ALen) << endl;
cout << inRange(1000, 1000, A, ALen) << endl;
// test single-element top-and-bottom cases
cout << inRange(0,5,A,1) << endl;
cout << inRange(5,5,A,1) << endl;
// oo-range tests (should all answer false)
cout << inRange(1, 4, A, ALen) << endl;
cout << inRange(2001, 2500, A, ALen) << endl;
cout << inRange(1, 1, A, 0) << endl;
// performance on LARGE arrays.
const size_t N = 2000000;
cout << "Building array of " << N << " random values." << endl;
std::vector<int> bigv;
generate_n(back_inserter(bigv), N, rand);
// sort the array
cout << "Sorting array of " << N << " random values." << endl;
std::sort(bigv.begin(), bigv.end());
cout << "Running " << N << " identity searches..." << endl;
for (size_t i=1;i<N; i++)
if (!inRange(bigv[i-1],bigv[i],&bigv[0],N))
{
cout << "Error: could not find value in range [" << bigv[i-1] << ',' << bigv[i] << "]" << endl;
break;
}
cout << "Finished" << endl;
return 0;
}
Output Results:
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
Sorting array of 2000000 random values.
Running 2000000 identity searches...
Finished
It's actually pretty straightforward if you assume the array to be sorted. You can get logarithmic complexity by always looking at only the left or the right side of the sequence:
#include <iterator>
template <typename Limit, typename Iterator>
bool inRange(Limit lowerBound, Limit upperBound, Iterator begin, Iterator end) {
if (begin == end) // no values => no valid values
return false;
Iterator mid = begin;
auto const dist = std::distance(begin,end);
std::advance(mid,dist/2); // mid might be equal to begin, if dist == 1
if (lowerBound < *mid && *mid < upperBound)
return true;
if (dist == 1) // if mid is invalid and there is only mid, there is no value
return false;
if (*mid >= upperBound) // mid and everything to its right are too large
return inRange(lowerBound, upperBound, begin, mid);
std::advance(mid,1); // we already know mid is invalid
return inRange(lowerBound, upperBound, mid, end);
}
You can invoke this for plain arrays with:
inRange(2,25,std::begin(A),std::end(A));
To my understanding, using divide and conquer for your specific problem
will not yield an advantage. However, at least in your example, the input is
sorted; it should be possible to improve a bit by skipping values until your lower bound is reached.
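The idea of skipping ahead to the lower bound can be sketched with std::lower_bound, which does the skipping by binary search. This is only a sketch under stated assumptions: the array is sorted in ascending order, lval <= uval, and both bounds are inclusive (as in the first answer). The name inRangeLowerBound is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Returns true if some element x of the sorted array ar[0..n) satisfies
// lval <= x <= uval. Assumes ar is sorted ascending and lval <= uval.
bool inRangeLowerBound(int lval, int uval, const int* ar, std::size_t n)
{
    // First element that is not less than lval (binary search, O(log n)).
    const int* first = std::lower_bound(ar, ar + n, lval);
    // If such an element exists it is the smallest candidate, so the range
    // is hit exactly when that element does not exceed uval.
    return first != ar + n && *first <= uval;
}
```

For the strict comparison from the question (> i && < u), the same shape should work with std::upper_bound for the lower limit and *first < uval for the upper one.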
Let's say I have a range of integers [l, r) and a function check(int idx) which satisfies the following condition:
there is an index t (l <= t < r) such that check(i) == true for each i (l <= i <= t) and check(j) == false for each j (t < j < r). Is there a standard way to find the index t?
The standard binary_search() needs a comparator that takes two arguments, so it can't be applied here (correct me if I'm wrong).
Assuming you are searching for a continuous range of integers (and not, for example, an indexed array) I would suggest a dichotomic search:
int find_t(int l, int r) {
// Preconditions
assert(check(l) == true);
//assert(check(r) == false); // this precondition is not mandatory
int max_idx_true = l; // highest known integer which satisfies check(idx) == true
int min_idx_false = r; // lowest known integer which satisfies check(idx) == false
while (max_idx_true+1 < min_idx_false) {
int mid_idx = (max_idx_true+min_idx_false)/2;
if (check(mid_idx)) max_idx_true = mid_idx;
else min_idx_false = mid_idx;
}
int t = max_idx_true;
// Postconditions
assert(check(t) == true);
assert(t+1 == r || check(t+1) == false);
return t;
}
This algorithm narrows the search down to the pair of adjacent integers where check(idx) is true for the first and false for the next. In your case, you are looking for t, which corresponds to max_idx_true.
It should be noted that the following preconditions must be satisfied for this to work:
l < r
check(l) is true
for any idx, if check(idx) is true then check(idx-1) is always true
for any idx, if check(idx) is false then check(idx+1) is always false
Below is a source code example for testing the algorithm, with output lines to better show how it works.
#include <iostream>
#include <cassert>
using namespace std;
// Replace this function by your own check
bool check(int idx) {
return idx <= 42;
}
int find_t(int l, int r) {
assert(check(l) == true);
//assert(check(r) == false); // this precondition is not mandatory
int max_idx_true = l; // highest known integer which satisfies check(idx) == true
int min_idx_false = r; // lowest known integer which satisfies check(idx) == false
int n = 0; // Number of iterations, not needed but helps analyzing the complexity
while (max_idx_true+1 < min_idx_false) {
++n;
int mid_idx = (max_idx_true+min_idx_false)/2;
// Analyze the algorithm in detail
cerr << "Iteration #" << n;
cerr << " in range [" << max_idx_true << ", " << min_idx_false << ")";
cerr << " the midpoint " << mid_idx << " is " << boolalpha << check(mid_idx) << noboolalpha;
cerr << endl;
if (check(mid_idx)) max_idx_true = mid_idx;
else min_idx_false = mid_idx;
}
int t = max_idx_true;
assert(check(t) == true);
assert(t+1 == r || check(t+1) == false);
return t;
}
int main() {
// Initial constants
int l = 0;
int r = 100;
int t = find_t(l, r);
cout << "The answer is " << t << endl;
return 0;
}
The main advantage of the dichotomic search is that it finds your candidate with a complexity of only O(log2(N)).
For example, if you initialize int l = -2000000000 and int r = 2000000000 (+/- 2 billion), you are searching for the answer among about 4 billion numbers, yet the number of iterations will be 32 at worst.
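One caveat for a range that wide: with a 32-bit int, the sum max_idx_true + min_idx_false may overflow once both values are large (for example 1000000000 + 2000000000). A minimal sketch of a variant that avoids this, assuming 64-bit indices are acceptable, is shown here. The name find_t_wide and the long long version of check are hypothetical, introduced only for this illustration.

```cpp
#include <cassert>

// Hypothetical monotone predicate used only for this sketch.
bool check(long long idx) { return idx <= 42; }

// Same dichotomic search, with 64-bit indices and an overflow-safe midpoint.
long long find_t_wide(long long l, long long r)
{
    assert(l < r && check(l)); // same preconditions as before
    long long max_idx_true = l;  // highest known index where check is true
    long long min_idx_false = r; // lowest known index where check is false
    while (max_idx_true + 1 < min_idx_false) {
        // lo + (hi - lo) / 2 does not overflow as long as hi - lo fits
        // in a long long, which holds for any 32-bit range.
        long long mid_idx = max_idx_true + (min_idx_false - max_idx_true) / 2;
        if (check(mid_idx)) max_idx_true = mid_idx;
        else min_idx_false = mid_idx;
    }
    return max_idx_true;
}
```

Calling find_t_wide(-2000000000LL, 2000000000LL) then searches the +/- 2 billion range from the example without relying on int arithmetic.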
I need to move all values that are less than 1 to the beginning of the array (WITHOUT SORTING, and without using a second array).
For example:
START ARRAY: {-2.12, -3, 7.36, 6.83, -1.82, 7.01}
FINISH ARRAY: {-2.12, -3, -1.82, 7.36, 6.83, 7.01}
Here is my solution, but it doesn't work correctly, because in the end we get:
FINISH ARRAY: {-2.12, -3, -1.82, 6.83, 7.36, 7.01}
Values less than 1 are moved to the beginning of the array, but the 4th and 5th elements are not in the correct order.
#include <iostream>
using namespace std;
int main() {
double arr[6] = {-2.12, -3, 7.36, 6.83, -1.82, 7.01};
cout << "Start array: " << endl;
for (int x = 0; x < 6; x++) {
cout << arr[x] << ", ";
}
int index=0;
double temp;
for (int i = 0; i < 6; i++) {
if (arr[i] < 1) {
temp=arr[i];
arr[i] = arr[index];
arr[index] = temp;
index++;
}
}
cout << endl << "FINISH ARRAY: " << endl;
for (int x = 0; x < 6; x++) {
cout << arr[x] << ", ";
}
return 0;
}
Use std::stable_partition:
std::stable_partition(std::begin(arr), std::end(arr),
[](double d) { return d < 1; });
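As a self-contained sketch of that call, the helper below wraps it for a std::vector<double>. The name moveSmallToFront and the limit parameter are hypothetical. Note that library implementations may use a temporary buffer internally when memory is available, which may matter if "no second array" is a strict requirement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Moves every value below `limit` to the front, keeping the relative order
// inside both groups. Returns the number of values below the limit.
std::size_t moveSmallToFront(std::vector<double>& values, double limit)
{
    auto firstLarge = std::stable_partition(
        values.begin(), values.end(),
        [limit](double d) { return d < limit; });
    return static_cast<std::size_t>(firstLarge - values.begin());
}
```

Applied to the array from the question with a limit of 1, this should leave -2.12, -3, -1.82 in front, followed by 7.36, 6.83, 7.01 in their original order.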
If you want to implement it yourself, note that the usual in-place stable partition (using comparisons and swaps) runs in O(N log N) time. A simple single-pass O(N) swap loop, like the one in the question, does not preserve the relative order.
One possible solution can be obtained with a divide-and-conquer approach:
template<class It, class Pred>
It stable_partition(It first, It last, Pred pred) {
// returns the iterator to the first element of the second group:
// TTTTTFFFFFF
// ^ return value
if (first == last)
return last;
if (last - first == 1) {
if (pred(*first)) // T
return last; // ^ last
else // F
return first; // ^ first
}
// Split in two halves
const auto mid = first + (last - first) / 2;
// Partition the left half
const auto left = stable_partition(first, mid, pred);
// TTTTTFFFFF
// ^ left
// ^ mid
// Partition the right half
const auto right = stable_partition(mid, last, pred);
// TTTTTFFFFF
// ^ right
// ^ mid
// Rotate the middle part: TTTTTFFFFFTTTTTFFFFF
// ~~~~~~~~~~
// ^ left ^ right
// ^ mid
const auto it = std::rotate(left, mid, right);
// TTTTTTTTTTFFFFFFFFFF
// ^ it
return it;
}
It resembles quicksort, but here we do not actually sort the range. std::rotate itself can be easily implemented via three reverses.
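The three-reverses idea can be sketched as follows. This is a minimal illustration for bidirectional iterators, not how any particular standard library implements std::rotate; the name rotateByReversal is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Left-rotates [first, last) so that *mid becomes the first element.
// Returns the new position of the element that was at `first`,
// mirroring the return value of std::rotate.
template <class BidirIt>
BidirIt rotateByReversal(BidirIt first, BidirIt mid, BidirIt last)
{
    std::reverse(first, mid);  // reverse the left block
    std::reverse(mid, last);   // reverse the right block
    std::reverse(first, last); // reverse the whole range
    // The former right block (length last - mid) now sits in front.
    return std::next(first, std::distance(mid, last));
}
```

Each element is moved at most twice, so the rotation stays linear in the length of the range and needs no extra storage.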
My code doesn't seem to work and I cannot understand why.
When the user enters a number to search for its location it doesn't show anything. If anyone could explain it to me I would greatly appreciate it.
void Array::binarySearch(vector<int> vect)
{
int search_val;
int high = (int)vect.size();
int low = 0;
int mid = 0;
bool found = false;
cout << "Enter Number to search : ";
cin>>search_val;
while (low <= high && !found) {
mid = (high + low)/2;
if (search_val > vect[mid]) {
low = mid + 1;
} else if (search_val < vect[mid]) {
high = mid - 1;
} else {
cout << "Number you entered " << search_val << " was found in position " << mid << endl;
found = true;
}
}
if (!found) {
cout << " The value isn't found " << endl;
}
}
Sorting algorithm:
void Array::arrSort(vector<int> vect)
{
for (unsigned int i = 0; i < vect.size()-1; i++)
{
for (unsigned int j = 0; j < vect.size()-i-1; j++)
{
if (vect[j] > vect[j+1])
{
int x = vect[j+1];
vect[j+1] = vect[j];
vect[j] = x;
}
}
}
cout<<"Sorted output is "<<endl;
printArr(vect);
}
Your arrSort function takes its parameter by value, so it receives (and sorts) a copy of the original array.
To sort the array you're passing in, take the parameter by reference:
void Array::arrSort(vector<int> &vect)
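A minimal sketch of the difference, using std::sort in place of the bubble sort from the question. The free functions sortCopy and sortInPlace are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Takes a copy: the caller's vector is left untouched.
void sortCopy(std::vector<int> vect)
{
    std::sort(vect.begin(), vect.end());
}

// Takes a reference: the caller's vector is sorted in place.
void sortInPlace(std::vector<int>& vect)
{
    std::sort(vect.begin(), vect.end());
}
```

After sortCopy(v) the caller's v is unchanged, while after sortInPlace(v) it is sorted.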
As someone has pointed out, you must ensure that you are performing binary search on a sorted array. Perhaps, you should build and test each algorithm separately to ensure correctness before combining them together.
Check out std::sort to get your binary search function working, then work on your sort function, or vice versa.
Also, once you have found the item you are looking for, i.e. vect[mid] == search_val, you can return true (or print, as you've done) and terminate the algorithm.
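To test the search on its own, a standalone version along these lines may help. It is a sketch, not the asker's class method: binarySearchIndex is a hypothetical free function that returns the index instead of printing. It also starts high at the last valid index, since starting at vect.size() as in the question lets mid reach an out-of-range position.

```cpp
#include <cassert>
#include <vector>

// Returns the index of search_val in the sorted vector, or -1 if absent.
// Assumes vect is sorted in ascending order.
int binarySearchIndex(const std::vector<int>& vect, int search_val)
{
    int low = 0;
    int high = static_cast<int>(vect.size()) - 1; // last valid index
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (search_val > vect[mid]) {
            low = mid + 1;  // target can only be in the right part
        } else if (search_val < vect[mid]) {
            high = mid - 1; // target can only be in the left part
        } else {
            return mid;     // found: stop immediately
        }
    }
    return -1;
}
```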
What I understand already
I understand that the median of medians algorithm (which I will denote as MoM) is an O(N) algorithm with a high constant factor. It finds the medians of groups of k elements (usually 5) and uses them as the next iteration's set to find medians of. The pivot found this way will have a rank between 3n/10 and 7n/10 in the original set, where n is the number of elements in that set.
I keep getting a segmentation fault when I run this code for MoM, but I'm not sure why. I've debugged it and believe the issue lies in the call medianOfMedian(medians, 0, medians.size()-1, medians.size()/2);. However, I thought this was logically sound, since the algorithm is supposed to find the median recursively by calling itself. Perhaps my base case isn't correct? In a tutorial by YogiBearian on YouTube (a Stanford professor, link: https://www.youtube.com/watch?v=YU1HfMiJzwg ), he did not mention any extra base case to handle the recursion on the N/5 medians in MoM.
Complete Code
Note: Per suggestions, I have added a base case and used .at() function by vectors.
static const int GROUP_SIZE = 5;
/* Helper function for m of m. This function divides the array into chunks of 5
* and finds the median of each group and puts it into a vector to return.
* The last group will be sorted and the median will be found despite its uneven size.
*/
vector<int> findMedians(vector<int>& vec, int start, int end){
vector<int> medians;
for(int i = start; i <= end; i+= GROUP_SIZE){
std::sort(vec.begin()+i, min(vec.begin()+i+GROUP_SIZE, vec.end()));
medians.push_back(vec.at(min(i + (GROUP_SIZE/2), (i + end)/2)));
}
return medians;
}
/* Job is to partition the array into chunks of 5(subject to change via const)
* And then find the median of them. Do this recursively using select as well.
*/
int medianOfMedian(vector<int>& vec, int start, int end, int k){
/* Acquire the medians of the 5-groups */
vector<int> medians = findMedians(vec, start, end);
/* Find the median of this */
int pivotVal;
if(medians.size() == 1)
pivotVal = medians.at(0);
else
pivotVal = medianOfMedian(medians, 0, medians.size()-1, medians.size()/2);
/* Stealing a page from select() ... */
int pivot = partitionHelper(vec, pivotVal, start, end);
cout << "After pivoting with the value " << pivot << " we get : " << endl;
for(int i = start; i < end; i++){
cout << vec.at(i) << ", ";
}
cout << "\n\n" << endl;
usleep(10000);
int length = pivot - start + 1;
if(k < length){
return medianOfMedian(vec, k, start, pivot-1);
}
else if(k == length){
return vec[k];
}
else{
return medianOfMedian(vec, k-length, pivot+1, end);
}
}
Some extra functions for helping unit test
Here are some unit tests that I wrote for these 2 functions. Hopefully they help.
vector<int> initialize(int size, int mod){
int arr[size];
for(int i = 0; i < size; i++){
arr[i] = rand() % mod;
}
vector<int> vec(arr, arr+size);
return vec;
}
/* Unit test for findMedians */
void testFindMedians(){
const int SIZE = 36;
const int MOD = 20;
vector<int> vec = initialize(SIZE, MOD);
for(int i = 0; i < SIZE; i++){
cout << vec[i] << ", ";
}
cout << "\n\n" << endl;
vector<int> medians = findMedians(vec, 0, SIZE-1);
cout << "The 5-sorted version: " << endl;
for(int i = 0; i < SIZE; i++){
cout << vec[i] << ", ";
}
cout << "\n\n" << endl;
cout << "The medians extracted: " << endl;
for(int i = 0; i < medians.size(); i++){
cout << medians[i] << ", ";
}
cout << "\n\n" << endl;
}
/* Unit test for medianOfMedian */
void testMedianOfMedian(){
const int SIZE = 30;
const int MOD = 70;
vector<int> vec = initialize(SIZE, MOD);
cout << "Given array : " << endl;
for(int i = 0; i < SIZE; i++){
cout << vec[i] << ", ";
}
cout << "\n\n" << endl;
int median = medianOfMedian(vec, 0, vec.size()-1, vec.size()/2);
cout << "\n\nThe median is : " << median << endl;
cout << "As opposed to sorting and then showing the median... : " << endl;
std::sort(vec.begin(), vec.end());
cout << "sorted array : " << endl;
for(int i = 0; i < SIZE; i++){
if(i == SIZE/2)
cout << "**";
cout << vec[i] << ", ";
}
cout << "Median : " << vec[SIZE/2] << endl;
}
Extra section about the output that I'm getting
Given array :
7, 49, 23, 48, 20, 62, 44, 8, 43, 29, 20, 65, 42, 62, 7, 33, 37, 39, 60, 52, 53, 19, 29, 7, 50, 3, 69, 58, 56, 65,
After pivoting with the value 5 we get :
23, 29, 39, 42, 43,
After pivoting with the value 0 we get :
39,
Segmentation Fault: 11
It all seems fine and dandy until the segmentation fault. I'm confident that my partition function works as well (it was one of my implementations for the LeetCode question).
Disclaimer: This is not a homework problem, but rather my own curiosity about the algorithm after I used quickSelect in a LeetCode problem set.
Please let me know if my question needs more elaboration to be an MVCE, thanks!
EDIT: I figured out that the recursion partition scheme is wrong in my code. As Pradhan has pointed out - I somehow have empty vectors which lead to the start and end being 0 and -1 respectively, causing me to have segmentation fault from an infinite loop of calling it. Still trying to figure this part out.
MoM always calls itself (to compute pivot), and thus exhibits infinite recursion. This violates the "prime directive" of recursive algorithms: at some point, the problem is "small" enough to not need a recursive call.
Correct Implementation
With the help of Scott's hint, I was able to give a correct implementation of this median of medians algorithm. I fixed it and realized that the main idea that I had was correct, but there were a couple errors:
My base case should cover subvectors of size <= 5.
There were some subtleties about whether the last index (the variable end) should be treated as inclusive or as an exclusive upper bound. In the implementation below I treat it as an exclusive upper bound.
Here it is below. I also accepted Scott's answer - thank you Scott!
/* In case someone wants to pass in the pivValue, I broke partition into 2 pieces.
*/
int pivot(vector<int>& vec, int pivot, int start, int end){
/* Now we need to go into the array with a starting left and right value. */
int left = start, right = end-1;
while(left < right){
/* Increase the left and the right values until inappropriate value comes */
while(vec.at(left) < pivot && left <= right) left++;
while(vec.at(right) > pivot && right >= left) right--;
/* In case of duplicate values, we must take care of this special case. */
if(left >= right) break;
else if(vec.at(left) == vec.at(right)){ left++; continue; }
/* Do the normal swapping */
int temp = vec.at(left);
vec.at(left) = vec.at(right);
vec.at(right) = temp;
}
return right;
}
/* Returns the k-th element of this array. */
int MoM(vector<int>& vec, int k, int start, int end){
/* Start by base case: Sort if less than 10 size
* E.x.: Size = 9, 9 - 0 = 9.
*/
if(end-start < 10){
sort(vec.begin()+start, vec.begin()+end);
return vec.at(k);
}
vector<int> medians;
/* Now sort every consecutive 5 */
for(int i = start; i < end; i+=5){
if(end - i < 10){
sort(vec.begin()+i, vec.begin()+end);
medians.push_back(vec.at((i+end)/2));
}
else{
sort(vec.begin()+i, vec.begin()+i+5);
medians.push_back(vec.at(i+2));
}
}
int median = MoM(medians, medians.size()/2, 0, medians.size());
/* use the median to pivot around */
int piv = pivot(vec, median, start, end);
int length = piv - start+1;
if(k < length){
return MoM(vec, k, start, piv);
}
else if(k > length){
return MoM(vec, k-length, piv+1, end);
}
else
return vec[k];
}
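For comparison, here is a compact sketch of the same selection idea that trades the in-place partition for temporary vectors, which makes the base case and the index bookkeeping easier to check. It is not the implementation above: selectKth is a hypothetical name, k is 0-based, and the extra copies are a simplification rather than part of the algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the k-th smallest element (0-based) of `values`.
// Takes the vector by value so the caller's data is not reordered.
int selectKth(std::vector<int> values, std::size_t k)
{
    assert(k < values.size());
    // Base case: small inputs are sorted directly, which ends the recursion.
    if (values.size() <= 5) {
        std::sort(values.begin(), values.end());
        return values[k];
    }
    // Collect the median of each group of (at most) five elements.
    std::vector<int> medians;
    for (std::size_t i = 0; i < values.size(); i += 5) {
        std::size_t groupEnd = std::min(i + 5, values.size());
        std::sort(values.begin() + i, values.begin() + groupEnd);
        medians.push_back(values[i + (groupEnd - i) / 2]);
    }
    // Recursively pick the median of the medians as the pivot value.
    int pivot = selectKth(medians, medians.size() / 2);

    // Three-way split around the pivot value (handles duplicates).
    std::vector<int> smaller, larger;
    std::size_t equalCount = 0;
    for (int v : values) {
        if (v < pivot) smaller.push_back(v);
        else if (v > pivot) larger.push_back(v);
        else ++equalCount;
    }
    // k is always relative to the vector passed to the current call.
    if (k < smaller.size()) return selectKth(smaller, k);
    if (k < smaller.size() + equalCount) return pivot;
    return selectKth(larger, k - smaller.size() - equalCount);
}
```

Because the pivot is itself an element of values, both smaller and larger are strictly shorter than the input, so every recursive call works on a smaller problem.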
It compiles fine and prints the first "start", but it stops right there. Any help is greatly appreciated. I spent several hours trying to figure out what's wrong and tried running it in several different IDEs. I think it fails at the while loop.
#ifndef TERNARY_SEARCH_H
#define TERNARY_SEARCH_H
#include <cstdlib>
#include <iostream>
template <typename ArrayLike, typename T>
int ternary_search(const ArrayLike& array, const T& value, int low, int high)
{
/*
* low is the lowest possible index, high is the highest possible index
* value is the target value we are searching for
* array is the ascending order array we are searching
*/
bool found = false;
while(!found)
{
int lowerThirdIndex = (((high - low)/(3)) + low);
int upperThirdIndex = (2*((high - low)/(3)) + low);
// search lower third
if (array[lowerThirdIndex] == value)
{
return lowerThirdIndex;
found = true;
}
else if (array[lowerThirdIndex] > value)
{
high = lowerThirdIndex;
}
else // array[lowerThirdIndex] < value
{
low = lowerThirdIndex;
}
//search upper third
if (array[upperThirdIndex] == value)
{
return upperThirdIndex;
found = true;
}
else if (array[upperThirdIndex] > value)
{
high = upperThirdIndex;
}
else // array[upperThirdIndex] < value
{
low = upperThirdIndex;
}
}
return -1;
}
#endif /* TERNARY_SEARCH_H */
//main.cpp
#include "ternary_search.h"
using namespace std;
int main() {
cout << "start";
int nums[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
for (int i = 0; i <= 90; i += 10) {
if (ternary_search(nums, i, 0, 10) != i / 10) {
std::cout
<< "Searching for " << i << " returned index "
<< ternary_search(nums, i, 0, 10) << " instead of "
<< i / 10 << "." << std::endl;
return 1;
}
// search for something that doesn't exist.
if (ternary_search(nums, i + 1, 0, 10) != -1) {
std::cout
<< "Searching for " << i + 1 << " returned index "
<< ternary_search(nums, i + 1, 0, 10) << " instead of -1."
<< std::endl;
return 1;
}
}
std::cout << "On this small example, your search algorithm seems correct.\n";
return 0;
}
Your ternary_search function doesn't have a means to return when it fails to find the value in the search table. It returns only when it finds an element in the table that exactly matches the value you pass in.
Since the second invocation of the function is called with i+1 -- which is 1 -- which is not a member of your table, your ternary search function never exits.
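One way to give the loop an exit is to shrink the range past the probed indices on every pass and stop once it is empty. The sketch below assumes that high is an exclusive bound (which matches the calls with 0 and 10 on a 10-element array) and that the array is sorted in ascending order. The name ternarySearchBounded is hypothetical.

```cpp
#include <cassert>

// Searches the ascending range array[low..high) for value.
// Returns its index, or -1 when the value is absent.
template <typename ArrayLike, typename T>
int ternarySearchBounded(const ArrayLike& array, const T& value, int low, int high)
{
    while (low < high) { // stop once the range is empty
        int third = (high - low) / 3;
        int lowerThirdIndex = low + third;
        int upperThirdIndex = high - 1 - third;
        if (array[lowerThirdIndex] == value) return lowerThirdIndex;
        if (array[upperThirdIndex] == value) return upperThirdIndex;
        if (value < array[lowerThirdIndex]) {
            high = lowerThirdIndex;    // keep the lower third
        } else if (value > array[upperThirdIndex]) {
            low = upperThirdIndex + 1; // keep the upper third
        } else {
            low = lowerThirdIndex + 1; // keep the middle third
            high = upperThirdIndex;
        }
    }
    return -1; // range exhausted without a match
}
```

Every branch excludes at least one probed index, so the range gets strictly smaller on each pass and the function returns -1 for values such as i + 1 that are not in the table.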