Why is my binary search so insanely slow compared to an iterative? - c++

I'm writing an autocomplete program that finds all possible matches to a letter or set of characters given a dictionary file and input file. I just finished a version that implements a binary search over an iterative search and thought I could boost the overall performance of the program.
Thing is, the binary search is almost 9 times slower than an iterative search. What gives? I thought I was improving performance by using a binary search over iterative.
Run time (binary search on the left): [image]
Here is the important part of each version, full code can be built and run at my github with cmake.
Binary Search Function(called while looping through input given)
bool search(std::vector<std::string>& dict, std::string in,
std::queue<std::string>& out)
{
//tick makes sure the loop found at least one thing. if not then break the function
bool tick = false;
bool running = true;
while(running) {
//for each element in the input vector
//find all possible word matches and push onto the queue
int first=0, last= dict.size() -1;
while(first <= last)
{
tick = false;
int middle = (first+last)/2;
std::string sub = (dict.at(middle)).substr(0,in.length());
int comp = in.compare(sub);
//if comp returns 0(found word matching case)
if(comp == 0) {
tick = true;
out.push(dict.at(middle));
dict.erase(dict.begin() + middle);
}
//if not, take top half
else if (comp > 0)
first = middle + 1;
//else go with the lower half
else
last = middle - 1;
}
if(tick==false)
running = false;
}
return true;
}
Iterative Search(included in main loop):
for(int k = 0; k < input.size(); ++k) {
int len = (input.at(k)).length();
// truth false variable to end out while loop
bool found = false;
// create an iterator pointing to the first element of the dictionary
vecIter i = dictionary.begin();
// this while loop is not complete, a condition needs to be made
while(!found && i != dictionary.end()) {
// take a substring the dictionary word(the length is dependent on
// the input value) and compare
if( (*i).substr(0,len) == input.at(k) ) {
// so a word is found! push onto the queue
matchingCase.push(*i);
}
// move iterator to next element of data
++i;
}
}
example input file:
z
be
int
nor
tes
terr
on

Instead of erasing elements in the middle of the vector (which is quite expensive) and then starting your search over, just compare the elements before and after the found item (because they should all be adjacent to each other) until you have found all the items that match.
Or use std::equal_range, which does exactly that.
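For what it's worth, here is a minimal sketch of the std::equal_range idea, assuming the dictionary is already sorted. The names prefixRange and byPrefix are made up for illustration and are not from the question's code. The comparator looks only at the first prefix.size() characters of each string, so every word that starts with the prefix compares as equivalent to it.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using DictIter = std::vector<std::string>::const_iterator;

// Returns the half-open range [first, second) of words in a SORTED dictionary
// that start with `prefix`. (prefixRange is a hypothetical helper name.)
std::pair<DictIter, DictIter> prefixRange(const std::vector<std::string>& dict,
                                          const std::string& prefix)
{
    const std::size_t n = prefix.size();
    // Compare only the first n characters of each string.
    auto byPrefix = [n](const std::string& a, const std::string& b) {
        return a.compare(0, n, b, 0, n) < 0;
    };
    return std::equal_range(dict.begin(), dict.end(), prefix, byPrefix);
}
```

The two iterators delimit the matches, so each word in the range can be pushed onto the queue with a plain loop, and nothing ever has to be erased from the dictionary.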

This will be the culprit:
dict.erase(dict.begin() + middle);
You are repeatedly removing items from your dictionary to naively use binary search to find all valid prefixes. This adds huge complexity, and is unnecessary.
Instead, once you have found a match, step backwards until you find the first match, then step forwards, adding all matches to your queue. Remember that because your dictionary is sorted and you are using only the prefixes, all valid matches will appear consecutively.

The dict.erase operation is linear in the size of dict: it shifts every element after middle down by one position. This makes the "binary search" algorithm possibly quadratic in the length of dict, with O(N^2) expensive element moves.

Related

My code won't print when submitted for codecheck even though it compiles without error

I've been assigned this question for my lab (and yes I understand there will be backlash because it's homework). I've been working on this question for a couple of days to no avail and I feel like I'm missing something glaringly obvious.
My code:
int processSuitors(vector<int>& currentSuitors, list<int>& rekt)
{
int sizeSuitors = currentSuitors.size();
int eliminated = 2;
while(sizeSuitors != 1)
{
rekt.push_back(currentSuitors[eliminated]);
currentSuitors.erase(currentSuitors.begin() + eliminated);
sizeSuitors--;
if(eliminated > sizeSuitors)
{
eliminated -= sizeSuitors;
}
}
return currentSuitors[0];
}
Prompt:
In an ancient land, the beautiful princess Eve had many suitors. She decided on the following procedure to determine which suitor she would marry. First, all of the suitors would be lined up one after the other and be assigned numbers. The first suitor would be number 1, the second number 2, and so on up to the last suitor, number n. Starting at the first suitor she would then count three suitors down the line (because of the three letters in her name) and the third suitor would be eliminated from winning her hand and he would be removed from the line. Eve would then continue, counting three more suitors and eliminating every third suitor. When she reached the end of the line she would continue counting from the beginning.
Write a function named processSuitors that takes as arguments an STL vector of type int containing the suitors, and an STL list of type int that will collect all the suitors that are eliminated. The function returns an int storing the position a suitor should stand in to marry the princess if there are n suitors. The function that calls processSuitors will send the vector already filled with n suitors (1, 2, 3... n), and an empty list that needs to be filled with the position number of the suitors that were eliminated, in the order they were eliminated.
Restrictions: You may not create any containers (no arrays, no vectors, etc.); you need to use the vector and the list that are passed as parameters.
Use ONLY the following STL functions:
vector::size
vector::erase
vector::begin
list::push_back
vector::operator[ ]
The adjacent files are hidden since we are to rely on what is given. Any clean-up of my code would be extremely appreciated as well.
What do you think of this solution?
Keep another vector that marks whether an index in your currentSuitors vector has been removed. Then have a helper function that always finds the next free index.
Instead of trying to shrink currentSuitors, you just keep marking elements in the taken vector.
size_t findNextFreeSlot(const vector<bool>& taken, size_t pos)
{
// increment to the next candidate position
pos = (pos + 1) % taken.size();
// search for the first free slot
for (size_t i = 0; i < taken.size(); i++)
{
if (taken[pos] == false)
{
return pos;
}
pos = (pos + 1) % taken.size();
}
// assert(false); // we should never get here as long as there's one free slot index in taken
return -1;
}
int processSuitors(vector<int>& currentSuitors, list<int>& rekt)
{
size_t len = currentSuitors.size();
vector<bool> taken(len); // keep a vector of eliminated indices from current
size_t index = len - 1; // start at the last element so the first advance lands on index 0
size_t eliminated = 0;
if (len == 0)
{
return -1;
}
while (eliminated < (len-1))
{
// advance the index three times to the next "untaken" index
index = findNextFreeSlot(taken, index);
index = findNextFreeSlot(taken, index);
index = findNextFreeSlot(taken, index);
taken[index] = true; // claim this index as taken
rekt.push_back(currentSuitors[index]); // add the value at this index to the eliminated list
eliminated++;
}
index = findNextFreeSlot(taken, index); // find the last free index
return currentSuitors[index];
}
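If the restriction on creating containers has to be honored, something along these lines might work. It is only a sketch, and it uses vector::size, vector::erase, vector::begin, list::push_back and vector::operator[] and nothing else. Returning -1 for an empty vector is an assumption on my part, mirroring the code above. After an erase, the suitor who followed the eliminated one slides into the same index, so counting three suitors from there means moving forward by two, wrapping around with the modulo.

```cpp
#include <cstddef>
#include <list>
#include <vector>

int processSuitors(std::vector<int>& currentSuitors, std::list<int>& rekt)
{
    if (currentSuitors.size() == 0) {
        return -1; // assumption: no suitors means no winner
    }
    std::size_t index = 0; // counting starts at the first suitor
    while (currentSuitors.size() > 1) {
        // the third suitor counted from `index` sits two positions further on
        index = (index + 2) % currentSuitors.size();
        rekt.push_back(currentSuitors[index]);
        currentSuitors.erase(currentSuitors.begin() + index);
        // `index` now refers to the suitor right after the eliminated one
        // (or one past the end, which the modulo wraps on the next pass)
    }
    return currentSuitors[0];
}
```

With six suitors this eliminates 3, 6, 4, 2, 5 in that order and leaves suitor 1.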

Finding the shortest word ladder between two given words and a dictionary

I am trying to find the shortest ladder between two given words using a dictionary. All the words, including the given ones and those in the dictionary, have the same number of characters. In one step, only one character may be changed, and the shortest path is required. Ex.: given "hit" and "cil", dict: ["hil", "hol", "hot", "lot", "lit", "lil"]. So the answer should be "hit"->"hil"->"cil".
I have tried to solve this problem using BFS, by finding the next word in the dictionary and checking whether it is adjacent to the item popped from the queue. This approach won't give me the shortest path, though.
If I try replacing each letter with each of the 26 letters of the alphabet and accept the resulting word when it is present in the dictionary, this approach still won't give me the shortest path. Ex.: here, it gives me: hit->lit->lot->hot->hol->lil->cil
Probably the better approach would be to construct a tree first and then find the shortest path in the tree from the starting word to the ending word.
I know there are solutions to this problem on this forum, but none of them explains the algorithm. I am new to BFS, so I am not very familiar with it.
I am interested in knowing how to find one of the shortest paths and, if there are several, all shortest paths.
What I suggest is to build a graph over the words in the dictionary, where a node represents a word and there is an edge a <-> b if b can be obtained from a by changing only one character (and, of course, vice versa). This process takes O(n*n) time, where n is the number of words in the dictionary. It works as follows:
For each word, build a frequency array of characters, call it farr, of length 26, where farr[i] tells how many times the i-th letter of the alphabet occurs in the word. Then, in a nested loop running n*n times, compare the frequency tables of each pair of words; they must differ by only one character for there to be an edge between word a and word b.
Also Note that the edges are undirected(in both directions) in this graph.
After building the complete graph on words of dictionary, add the question word as well to graph. And then go ahead with the BFS searching for the target word from node of initial word, where the transformation required is initial word -> target word.
Now say you find target word at level 'i', while exploring from initial word then the shortest path is 'i' units long.
This approach is a bit brute force, but may be a good starting point.
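Here is a rough sketch of the BFS part in C++, under a few assumptions of mine. The names differByOne and shortestLadder are hypothetical, and the target is treated as reachable even when it is not in the dictionary, as in the question's example. The sketch checks adjacency position by position instead of with a frequency table, because a frequency table ignores positions ("abc" and "bad" differ in one letter count pair but in three positions). It also tests neighbours on the fly rather than building the whole graph first. Since BFS visits words in order of distance from the start, the first time the target is dequeued the recorded parents give one shortest ladder.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

// True when a and b have the same length and differ in exactly one position.
bool differByOne(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i] && ++diff > 1) return false;
    return diff == 1;
}

// Returns one shortest ladder from start to target (both included),
// or an empty vector when the target cannot be reached.
std::vector<std::string> shortestLadder(const std::string& start,
                                        const std::string& target,
                                        std::vector<std::string> dict)
{
    dict.push_back(target); // assumption: the target may be missing from dict
    std::unordered_map<std::string, std::string> parent; // word -> predecessor
    parent[start] = start;
    std::queue<std::string> frontier;
    frontier.push(start);
    while (!frontier.empty()) {
        const std::string cur = frontier.front();
        frontier.pop();
        if (cur == target) {
            // walk the parent links back to the start, then reverse
            std::vector<std::string> path;
            for (std::string w = target; w != start; w = parent[w])
                path.push_back(w);
            path.push_back(start);
            std::reverse(path.begin(), path.end());
            return path;
        }
        for (const std::string& w : dict) {
            if (parent.count(w) == 0 && differByOne(cur, w)) {
                parent[w] = cur; // first visit is along a shortest route
                frontier.push(w);
            }
        }
    }
    return {};
}
```

For the question's example ("hit" to "cil") this yields hit, hil, cil. To collect all shortest ladders instead of one, each word would need to keep every parent found at the same BFS level rather than only the first.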
If the target word is equal to the start word, or has Levenshtein distance 1 the result is [start, target] and you are done.
Otherwise you have to find all members of the dictionary with Levenshtein distance 1 from the start word. If one of them has Levenshtein distance 1 to the target word, the result is [start, word, target] and you're done. Otherwise, you recurse with every word in the chosen list as a start and the target as the target and prepend start to the shortest result.
pseudo code - a bit python like:
# levenshtein(a, b) is assumed to be defined elsewhere
myDict = {"hil", "hol", "hot", "lot", "lit", "lil"}
used_wordlist = set()

def shortestWordLadder(start, target):
    if start == target or levenshtein(start, target) == 1:
        return [start, target]
    current_wordlist = [x for x in myDict
                        if x not in used_wordlist and
                        levenshtein(start, x) == 1]
    if len(current_wordlist) == 0:
        return None
    for word in current_wordlist:
        if levenshtein(word, target) == 1:
            return [start, word, target]
    used_wordlist.update(current_wordlist)
    min_ladder_size = float("inf")
    min_ladder = None
    for word in current_wordlist:
        ladder = shortestWordLadder(word, target)
        if ladder is not None and len(ladder) < min_ladder_size:
            min_ladder_size = len(ladder)
            min_ladder = [start] + ladder
    return min_ladder
Possible optimization:
I considered to reuse the matrix, that levenshtein(start, target) would create internally, but I could not gain enough confidence, that it would work in all cases. The idea was to start at the bottom right of the matrix and chose the smallest neighbor, that would create a word from the dictionary. Then continue with that position. If no neighbor of the current cell creates a word from the dictionary, we'd have to backtrack until we find a way to a field with value 0. If backtracking brings us back to the bottom right cell, there is no solution.
I am not sure now, that there might not be solutions, that you'd maybe ignore that way. If it finds a solution, I am pretty confident, that it is one of the shortest.
At the moment I lack the time to think it through. If that proves not to be a complete solution, you could use it as an optimization step instead of the levenshtein(start, target) call in the first line of shortestWordLadder(), since the algorithm gives you the Levenshtein distance and, if possible, the shortest path.
I have worked out a solution by adopting the following approach:
1.) I have built a tree from the dictionary first, assuming the starting point to be the given word; and finding all words adjacent to this word and so on
2.) Next I have tried to construct all possible paths from given word to the end word using this tree.
Complexity: O(n*70 + 2^n-1 * lg(n)) = O(2^n-1*lg(n)). Here n is the number of words in the dictionary, and 70 comes from the range of ASCII values from 65 to 122 ('A' to 'z'), rounded up. The complexity is exponential, as expected.
Even after certain optimizations, the worst-case complexity won't change.
Here is the code I wrote (it's tested by me and works; any bug reports or suggestions would be highly appreciated):
#include <iostream>
#include <vector>
#include <string>
#include <deque>
#include <stack>
#include <algorithm>
using namespace std;
struct node {
string str;
vector<node *> children;
node(string s) {
str = s;
children.clear();
}
};
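// Note: this compares letter counts rather than positions, so two words that
// differ in more than one position (e.g. "abc" and "bad") can still be reported as adjacent.
bool isAdjacent(string s1, string s2) {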
int table1[70], table2 [70];
int ct = 0;
for (int i = 0; i < 70; i++) {
table1[i] = 0;
table2[i] = 0;
}
for (int i = 0; i < s1.length(); i++) {
table1[((int)s1[i])- 65] += 1;
table2[((int)s2[i])- 65] += 1;
}
for (int i = 0; i < 70; i++) {
if (table1[i] != table2[i])
ct++;
if (ct > 2)
return false;
}
if (ct == 2)
return true;
else
return false;
}
void construct_tree(node *root, vector<string> dict) {
deque<node *> q;
q.push_back(root);
while (!q.empty()) {
node *curr = q.front();
q.pop_front();
if (dict.size() == 0)
return;
for (int i = 0; i < dict.size(); i++) {
if (isAdjacent(dict[i], curr->str)) {
string n = dict[i];
dict.erase(dict.begin()+i);
i--;
node *nnode = new node(n);
q.push_back(nnode);
curr->children.push_back(nnode);
}
}
}
}
void construct_ladders(stack<node *> st, string e, vector<vector <string> > &ladders) {
node *top = st.top();
if (isAdjacent(top->str,e)) {
stack<node *> t = st;
vector<string> n;
while (!t.empty()) {
n.push_back(t.top()->str);
t.pop();
}
ladders.push_back(n);
}
for (int i = 0; i < top->children.size(); i++) {
st.push(top->children[i]);
construct_ladders(st,e,ladders);
st.pop();
}
}
void print(string s, string e, vector<vector<string> > ladders) {
for (int i = 0; i < ladders.size(); i++) {
for (int j = ladders[i].size()-1; j >= 0; j--) {
cout<<ladders[i][j]<<" ";
}
cout<<e<<endl;
}
}
int main() {
vector<string> dict;
string s = "hit";
string e = "cog";
dict.push_back("hot");
dict.push_back("dot");
dict.push_back("dog");
dict.push_back("lot");
dict.push_back("log");
node *root = new node(s);
stack<node *> st;
st.push(root);
construct_tree(root, dict);
vector<vector<string> > ladders;
construct_ladders(st, e, ladders);
print(s,e,ladders);
return 0;
}

C++ if statement order

A portion of a program needs to check if two c-strings are identical while searching through an ordered list (e.g. {"AAA", "AAB", "ABA", "CLL", "CLZ"}). It is feasible that the list could get quite large, so small improvements in speed are worth some degradation of readability. Assume that you are restricted to C++ (please don't suggest switching to assembly). How can this be improved?
typedef char StringC[5];
void compare (const StringC stringX, const StringC stringY)
{
// use a variable so compareResult won't have to be computed twice
int compareResult = strcmp(stringX, stringY);
if (compareResult < 0) // roughly 50% chance of being true, so check this first
{
// no match. repeat with a 'lower' value string
compare(stringX, getLowerString() );
}
else if (compareResult > 0) // roughly 49% chance of being true, so check this next
{
// no match. repeat with a 'higher' value string
compare(stringX, getHigherString() );
}
else // roughly 1% chance of being true, so check this last
{
// match
reportMatch(stringY);
}
}
You can assume that stringX and stringY are always the same length and you won't get any invalid data input.
From what I understand, a compiler will generate code so that the CPU checks the first if-statement and jumps if it's false, so it would be best if that first statement is the one most likely to be true, as jumps interfere with the pipeline. I have also heard that when doing a compare, a[n Intel] CPU will do a subtraction and look at the status of the flags without saving the subtraction's result. Would there be a way to do the strcmp once, without saving the result into a variable, while still being able to check that result in both of the first two if-statements?
std::binary_search may help:
#include <algorithm>
#include <iostream>
#include <iterator>
bool cstring_less(const char (&lhs)[4], const char (&rhs)[4])
{
return std::lexicographical_compare(std::begin(lhs), std::end(lhs),
std::begin(rhs), std::end(rhs));
}
int main(int, char**)
{
const char cstrings[][4] = {"AAA", "AAB", "ABA", "CLL", "CLZ"};
const char lookFor[][4] = {"BBB", "ABA", "CLS"};
for (const auto& s : lookFor)
{
if (std::binary_search(std::begin(cstrings), std::end(cstrings),
s, cstring_less))
{
std::cout << s << " Found.\n";
}
}
}
I think using hash tables can improve the speed of comparison drastically. Also, if your program is multithreaded, you can find some useful hash tables in the Intel Threading Building Blocks (TBB) library. For example, tbb::concurrent_unordered_map has an API similar to that of std::unordered_map.
I hope it helps you.
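As a rough single-threaded illustration of the hash table idea, here is a sketch using std::unordered_set. The helper names buildIndex and contains are made up for this example. The set is built once from the list, and each later lookup takes constant time on average, plus the cost of hashing the key. Note that a hash set does not keep the sorted order of the original list.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Build the hash set once from the list of strings (hypothetical helper).
std::unordered_set<std::string> buildIndex(const std::vector<std::string>& words)
{
    return std::unordered_set<std::string>(words.begin(), words.end());
}

// Average-case constant-time membership test (hypothetical helper).
bool contains(const std::unordered_set<std::string>& index, const char* key)
{
    return index.find(key) != index.end();
}
```

Whether this actually beats a binary search over the sorted array depends on the list size and key length, so it is worth measuring both.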
If you compare every string with every other string, you have an O(N*(N-1)) problem. Since you say the lists can grow large, the better approach is to sort them (a good comparison sort is O(N*log(N))) and then compare each element with the next one in the list, which adds an O(N) pass, for O(N*log(N)) total complexity. As your list is already ordered, you can just traverse it (making the whole thing O(N)), comparing each element with the next. An example, valid in both C and C++, follows:
for(i = 0; i < N-1; i++) /* one comparison less than the number of elements */
if (strcmp(array[i], array[i+1]) == 0)
break;
if (i < N-1) { /* this is a premature exit from the loop, so we found a match */
/* found a match, array[i] equals array[i+1] */
} else { /* we exhausted all comparisons and got out normally from the loop */
/* no match found */
}

Need suggestion to improve speed for word break (dynamic programming)

The problem is: Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "hithere",
dict = ["hi", "there"].
Return true because "hithere" can be segmented as "hi there".
My implementation is as below. This code is ok for normal cases. However, it suffers a lot for input like:
s = "aaaaaaaaaaaaaaaaaaaaaaab", dict = {"aa", "aaaaaa", "aaaaaaaa"}.
I want to memoize the processed substrings, but I cannot get it right. Any suggestions on how to improve this? Thanks a lot!
class Solution {
public:
bool wordBreak(string s, unordered_set<string>& wordDict) {
int len = s.size();
if(len<1) return true;
for(int i(0); i<len; i++) {
string tmp = s.substr(0, i+1);
if((wordDict.find(tmp)!=wordDict.end())
&& (wordBreak(s.substr(i+1), wordDict)) )
return true;
}
return false;
}
};
It's logically a two-step process. Find all dictionary words within the input, consider the found positions (begin/end pairs), and then see if those words cover the whole input.
So you'd get for your example
aa: {0,2}, {1,3}, {2,4}, ... {20,22}
aaaaaa: {0,6}, {1,7}, ... {16,22}
aaaaaaaa: {0,8}, {1,9} ... {14,22}
This is a graph, with nodes 0-23 and a bunch of edges. But node 23 (b) is entirely unreachable: it has no incoming edge. This is now a simple graph theory problem.
Finding all places where dictionary words occur is pretty easy if your dictionary is organized as a trie. But even an std::map is usable, thanks to its equal_range method. You have what appears to be an O(N*N) nested loop for begin and end positions, with O(log N) lookup of each word. But you can quickly determine if s.substr(begin,end) is still a viable prefix, and what dictionary words remain with that prefix.
Also note that you can build the graph lazily. Starting at begin=0 you find edges {0,2}, {0,6} and {0,8} (and no others). You can now search nodes 2, 6 and 8. You even have a good algorithm, A*, that suggests you try node 8 first (reachable in just 1 edge). Thus, you'll find edges {8,10}, {8,14} and {8,16} etc. As you see, you'll never need to build the part of the graph that contains {1,3}, as it's simply unreachable.
Using graph theory, it's easy to see why your brute-force method breaks down. You arrive at node 8 (aaaaaaaa.aaaaaaaaaaaaaab) repeatedly, and each time search the subgraph from there on.
A further optimization is to run bidirectional A*. This would give you a very fast solution. At the second half of the first step, you look for edges leading to 23, b. As none exist, you immediately know that node {23} is isolated.
In your code, you are not using dynamic programming because you are not remembering the subproblems that you have already solved.
You can enable this remembering, for example, by storing the results based on the starting position of the string s within the original string, or even based on its length (because anyway the strings you are working with are suffixes of the original string, and therefore its length uniquely identifies it). Then, in the beginning of your wordBreak function, just check whether such length has already been processed and, if it has, do not rerun the computations, just return the stored value. Otherwise, run computations and store the result.
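A minimal sketch of that idea, remembering results by start position, might look like the following. The names canBreakFrom and wordBreakMemo are mine, and the signature takes the string by const reference instead of copying substrings into each recursive call. Each start position is solved at most once, so the input from the question no longer explodes.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// memo[start]: -1 = not computed yet, 0 = the suffix starting at `start`
// cannot be segmented, 1 = it can.
bool canBreakFrom(const std::string& s, std::size_t start,
                  const std::unordered_set<std::string>& wordDict,
                  std::vector<int>& memo)
{
    if (start == s.size()) return true; // nothing left to segment
    if (memo[start] != -1) return memo[start] == 1; // already solved
    for (std::size_t len = 1; start + len <= s.size(); ++len) {
        if (wordDict.count(s.substr(start, len)) != 0 &&
            canBreakFrom(s, start + len, wordDict, memo)) {
            memo[start] = 1;
            return true;
        }
    }
    memo[start] = 0;
    return false;
}

bool wordBreakMemo(const std::string& s,
                   const std::unordered_set<std::string>& wordDict)
{
    std::vector<int> memo(s.size(), -1);
    return canBreakFrom(s, 0, wordDict, memo);
}
```

There are N start positions and up to N substring lengths per position, and each substr plus hash lookup costs up to O(N), so this is roughly O(N^3) in the worst case with an unordered_set.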
Note also that your approach with unordered_set will not allow you to obtain the fastest solution. The fastest solution that I can think of is O(N^2) by storing all the words in a trie (not in a map!) and following this trie as you walk along the given string. This achieves O(1) per loop iteration not counting the recursion call.
Thanks for all the comments. I changed my previous solution to the implementation below. At this point I haven't explored optimizing the dictionary, but those insights are very valuable and very much appreciated.
For the current implementation, do you think it can be further improved? Thanks!
class Solution {
public:
bool wordBreak(string s, unordered_set<string>& wordDict) {
int len = s.size();
if(len<1) return true;
if(wordDict.size()==0) return false;
vector<bool> dq (len+1,false);
dq[0] = true;
for(int i(0); i<len; i++) {// start point
if(dq[i]) {
for(int j(1); j<=len-i; j++) {// length of substring, 1:len
if(!dq[i+j]) {
auto pos = wordDict.find(s.substr(i, j));
dq[i+j] = dq[i+j] || (pos!=wordDict.end());
}
}
}
if(dq[len]) return true;
}
return false;
}
};
Try the following:
class Solution {
public:
bool wordBreak(string s, unordered_set<string>& wordDict)
{
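// an empty string is trivially segmentable; without this base case the recursion can never return true
if (s.empty()) return true;
for (const auto& w : wordDict)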
{
auto pos = s.find(w);
if (pos != string::npos)
{
if (wordBreak(s.substr(0, pos), wordDict) &&
wordBreak(s.substr(pos + w.size()), wordDict))
return true;
}
}
return false;
}
};
Essentially, once you find a match, remove the matching part from the input string and continue testing on the smaller remaining pieces.

Calculating the big-O complexity of this string match function?

Can anyone help me calculate the complexity of the following?
I've written a strStr function for homework, and although it's not part of my homework, I want to figure out its complexity.
Basically, it takes a string, finds the first occurrence of a substring, and returns its index.
I believe it is O(n), because although it has a double loop, it will run at most n times, where n is the length of s1. Am I correct?
int strStr( char s1[] , char s2[] ){
int haystackInd, needleInd;
bool found = false;
needleInd = haystackInd = 0;
while ((s1[haystackInd] != '\0') && (!found)){
while ( (s1[haystackInd] == s2[needleInd]) && (s2[needleInd] != '\0') ){
needleInd++;
haystackInd++;
}
if (s2[needleInd] == '\0'){
found = true;
}else{
if (needleInd != 0){
needleInd = 0;
}
else{
haystackInd++;
}
}
}
if (found){
return haystackInd - needleInd;
}
else{
return -1;
}
}
It is indeed O(n), but it is also not functioning properly. Consider finding "nand" in "nanand"
There is an O(n) solution to the problem though.
Actually, the outer loop could run 2n times (each iteration either increments haystackInd at least once or sets needleInd to 0, but it never sets needleInd to 0 in 2 successive iterations), but you end up with the same O(n) complexity.
Your algorithm isn't correct: the way it advances haystackInd is wrong. Your conclusion about the algorithm as written is right, though: it is O(n), it just can't always find the first occurrence of the substring. The most trivial correct solution is like yours: compare string S2 to the substrings starting at S1[0], S1[1], ... Its running time is O(n^2). If you want an O(n) one, you should check out the KMP algorithm, as templatetypedef mentioned above.
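For reference, a compact sketch of KMP could look like this. The function name kmpSearch is mine, and it takes std::string rather than char arrays for brevity. It returns 0 for an empty needle, which is a convention I am assuming here. The failure table tells the search how far to fall back in the needle after a mismatch, so the haystack index never moves backwards.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first occurrence of needle in haystack, or -1.
int kmpSearch(const std::string& haystack, const std::string& needle)
{
    if (needle.empty()) return 0; // assumed convention for an empty needle

    // fail[i]: length of the longest proper prefix of needle[0..i]
    // that is also a suffix of needle[0..i].
    std::vector<std::size_t> fail(needle.size(), 0);
    for (std::size_t i = 1, k = 0; i < needle.size(); ++i) {
        while (k > 0 && needle[i] != needle[k]) k = fail[k - 1];
        if (needle[i] == needle[k]) ++k;
        fail[i] = k;
    }

    // Scan the haystack once; k is the number of needle characters matched.
    for (std::size_t i = 0, k = 0; i < haystack.size(); ++i) {
        while (k > 0 && haystack[i] != needle[k]) k = fail[k - 1];
        if (haystack[i] == needle[k]) ++k;
        if (k == needle.size()) return static_cast<int>(i + 1 - k);
    }
    return -1;
}
```

On the counterexample above, kmpSearch("nanand", "nand") returns 2: after matching "nan" and failing on the second 'a', it falls back to having matched just "n" instead of restarting from scratch.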