C++: Binary Search Part of a String - c++

I am now on part 4 of my homework, and I am trying to figure out how to binary search on just the first 14 characters of each string in an array of strings. We were given two .csv files:
The 01110011100110 is a simulated barcode.
Inventory.csv: Inventory items separated by a comma - Barcode, product, price
First Five Lines:
01001010100011,Air freshener,12.43
01101000111101,Alfredo sauce,10.71
10000010100101,All spice,4.14
00011011001111,Allergy medication,20.06
00101010111011,Aluminum foil,8.92
Carts.csv: A 'cart' containing barcodes of purchases
First Five Lines:
01000011010100,00010011010100
00110111100000
00100110100000,01011000110111,01011101000110,01011101000000,01001110000010
01110101100101,00100111110101,00101101110110,00100110000000,00100000001100,00101101011100
00101100111110,01000110110000,01010110111110,00100111111001,01011101101010,01011011010011,00010011010100,01010111101001
I am working on Part4 of the homework:
Part 4: Write a function to read in the Carts.csv file. Read in one line at a time and process it. The process steps are:
a. Separate the barcodes in each cart based on the comma delimiter.
b. For each barcode, lookup the barcode in the product list.
c. Extract the price from the product string data line and add it to a running total for the cart.
d. After processing the line, output the total price for the cart.
e. Repeat from step (a) until end-of-file.
When I try to look up a barcode, it returns -1 because it is comparing the whole line rather than just part of it. How do I binary search on just part of a string? Here is an excerpt of my code:
int binarySearch(string* list, int size, string value){
int first = 0,
last = size - 1, middle,
position = -1;
bool found = false;
while (!found && first <= last){
middle = (first + last) / 2;
if (list[middle].compare(value) == 0){
found = true;
position = middle;
}
else if (list[middle].compare(value) > 0)
last = middle - 1;
else
first = middle + 1;
}
return position;
}
void processFile(string filename, string* list, int size){
ifstream file(filename);
string line;
int count = 1;
if (file.is_open()){
while (!file.eof()){
int ctr = 0;
int start = 0;
getline(file, line);
string substring = "";
int find = line.find(COMMA);
cout << "Cart " << count << ": ";
while (find != string::npos){
substring = line.substr(start, find - start);
int position = binarySearch(list, size, substring);
cout << position << " ";
start = find + 1;
ctr++;
find = line.find(COMMA, start);
}
if (ctr > 0 || find == -1){
substring = line.substr(start, line.length() - start);
int position = binarySearch(list, size, substring);
cout << position << endl;
count++;
}
}
}
file.close();
}

Try the following implementation, which compares only the first value.size() characters of each entry:
int binarySearch( const std::string* list, int size, std::string value )
{
int position = -1;
int first = 0, last = size - 1;
while ( first <= last )
{
int middle = ( first + last ) / 2;
int result = list[middle].substr( 0, value.size() ).compare( value );
if ( result == 0 )
{
position = middle;
break;
}
else if ( result > 0 )
{
last = middle - 1;
}
else
{
first = middle + 1;
}
}
return position;
}
Of course, the array list must be sorted by barcode in ascending order.

Related

find the maximum number of words in a sentence from a paragraph with C++

I am trying to find the maximum number of words in a sentence (sentences are separated by a dot) from a paragraph, and I am completely stuck on how to sort and output to stdout.
Eg:
Given a string S: {"Program to split strings. By using custom split function. In C++"};
The expected output should be : 5
#define max 8 // define the max string
string strings[max]; // define max string
string words[max];
int count = 0;
void split (string str, char seperator) // custom split() function
{
int currIndex = 0, i = 0;
int startIndex = 0, endIndex = 0;
while (i <= str.size())
{
if (str[i] == seperator || i == str.size())
{
endIndex = i;
string subStr = "";
subStr.append(str, startIndex, endIndex - startIndex);
strings[currIndex] = subStr;
currIndex += 1;
startIndex = endIndex + 1;
}
i++;
}
}
void countWords(string str) // Count The words
{
int count = 0, i;
for (i = 0; str[i] != '\0';i++)
{
if (str[i] == ' ')
count++;
}
cout << "\n- Number of words in the string are: " << count +1 <<" -";
}
//Sort the array in descending order by the number of words
void sortByWordNumber(int num[30])
{
/* CODE str::sort? std::*/
}
int main()
{
string str = "Program to split strings. By using custom split function. In C++";
char seperator = '.'; // dot
int numberOfWords;
split(str, seperator);
cout <<" The split string is: ";
for (int i = 0; i < max; i++)
{
cout << "\n initial array index: " << i << " " << strings[i];
countWords(strings[i]);
}
return 0;
}
count + 1 in countWords() gives the correct number only for the first result; after that, it adds the leading " " whitespace to the word count.
Please consider answering with the easiest solution to understand first (std::sort, making a new function, a lambda).
Your code does not make sense. For example, the meaning of this declaration
string strings[max];
is unclear.
And to find the maximum number of words in the sentences of a paragraph, there is no need to sort the sentences themselves by the number of words.
If I have understood correctly, what you need is something like the following.
#include <iostream>
#include <sstream>
#include <iterator>
#include <string>
int main()
{
std::string s;
std::cout << "Enter a paragraph of sentences: ";
std::getline( std::cin, s );
size_t max_words = 0;
std::istringstream is( s );
std::string sentence;
while ( std::getline( is, sentence, '.' ) )
{
std::istringstream iss( sentence );
auto n = std::distance( std::istream_iterator<std::string>( iss ),
std::istream_iterator<std::string>() );
if ( max_words < n ) max_words = n;
}
std::cout << "The maximum number of words in sentences is "
<< max_words << '\n';
return 0;
}
If you enter the paragraph
Here is a paragraph. It contains several sentences. For example, how to use string streams.
then the output will be
The maximum number of words in sentences is 7
If you are not yet familiar with string streams then you could use member functions find, find_first_of, find_first_not_of with objects of the type std::string to split a string into sentences and to count words in a sentence.
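A minimal sketch of that find-based approach follows. The helper names countWords and maxWordsPerSentence are hypothetical, and the sketch assumes sentences end with '.' and words are separated by spaces, tabs, or newlines.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Counts whitespace-separated words that start inside s[begin, end).
std::size_t countWords(const std::string& s, std::size_t begin, std::size_t end) {
    const char* ws = " \t\n";
    std::size_t count = 0;
    std::size_t pos = s.find_first_not_of(ws, begin);   // start of first word
    while (pos != std::string::npos && pos < end) {
        ++count;
        pos = s.find_first_of(ws, pos);                  // end of current word
        if (pos == std::string::npos || pos >= end) break;
        pos = s.find_first_not_of(ws, pos);              // start of next word
    }
    return count;
}

// Splits text at each '.' and returns the largest word count of any sentence.
// Text after the last '.' is treated as a final sentence.
std::size_t maxWordsPerSentence(const std::string& text) {
    std::size_t best = 0;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t dot = text.find('.', start);
        std::size_t end = (dot == std::string::npos) ? text.size() : dot;
        best = std::max(best, countWords(text, start, end));
        start = end + 1;
    }
    return best;
}
```

For the string from the question, "Program to split strings. By using custom split function. In C++", maxWordsPerSentence returns 5.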
Your use case sounds like a reduction. Essentially, you can have a state machine (parser) that goes through the string and updates some state (e.g. counters) when it encounters the word and sentence delimiters. Special care should be given to corner cases, e.g. multiple consecutive whitespace characters or more than one consecutive full stop (.). A reduction handling these cases is shown below:
#include <numeric>
#include <string>
#include <utility>

int max_words_in(std::string const& str)
{
    // p.first is the word count of the current sentence, p.second the maximum so far.
    // Text after the last '.' is not counted as a sentence.
    auto parser = [in_space = true] (std::pair<int, int> p, char c) mutable {
        switch (c) {
        case '.': // Sentence ends.
            if (!in_space) ++p.first; // Count the word that the full stop terminates.
            if (p.second < p.first) p.second = p.first;
            p.first = 0;
            in_space = true;
            break;
        case ' ': // Word ends.
            if (!in_space) ++p.first;
            in_space = true;
            break;
        default: // Other character encountered.
            in_space = false;
        }
        return p; // Return the updated accumulation value.
    };
    return std::accumulate(
        str.begin(), str.end(), std::make_pair(0, 0), parser).second;
}
The tricky part is deciding how to handle degenerate cases, e.g. what should the output be for "This is a , ,tricky .. .. string to count" where different types of delimiters alternate in arbitrary ways. Having a state machine implementation of the parsing logic allows you to easily adjust your solution (e.g. you can pass an "ignore list" to the parser and update the default case to not reset the in_space variable when c belongs to that list).
vector<string> split(const string& str, char seperator) // custom split() function
{
    vector<string> sentences;
    size_t start = 0;
    size_t pos;
    // Each find() call locates the next separator at or after start.
    while ((pos = str.find(seperator, start)) != string::npos)
    {
        sentences.push_back(str.substr(start, pos - start));
        start = pos + 1;
    }
    // Keep any text that follows the last separator.
    if (start < str.size())
    {
        sentences.push_back(str.substr(start));
    }
    return sentences;
}

c++ binary search with array of char arrays

My binary search is sort of working: if it finds the value, it correctly returns the position at which the value was found in the array. The issue is that it only searches the first half of the array. It never hits the else branch that increases the start index and always takes the if branch, regardless of whether the searchingFor word comes before or after alphabetically. The players array has been sorted alphabetically. Any help would be amazing :)
cout << "type name youd like to search for" << endl;
char playerChoice[10] = "len";
databaseFunctions.SelectionSort(databaseFunctions.players, databaseFunctions.size);
int result = databaseFunctions.BinarySearch(databaseFunctions.players, 0, databaseFunctions.size, playerChoice);
int DataBaseFunctions::BinarySearch(Player* players, int start_index, int end_index, char searchingFor[])
{
while (start_index <= end_index)
{
int middlePoint = (start_index + end_index) / 2;
if (strcmp(players[middlePoint].playerName, searchingFor) == 0)
return middlePoint;
if (searchingFor < players[middlePoint].playerName)
end_index = middlePoint - 1;
else
start_index = middlePoint + 1; // need to fix thiss never hitting else regardless always just -1 instead of adding 1
}
return -1;
}
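One likely cause, assuming playerName is a char array: the expression searchingFor < players[middlePoint].playerName compares two addresses rather than the characters they point to, so the branch taken does not depend on alphabetical order. A minimal sketch that reuses the strcmp result for the ordering decision is shown below. The Player struct here is a hypothetical stand-in for the question's type, and the function takes an element count so that the last valid index is count - 1 (the call in the question passes size as the end index).

```cpp
#include <cstring>

// Hypothetical stand-in for the question's Player type.
struct Player {
    char playerName[10];
};

// Searches players[0..count-1], which must be sorted by playerName.
// Returns the index of the match, or -1 if the name is not present.
int binarySearchByName(const Player* players, int count, const char* searchingFor) {
    int start = 0;
    int end = count - 1;  // last valid index, not the element count
    while (start <= end) {
        int middle = start + (end - start) / 2;
        // strcmp compares the characters; operator< on char* compares addresses.
        int cmp = std::strcmp(searchingFor, players[middle].playerName);
        if (cmp == 0)
            return middle;
        if (cmp < 0)
            end = middle - 1;
        else
            start = middle + 1;
    }
    return -1;
}
```

For a sorted array holding "amy", "bob", "len", "zed", searching for "len" returns 2, and searching for a name that is absent returns -1.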

Search a string for all occurrences of a substring in C++

Write a function countMatches that searches the substring in the given string and returns how many times the substring appears in the string.
I've been stuck on this awhile now (6+ hours) and would really appreciate any help I can get. I would really like to understand this better.
int countMatches(string str, string comp)
{
int small = comp.length();
int large = str.length();
int count = 0;
// If string is empty
if (small == 0 || large == 0) {
return -1;
}
// Increment i over string length
for (int i = 0; i < small; i++) {
// Output substring stored in string
for (int j = 0; j < large; j++) {
if (comp.substr(i, small) == str.substr(j, large)) {
count++;
}
}
}
cout << count << endl;
return count;
}
When I call this function from main with countMatches("Hello", "Hello"); the output is 5, which is completely wrong, as it should return 1. I just want to know what I'm doing wrong here so I don't repeat the mistake and actually understand what I am doing.
I figured it out. I did not need a nested for loop because I was only comparing the secondary string to that of the string. It also removed the need to take the substring of the first string. SOOO... For those interested, it should have looked like this:
int countMatches(string str, string comp)
{
int small = comp.length();
int large = str.length();
int count = 0;
// If string is empty
if (small == 0 || large == 0) {
return -1;
}
// Increment i over string length
for (int i = 0; i < large; i++) {
// Output substring stored in string
if (comp == str.substr(i, small)) {
count++;
}
}
cout << count << endl;
return count;
}
The usual approach is to search in place with std::string::find (here str is the string being searched and comp is the substring, as in the question):
std::string::size_type pos = 0;
int count = 0;
for (;;) {
    pos = str.find(comp, pos);
    if (pos == std::string::npos)
        break;
    ++count;
    ++pos;
}
That can be tweaked if you're not concerned about overlapping matches. For example, when looking for all occurrences of "ll" in the string "llll", the answer could be 3, which the above algorithm will give, or it could be 2, if you don't allow the next match to overlap the first. To get the latter, just change ++pos to pos += comp.size() to resume the search after the entire preceding match.
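As a sketch, the find-based loop can be wrapped into a complete function. The allowOverlap parameter is an assumption added for illustration, and returning 0 for an empty substring is one possible convention rather than the only one.

```cpp
#include <cstddef>
#include <string>

// Counts occurrences of comp in str using std::string::find.
// allowOverlap (hypothetical parameter) selects whether matches may overlap.
std::size_t countMatches(const std::string& str, const std::string& comp,
                         bool allowOverlap = true) {
    if (comp.empty())
        return 0;  // assumed convention: an empty pattern matches nothing
    std::size_t count = 0;
    // Advance by one character to allow overlaps, or by the whole match to forbid them.
    const std::size_t step = allowOverlap ? 1 : comp.size();
    for (std::size_t pos = str.find(comp); pos != std::string::npos;
         pos = str.find(comp, pos + step)) {
        ++count;
    }
    return count;
}
```

With this sketch, countMatches("llll", "ll") gives 3, countMatches("llll", "ll", false) gives 2, and countMatches("Hello", "Hello") gives 1.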
The problem with your function is that you are checking that:
Hello is substring of Hello
ello is substring of ello
llo is substring of llo
...
of course this matches 5 times in this case.
What you really need is:
For each position i of str
check if the substring of str starting at i and of length = comp.size() is exactly comp.
The following code should do exactly that:
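size_t countMatches(const string& str, const string& comp)
{
    // Guard first: str.size() - comp.size() would wrap around for unsigned types.
    if (comp.empty() || comp.size() > str.size())
        return 0;
    size_t count = 0;
    for (size_t j = 0; j + comp.size() <= str.size(); j++)
        if (comp == str.substr(j, comp.size()))
            count++;
    return count;
}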

Getting last N segments of URL in C++

I need to write a function to return the last N segments of a given URL, i.e. given /foo/bar/zoo and N=2, I expect to get back /bar/zoo. Boundary conditions should be handled appropriately. I have no problem doing it in C, but the best C++ version I could come up with is this:
string getLastNSegments(const string& url, int N)
{
basic_string<char>::size_type found = 0, start = url.length()+1;
int segments = N;
while (start && segments && (start = url.find_last_of('/', start-1)) != string::npos) {
found = start;
segments--;
}
return url.substr(found);
}
cout << "result: " << getLastNSegments("/foo/bar/zoo", 2) << endl;
Is there a more idiomatic (STL+algorithms) way of doing this?
Use std::string and rfind().
You call rfind successively N times feeding the last index as parameter. You now have the start index of the string you're looking for and use substr to extract the substring.
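std::string x("http:/example.org/a/b/abc/bcd");
int N = 3;
std::string::size_type idx = x.length();
// Each pass moves idx back to the previous '/'; stop early at the start of the string.
while ( N-- > 0 && idx != 0 )
{
    idx = x.rfind('/', idx - 1);
    if ( idx == std::string::npos ) { idx = 0; break; } // fewer than N segments
}
std::string final = x.substr(idx); // "/b/abc/bcd"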
Nothing wrong with just using a loop. I don't know of any STL string functions that will do what you want in a single call.
By the way, what happens when you ask for the last 3 segments of http://www.google.com/?
Call me old-school, but personally I would not use any STL searches here... What's the matter with this:
if( N <= 0 || url.length() == 0 ) return "";
const char *str = url.c_str();
const char *start = str + url.length();
int remain = N;
while( --start != str )
{
if( *start == '/' && --remain == 0 ) break;
}
return string(start);
Last but not least, a simple boost split solution
string getLastNSegments(const string& url, int n)
{
string selected;
vector<string> elements;
boost::algorithm::split(elements, url, boost::is_any_of("/"));
for (int i = 0; i < min(n, int(elements.size())); i++)
selected = "/" + elements.at(elements.size()-1-i) + selected;
return selected;
}
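For an STL-plus-algorithms flavor, one possible sketch walks the string backwards with reverse iterators and std::find. The function name lastNSegments is hypothetical, and the sketch assumes that asking for more segments than exist should return the whole string.

```cpp
#include <algorithm>
#include <string>

// Returns the last n '/'-delimited segments of url, including the leading '/'.
// If url has fewer than n segments, the whole string is returned.
std::string lastNSegments(const std::string& url, int n) {
    auto it = url.rbegin();
    for (; n > 0 && it != url.rend(); --n) {
        it = std::find(it, url.rend(), '/');  // previous slash, scanning backwards
        if (it != url.rend())
            ++it;                             // step past the slash just found
    }
    // it.base() points at the last slash found (or at begin() if we ran out).
    return std::string(it.base(), url.end());
}
```

Here lastNSegments("/foo/bar/zoo", 2) yields "/bar/zoo", and a request for 10 segments yields the full "/foo/bar/zoo".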

Sub-sequence numbers in a string

I'm trying to get the longest run of consecutive increasing numbers in an array with 10 items:
int list[] = {2,3,8,9,10,11,12,2,6,8};
int start_pos = 0;
int lenght=0; // lenght of the sub-~consetuve
for (int a =0; a <=9; a++ )
{
if ((list[a]+1) == (list[a+1])) {
// continue just the string;
lenght++;
} else {
start_pos = a;
}
}
cout << lenght << " and start in " << start_pos;
getchar();
But it is not working. It should return start_pos 3 and length 4, because the longest increasing run is 9, 10, 11, 12.
Assuming you actually meant subsequence, just guess the digit your sequence starts with and then run a linear scan. If you meant substring, it's even easier --- left as an exercise to OP.
The linear scan goes like this:
char next = <guessed digit>;
int len = 0;
char *ptr = <pointer to input string>;
while (*ptr) {
if ((*ptr) == next) {
next = next + 1;
if (next > '9') next = '0';
len++;
}
ptr++;
}
Now wrap that in a loop that sets next to each digit from '0' to '9', and pick the one that gives the longest length.
A simple idea: track the start point, end point, and length of the current sequence.
Run a loop over index i.
A sequence starts whenever the current number (at index i) is 1 less than the next number; set the start point to i.
It ends when that condition becomes false. That gives the end point and the length = end - start (keep one more variable, max, to compare lengths). The result is max. Reset the start and end points to 0 when a sequence ends.
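The single-pass idea can be sketched as follows. The function name longestConsecutiveRun and the choice to return a {start index, length} pair are assumptions for illustration; ties are resolved in favor of the earliest run.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns {start index, length} of the longest run in which every element
// equals the previous one plus 1. Returns {0, 0} for an empty vector.
std::pair<std::size_t, std::size_t> longestConsecutiveRun(const std::vector<int>& v) {
    std::size_t bestStart = 0, bestLen = 0;
    std::size_t start = 0;  // start of the run that contains index i
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i > 0 && v[i] != v[i - 1] + 1)
            start = i;      // the run is broken, so a new one begins here
        std::size_t len = i - start + 1;
        if (len > bestLen) {
            bestLen = len;
            bestStart = start;
        }
    }
    return {bestStart, bestLen};
}
```

For the array {2,3,8,9,10,11,12,2,6,8} this returns start index 2 and length 5, because 8 is also part of the run 8, 9, 10, 11, 12.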
I made it myself:
#include <cstdio>
#include <iostream>
using namespace std;
// Returns true if list[0..iv+1] increases by exactly 1 at every step.
bool cons(int list[], int iv) {
    for (int a = 0; a <= iv; a++) {
        if (list[a] != list[a+1] - 1) return false;
    }
    return true;
}
int main() {
    const int SIZE = 10;
    int str[SIZE] = {12,13,15,16,17,18,20,21}; // remaining elements are 0
    int longest = 0;
    int pos = 0;
    // Try every window length, and every start position that keeps the window in bounds.
    for (int lenght = 1; lenght <= SIZE; lenght++) {
        int li[SIZE];
        for (int seek = 0; seek + lenght <= SIZE; seek++) {
            for (int kor = 0; kor < lenght; kor++) {
                li[kor] = str[seek + kor];
            }
            if (cons(li, lenght - 2)) {
                longest = lenght;
                pos = seek;
            }
        }
    }
    for (int b = pos; b < pos + longest; b++) cout << str[b] << " - ";
    cout << "it is the end!" << endl;
    getchar();
    return 0;
}