Sorting given text alphabetically w/o additional library - c++

My homework was to write an application that sorts a given text in alphabetical order. I was only allowed to use the 'vector', 'string' and 'iostream' libraries.
I succeeded, but now I'm struggling with a strange problem: when I sort a short text everything works well, but with longer inputs the program seems to get into an infinite loop or hit an efficiency problem. E.g. in the text
"Albert Einstein 14 March 1879 – 18 April 1955 was a German-born theoretical physicist who developed the theory of relativity one of the two pillars of modern physics alongside quantum mechanics His work is also known[...]"
everything works great up to the word "mechanics". After adding this, or any other word, the program runs forever, as mentioned above.
I'm afraid I have to paste the whole code in this case (please forgive me).
#include <iostream>
#include <string>
#include <vector>
int compare(std::string first, std::string second){
int flag = 1;
int i;
if (first.size() <= second.size()){
for (i = 0; i<first.size(); ++i ){
if (first[i] == second[i]){
continue;}
else if (first[i] > second[i]){
flag = 0;
break;}
else {
break;}}}
else {
for (i = 0; i<second.size(); ++i ){
if (first[i] == second[i]) {
continue;}
else if (first[i] > second[i]){
flag = 0;
break;}
else {
break;}
}
int m = second.size() - 1;
if (first[m] == second[m]){
flag = 0;}}
return flag;}
int main() {
std::vector<std::string> text;
std::string word;
std::string tmp;
while(std::cin >> word){
text.push_back(word);}
int mistakes, m = 1;
while(m) {
mistakes = 0;
for (int index = 1; index < text.size() ; ++index){
if (!(compare(text[index-1], text[index]))){
tmp = text[index];
text[index] = text[index-1];
text[index-1] = tmp;
mistakes += 1;}}
m = mistakes;}
for (auto element: text){
std::cout << element << " ";}}
I would love to hear how to fix it and why exactly this problem appears. The execution time doesn't seem to grow with the length of the input; it's more like "works / doesn't work", which is unlike an efficiency issue.

You missed some conditions, because of which your while loop runs forever. For example:
Your while loop only stops once mistakes becomes 0, and for some inputs it never does. A failing test case is "ball apple": the first pass swaps the words into "apple ball", but compare then reports that pair as out of order too (see the next point), so the two words are swapped back and forth forever.
In your compare method, the following lines give wrong answers for test cases like "Apple ball". Here first[m] = second[m] = 'l', so your condition sets flag to 0 and the words are swapped. That puts "ball" before "Apple", which is wrong.
int m = second.size() - 1;
if (first[m] == second[m]){
flag = 0;
}
You also need to handle comparisons between upper- and lower-case words. For example: "Month also". In this case the answer should be "also Month". So before comparing two strings you should convert them to the same case.
Number comparison, where 1874 should come after 18, is another case you may want to handle. (You can add this.)
Following is the corrected code.
#include <iostream>
#include <string>
#include <vector>
#include <cctype>
int compare(std::string first, std::string second){
// this is to handle the comparison of two words with mixed case(upper/lower) of letters.
// earlier solution failed for comparison between 'Month' and 'a'
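for(int i=0;i<first.size();i++){
// cast to unsigned char first: passing a negative char value to tolower is undefined behaviour
first[i] = std::tolower(static_cast<unsigned char>(first[i]));
}
for(int i=0;i<second.size();i++){
second[i] = std::tolower(static_cast<unsigned char>(second[i]));
}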
int flag = 1;
int i;
if (first.size() <= second.size()){
for (i = 0; i<first.size(); ++i ){
if (first[i] == second[i]){
continue;}
else if (first[i] > second[i]){
flag = 0;
break;}
else {
break;}}}
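else {
for (i = 0; i<second.size(); ++i ){
if (first[i] == second[i]) {
continue;}
else if (first[i] > second[i]){
flag = 0;
break;}
else {
break;}
}
// second is a proper prefix of first (e.g. "apple" vs "app"): first must come later
if (i == second.size()){
flag = 0;}
}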
return flag;}
int main() {
std::vector<std::string> text;
std::string word;
std::string tmp;
while(std::cin >> word){
text.push_back(word);}
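// bubble sort; the bounds are written as i+1 < size so that an empty input does not underflow size()-1
for(std::size_t i=0;i+1<text.size();i++){
for(std::size_t j=0;j+1+i<text.size();j++){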
if (!(compare(text[j], text[j+1]))){
tmp = text[j];
text[j] = text[j+1];
text[j+1] = tmp;
}
}
}
for (auto element: text){
std::cout << element << " ";}}
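The termination issue can also be looked at from the comparison's side: if the comparison is only ever true in one direction for any pair of words, a swap-until-clean loop has to stop, because every swap removes one out-of-order pair. The following is a minimal sketch of that idea, not the code from the question or the answer. The names comesBefore and sortWords are made up for illustration, and it assumes that <cctype> is allowed, as in the corrected code above.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: true only if `a` must come strictly before `b`,
// ignoring case. Equal words give false in both directions.
bool comesBefore(const std::string& a, const std::string& b)
{
    const std::string::size_type common = a.size() < b.size() ? a.size() : b.size();
    for (std::string::size_type i = 0; i < common; ++i) {
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb)
            return ca < cb;
    }
    // one word is a prefix of the other: the shorter one comes first
    return a.size() < b.size();
}

// Bubble sort that swaps only when the right-hand word must come first.
void sortWords(std::vector<std::string>& text)
{
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (std::vector<std::string>::size_type i = 1; i < text.size(); ++i) {
            if (comesBefore(text[i], text[i - 1])) {
                const std::string tmp = text[i];
                text[i] = text[i - 1];
                text[i - 1] = tmp;
                swapped = true;
            }
        }
    }
}
```

Because comesBefore(a, b) and comesBefore(b, a) can never both be true, a pair that has just been swapped is never swapped back.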

Related

Find prefix within a string

I'm currently working on a LeetCode question where I have to find a prefix within a sentence and return the number of the first word that starts with it, or -1 if there is none. I came up with a solution, but it fails on some strings and I don't know why. An example that works is the following:
Input: sentence = "i love eating burger", searchWord = "burg"
Output: 4 (I also get an output of 4)
Explanation: "burg" is prefix of "burger" which is the 4th word in the sentence.
but fails this example:
Input: sentence = "this problem is an easy problem", searchWord = "pro"
Output: 2 ( I get an output of 6)
Explanation: "pro" is prefix of "problem" which is the 2nd and the 6th word in the sentence, but we return 2 as it's the minimal index.
My cout for this one produced a very weird snippet:
problem is an easy problem
problem is an easy problem
problem is an easy problem
problem is an easy problem
probl
proble
problem
problem
problem i
problem is
It completely ignored the first couple of substrings when i increments; this is the only time it happens, though.
int isPrefixOfWord(string sentence, string searchWord)
{
string sub;
int count = 1;
for (int i = 0; i < sentence.length(); i++)
{
if (sentence[i] == ' ')
count++;
for (int j = i; j < sentence.length(); j++)
{
sub = sentence.substr(i, j);
cout<<sub<<endl;
if (sub == searchWord)
{
return count;
}
}
}
return -1;
}
Any Ideas?
int isPrefixOfWord(string sentence, string searchWord)
{
string sub;
int count = 1;
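// stop once fewer than searchWord.length() characters remain; written without subtraction to avoid unsigned underflow
for (int i = 0; i + searchWord.length() <= sentence.length(); i++)
{
if (sentence[i] == ' ')
count++;
sub = sentence.substr(i,searchWord.length());
// test i == 0 first so that sentence[i-1] is never read at index -1
if ( sub == searchWord && (i == 0 || sentence[i-1] == ' '))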
{
return count;
}
}
return -1;
}
A very simple C++20 solution using starts_with:
#include <string>
#include <sstream>
#include <iostream>
int isPrefixOfWord(std::string sentence, std::string searchWord)
{
int count = 1;
std::istringstream strm(sentence);
std::string word;
while (strm >> word)
{
if ( word.starts_with(searchWord) )
return count;
++count;
}
return -1;
}
int main()
{
std::cout << isPrefixOfWord("i love eating burger", "burg") << "\n";
std::cout << isPrefixOfWord("this problem is an easy problem", "pro") << "\n";
std::cout << isPrefixOfWord("this problem is an easy problem", "lo");
}
Output:
4
2
-1
Currently, LeetCode and many other of the online coding sites do not support C++20, thus this code will not compile successfully on those online platforms.
Therefore, here is a live example using a C++20 compiler
We can just use std::basic_stringstream for solving this problem. This'll pass through:
// Most of the headers are already included on LeetCode;
// Can be removed;
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <string_view>
// The following block might slightly improve the execution time;
// Can be removed;
// (renamed: identifiers containing a double underscore are reserved)
static const auto optimize_io = []() {
std::ios::sync_with_stdio(false);
std::cin.tie(nullptr);
std::cout.tie(nullptr);
return 0;
}();
struct Solution {
static const int isPrefixOfWord(
const std::string sentence,
const std::string_view search_word
) {
std::basic_stringstream stream_sentence(sentence);
std::size_t index = 1;
std::string word;
while (stream_sentence >> word) {
if (!word.find(search_word)) {
return index;
}
++index;
}
return -1;
}
};
The bug that affects the function output is that you aren't handling the increment of i within your inner for loop:
for (int i = 0; i < sentence.length(); i++)
{
if (sentence[i] == ' ')
count++;
for (int j = i; j < sentence.length(); j++)
{
sub = sentence.substr(i, j);
cout<<sub<<endl;
if (sub == searchWord)
{
return count;
}
}
}
Notice that once your inner loop is complete, i always advances by one. So your next search starts at the following character, even in the middle of a word, which means you search for "sub-words" instead of only prefixes, and so you get false positives (and unnecessary work).
Also note that every time you do:
(sub == searchWord)
this checks all the characters of sub, even though we're only interested in whether the newly added character is a match.
Another bug, which affects your performance and your couts, is that you're not handling mismatches:
if (sub == searchWord)
...has no else branch, so after a mismatch the only way to exit the inner loop is to keep incrementing j till the end of the string, so sub ends up being large.
A way to fix your second bug is to replace your inner loop like so (note that the second argument of substr is a length, not an end index):
if (sentence.substr(i, searchWord.length()) == searchWord)
return count;
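For reference, std::string::substr(pos, count) takes a start position and a number of characters, not an end index, which is one reason sub grows so quickly in the question's output. A minimal sketch of a prefix check built on it follows; the helper name prefixAt is made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `searchWord` occurs in `sentence` starting at index `pos`.
bool prefixAt(const std::string& sentence, std::size_t pos, const std::string& searchWord)
{
    // a start position past the end would make substr throw std::out_of_range
    if (pos > sentence.size())
        return false;
    // the second argument is a length: take at most searchWord.size() characters from pos
    return sentence.substr(pos, searchWord.size()) == searchWord;
}
```

For example, std::string("problem is").substr(2, 3) yields "obl": three characters starting at index 2.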
and finally, to fix all bugs:
int isPrefixOfWord (const string & sentence, const string & searchWord)
{
if (sentence.length() < searchWord.length())
return -1;
const size_t i_max = sentence.length() - searchWord.length();
for (size_t i = 0, count = 1; ; ++count)
{
// flush spaces:
while (sentence[i] == ' ')
{
if (i >= i_max)
return -1;
++i;
}
if (sentence.substr(i, searchWord.length()) == searchWord)
return count;
// flush word:
while (sentence[i] != ' ')
{
if (i >= i_max)
return -1;
++i;
}
}
return -1;
}
Note that substr returns a copy of the characters (it's not just a view into the string), so each call takes linear time with respect to searchWord.length(), which is particularly wasteful when the word within sentence is shorter than searchWord.
We can improve the speed by replacing
if (sentence.substr(i, searchWord.length()) == searchWord)
return count;
...with
for (size_t j = 0; sentence[i] == searchWord[j]; )
{
++j;
if (j == searchWord.size())
return count;
++i;
}
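Another way to skip the temporary copy, offered here as an alternative sketch and not taken from the answer above, is the std::string::compare overload that takes a position and a length and compares in place. The function name firstWordWithPrefix is made up for illustration, and the sketch assumes that words are separated only by ' ' and that the prefix itself contains no spaces.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: 1-based number of the first word in `sentence`
// that starts with `prefix`, or -1 if there is none.
int firstWordWithPrefix(const std::string& sentence, const std::string& prefix)
{
    int wordNumber = 0;
    std::size_t i = 0;
    while (i < sentence.size()) {
        // skip the spaces in front of the next word
        while (i < sentence.size() && sentence[i] == ' ')
            ++i;
        if (i == sentence.size())
            break;
        ++wordNumber;
        // compare(pos, len, str) compares sentence[pos, pos+len) with str, no temporary string
        if (sentence.compare(i, prefix.size(), prefix) == 0)
            return wordNumber;
        // skip the rest of this word
        while (i < sentence.size() && sentence[i] != ' ')
            ++i;
    }
    return -1;
}
```

Each character of the sentence is visited a bounded number of times, and the comparison is attempted only at the start of a word.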
Others have shown nice applications of the libraries that help solve the problem.
If you don't have access to those libraries for your assignment, or if you just want to learn how you could modularise a problem like this without losing efficiency, then here's a way to do it in C++11 without any libraries (except string):
bool IsSpace (char c)
{
return c == ' ';
}
bool NotSpace (char c)
{
return c != ' ';
}
class PrefixFind
{
using CharChecker = bool (*)(char);
template <CharChecker Condition>
void FlushWhile ()
{
while ((m_index < sentence.size())
&& Condition(sentence[m_index]))
++m_index;
}
void FlushWhiteSpaces ()
{
FlushWhile<IsSpace>();
}
void FlushToNextWord ()
{
FlushWhile<NotSpace>();
FlushWhile<IsSpace>();
}
bool PrefixMatch ()
{
// SearchOngoing() must equal `true`
size_t j = 0;
while (sentence[m_index] == search_prefix[j])
{
++j;
if (j == search_prefix.size())
return true;
++m_index;
}
return false;
}
bool SearchOngoing () const
{
return m_index + search_prefix.size() <= sentence.size();
}
const std::string & sentence;
const std::string & search_prefix;
size_t m_index;
public:
PrefixFind (const std::string & s, const std::string & sw)
: sentence(s),
search_prefix(sw)
{}
int FirstMatchingWord ()
{
const int NO_MATCHES = -1;
if (!search_prefix.length())
return NO_MATCHES;
m_index = 0;
FlushWhiteSpaces();
for (int n = 1; SearchOngoing(); ++n)
{
if (PrefixMatch())
return n;
FlushToNextWord();
}
return NO_MATCHES;
}
};
In terms of speed: if we consider the length of sentence to be m and the length of searchWord to be n, then the original (buggy) code had O(n*m^2) time complexity. With this improvement we get O(m).

C++ Palindrome program always giving 0 (false) as an output problem; Where is my code wrong?

The problem is that it always outputs 0 (false) as a result. Probably the problem is in the isPalindrome function, but I cannot figure where exactly. Would be grateful if someone helped.
#include <iostream>
#include <cmath>
#include <string>
using namespace std;
bool isPalindrome(string word)
{
bool result;
for (int i = 0; i <= word.length() - 1; i++)
{
if (word.at(i) == word.length() - 1)
{
result = true;
}
else
{
result = false;
}
return result;
}
}
int main()
{
string word1;
int count;
cout << "How many words do you want to check whether they are palindromes: " << flush;
cin >> count;
for (int i = 0; i < count; i++)
{
cout << "Please enter a word: " << flush;
cin >> word1;
cout << "The word you entered: " << isPalindrome(word1);
}
}
Try this one:
bool isPalindrome(string word)
{
bool result = true;
for (int i = 0; i < word.length() / 2; i++) //it is enough to iterate only the half of the word (since we take both from the front and from the back each time)
{
if (word[i] != word[word.length() - 1 - i]) //we compare left-most with right-most character (each time shifting index by 1 towards the center)
{
result = false;
break;
}
}
return result;
}
In this statement
if (word.at(i) == word.length() - 1)
the right-hand expression of the comparison never changes and has the type std::string::size_type instead of the type char. You mean
if (word.at(i) == word.at( word.length() - 1 - i ))
However, there is no need to use the member function at. You could use the subscript operator. For example
if ( word[i] == word[word.length() - 1 - i ] )
And the loop should have word.length() / 2 iterations.
Also, within the loop you are overwriting the variable result, so you always return the value from the last comparison made. It can be true even though the string is not a palindrome.
Also, the parameter should be a reference type. Otherwise a redundant copy of the passed argument is created.
The function can be defined the following way
bool isPalindrome( const std::string &word )
{
std::string::size_type i = 0;
std::string::size_type n = word.length();
while ( i < n / 2 && word[i] == word[n - i - 1] ) i++;
return i == n / 2;
}
Another approach is the following
bool isPalindrome( const std::string &word )
{
return word == std::string( word.rbegin(), word.rend() );
}
Though this approach requires to create a reverse copy of the original string.
The simplest way is to use the standard algorithm std::equal. Here is a demonstrative program
#include <iostream>
#include <string>
#include <iterator>
#include <algorithm>
bool isPalindrome( const std::string &word )
{
return std::equal( std::begin( word ),
std::next( std::begin( word ), word.size() / 2 ),
std::rbegin( word ) );
}
int main()
{
std::cout << isPalindrome( "123454321" ) << '\n';
return 0;
}
I hope this one helps you as well (it also fixes the warnings):
bool isPalindrome(string word)
{
int lengthWord = (int)word.length();
for (int i = 0; i < (lengthWord / 2); ++i)
{
// return at the first mismatch; a later matching pair must not overwrite this result
if (word.at(i) != word.at(lengthWord - i - 1))
{
return false;
}
}
return true;
}
Two possible problems.
You appear to be comparing a character to a number
if (word.at(i) == word.length() - 1)
shouldn't this be
if (word.at(i) == word.at(word.length() - 1 - i)) ?
The return statement is inside the for loop, so no matter what the outcome, it's only going to compare one character before returning to the calling function.
As a point of technique, repeated calls to .length inside the loop, which always returns the same value, wastes time and makes the code more difficult to understand.
You need to return as soon as you find a mismatch. If you are looking for a palindrome you only need to compare the first half of the word with the second half in reverse order. Something like
bool isPalindrome(string word)
{
for (int i = 0, j= word.length() - 1; i<j; i++, j--)
// i starts at the beginning of the string, j at the end.
// Once the i >= j you have reached the middle and are done.
// They step in opposite directions
{
if (word[i] != word[j])
{
return false;
}
}
return true;
}
The loop in the function isPalindrome will only execute once, because the return statement is unconditionally executed in the first iteration of the loop. I am sure that this is not intended.
To determine whether a string is a palindrome, the loop must be executed several times. Only after the last character has been evaluated (in the last iteration of the loop) will it be time to use the return statement, unless you determine beforehand that the string is not a palindrome.
Also, in the function isPalindrome, the following expression is nonsense, as you are comparing the ASCII Code of a letter with the length of the string:
word.at(i) == word.length() - 1
Therefore, I suggest the following code for the function:
bool isPalindrome(string word)
{
for (int i = 0; i < word.length() / 2; i++)
{
if (word.at(i) != word.at( word.length() - i - 1) ) return false;
}
return true;
}
As discussed in the comments under your question, you made some mistakes in the code.
Your function should more or less look like this:
bool isPalindrome(string word) {
bool result = true;
for (int i = 0; i < word.length() / 2; i++)
{
if (word.at(i) != word.at(word.length() - 1 -i))
{
return false;
}
}
return result;
}

What's wrong with my dynamic programming algorithm with memoization?

*Sorry about my poor English. If there is anything that you don't understand, please tell me so that I can give you more information that makes sense.
**This is my first time asking a question on Stack Overflow. I've looked up the rules for asking questions correctly here, but there may be something I missed. I welcome all feedback.
I'm currently solving algorithm problems to improve my skills, and I've been struggling with one question for three days. This question is from https://algospot.com/judge/problem/read/RESTORE , but since this page is in Korean, I tried to translate it into English.
Question
Given 'k' partial strings, find the shortest string that includes all of them.
All strings consist only of lowercase letters.
If more than one result string of the same length satisfies all conditions, choose any of them.
Input
The first line of input gives the number of test cases 'C' (C<=50).
For each test case, the number of partial strings 'k' (1<=k<=15) is given on the first line, and the partial strings are given on the next k lines.
The length of each partial string is between 1 and 40.
Output
For each test case, print the shortest string that includes all partial strings.
Sample Input
3
3
geo
oji
jing
2
world
hello
3
abrac
cadabra
dabr
Sample Output
geojing
helloworld
cadabrac
And here is my code. My code seems to work perfectly with the sample inputs, and when I made test inputs of my own and tested them, everything worked fine. But when I submit this code, the judge says my code is 'wrong'.
Please tell me what is wrong with my code. You don't need to give me the whole fixed code; I just need sample inputs that cause errors in my code. I added a code description to make my code easier to understand.
Code Description
Saved all input partial strings in vector 'stringParts'.
Saved current shortest string result in global variable 'answer'.
Used 'cache' array for memoization - to skip repeated function call.
The algorithm I designed to solve this problem is divided into two functions:
restore() & eraseOverlapped().
The restore() function calculates the shortest string that includes all partial strings in 'stringParts'.
The result of restore() is saved in 'answer'.
restore() has three parameters: 'curString', 'selected' and 'last'.
'curString' stands for the string built so far from the selected, overlapped parts.
'selected' stands for the currently selected elements of 'stringParts'. I used a bitmask to keep the algorithm concise.
'last' stands for the last element of 'stringParts' selected for making 'curString'.
The eraseOverlapped() function does preprocessing: before restore() is executed, it deletes the elements of 'stringParts' that are completely contained in other elements.
#include <algorithm>
#include <iostream>
#include <vector>
#include <cstring>
#include <string>
#define MAX 15
using namespace std;
int k;
string answer; // save shortest result string
vector<string> stringParts;
bool cache[MAX + 1][(1 << MAX) + 1]; //[last selected string][set of selected strings in Bitmask]
void restore(string curString, int selected=0, int last=0) {
//base case 1
if (selected == (1 << k) - 1) {
if (answer.empty() || curString.length() < answer.length())
answer = curString;
return;
}
//base case 2 - memoization
bool& ret = cache[last][selected];
if (ret != false) return;
for (int next = 0; next < k; next++) {
string checkStr = stringParts[next];
if (selected & (1 << next)) continue;
if (curString.empty())
restore(checkStr, selected + (1 << next), next + 1);
else {
int check = false;
//count max overlapping area of two strings and overlap two strings.
for (int i = (checkStr.length() > curString.length() ? curString.length() : checkStr.length())
; i > 0; i--) {
if (curString.substr(curString.size()-i, i) == checkStr.substr(0, i)) {
restore(curString + checkStr.substr(i, checkStr.length()-i), selected + (1 << next), next + 1);
check = true;
break;
}
}
if (!check) { // if there aren't any overlapping area
restore(curString + checkStr, selected + (1 << next), next + 1);
}
}
}
ret = true;
}
//check if there are strings that can be completely included by other strings, and delete that string.
void eraseOverlapped() {
//arranging string vector in ascending order of string length
int vectorLen = stringParts.size();
for (int i = 0; i < vectorLen - 1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].length() < stringParts[j].length()) {
string temp = stringParts[i];
stringParts[i] = stringParts[j];
stringParts[j] = temp;
}
}
}
//deleting included strings
vector<string>::iterator iter;
for (int i = 0; i < vectorLen-1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].find(stringParts[j]) != string::npos) {
iter = stringParts.begin() + j;
stringParts.erase(iter);
j--;
vectorLen--;
}
}
}
}
int main(void) {
int C;
cin >> C; // testcase
for (int testCase = 0; testCase < C; testCase++) {
cin >> k; // number of partial strings
memset(cache, false, sizeof(cache)); // initializing cache to false
string inputStr;
for (int i = 0; i < k; i++) {
cin >> inputStr;
stringParts.push_back(inputStr);
}
eraseOverlapped();
k = stringParts.size();
restore("");
cout << answer << endl;
answer.clear();
stringParts.clear();
}
}
After determining which string-parts can be removed from the list since they are contained in other string-parts, one way to model this problem might be as the "taxicab ripoff problem" (or Max TSP), where each potential length reduction by overlap is given a positive weight. Considering that the input size in the question is very small, it seems likely that they expect a near brute-force solution, with possibly some heuristic and backtracking or other form of memoization.
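The overlap-as-weight model can be sketched as a bitmask dynamic program over integers. This is one common way to do it and not necessarily what the judge or any book expects. The names overlap and shortestSuperstringLength are made up for illustration, and the sketch assumes that parts contained in other parts have already been removed. It returns only the length of the shortest string; reconstructing the string itself is left out.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest suffix of `a` that is also a prefix of `b`.
int overlap(const std::string& a, const std::string& b)
{
    const int limit = static_cast<int>(std::min(a.size(), b.size()));
    for (int len = limit; len > 0; --len) {
        if (a.compare(a.size() - len, len, b, 0, len) == 0)
            return len;
    }
    return 0;
}

// Hypothetical helper: length of the shortest string containing every part.
// Assumes no part is a substring of another part.
int shortestSuperstringLength(const std::vector<std::string>& parts)
{
    const int k = static_cast<int>(parts.size());
    if (k == 0)
        return 0;
    std::vector<std::vector<int>> ov(k, std::vector<int>(k, 0));
    int totalLength = 0;
    for (int i = 0; i < k; ++i) {
        totalLength += static_cast<int>(parts[i].size());
        for (int j = 0; j < k; ++j)
            if (i != j)
                ov[i][j] = overlap(parts[i], parts[j]);
    }
    // best[used][last]: largest total overlap over orderings of the set `used`
    // that end with part `last`; -1 marks states that cannot occur.
    std::vector<std::vector<int>> best(1 << k, std::vector<int>(k, -1));
    for (int i = 0; i < k; ++i)
        best[1 << i][i] = 0;
    for (int used = 1; used < (1 << k); ++used) {
        for (int last = 0; last < k; ++last) {
            if (best[used][last] < 0)
                continue;
            for (int next = 0; next < k; ++next) {
                if (used & (1 << next))
                    continue;
                int& slot = best[used | (1 << next)][next];
                slot = std::max(slot, best[used][last] + ov[last][next]);
            }
        }
    }
    const std::vector<int>& full = best[(1 << k) - 1];
    return totalLength - *std::max_element(full.begin(), full.end());
}
```

With k at most 15 there are 2^15 * 15 states, each holding one int, so the table stays small compared with caching whole strings.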
Thanks to everyone who tried to help me solve this problem. I actually solved it with a few changes to my previous algorithm. These are the main changes.
In my previous algorithm I saved the result of restore() in the global variable 'answer', since restore() didn't return anything. In the new algorithm restore() returns the intermediate answer string, so I no longer need 'answer'.
I used a string cache instead of a bool cache. I found out that using a bool cache for memoization in this algorithm was useless.
I deleted the 'curString' parameter from restore(). Since all we need during a recursive call is the one previously selected partial string, 'last' can take over the role of 'curString'.
CODE
#include <algorithm>
#include <iostream>
#include <vector>
#include <cstring>
#include <string>
#define MAX 15
using namespace std;
int k;
vector<string> stringParts;
string cache[MAX + 1][(1 << MAX) + 1];
string restore(int selected = 0, int last = -1) {
if (selected == (1 << k) - 1) {
return stringParts[last];
}
if (last == -1) {
string ret = "";
for (int next = 0; next < k; next++) {
string resultStr = restore(selected + (1 << next), next);
if (ret.empty() || ret.length() > resultStr.length())
ret = resultStr;
}
return ret;
}
string& ret = cache[last][selected];
if (!ret.empty()) {
// cached result for this (last, selected) pair; nothing is printed here, since debug output would end up in the judged answer
return ret;
}
string curString = stringParts[last];
for (int next = 0; next < k; next++) {
if (selected & (1 << next)) continue;
string checkStr = restore(selected + (1 << next), next);
int check = false;
string resultStr;
for (int i = (checkStr.length() > curString.length() ? curString.length() : checkStr.length())
; i > 0; i--) {
if (curString.substr(curString.size() - i, i) == checkStr.substr(0, i)) {
resultStr = curString + checkStr.substr(i, checkStr.length() - i);
check = true;
break;
}
}
if (!check)
resultStr = curString + checkStr;
if (ret.empty() || ret.length() > resultStr.length())
ret = resultStr;
}
return ret;
}
void EraseOverlapped() {
int vectorLen = stringParts.size();
for (int i = 0; i < vectorLen - 1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].length() < stringParts[j].length()) {
string temp = stringParts[i];
stringParts[i] = stringParts[j];
stringParts[j] = temp;
}
}
}
vector<string>::iterator iter;
for (int i = 0; i < vectorLen - 1; i++) {
for (int j = i + 1; j < vectorLen; j++) {
if (stringParts[i].find(stringParts[j]) != string::npos) {
iter = stringParts.begin() + j;
stringParts.erase(iter);
j--;
vectorLen--;
}
}
}
}
int main(void) {
int C;
cin >> C;
for (int testCase = 0; testCase < C; testCase++) {
cin >> k;
for (int i = 0; i < MAX + 1; i++) {
for (int j = 0; j < (1 << MAX) + 1; j++)
cache[i][j] = "";
}
string inputStr;
for (int i = 0; i < k; i++) {
cin >> inputStr;
stringParts.push_back(inputStr);
}
EraseOverlapped();
k = stringParts.size();
string resultStr = restore();
cout << resultStr << endl;
stringParts.clear();
}
}
This algorithm is much slower than the 'ideal' algorithm that the book I'm studying suggests, but it was fast enough to pass this question's time limit.

Using vectors to solve the anagrams in C++

I have a function that takes in two vectors of strings and compares each element to see if they are anagrams of one another.
Vector #1: "bat", "add", "zyz", "aaa"
Vector #2: "tab", "dad", "xyx", "bbb"
Restrictions and other things to clarify: The function is supposed to loop through both vectors and compare the strings. I am only supposed to compare based on the index of each vector; meaning I only compare the strings which are in the first index, then the strings which are in the second index, and so on. It's safe to assume that the vectors passed in as parameters will always be the same size.
If the compared strings are anagrams, "Match" is printed on the screen. If they aren't, "No Match" is printed.
Output: Match Match No Match No Match
I'm getting ridiculously stuck on this problem. I know how to reverse strings, but when it comes to this I'm a bit clueless.
I understand that I would need to iterate through each vector, and then compare. But how would I be able to compare each letter within the string? Also, I'm not allowed to include anything else like algorithm, sort, or set. I've tried digging through a lot of questions but most answers utilized this.
If there are any tips on how to solve this, that would be great. I'll be posting what I find shortly.
Here's what I got so far:
#include <iostream>
#include <vector>
#include <string>
using namespace std;
void anagrams(const vector<string>& vOne, const vector<string>& vTwo){
for(int i=0; i< vOne.size(); i++){
for(int j=0; j< vTwo.size(); j++){
if(vOne[i].size() != vTwo[j].size()){
cout << 0 << endl;
}
else {
cout << 1 << endl;
}
}
}
}
void quicksort(vector<int>& a, int low, int high){
if(low < high)
{
int mid = (low + high)/2;
int pivot = a[mid];
swap(a[high], a[mid]);
int i, j;
for(i=low, j=high-1; ;){
while(a[i]<pivot) ++i;
while(j>i && pivot < a[j]) --j;
if (i < j)
swap(a[i++], a[j--]);
else
break;
}
swap(a[i], a[high]);
}
quicksort(a, low, i - 1);
quicksort(a, i + 1, high);
}
Thanks in advance!
Though you are not able to use sort, you should still sort the words you are checking against each other to see if they are anagrams. You will just have to sort the characters manually, which is unfortunate, yet a good exercise. I would make a predicate, a function that compares the two strings and returns true or false, and use that to check if they are anagrams. Also, it seems as though you don't need to print out the words that actually match; if that is true, then you can sort the words in the vectors when you first read them in, and then just run them through your predicate function.
// Predicate
bool isMatch(const string &lhs, const string &rhs)
{
...sort and return lhs == rhs;
}
If you write the function as I have above, you are passing in the parameters by const reference, so you can copy them (not using strcpy(), due to its vulnerabilities) into char[] and sort the words. I would recommend writing your sort as its own function.
Another hint: the words here are short, so a simple hand-written sort is fast enough, even though the standard library's sort would normally be the better choice. Anyway, I hope this helps even a little bit; I didn't want to give you the whole answer.
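For readers who want to check their own attempt, here is one possible shape of such a predicate. It is a sketch and not the answerer's code: it uses a plain insertion sort on copies of the strings and includes only <string>, in line with the question's restrictions. The helper name sortedLetters is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `word` with its characters
// in ascending order, using a hand-written insertion sort.
std::string sortedLetters(std::string word)
{
    for (std::string::size_type i = 1; i < word.size(); ++i) {
        // move word[i] left until the character before it is not larger
        for (std::string::size_type j = i; j > 0 && word[j - 1] > word[j]; --j) {
            const char tmp = word[j - 1];
            word[j - 1] = word[j];
            word[j] = tmp;
        }
    }
    return word;
}

// Predicate: true if lhs and rhs consist of exactly the same characters.
bool isMatch(const std::string& lhs, const std::string& rhs)
{
    return lhs.size() == rhs.size() && sortedLetters(lhs) == sortedLetters(rhs);
}
```

Looping over both vectors by index and printing "Match" or "No Match" depending on isMatch(vOne[i], vTwo[i]) then gives the output described in the question.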
A solution that is fairly quick as long as the strings only contain characters between a-z and A-Z would be
bool is_anagram( const string& s1, const string& s2 ) {
if( s1.size() != s2.size() ) {
return false;
}
size_t count[ 26 * 2 ] = { 0 };
for( size_t i = 0; i < s1.size(); i++ ) {
char c1 = s1[ i ];
char c2 = s2[ i ];
if( c1 >= 'a' ) {
count[ c1 - 'a' ]++;
}
else {
count[ c1 - 'A' + 26 ]++;
}
if( c2 >= 'a' ) {
count[ c2 - 'a' ]--;
}
else {
count[ c2 - 'A' + 26 ]--;
}
}
for( size_t i = 0; i < 26 * 2; i++ ) {
if( count[ i ] != 0 ) {
return false;
}
}
return true;
}
If you're willing to use C++11, here is some rather inefficient code for seeing if two strings are anagrams. I'll leave it up to you to loop through the list of words.
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int count_occurrences(string& word, char search) {
int count = 0;
for (char s : word) {
if (s == search) {
count++;
}
}
return count;
}
bool compare_strings(string word1, string v2) {
if (word1.size() != v2.size())
{
return false;
}
for (char s: word1) //In case v1 contains letters that are not in v2
{
if (count_occurrences(word1, s) != count_occurrences(v2, s))
{
return false;
}
}
return true;
}
int main() {
string s1 = "bat";
string s2 = "atb";
bool result = compare_strings(s1, s2);
if (result)
{
cout << "Match" << endl;
}
else
{
cout << "No match" << endl;
}
}
This works by simply counting the number of times a given letter occurs in a string. A better way to do this would be to sort the characters in the string alphabetically, and then compare the sorted strings to see if they are equal. I'll leave it up to you to improve this.
Best wishes.
Another solution, since I'm sufficiently bored:
#include <iostream>
#include <vector>
#include <string>
int equiv_class(char c) {
if ((c>='A')&&(c<='Z')) return c-'A';
if ((c>='a')&&(c<='z')) return c-'a';
return 27;
}
bool is_anagram(const std::string& a, const std::string& b)
{
if (a.size()!=b.size()) return false;
int hist[26]={};
int nz=0; // Non-zero histogram sum tally
for (int i=0, e=a.size() ; i!=e ; ++i)
{
int aclass = equiv_class(a[i]);
int bclass = equiv_class(b[i]);
if (aclass<27) {
switch (++hist[aclass]) {
case 1: ++nz; break; // We were 0, now we're not--add
case 0: --nz; break; // We weren't, now we are--subtract
// otherwise no change in nonzero count
}
}
if (bclass<27) {
switch (--hist[bclass]) {
case -1: ++nz; break; // We were 0, now we're not--add
case 0: --nz; break; // We weren't, now we are--subtract
// otherwise no change in nonzero count
}
}
}
return 0==nz;
}
int main()
{
std::vector<std::string> v1{"elvis","coagulate","intoxicate","a frontal lobotomy"};
std::vector<std::string> v2{"lives","catalogue","excitation","bottlein frontofme"};
for (int i=0, e=(v1.size()==v2.size()?v1.size():0); i!=e; ++i) {
if (is_anagram(v1[i],v2[i])) {
std::cout << " Match";
} else {
std::cout << " No Match";
}
}
}

C++ function to count all the words in a string

I was asked this during an interview and apparently it's an easy question but it wasn't and still isn't obvious to me.
Given a string, count all the words in it. It doesn't matter if they are repeated; just the total count, like a text file's word count. Words are anything separated by a space, and punctuation doesn't matter as long as it's part of a word.
For example:
A very, very, very, very, very big dog ate my homework!!!! ==> 11 words
My "algorithm" just goes through the string looking for spaces and incrementing a counter until I hit the null terminator. Since I didn't get the job and was asked to leave after that, I guess my solution wasn't good? Does anyone have a more clever solution? Am I missing something?
Assuming words are white space separated:
#include <iterator>
#include <sstream>
#include <string>
unsigned int countWordsInString(std::string const& str)
{
std::stringstream stream(str);
return std::distance(std::istream_iterator<std::string>(stream), std::istream_iterator<std::string>());
}
Note: there may be more than one space between words, and counting spaces does not catch other whitespace characters such as tab, newline, or carriage return. So counting spaces alone is not enough.
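To make the difference concrete, here is a small sketch that contrasts the two counting strategies. Both function names (`count_by_spaces`, `count_by_extraction`) are invented for this illustration and are not part of any library.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>

// Naive count: number of ' ' characters plus one.
std::size_t count_by_spaces(const std::string& s) {
    return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' ')) + 1;
}

// Stream-based count: operator>> skips any run of whitespace
// (spaces, tabs, newlines) before reading each word.
std::size_t count_by_extraction(const std::string& s) {
    std::istringstream stream(s);
    std::string word;
    std::size_t n = 0;
    while (stream >> word) {
        ++n;
    }
    return n;
}
```

For "a  b" (two spaces) the naive version reports 3 words and the stream version 2; for "a\tb" the naive version reports 1 and the stream version 2.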
The stream input operator >>, when used to read a string from a stream, reads one whitespace-separated word. So they were probably looking for you to use this to identify words.
std::stringstream stream(str);
std::string oneWord;
stream >> oneWord; // Reads one space separated word.
We can use this to count words in a string.
std::stringstream stream(str);
std::string oneWord;
unsigned int count = 0;
while(stream >> oneWord) { ++count;}
// count now has the number of words in the string.
Getting complicated:
Streams can be treated much like any other container, and std::istream_iterator provides iterators to loop through them. When you use the ++ operator on an istream_iterator, it just reads the next value from the stream using operator >>. In this case we are reading std::string, so it reads one whitespace-separated word.
std::stringstream stream(str);
std::string oneWord;
unsigned int count = 0;
std::istream_iterator<std::string> loop(stream); // reads the first word
std::istream_iterator<std::string> end;          // default-constructed: end of stream
for(;loop != end; ++count, ++loop) { *loop; }
Using std::distance just wraps all of the above in a tidy package, as it finds the distance between two iterators by doing ++ on the first until it reaches the second.
To avoid copying the string we can be sneaky:
unsigned int countWordsInString(std::string const& str)
{
std::stringstream stream;
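// Sneaky way to use the string as the buffer to avoid a copy.
// Caveat: the effect of setbuf on a stringbuf is implementation-defined, so this may do nothing.
// pubsetbuf takes a char*, hence the const_cast (the buffer is only read from).
stream.rdbuf()->pubsetbuf(const_cast<char*>(str.c_str()), str.length());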
return std::distance(std::istream_iterator<std::string>(stream), std::istream_iterator<std::string>());
}
Note: we still copy each word out of the original into a temporary. But the cost of that is minimal.
A less clever, more obvious-to-all-of-the-programmers-on-your-team method of doing it.
#include <cctype>
int CountWords(const char* str)
{
if (str == NULL)
return error_condition; // let the requirements define this...
bool inSpaces = true;
int numWords = 0;
while (*str != '\0')
{
if (std::isspace(static_cast<unsigned char>(*str))) // cast avoids UB for negative char values
{
inSpaces = true;
}
else if (inSpaces)
{
numWords++;
inSpaces = false;
}
++str;
}
return numWords;
}
You can use the std::count or std::count_if to do that. Below a simple example with std::count:
//Count the number of words on string
#include <iostream>
#include <string>
#include <algorithm> //count and count_if are declared here
int main () {
std::string sTEST("Text to verify how many words it has.");
std::cout << std::count(sTEST.cbegin(), sTEST.cend(), ' ')+1;
return 0;
}
UPDATE: Due to the observation made by Aydin Özcan (Nov 16), I made a change to this solution. Now the words may have more than one space between them. :)
//Count the number of words on string
#include <string>
#include <iostream>
int main () {
std::string T("Text to verify : How many words does it have?");
size_t NWords = T.empty() || T.back() == ' ' ? 0 : 1;
for (size_t s = T.size(); s > 0; --s)
if (T[s] == ' ' && T[s-1] != ' ') ++NWords;
std::cout << NWords;
return 0;
}
Another boost based solution that may work (untested):
vector<string> result;
split(result, "aaaa bbbb cccc", is_any_of(" \t\n\v\f\r"), token_compress_on);
More information can be found in the Boost String Algorithms Library
This can be done without manually looking at every character or copying the string.
#include <boost/iterator/transform_iterator.hpp>
#include <cctype>
boost::transform_iterator
< int (*)(int), std::string::const_iterator, bool const& >
pen( str.begin(), std::isalnum ), end( str.end(), std::isalnum );
size_t word_cnt = 0;
while ( pen != end ) {
word_cnt += * pen;
pen = std::mismatch( pen+1, end, pen ).first;
}
return word_cnt;
I took the liberty of using isalnum instead of isspace.
This is not something I would do at a job interview. (It's not like it compiled the first time.)
Or, for all the Boost haters ;v)
if ( str.empty() ) return 0;
size_t word_cnt = std::isalnum( * str.begin() ) != 0; // isalnum returns nonzero, not necessarily 1
for ( std::string::const_iterator pen = str.begin(); ++ pen != str.end(); ) {
word_cnt += std::isalnum( pen[ 0 ] ) && ! std::isalnum( pen[ -1 ] );
}
return word_cnt;
An O(N) solution that is also very simple to understand and implement:
(I haven't checked for an empty string input. But I am sure you can do that easily.)
#include <cctype>
#include <iostream>
#include <string>
using namespace std;
int countNumberOfWords(string sentence){
int numberOfWords = 0;
size_t i;
if (isalpha(sentence[0])) {
numberOfWords++;
}
for (i = 1; i < sentence.length(); i++) {
if ((isalpha(sentence[i])) && (!isalpha(sentence[i-1]))) {
numberOfWords++;
}
}
return numberOfWords;
}
int main()
{
string sentence;
cout<<"Enter the sentence : ";
getline(cin, sentence);
int numberOfWords = countNumberOfWords(sentence);
cout<<"The number of words in the sentence is : "<<numberOfWords<<endl;
return 0;
}
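Note that an isalpha-based counter treats every non-letter as a separator, which is not quite the definition in the question (words are separated by whitespace, and punctuation is part of the word). The sketch below puts the two definitions side by side; both function names are hypothetical and only serve this comparison.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Counts the starts of whitespace-delimited tokens (the question's definition).
std::size_t count_space_delimited(const std::string& s) {
    std::size_t n = 0;
    bool prev_space = true;  // treat the position before the string as whitespace
    for (unsigned char c : s) {
        const bool is_space = std::isspace(c) != 0;
        if (prev_space && !is_space) ++n;
        prev_space = is_space;
    }
    return n;
}

// Counts the starts of letter runs (what the isalpha-based answers compute).
std::size_t count_letter_runs(const std::string& s) {
    std::size_t n = 0;
    bool prev_alpha = false;
    for (unsigned char c : s) {
        const bool is_alpha = std::isalpha(c) != 0;
        if (is_alpha && !prev_alpha) ++n;
        prev_alpha = is_alpha;
    }
    return n;
}
```

Both agree on the example sentence from the question (11 words), but "don't stop" gives 2 whitespace-delimited words versus 3 letter runs, and "born 1879" gives 2 versus 1.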
Here is a single pass, branchless (almost), locale-aware algorithm which handles cases with more than one space between words:
If the string is empty return 0
let transitions = number of adjacent char pairs (c1, c2) where c1 == ' ' and c2 != ' '
if the sentence starts with a space, return transitions else return transitions + 1
Here is an example with string = "A very, very, very, very, very big dog ate my homework!!!!"
i  | 0123456789...
c1 | A very, very, very, very, very big dog ate my homework!!!!
c2 |  very, very, very, very, very big dog ate my homework!!!!
   |  x     x     x     x     x    x   x   x   x  x
Explanation
Let `i` be the loop counter.
When i=0: c1='A' and c2=' ', the condition `c1 == ' '` and `c2 != ' '` is not met
When i=1: c1=' ' and c2='v', the condition is met
... and so on for the remaining characters
Here are 2 solutions I came up with
Naive solution
size_t count_words_naive(const std::string_view& s)
{
if (s.size() == 0) return 0;
size_t count = 0;
bool isspace1, isspace2 = true;
for (auto c : s) {
isspace1 = std::exchange(isspace2, isspace(c));
count += (isspace1 && !isspace2);
}
return count;
}
If you think carefully, you will be able to reduce this set of operations into an inner product (just for fun, I don't recommend this as this is arguably much less readable).
Inner product solution
size_t count_words_using_inner_prod(const std::string_view& s)
{
if (s.size() == 0) return 0;
auto starts_with_space = isspace(s.front());
auto num_transitions = std::inner_product(
s.begin()+1, s.end(), s.begin(), 0, std::plus<>(),
[](char c2, char c1) { return isspace(c1) && !isspace(c2); });
return num_transitions + !starts_with_space;
}
I think this will help.
The complexity is O(n).
#include <iostream>
#include <string>
#include <ctype.h>
using namespace std;
int main()
{
int count = 0, size;
string sent;
getline(cin, sent);
size = sent.size();
// count a word wherever a letter is followed by a non-letter
for (int i = 0; i < size - 1; ++i) {
if (isalpha(sent[i]) && !isalpha(sent[i+1])) {
++count;
}
}
// a word at the very end of the sentence is not counted above, so count it here
if (size > 0 && isalpha(sent[size - 1])) ++count; // size check avoids reading sent[-1] on empty input
cout << count << endl;
return 0;
}
A very concise O(N) approach:
bool is_letter(char c) { return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z'; }
int count_words(const string& s) {
int i = 0, N = s.size(), count = 0;
while(i < N) {
while(i < N && !is_letter(s[i])) i++;
if(i == N) break;
while(i < N && is_letter(s[i])) i++;
count++;
}
return count;
}
A divide-and-conquer approach, complexity is also O(N):
int DC(const string& A, int low, int high) {
if(low > high) return 0;
int mid = low + (high - low) / 2;
int count_left = DC(A, low, mid-1);
int count_right = DC(A, mid+1, high);
if(!is_letter(A[mid]))
return count_left + count_right;
else {
if(mid == low && mid == high) return 1;
if(mid-1 < low) {
if(is_letter(A[mid+1])) return count_right;
else return count_right+1;
} else if(mid+1 > high) {
if(is_letter(A[mid-1])) return count_left;
else return count_left+1;
}
else {
if(!is_letter(A[mid-1]) && !is_letter(A[mid+1]))
return count_left + count_right + 1;
else if(is_letter(A[mid-1]) && is_letter(A[mid+1]))
return count_left + count_right - 1;
else
return count_left + count_right;
}
}
}
int count_words_divide_n_conquer(const string& s) {
return DC(s, 0, s.size()-1);
}
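The case analysis around `mid` above can also be phrased as merging per-segment summaries, which may make the O(1) combine step easier to see. This is an alternative sketch of the same divide-and-conquer idea, not the code above: `Summary`, `summarize`, `is_ascii_letter` and `count_words_summary` are names introduced here for illustration, and a "word" is assumed to be a run of ASCII letters, as in `is_letter`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical per-segment summary: word count plus whether the segment
// begins and ends inside a word.
struct Summary {
    std::size_t count;
    bool first_is_letter;
    bool last_is_letter;
};

bool is_ascii_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Summarizes the non-empty half-open range [lo, hi) of s.
Summary summarize(const std::string& s, std::size_t lo, std::size_t hi) {
    if (hi - lo == 1) {
        const bool letter = is_ascii_letter(s[lo]);
        const std::size_t n = letter ? 1 : 0;
        return {n, letter, letter};
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    const Summary left = summarize(s, lo, mid);
    const Summary right = summarize(s, mid, hi);
    std::size_t count = left.count + right.count;
    // A word that straddles the split was counted once on each side.
    if (left.last_is_letter && right.first_is_letter) --count;
    return {count, left.first_is_letter, right.last_is_letter};
}

std::size_t count_words_summary(const std::string& s) {
    return s.empty() ? 0 : summarize(s, 0, s.size()).count;
}
```

Each merge does constant work, so the recurrence is T(n) = 2T(n/2) + O(1), which gives O(N) overall.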
Efficient version based on map-reduce approach
#include <cctype>
#include <functional>
#include <iostream>
#include <numeric>
#include <string_view>
std::size_t CountWords(std::string_view s) {
if (s.empty())
return 0;
std::size_t wc = (!std::isspace(s.front()) ? 1 : 0);
wc += std::transform_reduce(
s.begin(),
s.end() - 1,
s.begin() + 1,
std::size_t(0),
std::plus<std::size_t>(),
[](char left, char right) {
return std::isspace(left) && !std::isspace(right);
});
return wc;
}
int main() {
using namespace std::string_view_literals; // enables the "..."sv suffix
std::cout << CountWords(" pretty little octopus "sv) << std::endl;
return 0;
}