Counting words in a c string - c++

I need help completing this function so that it correctly returns the number of words in the C string. Maybe my logic is wrong?
#include <iostream>
#include <string>
#include <cctype>
int countwords(char *, int);
using namespace std;
int main()
{
char a[] = "Four score and seven";
int size = sizeof(a)/sizeof(char);
cout << countwords(a,size);
return 0;
}
int countwords(char* a, int size){
int j = 0;
for(int i = 0; i < size; i++){
if(isspace(i) and isalnum(i - 1) and isalnum(i + 1))
j++;
}
return j;
}

You are passing the value of i to these functions instead of a[i]. That means you're testing if your loop variable is a space (for example), rather than the character at that position in the a array.
Once you have fixed that, understand that you can't blindly reference a[i-1] in that loop (because of the possibility of accessing a[-1]). You will need to update your logic (note also that && is the conventional spelling of logical AND; and is a valid alternative token in standard C++, but it is rarely used).
I suggest using a flag to indicate whether you are currently "inside" a word. And reset that flag whenever you decide that you are no longer inside a word. eg
int inside = 0;
for (int i = 0; i < size; i++) {
if (isalnum(a[i])) {
if (!inside) {
inside = 1;
j++;
}
} else {
inside = 0;
}
}
Also, please use strlen(a) instead of sizeof(a)/sizeof(char). If you continue that practice, you're bound to have an accident one day when you try it on a pointer.
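Putting the pieces of this answer together, a minimal sketch might look like the following. It assumes a word is a run of alphanumeric characters, takes the length from strlen instead of a size parameter, and adds a cast to unsigned char that the answer itself does not mention (the classifier functions are not defined for negative char values).

```cpp
#include <cctype>
#include <cstring>

// Counts runs of alphanumeric characters in a NUL-terminated string.
int countwords(const char *a) {
    int words = 0;
    bool inside = false;                 // are we currently inside a word?
    std::size_t len = std::strlen(a);    // excludes the terminating '\0'
    for (std::size_t i = 0; i < len; i++) {
        if (std::isalnum(static_cast<unsigned char>(a[i]))) {
            if (!inside) {               // first character of a new word
                inside = true;
                words++;
            }
        } else {
            inside = false;              // any other character ends the word
        }
    }
    return words;
}
```

Because the loop never looks at a[i-1] or a[i+1], there is no out-of-bounds access at either end of the string.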

This loop is invalid
for(int i = 0; i < size; i++){
if(isspace(i) and isalnum(i - 1) and isalnum(i + 1))
First of all, you do not check whether the characters of the string are spaces or alphanumeric. You check the variable i, which has nothing in common with the content of the string. Moreover, you intend to access memory beyond the array.
As you are dealing with a string I would declare the function the following way
size_t countwords( const char *s );
It could be defined as
size_t countwords( const char *s )
{
size_t count = 0;
while ( *s )
{
while ( isspace( *s ) ) ++s;
if ( *s ) ++count;
while ( isalnum( *s ) ) ++s;
}
return ( count );
}
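I do not take punctuation symbols into account. To handle them, replace isspace( *s ) with *s && !isalnum( *s ) in the first inner loop; the extra *s check is needed so that the loop stops at the terminating zero.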
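As a sketch of the punctuation-aware variant described above, assuming that every non-alphanumeric character should act as a separator, the function could be written like this. The casts to unsigned char are an addition that the answer does not include.

```cpp
#include <cctype>
#include <cstddef>

// Counts runs of alphanumeric characters; everything else separates words.
std::size_t countwords(const char *s) {
    std::size_t count = 0;
    while (*s) {
        // Skip separators, but stop at the terminating zero.
        while (*s && !std::isalnum(static_cast<unsigned char>(*s))) ++s;
        if (*s) ++count;                 // a word starts here
        // Skip the word itself; isalnum('\0') is false, so this stops too.
        while (std::isalnum(static_cast<unsigned char>(*s))) ++s;
    }
    return count;
}
```

Each inner loop always advances past the character class it tests, so the outer loop makes progress on every iteration, including on input such as "this,error".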

A simpler version would be to repeatedly call strtok() on the string, and each time that an element is returned, you can increment a word count. This would take care of doubled spaces, and so on. You could even split two words with a comma but no space ("this,error") without difficulty.
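Something along the lines of:
char *tok = strtok(s, " ,.;"); // the first call passes the string itself
while (tok) {
    wordcount++;
    tok = strtok(NULL, " ,.;"); // later calls pass NULL to continue the scan
}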
The only immediate disadvantage is that strtok is destructive, so make a copy before starting.
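A minimal sketch of the copy-then-tokenize idea follows. The function name countwords_strtok and the use of std::vector for the copy are assumptions made for illustration; the delimiter set is the one from the snippet above.

```cpp
#include <cstring>
#include <vector>

// Counts tokens separated by any of " ,.;" without modifying the input.
int countwords_strtok(const char *text) {
    // strtok writes '\0' into its argument, so work on a private copy
    // (including the terminator).
    std::vector<char> copy(text, text + std::strlen(text) + 1);
    const char *delims = " ,.;";
    int wordcount = 0;
    for (char *tok = std::strtok(copy.data(), delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        wordcount++;
    }
    return wordcount;
}
```

Note that strtok keeps its position in hidden static state, so this approach is not suitable if another part of the program is tokenizing at the same time.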

To count the number of words, you merely need to count the number of times you see a non-whitespace character after a whitespace character. To get things right at the start of the string, assume there is "whitespace" to the left of the string.
int countwords(char* a, int size) {
bool prev_ws = true; // pretend like there's whitespace to the left of a[]
int words = 0;
for (int i = 0; i < size; i++) {
// Is the current character whitespace?
bool curr_ws = isspace( (unsigned char)a[i] );
// If the current character is not whitespace,
// but the previous was, it's the start of a word.
if (prev_ws && !curr_ws)
words++;
// Remember whether the current character was
// whitespace for the next iteration.
prev_ws = curr_ws;
}
return words;
}
You might also notice I included a cast to unsigned char on the call to isspace(). On some platforms, char defaults to signed, but the classifier functions isspace and friends aren't guaranteed to work with negative values. The cast forces all the values to be positive. (More details: http://en.cppreference.com/w/cpp/string/byte/isspace )
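One caveat, assuming size is still computed with sizeof as in the question: that value includes the terminating '\0', which is not whitespace, so a string that ends in a space (for example "seven ") would have the terminator counted as the start of an extra word. The sketch below applies the same previous/current idea but stops at the terminator instead of taking a size; the changed signature is an assumption, not part of the answer.

```cpp
#include <cctype>

// Counts transitions from whitespace to non-whitespace in a C string.
int countwords(const char *a) {
    bool prev_ws = true;   // pretend there is whitespace before the string
    int words = 0;
    for (; *a != '\0'; ++a) {
        bool curr_ws = std::isspace(static_cast<unsigned char>(*a)) != 0;
        if (prev_ws && !curr_ws)
            words++;       // a word starts at this character
        prev_ws = curr_ws;
    }
    return words;
}
```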

Related

How do you break a long string into words and iterate through each character of word and if they match increment a char count using stringstream

int MatchString::comparsion(string newq, string oldq){
//breaks down the string into the smaller strings
stringstream s1(newq);
stringstream s2(oldq);
string new_words;
string old_words;
int word_count = 0;
while(s1>>new_words&&s2>>old_words){
for(int i = 0; i<new_words.length();i++){
for(int j = 0; j<old_words.length();j++){
char a = new_words[i];
char b = old_words[j];
if(a == b){
char_count++;
}
else{
j++;
}
}//end of 2nd for
}//end of for
}
return char_count;
}
I'm currently trying to write a function that takes two strings and breaks them down into words, then into chars. Afterward, I compare each pair of chars, and if they are equal I increment char_count by 1. Otherwise I increment j so that I compare the next char in string 2 with string 1. I need this char_count value later for another algorithm that calculates a percentage difference between the two strings, which is why I return it at the end; including that calculation in this method would be a bit messy. However, when I print the return value with cout, I get something completely wrong. I don't know what I'm doing wrong. Can you please help?
The j++ under else in the inner for-loop is the problem: the loop header already increments j, so every mismatch advances j twice and skips a character of old_words. Let the for-loop advance its own counter and remove the else branch.

How do you find first character NOT in string array without using classes or libraries when comparing two strings?

I am trying to compare two string arrays, but am not allowed to use classes or libraries to assist.
The issue I have is that if one string is more than one character long, the code compares against the whole string again, even though it already checked the first character.
char *find_first_not_in_the_set(char *str, const char *set)
{
for(int i = 0; *(str + i) != '\0'; i++)
{
for(int j = 0; *(set + j) != '\0'; j++)
{
if(str[i] != set[j])
{
return &(str[i]);
}
}
}
return NULL;
}
If "Hello World!" is the first string and the second string is "He", the program should return l, but it returns H because it still checks the first character.
I'd rather use this:
bool matrix[256] = {0};
int length = strlen(set);
// remember all characters we have in the 'set'
for( int i=0; i<length; i++) matrix[set[i] & 0xFF] = 1;
length = strlen(str);
// now check the characters from 'str'
for( int i=0; i<length; i++) {
if( ! matrix[str[i] & 0xFF] ) {
printf( "Found: %c", str[i] );
break;
}
}
For every character in str, your code checks whether it differs from each and every position in set. Thus, when i=0, 'H' is compared with set[0], i.e. 'H', for j=0. But when j=1, 'H' is compared with 'e', and this causes the function to return str[0] because i is still 0.
Using just one loop and checking str[i]!=set[i] fixes this particular example, but note that it compares the two strings position by position rather than testing whether each character is a member of the set.
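For reference, a sketch of the set-membership version that avoids library calls entirely might look like this. It assumes 8-bit chars and switches both parameters to const char * (a change from the question's signature) so that string literals can be passed directly.

```cpp
// Returns a pointer to the first character of str that does not occur
// anywhere in set, or nullptr if every character of str is in set.
const char *find_first_not_in_the_set(const char *str, const char *set) {
    bool in_set[256] = {false};          // one flag per possible byte value
    for (int j = 0; set[j] != '\0'; j++)
        in_set[static_cast<unsigned char>(set[j])] = true;
    for (int i = 0; str[i] != '\0'; i++)
        if (!in_set[static_cast<unsigned char>(str[i])])
            return &str[i];              // first character missing from set
    return nullptr;
}
```

With "Hello World!" and "He" this returns a pointer to the first 'l', because 'H' and 'e' are both marked in the table before str is scanned.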

C++, return duplicate instances from an array to a string

Background to this: This is not homework; it's completely optional review for a basic C++ class. As I want to pass, I'm going through each example as best I can. I'm super stuck on this one, and have been for about three hours now.
Problem: Write a function to return a string composed of the most frequent lowercase letter found in each row of a 10 x 10 array of lowercase alphabetic chars in the range a through z.
If there is more than one most frequent character, use the one that comes first alphabetically.
Use neither cin nor cout.
#include <iostream>
#include <string>
using namespace std;
string mostFrequent(char c[10][10]){
// this is the function I need to create
}
int main(){
char c[10][10] = {
'a','b','f','d','e','f','g','h','i','j',
'a','b','c','r','c','r','g','h','r','j',
'a','b','c','d','e','f','g','h','o','o',
'z','w','p','d','e','f','g','h','i','j',
'o','d','o','d','o','b','o','d','o','d',
'a','l','l','d','e','f','f','h','l','j',
'a','b','c','d','i','f','g','h','i','j',
'a','b','z','v','z','v','g','g','v','z',
'a','b','c','d','e','f','g','h','i','e',
'a','b','s','d','e','f','g','h','s','j',
};
cout << mostFrequent(c) << endl;
return 0;
}
So in research for this I found some material that allows me to count how many times a specific int or char would appear inside the array, but it doesn't quite suit the needs of the problem as it needs to return a string composed of the most frequent character. See below.
int myints[] = {10,20,30,30,20,10,10,20};
int mycount = std::count (myints, myints+8, 10);
Since that doesn't fit, I was thinking of a for loop that goes row by row. I'll most likely need to save things into an array to count them, but I'm not sure at all how to implement something like that. I even considered a Caesar shift with an array, but I'm not sure where to go if that is the solution.
If I understood the task correctly, you have a matrix 10x10 and you have to create a string of length 10, where character at position i is the one that is most frequent among characters in the row i.
string mostFrequent(char c[10][10]) {
// The idea here is to find the most common character in each row and then append that character to the string
string s = "";
for (int i = 0; i < 10; i++) s += findMostFrequentCharacter(c[i]);
return s;
}
Now we just have to implement a function char findMostFrequentCharacter(char c[10]), declared or defined above mostFrequent. We are going to do that by counting all of the characters and picking the one that is most frequent (or the one that comes first alphabetically if there is more than one most frequent character):
char findMostFrequentCharacter(char c[10]) {
    int counts[256]; // Assuming these are ASCII characters, we can have max 256 different values
    // Fill it with zeroes (you can use memset to do that, but for clarity I'll write a for-loop)
    for (int i = 0; i < 256; i++) counts[i] = 0;
    // Do the actual counting
    for (int i = 0; i < 10; i++) // For each character
        counts[ c[i] ]++; // Increase its count by 1, note that c[i] is going to have values between 65 (upper case A) and 122 (lower case z)
    char mostFrequent = 0;
    // Find the one that is most frequent
    for (char ch = 'A'; ch <= 'z'; ch++) // This will ensure that we iterate over all upper and lower case letters (+ some more but we don't care about it)
        if (counts[ch] > counts[mostFrequent]) mostFrequent = ch; // Take the one that is more frequent, note that in case they have the same count nothing will happen which is what we want since we are iterating through characters in alphabetical order
    return mostFrequent;
}
I have written the code off the top of my head, so I'm sorry if there are any compile errors.
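For completeness, here is a compact sketch of the same idea that compiles as one unit. It assumes, as the problem statement says, that every cell holds a lowercase letter 'a' through 'z' in an ASCII-compatible encoding, so a 26-entry table is enough; the helper name is kept from the answer above.

```cpp
#include <string>

// Returns the most frequent letter in one row; ties go to the letter
// that comes first alphabetically. Assumes only 'a'..'z' are present.
char findMostFrequentCharacter(const char row[10]) {
    int counts[26] = {0};
    for (int i = 0; i < 10; i++)
        counts[row[i] - 'a']++;
    int best = 0;
    for (int k = 1; k < 26; k++)
        if (counts[k] > counts[best])    // strict '>' keeps the earlier letter on ties
            best = k;
    return static_cast<char>('a' + best);
}

// Builds a 10-character string with one result per row.
std::string mostFrequent(char c[10][10]) {
    std::string s;
    for (int i = 0; i < 10; i++)
        s += findMostFrequentCharacter(c[i]);
    return s;
}
```

For the 10 x 10 array in the question, this produces the string frodolives.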

Convert a string into a char array

New to C++. Here is part of a project I'm working on: taking a string and printing the most commonly used letter along with how many times it was used. I thought this was right, but for some reason my char array won't be read in. Any tips or suggestions on how to fix it?
#include <string>
#include <iostream>
using namespace std;
char getMostFreqLetter(string ss);
int main() {
string s; //initilizing a variable for string s
s = ("What is the most common letter in this string "); // giving s a string
getMostFreqLetter(s); // caling the function to print out the most freq Letter
return 0;
}
char getMostFreqLetter(string ss) {
int max, index, i = 0;
int array[255] = {0};
char letters[];
// convert all letters to lowercase to make counting letters non case sensative
for (int i = 0; i < ss.length(); i ++){
ss[i] = tolower(ss[i]);
}
//read each letter into
for (int i = 0; i < ss.length(); i ++){
++array[letters[i]];
}
//
max = array[0];
index = 0;
for (int i = 0; i < ss.length(); i ++){
if( array[i] > max)
{
max = array[i];
index = i;
}
}
return 0;
}
If you are not counting white space as a letter, a more efficient way would be:
vector<int> count(26,0);
for (int i = 0; i < s.length(); i++) {
int range = tolower(s[i])-'a';
if ( range >= 0 && range < 26)
count[range]++;
}
// Now you can find the max while iterating over count;
Use string::c_str().
It returns a pointer to a null-terminated, read-only character array holding the string's contents.
You have a few errors in your code.
Firstly, the array of chars letters is completely unused. You should disregard it and iterate over the string ss instead which is what I think you intended to do.
This would change your second for loop from ++array[letters[i]]; to ++array[ss[i]];.
Secondly, your last for loop is buggy. You are using i as the index to look for the frequency in array whereas you need to use the ascii value of the character (ss[i]) instead. Here is a fixed version with comments:
index = ss[0];
max = array[index];
for (int i = 0; i < ss.length(); i ++){
if(!isspace(ss[i]) && array[ss[i]] > max)
{
max = array[ss[i]]; // you intended to use the ASCII values of the characters in s to mark their place in array. In your code, you use i, which is just the index of the character in s, as opposed to the ASCII value of that character. Hence you need to use array[ss[i]].
index = ss[i];
}
}
return index;
Once you make the above changes you get the following output when run on your string:
Most freq character: t
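Combining the suggestions in these answers, a self-contained sketch of the function could look like the following. It assumes only the letters a through z should be counted (case-insensitively) and that ties go to the alphabetically first letter; neither rule is stated in the question.

```cpp
#include <cctype>
#include <string>

// Returns the most frequent letter of ss, ignoring case and non-letters.
char getMostFreqLetter(const std::string &ss) {
    int counts[26] = {0};
    for (char ch : ss) {
        unsigned char uc = static_cast<unsigned char>(ch);
        if (std::isalpha(uc)) {
            int idx = std::tolower(uc) - 'a';
            if (idx >= 0 && idx < 26)    // guard against non-ASCII letters
                counts[idx]++;
        }
    }
    int best = 0;
    for (int k = 1; k < 26; k++)
        if (counts[k] > counts[best])
            best = k;
    return static_cast<char>('a' + best);
}
```

Iterating over the string directly means no separate char array (or c_str() call) is needed at all.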

Code not returning the right index

I'm trying to solve a small problem. I have two strings, s1 and s2. I want my function to return the first index of s1 that has a character not present in the string s2. This is my code.
int cad_nenhum_dos (char s1[], char s2[]){
int i,j;
for (i=0;s1[i]!='\0';i++)
{
for (j=0;s2[j]!='\0';j++)
if (s1[i]!=s2[j]) return i;
}
return -1;
}
If I run s1="hello" s2="hellm", the result should be index 4, because s1[4]='o' and "o" is not present in s2... But I always get 0 when I run this. The -1 works fine if the strings are the same.
What am I doing wrong?
Regards
In your inner loop you need to break out when you find a character the same -- as it stands you're returning when there are any different characters in the second string, even if an earlier one was the same. You want something like
for (j=0;s2[j]!='\0';j++)
if (s1[i]==s2[j]) break;
if (s2[j]==0)
return i;
I.e. you want to return the index i in the first string when you've made your way through the whole of the second string without having found that character.
For programming exercises at the introductory level it's a good idea to carefully execute the code manually (step through yourself and see what's happening).
As TooTone suggested, you need to break out of the loop when you find a match:
for (int i = 0; s1[i] != '\0'; i++)
{
bool charFound = false;
for (int j = 0; s2[j] != '\0'; j++)
{
if (s1[i] == s2[j])
{
charFound = true;
break;
}
}
if ( ! charFound)
return i;
}
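Because the inner for-loop compares the first letter of the first string against all the letters in the second string and returns at the first mismatch. If the two strings are meant to be compared position by position, a single loop is enough:
int cad_nenhum_dos (char s1[], char s2[])
{
int i;
for(i=0; s1[i]; i++)
{
if(s1[i] != s2[i])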
return(i);
}
return(-1);
}