How to count whitespace occurrences in a string in C++ - c++

I have a project for my advanced c++ class that's supposed to do a number of things, but I'm trying to focus on this function first, because after it works I can tweak it to fulfill the other needs. This function searches through a file and performs a word count by counting the number of times ' ' appears in the document. Maybe not accurate, but it'll be a good starting place. Here's the code I have right now:
void WordCount()
{
int count_W = 0; //Variable to store word count, will be written to label
int i, c = 0; //i for iterator
ifstream fPath("F:\Project_1_Text.txt");
FileStream input( "F:\Project_1_Text.txt", FileMode::Open, FileAccess::Read );
StreamReader fileReader( %input );
String ^ line;
//char ws = ' ';
array<Char>^ temp;
input.Seek( 0, SeekOrigin::Begin );
while ( ( line = fileReader.ReadLine() ) != nullptr )
{
Console::WriteLine( line );
c = line->Length;
//temp = line->ToCharArray();
for ( i = 0; i <= c; i++)
{
if ( line[i] == ' ' )
count_W++;
}
//line->ToString();
}
//Code to write to label
lblWordCount->Text = count_W.ToString();
}
All of this works except for one problem. When I try to run the program, and open the file, I get an error that tells me the Index is out of bounds. Now, I know what that means, but I don't get how the problem is occurring. And, if I don't know what's causing the problem, I can't fix it. I've read that it is possible to search through a string with a for loop, and of course that also holds true for a char array, and there is code in there to perform that conversion, but in both cases I get the same error. I know it is reading through the file correctly, because the final program also has to perform a character count (which is working), and it read back the size of each line in the target document perfectly from start to finish. Anyway, I'm out of ideas, so I thought I'd consult a higher power. Any ideas?

Counting whitespace is simple:
int spaces = std::count_if(s.begin(), s.end(),
[](unsigned char c){ return std::isspace(c); });
Two notes, though:
std::isspace() cannot be called directly on a char: char may be signed, and std::isspace() takes an int whose value must be representable as unsigned char (or equal EOF). Passing a negative value is undefined behavior.
This counts the number of whitespace characters, not the number of words (or words - 1): words may be separated by more than one consecutive space.
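To make the difference between the two notes concrete, here is a minimal sketch that counts whitespace characters and words separately. The function names count_whitespace and count_words are made up for this illustration, and "word" is assumed to mean a maximal run of non-whitespace characters.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Number of whitespace characters in s. The lambda takes unsigned char so
// that std::isspace never receives a negative value.
std::size_t count_whitespace(const std::string& s) {
    return static_cast<std::size_t>(std::count_if(
        s.begin(), s.end(),
        [](unsigned char c) { return std::isspace(c) != 0; }));
}

// Number of whitespace-separated words in s. operator>> skips runs of
// whitespace, so consecutive spaces do not inflate the count.
std::size_t count_words(const std::string& s) {
    std::istringstream in(s);
    std::string word;
    std::size_t n = 0;
    while (in >> word)
        ++n;
    return n;
}
```

For a string such as "a  b c" (two spaces after the a), both functions return 3, which shows that the whitespace count is not simply words - 1.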

It could be your loop. You're going from i=0 to i=c, but i=c is one past the last valid index (the last character is at c-1). You should stop at i=c-1:
for ( i=0; i<c; i++)
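The question's code is C++/CLI, but the same off-by-one rule applies to std::string in standard C++. A small sketch, assuming a hypothetical helper named count_spaces that handles one line at a time:

```cpp
#include <cstddef>
#include <string>

// Counts ' ' characters in one line. Valid indices run from 0 to
// line.size() - 1, so the loop condition must be i < size, not i <= size.
int count_spaces(const std::string& line) {
    int count = 0;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i] == ' ')
            ++count;
    }
    return count;
}
```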

Related

C++: How to read multiple lines from file until a certain character, store them in array and move on to the next part of the file

I'm doing a hangman project for school, and one of the requirements is it needs to read the pictures of the hanging man from a text file. I have set up a text file with the '-' char which means the end of one picture and start of the next one.
I have this for loop set up to read the file until the delimiting character and store it in an array, but when testing I am getting incomplete pictures, cut off in certain places.
This is the code:
string s;
ifstream scenarijos("scenariji.txt");
for(int i = 0; i < 10; i++ ) {
getline(scenarijos, s, '-');
scenariji[i] = s;
}
For the record, scenariji is an array with type of string
And here is an example of the text file:
[Image: example of the text file]
From your example input, it looks like '-' can be part of the input image (look at the "arms" of the hanged man). Unless you use some other, unique character to delimit the images, you won't be able to separate them.
If you know the dimensions of the images, you could read them without searching for the delimiter by reading a certain amount of bytes from the input file. Alternatively, you could define some more complex rules for image termination, e.g. when the '-' character is the only character in the line. For example:
ifstream scenarijos("scenariji.txt");
string scenariji[10];
for (int i = 0; i < 10; ++i) {
    string& scenarij = scenariji[i];
    string s;
    // read lines until EOF or until a line that consists solely of "-";
    // testing getline itself keeps a final line that has no trailing newline
    while (getline(scenarijos, s) && s != "-") {
        scenarij += s;
        scenarij.push_back('\n'); // the trailing newline was removed by getline
    }
}

C++: Creating a word "letter after letter" in string data type

I'm learning C++ for my exam and one thing is bugging me.
I had a file with 25 words (let's call it "new.txt") and a file with 1000 words ("words.txt").
I had to check how many times a word from new.txt appears in words.txt. And after this I had to check how many times does a "mirror" of a word for new.txt appears in words.txt (mirror meaning the word from right to left => car = rac..)
My idea was to make three arrays: newword[25], words[1000], mirror[25] and then go one from there.
I know how to do this with "char" data type. But i wanted to try doing it with "string" type.
Here is the code:
string mirrors(string word) //function that writes the word from the back
{
int dl=word.length();
string mir;
for (int q=0;q<dl;q++)
{
mir[q]=word[dl-q-1]; //first letter of a new word is a last letter of the original word
}
return mir;
}
int main()
{
ifstream in1 ("words.txt");
ifstream in2 ("new.txt");
string words[1000], newword[25], mirror[25]; //creating arrays
for (int x=0;x<1000;x++) //filling the array with words from words.txt
{
in1>>words[x];
}
for (int y=0;y<25;y++) //filling the array with words from new.txt
{
in2>>newword[y];
}
in1.close();
in2.close();
for (int z=0;z<25;z++)
{
mirror[z]=mirrors(newword[z]);
}
out.close();
return 0;
}
And here is the problem...
When I'm changing the order of the letters, the string from "mirror" does not print using a normal cout << statement.
So my question is... Is there something with string data types that makes it impossible to print using one command after creating a word letter after letter, or is there something I have no clue about?
Because the word is there, it is created in this array. But cout << does not print it.
I'm sorry if the question is not clear but it's my first time posting here...
string mirrors(string word) {
int dl = word.length();
string mir; // this declares the string, but it is empty (size 0)
for (int q = 0; q < dl; q++) {
// mir[q] refers to characters [0, 1 ... dl-1] of an empty string, which is undefined behavior
// mir[q]=word[dl-q-1]; //first letter of a new word is a last letter of the original word
mir = mir + word[dl - q - 1]; // append the character instead of indexing
}
return mir;
}
Using new as a variable name is not a good idea (it is a reserved keyword in C++), as @Johny Mop pointed out.
If need be, you can also ask your question in Polish in a comment :).
First of all, you try to access characters in an empty std::string, which leads to UB. In practice all of that is unnecessary:
std::string mirrors( const std::string &word) //function that writes the word from the back
{
return std::string( word.rbegin(), word.rend() );
}
is enough. As for your program, it would be much better to read the contents of "new.txt" into memory; std::set or std::unordered_set would be much better for lookup. Then create two std::map<std::string,int> instances (or std::unordered_map if you do not care about order), read "words.txt" word by word, and update those maps accordingly:
std::unordered_set<std::string> new_words; // this should be populated from file "new.txt"
std::map<std::string,int> counts, mcounts;
// reading loop of file "words.txt"
std::string word = ...;
counts[ word ] += new_words.count( word );
word = mirrors( word );
mcounts[ word ] += new_words.count( word );
then you would have all your counts.

How can I reach the second word in a string?

I'm new here and this is my first question, so don't be too harsh :]
I'm trying to reverse a sentence, i.e. every word separately.
The problem is that I just can't reach the second word, or even reach the ending of a 1-word sentence. What is wrong?
char* reverse(char* string)
{
int i = 0;
char str[80];
while (*string)
str[i++] = *string++;
str[i] = '\0'; //null char in the end
char temp;
int wStart = 0, wEnd = 0, ending = 0; //wordStart, wordEnd, sentence ending
while (str[ending]) /*####This part just won't stop####*/
{
//skip spaces
while (str[wStart] == ' ')
wStart++; //wStart - word start
//for each word
wEnd = wStart;
while (str[wEnd] != ' ' && str[wEnd])
wEnd++; //wEnd - word ending
ending = wEnd; //for sentence ending
for (int k = 0; k < (wStart + wEnd) / 2; k++) //reverse
{
temp = str[wStart];
str[wStart++] = str[wEnd];
str[wEnd--] = temp;
}
}
return str;
}
Your code is somewhat unidiomatic for C++ in that it doesn't actually make use of a lot of common and convenient C++ facilities. In your case, you could benefit from
std::string which takes care of maintaining a buffer big enough to accommodate your string data.
std::istringstream which can easily split a string on whitespace for you.
std::reverse which can reverse a sequence of items.
Here's an alternative version which uses these facilities:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
std::string reverse( const std::string &s )
{
// Split the string on spaces by iterating over the stream
// elements and inserting them into the 'words' vector.
std::vector<std::string> words;
std::istringstream stream( s );
std::copy(
std::istream_iterator<std::string>( stream ),
std::istream_iterator<std::string>(),
std::back_inserter( words )
);
// Reverse the words in the vector.
std::reverse( words.begin(), words.end() );
// Join the words again (inserting one space between two words)
std::ostringstream result;
std::copy(
words.begin(),
words.end(),
std::ostream_iterator<std::string>( result, " " )
);
return result.str();
}
At the end of the first word, after it's traversed, str[wEnd] is a space and
you remember this index when you assign ending = wEnd.
Immediately, you reverse the characters in the word. At that point,
str[ending] is not a space because you included that space in the
letter-reversal of the word.
Depending on whether there are extra
spaces between words in the rest of the input, execution varies from this point, but it does eventually end with
you reversing a word that ended at the null terminator on the string
because you end the loop that increments wEnd on that null terminator and
include it in the final word reversal.
The very next iteration walks off of
the initialized part of the input string and the execution is undetermined from there because, heck, who knows what's in that array (str is stack-allocated, so it's whatever's sitting around in the memory occupied by the stack at that point).
On top of all of that, you don't update wStart except in the reversal loop,
and it never moves to wEnd all the way (see the loop exit condition), so come to think of it, you're never getting past that first word. Assuming that was fixed, you'd still have the problem I outlined at first.
All this assumes that you didn't just send this function something longer than 80 characters and break it that way.
Oh, and as mentioned in one of the comments on the question, you're returning stack-allocated local storage, which isn't going to go anywhere good either.
Hoo, boy. Where to start?
In C++, use std::string instead of char* if you can.
char[80] is an overflow risk if string is input by a user; it should be dynamically allocated. Preferably by using std::string; otherwise use new / new[]. If you meant to use C, then malloc.
cmbasnett also pointed out that you can't actually return str (and get the expected results) if you declare / allocate it the way you did. Traditionally, you'd pass in a char* destination and not allocate anything in the function at all.
Set ending to wEnd + 1; wEnd points to the last non-null character of the string in question (eventually, if it works right), so in order for str[ending] to break out of the loop, you have to increment once to get to the null char. Disregard that, I misread the while loop.
It looks like you need to use ((wEnd - wStart) + 1), not (wStart + wEnd). Although you should really use something like while(wEnd > wStart) instead of a for loop in this context.
You also should be setting wStart = ending; or something before you leave the loop, because otherwise it's going to get stuck on the first word.
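Putting these points together (std::string instead of a fixed char buffer, a start/end pair per word, and moving the start past each word before the next iteration), a minimal sketch could look like this. The name reverse_each_word is made up here, and words are assumed to be separated by plain spaces only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Reverses the letters of every space-separated word, keeping the words in
// their original order and leaving the spaces where they were.
std::string reverse_each_word(std::string s) {
    std::size_t start = 0;
    while (start < s.size()) {
        // skip spaces before the next word
        while (start < s.size() && s[start] == ' ')
            ++start;
        // find one past the last letter of the word
        std::size_t end = start;
        while (end < s.size() && s[end] != ' ')
            ++end;
        std::reverse(s.begin() + start, s.begin() + end);
        start = end;  // continue after this word
    }
    return s;
}
```

Because the string is taken and returned by value, there is no local array to return a dangling pointer to, and no 80-character limit.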

Simple Sentence Reverser in C++

I'm trying to build a program to solve a problem in a text book I bought recently and it's just driving me crazy.
I have to build a sentence reverser so I get the following:
Input = "Do or do not, there is no try."
Output = "try. no is there not, do or Do"
Here's what I've got so far:
void ReverseString::reversalOperation(char str[]) {
char* buffer;
int stringReadPos, wordReadPos, writePos = 0;
// Position of the last character is length -1
stringReadPos = strlen(str) - 1;
buffer = new char[stringReadPos+1];
while (stringReadPos >= 0) {
if (str[stringReadPos] == ' ') {
wordReadPos = stringReadPos + 1;
buffer[writePos++] = str[stringReadPos--];
while (str[wordReadPos] != ' ') {
buffer[writePos] = str[wordReadPos];
writePos++;
wordReadPos++;
}
} else {
stringReadPos--;
}
}
cout << str << endl;
cout << buffer << endl;
}
I was sure I was on the right track but all I get for an output is the very first word ("try."). I've been staring at this code so long I can't make any headway. Initially I was checking in the inner while loop for a '/0' character as well but it didn't seem to like that so I took it out.
Unless you're feeling masochistic, throw your existing code away, and start with std::vector and std::string (preferably an std::vector<std::string>). Add in std::copy with the vector's rbegin and rend, and you're pretty much done.
This is utterly easy in C++, with help from the standard library:
std::vector< std::string > sentence;
std::istringstream input( str );
// copy each word from input to sentence
std::copy(
(std::istream_iterator< std::string >( input )), std::istream_iterator< std::string >()
, std::back_inserter( sentence )
);
// print to cout sentence in reverse order, separated by space
std::copy(
sentence.rbegin(), sentence.rend()
, (std::ostream_iterator< std::string >( std::cout, " " ))
);
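For reference, the same idea can be wrapped in a function that returns the result instead of printing it and that does not leave a trailing space. This is a sketch under the assumption that words are separated by whitespace; reverse_words is a hypothetical name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the words of `sentence` in reverse order, joined by single spaces
// and without a trailing space.
std::string reverse_words(const std::string& sentence) {
    std::istringstream input(sentence);
    std::vector<std::string> words;
    std::string word;
    while (input >> word)
        words.push_back(word);

    std::string result;
    // walk the vector from the last word to the first
    for (std::size_t i = words.size(); i-- > 0;) {
        result += words[i];
        if (i != 0)
            result += ' ';
    }
    return result;
}
```

Called with "Do or do not, there is no try." it returns "try. no is there not, do or Do".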
In the interest of science, I tried to make your code work as is. Yeah, it's not really the C++ way to do things, but instructive nonetheless.
Of course this is only one of a million ways to get the job done. I'll leave it as an exercise for you to remove the trailing space this code leaves in the output ;)
I commented my changes with "EDIT".
char* buffer;
int stringReadPos, wordReadPos, writePos = 0;
// Position of the last character is length -1
stringReadPos = strlen(str) - 1;
buffer = new char[stringReadPos+3]; // EDIT: strlen(str) + 2, room for the extra trailing space and the terminating nul
while (stringReadPos >= 0) {
if ((str[stringReadPos] == ' ')
|| (stringReadPos == 0)) // EDIT: Need to check for hitting the beginning of the string
{
wordReadPos = stringReadPos + (stringReadPos ? 1 : 0); // EDIT: In the case we hit the beginning of the string, don't skip past the space
//buffer[writePos++] = str[stringReadPos--]; // EDIT: This is just to grab the space - don't need it here
while ((str[wordReadPos] != ' ')
&& (str[wordReadPos] != '\0')) // EDIT: Need to check for hitting the end of the string
{
buffer[writePos] = str[wordReadPos];
writePos++;
wordReadPos++;
}
buffer[writePos++] = ' '; // EDIT: Add a space after words
}
stringReadPos--; // EDIT: Decrement the read pos every time
}
buffer[writePos] = '\0'; // EDIT: nul-terminate the string
cout << str << endl;
cout << buffer << endl;
I see the following errors in your code:
the last char of buffer is not set to 0 (this will cause a failure in cout << buffer)
in the inner loop you have to check for str[wordReadPos] != ' ' && str[wordReadPos] != 0 otherwise while scanning the first word it will never find the terminating space
Since you are using a char array, you can use C string library. It will be much easier if you use strtok: http://www.cplusplus.com/reference/clibrary/cstring/strtok/
It will require pointer use, but it will make your life much easier. Your delimiter will be " ".
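A small sketch of the strtok approach, for illustration only. The helper name split_on_spaces is invented, and because strtok overwrites the delimiters in the buffer it scans, the sketch tokenizes a private copy rather than the caller's array.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits a C string on spaces with std::strtok. strtok writes '\0' over the
// delimiters, so the function works on a private copy of the input.
std::vector<std::string> split_on_spaces(const char* str) {
    std::vector<char> copy(str, str + std::strlen(str) + 1);  // include the nul
    std::vector<std::string> words;
    for (char* tok = std::strtok(copy.data(), " "); tok != nullptr;
         tok = std::strtok(nullptr, " ")) {
        words.push_back(tok);
    }
    return words;
}
```

Once the words are in a vector, printing them from the last element to the first gives the reversed sentence.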
What the problems with your code were, and what the more C++-ish ways of doing it are, has already been well covered. I would, however, like to add that the methodology
write a function/program to implement algorithm;
see if it works;
if it doesn't, look at code until you get where the problem is
is not too productive. What can help you resolve this problem, and many others in the future, is a debugger (and the poor man's debugger, printf). It lets you step through your program and see how it actually works, what happens to the data, and so on. In other words, you will be able to see which parts work as you expect and which behave differently. If you're on *nix, don't hesitate to try gdb.
Here is a more C++ version. Though I think the simplicity is more important than style in this instance. The basic algorithm is simple enough, reverse the words then reverse the whole string.
You could write C code that was just as evident as the C++ version. I don't think it's necessarily wrong to write code that isn't ostentatiously C++ here.
void word_reverse(std::string &val) {
size_t b = 0;
for (size_t i = 0; i < val.size(); i++) {
if (val[i] == ' ') {
std::reverse(&val[b], &val[b]+(i - b));
b = i + 1; // the next word starts right after the space
}
}
std::reverse(&val[b], &val[b]+(val.size() - b));
std::reverse(&val[0], &val[0]+val.size());
}
TEST(basic) {
std::string o = "Do or do not, there is no try.";
std::string e = "try. no is there not, do or Do";
std::string a = o;
word_reverse(a);
CHECK_EQUAL( e , a );
}
Multiple, leading, or trailing spaces may be degenerate cases, depending on how you actually want them to behave.

Cleaning a string of punctuation in C++

Ok so before I even ask my question I want to make one thing clear. I am currently a student at NIU for Computer Science and this does relate to one of my assignments for a class there. So if anyone has a problem read no further and just go on about your business.
Now for anyone who is willing to help, here's the situation. For my current assignment we have to read a file that is just a block of text. For each word in the file we are to clear any punctuation in the word (ex: "can't" would end up as "can" and "that--to" would end up as "that", obviously without the quotes; quotes were used just to specify what the example was).
The problem I've run into is that I can clean the string fine and then insert it into the map that we are using but for some reason with the code I have written it is allowing an empty string to be inserted into the map. Now I've tried everything that I can come up with to stop this from happening and the only thing I've come up with is to use the erase method within the map structure itself.
So what I am looking for is two things: any suggestions about how I could a) fix this without simply erasing it, and b) any improvements that I could make to the code I have already written.
Here are the functions I have written to read in from the file and then the one that cleans it.
Note: the function that reads in from the file calls the clean_entry function to get rid of punctuation before anything is inserted into the map.
Edit: Thank you Chris. Numbers are allowed :). If anyone has any improvements to the code I've written or any criticisms of something I did I'll listen. At school we really don't get feed back on the correct, proper, or most efficient way to do things.
int get_words(map<string, int>& mapz)
{
int cnt = 0; //set out counter to zero
map<string, int>::const_iterator mapzIter;
ifstream input; //declare instream
input.open( "prog2.d" ); //open instream
assert( input ); //assure it is open
string s; //temp strings to read into
string not_s;
input >> s;
while(!input.eof()) //read in until EOF
{
not_s = "";
clean_entry(s, not_s);
if((int)not_s.length() == 0)
{
input >> s;
clean_entry(s, not_s);
}
mapz[not_s]++; //increment occurence
input >>s;
}
input.close(); //close instream
for(mapzIter = mapz.begin(); mapzIter != mapz.end(); mapzIter++)
cnt = cnt + mapzIter->second;
return cnt; //return number of words in instream
}
void clean_entry(const string& non_clean, string& clean)
{
int i, j, begin, end;
for(i = 0; isalnum(non_clean[i]) == 0 && non_clean[i] != '\0'; i++);
begin = i;
if(begin ==(int)non_clean.length())
return;
for(j = begin; isalnum(non_clean[j]) != 0 && non_clean[j] != '\0'; j++);
end = j;
clean = non_clean.substr(begin, (end-begin));
for(i = 0; i < (int)clean.size(); i++)
clean[i] = tolower(clean[i]);
}
The problem with empty entries is in your while loop. If you get an empty string, you clean the next one, and add it without checking. Try changing:
not_s = "";
clean_entry(s, not_s);
if((int)not_s.length() == 0)
{
input >> s;
clean_entry(s, not_s);
}
mapz[not_s]++; //increment occurence
input >>s;
to
not_s = "";
clean_entry(s, not_s);
if((int)not_s.length() > 0)
{
mapz[not_s]++; //increment occurence
}
input >>s;
EDIT: I notice you are checking if the characters are alphanumeric. If numbers are not allowed, you may need to revisit that area as well.
Further improvements would be to
declare variables only when you use them, and in the innermost scope
use c++-style casts instead of the c-style (int) casts
use empty() instead of length() == 0 comparisons
use the prefix increment operator for the iterators (i.e. ++mapzIter)
A blank string is a valid instance of the string class, so there's nothing special about adding it into the map. What you could do is first check whether it's empty, and only increment if it is not:
if (!not_s.empty())
mapz[not_s]++;
Style-wise, there's a few things I'd change, one would be to return clean from clean_entry instead of modifying it:
string not_s = clean_entry(s);
...
string clean_entry(const string &non_clean)
{
string clean;
... // as before
if(begin ==(int)non_clean.length())
return clean;
... // as before
return clean;
}
This makes it clearer what the function is doing (taking a string, and returning something based on that string).
The function 'get_words' is doing a lot of distinct actions that could be split out into other functions. There's a good chance that by splitting it up into its individual parts, you would have found the bug yourself.
From the basic structure, I think you could split the code into (at least):
getNextWord: Return the next (non blank) word from the stream (returns false if none left)
clean_entry: What you have now
getNextCleanWord: Calls getNextWord, and if 'true' calls clean_entry. Returns 'false' if no words left.
The signatures of 'getNextWord' and 'getNextCleanWord' might look something like:
bool getNextWord (std::ifstream & input, std::string & str);
bool getNextCleanWord (std::ifstream & input, std::string & str);
The idea is that each function does a smaller more distinct part of the problem. For example, 'getNextWord' does nothing but get the next non blank word (if there is one). This smaller piece therefore becomes an easier part of the problem to solve and debug if necessary.
The main component of 'get_words' then can be simplified down to:
std::string nextCleanWord;
while (getNextCleanWord (input, nextCleanWord))
{
++mapz[nextCleanWord];
}
An important aspect to development, IMHO, is to try to Divide and Conquer the problem. Split it up into the individual tasks that need to take place. These sub-tasks will be easier to complete and should also be easier to maintain.