I have a bit of an issue with my code. It should strip all non-alphanumeric characters except . , : and ; and then put every sentence ending in a dot on its own line. So, something like:
First, some characters, legal and illegal: )(=8skf-=&. This should be on a separate line. This too.
would become:
First, some characters, legal and illegal: 8skf.
This should be on a separate line.
This too.
Now, the first part of the code, which strips non-alphanumerics, works perfectly. The splitting part works only up to a point. In my code, the above line actually becomes:
First, some characters, legal and illegal: 8skf.
This should be on a separate line. This too.
I suspect this happens because the rest of the text becomes a new line, and my code cannot read it again while it is being split off. The code is:
int writeFinalv(string path) {
readTempFiles(path.c_str());
string line;
string nline;
int start;
int lnth;
ifstream temp("temp.txt");
ofstream final;
int length;
final.open(path.c_str(), ios::out | ios::trunc);
if(temp.is_open()) {
while(getline(temp, line)) {
length = line.length();
for(int i = 0; i < length; i++) {
if(line[i] == '.') {
if(line[i+1] == ' ') {
nline = line.substr(0, (i+2));
}
else {
nline = line.substr(0, (i+1));
}
final << nline << "\n";
start = line.find(nline);
lnth = nline.length();
line.erase(start, lnth);
}
}
}
}
else {
error = true;
}
return 0;
}
My code works by first calling the function which reads the initial file, strips it of illegal characters, and writes to a temporary file. Then it reads the temporary file, finds the dots, and writes the new lines to the initial file after truncating it.
Thanks in advance.
By erasing part of the line string inside the for loop, the loop indexing is invalidated. Both i and length no longer hold values that you can reliably keep using to continue the for loop.
You don't actually need to erase from the string though. You can keep track of the current start position, and use that as the first parameter to the substr calls.
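A minimal sketch of that idea, assuming a helper of my own naming (splitAtDots is not from the question): it keeps a start position, never modifies the input, and returns the pieces instead of writing them to a file. Unlike the original, it drops the space after each dot rather than keeping it at the end of the piece.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `line` after every '.', skipping a single
// space that follows the dot. Text after the last dot is kept as a final piece.
std::vector<std::string> splitAtDots(const std::string& line) {
    std::vector<std::string> pieces;
    std::string::size_type start = 0;  // first character of the current sentence
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        if (line[i] == '.') {
            // Take everything from `start` up to and including the dot.
            pieces.push_back(line.substr(start, i - start + 1));
            start = i + 1;
            // Skip one space after the dot so the next piece starts cleanly.
            if (start < line.size() && line[start] == ' ') {
                ++start;
            }
        }
    }
    // Keep trailing text that does not end in a dot.
    if (start < line.size()) {
        pieces.push_back(line.substr(start));
    }
    return pieces;
}
```

In writeFinalv, each returned piece could then be written with final << piece << "\n"; inside the getline loop.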
Here is code that counts how many times a string entered by the user occurs in the file temp.txt. If, for example, we want love to be counted, then matches like love, lovely, and beloved should all be counted. We also want to count the total number of words in temp.txt.
I am doing a line by line reading here, not word by word.
Why does the debugging stop at totalwords += counting(line)?
/*this code is not working to count the words*/
#include<iostream>
#include<fstream>
#include<string>
using namespace std;
int totalwords{0};
int counting(string line){
int wordcount{0};
if(line.empty()){
return 1;
}
if(line.find(" ")==string::npos){wordcount++;}
else{
while(line.find(" ")!=string::npos){
int index=0;
index = line.find(" ");
line.erase(0,index);
wordcount++;
}
}
return wordcount;
}
int main() {
ifstream in_file;
in_file.open("temp.txt");
if(!in_file){
cerr<<"PROBLEM OPENING THE FILE"<<endl;
}
string line{};
int counter{0};
string word {};
cout<<"ENTER THE WORD YOU WANT TO COUNT IN THE FILE: ";
cin>>word;
int n {0};
n = ( word.length() - 1 );
while(getline(in_file>>ws,line)){
totalwords += counting(line);
while(line.find(word)!=string::npos){
counter++;
int index{0};
index = line.find(word);
line.erase(0,(index+n));
}
}
cout<<endl;
cout<<counter<<endl;
cout<<totalwords;
return 0;
}
line.erase(0, index); doesn't erase the space, you need
line.erase(0, index + 1);
Your code reveals a few problems...
First of all, counting a single word for an empty line doesn't appear correct to me. Second, erasing from the string again and again is pretty inefficient: with every such operation, all of the subsequent characters are copied towards the front. If you really wanted to do that, you might rather search from the end of the string to avoid the copying. But you can actually do without ever modifying the string if you use the second parameter of std::string::find (which defaults to 0, so it has been transparent to you...):
int index = line.find(' ' /*, 0*/); // first call; 0 is default, thus implicit
index = line.find(' ', index + 1); // subsequent call
Note that using the character overload is more efficient if you search for a single character anyway. However, this variant doesn't consider other whitespace such as tabs.
Additionally, the variant posted in the question doesn't handle several consecutive whitespace characters! In your erasing variant (which erases one character too few, by the way) you would need to skip incrementing the word count if you find the space character at index 0.
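The same second parameter also works for substring searches, which covers the "count love in lovely and beloved" part of the question. A rough sketch, assuming a helper name of my own (countMatches) and assuming matches should not overlap:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts non-overlapping occurrences of `word` in `text`
// by moving the search start forward instead of erasing from the string.
std::size_t countMatches(const std::string& text, const std::string& word) {
    if (word.empty()) {
        return 0;  // an empty needle would match everywhere; treat it as no match
    }
    std::size_t count = 0;
    std::string::size_type pos = text.find(word);  // start position defaults to 0
    while (pos != std::string::npos) {
        ++count;
        // Continue searching right behind the match that was just found.
        pos = text.find(word, pos + word.size());
    }
    return count;
}
```

Inside the getline loop of main, counter += countMatches(line, word); would then replace the inner while loop.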
However, I'd go with a totally new approach, looking at each character separately; you need a stateful loop in that case, though, i.e. you need to remember whether you already are within a word or not. It might look like this:
size_t wordCount = 0; // note: prefer an unsigned type, negative values
// are meaningless anyway
// size_t is especially fine as it is guaranteed to be
// large enough to hold any number of characters the
// string might ever contain
bool inWord = false;
for(char c : line)
{
if(isspace(static_cast<unsigned char>(c)))
// you can check for *any* white space that way...
// note the cast to unsigned, which is necessary as isspace accepts
// an int and a bare char *might* be signed, thus result in negative
// values
{
// no word any more...
inWord = false;
}
else if(inWord)
{
// well, nothing to do, we already discovered a word earlier!
//
// as we actually don't do anything here you might just skip
// this block and check for the opposite: if(!inWord)
}
else
{
// OK, this is the start of a word!
// so now we need to count a new one!
++wordCount;
inWord = true;
}
}
Now you might want to break words at punctuation characters as well, so you might actually want to check for:
if(isspace(static_cast<unsigned char>(c)) || ispunct(static_cast<unsigned char>(c)))
A bit shorter is the following variant:
if(/* space or punctuation */)
{
inWord = false;
}
else
{
wordCount += !inWord; // adds 1 if a new word starts here, 0 otherwise
inWord = true;
}
Finally: All code is written freely, thus unchecked – if you find a bug, please fix yourself...
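Put together as a complete function, the stateful loop might look like the sketch below. The name countWords is my own, and treating punctuation as a separator is an assumption (it means that don't counts as two words).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts runs of characters that are neither
// whitespace nor punctuation.
std::size_t countWords(const std::string& line) {
    std::size_t wordCount = 0;
    bool inWord = false;  // are we currently inside a word?
    for (char c : line) {
        // Cast first: the <cctype> functions need a non-negative value.
        const unsigned char uc = static_cast<unsigned char>(c);
        if (std::isspace(uc) || std::ispunct(uc)) {
            inWord = false;  // separator: any current word ends here
        } else {
            wordCount += !inWord;  // count only the first character of a word
            inWord = true;
        }
    }
    return wordCount;
}
```

With this, an empty line yields 0 and several consecutive spaces do not produce extra words.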
debugging getting stopped abruptly
Does debugging indeed stop at the indicated line? I observed instead that the program hangs within the while loop in counting. You can make this visible by inserting an indicator output (marked by HERE in the following code):
int counting(string line){
int wordcount{0};
if(line.empty()){
return 1;
}
if(line.find(" ")==string::npos){wordcount++;}
else{
while(line.find(" ")!=string::npos){
int index=0;
index = line.find(" ");
line.erase(0,index);
cout << '.'; // <--- HERE: indicator output
wordcount++;
}
}
return wordcount;
}
As Jarod42 pointed out, the erase call you are using misses the space itself. That's why you are finding spaces and “counting words” forever.
There is also an obvious misconception about words and separators of words visible in your code:
empty lines don't contain words
consecutive spaces don't indicate words
words may be separated by non-spaces (parentheses for example)
Finally, as already mentioned: if the problem is about counting total words, it's not necessary to discuss the other parts. And after the test above (see HERE), it also appears to be independent of file input. So your code could be reduced to something like this:
#include <iostream>
#include <string>
int counting(std::string line) {
int wordcount = 0;
if (line.empty()) {
return 1;
}
if (line.find(" ") == std::string::npos) {
wordcount++;
} else {
while (line.find(" ") != std::string::npos) {
int index = 0;
index = line.find(" ");
line.erase(0, index);
wordcount++;
}
}
return wordcount;
}
int main() {
int totalwords = counting("bla bla");
std::cout << totalwords;
return 0;
}
And in this form, it's much easier to see if it works. We expect to see a 2 as output. To get there, it's possible to try correcting your erase call, but the result would then still be wrong (1) since you are actually counting spaces. So it's better to take the time and carefully read Aconcagua's insightful answer.
This may be a little bit redundant, but is there a short/compact method of reading in a string until a tab is reached in C++? Similar to other questions, but I want to keep reading even if I hit a space. For example if the STDIN is
Cute Kitty is fabulous as always
(with a tab after Kitty and another after is), then I want to read in three strings: Cute Kitty; is; fabulous as always.
I've seen people do this with regex in files, but how would you do this on the stdin in C++? I want to put it in a string class and whenever I try something like
scanf("%s\t", &mystring);
It throws up an error because I'm not using an array of chars.
Thanks, please keep answers easy enough for a noob to understand.
This code seems to work for me. It basically gets the line that was entered from the user via stdin and then reads each character waiting for a tab character (\t), or the end of the line.
#include <iostream>
#include <string>
int main()
{
std::string a;
std::getline(std::cin,a);
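std::string::size_type index_holder = 0; // start of the current field
// run one position past the end so the last field is printed in full
for(std::string::size_type i = 0; i <= a.size(); ++i)
{
if(i == a.size() || a[i] == '\t') {
std::cout << a.substr(index_holder, i - index_holder) << std::endl;
index_holder = i + 1;
}
}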
}
return 0;
}
Have a look at strtok:
char * strtok ( char * str, const char * delimiters );
Split string into tokens. A sequence of calls to this function split str into tokens, which are sequences of contiguous characters separated by any of the characters that are part of delimiters.
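If you would rather stay with std::string, std::getline also accepts a delimiter as a third argument, so a tab can end each read instead of a newline. A small sketch, assuming a helper name of my own (readTabFields) and that only the first line of input matters:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one line from `in` and splits it at tabs.
// Spaces inside a field are kept.
std::vector<std::string> readTabFields(std::istream& in) {
    std::vector<std::string> fields;
    std::string line;
    if (!std::getline(in, line)) {
        return fields;  // nothing to read
    }
    std::istringstream lineStream(line);
    std::string field;
    // The third argument makes getline stop at '\t' instead of '\n'.
    while (std::getline(lineStream, field, '\t')) {
        fields.push_back(field);
    }
    return fields;
}
```

Calling readTabFields(std::cin) would read from the standard input; no char arrays or scanf are involved.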
I'm trying to create a function that mimics the behavior of the getline() function, with the option to use a delimiter to split the string into tokens.
The function accepts 2 strings (the second is passed by reference) and a char for the delimiter. It loops through each character of the first string, copying it to the second string, and stops looping when it reaches the delimiter. It returns true if the first string has more characters after the delimiter and false otherwise. The position of the last character is saved in a static variable.
For some reason the program goes into an infinite loop and does not execute anything:
const int LINE_SIZE = 160;
bool strSplit(string sFirst, string & sLast, char cDelim) {
static int iCount = 0;
for(int i = iCount; i < LINE_SIZE; i++) {
if(sFirst[i] != cDelim)
sLast[i-iCount] = sFirst[i];
else {
iCount = i+1;
return true;
}
}
return false;
}
The function is used in the following way:
while(strSplit(sLine, sToken, '|')) {
cout << sToken << endl;
}
Why is it going into an infinite loop, and why is it not working?
I should add that I'm interested in a solution without using istringstream, if that's possible.
It is not exactly what you asked for, but have you considered std::istringstream and std::getline?
// UNTESTED
std::istringstream iss(sLine);
while(std::getline(iss, sToken, '|')) {
std::cout << sToken << "\n";
}
EDIT:
Why is it going into an infinite loop, and why is it not working?
We can't know, you didn't provide enough information. Try to create an SSCCE and post that.
I can tell you that the following line is very suspicious:
sLast[i-iCount] = sFirst[i];
This line will result in undefined behavior (including, perhaps, what you have seen) in any of the following conditions:
i >= sFirst.size()
i-iCount >= sLast.size()
i-iCount < 0
It appears likely to me that all of those conditions are true. If the passed-in string is, for example, shorter than 160 characters, or if iCount ever grows to be bigger than the offset of the first delimiter, then you'll get undefined behavior.
LINE_SIZE is probably larger than the number of characters in the string object, so the code runs off the end of the string's storage, and pretty much anything can happen.
Instead of rolling your own, string::find does what you need.
std::string::size_type pos = 0;
std::string::size_type new_pos = sFirst.find('|', pos);
The call to find finds the first occurrence of '|' that's at or after the position 'pos'. If it succeeds, it returns the index of the '|' that it found. If it fails, it returns std::string::npos. Use it in a loop, and after each match, copy the text from [pos, new_pos) into the target string, and update pos to new_pos + 1.
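A short sketch of that loop, assuming a function name of my own (splitOn) and that the tokens are collected in a vector rather than handed out one per call:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every `delim` using std::string::find.
// Empty tokens between adjacent delimiters are kept.
std::vector<std::string> splitOn(const std::string& text, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type pos = 0;
    std::string::size_type newPos = text.find(delim, pos);
    while (newPos != std::string::npos) {
        // Copy the half-open range [pos, newPos) into a token.
        tokens.push_back(text.substr(pos, newPos - pos));
        pos = newPos + 1;  // continue right behind the delimiter
        newPos = text.find(delim, pos);
    }
    // Whatever follows the last delimiter is the final token.
    tokens.push_back(text.substr(pos));
    return tokens;
}
```

Because no index ever exceeds the string's size, this avoids the out-of-range accesses that LINE_SIZE causes in the question, and it needs neither a static variable nor istringstream.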
Are you sure it's the strSplit() function that doesn't return, or is it the while loop in your caller that's infinite?
Shouldn't your caller loop be something like:
while(strSplit(sLine, sToken, '|')) {
cout << sToken << endl;
cin >> sLine;
}
-- edit --
If the value of sLine makes strSplit() return true, then the while loop becomes infinite, so do something to change the value of sLine in each iteration of the loop, e.g. put in a cin.
Check this out
// requires <algorithm>, <cstring>, <string>, <vector>
std::vector<std::string> spliString(const std::string &str,
const std::string &separator)
{
std::vector<std::string> ret;
std::string::size_type strLen = str.length();
// strtok modifies its input, so work on a writable, null-terminated copy
char *buff = new char[strLen + 1];
buff[strLen] = '\0';
std::copy(str.begin(), str.end(), buff);
char *pch = std::strtok(buff, separator.c_str());
while(pch != NULL)
{
ret.push_back(std::string(pch));
pch = std::strtok(NULL, separator.c_str());
}
delete[] buff;
return ret;
}
I am currently trying to read a file, add an extra backslash (\) wherever it finds a backslash, and write the result to another file. The problem is that weird characters are printed inside path.txt. I suspect that the space characters in the file logdata are the root of this problem. I need advice on how to solve this.
Here is the code:
// read a file
char str[256];
fstream file_op("C:\\logdata",ios::in);
file_op >> str;
file_op.close();
// finds the slash, and add additional slash
char newPath[MAX_PATH];
int newCount = 0;
for(int i=0; i < strlen(str); i++)
{
if(str[i] == '\\')
{
newPath[newCount++] = str[i];
}
newPath[newCount++] = str[i];
}
// write it to a different file
ofstream out("c:\\path.txt", ios::out | ios::binary);
out.write(newPath, strlen(newPath));
out.close();
Every char string in C has to end with character \0. It is an indicator that the string ends right there.
Your newPath array, after iterating through your for-loop is not correctly ended. It probably ends somewhere later, where \0 appears by accident in memory.
Try doing the following right after exiting the for-loop:
newPath[newCount]=0;
A safer way of handling strings in C++ is to use the std::string class instead of plain char arrays.
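As a rough sketch of the std::string route (the function name doubleBackslashes is my own), the doubling step needs no terminator handling at all, because the string tracks its own length:

```cpp
#include <string>

// Hypothetical helper: returns a copy of `path` in which every
// backslash is doubled.
std::string doubleBackslashes(const std::string& path) {
    std::string result;
    result.reserve(path.size() * 2);  // worst case: every character is a backslash
    for (char c : path) {
        if (c == '\\') {
            result += '\\';  // emit the extra backslash first
        }
        result += c;
    }
    return result;
}
```

The result can be written with out << doubleBackslashes(str);. Note also that file_op >> str stops reading at the first whitespace character, so a path containing spaces would be cut short; reading it with std::getline into a std::string keeps the spaces.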
Try putting a string terminator in the buffer, after the loop :
newPath[newCount] = 0;
I'm writing a program for an exercise that will read data from a file and format it to be readable. So far, I have a bit of code that will separate a header from the data that goes under it. Here it is:
int main() {
ifstream in("records.txt");
ofstream out("formatted_records.txt");
vector<string> temp;
vector<string> headers;
for (int i = 0; getline(in,temp[i]); ++i) {
static int k = -1;
if (str_isalpha(temp[i])) {
headers[++k] = temp[i];
temp.erase(temp.begin() + i);
}
else {
temp[i] += "," + headers[k];
}
}
}
(str_isalpha() is just a function that applies isalpha() to every character in a string.) Now, the for-loop in this program doesn't execute, and I can't figure out why. Does anybody know?
EDIT: As suggested, I changed it to
string line;
for (int i = 0; getline(in,line); ++i) {
temp.push_back(line);
Still skips the for-loop altogether.
vector<string> temp; makes an empty vector. When you then try to read into temp[0], that is undefined behavior. You should pass as getline's second argument a separate string variable, say string foo; before the loop, then temp.push_back(foo); as the first instruction in the loop's body.
If the loop still doesn't run after ensuring that you're reading into a valid string reference, then you should check that the stream you're reading from is valid. The stream will be invalid if the file doesn't exist or if you lack permission to read it, for instance. When the stream isn't valid, getline won't read anything. Its return value is the same stream, and when converted to bool, it evaluates as false. Check the stream's status before proceeding.
ifstream in("records.txt");
if (!in.is_open()) {
std::cerr << "Uh-oh.\n";
return EXIT_FAILURE;
}
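For completeness, here is a sketch of the read loop with a separate line variable and push_back. The names strIsAlpha and tagWithHeaders are my own stand-ins (the question's str_isalpha() was not shown), and I am assuming the goal is to append the most recent header to every data line that follows it.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical stand-in for the question's str_isalpha():
// true if the string is non-empty and consists of letters only.
bool strIsAlpha(const std::string& s) {
    return !s.empty() &&
           std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isalpha(c) != 0; });
}

// Reads all lines from `in`; header lines are remembered, and every other
// non-empty line is stored with ",<current header>" appended.
std::vector<std::string> tagWithHeaders(std::istream& in) {
    std::vector<std::string> records;
    std::string header;  // most recent header seen so far
    std::string line;    // getline always reads into this valid string
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;  // skip blank lines
        }
        if (strIsAlpha(line)) {
            header = line;
        } else {
            records.push_back(line + "," + header);
        }
    }
    return records;
}
```

Passing the checked ifstream in to tagWithHeaders would process records.txt; no element of an empty vector is ever indexed.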