how to fix strtok compiler error? - c++

Hi, I am writing this code to look into a text file and put each word it finds into a C-string array. I was able to write the code, but I get problems when there is a mistake in the actual text file. For example, my program crashes if there is a double space in a sentence like "the car  goes fast": it stops at "car". Looking at my code, I believe this is because of strtok. I think that to fix the problem I need to make strtok produce a token for the next value, but I am not sure how to do so.
My code:
#include <iostream>
#include <fstream>
#include <string.h>
#include <stdlib.h>
using namespace std;
int main() {
ifstream file;
file.open("text.txt");
string line;
char * wordList[10000];
int x=0;
while (getline(file,line)){
// initialize a sentence
char *sentence = (char*) malloc(sizeof(char)*line.length());
strcpy(sentence,line.c_str());
// intialize a pointer
char* word;
// this gives us a pointer to the first instance of a space, comma, etc.,
// that is, the characters in "sentence" will be read into "word"
// until it reaches one of the token characters (space, comma, etc.)
word = strtok(sentence, " ,!;:.?");
// now we can utilize a while loop, so every time the sentence comes to a new
// token character, it stops, and "word" will equal the characters from the last
// token character to the new character, giving you each word in the sentence
while (NULL != word){
wordList[x]=word;
printf("%s\n", wordList[x]);
x++;
word = strtok(NULL," ,!;:.?");
}
}
printf("done");
return 0;
}
I know some of the code is in C++ and some is in C, but I am trying to do most of it in C.

The problem might be that you are not allocating enough space for a null terminated string.
char *sentence = (char*) malloc(sizeof(char)*line.length());
strcpy(sentence,line.c_str());
If you need to capture "abc", you need 3 elements for the characters and another one for the terminating null character, i.e. 4 characters total.
The value of the argument to malloc needs to be increased by 1.
char *sentence = (char*) malloc(line.length()+1);
strcpy(sentence,line.c_str());
It's not clear why you are using malloc in a C++ program. I suggest using new[] instead (and delete[] to free it).
char *sentence = new char[line.length()+1];
strcpy(sentence,line.c_str());
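As a rough sketch of the same idea without manual allocation: the line is copied into a std::vector<char> with one extra element for the terminating null, and each token is copied into a std::string so nothing dangles once the buffer goes away. The helper name splitWords is made up for illustration, and the delimiter set is the one from the question. Note that strtok already skips runs of consecutive delimiters, so a double space by itself does not produce an empty token.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): splits a line into words
// using the same delimiter set as the original code.
std::vector<std::string> splitWords(const std::string& line)
{
    // Writable copy of the line: length() characters plus one '\0'.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> words;
    const char* delimiters = " ,!;:.?";

    // strtok skips consecutive delimiters, so "car  goes" yields "car", "goes".
    for (char* word = std::strtok(buffer.data(), delimiters);
         word != nullptr;
         word = std::strtok(nullptr, delimiters)) {
        words.push_back(word);  // copy the token out of the temporary buffer
    }
    return words;
}
```

Calling splitWords(line) inside the getline loop and appending the result to one big vector would replace the fixed-size wordList array.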

Related

How to read a specific amount of characters

I can read the characters from a file with this code, which displays 2 characters at a time, each pair on a new line:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
char ch[3] = "";
ifstream file("example.txt");
while (file.read(ch, sizeof(ch)-1))
{
cout << ch << endl;
}
return 0;
}
My problem is that if the number of characters is odd, it doesn't display the last character in the text file!
My text file contains this:
abcdefg
It doesn't display the letter g in the console.
It's displaying this:
ab
cd
ef
I wanna display like this:
ab
cd
ef
g
I want to use this to read 1000 characters at a time from a large file, so I don't want to read character by character; that takes a lot of time. If you can fix the problem or have a better suggestion, please share it with me.
The following piece of code should work:
while (file) {
file.read(ch, sizeof(ch) - 1);
int number_read_chars = file.gcount();
// print chars here ...
}
By moving the read call into the loop, you'll be able to handle the last call, where too few characters are available. The gcount method tells you how many characters were actually read by the last unformatted input operation, e.g. read.
Please note, when reading less than sizeof(ch) chars, you manually have to insert a NUL character at the position returned by gcount, if you intend to use the buffer as a C string, as those are null terminated:
ch[file.gcount()] = '\0';
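For what it's worth, here is one way the whole loop could look, written against any std::istream so it can be tried without a file. The function name readChunks and its parameters are my own invention; the only stream calls used are read and gcount. The loop keeps going while either a full chunk was read or the final, short read still delivered some characters.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the stream in pieces of at most chunkSize
// characters and returns them, including a shorter final piece.
std::vector<std::string> readChunks(std::istream& in, std::size_t chunkSize)
{
    std::vector<std::string> chunks;
    if (chunkSize == 0)
        return chunks;  // nothing sensible to read

    std::vector<char> buffer(chunkSize);
    // read() fails on the last, short chunk, but gcount() still reports
    // how many characters it managed to store in the buffer.
    while (in.read(buffer.data(), static_cast<std::streamsize>(chunkSize))
           || in.gcount() > 0) {
        chunks.emplace_back(buffer.data(),
                            static_cast<std::size_t>(in.gcount()));
    }
    return chunks;
}
```

With an std::ifstream opened on the example file, readChunks(file, 2) would give the pieces to print one per line; building each piece as a std::string with an explicit length avoids having to null-terminate the buffer by hand.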

concatenating tokens of an old cstring into a new c-string

Our professor gave us a palindrome assignment, and in this assignment we need to write a function that removes all punctuation marks and spaces, and converts uppercase letters to lowercase letters, in a c-string.
The problem I am getting is that when I debug/run it, after I enter the c-string for the function, it gives a "Debug Assertion failed" error and outputs only the lowercase version of the c-string input. Does anyone have suggestions for how I can fix or improve this piece of code?
Update: I fixed my error by tokenizing the string as geeksforgeeks did. But now the problem is that when concatenating the tokens of the c-string s into the new c-string new_s, it only concatenates the first token of s to new_s.
This is my code:
#define _CRT_SECURE_NO_WARNINGS //I added this because my IDE kept giving me error saying strtok is unsafe and I should use strtok_s.
#include <iostream>
#include <iomanip>
#include <cstring>
using namespace std;
/*This method removes all spaces and punctuation marks from its c-string as well as change any uppercase letters to lowercase. **/
void removePuncNspace(char s[])
{
char new_s[50], *tokenptr;
//convert from uppercase to lowercase
for (int i = 0; i < strlen(s); i++) (char)tolower(s[i]);
//use a cstring function and tokenize s into token pointer and eliminate spaces and punctuation marks
tokenptr = strtok(s, " ,.?!:;");
//concatenate the first token into a c-string.
strcpy_s(new_s,tokenptr);
while (tokenptr != NULL)
{
tokenptr = strtok('\0', " ,.?!:;"); //tokenize rest of the string
}
while (tokenptr != NULL)
{
// concat rest of the tokens to a new cstring. include the \0 NULL as you use a cstrig function to concatenate the tokens into a c-string.
strcat_s(new_s, tokenptr);
}
//copy back into the original c - string for the pass by reference.
strcpy(s, new_s);
}
My output is:
Enter a line:
Did Hannah see bees? Hannah did!
Did is palindrome
Firstly, as @M.M said, when you want to continue tokenizing the same string, you should call strtok(NULL, ".."), not with '\0'.
Secondly, your program logic doesn't make much sense. You split the string s into substrings, but never actually concatenate them to new_s. By the time you get to the second while, tokenptr is surely NULL, so you never enter the loop.
To fix your code I merged the two whiles into a single one and added an if to not call strcat(new_s, tokenptr) if tokenptr is NULL.
void removePuncNspace(char s[])
{
char new_s[50], *tokenptr;
//convert from uppercase to lowercase
for (size_t i = 0; i < strlen(s); i++) s[i] = (char)tolower((unsigned char)s[i]); //store the result; tolower alone does not modify s
//use a cstring function and tokenize s into token pointer and eliminate spaces and punctuation marks
tokenptr = strtok(s, " ,.?!:;");
//concatenate the first token into a c-string.
strcpy(new_s,tokenptr);
while (tokenptr != NULL)
{
tokenptr = strtok(nullptr, " ,.?!:;"); //tokenize rest of the string
if (tokenptr != NULL)
strcat(new_s, tokenptr);
}
//copy back into the original c - string for the pass by reference.
strcpy(s, new_s);
}
P.S: I used the non-secure versions of cstring functions because, for some reason, my compiler doesn't like the secure ones.
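If std::string is allowed in the assignment (an assumption on my part), the whole clean-up step can be sketched without strtok or a fixed 50-character buffer. The function name lettersOnlyLower is made up. Be aware that it is slightly stricter than the strtok version: it drops every character that is not a letter or digit, not only the ones in the delimiter list.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: keeps only letters and digits, converted to lowercase.
std::string lettersOnlyLower(const std::string& s)
{
    std::string result;
    for (unsigned char c : s) {  // unsigned char keeps the <cctype> calls well-defined
        if (std::isalnum(c))
            result += static_cast<char>(std::tolower(c));
    }
    return result;
}
```

The palindrome check can then compare the returned string with its reverse.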

How can I reach the second word in a string?

I'm new here and this is my first question, so don't be too harsh :]
I'm trying to reverse a sentence, i.e. every word separately.
The problem is that I just can't reach the second word, or even reach the ending of a 1-word sentence. What is wrong?
char* reverse(char* string)
{
int i = 0;
char str[80];
while (*string)
str[i++] = *string++;
str[i] = '\0'; //null char in the end
char temp;
int wStart = 0, wEnd = 0, ending = 0; //wordStart, wordEnd, sentence ending
while (str[ending]) /*####This part just won't stop####*/
{
//skip spaces
while (str[wStart] == ' ')
wStart++; //wStart - word start
//for each word
wEnd = wStart;
while (str[wEnd] != ' ' && str[wEnd])
wEnd++; //wEnd - word ending
ending = wEnd; //for sentence ending
for (int k = 0; k < (wStart + wEnd) / 2; k++) //reverse
{
temp = str[wStart];
str[wStart++] = str[wEnd];
str[wEnd--] = temp;
}
}
return str;
}
Your code is somewhat unidiomatic for C++ in that it doesn't make use of many common and convenient C++ facilities. In your case, you could benefit from:
std::string, which takes care of maintaining a buffer big enough to accommodate your string data.
std::istringstream, which can easily split a string on whitespace for you.
std::reverse, which can reverse a sequence of items.
Here's an alternative version which uses these facilities:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
std::string reverse( const std::string &s )
{
// Split the string on spaces by iterating over the stream
// elements and inserting them into the 'words' vector'.
std::vector<std::string> words;
std::istringstream stream( s );
std::copy(
std::istream_iterator<std::string>( stream ),
std::istream_iterator<std::string>(),
std::back_inserter( words )
);
// Reverse the words in the vector.
std::reverse( words.begin(), words.end() );
// Join the words again (inserting one space between two words)
std::ostringstream result;
std::copy(
words.begin(),
words.end(),
std::ostream_iterator<std::string>( result, " " )
);
return result.str();
}
At the end of the first word, after it's traversed, str[wEnd] is a space and you remember this index when you assign ending = wEnd. Immediately, you reverse the characters in the word. At that point, str[ending] is not a space because you included that space in the letter-reversal of the word.
Depending on whether there are extra spaces between words in the rest of the input, execution varies from this point, but it does eventually end with you reversing a word that ended at the null terminator on the string because you end the loop that increments wEnd on that null terminator and include it in the final word reversal.
The very next iteration walks off of the initialized part of the input string and the execution is undetermined from there because, heck, who knows what's in that array (str is stack-allocated, so it's whatever's sitting around in the memory occupied by the stack at that point).
On top of all of that, you don't update wStart except in the reversal loop, and it never moves to wEnd all the way (see the loop exit condition), so come to think of it, you're never getting past that first word. Assuming that was fixed, you'd still have the problem I outlined at first.
All this assumes that you didn't just send this function something longer than 80 characters and break it that way.
Oh, and as mentioned in one of the comments on the question, you're returning stack-allocated local storage, which isn't going to go anywhere good either.
Hoo, boy. Where to start?
In C++, use std::string instead of char* if you can.
char[80] is an overflow risk if string is input by a user; it should be dynamically allocated. Preferably by using std::string; otherwise use new / new[]. If you meant to use C, then malloc.
cmbasnett also pointed out that you can't actually return str (and get the expected results) if you declare / allocate it the way you did. Traditionally, you'd pass in a char* destination and not allocate anything in the function at all.
Set ending to wEnd + 1; wEnd points to the last non-null character of the string in question (eventually, if it works right), so in order for str[ending] to break out of the loop, you have to increment once to get to the null char. Disregard that, I misread the while loop.
It looks like you need to use ((wEnd - wStart) + 1), not (wStart + wEnd). Although you should really use something like while(wEnd > wStart) instead of a for loop in this context.
You also should be setting wStart = ending; or something before you leave the loop, because otherwise it's going to get stuck on the first word.
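Putting those points together, a C-style version might look roughly like the sketch below. It assumes the goal is to reverse the letters of each word while keeping the words in place. The name reverseEachWord and the destination-buffer signature are my own choices, following the "pass in a char* destination" convention mentioned above; nothing is allocated or returned from the stack.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest and reverses each
// space-separated word in place. Returns false if dest (destSize bytes)
// is too small to hold src plus the terminating '\0'.
bool reverseEachWord(const char* src, char* dest, std::size_t destSize)
{
    std::size_t len = std::strlen(src);
    if (len + 1 > destSize)
        return false;
    std::memcpy(dest, src, len + 1);  // includes the '\0'

    std::size_t wStart = 0;
    while (dest[wStart]) {
        // Skip spaces before the next word.
        while (dest[wStart] == ' ')
            wStart++;
        if (!dest[wStart])
            break;  // only trailing spaces were left

        // wEnd stops one past the last character of the word.
        std::size_t wEnd = wStart;
        while (dest[wEnd] && dest[wEnd] != ' ')
            wEnd++;

        // Swap from both ends towards the middle; the space or '\0'
        // at wEnd is never touched.
        std::size_t lo = wStart, hi = wEnd - 1;
        while (lo < hi) {
            char temp = dest[lo];
            dest[lo++] = dest[hi];
            dest[hi--] = temp;
        }

        wStart = wEnd;  // continue after this word
    }
    return true;
}
```

The caller owns the buffer, e.g. char out[80]; reverseEachWord(input, out, sizeof(out)); and checks the return value instead of risking an overflow.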

Why can't write full length string (char[]) in a binary file in Turbo C++

In an array like int a[5] we can store 5 values, from a[0] to a[4]. Isn't that so?
I have a char mobile[10] variable in my class and I was storing a string exactly 10 characters long in this variable. But when I read it back from the file, a few characters from the next variable (declared just after this variable in the class) are appended to mobile. It took hours to investigate what was wrong.
I tried everything I could, such as changing the order of the variables.
At last I set the size of mobile to 11 (char mobile[11]) and then stored it in the binary file. Then everything went well.
Here I have created a demo program that demonstrates my findings:
#include <iostream.h>
#include <conio.h>
#include <string.h>
#include <fstream.h>
#include <stdio.h>
class Test
{
public:
char mobile[10], address[30];
};
void main()
{
clrscr();
Test t;
// uncoment below to write to file
/*strcpy(t.mobile, "1234567890");
strcpy(t.address, "Mumbai");
fstream f("_test.bin", ios::binary | ios::out | ios::app);
f.write((char*)&t, sizeof(t));*/
// uncomment below to read from file
/*fstream f("_test.bin", ios::binary | ios::in);
f.read((char*)&t, sizeof(t));
cout << t.mobile << "\t" << t.address;*/
f.close();
getch();
}
Is my assumption correct that I cannot store n characters in an array like char[n] when working with files, more specifically with binary files?
Should I always allocate 1 more than the required size?
My compiler is Turbo C++ (maybe 3.0). It is a very old and discontinued product.
C-style strings (char arrays) are null terminated. You do not need to store the null terminator in your file, but you need it when printing the string.
In your example you use strcpy to copy a 10-character string into a char[10]. This is undefined behavior because strcpy appends a null terminator to the destination string. You need to use a char[11].
In your example you read 10 characters from the file and print them using cout. cout determines the length of the string by the null terminator. Since you don't have one, cout reads past the end of your string. This is also undefined behavior, but happens to work in most cases by reading characters from the next field in the struct. You need a null terminator on this array, which means you would need to increase your array size to 11 for this as well.
C-style strings in C/C++ must be null terminated. That means you must allot one more character, with the value '\0', at the end.
Also note that strcpy copies all the characters from one string to another up to and including the \0. A string literal (for example "hello world") is stored as "hello world\0" by the compiler, so copying it writes one more byte than its visible length.
Try this code:
#include <iostream.h>
#include <conio.h>
#include <string.h>
#include <fstream.h>
#include <stdio.h>
class Test
{
public:
char mobile[11], address[30];
};
void main()
{
clrscr();
Test t;
// write to file
strcpy(t.mobile, "1234567890");
strcpy(t.address, "Mumbai");
t.mobile[10] = '\0'; // redundant after strcpy, but makes the terminator explicit
fstream out("_test.bin", ios::binary | ios::out | ios::app);
out.write((char*)&t, sizeof(t));
out.close();
// read from file
fstream in("_test.bin", ios::binary | ios::in);
in.read((char*)&t, sizeof(t));
cout << t.mobile << "\t" << t.address;
in.close();
getch();
}
The string literal "1234567890" occupies 11 bytes, not 10!
printf("%d", (int)sizeof("1234567890"));
// 11
This is because the compiler silently adds a \0 character - end of string marker - at the end of string literals. This marker is used by various string manipulation functions, including strcpy.
Now, the following line:
strcpy(t.mobile, "1234567890");
attempts to copy the string - the 10 characters plus the \0 - into t.mobile. Since t.mobile is 10 bytes long, the \0 will overflow into the space used by other variables' space (or worse).
In your example:
strcpy(t.mobile, "1234567890") copies the 10 characters as expected, but the \0 overflows into the space used by t.address
strcpy(t.address, "Mumbai") copies the string as expected and overwrites that stray \0
The result of printing t.mobile is then likely "1234567890Mumbai" (strictly speaking, the overflow is undefined behavior)
Moral of the story: always account for the \0 byte when using C string functions. Failing to do so will cause unexpected problems including variable corruption, run time errors, or worse (e.g. data execution).
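To make the point concrete in standard C++ (not Turbo C++, so the headers differ), here is a small sketch. The struct mirrors the one in the question with mobile enlarged to 11; setField and roundTrip are hypothetical helpers of mine, and a std::stringstream stands in for the binary file so the example is self-contained.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <sstream>

struct Record {
    char mobile[11];   // 10 digits plus the terminating '\0'
    char address[30];
};

// Hypothetical helper: copies src into a fixed-size field, truncating if
// necessary; snprintf always leaves the field null-terminated.
template <std::size_t N>
void setField(char (&field)[N], const char* src)
{
    std::snprintf(field, N, "%s", src);
}

// Hypothetical helper: writes the record as raw bytes and reads it back,
// the same way the question does with fstream.
Record roundTrip(const Record& in)
{
    std::stringstream file(std::ios::binary | std::ios::in | std::ios::out);
    file.write(reinterpret_cast<const char*>(&in), sizeof(in));

    Record out = {};
    file.seekg(0);
    file.read(reinterpret_cast<char*>(&out), sizeof(out));
    return out;
}
```

Because setField takes the array by reference, the field size is deduced at compile time and a string that is too long gets cut off instead of spilling into the next member.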

How to get only first words from several C++ strings?

I have several C++ strings with some words. I need to get the first word from every string. Then I have to put all of them into a char array. How can I do it?
Here is one way of doing it...
// SO2913562.cpp
//
#include <cstring>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
using namespace std;
void getHeadWords(const char *input[]
, unsigned numStrings
, char *outBuf
, unsigned outBufSize)
{
string outStr = "";
for(unsigned i = 0; i<numStrings; i++)
{
stringstream ss(stringstream::in|stringstream::out);
ss<<input[i];
string word;
ss>>word;
outStr += word;
if(i < numStrings-1)
outStr += " ";
}
if(outBufSize < outStr.size() + 1)//Accommodate the null terminator.
//strncpy omits the null terminator if outStr is of the exact same
//length as outBufSize
throw out_of_range("Output buffer too small");
strncpy(outBuf, outStr.c_str(), outBufSize);
}
int main ()
{
const char *lines[] = {
"first sentence"
, "second sentence"
, "third sentence"
};
char outBuf[1024];
getHeadWords(lines, sizeof(lines)/sizeof(lines[0]), outBuf, sizeof(outBuf)); // portable replacement for the MSVC-only _countof
cout<<outBuf<<endl;
return 0;
}
But note the above code has marginal error checking and may have security flaws. And needless to say my C++ is a bit rusty. Cheers.
I'll assume it's homework, so here is a general description:
First, you need to allocate enough space in your char array. In homework, you are usually told the maximum size. That maximum has to be enough for all the first words.
Now, you need to have an index for the insertion point in that array. Start it at zero.
Now go over your strings in order. In each, move an index forward from 0 until you see a \0 or a space (or other delimiter). Insert each character at the insertion point in the result array and increase that index by 1.
If you have encountered a space or a \0, you've found your first word. If you were on the last string, insert a \0 at the insertion point and you're done. If not, insert a space and move to the next string.
What compiler are you using?
Converting to a char array is the first thing to look at.
Once that's done, you can easily step through your array (and look for spaces).
Something like this:
while (oldarray[i] != ' ' && oldarray[i] != '\0')
yournewarray[j++] = oldarray[i++];
i think you gotta figure out the rest yourself, since this looks like some homework for school :)
Assuming this is homework, and that when you say "strings" you mean simple null-delimited arrays of char (and not std::string):
define your strings
define your resulting char array
for each string
find the offset of the first char that is not in the first word
append that many bytes of the string to the result array
If this is not homework, give us a little code to start with and we'll fill in the blanks.