Why isn't C++ strtok() working for me?

The program is supposed to receive an input through cin, tokenize it, and then output each one to show me that it worked properly. It did not.
The program compiles with no errors, and takes an input, but fails to output anything.
What am I doing wrong?
int main(int argc, char* argv[])
{
string input_line;
while(std::cin >> input_line){
char* pch = (char*)malloc( sizeof( char ) *(input_line.length() +1) );
char *p = strtok(pch, " ");
while (p != NULL) {
printf ("Token: %s\n", p);
p = strtok(NULL, " ");
}
}
return 0;
}
I followed the code example here: http://www.cplusplus.com/reference/clibrary/cstring/strtok/
Thanks.

Looks like you forgot to copy the contents of input_line to pch:
strcpy(pch, input_line.c_str());
But I'm not sure why you're doing string tokenization anyway. Doing cin >> input_line will not read a line, but a single whitespace-delimited token, so you get tokens anyway.

This is more of a correctness post; Hans has already identified your problem.
The correct way to get a line of input is with getline:
std::string s;
std::getline(std::cin, s);
Formatted extraction with std::cin >> breaks at whitespace anyway, so if you typed asd 123 and ran your code, input_line would first be "asd", then, the second time through the loop, "123" (without waiting for another Enter).
That said, an easy way to get your result is with a stringstream. Any time you explicitly allocate memory, especially with malloc, you're probably doing something the hard way. Here's one possible solution to tokenizing a string:
#include <sstream>
#include <string>
#include <iostream>
int main(void)
{
std::string input;
std::getline(std::cin, input);
std::stringstream ss(input);
std::string token;
while(std::getline(ss, token, ' '))
{
std::cout << token << "...";
}
std::cout << std::endl;
}
If you really want to use strtok, you might do something like this:
#include <cstring>
#include <string>
#include <iostream>
#include <vector>
int main(void)
{
std::string input;
std::getline(std::cin, input);
std::vector<char> buffer(input.begin(), input.end());
buffer.push_back('\0');
char* token = strtok(&buffer[0], " ");
for (; token; token = strtok(0, " "))
{
std::cout << token << "...";
}
std::cout << std::endl;
}
Remember, manual memory management is bad. Use a vector for arrays, and you avoid leaks. (Which your code has!)

You didn't initialize your string. Insert
strcpy(pch, input_line.c_str());
after the malloc line.
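For reference, a minimal sketch of the question's approach with that one line added, plus the free() the original never called. The function name tokenize_with_strtok is hypothetical, and returning the tokens in a vector (instead of printing them) is just a choice made for this illustration.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits `input_line` on spaces using strtok.
std::vector<std::string> tokenize_with_strtok(const std::string& input_line)
{
    std::vector<std::string> tokens;

    // +1 leaves room for the terminating '\0'.
    char* pch = static_cast<char*>(std::malloc(input_line.length() + 1));
    if (pch == NULL)
        return tokens; // allocation failed, nothing to tokenize

    // The missing step: malloc returns uninitialized memory, so the
    // string has to be copied in before strtok can scan it.
    std::strcpy(pch, input_line.c_str());

    for (char* p = std::strtok(pch, " "); p != NULL; p = std::strtok(NULL, " "))
        tokens.push_back(p); // copies the token out of the buffer

    std::free(pch); // release the buffer so it does not leak
    return tokens;
}
```

Because each token is copied into a std::string before the buffer is freed, the returned vector stays valid after the function returns.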

GMan's answer is probably better and more purely c++. This is more of a mix which specifically uses strtok(), since I think that was your goal.
I used strdup()/free() since it was the easiest way to copy the string. In the question you were leaking memory since you'd malloc() with no matching free().
Also, operator>> with a std::string breaks on whitespace, so it is inappropriate for reading whole lines. Use getline() instead.
token.cpp
#include <iostream>
#include <string>
#include <cstring> /* for strtok() and strdup() */
#include <cstdlib> /* for free() */
int main(int argc, char * argv[]){
std::string line;
while(getline(std::cin, line)){
char *pch = strdup(line.c_str());
char *p = strtok(pch, " ");
while(p){
std::cout<<"Token: "<<p<<std::endl;
p = strtok(NULL, " ");
}
std::cout <<"End of line"<<std::endl;
free(pch);
}
return 0;
}
When you run this, you get what appears to be the correct result:
$ printf 'Hi there, I like tokens\nOn new lines too\n\nBlanks are fine'|./token
Token: Hi
Token: there,
Token: I
Token: like
Token: tokens
End of line
Token: On
Token: new
Token: lines
Token: too
End of line
End of line
Token: Blanks
Token: are
Token: fine
End of line

Or use this:
pch = strdup(input_line.c_str());

Related

Difference between std::cin and scanf() applied to string

I am trying to get the first character of a string written to a variable of type char. With std::cin (commented out) it works fine, but with scanf() I get a runtime error. It crashes when I enter "LLUUUR". Why is that? Using MinGW.
#include <cstdio>
#include <string>
#include <iostream>
int main() {
std::string s;
scanf("%s", &s);
//std::cin >> s;
char c = s[0];
}
scanf knows nothing about std::string. If you want to read into the underlying character array, you must write scanf("%s", s.data()); (the non-const data() overload requires C++17; before that, use &s[0]). But first make sure that the string's underlying buffer is large enough by calling std::string::resize(number)!
Generally: don't use scanf with std::string.
Another alternative if you want to use scanf and std::string
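#include <cstdio>
#include <iostream>
#include <string>
int main()
{
char myText[64];
if (scanf("%63s", myText) != 1) // the width leaves room for the terminating '\0'
return 1; // nothing was read, so myText is not a valid string
std::string newString(myText);
std::cout << newString << '\n';
return 0;
}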
Construct the string after reading.
Now for the way directly on the string:
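#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
int main()
{
std::string newString;
newString.resize(100); // Or whatever size
scanf("%99s", newString.data()); // non-const data() needs C++17; the width leaves room for '\0'
newString.resize(std::strlen(newString.c_str())); // drop the unused trailing '\0' characters
std::cout << newString << '\n';
return 0;
}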
Although this will of course only read until the next space. So if you want to read a whole line, you would be better off with:
std::string s;
std::getline(std::cin, s);

Delimiting Using Multiple Delimiters

So I have an arbitrarily long string that I take as input from the user, and I want to tokenize it and store it in a vector<std::string>. Here is the code that I am using (which may be inspired by my C background):
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <sstream>
#include <string.h>
using namespace std;
int main()
{
string input;
cout << "Input a \' \' or \',\' or \'\\r\' separated string: ";
cin >> input;
vector<string> tokens;
char *str = new char[input.length() + 1];
strcpy(str, input.c_str());
char * pch;
pch = strtok(str, " , \r");
while (pch != NULL)
{
tokens.push_back(pch);
pch = strtok(NULL, " , \r");
}
for (vector<string>::const_iterator i = tokens.begin(); i != tokens.end(); ++i)
cout << *i << ' ';
return 0;
}
However, this only tokenizes the first word and nothing after that, like so:
Input a ' ' or ',' or '\r' string: hello, world I am C.
hello
What am I doing wrong and what would be the correct way to do it without using third party library?
Regards.
This is, sadly, a quite common pitfall. Many introductory courses and books on C++ teach you to accept interactive input like this:
cin >> input;
Many introductory simple exercises typically prompt for a single value of some sort, and that works fine, for that use case.
Unfortunately, these books don't fully explain what >> actually does: it skips leading whitespace and then reads input only up to the next whitespace character, even when the target is a string.
So, when you enter a whole line of text, only the first word is read into input. The solution is to use the right tool, for the right job: std::getline(), which reads a single line of text, and puts it into a single string variable:
getline(cin, input);
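Once the whole line is in input, the splitting itself can also be done without strtok and without a third-party library. The following is only a sketch of one way to do it: split_any is a hypothetical helper name, and it skips empty tokens the same way strtok does.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `input` at any character found in `delims`.
// Runs of consecutive delimiters produce no empty tokens, like strtok.
std::vector<std::string> split_any(const std::string& input, const std::string& delims)
{
    std::vector<std::string> tokens;

    // Find the first character that is not a delimiter.
    std::string::size_type start = input.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        // The token ends at the next delimiter, or at the end of the string.
        std::string::size_type end = input.find_first_of(delims, start);
        std::string::size_type count =
            (end == std::string::npos) ? std::string::npos : end - start;
        tokens.push_back(input.substr(start, count));

        // Skip the delimiters that follow; yields npos when nothing is left.
        start = input.find_first_not_of(delims, end);
    }
    return tokens;
}
```

With the input from the question, split_any("hello, world I am C.", " ,\r") returns the five tokens hello, world, I, am and C.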

Using strtok/strtok_r in a while loop in C++

I'm getting unexpected behavior from the strtok and strtok_r functions:
queue<string> tks;
char line[1024];
char *savePtr = 0;
while(true)
{
//get input from user store in line
tks.push(strtok_r(line, " \n", &savePtr)); //initial push only works right during first loop
char *p = nullptr;
for (...)
{
p = strtok_r(NULL, " \n", &savePtr);
if (p == NULL)
{
break;
}
tks.push(p);
}
delete p;
savePtr = NULL;
//do stuff, clear out tks before looping again
}
I've tried using strtok and realized that during the second loop, the initial push is not occurring. I attempted to use the reentrant version strtok_r in order to control what the saved pointer is pointing to during the second loop by making sure it is null before looping again.
tks is only correctly populated the first time through the loop; subsequent iterations give varying results depending on the length of line.
What am I missing here?
Just focusing on the inner loop and chopping off all of the stuff I don't see as necessary.
#include <iostream>
#include <queue>
#include <string>
#include <cstring>
using namespace std;
int main()
{
std::queue<std::string> tks;
while(true)
{
char line[1024];
char *savePtr;
char *p;
cin.getline(line, sizeof(line));
p = strtok_r(line, " \n", &savePtr); // initial read. contents of savePtr ignored
while (p != NULL) // exit when no more data, which includes an empty line
{
tks.push(p); // got data, store it
p = strtok_r(NULL, " \n", &savePtr); // get next token
}
// consume tks
}
}
I prefer the while loop over the for loop used by Toby Speight in his answer because I think it is more transparent and easier to read. Your mileage may vary. By the time the compiler is done with it they will be identical.
There is no need to delete any memory: line, savePtr, and p are automatic (stack) variables, and nothing was allocated with new. There is no need to clear anything before the next round except for tks. savePtr will be reset by the first strtok_r.
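There is a failure case if the user inputs more than 1023 characters on a line (the buffer needs one byte for the terminator): this will not crash, but cin.getline sets failbit, so later reads fail until the stream is cleared. If this still doesn't work, look into how you're consuming tks. It's not posted so we can't troubleshoot that portion.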
I wholeheartedly recommend changing to a string-based solution if possible. This one is really simple and easy to write, but slow:
#include <iostream>
#include <queue>
#include <string>
#include <sstream>
int main()
{
std::queue<std::string> tks;
while(true)
{
std::string line;
std::getline(std::cin, line);
std::stringstream linestream(line);
std::string word;
// parse only on ' ', not on the usual all whitespace of >>
while (std::getline(linestream, word, ' '))
{
tks.push(word);
}
// consume tks
}
}
Your code wouldn't compile for me, so I fixed it:
#include <iostream>
#include <queue>
#include <string>
#include <cstring>
std::queue<std::string> tks;
int main() {
char line[1024] = "one \ntwo \nthree\n";
char *savePtr = 0;
for (char *p = strtok_r(line, " \n", &savePtr); p;
p = strtok_r(nullptr, " \n", &savePtr))
tks.push(p);
// Did we read it correctly?
for (; tks.size() > 0; tks.pop())
std::cout << ">" << tks.front() << "<" << std::endl;
}
This produces the expected output:
>one<
>two<
>three<
So your problem isn't with the code you posted.
If you have the option to use Boost, try this to tokenize a string, providing your own string and delimiters, of course.
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>
int main()
{
std::string str = "Any\nString\nYou want";
std::vector< std::string > results;
boost::split( results, str, boost::is_any_of( "\n" ) );
}

CString Parsing Carriage Returns

Let's say I have a string that has multiple carriage returns in it, i.e:
394968686
100630382
395950966
335666021
I'm still pretty amateur hour with C++, would anyone be willing to show me how you go about: parsing through each "line" in the string ? So I can do something with it later (add the desired line to a list). I'm guessing using Find("\n") in a loop?
Thanks guys.
while (!str.IsEmpty())
{
CString one_line = str.SpanExcluding(_T("\r\n"));
// do something with one_line
str = str.Right(str.GetLength() - one_line.GetLength()).TrimLeft(_T("\r\n"));
}
Blank lines will be eliminated with this code, but that's easily corrected if necessary.
You could try it using stringstream. Note that getline has an overload that lets you use any delimiter you want.
string line;
stringstream ss;
ss << yourstring;
while ( getline(ss, line, '\n') )
{
cout << line << endl;
}
Alternatively you could use the boost library's tokenizer class.
You can use stringstream class in C++.
#include <iostream>
#include <sstream>
#include <vector>
using namespace std;
int main()
{
string str =
"394968686\n"
"100630382\n"
"395950966\n"
"335666021"; // adjacent literals are concatenated; each "\n" is a real line break
stringstream ss(str);
vector<string> v;
string token;
// get line by line
while (ss >> token)
{
// insert current line into a std::vector
v.push_back(token);
// print out current line
cout << token << endl;
}
}
Output of the program above:
394968686
100630382
395950966
335666021
Note that no whitespace will be included in the parsed token, with the use of operator>>.
If your string is stored in a c-style char* or std::string then you can simply search for \n.
std::string s;
size_t pos = s.find('\n');
You can use string::substr() to get the substring and store it in a list. Pseudo code,
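std::string s = " .... ";
std::list<std::string> lines; // needs #include <list>
std::size_t begin = 0;
for (std::size_t pos; (pos = s.find('\n', begin)) != std::string::npos; begin = pos + 1)
{
lines.push_back(s.substr(begin, pos - begin)); // substr takes (start, count), not (start, end)
}
lines.push_back(s.substr(begin)); // whatever follows the last '\n'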

How to read the string into a file C++

I have a little problem writing a string to a file.
How can I write the string to the file and view it as ASCII text?
I am able to do that when I set a default value for str, but not when I enter the str data.
Thanks.
#include <iostream>
#include <fstream>
#include <cstring>
using namespace std;
int main()
{
fstream out("G://Test.txt");
if(!out) {
cout << "Cannot open output file.\n";
return 1;
}
char str[200];
cout << "Enter Customers data seperate by tab\n";
cin >> str;
cin.ignore();
out.write(str, strlen(str));
out.seekp(0 ,ios::end);
out.close();
return 0;
}
Please use std::string:
#include <string>
std::string str;
std::getline(cin, str);
cout << str;
I'm not sure what the exact problem in your case was, but >> only reads up to the first separator (which is whitespace); getline will read the entire line.
Just note that >> operator will read 1 word.
std::string word;
std::cin >> word; // Reads one whitespace-separated word.
// Skips any leading whitespace, then reads
// into 'word' all characters up to (but not including)
// the next whitespace character, which is left in the stream.
// Note: whitespace means ' ', '\t', '\v', '\n', etc.
You're working at the wrong level of abstraction. Also, there is no need to seekp to the end of the file before closing the file.
You want to read a string and write a string. As Pavel Minaev has said, this is directly supported via std::string and std::fstream:
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::ofstream out("G:\\Test.txt");
if(!out) {
std::cout << "Cannot open output file.\n";
return 1;
}
std::cout << "Enter Customer's data separated by tab\n";
std::string buffer;
std::getline(std::cin, buffer);
out << buffer;
return 0;
}
If you want to write C, use C. Otherwise, take advantage of the language you're using.
I can't believe no one found the problem. The problem was that you were using strlen on a string that wasn't terminated with a null character. strlen will keep iterating until it finds a zero-byte, and an incorrect string length might be returned (or the program might crash - it's Undefined Behavior, who knows?).
The answer is to zero-initialize your string:
char str[200] = {0};
Supplying your own string as the value of str works because those in-memory strings are null-terminated.