C++ strtok - multiple use with more data buffers - c++

I have a little issue with using the strtok() function.
I am parsing two files. First I load file 1 into a buffer. This file contains the name of the second file I need to load. Both files are read line by line. My code looks like this:
char second_file_name[128] = { "" };
char * line = strtok( buffer, "\n" );
while( line != NULL )
{
if ( line[0] = 'f' )
{
sscanf( line, "%*s %s", &second_file_name );
LoadSecondFile( second_file_name );
}
// processing other lines, not relevant for question
line = strtok( NULL, "\n" );
}
The LoadSecondFile(...) function works in much the same way:
char * line = strtok( buffer, "\n" );
while( line != NULL )
{
// process file data
line = strtok( NULL, "\n" );
}
My problem is that after calling the LoadSecondFile(...) function, the strtok() pointer used for parsing the first file gets "messed up". Instead of giving me the line that follows the name of the second file, it gives me complete nonsense. Do I understand correctly that this is caused by the strtok() pointer being shared across the whole program, not only within a function? If so, how can I "back up" the strtok() pointer used for parsing the first file before using it for parsing the second file?
Thanks for any advice.
Cheers.

strtok is an evil little function which maintains global state, so (as you've found) you can't tokenise two strings at the same time. On some platforms, there are less evil variants with names like strtok_r or strtok_s; but since you're writing C++ not C, why not use the C++ library?
ifstream first_file(first_file_name); // no need to read into a buffer
string line;
while (getline(first_file, line)) {
if (!line.empty() && line[0] == 'f') { // NOT =
istringstream line_stream(line);
string second_file_name;
line_stream.ignore(numeric_limits<streamsize>::max(), ' '); // skip first word ("%*s"); needs <limits>
line_stream >> second_file_name; // read second word ("%s")
LoadSecondFile(second_file_name);
}
}

You can use strtok_r which allows you to have different state pointers.
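As a rough sketch of what that looks like, assuming a POSIX platform where strtok_r is available (the function name split_lines_and_words is made up for illustration), each loop keeps its own save pointer, so the inner tokenizing does not disturb the outer one:

```cpp
#include <string.h>  // strtok_r (POSIX, not standard C++)
#include <string>
#include <vector>

// Splits `text` into lines and each line into space-separated words.
// `text` is taken by value because strtok_r writes '\0' into the buffer.
std::vector<std::vector<std::string>> split_lines_and_words(std::string text) {
    std::vector<std::vector<std::string>> result;
    char* line_state = nullptr;  // save pointer for the outer (line) loop
    for (char* line = strtok_r(&text[0], "\n", &line_state);
         line != nullptr;
         line = strtok_r(nullptr, "\n", &line_state)) {
        std::vector<std::string> words;
        char* word_state = nullptr;  // separate save pointer for the inner loop
        for (char* word = strtok_r(line, " ", &word_state);
             word != nullptr;
             word = strtok_r(nullptr, " ", &word_state)) {
            words.push_back(word);
        }
        result.push_back(words);
    }
    return result;
}
```

With plain strtok the inner loop would overwrite the hidden state of the outer loop, which is the situation described in the question.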

Which is why it is constantly recommended to not use strtok
(not to mention the problems with threads). There are many
better solutions, using the functions in the C++ standard
library. None of which modify the text they're working on, and
none of which use hidden, static state.
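One possible shape for such a solution, as a sketch only (referenced_files is a hypothetical helper, and the "lines starting with 'f' name a file" rule is taken from the question):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the second word of every line that starts with 'f'.
// The input text is never modified and each stream object carries its own
// position, so parsing another buffer in between cannot interfere.
std::vector<std::string> referenced_files(const std::string& text) {
    std::vector<std::string> names;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.empty() || line[0] != 'f') continue;
        std::istringstream words(line);
        std::string keyword, name;
        if (words >> keyword >> name) names.push_back(name);
    }
    return names;
}
```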

Related

scanf for only ONE char

I need to be able to detect extraneous input and exit my program.
If I input a string of more than 1 char into scanf("%c", &var); it takes the first letter and stores it in var, but the program continues.
I tried to use if (scanf("%c", &var) != 1) but it returns 1 every time no matter the input, so it makes no difference.
I know other functions like fgets() may be better suited for this but I have been instructed to use the scanf() function.
How should I do this?
When reading user input scanf can take its input from a file, or the console.
When reading from the console, the data becomes available to the program only when a line break is added. This means anything that was typed on the same line, is read.
Imagine I have a maze program, and it wants me to choose which way to go....
while( !atGoalLocation() ) {
printf( "Which direction (f)orward (l)eft (r)ight?\n" );
scanf( "%c", &dir );
processDirection( dir );
}
I could either enter the route through the maze as
f
l
f
r
f
f
Or it may also be correct to enter my input as
flfrff
Depending on your task, they may mean the same thing.
If you want to allow either of these inputs, then make sure you eat the white space by adding " " to the scanf.
if( scanf( " %c", &dir ) == 1 )
If the line break version is the only method you want to accept, then you should separate the lines and then try the scan.
char line[200];
while( fgets( line, sizeof( line ), stdin ) != NULL ){
    if( sscanf( line, "%c", &dir ) == 1 )
        processDirection( dir );
}
Also for C++ we should be using the std::cin and std::cout. However the same complications occur for unread characters on the same line, so I would still use a line based parser.
std::string line;
while( std::getline( std::cin, line ) ){
    // here we could parse the line using std::istringstream to decode more complex things than a char.
    if( !line.empty() )
        dir = line[0];
}
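If extra characters on a line should be rejected rather than ignored, the line-based approach makes that check simple. A minimal sketch, where parse_single_char is a hypothetical helper name:

```cpp
#include <string>

// Accepts a line only if it holds exactly one character.
// On success the character is stored in `out`.
bool parse_single_char(const std::string& line, char& out) {
    if (line.size() != 1) return false;  // empty line or extraneous input
    out = line[0];
    return true;
}
```

It would be called on each line obtained from std::getline( std::cin, line ), and the program can exit when it returns false.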
You need to add a space before the %c:
scanf(" %c",&var);
And if you just don't want your program to wait for an Enter after typing the character, use getch(), a non-standard function declared in "conio.h" on some compilers:
c = getch();
OR
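In scanf("%c",&var); you could add a newline character \n after %c in order to absorb the trailing line break. Note that it only consumes whitespace, not other extra characters, and scanf keeps reading until it sees a non-whitespace character.
scanf("%c\n",&var);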
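To actually detect extraneous input while staying with the scanf family, one option is to read a second character and check that it is the line break. This is only a sketch; it uses sscanf on a string so it can be tried without a terminal, and is_single_char_input is a made-up name:

```cpp
#include <cstdio>

// Parses `input` with the same conversions scanf would apply to stdin.
// Accepts the input only if the first character is followed directly by
// a newline or by the end of the input.
bool is_single_char_input(const char* input, char& out) {
    char next = '\0';
    const int matched = std::sscanf(input, "%c%c", &out, &next);
    if (matched == 1) return true;        // nothing after the character
    return matched == 2 && next == '\n';  // only the line break follows
}
```

With stdin the same format string would be passed to scanf, e.g. scanf("%c%c", &var, &next), and the program can exit when next is not '\n'.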

Can I use 2 or more delimiters in C++ function getline? [duplicate]

This question already has answers here:
How can I read and parse CSV files in C++?
(39 answers)
Closed 4 years ago.
I would like to know how I can use 2 or more delimiters in the getline function. This is my problem:
The program reads a text file... each line is going to be like:
New Your, Paris, 100
CityA, CityB, 200
I am using getline(file, line), but I got the whole line, when I want to to get CityA, then CityB and then the number; and if I use ',' delimiter, I won't know when is the next line, so I'm trying to figure out some solution..
Though, how could I use comma and \n as a delimiter?
By the way, I'm manipulating the string type, not char arrays, so strtok is not possible :/
some scratch:
string line;
ifstream file("text.txt");
if(file.is_open())
while(!file.eof()){
getline(file, line);
// here I need to get each string before comma and \n
}
You can read a line using std::getline, then pass the line to a std::stringstream and read the comma-separated values off it:
string line;
ifstream file("text.txt");
if(file.is_open()){
    while(getline(file, line)){ // get a whole line
        std::stringstream ss(line);
        string field;
        while(getline(ss, field, ',')){
            // field now holds one of the separate entries
        }
    }
}
No, std::getline() only accepts a single character, to override the default delimiter. std::getline() does not have an option for multiple alternate delimiters.
The correct way to parse this kind of input is to use the default std::getline() to read the entire line into a std::string, then construct a std::istringstream, and then parse it further, into comma-separate values.
However, if you are truly parsing comma-separated values, you should be using a proper CSV parser.
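For the simple "CityA, CityB, 200" layout from the question, the read-a-line-then-split approach might look like the sketch below. The Route struct and the function names are assumptions made for illustration, and quoted fields or embedded commas are not handled:

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Route {  // hypothetical record type for one input line
    std::string from;
    std::string to;
    int distance;
};

// Removes the spaces left over after a comma.
static std::string strip_leading_spaces(const std::string& s) {
    const auto first = s.find_first_not_of(' ');
    return first == std::string::npos ? std::string() : s.substr(first);
}

// Reads every line, then splits it at commas; malformed lines are skipped.
std::vector<Route> parse_routes(std::istream& in) {
    std::vector<Route> routes;
    std::string line;
    while (std::getline(in, line)) {        // '\n' is the outer delimiter
        std::istringstream fields(line);
        std::string from, to;
        int distance = 0;
        if (std::getline(fields, from, ',') &&  // ',' is the inner delimiter
            std::getline(fields, to, ',') &&
            (fields >> distance)) {
            routes.push_back({from, strip_leading_spaces(to), distance});
        }
    }
    return routes;
}
```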
Often, it is more intuitive and efficient to parse character input in a hierarchical, tree-like manner, where you start by splitting the string into its major blocks, then go on to process each of the blocks, splitting them up into smaller parts, and so on.
An alternative to this is to tokenize like strtok does -- from the beginning of input, handling one token at a time until the end of input is encountered. This may be preferred when parsing simple inputs, because it is straightforward to implement. This style can also be used when parsing inputs with nested structure, but this requires maintaining some kind of context information, which might grow too complex to maintain inside a single function or limited region of code.
Someone relying on the C++ std library usually ends up using a std::stringstream, along with std::getline to tokenize string input. But, this only gives you one delimiter. They would never consider using strtok, because it is a non-reentrant piece of junk from the C runtime library. So, they end up using streams, and with only one delimiter, one is obligated to use a hierarchical parsing style.
But zneak brought up std::string::find_first_of, which takes a set of characters and returns the position nearest to the beginning of the string containing a character from the set. And there are other member functions: find_last_of, find_first_not_of, and more, which seem to exist for the sole purpose of parsing strings. But std::string stops short of providing useful tokenizing functions.
Another option is the <regex> library (available since C++11), which is very flexible, but you will need to get used to its syntax.
But, with very little effort, you can leverage existing functions in std::string to perform tokenizing tasks, and without resorting to streams. Here is a simple example. get_to() is the tokenizing function and tokenize demonstrates how it is used.
The code in this example will be slower than strtok, because it constantly erases characters from the beginning of the string being parsed, and also copies and returns substrings. This makes the code easy to understand, but it does not mean more efficient tokenizing is impossible. It wouldn't even be that much more complicated than this -- you would just keep track of your current position, use this as the start argument in std::string member functions, and never alter the source string. And even better techniques exist, no doubt.
To understand the example's code, start at the bottom, where main() is and where you can see how the functions are used. The top of this code is dominated by basic utility functions and dumb comments.
#include <iostream>
#include <string>
#include <utility>
namespace string_parsing {
// in-place trim whitespace off ends of a std::string
inline void trim(std::string &str) {
auto space_is_it = [] (char c) {
// A few asks:
// * Suppress criticism WRT localization concerns
// * Avoid jumping to conclusions! And seeing monsters everywhere!
// Things like...ah! Believing "thoughts" that assumptions were made
// regarding character encoding.
// * If an obvious, portable alternative exists within the C++ Standard Library,
// you will see it in 2.0, so no new defect tickets, please.
// * Go ahead and ignore the rumor that using lambdas just to get
// local function definitions is "cheap" or "dumb" or "ignorant."
// That's the latest round of FUD from...*mumble*.
return c > '\0' && c <= ' ';
};
for(auto rit = str.rbegin(); rit != str.rend(); ++rit) {
if(!space_is_it(*rit)) {
if(rit != str.rbegin()) {
str.erase(&*rit - &*str.begin() + 1);
}
for(auto fit=str.begin(); fit != str.end(); ++fit) {
if(!space_is_it(*fit)) {
if(fit != str.begin()) {
str.erase(str.begin(), fit);
}
return;
} } } }
str.clear();
}
// get_to(string, <delimiter set> [, delimiter])
// The input+output argument "string" is searched for the first occurrence of one
// from a set of delimiters. All characters to the left of, and the delimiter itself
// are deleted in-place, and the substring which was to the left of the delimiter is
// returned, with whitespace trimmed.
// <delimiter set> is forwarded to std::string::find_first_of, so its type may match
// whatever this function's overloads accept, but this is usually expressed
// as a string literal: ", \n" matches commas, spaces and linefeeds.
// The optional output argument "found_delimiter" receives the delimiter character just found.
template <typename D>
inline std::string get_to(std::string& str, D&& delimiters, char& found_delimiter) {
const auto pos = str.find_first_of(std::forward<D>(delimiters));
if(pos == std::string::npos) {
// When none of the delimiters are present,
// clear the string and return its last value.
// This effectively makes the end of a string an
// implied delimiter.
// This behavior is convenient for parsers which
// consume chunks of a string, looping until
// the string is empty.
// Without this feature, it would be possible to
// continue looping forever, when an iteration
// leaves the string unchanged, usually caused by
// a syntax error in the source string.
// So the implied end-of-string delimiter takes
// away the caller's burden of anticipating and
// handling the range of possible errors.
found_delimiter = '\0';
std::string result;
std::swap(result, str);
trim(result);
return result;
}
found_delimiter = str[pos];
auto left = str.substr(0, pos);
trim(left);
str.erase(0, pos + 1);
return left;
}
template <typename D>
inline std::string get_to(std::string& str, D&& delimiters) {
char discarded_delimiter;
return get_to(str, std::forward<D>(delimiters), discarded_delimiter);
}
inline std::string pad_right(const std::string& str,
std::string::size_type min_length,
char pad_char=' ')
{
if(str.length() >= min_length ) return str;
return str + std::string(min_length - str.length(), pad_char);
}
inline void tokenize(std::string source) {
std::cout << source << "\n\n";
bool quote_opened = false;
while(!source.empty()) {
// If we just encountered an open-quote, only include the quote character
// in the delimiter set, so that a quoted token may contain any of the
// other delimiters.
const char* delimiter_set = quote_opened ? "'" : ",'{}";
char delimiter;
auto token = get_to(source, delimiter_set, delimiter);
quote_opened = delimiter == '\'' && !quote_opened;
std::cout << " " << pad_right('[' + token + ']', 16)
<< " " << delimiter << '\n';
}
std::cout << '\n';
}
}
int main() {
string_parsing::tokenize("{1.5, null, 88, 'hi, {there}!'}");
}
This outputs:
{1.5, null, 88, 'hi, {there}!'}
 []               {
 [1.5]            ,
 [null]           ,
 [88]             ,
 []               '
 [hi, {there}!]   '
 []               }
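The position-tracking variant mentioned earlier (keep a current position, pass it as the start argument, never alter the source string) could be sketched roughly as follows. split_any is a hypothetical name, and the sketch still copies each token into the result:

```cpp
#include <string>
#include <vector>

// Splits `source` at any character in `delimiters` without modifying it.
// `start` tracks the current position; empty tokens are kept.
std::vector<std::string> split_any(const std::string& source,
                                   const std::string& delimiters) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (true) {
        const auto pos = source.find_first_of(delimiters, start);
        if (pos == std::string::npos) {
            tokens.push_back(source.substr(start));  // last token: the rest
            break;
        }
        tokens.push_back(source.substr(start, pos - start));
        start = pos + 1;  // continue just past the delimiter
    }
    return tokens;
}
```

For the original question, passing ",\n" as the delimiter set treats commas and line breaks alike.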
I don't think that's how you should attack the problem (even if you could do it); instead:
1. Use what you have to read in each line.
2. Then split up that line by the commas to get the pieces that you want.
If strtok will do the job for #2, you can always convert your string into a char array.

Find a string in a file C++

I am trying to parse a file in C++. My file contents are as follows:
//Comments should be ignored
FileVersion,1;
Count,5;
C:\Test\Files\Test_1.txt 0,16777216,16777552,0,0,1,0,1,1,1;
FileVersion is the first line I need to read information from. All the previous lines are just comments which begin with '//'. How do I set my cursor to the line containing FileVersion? I am asking because I am using fscanf to read the information from the file.
if ( 1 == fscanf( f, "FileVersion,%d;\n", &lFileVersion ))
{
//Successfully read the file version.
}
I like to write parsers (assuming "line-based") by reading a line at a time, and then using sscanf, strncmp and strcmp (or C++'s std::stringstream and std::string::substr) to check for various content.
In your example, something like:
enum States
{
Version = 1,
Count = 2,
...
} state = Version;
char buffer[MAXLEN];
while(fgets(buffer, MAXLEN, f) != NULL)
{
if (0 == strncmp("//", buffer, 2))
{
// Comment. Skip this line.
continue;
}
switch (state)
{
case Version:
if (0 == strncmp("FileVersion,", buffer, 12))
{
if (1 == sscanf(buffer, "FileVersion,%d;", &version))
{
state = Count;
break;
}
Error("Expected file version number...");
}
break;
...
}
}
There are of course oodles of other ways to do this.
Since this is tagged C++, I will give you a C++ solution.
You can use a single call to f.ignore() to discard the first line of the stream:
f.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
Technically this skips everything up to and including the newline at the end of the first line, so the stream position will be at the start of the second line. Formatted I/O discards leading whitespace, so any blank space before the next value will be no issue.
The above requires the use of C++ file streams since this is C++, and the use of the formatted operators operator>>() and operator<<() to perform input and output.
Not a particular C++ solution, but:
read a line with fgets (oh okay, if you want, you can substitute a C++ function for that);
if it starts with your 'comment' designator, skip to end of loop
if the line is empty (i.e., it contains only a hard return; or, possibly, check for zero or more whitespace characters and then an end-of-line), skip to end of loop
at end of loop: if you got something else, use sscanf on that string.
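Those steps, written with std::getline in place of fgets, might look like this sketch. read_file_version is a hypothetical name, and the "FileVersion,<number>;" format is taken from the question:

```cpp
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>

// Scans forward to the first line that is neither a comment nor blank and
// tries to read "FileVersion,<number>;" from it.
// Returns true and sets `version` on success.
bool read_file_version(std::istream& in, int& version) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 2, "//") == 0) continue;  // comment line
        if (line.find_first_not_of(" \t\r") == std::string::npos) continue;  // blank line
        return std::sscanf(line.c_str(), "FileVersion,%d;", &version) == 1;
    }
    return false;  // reached the end of input without finding such a line
}
```

The function accepts any std::istream, so it can be given a std::ifstream for the real file or a std::istringstream for a quick experiment.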

Pull out data from a file and store it in strings in C++

I have a file which contains records of students in the following format.
Umar|Ejaz|12345|umar#umar.com
Majid|Hussain|12345|majid#majid.com
Ali|Akbar|12345|ali#geeks-inn.com
Mahtab|Maqsood|12345|mahtab#myself.com
Juanid|Asghar|12345|junaid#junaid.com
The data has been stored according to the following format:
firstName|lastName|contactNumber|email
The total number of lines(records) can not exceed the limit 100. In my program, I've defined the following string variables.
#define MAX_SIZE 100
// other code
string firstName[MAX_SIZE];
string lastName[MAX_SIZE];
string contactNumber[MAX_SIZE];
string email[MAX_SIZE];
Now, I want to pull data from the file, and using the delimiter '|', I want to put data in the corresponding strings. I'm using the following strategy to put back data into string variables.
ifstream readFromFile;
readFromFile.open("output.txt");
// other code
int x = 0;
string temp;
while(getline(readFromFile, temp)) {
int charPosition = 0;
while(temp[charPosition] != '|') {
firstName[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != '|') {
lastName[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != '|') {
contactNumber[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != endl) {
email[x] += temp[charPosition];
charPosition++;
}
x++;
}
Is it necessary to attach a null character '\0' at the end of each string? And if I do not attach it, will it create problems when I actually use those string variables in my program? I'm new to C++, and I've come up with this solution. If anybody has a better technique, it is surely welcome.
Edit: Also I can't compare a char(acter) with endl, how can I?
Edit: The code that I've written isn't working. It gives me following error.
Segmentation fault (core dumped)
Note: I can only use .txt file. A .csv file can't be used.
There are many techniques to do this. I suggest searching Stack Overflow for "[C++] read file" to see some more methods.
Find and Substring
You could use the std::string::find method to find the delimiter and then use std::string::substr to return a substring between the position and the delimiter.
std::string::size_type position = 0;
position = temp.find('|');
if (position != std::string::npos)
{
firstName[x] = temp.substr(0, position);
}
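Repeating that find/substr step for every field could look like the sketch below. The Student struct and parse_student are assumptions for illustration, not part of the question's code:

```cpp
#include <string>

struct Student {  // hypothetical record type for one line of the file
    std::string firstName, lastName, contactNumber, email;
};

// Fills `out` from "firstName|lastName|contactNumber|email".
// Returns false if the line has fewer than three '|' characters.
bool parse_student(const std::string& line, Student& out) {
    std::string* targets[] = { &out.firstName, &out.lastName, &out.contactNumber };
    std::string::size_type start = 0;
    for (std::string* target : targets) {
        const std::string::size_type position = line.find('|', start);
        if (position == std::string::npos) return false;
        *target = line.substr(start, position - start);
        start = position + 1;  // step over the delimiter
    }
    out.email = line.substr(start);  // everything after the last '|'
    return true;
}
```

Because every search is checked against std::string::npos, a short or malformed line is rejected instead of being read past its end.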
If you don't terminate a C-style string with a null character there is no way to determine where the string ends. Thus, you'll need to terminate the strings.
I would personally read the data into std::string objects:
std::string first, last, etc;
while (std::getline(readFromFile, first, '|')
&& std::getline(readFromFile, last, '|')
&& std::getline(readFromFile, etc)) {
// do something with the input
}
std::endl is a manipulator implemented as a function template. You can't compare a char with that. There is also hardly ever a reason to use std::endl because it flushes the stream after adding a newline which makes writing really slow. You probably meant to compare to a newline character, i.e., to '\n'. However, since you read the string with std::getline() the line break character will already be removed! You need to make sure you don't access more than temp.size() characters otherwise.
Your record also contains arrays of strings rather than arrays of characters and you assign individual chars to them. You either wanted to use char something[SIZE] or you'd store strings!

Simple C++ File I/O issue

It's been a while since I've worked with File I/O in C++ (and just C++ in general) but I recently decided to use it to make a small console project for a friend.
My issue is that I'm having some issues with a string array and File I/O (I'm not sure which is causing the problem). My code is as follows (ReadPSWDS is an ifstream):
int i = 0;
string str[200];
ReadPSWDS.clear();
ReadPSWDS.open("myPasswords.DoNotOpen");
if(ReadPSWDS.is_open())
{
while(!ReadPSWDS.eof())
{
getline(ReadPSWDS, str[i]); //Store the line
if(str[i].length()<1 || str[i] == "")
{
//Ignore the line if it's nothing
}
else
{
i++; //Move onto the next 'cell' in the array
}
}
}
ReadPSWDS.close();
My issue is that on testing this out, the string array would appear to be empty (and on writing all those lines to a file, the file is empty as expected).
Why is the string array empty and not filled with the appropriate lines of the text file?
Regards,
Joe
The loop you've written is clearly wrong: you're testing eof() before
reading, and you're not testing for failure after the getline. C++
I/O isn't predictive. (It can't be, since whether you're at eof() will
depend on what you try to read.) The correct pattern would be:
while ( i < size(str) && getline( ReadPSWDS, str[i] ) ) {
    if ( !str[i].empty() ) {
        ++ i;
    }
}
Note that I've added a test for i. As written, if your file contains
more than 200 lines, you're in deep trouble.
I'm not sure that this is your problem, however; the loop as you've
written it will normally only cause problems at the end of the file.
(Typically, if the last line ends with a '\n', the loop runs one extra
time and getline stores an empty string, which your code ignores.)
Unless, of course, your file does contain more than 200 lines.
I might add that an even more typical idiom would be to make str an
std::vector<std::string>, and write the loop:
std::string line;
while ( std::getline( ReadPSWDS, line ) ) {
if ( !line.empty() ) {
str.push_back(line);
}
}
This avoids having to define a fixed maximum number of lines.
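To see the difference between the two loop conditions, a small experiment along these lines may help. It is only a sketch: both function names are made up, and a std::istringstream stands in for the file:

```cpp
#include <sstream>
#include <string>

// Number of getline calls made by a loop controlled by eof().
int reads_with_eof_loop(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    int reads = 0;
    while (!in.eof()) {
        std::getline(in, line);  // result not checked, as in the question
        ++reads;
    }
    return reads;
}

// Number of successful reads when getline itself is the loop condition.
int reads_with_getline_loop(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    int reads = 0;
    while (std::getline(in, line)) ++reads;
    return reads;
}
```

For the two-line input "a\nb\n" the eof() version performs three reads, the last of which fails, while the getline-controlled loop performs exactly two.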