Simple string parsing without using boost [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Splitting a string in C++
I'm working on an assignment for my C++ class and I was hoping I could get some help. One of my biggest problems in coding with C++ is parsing strings. I have found longer more complicated ways to parse strings but I have a very simple program I need to write which only needs to parse a string into 2 sections: a command and a data section. For instance: Insert 25 which will split it into Insert and 25.
I was planning on using an array of strings to store the data since I know that it will only split the string into 2 sections. However I also need to be able to read in strings that require no parsing such as Quit
What is the simplest way to accomplish this without using an outside library such as boost?

The simplest may be like this:
string s;
int i;
cin >> s;
if (s == "Insert")
{
cin >> i;
... // do stuff
}
else if (s == "Quit")
{
exit(0);
}
else
{
cout << "No good\n";
}
The simplest way may not be good enough if you need, for example, good handling of user errors or extensibility.
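As a rough sketch of what better error handling could look like, the function below checks every extraction instead of assuming it worked. The names Command and parse_command are made up for this illustration and are not part of the answer above; it also assumes that Insert is the only command that takes a number.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch.
struct Command {
    std::string name;   // e.g. "Insert" or "Quit"
    bool has_value;     // true only when a number followed the name
    int value;          // meaningful only when has_value is true
};

// Reads one command from `in`. Returns false when no command could be read
// or when "Insert" is not followed by a valid integer.
bool parse_command(std::istream& in, Command& out)
{
    out = Command{"", false, 0};
    if (!(in >> out.name))
        return false;            // end of input or stream error
    if (out.name == "Insert") {
        if (!(in >> out.value)) {
            in.clear();          // reset the error state so the caller can keep reading
            return false;        // "Insert" without a usable number
        }
        out.has_value = true;
    }
    return true;
}
```

Because it takes any std::istream, the same function can be called with cin, a file stream, or an istringstream built from a single line.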

You can read a line from a stream using getline, and then split it by finding the first position of a space character ' ' within the string and calling substr twice (once for the command to the left of the space and once for the data to the right of it).
string line;
while (getline(cin, line)) { // stop as soon as a read fails, e.g. at end of input
size_t pos = line.find(' ');
string cmd, data;
if (pos != string::npos) {
cmd = line.substr(0, pos); // the pos characters before the space
data = line.substr(pos+1);
} else {
cmd = line;
}
cerr << "'" << cmd << "' - '" << data << "'" << endl;
}

This is another way :
string s("Insert 25");
istringstream iss(s);
string command; int value;
while (iss >> command >> value) // runs only when both extractions succeed
{
cout << "Values: " << command << " " << value << endl;
}

I like using streams for such things.
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
int main()
{
int Value;
std::string Identifier;
std::stringstream ss;
std::multimap<std::string, int> MyCollection;
ss << "Value 25\nValue 23\nValue 19";
while (ss >> Identifier >> Value) // stops at end of input or on a malformed entry
{
MyCollection.insert(std::pair<std::string, int>(Identifier, Value));
}
for(std::multimap<std::string, int>::iterator it = MyCollection.begin(); it != MyCollection.end(); it++)
{
std::cout << it->first << std::endl;
std::cout << it->second << std::endl;
}
std::cin.get();
return 0;
}
This way you can already convert your data into the needed format, and the stream automatically splits on whitespace. It works the same way with std::fstream if you're working with files.

Related

Assigning splitted string into substrings

My problem is rather simple yet I can't get my head around it.
I was searching through the internet of course, but all solutions I found were using std::vectors and I'm not allowed to use them.
I have the following string:
std::string str = "Tom and Jerry";
I want to split this string using space as a delimiter, and then assign the three words into three different strings.
//this is what I am trying to achieve
std::string substr1 = "Tom";
std::string substr2 = "and";
std::string substr3 = "Jerry";
This is how I am splitting the string by the space as a delimiter:
std::string buf;
std::string background;
std::stringstream ss(str);
while (ss >> buf) {
if (buf == " ")
background = buf; // don't really understand that part
std::cout << "splitted strings: " << buf << std::endl;
}
But I have no idea when and how I should assign the split strings to substr1, substr2, and substr3. Could anyone explain how to add the assignment part to this?
I have tried some weird stuff like:
std::string substr1, substr2, substr3;
int counter = 1;
while (ss >> buf) {
if (buf == " ")
background = buf; // don't really understand that part
counter = 1;
if (counter == 1) {
substr1 = buf;
std::cout << "substr1 (Tom): " << substr1 << std::endl;
counter++;
}
else if (counter == 2) {
substr2 = buf;
std::cout << "substr2 (and): " << substr2 << std::endl;
counter++;
}
else if (counter == 3) {
substr3 = buf;
std::cout << "substr3 (Jerry): " << substr3 << std::endl;
counter++;
}
Thanks.
You can simply do ss >> substr1; ss >> substr2; ss >> substr3;. The >> operator works exactly with spaces as separator.
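A minimal sketch of that suggestion, with no vector involved. The helper name split_three is invented for this example; it assumes the input holds at least three whitespace-separated words and reports failure otherwise.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: extracts the first three whitespace-separated words
// of str into a, b and c. Returns false if fewer than three words exist.
bool split_three(const std::string& str,
                 std::string& a, std::string& b, std::string& c)
{
    std::istringstream ss(str);
    // Each >> skips leading whitespace and reads up to the next whitespace.
    return static_cast<bool>(ss >> a >> b >> c);
}
```

Called with "Tom and Jerry", the three output strings receive "Tom", "and", and "Jerry" in that order.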
Code
Inside the while loop, ss >> buf reads one whitespace-delimited word from ss into buf on each iteration. "Tom and Jerry" contains two spaces, so the loop runs three times, once per word. Because the spaces are skipped, buf never holds a space itself, so the check buf == " " is never true.

How do i check if a string format is valid while reading from a text file in C++? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I'm quite new to programming in C++ and I'm trying to learn how to validate whether a given word/string in a text file is in the right format. I have a text file that contains a line like this: Car,Red,ZX342DC. The line contains the type of vehicle, color, and plate. While I'm reading the line, I want to check that the string ZX342DC consists of 2 upper-case letters followed by 3 digits and 2 upper-case letters before assigning it to a string object. If any of these conditions is not met, I want to flag an error saying there's an invalid entry in line number " ", ignore the line and move on to the next line in the file.
Your question doesn't say what the input text file is called. Let's suppose it is named test.txt and is located in the same directory as the C++ source file, test.cpp.
test.txt
Car,Red,ZX342DC
test.cpp
#include <fstream>
#include <string>
#include <regex>
#include <iostream>
using namespace std;
int main()
{
// Let's start with variable declaration and initialization
// First you have to open the text file,
// for which you need `ifstream`.
// It is responsible for decoding text files.
ifstream ifs("test.txt");
// Provide a temporary storage for each line
string line;
// Regular expression pattern object
// This is what you need
// in order to validate each line
// and extract data from each line,
regex p("(\\w+),(\\w+),([A-Z]{2}[0-9]{3}[A-Z]{2})");
// Match object
smatch m;
// Check if the file stream is opened
// otherwise you might have some problems
// to continue the following steps
if (ifs.is_open())
{
// Then you need to traverse over each line in the file.
for (int lineNum = 1;
getline(ifs, line);
lineNum++)
{
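// Validate the line; regex_match requires the whole line to fit the pattern,
// whereas regex_search would also accept a line with extra characters around it
if (regex_match(line, m, p) && m.size() > 3)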
{
cout << "Pattern matched." << endl;
cout << endl;
string vehicle = m.str(1);
string color = m.str(2);
string plate = m.str(3);
cout << "Vehicle:\t" << vehicle << endl;
cout << "Color :\t" << color << endl;
cout << "Plate :\t" << plate << endl;
// You can insert some codes here to handle the results
cout << endl;
}
else
{
cerr << "There's an invalid entry in line number - " << lineNum << "!" << endl;
}
}
// The stream is closed automatically when ifs goes out of scope,
// so this explicit close() is optional
ifs.close();
}
else
{
cerr << "Failed to open input file!" << endl;
}
return 0;
}
You can do something along the following lines:
#include<iostream>
#include<string>
#include<regex>
int main()
{
std::string input{"Car,Red,ZX342DC"};
std::regex regex{R"(([[:alpha:]]+),([[:alpha:]]+),([A-Z]{2}[0-9]{3}[A-Z]{2}))"};
std::smatch match;
if (std::regex_match(input, match, regex)) { // the whole string must fit the pattern
std::cout << "Found" << "\n";
} else {
std::cout << "Not found" << "\n";
}
}
As others have stated, Regular Expressions are a good way to approach this. However, for a complete beginner in C++, they can be a bit overwhelming if you are not familiar with them in other languages.
Here is an alternative that doesn't use Regular Expressions:
inline bool IsInRange(char c, char lower, char upper)
{
return ((c >= lower) && (c <= upper));
}
inline bool IsUpper(char c)
{
return IsInRange(c, 'A', 'Z');
}
inline bool IsDigit(char c)
{
return IsInRange(c, '0', '9');
}
bool IsValidPlate(const std::string &plate)
{
return (
(plate.size() == 7) &&
IsUpper(plate[0]) &&
IsUpper(plate[1]) &&
IsDigit(plate[2]) &&
IsDigit(plate[3]) &&
IsDigit(plate[4]) &&
IsUpper(plate[5]) &&
IsUpper(plate[6])
);
}
struct VehicleInfo
{
std::string vehicle;
std::string color;
std::string plate;
};
bool ParseVehicleInfo(const std::string &line, VehicleInfo &info)
{
std::istringstream iss(line);
return (
std::getline(iss, info.vehicle, ',') &&
std::getline(iss, info.color, ',') &&
std::getline(iss, info.plate) &&
IsValidPlate(info.plate)
);
}
...
std::ifstream inputFile("file.txt");
std::string line;
int lineNum = 0;
while (std::getline(inputFile, line))
{
++lineNum;
VehicleInfo info;
if (!ParseVehicleInfo(line, info))
{
std::cout << "invalid data on line " << lineNum << std::endl;
}
else
{
// use info as needed...
}
}
What you are looking for is called "Regular Expressions", or more commonly, Regex. C++11 offers regular expressions as a standard library feature (the <regex> header).

Split String with math expression

How can I split a string into parts around a math operator? For example, I want to split 4567*6789 into three parts:
First:4567 Operation:* Second:6789
The input comes from a text file.
char operation;
while (getline(ifs, line)){
stringstream ss(line.c_str());
char str;
//get string from stringstream
//delimiter here + - * / to split string to two part
while (ss >> str) {
if (ispunct(str)) {
operation = str;
}
}
}
Maybe, just maybe, by thinking this out, we can come up with a solution.
We know that operator>> will stop processing when it encounters a character that is not a digit. So we can use this fact.
int multiplier = 0;
ss >> multiplier;
The next characters are not digits, so they could be an operator character.
What happens if we read in a character:
char operation = '?';
ss >> operation;
Oh, I forgot to mention that the operator>> will skip spaces by default.
Lastly, we can input the second number:
int multiplicand = 0;
ss >> multiplicand;
To confirm, let's print out what we have read in:
std::cout << "First Number: " << multiplier << "\n";
std::cout << "Operation : " << operation << "\n";
std::cout << "Second Number: " << multiplicand << "\n";
Using a debugger here will help show what is happening as each statement is executed, one at a time.
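Put together, the three extractions above might look like the sketch below. The struct Expression and the function parse_expression are names invented for this example, and the check that the operator is one of + - * / is an added assumption about what counts as valid input.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the three parts of a line such as "4567*6789".
struct Expression {
    int lhs;
    char op;
    int rhs;
};

// Returns true when the line has the form <int> <operator> <int>,
// with optional whitespace between the parts.
bool parse_expression(const std::string& line, Expression& out)
{
    std::istringstream ss(line);
    // The first >> stops at the operator, the second reads that single
    // character, and the third reads the remaining number.
    if (!(ss >> out.lhs >> out.op >> out.rhs))
        return false;
    return out.op == '+' || out.op == '-' || out.op == '*' || out.op == '/';
}
```

Since operator>> skips whitespace by default, "12 + 3" is handled the same way as "12+3".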
Edit 1: More complicated
You can always get more complicated and use a parser, lexer or write your own. A good method of implementation is to use a state machine.
For example, you would read a single character, then decide what to do with it depending on the state. For example, if the character is a digit, you may want to build a number. For a character (other than white space), convert it to a token and store it somewhere.
There are parse trees and other data structures which can ease the operation of parsing. There are parsing libraries out there too, such as boost::spirit, yacc, bison, flex and lex.
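To make the state machine idea a little more concrete, here is a small sketch of a tokenizer with two states. Token, tokenize, and the state names are all made up for this example, and it only knows about unsigned integers and single-character symbols, so treat it as a starting point and not a full lexer.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token type: either a run of digits or a single symbol.
struct Token {
    enum Kind { Number, Symbol } kind;
    std::string text;
};

std::vector<Token> tokenize(const std::string& input)
{
    enum State { Idle, InNumber };
    State state = Idle;
    std::vector<Token> tokens;
    std::string current;

    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const unsigned char ch = static_cast<unsigned char>(input[i]);
        if (std::isdigit(ch)) {
            current += static_cast<char>(ch);   // keep building the number
            state = InNumber;
            continue;
        }
        if (state == InNumber) {                // a non-digit ends the number
            tokens.push_back(Token{Token::Number, current});
            current.clear();
            state = Idle;
        }
        if (!std::isspace(ch))                  // anything else visible is a symbol
            tokens.push_back(Token{Token::Symbol, std::string(1, static_cast<char>(ch))});
    }
    if (state == InNumber)                      // flush a number at end of input
        tokens.push_back(Token{Token::Number, current});
    return tokens;
}
```

For "4567*6789" this yields three tokens: the number 4567, the symbol *, and the number 6789.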
One way is:
char opr;
int firstNumber, SecondNumber;
ss>>firstNumber>>opr>>SecondNumber;
instead of:
while (ss >> str) {
if (ispunct(str)) {
operation = str;
}
}
Or use a regex for more complex expressions.
If you have a string at hand, you could simply split the string into left and right at the operator position as follows:
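char* linePtr = strdup("4567*6789"); // strdup to preserve original value
char* op = linePtr ? strpbrk(linePtr, "+-*/") : NULL; // first operator character, if any
if (op) {
string opStr(op,1);
*op = 0x0;
string lhs(linePtr);
string rhs(op+1);
cout << lhs << " " << opStr << " " << rhs;
}
free(linePtr); // strdup allocates with malloc, so release the copy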
A simple solution would be to use sscanf:
int left, right;
char o;
if (sscanf("4567*6789", "%d%c%d", &left, &o, &right) == 3) {
// scan valid...
cout << left << " " << o << " " << right;
}
My proposal is to create two functions. The first one finds the position of the operator:
std::size_t delimiter_pos(const std::string& line)
{
// find_first_of returns the position of the first character that is any of + - * /
// (or std::string::npos if there is none)
return line.find_first_of("+-*/");
}
The second function splits the line into the operands and the operator:
void parse(const std::string& line)
{
std::size_t pos = delimiter_pos(line);
if (pos != std::string::npos)
{
std::string first = line.substr(0, pos);
char operation = line[pos];
std::string second = line.substr(pos + 1, line.size() - (pos + 1));
}
}
I hope my examples helped you

Splitting input string c++ [duplicate]

This question already has answers here:
How do I iterate over the words of a string?
(84 answers)
Closed 6 years ago.
I am reading in a file, that contains data in this format on each line. 30304 Homer Simpson I need to be able to pass this to the following constructor, the integer being the regNo, the name the rest of the string, and every student would have their own map of marks.
Student::Student (string const& name, int regNo):Person(name)
{
this->regNo = regNo; // assign the parameter to the member (assuming Student has a regNo member)
map<string, float> marks;
}
I then have to add each student to a collection of students, which would be best, and how do I do this?
So far all I've got is getting the file name and checking it exists.
int main()
{
//Get file names
string studentsFile, resultsFile, line;
cout << "Enter the Students file: ";
getline(cin, studentsFile);
cout << "Enter the results file: ";
getline(cin, resultsFile);
//Check for students file
ifstream students_stream(studentsFile);
if (!students_stream) {
cout << "Unable to open " << studentsFile << "\n";
return 1;
}
}
I tried using getline with 3 arguments and " " as the delimiter but that would also split the name part of the string, so I'm not sure how to do this another way.
Replace std::cin with your input file stream, of course. It is probably wise to trim the name, unless you are 100% sure the input is well formatted. I added only minimal error handling, just enough to survive bad input.
Names with one, three, or more words are read as well, as any real-world application should allow.
#include <iostream>
#include <string>
#include <stdexcept>
int main()
{
std::string line, name;
unsigned long long regNo;
size_t nameOfs;
while (true) {
// Read a full non-empty line from the input stream.
// The result of getline must be checked: after a failed read, line may keep its previous content.
if (!std::getline(std::cin, line) || line.empty()) break;
// parse values:
// 1. unsigned long long ending with single white space as "regNo"
// 2. remaining part of string is "name"
try {
regNo = std::stoull(line, &nameOfs);
name = line.substr(nameOfs + 1);
}
catch(const std::logic_error & regNoException) {
// in case of invalid input format, just stop processing
std::cout << "Invalid regNo or name in line: [" << line << "]";
break;
}
// here values regNo + name are parsed -> insert them into some vector/etc.
std::cout << "RegNo [" << regNo << "] name [" << name << "]\n";
}
}
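The answer above suggests trimming the name. One possible way to do that with only std::string member functions is sketched below; trim is a hypothetical helper name, and the set of characters treated as whitespace (space, tab, carriage return, newline) is an assumption you may want to adjust.

```cpp
#include <string>

// Hypothetical helper: returns s without leading and trailing whitespace.
std::string trim(const std::string& s)
{
    const char* const whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos)
        return "";                               // empty or whitespace only
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

In the loop above it could be applied as name = trim(line.substr(nameOfs)), which would also cope with more than one space between the number and the name.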
A regular expression could be used:
We can then select group 2 and 3 from the result.
std::vector<Student> students;
std::regex r{R"(((\d+) )(.+))"};
for(std::string line; getline(students_stream, line);) {
auto it = std::sregex_iterator(line.begin(), line.end(), r);
auto end = std::sregex_iterator();
if(it == end || it->size() != 4)
throw std::runtime_error("Could not parse line containing the following text: " + line);
for(; it != end; ++it) {
auto match = *it;
auto regNo_text = match[2].str();
auto regNo{std::stoi(regNo_text)};
auto name = match[3].str();
students.emplace_back(name, regNo);
}
}
You can take input using getline()and read one complete line(no third argument) and then use stringstream to extract the number and the remaining string. Example of stringstream:
string s = "30304 Homer Simpson", name;
stringstream ss(s);
int num;
ss >> num; //num = 30304
getline(ss >> ws, name); //skip the separating space first, then name = Homer Simpson
cout << num;
cout << name;

Example for file input to structure members?

I have the following structure:
struct productInfo
{
int item;
string details;
double cost;
};
I have a file that will input 10 different products that each contain an item, details, and cost. I have tried to input it using inFile.getline but it just doesn't work. Can anyone give me an example of how to do this? I would appreciate it.
Edit
The file contains 10 lines that look like this:
570314,SanDisk Sansa Clip 8 GB MP3 Player Black,55.99
Can you provide an example please.
Edit
Sorry guys, I am new to C++ and I don't really understand the suggestions. This is what I have tried.
void readFile(ifstream & inFile, productInfo products[])
{
inFile.ignore(LINE_LEN,'\n'); // The first line is not needed
for (int index = 0; index < 10; index++)
{
inFile.getline(products[index].item,SIZE,DELIMETER);
inFile.getline(products[index].details,SIZE,DELIMETER);
inFile.getline(products[index].cost,SIZE,DELIMETER);
}
}
This is another approach that uses fstream to read the file and getline() to read each line on the file. The parsing of the line itself was left out on purpose since other posts have already done that.
After each line is read and parsed into a productInfo, the application stores it on a vector, so all products could be accessed in memory.
#include <iostream>
#include <fstream>
#include <vector>
#include <iterator>
#include <string>
using namespace std;
struct productInfo
{
int item;
string details;
double cost;
};
int main()
{
vector<productInfo> product_list;
ifstream InFile("list.txt");
if (!InFile)
{
cerr << "Couldn't open input file" << endl;
return -1;
}
string line;
while (getline(InFile, line))
{ // from here on, check the post: How to parse complex string with C++ ?
// https://stackoverflow.com/questions/2073054/how-to-parse-complex-string-with-c
// to know how to break the string using comma ',' as a token
cout << line << endl;
// productInfo new_product;
// new_product.item =
// new_product.details =
// new_product.cost =
// product_list.push_back(new_product);
}
// Loop the list printing each item
// for (int i = 0; i < product_list.size(); i++)
// cout << "Item #" << i << " number:" << product_list[i].item <<
// " details:" << product_list[i].details <<
// " cost:" << product_list[i].cost << endl;
}
EDIT: I decided to take a shot at parsing the line and wrote the code below. Some C++ folks might not like the strtok() method of handling things but there it is.
string line;
while (getline(InFile, line))
{
if (line.empty())
break;
//cout << "***** Parsing: " << line << " *****" << endl;
productInfo new_product;
// My favorite parsing method: strtok()
// strtok writes into its argument, so pass the string's own buffer instead of c_str()
char *tmp = strtok(&line[0], ",");
stringstream ss_item(tmp);
ss_item >> new_product.item;
//cout << "item: " << tmp << endl;
//cout << "item: " << new_product.item << endl;
tmp = strtok(NULL, ",");
new_product.details += tmp;
//cout << "details: " << tmp << endl;
//cout << "details: " << new_product.details << endl;
tmp = strtok(NULL, " ");
stringstream ss_cost(tmp);
ss_cost >> new_product.cost;
//cout << "cost: " << tmp << endl;
//cout << "cost: " << new_product.cost << endl;
product_list.push_back(new_product);
}
It depends on what's in the file. If it's text, you can use the stream extraction operator (>>) on a file input stream:
int i;
infile >> i;
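If it's binary and the struct holds only plain data members, you can just read it in to &your_struct. That does not apply to productInfo as shown, because it contains a string.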
You have to
0) Create a new instance of productInfo, pinfo;
1) read text (using getline) to the first comma (','), convert this string to an int, and put it into pinfo.item.
2) read text to the next comma and put it into pinfo.details;
3) read text to the endline, convert the string to a double, and put it into pinfo.cost.
Then just keep doing this until you reach the end of the file.
Here is how I would use getline. Note that I use it once to read from the input file, and then again to chop that line at ",".
istream& operator>>(istream& is, productInfo& pi)
{
string line;
getline(is, line); // fetch one line of input
stringstream sline(line);
string item;
getline(sline, item, ',');
stringstream(item) >> pi.item; // convert string to int
getline(sline, item, ',');
pi.details = item; // string: no conversion necessary
getline(sline, item);
stringstream(item) >> pi.cost; // convert string to double
return is;
}
// usage:
// productInfo pi; ifstream inFile ("inputfile.txt"); inFile >> pi;
N.b.: This program is buggy if the input is
99999,"The Best Knife, Ever!",16.95
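One way to cope with that input, assuming only the details field can contain commas, is to split at the first and the last comma instead of at every comma. The sketch below does that; parse_product_line is a name made up for this example, and it strips at most one pair of surrounding double quotes. It is not a general CSV parser.

```cpp
#include <sstream>
#include <string>

struct productInfo {
    int item;
    std::string details;
    double cost;
};

// Hypothetical helper: item is everything before the first comma, cost is
// everything after the last comma, details is whatever lies in between.
bool parse_product_line(const std::string& line, productInfo& pi)
{
    const std::string::size_type first = line.find(',');
    const std::string::size_type last = line.rfind(',');
    if (first == std::string::npos || first == last)
        return false;                            // fewer than two commas

    std::istringstream item_stream(line.substr(0, first));
    std::istringstream cost_stream(line.substr(last + 1));
    if (!(item_stream >> pi.item) || !(cost_stream >> pi.cost))
        return false;                            // item or cost is not a number

    pi.details = line.substr(first + 1, last - first - 1);
    // Drop one pair of surrounding double quotes, if present.
    if (pi.details.size() >= 2 && pi.details.front() == '"' && pi.details.back() == '"')
        pi.details = pi.details.substr(1, pi.details.size() - 2);
    return true;
}
```

Lines without quotes, such as the SanDisk example from the question, go through the same code path unchanged.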