Split String with math expression - c++

How do I split the string into two parts once I have identified the math operator? For example, for 4567*6789 I want to split the string into three parts:
First: 4567 Operation: * Second: 6789
The input comes from a text file.
char operation;
while (getline(ifs, line)){
stringstream ss(line.c_str());
char str;
//get string from stringstream
//delimiter here + - * / to split string to two part
while (ss >> str) {
if (ispunct(str)) {
operation = str;
}
}
}

Maybe, just maybe, by thinking this out, we can come up with a solution.
We know that operator>> reading into an int stops when it encounters a character that is not a digit. We can use this fact.
int multiplier = 0;
ss >> multiplier;
The next characters are not digits, so they could be an operator character.
What happens if we read in a character:
char operation = '?';
ss >> operation;
Oh, I forgot to mention that the operator>> will skip spaces by default.
Lastly, we can input the second number:
int multiplicand = 0;
ss >> multiplicand;
To confirm, let's print out what we have read in:
std::cout << "First Number: " << multiplier << "\n";
std::cout << "Operation : " << operation << "\n";
std::cout << "Second Number: " << multiplicand << "\n";
Using a debugger here will help show what is happening as each statement is executed, one at a time.
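Putting the three reads together, a minimal sketch might look like the following. The `Expression` struct and the `parse_expression` name are made up for this illustration, and the code assumes C++17 for `std::optional`. It also rejects any operator character other than the four named in the question.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper type: the three parts of a line such as "4567*6789".
struct Expression {
    int  first;
    char operation;
    int  second;
};

// Reads number, operator, number from one line of text.
// Returns std::nullopt if the line does not have that shape.
std::optional<Expression> parse_expression(const std::string& line) {
    std::istringstream ss(line);
    Expression e{};
    if (!(ss >> e.first >> e.operation >> e.second)) {
        return std::nullopt;  // one of the three extractions failed
    }
    // Accept only the four operators mentioned in the question.
    if (std::string("+-*/").find(e.operation) == std::string::npos) {
        return std::nullopt;
    }
    return e;
}
```

For the line 4567*6789 this gives first = 4567, operation = '*' and second = 6789. Because operator>> skips whitespace, "4567 * 6789" parses the same way.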
Edit 1: More complicated
You can always get more complicated and use a parser, lexer or write your own. A good method of implementation is to use a state machine.
For example, you would read a single character, then decide what to do with it depending on the state. For example, if the character is a digit, you may want to build a number. For a character (other than white space), convert it to a token and store it somewhere.
There are parse trees and other data structures which can ease the operation of parsing. There are parsing libraries out there too, such as boost::spirit, yacc, bison, flex and lex.
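As a rough sketch of the state-machine idea, the function below walks the input one character at a time with two states, "idle" and "inside a number". The `Token` type and the `tokenize_expression` name are invented for this example. A real lexer would also handle signs, decimals and errors.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token type for this sketch.
struct Token {
    enum class Kind { Number, Symbol } kind;
    std::string text;
};

// Splits the input into number tokens and single-character symbol tokens.
std::vector<Token> tokenize_expression(const std::string& input) {
    enum class State { Idle, InNumber };
    State state = State::Idle;
    std::vector<Token> tokens;
    std::string number;

    for (char c : input) {
        const bool digit = std::isdigit(static_cast<unsigned char>(c)) != 0;
        if (state == State::InNumber && !digit) {
            // The number just ended: emit it and return to Idle.
            tokens.push_back({Token::Kind::Number, number});
            number.clear();
            state = State::Idle;
        }
        if (digit) {
            number += c;  // keep building the current number
            state = State::InNumber;
        } else if (!std::isspace(static_cast<unsigned char>(c))) {
            // Any other non-space character becomes its own token.
            tokens.push_back({Token::Kind::Symbol, std::string(1, c)});
        }
    }
    if (state == State::InNumber) {
        tokens.push_back({Token::Kind::Number, number});  // number at end of input
    }
    return tokens;
}
```

For "4567 * 6789" this produces three tokens: the number 4567, the symbol * and the number 6789.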

One way is:
char opr;
int firstNumber, SecondNumber;
ss>>firstNumber>>opr>>SecondNumber;
instead of:
while (ss >> str) {
if (ispunct(str)) {
operation = str;
}
}
Or use a regex for more complex expressions. Here is an example of using regex in math expressions.
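To illustrate the regex route, here is a small sketch using std::regex. The function name `split_with_regex` and the pattern itself are assumptions made for this example. The pattern only accepts unsigned integers around a single + - * or /.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: splits "4567*6789" into its three parts.
// Returns false if the line is not <digits> <operator> <digits>.
bool split_with_regex(const std::string& line,
                      std::string& first, char& operation, std::string& second) {
    // Group 1: digits, group 2: one of + - * /, group 3: digits.
    static const std::regex pattern(R"(\s*(\d+)\s*([-+*/])\s*(\d+)\s*)");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) {
        return false;
    }
    first = m[1].str();
    operation = m[2].str()[0];
    second = m[3].str();
    return true;
}
```

For a single operator this is heavier than the stream version. It becomes more useful when the accepted format needs to be validated strictly.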

If you have a string at hand, you could simply split the string into left and right at the operator position as follows:
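char* linePtr = strdup("4567*6789"); // work on a copy because the buffer is modified below
char* op = strpbrk(linePtr, "+-*/"); // first operator character, or NULL if there is none
if (op) {
string opStr(op,1);
*op = 0x0; // terminate the left-hand side at the operator
string lhs(linePtr);
string rhs(op+1);
cout << lhs << " " << opStr << " " << rhs;
}
free(linePtr); // strdup allocates with malloc, so release with free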

A simple solution would be to use sscanf:
int left, right;
char o;
if (sscanf("4567*6789", "%d%c%d", &left, &o, &right) == 3) {
// scan valid...
cout << left << " " << o << " " << right;
}

My proposal is to create two functions. The first finds the position of the operator:
std::size_t delimiter_pos(const std::string& line)
{
// position of the first + - * or /, or std::string::npos if there is none
return line.find_first_of("+-*/");
}
The second function extracts the operands:
void parse(const std::string& line)
{
std::size_t pos = delimiter_pos(line);
if (pos != std::string::npos)
{
std::string first = line.substr(0, pos);
char operation = line[pos];
std::string second = line.substr(pos + 1); // everything after the operator
}
}
I hope these examples help.

Related

strtok() only printing first word rest are (null)

I am trying to parse a large text file and split it up into single words using strtok. The delimiters remove all special characters, whitespace, and new lines. For some reason when I printf() it, it only prints the first word and a bunch of (null) for the rest.
ifstream textstream(textFile);
string textLine;
while (getline(textstream, textLine))
{
struct_ptr->numOfCharsProcessedFromFile[TESTFILEINDEX] += textLine.length() + 1;
char *line_c = new char[textLine.length() + 1]; // creates a character array the length of the line
strcpy(line_c, textLine.c_str()); // copies the line string into the character array
char *word = strtok(line_c, delimiters); // removes all unwanted characters
while (word != nullptr && wordCount(struct_ptr->dictRootNode, word) > struct_ptr->minNumOfWordsWithAPrefixForPrinting)
{
MyFile << word << ' ' << wordCount(struct_ptr->dictRootNode, word) << '\n'; // writes each word and number of times it appears as a prefix in the tree
word = strtok(NULL, delimiters); // move to next word
printf("%s", word);
}
}
Rather than jumping through the hoops necessary to use strtok, I'd write a little replacement that works directly with strings, without modifying its input, something on this general order:
std::vector<std::string> tokenize(std::string const &input, std::string const &delims = " ") {
std::vector<std::string> ret;
std::size_t start = 0; // the find_* functions return std::size_t, so npos compares correctly
while ((start = input.find_first_not_of(delims, start)) != std::string::npos) {
auto stop = input.find_first_of(delims, start+1);
ret.push_back(input.substr(start, stop-start));
start = stop;
}
return ret;
}
At least to me, this seems to simplify the rest of the code quite a bit:
std::string textLine;
while (std::getline(textStream, textLine)) {
struct_ptr->numOfCharsProcessedFromFile[TESTFILEINDEX] += textLine.length() + 1;
auto words = tokenize(textLine, delims);
for (auto const &word : words) {
MyFile << word << ' ' << wordCount(struct_ptr->dictRootNode, word) << '\n';
std::cout << word << '\n';
}
}
This also avoids (among other things) the massive memory leak you had, allocating memory every iteration of your loop, but never freeing any of it.
Move printf two lines UP.
while (word != nullptr && wordCount(struct_ptr->dictRootNode, word) > struct_ptr->minNumOfWordsWithAPrefixForPrinting)
{
printf("%s", word);
MyFile << word << ' ' << wordCount(struct_ptr->dictRootNode, word) << '\n'; // writes each word and number of times it appears as a prefix in the tree
word = strtok(NULL, delimiters); // move to next word
}
As @j23 pointed out, your printf is in the wrong location.
As @Jerry-Coffin points out, there are more modern, idiomatic C++ ways to accomplish what you are trying to do. Besides avoiding mutation, you can also avoid copying the words out of the text string. (In my code below we read line by line, but if you know the whole text fits into memory, you could just as well read the entire content into a std::string.)
Using std::string_view avoids extra copies, since it is essentially just a pointer into your string plus a length.
Here is how it looks for a use case where you do not need to store the words in another data structure, i.e. one-pass processing of words:
#include <iostream>
#include <fstream>
#include <string>
#include <string_view>
#include <cctype>
template <class F>
void with_lines(std::istream& stream, F body) {
for (std::string line; std::getline(stream,line);) {
body(line);
}
}
template <class F>
void with_words(std::istream& stream, F body) {
with_lines(stream,[&body](std::string& line) {
std::string_view line_view{line};
while (!line_view.empty()) {
// skip whitespaces
for (; !line_view.empty() && isspace(static_cast<unsigned char>(line_view[0]));
line_view.remove_prefix(1));
size_t position = 0;
for (; position < line_view.size() &&
!isspace(static_cast<unsigned char>(line_view[position]));
position++);
if (position > 0) {
body(line_view.substr(0,position));
line_view.remove_prefix(position);
}
}
});
}
int main (int argc, const char* argv[]) {
size_t word_count = 0;
std::ifstream stream{"input.txt"};
if(!stream) {
std::cerr
<< "could not open file input.txt" << std::endl;
return -1;
}
with_words(stream, [&word_count] (std::string_view word) {
std::cout << word_count << " " << word << std::endl;
word_count++;
});
std::cout
<< "input.txt contains "
<< word_count << " words."
<< std::endl;
return 0;
}

Assigning splitted string into substrings

My problem is rather simple yet I can't get my head around it.
I was searching through the internet of course, but all solutions I found were using std::vectors and I'm not allowed to use them.
I have the following string:
std::string str = "Tom and Jerry";
I want to split this string using space as a delimiter, and then assign the three words to three different strings.
//this is what I am trying to achieve
std::string substr1 = "Tom";
std::string substr2 = "and";
std::string substr3 = "Jerry";
This is how I am splitting the string by the space as a delimiter:
std::string buf;
std::string background;
std::stringstream ss(str);
while (ss >> buf) {
if (buf == " ")
background = buf; // don't really understand that part
std::cout << "splitted strings: " << buf << std::endl;
}
But I have no idea when and how I should assign the split strings to substr1, substr2 and substr3. Could anyone explain how to fit the assignment into this loop?
I have tried some weird stuff like:
std::string substr1, substr2, substr3;
int counter = 1;
while (ss >> buf) {
if (buf == " ")
background = buf; // don't really understand that part
counter = 1;
if (counter == 1) {
substr1 = buf;
std::cout << "substr1 (Tom): " << substr1 << std::endl;
counter++;
}
else if (counter == 2) {
substr2 = buf;
std::cout << "substr2 (and): " << substr2 << std::endl;
counter++;
}
else if (counter == 3) {
substr3 = buf;
std::cout << "substr3 (Jerry): " << substr3 << std::endl;
counter++;
}
Thanks.
You can simply do ss >> substr1; ss >> substr2; ss >> substr3;. The >> operator treats whitespace as the separator.
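Wrapped into a function, that suggestion could look like the sketch below. The name `split_three` is made up for this example, and it assumes the input contains at least three whitespace-separated words.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads the first three words into three named strings.
// Returns false if fewer than three words are present.
bool split_three(const std::string& str,
                 std::string& substr1, std::string& substr2, std::string& substr3) {
    std::istringstream ss(str);
    // Each >> skips leading whitespace and reads up to the next whitespace.
    return static_cast<bool>(ss >> substr1 >> substr2 >> substr3);
}
```

Called with "Tom and Jerry", it fills the three strings with "Tom", "and" and "Jerry". No vector or counter is needed.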
In the while loop, each ss >> buf reads the characters up to the next space into buf. "Tom and Jerry" contains two spaces, so the loop runs three times and buf holds "Tom", "and" and "Jerry" in turn. Because >> never stores the space itself, the test buf == " " is never true.

How to find a word which contains digits in a string

I need to check the words inside the string to see whether any of them contains digits, and if a word doesn't, erase it. Then print out the modified string.
Here's my attempt to solve the problem, but it doesn't work as I need it to:
void sentence_without_latin_character( std::string &s ) {
std::cout << std::endl;
std::istringstream is (s);
std::string word;
std::vector<std::string> words_with_other_characters;
while (is >> word) {
std::string::size_type temp_size = word.find(std::ctype_base::digit);
if (temp_size == std::string::npos) {
word.erase(word.begin(), word.begin() + temp_size);
}
words_with_other_characters.push_back(word);
}
for (const auto i: words_with_other_characters) {
std::cout << i << " ";
}
std::cout << std::endl;
}
This part is not doing what you think it does:
word.find(std::ctype_base::digit);
std::string::find only searches for complete substrings (or single characters).
If you want to search for a set of some characters in a string, use std::string::find_first_of instead.
Another option is testing each character using something like std::isdigit, possibly with an algorithm like std::any_of or with a simple loop.
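Both suggestions can be sketched as follows. The function names are invented for this example. `remove_words_with_digits` drops the words that contain a digit, which is one reading of the question. Invert the marked test to keep only those words instead.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// True if any character of the word is a decimal digit (std::any_of + std::isdigit).
bool contains_digit(const std::string& word) {
    return std::any_of(word.begin(), word.end(), [](unsigned char c) {
        return std::isdigit(c) != 0;
    });
}

// The same check with find_first_of and an explicit set of characters.
bool contains_digit_find(const std::string& word) {
    return word.find_first_of("0123456789") != std::string::npos;
}

// Hypothetical helper: rebuilds the sentence without the words that contain digits.
std::string remove_words_with_digits(const std::string& sentence) {
    std::istringstream is(sentence);
    std::string word, result;
    while (is >> word) {
        if (contains_digit(word)) continue;  // invert this test for the opposite filter
        if (!result.empty()) result += ' ';
        result += word;
    }
    return result;
}
```

The lambda takes an unsigned char because passing a negative char value to std::isdigit is undefined behavior.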
As Acorn explained, word.find(std::ctype_base::digit) does not search for the first digit. std::ctype_base::digit is a constant that indicates a digit to specific std::ctype methods. In fact there's a std::ctype method called scan_is that you can use for this purpose.
void sentence_without_latin_character( std::string &s ) {
std::istringstream is (s);
std::string word;
s.clear();
auto& ctype = std::use_facet<std::ctype<char>>(std::locale("en_US.utf8"));
while (is >> word) {
auto p = ctype.scan_is(std::ctype_base::digit, word.data(), &word.back()+1);
if (p == &word.back()+1) {
s += word;
if (is.peek() == ' ') s += ' ';
}
}
std::cout << s << std::endl;
}

c++ convert hexadecimal string with ":" to original "binary" string

I have the following code to convert an encrypted ciphertext to a readable hexadecimal format:
std::string convertToReadable(std::string ciphertext)
{
std::stringstream outText;
for(unsigned int i = 0; i < ciphertext.size(); i++ )
outText << std::hex << std::setw(2) << std::setfill('0') << (0xFF & static_cast<byte>(ciphertext[i])) << ":";
return outText.str();
}
The readable result of this function is something as:
56:5e:8b:a8:04:93:e2:f1:5c:20:8b:fd:f5:b7:22:0b:82:42:46:58:9b:d4:c1:8e:ac:62:85:04:ff:7f:c6:d3:
Now I need to do the way back, converting the readable format to the original ciphertext in order to decrypt it:
std::string convertFromReadable(std::string text)
{
std::istringstream cipherStream;
for(unsigned int i = 0; i < text.size(); i++ )
{
if (text.substr(i, 1) == ":")
continue;
std::string str = text.substr(i, 2);
std::istringstream buffer(str);
int value;
buffer >> std::hex >> value;
cipherStream << value;
}
return cipherStream.str();
}
This is not working at all, as I'm getting the wrong string back.
How can I fix the convertFromReadable() so that I can have the original ciphertext back ?
Thanks for helping
Here are problems that you should fix before debugging this any further:
cipherStream should be ostringstream, not istringstream
The for loop should stop two characters before the end. Otherwise your substr is going to fail. Make the loop condition i+2 < text.size()
When you read two characters from the input, you need to advance i by two, i.e. add i++ after the std::string str = text.substr(i, 2); line.
Since you want character output, add a cast to char when writing the data to cipherStream, i.e. cipherStream << (char)value
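With those changes applied, the question's function might look like the sketch below. It assumes the input has exactly the shape produced by convertToReadable, including the trailing ':'. Without the trailing colon, the loop condition would skip the last byte.

```cpp
#include <sstream>
#include <string>

// The question's function with the fixes listed above applied.
std::string convertFromReadable(const std::string& text)
{
    std::ostringstream cipherStream;                    // output stream, not istringstream
    for (std::size_t i = 0; i + 2 < text.size(); i++)   // stop before substr runs off the end
    {
        if (text[i] == ':')
            continue;
        std::istringstream buffer(text.substr(i, 2));   // two hex digits
        int value = 0;
        buffer >> std::hex >> value;
        i++;                                            // skip the second hex digit
        cipherStream << static_cast<char>(value);       // write one byte, not a decimal number
    }
    return cipherStream.str();
}
```

For example, the input 41:42: yields the two-character string AB.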
Good you got your code working. Just thought I'd illustrate a slightly simpler, more direct approach using streams without the fiddly index tracking and substr extraction:
std::string convertFromReadable(const std::string& text)
{
std::istringstream iss(text);
std::ostringstream cipherStream;
int n;
while (iss >> std::hex >> n)
{
cipherStream << (char)n;
// if there's another character it better be ':'
char c;
if (iss >> c && c != ':')
throw std::runtime_error("invalid character in cipher");
}
return cipherStream.str();
}
Note that after the last hex value, if there's no colon, the if (iss >> c... test will evaluate to false, as will the while (iss >> ... test, falling through to the return.

Simple string parsing without using boost [duplicate]

Possible Duplicate:
Splitting a string in C++
I'm working on an assignment for my C++ class and I was hoping I could get some help. One of my biggest problems in coding with C++ is parsing strings. I have found longer more complicated ways to parse strings but I have a very simple program I need to write which only needs to parse a string into 2 sections: a command and a data section. For instance: Insert 25 which will split it into Insert and 25.
I was planning on using an array of strings to store the data since I know that it will only split the string into 2 sections. However I also need to be able to read in strings that require no parsing such as Quit
What is the simplest way to accomplish this without using an outside library such as boost?
The simplest may be like this:
string s;
int i;
cin >> s;
if (s == "Insert")
{
cin >> i;
... // do stuff
}
else if (s == "Quit")
{
exit(0);
}
else
{
cout << "No good\n";
}
The simplest way may not be good enough if you need, for example, good handling of user errors or extensibility.
You can read lines from a stream using getline, and then split each one by finding the first position of a space character ' ' within the string and calling substr twice (once for the command to the left of the space and once for the data to the right of it).
string line;
while (getline(cin, line)) { // stop as soon as a read fails
size_t pos = line.find(' ');
string cmd, data;
if (pos != string::npos) {
cmd = line.substr(0, pos); // characters before the space
data = line.substr(pos+1); // characters after the space
} else {
cmd = line; // no data part, e.g. "Quit"
}
cerr << "'" << cmd << "' - '" << data << "'" << endl;
}
Here is a link to a demo on ideone.
This is another way :
string s("Insert 25");
istringstream iss(s);
string command; int value;
while (iss >> command >> value) // loop only while both extractions succeed
{
cout << "Values: " << command << " " << value << endl;
}
I like using streams for such things.
#include <iostream>
#include <map>
#include <sstream>
#include <string>
int main()
{
int Value;
std::string Identifier;
std::stringstream ss;
std::multimap<std::string, int> MyCollection;
ss << "Value 25\nValue 23\nValue 19";
// loop only while both extractions succeed
while(ss >> Identifier >> Value)
{
MyCollection.insert(std::pair<std::string, int>(Identifier, Value));
}
for(std::multimap<std::string, int>::iterator it = MyCollection.begin(); it != MyCollection.end(); it++)
{
std::cout << it->first << std::endl;
std::cout << it->second << std::endl;
}
std::cin.get();
return 0;
}
This way you can already convert your data into the needed format, and the stream automatically splits on whitespace. It works the same way with std::fstream if you're working with files.