How do you split a string embedded in a delimiter in C++?

I understand how to split a string by a delimiter in C++, but how do you split a string whose pieces are wrapped in a delimiter? For example, how would you split "~!hello~! random junk... ~!world~!" by the string "~!" into an array of ["hello", " random junk...", "world"]? Are there any C++ standard library functions for this or, if not, an algorithm that could achieve it?

#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Split s on every occurrence of delimiter; empty pieces are skipped.
vector<string> split(const string& s, const string& delimiter) {
    vector<string> res;
    if (delimiter.empty()) {             // an empty delimiter would never advance
        if (!s.empty()) res.push_back(s);
        return res;
    }
    size_t start = 0;                    // where the current piece begins
    size_t pos = s.find(delimiter);      // position of the next delimiter
    while (pos != string::npos) {
        if (pos > start)                 // skip the empty piece between adjacent delimiters
            res.push_back(s.substr(start, pos - start));
        start = pos + delimiter.length();  // continue after the delimiter
        pos = s.find(delimiter, start);    // position of the next delimiter, if any
    }
    if (start < s.size())                // text after the last delimiter, if any
        res.push_back(s.substr(start));
    return res;
}

int main() {
    string s = "~!hello~!random junk... ~!world~!";
    vector<string> words = split(s, "~!");
    for (size_t i = 0; i < words.size(); i++)
        cout << words[i] << endl;
    return 0;
}
The above program finds all the pieces that occur before, between, and after the delimiter you specify, and skips empty pieces such as the one before the leading "~!". With minor changes you can make the function suit your needs (for example, drop the pos > start check if you want to keep empty pieces).
For the input above, the function produces "hello", "random junk... " (with its trailing space), and "world".
I hope this answers your question!

Related

Having issues finding multiple substrings within a string

I am trying to write a program that compares two strings (a string and a substring) and increments a counter each time the substring is found within the string. However, using the standard:
if(str.find(substr) != string::npos)
{
count++;
}
I run into the problem that if the substring appears multiple times in the string it only increments once. So if the string is "test test test test" and the substring is "test" count only ends up being 1 instead of 4.
What would be the best way to fix this?
*Notes for context:
1) At one point I was checking the string character by character to see if they matched, but I had to scrap that when I ran into issues with words that contain smaller words.
Example: 'is' would get picked up inside the word 'this', etc.
2) The larger program that this is for accepts two vectors. The first vector holds one sentence typed in by the user per element (acting as the main string in the example above). The second vector holds each word from all the sentences entered into the first vector (acting as the substring in the example above). Not sure if that bit matters, but I figured I would throw it in.
Example:
vector<string> str {"this is line one", "this is line two", "this is line three"};
vector<string> substr {"is", "line", "one", "this", "three", "two"};
3) I'm thinking some way of doing the opposite of != string::npos would work, but I'm not sure that even exists.

You need a loop to find all of the occurrences of a substring in a given string.
However, since you want to differentiate substrings that are whole words from substrings in larger words, you need to parse the string to determine the whole words before you compare them.
You can use std::string::find_first_of() and std::string::find_first_not_of() to find the beginning and ending indexes of each whole word between desired delimiters (whitespace, punctuation, etc). You can use std::string::compare() to compare a substring between those two indexes to your desired substring. For example:
#include <string>

const std::string delims = ",. ";

size_t countWord(const std::string &str, const std::string &word)
{
    std::string::size_type start = 0, end;
    size_t count = 0;
    // Find the first character of the next word
    while ((start = str.find_first_not_of(delims, start)) != std::string::npos)
    {
        // Find the delimiter that ends the word
        end = str.find_first_of(delims, start+1);
        if (end == std::string::npos)
        {
            // Last word: it runs to the end of the string
            if (str.compare(start, str.size()-start, word) == 0)
                ++count;
            break;
        }
        if (str.compare(start, end-start, word) == 0)
            ++count;
        start = end + 1;
    }
    return count;
}
Alternatively, you can extract the whole words into a std::vector and then use std::count() to count how many elements match the substring. For example:
#include <string>
#include <vector>
#include <algorithm>

const std::string delims = ",. ";

size_t countWord(const std::string &str, const std::string &word)
{
    std::vector<std::string> vec;
    std::string::size_type start = 0, end;
    // Collect every whole word into vec
    while ((start = str.find_first_not_of(delims, start)) != std::string::npos)
    {
        end = str.find_first_of(delims, start+1);
        if (end == std::string::npos)
        {
            // Last word: it runs to the end of the string
            vec.push_back(str.substr(start));
            break;
        }
        vec.push_back(str.substr(start, end-start));
        start = end + 1;
    }
    // Count the words that are equal to the one we are looking for
    return std::count(vec.begin(), vec.end(), word);
}
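
For the simpler case where matches inside larger words should also count, the loop mentioned at the start of this answer can be sketched as follows. The name countOccurrences is hypothetical, and counting matches without overlap is an assumption about what is wanted, not something the question specifies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts every occurrence of sub in str,
// including occurrences inside larger words ("is" inside "this").
std::size_t countOccurrences(const std::string& str, const std::string& sub)
{
    if (sub.empty())
        return 0; // an empty pattern would match at every position

    std::size_t count = 0;
    std::string::size_type pos = 0;
    // Restart the search just past the previous match until find() fails
    while ((pos = str.find(sub, pos)) != std::string::npos)
    {
        ++count;
        pos += sub.size(); // step past this match (non-overlapping count)
    }
    return count;
}
```

To count overlapping matches instead, advance with ++pos rather than pos += sub.size().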

Reading an iostream until a string delimiter is found

I currently have a function that reads from a stream until a predefined stream-stopper is found. The only way I could currently get it up and running is by using std::getline and having one character followed by a newline (in my case char(3)) as my stream-stopper.
std::string readuntil(std::istream& in) {
std::string text;
std::getline(in, text, char(3));
return text;
}
Is there any way to achieve the same but with a longer string as my stream-stopper? I don't mind it having to be followed by a newline, but I want my delimiter to be a random string of some size so that the probability of it occurring by chance in the stream is very, very low.
Any idea how to achieve this?

I assume that your requirements are:
a function taking an istream reference and a string as parameters
the string is a delimiter, and the function must return a string containing all the characters that arrived before it
the stream must be positioned immediately after the delimiter for further processing
AFAIK, neither the C++ nor the C standard library contains a function for that. I would just:
1) read up to the last character of the delimiter into a temporary string
2) append that to an accumulated string
3) repeat the two steps above while the accumulated string does not end with the delimiter
4) optionally remove the delimiter from the end of the accumulated string
5) return the accumulated string
A possible C++ implementation is:
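#include <istream>
#include <string>

std::string readuntil(std::istream& in, const std::string& delimiter) {
    std::string cr;
    if (delimiter.empty())
        return cr;                       // nothing to search for
    char delim = *(delimiter.rbegin());  // last character of the delimiter
    size_t sz = delimiter.size(), tot;
    do {
        std::string temp;
        if (!std::getline(in, temp, delim))
            return cr;                   // nothing left to read: avoid looping forever
        cr += temp;
        if (in.eof())
            return cr;                   // stream ended before delim was seen
        cr += delim;                     // getline consumed delim, so put it back in the text
        tot = cr.size();
    } while ((tot < sz) || (cr.substr(tot - sz, sz) != delimiter));
    return cr.substr(0, tot - sz); // or return cr; if you want to keep the delimiter
}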
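
An alternative is to read one character at a time and check after each character whether the accumulated text ends with the delimiter. The sketch below is only an illustration: the name readUntilDelimiter is hypothetical, and returning whatever was read when the stream ends without the delimiter is an assumption about the desired behavior.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variant: reads character by character and stops as soon as
// the accumulated text ends with delimiter. The delimiter is consumed from
// the stream but is not part of the returned string.
std::string readUntilDelimiter(std::istream& in, const std::string& delimiter)
{
    std::string text;
    if (delimiter.empty())
        return text; // nothing to search for

    char c;
    while (in.get(c))
    {
        text += c;
        const std::string::size_type n = delimiter.size();
        // Compare the tail of text against the delimiter
        if (text.size() >= n && text.compare(text.size() - n, n, delimiter) == 0)
        {
            text.erase(text.size() - n); // drop the delimiter itself
            return text;
        }
    }
    return text; // end of stream reached without seeing the delimiter
}
```

Called on a std::istringstream holding "hello<END>world" with the delimiter "<END>", this returns "hello" and leaves the stream positioned at "world".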

No output after running the program

To keep it short, I'm quite a beginner at c++ and I'm learning character sequences.
Here's my problem: I'm trying to change every word with an even number of letters to a symbol ( # ), but I think that I'm approaching the problem in a bad way. I get nothing when running it.
#include<iostream>
#include<string.h>
using namespace std;
int main()
{
char s[101];
cin.getline(s,101);
int i;
for(int i=0; i<strlen(s); i++)
{
if(strchr(s,' ')) // searching for a space
{}
else
if((strlen(s)%2==0)) //trying to find if the word has an even number
{
strcat(s,"#"); // I'm sticking the # character to the word and then deleting everything after #.
strcpy(s+i,s+i+1);
cout<<s;
}
else
cout<<"Doens't exist";
}
return 0;
}

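The only code path that does not reach a cout is
if(strchr(s,' ')) // searching for a space
{}
so start debugging there. strchr(s,' ') searches the whole string rather than the current word, so this branch is taken on every iteration whenever the input contains a space anywhere.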

Look at what happens if you input a single word with an even number of letters followed by a space, like "abcd ". Your program will search for a space five times and do nothing each time.

Here is the algorithm I came up with:
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    // declare input string and read it
    string s;
    getline(cin, s);
    // declare vector of strings to store words,
    // and string tmp to temporarily save word
    vector<string> vec;
    string tmp = "";
    // iterate through each character of input string
    for (auto c : s)
    {
        // if there is space push the word to vector,
        // clear tmp string
        if (c == ' ')
        {
            vec.push_back(tmp);
            tmp = "";
            continue;
        }
        // add character to temporary string
        tmp += c;
    }
    // push last word to vector
    vec.push_back(tmp);
    // clear the input string
    s = "";
    // iterate through each word
    for (const auto& w : vec)
    {
        // if the word has an odd number of characters
        // just add the word itself,
        // otherwise add the '#' symbol
        if (w.size() % 2)
            s += w;
        else
            s += '#';
        s += ' ';
    }
    // remove last space
    if (!s.empty())
        s.erase(s.size() - 1);
    cout << s;
}

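Your algorithm does not work as written. First you should separate the input into words at each space; a check like
if(strchr(s,' '))
only tells you whether a space exists somewhere in the string. Then you should find the length of each separated word and replace the word with # when that length is even.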
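
The two steps described above (separate the words, then test each word's length) can be sketched with std::istringstream. This is a minimal illustration, not a fix of the original char-array program: the name maskEvenWords is hypothetical, and it assumes that collapsing runs of whitespace into single spaces is acceptable.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: replaces every word that has an even number of
// characters with "#". Words are whatever operator>> extracts, so runs
// of whitespace collapse to a single space in the result.
std::string maskEvenWords(const std::string& line)
{
    std::istringstream in(line);
    std::string word;
    std::string result;
    while (in >> word) // step 1: separate the words
    {
        if (!result.empty())
            result += ' ';
        // step 2: test the length of this word only, not of the whole line
        if (word.size() % 2 == 0)
            result += '#';
        else
            result += word;
    }
    return result;
}
```

The key difference from the program in the question is that the length test is applied to one word at a time.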

How do you reverse the order of words (not the letters) in a string? [duplicate]

This question already has answers here:
Reversing order of words in a sentence
(8 answers)
Closed 9 years ago.
How would I go about reversing the order of words in a string? I tried this but it doesn't work:
string sen = "Go over there";
reverse(sen.begin(), sen.end());
This reverses the entire string, letters included, instead of keeping each word intact. How do I reverse only the order of the words in the string?

I have written many string functions like this before:
// Make a copy of the original if you don't wish to destroy it in the process
string sen = "Go over there";
// string that will become your reversed string
string newStr;
// You may want to trim leading and trailing spaces here
// Loop through the string until the original is empty
while (!sen.empty()) {
    // Find the last separator character (in your case a space) within the string
    string::size_type aChar = sen.find_last_of(' ');
    if (aChar == string::npos) {
        // No separator left: what remains is the first word of the original
        newStr += sen;
        sen.clear();
    } else {
        // Append the word that starts one char after the last found separator
        newStr += sen.substr(aChar + 1);
        // Keep only the part of the original before the last found separator
        sen = sen.substr(0, aChar);
        // Need to add the space between the words, but only if something is left
        if (!sen.empty()) newStr += " ";
    }
}
I have not tested this code, but if the APIs work the way they are intended, algorithmically this should work.
As a function, this might look as follows:
#include <iostream>
#include <string>
using namespace std;

string reverseWords(const string& inStr) {
    // Make a copy of the original so we don't destroy it in the process
    string sen = inStr;
    // string that will become your reversed string
    string newStr;
    // You may want to trim leading and trailing spaces here
    // Loop through the string until the copy is empty
    while (!sen.empty()) {
        // Find the last separator character (in your case a space) within the string
        string::size_type aChar = sen.find_last_of(' ');
        if (aChar == string::npos) {
            // No separator left: what remains is the first word of the original
            newStr += sen;
            sen.clear();
        } else {
            // Append the word that starts one char after the last found separator
            newStr += sen.substr(aChar + 1);
            // Keep only the part of the copy before the last found separator
            sen = sen.substr(0, aChar);
            // Need to add the space between the words, but only if something is left
            if (!sen.empty()) newStr += " ";
        }
    }
    return newStr;
}

int main(int argc, char *argv[]) {
    string sen = "Go over there";
    string rev = reverseWords(sen);
    cout << rev << endl;
}

If the words inside the string are separated by spaces, you can use string.find() inside a while loop to locate the breaks, and then use string.substr() to output the words to a vector. Then you can simply read the vector backwards.
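
A minimal sketch of that approach, assuming single spaces are the only separators and that empty pieces (from leading, trailing, or repeated spaces) should be dropped. The name reverseWordOrder is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects the words with find()/substr(),
// then joins them while walking the vector backwards.
std::string reverseWordOrder(const std::string& sen)
{
    std::vector<std::string> words;
    std::string::size_type start = 0;
    std::string::size_type space;
    // Locate each break and store the word in front of it
    while ((space = sen.find(' ', start)) != std::string::npos)
    {
        if (space > start) // ignore empty pieces between adjacent spaces
            words.push_back(sen.substr(start, space - start));
        start = space + 1;
    }
    if (start < sen.size()) // the word after the last space
        words.push_back(sen.substr(start));

    // Read the vector backwards to build the result
    std::string result;
    for (auto it = words.rbegin(); it != words.rend(); ++it)
    {
        if (!result.empty())
            result += ' ';
        result += *it;
    }
    return result;
}
```

For the string from the question, reverseWordOrder("Go over there") returns "there over Go".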

Live Example
If you use C++11, sen.pop_back() gets rid of the last space; otherwise, see "Remove last character from C++ string" for other approaches. Instead of calling std::reverse(output.begin(), output.end()), we just use the reverse iterators rbegin() and rend(). The for loop could definitely be improved, but it does the job.
#include <sstream>
#include <iterator>
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

int main(int argc, char *argv[]) {
    std::vector<std::string> output;
    std::string sen = "Go over there";
    std::string word = "";
    unsigned int len = 0;
    for (const auto& c : sen) {
        ++len;
        if (c != ' ')
            word += c;
        // A space or the end of the sentence finishes the current word
        if (c == ' ' || len == sen.size()) {
            output.push_back(word);
            word = "";
        }
    }
    std::ostringstream oss;
    // Copy the words in reverse order, each followed by a space
    std::copy(output.rbegin(), output.rend(), std::ostream_iterator<std::string>(oss, " "));
    sen = oss.str();
    if (!sen.empty())
        sen.pop_back(); // Get rid of last space, C++11 only
    std::cout << sen;
}

Wordwrap function fix to preserve whitespace between words

Some time ago I was looking for a snippet to word-wrap text to a given line length without breaking up the words. It worked well enough, but now that I have started using it in an edit control, I noticed it eats up runs of multiple whitespace characters between words. I am contemplating how to fix it, or whether to get rid of it completely if wstringstream is not suitable for the task. Maybe someone out there has a similar function?
void WordWrap2(const std::wstring& inputString, std::vector<std::wstring>& outputString, unsigned int lineLength)
{
std::wstringstream iss(inputString);
std::wstring line;
std::wstring word;
while(iss >> word)
{
if (line.length() + word.length() > lineLength)
{
outputString.push_back(line+_T("\r"));
line.clear();
}
if( !word.empty() ) {
if( line.empty() ) line += word; else line += +L" " + word;
}
}
if (!line.empty())
{
outputString.push_back(line+_T("\r"));
}
}
The line delimiter appended to each wrapped line should remain \r.

Instead of reading a word at a time, and adding words until you'd exceed the desired line length, I'd start from the point where you want to wrap, and work backwards until you find a white-space character, then add that entire chunk to the output.
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

void WordWrap2(const std::wstring& inputString,
               std::vector<std::wstring>& outputString,
               unsigned int lineLength) {
    std::size_t last_pos = 0;
    std::size_t pos;
    for (pos = lineLength; pos < inputString.length(); pos += lineLength) {
        // Back up from the wrap point to the nearest white-space character
        while (pos > last_pos && !std::iswspace(inputString[pos]))
            --pos;
        outputString.push_back(inputString.substr(last_pos, pos - last_pos));
        last_pos = pos;
        // Skip the white space that falls on the line break
        while (last_pos < inputString.length() && std::iswspace(inputString[last_pos]))
            ++last_pos;
    }
    outputString.push_back(inputString.substr(last_pos));
}
As it stands, this will fail if it encounters a single word that's longer than the line length you've specified (in such a case, it probably should just break in the middle of the word, but it currently doesn't).
I've also written it to skip over whitespace between words when it falls at a line break. If you really don't want that, just eliminate the:
while (last_pos < inputString.length() && std::iswspace(inputString[last_pos]))
    ++last_pos;

If you don't want to lose space characters, you need to add the following line before doing any reads:
iss >> std::noskipws;
But then using >> with a string as the second argument won't work well with respect to spaces.
You'll have to resort to reading chars and managing them in an ad hoc manner yourself.
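
One way to manage the characters yourself is to first cut the text into alternating runs of whitespace and non-whitespace, so that nothing is discarded before wrapping. The sketch below shows only that tokenizing step; the name splitRuns is hypothetical and the wrapping logic on top of it is left out.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: splits text into alternating runs of whitespace
// and non-whitespace characters. Concatenating the runs gives back the
// original text, so no spaces are lost.
std::vector<std::wstring> splitRuns(const std::wstring& text)
{
    std::vector<std::wstring> runs;
    std::size_t i = 0;
    while (i < text.size())
    {
        // Classify the first character of the run
        const bool space = std::iswspace(text[i]) != 0;
        std::size_t j = i;
        // Extend the run while the characters are of the same kind
        while (j < text.size() && (std::iswspace(text[j]) != 0) == space)
            ++j;
        runs.push_back(text.substr(i, j - i));
        i = j;
    }
    return runs;
}
```

A wrapping function could then append runs to the current line and start a new line whenever the next non-whitespace run would exceed the line length.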
