I have a string which I input as follows
using namespace std;
string s;
getline(cin, s);
I input
a.b~c.d
I want to split the string at . and ~ but also want to store the delimiters. The split elements will be stored in a vector.
Final output should look like this
a
.
b
~
c
.
d
I saw a solution here, but it was in Java.
How do I achieve this in C++?
This solution is copied verbatim from this answer except for the commented lines:
std::stringstream stringStream(inputString);
std::string line;
while(std::getline(stringStream, line))
{
std::size_t prev = 0, pos;
while ((pos = line.find_first_of(".~", prev)) != std::string::npos) // only look for . and ~
{
if (pos > prev)
wordVector.push_back(line.substr(prev, pos-prev));
wordVector.push_back(line.substr(pos, 1)); // add delimiter
prev = pos+1;
}
if (prev < line.length())
wordVector.push_back(line.substr(prev, std::string::npos));
}
I haven't tested the code, but the basic idea is you want to store the delimiter character in the result as well.
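For reference, here is a self-contained sketch of the same idea wrapped in a function. The name `splitKeepDelims` and its parameters are hypothetical (they do not appear in the answer above), and the sketch assumes every delimiter is a single character.

```cpp
#include <string>
#include <vector>

// Split `line` at any character found in `delims`, keeping each
// delimiter as its own one-character element of the result.
std::vector<std::string> splitKeepDelims(const std::string& line,
                                         const std::string& delims)
{
    std::vector<std::string> parts;
    std::size_t prev = 0, pos;
    while ((pos = line.find_first_of(delims, prev)) != std::string::npos) {
        if (pos > prev)
            parts.push_back(line.substr(prev, pos - prev)); // text before the delimiter
        parts.push_back(line.substr(pos, 1));               // the delimiter itself
        prev = pos + 1;
    }
    if (prev < line.length())
        parts.push_back(line.substr(prev));                 // trailing text, if any
    return parts;
}
```

Calling `splitKeepDelims("a.b~c.d", ".~")` should give the seven elements a, ., b, ~, c, ., d from the question. Adjacent delimiters produce no empty strings between them because of the `pos > prev` check.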
Related
I understand how to split a string by a delimiter in C++, but how do you split a string embedded in a delimiter, e.g. split "~!hello~! random junk... ~!world~!" by the string "~!" into an array of ["hello", " random junk...", "world"]? Are there any C++ standard library functions for this, or, if not, an algorithm that could achieve it?
#include <iostream>
#include <string>
#include <vector>
using namespace std;
vector<string> split(string s,string delimiter){
vector<string> res;
string word;
size_t pos = s.find(delimiter); // size_t matches the type of string::npos
while (pos != string::npos) {
word = s.substr(0, pos); // The Word that comes before the delimiter
res.push_back(word); // Push the Word to our Final vector
s.erase(0, pos + delimiter.length()); // Delete the Delimiter and repeat till end of String to find all words
pos = s.find(delimiter); // Update pos to hold position of next Delimiter in our String
}
res.push_back(s); //push the last word that comes after the delimiter
return res;
}
int main() {
string s="~!hello~!random junk... ~!world~!";
vector<string>words = split(s,"~!");
int n=words.size();
for(int i=0;i<n;i++)
std::cout<<words[i]<<std::endl;
return 0;
}
The above program finds all the words that occur before, between, and after the delimiter you specify. Note that a delimiter at the very start or end of the input produces an empty string in the result. With minor changes you can adapt the function to your needs (for example, if you don't want the word before the first delimiter or after the last one).
For your case, the function splits the string according to the delimiter you provide.
I hope this answers your question!
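If the empty pieces are unwanted, one possible variant skips them and also avoids modifying the input by searching from a moving start index. This is only a sketch; the name `splitNonEmpty` is hypothetical and not part of the answer above.

```cpp
#include <string>
#include <vector>

// Split `s` on the (possibly multi-character) `delimiter`,
// dropping empty pieces. Whitespace inside pieces is preserved.
std::vector<std::string> splitNonEmpty(const std::string& s,
                                       const std::string& delimiter)
{
    std::vector<std::string> res;
    if (delimiter.empty()) {               // an empty delimiter would never advance
        if (!s.empty()) res.push_back(s);
        return res;
    }
    std::size_t start = 0;
    while (true) {
        std::size_t pos = s.find(delimiter, start);
        std::size_t count = (pos == std::string::npos) ? std::string::npos
                                                       : pos - start;
        std::string word = s.substr(start, count); // piece before the next delimiter
        if (!word.empty()) res.push_back(word);
        if (pos == std::string::npos) break;       // no delimiter left
        start = pos + delimiter.length();          // continue after the delimiter
    }
    return res;
}
```

With the input "~!hello~! random junk... ~!world~!" and the delimiter "~!", this should return three pieces: "hello", " random junk... " (surrounding spaces kept), and "world".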
So I have a file of strings that I am reading in, and I have to replace certain values in them with other values. The number of possible replacements is variable: the patterns and their replacements are read from a file. Currently I store them in a vector<pair<string,string>>. However, I run into an issue:
Example:
Input string: abcd.eaef%afas&333
Delimiter patterns:
. %%%
% ###
& ###
Output I want: abcd%%%eaef###afas###333
Output I get: abcd#########eaef###afas###333
The issue is that it ends up replacing the % sign, or any other symbol that was itself inserted as a replacement for something else. It should not do that.
My code is (relevant portions):
std::string& replace(std::string& s, const std::string& from, const std::string& to){
if(!from.empty())
for(size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos; pos += to.size()) s.replace(pos, from.size(), to);
return s;
}
string line;
vector<pair<string, string>> myset;
while(getline(delimiterfile, line)){
istringstream is(line);
string delim, pattern;
if(is >> delim >> pattern){
myset.push_back(make_pair(delim, pattern));
} else {
throw runtime_error("Invalid pattern pair!");
}
}
while(getline(input, line)){
string temp = line;
for(auto &item : myset){
replace(temp, item.first, item.second);
}
output << temp << endl;
}
Can someone please tell me what I'm messing up and how to fix it?
In pseudo-code a simple replacement algorithm could look something like this:
string input = getline();
string output; // The string containing the replacements
for (each char in input)
{
if (char == '.')
output += "%%%";
// TODO: Other replacements
else
output += char;
}
If you implement the above code, once it's done the variable output will contain the string with all replacements made.
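A C++ sketch of that single-pass idea, generalized to a list of pattern pairs like the one in the question, might look as follows. The function name `replaceAll` is hypothetical, and the sketch assumes that when several patterns match at the same position, the first one in the list should win.

```cpp
#include <string>
#include <utility>
#include <vector>

// Scan `input` once from left to right. At each position, try every
// pattern; on a match, append its replacement and skip past the match.
// Replacement text goes only into `output`, so it is never rescanned.
std::string replaceAll(const std::string& input,
                       const std::vector<std::pair<std::string, std::string>>& rules)
{
    std::string output;
    std::size_t i = 0;
    while (i < input.size()) {
        bool matched = false;
        for (const auto& rule : rules) {
            const std::string& from = rule.first;
            if (!from.empty() && input.compare(i, from.size(), from) == 0) {
                output += rule.second;   // emit the replacement
                i += from.size();        // skip the matched pattern
                matched = true;
                break;                   // first matching rule wins
            }
        }
        if (!matched)
            output += input[i++];        // no rule applies: copy the character
    }
    return output;
}
```

With the rules (".", "%%%"), ("%", "###"), ("&", "###"), the input abcd.eaef%afas&333 should become abcd%%%eaef###afas###333, because the %%% written for the dot is never examined again.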
I would suggest you use stringstream. This way you will be able to achieve what you are looking for very easily.
I have a string that goes like this:
Room -> Subdiv("X", 0.5, 0.5) { sleep | work } : 0.5
I need to somehow extract the 2 strings between {} , i.e. sleep and work. The format is strict, there can be just 2 words between the brackets, the words can change though. The text before and after the brackets can also change. My initial way of doing it was:
string split = line.substr(line.find("Subdiv(") + _count_of_fchars);
split = split.substr(4, axis.find(") { "));
split = split.erase(split.length() - _count_of_chars);
However, I realised that this is not going to work if the strings inside the brackets are changed to anything with a different length.
How can this be done? Thanks!
Without hard-coding any numbers:
Find A as the index of the first "{" from the end of the string, search backward.
Find B as the index of the first "|" from the position of "{", search forward.
Find C as the index of the first "}" from the position of "|", search forward.
The substring between A and B gives you the first string, while the substring between B and C gives you the second string. You can include the spaces in your substring search, or take them out later.
std::pair<std::string, std::string> SplitMyCustomString(const std::string& str){
auto first = str.find_last_of('{');
if(first == std::string::npos) return {};
auto mid = str.find_first_of('|', first);
if(mid == std::string::npos) return {};
auto last = str.find_first_of('}', mid);
if(last == std::string::npos) return {};
return { str.substr(first+1, mid-first-1), str.substr(mid+1, last-mid-1) };
}
For Trimming the spaces:
std::string Trim(const std::string& str){
auto first = str.find_first_not_of(' ');
if(first == std::string::npos) return {}; // empty or all spaces: nothing to keep
auto last = str.find_last_not_of(' '); // cannot fail once first was found
return str.substr(first, last-first+1);
}
Something like:
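// Assumes "{ ", " | " and " }" are all present in str.
std::size_t open = str.find("{ ") + 2; // first character after "{ "
std::size_t separator = str.find(" | ");
std::size_t close = str.find(" }");
string strNew1 = str.substr(open, separator - open);
string strNew2 = str.substr(separator + 3, close - (separator + 3)); // second argument is a length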
Even though you said that the number of words to find is fixed, I made a slightly more flexible example using a regular expression. However, you could still achieve the same result using Мотя's answer.
std::string s = "Room -> Subdiv(\"X\", 0.5, 0.5) { sleep | work } : 0.5";
std::regex rgx("\\{((?:\\s*\\w*\\s*\\|?)+)\\}");
std::smatch match;
if (std::regex_search(s, match, rgx) && match.size() == 2) {
// match[1] now contains "sleep | work"
std::istringstream iss(match[1].str());
std::string token;
while (std::getline(iss, token, '|')) {
std::cout << trim(token) << std::endl;
}
}
Here trim is a helper (not shown) that removes leading and trailing spaces. The input string could easily be extended to look like this: "...{ sleep | work | eat }...".
I am writing a program that should receive 3 parameters from the user: file_upload "local_path" "remote_path"
Code example:
std::vector<std::string> split(std::string str, char delimiter) {
std::vector<std::string> v;
std::stringstream src(str);
std::string buf;
while(getline(src, buf, delimiter)) {
v.push_back(buf);
}
return v;
}
void function() {
std::string input;
getline(std::cin, input);
// user input like this: file_upload /home/Space Dir/file c:\dir\file
std::vector<std::string> v_input = split(input, ' ');
// the code will do something like this
if(v_input[0].compare("file_upload") == 0) {
FILE *file;
file = fopen(v_input[1].c_str(), "rb");
send_upload_dir(v_input[2].c_str());
// bla bla bla
}
}
My question is: the second and third parameters are paths, so they can contain spaces. How can I make the split function leave the spaces in the second and third parameters intact?
I thought of putting the paths in quotes and writing a function to recognize them, but that does not work 100%, because the program has other functions that take only 2 parameters, not three. Can anyone help?
EDIT: /home/user/Space Dir/file.out <-- path with space name.
If this happens, the vector is larger than expected and the path is broken. This must not happen.
The vector will contain something like this:
vector[1] = /home/user/Space
vector[2] = Dir/file.out
and what I want is this:
vector[1] = /home/user/Space Dir/file.out
Since you need to accept three values from a single string input, this is a problem of encoding.
Encoding is sometimes done by imposing fixed-width requirements on some or all fields, but that's clearly not appropriate here, since we need to support variable-width file system paths, and the first value (which appears to be some kind of mode specifier) may be variable-width as well. So that's out.
This leaves 4 possible solutions for variable-width encoding:
1: Unambiguous delimiter.
If you can select a separator character that is guaranteed never to show up in the delimited values, then you can split on that. For example, if NUL is guaranteed never to be part of the mode value or the path values, then we can do this:
std::vector<std::string> v_input = split(input,'\0');
Or maybe the pipe character:
std::vector<std::string> v_input = split(input,'|');
Hence the input would have to be given like this (for the pipe character):
file_upload|/home/user/Space Dir/file.out|/home/user/Other Dir/blah
2: Escaping.
You can write the code to iterate through the input line and properly split it on unescaped instances of the separator character. Escaped instances will not be considered separators. You can parameterize the escape character. For example:
std::vector<std::string> escapedSplit(std::string str, char delimiter, char escaper ) {
std::vector<std::string> res;
std::string cur;
for (size_t i = 0; i < str.size(); ++i) {
if (str[i] == delimiter) {
res.push_back(cur);
cur.clear();
} else if (str[i] == escaper) {
++i;
if (i == str.size()) break;
cur.push_back(str[i]);
} else {
cur.push_back(str[i]);
} // end if
} // end for
if (!cur.empty()) res.push_back(cur);
return res;
} // end escapedSplit()
std::vector<std::string> v_input = escapedSplit(input,' ','\\');
With input as:
file_upload /home/user/Space\ Dir/file.out /home/user/Other\ Dir/blah
3: Quoting.
You can write the code to iterate through the input line and properly split it on unquoted instances of the separator character. Quoted instances will not be considered separators. You can parameterize the quote character.
A complication of this approach is that it is not possible to include the quote character itself inside a quoted extent unless you introduce an escaping mechanism, similar to solution #2. A common strategy is to allow repetition of the quote character to escape it. For example:
std::vector<std::string> quotedSplit(std::string str, char delimiter, char quoter ) {
std::vector<std::string> res;
std::string cur;
for (size_t i = 0; i < str.size(); ++i) {
if (str[i] == delimiter) {
res.push_back(cur);
cur.clear();
} else if (str[i] == quoter) {
++i;
for (; i < str.size(); ++i) {
if (str[i] == quoter) {
if (i+1 == str.size() || str[i+1] != quoter) break;
++i;
cur.push_back(quoter);
} else {
cur.push_back(str[i]);
} // end if
} // end for
} else {
cur.push_back(str[i]);
} // end if
} // end for
if (!cur.empty()) res.push_back(cur);
return res;
} // end quotedSplit()
std::vector<std::string> v_input = quotedSplit(input,' ','"');
With input as:
file_upload "/home/user/Space Dir/file.out" "/home/user/Other Dir/blah"
Or even just:
file_upload /home/user/Space" "Dir/file.out /home/user/Other" "Dir/blah
4: Length-value.
Finally, you can write the code to take a length before each value, and only grab that many characters. We could require a fixed-width length specifier, or skip a delimiting character following the length specifier. For example (note: light on error checking):
std::vector<std::string> lengthedSplit(std::string str) {
std::vector<std::string> res;
size_t i = 0;
while (i < str.size()) {
size_t len = std::atoi(str.c_str() + i); // parse the length at the current position, not at the start of the string
if (len == 0) break;
i += (size_t)std::log10(len)+2; // +1 to get base-10 digit count, +1 to skip delim
res.push_back(str.substr(i,len));
i += len;
} // end while
return res;
} // end lengthedSplit()
std::vector<std::string> v_input = lengthedSplit(input);
With input as:
11:file_upload29:/home/user/Space Dir/file.out25:/home/user/Other Dir/blah
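The length-value parser above is deliberately light on error checking. A slightly more defensive sketch of the same encoding is shown below; the name `parseLengthPrefixed` is hypothetical, the ':' separator is taken from the example input, and very large length values are not guarded against overflow. Unlike the version above, it accepts zero-length values such as "0:".

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Parse a sequence of "<length>:<value>" records, e.g. "3:abc2:de".
// Throws std::invalid_argument on malformed input.
std::vector<std::string> parseLengthPrefixed(const std::string& str)
{
    std::vector<std::string> res;
    std::size_t i = 0;
    while (i < str.size()) {
        std::size_t colon = str.find(':', i);
        if (colon == std::string::npos || colon == i)
            throw std::invalid_argument("missing length or ':'");
        std::size_t len = 0;
        for (std::size_t k = i; k < colon; ++k) {   // read the decimal length
            if (!std::isdigit(static_cast<unsigned char>(str[k])))
                throw std::invalid_argument("length is not a number");
            len = len * 10 + static_cast<std::size_t>(str[k] - '0');
        }
        if (len > str.size() - (colon + 1))
            throw std::invalid_argument("value shorter than declared length");
        res.push_back(str.substr(colon + 1, len));  // exactly len characters
        i = colon + 1 + len;                        // jump to the next record
    }
    return res;
}
```

Because each value is taken by length rather than by searching for a separator, the values themselves may contain spaces, colons, or digits without any escaping.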
I had a similar problem a few days ago and solved it like this:
First I create a copy of the string, then I replace the quoted substrings in the copy with padding so that they contain no whitespace, and finally I split the original string at the whitespace positions found in the copy.
Here is my full solution. You may also want to remove the double quotes, trim the original string, and so on:
#include <sstream>
#include<iostream>
#include<vector>
#include<string>
using namespace std;
string padString(size_t len, char pad)
{
ostringstream ostr;
ostr.fill(pad);
ostr.width(len);
ostr<<"";
return ostr.str();
}
void splitArgs(const string& s, vector<string>& result)
{
size_t pos1=0,pos2=0,len;
string res = s;
pos1 = res.find_first_of("\"");
while(pos1 != string::npos && pos2 != string::npos){
pos2 = res.find_first_of("\"",pos1+1);
if(pos2 != string::npos ){
len = pos2-pos1+1;
res.replace(pos1,len,padString(len,'X'));
pos1 = res.find_first_of("\"");
}
}
pos1=res.find_first_not_of(" \t\r\n",0);
while(pos1 < s.length() && pos2 < s.length()){
pos2 = res.find_first_of(" \t\r\n",pos1+1);
if(pos2 == string::npos ){
pos2 = res.length();
}
len = pos2-pos1;
result.push_back(s.substr(pos1,len));
pos1 = res.find_first_not_of(" \t\r\n",pos2+1);
}
}
int main()
{
string s = "234 \"5678 91\" 8989";
vector<string> args;
splitArgs(s,args);
cout<<"original string:"<<s<<endl;
for(size_t i=0;i<args.size();i++)
cout<<"arg "<<i<<": "<<args[i]<<endl;
return 0;
}
and this is the output:
original string:234 "5678 91" 8989
arg 0: 234
arg 1: "5678 91"
arg 2: 8989
I have a vector of characters which contains some words delimited by commas.
I need to separate text by words and add those words to a list.
Thanks.
vector<char> text;
list<string> words;
I think I'd do it something like this:
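// requires <algorithm> for std::find
auto start = text.begin(); // beginning of the current word
auto stop = start;
while ((stop=std::find(start, text.end(), ',')) != text.end()) {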
words.push_back(std::string(start, stop));
start = stop+1;
}
words.push_back(std::string(start, text.end()));
Edit: That said, I have to point out that the requirement seems a bit odd -- why are you starting with a std::vector<char>? A std::string would be much more common.
vector<char> text = ...;
list<string> words;
ostringstream s;
for (auto c : text)
if (c == ',')
{
words.push_back(s.str());
s.str("");
}
else
s.put(c);
words.push_back(s.str());
Try to code this simple pseudocode and see how it goes:
string tmp;
for i = 0 to text.size - 1
if text[i] != ','
append text[i] to tmp via push_back
else add tmp to words via push_back and clear out tmp
after the loop, add tmp to words (the last word has no trailing comma)
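One way the pseudocode could be written in C++ is sketched below. The function name `splitWords` is hypothetical; it also pushes whatever remains in the buffer after the loop, since the last word is not followed by a comma.

```cpp
#include <list>
#include <string>
#include <vector>

// Split a comma-delimited character buffer into a list of words.
std::list<std::string> splitWords(const std::vector<char>& text)
{
    std::list<std::string> words;
    std::string tmp;
    for (char c : text) {
        if (c != ',') {
            tmp.push_back(c);       // still inside the current word
        } else {
            words.push_back(tmp);   // a comma ends the current word
            tmp.clear();
        }
    }
    words.push_back(tmp);           // the last word has no trailing comma
    return words;
}
```

Consecutive commas yield empty strings in the list; filter them out afterwards if they are not wanted.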