I have a vector of characters which contains some words delimited by comma.
I need to separate text by words and add those words to a list.
Thanks.
vector<char> text;
list<string> words;
I think I'd do it something like this:
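// needs <algorithm> for std::find
vector<char>::iterator start = text.begin(), stop;
while ((stop = std::find(start, text.end(), ',')) != text.end()) {
    words.push_back(std::string(start, stop)); // the word before this comma
    start = stop + 1;                          // continue just past the comma
}
words.push_back(std::string(start, text.end())); // the word after the last comma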
Edit: That said, I have to point out that the requirement seems a bit odd -- why are you starting with a std::vector<char>? A std::string would be much more common.
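For completeness, here is a minimal self-contained sketch of the std::find loop above wrapped in a function. The name split_on_comma is my own, chosen only for illustration, and the function takes the vector by const reference, which is an assumption about how you would want to call it.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-delimited vector<char> into words.
std::list<std::string> split_on_comma(const std::vector<char>& text)
{
    std::list<std::string> words;
    std::vector<char>::const_iterator start = text.begin();
    std::vector<char>::const_iterator stop;
    while ((stop = std::find(start, text.end(), ',')) != text.end()) {
        words.push_back(std::string(start, stop)); // word before this comma
        start = stop + 1;                          // skip the comma itself
    }
    words.push_back(std::string(start, text.end())); // trailing word
    return words;
}
```

Note that empty fields are kept: two adjacent commas produce an empty string in the list, and an empty input produces a list with one empty string.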
vector<char> text = ...;
list<string> words;
ostringstream s;
for (auto c : text)
if (c == ',')
{
words.push_back(s.str());
s.str("");
}
else
s.put(c);
words.push_back(s.str());
Try to code this simple pseudocode and see how it goes:
string tmp;
for i = 0 to text.size - 1
    if text[i] != ','
        append text[i] to tmp via push_back
    else
        add tmp to words via push_back and clear out tmp
after the loop, add whatever is left in tmp to words via push_back
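One possible translation of that pseudocode into C++, as a rough sketch. The function name split_by_accumulating is made up for this example. It also flushes tmp once after the loop, so the word after the last comma is not lost.

```cpp
#include <list>
#include <string>
#include <vector>

// Character-by-character version of the pseudocode (illustrative name).
std::list<std::string> split_by_accumulating(const std::vector<char>& text)
{
    std::list<std::string> words;
    std::string tmp;
    for (std::vector<char>::size_type i = 0; i < text.size(); ++i) {
        if (text[i] != ',') {
            tmp.push_back(text[i]);   // still inside a word
        } else {
            words.push_back(tmp);     // comma: the word is complete
            tmp.clear();
        }
    }
    words.push_back(tmp);             // flush the word after the last comma
    return words;
}
```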
This function returns an array of strings with a list of files in a folder. It looks like this:
"folder//xyz.txt"
How can I make it look like this?
folder//xyz.txt
It's the same, but without the quotes.
vector<string> list_of_files(string folder_name)
{
vector<string> files;
string path = folder_name;
for (const auto& entry : fs::directory_iterator(path))
{
stringstream ss;
ss << entry.path(); //convert entry.path() to string
string str = ss.str();
files.push_back(ss.str());
}
return files;
}
Erasing the first and last characters of a string is easy:
if (str.size() >= 1)
str.erase(0, 1); // from 1st char (#0), len 1; bit verbose as not designed for this
if (str.size() >= 1)
str.pop_back(); // chop off the end
Your quotes have come from inserting the path into a stream (std::quoted is used to help prevent bugs caused by spaces further down the line).
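To see the difference in isolation, here is a small sketch that assumes a C++17 compiler with std::filesystem available. The helper names via_stream and via_string are invented for the illustration.

```cpp
#include <filesystem>
#include <sstream>
#include <string>

// Streams the path; operator<< quotes it, so the result carries "..." around it.
std::string via_stream(const std::filesystem::path& p)
{
    std::ostringstream ss;
    ss << p;
    return ss.str();
}

// Asks the path for its string directly; no quotes are added.
std::string via_string(const std::filesystem::path& p)
{
    return p.string();
}
```

With the same path, via_stream returns the text wrapped in double quotes, while via_string returns it bare.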
Fortunately, you don't need any of this: as explored in the comments, the stringstream is entirely unnecessary; the path already converts to a string if you ask it to:
vector<string> list_of_files(string folder_name)
{
vector<string> files;
for (const auto& entry : fs::directory_iterator(folder_name))
files.push_back(entry.path().string());
return files;
}
I have a string which I input as follows
using namespace std;
string s;
getline(cin, s);
I input
a.b~c.d
I want to split the string at . and ~ but also want to store the delimiters. The split elements will be stored in a vector.
Final output should look like this
a
.
b
~
c
.
d
I saw a solution here but it was in java.
How do I achieve this in c++?
This solution is copied verbatim from this answer except for the commented lines:
std::stringstream stringStream(inputString);
std::string line;
while(std::getline(stringStream, line))
{
std::size_t prev = 0, pos;
while ((pos = line.find_first_of(".~", prev)) != std::string::npos) // only look for . and ~
{
if (pos > prev)
wordVector.push_back(line.substr(prev, pos-prev));
wordVector.push_back(line.substr(pos, 1)); // add delimiter
prev = pos+1;
}
if (prev < line.length())
wordVector.push_back(line.substr(prev, std::string::npos));
}
I haven't tested the code, but the basic idea is you want to store the delimiter character in the result as well.
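Here is the same idea as a standalone function, as a sketch rather than a tested drop-in. The name split_keep_delims and the delims parameter are my own additions. It handles one line at a time.

```cpp
#include <string>
#include <vector>

// Splits line at any character in delims and keeps each delimiter as its own element.
std::vector<std::string> split_keep_delims(const std::string& line,
                                           const std::string& delims)
{
    std::vector<std::string> out;
    std::size_t prev = 0, pos;
    while ((pos = line.find_first_of(delims, prev)) != std::string::npos) {
        if (pos > prev)
            out.push_back(line.substr(prev, pos - prev)); // text before the delimiter
        out.push_back(line.substr(pos, 1));               // the delimiter itself
        prev = pos + 1;
    }
    if (prev < line.length())
        out.push_back(line.substr(prev));                 // text after the last delimiter
    return out;
}
```

Calling split_keep_delims("a.b~c.d", ".~") yields the seven elements a, ., b, ~, c, ., d in that order.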
So I have a file of strings that I am reading in, and I have to replace certain values in them with other values. The amount of possible replacements is variable. As in, it reads the patterns to replace with in from a file. Currently I'm storing in a vector<pair<string,string>> for the patterns to find and match. However I run into issues:
Example:
Input string: abcd.eaef%afas&333
Delimiter patterns:
. %%%
% ###
& ###
Output I want: abcd%%%eaef###afas###333
Output I get: abcd#########eaef###afas###333
The issue being it ends up replacing the % sign or any other symbol that was already a replacement for something else, it should not be doing that.
My code is (relevant portions):
std::string& replace(std::string& s, const std::string& from, const std::string& to){
if(!from.empty())
for(size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos; pos += to.size()) s.replace(pos, from.size(), to);
return s;
}
string line;
vector<pair<string, string>> myset;
while(getline(delimiterfile, line)){
istringstream is(line);
string delim, pattern;
if(is >> delim >> pattern){
myset.push_back(make_pair(delim, pattern));
} else {
throw runtime_error("Invalid pattern pair!");
}
}
while(getline(input, line)){
string temp = line;
for(auto &item : myset){
replace(temp, item.first, item.second);
}
output << temp << endl;
}
Can someone please tell me what I'm messing up and how to fix it?
In pseudo-code a simple replacement algorithm could look something like this:
string input = getline();
string output; // The string containing the replacements
for (each char in input)
{
if (char == '.')
output += "%%%";
// TODO: Other replacements
else
output += char;
}
If you implement the above code, once it's done the variable output will contain the string with all replacements made.
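A sketch of how that single pass might look with the variable list of patterns from the question. The names PatternList and replace_all_once are mine, and "first matching pattern in list order wins" is an assumption about the desired behaviour. Because replacements are only ever appended to output and never rescanned, an inserted % is not replaced again.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > PatternList;

// Scans input once, left to right; replacement text is never re-examined.
std::string replace_all_once(const std::string& input, const PatternList& patterns)
{
    std::string output;
    std::size_t i = 0;
    while (i < input.size()) {
        bool matched = false;
        for (PatternList::const_iterator it = patterns.begin();
             it != patterns.end(); ++it) {
            const std::string& from = it->first;
            // Does the pattern occur at position i of the input?
            if (!from.empty() && input.compare(i, from.size(), from) == 0) {
                output += it->second;  // emit the replacement
                i += from.size();      // skip the matched text in the input
                matched = true;
                break;
            }
        }
        if (!matched)
            output += input[i++];      // no pattern here: copy the character
    }
    return output;
}
```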
I would suggest you use stringstream. This way you will be able to achieve what you are looking for very easily.
I've been practicing C++ for a competition next week, and the sample problem I've been working on requires splitting paragraphs into words. Of course, that's easy. But this problem is weird in that words like isn't should be separated as well, into isn and t. I know it's weird, but I have to follow this.
I have a function split() that takes a constant char delimiter as one of its parameters. It's what I use to separate words at spaces. But I can't figure out this one. Even words containing numbers, like phil67bs, should be separated into phil and bs.
And no, I don't ask for full code. A pseudocode will do, or something that will help me understand what to do. Thanks!
PS: Please no recommendations for external libs. Just the STL. :)
Filter out numbers, spaces and anything else that isn't a letter by using a proper locale. See this SO thread about treating everything but numbers as a whitespace. So use a mask and do something similar to what Jerry Coffin suggests but only for letters:
struct alphabet_only: std::ctype<char>
{
alphabet_only(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table()
{
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::space);
std::fill(&rc['A'], &rc['['], std::ctype_base::upper);
std::fill(&rc['a'], &rc['{'], std::ctype_base::lower);
return &rc[0];
}
};
And, boom! You're golden.
Or... you could just do a transform:
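// needs <algorithm>, <cctype>, <iterator> and <vector>
// the cast avoids undefined behaviour in isalpha for negative char values
char changeToLetters(char input){ return isalpha(static_cast<unsigned char>(input)) ? input : ' '; }
vector<char> output;
output.reserve( myVector.size() );
// back_inserter appends to output; a plain function pointer works as the operation
transform( myVector.begin(), myVector.end(), back_inserter(output), changeToLetters );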
Which, um, is much easier to grok, just not as efficient as Jerry's idea.
Edit:
Changed 'Z' to '[' so that the value 'Z' is filled. Likewise with 'z' to '{'.
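For the alphabet_only facet above, usage might look roughly like this. It is a sketch: the function name words_via_locale is invented, and the facet is repeated only so the example is self-contained. The stream's locale is swapped for one carrying the facet, after which operator>> treats every non-letter as whitespace.

```cpp
#include <algorithm>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

struct alphabet_only : std::ctype<char>
{
    alphabet_only() : std::ctype<char>(get_table()) {}
    static std::ctype_base::mask const* get_table()
    {
        // Everything is "space" by default; only A-Z and a-z are letters.
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        std::fill(&rc['A'], &rc['['], std::ctype_base::upper);
        std::fill(&rc['a'], &rc['{'], std::ctype_base::lower);
        return &rc[0];
    }
};

// Hypothetical helper: extracts runs of letters using the facet.
std::vector<std::string> words_via_locale(const std::string& text)
{
    std::istringstream in(text);
    // The locale takes ownership of the facet allocated here.
    in.imbue(std::locale(std::locale(), new alphabet_only()));
    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}
```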
This sounds like a perfect job for the find_first_of function which finds the first occurrence of a set of characters. You can use this to look for arbitrary stop characters and generate words from the spaces between such stop characters.
Roughly:
size_t previous = 0;
for (;;) {
    size_t next = str.find_first_of(" '1234567890", previous);
    // Do processing
    if (next == string::npos)
        break;
    previous = next + 1;
}
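Filling in the "Do processing" step, a complete version could look like the sketch below. The function name split_on_stops and the stops parameter are assumptions of mine, and skipping empty pieces between adjacent stop characters is a choice the original leaves open.

```cpp
#include <string>
#include <vector>

// Collects the non-empty pieces of str that lie between stop characters.
std::vector<std::string> split_on_stops(const std::string& str,
                                        const std::string& stops)
{
    std::vector<std::string> words;
    std::size_t previous = 0;
    for (;;) {
        std::size_t next = str.find_first_of(stops, previous);
        // When no stop character remains, take everything up to the end.
        std::size_t count = (next == std::string::npos) ? std::string::npos
                                                        : next - previous;
        std::string word = str.substr(previous, count);
        if (!word.empty())
            words.push_back(word);
        if (next == std::string::npos)
            break;
        previous = next + 1;
    }
    return words;
}
```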
Just change your function to delimit on anything that isn't an alphabetic character. Is there anything in particular that you are having trouble with?
Break down the problem: First, write a function that gets the first "word" from the sentence. This is easy; just look for the first non-alphabetic character. The next step is to remove all leading non-alphabetic character from the remaining string. From there, just repeat.
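Those two steps (skip leading non-letters, then take letters until the next non-letter) map fairly directly onto std::find_if. A sketch, with helper names is_letter, not_letter and split_alpha that are purely illustrative:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

static bool is_letter(char c)
{
    // The cast keeps isalpha well-defined for negative char values.
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

static bool not_letter(char c) { return !is_letter(c); }

std::vector<std::string> split_alpha(const std::string& s)
{
    std::vector<std::string> words;
    std::string::const_iterator it = s.begin();
    for (;;) {
        it = std::find_if(it, s.end(), is_letter);        // skip leading non-letters
        if (it == s.end())
            break;
        std::string::const_iterator stop =
            std::find_if(it, s.end(), not_letter);        // end of this word
        words.push_back(std::string(it, stop));
        it = stop;                                        // repeat on the remainder
    }
    return words;
}
```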
You can do something like this:
vector<string> split(const string& str)
{
vector<string> splits;
string cur;
for(int i = 0; i < str.size(); ++i)
{
if(str[i] >= '0' && str[i] <= '9')
{
if(!cur.empty())
{
splits.push_back(cur);
}
cur="";
}
else
{
cur += str[i];
}
}
if(! cur.empty())
{
splits.push_back(cur);
}
return splits;
}
Let's assume that the input is in a std::string (use std::getline(cin, line), for example, to read a full line from cin):
std::vector<std::string> split(std::string const& input)
{
    std::string::const_iterator it(input.begin()), end(input.end());
    std::string current;
    std::vector<std::string> words;
    for(; it != end; ++it)
    {
        if (isalpha(static_cast<unsigned char>(*it)))
        {
            current.push_back(*it); // add this char to the current word
        }
        else if (!current.empty())
        {
            // push the current word into the result list
            words.push_back(current);
            current.clear(); // next word
        }
    }
    if (!current.empty())
        words.push_back(current); // don't lose the final word
    return words;
}
I've not tested it, but I guess it ought to work...
What's the most efficient way of removing a 'newline' from a std::string?
#include <algorithm>
#include <string>
std::string str;
str.erase(std::remove(str.begin(), str.end(), '\n'), str.cend());
The behavior of std::remove may not quite be what you'd expect.
A call to remove is typically followed by a call to a container's erase method, which erases the unspecified values and reduces the physical size of the container to match its new logical size.
See an explanation of it here.
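To make the two steps visible, here is a small sketch. The function names kept_after_remove and strip_newlines are hypothetical. std::remove only shifts the kept characters to the front and returns the new logical end; the string's size does not change until erase is called.

```cpp
#include <algorithm>
#include <string>

// Runs std::remove alone and reports how many characters were kept.
// The string keeps its old size; the tail holds unspecified leftovers.
std::size_t kept_after_remove(std::string& s)
{
    std::string::iterator new_end = std::remove(s.begin(), s.end(), '\n');
    return static_cast<std::size_t>(new_end - s.begin());
}

// The full erase-remove idiom: shift, then actually shrink the string.
void strip_newlines(std::string& s)
{
    s.erase(std::remove(s.begin(), s.end(), '\n'), s.end());
}
```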
If the newline is expected to be at the end of the string, then:
if (!s.empty() && s[s.length()-1] == '\n') {
s.erase(s.length()-1);
}
If the string can contain many newlines anywhere in the string:
std::string::size_type i = 0;
while (i < s.length()) {
    i = s.find('\n', i);
    if (i == std::string::npos) {
        break;
    }
    s.erase(i, 1); // erase only the newline; s.erase(i) would drop the rest of the string
}
You should use the erase-remove idiom, looking for '\n'. This will work for any standard sequence container; not just string.
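As a sketch of the "any sequence container" point, the idiom can be written once as a template. The name erase_value is my own. (std::list also has its own remove member, and C++20 adds std::erase for the standard containers, if those are available to you.)

```cpp
#include <algorithm>
#include <deque>
#include <string>
#include <vector>

// Erase-remove for any container with begin(), end() and a range erase().
template <typename Sequence>
void erase_value(Sequence& c, const typename Sequence::value_type& value)
{
    c.erase(std::remove(c.begin(), c.end(), value), c.end());
}
```

The same call then works on a std::string, a std::vector<char> or a std::deque<int>.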
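Here is one for DOS or Unix new line:
void chomp( string &s)
{
    // Cut the string at the first CR or LF, so both "\r\n" and "\n" endings are handled.
    string::size_type pos = s.find_first_of("\r\n");
    if(pos != string::npos)
        s.erase(pos);
}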
A slight modification of edW's solution to remove all existing newline characters:
void chomp(string &s){
size_t pos;
while (((pos=s.find('\n')) != string::npos))
s.erase(pos,1);
}
Note that pos is declared as size_t because that is the type find returns and the type of string::npos, which is defined as the largest value size_t can hold. Storing the result in an int relies on a narrowing conversion and cannot represent positions above INT_MAX, so keep pos as size_t (or string::size_type) when comparing against npos.
s.erase(std::remove(s.begin(), s.end(), '\n'), s.end());
The code removes all newlines from the string s.
O(N) implementation best served without comments on SO and with comments in production.
size_t shift=0;
for (size_t i=0; i<str.length(); ++i){
    if (str[i] == '\n') {
        ++shift; // one more character to drop
    }else{
        str[i-shift] = str[i]; // slide kept characters left over the removed ones
    }
}
str.resize(str.length() - shift);
std::string some_str = SOME_VAL;
if ( some_str.size() > 0 && some_str[some_str.length()-1] == '\n' )
some_str.resize( some_str.length()-1 );
or (removes several newlines at the end)
some_str.resize( some_str.find_last_not_of("\n")+1 );
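Wrapped in a function, the second variant might look like the sketch below. The name trim_trailing and the default set "\r\n" are assumptions on my part, added so that DOS line endings are covered too.

```cpp
#include <string>

// Drops every trailing character that appears in chars.
void trim_trailing(std::string& s, const std::string& chars = "\r\n")
{
    // find_last_not_of returns npos when every character is in chars
    // (or the string is empty); npos + 1 wraps around to 0, so the
    // string is simply emptied in that case.
    s.resize(s.find_last_not_of(chars) + 1);
}
```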
Another way to do it in the for loop
void rm_nl(string &s) {
    // resume each search at the position just erased instead of rescanning from the start
    for (string::size_type p = s.find('\n'); p != string::npos; p = s.find('\n', p))
        s.erase(p,1);
}
Usage:
string data = "\naaa\nbbb\nccc\nddd\n";
rm_nl(data);
cout << data; // data = aaabbbcccddd
All these answers seem a bit heavy to me.
If you just flat out remove the '\n' and move everything else back a spot, you are liable to have some characters slammed together in a weird-looking way. So why not just do the simple (and most efficient) thing: Replace all '\n's with spaces?
for (int i = 0; i < str.length();i++) {
if (str[i] == '\n') {
str[i] = ' ';
}
}
There may be ways to improve the speed of this at the edges, but it will be way quicker than moving whole chunks of the string around in memory.
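The same replacement can also be written with std::replace from <algorithm>. This is a sketch, and the wrapper name newlines_to_spaces is invented.

```cpp
#include <algorithm>
#include <string>

// Overwrites each '\n' with a space in place; the length stays the same.
void newlines_to_spaces(std::string& s)
{
    std::replace(s.begin(), s.end(), '\n', ' ');
}
```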
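If it's anywhere in the string, then you can't do better than O(n).
And the only way is to search for '\n' in the string and erase it.
for(size_t i=0;i<s.length();) { if(s[i]=='\n') s.erase(s.begin()+i); else i++; } // don't advance after an erase, or the character that slid into position i is skipped
For many newlines, where each erase would shift the rest of the string again: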
int n=0;
for(int i=0;i<s.length();i++){
if(s[i]=='\n'){
n++;//we increase the number of newlines we have found so far
}else{
s[i-n]=s[i];
}
}
s.resize(s.length()-n);//delete the last n elements, which are now leftovers, in one go
It erases all the newlines in a single pass.
About answer 3, the code that removes only the last \n from the string:
if (!s.empty() && s[s.length()-1] == '\n') {
s.erase(s.length()-1);
}
Will the if condition not fail if the string is really empty?
Is it not better to do:
if (!s.empty())
{
if (s[s.length()-1] == '\n')
s.erase(s.length()-1);
}
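For what it's worth, the two forms should behave identically: && only evaluates its right-hand operand when the left-hand one is true, so s[s.length()-1] is never reached for an empty string. A small sketch to check this; the function name drop_trailing_newline and its bool return value are made up for the example.

```cpp
#include <string>

// Removes one trailing '\n' if present and reports whether it did.
// Safe on an empty string: the right-hand side of && is evaluated
// only when !s.empty() is true.
bool drop_trailing_newline(std::string& s)
{
    if (!s.empty() && s[s.length() - 1] == '\n') {
        s.erase(s.length() - 1);
        return true;
    }
    return false;
}
```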
To extend @Greg Hewgill's answer for C++11:
If you just need to delete a newline at the very end of the string:
This in C++98:
if (!s.empty() && s[s.length()-1] == '\n') {
s.erase(s.length()-1);
}
...can now be done like this in C++11:
if (!s.empty() && s.back() == '\n') {
s.pop_back();
}
Optionally, wrap it up in a function. Note that I pass it by ptr here simply so that when you take its address as you pass it to the function, it reminds you that the string will be modified in place inside the function.
void remove_trailing_newline(std::string* str)
{
if (str->empty())
{
return;
}
if (str->back() == '\n')
{
str->pop_back();
}
}
// usage
std::string str = "some string\n";
remove_trailing_newline(&str);
What's the most efficient way of removing a 'newline' from a std::string?
As far as the most efficient way goes--that I'd have to speed test/profile and see. I'll see if I can get back to you on that and run some speed tests between the top two answers here, and a C-style way like I did here: Removing elements from array in C. I'll use my nanos() timestamp function for speed testing.
Other References:
See these "new" C++11 functions in this reference wiki here: https://en.cppreference.com/w/cpp/string/basic_string
https://en.cppreference.com/w/cpp/string/basic_string/empty
https://en.cppreference.com/w/cpp/string/basic_string/back
https://en.cppreference.com/w/cpp/string/basic_string/pop_back