Removing items from vector string C++ - c++

I have a vector whose contents look like this:
std::vector<string> vec;
vec.push_back("XXXX_LLLL");
vec.push_back("XXXX_HHHH");
vec.push_back("XXXX_XXXX");
I'd like to iterate over the vector and remove the "_" from each string. I tried the erase-remove idiom as follows, with a struct I wrote to detect the underscore.
vec.erase(remove_if(vec.begin(), vec.end(), IsUnderScore2()),vec.end());
But I realized it's not iterating over the characters of each individual string in my vector, so it will never erase the underscore. Is there another way of iterating over a vector and its individual elements that can help me here?

Iterate through the vector and use the erase-remove idiom on each string, instead of on the vector elements as you're doing right now:
std::vector<string> vec;
vec.push_back("XXXX_LLLL");
vec.push_back("XXXX_HHHH");
vec.push_back("XXXX_XXXX");
for(auto& str : vec) {
str.erase(std::remove(str.begin(), str.end(), '_'),
str.end());
}
C++03 version:
for(std::vector<std::string>::iterator it = vec.begin(); it != vec.end(); ++it) {
it->erase(std::remove(it->begin(), it->end(), '_'),
it->end());
}
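To make the two steps of the idiom explicit, here is a minimal sketch that splits them apart. The helper name strip_char is hypothetical (it is not from the question or the answer), and it is written in the C++03 style so that it compiles either way.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: strips every occurrence of `ch` from each string in `vec`.
void strip_char(std::vector<std::string>& vec, char ch)
{
    for (std::vector<std::string>::iterator it = vec.begin(); it != vec.end(); ++it) {
        // std::remove shifts the kept characters to the front and returns
        // an iterator to the new logical end; it does not shrink the string.
        std::string::iterator new_end = std::remove(it->begin(), it->end(), ch);
        // erase() drops the leftover tail, which is what actually shortens it.
        it->erase(new_end, it->end());
    }
}
```

Calling strip_char(vec, '_') on the vector from the question should leave it holding the three strings without underscores.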

Try the following. You can use standard algorithm std::remove applied to each string of the vector.
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
int main()
{
std::vector<std::string> vec;
vec.push_back("XXXX_LLLL");
vec.push_back("XXXX_HHHH");
vec.push_back("XXXX_XXXX");
for ( std::string &s : vec )
{
s.erase( std::remove( s.begin(), s.end(), '_'), s.end() );
}
for ( const std::string &s : vec ) std::cout << s << std::endl;
return 0;
}
The output is
XXXXLLLL
XXXXHHHH
XXXXXXXX
If your compiler does not support C++11, you can write
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
int main()
{
std::vector<std::string> vec;
vec.push_back("XXXX_LLLL");
vec.push_back("XXXX_HHHH");
vec.push_back("XXXX_XXXX");
for (std::vector<std::string>::iterator it = vec.begin(); it != vec.end(); ++it )
{
it->erase( std::remove( it->begin(), it->end(), '_'), it->end() );
}
for (std::vector<std::string>::iterator it = vec.begin(); it != vec.end(); ++it )
{
std::cout << *it << std::endl;
}
return 0;
}

Using regular expressions, it would look like this:
for_each(vec.begin(), vec.end(), [](string &str) {
str = regex_replace(str, regex("_"), "");
});
A range-based for loop version might be more readable:
for(auto &str : vec) {
str = regex_replace(str, regex("_"), "");
}

Related

How to add elements to a vector one at a time, placing each at its sorted position on insertion?

I tried to write a program that inserts elements into a vector in alphabetical order. Each new element is compared with the ones already inserted until it reaches one that it should precede, and it is then supposed to be added at that position using .insert(). I want to do this without using sorting algorithms.
std::string name;
std::vector<std::string> students;
std::vector<std::string>::iterator beg = students.begin();
while (std::cin>>name){
for (std::vector<std::string>::iterator e = students.end() ; beg !=e ; ) {
if (!name.compare(*beg))
{
students.insert(beg, name);
break;
}
else
beg++;
}
}
To avoid invalidating the iterator that points past the last element, I renew it on each iteration.
The problem is that when I check the vector after this part of the code, it is empty.
This comparison
if (!name.compare(*beg))
does not make sense. It checks only that two strings are equal.
Consider for example the following code snippet
std::string s1 = "one";
std::string s2 = "one";
std::cout << !s1.compare( s2 ) << '\n';
Its output is 1. It means that the two objects are equal.
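For an ordering test the sign of the result matters, not whether it is zero. A minimal sketch, assuming a hypothetical helper named goes_before that is not part of the original code:

```cpp
#include <string>

// Hypothetical helper: true when `name` must be placed before `existing`
// in ascending order. compare() returns a negative value, zero, or a
// positive value, so the sign has to be tested, not just "is it zero".
bool goes_before(const std::string& name, const std::string& existing)
{
    return name.compare(existing) < 0;   // same result as name < existing
}
```

With this, goes_before("Ann", "Bob") is true, while goes_before("Ann", "Ann") is false because equal strings compare to zero.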
Moreover, the for loop can end without finding a position where the string can be inserted, for example when the vector is initially empty.
And this statement
std::vector<std::string>::iterator beg = students.begin();
must be inside the outer while loop. That is, the iterator must be initialized anew in each iteration of the loop.
Here is a demonstrative program that shows how the inner loop can be implemented.
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
void insert( std::vector<std::string> &v, const std::string &s )
{
auto it = std::begin( v );
while ( it != std::end( v ) && not( s < *it ) ) ++it;
v.insert( it, s );
}
int main()
{
std::string names[] = { "One", "Two", "Three" };
std::vector<std::string> v;
for ( const auto &s : names )
{
insert( v, s );
}
for ( const auto &s : v ) std::cout << s << ' ';
std::cout << '\n';
return 0;
}
The program output is
One Three Two
That is, the strings are inserted in ascending order.
Applied to your code snippet, the loops can look like
while ( std::cin >> name )
{
auto it = std::begin( students ); // or students.begin()
while ( it != std::end( students ) && not( name < *it ) ) ++it;
students.insert( it, name );
}
Also instead of the inner while loop you could use the standard algorithm std::find_if. For example
#include <iostream>
#include <string>
#include <functional>
#include <vector>
#include <iterator>
#include <algorithm>
//...
while ( std::cin >> name )
{
using namespace std::placeholders;
auto it = std::find_if( std::begin( students ), std::end( students ),
std::bind( std::greater_equal<>(), _1, name ) );
students.insert( it, name );
}
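Since the vector is sorted after every insertion, the position can also be found with a binary search instead of a linear scan. This is a different technique from the find_if version above, shown only as a sketch; the helper name insert_sorted is hypothetical. It still does not call any sorting algorithm.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: inserts `s` into an already sorted vector `v`
// and keeps it sorted. std::upper_bound returns the first position whose
// element is greater than `s`, so equal names keep their input order.
void insert_sorted(std::vector<std::string>& v, const std::string& s)
{
    v.insert(std::upper_bound(v.begin(), v.end(), s), s);
}
```

The search takes a logarithmic number of comparisons, although the insertion itself still has to shift the elements that follow, so each call remains linear in the worst case.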
For an empty vector begin and end are the same, hence you never insert anything.
It is not clear why you do not want to use a sorting algorithm, hence I would propose the following:
std::string name;
std::vector<std::string> students;
while (std::cin>>name){
students.push_back(name);
}
std::sort(students.begin(),students.end());
Alternatively, replace the last line with your favourite sorting routine.

Rcpp - Capture result of sregex_token_iterator to vector

I'm an R user and am learning c++ to leverage in Rcpp. Recently, I wrote an alternative to R's strsplit in Rcpp using string.h but it isn't regex based (afaik). I've been reading about Boost and found sregex_token_iterator.
The website below has an example:
std::string input("This is his face");
sregex re = sregex::compile(" "); // find white space
// iterate over all non-white space in the input. Note the -1 below:
sregex_token_iterator begin( input.begin(), input.end(), re, -1 ), end;
// write all the words to std::cout
std::ostream_iterator< std::string > out_iter( std::cout, "\n" );
std::copy( begin, end, out_iter );
My rcpp function runs just fine:
#include <Rcpp.h>
#include <boost/xpressive/xpressive.hpp>
using namespace Rcpp;
// [[Rcpp::export]]
StringVector testMe(std::string input,std::string uregex) {
boost::xpressive::sregex re = boost::xpressive::sregex::compile(uregex); // find a date
// iterate over the days, months and years in the input
boost::xpressive::sregex_token_iterator begin( input.begin(), input.end(), re ,-1), end;
// write all the words to std::cout
std::ostream_iterator< std::string > out_iter( std::cout, "\n" );
std::copy( begin, end, out_iter );
return("Done");
}
/*** R
testMe("This is a funny sentence"," ")
*/
But all it does is print out the tokens. I am very new to C++ but I understand the idea of making a vector in rcpp with StringVector res(10); (make a vector named res of length 10) which I can then index res[1] = "blah".
My question is - how do I take the output of boost::xpressive::sregex_token_iterator begin( input.begin(), input.end(), re ,-1), end; and store it in a vector so I can return it?
http://www.boost.org/doc/libs/1_54_0/doc/html/xpressive/user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization
Final working Rcpp solution
Including this because my need was Rcpp specific and I had to make some minor changes to the solution provided.
#include <Rcpp.h>
#include <boost/xpressive/xpressive.hpp>
typedef std::vector<std::string> StringVector;
using boost::xpressive::sregex;
using boost::xpressive::sregex_token_iterator;
using Rcpp::List;
void tokenWorker(/*in*/ const std::string& input,
/*in*/ const sregex re,
/*inout*/ StringVector& v)
{
sregex_token_iterator begin( input.begin(), input.end(), re ,-1), end;
// write all the words to v
std::copy(begin, end, std::back_inserter(v));
}
//[[Rcpp::export]]
List tokenize(StringVector t, std::string tok = " "){
List final_res(t.size());
sregex re = sregex::compile(tok);
for(int z=0;z<t.size();z++){
std::string x = "";
for(int y=0;y<t[z].size();y++){
x += t[z][y];
}
StringVector v;
tokenWorker(x, re, v);
final_res[z] = v;
}
return(final_res);
}
/*** R
tokenize("Please tokenize this sentence")
*/
My question is - how do I take the output of boost::xpressive::sregex_token_iterator begin( input.begin(), input.end(), re ,-1), end; and store it in a vector so I can return it?
You're already halfway there.
The missing link is just std::back_inserter
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <iterator>
#include <boost/xpressive/xpressive.hpp>
typedef std::vector<std::string> StringVector;
using boost::xpressive::sregex;
using boost::xpressive::sregex_token_iterator;
void testMe(/*in*/ const std::string& input,
/*in*/ const std::string& uregex,
/*inout*/ StringVector& v)
{
sregex re = sregex::compile(uregex);
sregex_token_iterator begin( input.begin(), input.end(), re ,-1), end;
// write all the words to v
std::copy(begin, end, std::back_inserter(v));
}
int main()
{
std::string input("This is his face");
std::string blank(" ");
StringVector v;
// find white space
testMe(input, blank, v);
std::copy(v.begin(), v.end(),
std::ostream_iterator<std::string>(std::cout, "|"));
std::cout << std::endl;
return 0;
}
output:
This|is|his|face|
I used legacy C++ because you used a regex library from Boost instead of the standard <regex>. Since you are learning C++ right now, you may want to start with C++14 straight away; it would have shortened even this small snippet and made it more expressive.
And here's the C++11 version.
Aside from the benefits of using a standardized <regex>, the <regex>-using version compiles roughly twice as fast as the boost::xpressive version with gcc-4.9 and clang-3.5 (-g -O0 -std=c++11) on a QuadCore-Box running with Debian x86_64 Jessie.
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <iterator>
#include <cstdlib>
//////////////////////////////////////////////////////////////////////////////
// A minimal adaption layer atop boost::xpressive and c++11 std's <regex> //
//--------------------------------------------------------------------------//
// remove the comment sign from the #define if your compiler suite's //
// <regex> implementation is not complete //
//#define USE_REGEX_FALLBACK_33509467 1 //
//////////////////////////////////////////////////////////////////////////////
#if defined(USE_REGEX_FALLBACK_33509467)
#include <boost/xpressive/xpressive.hpp>
using regex = boost::xpressive::sregex;
using sregex_iterator = boost::xpressive::sregex_token_iterator;
auto compile = [] (const std::string& s) {
return boost::xpressive::sregex::compile(s);
};
auto make_sregex_iterator = [] (const std::string& s, const regex& re) {
return sregex_iterator(s.begin(), s.end(), re ,-1);
};
#else // #if !defined(USE_REGEX_FALLBACK_33509467)
#include <regex>
using regex = std::regex;
using sregex_iterator = std::sregex_token_iterator;
auto compile = [] (const std::string& s) {
return regex(s);
};
auto make_sregex_iterator = [] (const std::string& s, const regex& re) {
return std::sregex_token_iterator(s.begin(), s.end(), re, -1);
};
#endif // #if defined(USE_REGEX_FALLBACK_33509467)
//////////////////////////////////////////////////////////////////////////////
typedef std::vector<std::string> StringVector;
StringVector testMe(/*in*/const std::string& input,
/*in*/const std::string& uregex)
{
regex re = compile(uregex);
sregex_iterator begin = make_sregex_iterator(input, re),
end;
return StringVector(begin, end); // copies each token into the vector;
// the vector itself is moved (or the copy elided) on return
}
int main() {
std::string input("This is his face");
std::string blank(" ");
// tokenize by white space
StringVector v = testMe(input, blank);
std::copy(v.begin(), v.end(),
std::ostream_iterator<std::string>(std::cout, "|"));
std::cout << std::endl;
return EXIT_SUCCESS;
}
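If the Boost fallback is not needed, the adaption layer can be dropped entirely. A minimal sketch using only the standard <regex>, assuming a hypothetical function name split_on:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `input` on every match of the pattern `sep`.
std::vector<std::string> split_on(const std::string& input, const std::string& sep)
{
    const std::regex re(sep);
    // The trailing -1 asks for the text between matches, not the matches.
    std::sregex_token_iterator first(input.begin(), input.end(), re, -1), last;
    // Each token converts to std::string when the vector is built.
    return std::vector<std::string>(first, last);
}
```

Calling split_on("This is his face", " ") should give the four words as separate elements, which can then be copied into whatever container the caller has to return.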

Find a substring from a vector using iterators

I am trying to make a search function in my application. If the user inputs a substring (or the complete string) I want to know if that substring matches any of the strings or part of the strings stored in my vector.
The following code is written so far:
cout << "Input word to search for: ";
cin >> searchString;
for (multimap <string, vector<string> >::const_iterator it = contactInformationMultimap.cbegin(); it != contactInformationMultimap.cend(); ++it)
{
for (vector<string>::const_iterator iter = it->second.cbegin(); iter != it->second.cend(); ++iter)
{
if (*iter.find(searchString))
^^^^^^^^^^^^^^^^^^^^^^^ this does not work, if i cout *iter it is the correct word stored in the vector. The problem is that i can not use the find function.
}
}
Anyone having any suggestions?
Unary operators have lower precedence than postfix operators. In your if statement the unary operator * has to be evaluated before the member access operator, so you have to write
if ( ( *iter ).find(searchString) != std::string::npos )
Or you could write
if ( iter->find(searchString) != std::string::npos )
Note that this condition
if ( ( *iter ).find(searchString) )
makes no sense, because find returns a position (std::string::npos when nothing is found), not a boolean.
Also you could write
for (multimap <string, vector<string> >::const_iterator it = contactInformationMultimap.cbegin(); it != contactInformationMultimap.cend(); ++it)
{
for ( const std::string &s : it->second )
{
if ( s.find(searchString ) != std::string::npos ) /*...*/;
}
}
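To see why the result of find has to be compared with std::string::npos rather than used directly as a condition, consider this small sketch. The helper name contains is hypothetical.

```cpp
#include <string>

// Hypothetical helper: true when `needle` occurs anywhere in `text`.
bool contains(const std::string& text, const std::string& needle)
{
    // find() returns the index of the first match, or npos when there is
    // none. Index 0 converts to false and npos (a huge value) converts to
    // true, which is why testing the raw result is misleading.
    return text.find(needle) != std::string::npos;
}
```

For the text "hello", searching for "he" yields index 0 (which would read as false), while searching for "xyz" yields npos (which would read as true), so the raw result gives the opposite answer in both cases.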
The comments have shown how to correct the syntax so your code compiles, but the result is code that I'd still (at least personally) rather avoid. The primary reason for iterators is to allow their use in generic algorithms. In this case, generic algorithms can do the job quite nicely. For example, let's assume that you want to print the key of every record whose associated value contains whatever is in searchString. To do that you could write the code like this:
std::copy_if(data.begin(), data.end(), // The source "range"
std::ostream_iterator<T>(std::cout, "\n"), // the destination "range"
[&](T const &v) {
return std::any_of(v.second.begin(), v.second.end(),
[&](std::string const &s) {
return s.find(searchString) != std::string::npos;
}
);
}
);
This depends on an operator<< for the correct type, something like this:
typedef std::pair < std::string, std::vector<std::string>> T;
namespace std {
std::ostream &operator<<(std::ostream &os, T const &t) {
return os << t.first;
}
}
A complete test program could look like this:
#include <map>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
#include <string>
typedef std::pair < std::string, std::vector<std::string>> T;
namespace std {
std::ostream &operator<<(std::ostream &os, T const &t) {
return os << t.first;
}
}
int main() {
std::multimap<std::string, std::vector<std::string>> data{
{ "key1", { "Able", "Bend", "Cell" } },
{ "key2", { "Are", "Dead" } },
{ "key3", { "Bad", "Call" } }
};
std::string searchString = "a";
std::copy_if(data.begin(), data.end(),
std::ostream_iterator<T>(std::cout, "\n"),
[&](T const &v) {
return std::any_of(v.second.begin(), v.second.end(),
[&](std::string const &s) {
return s.find(searchString) != std::string::npos;
}
);
}
);
}
Result:
key2
key3
You can use: if ( iter->find(searchString) != string::npos )

Obtaining First and Last Chars of Each String From A Vector of Strings

I have created a vector<string> names; which stores people's first names. I want to take each name and create two variables first_letter and last_letter which contain the first and last characters of each name. However I am not quite sure how to get this done since I am just starting with c++. Can anyone explain to me how this can be done, and possibly provide an example?
In C++11 it got easier:
for (std::string& name : names) {
char first_letter = name.front();
char last_letter = name.back();
// do stuff
}
Before that, you'd have to access them directly using operator[]:
for (size_t i = 0; i < names.size(); ++i) {
std::string& name = names[i];
char first_letter = name[0];
char last_letter = name[name.size() - 1];
// do stuff
}
Assuming name is your string and you're OK with using C++11, name.front() and name.back() will work; otherwise dereference the iterators: *name.begin() and *name.rbegin(). You should first check whether the name is empty, though:
if (!name.empty()) {
// Safe to proceed now
}
You can iterate over names like (range loop - since C++11)
for (auto& name : names) {
// Do things with individual name
}
or (for older C++)
for (vector<string>::iterator it = names.begin(); it != names.end(); it++) {
// Do things with individual name (*it)
}
It's advised to use constant iterators where possible: if you're not planning to modify the strings, replace auto with const auto and ::iterator with ::const_iterator.
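Putting the empty check, the constant iterator and the dereferenced begin/rbegin together, a pre-C++11 sketch could look like the following. The function name first_last_letters is hypothetical, and skipping empty names (rather than reporting them) is an assumption made for the example.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects (first, last) characters of every
// non-empty name; empty names are skipped rather than treated as errors.
std::vector<std::pair<char, char> > first_last_letters(const std::vector<std::string>& names)
{
    std::vector<std::pair<char, char> > letters;
    letters.reserve(names.size());
    for (std::vector<std::string>::const_iterator it = names.begin(); it != names.end(); ++it) {
        if (it->empty())
            continue;   // dereferencing begin() of "" would be undefined
        letters.push_back(std::make_pair(*it->begin(), *it->rbegin()));
    }
    return letters;
}
```

A one-character name yields the same character twice, since begin() and rbegin() then refer to the same element.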
Use the string functions front() and back().
Make sure that the string is not empty before using these functions:
Assuming that i is an index into your vector:
if ( !names[i].empty() )
{
char fChar = names[i].front();
char bChar = names[i].back();
}
Create a function to get the two letters from a single string:
std::pair<char, char>
first_and_last(const std::string& s)
{
if (s.length() == 0)
throw std::runtime_error("Empty string!");
return {s.front(), s.back()};
}
(for C++03 return std::make_pair(s[0], s[s.length()-1]) or another of the ways to do it shown by the other answers.)
Then apply that function to each name in turn, saving the results in a new vector:
std::vector<std::pair<char, char>> letters;
letters.reserve(names.size());
std::transform(names.begin(), names.end(), std::back_inserter(letters), first_and_last);
Or use the C++11 range-based for loop:
std::vector<std::pair<char, char>> letters;
letters.reserve(names.size());
for (const auto& name : names)
letters.push_back( first_and_last(name) );
Something like this? There's no error checking, but it's a start.
vector<string> names = ...;
for (vector<string>::iterator i = names.begin(); i != names.end(); ++i)
{
string first_letter = i->substr(0, 1);
string last_letter = i->substr(i->size() - 1, 1);
}
First off, of course, you start with a loop to iterate through the vector.
Then you get those characters with substr. It would look something like this:
vector <string>::iterator it;
for(it = names.begin(); it != names.end(); it++)
{
string first = (*it).substr(0, 1);
string second = (*it).substr((*it).length()-1, 1);
// do whatever you want to
}
Consider the following approach
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <utility>
#include <iterator>
int main()
{
std::vector<std::string> v { "Hello", "NinjaZ", "How", "do", "you", "do" };
for ( const auto &s : v ) std::cout << s << ' ';
std::cout << std::endl;
std::vector<std::pair<char, char>> v2;
v2.reserve( v.size() );
std::transform( v.begin(), v.end(),
std::back_inserter( v2 ),
[]( const std::string &s )
{
return std::make_pair( s.front(), s.back() );
} );
for ( const auto &p : v2 )
{
std::cout << p.first << ' ' << p.second << std::endl;
}
return 0;
}
The output is
Hello NinjaZ How do you do
H o
N Z
H w
d o
y u
d o
Instead of the algorithm std::transform you could use an ordinary range based for loop. For example
for ( const auto &s : v ) v2.push_back( { s.front(), s.back() } );
There are many ways to skin this cat.
Here's another short, readable example using C++11. What this brings to the table is the use of std::vector::emplace_back, which allows for in-place construction of elements, as opposed to move- or copy-constructing. The syntax is also shorter, which is nice.
Say you have a container that stores the pairs of letters.
std::vector<std::pair<char, char>> letters;
Then use this:
for (auto&& name : names)
letters.emplace_back(name.front(), name.back());
If you want to throw on empty name strings, simply add a statement before the std::vector::emplace_back statement:
if (name.empty()) throw std::runtime_error("Empty string!");

Getting the words from a sentence and storing them in a vector of strings

Alright, guys ...
Here's my set that has all the letters. I'm defining a word as consisting of consecutive letters from the set.
const char LETTERS_ARR[] = {"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"};
const std::set<char> LETTERS_SET(LETTERS_ARR, LETTERS_ARR + sizeof(LETTERS_ARR)/sizeof(char));
I was hoping that this function would take in a string representing a sentence and return a vector of strings that are the individual words in the sentence.
std::vector<std::string> get_sntnc_wrds(std::string S) {
std::vector<std::string> retvec;
std::string::iterator it = S.begin();
while (it != S.end()) {
if (LETTERS_SET.count(*it) == 1) {
std::string str(1,*it);
int k(0);
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1) == 1))) {
str.push_back(*(it + (++k)));
}
retvec.push_back(str);
it += k;
}
else {
++it;
}
}
return retvec;
}
For instance, the following call should return a vector of the strings "Yo", "dawg", etc.
std::string mystring("Yo, dawg, I heard you life functions, so we put a function inside your function so you can derive while you derive.");
std::vector<std::string> mystringvec = get_sntnc_wrds(mystring);
But everything isn't going as planned. I tried running my code and it was putting the entire sentence into the first and only element of the vector. My function is very messy code and perhaps you can help me come up with a simpler version. I don't expect you to be able to trace my thought process in my pitiful attempt at writing that function.
Try this instead:
#include <vector>
#include <cctype>
#include <string>
#include <algorithm>
using std::find_if;
using std::string;
using std::vector;
// true if the argument is whitespace, false otherwise
bool space(char c)
{
return isspace(c);
}
// false if the argument is whitespace, true otherwise
bool not_space(char c)
{
return !isspace(c);
}
vector<string> split(const string& str)
{
typedef string::const_iterator iter;
vector<string> ret;
iter i = str.begin();
while (i != str.end())
{
// ignore leading blanks
i = find_if(i, str.end(), not_space);
// find end of next word
iter j = find_if(i, str.end(), space);
// copy the characters in [i, j)
if (i != str.end())
ret.push_back(string(i, j));
i = j;
}
return ret;
}
The split function will return a vector of strings, each element containing one word.
This code is taken from the Accelerated C++ book, so it's not mine, but it works. There are other superb examples of using containers and algorithms for solving every-day problems in this book. I could even get a one-liner to show the contents of a file at the output console. Highly recommended.
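Note that split() above breaks on whitespace only, so punctuation stays attached to the words ("Yo," rather than "Yo"). If a word should be a run of letters, as in the question, the same structure works with different predicates. This is a sketch under that assumption; the names is_alpha, not_alpha and split_letters are hypothetical, and std::isalpha is used in place of the question's hand-built set.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical predicates: the cast avoids passing a negative char to isalpha.
bool is_alpha(char c)  { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
bool not_alpha(char c) { return !is_alpha(c); }

// Hypothetical variant of split(): a word is a maximal run of letters.
std::vector<std::string> split_letters(const std::string& str)
{
    std::vector<std::string> ret;
    std::string::const_iterator i = str.begin();
    while (i != str.end()) {
        // skip everything that is not a letter
        i = std::find_if(i, str.end(), is_alpha);
        // find one past the end of the word that starts at i
        std::string::const_iterator j = std::find_if(i, str.end(), not_alpha);
        // copy the characters in [i, j) unless the range is empty
        if (i != j)
            ret.push_back(std::string(i, j));
        i = j;
    }
    return ret;
}
```

For the input "Yo, dawg, I heard" this should produce the words "Yo", "dawg", "I" and "heard" with the commas dropped.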
It's just a bracketing issue. My advice is to (almost) never put in more brackets than are necessary; it only confuses things.
while (it+k+1 != S.end() && LETTERS_SET.count(*(it+k+1)) == 1) {
Your code compares the character with 1, not the return value of count.
Also, although count does return an integer, in this context I would simplify further and treat the return value as a boolean
while (it+k+1 != S.end() && LETTERS_SET.count(*(it+k+1))) {
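The effect of the misplaced bracket is easier to see in isolation. In the sketch below the helper name is_letter is hypothetical. As far as I can tell, the original loop never stopped because the question's set is built with sizeof, which also counts the terminating '\0' of the array; the lookup of (character == 1), i.e. of false converted to '\0', therefore always succeeded.

```cpp
#include <set>
#include <string>

// Hypothetical helper: true when `c` is one of the characters in `letters`.
bool is_letter(const std::set<char>& letters, char c)
{
    // The comparison sits outside the call: count(c) is evaluated first.
    // Writing letters.count(c == 1) would instead look up the bool result
    // of (c == 1) converted to char, i.e. '\0' or '\1'.
    return letters.count(c) == 1;
}
```

With a set built from the characters of "abc", is_letter returns true for 'a' and false for ','.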
You should use a string stream with std::copy, like so:
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <vector>
int main() {
std::string sentence = "And I feel fine...";
std::istringstream iss(sentence);
std::vector<std::string> split;
std::copy(std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>(),
std::back_inserter(split));
// This is to print the vector
for(auto iter = split.begin();
iter != split.end();
++iter)
{
std::cout << *iter << "\n";
}
}
I would use another, simpler approach based on member functions of class std::string. For example
const char LETTERS[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
std::string s( "This12 34is 56a78 test." );
std::vector<std::string> v;
for ( std::string::size_type first = s.find_first_of( LETTERS, 0 );
first != std::string::npos;
first = s.find_first_of( LETTERS, first ) )
{
std::string::size_type last = s.find_first_not_of( LETTERS, first );
v.push_back(
std::string( s, first, last == std::string::npos ? std::string::npos : last - first ) );
first = last;
}
for ( const std::string &s : v ) std::cout << s << ' ';
std::cout << std::endl;
You made two mistakes here, which I have corrected in the following code.
First, the condition should be
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1)) == 1))
Second, the iterator should move past the word with
it += (k+1);
The corrected code is
std::vector<std::string> get_sntnc_wrds(std::string S) {
std::vector<std::string> retvec;
std::string::iterator it = S.begin();
while (it != S.end()) {
if (LETTERS_SET.count(*it) == 1) {
std::string str(1,*it);
int k(0);
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1)) == 1)) {
str.push_back(*(it + (++k)));
}
retvec.push_back(str);
it += (k+1);
}
else {
++it;
}
}
return retvec;
}
The output has been tested.