I'm trying to extract the numerical data from the mentioned string, however, when using the given pattern, I also miss out on the following string.
pattern : [^\\n)]+
409416,-15.84361,-22.66174,-15.777729,-11.565274,0.184927,2.184308,-2.918847,-1.438143,-1.832789,-2.392894,-2.936923,-1.699626,-1.699626,-0.559298,-0.559298,-0.559298,-0.559298,-0.559298\n0.268223,0.088596,-0.953149,-1.344175,0.197503,4.143355,3.463934,0.289587,-0.063034,0.35563,2.322007,-13.589606,-11.883781,17.186039,-7.376517,10.132304,-4.420093,0.77321,5.358715,3.092631,0.457418,1.67359,4.545597,1.758356,1.758356,0.544843,0.544843,0.544843,0.544843,0.544843\n-4.421537,-2.864239,-3.992804,-2.769629,-0.345838,1.462282,-0.733731,-1.554252,0.376582,5.262342,7.720245,-14.295092,-14.852295,-16.991022,15.644931,14.116446,-4.67732,-6.69726,-0.406152,1.403272,-1.297639,-2.341637,-1.378868,-2.402558,-2.402558,-3.345482,-3.345482,-3.345482,-3.345482,-3.345482\n0.303624,-1.55541,-1.163894,-0.002663,1.203844,0.47408,-1.725865,-1.635311,-0.809665,1.496815,0.127842,2.615432,1.528776,-34.86355,4.610298,1.973559,-2.828502,1.598024,1.195854,0.623229,-1.526112,-0.921527,-0.346238,-0.905547,-0.905547,0.348902,0.348902,0.348902,0.348902,0.348902\n0.03196,-1.725865,-1.523449,-1.086656,-0.183773,0.516694,0.561972,0.292971,-0.183773,-0.002663,-2.048133,-13.026555,-17.415792,29.832436,3.382483,2.988304,-1.811093,0.114525,0.386189,-0.628556,-1.704558,-1.853707,-1.222488,-1.182537,-1.182537,-0.255684,-0.255684,-0.255684,-0.255684,-0.255684\n0.287644,-1.054696,-1.134597,-0.761725,0.109198,0.242367,-0.415486,-0.191763,-0.514031,0.138495,0.596595,4.54904,-4.29602,5.593082,7.870266,2.460956,1.787123,0.70313,-0.258347,0.103872,-0.26101,-0.058594,0.189099,0.713784,0.713784,-0.114525,-0.114525,-0.114525,-0.114525,-0.114525',Badminton_Smash
Required: string without \n.
Link: https://regex101.com/r/sMtFzd/1
Use splitting with
\\n|\)
See proof.
C++ supports splitting, see Split a string using C++11.
Use
#include <iostream>
#include <regex>
using namespace std;
std::vector<std::string> split(const string& input, const string& regex) {
// passing -1 as the submatch index parameter performs splitting
std::regex re(regex);
std::sregex_token_iterator
first{input.begin(), input.end(), re, -1},
last;
return {first, last};
}
int main() {
std::string input("409416,-15.84361,-22.66174,-15.777729,-11.565274,0.184927,2.184308,-2.918847,-1.438143,-1.832789,-2.392894,-2.936923,-1.699626,-1.699626,-0.559298,-0.559298,-0.559298,-0.559298,-0.559298\n0.268223,0.088596,-0.953149,-1.344175,0.197503,4.143355,3.463934,0.289587,-0.063034,0.35563,2.322007,-13.589606,-11.883781,17.186039,-7.376517,10.132304,-4.420093,0.77321,5.358715,3.092631,0.457418,1.67359,4.545597,1.758356,1.758356,0.544843,0.544843,0.544843,0.544843,0.544843\n-4.421537,-2.864239,-3.992804,-2.769629,-0.345838,1.462282,-0.733731,-1.554252,0.376582,5.262342,7.720245,-14.295092,-14.852295,-16.991022,15.644931,14.116446,-4.67732,-6.69726,-0.406152,1.403272,-1.297639,-2.341637,-1.378868,-2.402558,-2.402558,-3.345482,-3.345482,-3.345482,-3.345482,-3.345482\n0.303624,-1.55541,-1.163894,-0.002663,1.203844,0.47408,-1.725865,-1.635311,-0.809665,1.496815,0.127842,2.615432,1.528776,-34.86355,4.610298,1.973559,-2.828502,1.598024,1.195854,0.623229,-1.526112,-0.921527,-0.346238,-0.905547,-0.905547,0.348902,0.348902,0.348902,0.348902,0.348902\n0.03196,-1.725865,-1.523449,-1.086656,-0.183773,0.516694,0.561972,0.292971,-0.183773,-0.002663,-2.048133,-13.026555,-17.415792,29.832436,3.382483,2.988304,-1.811093,0.114525,0.386189,-0.628556,-1.704558,-1.853707,-1.222488,-1.182537,-1.182537,-0.255684,-0.255684,-0.255684,-0.255684,-0.255684\n0.287644,-1.054696,-1.134597,-0.761725,0.109198,0.242367,-0.415486,-0.191763,-0.514031,0.138495,0.596595,4.54904,-4.29602,5.593082,7.870266,2.460956,1.787123,0.70313,-0.258347,0.103872,-0.26101,-0.058594,0.189099,0.713784,0.713784,-0.114525,-0.114525,-0.114525,-0.114525,-0.114525',Badminton_Smash");
std::string rgx("\\\\n|\\)");
for (auto const c : split(input, rgx)) {
std::cout << c << "\n";
}
return 0;
}
See C++ proof.
I got a string and I want to remove all the punctuations from it. How do I do that? I did some research and found that people use the ispunct() function (I tried that), but I cant seem to get it to work in my code. Anyone got any ideas?
#include <string>
int main() {
string text = "this. is my string. it's here."
if (ispunct(text))
text.erase();
return 0;
}
Using algorithm remove_copy_if :-
string text,result;
std::remove_copy_if(text.begin(), text.end(),
std::back_inserter(result), //Store output
std::ptr_fun<int, int>(&std::ispunct)
);
POW already has a good answer if you need the result as a new string. This answer is how to handle it if you want an in-place update.
The first part of the recipe is std::remove_if, which can remove the punctuation efficiently, packing all the non-punctuation as it goes.
std::remove_if (text.begin (), text.end (), ispunct)
Unfortunately, std::remove_if doesn't shrink the string to the new size. It can't because it has no access to the container itself. Therefore, there's junk characters left in the string after the packed result.
To handle this, std::remove_if returns an iterator that indicates the part of the string that's still needed. This can be used with strings erase method, leading to the following idiom...
text.erase (std::remove_if (text.begin (), text.end (), ispunct), text.end ());
I call this an idiom because it's a common technique that works in many situations. Other types than string provide suitable erase methods, and std::remove (and probably some other algorithm library functions I've forgotten for the moment) take this approach of closing the gaps for items they remove, but leaving the container-resizing to the caller.
#include <string>
#include <iostream>
#include <cctype>
int main() {
std::string text = "this. is my string. it's here.";
for (int i = 0, len = text.size(); i < len; i++)
{
if (ispunct(text[i]))
{
text.erase(i--, 1);
len = text.size();
}
}
std::cout << text;
return 0;
}
Output
this is my string its here
When you delete a character, the size of the string changes. It has to be updated whenever deletion occurs. And, you deleted the current character, so the next character becomes the current character. If you don't decrement the loop counter, the character next to the punctuation character will not be checked.
ispunct takes a char value not a string.
you can do like
for (auto c : string)
if (ispunct(c)) text.erase(text.find_first_of(c));
This will work but it is a slow algorithm.
Pretty good answer by Steve314.
I would like to add a small change :
text.erase (std::remove_if (text.begin (), text.end (), ::ispunct), text.end ());
Adding the :: before the function ispunct takes care of overloading .
The problem here is that ispunct() takes one argument being a character, while you are trying to send a string. You should loop over the elements of the string and erase each character if it is a punctuation like here:
for(size_t i = 0; i<text.length(); ++i)
if(ispunct(text[i]))
text.erase(i--, 1);
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
int main() {
string str = "this. is my string. it's here.";
transform(str.begin(), str.end(), str.begin(), [](char ch)
{
if( ispunct(ch) )
return '\0';
return ch;
});
}
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;//string is defined here.
cout << "Please enter a string with punctuation's: " << endl;//Asking for users input
getline(cin, s);//reads in a single string one line at a time
/* ERROR Check: The loop didn't run at first because a semi-colon was placed at the end
of the statement. Remember not to add it for loops. */
for(auto &c : s) //loop checks every character
{
if (ispunct(c)) //to see if its a punctuation
{
c=' '; //if so it replaces it with a blank space.(delete)
}
}
cout << s << endl;
system("pause");
return 0;
}
Another way you could do this would be as follows:
#include <ctype.h> //needed for ispunct()
string onlyLetters(string str){
string retStr = "";
for(int i = 0; i < str.length(); i++){
if(!ispunct(str[i])){
retStr += str[i];
}
}
return retStr;
This ends up creating a new string instead of actually erasing the characters from the old string, but it is a little easier to wrap your head around than using some of the more complex built in functions.
I tried to apply #Steve314's answer but couldn't get it to work until I came across this note here on cppreference.com:
Notes
Like all other functions from <cctype>, the behavior of std::ispunct
is undefined if the argument's value is neither representable as
unsigned char nor equal to EOF. To use these functions safely with
plain chars (or signed chars), the argument should first be converted
to unsigned char.
By studying the example it provides, I am able to make it work like this:
#include <string>
#include <iostream>
#include <cctype>
#include <algorithm>
int main()
{
std::string text = "this. is my string. it's here.";
std::string result;
text.erase(std::remove_if(text.begin(),
text.end(),
[](unsigned char c) { return std::ispunct(c); }),
text.end());
std::cout << text << std::endl;
}
Try to use this one, it will remove all the punctuation on the string in the text file oky.
str.erase(remove_if(str.begin(), str.end(), ::ispunct), str.end());
please reply if helpful
i got it.
size_t found = text.find('.');
text.erase(found, 1);
I'm looking for an elegant way to transform an std::string from something like:
std::string text = " a\t very \t ugly \t\t\t\t string ";
To:
std::string text = "a very ugly string";
I've already trimmed the external whitespace with boost::trim(text);
[edit]
Thus, multiple whitespaces, and tabs, are reduced to just one space
[/edit]
Removing the external whitespace is trivial. But is there an elegant way of removing the internal whitespace that doesn't involve manual iteration and comparison of previous and next characters? Perhaps something in boost I have missed?
You can use std::unique with std::remove along with ::isspace to compress multiple whitespace characters into single spaces:
std::remove(std::unique(std::begin(text), std::end(text), [](char c, char c2) {
return ::isspace(c) && ::isspace(c2);
}), std::end(text));
std::istringstream iss(text);
text = "";
std::string s;
while(iss >> s){
if ( text != "" ) text += " " + s;
else text = s;
}
//use text, extra whitespaces are removed from it
Most of what I'd do is similar to what #Nawaz already posted -- read strings from an istringstream to get the data without whitespace, and then insert a single space between each of those strings. However, I'd use an infix_ostream_iterator from a previous answer to get (IMO) slightly cleaner/clearer code.
std::istringstream buffer(input);
std::copy(std::istream_iterator<std::string>(buffer),
std::istream_iterator<std::string>(),
infix_ostream_iterator<std::string>(result, " "));
#include <boost/algorithm/string/trim_all.hpp>
string s;
boost::algorithm::trim_all(s);
If you check out https://svn.boost.org/trac/boost/ticket/1808, you'll see a request for (almost) this exact functionality, and a suggested implementation:
std::string trim_all ( const std::string &str ) {
return boost::algorithm::find_format_all_copy(
boost::trim_copy(str),
boost::algorithm::token_finder (boost::is_space(),boost::algorithm::token_compress_on),
boost::algorithm::const_formatter(" "));
}
Here is a possible version using regular expressions. My GCC 4.6 doesn't have regex_replace yet, but Boost.Regex can serve as a drop-in replacement:
#include <string>
#include <iostream>
// #include <regex>
#include <boost/regex.hpp>
#include <boost/algorithm/string/trim.hpp>
int main() {
using namespace std;
using namespace boost;
string text = " a\t very \t ugly \t\t\t\t string ";
trim(text);
regex pattern{"[[:space:]]+", regex_constants::egrep};
string result = regex_replace(text, pattern, " ");
cout << result << endl;
}