boost::regex matches[] c++ - regex

I have code that returns all matches of a regex, but I can't access the matches individually. For example, I want to print just the second result, like match[1] or match[2].
std::string text("5345345345334 456456454353");
boost::regex regex("[0-9]{10}");
boost::sregex_token_iterator iter(text.begin(), text.end(),regex,0);
boost::sregex_token_iterator end;
for( ; iter != end; ++iter ) {
std::cout<<*iter<<'\n';
}
Is there any way to convert *iter to something like matches[] so I can use each match individually? Thanks.

Here is the correct answer:
Just create a std::vector from the iterators you already have:
std::vector<std::string> v(iter, end);
Full code:
#include <stdlib.h>
#include <string>
#include <iostream>
#include <boost/regex.hpp>
#include <vector>
int main(int argc, char* argv[]) {
std::string text("5345345345334 456456454353");
boost::regex regex("[0-9]{10}");
boost::sregex_token_iterator iter(text.begin(), text.end(),regex,0);
boost::sregex_token_iterator end;
std::vector<std::string> v(iter, end);
std::cout << "0: " << v[0] << ", 1: " << v[1] << std::endl;
return EXIT_SUCCESS;
}
And the output is:
0: 5345345345, 1: 4564564543
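The same idea can be sketched with the standard library instead of Boost. This is a minimal sketch, assuming C++11 std::regex is an acceptable substitute for boost::regex; the helper name all_matches is hypothetical and does not come from either library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect every full match of `pattern` in `text` into a vector so that
// individual matches can be accessed by index afterwards.
// Note: this uses std::regex (C++11) rather than boost::regex.
// `all_matches` is a hypothetical helper name.
std::vector<std::string> all_matches(const std::string& text,
                                     const std::regex& pattern) {
    // Submatch index 0 selects the whole match, as in the Boost code above.
    std::sregex_token_iterator first(text.begin(), text.end(), pattern, 0);
    std::sregex_token_iterator last;  // default-constructed = end of sequence
    return std::vector<std::string>(first, last);
}
```

Before indexing the result (v[1], v[2], ...), it is worth checking v.size(), because indexing past the end of a vector is undefined behaviour when the text contains fewer matches than expected.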

Related

Is there a way I can remove a character from a string?

I want to remove a character ('#') from a string.
I tried to check whether the string contains '#' with the find function, which it does, and then erase it with the erase function.
For some reason I get a runtime error that seems to say I have no memory.
Error: std::out_of_range at memory location 0x003BF3B4
#include <iostream>
#include <algorithm>
#include <string>
int main()
{
std::string str = "Hello World#";
if (str.find('#')) {
str.erase('#');
}
return 0;
}
The expected output is: "Hello World"
Try something like this:
#include <algorithm>
str.erase(std::remove(str.begin(), str.end(), '#'), str.end());
#include <iostream>
#include <string>
#include <algorithm>
int main()
{
std::string s = "Hello World#";
char c = '#';
/* Removing character c from s */
s.erase(std::remove(s.begin(), s.end(), c), s.end());
std::cout << "\nString after removing the character "
<< c << " : " << s;
}
If you want to delete all '#' from the string:
std::string str = "Hello World#";
std::size_t pos = str.find('#');
while(pos != std::string::npos)
{
str.erase(pos, 1); // erase exactly one character at pos
pos = str.find('#');
}
std::cout << str;
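For context on why the code in the question throws: str.find('#') returns an index, not a bool, and that index is non-zero both when the character is found past position 0 and when it is not found at all (std::string::npos), so the if does not test for presence. str.erase('#') then converts the character to an integer (35 in ASCII) and uses it as a start index, which is beyond the end of the 12-character string, hence std::out_of_range. The erase-remove idiom from the answers can be wrapped as below; remove_char is a hypothetical helper name used only for illustration.

```cpp
#include <algorithm>
#include <string>

// Return a copy of `input` with every occurrence of `ch` removed,
// using the erase-remove idiom from the answers above.
// `remove_char` is a hypothetical helper name, not a standard function.
std::string remove_char(std::string input, char ch) {
    // std::remove shifts the kept characters to the front and returns an
    // iterator to the new logical end; erase then drops the leftover tail.
    input.erase(std::remove(input.begin(), input.end(), ch), input.end());
    return input;
}
```

Taking the string by value keeps the caller's string untouched; to modify in place, the single erase line can be applied directly to the original string.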

Why does Boost.Regex not find all results as expected?

I have a C++ sample and I want to find all query parameters inside a relative URI
(like: /class?class_id=-1&course_ref=1&student_ref=2&score_ref=1). If it worked, I would get all of the results ("class_id=-1", "course_ref=1", "student_ref=2", "score_ref=1"), but only "course_ref=1" was found! Here's my code:
#include <iostream>
#include <boost/regex.hpp>
int main() {
std::string url = "/class?class_id=-1&course_ref=1&student_ref=2&score_ref=1";
const boost::regex queries_pattern("(?<=(\?|\&))[a-zA-Z0-9_=-]+");
boost::smatch queries_result;
boost::regex_search(url, queries_result, queries_pattern);
std::string results("");
for (unsigned int i = 0; i <= queries_result.size(); i++) {
if (!queries_result[i].str().empty())
std::cout << queries_result[i] << std::endl;
}
std::cin.get();
}
I also tried other regex patterns (without look-behind), but none of them worked. I also tested std::regex and Boost.Xpressive, and no results were extracted.
Does anyone know why this fails? Or is there a different solution? Thanks.
I don't know why, but I must loop over an iterator rather than read the results object directly. Here's the working code:
#include <iostream>
#include <boost/regex.hpp>
int main() {
std::string url = "/class?class_id=-1&course_ref=1&student_ref=2&score_ref=1";
const boost::regex pattern("[a-zA-Z0-9_=-]+((?=&)|(?=$))");
boost::sregex_token_iterator iter(url.begin(), url.end(), pattern, 0);
boost::sregex_token_iterator end;
for (; iter != end; ++iter) {
std::cout << *iter << '\n';
}
std::cin.get();
}
Thank you "the fourth bird" for your correct point.
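As background to the fix above: regex_search stops at the first match, and the indices of the resulting smatch refer to the sub-expressions (capture groups) of that single match, not to successive matches. Visiting every match therefore needs an iterator. Below is a minimal sketch of that idea, assuming std::regex in place of boost::regex and a simpler pattern without look-around; the function name parse_query and the key/value split are illustrative assumptions, not part of the original code.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Extract key/value pairs from the query part of a URL.
// `parse_query` is a hypothetical helper; it uses std::regex, not Boost.
std::vector<std::pair<std::string, std::string>>
parse_query(const std::string& url) {
    // '?' or '&', then a key (group 1), '=', then a value (group 2).
    static const std::regex param("[?&]([^&=]+)=([^&]*)");
    std::vector<std::pair<std::string, std::string>> result;
    // Each step of the iterator is one complete match; the groups of that
    // match are read with operator[] on the match object.
    for (std::sregex_iterator it(url.begin(), url.end(), param), end;
         it != end; ++it) {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

This sketch does no percent-decoding and ignores parameters without an '=' sign, so it is not a full URL parser.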

how to iterate all regex matches in a std::string with their starting positions in c++11 std::regex?

I know two ways of getting regex matches from std::string, but I don't know how to get all matches with their respective offsets.
#include <string>
#include <iostream>
#include <regex>
int main() {
using namespace std;
string s = "123 apples 456 oranges 789 bananas oranges bananas";
regex r = regex("[a-z]+");
const sregex_token_iterator end;
// here I know how to get all occurrences
// but don't know how to get starting offset of each one
for (sregex_token_iterator i(s.cbegin(), s.cend(), r); i != end; ++i) {
cout << *i << endl;
}
cout << "====" << endl;
// and here I know how to get position
// but the code is finding only first match
smatch m;
regex_search ( s, m, r );
for (unsigned i=0; i< m.size(); ++i) {
cout << m.position(i) << endl;
cout << m[i] << endl;
}
}
First of all, why token iterator? You don't have any marked subexpressions to iterate over.
Second, position() is a member function of a match, so:
#include <string>
#include <iostream>
#include <regex>
int main() {
std::string s = "123 apples 456 oranges 789 bananas oranges bananas";
std::regex r("[a-z]+");
for(std::sregex_iterator i = std::sregex_iterator(s.begin(), s.end(), r);
i != std::sregex_iterator();
++i )
{
std::smatch m = *i;
std::cout << m.str() << " at position " << m.position() << '\n';
}
}
live at coliru: http://coliru.stacked-crooked.com/a/492643ca2b6c5dac
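If the offsets are needed after the loop, the same iteration can store them. This is a small sketch built on the answer's approach; the helper name matches_with_offsets and the choice of a vector of pairs are assumptions made for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Return (starting offset, matched text) for every match of `pattern`.
// `matches_with_offsets` is a hypothetical helper name.
std::vector<std::pair<std::size_t, std::string>>
matches_with_offsets(const std::string& text, const std::regex& pattern) {
    std::vector<std::pair<std::size_t, std::string>> out;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        // position() with no argument is the offset of the whole match
        // measured from the start of the searched range.
        out.emplace_back(static_cast<std::size_t>(it->position()), it->str());
    }
    return out;
}
```

The matched text is copied into the result, so the returned vector stays valid even after the searched string goes away.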

string iterator incompatible for reading each line

I have a std::ostringstream.
I would like to iterate over each line of this std::ostringstream.
I use boost::tokenizer:
std::ostringstream HtmlStream;
.............
typedef boost::tokenizer<boost::char_separator<char> > line_tokenizer;
line_tokenizer tok(HtmlStream.str(), boost::char_separator<char>("\n\r"));
for (line_tokenizer::const_iterator i = tok.begin(), end = tok.end(); i != end; ++i)
{
std::string str = *i;
}
On the line
for (line_tokenizer::const_iterator i = tok.begin(), end = tok.end(); i != end;
I get an assertion failure with the message "string iterator incompatible".
I have read about this error on Google and on Stack Overflow too, but I have difficulty finding my mistake.
Could anyone help me, please?
Thanks a lot,
Best regards,
Nixeus
I like to make it non-copying for efficiency/error reporting:
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>
#include <iostream>
#include <vector>
int main()
{
auto const& s = "hello\r\nworld";
std::vector<boost::iterator_range<char const*>> lines;
boost::split(lines, s, boost::is_any_of("\r\n"), boost::token_compress_on);
for (auto const& range : lines)
{
std::cout << "at " << (range.begin() - s) << ": '" << range << "'\n";
}
}
Prints:
at 0: 'hello'
at 7: 'world'
This is more efficient than most of the alternatives shown. Of course, if you need more parsing capabilities, consider Boost Spirit:
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
int main()
{
std::string const s = "hello\r\nworld";
std::vector<std::string> lines;
{
using namespace boost::spirit::qi;
auto f(std::begin(s)),
l(std::end(s));
bool ok = parse(f, l, *(char_-eol) % eol, lines);
}
for (auto const& range : lines)
{
std::cout << "'" << range << "'\n";
}
}
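A likely cause of the assertion in the question, offered here as an assumption since the answer does not spell it out: HtmlStream.str() returns a temporary std::string, and boost::tokenizer keeps iterators into the container it is given rather than a copy of it, so the iterators dangle once that statement ends. Storing the result of str() in a named std::string before building the tokenizer should avoid that. For plain line splitting, a standard-library-only sketch is shown below; split_lines is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split `text` into lines using only the standard library.
// `split_lines` is a hypothetical helper name.
std::vector<std::string> split_lines(const std::string& text) {
    std::istringstream in(text);  // owns its own copy of the text
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        // Drop a trailing '\r' so that "\r\n" line endings are handled too.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        lines.push_back(line);
    }
    return lines;
}
```

Unlike the tokenizer and token_compress_on versions above, this keeps empty lines, and it copies each line, so it trades some efficiency for simplicity.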

find pattern in string regular expression c++

I have the following string: E:\501_Document_60_1_R.xml
I am trying to find the pattern "_R".
I am using the following: boost::regex rgx("[R]");
But it's not working: "Empty Match"
Thank you.
Code:
vector<string> findMono(string s)
{
vector<string> vec;
boost::regex rgx("[R]");
boost::smatch match;
boost::sregex_iterator begin {s.begin(), s.end(), rgx},
end {};
for (boost::sregex_iterator& i = begin; i != end; ++i)
{
boost::smatch m = *i;
vec.push_back(m.str());
}
return vec;
}
int main()
{
vector<string> m = findMono("E:\\501_Document_60_1_R.xml");
if(m.size() > 0) cout << "Match" << endl;
else cout << "No Match" << endl;
return 0;
}
As we discussed in the comments, "_R" will technically work for your regular expression given your current data set.
However, I'd strongly consider something more sophisticated to avoid running into problems in the event that your paths contain the sequence "_R" elsewhere. It's fairly easy to protect yourself against that problem, it's good general practice, and it will most likely avoid bugs in the future.
Here is a very basic working example:
#include <iostream>
#include <string>
#include <vector>
#include <boost/regex.hpp>
std::vector<std::string> findMono(const std::string& path)
{
boost::regex rgx("_R");
boost::sregex_iterator begin {path.begin(), path.end(), rgx}, end {};
std::vector<std::string> matches;
for (boost::sregex_iterator& i = begin; i != end; ++i) {
matches.push_back((*i).str());
}
return matches;
}
int main(int argc, char * argv[])
{
const std::string path = "E:\\501_Document_60_1_R.xml";
const std::vector<std::string>& matches = findMono(path);
for (const auto& match : matches) {
std::cout << match << std::endl;
}
return 0;
}
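One way to make the pattern stricter, as suggested above, is to require that "_R" sits immediately before the file extension. The sketch below rests on that assumption about the file naming scheme, which the question does not confirm; it uses std::regex rather than boost::regex, and has_r_suffix is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Return true when "_R" appears directly before the final extension,
// e.g. "..._R.xml". Assumption: the marker is always in that position.
// `has_r_suffix` is a hypothetical helper; it uses std::regex, not Boost.
bool has_r_suffix(const std::string& path) {
    // "_R", a literal dot, then an extension with no dot or path separator,
    // anchored at the end of the string.
    static const std::regex marker(R"(_R\.[^./\\]+$)");
    return std::regex_search(path, marker);
}
```

With this pattern, a directory or file stem that merely contains "_R" somewhere earlier in the path no longer counts as a match.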