I am trying to use the Boost regular expressions module to extract the numbers from character strings of this format: "{ 12354,21354, 123 }"
I wrote the following code for this. Because the operation runs inside a loop, the string is available as it->c_str():
boost::cmatch matches;
boost::regex reNumber("-*[0-9.]+");
boost::regex reFiniteValues(" *\\{.*\\} *");
std::cout << "\ttesting for finite values" << std::endl;
if (boost::regex_match(it->c_str(), matches, reFiniteValues))
{
boost::regex_search(it->c_str(), matches, reNumber);
std::cout << "matches.size(): " << matches.size() << std::endl;
for(unsigned int i = 0; i < matches.size(); ++i)
{
std::cout << matches[i] << std::endl;
}
if (matches.size() > 0)
{
std::cout << "\tpattern found" << std::endl;
continue;
}
}
However, the size of the matches object is 1, and it contains only 12354 in this example. How can I retrieve all the numbers from the string?
You could call regex_search() in a loop, restarting each search where the previous match ended:
typedef std::string::const_iterator SITR;
std::string str = it->c_str();
SITR start = str.begin();
SITR end = str.end();
boost::smatch m;
while ( boost::regex_search (start, end, m, reNumber ) )
{
std::cout << m[0].str() << std::endl;
start = m[0].second;
}
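For comparison, here is a minimal sketch of the same idea with a regex iterator. It uses std::regex from the standard library in place of Boost.Regex (boost::sregex_iterator is used the same way). The function name extract_numbers and the tightened pattern are my own assumptions, not part of the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in `text` as a double.
// Assumption: numbers look like "123", "-12" or "3.5"; exponents are not handled.
std::vector<double> extract_numbers(const std::string& text) {
    static const std::regex number_re("-?[0-9]+(\\.[0-9]+)?");
    std::vector<double> numbers;
    // Each step of the iterator yields the next non-overlapping match.
    for (std::sregex_iterator it(text.begin(), text.end(), number_re), end;
         it != end; ++it) {
        numbers.push_back(std::stod(it->str()));
    }
    return numbers;
}
```

Calling extract_numbers("{ 12354,21354, 123 }") should give a vector holding the three values as doubles, so no second conversion pass is needed.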
I have a C++ vector containing about 106 thousand words, stored in vector<string> words. I need to find the longest word in this vector and also its position (for example 1, 2 or 3), because I have two other vectors that hold the meaning and the type of each word: vector<string> definitions and vector<string> types.
My current code, which does not work at all:
copy_if(words.begin(), words.end(), back_inserter(length), [](const string& x) { return x.length() > 40; });// collects words longer than 40 letters
for (const string& s : length)
{
cout << "found!!" << endl;
auto i = find(words.begin(), words.end(), s);//looks for the word in the words vector
if (i != words.end())
{
auto pos = i - words.begin();
//displays the word, its type and its definition
cout << "Word : " << words[pos] << '\n';
cout << "Type : " << types[pos] << '\n';
cout << "Definition: " << definitions[pos] << '\n';
cout << '\n';
}
else
cout << "word not found" << endl;
}
You could use the standard algorithm std::max_element to search through the vector<string>.
Example:
#include <algorithm> // max_element
#include <iostream>
#include <iterator> // distance
#include <string>
#include <vector>
int main() {
std::vector<std::string> words{"a", "bb", "ccc"};
auto it = std::max_element(words.begin(), words.end(),
[](const auto& a, const auto& b) {
return a.size() < b.size();
});
std::cout << "The longest word is " << *it << " at (zero-based) pos "
<< std::distance(words.begin(), it) << '\n';
}
Output:
The longest word is ccc at (zero-based) pos 2
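Since the question also needs the position for the parallel vectors, here is a small sketch that wraps the call above and returns an index. The name longest_word_index is hypothetical; definitions and types are the vectors from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: index of the first longest word.
// For an empty vector it returns words.size() (that is, 0), so check for that.
std::size_t longest_word_index(const std::vector<std::string>& words) {
    auto it = std::max_element(words.begin(), words.end(),
                               [](const std::string& a, const std::string& b) {
                                   return a.size() < b.size();
                               });
    return static_cast<std::size_t>(std::distance(words.begin(), it));
}
```

With pos = longest_word_index(words), the matching entries are definitions[pos] and types[pos], provided all three vectors have the same length. If several words tie for the longest, std::max_element returns the first of them.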
I would prefer to keep it simple: check the length of each element by index and update the record as you go.
// words is the vector<string> from the question
size_t max_length = 0; // the length of the longest word(s)
std::vector<size_t> max_indice; // the indices of the longest word(s)
for (size_t i = 0; i < words.size(); i++) {
size_t this_len = words[i].length();
if (this_len > max_length) {
// new record
max_length = this_len;
max_indice.clear();
max_indice.push_back(i);
} else if (this_len == max_length) {
// tie
max_indice.push_back(i);
}
}
for (size_t pos : max_indice) {
cout << "Word : " << words[pos] << '\n';
cout << "Type : " << types[pos] << '\n';
cout << "Definition: " << definitions[pos] << '\n';
cout << '\n';
}
I have a regex function that parses a URL request and finds matches for an IP and port pattern. I want to push these matches into a vector and then print them. The size of the vector is printed, but nothing appears when I iterate through the vector and print the elements.
code:
std::vector<std::string> matchVector;
std::smatch m;
std::regex e ("\\/([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3})\\:?([0-9]{1,5})");
while (std::regex_search (requestURL,m,e))
{
for (auto x:m)
{
std::stringstream ss;
ss << x;
std::string str = ss.str();
matchVector.push_back(str);
std::cout << "match " << str << " ";
}
std::cout << std::endl;
requestURL = m.suffix().str();
}
std::cout << "print vector of size : " << matchVector.size()<< '\n';
//this is where nothing prints to the screen
for (std::size_t i = 0; i < matchVector.size(); i++)
{
std::cout << matchVector[i];
}
current output:
match /192.xxx.111.xxx:8080 match 192.xxx.111.xxx match 8080
print vector of size : 3
std::cout is buffered, so text you have written to it is not necessarily visible on the terminal yet. Try flushing std::cout after your print loop:
std::cout << std::flush;
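A sketch of how the whole thing could look with the flush in place. It is only an illustration: collect_matches and print_matches are hypothetical names, the pattern is the one from the question without the redundant escapes, and sub_match::str() replaces the stringstream round trip.

```cpp
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: gathers the full match and both capture groups
// for every "/ip:port" occurrence in the request URL.
std::vector<std::string> collect_matches(const std::string& request_url) {
    static const std::regex endpoint_re(
        "/([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}):?([0-9]{1,5})");
    std::vector<std::string> found;
    for (std::sregex_iterator it(request_url.begin(), request_url.end(), endpoint_re), end;
         it != end; ++it) {
        for (const auto& sub : *it) {
            found.push_back(sub.str());  // sub_match converts to std::string directly
        }
    }
    return found;
}

// Prints one element per line and flushes once at the end.
void print_matches(const std::vector<std::string>& found) {
    for (const auto& s : found) {
        std::cout << s << '\n';
    }
    std::cout << std::flush;
}
```

Iterating with std::sregex_iterator also avoids reassigning requestURL inside the loop, so the original string stays intact.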
Here is my code:
std::string var = "(1,2)";
std::smatch match;
std::regex rgx("[0-9]+");
if(std::regex_search(var,match,rgx))
for (size_t i = 0; i < match.size(); ++i)
std::cout << i << ": " << match[i] << '\n';
I want to extract both 1 and 2, but so far the output is just the first match (1). I can't figure out why. It's probably something obvious.
The elements of the match object (std::smatch) are the capture groups of a single match, not successive matches: element 0 is the whole match and elements 1..n are the parenthesized groups within the regex.
In a slightly modified example
std::string var = "(11b,2x)";
std::smatch match;
std::regex rgx("([0-9]+)([a-z])");
if(std::regex_search(var,match,rgx))
for (size_t i = 0; i < match.size(); ++i)
std::cout << i << ": " << match[i] << '\n';
You'd get the following output:
0: 11b
1: 11
2: b
What you want is to use std::regex_iterator to go over all the matches:
auto b = std::sregex_iterator(var.cbegin(), var.cend(), rgx);
auto e = std::sregex_iterator();
std::for_each(b, e, [](std::smatch const& m){
std::cout << "match: " << m.str() << std::endl;
});
This will yield the desired output:
match: 1
match: 2
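The two ideas combine naturally: iterate over the matches, and index into each match for its groups. A minimal sketch using the modified pattern from above (the function name number_letter_pairs is made up for illustration):

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: for every match of ([0-9]+)([a-z]),
// stores group 1 (the digits) and group 2 (the letter) as a pair.
std::vector<std::pair<std::string, std::string>>
number_letter_pairs(const std::string& text) {
    static const std::regex rgx("([0-9]+)([a-z])");
    std::vector<std::pair<std::string, std::string>> pairs;
    for (std::sregex_iterator it(text.begin(), text.end(), rgx), end;
         it != end; ++it) {
        const std::smatch& m = *it;  // one match; m[0] is the whole match
        pairs.emplace_back(m[1].str(), m[2].str());
    }
    return pairs;
}
```

For the input "(11b,2x)" this should produce the pairs ("11", "b") and ("2", "x").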
I feel like this is a pretty basic question, but I did not find a post for it. If you know of one, please link it below.
What I'm trying to do is look through a string and extract the numbers in groups of two.
Here is my code:
int main() {
string line = "P112233";
boost::regex e ("P([0-9]{2}[0-9]{2}[0-9]{2})");
boost::smatch match;
if (boost::regex_search(line, match, e))
{
boost::regex f("([0-9]{2})"); //finds 11
boost::smatch match2;
line = match[0];
if (boost::regex_search(line, match2, f))
{
float number1 = boost::lexical_cast<float>(match2[0]);
cout << number1 << endl; // this works and prints out 11.
}
boost::regex g(" "); // here I want it to find the 22
boost::smatch match3;
if (boost::regex_search(line, match3, g))
{
float number2 = boost::lexical_cast<float>(match3[0]);
cout << number2 << endl;
}
boost::regex h(" "); // here I want it to find the 33
boost::smatch match4;
if (boost::regex_search(line, match4, h))
{
float number3 = boost::lexical_cast<float>(match4[0]);
cout << number3 << endl;
}
}
else
cout << "found nothing"<< endl;
return 0;
}
I was able to get the first number, but I have no idea how to get the second (22) and third (33).
What is the proper expression to use?
As @Cornstalks mentioned, you need to use three capture groups, which you then access like this:
int main()
{
std::string line = "P112233";
boost::regex e("P([0-9]{2})([0-9]{2})([0-9]{2})");
boost::smatch match;
if (boost::regex_search(line, match, e))
{
std::cout << match[0] << std::endl; // prints the whole match
std::cout << match[1] << ", " << match[2] << ", " << match[3] << std::endl;
}
return 0;
}
Output:
P112233
11, 22, 33
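If you need the values as numbers rather than strings, the groups can be converted right after the match. This is only a sketch: it uses std::regex instead of Boost.Regex so that it has no external dependency, and parse_pairs is a hypothetical name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the three two-digit groups as ints,
// or an empty vector if the line does not contain the pattern.
std::vector<int> parse_pairs(const std::string& line) {
    static const std::regex re("P([0-9]{2})([0-9]{2})([0-9]{2})");
    std::smatch m;
    if (!std::regex_search(line, m, re)) {
        return {};
    }
    // m[0] is the whole match; m[1]..m[3] are the capture groups.
    return {std::stoi(m[1].str()), std::stoi(m[2].str()), std::stoi(m[3].str())};
}
```

For "P112233" this should return 11, 22 and 33. With Boost the same structure applies, with boost::lexical_cast<int> as one possible conversion.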
I don't favour regular expressions for this kind of parsing. The key point is that the numbers are still strings when you're done with that hairy regex episode.
I'd use Boost Spirit here instead, which parses directly into numbers, and you don't have to link to the Boost Regex library either, because Spirit is header-only.
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
static qi::int_parser<int, 10, 2, 2> two_digits;
int main() {
std::string const s = "P112233";
std::vector<int> nums;
if (qi::parse(s.begin(), s.end(), "P" >> *two_digits, nums))
{
std::cout << "Parsed " << nums.size() << " pairs of digits:\n";
for(auto i : nums)
std::cout << " * " << i << "\n";
}
}
Output:
Parsed 3 pairs of digits:
* 11
* 22
* 33
I have read an entire file into a string from a memory-mapped file opened with the Win32 API:
CreateFile( "WarandPeace.txt", GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0 )
etc...
Each line is terminated with a CRLF. I need to find something on a line, like "Spam" in the line "I love Spam and Eggs", and return the entire line (without the CRLF) as a string, or a pointer to its location in the string. The original string cannot be altered.
EDITED:
Something like this:
string ParseStr( string sIn, string sDelim, int nField )
{
int match, LenStr, LenDelim, ePos, sPos(0), count(0);
string sRet;
LenDelim = sDelim.length();
LenStr = sIn.length();
if( LenStr < 1 || LenDelim < 1 ) return ""; // Empty String
if( nField < 1 ) return "";
//=========== cout << "LenDelim=" << LenDelim << ", sIn.length=" << sIn.length() << endl;
for( ePos=0; ePos < LenStr; ePos++ ) // iterate through the string
{ // cout << "sPos=" << sPos << ", LenStr=" << LenStr << ", ePos=" << ePos << ", sIn[ePos]=" << sIn[ePos] << endl;
match = 1; // default = match found
for( int k=0; k < LenDelim; k++ ) // Byte value
{
if( ePos+k > LenStr ) // end of the string
break;
else if( sIn[ePos+k] != sDelim[k] ){ // match failed
match = 0; break; }
}
//===========
if( match || (ePos == LenStr-1) ) // process line
{
if( !match ) ePos = LenStr + LenDelim; // (ePos == LenStr-1)
count++; // cout << "sPos=" << sPos << ", ePos=" << ePos << " >" << sIn.substr(sPos, ePos-sPos) << endl;
if( count == nField ){ sRet = sIn.substr(sPos, ePos-sPos); break; }
ePos = ePos+LenDelim-1; // jump over Delim
sPos = ePos+1; // Begin after Delim
} // cout << "Final ePos=" << ePos << ", count=" << count << ", LenStr=" << LenStr << endl;
}// next
return sRet;
}
If you like it, vote it up. If not, let's see what you got.
If you are trying to match a more complex pattern, you can always fall back on Boost's regex library.
See: http://www.boost.org/doc/libs/1_41_0/libs/regex/doc/html/index.html
#include <fstream>
#include <iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;
int main( )
{
std::string s;
std::string sre("Spam");
boost::regex re;
ifstream in("main.cpp");
if (!in.is_open()) return 1;
string line;
while (getline(in,line))
{
try
{
// Set up the regular expression for case-insensitivity
re.assign(sre, boost::regex_constants::icase);
}
catch (boost::regex_error& e)
{
cout << sre << " is not a valid regular expression: \""
<< e.what() << "\"" << endl;
continue;
}
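// regex_search finds the pattern anywhere in the line;
// regex_match would succeed only if the whole line matched
if (boost::regex_search(line, re))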
{
cout << re << " matches " << line << endl;
}
}
}
Do you really have to do it in C++? Perhaps you could use a language which is more appropriate for text processing, like Perl, and apply a regular expression.
Anyway, if doing it in C++, a loop over Prev_delim_position = sIn.find(sDelim, Prev_delim_position) looks like a fine way to do it.
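Here is a rough sketch built on the same std::string::find family, as a variant of that loop: locate the text first, then look for the delimiters on either side of it. The function name find_line is hypothetical, and it assumes the search text itself contains no CRLF.

```cpp
#include <string>

// Hypothetical helper: returns the first line of `text` that contains `needle`,
// without its trailing CRLF, or an empty string if there is no such line.
// `text` is taken by const reference and never modified.
std::string find_line(const std::string& text, const std::string& needle) {
    if (needle.empty()) {
        return "";
    }
    const std::string::size_type hit = text.find(needle);
    if (hit == std::string::npos) {
        return "";
    }
    // The line starts right after the previous CRLF, or at 0 on the first line.
    const std::string::size_type prev = text.rfind("\r\n", hit);
    const std::string::size_type begin = (prev == std::string::npos) ? 0 : prev + 2;
    // The line ends at the next CRLF, or at the end of the text on the last line.
    std::string::size_type end = text.find("\r\n", hit);
    if (end == std::string::npos) {
        end = text.size();
    }
    return text.substr(begin, end - begin);
}
```

If copying the line is too expensive for a large mapped file, the begin and end offsets could be returned instead of the substring.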
system("grep ....");