Replace a special character using boost regex_replace in C++

I'm trying to replace the special character Æ with a space using boost regex_replace, but it has no effect. Can anyone tell me what exactly my regular expression should be? Below is the code snippet I am trying out.
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
regex escape("\\xC6");
const char* specialChar = "bananaÆ,test";
string xml = regex_replace(specialChar, escape, "");
cout << specialChar << "\n";
cout << xml << '\n';
}
Output:
bananaÆ,test
bananaÆ,test
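One possible explanation, assuming the source file and the string are UTF-8 encoded (the post does not say): in UTF-8, Æ is stored as the two bytes 0xC3 0x86, so a pattern for the single byte 0xC6 (its Latin-1 code) never matches. Note also that the snippet passes "" as the replacement, which would delete the character rather than insert a space. A minimal sketch under that assumption, using std::regex as the snippet does; the helper name is hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every UTF-8 encoded 'Æ' (bytes 0xC3 0x86)
// with a single space. Assumes the input string is UTF-8.
std::string replace_ae_with_space(const std::string& input) {
    // The two bytes are written as separate literals so that a hex escape
    // cannot accidentally absorb a following character.
    static const std::regex ae("\xC3" "\x86");
    return std::regex_replace(input, ae, " ");
}
```

With a UTF-8 input of "bananaÆ,test" this would be expected to give "banana ,test". If the data is actually Latin-1, the byte to match would be 0xC6 instead.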

Related

Using the already-defined regex pattern in another regex pattern and a question about applying regex to file

How can I use an already-defined regex pattern inside another regex pattern? For example, in the following code sign and number are defined, and I want to use them in defining relation:
regex sign("=<|=|>|<=|<>|>=");
regex number("^[1-9]\\d*");
regex relation(number, sign, number)
So, I need to find all matches (to the pattern like 23<=34 or 123<>2000) in the given file.
Since I haven't completed the relation, I've been testing with sign:
#include <iostream>
#include <fstream>
#include <regex>
using namespace std;
int main() {
regex sign("=<|=|>|<=|<>|>=");
regex digit("[0-9]");
regex number("^[1-9]\\d*");
//regex relation("^[1-9]\d*[=<|=|>|<=|<>|>=]^[1-9]\d*"); (this part is what I couldn't do)
string line;
ifstream fin;
fin.open("name.txt");
if (fin.good()) {
while (getline(fin, line)) {
bool match_sign = regex_search(line, sign);
if (match_sign) {
cout << line << endl; // but I need to print the match only
}
}
}
return 0;
}
When I want to print the matches in the file, it prints the whole line which contains any match. How can I make it print only the match itself but not the whole line?
Update:
#include <iostream>
#include <fstream>
#include <vector>
#include <regex>
using namespace std;
#define REGEX_SIGN "=<|=|>|<=|<>|>="
#define REGEX_DIGIT "[0-9]"
#define REGEX_NUMBER "^" REGEX_DIGIT "\\d*"
int main() {
regex sign(REGEX_SIGN);
regex digit(REGEX_DIGIT);
regex number(REGEX_NUMBER);
regex relation(REGEX_NUMBER REGEX_SIGN REGEX_NUMBER);
string line, text;
ifstream fin;
fin.open("name.txt");
if (fin.good()) {
while (getline(fin, line)) {
text += line + " ";
}
int count = 0;
string word = "";
for (int i = 0; i < text.length(); i++) {
if (text[i] == ' ') {
cout << "word = " << word << " | match: " << regex_match(word, relation) << endl;
if (regex_match(word, relation)) {
cout << word << endl;
}
word = "";
}
else {
word += text[i];
}
}
}
// cout << text << endl;
return 0;
}
Current name.txt looks like this:
But I think the regular expression is not working right:
It says no word matches. Where is the problem?
"Reusing" a smaller regex object inside a larger regex is not really possible.
The only workaround I can see is to define the regex strings as macros and use the compiler's string-literal concatenation to build larger strings:
#define REGEX_SIGN "=<|=|>|<=|<>|>="
#define REGEX_DIGIT "[0-9]"
#define REGEX_NUMBER "^" REGEX_DIGIT "\\d*"
regex sign(REGEX_SIGN);
regex digit(REGEX_DIGIT);
regex number(REGEX_NUMBER);
regex relation(REGEX_NUMBER REGEX_SIGN REGEX_NUMBER);
This doesn't reuse the actual regex objects; it only creates longer string literals from smaller ones.
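The concatenated pattern above probably still fails for two reasons: the "^" inside REGEX_NUMBER ends up in the middle of the combined pattern, and the alternation in REGEX_SIGN is not grouped, so each "|" splits the whole expression. A sketch of one way around both, using std::string pieces instead of macros and std::sregex_iterator to collect only the matched text. The names kSign, kNumber and find_relations are illustrative, and the plain "<" alternative is an addition that the original sign pattern did not contain:

```cpp
#include <regex>
#include <string>
#include <vector>

// Pattern fragments kept as plain strings so they can be combined freely.
// These names are illustrative, not part of the original post.
const std::string kSign   = "(?:<=|>=|<>|=<|=|<|>)";  // grouped alternation
const std::string kNumber = "[1-9]\\d*";               // no '^' anchor here

// Returns every "number sign number" substring found in `line`.
std::vector<std::string> find_relations(const std::string& line) {
    static const std::regex relation(kNumber + kSign + kNumber);
    std::vector<std::string> found;
    for (std::sregex_iterator it(line.begin(), line.end(), relation), end;
         it != end; ++it) {
        found.push_back(it->str());  // the matched text only, not the line
    }
    return found;
}
```

Calling find_relations on each line read with getline and printing the returned strings would show the matches themselves rather than the whole line.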

How to find all sentences except those defined using regular expressions?

The bottom line is that I need to find all the comments in some Python code and cut them out, leaving only the code itself.
But I can only do the opposite: I can find the comments themselves, but I cannot find everything except them.
I tried using "?!" and made up a regular expression like "(. *) (?! #. *)", but it does not work as I expected.
The code I attached also has an "else" branch that I tried to use to write the two kinds of text to different variables, but for some reason execution never even gets there.
#include <iostream>
#include <fstream>
#include <string>
#include <regex>
int main()
{
std::string line;
std::string new_line;
std::string result;
std::string result_re;
std::string path;
std::smatch match;
std::regex re("(#.*)");
std::cout << "Enter the path\n";
std::cin >> path;
std::ifstream in(path);
if (in.is_open())
{
while (getline(in, line))
{
if (std::regex_search(line, match, re))
{
for (int i = 0; i < match.size(); i++)
result_re += match[i + 1];
result_re += "\n";
}
else
{
for (int i = 0; i < match.size(); i++)
result += match[i];
//result += "\n";
}
std::cout << line << std::endl;
}
}
in.close();
std::cout << result_re << std::endl;
std::cout << "End of program" << std::endl;
std::cout << result << std::endl;
system("pause");
return 0;
}
As I said above, I want to get everything except the comments, not the other way around.
I also need to search for multi-line comments, which are written as """Text""".
But with this implementation I can't even imagine how to do it: the program reads the file line by line, so I don't see how a regular expression could capture a multi-line comment.
I would be grateful for your advice and help.
1. Don't try parsing your input file line by line. Instead, read in the whole text and let regex replace all the comments. That way your entire program would look like this:
#include <iostream>
#include <string>
#include <fstream>
#include <iterator> // for istream_iterator
#include <regex>
using namespace std; // for brevity
int main() {
cout << "Enter the path: ";
string filename;
getline(cin, filename);
string pprg{ istream_iterator<char>(ifstream{filename, ifstream::in} >> noskipws),
istream_iterator<char>{} };
pprg = regex_replace(pprg, regex{"#.*"}, "");
cout << pprg << endl;
}
2. Handling multi-line Python literals ("""...""") with C++ regex is much harder than the example above: there are a few mutually exclusive requirements (imho):
the regex should be extended POSIX, but
POSIX regex does not support empty regex matches, however
crafting an RE that matches a negated sequence of characters requires a negative look-ahead assertion, which is an empty match :(
So you would need to write some program logic of your own to remove multi-line Python text literals.
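As an alternative to extended POSIX, the default ECMAScript grammar of std::regex supports lazy quantifiers, which may be enough for simple inputs. The sketch below is a naive illustration under that assumption, not a Python parser; the helper name is hypothetical. It treats any '#' or '"""' as a comment delimiter, even inside ordinary string literals, and some std::regex implementations can be slow or run out of stack on very large inputs:

```cpp
#include <regex>
#include <string>

// Naive sketch: removes """...""" blocks first, then '#' comments.
// It does not understand Python syntax, so a '#' or '"""' inside an
// ordinary string literal would be treated as a comment as well.
std::string strip_python_comments(const std::string& source) {
    // [\s\S] matches any character including newlines; *? is lazy,
    // so each block ends at the nearest closing """.
    static const std::regex triple_quoted(R"re("""[\s\S]*?""")re");
    // '.' does not match a newline, so this stops at the end of the line.
    static const std::regex hash_comment("#.*");
    std::string without_blocks = std::regex_replace(source, triple_quoted, "");
    return std::regex_replace(without_blocks, hash_comment, "");
}
```

The function takes the whole file contents as one string, which matches the "read in the whole text" approach from the answer above.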

How to get the function name from function definition string using regex in c or c++?

I want to extract a string from a given text, but my regular expression is not working and I am not sure what mistake I made. Can someone help me, please?
Expected output: myFunction
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
using namespace std;
int main()
{
// Target sequence
std::string s = "function myFunction(p1, p2) { return p1 * p2; }";
// An object of regex for pattern to be searched
regex r("/^function\\s+([\\w\\$]+)\\s*\\(/");
// flag type for determining the matching behavior
// here it is for matches on 'string' objects
smatch m;
// regex_search() for searching the regex pattern
// 'r' in the string 's'. 'm' is flag for determining
// matching behavior.
regex_search(s, m, r);
// for each loop
for (auto x : m)
cout << x << " ";
}
The string you are using for the regex is not correct. You don't need the two / characters at the start and the end. Use:
regex r("^function\\s+([\\w\\$]+)\\s*\\(");
//       ^ No /                         ^ No /
See it working at https://ideone.com/bLavL0.
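The loop in the question prints the whole match followed by every capture group. To get only the name, one option is to read capture group 1 directly. A small sketch using the corrected pattern; the helper name function_name is hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the name from a "function name(" definition,
// or an empty string when the text does not start with one.
std::string function_name(const std::string& text) {
    static const std::regex pattern("^function\\s+([\\w\\$]+)\\s*\\(");
    std::smatch m;
    if (std::regex_search(text, m, pattern)) {
        return m[1].str();  // m[0] is the whole match, m[1] the first group
    }
    return "";
}
```

For the string in the question this would be expected to return myFunction.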

Code word search function c++

Below is some simple code to find 2401 in the string. I do not know in advance that the number is 2401; each digit can be anything from 0 to 9. To find the 4-digit number I want to use "DDDD", where each letter D matches one digit between 0 and 9. How do I make the compiler realize that the letter D is a code for a 1-digit number?
#include <string>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
std::string pattern ;
std::getline(std::cin, pattern);
std::string sentence = "where 2401 is";
//std::getline(std::cin, sentence);
int a = sentence.find(pattern,0);
int b = pattern.length();
cout << sentence.substr(a,b) << endl;
//std::cout << sentence << "\n";
}
Try using regular expressions. They can be kind of a pain in the ass, but they are pretty powerful once mastered. In your case I would recommend using regex_search(), like here:
#include <string>
#include <iostream>
#include <vector>
#include <regex>
using namespace std;
int main()
{
std::smatch m;
std::regex e ("[0-9]{4}"); // matches four consecutive digits
std::string sentence = "where 2401 is";
//int a = sentence.find(pattern,0);
//int b = pattern.length();
if (std::regex_search (sentence, m, e))
cout << m.str() << endl;
//cout << sentence.substr(a,b) << endl;
//std::cout << sentence << "\n";
}
If you want the user to control the exact matching, you can also just ask for the number of digits, or for the complete regular expression, etc.
Also note:
the simple regular expression provided, [0-9]{4}, means: "any character between 0 and 9, exactly 4 times in a row". Have a look here for more information
in your question you mentioned that you wanted the compiler to do the matching. Regular expressions are not matched by the compiler, but at runtime, so both the input string and the regular expression can be variables.
using namespace std; makes the prefix std:: unnecessary for those variable declarations
std::getline(std::cin, pattern); could be replaced by cin >> pattern; as long as the pattern contains no whitespace
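To accept the "DDDD" notation from the question directly, the user pattern can be translated into a regular expression at runtime. The sketch below assumes that every 'D' stands for one digit and every other character is meant literally; the helper name find_coded is hypothetical:

```cpp
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper: searches `sentence` for a user pattern in which
// each 'D' stands for one digit and every other character is literal.
// Returns the first match, or an empty string if there is none.
std::string find_coded(const std::string& sentence, const std::string& pattern) {
    std::string expr;
    for (char c : pattern) {
        if (c == 'D') {
            expr += "[0-9]";
        } else {
            // Escape anything that is not alphanumeric so it stays literal.
            if (!std::isalnum(static_cast<unsigned char>(c))) expr += '\\';
            expr += c;
        }
    }
    std::smatch m;
    if (std::regex_search(sentence, m, std::regex(expr))) return m.str();
    return "";
}
```

With the sentence "where 2401 is" and the pattern "DDDD", this would be expected to return 2401. One consequence of this design is that a literal capital D can no longer be searched for; a different placeholder character would avoid that.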

How to express an assembly lw/sw instruction using regular expression regex library?

I want to detect when the user enters "lw 2, 3(9)", but my pattern fails to match the parentheses. I used the code below, and it still doesn't detect them.
{ R"((\w+) ([[:digit:]]+), ([[:digit:]]+) (\\([[:digit:]]+\\)) )"}
Can someone please help?
You need to be careful with excessive spaces in the pattern, and since you are using a raw string literal, you should not double escape special chars:
R"((\w+) ([[:digit:]]+), ([[:digit:]]+)(\([[:digit:]]+\)))"
                                      ^^^             ^ ^^
It might be a good idea to replace literal spaces with [[:space:]]+.
C++ demo printing lw 2, 3(9):
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
regex rx(R"((\w+) ([[:digit:]]+), ([[:digit:]]+)(\([[:digit:]]+\)))");
string s("Text lw 2, 3(9) here");
smatch m;
if (regex_search(s, m, rx)) {
std::cout << m[0] << std::endl;
}
return 0;
}
R"((\w+) (\d+), (\d+)(\(\d+\)))"
worked for me
Since you didn't specify whether you want to capture anything, I'll provide both snippets.
With raw string literals you don't have to double the backslashes, but you do still have to escape literal parentheses so that they are not treated as capture groups.
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string str = "lw 2, 3(9)";
{
std::regex my_regex(R"(\w+ \d+, \d+\(\d+\))");
if (std::regex_search(str, my_regex)) {
std::cout << "Detected\n";
}
}
{
// With capture groups
std::regex my_regex(R"((\w+) (\d+), (\d+)(\(\d+\)))");
std::smatch match;
if (std::regex_search(str, match, my_regex)) {
std::cout << match[0] << std::endl;
}
}
}
An additional improvement could be to handle multiple spacing (if that is allowed in your particular case) with \s+.
I can't help but notice that EJP's concerns might also be spot-on: this is a very fragile solution parsing-wise.
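A somewhat less fragile variant combines the suggestions above: flexible whitespace with \s, escaped literal parentheses, and capture groups that are converted into fields. This is a sketch only; the struct, its field names (reg, offset, base), and the restriction to the lw and sw mnemonics are assumptions made for illustration:

```cpp
#include <regex>
#include <string>

// Illustrative container for the parsed fields; the field names are
// assumptions, not taken from the original post.
struct MemInstr {
    bool ok = false;
    std::string op;
    int reg = 0, offset = 0, base = 0;
};

// Parses text such as "lw 2, 3(9)" or " sw 10 , 4 ( 7 ) ".
// std::stoi would throw for numbers too large to fit in an int.
MemInstr parse_mem_instr(const std::string& text) {
    static const std::regex rx(
        R"(\s*(lw|sw)\s+(\d+)\s*,\s*(\d+)\s*\(\s*(\d+)\s*\)\s*)");
    MemInstr out;
    std::smatch m;
    if (std::regex_match(text, m, rx)) {  // the whole input must match
        out.ok = true;
        out.op = m[1].str();
        out.reg = std::stoi(m[2].str());
        out.offset = std::stoi(m[3].str());
        out.base = std::stoi(m[4].str());
    }
    return out;
}
```

Using regex_match rather than regex_search means trailing garbage such as "lw 2, 3(9) xyz" is rejected instead of silently accepted.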