Trying to understand regex snippet (/[-/\\^$*+?.()|[\]{}]/g, '\\$&') [duplicate] - regex

This question already has answers here:
Difference between $1 and $& in regular expressions
(3 answers)
Closed 2 years ago.
I am most interested in learning more about the piece at the end, '\\$&'.
I'm not really sure what it's doing or how it works, but it gets the job done.
The code that I have:
function escapeRegExp(s) {
  return s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&')
}
const searchRegex = new RegExp(
  searchQuery
    .split(/\s+/g)
    .map(s => s.trim())
    .filter(s => !!s)
    .map(word => `(?=.*\\b${escapeRegExp(word)})`).join('') + '.+',
  'i'
)

$& is documented in MDN's String#replace reference, in the section "Specifying a string as a parameter":
Pattern: $& Inserts: the matched substring.
Essentially, $& gives you back the entire matched substring. In your example, the string literal '\\$&' is a single backslash followed by the match, so every matched character is replaced by itself with a backslash in front, which escapes it.
Here are some examples:
// replace every character with itself doubled
console.log("abc".replace(/./g, "$&$&"));
// replace entire string with itself doubled
console.log("abc".replace(/.*/g, "$&$&"));
// prepend a backslash to every match
console.log("abc".replace(/./g, "\\$&"));
// behavior with capture groups that can be accessed with $1, $2...
console.log("abc".replace(/(.)(.)/g, "[ $$1: $1 ; $$2: $2 ; $$&: $& ]"));
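For readers who want to try the same idea outside JavaScript, here is a minimal sketch in Python, where the whole match is spelled \g<0> instead of $&. The helper name escape_regexp is made up for this illustration and only mirrors the escapeRegExp function from the question; in real Python code the standard library's re.escape covers this job.

```python
import re

def escape_regexp(s):
    # Hypothetical helper mirroring the JavaScript escapeRegExp above.
    # \g<0> is Python's spelling of "the whole match" ($& in JavaScript);
    # the leading \\ in the replacement inserts one literal backslash.
    return re.sub(r'[.^$*+?()|{}\[\]\\/-]', r'\\\g<0>', s)

# Replace every character with itself doubled
doubled = re.sub(r'.', r'\g<0>\g<0>', 'abc')
print(doubled)  # aabbcc

# Prepend a backslash to every regex metacharacter
escaped = escape_regexp('a.b*c')
print(escaped)  # a\.b\*c

# The escaped text now matches only the literal input
literal_only = (re.fullmatch(escaped, 'a.b*c') is not None
                and re.fullmatch(escaped, 'axbbc') is None)
print(literal_only)  # True
```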

Related

Remove parenthesis and Characters inside it [duplicate]

This question already has answers here:
Remove text between parentheses in dart/flutter
(2 answers)
Regular expresion for RegExp in Dart
(2 answers)
JavaScript/regex: Remove text between parentheses
(5 answers)
Closed 3 months ago.
I want to remove the parentheses along with all the characters inside them.
var str = B.Tech(CSE)2020;
print(str.replaceAll(new RegExp('/([()])/g'), '');
// output => B.Tech(CSE)2020
// output required => B.Tech 2020
I tried a bunch of regexes, but nothing works.
I am using Dart.
Using Dart, you don't have to use the forward slashes / to delimit the pattern. You can use a string and prepend it with r for a raw string and then you don't have to double escape the backslashes.
In your pattern you have to:
escape the parentheses
negate the character class so it matches any character except a parenthesis
repeat the character class with a quantifier like *, or else it will match only a single character
The pattern will look like:
\([^()]*\)
Example
var str = "B.Tech(CSE)2020";
print(str.replaceAll(new RegExp(r'\([^()]*\)'), ' '));
Output
B.Tech 2020
Your Dart syntax is off and seems to be confused with JavaScript's. Consider this version:
String str = "B.Tech(CSE)2020";
print(str.replaceAll(RegExp(r'\(.*?\)'), " ")); // B.Tech 2020
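The two answers use different patterns, \([^()]*\) and \(.*?\). A short Python sketch (Python is used here only for illustration; both patterns are written the same way in Dart) shows that they agree on the input from the question but differ on nested parentheses. The nested string is a made-up example, not part of the original question.

```python
import re

text = "B.Tech(CSE)2020"

# Negated character class: anything except parentheses between ( and )
negated = re.sub(r'\([^()]*\)', ' ', text)
# Lazy quantifier: as little as possible up to the first )
lazy = re.sub(r'\(.*?\)', ' ', text)
print(negated)  # B.Tech 2020
print(lazy)     # B.Tech 2020

# Made-up nested input: the two patterns no longer agree
nested = "a(b(c)d)e"
nested_negated = re.sub(r'\([^()]*\)', ' ', nested)
nested_lazy = re.sub(r'\(.*?\)', ' ', nested)
print(nested_negated)  # a(b d)e
print(nested_lazy)     # a d)e
```

Neither pattern handles nesting fully; which one is preferable depends on what the real input looks like.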

How to return first match sub-string of a string using Ruby regex? [duplicate]

This question already has answers here:
Return first match of Ruby regex
(5 answers)
What do ^ and $ mean in a regular expression?
(2 answers)
Closed 2 years ago.
I'm looking for a way to perform a regex match on a string in Ruby, get the first matching substring, and assign it to a variable. I have checked different solutions here on Stack Overflow but couldn't find a proper one so far.
This is my string:
/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv
I need to get the first substring, 20200904_151507. The file path can change from time to time, and so can the substring, but the pattern is always date_time. In the regex below, I tried to match the first eight (8) digits, an underscore, and the last six (6) digits.
Here are the solutions I tried:
report_path[/^[0-9]{8}[_][0-9]{6}$/,1]
report_path.scan(/^[0-9]{8}[_][0-9]{6}$/).first
The report_path variable above holds the full file path mentioned earlier.
What did I do wrong here?
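Your pattern fails because of the anchors: ^ and $ match only at the start and end of a line, so the regex would match only if the whole path were exactly date_time. In addition, the 1 in report_path[..., 1] asks for capture group 1, and your pattern has no capture groups. Drop the anchors and you can use match, scan or [] to achieve your goal (scan returns all substrings that match the pattern, so take its first element):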
report_path = '/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv'
report_path.match(/\d{8}_\d{6}/)[0]
# => "20200904_151507"
report_path.scan(/\d{8}_\d{6}/)[0]
# => "20200904_151507"
# String#[] supports regex
report_path[/\d{8}_\d{6}/]
# => "20200904_151507"
Note that match returns a MatchData object, which holds the full match plus any captures (if the pattern uses capture groups). scan returns an Array containing all matches.
Calling [0] on the MatchData returns the entire matched string; calling [0] on the Array from scan returns the first of all the matches.
Capture groups:
A regex lets us capture multiple substrings with one pattern. We can use () to create capture groups, and (?'some_name'<pattern>) to create named capture groups.
report_path = '/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv'
matches = report_path.match(/(\d{8})_(\d{6})/)
matches[0] #=> "20200904_151507"
matches[1] #=> "20200904"
matches[2] #=> "151507"
matches = report_path.match(/(?'date'\d{8})_(?'id'\d{6})/)
matches[0] #=> "20200904_151507"
matches["date"] #=> "20200904"
matches["id"] #=> "151507"
We can even use (named) capture groups with []
From String#[] documentation:
If a Regexp is supplied, the matching portion of the string is returned. If a capture follows the regular expression, which may be a capture group index or name, follows the regular expression that component of the MatchData is returned instead.
report_path = '/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv'
# returns the full match if no second parameter is passed
report_path[/(\d{8})_(\d{6})/]
# => "20200904_151507"
# returns the capture group n°2
report_path[/(\d{8})_(\d{6})/, 2]
# => "151507"
# returns the capture group called "date"
report_path[/(?'date'\d{8})_(?'id'\d{6})/, 'date']
# => "20200904"
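The effect of the ^ and $ anchors from the question can be checked in any regex engine. The following is a minimal Python sketch (Python rather than Ruby, purely for illustration) using the path from the question: the anchored pattern finds nothing, the unanchored one returns the first occurrence, and re.findall plays roughly the role that scan plays in Ruby.

```python
import re

report_path = ('/usr/share/filebeat/reports/ui/local/20200904_151507/'
               'API/API_Test_suite/20200904_151508/20200904_151508.csv')

# Anchored as in the question: the whole string would have to be date_time
anchored = re.search(r'^[0-9]{8}_[0-9]{6}$', report_path)
print(anchored)  # None

# Unanchored: search stops at the first occurrence
first = re.search(r'(\d{8})_(\d{6})', report_path)
print(first.group(0))  # 20200904_151507
print(first.group(2))  # 151507

# All occurrences, in order (no groups, so whole matches are returned)
all_matches = re.findall(r'\d{8}_\d{6}', report_path)
print(all_matches)  # ['20200904_151507', '20200904_151508', '20200904_151508']
```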

What does s/(\W)/\\$1/g do in perl? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
I went over a piece of code in which a subroutine takes a video filename as its argument and then prints its duration. Here I'm only showing a snippet.
sub videoInfo {
    my $file = shift;
    $file =~ s/(\W)/\\$1/g;
}
So far I understand that it deals with whitespace, but I'm not able to work out what the code means. What is $1, and how does it work?
It puts a backslash in front of every non-word character. For example, "untitled file" becomes "untitled\ file".
As in most regular expression operations $1 represents the first thing captured with (...) which in this case is the (\W) representing a single non-word character.
I think this is an unnecessary home-rolled version of quotemeta.
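To experiment with the same substitution outside Perl, here is a minimal Python sketch of it. It is only an illustration of what s/(\W)/\\$1/g does to the example string above; Python writes the first capture as \1 rather than $1.

```python
import re

file_name = "untitled file"

# (\W) captures one non-word character; in the replacement,
# \\ inserts a literal backslash and \1 re-inserts the captured character
escaped = re.sub(r'(\W)', r'\\\1', file_name)
print(escaped)  # untitled\ file

# Word characters (letters, digits, underscore) are left alone
untouched = re.sub(r'(\W)', r'\\\1', "clip_01")
print(untouched)  # clip_01
```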

How to include regex pattern matches into substitution output [duplicate]

This question already has answers here:
re.sub replace with matched content
(4 answers)
Closed 4 years ago.
For example, if I want to insert a space wherever a single uppercase letter precedes a hyphen (A-, C-, etc.), what function can I use to achieve this?
Alternatively, is there a way to get re.sub to output the pattern that was matched?
>>> text = 'T- AB-'
>>> re.sub(r'\b[A-Z]-', 'what goes here?', text)
>>> text
'T - AB-'
You are looking for capturing parentheses and the backreference \1:
import re
text = 'T- AB-'
text = re.sub(r'\b([A-Z])-', r'\1 -', text)
print(text)
Result:
T - AB-
That should do the trick. Whatever you capture in the first ( ) can be referenced with \1. If you have several sets of parentheses, each set can be referenced in order as \2, \3, and so on. Good luck!
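As a small extension of the answer, the sketch below shows the numbered backreference next to \g<0>, which re.sub uses for the whole match and which answers the "output the pattern that was matched" part of the question. The user@host string is a made-up example for the multi-group case.

```python
import re

text = 'T- AB-'

# \1 re-inserts what the first group captured
with_group = re.sub(r'\b([A-Z])-', r'\1 -', text)
print(with_group)  # T - AB-

# \g<0> re-inserts the whole match, so no capturing group is needed;
# the lookahead (?=-) keeps the hyphen out of the match
whole_match = re.sub(r'\b[A-Z](?=-)', r'\g<0> ', text)
print(whole_match)  # T - AB-

# Several groups are referenced as \1, \2, ... (made-up input)
swapped = re.sub(r'(\w+)@(\w+)', r'\2 at \1', 'user@host')
print(swapped)  # host at user
```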

Weird behaviour of the global g regex flag [duplicate]

This question already has answers here:
Help understanding global flag in perl
(2 answers)
Closed 9 years ago.
my $test = "There was once an\n ugly ducking";
if ($test =~ m/ugly/g) {
    if ($test =~ m/here/g) {
        print 'Match';
    }
}
Results in no output, but
my $test = "There was once an\n ugly ducking";
if ($test =~ m/here/g) {
    if ($test =~ m/ugly/g) {
        print 'Match';
    }
}
results in Match!
If I remove the g flag from the regexes, the inner test matches regardless of the order in which the two words appear in $test. I can't find a reference that explains why this is so.
Yes, that behaviour is documented in the perlop man page. In scalar context, m/.../ with the g flag remembers where the last match ended and starts the next search on that string from there.
In scalar context, each execution of "m//g" finds the next match, returning true if it matches, and false if there is no further match. The position after the last match can be read or set using the "pos()" function; see "pos" in perlfunc. A failed match normally resets the search position to the beginning of the string, but you can avoid that by adding the "/c" modifier (e.g. "m//gc"). Modifying the target string also resets the search position.
So in the first case, the search for here starts after ugly, where no here is left, and it fails. In the second case, here first matches inside There, and ugly is then found further along the string.
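The position bookkeeping described in the perlop excerpt can be modelled in a few lines. This is a toy sketch in Python, not Perl's actual implementation: the class name ScalarG is invented for this illustration, and it ignores the /c modifier, zero-length matches, and the reset on string modification.

```python
import re

class ScalarG:
    """Toy model of the per-string search position used by m//g in scalar context."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # plays the role of Perl's pos()

    def match(self, pattern):
        m = re.compile(pattern).search(self.text, self.pos)
        # Success: remember where the match ended. Failure: reset to the start.
        self.pos = m.end() if m else 0
        return m is not None

test = "There was once an\n ugly ducking"

# First snippet from the question: ugly, then here
first = ScalarG(test)
ugly_then_here = first.match("ugly") and first.match("here")
print(ugly_then_here)  # False

# Second snippet from the question: here, then ugly
second = ScalarG(test)
here_then_ugly = second.match("here") and second.match("ugly")
print(here_then_ugly)  # True
```

In the model, as in the question, the order matters only because the second search starts where the first one ended.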