Regex for double slash comments - regex

I need a regex that will match a double-slash comment at the end of a line, while ignoring the slashes if they are enclosed in quotes:
something something // match this and the double slash
this address: "https://foo" but don't match this
I want to capture the comment along with the slashes in a capturing group.
I originally had simply .*(\/\/.*), but that fails when the slashes appear inside quotes, and I haven't found a way to handle that case.

You could try this regex pattern: /.*\s(\/\/.*)$/gm
It should work on your sample data. The comment, along with the slashes, is captured in the first capturing group.
let pattern = /.*\s(\/\/.*)$/gm;
let texts = [
  'something something // match this and the double slash',
  'this address: "https://foo" but do not match this',
  'this address: "https://foo" // "hello"'
];
texts.forEach(text => {
  let matches = text.matchAll(pattern);
  console.log([...matches]);
});

How about using it like this:
/(?:\s|^)(?<comment>\/\/(?<text>.+))/gm
Rule: the // must be preceded by whitespace or by the start of the line.
Accepted examples
// some thing
// some thing
something something // match this and the double slash
foo bar // "hello"
Not accepted examples:
h// some things
this address: "https://foo" but don't match this
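As a rough illustration of how the named groups in this pattern could be read from JavaScript, here is a minimal sketch. The helper name findComment is hypothetical and not part of the answer; the pattern itself is copied unchanged from above.

```javascript
// The pattern from the answer above: "//" must follow whitespace or the line start.
const commentPattern = /(?:\s|^)(?<comment>\/\/(?<text>.+))/gm;

// Hypothetical helper: returns the named groups of the first match, or null.
function findComment(line) {
  // matchAll requires a global regex and iterates over a clone of it,
  // so the shared pattern's lastIndex is not modified between calls.
  const [first] = line.matchAll(commentPattern);
  return first
    ? { comment: first.groups.comment, text: first.groups.text }
    : null;
}

console.log(findComment('something something // match this and the double slash'));
console.log(findComment('this address: "https://foo" but don\'t match this'));
```

The comment group includes the slashes, while text holds only what follows them.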

let s = `something something // match this and the double slash\nthis address: "https://foo" but don't match this`;
console.log(s.match(/\"? [\/]{2}.+/ig));
// second example, anchored at the end of the line
let lines = [`something something // match this and the double slash`, `this address: "https://foo" but don't match this`];
console.log(lines[0].match(/\"? [\/]{2}.+$/ig));
console.log(lines[1].match(/\"? [\/]{2}.+$/ig));
Note that you still have to follow some formatting rules, such as leaving at least one space before the comment, as in https://foo/foo //comment
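The answers in this thread all rely on whitespace before the slashes. If that cannot be guaranteed, one possible alternative is to consume quoted strings explicitly before looking for the comment. The sketch below is not taken from any of the answers; it assumes only double quotes delimit strings and that quotes are never escaped, and the names trailingComment and extractComment are made up for illustration.

```javascript
// From the start of the line, skip any of:
//   - a character that is neither a double quote nor a slash,
//   - a complete "..." string (assumption: no escaped quotes inside),
//   - a single slash that is not followed by another slash.
// The first "//" reached this way is outside quotes and is captured in group 1.
const trailingComment = /^(?:[^"\/]|"[^"]*"|\/(?!\/))*(\/\/.*)$/;

// Hypothetical helper: returns the comment including the slashes, or null.
function extractComment(line) {
  const match = trailingComment.exec(line);
  return match ? match[1] : null;
}

const samples = [
  'something something // match this and the double slash',
  'this address: "https://foo" but don\'t match this',
  'this address: "https://foo" // "hello"',
];
for (const line of samples) {
  console.log(extractComment(line));
}
```

Because the three skipped alternatives each start with a different kind of character, the pattern should not backtrack heavily on long lines.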

Related

How to get method/function in a string by using regex

I am trying to get the arguments of a function call embedded in a string.
An argument may contain:
Example
Placeholder:
{{request.expires_in}}
// can be matched with the regex: \{\{[a-zA-Z0-9._-]+\}\}
function
#func_compare('string1','string2',1234)
Others:
dERzUxOfnj49g/kCGLR3vhzBOTLwEMgrpa1/MCBpXQR2NIFV1yjraGVZLkujG63J0joj+TvNocjpJSQq2TpPRzLfCSZADcjmbkBkphIpsT8=
//Any string except brackets
Case
Below is the sample case I am working with.
Content:
#func_compare('string1',#func_test(2),1234),'Y-m-d H:i:s',#func_strtotime({{request.expires_in}} - 300)
Regex using:
(?<=#func_compare\().*[^\(](?=\))
I expect to get
'string1',#func_test(2),1234
But what matched from the regex now is
'string1',#func_test(2),1234),'Y-m-d H:i:s',#func_strtotime({{request.expires_in}} - 300
Does anyone know how to get the arguments between the #func_strtotime brackets? I would appreciate any response.
Would you please try:
(?<=#func_compare\().*?(?:\(.*?\).*?)?(?=\))
which will work for both cases.
[Explanation of the regex]
.*?(?:\(.*?\).*?)?(?=\))
.*? the shortest match not to overrun the next pattern
(?:\(.*?\).*?)? a group of substring which includes a pair of parens followed by another substring of length 0 or longer
(?=\)) positive lookahead for the right paren
You can get the result with a recursive regex (this needs an engine that supports recursion, such as PCRE; JavaScript's RegExp does not):
(?<=#func_compare\()([^()]*\((?:.*?\)|(?1))*)[^()]*(?=\))
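Where a recursive regex is not available, a small scanner that counts parentheses is one way to get the same substring. The following is a minimal sketch, not something proposed in the answers; the function name extractCallArgs is hypothetical, and it assumes parentheses never appear inside quoted arguments.

```javascript
// Hypothetical helper: returns the text between the parentheses of the first
// call to funcName in text, honouring nested parentheses, or null if the call
// is missing or its parentheses are unbalanced.
// Assumption: no parentheses occur inside quoted string arguments.
function extractCallArgs(text, funcName) {
  const start = text.indexOf(funcName + '(');
  if (start === -1) return null;

  const argsStart = start + funcName.length + 1; // first character after "("
  let depth = 1; // the opening parenthesis of the call itself
  for (let i = argsStart; i < text.length; i++) {
    if (text[i] === '(') {
      depth++;
    } else if (text[i] === ')') {
      depth--;
      if (depth === 0) return text.slice(argsStart, i);
    }
  }
  return null; // ran out of input before the call was closed
}

const content =
  "#func_compare('string1',#func_test(2),1234),'Y-m-d H:i:s',#func_strtotime({{request.expires_in}} - 300)";
console.log(extractCallArgs(content, '#func_compare'));
console.log(extractCallArgs(content, '#func_strtotime'));
```

Unlike the lazy-quantifier pattern, this handles any nesting depth, at the cost of not being a single expression.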

Regex to find after particular word inside a string

I am using a regex to find the values that follow a few keywords and a colon (:). My sample test case:
test {
test1 {
sadffd(test: "aff", aaa: "aa1") {}
}
}
Now I have to find the value of a keyword inside the () brackets. It works for 'aaa', but when I use 'test' it fails: it matches everything up to the end of the string.
My regexes so far:
\btest(.*\w") (failing case): expected "aff", returned "aff", aaa: "aa1"
\baaa(.*\w") (passing case): returned "aa1"
Please let me know if more information is needed.
You may try
:\s*"(.*?)"
And the data you need is in the first capturing group.
Explanation
:\s*"(.*?)"
: colon
\s* followed by optionally any number of spaces
" followed by quote
( ) capturing group, containing...
.*? any number of characters, matching as few as possible
" followed by quote
Demo:
https://regex101.com/r/WnvzdG/1
Update:
If you want to match ONLY after specific keywords, followed by colon, you can do something like:
(KEYWORD1|KEYWORD2|KEYWORD3)\s*:\s*"(.*?)"
First capture group will be the keyword matched, second capture group will be the value.
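To make the keyword variant concrete, here is a minimal JavaScript sketch that collects keyword/value pairs into an object. The helper name valuesFor is hypothetical, the leading \b is an addition so that a keyword is not matched in the middle of a longer word, and the keywords are assumed to be plain identifiers with no regex metacharacters.

```javascript
// Hypothetical helper: returns an object mapping each listed keyword to the
// quoted value that follows it. Assumption: keywords contain no regex
// metacharacters, so they can be joined into an alternation as-is.
function valuesFor(text, keywords) {
  const pattern = new RegExp(
    `\\b(${keywords.join('|')})\\s*:\\s*"(.*?)"`,
    'g'
  );
  const result = {};
  // Group 1 is the keyword, group 2 is the value between the quotes.
  for (const [, key, value] of text.matchAll(pattern)) {
    result[key] = value;
  }
  return result;
}

const sample = 'test {\n test1 {\n sadffd(test: "aff", aaa: "aa1") {}\n }\n}';
console.log(valuesFor(sample, ['test', 'aaa']));
console.log(valuesFor(sample, ['test']));
```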
One more approach (executed in Python)
import re

items = ['test{test1 {sadffd(test: "aff", aaa: "aa1") {}}}']
for item in items:
    # every quoted word
    print(re.findall(r'"(\w+)"', item))
    # only quoted words that directly follow a colon and a space
    print(re.findall(r'(?<=: )"(\w+)"', item))
Output
['aff', 'aa1']
['aff', 'aa1']
I believe a simple regex would work to get everything inside the double quotes in your case:
("\w+")
Note that your question above says you want to capture "aff" and not just aff so I've included the surrounding quotes within the capturing group.
It's pretty crude but this should be OK for the input you've presented. (It wouldn't handle things like an escaped double quote in the string, for example).

Regex to extract second word from URL

I want to extract a second word from my url.
Examples:
/search/acid/all - extract acid
/filter/ion/all/sss - extract ion
I tried a few approaches, such as
/.*/(.*?)/
but no luck.
A couple things:
The forward slashes / have to be escaped like this \/
The (.*?) matches as few characters as possible, including zero characters.
The .* is greedy and takes as many characters as it can, including forward slashes, so the capture group ends up holding the last segment that is followed by a slash, not necessarily the second one.
A simple solution will be:
/.+?\/(.*?)\//
Update:
Since you are using JavaScript, try the following code:
var url = "/search/acid/all";
var regex = /.+?\/(.*?)\//g;
var match = regex.exec(url);
console.log(match[1]);
The variable match is an array. Its first element is the full match (everything that was matched); you can ignore that, since you are interested in the specific group we wanted to capture (the part we put in parentheses in the regex).
This regex will do the trick:
(?:[^\/]*.)\/([^\/]*)\/
For me, I had difficulties with the above answers for URL without an ending forward slash:
/search/acid/all/ /* works */
/search/acid /* doesn't work */
To extract the second word from both urls, what worked for me is
var url = "/search/acid";
var regex = /(?:[^\/]*.)\/([^\/]*)/g;
var match = regex.exec(url);
console.log(match[1]);
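If a regex is not strictly required, splitting the path on slashes is another option that behaves the same with or without a trailing slash. This is a sketch of that alternative rather than anything from the answers above; the function name secondSegment is hypothetical.

```javascript
// Hypothetical helper: returns the second non-empty path segment, or null
// when the path has fewer than two segments.
function secondSegment(path) {
  // filter(Boolean) drops the empty strings produced by leading,
  // trailing, or doubled slashes.
  const segments = path.split('/').filter(Boolean);
  return segments.length > 1 ? segments[1] : null;
}

console.log(secondSegment('/search/acid/all'));
console.log(secondSegment('/filter/ion/all/sss'));
console.log(secondSegment('/search/acid'));
```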

Regex to match alphanumerics, URL operators except forward slashes

I've been trying for the past couple of hours to get this regex right but unfortunately, I still can't get it. Tried searching through existing threads too but no dice. :(
I'd like a regex to match the following possible strings:
userprofile?id=123
profile
search?type=player&gender=male
someotherpage.htm
but not
userprofile/
helloworld/123
Basically, I'd like the regex to match alphanumerics, URL operators such as ?, = and & but not forward slashes. (i.e. As long as the string contains a forward slash, the regex should just return 0 matches.)
I've tried the following regexes but none seem to work:
([0-9a-z?=.]+)
(^[^\/]*$[0-9a-z?=.]+)
([0-9a-z?=.][^\/]+)
([0-9a-z?=.][\/$]+)
Any help will be greatly appreciated. Thank you so much!
The reason they all match is that your regexp matches part of the string and you've not told it that it needs to match the entire string. You need to make sure that it doesn't allow any other characters anywhere in the string, e.g.
^[0-9a-z&?=.]+$
Here's a small perl script to test it:
#!/usr/bin/perl
use strict;
use warnings;

my @testlines = (
    "userprofile?id=123",
    "userprofile",
    "userprofile?type=player&gender=male",
    "userprofile.htm",
    "userprofile/",
    "userprofile/123",
);
foreach my $testline (@testlines) {
    if ($testline =~ /^[0-9a-z&?=.]+$/) {
        print "$testline matches\n";
    } else {
        print "$testline doesn't match - bad regexp, no cookie\n";
    }
}
This should do the trick:
/^\w+(\.htm|\?\w+=\w*(&\w+=\w*)*)?$/i
To break this down:
^ // Anchor to start of string, so a path such as helloworld/123 cannot match on its tail
\w+ // Match [a-z0-9_] (1 or more), to specify resource
( // Alternation group (i.e., a OR b)
\.htm // Match ".htm"
| // OR
\? // Match "?"
\w+=\w* // Match first term of query string (e.g., something=foo)
(&\w+=\w*)* // Match remaining terms of query string (zero or more)
)
? // Make alternation group optional
$ // Anchor to end of string
The i flag is for case-insensitivity.
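For readers working in JavaScript rather than Perl, the fully anchored character-class pattern from the first answer can be checked against the question's samples with a few lines. This is only a sketch; the variable names are chosen for illustration.

```javascript
// Anchored at both ends, so every character in the string must be one of
// the allowed characters; a single "/" anywhere makes the test fail.
const allowed = /^[0-9a-z&?=.]+$/;

const samples = [
  'userprofile?id=123',
  'profile',
  'search?type=player&gender=male',
  'someotherpage.htm',
  'userprofile/',
  'helloworld/123',
];

for (const sample of samples) {
  console.log(`${sample}: ${allowed.test(sample) ? 'matches' : 'no match'}`);
}
```

The regex has no g flag, so calling test repeatedly on the same object is safe here.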

Regex AND operator

Based on this answer
Regular Expressions: Is there an AND operator?
I tried the following on http://regexpal.com/ but was unable to get it to work. What am I missing? Does JavaScript not support it?
Regex: (?=foo)(?=baz)
String: foo,bar,baz
It is impossible for both (?=foo) and (?=baz) to match at the same time. It would require the next character to be both f and b simultaneously which is impossible.
Perhaps you want this instead:
(?=.*foo)(?=.*baz)
This says that foo must appear anywhere and baz must appear anywhere, not necessarily in that order and possibly overlapping (although overlapping is not possible in this specific case because the letters themselves don't overlap).
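The difference between the two patterns is easy to see in JavaScript, which does support lookaheads. A minimal sketch, with variable names chosen for illustration:

```javascript
// Both lookaheads are tested at the same position, so this would require the
// text at that position to start with "foo" and with "baz" at once.
const bothAtSamePosition = /(?=foo)(?=baz)/;

// Each lookahead may scan ahead with .*, so the two words can appear
// anywhere after the current position, in either order.
const bothAnywhere = /(?=.*foo)(?=.*baz)/;

console.log(bothAtSamePosition.test('foo,bar,baz'));
console.log(bothAnywhere.test('foo,bar,baz'));
console.log(bothAnywhere.test('baz,bar,foo'));
console.log(bothAnywhere.test('foo,bar'));
```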
Example of a Boolean (AND) plus Wildcard search, which I'm using inside a javascript Autocomplete plugin:
String to match: "my word"
String to search: "I'm searching for my funny words inside this text"
You need the following regex: /^(?=.*my)(?=.*word).*$/im
Explaining:
^ assert position at start of a line
?= Positive Lookahead
.* matches any character (except newline)
() Groups
$ assert position at end of a line
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Test the Regex here: https://regex101.com/r/iS5jJ3/1
So, you can create a javascript function that:
Replace regex reserved characters to avoid errors
Split your string at spaces
Encapsulate your words inside regex groups
Create a regex pattern
Execute the regex match
Example:
function fullTextCompare(myWords, toMatch) {
  // Escape regex reserved characters
  const escaped = myWords.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
  // Split the string at spaces
  const words = escaped.split(" ");
  // Wrap each word in its own lookahead group
  const lookaheads = words.map(function (word) {
    return "(?=.*" + word + ")";
  });
  // Build the regex pattern
  const pattern = new RegExp("^" + lookaheads.join("") + ".*$", "im");
  // Run the match and return true or false
  return pattern.test(toMatch);
}
//Using it:
console.log(
fullTextCompare("my word","I'm searching for my funny words inside this text")
);
//Wildcards:
console.log(
fullTextCompare("y wo","I'm searching for my funny words inside this text")
);
Maybe you are looking for something like this. If you want to select the complete line when it contains both "foo" and "baz", in either order, this regex will do that:
.*(foo)+.*(baz)+.*|.*(baz)+.*(foo)+.*
Maybe just an OR operator | could be enough for your problem:
String: foo,bar,baz
Regex: (foo)|(baz)
Result: ["foo", "baz"]