Pattern matching with regex - regex

I am a novice in regex and am trying to understand it by solving small problems. Here is a problem I couldn't solve (warning: it may be extremely silly). Your input will help me understand the concept.
I want to write a regex that matches all items in list1 but none of those in list2.
list1
pit
spot
spate
slap two
respite
list2
pt
Pot
peat
part
My thinking was: "give me all the items that start with p, s or r and end with it, ot, e or o".
So I wrote ^[p|s|r].*[it|ot|e|o]$, which gave an undesired result.
Thanks in advance for your input.

In Notepad++ you can't use the or (|) operator (taken from "Notepad++: A guide to using regular expressions and extended search mode" and tested on my Notepad++ 5.9.3).
This would work in other "standard" regex flavours :-)
^[psr].*(it|ot|e|o)$
Try here. http://gskinner.com/RegExr/?2uudn
What you were doing was using [] instead of the grouping (). Your [it|ot|e|o] was equivalent to [itoe|] (where the | is a literal character instead of "or"). In general, everything inside [] is already an "or" :-) so [ab] means a or b.
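To make the difference between the character class and the group concrete, here is a small sketch in Python (the language is my choice, and the helper name matched is made up for illustration). It runs both patterns from this thread against the two lists from the question.

```python
import re

list1 = ["pit", "spot", "spate", "slap two", "respite"]
list2 = ["pt", "Pot", "peat", "part"]

# Original attempt: [it|ot|e|o] is a character class, so it matches ONE
# character out of i, t, |, o, e. The | is a literal pipe inside brackets.
char_class = re.compile(r"^[p|s|r].*[it|ot|e|o]$")

# Corrected version: (it|ot|e|o) is a group with real alternation.
grouped = re.compile(r"^[psr].*(it|ot|e|o)$")

def matched(pattern, words):
    """Return the words that the compiled pattern matches (hypothetical helper)."""
    return [w for w in words if pattern.search(w)]

print(matched(char_class, list2))  # ['pt', 'peat', 'part'] (unwanted matches)
print(matched(grouped, list1))     # all five items of list1
print(matched(grouped, list2))     # []
```

Note that "Pot" is rejected by both patterns only because matching is case-sensitive by default.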

/(pit|spot|spate|slap two|respite)/.test('Pot')
This matches the words from list one, and none from list two

I feel like I must be missing something.
^(pit|spot|spate|slap two|respite)$

It depends entirely on how you categorise the differences between the lists:
/p[ioa\s]t/

Regex that matches even amount of character

Disclaimer (added after this was solved): this is my uni assignment, so the answer could be simple. Hints are shown but my own answer is not posted here. Alternative answers can be found here, but I take no responsibility for any plagiarism of direct answers posted here.
Hi, I'm having trouble with the following exercise:
Find a regex that strictly represents the language:
b^(m+1), such that m>=0, m mod 2 = 1
The language breaks down to words:
{bb,bbbb,bbbbbb,bbbbbbbb,...}
I have tried the following:
b(bbb)?(bb)*
But this also accepts
{bb,bbb,bbbb,bbbbb,...}
Is there a way to write it so that one part of the expression depends on the other? I.e. (bb)* cannot be chosen if (bbb)? has just been chosen, then the decision repeats, with the reverse also allowed.
Any help would be appreciated. Thanks.
Update:
You can use
^(?:bb)+$
The question was initially titled "Regex that matches odd amount of character". For an odd number of b's you can try
^b(?:(?:b{2})+)?$
My guess is that, this might be closer,
^(?:bb){1,}$
and your set might look like,
bb
bbbb
bbbbbb
Not sure though. If your set was correct, the expression can likely be modified.
Also, b would probably not be in the set, since m=0 does not pass the second requirement.
If you wish to explore, simplify or modify the expression, it is explained in the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.
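A quick way to check which word lengths each expression accepts is to try them on strings of 0 to 6 b's. This is only a sketch in Python; the attempt from the question is anchored with ^ and $ here, which is my assumption about how it was meant to be used.

```python
import re

even_bs = re.compile(r"^(?:bb)+$")       # suggested answer: pairs of b, at least one pair
attempt = re.compile(r"^b(bbb)?(bb)*$")  # attempt from the question, anchored (assumption)

for n in range(7):
    word = "b" * n
    # Print the length and whether each pattern accepts the word.
    print(n, bool(even_bs.match(word)), bool(attempt.match(word)))
```

The suggested pattern accepts only the lengths 2, 4 and 6 in this range, while the anchored attempt also accepts odd-length words such as bbb.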

Regular expression to find all math operations

I have an expression like this:
5*3 + 2 +7 -sqrt(32312.323)
And I want to find all of the math operations in that expression. I've tried:
[A-Za-z\*\+\-\/]
Because the "-" is next to the "sqrt" operation, they get combined as "-sqrt". I want them to be separated in this case. How can I fix this?
Thank you in advance!
[*+\/-]+|[A-Za-z]+
This will work. According to @CertainPerformance, the performance should be fine: the problem is always when several branches of an alternation start by matching the same content (like a*|aa*, for example). There is no possibility that the two branches will ever both match on the same character, so that is as good as regex gets.
Also, thanks for your answer, @revo.
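For illustration, here is how the accepted pattern splits the sample expression when run through Python's re.findall (Python is my choice here; the question does not name a language).

```python
import re

expression = "5*3 + 2 +7 -sqrt(32312.323)"

# First branch: runs of operator characters; second branch: runs of letters.
# The two branches can never match the same character, so "-" and "sqrt"
# come out as separate tokens.
operations = re.findall(r"[*+\/-]+|[A-Za-z]+", expression)

print(operations)  # ['*', '+', '+', '-', 'sqrt']
```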

Regular Expressions for finding files

OK, I'm giving up and asking the question: I read through the regex help article and still don't have a clue what I'm looking for.
I have a list of files:
files <- c("files_combined.csv","file_1-10.csv","file_11-20.csv",
"file_21-30.csv","file_2731-2740.csv","file_2731-2740.txt")
I want only the csv files that start with "file_" and end with ".csv". I know it looks something like this:
grep(pattern = "^file_???.csv$" ,files)
But I need to find the correct regular expression that ignores the number of characters between the first and the second pattern ("file_" + ".csv"). I'd also really appreciate a pointer to a complete list of the regular expressions in R, since it is tedious to read through the help every time and, as in my case, not always successful.
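R offers a function for doing wildcard expansion using glob patterns, for those who don't like regex:
files <- Sys.glob("file_*.csv")
This should match your pattern. Note that Sys.glob expands the pattern against files on disk (relative to the working directory) rather than filtering an existing character vector.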
Thanks a lot! David Arenburg and Heroka, it seems you came up with the solution at the same time. Thanks also to MichaelChirico for providing the cheatsheet.
This is the answer to my specific problem:
grep("^file_.+\\.csv$", files, ignore.case = TRUE)
As for problems with regex, this is helpful as well: txt2re
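The pattern itself is not specific to R. As a rough cross-check, the sketch below applies the same expression to the same file names in Python (my choice of language; only the list and the pattern come from the question).

```python
import re

files = ["files_combined.csv", "file_1-10.csv", "file_11-20.csv",
         "file_21-30.csv", "file_2731-2740.csv", "file_2731-2740.txt"]

# ^file_  literal prefix, .+  one or more of any character,
# \.csv$  a literal dot followed by csv at the end of the name.
pattern = re.compile(r"^file_.+\.csv$", re.IGNORECASE)

csv_files = [name for name in files if pattern.search(name)]
print(csv_files)
# ['file_1-10.csv', 'file_11-20.csv', 'file_21-30.csv', 'file_2731-2740.csv']
```

"files_combined.csv" is rejected because the character after "file" is an s rather than an underscore.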

Rewrite regex without negation

I wrote this regex to help me extract some links from some text files:
https?:\/\/(?:.(?!https?:\/\/))+$
Because I am using the golang/regexp lib, I'm not able to use it, due to my negation (?!..
What I would like to do with it is select all the text from the last occurrence of http/https to the end.
sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2
=> Output: http://websites.com/path/subpath/#query2
Can anyone help me with a solution? I've spent several hours trying different ways of reproducing the same result, with no success.
Try this regex:
https?:[^:]*$
The lookaheads exist for a reason.
However, if you insist on a supposedly equivalent alternative, a general strategy you can use is:
(?!xyz)
is somewhat equivalent to:
$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$)
With that said, hopefully I didn't make any mistakes:
https?:\/\/(?:$|(?:[^h]|$)|(?:h(?:[^t]|$))|(?:ht(?:[^t]|$))|(?:htt(?:[^p]|$))|(?:http(?:[^s:]|$))|(?:https?(?:[^:]|$))|(?:https?:(?:[^\/]|$))|(?:https?:\/(?:[^\/]|$)))*$
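Another lookahead-free option, not taken from the answers above, is to put a greedy .* in front of a capture group: the greedy prefix swallows as much as it can, so the group starts at the last http:// or https://. The sketch below is in Python, and last_url is a made-up helper name. The pattern uses only a greedy quantifier and a capture group, which I would expect Go's regexp package to accept as well, but that is an assumption I have not tested here.

```python
import re

def last_url(text):
    """Return the text from the last http(s):// to the end, or None (hypothetical helper)."""
    # Greedy .* consumes as much as possible and backs off only until
    # "https?://" can match, so the group begins at the LAST occurrence.
    match = re.search(r".*(https?://.*)$", text)
    return match.group(1) if match else None

sample = ("sometextsometexhttp://websites.com/path/subpath/#query1"
          "sometexthttp://websites.com/path/subpath/#query2")
print(last_url(sample))  # http://websites.com/path/subpath/#query2
```

Unlike https?:[^:]*$, this version also copes with a colon later in the final URL, for example a port number.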

Using a regular expression to insert text in a match

Regular Expressions are incredible. I'm in my regex infancy so help solving the following would be greatly appreciated.
I have to search through a string to match for a P character that's not surrounded by operators, power or negative signs. I then have to insert a multiplication sign. Case examples are:
33+16*55P would become 33+16*55*P
2P would become 2*P
P( 33*sin(45) ) would become P*(33*sin(45))
I have written some regex that I think handles this, although I don't know how I can insert a character using regex.
The regex I've written is:
[^\^\+\-\/\*]?P+[^\^\+\-\/\*]
The language where the RegEx will be used is ActionScript 3.
A live example of the regex can be seen at:
http://www.regexr.com/39pkv
I would be massively grateful if someone could show me how to insert a multiplication sign in the middle of the match, i.e. P2 becomes P*2 and 22.5P becomes 22.5*P.
ActionScript 3 has search, match and replace functions that all utilise regular expressions. I'm unsure how I'd use string.replace( expression, replaceText ) in this context.
Many thanks in advance
Welcome to the wonder (and inevitable frustration that will lead to tearing your hair out) that is regular expressions. You should probably read over the documentation on using regular expressions in ActionScript, as well as this similar question.
You'll need to combine RegExp.test() with the String.replace() function. I don't know ActionScript, so I don't know if it will work as is, but based on the documentation linked above, the code below should be a good start for testing and for getting an idea of what the form of your solution might look like. I think @Vall3y is right. To get the replace right, you'd want to first check for anything leading up to a P, then for anything after a P. So two functions are probably easier to get right without getting too fancy with the regex:
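private function multiplyBeforeP(str:String):String {
    // A regex literal avoids losing the backslashes inside a string literal;
    // "g" replaces every occurrence. The character before P is required here,
    // otherwise a P at the start of the string would become "*P".
    var pattern:RegExp = /([^\^+\-\/*])P/gi;
    return str.replace(pattern, "$1*P");
}
private function multiplyAfterP(str:String):String {
    // Same idea for the character that follows P.
    var pattern:RegExp = /P([^\^+\-\/*])/gi;
    return str.replace(pattern, "P*$1");
}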
A regex on its own only finds patterns in strings; it cannot modify them. You will need ActionScript for that.
Many programming languages have a string.replace method that accepts a regex pattern. Since you have two cases (inserting before and after the P), a simple solution would be to split your regex into two, for example [^\^\+\-\/\*]?P+ and P+[^\^\+\-\/\*] (these might need adjustment), and replace each match with the corresponding string ("*P" and "P*").
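The two-pass idea can be sketched as follows in Python rather than ActionScript (my choice, so that it is easy to run). Instead of excluding operators, this version uses lookarounds and only inserts a sign where P touches a digit or a parenthesis; that narrower rule is my assumption, and insert_multiplication is a made-up name. It does not strip spaces, so the third case from the question keeps its inner spacing.

```python
import re

def insert_multiplication(expr):
    """Insert '*' between P and an adjacent number or parenthesis (hypothetical helper)."""
    # Pass 1: a digit or ')' directly before P, e.g. "55P" -> "55*P".
    expr = re.sub(r"(?<=[0-9)])P", "*P", expr)
    # Pass 2: P directly before a digit or '(', e.g. "P2" -> "P*2".
    expr = re.sub(r"P(?=[0-9(])", "P*", expr)
    return expr

print(insert_multiplication("33+16*55P"))        # 33+16*55*P
print(insert_multiplication("2P"))               # 2*P
print(insert_multiplication("P( 33*sin(45) )"))  # P*( 33*sin(45) )
```

Because the lookarounds match positions rather than characters, the neighbouring digit or parenthesis is left untouched and no capture groups are needed in the replacement.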