Cannot match a regular expression - regex

I am trying to match the following two words, but for some reason my regexp doesn't work.
Here is what I'm trying to match: OCXXXXXX GXXXXXXX
X being any number or letter
Here is my regexp
OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+$
If I remove the dollar sign, it works, but I want the regexp to match exactly those two words and fail if there are more than those two words. Because of that, I want to use the $. Any ideas why this does not work?

Maybe you have a space at the end; try this:
OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+\b
or
OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+\s*

That's weird, it is now working. I was using this website:
http://www.regexr.com/
I ended up using
^OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+$
After I refreshed the website, it started working. Sorry about this post.
Thanks anyway M42

^\s*OC[a-zA-Z0-9]+\s+G[a-zA-Z0-9]+\s*$ should work.
It's anchored at the beginning and end of the string with ^ and $, and it allows
optional whitespace at the beginning or end and requires whitespace between the words.
The quantifiers on the whitespace are open-ended.
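For anyone who wants to check this outside regexr, here is a minimal Python sketch of the anchored pattern above. The helper name is_two_word_code and the sample strings are made up for illustration; only the two patterns come from this thread.

```python
import re

# Pattern from the answer above: anchored at both ends, optional
# surrounding whitespace, required whitespace between the two words.
TOLERANT = re.compile(r"^\s*OC[a-zA-Z0-9]+\s+G[a-zA-Z0-9]+\s*$")

# The asker's original pattern, which rejects a trailing space.
STRICT = re.compile(r"OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+$")


def is_two_word_code(text):
    """Return True if text is exactly one OC... word and one G... word."""
    return TOLERANT.search(text) is not None


samples = [
    "OC123abc G4567xyz",        # plain pair
    "OC123abc G4567xyz ",       # trailing space
    "OC123abc G4567xyz extra",  # a third word
]
for sample in samples:
    print(repr(sample), is_two_word_code(sample), STRICT.search(sample) is not None)
```

The second sample is the case suspected in the first answer: the strict pattern fails on it because the trailing space sits between the last word and the end of the string.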

Related

Regex for selecting words ending in 'ing' unless

I want to select words ending in ing with a regular expression, but I want to exclude words that end in thing. For example:
everything
running
catching
nothing
Of these words, running and catching should be selected, everything and nothing should be excluded.
I've tried the following:
.+ing$
But that selects everything. I'm thinking look aheads/look arounds could be the solution, but I haven't been able to get one that works.
Solutions that work in Python or R would be helpful.
In Python you can use a negative lookbehind assertion like this:
^.*(?<!th)ing$
(?<!th) is a negative lookbehind that fails the match if th comes immediately before ing at the end of the string.
Note that if you are matching words that are not on separate lines, use word boundaries instead of anchors:
\w+(?<!th)ing\b
Something like \b\w+(?<!th)ing\b maybe.
You might also use a negative lookahead (?!...) to assert that what is on the right is not zero or more word characters followed by thing and a word boundary:
\b(?!\w*thing\b)\w*ing\b
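Since the question asked for something that works in Python, here is a small sketch that runs the lookbehind and lookahead patterns from the answers above against the four sample words. The variable names and the sample sentence are illustrative assumptions.

```python
import re

words = ["everything", "running", "catching", "nothing"]

# Negative lookbehind: the final "ing" must not be preceded by "th".
lookbehind = re.compile(r"^.*(?<!th)ing$")

# Negative lookahead: the word as a whole must not end in "thing".
lookahead = re.compile(r"\b(?!\w*thing\b)\w*ing\b")

selected_behind = [w for w in words if lookbehind.search(w)]
selected_ahead = [w for w in words if lookahead.fullmatch(w)]
print(selected_behind)  # ['running', 'catching']
print(selected_ahead)   # ['running', 'catching']

# Word-boundary variant for words embedded in running text.
sentence = "everything is running while nothing is catching"
in_sentence = re.findall(r"\b\w+(?<!th)ing\b", sentence)
print(in_sentence)      # ['running', 'catching']
```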

Notepad++ Regex Find all endline without periods

I'm trying to find all lines that don't end with a period (dot), without matching blank (empty) lines. After that, I want to add a period to the end of each such line.
Example:
The good is whatever stops such things from happening.
Meaning as the Higher Good
It was from this that I drew my fundamental moral conclusions.
I have tried a few regexes, but they also match empty lines.
Is there a regex for Notepad++ that can achieve that?
Enable Regular Expression match, then search for:
\S(?<!\.)\K\s*$
and replace with:
.$0
Breakdown:
\S Match a non-whitespace character
(?<!\.) It shouldn't be a period
\K Reset match
\s* Match optional whitespace characters
$ End of line
You could use something like this to find the lines you are interested in, then add a capture group to it and append the characters you need.
(?<!\.)\r\n
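This works by using a negative lookbehind (?<!\.) to check that there is no . before \r. Note that it also matches the line break of an empty line, since the character before \r is then not a period either.
There is a group of regex operators that can be used for this type of task.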
Look ahead positive (?=)
Look ahead negative (?!)
Look behind positive (?<=)
Look behind negative (?<!)
Try this short and effective solution too.
Search: \w$
Replace: $0.
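Outside Notepad++, the same idea can be sketched in Python. Python's re module has no \K, so this version uses capture groups instead. The function name add_missing_periods and the pattern are my own adaptation, not something taken from the answers above.

```python
import re

# ([^\s.])    last visible character of the line, and not a period
# ([^\S\n]*)  optional trailing whitespace, excluding the newline itself
# $           end of line (re.MULTILINE makes $ match before each "\n")
MISSING_PERIOD = re.compile(r"([^\s.])([^\S\n]*)$", re.MULTILINE)


def add_missing_periods(text):
    """Append a period to every non-empty line that does not end with one."""
    return MISSING_PERIOD.sub(r"\1.\2", text)


sample = (
    "The good is whatever stops such things from happening.\n"
    "Meaning as the Higher Good\n"
    "\n"
    "It was from this that I drew my fundamental moral conclusions.\n"
)
print(add_missing_periods(sample))
```

Blank lines are left alone because the pattern requires at least one non-whitespace character on the line, and any trailing whitespace is kept after the inserted period, as in the .$0 replacement above.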

regex to match word (url) only if it does not contain character

I'm using an API that sometimes truncates links inside the text that it returns and instead of "longtexthere https://fancy.link" I get "longtexthere https://fa…".
I'm trying to match the link only if it's complete, in other words only if it does not contain the "…" character.
So far I am able to get links by using the following regex:
((?:https?:)?\/\/\S+\/?)
but obviously it returns every link including broken ones.
I've tried to do something like this:
((?:https?:)?\/\/(?:(?!…)\S)+\/?)
Although that started to ignore the "…" character it was still returning the link but just without including the character, so with the case of "https://fa…" it returned "https://fa" whereas I simply want it to ignore that broken link and move on.
Been fighting this for hours and just can't get my head around it. :(
Thanks for any help in advance.
You can use
(?:https?:)?\/\/[^\s…]++(?!…)\/?
See the regex demo. The possessive quantifier [^\s…]++ will match all non-whitespace and non-… characters without later backtracking and then check if the next character is not …. If it is, no match will be found.
As an alternative, if your regex engine does not allow possessive quantifiers, use a negative lookahead version:
(?!\S+…)(?:https?:)?\/\/\S+\/?
See another regex demo. The lookahead (?!\S+…) will fail the match if 1+ non-whitespace characters are followed with ….
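As a quick check, here is the lookahead version in Python. I picked it for the sketch because it does not depend on possessive quantifiers, which older versions of Python's re module do not support. The sample text is made up, and "…" is assumed to be the single U+2026 character rather than three dots.

```python
import re

# Lookahead version from the answer above. The slashes do not need
# escaping in Python, so \/ is written as plain /.
COMPLETE_URL = re.compile(r"(?!\S+…)(?:https?:)?//\S+/?")

text = "longtexthere https://fa… and later https://fancy.link"
print(COMPLETE_URL.findall(text))  # ['https://fancy.link']
```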
You can try the following regex:
https?:\/\/\w+(?:\.\w+\/?)+(?!\.{3})(\s|$)
See demo https://regex101.com/r/bS6tT5/3
Try:
((?:https?:)?\/\/\S+[^ \.]{3}\/?)
It's the same as your original pattern; you just tell it that the last three characters should not be '.' (period) or ' ' (space).
UPDATE: Your second link worked.
and if you tweak your regex just slightly it will do what you want:
((?:https?:)?\/\/\S+[^ …] \/?)
Yes, it looks just like what you had, except that I added a ' ' (space) after the part we do not want. This forces the regular expression to match up to and including the space, which it cannot do with a URL that contains the '...' character. Without the space at the end, it would match up to but not including the '...', which is why it was not doing what we wanted ;)
Please try:
https?:\/\/[^ ]*?…|(https?:\/\/[^ ]+\.[^ ]+)
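The alternation above works by letting the left branch consume truncated links without capturing them, so only complete links end up in group 1. Here is a rough Python sketch of how the result could be filtered; the function name complete_links and the sample text are assumptions of mine.

```python
import re

# Left branch: a link cut off by "…" (matched, but not captured).
# Right branch: a complete link, captured in group 1.
LINKS = re.compile(r"https?://[^ ]*?…|(https?://[^ ]+\.[^ ]+)")


def complete_links(text):
    """Return only the links that were captured by the right-hand branch."""
    return [m.group(1) for m in LINKS.finditer(text) if m.group(1)]


print(complete_links("longtexthere https://fa… and https://fancy.link"))
# ['https://fancy.link']
```

Group 1 is None whenever the left branch matched, which is why the list comprehension skips those matches.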

Regex to search for a phrase

Hi,
please help me with a regex to find phrases in a text.
My regex is not working. My assumption is that phrases begin with an uppercase letter and end with a dot, and can contain anything in between.
\b([A-Z]+[aA-zZ]*\b(.)+)
Sincerely,
You can use the following if the text in between doesn't itself contain a dot.
[A-Z][^.]*\.
Or perhaps, you could try using the following.
[A-Z].*?\.
Here is one variant
\b([A-Z][^.]*\.+)\b
Try this; it starts with a capital letter, ends with a dot, and allows zero or more of anything in between:
^[A-Z].*[.]$
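To see how the first suggestion behaves on real text, here is a short Python sketch using [A-Z][^.]*\. with findall. The sample strings are invented for illustration.

```python
import re

# An uppercase letter, then anything except a dot, then the closing dot.
PHRASE = re.compile(r"[A-Z][^.]*\.")

text = "The first phrase ends here. Another one follows. trailing words"
phrases = PHRASE.findall(text)
print(phrases)  # ['The first phrase ends here.', 'Another one follows.']

# Caveat: a dot inside the phrase, such as an abbreviation, splits it early.
split_early = PHRASE.findall("Dr. Smith arrived.")
print(split_early)  # ['Dr.', 'Smith arrived.']
```

The second call shows the limitation mentioned above: the pattern only fits when the text between the capital letter and the final dot contains no other dot.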

Regex for deleting characters before a certain character?

I'm very new at regex, and to be completely honest it confounds me. I need to grab the string after a certain character is reached in said string. I figured the easiest way to do this would be using regex, however like I said I'm very new to it. Can anyone help me with this or point me in the right direction?
For instance:
I need to check the string "23444:thisstring" and save "thisstring" to a new string.
If this is your string:
I'm very new at regex, and to be completely honest it confounds me
and you want to grab everything after the first "c", then this regular expression will work:
/c(.*)/s
It will return this match in the first matched group:
"ompletely honest it confounds me"
Explanation:
The c is the character you are looking for
.* (in combination with /s) matches everything left
(.*) captures what .* matched, making it available in $1 and returned in list context.
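The answer above uses Perl-style /c(.*)/s. A rough Python equivalent could look like this, with re.DOTALL playing the role of /s. The helper name text_after is hypothetical.

```python
import re


def text_after(char, text):
    """Return everything after the first occurrence of char, or None."""
    # re.escape keeps characters such as "." or "$" literal.
    match = re.search(re.escape(char) + r"(.*)", text, flags=re.DOTALL)
    return match.group(1) if match else None


print(text_after(":", "23444:thisstring"))  # thisstring
print(text_after("c", "I'm very new at regex, and to be completely honest it confounds me"))
# ompletely honest it confounds me
```

For a single fixed separator like the colon in the question, "23444:thisstring".partition(":")[2] gives the same result without a regex.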
Regex for deleting characters before a certain character!
You can use lookahead like this
.*(?=x)
where x is a particular character, word, or string. Characters like ., $, ^, *, and + have special meaning in regex, so don't forget to escape them when using them within x.
EDIT
for your sample string it would be
.*(?=thisstring)
.* matches zero or more characters up to thisstring
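Here is a minimal Python sketch of the lookahead approach, using re.escape to handle the escaping caveat mentioned above. The function name delete_before is made up for the example. Because .* is greedy, it deletes up to the last occurrence of the marker.

```python
import re


def delete_before(marker, text):
    """Delete everything that precedes the last occurrence of marker."""
    # The lookahead checks for the marker without consuming it,
    # so the marker itself stays in the result.
    pattern = r".*(?=" + re.escape(marker) + r")"
    return re.sub(pattern, "", text, count=1, flags=re.DOTALL)


print(delete_before("thisstring", "23444:thisstring"))  # thisstring
print(delete_before(".", "a.b.c"))                      # .c
```

Without re.escape, the "." in the second call would be treated as "any character" and the result would be different.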
Here is a one-line solution for matching everything after "before"
print $1."\n" if "beforeafter" =~ m/before(.*)/;
Edit:
While using lookbehind is possible, it's not required. Grouping provides an easier solution.
To get the string after : in your example, you can use [^:][^:]*:\(.*\) (basic regex syntax, where \( and \) delimit the capture group). Notice that you should have at least one [^:] followed by any number of [^:]s followed by an actual :, the character you are searching for.