Rewrite regex without negation - regex

I wrote this regex to help me extract links from some text files:
https?:\/\/(?:.(?!https?:\/\/))+$
Because I am using Go's regexp package, I can't use it: the negative lookahead (?!...) is not supported.
What I would like to do is select all the text from the last occurrence of http/https to the end of the string.
sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2
=> Output: http://websites.com/path/subpath/#query2
Can anyone help me with a solution? I've spent several hours trying different ways of reproducing the same result, with no success.

Try this regex:
https?:[^:]*$
Regex live here.
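As a quick check, here is a minimal Go sketch that runs the pattern above against the sample from the question. The function name lastURLNoColon and the second test string are made up for illustration. Note the assumption baked into [^:]*: the last URL must not contain another colon (for example a port number), otherwise nothing matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// noColon is the pattern from the answer above. It assumes the last URL
// contains no ':' after the scheme.
var noColon = regexp.MustCompile(`https?:[^:]*$`)

// lastURLNoColon is a hypothetical helper name used only for this sketch.
func lastURLNoColon(s string) string {
	return noColon.FindString(s)
}

func main() {
	sample := "sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2"
	fmt.Println(lastURLNoColon(sample))

	// A port number adds a second colon, so the pattern finds no match here.
	fmt.Printf("%q\n", lastURLNoColon("xhttp://a.com/1yhttp://b.com:8080/2"))
}
```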

The lookaheads exist for a reason.
However, if you insist on a supposedly equivalent alternative, a general strategy you can use is:
(?!xyz)
is somewhat equivalent to:
$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$)
With that said, hopefully I didn't make any mistakes:
https?:\/\/(?:$|(?:[^h]|$)|(?:h(?:[^t]|$))|(?:ht(?:[^t]|$))|(?:htt(?:[^p]|$))|(?:http(?:[^s:]|$))|(?:https?(?:[^:]|$))|(?:https?:(?:[^\/]|$))|(?:https?:\/(?:[^\/]|$)))*$
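For comparison, a different technique from the expansion above (not what this answer proposes) is to let a greedy prefix consume everything up to the last http:// or https:// and capture the rest. This is only a sketch; lastURL is a made-up name, and it assumes that "the last URL" means "everything from the last scheme to the end of the input".

```go
package main

import (
	"fmt"
	"regexp"
)

// The greedy ".*" prefix swallows as much as it can, so the capture group
// starts at the LAST "http://" or "https://" in the input.
// (?s) lets "." match newlines too, in case the text spans several lines.
var lastURLRe = regexp.MustCompile(`(?s)^.*(https?://.*)$`)

// lastURL is a hypothetical helper; it returns "" when no URL is present.
func lastURL(s string) string {
	m := lastURLRe.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	sample := "sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2"
	fmt.Println(lastURL(sample))
	fmt.Println(lastURL("xhttp://a.com/1yhttp://b.com:8080/2"))
}
```

Unlike a pattern built on [^:], this one does not care about colons inside the URL, at the cost of needing a capture group.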

Related

Regex to match everything."LettersNumbers"."extension" and forum searching tip

I need a regex to match my files named "something".Title"numberFrom1to99".mp4 in Windows File Explorer. My first approach, as a regex newbie, was something like
"..mp4"
, but it didn't work, so I tried
"*.Title[1-9][0-9].mp4"
, which also did not work.
I would also like a tip on how to search for regex-related advice in the Stack Overflow archive and on the web, so that I can be specific without the regex itself interacting with the search bar.
Thank you!
EDIT
About the second part of the question: the question itself shows "..mp4", but I typed "asterisk"."asterisk".mp4. Is there any universal way to write a regex on the web without it taking effect and without escaping the characters? (With escaping, the backslash shows inside the regex, and that could be misunderstood.)
Try something like this:
(.*)\.[A-Za-z]+\d+\.mp4
See this Regex Demo to get an explanation of the regex.
Use regex101.com to test your regexes.
Here it is:
^[\s\S]*\.Title[1-9][0-9]?\.mp4$
I suggest regexr.com, where you can find many interesting regexes (Favourites tab) and a simple tutorial.
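If you end up filtering the file names in a script rather than in File Explorer, the second pattern can be tried out like this. This is only a sketch in Go; the function name and the file names are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// titleRe is the second pattern from above, copied verbatim:
// anything, then ".Title", a number from 1 to 99, then ".mp4".
var titleRe = regexp.MustCompile(`^[\s\S]*\.Title[1-9][0-9]?\.mp4$`)

// isTitleFile is a hypothetical helper name used only for this sketch.
func isTitleFile(name string) bool {
	return titleRe.MatchString(name)
}

func main() {
	// Made-up file names for illustration.
	for _, name := range []string{
		"holiday.Title7.mp4",
		"holiday.Title42.mp4",
		"holiday.Title0.mp4",
		"holiday.Title100.mp4",
	} {
		fmt.Printf("%-22s %v\n", name, isTitleFile(name))
	}
}
```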

Monitoring bad links with RegEx in Google Analytics

How do I optimize this to find all links ending in weird typos, yet still exclude correct links (ending with .html) from the results?
htmll$|hhtml$|httml$|htmml$|htmll$|btml$|hml$|htl$
Thanks in advance!
Wow, those are some pretty restrictive regex rules, but that kind of makes it interesting.
Since we have no character negation but we do have character classes, we could do:
[a-gi-z]tml$|h[a-su-z]ml|ht[a-ln-z]l|htm[a-km-z]
for my second suggestion, and:
h.+tml|ht.+ml|htm.+l|html.+
to replace the first option, leading to a total of:
[a-gi-z]tml$|h[a-su-z]ml|ht[a-ln-z]l|htm[a-km-z]|h.+tml|ht.+ml|htm.+l|html.+
EDIT: Having noticed that the .+ parts can catch things we don't want, this should be changed slightly:
(.*[a-gi-z]tml|h.*[a-su-z]ml|ht.*[a-ln-z]l|htm.*[a-km-z])$
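Before putting a pattern like this into a report, it may be worth running it against the typo list. Below is a rough Go sketch of such a check. It assumes Go's regexp behaves like the Google Analytics engine for these simple constructs, and the /page. URLs are made up. In this run the correct .html ending is left alone and several typos are flagged, but hhtml slips through, so the pattern probably still needs work.

```go
package main

import (
	"fmt"
	"regexp"
)

// typoPattern is the final expression from the answer above, copied verbatim.
var typoPattern = regexp.MustCompile(`(.*[a-gi-z]tml|h.*[a-su-z]ml|ht.*[a-ln-z]l|htm.*[a-km-z])$`)

// flagged is a hypothetical helper: true means the URL would show up
// in the "bad links" report.
func flagged(url string) bool {
	return typoPattern.MatchString(url)
}

func main() {
	// Made-up URLs built from the endings listed in the question.
	for _, u := range []string{
		"/page.html",
		"/page.htmll",
		"/page.httml",
		"/page.htmml",
		"/page.btml",
		"/page.hhtml",
	} {
		fmt.Printf("%-12s flagged=%v\n", u, flagged(u))
	}
}
```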

RegEx all URLs that do NOT contain a string

I seem to be having a bit of a brain fart atm. I've got Google counting my transitions correctly but I'm getting false positives.
This is the current goal RegEx which works great.
^/click/[0-9]+\.html\?.*
But I also want the RegEx to NOT count anything that has &confirm=1. I'm quite stuck as to how to do that in the RegEx. I thought I might be able to use [^(?:&confirm=1)], but I don't think that's valid.
Use the "exclude" filter option, not "include".
Try this:
^/click/[0-9]+\.html\?(?!.*\bconfirm=1).*
I changed it slightly so it will still exclude if confirm=1 is the first param (preceded by the ? rather than &)
I'm afraid you can't... I've tried doing this before, what I found was that you used to be able to do this with negative lookahead (see Rubens), but Google Analytics stopped supporting this at some point (source: http://productforums.google.com/forum/#!topic/analytics/3YnwXM0WYxE).
Maybe I'm a little late.
What about just writing:
[^(&confirm=1)]
?
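When lookahead is not available, the include-plus-exclude idea can be written as two separate checks. The Go sketch below is only an illustration of that logic, not Google Analytics configuration; countsAsGoal, confirmRe and the sample URIs are made up. It also treats confirm=1 as a whole parameter, which is slightly stricter than the \bconfirm=1 lookahead shown earlier.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// goalRe is the goal pattern from the question.
	goalRe = regexp.MustCompile(`^/click/[0-9]+\.html\?.*`)
	// confirmRe is an assumed "exclude" pattern: confirm=1 as a whole
	// query parameter, preceded by ? or &.
	confirmRe = regexp.MustCompile(`[?&]confirm=1(&|$)`)
)

// countsAsGoal is a hypothetical helper: include first, then exclude.
func countsAsGoal(uri string) bool {
	return goalRe.MatchString(uri) && !confirmRe.MatchString(uri)
}

func main() {
	// Made-up request URIs for illustration.
	for _, uri := range []string{
		"/click/42.html?src=mail",
		"/click/42.html?src=mail&confirm=1",
		"/click/42.html?confirm=1&src=mail",
	} {
		fmt.Printf("%-36s %v\n", uri, countsAsGoal(uri))
	}
}
```

A bracket expression such as [^(&confirm=1)] cannot do this job on its own, because it only describes a single character that is not one of the listed ones.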

In what ways can I improve this regular expression?

I have written this regex that works, but honestly, it’s like 75% guesswork.
The goal is this: I have lots of imports in Xcode, like so:
#import <UIKit/UIKit.h>
#import "NSString+MultilineFontSize.h"
and I only want to return the categories that contain +. There are also lots of lines of code throughout the source which include + in other contexts.
Right now, this returns all of the proper lines throughout the Xcode project. But if there is one thing I’ve learned from googling and searching Stack Overflow for regex tutorials, it is that there are LOTS of different ways to do things. I’d love to see all of the different ways you guys can come up with that make it either more efficient or more bulletproof regarding potential spoofs or misses.
^\#import+.[\"]*+.(?:(?!\+).)*+.*[\"]
Thanks in advance for all of your help.
Update
Also I suppose I’ll accept the answer of whoever does this with the shortest string, without missing any possible spoofs. But again, thanks to everyone who participates in this learning experience.
Resources from answers
This is an awesome resource for practicing regex from Dan Rasmussen: RegExr
The first thing I notice is that your + characters are misplaced: t+. matches t one or more times, followed by any single character (.). I'm assuming you wanted to match the end of import, followed by one or more of any character: import.+
Secondly, # doesn't need to be escaped.
Here's what I came up with: ^#import\s+(.*\+.*)$
\s+ matches one or more whitespace characters, so you're guaranteed that the line actually starts with #import and not #importbutnotreally or anything else.
I'm not familiar with Xcode syntax, but the following part of the expression, (.*\+.*), simply matches any string with a + character somewhere in it. This means invalid imports may be matched, but I'm working under the assumption that you're trying to match valid code. If not, this will need to be modified to validate the import syntax as well.
P.S. To test your expression, try RegExr. You can hover over characters to check what they do.
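To see the expression at work outside Xcode's search box, here is a small Go sketch that applies it line by line and returns the captured part. The function name and the sample source text are made up. It makes the same assumption as above: the input is valid code, so any #import line containing + is treated as a category import.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// categoryImport is the expression from the answer above.
var categoryImport = regexp.MustCompile(`^#import\s+(.*\+.*)$`)

// categoryImports is a hypothetical helper that returns the captured part
// of every #import line containing a '+'.
func categoryImports(src string) []string {
	var out []string
	for _, line := range strings.Split(src, "\n") {
		if m := categoryImport.FindStringSubmatch(line); m != nil {
			out = append(out, m[1])
		}
	}
	return out
}

func main() {
	// Made-up source: two imports plus an unrelated line containing '+'.
	src := "#import <UIKit/UIKit.h>\n" +
		"#import \"NSString+MultilineFontSize.h\"\n" +
		"int total = a + b;"
	fmt.Println(categoryImports(src))
}
```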
sed -n 's:^#import \(.*[+].*\):\1:p' FILE
will display
"NSString+MultilineFontSize.h"
for your sample.

Regex - match a string not contain a 'semi-word'

I tried to make regex syntax for that but I failed.
I have 2 variables
PlayerInfo[playerid][pLevel]
and
Character[playerid]
and I want to catch only the second variable, I mean only the word that doesn't contain PlayerInfo but contains [playerid].
"(\S+)\[playerid\]" catches both words, and (\S+[^PlayerInfo])\[playerid\] skips some variables: the ones that contain p, l, a, y ...
I need to do a replace in Notepad++ that turns all variables like Text[playerid] into ExClass[playerid][Text].
A couple of plausible solutions.
Notepad++ has a plugin called Python Script. Running regex from there gives full regex functionality (the Python version, anyway) and a lot of powerful potential beyond that. I also use the online Python regex tester to help out.
The RegRexReplace plugin helps create regex plugins in Notepad++, so when you do hit a limitation, you find out a lot quicker.
Or, of course, default to your alternate editor (I'm assuming you have one?), or use this online regex tool, which is absolutely amazing. You can perform the action on the text online as well.
(I'd try to build a regex for you, but I'm a bit lost as to what you're looking for, unless Ivo Abeloos got it. If you're still coming up short, maybe post a code example along with the values displayed?)
Good luck!
It seems that Notepad++ has supported negative lookbehind since v6.
In Notepad++ you could try to replace (.+)\[(.+)\] with ExClass\[\2\]\[\1\]
Try to use negative lookbehind.
(?<!PlayerInfo)\[playerid\]
EDIT: Unfortunately, Notepad++ does not support negative lookbehind.
I tried to make a workaround based on the following naive idea:
(.[^o]|[^f]o)[playerid]
But this expression does not work either. Notepad++ seems to fail on the alternation operator. Thus the answer is: it is impossible to do exactly what you want. Try to solve the problem another way, or use an alternative tool.
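As one possible "alternative tool", the exclusion can be done in code instead of inside the regex. The Go sketch below is only an illustration (Go's regexp has no lookbehind either); the function name rewrite and the sample line are made up, and it assumes variable names consist of word characters only.

```go
package main

import (
	"fmt"
	"regexp"
)

// varRe matches Name[playerid]; names are assumed to be word characters only.
var varRe = regexp.MustCompile(`(\w+)\[playerid\]`)

// rewrite is a hypothetical helper: it turns Name[playerid] into
// ExClass[playerid][Name], leaving PlayerInfo[playerid] untouched.
func rewrite(src string) string {
	return varRe.ReplaceAllStringFunc(src, func(m string) string {
		name := varRe.FindStringSubmatch(m)[1]
		if name == "PlayerInfo" {
			return m // the exclusion happens here, not in the regex
		}
		return "ExClass[playerid][" + name + "]"
	})
}

func main() {
	// Made-up line combining both variables from the question.
	fmt.Println(rewrite("PlayerInfo[playerid][pLevel] = Character[playerid];"))
}
```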