Stripping 3 parentheses in regex - regex

I run a small forum that has an issue with people using parentheses to bracket statements. They do it to signify they are talking about Jews. I guess it is called echoes or something. So they will put a name like (((Prominent Person))) like that in the middle of a conversation.
I have recently been trying to combat this without just banning people that can't behave. I have a decent word filter but it doesn't block that. I recently installed something that allows me to use regex to strip things out but I am having trouble finding the proper string that doesn't break everything else.
"/\W{3}(.*)\W{3}/","$1"
The first is the matching string and the comma separates what is left. This string works, it strips the parentheses out and leaves everything else alone. The problem is that the string is too broad. It also strips out any [ brackets as well which breaks all of the bbcode in a post. Any post that has any number of at least 3 brackets will be broken after that.
I have been playing with different strings on regex101 but not finding the best solution. I need any time that ((( or ))) is seen to strip out those and replace it with nothing, like it never happened. It has to be exactly three and only ((( and not the other brackets it could trigger on.
Does anyone have a good solution?

\({3}(.*)\){3}
https://regex101.com/r/wD5TMb/1
So in your format probably: "/\({3}(.*)\){3}/","$1"

Related

Improve regex that works

I'm not a regex expert, so please be nice :-)
I created this regex to verify if a user submitted a day of the week (in italian language):
/((lun|mart|giov)e|mercol(e?)|vener)d(ì|i('?)|í)|sabato|domenica/
This regex perfectly works and it matches the following:
lunedi
lunedì
lunedí
lunedi’
martedi
martedì
martedí
martedi'
mercoledi
mercoledì
mercoledí
mercoledi'
mercoldi
mercoldì
mercoldí
mercoldi'
giovedi
giovedì
giovedí
giovedi'
venerdi
venerdì
venerdí
venerdi'
sabato
domenica
Now consider the first part of the regex and focus on venerdì: as you can see, I added an OR (|) just to manage the venerdì day, just because of the presence of that “r”.
Anything works just fine but I’m here to ask if is there any way to start the regex this way:
(lun|mar|giov|ven)e
and then manage that “r” some way.
I red about backrefences and conditionals but I’m not sure they can be of any help.
My idea is something like: “if the first group captured ‘ven’, than add “r” to the “e” right after the end of the group.
Is this possible?
Don't "golf" your regex. If you want to improve it at all, make it more readable. While it it certainly worthwile to use different cases for the different "i" variants, everything else should IMHO be kept as simple as possible.
How about something like this?
(lune|marte|mercole?|giove|vener)d(ì|i'?|í)|sabato|domenica
Don't use backreferences and other advanced features if you don't need them, just to make your regex a few chars shorter. Even if you would still understand what it means, think about your fellow co-developers -- or just yourself two months from now.
I just removed a few redundant (...) and the "shared e" part. Note how (besides the (...)) it is the same length, whether you use (lun|mart|giov)e or lune|marte|giove, but the latter is arguably more readable. Similarly, a backreference or some conditional would likely make your regex longer instead of shorter -- and considerably more complicated.

Retrieve characters after nth occurrence of an another with Regex

I'm writing a simple bot that broadcasts messages to clients based on messages from a server. This will be done in JavaScript but I am trying to understand Regex. I've been Googling for the past hour and I've come so close but I am simply unable to solve this one.
Basically I need to retrieve everything between the second / and the first [. It sounds really simple but I cannot figure out how to do this.
Here's some sample code:
192.168.1.1:33291/76561198014386231/testName joined [linux/76561198014386231]
Here's the Regex I've come up with:
\/(.*?)\[
I've found lots of similar questions here on StackOverflow but most of them seem specific to a particular language or end up being too complex and I'm unable to whittle down the query.
I know this is a simple one, but I am totally stumped.
Instead of .*?. Then you could match everything but a forward slash by doing [^\/]*.
([^\/]*)\s*\[
Live preview
If it needs to be after the second slash. As in the contents between the second slash and the square bracket can contain slashes. Then you could do:
(?:.*?\/){2}(.*)\s*\[
Live preview
Remove the \s* if you want to. I'm just assuming you don't care about that whitespace.

Matching "Not a Word" in URL?

I'm trying to catch:
http://anydomain/MYDIR/filename.aspx
but NOT
http://anydomain/subdir/MYDIR/filename.aspx
(essentially, the rule is to capture the first, and redirect to the second -- I've moved the files - and anyone bookmarking the old, I want them whisked away to the new).
Everything I've done is capturing both, and generates a fun redirect loop. Fun as in GRR, not fun as in YAY. Admittedly, I'm terrible at RegEx beyond the basics, for the 2 times a decade I need it, and have promptly forgotten everything. The closest I've gotten is something like this:
^.*!NEWDIR\/MYDIR\/filename\.aspx$
but it doesn't seem to validate. I believe it's my "grouping" of NEWDIR in the regex, is it thinking I'm only not'ing the N and EWDIR is supposed to be there? How do I get it to "not" NEWDIR entirely?
Try this one:
^.*(?<!NEWDIR)\/MYDIR\/filename\.aspx$
This is using negative lookbehind. Here the goal is to math a string that is not proceeded by another string.
Here is a working example. For details on lookbehind check this page.

select area within characters using regex (spaces are an issue)

Some other guy asked a similar question earlier which got a lot of down votes, and I was interested in solving it. I came to a similar issue and would like some help with it.
Take into consideration this wall of text:
__don't__ and __do it__
__yellow__
__green__ and __purple__
I would like to select all the area within the underscores __'s
I attempted the following regex:
/__[!-~]+__/g which worked great on most things. I would like to add the ability to have spaces within the underscores. __do it__ will not be encapsulated in the search because it includes a space which was ruled out by the regex. I attempted the following:
/__[ -~]+__/g
It didn't work as planned, and selected everything from the very first __ to the very last. I was wondering how to tell the regex it has reached the end of a search once it sees a space after a __.
Here is the regex you could play around with below:
http://regexr.com/39br7
I tried using __[^ ]/g at the end but It didn't seem to help.
You could simply use the below regex,
__[^_]*__
DEMO
__(.*?)__
This seems to work.Look at the demo.
http://regex101.com/r/lJ1jB1/1

In what ways can I improve this regular expression?

I have written this regex that works, but honestly, it’s like 75% guesswork.
The goal is this: I have lots of imports in Xcode, like so:
#import <UIKit/UIKit.h>
#import "NSString+MultilineFontSize.h"
and I only want to return the categories that contain +. There are also lots of lines of code throughout the source which include + in other contexts.
Right now, this returns all of the proper lines throughout the Xcode project. But if there is one thing I’ve learned from googling and searching Stack Overflow for regex tutorials, it is that there are LOTS of different ways to do things. I’d love to see all of the different ways you guys can come up with that make it either more efficient or more bulletproof regarding potential spoofs or misses.
^\#import+.[\"]*+.(?:(?!\+).)*+.*[\"]
Thanks in advance for all of your help.
Update
Also I suppose I’ll accept the answer of whoever does this with the shortest string, without missing any possible spoofs. But again, thanks to everyone who participates in this learning experience.
Resources from answers
This is an awesome resource for practicing regex from Dan Rasmussen: RegExr
The first thing I notice is that your + characters are misplaced: t+. matches t one or more times, followed by a single character .. I'm assuming you wanted to match the end of import, followed by one or more of any character: import.+
Secondly, # doesn't need to be escaped.
Here's what I came up with: ^#import\s+(.*\+.*)$
\s+ matches one or more whitespace character, so you're guaranteed that the line actually starts with #import and not #importbutnotreally or anything else.
I'm not familiar with xcode syntax, but the following part of the expression, (.*\+.*), simply matches any string with a + character somewhere in it. This means invalid imports may be matched, but I'm working under the assumption your trying to match valid code. If not, this will need to be modified to validate the importer syntax as well.
P.S. To test your expression, try RegExr. You can hover over characters to check what they do.
sed 's:^#import \(.*[+].*\):\1:' FILE
will display
"NSString+MultilineFontSize.h"
for your sample.