How to reverse a regex so it does not match - regex

I have a regular expression that selects URLs. I want it to select only plain words such as admin and hello, not URLs. How can I make it skip the URLs?
Regex
((.*?\w+|\W):\/\/[\w\-\.]+.*?\/*.*?\w\W+.*\/.*?\w\W+.*?\/{0,})
Text
htt$ps://b24-56kck1.$bitr%ix24.kz/com#pany/pe#rsonal/us^&er/19/k/roce/
https://1.tesssst1.ru/ororo
admin
hello
##$#$$#w_svccx354V2346Vf
SendAjaxFilterToServer(quiz_questions);

Alex, it is very hard to invert a regular expression, so think instead in terms of the attributes of what you want to match. One thing that jumps out to me is that you want lines that contain only letters. For that, you can use ^[a-zA-Z]+$
Another approach is to build an inverted list of characters: the ones you don't want present. This can be harder, but for the simple example input you give, you don't want ":", "/" or "#" to be in the line. That would be ^[^:/#]+$.
These are examples of how you need to think about the problem.
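As a rough illustration of both ideas, here is a small Python sketch run over the sample lines from the question. Python's re module is an assumption made for the example (the patterns are simple enough that most engines should treat them the same way), and the names sample_lines, letters_only, and no_url_chars are made up for illustration.

```python
import re

# Sample lines copied from the question.
sample_lines = [
    "htt$ps://b24-56kck1.$bitr%ix24.kz/com#pany/pe#rsonal/us^&er/19/k/roce/",
    "https://1.tesssst1.ru/ororo",
    "admin",
    "hello",
    "##$#$$#w_svccx354V2346Vf",
    "SendAjaxFilterToServer(quiz_questions);",
]

# Whitelist: the whole line must consist of letters.
letters_only = re.compile(r"^[a-zA-Z]+$")
# Blacklist: the line must not contain ":", "/" or "#".
no_url_chars = re.compile(r"^[^:/#]+$")

whitelist_hits = [line for line in sample_lines if letters_only.match(line)]
blacklist_hits = [line for line in sample_lines if no_url_chars.match(line)]

print(whitelist_hits)
print(blacklist_hits)
```

On this input the whitelist keeps only admin and hello, while the blacklist also lets the SendAjaxFilterToServer line through, which is one reason describing what you do want tends to be easier.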

Try this, then trim the surrounding whitespace from each match (needed because Go does not support lookaround):
(^|[\n\s])[a-zA-Z]+([\n\s]|$)
https://regex101.com/r/MqyDWC/3

Related

Regex Expressions, URL and end part

I'm trying to make a regex expression to detect a URL with a dynamic ending from a message. So for example it would be something like this.
"http://localhost/something/randomstring example text example text example text"
So the "http://localhost/something/" part will always be the same, but the random string part won't, and I want to grab only "http://localhost/something/randomstring".
I've tried doing this expression
"/http://localhost/something/(.*) "
The thing is, it selects the whole text. I've tried looking up online but can't find anything. Would love some help :)
The .* will keep 'eating up' characters. You probably want something like
/http:\/\/localhost\/something\/([^\s]*)/
to make it 'stop' at a white-space character. Or
/http:\/\/localhost\/something\/([a-z0-9]*)/
if you are sure that randomstring only contains lowercase letters and digits.
Example: https://regex101.com/r/U12o53/1
You need to modify the (.*) part of the url so it only contains valid url characters, e.g.
/http:\/\/localhost\/something\/([\d\w\-_]*)/
You can modify it as you need based on the characters that can be in randomstring.
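To make the difference concrete, here is a minimal Python sketch comparing the greedy attempt with the whitespace-bounded version. Python is only an assumption for the demonstration (the question does not name a language), and the variable names are hypothetical. The sample message is the one from the question with the host spelled localhost.

```python
import re

message = "http://localhost/something/randomstring example text example text example text"

# Greedy: (.*) followed by a space runs on to the LAST space in the message.
greedy = re.search(r"http://localhost/something/(.*) ", message)

# Bounded: \S* (same as [^\s]*) stops at the first whitespace character.
bounded = re.search(r"http://localhost/something/(\S*)", message)

greedy_match = greedy.group(0)
bounded_match = bounded.group(0)
bounded_token = bounded.group(1)

print(repr(greedy_match))
print(repr(bounded_match))
print(repr(bounded_token))
```

The greedy match swallows most of the sentence, while the bounded one ends right after randomstring.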

How do I properly format this Regex search in R? It works fine in the online tester

In R, I have a column of data in a data-frame, and each element looks something like this:
Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Marinilabiaceae
What I want is the section after the last semicolon. I've been trying to use 'sub' on a duplicate of the existing column, so that the new column keeps just the endings. In essence, I want this (the genus):
Marinilabiaceae
A snippet of the code looks like this:
mydata$new_column<- sub("([\\s\\S]*;)", "", mydata$old_column)
In this situation, I am using \\ rather than \ because of R's escape sequences. The sub replaces the parts I don't want and updates it to the new column. I've tested the Regex several times in places such as this: http://regex101.com/r/kS7fD8/1
However, I'm still struggling because the results are very bizarre. Now my new column is populated with the organism's domain rather than the genus: Bacteria.
How do I resolve this? Are there any good easy-to-understand resources for learning more about R's Regex formats?
Starting with your simple string,
string <- "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Marinilabiaceae"
You can remove everything up to the last semicolon with "^(.*);" in your call to sub
> sub("^(.*);", "", string)
# [1] "Marinilabiaceae"
You can also use strsplit with tail
> tail(strsplit(string, ";")[[1]], 1)
# [1] "Marinilabiaceae"
Your regular expression, ([\\s\\S]*;), most likely fails because R's default regex engine is not PCRE and does not treat \\s and \\S inside a bracket expression the way PCRE does. It worked on the regex101 site because that regex tester defaults to pcre (php) (see "Flavor" in the top-left corner), and R regex syntax is slightly different. Passing perl = TRUE to sub makes R use PCRE as well. R also requires extra backslash escape characters in many situations. For reference, this R text processing wiki has come in handy for me many times before.
Make it greedy and take the matched group at the desired index:
(.*);(.*)
     ^^^^------- Marinilabiaceae (group 2)
Here is regex101 demo
Or, to get the first word, use the non-greedy form:
(.*?);(.*)
^^^^^------- Bacteria (group 1)
Here is demo
To extract everything after the last ; to the end of the line you can use:
[^;]*?$
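For comparison outside R, the same three ideas (strip everything up to the last semicolon, match the trailing run of non-semicolons, or split and take the last piece) can be sketched in Python. This is only an illustration under the assumption that Python's re module is acceptable; it is not the R code from the answers, and the variable names are made up.

```python
import re

taxonomy = "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Marinilabiaceae"

# Greedy .* runs to the last ";" so everything before the final field is removed.
by_sub = re.sub(r"^.*;", "", taxonomy)

# Match the run of non-semicolon characters that ends the string.
by_search = re.search(r"[^;]*$", taxonomy).group(0)

# No regex at all: split once from the right and keep the last piece.
by_split = taxonomy.rsplit(";", 1)[-1]

print(by_sub, by_search, by_split)
```

All three give the final field for this input; the split version is usually the easiest to read when the separator is a fixed character.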

Clear Regex for "URL Contains"

I'm always stymied by regular expressions. My tool has a filtering option for "Current URL Matches Regex (case insensitive)" but I'm not sure how to write the regular expression for my needs. I'd love to figure out how to write a regex that would ONLY trigger for URLs that contain ANY of these 5 strings anywhere in URL:
Product=Neo-Supreme
Product=Cordura
Product=Hawaiian
Product=Animal%20Deluxe
Product=Camo
Basically the regex you need is something along the lines of
'Product\=[^&]+'
unless you know that the product can be something other than one of those 5 options.
If so, you'll need to use
'Product\=(Neo-Supreme|Cordura|Hawaiian|Animal%20Deluxe|Camo)'
EDIT for comments:
To match anything you can always use .*, which matches on any number of any character (except a newline, unless otherwise specified).
'.*seat-option.*Product\=(Neo-Supreme|Cordura|Hawaiian|Animal%20Deluxe|Camo).*'
Here's a demo
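Here is a small Python sketch of how the alternation behaves as a case-insensitive "URL contains" filter. The example.com URLs are invented for the demonstration, and Python's re.IGNORECASE flag stands in for the tool's case-insensitive option, which is an assumption about how the tool works.

```python
import re

# The alternation from the answer, compiled case-insensitively.
product_filter = re.compile(
    r"Product=(Neo-Supreme|Cordura|Hawaiian|Animal%20Deluxe|Camo)",
    re.IGNORECASE,
)

# Hypothetical URLs, used only to exercise the pattern.
urls = [
    "https://example.com/seat-option?Product=Cordura&Color=Black",
    "https://example.com/seat-option?product=animal%20deluxe",
    "https://example.com/seat-option?Product=Leather",
]

# search() finds the pattern anywhere in the URL, so no leading .* is needed.
matching = [url for url in urls if product_filter.search(url)]

print(matching)
```

Only the first two URLs are kept. Note that, as written, the pattern would also accept a longer value that merely starts with one of the names, such as Product=CamoX.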

Regex substring

I'm trying to select a substring using regex and I'm going round in circles. I need to select everything before the first "_".
Example URL - GI_2013_JUNE_10_VOL3_LASTCHANCE
So the result I'm looking for from the URL above would be "GI". The text before the first "_" can vary in length.
Any help would be much appreciated.
The regex would be:
^[^_]+
and grab the whole regex match. But as a comment says, using a substring function is more efficient!
^[^_]*
...is the expression you're looking for.
It basically says: Select everything that is not an underscore, starting at the beginning of the string.
http://regexr.com?356in
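A quick Python sketch of both routes mentioned above, the regex and the plain substring approach. The language is an assumption (the question does not say which one is in use), and the variable names are illustrative only.

```python
import re

name = "GI_2013_JUNE_10_VOL3_LASTCHANCE"

# Regex route: one or more non-underscore characters from the start.
# re.match already anchors at the beginning, so the ^ is implicit here.
by_regex = re.match(r"[^_]+", name).group(0)

# Substring route: everything before the first "_", no regex needed.
by_partition = name.partition("_")[0]

print(by_regex, by_partition)
```

Both give "GI" for this input. With [^_]+ a string that starts with an underscore produces no match at all, whereas [^_]* would return an empty match, which is the practical difference between the two patterns above.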

Regular expression for a list of items separated by comma or by comma and a space

Hey,
I can't figure out how to write a regular expression for my website. I would like to let the user input a list of items (tags) separated by a comma or by a comma and a space, for example "apple, pie,applepie". Is such a regexp possible?
Thanks!
EDIT:
I would like a regexp for javascript in order to check the input before the user submits a form.
What you're looking for is deceptively easy:
[^,]+
This will give you every comma-separated token, and will exclude empty tokens (if the user enters "a,,b" you will only get 'a' and 'b'), BUT it will break if they enter "a, ,b".
If you want to strip the spaces from either side properly (and exclude whitespace only elements), then it gets a tiny bit more complicated:
[^,\s][^\,]*[^,\s]*
However, as has been mentioned in some of the comments, why do you need a regex where a simple split and trim will do the trick?
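The split-and-trim idea can be sketched as follows. The sketch is in Python purely for illustration (the question asks about JavaScript, where the built-in split and trim string methods play the same role), and parse_tags is a hypothetical helper name.

```python
def parse_tags(raw):
    """Split on commas, trim each piece, and drop pieces that end up empty."""
    pieces = [piece.strip() for piece in raw.split(",")]
    return [piece for piece in pieces if piece]


print(parse_tags("apple, pie,applepie"))
print(parse_tags("a, ,b,,"))
```

This handles the "a, ,b" case that trips up [^,]+, because whitespace-only pieces are empty after trimming and get filtered out.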
Assuming the words in your list may be letters from a to z and you allow, but do not require, a space after the comma separators, your reg exp would be
[a-z]+(,\s*[a-z]+)*
This will match "ab" or "ab, de", but not "ab ,dc".
Here's a simpler solution:
console.log("test, , test".match(/[^,(?! )]+/g));
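It doesn't break on empty properties and strips spaces before and after properties. Note that inside a character class, (?! ) is not a lookahead: the class simply excludes commas, spaces, and the characters "(", "?", "!" and ")", so a tag that contains any of those is split as well.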
This thread is almost 7 years old and was last active 5 months ago, but I wanted to achieve the same results as OP and after reading this thread, came across a nifty solution that seems to work well
.match(/[^,\s?]+/g)
Here's an image with some example code of how I'm using it and how it's working
Regarding the regular expression... I suppose a more accurate statement would be to say "target anything that IS NOT a comma followed by any (optional) amount of white space" ?
I often work with comma-separated patterns, and for me this works:
((^|[,])pattern)+
where "pattern" is the single element regexp
This might work:
([^,]*)(, ?([^,]*))*
([^,]*)
Look for commas within the given string and split on them. As for the whitespace: can't you just split on the commas and then remove the whitespace?
I needed strict validation for comma-separated input of alphabetic characters, with no spaces. I ended up using this one, in case anyone needs it:
/^[a-z]+(,[a-z]+)*$/
Or, to support lower- and uppercase words:
/^[A-Za-z]+(?:,[A-Za-z]+)*$/
In case one needs to allow whitespace between words (the first pattern allows it on both sides of the comma, the second only after it):
/^[A-Za-z]+(?:\s*,\s*[A-Za-z]+)*$/
/^[A-Za-z]+(?:,\s*[A-Za-z]+)*$/
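A short Python sketch of using the whitespace-tolerant pattern as a validator. The answer targets JavaScript, so Python here is only an illustration; TAG_LIST and is_valid_tag_list are hypothetical names, and fullmatch is used in place of the explicit ^ and $ anchors.

```python
import re

# Letters, then zero or more ", letters" groups with optional
# whitespace on either side of each comma.
TAG_LIST = re.compile(r"[A-Za-z]+(?:\s*,\s*[A-Za-z]+)*")


def is_valid_tag_list(text):
    """Return True only if the whole input is a well-formed tag list."""
    return TAG_LIST.fullmatch(text) is not None


for candidate in ["apple, pie,applepie", "apple ,pie", "apple,,pie", "apple,", ""]:
    print(repr(candidate), is_valid_tag_list(candidate))
```

Unlike the token-extracting patterns earlier in the thread, this accepts or rejects the input as a whole, which fits a check that runs before form submission.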
You can try this, it worked for me:
/.+?[\|$]/g
or
/[^\|?]+/g
but replace '|' with the separator you need. Also, don't forget to escape it where necessary.
Something like this should work: ((apple|pie|applepie),\s?)*