How to match a word based on slash in regular expression - regex

I am trying to match part of a path with a regex. For example, I want to match only the first 2 folders in the string below:
/folder1/folder2/filder3/folder4/folder5
I wrote the regex below to match the first two folders, but it matches everything up to /folder5, whereas I want to match only up to /folder2:
/(\w.+){2}
I guess .+ matches everything. Any idea how to handle this?

You can use
^/[^/]+/[^/]+
^(?:/[^/]+){2}
Or, if you need to escape slashes:
^\/[^\/]+\/[^\/]+
^(?:\/[^\/]+){2}
[^/] is a negated character class that matches any char other than a / char, so each [^/]+ stops at the next slash.
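As a quick check of the difference, here is a minimal sketch using Python's re module (the choice of Python and the variable names are assumptions for illustration; the patterns and the sample path are the ones from the question and answer):

```python
import re

path = "/folder1/folder2/filder3/folder4/folder5"

# Original attempt: .+ is greedy and also matches "/", so it runs to the end.
greedy = re.search(r"/(\w.+){2}", path).group(0)

# Suggested fix: [^/]+ cannot cross a slash, so exactly two segments match.
first_two = re.search(r"^(?:/[^/]+){2}", path).group(0)

print(greedy)     # the whole path
print(first_two)  # /folder1/folder2
```

The escaped variants (\/ instead of /) behave the same way here; escaping only matters in languages where / delimits the regex literal.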

Related

Regex should not be recognized for special characters

I want the regex not to match when there is a special character before, between, or after the letters of the word.
My Regex:
\b([t][\W_]*?)+([e][\W_]*?)+([s][\W_]*?)+([t][\W_]*?)*?\b
https://regex101.com/r/zKg2eR/1
Examples that should not match:
#test, te+st, t'est or =test etc.
I hope that is reasonably clear.
If you want to match a word character excluding an underscore, you can write it as [^\W_] using a negated character class.
You don't need a character class for a single char such as [t], and you don't have to repeat the groups when all you want is to match a form of test.
If each word is on its own line, you can add the anchors ^ and $:
^(t[^\W_]*)(e[^\W_]*)(s[^\W_]*)(t[^\W_]*)$
As you selected golang in the regex tester, you cannot use lookarounds. Instead, you can use an alternation to match either a whitespace char or the start/end of the string.
Then capture the whole word in another capture group:
(?:^|\s)((t[^\W_]*)(e[^\W_]*)(s[^\W_]*)(t[^\W_]*))(?:$|\s)
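The pattern above uses no lookarounds, so it can be tried in most engines. A small sketch in Python (used here only for illustration, not Go; the helper name find_word and the sample list are assumptions) shows that group 1 carries the word without the surrounding whitespace:

```python
import re

# Pattern from the answer above; group 1 holds the word itself,
# without the whitespace consumed by the outer alternations.
WORD = re.compile(r"(?:^|\s)((t[^\W_]*)(e[^\W_]*)(s[^\W_]*)(t[^\W_]*))(?:$|\s)")

def find_word(text):
    """Return the captured word, or None when the text is rejected."""
    m = WORD.search(text)
    return m.group(1) if m else None

samples = ["test", "a test here", "#test", "te+st", "t'est", "=test"]
results = {s: find_word(s) for s in samples}

for sample, found in results.items():
    print(repr(sample), "->", found)
```

Only the first two samples yield a word; the ones with a special character before or inside the word return None.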

Regular Expression for url paths

I want a regular expression to match on paths containing '/food/' but not '/food/api/':
http://example.com/food/api/pasta?sauce=true
Right now I'm using this:
/^((?!\/food\/api\/).)*$/
The problem with this is it matches ANY path that doesn't contain '/food/api/'
Behavior I want to achieve:
REGEX MATCHES
example.com/food/
example.com/food/meals
REGEX IGNORES
example.com/food/api/pasta?sauce=true
example.com/food/api/pasta
example.com/food/api/
example.com/meal
example.com/
A pattern like ((?!\/food\/api\/).)* (a tempered greedy token) matches the whole line whenever the line does not contain the substring /food/api/
Because the quantifier is *, it also matches an empty line.
Instead, you can match up to the first /, then food or meal, then another forward slash, and after that slash use a negative lookahead to assert that api/ does not follow:
^[^/]+/(?:food|meal)/(?!api/).*$
If the string cannot contain spaces, you can exclude them with the negated character class [^/\s]+ and match \S* instead of .*
^[^/\s]+/(?:food|meal)/(?!api/)\S*$
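To check the pattern against the cases listed in the question, a minimal Python sketch could look like this (the names FOOD, urls and matched are illustrative assumptions; the inputs are the scheme-less forms from the question):

```python
import re

# Second pattern from the answer: no whitespace allowed, and the
# negative lookahead rejects paths where "api/" follows the slash.
FOOD = re.compile(r"^[^/\s]+/(?:food|meal)/(?!api/)\S*$")

urls = [
    "example.com/food/",
    "example.com/food/meals",
    "example.com/food/api/pasta?sauce=true",
    "example.com/food/api/pasta",
    "example.com/food/api/",
    "example.com/meal",
    "example.com/",
]

matched = [u for u in urls if FOOD.match(u)]
print(matched)  # ['example.com/food/', 'example.com/food/meals']
```

Because [^/\s]+ cannot cross a slash, a URL that starts with a scheme such as http:// would not match this pattern as written.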

Match a part of a string using regex

I have a string and would like to match a part of it.
The string is Accept: multipart/mixedPrivacy: nonePAI: <sip:4168755400#1.1.1.238>From: <sip:4168755400#1.1.1.238>;tag=5430960946837208_c1b08.2.3.1602135087396.0_1237422_3895152To: <sip:4168755400#1.1.1.238>
I want to match PAI: <sip:4168755400#
The whitespace can be a word, so I would like to use .* instead, but when I do, it matches most of the string.
This is what I match if I use the whitespace instead of .*
(PAI: <sip:)((?:\([2-9]\d{2}\)\ ?|[2-9]\d{2}(?:\-?|\ ?))[2-9]\d{2}[- ]?\d{4})#
This is what I'm trying to achieve with .* but it should only match PAI: <sip:4168755400#
(PAI:.*<sip:)((?:\([2-9]\d{2}\)\ ?|[2-9]\d{2}(?:\-?|\ ?))[2-9]\d{2}[- ]?\d{4})#
I tried lookarounds but failed.
Any idea?
Thanks
To allow a word as well as a space, replace the single space with a character class that matches either a space or a word character, repeated 1 or more times: [ \w]+
Note that you don't have to escape the spaces, and in both places you can use an optional character class matching either a space or a hyphen: [ -]?
If you only need the overall match, you can omit the 2 capturing groups.
(PAI:[ \w]+<sip:)((?:\([2-9]\d{2}\) ?|[2-9]\d{2}[ -]?)[2-9]\d{2}[- ]?\d{4})#
The regex could look like this:
PAI:.*?(<sip:.*?#)
Explanation:
PAI:.*? matches the literal PAI: followed by any characters; the ? makes .* lazy, so it consumes as few characters as possible before the next part of the pattern can match.
(<sip:.*?#) is the capturing group that holds the result we want.
Inside it, <sip:.*?# matches <sip: followed by as few characters as possible up to the first #.
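Both suggestions can be compared on the string from the question. The following is a small Python sketch (the variable names are assumptions made for this illustration):

```python
import re

header = (
    "Accept: multipart/mixedPrivacy: nonePAI: <sip:4168755400#1.1.1.238>"
    "From: <sip:4168755400#1.1.1.238>;"
    "tag=5430960946837208_c1b08.2.3.1602135087396.0_1237422_3895152"
    "To: <sip:4168755400#1.1.1.238>"
)

# First answer: [ \w]+ allows a space or a word before <sip:, and the
# second group validates the shape of the phone number.
strict = re.search(
    r"(PAI:[ \w]+<sip:)((?:\([2-9]\d{2}\) ?|[2-9]\d{2}[ -]?)[2-9]\d{2}[- ]?\d{4})#",
    header,
)

# Second answer: lazy quantifiers stop at the first <sip: and the first #.
lazy = re.search(r"PAI:.*?(<sip:.*?#)", header)

print(strict.group(0))  # PAI: <sip:4168755400#
print(strict.group(2))  # 4168755400
print(lazy.group(1))    # <sip:4168755400#
```

The first pattern is stricter about what counts as a number, while the lazy one accepts anything between <sip: and the next #.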

Regex match last occurrence of substring among the same substrings in the string

For example we have a string:
asd/asd/asd/asd/1#s_
I need to match this part: /asd/1#s_ or asd/1#s_
How is it possible to do with plain regex?
I've tried a negative lookahead like this, but it didn't work:
\/(?:.(?!\/))?(asd)(\/(([\W\d\w]){1,})|)$
it matches this '/asd/asd/asd/asd/asd/asd/1#s_'
from this 'prefix/asd/asd/asd/asd/asd/asd/1#s_'
and I need to match '/asd/1#s_' without all preceding /asd/'s
Match should work with plain regex
Without any helper functions of any programming language
https://regexr.com/
I use this site to check if regex matches or not
Here are the possible strings:
prefix/asd/asd/asd/1#s
prefix/asd/asd/asd/1s#
prefix/asd/asd/asd/s1#
prefix/asd/asd/asd/s#1
prefix/asd/asd/asd/#1s
prefix/asd/asd/asd/#s1
and asd part could be replaced with any word like
prefix/a1sd/a1sd/a1sd/1#s
prefix/a1sd/a1sd/a1sd/1s#
...
So I need to match the last repeating part together with everything to its right.
Everything to the right can be word characters, non-word characters, or digits, in any order.
A more complicated string example:
prefix/a1sd/a1sd/a1sd/1s#/ds/dsse/a1sd/22$$#!/123/321/asd
this should match that part:
/a1sd/22$$#!/123/321/asd
If you want the match only, you can use \K to reset the match buffer right before the parts that you want to match:
^.*\K/a\d?sd/\S+
The pattern matches:
^ Start of string
.* Match any char except a newline, as many as possible, then backtrack to the last position where the rest of the pattern fits
\K Forget what is matched until now
/a\d?sd/ Match a, an optional digit, and sd between forward slashes
\S+ Match 1+ non-whitespace chars
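Python's built-in re module does not support \K, so the sketch below swaps in a capture group in place of \K (a different technique from the answer's, with the same greedy .* prefix; the names LAST_PART and last_repeat are illustrative assumptions):

```python
import re

# Greedy .* runs to the end of the line and backtracks, so the group
# captures the LAST "/a<optional digit>sd/" segment and what follows it.
LAST_PART = re.compile(r"^.*(/a\d?sd/\S+)")

def last_repeat(s):
    """Return the last matching segment plus everything to its right."""
    m = LAST_PART.search(s)
    return m.group(1) if m else None

print(last_repeat("asd/asd/asd/asd/1#s_"))
# /asd/1#s_
print(last_repeat("prefix/a1sd/a1sd/a1sd/1s#/ds/dsse/a1sd/22$$#!/123/321/asd"))
# /a1sd/22$$#!/123/321/asd
```

In engines that do support \K, the original pattern gives the same text as the overall match without needing a group.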

Regular expression not matching specific string

My use case is as follows: I would like to find all occurrences of something similar to this /name.action, but where the last part is not .action eg:
name.actoin - should match
name.action - should not match
nameaction - should not match
I have this:
/\w+.\w*
to match two words separated by a dot, but I don't know how to add 'and do not match .action'.
Firstly, you need to escape the . character, as an unescaped . matches any character in a regex.
Secondly, you need to add a negative lookahead, written (?!...), which only lets the match continue if the given suffix is not present at that position.
You may also want to add a caret ^ to anchor the match at the start of the line, and change your * (zero or more repetitions) to a + (one or more repetitions).
^/\w+\.(?!action)\w+ is the finished Regex.
^\w+\.(?!action)\w*
You need to escape the dot character.
\w+\.(?!action).*
Note the trailing .* since I'm not sure what you want to match after the dot.
See also Regular expression to match string not containing a word?
You'll need to use a zero-width negative lookahead assertion. This will let you look ahead in the string, and match based on the negation of a word.
So the regex you'd need (including the escaped . character) would look something like:
/name\.(?!action)/
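The negative lookahead can be tried on the three cases from the question. This is a minimal Python sketch; the pattern is an unanchored variant of the ones suggested above so that it can scan a longer text, and the sample text and names are assumptions:

```python
import re

# Escaped dot, then a negative lookahead that rejects the suffix "action".
NOT_ACTION = re.compile(r"/\w+\.(?!action)\w+")

text = "/name.actoin /name.action /nameaction"
hits = NOT_ACTION.findall(text)

print(hits)  # ['/name.actoin']
```

Note that (?!action) also rejects longer suffixes that merely start with action, such as /name.actions.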