Regex not ended in html - regex

I'm trying to write a regex that matches text like /es/whathever1/whathever2/whatever3 and that does not end with html.
I tried:
\/es\/.*[aA-zZ\-\_]\/.*[aA-zZ\-\_]\/.*[aA-zZ\-\_].*[.]html$.*$
but it only matches if the text ends with .html

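The character class [aA-zZ\-\_] matches a single character out of the listed ones. It is not the same as [a-zA-Z], because the range A-z also matches the characters [, \, ], ^, _ and ` that sit between Z and a in ASCII.
You can repeat 3 times matching / followed by 1+ word characters or hyphens using a quantifier, and add anchors for the start ^ and end $ of the string to prevent a partial match. As the character class [\w-] does not match a dot, a path ending in .html is not matched.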
^\/es(?:\/[\w-]+){3}$
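A minimal sketch of the anchored pattern in Python, added here for illustration. The function name is_three_segment_path and the sample inputs are assumptions, not part of the original answer; the slashes are left unescaped because Python has no regex delimiters.

```python
import re

# Same pattern as in the answer; "/" needs no escaping in Python.
PATH_RE = re.compile(r"^/es(?:/[\w-]+){3}$")


def is_three_segment_path(path: str) -> bool:
    """Return True for /es/ followed by exactly three segments of word chars or hyphens."""
    return PATH_RE.search(path) is not None


# Characters inside the range A-z that are not letters, which is why
# [A-z] is broader than [a-zA-Z].
extra_in_A_to_z = [chr(c) for c in range(ord("A"), ord("z") + 1) if not chr(c).isalpha()]

if __name__ == "__main__":
    for sample in (
        "/es/whathever1/whathever2/whatever3",
        "/es/whathever1/whathever2/whatever3.html",
        "/es/a/b",
    ):
        print(sample, is_three_segment_path(sample))
    print(extra_in_A_to_z)
```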

Related

Regex that doesn't recognise a pattern

I want to make a regex that recognizes some patterns and rejects others.
_*[a-zA-Z][a-zA-Z0-9_][^-]*.*(?<!_)
The sample patterns that I want to recognize:
a100__version_2
_a100__version2
And the sample patterns that I don't want to recognize:
100__version_2
a100__version2_
_100__version_2
a100--version-2
The regex works for all of them except this one:
a100--version-2
So I don't want to match the dashes.
I tried _*[a-zA-Z][a-zA-Z0-9_][^-]*.*(?<!_)
so the problem is at [^-]
The .* after [^-]* can still match the dashes, so remove it and anchor the pattern. Note that [^-]* can also match newlines and spaces.
To exclude newlines and spaces as well, while matching at least 2 characters:
^_*[a-zA-Z][a-zA-Z0-9_][^-\s]*$(?<!_)
Or, to allow only word characters and require at least a single letter, with \w* matching zero or more word characters after it:
^_*[a-zA-Z]\w*$(?<!_)
^ Start of string
_* Match optional underscores
[a-zA-Z] Match a single char a-zA-Z
\w* Match optional word chars (Or [a-zA-Z0-9_]*)
$ End of string
(?<!_) Assert not _ to the left at the end of the string
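The word-character variant can be checked against the samples from the question with a short Python sketch. The helper name is_valid_name is hypothetical, and re.ASCII is an added assumption so that \w means exactly [a-zA-Z0-9_].

```python
import re

# Pattern from the answer; re.ASCII restricts \w to [a-zA-Z0-9_].
NAME_RE = re.compile(r"^_*[a-zA-Z]\w*$(?<!_)", re.ASCII)


def is_valid_name(name: str) -> bool:
    """Optional leading underscores, then a letter, then word chars, not ending in '_'."""
    return NAME_RE.search(name) is not None


should_match = ["a100__version_2", "_a100__version2"]
should_not_match = ["100__version_2", "a100__version2_", "_100__version_2", "a100--version-2"]

if __name__ == "__main__":
    for name in should_match + should_not_match:
        print(name, is_valid_name(name))
```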

I wrote url validation regex but the regex is very slow

I know this is slow because of ([\.\-][a-z0-9])*. But I don't know how to optimize it.
^https:\/\/([a-z0-9]+([\.\-][a-z0-9])*)+(\.([a-z]{2,11}|[0-9]{1,5}))(:[0-9]{1,5})?(\/.*)?$
You don't need the nested quantifiers )*)+ in your pattern. Nesting quantifiers like this can lead to catastrophic backtracking.
Note that you only have to escape the forward slash if the delimiters for the regex are also /, and you don't have to escape the dot and the hyphen in [\.\-]; writing [.-] is enough.
If you don't need the capture groups afterwards, you can turn them into non-capturing groups.
^https:\/\/[a-z0-9]+(?:[.-][a-z0-9]+)*\.(?:[a-z]{2,11}|[0-9]{1,5})(?::[0-9]{1,5})?(\/.*)?$
The pattern matches:
^ Start of string
https:\/\/ Match https:// (as you only want to match https)
[a-z0-9]+ Match 1+ times any of the listed
(?:[.-][a-z0-9]+)* Optionally repeat matching . or - and 1+ times any of the listed
\.(?:[a-z]{2,11}|[0-9]{1,5}) Match a dot followed by either 2-11 chars a-z or 1-5 digits
(?::[0-9]{1,5})? Optionally match : and 1-5 digits
(\/.*)? Optionally match / and the rest of the line
$ End of string
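As a rough illustration, the rewritten pattern could be used from Python as below. The function name match_https_url and the sample URLs are assumptions made for this sketch; the slashes are unescaped because Python has no regex delimiters.

```python
import re
from typing import Optional

# Rewritten pattern from the answer, without the nested quantifiers.
URL_RE = re.compile(
    r"^https://[a-z0-9]+(?:[.-][a-z0-9]+)*"
    r"\.(?:[a-z]{2,11}|[0-9]{1,5})"
    r"(?::[0-9]{1,5})?(/.*)?$"
)


def match_https_url(url: str) -> Optional[re.Match]:
    """Return the match object for a valid https URL, or None."""
    return URL_RE.search(url)


if __name__ == "__main__":
    m = match_https_url("https://sub.example.com:8080/path?x=1")
    print(m.group(1) if m else None)  # group 1 holds the optional path
    print(match_https_url("http://example.com"))
    # A long non-matching host: this is the kind of input where nested
    # quantifiers tend to backtrack heavily.
    print(match_https_url("https://" + "a" * 40 + "!"))
```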

regular expression to get the start and end matches of a string

I have a string of words. I want to get a word which begins and ends with 3 backticks ```. How do I use regular expressions to accomplish this in Flutter? I have tried (^```.*\.```$)\w+ but it's not working on a sentence like Hello there, ```friend```, how are you doing?
The pattern you tried, (^```.*\.```$)\w+, uses anchors to assert the start ^ and the end $ of the string. In between, it matches triple backticks, any characters except newlines, a literal dot, and triple backticks again.
After that it tries to match 1+ word characters, which can never succeed because no word character can follow the end of the string.
You could use a capturing group and match 1+ word characters between the backticks:
```(\w+)```
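The question is about Flutter, but the capturing-group idea can be sketched in Python for illustration; the pattern syntax should carry over to Dart's RegExp, though that is an assumption here. The helper name backticked_words is hypothetical, and the backticks are written as `{3} only so that the example does not contain a literal triple backtick.

```python
import re

# Equivalent to three backticks, (\w+), three backticks.
TICKED_RE = re.compile(r"`{3}(\w+)`{3}")

TICKS = "`" * 3  # three backticks, built at runtime


def backticked_words(text: str) -> list:
    """Return every word wrapped in triple backticks (group 1 of each match)."""
    return TICKED_RE.findall(text)


if __name__ == "__main__":
    sentence = f"Hello there, {TICKS}friend{TICKS}, how are you doing?"
    print(backticked_words(sentence))
```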

Regex for all illegal filename characters before filetype extension

I'm looking for a regex that replaces all illegal filename characters (such as parentheses, spaces and dots) before the file extension (like .jpg) with a -
I have:
[^a-zA-Z0-9_-]+
which matches every illegal filename character, but also the dot of the file extension
and
.*(?=.)
which matches everything until the last occurrence of .
How do I combine these?
One of my evil file names is
(800x800-png)MGC1000-03EPTD-021_RAL7035-5010.tif.png
After the regex replace it should look like
-800x800-png-MGC1000-03EPTD-021_RAL7035-5010-tif.png
The regex should work in LibreOffice / Excel search and replace.
Thanks for your help!
You could use your negated character class [^a-zA-Z0-9_-]+ and a positive lookahead to assert that, somewhere to the right, a dot followed by 1+ word characters ends the string. That leaves the final dot of the extension untouched.
In the replacement use a hyphen -
[^a-zA-Z0-9_-]+(?=.*\.\w+$)
As per comment from #Stein you might shorten it to:
[^\w-]+(?=.*\.\w+$)
Explanation
[^a-zA-Z0-9_-]+ Match 1+ times any character that is not in the character class
(?= Positive lookahead, assert what is on the right is
.*\.\w+ Match any character 0+ times, then a dot and 1+ word chars
$ Assert the end of the string
) Close positive lookahead
If the extension itself could have special characters, then you might update \w+$ to [^.\s]+$ like:
[^\w-]+(?=.*\.[^.\s]+$)
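The question targets LibreOffice / Excel search and replace, but the same substitution can be tried out in Python as a quick check. This is only a sketch; the function name sanitize_filename and the second sample name are made up for illustration.

```python
import re

# Illegal characters that still have a ".extension" somewhere to their right.
ILLEGAL_RE = re.compile(r"[^a-zA-Z0-9_-]+(?=.*\.\w+$)")


def sanitize_filename(name: str) -> str:
    """Replace each run of illegal characters before the final extension with '-'."""
    return ILLEGAL_RE.sub("-", name)


if __name__ == "__main__":
    print(sanitize_filename("(800x800-png)MGC1000-03EPTD-021_RAL7035-5010.tif.png"))
    print(sanitize_filename("my file (1).jpg"))
```

The last dot survives because the lookahead needs another dot to the right of the match, and there is none after the final one.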

REGEX: Select all text between last underscore and dot

I'm having trouble retrieving specific information of a string.
The string is as follows:
20190502_PO_TEST.pdf
This includes the .pdf part. I need to retrieve the part between the last underscore (_) and the dot (.) leaving me with TEST
I've tried this:
[^_]+$
This however, returns:
TEST.pdf
I've also tried this:
_(.+)\.
This returns:
PO_TEST
The pattern [^_]+$ matches 1+ characters that are not an underscore until the end of the string, so it also matches the dot and the extension.
In the pattern _(.+)\. the dot has to be escaped to match it literally, and the match is in the first capturing group. Because the match starts at the first underscore, that group contains PO_TEST rather than TEST.
What you also might use:
^.*_\K[^.]+
^.*_ Match from the start of the string up to and including the last underscore (.* is greedy)
\K Forget what was matched so far
[^.]+ Match 1+ times any char except a dot
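Not every engine supports \K; Python's built-in re module, for example, does not. A capturing group gives the same result there. The sketch below is one possible variant under that assumption, with the hypothetical helper name last_segment.

```python
import re
from typing import Optional

# Greedy .* runs to the last underscore; group 1 takes everything up to the next dot.
LAST_SEGMENT_RE = re.compile(r"^.*_([^.]+)")


def last_segment(filename: str) -> Optional[str]:
    """Return the text between the last underscore and the following dot, or None."""
    m = LAST_SEGMENT_RE.search(filename)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(last_segment("20190502_PO_TEST.pdf"))
```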