Regex with exception of specific word of path

I need to replace image URLs with a dummy image URL. I'm currently having trouble excluding paths that have "ignore" in the file name.
I've successfully implemented a regex that matches these two paths:
images/image-filename.png and ../images/image-filename.png
with the following regex:
..\/images\/(.*?)\.(?:png|jpg|jpeg|gif|png|svg)|images\/(.*?)\.(?:png|jpg|jpeg|gif|png|svg)
However, I'd like to exclude any path with the word "ignore" in the file name, for example:
images/image-filename-ignore.png
Thanks!

Here is one option using a negative lookahead to assert that ignore does not appear as part of the filename:
images\/(?!.*ignore.*\.[^.]+).*\.(?:png|jpg|jpeg|gif|png|svg)
But, you might also be able to proceed by actually matching the invalid filenames with ignore, and then logically excluding these matches:
images\/.*ignore.*\.(?:png|jpg|jpeg|gif|png|svg)
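A minimal Python sketch of both options, assuming one path per string. The names `DUMMY_URL`, `replace_image`, and `should_replace` are hypothetical, and the duplicated `png` alternative is dropped:

```python
import re

EXT = r"\.(?:png|jpg|jpeg|gif|svg)"

# Option 1: the negative lookahead rejects file names containing "ignore".
LOOKAHEAD = re.compile(r"images\/(?!.*ignore.*\.[^.]+).*" + EXT)

# Option 2: match every image path, then exclude the ones that also
# match the "ignore" pattern.
ANY_IMAGE = re.compile(r"images\/.*" + EXT)
IGNORED = re.compile(r"images\/.*ignore.*" + EXT)

DUMMY_URL = "images/dummy.png"  # hypothetical placeholder


def replace_image(path: str) -> str:
    """Swap the matched image path for the dummy URL (option 1)."""
    return LOOKAHEAD.sub(DUMMY_URL, path)


def should_replace(path: str) -> bool:
    """Decide by matching first, then excluding (option 2)."""
    return bool(ANY_IMAGE.search(path)) and not IGNORED.search(path)
```

Because the pattern starts at `images/`, a leading `../` is left in place by the substitution.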

My guess is that you may want to add an i flag and word boundaries:
\/?images\/(?!.*\bignore\b)[^.]*\.(?:png|jpe?g|gif|svg|tiff|other_extensions)
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
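To see what the i flag and the word boundaries change, here is a small Python check. The `other_extensions` placeholder is left out, and the sample file names other than the one from the question are made up for illustration:

```python
import re

# Case-insensitive version with word boundaries around "ignore".
PATTERN = re.compile(
    r"\/?images\/(?!.*\bignore\b)[^.]*\.(?:png|jpe?g|gif|svg|tiff)",
    re.IGNORECASE,
)


def is_replaceable(path: str) -> bool:
    """True when the path is an image path that is not marked 'ignore'."""
    return PATTERN.search(path) is not None
```

Note that `\b` treats `_` as a word character, so a name like `photo_ignore.png` is not excluded, while `image-filename-ignore.png` is. Likewise `ignored-photo.jpg` still matches, because `ignored` is a different word.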

Related

Can I improve simplicity using negative lookahead to find the last folder in a file path?

I’m trying to find a simpler way to locate the last folder in each path of a file list, skipping paths whose file is of a given type (.doc in my regex), but I must use lookarounds. Can anyone suggest improvements to the regex that follows?
Search text:
c:\this\folder\goes\findme.txt
c:\this\folder\cant\findme.doc
c:\this\folder\surecanfind.txt
c:\\anothertest.rtf
c:\t.txt
RegEx:
(?<=\\)[^\\\n\r]+?(?=\\[^\\]*\.)(?!.*\.doc)
Expected result:
‘goes’
‘folder’
Can the RegEx lookahead be improved and simplified? Thanks for the help.
In your original regex:
(?<=\\)[^\\\n\r]+?(?=\\[^\\]*\.)(?!.*\.doc)
there isn't really much to improve in terms of the use of lookarounds.
The positive look behind is necessary to tell the regex when it is allowed to begin a match.
The positive look ahead is necessary to terminate the expansion of the +? quantifier.
And the negative look ahead is needed to negate invalid matches.
You might be able to condense both look aheads into one. But keeping them separate is more efficient, since if the evaluation of one fails, it can skip the evaluation of the second.
However, if you're looking for a more efficient/"normal" regex, I would typically use something like:
^.*\\(.+?)\\[^\\]+\.(?!doc).+$
Instead of using lookarounds to exclude everything except my desired output from a match, I'd include my desired output in a capture group.
This allows me to tell regex to check for a match only once per line, instead of after every \ character.
Then, to get my desired output, all I have to do is grab the content of capture group 1 from each match.
Working examples:
Original (98,150 steps)
Capture groups (66,586 steps)
Hopefully that'll help you out.
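As a rough check in Python, both expressions can be run against the search text from the question. This sketch assumes the multiline flag so that ^ and $ anchor at each line; the variable names are hypothetical:

```python
import re

# Raw string keeps every backslash literal, including the doubled one.
SEARCH_TEXT = r"""c:\this\folder\goes\findme.txt
c:\this\folder\cant\findme.doc
c:\this\folder\surecanfind.txt
c:\\anothertest.rtf
c:\t.txt"""

# Original approach: lookarounds only, the match itself is the folder.
LOOKAROUND = re.compile(r"(?<=\\)[^\\\n\r]+?(?=\\[^\\]*\.)(?!.*\.doc)")

# Capture-group approach: one attempt per line, folder is group 1.
CAPTURE = re.compile(r"^.*\\(.+?)\\[^\\]+\.(?!doc).+$", re.MULTILINE)

lookaround_result = LOOKAROUND.findall(SEARCH_TEXT)
capture_result = CAPTURE.findall(SEARCH_TEXT)  # list of group-1 strings
```

Both lists come out as `['goes', 'folder']` for this input. Step counts are engine-specific, so the figures quoted above will not carry over to Python's `re` directly.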

How to exclude file extension from string with regex

I want to be able to get two matching groups from a regex and exclude a third.
This is an example of a string I want to match:
my-file-name-0.44.0.6-SOME-SNAPSHOT.zip
I want two matching groups, one for the file name without the version and one for the version without the file extension.
Group 1: my-file-name
Group 2: 0.44.0.6-SOME-SNAPSHOT
Excluded: .zip
The file name can be random, but the version will always have a hyphen before it, and the file extension can also be random.
This is what I have come up with, but can't figure out the exclude part.
(.*?)-([0-9.]{1,4}.*)
Append \. to your regex:
(.*?)-([0-9.]{1,4}.*)\.
However you may want to modify it a little:
(.*?)-(\d.*)\.\w+
Use this regular expression to remove the file extension:
/(.*)\.[^.]+$/
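A short Python sketch of the two suggestions above, applied to the sample file name from the question (`split_name_version` and `strip_extension` are hypothetical helper names):

```python
import re

# Group 1: name without version; group 2: version without extension.
VERSIONED = re.compile(r"(.*?)-(\d.*)\.\w+")

# Everything up to the last dot is kept in group 1.
EXTENSION = re.compile(r"(.*)\.[^.]+$")


def split_name_version(filename: str):
    """Return (name, version), or None when the pattern does not fit."""
    m = VERSIONED.fullmatch(filename)
    return m.groups() if m else None


def strip_extension(filename: str) -> str:
    """Drop the final '.ext' part, leaving the rest untouched."""
    return EXTENSION.sub(r"\1", filename)
```

The lazy `(.*?)` stops at the first hyphen that is followed by a digit, which is why `my-file-name` stays together in group 1.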

Regex to match any .config file with a few exceptions

I'm trying to get a regex working to use in an .hgignore file that will ignore various copies of .config files made during debugging.
The regex should match any path ending in .config as long as the path does not start with _config, config, or packages and as long as the file name (the characters immediately following the last slash) is not app, web, packages, or repositories (or web.release, web.debug).
The closest I seem to get is
^(?!(_config|[Cc]onfig|packages)).*\/(?!([Aa]pp|[Ww]eb|packages|repositories)\.).*config$
This will properly ignore Data/app.config, and seems to work with all other cases, but it will incorrectly match Libraries/Data/app.config. When I check this out at http://regex101.com/ it shows me that the .*\/ group is only matching through Libraries/, not Libraries/Data/ as I expected.
I tried changing it to
^(?!(_config|[Cc]onfig|packages))(.*\/)*(?!([Aa]pp|[Ww]eb|packages|repositories)\.).*config$
But then the group (.*\/)* seems to match the whole path for any .config file.
If I change the last negative lookahead to a matching group like so
^(?!(_config|[Cc]onfig|packages))(.*\/)(([Aa]pp|[Ww]eb|packages|repositories)\.).*config$
Then the (.*\/) matches Libraries/Data/, which is what I want and expected, but it appears the negative lookahead changes the matching behavior of (.*\/).
I'm not sure where to go from here? The conditions I'm trying to match or not match don't seem that complicated, but I'm not the most experienced with regexes. Maybe there is a simpler way to achieve the same thing in .hgignore?
These are examples of paths that should match and be ignored:
Web/smtp.config
Libraries/Data/connectionStrings.config
These are examples of paths that should NOT match and not be ignored
_config/staging/smtp.config
Web/web.config
Web/web.release.config
Web/Views/web.config
Libraries/Data/app.config
Libraries/Data/packages.config
Data/app.config
packages/MiniProfiler.EF6.3.0.11/lib/net40/MiniProfiler.EntityFramework6.dll.config
packages/repositories.config
You were really close. Try this regex on regex101:
^(?!_?config|packages).*\/(?!(app|web|packages|repositories)\.)[^\/]*config$
I simplified it a little, but the main change was to specify no slashes in the match before the "config".
Note: I used a case-insensitive flag to simplify the regex itself.
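One way to sanity-check the expression outside of Mercurial is to run it over the sample paths from the question with Python's `re` module. This is only a sketch: the case-insensitive flag is passed here as `re.IGNORECASE`, and how the flag is set in an actual .hgignore file is not covered.

```python
import re

PATTERN = re.compile(
    r"^(?!_?config|packages).*\/(?!(app|web|packages|repositories)\.)[^\/]*config$",
    re.IGNORECASE,
)

SHOULD_MATCH = [
    "Web/smtp.config",
    "Libraries/Data/connectionStrings.config",
]

SHOULD_NOT_MATCH = [
    "_config/staging/smtp.config",
    "Web/web.config",
    "Web/web.release.config",
    "Web/Views/web.config",
    "Libraries/Data/app.config",
    "Libraries/Data/packages.config",
    "Data/app.config",
    "packages/MiniProfiler.EF6.3.0.11/lib/net40/MiniProfiler.EntityFramework6.dll.config",
    "packages/repositories.config",
]

matched = [p for p in SHOULD_MATCH + SHOULD_NOT_MATCH if PATTERN.search(p)]
```

The `[^\/]*` is what stops the engine from backtracking `.*\/` to an earlier slash and letting the rest of the path slip past the lookahead, which was the problem with `Libraries/Data/app.config`.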

Regex to select path till a folder name

Given a string "C:\Tom\Dick\Harry\Chocolate\Treat\Hunt\Fruitless" I have to select anything which appears before Treat.
I have tried with
(.*)\\Treat
but it includes the Treat word also.
Result is "C:\Tom\Dick\Harry\Chocolate\Treat".
Any help will be much appreciated.
You could use a lookahead in the regex if you don't want to include the word \Treat.
.*(?=\\Treat)
Or, if you want to include the word Treat, try the regex below:
^.*?\\Treat
(.*?)(?:Treat).*
This simple regex should do it; the part before Treat ends up in capture group 1 (including the trailing backslash).
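For comparison, here is how the three expressions from the answers behave in Python on the path from the question (the variable names are just for illustration):

```python
import re

PATH = r"C:\Tom\Dick\Harry\Chocolate\Treat\Hunt\Fruitless"

# Lookahead: stop right before "\Treat" without consuming it.
before_treat = re.search(r".*(?=\\Treat)", PATH).group()

# Lazy match that includes "\Treat" in the result.
through_treat = re.search(r"^.*?\\Treat", PATH).group()

# Capture group: everything before "Treat", trailing backslash included.
group_one = re.match(r"(.*?)(?:Treat).*", PATH).group(1)
```

Only the lookahead version returns the path without a trailing backslash and without Treat.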

Regex: Search for verb roots

I've seen the results for classifying verbs by their endings. But I want to use Regular Expressions to find verb roots for regular verbs in Spanish.
I'm using this fancy site: http://regexpal.com/
Which I suspect may not be compatible with my end use, but will be a great starting point.
From what I have seen, the caret should identify all strings after it based on your supplied string-pattern.
So, to me:
^gust
Should find "gusta", "gustan", "gustamos", "gustas","gustar".
I know that I'm way off, but looking at many of the pages and tutorials and examples, I don't see anything that looks similar to what I want to do.
When you look for regex matching you'll get only the matching part, meaning, in case you have the word "gustan" and you're trying to match it with ^gust like you suggested, the output of the matcher will be "gust" - which is not what you want (you want the whole word).
So instead of matching to ^gust try matching to ^gust\w*$ which means anything that starts with "gust" and is followed by zero or more word characters up to the end of the line.
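The difference is easy to see in Python, using "gustan" as the sample word:

```python
import re

word = "gustan"

# Anchored root only: the match is just the root, not the whole word.
root_only = re.search(r"^gust", word).group()

# Root plus any following word characters up to the end: the whole word.
whole_word = re.search(r"^gust\w*$", word).group()
```

Here `root_only` is "gust" and `whole_word` is "gustan".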
^(gust[a-zA-Z]*)$
^ denotes the start of the line
[a-zA-Z] letters only
* means zero or more
() is called a capture group
$ is the end of the line
If you want to edit with different words you could do this...
^((?:gust|otherwords)[a-zA-Z]*)$
All you have to change is |otherwords; this lets you add more words that you want to match.
Please read more about regex, and use debuggex.com to experiment.
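A possible way to build that alternation from a list of roots in Python, assuming one word per line. The root "habl" and the extra sample words are made up for illustration:

```python
import re

ROOTS = ["gust", "habl"]  # hypothetical list of verb roots

# Builds ^((?:gust|habl)[a-zA-Z]*)$ and anchors at every line.
pattern = re.compile(
    r"^((?:" + "|".join(map(re.escape, ROOTS)) + r")[a-zA-Z]*)$",
    re.MULTILINE,
)

WORDS = "gusta\ngustan\ngustamos\ngustas\ngustar\ncomer\nhablamos"

found = pattern.findall(WORDS)  # one entry per matching line
```

One caveat for Spanish: `[a-zA-Z]` does not cover accented letters, so a form such as "gustó" would be missed. In Python 3, `\w` matches accented letters in `str` patterns and may be a better fit.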