regex match file name in a path - regex

I am trying to match the file name in a path. For example:
/path/test/index.html
I would want to match index.html
However I also can have a path with no / so the path could be
index.html
and would want to match index.html
I have the following to match the first case and can grab it with a group.
.*/([^/]+)
But how can I also match a file name when the only thing in the path is the file name?

There is probably no need for anything but [^/]+$, unless you want to
match the entire line and your regex engine's matcher requires it.
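As a minimal sketch of that suggestion in Python (the helper name file_name is made up for illustration), searching for [^/]+$ covers both the path with slashes and the bare file name:

```python
import re

# [^/]+$ : one or more non-slash characters running up to the end of the string.
LAST_SEGMENT = re.compile(r"[^/]+$")

def file_name(path):
    """Return the last path segment, or None if the path ends with a slash."""
    match = LAST_SEGMENT.search(path)
    return match.group(0) if match else None

print(file_name("/path/test/index.html"))
print(file_name("index.html"))
```

Because search() is used rather than a whole-string match, no group is needed to pull the name out.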

There is usually more than one way to write a regex, and edge cases often end up complicating a simple task.
If you want to match any and every valid file name in a string, then perhaps:
[A-Za-z0-9_-]+\.?[A-Za-z0-9]*
or (since you can have a file named ConfigurationFile.txt.bac for example)...
[A-Za-z0-9_.-]+\.?[A-Za-z0-9]*
But that is not what you want, because each directory name is also a valid file name. So
this will match only valid file names with an extension:
[A-Za-z0-9_-]+\.[A-Za-z0-9]+
or
[A-Za-z0-9_.-]+\.[A-Za-z0-9]+
Clearly there are many options. The AA (accepted answer) only matches a string that sits at the end of a path. It does not match a file name without a path. The AA may well do for the OP, but it is helpful to me to be able to match any file name within a string.
There are always edge cases; for example, in my case I am still matching version numbers with this regex. I have a workaround for my case, but I am getting too specific.
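A small Python sketch of the "file names with an extension" idea, written with the hyphen at the end of the character class so that it is read literally rather than as a range. The sample text is invented for illustration; it also shows the version-number edge case mentioned above:

```python
import re

# One or more name characters (letters, digits, underscore, dot, hyphen),
# then a literal dot, then an alphanumeric extension.
NAME_WITH_EXT = re.compile(r"[A-Za-z0-9_.-]+\.[A-Za-z0-9]+")

# Hypothetical sample text, not taken from the question.
text = "/path/test/index.html and ConfigurationFile.txt.bac, version 1.2"

names = NAME_WITH_EXT.findall(text)
print(names)  # ['index.html', 'ConfigurationFile.txt.bac', '1.2']
```

The directory names path and test are skipped because they contain no dot, while 1.2 is still picked up as if it were a file name.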

Make the .*/ into an optional group:
(?:.*/)?([^/]+)
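A quick way to check this in Python, as a sketch only (the helper name file_name_group is an assumption; fullmatch is used so the whole path has to match, as in the question's original attempt):

```python
import re

# Optional non-capturing directory prefix, then the file name in group 1.
WITH_OPTIONAL_DIR = re.compile(r"(?:.*/)?([^/]+)")

def file_name_group(path):
    """Return group 1 (the file name) if the whole path matches, else None."""
    match = WITH_OPTIONAL_DIR.fullmatch(path)
    return match.group(1) if match else None

print(file_name_group("/path/test/index.html"))
print(file_name_group("index.html"))
```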

Related

NotePad++ regex match and replace and also keep match to convert to different markdown image reference link

I have the following link syntax that needs to be changed:
![[afoldernamenolongerneededandwillbedeprecated/somemarkdownfilename_image1.png]]
I tried (successfully) with this regex to match:
![[].*[\/].*_image[0-9].png[]]]
I have a hunch, though, that it may not be what I should use: as a novice, I think it may only be good for matching and not for replacing. All images are PNGs, by the way. All file names have _image in them, prefixed by the markdown file name.
Desired end format:
![image](imagenamefromabovestring1,2,orhowevermanythereare.png)
The
![]()
is a known syntax in markdown to reference images. Images will be populated in subdirectories the program/app will find.
It goes without saying I want to run find and replace recursively on some 4000 files containing image references.
I put up the unfinished substitution example here:
https://regex101.com/r/Bl8HJC/1
To clarify what I need: the formerly present folder name should be gone, as I don't need it anymore. After the slash comes the name of the image, whose syntax is always the same. The name of the markdown file currently being processed by NotePad++ (it can be named Ab, Aba, Abracadabra, etc.) serves as the prefix, followed by an underscore and 'image' plus a number, depending on how many images are linked to the markdown file as attachments. The names of the files to go in an attachment folder will look like this:
AB_image3.png
Abracadabra_image2.png
.
.
.
Zodiac_image45.png
I am looking for the right syntax as I couldn't figure it out with the dollar sign.
Cheers,
Otto
I have modified your example to get it working here. What you needed to do is escape the square brackets so they would be interpreted literally, since they have special meaning in regex, and you needed to use a capture group to store the matching value in $1 so you could use it in the replacement.
Regular expression:
!\[\[.*\/(.*_image[0-9]{1,2}\.png)\]\]
Substitution format:
![image]\($1\)
Edit: Question was revised to state that the folder name was unwanted in the final output, so matches are delimited after the final / character in the file path.
Edit 2: Support for file numbers 1 through 99.
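For anyone who wants to try the same capture-group substitution outside NotePad++, here is a rough Python equivalent. It is only a sketch: Python writes the back-reference as \1 instead of $1 and does not need the parentheses escaped in the replacement, and the function name convert and the folder name attachments are made up for illustration.

```python
import re

# Same pattern as above: skip everything up to the last "/", capture the image name.
OLD_LINK = re.compile(r"!\[\[.*\/(.*_image[0-9]{1,2}\.png)\]\]")

def convert(line):
    """Rewrite ![[folder/name_imageN.png]] as ![image](name_imageN.png)."""
    return OLD_LINK.sub(r"![image](\1)", line)

print(convert("![[attachments/Zodiac_image45.png]]"))
```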

Regex: Identify file name with "string" but exclude if has .filepart extension

I have a requirement to search through a directory to identify specific files with a string contained in the file name. But I want to exclude partially loaded files with a ".filepart" extension.
This must be done through Regex due to tool limitations.
The file names can be in multiple formats, and we must identify them from the "file identifier" string that we pass into the Regex.
I have read some very good articles within SO and other websites but I am struggling to nail down the correct syntax.
I have saved a page on regex101.com to provide a more detailed explanation of what I am trying to achieve. The "FILETYPE" can be considered the string we pass into the Regex.
https://regex101.com/r/zTrbyX/4
Thanks,
K
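Your original regex:
.*FILETYPE.*\.[[:alnum:]]*(?!filepart)
gives the same result as the following, because the lookahead is only checked after the extension has already been consumed, where no filepart can follow: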
.*FILETYPE.*
Instead you could use the following regex (similar to CAustin's solution in the comments):
.*FILETYPE.*(?<!filepart)$
This will match every line that contains FILETYPE and does not end with filepart. Here $ denotes the end of the line. On regex101.com you need to enable the m flag for $ to be recognized as the end of each line.
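The lookbehind version can be tried in Python as follows. This is a sketch under assumptions: the file names in the listing are invented, and re.MULTILINE plays the role of the m flag.

```python
import re

# (?<!filepart)$ : the line must end here, and the text just before the end
# must not be "filepart".
COMPLETE = re.compile(r".*FILETYPE.*(?<!filepart)$", re.MULTILINE)

# Hypothetical directory listing, one file name per line.
listing = "\n".join([
    "report_FILETYPE_01.csv",
    "report_FILETYPE_02.csv.filepart",
    "FILETYPE.dat",
    "unrelated.csv",
])

matches = COMPLETE.findall(listing)
print(matches)  # ['report_FILETYPE_01.csv', 'FILETYPE.dat']
```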

Regex to find directory in text

How do I find a path and file name in a block of text?
Before you mark this as a duplicate: I know questions about file paths exist.
Regex for parsing directory and filename (does not match in a paragraph.)
Regex that matches directory path excluding filepath (this one just matches the file name, the answer doesn't work for paragraphs, and doesn't address . or spaces)
java regular expression to match file path (doesn't address . or spaces)
Regex for extracting filename from path (doesn't address being in a paragraph)
For example
In file included from /some/directoy/3.33A.37.2/something else/dogs.txt,
from /some/directoy/something else/dogs.txt,
from /some/directoyr/3.33A.37.2/something else/dogs.txt,
from /var/log/xyz/10032008.log,
from /var/log/xyz/test.c:29:
Solution:
please the file something.h has to be alone without others include, it has to be present in release letter,
in order to be included in /var/log/xyz/test.c and /var/log/xyz/test.h automatically
Other Note:
The file something.c must contain the somethinge.h and not the ecpfmbsd.h because it doesn't contain C operative code.. everything good..
The following are the ideal matches:
/some/directoy/3.33A.37.2/something else/dogs.txt
/some/directoy/something else/dogs.txt
/some/directoyr/3.33A.37.2/something else/dogs.txt
/var/log/xyz/10032008.log
/var/log/xyz/test.c:29 (this is a tricky one, OK without it)
/var/log/xyz/test.c
/var/log/xyz/test.h
Going further: once I have an answer, how can I change it to work with \ instead of / as the directory separator?
You can use a regex like this:
\/.*\.[\w:]+
Working demo
Btw, if you want to allow backslashes in the path you can have:
[\\\/].*\.[\w:]+
This looks to be working:
\/[^,:]*\.\w+
See demo.
You can fine-tune this if you know the exact extensions, their lengths and what characters they have. As for me, \w+ would do to match extensions.
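A short Python check of the \/[^,:]*\.\w+ pattern against a few lines from the question, offered as a sketch rather than a complete solution:

```python
import re

# A slash, then anything except "," or ":", ending in a dot plus word characters.
PATH = re.compile(r"\/[^,:]*\.\w+")

text = (
    "In file included from /some/directoy/3.33A.37.2/something else/dogs.txt,\n"
    "from /var/log/xyz/10032008.log,\n"
    "from /var/log/xyz/test.c:29:\n"
)

paths = PATH.findall(text)
for p in paths:
    print(p)

# Two paths separated only by plain words come back as a single match.
merged = PATH.findall("included in /var/log/xyz/test.c and /var/log/xyz/test.h automatically")
```

Because spaces are allowed inside a path, this pattern relies on a comma or colon to end each match; in the last call the two paths are returned as one string.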

Regex expression to match a string but exclude something at the same time

I want to ask this as concisely as possible, so please forgive me if I'm leaving something out. I want the expression to match all cases except where an exact filename string is present.
The backup software I'm using works with regular expressions, and I want to set up an exclusion to skip all files of a particular extension, except that I have certain files I need to back up, so I don't want them to match.
The files I want to exclude are, for this example, *.FLV:
(?i).*\.flv
I want to include in my backups three files: abc123.flv, ghk432.flv, and fdw917.flv
This is where I'm having trouble, even just including one file from the three to be included to backup
(?i).*\.flv^(?!(abc123\.flv))&
The expression is being added to an Exclusion List for code42 CrashPlan backup; their support unfortunately cannot assist with complex regular expressions.
The closest thing I can supply as an example is their Example 3: Using An Exclude To Include:
.*/Documents/((?!(.*\.(doc|rtf)|.*/)$).)*$
http://support.code42.com/Administrator/3.6_And_4.0/Configuring/Using_Include_And_Exclude_Filters
However, it excludes all files within directories named "Documents" and includes any files in those folders with doc or rtf file extensions. I'm trying to create an expression that works with file extensions regardless of folder location.
Logically, it seems like I need to write this as some kind of if-then-else statement, but regex is not my forte.
Use an anchored negative lookahead with an alternation for the files you want to keep:
^(?i)(?!.*(abc123|ghk432|fdw917)\.flv).*\.flv
The negative lookahead asserts that the following input does not match its regex, and the pipe character means "or".
Try to put the negative lookahead at the position of the filename in the path:
^([^/]*/)*(?!(abc123|ghk432|fdw917)\.flv$)[^/]*\.flv$
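To see how the lookahead placed at the file-name position behaves, here is a Python sketch. The assumptions are loud ones: CrashPlan uses its own regex engine, so this only illustrates the pattern logic; case-insensitivity is passed as a flag instead of an inline (?i); and the helper is_excluded and the sample paths are invented.

```python
import re

# Base names of the files to keep, taken from the question.
KEEP = "abc123|ghk432|fdw917"

# Any directories, then a file name that is NOT exactly one of the kept names,
# ending in .flv.
EXCLUDE = re.compile(
    r"^([^/]*/)*(?!(" + KEEP + r")\.flv$)[^/]*\.flv$",
    re.IGNORECASE,
)

def is_excluded(path):
    """True if the path matches the exclusion pattern (would be skipped)."""
    return EXCLUDE.search(path) is not None

for path in ["/videos/movie.flv", "/videos/abc123.flv", "/videos/xabc123.flv"]:
    print(path, is_excluded(path))
```

Since the lookahead is tested right where the file name starts, a name such as xabc123.flv is still excluded; only the three exact names escape the filter.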

regex to get portion of file name after last dot without file extension

I have a bunch of files, some examples are as follows:
/foo1/foo2/bar1.bar2.bar3.answer.jar
/foo1/bar1.bar2.answer.jar
/foo1/foo2/answer.jar
and for all of the above I would like a regex that matches 'answer'. In other words, I'm looking to get an alias for the file that is the portion of the file name after the last dot (or the file name itself if there are no dots) with the file extension (.jar can be guaranteed here to make it simpler) stripped off.
I know I can do this with a simpler regex by splitting the value on dots and taking the second-to-last part, but in this case I'm building a back-end thing that will ideally take a regex defined in a configuration definition for the given file type and spit out the alias, which might be different for other file types.
Yep, I'm over-engineering. :)
Any ideas?
The following regex should work for you:
[^/.]+(?=\.jar$)
If you are using JavaScript or a similar flavor where / is the regex delimiter, then you need to escape / like this:
[^\/.]+(?=\.jar$)
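Checked against the three paths from the question in Python (a sketch; the helper name alias is made up):

```python
import re

# A run of characters with no "/" or ".", immediately followed by ".jar" at the end.
ALIAS = re.compile(r"[^/.]+(?=\.jar$)")

def alias(path):
    """Return the name part just before the trailing .jar, or None."""
    match = ALIAS.search(path)
    return match.group(0) if match else None

for path in [
    "/foo1/foo2/bar1.bar2.bar3.answer.jar",
    "/foo1/bar1.bar2.answer.jar",
    "/foo1/foo2/answer.jar",
]:
    print(alias(path))
```

The lookahead keeps .jar out of the match itself, so no capture group is needed.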
You can use the following regexp: (assuming that the answer part doesn't contain . or /)
[/\.]([^/\.]+)\.jar
The first capturing group is the part you want.