REGEX for any file extension - regex

I am trying to build a regex to tell whether a string is a valid file extension. It could be any extension.
hello No
.hello Yes
..hello No
hello.world No
.hello.world No
.hello world No
I have tried ^\. and ^\.[\.] but can't get what I am looking for. This seems like it should be simple.

^\.[^.]+$
This means: start with a dot (.), followed by one or more characters that are not a dot.
You can also use this one if you want to allow only alphanumeric characters:
^\.[a-zA-Z0-9]+$
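To compare the two patterns above, you can run them over the examples from the question. This is a minimal sketch that assumes Python's `re` module (the question names no language); `CASES`, `matches` and the pattern names are made up for illustration.

```python
import re

# Patterns from the answer above; the names are illustrative only.
ANY_NON_DOT = re.compile(r"^\.[^.]+$")
ALNUM_ONLY = re.compile(r"^\.[a-zA-Z0-9]+$")

# Example strings from the question.
CASES = ["hello", ".hello", "..hello", "hello.world", ".hello.world", ".hello world"]

def matches(pattern, text):
    """Return True if the anchored pattern finds a match in text."""
    return pattern.search(text) is not None

for text in CASES:
    print(f"{text!r:16} any-non-dot={matches(ANY_NON_DOT, text)} alnum={matches(ALNUM_ONLY, text)}")
```

Note that ^\.[^.]+$ accepts ".hello world", because a space is not a dot; the alphanumeric variant rejects it, which is what the question asks for.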

Try this regex:
^\.[\w]+$
Matches a string starting with a ".", followed by one or more "word" character(s), until the end of the string.

Try this regex, which matches any string that starts with a dot followed by at least one character that is not a dot:
^\.[^.]+$

If you already have a string like ".hello" with the extension and you're just testing it to see if it matches then you can try something like ^\.[^\\/:*?"<>|\s.]{1,255}$. It works with all of your example cases.
The beginning ^\. means the whole string must start with a literal dot "."
The [^\\/:*?"<>|\s.] means that after the dot you can have any character except a backslash, forward slash, colon, asterisk, question mark, double quotation mark, less-than or greater-than symbol, vertical bar, whitespace character, or dot. Feel free to add any other characters you'd like to disallow inside the square brackets after the caret, or to delete any of the characters I listed that you wish to allow.
(Note: the allowable characters for filenames/extensions depend on the file system.)
The {1,255}$ at the end quantifies the amount of allowable characters that we just defined all the way until the end of the string. So anything that's after the dot and allowed can be between 1 and 255 characters long and it must go on until the end of the string. Feel free to change the 255 to any number that you like.
(Note: the maximum length for filenames/extensions depends on the file system.)
If you are searching a string like "https://sub.example.com/directory1/directory2/file.php" for the file extension you should instead use \.[^\\/:*?"<>|\s.]{1,255}$ to search for the final extension including the dot.
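Here is a rough sketch of both uses described above, validating a bare extension and finding the final extension in a longer string. It assumes Python's `re` module; `ALLOWED`, `is_extension` and `find_extension` are hypothetical names, and the character class and the 255 limit are taken from the answer as is.

```python
import re

# Character class from the answer above: anything except the listed
# punctuation, whitespace, or a dot, repeated 1 to 255 times.
ALLOWED = r'[^\\/:*?"<>|\s.]{1,255}'

IS_EXTENSION = re.compile(r"^\." + ALLOWED + r"$")   # the string is only an extension
FIND_EXTENSION = re.compile(r"\." + ALLOWED + r"$")  # final extension of a longer string

def is_extension(text):
    """Return True if text looks like a single extension such as ".hello"."""
    return IS_EXTENSION.search(text) is not None

def find_extension(text):
    """Return the trailing extension including the dot, or None."""
    m = FIND_EXTENSION.search(text)
    return m.group(0) if m else None

print(is_extension(".hello"))
print(is_extension(".hello world"))
print(find_extension("https://sub.example.com/directory1/directory2/file.php"))
```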

This works for me in JavaScript:
^[.][a-zA-Z0-9.,$;]+$

I use:
(?:.*\\)+([^\\]+)
On Windows paths, this captures the file name with its extension (everything after the last backslash) in group 1.
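A small sketch of how that pattern behaves, assuming Python's `re` module; `file_name` and the sample path are invented for illustration.

```python
import re

# Pattern from the answer above: group 1 is whatever follows the last backslash.
LAST_COMPONENT = re.compile(r"(?:.*\\)+([^\\]+)")

def file_name(path):
    """Return the final path component, or None if the path has no backslash."""
    m = LAST_COMPONENT.match(path)
    return m.group(1) if m else None

print(file_name(r"C:\Users\me\report.txt"))
print(file_name("report.txt"))
```

Because the pattern requires at least one backslash, a bare file name gives no match. In Python, pathlib.PureWindowsPath(path).name is a regex-free way to get the last component.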

If you are looking for a regex to get the file extension from a filename, here it is
(?<=\.)[^.\s]+$
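For illustration, this is roughly how the lookbehind version can be used from Python (an assumption, as the answer names no language); `extension` and the sample names are hypothetical.

```python
import re

# Pattern from the answer above: a run of non-dot, non-whitespace characters
# at the end of the string, preceded by a dot that is not part of the match.
EXTENSION = re.compile(r"(?<=\.)[^.\s]+$")

def extension(file_name):
    """Return the extension without the dot, or None if there is none."""
    m = EXTENSION.search(file_name)
    return m.group(0) if m else None

print(extension("archive.tar.gz"))
print(extension("README"))
```

Only the last extension is returned, so "archive.tar.gz" yields "gz" rather than "tar.gz".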

Related

Extract specific string using regular expression

I want to extract only a specific string if its match
example as an input string:
13.10.0/
13.10.1/
13.10.2/
13.10.3/
13.10.4.2/
13.10.4.4/
13.10.4.5/
I'm using the regex [0-9]+.[0-9]+.[0-9] to extract only digit.digit.digit from a string when it matches,
but it gives the wrong output:
13.10.0
13.10.1
13.10.2
13.10.3
13.10.4.2 (should not match, but 13.10.4 is returned)
13.10.4.4 (should not match, but 13.10.4 is returned)
13.10.4.5 (should not match, but 13.10.4 is returned)
The correct output that I need:
13.10.0
13.10.1
13.10.2
13.10.3
It's hard to say without knowing how you're passing these strings in -- are they lines in a file? An array of strings in a programming language?
If you're searching a file using grep or a similar tool, it will give you all lines that match anywhere, even if only part of the line matches.
Normally, you'd deal with this using anchors to specify that the regex must start on the first character of the line and end on the last (e.g. ^[0-9]+\.[0-9]+\.[0-9]$). ^ matches at the start of the line, and $ matches at the end. The dots are escaped here because an unescaped . matches any character.
In your case, you've got slashes at the end of all the lines, so the easiest fix is to match that final slash, with ^[0-9]+\.[0-9]+\.[0-9]/.
You could also use lookahead or groups to match the slash without returning it -- but that depends a bit more on what tool you're running this regex in and how you're processing it.
If your strings are separated by whitespace (other than newlines), replacing ^ with (^|\s) (either the beginning of the string, or some whitespace character) may work -- but it will add a leading space to some of your results.
You may also need to set your regex tool to match multiple times in a line (e.g. the -o flag in grep). Again, it's hard to give useful advice about this without knowing what regular-expression tool you're using, or how you're processing the results.
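As one possible concrete version of the group and lookahead ideas above, here is a sketch that assumes the input is a single multi-line string processed with Python's `re` module. The variable names are invented, and the last component uses [0-9]+ so that multi-digit patch numbers would also be accepted (an assumption beyond the original pattern).

```python
import re

listing = """13.10.0/
13.10.1/
13.10.2/
13.10.3/
13.10.4.2/
13.10.4.4/
13.10.4.5/"""

# Capture group: match the trailing slash but return only the version.
WITH_GROUP = re.compile(r"^([0-9]+\.[0-9]+\.[0-9]+)/$", re.MULTILINE)
# Lookahead: require the slash at the end of the line without consuming it.
WITH_LOOKAHEAD = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+(?=/$)", re.MULTILINE)

print(WITH_GROUP.findall(listing))
print(WITH_LOOKAHEAD.findall(listing))
```

With re.MULTILINE, ^ and $ anchor at each line, so the four-part versions are skipped instead of being truncated to three parts.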
I think you want:
^\d+\.\d+\.\d+$
This matches exactly three groups of digits separated by literal dots.
Some tools (like grep) match all lines that contain your regex, and may have additional characters before/after.
Use the $ character to match the end of the line after your regex. (Also note that . matches any character, not a literal dot, so it needs to be escaped.)
[0-9]+\.[0-9]+\.[0-9]$
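The difference between . and \. is easy to see with a quick test. This sketch assumes Python's `re` module and adds a ^ anchor to both patterns so that only whole strings are compared; the sample strings are made up.

```python
import re

UNESCAPED = re.compile(r"^[0-9]+.[0-9]+.[0-9]$")   # "." matches any character
ESCAPED = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]$")   # "\." matches only a literal dot

samples = ["13.10.4", "13x10y4", "13.10.4.2"]
results = [(bool(UNESCAPED.search(s)), bool(ESCAPED.search(s))) for s in samples]

for text, (unescaped, escaped) in zip(samples, results):
    print(text, unescaped, escaped)
```

The unescaped pattern accepts "13x10y4", which is almost certainly not intended; the escaped one accepts only the dotted form.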

PowerShell RegEx with multiple options

Given a file name of 22-PLUMB-CLR-RECTANGULAR.0001.rfa I need a RegEx to match it. Basically it's any possible characters, then . and 4 digits and one of four possible file extensions.
I tried ^.?\.\d{4}\.(rvt|rfa|rte|rft)$ , which I would have thought would be correct, but I guess my RegEx understanding has not progressed as much as I thought/hoped. Now, .?\.\d{4}\.(rvt|rfa|rte|rft)$ does work and the ONLY difference is that I am not specifying the start of the string with ^. In a different situation where the file name is always in the form journal.####.txt I used ^journal\.\d{4}\.txt$ and it matched journal.0001.txt like a champ. Obviously when I am specifying a specific string, not any characters with .? is the difference, but I don't understand the WHY.
The anchored pattern never matches the given string, because ^.? means: match the beginning of the input, then at most one character. After that the pattern expects a dot followed by four digits, but the engine is still at the start of the string, where neither is present.
Why does it work without ^? Because without ^ the engine may start the match at any position. It starts at the final R of RECTANGULAR, which .? consumes, and then matches through to the end of the string.
With the anchored approach, the pattern should begin with ^.* instead. The Kleene star matches as many characters as possible and then backtracks, whereas ? only makes the preceding element optional: zero or one character, not many.
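The three variants can be compared side by side. The question is about PowerShell, but these constructs behave the same way in Python's `re` module, which this sketch uses as a stand-in; `SUFFIX` and the variable names are invented, and the extension group is written as non-capturing for brevity.

```python
import re

name = "22-PLUMB-CLR-RECTANGULAR.0001.rfa"
SUFFIX = r"\.\d{4}\.(?:rvt|rfa|rte|rft)$"

anchored_optional = re.search(r"^.?" + SUFFIX, name)   # at most one leading character
unanchored_optional = re.search(r".?" + SUFFIX, name)  # match may start anywhere
anchored_star = re.search(r"^.*" + SUFFIX, name)       # any number of leading characters

print(anchored_optional)
print(unanchored_optional.group(0))
print(anchored_star.group(0))
```

The unanchored version does match, but only the tail "R.0001.rfa"; the ^.* version matches the whole file name.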

regex: find a line somewhere after another line

I need a regular expression to find a specific line in a file that occurs somewhere after another line. For example, I may want to find the string "friend", but only when it occurs on a line after a line containing the string "hello". So for example:
hello there
how are you
my friend
should pass, but
how are you
my friend
hello
or
hello friend
how are you
should not pass.
The only thing I've thought of is something like hello[.\s]*\n[.\s]*friend, which does not work.
EDIT: I'm using a customized program that has a lot of limitations. I don't have access to switches or custom modes. I need a single regular expression that works in the standard Python regex mode.
hello[.\s]*\n[.\s]*friend
First, note that a dot inside a character class matches a literal dot rather than acting as a "match all" character, so you really want alternation, not a character class, for this. But also note that a "match all" dot will also match spaces, so you don't even need alternation.
So overall, you really just need this:
hello.*?friend
Now comes the problem with matching across newline characters. By default, the "match all" dot does not match newlines. You can set a flag/modifier to make it match them, but how you do that depends on the language you are using. In PHP or Perl, you can use the s modifier, e.g.
PHP:
preg_match('~hello.*?friend~s',$content);
edit:
If you are trying to use regex in something like an editor (or otherwise can't add flags/modifiers), most editors have an option to flag it as such. If not, you can try alternation with newline chars like so:
hello(.|\r?\n)*friend
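Since the question mentions Python, here is a rough equivalent there: the re.DOTALL flag plays the role of the s modifier. This is only a sketch with a made-up sample string.

```python
import re

text = "hello there\nhow are you\nmy friend"

# Without a flag, "." stops at the end of the first line, so nothing is found.
no_flag = re.search(r"hello.*?friend", text)

# re.DOTALL corresponds to the "s" modifier: "." also matches "\n".
with_dotall = re.search(r"hello.*?friend", text, re.DOTALL)

# Alternation variant from the answer, which needs no flag.
with_alternation = re.search(r"hello(.|\r?\n)*friend", text)

print(no_flag)
print(with_dotall is not None)
print(with_alternation is not None)
```

Both working variants also match when "friend" follows "hello" on the same line, so on their own they do not enforce the line break the question asks for.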
You need to require a newline character between the two words.
hello(?:.*\n)+.*friend
This expects at least one newline character in between.
I'm by no means a regex expert (particularly not in Python), but my RegexBuddy app thinks this will work:
(?s)hello.*\n+.*friend
The (?s) is apparently an inline way of specifying the "dot matches newline" option, which seems to be necessary for the .* parts to span multiple lines.
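The two newline-requiring patterns above can be checked against the three examples from the question. This is a sketch using Python's `re` module, which the question names; `samples` and the pattern names are invented.

```python
import re

# Requires at least one "\n" between the two words; no flags needed.
AFTER_LINE_BREAK = re.compile(r"hello(?:.*\n)+.*friend")
# Inline-flag variant from the last answer.
INLINE_DOTALL = re.compile(r"(?s)hello.*\n+.*friend")

# Text from the question mapped to whether it should pass.
samples = {
    "hello there\nhow are you\nmy friend": True,
    "how are you\nmy friend\nhello": False,
    "hello friend\nhow are you": False,
}

for text, expected in samples.items():
    found = AFTER_LINE_BREAK.search(text) is not None
    print(repr(text), found, found == expected)
```

Both patterns are single expressions without external switches, which fits the restriction mentioned in the question's edit.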

How to make a regular expression looking for a list of extensions separated by a space

I want to be able to take a string of text from the user that should be formatted like this:
.ext1 .ext2 .ext3 ...
Basically, I am looking for a dot, a string of alphanumeric characters of any length, a space, and then the same again, repeated. I am a little confused about how to say "I need a period, a string of characters, and a space". Also, the last extension could be followed by nothing, a space, or a series of spaces. And I guess the extensions could be separated by any number of spaces?
EDIT: I made it clearer what I was looking for.
Thanks!
Try this:
^(?:\.[A-Za-z0-9]+ +)*\.[A-Za-z0-9]+ *$
(Rubular)
In a Java string literal you need to escape the backslashes:
"^(?:\\.[A-Za-z0-9]+ +)*\\.[A-Za-z0-9]+ *$"
(\.\w+)\s* Match this and get your results.
^((\.\w+)\s*)*$ Check this and if it's true, your String is exactly what you want.
For the last pattern thing, you can't (AFAIK) do both getting all extensions (separated) and checking that the last is followed by other things. Either you check your string, or you extract the extensions from it.
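The "check first, then extract" approach could look roughly like this. The sketch uses Python's `re` module rather than Java; `parse_extensions` is a hypothetical helper, and the validation pattern is the one from the first answer.

```python
import re

# Validation pattern from the first answer: dot-prefixed alphanumeric tokens
# separated by one or more spaces, with optional trailing spaces.
VALID_LIST = re.compile(r"^(?:\.[A-Za-z0-9]+ +)*\.[A-Za-z0-9]+ *$")
# Extraction pattern: one extension per match.
ONE_EXTENSION = re.compile(r"\.[A-Za-z0-9]+")

def parse_extensions(text):
    """Return the list of extensions, or None if the input is malformed."""
    if VALID_LIST.search(text) is None:
        return None
    return ONE_EXTENSION.findall(text)

print(parse_extensions(".ext1 .ext2   .ext3  "))
print(parse_extensions("ext1 .ext2"))
```

Separating the two steps keeps each pattern simple: one decides whether the whole string is well formed, the other pulls out the individual extensions.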
I'd start with something like: ^\.[a-z0-9]+([\t\n\v ]+\.[a-z0-9]+)*$ (the dots are escaped so that they match a literal dot rather than any character)

Problem with basic regex to match ending optional character

Hi all, I was hoping someone could help me with some basic regex I am really struggling with.
Basically, I need to match a URL for redirection. I have been using
^~/abc(/)?
However, I need to change the end part to check only the last optional character, as this will also match ~/abcd.
How about ^~/abc(/?)
or more generally: ^~/[a-zA-Z0-9]+/?
Assuming PCRE, you will want:
^~/abc(.)?$
Which will match "~/abc" followed (optionally) by any single character, which will be captured. Leave the () off if you don't need to capture said character.
Just like ^ matches the beginning of string (or line, depending upon mode), $ matches the end of string (or line).
I would do something like this:
^~/([a-zA-Z0-9]+/?)*$
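To see why the end anchor matters, here is a small comparison in Python's `re` module (an assumption; the thread mentions PCRE). `ANCHORED` is not taken verbatim from any answer: it is the $-anchored idea with the optional character narrowed to a slash, on the guess that only "~/abc" and "~/abc/" should be accepted.

```python
import re

ORIGINAL = re.compile(r"^~/abc(/)?")   # no end anchor, so any continuation passes
ANCHORED = re.compile(r"^~/abc/?$")    # optional slash, then end of string

urls = ["~/abc", "~/abc/", "~/abcd", "~/abc/def"]
results = [(bool(ORIGINAL.search(u)), bool(ANCHORED.search(u))) for u in urls]

for url, (original, anchored) in zip(urls, results):
    print(url, original, anchored)
```

If deeper paths such as "~/abc/def" should also redirect, the anchored pattern would need to be relaxed again, for example along the lines of the last answer.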