I have this regex, which detects files with specific extensions:
([a-zA-Z0-9\s_\\.\-\(\):])+(.cmd|.exe|.bat)$
but I would like to change it so that it never matches paths on c:\ . The goal is to detect files with these extensions only on secondary or external drives.
Example
D:\test.bat match
c:\test.bat does not match
Thank you
In the pattern that you tried, you have to escape the dot before the extension to match it literally, and you don't have to escape the dot or the parentheses in the character class.
Note that \s could also match a newline.
For the listed examples, you can make use of a negative lookahead, if supported, to rule out c:\ or C:\
Without the capture groups, to get a match only:
^(?![cC]:\\)[a-zA-Z0-9\s_\\.():-]+\.(?:cmd|exe|bat)$
^ Start of string
(?![cC]:\\) Negative lookahead to assert what is directly to the right is not c:\ or C:\
[a-zA-Z0-9\s_\\.():-]+ Match 1+ times any of the listed in the character class
\.(?:cmd|exe|bat) Match a dot, and 1 of the alternatives
$ End of string
Or with the capture groups:
^(?![cC]:\\)([a-zA-Z0-9\s_\\.():-]+)(\.(?:cmd|exe|bat))$
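As a quick check of the lookahead pattern above, here is a minimal Python sketch. The helper name is_flagged and the sample paths other than the two from the question are assumptions made for illustration.

```python
import re

# Pattern from the answer: reject paths that start with c:\ or C:\,
# then require a .cmd, .exe or .bat extension at the end.
PATTERN = re.compile(r'^(?![cC]:\\)[a-zA-Z0-9\s_\\.():-]+\.(?:cmd|exe|bat)$')

def is_flagged(path: str) -> bool:
    """Return True when the whole path matches the pattern."""
    return PATTERN.search(path) is not None

# Sample paths are illustrative only.
for sample in [r'D:\test.bat', r'c:\test.bat', r'C:\tools\run.exe', r'E:\notes.txt']:
    print(sample, is_flagged(sample))
```

Only D:\test.bat is flagged here: the two paths on the C drive are rejected by the lookahead, and E:\notes.txt fails on the extension.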
Assuming every path is on a separate line, based on the $ you included in your pattern, here's a very simple solution you can build upon:
^[^cC].*(cmd|exe|bat)$
Explanation:
^ matches the beginning of a line.
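[^cC] matches any single character except c or C.
.* matches any character except line terminators, zero or more times.
(cmd|exe|bat) matches your extensions. The dot before the extension is consumed by the preceding .*, but note that .* does not require it, so a name such as testbat also matches. Use \.(?:cmd|exe|bat) if the dot must be present.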
$ matches end of line.
TL;DR: you forgot to match the beginning of your lines.
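To see how this simple pattern behaves on several lines at once, here is a small Python sketch. The STRICT variant with an explicit \. is my own tightening, not part of the answer, and the sample lines are made up for the example.

```python
import re

# Pattern from the answer; re.MULTILINE lets ^ and $ anchor at every line.
SIMPLE = re.compile(r'^[^cC].*(cmd|exe|bat)$', re.MULTILINE)
# Hypothetical tightening: require a literal dot before the extension.
STRICT = re.compile(r'^[^cC].*\.(?:cmd|exe|bat)$', re.MULTILINE)

# One path per line, as the answer assumes.
text = '\n'.join([r'D:\test.bat', r'c:\test.bat', r'D:\testbat', r'E:\run.exe'])

simple_hits = [m.group(0) for m in SIMPLE.finditer(text)]
strict_hits = [m.group(0) for m in STRICT.finditer(text)]

print(simple_hits)
print(strict_hits)
```

The simple pattern accepts D:\testbat because nothing forces a dot before the extension, while the stricter variant drops it. Both reject any line whose first character is c or C, which also excludes relative names such as cmd.exe.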
I am trying to extract part of a filename out of a file path so that I can use it in the filename of a modified file. I'm having a little trouble trying to get RegEx to give me the part of the filename that I need, though. Here is the file path that I'm working with:
X:\\folder1\\folder2\\folder3\\folder4\\folder5\\Wherever-Place_2555025_Monthly-Report_202209150000.csv
Within this path, the drive name, the number of folders, the number of dashes in "Wherever-Place", and the information after the second underscore in the filename may vary. The important part is that I need to extract the following information:
Wherever-Place_2555025
from the path. Basically, I need to match everything between the last backslash and the second underscore. I can come up with the following RegEx to match everything after the last backslash:
[^\\]+$
And, if I run the output of that first RegEx through this next RegEx, I can get a match that includes the beginning of the string through the last character before the second underscore:
[^_]+_[^_]+
But, that also gives me another match that starts after the second underscore and goes through the end of the filename. This is not desirable - I need a single match, but I can't figure out how to get it to stop after it finds one match. I'd also really like to do all of this in one single RegEx, if that is possible. My RegEx has never been that good, and on top of that what I had is rusty...
Any help would be much appreciated.
If Lookarounds are supported, you may use:
(?<=\\)[^\\_]*_[^\\_]*(?=_[^\\]*$)
For this match:
Basically, I need to match everything between the last backslash and
the second underscore.
You can use a capture group:
.*\\\\([^\s_]+_[^\s_]+)
The pattern matches:
.*\\\\ Match the last occurrence of \\
( Capture group 1
[^\s_]+_[^\s_]+ Match 1+ chars other than whitespace and _, then match the first _, and again match 1+ chars other than whitespace and _
) Close group 1
Or if supported with lookarounds and a match only:
(?<=\\)[^\s_\\]+_[^\s_]+(?![^\\]*\\)
The pattern matches:
(?<=\\) Positive lookbehind, assert \ to the left
[^\s_\\]+_[^\s_]+ Match 1+ chars other than whitespace, _ and \, then match the first _, and again match 1+ chars other than whitespace and _
(?![^\\]*\\) Negative lookahead, assert that no \ occurs anywhere to the right
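Both variants can be tried against the path from the question with a short Python sketch. It assumes the path really contains doubled backslashes, as written in the question, which is why a raw string is used; the variable names are illustrative.

```python
import re

# Raw string, so every backslash below is literal (the path has doubled ones).
path = (r'X:\\folder1\\folder2\\folder3\\folder4\\folder5'
        r'\\Wherever-Place_2555025_Monthly-Report_202209150000.csv')

# Capture-group variant: the greedy .* runs to the last doubled backslash.
capture = re.search(r'.*\\\\([^\s_]+_[^\s_]+)', path)

# Lookaround variant: the match itself is the wanted text.
lookaround = re.search(r'(?<=\\)[^\s_\\]+_[^\s_]+(?![^\\]*\\)', path)

prefix_from_group = capture.group(1) if capture else None
prefix_from_match = lookaround.group(0) if lookaround else None

print(prefix_from_group)
print(prefix_from_match)
```

Using re.search, which stops at the first match, is also one way to get the single result the question asks for.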
I am looking to get all non dot-files in a folder with a particular extension. So far my regex is:
(?<=\/|^)(?<!\.)(\w+(?:\.mov|\.py|))$
Is there a way to improve the above regex? What might be some examples where this regex might not work?
The \w+ will only match one or more letters, digits or _. It will not match the rest of the chars that may constitute a valid file name. Also, your (?<!\.) lookbehind is redundant because the previous lookbehind already excludes a dot at that position.
Besides, you do not have to repeat the dot pattern in each alternative; you can group just the extensions.
You may use
(?<=\/|^)([^\/]+)(\.(?:mov|py))$
(?<=\/|^) - / or start of string allowed immediately on the left
([^\/]+) - Group 1: any one or more chars other than /
(\.(?:mov|py)) - Group 2: a . char and then either mov or py
$ - end of string.
Note you may also replace (?<=\/|^) with (?<![^\/]) in real code since it will work the same with standalone strings. It will mess up the demo results at regex101.com because there, you test against a single multiline string (that is why I added \n to the negated character class there, too).
Here's how I would do it:
(?<=\/|^)[^\/\\:*?"<>|\n]+\.(?:mov|py)$
(?<=\/|^) Lookbehind just like you had it
[^\/\\:*?"<>|\n]+ One or more of any character that is not disallowed in filenames
\. A literal dot
(?:mov|py) Either "mov" or "py" in a non-capturing group (similar to yours, but I moved the dot out and excluded the redundant "|")
$ Anchors the search to the end of the line, so only files will match, no folders
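A minimal Python sketch of the first answer's pattern follows. It uses the (?<![^/]) form mentioned above because Python's built-in re module rejects a lookbehind whose alternatives differ in width, such as (?<=\/|^). The NO_DOTFILES variant with an extra (?!\.) is my own assumption about how dot-files could be skipped; neither answer includes it. The helper split_name and the sample names are illustrative.

```python
import re

# Pattern from the answer, with (?<![^/]) standing in for (?<=\/|^).
ANSWER = re.compile(r'(?<![^/])([^/]+)(\.(?:mov|py))$')

# Hypothetical variant: (?!\.) also rejects names that start with a dot.
NO_DOTFILES = re.compile(r'(?<![^/])(?!\.)([^/]+)(\.(?:mov|py))$')

def split_name(pattern, path):
    """Return (stem, extension) for a matching file name, else None."""
    m = pattern.search(path)
    return m.groups() if m else None

print(split_name(ANSWER, 'folder/clip.mov'))
print(split_name(ANSWER, 'folder/.hidden.py'))
print(split_name(NO_DOTFILES, 'folder/.hidden.py'))
```

As written in the answers, [^/]+ may begin with a dot, so a name like .hidden.py still matches; the extra lookahead is one possible way to exclude it.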
I'm trying to write a regex which includes all 'component.ts' files whose path starts with 'src' but excludes those files which have a 'debug' folder in their path, using
(src\/.*[^((?!debug).)*$]/*.component.ts)
I'm testing the following strings on regex101 tester:
src/abcd/debug/xtz/component/ddd/xyz.component.ts
src/abcd/arad/xtz/xyz.component.ts
Both these strings are giving a perfect match, even though the first one has 'debug' in its path. Where am I going wrong?
You are specifying a negative lookahead (?! inside a character class [^((?!debug).)*$]. Inside a character class those characters lose their special meaning, so the class simply matches one character that is not in the listed set.
What you could do is move the negative lookahead to the beginning to assert that /debug/ does not occur anywhere in the path:
^(?!.*\/debug\/)src\/.*component\.ts$
Explanation
^ Assert the start of the line
(?!.*\/debug\/) Negative lookahead to assert that /debug/ does not occur anywhere to the right
src Match literally
\/.*component\.ts Match a forward slash, followed by any character zero or more times, followed by component.ts
$ Assert the end of the string
Note that to match the dot literally you have to escape it \. or else it would match any character.
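The lookahead version can be checked against the two strings from the question with a short Python sketch. Forward slashes need no escaping in Python, so the pattern is written without the backslashes; the variable names are illustrative.

```python
import re

# Same pattern as in the answer, minus the escaping of forward slashes.
PATTERN = re.compile(r'^(?!.*/debug/)src/.*component\.ts$')

paths = [
    'src/abcd/debug/xtz/component/ddd/xyz.component.ts',
    'src/abcd/arad/xtz/xyz.component.ts',
]

# Keep only the paths that do not pass through a debug folder.
kept = [p for p in paths if PATTERN.search(p)]
print(kept)
```

Only the second path is kept, because the lookahead fails as soon as /debug/ appears anywhere in the string.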
Your regex matches:
src/
followed by zero or more non-newline characters
followed by one character that is not in the character class ((?!debug).)*$
followed by zero or more slashes
followed by a non-newline character
followed by component
followed by a non-newline character followed by ts.
In other words, the [^((?!debug).)*$] is not a negative lookahead as you probably intended, but rather a character class.
We can rephrase the desired match to see what we need:
src
followed by one or more path segments, each of which is not equal to debug
followed by the filename
Which gives us:
^src(?:/[^/]+(?<!debug))+/[^/]+\.component\.ts$
(Remember to escape the forward slashes if you’re using these in JavaScript.)
I added the ^ and $ because I assume you want the entire input to match. If you’re searching within a large string, you can remove those and instead change both instances of [^/] to [^\n/].
By the way, there’s no need to place the entire regex inside parentheses, as the full match is already available as the entire matched string (group 0) in most languages.
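Here is a small Python sketch of the segment-based pattern. The helper name accepted and the last two sample paths are assumptions added to probe the edges of the pattern.

```python
import re

# Segment-based pattern from the answer; (?<!debug) is checked at the end
# of every directory segment.
SEGMENTS = re.compile(r'^src(?:/[^/]+(?<!debug))+/[^/]+\.component\.ts$')

def accepted(path: str) -> bool:
    """Return True when the path matches the segment-based pattern."""
    return SEGMENTS.search(path) is not None

for sample in [
    'src/abcd/debug/xtz/component/ddd/xyz.component.ts',
    'src/abcd/arad/xtz/xyz.component.ts',
    'src/abcd/nodebug/xyz.component.ts',   # hypothetical: segment ends in "debug"
    'src/xyz.component.ts',                # hypothetical: file directly under src
]:
    print(sample, accepted(sample))
```

Two things are worth noticing. The lookbehind rejects any directory segment that ends in debug, so nodebug is excluded as well as debug itself. And because at least one directory segment is required, a file directly under src does not match.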
I have the following directory:
Videos/common/Project/Project01/video.project_01.StatusOK/video.project_01.StatusOK.csproj
And the regular expression that I use to extract only the last part of the path (video.project_01.StatusOK.csproj) is the following:
([\w|.])/Project/([\w|.|\s])/([\w|.|\s])/([\w|.|\s])([.]*)
The problem is that the path can vary. If there is an extra directory before video.project_01.StatusOK.csproj, for example Videos/common/Project/Project01/video.project_01.StatusOK/test/video.project_01.StatusOK.csproj, I extract 'test' instead.
Could someone help me with a regular expression for Java that always extracts the last part, the one containing '.csproj', whatever the path?
Regards, and thank you very much
Try this Regex:
(?<=\/)[^\/]+csproj
Explanation:
(?<=\/) - positive lookbehind to find the position immediately preceded by a /
[^\/]+ - matches 1+ occurrences of any character that is not a /
csproj - matches csproj literally
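The pattern can be tried with a short sketch. The question asks about Java; the example below uses Python for brevity, on the assumption that this particular pattern behaves the same in java.util.regex. The helper name file_name is illustrative.

```python
import re

# Pattern from the answer; the forward slash needs no escaping in Python.
LAST_PART = re.compile(r'(?<=/)[^/]+csproj')

def file_name(route: str):
    """Return the path component ending in csproj, or None if absent."""
    m = LAST_PART.search(route)
    return m.group(0) if m else None

route = ('Videos/common/Project/Project01/'
         'video.project_01.StatusOK/video.project_01.StatusOK.csproj')
nested = ('Videos/common/Project/Project01/'
          'video.project_01.StatusOK/test/video.project_01.StatusOK.csproj')

print(file_name(route))
print(file_name(nested))
```

Both calls return video.project_01.StatusOK.csproj, regardless of the extra test directory. A route without any / would not match, since the lookbehind requires one.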
In case you are unaware, Java 7 introduced NIO2 which brought a new interface java.nio.file.Path. You can break up the path to your directory and then use a regular expression on each part of the path.
Oracle's Java Tutorial has a section on Path Operations
(There is also a section on Regular Expressions)
If you want to keep to the /Project/ in your path, you could try this:
.*?/Project/.*?(?<=\/)([\w+. ]+\.csproj)$
That pattern would:
match any character zero or more times non greedy (.*?)
match /Project/
match any character zero or more times non greedy (.*?)
positive lookbehind that asserts that what is before is a forward slash (?<=\/)
A capturing group ( this will contain your match
A character class that will match a word character, plus sign, dot or space, one or more times [\w+. ]+
Match .csproj \.csproj
Close the capturing group )
The end of the string $
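A minimal Python sketch of the /Project/ variant follows, again as a stand-in for Java on the assumption that the pattern behaves the same there. The helper name project_file and the second sample path are hypothetical.

```python
import re

# Pattern from the answer; group 1 holds the file name.
WITH_PROJECT = re.compile(r'.*?/Project/.*?(?<=/)([\w+. ]+\.csproj)$')

def project_file(route: str):
    """Return the .csproj file name for routes that pass through /Project/."""
    m = WITH_PROJECT.search(route)
    return m.group(1) if m else None

route = ('Videos/common/Project/Project01/'
         'video.project_01.StatusOK/video.project_01.StatusOK.csproj')

print(project_file(route))
print(project_file('Videos/common/Other/app.csproj'))  # hypothetical path
```

The second call returns None because the route never passes through /Project/, which is the extra constraint this variant keeps.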
A perforce depot path is of the following format:
//depot/solution/project/file.cs#232
How can I extract just the "file.cs"? I have tried the following.
[^//]*$
Not sure how to eliminate the "#232" part. Could anyone help?
This will find file names even if they don't have a # after them.
(\w+\.\w+)[^/]*$
Explanation:
(\w+\.\w+)
This matches the file name itself. \w is a word character (same as [a-zA-Z0-9_]), so it's 1+ word characters, a full stop (. on its own matches any character, you need \. to match an actual .), then 1+ more word characters.
[^/]*
Matches 0+ characters that are not /. But all the word characters will get put into the \w+ match before (because it is evaluated first and + will try to match as much as it can), so in your example this matches the #232
$
Matches the end of the line, which is needed so that a.directory wouldn't get matched in /a.directory/file.txt
You can use this regex:
/\/([^\/#]*)#/
And use matched group #1 for your value file.cs
Assuming you're using PCRE, you can use the pattern:
'[^/]*(?=#)'
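The three suggestions can be compared side by side with a small Python sketch. The last answer assumes PCRE; Python's re module happens to accept the same lookahead syntax. The variable names and the revision-less second path are illustrative.

```python
import re

depot_path = '//depot/solution/project/file.cs#232'
no_revision = '//depot/solution/project/file.cs'  # hypothetical: no #revision

# 1) Capture name.ext, then let [^/]* absorb an optional #revision.
by_word_chars = re.compile(r'(\w+\.\w+)[^/]*$')
# 2) Capture everything between the last / and the #.
by_group = re.compile(r'/([^/#]*)#')
# 3) Match only, using a lookahead for the #.
by_lookahead = re.compile(r'[^/]*(?=#)')

name_1 = by_word_chars.search(depot_path).group(1)
name_2 = by_group.search(depot_path).group(1)
name_3 = by_lookahead.search(depot_path).group(0)
print(name_1, name_2, name_3)

# Only the first pattern copes with a path that has no revision suffix.
without_hash = by_word_chars.search(no_revision)
print(without_hash.group(1) if without_hash else None)
print(by_group.search(no_revision), by_lookahead.search(no_revision))
```

All three return file.cs for the example. The second and third depend on the # being present, so they find nothing for a path without a revision.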