Atom regex to match file path

I'm using Atom. I've pressed the ".*" button to turn on regular expressions. I'm trying to search for a string only in a file with a path that contains the "src" directory somewhere in the path. I would expect .*\/src\/.* to work but it doesn't. I've tried a bunch of permutations of this but still no luck.
What am I doing wrong?
Sample path "/Users/me/Development/ui/ui-enduser/src/main/js/config/AppConfiguration.js"

For the benefit of those with a similar issue: Atom uses the minimatch library for this path field, so it expects a glob pattern rather than a regular expression.
The syntax would be something like:
/Users/me/Development/**/src/**/
assuming you want to limit your search to subdirectories of /Users/me/Development/.
If you also want to limit your search to a certain extension, the syntax would be:
/Users/me/Development/**/src/**/*.ext

/ is a delimiter in some regex syntaxes (JavaScript regex literals, for example), so it may need escaping. Try escaping it: \/src\/, or if you want the whole string, ^.*\/src\/.*
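As a sanity check of the pattern itself, here is a minimal sketch in Python rather than Atom (so it says nothing about how Atom's path field behaves). It assumes Python's re module, where escaping the slash is optional, and uses the sample path from the question.

```python
import re

# Sample path from the question.
path = "/Users/me/Development/ui/ui-enduser/src/main/js/config/AppConfiguration.js"

# In Python's re module, escaped and unescaped slashes are equivalent.
escaped = re.compile(r".*\/src\/.*")
unescaped = re.compile(r".*/src/.*")

print(bool(escaped.fullmatch(path)))    # True
print(bool(unescaped.fullmatch(path)))  # True

# A path with no "src" directory component does not match.
other = "/Users/me/Development/srcfiles/a.js"
print(bool(unescaped.fullmatch(other)))  # False
```

Both forms match the sample path, so the regex is not the problem; the field it is typed into is.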

Related

Regex expression to match a string but exclude something at the same time

I want to ask this as concisely as possible, so please forgive me if I'm leaving something out. I want the expression to match all cases except where an exact filename string is present.
The backup software I'm using supports regular expressions, and I want to set up an exclusion that skips every file of a particular extension. However, there are certain files of that type I need to back up, so I don't want those to match.
The files I want to exclude are, for this example, *.FLV:
(?i).*\.flv
I want to include three files in my backups: abc123.flv, ghk432.flv, and fdw917.flv
This is where I'm having trouble, even when including just one of the three files:
(?i).*\.flv^(?!(abc123\.flv))&
The expression is being added to an Exclusion List for code42 CrashPlan backup, their support unfortunately cannot assist with complex RegEx expressions.
The closest thing I can supply as an example is their Example 3: Using An Exclude To Include:
.*/Documents/((?!(.*\.(doc|rtf)|.*/)$).)*$
http://support.code42.com/Administrator/3.6_And_4.0/Configuring/Using_Include_And_Exclude_Filters
However, it excludes all files within directories named "Documents" and includes any files in those folders with doc or rtf file extensions. I'm trying to create an expression that works with file extensions regardless of folder location.
In my brain logically it seems like I need to write this as some kind of if then else statement but regex is not my forte.
Use an anchored negative lookahead with an alternation for the files you want to keep:
^(?i)(?!.*(abc123|ghk432|fdw917)\.flv).*\.flv
The negative lookahead asserts that the following input does not match its regex, and the pipe character means "or".
Try to put the negative lookahead at the position of the filename in the path:
^([^/]*/)*(?!(abc123|ghk432|fdw917)\.flv$)[^/]*\.flv$
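The two answers above behave slightly differently, which a quick check can show. This is a sketch in Python, not in CrashPlan: it assumes the backup tool matches the pattern against the whole path (hence fullmatch), replaces the inline (?i) with re.IGNORECASE, and uses made-up sample paths. The helper name is_excluded is hypothetical.

```python
import re

KEEP = r"(abc123|ghk432|fdw917)"

# First answer: reject if a kept name appears anywhere before ".flv".
loose = re.compile(rf"^(?!.*{KEEP}\.flv).*\.flv", re.IGNORECASE)

# Second answer: the lookahead is checked only at the start of the file name.
strict = re.compile(rf"^([^/]*/)*(?!{KEEP}\.flv$)[^/]*\.flv$", re.IGNORECASE)

def is_excluded(pattern, path):
    """Return True when the whole path matches the exclusion pattern."""
    return pattern.fullmatch(path) is not None

for path in ["/videos/other.flv", "/videos/abc123.flv", "/videos/xabc123.flv"]:
    print(path, is_excluded(loose, path), is_excluded(strict, path))
```

Both patterns exclude other.flv and keep abc123.flv. They differ on xabc123.flv: the first pattern keeps it because a kept name appears as a substring, while the second excludes it because the file name is not exactly one of the three.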

Regex to match directory path and ignore filepaths

I have following input list from which I want to extract directories path and ignore filepaths. Below is an example of input list separated by ;
MI4/Search/Service/src/main/resources/META-INF/persistence.xml;MI4/Search/Service/src/main/resources/META-INF;MI4/FRSearch/Service/src/main/resources/resource/spring.xml;MI4/Search/Service/src/main/resources/conf;
The regex should match
MI4/Search/Service/src/main/resources/META-INF;
MI4/Search/Service/src/main/resources/conf;
First of all, directories can be named with extensions, so checking for the presence or absence of an extension in a path is not a reliable way to determine whether something really is a file or a directory. In fact, the only way to determine whether a path is a directory or a file is to use the appropriate OS API, e.g. GetFileAttributes on Windows or stat on Linux.
If this is your requirement, then you should split on the semicolon and iterate over each path that results, feeding each one in turn into the appropriate API to determine if it is a file or directory. If a textual match is all you need, I would still suggest you split on the semicolon and then match each individual path against an appropriate regular expression.
A Ruby function that would extract the extension might look like the following:
def get_extension(path)
  path =~ /[^\/](\.[^.\/]*)$/
  $1
end
Note that there are a few issues you'll need to deal with. This regular expression, for example, would treat the path foo/bar/.hidden as a path without an extension. This might not be exactly the behaviour you need. You'd need to tweak the expression appropriately.
You would then obtain all the paths for which get_extension returns nil. Please let us know which language you're trying to do this in, since there are significant syntactic differences.
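Since the asker's language is unknown, here is one possible rendering of the split-then-filter approach in Python. It is a sketch of the textual match only: the regex is a direct port of the Ruby one above, and the input is the list from the question.

```python
import re

# Port of the Ruby pattern: a dot-extension at the end of the path,
# preceded by a character that is not a slash.
EXTENSION = re.compile(r"[^/](\.[^./]*)$")

def get_extension(path):
    """Return the extension including the dot, or None if there is none."""
    match = EXTENSION.search(path)
    return match.group(1) if match else None

raw = (
    "MI4/Search/Service/src/main/resources/META-INF/persistence.xml;"
    "MI4/Search/Service/src/main/resources/META-INF;"
    "MI4/FRSearch/Service/src/main/resources/resource/spring.xml;"
    "MI4/Search/Service/src/main/resources/conf;"
)

# Drop the empty string produced by the trailing semicolon.
paths = [p for p in raw.split(";") if p]

# Treat paths without an extension as directories (a textual guess only).
directories = [p for p in paths if get_extension(p) is None]
print(directories)
```

This keeps the META-INF and conf entries. As noted above, it is only a guess based on the text; a directory named with an extension would be filtered out.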

How do you find a "."?

I'm trying to create a regular expression to look for filenames from full file paths, but it should not return for just a directory. For example, C:\Users\IgneusJotunn\Desktop\file.doc should return file.doc while C:\Users\IgneusJotunn\Desktop\folder should find no matches. These are all Word or Excel files, but I prefer not to rely on that. This:
StringRegExp($string, "[^\\]*\z",1)
finds whatever is after the last slash, but can't differentiate files from folders. This:
StringRegExp($string, "[^\\]*[dx][ol][cs]\z",1)
almost works, but is an ugly hack and there may be docx or xlsx files. Plus, files could be named like MyNamesDoc.doc. Easily solved if I could search for a period, but . is a used character (it means any single character except a newline) which does not seem to work with escapes. This:
StringRegExp($ue_string, "[^\\]*\..*\z",1)
should work, finding anything after the last backslash, capturing only something with a period in it. How to incorporate a period? Or any way to just match files?
Edit: Answered my own question. I'm interested in why it wasn't working and if there's a more elegant solution.
Local $string = StringRegExp($string, "[^\\]*\.doc\z|[^\\]*\.docx\z|[^\\]*\.xls\z|[^\\]*\.xlsx\z",1)
Periods do in fact work with the same escape slash most special characters use. As for the document type, an Or pipe and a different extension works great. If for some reason you need to add an extension, just add another Or.
Meh, I'm bored. You could do this:
$sFile = StringRegExp($sPath, "[^\\]+\.(?:doc|xls)x?$", 1)
There's no guarantee that a folder wouldn't be named that, so to be absolutely certain you'd have to check the file/folder attributes. However, it's doubtful anyone would name a folder something like '.docx'.
Reverse the string.
Look for the "."
Look for "\" with StringInStr (and/or "/")
Trim the right side from the return of StringinStr
Reverse it again.
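For comparison, the "last component, only if it contains a period" idea can be sketched outside AutoIt. The following Python version is an illustration, not a translation of any answer above; file_name is a hypothetical helper. Excluding the backslash on both sides of the dot keeps the match inside the last path component, which the attempt with \..* in the question does not do.

```python
import re

# Last path component, but only if it contains a period.
FILE_NAME = re.compile(r"[^\\]+\.[^\\.]+$")

def file_name(path):
    """Return the trailing file name, or None if the last component has no dot."""
    match = FILE_NAME.search(path)
    return match.group(0) if match else None

print(file_name(r"C:\Users\IgneusJotunn\Desktop\file.doc"))  # file.doc
print(file_name(r"C:\Users\IgneusJotunn\Desktop\folder"))    # None
print(file_name(r"C:\Users\a.b\folder"))                     # None
```

A folder whose own name contains a period would still match, so checking file attributes remains the only certain test.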

replace urls

I have a huge txt file, open in EditPad Pro, containing a list of URLs, some of which point to images in the root folder.
http://www.othersite.com/image01.jpg
http://www.mysite.com/image01.jpg
http://www.mysite.com/category/image01.jpg
How can I change only the ones that have images in the root folder, using a regexp?
http://www.othersite.com/image01.jpg
http://www.NEW_WEBSITE.com/image01.jpg
http://www.mysite.com/category/image01.jpg
I'm using the RegExr online app.
Search and replace (case insensitive, regular expression):
http://www\.mysite\.com/([^/]*\.(?:jpg|gif|png))
with:
http://www.NEW_WEBSITE.com/\1
EDIT
And yes, this will also re-base files such as http://www.mysite.com/.jpg, if any such files or directories exist. If anyone doesn't like this, just replace * with +, or with {X,} if your assumption happens to be that an image file name needs at least X characters, and so on. But really, this is probably quite outside the scope of what lab72 is trying to achieve (i.e. not image file name validation).
url1.replace(/(https?:\/\/www\.?)(\w*?)(\.com\/image\d*?\.(png|gif|jpg))/,
"$1newName$3");
Something like the above should work. The code is in AS (not compiled though :P) Note that $2 matches the site's name, which is the part being replaced with newName.
Replace
http://www\.mysite\.com/image(.*)
with
http://www.newsite.com/image$1
That being said, you might also be interested in a decent text editor. That flash applet is really yucky. You can still use the same regexp, although you'll have to replace the dollar sign $ with a backslash \.
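To see the first answer's search-and-replace in action, here is a small sketch in Python (an assumption; the question is about EditPad Pro and RegExr). It uses the three sample URLs from the question and the same pattern, with re.IGNORECASE standing in for the case-insensitive option.

```python
import re

# Only images directly under the root: no further "/" is allowed before the
# file name.
ROOT_IMAGE = re.compile(
    r"http://www\.mysite\.com/([^/]*\.(?:jpg|gif|png))",
    re.IGNORECASE,
)

text = "\n".join([
    "http://www.othersite.com/image01.jpg",
    "http://www.mysite.com/image01.jpg",
    "http://www.mysite.com/category/image01.jpg",
])

# \1 re-inserts the captured file name after the new host.
result = ROOT_IMAGE.sub(r"http://www.NEW_WEBSITE.com/\1", text)
print(result)
```

Only the second URL changes; the one under /category/ is left alone because [^/]* cannot cross the extra slash.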

Regex: Get Filename Without Extension in One Shot?

I want to get just the filename using regex, so I've been trying simple things like
([^\.]*)
which of course work only if the filename has one extension. But if it is adfadsfads.blah.txt I just want adfadsfads.blah. How can I do this with regex?
In regards to David's question, 'why would you use regex' for this, the answer is, 'for fun.' In fact, the code I'm using is simple
length_of_ext = File.extname(filename).length
filename = filename[0,(filename.length-length_of_ext)]
but I like to learn regex whenever possible because it always comes up at Geek cocktail parties.
Try this:
(.+?)(\.[^.]*$|$)
This will:
Capture filenames that start with a dot (e.g. .logs is a file named .logs, not a file extension), which is common in Unix.
Capture everything but the last dot and what follows it: foo.bar.jpeg gets you foo.bar.
Handle files with no dot: secret-letter gets you secret-letter.
Note: as commenter j_random_hacker suggested, this performs as advertised, but you might want to precede things with an anchor for readability purposes.
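The three cases listed above can be checked with a short sketch. It is written in Python (an assumption, since the question's own snippet is Ruby), adds the leading anchor mentioned in the note, and uses a hypothetical helper name, strip_extension.

```python
import re

# Lazy capture of the name, then either a final ".ext" or the end of string.
BASENAME = re.compile(r"^(.+?)(\.[^.]*$|$)")

def strip_extension(filename):
    """Return the file name without its last extension."""
    match = BASENAME.match(filename)
    return match.group(1) if match else None

for name in [".logs", "foo.bar.jpeg", "secret-letter", "adfadsfads.blah.txt"]:
    print(name, "->", strip_extension(name))
```

The lazy (.+?) must consume at least one character, which is why a leading dot ends up inside the name rather than being treated as the start of an extension.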
Everything followed by a dot followed by one or more characters that's not a dot, followed by the end-of-string:
(.+?)\.[^\.]+$
The everything-before-the-last-dot is grouped for easy retrieval.
If you aren't 100% sure every file will have an extension, try:
(.+?)(\.[^\.]+$|$)
how about 2 captures one for the end and one for the filename.
eg.
(.+?)(?:\.[^\.]*$|$)
^(.*)\\(.*)(\..*)$
Gets the path without the last \
The file name without the extension
The extension with a .
Examples:
c:\1\2\3\Books.accdb
(c:\1\2\3)(Books)(.accdb)
Does not support multiple . in file name
Does support . in file path
I realize this question is a bit outdated, however, I had some trouble finding a good source and wound up making the regex myself. To save whoever may find this time,
If you're looking for a ~standalone~ regex
This will match the extension without the dot
\w+(?![\.\w])
This will always match the file name if it has an extension
[\w\. ]+(?=[\.])
OK, I am not sure why I would use a regular expression for this. If I know, for example, that the string is a full file path, then I would use another API to get the file name. Regular expressions are very powerful but at the same time quite complex (you have just proved that by asking how to create such a simple regex). As somebody said: you had a problem and decided to solve it using regular expressions. Now you have two problems.
Think again. If you are on .NET platform for example, then take a look at System.IO.Path class.
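As an illustration of the "use a path API instead" suggestion, here is a sketch using Python's pathlib rather than .NET's System.IO.Path (the language choice is an assumption; the sample names come from this thread).

```python
from pathlib import PurePosixPath, PureWindowsPath

# .stem drops only the last extension and leaves dotfiles intact.
print(PurePosixPath("adfadsfads.blah.txt").stem)       # adfadsfads.blah
print(PurePosixPath(".logs").stem)                     # .logs
print(PurePosixPath("secret-letter").stem)             # secret-letter

# PureWindowsPath understands backslash separators on any platform.
print(PureWindowsPath(r"c:\1\2\3\Books.accdb").stem)   # Books
```

The Pure* classes only parse the string and never touch the file system, so this works for paths that do not exist.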
I used this pattern for simple search:
^\s*[^\.\W]+$
for this text:
file.ext
fileext
file.ext.ext
file.ext
fileext
It finds fileext in the second and last lines.
I applied it in a text tree view of a folder (with spaces as indents).
Just the name of the file, without path and suffix.
^.*[\\|\/](.+?)\.[^\.]+$
Try
(?<=[\\\w\d-:]*\\)([\w\d-:]*)(?=\.[\.\w\d-:]*)
Captures just the filename of any kind within an entire filepath. Purposefully excludes the file path and the file extension
Example:
C:\Log\test\bin\fee105d1-5008-410c-be39-883e5e40a33d.pdf
Doesn't capture (C:\Log\test\bin)
Captures (fee105d1-5008-410c-be39-883e5e40a33d)
Doesn't capture (.pdf)
This RegExp works for me:
(.+(?=\..+$))|(.+[^\.])
Results (bold means match):
test.txt
test 234!.something123
.test
.test.txt
test.test2.txt
.
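The matched portions of the results list can be recomputed with a short script. This sketch assumes Python's re module behaves like the engine the answer was tested with for this pattern; matched is a hypothetical helper.

```python
import re

PATTERN = re.compile(r"(.+(?=\..+$))|(.+[^\.])")

def matched(text):
    """Return the first matched substring, or None if nothing matches."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

samples = [
    "test.txt",
    "test 234!.something123",
    ".test",
    ".test.txt",
    "test.test2.txt",
    ".",
]
for sample in samples:
    print(repr(sample), "->", repr(matched(sample)))
```

This gives test, test 234!, .test, .test, and test.test2 for the first five inputs, and no match for the lone dot.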