Change a string with backward slashes to forward slashes - regex

I have a string in the format of "c:\replaceallslashes\directory1\subdirectory1\etc\etc\file.html" in a large number of files. All the backslashes in these strings need to be changed to forward slashes so that the path can become a URL. I want to change this via find & replace in a text or regex editor, but I don't want to accidentally replace any backslashes that occur outside the string elsewhere in the documents.
How should I construct the find and replace command?
Edit: just to be clear, I am looking for regex strings for both the "find" and "replace" fields. The answer below only gives the "find" command.

A very rudimentary Windows path regex would look something like:
[a-z]:\\(?:[a-z0-9_-]+\\?)*
https://regex101.com/r/g1hubs/1
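The issue is that Windows filenames are only restricted from using \/:*?"<>| so only a small set of characters tells you that something is definitely not a path. My example therefore assumes that your paths contain only alphanumeric characters, underscores and dashes. Note that the character class does not include the dot, so as written the match stops before an extension such as .html, and [a-z] only matches uppercase letters if the case-insensitive flag is on.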
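If a script is an option instead of an editor's find & replace fields, the "replace" half can be done with a callback: find each path, then swap the slashes only inside the match. This is a minimal sketch in Python, not an editor recipe. The function name slashes_to_forward is made up for illustration, and adding the dot to the character class is my assumption so that extensions such as .html are covered.

```python
import re

# Rudimentary Windows path pattern from the answer above. Assumption: a dot is
# added to the character class so that extensions like ".html" are matched too.
WINDOWS_PATH = re.compile(r"[a-z]:\\(?:[a-z0-9_.-]+\\?)*", re.IGNORECASE)


def slashes_to_forward(text: str) -> str:
    """Replace backslashes with forward slashes, but only inside matched paths."""
    return WINDOWS_PATH.sub(lambda m: m.group(0).replace("\\", "/"), text)


if __name__ == "__main__":
    sample = r'See "c:\replaceallslashes\directory1\file.html" but keep \n as is.'
    print(slashes_to_forward(sample))
```

Backslashes that are not part of a drive-letter path (the \n in the sample) are left alone because they never fall inside a match.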

Elasticsearch Regex to match url starting with one string and not ending with another, without look ahead/behind

I have two groups of strings that take the formats
http://example.com/foo/something
and
http://example.com/foo/something/something-else/bar/1
Where example.com, foo and bar are fixed, something and something else could be any string and 1 is any number.
I want to use regex to match strings following the first format (they must start with http://example.com/foo/) and not the second. The exclusion could be around number of slashes, the "bar" string or ending in a number.
I don't have support for look ahead or look back.
What's the best approach?
Examples of strings that should match
http://example.com/foo/apple
http://example.com/foo/bear-bear
http://example.com/foo/cake-cake
Examples of strings that should NOT match
http://example.com/baa/apple
http://example.com/foo/apple/cake/bar/1
http://example.com/foo/bear-apple/camel/bar/2
Examples of strings that wouldn't exist in the data set
(So it doesn't matter if they match or not)
http://example.com/foo/bear-bear/cake/bar/two
http://example.com/foo/bear/camel/tar/2
http://example.com/foo/bear-bear/camel
http://example.com/foo/bear/camel/
http://example.com/foo/bear-bear/camel/tar/2
UPDATE
It turns out that the application I'm using this in relies on the Elasticsearch regex engine, so this documentation (and one of our developers) was helpful: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html
The end solution was:
(http://example.com/foo.*)&~(.*bar.*)
All your examples have a specific prefix URL, followed by one-and-only-one path element. If this is the general case, you can do this by simply looking for the prefix URL followed by a word which doesn't contain a path separator, followed by EOL.
You didn't say what engine you're using, so here's an example with GNU grep in bash:
grep -e '^http://example.com/foo/[^/]\+$'
Bash makes for readable examples, because single-quoting means very few characters need escaping. The sole exception in my example is the + character.
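The same idea (fixed prefix, then exactly one path element with no further slash, then end of string) needs no lookahead or lookbehind in any engine. Here is a small Python sketch that checks it against the examples from the question. The function name is_first_format is only illustrative.

```python
import re

# The prefix is fixed; re.escape keeps the dots in "example.com" literal.
PREFIX = "http://example.com/foo/"
SINGLE_SEGMENT = re.compile(re.escape(PREFIX) + r"[^/]+")


def is_first_format(url: str) -> bool:
    """True when url is the prefix followed by exactly one path element."""
    return SINGLE_SEGMENT.fullmatch(url) is not None


SHOULD_MATCH = [
    "http://example.com/foo/apple",
    "http://example.com/foo/bear-bear",
    "http://example.com/foo/cake-cake",
]
SHOULD_NOT_MATCH = [
    "http://example.com/baa/apple",
    "http://example.com/foo/apple/cake/bar/1",
    "http://example.com/foo/bear-apple/camel/bar/2",
]

if __name__ == "__main__":
    for url in SHOULD_MATCH + SHOULD_NOT_MATCH:
        print(url, is_first_format(url))
```

fullmatch plays the role of the ^ and $ anchors in the grep version.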

How do I escape comma in WMIC inside like string

I wish to be able to run a query like the following:
wmic path Win32_Service where "DisplayName like 'FooBarService % (X, Y)'" get *
But, it doesn't work because of the comma inside the like string. The error I get is "Invalid Verb." I tried escaping it with a backslash, and I tried escaping it using brackets as underscores are meant to be escaped, and both resulted in the "Invalid Verb." error.
As a less-than-ideal workaround, I can replace the commas with underscores, and it works, but the underscore will match any single character rather than just the comma, so I'd rather find a way to escape the commas.
Is there a way to escape the comma like in this example?
One way I have found to include a comma in the like clause is to place the entire where expression in parentheses. Unfortunately, I also found that this means I cannot include a close paren in the string at the same time (but an open paren is okay). I experimented with the /trace:on option to see what was going on under the covers a little bit and it helped me find a couple things the program accepts:
Here is an example I got to work with a comma, but it apparently cannot contain a close paren:
C:\> wmic /trace:on path Win32_Service where (Description like '%(%, %') get DisplayName
And here is an example I got to work with both open and close parentheses, but apparently it cannot contain a comma (obviously, this is quite similar to your original example):
C:\> wmic /trace:on path Win32_Service where "Description like '%(TAPI)%'" get DisplayName
It seems like the parser just isn't complex enough to handle these cases, but with tracing on, you can see the WMI Win32 functions that it uses, so maybe you could write your own program that uses the functions directly. I think IWbemServices::ExecQuery is capable of what you're looking to do.

RegEx for string that has bezier<somestring>Path

I have a large file in Xcode that contains numerous occurrences of bezier<somestring>Path (i.e. bezier456Path). What I'd like to do is come up with a regular expression for this string so I can replace it simply with the string "bezierPath". I've tried things like bezier\w*Path to no avail. Does anyone know what I could use to search for a string like this?
\w is for word characters. Based on your example, you need to search for digits (\d):
bezier\d+Path
Also, since your replacement string is bezierPath, there is no point in using * (i.e. zero or more), since that would include replacing bezierPath with itself. Therefore, you should use + (i.e. one or more), instead.
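To try the pattern outside Xcode, here is a minimal Python sketch of the same find and replace. The function name normalize_bezier and the sample line are made up for illustration.

```python
import re


def normalize_bezier(source: str) -> str:
    """Collapse names like bezier456Path to bezierPath (one or more digits)."""
    return re.sub(r"bezier\d+Path", "bezierPath", source)


if __name__ == "__main__":
    code = "[bezier456Path moveToPoint:p]; [bezier7Path closePath]; [bezierPath stroke];"
    print(normalize_bezier(code))
```

Because of the +, an existing bezierPath is not matched at all, so it is never rewritten with itself.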
I don't have access to Xcode, but can't you use Find with
(bezier.*?Path)
and then use Replace with the path that you want, so in this case just bezierPath

how to extract filename in this situation?

my input strings look like this:
1 warning: rg: W, MULT: file 'filename_a.h' was listed twice.
2 warning: rg: W, SCOP: scope redefined in '/proj/test/site_a/filename_b.c'.
3 warning: rg: W, ATTC: file /proj/test/site_b/filename_c.v is not resolved.
4 warning: rg: W, MULTH: property file filename_d.vu was listed outside.
They come in four different flavors as listed above. I read these from a log file line by line.
For the ones with a path specified (lines 2 and 3), I can extract the filename using $file=~s#.*/##; and it seems to work fine. Is there a way to extract the filename without conditional statements for the different types? I want to use just one clean regex. Perl's File::Basename will also not work in this case.
I am using Perl.
You could do it in two steps:
extract path from each line
get basename from the path
Example
#!/usr/bin/perl -n
use feature 'say';
use File::Basename;
#NOTE: assume that unquoted path has no spaces in it
say basename($1.$2) if /(?:file|redefined in)\s+(?:'([^']+)'|(\S+))/;
Output
filename_a.h
filename_b.c
filename_c.v
filename_d.vu
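For comparison, here is the same two-step idea (pull out the path, then take its basename) as a Python sketch. It reuses the pattern from the Perl one-liner and, like it, assumes that an unquoted path contains no spaces. The function name extract_filename is only illustrative.

```python
import posixpath
import re

# Same pattern as the Perl one-liner: after "file" or "redefined in", take
# either a single-quoted path or an unquoted run of non-space characters.
PATH_RE = re.compile(r"(?:file|redefined in)\s+(?:'([^']+)'|(\S+))")


def extract_filename(line: str):
    """Return the basename of the path mentioned in a warning line, or None."""
    match = PATH_RE.search(line)
    if match is None:
        return None
    # Exactly one of the two groups is set, depending on whether the path was quoted.
    path = match.group(1) or match.group(2)
    return posixpath.basename(path)


LOG_LINES = [
    "1 warning: rg: W, MULT: file 'filename_a.h' was listed twice.",
    "2 warning: rg: W, SCOP: scope redefined in '/proj/test/site_a/filename_b.c'.",
    "3 warning: rg: W, ATTC: file /proj/test/site_b/filename_c.v is not resolved.",
    "4 warning: rg: W, MULTH: property file filename_d.vu was listed outside.",
]

if __name__ == "__main__":
    for log_line in LOG_LINES:
        print(extract_filename(log_line))
```

posixpath is used rather than os.path so that the forward-slash paths are split the same way on every platform.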
Your problem needs more constraints. For example, what's a good way to characterize a string as a "path" (or "filename") or not? You might say, "Hey, when I see a single dot immediately followed by letters and numbers (but not symbols), and there are a bunch of characters before that dot too, then it might be a path or filename!"
\s+([^\s]+\.\w+)
But this doesn't catch all paths, nor files without an extension. So we might latch on an alternation to say, "Either the above, or, a string with at least one slash in it."
\s+([^\s]+\.\w+|[^\s]*\/[^\s]*)
(Note that you may not need to escape the slash in the above example, since you seem to be using # as your delimiter.)
What I'm getting at, in any case, is that you need to specify your problem more rigorously, and this will automatically bring you to a satisfying solution. Of course, there is no truly "correct" solution using regexes alone: you'd need to do file tests to do that.
To go further with this example, perhaps you want to define a list of extensions:
\s+([^\s]+\.(?:c|h|cc|cpp)|[^\s]*\/[^\s]*)
Or, perhaps you want to be more generic, but allow only extensions up to 4 characters long:
\s+([^\s]+\.\w{1,4}|[^\s]*\/[^\s]*)
Perhaps you only consider something a path if it begins with a slash, but you still want at least one other slash somewhere in it:
\s+([^\s]+\.\w{1,4}|/[^\s]*\/[^\s]*)
Good luck.
/\w*\.\w*/
This will match the file name expressed in the four different warning logs. \w will match any word character (letters, digits, and underscores), so this regex looks for any number of word characters, followed by a dot followed by more word characters.
This works because the only other dot in your logs is at the end of the log.

How to search (using regex) for a regex literal in text?

I just stumbled on a case where I had to remove quotes surrounding a specific regex pattern in a file, and the immediate conclusion I came to was to use vim's search and replace util and just escape each special character in the original and replacement patterns.
This worked (after a little tinkering), but it left me wondering if there is a better way to do these sorts of things.
The original regex (quoted): '/^\//' to be replaced with /^\//
And the search/replace pattern I used:
s/'\/\^\\\/\/'/\/\^\\\/\//g
Thanks!
You can use almost any character as the regex delimiter. This will save you from having to escape forward slashes. You can also use groups to extract the regex and avoid re-typing it. For example, try this:
:s#'\(\\^\\//\)'#\1#
I do not know if this will work for your case, because the example you listed and the regex you gave do not match up. (The regex you listed will match '/^\//', not '\^\//'. Mine will match the latter. Adjust as necessary.)
Could you avoid using regex entirely by using a nice simple string search and replace?
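Outside vim, a plain string replacement along those lines might look like the following Python sketch. The helper names are hypothetical. The second function shows the regex route with re.escape doing the escaping, so none of the special characters have to be escaped by hand.

```python
import re


def strip_quotes_plain(text: str, literal: str) -> str:
    """Remove single quotes around an exact literal using plain string replace."""
    return text.replace("'" + literal + "'", literal)


def strip_quotes_regex(text: str, literal: str) -> str:
    """Same result via a regex; re.escape makes every character literal."""
    pattern = "'(" + re.escape(literal) + ")'"
    # A function replacement avoids backslash processing in the replacement text.
    return re.sub(pattern, lambda m: m.group(1), text)


if __name__ == "__main__":
    line = r"pattern = '/^\//';"
    print(strip_quotes_plain(line, r"/^\//"))
    print(strip_quotes_regex(line, r"/^\//"))
```

Both calls turn the quoted '/^\//' into /^\// and leave the rest of the line untouched.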
Please check whether this works for you. Either give a line number (or range) before the substitute command, or place the cursor on the line first:
:s:'\(.*\)':\1:
I used vim 7.1 for this. You can also visually select the area the substitution should run on beforehand (press "v" or "V" and move the cursor accordingly).