I am trying to quickly find all .java files which contain one term but are missing another term. I'm using MyEclipse 10.7 and its 'Search | File Search' feature, which supports regular expressions.
Will regex work in this scenario? What would the correct regex be?
The only solution I found that works is the following regex:
^(?!.[\s\S]*MISSING_TERM).[\s\S]*INCLUDED_TERM.*$
It finds every file which includes INCLUDED_TERM but lacks MISSING_TERM, regardless of the line.
The key is [\s\S], which matches any character including line breaks, so the whole file is searched rather than each line separately.
If you want to find it on a single line, use it like this:
^(?!.*MISSING_TERM).*INCLUDED_TERM.*$
You can also use \ as an escape character, because you may need to match a literal character such as the dot in class\.variable.
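To try the whole-file idea outside the IDE, here is a minimal Python sketch. It is an illustration under assumptions, not a description of how Eclipse evaluates the search: the helper name has_term_without and the sample strings are made up, and the pattern is anchored with \A so the lookahead always scans from the start of the text.

```python
import re

def has_term_without(text, included, missing):
    """Hypothetical helper: True if text contains `included` but not `missing`."""
    pattern = re.compile(
        # \A anchors at the very start; [\s\S]* crosses line breaks.
        r"\A(?![\s\S]*" + re.escape(missing) + r")[\s\S]*" + re.escape(included)
    )
    return pattern.search(text) is not None

# Made-up file contents for illustration.
sample_ok = "class A {\n  INCLUDED_TERM x;\n}\n"
sample_bad = "class B {\n  MISSING_TERM y;\n  INCLUDED_TERM x;\n}\n"

print(has_term_without(sample_ok, "INCLUDED_TERM", "MISSING_TERM"))
print(has_term_without(sample_bad, "INCLUDED_TERM", "MISSING_TERM"))
```

The terms are passed through re.escape, which covers the class\.variable case mentioned above without hand-written backslashes.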
You could use something like:
(?<!.*bar)foo(?!.*bar)
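This matches "foo" only when "bar" appears neither before nor after it. Note that the lookbehind (?<!.*bar) has variable length, which many regex engines do not support.
Notice: you must configure your search engine so that the dot also matches line breaks (e.g. Notepad++ has an option called ". matches newline"), because the dot usually represents any character except a line break.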
(?m)\A(?=.*REGEX_TO_FIND)(?!.*MISSING_REGEX.*).*\z
The regex can get a bit tricky, but it breaks down into two pieces:
1. Find the matching term/phrase/word. This part isn't too tricky, as this is what regex normally looks for.
2. Find the term that is not present. This is the tricky part, but it's possible.
I have an example which shows how to find the word connectReadOnly in the text while failing to find disconnect. Since the text contains connectReadOnly, the regex moves on to the next piece: not finding disconnect. Since disconnect is in the text, the match fails on the entire string (which is what you need for an entire file). The second piece, the negation part (?!.*disconnect.*), can be set to whatever regex you need. In my example I don't want to find disconnect anywhere in my code :) You can easily replace it with your own word, or even a more complex regex to "not find".
The key is to start with (?m) and then use the start/end-of-string anchors: ^ and $ match the start and end of a line, whereas \A and \z match the start and end of the whole string, thus extending the match over the entire file. Note that the meaning of (?m) depends on the flavor: in Ruby it makes the dot match line breaks, which this pattern relies on, while in most other flavors (Java included) that is the job of (?s).
EDIT: For the connectReadOnly and disconnect question use: (?m)\A(?=.*connectReadOnly)(?!.*disconnect.*).*\z
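As a rough check of the two-lookahead structure, here is a sketch in Python. It assumes Python's re module rather than the tester used in the answer, so re.DOTALL takes the place of the flag that lets the dot cross line breaks and \Z is used as the end-of-string anchor; the function name and sample texts are invented for the example.

```python
import re

# Assumption: Python's re stands in for the original regex tester.
# re.DOTALL lets "." cross line breaks; \A and \Z anchor the whole string.
WHOLE_TEXT = re.compile(
    r"\A(?=.*connectReadOnly)(?!.*disconnect).*\Z", re.DOTALL
)

def file_matches(text):
    """True when text contains connectReadOnly and no disconnect."""
    return WHOLE_TEXT.match(text) is not None

# Made-up sample texts.
good = "db = connectReadOnly()\nrows = db.query()\n"
bad = "db = connectReadOnly()\ndb.disconnect()\n"

print(file_matches(good))
print(file_matches(bad))
```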
How can I match one line of text with a regex and follow it with a line of dashes, exactly as many as there are characters in the match, to achieve text-only underlining? I intend to use this with the search-and-replace function (likely in the scope of a macro) inside an editor, probably, but not necessarily, Visual Studio Code.
This is a heading
should turn into
This is a heading
-----------------
I believe I read an example of this years ago but can't find it; nor do I seem to be able to formulate a search query that gets anything useful out of Google (including variations of the question's title). If you can, I'd be interested in that, too.
The best I can come up with is this:
^(.)(?=(.*\n?))|.
Substitution
$1$2-
syntax         note
^(.)           match the first character of a line, capture it in group 1
(?=(.*\n?))    then look ahead for the rest of this line and capture it in group 2, including a line break if there is one
|.             or a normal character
But the text must have a line break after it; otherwise the underline stays on the same line.
Not sure if it is of any use, but here are the test cases.
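The behaviour described above can be reproduced with a short Python sketch. This assumes Python's re module (3.5 or later, where unmatched groups expand to an empty string in the replacement) instead of an editor's search and replace; the function name underline is made up.

```python
import re

# The regex and substitution from the answer, in Python notation.
# re.MULTILINE makes ^ match at the start of every line.
UNDERLINE = re.compile(r"^(.)(?=(.*\n?))|.", re.MULTILINE)

def underline(text):
    # First character: emit it, then the rest of the line, then one dash.
    # Every other character: groups are empty, so only a dash is emitted.
    return UNDERLINE.sub(r"\1\2-", text)

print(underline("This is a heading\n"), end="")
# Without a trailing line break the dashes land on the same line.
print(underline("This is a heading"))
```

Because ^ matches on every line in this mode, each line of a multi-line input gets its own underline.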
Suppose we have a string:
some random text $$ hello world $$ some more stuff
In the string delimited by $$, I would like to capture all occurrences of o (for example, in order to find and replace in Sublime Text 3).
How do I formulate such a regex command? Even though I plan to use this regex command in Sublime Text, I can take a regex command that uses different notation and fix it for use in ST3.
Use the regex:
(\$\$[^o]*)o(.*?\$\$)
replace with:
\15\2
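where 5 is the substitution string. If your editor reads \15 as a reference to group 15, separate the two, e.g. ${1}5 where that syntax is supported.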
If you want to search for a sub-string instead of only one character, you may use:
(\$\$.*?)lo(.*?\$\$)
with the same replacement string.
Another option is to extract first the sub-string between the delimiters ($$ hello world $$), and then on the result perform any search-and-replace action you need. This might also need looping until no replacements are done any more.
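The extract-then-replace option can be sketched in Python, where the replacement argument may be a function, so no manual looping is needed. This is an illustration outside Sublime Text; the helper name replace_between and its parameters are hypothetical.

```python
import re

def replace_between(text, old, new, delim="$$"):
    """Replace every `old` with `new`, but only inside delim ... delim spans."""
    d = re.escape(delim)
    # Lazy .*? stops at the nearest closing delimiter.
    span = re.compile(d + r".*?" + d, re.DOTALL)
    # Step 1: find each delimited span. Step 2: plain replace inside it.
    return span.sub(lambda m: m.group(0).replace(old, new), text)

s = "some random text $$ hello world $$ some more stuff"
print(replace_between(s, "o", "5"))
```

Text outside the delimiters, such as the o in "some", is left untouched because only the matched span is passed to the inner replace.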
To the best of my regex knowledge, you'll need to use two regular expressions to replace all the o's between the delimiters. The first one would be to actually grab the text within the delimiter. For example:
(?P<start>\$\$) # start by grabbing the starting literal '$$'
(?P<middle>.*?)(?=\$\$) # then grab everything up until the ending '$$'
(?P<end>\$\$) # now grab the ending '$$'
Example: https://regex101.com/r/015Xj8/4
Obviously you know what the start and ending is so you can simplify it further if you wanted (no start or end group for example), but I've included that for thoroughness.
Once you've captured the text between the delimiters, you can replace the o's with a straight find/replace on the literal 'o'. As far as I know, it takes two steps via regex to do what you're after. I'm not too knowledgeable about Sublime, but perhaps it has a sed-like replacement feature.
The following will replace only the first o between the delimiters (similar to the other answer by virolino), so you'll need to repeat the replacement until every o has been replaced for this to be useful:
(?P<start>\$\$)
(?P<middle>.*?(?P<first_o>o).*?)(?=\$\$)
(?P<end>\$\$)
Or, if you're just looking to capture an o (and nothing else), you can make sure it's between a starting and ending $$:
\$\$.*?(o).*?\$\$
I've got a practical application for a vim regex where I'd like to remove numbers from the end of file location links. For example, if the developer is sloppy and just adds files and doesn't reuse file locations, you'll end up with something awful like this:
PATH_TO_MY_FILES>
PATH_TO_MY_FILES1>
...
PATH_TO_MY_FILES22>
PATH_TO_MY_FILES_ELSEWHERE>
PATH_TO_MY_FILES_ELSEWHERE1>
...
So all I want to do is a search and replace (S&R) that replaces PATH_TO_MY_FILES*\d+ with PATH_TO_MY_FILES*, using regex. Obviously I am not doing it quite right, so I was hoping someone here could, not necessarily spoon-feed me the answer, but throw a regex buzzword my way to get me on track.
Here's what I have tried:
:%s\(PATH_TO_MY_FILES\w*\)\(\d+\)>:gc
But this doesn't work, i.e. if I just do a vim search on that, it doesn't find anything. However, if I use this:
:%s\(PATH_TO_MY_FILES\w*\)\(\d\)>:gc
It will match the string, but the grouping is off, as expected. For example, the string PATH_TO_MY_FILES22 will be grouped as (PATH_TO_MY_FILES2)(2), presumably because the \d only matches the 2, and the \w match includes the first 2.
Question 1: Why doesn't \d+ work?
If I go ahead and use the second string (which is wrong), Vim appears to find a match (even though the grouping is wrong), but then does the replacement incorrectly.
For example, given that we know the \d will only match the last number in the string, I would expect PATH_TO_MY_FILES22> to get replaced with PATH_TO_MY_FILES2>. However, instead it replaces it with this:
PATH_TO_MY_FILES2PATH_TO_MY_FILES22>gt
So basically, it looks like it finds PATH_TO_MY_FILES22>, but then replaces only the & with group 1, which is PATH_TO_MY_FILES2.
I tried another regex at Regexr.com to see how it would interpret my grouping, and it looked correct, but maybe a hack around my lack of regex understanding:
(PATH_TO_\D*)(\d*)>
This correctly broke my target string into the PATH part and the entire number, so I was happy. But then when I used this in Vim, it found the match, but still replaced only the &.
Question 2: Why is Vim only replacing the &?
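Answer 1:
In Vim's default ("magic") mode you need to escape the +, or it will be taken literally. For example, \d\+ works correctly. Note that \w also matches digits, so a greedy \w* in front of \d\+ still takes all but the last digit; a non-greedy \w\{-} avoids that.
Answer 2:
An unescaped & in the replacement portion of a substitution means "the entire matched text". You need to escape it as \& if you want a literal ampersand. This would explain the output above if the replacement contained an entity such as &gt;.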
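The grouping problem from the question can be reproduced outside Vim. The sketch below uses Python's re syntax, which differs from Vim's (no backslashes before + or the parentheses), so treat it as an illustration of greedy versus lazy matching and not as a Vim command; the function name is made up.

```python
import re

line = "PATH_TO_MY_FILES22>"

# Greedy \w* swallows digits too and gives back only one for \d+.
greedy = re.match(r"(PATH_TO_MY_FILES\w*)(\d+)>", line).groups()
# Lazy \w*? takes as little as possible, so \d+ gets the whole number.
lazy = re.match(r"(PATH_TO_MY_FILES\w*?)(\d+)>", line).groups()
print(greedy)
print(lazy)

def strip_trailing_digits(text):
    """Drop the digits that sit directly before '>' (hypothetical helper)."""
    return re.sub(r"(PATH_TO_MY_FILES\w*?)\d+>", r"\1>", text)

print(strip_trailing_digits("PATH_TO_MY_FILES_ELSEWHERE1>"))
```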
I am facing a problem whereby I am given a string that contains a path to a file and the file's name and I only want to extract the path (without the file's name)
For example, I will receive something like
C:\Users\OopsD\Projects\test.acdbd
and from that string I want to extract only
C:\Users\OopsD\Projects
I was trying to create a RegEx to match a backslash followed by a word, followed by a dot followed by another word - this is to match the
\test.acdbd
part and replace it with empty string so that the final result is
C:\Users\OopsD\Projects
Can anyone, familiar with RegEx, help me on this one? Also, I will be using regular expressions quite a lot in the future. Is there a (free) program I can download to create regular expressions?
Are you really sure you need to use regex for such a simple task? How about this:
Dim file As New IO.FileInfo("C:\Users\OopsD\Projects\test.acdbd")
' Directory.FullName is the containing folder: C:\Users\OopsD\Projects
MsgBox(file.Directory.FullName)
As for a free regex tool, I would definitely recommend http://www.gskinner.com/RegExr/, which I use all the time. But you should always consider the alternatives before going the regex way.
The regex that you are looking for is:
\\[^\\]+$
Replacing its match with an empty string removes the final backslash and the file name. Here,
[^\\] (negated character class): A caret directly after the opening bracket negates the class, so this matches any single character except a backslash. (Outside a character class, the caret instead matches at the start of the string.)
$ (dollar): Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the dollar match before line breaks (i.e. at the end of a line in a file) as well. Also matches before the very last line break if the string ends with a line break.
+ (plus): Repeats the previous item once or more. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is matched only once.
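For comparison, here is a small Python sketch that strips the file name from the example path in two ways: with a regex that removes the last backslash and everything after it, and with the standard library's ntpath.dirname. It assumes Windows-style backslash separators; the helper name strip_file_name is made up.

```python
import ntpath
import re

path = r"C:\Users\OopsD\Projects\test.acdbd"

def strip_file_name(p):
    # Remove the final backslash and everything after it.
    return re.sub(r"\\[^\\]+$", "", p)

print(strip_file_name(path))
# Library alternative that needs no regex at all.
print(ntpath.dirname(path))
```

The library call handles path rules that the regex ignores, which is the same point the FileInfo answer makes.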
Many regex tools are available. Some of them are:
www.gskinner.com/RegExr/
www.txt2re.com
Rubular- It is not just for Ruby.
I'm trying to do what should be a simple Regular Expression, where all I want to do is match the singular portion of a word whether or not it has an s on the end. So if I have the following words
test
tests
EDIT: Further examples; I need this to work for many words, not just those two:
movie
movies
page
pages
time
times
For all of them I need to get the word without the s on the end but I can't find a regular expression that will always grab the first bit without the s on the end and work for both cases.
I've tried the following:
([a-zA-Z]+)([s\b]{0,}) - This returns the full word as the first match in both cases
([a-zA-Z]+?)([s\b]{0,}) - This returns 3 different matching groups for both words
([a-zA-Z]+)([s]?) - This returns the full word as the first match in both cases
([a-zA-Z]+)(s\b) - This works for tests but doesn't match test at all
([a-zA-Z]+)(s\b)? - This returns the full word as the first match in both cases
I've been using http://gskinner.com/RegExr/ for trying out the different regexes.
EDIT: This is for a Sublime Text snippet. For those who don't know, a snippet in Sublime Text is a shortcut: I can type, say, the name of my database, hit "run snippet", and it will turn it into something like:
$movies = $this->ci->db->get_where("movies", "");
if ($movies->num_rows()) {
    foreach ($movies->result() AS $movie) {
    }
}
All I need is to turn "movies" into "movie" and auto-insert it into the foreach loop.
This means I can't just do a find and replace on the text, and I only need to take 60 - 70 words into account (it's only running against my own tables, not every word in the English language).
Thanks!
- Tim
Ok I've found a solution:
([a-zA-Z]+?)(s\b|\b)
Works as desired; you can then simply use the first capture group as the unpluralized version of the word.
Thanks @Jahroy for helping me find it. I added this as an answer for future surfers who just want a solution, but please check out Jahroy's comment for more in-depth information.
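A quick way to check that pattern against the words from the question is the Python sketch below. It assumes Python's re module instead of the snippet engine, and the function name singular is made up.

```python
import re

# The lazy +? stops as soon as either "s at a word boundary" or a bare
# word boundary can follow, so a trailing s never lands in group 1.
SINGULAR = re.compile(r"([a-zA-Z]+?)(s\b|\b)")

def singular(word):
    m = SINGULAR.match(word)
    return m.group(1) if m else word

for w in ["test", "tests", "movie", "movies", "page", "pages", "time", "times"]:
    print(w, "->", singular(w))
```

The pattern only strips a final s, so a singular word that itself ends in s would lose that letter as well.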
For simple plurals, use this:
test(?=s| |$)
For more complex plurals, you're in trouble using regex. For example, this regex
part(y|i)(?=es | )
will return "party" or "parti", but what you do with that I'm not sure.
Here's how you can do it with vi or sed:
s/\([A-Za-z]*\)[sS]$/\1/
That replaces a run of letters ending in s with everything but the final s.
NOTE:
The escape chars (backslashes before the parens) might be different in different contexts.
ALSO:
The \1 (which refers to the first captured group) may also vary depending on context.
ALSO:
This will only work if your word is the only word on the line.
If your table name is one of many words on the line, you could probably replace the $ (which stands for the end of the line) with a wildcard that represents whitespace or a word boundary (these differ based on context).
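The word-boundary variant mentioned above might look like this in Python. It is a sketch assuming Python's re syntax (unescaped parentheses, \b for a word boundary), not vi or sed; the function name strip_plural_s is hypothetical.

```python
import re

def strip_plural_s(line):
    # \b on both sides limits the match to whole words; the final s or S
    # is dropped and the rest of the word is put back via group 1.
    return re.sub(r"\b([A-Za-z]+)[sS]\b", r"\1", line)

print(strip_plural_s("movies"))
print(strip_plural_s("get movies and pages"))
```

Every word ending in s on the line is affected, including ones that are not plurals, so this suits a known list of table names better than free text.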