Regex for finding substrings using Grep Console in Eclipse - regex

I am using Grep Console in Eclipse to highlight lines in the console output that contain characters, e.g. cancel, based on a regex. The characters may have a symbol preceding and/or following it, may be surrounded by spaces, or may be substrings. In other words, I want to match the following lines (regardless of case):
The flight was cancelled.
[Cancelled] Flight 101
Are they going to cancel it?
What is the regex that I need to use to highlight these lines?

As acdcjunior already explained, you basically just need a case insensitive regular expression to match "cancel".
If you already have your output in the console, the easiest way to create this expression is to just select the word "cancel" in the output, then right click and select "Add Expression" from the context menu. A submenu will you select a group to which the new expression will be added, or create a new one. The expression item will then be created, using the following expression:
(\Qcancel\E)
Be sure to uncheck the "Case sensitive" checkbox, which is enabled by default for performance reasons and would prevent the expression from matching your second line with the capital 'C'.
This is basically the same expression acdcjunior provided, with a few differences:
The .* matchers at the beginning and end of the expression are not included, as they are not necessary. Expressions will always match substrings anywhere in a line unless the $ or ^ matchers are used to specifically refer to the beginning or end of a line.
The expression is also wrapped in parentheses to create a capture group, allowing you to assign a style not only to the entire line containing the string cancel, but also to that string itself. You can leave out the parentheses if you don't want to style that string.
\Q and \E are always included when creating an expression from a selected text string to make sure that no characters from the selected string are interpreted as special expression characters. In this case, this not necessary, as cancel only contains word characters.
This means that in your case, the simplest sufficient expression is just:
cancel
This expression also works if you use it as a "quick expression", as suggested by acdcjunior, though there is no real need for this. The idea behind quick expressions is that very long lines in the console can considerably slow down pattern matching. Grep Console therefore has a configurable limit to how many characters in each line will be matched with the configured expressions. Any characters after this limit in long lines are ignored, which means that lines which contain keywords only after the limit will not be recognised and therefore not styled.
If you configure a quick expression, every line is first matched with this expression, and only if the match is positive will the "normal" expression be used. In this case, the expressions are matched against the entire line. The quick expression should therefore be as simple as possible, so as not to slow down the matching too much.
In your case, using cancel as a quick expression and leaving the normal expression blank works because first the quick expression is positively matched against your line, and then the blank expression matches as well. If you have very long lines, it may cost you some performance though, as the quick expression will ignore the length limits explained above. Also, quick expression don't use capture groups, so you can't highlight the cancel string with a separate style in this case.

Use:
.*(\Qcancel\E).*
And do not check "Case sensitive".
Or just cancel in the "Quick expression" text box.

Related

Regular expression to delete all words between two specific words

I'm normally ok with regex but I'm struggling with this.
I have a simple file with two words that start and end a set of data. The data between the words changes but - start and status are always in the same place.
Example :
start
Everything in between
status
I'm trying to work out how to delete (replace) everything between and including start and status
I'm sure I had it working with this at one time
(?i)^start.+?status
set(#replaceAll,$replace regular expression(#textTest,"(?i)^start.+?status"," "),"Global")
but its just not working anymore.
You could use the regular expression
\bstart\b.+?\bstatus\b
which does not require "status" to be on the same line as "start". Two flags should be set:
case indifference (/i)
single-line mode, which allows . to match a newline (/s)
Demo
The regex reads, "match 'start' with a word break fore and aft (to avoid matching 'starting' or 'jumpstart', for example), then match one or more characters lazily, then match 'status' with wordbreaks". The middle match must be lazy so that the regex engine will stop at the next (rather than last) instance of 'status'.
If the regex engine being used does not support single-line mode, or something comparable, one can replace .+ with [\s\S]+.
So my original expression works and so dose Cary's
The files have changed since I last used the expression. They contain some white-space in the form of newlines that needed to be removed first
set(#cleanup,$replace(#text2,$new line," "),"Global")
set(#text2,$replace regular expression(#cleanup,"\\bstart\\b.*?\\bstatus\\b",""),"Global")
set(#cleanup,$replace regular expression(#cleanup,"(?i)^start.+?status:",""),"Global")
Sorry about that but thanks to all who looked and helped :)

Regular expression look-back matches correctly but doesn't replace in Notepad++ [duplicate]

I have the following CSS markup.
.previous-container{
float:left;
}
.primary-commands {
float:right;
}
Using the regex syntax search (?<=[\s,;])([a-zA-Z\-]+): it highlights the CSS property name as expected, however, upon clicking replace nothing is replaced. I have tried using group token syntax in replace line e.g. $[nth group] and any plain literal string replacement. No matter my attempts it will not replace the matched string with anything. I am using notepad++ version 6.7.5. Perhaps there is something obvious I am missing here?
Based on comments to the original question here are some work-arounds that solved my problem.
Option #1
Replace the lookbehind portion of the regex statement (?<=[\s,;]) with a simple non-lookbehind group matching statement such as ([\s,;]). This will continue to limit search results to strings beginning with the specified characters in the lookbehind. The only caveat is that in my replacement string e.g. $1 $2 I would need to leave out the undesired matched characters that should not be a part of the replacement string.
Option #2
Use the "Replace All" button. It will perform replacements correctly when using a positive lookbehind in your regex statement as opposed to using the "Replace" button for single replacement.
I went with Options #1 only because it achieves what I need while allowing me to still perform a single replacement at a time. With larger documents I don't want to use "Replace All" until I have thoroughly tested my regex expression.

Including Regular Expressions in AutoHotKey Script

I am currently developing a very "simple" script in AutoHotKey, but it involves using hotstrings following the format:
::btw::by the way
which would detect whenever a user types "btw" and replace it with "by the way".
However, whenever I try to put a regular expression in between the colons, it interprets it literally. Is there any way to use regular expressions with hotstrings? Workarounds are accepted.
Hotstrings don't natively support RegEx,
but there is RegEx Powered Dynamic Hotstrings which I've never tried.
Your other option is a Loop with the Input command inside of it.
That would require an end character, such as space.
Then you would have the script analyze what the Input command returns with RegExReplace.
Place the number in the regular expression in a capturing group and use it as a back-reference in the replacement. But unless the pattern always has the digit in the same place I think it would require two steps (with RegExMatch) as shown in this working example:
loop
{
Input, retrieved, V, {space}
RegExMatch(retrieved, "[a-zA-Z0-9]{6}", match)
RegExMatch(match, "\d", output)
If (output != "")
Sendinput, {bs 7}%output%
}
Type any sequence of six with five letters and one digit,
press space and it will replace the sequence with only the number.

How to delete regex match text in emacs?

How can I delete some text that match with a regex in emacs?
I suppose that using:
'(query-replace-regexp PATTERN EMPTY)
and:
'(replace-regexp PATTERN EMPTY)
but they throw:
perform-replace: Invalid regexp: "Premature end of regular expression".
In general, you can delete text that matches a given regexp by using the empty string "" as the replacement in the two functions you mention. However, as others mentioned in the comments above, your regular expression is faulty.
For instance, if your buffer contains the following text:
1. My todo list
1.1. Brush teeth
1.2. Floss
2. My favorite movies
2.1. Star Wars episodes 4-6
and you would like to get rid of the numbers at the beginning of each line, you could place the cursor at the beginning of the buffer and then type M-C-% (that is, you press at a time: ALT, CTRL, Shift, 5) to invoke the command query-replace-regexp. You'll get asked two parameters in the minibuffer, first the regexp to match than the replacement string.
So, in our example, you could use the following regexp:
\([0-9]\.\)+\s-
as the first parameter, and simply hit ENTER for the second parameter, i.e., don't specify anything as the replacement. That way, the replacement is the empty string: you replace what ever matches the regexp with nothing.
query-replace-regexp will ask you interactively for every match if you want to replace it or if you want to skip it. This is the "query"-part in query-replace-regexp and it is helpful to see if the regexp you came up with actually matches what you thought it does. If you're sure it does, you can type ! to make Emacs replace the remaining matches without asking every time.
If you use M-x replace-regexp instead of M-C-% Emacs will replace every match without asking for input at every match.
For the special case that you'd like to delete whole lines when a certain part of the line matches a regexp, there's also delete-matching-lines and its evil, goatee-wearing twin brother from a parallel universe delete-non-matching-lines.

Explain this Regular Expression please

Regular Expressions are a complete void for me.
I'm dealing with one right now in TextMate that does what I want it to do...but I don't know WHY it does what I want it to do.
/[[:alpha:]]+|( )/(?1::$0)/g
This is used in a TextMate snippet and what it does is takes a Label and outputs it as an id name. So if I type "First Name" in the first spot, this outputs "FirstName".
Previously it looked like this:
/[[:alpha:]]+|( )/(?1:_:/L$0)/g (it might have been \L instead)
This would turn "First Name" into "first_name".
So I get that the underscore adds an underscore for a space, and that the /L lowercases everything...but I can't figure out what the rest of it does or why.
Someone care to explain it piece by piece?
EDIT
Here is the actual snippet in question:
<column header="$1"><xmod:field name="${2:${1/[[:alpha:]]+|( )/(?1::$0)/g}}"/></column>
This regular expression (regex) format is basically:
/matchthis/replacewiththis/settings
The "g" setting at the end means do a global replace, rather than just restricting the regex to a particular line or selection.
Breaking it down further...
[[:alpha:]]+|( )
That matches an alpha numeric character (held in parameter $0), or optionally a space (held in matching parameter $1).
(?1::$0)
As Roger says, the ? indicates this part is a conditional. If a match was found in parameter $1 then it is replaced with the stuff between the colons :: - in this case nothing. If nothing is in $1 then the match is replaced with the contents of $0, i.e. any alphanumeric character that is not a space is output unchanged.
This explains why the spaces are removed in the first example, and the spaces get replaced with underscores in your second example.
In the second expression the \L is used to lowercase the text.
The extra question in the comment was how to run this expression outside of TextMate. Using vi as an example, I would break it into multiple steps:
:0,$s/ //g
:0,$s/\u/\L\0/g
The first part of the above commands tells vi to run a substitution starting on line 0 and ending at the end of the file (that's what $ means).
The rest of the expression uses the same sorts of rules as explained above, although some of the notation in vi is a bit custom - see this reference webpage.
I find RegexBuddy a good tool for me in dealing with regexs. I pasted your 1st regex in to Buddy and I got the explanation shown in the bottom frame:
I use it for helping to understand existing regexs, building my own, testing regexs against strings, etc. I've become better # regexs because of it. FYI I'm running under Wine on Ubuntu.
it's searching for any alpha character that appears at least once in a row [[:alpha:]]+ or space ( ).
/[[:alpha:]]+|( )/(?1::$0)/g
The (?1 is a conditional and used to strip the match if group 1 (a single space) was matched, or replace the match with $0 if group 1 wasn't matched. As $0 is the entire match, it gets replaced with itself in that case. This regex is the same as:
/ //g
I.e. remove all spaces.
/[[:alpha:]]+|( )/(?1:_:/\L$0)/g
This regex is still using the same condition, except now if group 1 was matched, it's replaced with an underscore, and otherwise the full match ($0) is used, modified by \L. \L changes the case of all text that comes after it, so \LABC would result in abc; think of it as a special control code.