Including Regular Expressions in AutoHotKey Script - regex

I am currently developing a very "simple" script in AutoHotKey, but it involves using hotstrings following the format:
::btw::by the way
which would detect whenever a user types "btw" and replace it with "by the way".
However, whenever I try to put a regular expression in between the colons, it interprets it literally. Is there any way to use regular expressions with hotstrings? Workarounds are accepted.

Hotstrings don't natively support RegEx, but there is a script called RegEx Powered Dynamic Hotstrings, which I've never tried.
Your other option is a Loop with the Input command inside it.
That requires an end character, such as space.
The script then analyzes what the Input command returns with RegExReplace.
Place the number in the regular expression in a capturing group and use it as a back-reference in the replacement. Unless the pattern always has the digit in the same place, though, I think it requires two steps (with RegExMatch), as shown in this working example:
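Input, retrieved, V, {space}                     ; capture visible keystrokes until space is pressed
RegExMatch(retrieved, "[a-zA-Z0-9]{6}", match)   ; find a run of six alphanumeric characters
RegExMatch(match, "\d", output)                  ; pull the first digit out of that run
If (output != "")
    SendInput, {bs 7}%output%                    ; erase the six characters plus the space, then type the digit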
Type any six-character sequence made up of five letters and one digit,
press space, and the script will replace the sequence with just the digit.
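
For readers who want to try the two-step matching outside AutoHotkey, here is a rough Python sketch of the same logic. The function name extract_digit is hypothetical, and the sketch only covers the matching, not the keystroke capture or the backspacing.

```python
import re

def extract_digit(typed):
    """Return the first digit inside the first six-character alphanumeric run, or None."""
    # Step 1: find a run of six letters/digits (mirrors the first RegExMatch).
    token = re.search(r"[a-zA-Z0-9]{6}", typed)
    if token is None:
        return None
    # Step 2: pull the first digit out of that run (mirrors the second RegExMatch).
    digit = re.search(r"\d", token.group())
    return digit.group() if digit else None

if __name__ == "__main__":
    print(extract_digit("abc4de"))
    print(extract_digit("abcdef"))
```

As in the AutoHotkey version, nothing is replaced when the six-character run contains no digit.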


Regular expression to delete all words between two specific words

I'm normally ok with regex but I'm struggling with this.
I have a simple file with two words that start and end a set of data. The data between the words changes but - start and status are always in the same place.
Example :
Everything in between
I'm trying to work out how to delete (replace) everything between and including start and status
I'm sure I had it working with this at one time
set(#replaceAll,$replace regular expression(#textTest,"(?i)^start.+?status"," "),"Global")
but it's just not working anymore.
You could use the regular expression
\bstart\b.+?\bstatus\b
which does not require "status" to be on the same line as "start". Two flags should be set:
case indifference (/i)
single-line mode, which allows . to match a newline (/s)
The regex reads, "match 'start' with a word break fore and aft (to avoid matching 'starting' or 'jumpstart', for example), then match one or more characters lazily, then match 'status' with wordbreaks". The middle match must be lazy so that the regex engine will stop at the next (rather than last) instance of 'status'.
If the regex engine being used does not support single-line mode, or something comparable, one can replace .+ with [\s\S]+.
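
As an illustration only, the pattern described above (word-bounded "start", a lazy match, word-bounded "status") can be tried in Python, where re.IGNORECASE and re.DOTALL play the roles of the /i and /s flags. The sample text is made up for the sketch.

```python
import re

# Made-up sample: "start" and "status" sit on different lines.
text = "header\nStart\nEverything in between\nStatus\nfooter"

# Case-insensitive, and "." may match newlines (single-line mode).
pattern = re.compile(r"\bstart\b.+?\bstatus\b", re.IGNORECASE | re.DOTALL)
cleaned = pattern.sub("", text)

# Fallback for engines without single-line mode: [\s\S] matches any character.
fallback = re.compile(r"\bstart\b[\s\S]+?\bstatus\b", re.IGNORECASE)
cleaned_fallback = fallback.sub("", text)

if __name__ == "__main__":
    print(repr(cleaned))
    print(cleaned == cleaned_fallback)
```

Because of the word boundaries, a line such as "jumpstart now status" is left untouched.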
So my original expression works, and so does Cary's.
The files have changed since I last used the expression. They contain some white-space in the form of newlines that needed to be removed first.
set(#cleanup,$replace(#text2,$new line," "),"Global")
set(#text2,$replace regular expression(#cleanup,"\\bstart\\b.*?\\bstatus\\b",""),"Global")
set(#cleanup,$replace regular expression(#cleanup,"(?i)^start.+?status:",""),"Global")
Sorry about that but thanks to all who looked and helped :)

Regex for removing spaces and random trailing chars

I am successfully validating an ID such as:
using this regex:
This ID sometimes arrives corrupted (it comes from an OCR process) and therefore the previous regex does not work. I need to support the most common form of corruption, which is having a space within the ID:
ZFA1G2H34 J5K6L7P5
The regex should remove the space and compose just the allowed 17 chars of the ID.
Please note I cannot use scripting (.replace for example) because the software where this regex is used does not support it.
As a bonus, sometimes the ID contains trailing chars which I would like to remove as well:
ZFA1G2H34 J5K6L7P5...ç
You can use one of the following regular expressions to validate the query:
^(?:(?![iIoO])[ ç0-9a-zA-Z]){17,}$
^([ ça-hA-Hj-nJ-Np-zP-Z0-9]){17,}$
And then, you can use the following regular expression to only match characters you like:
Don't use , in a set like [A-Z,a-z], because commas are actually part of the set and not a separator between the character ranges.
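
A small Python sketch of the comma pitfall, plus a check of the second validation regex against the sample ID with a space. This is only an illustration: the asker's software cannot run scripts, and the variable names are made up.

```python
import re

# A comma inside a character class is a literal member of the set.
with_comma = re.compile(r"[A-Z,a-z]")
without_comma = re.compile(r"[A-Za-z]")

# Second validation regex from the answer: letters except I/O, digits, space, ç.
id_pattern = re.compile(r"^([ ça-hA-Hj-nJ-Np-zP-Z0-9]){17,}$")

if __name__ == "__main__":
    print(bool(with_comma.fullmatch(",")))      # the comma is accepted
    print(bool(without_comma.fullmatch(",")))   # the comma is rejected
    print(bool(id_pattern.match("ZFA1G2H34 J5K6L7P5")))
    print(bool(id_pattern.match("ZFA1G2H34 I5K6L7P5")))  # contains a forbidden I
```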

Regex for finding substrings using Grep Console in Eclipse

I am using Grep Console in Eclipse to highlight lines in the console output that contain characters, e.g. cancel, based on a regex. The characters may have a symbol preceding and/or following it, may be surrounded by spaces, or may be substrings. In other words, I want to match the following lines (regardless of case):
The flight was cancelled.
[Cancelled] Flight 101
Are they going to cancel it?
What is the regex that I need to use to highlight these lines?
As acdcjunior already explained, you basically just need a case insensitive regular expression to match "cancel".
If you already have your output in the console, the easiest way to create this expression is to just select the word "cancel" in the output, then right click and select "Add Expression" from the context menu. A submenu will let you select a group to which the new expression will be added, or create a new one. The expression item will then be created, using the following expression:
(\Qcancel\E)
Be sure to uncheck the "Case sensitive" checkbox, which is enabled by default for performance reasons and would prevent the expression from matching your second line with the capital 'C'.
This is basically the same expression acdcjunior provided, with a few differences:
The .* matchers at the beginning and end of the expression are not included, as they are not necessary. Expressions will always match substrings anywhere in a line unless the $ or ^ matchers are used to specifically refer to the beginning or end of a line.
The expression is also wrapped in parentheses to create a capture group, allowing you to assign a style not only to the entire line containing the string cancel, but also to that string itself. You can leave out the parentheses if you don't want to style that string.
\Q and \E are always included when creating an expression from a selected text string to make sure that no characters from the selected string are interpreted as special expression characters. In this case, this is not necessary, as cancel only contains word characters.
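
The points above can be checked with a short Python sketch. Grep Console uses Java regular expressions, so this is an analogy rather than the plugin's own behaviour: Python's re.search stands in for unanchored substring matching, and re.escape stands in for \Q...\E quoting. The fourth sample line is made up as a non-matching control.

```python
import re

lines = [
    "The flight was cancelled.",
    "[Cancelled] Flight 101",
    "Are they going to cancel it?",
    "Flight 101 departed on time.",  # made-up line that should not match
]

# Plain keyword, case-insensitive, no leading or trailing ".*" needed.
plain = re.compile(r"cancel", re.IGNORECASE)

# Quoted keyword wrapped in a capture group, like the generated expression.
quoted = re.compile("(" + re.escape("cancel") + ")", re.IGNORECASE)

highlighted = [line for line in lines if plain.search(line)]

if __name__ == "__main__":
    for line in highlighted:
        print(line, "->", quoted.search(line).group(1))
```

Quoting changes nothing here because "cancel" contains only word characters, and the capture group exposes the matched keyword itself.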
This means that in your case, the simplest sufficient expression is just:
cancel
This expression also works if you use it as a "quick expression", as suggested by acdcjunior, though there is no real need for this. The idea behind quick expressions is that very long lines in the console can considerably slow down pattern matching. Grep Console therefore has a configurable limit to how many characters in each line will be matched with the configured expressions. Any characters after this limit in long lines are ignored, which means that lines which contain keywords only after the limit will not be recognised and therefore not styled.
If you configure a quick expression, every line is first matched with this expression, and only if the match is positive will the "normal" expression be used. In this case, the expressions are matched against the entire line. The quick expression should therefore be as simple as possible, so as not to slow down the matching too much.
In your case, using cancel as a quick expression and leaving the normal expression blank works because first the quick expression is positively matched against your line, and then the blank expression matches as well. If you have very long lines, it may cost you some performance though, as the quick expression will ignore the length limits explained above. Also, quick expressions don't use capture groups, so you can't highlight the cancel string with a separate style in this case.
And do not check "Case sensitive".
Or just cancel in the "Quick expression" text box.

Replace all characters in a regex match with the same character in Vim

I have a regex to replace a certain pattern with a certain string, where the string is built dynamically by repeating a certain character as many times as there are characters in the match.
For example, say I have the following substitution command:
However, I would like to do something like this instead:
where the non-existing notation -{5} would stand for the dash character repeated five times.
Is there a way to do this?
Ultimately, I'd like to achieve something like this:
which would replace any instance of a string of only hellos with the string consisting of the dash character repeated the number of times equal to the length of the matched string.
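
To make the desired behaviour concrete, here is a minimal Python sketch (not Vim) that replaces every match with a fill character repeated to the length of the match. The helper name mask_matches and the sample string are hypothetical.

```python
import re

def mask_matches(pattern, text, fill="-"):
    """Replace each match of pattern with fill repeated to the match's length."""
    return re.sub(pattern, lambda m: fill * len(m.group()), text)

if __name__ == "__main__":
    # A run of one or more "hello"s becomes a run of dashes of equal length.
    print(mask_matches(r"(?:hello)+", "say hellohello now"))
```

The key idea is that the replacement is computed per match instead of being a fixed string.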
As an alternative to using the :substitute command (the usage of
which is already covered in @Peter’s answer), I can suggest automating
the editing commands for performing the replacement by means of
a self-referring macro.
A straightforward way of overwriting occurrences of the search pattern
with a certain character by hand would be the following sequence of
Normal-mode commands.
1. Search for the start of the next occurrence.
2. Select matching text till the end.
3. Replace selected text.
4. Repeat from step 1.
Thus, to automate this routine, one can run the command
and execute the contents of that s register starting from the
beginning of the buffer (or another appropriate location) by

Explain this Regular Expression please

Regular Expressions are a complete void for me.
I'm dealing with one right now in TextMate that does what I want it to do...but I don't know WHY it does what I want it to do.
/[[:alpha:]]+|( )/(?1::$0)/g
This is used in a TextMate snippet and what it does is takes a Label and outputs it as an id name. So if I type "First Name" in the first spot, this outputs "FirstName".
Previously it looked like this:
/[[:alpha:]]+|( )/(?1:_:/L$0)/g (it might have been \L instead)
This would turn "First Name" into "first_name".
So I get that the underscore adds an underscore for a space, and that the /L lowercases everything...but I can't figure out what the rest of it does or why.
Someone care to explain it piece by piece?
Here is the actual snippet in question:
<column header="$1"><xmod:field name="${2:${1/[[:alpha:]]+|( )/(?1::$0)/g}}"/></column>
This regular expression (regex) format is basically:
The "g" option at the end means the replacement is global: it is applied to every match in the text, rather than only to the first one.
Breaking it down further...
[[:alpha:]]+|( )
That matches one or more alphabetic characters (held in parameter $0), or alternatively a single space (held in matching parameter $1).
As Roger says, the ? indicates this part is a conditional. If a match was found in parameter $1 then it is replaced with the stuff between the colons :: - in this case nothing. If nothing is in $1 then the match is replaced with the contents of $0, i.e. any run of alphabetic characters is output unchanged.
This explains why the spaces are removed in the first example, and the spaces get replaced with underscores in your second example.
In the second expression the \L is used to lowercase the text.
The extra question in the comment was how to run this expression outside of TextMate. Using vi as an example, I would break it into multiple steps:
:0,$s/ //g
The first part of the above command tells vi to run a substitution starting on line 0 and ending at the end of the file (that's what $ means).
The rest of the expression uses the same sorts of rules as explained above, although some of the notation in vi is a bit custom - see this reference webpage.
I find RegexBuddy a good tool for me in dealing with regexs. I pasted your 1st regex in to Buddy and I got the explanation shown in the bottom frame:
I use it for helping to understand existing regexes, building my own, testing regexes against strings, etc. I've become better at regexes because of it. FYI I'm running under Wine on Ubuntu.
It's searching for any alpha character that appears at least once in a row [[:alpha:]]+ or space ( ).
/[[:alpha:]]+|( )/(?1::$0)/g
The (?1 is a conditional and used to strip the match if group 1 (a single space) was matched, or replace the match with $0 if group 1 wasn't matched. As $0 is the entire match, it gets replaced with itself in that case. This regex is the same as:
/ //g
I.e. remove all spaces.
/[[:alpha:]]+|( )/(?1:_:/\L$0)/g
This regex is still using the same condition, except now if group 1 was matched, it's replaced with an underscore, and otherwise the full match ($0) is used, modified by \L. \L lowercases all text that comes after it, so \LABC would result in abc; think of it as a special control code.
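
The conditional replacement can be emulated in Python with a replacement function, which may help in seeing what each branch does. This is a sketch under assumptions: Python's re module has neither the (?1:...) replacement syntax nor POSIX classes, so [A-Za-z] stands in for [[:alpha:]], and the function names to_id and to_snake are made up.

```python
import re

# Either a run of letters (group 1 unset) or a single space (group 1 set).
PATTERN = re.compile(r"[A-Za-z]+|( )")

def to_id(label):
    """Emulates (?1::$0): drop spaces, keep letter runs unchanged."""
    return PATTERN.sub(lambda m: "" if m.group(1) else m.group(0), label)

def to_snake(label):
    """Emulates (?1:_:\\L$0): spaces become underscores, letter runs are lowercased."""
    return PATTERN.sub(lambda m: "_" if m.group(1) else m.group(0).lower(), label)

if __name__ == "__main__":
    print(to_id("First Name"))
    print(to_snake("First Name"))
```

In both functions the test on group 1 plays the role of the ?1 conditional: it is set only when the space alternative matched.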