Regex find/replace - regex

I am attempting to do some find and replace on a java source file.
Currently my classes have invalid names (imported from a tool that did poor auto naming) of the form:
public class [0-9]{2}[A-Za-z]+
I would like to insert underscores around the digits, resulting in a valid class name of the form
public class [_][0-9]{2}[_][A-Za-z]+
However using Eclipses find and replace tool, with the regex box check on both the find and replace strings does not format the output as I'd like.
It takes
02ListOfValidAppIDs
and makes it
[-][0-9]{2}[_][A-Za-z]+
instead of
_02_ListOfValidAppIDs
How can you make the regex keep the arbitrary number and text and just plug them in for the replace string?
(Edit: As a note, with the preview feature I can see that eclipse is correctly finding all of the names I wish to replace, and nothing else)

I'm not sure of the exact flavor of Regex that you'll need, but something like this should get you started in the right direction.
Update your "Find" pattern to use capture groups:
(public class )([0-9]{2})([A-Za-z]+)
And then reference those captures in the replacement:
\1_\2_\3
NOTE: Some flavors of Regex will use {} instead of () to represent a captured group, and some flavors will use $1, $2, $3, etc. as the reference instead of using \#.

In Eclipse's Find/Replace dialogue, this works fine if you use
([0-9]+)([A-Za-z]+)
in the Find box and
_$1_$2
in the Replace with box.
Of course, the Regular expressions box must be checked too.

Related

REGEX in MS Word 2016: Exclude a simple String from Search

So I read a lot about Negation in Regex but can't solve my problem in MS Word 2016.
How do I exclude a String, Word, Number(s) from being found?
Example:
<[A-Z]{2}[A-Z0-9]{9;11}> to search a String like XY123BBT22223
But how to exclude for example a specefic one like SEDWS12WW04?
Well it depends on what you need to achieve or is this a matter of curiosity... RegEx is not the same as the built-in Advanced Find with Wildcards; for that you need VBA.
Depending on your need, without using VBA, you could make use of space and return characters - something like this will work for the strings provided: [ ^13][A-Z]{2}[0-9]{1,}[A-Z]{1,}[0-9]{1,}[ ^13] (assuming you use normal carriage returns and spaces in your document)
Anyway, this is a good article on wildcard searches in MS Word: https://wordmvp.com/FAQs/General/UsingWildcards.htm
EDIT:
In light of your further comments you will probably want to look at section 8 of the linked article which explains grouping. For my proposed search you can use this to your advantage by creating 3 groups in your 'find' and only modifying the middle group, if indeed you do intend to modify. Using groups the search would look something like:
([ ^13])([A-Z]{2}[0-9]{1,}[A-Z]{1,}[0-9]{1,})([ ^13])
and the replace might look like this:
\1 SOMETHING \3
Note also: compared to a RegEx solution my suggestion is kinda lame, mainly because compared to RegEx, MS-Words find and replace (good as it is, and really it is) is kinda lame... it's hacky but it might work for you (although you might need to do a few searches).
BUT... if it really is REGEX that you want, well you can get access to this via VBA: How to Use/Enable (RegExp object) Regular Expression using VBA (MACRO) in word
And... then you will be able to use proper RegEx for find and replace, well almost - I'm under the impression that the VBA RegEx still has some quirks...
As already noted by others, this is not possible in Microsoft Word's flavor of regular expressions.
Instead, you should use standard regular expressions. It is actually possible to use standard regular expressions in MS Word if you use a special tool that integrates into Microsoft Word called Multiple Find & Replace (see http://www.translatortools.net/products/transtoolsplus/word-multiplefindreplace). This tool opens as a pane to the right of the document window and works just like the Advanced Find & Replace dialog. However, in addition to Word's existing search functionality, it can use the standard regular expressions syntax to search and replace any text within a Word document.
In your particular case, I would use this:
\b[A-Z]{2}[A-Z0-9]{9,11}\b(?<!\bSEDWS12WW04)
To explain, this searches for a word boundary + ID + word boundary, and then it looks back to make sure that the preceding string does not match [word boundary + excluded ID]. In a similar vein, you can do something like
(?<!\bSEDWS12WW04|\bSEDWS12WW05|\bSEDWS12WW05)
to exlude several IDs.
Multiple Find & Replace is quite powerful: you can add any number of expressions (either using regular expressions or using Word's standard search syntax) to a list and then search the document for all of them, replace everything, display all matches in a list and replace only specific matches, and a few more things.
I created this tool for translators and editors, but it is great for any advanced search/replace operations in Word, and I am sure you will find it very useful.
Best regards, Stanislav

Find and replace with regular expressions in Intellij

I got the following string:
http://www.foo.com/images/bar/something/image.gif|png
I would like to replace all the occurences of such string as:
<someTag>/images/bar/something/image.gif|png</someTag>
How can I achieve this using a regEx?
To open Replace in Path popup press Ctrl+Shift+R (or on mac Cmd+Shift+R).
If you want to switch everywhere in your project, make sure the scope is In Project.
Make sure Regex? is checked and put the next values in the boxes:
(http://)([a-zA-Z.0-9]*)([a-zA-Z.-/0-9#?%&|]*)
<someTag>$3</someTag>
You will need to make sure all the characters you need are in the expression. For example strings with # will not be replaced as expected.
The way this works is the patterns matched are matched in 3 groups and by using the parenthesis and then only the 3d group is being used to replace the string found.
Example:
Good luck!

Find replace named groups regexp in Geany

I am trying to replace public methods to protected methods for methods that have a comment.
This because I am using phpunit to test some of those methods, but they really don't need to be public, so I'd like to switch them on the production server and switch back when testing.
Here is the method declaration:
public function extractFile($fileName){ //TODO: change to protected
This is the regexp:
(?<ws>^\s+)(?<pb>public)(?<fn>[^/\n]+)(?<cm>//TODO: change to protected)
If I replace it with:
\1protected\3\//TODO: change back to public for testing
It seems to be working, but what I cannot get to work is naming the replace with. I have to use \1 to get the first group. Why name the groups if you can't access them in the replacing texts? I tried things like <ws>, $ws, $ws, but that doesn't work.
What is the replacing text if I want to replace \1 with the <ws> named group?
The ?<ws> named group syntax is the same as that used by .NET/Perl. For those regex engines the replacement string reference for the named group is ${ws}. This means your replacement string would be:
${ws}protected${fn}\//TODO: change back to public for testing
The \k<ws> reference mentioned by m.buettner is only used for backreferences in the actual regex.
Extra Information:
It seems like Geany also allows use of Python style named groups:
?P<ws> is the capturing syntax
\g<ws> is the replacement string syntax
(?P=ws) is the regex backreference syntax
EDIT:
It looks my hope for a solution didn't pan out. From the manual,
A subpattern can be named in one of three ways: (?...) or (?'name'...) as in Perl, or (?P...) as in Python. References to capturing parentheses from other parts of the pattern, such as backreferences, recursion, and conditions, can be made by name as well as by number.
And further down:
Back references to named subpatterns use the Perl syntax \k or \k'name' or the Python syntax (?P=name).
and
A subpattern that is referenced by name may appear in the pattern before or after the reference.
So, my inference of the syntax for using named groups was correct. Unfortunately, they can only be used in the matching pattern. That answers your question "Why name groups...?".
How stupid is this? If you go to all the trouble to implement named groups and their usage in the matching pattern, why not also implement usage in the replacement string?

Using find & replace with regex to refactor a method call in Visual C++

I have loads of instances of code like:
throw CODBCException("Error!",GetHENV(),GetHDBC());
throw CODBCException(Msg,GetHENV(),GetHDBC());
I want to replace each with a utility method:
throwException("Error!") or throwException(Msg)
Is this something Visual C++ search/replace can do using a regex? I never used this feature before and I don't really know regex well either, but it would be pretty neat.
I'm only interested what comes up to the first comma e.g. throw CODBCException("Error!", so really I'm searching to replace throw CODBCException(X,...) with throwException(X)
It depends on how generic you want the regex to be (eg. how close to the string examples you gave does the match have to be), but you use this for the find option:
throw[ \t]*CODBCException\({.*},[ \t]*GetHENV\(\),[ \t]*GetHDBC\(\)\);
and then use this for the replace with option:
throwException(\1);
In general, you can use the curly braces to specify a back-reference you want to replace, and use "\1", etc. to replace them.
Edit: Per updated question description, the following should be used for the find option:
throw[ \t]*CODBCException\({(".*")|([^,]+)},.*\);

Regex to change to sentence case

I'm using Notepad++ to do some text replacement in a 5453-row language file. The format of the file's rows is:
variable.name = Variable Value Over Here, that''s for sure, Really
Double apostrophe is intentional.
I need to convert the value to sentence case, except for the words "Here" and "Really" which are proper and should remain capitalized. As you can see, the case within the value is typically mixed to begin with.
I've worked on this for a little while. All I've got so far is:
(. )([A-Z])(.+)
which seems to at least select the proper strings. The replacement piece is where I'm struggling.
Find: (. )([A-Z])(.+)
Replace: \1\U\2\L\3
In Notepad++ 6.0 or better (which comes with built-in PCRE support).
Regex replacement cannot execute function (like capitalization) on matches. You'd have to script that, e.g. in PHP or JavaScript.
Update: See Jonas' answer.
I built myself a Web page called Text Utilities to do that sort of things:
paste your text
go in "Find, regexp & replace" (or press Ctrl+Shift+F)
enter your regex (mine would be ^(.*?\=\s*\w)(.*)$)
check the "^$ match line limits" option
choose "Apply JS function to matches"
add arguments (first is the match, then sub patterns), here s, start, rest
change the return statement to return start + rest.toLowerCase();
The final function in the text area looks like this:
return function (s, start, rest) {
return start + rest.toLowerCase();
};
Maybe add some code to capitalize some words like "Really" and "Here".
In Notepad++ you can use a plugin called PythonScript to do the job. If you install the plugin, create a new script like so:
Then you can use the following script, replacing the regex and function variables as you see fit:
import re
#change these
regex = r"[a-z]+sym"
function = str.upper
def perLine(line, num, total):
for match in re.finditer(regex, line):
if match:
s, e = match.start(), match.end()
line = line[:s] + function(line[s:e]) + line[e:]
editor.replaceWholeLine(num, line)
editor.forEachLine(perLine)
This particular example works by finding all the matches in a particular line, then applying the function each each match. If you need multiline support, the Python Script "Conext-Help" explains all the functions offered including pymlsearch/pymlreplace functions defined under the 'editor' object.
When you're ready to run your script, go to the file you want it to run on first, then go to "Scripts >" in the Python Script menu and run yours.
Note: while you will probably be able to use notepad++'s undo functionality if you mess up, it might be a good idea to put the text in another file first to verify it works.
P.S. You can 'find' and 'mark' every occurrence of a regular expression using notepad++'s built-in find dialog, and if you could select them all you could use TextFX's "Characters->UPPER CASE" functionality for this particular problem, but I'm not sure how to go from marked or found text to selected text. But, I thought I would post this in case anyone does...
Edit: In Notepad++ 6.0 or higher, you can use "PCRE (Perl Compatible Regular Expression) Search/Replace" (source: http://sourceforge.net/apps/mediawiki/notepad-plus/?title=Regular_Expressions) So this could have been solved using a regex like (. )([A-z])(.+) with a replacement argument like \1\U\2\3.
The questioner had a very specific case in mind.
As a general "change to sentence case" in notepad++
the first regexp suggestion did not work properly for me.
while not perfect, here is a tweaked version which
was a big improvement on the original for my purposes :
find: ([\.\r\n][ ]*)([A-Za-z\r])([^\.^\r^\n]+)
replace: \1\U\2\L\3
You still have a problem with lower case nouns, names, dates, countries etc. but a good spellchecker can help with that.