Find and replace with regular expressions in Intellij - regex

I got the following string:
http://www.foo.com/images/bar/something/image.gif|png
I would like to replace all occurrences of such strings with:
<someTag>/images/bar/something/image.gif|png</someTag>
How can I achieve this using a regEx?

To open the Replace in Path popup, press Ctrl+Shift+R (Cmd+Shift+R on Mac).
If you want to replace everywhere in your project, make sure the scope is In Project.
Make sure Regex is checked and put the following values in the search and replace boxes:
(http://)([a-zA-Z.0-9]*)([a-zA-Z.-/0-9#?%&|]*)
<someTag>$3</someTag>
Make sure every character that can appear in your URLs is listed in the third character class; a URL containing a character that is not listed will only be matched up to that character. Note that .-/ inside the class is read as a range covering just . and /, so a literal hyphen is not included.
The pattern captures three groups, delimited by the parentheses (scheme, host and path), and only the third group is used in the replacement.
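As a rough illustration outside the IDE, the same pattern and group reference can be tried in Python. This is only a sketch: Python's re module writes the third group as \3 where IntelliJ uses $3, and the names URL_PATTERN and wrap_path are made up for the example.

```python
import re

# Same pattern as in the answer: group 1 is the scheme, group 2 the host,
# group 3 the path that should be kept.
URL_PATTERN = re.compile(r"(http://)([a-zA-Z.0-9]*)([a-zA-Z.-/0-9#?%&|]*)")


def wrap_path(text):
    # Hypothetical helper: \3 is Python's spelling of IntelliJ's $3.
    return URL_PATTERN.sub(r"<someTag>\3</someTag>", text)


print(wrap_path("http://www.foo.com/images/bar/something/image.gif"))
```

Only the path survives the replacement; the scheme and host in groups 1 and 2 are matched but never referenced.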
Good luck!

Search and replace with particular phrase

I need help with a mass search and replace using regex.
I have longer strings in which I need to look for any number followed by a particular string, e.g. 321BS, and replace just that text string. So I need to look for BS in "gf test test2 321BS test" (the pattern is always the same, only the position differs) and change just BS.
Can you please help me find a regex for this?
Update: I need to keep the number and change just the text string. I will be doing this in Notepad++. However, I need a general function for this if possible. I am a rookie in regex. Moreover, is it possible to do it in Trados SDL Studio? Or how can I do it in bulk in an Excel file?
Thank you very much!
Your question is a bit vague; however, as I understand it, you want to match any digits followed by BS, i.e. 123BS, keeping 123 but replacing BS?
Regex: (\d+)BS matches 123BS
In Notepad++ you can:
match (\d+)BS
replace \1NEWTEXT
This will replace 123BS with 123NEWTEXT.
\1 substitutes the capture group (\d+), which matches one or more digits.
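For a general function outside Notepad++, the same capture-and-reinsert idea could look roughly like this in Python. It is a sketch only: replace_suffix and its new_text parameter are hypothetical names, and \g<1> is used instead of \1 so the group number cannot run into any digits in the replacement text.

```python
import re


def replace_suffix(text, new_text="NEWTEXT"):
    # (\d+) captures the number; \g<1> puts it back unchanged,
    # so only the literal "BS" after the digits is swapped out.
    return re.sub(r"(\d+)BS", r"\g<1>" + new_text, text)


print(replace_suffix("gf test test2 321BS test"))
```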
You could do this in Trados Studio using an app. The SDLXLIFF Toolkit may be the most appropriate for you. The advantage over Notepad++ is that it's controlled and will only affect the translatable text and not anything that might break the integrity of the file if you make a mistake. You can also handle multiple files, or even multiple Trados Studio projects in one go.
The syntax would be very similar to the suggestion above... you would:
match (\d+)BS
replace $1NEWTEXT

VS 2017 find and replace ConfigurationManager.Appsettings["stringname"]

I am trying to find and replace ConfigurationManager.Appsettings["stringname"] and Convert.ToInt32(ConfigurationManager.Appsettings["stringname"]) (where "stringname" could be any valid AppSetting name) in all the files in my project with a call to a custom method which wraps this.
There are around 1000 of these entries in the project, trying to avoid doing this manually on each file.
Is there a better way to do this? Gave a few tries with the Find in all files with a regex, but with no luck.
Thanks in advance.
You should be able to match and replace them all with this regex:
ConfigurationManager\.Appsettings\[([^\]]+)\]
The regex searches for the string 'ConfigurationManager.Appsettings[', then creates a capturing group containing the content of the square brackets.
You then need to replace with:
ClassA.Method($1)
Here '$1' is replaced by the text matched by group 1 of the regex (for instance '"SessionTimeout"').
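To check the idea before running it over a whole solution, the replacement can be tried on a single line in Python. This sketch uses a variant of the pattern with the dot escaped and the character class closed, ConfigurationManager\.Appsettings\[([^\]]+)\]. ClassA.Method is the placeholder name from the answer, wrap_settings is a made-up helper, and Python writes the group as \1 rather than $1.

```python
import re

# Group 1 captures everything between the square brackets,
# including the surrounding double quotes.
APPSETTING = re.compile(r'ConfigurationManager\.Appsettings\[([^\]]+)\]')


def wrap_settings(source):
    # Hypothetical helper: swap the indexer for a call to the wrapper method.
    return APPSETTING.sub(r"ClassA.Method(\1)", source)


line = 'int t = Convert.ToInt32(ConfigurationManager.Appsettings["SessionTimeout"]);'
print(wrap_settings(line))
```

The surrounding Convert.ToInt32(...) call is left untouched; only the indexer expression is rewritten.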

Regular Expression to Remove Subdomains from Domain List

I have a list of domains and subdomains stored in a .txt file (I'm using Windows XP).
The format of the domains is this:
somesite1.com
sub1.somesite1.com
sub2.somesite1.com
somesite2.com
sub1.somesite2.com
sub2.somesite2.com
somesite3.com
sub1.somesite3.com
sub2.somesite3.com
I use notepad++, and I need to use regular expressions
Anyway, I don't know what to put in the find & replace boxes so it can go through the contents of the file and leave me with only the root domains. If done properly, it would turn the above example list into this:
somesite1.com
somesite2.com
somesite3.com
Can somebody help me out?
Thank you in advance.
It's an old question, but the answers provided didn't work for me. You need a negative lookahead. The correct regex is:
^\w*\.(?!\w+\s*\n)
You can use:
Find what: [^\r\n]+\.[^.\r\n]+\.[^.\r\n]+[\r\n]+
Replace with: (leave empty)
with Regular expression checked and ". matches newline" NOT checked
I suggest using the Mark tab of the Notepad++ Find dialogue. Enter the regular expression ^\w+\.\w+\.\w+$, make sure that Bookmark line is selected, then click Mark all. Next, use Menu => Search => Bookmark => Remove bookmarked lines. This will remove all entries with three "words" separated by two dots. It will leave all other lines in place.
An alternative is to mark all lines matching the regular expression ^\w+\.\w+$ and use the Remove unmarked lines menu entry. This I do not recommend as it will remove all lines with an unexpected format as well as the lines for subdomains.
Another method would use the Replace tab of the Notepad++ Find dialogue. Enter the regular expression ^\w+\.\w+\.\w+\r\n in the Find what field, and leave the Replace with field empty. The \r\n part of this expression may need some adjustment to account for the line endings set on the file.
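If the list is large or has to be cleaned regularly, the "remove lines with three dot-separated words" idea can also be sketched as a small Python filter. The function name root_domains is made up for the example, and the sketch assumes one domain per line, as in the question.

```python
import re

# Three "words" separated by two dots: the shape of a subdomain entry.
SUBDOMAIN = re.compile(r"^\w+\.\w+\.\w+$")


def root_domains(lines):
    # Keep every line that does NOT look like sub.domain.tld.
    stripped = [line.strip() for line in lines]
    return [line for line in stripped if not SUBDOMAIN.match(line)]


domains = [
    "somesite1.com", "sub1.somesite1.com", "sub2.somesite1.com",
    "somesite2.com", "sub1.somesite2.com", "sub2.somesite2.com",
]
print(root_domains(domains))
```

Like the Mark-tab approach, this leaves lines with an unexpected format in place rather than deleting them.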

Regexp-replace: Multiple replacements within a match

I'm converting our MVC3 project to use T4MVC, and I would like to change the JavaScript includes to work with T4MVC as well. So I need to replace
"~/Scripts/DataTables/TableTools/TableTools.min.js"
"~/Scripts/jquery-ui-1.8.24.min.js"
Into
Scripts.DataTables.TableTools.TableTools_min_js
Scripts.jquery_ui_1_8_24_min_js
I'm using Notepad++ as a regexp tool at the moment, and it is using POSIX regexps.
I can find script name and replace it with these regexps:
Find: \("~/Scripts/(.*)"\)
Replace with \(Scripts.\1\)
But I can't figure out how do I replace dots and dashes in the file names into underscores and replace forward slashes into dots.
I can check that js-filename have dot or dash in a name with this
\("~/Scripts/(?=\.*)(?=\-*).*"\)
But how do I replace groups within a group?
Need to have non-greedy replacement within group, and have these replacements going in an order, so forward slashes converted into a dot will not be converted to underscore afterwards.
This is a non-critical problem, I've already done all the replacements manually, but I thought I'm good with regexp, so this problem bugs me!!
p.s. preferred tool is Notepad++, but any POSIX regexp solution would do -)
p.p.s. Here you can get a sample of stuff to be replaced
And here is the target text
I would just use a site like RegexHero
You can paste the code into the target string box, then place (?<=(~/Script).*)[.-](?=(.*"[)]")) into the Regular Expression box, with _ in the Replacement String box.
Once the replace is done, click on Final String at the bottom, and select Move to target string and start a new expression.
From there, Paste (?<=(<script).*)("~/)(?=(.*[)]" ))|(?<=(Url.).*)(")(?=(.*(\)" ))) into the Regular Expression box and leave the Replacement String box empty.
Once the replace is done, click on Final String at the bottom, and select Move to target string and start a new expression.
From there paste (?<=(Script).*)[/](?=(.*[)]")) into the Regular Expression box and . into the Replacement String box.
After that, the Final String box will have what you are looking for. I'm not sure the upper limits of how much text you can parse, but it could be broken up if that's an issue. I'm sure there might be better ways to do it, but this tends to be the way I go about things like this. One reason I like this site, is because I don't have to install anything, so I can do it anywhere quickly.
Edit 1: Per the comments, I have moved step 3 to Step 5 and added new steps 3 and 4. I had to do it this way, because new Step 5 would have replaced the / in "~/Scripts with a ., breaking the removal of "~/. I also had to change Step 5's code to account for the changed beginning of Script
Here is a vanilla Notepad++ solution, but it's certainly not the most elegant one. I managed to do the transformation with several passes over the file.
First pass
Replace . and - with _.
Find: ("~/Scripts[^"]*?)[.-]
Replace With: \1_
Unfortunately, I could not find a way to match only the . or -, because it would require a lookbehind, which is apparently not supported by Notepad++. Due to this, every time you execute the replacement only the first . or - in a script name will be replaced (because matches cannot overlap). Hence, you have to run this replacement multiple times until no more replacements are done (in your example input, that would be 8 times).
Second pass
Replace / with ..
Find: ("~/Scripts[^"]*?)/
Replace with: \1.
This is basically the same thing as the first pass, just with different characters (you will have to do this 3 times for the example file). Doing the passes in this order ensures that no slashes will end up as underscores.
Third pass
Remove the surrounding characters.
Find: "~/(Scripts[^"]*?)"
Replace with: \1
This will now match all the script names that are still surrounded by "~/ and ", capturing what is in between and just outputting that.
Note that by including those surrounding characters in the find patterns of the first two passes, you can avoid converting the . in strings that are already of the new format.
As I said this is not the most convenient way to do it. Especially, since passes one and two have to be executed manually multiple times. But it would still save a lot of time for large files, and I cannot think of a way to get all of them - only in the correct strings - in one pass, without lookbehind capabilities. Of course, I would very much welcome suggestions to improve this solution :). I hope I could at least give you (and anyone with a similar problem) a starting point.
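The three passes, including the "repeat until nothing changes" step, can be sketched in Python with the same find and replace patterns. The helper names replace_until_stable and convert are made up for this illustration; the loop stands in for pressing Replace All by hand several times.

```python
import re


def replace_until_stable(pattern, replacement, text):
    # Mirrors running "Replace All" repeatedly until no replacement is made.
    while True:
        text, count = re.subn(pattern, replacement, text)
        if count == 0:
            return text


def convert(text):
    # First pass: . and - become _ (one per script name per iteration).
    text = replace_until_stable(r'("~/Scripts[^"]*?)[.-]', r"\1_", text)
    # Second pass: / becomes . (run after the first so new dots survive).
    text = replace_until_stable(r'("~/Scripts[^"]*?)/', r"\1.", text)
    # Third pass: strip the surrounding "~/ and ".
    return re.sub(r'"~/(Scripts[^"]*?)"', r"\1", text)


print(convert('"~/Scripts/jquery-ui-1.8.24.min.js"'))
```

Running the slash pass second is what keeps the dots it produces from being turned into underscores.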
If, as your question indicates, you'd like to use N++ then use N++ Python Script. Setup the script and assign a shortcut key, then you have a single pass solution requiring only to open, modify, and save... can't get much simpler than that.
I think part of the problem is that N++ is not a regex tool, and the use of a dedicated regex tool, or even a search/replace solution, is sometimes warranted. You may be better off, both in speed and in time value, using a tool made for text processing rather than editing.
[Script Edit]:: Altered to match the modified in/out expectations.
# Substitute & Replace within matched group.
from Npp import *
import re

def repl(m):
    # Inside the captured path: - and . become _, then / becomes .
    return "(Scripts." + re.sub( "[-.]", "_", m.group(1) ).replace( "/", "." ) + ")"

editor.pyreplace( '(?:[(].*?Scripts.)(.*?)(?:"?[)])', repl )
Install:: Plugins -> Plugin Manager -> Python Script
New Script:: Plugins -> Python Script -> script-name.py
Select target tab.
Run:: Plugins -> Python Script -> Scripts -> script-name
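The core of this approach, a replacement function that does further substitutions inside the matched group, can also be tried without Notepad++. The following standalone sketch uses only Python's standard re module; the pattern and the names to_t4mvc and convert are assumptions made for the example and are not the PythonScript plugin's API.

```python
import re

# Capture the path after "~/Scripts/" up to the closing quote.
SCRIPT_PATH = re.compile(r'"~/Scripts/([^"]+)"')


def to_t4mvc(match):
    # Split on / first, so the dots that join the segments are added last
    # and are never turned into underscores.
    segments = match.group(1).split("/")
    return "Scripts." + ".".join(re.sub(r"[-.]", "_", s) for s in segments)


def convert(text):
    # re.sub calls to_t4mvc once per match and inserts its return value.
    return SCRIPT_PATH.sub(to_t4mvc, text)


print(convert('"~/Scripts/DataTables/TableTools/TableTools.min.js"'))
```

Because the callback sees the whole match at once, a single pass is enough and no ordering trick between passes is needed.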
[Edit: An extended one-liner PythonScript command]
Having need for the new regex module for Python (that I hope replaces re) I played around and compiled it for use with the N++ PythonScript plugin and decided to test it on your sample set.
Two commands on the console ended up with the correct results in the editor.
import regex as re
editor.setText( (re.compile( r'(?<=.*Content[(].*)((?<omit>["~]+?([~])[/]|["])|(?<toUnderscore>[-.]+)|(?<toDot>[/]+))+(?=.*[)]".*)' ) ).sub(lambda m: {'omit':'','toDot':'.','toUnderscore':'_'}[[ key for key, value in m.groupdict().items() if value != None ][0]], editor.getText() ) )
Very sweet!
What else is really cool about using regex instead of re was that I was able to build the expression in Expresso and use it as is! Which allows for a verbose explanation of it, just by copy-paste of the r'' string portion into Expresso.
The abbreviated text of which is::
Match a prefix but exclude it from the capture. [.*Content[(].*]
[1]: A numbered capture group. [(?<omit>["~]+?([~])[/]|["])|(?<toUnderscore>[-.]+)|(?<toDot>[/]+)], one or more repetitions
Select from 3 alternatives
[omit]: A named capture group. [["~]+?([~])[/]|["]]
Select from 2 alternatives
["~]+?([~])[/]
Any character in this class: ["]
[toUnderscore]: A named capture group. [[-.]+]
[toDot]: A named capture group. [[/]+]
Match a suffix but exclude it from the capture. [.*[)]".*]
The command breakdown is fairly nifty, we are telling Scintilla to set the full buffer contents to the results of a compiled regex substitution command by essentially using a 'switch' off of the name of the group that isn't empty.
Hopefully Dave (the PythonScript Author) will add the regex module to the ExtraPythonLibs part of the project.
Alternatively you could use a script that would do it and avoid copy pasting and the rest of the manual labor altogether. Consider using the following script:
$_.gsub!(%r{(?:"~/)?Scripts/([a-z0-9./-]+)"?}i) do |i|
'Scripts.' + $1.split('/').map { |i| i.gsub(/[.-]/, '_') }.join('.')
end
And run it like this:
$ ruby -pi.bak script.rb *.ext
All the files with extension .ext will be edited in-place and the original files will be saved with .ext.bak extension. If you use revision control (and you should) then you can easily review changes with some visual diff tool, correct them if necessary and commit them afterwards.

Regex find/replace

I am attempting to do some find and replace on a java source file.
Currently my classes have invalid names (imported from a tool that did poor auto naming) of the form:
public class [0-9]{2}[A-Za-z]+
I would like to insert underscores around the digits, resulting in a valid class name of the form
public class [_][0-9]{2}[_][A-Za-z]+
However, using Eclipse's find and replace tool with the regex box checked and the patterns above as the find and replace strings does not format the output as I'd like.
It takes
02ListOfValidAppIDs
and makes it
[-][0-9]{2}[_][A-Za-z]+
instead of
_02_ListOfValidAppIDs
How can you make the regex keep the arbitrary number and text and just plug them in for the replace string?
(Edit: As a note, with the preview feature I can see that eclipse is correctly finding all of the names I wish to replace, and nothing else)
I'm not sure of the exact flavor of Regex that you'll need, but something like this should get you started in the right direction.
Update your "Find" pattern to use capture groups:
(public class )([0-9]{2})([A-Za-z]+)
And then reference those captures in the replacement:
\1_\2_\3
NOTE: Some flavors of Regex will use {} instead of () to represent a captured group, and some flavors will use $1, $2, $3, etc. as the reference instead of \1, \2, \3.
In Eclipse's Find/Replace dialogue, this works fine if you use
([0-9]+)([A-Za-z]+)
in the Find box and
_$1_$2
in the Replace with box.
Of course, the Regular expressions box must be checked too.
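The underlying point, that the replace string must refer back to captured groups rather than repeat the pattern, can be checked quickly in Python. This is a sketch only: fix_class_names is a made-up helper, and Python writes the references as \1, \2, \3 where Eclipse uses $1, $2, $3.

```python
import re


def fix_class_names(source):
    # Groups: 1 = "public class ", 2 = the two digits, 3 = the letters.
    # The replacement re-inserts each group with underscores around the digits.
    return re.sub(r"(public class )([0-9]{2})([A-Za-z]+)", r"\1_\2_\3", source)


print(fix_class_names("public class 02ListOfValidAppIDs {"))
```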