Search and replace with particular phrase - regex

I need help with a mass search and replace using regex.
I have longer strings in which I need to look for any number followed by a particular string, e.g. 321BS, and replace just the text part that I was looking for. So I need to look for BS in "gf test test2 321BS test" (the pattern is always the same, only the position differs) and change just BS.
Can you please help me find a regex for this?
Update: I need to keep the number and change just the text string. I will be doing this in Notepad++. However, I need a general function for this if possible. I am a rookie in regex. Moreover, is it possible to do it in Trados SDL Studio? Or how can I do it in bulk in an Excel file?
Thank you very much!

Your question is a bit vague; however, as I understand it, you want to match any digits followed by BS, i.e. 123BS. You want to keep 123 but replace BS?
Regex: (\d+)BS matches 123BS
In notepad++ you can:
match (\d+)BS
replace \1NEWTEXT
This will replace 123BS with 123NEWTEXT.
\1 substitutes the capture group (\d+), which matches one or more digits.
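For doing the same replacement in bulk outside Notepad++, here is a minimal Python sketch. It assumes only the standard `re` module; the function name `replace_suffix` and its default arguments are illustrative and not part of any tool mentioned here.

```python
import re

def replace_suffix(text, old="BS", new="NEWTEXT"):
    # (\d+) captures the digits; re.escape protects any special characters in `old`.
    pattern = re.compile(r"(\d+)" + re.escape(old))
    # The callback keeps the captured digits (group 1) and appends the new text.
    return pattern.sub(lambda m: m.group(1) + new, text)

print(replace_suffix("gf test test2 321BS test"))
```

Strings in which BS is not preceded by digits are left unchanged, because the pattern requires at least one digit before it.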

You could do this in Trados Studio using an app. The SDLXLIFF Toolkit may be the most appropriate for you. The advantage over Notepad++ is that it's controlled and will only affect the translatable text and not anything that might break the integrity of the file if you make a mistake. You can also handle multiple files, or even multiple Trados Studio projects in one go.
The syntax would be very similar to the suggestion above... you would:
match (\d+)BS
replace $1NEWTEXT

Related

REGEX in MS Word 2016: Exclude a simple String from Search

So I read a lot about Negation in Regex but can't solve my problem in MS Word 2016.
How do I exclude a String, Word, Number(s) from being found?
Example:
<[A-Z]{2}[A-Z0-9]{9;11}> to search a String like XY123BBT22223
But how do I exclude a specific one, for example SEDWS12WW04?
Well it depends on what you need to achieve or is this a matter of curiosity... RegEx is not the same as the built-in Advanced Find with Wildcards; for that you need VBA.
Depending on your need, without using VBA, you could make use of space and return characters - something like this will work for the strings provided: [ ^13][A-Z]{2}[0-9]{1,}[A-Z]{1,}[0-9]{1,}[ ^13] (assuming you use normal carriage returns and spaces in your document)
Anyway, this is a good article on wildcard searches in MS Word: https://wordmvp.com/FAQs/General/UsingWildcards.htm
EDIT:
In light of your further comments you will probably want to look at section 8 of the linked article which explains grouping. For my proposed search you can use this to your advantage by creating 3 groups in your 'find' and only modifying the middle group, if indeed you do intend to modify. Using groups the search would look something like:
([ ^13])([A-Z]{2}[0-9]{1,}[A-Z]{1,}[0-9]{1,})([ ^13])
and the replace might look like this:
\1 SOMETHING \3
Note also: compared to a RegEx solution my suggestion is kinda lame, mainly because, compared to RegEx, MS Word's find and replace (good as it is, and really it is) is kinda lame... it's hacky but it might work for you (although you might need to do a few searches).
BUT... if it really is REGEX that you want, well you can get access to this via VBA: How to Use/Enable (RegExp object) Regular Expression using VBA (MACRO) in word
And... then you will be able to use proper RegEx for find and replace, well almost - I'm under the impression that the VBA RegEx still has some quirks...
As already noted by others, this is not possible in Microsoft Word's flavor of regular expressions.
Instead, you should use standard regular expressions. It is actually possible to use standard regular expressions in MS Word if you use a special tool that integrates into Microsoft Word called Multiple Find & Replace (see http://www.translatortools.net/products/transtoolsplus/word-multiplefindreplace). This tool opens as a pane to the right of the document window and works just like the Advanced Find & Replace dialog. However, in addition to Word's existing search functionality, it can use the standard regular expressions syntax to search and replace any text within a Word document.
In your particular case, I would use this:
\b[A-Z]{2}[A-Z0-9]{9,11}\b(?<!\bSEDWS12WW04)
To explain, this searches for a word boundary + ID + word boundary, and then it looks back to make sure that the preceding string does not match [word boundary + excluded ID]. In a similar vein, you can do something like
(?<!\bSEDWS12WW04|\bSEDWS12WW05|\bSEDWS12WW05)
to exclude several IDs.
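The lookbehind pattern can also be tried outside Word. The following is a small Python sketch, assuming the standard `re` module; the names `EXCLUDED`, `ID_PATTERN` and `find_ids` are made up for illustration. Python requires a lookbehind to have a fixed width, which holds here because the excluded ID has a fixed length.

```python
import re

EXCLUDED = "SEDWS12WW04"  # the ID to skip, taken from the question

# Word boundary + ID + word boundary, then a negative lookbehind that rejects
# the match when the text just consumed is exactly the excluded ID.
ID_PATTERN = re.compile(r"\b[A-Z]{2}[A-Z0-9]{9,11}\b(?<!\b" + EXCLUDED + r")")

def find_ids(text):
    # The pattern has no capture groups, so findall returns the full matches.
    return ID_PATTERN.findall(text)

print(find_ids("XY123BBT22223 SEDWS12WW04 AB123456789"))
```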
Multiple Find & Replace is quite powerful: you can add any number of expressions (either using regular expressions or using Word's standard search syntax) to a list and then search the document for all of them, replace everything, display all matches in a list and replace only specific matches, and a few more things.
I created this tool for translators and editors, but it is great for any advanced search/replace operations in Word, and I am sure you will find it very useful.
Best regards, Stanislav

How to match and replace n number of times with RegEx

I'm using TextWrangler, the free version of BBEdit on the Mac, which I understand uses the PCRE engine.
What I want to do is match a specific number of lines and replace as well.
After a lot of searching I came up with this:
(^(.*\r)){25}
This lets me match up to 25 lines. It works great, but the problem comes when I want to actually replace something. I can't figure out how to do it.
For example, I would like to replace all of the returns "\r" with tabs "\t".
Hopefully this is actually possible. I'd appreciate any help. Thanks!
A regular expression on its own only describes what to search for. It cannot replace anything by itself; a programming language or editor uses the regexp as the search part of its search-and-replace function. Thus, the way to do 25 replacements lies purely in the domain of that programming language or editor. If it does not provide such a capability, either directly in search-and-replace or through a macro, loop or similar, then you cannot do it automatically.
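As an illustration of leaving the counting to the host language, here is a hedged Python sketch (not a TextWrangler feature). It uses the `count` argument of `re.sub` to limit the number of replacements; the function name `tabify_first_lines` is hypothetical.

```python
import re

def tabify_first_lines(text, n=25):
    # count=n stops after the first n carriage returns have been replaced,
    # so only the first n line endings are turned into tabs.
    return re.sub(r"\r", "\t", text, count=n)

print(repr(tabify_first_lines("a\rb\rc\r", n=2)))
```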

Regex to match text between two strings, including the strings

I'm trying to fix some conflicts in a git merge. There are a lot of <<<<<<< HEAD and ======= blocks that I want to be able to find and replace with an empty string in a lot of files.
I found this regex pattern that correctly matches everything between the two strings, but it leaves out the beginning and ending strings, and I want to be able to match them also.
(?s)(?<=<<<<<<< HEAD).*?(?=\=\=\=\=\=\=\=)
So, match <<<<<<< HEAD, ======= and everything between them to do a search/replace.
Can anyone help me out? I would only run this on files where I'm certain I don't want anything between those strings. I guess that's also why I didn't try a "use theirs" flag when doing the merge, because I need to see the files first.
Just leave out the look-arounds mentioned by Xufox
(?s)(<<<<<<< HEAD)(.*?)(\=\=\=\=\=\=\=)
The .*? is wrapped in parentheses so you can reference it in the replacement: \1 is the opening marker, \2 is everything in between, and \3 is the closing marker (but the syntax can vary).
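To check which group holds what, here is a minimal Python sketch using the standard `re` module. The names `CONFLICT_HEAD`, `strip_head_side` and the sample text are assumptions for illustration. Note that it removes only the part from <<<<<<< HEAD through =======; the rest of the conflict, including the >>>>>>> line, stays in place.

```python
import re

# re.DOTALL plays the role of (?s): the dot also matches newlines.
CONFLICT_HEAD = re.compile(r"(<<<<<<< HEAD)(.*?)(=======)", re.DOTALL)

def strip_head_side(text):
    # Replace the opening marker, the text in between and the separator with nothing.
    return CONFLICT_HEAD.sub("", text)

sample = "keep\n<<<<<<< HEAD\nours\n=======\ntheirs\n>>>>>>> branch\n"
match = CONFLICT_HEAD.search(sample)
print(repr(match.group(1)), repr(match.group(2)), repr(match.group(3)))
print(repr(strip_head_side(sample)))
```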
I think you might be asking the wrong question here. The best way to actually handle merge conflicts is with a merge tool. You should look into something like meld. And specifically setting git merge tool to use that. Manual merges are not fun...
Use a pretty UI to analyze the merge instead.
You want to match on the opening and closing tags of the conflict sections?
Parentheses are primarily used for capturing groups. If you write it like so:
(<<<<<<< HEAD)(.*\s)+(\s*=======)
it will create three capture groups (four if you count group 0, the whole match) whose contents you can access.
Tested: http://regexr.com/

Regex exclude value from repetition

I'm working with load files and trying to write a regex that will check for any rows in the file that do not have the correct number of delimiters. Let's pretend the delimiter is % (I'm not sure this text field supports the delimiters that are in the load file). The regex I wrote that finds all rows that are correct is:
^%([^%]*%){20}$
Sometimes it's more beneficial to find any rows that do not have the correct number of delimiters, so to accomplish that, I wrote this:
(^%([^%]*%){0,19}$)|(^%([^%]*%){21,}$)
I'm concerned about the efficiency of this (and any regex I write in general), so I'm wondering if there's a better way to write it, or if the way I wrote it is fine. I thought maybe there would be some way to use alternation with the repetition tokens, such as:
{0,19}|{21,}
but that doesn't seem to work.
If it's helpful to know, I'm just searching through the files in Sublime Text, which I believe uses PCRE. I'm also open to suggestions for making the first regex better in general, although I've found it to work pretty well even in exceptionally large load files.
If your regex engine supports negative lookaheads, you can slightly modify your original regex.
^(?!%([^%]*%){20}$)
The regex above is only useful as a test, because it matches an empty string at the start of each bad row. If you want to capture the row itself, you need to add the .* part.
^(?!%([^%]*%){20}$).*$
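The negative-lookahead approach can be sketched in Python for checking a file outside the editor. This is an illustration under assumptions: the function name `bad_rows` and the `fields` parameter are hypothetical, and each row is tested on its own so that `[^%]*` cannot run across a line break.

```python
import re

def bad_rows(text, fields=20):
    # The lookahead fails (so the row is skipped) when the row is '%' followed
    # by exactly `fields` segments that each end in '%'.
    pattern = re.compile(r"^(?!%(?:[^%]*%){" + str(fields) + r"}$).*$")
    # Return (line number, row) pairs for rows with the wrong delimiter count.
    return [
        (number, row)
        for number, row in enumerate(text.splitlines(), start=1)
        if pattern.match(row)
    ]

print(bad_rows("%a%b%\n%a%\n%a%b%c%", fields=2))
```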

How do I join two regular expressions into one in Notepad++?

I've been searching a lot in the web and in here but I can't find a solution to this.
I have to make two replacements in all registry paths saved in a text file as follows:
replace all asterisks with: [#42]
replace all single backslashes with two.
I already have two expressions that do this right:
1st case:
Find: (\*) - Replace: \[#42\]
2nd case:
Find: ([^\\])(\\)([^\\]) - Replace: $1$2\\$3
Now, all I want is to join them together into just one expression so that I can run this in a single pass.
I'm using Notepad++ 6.5.1 in Windows 7 (64 bits).
Example line in which I want this to work (I include backslashes but I don't know if they will appear right in the HTML):
HKLM\SOFTWARE\Classes\*\shellex\ContextMenuHandlers\
I already tried separating it with a pipe, like I do in Jscript (WSH), but it doesn't work here. I also tried a lot of other things but none worked.
Any help?
Thanks!
Edit: I have put all the backslashes right, but the page html seem to be "eating" some of them!
Edit2: Someone reedited my text to include an accent that doesn't remove the backslashes, so the expressions went wrong again. But I got it and fixed it. ;-)
Sorry, but this was my first post here. :)
As everyone else already mentioned this is not possible.
But, you can achieve what you want in Notepad++ by using a Macro.
Go to "Macro" > "Start Recording" menu, apply those two search and replace regular expressions, press "Stop Recording", then "Save Current Recorded Macro", there give it a name, assign a shortcut, and you are done. You now can reuse the same replacements whenever you want with one shortcut.
Since your replacement strings are totally different and one of them uses data that does not come from any capture (i.e. [#42]), you can't.
Keep in mind that replacement strings are only masks, and can not contain any conditional content.
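Outside Notepad++, a scripting language whose replace function accepts a callback can handle both cases in one pass. The following Python sketch is one possible approach, not a Notepad++ feature; the names `PATTERN` and `escape_registry_path` are made up for illustration. It uses lookarounds instead of the original capture groups so that consecutive matches do not consume each other's neighbouring characters.

```python
import re

# Match an asterisk, or a backslash that has no backslash directly before or after it.
PATTERN = re.compile(r"\*|(?<!\\)\\(?!\\)")

def escape_registry_path(path):
    # The callback chooses the replacement based on what was matched:
    # '*' becomes '[#42]', a single backslash becomes two backslashes.
    return PATTERN.sub(lambda m: "[#42]" if m.group() == "*" else "\\\\", path)

sample = r"HKLM\SOFTWARE\Classes\*\shellex\ContextMenuHandlers" + "\\"
print(escape_registry_path(sample))
```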