Regular Expression Replace String - regex

I have a rather complicated data file with many rows of many different types. For the particular column I'm interested in, I have a pattern that looks like this:
12.6 \pm 0.8
^^ The number of digits before and after the decimal in each of those pieces of the entry may vary.
I'm hoping I can use regular expressions to replace that column entry with:
[12.6,-0.8,+0.8]
What I am requesting help on is how I should go about replacing once I've found entries like what I had earlier. All of the examples I've found so far are for when you want to replace static strings with other static strings, but for each line I'm necessarily going to have different numbers (and different digits perhaps). The regular expression I've attempted so far to find entries like "12.6 \pm 0.8" is the following:
\d*\.\d*\s\\\w{2})\s\d*\.\d*
I would also appreciate a check on that. At the moment I'm just manipulating the data file in my text editor, but I'm open to Python solutions, too.
Thanks!

Your expression is close. Are there any conditions where this won't work?
(\d*\.\d*)\s\\\w{2}\s(\d*\.\d*)
with the replace pattern being (for JS)
[$1,-$2,+$2]
or for emacs (according to http://www.emacswiki.org/emacs/RegularExpression)
[\1,-\2,+\2]
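Since the question is also open to Python, here is a minimal sketch of the same capture-and-reuse idea with re.sub. The function name expand_pm and the sample line are made up for illustration, and the pattern is a slightly stricter variant (it spells out \pm and requires at least one digit) rather than the exact expression above.

```python
import re

# Value, whitespace, a literal "\pm", whitespace, uncertainty.
# \d+(?:\.\d+)? needs at least one digit, unlike \d*\.\d* which also matches ".".
PM_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s+\\pm\s+(\d+(?:\.\d+)?)")


def expand_pm(line: str) -> str:
    """Rewrite every 'a \\pm b' in the line as '[a,-b,+b]'."""
    # \1 and \2 in the replacement refer to the two captured numbers.
    return PM_PATTERN.sub(r"[\1,-\2,+\2]", line)


# Hypothetical row; the rest of the line is left untouched.
print(expand_pm(r"star1 12.6 \pm 0.8 K"))  # star1 [12.6,-0.8,+0.8] K
```

The key point is that the replacement is not a static string: the backreferences carry whatever digits were matched on each line into the output.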

Related

Find and replace with regular expression in Notepad++

At the moment, I have a PHP function that gets the contents of a CSV file and puts it into a multi-dimensional array, which contains text that I print out in various places, using the indexes.
An example of use would be:
$localText[index][pageText][conceptQualityText][$lang];
The first index, [index], would be the name of the page. The second index [pageText] would indicate what it is (text for the page). The third index, [conceptQualityText] indicates what the actual text is. The last index, [$lang] gets the text in the desired language.
so:
->page location
->what is it
->the content
->what language it should be displayed in.
This all worked fine in previous PHP versions. However, after upgrading to 7.2, PHP seems to be a bit more strict. I was a bit more green ~2 years ago when I first made this solution, and now know that since these indexes aren't written as strings, e.g. encapsulated in single quotes like so: ['index'], they fit the notation of a constant (as created with define()). I didn't give it much thought back then, but now PHP interprets them as constants, and so I get the error that word x is an undefined constant.
My initial thought is to make a search and replace on my example string:
$localText[index][pageText][conceptQualityText][$lang];
using the regular expression functionality in Notepad++.
However, the example is just one of many, the notation of the array indexing is basically:
$localText[index][index2][index3][$lang];
So my question is:
How can I make use of the Notepad++ search and replace, using a regular expression, so that my index pointers become strings, instead of being treated as undefined constants?
e.g. make:
$localText[index][index2][index3][$lang];
into:
$localText['index']['index2']['index3'][$lang];
I will need some sort of logic that checks for whatever is inside the brackets and encapsulates them with single quotes, except for the last index, [$lang].
I tried to give as much information as possible, let me know if anything needs to be elaborated.
I tried to refer to these docs without much luck.
I found a solution using this:
find: \b(localText\[)([a-zA-Z0-9_\-]+)(\]\[)([a-zA-Z0-9_\-]+)(\]\[)([a-zA-Z0-9_\-]+)
replace: $1'$2'$3'$4'$5'$6'
and it works like a charm. Thanks to everyone who took the time to help.
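You can use the following regex to match:
\[(\w+)\]
The regex matches a bare word between square brackets. An index that is already quoted does not match, because a quote is not a word character.
Replace with:
['$1']
The regex will not match the last pair of brackets because its content starts with a '$' sign, which is not a word character either.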
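For anyone who would rather script this than run it in Notepad++, here is a minimal Python sketch of the same substitution. It assumes the pattern \[(\w+)\] (a bare word between brackets); quote_indexes is a made-up helper name. Note that \w+ would also quote purely numeric indexes such as [0], which may or may not be what you want.

```python
import re

# A bare word between square brackets. Quoted keys like ['index'] and
# variables like [$lang] contain non-word characters, so they do not match.
BARE_INDEX = re.compile(r"\[(\w+)\]")


def quote_indexes(php_line: str) -> str:
    """Wrap every bare-word array index in single quotes."""
    return BARE_INDEX.sub(r"['\1']", php_line)


line = "$localText[index][pageText][conceptQualityText][$lang];"
print(quote_indexes(line))
# $localText['index']['pageText']['conceptQualityText'][$lang];
```

Running it a second time on its own output changes nothing, so it is safe to apply to files that are already partly fixed.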

Simple find-and-replace regexp

In a .txt file I have multiple lines. Every line contains timing data like this:
time [4.1s] [4100ms]
time [5.53s] [5530ms]
All lines have different words/chars before and after the times.
I want to do a find-and-replace action (in Notepad++) to get the following, simple, format:
4.1
5.53
How do I do it? What is the regular expression to use?
Any help is greatly appreciated!
Find:
.*\[([\d.]+)s\].*
Replace with:
\1
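The same find/replace pair can be tried outside Notepad++. Below is a small Python sketch; the two sample lines are invented (the question only says there is other text before and after the times).

```python
import re

# Same idea as the find pattern above: swallow the whole line, but capture
# the number that sits in brackets directly before an "s".
SECONDS = re.compile(r".*\[([\d.]+)s\].*")

# Hypothetical input lines with extra words around the timing data.
lines = [
    "run A time [4.1s] [4100ms] ok",
    "run B time [5.53s] [5530ms] slow",
]

# Replacing the full match with group 1 leaves only the seconds value.
seconds = [SECONDS.sub(r"\1", line) for line in lines]
print(seconds)  # ['4.1', '5.53']
```

The millisecond bracket is never captured because its digits are followed by "m" rather than "s", so the engine backtracks to the first bracket.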
Assuming that you only want the first number in brackets and that has a decimal point as per your example:
\d*[.]\d+
This returns 4.1 and 5.53 as requested when applied to your example.
If the first number might not have a decimal point, then you want to consider:
\d*[.]?\d+s
but account for the trailing s in your replacement, since it is now part of the match.
Update
Update based on your latest information. I don't know if Notepad++ supports positive lookbehind (?<=), but if it does you could do this:
(?<=time \[)\d*[.]\d+

Matching Any Word Regex

I would like to remove hundreds of onmouseover events from my code. The events all pass different variables, and I want to be able to use Dreamweaver to find and replace all the strings with nothing.
Here is an example
onmouseover="parent.mv_mapTipOver(evt,'Wilson');"
onmouseover="parent.mv_mapTipOver(evt,'Harris');"
onmouseover="parent.mv_mapTipOver(evt,'Walker');"
I want to run a search that will identify all of these and replace/remove them.
I have tried seemingly infinite permutations of things like:
onmouseover="parent.mv_mapTipOver(evt,'[^']');"
or
onmouseover="parent.mv_mapTipOver(evt,'[^']);"
or
onmouseover="parent.mv_mapTipOver(evt,[^']);"
or
onmouseover="parent.mv_mapTipOver(evt,'[^']+');"
And many more. I cannot find the regular expression that will work.
Any/all help would be appreciated.
Thanks a ton!
"." and "(" have special meaning in regular expressions, so you need to escape them:
onmouseover="parent\.mv_mapTipOver\(evt,'[^']+'\);"
I'm not sure if this is correct dreamweaver regex syntax, but this stuff is standard enough.
Try this one:
onmouseover="parent\.mv_mapTipOver\(evt,'.+?'\);"
When using regular expressions you have to be very careful about how you handle white space. For example, the following piece of code will not be caught by most of the regular expressions mentioned so far because of the space after the comma and the equals sign, even though it is most likely valid syntax in the language you are using.
onmouseover= "parent.mv_mapTipOver(evt, 'Walker');"
To create a regexp that ignores white space, you must insert \s* everywhere in the regexp that white space might occur.
The following regexp should work even if there is additional white space in your code.
onmouseover\s*=\s*"parent\.mv_mapTipOver\(\s*evt\s*,\s*'[A-Za-z]+'\s*\);"
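To check the whitespace-tolerant pattern outside Dreamweaver, here is a rough Python sketch. The area tags are invented sample markup and strip_handlers is a made-up helper name; only the pattern itself comes from the answer.

```python
import re

# The whitespace-tolerant pattern, split across adjacent string literals
# because it contains both single and double quotes.
HANDLER = re.compile(
    r'onmouseover\s*=\s*"parent\.mv_mapTipOver\(\s*evt\s*,\s*'
    r"'[A-Za-z]+'"
    r'\s*\);"'
)


def strip_handlers(html: str) -> str:
    """Remove every matching onmouseover attribute from the markup."""
    return HANDLER.sub("", html)


# Hypothetical markup: one tightly written handler, one with extra spaces.
tight = """<area onmouseover="parent.mv_mapTipOver(evt,'Wilson');" shape="rect">"""
loose = """<area onmouseover= "parent.mv_mapTipOver(evt, 'Walker');" shape="rect">"""

print(strip_handlers(tight))
print(strip_handlers(loose))
```

Both lines lose their handler. A pattern without the \s* pieces would only catch the first one.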

Find Acronym with Regular Expression Dreamweaver

I have a 2000-page website that contains over 500 acronyms. What regular expression could I use to find all the acronyms in the text only? I'm using Dreamweaver. Some examples would be AFD, GTDC, IJQW and so on. These are 2 or more capitals that might be bounded or surrounded by other characters, for example (DFT) or l'WQF. Any ideas?
If Dreamweaver has search via grep capability, you could just search for any string of all-capital letters, including whatever punctuation you need, e.g. [A-Z'-]{3,}. The 3 is the minimum number of characters in the acronym; you can change that as needed.
This would probably be better done via shell script, though, just for speed's sake. Let us know what OS you're using and someone else can leave a comment as to how to script that, as I probably don't know.
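As one possible scripted version, here is a minimal Python sketch that counts runs of two or more capitals. It assumes the page text has already been extracted from the HTML, and it uses a plain \b[A-Z]{2,}\b pattern (no apostrophes or hyphens) so that (DFT) and l'WQF yield just the letters; the sample sentence is made up.

```python
import re
from collections import Counter

# Two or more consecutive capitals, delimited by word boundaries so that
# surrounding punctuation such as "(" or "'" is not part of the match.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")


def count_acronyms(text: str) -> Counter:
    """Return how often each all-caps token occurs in the text."""
    return Counter(ACRONYM.findall(text))


# Hypothetical sample text.
sample = "The AFD met the GTDC (see DFT), and l'WQF cited the AFD again."
print(count_acronyms(sample))
```

Ordinary capitalised words like "The" are skipped because only their first letter is upper case. All-caps words that are not acronyms would still need to be filtered by hand.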

How to enclose text patterns within xml elements, except when it is already inside a certain xml element?

I have several thousand xml files generated from java properties files prepared for translation in the TTX format. They contain quite a few variables, that I need to protect from the translators, as they often break such things. The variables are in the form of numbers or occasionally text between a pair of curly braces eg. {0}, {this}.
I need to surround these variables with an xml element if they are not already an attribute and if they are not already part of the inner text of a ut element, like so:
<ut DisplayText="{0}"><{0}></ut>
My input looks like this:
<ut Type="start"DisplayText="string"><string></ut> text string {0}
<ut DisplayText="{1}"><{1}></ut> in:
<ut DisplayText="\n"><\n/></ut> {2}.
<ut Type="end" DisplayText="resource"></resource></ut>
The correct output should be this:
<ut Type="start"DisplayText="string"><string></ut> text string <ut DisplayText="{0}">{0}</ut>
<ut DisplayText="{1}"><{1}></ut> in:
<ut DisplayText="\n"><\n/></ut> <ut DisplayText="{2}">{2}</ut>.
<ut Type="end" DisplayText="resource"></resource></ut>
My initial approach was to use a regular expression to match the term in the braces and just build the xml elements around it with pattern substitution. This approach fails when the pattern is already present, as in the first code block above.
Previous find and replace patterns (in Notepad++):
Find
({[A-Za-z0-9]*})
Replace
<ut DisplayText="\1">\1</ut>
It is beginning to look like regex is not the right tool for the job, so I would like some suggestions on better approaches to take, different tools, or even just a more complete regex that may allow me to solve this quickly and repeatably.
Update: The problem turned out to be a little more complex than previously envisioned. It seems there are also a couple more things that needed protecting, involving some rather obscure syntax, mixing variables with text in what appears to be some kind of conditional statement. From memory:
{o,choice|1#1 error|1<{0,number,integer} errors}
Where "error" and "errors" are translatable and should not be protected. The simplest solution we have at present is to run the above regex, fix the odd few errors it creates, and then run a couple more normal find & replace passes for the more complex items. It could be abstracted out as regex, but right now there is not much point in doing that.
I appreciate the pointers to xslt and other editors with better regex support, in addition to the improved expressions offered. I will have a play with some of the options when time allows.
Let me know if my assumption is wrong, but from your example it seems you want to change text that is in {} and not in a <ut> element. To me this seems like an easy use of XSLT. Simply output UT elements as they are and process any text in between.
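The same idea (pass ut elements through unchanged, process only what lies between them) can also be sketched without XSLT. The Python below is not the XSLT approach itself but a regex alternation with a replacement function: the first alternative swallows a whole ut element so its braces are never seen by the second. It assumes each ut element sits on one line and that no attribute value contains ">"; the function names are made up, and the more complex choice syntax from the update is not handled.

```python
import re

# First alternative: a complete <ut ...>...</ut> element (kept as is).
# Second alternative: a {variable} that is outside any such element.
TOKEN = re.compile(r"<ut\b[^>]*>.*?</ut>|\{[A-Za-z0-9]*\}")


def protect(match: re.Match) -> str:
    text = match.group(0)
    if text.startswith("<ut"):
        return text  # already protected, leave untouched
    return f'<ut DisplayText="{text}">{text}</ut>'


def protect_variables(line: str) -> str:
    """Wrap bare {variables} in ut elements, skipping existing ut elements."""
    return TOKEN.sub(protect, line)


# Third input line from the question (raw string keeps the literal \n).
sample = r'<ut DisplayText="\n"><\n/></ut> {2}.'
print(protect_variables(sample))
```

Because already-wrapped variables are consumed by the first alternative, running the function twice on the same file does not double-wrap anything.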
Why not try using the expression
(?<=.){[A-Za-z0-9]+}(?=.$)
This would find the { with 1 or more letters or numbers and the } when this pattern follows the tag and any number of spaces AND is followed by any number of spaces and a line break.
I ended up using a combination of the Regex in the question and manually fixing the odd error that caused. It wasn't ideal but it was quicker than trying to find the perfect solution.