<setting name="Media.MediaLinkServerUrl" value=" "/>
The regex to select whatever is between value=" and "/> is:
$regex = '(?<=<setting name="Media\.MediaLinkServerUrl" value=")[^"]*'
It works well!
But what if the line is:
<setting name="Media.MediaLinkServerUrl" value=" \/>
I tried:
$regex = '(?<=<setting name="Media\.MediaLinkServerUrl" value=")[^\]*'
$regex = '(?<=<setting name="Media\.MediaLinkServerUrl" value=")[^\\]*'
but neither works!
I know \ is a reserved regex character, but [^.]* or [^|]* for example work well.
So how do I select the text between value=" and \/> with a regex in PowerShell in the example above?
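For comparison, here is a minimal sketch of the same pattern in Python rather than PowerShell, so it says nothing about PowerShell quoting. It only illustrates that inside a character class the backslash is written as \\, and that the lookbehind from the question is fixed-width, which Python's re module requires. The sample line is the one from the question.

```python
import re

# Sample line from the question; the value is terminated by a backslash.
line = '<setting name="Media.MediaLinkServerUrl" value=" \\/>'

# [^\\]* matches any run of characters that are not a backslash.
pattern = re.compile(
    r'(?<=<setting name="Media\.MediaLinkServerUrl" value=")[^\\]*'
)

match = pattern.search(line)
value = match.group(0) if match else None
print(repr(value))
```

Here the match is the single space between value=" and the backslash.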
A Big thx!
P.S. I can't comment or ask on the original post because I don't have 50 reputation, sorry.
I am using
I have a JSON file that I'd like to run spellChecker on and it looks like this:
"abc.editGroupsMaxLengthError": "Maximum {{charLen}} characters"
I would like to know how all words between {{ and }} can be ignored by the spellchecker.
I tried with
as documented here to ignore regex.
but it doesn't seem to use }} or {{ for some reason.
How can this be fixed?
You can wrap your {{...}} substrings with <!-- spellchecker-disable --> / <!-- spellchecker-enable --> tags, see this GitHub issue.
So, make sure your JSON looks like
"abc.editGroupsMaxLengthError": "Maximum <!-- spellchecker-disable -->{{charLen}}<!-- spellchecker-enable --> characters"
And the result will be
C:\Users\admin\Documents\1>spellchecker spellchecker -f spellchecker_test.json
Spellchecking 1 file...
spellchecker_test.json: no issues found
To wrap the {{...}} strings in a certain file in Windows you could use PowerShell, e.g., for a spellchecker_test.json file:
powershell -Command "& {(Get-Content spellchecker_test.json -Raw) -replace '(?s){{.*?}}','<!-- spellchecker-disable -->$&<!-- spellchecker-enable -->' | Set-Content spellchecker_test.json}"
In *nix, Perl is preferable:
perl -0777 -i -pe 's/\{\{.*?}}/<!-- spellchecker-disable -->$&<!-- spellchecker-enable -->/gs' spellchecker_test.json
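The same wrapping step can be sketched in Python as well. This is only an illustration of the substitution, not part of the spellchecker tool; the function name wrap_placeholders is made up for this example.

```python
import re

DISABLE = "<!-- spellchecker-disable -->"
ENABLE = "<!-- spellchecker-enable -->"

# Non-greedy match of each {{...}} placeholder; DOTALL lets one span several lines.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def wrap_placeholders(text: str) -> str:
    """Surround every {{...}} placeholder with disable/enable comments."""
    return PLACEHOLDER.sub(lambda m: f"{DISABLE}{m.group(0)}{ENABLE}", text)


sample = '"abc.editGroupsMaxLengthError": "Maximum {{charLen}} characters"'
print(wrap_placeholders(sample))
```

To rewrite a file, read its content, pass it through wrap_placeholders, and write the result back.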
I'm trying to write a bash script that will change the fill color of certain elements within SVG files. I'm inexperienced with shell scripting, but I'm good with regexes (JS).
Here's the SVG tag I want to modify:
<!-- this path is the target because its ID is exactly "the.target" -->
<path id="the.target" d="..." style="fill:#000000" />
Here's the bash code I've got so far:
local newSvg="" # will hold newly-written SVG file content
while IFS="<$IFS" read tag
do
    if [[ "${tag}" =~ +id *= *"the\.target" ]]; then
        tag=$(echo "${tag}" | sed 's/fill:[^;];/fill:${color};/')
    fi
done < ${iconSvgPath} # iconSvgPath is an argument to the script
Explained: I'm using read (splitting the file on < via custom IFS) to read the SVG content tag by tag. For each tag, I test to see if it includes an id property with the exact value I want. If it doesn't, I add this tag as-is to a newSvg string that I will later write to a file. If the tag does have the desired ID, I'll use sed to replace fill:STUFF; with fill:${myColor};. (Note that my sed is also failing, but that's not what I'm asking about here.)
It fails to find the right line with the test [[ "${tag}" =~ +id *= *"the\.target" ]].
It succeeds if I change the test to [[ "${tag}" =~ \"the\.target\" ]].
I'm not happy with the working version because it's too brittle. While I don't intend to support all the flexibility of XML, I would like to be tolerant of semantically irrelevant whitespace, as well as the id property being anywhere within the tag. Ideally, the regex I'd like to write would express:
id (preceded by at least one whitespace)
followed by zero or more whitespaces
followed by =
followed by zero or more whitespaces
followed by "the.target"
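The pattern described in the list above can be sketched outside bash, which sidesteps the quoting question entirely. This is a Python illustration only, assuming the target ID is "the.target" as in the test shown earlier; has_target_id is a name invented for this example.

```python
import re

# whitespace, "id", optional whitespace around "=", then the exact quoted value
ID_PATTERN = re.compile(r'\s+id\s*=\s*"the\.target"')


def has_target_id(tag: str) -> bool:
    """Return True if the tag text carries id="the.target"."""
    return ID_PATTERN.search(tag) is not None


print(has_target_id('path id="the.target" d="..." style="fill:#000000" /'))
print(has_target_id('path d="..."   id = "the.target" /'))
print(has_target_id('path id="the-target" d="..." /'))
```

The escaped dot keeps "the-target" from matching, and the leading \s+ keeps attributes such as myid from matching.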
I think I'm not delimiting the regex properly inside the [[ ... =~ REGEX ]] construction, but none of the answers I've seen online use any delimiters whatsoever. In JavaScript, regex literals are bounded (e.g. / +id *= *"the\.target"/), so it's straightforward to begin a regex with a whitespace character that you care about. Also, JS doesn't have any magic re: *, whereas bash is 50% magic-handling-of-asterisks.
Any help is appreciated. My backup plan is maybe to try to use awk instead (which I'm no better at).
EDIT: My sed was really close. I forgot to add + after the [^;] set. Oof.
It is much easier if you define the regular expression pattern in a variable:
tag=' id = "the.target"'
pattern=' +id *= *"the\.target"'
if [[ $tag =~ $pattern ]]; then
    echo matched.
fi
Thank you for giving us such a clear example that regex is not the way to solve this problem.
An SVG file is an XML file, and a possible tool to modify these is xmlstarlet.
Try this script I called modifycolor:
#!/bin/bash
# invoke as: modifycolor <svg.file> <target_id> <new_color>
xmlstarlet edit \
  --update "//path[@id = '$2']/@style" --value "fill:#$3" \
  "$1"
Assuming the svg file is test.svg, invoke it as:
./modifycolor test.svg the.target ff0000
You will be astonished by the result.
If you want to paste a piece of code inside your bash script, try this:
newSvg=$(xmlstarlet edit \
  --update "//path[@id = '${target}']/@style" --value "fill:#${myColor}" \
  "${iconSvgPath}")
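If xmlstarlet is not available, the same XML-aware idea can be sketched with Python's standard library. This is a rough illustration, not a drop-in replacement: set_fill is a hypothetical helper, it overwrites the whole style attribute just as the xmlstarlet command does, and it assumes an SVG without a default namespace (with one, ElementTree would add ns0: prefixes on output unless a namespace is registered first).

```python
import xml.etree.ElementTree as ET


def set_fill(svg_text: str, target_id: str, color: str) -> str:
    """Set style="fill:#<color>" on every element whose id equals target_id."""
    root = ET.fromstring(svg_text)
    for element in root.iter():
        if element.get("id") == target_id:
            element.set("style", f"fill:#{color}")
    return ET.tostring(root, encoding="unicode")


svg = (
    '<svg>'
    '<path id="the.target" d="M0 0" style="fill:#000000" />'
    '<path id="other" d="M1 1" style="fill:#000000" />'
    '</svg>'
)
print(set_fill(svg, "the.target", "ff0000"))
```

Because the document is parsed rather than pattern-matched, attribute order and whitespace inside the tag no longer matter.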
Thanks to folks for pointing out the mistakes in my bash-fu, I came up with this code which does what I said I wanted. I will not be marking this as the accepted answer because, as folks have observed, regex is a bad way to operate on XML. Sharing this for posterity.
local newSvg="" # will hold newly-written SVG code
while IFS="<$IFS" read tag
do
    if [[ "${tag}" =~ \ +id\ *=\ *\"the\.target\" ]]; then
        tag=$(echo "${tag}" | sed -E 's/fill:[^;]+;/fill:'"${color}"';/')
    fi
done < ${iconSvgPath}
escape the whitespace in the regex: =~ \ +id\ *=\ *
for sed, switch to double-quotes for the variable in the pattern
also for sed, I added the -E extended regex flag so that the + quantifier after the negated set [^;] works unescaped
Re: XML, I'll be comparing the list of available CLI-friendly XML parsers to the set of tools commonly available on my users' machines.
Having an input such as:
./Tomcatv8.1/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=MMMABC" />
./Tomcatv8.2/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=MMMABC_New" />
./Tomcatv8.3/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=ABC_20170407_STG" />
./Tomcatv8.5/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=UPGABC_New" />
I want to colorize the database name.
I used
grep --color=auto -E "[a-zA-Z0-9_]+\""
It works quite well, except that it also highlights the final " sign, which is used as a boundary in my regexp.
How can I highlight just the database name?
You may wrap the " in a lookahead and use a PCRE regex with grep:
grep --color=auto -P "[a-zA-Z0-9_]+(?=\")"
                  ^^               ^^^^^^
See the regex demo
The (?=\") only checks if the text matches the pattern, but the value is not added to the resulting match. See more about lookarounds in regex here.
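The effect of the lookahead can be checked with a short Python sketch. Python's re module supports the same (?=...) syntax; the input lines are the ones from the question.

```python
import re

lines = [
    './Tomcatv8.1/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=MMMABC" />',
    './Tomcatv8.2/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=MMMABC_New" />',
    './Tomcatv8.3/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=ABC_20170407_STG" />',
    './Tomcatv8.5/projects.xml: <jdbc url="jdbc:sqlserver://localhost:1433;databaseName=UPGABC_New" />',
]

# The quote must follow the name, but (?=") keeps it out of the match itself.
pattern = re.compile(r'[a-zA-Z0-9_]+(?=")')

names = [pattern.search(line).group(0) for line in lines]
print(names)
```

Each match ends right before the closing quote, which is why grep no longer colors it.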
I am trying to strip every <math>.*?</math> span from a file. It is easy when a span sits on a single line, but how do I do it when spans cover several lines, and when one line may contain several tags or only half of a pair?
I prepared some test text from Wikipedia to show the problem:
: <math>A =
a_{1,1} & a_{1,2} & \dots \\
a_{2,1} & a_{2,2} & \dots \\
\vdots & \vdots & \ddots
</math> oraz <math>B =
b_{1,1} & b_{1,2} & \dots \\
b_{2,1} & b_{2,2} & \dots \\
\vdots & \vdots & \ddots
B_1 \\
B_2 \\
We discussed the problem on Stack Overflow and received a good solution, but it does not work if a line contains a closing tag followed by a new opening tag, like </math> oraz <math>. Such input is valid, since the tags are paired, but the solution fails on it.
I am not an expert in awk, sed or perl; I only know regex very well.
Perl suggestion (not working on this example):
cat dirt-math-2.txt | perl -wlne '
unless(((/.*<math>/../<\/math>/)||0) > 1){s/<math>//;print}
' | less
Awk suggestion (not working on this example):
cat dirt-math-2.txt | awk '
sub(/<math>.*/, "") {print; cut=1}
/<\/math>/ {cut=0; next}
!cut' | less
The file to parse is the whole Polish-language Wikipedia, so it needs to be parsed without loading 6 GB into memory. Thank you in advance for any suggestion. I asked a similar question before, but it is not the same.
Here's a Perl solution. It works by accumulating lines from the file into a buffer $text and then removing all <math>...</math> pairs. If the resulting buffer has no opening <math> tag then it is printed and emptied. That way, text from the file will only be stored in memory until it has no unpaired <math> tags, and normally it will contain only a single line of input.
The program expects the path to the input file as a parameter on the command line. It has been tested against your sample data in this and your previous questions, and works fine.
use strict;
use warnings;

my $text;

while ( <> ) {
  $text .= $_;
  $text =~ s/<math>.*?<\/math>//sg;
  if ( $text !~ /<math>/ ) {
    print $text;
    $text = '';
  }
}
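The buffering idea behind this Perl answer can also be sketched in Python. This is an illustration of the same technique, not a tested replacement for the Perl program; strip_math is a name chosen for this example, and flushing a leftover buffer at end of input is a choice of this sketch rather than something the Perl code does.

```python
import re
from typing import Iterable, Iterator

MATH_PAIR = re.compile(r"<math>.*?</math>", re.DOTALL)


def strip_math(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text with complete <math>...</math> pairs removed,
    holding lines in memory only while an opening tag is unpaired."""
    buffer = ""
    for line in lines:
        buffer += line
        buffer = MATH_PAIR.sub("", buffer)
        if "<math>" not in buffer:
            yield buffer
            buffer = ""
    if buffer:
        # Unterminated <math> at end of input: emitted unchanged in this sketch.
        yield buffer


sample = ["x<math>A\n", "a\n", "</math>y<math>B\n", "b\n", "</math>z\n"]
print("".join(strip_math(sample)), end="")
```

Passing an open file object as lines keeps memory use bounded by the longest run of lines inside an unpaired <math> tag.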
A way with sed:
sed -r ':a;/<math>/{:b;s!<math>([^<]|<[^/]|</[^m]|</m[^a]|</ma[^t]|</mat[^h]|</math[^>])*</math>!!g;ta;N;bb;}' file
:a;                  # defines the label "a"
/<math>/ {           # condition: if the pattern space contains "<math>"
    :b;              # defines the label "b"
    # try to replace (the ugly alternation emulates a non-greedy quantifier)
    s!<math>([^<]|<[^/]|</[^m]|</m[^a]|</ma[^t]|</mat[^h]|</math[^>])*</math>!!g;
    ta;              # if something is replaced go to label "a"
    N;               # else append the next line to the pattern space
    bb;              # and go to label "b"
}
I'm trying to reformat some data that comes out badly when I copy text from a PDF.
Framing / Sheathing tools
Framing / Sheathing tools
Framing / Sheathing tools
I want to have it formatted like this:
Cordless 9B12071R CHARGER, 3.6V,LI-ION
What I'm trying to do is a find and replace that replaces the first two newlines "\n" with a tab "\t" and leaves the third "\n" intact.
The first thing I do is replace all "\n" with "\t" which is easy. After that, I want to replace the third "\t" with "\n". How would I do that using regex?
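The "replace every third tab" step can be sketched with a regex in Python. This is only an illustration and assumes that no field contains a tab of its own; every_third_tab_to_newline is a name made up for this example.

```python
import re

# Three tab-free fields separated by two tabs, followed by the third tab.
THIRD_TAB = re.compile(r"([^\t]*\t[^\t]*\t[^\t]*)\t")


def every_third_tab_to_newline(text: str) -> str:
    """Turn every third tab back into a newline, keeping the first two."""
    return THIRD_TAB.sub(r"\1\n", text)


flattened = "Cordless\t9B12071R\tCHARGER, 3.6V,LI-ION\tnext\trow\there\t"
print(every_third_tab_to_newline(flattened), end="")
```

Each match consumes one full record, so the next search starts at the first field of the following record.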
For EditPadPro, paste this into the Search box
([A-Za-z /]+)
Paste this into the Replace box
\1 \2 \3
And that should do it. Basically you can add carriage returns and tabs using Ctrl+Enter and Ctrl+Tab in EditPadPro.
I had to add a carriage return to your text in the question as it's missing the last line I think. All the others are in triples of data.
Alright, here is the PHP code that does exactly what you want:
$s = "Cordless
$p = '/(Cordless.*?)\\n(.+?)\\n(CHARGER.+?)(\\n|$)/s';
$r = '\\1' . "\t" . '\\2' . "\t" . '\\3' . "\n";
echo preg_replace($p, $r, $s);
>php -q regex.php
Cordless 9B12071R CHARGER, 3.6V,LI-ION
Is this a regex job or can you rely on the line number?
$ perl -nE 'chomp; print $_, $.%3? "\t": "\n"' file
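The same line-number idea, sketched in Python for comparison: group the input three lines at a time and join each group with tabs. It assumes, as the Perl one-liner does, that every record is exactly three lines long; join_triples is a name invented for this example.

```python
from typing import Iterable, List


def join_triples(lines: Iterable[str]) -> List[str]:
    """Join each run of three input lines into one tab-separated record."""
    cleaned = [line.rstrip("\n") for line in lines]
    return ["\t".join(cleaned[i:i + 3]) for i in range(0, len(cleaned), 3)]


records = join_triples(["Cordless\n", "9B12071R\n", "CHARGER, 3.6V,LI-ION\n"])
print("\n".join(records))
```

If the input is not a multiple of three lines, the last record simply has fewer fields.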
EDIT (after comment)
If you have to do this in an editor, then this works in vim:
The important bit here is the assumption that a line that consists entirely of A-Z, 0-9 and - constitutes a part number. ^I is a tab, you type tab and vim prints ^I. (I hope your editor has this many steroids!)