This question already has answers here:
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 9 years ago.
In my document I have <country> elements.
Between the <country> tags, I want to find anything except US and PR.
For example
<country>US</country> = ignore
<country>PR</country> = ignore
<country>UP</country> = match found
What I have is
Pattern = "<Country>(.*?[^USPR].*?)</Country>"
but this ignores strings made up only of the letters U, S, P, and R, such as UP.
I'm not sure how to write a pattern that rejects only the two values US and PR between the tags.
This should work.
It matches an opening <country> tag that is not followed by US or PR, then matches everything up to the closing </country> tag.
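A negative lookahead of the kind described here could look roughly like the Python sketch below. The pattern, the names `COUNTRY_NOT_US_PR` and `doc`, and the extra `USA` sample are illustrative assumptions, not the answer's original regex. This variant also anchors the lookahead to the closing tag, so only the exact values US and PR are rejected.

```python
import re

# Assumed pattern: reject the tag only when its whole content is US or PR.
COUNTRY_NOT_US_PR = re.compile(
    r"<country>(?!(?:US|PR)</country>)(.*?)</country>",
    re.IGNORECASE,  # the question mixes <country> and <Country>
)

# Hypothetical sample document built from the question's examples.
doc = (
    "<country>US</country>"
    "<country>PR</country>"
    "<country>UP</country>"
    "<country>USA</country>"
)

# findall returns the contents of capture group 1 for each match.
codes = COUNTRY_NOT_US_PR.findall(doc)
print(codes)
```

Without the `</country>` inside the lookahead, a value such as USA would be rejected too, because it starts with US.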
Try this one:
This question already has answers here:
Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
(10 answers)
Closed 3 years ago.
I know I have BeautifulSoup, but I want to try writing my own.
Regex I've been working on
<br>This Text needed
<a>unwanted text</a>
This text needed
<a >unwanted text</a>
This text needed
<a>unwanted text</a>
<br>this text needed
What I have come up with:
I want to match every "This text needed" line, but one of them isn't matching.
How about this, using a negative lookbehind and a negative lookahead:
(?<!<a>)This Text needed(?!<\/a>)
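As a rough check, the lookaround pattern above can be run against the sample text from the question. The `re.IGNORECASE` flag is an addition made here, because the sample mixes "This Text", "This text" and "this text"; the names `sample` and `NEEDED` are illustrative.

```python
import re

sample = """<br>This Text needed
<a>unwanted text</a>
This text needed
<a >unwanted text</a>
This text needed
<a>unwanted text</a>
<br>this text needed
"""

# Negative lookbehind: not directly preceded by "<a>".
# Negative lookahead: not directly followed by "</a>".
NEEDED = re.compile(r"(?<!<a>)this text needed(?!</a>)", re.IGNORECASE)

matches = NEEDED.findall(sample)
print(matches)
```

Note that the lookbehind only recognises the literal `<a>`; an opening tag written as `<a >` or carrying attributes would not be caught, which is one reason a parser is usually the safer choice.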
This question already has answers here:
What is the best way to parse html in C#? [closed]
(15 answers)
Closed 5 years ago.
I have a requirement where I must not match a specific word when it occurs inside an anchor tag. Anchor tags can have other HTML tags nested inside them.
For Example:
<a title="Test" href=""><span style="color: blue;">Test</span></a><p>Test - MANUALLY<br /><br />Google </p><p> Resolving as duplicate of Test</p><p>Test test</p>
Here every "Test" gets selected. I only want the occurrences of "Test" that are neither inside an anchor tag nor part of an anchor tag's attributes.
Regex I used was:
Not sure if this will accomplish your needs, but the second capturing group should only include matches that do not fall within the anchor tag.
However, I would highly recommend utilizing an XML parser or XPath.
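To illustrate the parser route, here is a minimal sketch using Python's standard-library `html.parser` rather than C# or XPath. The class name `OutsideAnchorText` and the variable names are hypothetical. It collects only text that sits outside `<a>` elements and then searches that text for the word.

```python
import re
from html.parser import HTMLParser


class OutsideAnchorText(HTMLParser):
    """Collects text nodes that are not nested inside an <a> element."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1

    def handle_data(self, data):
        # Attribute values never reach handle_data, so title="Test" is skipped.
        if self.anchor_depth == 0:
            self.chunks.append(data)


html = (
    '<a title="Test" href=""><span style="color: blue;">Test</span></a>'
    "<p>Test - MANUALLY<br /><br />Google </p>"
    "<p> Resolving as duplicate of Test</p><p>Test test</p>"
)

parser = OutsideAnchorText()
parser.feed(html)
parser.close()

# Case-sensitive whole-word search in the text outside anchors.
outside_matches = [m for chunk in parser.chunks for m in re.findall(r"\bTest\b", chunk)]
print(len(outside_matches))
```

Because the parser tracks nesting, the `<span>` inside the anchor is excluded without any lookaround tricks.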
This question already has answers here:
Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
(10 answers)
Closed 5 years ago.
I have an XML file with this data format:
<row Id="9" Body="aaaaaaaaa" Target="123456" />
I want to find every Body="..." attribute in my XML file and replace its value with a space. What is the regex for that?
There are many possibilities; here is one way to remove the content of the Body attribute:
This creates two capturing groups for the content before and after the Body attribute. Then, you just use those capturing groups for the replacement:
It will transform:
<row Id="9" Body="aaaaaaaaa" Target="123456" />
into:
<row Id="9" Body="" Target="123456" />
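The two-capturing-group idea can be sketched in Python as follows. The exact pattern is an assumption (the answer's original regex is not shown), and `BODY_ATTR`, `row` and `cleared` are illustrative names.

```python
import re

# Group 1: everything up to and including Body="
# Group 2: the closing quote and the rest of the tag
BODY_ATTR = re.compile(r'(<row\b[^>]*?\bBody=")[^"]*("[^>]*>)')

row = '<row Id="9" Body="aaaaaaaaa" Target="123456" />'

# Re-emit both groups and drop the attribute value between them.
cleared = BODY_ATTR.sub(r"\1\2", row)
print(cleared)
```

To write a space instead of an empty value, as the question asks, the replacement could be `r"\1 \2"`.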
This question already has answers here:
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 8 years ago.
I'm trying to write a regular expression to see if a string contains any of the typical table tags:
along with tags that may contain attributes, e.g.:
<table border="1">
I've come up with this so far; however, it also matches the <br /> tag, and I'm not sure why:
Regular expressions use parentheses, not square brackets, to group things. A set of characters inside square brackets matches any of those characters.
When you want to match 1 or more of something, use + rather than {1,}.
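The difference between a character class and a group can be seen in a short Python sketch. Both patterns are assumptions: `bracket_pattern` is a hypothetical reconstruction of the kind of regex that would match `<br />`, and the tag list (table, tr, td, th) is a guess at the "typical table tags" meant in the question.

```python
import re

# Hypothetical broken pattern: [...] is a character class, so this accepts
# any run of the characters t, a, b, l, e, |, r, d, h after "<".
bracket_pattern = re.compile(r"</?[table|tr|td|th]{1,}[^>]*>")

# Grouped alternation: only the listed tag names, ending at a word boundary.
group_pattern = re.compile(r"</?(?:table|tr|td|th)\b[^>]*>")

print(bool(bracket_pattern.search("<br />")))            # "b" and "r" are both in the class
print(bool(group_pattern.search("<br />")))
print(bool(group_pattern.search('<table border="1">')))
```

The `\b` after the group keeps the pattern from accepting longer names that merely start with one of the listed tags.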
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
RegEx match open tags except XHTML self-contained tags
How to remove single attribute with quotes via RegEx
I am trying to remove the "sfref" attribute from the html code below:
<a sfref="[Libraries]719c25f9-89b3-4a7c-b6d5-e734b0c06ac1" href="../../HPLC.sflb.ashx">Determination</a> <br />
<img sfref="[Libraries]3e60aebb-acac-4806-bd22-f7986f66e7b3" src="../../Note52011.sflb.ashx">Test</a><br />
So far I have come up with this regex, but it is not matching:
Can someone please help me remove the "sfref" attribute?
You really, really, really shouldn't use regex (see the link in @Jack Maney's comment), but if you have to, this should work:
This will work for single or double quotes.
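One way to handle both quote styles is to capture the opening quote and require the same character to close the value. The Python sketch below is an assumption, not the answer's original regex; `SFREF_ATTR` and the sample variables are illustrative names.

```python
import re

# Capture the opening quote in group 1 and require the same quote (\1) to close.
SFREF_ATTR = re.compile(r"""\s+sfref\s*=\s*(["']).*?\1""")

double_quoted = (
    '<a sfref="[Libraries]719c25f9-89b3-4a7c-b6d5-e734b0c06ac1" '
    'href="../../HPLC.sflb.ashx">Determination</a> <br />'
)
# Hypothetical single-quoted variant, added only to exercise the other branch.
single_quoted = "<img sfref='[Libraries]3e60aebb' src='../../Note52011.sflb.ashx'>"

print(SFREF_ATTR.sub("", double_quoted))
print(SFREF_ATTR.sub("", single_quoted))
```

The leading `\s+` is removed together with the attribute, so no double space is left behind in the tag.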