Regex to delete js HTML attributes

I've got this file from Google that has js attributes such as jsname="data", jscontroller="data", etc.
I'd like to use the regex option of Atom's find and replace to replace all attributes of the form js*="*" with blanks.
What would the regex for this be?
So <div class="l-o-c-qd" jsname="name" jscontroller="somecontroller">Text</div>
will be <div class="l-o-c-qd">Text</div>

Search for a regex corresponding to js*="*" and replace it with nothing. Include the whitespace before each attribute in the match, so that no double spaces are left behind after the replacement.
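As a minimal sketch of that idea, one candidate pattern is \s+js[\w-]*="[^"]*", which consumes the whitespace in front of each attribute. The same pattern can be pasted into Atom's find field with the regex option enabled and an empty replacement. The Python wrapper below only demonstrates the result on the example from the question; the function name strip_js_attributes is made up for illustration. It assumes attribute values are double-quoted, and it would also remove matching text that happens to appear outside of a tag.

```python
import re

# Whitespace, then an attribute name starting with "js", then a double-quoted value.
JS_ATTR = re.compile(r'\s+js[\w-]*="[^"]*"')


def strip_js_attributes(html: str) -> str:
    """Remove every js*="..." attribute together with its leading whitespace."""
    return JS_ATTR.sub("", html)


source = '<div class="l-o-c-qd" jsname="name" jscontroller="somecontroller">Text</div>'
cleaned = strip_js_attributes(source)
print(cleaned)
```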

Related

Notepad++ ( perl ) regex match multiple line pattern

I want to remove a div from a couple hundred html files
<div id="mydiv">
blahblah blah
more blah blah
more html
<some javascript here too>
</div>
I thought that this would do the job but it doesn't
<div(.*)</div>
Does anyone know which is the proper regex for this?
Regex
<div[^>]+>(.*?)</div>
Don't forget to check the ". matches newline" option.
Alternatively, you can also use this regex: <div[^>]+>([\s\S]*?)</div>, which works with or without the checkbox checked.
Discussion
Since the * quantifier is greedy, you need to tell it to match as few characters as possible (by appending ?).
Check that the divs you want to remove DO NOT contain nested divs. If they do, the regex at the start of my answer won't help you.
If you face this case, I'd suggest using an HTML parser.
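To make the greedy/lazy difference concrete, here is a small Python sketch (Python stands in for Notepad++ here; re.DOTALL plays the role of the ". matches newline" checkbox). The sample page text is invented for the demonstration.

```python
import re

# Invented sample: two unrelated divs with a paragraph in between.
page = (
    '<p>keep</p>\n'
    '<div id="mydiv">\nblahblah blah\nmore html\n</div>\n'
    '<p>also keep</p>\n'
    '<div id="other">x</div>'
)

lazy = re.compile(r'<div[^>]+>(.*?)</div>', re.DOTALL)
greedy = re.compile(r'<div[^>]+>(.*)</div>', re.DOTALL)

# Lazy: each div is matched on its own, so the paragraph between them survives.
lazy_result = lazy.sub('', page)
# Greedy: one match runs from the first <div ...> to the last </div>.
greedy_result = greedy.sub('', page)

print(repr(lazy_result))
print(repr(greedy_result))
```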
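If a full parser is not an option, nested divs can still be handled by counting opening and closing tags instead of relying on a single pattern. The following is only a sketch of that idea, not a parser: the helper name remove_div and the sample markup are hypothetical, and it can be fooled by text such as "<div" inside comments or scripts.

```python
import re

# Matches any opening or closing div tag; group 1 is "/" for a closing tag.
DIV_TAG = re.compile(r'<(/?)div\b[^>]*>', re.IGNORECASE)


def remove_div(html: str, opening_pattern: str) -> str:
    """Remove the first div whose opening tag matches opening_pattern,
    including any divs nested inside it."""
    start = re.search(opening_pattern, html)
    if start is None:
        return html
    depth = 1  # we are inside the div that was just opened
    for tag in DIV_TAG.finditer(html, start.end()):
        depth += -1 if tag.group(1) else 1
        if depth == 0:
            return html[:start.start()] + html[tag.end():]
    return html  # unbalanced markup: leave the input untouched


sample = '<p>a</p><div id="mydiv">x<div>inner</div>y</div><p>b</p>'
stripped = remove_div(sample, r'<div id="mydiv"[^>]*>')
print(stripped)
```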

Regex match for contents of <li> element

I have the following content
<li>Title: [...]</li>
and I'm looking for regex that will match and replace this so that I can parse it as XML. I'm just looking to use a regex find and replace inside Sublime Text 2, so I want to match everything in the above example except for the [...] which is the content.
Why not extract the content and use it to build the XML, rather than trying to mold the wrapper of the content into XML? (Or am I misunderstanding you?)
<li>Title: ([^<]*)<\/li>
is the regular expression to extract the content.
It's pretty self-explanatory other than the [^<]*, which means: match any number of characters that are not a "<".
I don't know Sublime, but something like this should suffice to get you the contents of the li. It allows for optional extra attributes on the tag. Make sure to turn off case sensitivity, in case of LI or Li, etc. (lifted straight from http://www.regular-expressions.info/examples.html):
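A short Python sketch of the extract-and-rebuild approach, assuming a made-up input line and a hypothetical <title> element as the XML target (neither appears in the question):

```python
import re

TITLE_LI = re.compile(r'<li>Title: ([^<]*)</li>')

line = '<li>Title: Some heading</li>'  # invented sample input

# Group 1 holds everything between "Title: " and the closing tag.
match = TITLE_LI.search(line)
content = match.group(1) if match else None

# Rebuild the line around the captured content; <title> is only an example.
as_xml = TITLE_LI.sub(r'<title>\1</title>', line)

print(content)
print(as_xml)
```

In an editor's find and replace, the replacement side would reference the same capture group, typically written as $1 or \1 depending on the editor.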
<li\b[^>]*>(.*?)</li>
<li>\S*(.*)?</li>
That should match your string, with the content being capturing group 1.
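For comparison, here is how the attribute-tolerant pattern <li\b[^>]*>(.*?)</li> behaves with case sensitivity turned off. This is a Python sketch with invented sample lines; re.IGNORECASE corresponds to disabling case sensitivity in the editor.

```python
import re

LI = re.compile(r'<li\b[^>]*>(.*?)</li>', re.IGNORECASE)

# Invented samples: a plain tag and an upper-case tag with an attribute.
samples = [
    '<li>Title: first</li>',
    '<LI class="x">Title: second</LI>',
]

contents = [LI.search(s).group(1) for s in samples]
print(contents)
```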

REGEX Pattern - How do I match upto a certain tag in html

I have some HTML which I want to grab between two tags. However, nested tags exist in the HTML, so looking for the closing div tag wouldn't work, as the match would stop at the first nested div.
Basically I want my regex to..
Match some text literally, followed by ANY character up to another literal text string. So my question is: how do I get [^<]* to continue matching until it sees the next div?
such as
<div id="test"[^<]*<div id="test2"
Example html
<div id="test" class="whatever">
<div class="wrapper">
<fieldset>Test</fieldset><div class="testclass">some info</div>
</div>
<!-- end test div--></div>
</div>
<div id="test2" class="endFind">
In general, I suspect you want to look at "greedy" vs "lazy" in your regex, assuming that's supported by your platform/language.
For example, <div[^>]*>(.*?)</div> would make $1 match all the text inside a div, but would try to keep it as small as possible. Some people call *? a "lazy star".
But it seems you're looking to find the text within a div that is before the start of the first nested div. That would be something like <div[^>]*>(.*?)<div
Read up on greedy vs. lazy matching, and check that whatever language you're using supports it.
$ php -r '$text="<div>Test<div>foo</div></div>\n"; print preg_replace("/<div[^>]*>(.*?)<div.*/", "\$1", $text);'
Test
$
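Applied to the original question, the same lazy quantifier can run from one literal string up to the next. The sketch below uses Python rather than PHP and the example HTML from the question; re.DOTALL is needed because the text to capture spans several lines.

```python
import re

html = (
    '<div id="test" class="whatever">\n'
    '<div class="wrapper">\n'
    '<fieldset>Test</fieldset><div class="testclass">some info</div>\n'
    '</div>\n'
    '<!-- end test div--></div>\n'
    '</div>\n'
    '<div id="test2" class="endFind">'
)

pattern = r'<div id="test"(.*?)<div id="test2"'

# With DOTALL, "." also matches newlines, so the lazy match can reach test2.
m = re.search(pattern, html, re.DOTALL)
between = m.group(1)

# Without DOTALL, "." stops at the first newline and nothing matches.
no_dotall = re.search(pattern, html)

print(between)
print(no_dotall)
```

This grabs everything between the two literal markers regardless of nesting, but it depends on the second marker being present exactly as written.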
Regex is not capable of parsing HTML. If this is part of an application, you're doing something wrong. If you absolutely have to parse a document, use an HTML/XML parser.
If you're trying to screen scrape something and don't want to bother with a parser, look for identifying marks in the page you're scraping. For example, maybe the embedded div ends just before the one you want to match, so you could match </div></div> instead.
Alternatively, here's a regex that meets your requirements. However, it is very fragile: it will break if, for example, #test's children have children, or the html isn't valid, or I missed something, etc, etc ...
/<div id="test"[^<]*(<([^ >]+).+<\/\2>[^<]*)*<\/div>/

Get html content with RegEx between several tags

For example there are some html tags <div id="test"><div><div>testtest</div></div></div></div></div></div>
From that html, I need to get this <div id="test"><div><div>testtest</div></div></div>
Current regex /<div id=\"test\">.*(</div>){3}/gim
Since you have the specific requirement of needing exactly three closing tags, this regular expression should do the trick:
(<div.*?>)+.*?(</div>){3}
The trick here is to use the lazy star (*?) to keep the catch-all (.) character from matching more than you'd like.
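A quick check of that pattern in Python against the string from the question:

```python
import re

html = '<div id="test"><div><div>testtest</div></div></div></div></div></div>'

# Opening tags, then as little text as possible, then exactly three closing tags.
three_closed = re.search(r'(<div.*?>)+.*?(</div>){3}', html)
matched = three_closed.group(0)

print(matched)
```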

Smarty replace text with double quotes

I have the following string in the smarty (php templating system) variable $test:
<img height="113" width="150" alt="Sunset" src="/test.jpg"/>
I want to add "em" to the height and width like this:
{$test|replace:'" w':'em" w'|replace:'" a':'em" a'}
But this doesn't work... What's the problem and the solution?
You do know ‘em’ units in HTML width/height attributes aren't valid, right? That's CSS only.
My regex isn't the greatest, or I'd give you a better matcher, but maybe running what you have through regex_replace would work.
{$test|regex_replace:'/".w/':'em" w'|regex_replace:'/".a/':'em" a'}
Other matchers to try:
'/\".w/'
'/".*w/'
'/\".*w/'
I can't play with my Smarty sites at the moment, but I'd first remove the " from the replacement value to see if the bug is there, then remove it from the matcher and just look for height/width.
Otherwise, I'd do the replace in PHP if you can.
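The "just look for height/width" idea, done outside the template, might look like the sketch below. It is written in Python rather than PHP purely for illustration, and it assumes the attribute values are plain digits in double quotes.

```python
import re

test = '<img height="113" width="150" alt="Sunset" src="/test.jpg"/>'

# Capture the attribute name and its numeric value, then re-emit both with "em" appended.
with_em = re.sub(r'\b(height|width)="(\d+)"', r'\1="\2em"', test)

print(with_em)
```

Matching on the attribute names avoids depending on which attribute happens to follow the closing quote, which is what the '" w' and '" a' search strings rely on.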
With Aggiorno's Smart Search and Replace you can do it like this:
Search Pattern:
<img height="$h" width="$w" $attributes/>
Replace Pattern:
<img height="$[h]em" width="$[w]em" $attributes/>
When you click the "Search" button, all the occurrences are highlighted before the replacement is applied, so you can check them first and then apply the replacement confidently.