Parsing with regular expressions - regex

I have some text like
some text [http://abc.com/a.jpg] here will be long text
can be multiple line breaks again [http://a.com/a.jpg] here will be other text
blah blah
Which I need to transform into
<div>some text</div><img src="http://abc.com/a.jpg"/><div>here will be long text
can be multiple line breaks again</div><img src="http://a.com/a.jpg"/><div>here will be other text
blah blah</div>
To get the <img> tags, I replaced \[(.*?)\] with <img src="$1"/>, resulting in
some text<img src="http://abc.com/a.jpg"/>here will be long text
can be multiple line breaks again<img src="http://a.com/a.jpg"/>here will be other text
blah blah
However, I have no idea how to wrap the text in a <div>.
I'm doing everything on the iPhone with RegexKitLite.

Here's the simplest approach:
Replace all occurrences of \[(.*?)\] with </div><img src="$1"/><div>
Prepend a <div>
Append a </div>
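That does have a corner case: if the text starts or ends with a bracketed URL, the result starts or ends with an empty <div></div>, but this probably doesn't matter.
If it does, then:
Replace all occurrences of \](.*?)\[ with ]<div>$1</div>[ (with the dot set to match newlines, since the text between two images can span several lines)
Replace all occurrences of \[(.*?)\] with <img src="$1"/>
Note that the first pattern only matches text that sits between two bracketed URLs, so any text before the first [ or after the last ] still has to be wrapped separately.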
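The three steps of the simple approach can be sketched in Python as follows. This is only an illustration, not RegexKitLite code: the function name wrap_text_and_images is hypothetical, Python writes the backreference as \1 where the answer writes $1, and the final cleanup of empty divs is an added assumption to cover the corner case.

```python
import re

def wrap_text_and_images(text):
    # Step 1: each [url] becomes a closing div, an img tag, and an opening div.
    body = re.sub(r"\[(.*?)\]", r'</div><img src="\1"/><div>', text)
    # Steps 2 and 3: prepend the opening div and append the closing div.
    html = "<div>" + body + "</div>"
    # Assumed cleanup (not part of the answer): drop empty divs, which appear
    # when the text starts or ends with a bracketed URL.
    return html.replace("<div></div>", "")

print(wrap_text_and_images("some text[http://abc.com/a.jpg]here will be long text"))
```

The cleanup only removes divs that are completely empty; a div holding nothing but whitespace would be left in place.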

Related

How to Match Redundant Lines From Contenteditable Div in Regex

I'm trying to process the html inside a contenteditable div. It might look like:
<div>Hi I'm Jack...</div>
<div><br></div>
<div><br></div>
<div>More text.</div> *<div><br></div>*
*<div><br></div>**<div><br></div>*
*<div><br></div>*
*<div>
<br>
</div>*
What regular expression would match all trailing <div><br></div> elements, but not the ones sandwiched between useful divs containing text, i.e., <div> text (not html) </div>?
I have enclosed all expressions I want to match in asterisks. The asterisks are for reference only and are not part of my string.
Thanks,
Jack
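You can use the following pattern (with the dot set to match newlines, so that the lookahead can scan the rest of the string):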
(?:<div>[\n\s]*<br>[\n\s]*<\/div>)(?!.*?<div>[^<]+<\/div>)
Let me know if this works for all your cases and I will write a detailed explanation of the pattern.
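As a rough check of the pattern, here is a minimal Python sketch run against the sample from the question. The names TRAILING_EMPTY_DIV and strip_trailing_empty_divs are hypothetical, [\n\s] is shortened to \s (which already includes \n), and re.DOTALL is assumed so that the lookahead can cross line breaks.

```python
import re

# The answer's pattern: an empty <div><br></div> that is NOT followed,
# anywhere later in the string, by a div containing plain text.
TRAILING_EMPTY_DIV = re.compile(
    r"<div>\s*<br>\s*</div>(?!.*?<div>[^<]+</div>)",
    re.DOTALL,  # let the dot in the lookahead match newlines
)

def strip_trailing_empty_divs(html):
    # Remove the trailing empty divs, then the whitespace left behind.
    return TRAILING_EMPTY_DIV.sub("", html).rstrip()

sample = (
    "<div>Hi I'm Jack...</div>\n"
    "<div><br></div>\n"
    "<div><br></div>\n"
    "<div>More text.</div> <div><br></div>\n"
    "<div><br></div><div><br></div>\n"
    "<div><br></div>\n"
    "<div>\n<br>\n</div>"
)

print(strip_trailing_empty_divs(sample))
```

The two empty divs between the text divs are kept, because the lookahead finds <div>More text.</div> after them.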

Regular expression group word and sentence

I would like to make a regular expression that does the following:
Gets the whole line of a text file
Gets the first word of that line
Outputs into an input
Currently I can do each of those separately but as one call it is getting hairy:
Whole Line
^\b(.*)\b
First Word
^\b(\w*)\b
Replace for Input
<div class="field"><label><input class="input-checkbox" id="Foo$1" name="Foo" type="checkbox" value="$1" /> <span>$1</span> </label></div>
I would like to use $1 and $2 to separate between the full line for the text display and the first word for the value and ID. Any thoughts? I really like regular expressions for their usefulness and speed, as long as I don't hit a knowledge roadblock like this.
Use the entire match:
Search: ^(\w+).*
Replace: First word is $1, whole line is $&
In your case, the replacement term would be (written here with $0 for the whole match; in Atom, use $& in its place):
<div class="field"><label><input class="input-checkbox" id="Foo$1" name="Foo" type="checkbox" value="$1" /> <span>$0</span> </label></div>
The entire match in Atom is coded as $&.
Most other tools/languages use group zero $0 for the entire match.
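For comparison, here is a minimal sketch of the same search and replace in Python, where the whole match is written \g<0> and the first group \g<1>. The function name lines_to_checkboxes is hypothetical, and re.MULTILINE is assumed so that ^ matches at the start of every line of the file.

```python
import re

# Group 1 is the first word; group 0 (the entire match) is the whole line.
TEMPLATE = (
    r'<div class="field"><label><input class="input-checkbox" '
    r'id="Foo\g<1>" name="Foo" type="checkbox" value="\g<1>" /> '
    r'<span>\g<0></span> </label></div>'
)

def lines_to_checkboxes(text):
    # The dot does not match newlines by default, so each match stops
    # at the end of its line.
    return re.sub(r"^(\w+).*", TEMPLATE, text, flags=re.MULTILINE)

print(lines_to_checkboxes("Apple pie recipe"))
```

Lines that do not start with a word character are left unchanged, because ^(\w+) needs at least one such character.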

Sublime 3 multiple replace spaces with underscores

In my HTML page I have a lot of strings inside tags, like
<p>Some string 1</p>
<p>Some string 2</p>
<p>Any string 3</p>
I need to move all of them into a translate attribute, lowercase them, and replace all spaces inside the strings with underscores.
So I multi-select all of them while holding Ctrl, then press Ctrl+K, Ctrl+L to make them lowercase, Ctrl+X to cut them, press the left arrow twice to move inside the tags, and type translate="PASTE HERE"
Now I have
<p translate="some string 1"></p>
<p translate="some string 2"></p>
<p translate="any string 3"></p>
Next step - I need to make underscores instead of spaces.
To find all translate strings I use regex (?s)translate=".+?"
But how to replace? Help.
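Press Ctrl+H and then
use a negative lookbehind to search for spaces that are not preceded by p. This skips the space in <p translate=, but note that it also skips a space that follows any other p, as in "help me".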
(?<!p)\h+
\h matches only horizontal spaces.
Now use Replace All with _ as the replacement.
This is simple, but it will work and is faster than looking for a smarter answer.
Find this: translate="(.*) (.*)"
Replace with this: translate="\1_\2"
Keep using Replace All until all your unwanted spaces are underscores (in the example you gave, twice).
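Outside the editor, the same result can be reached in one pass. The sketch below is not a Sublime Text feature: it swaps the repeated Replace All for a Python replacement callback, and the function name underscore_translate_values is hypothetical. It assumes the attribute values contain no double quotes.

```python
import re

def underscore_translate_values(html):
    # For each translate="..." attribute, replace the spaces inside the
    # quoted value only; spaces elsewhere in the markup are untouched.
    return re.sub(
        r'translate="(.*?)"',
        lambda m: 'translate="' + m.group(1).replace(" ", "_") + '"',
        html,
    )

print(underscore_translate_values('<p translate="some string 1"></p>'))
```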

Regex find and replace between <div class="customclass"> and </div> tag

I can't find a working regular expression anywhere to find and replace the text between div tags.
I have this HTML, where I want to select everything between the <div class="info"> and </div> tags and replace it with some other text:
<div class="extraUserInfo">
<p>Hello World! This is a sample text</p>
<javascript>.......blah blah blah etc etc
</div>
and replace it with
My custom text with some codes
<tags> asdasd asdasdasdasdasd</tags>
so it would look like
<div class="extraUserInfo">
My custom text with some codes
<tags> asdasd asdasdasdasdasd</tags>
</div>
Here is a refiddle with all my code; as you can see, I want to replace the whole block of code between the <div> and </div> tags:
http://refiddle.com/1h6j
Hope you get what I mean :)
If there's no nesting, I would just do a plain non-greedy (lazy) match:
(?s)<div class="extraUserInfo">.*?</div>
.*? matches any amount of any character (as few as possible) until it meets </div>
The s modifier makes the dot match newlines too.
Edit: Here is a JavaScript version without the s modifier:
/<div class="extraUserInfo">[\s\S]*?<\/div>/g
And replace with new content:
<div class="extraUserInfo">My custom...</div>
See example at regex101; Regex FAQ
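A minimal Python sketch of the lazy match is shown below. The function name replace_div_content and its parameters are hypothetical, and, as in the answer, it assumes the div contains no nested div. Unlike the answer, it captures the opening and closing tags and puts them back, so only the inner content needs to be supplied.

```python
import re

def replace_div_content(html, new_content, css_class="extraUserInfo"):
    pattern = re.compile(
        r'(<div class="' + re.escape(css_class) + r'">).*?(</div>)',
        re.DOTALL,  # same role as the s modifier: the dot matches newlines
    )
    # A function is used as the replacement so that backslashes in
    # new_content are inserted literally.
    return pattern.sub(
        lambda m: m.group(1) + "\n" + new_content + "\n" + m.group(2), html
    )

page = '<div class="extraUserInfo">\n<p>Hello World! This is a sample text</p>\n</div>'
print(replace_div_content(page, "My custom text with some codes"))
```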

capture multi-line text

I have an HTML document:
<p>blah blah</p>
<p>blah blah
<br>blah blah</p>
<p>blah blah
<br>
<br>
blah blah</p>
I want to remove the double line breaks (they can be found in the last paragraph).
I tried the expression below, but it removes everything between the first <br> and the second <br>.
I only want to remove the last <br> (the one that comes right after another <br>, on the next line):
/<br>(.*?)<br>/s
Try using
<br>((\s*)<br>)+
It will match (in your example) a run of two or more <br> tags separated only by optional whitespace; replace the match with a single <br>.
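A minimal Python sketch of this suggestion is given below. The function name collapse_breaks is hypothetical, and the capturing groups of the answer's pattern are written as a non-capturing group, since the captured text is not used.

```python
import re

def collapse_breaks(html):
    # A <br> followed by one or more further <br> tags, with only
    # whitespace in between, is replaced by a single <br>.
    return re.sub(r"<br>(?:\s*<br>)+", "<br>", html)

print(collapse_breaks("<p>blah blah\n<br>\n<br>\nblah blah</p>"))
```

A single <br> on its own is not matched, so the second paragraph of the example stays as it is.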