How to remove everything outside of round brackets in notepad++

I am trying to extract image URLs from a CSS file using Notepad++.
Since all the image URLs are inside round brackets, I am thinking about a regex to remove everything before ( and everything after ).
Here is a text example :
text-transform:uppercase;
font-size:17px;
}
.conTxt .btnGiveAccess{
width:532px;
height:145px;
background:url(http://www.website.com/css/images/btn-give-me-access2.jpg) no-repeat center top;
margin:0 auto;
display:block;
}
.conTxt .btnGiveAccess:hover{
background:url(http://www.website.com/css/images/btn-give-me-access2.jpg) no-repeat center -145px;
}
/*#########################################*/
popup window
/*#########################################*/
/*a {
As a result I would like to get the following:
http://www.website.com/css/images/btn-give-me-access2.jpg
http://www.website.com/css/images/btn-give-me-access2.jpg
I also tried the following regex to delete everything before http://
^[^http]*
and also
.*((.*)).*
but neither worked. Could anybody please help?

For the given text the following works. Use a find text of (\A|\))([^()]*)(\(|\Z) and replace that with \r\n. This will leave the required text plus a few empty lines that can easily be removed, eg by menu => Edit => Line operations => Remove empty lines.
A minor variation is to use a replacement string of \1\3, which removes everything outside the round brackets and leaves the brackets themselves and the text between them. It is then a simple job to remove all round brackets, perhaps replacing them with new lines. This could be done with a find text of [()]+ and a replace string of \r\n.
Explanation of the first regular expression. The captures are:
(\A|\)) which looks for either the start of the buffer, the \A or a close bracket.
([^()]*) which looks for a sequence of characters that do not include round brackets.
(\(|\Z) which looks for an open bracket or the end of the buffer, the \Z.
The effect is to look for three types of text.
From start of buffer to first opening round bracket. This matches (\A)([^()]*)(\(|\Z).
From close bracket to open bracket. This matches (\))([^()]*)(\().
From close bracket to end of buffer. This matches (\))([^()]*)(\Z).
This may not do the desired job if there are nested round brackets, but the question does not specify what should happen in such cases.
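The same find-and-replace can be sketched outside Notepad++. The snippet below is only an illustration in Python's re module, which is not the regex engine Notepad++ uses; the sample text is trimmed from the question, and \n stands in for \r\n.

```python
import re

# Trimmed sample from the question.
css = """\
.conTxt .btnGiveAccess{
background:url(http://www.website.com/css/images/btn-give-me-access2.jpg) no-repeat center top;
}
.conTxt .btnGiveAccess:hover{
background:url(http://www.website.com/css/images/btn-give-me-access2.jpg) no-repeat center -145px;
}
"""

# A run of non-bracket text bounded on the left by start-of-text or ")"
# and on the right by "(" or end-of-text.
outside = re.compile(r"(\A|\))([^()]*)(\(|\Z)")

# Replace every such run with a newline, then drop the empty lines
# (the equivalent of Edit > Line operations > Remove empty lines).
replaced = outside.sub("\n", css)
urls = [line for line in replaced.splitlines() if line.strip()]

for url in urls:
    print(url)
```

As in the answer, nested round brackets are not handled by this sketch.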

If none of the URLs have parens in them, you could use \((.*)\)
The difference between this and what you show above is that the outer () (the literal ones, not the ones that make the regex capture) are escaped using \
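One caveat with .* is that it is greedy, so on a line with more than one bracket pair it runs from the first ( to the last ). A small Python sketch of the difference (Python's re is used only for illustration, and the two-pair line is a made-up example, not from the question):

```python
import re

one_pair = "background:url(http://www.website.com/css/images/btn-give-me-access2.jpg) no-repeat center top;"
two_pairs = "a(first) b(second)"  # hypothetical line with two bracket pairs

greedy = re.compile(r"\((.*)\)")        # the pattern from the answer
bounded = re.compile(r"\(([^()]*)\)")   # assumed alternative: no brackets inside

print(greedy.findall(one_pair))    # one pair per line: the URL alone
print(greedy.findall(two_pairs))   # spans both pairs
print(bounded.findall(two_pairs))  # each pair separately
```

For the CSS in the question each line has a single pair, so both patterns give the same result there.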

Related

Notepad++ Regex Replace Makeshift Footnotes format With Proper Markdown format

In Word, I had to convert my footnotes to lines appearing at the end of each file to be able to make changes in formatting. A macro I found online used braces, and I also ended up using highlighting so I can easily see where my footnotes used to be. As a result, the following strings appear twice in my documents: once in the main text and once at the end of each document, as a sort of makeshift endnote.
=={1}==
.
.
.
=={99}==
I want to be able to match those instances in the text and convert them to proper markdown now. The problem is that the in-text format
[^1], [^2], etc.
will be different from what needs to come at the bottom, which has a colon added:
[^1]:
etc.
So I'm guessing I'll have to live with replacing my old formatting with the new one including the colon, and deleting the extra colons individually while I edit/clean up my text in the future. Without the colon, the footnote definitions won't work.
My question is how to use a regex to match the one- or two-digit strings with braces and equals signs.
This
==(\{d{1,2}\})==
did not work.
Also, as I am no pro, I would need the replacement as well. It probably will be
[^($1)]:
I reckon. Apparently, the equals sign doesn't have to be escaped.
Current format:
...some text...makeshift footnote in the format of
=={one- or two-digit number with no spaces in between}==
For example,
=={1}==
=={23}==
etc.
Desired result for all occurrences:
[^1]:
.
.
.
[^99]:
The markdown format is single square brackets with a caret and a number, plus a colon for the actual footnotes. Usually the number goes up to 42-45 maximum, but it doesn't matter; the two-digit regex is needed. As I said, the colon will be needed in all instances.
Cheers

You have a couple of errors in your regex: you forgot to escape the d for digit (it should be \d), and the capture group must not include the curly braces.
Use:
Ctrl+H
Find what: =={(\d{1,2})}==
Replace with: [^$1]:
TICK Wrap around
SELECT Regular expression
Replace all
Explanation:
=={ # literally
(\d{1,2}) # group 1, 1 or 2 digits
}== # literally
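For checking the pattern outside Notepad++, here is a minimal Python sketch. The sample text is hypothetical, the braces are escaped because Python's re is a different engine, and the second half is an assumed variation (not part of the answer) that treats only markers at the start of a line as footnote definitions.

```python
import re

# Hypothetical sample: two in-text markers, then the makeshift endnotes.
text = (
    "First claim.=={1}== Second claim.=={23}==\n"
    "=={1}== First note\n"
    "=={23}== Second note\n"
)

marker = r"==\{(\d{1,2})\}=="  # group 1 holds the 1 or 2 digits

# The answer's replacement: every marker becomes [^n]:
all_with_colon = re.sub(marker, r"[^\1]:", text)

# Assumed variation: markers that start a line get the colon first,
# then the remaining in-text markers are converted without it.
definitions_done = re.sub(r"^" + marker, r"[^\1]:", text, flags=re.MULTILINE)
split_forms = re.sub(marker, r"[^\1]", definitions_done)

print(all_with_colon)
print(split_forms)
```

The variation only works if every footnote definition really begins its line; otherwise the single replacement from the answer is the safer choice.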

How to replace specific strings between <> using regex

Can anyone tell me how to do the following task using regex?
replace all the ABC with DEF only when ABC is inside both <> and ""
original string:
<tagA nameABC1="attr1ABCx xyzABC" name2="attABCa"> outside"ABC"xyz</tagA>
<tagB nameABC2="attr2ABCx cccABC" name3="testABCb"> outside_"ABC"</tagB>
desired string after replacing:
<tagA nameABC1="attr1DEFx xyzDEF" name2="attDEFa"> outside"ABC"xyz</tagA>
<tagB nameABC2="attr2DEFx cccDEF" name3="testDEFb"> outside_"ABC"</tagB>
Edited:
Thank you guys.
I've decided to use HTML parser library jsoup to handle all html text properly.

Assuming well formed input (no dangling quotes or brackets):
Search: ABC(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)(?=[^<>]*>)
Replace: DEF
This works by applying two look aheads:
the first look ahead (?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$) requires there to be an odd number of quote characters in the remaining input, which in turn means the match is inside quotes
the other look ahead (?=[^<>]*>) requires the next angle bracket to be a closing bracket, which in turn means the match is inside an angle bracket pair
This is not bulletproof. For example, it doesn't cater for closing angle brackets inside quotes, but even this could be handled with a more complicated look ahead that applies logic similar to the first look ahead when matching angle brackets; that is an exercise left for the reader.
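The two look aheads can be tried on the sample from the question with a short Python sketch. Each line is processed separately here so that $ means the end of that line; that per-line handling is an assumption of this sketch, not something the answer specifies.

```python
import re

original = [
    '<tagA nameABC1="attr1ABCx xyzABC" name2="attABCa"> outside"ABC"xyz</tagA>',
    '<tagB nameABC2="attr2ABCx cccABC" name3="testABCb"> outside_"ABC"</tagB>',
]

pattern = re.compile(
    r'ABC'
    r'(?=(?:(?:[^"]*"){2})*[^"]*"[^"]*$)'  # odd number of quotes remain: inside quotes
    r'(?=[^<>]*>)'                         # next angle bracket is ">": inside a tag
)

replaced = [pattern.sub("DEF", line) for line in original]

for line in replaced:
    print(line)
```

The ABC in the attribute names and the quoted ABC outside the tags are left alone, because each fails one of the two look aheads.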

RegEx for transforming the next text using PhpStorm's search and replace dialog

I need to transform text using regex
TPI +2573<br>
NM$ +719<br>
Молоко +801<br>
Прод. жизнь +6.5<br>
Оплод-сть +3.6<br>
Л. отела 6.3/3.9<br>
Вымя +1.48<br>
Ноги +1.61<br>
to this one
<strong>TPI</strong> +2573<br>
<strong>NM$</strong> +719<br>
<strong>Молоко</strong> +801<br>
<strong>Прод. жизнь</strong> +6.5<br>
<strong>Оплод-сть</strong> +3.6<br>
<strong>Л. отела</strong> 6.3/3.9<br>
<strong>Вымя</strong> +1.48<br>
<strong>Ноги</strong> +1.61<br>
Is it possible with regex in PhpStorm's search and replace dialog?

Given your text, you can use this regex,
.* +
and replace it with <strong>$0</strong> (Notice there is a space after </strong>)
We use .* to match everything up to the last run of one or more spaces on the line, because that is the point after which the text should remain intact. The back-reference $0 then puts the whole match inside <strong> tags. Note that the match itself ends with the space(s), so $0 carries that trailing whitespace inside the tags.
If this doesn't work for samples you haven't included in your post, please list the replacement rules and I will give you a more robust solution.
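Because the match of .* + ends with the space, $0 places that space inside the <strong> tag. A variant that captures the label without its trailing space can be sketched in Python; the pattern ^(.*\S) + and the \1 replacement are assumptions of this sketch rather than the answer's exact search, and in PhpStorm the group would be written $1.

```python
import re

lines = [
    "TPI +2573<br>",
    "NM$ +719<br>",
    "Прод. жизнь +6.5<br>",
    "Л. отела 6.3/3.9<br>",
]

# Greedy .* runs to the last space on the line; \S keeps the captured label
# from ending in whitespace, and " +" consumes the separating space(s).
label = re.compile(r"^(.*\S) +")

wrapped = [label.sub(r"<strong>\1</strong> ", line) for line in lines]

for line in wrapped:
    print(line)
```

Labels that contain spaces themselves, such as the last two, still work because the split happens at the last space on the line.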

Notepad++ and regex (multiline)

I have been facing a challenge. I have a text file with the following pattern:
SOME RANDOM TITLE IN CAPS (nnnn)
text text text
more text
...
SOME OTHER RANDOM TITLE IN CAPS (nnnn)
What I know for sure is that the lines I want to extract contain a year in brackets, e.g. (2015) or (2008).
After the (nnnn) there is no text: sometimes a space and CR LF, sometimes just CR LF.
I would like to delete everything else and keep just the title lines with the brackets.
In the time I have spent I could have done it by hand (there are 100 lines), but I like the challenge :)
I thought I could work it out myself, but I am stuck.
I have tried something along this line:
^.*\(\d\d\d\d\)(?s)(.*)(^.*\(\d\d\d\d\))
But I don't get what I want. I can't seem to stop the (?s)(.*) going all the way to the end of the text instead of stopping at the next occurrence.

I suggest using the Search > Mark feature. Use a pattern like \(\d{4}\) and check the "Bookmark Line" option then click "Mark All". Then use Search > Bookmark > Remove Unmarked Lines. This will remove all lines except the ones that have matched your pattern.
Note: If it's possible to have parentheses with 4 digits within your other lines you could add $ to the end of the expression to ensure that the pattern only matches the end of the line. E.g. more text (1234) and other stuff would be matched by the pattern I gave above but if you use pattern \(\d{4}\)$ it will no longer match.
If you want to be even more specific with your pattern by looking for those lines with only uppercase letters and spaces followed by parentheses with 4 digits inside where the parentheses are at the end of the line, then you could use a pattern like this: [A-Z ]+\(\d{4}\)$
Sample input:
SOME RANDOM TITLE IN CAPS (2008)
text text text
more text
...
SOME OTHER RANDOM TITLE IN CAPS (2010)
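Mark the lines with Search > Mark, with "Bookmark Line" checked. After clicking "Mark All", the two title lines are bookmarked.
Now use Search > Bookmark > Remove Unmarked Lines and you get this:
SOME RANDOM TITLE IN CAPS (2008)
SOME OTHER RANDOM TITLE IN CAPS (2010)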
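The keep-only-the-marked-lines step can also be sketched in Python, as a rough equivalent of Mark All followed by Remove Unmarked Lines. The trailing space on the first title line and the CR LF line endings are assumptions taken from the question's description.

```python
import re

text = (
    "SOME RANDOM TITLE IN CAPS (2008) \r\n"
    "text text text\r\n"
    "more text\r\n"
    "...\r\n"
    "SOME OTHER RANDOM TITLE IN CAPS (2010)\r\n"
)

# Four digits in round brackets, optionally followed by spaces or tabs,
# at the end of the line.
title = re.compile(r"\(\d{4}\)[ \t]*$")

# splitlines() drops the CR LF, so $ sees the true end of each line.
kept = [line.rstrip() for line in text.splitlines() if title.search(line)]

for line in kept:
    print(line)
```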

The following regex matches the 2 lines with brackets containing 4 digits:
.*?\(\d{4}\)\s*
It matches anything at the start of the line zero or more times (non-greedy), then an opening bracket, 4 digits and a closing bracket, and finally any trailing whitespace, including the newline.

If you want to remove all lines except the ones that end with 4 digits in brackets, you can try this:
^(?!.*\(\d{4}\)\h*$).*(?:\r?\n|\z)
Replace by: (nothing)
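A Python adaptation of this negative look ahead is sketched below. Python's re has no \h or \z, so [ \t] and \Z are used instead, and an optional \r is added before $ because Python's dot matches a carriage return; these substitutions are assumptions of the sketch, not part of the original pattern.

```python
import re

text = (
    "SOME RANDOM TITLE IN CAPS (2008) \r\n"
    "text text text\r\n"
    "more text\r\n"
    "...\r\n"
    "SOME OTHER RANDOM TITLE IN CAPS (2010)\r\n"
)

# Match a whole line, including its line ending, unless the line ends with
# four digits in round brackets followed only by optional spaces or tabs.
not_title = re.compile(
    r"^(?!.*\(\d{4}\)[ \t]*\r?$).*(?:\r?\n|\Z)",
    re.MULTILINE,
)

result = not_title.sub("", text)
print(result)
```

Only the two title lines remain, with their original line endings untouched.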

Vim: remove matching braces and the first word in the braces

For example, change
text 12345 {\color{red}text 123 \ref{label} 567
1234} 567
to
text 12345 text 123 \ref{label} 567
1234 567
What kind of operation should be done in vim?
I aim to find every occurrence of the pattern {\color{red}
and remove the pattern and its matching closing brace },
while keeping the text in between.
The pattern {\color{red} can be anywhere in the line (not necessarily at the beginning of the line).
The text between the {\color{red} ...} can have multiple lines as shown above.
Thanks a lot for your help.
Edit:
I just found a way to do it, but it may not be efficient enough.
:g/\\color{red}/norm ndiBvaBpd%
g: global
/\\color{red}: match the pattern
/norm: normal mode command
n: forward the cursor to next matching pattern from the cursor. But if the pattern is at the beginning of the line, it may fail to find it.
diB: delete inner block from the cursor
vaB: select block around the cursor
p: put to the selected block
d%: delete \color{red}

I'm not sure exactly what you mean, and there are many ways to do it.
{\color{red}text 123 \ref{label} 567}
^
|cursor
you could do:
df}$x
If you have surround.vim installed, removing the surrounding braces is easier (ds{).
EDIT
For the question update:
Open your file and type:
:g#{\\color{red}#normal 0df}$x
I hope the command does what you want.
EDIT II based on question update
If your target text object crosses lines, you could try this:
g/{\\color{red}/normal 0f{mz%x`zxdf}
The line above works if the target spans multiple lines (not only one or two; it could be many). However, the syntax must be correct, which means the { and } must be paired.

I would use a substitution with regex for this:
%s/\v\{\\color\{\w+\}(.*)} ?$/\1
\v very magic (sane regexes)
\{\\color\{\w+\} the color command and its argument
(.*) capture the text you want to save
} ?$ closing curly brace and optional space at the end of the line
/\1 replace the whole thing with the first capture, which is the text inside the color group
For your edited example, you can use \_. instead of . because it also matches line-break characters.
%s/\v\{\\color\{\w+\}(\_.*)}/\1
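The regex versions pick a closing brace by position on the line rather than by pairing. If true brace matching across lines is needed, one option outside Vim is a small stack-based scan. The Python function below is a hypothetical helper written for illustration; it assumes braces are balanced and that escaped braces such as \{ do not occur.

```python
def strip_color_groups(text: str, opener: str = r"{\color{red}") -> str:
    """Remove each opener and its matching closing brace, keeping the text between."""
    out = []
    stack = []  # one entry per open brace: True if its closing brace must be dropped
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            stack.append(True)   # the outer "{" of the opener is still open
            i += len(opener)
            continue
        ch = text[i]
        if ch == "{":
            stack.append(False)  # an ordinary brace, kept as-is
        elif ch == "}" and stack:
            if stack.pop():      # this "}" closes a removed opener: skip it
                i += 1
                continue
        out.append(ch)
        i += 1
    return "".join(out)


sample = "text 12345 {\\color{red}text 123 \\ref{label} 567\n1234} 567"
cleaned = strip_color_groups(sample)
print(cleaned)
```

The inner \ref{label} keeps its braces because only the brace paired with the removed opener is dropped.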
