I have a bunch of HTML files.
I am trying to find all anchor links whose href attribute does not end with a slash.
For example:
Helllo
This should match
Helllo
This should not match
How do I go about writing the regular expression?
Building on hjpotter92's answer...
Find: href="(\S*?[^/])"
Replace: href="$1/"
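For anyone scripting this instead of using an editor, here is a minimal Python sketch of the same find/replace. The function name and sample links are made up for illustration; only the pattern comes from the answer above.

```python
import re

# Pattern from the answer above: capture an href value whose last character is not "/".
PATTERN = re.compile(r'href="(\S*?[^/])"')

def add_trailing_slash(html: str) -> str:
    # \g<1> is Python's spelling of the $1 backreference used in the editor.
    return PATTERN.sub(r'href="\g<1>/"', html)

# Hypothetical sample input: one link without a trailing slash, one with.
sample = '<a href="http://example.com/a">x</a> <a href="http://example.com/b/">y</a>'
print(add_trailing_slash(sample))
```

Only the first link should change. Note that \S also matches quotes, so on densely packed markup without whitespace the pattern can run past the end of an attribute; [^"]* is a stricter choice if that matters.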
In order to make sure you only collect href attribute values from <a> elements, and to make sure you only match the link itself, you can use the following regex:
<a\s[^<>]*href="\K[^"]*?(?<=[^\/])(?=")
Or
<a\s[^<>]*href="\K[^"]*?(?=(?<=[^\/])")
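\K is a PCRE feature, and as far as I know Python's built-in re module does not support it. A rough equivalent, assuming you are fine with reading the value from a capture group, could look like this (the sample markup is invented):

```python
import re

# Capture-group stand-in for \K: group 1 holds only the href value.
# [^"/] forces the final character of the value to be neither a quote nor a slash.
ANCHOR_HREF = re.compile(r'<a\s[^<>]*href="([^"]*[^"/])"')

# Hypothetical sample: two anchors and one non-anchor element.
html = (
    '<a class="nav" href="/about">About</a>\n'
    '<a href="/contact/">Contact</a>\n'
    '<link href="/style.css">\n'
)
matches = ANCHOR_HREF.findall(html)
print(matches)
```

Only the first anchor is reported: the second ends with a slash and the third is not an <a> element.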
The following pattern would work:
href="\S*?[^/]"
You can try:
Find: href="(.*?[^/])"
Replace with: href="$1/"
Given this example text:
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
...lines...
...........
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
...lines...
...........
</abr:rules> (end of group)
I would like to find and remove everything from <abr:rules> to </abr:rules> where subapplication is NOT "CM". Organization and application are always the same; <abr:code> can be any string.
What I tried so far is
<abr:rules>\n<abr:ruleTypeDefinition>\n<abr:code>[a-zA-Z0-9]{3,}<\/abr:code>\n<abr:ownership>\n<.*"(FM|PSD|SSC)"\/>\n(?s).*?\n<\/abr:rules>\n
This works, but only because I know the other subapplication names.
Is there any way to do it with regex only?
Try the following find and replace:
Find:
<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"((?!</abr:rules>).)*</abr:rules>
Replace:
(empty string)
Note: The above pattern only works if the ". matches newline" option is enabled in Notepad++. If you don't want to enable it, use [\S\s] instead of the dot.
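Outside Notepad++, the same pattern can be tried in Python, where re.DOTALL plays the role of ". matches newline". This is only a sketch: the block() helper and the shortened blocks are invented for the test, the groups are made non-capturing, and the trailing \n? is my addition so no blank lines are left behind.

```python
import re

PATTERN = re.compile(
    r'<abr:rules>(?:(?!subapplication=).)*'   # up to the first subapplication attribute
    r'subapplication="(?!CM")[^"]+"'          # any value except CM
    r'(?:(?!</abr:rules>).)*</abr:rules>\n?', # rest of the block, plus its line break
    re.DOTALL,
)

def block(code: str, sub: str) -> str:
    # Hypothetical, shortened version of one <abr:rules> group.
    return (
        "<abr:rules>\n"
        f"<abr:code>{code}</abr:code>\n"
        f'<abr:owner organization="NT" application="DCS" subapplication="{sub}"/>\n'
        "</abr:rules>\n"
    )

text = block("ABB", "FM") + block("ADE", "CM") + block("XYZ", "PSD")
cleaned = PATTERN.sub("", text)
print(cleaned)
```

Only the ADE block with subapplication="CM" should remain. This relies on every block containing a subapplication attribute; a block without one would let the match spill into the next block.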
You should not use regex for XML; you can read why here:
https://stackoverflow.com/a/1732454/3763374
Instead, use an XML parser and select the elements with XPath.
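A minimal sketch of the parser route, using Python's xml.etree.ElementTree and its limited XPath support. The namespace URI, the abr:root wrapper element and the closed-off sample document are all assumptions, since the question never shows them.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the original snippet never shows the xmlns declaration.
NS = {"abr": "urn:example:abr"}

# Hypothetical, well-formed version of the sample, wrapped in an assumed root element.
DOC = """\
<abr:root xmlns:abr="urn:example:abr">
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ABB</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="FM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ADE</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="CM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
</abr:root>
"""

root = ET.fromstring(DOC)
# findall returns a list, so removing children while looping over it is safe.
for rules in root.findall("abr:rules", NS):
    owner = rules.find(".//abr:owner", NS)
    if owner is not None and owner.get("subapplication") != "CM":
        root.remove(rules)

remaining = [r.findtext(".//abr:code", namespaces=NS) for r in root.findall("abr:rules", NS)]
print(remaining)
```

Blocks without an owner element are kept here; whether they should be removed as well is a judgment call the question does not settle.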
I'm trying to select the text between > and </. In the example below, I want "text":
>text</
but I'm unable to do so.
I tried the following, but it doesn't like the backslash at the end of the regex:
\>(.*?)\<\
I'm trying to do this in TextPad. How is this supposed to be done?
I'm ultimately wanting to delete all text between these two characters so all I'm left with is something like: <element></element>
Regex-wise, you can use three groups and keep only the first and third in the replacement: \1\3.
Find: (>)(.*)(</)
Replace: \1\3
Try doing:
\>(.*?)\<\/
The regex you tried would have produced an error because it ends with a \ that has nothing after it.
You are close. Use the following:
(>).*?(<\/)
And replace with \1\2
OR
You can use lookbehind and lookaheads:
(?<=>)(.*?)(?=<\/)
And replace with '' (empty string)
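To check both variants outside TextPad, here is a small Python sketch. The function names and the two-line sample are invented; the patterns are the ones from the answer.

```python
import re

def strip_with_groups(markup: str) -> str:
    # Keep the ">" and "</" delimiters (groups 1 and 2), drop what lies between them.
    return re.sub(r'(>).*?(</)', r'\1\2', markup)

def strip_with_lookarounds(markup: str) -> str:
    # Match only the text itself; the delimiters are asserted but not consumed.
    return re.sub(r'(?<=>).*?(?=</)', '', markup)

# Hypothetical sample with one element per line.
sample = "<element>text</element>\n<other>more</other>"
print(strip_with_groups(sample))
print(strip_with_lookarounds(sample))
```

Both calls should leave only the empty elements. One caveat: if several elements share a line, .*? in the group version can start at the > of a closing tag and swallow the next opening tag; [^<]* in place of .*? is a possible stricter alternative.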
I have a bunch of .php files with incomplete references.
For instance: <a href="dishItem.php">.
I'm trying to use the Notepad++ Find and Replace feature to find all the .php references and prepend a string to each match. So, for the previous example, it would prepend http://Example.com/ to give: <a href="http://Example.com/dishItem.php">.
Is this possible with Notepad++ or anything else? I'm not very familiar with regex. Thank you.
Find what: <a href="([^"]*?\.php)">
Replace with: <a href="http://Example.com/\1">
I had one file open, so I clicked 'Replace All in All Opened Documents'.
Everything inside parentheses () is captured.
[^"]*? matches any run of characters other than ". It is lazy, so it takes as few characters as possible before \.php.
\1 refers to the first capturing group.
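The same capture-group replacement as a Python sketch, in case you would rather batch-process the files from a script. The function name, the base parameter and the sample line are assumptions for illustration.

```python
import re

# Pattern from the answer above: group 1 captures the relative .php path.
HREF_PHP = re.compile(r'<a href="([^"]*?\.php)">')

def absolutize(html: str, base: str = "http://Example.com/") -> str:
    # m.group(1) is what \1 refers to in the Notepad++ replacement.
    return HREF_PHP.sub(lambda m: f'<a href="{base}{m.group(1)}">', html)

# Hypothetical sample: one .php link and one link that must stay untouched.
sample = '<a href="dishItem.php">Dish</a> <a href="about.html">About</a>'
print(absolutize(sample))
```

Links that are already absolute and end in .php would be prefixed a second time, so this assumes every .php href in the input is still relative.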
If the .php files are not always referenced inside an anchor tag, you can search for the file name alone instead of the whole <a href ....>:
Find: ([^"]*?\.php)
Replace: http://Example.com/\1
Find: (?<=")(.*?\.php)
Replace with: http://Example.com/$1
See demo.
https://regex101.com/r/eS7gD7/13
Use lookarounds:
Find: (?<=<a href=")(?=[^"]+\.php">)
Replace: http://Example.com/
This approach means not having to capture anything or use back references; it matches the insertion point for the replacement.
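A quick way to see the zero-width match at work is to run it through Python's re.sub, which accepts the fixed-width lookbehind used here. The sample string is made up; the pattern is unchanged.

```python
import re

# Matches the empty position right after <a href=" when a .php target follows.
INSERTION_POINT = re.compile(r'(?<=<a href=")(?=[^"]+\.php">)')

# Hypothetical sample: one .php link and one link that must stay untouched.
sample = '<a href="dishItem.php">Dish</a> <a href="about.html">About</a>'
result = INSERTION_POINT.sub("http://Example.com/", sample)
print(result)
```

Since nothing is consumed, the replacement text is simply inserted at the matched position.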
Select Search Mode as Regular expression
Find: (http://Example.com\/)
Replace: <a href="\1dishItem.php">
Place the cursor on the first line and click Replace All.
Use the following regex in Notepad++:
I'm trying to replace some text with other text.
Example:
almada-institute.com:25:info#almada-institute.com:info12345:"hello"<service#iubi.com>:nossl
in-graph.com:25:info#in-graph.com:info123456789:"hello"<service#iubi.com>:nossl
get-herbals.net:25:info#get-herbals.net:info321:"hello"<service#iubi.com>:nossl
I need it to look like this:
almada-institute.com:25:info#almada-institute.com:info12345:"hello"<info#almada-institute.com>:nossl
in-graph.com:25:info#in-graph.com:info123456789:"hello"<info#in-graph.com>:nossl
get-herbals.net:25:info#get-herbals.net:info321:"hello"<info#get-herbals.net>:nossl
For this I'm using:
But it just replaces the matched text with the literal regular expression from the replace field.
You can use regex groups. Based on your sample input/output above, the below worked for me.
Find what:
:(\w+#.*?):(.*?)<(\w+#.*?)>:
Replace with:
:\1:\2<\1>:
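To sanity-check this find/replace against the sample lines from the question, something like the following Python sketch should do. The function name is made up; the pattern and replacement are the ones above.

```python
import re

# Group 1: the account address, group 2: everything up to "<", group 3: the address to discard.
PATTERN = re.compile(r':(\w+#.*?):(.*?)<(\w+#.*?)>:')

def fix_sender(line: str) -> str:
    # Reuse group 1 inside the angle brackets, exactly like :\1:\2<\1>: in the editor.
    return PATTERN.sub(r':\1:\2<\1>:', line)

lines = [
    'almada-institute.com:25:info#almada-institute.com:info12345:"hello"<service#iubi.com>:nossl',
    'in-graph.com:25:info#in-graph.com:info123456789:"hello"<service#iubi.com>:nossl',
    'get-herbals.net:25:info#get-herbals.net:info321:"hello"<service#iubi.com>:nossl',
]
fixed = [fix_sender(line) for line in lines]
for line in fixed:
    print(line)
```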
You can create regular expression groups with ( and ) and reference them in the replacement using $N (or \N in some tools), which refers to the N-th group.
So in your case, you can do this
regex: ([^: ]+:[^:]+:([^:]+):[^:]+:[^:]+)(:[^: ]+)
replace: \1<\2>\3 or, in some implementations, $1<$2>$3
You can try it yourself here: https://www.regex101.com/r/hY8pY9/1
Use capturing groups. Putting another regex in the replace field makes no sense. Going by the screenshot you gave, the regex would be something like:
([a-z]\w+[#][a-z]\w+)[.]([a-z]\w+)
Replace with:
\1-\2.\1
I'm not very familiar with regular expressions.
I'm trying to create a regular expression that matches the text between the first pair of forward slashes. It's easiest to show an example.
Search Texts:
/index
/index/
/index/foo/
/index/foo/bar/
All of those should return just "index".
Another example:
Search Texts:
/page.php
/page.php/foo?bar=1
Both of those should return just "page.php".
Thanks a lot, guys!
Try this one for JavaScript or PHP preg_match: ^\/([^\/]*)
The pattern matches only if there is a slash at the beginning, and then captures everything up to the next slash (or the end of the string).
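The answer targets JavaScript and PHP, but the pattern can be checked quickly in Python too. A minimal sketch, with a made-up helper name; the slash needs no escaping in Python, so the pattern is written without the backslashes.

```python
import re
from typing import Optional

# Same pattern as above, minus the escaping that JavaScript and PHP delimiters need.
FIRST_SEGMENT = re.compile(r'^/([^/]*)')

def first_segment(path: str) -> Optional[str]:
    # Group 1 holds everything between the leading slash and the next slash (or the end).
    m = FIRST_SEGMENT.match(path)
    return m.group(1) if m else None

for p in ["/index", "/index/", "/index/foo/bar/", "/page.php", "/page.php/foo?bar=1"]:
    print(p, "->", first_segment(p))
```

A path without a leading slash gives None, which matches the "only if there is a slash at the beginning" remark.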