I'm really struggling to work out how to remove the full stop from the following:
• this is a test bullet.<br>
• this is a test bullet 2.<br>
• this is a test bullet 3.<br>
It needs to only remove the full stops from the bullets as there are other paragraphs containing full stops and break returns.
Any help with this please?
The output would need to look like:
• this is a test bullet<br>
• this is a test bullet 2<br>
• this is a test bullet 3<br>
How about, given that we should be able to use the bullet character, something simple like:
Find: (•.*)\.(.*)
Replace with: $1$2
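As a rough illustration of the find/replace above, here is a minimal Python sketch. The function name `strip_bullet_stops` and the sample text are made up for the example, and Python writes the replacement as `\1\2` rather than `$1$2`.

```python
import re

# Greedy (•.*) runs to the end of the line, then backtracks to the last
# full stop on that line; (.*) keeps whatever follows it (here "<br>").
BULLET_STOP = re.compile(r"(•.*)\.(.*)")

def strip_bullet_stops(text):
    # Lines without a bullet never match, so their full stops are kept.
    return BULLET_STOP.sub(r"\1\2", text)

sample = (
    "• this is a test bullet.<br>\n"
    "• this is a test bullet 2.<br>\n"
    "A normal paragraph. It keeps its full stops.<br>\n"
)
print(strip_bullet_stops(sample))
```

One caveat: the pattern removes the last full stop on a bullet line wherever it sits, so a bullet such as "• version 2.0<br>" with no trailing full stop would lose the one inside "2.0".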
You could just use the replaceAll method on the String object, like so:
String values = "• this is a test bullet.<br>\n" +
"• this is a test bullet 2.<br>\n" +
"• this is a test bullet 3.<br>";
values = values.replaceAll("(?i)\\.(?=<br>)", "");
// result:
// • this is a test bullet<br>
// • this is a test bullet 2<br>
// • this is a test bullet 3<br>
It will remove any full stop that is immediately followed by a <br> tag, and is case insensitive.
Explanation of regex:
Make pattern case insensitive:
(?i)
Find full stop (.):
\\.
Forward look ahead for <br> tag:
(?=<br>)
Regex:
^(\s*•.*)\.$
Replacement string:
$1
OR
Regex:
^\s*•.*\K\.$
Replacement string:
Empty string
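For comparison, here is a small Python sketch of the first (capture group) variant. It assumes each bullet sits on its own line with the full stop as the last character, i.e. no literal <br> after it. Python's `re` module does not support `\K`, so only the `$1` form carries over; the name `strip_trailing_stops` is invented for the example.

```python
import re

# re.MULTILINE makes ^ and $ match at every line boundary, not only at the
# start and end of the whole string.
BULLET_LINE = re.compile(r"^(\s*•.*)\.$", re.MULTILINE)

def strip_trailing_stops(text):
    # Keep everything captured before the final full stop of a bullet line.
    return BULLET_LINE.sub(r"\1", text)

sample = "• this is a test bullet.\n• this is a test bullet 2.\nPara one. Para two.\n"
print(strip_trailing_stops(sample))
```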
As part of a forum that uses BBCode to store posts, I'm trying to write a way to detect mentions and quotes, in order to notify the users.
I have it working for all cases except nested quotes.
This is my regex so far (Python 2.7):
regex = r'\[url=.*?\/users\/(.*?)\/\]#.*?\[\/url\]|\[quote="(.*?)"\].*?\[\/quote\]'
These are my test cases:
# This works fine, I get the `user1` group.
Hello [url=/users/user1/]#Foo Bar[/url]
# This works fine, I get the `user2` and `user3` groups.
[quote="user2"]Test message[/quote] OK [quote="user3"]Test message[/quote]
# This doesn't work as I'd like. I only get the `user4` group, but not `user5`.
[quote="user4"][quote="user5"]Test message[/quote][/quote]
How can I modify the regular expression so that it also matches the third test, with the nested [quote] block?
Here's a link to regex101 for your convenience: https://regex101.com/r/Ov5SI1/1
Thank you!
A minor change in the original regex will solve your problem. Here is the original regex:
\[url=.*?\/users\/(.*?)\/\]#.*?\[\/url\]|\[quote="(.*?)"\].*?\[\/quote\]
Error
Consider the input string:
[quote="user4"][quote="user5"]Test message[/quote][/quote]
The last alternation tries to match it and it does succeed. However, the first match is
[quote="user4"][quote="user5"]Test message[/quote]
Now the next match starts after the [/quote]. It will not start anywhere before since all the previous text is already part of a successful match.
Correction
Solution 1:
Changing the .*?\[\/quote\] part of the original regex into a lookahead means the closing tag is checked but not consumed, so the next match can start inside the quote and both user4 and user5 are found.
\[quote=\"(.*?)\"\](?=.*?\[\/quote\])
final regex: \[url=.*?\/users\/(.*?)\/\]#.*?\[\/url\]|\[quote=\"(.*?)\"\](?=.*?\[\/quote\])
Solution 2:
Focusing on just the right part of the alternation - \[quote="(.*?)"\].*?\[\/quote\]
Only \[quote="(.*?)"\] is necessary if you want to find any pattern of the form [quote="..."]. The remaining portion is unnecessary.
Here is the final regex:
\[url=.*?\/users\/(.*?)\/\]#.*?\[\/url\]|\[quote=\"(.*?)\"\]
Please do remember that the regex must be applied globally to find all the matches.
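To illustrate applying the Solution 2 regex globally, here is a short sketch. It is written for Python 3 (the question mentions 2.7), and the helper name `notified_users` is hypothetical. `re.findall` returns one tuple per match, with an empty string for whichever group did not take part.

```python
import re

# Solution 2: the url alternative is unchanged, the quote alternative
# only needs the opening tag.
MENTION_OR_QUOTE = re.compile(
    r'\[url=.*?/users/(.*?)/\]#.*?\[/url\]|\[quote="(.*?)"\]'
)

def notified_users(post):
    # Each match fills exactly one of the two groups; the other is "".
    return [url_user or quote_user
            for url_user, quote_user in MENTION_OR_QUOTE.findall(post)]

print(notified_users('[quote="user4"][quote="user5"]Test message[/quote][/quote]'))
# ['user4', 'user5']
```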
I have two URLs from which I need to extract (actually split before) the pagename, i.e. last text string after the last /. For example:
https://example.com/en/pagename
https://example.com/en/pagename/
My current regex can find the last incidence of the "/" character, but when the / is at the end, I need to select the PREVIOUS / in order to break before the pagename. Current regex is:
\/(?!.*\/)
Method 1
I'd guess,
[^/]+/?$
or if you wish to capture the pagename,
([^/]+)/?$
might be OK to look into.
Method 2
For selecting the forward slash right before the end of the URL, we'd try positive lookahead:
/(?=[^/]+/?$)
If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against sample inputs.
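Both methods can be tried out with a few lines of Python. This is only a sketch; the helper names `page_name` and `split_before_page` are invented for the example.

```python
import re

def page_name(url):
    # Method 1: capture the last run of non-slash characters,
    # tolerating one optional trailing slash.
    match = re.search(r"([^/]+)/?$", url)
    return match.group(1) if match else ""

def split_before_page(url):
    # Method 2: split on the slash that is followed only by the page name
    # and an optional trailing slash.
    return re.split(r"/(?=[^/]+/?$)", url)

for url in ("https://example.com/en/pagename", "https://example.com/en/pagename/"):
    print(page_name(url), split_before_page(url))
```

With Method 2 the trailing slash, when present, stays attached to the page name, because the lookahead only selects the split point and consumes nothing after it.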
I have a file where I want to match a certain word between keywords using regular expressions. For example, let's say I want to match every occurrence of the word "dog" AFTER the keyword "start" and BEFORE the keyword "end".
dog horse animal cat dog // <-- don't match
random text dog // <-- don't match
start
brown dog
black dog
cat horse animals
end
dog cat // <-- don't match
good dog // <-- don't match
Maybe regex has a pipe feature where I can get the text after the word "start" and before the word "end", then pipe it into a new regular expression? Then I could just search for "dog" in the second regular expression. I am new to regular expressions and have been struggling to come up with a solution. Thanks
When you are matching "globally" (i.e. collecting several matches that are non-contiguous) and you provide a stipulation such as "matches must all exist in a container" (in this case, between "start" and "end"), this generally calls for a construct such as PCRE's '\G', which matches only at the position where the previous match ended (or at the start of the text on the first attempt):
(?:\G(?!\A)|start)(?:(?!end).)*?\Kdog
See it in action at: https://regex101.com/r/uV7EjE/1
It's important to note that this uses some constructs that are not universally supported, and one specific to PCRE ('\K'). An explanation of each part:
/(?:
\G(?!\A) # Match only where the previous match ended, but not at the very start of the text. In effect, this ensures we only continue immediately after the last valid "dog".
|start # Or match "start".
)
(?:(?!end).)*? # Match as few characters as possible, making sure we don't encounter "end".
\K # Reset the consumption counter so everything before this isn't matched.
dog # Match what we want.
/gmsx
If instead you need something with wider support for more basic regex engines, then you do indeed need to pipe a simpler expression, for instance start.*?end to match a complete group, then check its contents for all occurrences of "dog".
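The two-step fallback can be sketched in Python, whose `re` module supports neither `\G` nor `\K`. The function name `find_between` and its parameters are hypothetical, and the sample text is the one from the question with the comments removed.

```python
import re

def find_between(text, word="dog", start="start", end="end"):
    """Return the offsets of `word` inside every start...end block."""
    # Step 1: grab each block lazily; DOTALL lets "." cross line breaks.
    block = re.compile(re.escape(start) + r"(.*?)" + re.escape(end), re.DOTALL)
    hits = []
    for m in block.finditer(text):
        base = m.start(1)
        # Step 2: search only the block's contents for the word.
        hits.extend(base + w.start()
                    for w in re.finditer(re.escape(word), m.group(1)))
    return hits

text = (
    "dog horse animal cat dog\n"
    "random text dog\n"
    "start\nbrown dog\nblack dog\ncat horse animals\nend\n"
    "dog cat\ngood dog\n"
)
print(find_between(text))
```

Note that plain substring keywords would also match inside longer words (for example "end" in "friend"); adding `\b` word boundaries around them is one way to tighten this.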
Update:
start(.?)(dog)+(.?)end
previous:
(please, note this might not answer exactly your case because it heavily depends on what language you are working)
It also depends on the language you are developing in, as the other comments say. If you let me know which language you are using, I may be able to give you a better answer.
Also you can use this to debug https://regex101.com/
I know you're asking for regex, but if you're using a certain language there may be more apt solutions. For example, in PHP this function would work:
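function getStringBetween($string, $start, $end) {
    // Locate the start marker; strpos() returns false when it is absent.
    $ini = strpos($string, $start);
    if ($ini === false) return "";
    $ini += strlen($start);
    // Locate the end marker, searching only after the start marker.
    $stop = strpos($string, $end, $ini);
    if ($stop === false) return "";
    return substr($string, $ini, $stop - $ini);
}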
I use the kimono service to parse some data, and this service has a specific three-group regular expression tool.
The first group is the text before the part I need, the second is the expression for the text I need, and the third is the expression for the text after it.
So, by default it looks like: /^()([^]*?)()$/
I have a sentence like {Olive oil in glass bottle 500 ml.} and I need to get only the text after the second space from the end, excluding the final full stop, so the result should be: 500 ml
None of my attempts have succeeded.
If your engine supports lookahead, you can try this.
^(.*?(?=\s[^ ]*\s[^ ]*$))([^.]*?)(.)$
See demo.
https://regex101.com/r/vD5iH9/35
or
^([^\d]*)([^.]*?)(.)$
See demo.
https://regex101.com/r/vD5iH9/36
or
^(.*?\s)([^\s]*\s[^\s]*)(\.)$
See demo.
https://regex101.com/r/vD5iH9/37
Use the below regex and then get the string you want from group index 1.
^.*?\s(?=\S*\s\S*$)([^.]*)\.
The positive lookahead (?=\S*\s\S*$) asserts that exactly one whitespace character remains between the current position and the end of the line.
How about:
(\S+\s+\S+)\.$
The text you want is in the group 1.
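A quick Python check of this last pattern, as a sketch only. The helper name `last_two_words` is made up, and the sample sentence is used without the surrounding braces, on the assumption that they are not part of the data.

```python
import re

def last_two_words(sentence):
    # Two whitespace-separated tokens directly before the final full stop.
    match = re.search(r"(\S+\s+\S+)\.$", sentence)
    return match.group(1) if match else ""

print(last_two_words("Olive oil in glass bottle 500 ml."))  # 500 ml
```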
I have the following as3 function below which converts normal html with links so that the links have 'event:' prepended so that I can catch them with a TextEvent listener.
protected function convertLinks(str:String):String
{
var p1:RegExp = /href|HREF="(.[^"]*)"/gs;
str = str.replace(p1,'HREF="event:$1"');
return str;
}
For example
<a href="http://www.somedomain.com">
gets converted to
<a href="event:http://www.somedomain.com">
This works just fine, but I have a problem with links that have already been converted.
I need to exclude the situation where I have a string such as
<a href="event:http://www.somedomain.com">
put through the function, because at the moment this gets converted to
<a href="event:event:http://www.somedomain.com">
Which breaks the link.
How can I modify my function so that links with 'event:' at the start are NOT matched and are left unchanged?
First of all, trying to manipulate HTML with regex may not be a good idea.
That said, according to the flavor comparison chart on regular-expressions.info, ActionScript regex is based off of ECMA engine, which supports lookaheads.
Thus you can write this:
/(?:href|HREF)="(?!event:)(.[^"]*)"/
(?=…) is positive lookahead; it asserts that a given pattern can be matched. (?!…) is negative lookahead; it asserts that a given pattern can NOT be matched.
Note that the inclusion of the . is very peculiar. It's probably not intended to include the . there since it can match a closing doublequote.
Note also that I've fixed the alternation for the href/HREF by using a non-capturing group (?:…).
This is because:
this|that matches either "this" or "that"
this|that thing matches either "this" or "that thing"
(this|that) thing matches either "this thing" or "that thing"
Alternatively, you may also want to just turn on the case-insensitivity flag /i, which would handle things like hReF or eVeNt:.
Thus, perhaps your pattern should just be
/href="(?!event:)([^"]*)"/gsi
If lookahead were not supported, you could use an optional pattern that matches event: if it's there, excluding it from group 1, so that it doesn't get included when you substitute in $1.
/href="(?:event:)?([^"]*)"/gsi
       \________/ \_____/
     non-capturing group 1
       optional
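The two patterns above can be compared outside ActionScript with a short Python sketch. This is an adaptation, not the original AS3 function: the names `convert_links`, `UNCONVERTED` and `ANY_LINK` are invented, and the attribute name is captured as an extra group 1 so that its original casing survives, which shifts the URL to group 2.

```python
import re

# Lookahead variant: skip links whose value already starts with "event:".
UNCONVERTED = re.compile(r'(href)="(?!event:)([^"]*)"', re.IGNORECASE)
# Fallback variant: optionally consume an existing "event:" outside the group.
ANY_LINK = re.compile(r'(href)="(?:event:)?([^"]*)"', re.IGNORECASE)

def convert_links(html, pattern=UNCONVERTED):
    # \1 keeps the attribute name as written; \2 is the bare URL.
    return pattern.sub(r'\1="event:\2"', html)

plain = '<a href="http://www.somedomain.com">'
done = '<a href="event:http://www.somedomain.com">'
print(convert_links(plain))
print(convert_links(done))
```

Either pattern makes the conversion safe to run more than once: the lookahead variant leaves converted links untouched, while the fallback variant rewrites them to the same value.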