Regex conversion on bbcode - regex

We are moving from phpbb to a simpler system and some of the bbcode needs converting, particularly the "quote" code. The current phpbb based quote code looks like this:
[quote="username":nw4lek0o]The quoted text[/quote:nw4lek0o]
and it needs to be simplified to this:
[quote=username]The quoted text[/quote]
So, basically two things: strip the double quotes from around the username, and strip the ID string from the opening and closing tag.
I'm not good at Regex. Help?

Use this regex:
\[quote="(.+?)":.+?\](.+?)\[/quote:.+?]
And replace it with:
[quote=$1]$2[/quote]
Demo: http://regex101.com/r/jL3xU2
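If the posts are converted by a script rather than in an editor, the same find-and-replace could be sketched in Python roughly as follows. The function name simplify_quotes is made up for illustration, Python writes the group references as \1 and \2 instead of $1 and $2, and the re.DOTALL flag is an added assumption so that quoted text spanning several lines is still matched.

```python
import re

# Same pattern as above; re.DOTALL is an assumption (multi-line quotes).
QUOTE_RE = re.compile(r'\[quote="(.+?)":.+?\](.+?)\[/quote:.+?\]', re.DOTALL)


def simplify_quotes(text: str) -> str:
    """Rewrite phpBB quote tags to the simplified [quote=user]...[/quote] form."""
    return QUOTE_RE.sub(r'[quote=\1]\2[/quote]', text)


sample = '[quote="username":nw4lek0o]The quoted text[/quote:nw4lek0o]'
print(simplify_quotes(sample))
```

Nested quotes are not handled by this pattern, so posts containing them would need a separate check.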

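Alternatively, find (?=\").|(?::\w+) and replace every match with an empty string. Note that this removes every double quote and every colon followed by word characters, not just the ones inside quote tags.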
demo here : http://regex101.com/r/oN8fS2
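A small Python sketch of this second approach, to show both the intended result and the side effect on ordinary text. The function name and the second test sentence are invented for illustration.

```python
import re

# Matches any double quote, or a colon followed by word characters.
STRIP_RE = re.compile(r'(?=\").|(?::\w+)')


def strip_quotes_and_ids(text: str) -> str:
    """Delete every match of the pattern, wherever it occurs."""
    return STRIP_RE.sub('', text)


converted = strip_quotes_and_ids(
    '[quote="username":nw4lek0o]The quoted text[/quote:nw4lek0o]'
)
# Ordinary post text is affected too: quotes and ":30" disappear.
side_effect = strip_quotes_and_ids('He said "see you at 10:30"')

print(converted)
print(side_effect)
```

Because the pattern is not tied to the quote tags, the first regex above is probably the safer choice when post bodies contain quotes, times, or URLs.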

Related

How can I select the first quote ("), ignore the content in between, and select the second quote ("), given a pattern?

"234";"CASA "C"";"AM";
"235";"CASA F";"AM";
"236";"CASA "A"";"AM";
I have a file with several lines like the ones above. I would like to select only the first inner quote ("), ignore the content in between, and then select the other inner quote ("), so that I can turn
this: "CASA "C"";
into this: "CASA C".
For these cases, I already have a pattern that matches the full content:
(\"([a-zA-Z0-9]*)\"\")
Searching I found a way to match one of the quotes, but I couldn't "merge" the two matches of the quote:
((?=(\"([a-zA-Z0-9]*)\"\")).)
That is what I got so far! Thanks!
--
I am using Sublime Text.
You could use something like "([^"]+)"([^"]+)"" and replace it with "$1$2".
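To check the pattern against the sample lines outside the editor, a rough Python equivalent might look like this. The replacement "\1\2" (Python's spelling of "$1$2") and the function name are assumptions for illustration; lines are processed one at a time because [^"] would otherwise also match line breaks.

```python
import re

# Outer quote, text, inner quote, text, then the two closing quotes.
INNER_QUOTES_RE = re.compile(r'"([^"]+)"([^"]+)""')


def drop_inner_quotes(line: str) -> str:
    """Keep the outer quotes of a field and drop the inner pair."""
    return INNER_QUOTES_RE.sub(r'"\1\2"', line)


lines = [
    '"234";"CASA "C"";"AM";',
    '"235";"CASA F";"AM";',
    '"236";"CASA "A"";"AM";',
]
cleaned = [drop_inner_quotes(line) for line in lines]
for line in cleaned:
    print(line)
```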

Notepad++ Regex to find group of lines with condition

Given this example text:
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
...lines...
...........
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
...lines...
...........
</abr:rules> (end of group)
I would like to find and remove everything from <abr:rules> to </abr:rules>, on the condition that subapplication IS NOT "CM". Organization and application are always the same, and <abr:code> can be any string.
What I tried so far is
<abr:rules>\n<abr:ruleTypeDefinition>\n<abr:code>[a-zA-Z0-9]{3,}<\/abr:code>\n<abr:ownership>\n<.*"(FM|PSD|SSC)"\/>\n(?s).*?\n<\/abr:rules>\n
which works but only because I know the other subapplication names.
Is there any way to do it with Regex only ?
Try the following find and replace:
Find:
<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"((?!</abr:rules>).)*</abr:rules>
Replace:
(empty string)
Note: The above pattern will only work if you enable the ". matches newline" option in Notepad++. If you don't want to do that, you can use [\S\s] in place of each dot.
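As a quick way to test the pattern outside Notepad++, here is a sketch in Python, where re.DOTALL plays the role of the "dot matches newline" setting. The sample text is a shortened, made-up version of the question's example, and the function name is hypothetical.

```python
import re

# Pattern from the answer above; DOTALL lets "." cross line breaks.
NON_CM_RULES_RE = re.compile(
    r'<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"'
    r'((?!</abr:rules>).)*</abr:rules>',
    re.DOTALL,
)

# Shortened, invented sample modelled on the question's text.
SAMPLE = """<abr:rules>
<abr:code>ABB</abr:code>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
</abr:rules>
<abr:rules>
<abr:code>ADE</abr:code>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
</abr:rules>
"""


def remove_non_cm_rules(text: str) -> str:
    """Delete every rules block whose subapplication is not CM."""
    return NON_CM_RULES_RE.sub('', text)


result = remove_non_cm_rules(SAMPLE)
print(result)
```

The two negative lookaheads inside the repeated groups stop a match from running past the first subapplication attribute or past the closing tag, which is what keeps one block from swallowing the next.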
You should not use regex to parse XML; you can read why here:
https://stackoverflow.com/a/1732454/3763374
Instead, you can load the file with an XML parser and select the elements with XPath.
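A minimal sketch of the parser route using Python's standard xml.etree.ElementTree and its limited XPath support. Several things here are assumptions, not taken from the question: the wrapping <root> element, the namespace URI bound to the abr prefix, the closing tags that the question's excerpt elides, and the function name.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the real file declares its own.
NS = {"abr": "http://example.com/abr"}

# Invented, well-formed sample modelled on the question's excerpt.
SAMPLE = """<root xmlns:abr="http://example.com/abr">
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ABB</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="FM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ADE</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="CM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
</root>"""


def drop_non_cm_rules(xml_text: str) -> ET.Element:
    """Remove each rules element whose owner is not subapplication CM."""
    root = ET.fromstring(xml_text)
    # findall returns a list, so removing while looping is safe.
    for rules in root.findall("abr:rules", NS):
        owner = rules.find(".//abr:owner", NS)
        if owner is not None and owner.get("subapplication") != "CM":
            root.remove(rules)
    return root


root = drop_non_cm_rules(SAMPLE)
remaining_codes = [code.text for code in root.iterfind(".//abr:code", NS)]
print(remaining_codes)
```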

Complex regex single quote replace

I have a set of strings in which I would like to replace single quotes with double quotes. But sometimes the single quote to replace is at the end of the line, and sometimes a single quote should not be replaced because it follows an s to mark a possessive.
Example :
The song 'Miss you' is featured in The Rolling Stones' album 'Voodoo Lounge'
should be
The song "Miss you" is featured in The Rolling Stones' album "Voodoo Lounge"
Thanks for your help :)
Regular expressions can only deal with raw text; they can't tell context or grammar. So it is pretty much impossible to build a regular expression that correctly tells a possessive apostrophe after an s apart from a closing quote that happens to follow an s.
However, if you'd like to ignore such cases, and match rest of them, you can use the following regex with lookaround assertions:
(?<!s)'(?!s\b)
Note that this will not match the closing quote in valid cases such as 'Blurred Lines' or 'Dangerous', where the quoted text itself ends in s.
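The behaviour, including the limitation just mentioned, can be checked with a short Python sketch. The function name is made up, and the second sentence is an invented test case.

```python
import re

# A quote not preceded by "s" and not followed by a lone "s" (as in it's).
QUOTE_RE = re.compile(r"(?<!s)'(?!s\b)")


def to_double_quotes(text: str) -> str:
    """Replace the single quotes the pattern accepts with double quotes."""
    return QUOTE_RE.sub('"', text)


good = to_double_quotes(
    "The song 'Miss you' is featured in The Rolling Stones' album 'Voodoo Lounge'"
)
# Limitation: a title ending in "s" keeps its closing single quote.
missed = to_double_quotes("I like the song 'Blurred Lines'")

print(good)
print(missed)
```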

RegEx for quoted string with missing open parenthesis

What is the regex to find a quoted string that has only a closing parenthesis at the end, like this:
"People)"
But not
"(People)"
Something like so: "[^(]+?\)" should fit the bill. You might also need to escape the quotation marks and the backslash, depending on what regex engine you are using.
Some details on how this regex works are available here.
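For a quick check of that pattern, here is a small Python sketch. The helper name is hypothetical; a raw string is used so the backslash needs no extra escaping.

```python
import re

# A quote, one or more characters that are not "(", then ")" and a quote.
MISSING_OPEN_RE = re.compile(r'"[^(]+?\)"')


def has_unopened_paren(text: str) -> bool:
    """True if text contains a quoted string ending in ')' with no '('."""
    return MISSING_OPEN_RE.search(text) is not None


print(has_unopened_paren('"People)"'))
print(has_unopened_paren('"(People)"'))
```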
Can you try the following ?
String REGEX_TEST_STRING = "\"People)\"";
// In a Java string literal the regex escape \) has to be written as \\)
System.out.println(REGEX_TEST_STRING.matches("\"P.*\\)\""));
This code returns true for "People)" and false for "(People)"
HTH.

Regex for the value of an HTML Property

I have a load of links that look like this:
Taboola - Content you may like
I want to delete the entire ICON and ADD_DATE attributes and their values.
I'm using Sublime Text with a regex find/replace, but I'm not sure how to write the regex to grab everything between ICON=" and the closing ".
Any help would be appreciated!
This should work (escaping quotes as necessary):
ICON="[^"]*"
The reason ICON="(.*)" won't work is that regex quantifiers are greedy by default. If .* can match more of the string and still satisfy the pattern, it will, so the match runs on to the last quote on the line instead of stopping at the first one.
You can either specify a non greedy search, such as ICON=".*?" or explicitly declare matches on atoms that are not quotes as in the above answer.
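The difference between the three variants can be seen in a short Python sketch. The bookmark line below is invented, since the original link markup is not shown, and the combined pattern that strips both attributes along with their leading whitespace is an assumption about what the final find/replace might look like.

```python
import re

# Invented bookmark line with ICON in the middle of the attributes.
LINE = ('<A HREF="http://example.com/" ICON="data:image/png;base64,AAAA" '
        'ADD_DATE="1400000000">Example</A>')

greedy = re.search(r'ICON=".*"', LINE).group()      # runs to the last quote
lazy = re.search(r'ICON=".*?"', LINE).group()       # stops at the first quote
negated = re.search(r'ICON="[^"]*"', LINE).group()  # cannot cross a quote

# Assumed combined pattern: drop both attributes and their leading space.
stripped = re.sub(r'\s+(?:ICON|ADD_DATE)="[^"]*"', '', LINE)

print(greedy)
print(lazy)
print(negated)
print(stripped)
```

In Sublime Text the same combined pattern would go in the Find field with an empty Replace field.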