Regular expression replace double and single quotes with nothing - regex

I am using a Sphinx search module on a site I am developing, and it has an option to enter regular expressions that are replaced with specified characters.
The available options are Match Expression, Replace Expression and Replace Char (these are input fields in a CMS admin panel, so unfortunately I'm unsure of the actual function used behind the scenes). My understanding is that the search looks for any text matching Match Expression and, within that text, replaces whatever matches Replace Expression with Replace Char. So it's a sort of find and replace on matched terms.
Some examples that work:
Example 1
Match Expression: /[a-zA-Z0-9]*-[a-zA-Z0-9]*/
Replace Expression: /-/
Replace Char: empty
Matched text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV 123, CNC-PWR1
Result text: SX500123, GLX11A, GLZXVXV, GLZ/123, GLZV-123-123, CNCPWR1
More examples here: http://mirasvit.com/doc/ssp/2.3.2/ssp/global/long_tail
What I want to do is strip any single or double quotes or apostrophes from a search query.
Example inputs: "examination papers",'examination papers,'examination' "papers",pa"pers,pa'pers
Desired outputs: examination papers,examination papers,papers,papers,papers
I have tried just replacing the - with a " in the examples listed above for now but even this hasn't worked.
Any help would be greatly appreciated! Thank you

You can use these expressions:
Match Expression - /["'][\w\s]+["']|\w+["']\w+/
This will match the following text:
"examination papers",'examination papers','examination' "papers",pa"pers,pa'pers
Then you can use this regex to replace your quotes:
Replace Expression - /["']/
Replace Char - empty
So, your output will be:
examination papers,examination papers,examination papers,papers,papers
As context for this answer: I understand from the tool you are using that the match expression gathers a result set, to which a second regex (the replace expression) is applied, and whatever that second regex matches is replaced with the replace char.
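The two-stage behaviour described above can be sketched in Python. This is only an illustration of the assumed semantics (match first, then replace inside each match); the CMS module's real implementation is unknown, and the names `MATCH_EXPR`, `REPLACE_EXPR`, `REPLACE_CHAR` and `apply_rule` are hypothetical.

```python
import re

# Hypothetical stand-ins for the three admin-panel fields.
MATCH_EXPR = re.compile(r"""["'][\w\s]+["']|\w+["']\w+""")
REPLACE_EXPR = re.compile(r"""["']""")
REPLACE_CHAR = ""


def apply_rule(query: str) -> str:
    """Replace REPLACE_EXPR with REPLACE_CHAR, but only inside MATCH_EXPR hits."""
    return MATCH_EXPR.sub(
        lambda m: REPLACE_EXPR.sub(REPLACE_CHAR, m.group(0)),
        query,
    )


if __name__ == "__main__":
    for sample in ['"examination papers"', "pa\"pers", "pa'pers"]:
        print(repr(sample), "->", repr(apply_rule(sample)))
```

Text that the match expression does not cover is left untouched, which is why the match expression has to describe every quoted form you expect.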

Related

Regular Expression in Tableau returns only Nulls in Calculated Field

I'm trying to extract in Tableau the first occurrence of a part-of-speech name (e.g. subst, adj, fin), located between { and :, from every line of the column below:
{subst:pl:nom:m3=18, subst:pl:voc:m3=1, subst:pl:acc:m3=5}
{subst:sg:gen:m3=5, subst:sg:inst:m3=1, subst:sg:gen:f=1, subst:sg:nom:m3=1}
{subst:sg:nom:f=3, subst:sg:loc:f=2, subst:sg:inst:f=1, subst:sg:nom:m3=1}
{adj:sg:nom:m3:pos=2, adj:sg:acc:m3:pos=1, adj:sg:acc:n1.n2:pos=3, adj:pl:acc:m1.p1:pos=3, adj:sg:nom:f:pos=1}
{adj:sg:gen:f:pos=2, adj:sg:nom:n:pos=1}
{fin:sg:ter:imperf=5}
To do this I use the following regular expression: {(\w+):(?:.*?)}$. Unfortunately, my calculated field returns only Nulls:
[Screenshot from Tableau]
I checked my regular expression in a regex tester and it is valid:
[Screenshot from regex101.com]
I don't know what I'm doing wrong, so if anybody has any suggestions I would be grateful.
Tableau's regex engine is ICU, and there are some differences between it and PCRE.
One of them is that braces that should be matched as literal symbols must be escaped.
Your regex also contains a redundant non-capturing group ((?:.*?) is equivalent to .*?) and a lazy quantifier that slows down matching: since you want to check for a } at the end of the string, it should be changed to a greedy .*.
You can use
REGEXP_EXTRACT([col], '^\{(\w+):.*\}$')
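The pattern itself can be sanity-checked outside Tableau. The sketch below uses Python's `re` module, which is not the ICU engine, so it only confirms the logic of the escaped-brace pattern against rows from the question; the helper name `first_pos_tag` is hypothetical.

```python
import re

# Same pattern as in the REGEXP_EXTRACT call, with the braces escaped.
PATTERN = re.compile(r"^\{(\w+):.*\}$")


def first_pos_tag(row: str):
    """Return the first part-of-speech name in a row, or None if no match."""
    match = PATTERN.search(row)
    return match.group(1) if match else None


if __name__ == "__main__":
    rows = [
        "{subst:pl:nom:m3=18, subst:pl:voc:m3=1, subst:pl:acc:m3=5}",
        "{adj:sg:gen:f:pos=2, adj:sg:nom:n:pos=1}",
        "{fin:sg:ter:imperf=5}",
    ]
    for row in rows:
        print(first_pos_tag(row))
```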

In JMeter I need to extract a specific Regular Expression

In the following String:
Events('1234', '123456', '', 'QW233Cdse');
I need to extract "QW233Cdse"
Any suggestion?
When working with regular expressions, it is important to look for static text in the test string that can anchor a robust expression.
In your case, "Events()" appears to be static text with the dynamic values inside the round parentheses, so to build the regular expression keep the 'Events()' text and put the expression inside the parentheses, as shown below:
Test String: Events('1234', '123456', '', 'QW233Cdse');
Regular Expression can be:
Events\(.*'(.*)'\);
Events\(.* '(.+?)'\);
Note: The backslash before each round parenthesis stops it from being interpreted as a regex metacharacter. An unescaped parenthesis "(" begins a group, whereas "\(" tells the regex engine to match a literal parenthesis.
Regular expressions are among the most important things to learn when working with load testing tools; see the blog post below for more information:
https://www.redline13.com/blog/2016/01/jmeter-extract-and-re-use-as-variable/
Let me know if you have any further question
The relevant regular expression would be something like:
Events\(.* '(.+?)'\);
References:
JMeter: Regular Expressions
Using Regular Expressions in JMeter
Perl 5 Regex Cheat sheet
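As a quick check outside JMeter, the pattern Events\(.* '(.+?)'\); can be tried against the test string in Python. This is a sketch only: Python's `re` is not the engine JMeter uses, and the function name `last_event_arg` is hypothetical.

```python
import re

# Escaped parentheses match literally; the single capturing group
# holds the last quoted argument because the leading .* is greedy.
EVENTS_PATTERN = re.compile(r"Events\(.* '(.+?)'\);")


def last_event_arg(text: str):
    """Return the last quoted argument of an Events(...) call, or None."""
    match = EVENTS_PATTERN.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(last_event_arg("Events('1234', '123456', '', 'QW233Cdse');"))
```

In a JMeter Regular Expression Extractor the equivalent of `group(1)` would be the template `$1$`.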
Try using this regex:
\w+(?='\))
Regex would be:
, '([^']+?)'\);

Replace string using regular expression in KETTLE

I would like to use a regular expression to replace a certain pattern in Kettle. For example, given AAAA >5< BBBB, I want to replace it with AAAA 555 BBBB. I know how to find the pattern, but I am not sure how to replace it with a new string. One constraint is that I have to match > and < together as a pair, not separately, because there is another pattern, <5>.
You can use the "Replace in String" step in a transformation.
Set "use RegEx" to "Y", type your regex in the Search box (with capturing groups if necessary), and put the replacement string in the replacement box, referring to capture groups as $1, $2, ...
It will replace all occurrences of the regex in the original string.
If the Out stream field is omitted, it will overwrite the In stream field.
If you want the pattern >\d< replaced by a triple of the found digit, you can use Replace-In-String in regex mode:
Search: (.*)(>(\d)<)(.*)
Replace: $1$3$3$3$4
If you want all such patterns treated the same:
Search: (>(\d)<)
Replace: $2$2$2
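The search-and-replace pair above can be illustrated in Python. Note the assumption: Python's `re.sub` writes group references as `\1` rather than the `$1` form described for the Kettle step, and the function name `triple_marked_digits` is hypothetical.

```python
import re


def triple_marked_digits(text: str) -> str:
    """Replace every >d< (d a single digit) with ddd; <d> is left untouched."""
    return re.sub(r">(\d)<", r"\1\1\1", text)


if __name__ == "__main__":
    print(triple_marked_digits("AAAA >5< BBBB"))
    print(triple_marked_digits("AAAA <5> BBBB"))
```

Because the pattern requires > before the digit and < after it, the other markup <5> never matches.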
EDIT due to your improved requirement
Since you intend to convert your "simple" markup to a more HTML-like markup, you had better use a User-Defined-Java-Expression. Also, you must avoid reintroducing simple markup when replacing repeatedly.

Regular Expression to unmatch a particular string

I am trying to use a regular expression in JMeter to pull a value out of a particular string. Here is my input test string: <activationCode>insvn</activationCode>
I need to extract the code insvn from it. I tried using the expression
[^/<activationCode>]\w+, but it does not yield the required code. I am a newbie to regular expressions and I need help with this.
Can you use look-behind assertions in JMeter? If so, you can use this regex, which will give you the word that follows <activationCode>:
(?<=\<activationCode\>)\w+
If your input string is encoded (e.g. for HTML), use:
(?<=&lt;activationCode&gt;)\w+
When designing a regular expression in any language for something like this you can match your input string as three groups: (the opening tag, the content, and the closing tag) then select the content from the second group.
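The three-group approach can be sketched as follows in Python. It is an illustration of the idea, not JMeter configuration; the names `TAG_PATTERN` and `extract_code` are hypothetical.

```python
import re

# Group 1: opening tag, group 2: content, group 3: closing tag.
TAG_PATTERN = re.compile(r"(<activationCode>)(\w+)(</activationCode>)")


def extract_code(text: str):
    """Return the text between the activationCode tags, or None."""
    match = TAG_PATTERN.search(text)
    return match.group(2) if match else None


if __name__ == "__main__":
    print(extract_code("<activationCode>insvn</activationCode>"))
```

This avoids look-behind entirely, so it also works in engines that do not support it.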

How do you do a Find and Insert in Notepad++ instead of a replace, while using regular expression?

In Notepad++, how do you Find and Insert (instead of Find and Replace) while using a regular expression as the search criteria?
For non regular expression, you can simply include what you are finding in the replace value, but for regular expression, that won't work. Ideas?
Very simple: if you need to add some text to every match of your search, you can use backreferences in regular expressions. For example, you have:
this is a table.
and you want to get "this is a red table",
so you do search for:
(this is a)
and replace with (in regular expression mode):
\1 red
Also note that we used parentheses in the search. Each set of parentheses can be accessed in the replacement with the corresponding \N tag. So you can, for example, search for
(this is).*(table)
and replace it with
\1 not a \2
to get "this is not a table"
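The same find-and-insert idea can be tried outside Notepad++. Here is a minimal Python sketch, assuming Python's `re.sub`, which happens to use the same `\1` backreference syntax in the replacement; the function names are hypothetical.

```python
import re


def insert_red(text: str) -> str:
    """Keep the matched text (group 1) and insert ' red' after it."""
    return re.sub(r"(this is a)", r"\1 red", text)


def negate(text: str) -> str:
    """Keep groups 1 and 2 and replace whatever sits between them."""
    return re.sub(r"(this is).*(table)", r"\1 not a \2", text)


if __name__ == "__main__":
    print(insert_red("this is a table."))
    print(negate("this is a table."))
```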
Dmitry Avtonomov answered it right, but I just wanted to add an example for the case where you have something dynamic between two strings.
Example:
Line 1: Question 1
Line 2: Question 2
If you want to add a dot after each question number, you can do it this way.
In Notepad++
Replace : (Question)(.*)(\r\n)
With : \1\2.\3
Result:
Line 1: Question 1.
Line 2: Question 2.
Have you checked other posts?
Maybe this will help you get your answers:
Using regular expressions to do mass replace in Notepad++ and Vim
http://markantoniou.blogspot.com/2008/06/notepad-how-to-use-regular-expressions.html