Regex to find common letters between two strings

I've been searching on Google for a few hours and got a partial solution.
I'm new to both Groovy and regular expressions. I've used regex sporadically over the years, but I am far from comfortable with it.
I've got a simple game that checks how many letters you have in common with a hidden word.
For simplicity's sake, let's say the word is "pan" and the person types "can".
I want the result of the regex to give me "an".
Right now, I've got this partly working by doing this (in Groovy):
// Where "guess" is the user's try and "word" is the word they need to guess.
def expr = "[$word]"
def result = guess.find(expr)
The result string contains only the first matching letter. Anyone have any more elegant solutions?
Thanks in advance

I don't think this is a good use case for a regex. You would have to guard against things like the user guessing every letter automatically by entering .* or the like.
Typical collection work is better suited for this task IMO. One solution would be to find the intersection of both words treating them as sets of characters:
(word as Set).intersect(guess as Set).join()
Or filtering the guess' characters that appear in the secret word:
guess.findAll { word.contains(it) }.unique().join()

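Suppose the two strings are s1 and s2.
To find the common letters, remove from s1 every character that does not appear in s2:
commonString = s1.replaceAll("[^" + s2 + "]", "");
If your word may contain regex metacharacters, quote it first and keep the returned value:
s2 = Pattern.quote(s2);
and then run the same replacement:
commonString = s1.replaceAll("[^" + s2 + "]", "");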

You could try:
guess.findAll( /[$word]/ ).join()
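For comparison, here is a minimal Python sketch of the same character-class idea. Python is an assumption (the question uses Groovy), and common_letters is a hypothetical helper name. The secret word is escaped before it goes into the class, so a word such as ".*" is treated as literal characters rather than as a pattern.

```python
import re

def common_letters(word: str, guess: str) -> str:
    """Return the characters of guess that also occur in word, in guess order."""
    if not word:
        # "[]" is not a valid character class, so handle the empty word up front.
        return ""
    # re.escape neutralises characters such as "]", "^", "-" and "\" inside the class.
    char_class = "[" + re.escape(word) + "]"
    return "".join(re.findall(char_class, guess))

print(common_letters("pan", "can"))  # an
```

Unlike the set-based answers, this keeps repeated letters of the guess; whether that is desirable depends on the game's rules.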

Related

"REGEX" Match string not containing specific substring

I will give an example, I have two strings:
FL_0DS906555B_3661_27012221225012_V001_S
FL_0DS906555C_3661_27012221225012_V001_S
I want to match any string that does not contain "0DS906555B", does contain "2701222122", and whose next four digits (here "5012") are in the range 5003-5012.
My regex looks like this:
^.*(?!.*0DS906555B).{6}2701222122(500[3-9]|501[0-2]).*$
Unfortunately, it matches everything. I have looked at many posts here, but none of them helped, since people usually ask about smaller, less complex strings.
Thank you
Try (regex101):
^(?!.*0DS906555B)(?=.*_2701222122(?:500[3-9]|501[012])_).*$
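One way to see why the original pattern matched everything: in ^.*(?!.*0DS906555B) the leading .* is free to consume the forbidden substring, so the lookahead is only tested at a position after it. The sketch below checks both patterns against the two sample strings. Python's re module is an assumption here, since the question does not name a language, and the constant names are made up for illustration.

```python
import re

# Pattern from the question: the lookahead runs after ".*" has already
# skipped past "0DS906555B", so it never sees the forbidden substring.
ORIGINAL = re.compile(r"^.*(?!.*0DS906555B).{6}2701222122(500[3-9]|501[0-2]).*$")

# Pattern from the answer: both conditions are checked from the start of the string.
FIXED = re.compile(r"^(?!.*0DS906555B)(?=.*_2701222122(?:500[3-9]|501[012])_).*$")

SAMPLE_B = "FL_0DS906555B_3661_27012221225012_V001_S"
SAMPLE_C = "FL_0DS906555C_3661_27012221225012_V001_S"

for sample in (SAMPLE_B, SAMPLE_C):
    print(sample, bool(ORIGINAL.match(sample)), bool(FIXED.match(sample)))
```

With these two inputs the original pattern accepts both strings, while the anchored lookaheads accept only the second one.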

What is the proper way to check if a string contains a set of words in regex?

I have a string, let's say, jkdfkskjak some random string containing a desired word
I want to check if the given string has a word from a set of words, say {word1, word2, word3} in latex.
I can easily do it in Java, but I want to achieve it using regex. I am very new to regular expressions.
If you only want to recognise the words anywhere, even as part of a longer word, use:
(word1|word2|...|wordn)
(see first demo)
If you want them to appear as isolated words, then
\b(word1|word2|...|wordn)\b
should be the answer (see second demo).
I don't fully understand the context, such as what kind of text you have or what kind of words these will be, but I can offer an easy solution. You can generate the regex programmatically from the literal words, for example (dormammu|bargain), and then search for it in a text such as "dormammu I come to bargain". I have no clue about LaTeX, but I don't think that is your question.
For more information you can tinker with it at [regex101][1].
If you are having trouble understanding it, [regexone][2] is the place to go. It's a good start for beginners.
[1]: http://regex101.com
[2]: https://regexone.com/
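Generating the alternation programmatically could look like the following sketch. Python is an assumption (the question mentions Java and LaTeX), and contains_any with its whole_words flag is a hypothetical helper, not an existing API.

```python
import re

def contains_any(text: str, words: list[str], whole_words: bool = True) -> bool:
    """Report whether text contains at least one of the given words."""
    if not words:
        return False
    # Escape each word so characters like "." or "+" are matched literally.
    alternation = "|".join(re.escape(w) for w in words)
    if whole_words:
        pattern = rf"\b(?:{alternation})\b"
    else:
        pattern = rf"(?:{alternation})"
    return re.search(pattern, text) is not None

print(contains_any("dormammu I come to bargain", ["dormammu", "bargain"]))  # True
print(contains_any("keywords", ["word"]))                                    # False
print(contains_any("keywords", ["word"], whole_words=False))                 # True
```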

Regex to match sentences with jumbled words but preserving sentence order

I want to match sentences in such a way that the words within a sentence can be in any order, but the sentences themselves must stay in the same order.
e.g.
My name is Sam. I love regex.
Acceptable input:
My Sam is name. regex I love.
name is My Sam. I regex love.
Invalid input:
I love regex. My name is Sam.
regex I love. is My name Sam.
This is the regex I have come up with so far to solve the above problem:
^((?=.*\bMy\b)(?=.*\bSam\b)(?=.*\bis\b)(?=.*\bname\b))((?=.*\bregex\b)(?=.*\bI\b)(?=.*\blove\b)).*$
Which is not working as expected.
Can this problem be solved by regex? What would be the recommended approach to solve this?
Note: Please ignore . I am using it just for clarity.
I think you are looking for something other than regex. The most efficient way to do this would be to keep an array of the expected words for each sentence and check that each of them appears exactly once in the corresponding input sentence. How you do that depends on the language and context you are working in. If you need a regex that literally finds what you stated in your example, you could use something like this:
/(My|name|is|Sam) (My|name|is|Sam) (My|name|is|Sam) (My|name|is|Sam)\. (I|love|regex) (I|love|regex) (I|love|regex)\./g
But as you can see, this regex grows quickly with the number of words, because every position repeats the full alternation. It also accepts repeated words such as "My My My My.", and it is really inefficient compared to parsing the text with something else.
I couldn't achieve this with a single regex. Instead, I did the following:
Virtually divided the sentence into multiple blocks.
Maintained a sentence block -> regex configuration.
The regex configuration depends on the rule applicable to that sentence block.
Applied the regex to the sentence to identify whether such a block exists.
Finally, verified whether the blocks appear in the configured order.
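The word-list comparison suggested in the first answer can be sketched without a single large regex. This is only an illustration in Python (an assumption, no language is named in the question). It also assumes that sentences are delimited by ".", "!" or "?" and that matching is case-sensitive; sentence_words and same_sentences_jumbled are hypothetical names.

```python
import re

def sentence_words(text: str) -> list[list[str]]:
    """Split text into sentences and return the sorted words of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [sorted(re.findall(r"\w+", s)) for s in sentences]

def same_sentences_jumbled(expected: str, candidate: str) -> bool:
    """True if candidate has the same sentences in the same order,
    with the words inside each sentence in any order."""
    return sentence_words(expected) == sentence_words(candidate)

expected = "My name is Sam. I love regex."
print(same_sentences_jumbled(expected, "My Sam is name. regex I love."))  # True
print(same_sentences_jumbled(expected, "I love regex. My name is Sam."))  # False
```

Sorting the words of each sentence makes the comparison order-insensitive inside a sentence, while comparing the outer lists keeps the sentence order significant.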

Using a regular expression to insert text in a match

Regular Expressions are incredible. I'm in my regex infancy so help solving the following would be greatly appreciated.
I have to search through a string to match for a P character that's not surrounded by operators, power or negative signs. I then have to insert a multiplication sign. Case examples are:
33+16*55P would become 33+16*55*P
2P would become 2*P
P( 33*sin(45) ) would become P*(33*sin(45))
I have written some regex that I think handles this although I don't know how using regex I can insert a character:
The regex I've written is:
[^\^\+\-\/\*]?P+[^\^\+\-\/\*]
The language where the RegEx will be used is ActionScript 3.
A live example of the regex can be seen at:
http://www.regexr.com/39pkv
I would be massively grateful if someone could show me how to insert a multiplication sign in the middle of the match, i.e. P2 becomes P*2 and 22.5P becomes 22.5*P.
ActionScript 3 has search, match and replace functions that all utilise regular expressions. I'm unsure how I'd use string.replace( expression, replaceText ) in this context.
Many thanks in advance
Welcome to the wonder (and inevitable frustration that will lead to tearing your hair out) that is regular expressions. You should probably read over the documentation on using regular expressions in ActionScript, as well as this similar question.
You'll need to combine RegExp.test() with the String.replace() function. I don't know ActionScript, so I don't know if it will work as is, but based on the documentation linked above, the below should be a good start for testing and for getting an idea of what your solution might look like. I think @Vall3y is right. To get the replace right, you'd want to first check for anything leading up to a P, then for anything after a P. So two functions are probably easier to get right without getting too fancy with the regex:
private function multiplyBeforeP(str:String):String {
// A regex literal avoids string-escaping issues; the "g" flag replaces every occurrence.
// The character before P is required and must not be an operator, "(" or whitespace.
var pattern:RegExp = /([^\^+\-\/*(\s])P/gi;
return str.replace(pattern, "$1*P");
}
private function multiplyAfterP(str:String):String {
// The character after P is required and must not be an operator, ")" or whitespace.
var pattern:RegExp = /P([^\^+\-\/*)\s])/gi;
return str.replace(pattern, "P*$1");
}
A regex on its own only describes a pattern to find in a string; the insertion itself has to be done by the host language, here ActionScript.
Many programming languages have a string.replace method that accepts a regex pattern. Since you have two cases (inserting before and after the P), a simple solution would be to split your regex into two patterns, for example [^\^\+\-\/\*]?P+ and P+[^\^\+\-\/\*] (these might need adjustment), and replace each match with the corresponding string ("*P" and "P*").
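As an alternative to two capture-group replacements, the insertion can be expressed with zero-width lookarounds, so that only the missing "*" is inserted and nothing else is rewritten. The sketch below is in Python rather than ActionScript 3 (an assumption made so it can be run as is), and it only treats digits and parentheses as neighbours that imply multiplication. insert_multiplication is a hypothetical name.

```python
import re

# Zero-width positions where a "*" is missing:
#   a digit or ")" immediately before P, or P immediately before a digit or "(".
IMPLICIT_PRODUCT = re.compile(r"(?<=[0-9)])(?=P)|(?<=P)(?=[0-9(])")

def insert_multiplication(expr: str) -> str:
    """Insert "*" wherever P is multiplied implicitly."""
    return IMPLICIT_PRODUCT.sub("*", expr)

print(insert_multiplication("33+16*55P"))        # 33+16*55*P
print(insert_multiplication("2P"))               # 2*P
print(insert_multiplication("P( 33*sin(45) )"))  # P*( 33*sin(45) )
print(insert_multiplication("P2"))               # P*2
```

Expressions where P already sits next to an operator, such as "2+P" or "P^2", are left unchanged because neither lookaround pair matches there.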

RegEx: Match Mr. Ms. etc in a "Title" Database field

I need to build a RegEx expression which gets its text strings from the Title field of my Database. I.e. the complete strings being searched are: Mr. or Ms. or Dr. or Sr. etc.
Unfortunately this field was a free field and anything could be written into it. e.g.: M. ; A ; CFO etc.
The expression needs to match on everything except: Mr. ; Ms. ; Dr. ; Sr. (NOTE: The list is a bit longer but for simplicity I keep it short.)
WHAT I HAVE TRIED SO FAR:
This is what I am using successfully on another field:
^(?!(VIP)$).* (This will match every string except "VIP")
I rewrote that expression to look like this:
^(?!(Mr.|Ms.|Dr.|Sr.)$).*
Unfortunately, this did not work. I assume this is because the "." (dot) is a reserved symbol in RegEx and needs special handling.
I also tried:
^(?!(Mr\.|Ms\.|Dr\.|Sr\.)$).*
But no luck either.
I looked around in the forum and tested some other solutions but could not find any which works for me.
I would like to know how I can build my expression so that it checks the complete (short) string and matches everything except "Mr." etc. Any help is appreciated!
Note: My Question might seem unusual and seems to have many open ends and possible errors. However the rest of my application is handling those open ends. Please trust me with this.
If you want your string simply to not start with one of those prefixes, then do this:
^(?!([MDS]r|Ms)\.).*$
The above simply ensures that the beginning of the string (^) is not followed by one of your listed prefixes. (You shouldn't even need the .*$ but this is in case you're using some engine that requires a complete match.)
If you want your string to not have those prefixes anywhere, then do:
^((?!([MDS]r|Ms)\.).)*$
The above ensures that no position in the string starts one of your listed prefixes, all the way to the end (so the $ is necessary in this one).
I just read that your list of prefixes may be longer, so here is an expanded form that is easier to add to:
^((?!(Mr|Ms|Dr|Sr)\.).)*$
If you mean strings that consist entirely of one of the prefixes, then exclude just those:
^(?!(Mr|Ms|Dr|Sr)\.$).*$
And if you want to make the dot optional:
^(?!(Mr|Ms|Dr|Sr)\.?$).*$
With | we can list any number of prefix patterns to match against the string.
var pattern = /^(Mrs\.|Mr\.|Ms\.|Dr\.|Er\.)\s?[A-Za-z]+$/;
var str = "Mrs.Panchal";
console.log(str.match(pattern));
This may do it:
/(?!.*?(?:^|\W)(?:(?:Dr|Mr|Mrs|Ms|Sr|Jr)\.?|Miss|Phd|\+|&)(?:\W|$))^.*$/i
from that page I mentioned
Rather than trying to construct a regex that matches anything except Mr., Ms., etc., it would be easier (if your application allows it) to write a regex that matches only those strings:
/^(Mr|Ms|Dr|Sr)\.$/
and just swap the logic for handling matching vs non-matching strings.
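Both directions can be built from one list of titles, which may help when the real list is longer. The following is a sketch in Python (an assumption, the question does not say which engine the database uses); TITLES, IS_A_TITLE and NOT_A_TITLE are illustrative names.

```python
import re

# Hypothetical list; extend it with the remaining titles from the real data.
TITLES = ["Mr", "Ms", "Dr", "Sr"]
ALTERNATION = "|".join(re.escape(t) for t in TITLES)

# Matches only strings that are exactly one title followed by a dot.
IS_A_TITLE = re.compile(rf"^(?:{ALTERNATION})\.$")

# Matches every string except those, using a negative lookahead
# that covers the whole string up to "$".
NOT_A_TITLE = re.compile(rf"^(?!(?:{ALTERNATION})\.$).*$")

for value in ["Mr.", "M.", "A", "CFO", "Dr."]:
    print(value, bool(IS_A_TITLE.match(value)), bool(NOT_A_TITLE.match(value)))
```

For each value exactly one of the two patterns matches, so either can be used depending on whether the application can invert its own logic.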
import re
name = re.sub(r'^([MmDdSs][RSrs]{1,2}|[Mm]iss)\.? ', '', name)