Create Regex pattern for calculator - regex

I am trying to create a calculator where the operands are words. The operand/operator pattern can repeat any number of times.
e.g. EmpName+xyz or EmpName or x+rr+fff.
It should reject such pattern e.g.
EmpName+
I created a regular expression:
(?m)(?<Operand>^[a-z].*?)(?<Operator>[+*])
On this output:
1) a + b
2) ab+dddd
3) ab*fffff*ggggg
4) dfg+fg4444+fgf4
5) xxxxx
But it only matches lines 1, 2, 3 and 4, and only up to the first operator. Output in regex builder 2.05:
"Operand: [ab]"
"Operator:[+]"
I am using regex builder 2.05 to test my regex. How can I repeat this pattern any number of times? Thanks in advance.

This would typically be expressed as
operand followed by (operator operand) zero or more times
that is
(?m)^<Operand>(<Operator><Operand>)*$
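A minimal Python sketch of the "operand followed by (operator operand)" shape. It assumes an operand is a letter followed by letters or digits and that spaces around operators are allowed; both are guesses from the sample lines, not something the question states. The helper names are hypothetical.

```python
import re

# Assumed operand: a letter followed by letters/digits (e.g. EmpName, fg4444).
OPERAND = r"[A-Za-z][A-Za-z0-9]*"
# Operators taken from the question's pattern; optional spaces are an assumption.
OPERATOR = r"\s*[+*]\s*"

# operand, then (operator operand) repeated zero or more times
EXPRESSION = re.compile(rf"{OPERAND}(?:{OPERATOR}{OPERAND})*")


def is_valid_expression(text: str) -> bool:
    """Return True when the whole string is operand (operator operand)*."""
    return EXPRESSION.fullmatch(text) is not None


def split_expression(text: str) -> list[str]:
    """Return operands and operators in order of appearance."""
    if not is_valid_expression(text):
        raise ValueError(f"not a valid expression: {text!r}")
    return re.findall(rf"{OPERAND}|[+*]", text)


print(is_valid_expression("ab*fffff*ggggg"))  # True
print(is_valid_expression("EmpName+"))        # False
print(split_expression("ab+dddd"))            # ['ab', '+', 'dddd']
```

Validating the whole string first and extracting tokens second avoids relying on repeated capture groups, which in most engines keep only the last repetition.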
Yes, I need parentheses as well as the divide, percentage and minus signs.
Then I suggest considering using a real parser. The language of balanced parentheses is not regular.
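As a rough illustration of what "a real parser" could look like, here is a small recursive-descent validator in Python. The grammar, the operator set (+ - * / %) and the function names are assumptions made for this sketch; it only checks well-formedness and does not evaluate anything.

```python
import re

# A token is an identifier, an operator, or a parenthesis (leading spaces skipped).
TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*|[-+*/%()])")
OPERATORS = {"+", "-", "*", "/", "%"}


def tokenize(text: str) -> list[str]:
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_valid(text: str) -> bool:
    """Grammar: expr := term (operator term)* ; term := operand | '(' expr ')'"""
    try:
        tokens = tokenize(text)
    except ValueError:
        return False
    pos = 0

    def term() -> bool:
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if not expr() or pos >= len(tokens) or tokens[pos] != ")":
                return False
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos][0].isalpha():
            pos += 1
            return True
        return False

    def expr() -> bool:
        nonlocal pos
        if not term():
            return False
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            pos += 1
            if not term():
                return False
        return True

    # Valid only if one expression consumes every token.
    return expr() and pos == len(tokens)


print(is_valid("(a + b) * c"))  # True
print(is_valid("(a + b"))       # False
```

The recursion in `term` is what handles nested parentheses, which a classical regular expression cannot count.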

Related

Regular expression to validate sum of numerics

I want to validate the user input,
it should accept,
1+2
1.2+56+3.5
it should not accept any alphabets, special characters other than . and +
and mainly it should not accept ++ and ..
please help me with regular expression.
This should work:
var regex = /^[0-9]+(\.[0-9]+)?(\+[0-9]+(\.[0-9]+)?)*$/;
"1+2".match(regex); // not null
"1.2+56+3.5".match(regex); // not null
"1++2".match(regex); // null
"1..2".match(regex); // null
online: http://regex101.com/r/zJ6tP7/1
Something like this should suffice
^([+]?\d+(\.\d+)?)*$
http://regex101.com/r/qE2kW1/2
Source: Validate mathematical expressions using regular expression?
Note that I'm assuming it should not accept .+ or +. as well. I'm not assuming that you require checking for multiple decimals prior to an addition, meaning this will accept 3.4.5. I'm also assuming you want it to start and end with numbers, so .5 and 4+ will fail.
(\d*[\.\+])*(\d)*
This takes any amount of number values, followed by a . or a +, any number of times, followed by any other number value.
To avoid things like 3.4.5 you'll likely need to use some sort of lookaround.
EDIT: Forgot to format regular expression.

How can I add the multiplication sign to an algebraic expression through regex?

I am writing a mathematical parser in which a user can enter answers to be evaluated. How can I convert something like 'xe^x + xyz' to 'x*e^x + x*y*z' through Regex?
Alternative methods would be welcome too. Thank you!
Look for each occurrence of:
(?<=[a-zA-Z0-9])(?=[a-zA-Z])
Replace by:
*
Edit:
As pointed out by @ChristopherCreutzig, this regex will also handle cases like 23xy in the most probable expected way. That is:
considering a sequence of digits as part of a single expression,
considering a digit followed by a letter as a multiplication,
considering a letter followed by a digit as part of a single expression.
For example, for this input:
2x1 + 3xy
The resulting output is:
2*x1 + 3*x*y
See it in action and try it out live here on regex101.
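For reference, a short Python sketch of the same lookbehind/lookahead replacement using `re.sub`. The function name is made up for illustration.

```python
import re

# Zero-width position with a letter or digit behind it and a letter ahead of it.
IMPLICIT_PRODUCT = re.compile(r"(?<=[a-zA-Z0-9])(?=[a-zA-Z])")


def add_multiplication_signs(expression: str) -> str:
    """Insert '*' at every implicit multiplication position."""
    return IMPLICIT_PRODUCT.sub("*", expression)


print(add_multiplication_signs("xe^x + xyz"))  # x*e^x + x*y*z
print(add_multiplication_signs("2x1 + 3xy"))   # 2*x1 + 3*x*y
```

One caveat of treating every letter as its own variable: a function name such as sin(x) would come out as s*i*n(x), so named functions would need to be protected before applying the substitution.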
(?<=\w)(?=\w)
(looking one letter before and after)
replace by
*

Grep for Pattern in File in R

In a document, I'm trying to look for occurrences of a 12-digit string which contains alpha and numerals. A sample string is: "PXB111X2206"
I'm trying to get the line numbers that contain this string in R using the below:
FileInput = readLines("File.txt")
prot_pattern="([A-Z0-9]{12})";
prot_string<-grep(prot_pattern,FileInput)
prot_string
This worked fine until it hit a document containing all upper-case titles and returned a line containing the word "CONCENTRATIO"
The string I am trying to look for is: "PXB111X2206". I am expecting the grep to return the line numbers containing the string : "PXB111X2206". It however is returning the line number containing the word: "CONCENTRATIO"
What is wrong with my expression above? Any idea what I am doing wrong here?
Here is some sample input:
Each design objective described herein is significantly important, yet it is just one aspect of what it takes to achieve a successful project.
A successful project is one where project goals are identified early on and where the >interdependencies of all building systems are coordinated concurrently from the planning and programming phase.
CONCENTRATION:
The areas of concentration for design objectives: accessible, aesthetics, cost effective, >functional/operational, historic preservation, productive, secure/safe, and sustainable and >their interrelationships must be understood, evaluated, and appropriately applied.
Each of these design objectives is presented in the design objectives document number. >PXB111X2206.
>
Thanks & Regards,
Simak
You are using a very powerful tool for a very simple task, the expression
[A-Z0-9]{12}
will match any run of 12 uppercase letters or digits, for example the word "CONCENTRATIO". However, your "PXB111X2206" is only 11 characters long, so this pattern cannot be matching it. If you only want to match "PXB111X2206", you can use it as the regular expression itself. For example, if your file contents are:
foo
CONCENTRATIO.
bazz
foo bar bazz PXB111X2206 foo bar bazz
foo
bar
bazz
and you use:
grep('PXB111X2206',readLines("File.txt"))
then R will only match line 4 as you would wish.
EDIT
If you are looking for that specific pattern try:
grep('[A-Z]{3}[0-9]{3}[A-Z]{1}[0-9]{4}',readLines("File.txt"))
That expression will match strings like 'AAADDDADDDD', where A is a capital letter and D a digit. The regular expression is built from character classes (the symbols inside square brackets) and quantifiers (the number inside the curly braces) that tell how many times the preceding class must occur; if no quantifier is present, it is 1.
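The same idea sketched in Python, for comparison. The `\b` word boundaries are an addition not present in the R call above, assumed here so that the code is not matched inside a longer token; the function name is hypothetical.

```python
import re

# Three letters, three digits, one letter, four digits, as a whole word.
PROT_PATTERN = re.compile(r"\b[A-Z]{3}[0-9]{3}[A-Z][0-9]{4}\b")


def matching_line_numbers(lines):
    """Return 1-based numbers of the lines containing a match, like R's grep()."""
    return [number for number, line in enumerate(lines, start=1)
            if PROT_PATTERN.search(line)]


sample = [
    "foo",
    "CONCENTRATIO.",
    "bazz",
    "foo bar bazz PXB111X2206 foo bar bazz",
    "foo",
    "bar",
    "bazz",
]
print(matching_line_numbers(sample))  # [4]
```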
Let's take a look at what your regular expression means. [A-Z0-9] means any capital letter or digit, and {12} means the previous expression must occur exactly 12 times. The string CONCENTRATIO is 12 capital letters, so it's no surprise that grep picks it up. If you want to take out the matches that consist of just letters or just numbers you could try something like
allletters <- grep("[A-Z]{12}",strings)
allnumbers <- grep("[0-9]{12}",strings)
both <- grep("[A-Z0-9]{12}",strings)
the matches you wanted would then be something like
both <- both[!both %in% union(allletters,allnumbers)]
Someone with better regexfu might have a more elegant solution, but this will work too.
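One possible single-pattern alternative, sketched in Python: two lookaheads require at least one letter and at least one digit inside the token. The token length is a parameter because the question says 12 while the sample code is 11 characters long; the word boundaries and the function name are assumptions of this sketch.

```python
import re


def mixed_code_pattern(length: int) -> re.Pattern:
    """Uppercase-alphanumeric token of `length` characters that contains
    at least one letter and at least one digit."""
    return re.compile(
        r"\b(?=[A-Z0-9]*[A-Z])(?=[A-Z0-9]*[0-9])" + rf"[A-Z0-9]{{{length}}}\b"
    )


print(mixed_code_pattern(12).search("CONCENTRATIO"))                # None
print(mixed_code_pattern(11).search("number. >PXB111X2206.")[0])    # PXB111X2206
```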

Regular Expression: Match Hex/Key Strings

I have tried creating a regular expression myself to do this, but honestly my mind is so boggled with it right now that I must ask for help... This may be helpful for people in the future as well.
I have the following input templates:
06-6A-BF-05-AF-84-DF-A4-23-7C-BE-B4-6C-95-D7
JK1T-XTSRV-2HC4D-RP4S7-ZMKRG
I need to pick out strings like these two from an input string. An input string may look like this:
JK1T-XTSRV-2HC4D-RP4S7-ZMKRG
FDGF-A1S0M-5M8XJ-T08WC-BCZSJ
C6-6C-1C-17-B7-EE-BE-EA-E3-7C-EF-23-6C-12-F1
asdf234 ,f C6-324_EE
In this case, the following would be returned:
JK1T-XTSRV-2HC4D-RP4S7-ZMKRG, FDGF-A1S0M-5M8XJ-T08WC-BCZSJ, C6-6C-1C-17-B7-EE-BE-EA-E3-7C-EF-23-6C-12-F1
Thus, the regular expression would need to have the following restrictions to match a string:
15 two character (numbers or letters) pairs separated by -
5 four character (numbers or letters) pairs separated by -
What regular expression will match these?
You should use two regular expressions:
(\w{2}-){14}\w{2}
\w{4}-(\w{5}-){3}\w{5}
The second type is actually one four char and four five char.
Test 1:
http://fiddle.re/h3ve6
Test 2:
http://fiddle.re/3a5e6
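A Python sketch that combines both shapes into one alternation and runs it over the sample input. It uses [A-Z0-9] instead of \w and adds word boundaries; both are assumptions based on the examples (uppercase letters and digits only), not requirements stated in the question.

```python
import re

KEY_PATTERN = re.compile(
    r"\b(?:[A-Z0-9]{2}-){14}[A-Z0-9]{2}\b"   # 15 two-character groups
    r"|\b[A-Z0-9]{4}(?:-[A-Z0-9]{5}){4}\b"   # one 4-char group, then four 5-char groups
)

text = """JK1T-XTSRV-2HC4D-RP4S7-ZMKRG
FDGF-A1S0M-5M8XJ-T08WC-BCZSJ
C6-6C-1C-17-B7-EE-BE-EA-E3-7C-EF-23-6C-12-F1
asdf234 ,f C6-324_EE"""

# No capturing groups are used, so findall returns the full matches.
for key in KEY_PATTERN.findall(text):
    print(key)
```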

Trying to build a regular expression to check pattern

a) Start and end with a number
b) Hyphen should start and end with a number
c) Comma should start and end with a number
d) Range of number should be from 1-31
[Edit: Need this rule in the regex, thanks Ed-Heal!]
e) If a number starts with a hyphen (-), it cannot end with any other character other than a comma AND follow all rules listed above.
E.g. 2-2,1 OR 2,2-1 is valid while 1-1-1-1 is not valid
E.g.
a) 1-5,5,15-29
b) 1,28,1-31,15
c) 15,25,3 [Edit: Replaced 56 with 3, thanks for pointing it out Brian!]
d) 1-24,5-6,2-9
Tried this but it passes even if the string starts with a comma:
/^[0-9]*(?:-[0-9]+)*(?:,[0-9]+)*$/
How about this? This will check rules a, b and c, at least, but does not check rule d.
/^[0-9]+(-[0-9]+)?(,[0-9]+(-[0-9]+)?)*$/
If you need to ensure that all the numbers are in the range 1-31, then the expression will get a whole lot uglier:
/^([1-9]|[12][0-9]|3[01])(-([1-9]|[12][0-9]|3[01]))?(,([1-9]|[12][0-9]|3[01])(-([1-9]|[12][0-9]|3[01]))?)*$/
Note that your example c contains a number, 56, that does not fall within the range 1-31, so it will not pass the second expression.
try this
^\d+(-\d+)?(,\d+(-\d+)?)*$
DEMO
Here are my workings
Numbers:
0|([1-9][0-9]*) call this expression A. Note that this expression treats zero as a special case and prevents numbers starting with a zero, e.g. 0000001234.
Number or a range:
A|(A-A) call this expression B, i.e. (0|([1-9][0-9]*))|((0|([1-9][0-9]*))-(0|([1-9][0-9]*)))
Comma operator
B(,B)*
Putting this together should do the trick and we get
((0|([1-9][0-9]*))|((0|([1-9][0-9]*))-(0|([1-9][0-9]*))))(,((0|([1-9][0-9]*))|((0|([1-9][0-9]*))-(0|([1-9][0-9]*)))))*
You can abbreviate this with \d for [0-9]
The other approaches have not restricted the allowed range of numbers. This allows 1 through 31 only, and seems simpler than some of the monstrosities people have come up with ...
^([12][0-9]?|3[01]?|[4-9])([-,]([12][0-9]?|3[01]?|[4-9]))*$
There is no check for sensible ranges; adding that would make the expression significantly more complex. In the end you might be better off with a simpler regex and implementing sanity checks in code.
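A sketch of that "simpler regex plus checks in code" approach in Python. The function name and parameters are hypothetical. Since the question treats 2,2-1 as valid, descending ranges are accepted by default and only rejected when `require_ascending` is set.

```python
import re

# Shape only: numbers or number-number ranges, separated by commas.
SHAPE = re.compile(r"[0-9]+(?:-[0-9]+)?(?:,[0-9]+(?:-[0-9]+)?)*")


def is_valid_day_list(text: str, low: int = 1, high: int = 31,
                      require_ascending: bool = False) -> bool:
    """Check the overall shape with a regex, then the numeric limits in code."""
    if SHAPE.fullmatch(text) is None:
        return False
    for item in text.split(","):
        bounds = [int(part) for part in item.split("-")]
        if any(not low <= value <= high for value in bounds):
            return False
        if require_ascending and len(bounds) == 2 and bounds[0] > bounds[1]:
            return False
    return True


print(is_valid_day_list("1-5,5,15-29"))  # True
print(is_valid_day_list("1-1-1-1"))      # False
print(is_valid_day_list("15,25,56"))     # False
```

Keeping the range limits in code means changing 31 to another bound does not require rewriting the pattern.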
I propose the following regex:
(?<number>[1-9]|[12]\d|3[01]){0}(?<thing>\g<number>-\g<number>|\g<number>){0}^(\g<thing>,)*\g<thing>$
It looks awful but it isn't :) In fact the construction (?<name>...){0} allows us to define a named regex and to say that it doesn't match where it is defined. Thus I defined a pattern for numbers called number and a pattern for what I called a thing i.e. a range or number called thing. Next I know that your expression is a sequence of those things, so I use the named regex thing to build it with the construct \g<thing>. It gives (\g<thing>,)*\g<thing>. That's easy to read and understand. If you allow whitespaces to be non significant in your regex, you could even indent it like this:
(?<number>[1-9]|[12]\d|3[01]){0}
(?<thing>\g<number>-\g<number>|\g<number>){0}
^(\g<thing>,)*\g<thing>$
I tested it with Ruby 1.9.2. Your regex engine should support named groups to allow that kind of clarity.
irb(main):001:0> s1 = '1-5,5,15-29'
=> "1-5,5,15-29"
irb(main):002:0> s2 = '1,28,1-31,15'
=> "1,28,1-31,15"
irb(main):003:0> s3 = '15,25,3'
=> "15,25,3"
irb(main):004:0> s4 = '1-24,5-6,2-9'
=> "1-24,5-6,2-9"
irb(main):005:0> r = /(?<number>[1-9]|[12]\d|3[01]){0}(?<thing>\g<number>-\g<number>|\g<number>){0}^(\g<thing>,)*\g<thing>$/
=> /(?<number>[1-9]|[12]\d|3[01]){0}(?<thing>\g<number>-\g<number>|\g<number>){0}^(\g<thing>,)*\g<thing>$/
irb(main):006:0> s1.match(r)
=> #<MatchData "1-5,5,15-29" number:"29" thing:"15-29">
irb(main):007:0> s2.match(r)
=> #<MatchData "1,28,1-31,15" number:"15" thing:"15">
irb(main):008:0> s3.match(r)
=> #<MatchData "15,25,3" number:"3" thing:"3">
irb(main):009:0> s4.match(r)
=> #<MatchData "1-24,5-6,2-9" number:"9" thing:"2-9">
irb(main):010:0> '1-1-1-1'.match(r)
=> nil
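Python's built-in `re` module does not support `\g<name>` subroutine calls, so a comparable level of readability can be had by composing the pattern from named string fragments. This is only a sketch of that alternative, not a port of the Ruby feature; the names are chosen to mirror the regex above.

```python
import re

NUMBER = r"(?:[12][0-9]|3[01]|[1-9])"   # 1 through 31
THING = rf"{NUMBER}(?:-{NUMBER})?"      # a number or a range of two numbers
DAY_LIST = re.compile(rf"(?:{THING},)*{THING}")

for sample in ["1-5,5,15-29", "1,28,1-31,15", "15,25,3", "1-24,5-6,2-9", "1-1-1-1"]:
    # fullmatch anchors the pattern at both ends of the string.
    print(sample, DAY_LIST.fullmatch(sample) is not None)
```

The first four samples print True and the last prints False, matching the irb session above.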
Using the same logic as in my previous answer, but limiting the range to 1-31
A becomes [1-9]|[12]\d|3[01]
B becomes ([1-9]|[12]\d|3[01])|(([1-9]|[12]\d|3[01])-([1-9]|[12]\d|3[01]))
Overall expression
(([1-9]|[12]\d|3[01])|(([1-9]|[12]\d|3[01])-([1-9]|[12]\d|3[01])))(,(([1-9]|[12]\d|3[01])|(([1-9]|[12]\d|3[01])-([1-9]|[12]\d|3[01]))))*
An optimal Regex for this topic could be:
^(?'int'[12]\d|3[01]|[1-9])((,\g'int')|(-\g'int'(?=$|,)))*$
demo