Finding the intercept of an equation using regular expressions - regex

Lines=["1x+1y+0","1x-1y+0","-1ax+0y-3","0x+1y-0.5"]
I am trying to find the intercept, say for equation no. 3, i.e. "-1ax+0y-3":
re.findall(r'[+-][\w]*[^XxYy]', Lines[2])
but it gives me
['-1ax+', '-3']
I was expecting only -3.

[+-]?\w*?[^XxYy](?=\+|-|$) will give you the expected result.
[+-]? makes the sign optional, so a positive value at the start of the string can also match.
*? makes the quantifier non-greedy (lazy), so it matches as few characters as possible.
(?=\+|-|$) is a lookahead that checks for a +, a -, or the end of the string right after the value.
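If you only want to match numbers, restrict the lazy part to digits and dots: [+-]?[0-9.]*?[^XxYy](?=\+|-|$)
[0-9.] matches a digit or a decimal point (the dot needs no escaping inside a character class). The lazy *? lets a one-digit intercept such as -3 match as well.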
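As a quick check, both ideas can be run over all four equations. This is a minimal sketch with Python's standard re module; the names LAZY_ANY and LAZY_NUMERIC are made up for illustration, and the digit-only pattern is written with a lazy `*?` so that a one-character intercept such as -3 can still match.

```python
import re

lines = ["1x+1y+0", "1x-1y+0", "-1ax+0y-3", "0x+1y-0.5"]

# Optional sign, lazy run of word characters, one final character that is
# not x/y, then a lookahead for +, - or the end of the string.
LAZY_ANY = re.compile(r'[+-]?\w*?[^XxYy](?=\+|-|$)')

# Digit-only variant (illustrative): the lazy part may only contain digits
# or a dot, so a decimal intercept stays in one piece.
LAZY_NUMERIC = re.compile(r'[+-]?[0-9.]*?[^XxYy](?=\+|-|$)')

any_results = [LAZY_ANY.findall(s) for s in lines]
numeric_results = [LAZY_NUMERIC.findall(s) for s in lines]

for s, a, n in zip(lines, any_results, numeric_results):
    print(s, a, n)
```

For the last equation the \w version returns only '5', because the dot is not a word character; the digit-and-dot version returns '-0.5'.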

You could use
(?<=[xyXY])[+-]?\d+(?:\.\d+)?$
Explanation
(?<=[xyXY]) Positive lookbehind, assert x or y at the left
[+-]? Optionally match + or -
\d+(?:\.\d+)? Match digits with an optional decimal part
$ End of string
import re
Lines = ["1x+1y+0","1x-1y+0","-1ax+0y-3","0x+1y-0.5"]
print(re.findall(r'(?<=[xyXY])[+-]?\d+(?:\.\d+)?$', Lines[2]))
Output
['-3']

You can use
re.findall(r'-?\b\d+(?:\.\d+)?\b', Lines[2])
Details:
-? - an optional -
\b - a word boundary, no glued letters allowed
\d+ - one or more digits
(?:\.\d+)? - an optional fractional part
\b - a word boundary.
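To see what the word-boundary pattern returns for every equation, here is a small sketch (the variable names are illustrative only):

```python
import re

lines = ["1x+1y+0", "1x-1y+0", "-1ax+0y-3", "0x+1y-0.5"]

# Optional minus, then a number that is not glued to a letter on either side.
pattern = re.compile(r'-?\b\d+(?:\.\d+)?\b')

intercepts = [pattern.findall(s) for s in lines]
print(intercepts)
```

Coefficients such as 1x or 0y are skipped because the digit is directly followed by a letter, so there is no word boundary after it. Note that only a minus sign is captured; a leading + is dropped by this pattern.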


Validating User Input While Typing using RegEx

I am struggling to write the RegEx for the following criteria:
The number can be positive / negative
Optional - at the start
Between 1 and 5 numbers before the decimal point
2 decimal places only (optional)
Stop user from typing more than 1 . or -
This is the regex I have tried to implement which does not work for me.
^((-?[0-9]{1,5}(\.?){1,1}[0-9]{0,2})
It should allow the user to type out the following numbers.
-1.12
12345
1
123
12.12
Any help would be appreciated!
You may use
^-?\d{0,5}(?:(?<=\d)\.\d{0,2})?$
Details
^ - start of string
-? - an optional -
\d{0,5} - zero to five digits
(?:(?<=\d)\.\d{0,2})? - an optional sequence of
(?<=\d) - there must be a digit immediately to the left of the current location
\. - a dot
\d{0,2} - zero, one or two digits
$ - end of string.
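A short sketch of how this pattern behaves on intermediate input, assuming the check runs on every keystroke (the sample strings are made up for illustration):

```python
import re

# The dot is only accepted when a digit sits immediately to its left.
typing_pattern = re.compile(r'^-?\d{0,5}(?:(?<=\d)\.\d{0,2})?$')

accepted = ["", "-", "-1", "-1.", "-1.1", "-1.12", "12345"]
rejected = ["-.", ".", "1.123", "123456", "--1", "1..2"]

accepted_ok = all(typing_pattern.search(s) for s in accepted)
rejected_ok = not any(typing_pattern.search(s) for s in rejected)
print(accepted_ok, rejected_ok)
```

The empty string and a lone - are accepted on purpose, since the user has to pass through those states while typing.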
If you want to validate while typing, you could make use of optional groups to accept intermediate values and do a final check on the whole pattern when processing the value.
^-?(?:\d{1,5}(?:\.\d{0,2})?)?$
Explanation
^ Start of string
-? Optional hyphen
(?: Non capture group
\d{1,5} Match 1-5 digits
(?: Non capture group
\.\d{0,2} Match a dot and 0-2 digits
)? Close group and make it optional
)? Close group and make it optional
$ End of string
To validate the final pattern, you could match an optional -, 1-5 digits and an optional decimal part:
^-?\d{1,5}(?:\.\d{1,2})?$
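The two-stage idea (a lenient pattern while typing, a strict one on submit) could look roughly like this. It is only a sketch; PARTIAL, FINAL and check are illustrative names, not part of any library.

```python
import re

# Accepts every prefix the user may pass through while typing.
PARTIAL = re.compile(r'^-?(?:\d{1,5}(?:\.\d{0,2})?)?$')
# Accepts only a complete number.
FINAL = re.compile(r'^-?\d{1,5}(?:\.\d{1,2})?$')

def check(value):
    """Return (ok_while_typing, ok_as_final_value) for a string."""
    return bool(PARTIAL.search(value)), bool(FINAL.search(value))

# Simulate the user typing "-1.12" one character at a time.
typed = "-1.12"
states = [(typed[:i], *check(typed[:i])) for i in range(len(typed) + 1)]
for state in states:
    print(state)
```

Every prefix passes the typing check, but only "-1", "-1.1" and "-1.12" pass the final check.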
The regex ^(-?(\d{1,5}(\.\d{0,2})?)?)$ should work if you want to match strings that end in a dot, such as 123.
Otherwise, change the 0 to a 1 as follows: ^(-?(\d{1,5}(\.\d{1,2})?)?)$. Then it will only match strings that have a digit after the decimal point.
The regex that you posted allows strings with more than 2 digits after the decimal point because it stops matching after the 2 digits, even if the string continues. Adding a $ at the end of the regex stops it from matching strings that continue after the part we want.
This regex ^(-?\d{1,5}(\.\d{0,2})?)$ will validate the input once the user has finished typing, because I assume that you don't want a lone - to be valid at that point.
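The effect of the trailing $ can be shown with a small sketch. The pattern below is a simplified stand-in for the attempt in the question (an assumption, since the original has unbalanced parentheses and cannot be compiled verbatim).

```python
import re

# Simplified stand-in for the question's attempt, without and with the
# end-of-string anchor.
without_end = re.compile(r'^-?[0-9]{1,5}\.?[0-9]{0,2}')
with_end = re.compile(r'^-?[0-9]{1,5}\.?[0-9]{0,2}$')

text = "1.12345"
prefix = without_end.match(text).group()  # only the part that was matched
full = with_end.match(text)               # None: the string continues
print(prefix, full)
```

Without the anchor the pattern is satisfied by the prefix 1.12 and ignores the rest; with the anchor the whole string has to fit, so the match fails.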

RegEx for matching operation sequences

I have a numbers operation like this:
-2-28*95+874-1545*-5+36
I need to extract, with a regex, the operands that are not involved in a multiplication:
-2
+874
+36
I tried things like this without success:
[\+,-]\d+(?=\+|-|$)
This regex matches -5, too, and
(?(?=\d+)[\+,-]|^)\d+(?=\+|-|$)
matches nothing.
How do I solve this problem?
You may use
(?<!\*)[-+]\d*\.?\d+(?![*\d])
Details
(?<!\*) - a negative lookbehind making sure the current position is not immediately preceded by a * char
[-+] - - or +
\d* - 0 or more digits
\.? - an optional . char
\d+ - 1+ digits
(?![*\d]) - not immediately followed by a * or a digit char.
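Applied to the expression from the question, the lookaround pattern can be tried like this (a minimal sketch with Python's re module):

```python
import re

expression = "-2-28*95+874-1545*-5+36"

# A signed number that is neither preceded by * nor followed by * or a digit.
pattern = re.compile(r'(?<!\*)[-+]\d*\.?\d+(?![*\d])')
operands = pattern.findall(expression)
print(operands)
```

The digit in the lookahead matters: without it, the engine could backtrack from -28 to -2 and report a partial number.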
This regex captures the pattern you do not want (the multiplications) in one group, so that what is left over is your desired output:
(((-|\+|)\d+\*(-|\+|)\d+))
If your regex engine supports backtracking control verbs (PCRE, for example), you can also use (*SKIP)(*FAIL) or (*SKIP)(*F) to discard the multiplications and get the desired output:
((((-|\+|)\d+\*(-|\+|)\d+))(*SKIP)(*FAIL)|([\s\S]))
You can also simplify the expression and remove any groups that you do not need.
Another option could be to match what you don't want and capture in a group what you want to keep. Your values are then in the first capturing group:
[+-]?\d+(?:\*[+-]?\d+)+|([+-]?\d+)
Explanation
[+-]?\d+ Optional + or - followed by 1+ digits
(?:\*[+-]?\d+)+ Repeat the previous pattern 1+ times with an * prepended
| Or
([+-]?\d+) Capture in group 1 matching an optional + or - and 1+ digits
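The match-what-you-don't-want, capture-what-you-do trick works in engines without backtracking verbs, such as Python's built-in re module. A minimal sketch:

```python
import re

expression = "-2-28*95+874-1545*-5+36"

# Left alternative: consume whole multiplication chains (nothing captured).
# Right alternative: capture a standalone operand in group 1.
pattern = re.compile(r'[+-]?\d+(?:\*[+-]?\d+)+|([+-]?\d+)')

# With one capturing group, findall returns group 1 for every match;
# it is an empty string whenever the left alternative matched.
raw = pattern.findall(expression)
operands = [g for g in raw if g]
print(raw)
print(operands)
```

The empty strings in raw mark the positions where a multiplication was consumed, so they have to be filtered out.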

Regex pattern : Validating a single occurrence

I have implemented the following Regex pattern
^[\d,|+\d,]+$
It validates the following pattern
14,+96,4,++67
I need to invalidate ++67 from my pattern and I need to keep values with only a single leading + sign.
How should I change my Regex pattern?
You may use
^\+?\d+(?:,\+?\d+)*$
Details
^ - start of string
\+? - an optional + char
\d+ - 1+ digits
(?:,\+?\d+)* - zero or more repetitions of a sequence of patterns:
, - a comma
\+? - an optional plus
\d+ - 1+ digits
$ - end of string
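A quick check of this pattern against the sample from the question (a sketch; the corrected input with a single + is made up for comparison):

```python
import re

# Comma-separated integers, each with at most one leading +.
pattern = re.compile(r'^\+?\d+(?:,\+?\d+)*$')

valid = bool(pattern.search("14,+96,4,+67"))
invalid = bool(pattern.search("14,+96,4,++67"))
trailing_comma = bool(pattern.search("14,"))
print(valid, invalid, trailing_comma)
```

Because every comma must be followed by another number, a trailing comma is rejected as well.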
Perhaps you meant to do this?
^(\d,|\+\d,)+$
Square brackets define a character class, which matches any single one of the characters or classes listed inside it; that does not appear to be what you really want. For alternation between sequences you need round brackets.
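The difference between the character class and the group can be sketched as follows (the test strings other than the question's sample are made up for illustration):

```python
import re

char_class = re.compile(r'^[\d,|+\d,]+$')  # the question's pattern
grouped = re.compile(r'^(\d,|\+\d,)+$')    # alternation inside a group

# The class accepts any mix of digits, commas, | and +, in any order.
class_matches = {s: bool(char_class.search(s)) for s in ["14,+96,4,++67", "+|+,,"]}

# The group accepts only "digit comma" or "plus digit comma" units.
group_matches = {s: bool(grouped.search(s)) for s in ["1,+4,", "1,++4,", "14,+96"]}

print(class_matches)
print(group_matches)
```

Note that the grouped version, as written, expects single digits and a comma after every item, so it rejects 14,+96 too.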
You can try this one
^(\d+\,?|\+\d+,?)+$

Regex match depending on lookbehind match

I need to match these values:
(First approach to a regex that roughly does what I want)
\d+([.,]\d{3})*[.,]\d{2}
like
24,56
24.56
1.234,56
1,234.56
1234,56
1234.56
but I need to not match
1.234.56
1,234,56
So somehow I need to check the last occurrence of "." or "," to not be the same as the previous "." or ",".
Background: amounts shall be matched in English and German format with (optional) thousands separators.
But even with help of regex101 I completely fail at coming up with a correctly working look-behind. Any suggestions are highly appreciated.
UPDATE
Based on the answers I got so far, I came up with this (demo):
\d{1,3}(?:([\.,'])?\d{3})*(?!\1)[\.,\s]\d{2}
But it matches for example 1234.567,23 which is not desirable.
You may capture the digit grouping symbol and use a negative lookahead with a backreference to restrict the decimal separator:
^(?:\d+|\d{1,3}(?:([.,])\d{3})*)(?!\1)[.,]\d{2}$
(The parts to note are the capturing group ([.,]) and the negative lookahead (?!\1).)
Group 1 will contain the last value of the digit grouping symbol and (?!\1)[.,] will match the other symbol.
Details:
^ - start of string
(?:\d+|\d{1,3}(?:([.,])\d{3})*) - either of the two alternatives:
\d+ - 1+ digits
| - or
\d{1,3} - 1 to 3 digits,
(?:([.,])\d{3})* - zero or more sequences of:
([.,]) - Group 1 capturing . or ,
\d{3} - 3 digits
(?!\1)[.,] - a . or , but not equal to what was last captured with ([.,]) pattern above
\d{2} - 2 digits
$ - end of string.
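The pattern can be checked against every sample from the question with a short sketch in Python. One assumption worth stating: in Python's re module a backreference to a group that did not take part in the match simply fails, so (?!\1) succeeds for amounts without a thousands separator; other engines may treat that case differently.

```python
import re

amount = re.compile(r'^(?:\d+|\d{1,3}(?:([.,])\d{3})*)(?!\1)[.,]\d{2}$')

should_match = ["24,56", "24.56", "1.234,56", "1,234.56", "1234,56", "1234.56"]
should_fail = ["1.234.56", "1,234,56", "1234.567,23"]

# Keep only the strings the pattern accepts.
matched = [s for s in should_match + should_fail if amount.search(s)]
print(matched)
```

The last rejected sample, 1234.567,23, is the one mentioned in the update to the question: limiting the first block to 1 to 3 digits whenever separators are present is what rules it out.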
You can use
^\d+(([.,])\d{3})*(?!\2)[.,]\d{2}$

regular expression in R, match substring only if things after

my_string = "2011, this year I made 750,000 dollars"
Is there an elegant way to match "2011" and "750,000" in the string above? The idea is to extract numeric values when they look like numbers, i.e. \d+ or \d+[\.,]?\d*, depending on whether digits follow the separator.
I tried this, but it does not match exactly what I wanted: I got "2011," which is no good.
library(stringr)
str_match_all(my_string, "(\\d+[\\.,]?\\d*)")
Here is my expected result:
"2011" "750,000"
You can do:
[0-9]+(?:[,.][0-9]+)*
It's very elegant, I tried it in front of a mirror.
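The pattern itself does not depend on R. As a quick sanity check, here is the same regex in Python (used only to keep the examples in one language; the variable names are illustrative):

```python
import re

my_string = "2011, this year I made 750,000 dollars"

# Digits, optionally followed by groups of "separator + digits".
numbers = re.findall(r'[0-9]+(?:[,.][0-9]+)*', my_string)
print(numbers)
```

The comma after 2011 is left out because a separator is only consumed when at least one digit follows it.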
Here is a single-regex, pure base R approach to extract integer or float values that are not part of a string of digits separated with a hyphen:
> str <- "2011, this year I made 750,000 dollars and 750,000-589 here"
> regmatches(str, gregexpr('(?<!\\d-)\\b\\d+(?:[,.]\\d+)?+(?!-)', str, perl=T))[[1]]
[1] "2011" "750,000"
Since the regex contains lookarounds, you need to specify the perl=TRUE argument.
Pattern explanation:
(?<!\d-) - a negative lookbehind that fails the match when a digit followed by a hyphen precedes the current location
\b\d+ - a word boundary (there cannot be a word char, i.e. a letter, digit or _, before the next digit) followed by 1 or more digits
(?:[,.]\d+)?+ - a non-capturing group ((?:...)) matching 1 or 0 sequences of a comma or dot ([,.]) followed by 1 or more digits; the sequence is matched possessively (see ?+), so the regex engine cannot give the group back and test for a hyphen right after \b\d+
(?!-) - a negative lookahead that fails the match if there is a hyphen after the digits detected.