Moving characters by using regex

I'm trying to move the matched characters to the end of the sentence.
from
300p apple in house
orange 200p in school
to
apple in house 300p
orange in school 200p
So I matched (.+)([\d]+p)(.+)$ and substituted with \1 \3 \2.
But the result is like
30 apple in house 0p
orange 20 in school 0p
I also read about greedy matching, but I don't know what the problem is. How can I fix this?

You can use
^(.*?)(\d+p) *(.+)
Replace with \1\3 \2.
See the regex demo. Details:
^ - start of string (or line if you use a multiline mode)
(.*?) - Group 1: any zero or more chars other than line break chars as few as possible
(\d+p) - Group 2: one or more digits, and then a p char
' *' (a space followed by *) - zero or more spaces
(.+) - Group 3: any one or more chars other than line break chars as many as possible (since it is a greedy subpattern, no $ anchor is required, the match will go up to the end of string (or line if you use a multiline mode)).
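As a minimal sketch in JavaScript (where the replacement is written with $1 instead of \1, and the variable names are only illustrative), this shows why the original greedy pattern splits the number and how the lazy first group avoids that:

```javascript
const samples = ["300p apple in house", "orange 200p in school"];

// Original attempt: the greedy (.+) first grabs the whole line, then backtracks
// only as far as needed, so ([\d]+p) is left with just the last digit and the p.
const greedyGroups = samples[0].match(/(.+)([\d]+p)(.+)$/).slice(1);
console.log(greedyGroups); // [ '30', '0p', ' apple in house' ]

// Suggested fix: a lazy first group lets (\d+p) take all of the digits.
const lazyPattern = /^(.*?)(\d+p) *(.+)/;
const moved = samples.map((s) => s.replace(lazyPattern, "$1$3 $2"));
console.log(moved); // [ 'apple in house 300p', 'orange in school 200p' ]
```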

With your shown samples, please try the following regex.
^(\D+)?(\d+p)\s*(.+)$
Online demo for above regex
Explanation:
^(\D+)? ## From the start of the string, an optional 1st capturing group holding one or more non-digits.
(\d+p) ## 2nd capturing group: 1 or more digits followed by p.
\s* ## 0 or more whitespace characters.
(.+)$ ## 3rd capturing group: everything else up to the end of the string.
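This answer does not state a replacement string. Assuming the same group order as the question's goal (group 1, group 3, a space, then group 2, written $1$3 $2 in JavaScript), a quick check might look like this:

```javascript
const optionalPrefix = /^(\D+)?(\d+p)\s*(.+)$/;

// When the optional group 1 does not take part in the match,
// JavaScript substitutes an empty string for $1.
const reordered = ["300p apple in house", "orange 200p in school"].map((s) =>
  s.replace(optionalPrefix, "$1$3 $2")
);
console.log(reordered); // [ 'apple in house 300p', 'orange in school 200p' ]
```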

Related

Regex - add a zero after second period

I have the following example of numbers, and I need to add a zero after the second period (.).
1.01.1
1.01.2
1.01.3
1.02.1
I would like them to be:
1.01.01
1.01.02
1.01.03
1.02.01
I have the following so far:
Search:
^([^.])(?:[^.]*\.){2}([^.].*)
Substitution:
0\1
but this returns:
01 only.
I need the 1.01. to be captured in a group as well, but now I'm getting confuddled.
Does anyone know what I am missing?
Thanks!!
You may try this regex replacement with 2 capture groups:
Search:
^(\d+\.\d+)\.([1-9])
Replacement:
\1.0\2
RegEx Demo
RegEx Details:
^: Start
(\d+\.\d+): Match 1+ digits + dot followed by 1+ digits in capture group #1
\.: Match a dot
([1-9]): Match a single digit 1-9 in capture group #2 (this avoids putting a 0 before an already existing 0)
Replacement: \1.0\2 inserts 0 just before capture group #2
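A small JavaScript sketch of this replacement on the question's samples (JavaScript writes the backreferences as $1 and $2; the variable names are illustrative):

```javascript
const versions = ["1.01.1", "1.01.2", "1.01.3", "1.02.1"];
const padThird = /^(\d+\.\d+)\.([1-9])/;

// "$1.0$2" is group 1, a literal ".0", then group 2.
const padded = versions.map((v) => v.replace(padThird, "$1.0$2"));
console.log(padded); // [ '1.01.01', '1.01.02', '1.01.03', '1.02.01' ]
```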
You could try:
^([^.]*\.){2}\K
Replace with 0. See an online demo
^ - Start line anchor.
([^.]*\.){2} - Any character other than a dot, 0+ times (greedy), followed by a literal dot, matched twice.
\K - Reset starting point of reported match.
EDIT:
Or, if the \K meta escape isn't supported, see if the following works:
^((?:[^.]*\.){2})
Replace with ${1}0. See the online demo
^ - Start line anchor.
( - Open 1st capture group;
(?: - Open non-capture group;
[^.]*\. - Any character other than a dot, 0+ times (greedy), followed by a literal dot.
){2} - Close non-capture group and match twice.
) - Close capture group.
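JavaScript is one engine without \K. As a sketch under that assumption, the capture-group variant works there directly, and a lookbehind (a different technique, not \K itself, and only available in engines that allow variable-length lookbehind) gives the same zero-width insertion point:

```javascript
const input = "1.01.1";

// Capture-group variant: keep group 1 and append the 0 through a replacer function,
// which sidesteps any ambiguity of writing "$1" directly followed by a digit.
const viaGroup = input.replace(/^((?:[^.]*\.){2})/, (_, head) => head + "0");

// Lookbehind variant: a zero-width match right after the second dot.
const viaLookbehind = input.replace(/(?<=^(?:[^.]*\.){2})/, "0");

console.log(viaGroup, viaLookbehind); // 1.01.01 1.01.01
```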
Using your pattern, you can use 2 capture groups and put a zero between the first and the second group in the replacement, for example \g<1>0\g<2> or ${1}0${2} or $10$2 depending on the language.
^((?:[^.]*\.){2})([^.])
^ Start of string
((?:[^.]*\.){2}) Capture group 1, match 2 times: 0+ chars other than a dot, followed by a dot
([^.]) Capture group 2, match any char except a dot
Regex demo
A more specific pattern could be matching the digits
^(\d+\.\d+\.)(\d)
^ Start of string
(\d+\.\d+\.) Capture group 1, match 2 times 1+ digits and a dot
(\d) Capture group 2, match a digit
Regex demo
For example in JavaScript
const regex = /^(\d+\.\d+\.)(\d)/;
[
"1.01.1",
"1.01.2",
"1.01.3",
"1.02.1",
].forEach(s => console.log(s.replace(regex, "$10$2")));
Obviously, there will be tons of solutions for this, but if this pattern holds (i.e. the trailing group is always a single digit), \.(\d)$ => .0\1 would suffice. To merely insert a 0, you don't need to match the whole thing, only enough context to uniquely identify the targeted places. In this case, finding all lines ending in a . followed by a single digit is enough.
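A minimal JavaScript sketch of this shorter approach (variable names are illustrative; the replacement uses a plain dot and $1 for the captured digit):

```javascript
const padLast = /\.(\d)$/;

// A single trailing digit gets a 0 inserted before it.
const once = "1.01.1".replace(padLast, ".0$1");
// Two trailing digits do not match, so the string is left alone.
const already = "1.01.01".replace(padLast, ".0$1");

console.log(once, already); // 1.01.01 1.01.01
```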

Regex to validate subtract equations like "abc-b=ac"

I've stumbled upon a regex question.
How to validate a subtract equation like this?
A string minus another string equals whatever remains (all the terms are just plain strings, not sets, so ab and ba are different strings).
Pass
abc-b=ac
abcde-cd=abe
ab-a=b
abcde-a=bcde
abcde-cde=ab
Fail
abc-a=c
abcde-bd=ace
abc-cd=ab
abcde-a=cde
abc-abc=
abc-=abc
Here's what I tried and you may play around with it
https://regex101.com/r/lTWUCY/1/
Disclaimer: I see that some of the comments were deleted. So let me start by saying that, though short (in terms of code-golf), the following answer is not the most efficient in terms of steps involved. Though, looking at the nature of the question and its "puzzle" aspect, it will probably do fine. For a more efficient answer, I'd like to redirect you to this answer.
Here is my attempt:
^(.*)(.+)(.*)-\2=(?=.)\1\3$
See the online demo
^ - Start line anchor.
(.*) - A 1st capture group with 0+ non-newline characters right up to;
(.+) - A 2nd capture group with 1+ non-newline characters right up to;
(.*) - A 3rd capture group with 0+ non-newline characters right up to;
-\2= - A hyphen followed by a backreference to our 2nd capture group and a literal "=".
(?=.) - A positive lookahead to assert position is followed by at least a single character other than newline.
\1\3 - A backreference to what was captured in both the 1st and 3rd capture group.
$ - End line anchor.
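To check the pattern against the question's samples, a short JavaScript sketch (the array names are just for illustration) could be:

```javascript
const subtraction = /^(.*)(.+)(.*)-\2=(?=.)\1\3$/;

const shouldPass = ["abc-b=ac", "abcde-cd=abe", "ab-a=b", "abcde-a=bcde", "abcde-cde=ab"];
const shouldFail = ["abc-a=c", "abcde-bd=ace", "abc-cd=ab", "abcde-a=cde", "abc-abc=", "abc-=abc"];

const allPass = shouldPass.every((s) => subtraction.test(s));
const anyFalsePositive = shouldFail.some((s) => subtraction.test(s));
console.log(allPass, anyFalsePositive); // true false
```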
EDIT:
I guess a bit more restrictive could be:
^([a-z]*)([a-z]+)((?1))-\2=(?=.)\1\3$
You may use this more efficient regex with a lookahead at the start with a capture group that matches text on the right hand side of - i.e. substring between - and = and captures it in group #1. Then in the main body of regex we just check presence of capture group #1 and capture text before and after \1 in 2 separate groups.
^(?=[^-]+-([^=]+)=.)([^-]*?)\1([^-]*)-[^=]+=\2\3$
RegEx Demo
RegEx Details:
^: Start
(?=[^-]+-([^=]+)=.): Lookahead to make sure we have expression structure of pqr-pq=r and also more importantly capture substring between - and = in capture group #1. . after = is there for a reason to disallow any empty string after =.
([^-]*?): Match 0 or more non-- characters in capture group #2
\1: Back-reference to group #1 to make sure we match same value as in capture group #1
([^-]*): Match 0 or more non-- characters in capture group #3
-: Match a -
[^=]+: Match 1 or more non-= characters
=: Match a =
\2\3: Back-reference to groups #2 and #3, which together are the result of the subtraction
$: End
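As a small JavaScript illustration of how the three groups are filled (JavaScript keeps captures made inside a positive lookahead; the names are illustrative):

```javascript
const viaLookahead = /^(?=[^-]+-([^=]+)=.)([^-]*?)\1([^-]*)-[^=]+=\2\3$/;

// Group 1 is the subtrahend, groups 2 and 3 are the text before and after it.
const groups = "abcde-cd=abe".match(viaLookahead).slice(1);
console.log(groups); // [ 'cd', 'ab', 'e' ]

// "bd" is not a contiguous substring of "abcde", so this one is rejected.
const rejected = viaLookahead.test("abcde-bd=ace");
console.log(rejected); // false
```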

Regex with exception inside exception

I made the following regex :
(\w{2,3})(,\s*\w{2,3})*
It means the input should start with 2 or 3 letters, followed by any number of comma-separated groups of 2 or 3 letters.
Now I also need to allow the words blue and yellow.
(\w{2,3}|blue|yellow)(,\s*\w{2,3})*
It works only if blue and yellow are at the beginning.
Is there a way to allow the exception words after a comma without repeating the words in the code?
I'd say go with the answer given by @Toto, but if your language doesn't support recursive patterns, you could try:
^(?![, ])(?:,?\s*\b(?:\w{2,3}|blue|yellow))+$
See the online demo
^ - Start string anchor.
(?![, ]) - Negative lookahead to prevent starting with a comma or space.
(?: - Open 1st non-capture group.
,?\s*\b - Match an optional comma, zero or more whitespace characters and a word-boundary.
(?: - A nested 2nd non-capture group.
\w{2,3}|blue|yellow - Lay out your options just once.
) - Close 2nd non-capture group.
)+ - Close 1st non capture group and match at least once.
$ - End string anchor.
Just be aware that \w{2,3} allows for things like __ and _1_ to be valid inputs.
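A quick JavaScript check of this pattern, including the underscore caveat (the test strings are made up for illustration):

```javascript
const wordList = /^(?![, ])(?:,?\s*\b(?:\w{2,3}|blue|yellow))+$/;

const inputs = ["ab, cd, blue", "yellow,abc", "abcd", ",ab", "__"];
const results = inputs.map((s) => wordList.test(s));
console.log(results); // [ true, true, false, false, true ]
```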
If the language you are using supports recursive patterns, you can use:
^(blue|yellow|\w{2,3})(?:,\s*(?1))*$
Demo & explanation
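JavaScript is an example of an engine without (?1) recursion. One workaround there (a different technique from recursion, sketched with hypothetical names) is to write the alternatives once as a string fragment and build the pattern from it:

```javascript
// The alternatives are written exactly once and reused when building the pattern.
const item = String.raw`(?:blue|yellow|\w{2,3})`;
const listPattern = new RegExp(String.raw`^${item}(?:,\s*${item})*$`);

const accepted = listPattern.test("blue, ab, yellow");
const rejectedWord = listPattern.test("ab,green");
console.log(accepted, rejectedWord); // true false
```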
If either blue or yellow can occur only once:
^(?:\w{2,3}\s*,\s*)*(?:blue|yellow)(?:\s*,\s*\w{2,3})*$
The pattern matches
^ Start of string
(?:\w{2,3}\s*,\s*)* Optionally repeat 2-3 word chars followed by a comma
(?:blue|yellow) Match either blue or yellow
(?:\s*,\s*\w{2,3})* Optionally match a comma and 2-3 word chars
$ End of string
Regex demo
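Note that, as written, this pattern requires blue or yellow to appear exactly once. A small JavaScript sketch with made-up inputs:

```javascript
const exactlyOneColor = /^(?:\w{2,3}\s*,\s*)*(?:blue|yellow)(?:\s*,\s*\w{2,3})*$/;

const candidates = ["ab, blue, cd", "abc, yellow", "ab, cd", "blue, yellow"];
const verdicts = candidates.map((s) => exactlyOneColor.test(s));
console.log(verdicts); // [ true, true, false, false ]
```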

How to capture everything until another capture group

I have the following template :
1251 Left Random Text I want to fill
It can go through multiple lines
As you can see
9841 Right Again we see a lot of random text with 3115 numbers
And this also goes
To multiple lines
0121 Right
5151 Right This one is just one line
I was wrong
9731 Left This one is just a line
5123 NA Instruction 5151 was wrong
4113 Right Instr 9841 was correct
We checked
I want to have 3 groups:
1251
Left
Random Text I want to fill
It can go through multiple lines
As you can see
I'm using
(\d+)\s(\w+)\s(.*)
but it stops at the current line only (so I get only Random Text I want to fill in group 3, although I want including As you can see)
If I use the single-line flag, I get only 1 match, with group 3 capturing almost everything.
Here is live : https://regex101.com/r/W3x0mH/4
You could use a repeating group that matches all the following lines while asserting that each next line does not start with a digit:
(\d+)\s(\w+)\s(.*(?:\r?\n(?!\d).*)*)
Explanation
(\d+)\s(\w+)\s Match the first 2 groups
( Third capturing group
.* Match 0+ times any char except a newline
(?: Non capturing group
\r?\n(?!\d).* Match newline, assert what is on the right is not a digit
)* Close non capturing group and repeat 0+ times
) Close capturing group
Regex demo
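A minimal JavaScript sketch of this pattern on the first two records of the sample (the g flag and the object field names are additions for illustration):

```javascript
const text = [
  "1251 Left Random Text I want to fill",
  "It can go through multiple lines",
  "As you can see",
  "9841 Right Again we see a lot of random text with 3115 numbers",
  "And this also goes",
  "To multiple lines",
].join("\n");

// Group 3 keeps absorbing following lines as long as they do not start with a digit.
const record = /(\d+)\s(\w+)\s(.*(?:\r?\n(?!\d).*)*)/g;
const records = [...text.matchAll(record)].map(([, id, side, body]) => ({ id, side, body }));

console.log(records.length); // 2
console.log(records[0].body.split("\n").length); // 3
```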
You may use this regex with a lookahead:
^(\d+)\s(\w+)\s(.*?)(?=\n\d|\z)
with DOTALL and MULTILINE modifiers.
Updated Regex Demo
RegEx Details:
^: Line start
(\d+): Match and capture 1+ digits in group #1
\s: match a whitespace
(\w+): Match and capture 1+ word characters in group #2
\s: match a whitespace
(.*?): Match 0 or more of any character (non-greedy) until the next lookahead assertion is satisfied
(?=\n\d|\z): Lookahead assertion to assert that we have a newline followed by a digit or there is end of input
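JavaScript has no \z anchor, so the sketch below is an adaptation rather than the pattern above verbatim: it assumes (?![\s\S]) ("no character follows") as a stand-in for end of input, with the s (DOTALL) and m (MULTILINE) flags. The sample lines are taken from the question.

```javascript
const text = [
  "1251 Left Random Text I want to fill",
  "It can go through multiple lines",
  "9731 Left This one is just a line",
  "4113 Right Instr 9841 was correct",
  "We checked",
].join("\n");

// (?![\s\S]) replaces \z here; $ cannot be used because the m flag makes it match at every line end.
const record = /^(\d+)\s(\w+)\s(.*?)(?=\n\d|(?![\s\S]))/gms;
const bodies = [...text.matchAll(record)].map((m) => m[3]);

console.log(bodies.length); // 3
```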
Faster Regex:
If you are using this regex on a long string, keep overall performance in mind, as a regex with the DOTALL modifier tends to get slow on large texts. For that case I suggest this regex, which doesn't need the DOTALL modifier:
^(\d+)\s(\w+)\s(.*(?:\n.*)*?)(?=\n\d|\z)
RegEx Demo 2
On the regex101 demo, this regex takes just 181 steps, compared to 1300 steps for the first one.
For the third group, repeat any character while using a negative lookahead for ^\d (in multiline mode), which would indicate the start of a new match:
(\d+)\s(\w+)\s((?:(?!^\d)[\s\S])*)
https://regex101.com/r/W3x0mH/5
You may try with this regex:
^(\d+)\s+(\w+)\s+(.*?)(?=^\d|\z)
^(\d+)\s+ - the line begins with one or more digits (\d+) followed by one or more whitespace characters (\s+)
(\w+)\s+ - one or more word characters (Left, Right, NA or something else) followed by one or more whitespace characters (\s+)
(.*?) - matches everything until it finds a line beginning with a digit, or \z (end of string). This needs the DOTALL and MULTILINE modifiers.
I think it fits your requirement....
Regex101

Separate string by recognizing first digit with regex

I'm using ([^\d]+)\s?(.+) for dividing a string by taking the first digit that appears inside the string.
Exp.: Test123 --> Group1: Test, Group2: 123 # that works
but
Exp.: Test --> Group1: Tes, Group2: t # I expect: Group1: Test, Group 2: [empty]
How can I edit the regex so that it fits my expectation?
If you need to match up to the first digit if there is one, you may use
^(.*?)\s*(\d.*)?$
See the regex demo
^ - start of string
(.*?) - Group 1: any 0+ chars other than line break chars, as few as possible (since *? is a lazy quantifier)
\s* - 0+ whitespaces
(\d.*)? - Group 2: an optional capturing group matching 1 or 0 occurrences of a digit and then any 0+ chars other than line break chars, as many as possible (* is a greedy quantifier)
$ - end of string.
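A short JavaScript sketch of this pattern on the question's two inputs. Note that in JavaScript an optional group that does not take part in the match is reported as undefined rather than as an empty string:

```javascript
const splitAtDigit = /^(.*?)\s*(\d.*)?$/;

const [, head1, tail1] = "Test123".match(splitAtDigit);
const [, head2, tail2] = "Test".match(splitAtDigit);

console.log(head1, tail1); // Test 123
console.log(head2, tail2); // Test undefined
```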
Your regex almost works.
Problem: The problem lies in your second capturing group, (.+), which means one or more of any character. It grabs the 't' at the end of Test in order to make a match, since it must contain at least one character.
Solution: Replace your second capturing group with (.*), which means zero or more of any character. It does not need any characters to make a match, and it will still grab any number of characters after 'Test'.
Here is your new working regex:
([^\d]+)\s?(.*)
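The difference between the two quantifiers can be seen in a small JavaScript sketch (variable names are illustrative):

```javascript
// With (.+), group 1 has to give back its last character so group 2 is not empty.
const withPlus = "Test".match(/([^\d]+)\s?(.+)/).slice(1);
// With (.*), group 2 may be empty, so group 1 keeps the whole word.
const withStar = "Test".match(/([^\d]+)\s?(.*)/).slice(1);

console.log(withPlus); // [ 'Tes', 't' ]
console.log(withStar); // [ 'Test', '' ]
```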