Matching Word Regex


Hello, I want to use a regex to match this phrase
(Parc Installé)
in this text:
31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100
I wrote this regex: ',[A-Za-zA-zÀ-ú+ \/\w+0-9._%+-]+,'
But the result is: 4245986 and Parc Installé.
How can I match only Parc Installé?

You may try a regex based on a lookahead that requires a comma followed by digits/dots up to the end of the string:
[^,]+(?=\s*,[\d.]+$)
Details:
[^,]+ - 1 or more chars other than ,
(?=\s*,[\d.]+$) - a lookahead requiring
\s* - zero or more whitespaces
, - a comma
[\d.]+ - 1+ digits or dots up to...
$ - ... the end of string
To make it a bit more restrictive, you may replace the lookahead with (?=\s*,\d+(?:\.\d+){3}$) to require four dot-separated runs of 1+ digits.
If a lookahead is not supported (as is the case with the RE2 engine), you might want to use a solution based on a capturing group:
([^,]+)\s*,[\d.]+$
Here, the part within (...) will be captured into Group 1 and will be accessible via a backreference or a function like =REGEXEXTRACT in Google Spreadsheets, which returns only the contents of a capturing group if one is present in the pattern.
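As a rough illustration, here is how both variants behave in JavaScript on the line from the question. The helper names extractWithLookahead and extractWithGroup are made up for this sketch and are not part of any library.

```javascript
// Sample line from the question.
const line = "31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100";

// Hypothetical helper: lookahead variant, the whole match is the wanted field.
function extractWithLookahead(text) {
  const m = text.match(/[^,]+(?=\s*,[\d.]+$)/);
  return m ? m[0] : null;
}

// Hypothetical helper: capturing-group variant, the wanted field is in Group 1.
function extractWithGroup(text) {
  const m = text.match(/([^,]+)\s*,[\d.]+$/);
  return m ? m[1] : null;
}

console.log(extractWithLookahead(line)); // Parc Installé
console.log(extractWithGroup(line));     // Parc Installé
```

With the capturing-group variant the overall match also contains the trailing comma and IP-like part, so only Group 1 should be read.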

Related

Regex - add a zero after second period

I have the following example of numbers, and I need to add a zero after the second period (.).
1.01.1
1.01.2
1.01.3
1.02.1
I would like them to be:
1.01.01
1.01.02
1.01.03
1.02.01
I have the following so far:
Search:
^([^.])(?:[^.]*\.){2}([^.].*)
Substitution:
0\1
but this returns:
01 only.
I need the 1.01. to be captured in a group as well, but now I'm getting confuddled.
Does anyone know what I am missing?
Thanks!!

You may try this regex replacement with 2 capture groups:
Search:
^(\d+\.\d+)\.([1-9])
Replacement:
\1.0\2
RegEx Details:
^: Start
(\d+\.\d+): Match 1+ digits + dot followed by 1+ digits in capture group #1
\.: Match a dot
([1-9]): Match digits 1-9 in capture group #2 (this is to avoid putting 0 before already existing 0)
Replacement: \1.0\2 inserts 0 just before capture group #2
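A minimal sketch of this replacement in JavaScript, assuming the values arrive one per string. JavaScript writes the replacement as $1.0$2 rather than \1.0\2, and the helper name padThird is hypothetical.

```javascript
// Hypothetical helper: insert a 0 after the second dot
// when the third part starts with a digit 1-9.
function padThird(s) {
  // In a JavaScript replacement string, groups are referenced as $1, $2.
  return s.replace(/^(\d+\.\d+)\.([1-9])/, "$1.0$2");
}

["1.01.1", "1.01.2", "1.01.3", "1.02.1"].forEach(s => console.log(padThird(s)));

// A value whose third part already starts with 0 is left unchanged.
console.log(padThird("1.01.01"));
```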

You could try:
^([^.]*\.){2}\K
Replace with 0.
^ - Start line anchor.
([^.]*\.){2} - Any character other than a dot, 0+ times (greedy), followed by a literal dot; the group is matched twice.
\K - Reset starting point of reported match.
EDIT:
Or, if the \K meta escape isn't supported, see whether the following works:
^((?:[^.]*\.){2})
Replace with ${1}0.
^ - Start line anchor.
( - Open 1st capture group;
(?: - Open non-capture group;
[^.]*\. - Any character other than a dot, 0+ times (greedy), followed by a literal dot.
){2} - Close non-capture group and match twice.
) - Close capture group.

Using your pattern, you can use 2 capture groups and put a zero before the second group in the replacement, for example \g<1>0\g<2>, ${1}0${2} or $10$2, depending on the language.
^((?:[^.]*\.){2})([^.])
^ Start of string
((?:[^.]*\.){2}) Capture group 1, match 2 times: 0+ chars other than a dot, then a dot
([^.]) Capture group 2, match a single char other than a dot
A more specific pattern could be matching the digits
^(\d+\.\d+\.)(\d)
^ Start of string
(\d+\.\d+\.) Capture group 1, match 2 times 1+ digits and a dot
(\d) Capture group 2, match a digit
For example, in JavaScript:
// There is no group 10 here, so "$10$2" is read as group 1, a literal 0, then group 2.
const regex = /^(\d+\.\d+\.)(\d)/;
[
  "1.01.1",
  "1.01.2",
  "1.01.3",
  "1.02.1",
].forEach(s => console.log(s.replace(regex, "$10$2")));

Obviously, there will be tons of solutions for this, but if this pattern holds (i.e. the trailing group is always a single digit), \.(\d)$ => .0\1 would suffice. To merely insert a 0, you don't need to match the whole thing, only enough context to uniquely identify the targeted places. In this case, finding all lines that end in a . followed by a single digit is enough.
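The minimal-context idea could look like this in JavaScript. This is a sketch under the stated assumption that the last part is a single digit; the helper name padTrailingDigit is made up.

```javascript
// Hypothetical helper: pad a trailing single-digit part with a leading 0.
function padTrailingDigit(s) {
  // \.(\d)$ only matches when exactly one digit follows the last dot.
  return s.replace(/\.(\d)$/, ".0$1");
}

console.log(padTrailingDigit("1.01.1"));  // 1.01.01
console.log(padTrailingDigit("1.01.12")); // 1.01.12 (two trailing digits, unchanged)
```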

Validate string # followed by digits but # increases after every occurrence

I have a string that looks like this:
#123##1234###2356####69
It starts with # followed by one or more digits. Each time # appears again, the number of # characters increases by one: 1 the first time, 2 the second time, etc.
It's similar to this regex, but since I don't know how long the pattern goes on, a fixed regex like this isn't very useful:
^#\d+##\d+###\d+$
I'm using the PCRE regex engine, which allows recursion (?R), conditions (?(1)...), etc.
Is there a regex to validate this pattern?
Valid
#123
#12##235
#1234##12###368
#1234##12###368####22235#####723356
Invalid
##123
#123###456
#123##456##789
I tried ^(?(1)(?|(#\1)|(#))\d+)+$ but it doesn't seem to work at all.

You can do this using PCRE conditional sub-pattern matching:
^(?:((?(1)\1)#)\d+)++$
RegEx Details:
^: Start
(?:: Start non-capture group
(: Start capture group #1
(?(1)\1): Conditional (if/then/else) that matches back-reference \1 only if capture group #1 has already been set; otherwise it matches the empty string
#: Match an additional #
): End capture group #1
\d+: Match 1+ digits
)++: End non-capture group. Match 1+ of this non-capture group, possessively.
$: End

One option could be to optionally match a backreference to group 1 inside group 1 itself, using a possessive quantifier as in \1?+#, which adds one # on every iteration.
^(?:(\1?+#)\d+)++$
^ Start of string
(?: Non capture group
(\1?+#)\d+ Capture group 1, match an optional possessive backreference to what is already captured in group 1 and add matching a # followed by 1+ digits
)++ Close the non capture group and repeat 1+ times possessively
$ End of string

I think you can use forward-referencing here:
^(?:((?:\1(?!^)|^)#)\d+)+$
Details:
^ - start of string
(?:((?:\1(?!^)|^)#)\d+)+ - one or more occurrences of
((?:\1(?!^)|^)#) - Group 1 (the \1 value): start of string or an occurrence of the Group 1 value if it is not at the string start position
\d+ - one or more digits
$ - end of string.
NOTE: This technique does not work in regex flavors that do not support forward referencing, like ECMAScript based flavors (e.g. JavaScript, VBA, C++ std::regex)
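In an ECMAScript-based flavor, one possible fallback is to check the rule in plain code instead of with a single regex. The sketch below is not one of the regex answers above; the function name isIncreasingHashes is made up, and it assumes the rule from the question (runs of # start at one and grow by exactly one each time).

```javascript
// Hypothetical helper: true if the string is #d+ ##d+ ###d+ ... where the
// number of # starts at one and grows by exactly one per chunk.
function isIncreasingHashes(s) {
  // The whole string must consist of "hashes then digits" chunks only.
  if (!/^(?:#+\d+)+$/.test(s)) return false;
  const chunks = s.match(/#+\d+/g);
  // Chunk i (0-based) must start with exactly i + 1 hashes.
  return chunks.every((chunk, i) => chunk.match(/^#+/)[0].length === i + 1);
}

["#123", "#12##235", "#1234##12###368", "##123", "#123###456", "#123##456##789"]
  .forEach(s => console.log(s, isIncreasingHashes(s)));
```

The first three samples are the valid ones from the question and the last three are the invalid ones.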

Although there are already working answers, and inspired by Wiktor's answer, I came up with this idea:
(?:(^#|#\1)\d+)+$
It is also quite short and effective (and also works in non-PCRE environments).

Regular expression using positive lookbehind not working in Alteryx

I am trying to match the 2nd word after "Vores ref.:" using a positive lookbehind. It works in online testers like https://regexr.com/, but my tool, Alteryx, doesn't allow quantifiers like + in a lookbehind.
"ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324"
(?<=Vores\sref.:\s\d+-\d+\s+)\w+ correctly matches LW782837673 on regexr.com, but not in Alteryx.
How can I rewrite the lookbehind expression so that I can keep the quantifiers and still get what I want?

You can use a replacement approach here using
.*?\bVores\s+ref\.:\s+\d+-\d+\s+(\w+).*
Replace with $1.
Details:
.*? - any 0+ chars other than line break chars, as few as possible
\bVores - whole word Vores
\s+ - one or more whitespaces
ref\.: - ref.: substring
\s+ - one or more whitespaces
\d+-\d+ - one or more digits, - and one or more digits
\s+ - one or more whitespaces
(\w+) - Capturing group 1: one or more word chars.
.* - any 0+ chars other than line break chars, as many as possible.

You can use a capture group instead.
Note that the dot must be escaped as \. to match it literally.
\bVores\sref\.:\s\d+-\d+\s+(\w+)
The pattern matches:
\bVores\sref\.:\s\d+-\d+\s+ Your lookbehind pattern turned into a regular (consuming) match
(\w+) Capture group 1, match 1+ word characters
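To show how the two approaches above read the value, here is a small JavaScript sketch using the sample string from the question. It only illustrates the patterns and says nothing about Alteryx's own functions; the helper names are hypothetical.

```javascript
// Sample string from the question.
const sample = "ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324";

// Hypothetical helper: capture-group approach, read Group 1 of the match.
function wordViaGroup(s) {
  const m = s.match(/\bVores\sref\.:\s\d+-\d+\s+(\w+)/);
  return m ? m[1] : null;
}

// Hypothetical helper: replacement approach, match the whole line and keep only Group 1.
function wordViaReplace(s) {
  return s.replace(/.*?\bVores\s+ref\.:\s+\d+-\d+\s+(\w+).*/, "$1");
}

console.log(wordViaGroup(sample));   // LW782837673
console.log(wordViaReplace(sample)); // LW782837673
```

Note that the replacement approach returns the input unchanged when the pattern does not match, while the capture-group helper returns null.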

Find regex expression

I am trying to find a regex expression to match strings with 4 repeating digits and optional hyphens in between, such as:
-3-3-3-3-
-1111-
2222-
0-0-00
Currently I have:
\-?(\d(\-*))\1{3}\-?
which matches the first two but not the last two. Any suggestions?

You may use
^-?(\d)(?:-?\1){3}-?$
To find the pattern in a larger string, remove the ^ and $ anchors:
-?(\d)(?:-?\1){3}-?
If the pattern is a part of a longer pattern, you might have to adjust the backreference number (if there are other capturing groups in the pattern).
Details
^ - start of string
-? - an optional -
(\d) - Group 1: any digit
(?:-?\1){3} - three occurrences of an optional - and then the same value as captured in Group 1
-? - an optional hyphen
$ - end of the string.
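A quick way to check the anchored pattern against the samples from the question, sketched in JavaScript. The function name hasFourRepeatedDigits is made up, and the two non-matching inputs are extra examples added for contrast.

```javascript
// Hypothetical helper: true if the string is the same digit four times,
// with optional single hyphens before, between and after the digits.
function hasFourRepeatedDigits(s) {
  return /^-?(\d)(?:-?\1){3}-?$/.test(s);
}

// The four samples from the question, then two inputs that should not match.
["-3-3-3-3-", "-1111-", "2222-", "0-0-00", "1-2-3-4", "111"]
  .forEach(s => console.log(s, hasFourRepeatedDigits(s)));
```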

Regex to capture everything up to (but not including the 1st space and hyphen)

Here is my sample string:
Google Chrome-Helper -type=renderer -field-trial-handle=1
But I want just Google Chrome-Helper
I've tried ^.*[ ][-] but it captures everything up to the last parameter.

You need to use lazy dot matching and either use capturing or a lookahead:
^(.*?)\s+-
(your value will be in Group 1) or
^.*?(?=\s+-)
Details:
^ - start of string anchor
.*? - any 0+ chars other than a newline, as few as possible (i.e. the subsequent subpatterns are tried first, this one is skipped, the regex engine only comes back here if they fail to find a match)
(?=\s+-) - a positive lookahead that requires 1+ whitespace and then a hyphen.
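Both variants can be tried on the sample string with a short JavaScript sketch. The helper names nameWithGroup and nameWithLookahead are hypothetical.

```javascript
// Sample string from the question.
const cmd = "Google Chrome-Helper -type=renderer -field-trial-handle=1";

// Hypothetical helper: capturing variant, the value is in Group 1.
function nameWithGroup(s) {
  const m = s.match(/^(.*?)\s+-/);
  return m ? m[1] : null;
}

// Hypothetical helper: lookahead variant, the value is the whole match.
function nameWithLookahead(s) {
  const m = s.match(/^.*?(?=\s+-)/);
  return m ? m[0] : null;
}

console.log(nameWithGroup(cmd));     // Google Chrome-Helper
console.log(nameWithLookahead(cmd)); // Google Chrome-Helper
```

The hyphen inside Chrome-Helper does not end the match because it is not preceded by whitespace.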
