Regex Pattern where group may not exist

Regex Pattern where group may not exist - regex

I have a RegEx pattern that needs to match on any of the following lines:
10-10-15 15:16:41.1 Some Text here
10-10-15 15:16:41.12 Some Text here
10-10-15 15:16:41.123 Some Text here
10-10-15 15:16:41 Some Text here
I can match the first 3 with the pattern below:
(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})\.(?<milli>\d{0,3})))\s(?<Line>.*)
How do i Match this line (10-10-15 15:16:41 Some Text here) which has no milliseconds but still get the group back in my result either wit a blank value or with 0 as the value?
Thanks
As i said each of the lines below will match:
10-10-15 15:16:41.123 Some text Here
10-10-15 15:16:41.12 Some Text here
10-10-15 15:16:41.1 Some Text here
10-10-15 15:16:41. Some Text here
The groups look like so:
date [0-18] `10-10-15 15:16:41.`
day [0-2] `10`
month [3-5] `10`
year [6-8] `15`
time [9-18] `15:16:41.`
hour [9-11] `15`
minutes [12-14] `16`
seconds [15-17] `41`
milli [18-18] ``
Line [19-34] `Some Text here `

You can use the following (slightly modified version of your regex):
(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.\d{0,3})?))\s(?<logEntry>.*)
See DEMO
Explanation:
Make the <milli> part optional.. and not the . since it matches strings like 10-10-15 15:16:41123 Some Text here also..

Worked it out. I needed the following pattern:
(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.?\d{0,3})))\s(?<logEntry>.*)

^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d*)([a-zA-Z\s]+)
Note the (\d*) which will return the group even if empty.
Demo

Make the milliseconds optional ?
/^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$/
Example:
<?php
$strings = <<< LOL
10-10-15 15:16:41.1 Some Text here
10-10-15 15:16:41.12 Some Text here
10-10-15 15:16:41.123 Some Text here
10-10-15 15:16:41 Some Text here
LOL;
preg_match_all('/^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$/m', $strings , $matches, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($matches[0]); $i++) {
$day = $matches[1][$i];
$month = $matches[2][$i];
$year = $matches[3][$i];
$hours = $matches[4][$i];
$minutes = $matches[5][$i];
$seconds = $matches[6][$i];
$ms = $matches[7][$i];
$text = $matches[8][$i];
echo "$day $month $year $hours $minutes $seconds $ms $text \n";
}
Regex Demo:
https://regex101.com/r/aF9wN6/1
PHP Demo:
http://ideone.com/1aEt2E
Regex Explanation:
^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$
Assert position at the beginning of a line (at beginning of the string or after a line break character) (line feed) «^»
Match the regex below and capture its match into backreference number 1 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “-” literally «-»
Match the regex below and capture its match into backreference number 2 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “-” literally «-»
Match the regex below and capture its match into backreference number 3 «([\d]{2}|[\d]{4})»
Match this alternative (attempting the next alternative only if this one fails) «[\d]{2}»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Or match this alternative (the entire group fails if this one fails to match) «[\d]{4}»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{4}»
Exactly 4 times «{4}»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 4 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “:” literally «:»
Match the regex below and capture its match into backreference number 5 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “:” literally «:»
Match the regex below and capture its match into backreference number 6 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “.” literally «\.?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the regex below and capture its match into backreference number 7 «(\d+)?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character that is a “digit” (any decimal number in any Unicode script) «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 8 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line (at the end of the string or before a line break character) (line feed) «$»

Related

RegEx after certain string

I have a manifest file
Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4
I want to change the value of Bundle-Name using -replace in Powershell.
I used this pattern Bundle-Name:(.*)
But it returns including the Bundle-Name. What would be the pattern if I want to change only the value of the Bundle-Name?

You could capture both the Bundle-Name: and its value in two separate capture groups.
Then replace like this:
$manifest = #"
Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4
"#
$newBundleName = 'BundleTest'
$manifest -replace '(Bundle-Name:\s*)(.*)', ('$1{0}' -f $newBundleName)
# or
# $manifest -replace '(Bundle-Name:\s*)(.*)', "`$1$newBundleName"
The above will result in
Bundle-ManifestVersion: 2
Bundle-Name: BundleTest
Bundle-Version: 4
Regex details:
( Match the regex below and capture its match into backreference number 1
Bundle-Name: Match the character string “Bundle-Name:” literally (case sensitive)
\s Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
( Match the regex below and capture its match into backreference number 2
. Match any single character that is NOT a line break character (line feed)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
Thanks to LotPings, there is even an easier regex that can be used:
$manifest -replace '(?<=Bundle-Name:\s*).*', $newBundleName
This uses a positive lookbehind.
The regex details for that are:
(?<= Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
Bundle-Name: Match the characters “Bundle-Name:” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)

regex findall to retrieve a substring based on start and end character

I have the following string:
6[Sup. 1e+02]
I'm trying to retrieve a substring of just 1e+02. The variable first refers to the above specified string. Below is what I have tried.
re.findall(' \d*]', first)

You need to use the following regex:
\b\d+e\+\d+\b
Explanation:
\b - Word boundary
\d+ - Digits, 1 or more
e - Literal e
\+ - Literal +
\d+ - Digits, 1 or more
\b - Word boundary
See demo
Sample code:
import re
p = re.compile(ur'\b\d+e\+\d+\b')
test_str = u"6[Sup. 1e+02]"
re.findall(p, test_str)
See IDEONE demo

import re
first = "6[Sup. 1e+02]"
result = re.findall(r"\s+(.*?)\]", first)
print result
Output:
['1e+02']
Demo
http://ideone.com/Kevtje
regex Explanation:
\s+(.*?)\]
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 1 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “]” literally «\]»

regular expression match _ underscore

I have a string like this :
002_part1_part2_______________by_test
and I would like to stop the match at the second underscore character, like this :
002_part1_part2_
How can I do that with a Regular expression ?
Thanks

Create a pattern to match any character but not of an _ zero or more times followed by an underscore symbol. Put that pattern inside a capturing or non-capturing group and make it to repeat exactly 3 times by adding range quantifier {3} next to that group.
^(?:[^_]*_){3}
DEMO

You can use:
.*\d_
EXPLANATION:
Match any single character that is NOT a line break character (line feed) «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a “digit” (any decimal number in any Unicode script) «\d»
Match the character “_” literally «_»
https://regex101.com/r/uX0qD5/1

How to match a string, for which a specific method is invoked, via regex?

Regex: \"(.+?)\"\.Localize\(\)
Text: ModelState.AddModelError("Property", "Invalid property.".Localize());
Example:
http://regex101.com/r/aY5jK2
Currently the text Property", "Invalid property. gets matched. How do i match just the Invalid property string?

Try this:
\"([^"]+?)\"\.Localize\(\)
Demo Here

You can use this regex:
(")([^"]+)\1(?=\.Localize\(\))
Group 2 will contain what you want.

#Dante, Keep your regex that's working a simply add a space (\s) before the quote:
\s\"(.*?)\"\.Localize\(\)
DEMO
REGEX EXPLANATION:
\s\"(.*?)\"\.Localize\(\)
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s»
Match the character “"” literally «\"»
Match the regex below and capture its match into backreference number 1 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “"” literally «\"»
Match the character “.” literally «\.»
Match the character string “Localize” literally (case insensitive) «Localize»
Match the character “(” literally «\(»
Match the character “)” literally «\)»

Chaning the image url with a regular expression

I have to change a url that looks like
http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg
into this format
http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
I understand I need to create a regular expression pattern that will divide the initial url into three groups:
http://my-assets.s3.amazonaws.com/uploads/
2011/10/
PiaggioBeverly-001-106x106.jpg
and then cut off the resolution string (106x106) from the third group, get rid of the hyphen at the end and move the resolution next to the second. Any idea how to get it done using something like preg_replace?

search this : (.*\/)(\w+-\d+)-(.*?)\.
and replace with : \1\3/\2.
demo here : http://regex101.com/r/fX7gC2

The pattern will be as follow(for input uploads/2011/10/PiaggioBeverly-001-106x106.jpg)
^(.*/)(.+?)(\d+x\d+)(\.jpg)$
And the groups will be holding as follows:
$1 = uploads/2011/10/
$2 = PiaggioBeverly-001-
$3 = 106x106
$4 = .jpg
Now rearrange as per your need. You can check this example from online.
As you have mentioned about preg_replace(), so if its in PHP, you can use preg_match() for this.

<?php
$oldurl = "http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg";
$newurl = preg_replace('%(.*?)/(\w+)-(\w+)-(\w+)\.(\w+)%sim', '$1/$4/$2-$3.jpg', $oldurl);
echo $newurl;
#http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
?>
DEMO
EXPLANATION:
Options: dot matches newline; case insensitive; ^ and $ match at line breaks
Match the regular expression below and capture its match into backreference number 1 «(.*?)»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “/” literally «/»
Match the regular expression below and capture its match into backreference number 2 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 3 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 4 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “.” literally «\.»
Match the regular expression below and capture its match into backreference number 5 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Regex Pattern where group may not exist - regex

Worked it out. I needed the following pattern: (?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.?\d{0,3})))\s(?<logEntry>.*)

^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d)([a-zA-Z\s]+) Note the (\d) which will return the group even if empty. Demo

Related

RegEx after certain string

regex findall to retrieve a substring based on start and end character

regular expression match _ underscore

How to match a string, for which a specific method is invoked, via regex?

Chaning the image url with a regular expression

Categories

Resources

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Regex Pattern where group may not exist - regex

Worked it out. I needed the following pattern: (?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.?\d{0,3})))\s(?<logEntry>.*)

^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d*)([a-zA-Z\s]+) Note the (\d*) which will return the group even if empty. Demo

Related

RegEx after certain string

regex findall to retrieve a substring based on start and end character

regular expression match _ underscore

How to match a string, for which a specific method is invoked, via regex?

Chaning the image url with a regular expression

Categories

Resources

^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d)([a-zA-Z\s]+) Note the (\d) which will return the group even if empty. Demo