RegEx after certain string - regex

I have a manifest file
Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4
I want to change the value of Bundle-Name using -replace in PowerShell.
I used the pattern Bundle-Name:(.*)
But the match includes the Bundle-Name: prefix itself. What pattern should I use if I want to change only the value of Bundle-Name?

You could capture both the Bundle-Name: and its value in two separate capture groups.
Then replace like this:
$manifest = @"
Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4
"@
$newBundleName = 'BundleTest'
$manifest -replace '(Bundle-Name:\s*)(.*)', ('$1{0}' -f $newBundleName)
# or
# $manifest -replace '(Bundle-Name:\s*)(.*)', "`$1$newBundleName"
The above will result in
Bundle-ManifestVersion: 2
Bundle-Name: BundleTest
Bundle-Version: 4
Regex details:
( Match the regex below and capture its match into backreference number 1
Bundle-Name: Match the character string “Bundle-Name:” literally (case sensitive)
\s Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
( Match the regex below and capture its match into backreference number 2
. Match any single character that is NOT a line break character (line feed)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
Thanks to LotPings, there is an even easier regex that can be used:
$manifest -replace '(?<=Bundle-Name:\s*).*', $newBundleName
This uses a positive lookbehind.
The regex details for that are:
(?<= Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
Bundle-Name: Match the characters “Bundle-Name:” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
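For comparison, here is a minimal sketch of the same capture-group replacement in Python. It is an illustration only, not part of the original PowerShell answer, and the variable names are assumptions. Python's built-in re module accepts only fixed-width lookbehinds, so the (?<=Bundle-Name:\s*) form would be rejected there; the sketch therefore uses the two-group form.

```python
import re

manifest = """Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4"""

new_bundle_name = "BundleTest"  # hypothetical replacement value

# Group 1 keeps the "Bundle-Name:" prefix plus the spacing after it;
# group 2 is the old value, which is discarded.
# [ \t]* is used instead of \s* so the prefix can never run onto the next line.
pattern = re.compile(r"^(Bundle-Name:[ \t]*)(.*)$", re.MULTILINE)

# A function replacement sidesteps escaping problems if the new name
# happens to contain backslashes or starts with a digit.
updated = pattern.sub(lambda m: m.group(1) + new_bundle_name, manifest)

print(updated)
```

Only the Bundle-Name line changes; the other two lines pass through untouched.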

Related

regex findall to retrieve a substring based on start and end character

I have the following string:
6[Sup. 1e+02]
I'm trying to retrieve a substring of just 1e+02. The variable first refers to the above specified string. Below is what I have tried.
re.findall(' \d*]', first)
You need to use the following regex:
\b\d+e\+\d+\b
Explanation:
\b - Word boundary
\d+ - Digits, 1 or more
e - Literal e
\+ - Literal +
\d+ - Digits, 1 or more
\b - Word boundary
Sample code:
import re
p = re.compile(r'\b\d+e\+\d+\b')  # plain raw string; the ur'' prefix only exists in Python 2
test_str = u"6[Sup. 1e+02]"
re.findall(p, test_str)
See IDEONE demo
import re
first = "6[Sup. 1e+02]"
result = re.findall(r"\s+(.*?)\]", first)
print(result)
Output:
['1e+02']
Demo
http://ideone.com/Kevtje
regex Explanation:
\s+(.*?)\]
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 1 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “]” literally «\]»
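The two answers above take different routes: one matches the shape of the number, the other captures whatever sits between the whitespace and the closing bracket. A small Python 3 sketch, with illustrative variable names, runs both against the sample string:

```python
import re

first = "6[Sup. 1e+02]"

# Approach 1: describe the number itself (digits, "e", "+", digits).
by_shape = re.findall(r"\b\d+e\+\d+\b", first)

# Approach 2: capture lazily between the whitespace and the closing "]".
by_position = re.findall(r"\s+(.*?)\]", first)

print(by_shape)
print(by_position)
```

The first pattern is stricter about what counts as a value; the second accepts any text in that position, which may or may not be what you want for other inputs.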

Regex Pattern where group may not exist

I have a RegEx pattern that needs to match on any of the following lines:
10-10-15 15:16:41.1 Some Text here
10-10-15 15:16:41.12 Some Text here
10-10-15 15:16:41.123 Some Text here
10-10-15 15:16:41 Some Text here
I can match the first 3 with the pattern below:
(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})\.(?<milli>\d{0,3})))\s(?<Line>.*)
How do I match this line (10-10-15 15:16:41 Some Text here), which has no milliseconds, and still get the milli group back in my result, either with a blank value or with 0 as the value?
Thanks
As I said, each of the lines below will match:
10-10-15 15:16:41.123 Some text Here
10-10-15 15:16:41.12 Some Text here
10-10-15 15:16:41.1 Some Text here
10-10-15 15:16:41. Some Text here
The groups look like so:
date [0-18] `10-10-15 15:16:41.`
day [0-2] `10`
month [3-5] `10`
year [6-8] `15`
time [9-18] `15:16:41.`
hour [9-11] `15`
minutes [12-14] `16`
seconds [15-17] `41`
milli [18-18] ``
Line [19-34] `Some Text here `
You can use the following (slightly modified version of your regex):
(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.\d{0,3})?))\s(?<logEntry>.*)
Explanation:
Make the whole <milli> part (the dot together with its digits) optional, rather than making only the . optional, since an optional . on its own would also match strings like 10-10-15 15:16:41123 Some Text here.
Worked it out. I needed the following pattern:
(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.?\d{0,3})))\s(?<logEntry>.*)
^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d*)([a-zA-Z\s]+)
Note the (\d*) which will return the group even if empty.
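To illustrate the difference between (\d*) and an optional (\d+)?, here is a small Python sketch. How a group that did not participate is reported varies between regex engines; the behaviour shown is that of Python's re module, and the variable names are illustrative.

```python
import re

line = "10-10-15 15:16:41 Some Text here"

# (\d*) always takes part in the match, so it yields "" when no digits follow.
always = re.match(
    r"^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d*)([a-zA-Z\s]+)", line
)

# (\d+)? can be skipped entirely; Python then reports the group as None.
optional = re.match(
    r"^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d+)?([a-zA-Z\s]+)", line
)

print(repr(always.group(7)))
print(repr(optional.group(7)))
```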
Make the milliseconds optional:
/^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$/
Example:
<?php
$strings = <<< LOL
10-10-15 15:16:41.1 Some Text here
10-10-15 15:16:41.12 Some Text here
10-10-15 15:16:41.123 Some Text here
10-10-15 15:16:41 Some Text here
LOL;
preg_match_all('/^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$/m', $strings , $matches, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($matches[0]); $i++) {
    $day = $matches[1][$i];
    $month = $matches[2][$i];
    $year = $matches[3][$i];
    $hours = $matches[4][$i];
    $minutes = $matches[5][$i];
    $seconds = $matches[6][$i];
    $ms = $matches[7][$i];
    $text = $matches[8][$i];
    echo "$day $month $year $hours $minutes $seconds $ms $text \n";
}
Regex Demo:
https://regex101.com/r/aF9wN6/1
PHP Demo:
http://ideone.com/1aEt2E
Regex Explanation:
^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$
Assert position at the beginning of a line (at beginning of the string or after a line break character) (line feed) «^»
Match the regex below and capture its match into backreference number 1 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “-” literally «-»
Match the regex below and capture its match into backreference number 2 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “-” literally «-»
Match the regex below and capture its match into backreference number 3 «([\d]{2}|[\d]{4})»
Match this alternative (attempting the next alternative only if this one fails) «[\d]{2}»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Or match this alternative (the entire group fails if this one fails to match) «[\d]{4}»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{4}»
Exactly 4 times «{4}»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 4 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “:” literally «:»
Match the regex below and capture its match into backreference number 5 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “:” literally «:»
Match the regex below and capture its match into backreference number 6 «([\d]{2})»
Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
Exactly 2 times «{2}»
Match the character “.” literally «\.?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the regex below and capture its match into backreference number 7 «(\d+)?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character that is a “digit” (any decimal number in any Unicode script) «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 8 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line (at the end of the string or before a line break character) (line feed) «$»
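Pulling the answers together, the following Python sketch shows one possible way to keep the named groups from the question while treating the milliseconds as optional and defaulting them to "0". The function name, the group name text, and the choice to require at least one digit after the dot are assumptions made for this example, not something the answers above prescribe.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<day>\d{1,2})-(?P<month>\d{1,2})-(?P<year>\d{4}|\d{2})\s"
    r"(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})"
    r"(?:\.(?P<milli>\d{1,3}))?"  # the dot and the digits are optional together
    r"\s+(?P<text>.*)$"
)


def parse(line):
    """Return the named groups as a dict, or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # A skipped optional group is None in Python; substitute "0" as requested.
    fields["milli"] = fields["milli"] or "0"
    return fields


samples = [
    "10-10-15 15:16:41.1 Some Text here",
    "10-10-15 15:16:41.12 Some Text here",
    "10-10-15 15:16:41.123 Some Text here",
    "10-10-15 15:16:41 Some Text here",
]

millis = [parse(s)["milli"] for s in samples]
print(millis)
```

Because the dot lives inside the optional group, a line such as 10-10-15 15:16:41123 Some Text here is rejected instead of being silently accepted.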

Javascript transformation

Is there any simple way to transform:
"<A[hello|home]>"
to:
"hello|home"
Thanks!
Apart from the clever advice in the comments to simply remove certain characters, if you are unable to remove these characters because they are present elsewhere in the text and do want to match that format, here is a way to do it with regex:
Search: <\w+\[([^|]*\|[^\]]*)\]>
Replace: \1 or $1 depending on editor or regex engine.
See the Substitution pane at the bottom of the demo.
Explanation
<\w+\[([^|]*\|[^\]]*)\]>
Match the character “<” literally <
Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation) \w+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
Match the character “[” literally \[
Match the regex below and capture its match into backreference number 1 ([^|]*\|[^\]]*)
Match any character that is NOT a “|” [^|]*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “|” literally \|
Match any character that is NOT a “]” [^\]]*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “]” literally \]
Match the character “>” literally >
\1
Insert the backslash character \
Insert the character “1” literally 1
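The question asks about JavaScript, but the search-and-replace pair is engine-neutral. As a quick check, here is a sketch in Python, where the replacement is written \1; the surrounding sample text is made up for the example.

```python
import re

pattern = re.compile(r"<\w+\[([^|]*\|[^\]]*)\]>")

# The exact string from the question.
single = pattern.sub(r"\1", "<A[hello|home]>")

# Hypothetical longer text, showing that only the tagged spans are rewritten.
embedded = pattern.sub(r"\1", "before <A[hello|home]> after <B[x|y]>")

print(single)
print(embedded)
```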

Changing the image url with a regular expression

I have to change a url that looks like
http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg
into this format
http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
I understand I need to create a regular expression pattern that will divide the initial url into three groups:
http://my-assets.s3.amazonaws.com/uploads/
2011/10/
PiaggioBeverly-001-106x106.jpg
and then cut off the resolution string (106x106) from the third group, get rid of the hyphen at the end and move the resolution next to the second. Any idea how to get it done using something like preg_replace?
Search for: (.*\/)(\w+-\d+)-(.*?)\.
Replace with: \1\3/\2.
Demo: http://regex101.com/r/fX7gC2
The pattern will be as follows (for the input uploads/2011/10/PiaggioBeverly-001-106x106.jpg):
^(.*/)(.+?)(\d+x\d+)(\.jpg)$
The groups will then hold the following:
$1 = uploads/2011/10/
$2 = PiaggioBeverly-001-
$3 = 106x106
$4 = .jpg
Now rearrange the groups as needed.
Since you mentioned preg_replace(), I assume this is PHP, in which case you can use preg_match() to obtain these groups.
<?php
$oldurl = "http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg";
$newurl = preg_replace('%(.*?)/(\w+)-(\w+)-(\w+)\.(\w+)%sim', '$1/$4/$2-$3.jpg', $oldurl);
echo $newurl;
#http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
?>
EXPLANATION:
Options: dot matches newline; case insensitive; ^ and $ match at line breaks
Match the regular expression below and capture its match into backreference number 1 «(.*?)»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “/” literally «/»
Match the regular expression below and capture its match into backreference number 2 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 3 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 4 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “.” literally «\.»
Match the regular expression below and capture its match into backreference number 5 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
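The same rearrangement can be sketched in Python as a single substitution. The function name and the pattern below are assumptions for illustration; they combine ideas from the answers above (an anchored pattern with a \d+x\d+ group) and additionally keep whatever file extension the URL already has.

```python
import re


def move_resolution(url):
    """Move a trailing -WIDTHxHEIGHT suffix into its own path segment."""
    # 1: everything up to the last "/", 2: base name,
    # 3: resolution such as 106x106, 4: the extension including the dot.
    return re.sub(r"^(.*/)(.+?)-(\d+x\d+)(\.\w+)$", r"\1\3/\2\4", url)


old_url = "http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg"
new_url = move_resolution(old_url)
print(new_url)
```

A URL without a resolution suffix does not match the pattern and is returned unchanged.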

Extract string using regex from procedure script text?

In the code below I'd like to return this string using regex: 'DataSource=xxxtransxxx;Initial Catalog=Sales'
Sales may or may not have a prefix, and there may or may not be spaces between the items in the string above. I have tried the regex in the code below but it does not work. Any advice appreciated, thanks!!
var thestring = @"'SELECT CONVERT(VARCHAR(MAX), [Measures].[EmployeeID ParameterCaption]) AS [EmployeeID]
,CONVERT(VARCHAR(50), [Measures].[Sales Rep Name ParameterCaption]) AS [EmployeeName]
,CONVERT(VARCHAR(50), [Measures].[Manager EmployeeID ParameterCaption]) AS [ManagerEmployeeID]
,CONVERT(MONEY, [MEASURES].[PrevFYYTD])AS PrevFYYTD
,CONVERT(MONEY, [MEASURES].[CurrentFYYTD] ) AS CurrentFYYTD
,CONVERT(VARCHAR(50),[MEASURES].[PCTGrowth] )+''%'' AS [PCTGrowth]
,CONVERT(VARCHAR, [MEASURES].[DollarGrowth] ) AS DollarGrowth
,CONVERT(VARCHAR, [MEASURES].[HasParent] ) AS HasParent
,CONVERT(VARCHAR, [MEASURES].[HasChild] ) AS HasChild
FROM OPENROWSET(''MSOLAP'',''DataSource=xxxtransxxx;Initial Catalog=Sales'' , SET @MDX = ''' WITH;";
Regex rgx = new Regex(@"'\s*DataSource\s*=\s.*trans*(.*Sales) *'", RegexOptions.IgnoreCase);
string result = rgx.Match(thestring).Groups[0].Value;
You could use
\bDataSource\s*=[^';]+?;\s*Initial *Catalog\s*=[^;']+
code
string resultString = null;
try {
resultString = Regex.Match(subjectString, @"\bDataSource\s*=[^';]+?;\s*Initial *Catalog\s*=[^;']+", RegexOptions.IgnoreCase | RegexOptions.Multiline).Value;
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
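As a quick cross-check of the pattern used in the C# snippet above, here is a sketch in Python. The script text is shortened to the relevant fragment, and the second sample with extra spaces and a prefixed catalog name is invented for the example to exercise the "may or may not" cases from the question.

```python
import re

CONNECTION = re.compile(
    r"\bDataSource\s*=[^';]+?;\s*Initial *Catalog\s*=[^;']+",
    re.IGNORECASE,
)


def extract(script):
    """Return the DataSource/Initial Catalog fragment, or None if absent."""
    m = CONNECTION.search(script)
    return m.group(0) if m else None


# Shortened fragment of the statement from the question.
compact = "FROM OPENROWSET(''MSOLAP'',''DataSource=xxxtransxxx;Initial Catalog=Sales'' , SET"

# Hypothetical variant with spaces around the items and a prefixed catalog name.
spaced = "FROM OPENROWSET(''MSOLAP'',''DataSource = srv ;  Initial Catalog = PreSales'' , SET"

print(extract(compact))
print(extract(spaced))
```

The negated classes [^';] and [^;'] are what stop the match at the closing quotes, so nothing after the catalog name is swallowed.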
explanation
@"
(?i) # Match the remainder of the regex with the options: case insensitive (i)
\b # Assert position at a word boundary
DataSource # Match the characters “DataSource” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
= # Match the character “=” literally
[^';] # Match a single character NOT present in the list “';”
+? # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
; # Match the character “;” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Initial # Match the characters “Initial” literally
\ # Match the character “ ” literally
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Catalog # Match the characters “Catalog” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
= # Match the character “=” literally
[^;'] # Match a single character NOT present in the list “;'”
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"