Extract string using regex from procedure script text? - regex

In this code below I'd like to return this string suing regex: 'DataSource=xxxtransxxx;Initial Catalog=Sales'
Sales could or could not have a prefix and there could or could not be spaces in between the items in the string above. I have tried the regex in the code below but it does not work. Any advice appreciated, thanks!!
var thestring = #"'SELECT CONVERT(VARCHAR(MAX), [Measures].[EmployeeID ParameterCaption]) AS [EmployeeID]
,CONVERT(VARCHAR(50), [Measures].[Sales Rep Name ParameterCaption]) AS [EmployeeName]
,CONVERT(VARCHAR(50), [Measures].[Manager EmployeeID ParameterCaption]) AS [ManagerEmployeeID]
,CONVERT(MONEY, [MEASURES].[PrevFYYTD])AS PrevFYYTD
,CONVERT(MONEY, [MEASURES].[CurrentFYYTD] ) AS CurrentFYYTD
,CONVERT(VARCHAR(50),[MEASURES].[PCTGrowth] )+''%'' AS [PCTGrowth]
,CONVERT(VARCHAR, [MEASURES].[DollarGrowth] ) AS DollarGrowth
,CONVERT(VARCHAR, [MEASURES].[HasParent] ) AS HasParent
,CONVERT(VARCHAR, [MEASURES].[HasChild] ) AS HasChild
FROM OPENROWSET(''MSOLAP'',''DataSource=xxxtransxxx;Initial Catalog=Sales'' , SET #MDX = ''' WITH;";
Regex rgx = new Regex(#"'\s*DataSource\s*=\s.*trans*(.*Sales) *'", RegexOptions.IgnoreCase);
string result = rgx.Match(thestring).Groups[0].Value;

You could use
\s*DataSource\s*=[^';]+?;\s*Initial *Catalog\s*=[^;']+$
code
string resultString = null;
try {
resultString = Regex.Match(subjectString, #"\bDataSource\s*=[^';]+?;\s*Initial *Catalog\s*=[^;']+", RegexOptions.IgnoreCase | RegexOptions.Multiline).Value;
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
explanation
#"
(?i) # Match the remainder of the regex with the options: case insensitive (i)
\b # Assert position at a word boundary
DataSource # Match the characters “DataSource” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
= # Match the character “=” literally
[^';] # Match a single character NOT present in the list “';”
+? # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
; # Match the character “;” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Initial # Match the characters “Initial” literally
\ # Match the character “ ” literally
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Catalog # Match the characters “Catalog” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
= # Match the character “=” literally
[^;'] # Match a single character NOT present in the list “;'”
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"

Related

RegEx after certain string

I have a manifest file
Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4
I want to change the value of Bundle-Name using -replace in Powershell.
I used this pattern Bundle-Name:(.*)
But it returns including the Bundle-Name. What would be the pattern if I want to change only the value of the Bundle-Name?
You could capture both the Bundle-Name: and its value in two separate capture groups.
Then replace like this:
$manifest = #"
Bundle-ManifestVersion: 2
Bundle-Name: BundleSample
Bundle-Version: 4
"#
$newBundleName = 'BundleTest'
$manifest -replace '(Bundle-Name:\s*)(.*)', ('$1{0}' -f $newBundleName)
# or
# $manifest -replace '(Bundle-Name:\s*)(.*)', "`$1$newBundleName"
The above will result in
Bundle-ManifestVersion: 2
Bundle-Name: BundleTest
Bundle-Version: 4
Regex details:
( Match the regex below and capture its match into backreference number 1
Bundle-Name: Match the character string “Bundle-Name:” literally (case sensitive)
\s Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
( Match the regex below and capture its match into backreference number 2
. Match any single character that is NOT a line break character (line feed)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
Thanks to LotPings, there is even an easier regex that can be used:
$manifest -replace '(?<=Bundle-Name:\s*).*', $newBundleName
This uses a positive lookbehind.
The regex details for that are:
(?<= Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
Bundle-Name: Match the characters “Bundle-Name:” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)

regex findall to retrieve a substring based on start and end character

I have the following string:
6[Sup. 1e+02]
I'm trying to retrieve a substring of just 1e+02. The variable first refers to the above specified string. Below is what I have tried.
re.findall(' \d*]', first)
You need to use the following regex:
\b\d+e\+\d+\b
Explanation:
\b - Word boundary
\d+ - Digits, 1 or more
e - Literal e
\+ - Literal +
\d+ - Digits, 1 or more
\b - Word boundary
See demo
Sample code:
import re
p = re.compile(ur'\b\d+e\+\d+\b')
test_str = u"6[Sup. 1e+02]"
re.findall(p, test_str)
See IDEONE demo
import re
first = "6[Sup. 1e+02]"
result = re.findall(r"\s+(.*?)\]", first)
print result
Output:
['1e+02']
Demo
http://ideone.com/Kevtje
regex Explanation:
\s+(.*?)\]
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 1 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “]” literally «\]»

Javascript transformation

Is there any simple way to transform:
"<A[hello|home]>"
to:
"hello|home"
Thanks!
Apart from the clever advice in the comments to simply remove certain characters, if you are unable to remove these characters because they are present elsewhere in the text and do want to match that format, here is a way to do it with regex:
Search: <\w+\[([^|]*\|[^\]]*)\]>
Replace: \1 or $1 depending on editor or regex engine.
See the Substitution pane at the bottom of the demo.
Explanation
<\w+\[([^|]*\|[^\]]*)\]>
Match the character “<” literally <
Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation) \w+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
Match the character “[” literally \[
Match the regex below and capture its match into backreference number 1 ([^|]*\|[^\]]*)
Match any character that is NOT a “|” [^|]*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “|” literally \|
Match any character that is NOT a “]” [^\]]*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “]” literally \]
Match the character “>” literally >
\1
Insert the backslash character \
Insert the character “1” literally 1

Chaning the image url with a regular expression

I have to change a url that looks like
http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg
into this format
http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
I understand I need to create a regular expression pattern that will divide the initial url into three groups:
http://my-assets.s3.amazonaws.com/uploads/
2011/10/
PiaggioBeverly-001-106x106.jpg
and then cut off the resolution string (106x106) from the third group, get rid of the hyphen at the end and move the resolution next to the second. Any idea how to get it done using something like preg_replace?
search this : (.*\/)(\w+-\d+)-(.*?)\.
and replace with : \1\3/\2.
demo here : http://regex101.com/r/fX7gC2
The pattern will be as follow(for input uploads/2011/10/PiaggioBeverly-001-106x106.jpg)
^(.*/)(.+?)(\d+x\d+)(\.jpg)$
And the groups will be holding as follows:
$1 = uploads/2011/10/
$2 = PiaggioBeverly-001-
$3 = 106x106
$4 = .jpg
Now rearrange as per your need. You can check this example from online.
As you have mentioned about preg_replace(), so if its in PHP, you can use preg_match() for this.
<?php
$oldurl = "http://my-assets.s3.amazonaws.com/uploads/2011/10/PiaggioBeverly-001-106x106.jpg";
$newurl = preg_replace('%(.*?)/(\w+)-(\w+)-(\w+)\.(\w+)%sim', '$1/$4/$2-$3.jpg', $oldurl);
echo $newurl;
#http://my-assets.s3.amazonaws.com/uploads/2011/10/106x106/PiaggioBeverly-001.jpg
?>
DEMO
EXPLANATION:
Options: dot matches newline; case insensitive; ^ and $ match at line breaks
Match the regular expression below and capture its match into backreference number 1 «(.*?)»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “/” literally «/»
Match the regular expression below and capture its match into backreference number 2 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 3 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 4 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “.” literally «\.»
Match the regular expression below and capture its match into backreference number 5 «(\w+)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

Need regular expression to validate username

Need a regular expression to validate username which:
should allow trailing spaces but not spaces in between characters
must contain at least one letter,may contain letters and numbers
7-15 characters max(alphanumeric)
cannot contain special characters
underscore is allowed
Not sure how to do this. Any help is appreciated. Thank you.
This is what I was using but it allows space between characters
"(?=.*[a-zA-Z])[a-zA-Z0-9_]{1}[_a-zA-Z0-9\\s]{6,14}"
Example: user name
No spaces are allowed in user name
Try this:
foundMatch = Regex.IsMatch(subjectString, #"^(?=.*[a-z])\w{7,15}\s*$", RegexOptions.IgnoreCase);
Is also allows the use of _ since you allowed this in your attempt.
So basically I use three rules. One to check if at least one letter exists.
Another to check if the string consists only of alphas plus the _ and finally I accept trailing spaces and at least 7 with a max of 15 alpha's. You are in a good track. Keep it up and you will be answering questions here too :)
Breakdown:
"
^ # Assert position at the beginning of the string
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
. # Match any single character that is not a line break character
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
[a-z] # Match a single character in the range between “a” and “z”
)
\w # Match a single character that is a “word character” (letters, digits, etc.)
{7,15} # Between 7 and 15 times, as many times as possible, giving back as needed (greedy)
\s # Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ # Assert position at the end of the string (or before the line break at the end of the string, if any)
"