How to get the content of parentheses but not the parentheses themselves - regex

I have text like 1323-DI-004 (2013-07-16).pdf and I want to extract the date inside the parentheses. I tried the regex (\(.*)\), which gives (2013-07-16). I want the same result but without the parentheses.
This is for a VBA code.
Is it possible and how to do it?

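Edit: you're using VBA, so use a capture group and read it from SubMatches:
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim myRegExp As RegExp
Dim myMatches As MatchCollection
Set myRegExp = New RegExp
myRegExp.Pattern = "\((.*)\)"
Set myMatches = myRegExp.Execute(subjectString)
' Both collections are zero-based: myMatches(0) is the first match, SubMatches(0) its first capture group
If myMatches.Count > 0 Then MsgBox myMatches(0).SubMatches(0)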
On an engine that supports lookbehind (PCRE as used by PHP, Perl, etc.; not VBScript's RegExp, and not JavaScript engines without lookbehind support), use lookarounds to achieve this, requiring the ( and ) on either side without including them in the match:
(?<=\()(.*)(?=\))
See it in action: http://regex101.com/r/oI3gD6
If you're using JavaScript and lookbehind is not available, this won't work; however, you can use \((.*)\) and retrieve the first capture group, which will be what's inside the ().

Related

RegEx specific numeric pattern in Excel VBS

I do not have much RegEx experience and need advice to create a specific Pattern in Excel VBA.
The Pattern I want to match on to validate a Userform field is: nnnnnn.nnn.nn where n is a 0-9 digit.
My code looks like this, but RegEx.Test always returns False.
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
With RegEx
.Pattern = "/d/d/d/d/d/d\./d/d/d\./d/d"
End With
If RegEx.Test(txtProjectNumber.Value) = False Then
txtProjectNumber.SetFocus
bolAllDataOK = False
End If
Try this. You need to match the whole contents of the textbox (I assume) so use anchors (^ and $).
Your slashes were the wrong way round. Also you can use quantifiers to simplify the pattern.
Private Sub CommandButton1_Click()
Dim RegEx As Object, bolAllDataOK As Boolean
Set RegEx = CreateObject("VBScript.RegExp")
With RegEx
.Pattern = "^\d{6}\.\d{3}\.\d{2}$"
End With
If Not RegEx.Test(txtProjectNumber.Value) Then
txtProjectNumber.SetFocus
bolAllDataOK = False
End If
End Sub
VBA has its own built-in alternative, the Like operator. So besides the fact that you used forward slashes instead of backslashes (as @SJR rightly mentioned), you should have a look at something like:
If txtProjectNumber.Value Like "######.###.##" Then
Where # stands for any single digit (0–9). Though not as versatile as regular expressions, it seems to do the trick for you, and you won't need any external reference or extra object.

VBA to VB.NET - Regex - System.Text.RegularExpressions - with no global modifier

I am trying to migrate a lib of regular expressions (utilities) from VBA to VB.NET, as (my general impression is that) it offers more support to obtain "clean" and re-usable code (including Regex support).
The library is a factory that reuses compiled regexes (for performance; I am not sure to what extent the RegexOptions.Compiled option helps). It is used together with a library that holds records of patterns (utilities) and returns an object which, besides the pattern, also includes the modifiers (as properties).
However, the RegEx object of System.Text.RegularExpressions does not have a clean system to specify flags / modifiers...
' VBA
Dim oRegExp As New RegExp
With oRegExp
.Pattern = Pattern
.IgnoreCase = IgnoreCase
.Multiline = Multiline
.Global = MatchGlobal
End With
Versus
' VB.NET
Dim opts As RegexOptions = New RegexOptions
If IgnoreCase Then opts = opts Or RegexOptions.IgnoreCase
If Multiline Then opts = opts Or RegexOptions.Multiline
Dim oRegExp As RegEx
oRegExp = New RegEx(Pattern, opts)
' Where can I specify MatchGlobal?
As I do not see this as an improvement to this part of the code, I will rely on inline modifiers instead (embedded directly in the pattern itself), and get rid of the pattern library object that carries the modifiers as properties (not included in the examples).
That way...
' This -> "\bpre([^\r\n]+)\b"
' in .NET, can be this -> "\bpre(?<word>\w*)\b"
' as .NET supports named groups
Dim Pattern as String = "(?i)\bpre(?<word>\w*)\b" ' case insensitive
The only problem is that, as shown in the VB.NET example above, the Regex class in the System.Text.RegularExpressions namespace does not seem to let you change the global match modifier (and inline modifiers, logically, do not include a global match flag).
Any idea on how to deal with it?
There is no support for a global regex option as this behavior is implemented via two different methods.
To only get the first (one) match use Regex.Match:
Searches the specified input string for the first occurrence of the regular expression specified in the Regex constructor.
To match all occurrences, use Regex.Matches:
Searches an input string for all occurrences of a regular expression and returns all the matches.
You need to implement the logic yourself: if all matches are expected, call Regex.Matches; if only one, call Regex.Match.
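That branching could be wrapped in a small helper so the rest of the library keeps a single entry point. This is only a minimal sketch: FindMatches and its matchGlobal parameter are hypothetical names standing in for the VBA Global property, not part of System.Text.RegularExpressions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexGlobalSketch
    ' Hypothetical helper: emulates the VBA .Global flag by choosing between
    ' Regex.Match (first match only) and Regex.Matches (all matches).
    Function FindMatches(rx As Regex, input As String, matchGlobal As Boolean) As List(Of Match)
        Dim result As New List(Of Match)
        If matchGlobal Then
            For Each m As Match In rx.Matches(input)
                result.Add(m)
            Next
        Else
            Dim first As Match = rx.Match(input)
            ' Match always returns an object; Success tells whether anything was found
            If first.Success Then result.Add(first)
        End If
        Return result
    End Function

    Sub Main()
        ' Pattern taken from the question, with the inline (?i) modifier
        Dim rx As New Regex("(?i)\bpre(?<word>\w*)\b")
        For Each m As Match In FindMatches(rx, "Prefix and preview", True)
            Console.WriteLine(m.Groups("word").Value)
        Next
    End Sub
End Module
```

Passing False for matchGlobal returns a list with at most one element, so callers can treat both cases the same way.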

Excluding Portion Of RegEx From Results

I have a very large text file that has multiple instances of "CLM*[NUMBER I WANT]*". I have been able to use regex to mostly obtain this thanks to another user on this site, but the results I'm getting are displaying the CLM* portion, when I really just want the number. You can see the relevant code below.
Dim strClaimData As String = ""
Dim strClaimNumber As String = ClaimLoadedGetCLM(strClaimData)
Public Function ClaimLoadedGetCLM(ByVal ediString As String) As String
Dim regex As New Regex("CLM\*(\d*?\*??\d*)")
Dim ClaimMatches As MatchCollection = regex.Matches(strClaimData)
For Each strClaimData As Match In ClaimMatches
lstClaimLoaded837Data.Items.Add(strClaimData.Value)
Next
End Function
I've tried a few things I've found online, such as appending a \K or \2, but I just get compile errors if I do that.
https://regex101.com/r/jH9eJ7/1
That shows what I want as "Match 1, Group 1", but I can't figure out how to get to it. I thought appending /1 would work, but that only returned CLM* with no number.
Any help would be greatly appreciated.
What you want to do is wrap the CLM\* part in a positive lookbehind assertion:
(?<=CLM\*)
This asserts that (\d*\.?\d*) is preceded by CLM*, but doesn't include CLM* in the match.
https://regex101.com/r/jH9eJ7/3
You can tell it to use the captured group.
Replace:
strClaimData.Value
with:
strClaimData.Groups(1).Value
in your for loop.
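The two suggestions above (capture group and lookbehind) can be compared side by side. The sketch below is an illustration under assumptions: the sample string is made up, the function names are hypothetical, and the number pattern is simplified to \d+ rather than the pattern from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ClaimNumberSketch
    ' Option 1: capture group. Groups(0) is the whole match ("CLM*12345"),
    ' Groups(1) is only what the first parenthesized group matched.
    Function ClaimNumbersByGroup(ediString As String) As List(Of String)
        Dim numbers As New List(Of String)
        For Each m As Match In Regex.Matches(ediString, "CLM\*(\d+)")
            numbers.Add(m.Groups(1).Value)
        Next
        Return numbers
    End Function

    ' Option 2: lookbehind. "CLM*" must precede the digits but is not part
    ' of the match, so Match.Value is already the bare number.
    Function ClaimNumbersByLookbehind(ediString As String) As List(Of String)
        Dim numbers As New List(Of String)
        For Each m As Match In Regex.Matches(ediString, "(?<=CLM\*)\d+")
            numbers.Add(m.Value)
        Next
        Return numbers
    End Function

    Sub Main()
        ' Made-up sample; real claim data will look different
        Dim sample As String = "CLM*12345*100~CLM*67890*250~"
        Console.WriteLine(String.Join(",", ClaimNumbersByGroup(sample)))
        Console.WriteLine(String.Join(",", ClaimNumbersByLookbehind(sample)))
    End Sub
End Module
```

Both functions return the same numbers for this sample; the capture-group version also works on engines that lack lookbehind.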

RegEx pattern to extract URLs

I have to extract everything between these characters:
<a href="/url?q=(text to extract whatever it is)&amp
I tried this pattern, but it's not working for me:
/(?<=url\?q=).*?(?=&amp)/
I'm programming in VB.NET. This is the code, but I think the problem is that the pattern is wrong:
Dim matches As MatchCollection
matches = regex.Matches(TextBox1.Text)
For Each Match As Match In matches
listbox1.items.add(Match.Value)
Next
Could you help me please?
Your regex seems to be correct except for the slashes (/) at the beginning and end of the expression. Remove them:
Dim regex = New Regex("(?<=url\?q=).*?(?=&amp)")
and it should work.
Some utilities and languages (sed, Perl and JavaScript, for example) use / (forward slash) to delimit the search expression; others may use single quotes. With System.Text.RegularExpressions.Regex the pattern is an ordinary string, so you don't need delimiters.
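To check the corrected pattern outside the form, a console sketch like the following may help. The sample markup is made up for illustration, and the loop mirrors the one in the question with Console.WriteLine in place of the list box.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UrlQuerySketch
    Sub Main()
        ' Made-up sample markup; quotes are doubled inside a VB string literal
        Dim html As String = "<a href=""/url?q=http://example.com/page&amp;sa=U"">link</a>"

        ' No leading or trailing slash: the pattern is just a string
        Dim rx As New Regex("(?<=url\?q=).*?(?=&amp)")

        For Each m As Match In rx.Matches(html)
            Console.WriteLine(m.Value) ' prints http://example.com/page
        Next
    End Sub
End Module
```

The lazy .*? stops at the first &amp after each url?q=, so several links in the same text produce separate matches.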
The regex below will extract all URLs from your text (or any other):
(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,#?^=%&:/~\+#]*[\w\-\#?^=%&/~\+#])?

Need to extract text from within first curly brackets

I have strings that look like this
{/CSDC} CHOC SHELL DIP COLOR {17}
I need to extract the value in the first pair of curly brackets. In the above example it would be
/CSDC
So far I have this code, which is not working:
Dim matchCode = Regex.Matches(txtItems.Text, "/\{(.+?)\}/")
Dim itemCode As String
If matchCode.Count > 0 Then
itemCode = matchCode(0).Value
End If
I think the main issue here is that you are confusing your regular expression syntax between different languages.
In languages like Javascript, Perl, Ruby and others, you create a regular expression object by using the /regex/ notation.
In .NET, when you instantiate a Regex object, you pass it a string of the regular expression, which is delimited by quotes, not slashes. So it is of the form "regex".
So try removing the leading and trailing / from your string and see how you go.
This may not be the whole problem, but it is at least part of it.
Are you getting the whole string instead of just the first value? Quantifiers are greedy by default, so .NET tries to grab the longest matching string.
Try this:
Dim matchCode = Regex.Matches(txtItems.Text, "\{([^}]*)\}")
Dim itemCode As String = ""
If matchCode.Count > 0 Then
' Groups(0) is the whole match including the braces; Groups(1) is the text inside them
itemCode = matchCode(0).Groups(1).Value
End If
Edited: I've tried this in Linqpad and it worked.
It appears you are using a capture group, so try matchCode(0).Groups(1).Value (Groups(0) is the whole match, braces included).
Also, remove the leading / and the trailing / from the pattern.
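Putting the suggestions from these answers together (no slash delimiters, a negated character class instead of a greedy dot, and group 1 rather than group 0), a minimal sketch could look like this. FirstBracedValue is a hypothetical helper name, and it uses Regex.Match because only the first pair of brackets is wanted.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstBracesSketch
    ' Hypothetical helper: returns the text inside the first {...} pair,
    ' or an empty string when the input contains no such pair.
    Function FirstBracedValue(input As String) As String
        Dim m As Match = Regex.Match(input, "\{([^}]*)\}")
        If m.Success Then Return m.Groups(1).Value
        Return ""
    End Function

    Sub Main()
        ' Sample string from the question
        Console.WriteLine(FirstBracedValue("{/CSDC} CHOC SHELL DIP COLOR {17}")) ' prints /CSDC
    End Sub
End Module
```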