How do I combine a regular expression function in vlookup?

I have a VBA regular expression function that I would like to combine with VLOOKUP. However, when it is used inside VLOOKUP, the result is not the value extracted by the regular expression.
This is what it returns when I execute the function on its own:
=udfRegEx(A2,B2)
String: Microsoft Windows Server 2003, Standard Edition (64-bit)
Regular expression: ^([^,]*)
Result: Microsoft Windows Server 2003
However, the formula =IFERROR(VLOOKUP(udfRegEx(A2,RegularExpression!B2),[Sample.xls]Sheet1!$B$2:$E$4177,4,FALSE),0) still returns Microsoft Windows Server 2003, Standard Edition (64-bit)
Cell B2 on the RegularExpression sheet holds the regular expression ^([^,]*)

Try using:
=IFERROR(udfRegEx(VLOOKUP(udfRegEx(A2,RegularExpression!B2),[Sample.xls]Sheet1!$B$2:$E$4177,4,FALSE),RegularExpression!B2),0)
A shot in the dark.

I had to do this for my own use, so I made an Excel add-in. Here is the GitHub address:
https://github.com/BlueTrin/BlueXL
I can host a compiled version if you need one. The add-in provides a function called BXLookup that supports regex and lets you choose both the column to look up in and the columns to return.
I made a binary for you:
https://bintray.com/bluetrin/BlueXL/BlueXL/0.1.0/view?sort=&order=#
Of course this does not work if you want only to use VBA, but if you do not mind using an addin, there is an example in the spreadsheet on GitHub.
Please could you clarify what you have in: [Sample.xls]Sheet1!$B$2:$E$4177

Office 365 introduced a new function, XLOOKUP, which (finally) does the job you were looking for. It is explained here: https://www.excelcampus.com/functions/xlookup-explained/

You don't need a regular expression to remove everything after the first comma. The following function does the same:
MID(A1,1,SEARCH(",",A1)-1)
That said, the following works, at least with Office 365 (not tested on an earlier version):
Public Function RegExpGroup(R As String, S As String, IMatch As Integer, IGroup As Integer) As Variant
    Dim RegExp As Object, Matches As Object, SubMatches As Object
    Set RegExp = CreateObject("VBScript.RegExp")
    With RegExp
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = R
    End With
    Set Matches = RegExp.Execute(S)
    If Matches.Count >= IMatch Then
        Set SubMatches = Matches.Item(IMatch - 1).SubMatches
        If SubMatches.Count >= IGroup Then
            RegExpGroup = SubMatches.Item(IGroup - 1)
        Else
            RegExpGroup = CVErr(xlErrValue)
        End If
    Else
        RegExpGroup = CVErr(xlErrValue)
    End If
End Function
Now, with the input string in A1, the pattern in A2, and the lookup table in C1:D2, enter these formulas in A4 and A5:
=VLOOKUP(RegExpGroup(A2,A1,1,1),C1:D2,2,FALSE)
=IFERROR(VLOOKUP(RegExpGroup(A2,A1,1,1),C1:D2,2,FALSE),"Not found")
You get the expected result.
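The extract-then-look-up idea in this answer can be sketched outside Excel. The Python below is only an illustration under stated assumptions: `regex_group` and `lookup` are hypothetical helper names, and the dictionary stands in for the worksheet range C1:D2, whose real contents are not shown in the answer. Unlike VLOOKUP with an exact match, the dictionary lookup here is case-sensitive.

```python
import re

def regex_group(pattern, text, match_index=1, group_index=1):
    """Return capture group `group_index` of match `match_index` (both 1-based), or None."""
    matches = list(re.finditer(pattern, text, flags=re.MULTILINE))
    if len(matches) < match_index:
        return None  # fewer matches than requested
    m = matches[match_index - 1]
    if m.re.groups < group_index:
        return None  # the pattern has fewer groups than requested
    return m.group(group_index)

def lookup(key, table, default="Not found"):
    """Exact-match lookup with a fallback, similar in spirit to IFERROR(VLOOKUP(...))."""
    return table.get(key, default)

# Hypothetical lookup table standing in for the worksheet range C1:D2.
table = {"Microsoft Windows Server 2003": "Server"}

os_name = "Microsoft Windows Server 2003, Standard Edition (64-bit)"
key = regex_group(r"^([^,]*)", os_name)
print(key)                 # Microsoft Windows Server 2003
print(lookup(key, table))  # Server
```

The point of the sketch is the order of operations: the regular expression is applied to the lookup key before the table is searched, so the table must contain the shortened value, not the full string.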

Related

Why does Find/Replace zRngResult.Find work fine, but RegEx myRegExp.Execute(zRngResult) mess up the range.Start?

I want to select certain words, e.g. “not”, “never”, “don’t”, in the sentences of a Word document and add a comment after each one with VBA. Find/Replace with wildcards works fine, but “Use wildcards” cannot be combined with “Match case”. RegEx supports IgnoreCase = True, but the word selection is not reliable when a sentence gets more than one comment. Range.Start seems to be modified in a way that I cannot understand.
A similar question was asked in June 2010. https://social.msdn.microsoft.com/Forums/office/en-US/f73ca32d-0af9-47cf-81fe-ce93b13ebc4d/regex-selecting-a-match-within-the-document?forum=worddev
Is there a new/different way of solving this problem?
Any suggestion will be appreciated.
The code using RegEx follows:
Function zRegExCommentor(zPhrase As String, tComment As String) As Long
    Dim sTheseSentences As Sentences
    Dim rThisSentenceToSearch As Word.Range, rThisSentenceResult As Word.Range
    Dim myRegExp As RegExp
    Dim myMatches As MatchCollection
    Options.CommentsColor = wdByAuthor
    Set myRegExp = New RegExp
    With myRegExp
        .IgnoreCase = True
        .Global = False
        .Pattern = zPhrase
    End With
    Set sTheseSentences = ActiveDocument.Sentences
    For Each rThisSentenceToSearch In sTheseSentences
        Set rThisSentenceResult = rThisSentenceToSearch.Duplicate
        rThisSentenceResult.Select
        Do
            DoEvents
            Set myMatches = myRegExp.Execute(rThisSentenceResult)
            If myMatches.Count > 0 Then
                rThisSentenceResult.Start = rThisSentenceResult.Start + myMatches(0).FirstIndex
                rThisSentenceResult.End = rThisSentenceResult.Start + myMatches(0).Length
                rThisSentenceResult.Select
                Selection.Comments.Add Range:=Selection.Range
                Selection.TypeText Text:=tComment & "{" & zPhrase & "}"
                rThisSentenceResult.Start = rThisSentenceResult.Start + 1 'so as not to find the same phrase again and again
                rThisSentenceResult.End = rThisSentenceToSearch.End
                rThisSentenceResult.Select
            End If 'If myMatches.Count > 0 Then
        Loop While myMatches.Count > 0
    Next 'For Each rThisSentenceToSearch In sTheseSentences
End Function

Relying on Range.Start or Range.End for position in a Word document is not reliable due to how Word stores non-printing information in the text flow. For some kinds of things you can work around it using Range.TextRetrievalMode, but the non-printing characters inserted by Comments aren't affected by these settings.
I must admit I don't understand why Word's built-in Find with wildcards won't work for you - no case matching shouldn't be a problem. For instance, based on the example: "Never has there been, never, NEVER, a total drought.":
FindText:="[nN][eE][vV][eE][rR]"
This will find all instances of n-e-v-e-r regardless of capitalization. Each pair of brackets defines a set of allowed characters, here the lower- and upper-case form of one letter of the search term. Do not separate the letters with commas: a comma inside the brackets would itself be treated as an allowed character.
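Writing such a bracket pattern by hand gets tedious for longer words, so it can be generated. The following Python sketch builds the pattern string for a given word; `word_wildcard_any_case` is a hypothetical helper name, and the sketch assumes plain ASCII letters and does not escape characters that have a special meaning in Word wildcard searches. It places both letter forms directly next to each other inside the brackets, with no comma between them.

```python
def word_wildcard_any_case(word):
    """Build a Word wildcard pattern that matches `word` in any capitalization."""
    parts = []
    for ch in word:
        if ch.isalpha() and ch.lower() != ch.upper():
            # One bracket set per letter: lower-case and upper-case form.
            parts.append("[" + ch.lower() + ch.upper() + "]")
        else:
            # Non-letters are copied unchanged (no escaping is attempted here).
            parts.append(ch)
    return "".join(parts)

print(word_wildcard_any_case("never"))  # [nN][eE][vV][eE][rR]
```

The resulting string could then be passed as FindText with wildcards enabled.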
The workarounds described in my MSDN post you link to are pretty much all you can do if you insist on RegEx:
Using the Office Open XML (or possibly Word 2003 XML) file format will let you use RegEx and standard XML processing tools to find the information, add comment "tags" into the Word XML, close it all up... And when the user sees the document it will all be there.
If you need to be doing this in the Word UI, a slightly different approach should work (assuming you're targeting Word 2003 or later): Work through the document on a range-by-range basis (by paragraph, perhaps). Read the XML representation of the text into memory using the Range.WordOpenXML property, perform the RegEx search, add comments as WordOpenXML, then write the WordOpenXML back into the document using the InsertXML method, replacing the original range (paragraph). Since you'd be working with the Paragraph object, Range.Start won't be a factor.

How do you use RegEx to return a parsed value?

I have a data column that has a heading value with multiple levels, where I only want the first three levels, but I cannot figure out how to get the parsed value?
I was reading this and it shows how to create a function that returns a Boolean for the condition, but how would I create a function that returns a parsed value?
This is the Regular Expression that I think I need.
^(\d.\d.\d)
I'm looking for something that would change 1.2.3.4.5. to 1.2.3 and similar for any other header I have that has more than three levels.
Ideally, I'd like to be able to put it into my Query Design as a Field Expression, but I'm not sure how I would do that.

I assumed your input values could have more than one digit between the dots. In other words, I think you want this ...
? RegExpGetMatch("1.2.3.4.5.", "^(\d+\.\d+\.\d+).*", 1)
1.2.3
? RegExpGetMatch("1.27.3.4.5.", "^(\d+\.\d+\.\d+).*", 1)
1.27.3
If that is the correct behavior, here is the function I used.
Public Function RegExpGetMatch(ByVal pSource As String, _
        ByVal pPattern As String, _
        ByVal pGroup As Long) As String

    'requires reference to Microsoft VBScript Regular Expressions
    'Dim re As RegExp
    'Set re = New RegExp

    'late binding; no reference needed
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Global = True
    re.Pattern = pPattern
    RegExpGetMatch = re.Replace(pSource, "$" & pGroup)
    Set re = Nothing
End Function
See also this answer by KazJaw. His answer taught me how to select the match group with RegExp.Replace.
In a query run within an Access session, you could use the function like this:
SELECT
RegExpGetMatch([Data Column], "^(\d+\.\d+\.\d+).*", 1) AS parsed_value
FROM YourTable;
Note however a custom VBA function is not usable for queries run from outside an Access session.
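One property of the replace-based approach is worth knowing: when the pattern does not match at all, a regex replace leaves the input unchanged, so the caller gets the original text back rather than an empty value. The Python sketch below illustrates the same technique with `re.sub` and contrasts it with a match-based variant; `first_three_levels` is a hypothetical name, and Python's `re` module is only a stand-in for VBScript.RegExp here.

```python
import re

PATTERN = r"^(\d+\.\d+\.\d+).*"

def first_three_levels(heading):
    """Match-based variant: return group 1, or an empty string when nothing matches."""
    m = re.match(PATTERN, heading)
    return m.group(1) if m else ""

# Replace-based extraction: the whole match is replaced by capture group 1.
print(re.sub(PATTERN, r"\1", "1.2.3.4.5."))   # 1.2.3
print(re.sub(PATTERN, r"\1", "1.27.3.4.5."))  # 1.27.3

# No match: the replace-based form returns the input unchanged.
print(re.sub(PATTERN, r"\1", "Intro"))        # Intro
print(repr(first_three_levels("Intro")))      # ''
```

Which behavior is preferable depends on the data: keeping the original heading is convenient for headings with fewer than three levels, while an empty result makes non-matching rows easy to filter out.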

Try changing your RegEx to ^(\d\.\d\.\d). You need to escape the . since it has a special meaning in RegExp.
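A quick way to see why the escape matters is to run both patterns against an input that has other characters where the dots should be. This is a small Python sketch (Python's `re` treats an unescaped dot the same way, as "any character except a newline"); the sample strings are made up for illustration.

```python
import re

unescaped = re.compile(r"^(\d.\d.\d)")   # "." matches any character
escaped = re.compile(r"^(\d\.\d\.\d)")   # "\." matches a literal dot only

print(bool(unescaped.match("1x2y3")))        # True: the dots matched "x" and "y"
print(bool(escaped.match("1x2y3")))          # False
print(escaped.match("1.2.3.4.5.").group(1))  # 1.2.3
```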

Regular expression replace body content

I tried the following to replace all the text content in the current open document with numeric zero, but it doesn't work
Set objWdDoc = Word.Application.ActiveDocument
Set objWdRange = objWdDoc.Content
Dim re As New RegExp
re.Global = True
re.Pattern = "[a-z]"
re.IgnoreCase = True
objWdRange = re.Replace(objWdRange, "0")
Can anyone suggest a working method?

Assuming you have referenced Microsoft VBScript Regular Expressions,
objWdRange.Text = re.Replace(objWdRange.Text, "0")
will work, although you will of course lose any formatting.
You can also use the built-in search/replace which has limited support to find digits/characters. Record a macro of yourself doing this and you can examine the code.
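The replacement itself is a plain global, case-insensitive substitution. As a sanity check of what the pattern does to a piece of text, here is a minimal Python sketch; `mask_letters` is a hypothetical name and the sample sentence is made up. It operates on a plain string, which is also why the Word version loses formatting: only the text goes through the regex.

```python
import re

def mask_letters(text):
    """Replace every letter a-z (in either case) with the digit zero."""
    return re.sub(r"[a-z]", "0", text, flags=re.IGNORECASE)

print(mask_letters("Hello, World 42!"))  # 00000, 00000 42!
```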

Microsoft office Access `LIKE` VS `RegEx`

I have been having trouble with the Access keyword LIKE and its use. I want to use the following RegEx (regular expression) in a query as a sort of "verification rule", where the LIKE operator filters my results:
"^[0]{1}[0-9]{8,9}$"
How can this be accomplished?

I know you were not asking about VBA, but maybe you will give it a chance.
Open the VBA project, insert a new module, then pick Tools -> References and add a reference to Microsoft VBScript Regular Expressions 5.5. Then paste the code below into the newly inserted module.
Function my_regexp(ByRef sIn As String, ByVal mypattern As String) As String
    Dim r As New RegExp
    Dim colMatches As MatchCollection
    With r
        .Pattern = mypattern
        .IgnoreCase = True
        .Global = False
        .MultiLine = False
        Set colMatches = .Execute(sIn)
    End With
    If colMatches.Count > 0 Then
        my_regexp = colMatches(0).Value
    Else
        my_regexp = ""
    End If
End Function
Now you can use the function above in your SQL queries. Your question would then be solved by invoking
SELECT my_regexp(some_variable, "^[0]{1}[0-9]{8,9}$") FROM some_table
It returns an empty string if nothing is matched.
Hope you liked it.

I don't think Access allows regex matches (except in VBA, but that's not what you're asking). The LIKE operator doesn't even support alternation.
Therefore you need to split it up into two expressions.
... WHERE (Blah LIKE "0#########") OR (Blah LIKE "0########")
(# means "a single digit" in Access).
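The claim that the two LIKE patterns together cover the same values as the regular expression can be checked mechanically. The Python sketch below compares the regex with a hand-written equivalent of the LIKE pair (a leading zero followed by eight or nine digits, so nine or ten characters in total); the function names and sample values are made up for illustration, and the LIKE behavior is modeled in Python rather than run in Access.

```python
import re

# Same pattern as in the question; fullmatch() plays the role of ^ and $.
PATTERN = re.compile(r"[0]{1}[0-9]{8,9}")

def matches_regex(value):
    return PATTERN.fullmatch(value) is not None

def matches_like_pair(value):
    """Model of the two LIKE tests: '0' plus nine digits OR '0' plus eight digits."""
    return (
        len(value) in (9, 10)
        and value[0] == "0"
        and all(ch in "0123456789" for ch in value)
    )

samples = ["012345678", "0123456789", "01234567", "01234567890",
           "112345678", "01234x678", ""]
for s in samples:
    print(repr(s), matches_regex(s), matches_like_pair(s))
```

Both functions agree on every sample, which is what the split into two LIKE expressions relies on.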

What's Regular Expression for update Assembly build number in AssemblyInfo.cs file?

I'm writing a VS 2008 macro to replace the assembly version in the AssemblyInfo.cs file. According to MSDN, the assembly version must be written using the following pattern.
major.minor[.build[.revision]]
Example
1.0
1.0.1234
1.0.1234.0
I need to dynamically generate the build number for the AssemblyInfo.cs file and use a regular expression to replace the old build number with the newly generated one.
Do you have a regular expression that solves this? The build number must not be replaced inside a commented-out statement like the ones below. Finally, don't forget to check your regex against inline comments.
Don't replace any commented-out build number:
//[assembly: AssemblyVersion("0.1.0.0")]
/*[assembly: AssemblyVersion("0.1.0.0")]*/
/*
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
[assembly: ComVisible(false)]
[assembly: AssemblyVersion("0.1.0.0")]
*/
Replace build numbers that are not commented out:
[assembly: AssemblyVersion("0.1.0.0")] // inline comment
/* inline comment */ [assembly: AssemblyVersion("0.1.0.0")]
[assembly: /*inline comment*/AssemblyVersion("0.1.0.0")]
Hint.
Please try your regex at Online Regular Expression Testing Tool

This is somewhat crude, but you could do the following.
Search for:
^{\[assembly\: :w\(\"0\.1\.}\*
Replace with:
\1####
Where #### is your replacement string.
This regex works as follows:
1. It starts by searching for lines beginning with \[assembly\: (^ indicates the beginning of a line, backslashes escape special characters), followed by...
2. ...some alphabetic identifier :w, followed by...
3. ...an opening parenthesis \(, followed by...
4. ...the beginning of the version string, in quotes \"0\.1\., finally followed by...
5. ...an asterisk \*.
Steps 1-4 are captured as the first tagged expression by the curly braces { } surrounding them.
The replacement string reinserts the tagged expression verbatim with \1, so that it is not harmed, followed by your replacement string, here ####.
Commented lines are ignored as they do not start with [assembly: . Subsequent inline comments are left untouched as they are not captured by the regex. Note that a line inside a multi-line /* ... */ block that itself begins with [assembly: would still match.
If this isn't exactly what you need, it's fairly straightforward to experiment with the regex to capture and/or replace different parts of the line.

I doubt using regular expressions will do you much good here. While it could be possible to formulate an expression that matches "uncommented" assembly version attributes it will be hard to maintain and understand.
You are making it very very hard on yourself with the syntax that you present. What about enforcing a coding standard on your AssemblyInfo.cs file that says that lines should always be commented out with a beginning // and forbid inline comments? Then it should be easy enough to parse it using a StreamReader.
If you can't do that then there's only one parser that's guaranteed to handle all of your edge cases and that's the C# compiler. How about just compiling your assembly and then reflecting on it to detect the version number?
var asm = Assembly.LoadFile("foo.dll");
var version = asm.GetName().Version;
If you're simply interested in incrementing your build number you should have a look at this question: Can I automatically increment the file build version when using Visual Studio?

You can achieve the same effect much more easily by downloading and installing the MSBuild Extension Pack and adding the following line at the top of your .csproj file:
<Import Project="$(MSBuildExtensionsPath)\ExtensionPack\MSBuild.ExtensionPack.VersionNumber.targets"/>
This will automatically use the current date (MMdd) as the build number and increment the revision number for you. To override the major and minor versions, which are set to 1.0 by default, add the following anywhere in the .csproj file:
<PropertyGroup>
<AssemblyMajorVersion>2</AssemblyMajorVersion>
<AssemblyFileMajorVersion>1</AssemblyFileMajorVersion>
</PropertyGroup>
You can further customize how build number and revision are generated, and even set company, copyright etc. by setting other properties, see this page for the list of properties.

I just found an answer to my question, but it is a very complicated and very long regex. I only run it once per solution, so it doesn't affect overall performance. Please look at my complete source code.
Module EnvironmentEvents.vb
Public Module EnvironmentEvents
    Private Sub BuildEvents_OnBuildBegin(ByVal Scope As EnvDTE.vsBuildScope, ByVal Action As EnvDTE.vsBuildAction) Handles BuildEvents.OnBuildBegin
        If DTE.Solution.FullName.EndsWith(Path.DirectorySeparatorChar & "[Solution File Name]") Then
            If Scope = vsBuildScope.vsBuildScopeSolution And Action = vsBuildAction.vsBuildActionRebuildAll Then
                AutoGenerateBuildNumber()
            End If
        End If
    End Sub
End Module
Module AssemblyInfoHelp.vb
Imports System.IO
Imports System.Text.RegularExpressions

Public Module AssemblyInfoHelper
    ReadOnly AssemblyInfoPath As String = Path.Combine("Common", "GlobalAssemblyInfo.cs")

    Sub AutoGenerateBuildNumber()
        'Declare required variables
        Dim solutionPath As String = Path.GetDirectoryName(DTE.Solution.Properties.Item("Path").Value)
        Dim globalAssemblyPath As String = Path.Combine(solutionPath, AssemblyInfoPath)
        Dim globalAssemblyContent As String = ReadFileContent(globalAssemblyPath)
        'The dots between version parts are escaped so that they match a literal "." only
        Dim rVersionAttribute As Regex = New Regex("\[[\s]*(\/\*[\s\S]*?\*\/)?[\s]*assembly[\s]*(\/\*[\s\S]*?\*\/)?[\s]*:[\s]*(\/\*[\s\S]*?\*\/)?[\s]*AssemblyVersion[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\([\s]*(\/\*[\s\S]*?\*\/)?[\s]*\""([0-9]+)\.([0-9]+)(\.([0-9]+))?(\.([0-9]+))?\""[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\)[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\]")
        Dim rVersionInfoAttribute As Regex = New Regex("\[[\s]*(\/\*[\s\S]*?\*\/)?[\s]*assembly[\s]*(\/\*[\s\S]*?\*\/)?[\s]*:[\s]*(\/\*[\s\S]*?\*\/)?[\s]*AssemblyInformationalVersion[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\([\s]*(\/\*[\s\S]*?\*\/)?[\s]*\""([0-9]+)\.([0-9]+)(\.([0-9]+))?[\s]*([^\s]*)[\s]*(\([\s]*Build[\s]*([0-9]+)[\s]*\))?\""[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\)[\s]*(\/\*[\s\S]*?\*\/)?[\s]*\]")

        'Find Version Attribute for Updating Build Number
        Dim mVersionAttributes As MatchCollection = rVersionAttribute.Matches(globalAssemblyContent)
        Dim mVersionAttribute As Match = GetFirstUnCommentedMatch(mVersionAttributes, globalAssemblyContent)
        Dim newBuildNumber As String = Nothing

        'Replace Version Attribute for Updating Build Number
        '(GetFirstUnCommentedMatch returns Nothing when every match is commented out)
        If mVersionAttribute IsNot Nothing AndAlso mVersionAttribute.Groups(9).Success Then
            Dim gBuildNumber As Group = mVersionAttribute.Groups(9)
            newBuildNumber = GenerateBuildNumber(gBuildNumber.Value)
            globalAssemblyContent = globalAssemblyContent.Substring(0, gBuildNumber.Index) + newBuildNumber + globalAssemblyContent.Substring(gBuildNumber.Index + gBuildNumber.Length)
        End If

        'Find Version Info Attribute for Updating Build Number
        Dim mVersionInfoAttributes As MatchCollection = rVersionInfoAttribute.Matches(globalAssemblyContent)
        Dim mVersionInfoAttribute As Match = GetFirstUnCommentedMatch(mVersionInfoAttributes, globalAssemblyContent)

        'Replace Version Info Attribute for Updating Build Number
        If mVersionInfoAttribute IsNot Nothing AndAlso mVersionInfoAttribute.Groups(12).Success Then
            Dim gBuildNumber2 As Group = mVersionInfoAttribute.Groups(12)
            If String.IsNullOrEmpty(newBuildNumber) Then
                newBuildNumber = GenerateBuildNumber(gBuildNumber2.Value)
            End If
            globalAssemblyContent = globalAssemblyContent.Substring(0, gBuildNumber2.Index) + newBuildNumber + globalAssemblyContent.Substring(gBuildNumber2.Index + gBuildNumber2.Length)
        End If

        WriteFileContent(globalAssemblyPath, globalAssemblyContent)
    End Sub

    Function GenerateBuildNumber(Optional ByVal oldBuildNumber As String = "0") As String
        'Parse as Integer (Int16 would overflow above 32767) and return the incremented value as text
        Return (Integer.Parse(oldBuildNumber) + 1).ToString()
    End Function
    Private Function GetFirstUnCommentedMatch(ByRef mc As MatchCollection, ByVal content As String) As Match
        Dim rSingleLineComment As Regex = New Regex("\/\/.*$")
        Dim rMultiLineComment As Regex = New Regex("\/\*[\s\S]*?\*\/")
        Dim mSingleLineComments As MatchCollection = rSingleLineComment.Matches(content)
        Dim mMultiLineComments As MatchCollection = rMultiLineComment.Matches(content)
        For Each m As Match In mc
            If m.Success Then
                For Each singleLine As Match In mSingleLineComments
                    If singleLine.Success Then
                        If m.Index >= singleLine.Index And m.Index + m.Length <= singleLine.Index + singleLine.Length Then
                            GoTo NextAttribute
                        End If
                    End If
                Next
                For Each multiLine As Match In mMultiLineComments
                    If multiLine.Success Then
                        If m.Index >= multiLine.Index And m.Index + m.Length <= multiLine.Index + multiLine.Length Then
                            GoTo NextAttribute
                        End If
                    End If
                Next
                Return m
            End If
NextAttribute:
        Next
        Return Nothing
    End Function
End Module
Thank you, everybody.
PS. Special thanks to RegExr: Online Regular Expression Testing Tool (http://gskinner.com/RegExr/), the best online regex tool I have ever used.
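The core idea of the macro above (collect the comment spans first, then take the first attribute match that does not lie inside any of them) can be shown in a much smaller form. The Python sketch below is a simplified illustration, not a port: `bump_build` is a hypothetical name, the attribute pattern does not allow comments inside the attribute itself, and comment markers that appear inside string literals are not handled.

```python
import re

# Simplified attribute pattern: groups are major, minor, build, revision.
ATTR = re.compile(
    r'\[\s*assembly\s*:\s*AssemblyVersion\s*\(\s*'
    r'"(\d+)\.(\d+)(?:\.(\d+))?(?:\.(\d+))?"\s*\)\s*\]'
)
# Single-line (// ...) and multi-line (/* ... */) comments.
COMMENT = re.compile(r'//[^\n]*|/\*.*?\*/', re.DOTALL)

def bump_build(source):
    """Increment the build number of the first uncommented AssemblyVersion attribute."""
    comment_spans = [m.span() for m in COMMENT.finditer(source)]
    for m in ATTR.finditer(source):
        inside_comment = any(
            start <= m.start() and m.end() <= end for start, end in comment_spans
        )
        if inside_comment:
            continue  # skip attributes that sit entirely inside a comment
        if m.group(3) is None:
            return source  # no build component to update
        new_build = str(int(m.group(3)) + 1)
        return source[:m.start(3)] + new_build + source[m.end(3):]
    return source

src = (
    '//[assembly: AssemblyVersion("0.1.0.0")]\n'
    '/* [assembly: AssemblyVersion("0.1.0.0")] */\n'
    '[assembly: AssemblyVersion("0.1.5.0")] // inline comment\n'
)
print(bump_build(src))
```

Only the third line changes, to version 0.1.6.0; the two commented-out attributes and the trailing inline comment are left as they were.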
