Classic ASP comparison of comma separated lists - regex

I have two comma separated lists:-
36,189,47,183,65,50
65,50,189,47
The question is how to compare the two in classic ASP in order to identify and return any values that exist in list 1 but that don't exist in list 2 bearing in mind that associative arrays aren't available.
E.g., in the above example I would need the return value to be 36,183
Thanks

Something like this (untested):
str1 = "36,189,47,183,65,50"
str2 = "65,50,189,47"
arr1 = Split(str1, ",")
arr2 = Split(str2, ",")
for i = 0 to UBound(arr1)
found = false
for j = 0 to UBound(arr2)
if arr1(i) = arr2(j) then
found = true
end if
next
if found = false then
Response.Write(arr1(i))
end if
next

In order to solve this with regular expression you can use lookaheads (both positive and negative) and references, for example:
(zyx:~) % echo '36,189,47,183,65,50;65,50,189,47' | grep -oP '((?>(?<![^,;])[^,;]+))(?=.*;)(?!.*;(|.*,)\1(?=,|$))'
36
183
Other variant (works in PCRE but not in perl):
(zyx:~) % echo '36,189,47,183,65,50' | grep -oP '((?!(?<=,|^)65|50|189|47(?=,|$))(?<=,|^)[^,]+(?=,|$))'
36
183
Have no idea whether any of these works in asp.

VBScript has associative arrays in the form of the Dictionary object.
Dim list1, list2
list1 = "36,189,47,183,65,50"
list2 = "65,50,189,47"
Dim arr1, arr2
arr1 = Split(list1, ",")
arr2 = Split(list2, ",")
' oDict will hold values from list1
Dim oDict, i
Set oDict = Server.CreateObject("Scripting.Dictionary")
For i = 0 To UBound(arr1)
oDict(arr1(i)) = 1
Next
' Now loop through list2 and remove matching items from list1
For i = 0 To UBound(arr2)
If oDict.Exists(arr2(i)) Then
oDict.Remove arr2(i)
End If
Next
Response.Write Join(oDict.Keys, ",") ' should be "36,183"

Related

Excel UDF for capturing numbers within characters

I have a variable text field sitting in cell A1 which contains the following:
Text;#Number;#Text;#Number
This format can keep repeating, but the pattern is always Text;#Number.
The numbers can vary from 1 digit to n digits (limit 7)
Example:
Original Value
MyName;#123;#YourName;#3456;#HisName;#78
Required value:
123, 3456, 78
The field is too variable for excel formulas from my understanding.
I tried using regexp but I am a beginner when it comes to coding. if you can break down the code with some explanation text, it would be much appreciated.
I have tried some of the suggestions below and they work perfectly. One more question.
Now that I can split the numbers from the text, is there any way to utilize the code below and add another layer, where we split the numbers into x cells.
For example: once we run the function, if we get 1234, 567 in the same cell, the function would put 1234 in cell B2, and 567 in cell C2. This would keep updating all cells in the same row until the string has exhausted all of the numbers that are retrieved from the function.
Thanks
This is the John Coleman's suggested method:
Public Function GetTheNumbers(st As String) As String
ary = Split(st, ";#")
GetTheNumbers = ""
For Each a In ary
If IsNumeric(a) Then
If GetTheNumbers = "" Then
GetTheNumbers = a
Else
GetTheNumbers = GetTheNumbers & ", " & a
End If
End If
Next a
End Function
If the pattern is fixed, and the location of the numbers never changes, you can assume the numbers will be located in the even places in the string. This means that in the array result of a split on the source string, you can use the odd indexes of the resulting array. For example in this string "Text;#Number;#Text;#Number" array indexes 1, 3 would be the numbers ("Text(0);#Number(1);#Text(2);#Number(3)"). I think this method is easier and safer to use if the pattern is indeed fixed, as it avoids the need to verify data types.
Public Function GetNums(src As String) As String
Dim arr
Dim i As Integer
Dim result As String
arr = Split(src, ";#") ' Split the string to an array.
result = ""
For i = 1 To UBound(arr) Step 2 ' Loop through the array, starting with the second item, and skipping one item (using Step 2).
result = result & arr(i) & ", "
Next
If Len(result) > 2 Then
GetNums = Left(result, Len(result) - 2) ' Remove the extra ", " at the end of the the result string.
Else
GetNums = ""
End If
End Function
The numbers can vary from 1 digit to n digits (limit 7)
None of the other responses seems to take the provided parameters into consideration so I kludged together a true regex solution.
Option Explicit
Option Base 0 '<~~this is the default but I've included it because it has to be 0
Function numsOnly(str As String, _
Optional delim As String = ", ")
Dim n As Long, nums() As Variant
Static rgx As Object, cmat As Object
'with rgx as static, it only has to be created once; beneficial when filling a long column with this UDF
If rgx Is Nothing Then
Set rgx = CreateObject("VBScript.RegExp")
End If
numsOnly = vbNullString
With rgx
.Global = True
.MultiLine = False
.Pattern = "[0-9]{1,7}"
If .Test(str) Then
Set cmat = .Execute(str)
'resize the nums array to accept the matches
ReDim nums(cmat.Count - 1)
'populate the nums array with the matches
For n = LBound(nums) To UBound(nums)
nums(n) = cmat.Item(n)
Next n
'convert the nums array to a delimited string
numsOnly = Join(nums, delim)
End If
End With
End Function
      
Regexp option that uses Replace
Sub Test()
Debug.Print StrOut("MyName;#123;#YourName;#3456;#HisName;#78")
End Sub
function
Option Explicit
Function StrOut(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(^|.+?)(\d{1,7})"
.Global = True
If .Test(strIn) Then
StrOut = .Replace(strIn, "$2, ")
StrOut = Left$(StrOut, Len(StrOut) - 2)
Else
StrOut = "Nothing"
End If
End With
End Function

Slight adaptation of a User Defined Function

I would like to extract a combination of text and numbers from a larger string located within a column within excel.
The constants I have to work with is that each Text string will
•either start with a A, C or S, and
•will always be 7 Characters long
•the position of he string I would like to extract varies
The code I have been using which has been working efficiently is;
Public Function Xtractor(r As Range) As String
Dim a, ary
ary = Split(r.Text, " ")
For Each a In ary
If Len(a) = 7 And a Like "[SAC]*" Then
Xtractor = a
Exit Function
End If
Next a
Xtractor = ""
End Function
However today I have learnt that sometimes my data may include scenarios like this;
What I would like is to adapt my code so If the 8th character is "Underscore" and the 1st character of the 7 characters is either S, A or C please extract up until the "Underscore"
Secondly I would like to exclude commons words like "Support" & "Collect" from being extracted.
Finally the 7th letter should be a number
Any ideas around this would be much appreciated.
Thanks
try this
ary = Split(Replace(r.Text, "_", " "))
or
ary = Split(Replace(r.Text, "_", " ")," ")
result will be same for both variants
test
update
Do you know how I could leave the result blank if the 7th character returned a letter?
Public Function Xtractor(r As Range) As String
Dim a, ary
ary = Split(Replace(r.Text, "_", " "))
For Each a In ary
If Len(a) = 7 And a Like "[SAC]*" And IsNumeric(Mid(a, 7, 1)) Then
Xtractor = a
Exit Function
End If
Next a
Xtractor = ""
End Function
test
Add Microsoft VBScript Regular Expressions 5.5 to project references. Use the following code to test matching and extracting with Xtractor:
Public Function Xtractor(ByVal p_val As String) As String
Xtractor = ""
Dim ary As String, v_re As New VBScript_RegExp_55.RegExp, Matches
v_re.Pattern = "^([SAC][^_]{1,6})_?"
Set Matches = v_re.Execute(p_val)
If Matches.Count > 0 Then Xtractor = Matches(0).SubMatches(0) Else Xtractor = ""
End Function
Sub test_Xtractor(p_cur As Range, p_val As String, p_expected As String)
Dim v_cur As Range, v_res As Range
p_cur.Value = p_val
Set v_cur = p_cur.Offset(columnOffset:=1)
v_cur.FormulaR1C1 = "='" & ThisWorkbook.Name & "'!Xtractor(RC[-1])"
Set v_res = v_cur.Offset(columnOffset:=1)
v_res.FormulaR1C1 = "=RC[-1]=""" & p_expected & """"
Debug.Print p_val; "->"; v_cur.Value; ":"; v_res.Value
End Sub
Sub test()
test_Xtractor ActiveCell, "A612002_MDC_308", "A612002"
test_Xtractor ActiveCell.Offset(1), "B612002_MDC_308", ""
test_Xtractor ActiveCell.Offset(2), "SUTP038_MDC_3", "SUTP038"
test_Xtractor ActiveCell.Offset(3), "KUTP038_MDC_3", ""
End Sub
Choose the workbook and cell for writing test fixture, then run test from the VBA Editor.
Output in the Immediate window (Ctrl+G):
A612002_MDC_308->A612002:True
B612002_MDC_308->:True
SUTP038_MDC_3->SUTP038:True
KUTP038_MDC_3->:True
UPD
Isit possible to ammend this code so if the 7th character is a letter to return blank?
Replace line with assign to v_re by the following:
v_re.Pattern = "^([SAC](?![^_]{5}[A-Z]_?)[^_]{1,6})_?"
v_re.IgnoreCase = True
And add to the test suite:
test_Xtractor ActiveCell.Offset(4), "SUTP03A_MDC_3", ""
Output:
A612002_MDC_308->A612002:True
B612002_MDC_308->:True
SUTP038_MDC_3->SUTP038:True
KUTP038_MDC_3->:True
SUTP03A_MDC_3->:True
I inserted negative lookahead subrule (?![^_]{5}[A-Z]_?) to reject SUTP03A_MDC_3. But pay attention: the rejecting rule is applied exactly to the 7th character. Now v_re.IgnoreCase set to True, but if only capitalized characters are allowed, set it to False. See also Regular Expression Syntax on MSDN.

Matlab: regexprep with variable

I have an array a : a list of identified words to be compared and replace by empty character in an array b. newB is the result.
The value of a might vary according to an input file.
I am trying to use regexprep but it is not working well.
e.g.:
a = {'apple';'banana';'orange'}; % a might be also ‘watermelon’, ‘papaya’ etc
b = {'1 apple = 2 kiwi';'1 fig = 1 banana';'1 orange = 3 strawberry'};
newB = {' = 2 kiwi';'1 fig = ';' = 3 strawberry'};
From your example it seems like you want to remove a special word and a number, the appropriate regular expression for this is (for word = 'apple'): '\d+ apple'. Building the regular expression from all the words in a, using sprintf:
re = sprintf('\\d+ %s|',a{:}); %// adding | operator to select between expressions
re(end)=[]; %// discard the last '|'
The resulting regular expression is
re =
'\d+ apple|\d+ banana|\d+ orange'
Now the actual replacement:
newB = regexprep(b,re,'')
Resulting with
newB =
' = 2 kiwi'
'1 fig = '
' = 3 strawberry'

Add two different numbers in a single text field space separated in Access VBA

I am using Access and VBA to tidy up a database before a migration. One field is going from text to an INT. So I need to convert and possibly add some numbers which exist in a singular field.
Examples:
F/C 3 other 8 should become 11
Calender-7 should become 7
21 F/C and 1 other should become 22
29 (natural ways) should become 29
The second and fourth line are simple enough, just use the following regex in VBA
Dim rgx As New RegExp
Dim inputText As String
Dim outputText As String
rgx.Pattern = "[^0-9]*"
rgx.Global = True
inputText = "29 (natural ways)"
outputText = rgx.Replace(inputText, "")
The downside is if I use it on option 1 or 3:
F/C 3 other 8 will become 38
Calender-7 will become 7
21 F/C and 1 other will become 211
29 (natural ways) will become 29
This is simple enough in bash, I can just keep the spaces by adding one to [^0-9 ]* and then piping it into awk which will add every field using a space as a delimiter like so:
sed 's/[^0-9 ]*//g' | awk -F' ' 's=0; {for (i=1; i<=NF; i++) s=s+$i; print s}'
F/C 3 other 8 will become 11
21 F/C and 1 other will become 22
The problem is I cannot use bash, and there are far too many values to do it by hand. Is there any way to use VBA to accomplish this?
Instead of using the replace method, just capture and then add up all the numbers. For example:
Option Explicit
Function outputText(inputText)
Dim rgx As RegExp
Dim mc As MatchCollection, m As Match
Dim I As Integer
Set rgx = New RegExp
rgx.Pattern = "[0-9]+"
rgx.Global = True
Set mc = rgx.Execute(inputText)
For Each m In mc
I = I + CInt(m) 'may Need to be cast as an int in Access VBA; not required in Excel VBA
Next m
outputText = I
End Function
I'm not sure if there are any easier way for your question. Here I've wrote small function for you.
Requirement: add all numbers in a string, identify "consecutive" digits as one number.
pseudo:
Loop through given text
find the first number and check/loop if following chars are numbers
if following chars are numbers treat as one number else pass the
result
continue searching from last point and add the result to the total
in code:
Public Function ADD_NUMB(iText As String) As Long
Dim I, J As Integer
Dim T As Long
Dim TM As String
For I = 1 To Len(iText)
If (InStr(1, "12346567890", Mid$(iText, I, 1)) >= 1) Then
TM = Mid(iText, I, 1)
For J = I + 1 To Len(iText)
If (InStr(1, "12346567890", Mid$(iText, J, 1)) >= 1) Then
TM = TM & Mid$(iText, J, 1)
Else
Exit For
End If
Next J
T = T + Val(Nz(TM, 0))
I = J
End If
Next I
ADD_NUMB = T
End Function
usage:
dim total as integer
total = ADD_NUMB("21 F/C and 1 other")
not sure about performance but it will get you what you need :)

In Perl, how many groups are in the matched regex?

I would like to tell the difference between a number 1 and string '1'.
The reason that I want to do this is because I want to determine the number of capturing parentheses in a regular expression after a successful match. According the perlop doc, a list (1) is returned when there are no capturing groups in the pattern. So if I get a successful match and a list (1) then I cannot tell if the pattern has no parens or it has one paren and it matched a '1'. I can resolve that ambiguity if there is a difference between number 1 and string '1'.
You can tell how many capturing groups are in the last successful match by using the special #+ array. $#+ is the number of capturing groups. If that's 0, then there were no capturing parentheses.
For example, bitwise operators behave differently for strings and integers:
~1 = 18446744073709551614
~'1' = Î ('1' = 0x31, ~'1' = ~0x31 = 0xce = 'Î')
#!/usr/bin/perl
($b) = ('1' =~ /(1)/);
print isstring($b) ? "string\n" : "int\n";
($b) = ('1' =~ /1/);
print isstring($b) ? "string\n" : "int\n";
sub isstring() {
return ($_[0] & ~$_[0]);
}
isstring returns either 0 (as a result of numeric bitwise op) which is false, or "\0" (as a result of bitwise string ops, set perldoc perlop) which is true as it is a non-empty string.
If you want to know the number of capture groups a regex matched, just count them. Don't look at the values they return, which appears to be your problem:
You can get the count by looking at the result of the list assignment, which returns the number of items on the right hand side of the list assignment:
my $count = my #array = $string =~ m/.../g;
If you don't need to keep the capture buffers, assign to an empty list:
my $count = () = $string =~ m/.../g;
Or do it in two steps:
my #array = $string =~ m/.../g;
my $count = #array;
You can also use the #+ or #- variables, using some of the tricks I show in the first pages of Mastering Perl. These arrays have the starting and ending positions of each of the capture buffers. The values in index 0 apply to the entire pattern, the values in index 1 are for $1, and so on. The last index, then, is the total number of capture buffers. See perlvar.
Perl converts between strings and numbers automatically as needed. Internally, it tracks the values separately. You can use Devel::Peek to see this in action:
use Devel::Peek;
$x = 1;
$y = '1';
Dump($x);
Dump($y);
The output is:
SV = IV(0x3073f40) at 0x3073f44
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
SV = PV(0x30698cc) at 0x3073484
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x3079bb4 "1"\0
CUR = 1
LEN = 4
Note that the dump of $x has a value for the IV slot, while the dump of $y doesn't but does have a value in the PV slot. Also note that simply using the values in a different context can trigger stringification or nummification and populate the other slots. e.g. if you did $x . '' or $y + 0 before peeking at the value, you'd get this:
SV = PVIV(0x2b30b74) at 0x3073f44
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 1
PV = 0x3079c5c "1"\0
CUR = 1
LEN = 4
At which point 1 and '1' are no longer distinguishable at all.
Check for the definedness of $1 after a successful match. The logic goes like this:
If the list is empty then the pattern match failed
Else if $1 is defined then the list contains all the catpured substrings
Else the match was successful, but there were no captures
Your question doesn't make a lot of sense, but it appears you want to know the difference between:
$a = "foo";
#f = $a =~ /foo/;
and
$a = "foo1";
#f = $a =~ /foo(1)?/;
Since they both return the same thing regardless if a capture was made.
The answer is: Don't try and use the returned array. Check to see if $1 is not equal to ""