String masking - inserting dashes - regex

I am writing a function to format a string. I receive a string of numbers, sometimes with dashes, sometimes not. I need to produce an output string of 14 characters, so if the input string contains fewer than 14, I need to pad it with zeros. Then I need to mask the string of numbers by inserting dashes in the appropriate places. Here is what I have so far:
strTemp = strTemp.Replace("-", "")
If IsNumeric(strTemp) Then
If strTemp.Length < 14 Then
strTemp = strTemp.PadRight(14 - strTemp.Length)
End If
output = String.Format(strTemp, "{00-000-0-0000-00-00}")
End If
The above works fine, except it just returns a string of numbers without putting in the dashes. I know I am doing something wrong with String.Format but so far I've only worked with pre-defined formats. Can anyone help? How can I use Regex for string formatting in this case?

This function should do the trick:
Public Function MaskFormat(input As String) As String
input = input.Replace("-", String.Empty)
If IsNumeric(input) Then
If input.Length < 14 Then
' PadRight takes the total width; pad with "0" rather than the default space
input = input.PadRight(14, "0"c)
End If
Return String.Format("{0:00-000-0-0000-00-00}", CLng(input))
Else
Return String.Empty
End If
End Function
You can find more on String formatting here.
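For comparison, the same masking logic can be sketched outside .NET. This is a minimal Python sketch, assuming right-padding with zeros as in the question and group widths of 2-3-1-4-2-2 read off the mask; mask_format and GROUPS are illustrative names, not part of any library:

```python
GROUPS = (2, 3, 1, 4, 2, 2)  # widths read off the mask 00-000-0-0000-00-00

def mask_format(raw: str) -> str:
    """Strip dashes, right-pad with zeros to 14 digits, re-insert dashes."""
    digits = raw.replace("-", "")
    total = sum(GROUPS)
    # Assumption: anything non-numeric or longer than 14 digits is invalid.
    if not (digits.isascii() and digits.isdigit()) or len(digits) > total:
        return ""  # mirrors the answer: empty string for invalid input
    digits = digits.ljust(total, "0")
    parts, pos = [], 0
    for width in GROUPS:
        parts.append(digits[pos:pos + width])
        pos += width
    return "-".join(parts)
```

Slicing the padded string avoids converting it to a number, so leading zeros in the input are kept as they are.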

Related

Remove all underscores until last number

I am faced with the following problem. I need to remove all underscores between the start of the string and the last digit in the string (for example, 123_456__ becomes 123456__). I used an ordinary loop for it, which goes from string.length - 1 down to 0; when the character is a digit, I start a new loop from 0 to i, where i is the position of the found digit, and build a new string that skips the underscores. It seems there should be a way to replace this with a regex or more "Kotlin-style" code, but I do not know how to do it. Is it possible to do this in a more convenient way?
One way to do this is to use string functions like takeLastWhile / drop etc.
val s = "123_456__"
val end = s.takeLastWhile { !it.isDigit() }
val start = s.dropLast(end.length).filter { it != '_' } // or replace("_", "")
val result = start + end
println(result)
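Since the question also asks about a regex, here is a minimal sketch of that route in Python, assuming single-line input. A lookahead removes an underscore only when a digit still follows it somewhere later in the string; the function name is illustrative:

```python
import re

def strip_underscores_before_last_digit(s: str) -> str:
    # "_(?=.*\d)" matches an underscore only if some digit follows it,
    # so underscores after the last digit are left untouched.
    return re.sub(r"_(?=.*\d)", "", s)

print(strip_underscores_before_last_digit("123_456__"))  # 123456__
```

If the string contains no digits at all, nothing is removed, which matches the behaviour of the takeLastWhile version above.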

Manipulate string to extract address

I'm currently doing some work with a very large data source on city addresses where the data looks something like this.
137 is the correct address but it belongs in a building that takes up 135-138A on the street.
source:
137 9/F 135-138A KING STREET 135-138A KING STREET TOR
I've used a function shown on ExtendOffice which removes the duplicates.
The second column has become this:
137 9/F 135-138A KING STREET TOR
What I want to do now is:
find the address number and put it in front of the street name
remove the numbers that are connected by the dash (-):
9/F 137 KING STREET TOR
What would be the best way to accomplish this?
The main problem I'm having with this is that there are many inconsistent spaces in address names, e.g. "van dyke rd".
Is there any way I can locate the "-" in an array, set variables for the two numbers on either side of the dash, and replace them with the correct address number located at the front?
Function RemoveDupes2(txt As String, Optional delim As String = " ") As String
Dim x
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For Each x In Split(txt, delim)
If Trim(x) <> "" And Not .exists(Trim(x)) Then .Add Trim(x), Nothing
Next
If .Count > 0 Then RemoveDupes2 = Join(.keys, delim)
End With
End Function
Thanks
Regular Expressions are a way to (amongst other things) search for a feature in a string.
It looks like the feature you are looking for is: number:maybe some spaces : dash : maybe some spaces : number
In regex notation this would be expressed as:
([0-9]+)[ ]*-[ ]*([0-9]+)
Which translates to: Find a sequential group of one or more digits followed by zero or more spaces, then a dash, then zero or more spaces, then one or more digits. (Using + rather than * for the digits stops the pattern from matching a lone dash.)
The parentheses indicate the elements that will be returned, so you could assign variables to be the first number or the second number.
You might need to tweak this if a dash can potentially occur elsewhere in the address.
Further information on actually implementing that is available here: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
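To make the capture groups concrete, here is a minimal sketch in Python rather than VBA, assuming the range is the first digits-dash-digits run in the address. RANGE and find_range are illustrative names:

```python
import re

# Two capture groups: the digits on either side of the dash.
RANGE = re.compile(r"([0-9]+)[ ]*-[ ]*([0-9]+)")

def find_range(address: str):
    """Return (low, high) as strings, or None if no range is present."""
    m = RANGE.search(address)
    return (m.group(1), m.group(2)) if m else None

print(find_range("137 9/F 135-138A KING STREET TOR"))  # ('135', '138')
```

Note that the letter suffix in 138A is not captured by this pattern; it would need something like an optional [A-Z] after each group of digits.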
This meets the case you want; it captures the two ends of the address range as separate groups (if you want to process them further).
The current code simply removes this range altogether.
What logic is there to move the 9/F to front?
See regex here
Function StripString(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\d+[A-C]?)-(\d+[A-C]?)"
If .test(strIn) Then
StripString = .Replace(strIn, vbNullString)
Else
StripString = "No match"
End If
End With
End Function
I'd just:
swap 1st and 2nd substrings
erase the substring with "-" in it
Function RemoveDupes2(txt As String, Optional delim As String = " ") As String
Dim x As Variant, arr As Variant, temp As Variant
Dim iArr As Long
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For Each x In Split(txt, delim)
If Trim(x) <> "" And Not .exists(Trim(x)) Then .Add Trim(x), Nothing
Next
If .count > 0 Then
arr = .keys
temp = arr(0)
arr(0) = arr(1)
arr(1) = temp
For iArr = LBound(arr) To UBound(arr)
If InStr(arr(iArr), "-") <> 0 Then arr(iArr) = ""
Next
RemoveDupes2 = Join(arr, delim)
End If
End With
End Function
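The whole transformation (drop duplicates, drop the range token, swap the first two tokens) can be sketched in a few lines. This is a Python sketch under stated assumptions, not a translation of the VBA above: the number is always the first token, the floor is always the second, and duplicate detection is case-sensitive (unlike vbTextCompare). normalize_address and RANGE_TOKEN are illustrative names:

```python
import re

# A whole token such as 135-138A: digits, optional letter, dash, digits, optional letter.
RANGE_TOKEN = re.compile(r"^\d+[A-Z]?-\d+[A-Z]?$")

def normalize_address(text: str) -> str:
    # split() with no argument collapses runs of spaces;
    # dict.fromkeys drops duplicates while keeping first-seen order.
    tokens = list(dict.fromkeys(text.split()))
    tokens = [t for t in tokens if not RANGE_TOKEN.match(t)]
    if len(tokens) >= 2:
        tokens[0], tokens[1] = tokens[1], tokens[0]  # put the floor first
    return " ".join(tokens)

print(normalize_address("137 9/F 135-138A KING STREET 135-138A KING STREET TOR"))
# 9/F 137 KING STREET TOR
```

Addresses where a word legitimately repeats would lose the repeat, so the de-duplication step may need to be narrower on real data.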

Verify and cut a string using regexp in matlab

I have the following string:
{'output',{'variable','VGRG_Pos_Var1/Parameters/D_foo'},'date',734704.60904050921}
I would like to verify the format of the string, namely that the word 'variable' is the second word, and I would like to retrieve the string after the last '/' in the third string (in this example 'D_foo').
How could I verify this and retrieve the string I am looking for?
I tried the following:
regexp(str,'{''\w+'',{''variable'',''([(a-z)|(A-Z)|/|_])+')
without success
REMARK
The string to analyse is not split after the comma; the line break is only due to the length of the string.
EDIT
my string is:
'{''output'',{''variable'',''VGRG_Pos_Var1/Parameters/D_foo''},''date'',734704.60904050921}';
and not a cell, as could be misunderstood. I added the ' symbol at the start and end of the string to indicate that it is a string.
I realise that you mention using regexp in the question, but I'm not sure if this is a requirement? If other solutions are acceptable you could try this:
str='{''output'',{''variable'',''VGRG_Pos_Var1/Parameters/D_foo''},''date'',734704.60904050921}';
parts1=textscan( str, '%s','delimiter',{',','{','}'},'MultipleDelimsAsOne',1);
parts2=textscan( parts1{1}{3}, '%s','delimiter',{'/',''''},'MultipleDelimsAsOne',1);
string=parts2{1}{end}
match=strcmp(parts1{1}{2},'variable')
To answer the first part of your question, you can write this:
str = {'output',{'variable','VGRG_Pos_Var1/Parameters/D_foo'},'date',734704.60904050921};
temp = str(2); %this holds the cell containing the two strings
if strcmp(temp{1}(1), 'variable') %strcmp is the MATLAB string comparison function
%do stuff
end
For the second part you can do this:
str = {'output',{'variable','VGRG_Pos_Var1/Parameters/D_foo'},'date',734704.60904050921};
temp = str(2); %like before, this contains the cell
temp = temp{1}(2); %this picks out the second string in the cell
temp = char(temp); %turns the item from a cell to a string
res = strsplit(temp, '/'); %splits the string where '/' are found, res is an array of strings
string = res{end}; %takes the part after the last '/', however many '/'s there are
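For the regex route the question originally asked about, a single pattern can do both the check and the extraction. The sketch below is in Python, not MATLAB, and assumes the input is one flat string as stated in the EDIT; extract_name and PATTERN are illustrative names, and the pattern has not been tested in MATLAB's regexp:

```python
import re

# {'<word>',{'variable','<path>/<name>'} ... ; group 1 is the text after the last '/'
PATTERN = re.compile(r"^\{'\w+',\{'variable','(?:[^'/]*/)*([^'/]+)'\}")

def extract_name(s: str):
    """Return the last path component, or None if the format does not match."""
    m = PATTERN.match(s)
    return m.group(1) if m else None

s = "{'output',{'variable','VGRG_Pos_Var1/Parameters/D_foo'},'date',734704.60904050921}"
print(extract_name(s))  # D_foo
```

The non-capturing group consumes every "segment/" prefix, so only the final segment lands in the capture group, and a failed match doubles as the format check.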

VBA RegEx getting String with only Number and Hyphen

I have a string with something like
Bl. 01 - 03
I want this to be reduced to only
01-03
Everything other than digits & hyphen should be removed. Any ideas how to do it using regex or any other method?
You can use this pattern in a replace expression:
reg.Pattern = "[^\d-]+"
Debug.Print reg.Replace(yourstring, "")
Barring a more complete description of exactly what you mean by something like "Bl. 01 - 03", this:
^.*(\d{2}\s?-\s?\d{2}).*$
Will capture the portion you seem to be interested in as group 1. If you want to get rid of the spaces as well, then something like:
^.*(\d{2})\s?-\s?(\d{2}).*$
might be more suited, where you will have the two numbers in groups 1 and 2, and can replace the hyphen in output.
Here's a function with a non-RegEx approach to remove anything but digits and the hyphen from a given input string:
Function removeBadChars(sInput As String) As String
Dim i As Integer
Dim sResult As String
Dim sChr As String
For i = 1 To Len(sInput)
sChr = Mid(sInput, i, 1)
If IsNumeric(sChr) Or sChr = "-" Then
sResult = sResult & sChr
End If
Next
removeBadChars = sResult
End Function
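Both approaches above (the negated character class and the character-by-character filter) can be compared side by side. A minimal Python sketch, with illustrative function names and assuming only ASCII digits count as digits:

```python
import re

def keep_digits_and_hyphen(text: str) -> str:
    # Regex variant: delete every run of characters that are not 0-9 or '-'.
    return re.sub(r"[^0-9-]+", "", text)

def keep_digits_and_hyphen_loop(text: str) -> str:
    # Loop variant: keep a character only if it is a digit or a hyphen.
    return "".join(ch for ch in text if ch in "0123456789-")

print(keep_digits_and_hyphen("Bl. 01 - 03"))  # 01-03
```

Placing the hyphen last inside the brackets keeps it a literal character rather than the start of a range.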

Regex To Match Order Of String

I want to detect when the words in a string match an existing value in reverse order.
We want to add validation that prompts the user if the name already exists in reverse order.
For example:
If name column has the value, 'Viral,Tennis'
Now if user enters a new name with the value, 'Tennis,Viral'
Then how can we match reverse order of word using regex or some other way?
I am using C#.net for development.
You could take a look at Regex.Split(String input, String pattern) and do something like so:
String[] userEntry = Regex.Split(userString, "\\s+");
StringBuilder sb = new StringBuilder();
for (int i = userEntry.Length -1; i >= 0; i--)
{
sb.Append(userEntry[i]).Append(" ");
}
String result = sb.ToString();
//Do Validation
That would do the trick, however, you need to keep in mind that things will get a little bit messy if you do not want to change the order of special symbols such as the comma. You could easily remove those and do any validation without special symbols.
EDIT: It depends on what you mean by special symbols. The regex [^a-zA-Z0-9]+ will match any run of characters that are neither letters (upper or lower case) nor digits. So you could easily do something like so:
string input = ...
string pattern = "[^a-zA-Z0-9]+";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
The above should yield a string which is only made from letters and digits. White spaces will also be removed.
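An alternative to stripping everything is to split on the non-alphanumeric runs and compare the word lists directly. A minimal sketch in Python rather than C#, assuming the comparison should ignore case and treat any punctuation or whitespace as a separator; name_key and is_reverse_of are illustrative names:

```python
import re

def name_key(name: str) -> tuple:
    # Split on runs of non-alphanumeric characters and normalise case.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", name) if p]
    return tuple(p.lower() for p in parts)

def is_reverse_of(new_name: str, existing: str) -> bool:
    new_key, old_key = name_key(new_name), name_key(existing)
    # A single word reversed is itself, so require at least two words.
    return len(new_key) > 1 and new_key == old_key[::-1]

print(is_reverse_of("Tennis,Viral", "Viral,Tennis"))  # True
```

Comparing word lists keeps the word boundaries, which are lost if the separators are removed before the comparison.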