How to delete repeated specific characters - regex

I need to replace each "#" with "-" in a string. This is straightforward, but I also need to replace a run of several, such as "#####", with just one single "-". Any ideas on how to do the latter with ASP?
Here is an example:
input string:
#Introducción a los Esquemas Algorítmicos: Apuntes y colección de problemas. Report LSI-97-6-T########09/30/1997#####TRE#
Desired output:
-Introducción a los Esquemas Algorítmicos: Apuntes y colección de problemas. Report LSI-97-6-T-09/30/1997-TRE-
Thanks.

Try this for classic ASP:
Dim regEx
Set regEx = New RegExp
With regEx
.Pattern = "([\#])\1+|(\#)"
.Global = True
.MultiLine = True
End With
strMessage = regEx.Replace(str, "-")
This matches every run of repeated # characters as well as every single #.
Not sure what language you are using so here's the expression in full with delimiters: /([\#])\1+|(\#)/g
Edit - Even simpler: /#+/g
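For comparison, here is a minimal Python sketch of the same `#+` idea. The function name `collapse_hashes` is made up for this illustration and is not part of any of the answers above.

```python
import re

def collapse_hashes(text: str, replacement: str = "-") -> str:
    """Replace every run of one or more '#' with a single replacement string."""
    return re.sub(r"#+", replacement, text)

sample = ("#Introducción a los Esquemas Algorítmicos: Apuntes y colección de "
          "problemas. Report LSI-97-6-T########09/30/1997#####TRE#")
print(collapse_hashes(sample))
```

Because `+` is greedy, a whole run of # is consumed by one match, so each run yields exactly one "-".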

using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
String input = "#Introducción a los Esquemas Algorítmicos: Apuntes y colección de problemas. Report LSI-97-6-T########09/30/1997#####TRE#";
// @"..." is a verbatim string literal; "#+" matches each run of one or more '#'.
String output = Regex.Replace(input, @"#+", "-");
Console.WriteLine(output);
}
}

Related

Match n amount of words separated by commas after base text

I would like to match an infinite amount of words separated by commas and whitespaces.
Is there a better solution than just repeating the search parameter?
Sample:
"2_i Art des Problems:\s*(.[^,\s]+)[,]\s*(.[^,\s]+)[,]\s*(.[^,\s]+)"
2_i Art des Problems: Elektrisch, Schweißausrüstung, Burgenland
View on regex101: https://regex101.com/r/yP7PPO/1
Full code for this operation:
With Reg1
.Pattern = "2_i Art des Problems:+\s*([^\r\n]*\S)"
.Global = False
End With
If Reg1.Test(olMail.Body) Then
Set M1 = Reg1.Execute(olMail.Body)
End If
For Each M In M1
With xExcelApp
' Compare the first captured group against quoted string literals.
Select Case M.SubMatches(0)
Case "Software"
.Range("D6").Value = 1
Case "Mechanisch"
.Range("E6").Value = 1
Case "Elektrisch"
.Range("F6").Value = 1
Case "Roboter"
.Range("G6").Value = 1
Case "Schweißausrüstung"
.Range("H6").Value = 1
Case "Anwendung"
.Range("I6").Value = 1
Case "Ersatzteil"
.Range("J6").Value = 1
Case Else
.Range("K6").Value = 1
End Select
End With
Next M
Does it really need to be a RegEx?
I think that overcomplicates things, as this can easily be solved with Split():
Option Explicit
Public Sub Example()
Const TestString As String = "2_i Art des Problems: Elektrisch, Schweißausrüstung, Burgenland"
Const ConstantPart As String = "2_i Art des Problems: "
If Left$(TestString, Len(ConstantPart)) = ConstantPart Then
Dim Parts() As String
Parts = Split(Mid$(TestString, Len(ConstantPart) + 1), ", ")
Dim Part As Variant
For Each Part In Parts
Debug.Print Part
Next Part
End If
End Sub
Output is:
Elektrisch
Schweißausrüstung
Burgenland
If you really need to use a regexp, then set the global flag and use, for example, this pattern:
(.[^,\s]+)(,|$)
With regEx
.Global = True
End With
Use .SubMatches to get the values of the capturing groups.
EDIT:
One of the comments notes: "Then you still need to Trim the matches because they will include the spaces. – Pᴇʜ"
You can still handle that with a regexp by moving the leading dot outside the capturing group:
.([^,\s]+)(,|$)
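As a rough illustration of combining both answers, the Python sketch below first isolates the rest of the line after the fixed label (using the asker's own pattern) and then splits on commas with surrounding whitespace, so no trimming is needed. The names `PREFIX` and `problem_types` are hypothetical.

```python
import re

# Same pattern as in the question: capture the rest of the line after the label.
PREFIX = re.compile(r"2_i Art des Problems:\s*([^\r\n]*\S)")

def problem_types(body: str) -> list[str]:
    """Return the comma-separated items after the label, without padding."""
    match = PREFIX.search(body)
    if match is None:
        return []
    # Splitting on optional whitespace around each comma trims the items.
    return [item for item in re.split(r"\s*,\s*", match.group(1)) if item]

print(problem_types("2_i Art des Problems: Elektrisch, Schweißausrüstung, Burgenland"))
```

Splitting the captured remainder handles any number of items, which a fixed sequence of capture groups cannot do.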

How to capture several portions of a string at once with regex?

I need to capture several strings within a longer string strText and process them. I use VBA.
strText:
Salta pax {wenn([gender]|1|orum|2|argentum)} {[firstname]} {[lastname]},
ginhox seperatum de gloria desde quativo,
dolus {[start]} tofi {[end]}, ([{n_night]}
{wenn([n_night]|1|dignus|*|digni)}), cum {[n_person]}
{wenn([n_person]|1|felix|*|semporum)}.
Quod similis beruntur: {[number]}
I'm trying to capture different portions of strText, all within the curly braces:
If there's only a string within square brackets, I'd like to capture the string:
{[firstname]} --> firstname
If there's a conditional operation (starting with wenn()), I'd like to capture the string within the square brackets plus the number-value-pairs after:
{[gender]|1|orum|2|argentum} --> gender / 1=orum / 2=argentum
I managed to define a pattern to get any one of the tasks above,
e.g. \{\[(.+?)\]\} capturing the strings within square brackets,
see this regex101
but I figure there must be a way to have a pattern that does all of the above?
I'm not sure if the following code is helpful to you. It uses the | symbol to capture both conditions.
Function extractStrings(strText As String) As MatchCollection
Dim regEx As New RegExp
Dim SubStrings As MatchCollection
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "(\{\[)(.+?)(\]\})|(wenn\(\[)(.+?)(\])(\|)(.+?)(\|)(.+?)(\|)(.+?)(\|)(.+?)(\)\})"
End With
On Error Resume Next
Set extractStrings = regEx.Execute(strText)
If Err = 0 Then Exit Function
Set extractStrings = Nothing
End Function
Sub test()
Dim strText As String
strText = "Salta pax {wenn([gender]|1|orum|2|argentum)} {[firstname]} {[lastname]},ginhox seperatum de gloria desde quativo,dolus {[start]} tofi {[end]}, ([{n_night]} " & _
"{wenn([n_night]|1|dignus|*|digni)}), cum {[n_person]}{wenn([n_person]|1|felix|*|semporum)}.Quod similis beruntur: {[number]}"
Dim SubStrings As MatchCollection
Dim SubString As Match
Set SubStrings = extractStrings(strText)
For Each SubString In SubStrings
On Error Resume Next
If SubString.SubMatches(1) <> "" Then
Debug.Print SubString.SubMatches(1)
Else
Debug.Print "wenn(" & SubString.SubMatches(4) & "|" & SubString.SubMatches(7) & "=" & SubString.SubMatches(9) & "|" & SubString.SubMatches(11) & "=" & SubString.SubMatches(13) & ")"
End If
Next SubString
End Sub
You can iterate through all the matches with the For Each loop. I am well aware that the regex pattern is not optimal, but at least it does the trick.
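One way to avoid hard-coding the number of value pairs is to capture everything after the bracketed name as a single group and split it afterwards. The following Python sketch shows that idea. The pattern, the group names and `extract_placeholders` are assumptions made for this illustration, and Python's named groups are not available in VBScript's RegExp.

```python
import re

PLACEHOLDER = re.compile(
    r"\{(?:"
    r"\[(?P<name>[^\]]+)\]"                             # {[name]}
    r"|wenn\(\[(?P<cond>[^\]]+)\]\|(?P<pairs>[^)]*)\)"  # {wenn([name]|k|v|...)}
    r")\}"
)

def extract_placeholders(text):
    """Return (name, {key: value}) tuples; the dict is empty for plain names."""
    results = []
    for m in PLACEHOLDER.finditer(text):
        if m.group("name") is not None:
            results.append((m.group("name"), {}))
        else:
            # Alternate items of the split list are keys and values.
            parts = m.group("pairs").split("|")
            results.append((m.group("cond"), dict(zip(parts[0::2], parts[1::2]))))
    return results

text = "Salta pax {wenn([gender]|1|orum|2|argentum)} {[firstname]} {[lastname]}"
for name, mapping in extract_placeholders(text):
    print(name, mapping)
```

Malformed placeholders such as `{n_night]}` from the sample text match neither alternative and are skipped.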

Get the third Regex between special char

I have this text:
2|#Favo|Name||26.0000|50.10000|_GRE|||||City|Road||||
I want to capture anything between those special chars: ||
For example, I want to capture "Name" only or I want to capture "City"
I've spent many hours and all I came up with is this regex:
([^|].*[$|])\w+
Here are the required values:
How can I capture one of them?
Thank you.
You can split the string on |, removing empty entries as well as any entries that are blank or consist only of digits:
Dim strng As String = "2|#Favo|Name||26.0000|50.10000|_GRE|||||City|Road||||"
Dim reslt As List(Of String) = strng.Split(New String() {"|"}, StringSplitOptions.RemoveEmptyEntries).Where(
Function(m) m.All(AddressOf Char.IsDigit) = False And String.Equals(m.Trim(), String.Empty) = False).ToList()
Console.Write(String.Join(", ", reslt))
Output:
#Favo, Name, 26.0000, 50.10000, _GRE, City, Road
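If the goal is to pick one field by its position (for example "Name" as the third field), keeping the empty fields makes the positions stable. Here is a small Python sketch of that variant; the helper name `field` and the zero-based indexing are assumptions for illustration.

```python
def field(record: str, index: int, sep: str = "|") -> str:
    """Return the field at a zero-based position, keeping empty fields."""
    return record.split(sep)[index]

record = "2|#Favo|Name||26.0000|50.10000|_GRE|||||City|Road||||"
print(field(record, 2))   # third field
print(field(record, 11))  # twelfth field
```

With empty entries removed instead, the positions of later fields would shift whenever an optional field is filled in.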

Extract Excel string from matched Regular Expression (VBA)

I would like to extract the matched RegExp pattern from a given string in Excel VBA.
For example,
Given this expression:
"[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
from this string:
"CSDT2_EXC_6+000#6+035_JM_150323"
I'd like to get: "6+000#6+035"
But I don't know how to accomplish this.
The nearest I could get was this:
Function getStations(file_name As String)
'Use Regular Expressions for grabbing the input and automatically filter it
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
'This matches the pattern: e.g. 06+900#07+230
.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
End With
If regEx.Test(file_name) Then
strReplace = ""
getStations = regEx.Replace(file_name, strReplace)
Else
getStations = "Hay un problema con el nombre. Por favor, arréglalo"
End If
End Function
But this would bring me the following:
"CSDT2_EXC__JM_150323"
I'd like to only take the matched pattern. How can I achieve this?
Thanks a million for all the replies ;)
You can use this:
Function getStations(file_name As String)
'Use Regular Expressions for grabbing the input and automatically filter it
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
'This matches the pattern: e.g. 06+900#07+230
.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
End With
If regEx.Test(file_name) Then
getStations = regEx.Execute(file_name)(0)
Else
getStations = "Hay un problema con el nombre. Por favor, arréglalo"
End If
End Function
Some minor suggestions to Rory's excellent answer (given you have redundancy in your initial function):
Function getStations(file_name As String) As String
'Use Regular Expressions for grabbing the input and automatically filter it
Dim regEx As Object
Set regEx = CreateObject("vbscript.regexp")
regEx.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
If regEx.Test(file_name) Then
getStations = regEx.Execute(file_name)(0)
Else
getStations = "Hay un problema con el nombre. Por favor, arréglalo"
End If
End Function
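The key point in both answers is to return the match itself rather than the result of Replace. The same idea as a Python sketch, with the hypothetical names `STATIONS` and `get_stations`:

```python
import re

# Same expression as in the question; '#' needs no escaping here.
STATIONS = re.compile(r"[0-9]*\+[0-9]{3}#[0-9]*\+[0-9]{3}")

def get_stations(file_name: str):
    """Return the first substring matching the station pattern, or None."""
    match = STATIONS.search(file_name)
    return match.group(0) if match else None

print(get_stations("CSDT2_EXC_6+000#6+035_JM_150323"))
```

Returning None when nothing matches leaves it to the caller to decide how to report a badly formed name.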

Remove all the String before :

"\:(.*)$"
Hi all, I am using the expression above to remove everything before the : (colon), but it gives me the text before the colon instead. How can I do this? Thanks a lot.
My string is:
This is text: Hi here we go
I am getting: This is text
I want : Hi here we go
Updated code
Sub Main()
Dim input As String = "This is text with : far too much "
Dim pattern As String = "\:(.*)$"
Dim replacement As String = " "
Dim rgx As New Regex(pattern)
Dim result As String = rgx.Replace(input, replacement)
Console.WriteLine("Original String: {0}", input)
' MsgBox("Original String: {0}")
Console.WriteLine("Replacement String: {0}", result)
MsgBox("Original String: {0}")
End Sub
Try this pattern. It will help you match the string after the colon:
/?:(.)/
or
/: (.+)/
It should be:
Dim pattern As String = "(.*)\:"
' In VB, if the pattern above doesn't work, try this one:
' Dim pattern As String = "^(.*)\:"
' Also, I don't think the parentheses are needed here.
This regex matches everything before the colon (:), whereas the pattern in your example matches everything after it.
If you are not dead set on RegEx then you can also use
Dim result As String
result = Strings.Split(Input, ":", 2)(1)
This splits the input into an array with two elements. First element is the text before the first ":", the second element is the text after.
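The same split-at-the-first-colon approach as a short Python sketch. The name `after_first_colon` is made up, and stripping the surrounding whitespace is an assumption based on the desired output "Hi here we go".

```python
def after_first_colon(text: str) -> str:
    """Return the text after the first ':', or the input if there is none."""
    before, sep, after = text.partition(":")
    # Assumption: surrounding whitespace is not wanted in the result.
    return after.strip() if sep else text

print(after_first_colon("This is text: Hi here we go"))
```

Unlike indexing into a split result, `partition` does not fail when the input has no colon.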