Hi all, I am terrible at regex, so I am posting this question in the hope that a regex guru will know the answer and share it.
I have the following string types:
508815 AYBK1619RAUEZP
AWBZ4222TYBE1207CWSWER
DEFAULT EP1 O25R60
I need them split into this format:
508815 AYBK1619 RAU EZP
AWBZ4222 TYBE1207 CWS WER
DEFAULT EP1 O25 R60
So:
xxxxxxxx xxxxxxxx xxx xxx
First 8 characters in string
Next 8 characters in string
Next 3 characters in string
Last 3 characters in string
I could do this with Mid(x, x) and the like, but I figured that using regex would be quicker and make for cleaner-looking code.
Any help would be great! Thanks!
If your desire is to actually use regex to split at those positions, you could use the following:
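' The first field is assumed to be space-padded to 8 characters, matching the fixed-width layout in the question.
Dim s As String = "508815  AYBK1619RAUEZP"
Dim m() As String = Regex.Split(s, "(?<=^.{8})|(?<=^.{16})|(?<=^.{19})")
Console.WriteLine(String.Join(" ", m)) '=> "508815   AYBK1619 RAU EZP"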
You could also just match the substrings at those positions instead of splitting ...
Dim s As String = "AWBZ4222TYBE1207CWSWER"
Dim m As Match = Regex.Match(s, "^(.{8})(.{8})(.{3})(.{3})$")
If m.Success Then
Console.WriteLine(
String.Join(" ",
m.Groups(1).Value,
m.Groups(2).Value,
m.Groups(3).Value,
m.Groups(4).Value
))
End If
' Output => "AWBZ4222 TYBE1207 CWS WER"
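For comparison, the fixed-width split described in the question (8, 8, 3, 3) can also be done without regex, much like the Mid(x, x) approach the asker mentions. The following is only a sketch in JavaScript; the helper name `splitFixedWidth` and its `widths` parameter are made up for illustration, and it assumes every field is padded to its full width.

```javascript
// Split a string into consecutive fixed-width fields.
// `widths` (hypothetical parameter) lists the field sizes in order.
function splitFixedWidth(text, widths) {
  const fields = [];
  let start = 0;
  for (const width of widths) {
    fields.push(text.slice(start, start + width));
    start += width;
  }
  return fields;
}

// Assumption: the input is exactly 22 characters, padded where needed.
const parts = splitFixedWidth("AWBZ4222TYBE1207CWSWER", [8, 8, 3, 3]);
console.log(parts.join(" ")); // AWBZ4222 TYBE1207 CWS WER

// Padded fields can be trimmed afterwards.
const padded = splitFixedWidth("508815  AYBK1619RAUEZP", [8, 8, 3, 3]);
console.log(padded.map((f) => f.trimEnd()).join(" ")); // 508815 AYBK1619 RAU EZP
```

Slicing makes the widths explicit data rather than part of a pattern, which can be easier to change later.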
You can use the following regex to get what you need:
^(\w{0,8})\s*(\w+)\s*(\w{3})(\w{3})$
This regex will:
Match 0 to 8 word characters from the beginning of the string
Followed by 0 or more spaces
Followed by 1 or more word characters
Followed by 0 or more spaces
Followed by 3 word characters
Followed by 3 word characters
End of string
Word characters (\w) are any alphanumeric character plus the underscore. If you strictly want only capital letters, for instance, you can replace \w with the character class [A-Z] (any letter in the range A-Z).
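As a quick check of the pattern above, here is a small JavaScript sketch that runs it against the three sample strings from the question. The function name `splitRecord` is an invented helper, and the samples are taken as written, with single spaces.

```javascript
// The pattern from the answer: up to 8 word chars, optional spaces,
// a run of word chars, optional spaces, then two groups of 3 word chars.
const recordPattern = /^(\w{0,8})\s*(\w+)\s*(\w{3})(\w{3})$/;

// Hypothetical helper: returns the four captured fields, or null if no match.
function splitRecord(line) {
  const m = recordPattern.exec(line);
  return m ? m.slice(1) : null;
}

const samples = [
  "508815 AYBK1619RAUEZP",
  "AWBZ4222TYBE1207CWSWER",
  "DEFAULT EP1 O25R60",
];

for (const sample of samples) {
  console.log(splitRecord(sample).join(" "));
}
// 508815 AYBK1619 RAU EZP
// AWBZ4222 TYBE1207 CWS WER
// DEFAULT EP1 O25 R60
```

In the second sample the greedy `(\w+)` first grabs everything after the first 8 characters and then backtracks by 6 so the two trailing groups can match.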
Without using a gem, I want to write a simple regex that removes the first character from a string if it is a 1 and the string has more than 10 characters in total. I never expect more than 11 characters; 11 should be the max. But if there are 10 characters and the string begins with "1", I don't want to remove it.
str = "19097147835"
str&.remove(/\D/).sub(/^1\d{10}$/, "\1").to_i
Returns 0
I'm looking for it to return "9097147835"
You could use your pattern, but add a capture group around the 10 digits to use the group in the replacement.
\A1(\d{10})\z
For example
str = "19097147835"
puts str.gsub(/\D/, '').sub(/\A1(\d{10})\z/, '\1').to_i
Output
9097147835
Another option is to remove all the non-digits and match the last 10 digits:
\A1\K\d{10}\z
\A Start of string
1\K Match 1 and forget what is matched so far
\d{10} Match 10 digits
\z End of string
str = "19097147835"
str.gsub(/\D/, '').match(/\A1\K\d{10}\z/) do |match|
puts match[0].to_i
end
Output
9097147835
You can use
str.gsub(/\D/, '').sub(/\A1(?=\d{10})/, '').to_i
The regex matches
\A - start of string
1 - a 1
(?=\d{10}) - immediately to the right of the current location, there must be 10 digits.
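The same lookahead idea carries over to other regex flavors. Below is a rough JavaScript sketch; the function name `normalizePhone` is invented for illustration. JavaScript has no `\A` or `\z`, so `^` and `$` are used instead, and the `$` inside the lookahead is an addition of this sketch (not part of the answer's pattern) so that the leading 1 is only dropped when exactly 10 digits follow.

```javascript
// Strip non-digits, then drop a leading "1" only if exactly 10 digits follow.
function normalizePhone(raw) {
  const digits = raw.replace(/\D/g, "");
  return digits.replace(/^1(?=\d{10}$)/, "");
}

console.log(normalizePhone("19097147835"));      // 9097147835
console.log(normalizePhone("1 (909) 714-7835")); // 9097147835
console.log(normalizePhone("1909714783"));       // 1909714783 (10 digits, kept as-is)
```

Because the lookahead does not consume the 10 digits, the replacement string can simply be empty and no capture group is needed.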
Non regex example:
str = str[1..] if (str.start_with?("1") and str.size > 10)
Regexes are powerful, but not easy to maintain.
I am trying to learn regex in order to answer a question on Stack Overflow in Portuguese.
Input (an array or a string in a cell, so .MultiLine = False):
1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With number 0n mid. 4. Number 9 incorrect. 11.12 More than one digit. 12.7 Ending (no word).
Output
1 One without dot.
2. Some Random String.
3.1 With SubItens.
3.2 With number 0n mid.
4. Number 9 incorrect.
11.12 More than one digit.
12.7 Ending (no word).
My idea was to use regex with Split, but I wasn't able to adapt the following example to Excel.
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "plum-pear"
Dim pattern As String = "(-)"
Dim substrings() As String = Regex.Split(input, pattern) ' Split on hyphens.
For Each match As String In substrings
Console.WriteLine("'{0}'", match)
Next
End Sub
End Module
' The method writes the following to the console:
' 'plum'
' '-'
' 'pear'
After reading up on this, I used the RegExr website with the expression /([0-9]{1,2})([.]{0,1})([0-9]{0,2})/igm on the input.
And the following is obtained:
Is there a better way to do this? Is the regex correct, or is there a better way to build it? The examples I found on Google did not make clear how to use regex with Split correctly.
Maybe I am confused about how the Split function works: I wanted it to give me the split positions, with the regex acting as the separator.
I can guarantee that each item ends with a word followed by a period.
Use
\d+(?:\.\d+)*[\s\S]*?\w+\.
Details
\d+ - 1 or more digits
(?:\.\d+)* - zero or more sequences of:
\. - dot
\d+ - 1 or more digits
[\s\S]*? - any 0+ chars, as few as possible, up to the first...
\w+\. - 1+ word chars followed by a literal dot (.).
Here is a sample VBA code:
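Dim str As String
Dim objRegExp As Object, objMatches As Object, m As Object
str = " 1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With Another SubItem. 4. List item. 11.12 More than one digit."
' Late binding; alternatively use New RegExp with a reference to Microsoft VBScript Regular Expressions 5.5
Set objRegExp = CreateObject("VBScript.RegExp")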
objRegExp.Pattern = "\d+(?:\.\d+)*[\s\S]*?\w+\."
objRegExp.Global = True
Set objMatches = objRegExp.Execute(str)
If objMatches.Count <> 0 Then
For Each m In objMatches
Debug.Print m.Value
Next
End If
NOTE
You may require the matches to stop only at a word + . that is followed by 0+ whitespaces and a number (or the end of the string), using \d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$)).
The (?=\s*(?:\d+|$)) positive lookahead requires the presence of 0+ whitespaces (\s*) followed by 1+ digits (\d+) or the end of string ($) immediately to the right of the current location.
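To see the basic pattern from this answer at work outside VBA, here is a small JavaScript sketch that collects the matches instead of splitting. The function name `extractItems` and the shortened sample text are only for illustration.

```javascript
// Each match starts at an index such as "3.1" and runs, as lazily as
// possible, up to the first word that is followed by a dot.
const itemPattern = /\d+(?:\.\d+)*[\s\S]*?\w+\./g;

// Hypothetical helper: returns every numbered item found in the text.
function extractItems(text) {
  return text.match(itemPattern) || [];
}

const listText =
  "1 One without dot. 2. Some Random String. 3.1 With SubItens. 11.12 More than one digit.";

for (const item of extractItems(listText)) {
  console.log(item);
}
// 1 One without dot.
// 2. Some Random String.
// 3.1 With SubItens.
// 11.12 More than one digit.
```

Note that the item "2." works because the lazy `[\s\S]*?` consumes the dot after the index before `\w+\.` finds "String.".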
If VBA's split supported regex with lookahead, this one might work, assuming there is no digit except in the indexes:
\s(?=\d)
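JavaScript's `split` does accept a regex, so the lookahead idea can be tried there. This is only a sketch with made-up sample strings; it also shows the caveat mentioned above, where a digit inside an item causes an unwanted split.

```javascript
// Split at any whitespace character that is immediately followed by a digit.
const separator = /\s(?=\d)/;

const cleanText = "1 One without dot. 2. Some Random String. 3.1 With SubItens.";
console.log(cleanText.split(separator));
// [ '1 One without dot.', '2. Some Random String.', '3.1 With SubItens.' ]

// Caveat: a digit inside the item text also triggers a split.
const digitInside = "4. Number 9 incorrect.";
console.log(digitInside.split(separator));
// [ '4. Number', '9 incorrect.' ]
```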
I want to extract all strings in a sentence that start with a digit, end with a digit, and are 7 characters long. Between the first and last digit, a string can contain digits or letters.
Example: Sample text for testing: 0012345 15R7544 35P2699
I want to get these strings: 0012345, 15R7544, 35P2699
let patt = /\b\d[\da-zA-Z]{5}\d\b/g;
let str = "hello 35P2699 world another 0012345 random";
console.log(str.match(patt));
\b is a word boundary; it makes sure the match starts and ends at the edge of a word.
\d checks for a digit.
[\da-zA-Z]{5} checks for 5 occurrences of a digit or a letter, in between a start and end digit.
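To illustrate what the word boundaries buy you, here is a short sketch that runs the same pattern over the question's sample text plus a few made-up tokens that should be rejected (too long, starting with a letter, or containing a hyphen).

```javascript
const sevenCharPattern = /\b\d[\da-zA-Z]{5}\d\b/g;

const sampleText =
  "Sample text for testing: 0012345 15R7544 35P2699 123456789 A1234567 12-34567";

// Only the three 7-character tokens that start and end with a digit match.
console.log(sampleText.match(sevenCharPattern));
// [ '0012345', '15R7544', '35P2699' ]
```

Without the `\b` anchors, the pattern would also pick 7-character pieces out of longer tokens such as 123456789.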
In Perl, you can place all of them in an array like this:
my @strings = $str =~ /(\b\d\w{5}\d\b)/g;
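Try the following regexp. You would need to trim each match, because it includes the leading whitespace. The trailing whitespace is only checked with a lookahead, so items separated by a single space are all found.
(^|\s)\d\w{5}\d(?=$|\s)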
I have a question about Classic ASP: how do I use a regex to validate that a string's first 3 characters are uppercase letters and its last 4 characters are digits?
For example:
Dim myString : myString = "abc1234"
How do I validate that it should be "ABC1234" instead of "abc1234"?
Apologies for my broken English and for being a newbie in Classic ASP.
@ndn has a good regex pattern for you. To apply it in Classic ASP, you just need to create a RegExp object that uses the pattern and then call the Test() function to test your string against the pattern.
For example:
Dim re
Set re = New RegExp
re.Pattern = "^[A-Z]{3}.*[0-9]{4}$" ' @ndn's pattern
If re.Test(myString) Then
' Match. First three characters are uppercase letters and last four are digits.
Else
' No match.
End If
^[A-Z]{3}.*[0-9]{4}$
Explanation:
Surround everything with ^ and $ (start and end of string) to ensure the pattern has to match the whole string
[A-Z] - gives you all capital letters in the English alphabet
{3} - three of those
.* - optionally, there can be something in between (if there can't be, you can just remove this)
[0-9] - any digit
{4} - 4 of those
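The effect of the optional `.*` in the middle can be checked with a few test strings. This JavaScript sketch compares the pattern as given with a stricter variant that drops `.*`; the names `loosePattern` and `strictPattern` and the test strings are made up for illustration.

```javascript
// Pattern from the answer: anything may sit between the letters and digits.
const loosePattern = /^[A-Z]{3}.*[0-9]{4}$/;
// Variant without ".*": exactly 3 uppercase letters followed by 4 digits.
const strictPattern = /^[A-Z]{3}[0-9]{4}$/;

console.log(loosePattern.test("ABC1234"));       // true
console.log(loosePattern.test("abc1234"));       // false (lowercase letters)
console.log(loosePattern.test("ABC-xyz-1234"));  // true  (".*" allows the middle part)
console.log(strictPattern.test("ABC-xyz-1234")); // false
console.log(strictPattern.test("ABC1234"));      // true
```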
I need help with finding the first 3 capital letters A-Z and then a space followed by 4 numbers 0-9.
Dim IndividualClasses As MatchCollection = Regex.Matches(AllExitClasses(a), "([A-Z])([A-Z])([A-Z]) ([0-9])([0-9])([0-9])([0-9])")
An example input string would be AML 4309 or DEF 4298.
The above 7 characters are what I want to get out of string.
EDIT: Since you preprocess your input string, you can use
Dim IndividualClasses As MatchCollection = Regex.Matches(AllExitClasses(a).Replace(" ", "-"), "[A-Z]{3}[-][0-9]{4}")
REGEX EXPLANATION:
[A-Z]{3} - 3 occurrences of English letters A to Z
[-] - A character class matching exactly one hyphen
[0-9]{4} - Exactly 4 occurrences of digits from 0 to 9.
Note that I removed capturing groups since you do not seem to be using them at all, and I am using limiting quantifiers, e.g. {4}.
Note that you could use your input string as is and previous regex [A-Z]{3}\p{Zs}[0-9]{4}, but you would need to iterate through the match collection and replace a space in each Match.Value with a hyphen creating a new array.
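The alternative described in the last note (match on the original string, then replace the space in each match) might look roughly like this in JavaScript. The function name `extractClassCodes` and the sample text are invented; `\p{Zs}` needs the `u` flag in JavaScript.

```javascript
// Find every "AAA 1111" style code, then swap the space for a hyphen
// in each match, leaving the input string itself untouched.
function extractClassCodes(text) {
  const matches = text.match(/[A-Z]{3}\p{Zs}[0-9]{4}/gu) || [];
  return matches.map((code) => code.replace(/\p{Zs}/u, "-"));
}

console.log(extractClassCodes("Exit classes: AML 4309, DEF 4298, xyz 1234"));
// [ 'AML-4309', 'DEF-4298' ]
```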
OK, I replaced the spaces with a dash,
and then I am using this regular expression:
"([A-Z])([A-Z])([A-Z])([-])([0-9])([0-9])([0-9])([0-9])"
which works:
AllExitClasses(a) = AllExitClasses(a).Replace(" ", "-")
'
MyClassString = AllExitClasses(a).ToString
Dim IndividualClasses As MatchCollection = Regex.Matches(MyClassString, "([A-Z])([A-Z])([A-Z])([-])([0-9])([0-9])([0-9])([0-9])")
Regex.Matches([variable], "^([A-Z]{3,3})(\s)([0-9]{4,4})$")
This regex will find your AAA 1111 (3 uppercase letters with [A-Z]{3,3}; one whitespace character with (\s); and exactly 4 digits with ([0-9]{4,4})). Because of the ^ and $ anchors, it only matches when the whole string is the class code. I have found that http://regex101.com helps a lot with expressions in different languages.