Regular expression to match n times in which n is not fixed - regex

The pattern I want to match is a sequence of length n where n is right before the sequence.
For example, when the input is "1aaaaa", I want to match the single character "a", as the first number specifies only 1 character is matched.
Similarly, when the input is "2aaaaa", I want to match the first two characters "aa" but not the rest, as the number 2 specifies that two characters are matched.
I understand that a{1} and a{2} will match "a" one or two times. But how do I match a{n} when n is not fixed?
Is it possible to do this type of match using regular expressions?

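This works when each count is a single digit followed by a run of lowercase letters.
import re
a = "1aaa2bbbbb1cccccccc4dddddddddddd"
for b in re.findall(r'\d[a-z]+', a):
    n = int(b[0])        # the leading digit is the count
    print(b[1:1 + n])    # the n letters that follow the digit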
Output:
a
bb
c
dddd
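As far as I know, Python's re module (like most regex engines) cannot use a captured number as a repetition count inside the same pattern, so the count has to be applied in code. A minimal sketch, assuming counts may have more than one digit and the letters are lowercase; the helper name take_counted is made up for illustration:

```python
import re

def take_counted(text):
    """For each '<count><letters>' chunk, keep only the first <count> letters."""
    results = []
    for m in re.finditer(r'(\d+)([a-z]+)', text):
        n = int(m.group(1))              # the count, possibly multi-digit
        results.append(m.group(2)[:n])   # slicing never reads past the run
    return results

print(take_counted("1aaa2bbbbb1cccccccc4dddddddddddd"))
# ['a', 'bb', 'c', 'dddd']
```

If a run is shorter than its count, the slice simply returns the letters that are there, so a stricter version would need to compare lengths.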

Although I did this in Java, it should help you get going in your own program.
Here you take the first character of the input string as a substring and use it as the repetition count in your regex.
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class DynamicRegex {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter a string: ");
        String str = scan.nextLine();
        String testStr = str.substring(0, 1); // Get the first character of the string (the count).
        String pattern = "a{" + testStr + "}"; // Use it as the repetition count in the regex.
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(str);
        if (m.find()) {
            System.out.println(m.group());
        }
    }
}
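The same two-step idea (read the count, then build the quantifier from it) can be sketched in Python. This is only an illustration: the function name match_counted is hypothetical, and it assumes the count is a run of digits at the start of the string followed by lowercase letters.

```python
import re

def match_counted(text):
    """Read the leading count, then match exactly that many letters after it."""
    head = re.match(r'\d+', text)
    if head is None:
        return None                       # no leading count
    n = int(head.group())
    # Build the quantifier {n} at run time from the number just read.
    body = re.compile(r'[a-z]{%d}' % n).match(text, head.end())
    return body.group() if body else None

print(match_counted("1aaaaa"))  # a
print(match_counted("2aaaaa"))  # aa
```

Because the pattern is compiled after the count is known, an input with fewer letters than the count yields no match rather than a partial one.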

Related

How to use Regex to parse variable length string in Google Application Scripts

I am trying to extract the alphanumeric parts of a variable-length string with a known pattern using a regex in Google Apps Script. The pattern repeats n times as follows (the Xs are groups of alphanumeric characters):
XX-XXX-X-XX-XX-........ for example ABC-AB or ABCD-AB-ABC-AA
I want to extract the alphanumeric parts into an array if possible, like e[0] = ABCD e[1] = AB e[2] = ABC .....
I tried a repeated \w+, but that requires knowing the number of groups in advance. Is there a way for a regex to process strings of varying size? See my example code below:
var data1 = 'ABC-AB';
var data2 = 'ABCD-AB-ABCD-AA';
var regex1 = new RegExp(/(\w+)-(\w+)/);
var regex2 = new RegExp(/(\w+)/);
e = regex1.exec(data1); //stores ABC and AB as separate array elements.
This is fine, but it won't work on a longer string:
e = regex2.exec(data2); //stores ABCD only as a single array element "ABCD"
To match kebab-case text of any length:
var regex1 = new RegExp(/\w+(-\w+)*/);
For each match found, split the result on dashes to get your array:
var found = regex1.exec(data2)[0];
var array = found.split("-");
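For comparison, here is a small Python sketch of the same validate-then-split approach. The name split_parts is hypothetical, and note that \w also accepts underscores, which may or may not be wanted for this data.

```python
import re

# One or more word-character groups separated by single dashes.
KEBAB = re.compile(r'\w+(?:-\w+)*')

def split_parts(s):
    """Return the dash-separated groups, or [] if s does not fit the pattern."""
    if KEBAB.fullmatch(s) is None:
        return []
    return s.split('-')

print(split_parts('ABCD-AB-ABCD-AA'))  # ['ABCD', 'AB', 'ABCD', 'AA']
print(split_parts('ABC-AB'))           # ['ABC', 'AB']
```

The regex only checks the overall shape; the split does the actual extraction, so the number of groups never has to be known in advance.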

Replace 2 step Regex with 1 step Regex to get one upper case letter between underscores

I have a string, myFile, that looks like: Name_2019-11-29_D_HPSeries.txt. I need to extract the letter D between the underscores...the letter could be any uppercase letter. Right now I am using a 2 step Regex code.
Dim bC As String = Regex.Match(myFile, "_[A-Z]+_").ToString
boatClass = Regex.Match(bC, "[A-Z]+").ToString
This works but I believe it could be done with one line. I tried the code below but it doesn't work.
boatClass = Regex.Replace(myFile, "_[A-Z]_", "[A-Z]").ToString
You can use positive lookarounds to avoid a 2-step process, checking that the characters before and after the letter are underscores without capturing them:
Dim myFile As String = "Name_2019-11-29_D_HPSeries.txt"
Dim bC As String = Regex.Match(myFile, "(?<=_)[A-Z](?=_)").ToString
Console.WriteLine(bC)
Output:
D
You were almost there with a single char A-Z, but you could wrap it in a capturing group and then use the Match.Groups property.
_([A-Z])_
For example
Dim myFile As String = "Name_2019-11-29_D_HPSeries.txt"
Dim bC As String = Regex.Match(myFile, "_([A-Z])_").Groups(1).Value
Console.WriteLine(bC)
Result
D
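Both approaches from the answers above (lookarounds and a capturing group) can be compared side by side in a short Python sketch. The variable names are illustrative only; the file name is the one from the question.

```python
import re

my_file = "Name_2019-11-29_D_HPSeries.txt"

# Lookarounds: the underscores are checked but are not part of the match.
by_lookaround = re.search(r'(?<=_)[A-Z](?=_)', my_file)

# Capturing group: the underscores are matched, the letter is read from group 1.
by_group = re.search(r'_([A-Z])_', my_file)

print(by_lookaround.group())  # D
print(by_group.group(1))      # D
```

Either way, a single uppercase letter is required between the underscores, so "_2019-11-29_" is skipped.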

Regex for string *11F23H3*: Start and end with *, 7 Uppercase literals or numbers in between

I need to check strings like *11F23H3* that start and end with a * and have 7 uppercase letters or digits in between. So far I have:
if (!barcode.match('[*A-Z0-9*]')) {
    console.error(`ERROR: Barcode not valid`);
    process.exitCode = 1;
}
But this does not reject strings like *11111111111*. What would the correct regex look like?
You can use this regex:
/\*[A-Z0-9]{7}\*/
* is a regex metacharacter that needs to be escaped outside a character class
[A-Z0-9]{7} matches exactly 7 characters, each an uppercase letter or a digit
Code:
var re = /\*[A-Z0-9]{7}\*/;
if (!re.test(barcode)) {
    console.error(`ERROR: Barcode ${barcode} in row ${row} is not valid`);
    process.exitCode = 1;
}
Note that if barcode should contain only this string, you should also use anchors, like this, so that no other text is allowed on either side of the asterisks:
var re = /^\*[A-Z0-9]{7}\*$/;
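The effect of the anchors can be checked with a short Python sketch. Here re.search plays the role of the unanchored test and fullmatch the role of ^...$; the helper name is_valid is made up for illustration.

```python
import re

BARCODE = re.compile(r'\*[A-Z0-9]{7}\*')

def is_valid(barcode):
    """True only if the whole string is '*', 7 uppercase letters/digits, '*'."""
    return BARCODE.fullmatch(barcode) is not None

print(is_valid('*11F23H3*'))       # True
print(is_valid('*11111111111*'))   # False: 11 characters between the asterisks
print(is_valid('x*11F23H3*y'))     # False: extra text around the barcode
print(BARCODE.search('x*11F23H3*y') is not None)  # True: unanchored search still finds it
```

The last line shows why the unanchored pattern alone is not enough for validation.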

.Net Regular Expression(Regex)

VB.NET separate strings using regex split?
I'm getting a logical error with the pattern string variable; the error occurs after I extend the string from "(-)" to "(-)(+)(/)(*)".
Dim input As String = txtInput.Text
Dim pattern As String = "(-)(+)(/)(*)"
Dim substrings() As String = Regex.Split(input, pattern)
For Each match As String In substrings
    lstOutput.Items.Add(match)
Next
This is my output when my pattern string variable is "-"; it works fine:
input: dog-
output: dog
-
My desired output (this is what I want to happen) is shown below, but something is wrong with the code: I get an error once I use "(-)(+)(/)(*)", and even with this:
"(-)" + "(+)" + "(/)" + "(*)"
input: dog+cat/tree
output: dog
+
cat
/
tree
When the input from the textbox contains a space character, the listbox should show:
input: dog+cat/ tree
output: dog
+
cat
/
tree
You need a character class, not a sequence of subpatterns inside separate capturing groups:
Dim pattern As String = "([+/*-])"
This pattern will match and capture into Group 1 (and thus, all the captured values will be part of the resulting array) a char that is either a +, /, * or -. Note the position of the hyphen: since it is the last char in the character class, it is treated as a literal -, not a range operator.
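Python's re.split treats a capturing group the same way, so the behaviour is easy to try out. A minimal sketch; the function name split_keep_ops is hypothetical:

```python
import re

def split_keep_ops(text):
    """Split on +, /, * or -, keeping each operator as its own list item."""
    # The capturing group makes re.split include the delimiters in the result.
    # The hyphen is last in the class, so it is a literal '-', not a range.
    return re.split(r'([+/*-])', text)

print(split_keep_ops("dog+cat/tree"))   # ['dog', '+', 'cat', '/', 'tree']
print(split_keep_ops("dog+cat/ tree"))  # ['dog', '+', 'cat', '/', ' tree']
```

Spaces are left untouched, so " tree" keeps its leading space; a delimiter at the very end of the input produces a trailing empty string in the list.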

How to format a string to replace all existing number inside a string to prefix with leading zero using regex

Does anyone know how to use a regex to convert a string of characters and digits so that each digit in the string is prefixed with leading zeros?
E.g. ABC123 -> ABC000100020003
BCD02 -> BCD00000002
CD1A2 -> CD0001A0002
i.e. each occurrence of a digit is padded with leading zeros (4 digits in total per occurrence).
Other characters to remain the same.
Search for /(\d)/g
and replace with 000\1;
that will do it.
demo here : http://regex101.com/r/aB8iE9
javascript demo here:
var str = "ABC123";
var res = str.replace(/(\d)/g, '000$1');
console.log(res);
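The same replacement can be sketched in Python with re.sub, checked against the three examples from the question. The function name pad_digits is hypothetical.

```python
import re

def pad_digits(s):
    """Prefix every single digit with three zeros, leaving other characters alone."""
    # \g<1> is the captured digit; it is the unambiguous form of \1 in Python.
    return re.sub(r'(\d)', r'000\g<1>', s)

print(pad_digits("ABC123"))  # ABC000100020003
print(pad_digits("BCD02"))   # BCD00000002
print(pad_digits("CD1A2"))   # CD0001A0002
```

Each digit is handled on its own, which is what the examples ask for; a run such as "123" is not treated as one number.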