Match all consecutive numbers of length n [duplicate] - regex

Where n=4 in my example.
I'm very new to Regex and have searched for 20 minutes now. There are some helpful websites out there that simplify things but I can't work out how to proceed with this.
I wish to extract every combination of 4 consecutive digits from this:
12345
to get:
1234 - possible with ^\d{4}/g - Starts at the beginning
2345 - possible with \d{4}$/g - Starts at the end
But I can't get both! The input could be any length.

Your expression isn't working as expected because those two substrings overlap.
Apart from zero-length assertions, every character covered by a match is consumed, so the next match can only start after it. That is why the overlapping matches are not found.
You can work around this with a lookahead that contains a capturing group. Lookahead and lookbehind assertions are zero-length: they test what follows (or precedes) without consuming it, so the engine can retry at every position and the group captures each overlapping substring.
(?=(\d{4}))
Here is a quick snippet demonstrating this:
var regex = /(?=(\d{4}))/g;
var input = '12345678';
var match;
while ((match = regex.exec(input)) !== null) {
  // A zero-length match does not advance lastIndex, so move it forward manually
  if (match.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  console.log(match[1]);
}

You can use a lookahead with a capturing group:
(?=(\d{4}))
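The same lookahead idea can be sketched for an arbitrary n. The helper name overlappingDigitRuns is made up for illustration, and the sketch assumes an engine that supports String.prototype.matchAll, which steps past zero-length matches on its own:

```javascript
// Hypothetical helper: returns every run of n consecutive digits,
// including runs that overlap each other.
function overlappingDigitRuns(input, n) {
  // The lookahead consumes nothing, so every position is tried in turn;
  // the capture group inside it holds the n digits found at that position.
  const pattern = new RegExp(`(?=(\\d{${n}}))`, 'g');
  return Array.from(input.matchAll(pattern), (m) => m[1]);
}

console.log(overlappingDigitRuns('12345', 4)); // ['1234', '2345']
```

Because matchAll advances past empty matches itself, the manual lastIndex bump needed with exec is not required here.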

Use a look ahead assertion with all the possibilities
(?=(0123|1234|2345|3456|4567|5678|6789))
(?=
( # (1 start)
0123
| 1234
| 2345
| 3456
| 4567
| 5678
| 6789
) # (1 end)
)
Output
** Grp 0 - ( pos 0 , len 0 ) EMPTY
** Grp 1 - ( pos 0 , len 4 )
1234
------------------
** Grp 0 - ( pos 1 , len 0 ) EMPTY
** Grp 1 - ( pos 1 , len 4 )
2345
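Note that this answer reads "consecutive" as ascending digits (0123, 1234, ...), which is narrower than \d{4}. As a rough sketch, the alternation could be generated for other lengths instead of typed by hand; the helper name ascendingRunPattern is an assumption, not part of the answer:

```javascript
// Hypothetical helper: builds (?=(0123|1234|...)) for ascending runs
// of length n, where n is assumed to be between 1 and 10.
function ascendingRunPattern(n) {
  const digits = '0123456789';
  const alternatives = [];
  for (let start = 0; start + n <= digits.length; start++) {
    alternatives.push(digits.slice(start, start + n));
  }
  return new RegExp(`(?=(${alternatives.join('|')}))`, 'g');
}

const ascendingRuns = Array.from('12345'.matchAll(ascendingRunPattern(4)), (m) => m[1]);
console.log(ascendingRuns); // ['1234', '2345']
```

An input such as '1299' yields no matches with this pattern, whereas (?=(\d{4})) would return '1299'.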


Regex quantifier more than one group

I need a regex that matches a run of 1s followed by a run of 0s, where the total number of digits equals a max length. Is there a way to do something like (([1]+)([0]+)){maxLength} ?
Ex.:
maxLength = 7
10 -> should not pass (total length < maxLength)
1111100 -> should match
1000000 -> should match
11110000000 -> should match 1111000.
111111111111 -> should match 1111111.
Plus: The sequence could be 0 followed by 1, and the greater the amount of 1 the better (I don't know if it's possible in only one regex).
000000001111 -> should get 0001111.
I'm focusing on 1 followed by 0.
I started with [1]+[0]+,
then I quantified the 0s with ([1]+)([0]{1,7}),
but it still gives more 0s than I want.
Then I thought of ([1]{7,}|[1]{6}[0]{1}|[1]{5}[0]{2}|[1]{4}[0]{3}|[1]{3}[0]{4}|[1]{2}[0]{5}|[1]{1}[0]{6}),
and that works. But if maxLength = 100, the above solution is not viable.
Is there some way to count the length of the first matched group and then the second group to be the difference from the first one?
Or something like (([1]+)([0]+)){7} ?
My attempt using branch reset group:
0*(?|(1[10]{6})|([10]{6}1))
You can use the result from the 1st capture group.
0* - Zero or more literal zeros (greedy), up to;
(?| - Open branch reset group:
(1[10]{6}) - 1st capture group holding a literal 1 followed by 6 ones or zeros.
| - Or:
([10]{6}1) - 1st capture group holding 6 ones or zeros followed by a literal 1.
) - Close branch reset group.
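Branch reset groups come from PCRE and Perl; JavaScript's RegExp does not support (?|...). A rough JavaScript equivalent, assuming maxLength = 7 as in the question, uses a plain alternation and reads whichever of the two groups took part in the match. The function name extractWindow is made up for this sketch:

```javascript
// Hypothetical helper: emulates the branch reset pattern in JavaScript.
function extractWindow(str) {
  const m = str.match(/0*(?:(1[10]{6})|([10]{6}1))/);
  // Only one of the two groups participates; the other one is undefined.
  return m ? (m[1] || m[2]) : null;
}

console.log(extractWindow('11110000000'));  // '1111000'
console.log(extractWindow('000000001111')); // '0001111'
console.log(extractWindow('10'));           // null
```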
It seems you just want:
^(?:(?=1+0*$)|(?=0+1*$))[01]{7}
Here the {7} can be replaced with whatever the max length is.
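A small sketch of how that pattern might be built for a variable max length. It assumes the quantifier is the window length itself (7 for the question's examples), and the function name leadingWindow is hypothetical:

```javascript
// Hypothetical helper: builds the lookahead pattern for a given max length.
function leadingWindow(str, maxLength) {
  // The lookaheads require the whole string to be 1s then 0s, or 0s then 1s;
  // [01]{maxLength} then takes the first maxLength characters.
  const pattern = new RegExp(`^(?:(?=1+0*$)|(?=0+1*$))[01]{${maxLength}}`);
  const m = str.match(pattern);
  return m ? m[0] : null;
}

console.log(leadingWindow('11110000000', 7));  // '1111000'
console.log(leadingWindow('10', 7));           // null
console.log(leadingWindow('000000001111', 7)); // '0000000'
```

Unlike the branch reset attempt, this always takes the leading characters, so for 000000001111 it returns 0000000 rather than the 0001111 the question prefers.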
I think the regex can be as simple as:
/0*([01]{7})/
example:
const result = `
10
1111100
1000000
11110000000
111111111111
000000001111
`.split("\n").reduce((acc, str) => {
  // Skip leading zeros, then capture the next 7 binary digits
  const m = str.match(/0*([01]{7})/);
  m && acc.push(m[1]);
  return acc;
}, []);
console.log(result);

Regex Replace to Remove Spaces and Change Comma to Period

I have the following values:
10 000,00
10 000,00
750,00
750,00
1 000 000,00
1 000 000,00
and need the following results:
10000.00
10000.00
750.00
750.00
1000000.00
1000000.00
I've managed to do this in 2 steps; first by replacing , with . and then by regex replacing [^0-9.] with nothing.
How can I achieve this in 1 regex replace step?
I couldn't think of any generic approach that would be useful in every situation, but there is a regex that accomplishes what you want: ( ?(\d+))?( ?(\d+))?( ?(\d+)),
let string = `10 000,00
10 000,00
750,00
750,00
1 000 000,00
1 000 000,00
1,00
10,00`;
let result = string.replace(/( ?(\d+))?( ?(\d+))?( ?(\d+)),/g, '$2$4$6.');
console.log(result);
The explanation is pretty simple (I'll only explain the last part):
(       # Capturing group
   ?    # Optional space (zero or one)
  (     # Nested capturing group
    \d+ # One or more digits
  )
)
,       # Matches a literal comma
The pattern is repeated three times because the largest value in the sample string has three groups of digits. It won't capture larger values; for that you'd need to repeat the pattern ( ?(\d+))? according to your needs (note the ? at the end, which makes the group optional and lets you keep matching smaller values).
For the replacement, you select only the inner capturing groups, which in this case are $2, $4 and $6; if the pattern grows you'd continue with $8, $10 and so on. Then you insert a dot at the end of the replacement: $2$4$6. and that's it.
I am not sure which language you're using,
but in JS this can be achieved with the callback function that the replace method accepts.
let str = `10 000,00
10 000,00
750,00
750,00
1 000 000,00
1 000 000,00`
let op = str.replace(/( +)|(,)/g, function (match, g1, g2) {
  if (g1 && g1.length) {
    // A run of spaces: remove it
    return '';
  } else {
    // Otherwise the comma matched: turn it into a period
    return '.';
  }
});
console.log(op);

IBAN Regex design [duplicate]

Please help me design a regex that matches IBANs with all possible whitespace. I found the one below, but it does not work with whitespace.
[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{4}[0-9]{7}([a-zA-Z0-9]?){0,16}
I need at least these formats:
DE89 3704 0044 0532 0130 00
AT61 1904 3002 3457 3201
FR14 2004 1010 0505 0001 3
To just find the example IBANs from those countries in a text:
Start with 2 letters, then 2 digits.
Then allow an optional space before every group of 4 digits, optionally ending with 1 or 2 digits:
\b[A-Z]{2}[0-9]{2}(?:[ ]?[0-9]{4}){4}(?!(?:[ ]?[0-9]){3})(?:[ ]?[0-9]{1,2})?\b
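As a quick illustration, the search pattern can be run over a piece of text. The sentence below is made up for the sketch; the IBANs in it are the three samples from the question:

```javascript
// Pattern copied from the answer above, with the g flag to collect every hit.
const ibanInText = /\b[A-Z]{2}[0-9]{2}(?:[ ]?[0-9]{4}){4}(?!(?:[ ]?[0-9]){3})(?:[ ]?[0-9]{1,2})?\b/g;

// Made-up sentence embedding the three sample IBANs.
const sampleText = 'Transfer from DE89 3704 0044 0532 0130 00 to AT61 1904 3002 3457 3201, then FR14 2004 1010 0505 0001 3.';

const foundIbans = sampleText.match(ibanInText) || [];
console.log(foundIbans);
// ['DE89 3704 0044 0532 0130 00', 'AT61 1904 3002 3457 3201', 'FR14 2004 1010 0505 0001 3']
```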
Note that if the intention is to validate a complete string, the regex can be simplified,
since the negative lookahead (?!...) is then no longer needed.
The word boundaries \b can also be replaced by the start ^ and end $ anchors.
^[A-Z]{2}[0-9]{2}(?:[ ]?[0-9]{4}){4}(?:[ ]?[0-9]{1,2})?$
Also, it can be simplified even more if having the 4 groups of 4 connected digits doesn't really matter.
^[A-Z]{2}(?:[ ]?[0-9]){18,20}$
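A short check of the two whole-string patterns against the three formats from the question. The variable names are chosen for this sketch only:

```javascript
// The two simplified whole-string patterns from the answer.
const groupedIban = /^[A-Z]{2}[0-9]{2}(?:[ ]?[0-9]{4}){4}(?:[ ]?[0-9]{1,2})?$/;
const looseIban = /^[A-Z]{2}(?:[ ]?[0-9]){18,20}$/;

const sampleIbans = [
  'DE89 3704 0044 0532 0130 00', // 20 digits after the country code
  'AT61 1904 3002 3457 3201',    // 18 digits
  'FR14 2004 1010 0505 0001 3',  // 19 digits
];

for (const iban of sampleIbans) {
  console.log(iban, groupedIban.test(iban), looseIban.test(iban)); // true true for each
}
console.log(looseIban.test('DE89 3704')); // false: too few digits
```

Both patterns only check the shape of the string; neither verifies the checksum.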
Extra
What if you need to match IBANs from across the world?
Then the BBAN part of the IBAN may have up to 30 digits or uppercase letters.
It can be written with spaces, dashes, or nothing in between.
For example: CC12-XXXX-12XX-1234-5678-9012-3456-7890-123
So the regex pattern to match a complete string with a long IBAN becomes a bit longer.
^([A-Z]{2}[ \-]?[0-9]{2})(?=(?:[ \-]?[A-Z0-9]){9,30}$)((?:[ \-]?[A-Z0-9]{3,5}){2,7})([ \-]?[A-Z0-9]{1,3})?$
Also note that a pure regex solution can't do calculations,
so actually validating an IBAN requires extra code.
Example JavaScript snippet:
function smellsLikeIban(str) {
  return /^([A-Z]{2}[ \-]?[0-9]{2})(?=(?:[ \-]?[A-Z0-9]){9,30}$)((?:[ \-]?[A-Z0-9]{3,5}){2,7})([ \-]?[A-Z0-9]{1,3})?$/.test(str);
}

function validateIbanChecksum(iban) {
  const ibanStripped = iban.replace(/[^A-Z0-9]+/gi, '') // keep digits and letters only
    .toUpperCase(); // the calculation expects upper-case
  const m = ibanStripped.match(/^([A-Z]{2})([0-9]{2})([A-Z0-9]{9,30})$/);
  if (!m) return false;
  // Move the country code and check digits to the end, then
  // replace upper-case letters by the numbers 10 to 35.
  const numbericed = (m[3] + m[1] + m[2]).replace(/[A-Z]/g, function (ch) {
    return (ch.charCodeAt(0) - 55);
  });
  // The resulting number would be too long for JavaScript to handle without losing precision,
  // so the trick is to chop the string up into smaller parts.
  const mod97 = numbericed.match(/\d{1,7}/g)
    .reduce(function (total, curr) { return Number(total + curr) % 97; }, '');
  return (mod97 === 1);
}

var arr = [
  'DE89 3704 0044 0532 0130 00', // ok
  'AT61 1904 3002 3457 3201', // ok
  'FR14 2004 1010 0505 0001 3', // wrong checksum
  'GB82-WEST-1234-5698-7654-32', // ok
  'NL20INGB0001234567', // ok
  'XX00 1234 5678 9012 3456 7890 1234 5678 90', // only smells ok
  'YY00123456789012345678901234567890', // only smells ok
  'NL20-ING-B0-00-12-34-567', // stinks, but still a valid checksum
  'XX22YYY1234567890123', // wrong checksum again
  'droid#i.ban' // This Is Not The IBAN You Are Looking For
];
arr.forEach(function (str) {
  console.log('[' + str + '] Smells Like IBAN: ' + smellsLikeIban(str));
  console.log('[' + str + '] Valid IBAN Checksum: ' + validateIbanChecksum(str));
});
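As an alternative to chopping the digit string into 7-digit chunks, engines that support BigInt can compute the remainder directly. This is only a sketch under that assumption; the function name ibanMod97 is made up and is not part of the answer above:

```javascript
// Hypothetical helper: mod-97 remainder of an IBAN using BigInt.
// Assumes the input contains at least the 4 leading characters of an IBAN.
function ibanMod97(iban) {
  const stripped = iban.replace(/[^A-Z0-9]/gi, '').toUpperCase();
  // Move country code and check digits (first 4 characters) to the end.
  const rearranged = stripped.slice(4) + stripped.slice(0, 4);
  // Replace letters A..Z by 10..35.
  const digits = rearranged.replace(/[A-Z]/g, (ch) => String(ch.charCodeAt(0) - 55));
  return BigInt(digits) % 97n;
}

console.log(ibanMod97('DE89 3704 0044 0532 0130 00') === 1n); // true
console.log(ibanMod97('GB82-WEST-1234-5698-7654-32') === 1n); // true
```

A remainder of 1 means the check digits are consistent. The shape checks (country code, length) from the snippet above would still be needed separately.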
Here is a suggestion that may work for the patterns you provided:
[A-Z]{2}\d{2} ?\d{4} ?\d{4} ?\d{4} ?\d{4} ?[\d]{0,2}
Explanation
[A-Z]{2}\d{2} ? 2 capital letters followed by 2 digits (optional space)
\d{4} ? 4 digits, repeated 4 times (optional space)
[\d]{0,2} 0 to 2 digits
You can use a regex like this:
^[A-Z]{2}\d{2} (?:\d{4} ){3}\d{4}(?: \d\d?)?$
This will match only those string formats
It's probably best to look up the specifications for a correct IBAN number. But if you want to have a regex similar to your existing one, but with spaces, you can use the following one:
^[a-zA-Z]{2}[0-9]{2}\s?[a-zA-Z0-9]{4}\s?[0-9]{4}\s?[0-9]{3}([a-zA-Z0-9]\s?[a-zA-Z0-9]{0,4}\s?[a-zA-Z0-9]{0,4}\s?[a-zA-Z0-9]{0,4}\s?[a-zA-Z0-9]{0,3})?$
Here is a live example: https://regex101.com/r/ZyIPLD/1

Regex with percentage and int

Hi, just a quick one: does anyone know a good regular expression for a percentage or one other number? I want to use it in an XML Schema.
Should Match:
-1
100.00
20.00
20.0
10.0
20
99.0
66.4
0.00
So it should match a percentage OR -1.
My approach doesn't work:
([/-1]{1}|\d{1,3}\.\d{1,2})
Thanks!
(-1\n|\b(100|\d{1,2})(\n|(\.\d{1,2})))
Explanation:
(-1\n|            // when it is not a percentage OR ...
  \b              // word boundary - no other word characters directly in front
  (100|           // when the integer part is equal to 100 OR ...
    \d{1,2}       // when the integer part is a number between 0 and 99
  )               // then the integer part must be followed by:
  (\n|            // a newline OR ...
    (\.\d{1,2})   // a dot AND a fractional part of 1 or 2 digits
  )
)
For regular expressions I usually use and suggest MDN as a reference.
That being said, if I understand what you are trying to do, this would work for you:
/(?=\s+|^)(?:-1|100(?:\.0+)?|\d{1,2}(?:\.\d{1,})?)(?=\s+)/gm
This would match strings that have nothing or white-spaces before and after
(?=\s+|^) content (?=\s+)
You can optionally alter that to ^(?: content)$ if you want each number to be the only thing on each line.
Where content is any of:
-1 ( -1 )
100 optionally followed by "." and 1 or more 0s ( 100(?:\.0+)? )
1 or 2 digits optionally followed by "." and 1 or more decimals ( \d{1,2}(?:\.\d{1,})? )
You could alter the ending of {1,} to {1,X} where X is the max number of decimals you want to match.
For matching results check RegExr
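A small sketch of the anchored ^(?:content)$ variant, run against the values from the question. The rejected values are made up for illustration, and the check is done in JavaScript only; the XML Schema regex flavor differs, so the pattern may need adapting there:

```javascript
// Anchored variant: -1, or 100 with optional zero decimals,
// or 0-99 with optional decimals.
const percentOrMinusOne = /^(?:-1|100(?:\.0+)?|\d{1,2}(?:\.\d+)?)$/;

const shouldMatch = ['-1', '100.00', '20.00', '20.0', '10.0', '20', '99.0', '66.4', '0.00'];
const shouldFail = ['101', '100.5', '-2', '20.']; // made-up counterexamples

console.log(shouldMatch.every((s) => percentOrMinusOne.test(s))); // true
console.log(shouldFail.some((s) => percentOrMinusOne.test(s)));   // false
```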

Search a line in a string for a number greater than 0 [duplicate]

I have a std::string, with a text as such
name0 0x3f700000 0x160000 1 0
or
name1 0x3f700000 0x760000 0 23
etc..
What I would like to know is whether the last number in this line is greater than 0 and the number before it is 1.
I tried this, but it doesn't work: it always returns a match.
std::regex_search(buffer, match, std::regex(std::string("(^|\n)") +
m_name + " [0-9a-fA-Fx]* [0-9a-fA-Fx]* 1 [1-9a-fA-Fx]*"));
Can you say where the error is? It seems to detect when the number before is 1, but the check on the last number goes wrong.
You can use
[1-9][0-9]*$
to check if the number is greater than 0
What does it do?
[1-9] Matches 1 to 9.
[0-9]* Matches zero or more digits
$ Matches end of string.
The full regex can be
name0 [0-9a-fA-Fx]* [0-9a-fA-Fx]* 1 [1-9][0-9]*$
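The reason the original pattern always matches is that its last part, [1-9a-fA-Fx]*, is allowed to match zero characters, so the match simply stops after "1 ". The sketch below shows this in JavaScript for illustration only (the question uses std::regex); the helper name lastFieldPositive is made up, and the name is assumed to contain no regex metacharacters:

```javascript
// The question's pattern for name0: the trailing * may match nothing at all.
const brokenPattern = /(^|\n)name0 [0-9a-fA-Fx]* [0-9a-fA-Fx]* 1 [1-9a-fA-Fx]*/;
console.log(brokenPattern.test('name0 0x3f700000 0x160000 1 0')); // true, although the last number is 0

// Hypothetical helper using the corrected tail [1-9][0-9]*$.
function lastFieldPositive(line, name) {
  const pattern = new RegExp(`^${name} [0-9a-fA-Fx]* [0-9a-fA-Fx]* 1 [1-9][0-9]*$`);
  return pattern.test(line);
}

console.log(lastFieldPositive('name0 0x3f700000 0x160000 1 0', 'name0'));  // false
console.log(lastFieldPositive('name0 0x3f700000 0x160000 1 23', 'name0')); // true
console.log(lastFieldPositive('name1 0x3f700000 0x760000 0 23', 'name1')); // false
```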
You can use this regex
1\s+[1-9]\d*$
or more specifically
^name\d+\s+0x[0-9a-f]+\s+0x[0-9a-f]+\s+1\s+[1-9]\d*$