dart regex remove space phone - regex

I tried all the regex solutions from "REGEX Remove Space", but none of them match.
I work with Dart and Flutter, and I am trying to capture only the digits from strings of this type:
case 1
aaaaaaaaa 06 12 34 56 78 aaaaaa
case 2
aaaaaaaa 0612345678 aaaaaa
case 3
aaaaaa +336 12 34 56 78 aaaaa
I want to get only 0612345678, with no spaces and no +33: just 10 digits. In the case of +33, I need to replace +33 with 0.
Currently I have this pattern, \D*(\d+)\D*?, which works only for case 2.

You can capture an optional +33 in Group 1, then a digit followed by any run of digits or spaces in Group 2, and build the result depending on whether Group 1 matched.
Here is an example solution (tested):
var strs = [
  'aaaaaaaaa 06 12 34 56 78 aaaaaa',
  'aaaaaaaa 0612345678 aaaaaa',
  'aaaaaa +336 12 34 56 78 aaaaa',
  'more +33 6 12 34 56 78'
];
// Group 1: optional +33 prefix; Group 2: the digits, possibly space-separated.
var rx = RegExp(r"(?:^|\D)(\+33)?\s*(\d[\d ]*)(?!\d)");
for (var str in strs) {
  var match = rx.firstMatch(str);
  if (match != null) {
    // Strip the spaces captured inside Group 2 (non-null once the pattern matched).
    var digits = match.group(2)!.replaceAll(" ", "");
    // Replace the +33 prefix with a leading 0 when it was present.
    var result = match.group(1) != null ? "0" + digits : digits;
    print(result);
  }
}
This prints 0612345678 for each of the four input strings.
The pattern is
(?:^|\D)(\+33)?\s*(\d[\d ]*)(?!\d)
See its demo here.
(?:^|\D) - start of string or any char other than a digit
(\+33)? - Group 1 that captures +33 1 or 0 times
\s* - any 0+ whitespaces
(\d[\d ]*) - Group 2: a digit followed with spaces or/and digits
(?!\d) - no digit immediately to the right is allowed.
Spaces are removed from Group 2 with a match.group(2).replaceAll(" ", "") since one can't match discontinuous strings within one match operation.
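For readers outside Dart, the same two-step idea (match once, then strip the spaces from Group 2) can be sketched in Python. The function name normalize_phone is made up for this illustration, and the sketch assumes there is at most one phone number per string.

```python
import re

# Same pattern as above: Group 1 is the optional +33, Group 2 the digits and spaces.
PHONE_RE = re.compile(r"(?:^|\D)(\+33)?\s*(\d[\d ]*)(?!\d)")


def normalize_phone(text):
    """Return the first phone number in text as bare digits, or None if there is none."""
    match = PHONE_RE.search(text)
    if match is None:
        return None
    # Spaces inside Group 2 cannot be skipped by the match itself, so drop them afterwards.
    digits = match.group(2).replace(" ", "")
    # A captured +33 prefix is replaced with a leading 0.
    return "0" + digits if match.group(1) else digits


if __name__ == "__main__":
    for sample in ["aaaaaaaaa 06 12 34 56 78 aaaaaa", "aaaaaa +336 12 34 56 78 aaaaa"]:
        print(normalize_phone(sample))
```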

Related

Golang regex : Ignore multiple occurrences

I've got a simple need.
Given this input (string): 10 20 30 40 65 45 44 67 100 200 65 40 66 88 65
I need to get all numbers between 65 and 66.
The problem is when there are multiple occurrences of each limit.
With a regex like (65).+(66), I capture 65 45 44 67 100 200 65 40 66. But I would like to get only 40.
How could I achieve this?
https://regex101.com/r/9HoKxr/1
Sounds like you want to exclude any further '65' from the match, up to the first occurrence of '66'? It's a bit verbose, but what about:
\b65((?:\s(?:\d|[1-57-9]\d|6[0-47-9]|\d{3,}))+?)\s66\b
See an online demo
\b65 - Start with '65' preceded by a word-boundary;
( - Open capture group;
(?:\s - Non-capture group starting with a whitespace char;
(?:\d|[1-57-9]\d|6[0-47-9]|\d{3,}) - Nested non-capture group to match any integer but '65' or '66';
)+?) - Close non-capture group and match it at least once but as few times as possible. Then close the capture group;
\s66\b - Match another space followed by '66' and word-boundary.
Note:
We will handle leading spaces with the Trim() function from the strings package;
In my examples I have used '10 20 30 40 65 45 44 40 66 200 65 40 66 88 65', which returns multiple matches. In that case it's established that OP is looking for the 'shortest' matching substring;
By 'shortest' it's meant that we are looking for the fewest elements when the substring is split on spaces (using the 'Fields' function from the above-mentioned strings package). Therefore '123456' is preferred over '1 2 3' despite being the 'longer' substring in terms of characters;
Try:
package main

import (
	"fmt"
	"regexp"
	"strings"
)

func main() {
	s := `10 20 30 40 65 45 44 40 66 200 65 40 66 88 65`
	re := regexp.MustCompile(`\b65((?:\s(?:\d|[1-57-9]\d|6[0-47-9]|\d{3,}))+?)\s66\b`)
	matches := re.FindAllStringSubmatch(s, -1) // Retrieve all matches
	shortest := ``
	for _, m := range matches { // Loop over the matches
		// Keep the capture with the fewest space-separated elements
		if shortest == `` || len(strings.Fields(m[1])) < len(strings.Fields(shortest)) {
			shortest = strings.Trim(m[1], ` `)
		}
	}
	fmt.Println(shortest)
}
Try it for yourself here.
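The same pattern also works unchanged in Python's re module, so the 'shortest match' selection can be sketched there as well. The helper name shortest_between is invented for this example, and ties are resolved in favour of the first match, which is an assumption rather than something OP specified.

```python
import re

# Same pattern as in the Go example: '65', then integers other than 65/66, then '66'.
BETWEEN_RE = re.compile(r"\b65((?:\s(?:\d|[1-57-9]\d|6[0-47-9]|\d{3,}))+?)\s66\b")


def shortest_between(text):
    """Return the run of numbers between 65 and 66 with the fewest elements, or None."""
    # split() with no arguments also discards the leading space captured in Group 1.
    candidates = [m.group(1).split() for m in BETWEEN_RE.finditer(text)]
    if not candidates:
        return None
    # min() keeps the first candidate when several have the same element count.
    return " ".join(min(candidates, key=len))


if __name__ == "__main__":
    print(shortest_between("10 20 30 40 65 45 44 40 66 200 65 40 66 88 65"))
```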

Remove only non-leading and non-trailing spaces from a string in Ruby?

I'm trying to write a Ruby method that will return true only if the input is a valid phone number, which means, among other rules, it can have spaces and/or dashes between the digits, but not before or after the digits.
In a sense, I need a method that does the opposite of String#strip! (remove all spaces except leading and trailing spaces), plus the same for dashes.
I've tried using String#gsub!, but when I try to match a space or a dash between digits, then it replaces the digits as well as the space/dash.
Here's an example of the code I'm using to remove spaces. I figure once I know how to do that, it will be the same story with the dashes.
def valid_phone_number?(number)
  phone_number_pattern = /^0[^0]\d{8}$/
  # remove spaces
  number.gsub!(/\d\s+\d/, "")
  return number.match?(phone_number_pattern)
end
What happens is if I call the method with the following input:
valid_phone_number?(" 09 777 55 888 ")
I get false because the gsub! call transforms the number into " 0788 ", i.e. it gets rid of the digits around the spaces as well as the spaces. What I want it to do is just to get rid of the inner spaces, so as to produce " 0977755888 ".
I've tried
number.gsub!(/\d(\s+)\d/, "") and number.gsub!(/\d(\s+)\d/) { |match| "" } to no avail.
Thank you!!
If you want to return a boolean, you might for example use a pattern that accepts leading and trailing spaces, and matches 10 digits (as in your example data) where there can be optional spaces or hyphens in between.
^ *\d(?:[ -]?\d){9} *$
For example
def valid_phone_number?(number)
  phone_number_pattern = /^ *\d(?:[ -]*\d){9} *$/
  return number.match?(phone_number_pattern)
end
See a Ruby demo and a regex demo.
To remove spaces and hyphens in between digits, try:
(?:\d+|\G(?!^)\d+)\K[- ]+(?=\d)
See an online regex demo
(?: - Open non-capture group;
\d+ - Match 1+ digits;
| - Or;
\G(?!^)\d+ - Assert position at the end of the previous match (but not at the start of a line), followed by 1+ digits;
)\K - Close non-capture group and reset the starting point of the reported match;
[- ]+ - Match 1+ spaces/hyphens;
(?=\d) - Assert position is followed by a digit.
p " 09 777 55 888 ".gsub(/(?:\d+|\G(?!^)\d+)\K[- ]+(?=\d)/, '')
Prints: " 0977755888 "
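Python's built-in re module has neither \G nor \K, so the pattern above cannot be ported as is. A rough equivalent that swaps in a plain lookbehind is sketched below; the helper names are invented for this example, and the validation pattern is the one from the question.

```python
import re

# Spaces or hyphens that sit directly between two digits.
INNER_SEPARATORS = re.compile(r"(?<=\d)[- ]+(?=\d)")
# Validation pattern taken from the question: 0, one non-zero character, then 8 digits.
PHONE_PATTERN = re.compile(r"0[^0]\d{8}")


def strip_inner_separators(number: str) -> str:
    """Remove separators between digits; leading and trailing spaces are left alone."""
    return INNER_SEPARATORS.sub("", number)


def valid_phone_number(number: str) -> bool:
    """True only if the cleaned string is exactly a phone number, with no outer padding."""
    return PHONE_PATTERN.fullmatch(strip_inner_separators(number)) is not None


if __name__ == "__main__":
    print(repr(strip_inner_separators(" 09 777 55 888 ")))
    print(valid_phone_number("09 777-55 888"))
```

Because the lookbehind and lookahead consume nothing, the digits on either side of a separator stay in place, which is exactly what the /\d\s+\d/ attempt from the question was missing.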
Using a very simple regex (/\d/ tests for a digit):
str = " 09 777 55 888 "
r = str.index(/\d/)..str.rindex(/\d/)
str[r] = str[r].delete(" -")
p str # => " 0977755888 "
Passing a block to gsub is an option, capture groups available as globals:
>> str = " 09 777 55 888 "
# simple, easy to understand
>> str.gsub(/(^\s+)([\d\s-]+?)(\s+$)/){ "#$1#{$2.delete('- ')}#$3" }
=> " 0977755888 "
# a different take on @steenslag's answer, to avoid using range.
>> s = str.dup; s[/^\s+([\d\s-]+?)\s+$/, 1] = s.delete("- "); s
=> " 0977755888 "
Benchmark, not that it matters that much:
require 'benchmark'
n = 1_000_000
puts(Benchmark.bmbm do |x|
# just a match
x.report("match") { n.times {str.match(/^ *\d(?:[ -]*\d){9} *$/) } }
# use regex in []=
x.report("[//]=") { n.times {s = str.dup; s[/^\s+([\d\s-]+?)\s+$/, 1] = s.delete("- "); s } }
# use range in []=
x.report("[..]=") { n.times {s = str.dup; r = s.index(/\d/)..s.rindex(/\d/); s[r] = s[r].delete(" -"); s } }
# block in gsub
x.report("block") { n.times {str.gsub(/(^\s+)([\d\s-]+?)(\s+$)/){ "#$1#{$2.delete('- ')}#$3" }} }
# long regex
x.report("regex") { n.times {str.gsub(/(?:\d+|\G(?!^)\d+)\K[- ]+(?=\d)/, "")} }
end)
Rehearsal -----------------------------------------
match 0.997458 0.000004 0.997462 ( 0.998003)
[//]= 1.822698 0.003983 1.826681 ( 1.827574)
[..]= 3.095630 0.007955 3.103585 ( 3.105489)
block 3.515401 0.003982 3.519383 ( 3.521392)
regex 4.761748 0.007967 4.769715 ( 4.772972)
------------------------------- total: 14.216826sec
user system total real
match 1.031670 0.000000 1.031670 ( 1.032347)
[//]= 1.859028 0.000000 1.859028 ( 1.860013)
[..]= 3.074159 0.003978 3.078137 ( 3.079825)
block 3.751532 0.011982 3.763514 ( 3.765673)
regex 4.634857 0.003972 4.638829 ( 4.641259)

Try to split a string with particular regex expression

I'm trying to split a string using two separators and a regex. My string is, for example,
"test 10 20 middle 30 - 40 mm".
and I would like to split it into ["test 10", "20 middle 30", "40 mm"]. So, splitting on (and dropping) ' - ' and the space between two numbers.
I tried to do
result = re.split(r'[\d+] [\d+]', s)
> ['test 1', '0 middle 30 - 40 mm']
result2 = re.split(r' - |{\d+} {\d+}', s)
> ['test 10 20 middle 30', '40 mm']
Is there any regular expression that splits it into ['test 10', '20 middle 30', '40 mm']?
You may use
(?<=\d)\s+(?:-\s+)?(?=\d)
See the regex demo.
Details
(?<=\d) - a digit must appear immediately on the left
\s+ - 1+ whitespaces
(?:-\s+)? - an optional sequence of a - followed with 1+ whitespaces
(?=\d) - a digit must appear immediately on the right.
See the Python demo:
import re
text = "test 10 20 middle 30 - 40 mm"
print( re.split(r'(?<=\d)\s+(?:-\s+)?(?=\d)', text) )
# => ['test 10', '20 middle 30', '40 mm']
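To see why the lookarounds matter, here is a small sketch contrasting the first attempt from the question with the pattern above. In [\d+] the plus sign sits inside a character class, so the class matches exactly one character (a digit or a literal +) and that character is consumed by the split. The function name split_on_number_gaps is made up for this illustration.

```python
import re

text = "test 10 20 middle 30 - 40 mm"

# Attempt from the question: each [\d+] consumes one digit, so "0 2" is eaten by the split.
consuming = re.split(r"[\d+] [\d+]", text)

# Lookarounds only assert that digits are present; the digits themselves are kept.
GAP_RE = re.compile(r"(?<=\d)\s+(?:-\s+)?(?=\d)")


def split_on_number_gaps(s):
    """Split on whitespace (optionally containing a hyphen) that separates two numbers."""
    return GAP_RE.split(s)


if __name__ == "__main__":
    print(consuming)
    print(split_on_number_gaps(text))
```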
Data
k="test 10 20 middle 30 - 40 mm"
Please Try
result2 = re.split(r"(^[a-z]+\s\d+|\^d+\s[a-z]+|\d+)$",k)
result2
^[a-z]+ - match lowercase letters at the start of the string, greedily, followed by:
\s - a whitespace character
\d+ - digits, matched greedily
| - or match digits \d+ at the start of the string, also matched greedily and followed by:
\s - a whitespace character
[a-z]+ - lowercase letters, matched greedily
| - or match digits \d+, greedily, at the end of the string $
Output

How to capture repeated group up to N times?

I would like to capture chains of digits in a string, but only up to 3 times.
Any chain of digits afterwards should be ignored. For instance:
T441_S45/1 => 441 45 1
007_S4 => 007 4
41_445T02_74 => 41 445 02
I've tried (\d+){1,3} but that doesn't seem to work...
Any hint?
You may match and capture the first three chunks of digits separated with any amount of non-digits and the rest of the string, and replace with the backreferences to those groups:
^\D*(\d+)(?:\D+(\d+))?(?:\D+(\d+))?.*
Or, if the string can be multiline,
^\D*(\d+)(?:\D+(\d+))?(?:\D+(\d+))?[\s\S]*
The replacement string will look like $1 $2 $3.
Details
^ - start of string
\D* - 0+ non-digits
(\d+) - Group 1: one or more digits
(?:\D+(\d+))? - an optional non-capturing group matching:
\D+ - 1+ non-digits
(\d+) - Group 2: one or more digits
(?:\D+(\d+))? - another optional non-capturing group matching:
\D+ - one or more non-digits
(\d+) - Group 3: one or more digits
[\s\S]* - the rest of the string.
See the regex demo.
C++ demo:
#include <iostream>
#include <regex>
#include <string>
#include <vector>
using namespace std;
int main() {
std::vector<std::string> strings;
strings.push_back("T441_S45/1");
strings.push_back("007_S4");
strings.push_back("41_445T02_74");
std::regex reg(R"(^\D*(\d+)(?:\D+(\d+))?(?:\D+(\d+))?[\s\S]*)");
for (size_t k = 0; k < strings.size(); k++)
{
std::cout << "Input string: " << strings[k] << std::endl;
std::cout << "Replace result: "
<< std::regex_replace(strings[k], reg, "$1 $2 $3") << std::endl;
}
return 0;
}
Output:
Input string: T441_S45/1
Replace result: 441 45 1
Input string: 007_S4
Replace result: 007 4
Input string: 41_445T02_74
Replace result: 41 445 02
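If the digit chunks are wanted as separate values rather than as one replaced string, the same optional-group pattern can be read group by group. The following Python sketch assumes that a list of strings is an acceptable result; first_three_chunks is a hypothetical helper name, and the trailing "rest of the string" part of the pattern is dropped because nothing is being replaced.

```python
import re

# Same three groups as above; groups 2 and 3 are optional and may stay unmatched.
FIRST_THREE_RE = re.compile(r"^\D*(\d+)(?:\D+(\d+))?(?:\D+(\d+))?")


def first_three_chunks(text):
    """Return up to the first three runs of digits in text, in order of appearance."""
    match = FIRST_THREE_RE.match(text)
    if match is None:
        return []
    # Optional groups that did not take part in the match are reported as None.
    return [group for group in match.groups() if group is not None]


if __name__ == "__main__":
    for sample in ["T441_S45/1", "007_S4", "41_445T02_74"]:
        print(sample, "=>", " ".join(first_three_chunks(sample)))
```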

string padded with optional blank with max length

I have a problem building a regex. This is a sample of the text:
text 123 12345 abc 12 def 67 i 89 o 0 t 2
The numbers are sometimes padded with blanks to the max length (3).
e.g.:
"1" can be "1" or "1 "
"13" can be "13" or "13 "
My regex is at the moment this:
\b([\d](\s*)){1,3}\b
The results of this regex are the following: (. = blank for better visibility)
123.
12....
67.
89.
0....
2
But I need this: (. = blank for better visibility)
123
12.
67.
89.
0..
2
How can I tell the regex engine to count the blanks into the {1,3} option?
Try this:
\b(?:\d[\d\s]{0,2})(?:(?<=\s)|\b)
This will also cover strings like text 123 1 23 12345 123abc 12 def 67 i 89 o 0 t 2 and results in:
123
1.
23.
12.
67.
89.
0..
2
Does this do what you want?
\b(\d){1,3}\s*\b
This will also include whitespace (if available) after the selection.
I think you want this
\b(?:\d[\d\s]{0,2})(?!\d)
See it here on Regexr
The word boundary will not work at the end, because if the match ends with a whitespace, there is no word boundary there. Therefore I use a negative lookahead (?!\d) to ensure that no digit follows.
But if you have a string like "1 23", it will match only the "1" and the "23", but not the whitespace after the "1".
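A quick way to check this behaviour is to run the lookahead pattern over the sample text. The sketch below uses Python's re module, which supports the same syntax for this pattern; padded_numbers is an invented helper name, and the sample string assumes that 0 is followed by two blanks, as in the expected output from the question.

```python
import re

# A digit plus up to two more digits or blanks, not directly followed by another digit.
PADDED_RE = re.compile(r"\b\d[\d\s]{0,2}(?!\d)")


def padded_numbers(text):
    """Return every 1-3 character number field, including its trailing blank padding."""
    return PADDED_RE.findall(text)


if __name__ == "__main__":
    sample = "text 123 12345 abc 12 def 67 i 89 o 0  t 2"
    # Blanks are shown as dots, as in the question.
    for field in padded_numbers(sample):
        print(field.replace(" ", "."))
```

Note that 12345 is skipped entirely: every candidate of up to three characters starting at its first digit is followed by another digit, and no other position inside it has a word boundary in front.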
Assuming you want to use the padded numbers somewhere else, break the problem into two parts: (simple) parsing of the numbers, and (simple) formatting of the numbers (including padding).
while ( $text =~ /\b(\d{1,3})\b/g ) {
printf( "%-3d\n", $1 );
}
Alternatively:
my @padded_numbers = map { sprintf( "%-3d", $_ ) } ( $text =~ /\b(\d{1,3})\b/g );