I need a Perl regex to pull a number of between six and ten digits out of a string. The number will always follow a particular word followed by a space (case unknown).
For example, if the word I was looking for is 'string':
some random text blah blah blahSTRING 1234567890some more random text
Desired output:
1234567890
Another example:
yet more random textra ra rastring 654321hey hey my my
Desired output:
654321
I want to load the result into a variable.
/string ([0-9]{6,10})/i
string matches both STRING and string because the expression ends with i (case-insensitive matching)
the space matches a literal space
( starts a capture group to capture the number you are trying to get
[0-9]{6,10} matches a number with 6 to 10 digits
https://regex101.com/r/mB1zF4/1
Group 1 should contain your number with:
/^.*string (\d+).*$/i
Thanks everyone, between all the responses and a bit of googling I ended up with
#!/usr/local/bin/perl -w
use strict;
my $string = 'sgtusadl;fdsas;adlhstring 12345678daf;slkdfja;dflk';
my ( $number ) = $string =~ m/string\s\d{6,10}/gi;
$number =~ s/[^0-9]//g;
print "number is $number\n";
exit 0;
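A variation on the script above, as a sketch only: if the digits are put in a capture group, the match in list context assigns the number directly, so the follow-up substitution that strips non-digits is not needed. The sub name extract_number is hypothetical, and using \Q...\E to quote the search word is my own assumption about how the word would be supplied.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the 6-10 digit number that follows
# "$word " (in any case), or undef when there is no such match.
sub extract_number {
    my ($text, $word) = @_;
    # \Q...\E quotes any regex metacharacters in $word.
    # In list context the match returns the contents of group 1.
    my ($number) = $text =~ /\Q$word\E ([0-9]{6,10})/i;
    return $number;
}

for my $text (
    'some random text blah blah blahSTRING 1234567890some more random text',
    'yet more random textra ra rastring 654321hey hey my my',
) {
    my $number = extract_number($text, 'string');
    print "number is $number\n";
}
```

Note that a run of more than ten digits would still match, with only its first ten digits captured.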
In PowerShell, I'm trying to create an E.164-type regex for a number of countries. I explicitly need to have the plus sign (+) in my number and, in most cases, multi-digit country codes.
For some reason: '+421233339135' does not match '/^(\+[4][2][1])?([1-9]\d\d{7})$'
+421 is the country code. The first digit after the country code needs to be between 1 and 9 and the rest can be any digits, so the 9 digits after the country code are the DID number.
Hope someone can help :-)
For some reason: '+421233339135' does not match '/^(\+[4][2][1])?([1-9]\d\d{7})$'
PowerShell is not Perl: a leading / before the pattern is not expected, so remove it.
The pattern itself could be described simply as ^(\+421)?([1-9]\d{8})$
PS C:\> $phoneNumber = '+421233339135'
PS C:\> $phoneNumber -match '^(\+421)?([1-9]\d{8})$'
True
I cannot get this regex to work:
"4. 182 ex" (number, period, 2 blank spaces, 3 numbers, blank space, 2 characters)
The regex syntax should return "4182" and remove period, blank spaces, and characters.
Can you help me please?
EDIT!!!
Thanks everyone but I missed the key question:
a) the regex shall only find the value (4182) when the same line contains a specific text for example "magic", so for example:
"Magic 4. 182 ex"
b) the regex shall "only" find the value (4182) when the table contains a specific text for example "Magic":
"Magic 4. 182 ex
Lisefeo 2. 123 fg
Nioos 3. 124 df"
specific text = exact match or contains those characters
This is the regex I've tried so far, but does it work for a whole table (not just a line)?
(Magic.*?(\d).\s\s(\d{3})\s\w\w)
Just remove all characters that are not digits:
Perl:
$string =~ s/\D+//g;
or
php:
$string = preg_replace('/\D+/', '', $string);
According to your updated question, you could do:
$string =~ s/^Magic\s+(\d+)\.\s+(\d{3})\b.*$/$1$2/;
or, with php:
$string = preg_replace('/^Magic\s+(\d+)\.\s+(\d{3})\b.*$/', '$1$2', $string);
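For part (b) of the edit, one possible reading is: take the digits from the line that starts with "Magic", and return nothing when the table has no such line. A minimal Perl sketch under that assumption; the sub name magic_value, the sample table and the use of \h+ (so that one or two spaces both match) are my own choices, not from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: finds the line of $table that starts with
# $keyword and returns its digits joined together, e.g. "4182".
sub magic_value {
    my ($table, $keyword) = @_;
    # /m lets ^ match at the start of every line of the table,
    # /i ignores case, \h+ matches horizontal whitespace only.
    if ($table =~ /^\Q$keyword\E\h+(\d+)\.\h+(\d{3})\b/mi) {
        return "$1$2";
    }
    return;    # no line starts with the keyword
}

my $table = "Magic 4.  182 ex\nLisefeo 2.  123 fg\nNioos 3.  124 df";
my $value = magic_value($table, 'Magic');
print defined $value ? "$value\n" : "no match\n";
```

If "contains those characters" should also cover the keyword appearing in the middle of a line, the leading ^ would have to be dropped.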
For it to match exactly what you said, use:
(\d)\.\s\s(\d{3})\s\w\w
You'll get it in two groups: the single digit first, then the three-digit group.
RegEx101 example
Regards.
^([\d]+)\.[\s]+([\d]+)[\s]..
Tested with perl:
> echo "4. 182 ex" | perl -lne 'print $1,$2 if(/^([\d]+)\.[\s]+([\d]+)[\s]../)'
4182
Is this regular expression valid if I want to include only numbers up to 31?
'[^0-9>31]+' Or will it also return alphabetic characters, so that I must somehow exclude them too?
Your regex accepts one or more characters, each of which is not one of the following
0 1 2 3 4 5 6 7 8 9 >
What you want is:
/^(?:[0-9]|[12][0-9]|3[01])$/
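As a quick check of that pattern, the following Perl sketch runs it against 0 through 40 plus a few non-numeric strings and prints whatever is accepted. The variable names and the extra test strings are illustrative only.

```perl
use strict;
use warnings;

# The pattern from the answer above, precompiled.
my $upto31_re = qr/^(?:[0-9]|[12][0-9]|3[01])$/;

# Keep only the candidates that the pattern accepts.
my @accepted = grep { $_ =~ $upto31_re } (0 .. 40, 'abc', '07', '3a');

print "@accepted\n";    # prints the integers 0 through 31
```

The zero-padded '07' is rejected, because none of the three alternatives allows a leading zero before a second digit.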
Regular expressions are not the sonic screwdriver of text, able to magically do everything you could possibly want. There is nothing in regular expressions that will check the value of a number.
What you need to do is two steps, written here in Perl.
$ok = ($s =~ /^\d{1,2}$/) && ($s <= 31);
That checks the value of $s for start of the string (^), one or two digits (\d{1,2}) and then the end of the string ($). If that is true, then it also checks to see that the numeric value of $s is at most 31.
Yes, you can use a complex regex like this from Ray Toal's answer:
/^(?:[0-9]|[12][0-9]|3[01])$/
but that is far less readable.
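The two-step approach could be wrapped up like this. It is a sketch that assumes "up to 31" means 31 itself is allowed; the sub name is_day_number is hypothetical, and I have used [0-9] with \z rather than \d with $ so that non-ASCII digits and a trailing newline are rejected.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $s is one or two ASCII digits
# whose numeric value is at most 31.
sub is_day_number {
    my ($s) = @_;
    # Step 1: the shape of the string (regex).
    return 0 unless defined $s && $s =~ /^[0-9]{1,2}\z/;
    # Step 2: the numeric value (plain comparison).
    return $s <= 31 ? 1 : 0;
}

for my $s (qw(0 7 07 31 32 abc)) {
    printf "%-3s %s\n", $s, is_day_number($s) ? 'ok' : 'rejected';
}
```

Unlike the single-regex version, this accepts a zero-padded value such as 07; whether that is wanted depends on the input.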
How can I access capture buffers in brackets with quantifiers?
#!/usr/local/bin/perl
use warnings;
use 5.014;
my $string = '12 34 56 78 90';
say $string =~ s/(?:(\S+)\s){2}/$1,$2,/r;
# Use of uninitialized value $2 in concatenation (.) or string at ./so.pl line 7.
# 34,,56 78 90
With @LAST_MATCH_START and @LAST_MATCH_END it works*, but the line gets too long.
Doesn't work, look at TLP's answer.
*The proof of the pudding is in the eating isn't always right.
say $string =~ s/(?:(\S+)\s){2}/substr( $string, $-[0], length($-[0]-$+[0]) ) . ',' . substr( $string, $-[1], length($-[1]-$+[1]) ) . ','/re;
# 12,34,56 78 90
You can't access all previous values of the first capturing group, only the last value (or the current at the match end, as you can see it) will be saved in $1 (unless you want to use a (?{ code }) hack).
For your example you could use something like:
s/(\S+)\s+(\S+)\s+/$1,$2,/
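A short Perl sketch of both points, reusing the string from the question: a quantified group keeps only its last iteration, while writing the group out twice gives each value its own capture variable. The variable names $last and $joined are mine.

```perl
use strict;
use warnings;
use 5.014;    # for the /r modifier and say

my $string = '12 34 56 78 90';

# The group runs twice, but $1 only holds the last iteration ("34").
# There is no $2, because the pattern has a single capturing group.
my $last;
if ($string =~ /(?:(\S+)\s){2}/) {
    $last = $1;
    say "last iteration captured: $last";
}

# Two explicit groups: $1 is "12" and $2 is "34".
my $joined = $string =~ s/(\S+)\s+(\S+)\s+/$1,$2,/r;
say $joined;    # 12,34,56 78 90
```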
The statement that you say "works" has a bug in it.
length($-[0]-$+[0])
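This does not return the length of the match: it returns the length of the negative number you get from start minus end. The numbers $-[0] and $+[0] are the offsets of the start and end of the whole match in the string, respectively. Since the match here ("12 34 ") is six characters long, the start minus end offset is -6, and length(-6) is 2, the number of characters in "-6". The same happens with $-[1] and $+[1]: the group is two characters long, the difference is -2, and length(-2) is again 2.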
So, what you are doing is taking the first two characters of the match 12 34, and the first two characters of the match 34 and concatenating them with a comma in the middle. It works by coincidence, not because of capture groups.
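To make the coincidence visible, here is a small Perl sketch that prints what the length() expression actually computes next to the real match length. It reuses the string and pattern from the question; the variable names are illustrative.

```perl
use strict;
use warnings;

my $string = '12 34 56 78 90';
my ($wrong, $right, $matched);

if ($string =~ /(?:(\S+)\s){2}/) {
    # Start minus end is a negative number; length() counts the
    # characters of that number as a string, minus sign included.
    $wrong = length($-[0] - $+[0]);

    # End minus start is the real length of the whole match.
    $right   = $+[0] - $-[0];
    $matched = substr($string, $-[0], $right);
}

print "length(start - end) = $wrong\n";    # 2
print "end - start         = $right\n";    # 6
print "matched text        = '$matched'\n";
```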
It sounds as though you are asking us to solve the problems you have with your solution, rather than asking us about the main problem.
My first question here.
I have a string of digits like 55111233.
As you can see, 5 occurs twice in a row, 1 three times, 2 once and 3 twice.
I want it to be replaced with 52132132,
in general number1<count>number2<count>...numbern<count>
Please guide me.
$digits = "55111233";
$digits =~ s/((\d)\2*)/$2 . length($1)/ge;
print $digits;
You can do:
$str =~s/(\d)(\1*)/$1.(length($2)+1)/eg;
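Either substitution can be wrapped in a small sub so that the original string is left untouched. This is only a sketch; the name count_runs is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces each run of a repeated digit with
# the digit followed by the length of the run.
sub count_runs {
    my ($digits) = @_;
    # (\d) captures a digit and \2* consumes its repeats, so $1 is
    # the whole run; /e evaluates the replacement as Perl code and
    # /g repeats this for every run in the string.
    (my $encoded = $digits) =~ s/((\d)\2*)/$2 . length($1)/ge;
    return $encoded;
}

print count_runs('55111233'), "\n";    # 52132132
```

A run of ten or more identical digits produces a multi-digit count, so the output can no longer be read back unambiguously.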