Perl regular expression to match input for port numbers

I would like to write a regular expression that would only accept valid input that would qualify as a port number. I want to only accept input for the characters 0-9 and no special characters. The number should not be longer than 5 characters.
I read user input using this method.
my $port_number = <>;
I know that the regular expression should look something like this.
^[0-9]*$
How do I combine the regular expression with the reading of the command line input without using an if statement?

Try this code:
$result = ($port_number =~ m/^[0-9]{1,5}$/);
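$result will be set to 1 if $port_number matches your criteria, and to an empty string (which is false) otherwise. The input read with <> still ends in a newline, but $ also matches just before a final newline, so the match works without chomping first.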
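Five digits also admit values such as 99999, which is above 65535, the largest TCP/UDP port number. A minimal sketch of a stricter check follows; the helper name is_valid_port is hypothetical, and accepting port 0 is an assumption you may want to change:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): true if $input
# consists of 1 to 5 ASCII digits and is no larger than 65535.
sub is_valid_port {
    my ($input) = @_;
    return 0 unless defined $input;
    chomp $input;                                  # drop the newline left by <>
    return 0 unless $input =~ /\A[0-9]{1,5}\z/;    # digits only, at most five
    return $input <= 65535 ? 1 : 0;                # assumed upper bound for ports
}

for my $candidate ("80\n", "65536", "12a4", "") {
    (my $shown = $candidate) =~ s/\n//;
    printf "%-8s %s\n", "'$shown'", is_valid_port($candidate) ? "ok" : "rejected";
}
```

The result can be combined with the read without an explicit if block, for example is_valid_port($port_number) or die "Invalid port\n";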

Related

how to replace specific character(s) in string by number(s)

I would like to replace specific character(s) in a string with numbers.
Let's assume I have a format string such as "B######", which has one letter and six "#" characters. I first need to count how many "#" characters it contains and, based on that number, generate a random token:
Session::Token->new(alphabet => ['0'..'9'], length => $length_from_format_string);
Then I need to replace the "#" characters with the generated number. BUT...
The format string could also be B##CDE###1, where the "#" characters are not contiguous, so the generated number must be split up according to the format :( and all of this should be as efficient as possible.
Thanks for your hints.
In Perl, the replacement part of a substitution is evaluated as code if you use the e flag, so it can contain function calls. Adding the g modifier applies the substitution to every match.
So:
my $string = "B##CDE###1";
$string =~ s/\#/int rand(10)/ge;
print $string;
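If the digits have to come from one pre-generated token, as in the question, the same e flag can hand them out one at a time. This is only a sketch: fill_format is a hypothetical helper, and the token is built with rand as a stand-in for Session::Token so the example has no dependencies.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each '#' in $format with the next
# character of $token, left to right.
sub fill_format {
    my ($format, $token) = @_;
    my $i = 0;
    (my $filled = $format) =~ s/#/substr($token, $i++, 1)/ge;
    return $filled;
}

my $format = 'B##CDE###1';
my $count  = ($format =~ tr/#//);    # tr counts '#' without changing $format
# Stand-in for Session::Token: $count random decimal digits.
my $token  = join '', map { int(rand(10)) } 1 .. $count;
print fill_format($format, $token), "\n";
```

Counting with tr and substituting once keeps the work to two passes over the format string.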

Evaluation Search and Replace in Perl

I have a file formatted like this:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
9558 9629 gene
locus_tag CeraR_t011
gene trnR-UCU
11296 9773 CDS
locus_tag CeraR_p012
gene atpA
product ATP synthase CF1 alpha subunit
transl_except (pos:complement(10268..10270), aa:Q)
transl_except (pos:complement(11192..11194), aa:Q)
transl_except (pos:complement(13267..13269), aa:M)
11296 9773 gene
locus_tag CeraR_p012
gene atpA
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I need to add 809 to both of the values following pos:complement in each instance. I have been attempting with the search and replace modifier as so:
$line =~ s!complement((\d+)..(\d+)!complement(($1+809)..($2+809)!eg
however, the ( after complement is always interpreted as part of an evaluation rather than simply a character. I have tried every combination of backslashes, apostrophes, and quotes to make it just a character but nothing seems to work.
Any advice would be appreciated
Since the replacement string is evaluated, you must use a quoted string and concatenations:
$line =~ s/complement\(\K(\d+)\.\.(\d+)/($1+809) . '..' . ($2+809)/eg;
Note: since \K drops everything to its left from the match result, you don't need to repeat the beginning of the match in the replacement string. The dots are escaped as \.\. so that they match literal dots rather than any two characters.
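For comparison, here is a sketch of the same substitution without \K, which rebuilds the whole matched text in the replacement. The helper name shift_complement and the $offset parameter are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: add $offset to both coordinates of every
# complement(a..b) occurrence in $line.
sub shift_complement {
    my ($line, $offset) = @_;
    $line =~ s{complement\((\d+)\.\.(\d+)\)}
              { 'complement(' . ($1 + $offset) . '..' . ($2 + $offset) . ')' }ge;
    return $line;
}

print shift_complement('transl_except (pos:complement(10268..10270), aa:Q)', 809), "\n";
```

Using braces as delimiters avoids any clash with the parentheses, which only need escaping in the pattern half.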

Matching the last digits of a number in Perl

I have a file in which there are a lot of GUIDs mentioned like this
Dlg1={929EC5C7-0A40-4BE4-8F0A-60C3CB4A62A7}-SdWelcome-0
I want to replace the last eight digits of these GUIDs with the last eight digits of a new GUID, which has already been generated using a tool. What I have tried so far follows.
Read the last eight digits of the generated GUID like this:
$GUID =~ /[0-9a-fA-F]{8}/;
Assign it to a new variable like:
$newGUID = $1;
Now try to replace this with the old GUID inside the file:
if ($line =~ /^.* {(.*)}/) {
$line =~ s/[0-9a-fA-F]{8}}/$newGUID/;
}
But it does not seem to be working. It replaces the last eight digits of the old GUID with 32 digits of the new GUID. How can I fix this?
You now have this:
$line =~s/[0-9a-fA-F]{8}}/$newGUID/;
You say that replaces the last eight characters of your GUID with the entire 32-digit new GUID. That means you're finding and replacing the right characters, but what you're replacing them with is wrong.
What is $newGUID equal to? Is it an entire 32 digit GUID? If so, you need to pull off the last 8 characters.
Two things I would recommend.
If you are matching hexadecimal digits in your regular expression, use [[:xdigit:]] rather than [0-9a-fA-F]. Both are pretty much equivalent, but [[:xdigit:]] is cleaner and easier to understand.
In Perl, we love regular expressions. Heck, Perl regular expression syntax has found a home in almost every other programming language. However, regular expressions can be difficult to get right, to test, and to understand. Sometimes there is a cleaner, easier-to-understand way of doing something than a regular expression.
In this case, you should use substr rather than regular expressions. You know exactly what you want, and you know the location in the string. The substr command would make what you're doing easier to understand and even cleaner:
use strict;
use warnings;
use feature 'say';
use constant {
    GUID_RE => qr/^[[:xdigit:]]{8}(?:-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}$/,
};
my $old_guid = '929EC5C7-0A40-4BE4-8F0A-60C3CB4A62A7';
my $new_guid = 'oooooooo-oooo-oooo-oooo-ooooXXXXXXXX';
# Regular expressions are great for verifying formats!
if ( not $old_guid =~ GUID_RE ) {
    die qq(Old GUID "$old_guid" is not a GUID string);
}
if ( not $new_guid =~ GUID_RE ) { # Yes, I know this will die in this case
    die qq(New GUID "$new_guid" is not a GUID string);
}
# Easy to understand, I'm removing the last eight characters of $old_guid
# and appending the last eight digits of $new_guid
my $munged_guid = substr( $old_guid, 0, -8 ) . substr( $new_guid, -8 );
say $munged_guid; # Would print 929EC5C7-0A40-4BE4-8F0A-60C3XXXXXXXX if the check above passed
I'm using regular expressions to verify that the GUIDs are correctly formatted, which is a great task for regular expressions.
I define a GUID_RE constant. You can look at how it's defined and verify that it describes the correct format (8 hex digits, then three groups of 4 hex digits, then 12 hex digits, all separated by dashes).
Then, I can use that GUID_RE constant in my program, and it's easy to see what I'm doing. Is my GUID actually in the GUID_RE format?
Using substr instead of regular expressions makes it easy to see exactly what I am doing. I am removing the last eight characters from $old_guid and appending the last eight characters of $new_guid.
Again, your immediate issue is that your s/.../.../ is finding the right characters, but your substitution string isn't correct. However, this isn't the best use for regular expressions.
I think your problem is that you're not correctly setting $1 to the last eight digits (if it's coming from that regex, it would match the first eight digits and isn't setting any groups). You could instead try something like $newGUID = substr($GUID, -8);. I also think something like $GUIDTail makes more sense for the variable since it doesn't store an entire GUID.
Also, at the moment you're eating the closing curly brace. You should either include it in $newGUID/$GUIDTail, include it in the replacement part of the s/// call, or change the curly in the match to (?=\}) (a lookahead, which requires the brace to follow but does not include it in the match).
P.S.: You're assuming that there's only one GUID on the line. If there's any chance of multiple GUIDs, you may want to add the global modifier (/g) to the substitution, or otherwise disambiguate which one you want to modify; as written, it replaces only the first one.
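Putting these suggestions together, a minimal sketch might look like the following. The helper name replace_guid_tail and the sample new GUID are made up for illustration, and the code assumes every GUID on the line is wrapped in curly braces:

```perl
use strict;
use warnings;

# Hypothetical helper: replace the last eight hex digits before each
# closing brace with the last eight characters of $new_guid.
sub replace_guid_tail {
    my ($line, $new_guid) = @_;
    my $tail = substr($new_guid, -8);            # last eight characters only
    $line =~ s/[[:xdigit:]]{8}(?=\})/$tail/g;    # lookahead keeps the brace
    return $line;
}

my $line = 'Dlg1={929EC5C7-0A40-4BE4-8F0A-60C3CB4A62A7}-SdWelcome-0';
print replace_guid_tail($line, 'AAAAAAAA-BBBB-CCCC-DDDD-EEEE12345678'), "\n";
```

Because the lookahead does not consume the brace, nothing needs to be added back in the replacement.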
Here's a small code snippet that demonstrates the principle I think you are after. First off, I start with a given string, and take the last 8 characters of it and store it in a new variable, $insert. Then I perform a somewhat strict substitution on the input data (here in the internal file handle DATA, which is convenient when demonstrating), and print the altered string.
The regex in the substitution looks for curly brackets { ... } with a mixture of hex digits [:xdigit:] and dashes \- between them ([[:xdigit:]\-]+), followed by 8 hex digits. The \K escape allows us to "keep" the matched string before it, so all we need to do is insert our stored string, and replace the closing curly bracket.
If you wish to try this on a file, change <DATA> to <> and run it like so:
perl script.pl input
Code:
use strict;
use warnings;
my $new = "929EC5C7-0A40-4BE4-8F0A-1234567890";
my $insert = substr($new, -8);
while (<DATA>) {
s/\{[[:xdigit:]\-]+\K[[:xdigit:]]{8}\}/$insert}/i;
print;
}
__DATA__
Dlg1={929EC5C7-0A40-4BE4-8F0A-60C3CB4A62A7}-SdWelcome-0
Output:
Dlg1={929EC5C7-0A40-4BE4-8F0A-60C334567890}-SdWelcome-0

Regexp in lex. Why does flex behave this way

Consider a simple integer digit identifying expression like this:
[0-9]+ printf("Integer");
Now if I give 123 as input, it prints Integer, fair enough. If I give s123 as input, it prints sInteger. The unmatched s is printed by the default ECHO rule, which is fine with me. But why is Integer also printed? Shouldn't lex return just s? Isn't my input, s123, considered as one whole string? As soon as s is encountered, which does not match [0-9]+, I would expect it to just echo the unmatched value s123 by default, so why sInteger?
Flex tokenizes the input rather than matching it as a whole: the s matches no rule and is echoed by the default rule, and then the remaining 123 is matched by [0-9]+. If you want to match only lines that consist entirely of digits, you should try ^[0-9]+$.

Regular expression help in Perl

I have following text pattern
(2222) First Last (ab-cd/ABC1), <first.last#site.domain.com> 1224: efadsfadsfdsf
(3333) First Last (abcd/ABC12), <first.last#site.domain.com> 1234, 4657: efadsfadsfdsf
I want the number 1224 or 1234, 4657 from the above text after the text >.
I have this
\((\d+)\)\s\w*\s\w*\s\(\w*\/\w+\d*\),\s<\w*\.\w*\#\w*\.domain.com>\s\d+:
which matches the text up to the :, but I want only the part after the email address, up to the :.
Is there an easy regular expression to do this, or should I use split?
Thanks
Edit: The whole text is returned by a command line tool.
(3333) First Last (abcd/ABC12), <first.last#site.domain.com> 1234, 4657: efadsfadsfdsf
(3333) - Unique ID
First Last - First and last names
<first.last#site.domain.com> - Email address in format FirstName.LastName#sub.domain.com
1234, 4567 - database primary Keys
: xxxx - Headline
What I have to do is process the above, get the database IDs (in the example, 1234 and 4567 are two separate IDs), and query the tables.
The above is the output from the tool that I am calling via my Perl script (I will get many entries like this).
My idea was to use a regular expression to get the database IDs. I guess I could use a regular expression for this.
You can fudge the stuff you don't care about to make the expression easier: just 'glob' the parts between the parentheticals (and the email delimiters) using non-greedy quantifiers:
/(\d+)\).*?\(.*?\),\s*<.*?>\s*(\d+(?:,\s*\d+)*):/ (not tested!)
There are only two capture groups: the ID (e.g. 3333) and the number list (e.g. 1234, 4657). From your pattern, I can only assume the latter means "a digit string, followed by zero or more comma-separated digit strings".
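A small sketch of how that expression could be used on the sample lines from the question follows. The helper name parse_entry is hypothetical, and the list is split on commas afterwards:

```perl
use strict;
use warnings;

# Hypothetical helper: return the unique ID followed by the database keys,
# or an empty list if the line does not match.
sub parse_entry {
    my ($line) = @_;
    my ($id, $keys) = $line =~ /(\d+)\).*?\(.*?\),\s*<.*?>\s*(\d+(?:,\s*\d+)*):/
        or return;
    return ($id, split /,\s*/, $keys);
}

my @samples = (
    '(2222) First Last (ab-cd/ABC1), <first.last#site.domain.com> 1224: efadsfadsfdsf',
    '(3333) First Last (abcd/ABC12), <first.last#site.domain.com> 1234, 4657: efadsfadsfdsf',
);
for my $sample (@samples) {
    my ($id, @keys) = parse_entry($sample);
    print "ID $id has keys @keys\n";
}
```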
Well, a simple fix is to just allow all the possible characters in a character class. Which is to say change \d to [\d, ] to allow digits, commas and space.
Your regex as it is, though, does not match the first sample line, because it has a dash - in it (ab-cd/ABC1 does not match \w*\/\w+\d*). Also, it is not a good idea to rely too heavily on the * quantifier, because it also matches the empty string (it matches zero or more times), so it should only be used for things which are truly optional. Otherwise use +, which matches one or more times.
You have a rather strict regex, and with slight variations in your data like this, it will fail. Only you know what your data looks like, and if you actually do need a strict regex. However, if your data is somewhat consistent, you can use a loose regex simply based on the email part:
sub extract_nums {
    my $string = shift;
    if ($string =~ /<[^>]*> *([\d, ]+):/) {
        my $nums = $1;          # copy the capture before matching again
        return $nums =~ /\d+/g; # return the extracted digits in a list
        # return $nums;         # just return the string as-is
    }
    return;                     # no match: empty list (undef in scalar context)
}
This assumes, of course, that you cannot have <> tags in front of the email part of the line. It will capture any digits, commas and spaces found between a <> tag and a colon, and then return a list of any digits found in the match. You can also just return the string, as shown in the commented line.
There would appear to be something missing from your examples. Is this what they're supposed to look like, with email?
(1234) First Last (ab-cd/ABC1), <foo.bar#domain.com> 1224: efadsfadsfdsf
(1234) First Last (abcd/ABC12), <foo.bar#domain.com> 1234, 4657: efadsfadsfdsf
If so, this should work:
\((\d+)\)\s\w*\s\w*\s\([\w-]*\/\w+\d*\),\s<\w*\.\w*\#[\w.]+>\s(\d+)(?:,\s(\d+))?:
$string =~ /.*>\s*(.+):.+/;
$numbers = $1;
That's it.
Tested.
With number catching:
$string =~ /.*>\s*([0-9, ]+):.+/;
$numbers = $1;
Not tested, but you get the idea.