I know next to nothing about regular expressions and after reading tutorials on several sites, still know next to nothing. I want a regular expression that will validate/enforce that a film rating is one of G, PG-13, R or NC-17.
You didn't specify the programming language you're using, but you don't need a regex for this. A simple membership test is enough, e.g.:
For Python:
ratings = ['G', 'PG-13', 'R', 'NC-17']
rating = "PG-13"
if rating in ratings:
    print("rating exists")
else:
    print("no match")
For PHP:
$ratings = array('G','PG-13','R','NC-17');
$rating = 'PG-13';
if (in_array($rating, $ratings))
{
    print "rating exists";
} else {
    print "no match";
}
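If a regular expression is still wanted, as the question asked, the four ratings can be written as an anchored alternation. Here is a minimal Python sketch; the names RATING_RE and is_valid_rating are made up for illustration:

```python
import re

# Alternation of the four allowed ratings. fullmatch() anchors both ends,
# so inputs such as "PG" or "PG-130" are rejected.
RATING_RE = re.compile(r"G|PG-13|R|NC-17")

def is_valid_rating(rating):
    """Return True if rating is exactly one of G, PG-13, R, NC-17."""
    return RATING_RE.fullmatch(rating) is not None

print(is_valid_rating("PG-13"))
print(is_valid_rating("PG"))
```

In engines without a full-match call, the equivalent pattern would be ^(G|PG-13|R|NC-17)$.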
I have written Perl code for validating GSTIN Number which is related to India’s tax according to the following rules:
The first two digits represent the state code as per Indian Census 2011. Every state has a unique code.
The next ten digits will be the PAN number of the taxpayer
The thirteenth digit will be assigned based on the number of registration within a state
The fourteenth digit will be Z by default
The last digit is the check code. It may be a letter or a number.
Following is the code:
my $gst_number_input = '35AABCS1429B1AX';
my $gst_number_character_count = length($gst_number_input);
my $gst_validation =~ /\d{2}[A-Z]{5}\d{4}[A-Z]{1}[A-Z\d]{1}[Z]{1}[A-Z\d]{1}/;
if ($gst_number_character_count == 15 && $gst_number_input =~ $gst_validation) {
print "GST Number is valid";
} else {
print "Invalid GST Number";
}
I have an invalid GSTIN input entered in the code. So when I run the script, I get:
GST Number is valid
Instead I should get the error because the GSTIN input is invalid:
Invalid GST Number
Can anyone please help ?
Thanks in advance
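In this part you are using =~ where it should be an equals sign (=). With =~, the line matches the still-undefined $gst_validation against the pattern instead of assigning to it, so the variable stays undefined and the later test against it effectively uses an empty pattern, which matches any input: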
my $gst_validation =~ /\d{2}[A-Z]{5}\d{4}[A-Z]{1}[A-Z\d]{1}[Z]{1}[A-Z\d]{1}/;
If you want to use it as a variable, you could use qr.
Note that you can omit {1} from the pattern and you don't have to use the square brackets around [Z].
Your code might look like:
my $gst_number_input = '35AABCS1429B1AX';
my $gst_number_character_count = length($gst_number_input);
my $gst_validation = qr/\d{2}[A-Z]{5}\d{4}[A-Z][A-Z\d]Z[A-Z\d]/;
if ($gst_number_character_count == 15 && $gst_number_input =~ $gst_validation) {
print "GST Number is valid";
} else {
print "Invalid GST Number";
}
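For comparison, the same format check can be sketched in Python. This only mirrors the structure of the pattern above (it does not verify the state code or compute the check character), and the names GSTIN_RE and looks_like_gstin are invented for this example:

```python
import re

# Same structure as the qr// pattern: 2 digits, 5 letters, 4 digits,
# 1 letter, 1 letter-or-digit, a literal Z, 1 letter-or-digit (15 characters).
# [0-9] is used instead of \d so that only ASCII digits are accepted.
GSTIN_RE = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][A-Z0-9]Z[A-Z0-9]")

def looks_like_gstin(value):
    """Return True if value has the GSTIN layout described in the question."""
    # fullmatch() anchors both ends, so no separate length check is needed.
    return GSTIN_RE.fullmatch(value) is not None

print(looks_like_gstin("35AABCS1429B1AX"))  # 14th character is not Z
print(looks_like_gstin("35AABCS1429B1ZX"))  # hypothetical well-formed value
```

Because the pattern is matched against the whole string, the explicit length test from the Perl version becomes redundant here.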
I'm a beginner and need help with this. I want to search for "cat" preceded by any number of spaces using a regular expression, and check whether it is in the list as an exact match. Any help?
import re
l = ["hello asma", " cat", "welcome"]
# iterates over three elements in the list
r = re.compile(r".*cat")
word_search="cat"
if r in l:
    print("yes in")
else:
    print("not found")
There is no need to use re. You can do this with a plain list and a join:
l = ["hello asma", " cat", "welcome"]
word_search = "cat"
# print every element that contains the search word, ignoring case
print("\n".join(s for s in l if word_search.lower() in s.lower()))
This is simple and gives you all the values that contain "cat".
If you only want to print whether it exists ("yes in") or not ("not found"), try this code:
l = ["hello asma", " cat", "welcome"]
word_search = "cat"
# iterates over the three elements in the list
for text in l:
    if word_search in text:
        print("yes in")
        break
else:
    # the loop finished without hitting break, so nothing matched
    print("not found")
# the output is "yes in" (expected output)
Another, shorter solution is:
if word_search in '\n'.join(l):
    print("yes in")
else:
    print("not found")
# the output is "yes in" (expected output)
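Note that the substring checks above would also report a hit for an element such as "concatenate". If the goal is the exact match the question mentions, with any amount of surrounding whitespace allowed, a regex version could look like the following sketch (the function name contains_exact is just an illustration):

```python
import re

def contains_exact(items, word):
    """Return True if some element is exactly `word`, ignoring surrounding whitespace."""
    # re.escape() keeps any special characters in the word literal;
    # \s* on both sides allows any number of spaces around it.
    pattern = re.compile(r"\s*" + re.escape(word) + r"\s*")
    return any(pattern.fullmatch(item) for item in items)

l = ["hello asma", " cat", "welcome"]
print("yes in" if contains_exact(l, "cat") else "not found")
```

The original attempt failed because `r in l` compares the compiled pattern object with the strings in the list; the pattern has to be applied to each element instead.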
Using Groovy and regular expression(s) how can I convert this:
String shopping = "SHOPPING LIST(TOMATOES, TEA, LENTIL SOUP: packets=2) for Saturday"
to print out
Shopping for Saturday
TOMATOES
TEA
LENTIL SOUP (2 packets)
I'm not a regex guru, so I couldn't find a regex that does the conversion in just one replaceAll step (I think it should be possible to do it that way). This works though:
def shopping = "SHOPPING LIST(TOMATOES, TEA, LENTIL SOUP: packets=2) for Saturday"
def (list, day) = (shopping =~ /SHOPPING LIST\((.*)\) for (\w+)/)[0][1,2]
println "Shopping for $day\n" +
list.replaceAll(/: packets=(\d+)/, ' ($1 packets)')
.replaceAll(', ', '\n')
First, it captures the strings "TOMATOES, TEA, LENTIL SOUP: packets=2" and "Saturday" into the variables list and day, respectively. Then it processes the list string into the desired output by rewriting the ": packets=N" occurrences and replacing the comma separators with newlines (.replaceAll(', ', '\n') is equivalent to .split(', ').join('\n')).
One thing to notice is that if the shopping string does not match the first regex, it will throw an exception for trying to access the first match ([0]). You can avoid that by doing:
(shopping =~ /SHOPPING LIST\((.*)\) for (\w+)/).each { match, list, day ->
println "Shopping for $day\n" +
list.replaceAll(/: packets=(\d+)/, ' ($1 packets)')
.replaceAll(', ', '\n')
}
Which won't print anything if the first regex doesn't match.
I like to use the String find method for these kinds of cases, I think it's clearer than the =~ syntax:
String shopping = "SHOPPING LIST(TOMATOES, TEA, LENTIL SOUP: packets=2) for Saturday"
def expected = """Shopping for Saturday
TOMATOES
TEA
LENTIL SOUP (2 packets)"""
def regex = /SHOPPING LIST\((.*)\) for (.+)/
assert expected == shopping.find(regex) { full, items, day ->
List<String> formattedItems = items.split(", ").collect { it.replaceAll(/: packets=(\d+)/, ' ($1 packets)') }
"Shopping for $day\n" + formattedItems.join("\n")
}
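The same two-step idea (capture the list and the day, then rewrite each item) carries over to other regex engines. As a rough illustration, here is a Python sketch; format_shopping is a made-up name, and returning None for non-matching input is an assumption of this example:

```python
import re

def format_shopping(text):
    """Turn 'SHOPPING LIST(a, b: packets=2) for Day' into a multi-line listing."""
    # Step 1: capture the comma-separated items and the day.
    m = re.search(r"SHOPPING LIST\((.*)\) for (\w+)", text)
    if m is None:
        return None  # input does not have the expected shape
    items, day = m.groups()
    # Step 2: rewrite ': packets=N' as ' (N packets)' on each item.
    lines = [re.sub(r": packets=(\d+)", r" (\1 packets)", item)
             for item in items.split(", ")]
    return "Shopping for " + day + "\n" + "\n".join(lines)

print(format_shopping(
    "SHOPPING LIST(TOMATOES, TEA, LENTIL SOUP: packets=2) for Saturday"))
```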
I asked a question a little while ago about using regular expressions to extract a match from a URL in a particular directory.
eg: www.domain.com/shop/widgets/match/
The solution given was ^/shop.*/([^/]+)/?$
This would return "match"
However, my file structure has changed and I now need an expression that instead returns "match" in any directory excluding "pages" and "system"
Basically I need an expression that will return "match" for the following:
www.domain.com/shop/widgets/match/
www.domain.com/match/
But not:
www.domain.com/pages/widgets/match/
www.domain.com/pages/
www.domain.com/system/widgets/match/
www.domain.com/system/
I've been struggling for days without any luck.
Thanks
This is just an alternative to Graham's great answer above. The code is in C# (but for the regex part, that doesn't matter):
void MatchDemo()
{
var reg = new Regex("( " +
" (\\w+[.]) " +
" | " +
" (\\w+[/])+ " +
") " +
"(shop[/]|\\w+[/]) " + //the URL-string must contain the sequence "shop"
"(match) " ,
RegexOptions.IgnorePatternWhitespace);
var url = @"www.domain.com/shop/widgets/match/";
var retVal = reg.Match(url).Groups[5]; //do we have anything in the fifth parentheses?
Console.WriteLine(retVal);
Console.ReadLine();
}
/Hans
BRE and ERE do not provide a way to negate a portion of the RE, except within a square bracket expression. That is, you can write [^a-z], but you can't express "not /(abc|def)/". If your regex dialect is ERE, then you must use two regexps. If you're using PCRE (for example through PHP's preg functions), you can use a negative look-ahead.
For example, here's some PHP:
#!/usr/local/bin/php
<?php
$re = '/^www\.example\.com\/(?!(system|pages)\/)([^\/]+\/)*([^\/]+)\/$/';
$test = array(
'www.example.com/foo/bar/baz/match/',
'www.example.com/shop/widgets/match/',
'www.example.com/match/',
'www.example.com/pages/widgets/match/',
'www.example.com/pages/',
'www.example.com/system/widgets/match/',
'www.example.com/system/',
);
foreach ($test as $one) {
preg_match($re, $one, $matches);
printf(">> %-50s\t%s\n", $one, $matches[3]);
}
And the output:
[ghoti@pc ~]$ ./phptest
>> www.example.com/foo/bar/baz/match/ match
>> www.example.com/shop/widgets/match/ match
>> www.example.com/match/ match
>> www.example.com/pages/widgets/match/
>> www.example.com/pages/
>> www.example.com/system/widgets/match/
>> www.example.com/system/
Is that what you're looking for?
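The negative look-ahead is not specific to PHP. As a hedged illustration, the same pattern can be tried in Python, whose re module also supports (?!...); the function name last_segment is invented for this sketch:

```python
import re

# (?!(?:system|pages)/) rejects URLs whose first directory is "system" or "pages".
# (?:[^/]+/)* skips any intermediate directories; ([^/]+) captures the last one.
URL_RE = re.compile(
    r"^www\.example\.com/(?!(?:system|pages)/)(?:[^/]+/)*([^/]+)/$")

def last_segment(url):
    """Return the final directory name, or None if the URL is excluded."""
    m = URL_RE.match(url)
    return m.group(1) if m else None

for url in ["www.example.com/shop/widgets/match/",
            "www.example.com/match/",
            "www.example.com/pages/widgets/match/",
            "www.example.com/system/"]:
    print(url, last_segment(url))
```

Returning None for excluded URLs avoids reading a capture group from a failed match.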
I tried to do a search on this particular problem, but all I get is either removal of duplicate lines or removal of repeated strings where they are separated by a delimiter.
My problem is slightly different. I have a string such as
"comp name1 comp name2 comp name2 comp name3"
where I want to remove the repeated comp name2 and return only
"comp name1 comp name2 comp name3"
They are not consecutive duplicate words, but consecutive duplicate substrings. Is there a way to solve this using regular expressions?
s/(.*)\1/$1/g
Be warned that the running time of this regular expression is quadratic in the length of the string.
This works for me (MacOS X 10.6.7, Perl 5.13.4):
use strict;
use warnings;
my $input = "comp name1 comp name2 comp name2 comp name3" ;
my $output = "comp name1 comp name2 comp name3" ;
my $result = $input;
$result =~ s/(.*)\1/$1/g;
print "In: <<$input>>\n";
print "Want: <<$output>>\n";
print "Got: <<$result>>\n";
The key point is the '\1' in the matching.
To avoid removing duplicate characters within the terms (e.g. comm1 -> com1), wrap .* in the regular expression with \b word boundaries:
s/(\b.*\b)\1/$1/g
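To see the effect of the word boundaries, the substitution can be tried in Python as well. This sketch uses .+ instead of .* to skip empty matches (a small deviation from the Perl one-liner), and the function names are made up for the example:

```python
import re

def drop_repeats(text):
    """Collapse a substring that is immediately repeated, on word boundaries only."""
    # (\b.+\b) captures a run that starts and ends on a word boundary;
    # \1 requires the same run to follow directly, and only one copy is kept.
    return re.sub(r"(\b.+\b)\1", r"\1", text)

def drop_repeats_unanchored(text):
    """Same idea without \\b, shown only to demonstrate the pitfall."""
    return re.sub(r"(.+)\1", r"\1", text)

print(drop_repeats("comp name1 comp name2 comp name2 comp name3"))
print(drop_repeats("comm1"))             # left alone: no boundary inside the word
print(drop_repeats_unanchored("comm1"))  # the doubled "m" gets collapsed
```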
I never work with languages that support this, but since you are using Perl, see this section:
Useful Example: Checking for Doubled Words
When editing text, doubled words such as "the the" easily creep in. Using the regex \b(\w+)\s+\1\b in your text editor, you can easily find them. To delete the second word, simply type in \1 as the replacement text and click the Replace button.
If you need something running in linear time, you could split the string and iterate through the list:
#!/usr/bin/perl
use strict;
use warnings;
my $str = "comp name1 comp name2 comp name2 comp name3";
my @elems = split("\\s", $str);
my $prevComp;
my $prevFlag = -1;
foreach my $elemIdx (0..(scalar @elems - 1)) {
if ($elemIdx % 2 == 1) {
if (defined $prevComp) {
if ($prevComp ne $elems[$elemIdx]) {
print " $elems[$elemIdx]";
$prevFlag = 0;
}
else {
$prevFlag = 1;
}
}
else {
print " $elems[$elemIdx]";
}
$prevComp = $elems[$elemIdx];
}
elsif ($prevFlag == -1) {
print "$elems[$elemIdx]";
$prevFlag = 0;
}
elsif ($prevFlag == 0) {
print " $elems[$elemIdx]";
}
}
print "\n";
Dirty, perhaps, but should run faster.
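The split-and-iterate idea can be written more compactly. The following Python sketch is a variation rather than a translation of the Perl above: it assumes every term is exactly two words (as in "comp name2") and compares whole pairs; drop_repeated_pairs is a hypothetical name:

```python
def drop_repeated_pairs(text):
    """Remove consecutive duplicate two-word terms in a single pass."""
    words = text.split()
    # Group the words two at a time: ("comp", "name1"), ("comp", "name2"), ...
    pairs = [tuple(words[i:i + 2]) for i in range(0, len(words), 2)]
    kept = []
    for pair in pairs:
        # Keep a pair only if it differs from the one kept just before it.
        if not kept or kept[-1] != pair:
            kept.append(pair)
    return " ".join(" ".join(pair) for pair in kept)

print(drop_repeated_pairs("comp name1 comp name2 comp name2 comp name3"))
```

Each word is visited once, so the work grows linearly with the length of the input.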