how to substitute the matched pattern with new pattern in perl? [closed] - regex

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I am trying to replace the string which matches the following pattern:
my $string ="+++details of candidate
++name of candidate
+age of candidate
+++idiot
+idi";
That string should be replaced with:
my $string ="hii
how
fine
hii
fine";
How can I accomplish this?

maybe:
$string =~ s/(\++)([^\+\n]*)/(qw(fine how hii))[length($1)-1]/ge;
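The one-liner above picks the replacement from the length of the `+` run. A minimal sketch of the same idea with a hash lookup, which may be easier to extend. The sub name `replace_plus_lines` and the `%word_for` table are illustrative and not part of the original answer, and it assumes that lines with more than three `+` should be left untouched.

```perl
use strict;
use warnings;

# Hypothetical lookup table: number of leading '+' => replacement word.
my %word_for = (1 => 'fine', 2 => 'how', 3 => 'hii');

# Replace each line that starts with a run of '+' by the word chosen
# from the length of that run; unknown lengths keep the original line.
sub replace_plus_lines {
    my ($text) = @_;
    $text =~ s{^((\++)[^\n]*)}{ $word_for{ length($2) } // $1 }gme;
    return $text;
}

my $string = "+++details of candidate\n++name of candidate\n+age of candidate\n+++idiot\n+idi";
print replace_plus_lines($string), "\n";
```

The `//` (defined-or) operator needs Perl 5.10 or later.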

Use
$string =~ s/([^+]*)[+]+((\w+).*)([^ \r\n]*|$)/$1<div class="$3">$2<\/div>/g;
$string =~ s/ class="details"//g;
Explanation
searches for a sequence of + ...
... , copies preceding characters other than + ...
... , appends content following the + sequence, wraps it in a div tag with a class attribute containing the first word of said content ...
... , where the end of 'content' is marked by a sequence of whitespace, line feed or carriage return; ...
... repeats that substitution effectively for every line and concats the results;
In the second substitution, deletes the unwanted class attributes (in this case, 'details').
Notes
The 'details' could have been singled out differently, e.g. by counting the + characters.
There are other possible solutions, in particular those employing advanced features of regex engines like positive lookbehind.
Caveat
Consider the solution as it stands as a proof of concept:
The specification of the context for substitutions in the regex crucially depends on the context in which the regex will be applied, i.e. the set of potential strings it will have to match against.
The pattern that you have provided is a small set of samples. Literally taken, the pattern would precisely be the set of strings you've provided. The generalisations expressed by the regex are my own assumptions on the actual nature of your pattern, of course.
Update
The OP updated the question and requested a different replacement:
$string =~ s/[+]{3}((\w+).*)([^ \r\n]*|$)/hii/g;
$string =~ s/[+]{2}((\w+).*)([^ \r\n]*|$)/how/g;
$string =~ s/[+]{1}((\w+).*)([^ \r\n]*|$)/fine/g;
The order by decreasing length of + sequences in the pattern is significant.
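To see why the order matters, here is a small sketch that applies the same kind of rules in both orders. The helper `apply_rules` and the simplified pattern `[+]{N}\w.*` are assumptions made for illustration; they are not the exact expressions from the answer.

```perl
use strict;
use warnings;

# Apply [count, word] rules in the order given. Each rule replaces a run
# of exactly `count` plus signs followed by text up to the end of the line.
sub apply_rules {
    my ($text, @rules) = @_;
    for my $rule (@rules) {
        my ($count, $word) = @$rule;
        $text =~ s/[+]{$count}\w.*/$word/g;
    }
    return $text;
}

my $sample = "+++idiot\n++name\n+age";

# Longest run first: every line gets the intended word.
print apply_rules($sample, [3, 'hii'], [2, 'how'], [1, 'fine']), "\n";

# Shortest run first: the single-plus rule also fires inside longer runs,
# so the result is no longer the intended text.
print apply_rules($sample, [1, 'fine'], [2, 'how'], [3, 'hii']), "\n";
```

Anchoring each pattern at the start of a line (for example with `^` and the `/m` flag) would be another way to make the rules independent of their order.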

Related

Perl Regex: Adding a symbol in front of another symbol in a string [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 4 years ago.
If I had a string:
my $string = "a/hello/bye/d";
I would like to add a "\" symbol in front of every "/" symbol found inside the string. Is there any way to do this?
Example:
$string = "a\/hello\/bye\/d";
Change the regular expression delimiter to | and substitute every forward slash / with backslash-forward-slash \/. The backslash must itself be escaped, since it is the escape character, so the replacement is written \\/. The leading s means substitute and the trailing g means perform the replacement everywhere: s|/|\\/|g.
Have a read of perldoc perlretut for a friendly introduction to regular expressions.
my $string = "a/hello/bye/d";
$string =~ s|/|\\/|g;
print $string . "\n";
Output:
a\/hello\/bye\/d
You could use quotemeta() to achieve that. Note that it escapes every non-word character, not only /.
$ perl -e 'my $string = "a/hello/bye/d"; print quotemeta($string); print "\n"'
a\/hello\/bye\/d
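The two answers behave the same on the sample string but differ on other input. A short sketch of the difference; the sub name `escape_slashes` and the test path are made up for illustration.

```perl
use strict;
use warnings;

# Escape only forward slashes, leaving every other character alone.
sub escape_slashes {
    my ($text) = @_;
    (my $escaped = $text) =~ s{/}{\\/}g;
    return $escaped;
}

my $path = "a/hello world/b.txt";
print escape_slashes($path), "\n";   # only the slashes gain a backslash
print quotemeta($path), "\n";        # the space and the dot are escaped too
```

If only the slashes should change, the substitution is the safer choice; quotemeta() is meant for making a whole string safe to embed in a regex.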

Hi , I need a regex to match everything and convert to upper case except the text that are inside the single quotes in perl [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
Assuming I have a sentence like:
anything but 'notthis' and also 'not this' and 'notthis'
I would like my output to be:
ANYTHING BUT 'notthis' AND ALSO 'not this' AND 'notthis'
Any help would be highly appreciated!
The following substitution should hopefully work for you:
s/((?:'[^']*')*)([^']+)/$1\U$2/g;
Parentheses form capturing groups; the first left parenthesis creates a back reference called $1, the next one is $2, etc. To avoid creating a group, use the form (?: which is a parenthesis that does not create a back reference.
The first group captures and skips phrases in single quotes; they are substituted back as themselves in $1. The group can match an empty string.
The second group captures a string which does not contain single quotes. Because the + repetition operator is greedy, it will match up to just before the next single quote, or end of string. It is converted to uppercase by \U in the substitution.
The /g flag repeats as many times as possible, starting over where the previous match ended.
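One caveat with the substitution above: when the string ends in a quoted phrase (as the sample sentence does), nothing is left for the second group to match, so the engine retries one character later and appears to uppercase the inside of that last phrase. A sketch of an alternation-based variant that avoids this; the sub name `upcase_outside_quotes` is illustrative and this is not the original answerer's code.

```perl
use strict;
use warnings;

# Uppercase everything outside single-quoted phrases.
sub upcase_outside_quotes {
    my ($text) = @_;
    # Alternative 1 matches a quoted phrase and puts it back unchanged;
    # alternative 2 matches a run without quotes and uppercases it.
    $text =~ s{('[^']*')|([^']+)}{ defined($1) ? $1 : uc($2) }ge;
    return $text;
}

print upcase_outside_quotes("anything but 'notthis' and also 'not this' and 'notthis'"), "\n";
```

Because the quoted alternative is tried first at every position, a quoted phrase is always consumed whole, wherever it sits in the string.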
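my $string = "your string 'no upper case' sample";
# Split on quoted phrases; the capturing group keeps them in the list.
my @arr = split /('.*?')/, $string;
foreach my $data (@arr)
{
    # Uppercase only the pieces that contain no single quote.
    if ($data !~ /'/)
    {
        $data = uc $data;
    }
}
$string = join '', @arr;
print "$string\n";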

Perl Regex Ban Log [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I want to be able to match all of the following strings with my regex below. It doesn't seem to be working. Any suggestions?
Strings to compare :
5878ce43aa3f1e1d713427d118115310 -1 Script Kiddie <perm>
f939f88b50fa5f0099b6751e7be27761 -1 Hacking <perm>
468f6634c5a9b00b5b3872dd6437143f 1356474103 Being Annoying <7day>
This is my Perl code. It isn't working at the moment. Any suggestions?
my $bn_re = q{(.+?) (\d+) (.+?)};
If the first two fields are always without whitespace in them, you can use split to great effect, using the LIMIT option to only get three fields:
my ($str, $num, $other) = split ' ', $_, 3;
That is, assuming you read the file something like this:
while (<>) {
... # your code here
}
Also, this:
my $bn_re = q{(.+?) (\d+) (.+?)};
is not a regex; it is a plain string. You may be confusing q() with qr(). The q() operator does what the single quote does. You may also be relying on the behaviour of
$str =~ $bn_re;
which automagically wraps the string in a match operator m//. It works, but you should use qr().
Also, be aware that .+? matches as little as it can, so with nothing after it, as at the end of your "regex", it matches a single character. At the end of the pattern, do either of these:
... (.+)/ # matching greedily
... (.+?)$/ # using anchor to end of string
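Putting these points together, a minimal sketch with qr//. It also allows a leading minus sign, because the `(\d+)` in the question cannot match the `-1` in the sample lines. The field layout (hash, integer, reason, `<duration>`) is inferred from the three sample strings, and the sub name `parse_ban_line` is hypothetical.

```perl
use strict;
use warnings;

# Assumed layout: hash, integer (possibly negative), reason, <duration>.
my $bn_re = qr/^(\S+)\s+(-?\d+)\s+(.+?)\s+<(\w+)>\s*$/;

# Return the four fields, or an empty list when the line does not match.
sub parse_ban_line {
    my ($line) = @_;
    my @fields = $line =~ $bn_re;
    return @fields;
}

for my $line (
    '5878ce43aa3f1e1d713427d118115310 -1 Script Kiddie <perm>',
    '468f6634c5a9b00b5b3872dd6437143f 1356474103 Being Annoying <7day>',
) {
    my ($hash, $time, $reason, $duration) = parse_ban_line($line);
    print "$hash | $time | $reason | $duration\n";
}
```

Here the lazy `(.+?)` is safe because the rest of the pattern and the `$` anchor force it to extend up to the final `<...>` field.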
$bn_re =~ /[0-9a-z]+?\s[-0-9]+\s[\w\s]+?[<>a-z0-9]+?/i

regexp optimizer [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a big list of simple expressions (2Mb file). For example:
11.*;112.*;113.*;12.*;123.*
I need to remove unnecessary expressions and come up with this:
11.*;12.*
A bash version would be appreciated. Thanks in advance.
Here is something in Perl that would work, provided the only wildcards in your pattern are of the form .*:
#!/usr/bin/perl
use strict;
use warnings;
my %terms;
{
    local $/;
    %terms = map {$_ => 1} split /;|\n/, <>;
}
foreach my $k1 (keys %terms)
{
    foreach my $k2 (keys %terms)
    {
        if ($k1 ne $k2 and $k1 =~ /^$k2$/)
        {
            delete $terms{$k1};
            last;
        }
    }
}
print join ';', keys %terms;
It accepts your file as a command line argument.
This works by comparing keys to each other. In each comparison, one key is treated as a string and the other key is evaluated as a regex. This takes advantage of the fact that .* matches anything, including the literal characters .*. Thus an expression that matches the literal string of another pattern will also match all strings that pattern would match.
It will work even if there are multiple .* terms in a single pattern. For example, it correctly determines that 1.*1.* matches everything that 11.* matches, deleting the latter.
However, this is kind of a hacky simplification and will not work if you introduce other regex patterns. There is no easy solution to this problem in general, because you would have to parse all patterns and figure out what each would match.
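For experimenting without a file, the same comparison can be wrapped in a function. This is only a sketch of the approach described above: the name `prune_patterns` is made up, and the result is sorted so that the output order is stable (a hash returns its keys in no particular order).

```perl
use strict;
use warnings;

# Return the patterns that are not covered by another pattern in the list.
sub prune_patterns {
    my %terms = map { $_ => 1 } @_;
    for my $k1 (keys %terms) {
        for my $k2 (keys %terms) {
            # $k2 is used as a regex, $k1 as a plain string.
            if ($k1 ne $k2 and $k1 =~ /^$k2$/) {
                delete $terms{$k1};
                last;
            }
        }
    }
    return sort keys %terms;
}

print join(';', prune_patterns(split /;/, '11.*;112.*;113.*;12.*;123.*')), "\n";
```

The nested loop is quadratic in the number of patterns, which may be slow for a 2 MB list; that cost is inherent in comparing every pair.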

Need a regular expression to return text from last "/" to last "-" [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I just can't seem to be able to figure out how to match the following
in the string /hello/there-my-friend
I need to capture everything after the last / and before the last -
So it should capture there-my.
Here's the Regular Expression you're looking for:
#(?<=/)[^/]+(?=-[^-/]*$)#
I'll break it down in a minute, but there are probably better ways to do this.
I might do something like this:
$str = "/hello/there-my-friend";
$pieces = explode('/', $str);
$afterLastSlash = $pieces[count($pieces)-1];
$dashes = explode('-', $afterLastSlash);
unset($dashes[count($dashes)-1]);
$result = implode('-', $dashes);
The performance here is guaranteed linear (the limiting factor being the length of $str plus the length of $afterLastSlash). The regular expression is going to be much slower (as much as polynomial time, I think; it can get a little dicey with lookarounds).
The code above could easily be pared down, but the naming makes it more clear. Here it is as a one liner:
$result = implode('-', array_slice(explode('-', array_slice(explode('/', $str), -1)[0]), 0, -1));
But gross, don't do that. Find a middle ground.
As promised, a breakdown of the regular expression:
#
(?<= Look behind and ensure there's a...
/ Literal forward slash.
) Okay, done looking behind.
[^/] Match any character that's not a forward slash
+ ...One or more times.
(?= Now look ahead, and ensure there's...
- a hyphen.
[^-/] followed by any non-hyphen, non-forward slash character
* zero or more times
$ until the end of the string.
) Okay, done looking ahead.
#
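For readers working in Perl, the expression broken down above can be tried as follows. This is a sketch: the sub names are illustrative, and the second sub is a variant using a plain capture group instead of lookarounds, which is not taken from the answer.

```perl
use strict;
use warnings;

# Lookaround version: the capture holds the text between the last '/'
# and the last '-'.
sub between_lookaround {
    my ($path) = @_;
    return $path =~ m{(?<=/)([^/]+)(?=-[^-/]*$)} ? $1 : undef;
}

# Capture-group version: greedy [^/]+ backtracks to the last '-'.
# \z anchors at the very end of the string.
sub between_capture {
    my ($path) = @_;
    return $path =~ m{([^/]+)-[^-/]*\z} ? $1 : undef;
}

print between_lookaround("/hello/there-my-friend"), "\n";
print between_capture("/hello/there-my-friend"), "\n";
```

Both subs return undef when the last path segment contains no hyphen.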
^".*/([^/-]*)-[^/-]*$
Syntax may vary depending on which flavor of RE you are using.
Try this short regex :
/\K\w+-\w+
Your regex engine needs \K support
or
(?<=/)\w+-\w+
(more portable)
Explanations
\K is close to (?<=/) : a look-around regex advanced technique
\w is the same as [a-zA-Z0-9_], feel free to adapt it
This will do it:
(?!.*?/).*(?=-)
Depending on your language, you might need to escape the /
Breakdown:
1. (?!.*?/) - Negative look ahead. It will start collecting characters after the last `/`
2. .* - Looks for all characters
3. (?=-) - Positive look ahead. It means step 2 should only go up to the last `-`
Edited after comment: No longer includes the / and the last - in the results.
This is not an exact answer to your question (it's not a regex), but if you are using C# you might use this:
string str = "/hello/there-my-friend";
int lastSlashIndex = str.LastIndexOf('/');
int lastDashIndex = str.LastIndexOf('-');
// Start one past the slash so the slash itself is not included.
return str.Substring(lastSlashIndex + 1, lastDashIndex - lastSlashIndex - 1);
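The same index-based idea can be sketched in Perl with rindex and substr. The sub name `between_last_slash_and_dash` is made up for illustration, and returning nothing when either character is missing or out of order is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Text between the last '/' and the last '-' that follows it.
sub between_last_slash_and_dash {
    my ($path) = @_;
    my $slash = rindex($path, '/');
    my $dash  = rindex($path, '-');
    # Give up if there is no slash, or no hyphen after the last slash.
    return if $slash < 0 || $dash < $slash;
    return substr($path, $slash + 1, $dash - $slash - 1);
}

print between_last_slash_and_dash("/hello/there-my-friend"), "\n";
```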