How can I remove the last comma from a string in Perl - regex

I have a string coming in from raw data. I can't guarantee that there might or might not be an extra comma. I thought I might be able to remove it like this:
$value = "cat, dog, fish, ";
$value =~ s/,//r;
Sadly that doesn't work. Of course I could do a loop to check the last char of the string one by one, but I would like to learn how to do it with the Regex backslash method.
Can someone help me please?

Try this
$value =~ s/,\s*$//;
The pattern ,\s*$ matches a comma (,) followed by zero or more space-chars (\s*), followed by the end of the line/input ($).

s/,// removes the first comma. So,
$value = reverse(reverse($value) =~ s/,//r);

Not sure why you are specifying /r in your code but not using the return value. If in fact you are using it, add it back.
s/.*\K,//
Ah, if there may not be a trailing comma that you don't want, this won't work; it will always delete the last comma. Use Bart's answer then.

The accepted answer removes a comma followed by zero or more white space characters at the end of a string. But you asked about removing the last comma. Either is consistent with your example, but if you really want to remove the last comma, one way is:
$value =~ s/,([^,]*$)/$1/
This will, for example, change "foo,bar,baz" to "foo,barbaz", and in your example"cat, dog, fish, "to"cat, dog, fish "` (leaving the trailing space).
The reverse trick in choruba's answer also works.
If nothing else, this shows the importance of a precise problem statement.

Using positive look ahead,
$value =~ s/,(?=[^,]*\z)//;

I suggest this pattern: ,*\s*$. It matches all commas (if any) and all white spaces (if any) and the end of the string.
A full example:
use 5.18.2;
use strict ;
use warnings ;
use Data::Dumper;
my $data = "cat, dog, fish,,,,,,,,,,,,, ";
$data =~ s/,*\s*$// ;
print $data;

Related

Make a regular expression in perl to grep value work on a string with different endings

I have this code in perl where I want to extract the value of 'EUR_AF', in this case '0.39'.
Sometimes 'EUR_AF' ends with ';', sometimes it doesn't.
Alternatively, 'EUR_AF' may end with '=0' instead of '=0.39;' or '=0.39'.
How do I make the code handle that? Can't seem to find it online...I could of course wrap everything in an almost endless if-elsif-else statement, but that seems overkill.
Example text:
AVGPOST=0.9092;AN=2184;RSQ=0.5988;ERATE=0.0081;AC=144;VT=SNP;THETA=0.0045;AA=A;SNPSOURCE=LOWCOV;LDAF=0.0959;AF=0.07;ASN_AF=0.05;AMR_AF=0.10;AFR_AF=0.11;EUR_AF=0.039
Code: $INFO =~ m/\;EUR\_AF\=(.*?)(;)/
I did find that: $INFO =~ m/\;EUR\_AF\=(.*?0)/ handles the cases of EUR_AF=0, but how to handle alternative scenarios efficiently?
Extract one value:
my ($eur_af) = $s =~ /(?:^|;)EUR_AF=([^;]*)/;
my ($eur_af) = ";$s" =~ /;EUR_AF=([^;]*)/;
Extract all values:
my %rec = split(/[=;]/, $s);
my $eur_af = $rec{EUR_AF};
This regex should work for you: (?<=EUR_AF=)\d+(\.\d+)?
It means
(?<=EUR_AF=) - look for a string preceeded by EUR_AF=
\d+(\.\d+)? - consist of a digit, optionally a decimal digit
EDIT: I originally wanted the whole regex to return the correct result, not only the capture group. If you want the correct capture group edit it to (?<=EUR_AF=)(\d+(?:\.\d+)?)
I have found the answer. The code:
$INFO =~ m/(?:^|;)EUR_AF=([^;]*)/
seems to handle the cases where EUR_AF=0 and EUR_AF=0.39, ending with or without ;. The resulting $INFO will be 0 or 0.39.

QRegex look ahead/look behind

I have been pondering on this for quite awhile and still can't figure it out. The regex look ahead/behinds. Anyway, I'm not sure which to use in my situation, I am still having trouble grasping the concept. Let me give you an example.
I have a name....My Business,LLC (Milwaukee,WI)_12345678_12345678
What I want to do is if there is a comma in the name, no matter how many, remove it. At the same time, if there is not a comma in the name, still read the line. The one-liner I have is listed below.
s/(.*?)(_)(\d+_)(\d+$)/$1$2$3$4/gi;
I want to remove any comma from $1(My Business,LLC (Milwaukee,WI)). I could call out the comma in regex as a literal string((.?),(.?),(.*?)(_.*?$)) if it was this EXACT situation everytime, however it is not.
I want it to omit commas and match 'My Business, LLC_12345678_12345678' or just 'My Business_12345678_12345678', even though there is no comma.
In any situation I want it to match the line, comma or not, and remove any commas(if any) no matter how many or where.
If someone can help me understand this concept, it will be a breakthrough!!
Use the /e modifier of Perl so that you can pass your function during the replace in s///
$str = 'My Business,LLC (Milwaukee,WI)_12345678_12345678';
## modified your regex as well using lookahead
$str =~ s/(.*?)(?=_\d+_\d+$)/funct($1)/ge;
print $str;
sub funct{
my $val = shift;
## replacing , with empty, use anything what you want!
$val =~ s/,//g;
return $val;
}
Using funct($1) in substitute you are basically calling the funct() function with parameter $1

remove up to _ in perl using regex?

How would I go about removing all characters before a "_" in perl? So if I had a string that was "124312412_hithere" it would replace the string as just "hithere". I imagine there is a very simple way to do this using regex, but I am still new dealing with that so I need help here.
Remove all characters up to and including "_":
s/^[^_]*_//;
Remove all characters before "_":
s/^[^_]*(?=_)//;
Remove all characters before "_" (assuming the presence of a "_"):
s/^[^_]*//;
This is a bit more verbose than it needs to be, but would be probably more valuable for you to see what's going on:
my $astring = "124312412_hithere";
my $find = "^[^_]*_";
my $replace = "_";
$astring =~ s/$find/$replace/;
print $astring;
Also, there's a bit of conflicting requirements in your question. If you just want hithere (without the leading _), then change it to:
$astring =~ s/$find//;
I know it's slightly different than what was asked, but in cases like this (where you KNOW the character you are looking for exists in the string) I prefer to use split:
$str = '124312412_hithere';
$str = (split (/_/, $str, 2))[1];
Here I am splitting the string into parts, using the '_' as a delimiter, but to a maximum of 2 parts. Then, I am assigning the second part back to $str.
There's still a regex in this solution (the /_/) but I think this is a much simpler solution to read and understand than regexes full of character classes, conditional matches, etc.
You can try out this: -
$_ = "124312412_hithere";
s/^[^_]*_//;
print $_; # hithere
Note that this will also remove the _(as I infer from your sample output). If you want to keep the _ (as it seems doubtful what you want as per your first statement), you would probably need to use look-ahead as in #ikegami's answer.
Also, just to make it little more clear, any substitution and matching in regex is applied by default on $_. So, you don't need to bind it to $_ explicitly. That is implied.
So, s/^[^_]*_//; is essentially same as - $_ =~ s/^[^_]*_//;, but later one is not really required.

Perl search and replace the last character occurrence

I have what I thought would be an easy problem to solve but I am not able to find the answer to this.
How can I find and replace the last occurrence of a character in a string?
I have a string: GE1/0/1 and I would like it to be: GE1/0:1 <- This can be variable length so no substrings please.
Clarification:
I am looking to replace the last / with a : no matter what comes before or after it.
use strict;
use warnings;
my $a = 'GE1/0/1';
(my $b = $a) =~ s{(.*)/}{$1:}xms;
print "$b\n";
I use the greedy behaviour of .*
Perhaps I have not understand the problem with variable length, but I would do the following :
You can match what you want with the regex :
(.+)/
So, this Perl script
my $text = 'GE1/0/1';
$text =~ s|(.+)/|$1:|;
print 'Result : '.$text;
will output :
Result : GE1/0:1
The '+' quantifier being 'greedy' by default, it will match only the last slash character.
Hope this is what you were asking.
This finds a slash and looks ahead to make sure there are no more slashes past it.:
Raw regex:
/(?=[^/]*$)
I think the code would look something like this, but perl isn't my language:
$string =~ s!/(?=[^/]*$)!\:!g;
"last occurrence in a string" is slightly ambiguous. The way I see it, you can mean either:
"Foo: 123, yada: GE1/0/1, Bar: null"
Meaning the last occurrence in the "word" GE1/0/1, or:
"GE1/0/1"
As a complete string.
In the latter case, it is a rather simple matter, you only have to decide how specific you can be in your regex.
$str =~ s{/(\d+)$}{:$1};
Is perfectly fine, assuming the last character(s) can only be digits.
In the former case, which I don't think you are referring to, but I'll include anyway, you'd need to be much more specific:
$str =~ s{(\byada:\s+\w+/\w+)/(\w+\b)}{$1:$2};

Regex to match suffixes to english words

I'm searching for the word "move" and i want to match "moved" as well when I print.
The way I'm going about this is:
if ($sentence =~ /($search_key)d$/i) {
$search_key = $search_keyd;
}
$subsentences[$i] =~ s/$search_key/ **$search_key** /i;
$subsentences[$i] =~ s/\b$parsewords[1]_\w+/ --$parsewords[1]--/i;
print "MATCH #$count\n",split(/_\S+/,$subsentences[$i]), "\n";
$count++;
This is part of a longer code so if anything is unclear let me know. The _ is because the words in the sentence are tagged (ex. I_NN move_VB to_PREP ....).
Where $search_keyd will be $search_key."d", which worked!
A nice addition would be to check if the word ended in e and therefore only a d would need to be appended. I'd guess it'd look something like this: e?$/d$
Even a general answer will suffice.
I'm new to Perl. So sorry if this is elementary. Thanks in advance!!!
If I understand you correctly, you want to search for "move" and add a highlight, but also include any variation of the basic word, such as "moves" "moved".
When you are replacing words in a text like this, you usually want to replace all the words, and then you need the /g operator on the regex, like so:
$subsentences[$i] =~ s/$search_key/ **$search_key** /ig
Also, you should make sure to not match partials of words. E.g. you want to match "move", but not perhaps "remove". For this, you can use \b to mark word boundry:
$subsentences[$i] =~ s/\b$search_key/ **$search_key** /ig
In order to match certain suffixes, you need a character class with valid characters or combination of characters. move[sd] will find "moves" and "moved". However, for a word like "jump", you would need to be a bit more specific: "jump(s|ed)". Note that [sd] can be replaced with (s|d). So barring any bad spelling in your text, you can get away with:
$subsentences[$i] =~ s/\b$search_key(s|d|ed)/ **$search_key$1** /ig
Note that $1 matches whatever is found inside the first matching parenthesis.
To find the number of matching words:
my $matches = $subsentences[$i] =~ s/\b$search_key(s|d|ed)/ **$search_key$1** /ig
If you want to be more specific with the suffixes, i.e. make it not match badly spelled words like "moveed", you'd need to do some special matching. Something like:
if ($search_key =~ /e$/i) { $suffix = '(s|d)' }
else { $suffix = '(s|ed)' }
my $matches = $subsentences[$i] =~ s/\b$search_key$suffix/ **$search_key$1** /ig
It can probably become very complicated the more search words you add.
Some help about regexes here
If what you want is to match all complete words which begin with your search term, i.e. 'move' matches 'move', 'moved', 'movers', etc, then you want to use a character class to detect the end of the word.
So, instead of:
if ($sentence =~ /($search_key)d$/i)
Try using:
if ($sentence =~ /($search_key\w*)\W$/i)
The \w* will match any number of standard word characters and the \W should prevent you from including other characters, such as whitespace or punctuation.