javascript replace last occurrence of string - regex

I've read many Q&As on StackOverflow and I'm still having a hard time getting regex.
I have the string 12_13_12.
How can I replace the last occurrence of 12 with aa?
The final result should be 12_13_aa.
I would really like a good explanation of how you did it.

You can use this replace:
var str = '12-44-12-1564';
str = str.replace(/12(?![\s\S]*12)/, 'aa');
console.log(str);
Explanation:
12 # the literal 12
(?! # open a negative lookahead (means "not followed by")
[\s\S]* # any characters, including newlines (whitespace + non-whitespace),
# zero or more times
12 # another literal 12
) # close the lookahead
In other words the pattern means: 12 not followed by another 12 until the end of the string.
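The same lookahead idea carries over to other regex engines. Here is a minimal sketch in Python; the helper name replace_last is made up for illustration, and re.escape is used on the assumption that the text to replace should be treated literally rather than as a pattern:

```python
import re

def replace_last(text, target, replacement):
    """Replace the last occurrence of target in text (hypothetical helper)."""
    literal = re.escape(target)  # treat the target as plain text, not as a pattern
    # the target, not followed anywhere later in the string by another target
    pattern = literal + r"(?![\s\S]*" + literal + r")"
    # a function replacement keeps any backslashes in `replacement` literal
    return re.sub(pattern, lambda _match: replacement, text, count=1)

print(replace_last("12_13_12", "12", "aa"))       # 12_13_aa
print(replace_last("12-44-12-1564", "12", "aa"))  # 12-44-aa-1564
```

If the target does not occur at all, the string comes back unchanged.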

newString = oldString.substring(0, oldString.lastIndexOf("_") + 1) + 'aa';
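That one-liner relies on the value to replace being the last _-separated field. A regex-free sketch that searches for the substring itself, written in Python with a made-up helper name (replace_last_plain), might look like this:

```python
def replace_last_plain(text, target, replacement):
    """Replace the last occurrence of target without a regex (hypothetical helper)."""
    index = text.rfind(target)  # position of the last occurrence, or -1
    if index == -1:
        return text             # nothing to replace
    return text[:index] + replacement + text[index + len(target):]

print(replace_last_plain("12_13_12", "12", "aa"))  # 12_13_aa
```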

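Use this String.replace with a negative lookahead, so that only a 12 that has no later 12 after it is matched:
repl = "12_13_12".replace(/12(?!.*?12)/, 'aa');
EDIT: To use a variable in the regex:
// assumes ToBeReplaced contains no regex metacharacters
var re = new RegExp(ToBeReplaced + '(?!.*?' + ToBeReplaced + ')');
repl = str.replace(re, 'aa');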

Regex to remove trailing optional garbage

I want to clean strings that may contain garbage at the end, always separated by a forward slash / and if there is no garbage, there is no separator.
Example > expected output
Foo/Bar > Foo
Foobar > Foobar
I tried several versions like these to extract the payload only; none of them worked:
(.*)\/.*
(.*)?\/.*
(.*)?\/*.*
And so on. The problem is that I always get only the first or the second line to match.
What would be the correct expression to extract the wanted information?
Your first and second patterns require a literal / in the line, so they cannot match a line such as Foobar, where no / is present.
The third pattern matches the whole line because \/* matches zero or more forward slashes. The greedy capture group consumes the whole line, and the trailing .* then matches nothing, as the capture group is already at the end of the line.
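To see this behaviour for yourself, here is a small Python sketch that runs the three attempts from the question against both sample lines. The helper name capture is invented for illustration, and re.fullmatch is used on the assumption that the whole line is meant to be matched:

```python
import re

def capture(pattern, line):
    """Return group 1 if pattern matches the whole line, else None (hypothetical helper)."""
    match = re.fullmatch(pattern, line)
    return match.group(1) if match else None

attempts = [r"(.*)\/.*", r"(.*)?\/.*", r"(.*)?\/*.*"]
lines = ["Foo/Bar", "Foobar"]

for pattern in attempts:
    for line in lines:
        print(pattern, "on", line, "->", capture(pattern, line))
```

The first two attempts return None for Foobar, while the third captures the entire line in both cases, including the /Bar part that was supposed to be dropped.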
You could write the pattern with a capture group for 1 or more word characters as the first part, and an optional second part starting the match from / till the end of the string.
In the replacement you can use the first capture group.
^(\w+)(?:\/.*)?$
^ Start of string
(\w+) Capture 1+ word characters in group 1
(?:\/.*)? Optionally match / and the rest of the line (to be removed after the replacement)
$ End of string
There is no language listed, but an example using JavaScript:
const regex = /^(\w+)(?:\/.*)?$/gm;
const str = `Foo/Bar
Foobar`;
const result = str.replace(regex, "$1");
console.log(result);
Example using Python
import re
regex = r"^(\w+)(?:\/.*)?$"
test_str = ("Foo/Bar\n"
            "Foobar")
# pass flags by keyword; the fourth positional argument of re.sub is count
result = re.sub(regex, r'\1', test_str, flags=re.MULTILINE)
if result:
    print(result)
Output
Foo
Foobar
You can use replace here as:
const cleanString = (str) => str.replace(/\/.*/, "");
console.log(cleanString("Foo/Bar"));
console.log(cleanString("Foobar"));
This task doesn't need the power of regex; you only need to split on the first slash, e.g. in Python:
test_string.split('/', 1)[0]
I think the reason your regex doesn't work is that Foobar has no / to match on. So for regex you need to handle none, one, or many slashes. Again, in Python:
>>> import re
>>> test = ['foobar', 'foo/bar', 'foo/bar/baz']
>>> for s in test:
...     print(re.findall('^(.*?)(?=/|$)', s))
...
['foobar']
['foo']
['foo']
The regex says: from the start of the string, group all characters (non-greedy) until either a slash or the end of the string.
You can try splitting on / with re.split and selecting the first element of the list. For example, in Python:
import re
new_string = re.split('/', string)[0]

Ruby Regex - If the string is more than 10 characters, remove the first character if it is a "1"

Without using a gem, I just want to write a simple regex to remove the first character from a string if it's a 1 and the string has more than 10 characters. I never expect more than 11 characters; 11 should be the max. But if there are 10 characters and the string begins with "1", I don't want to remove it.
str = "19097147835"
str&.remove(/\D/).sub(/^1\d{10}$/, "\1").to_i
Returns 0
I'm looking for it to return "9097147835"
You could use your pattern, but add a capture group around the 10 digits to use the group in the replacement.
\A1(\d{10})\z
For example
str = "19097147835"
puts str.gsub(/\D/, '').sub(/\A1(\d{10})\z/, '\1').to_i
Output
9097147835
Another option could be removing all the non-digits and matching the last 10 digits:
\A1\K\d{10}\z
\A Start of string
1\K Match 1 and forget what is matched so far
\d{10} Match 10 digits
\z End of string
str = "19097147835"
str.gsub(/\D/, '').match(/\A1\K\d{10}\z/) do |match|
  puts match[0].to_i
end
Output
9097147835
You can use
str.gsub(/\D/, '').sub(/\A1(?=\d{10})/, '').to_i
The regex matches
\A - start of string
1 - a 1
(?=\d{10}) - immediately to the right of the current location, there must be 10 digits.
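For comparison, the same lookahead approach can be sketched in Python, whose re module also supports \A and lookaheads. The function name strip_country_code is an invention for this example, and it returns a string rather than an integer:

```python
import re

def strip_country_code(raw):
    """Drop a leading 1 when at least 10 digits follow it (hypothetical helper)."""
    digits = re.sub(r"\D", "", raw)              # remove everything that is not a digit
    return re.sub(r"\A1(?=\d{10})", "", digits)  # remove the 1 only if 10 digits follow

print(strip_country_code("19097147835"))  # 9097147835
print(strip_country_code("1909714783"))   # 1909714783 (only 10 digits, left alone)
```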
Non regex example:
str = str[1..] if (str.start_with?("1") and str.size > 10)
Regexes are powerful, but not easy to maintain.

Regex in PHP: take all the words after the first one in string and truncate all of them to the first character

I'm quite terrible at regexes.
I have a string that may have 1 or more words in it (generally 2 or 3), usually a person name, for example:
$str1 = 'John Smith';
$str2 = 'John Doe';
$str3 = 'David X. Cohen';
$str4 = 'Kim Jong Un';
$str5 = 'Bob';
I'd like to convert each as follows:
$str1 = 'John S.';
$str2 = 'John D.';
$str3 = 'David X. C.';
$str4 = 'Kim J. U.';
$str5 = 'Bob';
My guess is that I should first match the first word, like so:
preg_match( "^([\w\-]+)", $str1, $first_word )
then all the words after the first one... but how do I match those? Should I use preg_match again with offset = 1 in the arguments? But that offset is in characters or bytes, right?
Anyway, after I have matched the words following the first, if they exist, should I do something like this for each of them:
$second_word = substr( $following_word, 1 ) . '. ';
Or is my approach completely wrong?
Thanks
PS: it would be a boon if the regex could keep the first two words whole when the string contains three or more words (e.g. 'Kim Jong U.').
It can be done in single preg_replace using a regex.
You can search using this regex:
^\w+(?:$| +)(*SKIP)(*F)|(\w)\w+
And replace by:
$1.
Code:
$name = preg_replace('/^\w+(?:$| +)(*SKIP)(*F)|(\w)\w+/', '$1.', $name);
Explanation:
(*FAIL), abbreviated to (*F) in the pattern above, behaves like a failing negative assertion and is a synonym for (?!)
(*SKIP) defines a point beyond which the regex engine is not allowed to backtrack when the subpattern fails later
(*SKIP)(*FAIL) together provide a nice way around the restriction that you cannot use a variable-length lookbehind in the regex above.
^\w+(?:$| +)(*SKIP)(*F) matches first word in a name and skips it (does nothing)
(\w)\w+ matches all other words and replaces it with first letter and a dot.
You could use a positive lookbehind assertion.
(?<=\h)([A-Z])\w+
OR
Use this regex if you want to turn Bob F to Bob F.
(?<=\h)([A-Z])\w*(?!\.)
Then replace the matched characters with \1.
Code would be like,
preg_replace('~(?<=\h)([A-Z])\w+~', '\1.', $string);
(?<=\h)([A-Z]) Captures each uppercase letter that is preceded by a horizontal whitespace character.
\w+ matches one or more word characters.
Replace the matched chars with the chars inside the group index 1 \1 plus a dot will give you the desired output.
A simple solution with only look-ahead and word boundary check:
preg_replace('~(?!^)\b(\w)\w+~', '$1.', $string);
(\w)\w+ is a word in the name, with the first character captured
(?!^)\b performs a word boundary check \b, and makes sure the match is not at the start of the string (?!^).
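This lookahead plus word-boundary pattern uses nothing PCRE-specific, so it can be tried in other engines too. A rough sketch in Python, using the names from the question (the function name abbreviate is made up here):

```python
import re

def abbreviate(name):
    """Shorten every word except the first to its initial plus a dot (hypothetical helper)."""
    # (?!^) rejects a match at the very start, \b requires a word boundary,
    # (\w)\w+ captures the first character of a word of two or more characters
    return re.sub(r"(?!^)\b(\w)\w+", r"\1.", name)

for name in ["John Smith", "John Doe", "David X. Cohen", "Kim Jong Un", "Bob"]:
    print(abbreviate(name))
```

Since \w+ needs at least one more word character after the captured one, an existing initial such as X. is left untouched.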

java regex : getting a substring from a string which can vary

I have a String like "Bangalore,India=Karnataka". From this String I would like to extract only the substring "Bangalore". In this case the regex can be (.+),.*=.*. But the problem is that the String can sometimes be just "Bangalore", and in that case the above regex won't work. What regex will get the substring "Bangalore" whatever the String is?
Try this one:
^(.+?)(?:,.*?)?=.*$
Explanation:
^ # beginning of the string
( # beginning of capture group 1
.+? # one or more any char non-greedy
) # end of group 1
(?: # beginning of NON capture group
, # a comma
.*? # 0 or more any char non-greedy
)? # end of non capture group, optional
= # equal sign
.* # 0 or more any char
$ # end of string
Updated:
I thought the OP had to match Bangalore,India=Karnataka or Bangalore=Karnataka, but as far as I understand it is Bangalore,India=Karnataka or Bangalore, so the regex is much simpler:
^([^,]+)
This will match, at the beginning of the string, one or more non-comma characters and capture them in group 1.
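As a quick check of the simpler pattern, here is a sketch in Python rather than Java; the function name first_field is hypothetical, and re.match is used because it already anchors at the start of the string:

```python
import re

def first_field(text):
    """Return everything before the first comma, or None for no match (hypothetical helper)."""
    match = re.match(r"([^,]+)", text)  # re.match only matches at the start, like ^
    return match.group(1) if match else None

print(first_field("Bangalore,India=Karnataka"))  # Bangalore
print(first_field("Bangalore"))                  # Bangalore
```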
matcher.matches()
tries to match against the entire input string. Look at the javadoc for java.util.regex.Matcher. You need to use:
matcher.find()
Are you somehow forced to solve this using one regexp and nothing else? (Stupid interview question? Extremely inflexible external API?) In general, don't try to make regexes do what plain old programming constructs do better. Just use the obvious regex, and if it doesn't match, return the entire string instead.
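A minimal sketch of that fallback idea, shown in Python for brevity (the function name extract_city is made up; the pattern is the one from the question):

```python
import re

def extract_city(text):
    """Use the obvious regex; fall back to the whole string (hypothetical helper)."""
    match = re.fullmatch(r"(.+),.*=.*", text)
    return match.group(1) if match else text  # no match: return the input unchanged

print(extract_city("Bangalore,India=Karnataka"))  # Bangalore
print(extract_city("Bangalore"))                  # Bangalore
```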
Try this regex. It will grab any group of characters at the start that is followed by a comma, but not the comma itself.
^.*(?=,)
If you are only interested to check that "Bangalore" is contained in the string then you don't need a regexp for this.
Python:
In [1]: s = 'Bangalorejkdjiefjiojhdu'
In [2]: 'Bangalore' in s
Out[2]: True

Regular expression to count number of commas in a string

How can I build a regular expression that will match a string of any length containing any characters but which must contain 21 commas?
/^([^,]*,){21}[^,]*$/
That is:
^ Start of string
( Start of group
[^,]* Any character except comma, zero or more times
, A comma
){21} End and repeat the group 21 times
[^,]* Any character except comma, zero or more times again
$ End of string
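Here is a short sketch of that pattern in Python. The names EXACTLY_21_COMMAS and has_21_commas are invented for the example, and fullmatch stands in for the ^ and $ anchors:

```python
import re

# 21 repetitions of "non-commas then a comma", followed by trailing non-commas
EXACTLY_21_COMMAS = re.compile(r"([^,]*,){21}[^,]*")

def has_21_commas(text):
    """True when text contains exactly 21 commas (hypothetical helper)."""
    return EXACTLY_21_COMMAS.fullmatch(text) is not None

print(has_21_commas("," * 21))  # True
print(has_21_commas("," * 20))  # False
print(has_21_commas("," * 22))  # False
```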
If you're using a regex flavor that supports possessive quantifiers (e.g. Java), you can do:
^(?:[^,]*+,){21}[^,]*+$
A possessive quantifier can perform better than a greedy one, because it never backtracks into what it has already matched.
Explanation:
(?x) # enables comments, so this whole block can be used in a regex.
^ # start of string
(?: # start non-capturing group
[^,]*+ # as many non-commas as possible, but none required
, # a comma
) # end non-capturing group
{21} # 21 of previous entity (i.e. the group)
[^,]*+ # as many non-commas as possible, but none required
$ # end of string
Exactly 21 commas:
^([^,]*,){21}[^,]*$
At least 21 commas:
^([^,]*,){21}.*$
Might be faster and more understandable to iterate through the string, count the number of commas found and then compare it to 21.
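In Python, for instance, counting needs no regex at all. This is just a sketch with a made-up function name, and str.count does the iteration internally:

```python
def has_exactly_21_commas(text):
    """Count the commas directly and compare with 21 (hypothetical helper)."""
    return text.count(",") == 21

print(has_exactly_21_commas("a," * 21))  # True
print(has_exactly_21_commas("a,b,c"))    # False
```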
^(?:[^,]*)(?:,[^,]*){21}$
if exactly 21:
/^[^,]*(,[^,]*){21}$/
if at least 21:
/(,[^,]*){21}/
However, I would suggest not using a regex for such a simple task, because it's slow.
What language? There's probably a simpler method.
For example...
In CFML, you can just see if ListLen(MyString) is 22
In Java, you can compare MyString.split(",", -1).length to 22
etc...
var valid = ((" " + input + " ").split(",").length == 22);
or...
var valid = 21 == (function(input){
  var ret = 0;
  for (var i = 0; i < input.length; i++)
    if (input.charAt(i) == ",")
      ret++;
  return ret;
})(input); // pass input in, otherwise the parameter is undefined
Will perform better than...
var valid = (/^([^,]*,){21}[^,]*$/).test(input);
.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,