Regex remove previous character if followed by a character - regex

I'm working on a regular expression. I have two words, 1234a and 1234. If 'a' is present I want it to return just 123; if 'a' is not present I want it to return 1234.
Since a regex engine scans from left to right, I can't backtrack to remove the 4 when 'a' is present. Is it possible to do this with a regex? Any help or suggestion is appreciated.
UPDATE: Toto's answer works well. As an extension of the problem above: if the word is test1234asample, I need it to return test123 when 'a' is there, and test1234 when it is not. I tried to modify Toto's regex, but it highlights everything.

If your tool supports lookahead, use:
\b([^\Wa]+(?=[^\Wa]a.*)|\w+$)
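A quick way to check the lookahead pattern is to run it against the inputs from the question. This is a minimal sketch in JavaScript; the helper name stripBeforeA is made up for illustration and is not part of the original answer.

```javascript
// The pattern from the answer: take the word characters up to (but not
// including) the character that sits right before an 'a'; if that fails,
// fall back to the word at the end of the input.
const pattern = /\b([^\Wa]+(?=[^\Wa]a.*)|\w+$)/;

// Hypothetical helper: returns the first match, or null when nothing matches.
function stripBeforeA(input) {
  const match = input.match(pattern);
  return match ? match[0] : null;
}

console.log(stripBeforeA('1234a'));           // 123
console.log(stripBeforeA('1234'));            // 1234
console.log(stripBeforeA('test1234asample')); // test123
```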

This approach avoids regex altogether. For a regex solution, refer to Toto's answer.
function getString(str) {
  // If 'a' is present, keep everything before the character that precedes it
  return str.includes('a') ? str.substring(0, str.indexOf('a') - 1) : str;
}
let str1 = '1234a';
let str2 = '1234';
let str3 = '1234a12312';
console.log(getString(str1)); // 123
console.log(getString(str2)); // 1234
console.log(getString(str3)); // 123

One way to do the exact replacement the OP asked for is a simple s/.a// substitution, which replaces a two-character substring ending in 'a' with the empty string. No lookahead or backtracking is required.
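For comparison, the same substitution can be sketched in JavaScript with String.prototype.replace. The helper name dropCharBeforeA is an assumption for illustration only.

```javascript
// Hypothetical helper: delete the first "any character followed by 'a'" pair.
// Without the g flag, only the first occurrence is replaced.
function dropCharBeforeA(input) {
  return input.replace(/.a/, '');
}

console.log(dropCharBeforeA('1234a'));           // 123
console.log(dropCharBeforeA('1234'));            // 1234
console.log(dropCharBeforeA('test1234asample')); // test123sample
```

Note that, unlike the lookahead pattern, this keeps whatever follows the 'a', so it fits the original question but not the extended one in the UPDATE.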

Related

Regex Find English char in text need more than 3

I want to validate that a text contains more than 3 [aA-zZ] characters, which need not be contiguous.
/^(?![_\-\s0-9])(?!.*?[_\-\s]$)(?=.*[aA-zZ]{3,})[_\-\sa-zA-Z0-9]+$/.test("aaa123") // returns true
/^(?![_\-\s0-9])(?!.*?[_\-\s]$)(?=.*[aA-zZ]{3,})[_\-\sa-zA-Z0-9]+$/.test("a1b2c3") // returns false
Can anybody help me?
How about replacing and counting?
var hasFourPlusChars = function(str) {
  // Strip everything that is not an ASCII letter, then count what is left
  return str.replace(/[^a-zA-Z]+/g, '').length > 3;
};
console.log(hasFourPlusChars('testing1234')); // true
console.log(hasFourPlusChars('a1b2c3d4e5')); // true
You need to group .* and [a-zA-Z] in order to allow optional arbitrary characters between English letters:
^(?![_\-\s0-9])(?!.*?[_\-\s]$)(?=(?:.*[a-zA-Z]){3,})[_\-\sa-zA-Z0-9]+$
                                 ^^^          ^
Add the parts marked above: the opening (?: and its closing ).
Demo:
var re = /^(?![_\-\s0-9])(?!.*?[_\-\s]$)(?=(?:.*[a-zA-Z]){3,})[_\-\sa-zA-Z0-9]+$/;
console.log(re.test("aaa123")); // true
console.log(re.test("a1b2c3")); // true
By the way, [aA-zZ] is not a correct range definition: the A-z part spans ASCII codes 65 to 122, which also covers the punctuation characters [ \ ] ^ _ and `. Use [a-zA-Z] instead.
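The difference between the two character classes is easy to see with a few non-letter samples. A small sketch (the variable names loose and strict are illustrative):

```javascript
// [aA-zZ] is parsed as 'a', the range A-z (ASCII 65 to 122), and 'Z'.
const loose = /^[aA-zZ]+$/;
// [a-zA-Z] contains only the 52 ASCII letters.
const strict = /^[a-zA-Z]+$/;

for (const sample of ['abc', '_', '^', '[']) {
  console.log(JSON.stringify(sample), loose.test(sample), strict.test(sample));
}
// "abc" passes both; "_", "^" and "[" pass only the loose class.
```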
Correction of the regex
Your repeated group should include the ".*". I did not check whether the rest of your regex is correct for what you want to achieve, but this correction works for the following strings:
$testStrings = ["aaa123", "a1b2c3", "a1b23d"];
foreach ($testStrings as $s) {
    // The {3,} quantifier applies to the (?:.*[a-zA-Z]) group inside the lookahead
    var_dump(preg_match('/^(?![_\-\s0-9])(?!.*?[_\-\s]$)(?=(?:.*[a-zA-Z]){3,})[_\-\sa-zA-Z0-9]+$/', $s));
}
Other implementations
As the language seems to be JavaScript, here is a more direct implementation of what you want to achieve:
"a24be4Z".match(/[a-zA-Z]/g).length >= 3
We get the list of all matches and check whether there are at least 3. Note that match returns null when there is no match at all, so guard against that for inputs without letters.
That is not the fastest way, as the result array needs to be created.
/(?:.*?[a-zA-Z]){3}/.test("a24be4Z")
is faster. The lazy ".*?" keeps the "test" method from consuming all characters up to the end of the string before trying other combinations.
As expected, the first suggestion (counting the number of matches) is the slowest.
Check https://jsperf.com/check-if-there-are-3-ascii-characters .
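The two JavaScript variants can be wrapped into small functions for side-by-side checking. This is only a sketch; the function names are hypothetical, and the null guard is an addition that the one-liner above does not have.

```javascript
// Variant 1: collect every letter, then count. match() returns null when
// there is no match at all, so fall back to a count of 0.
function hasThreeLettersByCount(str) {
  const letters = str.match(/[a-zA-Z]/g);
  return (letters === null ? 0 : letters.length) >= 3;
}

// Variant 2: stop as soon as a third letter has been found.
function hasThreeLettersByTest(str) {
  return /(?:.*?[a-zA-Z]){3}/.test(str);
}

for (const sample of ['a24be4Z', 'a1b2c3', 'a1b2', '1234']) {
  console.log(sample, hasThreeLettersByCount(sample), hasThreeLettersByTest(sample));
}
```

The two variants agree on single-line input. Because "." does not match line breaks, the second one would need the s flag (or [\s\S] in place of ".") to count letters spread over several lines.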

Regex to upper case not surrounded by single quotes

hello 'this' is my'str'ing
If I have a string like this, I'd like to make everything upper case except the parts surrounded by single quotes.
hello 'this' is my'str'ing => HELLO 'this' IS MY'str'ING
Is there an easy way to achieve this in Node, perhaps using regex?
You can use the following regular expression:
'[^']+'|(\w)
Here is a live example:
var subject = "hello 'this' is my'str'ing";
var regex = /'[^']+'|(\w)/g;
var replaced = subject.replace(regex, function(m, group1) {
  // group1 is only set when the match is a word character outside quotes
  if (!group1) {
    return m;
  }
  else {
    return m.toUpperCase();
  }
});
console.log(replaced); // HELLO 'this' IS MY'str'ING
Credit for this answer goes to zx81. For more information, see zx81's original answer.
Since JavaScript didn't support lookbehinds when this was written (they were added in ES2018), we use \B, which matches at any position that is not a word boundary.
In this case, \B' makes sure that ' isn't to the right of anything in \w ([a-zA-Z0-9_]). Likewise, '\B makes sure that ' isn't to the left of anything in \w.
(?:(.*?)(?=\B'.*?'\B)(?:(\B'.*?'\B))|(.*?)$)
Use a callback function: if the length of capture 1 or capture 3 is > 0, return that match in upper case.
Note: the sample uses \U and \L just to uppercase and lowercase the related matches. Your callback need never affect $2's case, so "Adam" can stay "Adam", etc.
Unrelated, but a note to anyone who might be trying to do this in reverse: it's much easier to match the REVERSE of this:
(\B'.+?'\B)
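One way to put the "reverse" pattern to work is to split on it and uppercase only the pieces between the quoted parts. This is a sketch of that idea, not the answer's own callback; the helper name upperOutsideQuotes is hypothetical.

```javascript
// Hypothetical helper built on the pattern (\B'.+?'\B).
function upperOutsideQuotes(input) {
  // split() with a capturing group keeps the quoted parts in the result:
  // they land at the odd indexes, the text between them at the even ones.
  return input
    .split(/(\B'.+?'\B)/)
    .map((part, i) => (i % 2 === 0 ? part.toUpperCase() : part))
    .join('');
}

console.log(upperOutsideQuotes("it's a 'quoted' word"));
// IT'S A 'quoted' WORD
console.log(upperOutsideQuotes("hello 'this' is my'str'ing"));
// HELLO 'this' IS MY'STR'ING
```

The second call shows the trade-off of the \B check: an apostrophe inside a word (it's) is left alone, but a quote that touches a word character, as in my'str'ing, is not treated as a quoted part either, so this differs from the output requested in the question.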

How to Select a line only text with special character using regex

Input:
1.\frac{[a+b]}{xjch}
2.\frac{pqz}{xjch}
Wanted output is
1.[a+b]/(xjch)
2.(pqz)/(xjch)
My regex is:
\\frac\{(.{2,})\}\{(.{2,})\}
If I apply this regex, the output is:
1.([a+b])/(xjch)
2.(pqz)/(xjch)
But I don't want () around [a+b]. That is, if there is any special character inside the {...}, the round brackets should be omitted; otherwise (without special characters) the round brackets should be added, as in (pqz) and (xjch).
I think I need two regexes, one for case 1 and one for case 2, to get the wanted output.
Could anyone help me?
You can write a single regex that optionally matches the square brackets, and then build the replacement for groups 1 and 2 with a condition:
if (nextchar == "[")
    TypeOfYourInstuction = 1;
else
    TypeOfYourInstuction = 2;
The regex is:
\\frac\{\[?([a-zA-Z1-9\+]{2,})\]?\}\{\[?([a-zA-Z1-9\+]{2,})\]?\}
http://regex101.com/r/dN8sA5/18
But as you mentioned, you can also write two regexes, one for the first type and one for the second:
the first regex: \[[^\]]{2,}\] // Demo = http://regex101.com/r/dN8sA5/20
the second regex: \{[^\[^\}]*\} // Demo = http://regex101.com/r/dN8sA5/19
You have to replace the second type with parentheses.
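The conditional replacement can also be done in one pass with a replace callback. The sketch below is in JavaScript and rests on an assumption not stated in the question: "special character" is taken to mean anything other than an ASCII letter or digit. The helper name fracToSlash is hypothetical, and nested braces are not handled.

```javascript
// Hypothetical helper: rewrite \frac{A}{B} as A/B, adding parentheses
// only around groups that are purely alphanumeric (assumed meaning of
// "no special characters").
function fracToSlash(input) {
  const wrap = (s) => (/^[a-zA-Z0-9]+$/.test(s) ? `(${s})` : s);
  return input.replace(
    /\\frac\{([^{}]+)\}\{([^{}]+)\}/g,
    (whole, numerator, denominator) => `${wrap(numerator)}/${wrap(denominator)}`
  );
}

console.log(fracToSlash('1.\\frac{[a+b]}{xjch}')); // 1.[a+b]/(xjch)
console.log(fracToSlash('2.\\frac{pqz}{xjch}'));   // 2.(pqz)/(xjch)
```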

Groovy regex with hyphen isn't matching

I want to write a regex that will match any time the substring "my-app" is encountered inside any given string.
I have the following Groovy code:
String regex = ".*my-app*"
String str = getStringFromUserInput()
if (str.matches(regex)) {
    println "Match!"
} else {
    println "Doesn't match..."
}
When getStringFromUserInput() returns a string like "blahmy-appfizz", the code above still reports Doesn't match.... So I figured that hyphens must be a special character in regexes and tried changing the regex to:
String regex = ".*my--app*"
But still nothing has changed. Any ideas as to where I'm going wrong?
The hyphen is not a special character here (it only has a special meaning inside a character class).
matches validates the entire input. Try:
String regex = ".*my-app.*"
Note that p* matches zero or more p's, whereas p.* matches a p followed by zero or more characters (other than line breaks).
This assumes getStringFromUserInput() does not leave any line break character in the input. If it does, you'd need to call trim() to get rid of it, since .* does not match line break characters.
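The difference between p* and p.* under whole-input matching can be reproduced outside Groovy. The sketch below uses JavaScript, which has no String.matches; the hypothetical fullMatch helper emulates it by anchoring the pattern with ^ and $.

```javascript
// Hypothetical helper: true only if the pattern matches the entire input,
// which is how matches() behaves in Java and Groovy.
function fullMatch(input, source) {
  return new RegExp('^(?:' + source + ')$').test(input);
}

// "app*" allows extra p's only, so "fizz" is left unmatched.
console.log(fullMatch('blahmy-appfizz', '.*my-app*'));        // false
// "app.*" allows any trailing characters.
console.log(fullMatch('blahmy-appfizz', '.*my-app.*'));       // true
// A trailing line break is not matched by ".".
console.log(fullMatch('blahmy-app\n', '.*my-app.*'));         // false
console.log(fullMatch('blahmy-app\n'.trim(), '.*my-app.*'));  // true
```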
String.contains seems like a simpler solution than a regex, e.g.
String stringFromUser = 'my-app'
assert 'foomy-appfoo'.contains(stringFromUser)
assert !'foo'.contains(stringFromUser)

Regular Expression to match two characters unless they're within two positions of another character

I'm trying to create a regular expression that matches certain characters, unless they appear within two positions of another character.
For example, I would want to match abc or xxabcxx but not tabct or txxabcxt.
Although with something like tabctxxabcxxtabcxt I'd want to match the middle abc and not the other two.
Currently I'm trying this in Java, if that changes anything.
Try this:
String s = "tabctxxabcxxtabcxt";
Pattern p = Pattern.compile("t[^t]*t|(abc)");
Matcher m = p.matcher(s);
while (m.find())
{
    String group1 = m.group(1);
    if (group1 != null)
    {
        System.out.printf("Found '%s' at index %d%n", group1, m.start(1));
    }
}
output:
Found 'abc' at index 7
t[^t]*t consumes anything that's enclosed in ts, so if the (abc) in the second alternative matches, you know it's the one you want.
EDITED! It was way wrong before.
Oooh, this one's tougher than I thought. Awesome. Using fairly standard syntax:
[^t]{2,}abc[^t]{2,}
That will catch xxabcxx but not abc, xabc, abcx, xabcx, xxabc, xxabcx, abcxx, or xabcxx. Maybe the best thing to do would be:
if 'abc' in string:
    if 't' in string:
        return regex match [^t]{2,}abc[^t]{2,}
    else:
        return false
else:
    return false
Is that sufficient for your intention?
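For comparison, a different technique from both answers above is to express the constraint with negative lookarounds. The sketch below is in JavaScript rather than the asker's Java, and it assumes that "within two positions" means the t is either adjacent to abc or separated from it by exactly one character; that reading is inferred from the examples in the question, not stated there. The names farFromT and findAbc are hypothetical.

```javascript
// Assumption: reject "abc" when a 't' is adjacent to it or one character away.
// (?<!t.?) : no 't' directly before, or before with one character in between
// (?!.?t)  : no 't' directly after, or after with one character in between
const farFromT = /(?<!t.?)abc(?!.?t)/g;

// Hypothetical helper: returns the start indexes of all accepted matches.
function findAbc(input) {
  return Array.from(input.matchAll(farFromT), (m) => m.index);
}

console.log(JSON.stringify(findAbc('abc')));                // [0]
console.log(JSON.stringify(findAbc('xxabcxx')));            // [2]
console.log(JSON.stringify(findAbc('tabct')));              // []
console.log(JSON.stringify(findAbc('txxabcxt')));           // []
console.log(JSON.stringify(findAbc('tabctxxabcxxtabcxt'))); // [7]
```

Lookbehind requires an engine that supports it (ES2018 or later in JavaScript).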