(Vim regex) Following by anything except bracket character - regex

Test string:
best.string_a = true;
best.string_b + bad.string_c;
best.string_d ();
best.string_e );
I want to match the identifier that comes after a '.' and is followed by anything except '('. My expression:
\.\@<=[_a-z]\+\(\s*[^(]\)\@=
I want:
string_a
string_b
string_c
string_e
But it doesn't work; the result is:
string_a
string_b
string_c
string_d
string_e
I am new to Vim regex and I don't know why :(

Use this instead: \.\@<=\<[_a-z]\+\>\(\s*(\)\@!
This matches:
\.\@<= Asserts that a dot precedes the match, followed by
\<[_a-z]\+\> A whole word containing only lowercase letters or '_' chars
\(\s*(\)\@! not followed by a '(' (with any amount of whitespace in front of it)

this would work for your needs too:
\.\zs[_a-z]\+\>\ze\s*[^( ]
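To see why the original attempt still returns string_d, the two approaches can be sketched with Python's re module, which spells Vim's lookbehind and lookahead atoms as (?<=...), (?=...) and (?!...). This is only an illustrative translation, not Vim syntax; the names NAIVE, FIXED and after_dot are made up for the example.

```python
import re

LINES = [
    "best.string_a = true;",
    "best.string_b + bad.string_c;",
    "best.string_d ();",
    "best.string_e );",
]

# Positive lookahead, as in the question: \s* may match nothing, and then
# [^(] is satisfied by the space that sits in front of "(".
NAIVE = re.compile(r"(?<=\.)[_a-z]+(?=\s*[^(])")

# Negative lookahead plus word boundaries, as in the first answer: the
# boundaries stop the engine from backtracking to a shorter word.
FIXED = re.compile(r"(?<=\.)\b[_a-z]+\b(?!\s*\()")


def after_dot(pattern, lines):
    """Return every match of pattern across the given lines."""
    return [m.group(0) for line in lines for m in pattern.finditer(line)]


print(after_dot(NAIVE, LINES))
print(after_dot(FIXED, LINES))
```

The positive lookahead only asks "is there some way to find a non-'(' character after optional whitespace", which is almost always true; the negative lookahead asks the intended question directly.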

Related

Character not at beginning of line; not followed or preceded by character

I'm trying to isolate a " character when (simultaneously):
it's not in the beginning of the line
it's not followed by the character ";"
it's not preceded by the character ";"
E.g.:
Line: "Best Before - NO MATCH
Line: Best Before"; - NO MATCH
Line: ;"Best "Before - NO MATCH
Line: Best "Before - MATCH
My best solution is (?<![;])([^^])(")(?![;]) but it's not working correctly.
I also tried (?<![;])(")(?![;]), but it's only partial (missing the "not at the beginning" part)
I don't understand why I'm spelling the "AND not at the beginning" wrong.
Where am I missing it?
If you want to allow partial matches, you can extend the lookbehind with an alternation not asserting the start of the string to the left.
The semi colon [;] does not have to be between square brackets.
(?<!;|^)"(?!;)
If you want to match the " only when there is no occurrence of ;" to its left or right, and an infinite quantifier in a lookbehind assertion is allowed:
(?<!^.*;(?=").*|^)"(?!;|.*;")
In Notepad++ you can use
^.*(?:;"|";).*$(*SKIP)(*F)|(?<!^)"
You can use the fact that not preceded by ; means that it's also not the first character on the line to simplify things
[^;]"(?:[^;]|$)
This gives you
Match a character that's not a ; (so there must be a character and thus the next character can't be the start of the line)
Match a "
Match a character that's not a ; or the end of the line
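As a rough check of the three steps above, here is a small Python sketch. It wraps the quote in a capturing group so that the position of the quote itself can be reported, since the pattern also consumes the neighbouring characters; the helper name quote_positions is hypothetical.

```python
import re

# Same pattern as above, with the quote captured so its own index is available.
PATTERN = re.compile(r'[^;](")(?:[^;]|$)')


def quote_positions(line):
    """Return the indexes of every quote accepted by the pattern."""
    return [m.start(1) for m in PATTERN.finditer(line)]


for line in ['"Best Before', 'Best Before";', ';"Best "Before', 'Best "Before']:
    print(repr(line), quote_positions(line))
```

Note that on the third sample line the second quote is still reported, because this pattern judges each quote on its own (a partial match) rather than rejecting the whole line.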
I know you are asking for a regex solution, but, almost always, strings can also be filtered using string methods in whatever language you are working in.
For the sake of completeness, to show that regex is not your only available tool here, here is a short javascript using the string methods:
myString.charAt()
myString.includes()
Working Example:
const checkLine = (line) => {
  switch (true) {
    // DOUBLE QUOTES AT THE BEGINNING
    case (line.charAt(0) === '"'):
      return console.log(line, '// NO MATCH');
    // DOUBLE QUOTES IMMEDIATELY FOLLOWED BY SEMI-COLON
    case (line.includes('";')):
      return console.log(line, '// NO MATCH');
    // DOUBLE QUOTES IMMEDIATELY PRECEDED BY SEMI-COLON
    case (line.includes(';"')):
      return console.log(line, '// NO MATCH');
    default:
      return console.log(line, '// MATCH');
  }
};
checkLine('"Best Before');
checkLine('Best Before";');
checkLine(';"Best "Before');
checkLine('Best "Before');
Further Reading:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charAt
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/includes

Regex Express Return All Chars before a '/' but if there are 2 '/' Return all before that

I have been trying to write a regex that returns the following results for the following inputs.
XX -> XX
XXX -> XXX
XX/XX -> XX
XX/XX/XX -> XX/XX
XXX/XXX/XX -> XXX/XXX
I tried the following regexes, but they do not work.
^[^/]+ => https://regex101.com/r/xvCbNB/1
=========
([A-Z])\w+ => https://regex101.com/r/xvCbNB/2
They are close but are not there.
Any Help would be appreciated.
You want to get all text from the start till the last occurrence of a specific character or till the end of string if the character is missing.
Use
^(?:.*(?=\/)|.+)
Details
^ - start of string
(?:.*(?=\/)|.+) - a non-capturing group that matches either of the two alternatives, and if the first one matches first the second won't be tried:
.*(?=\/) - any 0+ chars other than line break chars, as many as possible, up to but excluding the last /
| - or
.+ - any 1+ chars other than line break chars, as many as possible.
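The breakdown above can be tried out with a short Python sketch. The function name before_last_slash is made up for this example, and the slash is left unescaped because Python does not use / as a regex delimiter.

```python
import re

# First alternative: everything up to (not including) the last slash.
# Second alternative: the whole string, used only when there is no slash.
PREFIX = re.compile(r"^(?:.*(?=/)|.+)")


def before_last_slash(text):
    """Return the text before the last '/', or the whole text if none."""
    match = PREFIX.match(text)
    return match.group(0) if match else ""


for sample in ["XX", "XXX", "XX/XX", "XX/XX/XX", "XXX/XXX/XX"]:
    print(sample, "->", before_last_slash(sample))
```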
It will be easier to use a replace here to match / followed by non-slash characters before end of line:
Search regex:
/[^/]*$
Replacement String:
""
If you're looking for a regex match then use this regex:
^(.*?)(?:/[^/]*)?$
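Both variants, the replace and the capturing match, can be compared side by side in Python. This is a minimal sketch; the function names strip_last_segment and capture_prefix are hypothetical.

```python
import re


def strip_last_segment(text):
    """Replace approach: delete the last '/' and everything after it."""
    return re.sub(r"/[^/]*$", "", text)


def capture_prefix(text):
    """Match approach: group 1 holds the part before the last '/'."""
    match = re.match(r"^(.*?)(?:/[^/]*)?$", text)
    return match.group(1) if match else text


for sample in ["XX", "XX/XX", "XX/XX/XX"]:
    print(sample, "->", strip_last_segment(sample), capture_prefix(sample))
```

The lazy .*? in the second pattern grows one character at a time until the optional final segment can reach the end of the string, which is why it stops just before the last slash.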
Any special reason it has to be a regular expression? How about just splitting the string at the slashes, remove the last item and rejoin:
function removeItemAfterLastSlash(string) {
  const list = string.split(/\//);
  // No slash at all: return the input unchanged
  if (list.length === 1) {
    return string;
  }
  list.pop();
  return list.join("/");
}
Or look for the last slash and remove everything from it onward:
function removeItemAfterLastSlash(string) {
  const index = string.lastIndexOf("/");
  if (index === -1) {
    return string;
  }
  // Strings have no splice(); slice() returns the part before the slash
  return string.slice(0, index);
}

How to validate a string to have only certain letters by perl and regex

I am looking for a perl regex which will validate a string containing only the letters ACGT. For example "AACGGGTTA" should be valid while "AAYYGGTTA" should be invalid, since the second string has "YY" which is not one of A,C,G,T letters. I have the following code, but it validates both the above strings
if($userinput =~/[A|C|G|T]/i)
{
$validEntry = 1;
print "Valid\n";
}
Thanks
Use a character class, and make sure you check the whole string by using the start of string token, \A, and end of string token, \z.
You should also use * or + to indicate how many characters you want to match -- * means "zero or more" and + means "one or more."
Thus, the regex below is saying "between the start and the end of the (case insensitive) string, there should be one or more of the following characters only: a, c, g, t"
if($userinput =~ /\A[acgt]+\z/i)
{
$validEntry = 1;
print "Valid\n";
}
Using the character-counting tr operator:
if( $userinput !~ tr/ACGT//c )
{
$validEntry = 1;
print "Valid\n";
}
tr/characterset// counts how many characters in the string are in characterset; with the /c flag, it counts how many are not in the characterset. Using !~ instead of =~ negates the result, so it will be true if there are no characters not in characterset or false if there are characters not in characterset.
Your character class [A|C|G|T] contains |. | does not stand for alternation in a character class, it only stands for itself. Therefore, the character class would include the | character, which is not what you want.
Your pattern is not anchored. The pattern /[ACGT]+/ would match any string that contains one or more of any of those characters. Instead, you need to anchor your pattern, so that only strings that contain just those characters from beginning to end are matched.
$ can match before a trailing newline. To avoid that, use \z to anchor at the end. \A anchors at the beginning (although it doesn't make a difference whether you use that or ^ in this case, using \A provides a nice symmetry).
So, your check should be written:
if ($userinput =~ /\A [ACGT]+ \z/ix)
{
$validEntry = 1;
print "Valid\n";
}
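The three points above (the literal | inside the class, the missing anchors, and $ tolerating a trailing newline) behave the same way in Python's re module, so they can be illustrated there. This is a sketch for comparison only, not Perl; the name is_valid_dna is made up, and fullmatch plays the role of \A ... \z.

```python
import re

VALID_DNA = re.compile(r"[ACGT]+", re.IGNORECASE)


def is_valid_dna(sequence):
    """True only if the whole string consists of A, C, G or T."""
    return VALID_DNA.fullmatch(sequence) is not None


# The unanchored class from the question accepts anything containing one
# valid letter, and even a lone "|".
print(bool(re.search(r"[A|C|G|T]", "AAYYGGTTA", re.IGNORECASE)))
print(bool(re.search(r"[A|C|G|T]", "|")))

# "$" still matches before a trailing newline; fullmatch does not.
print(bool(re.match(r"^[ACGT]+$", "ACGT\n")))
print(is_valid_dna("ACGT\n"))
```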

Wrong return from Regex.IsMatch - Regular expression

I want to find a specific string in a text only when it is surrounded by whitespace. For example, I want to receive the value true from:
Regex.IsMatch("I like ZaleK", "zalek",RegexOptions.IgnoreCase)
and value false from:
Regex.IsMatch("I likeZaleK", "zalek",RegexOptions.IgnoreCase)
Here is my code:
Regex.IsMatch(w_all_file, @"\b" + TB_string.Text.Trim() + @"\b", RegexOptions.IgnoreCase);
It does not behave as I expect when the string I am looking for is followed by "-" in w_all_file.
For example, if w_all_file = "I like zalek_", the string "zalek" is not found, but if
w_all_file = "I like zalek-", the string "zalek" is found.
Any ideas why?
Thanks,
Zalek
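The \b anchor in regex doesn't consider an underscore a word boundary, because _ counts as a word character. Inside a character class \b means a backspace, so [\b_] will not work; to also accept an underscore next to the word, use an alternation like this:
Regex.IsMatch(w_all_file, @"(?:\b|_)" + TB_string.Text.Trim() + @"(?:\b|_)", RegexOptions.IgnoreCase);
Is this what you need?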
string input = "type your name";
string pattern = "your";
Regex.IsMatch(input, " " + pattern + " ");
\b matches at a word boundary, which is defined as the position between a character that is included in \w and one that is not. \w is the same as [a-zA-Z0-9_], so it matches underscores.
So basically, \b will match after the "k" in zalek- but not in zalek_.
It sounds like you want the match to also fail on zalek-, which you can do by using lookaround. Just replace the \b at the beginning with (?<![\w-]), and replace the \b at the end with (?![\w-]):
Regex.IsMatch(w_all_file, @"(?<![\w-])" + TB_string.Text.Trim() + @"(?![\w-])", RegexOptions.IgnoreCase);
Note that if you add additional characters to the character class [\w-], you need to make sure that the "-" is the very last character, or that you escape it with a backslash (if you don't it will be interpreted as a range of characters).
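The difference between \b and the lookaround version can be checked with a small Python sketch, since \b, \w and lookarounds work the same way there. The function names are hypothetical, and the call to re.escape is an addition not present in the original code; it only guards against regex metacharacters in the search text.

```python
import re


def found_with_word_boundary(text, word):
    """\\b treats '_' as a word character but '-' as a boundary."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def found_with_lookaround(text, word):
    """Reject the word when a word character or '-' touches either side."""
    pattern = r"(?<![\w-])" + re.escape(word) + r"(?![\w-])"
    return re.search(pattern, text, re.IGNORECASE) is not None


for text in ["I like ZaleK", "I likeZaleK", "I like zalek_", "I like zalek-"]:
    print(repr(text),
          found_with_word_boundary(text, "zalek"),
          found_with_lookaround(text, "zalek"))
```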

Regular expression match decimal with letters

I have the following strings: 3.14, 123.56f, .123e5f, 123D, 1234, 343E12, 32.
What I want to do is match any of the above inputs. So far I have started with the following:
^[0-9]\d*(\.\d+)
I realize I have to escape the . since it is a regex metacharacter.
Thanks.
This should also work, if not already proposed.
try {
    Pattern regex = Pattern.compile("\\.?\\b[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?[fD]?\\b", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        // matched text: regexMatcher.group()
        // match start: regexMatcher.start()
        // match end: regexMatcher.end()
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
Probably
^(\d+(\.\d+)?|\.\d+)([eE]\d+)?[fD]?$
http://regexr.com?2ut9t
^ start of the string
(\d+(\.\d+)?|\.\d+) one or more digits, optionally followed by a . and one or more digits; or a . followed by one or more digits
([eE]\d+)? an optional e or E followed by one or more digits
[fD]? an optional f or D
$ end of the string
As a sidenote, I've made the D compatible with everything but the f.
If you need positive and negative sign, add [+-]? after the ^
This will match all of those:
[0-9.]+(?:[Ee][0-9.]*)?[DdFf]?
Note that within a character class (square brackets), dot . is not a special character and does not need to be escaped.
Maybe this one?
^\d*(?:\.\d+)?(?:[eE]\d+)?(?:[fD])?$
with
^\d* #possibly a digit or sequence of digits at the start
(?:\.\d+)? #possibly followed by a dot and at least one digit
(?:[eE]\d+)? #possibly an 'e' or 'E' followed by at least one digit
(?:[fD])?$ #optionally followed by an 'f' or 'D' at the end
You can use regexpal to test it out, but this seems to work on all of those examples:
^\d*\.?(\d*[eE]?\d*)[fD]?$
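The anchored candidates above differ in what they accept, which is easiest to see by running them over the sample inputs. The Python sketch below does that, reading the final 32. in the question as an input with a trailing dot (an assumption). The labels, the JUNK list and the last pattern ("variant") are made up for illustration and are not from any of the answers.

```python
import re

SAMPLES = ["3.14", "123.56f", ".123e5f", "123D", "1234", "343E12", "32."]
JUNK = ["", ".", "e"]  # inputs that should probably not count as numbers

CANDIDATES = {
    "alternation": r"^(\d+(\.\d+)?|\.\d+)([eE]\d+)?[fD]?$",
    "all optional": r"^\d*(?:\.\d+)?(?:[eE]\d+)?(?:[fD])?$",
    "loose": r"^\d*\.?(\d*[eE]?\d*)[fD]?$",
    # Hypothetical variant: requires at least one digit, allows a trailing dot.
    "variant": r"^(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?[fD]?$",
}


def rejected(pattern, inputs):
    """Return the inputs that the pattern does not match."""
    return [text for text in inputs if re.search(pattern, text) is None]


for name, pattern in CANDIDATES.items():
    print(name, rejected(pattern, SAMPLES), rejected(pattern, JUNK))
```

Stricter patterns miss the trailing-dot input, while the loosest one also accepts an empty string or a lone "e", so the right choice depends on which of those matters more.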