Newbie trying hand at regular expressions - regex

I am new to regular expressions and am trying to build an expression that checks whether the first three letters of a string are uppercase.
I have strings like "ALB.latin" or "CAT.Cyrillic". I just want to check that the three letters before the dot/period are capitals and that the letters after the dot/period are in title case.
I have tried this expression in the FME test filter: ^[A-Z]{3}\.[A-Za-z]$

You need to remove the $ anchor from the pattern, because it requires the end of the string to appear right after the single letter matched by the last [A-Za-z] subpattern.
If you just need to check that the string starts with 3 uppercase ASCII letters, a . and an ASCII letter, use
^[A-Z]{3}\.[A-Za-z]
Or, if you also need to make sure there are 1 or more ASCII letters only at the end, add + between [A-Za-z] and $, to match 1 or more symbols defined in the [a-zA-Z] character class:
^[A-Z]{3}\.[A-Za-z]+$
See the regex demo.
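As a quick check of the three variants, here is a minimal Python sketch. Python's re module is only my choice for illustration; the question is about FME, whose regex engine may differ in details. The names ORIGINAL, PREFIX_ONLY and FULL, and every sample string other than the two from the question, are made up for the demo.

```python
import re

# The asker's pattern and the two suggested variants.
ORIGINAL = re.compile(r"^[A-Z]{3}\.[A-Za-z]$")    # exactly one letter after the dot
PREFIX_ONLY = re.compile(r"^[A-Z]{3}\.[A-Za-z]")  # only checks how the string starts
FULL = re.compile(r"^[A-Z]{3}\.[A-Za-z]+$")       # one or more letters up to the end

samples = ["ALB.latin", "CAT.Cyrillic", "ALB.l", "alb.latin", "ALB.latin2"]

for s in samples:
    # Print which of the three patterns accept each sample.
    print(s,
          bool(ORIGINAL.search(s)),
          bool(PREFIX_ONLY.search(s)),
          bool(FULL.search(s)))
```

The original pattern only accepts strings such as "ALB.l", which is why "ALB.latin" was rejected.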

Hope this will give you the solution.
^[A-Z]{3}\.[A-Z][a-z]*$
With this, the text after the dot must be in title case: one uppercase letter followed by zero or more lowercase letters. At least one uppercase letter must be present after the dot.
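A small Python sketch of the title-case pattern, in case it helps to try it out. The helper name is_valid is hypothetical, and Python is used only for illustration (FME may behave slightly differently). Note that "ALB.latin" from the question is rejected by this pattern because its "l" is lowercase.

```python
import re

# Three capitals, a dot, then one capital followed by zero or more lowercase letters.
TITLE_CASE = re.compile(r"^[A-Z]{3}\.[A-Z][a-z]*$")

def is_valid(code: str) -> bool:
    """Return True if code looks like 'CAT.Cyrillic'."""
    return TITLE_CASE.search(code) is not None

for sample in ["CAT.Cyrillic", "ALB.latin", "ALB.LATIN", "ALB.L"]:
    print(sample, is_valid(sample))
```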

Related

Vs-Code: Replace uppercase letter in words except first one with lowercase ones

Using Vs Code I would like to replace every 2nd uppercase letter in a word to the matching lowercase one only if the first 2 letters of the word are uppercase.
Example: LEtter would become Letter, but iOS, US or DNA would remain unchanged.
I figured using regular expressions and use the replace feature: the appropriate search string to use could be \s[A-Z][A-Z][a-z] (but probably this one doesn't work if I want to use it to replace stuff?). However, I don't know how to replace the 2nd letter then.
I am grateful for suggestions!
You can use the following regular expression Find/Replace with Match Case (the Aa button) enabled:
Find: \b([A-Z])([A-Z])([a-z]+)\b
Replace: $1\l$2$3
The character in front of $2 is a lowercase L (\l), which lowercases the next character of the replacement.
Group each part and convert the second capture group to lower case.
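For anyone who wants the same transformation outside the editor, here is a rough Python sketch. Python's re.sub has no \l modifier, so I swap in a replacement function instead; the name fix_double_caps is made up for the example.

```python
import re

# Same three groups as the Find pattern: first capital, second capital, lowercase rest.
DOUBLE_CAPS = re.compile(r"\b([A-Z])([A-Z])([a-z]+)\b")

def fix_double_caps(text: str) -> str:
    """Lowercase the second letter of words like 'LEtter'."""
    return DOUBLE_CAPS.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        text,
    )

print(fix_double_caps("LEtter about iOS, US and DNA"))
```

Words such as iOS, US and DNA are left alone because they do not consist of two capitals followed by at least one lowercase letter.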

Filter strings using regular expression

I want to use regular expressions to find a number of strings in a text file that meet all of the following requirements.
Are of length 3
Are made of all capital letters
The first character is NOT 'A'
The second character is NOT 'J'
The third character is NOT 'K'
I started with this: /[A-Z]{3}/ but this matches lowercase 3 letter strings as well for some reason.
Is this possible? Any guidance is appreciated.
You need to anchor the regexp so it matches the entire line. Otherwise, it will match a string that's longer than 3, but contains 3 uppercase letters together anywhere in it.
You can use character sets for each character.
/^[B-Z][A-IK-Z][A-JL-Z]$/
^ matches the beginning of the line. [B-Z] matches any uppercase letter that isn't A, [A-IK-Z] matches any uppercase letter except J, and [A-JL-Z] matches any uppercase letter except K. $ matches the end of the line.
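Here is a minimal Python sketch of the character-set approach, filtering a list of candidate lines. The sample lines are invented for the demo.

```python
import re

# One character class per position, each with the forbidden letter cut out of A-Z.
CODE = re.compile(r"^[B-Z][A-IK-Z][A-JL-Z]$")

lines = ["BCD", "ABC", "BJC", "BCK", "bcd", "BCDE", "XYZ"]

# Keep only the lines that satisfy all five requirements.
matches = [s for s in lines if CODE.search(s)]
print(matches)
```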
Another solution using lookahead:
^(?=[A-Z]{3}$)[^A][^J][^K]$
Demo & explanation
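The lookahead variant can be tried the same way. This sketch assumes the strings sit one per line in a block of text, so it adds Python's re.MULTILINE flag to let ^ and $ match at every line; the sample text is invented.

```python
import re

# The lookahead guarantees three capitals filling the line;
# the negated classes then exclude A, J and K by position.
LOOKAHEAD = re.compile(r"^(?=[A-Z]{3}$)[^A][^J][^K]$", re.MULTILINE)

text = "BCD\nABC\nBJC\nBCK\nbcd\nBCDE\nXYZ\n"

found = LOOKAHEAD.findall(text)
print(found)
```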
Try the following to return all the matches: /\b(?=[A-Z])[^A](?=[A-Z])[^J](?=[A-Z])[^K]\b/g
It uses lookaheads, returns only 3-letter matches, and is easy to adapt to excluded letters other than A, J, and K.
Demo: https://regex101.com/r/5s2Gkj/1
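A short Python sketch of the word-boundary approach, for pulling the codes out of running text rather than whole lines. I have written every lookahead in the positive form (?=[A-Z]); the sample sentence is made up.

```python
import re

# Each position: assert an uppercase letter, then consume anything but the banned one.
THREE_CAPS = re.compile(r"\b(?=[A-Z])[^A](?=[A-Z])[^J](?=[A-Z])[^K]\b")

text = "BCD ABC xyz BJC BCK WXYZ QRS."

# findall returns the full matches because the pattern has no capture groups.
codes = THREE_CAPS.findall(text)
print(codes)
```

The \b at each end is what keeps the four-letter WXYZ from contributing a partial match.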

RegEx more than multiple characters before number

I really don't use RegEx that much. You could say I am RegEx n00b. I have been working on this issue for a half a day.
I am trying to write a pattern that looks backward from a number character. For example:
1. bob1 => bob
2. cat3 => cat
3. Mary34 => Mary
So far I have this (?![A-Z][a-z]{1,})([A-Za-z_])
It only matches for individual characters, I want all the characters before the number character. I tried to add the ^ and $ into my pattern and using an online simulator. I am unsure where to put the ^ and $.
NOTE: I am using RegEx for the .NET Framework
You may use a regex like
[\p{L}_]+(?=\d)
or
[\w-[\d]]+(?=\d)
See the regex demo
Pattern details
[\p{L}_]+ - any 1 or more letters (both lower- and uppercase) and/or _
OR
[\w-[\d]]+ - 1 or more word chars except digits (the -[] inside a character class is a character class subtraction construct)
(?=\d) - a positive lookahead that requires a digit to appear immediately to the right of the current location
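For comparison, here is how the same idea might look in Python. Python's built-in re module supports neither \p{L} nor the .NET character class subtraction syntax, so this sketch substitutes [^\W\d], which is my own rough equivalent of [\w-[\d]] (word characters that are not digits). The helper name prefix_before_digit is hypothetical.

```python
import re
from typing import Optional

# Word characters except digits, followed (but not consumed) by a digit.
LETTERS_BEFORE_DIGIT = re.compile(r"[^\W\d]+(?=\d)")

def prefix_before_digit(s: str) -> Optional[str]:
    """Return the run of letters/underscores right before the first such digit."""
    m = LETTERS_BEFORE_DIGIT.search(s)
    return m.group(0) if m else None

for sample in ["bob1", "cat3", "Mary34", "nodigits"]:
    print(sample, "=>", prefix_before_digit(sample))
```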
If we break down your RegEx, we see:
(?![A-Z][a-z]{1,}) which says "assert that what follows is NOT one uppercase letter followed by one or more lowercase letters" and ([A-Za-z_]) which says "match one letter or underscore". Together they match one character at a time: any single letter or underscore that does not begin an uppercase-then-lowercase sequence.
If I understand what you want to achieve, then you want all of the letters before a number. I would write something like that as:
\b([a-zA-Z]+)[0-9]
This will start at a word boundary \b, capture one or more letters in group 1, and require a digit right after them. The digit is part of the overall match, so read the letters from the capture group.
(The syntax I used seems to match this document about .NET RegEx: https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions)
In light of Wiktor Stribizew's comment, here is a pure match RegEx:
\b[a-zA-Z_]+(?=[0-9])
This matches the pattern and then looks ahead for the digit. This is better than my first lookahead attempt. (Thank you Wiktor.)
http://www.rexegg.com/regex-lookarounds.html
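The difference between the two patterns in this answer is easy to see in a small Python sketch (Python instead of .NET purely for illustration; both patterns use syntax the two engines share, as far as I know). The constant names are made up.

```python
import re

WITH_GROUP = re.compile(r"\b([a-zA-Z]+)[0-9]")          # digit is consumed
WITH_LOOKAHEAD = re.compile(r"\b[a-zA-Z_]+(?=[0-9])")   # digit is only checked

m1 = WITH_GROUP.search("Mary34")
m2 = WITH_LOOKAHEAD.search("Mary34")

# With the capture group, the whole match includes the first digit,
# so the letters have to be read from group 1.
print(m1.group(0), m1.group(1))
# With the lookahead, the whole match is already just the letters.
print(m2.group(0))
```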

Regex: Match after first letter

I have a list of words as follows:
cat
concatenate
matter
pattern
hat
rather
fathom
at
saturate
vat
I need a regular expression to match any words which are a single letter followed by the letters 'at'.
I currently have [A-Za-z]at but that includes the 'cat' and 'nat' in 'concatenate' and the 'rat' in 'saturate'.
How can I make it look for exactly one character before, and make sure that there is not more than 1 character before the 'at'. I tried using {1} but that still didn't work. Thanks for your help.
Use word boundary:
\b[A-Za-z]at\b
or, if each string contains just those 3 characters, you can use anchors:
^[A-Za-z]at$
You can use ^[A-Za-z]at$
[A-Za-z] matches a single letter. The at that follows matches those two characters literally.
The ^ and $ anchors force the match to span the whole string, from start to end.
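Both suggestions can be checked against the word list from the question with a short Python sketch. The variable names are my own.

```python
import re

words = ["cat", "concatenate", "matter", "pattern", "hat",
         "rather", "fathom", "at", "saturate", "vat"]

# Anchored version: each list entry is tested as a whole string.
WHOLE_STRING = re.compile(r"^[A-Za-z]at$")
anchored = [w for w in words if WHOLE_STRING.search(w)]

# Word-boundary version: the words are searched inside one longer text.
IN_TEXT = re.compile(r"\b[A-Za-z]at\b")
bounded = IN_TEXT.findall(" ".join(words))

print(anchored)
print(bounded)
```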

Regex help with matching

Hello, I need help coming up with a valid regular expression. It should match any identifier name that starts with a letter or underscore and may contain any number of letters, underscores, and/or digits (all letters may be upper or lower case).
For example, your regular expression should match the following text strings: “_”, “x2”, and “This_is_valid”. It should not match these text strings: “2days” or “invalid_variable%”.
So far this is what I have come up with, but I don't think it is right:
/^[_\w][^\W]+/
The following will work:
/^[_a-zA-Z]\w*$/
Starts with (^) a letter (upper or lowercase) or underscore ([_a-zA-Z]), followed by any number of letters, digits, or underscores (\w*) to the end ($)
Read more about Regular Expressions in Perl
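Here is a minimal sketch that runs the pattern against the examples from the question. It is written in Python rather than Perl purely for illustration, and the helper name is_identifier is hypothetical. One assumption to be aware of: in Python 3, \w also matches non-ASCII letters and digits unless the re.ASCII flag is passed.

```python
import re

# A letter or underscore first, then any number of word characters up to the end.
IDENTIFIER = re.compile(r"^[_a-zA-Z]\w*$")

def is_identifier(s: str) -> bool:
    return IDENTIFIER.search(s) is not None

for sample in ["_", "x2", "This_is_valid", "2days", "invalid_variable%"]:
    print(repr(sample), is_identifier(sample))
```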
Maybe the below regex:
^[a-zA-Z_]\w*$
If the identifier is at the start of a string, then it's easy
/^(_|[a-zA-Z]).*/
If it's embedded in a longer string, I guess it's not much worse, assuming it's the start of a word...
/\s(_|[a-zA-Z]).*/
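A small Python sketch of what these two patterns accept, in case it is useful. As far as I can tell, .* matches any characters up to the end of the line, so both patterns only test the first character of the identifier and do not reject a later "%". The names START, EMBEDDED and starts_like_identifier are made up.

```python
import re

START = re.compile(r"^(_|[a-zA-Z]).*")      # identifier at the start of the string
EMBEDDED = re.compile(r"\s(_|[a-zA-Z]).*")  # identifier after whitespace

def starts_like_identifier(s: str) -> bool:
    """True if the first character is a letter or underscore."""
    return START.search(s) is not None

for sample in ["This_is_valid", "2days", "invalid_variable%"]:
    print(repr(sample), starts_like_identifier(sample))

# The embedded form matches from the whitespace to the end of the line.
m = EMBEDDED.search("12 foo%")
print(repr(m.group(0)))
```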