Regular expression - finding specific string with at least one capital letter - regex

I am looking for a regular expression that matches a specific string which:
1. always starts with "fu:
2. always ends with "
3. contains at least one capital letter between those start and end points
Point 3 is the part I really can't solve.
The regex "fu:(.*)?" satisfies points 1 and 2 but does not enforce point 3.
[edit]
It's pretty close now; the only problem is that the match doesn't stop at the second ".
Basically this string:
"fu:no capital letter:,some other random text WITH CAPITAL LETTERS"
is a match but shouldn't be.

The regex that will work for you is this:
/^"fu:.*?[A-Z].*?"$/

^"fu:.*[A-Z].*"$
Don't forget about multiline mode if you wish to search in several lines of text.
^"fu: - starts with "fu:
.* - any other characters
[A-Z] - capital letter
.* - other characters
"$ - " at the end
Good tool to test it: http://www.regexplanet.com/advanced/java/index.html

Something like
^"fu:([^"]*?[A-Z][^"]*?)"$
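The difference between the .* patterns and the [^"]* pattern can be checked with a short Python sketch. The sample strings are made up for illustration, and Python's re module only stands in for whatever engine is actually in use.

```python
import re

# Patterns taken from the answers above.
dot_star = re.compile(r'^"fu:.*[A-Z].*"$')
no_quotes = re.compile(r'^"fu:[^"]*[A-Z][^"]*"$')

# Hypothetical sample inputs (not from the original question).
samples = [
    '"fu:Has Capital"',
    '"fu:no capital"',
    '"fu:no capital", "OTHER TEXT"',
]

for s in samples:
    # Print the input and whether each pattern matches it.
    print(s, bool(dot_star.search(s)), bool(no_quotes.search(s)))
```

Only the [^"]* variant rejects the third sample, because .* is free to run past the closing quote of the first quoted string.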

I commented on a problem with anubhava's solution (that it only matches upper case letters in the range A through Z), but then found the solution myself. Note that this requires a regular expression engine that supports POSIX bracket expressions and is Unicode-aware.
My solution is
/^"fu:.*[[:upper:]].*"$/
It solves the problem of finding upper case letters in languages other than English (with partially or completely different alphabets).
An example in Ruby:
rx = /^"fu:.*[[:upper:]].*"$/
arr = ['"fu:Berlin"', '"fu:İstanbul"', '"fu:Washington"', '"fu:Örebro"', '"fu:Москва"']
arr.map {|s| s.scan rx}
In this case, all of the strings are matched.
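For comparison, Python's built-in re module has no POSIX bracket expressions, so [[:upper:]] cannot be used there directly. A rough equivalent, sketched below, extracts the quoted body with a regex and then checks it with str.isupper(), which is Unicode-aware. This is a swapped-in technique rather than the Ruby approach above, and the helper name has_upper is made up for this example.

```python
import re

# Capture everything between "fu: and the closing quote.
BODY = re.compile(r'^"fu:([^"]*)"$')

def has_upper(s):
    # True if s has the "fu:..." shape and its body contains an uppercase letter.
    m = BODY.match(s)
    return m is not None and any(ch.isupper() for ch in m.group(1))

arr = ['"fu:Berlin"', '"fu:İstanbul"', '"fu:Washington"', '"fu:Örebro"', '"fu:Москва"']
print([has_upper(s) for s in arr])
```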

Related

RegEx more than multiple characters before number

I really don't use RegEx that much. You could say I am a RegEx n00b. I have been working on this issue for half a day.
I am trying to write a pattern that looks backward from a number character. For example:
1. bob1 => bob
2. cat3 => cat
3. Mary34 => Mary
So far I have this: (?![A-Z][a-z]{1,})([A-Za-z_])
It only matches individual characters, but I want all the characters before the number character. I tried adding ^ and $ to my pattern in an online simulator, but I am unsure where to put them.
NOTE: I am using RegEx for the .NET Framework
You may use a regex like
[\p{L}_]+(?=\d)
or
[\w-[\d]]+(?=\d)
Pattern details
[\p{L}_]+ - any 1 or more letters (both lower- and uppercase) and/or _
OR
[\w-[\d]]+ - 1 or more word chars except digits (the -[] inside a character class is a character class subtraction construct)
(?=\d) - a positive lookahead that requires a digit to appear immediately to the right of the current location
If we break down your RegEx, we see:
(?![A-Z][a-z]{1,}) which says "look ahead and make sure the text here is NOT one uppercase letter followed by one or more lowercase letters" and ([A-Za-z_]) which says "match one letter or underscore". This ends up matching single characters: any letter or underscore that does not begin such an uppercase-then-lowercase sequence.
If I understand what you want to achieve, then you want all of the letters before a number. I would write something like that as:
\b([a-zA-Z]+)[0-9]
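This will start at a word boundary \b, match one or more letters (captured in group 1), and require a digit right after them. Note that the digit is part of the overall match; the letters alone are in the capture group.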
(The syntax I used seems to match this document about .NET RegEx: https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions)
In light of Wiktor Stribizew's comment, here is a pure match RegEx:
\b[a-zA-Z_]+(?=[0-9])
This matches the pattern and then looks ahead for the digit. This is better than my first lookahead attempt. (Thank you Wiktor.)
http://www.rexegg.com/regex-lookarounds.html
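As a quick check of the difference between the capturing version and the lookahead version, here is a small Python sketch. Python's re module is used only for illustration (the question targets .NET); both patterns are the ones from the answer above.

```python
import re

capturing = re.compile(r'\b([a-zA-Z]+)[0-9]')
lookahead = re.compile(r'\b[a-zA-Z_]+(?=[0-9])')

for s in ['bob1', 'cat3', 'Mary34']:
    m1 = capturing.search(s)
    m2 = lookahead.search(s)
    # group(0) is the whole match; group(1) holds only the captured letters.
    print(s, m1.group(0), m1.group(1), m2.group(0))
```

With the capturing pattern the first digit is included in the whole match, so the letters have to be read from group 1; with the lookahead pattern the whole match is already just the letters.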

Regular Expression for matching a single digit followed by a word exactly in Notepad++

:Statement
Say we have the following three records and want to match only the first one, i.e. exactly one digit followed by a specific word. What regular expression can be used to do this (in Notepad++)?
2Cups
11Cups
222Cups
The expressions I tried and their problems are:
Proposal 1: \d{1}Cups
It finds the "1Cups" and "2Cups" substrings in the second and third records respectively, which is not what we want here.
Proposal 2: [^0-9]+[0-9]Cups
Same as the above.
(PS: the records can be "XX 2Cups", "YY22Cups" and "XYZ 333Cups", i.e. no assumption can be made about the position of the matchable parts.)
Any suggestions?
:Reference
[1] The reg definition in NotePad++ (Same as SciTe)
As mentioned in Searching for a complex Regular Expression to use with Notepad++, it is: http://www.scintilla.org/SciTERegEx.html
[2] Matching exact number of digits
Here is an example: regular expression to match exactly 5 digits.
However, we do not want to find the match-able substring in longer records here.
If the string actually has the numbered sequence (1. 2Cups 2. 11Cups), you can use the white space that follows it:
\s\d{1}Cups
If there isn't the numbered list before, but the string will be at the beginning of the line, you can anchor it there:
^\d{1}Cups
Tested in Notepad++ v6.5.1 (Unicode).
It sounds like you want to match the digit only at the start of the string or if it has a space before it, so this would work:
(^|\b)\dCups
Explanation:
(^|\b) Match the start of the string or beginning of a word (technically, word break)
\d Match a digit ({1} is redundant)
Cups Match Cups
This will work:
\b\dCups
If "Cups" must be a whole word (i.e. not matching 2Cupsizes):
\b\dCups\b
Note that \b matches even if at start or end of input.
I found one possible solution:
Using ^\d{1}Cups to match "Starting with one digit + Cups" cases, as suggested by Ken, Cottrell and Bohemian.
Using [^\d]\dCups to match other cases.
However, I haven't found a solution that uses just one regex to solve the problem yet.
Have a try with:
(?:^|\D)\dCups
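This matches a single digit followed by Cups only when no digit comes before it. Note that the match includes the preceding non-digit character, if there is one.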
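Because (?:^|\D) consumes the character in front of the digit, one possible alternative is a negative lookbehind, (?<!\d). That pattern is not from the answers above and assumes the engine in use supports lookbehind. The Python sketch below compares the two on the sample records from the question; the helper first_match is made up for this example.

```python
import re

consuming = re.compile(r'(?:^|\D)\dCups')
lookbehind = re.compile(r'(?<!\d)\dCups')  # assumption: engine supports lookbehind

records = ['2Cups', '11Cups', '222Cups', 'XX 2Cups', 'YY22Cups', 'XYZ 333Cups']

def first_match(pattern, text):
    # Return the matched text, or None when the pattern is not found.
    m = pattern.search(text)
    return m.group(0) if m else None

for r in records:
    print(repr(r), repr(first_match(consuming, r)), repr(first_match(lookbehind, r)))
```

Both patterns accept and reject the same records, but for "XX 2Cups" the first one returns the match with its leading space while the lookbehind version returns only 2Cups.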

Match a specific string with several constants using Regex

There are now different requirements for the regex I am looking for, and it is too complex for me to solve on my own.
I need to search for a specific string with the following requirements:
1. String starts with "fu: and ends with "
2. In between those start and end markers there can be any other string that meets the following requirements:
2.1. Less than 50 characters
2.2. Only lower case
2.3. No trailing spaces
2.4. No space between "fu: and the other string.
The regex should match the cases where requirement 1 holds but at least one of requirements 2.1/2.2/2.3/2.4 does not.
At the moment I have the following regex: "fu:([^"]*?[A-Z][^"]*?)",
which finds strings that start with "fu: and end with " with any upper case letter in between, like this one:
"fu:this String is wrong cause the s from string is upper case"
I hope it all makes sense. I tried to get into regex, but this problem seems too complex for someone who does not work with regex every day.
[Edit]
Apparently I was not clear enough. I want to have matches which are "wrong".
I am looking for the complement of this regex: "fu:(?:[a-z][a-z ]{0,47}[a-z]|[a-z]{0,2})"
some examples:
Match: "fu: this is a match"
Match: "fu:This is a match"
Match: "fu:this is a match "
NO Match: "fu:this is no match"
Sorry, it's not easy to explain :)
Try the following:
"fu:([a-z](?:[a-z ]{0,48}[a-z])?)"
This will match any string that begins with "fu: and ends with a ", where the string between them contains 1-50 characters, all lower-case letters or spaces, and neither begins nor ends with a space.
"fu:                # begins with "fu:
(                   # group to match
  [a-z]             # starts with one lower-case letter
  (?:               # non-capturing sub-group
    [a-z ]{0,48}    # matches 0-48 a-z or space characters
    [a-z]           # sub-group must end with a lower-case letter
  )?                # sub-group is optional
)
"                   # ends with "
EDIT: In the event that you need an empty-string to match too, i.e. the full string is "fu:", you can add another ? to the end of the matching-group in the regex:
"fu:([a-z](?:[a-z ]{0,48}[a-z])?)?"
I've kept the two regexes separated (one that allows 1-50 characters in the string and one that allows 0-50) to show the minor difference.
EDIT #2: To match the inverse of the above, i.e. - to find all strings that do not match the required format, you can use:
^((?!"fu:([a-z](?:[a-z ]{0,48}[a-z])?)?").)*$
This will explicitly match any line that does not match that pattern. This will consequently also match lines that do not contain "fu: - if that matters.
The only way I can figure out to truly match the opposite of the above and still include the anchors of "fu: and " is to explicitly attempt to match the rules that fail:
"fu:([^a-z].*|[^"]{51,}|[a-z]([^"]*?[A-Z][^"]*?)+|[a-z ]{0,49}[ ])"
This regex will match anything that does not start with a lowercase a-z character, any string that's longer than 50 characters, any string that contains an uppercase letter, or any string that has trailing whitespace. For each additional rule, you'll need to update the regex to match the opposite of what's needed.
My recommendation is, in whatever language you're using, to match all input strings that actually follow your requirements - and if there are no matches then that string must violate your rules.
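That recommendation can be sketched in Python as follows: first find every "fu:..." candidate, then keep only the candidates that fail the positive pattern. The names CANDIDATE, VALID and violations are made up for this example, and the positive pattern is the 1-50 character version from above without its capture group.

```python
import re

# Anything that looks like "fu:..." up to the next double quote.
CANDIDATE = re.compile(r'"fu:[^"]*"')
# The positive rule: lower-case letters and inner spaces only, 1-50 characters.
VALID = re.compile(r'"fu:[a-z](?:[a-z ]{0,48}[a-z])?"')

def violations(text):
    # Candidates that do not satisfy the positive rule are the "wrong" strings.
    return [c for c in CANDIDATE.findall(text) if not VALID.fullmatch(c)]

# The four examples from the question, one per line.
text = '\n'.join([
    '"fu: this is a match"',
    '"fu:This is a match"',
    '"fu:this is a match "',
    '"fu:this is no match"',
])
print(violations(text))
```

Adding a new rule then only means tightening VALID, instead of adding another alternative to a hand-written inverse regex.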
"fu:([^A-Z" ](?:[^A-Z"]{0,48}[^A-Z" ])?)"
The above regex should match the specified requirements.
That's probably what you need
"fu:([a-z](?:[a-z ]{,48}[a-z])?)"
Try this:
"fu:(?:[a-z][a-z ]{0,47}[a-z]|[a-z]?)"

Regex matching beginning AND end strings

This seems like it should be trivial, but I'm not so good with regular expressions, and this doesn't seem to be easy to Google.
I need a regex that starts with the string 'dbo.' and ends with the string '_fn'
So far as I am concerned, I don't care what characters are in between these two strings, so long as the beginning and end are correct.
This is to match functions in a SQL server database.
For example:
dbo.functionName_fn - Match
dbo._fn_functionName - No Match
dbo.functionName_fn_blah - No Match
If you're searching for hits within a larger text, you don't want to use ^ and $ as some other responders have said; those match the beginning and end of the text. Try this instead:
\bdbo\.\w+_fn\b
\b is a word boundary: it matches a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. This regex will find what you're looking for in any of these strings:
dbo.functionName_fn
foo dbo.functionName_fn bar
(dbo.functionName_fn)
...but not in this one:
foodbo.functionName_fnbar
\w+ matches one or more "word characters" (letters, digits, or _). If you need something more inclusive, you can try \S+ (one or more non-whitespace characters) or .+? (one or more of any characters except linefeeds, non-greedily). The non-greedy +? prevents it from accidentally matching something like dbo.func1_fn dbo.func2_fn as if it were just one hit.
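The behaviour described above can be checked with a short Python sketch. The sample strings come from this answer and from the question; Python's re module is only one possible engine, chosen here for illustration.

```python
import re

bounded = re.compile(r'\bdbo\.\w+_fn\b')

hits = ['dbo.functionName_fn', 'foo dbo.functionName_fn bar', '(dbo.functionName_fn)']
misses = ['foodbo.functionName_fnbar', 'dbo._fn_functionName', 'dbo.functionName_fn_blah']

for s in hits + misses:
    print(s, bool(bounded.search(s)))

# Greedy versus non-greedy when two names share one line.
two = 'dbo.func1_fn dbo.func2_fn'
greedy_hits = re.findall(r'\bdbo\..+_fn\b', two)
lazy_hits = re.findall(r'\bdbo\..+?_fn\b', two)
print(greedy_hits)
print(lazy_hits)
```

The greedy .+ returns the whole line as a single hit, while the non-greedy .+? returns the two names separately.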
^dbo\..*_fn$
This should work for you.
Well, the simple regex is this:
/^dbo\..*_fn$/
It would be better, however, to use the string manipulation functionality of whatever programming language you're using to slice off the first four and the last three characters of the string and check whether they're what you want.
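In Python, for example, that string-based check could look like the sketch below. The function name is_dbo_function is made up for this example.

```python
def is_dbo_function(name):
    # Plain prefix and suffix checks, no regex involved.
    return name.startswith('dbo.') and name.endswith('_fn')

for name in ['dbo.functionName_fn', 'dbo._fn_functionName', 'dbo.functionName_fn_blah']:
    print(name, is_dbo_function(name))
```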
\bdbo\..*fn
I was looking through a ton of java code for a specific library: car.csclh.server.isr.businesslogic.TypePlatform (although I only knew car and Platform at the time). Unfortunately, none of the other suggestions here worked for me, so I figured I'd post this.
Here's the regex I used to find it:
\bcar\..*Platform
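import java.util.Scanner;
import java.util.regex.Pattern;
public class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        // Quote the input so regex metacharacters in it are matched literally.
        String part = Pattern.quote(scanner.nextLine().toLowerCase());
        String line = scanner.nextLine().toLowerCase();
        // The sequence must either start a word or end a word.
        Pattern pattern = Pattern.compile("\\b" + part + "|" + part + "\\b");
        System.out.println(pattern.matcher(line).find() ? "YES" : "NO");
    }
}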
If you need to determine whether any word in the text starts or ends with the sequence, you can use the regex \bsubstring|substring\b:
anythingsubstring (match: a word ends with the sequence)
substringanything (match: a word starts with the sequence)
anythingsubstringanything (no match)
The simplest thing that you can do is:
dbo.*_fn$
It searches for dbo, followed by any characters, with _fn at the end of the string.
If you know which character follows the final n, for example a space, you can replace $ with that character.

Regex help with matching

Hello, I need help coming up with a valid regular expression. It should match any identifier name that starts with a letter or underscore and may contain any number of letters, underscores, and/or digits (letters may be upper or lower case).
For example, your regular expression should match the following text strings: “_”, “x2”, and “This_is_valid”. It should not match these text strings: “2days” or “invalid_variable%”.
So far this is what I have come up with, but I don't think it is right:
/^[_\w][^\W]+/
The following will work:
/^[_a-zA-Z]\w*$/
Starts with (^) a letter (upper or lowercase) or underscore ([_a-zA-Z]), followed by any number of letters, digits, or underscores (\w*) to the end ($).
Read more about Regular Expressions in Perl
Maybe the below regex:
^[a-zA-Z_]\w*$
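A quick way to try this pattern against the examples from the question is the Python sketch below. The names IDENTIFIER and is_identifier are made up for this example, and the re.ASCII flag is an assumption added so that \w means only [a-zA-Z0-9_], as in the description above.

```python
import re

# re.ASCII keeps \w limited to [a-zA-Z0-9_].
IDENTIFIER = re.compile(r'[_a-zA-Z]\w*', re.ASCII)

def is_identifier(s):
    # fullmatch anchors at both ends, so ^ and $ are not needed in the pattern.
    return IDENTIFIER.fullmatch(s) is not None

for s in ['_', 'x2', 'This_is_valid', '2days', 'invalid_variable%']:
    print(repr(s), is_identifier(s))
```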
If the identifier is at the start of a string, then it's easy
/^(_|[a-zA-Z]).*/
If it's embedded in a longer string, I guess it's not much worse, assuming it's the start of a word...
/\s(_|[a-zA-Z]).*/