I need to match and replace all UPPERCASE words in a Postgres string field, like
'GARLASCO Cavour/Oriani'
'SANNAZZARO DE' BURGONDI Italia, 46 (Direzione Sud)'
'S.MARGHERITA STAFFORA Vallechiara (Bivio Montemartino)'
'GAMBOLO' Umberto I, 312'
I tried with
[A-Z\''.]{2,}
SELECT REGEXP_REPLACE('SANNAZZARO DE'' BURGONDI Italia, 46 (Direzione Sud)',' \b[A-Z]{2,}\b','','g')
but it works only for strings with one uppercase word, like 'GARLASCO Cavour/Oriani'
You may use
REGEXP_REPLACE(your_col_here, '^[A-Z[:space:].'']+\y', '')
This will replace the following matches:
^ - start of string
[A-Z[:space:].']+ - 1+ uppercase letters (you may also replace A-Z with [:upper:]), whitespaces, dots or apostrophes at...
\y - a word boundary.
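To try the same idea outside Postgres, here is a rough Python sketch. It assumes Python's `re` module, where the word boundary is written `\b` instead of Postgres's `\y`, and it swaps `[:space:]` for `\s`. The helper name `strip_upper_prefix` is made up for illustration.

```python
import re

# Python counterpart of the Postgres pattern '^[A-Z[:space:].'']+\y'.
# Assumptions: \s stands in for [:space:], and \b for Postgres's \y.
UPPER_PREFIX = re.compile(r"^[A-Z\s.']+\b")

def strip_upper_prefix(value: str) -> str:
    """Remove the leading run of uppercase words (hypothetical helper)."""
    return UPPER_PREFIX.sub("", value)

samples = [
    "GARLASCO Cavour/Oriani",
    "SANNAZZARO DE' BURGONDI Italia, 46 (Direzione Sud)",
    "S.MARGHERITA STAFFORA Vallechiara (Bivio Montemartino)",
    "GAMBOLO' Umberto I, 312",
]

for s in samples:
    print(strip_upper_prefix(s))
```

The character class overshoots into the first letter of the mixed-case word (for example the "C" of "Cavour"), and the trailing word boundary forces the engine to backtrack to just before that letter.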
What I am trying to match looks like this:
char-char-int-int-int
char-char-char-int-int-int
char-char-int-int-int-optionalValue (optionalValue being a "-" plus letters after it)
My current regexp looks like this:
([A-Za-z]{1,2})([1-9]{3})("-"[\w])
In the end, the regexp should match any of these:
AB001
aB999
Hm000
en789
rv005-ab
These should be invalid:
ab (because only letters)
abcfr (because too many letters)
158 (because only numbers)
78532 (because too many numbers)
123ab (because all letters should come before numbers, optionalValue excepted)
a1b23 (because letters and numbers are mixed)
What am I doing wrong? (Please be gentle, this is my first post ever on Stack Overflow.)
If you use [A-Za-z]{1,2}, then the second example would not match, as there are 3 letters (char-char-char).
Using \w would also match numbers and an underscore. If you mean letters like a-zA-Z you can use that in an optional group preceded by a hyphen (?:-[a-zA-Z]+)?
You could use
^[a-zA-Z]{2,3}[0-9]{3}(?:-[a-zA-Z]+)?$
^ Start of string
[a-zA-Z]{2,3} Match 2 or 3 times a char A-Za-z
[0-9]{3} Match 3 digits
(?:-[a-zA-Z]+)? Optionally match a - and 1 or more chars A-Za-z
$ End of string
Or using word boundaries \b instead of anchors
\b[a-zA-Z]{2,3}[0-9]{3}(?:-[a-zA-Z]+)?\b
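As a quick check of the anchored pattern against the examples from the question, here is a minimal Python sketch. The function name `is_valid_code` is hypothetical; the pattern itself is the one given above.

```python
import re

# Anchored pattern from the answer. re.fullmatch already requires the whole
# string to match, so ^ and $ are kept only to mirror the original pattern.
CODE_RE = re.compile(r"^[a-zA-Z]{2,3}[0-9]{3}(?:-[a-zA-Z]+)?$")

def is_valid_code(value: str) -> bool:
    """Return True if value is 2-3 letters, 3 digits, optional -letters."""
    return CODE_RE.fullmatch(value) is not None

valid = ["AB001", "aB999", "Hm000", "en789", "rv005-ab"]
invalid = ["ab", "abcfr", "158", "78532", "123ab", "a1b23"]

for code in valid + invalid:
    print(f"{code!r}: {is_valid_code(code)}")
```

Use the anchored form when validating a whole value, and the `\b` form when searching for codes inside longer text.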
I have corrected your regex below. Please give it a try.
([A-Za-z]{1,2})([0-9]{3})(-\w*)?
I need a regex which is matched when the string doesn't have both lowercase and uppercase letters.
If the string has only lowercase letters -> should be matched
If the string has only uppercase letters -> should be matched
If the string has only digits or special characters -> should be matched
For example
abc, ABC, 123, abc123, ABC123&^ - should match
AbC, A12b, AB^%12c - should not match
Basically I need an inverse/negation of the following regex:
^(?=.*[a-z])(?=.*[A-Z]).+$
Does not sound like any lookarounds would be needed.
Either match only characters that are not a-z, or only characters that are not A-Z.
^(?:[^a-z]+|[^A-Z]+)$
See this demo at regex101 (used + for one or more)
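A small Python sketch of this lookaround-free pattern, run against the examples from the question. The helper name `has_single_case` is an assumption made for illustration.

```python
import re

# Either the whole string has no lowercase letters,
# or the whole string has no uppercase letters.
SINGLE_CASE_RE = re.compile(r"^(?:[^a-z]+|[^A-Z]+)$")

def has_single_case(value: str) -> bool:
    """True when value does not mix lowercase and uppercase ASCII letters."""
    return SINGLE_CASE_RE.match(value) is not None

should_match = ["abc", "ABC", "123", "abc123", "ABC123&^"]
should_not_match = ["AbC", "A12b", "AB^%12c"]

for s in should_match + should_not_match:
    print(f"{s!r}: {has_single_case(s)}")
```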
You may use
^(?!.*[A-Z].*[a-z])(?!.*[a-z].*[A-Z])\S+$
Or
^(?=(?:[^a-z]+|[^A-Z]+)$).*$
A lookaround solution like this can be used in more complex scenarios, when you need to apply more restrictions on the pattern. Else, consider a non-lookaround solution.
Details
^ - start of string
(?!.*[A-Z].*[a-z]) - no uppercase followed with a lowercase letter
(?!.*[a-z].*[A-Z]) - no lowercase letter followed with an uppercase one
(?=(?:[^a-z]+|[^A-Z]+)$) - a positive lookahead that requires 1 or more characters other than lowercase ASCII letters ([^a-z]+) to the end of the string, or 1 or more characters other than uppercase ASCII letters ([^A-Z]+) to the end of the string
\S+ - 1+ non-whitespace chars (first pattern); the second pattern uses .* instead, i.e. 0+ chars other than line break chars
$ - end of string.
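The two lookahead patterns can be compared side by side with a short Python sketch. The variable names are hypothetical. The two patterns agree on the examples from the question and differ only in that the first one, ending in `\S+`, also rejects any string containing whitespace.

```python
import re

# First pattern: two negative lookaheads, then non-whitespace chars only.
NEG_LOOKAHEADS = re.compile(r"^(?!.*[A-Z].*[a-z])(?!.*[a-z].*[A-Z])\S+$")
# Second pattern: one positive lookahead wrapping the alternation.
POS_LOOKAHEAD = re.compile(r"^(?=(?:[^a-z]+|[^A-Z]+)$).*$")

should_match = ["abc", "ABC", "123", "abc123", "ABC123&^"]
should_not_match = ["AbC", "A12b", "AB^%12c"]

for s in should_match + should_not_match + ["abc def"]:
    neg = NEG_LOOKAHEADS.match(s) is not None
    pos = POS_LOOKAHEAD.match(s) is not None
    print(f"{s!r}: negative lookaheads={neg}, positive lookahead={pos}")
```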
You can use this regex
^(([A-Z0-9?&%^](?![a-z]))+|([a-z0-9?&%^](?![A-Z]))+)$
I've only added the characters ?&%^ as allowed special characters, but you could add whichever you like.
I would go with:
^(?:[^a-z]+?|[^A-Z]+?)$
It translates to "If the entire string is composed of non-lowercase letters or non-uppercase letters then match the string."
Lazy quantifiers +? are used so that the end-of-string $ anchor is obeyed when the multiline flag is enabled. If you're only validating a single-line string, then you can simply use + without the question mark.
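The effect of the lazy quantifiers under the multiline flag can be seen in a minimal Python sketch. The two-line input is an illustrative assumption. A negated class such as `[^A-Z]` also matches a newline, so the greedy version can run across line ends.

```python
import re

# Two lines: the first has only lowercase letters, the second only digits.
text = "abc\n123"

greedy = re.compile(r"^(?:[^a-z]+|[^A-Z]+)$", re.MULTILINE)
lazy = re.compile(r"^(?:[^a-z]+?|[^A-Z]+?)$", re.MULTILINE)

# Greedy: [^A-Z]+ swallows the newline and spans both lines in one match.
greedy_matches = greedy.findall(text)
# Lazy: each branch stops at the first line end, giving one match per line.
lazy_matches = lazy.findall(text)

print(greedy_matches)  # ['abc\n123']
print(lazy_matches)    # ['abc', '123']
```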
If you have a whitelist of specific allowed special chars, then change [^a-z] into [A-Z0-9()_+=-] and [^A-Z] into [a-z0-9()_+=-], listing the allowed special chars in each class.
https://regex101.com/r/Wg6tLn/1
I am trying to split a string like this in Postgres 9.4:
"some text 123_good_345 and other text 123_some_invalid and 222_work ok_333 stop."
using pattern: (\d+\_.*\_\d+\D)+?
result is:
"123_good_345"
"123_some_invalid and 222_work ok_333"
But I need
"123_good_345"
"222_work ok_333"
note, ignoring "123_some_invalid"
Please help!
You may use
\d+_(?:(?!\d_).)*_\d+
See the regex demo. Or, if there can be no digits between \d+_ and _\d+, use
\d+_\D+_\d+
See this regex demo.
Details
\d+ - 1 or more digits
_ - an underscore
(?:(?!\d_).)* - any char, 0 or more repetitions, as many as possible, that does not start a digit + _ char sequence
\D+ - any 1+ chars other than digits
_ - an underscore
\d+ - 1+ digits.
See the PostgreSQL demo:
SELECT unnest(regexp_matches('some text 123_good_345 and other text 123_some_invalid and 222_work ok_333 stop.', '\d+_(?:(?!\d_).)*_\d+', 'g'));
or
SELECT unnest(regexp_matches('some text 123_good_345 and other text 123_some_invalid and 222_work ok_333 stop.', '\d+_\D+_\d+', 'g'));
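For comparison, both patterns can be run outside the database. This is a minimal sketch assuming Python's `re.findall` as a stand-in for `regexp_matches` with the 'g' flag; the variable names are made up.

```python
import re

text = ("some text 123_good_345 and other text 123_some_invalid "
        "and 222_work ok_333 stop.")

# Tempered pattern: the middle part may not run into another "digit + _".
tempered = re.findall(r"\d+_(?:(?!\d_).)*_\d+", text)
# Simpler pattern: valid only if no digits occur between the underscores.
no_digits = re.findall(r"\d+_\D+_\d+", text)

print(tempered)   # ['123_good_345', '222_work ok_333']
print(no_digits)  # ['123_good_345', '222_work ok_333']
```

Both patterns skip "123_some_invalid" because nothing after it supplies the closing underscore followed by digits.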
I am trying to create a regex that matches only the 11-character alphanumeric strings in a paragraph such as the example below. The problem is that it also selects strings that contain letters only.
My regex and input data can be seen here.
Sample text:
RCLO DD 12-10-15 IAD RO N2905198759 PTD 12-08-15 SWC
CRO N2905198759 FCD 12-07-15 WOT 12-0
MCN 999LDCMCWCG PROJECT 309097-2 VER 04 OCO TSR BSRNCA70M00
WORK DESCRIPTION AND NOTES: CCO TSR BSRNCA70M00
MANUALLY
DIVERSER CIRCUITS SEE RPON, 9152 IRMK AAI DWGVILAZW02 IRMK ALCON IDR INFORMATION U
PDATED ON THE DESIGN AT HFESILWL AND EGVGILEG
The pattern is
\b([A-Z0-9]{11})\b
In the above example it should not select "DESCRIPTION" and "INFORMATION"
You may use
\b(?=[A-Z]*[0-9])(?=[0-9]*[A-Z])[A-Z0-9]{11}\b
See the regex demo
Details
\b - word boundary
(?=[A-Z]*[0-9]) - after 0+ uppercase ASCII letters, there must be 1 ASCII digit
(?=[0-9]*[A-Z]) - after 0+ ASCII digits, there must be 1 uppercase ASCII letter
[A-Z0-9]{11} - 11 uppercase ASCII letters or digits
\b - a trailing word boundary.
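A short Python sketch of this pattern, applied to three lines taken from the sample text. The variable names are hypothetical.

```python
import re

# Three lines from the sample text in the question.
sample = (
    "MCN 999LDCMCWCG PROJECT 309097-2 VER 04 OCO TSR BSRNCA70M00\n"
    "WORK DESCRIPTION AND NOTES: CCO TSR BSRNCA70M00\n"
    "DIVERSER CIRCUITS SEE RPON, 9152 IRMK AAI DWGVILAZW02 IRMK ALCON "
    "IDR INFORMATION U\n"
)

# 11 uppercase letters/digits, with at least one digit and one letter.
MIXED_11 = re.compile(r"\b(?=[A-Z]*[0-9])(?=[0-9]*[A-Z])[A-Z0-9]{11}\b")

codes = MIXED_11.findall(sample)
print(codes)  # ['999LDCMCWCG', 'BSRNCA70M00', 'BSRNCA70M00', 'DWGVILAZW02']
```

"DESCRIPTION" and "INFORMATION" are skipped because the first lookahead finds no digit after the run of letters.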
I am trying to find a code in a table column that contains a sequence of letters and numbers. The code contains the prefix AB (^AB), followed by a sequence of either just letters (A or AAA) or letters and numbers (1, A1 or 1A).
I need a regular expression that returns YES/NO depending on whether the characters following the prefix contain a number.
What I have so far is:
SELECT 'AB1AX' RLIKE '^AB[A-Z0-9]+(?=\\d)$';
SELECT 'ABA1X' RLIKE '^AB[A-Z0-9]+(?=\\d)$';
SELECT 'AB09' RLIKE '^AB[A-Z0-9]+(?=\\d)$';
However this does not match.
Your regex will never match any string, because the lookahead (?=\d) requires a digit at the very position where $ requires the end of the string.
You need to use
^AB[A-Z]*[0-9][A-Z0-9]*$
See the regex demo
Pattern details:
^ - start of string
AB - the "hard-coded" prefix
[A-Z]* - 0+ uppercase ASCII chars
[0-9] - a digit
[A-Z0-9]* - 0+ uppercase letters/digits
$ - end of string.
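The pattern can be sanity-checked with a minimal Python sketch. The helper name `has_digit_after_prefix` and the two letters-only inputs are assumptions added for illustration; in SQL the same pattern would go on the right-hand side of RLIKE.

```python
import re

# AB prefix, optional letters, one mandatory digit, then letters/digits.
PREFIX_DIGIT_RE = re.compile(r"^AB[A-Z]*[0-9][A-Z0-9]*$")

def has_digit_after_prefix(code: str) -> bool:
    """True when the part after the AB prefix contains at least one digit."""
    return PREFIX_DIGIT_RE.match(code) is not None

for code in ["AB1AX", "ABA1X", "AB09", "ABA", "ABAAA"]:
    print(f"{code}: {'YES' if has_digit_after_prefix(code) else 'NO'}")
```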