Regex replace in Android Studio to UPPERCASE - regex

Does anyone know if it is possible to perform a regular expression replace operation in Android Studio where a particular match is converted to uppercase?
Example:
I want to find all occurrences of:
Log.i
Log.e
Log.d
...and replace them with :
if ( LogConfig.LOGI ) Log.i
if ( LogConfig.LOGE ) Log.e
if ( LogConfig.LOGD ) Log.d
In other words, some of the replacements are as is (no brainer) but others must be CAPITALIZED.
If this is possible, how do I do this?

You may use
(Log)\.([ied])
Replace with if ( LogConfig.\U$1$2\E ) $0. See the regex demo.
If you need to match Log.e as a whole word, add word boundaries, \b(Log)\.([ied])\b.
Details
(Log) - Capturing group 1: Log
\. - a dot
([ied]) - Capturing group 2: a letter i, e or d.
The \U$1$2\E means:
\U - start turning to upper case all that follows:
$1 - Group 1 value
$2 - Group 2 value
\E - stop turning to uppercase.
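Outside the IDE, the same idea can be sketched in Python. Python's re module has no \U...\E case operators in replacement strings, so this sketch assumes a replacement function instead. The name guard_log_calls is hypothetical and not part of the original answer.

```python
import re

# Same pattern as above, with the optional word boundaries added.
LOG_CALL = re.compile(r"\b(Log)\.([ied])\b")

def guard_log_calls(source):
    """Prefix each Log.i / Log.e / Log.d with an upper-cased LogConfig guard."""
    def repl(m):
        flag = (m.group(1) + m.group(2)).upper()          # plays the role of \U$1$2\E
        return f"if ( LogConfig.{flag} ) {m.group(0)}"    # m.group(0) is $0
    return LOG_CALL.sub(repl, source)

print(guard_log_calls('Log.i(TAG, "started");'))
```

Calls such as Log.w are left untouched because the character class only lists i, e and d.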

Related

Regex to get all text occurrences between parentheses encapsulated by a second pattern

I need a regex that will get all the text occurrences between parentheses, keeping in mind that all the content is enclosed between the word BEGIN at the start and the chars ---- at the end.
Input example:
BEGIN ) Tj\nET37.66 533 Td\n( Td\n(I NEED THIS TEXT ) Tj\nET\nBT\n37.334 Td\n(AND ALSO NEED THIS TEXT ) Tj\nET\nBT\n37.55 Td\n(------------
Expected matches:
I NEED THIS TEXT
AND ALSO NEED THIS TEXT
I already tried something like (?<=BEGIN).*(?=\(--) for the outside pattern, but I couldn't figure out how to get all text occurrences inside parentheses within that region.
With the Python PyPI regex library, you can use
(?s)(?:\G(?!^)\)|BEGIN)(?:(?!\(--).)*?\((?!--)\K[^()]*
See the regex demo
Details:
(?s) - a DOTALL inline modifier making . match line break chars
(?:\G(?!^)\)|BEGIN) - either BEGIN or the end of the previous successful match and a ) right after
(?:(?!\(--).)*? - any char, zero or more but as few as possible occurrences, that does not start a (-- char sequence
\( - a ( char
(?!--) - right after (, there should be no --
\K - match reset operator: what was matched before is discarded from the overall match memory buffer
[^()]* - zero or more chars other than ( and )
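If the third-party regex module is not available, a similar result can be approximated in two steps with the standard re module, which supports neither \G nor \K. Note that this is a swapped-in technique, not the pattern above: first isolate the text between BEGIN and (--, then collect each parenthesised run that contains no nested parentheses. The function name is hypothetical, and stripping the trailing space is an assumption based on the expected matches.

```python
import re

def texts_between_markers(data):
    """Return parenthesised texts found between BEGIN and the first '(--'."""
    region = re.search(r"BEGIN(.*?)\(--", data, re.DOTALL)
    if region is None:
        return []
    # [^()]* cannot cross another parenthesis, so the stray "( Td" is skipped.
    return [t.strip() for t in re.findall(r"\(([^()]*)\)", region.group(1))]

sample = ("BEGIN ) Tj\nET37.66 533 Td\n( Td\n(I NEED THIS TEXT ) Tj\nET\nBT\n"
          "37.334 Td\n(AND ALSO NEED THIS TEXT ) Tj\nET\nBT\n37.55 Td\n(------------")
print(texts_between_markers(sample))
```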
Try:
\(((?:(?!BEGIN).)*?)\)(?=.*---)
Regex demo.
\(((?:(?!BEGIN).)*?)\) - Match everything between ( ), but not BEGIN
(?=.*---) - .*--- must follow after this match

Matching words & partial colon-delimited words within parentheses (excluding parentheses)

I am trying to extract stock symbols from a body of text. These matches usually come in the following forms:
(<symbol>) => (VOO)
(<market>:<symbol>) => (NASDAQ:C)
In the sample cases shown above, I'd like to match VOO and C, skipping everything else. This regex gets me halfway there:
(?<=\()(.*?)(?=\))
With this, I match what's included within the parentheses, but the logic that ignores "noise" like NASDAQ: eludes me. I'd love to learn how to conditionally specify this pattern/logic.
Any ideas? Thanks!
You can use
[A-Z]+(?=\))
See the regex demo.
Details:
[A-Z]+ - one or more uppercase ASCII letters
(?=\)) - a positive lookahead that matches a location that is immediately followed with a ) char.
Alternatively, you can use the following to capture the values into Group 1:
\((?:[^():]*:)?([A-Z]+)\)
See this regex demo. Details:
\( - a ( char
(?:[^():]*:)? - an optional sequence of any zero or more chars other than (, ) and : and then a : char
([A-Z]+) - Group 1: one or more uppercase ASCII letters
\) - a ) char.
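As a rough illustration, here are both patterns run through Python's re.findall. The sample sentence is made up for this sketch. It includes (classB) to show that the lookahead version accepts any uppercase run directly before a ), while the capturing version requires the whole parenthesised content to fit the symbol shape.

```python
import re

text = "Vanguard (VOO), Citigroup (NASDAQ:C), and (classB) shares"

# Pattern 1: uppercase letters immediately followed by ")".
by_lookahead = re.findall(r"[A-Z]+(?=\))", text)

# Pattern 2: optional "market:" prefix, then the symbol captured in Group 1.
by_group = re.findall(r"\((?:[^():]*:)?([A-Z]+)\)", text)

print(by_lookahead)
print(by_group)
```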

Match string between delimiters, but ignore matches with specific substring

I have to parse all the text in parentheses, but not the text that contains "GST"
e.g:
(AUSTRALIAN RED CROSS – ATHERTON)
(Total GST for this Invoice $1,104.96)
today for a quote (07) 55394226 − admin.nerang#waste.com.au − this applies to your Nerang services.
expected parsed value:
AUSTRALIAN RED CROSS – ATHERTON
I am trying:
^\(((?!GST).)*$
But it's only matching the value and not grouping correctly.
https://regex101.com/r/HndrUv/1
What would be the correct regex for the same?
This regex should work to get the expected string:
^\((?!.*GST)(.*)\)$
It first checks that the text does not contain a match for .*GST. If that check passes, it then captures the entire text.
(?!.*GST)(.*)
All of that is then surrounded by \( and \) so that the parentheses stay out of the capturing group.
\((?!.*GST)(.*)\)
Finally you add the BOL and EOL symbols and you get the result.
^\((?!.*GST)(.*)\)$
The expected value is saved in the first capture group (.*).
You can use
^\((?![^()]*\bGST\b)([^()]*)\)$
See the regex demo. Details:
^ - start of string
\( - a ( char
(?![^()]*\bGST\b) - a negative lookahead that fails the match if, immediately to the right of the current location, there are zero or more chars other than ) and ( and then GST as a whole word (remove \bs if you do not need whole word matching)
([^()]*) - Group 1: any zero or more chars other than ) and (
\) - a ) char
$ - end of string
Bonus:
If substrings in longer texts need to be matched, too, you need to remove ^ and $ anchors in the above regex.
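A minimal Python sketch of the pattern above, applied to the sample from the question. It assumes the re.MULTILINE flag so that ^ and $ anchor at each line rather than at the ends of the whole string. The variable names are only for illustration.

```python
import re

# re.MULTILINE is an assumption: the sample holds several lines in one string.
PATTERN = re.compile(r"^\((?![^()]*\bGST\b)([^()]*)\)$", re.MULTILINE)

sample = """(AUSTRALIAN RED CROSS – ATHERTON)
(Total GST for this Invoice $1,104.96)
today for a quote (07) 55394226 − admin.nerang#waste.com.au − this applies to your Nerang services."""

matches = PATTERN.findall(sample)   # findall returns the Group 1 values
print(matches)
```

The third line is skipped because it does not start with a parenthesis. With the anchors removed, as the bonus note says, (07) would be matched as well.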

Notepad++ REGEX Masking / Sanitise Data

There's a requirement to sanitise the production file and then hand it over to a third party. The integrity (the number of characters and digits) should remain the same.
<ADD1<4, Privet Drive, Scotland, EC12 5FL, UK<
In the above example, we need to mask digits with 9, and letters with X or x (based on case).
The target data should be:
<ADD1<9, Xxxxxx Xxxxx, Xxxxxxxx, XX99 9XX, XX<
NP++ supposedly uses boost::regex engine.
And further, it apparently uses the boost-extended replacement format string.
This means you can put a conditional within the replacement string to test
which group matched, then replace accordingly.
syntax: (?1yes:no) says did group 1 match, do yes, else do no
syntax: (?{1}yes:no) same
If it has boost::regex, use the following (updated to match only between <ADD1< and <):
find (?:(?!^)\G|<ADD1<)[^a-zA-Z0-9<]*\K(?:([A-Z])|([a-z])|\d)
replace (?1X:(?2x:9))
Note - select the replacement string format as Boost Extended
if it is not the default.
https://regex101.com/r/pJCsZa/1
Regex info
(?:
     (?! ^ )
     \G                 # Start match where last left off
  |                     # or,
     <ADD1<             # New start
)
[^a-zA-Z0-9<]*          # Optional non-letter or digit or <
\K                      # Ignore matched characters up to here
(?:                     # What's left, a letter or a digit
     ( [A-Z] )          # (1)
  |  ( [a-z] )          # (2)
  |  \d
)
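For scripting the same masking outside Notepad++, here is a hedged Python sketch. Python's re module has no \G, \K or conditional replacement, so it swaps in a different technique: one substitution isolates the text after <ADD1< up to the next <, and a nested substitution masks each letter or digit. The function names are hypothetical.

```python
import re

def mask_char(m):
    """Map one ASCII letter or digit to its mask character."""
    ch = m.group(0)
    if ch.isdigit():
        return "9"
    return "X" if ch.isupper() else "x"

def mask_add1(line):
    """Mask letters and digits only between <ADD1< and the next <."""
    return re.sub(
        r"(?<=<ADD1<)[^<]*",                                   # the field body
        lambda m: re.sub(r"[A-Za-z0-9]", mask_char, m.group(0)),
        line,
    )

print(mask_add1("<ADD1<4, Privet Drive, Scotland, EC12 5FL, UK<"))
```

Punctuation and spaces pass through unchanged, so the field keeps its original length.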
You should be able to do a series of replacements here. Do each replacement by searching in regex mode, and then use the appropriate replacement:
[A-Z] -> replace with X
[a-z] -> replace with x
[0-9] -> replace with 9
I suggest highlighting the entire address text and then doing the replacement.

Regex and multiple matches [duplicate]

If I have a big text and need to keep only the matched content, how can I do that?
For example, if I have a text like this:
asdas8Isd8m8Td8r
asdia8y8dasd
asd8is88n8gd
asd8t8od8lsdas
as9ea9ad8r1n88r8e87g6765ejasdm8x
If I use the regex [0-9]([a-z]) to capture each letter that follows a digit and replace with \1, every (digit)(letter) pair is replaced with (letter). But what if I also want to delete the rest and keep only the matched letters?
That is, converting the text to:
ImTr
y
ing
tol
earnregex
How can I replace the text with the captured groups and delete the rest?
And what if I want to keep the whole matches and delete everything else?
In this case, converting the text to:
8I8m8T8r
8y8d
8i8n8g
8t8o8l
9e9a9r1n8r7g5e8x
Can I match everything that is not [0-9]([a-z])?
Thanks! :D
You may use the following regex:
(?i-s)[0-9]([a-z])|.
Replace with (?{1}$1:).
To keep only the whole matches (digit plus letter) and delete everything else, use the (?{1}$0:) replacement with the same regex.
Details:
(?i-s) - an inline modifier turning on case insensitive mode and turning off the DOTALL mode (. does not match a newline)
[0-9]([a-z]) - an ASCII digit and any ASCII letter captured into Group 1 (later referred to with $1 or \1 backreference from the string replacement pattern)
| - or
. - any char but a line break char.
Replacement details
(?{1} - start of the conditional replacement: if Group 1 matched then...
$1 - the contents of Group 1 (or the whole match if $0 backreference is used)
: - else... nothing
) - end of the conditional replacement pattern.
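The conditional replacement is specific to Boost-style format strings. As a rough equivalent, a Python sketch can make the same decision in a replacement function: if Group 1 took part in the match, emit something, otherwise emit nothing. The function names below are hypothetical.

```python
import re

# Same regex as above: a digit+letter pair (letter in Group 1), or any other char.
PAIR_OR_CHAR = re.compile(r"[0-9]([a-z])|.", re.IGNORECASE)

def keep_letters(text):
    """Counterpart of (?{1}$1:): keep only the captured letters."""
    return PAIR_OR_CHAR.sub(lambda m: m.group(1) or "", text)

def keep_pairs(text):
    """Counterpart of (?{1}$0:): keep the whole digit+letter matches."""
    return PAIR_OR_CHAR.sub(lambda m: m.group(0) if m.group(1) else "", text)

print(keep_letters("asdas8Isd8m8Td8r\nasd8is88n8gd"))
print(keep_pairs("asdas8Isd8m8Td8r"))
```

Line breaks survive because . does not match a newline unless DOTALL is enabled.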