Regex: How to display 4 digits after the "L" (and likewise after the "C"), e.g. L100_C1_"1"_KO, L100_C2_"3260"_KO, etc.

Hi regex experts,
I have an ArrayList al1 whose entries look like this (one entry per line):
al1 : L1_C1_0, L1_C2_"11229", L1_C2_"CHK_CASHING"_OK, etc... L1_C100_"FR45248624892", L2_C1_0, L2_C2_"11229", L2_C2_"CHK_CASHING"_OK etc... L2_C100_"FR45248624892"_KO, L3_C1_0, L3_C2_"11229", L3_C2_"CHK_CASHING"_OK etc... L3_C100_"FR45248624892"_KO, L4_C1_0, L3_C2_"11229", L4_C2_"CHK_CASHING"_OK etc... L4_C100_"FR45248624892"_OK
I wrote this regex, but it doesn't work the way I want:
String spattern = "(L(([1-9]?[0-9])|100)_C\\d_\\W.*?L\\2_C\\d{3}_\".*?\"(?:,?\$?))";
I want the output to look like this:
L1_C1_0, L1_C2_"11229", L1_C2_"CHK_CASHING"_OK, etc...L1_C100_"FR45248624892"
L2_C1_0, L2_C2_"11229", L2_C2_"CHK_CASHING"_OK etc...L2_C100_"FR45248624892"_KO
L3_C1_0, L3_C2_"11229", L3_C2_"CHK_CASHING"_OK etc...L3_C100_"FR45248624892"_KO
L4_C1_0, L3_C2_"11229", L4_C2_"CHK_CASHING"_OK etc...L4_C100_"FR45248624892"_OK
L5_C1_1 etc...
Can someone help me display it this way?
Thank you very much for your help.
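One way to get this layout without a single large regex is to split the list into entries and group them by the number after the "L". The following is only a minimal sketch in JavaScript (the question uses Java, where the same idea should carry over with java.util.regex and an insertion-ordered map). The helper name groupByLine and the sample array are hypothetical, and every entry is assumed to start with L<number>_C<number>.

```javascript
// Hypothetical helper: groups entries such as 'L1_C2_"11229"' by their L number.
// Assumes every entry starts with L<digits>_C<digits>; other entries are skipped.
function groupByLine(entries) {
  const groups = new Map(); // a Map keeps the L numbers in first-seen order
  for (const rawEntry of entries) {
    const entry = rawEntry.trim();
    const m = /^L(\d{1,4})_C\d{1,4}(?:_|$)/.exec(entry);
    if (m === null) continue; // not in the expected format
    const key = m[1];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(entry);
  }
  // One output line per L number, entries joined as in the question
  return [...groups.values()].map((items) => items.join(", "));
}

const al1 = ['L1_C1_0', 'L1_C2_"11229"', 'L1_C2_"CHK_CASHING"_OK',
             'L2_C1_0', 'L2_C2_"11229"', 'L2_C100_"FR45248624892"_KO'];
for (const line of groupByLine(al1)) {
  console.log(line);
}
```

Grouping by the captured number rather than by position means an entry that appears out of order still ends up on the line of its own L number.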

Related

Regex '\K' is not working in RUTA but works in a regex builder

I am trying to extract the last 3 characters of a pattern with the regex below. It works in an online regex tester but not in RUTA.
Here is what I tried in the online regex builder:
https://regex101.com/r/2JN9a5/1
Here is what I tried in RUTA:
"(?i)\\b([QI]{2}|[Q])[\\s || -]{0,2}[0-9]{5,}[\\s || -]{0,2}\\K[A-Z]{3}\\b" -> EntityType;
Input: Q-123456-PAD
Expected output: PAD
Input: QI-1234567-PLB
Expected output: PLB
If it is Pega then try this
PACKAGE uima.ruta.example;
DECLARE VarA;
DECLARE VarB;
DECLARE VarC;
W{REGEXP("Q|QI") -> MARK(VarA)}
(WS|"-")?
NUM{REGEXP(".{1,7}")-> MARK(VarB)}
(WS|"-")?
W{REGEXP(".{1,3}")-> MARK(VarC),MARK(EntityType,5,5), UNMARK(VarA), UNMARK(VarB), UNMARK(VarC)};
Explanation:
(WS|"-")? : an optional space or "-". You can remove the ? if one of them is always present.
NUM{REGEXP(".{1,7}")} : a number of 1 to 7 digits.
W{REGEXP(".{1,3}")} : a word of 1 to 3 characters (capital letters in the examples above).
MARK(EntityType,5,5) : marks only the 5th rule element, i.e. W{REGEXP(".{1,3}")}. If you use MARK(EntityType,1,5) instead, it returns Q-123456-PAD.
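As a side note on the original regex: \K comes from PCRE, and java.util.regex (which RUTA, being Java-based, presumably relies on) does not support it. The usual substitute is a capture group around the part to keep. The sketch below shows that idea in JavaScript, which also lacks \K; the helper name extractSuffix is hypothetical, and the prefix is simplified to QI|Q as an assumption about what [QI]{2}|[Q] was meant to match.

```javascript
// \K is replaced by a capture group: match the whole ID, keep only group 1.
// Simplification (assumption): the prefix is written as QI|Q instead of [QI]{2}|[Q].
const ID_PATTERN = /\b(?:QI|Q)[\s-]{0,2}[0-9]{5,}[\s-]{0,2}([A-Z]{3})\b/i;

// Hypothetical helper: returns the trailing 3-letter code, or null if no ID is found.
function extractSuffix(text) {
  const m = ID_PATTERN.exec(text);
  return m === null ? null : m[1];
}

console.log(extractSuffix("Q-123456-PAD"));
console.log(extractSuffix("QI-1234567-PLB"));
```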

How can I add line breaks in JavaScript with a regex while keeping the separator

What I want to do is change this:
1.apple2.cat3.green(1)table(2)computer①what②can i③do?●help●me●plz
to this:
1.apple
2.cat
3.green
(1)table
(2)computer
①what
②can i
③do?
●help
●me
●plz
There are many kinds of delimiters:
"1.", "2." .. "(1)".."(2)"..■ ○
and so on.
Each number is only a single digit.
I want to list many delimiters and split or add a line break at each one, while keeping the delimiter:
the number or bullet should not be deleted.
You can use regex like below:
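let s = '1.apple2.cat3.green(1)table(2)computer①what②can i③do?●help●me●plz';
// A delimiter, then everything up to the next delimiter or the end of the string
// (the earlier [a-z]+ stopped at the space in "can i" and at the "?" in "do?")
let regex = /(?:\d\.|\(\d\)|[①-⑳●■○]).*?(?=\d\.|\(\d\)|[①-⑳●■○]|$)/g;
let result = null;
while ((result = regex.exec(s)) !== null) {
    console.log(result[0]); // or you can push into an array, etc.
}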
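An alternative that may be easier to extend is to split on a zero-width lookahead, so the delimiter stays attached to the item that follows it. This is a minimal sketch; the helper name breakBeforeDelimiters is hypothetical, and the list of bullet characters is an assumption to be extended as needed.

```javascript
// Delimiters: "1." style, "(1)" style, circled numbers, and bullets.
// The character list is an assumption; extend it with any other bullets you need.
const DELIMITER = /(?=\d\.|\(\d\)|[①-⑳●■○])/;

// Hypothetical helper: puts each delimited item on its own line, keeping the delimiter.
function breakBeforeDelimiters(text) {
  // The lookahead matches an empty string just before each delimiter,
  // so split() cuts there without consuming any characters.
  return text.split(DELIMITER).join("\n");
}

const input = "1.apple2.cat3.green(1)table(2)computer①what②can i③do?●help●me●plz";
console.log(breakBeforeDelimiters(input));
```

Because nothing is consumed, items may contain spaces or punctuation (such as "can i" or "do?") without being cut short.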

Google Sheets: REGEXREPLACE to match everything except a particular pattern

I am trying to replace everything inside this string:
[JGMORGAN - BANK2] n° 10 NEWYORK, n° 222 CAEN, MONTELLIER, VANNES / TARARTA TIs
1303222074, 1403281851 & 1307239335 et Cloture TIs 1403277567,
1410315029
except the following numbers:
1303222074
1403281851
1307239335
1403277567
1410315029
I have built a regex that matches them:
1[0-9]{9}
But I have not figured out how to do the opposite, that is, match everything except these matches.
Google Sheets uses the RE2 regex engine, which lacks many useful features that would help here (lookarounds, for example). A basic workaround is to match everything and capture only what you want to preserve:
pattern: [0-9]*(?:[0-9]{0,9}[^0-9]+)*(?:([0-9]{9,})|[0-9]*\z)
replacement: $1 (followed by a space)
So probably something like this:
=TRIM(REGEXREPLACE("[JGMORGAN - BANK2] n° 10 NEWYORK, n° 222 CAEN, MONTELLIER, VANNES / TARARTA TIs 1303222074, 1403281851 & 1307239335 et Cloture TIs 1403277567, 1410315029"; "[0-9]*(?:[0-9]{0,9}[^0-9]+)*(?:([0-9]{9,})|[0-9]*\z)"; "$1 "))
You can also do this with dynamic native functions:
=REGEXEXTRACT(A1,rept("(\d{10}).*",counta(split(regexreplace(A1,"\d{10}","#"),"#"))-1))
Basically, it first splits by the desired pattern to count its occurrences, then repeats the regex to build that many capture groups dynamically, leaving you with only those values.
First of all, thank you Casimir for your help. It gave me an idea for something that is not possible with built-in functions and a strong regex alone lol.
I found out that I can write a custom function for my own purposes (yes, I'm not very "up to date").
It's not very well coded and it returns duplicates. Rather than fixing it properly, I wrap it in the built-in UNIQUE() function to get rid of them; it's ugly and I'm lazy, but it does the job, that is, it returns a list of all matches of one specific regex (1[0-9]{9}). Here it is:
function ti_extract(input) {
    var tab_tis = new Array();
    var tab_strings = new Array();
    tab_tis.push(input.match(/1[0-9]{9}/)); // get the TI and insert it in tab_tis
    var string_modif = input.replace(tab_tis[0], " "); // modify the source string (remove the TI just found)
    tab_strings.push(string_modif); // insert this new string in the table
    var v = 0;
    var patt = new RegExp(/1[0-9]{9}/);
    var fin = patt.test(tab_strings[v]);
    var first_string = tab_strings[v];
    do {
        first_string = tab_strings[v]; // string 0, or the string with the first TI removed
        tab_tis.push(first_string.match(/1[0-9]{9}/)); // analyze the string and get the new TI to put it in the table
        var string_modif2 = first_string.replace(tab_tis[v], " "); // modify the string again to remove the new TI from the old string
        tab_strings.push(string_modif2);
        v += 1;
    }
    while (v < 15)
    return tab_tis;
}
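For comparison, a shorter variant is sketched below. It relies on the g flag, which makes match() return every occurrence in one call, and on a Set to drop duplicates, so no loop or UNIQUE() wrapper should be needed. The name extractTis is hypothetical, and the sketch assumes a runtime with ES2015 support.

```javascript
// Hypothetical, shorter variant of ti_extract: the g flag makes match() return
// every occurrence at once, and a Set removes duplicates while keeping order.
function extractTis(input) {
  // match() returns null when nothing is found, hence the fallback to []
  const matches = String(input).match(/1[0-9]{9}/g) || [];
  return [...new Set(matches)];
}

const sample = "TIs 1303222074, 1403281851 & 1307239335 et Cloture TIs 1403277567, 1410315029";
console.log(extractTis(sample));
```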

Regex: prepend a string to each match

I am trying to add sheet names to the cell references in an Excel formula that do not have one.
For example, a formula on curr_sheet:
'other_sheet'!$B$2 + A1
should become :
'other_sheet'!$B$2 + 'curr_sheet'!A1
public static void addSheetNameToCells(String formula, String sheetName) {
    String noSheetNameBeforeCellRegex =
        "('[A-Za-z0-9-_]+'!)?[$]?[A-Z]{1,3}[$]?[0-9]+";
    System.out.println("qwerty-" + formula.replaceAll(
        noSheetNameBeforeCellRegex, "'" + sheetName + "'!$0"));
}
The above code gives me :
'curr_sheet'!'other_sheet'!$B$2 + 'curr_sheet'!A1
I think the solution is a lookbehind. I tried (?<!('!))[$]?[A-Z]{1,3}[$]?[0-9]+ but it conflicts with the $ sign in $B$2 and gives me this:
'other_sheet'!$'curr_sheet'!B$2 + 'curr_sheet'!A1
Is there a solution to this?
Building on what @alessandroasm suggested, a small change to his solution seems to work for all scenarios: (?<!'!)((?<![A-Z])[A-Z]{1,3}[0-9]+|\$[A-Z]{1,3}\$[0-9]+)
The following expression will match cells in the format A1 or $A$1 that are not preceded by '!
(?<!'!)([A-Z]{1,3}[0-9]+|\$[A-Z]{1,3}\$[0-9]+)
http://regex101.com/r/yO8rW1/2
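To see the lookbehind version at work, here is a minimal sketch in JavaScript, whose regex engine also supports lookbehind in recent runtimes. The function mirrors addSheetNameToCells from the question but is a hypothetical port rather than the original Java; it uses the variant with the extra (?<![A-Z]) guard and, like that pattern, does not cover mixed references such as $A1 or A$1.

```javascript
// Matches A1 or $A$1 style references that are not already prefixed by 'sheet'!
// (pattern taken from the answer above; mixed references such as $A1 are not covered).
const UNQUALIFIED_CELL = /(?<!'!)((?<![A-Z])[A-Z]{1,3}[0-9]+|\$[A-Z]{1,3}\$[0-9]+)/g;

// Hypothetical helper mirroring addSheetNameToCells from the question.
function addSheetNameToCells(formula, sheetName) {
  // A replacer function avoids "$" in the sheet name being read as a replacement token.
  return formula.replace(UNQUALIFIED_CELL, (cell) => `'${sheetName}'!${cell}`);
}

console.log(addSheetNameToCells("'other_sheet'!$B$2 + A1", "curr_sheet"));
```

The inner (?<![A-Z]) keeps the pattern from starting in the middle of a multi-letter column such as AB12 when that reference is already qualified.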

Regex: how can I split this word?

I have a list of several phrases in the following format
thisIsAnExampleSentance
hereIsAnotherExampleWithMoreWordsInIt
and I'm trying to end up with
This Is An Example Sentance
Here Is Another Example With More Words In It
Each phrase has the white space condensed and the first letter is forced to lowercase.
Can I use regex to add a space before each A-Z and have the first letter of the phrase be capitalized?
I thought of doing something like
([a-z]+)([A-Z])([a-z]+)([A-Z])([a-z]+) // etc
$1 $2$3 $4$5 // etc
but with 50 records of varying length, my idea is a poor solution. Is there a more dynamic way to do this with regex? Thanks
A Java fragment I use looks like this (now revised):
result = source.replaceAll("(?<=^|[a-z])([A-Z])|([A-Z])(?=[a-z])", " $1$2");
result = result.substring(0, 1).toUpperCase() + result.substring(1);
This, by the way, converts the string givenProductUPCSymbol into Given Product UPC Symbol, so make sure that is fine for the way you use this type of thing.
Finally, a single-line version could be:
result = source.substring(0, 1).toUpperCase() + source.substring(1).replaceAll("(?<=^|[a-z])([A-Z])|([A-Z])(?=[a-z])", " $1$2");
Also, in an example similar to one given in the question comments, the string hiMyNameIsBobAndIWantAPuppy will be changed to Hi My Name Is Bob And I Want A Puppy.
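The same two-branch pattern can be tried outside Java as well. Below is a minimal sketch in JavaScript, assuming a runtime with lookbehind support; splitCamelCase is a hypothetical name, and the regex is copied from the Java fragment above.

```javascript
// Same two-branch pattern as the Java fragment: a capital after a lowercase letter
// (or at the start), or a capital followed by a lowercase letter.
const WORD_START = /(?<=^|[a-z])([A-Z])|([A-Z])(?=[a-z])/g;

// Hypothetical helper: "thisIsAnExample" -> "This Is An Example".
function splitCamelCase(source) {
  if (source.length === 0) return source;
  // In JavaScript a group that did not participate expands to an empty string,
  // so " $1$2" inserts a space before whichever capital was matched.
  const spaced = source.replace(WORD_START, " $1$2");
  return spaced.charAt(0).toUpperCase() + spaced.slice(1);
}

console.log(splitCamelCase("thisIsAnExampleSentance"));
console.log(splitCamelCase("givenProductUPCSymbol"));
```

The second branch is what keeps a run of capitals such as UPC together: a capital only starts a new word there if a lowercase letter follows it.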
For the space problem, it's easy if your language supports zero-width lookbehind:
var result = Regex.Replace(@"thisIsAnExampleSentanceHereIsAnotherExampleWithMoreWordsInIt", "(?<=[a-z])([A-Z])", " $1");
or even if it doesn't support it:
var result2 = Regex.Replace(@"thisIsAnExampleSentanceHereIsAnotherExampleWithMoreWordsInIt", "([a-z])([A-Z])", "$1 $2");
I'm using C#, but the regexes should be usable in any language that supports replacement with $1...$n.
But the lowercase-to-uppercase conversion can't be done directly in the regex. You can find the first character with a regex like ^[a-z], but you can't convert it.
For example, in C# you could do
var result3 = Regex.Replace(result, "^([a-z])", m =>
{
    return m.ToString().ToUpperInvariant();
});
using a match evaluator to change the input string.
You could then even fuse the two together
var result4 = Regex.Replace(@"thisIsAnExampleSentanceHereIsAnotherExampleWithMoreWordsInIt", "^([a-z])|([a-z])([A-Z])", m =>
{
    if (m.Groups[1].Success)
    {
        // First character of the string: uppercase it
        return m.ToString().ToUpperInvariant();
    }
    else
    {
        // Lowercase letter followed by a capital: insert a space between them
        return m.Groups[2].ToString() + " " + m.Groups[3].ToString();
    }
});
A Perl example with Unicode character support:
s/\p{Lu}/ $&/g;
s/^./\U$&/;