Replacement of strings within 2 strings in regex - regex
I have a string:
dkj a * & &*(&(*(
//#HELLO
^%#&UJNWDUK()C*(v 8*J DK*9
//#HE#$^&&(akls#$98akdjl ak##sjdkja
//
%^&*(//#HELLO//#BYE<><>
//#BYE
^%#&UJNWDUK()C*(v 8*J DK*90K )
//#HELLO
&*^J$XUK 8j8 j jk kk8(&*(
//#BYE
and I need to match 2 groups. Each group must start with a //#HELLO line, then any text can follow on the next lines (.*), and it must end with a //#BYE on a line of its own:
1)
//#HELLO
^%#&UJNWDUK()C*(v 8*J DK*9
//#HE#$^&&(akls#$98akdjl ak##sjdkja
//
%^&*(//#HELLO//#BYE<><>
//#BYE
2)
//#HELLO
&*^J$XUK 8j8 j jk kk8(&*(
//#BYE
and then replace the original string with this (basically adding // to each line of each group):
dkj a * & &*(&(*(
////#HELLO
//^%#&UJNWDUK()C*(v 8*J DK*9
////#HE#$^&&(akls#$98akdjl ak##sjdkja
////
//%^&*(//#HELLO//#BYE<><>
////#BYE
^%#&UJNWDUK()C*(v 8*J DK*90K )
////#HELLO
//&*^J$XUK 8j8 j jk kk8(&*(
////#BYE
Here is my current progress:
I have
\/\/#HELLO\n.*?\/\/#BYE[\n$]
However, I'm not sure how to go about the replacement. I'm thinking of matching each line of a group separately, using \G after the //#HELLO and ending at the //#BYE.
It's a bit complex, but this will do it:
Search: (?m)(//#HELLO[\r\n]+|\G(?://#BYE|(?=(?:[^#]|#(?!HELLO[\r\n]+))*#BYE)[^\r\n]*[\r\n]*))
Replace: //$1
In Groovy:
String resultString = subjectString.replaceAll(/(?m)(\/\/#HELLO[\r\n]+|\G(?:\/\/#BYE|(?=(?:[^#]|#(?!HELLO[\r\n]+))*#BYE)[^\r\n]*[\r\n]*))/, '//$1');
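For engines that have no \G anchor (Python's built-in re module, for example), a different technique can give the same result: match each whole block, then prefix its lines in a replacement callback. This is a minimal sketch and not the regex above; it assumes \n line endings and that //#HELLO and //#BYE each sit on a line of their own. The names BLOCK and comment_blocks are made up for illustration.

```python
import re

# One whole block: a "//#HELLO" line, as little text as possible, then a
# line that is exactly "//#BYE". MULTILINE anchors ^/$ at line boundaries,
# DOTALL lets "." cross line breaks.
BLOCK = re.compile(r"^//#HELLO\n.*?^//#BYE$", re.MULTILINE | re.DOTALL)


def comment_blocks(text):
    """Prefix every line inside each //#HELLO ... //#BYE block with '//'."""
    def prefix_lines(match):
        return "\n".join("//" + line for line in match.group(0).split("\n"))

    return BLOCK.sub(prefix_lines, text)


sample = "keep\n//#HELLO\nx //#BYE y\n//#BYE\nkeep too\n//#HELLO\nz\n//#BYE"
print(comment_blocks(sample))
```

Because //#BYE is anchored with ^ and $, a //#BYE in the middle of a line (as in the sample data) does not end the block early.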
For grouping into separate lines use the following regex:
//#HELLO\r(.*[\n\r]+)*//#BYE\r?
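\r - Carriage return (the first character of a Windows \r\n line ending)
[\n\r] - Line-break characters (line feed or carriage return)
*? - Non-greedy match; write the group as (.*[\n\r]+)*? so the match stops at the first //#BYE line instead of the last one
? - Match 0 or 1 times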
You can take out the ? at the end if it always ends with a newline.
You can then use the group (The value inside the brackets) to search and replace.
Related
How to clear all commas except for commas in even position in sheet?
I have multiple rows of strings where the string is malformed. Here is one row as an example of the geometry and the expected output:
id: 1
geometry: POLYGON (( 106.812271, -6.361551, 106.812111, -6.361339, 106.81205, -6.361177, 106.81206, -6.360905, 106.812055, -6.360582, 106.812065, -6.360218, 106.812293, -6.359295, 106.812593, -6.358644, 106.812436, -6.358406, 106.8121515, -6.3582051, 106.8123, -6.357823, 106.81244, -6.357407, 106.812612, -6.356842, 106.812719, -6.356544, 106.81274, -6.356384, 106.812864, -6.356148, 106.813019, -6.356021, 106.813287, -6.355797, 106.813781, -6.355286, 106.814076, -6.354751, 106.814277, -6.354393, 106.814403, -6.354027, 106.814553, -6.353814, 106.814736, -6.353526, 106.814993, -6.353302, 106.81516, -6.353024, 106.815358, -6.35279, 106.815509, -6.352588, 106.815675, -6.352331, 106.8153007, -6.3521138, 106.8151398, -6.3520137, 106.8149789, -6.3518005, 106.8147643, -6.3516939, 106.8144639, -6.3516245, 106.8141527, -6.3515392, 106.8135734, -6.351342, 106.813171, -6.3512034, 106.8123284, -6.3509219, 106.8122418, -6.3511298, 106.8118164, -6.3521534, 106.8116597, -6.3525047, 106.8111849, -6.3535692, 106.8102245, -6.3554942, 106.8093545, -6.3568947, 106.8085097, -6.3580518, 106.80795, -6.358832, 106.8077793, -6.3590429, 106.807668, -6.359441, 106.807499, -6.360346, 106.8072531, -6.3616378, 106.8071476, -6.3622599, 106.8070637, -6.3626798, 106.8070823, -6.3629367, 106.8071207, -6.3634531, 106.8078269, -6.363831, 106.809448, -6.364124, 106.810574, -6.364198, 106.81066, -6.362993, 106.811175, -6.36277, 106.812087, -6.361703, 106.812271, -6.361551))
output: POLYGON (( 106.812271 -6.361551, 106.812111 -6.361339, 106.81205 -6.361177, 106.81206 -6.360905, 106.812055 -6.360582, 106.812065 -6.360218, 106.812293 -6.359295, 106.812593 -6.358644, 106.812436 -6.358406, 106.8121515 -6.3582051, 106.8123 -6.357823, 106.81244 -6.357407, 106.812612 -6.356842, 106.812719 -6.356544, 106.81274 -6.356384, 106.812864 -6.356148, 106.813019 -6.356021, 106.813287 -6.355797, 106.813781 -6.355286, 106.814076 -6.354751, 106.814277 -6.354393, 106.814403 -6.354027, 106.814553 -6.353814, 106.814736 -6.353526, 106.814993 -6.353302, 106.81516 -6.353024, 106.815358 -6.35279, 106.815509 -6.352588, 106.815675, -6.352331, 106.8153007, -6.3521138, 106.8151398 -6.3520137, 106.8149789 -6.3518005, 106.8147643 -6.3516939, 106.8144639 -6.3516245, 106.8141527 -6.3515392, 106.8135734 -6.351342, 106.813171 -6.3512034, 106.8123284 -6.3509219, 106.8122418 -6.3511298, 106.8118164 -6.3521534, 106.8116597 -6.3525047, 106.8111849 -6.3535692, 106.8102245 -6.3554942, 106.8093545 -6.3568947, 106.8085097 -6.3580518, 106.80795 -6.358832, 106.8077793 -6.3590429, 106.807668 -6.359441, 106.807499 -6.360346, 106.8072531 -6.3616378, 106.8071476 -6.3622599, 106.8070637 -6.3626798, 106.8070823 -6.3629367, 106.8071207 -6.3634531, 106.8078269 -6.363831, 106.809448 -6.364124, 106.810574 -6.364198, 106.81066 -6.362993, 106.811175 -6.36277, 106.812087 -6.361703, 106.812271 -6.361551))
I need to get rid of all odd-position commas and keep only the even-position commas, so that the geometry becomes the output. I tried doing a split(text, ",") and concatenating, but when the columns are blank it returns xxx,,,, which is not what I had in mind. Since some rows have more than 200 commas, I would need more than 200 columns. Is there a simpler way, such as using regex? Someone please help.
If the second number is always negative, this is as simple as replacing ", -" (comma, space, dash) with " -" (space, dash), so the minus sign is kept:
=REGEXREPLACE(B2,", -"," -")
If not,
=REGEXREPLACE(B2,"(-??\d+\.?\d*),(\s*-?\d+\.?\d*)","$1$2")
Capture group #1: (-??\d+\.?\d*)
-?? zero or one of literal dash, followed by
\d+ one or more digits, followed by
\.? zero or one of literal .
\d* zero or more digits
, literal comma (outside both groups, so it is dropped)
Capture group #2: (\s*-?\d+\.?\d*)
\s* zero or more space characters
-? zero or one of literal dash, followed by
\d+ one or more digits, followed by
\.? zero or one of literal .
\d* zero or more digits
Replace with the capture groups only: $1$2
try:
=INDEX(REGEXREPLACE(QUERY(FLATTEN(SPLIT(A1, ",")&IF(ISODD( SEQUENCE(1, COLUMNS(SPLIT(A1, ",")))),, ",")),,9^9), ",$", ))
for array:
=INDEX(IFERROR(BYROW(A1:A3, LAMBDA(x, REGEXREPLACE(QUERY(FLATTEN(SPLIT(x, ",")& IF(ISODD(SEQUENCE(1, COLUMNS(SPLIT(x, ",")))),, ",")),,9^9), ",$", )))))
An idea to match , and capture any non-commas with an optional comma after: =REGEXREPLACE(A1; ",([^,]*,?)"; "$1") Replace with $1 what was captured by the first group - See this demo at regex101
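The same capture-group idea can be tried outside of Sheets. Here is a minimal Python sketch of it, using a shortened polygon as input; the function name drop_odd_commas is made up for illustration.

```python
import re


def drop_odd_commas(text):
    """Remove the 1st, 3rd, 5th, ... comma and keep the even-position ones."""
    # Each match starts at an odd-position comma and runs up to and including
    # the next comma (if there is one). Only the part after the first comma
    # is captured, so putting group 1 back drops that first comma.
    return re.sub(r",([^,]*,?)", r"\1", text)


print(drop_odd_commas("POLYGON (( 106.812271, -6.361551, 106.812111, -6.361339))"))
# POLYGON (( 106.812271 -6.361551, 106.812111 -6.361339))
```

The space after each removed comma is left in place, which is what separates the two numbers of a pair in the result.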
How do I replace the nth occurrence of a special character, say, a pipe delimiter with another in Scala?
I'm new to Spark using Scala and I need to replace every nth occurrence of the delimiter with the newline character. So far, I have been successful at entering a new line after the pipe delimiter. I'm unable to replace the delimiter itself. My input string is
val txt = "January|February|March|April|May|June|July|August|September|October|November|December"
println(txt.replaceAll(".\\|", "$0\n"))
The above statement generates the following output.
January|
February|
March|
April|
May|
June|
July|
August|
September|
October|
November|
December
I referred to the suggestion at https://salesforce.stackexchange.com/questions/189923/adding-comma-separator-for-every-nth-character but when I enter the number in the curly braces, I only end up adding the newline after 2 characters after the delimiter. I'm expecting my output to be as given below.
January|February
March|April
May|June
July|August
September|October
November|December
How do I change my regular expression to get the desired output?
Update: My friend suggested I try the following statement
println(txt.replaceAll("(.*?\\|){2}", "$0\n"))
and this produced the following output
January|February|
March|April|
May|June|
July|August|
September|October|
November|December
Now I just need to get rid of the pipe symbol at the end of each line.
You want to move the 2nd bar | outside of the capture group.
txt.replaceAll("([^|]+\\|[^|]+)\\|", "$1\n")
//val res0: String =
// January|February
// March|April
// May|June
// July|August
// September|October
// November|December
Regex Explained (regex is not Scala)
( - start a capture group
[^|] - any character as long as it's not the bar | character
[^|]+ - 1 or more of those (any) non-bar chars
\\| - followed by a single bar char |
[^|]+ - followed by 1 or more of any non-bar chars
) - close the capture group
\\| - followed by a single bar char (not in capture group)
"$1\n" - replace the entire matching string with just the first $1 capture group ($0 is the entire matching string) followed by the newline char
UPDATE
For the general case of N repetitions, regex becomes a bit more cumbersome, at least if you're trying to do it with a single regex formula. The simplest thing to do (not the most efficient but simple to code) is to traverse the String twice.
val n = 5
txt.replaceAll(s"(\\w+\\|){$n}", "$0\n")
   .replaceAll("\\|\n", "\n")
//val res0: String =
// January|February|March|April|May
// June|July|August|September|October
// November|December
You could first split the string on '|' to get an array of strings, then group the elements in pairs and rejoin them to get the required output.
val txt = "January|February|March|April|May|June|July|August|September|October|November|December"
// Split on the literal pipe, take the fields two at a time, and rejoin them:
// "|" inside each pair, a newline between pairs. An odd trailing field is kept.
val finalout = txt.split("\\|").grouped(2).map(_.mkString("|")).mkString("\n")
println(finalout)
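The split-and-rejoin approach generalises to any n without a regex. Below is a minimal Python sketch of that idea; the function name join_every_nth and its parameters are made up for illustration.

```python
def join_every_nth(text, n, delim="|", newline="\n"):
    """Replace every nth occurrence of delim in text with newline."""
    fields = text.split(delim)
    # Rebuild rows of n fields each; the last row may be shorter.
    rows = [delim.join(fields[i:i + n]) for i in range(0, len(fields), n)]
    return newline.join(rows)


txt = ("January|February|March|April|May|June|July|August|"
       "September|October|November|December")
print(join_every_nth(txt, 2))
print(join_every_nth(txt, 5))
```

Because the delimiter is only re-inserted between fields of the same row, no trailing pipe needs to be cleaned up afterwards.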
Remove the text before the second comma (",") with a string replace pattern
How can we remove the text before the line that starts with the second comma (line 5 in the example)? How can I do that using regex?
example:
,
abc,xyz,ggg,nrmr
cde,jjj,kkkk,iiii,tem,posting
234,mm/dd/yy
,
454654,output2,sample
45646,output1,non-sample
16546,225.02
ABC,2.98
expected:
454654,output2,sample
45646,output1,non-sample
16546,225.02
ABC,2.98
It seems you may use
val s = """,
abc,xyz,ggg,nrmr
cde,jjj,kkkk,iiii,tem,posting
234,mm/dd/yy
,
454654,output2,sample
45646,output1,non-sample
16546,225.02
ABC,2.98"""
val res = s.replaceFirst("(?sm)\\A(.*?^,$){2}", "").trim()
println(res)
// =>
// 454654,output2,sample
// 45646,output1,non-sample
// 16546,225.02
// ABC,2.98
Pattern details:
(?sm) - s enables . to match any char in the string including newlines, and m makes ^ and $ match start/end of line respectively
\\A - the start of string
(.*?^,$){2} - 2 occurrences of:
.*? - any 0+ chars, as few as possible, up to the leftmost
^,$ - line that only contains ,
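The same pattern carries over to other engines that support inline flags and \A. This is a minimal Python sketch of it, assuming \n line endings; the function name drop_through_second_comma_line is made up for illustration.

```python
import re


def drop_through_second_comma_line(text):
    """Remove everything up to and including the second line that is only ','."""
    # (?s) lets "." cross newlines, (?m) anchors ^ and $ at line boundaries,
    # \A pins the match to the very start of the string.
    return re.sub(r"(?sm)\A(.*?^,$){2}", "", text, count=1).strip()


s = """,
abc,xyz,ggg,nrmr
cde,jjj,kkkk,iiii,tem,posting
234,mm/dd/yy
,
454654,output2,sample
45646,output1,non-sample
16546,225.02
ABC,2.98"""
print(drop_through_second_comma_line(s))
```

If the text has fewer than two comma-only lines, the pattern does not match and the input is returned with only the surrounding whitespace stripped.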
Remove optional whitespace when splitting with math operators while keeping them in the result
How to remove whitespace characters from the input string? I am using the following code:
Dim input As String = txtInput.Text
Dim symbol As String = "([-+*/])"
Dim substrings() As String = Regex.Split(input, symbol)
Dim cleaned As String = Regex.Replace(input, "\s", " ")
For Each match As String In substrings
lstOutput.Items.Add(match)
Next
Input: z + x
Output: z, + and x.
I want to get rid of the whitespace in the last item.
You may remove the redundant whitespace while splitting with
\s*([-+*/])\s*
Also, it is a good idea to trim the input with .Trim() before passing it to Regex.Split, so that leading and trailing whitespace does not end up in the first and last items.
Pattern details:
\s* - matches 0+ whitespaces (these will be discarded from the result as they are not captured)
([-+*/]) - Group 1 (captured texts will be output to the resulting array) capturing 1 char: -, +, * or /
\s* - matches 0+ whitespaces (these will be discarded from the result as they are not captured)
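The behaviour of a capturing group inside a split pattern is the same in most regex engines, so it can be checked quickly elsewhere. Here is a minimal Python sketch of the pattern above; the function name split_expression is made up for illustration.

```python
import re


def split_expression(expr):
    """Split on math operators, keeping the operators but not the spaces."""
    # The operator is captured, so re.split keeps it in the result; the
    # surrounding \s* is matched but not captured, so it is discarded.
    return re.split(r"\s*([-+*/])\s*", expr.strip())


print(split_expression(" z + x "))  # ['z', '+', 'x']
```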
Regex to extract value at fixed position index
I have the following string of characters:
73746174652C313A312C310D
                     | - extract the value at this position
I would like to extract the value 1 (the 1 near the end of the string) using regex. So basically a regex that acts as a charAt(index). I need this solution for a 3rd party application that only supports regular expressions. Note that the application cannot access capture groups and does not support negative lookbehinds.
In C#: (?<=^.{21})(.) in JS: /.(?=.{2}$)/
You could try: (?<=^.{21}). It won't work in JavaScript engines without lookbehind support, but perhaps it will work in your app. It means: a single character . preceded (?<= ... ) by the beginning of the string ^ plus 21 characters .{21}. So, in the end, it returns the 22nd character.
The 22nd character is in capture group 1. /^.{21}(.)/ But what system are you in that requires this instead of normal string processing?
Depends how you want to match it (x distance from the beginning or x distance from the end):
/(.).{2}$/ Third from the end (capturing group 1)
/^.{21}(.)/ 22nd character (capturing group 1)
//PHP
$str = '73746174652C313A312C310D';
$char = preg_replace('/^.*(.).{2}$/','$1',$str); //3rd from last (^.* consumes the rest, so only the captured char remains)
preg_match('/(.).{2}$/',$str,$chars); //3rd from last
$char = $chars[1];
preg_match('/^.{21}(.)/',$str,$chars); //22nd character
$char = $chars[1];
//JS
var str = '73746174652C313A312C310D';
var ch = str.replace(/^.*(.).{2}$/,'$1'); //3rd from last (^.* consumes the rest, so only the captured char remains)
var ch = str.match(/(.).{2}$/)[1]; //3rd from last
var ch = str.match(/^.{21}(.)/)[1]; //22nd character
If you have to use the "First match:" result of your tool, run it twice:
73746174652C313A312C310D - ^.{21}. = 73746174652C313A312C31
73746174652C313A312C31 - .$ = 1
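Since the question rules out capture groups, the lookaround variants are the ones where the whole match is the wanted character. This is a minimal Python sketch to check both, assuming an engine with fixed-width lookbehind and lookahead (whether the third-party application supports them is not known); the names nth_from_start and third_from_end are made up for illustration.

```python
import re

s = "73746174652C313A312C310D"


def nth_from_start(text):
    """Whole match is the 22nd character: 21 characters must precede it."""
    m = re.search(r"(?<=^.{21}).", text)
    return m.group(0) if m else None


def third_from_end(text):
    """Whole match is the character followed by exactly 2 more characters."""
    m = re.search(r".(?=.{2}$)", text)
    return m.group(0) if m else None


print(nth_from_start(s), third_from_end(s))  # 1 1
```

Both functions read group(0), the entire match, so no capture group is involved.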