Removing everything before first comma and after third comma [duplicate] - regex

This question already has answers here:
Regex - Remove everything before first comma and everything after second comma in line
(3 answers)
Closed 4 years ago.
I have the following text in Notepad++:
14, CANCELLATION,rigtt,14;
192, CERTIFICATE,LatL,192;
32, TARGET, LATP, 32
I want to remove everything before the first comma and after the third comma, so that the above becomes:
CANCELLATION,rigtt
CERTIFICATE,LatL
TARGET, LATP
What regular expression should I use to achieve this? I tried *., and that did not work.
Any help will be appreciated.

Regular expression:
^[^,]*,([^,]*,[^,]*).*$
Playground for the above: https://regex101.com/r/puCyO3/2
Explanation:
^ scan from the beginning of the string
[^,]*, until the 1st comma
([^,]*, capture up to and including the 2nd comma ...
[^,]*) ... and up to but not including the 3rd comma
.*$ ignoring everything until the end of the string
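As a rough illustration outside Notepad++, the same pattern can be tried in Python. This is only a sketch: the helper name keep_middle_fields is made up for the example, and it assumes the replacement is the first capture group (\1).

```python
import re

# Same pattern as above; group 1 holds the text between the 1st and 3rd comma.
PATTERN = re.compile(r"^[^,]*,([^,]*,[^,]*).*$")


def keep_middle_fields(line: str) -> str:
    """Replace the whole line with capture group 1 (hypothetical helper)."""
    return PATTERN.sub(r"\1", line)


lines = [
    "14, CANCELLATION,rigtt,14;",
    "192, CERTIFICATE,LatL,192;",
    "32, TARGET, LATP, 32",
]
for line in lines:
    # repr() makes the leading space that the pattern keeps visible.
    print(repr(keep_middle_fields(line)))
```

The space after the first comma is kept by this pattern; putting \s* right after the first comma in the pattern would be one way to drop it.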

I'm assuming all the lines have ; at the end.
I would do it in 2 steps.
1. Use Replace All (Ctrl+H)
Replace this [0-9],? with nothing
2. Use Replace All (Ctrl+H)
Replace this , *; with nothing (there is a space between , and *)

Related

How to filter out c-type comments with regex? [duplicate]

This question already has answers here:
Regex to match a C-style multiline comment
(8 answers)
Improving/Fixing a Regex for C style block comments
(5 answers)
Strip out C Style Multi-line Comments
(4 answers)
Closed 3 years ago.
I'm trying to filter out C-style comments in a line so that I'm only left with the words (or actual code).
This is what I have so far (demo):
regex:
\/\*[^\/]*[^\*]*\*\/
text:
/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/
My guess is that this expression might work:
\/\*(\/\*\*\/)?\s*([^\/*]+?)\s*(?:\/?\*?\*?\/|\*)
or we could modify our left and right boundaries if we had different inputs.
The expression is explained in this demo, if you are interested.
We can try doing a regex replacement on the following pattern:
/\*.*?\*/
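This matches an old-school C-style comment. It works by using a lazy dot .*? to match only the content within a single comment, up to the end of that comment. We can then replace each match with the empty string, effectively removing these comments from the input. Note that . does not match a newline by default, so a comment spanning several lines would additionally need RegexOptions.Singleline.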
Code:
Dim input As String = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"
Dim output As String = Regex.Replace(input, "/\*.*?\*/", "")
Console.WriteLine(input)
Console.WriteLine(output)
This prints the input unchanged, then the stripped string (shown here with whitespace collapsed):
one two three four five
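For comparison, here is a minimal Python sketch of the same lazy-match idea. The name strip_c_comments is hypothetical, and the re.DOTALL flag is an addition that is not in the answer above; it lets the dot cross line breaks so multi-line comments are removed too.

```python
import re

# Lazy match from "/*" to the nearest "*/".
# re.DOTALL is an assumption added here so "." also matches newlines.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_c_comments(text: str) -> str:
    """Remove every /* ... */ block from text (hypothetical helper)."""
    return COMMENT.sub("", text)


sample = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"
# split() drops the whitespace left where comments used to be.
print(strip_c_comments(sample).split())
```

Note that the whitespace around the removed comments stays in place, so the raw result contains some doubled spaces.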

How to replace the line breaks in a cell by explicit "\n"? [duplicate]

This question already has answers here:
find & replace commas with newline on Google Spreadsheet
(7 answers)
Closed 3 years ago.
How can I replace the line breaks in a cell with an explicit "\n" using a function?
The cells in question are filled by IMPORTRANGE.
So I tried this, but no luck:
=REGEXREPLACE(IMPORTRANGE("1LzbgZvRVf1s1nLqz8TkrxAxyJs1CVBuEbOEmmte60Wg","E3:E55"),"(\r\n)",char(10))
=REGEXREPLACE(A1,"\\n|\\r",CHAR(10))
| -- replace char A OR char B
\\ -- first slash escapes special symbol -- slash.
What you have written swaps a line break with char(10), the whitespace character that inserts a line break. So, if your input in a cell contains a line break (entered by pressing Ctrl+Enter within a single cell, for example), then your formula is replacing the "invisible" line-break character with another invisible line break.
If you remove "\r" and swap char(10) for X, you'll see it works and replaces the whitespace line-break with "X".
I.e.
=REGEXREPLACE(A1,"\n","X")
If you want to insert the line-break character, for example, for use in an HTML file, then you could use
=regexreplace(regexreplace(B2,"\n"," #n "),"\#","\")
Please clarify your question if I have misunderstood.
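Outside of Sheets, the same idea (turning real line breaks into the two visible characters \ and n) might look like this in Python. This is a sketch only; escape_line_breaks is a made-up name, and handling \r\n and a lone \r is an assumption about what the imported cells may contain.

```python
import re


def escape_line_breaks(text: str) -> str:
    """Replace each real line break with a literal backslash followed by n."""
    # A function replacement sidesteps the backslash rules of replacement strings.
    return re.sub(r"\r\n|\r|\n", lambda m: "\\n", text)


cell = "first line\nsecond line\r\nthird line"
print(escape_line_breaks(cell))
```

Listing \r\n first in the alternation means a Windows-style line ending becomes a single \n rather than two.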

Regexp for string starting with a + and having numbers only [duplicate]

This question already has answers here:
Match exact string
(3 answers)
Closed 4 years ago.
I have the following regex for a string that starts with a + and contains digits only:
PatternArticleNumber = $"^(\\+)[0-9]*";
However this allows strings like :
+454545454+4545454
This should not be allowed. Only the 1st character should be a +, others numbers only.
Any idea what may be wrong with my regex?
You can probably work around this problem by just adding an ending anchor to your regex, i.e. use this:
PatternArticleNumber = $"^(\\+)[0-9]*$";
Demo
The problem with your current pattern is that the end is not anchored. The string +454545454+4545454 as a whole does not match, but the engine can match the leading portion, before the second +, and report that as a match.
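The difference between the open-ended and the anchored pattern can be seen in a small Python sketch. The names OPEN_ENDED and ANCHORED are illustrative only, and Python's re module stands in for the .NET engine here.

```python
import re

OPEN_ENDED = re.compile(r"^\+[0-9]*")   # no end anchor
ANCHORED = re.compile(r"^\+[0-9]*$")    # end anchor added

value = "+454545454+4545454"

# Without "$" the engine reports a match on the leading portion only.
print(OPEN_ENDED.search(value).group())

# With "$" the second "+" prevents any match at all.
print(ANCHORED.search(value))
```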

Regex find string in the middle of two strings [duplicate]

This question already has answers here:
What special characters must be escaped in regular expressions?
(13 answers)
Closed 5 years ago.
I want to extract the time from the following line, i.e. the string
2017-07-07 08:30:00.065156
in
[ID] = 0,[Time] = 2017-07-07 08:30:00.065156,[access]
I tried this
(?<=[Time] = )(.*?)(?=,)
I want to get the string between the [Time] tag and the first comma, but this doesn't work.
[Time] inside a regex means a T, an i, an m, or an e, unless you escape your square brackets.
You can drop the reluctant quantifier if you use [^,]* in place of .*:
(?<=\[Time\] = )([^,]*)(?=,)
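A quick way to check the escaped pattern is a short Python sketch (Python's re is used here as a stand-in for whichever engine the question targets; the variable names are illustrative):

```python
import re

line = "[ID] = 0,[Time] = 2017-07-07 08:30:00.065156,[access]"

# Unescaped: [Time] is a character class, so the lookbehind never fits.
unescaped = re.search(r"(?<=[Time] = )(.*?)(?=,)", line)

# Escaped: \[Time\] matches the literal tag.
escaped = re.search(r"(?<=\[Time\] = )([^,]*)(?=,)", line)

print(unescaped)
print(escaped.group(1) if escaped else None)
```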

Extract numbers between brackets within a string [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Extract info inside all parenthesis in R (regex)
I imported data from Excel, and one cell consists of these long strings that contain numbers and letters. Is there a way to extract only the numbers from each string and store them in a new variable? Unfortunately, some of the entries have two sets of brackets, and I only want the second one. Could I use grep for that?
The strings look more or less like this; their length varies, however:
"East Kootenay C (5901035) RDA 01011"
or like this:
"Thompson-Nicola J (Copper Desert Country) (5933039) RDA 02020"
All I want from this is 5901035 and 5933039
Any hints and help would be greatly appreciated.
There are many possible regular expressions to do this. Here is one:
x=c("East Kootenay C (5901035) RDA 01011","Thompson-Nicola J (Copper Desert Country) (5933039) RDA 02020")
> gsub('.+\\(([0-9]+)\\).+?$', '\\1', x)
[1] "5901035" "5933039"
Let's break down the syntax of that first expression '.+\\(([0-9]+)\\).+?$'
.+ one or more of anything
\\( parentheses are special characters in a regular expression, so if I want to represent the actual thing ( I need to escape it with a \. I have to escape it again for R (hence the two \s).
([0-9]+) I mentioned special characters; here I use two. The first is the parentheses, which indicate a group I want to keep. The second, [ and ], surround groups of things. See ?regex for more information.
.+?$ consumes the rest of the string up to its end, so the whole string is replaced. As noted in the comments, I want the LAST set of numbers in parens; it is the greedy .+ at the start that ensures this.
I could also use * instead of +, which would mean zero or more rather than one or more, in case your paren string comes at the beginning or end of a string.
The second piece of the gsub is what I am replacing the first portion with. I used: \\1. This says use group 1 (the stuff inside the ( ) from above). I need to escape it twice again, once for the regex and once for R.
Clear as mud to be sure! Enjoy your data munging project!
Here is a gsubfn solution:
library(gsubfn)
strapplyc(x, "[(](\\d+)[)]", simplify = TRUE)
[(] matches an open paren, (\\d+) matches a string of digits creating a back-reference owing to the parens around it and finally [)] matches a close paren. The back-reference is returned.
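The same extraction can be sketched in Python for comparison. This is not a translation of either R answer: last_paren_number is a hypothetical helper that collects every digits-only group in parentheses and keeps the last one.

```python
import re
from typing import Optional


def last_paren_number(text: str) -> Optional[str]:
    """Return the last run of digits enclosed in parentheses, or None."""
    numbers = re.findall(r"\((\d+)\)", text)
    return numbers[-1] if numbers else None


x = [
    "East Kootenay C (5901035) RDA 01011",
    "Thompson-Nicola J (Copper Desert Country) (5933039) RDA 02020",
]
print([last_paren_number(s) for s in x])
```

Because the pattern only accepts digits between the parentheses, a group such as (Copper Desert Country) is skipped without any special handling.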