How to filter out c-type comments with regex? [duplicate]

I'm trying to filter out "c-style" comments in a line so I'm only left with the words (or actual code).
This is what I have so far:
regex:
\/\*[^\/]*[^\*]*\*\/
text:
/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/

My guess is that this expression might work:
\/\*(\/\*\*\/)?\s*([^\/*]+?)\s*(?:\/?\*?\*?\/|\*)
Alternatively, we could adjust the left and right boundaries if the inputs were different.

We can try doing a regex replacement on the following pattern:
/\*.*?\*/
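This matches any old-school C-style comment that sits on a single line. The lazy dot .*? consumes as little as possible, so each match ends at the first */ after the opening /* instead of running on to the last */ in the input. Replacing every match with the empty string removes the comments. Note that . does not match a newline by default, so comments spanning several lines would need the RegexOptions.Singleline option.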
Code:
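' Requires: Imports System.Text.RegularExpressions
Dim input As String = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"
' Replace every /* ... */ block with the empty string
Dim output As String = Regex.Replace(input, "/\*.*?\*/", "")
Console.WriteLine(input)
Console.WriteLine(output)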
This prints the original input, followed by the stripped text (shown here with extra whitespace collapsed):
one two three four five
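For comparison, here is a minimal Python sketch of the same lazy-match replacement. The function names `strip_c_comments` and `strip_and_tidy` are made up for illustration, and the `re.DOTALL` flag is an addition so that comments spanning several lines are also removed. It assumes comment markers never appear inside string literals, which a plain regex cannot tell apart.

```python
import re

# Lazy match from "/*" to the nearest "*/"; DOTALL lets "." cross newlines.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_c_comments(text):
    """Remove every /* ... */ block from text (hypothetical helper)."""
    return COMMENT.sub("", text)


def strip_and_tidy(text):
    """Remove comments, then collapse the leftover runs of whitespace."""
    return " ".join(strip_c_comments(text).split())


sample = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"
print(repr(strip_c_comments(sample)))  # raw result, spacing left as-is
print(strip_and_tidy(sample))          # one two three four five
```

The raw result keeps the spaces that surrounded each comment, which is why the second helper normalizes whitespace afterwards.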

Related

Regex express begin & end specific words[No duplicate] [duplicate]

I'm trying to write a regex that recognizes words that begin and end in "t".
I think the code below is correct:
var re = /^t+t*t$/
But it returns false for each of these words, e.g.:
re.test('triplet')
re.test('thought')
re.test('that')
Why doesn't my regex match the strings above?
And what is the proper regex?
Your regex is wrong: between the anchors, ^t+t*t$ allows nothing but the letter t, so it only matches strings such as "tt" or "ttt".
A naive approach could be to check if the entire word starts with t, has any number of any character and then ends with t:
var re = /^t.*t$/
Of course, you could also limit the "middle" characters to lowercase letters:
var re = /^t[a-z]*t$/
However, neither of these approaches matches a word that is a single "t" character. If this is a valid use case, you'll have to handle it explicitly, e.g.:
var re = /^(t[a-z]*t|t)$/
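As a quick check of that last pattern, here is a small Python sketch. The helper name `begins_and_ends_with_t` is invented for this example, and `fullmatch` stands in for the explicit ^ and $ anchors.

```python
import re

# Either "t", lowercase letters, "t" ... or the single letter "t".
T_WORD = re.compile(r"t[a-z]*t|t")


def begins_and_ends_with_t(word):
    """Return True if word starts and ends with "t" (hypothetical helper)."""
    return T_WORD.fullmatch(word) is not None


for word in ["triplet", "thought", "that", "t", "tea"]:
    print(word, begins_and_ends_with_t(word))
```

Only the last word, "tea", should be rejected, since it does not end in "t".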

How can I remove a certain pattern from a string? [duplicate]

I have a string like "682_2, 682_3, 682_4" (682 is a random number).
How can I get the string "2, 3, 4" using regex and Ruby?
You can do this in Ruby:
input="682_2, 682_3, 682_4"
output = input.gsub(/\d+_/,"")
puts output
A simple regex could be /_([0-9]+)$/. The first capture group of the result holds 2 for 682_2 and 3 for 682_3.
A Ruby snippet would be "64532_2".match(/_([0-9]+)/).captures[0]
You can use scan, which returns an array containing the matches:
string_code.scan(/(?<=_)\d/)
(?<=_) is a lookbehind: it requires an underscore immediately before the match without including it in the result, so only the \d is returned. If the number after the underscore can have more than one digit, as in 682_13,682_33, then \d+ is necessary.
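The two approaches above (deleting the prefix versus collecting what follows the underscore) can be sketched side by side in Python. The function names `drop_prefixes` and `suffixes` are made up for this example, and the lookbehind version uses \d+ so that multi-digit suffixes are handled.

```python
import re

codes = "682_2, 682_3, 682_4"


def drop_prefixes(text):
    """Delete every run of digits followed by an underscore (like gsub)."""
    return re.sub(r"\d+_", "", text)


def suffixes(text):
    """Collect the digits that directly follow an underscore (like scan)."""
    return re.findall(r"(?<=_)\d+", text)


print(drop_prefixes(codes))  # 2, 3, 4
print(suffixes(codes))       # ['2', '3', '4']
```

The first helper returns a string with the original separators intact, while the second returns a list that would still need joining.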

RegEx for Dutch ING bankstatement [duplicate]

Is there anyone who can help me to get the marked pieces out of this file (see image below) with a regular expression? As you can see, it's difficult because the length is not always the same and the part before my goal is sometimes broken down and sometimes not.
Thank you in advance.
Text:
:61:200106D48,66NDDTEREF//00060100142533
/TRCD/01028/
:86:/EREF/SLDD-0705870-5658387529//MARF/11514814-001//CSID/NL59ZZZ390
373820000//CNTP/NL96ABNA0123456789/ABCANL2A/XXXXXXX123///REMI/UST
D//N00814760/
:61:200106D1840,55NDDTEREF//00060100142534
/TRCD/01028/
:86:/EREF/SLDD-0705869-5658387528//MARF/11514814-001//CSID/NL59ZZZ390
373820000//CNTP/NL96ABNA0123456789/ABCANL2A/XXX123XXXX///REMI/UST
D//N00814759/
:61:200106C236,31NTRFEREF//00060100142535
/TRCD/00100/
:86:/EREF/05881000010520//CNTP/NL19INGB0123456789/ABCBNL2A/XX123XXXX//
/REMI/USTD//KLM REF 1000000022/
The length is not always the same but it does not really matter in your case. You can check for a particular pattern at the end of a string.
(?<=\/\/)([\u2022a-zA-Z0-9]+)(?=\/$)
This regex looks for a run of characters made up of bullets (•), digits, and letters (uppercase and lowercase) that follows two forward slashes (//) and is followed by a slash (/) and the end of the string ($).
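To see what that pattern picks up, here is a Python sketch run against a few lines from the question. The `re.MULTILINE` flag is an assumption on my part, so that $ matches at the end of every line rather than only at the end of the whole text, and `find_references` is a made-up name.

```python
import re

# The answer's pattern; MULTILINE makes "$" match at the end of each line.
REFERENCE = re.compile(r"(?<=\/\/)([\u2022a-zA-Z0-9]+)(?=\/$)", re.MULTILINE)

statement = """\
:61:200106D48,66NDDTEREF//00060100142533
/TRCD/01028/
:86:/EREF/SLDD-0705870-5658387529//MARF/11514814-001//CSID/NL59ZZZ390
373820000//CNTP/NL96ABNA0123456789/ABCANL2A/XXXXXXX123///REMI/UST
D//N00814760/
:86:/EREF/05881000010520//CNTP/NL19INGB0123456789/ABCBNL2A/XX123XXXX//
/REMI/USTD//KLM REF 1000000022/
"""


def find_references(text):
    """Return the trailing //.../ tokens found at line ends (hypothetical)."""
    return REFERENCE.findall(text)


print(find_references(statement))  # ['N00814760']
```

Note that the last line, which ends in "KLM REF 1000000022/", is not matched, because the character class does not include spaces. If such references are wanted as well, the class would have to be widened.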

Regex number between slash [duplicate]

I have a lot of lines with markers like
/* 1 */
/* 2 */
....
/* 1000 */
I want to replace them with a comma. I came up with a simple regex to use in Notepad++:
\/(.*?)\/
It works fine, but some lines contain text like this, which matches the regex when it should not:
de produtos / trazendo inputs qualitativos / estratégicos para a marca
------------^-------------------------------^----------------------
I am trying to use /* instead of just / but with no success!
Any suggestion?
To be able to match /* ... */ blocks, you may use this regex:
\/\*.*?\*\/
Since * is a metacharacter in regex, it needs to be escaped as well.
A lazy quantifier .*? is also required, to avoid matching across blocks.
The following should do the job:
\/\*[\d\s]+\*\/
It matches the opening /*, then one or more digits or whitespace characters, and then the closing */.
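Outside Notepad++, the same replacement can be tried with a short Python sketch. The name `markers_to_commas` is invented here, and the stricter digits-and-whitespace pattern is used so that ordinary slashes in prose are left alone.

```python
import re

# "/*", then only digits or whitespace, then "*/".
MARKER = re.compile(r"/\*[\d\s]+\*/")


def markers_to_commas(text):
    """Replace each numeric /* n */ marker with a comma (hypothetical)."""
    return MARKER.sub(",", text)


prose = "de produtos / trazendo inputs qualitativos / estratégicos para a marca"
print(markers_to_commas("/* 1 */\n/* 2 */\n/* 1000 */"))
print(markers_to_commas(prose) == prose)  # True: the prose line is untouched
```

The first call should leave one comma per line, and the second shows that a line with plain slashes passes through unchanged.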

Regexp for string starting with a + and having numbers only [duplicate]

I have the following regex for a string which starts by a + and having numbers only:
PatternArticleNumber = $"^(\\+)[0-9]*";
However this allows strings like :
+454545454+4545454
This should not be allowed. Only the 1st character should be a +, others numbers only.
Any idea what may be wrong with my regex?
You can probably fix this problem just by adding an end anchor to your regex, i.e. use this:
PatternArticleNumber = $"^(\\+)[0-9]*$";
The problem with your current pattern is that its end is open. The entire string +454545454+4545454 does not match, but the engine matches the leading portion +454545454, before the second +, and so reports a match.
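The difference between the open-ended and the anchored pattern can be seen in a small Python sketch. The names `OPEN_ENDED` and `ANCHORED` are chosen just for this illustration, and the capture group from the original pattern is dropped because it is not needed here.

```python
import re

OPEN_ENDED = re.compile(r"^\+[0-9]*")   # no end anchor
ANCHORED = re.compile(r"^\+[0-9]*$")    # end anchor added

bad = "+454545454+4545454"

# Without "$" the engine is satisfied by the leading portion alone.
print(OPEN_ENDED.search(bad).group())   # +454545454

# With "$" the second "+" makes the whole string fail.
print(ANCHORED.search(bad))             # None
print(ANCHORED.search("+4545") is not None)  # True
```

One thing to keep in mind is that [0-9]* also accepts a lone "+" with no digits at all. If at least one digit is required, [0-9]+ would be the quantifier to use.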