Replacing surroundings of text in brackets when it occurs multiple times in a string [duplicate] - regex

I have a string containing LaTeX code, for example \emph{some words here}, and I want to convert it to Markdown syntax, for example *some words here*. I tried:
s <- "some text in \\emph{italics} and some more ..."
pattern <- "\\\\emph\\{(.*)\\}"
gsub(pattern,"*\\1*", s)
> "some text in *italics* and some more ..."
However, I do not succeed at handling multiple occurrences in one string:
s <- "some text in \\emph{italics} and some \\emph{more italics} and ..."
gsub(pattern,"*\\1*", s)
> "some text in *italics} and some \\emph{more italics* and ..."
I guess I need a non-greedy version which handles multiple occurrences, but I am not sure how to do it. Any ideas?

Make the quantifier lazy by adding ? after .*, so that each match stops at the first closing brace instead of the last one:
pattern <- "\\\\emph\\{(.*?)\\}"
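For comparison, here is a minimal sketch of the same greedy versus lazy behaviour in Python rather than R. The variable names are only illustrative, and the negated character class [^}]* is an alternative that is not part of the answer above; it assumes the emphasised text never contains a closing brace.

```python
import re

s = r"some text in \emph{italics} and some \emph{more italics} and ..."

# Greedy: .* runs to the LAST closing brace, so both spans merge into one match.
greedy = re.sub(r"\\emph\{(.*)\}", r"*\1*", s)

# Lazy: .*? stops at the FIRST closing brace after each \emph{.
lazy = re.sub(r"\\emph\{(.*?)\}", r"*\1*", s)

# Alternative (assumption: no nested braces): a negated class cannot cross "}".
negated = re.sub(r"\\emph\{([^}]*)\}", r"*\1*", s)

print(greedy)
print(lazy)
print(negated)
```

Neither the lazy nor the negated-class version handles nested braces such as \emph{a {b} c}; that would need a real parser.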


How can I remove a certain pattern from a string? [duplicate]

I have this string like "682_2, 682_3, 682_4". (682 is a random number)
How can I get the string "2, 3, 4" using regex and Ruby?
You can do this in ruby
input="682_2, 682_3, 682_4"
output = input.gsub(/\d+_/,"")
puts output
A simple regex could be
/_([0-9]+)$/ and in the match group of the result you will have 2 for 682_2 and 3 for 682_3
Ruby code snippet would be "64532_2".match(/_([0-9]+)/).captures[0]
You can also use scan, which returns an array containing the matches, with a pattern such as /(?<=_)\d/.
The lookbehind (?<=_) requires an underscore immediately before the match without including it in the result, so only the digit matched by \d is returned. If the suffix can have more than one digit, as in 682_13 or 682_33, use \d+ instead.
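The two approaches above (delete the prefix, or collect the suffixes with a lookbehind) can be sketched in Python as well. This is only a rough equivalent of the Ruby gsub and scan calls, with illustrative variable names.

```python
import re

text = "682_2, 682_3, 682_4"

# Approach 1: remove every "digits followed by underscore" prefix.
stripped = re.sub(r"\d+_", "", text)

# Approach 2: collect the digits that come right after an underscore.
# (?<=_) checks for the underscore without making it part of the match.
suffixes = re.findall(r"(?<=_)\d+", text)

print(stripped)
print(suffixes)
print(", ".join(suffixes))
```

The first approach keeps the original separators untouched; the second gives you a list you can join or convert to integers.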

Python regex to parse '#####' text in description field [duplicate]

Here's the line I'm trying to parse:
#abc #ghi j#klm #nop.qrs #tuv
And here's the regex I've gotten so far:
#[A-Za-z]+[^0-9. ]+\b | #[A-Za-z]+[^0-9. ]
My goal is to get ['#abc', '#ghi', '#tuv'], but no matter what I do, I can't get 'j#klm' to not match. Any help is much appreciated.
Try using re.findall with the following regex pattern:
import re
inp = "#abc #ghi j#klm #nop.qrs #tuv"
matches = re.findall(r'(?:(?<=^)|(?<=\s))#[A-Za-z]+(?=\s|$)', inp)
print(matches)
This prints:
['#abc', '#ghi', '#tuv']
The regex calls for an explanation. The leading lookbehind (?:(?<=^)|(?<=\s)) asserts that what precedes the # symbol is either a space or the start of the string. We can't use a word boundary here because # is not a word character. We use a similar lookahead (?=\s|$) at the end of the pattern to rule out matching things like #nop.qrs. Again, a word boundary alone would not be sufficient.
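To see why a word boundary does not help here, and as a slightly shorter way to write the same whitespace checks, here is a small sketch. The (?<!\S) and (?!\S) form is an alternative spelling, not the pattern from the answer above: it asserts "no non-whitespace character" before and after, which also covers the start and end of the string.

```python
import re

inp = "#abc #ghi j#klm #nop.qrs #tuv"

# \b needs a word character on one side, so before "#" it only fires
# between "j" and "#", which is exactly the match we wanted to avoid.
with_boundary = re.findall(r"\b#[A-Za-z]+", inp)

# (?<!\S): nothing but whitespace (or the string start) before the "#".
# (?!\S): nothing but whitespace (or the string end) after the letters.
tags = re.findall(r"(?<!\S)#[A-Za-z]+(?!\S)", inp)

print(with_boundary)
print(tags)
```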
Just add a start-of-line anchor at the beginning:
^#[A-Za-z]+[^0-9. ]+\b | #[A-Za-z]+[^0-9. ]
It should work!

How to filter out c-type comments with regex? [duplicate]

I'm trying to filter out "c-style" comments in a line so I'm only left with the words (or actual code).
This is what I have so far: demo
/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/
My guess is that this expression would likely work;
the left and right boundaries could be adjusted if the inputs were different.
The expression is explained in this demo, if you are interested.
We can try doing a regex replacement on the following pattern:
/\*.*?\*/
This matches any old-school C-style comment. It uses a lazy dot .*? so that each match stops at the first */ and covers only a single comment. Replacing each match with the empty string then removes the comments from the input.
Dim input As String = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"
Dim output As String = Regex.Replace(input, "/\*.*?\*/", "")
This prints:
one two three four five
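The same replacement can be sketched in Python. The whitespace clean-up step is an addition for readability, not part of the answer above: removing the comments leaves the surrounding spaces behind, so the raw result contains doubled spaces.

```python
import re

line = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"

# Lazy .*? ends each match at the first "*/" after an opening "/*".
without_comments = re.sub(r"/\*.*?\*/", "", line)

# Optional: collapse the leftover runs of whitespace into single spaces.
words = " ".join(without_comments.split())

print(repr(without_comments))
print(words)
```

For comments that span several lines, the dot would also need to match newlines, for example by passing flags=re.DOTALL to re.sub. This simple pattern does not know about string literals, so a "/*" inside a quoted string would still be treated as a comment start.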

Add space between two letters in a string in R [duplicate]

Suppose I have a string like
s = "PleaseAddSpacesBetweenTheseWords"
How do I use gsub in R to add a space between the words so that I get
"Please Add Spaces Between These Words"
I should do something like
gsub("[a-z][A-Z]", ???, s)
What should replace the ??? placeholder? Also, I find the regular expression documentation for R confusing, so a reference or write-up on regular expressions in R would be much appreciated.
You just need to capture the matches then use the \1 syntax to refer to the captured matches. For example
s = "PleaseAddSpacesBetweenTheseWords"
gsub("([a-z])([A-Z])", "\\1 \\2", s)
# [1] "Please Add Spaces Between These Words"
Of course, this just puts a space between each lower-case/upper-case letter pair. It doesn't know what a real "word" is.
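As a cross-check of the capture-group idea, here is a short Python sketch. The second variant, which uses lookarounds instead of capture groups, is an alternative that is not in the answer above: it matches the empty position between the two letters, so nothing has to be put back.

```python
import re

s = "PleaseAddSpacesBetweenTheseWords"

# Capture the lower-case and upper-case letters, then re-emit them with a space.
spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", s)

# Alternative: match the zero-width gap between them and insert a space there.
spaced_lookaround = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", s)

print(spaced)
print(spaced_lookaround)
```

Both variants share the limitation mentioned above: a run of capitals such as "HTMLParser" is not split, because there is no lower-case letter before the "P".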

Escaping backslash using sub [duplicate]

I have a string template, read from a file, with the content:
template <- "\\begin{tabular}\n[results]\n\\end{tabular}"
I want to replace [results] with some text I have generated in my code, to compose a LaTeX table, but when I run:
sub("\\[results\\]","text... \\hline text...",template)
I have the following output:
"\\begin{tabular}\ntext... hline text...\n\\end{tabular}"
The backslash is not escaped and I don't understand why, because I'm using \\ for this purpose.
I am using R-3.0.2.
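In the replacement string, the regex engine treats a backslash as an escape character (it introduces backreferences such as \\1), so it consumes the \\ in front of hline. You need to add two more backslashes: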
sub("\\[results\\]","text... \\\\hline text...",template)
[1] "\\begin{tabular}\ntext... \\hline text...\n\\end{tabular}"
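The same pitfall exists in Python, with one difference worth knowing: since Python 3.7, an unknown escape such as \h in the replacement string of re.sub raises an error instead of silently dropping the backslash. The sketch below shows three ways to insert the text literally; the variable names are illustrative.

```python
import re

template = "\\begin{tabular}\n[results]\n\\end{tabular}"
replacement = "text... \\hline text..."  # contains ONE literal backslash

# Option 1: double every backslash so the regex engine emits it literally.
out_escaped = re.sub(r"\[results\]", replacement.replace("\\", "\\\\"), template)

# Option 2: pass a function; its return value is inserted without
# any escape processing.
out_function = re.sub(r"\[results\]", lambda m: replacement, template)

# Option 3: no regex at all, since the placeholder is a fixed string.
out_plain = template.replace("[results]", replacement)

print(out_escaped)
```

When the placeholder is a fixed string, the third option is usually the simplest. In R, passing fixed = TRUE to sub serves a similar purpose.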