remove characters including | but only up to (not including) < [duplicate] - regex

This question already has answers here:
Carets in Regular Expressions
(2 answers)
Closed 3 years ago.
In the pattern
<Blast>uce-506_drosophila_albomicans |uce506</BlastOutput_query-def>
I'm trying to remove everything from | up to (but not including) <.
I tried this, but it doesn't work:
sed 's/^|[^<]*//g' dataset2.fasta.xml >dataset2_2.fasta.xml

A leading caret ^ means "start at the beginning of the line," so ^| only matches a | that is the first character of the line. Remove it:
sed 's/|[^<]*//g'
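For comparison, here is a minimal Python sketch of the same substitution, using the sample line from the question. Note one difference that is easy to trip over: in sed's basic regular expressions | is a literal character, whereas in Python's re module it is the alternation operator and has to be escaped. The variable names are illustrative only.

```python
import re

line = "<Blast>uce-506_drosophila_albomicans |uce506</BlastOutput_query-def>"

# Anchored version: ^ requires the | to be the first character of the line,
# so nothing matches and the line comes back unchanged.
anchored = re.sub(r"^\|[^<]*", "", line)

# Unanchored version: removes the | and everything after it up to,
# but not including, the next <.
unanchored = re.sub(r"\|[^<]*", "", line)

print(anchored)
print(unanchored)  # <Blast>uce-506_drosophila_albomicans </BlastOutput_query-def>
```

The space that precedes the | in the input is not part of the match, so it stays in the output.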

Related

Regex for substring match [duplicate]

This question already has answers here:
Regex for string contains?
(5 answers)
Closed 7 months ago.
I have a list of strings:
03000_textbox (57447),
03990_textbox (57499),
03000_textnewbox (57447)
I want to use a regex to capture the 1st and 2nd elements of the list,
i.e. anything that contains the substring textbox:
03000_textbox (57447)
03990_textbox (57499)
Here is the regex you're looking for:
[0-9]+_textbox \([0-9]+\)
Live sample: https://regex101.com/r/2oiwcF/1
Don't forget to set the global (g) flag so that you get every match and can loop over them.
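As a rough illustration of the same pattern outside regex101, here is a small Python sketch. Python has no g flag; re.findall (or finditer) returns every match, which plays the same role. The text variable simply reproduces the three strings from the question.

```python
import re

text = "03000_textbox (57447),\n03990_textbox (57499),\n03000_textnewbox (57447)"

# Digits, then the literal "_textbox ", then digits inside escaped parentheses.
pattern = re.compile(r"[0-9]+_textbox \([0-9]+\)")

# findall returns all non-overlapping matches, like the g flag elsewhere.
matches = pattern.findall(text)

for m in matches:
    print(m)
```

The third string is skipped because "_textnewbox" does not contain the literal "_textbox".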

Extract all chars between parenthesis [duplicate]

This question already has answers here:
Regular Expression to get a string between parentheses in Javascript
(10 answers)
Closed 2 years ago.
I used
let regExp = /\(([^)]+)\)/;
to extract
(test(()))
from
aaaaa (test(())) bbbb
but I get only this
(test(()
How can I fix my regex?
Don't use a negated character set, since parentheses (both ( and )) may appear inside the match you want. Repeat greedily with .* instead, so that you match as much as possible; the engine then backtracks until it finds the last ) in the string:
console.log(
  'aaaaa (test(())) bbbb'
    .match(/\(.*\)/)[0]
);
Keep in mind that this (and JS regex solutions in general) cannot guarantee balanced parentheses, at least not without additional post-processing/validation.
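One possible form of that post-processing is a small depth-counting scanner. The sketch below is in Python rather than JavaScript, and first_balanced_group is a hypothetical helper name, not something from the answer; it only shows how the greedy regex and a balance-aware scan can disagree.

```python
import re


def first_balanced_group(text):
    """Return the first balanced (...) group in text, or None if there is none."""
    start = text.find("(")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    return text[start:i + 1]
        # The group opened at `start` never closed; try the next "(".
        start = text.find("(", start + 1)
    return None


sample = "aaaaa (test(())) bbbb"
greedy = re.search(r"\(.*\)", sample).group(0)
balanced = first_balanced_group(sample)
print(greedy, balanced)  # both are (test(()))

# With two separate groups the greedy regex spans both of them.
two_groups = "a (b) c (d) e"
greedy_two = re.search(r"\(.*\)", two_groups).group(0)   # (b) c (d)
balanced_two = first_balanced_group(two_groups)           # (b)
print(greedy_two, balanced_two)
```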

notepad++ REGEX how to find matched string and keep it (delete all surrounding text) [duplicate]

This question already has answers here:
Delete all content but keeping matched
(1 answer)
Can't use ^ to say "all but"
(4 answers)
Closed 2 years ago.
I am trying to use a regex to find every string that starts with AAA and delete everything else, keeping the rest of each matched string (for example the BBB in AAABBB).
Example:
eenqowtnorynwny55w4oynw AAABBB ewtenoqtn3oyn AAACCC 4et3o4ny3ny AAADDD 3to3n4yon45yo
wetn3o4tn3o5yn AAAZZZ wn3otn3on AAANNN
Expected output:
AAABBB
AAACCC
AAADDD
AAAZZZ
AAANNN
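The page does not include an answer for Notepad++ itself, so the following is only a sketch of the underlying idea in Python: rather than deleting the surroundings, collect the matches. It assumes that each wanted token starts with AAA and runs to the next whitespace character, which fits the example but is not stated in the question.

```python
import re

text = (
    "eenqowtnorynwny55w4oynw AAABBB ewtenoqtn3oyn AAACCC 4et3o4ny3ny AAADDD 3to3n4yon45yo\n"
    "wetn3o4tn3o5yn AAAZZZ wn3otn3on AAANNN"
)

# Assumption: a token is "AAA" followed by any run of non-whitespace characters.
tokens = re.findall(r"AAA\S*", text)

# One token per line, as in the expected output.
print("\n".join(tokens))
```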

Regex find string in the middle of two strings [duplicate]

This question already has answers here:
What special characters must be escaped in regular expressions?
(13 answers)
Closed 5 years ago.
I want to extract the time from the following line, i.e. I want to get the string
2017-07-07 08:30:00.065156
in
[ID] = 0,[Time] = 2017-07-07 08:30:00.065156,[access]
I tried this
(?<=[Time] = )(.*?)(?=,)
I want to get the string between the [Time] tag and the first comma, but this doesn't work.
[Time] inside a regex is a character class: it matches a single T, i, m, or e, unless you escape the square brackets.
You can drop the reluctant quantifier if you use [^,]* in place of .*:
(?<=\[Time\] = )([^,]*)(?=,)
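To see the difference, here is a short Python sketch that runs both patterns against the line from the question. The question does not say which regex flavor is in use, so Python's re module is an assumption here; it accepts both lookbehinds because each has a fixed width.

```python
import re

line = "[ID] = 0,[Time] = 2017-07-07 08:30:00.065156,[access]"

# Unescaped: [Time] is a character class, so the lookbehind asks for one of
# T, i, m, e followed by " = ". In this line " = " is always preceded by "]".
unescaped = re.search(r"(?<=[Time] = )(.*?)(?=,)", line)

# Escaped: the lookbehind now requires the literal text "[Time] = ".
escaped = re.search(r"(?<=\[Time\] = )([^,]*)(?=,)", line)

print(unescaped)         # None
print(escaped.group(1))  # 2017-07-07 08:30:00.065156
```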

Escaping backslash using sub [duplicate]

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 8 years ago.
I have a string template, read from a file, with the content:
template <- "\\begin{tabular}\n[results]\n\\end{tabular}"
I want to replace [results] with some text I have generated in my code, to compose a LaTeX table, but when I run:
sub("\\[results\\]","text... \\hline text...",template)
I have the following output:
"\\begin{tabular}\ntext... hline text...\n\\end{tabular}"
The backslash before hline is lost, and I don't understand why, because I'm using \\ to escape it.
I am using R-3.0.2.
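In the replacement string, the regex engine treats a backslash as an escape character (it is what introduces a reference to a capture group), so it consumes the single backslash that \\ produces. You need to add two more backslashes: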
sub("\\[results\\]","text... \\\\hline text...",template)
[1] "\\begin{tabular}\ntext... \\hline text...\n\\end{tabular}"
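The same pitfall exists in other regex APIs. As a hedged illustration in Python rather than R, re.sub also interprets backslashes in the replacement string, so a literal backslash either has to be doubled or the replacement has to be supplied as a function, which bypasses escape processing altogether. The names rows, escaped and via_callable are illustrative only.

```python
import re

# Same template as in the question: one literal backslash before "begin"/"end".
template = "\\begin{tabular}\n[results]\n\\end{tabular}"

# Generated text containing one literal backslash before "hline".
rows = "text... \\hline text..."

# Option 1: double every backslash so the replacement parser emits one.
escaped = re.sub(r"\[results\]", rows.replace("\\", "\\\\"), template)

# Option 2: pass a function; its return value is inserted verbatim.
via_callable = re.sub(r"\[results\]", lambda m: rows, template)

print(escaped)
print(via_callable)
```

Both calls produce the template with a single literal backslash before hline. The callable form is usually the safer choice when the replacement text comes from generated data.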