I am replacing one multi-line pattern:
</p>
<ul>
using the following sed command:
sed -e 's|<\/p>\n<ul>|\\begin{itemize}|g'
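but it does not seem to work. How can I replace the above pattern using sed?
sed reads input one line at a time, so the pattern space never contains a newline unless you join lines yourself. Assuming you have GNU sed, you can use a loop to read the whole file into the pattern space first: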
sed ':a; N; $!ba; s|</p>\n<ul>|\\begin{itemize}|g' file
For BSD/OSX sed, try:
sed -e ':a' -e 'N' -e '$!ba' -e 's|</p>\n<ul>|\\begin{itemize}|g' file
As per the comments, to allow for leading spaces on each line, add a space followed by * before each tag:
sed ':a; N; $!ba; s| *</p>\n *<ul>|\\begin{itemize}|g' file
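To make the loop easier to follow, here is a minimal sketch that runs the portable multi-expression form on a small sample. The sample input and the variable names input and result are assumptions made for illustration; they are not from the question.

```shell
# Hypothetical sample input: the two tags carry leading spaces.
input=$(printf '%s\n' '<p>Intro' '  </p>' '  <ul>' '<li>one</li>')

# :a    define a label named a
# N     append the next input line to the pattern space
# $!ba  unless this is the last line, branch back to a
# s|..| then substitute once over the whole buffered input
result=$(printf '%s\n' "$input" |
  sed -e ':a' -e 'N' -e '$!ba' -e 's| *</p>\n *<ul>|\\begin{itemize}|g')

printf '%s\n' "$result"
```

On this sample the two tag lines collapse into a single \begin{itemize} line, and the other lines are left alone. Note that the loop holds the entire file in memory, which is fine for small documents but may matter for very large ones.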
I want to change
>lcl|ORF183:9482:8118 unnamed protein product
into
>ORF183:9482-8118
That is, keep everything after | and before the first whitespace, and replace the second : with -
So far I'm doing it with the following code:
sed -e '/^>/s/ .*//' -e '/^>/s/|/ /' -e '/^>/s/lcl //' -e '/^>/s/\(.*\):/\1-/'
but I would like to do it with a simpler one-liner.
This could work:
sed -e 's/\(^.*|\)\(.*\):\(.*\):\(.*\)[[:space:]]\(unnamed.*$\)/>\2:\3-\4/'
Here are some improvements based on the code you've tried:
$ sed -e '/^>/s/ .*//' -e '/^>/s/lcl|//' -e '/^>/s/:/-/2' ip.txt
>ORF183:9482-8118
-e '/^>/s/|/ /' -e '/^>/s/lcl //' can be simplified to -e '/^>/s/lcl|//'
use s/>[^|]*|/>/ if you wish to match any text between > and |
sed lets you specify which occurrence of the match to replace; s/:/-/2 means replace the 2nd : with -
If your sed implementation allows grouping, you can group all the commands (separated by ;) inside {} for a particular address
$ sed '/^>/{s/ .*//; s/lcl|//; s/:/-/2}' ip.txt
>ORF183:9482-8118
Please visit https://stackoverflow.com/tags/sed/info for learning resources and other goodies
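As a rough sketch of the grouped form, the commands can be wrapped in a small function and tried on a two-line record. The function name clean_headers and the sequence line are hypothetical; the multi-line layout inside the braces is used here because some non-GNU sed implementations are stricter about a closing } on the same line.

```shell
# Rewrites header lines (those starting with >); other lines pass through.
clean_headers() {
  sed '/^>/{
    s/ .*//
    s/>[^|]*|/>/
    s/:/-/2
  }'
}

# Hypothetical record: a header followed by a sequence line.
cleaned=$(printf '%s\n' '>lcl|ORF183:9482:8118 unnamed protein product' 'MSTNPKPQRK' |
  clean_headers)

printf '%s\n' "$cleaned"
```

The header becomes >ORF183:9482-8118 while the sequence line is untouched, because every command sits behind the /^>/ address.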
I have a version regex that works in PCRE format, but I am having trouble getting it to work with sed using match groups.
Regex:
((^[[:alnum:]]+.*)-(\d+\.\d+\.\d+-VERS|\d+\.\d+\.\d+))
Input:
aaa1-bbb2-ccc3-dddd4-ffff5-1.0.0-VERS
aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS
zzz1-bbb2-ccc3-1.0.1
zzz1-1.0.1-VERS
Expected output: split each string and separate out the version string:
group2="aaa1-bbb2-ccc3-dddd4-ffff5"
group3="1.0.0-VERS"
group2="aaa1-bbb2-ccc3-dddd4-ffff5"
group3="11.22.33-VERS"
group2="zzz1-bbb2-ccc3"
group3="1.0.1"
group2="zzz1"
group3="1.0.1-VERS"
The regex produces the output above when tested with a PCRE engine.
However, the same regex does not work with sed. What am I missing?
echo "aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS" | sed -E 's#((^[[:alnum:]]+.*)-(\d+\.\d+\.\d+-VERS|\d+\.\d+\.\d+))#\3 \2#p'
Why such a complicated regexp?
$ sed -E 's/(.*)-([0-9.]+(-VERS)?)$/\2\t\1/' file
1.0.0-VERS aaa1-bbb2-ccc3-dddd4-ffff5
11.22.33-VERS aaa1-bbb2-ccc3-dddd4-ffff5
1.0.1 zzz1-bbb2-ccc3
1.0.1-VERS zzz1
or:
$ sed -E 's/(.*)-([^-]+-[^-]+)$/\2\t\1/' file
1.0.0-VERS aaa1-bbb2-ccc3-dddd4-ffff5
11.22.33-VERS aaa1-bbb2-ccc3-dddd4-ffff5
ccc3-1.0.1 zzz1-bbb2
1.0.1-VERS zzz1
depending on what the output should be for input zzz1-bbb2-ccc3-1.0.1.
I think \d isn't recognised by sed. This works for me on OSX.
sed -E 's/([[:alnum:]]+.*)-([0-9]+\.[0-9]+\.[0-9]+|[0-9]+\.[0-9]+\.[0-9]+-VERS)/\1 \2/'
Input:
aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS
aaa1-bbb2-ccc3-dddd4-ffff5-1.0.0-VERS
zzz1-bbb2-ccc3-1.0.1
zzz1-1.0.1-VERS
Output:
aaa1-bbb2-ccc3-dddd4-ffff5 11.22.33-VERS
aaa1-bbb2-ccc3-dddd4-ffff5 1.0.0-VERS
zzz1-bbb2-ccc3 1.0.1
zzz1 1.0.1-VERS
As @Sundeep pointed out, \d+ does not work with sed; use [0-9]+ instead.
echo "aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS" | sed -E 's#((^[[:alnum:]]+.*)-([0-9]+\.[0-9]+\.[0-9]+-VERS|[0-9]+\.[0-9]+\.[0-9]+))#\3 \2#g'
This might work for you (GNU sed):
sed -r 'h;s/^(([[:alnum:]]+-?)+)-(([[:digit:]]+\.?){3}(-VERS)?)/group1="\1"/p;g;s//group3="\3"/p;d' file
However a simpler regexp would be:
sed -r 'h;s/^(.*)-([0-9].*)/group1="\1"/p;g;s//group2="\2"/p;d' file
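Pulling the answers above together, the following sketch prints the two parts in the layout the question asked for. The function name split_version is hypothetical, and the labels group2 and group3 are simply copied from the question's expected output; inside this regex they correspond to capture groups \1 and \2. It assumes a sed that supports -E (current GNU and BSD versions do).

```shell
# Splits "name-X.Y.Z" or "name-X.Y.Z-VERS" into name and version.
# The backslash-newline in the replacement emits a literal newline.
split_version() {
  sed -E 's/^(.*)-([0-9]+\.[0-9]+\.[0-9]+(-VERS)?)$/group2="\1"\
group3="\2"/'
}

printf '%s\n' 'aaa1-bbb2-ccc3-dddd4-ffff5-11.22.33-VERS' 'zzz1-bbb2-ccc3-1.0.1' |
  split_version
```

Anchoring the version at the end of the line with $ keeps the greedy (.*) from swallowing part of the version, and [0-9] replaces \d, which POSIX regular expressions do not define.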
I've read a lot of questions about how to replace spaces from a file but I have the following problem:
I have a file like so:
<foo>"crazy foo"</foo> <bar>dull-bar</bar>
and I'm trying to remove spaces between > < and only those ones so the file would be like:
<foo>"crazy foo"</foo><bar>dull-bar</bar>
So far I've tried to remove them using sed and tr. sed is not working at all, and using tr '> <' '><' outputs:
<foo>"crazy foo"</foo><<bar>dull-bar</bar>
sed -i -e "s/> *</></g" YourFile
-i means YourFile is modified in place. Remove this option to test your command and print the result to standard output.
A space followed by * matches zero or more spaces.
The g at the end of the sed expression means "replace all occurrences".
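As a quick check of the points above, this sketch runs the substitution without -i on a single sample line, so nothing on disk is modified. The variable names line and joined are assumptions for illustration, and [[:space:]] is used instead of a literal space so that tabs between tags are covered too.

```shell
# Sample line with several spaces between the closing and opening tags.
line='<foo>"crazy foo"</foo>   <bar>dull-bar</bar>'

# Remove whitespace only where it sits between > and <.
joined=$(printf '%s\n' "$line" | sed 's/>[[:space:]]*</></g')

printf '%s\n' "$joined"
```

The space inside "crazy foo" survives because it is not bracketed by > and <. tr cannot do this, since it translates single characters and has no notion of context.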
You could try something like this
echo '<foo>"crazy foo"</foo> <bar>dull-bar</bar>' | sed 's/>[[:space:]]*</></g'
awk -F"\"" '{print $3}' file.txt | sed 's/ //g'
How to remove words that do not start with a specific character by sed?
Sample:
echo "--foo imhere -abc anotherone" | sed ...
Result must be;
"--foo -abc"
echo "--foo imhere -abc anotherone" |\
sed -e 's/^/ /g' -e 's/ [^-][^ ]*//g' -e 's/^ *//g'
The first and last -e commands are needed only if the first word might also have to be removed.
GNU sed with -r:
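To illustrate why the surrounding commands matter, here is a small sketch that wraps the three expressions in a function and feeds it a line whose first word should also go. The function name keep_dash_words and the second test line are hypothetical.

```shell
# 1) prepend a space so the first word looks like every other word
# 2) delete each " word" whose first character is not a dash
# 3) strip the leading space that is left over
keep_dash_words() {
  sed -e 's/^/ /' -e 's/ [^-][^ ]*//g' -e 's/^ *//'
}

first=$(printf '%s\n' '--foo imhere -abc anotherone' | keep_dash_words)
second=$(printf '%s\n' 'imhere --foo -abc' | keep_dash_words)

printf '%s\n' "$first" "$second"
```

Both lines come out as --foo -abc. Without the first expression, the leading imhere in the second line would not be preceded by a space and so would not be matched.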
kent$ echo "--foo imhere -abc anotherone" | sed -r 's/^|\s[^-]\S*//g'
--foo -abc
However, I prefer awk for this; it is more straightforward:
awk '{for(i=1;i<=NF;i++)$i=($i~/^-/?$i:"")}7'
output:
--foo -abc
You can use ssed to enable PCRE regex and then you can use this one:
(?<!-)\b\w+
echo "--foo imhere -abc anotherone" | ssed -R 's/(?<!-)\b\w+//g'
I'm trying to write a sed command to remove a specific string followed by two digits. So far I have:
sed -e 's/bizzbuzz\([0-9][0-9]\)//' file.txt
but I can't seem to get the syntax right. Any suggestions?
sed -re 's/bizzbuzz[0-9]{2}//' file.txt
and
sed -re 's/\bbizzbuzz[0-9]{2}\b//' file.txt
if the string you are searching for sits on word boundaries
sed -e 's/bizzbuzz[0-9]\{2\}//' file.txt
if you don't have GNU sed
Your current approach seems like it should work fine:
$ echo 'FOO bizzbuzz56 BAR' | sed -e 's/bizzbuzz\([0-9][0-9]\)//'
FOO  BAR
As said in the other answer, the syntax seems to be fine (with unnecessary parentheses).
But maybe you want to replace every occurrence on each line? In that case, add a 'g' at the end of the 's' command:
sed -e 's/bizzbuzz\([0-9][0-9]\)//g' file.txt
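The effect of the g flag can be seen with a short sketch like the one below. The sample text and the variable names are made up for illustration, and the portable \{2\} interval form is used so it should not depend on GNU sed.

```shell
# Sample line containing two matches.
text='FOO bizzbuzz56 BAR bizzbuzz78 BAZ'

# Without g, only the first match on the line is removed.
first_only=$(printf '%s\n' "$text" | sed -e 's/bizzbuzz[0-9]\{2\}//')

# With g, every match on the line is removed.
all_matches=$(printf '%s\n' "$text" | sed -e 's/bizzbuzz[0-9]\{2\}//g')

printf '%s\n' "$first_only" "$all_matches"
```

Each removal leaves the spaces on both sides of the match in place, so the words that remain end up separated by two spaces.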