Using sed between specific lines only - regex

I have this sed command for removing the spaces after commas.
sed -e 's/,\s\+/,/g' example.txt
How can I change it so that the modification is applied only between specific line numbers
(e.g. between the second and third lines)?

Use:
sed '2,3s/,\s\+/,/g' example.txt
This will apply the regex /,\s\+/ only in the lines numbered 2 to 3 (inclusive) and substitute the match with ,.
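A quick way to confirm the range before touching a real file is to run it on a few throwaway lines. The sample input below is made up for illustration, and `\s` and `\+` assume GNU sed, as in the question.

```shell
# Hypothetical four-line sample; only lines 2 and 3 should change.
input='a,  b
c,  d
e,  f
g,  h'

# Apply the substitution to lines 2 through 3 only (GNU sed syntax).
result=$(printf '%s\n' "$input" | sed '2,3s/,\s\+/,/g')
printf '%s\n' "$result"
```

Lines 1 and 4 keep their spaces; lines 2 and 3 come out as `c,d` and `e,f`.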

Since OSX (BSD) sed has some syntax differences from Linux (GNU) sed, I thought I'd add the following from some hard-won notes of mine:
OSX (BSD) sed find/replace within an address block (delimited by start and end patterns (/../) or by line numbers) in the same file:
Syntax:
$ sed '/start_pattern/,/end_pattern/ [operations]' [target filename]
Standard find/replace examples:
$ sed -i '' '2,3 s/,\s\+/,/g' example.txt
$ sed -i '' '/DOCTYPE/,/body/ s/,\s\+/,/g' example.txt
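A pattern-delimited range can be tried on standard input first, without -i, so no file is modified. This is only a sketch with made-up input. Because `\s` and `\+` are not part of POSIX basic regular expressions, it spells "one or more whitespace characters" as `[[:space:]][[:space:]]*`, which should behave the same in GNU and BSD sed.

```shell
# Hypothetical sample: the range opens at the DOCTYPE line and closes at the body line.
input='x,  y
<!DOCTYPE html>
a,  b
<body>
c,  d'

# Only lines from the first /DOCTYPE/ match through the next /body/ match are edited.
result=$(printf '%s\n' "$input" | sed '/DOCTYPE/,/body/ s/,[[:space:]][[:space:]]*/,/g')
printf '%s\n' "$result"
```

Only `a,  b` sits inside the range, so it is the only line that becomes `a,b`.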
Example with a complex operator and grouping: to run a command that has its own pattern address inside the range, wrap it in { }. All statements in the group must be on separate lines or separated by semicolons:
Complex operator example (deletes every line in the range that contains a match):
$ sed -i '' '2,3 {/pattern/d;}' example.txt
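To see how the range and the inner pattern combine, here is a small sketch that reads from standard input instead of editing in place. The sample lines and the pattern `drop` are hypothetical.

```shell
# Hypothetical sample: "drop" appears on lines 2 and 4, but the range covers only lines 2-3.
input='keep 1
drop 2
keep 3
drop 4'

# Inside lines 2-3, delete any line matching /drop/; lines outside the range are untouched.
result=$(printf '%s\n' "$input" | sed '2,3 {/drop/d;}')
printf '%s\n' "$result"
```

Line 2 is deleted, while `drop 4` survives because it lies outside the addressed block.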
Multi-file find + sed:
$ find ./ -type f -name '*.html' | xargs sed -i '' '/<head>/,/<\/head>/ {/pattern/d; /pattern2/d;}'
Hope this helps someone!

sed -e '2,3!b;s/,\s\+/,/g' example.txt
This version can be useful if you later want to add more commands to process the desired lines.
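As a sketch of what "adding more commands" could look like: `2,3!b` branches to the end of the script for every line outside 2-3, so everything after it runs only on the selected lines. The second substitution and the sample input are made up for illustration, and `\s\+` again assumes GNU sed.

```shell
# Hypothetical sample input.
input='a,  b
c,  d
e,  f
g,  h'

# Lines outside 2-3 skip the rest of the script; lines 2-3 get both substitutions.
result=$(printf '%s\n' "$input" |
  sed -e '2,3!b' -e 's/,\s\+/,/g' -e 's/$/ <- edited/')
printf '%s\n' "$result"
```

Splitting the script across several -e options avoids relying on `;` being accepted directly after the `b` command.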

Related

Regex to replace all spaces in the code block marker of markdown file

I want to replace each group of spaces with a single comma in code block marker in every markdown file.
For example I have this code block:
```html class1 class2
Note that the line above has two groups of spaces: one with 3 spaces, the other with a single space.
I want to replace it to:
```html,class1,class2
I have tried following command without success:
find src -type f -name "*.md" -exec sed -i s/^(?<=```)( )+/,/g {} +
Meaning: if a line starts with ```, replace all spaces with a comma.
But it doesn't work.
Which command should I use here?
This will do it (with GNU sed):
sed '/^```/ s/\s\+/,/g' your_file
Here is how it works:
For lines beginning with three backticks (/^```/), replace every occurrence (g means global replacement) of one or more whitespace characters (\s matches whitespace, \+ means one or more) with a comma.
Once you've confirmed it does what you want, add -i to do the substitution in place:
sed -i '/^```/ s/\s\+/,/g' your_file
You can use
sed -E '/^```/ s/[[:space:]]+/,/g' file
Details:
-E enables the POSIX ERE syntax
/^```/ - if the line starts with ``` go on and execute the subsequent commands
s/[[:space:]]+/,/g - replaces one or more whitespaces with a single , char.
s='```html class1 class2
html class3 class4'
sed -E '/^```/ s/[[:space:]]+/,/g' <<< "$s"
Output:
```html,class1,class2
html class3 class4
Using any awk in any shell on every Unix box:
$ awk -v OFS=',' '/^```/{$1=$1} 1' file
```html,class1,class2
If you want to do "inplace" editing (like you're doing with GNU sed for sed -i) then use GNU awk and make it awk -i inplace -v OFS=',' '/^```/{$1=$1} 1' file
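The reason `$1=$1` works is that assigning to any field makes awk rebuild the record, joining the fields with OFS and discarding the original runs of whitespace. The sketch below shows this on made-up input. The `fence` variable is an assumption of this example only: it builds the three-backtick marker at run time so the sample does not need to contain a literal fence.

```shell
# Build the three-backtick marker at run time (illustration only).
fence=$(printf '%s%s%s' '`' '`' '`')

# Hypothetical sample: an opening marker with classes, a normal line, a closing marker.
input=$(printf '%s\n' "${fence}html   class1 class2" 'html class3 class4' "$fence")

# On marker lines, assigning a field to itself rebuilds the record with OFS=','.
result=$(printf '%s\n' "$input" |
  awk -v OFS=',' -v fence="$fence" '$0 ~ "^" fence { $1 = $1 } 1')
printf '%s\n' "$result"
```

The first line comes out with commas between the marker and each class, and the other two lines are printed unchanged. One side effect worth knowing: rebuilding the record also drops leading and trailing whitespace on the lines it touches.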

Why does this regex work in grep but not sed?

I have two regular expressions:
$ grep -E '\-\- .*$' *.sql
$ sed -E '\-\- .*$' *.sql
(I am trying to grep lines in sql files that have comments and remove lines in sql files that have comments)
The grep command works using this regex; however, the sed returns the following error:
sed: -e expression #1, char 7: unterminated address regex
What am I doing incorrectly with sed?
(The space after the two hyphens is required for sql comments if you are unfamiliar with MySql comments of this type)
You're trying to use:
sed -E '\-\- .*$' *.sql
Here sed command is not correct because you're not really telling sed to do something.
It should be:
sed -n '/-- /p' *.sql
and equivalent grep would be:
grep -- '-- ' *.sql
or even better with a fixed string search:
grep -F -- '-- ' *.sql
The -- marks the end of grep's options, so the pattern '-- ' that follows is not mistaken for an option.
There is no need to escape - in a regex when it is outside a bracket expression (or character class), i.e. [...].
Based on comments below it seems OP's intent is to remove commented section in all *.sql files that start with 2 hyphens.
You may use this sed for that:
sed -i 's/-- .*//g' *.sql
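Before running that with -i, it may be worth checking on sample text what is left behind. A minimal sketch with made-up SQL, reading from standard input so nothing is modified:

```shell
# Hypothetical SQL sample with a trailing comment and a whole-line comment.
input='SELECT 1; -- trailing comment
-- whole-line comment
SELECT 2;'

# Remove everything from "-- " to the end of each line.
result=$(printf '%s\n' "$input" | sed 's/-- .*//')
printf '%s\n' "$result"
```

The comment text is gone, but the whole-line comment leaves an empty line and the trailing comment leaves a space after `SELECT 1;`. The substitution also cannot tell a comment from a `-- ` that happens to sit inside a string literal, so treat it as a rough tool.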
The problem here is not the regex, the problem is that sed requires a command. The equivalent of your grep would be:
sed -n '/\-\- .*$/p'
You suppress output for non-matching lines -n ... you search (wrap your regex in slashes) and you print p (after the last slash).
P.S.: As Anub pointed out, escaping the hyphens - inside the regex is unnecessary.
You are using sed's \cregexpc syntax: by starting the address with \- you tell sed that the delimiter character is a dash (-), but you never terminate the regex with a closing -. Terminate it, and add the d command to delete those lines:
sed '\-\-\-.*$-d' infile
see man sed about that:
\cregexpc
Match lines matching the regular expression regexp. The c may be any character.
With the default / delimiter none of this is needed:
sed '/--.*$/d' infile
or simply:
sed '/^--/d' infile
and more accurately:
sed '/^[[:blank:]]*--/d' infile
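To illustrate that the delimiter is purely a matter of spelling, the sketch below runs the last command twice on made-up SQL: once with the default / and once in \cregexpc form with | as the delimiter (the choice of | is arbitrary).

```shell
# Hypothetical SQL sample: two whole-line comments (one indented) and one trailing comment.
input='SELECT 1;
-- top-level comment
  -- indented comment
SELECT 2; -- trailing comment'

# Default delimiter.
with_slash=$(printf '%s\n' "$input" | sed '/^[[:blank:]]*--/d')

# Same address written with a custom delimiter.
with_pipe=$(printf '%s\n' "$input" | sed '\|^[[:blank:]]*--|d')

printf '%s\n' "$with_slash"
```

Both forms delete the two whole-line comments and keep the line with the trailing comment, because the address is anchored at the start of the line.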

Replace the separator between pairs of numbers

I want to replace all strings like [0-9][0-9]-[0-9][0-9] with [0-9][0-9]/[0-9][0-9] using sed.
In other words, I want to replace - with /.
If I have somewhere in my text:
09-36
32-43
54-65
I want this change:
09/36
32/43
54/65
Using GNU sed:
$ echo '09-36 32-43 54-65' | sed -r 's|\<([0-9]{2})-([0-9]{2})\>|\1/\2|g'
09/36 32/43 54/65
-r turns on extended regular expressions, so ( ) { } do not need \-escaping.
\< and \> restrict the match to word boundaries (if the expression should only match full lines, use ^ and $ instead, and omit the g option).
| is used as an alternative delimiter for the s command so that / can be used without \-escaping.
A BSD/macOS sed solution would look slightly different:
echo '09-36 32-43 54-65' | sed -E 's|[[:<:]]([0-9]{2})-([0-9]{2})[[:>:]]|\1/\2|g'
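The effect of the word-boundary assertions is easiest to see on input that contains a longer number. A minimal sketch, assuming GNU sed (where -E is accepted as a synonym for -r) and a made-up sample:

```shell
# Hypothetical sample: "132-43" has three digits before the dash.
input='09-36 132-43 54-65'

# \< and \> require the two-digit pairs to start and end at word boundaries.
result=$(printf '%s\n' "$input" |
  sed -E 's|\<([0-9]{2})-([0-9]{2})\>|\1/\2|g')
printf '%s\n' "$result"
```

`132-43` is left alone: no word starts at `32`, and a match starting at `13` is not followed by a dash.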
sed -e 's/\([0-9]\{2\}\)-\([0-9]\{2\}\)/\1\/\2/g'
Might not be the most elegant version, but works for me. The gazillion backslashes make this rather unreadable in my opinion. You might improve the readability by not using / to separate the pattern and the replacement maybe?
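Following that suggestion, here is a sketch of the same basic-regex substitution with | as the delimiter, which removes the need to escape the slash in the replacement. The sample input is taken from the other answer.

```shell
# Same pattern as above, but delimited with | so the / in the replacement needs no backslash.
result=$(printf '%s\n' '09-36 32-43 54-65' |
  sed 's|\([0-9]\{2\}\)-\([0-9]\{2\}\)|\1/\2|g')
printf '%s\n' "$result"
```

Note that without word boundaries this version also rewrites two-digit pairs embedded in longer numbers.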
perl -C -npe 's/(?<!\d)(\d\d)-(\d\d)(?!\d)/\1\/\2/g' file
Input
维基 1-11 22-33 444-44 55-555 66-66百科
77-77
8 88-88
Output
维基 1-11 22/33 444-44 55-555 66/66百科
77/77
8 88/88
In the command above
-C enables Unicode;
-n causes Perl to process the script for each input line;
-p causes Perl to print the result of the script to the standard output;
-e accepts a Perl expression (particularly, it is a substitution).
In this mode (-npe), Perl works just like sed. The script replaces each pair of two-digit numbers separated by - with the same pair separated by a slash.
(?<!\d) and (?!\d) are negative lookaround expressions.
To edit the file in place use -i option: perl -C -i.backup -npe ....
If the input is not a file, you can pass the input to Perl via pipe, e.g.:
echo '维基 1-11 22-33 444-44 55-555 66-66百科' | \
perl -C -npe 's/(?<!\d)(\d\d)-(\d\d)(?!\d)/\1\/\2/g'

Extract few matching strings from matching lines in file using sed

I have a file with strings similar to this:
abcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'
I have to find current_count and total_count for each line of the file. I am trying the command below, but it's not working. Please help.
grep current_count file | sed "s/.*\('current_count': u'\d+'\).*/\1/"
It is outputting the whole line but I want something like this:
'current_count': u'3', 'total_count': u'3'
It's printing the whole line because the pattern in the s command doesn't match, so no substitution happens.
sed regexes don't support \d for digits, or x+ for xx*. GNU sed has a -r option to enable extended-regex support so + will be a meta-character, but \d still doesn't work. GNU sed also allows \+ as a meta-character in basic regex mode, but that's not POSIX standard.
So anyway, this will work:
echo -e "foo\nabcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'" |
sed -nr "s/.*('current_count': u'[0-9]+').*/\1/p"
# output: 'current_count': u'2'
Notice that I skip the grep by using sed -n s///p. I could also have used /current_count/ as an address:
sed -r -e '/current_count/!d' -e "s/.*('current_count': u'[0-9]+').*/\1/"
Or with just grep printing only the matching part of the pattern, instead of the whole line:
grep -E -o "'current_count': u'[[:digit:]]+'" file
(or egrep instead of grep -E). I forget if grep -o is POSIX-required behaviour.
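The commands above extract only current_count. One possible way to get both fields on one line, as the question asks, is to capture two groups in a single substitution. This is a sketch, not the only approach: it assumes current_count always appears before total_count on the line, and it uses -E, which GNU sed accepts as a synonym for -r.

```shell
# Sample line from the question, plus a non-matching line to show that -n ... p filters it out.
line="abcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'"

# Capture each key/value pair in its own group and print them joined by ", ".
result=$(printf '%s\n' 'foo' "$line" |
  sed -nE "s/.*('current_count': u'[0-9]+').*('total_count': u'[0-9]+').*/\1, \2/p")
printf '%s\n' "$result"
```

For the sample line this prints the two pairs separated by a comma and a space; the `foo` line produces no output.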
To me this looks like some sort of serialized Python data. I would first try to find out the origin of that data and parse it properly.
However, while hackish, sed can also be used here:
sed "s/.*current_count': [a-z]'\([0-9]\+\).*/\1/" input.txt
sed "s/.*total_count': [a-z]'\([0-9]\+\).*/\1/" input.txt

Selective find/replace with sed

I need to do some find and replace in C++ source code: replace all occurrences of _uvw with xyz except when _uvw is part of abc_uvw or def_uvw. For example:
abc_uvw ghi_uvw;
jkl_uvw def_uvw;
should become:
abc_uvw ghixyz;
jklxyz def_uvw;
So far I came up with the following:
find . -type f -print0 | xargs -0 sed -i '/abc_uvw/\!s/_uvw/xyz/g'
This will replace all _uvw with xyz only in the lines that don't contain abc_uvw, which (1) doesn't handle such a case: abc_uvw ghi_uvw; and (2) doesn't take into account the second exception, that is def_uvw.
So how would one do that sort of selective find and replace with sed?
This might work for you (GNU sed):
sed -r 's/(abc|def)_uvw/\1\n_uvw/g;s/([^\n])_uvw/\1xyz/g;s/\n//g' file
Insert a newline in front of the strings you do not want to change. Change those strings which do not have a newline in front of them. Delete any newlines.
N.B. Newline is chosen as it cannot exist in an unadulterated sed buffer.
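Run against the two sample lines from the question, the three steps look like this. The sketch assumes GNU sed, which understands `\n` both in the replacement and inside the bracket expression; -E is used here as a synonym for -r.

```shell
# Sample lines from the question.
input='abc_uvw ghi_uvw;
jkl_uvw def_uvw;'

# 1) mark protected names with a newline, 2) rewrite unmarked _uvw, 3) remove the markers.
result=$(printf '%s\n' "$input" |
  sed -E 's/(abc|def)_uvw/\1\n_uvw/g; s/([^\n])_uvw/\1xyz/g; s/\n//g')
printf '%s\n' "$result"
```

Because the second substitution needs a character in front of `_uvw`, an occurrence at the very start of a line would not be rewritten by this version.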
How about this?
$ cat file
abc_uvw ghi_uvw;
jkl_uvw def_uvw;
$ sed 's/abc_uvw/foo/g;s/def_uvw/bar/g;s/_uvw/xyz/g;s/foo/abc_uvw/g;s/bar/def_uvw/g' file
abc_uvw ghixyz;
jklxyz def_uvw;
You should use negative lookbehind. For example, in Perl:
perl -pe 's/(?<!(abc|def))_uvw/xyz/g' file.c
This performs a global substitution of any instances of _uvw that are not immediately preceded by abc or def.
Output:
abc_uvw ghixyz;
jklxyz def_uvw;
Sed is a useful tool and certainly has its place but Perl is a lot more powerful in terms of regular expressions. Using Perl, you get to specify exactly what you mean, rather than solving the problem in a more roundabout way.
This will work:
sed -e 's/abc_uvw/AAA_AAA/g; # shadow abc_uvw
s/def_uvw/DDD_DDD/g; # shadow def_uvw
s/_uvw/xyz/g; # substitute
s/AAA_AAA/abc_uvw/g; # recover abc_uvw
s/DDD_DDD/def_uvw/g # recover def_uvw
' input.cpp > output.cpp
cat output.cpp
sed 's/µ/µm/g;s/abc_uvw/µa/g;s/def_uvw/µd/g
s/_uvw/xyz/g
s/µd/def_uvw/g;s/µa/abc_uvw/g;s/µm/µ/g' YourFile
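This is the same concept as the other answers, but it first "escapes" the temporary marker character so that text already containing it is not confused with the placeholders for abc and def. I use µ, but any other character works; just avoid characters that are special to sed, such as /, \, &, ...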
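The value of the escaping step shows up when the input already contains the placeholder text. The sketch below contrasts a plain-word placeholder with an escaped one on a made-up line. It swaps in `%` for µ purely to keep the example ASCII, and it handles only abc_uvw for brevity.

```shell
# Hypothetical input that already contains the word "foo".
input='abc_uvw foo_uvw;'

# Plain-word placeholder: the "foo" already in the input is mistaken for the marker.
naive=$(printf '%s\n' "$input" |
  sed 's/abc_uvw/foo/g;s/_uvw/xyz/g;s/foo/abc_uvw/g')

# Escaped placeholder: % is doubled up as %m first, so %a can only come from the script.
escaped=$(printf '%s\n' "$input" |
  sed 's/%/%m/g;s/abc_uvw/%a/g;s/_uvw/xyz/g;s/%a/abc_uvw/g;s/%m/%/g')

printf 'naive:   %s\nescaped: %s\n' "$naive" "$escaped"
```

The naive pipeline turns the unrelated `foo_uvw` into `abc_uvwxyz`, while the escaped one produces `fooxyz` as intended.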