Replace only the first occurrence matching a regex with sed - regex

I have a string
test:growTest:ret
With sed, I would like to delete only test: to get:
growTest:ret
I tried with
sed '0,/RE/s/^.*://'
But it only gives me
ret
Any ideas ?
Thanks

Change your regexp from ^.*: to ^[^:]*:
All you need is for the .* construction not to consume your delimiter, the colon. To do this, replace the match-any-character . with a negated bracket expression: [^abc] matches any character except those listed.
Also, don't confuse the two circumflexes ^, as they have different meanings: the first one matches the beginning of the line, the second one negates the bracket expression.
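To see the difference between the two patterns, here is a minimal sketch. The function names strip_first_field and strip_greedy are made up for illustration and are not part of sed.

```shell
# Remove everything up to and including the FIRST colon.
# [^:]* cannot run past a colon, so the match stops at "test:".
strip_first_field() {
    printf '%s\n' "$1" | sed 's/^[^:]*://'
}

# The pattern from the question, for comparison.
# .* is greedy, so the match extends to the LAST colon.
strip_greedy() {
    printf '%s\n' "$1" | sed 's/^.*://'
}

strip_first_field 'test:growTest:ret'   # prints growTest:ret
strip_greedy 'test:growTest:ret'        # prints ret
```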

If I understand your question, you want strings like test:growTest:ret to become growTest:ret.
You can use:
sed -i 's/test:\(.*\)$/\1/' file
-i means edit the file in place.
s/one/two/ replaces the first occurrence of one on each line with two.
So this replaces "test:\(.*\)$" with "\1", where \1 is the contents of the first group, that is, whatever the regex matched inside the parentheses. sed uses basic regular expressions by default, so the parentheses must be escaped with backslashes to form a group.
"test:\(.*\)$" matches the first occurrence of "test:" and then captures everything else up to the end of the line in the group. Only the contents of the group remain after the sed command.

sed matching is greedy, so ^.*: matches test:growTest: rather than just test:.
By default, sed replaces only the first match on each line, so you do not need to do anything special for that.
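For the capture-group approach suggested above, the group syntax depends on the regex flavour. This sketch shows both spellings; the function names are hypothetical, and -E is assumed to be available (it is in GNU and BSD sed).

```shell
# Basic regular expressions (sed's default): groups are written \( \).
drop_prefix_bre() {
    printf '%s\n' "$1" | sed 's/test:\(.*\)$/\1/'
}

# Extended regular expressions (-E): groups are written with plain ( ).
drop_prefix_ere() {
    printf '%s\n' "$1" | sed -E 's/test:(.*)$/\1/'
}

drop_prefix_bre 'test:growTest:ret'   # prints growTest:ret
drop_prefix_ere 'test:growTest:ret'   # prints growTest:ret
```

Without a g flag, only the first match on each line is replaced, which is why no extra address such as 0,/RE/ is needed here.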

Related

Using sed to replace space delimited strings

echo 'bar=start "bar=second CONFIG="$CONFIG bar=s buz=zar bar=g bar=ggg bar=f bar=foo bar=zoo really?=yes bar=z bar=yes bar=y bar=one bar=o que=idn"' | sed -e 's/^\|\([ "]\)bar=[^ ]*[ ]*/\1/g'
Actual output:
CONFIG="$CONFIG buz=zar bar=ggg bar=foo really?=yes bar=yes bar=one que=idn"
Expected output:
CONFIG="$CONFIG buz=zar really?=yes que=idn"
What am I missing in my regex?
Edit:
This works as expected (with GNU sed):
's/\(^\|\(['\''" ]\)\)bar=[^ ]*/\2/g; s/[ ][ ]\+/ /g; s/[ ]*\(['\''"]\+\)[ ]*/\1/g'
Portable (POSIX) sed regular expressions are pretty limited. They don't include \w as a synonym for [a-zA-Z0-9_], for example. They also don't include \b, which matches the zero-length string at the beginning or end of a word (which you really want in this situation...). GNU sed accepts both as extensions.
s/ bar=[^ ]* *//
is close, but the problem is the trailing * removes the space that might precede the next bar=. So, in ... bar=aaa bar=bbb ... the first match is bar=aaa leaving bar=bbb ... to try for the second match but it won't match because you already consumed the space before bar.
s/ bar=[^ ]*//
is better -- don't consume the trailing spaces, leave them for the next match attempt. If you want to match bar=something even if it's at the beginning of the string, insert a space at the beginning first:
sed 's/^bar=/ bar=/; s/ bar=[^ ]*//g'
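A small sketch of why the trailing spaces should not be consumed. The sample input and both function names are made up for illustration; the final s/^ // is an extra step, not in the answer above, that tidies the leading space left behind.

```shell
# Leave trailing spaces alone so the next " bar=" can still match.
remove_bar() {
    printf '%s\n' "$1" |
        sed -e 's/^bar=/ bar=/' -e 's/ bar=[^ ]*//g' -e 's/^ //'
}

# Consuming trailing spaces eats the space the next match needs.
remove_bar_consuming() {
    printf '%s\n' "$1" | sed 's/ bar=[^ ]* *//g'
}

remove_bar 'bar=a foo=1 bar=b bar=c baz=2'         # prints foo=1 baz=2
remove_bar 'foobar=x bar=y'                        # prints foobar=x
remove_bar_consuming 'foo=1 bar=b bar=c baz=2'     # prints foo=1bar=c baz=2
```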
If you want to remove all instances of bar=something then you can simplify your regex as such:
\sbar=\w+
This matches each bar= together with the word that follows it. The bar= must be preceded by a whitespace character.
Demonstration:
https://regex101.com/r/xbBhJZ/3
As sed (GNU sed, since \s and \w are extensions):
s/\sbar=\w\+//g
This correctly accounts for foobar=bar.
As in Waxrat's answer, you have to insert a space at the beginning of the string for a leading bar= to match, because the pattern requires a whitespace character before bar=. This can be easily done since you're quoting your string explicitly.

Using sed to replace string matching regex with wildcards

I have a string I'm trying manipulate with sed
js/plex.js?hash=f1c2b98&version=2.4.23"
Desired output is
js/plex.js"
This is what I'm currently trying
sed -i s'/js\/plex.js[\?.\+\"]/js\/plex.js"/'
But it is only matching the first ? and returns this output
js/plex.js"hash=f1c2b98&version=2.4.23"
I can't see why this isn't working after a few hours
This works
echo 'js/plex.js?hash=f1c2b98&version=2.4.23"' | sed 's:\.js?[^"]*:.js:'
With the original Regex:
First, I would suggest using a different delimiter (like :) in sed when the regex contains /. Second, the use of [] means that you are matching a single character from the set inside the brackets, so the .\+ is not expanded to the end of the line. You could try putting the + after the [].
Perhaps:
sed 's#\(js/plex\.js\)?[^"]*#\1#'
Here # is used as the delimiter.
\(js/plex\.js\)?[^"]* finds js/plex.js followed by ? and everything up to (but not including) the closing quote, and replaces all of it with the marked pattern \1.
The marked pattern
In sed you can mark part of a pattern, or the whole pattern, by using \( \).
When part of a pattern is enclosed in parentheses () escaped by backslashes, the text it matches is marked (stored).
In my example, this is the pattern without marking:
js/plex\.js?[^"]*
But I only want sed to remember js/plex.js and replace the whole match with just that piece. In sed the first marked pattern is known as \1, the second as \2, and so forth.
\(js/plex\.js\) ---> is marked as \1
Hence I replace the whole match with \1
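The marked-pattern idea applied to the string from the question, as a sketch. The function name strip_query is hypothetical. The group ends before the ? so that neither the query string nor the ? is kept, and [^"]* stops short of the closing quote so it survives.

```shell
# Keep js/plex.js (group 1) and drop the ? plus everything up to the quote.
# In a basic regular expression an unescaped ? is a literal question mark.
strip_query() {
    printf '%s\n' "$1" | sed 's#\(js/plex\.js\)?[^"]*#\1#'
}

strip_query 'js/plex.js?hash=f1c2b98&version=2.4.23"'   # prints js/plex.js"
```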

Regex to match and copy up to but not including last occurrence of a particular value

In one regex ksh line I need to:
look for the occurrence of a particular string followed by any number of characters up to the last occurrence of a particular value (in this case a comma),
copy the stuff matched to the output, and then
insert a new value after the copied text and before the last occurrence of the particular value (in this case a comma)
So, if my input string looked like this:
SEARCH_STRING anything_else(foo,bar),
What I'd like to output is this:
SEARCH_STRING anything_else(foo,bar) INSERTED_VALUE,
So far, my sed expression looks like this (which only matches and copies everything up to the first occurrence of the comma, not up to the last):
sed -e 's/SEARCH_STRING [^,]\+/& INSERTED_VALUE/'
...which results in this:
SEARCH_STRING anything_else(foo INSERTED_VALUE,bar)
...which is not quite right. I know I need to use something like a negative look ahead - but can't quite get the syntax right. Any advice you could offer would be greatly appreciated, thanks. I also need to do the same replacement incidentally at the end of the line even if the comma isn't found as well please (although I appreciate that may require a separate question and expression). Thanks in advance for any advice offered....
Use the $ special character to match the end of the line, and the . special character to match the last character before that:
sed 's/\(SEARCH_STRING .*\)\(.\)$/\1 INSERTED_VALUE\2/'
You could replace the final dot in the match expression with a comma if you know that this is always the character you want to insert before. If that last character varies, the dot will match whatever it is. One downside, however, is that it also matches whitespace, so if your line has a few extra spaces after the comma, this expression will insert the value before the last space, not before the comma.
To insert before the last non-whitespace character, use this expression instead (\S and \s are GNU sed extensions):
sed 's/\(SEARCH_STRING .*\)\(\S\s*\)$/\1 INSERTED_VALUE\2/'
The simplest would be to use a lookahead, SEARCH_STRING .*(?=,), but sed does not support this. Instead you can do something like this:
sed -e 's/\(SEARCH_STRING .*\)\(,.*\)/\1 INSERTED_VALUE\2/'
Basically we capture what comes before the last comma in one group and everything from that comma onward in another, and then piece the line back together with INSERTED_VALUE in the middle. Because .* is greedy, the first group extends up to the last comma.
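A sketch of the two-group approach on the input from the question. The function name is hypothetical. The second -e expression is an assumption about how the "no comma" case from the question might be handled: it appends at the end of the line only when no comma follows SEARCH_STRING.

```shell
# Insert " INSERTED_VALUE" before the last comma after SEARCH_STRING.
# If there is no comma after SEARCH_STRING, append at end of line instead.
insert_value() {
    printf '%s\n' "$1" | sed \
        -e 's/\(SEARCH_STRING .*\)\(,.*\)/\1 INSERTED_VALUE\2/' \
        -e 's/\(SEARCH_STRING [^,]*\)$/\1 INSERTED_VALUE/'
}

insert_value 'SEARCH_STRING anything_else(foo,bar),'
# prints SEARCH_STRING anything_else(foo,bar) INSERTED_VALUE,
insert_value 'SEARCH_STRING anything_else(foo)'
# prints SEARCH_STRING anything_else(foo) INSERTED_VALUE
```

The second expression cannot fire after the first one has matched, because the line still contains a comma after SEARCH_STRING.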

Regular expression to match beginning and end of a line?

Could anyone tell me a regex that matches the beginning or end of a line? e.g. if I used sed 's/[regex]/"/g' filehere the output would be each line in quotes? I tried [\^$] and [\^\n] but neither of them seemed to work. I'm probably missing something obvious, I'm new to these
Try:
sed -e 's/^/"/' -e 's/$/"/' file
To add quotes to the start and end of every line is simply:
sed 's/.*/"&"/g'
The RE you were trying to come up with to match the start or end of each line, though, is:
sed -r 's/^|$/"/g'
It's an ERE (enabled by -r), so it will work with GNU sed but not with older seds.
matthias's response is perfectly adequate, but you could also use a backreference to do this. If you're learning regular expressions, they are a handy thing to know.
Here's how that would be done using a backreference:
sed 's/\(^.*$\)/"\1"/g' file
At the heart of that regex is ^.*$, which means match anything (.*) between the start of the line (^) and the end of the line ($), which effectively means that it matches the whole line every time.
Putting that term inside parentheses creates a group that we can refer back to later on (in the replacement). But for sed to realize that you mean to create a group instead of matching literal parentheses, you have to escape them with backslashes. Thus, we end up with \(^.*$\) as our search pattern.
The replacement is simply a double quote followed by \1, which is our backreference (it refers back to the first group enclosed in parentheses, hence the 1), followed by the closing double quote, giving "\1".
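The approaches above can be compared side by side. This is a sketch with made-up function names and sample input; each function reads lines on standard input.

```shell
# Whole-line match, with & standing for the matched text.
quote_lines_amp() {
    sed 's/.*/"&"/'
}

# Two separate substitutions: one at line start, one at line end.
quote_lines_two() {
    sed -e 's/^/"/' -e 's/$/"/'
}

# Whole-line match captured in a group and reused as \1.
quote_lines_group() {
    sed 's/\(^.*$\)/"\1"/'
}

printf 'one\ntwo words\n' | quote_lines_amp
# prints:
# "one"
# "two words"
```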

Substitution till the end of the line in bash

I have a huge text file with lots of lines like:
asdasdasdaasdasd_DATA_3424223423423423
gsgsdgsgs_DATA_6846343636
.....
For each line, I would like to replace everything after DATA_ up to the end of the line with nothing, so I would get:
asdasdasdaasdasd_DATA_
gsgsdgsgs_DATA_
.....
I know that you can do something similar with:
sed -e "s/^DATA_*$/DATA_/g" filename.txt
but it does not work.
Do you know how?
Thanks
You have two problems: you're unnecessarily matching beginning and end of line with ^ and $, and you're looking for _* (zero or more underscores) instead of .* (zero or more of any character). Here's what you want:
sed -e 's/_DATA_.*/_DATA_/'
The g on the end (global) won't do anything, because you're already going to remove everything from the first instance of "DATA" onward - there can't be another match.
P.S. The -e isn't strictly necessary if you only have one expression, but if you think you might tack more on, it's a convenient habit.
With regular expressions, * means the previous character, any number of times. To match any character, use .
So what you really want is .* which means any character, any number of times, like this:
sed 's/DATA_.*/DATA_/' filename.txt
Also, I removed the ^ which means start of line, since you want to match "DATA_" even if it's not in the beginning of a line.
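A quick check of the substitution on the sample lines from the question. The function name truncate_after_data is hypothetical, and it reads from standard input rather than from filename.txt.

```shell
# Drop everything that follows _DATA_ on each line.
truncate_after_data() {
    sed 's/_DATA_.*/_DATA_/'
}

printf '%s\n' 'asdasdasdaasdasd_DATA_3424223423423423' \
              'gsgsdgsgs_DATA_6846343636' | truncate_after_data
# prints:
# asdasdasdaasdasd_DATA_
# gsgsdgsgs_DATA_
```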
Using awk: set the field delimiter to "_DATA_", then print field 1 ($1) followed by "_DATA_". No regular expression is needed.
$ awk -F"_DATA_" '{print $1"_DATA_"}' file
asdasdasdaasdasd_DATA_
gsgsdgsgs_DATA_