Replace Strings Using Sed And Regex - regex

I'm trying to uncomment file content using sed but with regex (for example: [0-9]{1,5})
# one two 12
# three four 34
# five six 56
The following is working:
sed -e 's/# one two 12/one two 12/g' /file
However, what I would like is to use regex pattern to replace all matches without entering numbers but keep the numbers in the result.

For the sample in the question, simply
sed 's/^# //' file
will suffice, but if you need to remove the comment marker only on lines matching a particular regex, you can use a conditional address:
sed '/regex/s/^# //' file
So every line containing regex will be uncommented (if the line begins with a #)
... where regex could be [0-9] as:
sed '/[0-9]/s/^# //' file
will remove the # at the beginning of every line containing a number, or
sed '/[0-9]/s/^# \?//' file
to make the first space optional (#one two 12), or even
sed '/[0-9]$/s/^# //' file
will remove the # at the beginning of lines whose last character is a digit. Then
sed '/12$/s/^# //' file
will remove the # at the beginning of lines ending in 12. Or
sed '/\b\(two\|three\)\b/s/^# //' file
will remove the # at the beginning of lines containing the word two or three.
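As a quick check of the address form, here is a minimal sketch run against a made-up sample (the last two lines are not from the question). Only lines ending in a digit are candidates for the substitution, and lines without a leading "# " pass through unchanged.

```shell
# Hypothetical sample: two commented lines ending in a number,
# one commented line without a number, and one uncommented line.
sample='# one two 12
# three four 34
# keep me commented
plain line 7'

# The address /[0-9]$/ selects lines ending in a digit;
# the s command then strips a leading "# " on those lines only.
result=$(printf '%s\n' "$sample" | sed '/[0-9]$/s/^# //')
printf '%s\n' "$result"
```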

sed -e 's/^#\s*\(.*[0-9].*\)$/\1/g' filename
should do it.

If you only want those lines uncommented which contain numbers, you can use this (note the escaped \+, since a bare + is a literal character in sed's basic regex syntax):
sed -e 's/^#\s*\(.*[0-9]\+.*\)/\1/g' file
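One detail worth checking with any of these patterns: in sed's default basic regular expressions (BRE) a bare + is an ordinary character, whereas with -E (extended regular expressions) it means "one or more". The sketch below compares the spellings on one sample line; the variable names are only for illustration.

```shell
line='# one two 12'

# BRE: the bare + is literal, so "[0-9]+" needs a digit followed by a "+" sign.
# Nothing matches and the line is left as it was.
bre_plain=$(printf '%s\n' "$line" | sed 's/^# *\(.*[0-9]+.*\)/\1/')

# ERE (-E): + is a quantifier and groups are written without backslashes.
ere=$(printf '%s\n' "$line" | sed -E 's/^# *(.*[0-9]+.*)/\1/')

# Portable BRE spelling of "one or more digits": [0-9][0-9]*
bre_portable=$(printf '%s\n' "$line" | sed 's/^# *\(.*[0-9][0-9]*.*\)/\1/')

printf '%s\n' "$bre_plain" "$ere" "$bre_portable"
```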

Isn't the -i option necessary to make the replacement in the file itself? I can remove the leading # with the following:
sed -i "s/^# \(.*\)/\1/g" file
To uncomment only those commented lines that end in a sequence of at least one digit, I'd use it like this:
sed -i "s/^# \(.*[[:digit:]]\+$\)/\1/g" file
This solution requires commented lines to begin with one space character (right behind the #), but that should be easy to adjust if not applicable.

The following sed command will uncomment lines containing numbers:
sed 's/^#\s*\(.*[0-9]\+.*$\)/\1/g' file

I found it. Thanks to all of you.
echo "# one two 12" | grep "[0-9]" | sed 's/# //g'
or
cat file | grep "[0-9]" | sed 's/# //g'
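For commented lines like the ones in the question, the grep and sed steps can probably be folded into a single sed call. This is a sketch on a made-up sample, not a drop-in replacement: with -n and the p flag, sed prints only the lines where the substitution actually happened, so a line that contains a digit but no "# " would be printed by the pipeline and skipped by the single command.

```shell
sample='# one two 12
# no digits here
# five six 56'

# Two processes: grep filters lines with a digit, sed strips the marker.
piped=$(printf '%s\n' "$sample" | grep '[0-9]' | sed 's/# //g')

# One process: address selects lines with a digit, s///p prints the changed ones.
single=$(printf '%s\n' "$sample" | sed -n '/[0-9]/s/^# //p')

printf '%s\n' "$piped"
```

Both forms drop the lines that do not match, unlike the address-only variants, which leave them in the output.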

Related

How to get the release value?

I've a file with the below name formats:
rzp-QAQ_SA2-5.12.0.38-quality.zip
rzp-TEST-5.12.0.38-quality.zip
rzp-ASQ_TFC-5.12.0.38-quality.zip
I want the value as: 5.12.0.38-quality.zip from the above file names.
I'm trying as below, but not getting the correct value though:
echo "$fl_name" | sed 's#^[-[:alpha:]_[:digit:]]*##'
fl_name is the variable containing the file name.
Thanks a lot in advance!
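You are matching too much by putting alpha, digit, - and _ all in the same character class, so the match runs on into the leading digit of the version number.
You can instead match alphabetic characters and -, optionally followed by _, alphanumerics and a -: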
sed -E 's#^[-[:alpha:]]+(_[[:alnum:]]*-)?##' file
Or you can shorten the first character class, and match a - at the end:
sed -E 's#^[-[:alnum:]_]*-##' file
Output of both examples
5.12.0.38-quality.zip
5.12.0.38-quality.zip
5.12.0.38-quality.zip
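To see the difference, the sketch below runs the expression from the question and the shortened fix on one of the sample names. The variable names are only for illustration.

```shell
fl_name='rzp-QAQ_SA2-5.12.0.38-quality.zip'

# Original attempt: the class also accepts digits, so the match
# continues through "5" and stops only at the first ".".
too_much=$(printf '%s\n' "$fl_name" | sed 's#^[-[:alpha:]_[:digit:]]*##')

# Fix: requiring a trailing "-" forces the match to end at the
# last hyphen that is preceded only by class characters.
fixed=$(printf '%s\n' "$fl_name" | sed -E 's#^[-[:alnum:]_]*-##')

printf '%s\n' "$too_much" "$fixed"
```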
With GNU grep you could try following code. Written and tested with shown samples.
grep -oP '(.*?-){2}\K.*' Input_file
Or, as an alternative, use a non-capturing group (as per the fourth bird's nice suggestion):
grep -oP '(?:[^-]*-){2}\K.*' Input_file
Explanation: grep's -o option prints only the matched part of each line, and -P enables PCRE (a GNU grep feature). The regex (.*?-){2} lazily matches up to and including a -, twice, which consumes everything through the second hyphen. \K then discards what has been matched so far from the reported match, so only the text matched by the trailing .* is printed.
It is much easier to use cut here:
cut -d- -f3- file
5.12.0.38-quality.zip
5.12.0.38-quality.zip
5.12.0.38-quality.zip
If you want sed then use:
sed -E 's/^([^-]*-){2}//' file
5.12.0.38-quality.zip
5.12.0.38-quality.zip
5.12.0.38-quality.zip
Assumptions:
all filenames contain 3 hyphens (-)
the desired result always consists of stripping off the 1st two hyphen-delimited strings
OP wants to perform this operation on a variable
We can eliminate the overhead of sub-process calls (eg, grep, cut and sed) by using parameter substitution:
$ f1_name='rzp-ASQ_TFC-5.12.0.38-quality.zip'
$ new_f1_name="${f1_name#*-}" # strip off first hyphen-delimited string
$ echo "${new_f1_name}"
ASQ_TFC-5.12.0.38-quality.zip
$ new_f1_name="${new_f1_name#*-}" # strip off next hyphen-delimited string
$ echo "${new_f1_name}"
5.12.0.38-quality.zip
On the other hand if OP is feeding a list of file names to a looping construct, and the original file names are not needed, it may be easier to perform a bulk operation on the list of file names before processing by the loop, eg:
while read -r new_f1_name
do
... process "${new_f1_name}"
done < <( command-that-generates-list-of-file-names | cut -d- -f3-)
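A runnable variant of this idea is sketched below. The function list_file_names is a hypothetical stand-in for the command that generates the names, and the loop is fed through a plain pipeline rather than process substitution so that it also runs in a POSIX sh; the trade-off is that variables set inside the loop are not visible after it.

```shell
# Hypothetical stand-in for "command-that-generates-list-of-file-names".
list_file_names() {
    printf '%s\n' \
        'rzp-QAQ_SA2-5.12.0.38-quality.zip' \
        'rzp-TEST-5.12.0.38-quality.zip'
}

# Strip the first two hyphen-delimited fields once, then loop over the results.
processed=$(list_file_names | cut -d- -f3- | while read -r new_fl_name
do
    printf 'processing %s\n' "$new_fl_name"
done)

printf '%s\n' "$processed"
```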
In plain bash:
echo "${fl_name#*-*-}"
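Applied to the three sample names, this expansion can be checked with a short loop. The # operator removes the shortest prefix matching the pattern *-*-, which is everything through the second hyphen.

```shell
stripped=$(for fl_name in \
    rzp-QAQ_SA2-5.12.0.38-quality.zip \
    rzp-TEST-5.12.0.38-quality.zip \
    rzp-ASQ_TFC-5.12.0.38-quality.zip
do
    # Shortest-prefix removal: drops "rzp-<something>-" and keeps the rest.
    printf '%s\n' "${fl_name#*-*-}"
done)

printf '%s\n' "$stripped"
```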
You can reverse each line, take the last two elements separated by "-", and then reverse again (using echo, since fl_name holds the name itself rather than a file to read):
echo "$fl_name" | rev | cut -f1,2 -d'-' | rev
A Perl solution capturing digits and characters trailing a '-'
cat f_name | perl -lne 'chomp; /.*?-(\d+.*?)\z/g;print $1'

Replacing a certain number of characters after a match using sed

I have a file test.txt that looks something like this:
something=1something-else=234another-something=5678
I would like to replace something-else=234 with something-else=***, for example, but the only information I have is the "match" that is something-else= and that there are exactly THREE characters after the equals sign. Currently I have this command that replaces everything on the line after the match:
sed -i -e 's/\(something-else=\).*/\1***/' test.txt
Result: something=1something-else=***
How can I adapt it to only replace three characters instead of the entire rest of the line?
You're looking for
sed -i -e 's/\(something-else=\).\{3\}/\1***/' test.txt
or, equivalently,
sed -i -e 's/\(something-else=\).../\1***/' test.txt
How can I adapt it to only replace three characters instead of the entire rest of the line?
You can use:
sed 's/\(something-else=\).../\1***/' file
something=1something-else=***another-something=5678
Here ... will match exactly 3 characters after something-else=.
You can also use a numbered quantifier:
sed -E 's/(something-else=).{3}/\1***/' file
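A small sketch of the interval form on the line from the question, plus a made-up second line showing what happens when fewer than three characters follow the key: the pattern then does not match and the line is left alone.

```shell
line='something=1something-else=234another-something=5678'
short='something-else=12'   # hypothetical input with only two characters after "="

# .\{3\} consumes exactly three characters after the captured key.
masked=$(printf '%s\n' "$line" | sed 's/\(something-else=\).\{3\}/\1***/')

# With only two characters available, the pattern cannot match.
unchanged=$(printf '%s\n' "$short" | sed 's/\(something-else=\).\{3\}/\1***/')

printf '%s\n' "$masked" "$unchanged"
```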

sed delete trailing pattern of digits

I have a .txt file where the last column includes a number pattern after the text like 'Baker 2-13' or 'Charlie 03-144.' I would like to remove all the digits at the end of the line, and just be left with Baker and Charlie. I have tried piping the sed command at the end of my awk statement, with no success.
sed -E 's/[0-9]{1,2}"-"[0-9]{1,3}$//'
I've tried adding the space and carriage returns to my sed command, but still no luck.
sed -E 's/[0-9]{1,2}"-"[0-9]{1,3}\s\r$//'
I've also tried this, but it only works when I echo a text sample, it doesn't work on each line of my .txt file
echo "CHARLIE 02-157" | sed -E 's/[0-9]*([0-9])+\-[0-9]*([0-9])+$//'
Any ideas?
This should work:
sed -i.bak -E 's/[0-9]{1,2}-[0-9]{1,3}$//' file
cat file
Baker
Charlie
You don't need to quote hyphen in the pattern.
Simple sed solution
sed 's/[- 0-9]*$//'
This will delete trailing dashes, blanks and numbers!
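The question mentions carriage returns, so one possible reason the pattern works with echo but not on the file is Windows line endings: a CR before the newline keeps $ from matching right after the digits. That is an assumption, not something the question confirms. A sketch that tolerates an optional CR (and, as an addition of mine, also removes the blanks before the number) could look like this:

```shell
# A literal carriage return, kept in a variable for portability
# (not every sed understands \r inside a regex).
cr=$(printf '\r')

# Hypothetical input with CRLF line endings.
crlf_input() { printf 'Baker 2-13\r\nCharlie 03-144\r\n'; }

# Without CR handling the digits are not at end of line, so nothing is removed.
plain=$(crlf_input | sed -E 's/ *[0-9]{1,2}-[0-9]{1,3}$//')

# Allow an optional CR before the end of the line and remove it as well.
result=$(crlf_input | sed -E "s/ *[0-9]{1,2}-[0-9]{1,3}${cr}?\$//")

printf '%s\n' "$result"
```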

how to select lines containing several words using sed?

I am learning using sed in unix.
I have a file with many lines and I wanna delete all lines except lines containing strings(e.g) alex, eva and tom.
I think I can use
sed '/alex|eva|tom/!d' filename
However I find it doesn't work, it cannot match the line. It just match "alex|eva|tom"...
Only
sed '/alex/!d' filename
works.
Anyone know how to select lines containing more than 1 words using sed?
plus, with parenthesis like "sed '/(alex)|(eva)|(tom)/!d' file" doesn't work, and I wanna the line containing all three words.
sed is an excellent tool for simple substitutions on a single line, for anything else just use awk:
awk '/alex/ && /eva/ && /tom/' file
delete all lines except lines containing strings(e.g) alex, eva and tom
As worded, you're asking to preserve lines containing all of those words, but your samples preserve lines containing any of them. Just in case "all" wasn't a slip: a single regular expression can't conveniently express an any-order search, but fortunately sed lets you chain multiple matches:
sed -n '/alex/{/eva/{/tom/p}}'
or you could just delete them serially:
sed '/alex/!d; /eva/!d; /tom/!d'
The above works on GNU/anything systems, with BSD-based userlands you'll have to insert a bunch of newlines or pass them as separate expressions:
sed -n '/alex/ {
/eva/ {
/tom/ p
}
}'
or
sed -e '/alex/!d' -e '/eva/!d' -e '/tom/!d'
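To confirm that the chained deletes keep only lines containing all three words, here is a minimal sketch on a made-up three-line sample, with the awk form shown for comparison:

```shell
sample='alex met eva
tom, eva and alex
tom alone'

# Each expression deletes lines that lack one of the words,
# so only lines containing all three survive.
all_sed=$(printf '%s\n' "$sample" | sed -e '/alex/!d' -e '/eva/!d' -e '/tom/!d')

# Same condition written as an awk pattern.
all_awk=$(printf '%s\n' "$sample" | awk '/alex/ && /eva/ && /tom/')

printf '%s\n' "$all_sed"
```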
You can use:
sed -r '/alex|eva|tom/!d' filename
OR on Mac:
sed -E '/alex|eva|tom/!d' filename
Use -i.bak for inline editing so:
sed -i.bak -r '/alex|eva|tom/!d' filename
You should be using \| instead of |.
Edit: Looks like this is true for some variants of sed but not others.
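The difference between the two syntaxes can be seen on a made-up sample. Without -E the | is an ordinary character, so only a line containing the literal text alex|eva|tom matches; with -E it is alternation and any one of the words is enough.

```shell
sample='alex met eva
nobody here
literal alex|eva|tom'

# Basic regex: "|" is literal, so only the third line is kept.
bre=$(printf '%s\n' "$sample" | sed '/alex|eva|tom/!d')

# Extended regex: "|" is alternation, so any of the three words keeps the line.
ere=$(printf '%s\n' "$sample" | sed -E '/alex|eva|tom/!d')

printf '%s\n' "$ere"
```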
This might work for you (GNU sed):
sed -nr '/alex/G;/eva/G;/tom/G;s/\n{3}//p' file
This method also lets you require a range of matches: if, for example, you want lines containing 2 or more of the listed words, use:
sed -nr '/alex/G;/eva/G;/tom/G;s/\n{2,3}//p' file
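The counting trick works because each G appends a newline plus the (empty) hold space, so the number of trailing newlines equals the number of words found. A sketch on a made-up sample, written with -E instead of -r and assuming GNU sed as in the answer:

```shell
sample='alex met eva
tom, eva and alex
tom alone'

# One newline is appended per matching word; the substitution succeeds
# (and the line is printed) only if 2 or 3 newlines were collected.
two_or_more=$(printf '%s\n' "$sample" |
    sed -nE '/alex/G;/eva/G;/tom/G;s/\n{2,3}//p')

printf '%s\n' "$two_or_more"
```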

sed misbehaving?

I have the following command:
$ xlscat -i $file
and I get:
Excel File Name.xlsx - 01: [ Sheet #1 ] 34 Cols, 433 Rows
Excel File Name.xlsx - 02: [ Sheet Number2 ] 23 Cols, 32 Rows
Excel File Name.xlsx - 03: [ Foo Factor! ] 14 Cols, 123 Rows
I want just the sheet name, so i do this:
$ xlscat -i $file 2>&1 | sed -e 's/.*\[ *\(.*\) *\].*/\1/' | while read file
> do
> echo "File: '$file'"
> done
And get this:
File: 'Sheet #1'
File: 'Sheet Number2'
File: 'Foo Factor!'
Great! Everything works beautifully. As you can see with the single quotes, I've removed the extra spaces at the end of the file name. Now convert all remaining spaces to underscores:
$ xlscat -i $file 2>&1 | sed -e 's/.*\[ *\(.*\) *\].*/\1/' | sed -e 's/ /_/g' | while read file
> do
> echo "File: '$file'"
> done
Now I get this:
File: 'Sheet_#1_____'
File: 'Sheet_Number2'
File: 'Foo_Factor!__'
Huh? The first one didn't show any trailing blanks, but the second one seems to be appending underscores on the end of the file. What am I not seeing?
The first sed command is not stripping the trailing whitespace, read is. Check your expression:
sed -e 's/.*\[ *\(.*\) *\].*/\1/'
It matches:
anything
a bracket
0 or more spaces
anything, captured
0 or more spaces
a right bracket
anything
The regular expressions are greedy, meaning that they match as much as possible, and the earlier expressions will match before later ones do. So for example, the regular expression (.*)(.*) matches anything in two capturing groups, but there are any number of ways the data could be split between the two groups. So the regex implementation has to choose, and it will put as much as possible in the first, and nothing in the second. In your expression that means the captured \(.*\) swallows the trailing spaces and the " *" after it matches nothing.
Since you need to match filenames with spaces in them, you can't match "anything except a space"; your best bet is to trim the trailing whitespace as a separate step. Try this sed command instead:
sed -e 's/.*\[ *\(.*\) *\].*/\1/' -e 's/ *$//'
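The greedy capture can be made visible by printing the result between brackets. This sketch uses one line shaped like the xlscat output, with five spaces before the closing bracket as an assumed amount of padding:

```shell
line='Excel File Name.xlsx - 01: [ Sheet #1     ] 34 Cols, 433 Rows'

# The captured .* is greedy, so it takes the padding before "]" as well.
captured=$(printf '%s\n' "$line" | sed -e 's/.*\[ *\(.*\) *\].*/\1/')

# A second expression trims the trailing spaces afterwards.
trimmed=$(printf '%s\n' "$line" | sed -e 's/.*\[ *\(.*\) *\].*/\1/' -e 's/ *$//')

printf '[%s]\n[%s]\n' "$captured" "$trimmed"
```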
I think the read file is trimming the trailing whitespace for you. Try putting the
sed -e 's/ /_/g'
inside the while loop ... like:
echo "File: $(echo $file | sed -e 's/ /_/g')"
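That read is the one doing the trimming can be checked directly. With the default IFS, read strips leading and trailing whitespace from the line; setting IFS to the empty string for that one command preserves it. A minimal sketch with a made-up padded value:

```shell
line='Sheet #1     '

# Default IFS: read drops the trailing spaces.
trimmed=$(printf '%s\n' "$line" | { read -r name; printf '[%s]' "$name"; })

# Empty IFS for this read only: the spaces are kept.
kept=$(printf '%s\n' "$line" | { IFS= read -r name; printf '[%s]' "$name"; })

printf '%s\n' "$trimmed" "$kept"
```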
Could it be echo that's stripping the trailing spaces? Although it does seem like they should show up inside the quotes. Anyway, try this:
sed -e 's/.*\[ *\([^] ]\+\( \+[^] ]\+\)*\).*/\1/'
Each word of the sheet name is matched by [^] ]\+ (i.e., one or more of any characters other than space or ]). When the final word of the name has been matched, the second .* consumes the rest of the line. There's no need to match the closing ], so the trailing spaces don't have to be included in the match.
I'm not a sed user, but this regex works correctly in RegexBuddy when I specify the GNU-BRE flavor, so it should work in sed.
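For what it's worth, the same word-by-word idea can be tried in sed with -E, which avoids the backslashed \+ and \( \) of the version above. This is a sketch on two lines shaped like the xlscat output, with assumed padding before the closing bracket:

```shell
sample='Excel File Name.xlsx - 01: [ Sheet #1     ] 34 Cols, 433 Rows
Excel File Name.xlsx - 03: [ Foo Factor!  ] 14 Cols, 123 Rows'

# [^] ]+ matches one word (no spaces, no "]"); further words are allowed
# only when separated by spaces, so trailing padding is never captured.
names=$(printf '%s\n' "$sample" | sed -E 's/.*\[ *([^] ]+( +[^] ]+)*).*/\1/')

printf '[%s]\n' "$names"
```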