Help with SED syntax : unterminated `s' command

Edit: I'm using Cygwin/GNU sed version 4.1.5 on Windows Vista and I want a case-insensitive search.
I want to use sed to replace inline, the following:
c:\DEV\Suite\anything here --- blah 12 334 xxx zzzzz etc\Modules etc
Edit: anything here --- blah 12 334 xxx zzzzz etc means anything could appear here. Sorry for omitting that.
In a file with lines like
FileName="c:\DEV\Suite\anything here --- blah 12 334 xxx zzzzz etc\Modules\.... snipped ...."
with a value I supply, say :
Project X - Version 99.98
So the file ends up with:
FileName="c:\DEV\Suite\Project X - Version 99.98\Modules\.... snipped ...."
My attempt:
c:\temp>sed -r -b s/Dev\\Suite\\.*\\Modules/dev\\suite\\simple\\/g test.txt
However I get the following error:
sed: -e expression #1, char 42: unterminated `s' command
Thanks.
Edit:
I've already tried adding quotes.

It's the '\\' before the '/'. Apparently you need 4 backslashes.
sed -r -b "s/Dev\\\\Suite\\\\.*\\\\Modules/dev\\\\suite\\\\simple\\\\/g" test.txt
I think the shell is interpreting the '\\' into a '\' before passing it to sed, and then sed is doing the same thing on what it gets.
Single quotes would work, so:
sed -r -b 's/Dev\\Suite\\.*\\Modules/dev\\suite\\simple\\/g' test.txt

If I use "\\\" where you have "\\", it works for me. With the double backslashes, the way it gets parsed evidently has a backslash escaping the terminating "/" of the substitution expression. (I still get the error if I replace ".*" with ".+".)
(Amusingly, I had to add more backslashes to get this to post properly -- SO ate a few of them!)

Got it: Replace the .* with .+
sed -r -b s/Dev\\Suite\\.+\\Modules/dev\\suite\\simple\\/g test.txt

I don't know what version of sed you're using. I'm not familiar with the -b option.
First, I'd suggest using the i regex flag to make the match case-insensitive. Your example of DEV won't match your regex of Dev.
I suspect the problem you're running into is how your version of sed interprets backslash characters.
I'd suggest using the sed bundled with Cygwin. With single quotes, it seems to work for me.
echo 'c:\DEV\Suite\anything here --- blah 12 334 xxx zzzzz etc\Modules\' | sed -r 's/Dev\\Suite\\.*\\Modules/dev\\suite\\simple\\/gi'
c:\dev\suite\simple\\
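Putting the suggestions in this thread together, here is a minimal sketch, assuming GNU sed run from a POSIX shell such as Cygwin bash rather than cmd.exe. It keeps the \Modules part and inserts the supplied value between the two anchors. The sample line and the file name example.txt at the end of the path are invented for illustration, and the I flag is a GNU extension.

```shell
# Hypothetical sample line; the tail after \Modules\ is invented for the demo.
line='FileName="c:\DEV\Suite\anything here --- blah 12 334 xxx zzzzz etc\Modules\example.txt"'

# Single quotes stop the shell from touching the backslashes, so sed sees
# \\ (one literal backslash) exactly as written.
# I = case-insensitive match (GNU sed extension).
result=$(printf '%s\n' "$line" |
  sed -r 's/(Dev\\Suite\\).*(\\Modules)/\1Project X - Version 99.98\2/I')

printf '%s\n' "$result"
```

Because the two anchors are captured and put back with \1 and \2, the original casing of the path is preserved.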

well...
sed -e s/"anything here --- blah 12 334 xxx zzzzz etc"/"Project X - Version 99.98"/g test.txt
worked fine
(The complaint about the unterminated 's' was because of the unescaped '/')

Funny I was having the same issue in one directory but the same command worked in other directories on the same machine. This is the command I was working with
export version=$(grep "version.*SNAPSHOT.*version" pom.xml | sed -e 's|<version>||g' | sed -e 's|</version>||g' | sed -e "s|\t* *||g"); cat sonar-project.properties.template | sed -e "s/BUILDVERSION/$version/g" > sonar-project.properties
when I changed the * to + it worked.
Thanks :-)

Will rename:
TV Show - 376 [720p].mkv
TV Show - 377 [720p].mkv
to
376.mkv
377.mkv
works under cygwin.
#!/bin/bash
for i in *; do
mv "$i" "`echo $i | sed -r -b 's/^.*[ ]([0-9]*)[ ].*$/\1.mkv/'`";
done
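Before renaming real files it can help to preview the new names. The following is only a sketch, assuming a sed that supports -E (GNU sed 4.2 or later, or BSD sed); the helper name new_name is made up for this example, and nothing is moved or renamed.

```shell
# Hypothetical helper: print the new name for one file name.
# [0-9]+ requires at least one digit between the two spaces.
new_name() {
  printf '%s\n' "$1" | sed -E 's/^.* ([0-9]+) .*$/\1.mkv/'
}

# Dry run: show the mapping without calling mv.
for name in 'TV Show - 376 [720p].mkv' 'TV Show - 377 [720p].mkv'; do
  printf '%s -> %s\n' "$name" "$(new_name "$name")"
done
```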

Related

Removing bullet point characters from text file with sed

I have a large text file in which some lines start with a bullet point (•). I'd like to remove those. I've tried
sed 's/\u2022//g' filename.txt
but that doesn't match the bullets. I've also tried pasting the bullet into my sed command, but also with no success.
E: The output of
sed --version
is
sed (GNU sed) 4.2.2
E2: If it helps figure out how to capture the bullet characters, they were originally added in Access.
E3: As suggesting in the comments,
echo -n '•' | hexdump -C
returns
00000000 95 |.|
00000001

I suggest with GNU sed:
sed 's/\x95//g' file

This is a working command for me:
# Force paste the bullet into the command line
sed 's/^•//g' filename.txt
If it doesn't work, try escaping with echo:
sed 's/^'"$(echo -ne '\u2022')"'//g' filename.txt
As PesaThe suggests, you can also use printf for escaping:
sed 's/^'"$(printf '\u2022')"'//g' filename.txt

It looks like sed doesn't understand \u sequences.
According to the user manual it should be compatible with POSIX.2 BRE, which I think should work, but it doesn't.
You can try matching the hexadecimal byte sequence (which I got using hexdump -C).
sed 's/^\xe2\x80\xa2//g' filename.txt
Or, alternatively, you could force bash to parse it. Just add a $ before the string.
sed $'s/\u2022//g' filename.txt
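The hexdump in the question shows a single byte 0x95, which is how Windows-1252 encodes the bullet, whereas U+2022 in UTF-8 is the three bytes e2 80 a2. A sketch that handles both, assuming GNU sed (for the \xHH escapes) and forcing the C locale so that sed works on raw bytes; the sample text is invented:

```shell
# Sample input: a Windows-1252 bullet (byte 0x95, octal 225), a UTF-8 bullet
# (bytes e2 80 a2, octal 342 200 242), and a line with no bullet.
cleaned=$(printf '\225 first\n\342\200\242 second\nthird\n' |
  LC_ALL=C sed 's/^\x95 *//; s/^\xe2\x80\xa2 *//')

printf '%s\n' "$cleaned"
```

Each expression removes a leading bullet in one encoding together with any spaces that follow it.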

Using sed for extracting substring from string

I just started using sed for regex work. I want to extract XXXXXX from *****/XXXXXX>, so I tried:
sed -n "/^/*/(\S*\).>$/p"
When I run it I get the following error:
sed: 1: "/^//(\S).>$/p": invalid command code *
I am not sure what I am missing here.

Try:
$ echo '*****/XXXXXX>' | sed 's|.*/||; s|>.*||'
XXXXXX
The substitute command s|.*/|| removes everything up to the last / in the string. The substitute command s|>.*|| removes everything from the first > in the string that remains to the end of the line.
Or:
$ echo '*****/XXXXXX>' | sed -E 's|.*/(.*)>|\1|'
XXXXXX
The substitute command s|.*/(.*)>|\1| captures whatever is between the last / and the last > and saves it in group 1. That is then replaced with group 1, \1.

In my opinion awk is better suited to this task. With -F you can specify multiple delimiters, such as "/" and ">":
echo "*****/XXXXXX>" | awk -F'/|>' '{print $2}'
Of course you could use sed, but it's harder to read. First I remove the leading part (up to the last "/") and then the trailing ">":
echo "*****/XXXXXX>" | sed -e 's/.*[/]//' -e 's/>//'
Both give the expected result: XXXXXX.

with grep if you have pcre option
$ echo '*****/XXXXXX>' | grep -oP '/\K[^>]+'
XXXXXX
/\K matches the / and then discards it from the reported match (similar in effect to a lookbehind)
[^>]+ characters other than >

echo '*****/XXXXXX>' |sed 's/^.*\/\|>$//g'
XXXXXX
Match from the start of the line up to the last /, OR match a > followed by end of line; whichever of these is found is replaced with nothing.
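For comparison, here is a sketch of what the original attempt was probably aiming for. In the question's command the unescaped / inside the pattern ends the address early, which leaves * sitting where sed expects a command. Switching to s with | as the delimiter avoids that. The helper name extract is hypothetical.

```shell
# Hypothetical helper: print only the text between the last / and a
# trailing >; lines that do not match print nothing (-n plus the p flag).
extract() {
  sed -n 's|^.*/\([^>]*\)>$|\1|p'
}

result=$(printf '%s\n' '*****/XXXXXX>' 'no match here' | extract)
printf '%s\n' "$result"
```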

Escape dollar sign in regexp for sed

I will introduce what my question is about before actually asking - feel free to skip this section!
Some background info about my setup
To update files manually in a software system, I am creating a bash script to remove all files that are not present in the new version, using diff:
for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g"); do echo "rm -f $i" >> REMOVEOLDFILES.sh; done
This works fine. However, apparently my files often have a dollar sign ($) in the filename, this is due to some permutations of the GWT framework. Here is one example line from the above created bash script:
rm -f var/lib/tomcat7/webapps/ROOT/WEB-INF/classes/ExampleFile$3$1$1$1$2$1$1.class
Executing this script would not remove the wanted files, because bash reads these as argument variables. Hence I have to escape the dollar signs with "\$".
My actual question
I now want to add a sed command to the aforementioned pipeline that replaces this dollar sign. sed also treats the dollar sign as a special character in regular expressions, so obviously I have to escape it there as well.
But somehow this doesn't work and I could not find an explanation after googling a lot.
Here are some variations I have tried:
echo "Bla$bla" | sed "s/\$/2/g" # Output: Bla2
echo "Bla$bla" | sed 's/$$/2/g' # Output: Bla
echo "Bla$bla" | sed 's/\\$/2/g' # Output: Bla
echo "Bla$bla" | sed 's/#"\$"/2/g' # Output: Bla
echo "Bla$bla" | sed 's/\\\$/2/g' # Output: Bla
The desired output in this example should be "Bla2bla".
What am I missing?
I am using GNU sed 4.2.2
EDIT
I just realized, that the above example is wrong to begin with - the echo command already interprets the $ as a variable and the following sed doesn't get it anyway... Here a proper example:
Create a textfile test with the content bla$bla
cat test gives bla$bla
cat test | sed "s/$/2/g" gives bla$bla2
cat test | sed "s/\$/2/g" gives bla$bla2
cat test | sed "s/\\$/2/g" gives bla2bla
Hence, the last version is the answer. Remember: when testing, first make sure your test is correct, before you question the test object........

The correct way to escape a dollar sign in regular expressions for sed is double-backslash. Then, for creating the escaped version in the output, we need some additional slashes:
cat filenames.txt | sed "s/\\$/\\\\$/g" > escaped-filenames.txt
Yep, that's four backslashes in a row. This creates the required changes: a filename like bla$1$2.class would then change to bla\$1\$2.class.
This I can then insert into the full pipeline:
for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g" | sed "s/\\$/\\\\$/g"); do echo "rm -f $i" >> REMOVEOLDFILES.sh; done
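The backslash counting gets simpler with single quotes, because the shell then passes the expression to sed untouched. A small sketch of the same substitution, using an invented file name as input:

```shell
# In single quotes sed receives s/\$/\\$/g exactly as typed:
#   \$   in the pattern matches a literal dollar sign
#   \\$  in the replacement produces a backslash followed by a dollar sign
escaped=$(printf '%s\n' 'bla$1$2.class' | sed 's/\$/\\$/g')

printf '%s\n' "$escaped"
```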
Alternative to solve the background problem
chepner posted an alternative that solves the background problem by simply adding single quotes around the filenames in the output. This way, the $ signs are not read as variables by bash when executing the script and the files are also properly removed:
for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g"); do echo "rm -f '$i'" >> REMOVEOLDFILES.sh; done
(note the changed echo "rm -f '$i'" in that line)

There are other problems with your script, but file names containing $ are not a problem if you properly quote the argument to rm in the resulting script.
echo "rm -f '$i'" >> REMOVEOLDFILES.sh
or using printf, which makes quoting a little nicer and is more portable:
printf "rm -f '%s'\n" "$i" >> REMOVEOLDFILES.sh
(Note that I'm addressing the real problem, not necessarily the question you asked.)

There is already a nice answer directly in the edited question that helped me a lot - thank you!
I just want to add a bit of curious behavior that I stumbled across: matching against a dollar sign at the end of lines (e.g. when modifying PS1 in your .bashrc file).
As a workaround, I match for additional whitespace.
$ DOLLAR_TERMINATED="123456 $"
$ echo "${DOLLAR_TERMINATED}" | sed -e "s/ \\$/END/"
123456END
$ echo "${DOLLAR_TERMINATED}" | sed -e "s/ \\$$/END/"
sed: -e expression #1, char 13: Invalid back reference
$ echo "${DOLLAR_TERMINATED}" | sed -e "s/ \\$\s*$/END/"
123456END
Explanation to the above, line by line:
Defining DOLLAR_TERMINATED - I want to replace the dollar sign at the end of DOLLAR_TERMINATED with "END"
It works if I don't check for the line ending
It won't work if I match for the line ending as well (adding one more $ on the left side)
It works if I additionally match for (non-present) whitespace
(My sed version is 4.2.2 from February 2016, bash is version 4.3.48(1)-release (x86_64-pc-linux-gnu), in case that makes any difference)
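One plausible explanation for the error, offered as an assumption rather than something confirmed in the thread: inside double quotes the shell expands $$ to its own process ID, so sed receives a backslash followed by digits and reads it as a back reference. With single quotes the expression reaches sed unchanged, as in this sketch:

```shell
dollar_terminated='123456 $'

# Single quotes: no shell expansion, so sed sees  s/ \$$/END/
#   \$  matches a literal dollar sign
#   $   at the end of the pattern anchors the match at end of line
out=$(printf '%s\n' "$dollar_terminated" | sed -e 's/ \$$/END/')

printf '%s\n' "$out"
```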

replace string to asterisk bash

I am trying to get a path from the user as input.
The user will enter a specific path for specific application:
script.sh /var/log/dbhome_1/md5
I want to convert the directory number (in this case, 1) to * (asterisk). Later on, the script will apply some logic to this path.
When I try sed on the input, I'm stuck with the number:
echo "/var/log/dbhome_1/md5" | sed "s/dbhome_*/dbhome_\*/g"
and the output will be -
/var/log/dbhome_*1/md5
I know that I have some problem with the asterisk acting as a wildcard rather than as a literal character...
Maybe a regex will help here?

Code for GNU sed:
sed "s#1/#\*/#"
$ echo "/var/log/dbhome_1/md5" | sed "s#1/#\*/#"
/var/log/dbhome_*/md5
Or more general:
sed "s#[0-9]\+/#\*/#"
$ echo "/var/log/dbhome_1234567890/md5" | sed "s#[0-9]\+/#\*/#"
/var/log/dbhome_*/md5

use this instead:
echo "/var/log/dbhome_1/md5" | sed "s/dbhome_[0-9]\+/dbhome_\*/g"
[0-9] is a character class that contains all digits
Thus [0-9]\+ matches one or more digits
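Since \+ is a GNU extension in basic regular expressions, a portable variant spells "one or more digits" as [0-9][0-9]*. A minimal sketch with an invented path:

```shell
# [0-9][0-9]* = one digit followed by zero or more digits (POSIX BRE).
# The * in the replacement is literal and needs no escaping there.
masked=$(printf '%s\n' '/var/log/dbhome_1234567890/md5' |
  sed 's/dbhome_[0-9][0-9]*/dbhome_*/')

printf '%s\n' "$masked"
```

The single quotes also keep the shell from expanding the * before sed sees it.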

If your script is in bash (which I assume when I see the tag, but I also doubt it when I see its name script.sh which seems to have the wrong extension for a bash script), you might as well use pure bash stuff: /var/log/dbhome_1/md5 will very likely be in positional parameter $1, and what you want will be achieved by:
echo "${1//dbhome_+([[:digit:]])/dbhome_*}"
If this seems to fail, it's probably because your extglob shell optional behavior is turned off. In this case, just turn it on with
shopt -s extglob
Demo:
$ shopt -s extglob
$ a=/var/log/dbhome_1234567/md5
$ echo "${a//dbhome_+([[:digit:]])/dbhome_*}"
/var/log/dbhome_*/md5
$
Done!

How to replace space with comma using sed?

I would like to replace the space between each field with a comma delimiter. Could someone let me know how I can do this? I tried the command below but it doesn't work. Thanks.
My command:
:%s//,/
53 51097 310780 1
56 260 1925 1
68 51282 278770 1
77 46903 281485 1
82 475 2600 1
84 433 3395 1
96 212 1545 1
163 373819 1006375 1
204 36917 117195 1

If you are talking about sed, this works:
sed -e "s/ /,/g" < a.txt
In vim, use same regex to replace:
s/ /,/g

Inside vim, you want to type when in normal (command) mode:
:%s/ /,/g
On the terminal prompt, you can use sed to perform this on a file:
sed -i 's/\ /,/g' input_file
Note: the -i option to sed means "in-place edit", as in that it will modify the input file.

I know it's not exactly what you're asking, but, for replacing a comma with a newline, this works great:
tr , '\n' < file

Try the following command and it should work out for you.
sed "s/\s/,/g" orignalFive.csv > editedFinal.csv

IF your data includes an arbitrary sequence of blank characters (tab, space), and you want to replace each sequence with one comma, use the following:
sed -r 's/[\t ]+/,/g' input_file
or
sed -r 's/[[:blank:]]+/,/g' input_file
If you want to replace each sequence of whitespace characters, which also includes characters such as carriage return and form feed, then use the following:
sed -r 's/[[:space:]]+/,/g' input_file
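A quick way to check the [[:blank:]] variant is to feed it a couple of rows with mixed spacing. This is only a sketch, assuming a sed that accepts -E (GNU sed 4.2 or later, or BSD sed); the second row deliberately uses several spaces and a tab.

```shell
# Two sample rows: single spaces in the first, a run of spaces and a tab
# in the second. Each run of blanks collapses to exactly one comma.
csv=$(printf '53 51097 310780 1\n56   260\t1925 1\n' |
  sed -E 's/[[:blank:]]+/,/g')

printf '%s\n' "$csv"
```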

If you want the output on the terminal, then:
$ sed 's/ /,/g' filename.txt
But if you want to edit the file itself, i.e. replace the spaces with commas in the file, then:
$ sed -i 's/ /,/g' filename.txt

I just confirmed that:
cat file.txt | sed "s/\s/,/g"
successfully replaces spaces with commas in Cygwin terminals (mintty 2.9.0). None of the other samples worked for me.

On Linux use below to test (it would replace the whitespaces with comma)
sed 's/\s/,/g' /tmp/test.txt | head
later you can take the output into the file using below command:
sed 's/\s/,/g' /tmp/test.txt > /tmp/test_final.txt
PS: /tmp/test.txt is the file you want to process
