How to replace spaces inside double quotes

I need to replace every space inside double quotes in a variable:
VAR='"this is my problem" but not yours'
Now I have to replace the spaces (there may be more than one in a row) in "this is my problem" with '[[:space:]]+'. My shell is busybox. What is the simplest way?
Thank you.

Try this: echo "$VAR" | sed -re 's|.*("[^"]*").*|\1|' | sed -re 's|[[:space:]]+|[[:space:]]+|g'. The first sed extracts the double-quoted part (the rest of the line is discarded); the second sed turns each run of whitespace inside the double quotes into [[:space:]]+.
PS. You could have saved two hours if you put all the stuff in the comments into your original question from the start :)
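If the text outside the quotes has to be kept as well, one possible alternative is to split the line on double quotes with awk, so that every even-numbered field is a quoted part. This is only a sketch: it assumes the quotes are balanced and never escaped, and that the busybox build includes awk. The names VAR and result are illustrative.

```shell
# Sample value from the question, with a double space added after "is"
# to show that a run of spaces becomes a single [[:space:]]+ token.
VAR='"this is  my problem" but not yours'

# Split on double quotes: fields 2, 4, 6, ... are the quoted parts.
# Assigning to a field rebuilds the line with OFS, which restores the quotes.
result=$(printf '%s\n' "$VAR" |
  awk -F'"' -v OFS='"' '{
    for (i = 2; i <= NF; i += 2)
      gsub(/[[:space:]]+/, "[[:space:]]+", $i)
    print
  }')

printf '%s\n' "$result"
# "this[[:space:]]+is[[:space:]]+my[[:space:]]+problem" but not yours
```

Unlike the two-step sed pipeline, this leaves the unquoted remainder of the line untouched.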

Related

add a command in the beginning of quotation mark by using bash

I would like to use sed to do the following steps:
before: test="testabc"
after: test="quiz testabc"
How can I add quiz followed by a space right after the opening quotation mark?
Thank you for your help.
You can use the following sed command:
$ echo 'test="testabc"' | sed 's/="\([^"]*\)"/="quiz \1"/'
test="quiz testabc"
Explanations:
s/pattern/replacement/ runs sed in search-and-replace mode.
="\([^"]*\)" matches ="some string".
The capturing group \( \) and the back-reference \1 keep the string content, so that quiz can be inserted in front of it.
s/pattern/replacement/g replaces every occurrence of the pattern on a line instead of only the first one.
Alternatively, the following perl solution works as well ($1 is the preferred way to refer to a capture group in the replacement):
$ echo 'test="testabc"' | perl -pe 's/(?<==")([^"]*)(?=")/quiz $1/'
test="quiz testabc"
For regex details: http://www.rexegg.com/regex-quickstart.html
Improvements:
sed 's/test="\([^"]*\)"/test="quiz \1"/' adds the variable name to the pattern, to be sure to change only that variable.
sed 's/="/&quiz /g' is enough if you don't care about the variable names and want to change every assignment.
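To make the difference between the two improvements concrete, here is a small sketch that runs both on the same two-line input. The second assignment (other="value") is made up for illustration and is not part of the question.

```shell
# Two assignments; only the first one comes from the question.
input='test="testabc"
other="value"'

# Variant 1: the pattern names the variable, so only test= is changed.
only_test=$(printf '%s\n' "$input" | sed 's/test="\([^"]*\)"/test="quiz \1"/')

# Variant 2: & stands for the whole match (="), so every assignment is changed.
every=$(printf '%s\n' "$input" | sed 's/="/&quiz /g')

printf '%s\n---\n%s\n' "$only_test" "$every"
# test="quiz testabc"
# other="value"
# ---
# test="quiz testabc"
# other="quiz value"
```

Note that variant 1 is not anchored, so it would also match a variable whose name merely ends in test.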

Using regex and sed to replace a string inside of a file

I have the following string inside a text file:
{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}
I would like to replace all the ' that are inside the ctx._source.meta='' part with \' using sed.
In the example above I have This is a ' test string Peedr, which I would like to convert to This is a \' test string Peedr, so the desired output would be:
{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a \' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}
I'm using the following regex to find the ' that is inside the ctx._source.meta string (3rd capture group):
(meta=')(.*?)(')(.*?)(')
I have the regex, but I don't know how to use the sed command to replace the 3rd capture group with \'.
Can someone give me a hand and tell me the sed command I have to use?
Thanks in advance
sed generally does not support the Perl regex extensions, so the non-greedy .*? will probably not do what you hope. If you want to use Perl regex, use Perl!
perl -pe "s/(meta='.*?)(')(.*?')/\$1\\\\\$2\$3/"
This will still not necessarily work if the input is malformed; a better approach would be to specifically exclude single quotes from the match, and then you don't need the non-greedy matching.
sed "s/\\(meta='[^']*\\)'\\([^']*'\\)/\\1\\\\'\\2/"
In both cases, the number of backslashes required to escape the backslashes inside the shell's double quotes is staggering.
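One way to sidestep the shell escaping, sketched here under the assumption that a temporary file is acceptable, is to put the sed script in a file written from a quoted here-document and run it with sed -f. Nothing inside a quoted here-document is interpreted by the shell, so the script needs only the backslashes that sed itself requires. The sample line is shortened from the question, and the variable names are illustrative.

```shell
# Shortened sample line; the rest of the JSON from the question is omitted.
line="{\"inline\":\"ctx._source.meta='This is a ' test string Peedr'\"}"

# Write the sed script verbatim; the quoted 'EOF' disables all shell expansion.
script=$(mktemp)
cat > "$script" <<'EOF'
s/\(meta='[^']*\)'\([^']*'\)/\1\\'\2/
EOF

escaped=$(printf '%s\n' "$line" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$escaped"
# {"inline":"ctx._source.meta='This is a \' test string Peedr'"}
```

The script line is the same substitution as the double-quoted sed command above, with each shell-level backslash pair reduced to one backslash.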
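In the replacement, put back-references to every group except the one you want to replace, and write the escaped quote in its place. Inside the shell's double quotes the backslash has to be doubled twice (\\\\') so that sed emits a literal backslash:
sed -E "s/(ctx\._source\.meta=')([^']*)(')([^']*')/\1\2\\\\'\4/"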
You may use:
sed "s/ ' / \\\' /g" sample.txt
The pattern makes sed look only for a single quote that has a space on both sides, so the quotes in ctx._source.meta='This and string Peedr'"} do not match and are left unchanged.
Edit:
At the poster's request, I edited my sed command to cover additional use cases:
sed "s/\(ctx._source.meta='.*\)'\(.*Peedr'\"\)/\1\\\'\2/g"

Extracting Substring from String with Multiple Special Characters Using Sed

I have a text file with a line that reads:
<div id="page_footer"><div><? print('Any phrase's characters can go here!'); ?></div></div>
I want to use sed or awk to extract the substring between the single quotes above, so that it prints just:
Any phrase's characters can go here!
I want the phrase to be delimited as I have above, starting after the single quote and ending at the single-quote immediately followed by a parenthesis and then semicolon. The following sed command with a capture group doesn't seem to be working for me. Suggestions?
sed '/^<div id="page_footer"><div><? print(\'\(.\+\)\');/ s//\1/p' /home/foobar/testfile.txt
Using cut like this would be incorrect:
grep "page_footer" /home/foobar/testfile.txt | cut -d "'" -f2
It goes wrong when the string itself contains single quotes. Counting the number of single quotes first would turn a simple solution into an over-complicated one.
A solution with sed is better: remove everything up to the first single quote and everything after the last one. Writing a single quote inside a single-quoted sed script is messy, because you have to close the quoted string, add an escaped single quote, and open a new quoted string ('\''):
grep page_footer /home/foobar/testfile.txt | sed -e 's/[^'\'']*//' -e 's/[^'\'']*$//'
This is not yet the full solution; you also want to remove the first and last quotes:
grep page_footer /home/foobar/testfile.txt | sed -e 's/[^'\'']*'\''//' -e 's/'\''[^'\'']*$//'
Writing the sed parameters in double-quoted strings and using the . wildcard for matching the single quote will make the line shorter:
grep page_footer /home/foobar/testfile.txt | sed -e "s/^[^\']*.//" -e "s/.[^\']*$//"
With GNU grep (as found on most Linux systems), whose -P option enables Perl-compatible regular expressions, this might be what you are looking for:
grep -Po "(?<=').*?(?='\);)"
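The capture-group idea from the question can also be made to work with plain sed by putting the script in double quotes, so that the single quotes need no escaping. The following is a sketch rather than a drop-in fix: it writes the sample line to a temporary file (a stand-in for /home/foobar/testfile.txt) and assumes the phrase ends at the last '); on the line.

```shell
# Stand-in for the file from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<div id="page_footer"><div><? print('Any phrase's characters can go here!'); ?></div></div>
EOF

# -n plus the p flag prints only lines where the substitution succeeded.
# The greedy \(.*\) runs to the last '); so quotes inside the phrase survive.
phrase=$(sed -n "/page_footer/ s/.*print('\(.*\)');.*/\1/p" "$sample")
rm -f "$sample"

printf '%s\n' "$phrase"
# Any phrase's characters can go here!
```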

replace number in a string

I am trying to match this string
'12.34.5.6',#### OR
'12.34.5.6', #### (Note the space after the comma)
in a series of files and replace #### with 2222.
I started small and this command successfully changed 1234 to 2222
sed -i 's/'12.34.5.6\''\,1234/'12.34.5.6\''\, 2222/g' file.txt
So I moved on to replacing 1234 with a regex. Below are some of the commands I've tried, but they do not work:
sed -i 's/'12.34.5.6\''\,\(\s?[0-9]{4,5}\)/'12.34.5.6\''\, 2222/g' file.txt
sed -i 's/'12.34.5.6\''\,[0-9][0-9][0-9][0-9][0-9]?/'12.34.5.6\''\, 2222/g' file.txt
Can someone help me out with this or give some pointers?
sed -r "s/('12[.]34[.]5[.]6',[ ]?)[0-9]{4}/\\12222/g"
This might do the trick:
sed -E "s/('12.34.5.6',\s?)[0-9]{4,5}/\12222/g"
Examples:
$ echo "'12.34.5.6', 2134" | sed -E "s/('12.34.5.6',\s?)[0-9]{4,5}/\12222/g"
'12.34.5.6', 2222
$ echo "'12.34.5.6',9230" | sed -E "s/('12.34.5.6',\s?)[0-9]{4,5}/\12222/g"
'12.34.5.6',2222
Explanation:
With -E we ask sed to use extended regular expressions, which lets us write ( ), ? and { } without backslashes. The beginning of the regex is fairly simple: '12.34.5.6', just matches that string (strictly speaking, an unescaped . matches any character; write \. to match only a literal dot). We then add \s, followed by a ? to indicate that the whitespace is optional. This first part is enclosed in parentheses so that it can be reused in the replacement.
Then, we add the #'s to the pattern. I assumed you used #'s in place of numbers based on your attempts with [0-9]{4,5} and [0-9][0-9][0-9][0-9][0-9].
Finally, in the replacement we refer to the first parenthesized group with \1 and add our 2's: \12222. The digits matched by [0-9]{4,5} (the #'s) are outside the parentheses, so they are discarded and replaced.
PS. Next time please format your question for better readability.
PPS. I think the real issue here is not the regex but the quote escaping around it. Maybe take a look at [this question].
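A slightly stricter variant of the command above, given as a sketch: the dots are escaped so that they match only literal dots, and [[:space:]] is used instead of \s, which is a GNU extension. The function name replace_number and the third test line are made up for illustration.

```shell
# Replace the 4- or 5-digit number after '12.34.5.6', with 2222.
# Double quotes around the script avoid having to escape the single quotes.
replace_number() {
  sed -E "s/('12\.34\.5\.6',[[:space:]]?)[0-9]{4,5}/\12222/g"
}

printf '%s\n' "'12.34.5.6', 2134" "'12.34.5.6',92301" "'12x34y5z6',1234" |
  replace_number
# '12.34.5.6', 2222
# '12.34.5.6',2222
# '12x34y5z6',1234
```

The last line is left alone because the escaped dots no longer match x, y, or z.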

sed or perl Remove outer parentheses only if first inner word matches

Using GNU sed, I need to remove parenthetical phrases like (click here ....) including the parens. The text following click here varies, but I need to remove the whole outer parentheses.
I've tried many variations on the following, but I can't seem to hit the right one:
sed -e 's/\((click here.*[^)]*\)//'
EDIT Foolishly I didn't notice that there's often actually a linebreak in the middle of the parenthetical string, so sed probably isn't going to work. Example:
(click here to
enter some text)
If there aren't nested parens, maybe you can try something like:
sed -e 's/(click here [^)]*)//'
With perl you can run:
perl -00 -pe "s/\(click here [^)]*\)//g" inputfile > outputfile
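With -00, perl reads the input in paragraph mode (records separated by blank lines), so the pattern can match across a line break within a paragraph. It replaces all occurrences of (click here followed by any characters other than ')' up to the closing parenthesis, and writes the result to outputfile. To treat the whole file as a single string, use -0777 instead.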
Here's yet another sed approach that keeps the complete input in the hold buffer:
# see http://austinmatzko.com/2008/04/26/sed-multi-line-search-and-replace/
echo '
()
(do not delete this)
()
(click here to
enter some text)
()
' |
sed -n '1h;1!H;${;g;s/(\([^)]*\))/\1/g;p;}'
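As written, the substitution above strips the parentheses from every group and keeps the text inside them. A variant restricted to the (click here ...) phrases from the question might look like the sketch below. It assumes there are no nested parentheses and that the input is small enough to hold in memory; the function name remove_click_here is illustrative.

```shell
# Collect all lines in the hold space, then on the last line ($) copy them
# back and delete every "(click here ...)" group, even across line breaks.
remove_click_here() {
  sed -n '1h;1!H;${g;s/(click here[^)]*)//g;p;}'
}

printf '%s\n' '(do not delete this)' 'before (click here to' 'enter some text) after' |
  remove_click_here
# (do not delete this)
# before  after
```

The bracket expression [^)]* also matches the embedded newline, which is what lets the match span two lines once the whole input is in the pattern space.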