Using regex and sed to replace a string inside of a file

I have the following string inside a text file:
{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}
I would like to replace every ' inside the ctx._source.meta='' part with \' using sed.
In the example above the value is This is a ' test string Peedr, which I would like to convert to This is a \' test string Peedr, so the desired output would be:
{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a \' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}
I'm using the following regex to match the ' inside the ctx._source.meta string (3rd capture group):
(meta=')(.*?)(')(.*?)(')
I have the regex, but I don't know how to use the sed command to replace the 3rd capture group with \'.
Can someone give me a hand and tell me which sed command I have to use?
Thanks in advance

sed generally does not support the Perl regex extensions, so the non-greedy .*? will probably not do what you hope. If you want to use Perl regex, use Perl!
perl -pe "s/(meta='.*?)(')(.*?')/\$1\\\\\$2\$3/"
This will still not necessarily work if the input is malformed; a better approach would be to specifically exclude single quotes from the match, and then you don't need the non-greedy matching.
sed "s/\\(meta='[^']*\\)'\\([^']*'\\)/\\1\\\\'\\2/"
In both cases, the number of backslashes required to escape the backslashes inside the shell's double quotes is staggering.
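One way to sidestep the backslash pile-up is to keep the sed script out of the shell's double quotes altogether. The sketch below writes the same substitution to a temporary script file through a quoted here-document, so every backslash in it is exactly what sed sees. The sample line is shortened from the question, and the variable names are only illustrative.

```shell
# Shortened sample input (an assumption; the real line is much longer).
line="ctx._source.meta='This is a ' test string Peedr'"

# A quoted here-document delimiter ('EOF') disables shell expansion,
# so the script text reaches sed unchanged.
script=$(mktemp)
cat > "$script" <<'EOF'
s/\(meta='[^']*\)'\([^']*'\)/\1\\'\2/
EOF

escaped=$(printf '%s\n' "$line" | sed -f "$script")
rm -f "$script"
printf '%s\n' "$escaped"
```

In the script file, \\ is the only doubled backslash left: it is how sed itself spells a literal backslash in the replacement.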

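Put back-references to every group except the one you want to replace. Here is one way to accomplish the task:
sed -E "s/(ctx\._source\.meta=')([^']*)(')([^']*')/\1\2\\\\'\4/"
Inside the shell's double quotes, \\\\ reaches sed as \\, which sed writes out as a single backslash.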
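A single substitution like this escapes one inner quote. If the value may hold several, a sed loop is one possible extension. The sketch below is only an illustration under stated assumptions: the value contains no double quote, it is closed by '", and no inner quote sits directly after the opening quote. The sample line and variable names are made up.

```shell
# Hypothetical input with two inner quotes.
line="{\"inline\":\"ctx._source.meta='a ' b ' c'\"}"

# :a marks a label; ta jumps back to it after every successful substitution.
# [^\\] skips quotes that are already escaped, so the loop terminates.
script=$(mktemp)
cat > "$script" <<'EOF'
:a
s/\(meta='[^"]*[^\\]\)'\([^"]*'"\)/\1\\'\2/
ta
EOF

all_escaped=$(printf '%s\n' "$line" | sed -f "$script")
rm -f "$script"
printf '%s\n' "$all_escaped"
```

Because quotes that already carry a backslash are skipped, running the script a second time on its own output should leave the line unchanged.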

You may use:
sed "s/ ' / \\\' /g" sample.txt
The pattern makes sed match only a single quote that has a space on each side, so ctx._source.meta='This and string Peedr'"} do not match and are left unchanged.
Edit:
At the poster's request, I extended my sed command to cover more cases:
sed "s/\(ctx._source.meta='.*\)'\(.*Peedr'\"\)/\1\\\'\2/g"

Related

sed to add quotes around timestamps

I have a large file that contains timestamps in the following format:
2018-08-22T13:06:04.442774Z
I would like to add double quotes around all the occurrences that match this specific expression. I am trying to use sed, but I don't seem to be able to find the right command. I am trying something around these lines:
sed -e s/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}T[0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}.[0-9]\{6\}Z/"$0"/g my_file.json
and I am pretty sure that the problem is around my "replace" expression.
How should I correct the command?
Thank you in advance.

You should wrap the sed replacement command with single quotes and use & instead of $0 in the RHS to replace with the whole match:
sed 's/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}T[0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}\.[0-9]\{6\}Z/"&"/g' file > outfile
Also, do not forget to escape the . char if you want to match a dot, and not any character.
You may also remove excessive escapes if you use ERE syntax:
sed -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z/"&"/g'
If you want to change the file in place, use the -i option:
sed -i 's/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}T[0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}\.[0-9]\{6\}Z/"&"/g' file
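Editing in place overwrites the original, so it can help to keep a backup while testing. The sketch below runs the ERE form of the command on a throwaway file; the file contents and the .bak suffix are made up for illustration. Attaching the suffix directly to -i is a form that both GNU and BSD sed accept.

```shell
# Throwaway file holding one hypothetical record.
tmp=$(mktemp)
printf '%s\n' '{"ts": 2018-08-22T13:06:04.442774Z}' > "$tmp"

# -i.bak edits the file in place and keeps the original as "$tmp.bak".
sed -i.bak -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z/"&"/g' "$tmp"

edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")
rm -f "$tmp" "$tmp.bak"
printf '%s\n' "$edited"
```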

The following works:
sed 's/\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}T[0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}\.[0-9]\{6\}Z\)/"\1"/g' my_file.json
It makes several modifications:
wrap the command in single quotes
use \( and \) to create a group (referenced by \1 in the replacement section)
escape the . character, and write the interval braces as \{ and \}

Regex with sed to search in files

I want to search files recursively for a given pattern and replace the matches. The search is for a string like "['DB']['1']['HOST'] = 'localhost'". When I test the regex, the following doesn't print anything, and I can't see the error in it. Could anyone help?
sed -n '/\[\'HOST\'\]\s?=\s?(?:\'|")(.+)(?:\'|")/p' /path/to/file

POSIX regex does not support non-capturing groups. Besides, you have not specified the -E option, so the pattern is parsed as a POSIX BRE pattern, where capturing parentheses must be escaped. Also, a single quote cannot be escaped inside a single-quoted shell string; with GNU sed, use \x27 instead.
Use
sed -En '/\[\x27HOST\x27\]\s?=\s?[\x27"][^\x27"]+[\x27"]/p'
A quick test:
s="a string like ['DB']['1']['HOST'] = 'localhost'."
sed -En '/\[\x27HOST\x27\]\s?=\s?[\x27"][^\x27"]+[\x27"]/p' <<< "$s"
Besides, instead of \s, it might be a good idea to use [[:space:]].
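To see which value a line carries, rather than only whether it matches, the same pattern can be wrapped in a substitution with one capture group. This sketch uses [[:space:]] in place of \s and puts the script in double quotes so that single quotes can appear literally. It assumes the value itself contains no quote characters; the variable names are illustrative.

```shell
s="a string like ['DB']['1']['HOST'] = 'localhost'."

# Inside double quotes only the embedded " needs a backslash for the shell;
# \[ and \] pass through to sed, which reads them as literal brackets.
host=$(printf '%s\n' "$s" |
  sed -En "s/.*\['HOST'\][[:space:]]*=[[:space:]]*['\"]([^'\"]+)['\"].*/\1/p")
printf '%s\n' "$host"
```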

Replace some dots(.) with commas(,) with RegEx and awk or sed

I want to replace dots with commas for some but not all matches:
hostname_metric (Index: 1) to hostname;metric (avg);22.04.2015 13:40:00;3.0000;22.04.2015 02:05:00;2.0000;22.04.2015 02:00:00;650.7000;2.2594;
The outcome should look like this:
hostname_metric (Index: 1) to hostname;metric (avg);22.04.2015 13:40:00;3,0000;22.04.2015 02:05:00;2,0000;22.04.2015 02:00:00;650,7000;2,2594;
I was able to identify the RegEx which should work to find the correct dots.
;[0-9]{1,}\.[0-9]{4}
But how can I replace them with a comma with awk or sed?
Thanks in advance!

Adding some capture groups to the regex in your question, you can use this sed one-liner:
sed -r 's/(;[0-9]{1,})\.([0-9]{4})/\1,\2/g' file
This matches and captures the part before and after the . and uses them in the replacement string.
On some versions of sed, you may need to use -E instead of -r to enable Extended Regular Expressions. If your version of sed doesn't understand either switch, you can use basic regular expressions and add a few escape characters:
sed 's/\(;[0-9]\{1,\}\)\.\([0-9]\{4\}\)/\1,\2/g' file

sed 's/\(;[0-9]\+\)\.\([0-9]\{4\}\)/\1,\2/g' should do the trick.
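A quick check on a shortened copy of the sample line shows why the dates are left alone: in ;22.04.2015 the first dot is followed by only two digits, and the second dot is not preceded by a semicolon and digits. The line below is trimmed from the question, and the variable names are illustrative.

```shell
line='hostname;metric (avg);22.04.2015 13:40:00;3.0000;22.04.2015 02:05:00;650.7000;2.2594;'

# Only a dot between ";digits" and exactly four more digits becomes a comma.
converted=$(printf '%s\n' "$line" | sed -E 's/(;[0-9]{1,})\.([0-9]{4})/\1,\2/g')
printf '%s\n' "$converted"
```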

Printing a matched regexp with sed

So I'm trying to match a regexp with any string in the middle of it and then print out just that string. The syntax is sort of like this...
sed -n 's/<title>.*</title>/"what do I put here"/p' input.file
and I just want to print out whatever .* is where I typed "what do I put here". I'm not very comfortable with sed at this point so this is likely a very simple answer and I'm having trouble finding one in any of the other questions. Thanks in advance!

Capture the pattern you want to extract within \(...\), and then you can refer to it as \1 in the replacement string:
sed -n 's/<title>\(.*\)<\/title>/\1/p' input.file
You can have multiple \(...\) expressions, and refer to them with \1, \2, \3, and so on.
If you have the GNU version of sed, or gsed, then you could simplify a bit:
sed -rn 's/<title>(.*)<\/title>/\1/p' input.file
With the -r flag, sed can use "extended regular expressions", which in practice lets you write (...) instead of \(...\), + instead of \+, and other goodies.
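Note that s///p prints the whole line with only the matched part replaced. If the line holds more than the title element, one option is to let the pattern swallow the rest of the line as well. The sketch below also switches the delimiter to | so that the slash in the closing tag needs no escaping; the sample HTML is invented.

```shell
html='<head><title>My Page</title></head>'

# The leading and trailing .* consume everything around the title element,
# so only the captured text is left to print.
title=$(printf '%s\n' "$html" | sed -n 's|.*<title>\(.*\)</title>.*|\1|p')
printf '%s\n' "$title"
```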

How to use sed and regex?

I need to use sed to look for all lines in a file with pattern "[whatever]|[whatever]" so I'm using the following regex:
sed '/\"[a-zA-Z0-9]+\|[a-zA-Z0-9]+\"/p' test2.txt
But it's not working: on this file it returns the following line when it shouldn't:
RTV0031605951US|3160595|20/03/2013|0|"Laurie Graham"|"401"
Does anybody know which regex I should use? Thanks in advance

I see three problems with your regular expression:
+ is not a metacharacter in basic regular expressions, so you need to escape it (\+, a GNU extension) to get its special meaning.
The pipe is the opposite case: unescaped it is a literal character, so don't escape it if you want to match it literally.
By default sed prints every line, and /p prints each matching line once more, so matching lines appear twice in the output. Add -n to suppress the default output.
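Putting these three points together, one possible corrected command is sketched below. It uses -E, where the rules flip: + works unescaped and the pipe must be escaped to stay literal. The second input line is test data made up to sit next to the question's sample.

```shell
input=$(printf '%s\n' \
  'RTV0031605951US|3160595|20/03/2013|0|"Laurie Graham"|"401"' \
  '"abc|123"')

# -n suppresses the default output; p prints only the matching lines.
matches=$(printf '%s\n' "$input" | sed -En '/"[a-zA-Z0-9]+\|[a-zA-Z0-9]+"/p')
printf '%s\n' "$matches"
```

The sample line from the question is no longer printed, because nowhere in it does a quoted run of alphanumerics contain a pipe.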

sed prints any line that contains a match anywhere in it.
To print only lines that match your regex in their entirety, add ^ and $ to the start and end:
sed -n '/^"[a-zA-Z0-9]\+|[a-zA-Z0-9]\+"$/p' test2.txt

sed '/\B\"[ [:alnum:]]\+\"|\"[ [:alnum:]]\+\"\B/!d' file
If you use this in a sed script, do not escape double quotes.
