Prefix all numbers in a file with a string using sed - regex

Using sed, awk or possibly something else, I would like to prefix all numbers in a file with a string, e.g.
input:
sometext(0, 456)
sometext(01, 10)
output:
sometext(somestring0, somestring456)
sometext(somestring01, somestring10)
I have attempted using sed but my skills are limited so I have not managed to produce any meaningful output.
I'm on OS X 10.11, so I know that BSD sed behaves slightly differently from sed on other *nixes.
I also have perl and python at hand if that solves this better but sed and awk are preferred.

You can use this sed command, which matches each run of digits and puts the string in front of it; & in the replacement stands for the whole match, so no capture group is needed:
sed -E 's/[[:digit:]]+/somestring&/g' file
sometext(somestring0, somestring456)
sometext(somestring01, somestring10)
Please keep in mind that somestring should not contain special replacement constructs such as &, \1, \2, \3 etc.
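To illustrate that caveat, here is a minimal sketch, assuming the prefix is held in a shell variable. The names prefix, escaped and result are hypothetical and not from the question. It backslash-escapes &, / and \ in the prefix before splicing it into the replacement.

```shell
# Hypothetical prefix containing characters that are special in a sed replacement.
prefix='a&b/c'

# Escape &, / and \ so sed treats them literally; | is the delimiter here.
escaped=$(printf '%s\n' "$prefix" | sed 's|[&/\\]|\\&|g')

# Prefix every run of digits; the trailing & is the matched number itself.
result=$(printf '%s\n' 'sometext(0, 456)' | sed -E "s/[[:digit:]]+/${escaped}&/g")
printf '%s\n' "$result"
```

With this input the command should print sometext(a&b/c0, a&b/c456).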

Related

Using regex and sed to replace a string inside of a file

Having the following string inside of a text file.
{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}
I would like to replace all the ' that are inside the ctx._source.meta='' part with \' using sed.
In the example above I have This is a ' test string Peedr, which I would like to convert to This is a \' test string Peedr, so the desired output would be:
{"_job":"delete","query":{"query":{"bool":{"must":[{"term":{"_id":"28381"}}],"should":[]}}},"script":{"inline":"ctx._source.meta='This is a \' test string Peedr'"},"timestamp":1518165383,"host":"","port":"9200","index":"","docType":"","customIndexer":""}
I'm using the following regex to get the ' that is inside the ctx._source.meta string (3rd capture group).
(meta=')(.*?)(')(.*?)(')
I have the regex, but I don't know how to use the sed command to replace the 3rd capture group with \'.
Can someone give me a hand and tell me which sed command I have to use?
Thanks in advance
sed generally does not support the Perl regex extensions, so the non-greedy .*? will probably not do what you hope. If you want to use Perl regex, use Perl!
perl -pe "s/(meta='.*?)(')(.*?')/\$1\\\\\$2\$3/"
This will still not necessarily work if the input is malformed; a better approach would be to specifically exclude single quotes from the match, and then you don't need the non-greedy matching.
sed "s/\\(meta='[^']*\\)'\\([^']*'\\)/\\1\\\\'\\2/"
In both cases, the number of backslashes required to escape the backslashes inside the shell's double quotes is staggering.
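One way to sidestep most of that escaping, sketched here under the assumption of a POSIX shell, is to keep the sed script in a quoted here-document so the shell passes it through untouched. The variable names and the shortened sample JSON are made up for illustration.

```shell
# Shortened, hypothetical sample of the JSON line from the question.
input=$(cat <<'EOF'
{"script":{"inline":"ctx._source.meta='This is a ' test string Peedr'"}}
EOF
)

# The sed script exactly as sed should see it: \\ in the replacement is one literal backslash.
script=$(cat <<'EOF'
s/\(meta='[^']*\)'\([^']*'\)/\1\\'\2/
EOF
)

result=$(printf '%s\n' "$input" | sed "$script")
printf '%s\n' "$result"
```

The same script text could also be saved to a file and run with sed -f, which avoids shell quoting entirely.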
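Put back-references to every group except the one you want to replace. There is a better way to accomplish the same task (note that inside the shell's double quotes, one literal backslash in the sed replacement takes four backslashes):
sed -E "s/(ctx\._source\.meta=')([^']*)(')([^']*')/\1\2\\\\'\4/"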
You may use:
sed "s/ ' / \\\' /g" sample.txt
The pattern tells sed to match only a single quote that has a space on both sides, so ctx._source.meta='This and string Peedr'"} do not match and are left unchanged.
Edit:
At the poster's request, I edited my sed command to apply to extra use cases:
sed "s/\(ctx._source.meta='.*\)'\(.*Peedr'\"\)/\1\\\'\2/g"

Printing a matched regexp with sed

So I'm trying to match a regexp with any string in the middle of it and then print out just that string. The syntax is sort of like this...
sed -n 's/<title>.*<\/title>/"what do I put here"/p' input.file
and I just want to print out whatever .* is where I typed "what do I put here". I'm not very comfortable with sed at this point so this is likely a very simple answer and I'm having trouble finding one in any of the other questions. Thanks in advance!
Capture the pattern you want to extract within \(...\), and then you can refer to it as \1 in the replacement string:
sed -n 's/<title>\(.*\)<\/title>/\1/p' input.file
You can have multiple \(...\) expressions, and refer to them with \1, \2, \3, and so on.
If you have the GNU version of sed, or gsed, then you could simplify a bit:
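sed -rn 's/<title>(.*)<\/title>/\1/p' input.file
With the -r flag (-E on BSD/macOS sed, which GNU sed accepts as well), sed uses "extended regular expressions", which in practice lets you write (...) instead of \(...\), + instead of \+, and other goodies. Note that the / in </title> has to be escaped as \/ because / is also the delimiter of the s command.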
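As a small sketch of the capture-group approach, the following uses | as the s delimiter so the slash in the closing tag needs no escaping, and adds .* on both sides so that text around the tags is dropped as well. The sample input and the variable name result are made up for illustration.

```shell
# Made-up input; only the line containing <title>...</title> should be printed.
result=$(printf '%s\n' '<html>' '<title>My Page</title>' '<p>body</p>' |
  sed -n 's|.*<title>\(.*\)</title>.*|\1|p')
printf '%s\n' "$result"
```

This should print My Page. It assumes the opening and closing tags sit on the same line, since sed works line by line.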

Points to slashes with sed

I have a text file in this format:
...
SomeText.any_text/ch SomeText2.any_3/ch 5.6e-5
SomeText.any_text/ch something.else.point.separated/ch4 5.4e5
...
Each line has three elements: two strings (alphanumeric, with underscores and slashes) and one floating-point number.
I need to replace the dots with slashes, but only in the strings.
I tried sed with a regular expression like this:
sed 's/\([\w_]\+\)\(\.\)/\1\//g'
but it did not give the result I wanted.
This might work for you (GNU sed):
sed 's/[^ ]*$/\n&/;h;y/./\//;G;s/\n.*\n//' file
Explanation:
s/[^ ]*$/\n&/ insert a newline before the last field
h copy the pattern space (PS) to the hold space (HS)
y/./\// translate all .'s to /'s in the PS
G append a newline then HS to the PS
s/\n.*\n// remove everything between the first and last newlines i.e. delete the old strings
This idiom can be used to simplify changing part of a line without resorting to complicated regexps.
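The steps above can also be written as a commented multi-line script. This is only a sketch of the same idiom: it spells the inserted newline as a backslash followed by a real line break, which should also work in sed implementations that do not accept \n in the replacement. The variable name result is hypothetical.

```shell
result=$(printf '%s\n' 'SomeText.any_text/ch SomeText2.any_3/ch 5.6e-5' | sed '
# Insert a newline before the last field.
s/[^ ]*$/\
&/
# Save the marked line in the hold space.
h
# Turn every dot into a slash in the pattern space.
y/./\//
# Append a newline and the saved original.
G
# Drop everything from the first newline through the last one.
s/\n.*\n//
')
printf '%s\n' "$result"
```

What remains is the translated strings followed by the untouched last field.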
Your elements look like fields. Therefore, my preferred method would be to use awk:
awk '{ for (i=1; i<=2; i++) gsub(/\./, "/", $i) }1' file.txt
Results:
SomeText/any_text/ch SomeText2/any_3/ch 5.6e-5
SomeText/any_text/ch something/else/point/separated/ch4 5.4e5
You can do this in classic sed notation with a couple of loops, one to fix dots in the first field, and one to fix dots in the second field.
sed -e ':f1' -e 's/^\([^ .]*\)\./\1\//' -e 't f1' \
-e ':f2' -e 's/^\([^ ][^ ]*\) \([^ .]*\)\./\1 \2\//' -e 't f2'
The ^ anchors are crucial to this working correctly. Yes, you can write it all on one line in a single argument to sed; I prefer the clarity of separate arguments when the script is as complex as this. A typical sed script is inscrutable enough without adding any extra obstacles to comprehension.
sed ':f1;s/^\([^ .]*\)\./\1\//;t f1;:f2;s/^\([^ ][^ ]*\) \([^ .]*\)\./\1 \2\//;t f2'
For your input sample (two lines), the output is:
SomeText/any_text/ch SomeText2/any_3/ch 5.6e-5
SomeText/any_text/ch something/else/point/separated/ch4 5.4e5
If you're using GNU sed, you might need to add --posix to the options, though it seemed to behave itself correctly (so it probably recognized that I wasn't using any non-POSIX notations and therefore stuck with POSIX).
Tested on Mac OS X 10.7.5 with BSD sed and GNU sed.
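For readability, the two loops can also be laid out one command per line with comments. This is a sketch of the same script, not a different technique; the variable name result is made up.

```shell
result=$(printf '%s\n' 'SomeText.any_text/ch something.else.point.separated/ch4 5.4e5' | sed '
# Loop 1: replace the first remaining dot in field 1, repeat while one was replaced.
:f1
s/^\([^ .]*\)\./\1\//
t f1
# Loop 2: skip field 1 unchanged, replace the first remaining dot in field 2, repeat.
:f2
s/^\([^ ][^ ]*\) \([^ .]*\)\./\1 \2\//
t f2
')
printf '%s\n' "$result"
```

Because neither pattern can reach past the second field, the dot in the number is never touched.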
awk '{gsub(/\./,"/",$1); gsub(/\./,"/",$2); print}' your_file

sed: cannot solve this regular expression

I'm trying to replace two strings in a PHP file using two sed commands, but I can't find where I'm going wrong.
I want to transform the strings
setlocale(LC_ALL, $_COOKIE['lang']);
and
putenv("LANGUAGE=".$_COOKIE['lang']);
to the strings
setlocale(LC_ALL, $_COOKIE['lang'].'.utf8');
and
putenv("LANGUAGE=".$_COOKIE['lang'].'.utf8');
So far I've come up with the following, but it does not work:
sed -i "s/setlocale\(LC_ALL, \$_COOKIE\['lang'\]\);.*$/setlocale\(LC_ALL, \$_COOKIE\['lang'\]\.'\.utf-8'\)\;/" file.php
sed -i "s/putenv\('LANGUAGE='\.\$_COOKIE\['lang'\]\);.*$/putenv\('LANGUAGE='\.\$_COOKIE\['lang'\]\.'\.utf-8'\)\;/" file.php
I'm definitely not an expert in sed and regular expression, so go easy on me ok?
Try these two:
sed 's/setlocale.LC_ALL, ._COOKIE..lang...;/setlocale\(LC_ALL, $_COOKIE\['\''lang'\''\].'\''.utf8'\''\);/g' file.php
sed 's/putenv..LANGUAGE...._COOKIE..lang...;/putenv\("LANGUAGE=".$_COOKIE\['\''lang'\''].'\''.utf8'\'');/g' file.php
You should not escape the parentheses: in sed's default (basic) regex syntax, \( and \) delimit a capture group, while bare ( and ) match literally. There is no need to escape matching characters in the replacement part, either:
sed "s/setlocale(LC_ALL, \$_COOKIE\['lang'\]);.*$/setlocale(LC_ALL, \$_COOKIE['lang'].'.utf8');/" file.php
The putenv line contains double quotes, but your expression searches for single quotes. Therefore, it cannot match.
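Both lines end in $_COOKIE['lang']); so a single substitution keyed on that tail can handle them together. The following is a minimal sketch that assumes every occurrence of that tail in the file should get the suffix; the variable names are hypothetical, and the script is kept in a quoted here-document to avoid shell escaping.

```shell
# Quoted here-documents keep $, quotes and backslashes exactly as typed.
input=$(cat <<'EOF'
setlocale(LC_ALL, $_COOKIE['lang']);
putenv("LANGUAGE=".$_COOKIE['lang']);
EOF
)

# Capture $_COOKIE['lang'] and re-emit it followed by .'.utf8' before the closing );
script=$(cat <<'EOF'
s/\(\$_COOKIE\['lang']\));/\1.'.utf8');/
EOF
)

result=$(printf '%s\n' "$input" | sed "$script")
printf '%s\n' "$result"
```

In the pattern, \( and \) form the group while the bare ) after it matches the literal closing parenthesis.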

How to use regular expression in sed command

I have some strings with this pattern in some files:
domain.com/page-10
domain.com/page-15
....
and I want to replace them with something like
domain.com/apple-10.html
domain.com/apple-15.html
I have found that I can use the sed command to replace them all at once, but because something has to be added after the numbers, I guess I need a regular expression, and I don't know how to write it.
sed -i.bak -r 's/page-([0-9]+)/apple-\1.html/' file
sed 's/page-\([0-9][0-9]*\)/apple-\1.html/' file > t && mv t file
Besides sed, you can also use gawk's gensub()
awk '{b=gensub(/page-([0-9]+)/,"apple-\\1.html","g",$0) ;print b }' file
sed -i 's/page-\([0-9]*\)/apple-\1.html/' <filename>
The \([0-9]*\) captures a group of digits (possibly empty, since * also matches zero digits); the \1 in the replacement string references that capture and inserts it into the replacement.
You may want to use something like -i.backup if you need to keep a copy of the file without the replacements, or just omit the -i and use the I/O redirection method instead.
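A short sketch of how these variants differ, using made-up input lines and hypothetical variable names. It reads from a pipe, so no file is modified.

```shell
# The g flag replaces every occurrence on the line, not just the first.
result=$(printf '%s\n' 'see domain.com/page-10 and domain.com/page-15' |
  sed 's/page-\([0-9][0-9]*\)/apple-\1.html/g')
printf '%s\n' "$result"

# [0-9]* also matches zero digits, so a bare "page-" is rewritten too.
loose=$(printf '%s\n' 'domain.com/page-' | sed 's/page-\([0-9]*\)/apple-\1.html/')

# [0-9][0-9]* requires at least one digit, so the same line is left alone.
strict=$(printf '%s\n' 'domain.com/page-' | sed 's/page-\([0-9][0-9]*\)/apple-\1.html/')
printf '%s\n%s\n' "$loose" "$strict"
```

If a line can contain more than one link, the g flag matters; the commands in the answers above replace only the first match per line.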
One more way to resolve the problem:
sed -i.bak 's/\(^.*\)\(page-\)\(.*\)/\1apple-\3.html/' Files
Here the searching patterns are stored and retrieved using references (\1, \2, \3).
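This appends .html to the end of every line. It only helps if each line ends with the page number, and it does not change page- to apple-:
sed 's/$/.html/' file.txt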