How can I replace a string containing special characters? - regex

I have a text file that contains a line with brackets, characters, integers, and : symbols.
I want to replace [0:1] with [2:4]
$ cat input.txt
str(tr.dx)[0:1]
Expected output:
str(tr.dx)[2:4]
I tried
sed -i 's/str(tr.dx)[0:1]/str(tr.dx)[2:4]/g' input.txt
but it does not work. How can I fix this?

You may use this sed:
sed 's/\(str(tr\.dx)\)\[0:1]/\1[2:4]/' file
str(tr.dx)[2:4]
Here:
\(str(tr\.dx)\) matches str(tr.dx) and captures it in group #1
We need to escape the dot in regex
\[0:1] matches [0:1]. Here we need to escape [
\1 is back-reference for capture group #1
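To see why the original attempt does nothing, here is a small sketch that runs both patterns on the sample line. Unescaped, [0:1] is a bracket expression that matches one character out of 0, : and 1, so it never matches the literal text [0:1]. The variable names are only for illustration.

```shell
line='str(tr.dx)[0:1]'

# Unescaped: [0:1] means "one of 0, :, 1", so the pattern never matches
# the literal "[" that follows str(tr.dx), and the line is left unchanged.
unescaped=$(printf '%s\n' "$line" | sed 's/str(tr.dx)[0:1]/str(tr.dx)[2:4]/g')

# Escaped: \[ is a literal "[" and \. is a literal dot.
escaped=$(printf '%s\n' "$line" | sed 's/str(tr\.dx)\[0:1]/str(tr.dx)[2:4]/g')

printf '%s\n' "$unescaped" "$escaped"
```

The first printed line is the untouched input, the second one is the replaced text.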

Related

Regex to match exact version phrase

I have versions like:
v1.0.3-preview2
v1.0.3-sometext
v1.0.3
v1.0.2
v1.0.1
I am trying to get the latest version that is not preview (doesn't have text after version number) , so result should be:
v1.0.3
I used this grep: grep -m1 "[v\d+\.\d+.\d+$]"
but it still outputs: v1.0.3-preview2
What could I be missing here?
To return first match for pattern v<num>.<num>.<num>, use:
grep -m1 -E '^v[0-9]+(\.[0-9]+){2}$' file
v1.0.3
If your input file is unsorted, then use grep | sort -rV | head as:
grep -E '^v[0-9]+(\.[0-9]+){2}$' f | sort -rV | head -1
When you use ^ or $ inside [...], they are not anchors: $ is a literal character, and ^ is either literal or, as the first character, negates the set.
RegEx Details:
^: Start
v: Match v
[0-9]+: Match 1+ digits
(\.[0-9]+){2}: Match a dot followed by 1+ digits. Repeat this group 2 times
$: End
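A quick way to check this against the sample versions (given here in shuffled order, which is my own assumption to exercise the sort step) might look like the sketch below. It also counts how many lines the bracketed pattern from the question matches, assuming GNU grep and GNU sort for -V.

```shell
versions=$(printf '%s\n' v1.0.1 v1.0.3-preview2 v1.0.2 v1.0.3-sometext v1.0.3)

# The bracketed pattern is a single character set, so any line containing
# one of its characters (for example "v") counts as a match.
bracket_hits=$(printf '%s\n' "$versions" | grep -c '[v\d+\.\d+.\d+$]')

# Anchored pattern, then version sort in reverse and keep the top line.
latest=$(printf '%s\n' "$versions" |
  grep -E '^v[0-9]+(\.[0-9]+){2}$' | sort -rV | head -n 1)

echo "bracket pattern matched $bracket_hits lines, latest release is $latest"
```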
To match the digits with grep, you can use
grep -m1 "v[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+$" file
Note that you don't need the [ and ] in your pattern, and that you need to escape the dot to match it literally.
With awk you could try following awk code.
awk 'match($0,/^v[0-9]+(\.[0-9]+){2}$/){print;exit}' Input_file
Explanation of the awk code: the match function tests each line against the version regex; on the first line that matches, awk prints that line and exits.
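If you want to try the same idea without the {2} interval, which as far as I know some older awk builds (old mawk releases, for instance) do not handle, a spelled-out variant could look like this. It is a sketch on the sample data from the question, not a replacement for the command above.

```shell
# Print the first line that is exactly v<num>.<num>.<num>, then stop reading.
first=$(printf '%s\n' v1.0.3-preview2 v1.0.3-sometext v1.0.3 v1.0.2 v1.0.1 |
  awk '/^v[0-9]+\.[0-9]+\.[0-9]+$/ { print; exit }')

echo "$first"
```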
Regular expressions match substrings, not whole strings. You need to explicitly match the start (^) and end ($) of the pattern.
Keep in mind that $ can trigger expansion inside double-quoted strings in shell scripts, so escape it there or put the pattern in single quotes.
The anchors need to be outside of any bracket expression ([]).
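The substring behaviour is easy to see by counting matching lines with and without anchors. This is only a sketch on two of the sample versions:

```shell
versions=$(printf '%s\n' v1.0.3-preview2 v1.0.3)

# Without anchors the pattern matches inside v1.0.3-preview2 as well.
unanchored=$(printf '%s\n' "$versions" | grep -cE 'v[0-9]+\.[0-9]+\.[0-9]+')

# With ^ and $ the whole line has to be a bare version number.
anchored=$(printf '%s\n' "$versions" | grep -cE '^v[0-9]+\.[0-9]+\.[0-9]+$')

echo "unanchored: $unanchored, anchored: $anchored"
```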

Want to use Bash & Regex to replace comma in file

I need to replace a specific character, like comma, in a csv file.
I have files with text and numeric separated by ';' (csv as French...)
Example:
value;x;y;comment;
abc;123,45;987,65;abc;
abc;123.45;987.65;abc;
abc;123,45;987,65;abc, blabla;
There is a mix for the decimal separator , both ',' and '.' are used.
I want to replace ',' by '.' but ONLY for decimal values, not text like comments.
I tried sed with regex
sed -i '/;[0-9]\+,[0-9]\+;/s/,/./g' file.csv
But that replaces all commas. I can't find how to replace only what I want.
I want to do that only in bash.
One sed idea using extended regex and capture groups:
sed -E 's/([0-9]),([0-9])/\1.\2/g' file.csv
Where:
-E - enable extended regex support
([0-9]),([0-9]) - match a single digit + , + single digit
([0-9]) - define a capture group (there are 2 capture groups in this case)
\1.\2 - print capture group #1 + . + capture group #2
This generates:
value;x;y;comment;
abc;123.45;987.65;abc;
abc;123.45;987.65;abc;
abc;123.45;987.65;abc, blabla;
NOTES:
once OP is satisfied the code performs the desired operation the -i flag can be added to have sed perform an in-place update of the file
this will erroneously replace the comma in a string such as ;3,2,4 five 6,7 eight ; (this can be addressed but will require a more complex regex)
You may use this simpler sed:
sed -i.bak -E 's/([0-9]),([0-9])/\1.\2/g' file
value;x;y;comment;
abc;123.45;987.65;abc;
abc;123.45;987.65;abc;
abc;123.45;987.65;abc, blabla;
Details:
([0-9]),([0-9]): Match a digit followed by comma followed by a digit. Capture before and after digits in capture group #1 and #2
\1.\2: Replace with back-reference #1 followed by dot followed by back-reference #2
Alternatively, you may use this more robust awk solution:
awk 'BEGIN {FS=OFS=";"} {for (i=1; i<=NF; ++i)
if ($i ~ /^[0-9]+,[0-9]+$/) sub(/,/, ".", $i)} 1' file
value;x;y;comment;
abc;123.45;987.65;abc;
abc;123.45;987.65;abc;
abc;123.45;987.65;abc, blabla;
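To illustrate why the field-based awk version is the more robust one, here is a sketch that runs both commands on a row whose comment contains digit,digit pairs. The row itself is made up for the comparison.

```shell
row='abc;123,45;987,65;see 3,2 and 6,7;'

# sed rewrites every digit,digit pair, including the ones inside the comment.
sed_out=$(printf '%s\n' "$row" | sed -E 's/([0-9]),([0-9])/\1.\2/g')

# awk only touches fields that consist entirely of <digits>,<digits>.
awk_out=$(printf '%s\n' "$row" | awk 'BEGIN {FS=OFS=";"} {for (i=1; i<=NF; ++i)
  if ($i ~ /^[0-9]+,[0-9]+$/) sub(/,/, ".", $i)} 1')

printf '%s\n' "$sed_out" "$awk_out"
```

The sed result changes the comment to "see 3.2 and 6.7", while the awk result leaves it alone.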
You can try:
sed -i 's/;\([0-9]\+\),\([0-9]\+\)/;\1.\2/g' file.csv
Note: if you use the -i option, don't forget to make a backup of your original data, just in case.
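One way to get that backup automatically is to give -i a suffix. The sketch below assumes GNU sed (because of \+) and works in a throwaway directory so nothing real is modified; the paths are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'abc;123,45;987,65;abc, blabla;' > "$tmpdir/file.csv"

# -i.bak edits file.csv in place and keeps the original as file.csv.bak
sed -i.bak 's/;\([0-9]\+\),\([0-9]\+\)/;\1.\2/g' "$tmpdir/file.csv"

edited=$(cat "$tmpdir/file.csv")
backup=$(cat "$tmpdir/file.csv.bak")
rm -rf "$tmpdir"

printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
```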

Extract capture group only from string

I have the following rule:
https://regex101.com/r/noX9lj/4
I want to make this work in a script so I'm using grep like this:
echo "\$this->table('test')" | grep -Po "qr/\$this->table\(\'(test)\'\);/"
The output should be "test"
It's not working, not sure why..
You may use
echo "\$this->table('test');" | grep -oP "\\\$this->table\\('\\K[^']+(?='\\);)"
Or, if you feed a file path to grep:
grep -oP "\\\$this->table\\('\\K[^']+(?='\\);)" file
To match a literal $, the regex needs \$. Inside a double-quoted shell string, $ must itself be escaped with a backslash to stop variable expansion, and the regex backslash has to be written as \\, which is why the pattern contains "\\\$".
To match any text between two single quotes, you may use [^']+ - 1 or more chars other than '.
Pattern details
\$this->table\(' - $this->table(' string
\K - match reset operator that discards the text matched so far from the overall match buffer
[^']+ - one or more chars other than '
(?='\);) - a positive lookahead that requires '); string to be present immediately to the right of the current position.
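If the backslash counting gets confusing, the same pattern can be kept in single quotes, writing the quote character as \x27 (a PCRE hex escape). The sketch below runs both spellings on the sample string and assumes a GNU grep built with -P support.

```shell
input="\$this->table('test');"

# Double-quoted pattern: the shell turns \\ into \ and \$ into $.
dq=$(printf '%s\n' "$input" | grep -oP "\\\$this->table\\('\\K[^']+(?='\\);)")

# Single-quoted pattern: nothing is expanded; \x27 stands for a single quote.
sq=$(printf '%s\n' "$input" | grep -oP '\$this->table\(\x27\K[^\x27]+(?=\x27\);)')

printf '%s\n' "$dq" "$sq"
```

Both commands print only the captured table name.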
There were multiple issues:
had to use "cat" instead of echo for some reason
used this rule instead:
grep -oP "this->table\('\K\w+(?='\);)"

add dot before first integer in all lines

I have lines like
Input:
abcd1234
bdfghks4506
agfdch6985
I would like to add "." before the first integer in line, how do I do it?
Output:
abcd.1234
bdfghks.4506
agfdch.6985
This might work for you (GNU sed):
sed -i 's/[[:digit:]]/.&/' file
If there is a digit in a line, put a . before it.
N.B. To put a . before every digit in a file use:
sed -i 's/[[:digit:]]/.&/g' file
$ cat > input.txt
abcd1234
bdfghks4506
agfdch6985
$ sed -e 's/^\([^0-9]*\)\([0-9]\)\(.*\)$/\1.\2\3/' input.txt
abcd.1234
bdfghks.4506
agfdch.6985
Use sed string replacement with regular expression capture groups.
Match the beginning of the line.
Start a capture group that matches any number of non-numeric characters.
Start a second capture group that matches a single digit.
Start a third capture group that matches the remainder of the line.
Match the end of the line.
Replace the entire line with the contents of the first capture group, ".", the second capture group and, finally, the third capture group.
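For comparison, here is a sketch that runs the short & form and the capture-group form side by side. The extra line without digits is my own addition, to show that both leave such a line untouched.

```shell
input=$(printf '%s\n' abcd1234 bdfghks4506 agfdch6985 nodigits)

# Short form: & stands for the matched digit, so ".&" puts a dot before it.
short=$(printf '%s\n' "$input" | sed 's/[[:digit:]]/.&/')

# Capture-group form: prefix, first digit, rest of line.
long=$(printf '%s\n' "$input" | sed 's/^\([^0-9]*\)\([0-9]\)\(.*\)$/\1.\2\3/')

printf '%s\n' "$short"
```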
The function sub of awk may help,
$ awk 'sub(/[0-9]/,".&",$0)1' file
abcd.1234
bdfghks.4506
agfdch.6985
Brief explanation,
sub: replace only the first matching substring in each line
&: is replaced with the text that was actually matched (here, the digit matched by [0-9])
Appended 1: to print the result.
The most strict command for your case would be
sed -E -i 's/([a-z])([0-9])/\1.\2/' file.txt
-E Use extended regex
-i Edit the file in place instead of writing to standard output (BSD/macOS sed needs -i '')
This matches every example you provided
Not familiar with awk / sed specifically, but a regex replace using this regex should be all you need:
Search: (\d+.*?)$ (match everything from the first found number to the end of the line)
Replace by: .$1 (captured group #1 prefixed by a literal .)
The notation of the capture group in the replace command may differ depending on the implementation. I used $1 here, but some implementations may use \1.

Can I find and replace multiple words at the same time in sublime text 3?

I want to find and replace several words at the same time in sublime, for example I want to replace word1,word2,word3 by word4,word5,word6 at the same time instead of doing them in 3 steps. Can I do it in sublime? if not how can I come up with a regexp to do this?
Thank you
In this case sed command might help you.
sed -i -e 's/word1/replace1/g' fileName && sed -i -e 's/word2/replace2/g' fileName && ...
and so on, for as many words as you need to replace. Just run this in your terminal.
Example:
this is a content of file. File name is t.txt
I want to replace this with that and file with FILE. then command will be :
sed -i -e 's/this/that/g' t.txt && sed -i -e 's/file/FILE/g' t.txt
output will be :
that is a content of FILE. File name is t.txt
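As a side note, the same two substitutions can be given to a single sed invocation with repeated -e options, so the input is read only once. A sketch on the example sentence, without -i so nothing is modified:

```shell
text='this is a content of file. File name is t.txt'

# Each -e expression is applied in order to every input line.
out=$(printf '%s\n' "$text" | sed -e 's/this/that/g' -e 's/file/FILE/g')

echo "$out"
```

Because the expressions run in order, a later one also sees text produced by an earlier one, which matters if the replacement words overlap with the search words.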
You can use the pattern word1(.*)word2(.*)word3 to match the words, if the words to be replaced come in sequential order with or without some characters in between, and use word4\1word5\2word6 as the replacement.
The pattern uses 2 capture groups to match the characters, if any, between the words to be replaced. \1 and \2 represent the captured groups.
Here, the replacement pattern says that the matched text has to be replaced with the sequence of word4 followed by the characters captured in first capture group i.e. the characters if any between the word 1 and word2, followed by word5 and so on.
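The sequential pattern can also be tried outside Sublime, for instance with sed -E. This sketch uses the two groups that the pattern defines and a made-up input line:

```shell
text='word1 and word2 or word3'

# \1 is the text between word1 and word2, \2 the text between word2 and word3.
out=$(printf '%s\n' "$text" |
  sed -E 's/word1(.*)word2(.*)word3/word4\1word5\2word6/')

echo "$out"
```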