sed not capturing regex [duplicate] - regex

This question already has answers here:
Why doesn't `\d` work in regular expressions in sed? [duplicate]
(3 answers)
Closed 4 years ago.
I'm attempting to replace the beginning of lines in a simple file.
>1
TGAACCATCGAGTCTTTGAACG
>2
GAGTTCATTTCTCTCTGGAGGCACC
>3
ATTGACAGATTGAGAGCTCTTTC
>4
CGGGAAAAGGATTGGCTC
>5
TCTTGGTGGTAGTAGCAAATATTCAAATG
Above is the input, below is the desired output.
>seq_x1
TGAACCATCGAGTCTTTGAACG
>seq_x2
GAGTTCATTTCTCTCTGGAGGCACC
>seq_x3
ATTGACAGATTGAGAGCTCTTTC
>seq_x4
CGGGAAAAGGATTGGCTC
>seq_x5
TCTTGGTGGTAGTAGCAAATATTCAAATG
Here is the command I've tried using:
sed -n -r 's/^>(\d+)/>seq_x\1/' file
Using text editors on a small subset of the file, I have no problem with find and replace using '^>(\d+)' and '>seq_x\1', but I can't get sed to replace effectively. Is there something I'm missing?

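In your simple case there's no need to match the digits that follow the >. Also, sed does not treat \d as a digit class, so use the portable [0-9] (a character range) or [[:digit:]] instead of \d+. Note that -n suppresses sed's automatic printing, so your command prints nothing unless you also add the p flag to the substitution.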
sed 's/^>/>seq_x/' file
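If the digits do need to be captured (for example, to leave other lines starting with > untouched), a minimal sketch of the original command with [0-9] in place of \d might look like this. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the sample input is piped in rather than read from file.

```shell
# Two records from the question's input, piped in instead of read from a file.
out=$(printf '>1\nTGAACCATCGAGTCTTTGAACG\n>2\nGAGTTCATTTCTCTCTGGAGGCACC\n' |
  sed -E 's/^>([0-9]+)/>seq_x\1/')
printf '%s\n' "$out"
```

Without -n, sed prints every line, so the sequence lines pass through unchanged.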

Related

Regex to extract string between two sets of underscores [duplicate]

This question already has answers here:
How to get String between last two underscore
(5 answers)
Closed 2 years ago.
I have been trying to extract a string from a directory path that contains multiple underscores as delimiters.
On regex101 I'm trying to extract foobar, but I can only get _pdf-documents_
regex
_([^_]+)_
directory path
/data/documents/2020/05/07/2020-05-07-12_pdf-documents_foobar_hour.abc.defg.log
If you only ever work with this string, you can use _([^_p]+)_. Excluding p stops the match from starting at _pdf-documents_.
If awk is acceptable:
echo '/data/documents/2020/05/07/2020-05-07-12_pdf-documents_foobar_hour.abc.defg.log' |
awk -F'_' '{print $3}'
Output
foobar
Or, as Wiktor Stribiżew said in the comments, split the string on underscores in whatever language you are using. This is the simplest, most maintainable, most readable, and most reliable solution.
This worked for me.
.*_([^_]+)_.*
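The greedy pattern above can be tried outside regex101 with sed. This is only a sketch: the variable name path is made up for illustration, and -E is assumed to be available.

```shell
path='/data/documents/2020/05/07/2020-05-07-12_pdf-documents_foobar_hour.abc.defg.log'
# The leading greedy .* pushes the capture to the last pair of underscores.
out=$(printf '%s\n' "$path" | sed -E 's/.*_([^_]+)_.*/\1/')
printf '%s\n' "$out"
```

Because the pattern spans the whole line, replacing it with \1 leaves only the captured segment.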

Why sed command does not recognize my regex to change date format? [duplicate]

This question already has answers here:
What delimiters can you use in sed? [duplicate]
Using different delimiters in sed commands and range addresses
(3 answers)
How to insert strings containing slashes with sed? [duplicate]
(11 answers)
Closed 2 years ago.
I'm studying regex and came across an example that converts date formats:
sed -r 's|([0-9]{2})/([0-9]{2})/([0-9]{4})|\3/\2/\1|' <<< 05/04/1947
Output:
1947/04/05
I understand why this works, but I'm not sure why the author used | (pipe) instead of / to separate the regexp, replacement, and flags sections.
I thought changing | to / would produce the same output:
sed -r 's/([0-9]{2})/([0-9]{2})/([0-9]{4})/\3/\2/\1/' <<< 05/04/1947
Why doesn't the last example work like the first one?
Thanks!
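A sketch of what goes wrong, going by the linked duplicates: with / as the delimiter, sed treats the first unescaped / inside the pattern as the end of the pattern, so the rest of the command is parsed as the replacement and flags, and the command fails. Keeping / as the delimiter works only if every literal / is escaped. The input is piped in with printf here instead of using <<<.

```shell
# With / as the delimiter, each literal / in the pattern and replacement is escaped.
out=$(printf '05/04/1947\n' |
  sed -E 's/([0-9]{2})\/([0-9]{2})\/([0-9]{4})/\3\/\2\/\1/')
printf '%s\n' "$out"
```

Using | as the delimiter, as the original author did, avoids the escaping and is easier to read.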

sed linux command unable to insert line [duplicate]

This question already has answers here:
How to escape single quotes within single quoted strings
(25 answers)
Closed 4 years ago.
How can I make this command work correctly? I want to add the line
LOGIN_SERVICE: '"https://dev-login-o365.grey.com/gp_loginservice/"',
to the file env.js. I'm using this command:
sed -i '5i LOGIN_SERVICE: '"https://login.xxxx.com/server/"',' ./env.js
But it adds the value without the quotes and three spaces further to the left, like this:
Use different quoting:
sed -i "5i LOGIN_SERVICE: '\"https://login.xxxx.com/server/\"'," env.js
Or, as the helpful comments below suggest, keep the sed command in single quotes and write each embedded single quote as '\'':
sed '5i LOGIN_SERVICE: '\''"https://login.xxxx.com/server/"'\'',' env.js
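The single-quote form can be checked without touching a real file. This sketch assumes GNU sed (the one-line 5i text form is a GNU extension) and uses a made-up five-line input in place of env.js.

```shell
# Hypothetical five-line stand-in for env.js, piped in rather than edited in place.
# '\'' closes the single-quoted string, adds a literal ', and reopens the string.
out=$(printf 'line1\nline2\nline3\nline4\nline5\n' |
  sed '5i LOGIN_SERVICE: '\''"https://login.xxxx.com/server/"'\'',')
printf '%s\n' "$out"
```

The inserted text lands before the original line 5, with both the single and double quotes preserved.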

Replacing digits in PowerShell doesn't work [duplicate]

This question already has answers here:
What's the difference between .replace and -replace in powershell?
(2 answers)
Closed 4 years ago.
Edit: though the question above is related, this isn't the same question as asking the difference between .replace and -replace, nor does it have the same answer.
Per the PowerShell docs:
\d matches any digit character.
I have a command (gg, an alias for git grep) that gives the output:
packages/somemodule/index.js:69: log(`woo`)
I'm familiar with regexes, and would like to change the output to:
packages/somemodule/index.js:69 log(`woo`)
I.e. adding a space after the first digits and the colon (if you're interested, this is to make the file openable by an editor). However a digit, one or more times, followed by a colon \d+: doesn't work:
gg 'No previous' | % {$_.replace("\d+:",'xxxx')}
Trying different versions, the \d doesn't work. What am I doing wrong?
Command output is treated as string data. In your code you are calling the [String].Replace() method which does not support regular expressions. For this to work as expected, you need to use PowerShell's -replace operator.
gg 'No previous' | % { $_ -replace '\d+:','xxxx' }
This approach will allow PowerShell to utilize regular expressions for string replacement!

How to use sed and/or regex to trim a line in a file using bash? [duplicate]

This question already has answers here:
Regex to extract first 3 words from a string
(3 answers)
Closed 5 years ago.
This seems like it should be simple, but I've spent far too much time searching. How can I use sed and regex to trim off all words in a line after the fourth word?
For instance from:
19900101, This is a title
19091110, This is a really long title
I would like to have
19900101, This is a
19091110, This is a
I've tried answers like this one Regex to extract first 3 words from a string, but I'm using Mac OSX, so I get context address errors.
This is easily done using cut:
cut -d ' ' -f 1-4 file
19900101, This is a
19091110, This is a
Or using awk:
awk '{NF=4} 1' file
19900101, This is a
19091110, This is a
This might work for you (GNU sed):
sed 's/\s*\S*//5g' file
This removes the fifth and all subsequent words from each line.
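Since the question mentions macOS, where the stock sed may not support \s and \S, here is a hedged alternative that relies only on extended regular expressions via -E. It assumes words are separated by spaces only; lines with fewer than four words are left unchanged.

```shell
# Keep three "word plus spaces" groups and a fourth word; drop the rest of the line.
out=$(printf '19900101, This is a title\n19091110, This is a really long title\n' |
  sed -E 's/^(([^ ]+ +){3}[^ ]+).*/\1/')
printf '%s\n' "$out"
```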