Sed Search and Replace - regex

I have a line in a file as follows:
LDAP_EXTERNAL_URL=ldap://server.domain.com:389
I need to replace everything after // with server2.domain.com:399.
I'm starting to get a headache with the regexp. Please advise.

You can use the following as well if you're not strictly stuck on using sed.
perl -pe 's!^LDAP[^/]*//server\K.*!2.domain.com:399!' file

Perhaps:
sed -r 's|^(.*ldap://).*|\1server2.domain.com:399|' file
Or, to be more specific about which line gets changed:
sed -r 's|^(LDAP_EXTERNAL_URL=ldap://).*|\1server2.domain.com:399|' file
Add -i to edit the file in place.
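Before adding -i, it can help to run the substitution against a throwaway copy and look at the result. The sketch below assumes a sed that accepts -E (GNU and BSD sed both do); the scratch file and its OTHER_SETTING line are made up for illustration.

```shell
# Create a scratch file containing the line from the question plus an
# unrelated line (OTHER_SETTING is a made-up example key).
tmp=$(mktemp)
printf '%s\n' 'LDAP_EXTERNAL_URL=ldap://server.domain.com:389' 'OTHER_SETTING=1' > "$tmp"

# Group 1 keeps everything up to and including "//"; the rest of the
# line is replaced with the new host and port.
result=$(sed -E 's|^(LDAP_EXTERNAL_URL=ldap://).*|\1server2.domain.com:399|' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Only the LDAP_EXTERNAL_URL line should change; any other line passes through as-is.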


General solutions to replace string regex preceded and followed by '\n'

I have a file in CentOS which looks like following
[root@localhost nn]# cat -A excel.log
real1$
0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I$
real2$
0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I$
real3$
0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I$
real4$
0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I1^I0.5^I1^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I$
real5$
0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I1^I0.5^I0.5^I0.5^I0.5^I$
real6$
I would like to replace \nreal[2-6]\n with \t\t\t and have tried the following without success:
sed -i 's/\nreal[2-6]\n/\t\t\t/g' file
It seems that sed has difficulty dealing with line breaks. Any idea how to make this regex work on CentOS?
Much appreciated!
If you want to consider perl then use:
perl -i -0777 -pe 's/\n(?:51[23]real|real[2-6])(?:\n|\z)/\t\t\t/g' file
If you want to avoid last real\d+ line to be replaced with \t\t\t then use:
perl -i -0777 -pe 's/\n(?:51[23]real|real[2-6])\n(?!\z)/\t\t\t/g' file
(?!\z) is a negative lookahead that makes the match fail when the end of the input immediately follows.
With GNU sed, you need to use the -z option:
sed -i -z 's/\nreal[2-6]\n/\t\t\t/g' file
#      ^^
Since you also want to handle specific alternations, enable POSIX ERE syntax with either the -r or the -E option:
sed -i -Ez 's/\n(51[23]real|real[2-6])\n/\t\t\t/g' file
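To see what -z changes, the substitution can be tried on a small sample first. This is a minimal sketch that assumes GNU sed (-z is a GNU extension); the sample values are made up and only mimic the shape of excel.log.

```shell
# Build a small tab-separated sample shaped like excel.log (values are made up).
sample=$(mktemp)
printf 'real1\na\tb\nreal2\nc\td\nreal3\n' > "$sample"

# -z makes GNU sed split input on NUL bytes instead of newlines, so the whole
# file is one record and \n can be matched like any other character.
joined=$(sed -z 's/\nreal[2-6]\n/\t\t\t/g' "$sample")

# cat -A shows tabs as ^I and line ends as $, as in the question.
printf '%s\n' "$joined" | cat -A
rm -f "$sample"
```

Because the whole file becomes a single record, it is held in memory at once, which may matter for very large files.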

remove non latin-1 characters in a text file [duplicate]

I want to remove all the non-ASCII characters from a file in place.
I found one solution with tr, but I guess I need to write back that file after modification.
I need to do it in place with relatively good performance.
Any suggestions?
A perl oneliner would do: perl -i.bak -pe 's/[^[:ascii:]]//g' <your file>
-i says that the file is going to be edited inplace, and the backup is going to be saved with extension .bak.
# -i (inplace)
sed -i 's/[\d128-\d255]//g' FILENAME
I tried all the solutions and nothing worked. The following, however, does:
tr -cd '\11\12\15\40-\176'
Which I found here:
https://alvinalexander.com/blog/post/linux-unix/how-remove-non-printable-ascii-characters-file-unix
My problem needed it in a series of piped programs, not directly from a file, so modify as needed.
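Since tr only reads standard input and writes standard output, writing the result back to the same file needs a temporary file. The following is a sketch of that pattern, not a drop-in tool; the scratch file and its contents are made up for illustration.

```shell
# Scratch file with a few non-ASCII bytes (octal escapes; \303\251 is UTF-8 "é").
src=$(mktemp)
printf 'caf\303\251 \200ok\n' > "$src"

# tr cannot edit in place, so write to a temporary file and then replace
# the original. Kept: tab (\11), LF (\12), CR (\15), printable ASCII (\40-\176).
tmp=$(mktemp)
LC_ALL=C tr -cd '\11\12\15\40-\176' < "$src" > "$tmp" && mv "$tmp" "$src"

cleaned=$(cat "$src")
printf '%s\n' "$cleaned"
rm -f "$src"
```

Note that mv replaces the original file with the temporary one, so its ownership and permissions are not preserved; writing back with cat "$tmp" > "$src" keeps them at the cost of an extra copy.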
sed -i 's/[^[:print:]]//g' FILENAME
Also, this acts like dos2unix
Try tr instead of sed. Newline is not part of [:print:], so add \n to the set to keep the line breaks:
tr -cd '[:print:]\n' < file.txt
# -i (inplace)
LANG=C sed -i -E "s|[\d128-\d255]||g" /path/to/file(s)
The role of the LANG=C part is to avoid an "Invalid collation character" error.
Based on Ivan's answer and Patrick's comment.
This worked for me:
sed -i 's/[^[:print:]]//g'
I'm using a very minimal busybox system, in which there is no support for ranges in tr or POSIX character classes, so I have to do it the crappy old-fashioned way. Here's the solution with sed, stripping ALL non-printable non-ASCII characters from the file:
sed -i 's/[^a-zA-Z 0-9`~!@#$%^&*()_+\[\]\\{}|;'\'':",.\/<>?]//g' FILE
As an alternative to sed or perl you may consider to use ed(1) and POSIX character classes.
Note: ed(1) reads the entire file into memory to edit it in-place, so for really large files you should use sed -i ..., perl -i ...
# see:
# - http://wiki.bash-hackers.org/doku.php?id=howto:edit-ed
# - http://en.wikipedia.org/wiki/Regular_expression#POSIX_character_classes
# test
echo $'aaa \177 bbb \200 \214 ccc \254 ddd\r\n' > testfile
ed -s testfile <<< $',l'
ed -s testfile <<< $'H\ng/[^[:graph:][:space:][:cntrl:]]/s///g\nwq'
ed -s testfile <<< $',l'
awk '{ sub("[^a-zA-Z0-9\"!@#$%^&*|_\[](){}", ""); print }' MYinputfile.txt > pipe_out_to_CONVERTED_FILE.txt
I appreciate the tips I found on this site.
On Windows 10, however, I had to use double quotes for this to work:
sed -i "s/[\d128-\d255]//g" FILENAME
A few things I noticed:
For FILENAME, the entire path\name needs to be quoted.
This didn't work: %TEMP%\"FILENAME"
This did: "%TEMP%\FILENAME"
sed leaves behind temp files in the current directory, named sed*.

Rewrite URL using sed while maintaining filename

I would like to find all instances of a URL in a file and replace them with a different link structure.
An example would be converting http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png to /images/Security_Panda.png.
I am able to identify the link using a regular expression such as:
^(http:)|([/|.|\w|\s])*\.(?:jpg|gif|png)
but need to rewrite using sed so that the file name is maintained. I understand that I will need to use s/${PATTERN}/${REPLACEMENT}/g.
I tried sed -i 's#(http:)|([/|.|\w|\s])*\.(?:jpg|gif|png)#/dir/$1#g' test without success. Any thoughts on how to improve the approach?
In basic sed, you need to escape the () symbols like \(..\) to mean a capturing group.
sed 's~http://[.a-zA-Z0-9_/-]*\/\(\w\+\.\(jpg\|gif\|png\)\)~/images/\1~g' file
Example:
$ echo 'http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png' | sed 's~http://[.a-zA-Z0-9_/-]*\/\(\w\+\.\(jpg\|gif\|png\)\)~/images/\1~g'
/images/Security_Panda.png
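Since the question mentions URLs inside a file, here is a sketch of the same idea applied to a line where the URL is embedded in other text. It uses -E (extended syntax, so the groups need no backslashes) and spells out \w as [A-Za-z0-9_] for portability; the HTML line is a made-up example.

```shell
# A made-up line with the URL embedded in surrounding markup.
line='<img src="http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png" alt="panda">'

# The bracket expression consumes the host and directories; group 1 captures
# only the file name after the last "/", which is reused in the replacement.
rewritten=$(printf '%s\n' "$line" |
  sed -E 's~http://[.a-zA-Z0-9_/-]*/([A-Za-z0-9_]+\.(jpg|gif|png))~/images/\1~g')
printf '%s\n' "$rewritten"
```

Text before and after the URL is left alone because the pattern is not anchored to the start or end of the line.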
You can use:
sed 's~^.*/\([^/]\{1,\}\)$~/images/\1~' file
/images/Security_Panda.png
Testing:
s='http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png'
sed 's~^.*/\([^/]\{1,\}\)$~/images/\1~' <<< "$s"
/images/Security_Panda.png
An easier way, if you are open to a different approach, is shell parameter expansion:
#!/usr/bin/env bash
URL="http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png"
echo "/images/${URL##*/}"
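The expansion used above can be broken down as follows. This is a small sketch of the two related operators; the variable names (name, dir, new_path) are illustrative and not part of the original answer.

```shell
url='http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png'

# ${url##*/} removes the longest prefix matching "*/", leaving the file name.
name=${url##*/}

# ${url%/*} removes the shortest suffix matching "/*", leaving the directory part.
dir=${url%/*}

new_path="/images/$name"
printf '%s\n%s\n' "$dir" "$new_path"
```

This only works when the variable holds a single URL and nothing else; for URLs embedded in larger text, a sed or awk approach is a better fit.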
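Another way, from the command line. Note that the final . in the pattern consumes one character after the file name, so this works as shown only when the URL is followed by one extra character (a space in the example below):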
sed 's#^http:.*/\(.*\).$#/images/\1#g'
Example
echo "http://www.domain.com/wp-content/uploads/2013/03/Security_Panda.png "|sed 's#^http:.*/\(.*\).$#/images/\1#g'
results
/images/Security_Panda.png
An awk version:
awk -F\/ '/(jpg|gif|png) *$/ {print "/images/"$NF}' file
/images/Security_Panda.png

Replacing a fixed-position character field using Perl or sed

I need to replace a particular range of characters in each line of a file.
I tried this
perl -i -pe 'r77,79c/XXX/g' file
I am trying to change the 77th to 79th characters to XXX using Perl, but the above code is not working.
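Do you want to replace the characters at positions 77-79 with XXX?
Try:
perl -i'orig_*' -pe 'substr($_, 76, 3) = "XXX"' file
A backup file called orig_file will be created to guard against data loss. Lines shorter than 76 characters make the substr assignment fail, so guard it if the file may contain short lines.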
perl -i -pe 's/.{76}\K.../XXX/' file
You wrote:
Actually, I want to search for a pattern in a file and, on every line matching that pattern, replace the 50th and 51st characters with XX.
Using sed:
sed -r '/pattern/s/^(.{49})..(.*)$/\1XX\2/' file
sed "/pattern/ s/^\(.\{49\}\)../\1XX/" YourFile
The second form does not capture the rest of the line: sed leaves everything after the matched part untouched.
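The technique is easier to verify on a short line. The sketch below replaces characters 5 and 6 instead of 50 and 51; the sample lines and the "match" pattern are made up for illustration.

```shell
input='abcdefghij match
klmnopqrst other'

# On lines containing "match", keep the first 4 characters (group 1) and
# overwrite characters 5 and 6 with XX; the rest of the line is untouched.
masked=$(printf '%s\n' "$input" | sed '/match/ s/^\(.\{4\}\)../\1XX/')
printf '%s\n' "$masked"
```

For positions N and N+1, the repeat count in \{...\} is N-1, which is where the 49 in the commands above comes from.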

sed - Commenting a line matching a specific string AND that is not already commented out

I have the following test file
AAA
BBB
CCC
Using the following sed I can comment out the BBB line.
# sed -e '/BBB/s/^/#/g' -i file
I'd like to comment out the line only if it does not already have a # at the beginning.
# sed -e '/^#/! /BBB/s/^/#/g' file
sed: -e expression #1, char 7: unknown command: `/'
Any ideas how I can achieve this?
Assuming you don't have any lines with multiple #s this would work:
sed -e '/BBB/ s/^#*/#/' -i file
Note: you don't need /g since you are doing at most one substitution per line.
Another solution with the & special character which references the whole matched portion of the pattern space. It's a bit simpler/cleaner than capturing and referencing a regexp group.
sed -i 's/^[^#]*BBB/#&/' file
I find this solution to work the best.
sed -i '/^[^#]/ s/\(^.*BBB.*$\)/#\ \1/' file
It doesn't matter how many "#" symbols there are, it will never add another one. If the line does not already start with a "#", one is added at the beginning of the line, followed by a space.
If you don't want the space after the #:
sed -i '/^[^#]/ s/\(^.*BBB.*$\)/#\1/' file
Assuming the BBB is at the beginning of a line, I ended up using an even simpler expression:
sed -e '/^BBB/s/^/#/' -i file
One more note for the future me. Do not overlook the -i. Because this won't work: sed -e "..." same_file > same_file.
sed -i '/![^#]/ s/\(^.*BBB.*$\)/#\ \1/' file
This doesn't work for me with the keyword *.sudo; no lines get commented at all.
Only the syntax below works:
sed -e '/sudo/ s/^#*/#/' file
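Actually, you don't need the exclamation mark (!): the caret inside the square brackets already negates the set. The bracket expression does need to be anchored with ^ so that it tests the first character of the line; unanchored, /[^#]/ matches any line containing a non-# character, including lines that are already commented:
sed -i '/^[^#]/ s/\(^.*BBB.*$\)/#\ \1/' file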
Comment out every line containing "BBB" that hasn't been commented out yet:
sed -i '/BBB/s/^#\?/#/' file
If BBB is at the beginning of the line:
sed 's/^BBB/#&/' -i file
If BBB is in the middle of the line:
sed 's/^[^#]*BBB/#&/' -i file
I'd usually supply sed with -i.bak to back up the file before making changes to the original copy:
sed -i.bak '/BBB/ s/^#*/#/' file
This way when done, I have both file and file.bak and I can decide to delete file.bak only after I'm confident.
If you want to comment out not only exact matches for 'BBB' but also lines that have 'BBB' somewhere in the middle, you can go with following solution:
sed -E '/^([^#].*)?BBB/ s/^/#/'
This won't change any strings that are already commented out.
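One way to check that a commenting command is safe to re-run is to apply it twice and compare the results. The sketch below does that for the -E variant above; the sample lines are made up.

```shell
input='AAA
BBB
#BBB
CCC BBB'

# The address matches BBB at the start of the line, or BBB preceded by text
# whose first character is not "#"; only those lines get a "#" prepended.
once=$(printf '%s\n' "$input" | sed -E '/^([^#].*)?BBB/ s/^/#/')

# Running the same command on its own output should change nothing.
twice=$(printf '%s\n' "$once" | sed -E '/^([^#].*)?BBB/ s/^/#/')

printf '%s\n' "$once"
```

If once and twice differ, the command stacks extra # characters on repeated runs, which is the behaviour the question wants to avoid.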