I have been trying to find how to do a batch find and replace in Terminal on Mac OS X for more than the past hour. I found different versions of code, but am having difficulty making it work. So far, I have found one string of code that works, but it only works for one term/character.
What I want to do is find and replace multiple characters in one text file, all at the same time.
For example:
Find §, replace with ก
Find Ø, replace with ด
Find ≠, replace with ห
Find £, replace with ้
The code that works so far is (but only for one character):
sed -i '' s/Ø/ด/ [textfile.txt]
Could anyone please help me out?
Your usage pattern is so common that there is a dedicated utility for it, namely tr:
tr abc ABC < input.txt > output.txt
where you use two strings (here abc and ABC) to instruct tr on the substitutions you want (here, substitute a with A, b with B etc).
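As a quick illustration of the character-for-character mapping, here is a minimal sketch. The function name upcase_abc is made up for this example. Note that some tr implementations (GNU coreutils, for instance) operate on single bytes, so multi-byte characters such as the ones in the question may not map correctly with tr; the example below sticks to ASCII for that reason.

```shell
# Hypothetical helper: map a->A, b->B, c->C and leave everything else alone.
upcase_abc() {
    tr abc ABC
}

# tr reads standard input and writes standard output.
printf '%s\n' 'cabbage' | upcase_abc   # prints "CABBAge"
```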
With sed, which is much more general than tr, searching for and replacing the first occurrence on every line looks like this:
sed 's/src1/rep1/' < in > out
To search and replace every occurrence on every line, add the g flag to the s command:
sed 's/src1/rep1/g' < in > out
Finally, to do multiple search-and-replace operations, separate the s commands with semicolons:
sed 's/src1/rep1/g;s/src2/rep2/;s/src3/rep3/g' < in > out
Note that in the above example I used the g switch (line-wise global substitution) for the 1st and the 3rd find&replace and not for the 2nd one... your usage may be different but I hope that you've spotted the pattern, haven't you?
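Applied to the four characters from the question, the semicolon pattern might look like the sketch below. The function name replace_chars is made up for this example, and the g flag on every command is an assumption that each character should be replaced everywhere on the line.

```shell
# Hypothetical helper: apply all four substitutions in a single sed pass.
replace_chars() {
    sed 's/§/ก/g;s/Ø/ด/g;s/≠/ห/g;s/£/้/g'
}

# Reads standard input, writes standard output.
printf '%s\n' 'a§bØc≠d£' | replace_chars
```

To edit the file in place on Mac OS X, as in the question, the same quoted script should be usable with sed -i '' followed by the file name.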
I want a one-liner, run from a *nix shell, to replace all occurrences of particular text contained between arbitrary "start" and "finish" markers, e.g. in
nfw987__qrh fwef_start_hf9
832j fsjdlkfa;jd(&6^)lf dfs
ahlkj;fd__sajhfds
dsfahs__lkjfdsaf jlkfdsa_finish_jfoi__edwp
replace all __ which are between _start_ and _finish_ with ().
I've tried a web search, but all I find is "simple" replace. I'm writing code to do that, but maybe this (IMHO common) task has already been solved with sed, perl, awk etc.
As per al76 link, sed can be easily used for such cases (text is also replaced on start and end lines regardless of where on that line text is, that does not answer my question exactly, but for my current task it is sufficient):
To address the lines between two regular expressions, RE1 and RE2, one would do this: '/RE1/,/RE2/{commands;}'
sed '/_start_/,/_finish_/{s/__/()/g;}' tst2.txt
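To see what the range address does and does not touch, here is a small sketch on made-up input. The function name paren_between_markers is hypothetical. It uses a plain () in the replacement and a ; before the closing brace, which some sed implementations require.

```shell
# Hypothetical helper: replace __ with () on every line from the first
# line containing _start_ through the next line containing _finish_.
paren_between_markers() {
    sed '/_start_/,/_finish_/{s/__/()/g;}'
}

printf '%s\n' 'a__b' 'x_start_y__z' 'm__n' 'p__q_finish_r__s' 't__u' |
    paren_between_markers
# prints:
# a__b
# x_start_y()z
# m()n
# p()q_finish_r()s
# t__u
```

The start and finish lines are edited in full, including the parts before _start_ and after _finish_, which is the limitation mentioned above.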
I have multiple xml files that look like this: <TEST><TEST><TEST><TEST><TEST><TEST><TEST><TEST><TEST><TEST>
I would like to break onto a new line at every '<' and get rid of every '>'.
I want to do this via regex since what I'm working on is for *nix.
There is no need for regex to do such a simple search & replace. You want to replace < with \n< and > with an empty string.
Assuming your content is in file input.txt, this simple sed command line can do the job:
sed 's/</\n</g;s/>//g' input.txt
How it works
There are two sed commands separated by ;:
s/</\n</g
s/>//g
Both commands are s (search and replace). The s command takes the search regex (a plain string here), the replacement string and optional flags, separated by /.
The first s searches for < and replaces it with \n<. \n is the usual notation for a newline character in regex and many Unix tools (even when no regex is involved).
The second s searches for > and replaces it with nothing.
Both s commands use the g (global) flag that tells them to do all the replacements they can do on each line. sed runs each command for every line of the input and by default, s stops after the first replacement (on a line).
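Not every sed implementation interprets \n in the replacement text. A sketch of a more portable spelling, assuming a POSIX-style sed, writes the newline as a backslash followed by an actual line break. The function name split_tags is made up for this example.

```shell
# Hypothetical helper: put each < on a new line and delete every >.
# The backslash at the end of the first script line escapes a literal
# newline in the replacement text.
split_tags() {
    sed 's/</\
</g;s/>//g'
}

# The output starts with an empty line because the first < is also
# preceded by a newline.
printf '%s\n' '<A><B><C>' | split_tags
```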
I'm brand new to regex. I am trying to write a script to comment out lines in a file, so that when we retire a network computer we can remove it from our administrative files (rdist, etc) without having to comment them out by hand. What I have so far is
#!/bin/bash
echo $*
NAMES=$*
FILES="/foo/testfile1
/foo/testfile2"
for name in $NAMES
do
sed -i "s/${name}/#&/g" $FILES
done
exit 0
This works when the testfiles have the target string appear at the beginning of the line, but not if the string is somewhere in the middle. How can I tell sed or regex to insert a hash at the beginning of the line that the string is found on?
(I've been reading my way through a bunch of tutorials online, but the closest thing to what I want seems to be the caret ^. What I'm getting from the explanation is that in multiline mode, it only returns instances of the string that are located at the beginning of the line.)
I'm working on RedHat 5.5, using gedit 2.8.1 as my text editor and sed 4.1.2.
Thank you in advance for your help!
The script below takes the arguments passed to it and looks for them as whole words, i.e., if an argument is foo, then blah foo bar will be commented out but blahfoo bar will not. I also added a bit of code so that if a line matches multiple arguments, you still get only one # at the beginning of the line.
#!/bin/bash
FILES="./test1 ./test2"
for name; do
sed -i "/\<$name\>/s/^#*/#/" $FILES
done
Stealing the structure from SiegeX and simplifying the sed program:
#!/bin/bash
FILES="./test1 ./test2"
for name; do
sed -i "s/^.*$name.*$/#&/" $FILES
done
The idea is that rather than using a pattern to select and then an s to edit, you use the s pattern to do both - recognise a complete line that contains the target name, and replace it with a commented-out version.
You could do this more elegantly by merging the names into one big regular expression; that would let sed make one pass rather than N. That's easy enough that I leave it as an exercise to the reader ...
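One possible way to do the merge is sketched below. It assumes a sed that accepts -E for extended regular expressions and that the names contain no regex metacharacters. The function names join_names and comment_out are made up for this example.

```shell
# Hypothetical helper: join all arguments with "|", e.g. "host1|host2".
# The body runs in a subshell so the IFS change does not leak out.
join_names() (
    IFS='|'
    printf '%s' "$*"
)

# Hypothetical helper: prefix every input line that contains any of the
# given names with a single #, in one sed pass.
comment_out() {
    sed -E "s/^.*($(join_names "$@")).*$/#&/"
}

printf '%s\n' 'alpha one' 'beta' 'two gamma' | comment_out one two
# prints:
# #alpha one
# beta
# #two gamma
```

To edit the files in place as the scripts above do, the same expression could be passed to sed -i -E followed by $FILES.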
I'm running a grep to find any *.sql file that has the word select followed by the word customerName followed by the word from. This select statement can span many lines and can contain tabs and newlines.
I've tried a few variations on the following:
$ grep -liIr --include="*.sql" --exclude-dir="\.svn*" --regexp="select[a-zA-Z0-9+\n\r]*customerName[a-zA-Z0-9+\n\r]*from"
This, however, just runs forever. Can anyone help me with the correct syntax please?
Without the need to install the grep variant pcregrep, you can do a multiline search with grep.
$ grep -Pzo "(?s)^(\s*)\N*main.*?{.*?^\1}" *.c
Explanation:
-P activate perl-regexp for grep (a powerful extension of regular expressions)
-z Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. That is, grep knows where the ends of the lines are, but sees the input as one big line. Beware that this also adds a trailing NUL character when used with -o.
-o print only matching. Because we're using -z, the whole file is like a single big line, so if there is a match, the entire file would be printed; this way it won't do that.
In regexp:
(?s) activate PCRE_DOTALL, which means that . finds any character or newline
\N find anything except newline, even with PCRE_DOTALL activated
.*? find . in non-greedy mode, that is, stops as soon as possible.
^ find start of line
\1 backreference to the first group (\s*). This attempts to match a closing brace at the same indentation as the start of the method.
As you can imagine, this search prints the main method in a C (*.c) source file.
I am not very good with grep, but your problem can be solved with awk.
For example:
awk '/select/,/from/' *.sql
The above prints each block of lines from a line that matches select through the next line that matches from. You then need to check whether the returned statements contain customerName. For this you can pipe the result into awk or grep again.
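The range-then-filter idea could be sketched as follows. This is an approximation, not an exact parser: it matches case-insensitively and only checks that customerName appears somewhere on the lines of a select ... from block, so it does not verify the order of the three words within a line. The function names has_customer_select and list_customer_sql are made up for this example.

```shell
# Hypothetical helper: succeed if some select...from block in file $1
# mentions customerName (all matching is case-insensitive).
has_customer_select() {
    awk 'tolower($0) ~ /select/, tolower($0) ~ /from/' "$1" |
        grep -qi 'customername'
}

# Hypothetical helper: list the *.sql files under directory $1
# (default: current directory) that pass the check above.
list_customer_sql() {
    find "${1:-.}" -name '*.sql' | while IFS= read -r f; do
        if has_customer_select "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling list_customer_sql with no argument searches the current directory recursively, similar to the -r option in the grep attempt.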
Your fundamental problem is that grep works one line at a time - so it cannot find a SELECT statement spread across lines.
Your second problem is that the regex you are using doesn't deal with the complexity of what can appear between SELECT and FROM - in particular, it omits commas, full stops (periods) and blanks, but also quotes and anything that can be inside a quoted string.
I would likely go with a Perl-based solution, having Perl read 'paragraphs' at a time and applying a regex to that. The downside is having to deal with the recursive search - there are modules to do that, of course, including the core module File::Find.
In outline, for a single file:
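$/ = "\n\n"; # Read one paragraph at a time
while (<>)
{
    # /s lets . match newlines; /i ignores case
    if ($_ =~ m/SELECT.*customerName.*FROM/si)
    {
        print "$ARGV\n"; # Name of the file currently being read
        close ARGV;      # Skip the rest of this file; <> moves to the next one
    }
}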
That needs to be wrapped into a sub that is then invoked by the methods of File::Find.
I'm having a hard time selecting text from a file using a regular expression. I'm trying to replace specific text in a file that is full of lines like this:
/home/user/test2/data/train/train38.wav /home/user/test2/data/train/train38.mfc
I'm trying to replace the bolded text. The problem is that I don't know how to select only the bolded text, since I need to use .wav in my regexp, and the filename and the location of the file are also going to be different.
Hope you can help
Best regards,
Jökull
This assumes that what you want to replace is the string between the last two slashes in the first path.
sed 's|\([^/]*/\)[^/]*\(/[^/]* .*\)|\1FOO\2|' filename
produces:
/home/user/test2/data/FOO/train38.wav /home/user/test2/data/train/train38.mfc
sed processes lines one at a time, so you can omit the global option and it will only change the first 'train' on each line
sed 's/train/FOO/' testdat
vs
sed 's/train/FOO/g' testdat
which is a global replace
This is quite a bit more readable and less error-prone than some of the other possibilities, but of course there are applications which will not simplify quite as readily.
sed 's;\(\(/[^/]\+\)*\)/train\(\(/[^/]\+\)*\)\.wav;\1/FOO\3.wav;'
You can do it like this
sed -e 's/\<train\>/plane/g'
The \< tells sed to match the beginning of that word and the \> tells it to match the end of the word.
The g at the end means global so it performs the match and replace on the entire line and does not stop after the first successful match as it would normally do without g.
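Run against the sample line from the question, the word boundaries make a visible difference. The sketch below assumes GNU sed, where \< and \> are supported; the function name replace_word is made up for this example.

```shell
# Hypothetical helper: replace the whole word "train" with "plane".
replace_word() {
    sed -e 's/\<train\>/plane/g'
}

line='/home/user/test2/data/train/train38.wav /home/user/test2/data/train/train38.mfc'
printf '%s\n' "$line" | replace_word
# prints:
# /home/user/test2/data/plane/train38.wav /home/user/test2/data/plane/train38.mfc
```

The directory name train is replaced in both paths because of g, while train38 is left alone because the 3 after train is a word character, so \> does not match there.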