I'm searching for a way to replace the first occurrence of a certain text in a text file with a value ${A} and the second occurrence of the same text, on a different line, with ${B}. Can this be achieved with sed or awk or any other UNIX tool?
The toolset is fairly limited: bash, common UNIX tools like sed, grep, awk etc. Perl, Python, Ruby etc. cannot be used...
Thanks in advance for any advice
Robert
Example:
...
Text
Text
Text
Text
TEXT_TO_BE_REPLACED
Text
Text
Text
TEXT_TO_BE_REPLACED
Text
Text
Text
...
should be replaced with
...
Text
Text
Text
Text
REPLACEMENT_TEXT_A
Text
Text
Text
REPLACEMENT_TEXT_B
Text
Text
Text
...
Sed in one run (the 0,/RE/ address is a GNU sed extension; the first expression consumes the first occurrence, so the second expression hits the next one):
sed -e '0,/TEXT_TO_BE_REPLACED/s//REPLACEMENT_TEXT_A/' \
    -e '0,/TEXT_TO_BE_REPLACED/s//REPLACEMENT_TEXT_B/' < input_file > output_file
Just run your script twice - once to replace the first occurrence with ${A}, once to replace the (now first) occurrence with ${B}.
To replace just one occurrence:
sed '0,/RE/s//to_that/' file
(shamelessly stolen from How to use sed to replace only the first occurrence in a file?)
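For example, combining the two ideas above into a two-pass sketch with the question's ${A} and ${B} (the 0 address and bare -i are GNU sed features, and this assumes ${A} and ${B} contain no / or & characters):
sed -i "0,/TEXT_TO_BE_REPLACED/s//${A}/" input_file
sed -i "0,/TEXT_TO_BE_REPLACED/s//${B}/" input_file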
Here is a possible solution using awk:
#!/usr/bin/awk -f
/TEXT_TO_BE_REPLACED/ {
    if ( n == 0 ) {
        sub( /TEXT_TO_BE_REPLACED/, "REPLACEMENT_TEXT_A", $0 );
        n++;
    }
    else if ( n == 1 ) {
        sub( /TEXT_TO_BE_REPLACED/, "REPLACEMENT_TEXT_B", $0 );
        n++;
    }
}
{
    print
}
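If the script above is saved as, say, replace.awk (a hypothetical name), it can be run as:
awk -f replace.awk input_file > output_file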
awk 'BEGIN { a[0]="REPLACEMENT_A"; a[1]="REPLACEMENT_B"; } \
/TEXT_TO_BE_REPLACED/ { gsub( "TEXT_TO_BE_REPLACED", a[i++]); i%=2 }; 1'
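Note that this one-liner alternates between the two replacements for every occurrence (that is what i%=2 does), not just the first two. To apply it to a file, append the input file name and redirect the output, e.g.:
awk 'BEGIN { a[0]="REPLACEMENT_A"; a[1]="REPLACEMENT_B"; } \
/TEXT_TO_BE_REPLACED/ { gsub( "TEXT_TO_BE_REPLACED", a[i++]); i%=2 }; 1' input_file > output_file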
So, you can use sed to do this like so:
First, I made a file named test.txt that contained:
well here is an example text example
and here is another example text
I choose to use the word "example" to be the value to change.
Here is the command: cat test.txt | sed -e 's/\(example\)/test2/2' -e 's/\(example\)/test1/1'
which provides the following output:
well here is an test1 text test2
and here is another test1 text
Now the sed command broken down:
s - begins the search-and-replace command
/ - starts the search pattern, which is ended by another /
The escaped parentheses \( \) group our text, i.e. example
/test2/ is what we are putting in place of example
The number after the final slash is the occurrence we want to replace.
The -e option allows you to run both expressions in one command line. Note that the second occurrence is replaced first; if the first occurrence were replaced first, the line would no longer have a second "example" for the other expression to match.
You may also use the text editor ed:
# cf. http://wiki.bash-hackers.org/howto/edit-ed
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s file
H
/TEXT_TO_BE_REPLACED/s//REPLACEMENT_TEXT_A/
/TEXT_TO_BE_REPLACED/s//REPLACEMENT_TEXT_B/
wq
EOF
Related
I got this text in file.txt:
Osmun.Prez#mail.com:c7lB2m6b#3.a.a:tt_webid_v2=6990226111024612869; tt_webid=6990226111024612869; tt_csrf_token=VD5Nb_TQFH4RKhoJeSe2nzLB; R6kq3TV7=AHkh4PB6AQAA3LIS90nWf2ss0Q7ZTCQjUat4axctvhQY68DdUEz92RwpmVSX|1|0|e9d6917c2fe555827dcf5ee916ba9778079ab2a9; ttwid=1%7CAFodeNF0iZM2fyy-ZeiZ6HTpZoG_MSx6SmXHgGVQ-V4%7C1627538859%7C59ca1e4a56f9f537b55e655a6dabff88e44eb48502b164ed6b4199f5a5263cb0; passport_csrf_token_default=6f7653c3ce946a6ce5444723fb0c509b; passport_csrf_token=6f7653c3ce946a6ce5444723fb0c509b; sid_guard=0483b7d37f4e4bd20ab3046e29724798%7C1627538893%7C5184000%7CMon%2C+27-Sep-2021+06%3A08%3A13+GMT; uid_tt=27b52febe6222486b9f6b6a90ef4ffeace5ea25c09d29a1583be5a1ecf760996; uid_tt_ss=27b52febe6222486b9f6b6a90ef4ffeace5ea25c09d29a1583be5a1ecf760996; sid_tt=0483b7d37f4e4bd20ab3046e29724798; sessionid=0483b7d37f4e4bd20ab3046e29724798; sessionid_ss=0483b7d37f4e4bd20ab3046e29724798; store-idc=maliva; store-country-code=us; odin_tt=294845c8f7711db177f7c549a9f44edb1555031b27a2a485df809cd92c4e544ac0772bf462df5b7a100f6e488c45303cd62df3b6b950f0842520cd887850137b035d990f29cc8b752765e594560c977f; cmpl_token=AgQQAPNSF-RMpbE89z5HYF0_-2PcrxjXf4fZYP5_ZA
How can I delete everything in the string (first & only instance) from :tt_ through _ZA in file.txt, keeping only Osmun.Prez#mail.com:c7lB2m6b#3.a.a, using bash on Linux?
Thank you
Something like:
sed -i "s/:tt_.*//" file.txt
if you want to edit the file in place. If not, remove the -i switch.
The sed command means: substitute (s), in each line of file.txt, all the characters (.*) starting at the pattern :tt_ with an empty string (the empty replacement between the last two slashes).
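For example, run without -i so the result goes to stdout, the sample line from the question becomes:
$ sed "s/:tt_.*//" file.txt
Osmun.Prez#mail.com:c7lB2m6b#3.a.a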
Or the command:
sed -i "s/:tt_.*_ZA//" file.txt
which matches what you asked for more literally, but returns the same output.
Use pattern substitution:
i=$(cat file.txt)
echo "${i/:tt*_ZA}"
Assuming the general requirement is to remove everything after the 2nd : ...
Sample data:
$ cat file.txt
Osmun.Prez#mail.com:c7lB2m6b#3.a.a:tt_webid_v ... to end of line
some.one#home.com:B52_m6b#9_az.more.stuff:delete from here ... to end of line
One sed idea:
$ sed -En 's/^([^:]*:[^:]*).*$/\1/p' file.txt
Osmun.Prez#mail.com:c7lB2m6b#3.a.a
some.one#home.com:B52_m6b#9_az.more.stuff
Using awk
awk 'BEGIN{FS=OFS=":"}{print $1,$2}'
Using : as the delimiter, it is easy to extract the columns before :tt
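Applied to the sample data above it prints the same two fields as the sed version:
$ awk 'BEGIN{FS=OFS=":"}{print $1,$2}' file.txt
Osmun.Prez#mail.com:c7lB2m6b#3.a.a
some.one#home.com:B52_m6b#9_az.more.stuff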
This deletes all chars from ":tt_" to the last "_ZA", inclusive, in file.txt
Mac_3.2.57$cat file.txt | sed 's/\(\)[:]tt.*_ZA\(.*\)/\1\2/'
Osmun.Prez#mail.com:c7lB2m6b#3.a.a
Mac_3.2.57$
Or, if it is always the first 2 values which are separated by a colon (as per your example):
cat file.txt | cut -f1,2 -d":"
I am trying to find a pattern of two consecutive lines, where the first line is a fixed string and the second contains a substring I'd like to replace.
This is to be done in sh or bash on macOS.
If I had a regex tool at hand that would operate on the entire text, this would be easy for me. However, all I find is bash's simple text replacement - which doesn't work with regex, and sed, which is line oriented.
I suspect that I can use sed in a way where it first finds a matching first line, and only then looks to replace the following line if its pattern also matches, but I cannot figure this out.
Or are there other tools present on macOS that would let me do a regex-based search-and-replace over an entire file or a string? Maybe with Python (v2.7 and v3 is installed)?
Here's a sample text and how I like it modified:
keyA
value:474
keyB
value:474 <-- only this shall be replaced (follows "keyB")
keyC
value:474
keyB
value:474
Now, I want to find all occurrences where the first line is "keyB" and the following one is "value:474", and then replace that second line with another value, e.g. "value:888".
As a regex that ignores line separators, I'd write this:
Search: (\bkeyB\n\s*value):474
Replace: $1:888
So, basically, I find the pattern before the 474, and then replace it with the same pattern plus the new number 888, thereby preserving the original indentation (which is variable).
You can use
sed -e '/keyB$/{n' -e 's/\(.*\):[0-9]*/\1:888/' -e '}' file
# Or, to replace the contents of the file inline in FreeBSD sed:
sed -i '' -e '/keyB$/{n' -e 's/\(.*\):[0-9]*/\1:888/' -e '}' file
Details:
/keyB$/ - finds all lines that end with keyB
n - prints the current pattern space (auto-print is not suppressed here) and then reads the next line into it
s/\(.*\):[0-9]*/\1:888/ - find any text up to the last : + zero or more digits capturing that text into Group 1, and replaces with the contents of the group and :888.
The {...} create a block that is executed only for lines where the /keyB$/ address matches.
See an online sed demo.
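For example, applied to the sample from the question (saved here as file), only the value lines that immediately follow keyB are changed:
$ sed -e '/keyB$/{n' -e 's/\(.*\):[0-9]*/\1:888/' -e '}' file
keyA
value:474
keyB
value:888
keyC
value:474
keyB
value:888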
Use a perl one-liner with -0777 to scan over multiple lines:
$ # inline edit:
$ perl -0777 -i -pe 's/(\bkeyB\n\s*value):\d+/$1:888/g' file.txt
$ # to stdout:
$ cat file.txt | perl -0777 -pe 's/(\bkeyB\n\s*value):\d+/$1:888/g'
In plain bash:
#!/bin/bash
keypattern='^[[:blank:]]*keyB$'
valpattern='(.*):'
replacement=888
while read -r; do
    printf '%s\n' "$REPLY"
    if [[ $REPLY =~ $keypattern ]]; then
        read -r
        if [[ $REPLY =~ $valpattern ]]; then
            printf '%s%s\n' "${BASH_REMATCH[0]}" "$replacement"
        else
            printf '%s\n' "$REPLY"
        fi
    fi
done < file
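Assuming the script is saved as replace_value.sh (a hypothetical name) next to the input file named file that the loop reads from, one way to run it:
bash replace_value.sh > file.new && mv file.new file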
#!/bin/sh
old="hello"
new="world"
sed -i s/"${old}"/"${new}"/g $(grep "${old}" -rl *)
The preceding script only works for single-line text; how can I write a script that can replace multi-line text?
old='line1
line2
line3'
new='newtext1
newtext2'
What command can I use?
You could use perl or awk, and change the record separator to something other than newline (so you can match against bigger chunks). For example with awk:
echo -e "one\ntwo\nthree" | awk 'BEGIN{RS="\n\n"} sub(/two\nthree\n/, "foo")'
or with perl (-00 == paragraph buffered mode)
echo -e "one\ntwo\nthree" | perl -00 -pne 's/two\nthree/foo/'
I don't know if there's a possibility to have no record separator at all (with perl, you could read the whole file first, but then again that's not nice with regards to memory usage)
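For the multi-line old/new text from the question, a slurp-mode sketch with perl (it reads the whole file into memory, as cautioned above; the file name file is an assumption):
perl -0777 -pe 's/line1\nline2\nline3/newtext1\nnewtext2/' file > file.new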
awk can do that for you.
awk 'BEGIN { RS="" }
FILENAME==ARGV[1] { s=$0 }
FILENAME==ARGV[2] { r=$0 }
FILENAME==ARGV[3] { sub(s,r) ; print }
' FILE_WITH_CONTENTS_OF_OLD FILE_WITH_CONTENTS_OF_NEW ORIGINALFILE > NEWFILE
But you can also do it with vim as described here (scriptable solution).
Also see this and this in the sed faq.
I've got files repeatedly containing the string \n\n} and I need to replace such string with \n} (removing one of the two newlines).
Since such files are dynamically generated through a bash script, I need to embed replacing code inside the script.
I tried with the following commands, but it doesn't work:
cat file.tex | sed -e 's/\n\n}/\n}/g' # it doesn't work!
cat file.tex | perl -p00e 's/\n\n}/\n}/g' # it doesn't work!
cat file.tex | awk -v RS="" '{gsub (/\n\n}/, "\n}")}1' # it does work, but not for large files
You didn't provide any sample input and expected output so it's a guess but maybe this is what you're looking for:
$ cat file
a
b
c

}
d
$ awk '/^$/{f=1;next} f{if(!/^}/)print "";f=0} 1' file
a
b
c
}
d
a way with sed:
sed -i -n ':a;N;$!ba;s/\n\n}/\n}/g;p' file.tex
details:
:a # defines the label "a"
N # append the next line to the pattern space
$!ba # if it is not the last line, go to label a
s/\n\n}/\n}/g # replace all \n\n} with \n}
p # print
The -i option changes the file in place.
The -n option prevents sed from automatically printing the lines (the final p prints the result instead).
This Perl command will do as you ask
perl -i -0777 -pe's/\n(?=\n})//g' file.tex
This should work:
cat file.tex | sed -e 's/\\n\\n}/\\n}/g'
if \n\n} is written as raw string.
Or if it's new line:
cat file.tex | sed -e ':a;N;$!ba;s/\n\n}/\n}/g'
Another method:
if the first \n is any new line:
text=$(< file.tex)
text=${text//$'\n\n}'/$'\n}'}
printf "%s\n" "$text" #> file
If the first \n is an empty line:
text=$(< file.tex)
text=${text//$'\n\n\n}'/$'\n\n}'}
printf "%s\n" "$text" #> file
Nix-style line filters process the file line-by-line. Thus, you have to do something extra to process an expression which spans lines.
As mentioned by others, '\n\n' is simply an empty line and matches the regular expression /^$/. Perhaps the most efficient thing to do is to save each empty line until you know whether or not the next one will contain a close bracket at the beginning of the line.
cat file.tex | perl -ne 'if ( $b ) { print $b unless m/^\}/; undef $b; } if ( m/^$/ ) { $b=$_; } else { print; } END { print $b if $b; }'
And to clean it all up we add an END block, to process the case that the last line in the file is blank (and we want to keep it).
If you have access to node you can use rexreplace
npm install -g rexreplace
and then run
rexreplace '\n\n\}' '\n\}' myfile.txt
Or if you have more files in a dir data you can do
rexreplace '\n\n\}' '\n\}' data/*.txt
How do I display data from the beginning of a file until the first occurrence of a regular expression?
For example, if I have a file that contains:
One
Two
Three
Bravo
Four
Five
I want to start displaying the contents of the file starting at line 1 and stopping when I find the string "B*". So the output should look like this:
One
Two
Three
perl -pe 'last if /^B/' source.txt
An explanation: the -p switch adds a loop around the code, turning it into this:
while ( <> ) {
    last if /^B.*/;   # The bit we provide
    print;
}
The last keyword exits the surrounding loop immediately if the condition holds - in this case, /^B/, which indicates that the line begins with a B.
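Run against the sample file (called source.txt as in the answer), it stops before printing the Bravo line:
$ perl -pe 'last if /^B/' source.txt
One
Two
Three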
If it's from the start of the file:
awk '/^B/{exit}1' file
If you want to start from a specific line number:
awk '/^B/{exit}NR>=10' file # start from line 10
sed -n '1,/^B/p'
Print from line 1 to /^B/ (inclusive). -n suppresses default echo.
Update: Oops... didn't want "Bravo", so the reverse action is needed instead ;-)
sed -n '/^B/,$!p'
sed '/^B/,$d'
Read that as follows: delete (d) all lines, beginning with the first line that starts with a "B" (/^B/), up to and including the last line ($).
Some of the sed commands given by others will continue to unnecessarily process the input after the regex is found, which could be quite slow for large input. This one quits as soon as the regex is found:
sed -n '/^Bravo/q;p'
in Perl:
perl -nle '/B.*/ && last; print; ' source.txt
Just sharing some answers I've received:
Print data starting at the first line, and continue until we find a match to the regex, then stop:
<command> | perl -n -e 'print "$_" if 1 ... /<regex>/;'
Print data starting at the first line, and continue until we find a match to the regex, BUT don't display the line that matches the regular expression:
<command> | perl -pe '/<regex>/ && exit;'
Doing it in sed:
<command> | sed -n '1,/<regex>/p'
Your problem is a variation on an answer in perlfaq6: How can I pull out lines between two patterns that are themselves on different lines?
You can use Perl's somewhat exotic .. operator (documented in perlop):
perl -ne 'print if /START/ .. /END/' file1 file2 ...
If you wanted text and not lines, you would use
perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...
But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.
Here's another example of using ..:
while (<>) {
    $in_header = 1  .. /^$/;
    $in_body   = /^$/ .. eof;
    # now choose between them
} continue {
    $. = 0 if eof;  # fix $.
}
Here is a perl one-liner:
perl -pe 'last if /B/' file
If Perl is a possibility, you could do something like this:
% perl -0ne 'if (/B.*/) { print $`; last }' INPUT_FILE
One-liner with basic shell commands:
head -`grep -n B file|head -1|cut -f1 -d":"` file
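The same idea with $(...) instead of backticks; note that it prints up to and including the first line that matches B:
head -"$(grep -n B file | head -1 | cut -f1 -d':')" file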