Search and replace the nth occurrence in a file - regex

Does anyone know a Unix command or Perl script that would replace one specific occurrence of a pattern?
My file is hello.txt:
number 555
number 555
number 555
I want to replace only the second occurrence with number 666.
I have been trying this command:
perl -n -i -e "s/number\\s+555/number 666/g" hello.txt
but it changes all the occurrences.
One-liners would be really helpful.

$. holds the current line number of the last-read file handle. When each occurrence sits on its own line, as in the sample, the second occurrence is simply line 2:
perl -i -pe 's/number\s+555/number 666/ if $. == 2' hello.txt
or, if the "number" part can be dropped from the pattern:
perl -i -pe 's/555/666/ if $. == 2' hello.txt

I read the contents of hello.txt into an array and joined it to get $str. With $i initialized to 0, this one-liner replaces only the 2nd occurrence:
$str =~ s/(number\s+555)/ ++$i==2 ? "number 666": $1/gse;

Regarding "Does anyone know any unix commands": I believe awk is suitable for this task (http://www.grymoire.com/Unix/Awk.html). This replaces 555 on line 2 (NR == 2):
awk '{if (NR == 2) {gsub("555", "666", $0);} print $0; } ' hello.txt
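The answers above key on the line number, which matches the second occurrence only because the sample has one occurrence per line. A minimal sketch that counts occurrences instead, assuming a non-empty literal search string (not a regex); the helper name replace_nth and its argument order are made up for illustration:

```shell
# Hypothetical helper: replace only the Nth occurrence of a literal string,
# counting occurrences across the whole file rather than line numbers.
replace_nth() {
  # $1 = occurrence number, $2 = literal text to find, $3 = replacement, $4 = file
  awk -v n="$1" -v pat="$2" -v rep="$3" '
    {
      out = ""; rest = $0
      # Walk through every occurrence on this line, left to right.
      while ((i = index(rest, pat)) > 0) {
        count++
        out = out substr(rest, 1, i - 1) (count == n ? rep : pat)
        rest = substr(rest, i + length(pat))
      }
      print out rest
    }' "$4"
}

# Demo on a throwaway copy of the sample data.
sample=$(mktemp)
printf 'number 555\nnumber 555\nnumber 555\n' > "$sample"
replace_nth 2 'number 555' 'number 666' "$sample"
rm -f "$sample"
```

This writes to standard output rather than editing in place; redirect to a new file and move it over the original if that is what you need.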

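Try using sed (GNU sed). The :a;N;$!ba loop reads the whole file into the pattern space, and the trailing 2 flag replaces only the second match: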
sed -r -i.bak ':a;N;$!ba;s/(number\s+)555/\1666/2' file
Output:
number 555
number 666
number 555

Related

Count the number of lines in a txt file when newlines are inside the data

I have a txt file with the data below:
Name mobile url message text
test11 1234567890 www.google.com "Data Test New
Date:27/02/2020
Items: 1
Total: 3
Regards
ABC DATa
Ph:091 : 123456789"
test12 1234567891 www.google.com "Data Test New one
Date:17/02/2020
Items: 26
Total: 5
Regards
user test
Ph:091 : 433333333"
As you can see, the last column contains newline characters, so when I use the command below
awk 'END{print NR}' file.txt
it reports 15 lines, but the actual number of records is 3. Please suggest a command that gives the correct count.
Edited part:
The script below, from the answer given, does not work if there is no newline at the end of the input file:
awk -v RS='"[^"]*"' '{gsub(/\n/, " ", RT); ORS=RT} END{print NR "\n"}' test.txt
Also, my file may have 3-4 million records, so converting the file to Unix format would take time and is not my preference.
Please suggest an efficient solution that works in both cases.
head 5.csv | cat -A
The above command gives me this output, so the lines end in CRLF:
Name mobile url message text^M$

Using gnu-awk you can do this using a custom RS:
awk -v RS='"[^"]*"' '{gsub(/(\r?\n){2,}/, "\n"); n+=gsub(/\n/, "&")}
END {print n}' <(sed '$s/$//' file)
15001
Here:
-v RS='"[^"]*"': Uses this regex, which matches a double-quoted string, as the input record separator
gsub(/(\r?\n){2,}/, "\n"): Collapses runs of consecutive line breaks into a single \n
n+=gsub(/\n/, "&"): Replaces each \n with itself and adds the number of replacements to n
END {print n}: Prints n at the end
sed '$s/$//' file: Meant to add a newline to the last line in case it is missing

With Perl, assuming the last line always ends with a newline character:
$ perl -0777 -nE 'say s/"[^"]+"(*SKIP)(*F)|\n//g' ip.txt
3
-0777 slurps the entire input file as a single string, so this isn't suitable if the input file is very large.
The s command returns the number of substitutions made, which is used here to get the count of newlines.
"[^"]+"(*SKIP)(*F) causes newlines within double quotes to be ignored.
Use the command below if you want to count the last line even if it doesn't end with a newline character:
perl -0777 -nE 'say scalar split /"[^"]+"(*SKIP)(*F)|\n/' ip.txt

Same as anubhava but with GNU sed:
<infile sed '/"/ { :a; N; /"$/!ba; s/\n/ /g; }' | wc -l
Output:
3
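For very large files, a line-by-line approach avoids slurping the input. A minimal sketch with plain awk, assuming double quotes are balanced and never escaped inside a field; the helper name count_records is made up for illustration. It tracks whether a quote is still open and counts a record only when a physical line ends outside quotes, so CRLF endings and a missing final newline should not affect the count:

```shell
# Hypothetical helper: count logical records when quoted fields may span lines.
# Assumes double quotes are balanced and never escaped inside a field.
count_records() {
  awk '
    {
      quotes = gsub(/"/, "&")                  # number of double quotes on this line
      if (quotes % 2) in_quotes = !in_quotes   # an odd count toggles the open-quote state
      if (!in_quotes) records++                # the record ends on this physical line
    }
    END { print records + 0 }' "$1"
}

# Demo: a header plus two records whose last field spans several lines.
sample=$(mktemp)
printf 'Name mobile url message text\ntest11 1234567890 www.google.com "Data Test New\nItems: 1\nTotal: 3"\ntest12 1234567891 www.google.com "Data Test New one\nItems: 26"\n' > "$sample"
count_records "$sample"   # prints 3
rm -f "$sample"
```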

Match multiple patterns in same line using sed [duplicate]

Given a file, for example:
potato: 1234
apple: 5678
potato: 5432
grape: 4567
banana: 5432
sushi: 56789
I'd like to grep for all lines that start with potato: but only pipe the numbers that follow potato:. So in the above example, the output would be:
1234
5432
How can I do that?

grep 'potato:' file.txt | sed 's/^.*: //'
grep looks for any line that contains the string potato:, then, for each of these lines, sed replaces (s/// - substitute) any character (.*) from the beginning of the line (^) until the last occurrence of the sequence : (colon followed by space) with the empty string (s/...// - substitute the first part with the second part, which is empty).
or
grep 'potato:' file.txt | cut -d' ' -f2
For each line that contains potato:, cut splits the line into fields delimited by a space (-d' ' sets the delimiter to a space character; an escaped space, -d\ followed by a space, would also work) and prints the second field of each such line (-f2).
or
grep 'potato:' file.txt | awk '{print $2}'
For each line that contains potato:, awk will print the second field (print $2) which is delimited by default by spaces.
or
grep 'potato:' file.txt | perl -e 'for(<>){s/^.*: //;print}'
All lines that contain potato: are sent to an inline (-e) Perl script that takes all lines from stdin, then, for each of these lines, does the same substitution as in the first example above, then prints it.
or
awk '{if(/potato:/) print $2}' < file.txt
The file is sent via stdin (< file.txt sends the contents of the file via stdin to the command on the left) to an awk script that, for each line that contains potato: (if(/potato:/) returns true if the regular expression /potato:/ matches the current line), prints the second field, as described above.
or
perl -e 'for(<>){/potato:/ && s/^.*: // && print}' < file.txt
The file is sent via stdin (< file.txt, see above) to a Perl script that works similarly to the one above, but this time it also makes sure each line contains the string potato: (/potato:/ is a regular expression that matches if the current line contains potato:, and, if it does (&&), then proceeds to apply the regular expression described above and prints the result).

Or use regex assertions: grep -oP '(?<=potato: ).*' file.txt

grep -Po 'potato:\s\K.*' file
-P to use Perl-compatible regular expressions
-o to output only the matched part
\s to match the space after potato:
\K to drop everything matched so far from the reported match
.* to match the rest of the line

sed -n 's/^potato:[[:space:]]*//p' file.txt
One can think of Grep as a restricted Sed, or of Sed as a generalized Grep. In this case, Sed is one good, lightweight tool that does what you want -- though, of course, there exist several other reasonable ways to do it, too.

This will print everything after each match, on that same line only:
perl -lne 'print $1 if /^potato:\s*(.*)/' file.txt
This will do the same, except it will also print all subsequent lines:
perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' file.txt
These command-line options are used:
-n loop around each line of the input file
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code

You can use grep, as the other answers state. But you don't need grep, awk, sed, perl, cut, or any external tool. You can do it with pure bash.
Try this (the semicolons are there so you can put it all on one line):
$ while read -r line;
do
if [[ "${line%%:\ *}" == "potato" ]];
then
echo "${line##*:\ }";
fi;
done< file.txt
## tells bash to delete the longest match of the pattern "*: " from the front of $line, which leaves only the value.
$ while read -r line; do echo "${line##*:\ }"; done< file.txt
1234
5678
5432
4567
5432
56789
Or, if you want the key rather than the value, %% tells bash to delete the longest match of the pattern ": *" from the end of $line.
$ while read -r line; do echo "${line%%:\ *}"; done< file.txt
potato
apple
potato
grape
banana
sushi
The pattern contains ":\ " because the space is escaped with a backslash so that it is taken literally.
You can find more like these at the Linux Documentation Project.
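To make the difference between the shortest-match and longest-match operators concrete, here is a small sketch on a made-up string that contains the ": " separator twice. The variable names are only for illustration, and the separator is quoted inside the expansion instead of backslash-escaped, which should behave the same way:

```shell
s='fruit: potato: 1234'

shortest_front=${s#*": "}    # remove the shortest leading match of "*: "
longest_front=${s##*": "}    # remove the longest leading match of "*: "
shortest_back=${s%": "*}     # remove the shortest trailing match of ": *"
longest_back=${s%%": "*}     # remove the longest trailing match of ": *"

printf '%s\n' "$shortest_front" "$longest_front" "$shortest_back" "$longest_back"
# prints, one per line:
#   potato: 1234
#   1234
#   fruit: potato
#   fruit
```

With only one separator per line, as in the potato file, # and ## (and % and %%) give the same result.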

Modern Bash has built-in support for regular expressions:
while read -r line; do
if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
echo "${BASH_REMATCH[1]}"
fi
done < file.txt

grep potato file | grep -o "[0-9].*"

Sed Match Number followed by string and return Number

Hi, I have a file containing the following:
7 Y-N2
8 Y-H
9 Y-O2
I want to match it with the following sed command and get the number at the beginning of the line:
abc=$(sed -n -E "s/([0-9]*)(^[a-zA-Z])($j)/\1/g" file)
$j is a variable and contains exactly Y-O2 or Y-H.
The number is not the line number.
The number is always followed by a letter.
There is whitespace before the number.
Echoing $abc prints an empty line.
Thanks

There are several problems here:
there are spaces in the input, and the pattern doesn't account for them
in (^[a-zA-Z]) the ^ is an anchor; it only negates a character class when placed inside the brackets, as in [^a-zA-Z], and no such class is needed here because $j already starts with the letter
you're using the -n option, so you must use the p command or nothing will ever be printed (and the g flag is useless here)
Working command (I have changed -E to -r because -E was unsupported by my sed version; both should work):
sed -nr "s/^ *([0-9]+) +($j) *\$/\1/p" file
Note: awk seems better suited for the job. Example:
awk -v j="$j" '$2 == j { print $1 }' file

Sed seems to be overly complex for this task, but with awk you can write:
awk -vk="$var" '$2==k{print $1}' file
With -vk="$var" we set the awk variable k to the value of the $var shell variable.
Then, we use the 'filter{command}' syntax, where the filter $2==k is that the second field is equal to the variable k. If we have a match, we print the first field with {print $1}.
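Wrapped up as a sketch that assigns the result to abc, as in the question. The function name number_for is made up for illustration, and the sample file is recreated with leading spaces as the question describes:

```shell
# Hypothetical helper: print the first column of rows whose second column
# equals the given label exactly (string comparison, no regex involved).
number_for() {
  # $1 = exact label to look for (second column), $2 = file
  awk -v k="$1" '$2 == k { print $1 }' "$2"
}

sample=$(mktemp)
printf '   7 Y-N2\n   8 Y-H\n   9 Y-O2\n' > "$sample"
j='Y-O2'
abc=$(number_for "$j" "$sample")
echo "$abc"   # prints 9
rm -f "$sample"
```

Because awk splits on runs of whitespace by default, the leading spaces need no special handling, and the exact comparison means a label such as Y-O would not match Y-O2.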

Try this:
abc=$(sed -n "s/^ *\([0-9]*\) *Y-[OH]2*.*/\1/p" file)
Explanations:
^ *: matches lines starting with any number of spaces
\([0-9]*\): the number that follows is captured in a group
 *: followed by any number of spaces
Y-[OH]2*: matches Y- followed by O or H and an optional 2
\1/p: the captured group \1 replaces the line, which is then output by the p command

Troubles with regular expressions

I would like some help with extended regular expressions.
I have been trying to figure this out, but in vain.
I have a file conflicts.txt which looks like the sample below. Note that this is only part of the file; there are many more lines like these:
Server/core/wildSetting.json
Server/core
Client/arcade/src/assets
Client/arcade/src/assets/
Client/arcade/src/assets
Client/arcade/src/Game/
I am writing a shell script which goes through this file line by line:
if [ -s "$CONFLICTS" ] ; then
count=0
while read LINE
do
let count++
echo -e "\n $LINE \n"
done < $CONFLICTS
fi
The above prints the file line by line. What I am trying to do now is redirect the lines that contain a certain text into another file. For that, I have modified the echo line of the code to:
echo -e "\n $LINE \n" | grep -E "Server/game" > newfile.txt
My query:
As we can see, there are many lines of the form Server/core...
I want to write a regular expression and use it in grep, which matches two kinds of lines:
1) lines containing ONLY the string "Server/core", preceded and followed by any number of spaces
2) all the lines containing the string "assets"
I have written a regular expression for this, but it doesn't work.
Here is my regex:
grep -E '[^' '*Server/core$] | [assets]'
Can you please tell me the right way of doing it?
Please note that there can be any number of spaces before and after "Server/core", as this file is the result of parsing a previous file.
Thanks!

Based on what's asked in the comments:
1) the lines containing the string "assets"
$ grep "assets" file
Client/arcade/src/assets
Client/arcade/src/assets/
Client/arcade/src/assets
2) lines that contain only the string "Server/core", preceded and followed by any amount of space
$ grep "^[ ]*Server/core[ ]*$" file
Server/core

sed (Stream EDitor) can also solve your problem.
Try this command (GNU sed): sed -n '/^ *Server\/core *$\|assets/p' conflicts.txt
There is something wrong with your grep -E '[^' '*Server/core$] | [assets]'.
Square brackets define a set of single characters rather than a string, and a ^ at the start of the brackets negates the set, so it matches any one character that is not listed.
If you want to modify the file in place so that only the matching lines remain, add the -i option as a separate flag (written as -in, the n would be taken as a backup suffix):
sed -n -i '/^ *Server\/core *$\|assets/p' conflicts.txt

Your regex just needs to be this:
assets|^\s*Server/core\s*$
I think sed or awk would be a better tool than grep, though you would need to escape the forward slash if you used one of these.
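For completeness, a sketch of that alternation used directly with grep -E, written with the POSIX class [[:space:]] in place of \s so it does not depend on GNU extensions. The function name filter_conflicts is made up for illustration, and the sample file is recreated from the question with spaces added around Server/core:

```shell
# Hypothetical helper: keep lines that contain "assets" anywhere, or that
# consist only of "Server/core" surrounded by optional whitespace.
filter_conflicts() {
  grep -E 'assets|^[[:space:]]*Server/core[[:space:]]*$' "$1"
}

sample=$(mktemp)
printf 'Server/core/wildSetting.json\n  Server/core  \nClient/arcade/src/assets\nClient/arcade/src/assets/\nClient/arcade/src/assets\nClient/arcade/src/Game/\n' > "$sample"
filter_conflicts "$sample" > newfile.txt   # four lines match
cat newfile.txt
rm -f "$sample" newfile.txt
```

Running grep once on the whole file also replaces the per-line echo | grep inside the while loop, which would otherwise overwrite newfile.txt on every iteration.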

How do I display data from the beginning of a file until the first occurrence of a regular expression?

How do I display data from the beginning of a file until the first occurrence of a regular expression?
For example, if I have a file that contains:
One
Two
Three
Bravo
Four
Five
I want to start displaying the contents of the file starting at line 1 and stopping when I find the string "B*". So the output should look like this:
One
Two
Three

perl -pe 'last if /^B/' source.txt
An explanation: the -p switch adds a loop around the code, turning it into this:
while ( <> ) {
last if /^B/; # The bit we provide
print;
}
The last keyword exits the surrounding loop immediately if the condition holds - in this case, /^B/, which indicates that the line begins with a B.

If you want output from the start of the file:
awk '/^B/{exit}1' file
If you want to start from a specific line number:
awk '/^B/{exit}NR>=10' file # start from line 10

sed -n '1,/^B/p'
Print from line 1 to the first line matching /^B/ (inclusive). -n suppresses the default output.
Update: Oops, "Bravo" should not be printed, so the reverse action is needed:
sed -n '/^B/,$!p'

sed '/^B/,$d'
Read that as follows: delete (d) all lines from the first line that starts with a "B" (/^B/) through the last line ($).

Some of the sed commands given by others will continue to unnecessarily process the input after the regex is found which could be quite slow for large input. This quits when the regex is found:
sed -n '/^Bravo/q;p'
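The same early-exit idea as a small reusable sketch. The function name until_match is made up for illustration, and it assumes the regex contains no / character, since it is pasted between the sed delimiters:

```shell
# Hypothetical helper: print lines up to, but not including, the first line
# that matches a basic regular expression, then stop reading the input.
until_match() {
  # $1 = basic regular expression (must not contain "/"), $2 = file
  sed -n -e "/$1/q" -e p "$2"
}

sample=$(mktemp)
printf 'One\nTwo\nThree\nBravo\nFour\nFive\n' > "$sample"
until_match '^B' "$sample"   # prints One, Two, Three
rm -f "$sample"
```

With -n in effect, q exits without printing the matching line, and the p command is never reached for it. If nothing matches, the whole file is printed.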

in Perl:
perl -nle '/B.*/ && last; print; ' source.txt

Just sharing some answers I've received:
Print data starting at the first line, and continue until we find a match to the regex, then stop:
<command> | perl -n -e 'print "$_" if 1 ... /<regex>/;'
Print data starting at the first line, and continue until we find a match to the regex, BUT don't display the line that matches the regular expression:
<command> | perl -pe '/<regex>/ && exit;'
Doing it in sed:
<command> | sed -n '1,/<regex>/p'

Your problem is a variation on an answer in perlfaq6: How can I pull out lines between two patterns that are themselves on different lines?.
You can use Perl's somewhat exotic .. operator (documented in perlop):
perl -ne 'print if /START/ .. /END/' file1 file2 ...
If you wanted text and not lines, you would use
perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...
But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.
Here's another example of using ..:
while (<>) {
$in_header = 1 .. /^$/;
$in_body = /^$/ .. eof;
# now choose between them
} continue {
$. = 0 if eof; # fix $.
}

Here is a perl one-liner:
perl -pe 'last if /B/' file

If Perl is a possibility, you could do something like this:
% perl -0ne 'if (/B.*/) { print $`; last }' INPUT_FILE

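A one-liner with basic shell commands, assuming the file contains at least one match (grep -n gives the line number of the first match, and one line fewer than that is printed so the matching line itself is left out):
head -n "$(( $(grep -n B file | head -n 1 | cut -d: -f1) - 1 ))" file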
