regex - match a word and replace content before it

I have several files with this line:
<2-10 digits> ; word
I want to replace all of the digits that come before that word with something else. How can I do that?

sed -i -e 's/.*word/something;word/g' <filename>
To loop over multiple files in a directory (assuming .txt as the file extension):
for i in *.txt
do
sed -i -e 's/.*word/something;word/g' "$i"
done
Note: sed -i modifies the file in place. So test the command without the -i option first to check that it does what you want, and only then run it with -i.

UPDATE: sed example:
s="999 abc 1234 ; word 567"
echo "$s" | sed 's/^\(.* \)[0-9][0-9]*\( ; word.*\)$/\1something\2/g'
OUTPUT:
999 abc something ; word 567
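
The same idea can be narrowed so that only a run of 2 to 10 digits directly before "; word" is replaced, and applied to several files without relying on -i. This is only a sketch: it assumes a sed that accepts -E (current GNU and BSD versions do), and the scratch directory, file names and the replacement text "something" are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' '999 abc 1234 ; word 567' 'no match here' > "$dir/a.txt"
printf '%s\n' '42 ; word' > "$dir/b.txt"

for f in "$dir"/*.txt; do
    # Write to a temporary file, then move it over the original,
    # so the example does not depend on the -i option.
    sed -E 's/[0-9]{2,10}( *; *word)/something\1/' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

result_a=$(cat "$dir/a.txt")
result_b=$(cat "$dir/b.txt")
printf '%s\n' "$result_a" "$result_b"
rm -rf "$dir"
```

The leading "999" is left alone because it is not followed by "; word". Note that a run longer than ten digits would still have its last ten digits replaced, so add an anchor or a leading delimiter if that matters.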

Related

Deleting everything between two string matches in a file

I got this text in file.txt:
Osmun.Prez#mail.com:c7lB2m6b#3.a.a:tt_webid_v2=6990226111024612869; tt_webid=6990226111024612869; tt_csrf_token=VD5Nb_TQFH4RKhoJeSe2nzLB; R6kq3TV7=AHkh4PB6AQAA3LIS90nWf2ss0Q7ZTCQjUat4axctvhQY68DdUEz92RwpmVSX|1|0|e9d6917c2fe555827dcf5ee916ba9778079ab2a9; ttwid=1%7CAFodeNF0iZM2fyy-ZeiZ6HTpZoG_MSx6SmXHgGVQ-V4%7C1627538859%7C59ca1e4a56f9f537b55e655a6dabff88e44eb48502b164ed6b4199f5a5263cb0; passport_csrf_token_default=6f7653c3ce946a6ce5444723fb0c509b; passport_csrf_token=6f7653c3ce946a6ce5444723fb0c509b; sid_guard=0483b7d37f4e4bd20ab3046e29724798%7C1627538893%7C5184000%7CMon%2C+27-Sep-2021+06%3A08%3A13+GMT; uid_tt=27b52febe6222486b9f6b6a90ef4ffeace5ea25c09d29a1583be5a1ecf760996; uid_tt_ss=27b52febe6222486b9f6b6a90ef4ffeace5ea25c09d29a1583be5a1ecf760996; sid_tt=0483b7d37f4e4bd20ab3046e29724798; sessionid=0483b7d37f4e4bd20ab3046e29724798; sessionid_ss=0483b7d37f4e4bd20ab3046e29724798; store-idc=maliva; store-country-code=us; odin_tt=294845c8f7711db177f7c549a9f44edb1555031b27a2a485df809cd92c4e544ac0772bf462df5b7a100f6e488c45303cd62df3b6b950f0842520cd887850137b035d990f29cc8b752765e594560c977f; cmpl_token=AgQQAPNSF-RMpbE89z5HYF0_-2PcrxjXf4fZYP5_ZA
How can I delete everything from :tt_ through _ZA (the first and only instance) in file.txt, keeping only Osmun.Prez#mail.com:c7lB2m6b#3.a.a, using bash on Linux?
Thank you

Something like:
sed -i "s/:tt_.*//" file.txt
if you want to edit the file in place. If not, remove the -i switch.
The sed command means: in each line of file.txt, replace (s) the pattern :tt_ and every character after it (.*) with an empty string (//).
Or the command:
sed -i "s/:tt_.*_ZA//" file.txt
which follows your description more closely, but produces the same output for this input.

Use pattern substitution:
i=$(cat file.txt)
echo "${i/:tt*_ZA}"
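
The substitution above loads the whole file into one variable. If the file has several such lines, the same idea can be applied line by line with the %% expansion, which strips the longest suffix matching a pattern. A rough sketch, with invented and heavily shortened sample lines standing in for the real data:

```shell
#!/bin/sh
file=$(mktemp)
# Made-up, abbreviated sample lines (not the original cookie string).
printf '%s\n' \
  'Osmun.Prez#mail.com:c7lB2m6b#3.a.a:tt_webid_v2=699; cmpl_token=AgQ_ZA' \
  'some.one#home.com:B52_m6b#9_az:tt_webid=1; x=y_ZA' > "$file"

result=$(
  while IFS= read -r line; do
      # ${line%%:tt_*} removes the longest suffix that starts with ":tt_".
      printf '%s\n' "${line%%:tt_*}"
  done < "$file"
)
printf '%s\n' "$result"
rm -f "$file"
```

Unlike the _ZA version, this drops everything from :tt_ to the end of the line, which gives the same result here because _ZA is the last thing on each line.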

Assuming the general requirement is to remove everything after the 2nd : ...
Sample data:
$ cat file.txt
Osmun.Prez#mail.com:c7lB2m6b#3.a.a:tt_webid_v ... to end of line
some.one#home.com:B52_m6b#9_az.more.stuff:delete from here ... to end of line
One sed idea:
$ sed -En 's/^([^:]*:[^:]*).*$/\1/p' file.txt
Osmun.Prez#mail.com:c7lB2m6b#3.a.a
some.one#home.com:B52_m6b#9_az.more.stuff

Using awk
awk 'BEGIN{FS=OFS=":"}{print $1,$2}' file.txt
With : as the field delimiter, it is easy to extract the columns before :tt_
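
A quick way to check the field-splitting approach is to pipe in the two sample lines from the previous answer instead of reading a file. The sample text is illustrative only.

```shell
#!/bin/sh
result=$(
  printf '%s\n' \
    'Osmun.Prez#mail.com:c7lB2m6b#3.a.a:tt_webid_v ... to end of line' \
    'some.one#home.com:B52_m6b#9_az.more.stuff:delete from here ... to end of line' |
  awk 'BEGIN { FS = OFS = ":" } { print $1, $2 }'   # keep fields 1 and 2, rejoined with ":"
)
printf '%s\n' "$result"
```

One thing to keep in mind: a line with no colon at all would come out with a trailing ":" because $2 is empty.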

This deletes all chars from ":tt_" to the last "_ZA", inclusive, in file.txt
Mac_3.2.57$cat file.txt | sed 's/\(\)[:]tt.*_ZA\(.*\)/\1\2/'
Osmun.Prez#mail.com:c7lB2m6b#3.a.a
Mac_3.2.57$

Or, if it is always the first 2 values that are separated by a colon (as per your example):
cut -d':' -f1,2 file.txt

How do I search files containing a specific string using grep?

For instance, if a file has a line "blahblah myID=1234567 blahblah", I want to search all files containing 1234567 somewhere in the whole file.
I tried grep -r '.* 1234567.* ' directory, but it didn't work.

Do the following:
grep -rw 'directory' -e "pattern"
-r searches recursively and -w matches whole words only.
Example:
grep -rw '/home/lib/foldername/' -e "1234567"
You can also add -n, which prints the line number of each match.
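
If only the names of the matching files are wanted, -l can be combined with -r. Below is a small sketch on a throwaway directory tree; the file names and contents are invented for the demonstration, and -F is added so the search text is taken literally rather than as a regex.

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'blahblah myID=1234567 blahblah\n' > "$dir/sub/hit.txt"
printf 'nothing to see here\n' > "$dir/miss.txt"

# -r: recurse, -l: print only file names, -F: fixed string, not a regex
matches=$(grep -rlF '1234567' "$dir")
printf '%s\n' "$matches"
rm -rf "$dir"
```

Only the path of hit.txt is printed; miss.txt is skipped because it does not contain the string.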

for f in /dir/*
do
# only regular files; append matching lines to output.txt
[ -f "$f" ] && grep "1234567" "$f" >> output.txt
done
This greps every regular file in /dir for "1234567" and appends the matching lines to output.txt.

How to replace space with comma using sed?

I would like to replace the empty space between each and every field with a comma delimiter. Could someone let me know how I can do this? I tried the command below, but it doesn't work. Thanks.
My command:
:%s//,/
53 51097 310780 1
56 260 1925 1
68 51282 278770 1
77 46903 281485 1
82 475 2600 1
84 433 3395 1
96 212 1545 1
163 373819 1006375 1
204 36917 117195 1

If you are talking about sed, this works:
sed -e "s/ /,/g" < a.txt
In vim, use the same regex to replace (prefix it with :% to apply it to every line):
s/ /,/g

Inside vim, you want to type when in normal (command) mode:
:%s/ /,/g
On the terminal prompt, you can use sed to perform this on a file:
sed -i 's/\ /,/g' input_file
Note: the -i option to sed means "in-place edit", as in that it will modify the input file.

I know it's not exactly what you're asking, but, for replacing a comma with a newline, this works great:
tr , '\n' < file

Try the following command and it should work out for you.
sed "s/\s/,/g" orignalFive.csv > editedFinal.csv

If your data includes an arbitrary sequence of blank characters (tab, space), and you want to replace each sequence with one comma, use the following (the + quantifier needs extended regex mode, hence -r):
sed -r 's/[\t ]+/,/g' input_file
or
sed -r 's/[[:blank:]]+/,/g' input_file
If you want to replace sequences of whitespace characters, which also include characters such as carriage return and form feed, then use the following:
sed -r 's/[[:space:]]+/,/g' input_file
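
To see the difference between replacing each blank and replacing each run of blanks, here is a small comparison on one made-up line containing a double space and a tab. It assumes a sed that accepts -E (current GNU and BSD versions do; -r is the older GNU spelling).

```shell
#!/bin/sh
# Sample line: two spaces between the first fields, a tab before the last.
line=$(printf '53  51097 310780\t1')

each=$(printf '%s\n' "$line" | sed 's/[[:blank:]]/,/g')        # one comma per blank
runs=$(printf '%s\n' "$line" | sed -E 's/[[:blank:]]+/,/g')    # one comma per run

printf '%s\n' "$each" "$runs"
```

The first form prints 53,,51097,310780,1 (an empty field appears where the double space was), the second prints 53,51097,310780,1.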

If you want the output on the terminal, then:
$ sed 's/ /,/g' filename.txt
But if you want to edit the file itself, i.e. replace the spaces with commas in the file, then:
$ sed -i 's/ /,/g' filename.txt

I just confirmed that:
cat file.txt | sed "s/\s/,/g"
successfully replaces spaces with commas in Cygwin terminals (mintty 2.9.0). None of the other samples worked for me.

On Linux, use the following to test (it replaces each whitespace character with a comma):
sed 's/\s/,/g' /tmp/test.txt | head
Later you can write the output to a file using the command below:
sed 's/\s/,/g' /tmp/test.txt > /tmp/test_final.txt
PS: /tmp/test.txt is the file you want to process.

Perl, sed, or awk one-liner to change the format of the file

I need advice on how to change the file formatted following way
file1:
A 504688
B jobnameA
A 504690
B jobnameB
A 504691
B jobnameC
...
into file2:
A B
504688 jobnameA
504690 jobnameB
504691 jobnameC
...
One solution I could think of is:
cat file1 | perl -0777 -p -e 's/\s+B/\t/' | awk '{print $2"\t"$3}'
But I am wondering if there is more efficient way or already known practice that does this job.

perl -nawe 'print "@F[1 .. $#F]", $F[0] eq "A" ? "\t" : "\n"' < /tmp/ab
Look up the options in perlrun.
Another useful one to add is -l (append newline to print), but not in this case.

Assuming your input file is tab separated:
echo $'A\tB'
cut -f2 filename | paste - -
Should be pretty quick because this is exactly what cut and paste were written to do.
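
If the input is separated by single spaces, as it appears in the question, the same pipeline should work once cut is told the delimiter. A minimal sketch with a few of the sample lines written to a temporary file:

```shell
#!/bin/sh
file=$(mktemp)
printf '%s\n' 'A 504688' 'B jobnameA' 'A 504690' 'B jobnameB' > "$file"

result=$(
  printf 'A\tB\n'                           # header row
  cut -d ' ' -f 2 "$file" | paste - -       # second column, two input lines per output row
)
printf '%s\n' "$result"
rm -f "$file"
```

paste - - reads two lines at a time from standard input and joins them with a tab, which is what pairs each number with its job name.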

awk '/^A/{num=$2}/^B/{print num,$2}' file
Or, alternately,
awk '{num=$2;getline;print num,$2}' file
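
For reference, here is a slightly more spelled-out variant of the first command that also prints the header and uses a tab as the output separator. It is only a sketch; the header text is hard-coded and the input is piped in from the sample data.

```shell
#!/bin/sh
result=$(
  printf '%s\n' 'A 504688' 'B jobnameA' 'A 504690' 'B jobnameB' |
  awk -v OFS='\t' '
    BEGIN   { print "A", "B" }      # header row
    $1=="A" { num = $2 }            # remember the value from the A line
    $1=="B" { print num, $2 }       # emit the pair when the B line arrives
  '
)
printf '%s\n' "$result"
```

Keying on the first field rather than using getline means a stray or missing line shifts at most one pair instead of every pair after it.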

Here is an sed solution:
sed -e 'N' -e 's/A\s*\(.*\)\nB\s*\(.*\)/\1\t\2/' file
This version will also print the header at the top:
sed '1{h;s/.*/A\tB/p;g};N;s/A\s*\(.*\)\nB\s*\(.*\)/\1\t\2/' file
Or an alternative:
sed -n '/^A\s*/{s///;h};/^B\s*/{s///;H;g;s/\n/\t/p}' file
If your sed does not support semicolons as a command separator for the alternative:
sed -n '
/^A\s*/{ # if the line starts with "A"
s/// # remove the "A" and the whitespace
h # copy the remainder into the hold space
} # end if
/^B\s*/{ # if the line starts with "B"
s/// # remove the "B" and the whitespace
H # append pattern space to hold space
g # copy hold space to pattern space
s/\n/\t/p # replace newline with tab and print
}' file
This version will also print the header at the top:
sed -n '/^A\s*/{s///;h;1s/.*/A\tB/p};/^B\s*/{s///;H;g;s/\n/\t/p}' file

This works with any header text, not just a fixed A and B:
awk '{a=$1;b=$2;getline;if(c!=1){print a,$1;c=1};print b,$2}' file1 >file2
It also prints the header row.
If you need \t separator, then use:
awk '{a=$1;b=$2;getline;if(c!=1){print a"\t"$1;c=1};print b"\t"$2}' file1 >file2

This might work for you:
sed -e '1i\A\tB' -e 'N;s/A\s*\(\S*\).*\nB\s*\(\S*\).*/\1\t\2/' file

Filter apache log file using regular expression

I have a big apache log file and I need to filter that and leave only (in a new file) the log from a certain IP: 192.168.1.102
I try using this command:
sed -e "/^192.168.1.102/d" < input.txt > output.txt
But the d command removes those entries, and I need to keep them.
Thanks.

What about using grep?
cat input.txt | grep -e "^192.168.1.102" > output.txt
EDIT: As noted in the comments below, escaping the dots in the regex is necessary to make it correct. Escaping in the regex is done with backslashes:
cat input.txt | grep -e "^192\.168\.1\.102" > output.txt
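
To illustrate why the escaping matters, here is a small test with three made-up log lines, two of which only look similar to the wanted address. The trailing space in the strict pattern is an extra assumption on my part (that the address is followed by a space, as in the common Apache log format) so that longer addresses sharing the same prefix are not matched.

```shell
#!/bin/sh
log=$(mktemp)
# Invented sample lines; only the first one is really from 192.168.1.102.
printf '%s\n' \
  '192.168.1.102 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326' \
  '192.168.11102 - - [10/Oct/2000:13:55:37 -0700] "GET /a HTTP/1.0" 200 10' \
  '192.168.1.1021 - - [10/Oct/2000:13:55:38 -0700] "GET /b HTTP/1.0" 200 10' > "$log"

loose=$(grep -c '^192.168.1.102' "$log")        # unescaped dots, no delimiter after the address
strict=$(grep -c '^192\.168\.1\.102 ' "$log")   # literal dots plus the space after the address

printf 'loose=%s strict=%s\n' "$loose" "$strict"
rm -f "$log"
```

This prints loose=3 strict=1: the unescaped pattern lets "." stand for any character and also accepts anything after the final 102.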

sed -n 's/^192\.168\.1\.102/&/p'
sed is faster than grep on my machines

I think using grep is the best solution but if you want to use sed you can do it like this:
sed -e '/^192\.168\.1\.102/b' -e 'd'
The b command will skip all following commands if the regex matches and the d command will thus delete the lines for which the regex did not match.
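
The more common way to write this in sed is -n together with the p command, which prints only the matching lines. A short sketch comparing the two forms on invented input:

```shell
#!/bin/sh
# Made-up log lines for the comparison.
lines=$(printf '%s\n' '192.168.1.102 - a' '10.0.0.1 - b' '192.168.1.102 - c')

with_p=$(printf '%s\n' "$lines" | sed -n '/^192\.168\.1\.102 /p')            # print matches only
with_bd=$(printf '%s\n' "$lines" | sed -e '/^192\.168\.1\.102 /b' -e 'd')    # branch past d on a match

printf '%s\n' "$with_p"
[ "$with_p" = "$with_bd" ] && echo "both forms agree"
```

Both forms keep the two lines from 192.168.1.102 and drop the 10.0.0.1 line.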

We Keep Coding
