Search and replace regex in VI, clarification needed

Given the following, I'd like to comment out the lines starting with 1, 2 or 3:
Some text
1 101 12
1 102 13
2 200 2
// Some comments inside
2 202 4
2 201 7
3 300 0
3 301 7
Some other text
The following regex looks right to me, and yet it does not work:
%s/^([123])(.+)/#\1\2/g
The same regex matches when used by egrep
egrep '^([123])(.+)' file_name
Please help me understand why this search and replace is failing in VI

You need to escape the characters ( ) and +. In vi's default ("magic") mode they are literal characters unless preceded by a backslash. So you could do :%s/^\([123]\)\(.\+\)/#\1\2/g, but it seems easier to do: :g/^[123]/s/^/#
Note that vi does have various options for changing the meaning of symbols in patterns (see :help magic). In particular, you could use 'very magic' and do: :%s/\v^([123].+)/#\1/g (note that the g flag is completely redundant here, because a pattern anchored with ^ can match at most once per line!)
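For comparison, here is a minimal Python sketch of the same substitution. Python's re module treats ( ) and + as special without backslashes, much like egrep or the 'very magic' mode above. The function name comment_lines and the sample text are made up for this illustration.

```python
import re

def comment_lines(text):
    # With re.MULTILINE, ^ anchors at the start of every line.
    # \g<0> re-emits the whole match, so no capture groups are needed.
    return re.sub(r'^[123].+', r'#\g<0>', text, flags=re.MULTILINE)

sample = "Some text\n1 101 12\n// Some comments inside\n2 202 4\nSome other text\n"
print(comment_lines(sample), end="")
```

Only the lines beginning with 1, 2 or 3 get a leading #; everything else passes through unchanged.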

In Perl,
my $t = "Some text
1 101 12
1 102 13
2 200 2
2 202 4
2 201 7
3 300 0
3 301 7
Some other text";
foreach (split /^/, $t) {
$_ =~ s/^([1-3])/# $1/;
print $_;
}
Result:
Some text
# 1 101 12
# 1 102 13
# 2 200 2
# 2 202 4
# 2 201 7
# 3 300 0
# 3 301 7
Some other text

Related

grep single digit occurs one time in line

I need help with a grep command: I want to match lines in which a single digit occurs exactly one time.
My solution doesn't work:
egrep "^(\s*[1]\s*)(\s*[^1]+\s*)+$|^(\s*[^1]\s*)(\s*[1]+\s*)+$|^(\s*[2]\s*)(\s*[^2]+\s*)+$|^(\s*[^2]\s*)(\s*[2]+\s*)+$|^(\s*[3]\s*)(\s*[^3]+\s*)+$|^(\s*[^3]\s*)(\s*[3]+\s*)+$|^(\s*[4]\s*)(\s*[^4]+\s*)+$|^(\s*[^4]\s*)(\s*[4]+\s*)+$|^(\s*[5]\s*)(\s*[^5]+\s*)+$|^(\s*[^5]\s*)(\s*[5]+\s*)+$|^(\s*[6]\s*)(\s*[^6]+\s*)+$|^(\s*[^6]\s*)(\s*[6]+\s*)+$|^(\s*[7]\s*)(\s*[^7]+\s*)+$|^(\s*[^7]\s*)(\s*[7]+\s*)+$|^(\s*[8]\s*)(\s*[^8]+\s*)+$|^(\s*[^8]\s*)(\s*[8]+\s*)+$|^(\s*[9]\s*)(\s*[^9]+\s*)+$|^(\s*[^9]\s*)(\s*[9]+\s*)+$"
example
for example in this text
012 210 5
6343 232 5 3423
345 689 7 986 543012 210 5
grep colors only the second line.
I want grep to color every line, because in each line some digit occurs exactly one time. In the first line it is 5, in the second line it is 5, and in the third line it is 7.

A pattern that detects if a digit is unique on a line (if I'm understanding the question correctly):
For the digit 5:
^[^5]*(5)[^5]*$
^ // start of line
[^5]* // any char not 5, 0-or-more
(5) // 5
[^5]* // any char not 5, 0-or-more
$ // end of line
To test all digits, it becomes:
^(?:[^0]*(0)[^0]*|[^1]*(1)[^1]*)$ and so on, with one alternative per digit. The digit is captured by the group belonging to whichever alternative matched.
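As a rough sketch of how the full ten-way pattern can be assembled without typing it out, the following Python builds one alternative per digit and tests whole lines against it. The names UNIQUE_DIGIT and has_unique_digit are invented for this example, and the capture groups are left out because only a yes/no answer is needed here.

```python
import re

# One alternative per digit d: d occurs once, with no other d before or after it.
UNIQUE_DIGIT = re.compile(
    "^(?:" + "|".join(f"[^{d}]*{d}[^{d}]*" for d in "0123456789") + ")$"
)

def has_unique_digit(line):
    # True if at least one digit occurs exactly once in the line.
    return UNIQUE_DIGIT.match(line) is not None

for line in ["012 210 5", "6343 232 5 3423", "11 22"]:
    print(line, "->", has_unique_digit(line))
```

The first two lines qualify because 5 occurs once in each; the last one does not, since both of its digits are repeated.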

I'm really unsure what the expected output should be (PLEASE UPDATE IT PROPERLY TO THE QUESTION), but here using GNU awk. First test data:
$ cat foo
012 210 5
6343 232 5 3423
345 689 7 986 543012 210 5
234 12 43
Then:
$ awk -F '' '{
delete a
for(i=1;i<=NF;i++)
if($i~/[0-9]/)
a[$i]++
for(i in a)
if(a[i]==1 && match($0, "[^" i "]*" i "[^" i "]*")) {
print $0
next # second data line has 2 matches
}
}' foo
012 210 5
6343 232 5 3423
345 689 7 986 543012 210 5
234 12 43
Then again, it's shorter just to:
$ awk '{for(i=0;i<=9;i++)if(gsub(i,i,$0)==1){print;next}}' foo
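The counting idea behind both awk versions (tally each digit, keep the line if any tally equals 1) could be sketched in Python as follows. This is only an illustration; lines_with_unique_digit is a made-up name.

```python
from collections import Counter

def lines_with_unique_digit(lines):
    kept = []
    for line in lines:
        # Tally only the ASCII digits in this line.
        counts = Counter(ch for ch in line if ch in "0123456789")
        # Keep the line if some digit was seen exactly once.
        if 1 in counts.values():
            kept.append(line)
    return kept

print(lines_with_unique_digit(["012 210 5", "11 22", "234 12 43"]))
```

No regex is involved, which sidesteps the long alternation entirely.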

I'm not absolutely sure what you're after, but if it's matching lines that only contain one instance of a digit, try this:
[^0]*0[^0]*|[^1]*1[^1]*|[^2]*2[^2]*|[^3]*3[^3]*|[^4]*4[^4]*|[^5]*5[^5]*|[^6]*6[^6]*|[^7]*7[^7]*|[^8]*8[^8]*|[^9]*9[^9]*
or grepified
grep -x "[^0]*0[^0]*\|[^1]*1[^1]*\|[^2]*2[^2]*\|[^3]*3[^3]*\|[^4]*4[^4]*\|[^5]*5[^5]*\|[^6]*6[^6]*\|[^7]*7[^7]*\|[^8]*8[^8]*\|[^9]*9[^9]*"
(-x makes grep match the full line.)
The regex uses 10 identical alternations, one for each digit. Each alternation:
makes sure zero or more of anything but the digit starts the line,
matches the one allowed digit,
makes sure zero or more of anything but the digit ends the line.
See it here at regex101.

Extracting text file information via command line/script

I'd like to extract only certain information from a block of text. I have had great luck asking the StackOverflow community for expert assistance, especially with tricky topics (RegEx, perl, sed, awk).
The text is output from a tshark command that I would like to manipulate and print out to avoid unnecessary information.
Any help would be appreciated. I am currently learning the ways of the aforementioned topics, but it's slow going!
Any script or command help to achieve the following output will be seriously helpful.
Original:
Host 1 Host 2 Total Relative Duration
Host 1 Host 2 Frames Bytes Frames Bytes Frames Bytes Start
192.168.0.14 <-> 192.168.0.13 3898 4872033 1971 120545 5869 4992578 0.001886000 283.6363
192.168.0.162 <-> 192.168.0.71 2 1992 2 1992 4 3984 176.765198000 77.0542
192.168.0.191 <-> 192.168.0.150 3 2988 0 0 3 2988 199.319020000 59.7055
192.168.0.227 <-> 192.168.0.157 3 2988 0 0 3 2988 197.013283000 76.7197
192.168.0.221 <-> 192.168.0.94 3 2988 0 0 3 2988 196.312847000 59.7065
192.168.0.75 <-> 192.168.0.58 2 1992 1 996 3 2988 191.995706000 59.7121
224.0.0.252 <-> 192.168.0.13 3 207 0 0 3 207 180.521299000 0.0536
192.168.0.191 <-> 192.168.0.50 1 996 2 1992 3 2988 173.452130000 59.6849
192.168.0.41 <-> 192.168.0.13 3 2988 0 0 3 2988 167.180087000 76.6960
192.168.0.206 <-> 192.168.0.153 1 996 1 996 2 1992 270.528070000 4.4070
Desired:
Host 1 Host 2 Total Bytes
x.x.x.x x.x.x.x N
x.x.x.x x.x.x.x N
x.x.x.x x.x.x.x N

Try:
awk '
BEGIN { printf "%-15s %-15s %s\n", "Host 1", "Host 2", "Total Bytes" }
NR>2 { printf "%-15s %-15s %11s\n", $1, $3, $9 }
' file
Adjust the output-field widths as needed.
The BEGIN block is used to print the output header line.
NR > 2 ensures that the input header lines are skipped.
printf is used with field-width specifiers to create column-aligned output.
A - before the width specifier indicates left-aligned output (e.g., %-15s); without it, the value is right-aligned (e.g., %11s).
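The same field selection and width formatting can be sketched in Python, whose % operator accepts the same %-15s and %11s specifiers. The function name summarize is hypothetical, and the input is assumed to be the tshark output already split into lines.

```python
def summarize(lines):
    # Output header, left-aligned like the awk BEGIN block.
    out = ["%-15s %-15s %s" % ("Host 1", "Host 2", "Total Bytes")]
    for line in lines[2:]:            # skip the two input header lines
        fields = line.split()         # whitespace split, like awk's default
        if len(fields) < 9:
            continue                  # ignore blank or short lines
        # awk's $1, $3, $9 are indices 0, 2, 8 here.
        out.append("%-15s %-15s %11s" % (fields[0], fields[2], fields[8]))
    return out

rows = [
    "header line 1",
    "header line 2",
    "192.168.0.14 <-> 192.168.0.13 3898 4872033 1971 120545 5869 4992578 0.001886000 283.6363",
]
print("\n".join(summarize(rows)))
```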

In Perl:
tshark | perl -lane 'print join "\t", ($F[0], $F[2], $F[8])'
The -a option splits each line of stdin into an array called @F. The column numbers in the header don't correspond well to the array index numbers because -a splits on whitespace by default, so the <-> between the hosts counts as a field of its own. You can set the delimiter with -F if you like.
-F would help get the headers aligned correctly too, but to just skip the misaligned headers, add next if $. < 3; before print to skip the first two lines.

Given your output is in filename:
sed 's/ \+/ /g' filename | tail -n +3 | cut -f1,3,9 -d ' ' | sed 's/ /\t/g' | sort -r -n -k3
replace multiple spaces with a single one, for tokenizing
discard the first two header lines
project columns 1, 3, and 9
replace spaces with tabs to have columns back
sort desc by total bytes
output:
192.168.0.14 192.168.0.13 4992578
192.168.0.162 192.168.0.71 3984
192.168.0.75 192.168.0.58 2988
192.168.0.41 192.168.0.13 2988
192.168.0.227 192.168.0.157 2988
192.168.0.221 192.168.0.94 2988
192.168.0.191 192.168.0.50 2988
192.168.0.191 192.168.0.150 2988
192.168.0.206 192.168.0.153 1992
224.0.0.252 192.168.0.13 207

Regex for soccer data

Why isn't my regex working? It just returns back the original file. My file looks like this (for a few hundred lines):
1 Germany 1765 0 Equal
2 Argentina 1631 0 Equal
3 Colombia 1488 1 Up
4 Netherlands 1456 -1 Down
5 Belgium 1444 0 Equal
6 Brazil 1291 1 Up
7 Uruguay 1243 -1 Down
8 Spain 1228 -1 Down
9 France 1202 1 Up
...
192 US Virgin Islands 28 -1 Down
And I want this:
Germany,1
Argentina,2
Colombia,3
...
US Virgin Islands,192
This is the regex I tried:
sed 's/\([0-9]*\)\t\([a-zA-Z]*\)/\2,\1/g' <fifa.csv >fifa.csv
But it just returns the original file.
EDIT:
Now I tried
sed 's/\([0-9]*\)\t\([a-zA-Z]*\)/\2,\1/g' <fifa.csv >fifa.csv
and got
,1 Germany,,1765Equal,0,
,2 Argentina,,1631Equal,0,
,3 Colombia,,1488Up,1,
,4 Netherlands,,1456-Down,1,
,5 Belgium,,1444Equal,0,

You could try the below sed command if the fields are tab-separated.
sed 's/^\([0-9]\+\)\t\([^\t]*\).*/\2,\1/' file
Add the inline-edit option -i to save the changes made.
sed -i 's/^\([0-9]\+\)\t\([^\t]*\).*/\2,\1/' file
^ anchors the match at the start of the line. \+ repeats the previous character one or more times; sed uses BRE by default, where a bare + is a literal plus sign, so you need to escape it to get the repetition (\+ and \t are GNU sed extensions). [^\t]* matches zero or more characters other than a tab.
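To see the pattern at work outside sed, here is a small Python sketch of the same substitution. It assumes, as the answer does, that the fields are tab-separated; name_rank is a made-up helper name.

```python
import re

def name_rank(line):
    # Group 1: the rank at the start of the line.
    # Group 2: everything up to the next tab (the team name, spaces included).
    # The trailing .* swallows the remaining columns so they are dropped.
    return re.sub(r'^([0-9]+)\t([^\t]*).*', r'\2,\1', line)

print(name_rank("1\tGermany\t1765\t0\tEqual"))
print(name_rank("192\tUS Virgin Islands\t28\t-1\tDown"))
```

Because the name is matched as "anything but a tab", multi-word names such as US Virgin Islands stay intact.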

The following is what you are looking for. The -i option specifies that files are to be edited in-place.
sed -i 's/^\([0-9]\+\)\t\([^\t]*\).*/\2,\1/' fifa.csv

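awk -F'\t' '{print $2 "," $1}' YourFile
Not sed, but easier to manage. -F'\t' assumes the fields are tab-separated; without it awk splits on any whitespace, and a multi-word name such as US Virgin Islands would be cut off after its first word.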

Print specific number of lines furthest from the current pattern match and just before matching another pattern

I have a tab-delimited file such as the one below. A group starts at a line with E in the last column, and the records within a group are sorted by that column. I want to print a specific number of records with the smallest values in each group, that is, the records furthest from the E line. For example, I want to print the two lines furthest from the first occurrence of E (Jack's group), and likewise for the second occurrence of E (Gareth's group).
Jack 2 98 E
Jones 6 25 8.11
Mike 8 11 5.22
Jasmine 5 7 4
Simran 5 7 3
Gareth 1 85 E
Jones 4 76 178.32
Mark 11 12 157.3
Steve 17 8 88.5
Clarke 3 7 12.3
Vid 3 7 2.3
I want my result to be
Jasmine 5 7 4
Simran 5 7 3
Clarke 3 7 12.3
Vid 3 7 2.3
There can be a different number of records in each group. I tried with grep:
grep -B 2 F$ inputfile.txt
But it repeats the E lines in the results and does not work for the last group.

quick & dirty:
kent$ awk '/E$/&&a&&b{print b RS a;a=b="";next}{b=a;a=$0}END{print b RS a}' file
Jasmine 5 7 4
Simran 5 7 3
Clarke 3 7 12.3
Vid 3 7 2.3

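Using arrays of arrays in GNU Awk version 4, you can try
gawk -vnum=2 -f e.awk input.txt
where e.awk is:
$4=="E" {                # a line with E in the last column starts a new group
    N[j++]=i             # store the record count of the group that just ended
    i=0
}
{
    l[j][++i]=$0         # record i of group j (the E line itself is record 1)
}
END {
    N[j]=i; ngr=j        # close the last group
    for (i=1; i<=ngr; i++) {
        m=N[i]
        for (j=m-num+1; j<=m; j++)   # the last num records of group i
            print l[i][j]
    }
}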

I don't see an F in your last column. But assuming you want to get the 2 lines above every line ending in E:
grep -B2 'E$' <(cat inputfile.txt;echo "E")|sed "/E$\|^--/d"
Should do the trick.
'E$' looks for an "E" at the end of a line
the -B2 gets the 2 lines before as well
<(cat inputfile.txt;echo "E") adds an "E" as the last line so that the final group is matched as well (this does not change the actual file)
sed "/E$\|^--/d" deletes all lines ending in "E" or beginning with "--" (the group separator printed by grep)
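The steps above (treat each E line as a boundary, keep the last two lines before the next boundary or the end of input) can also be sketched without grep. This Python version is only an illustration under the assumption that E is the last whitespace-separated field of a group's first line; tail_of_groups is a made-up name.

```python
def tail_of_groups(lines, n=2):
    groups = []
    for line in lines:
        if line.split()[-1:] == ["E"]:
            groups.append([])           # an E line opens a new group
        elif groups and line.strip():
            groups[-1].append(line)     # ordinary record of the current group
    # Keep the last n records of every group, in input order.
    return [line for group in groups for line in group[-n:]]

data = [
    "Jack 2 98 E", "Jones 6 25 8.11", "Mike 8 11 5.22",
    "Jasmine 5 7 4", "Simran 5 7 3",
    "Gareth 1 85 E", "Jones 4 76 178.32", "Mark 11 12 157.3",
    "Steve 17 8 88.5", "Clarke 3 7 12.3", "Vid 3 7 2.3",
]
print("\n".join(tail_of_groups(data)))
```

Since the E lines are never stored in a group, they cannot leak into the output, and the final group needs no artificial terminator.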

awk '$2 ~/5|3/ && $3 ~/7/' file
Jasmine 5 7 4
Simran 5 7 3
Clarke 3 7 12.3
Vid 3 7 2.3

Replace first two whitespace occurrences with a comma using sed

I have a whitespace delimited file with a variable number of entries on each line. I want to replace the first two whitespaces with commas to create a comma delimited file with three columns.
Here's my input:
a b 1 2 3 3 2 1
c d 44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z y 2 3 33
And here's my desired output:
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33
I'm trying to use perl regular expressions in a sed command but I can't quite get it to work. First I try capturing a word, followed by a space, then another word, but that only works for lines 1, 2, and 5:
$ cat test | sed -r 's/(\w)\s+(\w)\s+/\1,\2,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z,y,2 3 33
I also try capturing whitespace, a word, and then more whitespace, but that gives me the same result:
$ cat test | sed -r 's/\s+(\w)\s+/,\1,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z,y,2 3 33
I also try doing this with the .? wildcard, but that does something funny to line 4.
$ cat test | sed -r 's/\s+(.?)\s+/,\1,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh,,77 88 99
z,y,2 3 33
Any help is much appreciated!

How about this:
sed -e 's/\s\+/,/' | sed -e 's/\s\+/,/'
It's probably possible with a single sed command, but this is sure an easy way :)
My output:
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33

Try this:
sed -r 's/\s+(\S+)\s+/,\1,/'
Just replaced \w (one "word" char) with \S+ (one or more non-space chars) in one of your attempts.
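To make the difference between \w and \S+ concrete, here is a small Python sketch that runs both variants on the line that failed. The helper join_first_fields is invented for this example; its second argument is the sub-pattern used for the middle field.

```python
import re

def join_first_fields(line, field):
    # Replace "<spaces><field><spaces>" once, keeping the field between commas.
    return re.sub(r'\s+(' + field + r')\s+', r',\1,', line, count=1)

line = "line http://google.com 100 200 300"
# \w is exactly one word character, so no multi-character field can match.
print(join_first_fields(line, r'\w'))
# \S+ is any run of non-space characters, including ':' and '/'.
print(join_first_fields(line, r'\S+'))
```

With \w the line comes back unchanged, which is what the question observed; with \S+ the first two separators become commas.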

You can provide multiple commands to a single instance of sed by just providing multiple -e arguments.
To do the first two, just use:
sed -e 's/\s\+/,/' -e 's/\s\+/,/'
This basically runs both commands on the line in sequence, the first doing the first block of whitespace, the second doing the next.
The following transcript shows this in action:
pax$ echo 'a b 1 2 3 3 2 1
c d 44 55 66 2355
line http://google.com 100 200 300
ef jh 77 88 99
z y 2 3 33
' | sed -e 's/\s\+/,/' -e 's/\s\+/,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33

sed's s/// supports a way to say which occurrence of a pattern to replace: just add a number n to the end of the command to replace only the nth occurrence. So, to replace the first and second occurrences of whitespace, just use it this way (the pattern is one space followed by " *", that is, one or more spaces; a lone " *" would also match the empty string at the start of the line):
$ sed 's/  */,/1;s/  */,/2' input
a,b ,1 2 3 3 2 1
c,d ,44 55 66 2355
line,http://google.com 100,200 300
ef,jh ,77 88 99
z,y 2,3 33
EDIT: reading the other proposed solutions, I noticed that the 1 and 2 after s/  */,/ are not only unnecessary but plainly wrong. By default, s/// replaces only the first occurrence of the pattern. So, if we have two identical s/// commands in sequence, they will replace the first and the second occurrence. What you need is just
$ sed 's/  */,/;s/  */,/' input
(Note that you can put two sed commands in one expression if you separate them by a semicolon. Some sed implementations do not accept the semicolon after the s/// command; in that case, use a newline to separate the commands.)
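Python's re.sub has no "nth occurrence" flag, but its count argument limits how many matches are replaced, which gives the same effect as the two chained s/// commands. A minimal sketch, with first_two_to_commas as a made-up name:

```python
import re

def first_two_to_commas(line):
    # count=2 stops after the first two runs of whitespace;
    # \s+ needs at least one character, so it never matches the empty string.
    return re.sub(r'\s+', ',', line, count=2)

for line in ["a b 1 2 3 3 2 1", "line http://google.com 100 200 300"]:
    print(first_two_to_commas(line))
```

This assumes the line has no leading whitespace; if it did, that leading run would count as the first occurrence.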

A Perl solution is:
perl -pe '$_=join ",", split /\s+/, $_, 3' some.file

Not sure about sed/perl, but here's an (ugly) awk solution. It just prints fields 1-2, separated by commas, then the remaining fields separated by space:
awk '{
    printf("%s,%s,", $1, $2)                     # first two fields, comma-terminated
    for (i=3; i<=NF; i++)
        printf("%s%s", $i, (i<NF ? " " : ""))    # no trailing space after the last field
    printf("\n")
}' myfile.txt
