Assume a multi-line text file (file) and the keyword bar.
> cat file
foo bar baz
foo bar quux
foo quux bar
Each line that ends with the keyword shall be prepended with the string Hello; each line that does not shall be printed as is.
> cat file | sought_command
foo bar baz
foo bar quux
Hello foo quux bar
I believe that this can be done via awk (something along the lines of awk '$ ~ /bar/ {print "Hello", $0}'), but I cannot come up with the correct code and would appreciate suggestions.
You are almost on the right track with Awk. Compare the last field, $NF, with the keyword and prepend the string when they are equal:
awk '$NF == "bar"{$0="Hello"FS$0}1' file
This prepends the string only to lines whose last field is the keyword.
You can use this awk command:
awk '/bar$/{$0 = "Hello" FS $0} 1' file
foo bar baz
foo bar quux
Hello foo quux bar
This will check if a line ends with bar and if it does then it will prefix that line with string "Hello ".
If line doesn't end with bar then that line will be printed as is.
With sed:
$ sed 's/.* bar$/Hello &/' infile
foo bar baz
foo bar quux
Hello foo quux bar
The space before bar makes sure to not match lines ending in foobar; it would break for lines containing bar and nothing else, though.
With awk, if you want to match only bar and not foobar:
$ awk '$NF == "bar" { $0 = "Hello " $0 }1' infile
foo bar baz
foo bar quux
Hello foo quux bar
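For comparison, here is a minimal Python sketch of the same last-field test. The function name prepend_if_last_word and its parameters are made up for this illustration. Like the $NF == "bar" check, it also matches a line that consists of bar alone, and it leaves lines ending in foobar untouched.

```python
def prepend_if_last_word(lines, keyword="bar", prefix="Hello"):
    """Prefix each line whose last whitespace-separated word equals keyword."""
    result = []
    for line in lines:
        words = line.split()
        # Like awk's $NF == "bar": compare the last field, not a substring.
        if words and words[-1] == keyword:
            line = f"{prefix} {line}"
        result.append(line)
    return result


sample = ["foo bar baz", "foo bar quux", "foo quux bar"]
for out in prepend_if_last_word(sample):
    print(out)
```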
There is a set of strings:
"foo","bar","baz", "boo", "123"
Which appear in text in random order, but numbers appear first:
Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
Some other text
How can I match the lines where they appear using a regular expression?
To perform the task you may try the shielding method:
Shield the lines containing duplicates as such lines should be excluded. You may then delete them if needed.
Match all the lines containing the words from the set.
So, an example text:
Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
123 foo baz bar foo boo
123 foo bar bar
123 boo baz foo asdf
Some other text
First, find the lines that contain a duplicate, using the following regex:
(^.*(foo|bar|baz|boo|123).*\2)
This means: match from the beginning of the line through the second occurrence of any word from the set, so it only matches lines in which at least one word of the set appears twice.
Then shield these lines with the replacement using regex:
#SHIELD#\1
We will get the following text:
Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
#SHIELD# 123 foo baz bar foo boo
#SHIELD# 123 foo bar bar
123 boo baz foo asdf
Some other text
Or delete these lines if needed.
Then we will be able to get needed lines from the rest. Let us mark them with the replacement:
Find: ^(?!#SHIELD#)(\s*123.*(baz|bar|foo|boo).*)$
(search only not shielded lines beginning with spaces, 123 and then any text with at least one match from the set).
Replace by: #MYLINE#\1
We get the text:
Some text
#MYLINE# 123 baz bar foo
#MYLINE# 123 bar baz foo
#MYLINE# 123 foo baz bar
#SHIELD# 123 foo baz bar foo boo
#SHIELD# 123 foo bar bar
#MYLINE# 123 boo baz foo asdf
Some other text
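The two passes above could be sketched in Python roughly as follows. The function name shield_and_mark is invented for this illustration, and the sketch uses [ \t]* where the answer uses \s*, so that under re.M a match cannot run across a line break. That substitution is an assumption, not part of the original answer.

```python
import re

WORDS = "foo|bar|baz|boo|123"


def shield_and_mark(text):
    """Shield lines with a repeated word, then mark the remaining 123 lines."""
    # Pass 1: prefix lines in which some word of the set occurs twice.
    shielded = re.sub(rf"^(.*({WORDS}).*\2)", r"#SHIELD#\1", text, flags=re.M)
    # Pass 2: mark unshielded lines that start with 123 and contain a word.
    return re.sub(
        r"^(?!#SHIELD#)([ \t]*123.*(?:baz|bar|foo|boo).*)$",
        r"#MYLINE#\1",
        shielded,
        flags=re.M,
    )


sample = "Some text\n123 baz bar foo\n123 foo bar bar\nSome other text"
print(shield_and_mark(sample))
```

The markers are attached without a separating space here because the sample lines carry no leading whitespace.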
Assume a text file (file) that contains some lines with the keyword foo.
> cat file
bar bar baz qux
bar foo bar baz
bar foo qux bar
I would like to replace any occurrence of character r that occurs between (and including) n and m characters after the end of the keyword with the character z.
Example with n=1 and m=3:
> sought_command file
bar bar baz qux
bar foo baz baz # Replacement only here!
bar foo qux bar
Example with n=4 and m=6:
> sought_command file
bar bar baz qux
bar foo bar baz
bar foo qux baz # Replacement only here!
With sed:
n=1;m=3
sed -E ':a;s/(foo( *[^ ]){'"$n"','"$m"'} *)r/\1z/;ta' file
Here :a defines a label, and ta jumps back to it whenever the substitution succeeded, so the loop repeats until there is no more "r" to replace in that range.
You can use this awk script:
$> cat mark.awk
p = index($0, kw) {                   # only for lines containing the keyword
    b = p + length(kw) + 1            # position just past the keyword and the character after it
    part = substr($0, b+n, m-n+1)     # substring from offset n to offset m, inclusive
    gsub(/r/, "z", part)              # replace "r" by "z" only in the substring
    # now reconstruct the line: text before the window, the window, text after it
    $0 = substr($0, 1, b+n-1) part substr($0, b+m+1)
} 1                                   # default action: print the line
Now run this script with your parameters on command line:
$> awk -v kw='foo' -v n=1 -v m=3 -f mark.awk file
bar bar baz qux
bar foo baz baz
bar foo qux bar
$> awk -v kw='foo' -v n=4 -v m=6 -f mark.awk file
bar bar baz qux
bar foo bar baz
bar foo qux baz
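A rough Python equivalent of the windowed replacement, as a sketch only. The function name replace_in_window is hypothetical. It assumes one particular reading of the question: exactly one separator character follows the keyword, offset 0 is the character after that separator, and the window runs from offset n to offset m inclusive. That reading reproduces both examples, but other readings are possible.

```python
def replace_in_window(line, kw, n, m, old="r", new="z"):
    """Replace old by new between offsets n and m (inclusive) after kw."""
    p = line.find(kw)
    if p < 0:
        return line  # keyword absent: leave the line alone
    # Index of the character after the separator that follows the keyword
    # (offset 0 under the assumption stated above).
    base = p + len(kw) + 1
    start, end = base + n, base + m + 1  # half-open slice covering n..m
    return line[:start] + line[start:end].replace(old, new) + line[end:]


for text in ["bar bar baz qux", "bar foo bar baz", "bar foo qux bar"]:
    print(replace_in_window(text, "foo", 1, 3))
```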
I want to add a tracing line to each method in many C# files (see example below), and I want to do this automatically, of course.
My approach would be to use a regex to match text that starts with public or private, contains no ; (to exclude members), contains parentheses (to exclude class definitions), and runs up to the opening {, possibly spanning multiple lines, and then to add my line after it.
sed would be my natural choice, but unfortunately it is less suited for matching over multiple lines.
I almost don't know perl but I managed the following:
perl -0777 -i.original -pe 's/((private|public)[^;]*?\)\s*?{)/\1\nActivityLoggers.traceMethod();/igs' testFile.cs
This works fine but I'd like to add the line indented. Assuming that the { is always on a separate line I could just reuse it replacing the { with my text, but here my not knowing perl blocks me. Would appreciate any help.
As a bonus, you could help exclude constructors :)
EXAMPLE:
Make this
public partial class AClass : BClass
{
private static string name;
private void Method1(int i, string s)
{
doSomethng();
}
public void Method2
(int i, string s)
{
doSomethngElse();
}
}
into this
public partial class AClass : BClass
{
private static string name;
private void Method1(int i, string s)
{
ActivityLoggers.traceMethod();
doSomethng();
}
public void Method2
(int i, string s)
{
ActivityLoggers.traceMethod();
doSomethngElse();
}
}
(In case you're wondering, I do fetch the calling method and class names in traceMethod() using StackTrace :) )
To work off of what you posted, use this pattern
((private|public)[^;]*?\)\s*?{)(?=\R+(\s+))
and replace with
$1\n$3ActivityLoggers.traceMethod();
This might work for you (GNU sed):
sed -r ':a;/^\s*(public|private)/,/^\s*\{\s*$/{/^\s*\{\s*$/!b;n;/^\s*(public|private)/ba;h;s/\S.*/ActivityLoggers.traceMethod();/p;g}' file
This looks for the range starting with public or private and ending with a { on a line by itself. It then reads the next line, and if that line begins with public or private, it loops. Otherwise it copies the line, replaces everything from the indent onwards with the required string, and prints the result. It then retrieves the copied line and prints that.
With sed :
sed '/\(private\|public\)[^;]*/, /\}/ {
/\(^[ \t]*\)\([^(]*();\)/ s//\1ActivityLoggers.traceMethod();\n\1\2\n/;
}' sourcefile
/\(private\|public\)[^;]*/, /\}/ defines an address range: the lines from one that matches private or public up to the next line containing a } (the end of the method block).
Within each such range, we search for a line holding a method call and apply the substitution, adding the new line by means of the capture groups.
With GNU awk for multi-char RS and gensub():
$ gawk -v RS='^$' -v ORS= '{print gensub(/((private|public)[^;)]+\)\s*{)(\s*)/,"\\1\\3ActivityLoggers.traceMethod();\\3","g")}' file
public partial class AClass : BClass
{
private static string name;
private void Method1(int i, string s)
{
ActivityLoggers.traceMethod();
doSomethng();
}
public void Method2
(int i, string s)
{
ActivityLoggers.traceMethod();
doSomethngElse();
}
}
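The same idea (capture the signature up to the {, then reuse the indentation of the first body line) could look like this in Python. It is a sketch under assumptions: the names TRACE, METHOD and add_trace are invented here, the character class also excludes { so that a class header cannot match across its own brace, and constructors are not filtered out.

```python
import re

TRACE = "ActivityLoggers.traceMethod();"

# private/public, no ';' or '{' before the closing ')', then the opening '{'.
# The lookahead captures the indentation of the first line of the body.
METHOD = re.compile(r"((?:private|public)[^;{]*?\)\s*\{)(?=\n([ \t]*))")


def add_trace(source):
    """Insert the trace call as the first statement of each matched method."""
    return METHOD.sub(lambda m: f"{m.group(1)}\n{m.group(2)}{TRACE}", source)


code = (
    "public partial class AClass : BClass\n"
    "{\n"
    "    private static string name;\n"
    "    private void Method1(int i, string s)\n"
    "    {\n"
    "        doSomethng();\n"
    "    }\n"
    "}\n"
)
print(add_trace(code))
```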
I have a string:
foo bar
foo1 #9 0x103806f4 bar1
foo2 #10 0x0f6dd704 bar2
foo3 bar3
I have tried the following:
^((?!#[\d]{1,2} 0x[0-9a-f]{8}).)*$
which gets
foo bar
foo3 bar3
and
^((?!#[\d]{1,2} 0x[0-9a-f]{8}).)*
which gets
foo bar
foo1
foo2
foo3 bar3
But what I'm trying to get is
foo bar
foo1 bar1
foo2 bar2
foo3 bar3
How can I achieve this?
You need to replace rather than match in order to get the desired output.
\s*#\d{1,2} 0x[0-9a-f]{8}
Use the above regex and then replace the match with an empty string.
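In Python, for instance, that replacement might be written as below. The helper name strip_frames is made up for this sketch; the pattern is the one given above.

```python
import re


def strip_frames(text):
    """Remove each '#<1-2 digits> 0x<8 hex digits>' token and the space before it."""
    return re.sub(r"\s*#\d{1,2} 0x[0-9a-f]{8}", "", text)


s = """foo bar
foo1 #9 0x103806f4 bar1
foo2 #10 0x0f6dd704 bar2
foo3 bar3"""
print(strip_frames(s))
```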
If you're wanting the beginning and ending non-whitespace characters, using a Negative Lookahead is not going to do the job. You could match your expected output as follows:
^(\S+).*?(\S+)$
Then in your preferred language, you can combine the match results: python example ...
>>> import re
>>> s = '''foo bar
... foo1 #9 0x103806f4 bar1
... foo2 #10 0x0f6dd704 bar2
... foo3 bar3'''
>>> for m in re.finditer(r'(?m)^(\S+).*?(\S+)$', s):
...     print(" ".join(m.groups()))
...
foo bar
foo1 bar1
foo2 bar2
foo3 bar3
Instead of using regex, consider splitting the string and joining the indexes together.
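A minimal sketch of that split-and-join approach, assuming the goal is to keep only the first and last whitespace-separated field of each line (the function name first_and_last is hypothetical):

```python
def first_and_last(text):
    """Keep only the first and last field of every line that has two or more."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            line = f"{fields[0]} {fields[-1]}"
        out.append(line)
    return "\n".join(out)


s = """foo bar
foo1 #9 0x103806f4 bar1
foo2 #10 0x0f6dd704 bar2
foo3 bar3"""
print(first_and_last(s))
```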
I'd like to return matches for a given search-string in a string. Plus the next word after the search-string.
Phrase to search for: "foobar foo"
Example Input:
foo foobar foo bar1 foobar1
foobar foos bar2 foobar2
foo barfoobar foos bar3 foobar3
Desired Matches:
foobar foo bar1
foobar foos bar2
barfoobar foos bar3
Use the regex pattern
\b\w*foobar foo\w*\s+\w+\b
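To try the pattern against the example input, something along these lines should do in Python. The function name find_phrase_with_next_word is made up for this sketch. Note that \s+ also matches a newline, so the "next word" could come from the following line if the phrase sits at the end of a line.

```python
import re

PATTERN = re.compile(r"\b\w*foobar foo\w*\s+\w+\b")


def find_phrase_with_next_word(text):
    """Return every match of the search phrase together with the word after it."""
    return PATTERN.findall(text)


text = """foo foobar foo bar1 foobar1
foobar foos bar2 foobar2
foo barfoobar foos bar3 foobar3"""
for match in find_phrase_with_next_word(text):
    print(match)
```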