Search and replace with sed in specific interval after keyword - regex

Assume a text file named file that contains some lines with the keyword foo.
> cat file
bar bar baz qux
bar foo bar baz
bar foo qux bar
I would like to replace any occurrence of character r that occurs between (and including) n and m characters after the end of the keyword with the character z.
Example with n=1 and m=3:
> sought_command file
bar bar baz qux
bar foo baz baz # Replacement only here!
bar foo qux bar
Example with n=4 and m=6:
> sought_command file
bar bar baz qux
bar foo bar baz
bar foo qux baz # Replacement only here!

With sed:
n=1;m=3
sed -E ':a;s/(foo( *[^ ]){'"$n"','"$m"'} *)r/\1z/;ta' file
Here :a defines a label, and ta jumps back to it whenever the preceding s command made a substitution, so the loop repeats until there is no more "r" to replace within the interval.
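For readers who want to experiment outside sed, the same loop can be sketched in Python. This is only a sketch and not part of the original answer: the function name replace_in_interval and its parameters are hypothetical. It mirrors the sed expression, so spaces are skipped when counting, and between n and m non-space characters must sit between the keyword and the replaced character.

```python
import re

def replace_in_interval(line, keyword, n, m, old="r", new="z"):
    """Mimic the sed loop: replace `old` when n..m non-space characters
    separate it from the end of `keyword` (hypothetical helper)."""
    if old in new:
        # The loop below could run forever in this case.
        raise ValueError("replacement must not contain the searched text")
    quantifier = "{%d,%d}" % (n, m)
    pattern = re.compile(
        "(" + re.escape(keyword) + "(?: *[^ ])" + quantifier + " *)" + re.escape(old)
    )
    while True:
        # One substitution per pass, like s/.../.../ followed by `ta`.
        line, count = pattern.subn(lambda mo: mo.group(1) + new, line, count=1)
        if count == 0:
            return line

for text in ["bar bar baz qux", "bar foo bar baz", "bar foo qux bar"]:
    print(replace_in_interval(text, "foo", 1, 3))
```

With n=1 and m=3 only the second sample line changes (to bar foo baz baz), matching the output requested in the question.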

You can use this awk script:
$> cat mark.awk
p = index($0, kw) {                # only for lines containing the keyword
    b = p + length(kw) + 1         # first position after the keyword and one separator character
    part = substr($0, b+n, m-n+1)  # substring covering offsets n..m counted from b
    gsub(/r/, "z", part)           # replace "r" with "z" only in that substring
    # now reconstruct the line: head, modified part, untouched tail
    $0 = substr($0, 1, b+n-1) part substr($0, b+m+1)
} 1                                # default action: print the line
Now run this script with your parameters on the command line:
$> awk -v kw='foo' -v n=1 -v m=3 -f mark.awk file
bar bar baz qux
bar foo baz baz
bar foo qux bar
$> awk -v kw='foo' -v n=4 -v m=6 -f mark.awk file
bar bar baz qux
bar foo bar baz
bar foo qux baz
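The position-based idea behind the awk script can also be sketched in Python with plain slicing. The helper name replace_by_position is hypothetical. Offsets are assumed to be counted the way the awk script counts them: from the first character after the keyword and one separator character, with spaces included.

```python
def replace_by_position(line, kw, n, m, old="r", new="z"):
    """Replace `old` with `new` inside a fixed character window after `kw`
    (hypothetical helper mirroring the awk script's offsets)."""
    pos = line.find(kw)
    if pos < 0:
        return line                  # keyword absent: leave the line alone
    b = pos + len(kw) + 1            # 0-based index just past the keyword and one separator
    start, end = b + n, b + m + 1    # window covers offsets n..m from b, inclusive
    return line[:start] + line[start:end].replace(old, new) + line[end:]

for text in ["bar bar baz qux", "bar foo bar baz", "bar foo qux bar"]:
    print(replace_by_position(text, "foo", 4, 6))
```

Unlike the sed variant, this window counts every character, so the two approaches can disagree on lines with different spacing.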

Instantiating a Case Class from a Large Parameter List

I run into trouble when a case class has a large parameter list, but with only a few parameters everything works perfectly. Does anyone have any idea what the cause might be?
Small parameter list, OK:
scala> case class Foo(a: Int, b: String, c: Double)
defined class Foo
scala> val params = Foo(1, "bar", 3.14).productIterator.toList
params: List[Any] = List(1, bar, 3.14)
scala> Foo.getClass.getMethods.find(x => x.getName == "apply" && x.isBridge).get.invoke(Foo, params map (_.asInstanceOf[AnyRef]): _*).asInstanceOf[Foo]
res0: Foo = Foo(1,bar,3.14)
scala> Foo(1, "bar", 3.14) == res0
res1: Boolean = true
When I have a very large list of parameters, I get the following error:
scala> case class Foo(a1: String,a2: String,a3: String,a4: String,a5: String,a6: String,a7: String,a8: String,a9: String,a10: String,a12: String,a13: String,a14: String,a15: String,a16: String,a17: String,a18: String,a19: String,a20: String,a21: String,a22: String,a23: String,a24: String)
defined class Foo
scala> val params2 = Foo("bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar","bar").productIterator.toList
params2: List[Any] = List(bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar, bar)
scala> val test = Foo.getClass.getMethods.find(x => x.getName == "apply" && x.isBridge).get.invoke(Foo, params2 map (_.asInstanceOf[AnyRef]): _*).asInstanceOf[Foo]
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:347)
at scala.None$.get(Option.scala:345)
... 46 elided
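Case classes have a historical limit of 22 fields. Since Scala 2.11, bigger case classes still compile, but with some limitations. In particular, there is no Function23 trait, so the companion object of a 23-field case class cannot extend a FunctionN type; that is most likely why no bridge apply method exists and find(...) returns None here.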
https://underscore.io/blog/posts/2016/10/11/twenty-two.html

How to match set of strings that appear in random order with regular expression

There is a set of strings:
"foo","bar","baz", "boo", "123"
Which appear in text in random order, but numbers appear first:
Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
Some other text
How can I match the lines where they appear using a regular expression?
To perform the task you may try the shielding method:
Shield the lines containing duplicates as such lines should be excluded. You may then delete them if needed.
Match all the lines containing the words from the set.
So, an example text:
Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
123 foo baz bar foo boo
123 foo bar bar
123 boo baz foo asdf
Some other text
First, we should find the lines containing duplicates using the following regex:
(^.*(foo|bar|baz|boo|123).*\2)
This means: match from the beginning of the line up to the repeated occurrence of a word from the set that appears at least twice on that line (\2 is a backreference to the word captured by the second group).
Then shield these lines by replacing each match with:
#SHIELD#\1
We will get the following text:
Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
#SHIELD# 123 foo baz bar foo boo
#SHIELD# 123 foo bar bar
123 boo baz foo asdf
Some other text
Or delete these lines if needed.
Then we will be able to get needed lines from the rest. Let us mark them with the replacement:
Find: ^(?!#SHIELD#)(\s*123.*(baz|bar|foo|boo).*)$
(this matches only unshielded lines that begin with optional whitespace, then 123, followed by any text containing at least one word from the set).
Replace by: #MYLINE#\1
We get the text:
Some text
#MYLINE# 123 baz bar foo
#MYLINE# 123 bar baz foo
#MYLINE# 123 foo baz bar
#SHIELD# 123 foo baz bar foo boo
#SHIELD# 123 foo bar bar
#MYLINE# 123 boo baz foo asdf
Some other text
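The two passes above can be tried out with a short Python sketch that applies both regexes line by line. The function name mark_lines is hypothetical. One assumption: the sample lines here carry no leading whitespace, so a space is added after each marker to reproduce the listings shown above.

```python
import re

# Regexes taken from the answer above.
DUPLICATE = r"(^.*(foo|bar|baz|boo|123).*\2)"
WANTED = r"^(?!#SHIELD#)(\s*123.*(baz|bar|foo|boo).*)$"

def mark_lines(text):
    """Shield lines with a repeated word, then mark the remaining matches."""
    marked = []
    for line in text.splitlines():
        line = re.sub(DUPLICATE, r"#SHIELD# \1", line)  # pass 1: shield duplicates
        line = re.sub(WANTED, r"#MYLINE# \1", line)     # pass 2: mark wanted lines
        marked.append(line)
    return "\n".join(marked)

sample = """Some text
123 baz bar foo
123 bar baz foo
123 foo baz bar
123 foo baz bar foo boo
123 foo bar bar
123 boo baz foo asdf
Some other text"""

print(mark_lines(sample))
```

Processing one line at a time keeps \s* from running across line breaks, which could happen if the second regex were applied to the whole text in multiline mode.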

Prepend string to line if line ends in keyword

Assume a multi-line text file (file) and the keyword bar.
> cat file
foo bar baz
foo bar quux
foo quux bar
Each line that ends with the keyword shall be prepended with the string Hello; each line that does not shall be printed as is.
> cat file | sought_command
foo bar baz
foo bar quux
Hello foo quux bar
I believe that this can be done via awk (something along the lines of awk '$ ~ /bar/ {print "Hello", $0}'), but I cannot come up with the correct code and would appreciate suggestions.
You are almost on the right track with Awk. Just compare the last field, $NF, with the keyword and prepend the string as needed:
awk '$NF == "bar"{$0="Hello"FS$0}1' file
This prepends the string only to those lines whose last field is the keyword.
You can use this awk command:
awk '/bar$/{$0 = "Hello" FS $0} 1' file
foo bar baz
foo bar quux
Hello foo quux bar
This will check if a line ends with bar and if it does then it will prefix that line with string "Hello ".
If line doesn't end with bar then that line will be printed as is.
With sed:
$ sed 's/.* bar$/Hello &/' infile
foo bar baz
foo bar quux
Hello foo quux bar
The space before bar makes sure to not match lines ending in foobar; it would break for lines containing bar and nothing else, though.
With awk, if you want to match only bar and not foobar:
$ awk '$NF == "bar" { $0 = "Hello " $0 }1' infile
foo bar baz
foo bar quux
Hello foo quux bar
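For comparison, the last-field check used in the awk answers can be sketched in Python. The helper name prepend_if_last_word is hypothetical. Like $NF == "bar", it compares the whole last whitespace-separated word, so a line ending in foobar is left alone, and a line consisting only of bar is still matched.

```python
def prepend_if_last_word(line, keyword="bar", prefix="Hello"):
    """Prepend `prefix` when the last whitespace-separated word is `keyword`."""
    fields = line.split()
    if fields and fields[-1] == keyword:
        return prefix + " " + line
    return line

for text in ["foo bar baz", "foo bar quux", "foo quux bar"]:
    print(prepend_if_last_word(text))
```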

Regex: Everything except some pattern

I have a string:
foo bar
foo1 #9 0x103806f4 bar1
foo2 #10 0x0f6dd704 bar2
foo3 bar3
I have tried the following:
^((?!#[\d]{1,2} 0x[0-9a-f]{8}).)*$
which gets
foo bar
foo3 bar3
and
^((?!#[\d]{1,2} 0x[0-9a-f]{8}).)*
which gets
foo bar
foo1
foo2
foo3 bar3
But what I'm trying to get is
foo bar
foo1 bar1
foo2 bar2
foo3 bar3
How can I achieve this?
You need to replace instead of match in order to get the desired output.
\s*#\d{1,2} 0x[0-9a-f]{8}
Use the above regex and then replace the match with an empty string.
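As a quick check, the replacement can be tried in Python with re.sub. This is a minimal sketch using the sample from the question; the names ADDRESS and strip_address are made up for illustration.

```python
import re

# Pattern from the answer: optional whitespace, "#" plus 1-2 digits, a space,
# then "0x" followed by eight lowercase hex digits.
ADDRESS = re.compile(r"\s*#\d{1,2} 0x[0-9a-f]{8}")

def strip_address(text):
    """Remove every frame-number/address pair from `text`."""
    return ADDRESS.sub("", text)

sample = "foo bar\nfoo1 #9 0x103806f4 bar1\nfoo2 #10 0x0f6dd704 bar2\nfoo3 bar3"
print(strip_address(sample))
```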
If you want the first and last runs of non-whitespace characters on each line, a negative lookahead is not going to do the job. You could match your expected output as follows:
^(\S+).*?(\S+)$
Then, in your preferred language, you can combine the match results. A Python example:
>>> import re
>>> s = '''foo bar
foo1 #9 0x103806f4 bar1
foo2 #10 0x0f6dd704 bar2
foo3 bar3'''
...
>>> for m in re.finditer(r'(?m)^(\S+).*?(\S+)$', s):
...     print(" ".join(m.groups()))
foo bar
foo1 bar1
foo2 bar2
foo3 bar3
Instead of using regex, consider splitting each line on whitespace and joining the first and last elements together.
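A minimal Python sketch of that split-and-join idea follows; the function name first_and_last is hypothetical. It assumes the wanted output is always the first and last whitespace-separated word of each line, so a line with a single word yields that word twice, and blank lines are skipped.

```python
def first_and_last(text):
    """Keep only the first and last whitespace-separated word of each line."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            result.append(fields[0] + " " + fields[-1])
    return "\n".join(result)

sample = "foo bar\nfoo1 #9 0x103806f4 bar1\nfoo2 #10 0x0f6dd704 bar2\nfoo3 bar3"
print(first_and_last(sample))
```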

C# Regex Match Search String Plus Next Word

I'd like to return matches for a given search string in a string, plus the next word after the search string.
Phrase to search for: "foobar foo"
Example Input:
foo foobar foo bar1 foobar1
foobar foos bar2 foobar2
foo barfoobar foos bar3 foobar3
Desired Matches:
foobar foo bar1
foobar foos bar2
barfoobar foos bar3
Use the regex pattern:
\b\w*foobar foo\w*\s+\w+\b
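The pattern can be checked against the example input with a short script. The sketch below uses Python rather than C#, purely to keep it quick to run, and the name find_phrases is hypothetical. The pattern uses only constructs (\b, \w, \s) that .NET's Regex class also supports, so it is expected, though not verified here, to behave the same way for this input.

```python
import re

# Pattern from the answer: optional word prefix, the search phrase, the rest
# of its last word, whitespace, and then the next word.
PATTERN = re.compile(r"\b\w*foobar foo\w*\s+\w+\b")

def find_phrases(lines):
    """Return every match of the pattern, searching each line separately."""
    return [mo.group(0) for line in lines for mo in PATTERN.finditer(line)]

lines = [
    "foo foobar foo bar1 foobar1",
    "foobar foos bar2 foobar2",
    "foo barfoobar foos bar3 foobar3",
]
for match in find_phrases(lines):
    print(match)
```

Searching line by line keeps \s+ from pairing a phrase at the end of one line with the first word of the next.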