Begins with string or not in regular expression - regex

I am trying to simplify the two commands below into a single one:
sed -i 's/-XX\:PermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh
sed -i 's/-XX\:MaxPermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh
I tried something like -XX\:(?:Max|)PermSize=128m\s, but without success.

Note that (?:Max|) is a non-capturing group and it is not compliant with the POSIX regex engine that sed uses. You are using a BRE POSIX engine, so, to use a capturing group, you need to use \(...\) and to use an alternation operator, you need \|.
You may use
sed -i 's/-XX:\(Max\)\?PermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh
This is a BRE POSIX expression, thus \(Max\)\? matches an optional Max character sequence.
Or,
sed -i -E 's/-XX:(Max)?PermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh
The -E option enables the POSIX ERE syntax, in which an optional Max character sequence is written as (Max)?.
See the online sed demo
s="ABC-XX:PermSize=128m DEF-XX:MaxPermSize=128m "
sed 's/-XX:\(Max\)\?PermSize=128m\s//g' <<< "$s"
# => ABCDEF
sed -E 's/-XX:(Max)?PermSize=128m\s//g' <<< "$s"
# => ABCDEF
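As a hedged sketch, the same substitution can be tried on a sample line through a pipe before editing the real file in place. The HBASE_OPTS line below is an assumption made for illustration, and [[:space:]] is used in place of \s, which is a GNU extension.

```shell
#!/usr/bin/env bash
# Hypothetical sample line standing in for a line of hbase-env.sh.
sample='export HBASE_OPTS="-Xmx1g -XX:PermSize=128m -XX:MaxPermSize=128m -verbose"'

# ERE: (Max)? makes "Max" optional; [[:space:]] is the POSIX spelling of \s.
result=$(printf '%s\n' "$sample" | sed -E 's/-XX:(Max)?PermSize=128m[[:space:]]//g')

echo "$result"
# => export HBASE_OPTS="-Xmx1g -verbose"
```

Once the output looks right, the same expression can be reused with -i on the actual file.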

You could make Max optional by wrapping it in an optional group, (Max)?:
-XX\:(Max)?PermSize=128m\s
For example (the unescaped group and ? are extended syntax, hence -E):
sed -i -E 's/-XX\:(Max)?PermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh

Try
sed -i 's/-XX\:\(Max\)\?PermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh

Try this:
sed -i -r 's/-XX\:(Max)?PermSize=128m\s//g' /usr/share/hbase/conf/hbase-env.sh
If you are using GNU sed, adding -r is better than all those escapes. Keep -r as a separate option: -ir would be read as -i with the backup suffix r.
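How the options are parsed can be checked on a throwaway file. This is a minimal sketch that assumes GNU sed; the temporary directory, the file name env.sh, and its content are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Assumes GNU sed. The directory and file are throwaway samples.
dir=$(mktemp -d)
f="$dir/env.sh"
printf '%s\n' 'OPTS="-XX:MaxPermSize=128m -verbose"' > "$f"

# Combined form: parsed as -i with backup suffix "r"; the pattern stays BRE,
# so (Max)? is literal text and nothing is replaced.
sed -ir 's/-XX:(Max)?PermSize=128m[[:space:]]//g' "$f"
if [ -e "${f}r" ]; then backup_created=yes; else backup_created=no; fi
after_ir=$(cat "$f")

# Separate options: in-place edit with ERE enabled.
sed -i -E 's/-XX:(Max)?PermSize=128m[[:space:]]//g' "$f"
after_split=$(cat "$f")

echo "backup created by -ir: $backup_created"
echo "after -ir:   $after_ir"
echo "after -i -E: $after_split"
rm -rf "$dir"
```

On BSD/macOS sed, -i takes a mandatory suffix argument, so the invocation would need to be adapted there.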

Related

bash tool to search and replace text (while leaving text in the middle the same)

I have text files that look like this:
foo(bar(some_id))
I want to replace that with
bleh(some_id)
I can come up with the regex to find the instances, which is: foo\(bar\([a-zA-Z0-9_]+\)\). But I don't know how to express that I want to keep the text in the middle the same.
Any suggestions? (I'm thinking of using sed or awk or any standard bash tool, whichever is easier.)
You can use
sed -E 's/foo\(bar\(([^()]*).*/bleh(\1)/'
sed 's/foo(bar(\([^()]*\).*/bleh(\1)/'
The first pattern is POSIX ERE compliant, hence the -E option.
The foo\(bar\(([^()]*).* POSIX ERE pattern matches foo(bar(, then captures zero or more chars other than ( and ) into Group 1 (\1 refers to this group's value in the replacement), and then matches the rest of the string. After the replacement, only the Group 1 value is kept inside bleh(...). You may add .* at the start if there is text before foo(bar(.
The second sed command is POSIX BRE equivalent of the above command.
See an online demo:
s='foo(bar(some_id))'
sed -E 's/foo\(bar\(([^()]*).*/bleh(\1)/' <<< "$s"
# => bleh(some_id)
sed 's/foo(bar(\([^()]*\).*/bleh(\1)/' <<< "$s"
# => bleh(some_id)
Using sed
$ sed 's/.*\(([^)]*)\).*/bleh\1/' input_file
bleh(some_id)
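If a line can contain several occurrences, or text that should survive after the match, a variant that matches both closing parentheses explicitly may be easier to reason about. This is a sketch under the assumption that the inner id never contains parentheses; the input string is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical input with two occurrences and surrounding text.
s='x = foo(bar(a_1)) + foo(bar(b2));'

# Match both closing parentheses explicitly instead of using .*,
# so text after each occurrence is preserved and g can replace every match.
result=$(sed -E 's/foo\(bar\(([^()]*)\)\)/bleh(\1)/g' <<< "$s")

echo "$result"
# => x = bleh(a_1) + bleh(b2);
```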

sed find and replace fastq regex

I have a file such as
head testSed.fastq
#M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:NGTCACTN+TATCCTCTCTTGAAGA
NGTCACTN
+
#>AAAAF#
#M01551:51:000000000-BCB7H:1:1101:15605:1331 1:N:0:NATCAGCN+TAGATCGCCAAGTTAA
NATCAGCN
+
#>>AA?C#
#M01551:51:000000000-BCB7H:1:1101:15557:1332 1:N:0:NCAGCAGN+TATCTTCTATAAATAT
NCAGCAGN
And I am attempting to replace the string after the final colon with 0 (in this example on lines 1,5,9 - but globally) using a regular expression.
I have checked my regex with egrep: egrep '[ATGCN]{8}\+[ATGCN]{16}$' testSed.fastq returns all the lines I would expect.
However when I try to use sed -i 's/[ATGCN]{8}\+[ATGCN]{16}$/0/g' testSed.fastq the original file is unchanged and no replacement occurs.
How can I fix this? Is my regex not specific enough?
Do you need a regex for this?
awk -F: -v OFS=: '/^#/ {$NF = "0"} 1' testfile
That won't save in-place. If you have GNU awk you can
gawk -F: -v OFS=: -i inplace '...' file
ref: https://www.gnu.org/software/gawk/manual/html_node/Extension-Sample-Inplace.html
Your regex is written as an ERE rather than a BRE, which is what sed expects by default. Not all sed implementations support ERE, but you can check man sed in your environment to see whether yours does. Look for the -r or -E options. Alternatively, stay with BRE and write the bounds with backslashes before the curly braces, \{8\} and \{16\}, leaving the + unescaped so that it is matched literally.
That said, rather than matching the precise text in the last field, why not just look for the string that starts with a colon, and is followed by no-more-colons? The following RE is both BRE and ERE compatible.
$ sed '/^#/s/:[^:]*$/:0/' testq
#M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:0
NGTCACTN
+
#>AAAAF#
#M01551:51:000000000-BCB7H:1:1101:15605:1331 1:N:0:0
NATCAGCN
+
#>>AA?C#
#M01551:51:000000000-BCB7H:1:1101:15557:1332 1:N:0:0
NCAGCAGN
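The two ways of writing the question's original pattern can be compared side by side. A minimal sketch, using the first header line from the question as input:

```shell
#!/usr/bin/env bash
line='#M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:NGTCACTN+TATCCTCTCTTGAAGA'

# ERE: braces are bounds as written; the backslash makes + literal.
ere=$(sed -E 's/[ATGCN]{8}\+[ATGCN]{16}$/0/' <<< "$line")

# BRE: bounds need backslashes; an unescaped + is already literal.
bre=$(sed 's/[ATGCN]\{8\}+[ATGCN]\{16\}$/0/' <<< "$line")

echo "$ere"
echo "$bre"
# both => #M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:0
```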

sed regex with alternative on Solaris doesn't work

I'm trying to use sed with a regex on Solaris, but it doesn't work.
I need to print only the lines that match my regex.
sed -n -E '/^[a-zA-Z0-9]*$|^a_[a-zA-Z0-9]*$/p'
input file:
grtad
a_pitr
_aupa
a__as
baman
12353
ai345
ki_ag
-MXx2
!!!23
+_)#*
I want to show only the lines matching the regex above:
grtad
a_pitr
baman
12353
ai345
Is there another way to write the alternation? Is it possible in Perl?
Thanks for any solutions.
With Perl
perl -ne 'print if /^(a_)?[a-zA-Z0-9]*$/' input.txt
The (a_)? matches a_ one-or-zero times, so optionally. It may or may not be there.
The (a_) also captures the match, which is not needed. So you can use (?:a_)? instead. The ?: makes () only group what is inside (so ? applies to the whole thing) without remembering it.
with grep
$ grep -xiE '(a_)?[a-z0-9]*' ip.txt
grtad
a_pitr
baman
12353
ai345
-x match whole line
-i ignore case
-E extended regex, if not available, use grep -xi '\(a_\)\?[a-z0-9]*'
(a_)? zero or one time match a_
[a-z0-9]* zero or more letters or digits
With sed
sed -nE '/^(a_)?[a-zA-Z0-9]*$/p' ip.txt
or, with GNU sed
sed -nE '/^(a_)?[a-z0-9]*$/Ip' ip.txt
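The original alternation can also be kept as written in a tool whose patterns are extended regular expressions by default. A minimal sketch with awk, assuming a POSIX-conforming awk (on Solaris that may mean nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk); the data is the question's input.

```shell
#!/usr/bin/env bash
input='grtad
a_pitr
_aupa
a__as
baman
12353
ai345
ki_ag
-MXx2
!!!23
+_)#*'

# awk patterns are extended regular expressions, so | works unescaped.
# A line is printed when either anchored alternative matches it.
matches=$(printf '%s\n' "$input" | awk '/^[a-zA-Z0-9]*$|^a_[a-zA-Z0-9]*$/')

echo "$matches"
```

This should print grtad, a_pitr, baman, 12353 and ai345, one per line. Note that an empty line would match as well, since both character classes are allowed to repeat zero times.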

Using sed to replace IP using regex

Assuming a simple text file:
123.123.123.123
I would like to replace the IP inside it with 222.222.222.222. I have tried the command below, but nothing changes, although the same regex seems to work in Regexr:
sed -i '' 's/(\d{1,3}\.){3}\d{1,3}/222.222.222.222/' file.txt
Am I missing something?
Two problems here:
sed doesn't like PCRE digit property \d, use range: [0-9] or POSIX [[:digit:]]
You need to use -r flag for extended regex as well.
This should work:
s='123.123.123.123'
sed -r 's/([0-9]{1,3}\.){3}[0-9]{1,3}/222.222.222.222/' <<< "$s"
222.222.222.222
Better would be to use anchors to avoid matching unexpected input:
sed -r 's/^([0-9]{1,3}\.){3}[0-9]{1,3}$/222.222.222.222/' <<< "$s"
PS: On OSX use -E instead of -r:
sed -E 's/^([0-9]{1,3}\.){3}[0-9]{1,3}$/222.222.222.222/' <<< "$s"
222.222.222.222
You'd better use -r, as indicated by anubhava.
But in case you don't have it, you have to escape every single (, ), { and }. And also, use [0-9] instead of \d:
$ sed 's/\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}/222.222.222.222/' <<< "123.123.123.123"
222.222.222.222
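To illustrate what the anchors buy, here is a small sketch run over a few sample lines. The input is hypothetical, and the pattern only checks the shape of a dotted quad; it does not validate that each number is at most 255.

```shell
#!/usr/bin/env bash
# Hypothetical file content: only the first line is a bare dotted quad.
input='123.123.123.123
host 10.0.0.1 up
1.2.3'

# -E enables ERE; [[:digit:]] is the POSIX class that replaces \d.
# ^ and $ restrict the replacement to lines that consist only of the address.
result=$(printf '%s\n' "$input" |
  sed -E 's/^([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}$/222.222.222.222/')

echo "$result"
```

Only the first line is rewritten; the other two pass through unchanged.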

sed plus sign doesn't work

I'm trying to replace /./ or /././ or /./././ with a single / in a bash script. I've managed to write a regex for sed, but it doesn't work.
variable="something/./././"
variable=$(echo $variable | sed "s/\/(\.\/)+/\//g")
echo $variable # this should output "something/"
When I tried to replace only the /./ substring, it worked with the sed regex \/\.\/. Does sed require more flags to repeat a substring with + or *?
Use the -r option to make sed use extended regular expressions:
$ variable="something/./././"
$ echo $variable | sed -r "s/\/(\.\/)+/\//g"
something/
Any sed:
sed 's|/\(\./\)\{1,\}|/|g'
But a + or \{1,\} would not even be required in this case, a * would do nicely, so
sed 's|/\(\./\)*|/|g'
should suffice
Two things to make it simple:
$ variable="something/./././"
$ sed -r 's#(\./){1,}##' <<< "$variable"
something/
Use {1,} to indicate one or more repetitions of the pattern. You won't need g with this.
Use a different delimiter (# in the case above) to make it readable.
+ is ERE syntax, so you need the -E or -r option to use it.
You can also do this with bash's built-in parameter substitution. This doesn't require sed, which doesn't accept -r on a Mac under OS X:
variable="something/./././"
a=${variable/\/*/}/ # Remove slash and everything after it, then re-apply slash afterwards
echo $a
something/
See here for explanation and other examples.
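The approaches above can be compared on a path where the /./ runs sit in the middle rather than at the end. A minimal sketch; the sample value is an assumption chosen to expose the difference between the sed versions and the parameter substitution.

```shell
#!/usr/bin/env bash
# Hypothetical path with /./ runs in the middle, not only at the end.
variable="a/./././b/./c"

# ERE with +, using | as the delimiter to avoid escaping slashes.
ere=$(printf '%s\n' "$variable" | sed -E 's|/(\./)+|/|g')

# BRE equivalent: escaped group and \{1,\} instead of +.
bre=$(printf '%s\n' "$variable" | sed 's|/\(\./\)\{1,\}|/|g')

# Parameter substitution: removes the first slash and everything after it,
# so it only fits inputs where nothing meaningful follows the first slash.
bash_only="${variable/\/*/}/"

echo "$ere"        # => a/b/c
echo "$bre"        # => a/b/c
echo "$bash_only"  # => a/
```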